Error while using Multiprocessing with Cython
I wrote a class that uses a Cython utility module. I then tried to speed things up with multiprocessing so that multiple instances of the class are processed simultaneously, but I got an error:

Error sending result: '(0, <MemoryView of 'ndarray' at 0x19de04081f0>)'. Reason: 'TypeError('no default __reduce__ due to non-trivial __cinit__',)'

I've looked into writing a __reduce__ function, but nearly everything I've found pertains to pickling classes, not methods or modules. I also looked into writing a __cinit__ method but found even less that seemed relevant.



Below is a simplified representation of the package and module layout that generates the error. In the real code there will be hundreds of DNG objects to process, each referencing a unique file of roughly 20 MB, and ljpeg really has hundreds of lines and is called tens to hundreds of times for each DNG. In the example the error can be avoided by removing the array type declarations, but if I did that in the real code the performance hit would be orders of magnitude larger than the multiprocessing gains.



Can this be fixed without slowing things down appreciably or requiring major refactoring, and if so, how?



sequence.py



import multiprocessing

import numpy as np

from dng import DNG


def test_decode():
    input_file = np.zeros(3000, dtype=np.intc)

    pool = multiprocessing.Pool()
    tasks = []

    for i in range(10):
        task = pool.apply_async(thread, (i, input_file))
        tasks.append(task)

    pool.close()
    pool.join()

    for task in tasks:
        print(task.get())


def thread(i, input_file):
    dng = DNG(input_file)
    return i, dng.image


if __name__ == '__main__':
    test_decode()


dng.py



import numpy as np

import ljpeg


class DNG:
    def __init__(self, input_file):
        self.image = ljpeg.decode(input_file)


ljpeg.pyx



cpdef int[:] decode(int[:] encoded_image):
    encoded_image = __bar(encoded_image, 10000, 1000)
    return encoded_image


cdef int[:] __bar(int[:] array, int i, int ii):
    for j in range(i):
        for jj in range(ii):
            array = __foo(array)
    return array


cdef int[:] __foo(int[:] array):
    array[0] += 1
    return array
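
(For reference, the extension is built in the usual way; a minimal setup.py sketch, not part of the original layout, might look like the following, compiled with python setup.py build_ext --inplace before running sequence.py.)

# setup.py - minimal build sketch (assumed, not part of the question's code)
from setuptools import setup
from Cython.Build import cythonize

setup(ext_modules=cythonize("ljpeg.pyx"))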


output:



Traceback (most recent call last):
  File "F:/Documents/Python/threading_multi/sequence.py", line 31, in <module>
    test_decode()
  File "F:/Documents/Python/threading_multi/sequence.py", line 22, in test_decode
    print(task.get())
  File "C:\Python36\lib\multiprocessing\pool.py", line 644, in get
    raise self._value
multiprocessing.pool.MaybeEncodingError: Error sending result: '(0, <MemoryView of 'ndarray' at 0x19de04081f0>)'. Reason: 'TypeError('no default __reduce__ due to non-trivial __cinit__',)'

Process finished with exit code 1
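
For what it's worth, the same failure can be reproduced without a Pool at all by pickling the decoded result directly, since multiprocessing pickles results to send them back to the parent process (a sketch, assuming the compiled ljpeg module is importable):

import pickle

import numpy as np

import ljpeg

# decode() returns a Cython typed memoryview, which pickle cannot handle
decoded = ljpeg.decode(np.zeros(3000, dtype=np.intc))
pickle.dumps(decoded)  # TypeError: no default __reduce__ due to non-trivial __cinit__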
python multiprocessing cython

asked Nov 24 '18 at 18:36
chadat23


1 Answer
I'm pretty sure this is an error in returning the memoryviews generated by each worker process to the main process (because the memoryview can't be pickled). However, the memoryview itself wraps another Python object that probably can be pickled.



There's no real need to specify the return type of decode (or to make it cpdef) since it's only called from Python. At the end of decode, return the .base of the memoryview to get the underlying object that it wraps:



def decode(int[:] encoded_image):
    # ...
    return encoded_image.base
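
Applied to the ljpeg.pyx above, the change would look roughly like this (a sketch; calling np.asarray() on the returned view in dng.py should work as an alternative way to get a picklable ndarray back):

# ljpeg.pyx - sketch of the suggested fix: keep typed memoryviews internally
# for speed, but return a picklable object to Python.
def decode(int[:] encoded_image):
    encoded_image = __bar(encoded_image, 10000, 1000)
    # .base is the object this view wraps - here the NumPy array created in
    # sequence.py - which multiprocessing can pickle and send between processes.
    return encoded_image.base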

answered Nov 24 '18 at 19:54 by DavidW

• I'm afraid I'm not able to test this at the moment, but I think it should work - sorry if it doesn't! – DavidW, Nov 24 '18 at 19:55