Error while using Multiprocessing with Cython
I wrote a class that uses a Cython utility module. I then tried to speed things up with multiprocessing, processing multiple instances of the class simultaneously, but got an error:
    Error sending result: '(0, <MemoryView of 'ndarray' at 0x19de04081f0>)'. Reason: 'TypeError('no default __reduce__ due to non-trivial __cinit__',)'
I've looked into writing a __reduce__ function, but nearly everything I've seen pertains to pickling classes, not methods or modules. I also looked into writing a __cinit__ method but found even less that seemed relevant.
Below is a simplified representation of the package and module layout that generates the error. (In reality there will be hundreds of DNG objects to process, each referencing a unique 20-ish MB file; ljpeg has hundreds of lines and is called tens to hundreds of times for each DNG.) In the example, the error can be fixed by removing the array type declarations, but if I were to do that in the real code, the performance hit would be orders of magnitude larger than the multiprocessing gains.
Can this be fixed without slowing it down appreciably or major refactoring, and if so, how?
sequence.py

    import multiprocessing
    import numpy as np
    from dng import DNG


    def test_decode():
        input_file = np.zeros(3000, dtype=np.intc)
        pool = multiprocessing.Pool()
        tasks = []
        for i in range(10):
            task = pool.apply_async(thread, (i, input_file))
            tasks.append(task)
        pool.close()
        pool.join()
        for task in tasks:
            # task.get() re-raises any error hit while sending the result back
            print(task.get())


    def thread(i, input_file):
        dng = DNG(input_file)
        return i, dng.image


    if __name__ == '__main__':
        test_decode()
dng.py

    import numpy as np
    import ljpeg


    class DNG:
        def __init__(self, input_file):
            self.image = ljpeg.decode(input_file)
ljpeg.pyx

    cpdef int[:] decode(int[:] encoded_image):
        encoded_image = __bar(encoded_image, 10000, 1000)
        return encoded_image

    cdef int[:] __bar(int[:] array, int i, int ii):
        for j in range(i):
            for jj in range(ii):
                array = __foo(array)
        return array

    cdef int[:] __foo(int[:] array):
        array[0] += 1
        return array
Output:

    Traceback (most recent call last):
      File "F:/Documents/Python/threading_multi/sequence.py", line 31, in <module>
        test_decode()
      File "F:/Documents/Python/threading_multi/sequence.py", line 22, in test_decode
        print(task.get())
      File "C:\Python36\lib\multiprocessing\pool.py", line 644, in get
        raise self._value
    multiprocessing.pool.MaybeEncodingError: Error sending result: '(0, <MemoryView of 'ndarray' at 0x19de04081f0>)'. Reason: 'TypeError('no default __reduce__ due to non-trivial __cinit__',)'

    Process finished with exit code 1
python multiprocessing cython
asked Nov 24 '18 at 18:36
chadat23
1 Answer
I'm pretty sure that this is an error in returning the memoryviews generated by each worker process to the main process (the memoryview can't be pickled). However, the memoryview itself wraps another Python object that probably can be pickled.
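To see the distinction without compiling anything, here is a minimal sketch that uses Python's built-in memoryview as a stand-in for Cython's typed memoryview. That substitution is an assumption made purely for illustration: the built-in type exposes the wrapped object as .obj, whereas the Cython one uses .base. The can_pickle helper is hypothetical and only exists for this example.

```python
import pickle
from array import array

# Stand-in for the decoded image: a buffer of C ints owned by a picklable object.
pixels = array('i', [0] * 8)

# Built-in memoryview, used here only as an analogy for a Cython typed memoryview.
view = memoryview(pixels)
view[0] += 1  # writes go through to the underlying buffer


def can_pickle(obj):
    """Return True if obj can be pickled, which multiprocessing requires for results."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, pickle.PicklingError):
        return False


view_ok = can_pickle(view)       # the view itself
owner_ok = can_pickle(view.obj)  # the object the view wraps (.base on a Cython memoryview)

# Round-trip the wrapped object the way a worker result would travel.
restored = pickle.loads(pickle.dumps(view.obj))
print(view_ok, owner_ok, restored[0])
```

The view is rejected while the object it wraps round-trips with its data intact, which is the property the rest of this answer relies on.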
There's no real need to specify the return type of decode (or make it cpdef) since it's only called from Python. At the end of decode, return the .base of the memoryview to get the underlying object that it wraps:

    def decode(int[:] encoding_image):
        # ...
        return encoding_image.base
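If you would rather leave ljpeg.pyx untouched, a variation (not part of the suggestion above) is to convert on the Python side before the result leaves the worker. np.asarray accepts any object that supports the buffer protocol, which Cython memoryviews do, and it wraps the existing memory rather than copying it. In this sketch, to_picklable is a hypothetical helper and a built-in memoryview stands in for the value returned by ljpeg.decode, so treat it as an illustration under those assumptions rather than tested Cython code.

```python
import pickle

import numpy as np


def to_picklable(image):
    """Wrap a buffer-protocol object (e.g. a typed memoryview) as an ndarray."""
    return np.asarray(image)


# Stand-in for ljpeg.decode(): a built-in memoryview over a C-int array.
source = np.zeros(3000, dtype=np.intc)
decoded = memoryview(source)

# What the worker function would return instead of (i, dng.image).
result = (0, to_picklable(decoded))

# Simulate the trip back to the parent process.
index, image = pickle.loads(pickle.dumps(result))
print(index, image.shape, image.dtype == np.intc)
```

In the question's layout this would amount to returning i, np.asarray(dng.image) from the worker function, so the typed signatures in the Cython module stay as they are.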
I'm afraid I'm not able to test this at the moment, but I think it should work. Sorry if it doesn't!
– DavidW
Nov 24 '18 at 19:55
answered Nov 24 '18 at 19:54
DavidW