Python multiprocessing.Pool slower than sequential execution
I'm trying to write a program that operates on a long list of elements (the list is called training_set in the code example). Each row of the list contains two numbers that have to be looked up in another list called IDs: the program iterates over training_set's rows and, for each one, finds the two corresponding entries in IDs and then performs some further computation (not shown in the code).
With sequential execution, this takes about 300 s. Since each row of training_set is independent of the others, I thought of parallelizing the computation by splitting the input across the CPU cores, using multiprocessing.Pool.
However, the parallelized version is slower than the sequential one.
import copy
import csv
import multiprocessing
from multiprocessing import Pool

num_procs = multiprocessing.cpu_count()

with open("training_set.txt", "r") as f:
    reader = csv.reader(f)
    training_set = list(reader)
training_set = [element[0].split(" ") for element in training_set]

with open("node_information.csv", "r") as f:
    reader = csv.reader(f)
    node_info = list(reader)
IDs = [element[0] for element in node_info]

batch_size = len(training_set) // num_procs
inputs = []
# split the list into batches to feed to the different processes
for i in range(num_procs):
    if i == (num_procs - 1):
        # last batch takes the remainder
        inputs.append(list(training_set[i * batch_size:]))
    else:
        inputs.append(list(training_set[i * batch_size:(i + 1) * batch_size]))

def init(IDs):
    global identities
    identities = copy.deepcopy(IDs)

def analyze_pairs(partialList):
    pairsSet = copy.deepcopy(partialList)
    for i in range(len(pairsSet)):
        source = pairsSet[i][0]  # an ID of an edge
        target = pairsSet[i][1]  # an ID of an edge
        # find the indices matching the source and target IDs
        index_source = identities.index(source)
        index_target = identities.index(target)
        # ***additional computation***

if __name__ == '__main__':
    pool = Pool(num_procs, initializer=init, initargs=(IDs,))
    training_features = pool.map(analyze_pairs, inputs)
I'm not showing the rest of the for loop body (at the end of analyze_pairs()) because the problem persists even if I remove that code, so the problem doesn't lie there.
I know that there are already many questions on this topic, but I couldn't find a solution for my case.
I don't think parallelism introduces more overhead than speedup here, because each worker's input is large (on an 8-thread CPU, each worker should take at least ~35 s) and there is no explicit message passing. I also tried copy.deepcopy to make sure each worker operates on a separate list (although it shouldn't matter, since each worker only reads the list), but it didn't help.
What could the problem be? Thanks in advance.
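(Editorial aside, separate from the parallel-slowdown question: identities.index(...) in the code above does a linear scan per lookup, so a dictionary built once would make each lookup O(1) and may reduce the 300 s baseline. A minimal sketch, with hypothetical toy data standing in for the real IDs and training_set:)

```python
# Toy stand-ins for IDs and the (source, target) pairs from the question.
IDs = ["n1", "n2", "n3", "n4"]
pairs = [["n1", "n3"], ["n4", "n2"]]

# Build the ID -> index map once, instead of calling IDs.index() per pair.
id_to_index = {node_id: i for i, node_id in enumerate(IDs)}

indices = [(id_to_index[src], id_to_index[dst]) for src, dst in pairs]
print(indices)  # [(0, 2), (3, 1)]
```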
python multiprocessing threadpool
you might be loading the files multiple times which could take a long time; maybe put the file loading code into the __main__ block?
– Sam Mason
Nov 25 '18 at 23:27
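(A sketch of the guessed intent of this suggestion, not the asker's actual code: keep the file loading inside a function and call it only under __main__, so worker processes that re-import the module, e.g. with the 'spawn' start method, don't repeat the loading at import time. The demo file name and helper below are hypothetical.)

```python
import csv

def load_rows(path):
    # Loading lives in a function, not at module top level, so workers
    # that re-import this module don't re-read the file at import time.
    with open(path, "r") as f:
        return [row[0].split(" ") for row in csv.reader(f)]

if __name__ == "__main__":
    # Demo file standing in for training_set.txt.
    with open("demo_set.txt", "w") as f:
        f.write("1 2\n3 4\n")
    training_set = load_rows("demo_set.txt")
    print(training_set)  # [['1', '2'], ['3', '4']]
    # ...create the Pool and call pool.map here, as in the question.
```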
@SamMason I tried, but it didn't help; thank you for your answer anyway ;)
– gnigni
Nov 25 '18 at 23:41
my guess is still around file loading time; have you tried adding print statements at various points in the code to figure out where your time is going?
– Sam Mason
Nov 26 '18 at 21:51
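(A sketch of the kind of instrumentation this comment suggests, using time.perf_counter around each phase; the phase labels and workloads below are placeholders, not the asker's code.)

```python
import time

def timed(label, func, *args):
    # Run one phase and print how long it took, to locate the bottleneck.
    start = time.perf_counter()
    result = func(*args)
    print(f"{label}: {time.perf_counter() - start:.3f} s")
    return result

# Hypothetical phases mirroring the question's structure:
data = timed("load", lambda: list(range(1_000_000)))
total = timed("compute", sum, data)
```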
asked Nov 25 '18 at 22:16 by gnigni