How to run spacy algorithm on multiple cores












1















I have two lists. List A contains 500 words. List B contains 10000 words. I am trying to find similar words for List A with respect to B.I am using Spacy's similarity function.



The problem I am facing is that it takes ages to compute. I am new to multicore usage, hence request help.



How do I speed up the running of the algorithm through multicore processing in python?



The following is my code.



ListA =['Dell', 'GPU',......] #500 words lists
ListB = ['Docker','Ec2'.......] #10000 words lists
s_words =
for token1 in ListB:
list_to_sort =
for token2 in ListA:
list_to_sort.append((token1, token2,nlp(str(token1)).similarity(nlp(str(token2)))))
sorted_list = sorted(list_to_sort, key = itemgetter(2), reverse=True)[0][:2]
s_words.append(sorted_list)









share|improve this question





























    1















    I have two lists. List A contains 500 words. List B contains 10000 words. I am trying to find similar words for List A with respect to B.I am using Spacy's similarity function.



    The problem I am facing is that it takes ages to compute. I am new to multicore usage, hence request help.



    How do I speed up the running of the algorithm through multicore processing in python?



    The following is my code.



    ListA =['Dell', 'GPU',......] #500 words lists
    ListB = ['Docker','Ec2'.......] #10000 words lists
    s_words =
    for token1 in ListB:
    list_to_sort =
    for token2 in ListA:
    list_to_sort.append((token1, token2,nlp(str(token1)).similarity(nlp(str(token2)))))
    sorted_list = sorted(list_to_sort, key = itemgetter(2), reverse=True)[0][:2]
    s_words.append(sorted_list)









    share|improve this question



























      1












      1








      1








      I have two lists. List A contains 500 words. List B contains 10000 words. I am trying to find similar words for List A with respect to B.I am using Spacy's similarity function.



      The problem I am facing is that it takes ages to compute. I am new to multicore usage, hence request help.



      How do I speed up the running of the algorithm through multicore processing in python?



      The following is my code.



      ListA =['Dell', 'GPU',......] #500 words lists
      ListB = ['Docker','Ec2'.......] #10000 words lists
      s_words =
      for token1 in ListB:
      list_to_sort =
      for token2 in ListA:
      list_to_sort.append((token1, token2,nlp(str(token1)).similarity(nlp(str(token2)))))
      sorted_list = sorted(list_to_sort, key = itemgetter(2), reverse=True)[0][:2]
      s_words.append(sorted_list)









      share|improve this question
















      I have two lists. List A contains 500 words. List B contains 10000 words. I am trying to find similar words for List A with respect to B.I am using Spacy's similarity function.



      The problem I am facing is that it takes ages to compute. I am new to multicore usage, hence request help.



      How do I speed up the running of the algorithm through multicore processing in python?



      The following is my code.



      ListA =['Dell', 'GPU',......] #500 words lists
      ListB = ['Docker','Ec2'.......] #10000 words lists
      s_words =
      for token1 in ListB:
      list_to_sort =
      for token2 in ListA:
      list_to_sort.append((token1, token2,nlp(str(token1)).similarity(nlp(str(token2)))))
      sorted_list = sorted(list_to_sort, key = itemgetter(2), reverse=True)[0][:2]
      s_words.append(sorted_list)






      python-3.x parallel-processing nlp python-multiprocessing spacy






      share|improve this question















      share|improve this question













      share|improve this question




      share|improve this question








      edited Nov 25 '18 at 16:51







      Ridhima Kumar

















      asked Nov 25 '18 at 9:53









      Ridhima KumarRidhima Kumar

      358




      358
























          1 Answer
          1






          active

          oldest

          votes


















          1














          You can use multiprocessing package. This I hope will reduce your time significantly. See here for a sample code.






          share|improve this answer
























          • Thanks for your answer @rishi. Could you be more specific about the sample code. In my case what are the arguments to be passed in map function results = pool.map(test, data) Would be really helpful if you can show a working snippet in perspective of my code above.

            – Ridhima Kumar
            Nov 25 '18 at 15:33











          Your Answer






          StackExchange.ifUsing("editor", function () {
          StackExchange.using("externalEditor", function () {
          StackExchange.using("snippets", function () {
          StackExchange.snippets.init();
          });
          });
          }, "code-snippets");

          StackExchange.ready(function() {
          var channelOptions = {
          tags: "".split(" "),
          id: "1"
          };
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function() {
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled) {
          StackExchange.using("snippets", function() {
          createEditor();
          });
          }
          else {
          createEditor();
          }
          });

          function createEditor() {
          StackExchange.prepareEditor({
          heartbeatType: 'answer',
          autoActivateHeartbeat: false,
          convertImagesToLinks: true,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: 10,
          bindNavPrevention: true,
          postfix: "",
          imageUploader: {
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          },
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          });


          }
          });














          draft saved

          draft discarded


















          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53466369%2fhow-to-run-spacy-algorithm-on-multiple-cores%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown

























          1 Answer
          1






          active

          oldest

          votes








          1 Answer
          1






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          1














          You can use multiprocessing package. This I hope will reduce your time significantly. See here for a sample code.






          share|improve this answer
























          • Thanks for your answer @rishi. Could you be more specific about the sample code. In my case what are the arguments to be passed in map function results = pool.map(test, data) Would be really helpful if you can show a working snippet in perspective of my code above.

            – Ridhima Kumar
            Nov 25 '18 at 15:33
















          1














          You can use multiprocessing package. This I hope will reduce your time significantly. See here for a sample code.






          share|improve this answer
























          • Thanks for your answer @rishi. Could you be more specific about the sample code. In my case what are the arguments to be passed in map function results = pool.map(test, data) Would be really helpful if you can show a working snippet in perspective of my code above.

            – Ridhima Kumar
            Nov 25 '18 at 15:33














          1












          1








          1







          You can use multiprocessing package. This I hope will reduce your time significantly. See here for a sample code.






          share|improve this answer













          You can use multiprocessing package. This I hope will reduce your time significantly. See here for a sample code.







          share|improve this answer












          share|improve this answer



          share|improve this answer










          answered Nov 25 '18 at 15:22









          rishirishi

          1,14331841




          1,14331841













          • Thanks for your answer @rishi. Could you be more specific about the sample code. In my case what are the arguments to be passed in map function results = pool.map(test, data) Would be really helpful if you can show a working snippet in perspective of my code above.

            – Ridhima Kumar
            Nov 25 '18 at 15:33



















          • Thanks for your answer @rishi. Could you be more specific about the sample code. In my case what are the arguments to be passed in map function results = pool.map(test, data) Would be really helpful if you can show a working snippet in perspective of my code above.

            – Ridhima Kumar
            Nov 25 '18 at 15:33

















          Thanks for your answer @rishi. Could you be more specific about the sample code. In my case what are the arguments to be passed in map function results = pool.map(test, data) Would be really helpful if you can show a working snippet in perspective of my code above.

          – Ridhima Kumar
          Nov 25 '18 at 15:33





          Thanks for your answer @rishi. Could you be more specific about the sample code. In my case what are the arguments to be passed in map function results = pool.map(test, data) Would be really helpful if you can show a working snippet in perspective of my code above.

          – Ridhima Kumar
          Nov 25 '18 at 15:33




















          draft saved

          draft discarded




















































          Thanks for contributing an answer to Stack Overflow!


          • Please be sure to answer the question. Provide details and share your research!

          But avoid



          • Asking for help, clarification, or responding to other answers.

          • Making statements based on opinion; back them up with references or personal experience.


          To learn more, see our tips on writing great answers.




          draft saved


          draft discarded














          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53466369%2fhow-to-run-spacy-algorithm-on-multiple-cores%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown





















































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown

































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown







          Popular posts from this blog

          404 Error Contact Form 7 ajax form submitting

          How to know if a Active Directory user can login interactively

          TypeError: fit_transform() missing 1 required positional argument: 'X'