Add column based on different conditions for different columns | python pandas


























I have a dataframe with 4 columns:



c1        c2        c3      GName
0.221445  0.300534  5.689   KDD
0.001000  0.969000  15.140  ACC
1.000000  0.094000  -0.245  QETF


And a dataframe called file with one column:



GName
Abd
kkoew
KDD
pwqh
ACC
dsewf


I need to add a new column called label, based on checking the scores in c1, c2 and c3 together with GName.

If the majority of the 3 scores meet their conditions (2 out of 3, or all 3) and the value of GName exists in the dataframe file, the label is 1; otherwise the label is 0.

The conditions are:
c1 should be > 0.95
c2 should be > 0.50
c3 should be > 15


The output will be like this:



c1        c2        c3      GName  label
0.221445  0.300534  5.689   KDD    0  (because 0 out of 3, and KDD is in file)
0.001000  0.969000  15.140  ACC    1  (because 2 out of 3, and ACC is in file)
1.000000  0.94060   -0.245  QETF   0  (because 2 out of 3, but QETF is not in file)


I'm struggling to combine these different conditions. Any help, please?










      python pandas dataframe
















      asked Nov 20 at 23:45









      Sara Wasl

























          1 Answer














          The way I would do it is this:



          import pandas as pd

          # Sample data from the question
          df = pd.DataFrame({'c1': [0.221445, 0.001000, 1.000000],
                             'c2': [0.300534, 0.969000, 0.094000],
                             'c3': [5.689, 15.140, -0.245],
                             'GName': ['KDD', 'ACC', 'QETF']})
          file = pd.DataFrame({'GName': ['KDD', 'ACC']})

          # Count how many of the three score conditions each row satisfies
          votes = ((df['c1'] > 0.95).astype(int)
                   + (df['c2'] > 0.5).astype(int)
                   + (df['c3'] > 15).astype(int))
          # Label 1 needs a majority (at least 2 of 3) and a GName present in file
          conditions = (votes >= 2) & df['GName'].isin(file['GName'])
          df['label'] = 0
          df.loc[conditions, 'label'] = 1

          >>> df
                   c1        c2      c3 GName  label
          0  0.221445  0.300534   5.689   KDD      0
          1  0.001000  0.969000  15.140   ACC      1
          2  1.000000  0.094000  -0.245  QETF      0


          It would be nice if you could include code to generate your dataframe in your question, as well.
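If the thresholds or the number of score columns might change later, the same idea can be wrapped in a small helper. This is only a sketch: the function name `majority_label` and its parameters (`thresholds`, `min_votes`, `name_col`) are made up for illustration and are not part of pandas. It relies on `DataFrame.gt` aligning a Series of thresholds with the column labels, and on `sum(axis=1)` counting the `True` values in each row.

```python
import pandas as pd


def majority_label(df, allowed_names, thresholds, min_votes=2, name_col="GName"):
    """Hypothetical helper: 1 where at least `min_votes` score columns exceed
    their threshold and the name is in `allowed_names`, otherwise 0."""
    cols = list(thresholds)
    # Compare every score column with its own threshold (aligned on column labels).
    passed = df[cols].gt(pd.Series(thresholds))
    # Number of conditions satisfied in each row.
    votes = passed.sum(axis=1)
    # Whether the row's name appears in the reference list.
    in_file = df[name_col].isin(allowed_names)
    return ((votes >= min_votes) & in_file).astype(int)


# Data taken from the question.
df = pd.DataFrame({"c1": [0.221445, 0.001000, 1.000000],
                   "c2": [0.300534, 0.969000, 0.094000],
                   "c3": [5.689, 15.140, -0.245],
                   "GName": ["KDD", "ACC", "QETF"]})
file = pd.DataFrame({"GName": ["Abd", "kkoew", "KDD", "pwqh", "ACC", "dsewf"]})

df["label"] = majority_label(df, file["GName"],
                             {"c1": 0.95, "c2": 0.50, "c3": 15})
print(df)
```

With the sample data this should give the labels 0, 1, 0, matching the expected output in the question. Adding a fourth score column would only need one more entry in the `thresholds` dict and, if the majority rule changes, a different `min_votes`.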






                answered Nov 21 at 0:12









                CJ59
