How to get sum of values in column based on variables in other column separately?











up vote
1
down vote

favorite












I have a table data like below



abc 1   1   1
bcd 2 2 4
bcd 12 23 3
cde 3 5 5
cde 3 4 5
cde 14 2 25


I want the sum of values in each column based on variables in first column and desired result is like below:



abc 1   1   1
bcd 14 25 7
cde 20 11 35


I used awk command like this



awk -F"t" '{for(n=2;n<=NF; ++n)a[$1]+=$n}END{for(i in a ) print i, a[i] }' tablefilepath


and I got a result below:



abc 3
bcd 46
cde 66


I think the end of my code is wrong but don't know how to fix it.
Please help me with some directions to fix the code...










share|improve this question









New contributor




awkprob is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
























    up vote
    1
    down vote

    favorite












    I have a table data like below



    abc 1   1   1
    bcd 2 2 4
    bcd 12 23 3
    cde 3 5 5
    cde 3 4 5
    cde 14 2 25


    I want the sum of values in each column based on variables in first column and desired result is like below:



    abc 1   1   1
    bcd 14 25 7
    cde 20 11 35


    I used awk command like this



    awk -F"t" '{for(n=2;n<=NF; ++n)a[$1]+=$n}END{for(i in a ) print i, a[i] }' tablefilepath


    and I got a result below:



    abc 3
    bcd 46
    cde 66


    I think the end of my code is wrong but don't know how to fix it.
    Please help me with some directions to fix the code...










    share|improve this question









    New contributor




    awkprob is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
    Check out our Code of Conduct.






















      up vote
      1
      down vote

      favorite









      up vote
      1
      down vote

      favorite











      I have a table data like below



      abc 1   1   1
      bcd 2 2 4
      bcd 12 23 3
      cde 3 5 5
      cde 3 4 5
      cde 14 2 25


      I want the sum of values in each column based on variables in first column and desired result is like below:



      abc 1   1   1
      bcd 14 25 7
      cde 20 11 35


      I used awk command like this



      awk -F"t" '{for(n=2;n<=NF; ++n)a[$1]+=$n}END{for(i in a ) print i, a[i] }' tablefilepath


      and I got a result below:



      abc 3
      bcd 46
      cde 66


      I think the end of my code is wrong but don't know how to fix it.
      Please help me with some directions to fix the code...










      share|improve this question









      New contributor




      awkprob is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.











      I have a table data like below



      abc 1   1   1
      bcd 2 2 4
      bcd 12 23 3
      cde 3 5 5
      cde 3 4 5
      cde 14 2 25


      I want the sum of values in each column based on variables in first column and desired result is like below:



      abc 1   1   1
      bcd 14 25 7
      cde 20 11 35


      I used awk command like this



      awk -F"t" '{for(n=2;n<=NF; ++n)a[$1]+=$n}END{for(i in a ) print i, a[i] }' tablefilepath


      and I got a result below:



      abc 3
      bcd 46
      cde 66


      I think the end of my code is wrong but don't know how to fix it.
      Please help me with some directions to fix the code...







      linux bash shell-script awk table






      share|improve this question









      New contributor




      awkprob is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.











      share|improve this question









      New contributor




      awkprob is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.









      share|improve this question




      share|improve this question








      edited 53 mins ago









      Scott

      6,69742650




      6,69742650






      New contributor




      awkprob is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.









      asked 1 hour ago









      awkprob

      61




      61




      New contributor




      awkprob is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.





      New contributor





      awkprob is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.






      awkprob is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.






















          3 Answers
          3






          active

          oldest

          votes

















          up vote
          1
          down vote













          So long as your file is tab-delimited, datamash is a good fit for this.



          $ datamash groupby 1 sum 2 sum 3 sum 4 < tablefilepath
          abc 1 1 1
          bcd 14 25 7
          cde 20 11 35


          Datamash can also work with non-tabs, if you specify -t <delimiter>. But tabs seem closest to the example input you have provided.



          Datamash won't work if your input is delimited by arbitrary whitespace (i.e. possible multiple spaces intended to "look like" a tab). Still, even if that is what your data looks like, it is easily munged into the form expected by datamash:



          sed -i 's/ +/t/g' tablefilepath





          share|improve this answer

















          • 1




            At least in recent versions, there's a -W (--whitespace) option that should allow arbitrary whitespace delimiters
            – steeldriver
            1 hour ago












          • @steeldriver Thanks!
            – cryptarch
            26 mins ago


















          up vote
          1
          down vote













          Using awk summing up the columns 2-4 based on 1.



          awk -v FS="t" -v OFS="t" '{ col1[$1]+=$2; col2[$1]+=$3; col3[$1]+=$4; next } END { for ( i in col1) print i, col1[i], col2[i], col3[i]  }' file





          share|improve this answer




























            up vote
            1
            down vote













            You were fairly close. 
            You see what you were doing wrong, don't you? 
            You were keeping one total for each column 1 value,
            when you should have been keeping three.



            This is similar to Inian's answer,
            but trivially extendable to handle any number of columns:



            awk -F"t" '{for(n=2;n<=NF; ++n) a[$1][n]+=$n}
            END {for(i in a) {
            printf "%s", i
            for (n=2; n<=4; ++n) printf "t%s", a[i][n]
            printf "n"
            }
            }'


            Rather than keep three arrays, like Inian's answer,
            it keeps a two-dimensional array.






            share|improve this answer





















            • +1 for the simplification. Well done on the multi-dim arrays
              – Inian
              51 mins ago













            Your Answer








            StackExchange.ready(function() {
            var channelOptions = {
            tags: "".split(" "),
            id: "106"
            };
            initTagRenderer("".split(" "), "".split(" "), channelOptions);

            StackExchange.using("externalEditor", function() {
            // Have to fire editor after snippets, if snippets enabled
            if (StackExchange.settings.snippets.snippetsEnabled) {
            StackExchange.using("snippets", function() {
            createEditor();
            });
            }
            else {
            createEditor();
            }
            });

            function createEditor() {
            StackExchange.prepareEditor({
            heartbeatType: 'answer',
            convertImagesToLinks: false,
            noModals: true,
            showLowRepImageUploadWarning: true,
            reputationToPostImages: null,
            bindNavPrevention: true,
            postfix: "",
            imageUploader: {
            brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
            contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
            allowUrls: true
            },
            onDemand: true,
            discardSelector: ".discard-answer"
            ,immediatelyShowMarkdownHelp:true
            });


            }
            });






            awkprob is a new contributor. Be nice, and check out our Code of Conduct.










             

            draft saved


            draft discarded


















            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2funix.stackexchange.com%2fquestions%2f484370%2fhow-to-get-sum-of-values-in-column-based-on-variables-in-other-column-separately%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown

























            3 Answers
            3






            active

            oldest

            votes








            3 Answers
            3






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes








            up vote
            1
            down vote













            So long as your file is tab-delimited, datamash is a good fit for this.



            $ datamash groupby 1 sum 2 sum 3 sum 4 < tablefilepath
            abc 1 1 1
            bcd 14 25 7
            cde 20 11 35


            Datamash can also work with non-tabs, if you specify -t <delimiter>. But tabs seem closest to the example input you have provided.



            Datamash won't work if your input is delimited by arbitrary whitespace (i.e. possible multiple spaces intended to "look like" a tab). Still, even if that is what your data looks like, it is easily munged into the form expected by datamash:



            sed -i 's/ +/t/g' tablefilepath





            share|improve this answer

















            • 1




              At least in recent versions, there's a -W (--whitespace) option that should allow arbitrary whitespace delimiters
              – steeldriver
              1 hour ago












            • @steeldriver Thanks!
              – cryptarch
              26 mins ago















            up vote
            1
            down vote













            So long as your file is tab-delimited, datamash is a good fit for this.



            $ datamash groupby 1 sum 2 sum 3 sum 4 < tablefilepath
            abc 1 1 1
            bcd 14 25 7
            cde 20 11 35


            Datamash can also work with non-tabs, if you specify -t <delimiter>. But tabs seem closest to the example input you have provided.



            Datamash won't work if your input is delimited by arbitrary whitespace (i.e. possible multiple spaces intended to "look like" a tab). Still, even if that is what your data looks like, it is easily munged into the form expected by datamash:



            sed -i 's/ +/t/g' tablefilepath





            share|improve this answer

















            • 1




              At least in recent versions, there's a -W (--whitespace) option that should allow arbitrary whitespace delimiters
              – steeldriver
              1 hour ago












            • @steeldriver Thanks!
              – cryptarch
              26 mins ago













            up vote
            1
            down vote










            up vote
            1
            down vote









            So long as your file is tab-delimited, datamash is a good fit for this.



            $ datamash groupby 1 sum 2 sum 3 sum 4 < tablefilepath
            abc 1 1 1
            bcd 14 25 7
            cde 20 11 35


            Datamash can also work with non-tabs, if you specify -t <delimiter>. But tabs seem closest to the example input you have provided.



            Datamash won't work if your input is delimited by arbitrary whitespace (i.e. possible multiple spaces intended to "look like" a tab). Still, even if that is what your data looks like, it is easily munged into the form expected by datamash:



            sed -i 's/ +/t/g' tablefilepath





            share|improve this answer












            So long as your file is tab-delimited, datamash is a good fit for this.



            $ datamash groupby 1 sum 2 sum 3 sum 4 < tablefilepath
            abc 1 1 1
            bcd 14 25 7
            cde 20 11 35


            Datamash can also work with non-tabs, if you specify -t <delimiter>. But tabs seem closest to the example input you have provided.



            Datamash won't work if your input is delimited by arbitrary whitespace (i.e. possible multiple spaces intended to "look like" a tab). Still, even if that is what your data looks like, it is easily munged into the form expected by datamash:



            sed -i 's/ +/t/g' tablefilepath






            share|improve this answer












            share|improve this answer



            share|improve this answer










            answered 1 hour ago









            cryptarch

            3465




            3465








            • 1




              At least in recent versions, there's a -W (--whitespace) option that should allow arbitrary whitespace delimiters
              – steeldriver
              1 hour ago












            • @steeldriver Thanks!
              – cryptarch
              26 mins ago














            • 1




              At least in recent versions, there's a -W (--whitespace) option that should allow arbitrary whitespace delimiters
              – steeldriver
              1 hour ago












            • @steeldriver Thanks!
              – cryptarch
              26 mins ago








            1




            1




            At least in recent versions, there's a -W (--whitespace) option that should allow arbitrary whitespace delimiters
            – steeldriver
            1 hour ago






            At least in recent versions, there's a -W (--whitespace) option that should allow arbitrary whitespace delimiters
            – steeldriver
            1 hour ago














            @steeldriver Thanks!
            – cryptarch
            26 mins ago




            @steeldriver Thanks!
            – cryptarch
            26 mins ago












            up vote
            1
            down vote













            Using awk summing up the columns 2-4 based on 1.



            awk -v FS="t" -v OFS="t" '{ col1[$1]+=$2; col2[$1]+=$3; col3[$1]+=$4; next } END { for ( i in col1) print i, col1[i], col2[i], col3[i]  }' file





            share|improve this answer

























              up vote
              1
              down vote













              Using awk summing up the columns 2-4 based on 1.



              awk -v FS="t" -v OFS="t" '{ col1[$1]+=$2; col2[$1]+=$3; col3[$1]+=$4; next } END { for ( i in col1) print i, col1[i], col2[i], col3[i]  }' file





              share|improve this answer























                up vote
                1
                down vote










                up vote
                1
                down vote









                Using awk summing up the columns 2-4 based on 1.



                awk -v FS="t" -v OFS="t" '{ col1[$1]+=$2; col2[$1]+=$3; col3[$1]+=$4; next } END { for ( i in col1) print i, col1[i], col2[i], col3[i]  }' file





                share|improve this answer












                Using awk summing up the columns 2-4 based on 1.



                awk -v FS="t" -v OFS="t" '{ col1[$1]+=$2; col2[$1]+=$3; col3[$1]+=$4; next } END { for ( i in col1) print i, col1[i], col2[i], col3[i]  }' file






                share|improve this answer












                share|improve this answer



                share|improve this answer










                answered 1 hour ago









                Inian

                3,795824




                3,795824






















                    up vote
                    1
                    down vote













                    You were fairly close. 
                    You see what you were doing wrong, don't you? 
                    You were keeping one total for each column 1 value,
                    when you should have been keeping three.



                    This is similar to Inian's answer,
                    but trivially extendable to handle any number of columns:



                    awk -F"t" '{for(n=2;n<=NF; ++n) a[$1][n]+=$n}
                    END {for(i in a) {
                    printf "%s", i
                    for (n=2; n<=4; ++n) printf "t%s", a[i][n]
                    printf "n"
                    }
                    }'


                    Rather than keep three arrays, like Inian's answer,
                    it keeps a two-dimensional array.






                    share|improve this answer





















                    • +1 for the simplification. Well done on the multi-dim arrays
                      – Inian
                      51 mins ago

















                    up vote
                    1
                    down vote













                    You were fairly close. 
                    You see what you were doing wrong, don't you? 
                    You were keeping one total for each column 1 value,
                    when you should have been keeping three.



                    This is similar to Inian's answer,
                    but trivially extendable to handle any number of columns:



                    awk -F"t" '{for(n=2;n<=NF; ++n) a[$1][n]+=$n}
                    END {for(i in a) {
                    printf "%s", i
                    for (n=2; n<=4; ++n) printf "t%s", a[i][n]
                    printf "n"
                    }
                    }'


                    Rather than keep three arrays, like Inian's answer,
                    it keeps a two-dimensional array.






                    share|improve this answer





















                    • +1 for the simplification. Well done on the multi-dim arrays
                      – Inian
                      51 mins ago















                    up vote
                    1
                    down vote










                    up vote
                    1
                    down vote









                    You were fairly close. 
                    You see what you were doing wrong, don't you? 
                    You were keeping one total for each column 1 value,
                    when you should have been keeping three.



                    This is similar to Inian's answer,
                    but trivially extendable to handle any number of columns:



                    awk -F"t" '{for(n=2;n<=NF; ++n) a[$1][n]+=$n}
                    END {for(i in a) {
                    printf "%s", i
                    for (n=2; n<=4; ++n) printf "t%s", a[i][n]
                    printf "n"
                    }
                    }'


                    Rather than keep three arrays, like Inian's answer,
                    it keeps a two-dimensional array.






                    share|improve this answer












                    You were fairly close. 
                    You see what you were doing wrong, don't you? 
                    You were keeping one total for each column 1 value,
                    when you should have been keeping three.



                    This is similar to Inian's answer,
                    but trivially extendable to handle any number of columns:



                    awk -F"t" '{for(n=2;n<=NF; ++n) a[$1][n]+=$n}
                    END {for(i in a) {
                    printf "%s", i
                    for (n=2; n<=4; ++n) printf "t%s", a[i][n]
                    printf "n"
                    }
                    }'


                    Rather than keep three arrays, like Inian's answer,
                    it keeps a two-dimensional array.







                    share|improve this answer












                    share|improve this answer



                    share|improve this answer










                    answered 55 mins ago









                    Scott

                    6,69742650




                    6,69742650












                    • +1 for the simplification. Well done on the multi-dim arrays
                      – Inian
                      51 mins ago




















                    • +1 for the simplification. Well done on the multi-dim arrays
                      – Inian
                      51 mins ago


















                    +1 for the simplification. Well done on the multi-dim arrays
                    – Inian
                    51 mins ago






                    +1 for the simplification. Well done on the multi-dim arrays
                    – Inian
                    51 mins ago












                    awkprob is a new contributor. Be nice, and check out our Code of Conduct.










                     

                    draft saved


                    draft discarded


















                    awkprob is a new contributor. Be nice, and check out our Code of Conduct.













                    awkprob is a new contributor. Be nice, and check out our Code of Conduct.












                    awkprob is a new contributor. Be nice, and check out our Code of Conduct.















                     


                    draft saved


                    draft discarded














                    StackExchange.ready(
                    function () {
                    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2funix.stackexchange.com%2fquestions%2f484370%2fhow-to-get-sum-of-values-in-column-based-on-variables-in-other-column-separately%23new-answer', 'question_page');
                    }
                    );

                    Post as a guest















                    Required, but never shown





















































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown

































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown







                    Popular posts from this blog

                    404 Error Contact Form 7 ajax form submitting

                    How to know if a Active Directory user can login interactively

                    TypeError: fit_transform() missing 1 required positional argument: 'X'