How to get sum of values in column based on variables in other column separately?
up vote
1
down vote
favorite
I have a table data like below
abc 1 1 1
bcd 2 2 4
bcd 12 23 3
cde 3 5 5
cde 3 4 5
cde 14 2 25
I want the sum of values in each column based on variables in first column and desired result is like below:
abc 1 1 1
bcd 14 25 7
cde 20 11 35
I used awk command like this
awk -F"t" '{for(n=2;n<=NF; ++n)a[$1]+=$n}END{for(i in a ) print i, a[i] }' tablefilepath
and I got a result below:
abc 3
bcd 46
cde 66
I think the end of my code is wrong but don't know how to fix it.
Please help me with some directions to fix the code...
linux bash shell-script awk table
New contributor
add a comment |
up vote
1
down vote
favorite
I have a table data like below
abc 1 1 1
bcd 2 2 4
bcd 12 23 3
cde 3 5 5
cde 3 4 5
cde 14 2 25
I want the sum of values in each column based on variables in first column and desired result is like below:
abc 1 1 1
bcd 14 25 7
cde 20 11 35
I used awk command like this
awk -F"t" '{for(n=2;n<=NF; ++n)a[$1]+=$n}END{for(i in a ) print i, a[i] }' tablefilepath
and I got a result below:
abc 3
bcd 46
cde 66
I think the end of my code is wrong but don't know how to fix it.
Please help me with some directions to fix the code...
linux bash shell-script awk table
New contributor
add a comment |
up vote
1
down vote
favorite
up vote
1
down vote
favorite
I have a table data like below
abc 1 1 1
bcd 2 2 4
bcd 12 23 3
cde 3 5 5
cde 3 4 5
cde 14 2 25
I want the sum of values in each column based on variables in first column and desired result is like below:
abc 1 1 1
bcd 14 25 7
cde 20 11 35
I used awk command like this
awk -F"t" '{for(n=2;n<=NF; ++n)a[$1]+=$n}END{for(i in a ) print i, a[i] }' tablefilepath
and I got a result below:
abc 3
bcd 46
cde 66
I think the end of my code is wrong but don't know how to fix it.
Please help me with some directions to fix the code...
linux bash shell-script awk table
New contributor
I have a table data like below
abc 1 1 1
bcd 2 2 4
bcd 12 23 3
cde 3 5 5
cde 3 4 5
cde 14 2 25
I want the sum of values in each column based on variables in first column and desired result is like below:
abc 1 1 1
bcd 14 25 7
cde 20 11 35
I used awk command like this
awk -F"t" '{for(n=2;n<=NF; ++n)a[$1]+=$n}END{for(i in a ) print i, a[i] }' tablefilepath
and I got a result below:
abc 3
bcd 46
cde 66
I think the end of my code is wrong but don't know how to fix it.
Please help me with some directions to fix the code...
linux bash shell-script awk table
linux bash shell-script awk table
New contributor
New contributor
edited 53 mins ago
Scott
6,69742650
6,69742650
New contributor
asked 1 hour ago
awkprob
61
61
New contributor
New contributor
add a comment |
add a comment |
3 Answers
3
active
oldest
votes
up vote
1
down vote
So long as your file is tab-delimited, datamash is a good fit for this.
$ datamash groupby 1 sum 2 sum 3 sum 4 < tablefilepath
abc 1 1 1
bcd 14 25 7
cde 20 11 35
Datamash can also work with non-tabs, if you specify -t <delimiter>
. But tabs seem closest to the example input you have provided.
Datamash won't work if your input is delimited by arbitrary whitespace (i.e. possible multiple spaces intended to "look like" a tab). Still, even if that is what your data looks like, it is easily munged into the form expected by datamash:
sed -i 's/ +/t/g' tablefilepath
1
At least in recent versions, there's a-W
(--whitespace
) option that should allow arbitrary whitespace delimiters
– steeldriver
1 hour ago
@steeldriver Thanks!
– cryptarch
26 mins ago
add a comment |
up vote
1
down vote
Using awk
summing up the columns 2-4 based on 1.
awk -v FS="t" -v OFS="t" '{ col1[$1]+=$2; col2[$1]+=$3; col3[$1]+=$4; next } END { for ( i in col1) print i, col1[i], col2[i], col3[i] }' file
add a comment |
up vote
1
down vote
You were fairly close.
You see what you were doing wrong, don't you?
You were keeping one total for each column 1 value,
when you should have been keeping three.
This is similar to Inian's answer,
but trivially extendable to handle any number of columns:
awk -F"t" '{for(n=2;n<=NF; ++n) a[$1][n]+=$n}
END {for(i in a) {
printf "%s", i
for (n=2; n<=4; ++n) printf "t%s", a[i][n]
printf "n"
}
}'
Rather than keep three arrays, like Inian's answer,
it keeps a two-dimensional array.
+1
for the simplification. Well done on the multi-dim arrays
– Inian
51 mins ago
add a comment |
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
1
down vote
So long as your file is tab-delimited, datamash is a good fit for this.
$ datamash groupby 1 sum 2 sum 3 sum 4 < tablefilepath
abc 1 1 1
bcd 14 25 7
cde 20 11 35
Datamash can also work with non-tabs, if you specify -t <delimiter>
. But tabs seem closest to the example input you have provided.
Datamash won't work if your input is delimited by arbitrary whitespace (i.e. possible multiple spaces intended to "look like" a tab). Still, even if that is what your data looks like, it is easily munged into the form expected by datamash:
sed -i 's/ +/t/g' tablefilepath
1
At least in recent versions, there's a-W
(--whitespace
) option that should allow arbitrary whitespace delimiters
– steeldriver
1 hour ago
@steeldriver Thanks!
– cryptarch
26 mins ago
add a comment |
up vote
1
down vote
So long as your file is tab-delimited, datamash is a good fit for this.
$ datamash groupby 1 sum 2 sum 3 sum 4 < tablefilepath
abc 1 1 1
bcd 14 25 7
cde 20 11 35
Datamash can also work with non-tabs, if you specify -t <delimiter>
. But tabs seem closest to the example input you have provided.
Datamash won't work if your input is delimited by arbitrary whitespace (i.e. possible multiple spaces intended to "look like" a tab). Still, even if that is what your data looks like, it is easily munged into the form expected by datamash:
sed -i 's/ +/t/g' tablefilepath
1
At least in recent versions, there's a-W
(--whitespace
) option that should allow arbitrary whitespace delimiters
– steeldriver
1 hour ago
@steeldriver Thanks!
– cryptarch
26 mins ago
add a comment |
up vote
1
down vote
up vote
1
down vote
So long as your file is tab-delimited, datamash is a good fit for this.
$ datamash groupby 1 sum 2 sum 3 sum 4 < tablefilepath
abc 1 1 1
bcd 14 25 7
cde 20 11 35
Datamash can also work with non-tabs, if you specify -t <delimiter>
. But tabs seem closest to the example input you have provided.
Datamash won't work if your input is delimited by arbitrary whitespace (i.e. possible multiple spaces intended to "look like" a tab). Still, even if that is what your data looks like, it is easily munged into the form expected by datamash:
sed -i 's/ +/t/g' tablefilepath
So long as your file is tab-delimited, datamash is a good fit for this.
$ datamash groupby 1 sum 2 sum 3 sum 4 < tablefilepath
abc 1 1 1
bcd 14 25 7
cde 20 11 35
Datamash can also work with non-tabs, if you specify -t <delimiter>
. But tabs seem closest to the example input you have provided.
Datamash won't work if your input is delimited by arbitrary whitespace (i.e. possible multiple spaces intended to "look like" a tab). Still, even if that is what your data looks like, it is easily munged into the form expected by datamash:
sed -i 's/ +/t/g' tablefilepath
answered 1 hour ago
cryptarch
3465
3465
1
At least in recent versions, there's a-W
(--whitespace
) option that should allow arbitrary whitespace delimiters
– steeldriver
1 hour ago
@steeldriver Thanks!
– cryptarch
26 mins ago
add a comment |
1
At least in recent versions, there's a-W
(--whitespace
) option that should allow arbitrary whitespace delimiters
– steeldriver
1 hour ago
@steeldriver Thanks!
– cryptarch
26 mins ago
1
1
At least in recent versions, there's a
-W
(--whitespace
) option that should allow arbitrary whitespace delimiters– steeldriver
1 hour ago
At least in recent versions, there's a
-W
(--whitespace
) option that should allow arbitrary whitespace delimiters– steeldriver
1 hour ago
@steeldriver Thanks!
– cryptarch
26 mins ago
@steeldriver Thanks!
– cryptarch
26 mins ago
add a comment |
up vote
1
down vote
Using awk
summing up the columns 2-4 based on 1.
awk -v FS="t" -v OFS="t" '{ col1[$1]+=$2; col2[$1]+=$3; col3[$1]+=$4; next } END { for ( i in col1) print i, col1[i], col2[i], col3[i] }' file
add a comment |
up vote
1
down vote
Using awk
summing up the columns 2-4 based on 1.
awk -v FS="t" -v OFS="t" '{ col1[$1]+=$2; col2[$1]+=$3; col3[$1]+=$4; next } END { for ( i in col1) print i, col1[i], col2[i], col3[i] }' file
add a comment |
up vote
1
down vote
up vote
1
down vote
Using awk
summing up the columns 2-4 based on 1.
awk -v FS="t" -v OFS="t" '{ col1[$1]+=$2; col2[$1]+=$3; col3[$1]+=$4; next } END { for ( i in col1) print i, col1[i], col2[i], col3[i] }' file
Using awk
summing up the columns 2-4 based on 1.
awk -v FS="t" -v OFS="t" '{ col1[$1]+=$2; col2[$1]+=$3; col3[$1]+=$4; next } END { for ( i in col1) print i, col1[i], col2[i], col3[i] }' file
answered 1 hour ago
Inian
3,795824
3,795824
add a comment |
add a comment |
up vote
1
down vote
You were fairly close.
You see what you were doing wrong, don't you?
You were keeping one total for each column 1 value,
when you should have been keeping three.
This is similar to Inian's answer,
but trivially extendable to handle any number of columns:
awk -F"t" '{for(n=2;n<=NF; ++n) a[$1][n]+=$n}
END {for(i in a) {
printf "%s", i
for (n=2; n<=4; ++n) printf "t%s", a[i][n]
printf "n"
}
}'
Rather than keep three arrays, like Inian's answer,
it keeps a two-dimensional array.
+1
for the simplification. Well done on the multi-dim arrays
– Inian
51 mins ago
add a comment |
up vote
1
down vote
You were fairly close.
You see what you were doing wrong, don't you?
You were keeping one total for each column 1 value,
when you should have been keeping three.
This is similar to Inian's answer,
but trivially extendable to handle any number of columns:
awk -F"t" '{for(n=2;n<=NF; ++n) a[$1][n]+=$n}
END {for(i in a) {
printf "%s", i
for (n=2; n<=4; ++n) printf "t%s", a[i][n]
printf "n"
}
}'
Rather than keep three arrays, like Inian's answer,
it keeps a two-dimensional array.
+1
for the simplification. Well done on the multi-dim arrays
– Inian
51 mins ago
add a comment |
up vote
1
down vote
up vote
1
down vote
You were fairly close.
You see what you were doing wrong, don't you?
You were keeping one total for each column 1 value,
when you should have been keeping three.
This is similar to Inian's answer,
but trivially extendable to handle any number of columns:
awk -F"t" '{for(n=2;n<=NF; ++n) a[$1][n]+=$n}
END {for(i in a) {
printf "%s", i
for (n=2; n<=4; ++n) printf "t%s", a[i][n]
printf "n"
}
}'
Rather than keep three arrays, like Inian's answer,
it keeps a two-dimensional array.
You were fairly close.
You see what you were doing wrong, don't you?
You were keeping one total for each column 1 value,
when you should have been keeping three.
This is similar to Inian's answer,
but trivially extendable to handle any number of columns:
awk -F"t" '{for(n=2;n<=NF; ++n) a[$1][n]+=$n}
END {for(i in a) {
printf "%s", i
for (n=2; n<=4; ++n) printf "t%s", a[i][n]
printf "n"
}
}'
Rather than keep three arrays, like Inian's answer,
it keeps a two-dimensional array.
answered 55 mins ago
Scott
6,69742650
6,69742650
+1
for the simplification. Well done on the multi-dim arrays
– Inian
51 mins ago
add a comment |
+1
for the simplification. Well done on the multi-dim arrays
– Inian
51 mins ago
+1
for the simplification. Well done on the multi-dim arrays– Inian
51 mins ago
+1
for the simplification. Well done on the multi-dim arrays– Inian
51 mins ago
add a comment |
awkprob is a new contributor. Be nice, and check out our Code of Conduct.
awkprob is a new contributor. Be nice, and check out our Code of Conduct.
awkprob is a new contributor. Be nice, and check out our Code of Conduct.
awkprob is a new contributor. Be nice, and check out our Code of Conduct.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2funix.stackexchange.com%2fquestions%2f484370%2fhow-to-get-sum-of-values-in-column-based-on-variables-in-other-column-separately%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown