Pandas DataFrame Groupby two columns and get counts

I have a pandas dataframe in the following format:

df = pd.DataFrame([[1.1, 1.1, 1.1, 2.6, 2.5, 3.4,2.6,2.6,3.4,3.4,2.6,1.1,1.1,3.3], list('AAABBBBABCBDDD'), [1.1, 1.7, 2.5, 2.6, 3.3, 3.8,4.0,4.2,4.3,4.5,4.6,4.7,4.7,4.8], ['x/y/z','x/y','x/y/z/n','x/u','x','x/u/v','x/y/z','x','x/u/v/b','-','x/y','x/y/z','x','x/u/v/w'],['1','3','3','2','4','2','5','3','6','3','5','1','1','1']]).T

df.columns = ['col1','col2','col3','col4','col5']

df:

   col1 col2 col3     col4 col5

0   1.1    A  1.1    x/y/z    1

1   1.1    A  1.7      x/y    3

2   1.1    A  2.5  x/y/z/n    3

3   2.6    B  2.6      x/u    2

4   2.5    B  3.3        x    4

5   3.4    B  3.8    x/u/v    2

6   2.6    B    4    x/y/z    5

7   2.6    A  4.2        x    3

8   3.4    B  4.3  x/u/v/b    6

9   3.4    C  4.5        -    3

10  2.6    B  4.6      x/y    5

11  1.1    D  4.7    x/y/z    1

12  1.1    D  4.7        x    1

13  3.3    D  4.8  x/u/v/w    1

Now I want to group this by two columns like following:

df.groupby(['col5','col2']).reset_index()

OutPut:

             index col1 col2 col3     col4 col5

col5 col2                                      

1    A    0      0  1.1    A  1.1    x/y/z    1

     D    0     11  1.1    D  4.7    x/y/z    1

          1     12  1.1    D  4.7        x    1

          2     13  3.3    D  4.8  x/u/v/w    1

2    B    0      3  2.6    B  2.6      x/u    2

          1      5  3.4    B  3.8    x/u/v    2

3    A    0      1  1.1    A  1.7      x/y    3

          1      2  1.1    A  2.5  x/y/z/n    3

          2      7  2.6    A  4.2        x    3

     C    0      9  3.4    C  4.5        -    3

4    B    0      4  2.5    B  3.3        x    4

5    B    0      6  2.6    B    4    x/y/z    5

          1     10  2.6    B  4.6      x/y    5

6    B    0      8  3.4    B  4.3  x/u/v/b    6

I want to get the count by each row like following.
Expected Output:

col5 col2 count

1    A      1

     D      3

2    B      2

etc...

How to get my expected output? And I want to find largest count for each 'col2' value?

asked Jul 16 '13 at 14:19

Nilani Algiriyage

6,319216192

A very similar question just came up yesterday.. see here.

– bdiamante
Jul 16 '13 at 14:29

For an idiomatic solution that only uses a single groupby, see this answer below

– Ted Petrou
Nov 5 '17 at 19:38

Note on performance, including alternatives: Pandas groupby.size vs series.value_counts vs collections.Counter with multiple series

– jpp
Jun 25 '18 at 14:01

add a comment |

I have a pandas dataframe in the following format:

df = pd.DataFrame([[1.1, 1.1, 1.1, 2.6, 2.5, 3.4,2.6,2.6,3.4,3.4,2.6,1.1,1.1,3.3], list('AAABBBBABCBDDD'), [1.1, 1.7, 2.5, 2.6, 3.3, 3.8,4.0,4.2,4.3,4.5,4.6,4.7,4.7,4.8], ['x/y/z','x/y','x/y/z/n','x/u','x','x/u/v','x/y/z','x','x/u/v/b','-','x/y','x/y/z','x','x/u/v/w'],['1','3','3','2','4','2','5','3','6','3','5','1','1','1']]).T

df.columns = ['col1','col2','col3','col4','col5']

df:

   col1 col2 col3     col4 col5

0   1.1    A  1.1    x/y/z    1

1   1.1    A  1.7      x/y    3

2   1.1    A  2.5  x/y/z/n    3

3   2.6    B  2.6      x/u    2

4   2.5    B  3.3        x    4

5   3.4    B  3.8    x/u/v    2

6   2.6    B    4    x/y/z    5

7   2.6    A  4.2        x    3

8   3.4    B  4.3  x/u/v/b    6

9   3.4    C  4.5        -    3

10  2.6    B  4.6      x/y    5

11  1.1    D  4.7    x/y/z    1

12  1.1    D  4.7        x    1

13  3.3    D  4.8  x/u/v/w    1

Now I want to group this by two columns like following:

df.groupby(['col5','col2']).reset_index()

OutPut:

             index col1 col2 col3     col4 col5

col5 col2                                      

1    A    0      0  1.1    A  1.1    x/y/z    1

     D    0     11  1.1    D  4.7    x/y/z    1

          1     12  1.1    D  4.7        x    1

          2     13  3.3    D  4.8  x/u/v/w    1

2    B    0      3  2.6    B  2.6      x/u    2

          1      5  3.4    B  3.8    x/u/v    2

3    A    0      1  1.1    A  1.7      x/y    3

          1      2  1.1    A  2.5  x/y/z/n    3

          2      7  2.6    A  4.2        x    3

     C    0      9  3.4    C  4.5        -    3

4    B    0      4  2.5    B  3.3        x    4

5    B    0      6  2.6    B    4    x/y/z    5

          1     10  2.6    B  4.6      x/y    5

6    B    0      8  3.4    B  4.3  x/u/v/b    6

I want to get the count by each row like following.
Expected Output:

col5 col2 count

1    A      1

     D      3

2    B      2

etc...

How to get my expected output? And I want to find largest count for each 'col2' value?

asked Jul 16 '13 at 14:19

Nilani Algiriyage

6,319216192

A very similar question just came up yesterday.. see here.

– bdiamante
Jul 16 '13 at 14:29

For an idiomatic solution that only uses a single groupby, see this answer below

– Ted Petrou
Nov 5 '17 at 19:38

Note on performance, including alternatives: Pandas groupby.size vs series.value_counts vs collections.Counter with multiple series

– jpp
Jun 25 '18 at 14:01

add a comment |

I have a pandas dataframe in the following format:

df = pd.DataFrame([[1.1, 1.1, 1.1, 2.6, 2.5, 3.4,2.6,2.6,3.4,3.4,2.6,1.1,1.1,3.3], list('AAABBBBABCBDDD'), [1.1, 1.7, 2.5, 2.6, 3.3, 3.8,4.0,4.2,4.3,4.5,4.6,4.7,4.7,4.8], ['x/y/z','x/y','x/y/z/n','x/u','x','x/u/v','x/y/z','x','x/u/v/b','-','x/y','x/y/z','x','x/u/v/w'],['1','3','3','2','4','2','5','3','6','3','5','1','1','1']]).T

df.columns = ['col1','col2','col3','col4','col5']

df:

   col1 col2 col3     col4 col5

0   1.1    A  1.1    x/y/z    1

1   1.1    A  1.7      x/y    3

2   1.1    A  2.5  x/y/z/n    3

3   2.6    B  2.6      x/u    2

4   2.5    B  3.3        x    4

5   3.4    B  3.8    x/u/v    2

6   2.6    B    4    x/y/z    5

7   2.6    A  4.2        x    3

8   3.4    B  4.3  x/u/v/b    6

9   3.4    C  4.5        -    3

10  2.6    B  4.6      x/y    5

11  1.1    D  4.7    x/y/z    1

12  1.1    D  4.7        x    1

13  3.3    D  4.8  x/u/v/w    1

Now I want to group this by two columns like following:

df.groupby(['col5','col2']).reset_index()

OutPut:

             index col1 col2 col3     col4 col5

col5 col2                                      

1    A    0      0  1.1    A  1.1    x/y/z    1

     D    0     11  1.1    D  4.7    x/y/z    1

          1     12  1.1    D  4.7        x    1

          2     13  3.3    D  4.8  x/u/v/w    1

2    B    0      3  2.6    B  2.6      x/u    2

          1      5  3.4    B  3.8    x/u/v    2

3    A    0      1  1.1    A  1.7      x/y    3

          1      2  1.1    A  2.5  x/y/z/n    3

          2      7  2.6    A  4.2        x    3

     C    0      9  3.4    C  4.5        -    3

4    B    0      4  2.5    B  3.3        x    4

5    B    0      6  2.6    B    4    x/y/z    5

          1     10  2.6    B  4.6      x/y    5

6    B    0      8  3.4    B  4.3  x/u/v/b    6

I want to get the count by each row like following.
Expected Output:

col5 col2 count

1    A      1

     D      3

2    B      2

etc...

How to get my expected output? And I want to find largest count for each 'col2' value?

asked Jul 16 '13 at 14:19

Nilani Algiriyage

6,319216192

I have a pandas dataframe in the following format:

df = pd.DataFrame([[1.1, 1.1, 1.1, 2.6, 2.5, 3.4,2.6,2.6,3.4,3.4,2.6,1.1,1.1,3.3], list('AAABBBBABCBDDD'), [1.1, 1.7, 2.5, 2.6, 3.3, 3.8,4.0,4.2,4.3,4.5,4.6,4.7,4.7,4.8], ['x/y/z','x/y','x/y/z/n','x/u','x','x/u/v','x/y/z','x','x/u/v/b','-','x/y','x/y/z','x','x/u/v/w'],['1','3','3','2','4','2','5','3','6','3','5','1','1','1']]).T

df.columns = ['col1','col2','col3','col4','col5']

df:

   col1 col2 col3     col4 col5

0   1.1    A  1.1    x/y/z    1

1   1.1    A  1.7      x/y    3

2   1.1    A  2.5  x/y/z/n    3

3   2.6    B  2.6      x/u    2

4   2.5    B  3.3        x    4

5   3.4    B  3.8    x/u/v    2

6   2.6    B    4    x/y/z    5

7   2.6    A  4.2        x    3

8   3.4    B  4.3  x/u/v/b    6

9   3.4    C  4.5        -    3

10  2.6    B  4.6      x/y    5

11  1.1    D  4.7    x/y/z    1

12  1.1    D  4.7        x    1

13  3.3    D  4.8  x/u/v/w    1

Now I want to group this by two columns like following:

df.groupby(['col5','col2']).reset_index()

OutPut:

             index col1 col2 col3     col4 col5

col5 col2                                      

1    A    0      0  1.1    A  1.1    x/y/z    1

     D    0     11  1.1    D  4.7    x/y/z    1

          1     12  1.1    D  4.7        x    1

          2     13  3.3    D  4.8  x/u/v/w    1

2    B    0      3  2.6    B  2.6      x/u    2

          1      5  3.4    B  3.8    x/u/v    2

3    A    0      1  1.1    A  1.7      x/y    3

          1      2  1.1    A  2.5  x/y/z/n    3

          2      7  2.6    A  4.2        x    3

     C    0      9  3.4    C  4.5        -    3

4    B    0      4  2.5    B  3.3        x    4

5    B    0      6  2.6    B    4    x/y/z    5

          1     10  2.6    B  4.6      x/y    5

6    B    0      8  3.4    B  4.3  x/u/v/b    6

I want to get the count by each row like following.
Expected Output:

col5 col2 count

1    A      1

     D      3

2    B      2

etc...

How to get my expected output? And I want to find largest count for each 'col2' value?

python pandas dataframe

asked Jul 16 '13 at 14:19

Nilani Algiriyage

6,319216192

asked Jul 16 '13 at 14:19

Nilani Algiriyage

6,319216192

asked Jul 16 '13 at 14:19

Nilani Algiriyage

6,319216192

asked Jul 16 '13 at 14:19

Nilani Algiriyage

6,319216192

asked Jul 16 '13 at 14:19

Nilani Algiriyage

6,319216192

A very similar question just came up yesterday.. see here.

– bdiamante
Jul 16 '13 at 14:29

For an idiomatic solution that only uses a single groupby, see this answer below

– Ted Petrou
Nov 5 '17 at 19:38

Note on performance, including alternatives: Pandas groupby.size vs series.value_counts vs collections.Counter with multiple series

– jpp
Jun 25 '18 at 14:01

add a comment |

A very similar question just came up yesterday.. see here.

– bdiamante
Jul 16 '13 at 14:29

For an idiomatic solution that only uses a single groupby, see this answer below

– Ted Petrou
Nov 5 '17 at 19:38

Note on performance, including alternatives: Pandas groupby.size vs series.value_counts vs collections.Counter with multiple series

– jpp
Jun 25 '18 at 14:01

A very similar question just came up yesterday.. see here.

– bdiamante
Jul 16 '13 at 14:29

For an idiomatic solution that only uses a single groupby, see this answer below

– Ted Petrou
Nov 5 '17 at 19:38

Note on performance, including alternatives: Pandas groupby.size vs series.value_counts vs collections.Counter with multiple series

– jpp
Jun 25 '18 at 14:01

add a comment |

6 Answers
6

active

oldest

votes

Followed by @Andy's answer, you can do following to solve your second question:

In [56]: df.groupby(['col5','col2']).size().reset_index().groupby('col2')[[0]].max()

Out[56]: 

      0

col2   

A     3

B     2

C     1

D     3

answered Jul 16 '13 at 14:53

waitingkuo

36.3k178699

1

Can I get "col5" values for this like C...1...3?

– Nilani Algiriyage
Jul 16 '13 at 15:12

add a comment |

You are looking for size:

In [11]: df.groupby(['col5', 'col2']).size()

Out[11]:

col5  col2

1     A       1

      D       3

2     B       2

3     A       3

      C       1

4     B       1

5     B       2

6     B       1

dtype: int64

To get the same answer as waitingkuo (the "second question"), but slightly cleaner, is to groupby the level:

In [12]: df.groupby(['col5', 'col2']).size().groupby(level=1).max()

Out[12]:

col2

A       3

B       2

C       1

D       3

dtype: int64

edited Apr 2 '15 at 16:45

answered Jul 16 '13 at 14:37

Andy Hayden

183k52431419

1

I don't know Why I forgot this :O, Any way what about my second question?Find largest count for each "col2" value and get corresponding "col5" value?

– Nilani Algiriyage
Jul 16 '13 at 14:40

add a comment |

Inserting data into a pandas dataframe and providing column name.

import pandas as pd

df = pd.DataFrame([['A','C','A','B','C','A','B','B','A','A'], ['ONE','TWO','ONE','ONE','ONE','TWO','ONE','TWO','ONE','THREE']]).T

df.columns = [['Alphabet','Words']]

print(df)   #printing dataframe.

This is our printed data:

enter image description here

For making a group of dataframe in pandas and counter,

You need to provide one more column which counts the grouping, let's call that column as, "COUNTER" in dataframe.

Like this:

df['COUNTER'] =1       #initially, set that counter to 1.

group_data = df.groupby(['Alphabet','Words'])['COUNTER'].sum() #sum function

print(group_data)

OUTPUT:

enter image description here

edited Sep 20 '17 at 12:02

Vaibhav Mule

3,06112444

answered Jul 21 '16 at 11:53

The Gr8 Adakron

588612

2

How can I get the Alphabet column (eg.A) to repeat below and not leave the gaps in the first column ??

– blissweb
Jan 6 '18 at 6:54

add a comment |

Idiomatic solution that uses only a single groupby

df.groupby(['col5', 'col2']).size() 

  .sort_values(ascending=False) 

  .reset_index(name='count') 

  .drop_duplicates(subset='col2')



  col5 col2  count

0    3    A      3

1    1    D      3

2    5    B      2

6    3    C      1

Explanation

The result of the groupby size method is a Series with col5 and col2 in the index. From here, you can use another groupby method to find the maximum value of each value in col2 but it is not necessary to do. You can simply sort all the values descendingly and then keep only the rows with the first occurrence of col2 with the drop_duplicates method.

answered Nov 5 '17 at 19:37

Ted Petrou

22.6k87166

There is no param called name in reset_index() in the current version of pandas: pandas.pydata.org/pandas-docs/stable/generated/…

– mmBs
Nov 25 '18 at 21:27

pandas.pydata.org/pandas-docs/stable/generated/…

– Ted Petrou
Nov 26 '18 at 11:20

Ok, my bad. I used it when working with DataFrame not Series. Thanks for the link.

– mmBs
Nov 26 '18 at 11:42

add a comment |

Should you want to add a new column (say 'count_column') containing the groups' counts into the dataframe:

df.count_column=df.groupby(['col5','col2']).col5.transform('count')

(I picked 'col5' as it contains no nan)

answered Jun 6 '17 at 12:20

Tom

525

add a comment |

-2

You can just use the built-in function count follow by the groupby function

df.groupby(['col5','col2']).count()

answered Dec 2 '16 at 2:22

seansio1995

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f17679089%2fpandas-dataframe-groupby-two-columns-and-get-counts%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

6 Answers
6

active

oldest

votes

6 Answers
6

active

oldest

votes

Followed by @Andy's answer, you can do following to solve your second question:

In [56]: df.groupby(['col5','col2']).size().reset_index().groupby('col2')[[0]].max()

Out[56]: 

      0

col2   

A     3

B     2

C     1

D     3

answered Jul 16 '13 at 14:53

waitingkuo

36.3k178699

1

Can I get "col5" values for this like C...1...3?

– Nilani Algiriyage
Jul 16 '13 at 15:12

add a comment |

Followed by @Andy's answer, you can do following to solve your second question:

In [56]: df.groupby(['col5','col2']).size().reset_index().groupby('col2')[[0]].max()

Out[56]: 

      0

col2   

A     3

B     2

C     1

D     3

answered Jul 16 '13 at 14:53

waitingkuo

36.3k178699

1

Can I get "col5" values for this like C...1...3?

– Nilani Algiriyage
Jul 16 '13 at 15:12

add a comment |

Followed by @Andy's answer, you can do following to solve your second question:

In [56]: df.groupby(['col5','col2']).size().reset_index().groupby('col2')[[0]].max()

Out[56]: 

      0

col2   

A     3

B     2

C     1

D     3

answered Jul 16 '13 at 14:53

waitingkuo

36.3k178699

Followed by @Andy's answer, you can do following to solve your second question:

In [56]: df.groupby(['col5','col2']).size().reset_index().groupby('col2')[[0]].max()

Out[56]: 

      0

col2   

A     3

B     2

C     1

D     3

answered Jul 16 '13 at 14:53

waitingkuo

36.3k178699

answered Jul 16 '13 at 14:53

waitingkuo

36.3k178699

answered Jul 16 '13 at 14:53

waitingkuo

36.3k178699

answered Jul 16 '13 at 14:53

waitingkuo

36.3k178699

1

Can I get "col5" values for this like C...1...3?

– Nilani Algiriyage
Jul 16 '13 at 15:12

add a comment |

1

Can I get "col5" values for this like C...1...3?

– Nilani Algiriyage
Jul 16 '13 at 15:12

Can I get "col5" values for this like C...1...3?

– Nilani Algiriyage
Jul 16 '13 at 15:12

add a comment |

You are looking for size:

In [11]: df.groupby(['col5', 'col2']).size()

Out[11]:

col5  col2

1     A       1

      D       3

2     B       2

3     A       3

      C       1

4     B       1

5     B       2

6     B       1

dtype: int64

To get the same answer as waitingkuo (the "second question"), but slightly cleaner, is to groupby the level:

In [12]: df.groupby(['col5', 'col2']).size().groupby(level=1).max()

Out[12]:

col2

A       3

B       2

C       1

D       3

dtype: int64

edited Apr 2 '15 at 16:45

answered Jul 16 '13 at 14:37

Andy Hayden

183k52431419

1

I don't know Why I forgot this :O, Any way what about my second question?Find largest count for each "col2" value and get corresponding "col5" value?

– Nilani Algiriyage
Jul 16 '13 at 14:40

add a comment |

You are looking for size:

In [11]: df.groupby(['col5', 'col2']).size()

Out[11]:

col5  col2

1     A       1

      D       3

2     B       2

3     A       3

      C       1

4     B       1

5     B       2

6     B       1

dtype: int64

To get the same answer as waitingkuo (the "second question"), but slightly cleaner, is to groupby the level:

In [12]: df.groupby(['col5', 'col2']).size().groupby(level=1).max()

Out[12]:

col2

A       3

B       2

C       1

D       3

dtype: int64

edited Apr 2 '15 at 16:45

answered Jul 16 '13 at 14:37

Andy Hayden

183k52431419

1

I don't know Why I forgot this :O, Any way what about my second question?Find largest count for each "col2" value and get corresponding "col5" value?

– Nilani Algiriyage
Jul 16 '13 at 14:40

add a comment |

You are looking for size:

In [11]: df.groupby(['col5', 'col2']).size()

Out[11]:

col5  col2

1     A       1

      D       3

2     B       2

3     A       3

      C       1

4     B       1

5     B       2

6     B       1

dtype: int64

To get the same answer as waitingkuo (the "second question"), but slightly cleaner, is to groupby the level:

In [12]: df.groupby(['col5', 'col2']).size().groupby(level=1).max()

Out[12]:

col2

A       3

B       2

C       1

D       3

dtype: int64

edited Apr 2 '15 at 16:45

answered Jul 16 '13 at 14:37

Andy Hayden

183k52431419

You are looking for size:

In [11]: df.groupby(['col5', 'col2']).size()

Out[11]:

col5  col2

1     A       1

      D       3

2     B       2

3     A       3

      C       1

4     B       1

5     B       2

6     B       1

dtype: int64

To get the same answer as waitingkuo (the "second question"), but slightly cleaner, is to groupby the level:

In [12]: df.groupby(['col5', 'col2']).size().groupby(level=1).max()

Out[12]:

col2

A       3

B       2

C       1

D       3

dtype: int64

edited Apr 2 '15 at 16:45

answered Jul 16 '13 at 14:37

Andy Hayden

183k52431419

edited Apr 2 '15 at 16:45

answered Jul 16 '13 at 14:37

Andy Hayden

183k52431419

answered Jul 16 '13 at 14:37

Andy Hayden

183k52431419

answered Jul 16 '13 at 14:37

Andy Hayden

183k52431419

1

I don't know Why I forgot this :O, Any way what about my second question?Find largest count for each "col2" value and get corresponding "col5" value?

– Nilani Algiriyage
Jul 16 '13 at 14:40

add a comment |

1

I don't know Why I forgot this :O, Any way what about my second question?Find largest count for each "col2" value and get corresponding "col5" value?

– Nilani Algiriyage
Jul 16 '13 at 14:40

I don't know Why I forgot this :O, Any way what about my second question?Find largest count for each "col2" value and get corresponding "col5" value?

– Nilani Algiriyage
Jul 16 '13 at 14:40

add a comment |

Inserting data into a pandas dataframe and providing column name.

import pandas as pd

df = pd.DataFrame([['A','C','A','B','C','A','B','B','A','A'], ['ONE','TWO','ONE','ONE','ONE','TWO','ONE','TWO','ONE','THREE']]).T

df.columns = [['Alphabet','Words']]

print(df)   #printing dataframe.

This is our printed data:

enter image description here

For making a group of dataframe in pandas and counter,

You need to provide one more column which counts the grouping, let's call that column as, "COUNTER" in dataframe.

Like this:

df['COUNTER'] =1       #initially, set that counter to 1.

group_data = df.groupby(['Alphabet','Words'])['COUNTER'].sum() #sum function

print(group_data)

OUTPUT:

enter image description here

edited Sep 20 '17 at 12:02

Vaibhav Mule

3,06112444

answered Jul 21 '16 at 11:53

The Gr8 Adakron

588612

2

How can I get the Alphabet column (eg.A) to repeat below and not leave the gaps in the first column ??

– blissweb
Jan 6 '18 at 6:54

add a comment |

Inserting data into a pandas dataframe and providing column name.

import pandas as pd

df = pd.DataFrame([['A','C','A','B','C','A','B','B','A','A'], ['ONE','TWO','ONE','ONE','ONE','TWO','ONE','TWO','ONE','THREE']]).T

df.columns = [['Alphabet','Words']]

print(df)   #printing dataframe.

This is our printed data:

enter image description here

For making a group of dataframe in pandas and counter,

You need to provide one more column which counts the grouping, let's call that column as, "COUNTER" in dataframe.

Like this:

df['COUNTER'] =1       #initially, set that counter to 1.

group_data = df.groupby(['Alphabet','Words'])['COUNTER'].sum() #sum function

print(group_data)

OUTPUT:

enter image description here

edited Sep 20 '17 at 12:02

Vaibhav Mule

3,06112444

answered Jul 21 '16 at 11:53

The Gr8 Adakron

588612

2

How can I get the Alphabet column (eg.A) to repeat below and not leave the gaps in the first column ??

– blissweb
Jan 6 '18 at 6:54

add a comment |

Inserting data into a pandas dataframe and providing column name.

import pandas as pd

df = pd.DataFrame([['A','C','A','B','C','A','B','B','A','A'], ['ONE','TWO','ONE','ONE','ONE','TWO','ONE','TWO','ONE','THREE']]).T

df.columns = [['Alphabet','Words']]

print(df)   #printing dataframe.

This is our printed data:

enter image description here

For making a group of dataframe in pandas and counter,

You need to provide one more column which counts the grouping, let's call that column as, "COUNTER" in dataframe.

Like this:

df['COUNTER'] =1       #initially, set that counter to 1.

group_data = df.groupby(['Alphabet','Words'])['COUNTER'].sum() #sum function

print(group_data)

OUTPUT:

enter image description here

edited Sep 20 '17 at 12:02

Vaibhav Mule

3,06112444

answered Jul 21 '16 at 11:53

The Gr8 Adakron

588612

Inserting data into a pandas dataframe and providing column name.

import pandas as pd

df = pd.DataFrame([['A','C','A','B','C','A','B','B','A','A'], ['ONE','TWO','ONE','ONE','ONE','TWO','ONE','TWO','ONE','THREE']]).T

df.columns = [['Alphabet','Words']]

print(df)   #printing dataframe.

This is our printed data:

enter image description here

For making a group of dataframe in pandas and counter,

You need to provide one more column which counts the grouping, let's call that column as, "COUNTER" in dataframe.

Like this:

df['COUNTER'] =1       #initially, set that counter to 1.

group_data = df.groupby(['Alphabet','Words'])['COUNTER'].sum() #sum function

print(group_data)

OUTPUT:

enter image description here

edited Sep 20 '17 at 12:02

Vaibhav Mule

3,06112444

answered Jul 21 '16 at 11:53

The Gr8 Adakron

588612

edited Sep 20 '17 at 12:02

Vaibhav Mule

3,06112444

edited Sep 20 '17 at 12:02

Vaibhav Mule

3,06112444

edited Sep 20 '17 at 12:02

Vaibhav Mule

3,06112444

answered Jul 21 '16 at 11:53

The Gr8 Adakron

588612

answered Jul 21 '16 at 11:53

The Gr8 Adakron

588612

answered Jul 21 '16 at 11:53

The Gr8 Adakron

588612

2

How can I get the Alphabet column (eg.A) to repeat below and not leave the gaps in the first column ??

– blissweb
Jan 6 '18 at 6:54

add a comment |

2

How can I get the Alphabet column (eg.A) to repeat below and not leave the gaps in the first column ??

– blissweb
Jan 6 '18 at 6:54

How can I get the Alphabet column (eg.A) to repeat below and not leave the gaps in the first column ??

– blissweb
Jan 6 '18 at 6:54

add a comment |

Idiomatic solution that uses only a single groupby

df.groupby(['col5', 'col2']).size() 

  .sort_values(ascending=False) 

  .reset_index(name='count') 

  .drop_duplicates(subset='col2')



  col5 col2  count

0    3    A      3

1    1    D      3

2    5    B      2

6    3    C      1

Explanation

answered Nov 5 '17 at 19:37

Ted Petrou

22.6k87166

There is no param called name in reset_index() in the current version of pandas: pandas.pydata.org/pandas-docs/stable/generated/…

– mmBs
Nov 25 '18 at 21:27

pandas.pydata.org/pandas-docs/stable/generated/…

– Ted Petrou
Nov 26 '18 at 11:20

Ok, my bad. I used it when working with DataFrame not Series. Thanks for the link.

– mmBs
Nov 26 '18 at 11:42

add a comment |

Idiomatic solution that uses only a single groupby

df.groupby(['col5', 'col2']).size() 

  .sort_values(ascending=False) 

  .reset_index(name='count') 

  .drop_duplicates(subset='col2')



  col5 col2  count

0    3    A      3

1    1    D      3

2    5    B      2

6    3    C      1

Explanation

answered Nov 5 '17 at 19:37

Ted Petrou

22.6k87166

There is no param called name in reset_index() in the current version of pandas: pandas.pydata.org/pandas-docs/stable/generated/…

– mmBs
Nov 25 '18 at 21:27

pandas.pydata.org/pandas-docs/stable/generated/…

– Ted Petrou
Nov 26 '18 at 11:20

Ok, my bad. I used it when working with DataFrame not Series. Thanks for the link.

– mmBs
Nov 26 '18 at 11:42

add a comment |

Idiomatic solution that uses only a single groupby

df.groupby(['col5', 'col2']).size() 

  .sort_values(ascending=False) 

  .reset_index(name='count') 

  .drop_duplicates(subset='col2')



  col5 col2  count

0    3    A      3

1    1    D      3

2    5    B      2

6    3    C      1

Explanation

answered Nov 5 '17 at 19:37

Ted Petrou

22.6k87166

Idiomatic solution that uses only a single groupby

df.groupby(['col5', 'col2']).size() 

  .sort_values(ascending=False) 

  .reset_index(name='count') 

  .drop_duplicates(subset='col2')



  col5 col2  count

0    3    A      3

1    1    D      3

2    5    B      2

6    3    C      1

Explanation

answered Nov 5 '17 at 19:37

Ted Petrou

22.6k87166

answered Nov 5 '17 at 19:37

Ted Petrou

22.6k87166

answered Nov 5 '17 at 19:37

Ted Petrou

22.6k87166

answered Nov 5 '17 at 19:37

Ted Petrou

22.6k87166

There is no param called name in reset_index() in the current version of pandas: pandas.pydata.org/pandas-docs/stable/generated/…

– mmBs
Nov 25 '18 at 21:27

pandas.pydata.org/pandas-docs/stable/generated/…

– Ted Petrou
Nov 26 '18 at 11:20

Ok, my bad. I used it when working with DataFrame not Series. Thanks for the link.

– mmBs
Nov 26 '18 at 11:42

add a comment |

There is no param called name in reset_index() in the current version of pandas: pandas.pydata.org/pandas-docs/stable/generated/…

– mmBs
Nov 25 '18 at 21:27

pandas.pydata.org/pandas-docs/stable/generated/…

– Ted Petrou
Nov 26 '18 at 11:20

Ok, my bad. I used it when working with DataFrame not Series. Thanks for the link.

– mmBs
Nov 26 '18 at 11:42

There is no param called name in reset_index() in the current version of pandas: pandas.pydata.org/pandas-docs/stable/generated/…

– mmBs
Nov 25 '18 at 21:27

pandas.pydata.org/pandas-docs/stable/generated/…

– Ted Petrou
Nov 26 '18 at 11:20

Ok, my bad. I used it when working with DataFrame not Series. Thanks for the link.

– mmBs
Nov 26 '18 at 11:42

add a comment |

Should you want to add a new column (say 'count_column') containing the groups' counts into the dataframe:

df.count_column=df.groupby(['col5','col2']).col5.transform('count')

(I picked 'col5' as it contains no nan)

answered Jun 6 '17 at 12:20

Tom

525

add a comment |

Should you want to add a new column (say 'count_column') containing the groups' counts into the dataframe:

df.count_column=df.groupby(['col5','col2']).col5.transform('count')

(I picked 'col5' as it contains no nan)

answered Jun 6 '17 at 12:20

Tom

525

add a comment |

Should you want to add a new column (say 'count_column') containing the groups' counts into the dataframe:

df.count_column=df.groupby(['col5','col2']).col5.transform('count')

(I picked 'col5' as it contains no nan)

answered Jun 6 '17 at 12:20

Tom

525

Should you want to add a new column (say 'count_column') containing the groups' counts into the dataframe:

df.count_column=df.groupby(['col5','col2']).col5.transform('count')

(I picked 'col5' as it contains no nan)

answered Jun 6 '17 at 12:20

Tom

525

answered Jun 6 '17 at 12:20

Tom

525

answered Jun 6 '17 at 12:20

Tom

525

answered Jun 6 '17 at 12:20

Tom

525

add a comment |

-2

You can just use the built-in function count follow by the groupby function

df.groupby(['col5','col2']).count()

answered Dec 2 '16 at 2:22

seansio1995

add a comment |

-2

You can just use the built-in function count follow by the groupby function

df.groupby(['col5','col2']).count()

answered Dec 2 '16 at 2:22

seansio1995

add a comment |

-2

You can just use the built-in function count follow by the groupby function

df.groupby(['col5','col2']).count()

answered Dec 2 '16 at 2:22

seansio1995

You can just use the built-in function count follow by the groupby function

df.groupby(['col5','col2']).count()

answered Dec 2 '16 at 2:22

seansio1995

answered Dec 2 '16 at 2:22

seansio1995

answered Dec 2 '16 at 2:22

seansio1995

answered Dec 2 '16 at 2:22

seansio1995

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Stack Overflow!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

s1GeOxBUrse DfQdMZ2yG9h2q uyOb5eMf1Jfpy7bT,UcBV5afoOIaughr EC5O,s 4

搜尋此網誌

Tukukkk