Count number of rows within each group

I have a dataframe and I would like to count the number of rows within each group. I regularly use the aggregate function to sum data as follows:

df2 <- aggregate(x ~ Year + Month, data = df1, sum)

Now, I would like to count observations but can't seem to find the proper argument for FUN. Intuitively, I thought it would be as follows:

df2 <- aggregate(x ~ Year + Month, data = df1, count)

But, no such luck.

Any ideas?

Some toy data:

set.seed(2)

df1 <- data.frame(x = 1:20,
                  Year = sample(2012:2014, 20, replace = TRUE),
                  Month = sample(month.abb[1:3], 20, replace = TRUE))

edited Jan 29 '17 at 10:58

Henrik

42k994110

asked Mar 21 '12 at 16:50

MikeTP

2,616103753

15

nrow, NROW, length...

– Joshua Ulrich
Mar 21 '12 at 16:54

13

I keep reading this question as asking for a fun way to count things (as opposed to the many unfun ways, I guess).

– Hong Ooi
Mar 22 '12 at 6:35

3

@JoshuaUlrich: nrow did not work for me but NROW and length worked fine. +1

– Prolix
Aug 11 '15 at 10:19
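A minimal sketch of why nrow fails here while NROW and length work, using the toy data from the question. It assumes, as far as I can tell from how aggregate behaves, that FUN is handed a plain vector for each group; nrow() returns NULL for a plain vector, whereas NROW() and length() return its length. The column name n is an illustrative choice, not something from the original post.

```r
set.seed(2)
df1 <- data.frame(x = 1:20,
                  Year = sample(2012:2014, 20, replace = TRUE),
                  Month = sample(month.abb[1:3], 20, replace = TRUE))

# For a plain vector, nrow() is NULL; NROW() treats it as a one-column matrix
v <- 1:5
is.null(nrow(v))  # TRUE
NROW(v)           # 5
length(v)         # 5

# Count rows per Year/Month group; x only serves as a column to measure
counts <- aggregate(x ~ Year + Month, data = df1, FUN = length)
names(counts)[names(counts) == "x"] <- "n"  # "n" is an illustrative name
counts
```

The group counts always add up to the number of rows in df1, which is a quick sanity check on the result.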

r dataframe r-faq


11 Answers

There is also df2 <- count(df1, c('Year','Month')) (plyr package)

edited Jun 6 '13 at 14:46

answered Jun 5 '13 at 13:48

geotheory

9,1781567133

Is there a way to aggregate a variable and do counting too (like 2 functions in one aggregation: mean + count)? I need to get the mean of a column and the number of rows for the same value in another column

– sop
May 15 '15 at 14:06

1

I'd cbind the results of aggregate(Sepal.Length ~ Species, iris, mean) and aggregate(Sepal.Length ~ Species, iris, length)

– geotheory
May 16 '15 at 22:28

I have done it, but it seems that I get 2 times each column except the one that is aggregated; so I have done a merge on them and it seems to be ok

– sop
May 18 '15 at 7:20

4

I don't know but this could be useful as well... df %>% group_by(group, variable) %>% mutate(count = n())

– Manoj Kumar
Dec 14 '16 at 17:57

1

Yes dplyr is best practice now.

– geotheory
Dec 15 '16 at 2:07
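On the mean-plus-count question raised in the comments above, here is a minimal base R sketch of the merge approach, using the built-in iris data mentioned there. The output column names mean_length and n are my own illustrative choices; renaming before merging is one way to avoid ending up with the same column twice.

```r
# Aggregate the same column twice, once per summary function
agg_mean  <- aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)
agg_count <- aggregate(Sepal.Length ~ Species, data = iris, FUN = length)

# Give each result a distinct (illustrative) name so the columns do not clash
names(agg_mean)[2]  <- "mean_length"
names(agg_count)[2] <- "n"

# Join on the grouping column rather than cbind-ing by position
both <- merge(agg_mean, agg_count, by = "Species")
both
```

Merging on the key rather than binding by position also guards against the two results being ordered differently.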


Following @Joshua's suggestion, here's one way you might count the number of observations in your df dataframe where Year = 2007 and Month = Nov (assuming they are columns):

nrow(df[df$Year == 2007 & df$Month == "Nov", ])

and with aggregate, following @GregSnow:

aggregate(x ~ Year + Month, data = df, FUN = length)

edited Mar 22 '12 at 6:31

answered Mar 21 '12 at 17:06

Ben

31.9k1398171


We can also use dplyr.

First, some data:

df <- data.frame(x = rep(1:6, rep(c(1, 2, 3), 2)), year = 1993:2004, month = c(1, 1:11))

Now the count:

library(dplyr)

count(df, year, month)

#piping

df %>% count(year, month)

We can also use a slightly longer version with piping and the n() function:

df %>%
  group_by(year, month) %>%
  summarise(number = n())

or the tally function:

df %>%
  group_by(year, month) %>%
  tally()
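A short sketch of the difference between collapsing to one row per group and attaching the group size to every row (the latter is what mutate(count = n()) does). It assumes a reasonably recent dplyr in which count() and add_count() accept a name argument; the small data frame and the column name number are made up for illustration.

```r
library(dplyr)

# Small hand-made data so the group sizes are easy to check by eye
df <- data.frame(year  = c(2012, 2012, 2013, 2013, 2013),
                 month = c("Jan", "Jan", "Jan", "Feb", "Feb"))

# One row per group, similar in spirit to group_by() + summarise(n())
collapsed <- df %>% count(year, month, name = "number")

# One row per original row, with the group size attached as a new column
annotated <- df %>% add_count(year, month, name = "number")
```

Use the collapsed form for a summary table, and the annotated form when later steps need the group size alongside the original rows.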

edited Dec 2 '18 at 18:45

Jaap

56.8k21122135

answered Aug 12 '15 at 21:55

jeremycg

18.9k44156


An old question without a data.table solution. So here goes...

Using .N

library(data.table)

DT <- data.table(df)

DT[, .N, by = list(year, month)]

answered Aug 2 '13 at 0:30

mnel

91.9k18219230


The simplest option to use with aggregate is the length function, which gives you the length of the vector in each subset. Sometimes function(x) sum( !is.na(x) ) is a little more robust, because it counts only the non-missing values.
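A minimal sketch of the difference, not part of the original answer. One caveat worth knowing: the formula interface of aggregate uses na.action = na.omit by default, so rows with a missing value are dropped before FUN is called; the two functions only disagree once na.action = na.pass is set. The data frame d and its column names are made up for illustration.

```r
d <- data.frame(g = c("a", "a", "b", "b", "b"),
                x = c(1, NA, 3, NA, NA))

# Default: incomplete rows are removed first, so length() sees only complete rows
n_default <- aggregate(x ~ g, data = d, FUN = length)

# Keep NA rows: length() now counts every row in the group
n_all <- aggregate(x ~ g, data = d, FUN = length, na.action = na.pass)

# Keep NA rows, but count only the non-missing values
n_nonmissing <- aggregate(x ~ g, data = d,
                          FUN = function(v) sum(!is.na(v)),
                          na.action = na.pass)
```

Here n_all gives 2 and 3 rows for groups a and b, while the other two give 1 and 1.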

answered Mar 21 '12 at 18:08

Greg Snow

40.5k35987


Create a new variable Count with a value of 1 for each row:

df1["Count"] <-1

Then aggregate dataframe, summing by the Count column:

df2 <- aggregate(df1[c("Count")], by=list(Year=df1$Year, Month=df1$Month), FUN=sum, na.rm=TRUE)

edited Aug 12 '15 at 21:44

David Arenburg

78.8k1195163

answered Aug 2 '13 at 0:16

Leroy Tyrone

3371315


An alternative to the aggregate() function in this case would be table() with as.data.frame(), which also shows which combinations of year and month have zero occurrences:

df<-data.frame(x=rep(1:6,rep(c(1,2,3),2)),year=1993:2004,month=c(1,1:11))



myAns<-as.data.frame(table(df[,c("year","month")]))

And without the zero-occurring combinations

myAns[which(myAns$Freq>0),]

answered Mar 21 '12 at 20:41

BenBarnes

15.2k54464


For my aggregations I usually end up wanting to see the mean and "how big is this group" (a.k.a. length).
So this is my handy snippet for those occasions:

agg.mean <- aggregate(columnToMean ~ columnToAggregateOn1*columnToAggregateOn2, yourDataFrame, FUN="mean")

agg.count <- aggregate(columnToMean ~ columnToAggregateOn1*columnToAggregateOn2, yourDataFrame, FUN="length")

aggcount <- agg.count$columnToMean

agg <- cbind(aggcount, agg.mean)

edited Jan 2 '16 at 0:31

answered Jan 5 '15 at 16:38

maze

1087


An SQL solution using the sqldf package:

library(sqldf)

sqldf("SELECT Year, Month, COUNT(*) as Freq
       FROM df1
       GROUP BY Year, Month")

answered May 29 '18 at 19:22

M-M

6,85661945


Regarding @Ben's answer: R would throw an error if df1 does not contain an x column. This can be solved elegantly with paste:

aggregate(paste(Year, Month) ~ Year + Month, data = df1, FUN = NROW)

Similarly, it can be generalized if more than two variables are used in grouping:

aggregate(paste(Year, Month, Day) ~ Year + Month + Day, data = df1, FUN = NROW)

answered Feb 22 '18 at 22:55

paudan

261


You can use the by function, e.g. by(df1$Year, df1$Month, count), where count comes from the plyr package. This produces a list with the counts for each group.

The output will look like:

df1$Month: Feb
     x freq
1 2012    1
2 2013    1
3 2014    5
---------------------------------------------------------------
df1$Month: Jan
     x freq
1 2012    5
2 2013    2
---------------------------------------------------------------
df1$Month: Mar
     x freq
1 2012    1
2 2013    3
3 2014    2
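If plyr is not available, a similar cross-tabulation can be sketched in base R with xtabs, whose formula interface mirrors the aggregate call in the question. This is an alternative technique rather than what the answer above does; the response column name n is an illustrative choice.

```r
set.seed(2)
df1 <- data.frame(x = 1:20,
                  Year = sample(2012:2014, 20, replace = TRUE),
                  Month = sample(month.abb[1:3], 20, replace = TRUE))

# Contingency table of Year by Month; empty combinations show up as 0
tab <- xtabs(~ Year + Month, data = df1)

# Long format with one row per combination, then drop the empty ones
counts  <- as.data.frame(tab, responseName = "n")
nonzero <- counts[counts$n > 0, ]
```

Note that as.data.frame turns Year and Month into factors, so convert them back if numeric years are needed later.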

answered Nov 25 '18 at 20:57

helcode

721522


protected by David Arenburg Aug 12 '15 at 21:46
