Count number of rows within each group












85















I have a data frame and I would like to count the number of rows within each group. I regularly use the aggregate function to sum data as follows:



df2 <- aggregate(x ~ Year + Month, data = df1, sum)


Now, I would like to count observations but can't seem to find the proper argument for FUN. Intuitively, I thought it would be as follows:



df2 <- aggregate(x ~ Year + Month, data = df1, count)


But, no such luck.



Any ideas?





Some toy data:



set.seed(2)
df1 <- data.frame(x = 1:20,
                  Year = sample(2012:2014, 20, replace = TRUE),
                  Month = sample(month.abb[1:3], 20, replace = TRUE))





























  • 15





    nrow, NROW, length...

    – Joshua Ulrich
    Mar 21 '12 at 16:54






  • 13





    I keep reading this question as asking for a fun way to count things (as opposed to the many unfun ways, I guess).

    – Hong Ooi
    Mar 22 '12 at 6:35






  • 3





    @JoshuaUlrich: nrow did not work for me, but NROW and length worked fine. +1

    – Prolix
    Aug 11 '15 at 10:19






















r dataframe r-faq














edited Jan 29 '17 at 10:58









Henrik











asked Mar 21 '12 at 16:50









MikeTP




















11 Answers


















45














There is also count() from the plyr package: df2 <- count(df1, c('Year','Month'))
































  • Is there a way to aggregate a variable and count at the same time (two functions in one aggregation: mean + count)? I need the mean of one column and the number of rows for each value of another column.

    – sop
    May 15 '15 at 14:06






  • 1





    I'd cbind the results of aggregate(Sepal.Length ~ Species, iris, mean) and aggregate(Sepal.Length ~ Species, iris, length)

    – geotheory
    May 16 '15 at 22:28













  • I did that, but every column except the aggregated one then appears twice, so I merged the two results instead, and that seems to work.

    – sop
    May 18 '15 at 7:20






  • 4





    I don't know but this could be useful as well... df %>% group_by(group, variable) %>% mutate(count = n())

    – Manoj Kumar
    Dec 14 '16 at 17:57






  • 1





    Yes dplyr is best practice now.

    – geotheory
    Dec 15 '16 at 2:07



















52














Following @Joshua's suggestion, here's one way you might count the number of observations in your df data frame where Year is 2007 and Month is "Nov" (assuming both are columns):



nrow(df[df$Year == 2007 & df$Month == "Nov", ])


and with aggregate, following @GregSnow:



aggregate(x ~ Year + Month, data = df, FUN = length)
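To see why nrow fails as FUN while length and NROW work (as noted in the comments on the question), here is a minimal sketch on the question's toy data. It relies on aggregate() passing FUN one vector per group; the column name n is only an illustrative choice.

```r
# Toy data from the question.
set.seed(2)
df1 <- data.frame(x = 1:20,
                  Year = sample(2012:2014, 20, replace = TRUE),
                  Month = sample(month.abb[1:3], 20, replace = TRUE))

# aggregate() hands FUN one vector per group, not a data frame.
is.null(nrow(df1$x))   # TRUE: a plain vector has no dim attribute
NROW(df1$x)            # 20: NROW() treats a vector as a one-column matrix
length(df1$x)          # 20

# Count rows per Year/Month combination; `n` is an illustrative column name.
counts <- aggregate(x ~ Year + Month, data = df1, FUN = length)
names(counts)[names(counts) == "x"] <- "n"
```

Because nrow() returns NULL for each group's vector, aggregate() has nothing to assemble, which is consistent with the error reported above.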






































    32














    We can also use dplyr.



    First, some data:



    df <- data.frame(x = rep(1:6, rep(c(1, 2, 3), 2)), year = 1993:2004, month = c(1, 1:11))


    Now the count:



    library(dplyr)
    count(df, year, month)
    #piping
    df %>% count(year, month)


    We can also use a slightly longer version with piping and the n() function:



    df %>%
      group_by(year, month) %>%
      summarise(number = n())


    or the tally function:



    df %>%
      group_by(year, month) %>%
      tally()






































      30














      An old question without a data.table solution. So here goes...



      Using .N



      library(data.table)
      DT <- data.table(df)
      DT[, .N, by = list(year, month)]




































        21














        The simple option to use with aggregate is the length function, which gives you the length of the vector in each subset. A sometimes more robust alternative is function(x) sum(!is.na(x)), which counts only the non-missing values.
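One subtlety worth sketching: with the formula interface, aggregate() defaults to na.action = na.omit, so rows with a missing x are dropped before FUN ever sees them. The small data frame d below is made up purely for illustration.

```r
# Hypothetical data with missing values in x.
d <- data.frame(g = c("a", "a", "b", "b", "b"),
                x = c(1, NA, 3, NA, 5))

# Default na.action = na.omit: NA rows are removed first,
# so length() only counts complete cases (a: 1, b: 2).
complete_only <- aggregate(x ~ g, data = d, FUN = length)

# Keep the NA rows, then count all rows (a: 2, b: 3) ...
all_rows <- aggregate(x ~ g, data = d, FUN = length, na.action = na.pass)

# ... or only the non-missing values (a: 1, b: 2).
non_na <- aggregate(x ~ g, data = d, FUN = function(v) sum(!is.na(v)),
                    na.action = na.pass)
```

So if the goal is the number of rows per group regardless of missing values, na.action = na.pass is probably what you want.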





































          16














          Create a new variable Count with a value of 1 for each row:



          df1["Count"] <- 1


          Then aggregate the data frame, summing the Count column within each group:



          df2 <- aggregate(df1["Count"], by = list(Year = df1$Year, Month = df1$Month), FUN = sum, na.rm = TRUE)






































            15














            An alternative to the aggregate() function in this case would be table() with as.data.frame(), which would also indicate which combinations of year and month are associated with zero occurrences:

            df <- data.frame(x = rep(1:6, rep(c(1, 2, 3), 2)), year = 1993:2004, month = c(1, 1:11))

            myAns <- as.data.frame(table(df[, c("year", "month")]))


            And without the zero-occurring combinations:

            myAns[which(myAns$Freq > 0), ]
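Applied to the toy data from the question, the same idea might look like the sketch below. The xtabs() variant is an addition that was not part of the answer; it builds the same contingency table from a formula.

```r
# Toy data from the question.
set.seed(2)
df1 <- data.frame(x = 1:20,
                  Year = sample(2012:2014, 20, replace = TRUE),
                  Month = sample(month.abb[1:3], 20, replace = TRUE))

# One row per Year/Month combination, including combinations with zero rows.
counts <- as.data.frame(table(df1[, c("Year", "Month")]))

# Keep only the combinations that actually occur.
nonzero <- counts[counts$Freq > 0, ]

# Formula-based equivalent of the table() call.
counts_xt <- as.data.frame(xtabs(~ Year + Month, data = df1))
```

Note that as.data.frame() on a table turns Year and Month into factors, so convert them back if you need numbers or plain strings.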




































              4














              For my aggregations I usually end up wanting to see mean and "how big is this group" (a.k.a. length).
              So this is my handy snippet for those occasions:



              agg.mean <- aggregate(columnToMean ~ columnToAggregateOn1*columnToAggregateOn2, yourDataFrame, FUN="mean")
              agg.count <- aggregate(columnToMean ~ columnToAggregateOn1*columnToAggregateOn2, yourDataFrame, FUN="length")
              aggcount <- agg.count$columnToMean
              agg <- cbind(aggcount, agg.mean)
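The two-call snippet above can probably be collapsed into a single aggregate() call by letting FUN return both values. This is only a sketch, using the built-in iris data as a stand-in for yourDataFrame.

```r
# FUN returns a named vector, so each group yields a mean and a count.
agg <- aggregate(Sepal.Length ~ Species, data = iris,
                 FUN = function(v) c(mean = mean(v), n = length(v)))

# The result stores both values in one matrix column;
# do.call(data.frame, ...) spreads it into ordinary columns.
agg <- do.call(data.frame, agg)
```

After flattening, the mean and the group size sit in separate columns (Sepal.Length.mean and Sepal.Length.n), so no cbind or merge is needed.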






































                2














                An SQL solution using the sqldf package:

                library(sqldf)
                sqldf("SELECT Year, Month, COUNT(*) AS Freq
                       FROM df1
                       GROUP BY Year, Month")




































                  0














                  Regarding @Ben's answer: R throws an error if df1 does not contain a column named x. This can be solved elegantly with paste:



                  aggregate(paste(Year, Month) ~ Year + Month, data = df1, FUN = NROW)


                  Similarly, it can be generalized if more than two variables are used in grouping:



                  aggregate(paste(Year, Month, Day) ~ Year + Month + Day, data = df1, FUN = NROW)




































                    0














                    You can use the by() function, e.g. by(df1$Year, df1$Month, count), with count() from the plyr package. It returns a list holding the counts for each group.



                    The output looks like this:



                    df1$Month: Feb
                         x freq
                    1 2012    1
                    2 2013    1
                    3 2014    5
                    ---------------------------------------------------------------
                    df1$Month: Jan
                         x freq
                    1 2012    5
                    2 2013    2
                    ---------------------------------------------------------------
                    df1$Month: Mar
                         x freq
                    1 2012    1
                    2 2013    3
                    3 2014    2
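If you would rather avoid the plyr dependency, a base-R sketch with tapply() (a swapped-in technique, not what the answer above uses) gives a Year-by-Month matrix of counts on the question's toy data:

```r
# Toy data from the question.
set.seed(2)
df1 <- data.frame(x = 1:20,
                  Year = sample(2012:2014, 20, replace = TRUE),
                  Month = sample(month.abb[1:3], 20, replace = TRUE))

# Rows are years, columns are months; each cell is the number of rows.
counts <- tapply(df1$x, list(Year = df1$Year, Month = df1$Month), length)

# tapply() leaves NA for combinations that never occur; treat those as zero.
counts[is.na(counts)] <- 0L
```

The matrix layout is handy for a quick look, while the aggregate() and table() approaches are easier to work with if you need a long data frame.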



























                      protected by David Arenburg Aug 12 '15 at 21:46

























                      edited Jun 6 '13 at 14:46

























                      answered Jun 5 '13 at 13:48









                      geotheory


































                          edited Mar 22 '12 at 6:31

























                          answered Mar 21 '12 at 17:06









                          Ben































                                  edited Dec 2 '18 at 18:45









                                  Jaap











                                  answered Aug 12 '15 at 21:55









                                  jeremycg

































                                          answered Aug 2 '13 at 0:30









                                          mnel

































answered Mar 21 '12 at 18:08
Greg Snow

16

Create a new variable Count with a value of 1 for each row:

df1["Count"] <- 1

Then aggregate the data frame, summing the Count column within each group:

df2 <- aggregate(df1[c("Count")], by = list(Year = df1$Year, Month = df1$Month), FUN = sum, na.rm = TRUE)






edited Aug 12 '15 at 21:44
David Arenburg

answered Aug 2 '13 at 0:16
Leroy Tyrone

15

An alternative to the aggregate() function in this case would be table() combined with as.data.frame(), which also shows which combinations of year and month have zero occurrences:

df <- data.frame(x = rep(1:6, rep(c(1, 2, 3), 2)), year = 1993:2004, month = c(1, 1:11))

myAns <- as.data.frame(table(df[, c("year", "month")]))

And without the zero-occurring combinations:

myAns[which(myAns$Freq > 0), ]






answered Mar 21 '12 at 20:41
BenBarnes

4

For my aggregations I usually end up wanting to see both the mean and "how big is this group" (a.k.a. length). So this is my handy snippet for those occasions:

# mean of columnToMean within each combination of the two grouping columns
agg.mean <- aggregate(columnToMean ~ columnToAggregateOn1*columnToAggregateOn2, yourDataFrame, FUN="mean")
# number of rows within each combination
agg.count <- aggregate(columnToMean ~ columnToAggregateOn1*columnToAggregateOn2, yourDataFrame, FUN="length")
aggcount <- agg.count$columnToMean
# put the counts next to the means
agg <- cbind(aggcount, agg.mean)
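The two calls above can also be folded into one. This is a sketch, not the snippet's original form: it assumes the toy data from the question in place of the placeholder names, and lets the aggregating function return a named vector with both statistics.

```r
# Toy data from the question
set.seed(2)
df1 <- data.frame(x = 1:20,
                  Year = sample(2012:2014, 20, replace = TRUE),
                  Month = sample(month.abb[1:3], 20, replace = TRUE))

# One aggregate() call returning both statistics per group.
# FUN returns a named vector, so the result holds a matrix column `x`.
agg <- aggregate(x ~ Year + Month, data = df1,
                 FUN = function(v) c(mean = mean(v), n = length(v)))

# Flatten the matrix column into ordinary columns x.mean and x.n
agg <- do.call(data.frame, agg)
```

Computing both values in a single pass avoids relying on the two separate results having the same row order before cbind().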






edited Jan 2 '16 at 0:31

answered Jan 5 '15 at 16:38
maze

2

An SQL solution using the sqldf package:

library(sqldf)
sqldf("SELECT Year, Month, COUNT(*) AS Freq
       FROM df1
       GROUP BY Year, Month")






answered May 29 '18 at 19:22
M-M

0

Regarding @Ben's answer: R would throw an error if df1 does not contain an x column. This can be solved elegantly with paste:

aggregate(paste(Year, Month) ~ Year + Month, data = df1, FUN = NROW)

Similarly, it generalizes when more than two grouping variables are used (here assuming df1 also has a Day column):

aggregate(paste(Year, Month, Day) ~ Year + Month + Day, data = df1, FUN = NROW)






answered Feb 22 '18 at 22:55
paudan

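0

You can use the by function, as in by(df1$Year, df1$Month, count), which produces a list with the required counts for each group. Note that count is not a base R function; the x and freq columns in the output below match what count from the plyr package returns, so library(plyr) is presumably needed first.

The output will look like this:

df1$Month: Feb
     x freq
1 2012    1
2 2013    1
3 2014    5
---------------------------------------------------------------
df1$Month: Jan
     x freq
1 2012    5
2 2013    2
---------------------------------------------------------------
df1$Month: Mar
     x freq
1 2012    1
2 2013    3
3 2014    2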
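For readers who want to stay in base R, the same per-month breakdown can be sketched with split() and table() instead of by() and count. This swaps in a different technique from the one in the answer; the column names x and freq are chosen here only to mirror the output above, and the toy data from the question is assumed.

```r
# Toy data from the question
set.seed(2)
df1 <- data.frame(x = 1:20,
                  Year = sample(2012:2014, 20, replace = TRUE),
                  Month = sample(month.abb[1:3], 20, replace = TRUE))

# Split the Year values by Month, then tabulate each piece.
# table() does the counting; responseName names the count column.
counts <- lapply(split(df1$Year, df1$Month),
                 function(y) as.data.frame(table(x = y), responseName = "freq"))

# A named list with one small data frame per month
print(counts)
```

Each list element has one row per Year that occurs in that month, so combinations with no rows are simply absent.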





                                                                                              share|improve this answer




























                                                                                                0














                                                                                                You can use by functions as by(df1$Year, df1$Month, count) that will produce a list of needed aggregation.



                                                                                                The output will look like,



                                                                                                df1$Month: Feb
                                                                                                x freq
                                                                                                1 2012 1
                                                                                                2 2013 1
                                                                                                3 2014 5
                                                                                                ---------------------------------------------------------------
                                                                                                df1$Month: Jan
                                                                                                x freq
                                                                                                1 2012 5
                                                                                                2 2013 2
                                                                                                ---------------------------------------------------------------
                                                                                                df1$Month: Mar
                                                                                                x freq
                                                                                                1 2012 1
                                                                                                2 2013 3
                                                                                                3 2014 2
                                                                                                >





                                                                                                share|improve this answer


























                                                                                                  0












                                                                                                  0








                                                                                                  0







                                                                                                  You can use by functions as by(df1$Year, df1$Month, count) that will produce a list of needed aggregation.



                                                                                                  The output will look like,



                                                                                                  df1$Month: Feb
                                                                                                  x freq
                                                                                                  1 2012 1
                                                                                                  2 2013 1
                                                                                                  3 2014 5
                                                                                                  ---------------------------------------------------------------
                                                                                                  df1$Month: Jan
                                                                                                  x freq
                                                                                                  1 2012 5
                                                                                                  2 2013 2
                                                                                                  ---------------------------------------------------------------
                                                                                                  df1$Month: Mar
                                                                                                  x freq
                                                                                                  1 2012 1
                                                                                                  2 2013 3
                                                                                                  3 2014 2
                                                                                                  >





                                                                                                  share|improve this answer













                                                                                                  You can use by functions as by(df1$Year, df1$Month, count) that will produce a list of needed aggregation.



                                                                                                  The output will look like,



                                                                                                  df1$Month: Feb
                                                                                                  x freq
                                                                                                  1 2012 1
                                                                                                  2 2013 1
                                                                                                  3 2014 5
                                                                                                  ---------------------------------------------------------------
                                                                                                  df1$Month: Jan
                                                                                                  x freq
                                                                                                  1 2012 5
                                                                                                  2 2013 2
                                                                                                  ---------------------------------------------------------------
                                                                                                  df1$Month: Mar
                                                                                                  x freq
                                                                                                  1 2012 1
                                                                                                  2 2013 3
                                                                                                  3 2014 2
                                                                                                  >
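
If you would rather not depend on plyr, roughly the same per-group tabulation can be sketched in base R by passing table to by(). This is only a minimal illustration, not the answer's original code: the data frame df below is hypothetical toy data, and counts_by_month and counts_flat are names made up for this example.

```r
# Hypothetical toy data, small enough to check by hand
df <- data.frame(
  Year  = c(2012, 2012, 2013, 2012, 2013, 2013),
  Month = c("Jan", "Jan", "Jan", "Feb", "Feb", "Feb")
)

# by() splits the first argument by the grouping variable and applies FUN
# to each piece; table() then counts how often each Year occurs in a Month
counts_by_month <- by(df$Year, df$Month, table)

# A flat alternative: one row per Year/Month combination, with the count
# stored in a column named "freq" (combinations with no rows would get 0)
counts_flat <- as.data.frame(table(Year = df$Year, Month = df$Month),
                             responseName = "freq")
```

Here counts_by_month[["Jan"]] holds the table of years for January, while counts_flat is an ordinary data frame that is easier to merge or plot.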
















                                                                                                  answered Nov 25 '18 at 20:57









helcode

















                                                                                                      protected by David Arenburg Aug 12 '15 at 21:46





