Count number of rows within each group

Question

I have a dataframe and I would like to count the number of rows within each group. I regularly use the aggregate function to sum data as follows:

    df2 <- aggregate(x ~ Year + Month, data = df1, sum)

Now, I would like to count observations but can't seem to find the proper argument for FUN. Intuitively, I thought it would be as follows:

    df2 <- aggregate(x ~ Year + Month, data = df1, count)

But, no such luck. Any ideas?

Some toy data:

    set.seed(2)
    df1 <- data.frame(x = 1:20,
                      Year = sample(2012:2014, 20, replace = TRUE),
                      Month = sample(month.abb[1:3], 20, replace = TRUE))

User · Answer

If you're trying the aggregate solutions from the other answers and you get the error:

    invalid type (list) for variable

it may be because you're using date or datetime stamps. Try using as.character on one or both of the grouping variables:

    aggregate(x ~ as.character(Year) + Month, data = df, FUN = length)
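
As a rough illustration of this fix: a column assigned from as.POSIXlt() or strptime() is stored as a list-based POSIXlt object, which is one common way to end up with a list-typed variable. The toy data and the column name `when` are made up for this sketch.

```r
# Toy data; `when` is a hypothetical column name used only for illustration.
df <- data.frame(x = 1:4)

# Assigning a POSIXlt vector keeps its list-based class inside the data frame.
df$when <- as.POSIXlt(c("2014-01-01", "2014-01-01", "2014-02-01", "2014-02-01"),
                      tz = "UTC")

# Grouping on the character version of the timestamp avoids the list type.
res <- aggregate(x ~ as.character(when), data = df, FUN = length)
print(res)
```

Converting the column once up front (for example to Date or POSIXct) would be another option if the grouping is needed repeatedly.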

User · Answer

There are plenty of wonderful answers here already, but I wanted to throw in one more option for those wanting to add a new column to the original dataset that contains the number of times that row is repeated.

    df1$counts <- sapply(X = paste(df1$Year, df1$Month),
                         FUN = function(x) { sum(paste(df1$Year, df1$Month) == x) })

The same could be accomplished by combining any of the other answers with the merge() function.
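
A minimal sketch of the merge() route, assuming a small hand-written df1 with the same column names as the question (the values are invented for illustration):

```r
# Toy data with the same column names as the question's df1.
df1 <- data.frame(x = 1:6,
                  Year = c(2012, 2012, 2013, 2013, 2013, 2014),
                  Month = c("Jan", "Jan", "Jan", "Feb", "Feb", "Mar"))

# Count rows per Year/Month group, then rename the count column.
counts <- aggregate(x ~ Year + Month, data = df1, FUN = length)
names(counts)[names(counts) == "x"] <- "counts"

# Attach each group's count to every row of the original data.
df1_counts <- merge(df1, counts, by = c("Year", "Month"))
print(df1_counts)
```

Note that merge() may return the rows in a different order than the original data frame.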

User · Answer

The simple option to use with aggregate is the length function, which will give you the length of the vector in the subset. Sometimes a little more robust is to use function(x) sum(!is.na(x)).
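
A small sketch of the difference between the two, using made-up data with missing values. The formula interface of aggregate drops rows with NA by default, so na.action = na.pass is set here to let the NA values reach FUN.

```r
# Toy data: group "a" has one NA in x, group "b" has one NA in x.
df <- data.frame(g = c("a", "a", "b", "b", "b"),
                 x = c(1, NA, 3, 4, NA))

# length counts every row in the group, including rows where x is NA.
n_rows <- aggregate(x ~ g, data = df, FUN = length, na.action = na.pass)

# sum(!is.na(v)) counts only the rows where x is actually observed.
n_obs <- aggregate(x ~ g, data = df, FUN = function(v) sum(!is.na(v)),
                   na.action = na.pass)

print(n_rows)
print(n_obs)
```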

User · Answer

Following @Joshua's suggestion, here's one way you might count the number of observations in your df dataframe where Year is 2007 and Month is Nov (assuming they are columns):

    nrow(df[df$Year == 2007 & df$Month == "Nov", ])

and with aggregate, following @GregSnow:

    aggregate(x ~ Year + Month, data = df, FUN = length)
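
As a quick cross-check of the two approaches, here is a sketch on invented data. It uses sum() on the logical condition, which is a swapped-in equivalent of counting the rows of the subset.

```r
# Toy data; the values are invented for illustration.
df <- data.frame(x = 1:5,
                 Year = c(2007, 2007, 2007, 2008, 2008),
                 Month = c("Nov", "Nov", "Dec", "Nov", "Nov"))

# A logical vector sums to the number of TRUE values, i.e. the matching rows.
n_nov_2007 <- sum(df$Year == 2007 & df$Month == "Nov")

# The same group appears as one row of the aggregate() result.
agg <- aggregate(x ~ Year + Month, data = df, FUN = length)
agg_nov_2007 <- agg$x[agg$Year == 2007 & agg$Month == "Nov"]

print(c(n_nov_2007, agg_nov_2007))
```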

User · Answer

A SQL solution using the sqldf package:

    library(sqldf)
    sqldf("SELECT Year, Month, COUNT(*) as Freq
           FROM df1
           GROUP BY Year, Month")

User · Answer

The dplyr package does this with the count/tally commands, or the n() function.

First, some data:

    df <- data.frame(x = rep(1:6, rep(c(1, 2, 3), 2)), year = 1993:2004, month = c(1, 1:11))

Now the count:

    library(dplyr)
    count(df, year, month)
    # with piping
    df %>% count(year, month)

We can also use a slightly longer version with piping and the n() function:

    df %>%
      group_by(year, month) %>%
      summarise(number = n())

or the tally function:

    df %>%
      group_by(year, month) %>%
      tally()

User · Answer

If you want to include 0 counts for month-years that are missing in the data, you can use a little table magic:

    data.frame(with(df1, table(Year, Month)))

For example, the toy data frame in the question, df1, contains no observations of January 2014.

    df1
        x Year Month
    1   1 2012   Feb
    2   2 2014   Feb
    3   3 2013   Mar
    4   4 2012   Jan
    5   5 2014   Feb
    6   6 2014   Feb
    7   7 2012   Jan
    8   8 2014   Feb
    9   9 2013   Mar
    10 10 2013   Jan
    11 11 2013   Jan
    12 12 2012   Jan
    13 13 2014   Mar
    14 14 2012   Mar
    15 15 2013   Feb
    16 16 2014   Feb
    17 17 2014   Mar
    18 18 2012   Jan
    19 19 2013   Mar
    20 20 2012   Jan

The base R aggregate function does not return an observation for January 2014.

    aggregate(x ~ Year + Month, data = df1, FUN = length)
      Year Month x
    1 2012   Feb 1
    2 2013   Feb 1
    3 2014   Feb 5
    4 2012   Jan 5
    5 2013   Jan 2
    6 2012   Mar 1
    7 2013   Mar 3
    8 2014   Mar 2

If you would like an observation of this month-year with 0 as the count, then the table code above will return a data frame with counts for all month-year combinations:

    data.frame(with(df1, table(Year, Month)))
      Year Month Freq
    1 2012   Feb    1
    2 2013   Feb    1
    3 2014   Feb    5
    4 2012   Jan    5
    5 2013   Jan    2
    6 2014   Jan    0
    7 2012   Mar    1
    8 2013   Mar    3
    9 2014   Mar    2

User · Answer

You can use the by function, as in by(df1$Year, df1$Month, count), which produces a list with the needed aggregation for each month. Note that count is not a base R function; the x and freq columns in the output below suggest count from the plyr package.

The output will look like:

    df1$Month: Feb
         x freq
    1 2012    1
    2 2013    1
    3 2014    5
    ---------------------------------------------------------------
    df1$Month: Jan
         x freq
    1 2012    5
    2 2013    2
    ---------------------------------------------------------------
    df1$Month: Mar
         x freq
    1 2012    1
    2 2013    3
    3 2014    2

User · Answer

An old question without a data.table solution. So here goes...

Using .N:

    library(data.table)
    DT <- data.table(df)
    DT[, .N, by = list(year, month)]

User · Answer

Current best practice (tidyverse) is:

    require(dplyr)
    df1 %>% count(Year, Month)

User · Answer

Considering @Ben's answer, R would throw an error if df1 does not contain an x column. But this can be solved elegantly with paste:

    aggregate(paste(Year, Month) ~ Year + Month, data = df1, FUN = NROW)

Similarly, it can be generalized if more than two variables are used in grouping:

    aggregate(paste(Year, Month, Day) ~ Year + Month + Day, data = df1, FUN = NROW)

User · Answer

An alternative to the aggregate() function in this case would be table() with as.data.frame(), which would also indicate which combinations of year and month are associated with zero occurrences:

    df <- data.frame(x = rep(1:6, rep(c(1, 2, 3), 2)), year = 1993:2004, month = c(1, 1:11))

    myAns <- as.data.frame(table(df[, c("year", "month")]))

And without the zero-occurring combinations:

    myAns[which(myAns$Freq > 0), ]

User · Answer

Create a new variable Count with a value of 1 for each row:

    df1["Count"] <- 1

Then aggregate the dataframe, summing the Count column by group:

    df2 <- aggregate(df1[c("Count")], by = list(Year = df1$Year, Month = df1$Month), FUN = sum, na.rm = TRUE)

User · Answer

For my aggregations I usually end up wanting to see the mean and "how big is this group" (a.k.a. length). So this is my handy snippet for those occasions:

    agg.mean <- aggregate(columnToMean ~ columnToAggregateOn1*columnToAggregateOn2, yourDataFrame, FUN = "mean")
    agg.count <- aggregate(columnToMean ~ columnToAggregateOn1*columnToAggregateOn2, yourDataFrame, FUN = "length")
    aggcount <- agg.count$columnToMean
    agg <- cbind(aggcount, agg.mean)
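
To make the snippet concrete, here is a sketch that runs it on invented data. The data frame contents are hypothetical and the placeholder names are reused as real column names; the formula is written with + between the grouping columns.

```r
# Toy data; the placeholder names from the snippet double as column names.
yourDataFrame <- data.frame(
  columnToAggregateOn1 = c("a", "a", "b", "b", "b"),
  columnToAggregateOn2 = c("u", "u", "u", "v", "v"),
  columnToMean = c(1, 3, 5, 2, 4)
)

# Group mean and group size, computed with the same grouping so rows line up.
agg.mean <- aggregate(columnToMean ~ columnToAggregateOn1 + columnToAggregateOn2,
                      yourDataFrame, FUN = "mean")
agg.count <- aggregate(columnToMean ~ columnToAggregateOn1 + columnToAggregateOn2,
                       yourDataFrame, FUN = "length")

# Put the count next to the mean in one data frame.
aggcount <- agg.count$columnToMean
agg <- cbind(aggcount, agg.mean)
print(agg)
```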

User · Answer

    library(tidyverse)

    df1 %>%
      group_by(Year, Month) %>%
      summarise(count = n())
