Count number of rows per group and add result to original data frame

Question

Say I have a data.frame object:

    df <- data.frame(name=c('black','black','black','red','red'),
                     type=c('chair','chair','sofa','sofa','plate'),
                     num=c(4,5,12,4,3))

Now I want to count the number of rows (observations) for each combination of name and type. This can be done like so:

    table(df[ , c("name","type")])

or possibly also with plyr (though I am not sure how).

However, how do I get the results incorporated into the original data frame, so that the result looks like this?

    df
    #    name  type num count
    # 1 black chair   4     2
    # 2 black chair   5     2
    # 3 black  sofa  12     1
    # 4   red  sofa   4     1
    # 5   red plate   3     1

where count now stores the results from the aggregation.

A solution with plyr could be interesting to learn as well, though I would like to see how this is done with base R.

User · Answer

The base R function aggregate will obtain the counts with a one-liner, but adding those counts back to the original data.frame seems to take a bit of processing.

    df <- data.frame(name=c('black','black','black','red','red'),
                     type=c('chair','chair','sofa','sofa','plate'),
                     num=c(4,5,12,4,3))
    df
    #    name  type num
    # 1 black chair   4
    # 2 black chair   5
    # 3 black  sofa  12
    # 4   red  sofa   4
    # 5   red plate   3

    rows.per.group  <- aggregate(rep(1, length(paste0(df$name, df$type))),
                                 by=list(df$name, df$type), sum)
    rows.per.group
    #   Group.1 Group.2 x
    # 1   black   chair 2
    # 2     red   plate 1
    # 3   black    sofa 1
    # 4     red    sofa 1

    my.summary <- do.call(data.frame, rows.per.group)
    colnames(my.summary) <- c(colnames(df)[1:2], 'rows.per.group')
    my.data <- merge(df, my.summary, by = c(colnames(df)[1:2]))
    my.data
    #    name  type num rows.per.group
    # 1 black chair   4              2
    # 2 black chair   5              2
    # 3 black  sofa  12              1
    # 4   red plate   3              1
    # 5   red  sofa   4              1

User · Answer

This should do the job:

    df_agg <- aggregate(num~name+type, df, FUN=NROW)
    names(df_agg)[3] <- "count"
    df <- merge(df, df_agg, by=c('name','type'), all.x=TRUE)

User · Answer

Using data.table:

    library(data.table)
    dt = as.data.table(df)

    # or coerce to data.table by reference:
    # setDT(df)

    dt[ , count := .N, by = .(name, type)]

For a pre-data.table 1.8.2 alternative, see edit history.

Using dplyr:

    library(dplyr)
    df %>%
      group_by(name, type) %>%
      mutate(count = n())

Or simply:

    add_count(df, name, type)

Using plyr:

    plyr::ddply(df, .(name, type), transform, count = length(num))

User · Answer

You were just one step away from incorporating the row count into the base dataset.

Using the tidy() function from the broom package, convert the frequency table into a data frame and inner join with df:

    df <- data.frame(name=c('black','black','black','red','red'),
                     type=c('chair','chair','sofa','sofa','plate'),
                     num=c(4,5,12,4,3))
    library(broom)
    df <- merge(df, tidy(table(df[ , c("name","type")])), by=c("name","type"))
    df
       name  type num Freq
    1 black chair   4    2
    2 black chair   5    2
    3 black  sofa  12    1
    4   red plate   3    1
    5   red  sofa   4    1

User · Answer

Another way that generalizes more:

    df$count <- unsplit(lapply(split(df, df[c("name","type")]), nrow), df[c("name","type")])

User · Answer

A two-line alternative is to generate a variable of 0s and then fill it in with split<-, split, and lengths like this:

    # generate vector of 0s
    df$count <- 0L

    # fill it in
    split(df$count, df[c("name", "type")]) <- lengths(split(df$num, df[c("name", "type")]))

This returns the desired result:

    df
       name  type num count
    1 black chair   4     2
    2 black chair   5     2
    3 black  sofa  12     1
    4   red  sofa   4     1
    5   red plate   3     1

Essentially, the RHS calculates the lengths of each name-type combination, returning a named vector of length 6 with 0s for "red.chair" and "black.plate". This is fed to the LHS with split<-, which takes the vector and assigns the values to their matching positions. This is essentially what ave does, as you can see from the second-to-last line of ave:

    split(x, g) <- lapply(split(x, g), FUN)

However, lengths is an optimized version of sapply(list, length).
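To make the intermediate step visible, here is a minimal sketch that stores the right-hand side in its own variable before assigning it. The names `groups` and `group_sizes` are introduced only for illustration and are not part of the answer above.

```r
df <- data.frame(name = c("black", "black", "black", "red", "red"),
                 type = c("chair", "chair", "sofa", "sofa", "plate"),
                 num  = c(4, 5, 12, 4, 3))

# Grouping columns (illustrative helper name)
groups <- df[c("name", "type")]

# One entry per name-type combination, including combinations with no rows
group_sizes <- lengths(split(df$num, groups))
names(group_sizes)

# Start from zeros, then let split<- write each group's size back to its rows
df$count <- 0L
split(df$count, groups) <- group_sizes
df
```

Printing `names(group_sizes)` shows all six combinations, which is why the empty ones ("red.chair" and "black.plate") appear with a length of 0 but never reach any row of df.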

User · Answer

One simple line in base R:

    df$count = table(interaction(df[, (c("name", "type"))]))[interaction(df[, (c("name", "type"))])]

Same in two lines, for clarity/efficiency:

    fact = interaction(df[, (c("name", "type"))])
    df$count = table(fact)[fact]

User · Answer

You can do this:

    > ddply(df,.(name,type),transform,count = NROW(piece))
       name  type num count
    1 black chair   4     2
    2 black chair   5     2
    3 black  sofa  12     1
    4   red plate   3     1
    5   red  sofa   4     1

or perhaps more intuitively,

    > ddply(df,.(name,type),transform,count = length(num))
       name  type num count
    1 black chair   4     2
    2 black chair   5     2
    3 black  sofa  12     1
    4   red plate   3     1
    5   red  sofa   4     1

User · Answer

You can use ave:

    df$count <- ave(df$num, df[,c("name","type")], FUN=length)
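As a hedged variation on the same idea, the sketch below passes a row index to ave instead of df$num, so the count does not depend on the type of any particular column. The use of `seq_len(nrow(df))` is an assumption made for illustration, not part of the answer above.

```r
df <- data.frame(name = c("black", "black", "black", "red", "red"),
                 type = c("chair", "chair", "sofa", "sofa", "plate"),
                 num  = c(4, 5, 12, 4, 3))

# ave() returns a vector as long as its first argument, with each element
# replaced by FUN applied to that element's group. Counting a row index
# (an illustrative choice) avoids relying on the num column.
df$count <- ave(seq_len(nrow(df)), df$name, df$type, FUN = length)
df
```

Because ave keeps the original row order, no merge or re-sorting step is needed afterwards.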

User · Answer

Using the sqldf package:

    library(sqldf)

    sqldf("select a.*, b.cnt
           from df a,
               (select name, type, count(1) as cnt
                from df
                group by name, type) b
           where a.name = b.name and
                 a.type = b.type")

    #    name  type num cnt
    # 1 black chair   4   2
    # 2 black chair   5   2
    # 3 black  sofa  12   1
    # 4   red  sofa   4   1
    # 5   red plate   3   1
