Extend contigency table with proportions percentages

Question

I have a contingency table of counts  and I want to extend it with corresponding proportions of each group  Some sample data  tips data set from ggplot2 package   library ggplot2   head tips  3      total bill tip    sex smoker day   time size   1         17 1 0 Female     No Sun Dinner    2   2         10 1 7   Male     No Sun Dinner    3   3         21 3 5   Male     No Sun Dinner    3  First  use table to count smoker vs non-smoker  and nrow to count total number of subjects  table tips smoker     No Yes    151  93   nrow tips     1  244  Then  I want to calculate percentage of smokers vs  non smokers  Something like this  ugly code     percentage of smokers options digits   2   transform as data frame table tips smoker    percentage column   Freq   nrow tips    100      Var1 Freq percentage column   1   No  151                62   2  Yes   93                38  Is there a better way to do this   even better it would be to do this on a set of columns  which I enumerate  and have output somewhat nicely formatted   e g   smoker  day  and time

User · Answer

I am not 100  certain  but I think this does what you want using prop table   See mostly the last 3 lines   The rest of the code is just creating fake data   set seed 1234   total bill  lt - rnorm 50  25  3  tip  lt - 0 15   total bill   rnorm 50  0  1  sex  lt - rbinom 50  1  0 5  smoker  lt - rbinom 50  1  0 3  day  lt - ceiling runif 50  0 7   time  lt - ceiling runif 50  0 3   size  lt - 1   rpois 50  2  my data  lt - as data frame cbind total bill  tip  sex  smoker  day  time  size   my data  my table  lt - table my data smoker   my prop  lt - prop table my table   cbind my table  my prop

User · Answer

Here is another example using the lapply and table functions in base R      freqList   lapply select if tips  is factor                  function x                      df   data frame table x                      df   data frame fct   df   1                                      n   sapply df   2   function y                                          round y   nrow dat   2                                                                                                                          return df                                             Use print freqList  to see the proportion tables  percent of frequencies  for each column feature variable  depending on your tradecraft  that is labeled as a factor

User · Answer

Your code doesn t seem so ugly to me    however  an alternative  not much better  could be e g     df  lt - data frame table yn   colnames df   lt - c  Smoker   Freq   df Perc  lt - df Freq   sum df Freq    100  ------------------   Smoker Freq Perc 1     No   19 47 5 2    Yes   21 52 5

User · Answer

I made this for when doing aggregate functions and similar  per fun  lt - function x        if length x  gt 1           denom  lt - length x           num  lt - sum x           percentage  lt - num denom          percentage 100                   else NA

User · Answer

Here s a tidyverse version   library tidyverse  data diamonds    as data frame table diamonds cut     gt   rename Count 1 Freq 2    gt   mutate Perc 100 Freq sum Freq      Or if you want a handy function   getPercentages  lt - function df  colName      df cnt  lt - df   gt   select   colName      gt        table     gt       as data frame     gt        rename   colName     1  Freq 2    gt        mutate Perc 100 Freq sum Freq       Now you can do   diamonds   gt   getPercentages cut    or this   df diamonds   gt   group by cut    gt   group modify   x   gt   getPercentages clarity   ggplot df aes x clarity y Perc   geom col   facet wrap  cut

User · Answer

If it s conciseness you re after  you might like   prop table table tips smoker     and then scale by 100 and round if you  like  Or more like your exact output   tbl  lt - table tips smoker  cbind tbl prop table tbl     If you wanted to do this for multiple columns  there are lots of different directions you could go depending on what your tastes tell you is clean looking output  but here s one option   tblFun  lt - function x       tbl  lt - table x      res  lt - cbind tbl round prop table tbl  100 2       colnames res   lt - c  Count   Percentage       res    do call rbind lapply tips 3 6  tblFun          Count Percentage Female    87      35 66 Male     157      64 34 No       151      61 89 Yes       93      38 11 Fri       19       7 79 Sat       87      35 66 Sun       76      31 15 Thur      62      25 41 Dinner   176      72 13 Lunch     68      27 87   If you don t like stack the different tables on top of each other  you can ditch the do call and leave them in a list

[r] Extend contigency table with proportions (percentages)

Examples related to r

Examples related to dataframe

Examples related to count