How to realize the COUNTIFS function (Excel) in R

Question

I have a dataset containing 100000 rows of data. I tried to do some COUNTIF operations in Excel, but it was prohibitively slow, so I am wondering if this kind of operation can be done in R. Basically, I want to do a count based on multiple conditions. For example, I can count on both occupation and sex:

    row sex occupation
      1   M    Student
      2   F    Analyst
      2   M    Analyst

User · Answer

`table` is the obvious choice, but it returns an object of class `table`, which takes a few annoying steps to transform back into a data frame. So, if you're OK using dplyr, you can use the command `tally`:

    library(dplyr)
    df = data.frame(sex = sample(c("M", "F"), 100000, replace = TRUE),
                    occupation = sample(c("Analyst", "Student"), 100000, replace = TRUE))

    df %>% group_by_all() %>% tally()

    # A tibble: 4 x 3
    # Groups:   sex [2]
      sex   occupation     n
      <fct> <fct>      <int>
    1 F     Analyst    25105
    2 F     Student    24933
    3 M     Analyst    24769
    4 M     Student    25193
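For anyone who would rather stay in base R, here is a minimal sketch of turning the `table` result back into a data frame. The small `df` below is only illustrative (its values come from the question's example), and `counts` is a name made up for this sketch. It relies on `as.data.frame()` for table objects and its `responseName` argument.

```r
# Small illustrative data frame (values taken from the question's example).
df <- data.frame(
  sex        = c("M", "F", "M"),
  occupation = c("Student", "Analyst", "Analyst")
)

# Naming the arguments to table() carries those names into the result;
# responseName sets the name of the count column (default is "Freq").
counts <- as.data.frame(
  table(sex = df$sex, occupation = df$occupation),
  responseName = "n"
)

counts
```

One difference from the `tally` approach: combinations that never occur (here, female students) still show up, with a count of 0.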

User · Answer

Given a dataset

    df <- data.frame( sex = c('M', 'M', 'F', 'F', 'M'),
                      occupation = c('analyst', 'dentist', 'dentist', 'analyst', 'cook') )

you can subset rows

    df[df$sex == 'M', ]              # To get all males
    df[df$occupation == 'analyst', ] # All analysts

etc. If you want to get the number of rows, just call the function `nrow`, such as

    nrow(df[df$sex == 'M', ])

User · Answer

Here is an example with 100000 rows (occupations are set here from A to Z):

    > a = data.frame(sex = sample(c("M", "F"), 100000, replace = TRUE), occupation = sample(LETTERS, 100000, replace = TRUE))
    > sum(a$sex == "M" & a$occupation == "A")
    [1] 1882

This returns the number of males with occupation "A".

EDIT

As I understand from your comment, you want the counts of all possible combinations of sex and occupation. So first create a data frame with all combinations:

    combns = expand.grid(c("M", "F"), LETTERS)

and loop with `apply` to sum for your criteria and append the results to `combns`:

    combns = cbind(combns, apply(combns, 1, function(x) sum(a$sex == x[1] & a$occupation == x[2])))
    colnames(combns) = c("sex", "occupation", "count")

The first rows of your result look as follows:

      sex occupation count
    1   M          A  1882
    2   F          A  1869
    3   M          B  1866
    4   F          B  1904
    5   M          C  1979
    6   F          C  1910

Does this solve your problem?

OR

Much easier solution suggested by thelatemai:

    table(a$sex, a$occupation)

           A    B    C    D    E    F    G    H    I    J    K    L    M    N    O
      F 1869 1904 1910 1907 1894 1940 1964 1907 1918 1892 1962 1933 1886 1960 1972
      M 1882 1866 1979 1904 1895 1845 1946 1905 1999 1994 1933 1950 1876 1856 1911

           P    Q    R    S    T    U    V    W    X    Y    Z
      F 1908 1907 1883 1888 1943 1922 2016 1962 1885 1898 1889
      M 1928 1938 1916 1927 1972 1965 1946 1903 1965 1974 1906
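To pull a single COUNTIFS-style number out of that table, it can be indexed by its row and column labels. A minimal sketch, assuming a smaller random data set than above; the seed, the 1000-row size and the name `tab` are arbitrary choices for illustration:

```r
set.seed(1)  # arbitrary seed, only so the example is reproducible

a <- data.frame(
  sex        = sample(c("M", "F"), 1000, replace = TRUE),
  occupation = sample(LETTERS, 1000, replace = TRUE)
)

# Cross-tabulate once, then look up any combination by label.
tab <- table(a$sex, a$occupation)

tab["M", "A"]                            # males with occupation "A"
sum(a$sex == "M" & a$occupation == "A")  # same count via the sum() approach
```

Both lines should give the same number, so the table is worth building once if you need many combinations, while `sum()` is fine for a one-off count.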

User · Answer

Easy peasy. Your data frame will look like this:

    df <- data.frame(sex = c('M', 'F', 'M'),
                     occupation = c('Student', 'Analyst', 'Analyst'))

You can then do the equivalent of a COUNTIF by first specifying the IF part, like so:

    df$sex == 'M'

This will give you a boolean vector, i.e. a vector of `TRUE` and `FALSE`. What you want is to count the observations for which the condition is `TRUE`. Since in R `TRUE` and `FALSE` double as 1 and 0, you can simply `sum()` over the boolean vector. The equivalent of `COUNTIF(sex='M')` is therefore

    sum(df$sex == 'M')

Should there be rows in which the sex is not specified, the above will give back `NA`. In that case, if you just want to ignore the missing observations, use

    sum(df$sex == 'M', na.rm = TRUE)
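The same idea extends to several conditions, which is what COUNTIFS does: combine the logical vectors with `&` before summing. A minimal sketch, assuming a small made-up data frame with one missing `sex` value (the fourth row and the name `n_male_analysts` are illustrative, not from the question):

```r
# Illustrative data: the question's three rows plus one row with missing sex.
df <- data.frame(
  sex        = c("M", "F", "M", NA),
  occupation = c("Student", "Analyst", "Analyst", "Analyst")
)

# COUNTIFS-style count: rows where sex is "M" AND occupation is "Analyst".
# The row with missing sex evaluates to NA and is dropped by na.rm = TRUE.
n_male_analysts <- sum(df$sex == "M" & df$occupation == "Analyst", na.rm = TRUE)

n_male_analysts  # 1
```

Use `|` instead of `&` if you want rows matching either condition.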

User · Answer

    library(matrixStats)
    > data <- rbind(c("M", "F", "M"), c("Student", "Analyst", "Analyst"))
    > rowCounts(data, value = 'M') # output: 2 0
    > rowCounts(data, value = 'F') # output: 1 0
