Counting the number of elements with the values of x in a vector

Question

I have a vector of numbers:

    numbers <- c(4,23,4,23,5,43,54,56,657,67,67,435,
                 453,435,324,34,456,56,567,65,34,435)

How can I have R count the number of times a value x appears in the vector?

User · Answer

My preferred solution uses rle, which returns a value (the label, x in your example) and a length, which represents how many times that value appeared in sequence.

By combining rle with sort, you have an extremely fast way to count the number of times any value appeared. This can be helpful with more complex problems.

Example:

    > numbers <- c(4,23,4,23,5,43,54,56,657,67,67,435,453,435,324,34,456,56,567,65,34,435)
    > a <- rle(sort(numbers))
    > a
    Run Length Encoding
      lengths: int [1:15] 2 1 2 2 1 1 2 1 2 1 ...
      values : num [1:15] 4 5 23 34 43 54 56 65 67 324 ...

If the value you want doesn't show up, or you need to store that value for later, make a a data.frame.

    > b <- data.frame(number=a$values, n=a$lengths)
    > b
       number n
    1       4 2
    2       5 1
    3      23 2
    4      34 2
    5      43 1
    6      54 1
    7      56 2
    8      65 1
    9      67 2
    10    324 1
    11    435 3
    12    453 1
    13    456 1
    14    567 1
    15    657 1

I find it is rare that I want to know the frequency of one value and not all of the values, and rle seems to be the quickest way to count and store them all.

User · Answer

You can just use table():

    > a <- table(numbers)
    > a
    numbers
      4   5  23  34  43  54  56  65  67 324 435 453 456 567 657
      2   1   2   2   1   1   2   1   2   1   3   1   1   1   1

Then you can subset it:

    > a[names(a)==435]
    435
      3

Or convert it into a data.frame if you're more comfortable working with that:

    > as.data.frame(table(numbers))
       numbers Freq
    1        4    2
    2        5    1
    3       23    2
    4       34    2
    ...

User · Answer

One option could be to use the vec_count() function from the vctrs library:

    vec_count(numbers)

       key count
    1  435     3
    2   67     2
    3    4     2
    4   34     2
    5   56     2
    6   23     2
    7  456     1
    8   43     1
    9  453     1
    10   5     1
    11 657     1
    12 324     1
    13  54     1
    14 567     1
    15  65     1

The default ordering puts the most frequent values at the top. If you want the result sorted by key (a table()-like output):

    vec_count(numbers, sort = "key")

       key count
    1    4     2
    2    5     1
    3   23     2
    4   34     2
    5   43     1
    6   54     1
    7   56     2
    8   65     1
    9   67     2
    10 324     1
    11 435     3
    12 453     1
    13 456     1
    14 567     1
    15 657     1

User · Answer

One more way I find convenient is:

    numbers <- c(4,23,4,23,5,43,54,56,657,67,67,435,453,435,324,34,456,56,567,65,34,435)
    (s <- summary(as.factor(numbers)))

This converts the data to a factor, and then summary() gives us the control totals (counts of the unique values).

Output is:

      4   5  23  34  43  54  56  65  67 324 435 453 456 567 657
      2   1   2   2   1   1   2   1   2   1   3   1   1   1   1

This can be stored as a data frame if preferred:

    as.data.frame(cbind(Number = names(s), Freq = s), stringsAsFactors = FALSE, row.names = 1:length(s))

Here row.names has been used to rename the row names. Without row.names, the names in s are used as row names in the new data frame.

Output is:

       Number Freq
    1       4    2
    2       5    1
    3      23    2
    4      34    2
    5      43    1
    6      54    1
    7      56    2
    8      65    1
    9      67    2
    10    324    1
    11    435    3
    12    453    1
    13    456    1
    14    567    1
    15    657    1

User · Answer

    numbers <- c(4,23,4,23,5,43,54,56,657,67,67,435,453,435,324,34,456,56,567,65,34,435)

    > length(grep(435, numbers))
    [1] 3

    > length(which(435 == numbers))
    [1] 3

    > require(plyr)
    > df = count(numbers)
    > df[df$x == 435, ]
         x freq
    11 435    3

    > sum(435 == numbers)
    [1] 3

    > sum(grepl(435, numbers))
    [1] 3

    > tabulate(numbers)[435]
    [1] 3

    > table(numbers)['435']
    435
      3

    > length(subset(numbers, numbers==435))
    [1] 3
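One caution about the grep() and grepl() variants above: they treat 435 as a regular expression and match it against the character form of each element, so they count values that merely contain "435" as well as exact matches. A minimal sketch, using a made-up vector `v` (hypothetical, not from the question):

```r
# Hypothetical vector: one exact 435 plus two values that only contain "435"
v <- c(435, 4350, 1435, 12)

n_grep  <- length(grep(435, v))  # regex match on the text of each element
n_exact <- sum(v == 435)         # exact numeric equality

print(n_grep)   # 3
print(n_exact)  # 1
```

On the question's vector both give 3 because no other element contains the digits 435, but the equality-based forms are the safer choice in general.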

User · Answer

There is also count(numbers) from the plyr package. Much more convenient than table, in my opinion.

User · Answer

This is a very fast solution for one-dimensional atomic vectors. It relies on match(), so it is compatible with NA:

    x <- c("a", NA, "a", "c", "a", "b", NA, "c")

    fn <- function(x) {
      u <- unique.default(x)
      out <- list(x = u, freq = .Internal(tabulate(match(x, u), length(u))))
      class(out) <- "data.frame"
      attr(out, "row.names") <- seq_along(u)
      out
    }

    fn(x)

    #>      x freq
    #> 1    a    3
    #> 2 <NA>    2
    #> 3    c    2
    #> 4    b    1

You could also tweak the algorithm so that it doesn't run unique():

    fn2 <- function(x) {
      y <- match(x, x)
      out <- list(x = x, freq = .Internal(tabulate(y, length(x)))[y])
      class(out) <- "data.frame"
      attr(out, "row.names") <- seq_along(x)
      out
    }

    fn2(x)

    #>      x freq
    #> 1    a    3
    #> 2 <NA>    2
    #> 3    a    3
    #> 4    c    2
    #> 5    a    3
    #> 6    b    1
    #> 7 <NA>    2
    #> 8    c    2

In cases where that output is desirable, you probably don't even need it to return the original vector again, and the second column is probably all you need. You can get that in one line with the pipe:

    library(magrittr)  # provides %>%
    match(x, x) %>% `[`(tabulate(.), .)

    #> [1] 3 2 3 2 3 1 2 2

User · Answer

A method that is relatively fast on long vectors and gives a convenient output is to use lengths(split(numbers, numbers)) (note the S at the end of lengths):

    # Make some integer vectors of different sizes
    set.seed(123)
    x <- sample.int(1e3, 1e4, replace = TRUE)
    xl <- sample.int(1e3, 1e6, replace = TRUE)
    xxl <- sample.int(1e3, 1e7, replace = TRUE)

    # Number of times each value appears in x
    a <- lengths(split(x, x))

    # Number of times the value 64 appears
    a["64"]
    # 64
    # 15

    # Occurrences of the first 10 values
    a[1:10]
    #  1  2  3  4  5  6  7  8  9 10
    # 13 12  6 14 12  5 13 14 11 14

The output is simply a named vector. The speed appears comparable to rle proposed by JBecker and even a bit faster on very long vectors. Here is a microbenchmark in R 3.6.2 with some of the functions proposed:

    library(microbenchmark)

    f1 <- function(vec) lengths(split(vec, vec))
    f2 <- function(vec) table(vec)
    f3 <- function(vec) rle(sort(vec))
    f4 <- function(vec) plyr::count(vec)

    microbenchmark(split = f1(x),
                   table = f2(x),
                   rle = f3(x),
                   plyr = f4(x))
    # Unit: microseconds
    #   expr      min        lq      mean    median        uq      max neval  cld
    #  split  402.024  423.2445  492.3400  446.7695  484.3560 2970.107   100  b
    #  table 1234.888 1290.0150 1378.8902 1333.2445 1382.2005 3203.332   100    d
    #    rle  227.685  238.3845  264.2269  245.7935  279.5435  378.514   100 a
    #   plyr  758.866  793.0020  866.9325  843.2290  894.5620 2346.407   100   c

    microbenchmark(split = f1(xl),
                   table = f2(xl),
                   rle = f3(xl),
                   plyr = f4(xl))
    # Unit: milliseconds
    #   expr       min        lq      mean    median        uq       max neval cld
    #  split  21.96075  22.42355  26.39247  23.24847  24.60674  82.88853   100 ab
    #  table 100.30543 104.05397 111.62963 105.54308 110.28732 168.27695   100   c
    #    rle  19.07365  20.64686  23.71367  21.30467  23.22815  78.67523   100 a
    #   plyr  24.33968  25.21049  29.71205  26.50363  27.75960  92.02273   100  b

    microbenchmark(split = f1(xxl),
                   table = f2(xxl),
                   rle = f3(xxl),
                   plyr = f4(xxl))
    # Unit: milliseconds
    #   expr       min        lq      mean    median        uq       max neval  cld
    #  split  296.4496  310.9702  342.6766  332.5098  374.6485  421.1348   100 a
    #  table 1151.4551 1239.9688 1283.8998 1288.0994 1323.1833 1385.3040   100    d
    #    rle  399.9442  430.8396  464.2605  471.4376  483.2439  555.9278   100   c
    #   plyr  350.0607  373.1603  414.3596  425.1436  437.8395  506.0169   100  b

Importantly, of these four functions as called above, the only one that also counts the missing values (NA) is plyr::count. The number of NAs can also be obtained separately using sum(is.na(vec)).
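On the NA point: base R's table() drops NA with its default arguments, but it accepts a `useNA` argument that adds them to the result. A minimal sketch with a made-up vector `v` (hypothetical, not one of the benchmark vectors):

```r
# Hypothetical input containing two missing values
v <- c(4, 23, NA, 4, NA, 5)

counts_default <- table(v)                   # NA values are dropped
counts_with_na <- table(v, useNA = "ifany")  # adds an <NA> entry when NAs exist
n_missing      <- sum(is.na(v))              # NA count on its own

print(counts_with_na)
print(n_missing)
```

This keeps the table() interface while still accounting for every element of the input.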

User · Answer

If you want a running count, that is, how many times each value has appeared up to and including each position, you can make use of the sapply function:

    index <- sapply(1:length(numbers), function(x) sum(numbers[1:x] == numbers[x]))
    cbind(numbers, index)

Output:

          numbers index
     [1,]       4     1
     [2,]      23     1
     [3,]       4     2
     [4,]      23     2
     [5,]       5     1
     [6,]      43     1
     [7,]      54     1
     [8,]      56     1
     [9,]     657     1
    [10,]      67     1
    [11,]      67     2
    [12,]     435     1
    [13,]     453     1
    [14,]     435     2
    [15,]     324     1
    [16,]      34     1
    [17,]     456     1
    [18,]      56     2
    [19,]     567     1
    [20,]      65     1
    [21,]      34     2
    [22,]     435     3
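The sapply() version rescans the start of the vector for every element, so its cost grows quadratically with the vector length. As an alternative technique (not the method of this answer), base R's ave() can number the occurrences within each group of equal values. A minimal sketch that also checks the two approaches agree:

```r
numbers <- c(4, 23, 4, 23, 5, 43, 54, 56, 657, 67, 67, 435,
             453, 435, 324, 34, 456, 56, 567, 65, 34, 435)

# Within each group of equal values, seq_along() numbers the occurrences 1, 2, 3, ...
index_ave <- ave(numbers, numbers, FUN = seq_along)

# Reference result from the sapply() approach, for comparison
index_sapply <- sapply(seq_along(numbers),
                       function(i) sum(numbers[1:i] == numbers[i]))

print(all(index_ave == index_sapply))
```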

User · Answer

This can be done with outer, to get a matrix of equalities, followed by rowSums, with an obvious meaning. In order to have the counts and numbers in the same dataset, a data.frame is first created. This step is not needed if you want separate input and output.

    df <- data.frame(No = numbers)
    df$count <- rowSums(outer(df$No, df$No, FUN
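The call above breaks off at FUN in this copy of the answer. The following is only a sketch of the same idea, assuming the comparison function is `==` (an assumption, since the original argument is not shown) and using a shortened, hypothetical input:

```r
numbers <- c(4, 23, 4, 23, 5)  # shortened, hypothetical input

df <- data.frame(No = numbers)

# outer() builds a length(No) x length(No) logical matrix;
# entry [i, j] is TRUE when No[i] equals No[j].
# FUN = "==" is an assumption, not taken from the answer.
eq <- outer(df$No, df$No, FUN = "==")

# Each row sum is the number of elements equal to that row's value
df$count <- rowSums(eq)
print(df)
```

Because the matrix has one cell per pair of elements, memory use grows with the square of the vector length, so this suits short vectors best.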

User · Answer

Here is a way you could do it with dplyr:

    library(tidyverse)

    numbers <- c(4,23,4,23,5,43,54,56,657,67,67,435,
                 453,435,324,34,456,56,567,65,34,435)
    ord <- seq_along(numbers)

    df <- data.frame(ord, numbers)

    df <- df %>%
      count(numbers)

       numbers     n
         <dbl> <int>
     1       4     2
     2       5     1
     3      23     2
     4      34     2
     5      43     1
     6      54     1
     7      56     2
     8      65     1
     9      67     2
    10     324     1
    11     435     3
    12     453     1
    13     456     1
    14     567     1
    15     657     1

User · Answer

You can change the number to whatever you wish in the following line:

    length(which(numbers == 4))

User · Answer

You can make a function to give you results.

    # your list
    numbers <- c(4,23,4,23,5,43,54,56,657,67,67,435,
                 453,435,324,34,456,56,567,65,34,435)

    function1 <- function(x) {
      if (x == value) return(1) else return(0)
    }

    # set your value here
    value <- 4

    # make a vector which is 1 where the element equals your value, 0 otherwise
    vector <- sapply(numbers, function(x) function1(x))
    sum(vector)

Result: 2

User · Answer

The most direct way is sum(numbers == x).

numbers == x creates a logical vector which is TRUE at every location where x occurs, and when summing, the logical vector is coerced to numeric, which converts TRUE to 1 and FALSE to 0.

However, note that for floating point numbers it's better to use something like: sum(abs(numbers - x) < 1e-6).
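To illustrate the floating point caveat, here is a minimal sketch with a made-up vector `v` (hypothetical values, and the 1e-6 tolerance is simply the one suggested above):

```r
# 0.1 + 0.2 is not stored as exactly 0.3 in binary floating point
v <- c(0.1 + 0.2, 0.3, 0.5)
x <- 0.3

n_exact <- sum(v == x)             # misses the computed 0.1 + 0.2
n_tol   <- sum(abs(v - x) < 1e-6)  # counts both values near 0.3

print(n_exact)  # 1
print(n_tol)    # 2
```

The right tolerance depends on the scale of the data, so 1e-6 should be treated as a starting point rather than a universal constant.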

User · Answer

There is a standard function in R for that:

    tabulate(numbers)
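Note that tabulate() treats the values as positive integer bin indices, so it returns a count for every integer from 1 up to the largest value, zeros included, rather than one entry per distinct value. A minimal sketch with a small made-up vector `v`:

```r
# Hypothetical input of small positive integers
v <- c(2, 5, 2, 2, 5)

bins <- tabulate(v)  # counts for the integers 1, 2, 3, 4, 5
print(bins)          # 0 3 0 0 2

# The count for a specific value x is the x-th element
x <- 5
print(bins[x])       # 2
```

For the question's vector the result therefore has 657 entries, most of them zero, and the count for a value such as 435 is read off by position.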

User · Answer

Using table but without comparing with names:

    numbers <- c(4,23,4,23,5,43,54,56,657,67,67,435)
    x <- 67
    numbertable <- table(numbers)
    numbertable[as.character(x)]
    # 67
    #  2

table is useful when you are using the counts of different elements several times. If you need only one count, use sum(numbers == x).

User · Answer

I would probably do something like this:

    length(which(numbers==x))

But really, a better way is:

    table(numbers)

User · Answer

There are different ways of counting a specific element:

    library(plyr)
    numbers = c(4,23,4,23,5,43,54,56,657,67,67,435,453,435,7,65,34,435)

    print(length(which(numbers == 435)))

    # sum() counts the number of TRUEs in a logical vector
    print(sum(numbers == 435))
    print(sum(c(TRUE, FALSE, TRUE)))

    # count() is provided by the plyr library
    # the output of count() is a data frame; freq is one of its columns
    print(count(numbers[numbers == 435]))
    print(count(numbers[numbers == 435])[["freq"]])

User · Answer

Here's one fast and dirty way:

    x <- 23
    length(subset(numbers, numbers == x))
