Removing specific rows from a dataframe

Question

I have a data frame, e.g.:

    sub   day
    1      1
    1      2
    1      3
    1      4
    2      1
    2      2
    2      3
    2      4
    3      1
    3      2
    3      3
    3      4

and I would like to remove specific rows that can be identified by the combination of sub and day. For example, say I wanted to remove rows where sub='1' and day='2', and sub=3 and day='4'. How could I do this? I realise that I could specify the row numbers, but this needs to be applied to a huge dataframe, which would be tedious to go through and ID each row.

User · Answer

This boils down to two distinct steps:

1. Figure out when your condition is true, and hence compute a vector of booleans, or, as I prefer, their indices by wrapping it into which().
2. Create an updated data.frame by excluding the indices from the previous step.

Here is an example:

    R> set.seed(42)
    R> DF <- data.frame(sub=rep(1:4, each=4), day=sample(1:4, 16, replace=TRUE))
    R> DF
       sub day
    1    1   4
    2    1   4
    3    1   2
    4    1   4
    5    2   3
    6    2   3
    7    2   3
    8    2   1
    9    3   3
    10   3   3
    11   3   2
    12   3   3
    13   4   4
    14   4   2
    15   4   2
    16   4   4
    R> ind <- which(with( DF, sub==2 & day==3 ))
    R> ind
    [1] 5 6 7
    R> DF <- DF[ -ind, ]
    R> table(DF)
       day
    sub 1 2 3 4
      1 0 1 0 3
      2 1 0 0 0
      3 0 1 3 0
      4 0 2 0 2
    R>

And we see that sub==2 has only one entry remaining, with day==1.

Edit: The compound condition can be done with an 'or' as follows:

    ind <- which(with( DF, (sub==1 & day==2) | (sub==3 & day==4) ))

and here is a new full example:

    R> set.seed(1)
    R> DF <- data.frame(sub=rep(1:4, each=5), day=sample(1:4, 20, replace=TRUE))
    R> table(DF)
       day
    sub 1 2 3 4
      1 1 2 1 1
      2 1 0 2 2
      3 2 1 1 1
      4 0 2 1 2
    R> ind <- which(with( DF, (sub==1 & day==2) | (sub==3 & day==4) ))
    R> ind
    [1]  1  2 15
    R> DF <- DF[-ind, ]
    R> table(DF)
       day
    sub 1 2 3 4
      1 1 0 1 1
      2 1 0 2 2
      3 2 1 1 0
      4 0 2 1 2
    R>
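One thing worth guarding against with the which() approach: if no row matches, which() returns an empty vector, and negative indexing with an empty vector selects zero rows rather than all of them. Here is a minimal sketch of the two steps with that guard, using a deterministic data frame shaped like the one in the question (the data and the value 99 are only assumptions for illustration):

```r
# Example data shaped like the question: 3 subjects, 4 days each.
DF <- data.frame(sub = rep(1:3, each = 4), day = rep(1:4, times = 3))

# Step 1: indices of the rows to drop.
ind <- which(with(DF, (sub == 1 & day == 2) | (sub == 3 & day == 4)))

# Step 2: drop them, but only when there is something to drop.
kept <- if (length(ind) > 0) DF[-ind, ] else DF

# The pitfall: a condition that matches nothing gives an empty index,
# and DF[-integer(0), ] returns a data frame with zero rows.
none <- which(DF$sub == 99)
emptied <- DF[-none, ]

nrow(kept)     # rows left after removing the two matching combinations
nrow(emptied)  # zero rows, which is almost never what was intended
```

Indexing with the negated logical vector instead of which() sidesteps the empty case, at the cost of different behaviour when the columns contain NA.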

User · Answer

One simple solution:

    cond1 <- df$sub == 1 & df$day == 2
    cond2 <- df$sub == 3 & df$day == 4
    df <- df[!(cond1 | cond2), ]
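If there are more than a couple of combinations to remove, writing one cond variable per pair gets tedious. A rough sketch of the same idea driven by a lookup table follows; drop_pairs is a hypothetical name introduced here, not something from the question:

```r
df <- data.frame(sub = rep(1:3, each = 4), day = rep(1:4, times = 3))

# Hypothetical lookup table: one row per (sub, day) combination to remove.
drop_pairs <- data.frame(sub = c(1, 3), day = c(2, 4))

# Start with "drop nothing", then OR in one condition per pair.
drop <- rep(FALSE, nrow(df))
for (i in seq_len(nrow(drop_pairs))) {
  drop <- drop | (df$sub == drop_pairs$sub[i] & df$day == drop_pairs$day[i])
}

# Keep every row that matched none of the pairs.
df <- df[!drop, ]
```

Because drop starts as all FALSE, an empty drop_pairs leaves df unchanged.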

User · Answer

Here's a solution to your problem using dplyr's filter function.

Although you can pass your data frame as the first argument to any dplyr function, I've used its %>% operator, which pipes your data frame to one or more dplyr functions (just filter in this case).

Once you are somewhat familiar with dplyr, the cheat sheet is very handy.

    > print(df <- data.frame(sub=rep(1:3, each=4), day=1:4))
       sub day
    1    1   1
    2    1   2
    3    1   3
    4    1   4
    5    2   1
    6    2   2
    7    2   3
    8    2   4
    9    3   1
    10   3   2
    11   3   3
    12   3   4
    > print(df <- df %>% filter(!((sub==1 & day==2) | (sub==3 & day==4))))
       sub day
    1    1   1
    2    1   3
    3    1   4
    4    2   1
    5    2   2
    6    2   3
    7    2   4
    8    3   1
    9    3   2
    10   3   3
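For comparison, here is a small sketch of the pipe-free form mentioned above, next to a different dplyr technique, anti_join, which removes every row of the first table that has a match in the second. The anti_join variant is not part of the answer above, and drop_pairs is a hypothetical lookup table introduced only for this illustration:

```r
library(dplyr)

df <- data.frame(sub = rep(1:3, each = 4), day = rep(1:4, times = 3))

# Same filter as above, with the data frame passed as the first argument.
by_filter <- filter(df, !((sub == 1 & day == 2) | (sub == 3 & day == 4)))

# Alternative: list the unwanted (sub, day) combinations in their own table
# and keep only the rows of df that have no match in it.
drop_pairs <- data.frame(sub = c(1L, 3L), day = c(2L, 4L))
by_join <- anti_join(df, drop_pairs, by = c("sub", "day"))
```

The join form may scale better to a long list of combinations, since the pairs live in data rather than in a hand-written condition.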

User · Answer

    DF[ !( (DF$sub == 1 & DF$day == 2) | (DF$sub == 3 & DF$day == 4) ), ]   # note the ! (negation)

Or, if sub is a factor, as suggested by your use of quotes:

    DF[ !paste(DF$sub, DF$day, sep="_") %in% c("1_2", "3_4"), ]

Could also use subset:

    subset(DF, !paste(sub, day, sep="_") %in% c("1_2", "3_4"))

(And I endorse the use of which in Dirk's answer when using "[", even though some claim it is not needed.)
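One case where which() does make a difference with "[" is missing values. A minimal sketch, assuming a small data frame with an NA in sub (the data here is made up for illustration):

```r
DF <- data.frame(sub = c(1, 1, NA, 3), day = c(1, 2, 2, 4))

# The comparison is NA for the row where sub is missing.
drop <- (DF$sub == 1 & DF$day == 2) | (DF$sub == 3 & DF$day == 4)

# Logical indexing: an NA in the index yields a row filled entirely with NA.
by_logical <- DF[!drop, ]

# which() ignores NA, so the row with the missing sub is kept as it is.
by_which <- DF[-which(drop), ]

# The pasted key is never NA, so %in% also keeps that row intact.
by_key <- DF[!paste(DF$sub, DF$day, sep = "_") %in% c("1_2", "3_4"), ]
```

All three results have two rows, but only by_which and by_key preserve day = 2 for the row whose sub is missing.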
