How do I delete rows in a data frame?

Question

I have a data frame named "mydata" that looks like this:

   A  B  C  D
1  5  4  4  4
2  5  4  4  4
3  5  4  4  4
4  5  4  4  4
5  5  4  4  4
6  5  4  4  4
7  5  4  4  4

I'd like to delete rows 2, 4 and 6. For example, like this:

   A  B  C  D
1  5  4  4  4
3  5  4  4  4
5  5  4  4  4
7  5  4  4  4

User · Answer

The key idea is you form a set of the rows you want to remove, and keep the complement of that set.

In R, you get that complement by putting the '-' operator in front of the index vector: negative indices mean "everything except these positions".

So, assuming the data.frame is called myData:

myData[-c(2, 4, 6), ]   # notice the -

Of course, don't forget to reassign myData if you want to drop those rows for good; otherwise, R just prints the result and leaves myData unchanged.

myData <- myData[-c(2, 4, 6), ]
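As a minimal, self-contained sketch of the above, the following rebuilds the data frame from the question and drops rows 2, 4 and 6. The values come from the question; building it with data.frame() and rep() is only an assumption about how such a frame might be created.

```r
# Rebuild the example data frame from the question (7 identical rows).
myData <- data.frame(A = rep(5, 7), B = rep(4, 7), C = rep(4, 7), D = rep(4, 7))

# Negative indices drop rows 2, 4 and 6; the result is assigned back.
myData <- myData[-c(2, 4, 6), ]

# The original row names are kept, so the remaining rows are labelled 1, 3, 5, 7.
rownames(myData)
nrow(myData)  # 4
```

Note that the comma after the index vector matters: it tells R that the index applies to rows and that all columns are kept.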

User · Answer

For completeness, I'll add that this can be done with dplyr as well, using slice. The advantage of using this is that it can be part of a piped workflow (with other steps before or after it in the pipe):

df <- df %>%
  slice(-c(2, 4, 6))

Of course, you can also use it without pipes:

df <- slice(df, -c(2, 4, 6))

The "not vector" format, -c(2, 4, 6), means to get everything that is not at rows 2, 4 and 6. For an example using a range, let's say you wanted to remove the first 5 rows; you could do slice(df, 6:n()). For more examples, see the docs.
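Here is a small runnable sketch of both slice forms. It assumes the dplyr package is installed; the `row` column is a hypothetical helper added only to show which rows survive.

```r
library(dplyr)  # assumes dplyr is installed

df <- data.frame(A = rep(5, 7), B = rep(4, 7), C = rep(4, 7), D = rep(4, 7))
df$row <- 1:7  # hypothetical helper column recording the original position

# Piped form: drop rows 2, 4 and 6.
kept <- df %>% slice(-c(2, 4, 6))

# Range form: keep row 6 through the last row, i.e. drop the first 5 rows.
tail_rows <- df %>% slice(6:n())

kept$row       # 1 3 5 7
tail_rows$row  # 6 7
```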

User · Answer

By simplified sequence:

mydata[-(1:3 * 2), ]

By sequence:

mydata[seq(1, nrow(mydata), by = 2), ]

By negative sequence:

mydata[-seq(2, nrow(mydata), by = 2), ]

Or if you want to subset by selecting odd numbers:

mydata[which(1:nrow(mydata) %% 2 == 1), ]

Or if you want to subset by selecting odd numbers, version 2:

mydata[which(1:nrow(mydata) %% 2 != 0), ]

Or if you want to subset by filtering even numbers out (apply ! to the logical vector itself, not to the output of which):

mydata[!(1:nrow(mydata) %% 2 == 0), ]

Or if you want to subset by filtering even numbers out, version 2:

mydata[!(1:nrow(mydata) %% 2 != 1), ]
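To check that these index expressions pick the same rows, here is a short sketch on the 7-row data frame from the question (rebuilt with rep(), which is an assumption). It also shows why ! should not be combined with which(): which() returns positions, and negating positions gives a vector of FALSE.

```r
mydata <- data.frame(A = rep(5, 7), B = rep(4, 7), C = rep(4, 7), D = rep(4, 7))

by_simplified <- mydata[-(1:3 * 2), ]                       # 1:3 * 2 is 2, 4, 6
by_seq        <- mydata[seq(1, nrow(mydata), by = 2), ]     # rows 1, 3, 5, 7
by_neg_seq    <- mydata[-seq(2, nrow(mydata), by = 2), ]    # drop 2, 4, 6
by_logical    <- mydata[!(1:nrow(mydata) %% 2 == 0), ]      # drop even positions

rownames(by_simplified)  # "1" "3" "5" "7"

# Pitfall: which() yields the positions 2, 4, 6; !c(2, 4, 6) is
# FALSE FALSE FALSE, so this selects no rows at all.
broken <- mydata[!which(1:nrow(mydata) %% 2 == 0), ]
nrow(broken)  # 0
```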

User · Answer

Delete Dan from employee.data. No need to manage a new data frame:

employee.data <- subset(employee.data, name != "Dan")
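A minimal runnable sketch of this approach follows. The data frame contents (names and salaries) are made up for illustration and are not from the answer.

```r
# Hypothetical data; the rows and the salary column are assumptions.
employee.data <- data.frame(
  name   = c("Ann", "Dan", "Eve"),
  salary = c(52000, 48000, 61000),
  stringsAsFactors = FALSE
)

# Keep every row whose name is not "Dan", overwriting the original object.
employee.data <- subset(employee.data, name != "Dan")

employee.data$name  # "Ann" "Eve"
```

One thing to keep in mind is that subset() also drops rows for which the condition evaluates to NA, so rows with a missing name would disappear as well.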

User · Answer

Problems with deleting by row number

For quick and dirty analyses, you can delete rows of a data frame by number as per the top answer, i.e.:

newdata <- myData[-c(2, 4, 6), ]

However, if you are trying to write a robust data analysis script, you should generally avoid deleting rows by numeric position. This is because the order of the rows in your data may change in the future. A general principle of a data frame or database table is that the order of the rows should not matter. If the order does matter, this should be encoded in an actual variable in the data frame.

For example, imagine you imported a dataset and deleted rows by numeric position after inspecting the data and identifying the row numbers of the rows that you wanted to delete. However, at some later point, you go into the raw data, have a look around and reorder the data. Your row deletion code will now delete the wrong rows, and worse, you are unlikely to get any errors warning you that this has occurred.

Better strategy

A better strategy is to delete rows based on substantive and stable properties of the row. For example, if you had an id column that uniquely identifies each case, you could use that:

newdata <- myData[!(myData$id %in% c(2, 4, 6)), ]

Other times, you will have a formal exclusion criterion that could be specified, and you could use one of the many subsetting tools in R to exclude cases based on that rule.
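The difference can be sketched with a small made-up example. The data values and the reordering step are assumptions chosen only to make the effect visible.

```r
# Hypothetical data with an explicit id column.
myData <- data.frame(id = 1:7, A = c(30, 10, 70, 20, 60, 50, 40))

# Simulate the raw data being reordered at some later point.
reordered <- myData[order(myData$A), ]

# Positional deletion now silently removes different cases.
by_position <- reordered[-c(2, 4, 6), ]
sort(by_position$id)  # 1 2 3 6

# Deletion by id removes the intended cases regardless of row order.
by_id <- reordered[!(reordered$id %in% c(2, 4, 6)), ]
sort(by_id$id)        # 1 3 5 7
```

Neither call raises an error or warning, which is exactly why the positional version is risky in a script that is rerun on changing data.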

User · Answer

You can also work with a so-called boolean vector, aka logical:

row_to_keep = c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, TRUE)
myData = myData[row_to_keep, ]

Note that the ! operator acts as a NOT, i.e. !TRUE == FALSE:

myData = myData[!row_to_keep, ]

This seems a bit cumbersome in comparison to @mrwab's answer (+1 btw), but a logical vector can be generated on the fly, e.g. where a column value exceeds a certain value:

myData = myData[myData$A > 4, ]
myData = myData[!(myData$A > 4), ]   # equal to myData[myData$A <= 4, ]

You can transform a boolean vector to a vector of indices:

row_to_keep = which(myData$A > 4)

Finally, a very neat trick is that you can use this kind of subsetting not only for extraction, but also for assignment:

myData$A[myData$A > 4] <- NA

where column A is assigned NA (not available, R's missing-value marker) wherever A exceeds 4.
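Here is a compact runnable sketch of the logical-vector approach. The column values are made up so that the condition is TRUE for some rows and FALSE for others.

```r
# Hypothetical values; only column A matters for the condition.
myData <- data.frame(A = c(2, 5, 3, 6, 1), B = letters[1:5])

keep <- myData$A <= 4        # TRUE FALSE TRUE FALSE TRUE
filtered <- myData[keep, ]   # rows 1, 3 and 5

which(keep)                  # positions of the TRUE values: 1 3 5

# Assignment through a logical index: overwrite A where it exceeds 4.
myData$A[myData$A > 4] <- NA
myData$A                     # 2 NA 3 NA 1
```

If the column already contains NA, a plain logical index such as myData[myData$A > 4, ] returns rows filled with NA for those entries; wrapping the condition in which() avoids that, because which() ignores NA.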

User · Answer

Here's a quick and dirty function to remove a row by index:

removeRowByIndex <- function(x, row_index) {
  nr <- nrow(x)
  if (nr < row_index) {
    print("row_index exceeds number of rows")
  } else if (row_index == 1) {
    # drop the first row
    return(x[2:nr, ])
  } else if (row_index == nr) {
    # drop the last row
    return(x[1:(nr - 1), ])
  } else {
    # drop a row in the middle
    return(x[c(1:(row_index - 1), (row_index + 1):nr), ])
  }
}

Its main flaw is that the row_index argument doesn't follow the R pattern of being a vector of values. There may be other problems, as I only spent a couple of minutes writing and testing it, and have only started using R in the last few weeks. Any comments and improvements on this would be very welcome.
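One possible improvement, offered as a sketch rather than a definitive version, is to accept a vector of positions. The name removeRowsByIndex and the decision to stop on out-of-range positions are assumptions, not part of the answer above.

```r
# Hypothetical helper: drops the rows at the given positions.
removeRowsByIndex <- function(x, row_index) {
  stopifnot(is.data.frame(x), is.numeric(row_index))
  out_of_range <- row_index < 1 | row_index > nrow(x)
  if (any(out_of_range)) {
    stop("row_index contains positions outside 1..nrow(x)")
  }
  # setdiff() keeps working when row_index is empty, whereas
  # x[-row_index, ] would return zero rows for an empty index.
  keep <- setdiff(seq_len(nrow(x)), row_index)
  # drop = FALSE keeps the result a data frame even with a single column.
  x[keep, , drop = FALSE]
}

df <- data.frame(A = 1:7, B = letters[1:7])
removeRowsByIndex(df, c(2, 4, 6))$A      # 1 3 5 7
nrow(removeRowsByIndex(df, integer(0)))  # 7
```

Building the set of rows to keep, instead of negating the rows to drop, means the first, last and middle rows no longer need separate branches.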

User · Answer

Create an id column in your data frame, or use any existing column that identifies the row. Deleting by positional index is not a reliable approach.

Use the subset function to create a new data frame:

updated_myData <- subset(myData, id != 6)
print(updated_myData)

updated_myData <- subset(myData, id %in% c(1, 3, 5, 7))
print(updated_myData)
