How to drop a list of rows from Pandas dataframe

Question

I have a dataframe df:

    >>> df
                      sales  discount  net_sales    cogs
    STK_ID RPT_Date
    600141 20060331   2.709       NaN      2.709   2.245
           20060630   6.590       NaN      6.590   5.291
           20060930  10.103       NaN     10.103   7.981
           20061231  15.915       NaN     15.915  12.686
           20070331   3.196       NaN      3.196   2.710
           20070630   7.907       NaN      7.907   6.459

Then I want to drop the rows at certain sequence numbers, which are given in a list. Suppose the list is [1, 2, 4]; what should be left is:

                      sales  discount  net_sales    cogs
    STK_ID RPT_Date
    600141 20060331   2.709       NaN      2.709   2.245
           20061231  15.915       NaN     15.915  12.686
           20070630   7.907       NaN      7.907   6.459

How can I do that, or what function does it?

User · Accepted Answer

Use DataFrame.drop and pass it a Series of index labels:

    In [65]: df
    Out[65]:
           one  two
    one      1    4
    two      2    3
    three    3    2
    four     4    1

    In [66]: df.drop(df.index[[1, 3]])
    Out[66]:
           one  two
    one      1    4
    three    3    2
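Applied to the question's list [1, 2, 4], the same idea might look like the sketch below. The frame is a cut-down stand-in for the question's data (one column only; the construction of the two-level index is an assumption made for illustration).

```python
import pandas as pd

# Hypothetical stand-in for the question's frame: two-level index, one column.
index = pd.MultiIndex.from_product(
    [[600141], [20060331, 20060630, 20060930, 20061231, 20070331, 20070630]],
    names=["STK_ID", "RPT_Date"],
)
df = pd.DataFrame(
    {"sales": [2.709, 6.590, 10.103, 15.915, 3.196, 7.907]}, index=index
)

positions_to_drop = [1, 2, 4]

# df.index[positions] converts row positions into index labels,
# which is what drop() expects.
remaining = df.drop(df.index[positions_to_drop])
```

Here remaining keeps the rows at positions 0, 3 and 5, and df itself is left unchanged because drop returns a new frame by default.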

User · Answer

Use only the index argument to drop a row:

    df.drop(index=2, inplace=True)

For multiple rows:

    df.drop(index=[1, 3], inplace=True)
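One thing worth keeping in mind is that index= takes index labels, not row positions. The two coincide only for the default 0..n-1 index. A small sketch on a made-up frame whose labels are deliberately out of order shows the difference:

```python
import pandas as pd

# Made-up frame; the labels are reversed so labels and positions disagree.
df = pd.DataFrame({"value": [10, 20, 30, 40]}, index=[3, 2, 1, 0])

# Drops the rows *labelled* 1 and 3.
by_label = df.drop(index=[1, 3])

# Drops the rows at *positions* 1 and 3 by first looking up their labels.
by_position = df.drop(index=df.index[[1, 3]])
```

With this frame, by_label keeps the values 20 and 40, while by_position keeps 10 and 30.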

User · Answer

Look at the following dataframe df:

    df
       column1  column2  column3
    0        1       11       21
    1        2       12       22
    2        3       13       23
    3        4       14       24
    4        5       15       25
    5        6       16       26
    6        7       17       27
    7        8       18       28
    8        9       19       29
    9       10       20       30

Let's drop all the rows that have an odd number in column1. Create a list of the elements in column1 that are even numbers (the elements that you don't want to drop):

    keep_elements = [x for x in df.column1 if x % 2 == 0]

All the rows with the values [2, 4, 6, 8, 10] in column1 will be retained.

    df.set_index('column1', inplace=True)
    df.drop(df.index.difference(keep_elements), axis=0, inplace=True)
    df.reset_index(inplace=True)

We make column1 the index and drop all the rows that are not required. Then we reset the index back.

    df
       column1  column2  column3
    0        2       12       22
    1        4       14       24
    2        6       16       26
    3        8       18       28
    4       10       20       30

User · Answer

To drop rows with indices 1, 2, 4 you can use:

    df[~df.index.isin([1, 2, 4])]

The tilde operator ~ negates the result of the method isin. Another option is to drop the indices:

    df.loc[df.index.drop([1, 2, 4])]
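Both forms can be tried side by side on a small made-up frame with the default integer index (the column name and values are placeholders):

```python
import pandas as pd

df = pd.DataFrame({"value": list("abcdef")})  # default labels 0..5
labels_to_drop = [1, 2, 4]

# Boolean mask: True for rows whose label is NOT in the list.
kept_by_mask = df[~df.index.isin(labels_to_drop)]

# Index.drop returns the remaining labels, which .loc then selects.
kept_by_loc = df.loc[df.index.drop(labels_to_drop)]
```

The two differ when a label in the list does not exist: isin silently ignores it, whereas Index.drop raises a KeyError by default.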

User · Answer

In a comment to @theodros-zelleke's answer, @j-jones asked what to do if the index is not unique. I had to deal with such a situation. What I did was to rename the duplicates in the index before I called drop(), a la:

    dropped_indexes = <determine-indexes-to-drop>
    df.index = rename_duplicates(df.index)
    df.drop(df.index[dropped_indexes], inplace=True)

where rename_duplicates() is a function I defined that went through the elements of the index and renamed the duplicates. I used the same renaming pattern that pd.read_csv() uses on columns, i.e. "%s.%d" % (name, count), where name is the name of the row and count is how many times it has occurred previously.
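The answer does not show rename_duplicates itself. One possible version, written here purely as an illustration of the described pattern, could look like this; the function body, the sample frame and the positions in dropped_indexes are all assumptions:

```python
from collections import Counter

import pandas as pd


def rename_duplicates(index):
    """Hypothetical helper: suffix repeated labels as 'name.count'."""
    seen = Counter()
    renamed = []
    for name in index:
        count = seen[name]  # how many times this label occurred previously
        renamed.append(name if count == 0 else "%s.%d" % (name, count))
        seen[name] += 1
    return pd.Index(renamed)


# Made-up frame with a non-unique index.
df = pd.DataFrame({"value": [1, 2, 3, 4]}, index=["a", "b", "a", "b"])
dropped_indexes = [2]  # row positions to remove (placeholder choice)

df.index = rename_duplicates(df.index)  # labels become a, b, a.1, b.1
df = df.drop(df.index[dropped_indexes])
```

This sketch assumes string labels and does not guard against a generated name such as "a.1" colliding with a label that already exists.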

User · Answer

Consider an example dataframe:

    df
    index    column1
    0           00
    1           10
    2           20
    3           30

We want to drop the 2nd and 3rd rows.

Approach 1:

    df = df.drop(df.index[[1, 2]])

or

    df.drop(df.index[[1, 2]], inplace=True)
    print(df)

    df
    index    column1
    0           00
    3           30

This approach removes the rows as we wanted, but the index is left with gaps.

Approach 2:

    df = df.drop(df.index[[1, 2]]).reset_index(drop=True)
    print(df)

    df
    index    column1
    0           00
    1           30

This approach removes the rows as we wanted and resets the index.
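A runnable sketch of both outcomes, rebuilding the example frame and using reset_index(drop=True) to renumber the rows after the drop:

```python
import pandas as pd

df = pd.DataFrame({"column1": ["00", "10", "20", "30"]})

# Drop the 2nd and 3rd rows (positions 1 and 2); labels 0 and 3 remain.
kept = df.drop(df.index[[1, 2]])

# Renumber from 0; drop=True discards the old labels instead of
# keeping them as a new column.
renumbered = kept.reset_index(drop=True)
```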

User · Answer

Note that it may be important to use the inplace argument when you want to do the drop in place:

    df.drop(df.index[[1, 3]], inplace=True)

Without it, drop returns a new DataFrame and leaves df unchanged, so use this form when you are not assigning the result to anything.

http://pandas.pydata.org/pandas-docs/version/0.17.0/generated/pandas.DataFrame.drop.html
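The difference between the two calling styles can be checked on a made-up frame (column name and values are placeholders):

```python
import pandas as pd

df = pd.DataFrame({"value": [10, 20, 30, 40]})

# Default: a new frame comes back and df keeps all of its rows.
result = df.drop(df.index[[1, 3]])
rows_after_plain_drop = len(df)

# inplace=True: df itself is modified and the call returns None.
returned = df.drop(df.index[[1, 3]], inplace=True)
```

A common slip is writing df = df.drop(..., inplace=True), which replaces df with None.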

User · Answer

Here is a somewhat specific example I would like to show. Say you have many duplicate entries in some of your rows. If you have string entries, you can easily use string methods to find all the indexes to drop:

    ind_drop = df[df['column_of_strings'].apply(lambda x: x.startswith('Keyword'))].index

And now drop those rows using their indexes:

    new_df = df.drop(ind_drop)
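As a variation, the same selection can be sketched with the vectorised .str accessor instead of apply with a lambda. This swaps in a different technique from the one in the answer; the sample strings are made up:

```python
import pandas as pd

df = pd.DataFrame(
    {"column_of_strings": ["Keyword one", "other", "Keyword two", "last"]}
)

# Labels of the rows whose string starts with "Keyword".
ind_drop = df[df["column_of_strings"].str.startswith("Keyword")].index

new_df = df.drop(ind_drop)
```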

User · Answer

If the DataFrame is huge, and the number of rows to drop is large as well, then a simple drop by index, df.drop(df.index[]), takes too much time.

In my case, I have a multi-indexed DataFrame of floats with 100M rows x 3 cols, and I need to remove 10k rows from it. The fastest method I found is, quite counterintuitively, to take the remaining rows.

Let indexes_to_drop be an array of positional indexes to drop ([1, 2, 4] in the question).

    indexes_to_keep = set(range(df.shape[0])) - set(indexes_to_drop)
    df_sliced = df.take(list(indexes_to_keep))

In my case this took 20.5s, while the simple df.drop took 5min 27s and consumed a lot of memory. The resulting DataFrame is the same.
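A small-scale sketch of the take approach follows. It is not a benchmark, and the frame is made up. The sorted call is an addition here: set iteration order is not guaranteed by the language, so sorting makes the row order explicit.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"value": np.arange(6) * 1.5})
indexes_to_drop = [1, 2, 4]  # row positions, as in the question

# Positions that survive, in ascending order.
indexes_to_keep = sorted(set(range(df.shape[0])) - set(indexes_to_drop))

# take() selects rows by position.
df_sliced = df.take(indexes_to_keep)

# Reference result from the plain drop, for comparison.
df_dropped = df.drop(df.index[indexes_to_drop])
```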

User · Answer

I solved this in a simpler way, in just 2 steps:

1. Make a dataframe with the unwanted rows.
2. Use the index of this unwanted dataframe to drop the rows from the original dataframe.

Example: suppose you have a dataframe df which has many columns, including 'Age', which is an integer. Now let's say you want to drop all the rows where 'Age' is a negative number.

    df_age_negative = df[df['Age'] < 0]           # Step 1
    df = df.drop(df_age_negative.index, axis=0)   # Step 2

Hope this is much simpler and helps you.

User · Answer

If I want to drop a row which has, let's say, index x, I would do the following:

    df = df[df.index != x]

If I wanted to drop multiple indices, say the positions in the list unwanted_indices, I would do:

    desired_indices = [i for i in range(len(df.index)) if i not in unwanted_indices]
    desired_df = df.iloc[desired_indices]

User · Answer

You can also pass DataFrame.drop the label itself (instead of a Series of index labels):

    In [17]: df
    Out[17]:
                a         b         c         d         e
    one  0.456558 -2.536432  0.216279 -1.305855 -0.121635
    two -1.015127 -0.445133  1.867681  2.179392  0.518801

    In [18]: df.drop('one')
    Out[18]:
                a         b         c         d         e
    two -1.015127 -0.445133  1.867681  2.179392  0.518801

Which is equivalent to:

    In [19]: df.drop(df.index[[0]])
    Out[19]:
                a         b         c         d         e
    two -1.015127 -0.445133  1.867681  2.179392  0.518801

User · Answer

Determining the index from the boolean mask as described above, e.g.

    df[df['column'].isin(values)].index

can be more memory intensive than determining the index using this method:

    pd.Index(np.where(df['column'].isin(values))[0])

applied like so:

    df.drop(pd.Index(np.where(df['column'].isin(values))[0]), inplace=True)

This method is useful when dealing with large dataframes and limited memory. Note that np.where returns row positions, so they match the index labels only when df has the default 0..n-1 index.
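A sketch of the np.where route on a made-up frame with non-default labels. Here the positions are mapped through df.index before dropping, which is an adjustment for frames that do not use the default index; the column names and values are placeholders:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"column": ["a", "b", "c", "b"], "other": [1, 2, 3, 4]},
    index=[10, 11, 12, 13],
)
values = ["b"]

# Row positions where the condition holds.
positions = np.where(df["column"].isin(values).to_numpy())[0]

# Translate positions into labels so drop() works for any index.
labels = df.index[positions]
df.drop(labels, inplace=True)
```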
