Deleting DataFrame row in Pandas based on column value

Question

I have the following DataFrame:

                daysago  line_race  rating        rw    wrating
    line_date
    2007-03-31       62         11      56  1.000000  56.000000
    2007-03-10       83         11      67  1.000000  67.000000
    2007-02-10      111          9      66  1.000000  66.000000
    2007-01-13      139         10      83  0.880678  73.096278
    2006-12-23      160         10      88  0.793033  69.786942
    2006-11-09      204          9      52  0.636655  33.106077
    2006-10-22      222          8      66  0.581946  38.408408
    2006-09-29      245          9      70  0.518825  36.317752
    2006-09-16      258         11      68  0.486226  33.063381
    2006-08-30      275          8      72  0.446667  32.160051
    2006-02-11      475          5      65  0.164591  10.698423
    2006-01-13      504          0      70  0.142409   9.968634
    2006-01-02      515          0      64  0.134800   8.627219
    2005-12-06      542          0      70  0.117803   8.246238
    2005-11-29      549          0      70  0.113758   7.963072
    2005-11-22      556          0      -1  0.109852  -0.109852
    2005-11-01      577          0      -1  0.098919  -0.098919
    2005-10-20      589          0      -1  0.093168  -0.093168
    2005-09-27      612          0      -1  0.083063  -0.083063
    2005-09-07      632          0      -1  0.075171  -0.075171
    2005-06-12      719          0      69  0.048690   3.359623
    2005-05-29      733          0      -1  0.045404  -0.045404
    2005-05-02      760          0      -1  0.039679  -0.039679
    2005-04-02      790          0      -1  0.034160  -0.034160
    2005-03-13      810          0      -1  0.030915  -0.030915
    2004-11-09      934          0      -1  0.016647  -0.016647

I need to remove the rows where line_race is equal to 0. What's the most efficient way to do this?

User · Answer

Another way of doing it. It may not be the most efficient way, since the code looks a bit more complex than the code in the other answers, but it is still an alternative way of doing the same thing:

    df = df.drop(df[df['line_race'] == 0].index)
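As a minimal sketch of the drop-by-index approach, here it is applied to a small frame. The values below are made up for illustration and are not the question's data, and the name `zero_labels` is only a helper for this example.

```python
import pandas as pd

# Small illustrative frame; the values are invented for this example.
df = pd.DataFrame({"line_race": [11, 0, 9, 0], "rating": [56, 70, 66, -1]})

# Collect the index labels of the rows where line_race is 0 ...
zero_labels = df[df["line_race"] == 0].index

# ... and drop those labels, keeping everything else.
df = df.drop(zero_labels)

print(df)
```

Because `drop` works on index labels, a non-unique index would also remove any other rows that share a label with a matched row.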

User · Answer

The best way to do this is with boolean masking:

    In [56]: df
    Out[56]:
         line_date  daysago  line_race  rating    raw  wrating
    0   2007-03-31       62         11      56  1.000   56.000
    1   2007-03-10       83         11      67  1.000   67.000
    2   2007-02-10      111          9      66  1.000   66.000
    3   2007-01-13      139         10      83  0.881   73.096
    4   2006-12-23      160         10      88  0.793   69.787
    5   2006-11-09      204          9      52  0.637   33.106
    6   2006-10-22      222          8      66  0.582   38.408
    7   2006-09-29      245          9      70  0.519   36.318
    8   2006-09-16      258         11      68  0.486   33.063
    9   2006-08-30      275          8      72  0.447   32.160
    10  2006-02-11      475          5      65  0.165   10.698
    11  2006-01-13      504          0      70  0.142    9.969
    12  2006-01-02      515          0      64  0.135    8.627
    13  2005-12-06      542          0      70  0.118    8.246
    14  2005-11-29      549          0      70  0.114    7.963
    15  2005-11-22      556          0      -1  0.110   -0.110
    16  2005-11-01      577          0      -1  0.099   -0.099
    17  2005-10-20      589          0      -1  0.093   -0.093
    18  2005-09-27      612          0      -1  0.083   -0.083
    19  2005-09-07      632          0      -1  0.075   -0.075
    20  2005-06-12      719          0      69  0.049    3.360
    21  2005-05-29      733          0      -1  0.045   -0.045
    22  2005-05-02      760          0      -1  0.040   -0.040
    23  2005-04-02      790          0      -1  0.034   -0.034
    24  2005-03-13      810          0      -1  0.031   -0.031
    25  2004-11-09      934          0      -1  0.017   -0.017

    In [57]: df[df.line_race != 0]
    Out[57]:
         line_date  daysago  line_race  rating    raw  wrating
    0   2007-03-31       62         11      56  1.000   56.000
    1   2007-03-10       83         11      67  1.000   67.000
    2   2007-02-10      111          9      66  1.000   66.000
    3   2007-01-13      139         10      83  0.881   73.096
    4   2006-12-23      160         10      88  0.793   69.787
    5   2006-11-09      204          9      52  0.637   33.106
    6   2006-10-22      222          8      66  0.582   38.408
    7   2006-09-29      245          9      70  0.519   36.318
    8   2006-09-16      258         11      68  0.486   33.063
    9   2006-08-30      275          8      72  0.447   32.160
    10  2006-02-11      475          5      65  0.165   10.698

UPDATE: Now that pandas 0.13 is out, another way to do this is df.query('line_race != 0').
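To make the two forms concrete, the sketch below applies both the boolean mask and `query` to a small made-up frame (the data is illustrative only) and checks that they select the same rows.

```python
import pandas as pd

# Illustrative data only; not the frame from the question.
df = pd.DataFrame({"line_race": [11, 0, 9, 0], "rating": [56, 70, 66, -1]})

# Boolean mask: True for rows to keep, False for rows to discard.
masked = df[df["line_race"] != 0]

# The same filter written as a query string.
queried = df.query("line_race != 0")

# Both return a new DataFrame with the same rows; df itself is unchanged.
print(masked.equals(queried))
```

Neither form modifies `df`; assign the result back (`df = df[...]`) if the filtered frame should replace the original.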

User · Answer

In case of multiple values and str dtype, I used the following to filter out given values in a column:

    def filter_rows_by_values(df, col, values):
        # Keep only the rows whose value in `col` is not in `values`.
        return df[~df[col].isin(values)]

Example: in a DataFrame, I want to remove the rows which have the values "b" and "c" in column "str":

    df = pd.DataFrame({"str": ["a", "a", "a", "a", "b", "b", "c"], "other": [1, 2, 3, 4, 5, 6, 7]})
    df

      str  other
    0   a      1
    1   a      2
    2   a      3
    3   a      4
    4   b      5
    5   b      6
    6   c      7

    filter_rows_by_values(df, "str", ["b", "c"])

      str  other
    0   a      1
    1   a      2
    2   a      3
    3   a      4

User · Answer

Just adding another way, for a DataFrame, applied over all columns:

    for column in df.columns:
        df = df[df[column] != 0]

Example:

    import numpy as np

    def z_score(data, count):
        threshold = 3
        for column in data.columns:
            mean = np.mean(data[column])
            std = np.std(data[column])
            for i in data[column]:
                zscore = (i - mean) / std
                if np.abs(zscore) > threshold:
                    # Count the outlier and drop every row holding this value.
                    count = count + 1
                    data = data[data[column] != i]
        return data, count
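The column-by-column loop can also be expressed as a single mask over the whole frame. The sketch below uses made-up data and compares the two forms; it is one possible vectorized equivalent, not the answerer's own code.

```python
import pandas as pd

# Illustrative data: each of rows 1 and 2 contains a zero in some column.
df = pd.DataFrame({"a": [1, 0, 3, 4], "b": [5, 6, 0, 8]})

# Loop form: filter on one column at a time.
looped = df.copy()
for column in looped.columns:
    looped = looped[looped[column] != 0]

# Vectorized form: keep the rows where every column is non-zero.
vectorized = df[(df != 0).all(axis=1)]

print(looped.equals(vectorized))
```

The vectorized form builds one boolean mask instead of re-filtering the frame once per column.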

User · Answer

A note for future readers: the pattern df = df[df.line_race != 0] doesn't do anything when you use it to filter for None/missing values.

Does work:

    df = df[df.line_race != 0]

Doesn't do anything:

    df = df[df.line_race != None]

Does work:

    df = df[df.line_race.notnull()]
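A small sketch of this behavior, using a made-up float column that contains one missing value: the `!= None` comparison evaluates to True for every row, so nothing is removed, while `notnull()` drops the missing row.

```python
import numpy as np
import pandas as pd

# Illustrative column with one missing value (NaN).
df = pd.DataFrame({"line_race": [11.0, np.nan, 0.0, 9.0]})

# Comparing against None yields True for every row, so no row is filtered out.
not_none = df[df["line_race"] != None]  # noqa: E711

# notnull() is False exactly where the value is missing.
kept = df[df["line_race"].notnull()]

print(len(df), len(not_none), len(kept))
```

`notna()` is an alias of `notnull()`, and `df.dropna(subset=["line_race"])` is another way to drop the rows with a missing value in that column.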

User · Answer

The previous answers are very similar to what I am going to do, but indexing df.index directly does not require another indexing method such as .loc. It can be done in a similar but more concise manner:

    df.drop(df.index[df['line_race'] == 0], inplace=True)

User · Answer

The given answer is correct. Nonetheless, as someone above said, you can use df.query('line_race != 0'), which, depending on your problem, can be much faster. Highly recommended.

User · Answer

I ran this code and it works; you can try it on your own.

    data = pd.read_excel('file.xlsx')

If the column name contains a special character or a space, write it in quotes inside brackets, as in this code:

    data = data[data['expire t'].notnull()]
    print(data)

If the column name is a single string without any spaces or special characters, you can access it directly as an attribute:

    data = data[data.expire != 0]
    print(data)

User · Answer

Just to add another solution, particularly useful if you are using the new pandas accessors: the other solutions replace the original DataFrame object and lose the accessors.

    df.drop(df.loc[df['line_race'] == 0].index, inplace=True)
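The difference between filtering in place and rebinding the name can be seen with a second reference to the same object. This is a minimal sketch on made-up data; `alias` is a hypothetical name used only for the illustration.

```python
import pandas as pd

df = pd.DataFrame({"line_race": [11, 0, 9, 0]})
alias = df  # a second reference to the same DataFrame object

# In-place drop: the existing object is modified, so alias sees the change.
df.drop(df.loc[df["line_race"] == 0].index, inplace=True)
same_object = alias is df

# Boolean masking instead builds a new DataFrame and leaves the old one alone.
filtered = df[df["line_race"] != 9]
new_object = filtered is not df

print(same_object, new_object)
```

Anything that holds a reference to the original object therefore keeps working after the in-place drop, whereas `df = df[...]` points the name `df` at a new object.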

User · Answer

If I'm understanding correctly, it should be as simple as:

    df = df[df.line_race != 0]

User · Answer

If you want to delete rows based on multiple values of the column, you could use:

    df[(df.line_race != 0) & (df.line_race != 10)]

to drop all rows with the values 0 and 10 for line_race.
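When the list of values to exclude grows, chaining `!=` comparisons gets unwieldy. The sketch below, on made-up data, shows the chained form next to an equivalent written with `isin` and checks that they agree.

```python
import pandas as pd

# Illustrative data only.
df = pd.DataFrame({"line_race": [11, 0, 10, 9, 0]})

# Chained comparisons: each condition needs its own parentheses around it.
chained = df[(df["line_race"] != 0) & (df["line_race"] != 10)]

# Equivalent filter: negate membership in the list of unwanted values.
via_isin = df[~df["line_race"].isin([0, 10])]

print(chained.equals(via_isin))
```

The parentheses in the chained form are required because `&` binds more tightly than `!=` in Python.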
