Get first row of dataframe in Python Pandas based on criteria

Question

Let's say that I have a dataframe like this one:

import pandas as pd
df = pd.DataFrame([[1, 2, 1], [1, 3, 2], [4, 6, 3], [4, 3, 4], [5, 4, 5]], columns=['A', 'B', 'C'])

>>> df
   A  B  C
0  1  2  1
1  1  3  2
2  4  6  3
3  4  3  4
4  5  4  5

The original table is more complicated, with more columns and rows.

I want to get the first row that fulfils some criteria. Examples:

Get first row where A > 3 (returns row 2)
Get first row where A > 4 AND B > 3 (returns row 4)
Get first row where A > 3 AND (B > 3 OR C > 2) (returns row 2)

But if there isn't any row that fulfils the specific criteria, then I want to get the first one after I sort the dataframe descending by A (or, in other cases, by B, C, etc.):

Get first row where A > 6 (returns row 4 by ordering it by A desc and taking the first one)

I was able to do it by iterating over the dataframe (I know that's crap :P). So I would prefer a more pythonic way to solve it.

User · Answer

You can take care of the first 3 items with boolean slicing and head:

df[df.A > 3].head(1)
df[(df.A > 4) & (df.B > 3)].head(1)
df[(df.A > 3) & ((df.B > 3) | (df.C > 2))].head(1)

If nothing comes back, you can handle the fallback with a try or an if:

try:
    output = df[df.A > 6].head(1)
    assert len(output) == 1
except AssertionError:
    # No match: fall back to the row with the largest A.
    output = df.sort_values('A', ascending=False).head(1)
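
The "if" alternative mentioned above could look like the following minimal sketch. The helper name `first_match_or_top` is hypothetical (it is not part of pandas), and the sketch assumes the fallback is always "largest value of one column first".

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 1], [1, 3, 2], [4, 6, 3], [4, 3, 4], [5, 4, 5]],
    columns=["A", "B", "C"],
)


def first_match_or_top(frame, mask, sort_col):
    """Hypothetical helper: first row matching mask, else the top row by sort_col."""
    matches = frame[mask]
    if matches.empty:
        # No row satisfies the criteria: sort descending and take the first row.
        return frame.sort_values(sort_col, ascending=False).head(1)
    return matches.head(1)


hit = first_match_or_top(df, df.A > 3, "A")       # a row matches the mask
fallback = first_match_or_top(df, df.A > 6, "A")  # no match, so the fallback is used
```

Checking `matches.empty` avoids using an exception for ordinary control flow, and both branches return a one-row DataFrame.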

User · Answer

This tutorial is a very good one for pandas slicing. Make sure you check it out. Onto some snippets...

To slice a dataframe with a condition, you use this format:

>>> df[condition]

This will return a slice of your dataframe which you can index using iloc. Here are your examples:

Get first row where A > 3 (returns row 2)

>>> df[df.A > 3].iloc[0]
A    4
B    6
C    3
Name: 2, dtype: int64

If what you actually want is the row number, rather than using iloc, it would be df[df.A > 3].index[0].

Get first row where A > 4 AND B > 3:

>>> df[(df.A > 4) & (df.B > 3)].iloc[0]
A    5
B    4
C    5
Name: 4, dtype: int64

Get first row where A > 3 AND (B > 3 OR C > 2) (returns row 2)

>>> df[(df.A > 3) & ((df.B > 3) | (df.C > 2))].iloc[0]
A    4
B    6
C    3
Name: 2, dtype: int64

Now, with your last case we can write a function that handles the default case of returning the descending-sorted frame:

>>> def series_or_default(X, condition, default_col, ascending=False):
...     sliced = X[condition]
...     if sliced.shape[0] == 0:
...         return X.sort_values(default_col, ascending=ascending).iloc[0]
...     return sliced.iloc[0]
...
>>> series_or_default(df, df.A > 6, 'A')
A    5
B    4
C    5
Name: 4, dtype: int64

As expected, it returns row 4.

User · Answer

For existing matches, use query:

df.query('A > 3').head(1)
Out[33]:
   A  B  C
2  4  6  3

df.query('A > 4 and B > 3').head(1)
Out[34]:
   A  B  C
4  5  4  5

df.query('A > 3 and (B > 3 or C > 2)').head(1)
Out[35]:
   A  B  C
2  4  6  3
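
This answer only covers the case where a match exists. One way the query approach might be extended to the no-match case from the question is sketched below. The helper name `query_first` is hypothetical, and the fallback (sort descending by one column) is taken from the question rather than from this answer.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 1], [1, 3, 2], [4, 6, 3], [4, 3, 4], [5, 4, 5]],
    columns=["A", "B", "C"],
)


def query_first(frame, expr, sort_col):
    """Hypothetical helper: first row matching the query string, else top row by sort_col."""
    matches = frame.query(expr)
    if matches.empty:
        # Nothing matched the expression: fall back to the largest sort_col value.
        return frame.sort_values(sort_col, ascending=False).head(1)
    return matches.head(1)


row_a = query_first(df, "A > 3 and (B > 3 or C > 2)", "A")
row_b = query_first(df, "A > 6", "A")
```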

User · Answer

For the point that "returns the value as soon as you find the first row/record that meets the requirements and NOT iterating other rows", the following code would work:

def pd_iter_func(df):
    for row in df.itertuples():
        # Define your criteria here
        if row.A > 4 and row.B > 3:
            return row

Because it stops at the first match, it can be more efficient than Boolean indexing on a large dataframe when a matching row appears early; Boolean indexing always evaluates the condition on every row.

To make the function above more widely applicable, one can pass the criteria in as a lambda function:

from typing import Callable, NamedTuple, Optional
from pandas import DataFrame

def pd_iter_func(df: DataFrame, criteria: Callable[[NamedTuple], bool]) -> Optional[NamedTuple]:
    for row in df.itertuples():
        if criteria(row):
            return row
    # No row met the criteria
    return None

pd_iter_func(df, lambda row: row.A > 4 and row.B > 3)

As mentioned in the answer to the "mirror" question, pandas.Series.idxmax would also be a nice choice:

def pd_idxmax_func(df, mask):
    return df.loc[mask.idxmax()]

pd_idxmax_func(df, (df.A > 4) & (df.B > 3))
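
One caveat with the idxmax approach: when no row matches, the mask is all False, and idxmax then returns the first index label, so the first row comes back as if it were a match. A minimal sketch of a guarded variant follows. The name `first_row_idxmax` is hypothetical, and returning None for "no match" is an assumption about the desired behaviour.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 1], [1, 3, 2], [4, 6, 3], [4, 3, 4], [5, 4, 5]],
    columns=["A", "B", "C"],
)


def first_row_idxmax(frame, mask):
    """Hypothetical helper: first row where mask is True, or None if nothing matches."""
    if not mask.any():
        # On an all-False mask, idxmax would return the first label (a false hit).
        return None
    # idxmax returns the label of the first True value in a boolean mask.
    return frame.loc[mask.idxmax()]


match = first_row_idxmax(df, (df.A > 4) & (df.B > 3))
missing = first_row_idxmax(df, df.A > 6)
```

Note that this variant still builds the full boolean mask, so unlike the itertuples loop it does not stop early.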
