Get index of a row of a pandas dataframe as an integer

Question

Assume an easy dataframe, for example

    A         B
0   1  0.810743
1   2  0.595866
2   3  0.154888
3   4  0.472721
4   5  0.894525
5   6  0.978174
6   7  0.859449
7   8  0.541247
8   9  0.232302
9  10  0.276566

How can I retrieve an index value of a row, given a condition? For example:

dfb = df[df['A']==5].index.values.astype(int)

returns [4], but what I would like to get is just 4. This is causing me troubles later in the code.

Based on some conditions, I want to have a record of the indexes where that condition is fulfilled, and then select rows between.

I tried

dfb = df[df['A']==5].index.values.astype(int)
dfbb = df[df['A']==8].index.values.astype(int)
df.loc[dfb:dfbb,'B']

for a desired output

    A         B
4   5  0.894525
5   6  0.978174
6   7  0.859449

but I get TypeError: '[4]' is an invalid key

User · Answer

Because we want to include the row where A == 5 and all rows up to but not including the row where A == 8, we end up using iloc (loc includes both ends of a slice).
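As a small illustration of that difference, the sketch below rebuilds the sample frame from the question (the construction with `pd.DataFrame` is my assumption; the question only shows the printed frame) and slices it both ways.

```python
import pandas as pd

# Sample frame rebuilt from the values printed in the question.
df = pd.DataFrame({
    "A": range(1, 11),
    "B": [0.810743, 0.595866, 0.154888, 0.472721, 0.894525,
          0.978174, 0.859449, 0.541247, 0.232302, 0.276566],
})

by_label = df.loc[4:7]      # label slice: both ends are included
by_position = df.iloc[4:7]  # positional slice: the end is excluded

print(by_label["A"].tolist())     # [5, 6, 7, 8]
print(by_position["A"].tolist())  # [5, 6, 7]
```

With the default integer index the labels and positions happen to coincide, which is why only the treatment of the end point differs here.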

To get the index labels we use idxmax, which returns the index label of the first occurrence of the maximum value. I run it on a boolean series where A == 5 (and again where A == 8), which returns the index label at which A == 5 first occurs (likewise for A == 8).

Then I use searchsorted to find the ordinal position of each index label found above (this relies on the index being sorted). These positions are what I pass to iloc.

i5, i8 = df.index.searchsorted([df.A.eq(5).idxmax(), df.A.eq(8).idxmax()])
df.iloc[i5:i8]
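A possible variation, shown here only as a sketch: `Index.get_loc` can stand in for searchsorted when the index is not sorted, assuming the labels are unique. The helper name `rows_between` and the letter index are hypothetical, chosen to make the label/position distinction visible. The check with `any()` is there because idxmax on an all-False mask returns the first label instead of signalling a miss.

```python
import pandas as pd

# Hypothetical frame with a non-integer index so labels and positions differ.
df = pd.DataFrame({"A": range(1, 11)}, index=list("abcdefghij"))

def rows_between(frame, col, start, stop):
    """Rows from the first `start` match up to, but not including, the first `stop` match."""
    is_start, is_stop = frame[col].eq(start), frame[col].eq(stop)
    if not (is_start.any() and is_stop.any()):
        # idxmax on an all-False mask would silently return the first label
        raise ValueError("start or stop value not found")
    # idxmax gives the label; get_loc turns a unique label into its position
    i1 = frame.index.get_loc(is_start.idxmax())
    i2 = frame.index.get_loc(is_stop.idxmax())
    return frame.iloc[i1:i2]

result = rows_between(df, "A", 5, 8)
print(result.index.tolist())  # ['e', 'f', 'g']
```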

numpy

You can further enhance this by using the underlying numpy objects and the analogous numpy functions. I wrapped it up into a handy function.

def find_between(df, col, v1, v2):
    vals = df[col].values
    # argmax on a boolean array gives the position of the first True
    mx1, mx2 = (vals == v1).argmax(), (vals == v2).argmax()
    # these are already ordinal positions, so they go straight into iloc
    return df.iloc[mx1:mx2]

find_between(df, 'A', 5, 8)

User · Answer

The easiest fix is to add [0], which selects the first value of the one-element array:

dfb = df[df['A']==5].index.values.astype(int)[0]
dfbb = df[df['A']==8].index.values.astype(int)[0]

or:

dfb = int(df[df['A']==5].index[0])
dfbb = int(df[df['A']==8].index[0])

But if it is possible that no value matches, an error is raised, because the first value does not exist.

A solution is to use next with iter, which returns a default value if nothing matches:

dfb = next(iter(df[df['A']==5].index), 'no match')
print (dfb)
4

dfb = next(iter(df[df['A']==50].index), 'no match')
print (dfb)
no match

Then it seems you need to subtract 1:

print (df.loc[dfb:dfbb-1,'B'])
4    0.894525
5    0.978174
6    0.859449
Name: B, dtype: float64

Another solution uses boolean indexing or query:

print (df[(df['A'] >= 5) & (df['A'] < 8)])
   A         B
4  5  0.894525
5  6  0.978174
6  7  0.859449

print (df.loc[(df['A'] >= 5) & (df['A'] < 8), 'B'])
4    0.894525
5    0.978174
6    0.859449
Name: B, dtype: float64

print (df.query('A >= 5 and A < 8'))
   A         B
4  5  0.894525
5  6  0.978174
6  7  0.859449
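The next/iter idea can be wrapped in a small helper. This is only a sketch: the name `first_index` and the choice of `None` as the default are hypothetical, not part of the answer.

```python
import pandas as pd

df = pd.DataFrame({"A": range(1, 11)})

def first_index(frame, mask, default=None):
    """Return the first index label where `mask` is True, or `default` if none is."""
    return next(iter(frame[mask].index), default)

found = first_index(df, df["A"] == 5)
missing = first_index(df, df["A"] == 50)
print(found, missing)  # 4 None
```

Returning `None` instead of a string such as 'no match' makes it easier to test for a miss before using the value in a slice.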

User · Answer

A little summary for searching by row. This can be useful if you don't know the column values, or if the columns have non-numeric values. If you want to get the index number as an integer you can also do:

item = df[4:5].index.item()
print(item)
4

It also works with numpy / list:

numpy = df[4:7].index.to_numpy()[0]
lista = df[4:7].index.to_list()[0]

In [x] you pick a number from the range [4:7]. For example, if you want 6:

numpy = df[4:7].index.to_numpy()[2]
print(numpy)
6

For a DataFrame:

df[4:7]

   A         B
4  5  0.894525
5  6  0.978174
6  7  0.859449

or:

df[(df.index>=4) & (df.index<7)]

   A         B
4  5  0.894525
5  6  0.978174
6  7  0.859449

User · Answer

To answer the original question on how to get the index as an integer for the desired selection, the following will work:

df[df['A']==5].index.item()
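One caveat worth illustrating: `item()` only works when the selection has exactly one row. The sketch below uses a small made-up frame (not the one from the question) in which the value 5 appears twice.

```python
import pandas as pd

# Made-up frame: the value 5 occurs twice, the value 2 once.
df = pd.DataFrame({"A": [1, 2, 5, 5]})

single = df[df["A"] == 2].index.item()  # exactly one match -> a scalar
print(single)  # 1

raised = False
try:
    df[df["A"] == 5].index.item()       # two matches -> no single scalar
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```

If several matches are possible, taking `index[0]` or the next/iter approach from the earlier answer may be the safer choice.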
