Find element's index in pandas Series

Question

I know this is a very basic question, but for some reason I can't find an answer. How can I get the index of a certain element of a Series in python pandas? (The first occurrence would suffice.)

I.e., I'd like something like:

    import pandas as pd
    myseries = pd.Series([1, 4, 0, 7, 5], index=[0, 1, 2, 3, 4])
    print(myseries.find(7))  # should output 3

Certainly, it is possible to define such a method with a loop:

    def find(s, el):
        for i in s.index:
            if s[i] == el:
                return i
        return None

    print(find(myseries, 7))

but I assume there should be a better way. Is there?

User · Answer

I'm impressed with all the answers here. This is not a new answer, just an attempt to summarize the timings of all these methods. I considered the case of a series with 25 elements and assumed the general case where the index could contain any values and you want the index value corresponding to a search value that is towards the end of the series.

Here are the speed tests on a 2013 MacBook Pro in Python 3.7 with Pandas version 0.25.3.

    In [1]: import pandas as pd
    In [2]: import numpy as np
    In [3]: data = [406400, 203200, 101600, 76100, 50800, 25400, 19050, 12700,
       ...:         9500, 6700, 4750, 3350, 2360, 1700, 1180, 850,
       ...:         600, 425, 300, 212, 150, 106, 75, 53,
       ...:         38]
    In [4]: myseries = pd.Series(data, index=range(1, 26))
    In [5]: myseries[21]
    Out[5]: 150

    In [7]: %timeit myseries[myseries == 150].index[0]
    416 µs ± 5.05 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

    In [8]: %timeit myseries[myseries == 150].first_valid_index()
    585 µs ± 32.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

    In [9]: %timeit myseries.where(myseries == 150).first_valid_index()
    652 µs ± 23.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

    In [10]: %timeit myseries.index[np.where(myseries == 150)[0][0]]
    195 µs ± 1.18 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

    In [11]: %timeit pd.Series(myseries.index, index=myseries)[150]
    178 µs ± 9.35 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

    In [12]: %timeit myseries.index[pd.Index(myseries).get_loc(150)]
    77.4 µs ± 1.41 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

    In [13]: %timeit myseries.index[list(myseries).index(150)]
    12.7 µs ± 42.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

    In [14]: %timeit myseries.index[myseries.tolist().index(150)]
    9.46 µs ± 19.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Jeff's answer seems to be the fastest - although it doesn't handle duplicates.

Correction: Sorry, I missed one. Alex Spangher's solution using the list index method is by far the fastest.

Update: Added EliadL's answer.

Hope this helps.

Amazing that such a simple operation requires such convoluted solutions, and that many are so slow. Over half a millisecond in some cases to find a value in a series of 25.
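The timings above could also be reproduced outside IPython with the standard `timeit` module. The following is a minimal sketch covering only four of the approaches; `candidates` and `benchmark` are names chosen here for illustration, and absolute numbers will differ by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

data = [406400, 203200, 101600, 76100, 50800, 25400, 19050, 12700,
        9500, 6700, 4750, 3350, 2360, 1700, 1180, 850,
        600, 425, 300, 212, 150, 106, 75, 53, 38]
myseries = pd.Series(data, index=range(1, 26))

# Each candidate maps the value 150 back to its index label (21 here).
candidates = {
    "boolean mask": lambda: myseries[myseries == 150].index[0],
    "np.where": lambda: myseries.index[np.where(myseries == 150)[0][0]],
    "Index.get_loc": lambda: myseries.index[pd.Index(myseries).get_loc(150)],
    "tolist().index": lambda: myseries.index[myseries.tolist().index(150)],
}


def benchmark(number=1000):
    """Return {name: mean seconds per call} for every candidate."""
    return {
        name: timeit.timeit(func, number=number) / number
        for name, func in candidates.items()
    }


if __name__ == "__main__":
    for name, seconds in sorted(benchmark().items(), key=lambda kv: kv[1]):
        print(f"{name:>16}: {seconds * 1e6:8.2f} µs per call")
```

Sorting by mean time makes the ranking easy to compare with the figures quoted in this answer.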

User · Answer

Converting to an Index, you can use get_loc:

    In [1]: myseries = pd.Series([1, 4, 0, 7, 5], index=[0, 1, 2, 3, 4])

    In [3]: Index(myseries).get_loc(7)
    Out[3]: 3

    In [4]: Index(myseries).get_loc(10)
    KeyError: 10

Duplicate handling:

    In [5]: Index([1, 1, 2, 2, 3, 4]).get_loc(2)
    Out[5]: slice(2, 4, None)

It will return a boolean array if the matches are non-contiguous:

    In [6]: Index([1, 1, 2, 1, 3, 2, 4]).get_loc(2)
    Out[6]: array([False, False,  True, False, False,  True, False], dtype=bool)

It uses a hashtable internally, so it is fast:

    In [7]: s = Series(randint(0, 10, 10000))

    In [9]: %timeit s[s == 5]
    1000 loops, best of 3: 203 µs per loop

    In [12]: i = Index(s)

    In [13]: %timeit i.get_loc(5)
    1000 loops, best of 3: 226 µs per loop

As Viktor points out, there is a one-time creation overhead to creating an index (it's incurred when you actually DO something with the index, e.g. the is_unique):

    In [2]: s = Series(randint(0, 10, 10000))

    In [3]: %timeit Index(s)
    100000 loops, best of 3: 9.6 µs per loop

    In [4]: %timeit Index(s).is_unique
    10000 loops, best of 3: 140 µs per loop
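Because get_loc can return an integer, a slice, or a boolean array depending on the data, calling code usually has to normalize the result. The sketch below shows one way to do that; `positions_of` is a hypothetical helper written for this example, not a pandas function.

```python
import numpy as np
import pandas as pd


def positions_of(values, target):
    """Return a list of integer positions where `target` occurs in `values`.

    Hypothetical helper: it only normalizes what Index.get_loc returns.
    """
    try:
        loc = pd.Index(values).get_loc(target)
    except KeyError:
        return []                        # value not present
    if isinstance(loc, slice):           # contiguous duplicates
        return list(range(len(values))[loc])
    if isinstance(loc, np.ndarray):      # scattered duplicates: boolean mask
        return np.flatnonzero(loc).tolist()
    return [int(loc)]                    # unique match: a single integer
```

The result is a list of positions; pass them to `series.index[...]` if the labels are needed instead.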

User · Answer

    >>> myseries[myseries == 7]
    3    7
    dtype: int64
    >>> myseries[myseries == 7].index[0]
    3

Though I admit that there should be a better way to do that, this at least avoids iterating and looping through the object and moves it to the C level.

User · Answer

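You can use Series.idxmax(), but note that it returns the index label of the maximum value, so it gives the right answer here only because 7 happens to be the largest element:

    >>> import pandas as pd
    >>> myseries = pd.Series([1, 4, 0, 7, 5], index=[0, 1, 2, 3, 4])
    >>> myseries.idxmax()
    3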
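To look up an arbitrary value with idxmax, one option is to call it on a boolean mask rather than on the values themselves. This is a sketch under that assumption; `first_label` is a hypothetical helper name.

```python
import pandas as pd

myseries = pd.Series([1, 4, 0, 7, 5], index=[0, 1, 2, 3, 4])

# On the raw values, idxmax() finds the largest element, not a chosen one.
label_of_max = myseries.idxmax()


def first_label(series, value):
    """Return the label of the first element equal to `value`, or None.

    On a boolean mask, True is the maximum, so idxmax() yields the label of
    the first match. With no match it would still return the first label,
    hence the any() guard.
    """
    mask = series == value
    return mask.idxmax() if mask.any() else None
```

For example, `first_label(myseries, 4)` gives the label of the value 4, which plain `myseries.idxmax()` cannot do.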

User · Answer

    In [92]: (myseries == 7).argmax()
    Out[92]: 3

This works if you know 7 is there in advance. You can check this with (myseries == 7).any().

Another approach (very similar to the first answer) that also accounts for multiple 7's (or none) is:

    In [122]: myseries = pd.Series([1, 7, 0, 7, 5], index=['a', 'b', 'c', 'd', 'e'])
    In [123]: list(myseries[myseries == 7].index)
    Out[123]: ['b', 'd']
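The two ideas in this answer can be combined: guard with any(), then take the first match, and separately collect every match. The sketch below goes through NumPy for the argmax step so that the result is always an integer position; as far as I know, whether Series.argmax itself returns a label or a position has changed between pandas versions, so that detail is treated as an assumption here.

```python
import pandas as pd

myseries = pd.Series([1, 7, 0, 7, 5], index=["a", "b", "c", "d", "e"])
mask = (myseries == 7).to_numpy()        # plain NumPy boolean array

if mask.any():
    pos = int(mask.argmax())             # integer position of the first True
    first = myseries.index[pos]          # translate position to label
else:
    first = None                         # no match at all

all_labels = myseries.index[mask].tolist()   # every match, possibly empty
```

With a non-default index such as the letters above, the position and the label differ, which is why the translation step matters.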

User · Answer

If you use numpy, you can get an array of the indices at which your value is found:

    import numpy as np
    import pandas as pd
    myseries = pd.Series([1, 4, 0, 7, 5], index=[0, 1, 2, 3, 4])
    np.where(myseries == 7)

This returns a one-element tuple containing an array of the indices where 7 is the value in myseries:

    (array([3], dtype=int64),)
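Note that np.where reports 0-based positions, which coincide with the index labels only when the Series has the default index. A small sketch of unpacking the tuple and translating positions to labels, using a letter index chosen for illustration:

```python
import numpy as np
import pandas as pd

myseries = pd.Series([1, 7, 0, 7, 5], index=["a", "b", "c", "d", "e"])

# np.where returns a tuple with one array per dimension; a Series is 1-D,
# so the tuple has exactly one element.
(positions,) = np.where(myseries.to_numpy() == 7)

# Translate 0-based positions into index labels.
labels = myseries.index[positions]
```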

User · Answer

This is the most native and scalable approach I could find:

    >>> myindex = pd.Series(myseries.index, index=myseries)

    >>> myindex[7]
    3

    >>> myindex[[7, 5, 7]]
    7    3
    5    4
    7    3
    dtype: int64
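A slightly more defensive variant of this reverse-lookup idea is sketched below. It uses `.loc` to make the label-based lookup explicit and `.get` to handle missing values; the index 10..14 is chosen here only to make labels and positions visibly different.

```python
import pandas as pd

myseries = pd.Series([1, 4, 0, 7, 5], index=[10, 11, 12, 13, 14])

# Swap the roles: the values become the index, the labels become the values.
reverse = pd.Series(myseries.index, index=myseries.to_numpy())

label = reverse.loc[7]                       # single lookup
labels = reverse.loc[[7, 5, 7]].tolist()     # several lookups at once
missing = reverse.get(99)                    # None instead of a KeyError
```

If the original Series contains duplicate values, the reversed Series has a non-unique index and a single lookup returns several rows rather than one label.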

User · Answer

Often your value occurs at multiple indices:

    >>> myseries = pd.Series([0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1])
    >>> myseries.index[myseries == 1]
    Int64Index([3, 4, 5, 6, 10, 11], dtype='int64')

User · Answer

Another way to do this, although equally unsatisfying, is:

    s = pd.Series([1, 3, 0, 7, 5], index=[0, 1, 2, 3, 4])

    list(s).index(7)

which returns 3.

Time tests using a current dataset I'm working with (consider it random):

    In [64]: %timeit pd.Index(article_reference_df.asset_id).get_loc('100000003003614')
    10000 loops, best of 3: 60.1 µs per loop

    In [66]: %timeit article_reference_df.asset_id[article_reference_df.asset_id == '100000003003614'].index[0]
    1000 loops, best of 3: 255 µs per loop

    In [65]: %timeit list(article_reference_df.asset_id).index('100000003003614')
    100000 loops, best of 3: 14.5 µs per loop

User · Answer

Another way to do it that hasn't been mentioned yet is the tolist method:

    myseries.tolist().index(7)

This should return the correct index, assuming the value exists in the Series.
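Two details are worth handling when using this approach: list.index raises ValueError when the value is absent, and it returns a 0-based position rather than an index label. A minimal sketch of a `find` function like the one the question asks for (the name and signature follow the question, not any pandas API):

```python
import pandas as pd


def find(series, value):
    """Return the index label of the first occurrence of `value`, or None."""
    try:
        position = series.tolist().index(value)   # 0-based position
    except ValueError:                            # value not in the Series
        return None
    return series.index[position]                 # translate to a label


myseries = pd.Series([1, 4, 0, 7, 5], index=["a", "b", "c", "d", "e"])
found = find(myseries, 7)
not_found = find(myseries, 99)
```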
