Pandas: How to filter a Series

Question

I have a Series like this after doing groupby('name') and using the mean() function on another column:

    name
    383    3.000000
    663    1.000000
    726    1.000000
    737    9.000000
    833    8.166667

Could anyone please show me how to filter out the rows with 1.000000 mean values? Thank you, and I greatly appreciate your help.

User · Answer

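If you like a chained operation, you can also use the compress function:

    test = pd.Series({383: 3.000000, 663: 1.000000, 726: 1.000000, 737: 9.000000, 833: 8.166667})

    test.compress(lambda x: x != 1)

    # 383    3.000000
    # 737    9.000000
    # 833    8.166667
    # dtype: float64

(Series.compress was deprecated in pandas 0.24 and removed in 1.0, so this works only on older versions.)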

User · Answer

From pandas version 0.18+, filtering a Series can also be done as below:

    test = {383: 3.000000, 663: 1.000000, 726: 1.000000, 737: 9.000000, 833: 8.166667}

    pd.Series(test).where(lambda x: x != 1).dropna()

Check out: http://pandas.pydata.org/pandas-docs/version/0.18.1/whatsnew.html#method-chaininng-improvements
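To make the two steps of that chain visible, here is a minimal sketch that splits it apart. The variable names `masked` and `filtered` are illustrative and not part of the original answer.

```python
import pandas as pd

test = {383: 3.000000, 663: 1.000000, 726: 1.000000, 737: 9.000000, 833: 8.166667}
s = pd.Series(test)

# where() keeps the full index and replaces values that fail the condition with NaN
masked = s.where(lambda x: x != 1)

# dropna() then removes those NaN rows, leaving only the matching values
filtered = masked.dropna()

print(masked)
print(filtered)
```

One thing to keep in mind with this approach: dropna() would also drop any values that were already NaN before the where() call.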

User · Answer

A fast way of doing this is to reconstruct the Series, using numpy to slice the underlying arrays. See timings below.

    mask = s.values != 1
    pd.Series(s.values[mask], s.index[mask])

    0
    383    3.000000
    737    9.000000
    833    8.166667
    dtype: float64

Naive timing: [timing figure not preserved]
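Since the original timing figure is not available, the following is a rough sketch of how the two approaches could be compared with `timeit`. The test data (100,000 random values), the helper names `filter_pandas` and `filter_numpy`, and the repeat count are all assumptions made for illustration; actual timings depend on the machine, the pandas version, and the data.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical test data: 100,000 floats drawn from {0, 1, 2, 3, 4}
rng = np.random.default_rng(0)
s = pd.Series(rng.integers(0, 5, size=100_000).astype(float))


def filter_pandas(series):
    # Plain boolean indexing on the Series
    return series[series != 1]


def filter_numpy(series):
    # Build the mask on the underlying array, then reconstruct the Series
    mask = series.values != 1
    return pd.Series(series.values[mask], index=series.index[mask])


t_pandas = timeit.timeit(lambda: filter_pandas(s), number=50)
t_numpy = timeit.timeit(lambda: filter_numpy(s), number=50)

print(f"boolean indexing: {t_pandas:.4f} s for 50 runs")
print(f"numpy rebuild:    {t_numpy:.4f} s for 50 runs")
```

Both functions should return the same values and index, so the choice between them is a matter of speed versus readability.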

User · Answer

Another way is to first convert to a DataFrame and use the query method (assuming you have numexpr installed):

    import pandas as pd

    test = {383: 3.000000, 663: 1.000000, 726: 1.000000, 737: 9.000000, 833: 8.166667}

    s = pd.Series(test)
    s.to_frame(name='x').query("x != 1")

User · Answer

In my case I had a pandas Series where the values are tuples of characters:

    Out[67]
    0    (H, H, H, H)
    1    (H, H, H, T)
    2    (H, H, T, H)
    3    (H, H, T, T)
    4    (H, T, H, H)

Therefore I could use indexing to filter the series, but to create the index I needed apply. My condition is "find all tuples which have exactly one 'H'".

    series_of_tuples[series_of_tuples.apply(lambda x: x.count('H') == 1)]

I admit it is not "chainable" (i.e. notice I repeat series_of_tuples twice; you must store any temporary series into a variable so you can call apply(...) on it).

There may also be other methods (besides .apply(...)) which can operate elementwise to produce a Boolean index.

Many other answers (including the accepted answer) use the chainable functions like:

    .compress()
    .where()
    .loc[]
    []

These accept callables (lambdas) which are applied to the Series, not to the individual values in those series!

Therefore my Series of tuples behaved strangely when I tried to use my above condition / callable / lambda with any of the chainable functions, like .loc[]:

    series_of_tuples.loc[lambda x: x.count('H') == 1]

Produces the error:

    KeyError: 'Level H must be same as name (None)'

I was very confused, but it seems to be using the Series.count function, series_of_tuples.count(...), which is not what I wanted.

I admit that an alternative data structure may be better:

    A Category datatype
    A DataFrame (each element of the tuple becomes a column)
    A Series of strings (just concatenate the tuples together)

This creates a series of strings (i.e. by concatenating the tuple; joining the characters in the tuple into a single string):

    series_of_tuples.apply(''.join)

So I can then use the chainable Series.str.count:

    series_of_tuples.apply(''.join).str.count('H') == 1
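The following is a runnable sketch of the approaches described in this answer. The input data is an assumption: it builds all 16 four-character tuples of 'H' and 'T' with itertools.product, which happens to reproduce the five rows shown above. It also shows one possible way to stay chainable, by calling apply inside the callable passed to .loc, which is not something the answer itself proposes.

```python
import itertools

import pandas as pd

# Assumed data: every length-4 combination of 'H' and 'T' as a tuple
series_of_tuples = pd.Series(list(itertools.product("HT", repeat=4)))

# Elementwise condition via apply, then ordinary boolean indexing
mask = series_of_tuples.apply(lambda t: t.count("H") == 1)
by_index = series_of_tuples[mask]

# Chainable variant: the callable receives the whole Series,
# so the elementwise apply has to happen inside it
chained = series_of_tuples.loc[lambda s: s.apply(lambda t: t.count("H") == 1)]

# String variant: join each tuple into a string, then use Series.str.count
via_str = series_of_tuples.loc[lambda s: s.apply("".join).str.count("H") == 1]

print(by_index)
```

All three variants select the same rows; the difference is only whether the Series has to be named twice.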

User · Answer

As DACW pointed out, there are method-chaining improvements in pandas 0.18.1 that do what you are looking for very nicely.

Rather than using .where, you can pass your function to either the .loc indexer or the Series indexer [] and avoid the call to .dropna:

    test = pd.Series({383: 3.000000, 663: 1.000000, 726: 1.000000, 737: 9.000000, 833: 8.166667})

    test.loc[lambda x: x != 1]

    test[lambda x: x != 1]

Similar behavior is supported on the DataFrame and NDFrame classes.

User · Answer

    In [5]:
    import pandas as pd

    test = {383: 3.000000, 663: 1.000000, 726: 1.000000, 737: 9.000000, 833: 8.166667}

    s = pd.Series(test)
    s = s[s != 1]
    s

    Out[0]:
    383    3.000000
    737    9.000000
    833    8.166667
    dtype: float64
