How to filter rows in pandas by regex

Question

I would like to cleanly filter a dataframe using regex on one of the columns.

For a contrived example:

    In [210]: foo = pd.DataFrame({'a': [1, 2, 3, 4], 'b': ['hi', 'foo', 'fat', 'cat']})

    In [211]: foo
    Out[211]:
       a    b
    0  1   hi
    1  2  foo
    2  3  fat
    3  4  cat

I want to filter the rows to those that start with f using a regex. First go:

    In [213]: foo.b.str.match('f.*')
    Out[213]:
    0    []
    1    ()
    2    ()
    3    []

That's not too terribly useful. However, this will get me my boolean index:

    In [226]: foo.b.str.match('(f.*)').str.len() > 0
    Out[226]:
    0    False
    1     True
    2     True
    3    False
    Name: b

So I could then do my restriction by:

    In [229]: foo[foo.b.str.match('(f.*)').str.len() > 0]
    Out[229]:
       a    b
    1  2  foo
    2  3  fat

That makes me artificially put a group into the regex though, and seems like maybe not the clean way to go. Is there a better way to do this?

User · Answer

There is already a string handling function for this, Series.str.startswith(). Try foo[foo.b.str.startswith('f')].

Result:

    a   b
1   2   foo
2   3   fat

I think this is what you expect.

Alternatively, you can use contains with the regex option. For example:

foo[foo.b.str.contains('oo', regex=True, na=False)]

Result:

    a   b
1   2   foo

na=False prevents errors when the column contains NaN, None or other missing values.
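To illustrate the effect of na=False, here is a minimal sketch. The frame foo_na is a hypothetical variant of the question's data with one missing value added; it is not part of the original question.

```python
import numpy as np
import pandas as pd

# Hypothetical variant of the question's frame with one missing value in 'b'.
foo_na = pd.DataFrame({'a': [1, 2, 3, 4], 'b': ['hi', 'foo', np.nan, 'cat']})

# na=False makes the missing row count as "no match", so the mask is
# purely boolean and safe to use for row selection.
mask = foo_na['b'].str.contains('oo', regex=True, na=False)

result = foo_na[mask]
print(result)
```

Passing na=True instead would keep the rows with missing values in the result.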

User · Answer

Use contains instead:

    In [10]: df.b.str.contains('^f')
    Out[10]:
    0    False
    1     True
    2     True
    3    False
    Name: b, dtype: bool

User · Answer

Using str slice:

    foo[foo.b.str[0] == 'f']
    Out[18]:
       a    b
    1  2  foo
    2  3  fat

User · Answer

Multiple column search with a dataframe (MetaData is a variable holding the string to look for):

    frame[frame.filename.str.match('.*' + MetaData + '.*') & frame.file_path.str.match(r'C:\\test\\test\.txt')]
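The multi-column filter above can be sketched as a runnable example. The data below is made up for illustration, MetaData is assumed to be a plain string, and re.escape together with str.contains and str.fullmatch are swapped in here as one way to avoid hand-escaping the Windows path; treat it as a sketch rather than the answer's exact method.

```python
import re
import pandas as pd

# Hypothetical data; the column names follow the answer, the values are made up.
frame = pd.DataFrame({
    'filename': ['a_MetaData_1.csv', 'b_other.csv', 'c_MetaData_2.csv'],
    'file_path': [r'C:\test\test.txt', r'C:\test\test.txt', r'C:\other\x.txt'],
})

MetaData = 'MetaData'  # assumed: a plain string to search for in the filename

# re.escape turns literal text into a safe pattern (backslashes, dots, etc.).
name_mask = frame['filename'].str.contains(re.escape(MetaData), na=False)
path_mask = frame['file_path'].str.fullmatch(re.escape(r'C:\test\test.txt'), na=False)

# & is the element-wise AND of the two boolean masks.
result = frame[name_mask & path_mask]
print(result)
```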

User · Answer

Thanks for the great answer @user3136169, here is an example of how that might be done while also removing NoneType values:

    import re

    regex = r'^f'  # example pattern, replace with your own

    def regex_filter(val):
        # None and empty strings count as non-matches.
        if val:
            return re.search(regex, val) is not None
        return False

    df_filtered = df[df['col'].apply(regex_filter)]

You can also pass the regex as an argument:

    def regex_filter(val, myregex):
        ...

    df_filtered = df[df['col'].apply(regex_filter, myregex=myregex)]

User · Answer

Write a Boolean function that checks the regex and use apply on the column:

    foo[foo['b'].apply(regex_function)]
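A minimal sketch of what such a function might look like, using the frame from the question. The name regex_function comes from the answer; its body and the compiled pattern starts_with_f are assumptions made for illustration.

```python
import re
import pandas as pd

foo = pd.DataFrame({'a': [1, 2, 3, 4], 'b': ['hi', 'foo', 'fat', 'cat']})

# Compile once so the pattern is not re-parsed for every row.
starts_with_f = re.compile(r'^f')

def regex_function(val):
    """Return True when val is a string that matches the pattern."""
    return isinstance(val, str) and starts_with_f.search(val) is not None

filtered = foo[foo['b'].apply(regex_function)]
print(filtered)
```

The isinstance check makes the function return False for None or NaN instead of raising a TypeError inside re.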

User · Answer

It may be a bit late, but this is now easier to do in pandas by calling Series.str.match. The docs explain the difference between match, fullmatch and contains. Note that in order to use the results for indexing, set the na=False argument (or True if you want to include NaNs in the results).
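As a rough illustration of how the three methods differ, the sketch below runs them on a small made-up Series (the values are not from the question). It assumes a pandas version recent enough to provide Series.str.fullmatch.

```python
import pandas as pd

# Hypothetical sample values, including one missing entry.
s = pd.Series(['foo', 'fat', 'elf', None])

# contains: the pattern may match anywhere in the string (like re.search).
anywhere = s.str.contains('f', na=False)

# match: the pattern must match at the start of the string (like re.match).
at_start = s.str.match('f', na=False)

# fullmatch: the pattern must cover the entire string (like re.fullmatch).
whole = s.str.fullmatch('f', na=False)
whole_wild = s.str.fullmatch('f.*', na=False)

print(s[at_start])
```

With na=False, the missing entry is treated as a non-match in every mask, so each one can be used directly for indexing.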
