I'm not 100% sure why (I actually came here to search for the answer), but this also works, and it doesn't require replacing all the NaN values first.
import pandas as pd
import numpy as np
df = pd.DataFrame([["foo1"], ["foo2"], ["bar"], [np.nan]], columns=['a'])
# keep only the rows where the match is exactly True; the NaN row is dropped
newdf = df.loc[df['a'].str.contains('foo') == True]
Works with or without .loc.
I have no idea why this works. As I understand it, when you index with brackets, pandas evaluates whatever is inside the brackets as either True or False. I can't tell why making the expression inside the brackets 'extra boolean' has any effect at all.
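For what it's worth, printing the intermediate results suggests an explanation: on an object column, str.contains seems to return True/False/NaN rather than a clean boolean mask, and comparing with == True turns the NaN into False. Below is a minimal sketch of that, not a definitive account of the internals. The names raw, mask and alt are just mine for illustration, and I've forced dtype=object on the column on the assumption that this matches the setup in the question (newer pandas versions with a dedicated string dtype may behave differently).

```python
import numpy as np
import pandas as pd

# Force an object column so the NaN handling matches the example above
# (assumption: other string dtypes may treat the missing value differently).
df = pd.DataFrame({"a": pd.Series(["foo1", "foo2", "bar", np.nan], dtype=object)})

# On an object column the missing row comes back as NaN, not True or False.
raw = df["a"].str.contains("foo")

# NaN == True evaluates to False, so the comparison yields a plain bool mask.
mask = raw == True

newdf = df[mask]

# str.contains also accepts na=False, which fills the missing row directly.
alt = df[df["a"].str.contains("foo", na=False)]

print(raw.tolist())
print(mask.tolist())
print(newdf)
```

If that reading is right, the na=False form is probably the more explicit way to say the same thing.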