I am trying to write a pandas DataFrame (or I can use a NumPy array) to a MySQL database using MySQLdb. MySQLdb doesn't seem to understand 'nan', and my database throws an error saying nan is not in the field list. I need to find a way to convert the 'nan' into a NoneType.
Any ideas?
This question is related to: python, pandas, numpy, mysql-python
Another addition: be careful when replacing multiple values and converting the type of the column back from object to float. If you want to be certain that your None's won't flip back to np.NaN's, apply @andy-hayden's suggestion of using DataFrame.where.
Illustration of how replace can still go 'wrong':
In [1]: import pandas as pd
In [2]: import numpy as np
In [3]: df = pd.DataFrame({"a": [1, np.NAN, np.inf]})
In [4]: df
Out[4]:
a
0 1.0
1 NaN
2 inf
In [5]: df.replace({np.NAN: None})
Out[5]:
a
0 1
1 None
2 inf
In [6]: df.replace({np.NAN: None, np.inf: None})
Out[6]:
a
0 1.0
1 NaN
2 NaN
In [7]: df.where((pd.notnull(df)), None).replace({np.inf: None})
Out[7]:
a
0 1.0
1 NaN
2 NaN
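One way to sidestep the flip-back shown above is to turn the infinities into NaN while the column is still float, and only then cast to object and swap in None. This is a minimal sketch, assuming a pandas version where DataFrame.where on an object-dtype frame keeps None; the frame is the one from the illustration and the names no_inf and clean are just for the example:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, np.nan, np.inf]})

# Step 1: while the column is still float64, fold +/-inf into NaN.
no_inf = df.replace([np.inf, -np.inf], np.nan)

# Step 2: cast to object so the column can hold None, then put None
# wherever the value is missing.
clean = no_inf.astype(object).where(no_inf.notna(), None)

values = clean["a"].tolist()
print(values)
```

Doing the None substitution last, on an object column, avoids asking replace to do two things at once on a float column.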
After stumbling around, this worked for me:
df = df.astype(object).where(pd.notnull(df),None)
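To show where the None values end up, here is a rough sketch that turns the cleaned frame into plain tuples and passes them to a parameterized INSERT. The standard-library sqlite3 module stands in for MySQL purely so the snippet runs anywhere; the table name prices and its columns are made up for the example. With MySQLdb the placeholder would be %s instead of ?.

```python
import sqlite3

import numpy as np
import pandas as pd

df = pd.DataFrame({"id": [1, 2, 3], "price": [1.5, np.nan, 3.0]})

# Cast to object first so the missing value can be stored as None.
clean = df.astype(object).where(df.notna(), None)

# Plain tuples are what DB-API drivers expect for executemany().
rows = list(clean.itertuples(index=False, name=None))

# In-memory SQLite database as a stand-in for the real MySQL connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (id INTEGER, price REAL)")
conn.executemany("INSERT INTO prices (id, price) VALUES (?, ?)", rows)

# None is written as SQL NULL, so this counts the former NaN row.
null_count = conn.execute(
    "SELECT COUNT(*) FROM prices WHERE price IS NULL"
).fetchone()[0]
conn.close()
```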
You can replace nan with None in your numpy array:
>>> import numpy as np
>>> x = np.array([1, np.nan, 3])
>>> y = np.where(np.isnan(x), None, x)
>>> print(y)
[1.0 None 3.0]
>>> print(type(y[1]))
<class 'NoneType'>
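Note that np.isnan only accepts numeric input and raises a TypeError on an object array, for example one that also contains strings. Here is a small sketch of the same idea using pd.isnull to build the mask, which works on mixed values; the sample array is invented for illustration:

```python
import numpy as np
import pandas as pd

mixed = np.array(["a", np.nan, 3.0], dtype=object)

# np.isnan(mixed) would raise TypeError here; pd.isnull works element-wise
# on object arrays and flags both NaN and None.
cleaned = np.where(pd.isnull(mixed), None, mixed)

print(cleaned.tolist())
```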
Just an addition to @Andy Hayden's answer:
Since DataFrame.mask is the opposite twin of DataFrame.where, they have exactly the same signature but with opposite meaning:
DataFrame.where is useful for replacing values where the condition is False. DataFrame.mask is used for replacing values where the condition is True.
So in this question, using df.mask(df.isna(), other=None, inplace=True) might be more intuitive.
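As a quick check of the mirror-image relationship, the sketch below builds the same result both ways. It casts to object first, on the assumption (true for recent pandas versions as far as I can tell) that a float column would otherwise turn None back into NaN; the frame and variable names are made up for the example:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": [1.0, np.nan, 3.0]})
obj = df.astype(object)

# where: keep values where the condition is True, replace the rest.
via_where = obj.where(df.notna(), other=None)

# mask: replace values where the condition is True, keep the rest.
via_mask = obj.mask(df.isna(), other=None)

print(via_where["x"].tolist())
print(via_mask["x"].tolist())
```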
Quite old, yet I stumbled upon the very same issue. Try doing this:
df['col_replaced'] = df['col_with_npnans'].apply(lambda x: None if np.isnan(x) else x)
df = df.replace({np.nan: None})
Credit goes to the author of this suggestion on a GitHub issue.
Source: Stackoverflow.com