I want to create a Pandas DataFrame filled with NaNs. During my research I found an answer:
import pandas as pd
df = pd.DataFrame(index=range(0,4),columns=['A'])
This code results in a DataFrame filled with NaNs of dtype "object", so they cannot be used later on with, for example, the interpolate() method. Therefore, I created the DataFrame with this more complicated code (inspired by this answer):
import pandas as pd
import numpy as np
dummyarray = np.empty((4,1))
dummyarray[:] = np.nan
df = pd.DataFrame(dummyarray)
This results in a DataFrame filled with NaNs of dtype "float", so it can be used later on with interpolate(). Is there a more elegant way to create the same result?
Hope this can help!
pd.DataFrame(np.nan, index = np.arange(<num_rows>), columns = ['A'])
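As a minimal sketch of the one-liner above, the snippet below fills in the placeholder with a hypothetical num_rows of 4 (an assumption, not part of the answer) and checks that the resulting column is float64 and works with interpolate():

```python
import numpy as np
import pandas as pd

num_rows = 4  # hypothetical row count, chosen only for illustration

# A scalar NaN is broadcast to every cell; NaN is a float, so the column is float64.
df = pd.DataFrame(np.nan, index=np.arange(num_rows), columns=["A"])
dtype_name = str(df["A"].dtype)

# Set the two end points and let interpolate() fill the gap linearly.
df.loc[0, "A"] = 1.0
df.loc[3, "A"] = 4.0
filled = df.interpolate()
print(dtype_name)
print(filled)
```

Because the frame is numeric from the start, no later conversion with astype() is needed before interpolating.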
You can try this line of code:
pdDataFrame = pd.DataFrame([np.nan] * 7)
This creates a pandas DataFrame with 7 rows and a single column of float NaNs. Printing pdDataFrame gives:
     0
0  NaN
1  NaN
2  NaN
3  NaN
4  NaN
5  NaN
6  NaN
Also, the output of pdDataFrame.dtypes is:
0 float64
dtype: object
For multiple columns you can do:
df = pd.DataFrame(np.zeros([nrow, ncol])*np.nan)
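The line above works because 0 * NaN is NaN. A possible alternative, sketched here with hypothetical values for nrow and ncol, is np.full, which builds the NaN array directly without the multiplication:

```python
import numpy as np
import pandas as pd

nrow, ncol = 3, 2  # hypothetical dimensions, chosen only for illustration

# np.full allocates an array of the given shape already filled with NaN (float64).
df = pd.DataFrame(np.full((nrow, ncol), np.nan))
print(df)
print(df.dtypes)
```

Both variants should give the same float64 frame; np.full mainly states the intent more directly.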
You could specify the dtype directly when constructing the DataFrame:
>>> df = pd.DataFrame(index=range(0,4),columns=['A'], dtype='float')
>>> df.dtypes
A float64
dtype: object
Specifying the dtype forces Pandas to try creating the DataFrame with that type, rather than trying to infer it.
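To see the difference for yourself, the following sketch builds the frame with and without the dtype argument and prints the dtypes of each. The variable names untyped and typed are only illustrative; the question reports "object" for the first case, which may depend on your pandas version.

```python
import pandas as pd

# Without dtype: pandas has no data to infer a type from.
untyped = pd.DataFrame(index=range(0, 4), columns=["A"])

# With dtype: the column is created as float64 and filled with NaN.
typed = pd.DataFrame(index=range(0, 4), columns=["A"], dtype="float")

print(untyped.dtypes)
print(typed.dtypes)
```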
Source: Stackoverflow.com