I have a list that contains values, and one of the values is 'nan':
countries= [nan, 'USA', 'UK', 'France']
I tried to remove it, but I get an error every time:
cleanedList = [x for x in countries if (math.isnan(x) == True)]
TypeError: a float is required
I then tried these, and got a different error:
cleanedList = cities[np.logical_not(np.isnan(countries))]
cleanedList = cities[~np.isnan(countries)]
TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
Use NumPy boolean (mask) indexing:
In [29]: countries=np.asarray(countries)
In [30]: countries[countries!='nan']
Out[30]:
array(['USA', 'UK', 'France'],
dtype='|S6')
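For reference, here is a minimal sketch of the same approach as a plain script, assuming the list mixes a float NaN with strings as in the question. On Python 3 the resulting array has a Unicode dtype rather than '|S6'. The comparison works because np.asarray stores the float NaN as the text 'nan', which also means a genuine string 'nan' would be removed.

```python
import numpy as np

countries = [float("nan"), "USA", "UK", "France"]

# Mixing a float with strings makes NumPy build a string array,
# so the float NaN ends up stored as the text 'nan'.
arr = np.asarray(countries)

# Boolean mask: True wherever the element is not the text 'nan'.
mask = arr != "nan"
cleaned = arr[mask].tolist()
print(cleaned)
```

Calling .tolist() at the end converts the result back to an ordinary Python list, in case an array is not what you want.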
Using your example where...
countries= [nan, 'USA', 'UK', 'France']
Since nan is not equal to nan (nan != nan) and countries[0] = nan, you should observe the following:
countries[0] == countries[0]
False
However,
countries[1] == countries[1]
True
countries[2] == countries[2]
True
countries[3] == countries[3]
True
Therefore, the following should work:
cleanedList = [x for x in countries if x == x]
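The self-comparison trick can be wrapped in a small helper. This is only a sketch; the name drop_nan_by_identity is made up for illustration. It needs no imports and never raises on strings, but note that None is equal to itself, so None values would be kept.

```python
def drop_nan_by_identity(values):
    """Keep only the elements that compare equal to themselves.

    A float NaN is the usual value for which v == v is False,
    so it is the one that gets filtered out.
    """
    return [v for v in values if v == v]


countries = [float("nan"), "USA", "UK", "France"]
cleaned_list = drop_nan_by_identity(countries)
print(cleaned_list)  # ['USA', 'UK', 'France']
```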
import numpy as np
mylist = [3, 4, 5, np.nan]
l = [x for x in mylist if not np.isnan(x)]
This should remove every NaN. Of course, I assume the values here are actual NaN (np.nan), not the string 'nan'.
If you check the type of the NaN element:
type(countries[0])
the result will be <class 'float'>
so you can use the following code:
[i for i in countries if type(i) is not float]
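One caveat with filtering on type alone is that it drops every float, not just NaN. If the list might also hold real numbers, a stricter variant could check the type first and then call math.isnan. A rough sketch, where is_float_nan and the sample list are hypothetical and not from the question:

```python
import math


def is_float_nan(value):
    # Check the type first so math.isnan never receives a string.
    return isinstance(value, float) and math.isnan(value)


mixed = [float("nan"), "USA", 2.5, "UK"]

# Type-only filter: removes NaN, but also removes the valid float 2.5.
by_type = [v for v in mixed if type(v) is not float]

# Type-and-value filter: removes only the NaN.
by_value = [v for v in mixed if not is_float_nan(v)]

print(by_type)   # ['USA', 'UK']
print(by_value)  # ['USA', 2.5, 'UK']
```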
The problem comes from the fact that np.isnan() does not handle string values. For example, if you do:
np.isnan("A")
TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
However, the pandas version, pd.isnull(), works for both numeric and string values:
pd.isnull("A")
> False
pd.isnull(3)
> False
pd.isnull(np.nan)
> True
pd.isnull(None)
> True
I like to remove missing values from a list like this:
list_no_nan = [x for x in list_with_nan if pd.notnull(x)]
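Applied to a list like the one in the question, that could look like the sketch below. The sample data is assumed, and a None has been added to show that pd.notnull treats it as missing too.

```python
import pandas as pd

# Assumed sample data: a float NaN and a None mixed in with strings.
list_with_nan = [float("nan"), "USA", None, "UK", "France"]

# pd.notnull returns False for NaN and None, and True for ordinary strings.
list_no_nan = [x for x in list_with_nan if pd.notnull(x)]
print(list_no_nan)  # ['USA', 'UK', 'France']
```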
Another way to do it would include using filter like this:
countries = list(filter(lambda x: str(x) != 'nan', countries))
I noticed that pandas, for example, returns nan for blank values. Since that value is a float and not a string, you need to convert it to a string in order to match it against 'nan'. For example:
ulist = df.column1.unique() #create a list from a column with Pandas which
for loc in ulist:
    loc = str(loc)  # here nan is converted to a string to compare with in the if
    if loc != 'nan':
        print(loc)
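If the values come from a pandas column anyway, an alternative that avoids the string conversion is to drop the missing values before taking the unique ones. A small sketch, assuming a made-up DataFrame with the same column name column1:

```python
import pandas as pd

# Hypothetical data standing in for the real DataFrame.
df = pd.DataFrame({"column1": ["USA", None, "UK", "USA", float("nan")]})

# dropna() removes missing entries first, so unique() never sees them.
ulist = df["column1"].dropna().unique()
for loc in ulist:
    print(loc)
```

This keeps the original values untouched, whereas str(loc) turns every element into a string.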
In your example, 'nan' is a string, so instead of using isnan(), just check for the string like this:
cleanedList = [x for x in countries if x != 'nan']
The question has changed, so too has the answer:
Strings can't be tested with math.isnan, as it expects a float argument. In your countries list, you have both floats and strings.
In your case the following should suffice:
cleanedList = [x for x in countries if str(x) != 'nan']
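To see why the str(x) conversion matters, here is a quick comparison on an assumed list that contains both a float NaN and the literal string 'nan'. Comparing x directly only catches the string, while str(x) catches both, because str(float('nan')) is 'nan'.

```python
# Assumed sample: a float NaN and the literal text 'nan' side by side.
countries = [float("nan"), "nan", "USA", "UK"]

# Direct comparison: the float NaN is not equal to the string, so it stays.
direct_check = [x for x in countries if x != "nan"]

# Comparison after str(): both the float NaN and the string are removed.
str_check = [x for x in countries if str(x) != "nan"]

print(len(direct_check))  # 3
print(str_check)          # ['USA', 'UK']
```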
In your countries list, the literal 'nan' is a string, not the Python float nan, which is equivalent to:
float('NaN')
In your case the following should suffice:
cleanedList = [x for x in countries if x != 'nan']
Source: Stackoverflow.com