Efficiently checking if arbitrary object is NaN in Python / numpy / pandas?

Question

My numpy arrays use np.nan to designate missing values. As I iterate over the data set, I need to detect such missing values and handle them in special ways.

Naively I used numpy.isnan(val), which works well unless val isn't among the subset of types supported by numpy.isnan(). For example, missing data can occur in string fields, in which case I get:

    >>> np.isnan('some_string')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: Not implemented for this type

Other than writing an expensive wrapper that catches the exception and returns False, is there a way to handle this elegantly and efficiently?

User · Accepted Answer

pandas.isnull() (also pd.isna(), in newer versions) checks for missing values in both numeric and string/object arrays. From the documentation, it checks for:

    NaN in numeric arrays, None/NaN in object arrays

Quick example:

    import pandas as pd
    import numpy as np

    s = pd.Series(['apple', np.nan, 'banana'])
    pd.isnull(s)
    Out[9]:
    0    False
    1     True
    2    False
    dtype: bool

The idea of using numpy.nan to represent missing values is something that pandas introduced, which is why pandas has the tools to deal with it.

Datetimes too (if you use pd.NaT you won't need to specify the dtype):

    In [24]: s = pd.Series([pd.Timestamp('20130101'), np.nan, pd.Timestamp('20130102 9:30')], dtype='M8[ns]')

    In [25]: s
    Out[25]:
    0   2013-01-01 00:00:00
    1                   NaT
    2   2013-01-02 09:30:00
    dtype: datetime64[ns]

    In [26]: pd.isnull(s)
    Out[26]:
    0    False
    1     True
    2    False
    dtype: bool
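Since the question is about checking values one at a time while iterating, here is a minimal sketch of the same function applied to scalars rather than a Series. The `values` list and the `missing_flags` name are made up for illustration; the only library call relied on is `pd.isna`, which accepts scalars of any type.

```python
import numpy as np
import pandas as pd

# Hypothetical mixed-type records; the values are illustrative only.
values = ["apple", np.nan, 3.5, None, pd.NaT, 7]

# For a scalar argument pd.isna returns a single boolean, so no
# try/except around np.isnan is needed for strings or other objects.
missing_flags = [bool(pd.isna(v)) for v in values]

for v, flag in zip(values, missing_flags):
    print(repr(v), "-> missing" if flag else "-> present")
```

Note that if one of the items is itself a list or array, pd.isna returns an element-wise array for it instead of a single boolean, so this sketch assumes every item is a scalar.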

User · Answer

I found this brilliant solution here; it uses the simple logic NaN != NaN: https://www.codespeedy.com/check-if-a-given-string-is-nan-in-python/

Using the above example you can simply do the following. This should work on different types of objects, as it simply utilizes the fact that NaN is not equal to NaN.

    import numpy as np
    import pandas as pd

    s = pd.Series(['apple', np.nan, 'banana'])
    s.apply(lambda x: x != x)
    Out[252]:
    0    False
    1     True
    2    False
    dtype: bool
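The same trick can be wrapped for scalar use. This is only a sketch: the helper name `is_nan_by_self_comparison` and the `samples` list are hypothetical, and it assumes the input is a scalar (an array would yield an element-wise result that cannot be collapsed with `bool`).

```python
import numpy as np

def is_nan_by_self_comparison(x):
    """Hypothetical helper: True only for values not equal to themselves."""
    return bool(x != x)

# Illustrative inputs: strings, Python and numpy NaNs, None, a normal float.
samples = ["apple", float("nan"), np.nan, np.float32(np.nan), None, 1.0]
results = [is_nan_by_self_comparison(x) for x in samples]
print(results)
```

One thing to keep in mind: None compares equal to itself, so this approach reports None as not missing, unlike pd.isnull.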

User · Answer

Is your type really arbitrary? If you know it is just going to be an int, float, or string, you could just do:

    if val.dtype == float and np.isnan(val):

Assuming it is wrapped in numpy, it will always have a dtype, and only float and complex can be NaN.
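A rough sketch of that dtype check as a reusable function follows. It is a variant rather than exactly the line above: it tests `dtype.kind` instead of `dtype == float`, because `dtype == float` only matches float64 and would miss float32 and complex values. The name `is_nan_numpy_scalar` is hypothetical, and the function assumes `val` is a numpy scalar; anything without a `dtype` attribute is treated as not NaN.

```python
import numpy as np

def is_nan_numpy_scalar(val):
    """Hypothetical helper: NaN check that only calls np.isnan on float/complex."""
    dtype = getattr(val, "dtype", None)
    if dtype is None:
        # Plain Python objects (str, None, ...) carry no dtype.
        return False
    # kind 'f' is floating point, 'c' is complex; only these can hold NaN.
    return dtype.kind in "fc" and bool(np.isnan(val))

print(is_nan_numpy_scalar(np.float64("nan")))   # float NaN
print(is_nan_numpy_scalar(np.str_("apple")))    # string scalar, never NaN
print(is_nan_numpy_scalar(np.int64(3)))         # integer, never NaN
```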
