Find integer index of rows with NaN in pandas dataframe

Question

I have a pandas DataFrame like this:

                                a         b
    2011-01-01 00:00:00  1.883381 -0.416629
    2011-01-01 01:00:00  0.149948 -1.782170
    2011-01-01 02:00:00 -0.407604  0.314168
    2011-01-01 03:00:00  1.452354       NaN
    2011-01-01 04:00:00 -1.224869 -0.947457
    2011-01-01 05:00:00  0.498326  0.070416
    2011-01-01 06:00:00  0.401665       NaN
    2011-01-01 07:00:00 -0.019766  0.533641
    2011-01-01 08:00:00 -1.101303 -1.408561
    2011-01-01 09:00:00  1.671795 -0.764629

Is there an efficient way to find the "integer" index of rows with NaNs? In this case the desired output should be [3, 6].

User · Answer

And just in case you want to find the coordinates of NaN for all the columns instead (supposing they are all numeric), here you go:

    df = pd.DataFrame([[0, 1, 3, 4, np.nan, 2], [3, 5, 6, np.nan, 3, 3]])

    df
       0  1  2    3    4  5
    0  0  1  3  4.0  NaN  2
    1  3  5  6  NaN  3.0  3

    np.where(np.asanyarray(np.isnan(df)))
    (array([0, 1]), array([4, 3]))
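If some columns are not numeric, np.isnan is likely to raise a TypeError on them. A minimal sketch of the same idea using DataFrame.isna, which also treats None as missing, could look like this. The frame df_mixed and its column names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one text column and one float column.
df_mixed = pd.DataFrame({"name": ["x", None, "z"], "value": [1.0, 2.0, np.nan]})

# Boolean mask of missing cells as a plain NumPy array.
mask = df_mixed.isna().to_numpy()

# np.argwhere returns one [row, column] pair per missing cell.
pairs = [(int(r), int(c)) for r, c in np.argwhere(mask)]
print(pairs)
```

Each pair is (row position, column position), so the column label can be recovered with df_mixed.columns[c] if needed.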

User · Answer

Let the DataFrame be named df and the column of interest (i.e. the column in which we are trying to find nulls) be 'b'. Then the following snippet gives the desired index of nulls in the DataFrame:

    for i in range(df.shape[0]):
        if df['b'].isnull().iloc[i]:
            print(i)

User · Answer

Another simple solution is list(np.where(df['b'].isnull())[0]).

User · Answer

One-line solution. However, it works for one column only:

    df.loc[pandas.isna(df["b"]), :].index
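Note that .index on the filtered frame returns index labels, which equal integer positions only when the frame has a default RangeIndex. A small sketch of the difference, assuming a made-up frame df_demo with a daily datetime index:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a datetime index, similar in shape to the question's.
df_demo = pd.DataFrame(
    {"b": [0.1, np.nan, 0.3, np.nan, 0.5]},
    index=pd.date_range("2011-01-01", periods=5),
)

# Labels of the rows where 'b' is NaN (timestamps here, not integers).
labels = df_demo.loc[pd.isna(df_demo["b"]), :].index

# Integer positions of the same rows.
positions = np.flatnonzero(df_demo["b"].isna().to_numpy())

print(list(labels))
print(positions.tolist())
```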

User · Answer

Here is another, simpler take:

    df = pd.DataFrame([[0, 1, 3, 4, np.nan, 2], [3, 5, 6, np.nan, 3, 3]])

    inds = np.asarray(df.isnull()).nonzero()

    (array([0, 1], dtype=int64), array([4, 3], dtype=int64))

User · Answer

Here are tests for a few methods:

    %timeit np.where(np.isnan(df['b']))[0]
    %timeit pd.isnull(df['b']).nonzero()[0]
    %timeit np.where(df['b'].isna())[0]
    %timeit df.loc[pd.isna(df['b']), :].index

And their corresponding timings:

    333 µs ± 9.95 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
    280 µs ± 220 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
    313 µs ± 128 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
    6.84 ms ± 1.59 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

It would appear that pd.isnull(df['b']).nonzero()[0] wins the day in terms of timing, but the top three methods all have comparable performance.
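Outside IPython, a comparison like this can be sketched with the standard timeit module. The data below is made up (10,000 random values with a NaN in every 100th row), so absolute numbers will differ from the timings above. The second candidate calls nonzero() on the underlying NumPy array, an assumption made so the sketch also runs on pandas versions where Series.nonzero() is unavailable.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical benchmark data: 10,000 rows with a NaN every 100th row.
values = np.random.default_rng(0).standard_normal(10_000)
values[::100] = np.nan
df_bench = pd.DataFrame({"b": values})

candidates = {
    "np.where + np.isnan": lambda: np.where(np.isnan(df_bench["b"]))[0],
    "isnull + ndarray.nonzero": lambda: pd.isnull(df_bench["b"]).to_numpy().nonzero()[0],
    "np.where + isna": lambda: np.where(df_bench["b"].isna())[0],
}

expected = np.arange(0, 10_000, 100)
for name, fn in candidates.items():
    # Every candidate should report the same integer positions.
    assert np.array_equal(fn(), expected), name
    seconds = timeit.timeit(fn, number=200) / 200
    print(f"{name}: {seconds * 1e6:.1f} µs per call")
```

Checking that all candidates agree before timing them guards against benchmarking a method that silently returns something different, such as labels instead of positions.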

User · Answer

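I was looking for all indexes of rows with NaN values. My working solution:

    def get_nan_indexes(data_frame):
        indexes = []
        for column in data_frame:
            # Labels of the rows where this column is NaN
            index = data_frame[column].index[data_frame[column].apply(np.isnan)]
            if len(index):
                # Keep every matching label, not only the first one
                indexes.extend(index)
        df_index = data_frame.index.tolist()
        # Translate labels into integer positions, de-duplicated and sorted
        return sorted(df_index.index(i) for i in set(indexes))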

User · Answer

In case you have a datetime index and you want the index values:

    df.loc[pd.isnull(df).any(axis=1), :].index.values

User · Answer

Here is a simpler solution:

    inds = pd.isnull(df).any(axis=1).nonzero()[0]

    In [9]: df
    Out[9]:
              0         1
    0  0.450319  0.062595
    1 -0.673058  0.156073
    2 -0.871179 -0.118575
    3  0.594188       NaN
    4 -1.017903 -0.484744
    5  0.860375  0.239265
    6 -0.640070       NaN
    7 -0.535802  1.632932
    8  0.876523 -0.153634
    9 -0.686914  0.131185

    In [10]: pd.isnull(df).any(axis=1).nonzero()[0]
    Out[10]: array([3, 6])
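Series.nonzero() was deprecated in pandas 0.24 and removed in 1.0, so on newer versions the same row-wise check can be sketched by converting the mask to a NumPy array first. The frame df_rows below is made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with NaNs in different columns.
df_rows = pd.DataFrame(
    {"a": [1.0, 2.0, 3.0, np.nan], "b": [np.nan, 5.0, 6.0, 7.0]}
)

# True for every row that contains at least one NaN.
row_has_nan = df_rows.isna().any(axis=1)

# Integer positions of those rows, independent of the index labels.
inds = np.flatnonzero(row_has_nan.to_numpy())
print(inds.tolist())
```

np.flatnonzero returns positions rather than labels, so the result is the same whatever index the frame carries.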

User · Answer

For DataFrame df:

    import numpy as np
    index = df['b'].index[df['b'].apply(np.isnan)]

will give you back the Index of matching labels, which you can use to index back into df, e.g.:

    df['a'].loc[index[0]]
    >>> 1.452354

For the integer index:

    df_index = df.index.tolist()
    [df_index.index(i) for i in index]
    >>> [3, 6]
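The label-to-position lookup can also be sketched with Index.get_indexer, which avoids a linear list search for every label. This assumes the index labels are unique. The frame df_lookup is made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a unique datetime index.
df_lookup = pd.DataFrame(
    {"a": [1.0, 2.0, 3.0, 4.0], "b": [0.5, np.nan, 1.5, np.nan]},
    index=pd.date_range("2011-01-01", periods=4),
)

# Labels of the rows where 'b' is NaN.
nan_labels = df_lookup.index[df_lookup["b"].isna().to_numpy()]

# Map each label back to its integer position in the index.
positions = df_lookup.index.get_indexer(nan_labels)
print(positions.tolist())
```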

User · Answer

Don't know if this is too late, but you can use np.where to find the indices of NaN values as such:

    indices = list(np.where(df['b'].isna())[0])
