How to check if any value is NaN in a Pandas DataFrame

Question

In Python Pandas, what's the best way to check whether a DataFrame has one (or more) NaN values? I know about the function pd.isnan, but this returns a DataFrame of booleans for each element. This post right here doesn't exactly answer my question either.

User · Answer

You can not only check whether any NaNs exist but also get the fraction of NaNs in each column (multiply by 100 for a percentage) using the following:

    df = pd.DataFrame({'col1': [1, 2, 3, 4, 5], 'col2': [6, np.nan, 8, 9, 10]})
    df

       col1  col2
    0     1   6.0
    1     2   NaN
    2     3   8.0
    3     4   9.0
    4     5  10.0

    df.isnull().sum() / len(df)

    col1    0.0
    col2    0.2
    dtype: float64

User · Answer

Let df be the name of the Pandas DataFrame, and treat any value that is numpy.nan as a null value.

If you want to see which columns have nulls and which do not (just True and False):

    df.isnull().any()

If you want to see only the columns that have nulls:

    df.loc[:, df.isnull().any()].columns

If you want to see the count of nulls in every column:

    df.isna().sum()

If you want to see the percentage of nulls in every column:

    df.isna().sum() / len(df) * 100

If you want to see the percentage of nulls only in columns that have nulls:

    df.loc[:, list(df.loc[:, df.isnull().any()].columns)].isnull().sum() / len(df) * 100

EDIT 1: If you want to see where your data is missing visually:

    import missingno
    missingdata_df = df.columns[df.isnull().any()].tolist()
    missingno.matrix(df[missingdata_df])
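
The count and percentage queries in this answer can be combined into one small summary table. The following is only a sketch: the sample data and the column names n_null and pct_null are made up for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, 4.0],
    "b": [1.0, 2.0, 3.0, 4.0],
    "c": [np.nan, np.nan, 3.0, 4.0],
})

null_counts = df.isna().sum()

# One row per column, holding the null count and the null percentage.
summary = pd.DataFrame({
    "n_null": null_counts,
    "pct_null": null_counts / len(df) * 100,
})

# Keep only the columns that actually contain nulls.
summary = summary[summary["n_null"] > 0]
print(summary)
```

Filtering on the count at the end avoids the nested df.loc selection while giving the same columns.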

User · Answer

Super Simple Syntax: df.isna().any(axis=None)

Starting from v0.23.2, you can use DataFrame.isna + DataFrame.any(axis=None), where axis=None specifies logical reduction over the entire DataFrame.

    # Setup
    df = pd.DataFrame({'A': [1, 2, np.nan], 'B': [np.nan, 4, 5]})
    df

         A    B
    0  1.0  NaN
    1  2.0  4.0
    2  NaN  5.0

    df.isna()

           A      B
    0  False   True
    1  False  False
    2   True  False

    df.isna().any(axis=None)
    # True

Useful Alternatives

numpy.isnan
Another performant option if you're running older versions of pandas.

    np.isnan(df.values)

    array([[False,  True],
           [False, False],
           [ True, False]])

    np.isnan(df.values).any()
    # True

Alternatively, check the sum:

    np.isnan(df.values).sum()
    # 2

    np.isnan(df.values).sum() > 0
    # True

Series.hasnans
You can also iteratively call Series.hasnans. For example, to check if a single column has NaNs:

    df['A'].hasnans
    # True

And to check if any column has NaNs, you can use a comprehension with any (which is a short-circuiting operation):

    any(df[c].hasnans for c in df)
    # True

This is actually very fast.
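
One caveat about the np.isnan alternative: it only works on numeric arrays. The sketch below assumes a hypothetical mixed-dtype frame (not from the original answer) and shows np.isnan raising TypeError on the resulting object array, while the pandas isna route still works.

```python
import numpy as np
import pandas as pd

# Hypothetical mixed-dtype frame: one float column, one string column.
df = pd.DataFrame({"A": [1.0, np.nan], "B": ["x", None]})

try:
    np.isnan(df.values)
    numpy_ok = True
except TypeError:
    # df.values is an object array here, which np.isnan cannot handle.
    numpy_ok = False

# isna() understands both NaN and None, whatever the column dtype.
pandas_result = bool(df.isna().values.any())

print(numpy_ok, pandas_result)
```

If the frame may hold strings or other objects, df.isna() is the safer starting point.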

User · Answer

We can see the null values present in the dataset by generating a heatmap using the seaborn module's heatmap function:

    import pandas as pd
    import seaborn as sns

    dataset = pd.read_csv('train.csv')
    sns.heatmap(dataset.isnull(), cbar=False)

User · Answer

I've been using the following: type casting the value to a string and checking for the 'nan' value.

    str(df.at[index, 'column']) == 'nan'

This allows me to check a specific value in a series, and not just return whether NaN is contained somewhere within the series.
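
A possible pitfall of the string comparison is that it cannot tell a real NaN from the literal text "nan". The sketch below uses made-up data to compare it with pd.isna, which tests the value itself.

```python
import numpy as np
import pandas as pd

# Hypothetical column holding the text "nan", a real NaN, and a normal value.
df = pd.DataFrame({"column": ["nan", np.nan, "ok"]})

# String comparison flags both the text "nan" and the real NaN.
by_string = [str(df.at[i, "column"]) == "nan" for i in df.index]

# pd.isna flags only the real missing value.
by_isna = [bool(pd.isna(df.at[i, "column"])) for i in df.index]

print(by_string)
print(by_isna)
```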

User · Answer

Or you can use .info() on the DataFrame, such as:

    df.info(null_counts=True)

which returns the number of non-null rows in each column, such as:

    <class 'pandas.core.frame.DataFrame'>
    Int64Index: 3276314 entries, 0 to 3276313
    Data columns (total 10 columns):
    n_matches           3276314 non-null int64
    avg_pic_distance    3276314 non-null float64

Note that in pandas 1.2 and later this parameter is named show_counts; null_counts was deprecated and then removed in pandas 2.0.
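
As a sketch of how this can be used programmatically, the example below assumes pandas 1.2 or newer (where the parameter is called show_counts) and a small made-up frame that reuses the two column names from the output above. It captures the report in a string and also reads the same counts with DataFrame.count().

```python
import io

import numpy as np
import pandas as pd

# Hypothetical data reusing the column names from the output shown above.
df = pd.DataFrame({
    "n_matches": [1, 2, 3],
    "avg_pic_distance": [0.5, np.nan, 1.5],
})

# info() prints to stdout by default; buf redirects the report to a string.
buf = io.StringIO()
df.info(buf=buf, show_counts=True)
report = buf.getvalue()
print(report)

# count() returns the same non-null counts as numbers that can be compared.
non_null = df.count()
columns_with_nan = non_null[non_null < len(df)].index.tolist()
```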

User · Answer

    df.apply(axis=0, func=lambda x: any(pd.isnull(x)))

This will check, for each column, whether it contains NaN or not.

User · Answer

You have a couple of options.

    import pandas as pd
    import numpy as np

    df = pd.DataFrame(np.random.randn(10, 6))
    # Make a few areas have NaN values
    df.iloc[1:3, 1] = np.nan
    df.iloc[5, 3] = np.nan
    df.iloc[7:9, 5] = np.nan

Now the data frame looks something like this:

              0         1         2         3         4         5
    0  0.520113  0.884000  1.260966 -0.236597  0.312972 -0.196281
    1 -0.837552       NaN  0.143017  0.862355  0.346550  0.842952
    2 -0.452595       NaN -0.420790  0.456215  1.203459  0.527425
    3  0.317503 -0.917042  1.780938 -1.584102  0.432745  0.389797
    4 -0.722852  1.704820 -0.113821 -1.466458  0.083002  0.011722
    5 -0.622851 -0.251935 -1.498837       NaN  1.098323  0.273814
    6  0.329585  0.075312 -0.690209 -3.807924  0.489317 -0.841368
    7 -1.123433 -1.187496  1.868894 -2.046456 -0.949718       NaN
    8  1.133880 -0.110447  0.050385 -1.158387  0.188222       NaN
    9 -0.513741  1.196259  0.704537  0.982395 -0.585040 -1.693810

Option 1: df.isnull().any().any() - This returns a boolean value

You know of isnull(), which would return a dataframe like this:

           0      1      2      3      4      5
    0  False  False  False  False  False  False
    1  False   True  False  False  False  False
    2  False   True  False  False  False  False
    3  False  False  False  False  False  False
    4  False  False  False  False  False  False
    5  False  False  False   True  False  False
    6  False  False  False  False  False  False
    7  False  False  False  False  False   True
    8  False  False  False  False  False   True
    9  False  False  False  False  False  False

If you make it df.isnull().any(), you can find just the columns that have NaN values:

    0    False
    1     True
    2    False
    3     True
    4    False
    5     True
    dtype: bool

One more .any() will tell you if any of the above are True:

    > df.isnull().any().any()
    True

Option 2: df.isnull().sum().sum() - This returns an integer of the total number of NaN values

This operates the same way as .any().any() does, by first giving a summation of the number of NaN values in each column, then the summation of those values:

    df.isnull().sum()

    0    0
    1    2
    2    0
    3    1
    4    0
    5    2
    dtype: int64

Finally, to get the total number of NaN values in the DataFrame:

    df.isnull().sum().sum()
    5

User · Answer

To find out which rows have NaNs in a specific column:

    nan_rows = df[df['name column'].isnull()]

User · Answer

If you need to know how many rows there are with "one or more NaNs":

    df.isnull().T.any().T.sum()

Or if you need to pull out these rows and examine them:

    nan_rows = df[df.isnull().T.any()]

User · Answer

    import missingno as msno
    msno.matrix(df)  # just to visualize; no missing values

User · Answer

Adding to Hobs' brilliant answer (I am very new to Python and Pandas, so please point out if I am wrong). To find out which rows have NaNs:

    nan_rows = df[df.isnull().any(axis=1)]

This performs the same operation without the need for transposing, by specifying the axis of any() as 1 to check whether True is present in each row.
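
A minimal sketch of the row-wise check, using made-up data, which also confirms that the axis=1 form gives the same mask as the transposed form:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: rows 0 and 1 contain NaN, row 2 does not.
df = pd.DataFrame({"A": [1.0, np.nan, 3.0], "B": [np.nan, np.nan, 6.0]})

# One boolean per row: True where the row holds at least one NaN.
row_has_nan = df.isnull().any(axis=1)

nan_rows = df[row_has_nan]                # the rows themselves
n_rows_with_nan = int(row_has_nan.sum())  # how many such rows

# The transposed variant yields an identical mask.
same_mask = row_has_nan.equals(df.isnull().T.any())

print(nan_rows)
print(n_rows_with_nan, same_mask)
```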

User · Answer

    df.isnull().sum()

This will give you the count of all NaN values present in the respective columns of the DataFrame.

User · Answer

Since no one has mentioned it, there is another attribute called hasnans.

df[i].hasnans evaluates to True if one or more of the values in the pandas Series is NaN, and False if not. Note that it's not a function.

pandas version '0.19.2' and '0.20.2'

User · Answer

Depending on the type of data you're dealing with, you could also just get the value counts of each column while performing your EDA by setting dropna to False.

    for col in df:
        print(df[col].value_counts(dropna=False))

Works well for categorical variables, not so much when you have many unique values.
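
As a small illustration on one made-up categorical column, the NaN bucket in the value_counts result can also be read back out as a number:

```python
import numpy as np
import pandas as pd

# Hypothetical categorical column with two missing entries.
df = pd.DataFrame({"color": ["red", "blue", np.nan, "red", np.nan]})

# dropna=False keeps NaN as its own bucket in the counts.
counts = df["color"].value_counts(dropna=False)
print(counts)

# Select the NaN bucket through the index rather than by label lookup.
n_missing = int(counts[counts.index.isna()].sum())
```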

User · Answer

Since pandas has to find this out for DataFrame.dropna(), I took a look to see how they implement it and discovered that they made use of DataFrame.count(), which counts all non-null values in the DataFrame. Cf. the pandas source code. I haven't benchmarked this technique, but I figure the authors of the library are likely to have made a wise choice for how to do it.

User · Answer

jwilner's response is spot on. I was exploring to see if there's a faster option, since in my experience, summing flat arrays is (strangely) faster than counting. This code seems faster:

    df.isnull().values.any()

The benchmark:

    import numpy as np
    import pandas as pd
    import perfplot


    def setup(n):
        df = pd.DataFrame(np.random.randn(n))
        df[df > 0.9] = np.nan
        return df


    def isnull_any(df):
        return df.isnull().any()


    def isnull_values_sum(df):
        return df.isnull().values.sum() > 0


    def isnull_sum(df):
        return df.isnull().sum() > 0


    def isnull_values_any(df):
        return df.isnull().values.any()


    perfplot.save(
        "out.png",
        setup=setup,
        kernels=[isnull_any, isnull_values_sum, isnull_sum, isnull_values_any],
        n_range=[2 ** k for k in range(25)],
    )

df.isnull().sum().sum() is a bit slower, but of course has additional information: the number of NaNs.

User · Answer

The best would be to use:

    df.isna().any().any()

Here is why. isna() is used to define isnull(), so both of these are identical, of course.

This is even faster than the accepted answer and covers all 2D pandas arrays.

User · Answer

Just use math.isnan(x), which returns True if x is a NaN (not a number), and False otherwise.
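
Note that math.isnan works on one scalar at a time and only accepts real numbers. The sketch below, on a hypothetical frame, shows it applied to a single cell and to a float column, and shows the TypeError it raises for a string.

```python
import math

import numpy as np
import pandas as pd

# Hypothetical frame with one float column and one string column.
df = pd.DataFrame({"A": [1.0, np.nan], "B": ["x", None]})

# Works on a single numeric cell.
scalar_is_nan = math.isnan(df.at[1, "A"])

# Works element by element over a numeric column.
any_nan_in_a = any(math.isnan(v) for v in df["A"])

# Fails on non-numeric values such as strings.
try:
    math.isnan(df.at[0, "B"])
    string_ok = True
except TypeError:
    string_ok = False

print(scalar_is_nan, any_nan_in_a, string_ok)
```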

User · Answer

Here is another interesting way of finding nulls and replacing them with a calculated value.

    # Creating the DataFrame
    testdf2 = pd.DataFrame({'Tenure': [1, 2, 3, 4, 5],
                            'Monthly': [10, 20, 30, 40, 50],
                            'Yearly': [10, 40, np.nan, np.nan, 250]})
    >>> testdf2
       Monthly  Tenure  Yearly
    0       10       1    10.0
    1       20       2    40.0
    2       30       3     NaN
    3       40       4     NaN
    4       50       5   250.0

    # Identifying the rows with empty columns
    nan_rows = testdf2[testdf2['Yearly'].isnull()]
    >>> nan_rows
       Monthly  Tenure  Yearly
    2       30       3     NaN
    3       40       4     NaN

    # Getting the row labels into a list
    >>> index = list(nan_rows.index)
    >>> index
    [2, 3]

    # Replacing null values with a calculated value
    # (.loc avoids chained assignment, which may not write back to testdf2)
    >>> for i in index:
    ...     testdf2.loc[i, 'Yearly'] = testdf2.loc[i, 'Monthly'] * testdf2.loc[i, 'Tenure']
    >>> testdf2
       Monthly  Tenure  Yearly
    0       10       1    10.0
    1       20       2    40.0
    2       30       3    90.0
    3       40       4   160.0
    4       50       5   250.0
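
The same replacement can also be written without an explicit loop. The following sketch assumes the same data as the example and uses Series.fillna with a second Series, which fills each missing entry from the value at the matching index.

```python
import numpy as np
import pandas as pd

testdf2 = pd.DataFrame({
    "Tenure": [1, 2, 3, 4, 5],
    "Monthly": [10, 20, 30, 40, 50],
    "Yearly": [10, 40, np.nan, np.nan, 250],
})

# Only the NaN entries of Yearly are replaced by Monthly * Tenure;
# existing values are left untouched.
testdf2["Yearly"] = testdf2["Yearly"].fillna(testdf2["Monthly"] * testdf2["Tenure"])

print(testdf2)
```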

User · Answer

df.isnull().any().any() should do it.
