How to check whether a pandas DataFrame is empty

Question

How do I check whether a pandas DataFrame is empty? In my case, I want to print a message in the terminal if the DataFrame is empty.

User · Answer

1) If a DataFrame may contain both NaN and non-null values and you want to find out whether it holds any data, try the code below.

2) When can this situation happen?

It happens when a single function is used to plot more than one DataFrame, each passed as a parameter. In that case the function tries to plot the data even when a DataFrame is empty, and so it draws an empty figure. It makes more sense to simply display a "DataFrame has no data" message.

3) Why?

If a DataFrame is empty (i.e. contains no data at all; mind you, a DataFrame with only NaN values is considered non-empty by pandas), it is desirable not to plot but to print a message instead.

Suppose we have two DataFrames, df1 and df2. The function myfunc takes any DataFrame (df1 and df2 in this case) and prints a message instead of plotting if the DataFrame is empty.

    df1                     df2
    col1 col2           col1 col2
    NaN   2              NaN  NaN
    2     NaN            NaN  NaN

and the function:

    def myfunc(df):
        # count the total number of non-NaN values; equal to 0 if the DataFrame is empty
        if df.count().sum() > 0:
            print('not empty')
            df.plot(kind='barh')
        else:
            # display a message instead of plotting if it is empty
            print('empty')
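The count-based check above can be tried without any plotting. The following is a minimal sketch, assuming a hypothetical helper named has_data and the two example frames from this answer rebuilt with numpy NaN values; it is an illustration, not the answer author's exact code.

```python
import numpy as np
import pandas as pd


def has_data(df):
    """Hypothetical helper: True if df holds at least one non-NaN value."""
    # DataFrame.count() returns the number of non-NaN cells per column.
    return int(df.count().sum()) > 0


# The two example frames described in the answer.
df1 = pd.DataFrame({"col1": [np.nan, 2], "col2": [2, np.nan]})
df2 = pd.DataFrame({"col1": [np.nan, np.nan], "col2": [np.nan, np.nan]})

for name, frame in [("df1", df1), ("df2", df2)]:
    # Show the count-based result next to the built-in attribute.
    print(name, "has_data:", has_data(frame), "| df.empty:", frame.empty)
```

Note that df2 has rows, so df.empty alone would not flag it; the count-based test is what treats an all-NaN frame as having nothing to plot.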

User · Answer

To see if a dataframe is empty, I argue that one should test the length of the dataframe's columns index:

    if len(df.columns) == 0:

Reason

According to the Pandas Reference API, there is a distinction between:

- an empty dataframe with 0 rows and 0 columns
- an empty dataframe with rows containing NaN, hence at least 1 column

Arguably, they are not the same. The other answers are imprecise in that df.empty, len(df), or len(df.index) make no distinction: the index length is 0 and empty is True in both cases.

Examples

Example 1: An empty dataframe with 0 rows and 0 columns

    In [1]: import pandas as pd
            df1 = pd.DataFrame()
            df1
    Out[1]: Empty DataFrame
            Columns: []
            Index: []

    In [2]: len(df1.index)  # or len(df1)
    Out[2]: 0

    In [3]: df1.empty
    Out[3]: True

Example 2: A dataframe which is emptied to 0 rows but still retains n columns

    In [4]: df2 = pd.DataFrame({'AA': [1, 2, 3], 'BB': [11, 22, 33]})
            df2
    Out[4]:    AA  BB
            0   1  11
            1   2  22
            2   3  33

    In [5]: df2 = df2[df2['AA'] == 5]
            df2
    Out[5]: Empty DataFrame
            Columns: [AA, BB]
            Index: []

    In [6]: len(df2.index)  # or len(df2)
    Out[6]: 0

    In [7]: df2.empty
    Out[7]: True

Now, building on the previous examples, in which the index length is 0 and empty is True: reading the length of the columns index of the first dataframe, df1, returns 0 columns, which proves that it is indeed empty.

    In [8]: len(df1.columns)
    Out[8]: 0

    In [9]: len(df2.columns)
    Out[9]: 2

Critically, while the second dataframe df2 contains no data, it is not completely empty, because it still reports the number of empty columns that persist.

Why it matters

Let's add a new column to these dataframes to understand the implications:

    # As expected, the previously empty dataframe now holds 1 series
    In [10]: df1['CC'] = [111, 222, 333]
             df1
    Out[10]:    CC
             0 111
             1 222
             2 333

    In [11]: len(df1.columns)
    Out[11]: 1

    # Note the persisting series with rows containing NaN values in df2
    In [12]: df2['CC'] = [111, 222, 333]
             df2
    Out[12]:    AA  BB   CC
             0 NaN NaN  111
             1 NaN NaN  222
             2 NaN NaN  333

    In [13]: len(df2.columns)
    Out[13]: 3

It is evident that the original columns in df2 have resurfaced. Therefore, it is prudent to instead read the length of the columns index with len(pandas.core.frame.DataFrame.columns) to see if a dataframe is empty.

Practical solution

    # New dataframe df
    In [1]: df = pd.DataFrame({'AA': [1, 2, 3], 'BB': [11, 22, 33]})
            df
    Out[1]:    AA  BB
            0   1  11
            1   2  22
            2   3  33

    # This data manipulation results in an empty df,
    # because the selected subset of values is not available (NaN)
    In [2]: df = df[df['AA'] == 5]
            df
    Out[2]: Empty DataFrame
            Columns: [AA, BB]
            Index: []

    # NOTE: the df is empty, BUT the columns are persistent
    In [3]: len(df.columns)
    Out[3]: 2

    # And accordingly, the other answers on this page
    In [4]: len(df.index)  # or len(df)
    Out[4]: 0

    In [5]: df.empty
    Out[5]: True

    # SOLUTION: conditionally check for empty columns
    In [6]: if len(df.columns) != 0:  # <--- here
                # Do something, e.g.
                # drop any columns containing rows with NaN
                # to make the df really empty
                df = df.dropna(how='all', axis=1)
            df
    Out[6]: Empty DataFrame
            Columns: []
            Index: []

    # Testing shows it is indeed empty now
    In [7]: len(df.columns)
    Out[7]: 0

Adding a new data series now works as expected, without the empty columns resurfacing (that is, without any series that contained rows with only NaN):

    In [8]: df['CC'] = [111, 222, 333]
            df
    Out[8]:    CC
            0 111
            1 222
            2 333

    In [9]: len(df.columns)
    Out[9]: 1
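To make the distinction drawn in this answer easier to test in a script, here is a minimal sketch. The helper name describe_emptiness and its three labels are assumptions made for illustration; only len(df.columns), len(df.index) and df.empty come from the answer itself.

```python
import pandas as pd


def describe_emptiness(df):
    """Hypothetical helper: report which kind of 'empty' a DataFrame is."""
    if len(df.columns) == 0:
        return "no columns"
    if len(df.index) == 0:
        return "columns but no rows"
    return "has rows"


# Example 1 from the answer: 0 rows and 0 columns.
no_cols = pd.DataFrame()

# Example 2 from the answer: filtered down to 0 rows, columns retained.
no_rows = pd.DataFrame({"AA": [1, 2, 3], "BB": [11, 22, 33]})
no_rows = no_rows[no_rows["AA"] == 5]

for frame in (no_cols, no_rows):
    # df.empty is True for both frames; only the columns check tells them apart.
    print(describe_emptiness(frame), "| empty:", frame.empty, "| shape:", frame.shape)
```

Whether the leftover columns matter depends on what happens next: if new columns are going to be assigned to the frame, the columns check is the relevant one; if the frame is only read, df.empty is usually enough.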

User · Answer

I prefer going the long route. These are the checks I follow to avoid using a try-except clause:

1. check that the variable is not None
2. then check that it is a DataFrame
3. and make sure it is not empty

Here, DATA is the suspect variable:

    DATA is not None and isinstance(DATA, pd.DataFrame) and not DATA.empty
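The chained check can be wrapped in a small function so it is written only once. This is a sketch under the assumption that a helper named is_usable_frame is acceptable in your code base; the name is hypothetical and not part of pandas.

```python
import pandas as pd


def is_usable_frame(data):
    """Hypothetical helper: True only for a non-empty pandas DataFrame."""
    # Short-circuit evaluation means .empty is never touched on a non-DataFrame.
    return data is not None and isinstance(data, pd.DataFrame) and not data.empty


candidates = {
    "None": None,
    "a list": [1, 2, 3],
    "empty DataFrame": pd.DataFrame(),
    "DataFrame with a row": pd.DataFrame({"A": [1]}),
}

for label, value in candidates.items():
    print(f"{label}: {is_usable_frame(value)}")
```

Strictly speaking, isinstance already returns False for None, so the first test is there for readability rather than necessity.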

User · Answer

You can use the attribute df.empty to check whether it's empty or not:

    if df.empty:
        print('DataFrame is empty')

Source: Pandas Documentation
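For the use case in the question (printing a message), a minimal sketch could look like the following. The helper name emptiness_message and the message strings are assumptions for illustration. The last call shows the caveat that a frame holding only NaN is not considered empty until the NaN rows are dropped.

```python
import numpy as np
import pandas as pd


def emptiness_message(df):
    """Hypothetical helper: pick a message based on df.empty."""
    return "DataFrame is empty" if df.empty else "DataFrame has data"


no_rows = pd.DataFrame({"A": []})         # a column, but zero rows
only_nan = pd.DataFrame({"A": [np.nan]})  # one row holding only NaN

print(emptiness_message(no_rows))
print(emptiness_message(only_nan))
# Dropping the NaN rows first leaves zero rows behind.
print(emptiness_message(only_nan.dropna()))
```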

User · Answer

I use the len function. It's much faster than empty. len(df.index) is even faster.

    import pandas as pd
    import numpy as np

    df = pd.DataFrame(np.random.randn(10000, 4), columns=list('ABCD'))

    def empty(df):
        return df.empty

    def lenz(df):
        return len(df) == 0

    def lenzi(df):
        return len(df.index) == 0

Timings in IPython:

    %timeit empty(df)
    %timeit lenz(df)
    %timeit lenzi(df)

    10000 loops, best of 3: 13.9 µs per loop
    100000 loops, best of 3: 2.34 µs per loop
    1000000 loops, best of 3: 695 ns per loop

len on the index seems to be faster.
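The %timeit magic only works inside IPython. As a rough sketch, the same comparison can be run as a plain script with the standard-library timeit module. The call count of 10,000 is an arbitrary assumption, and the absolute numbers will differ by machine and pandas version, so no expected output is shown.

```python
import timeit

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(10000, 4), columns=list("ABCD"))

# The three checks compared in the answer, wrapped as zero-argument callables.
checks = {
    "df.empty": lambda: df.empty,
    "len(df) == 0": lambda: len(df) == 0,
    "len(df.index) == 0": lambda: len(df.index) == 0,
}

n_calls = 10_000  # arbitrary repeat count chosen for this sketch
results = {}
for label, check in checks.items():
    # timeit.timeit returns the total seconds spent on n_calls invocations.
    results[label] = timeit.timeit(check, number=n_calls)
    per_call_ns = results[label] / n_calls * 1e9
    print(f"{label:<20} {per_call_ns:10.0f} ns per call")
```

All three checks take far less than a millisecond per call, so the difference is only likely to matter inside a tight loop.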
