How do I get the row count of a Pandas DataFrame?

Question

I'm trying to get the number of rows of dataframe df with Pandas, and here is my code.

Method 1:

    total_rows = df.count
    print total_rows + 1

Method 2:

    total_rows = df['First_columnn_label'].count
    print total_rows + 1

Both code snippets give me this error:

    TypeError: unsupported operand type(s) for +: 'instancemethod' and 'int'

What am I doing wrong?
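The error message points at the cause: df.count without parentheses is the method object itself, not its result, so adding 1 to it fails. A minimal sketch of the difference, assuming a small made-up frame (the column names a and b are hypothetical):

```python
import pandas as pd

# Hypothetical frame for illustration only.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4.0, None, 6.0]})

method_object = df.count   # no parentheses: the bound method itself
per_column = df.count()    # called: non-null count per column, as a Series
total_rows = len(df)       # number of rows, whether or not they hold NaN

print(callable(method_object))  # True
print(total_rows + 1)           # 4
```

Note that even when called, count() reports non-null values per column, which is why len(df) is the safer choice for a plain row count.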

User · Answer

I come to Pandas from an R background, and I see that Pandas is more complicated when it comes to selecting rows or columns. I had to wrestle with it for a while, and then I found some ways to deal with it.

Getting the number of columns:

    len(df.columns)
    # Here:
    # df is your data frame.
    # df.columns returns an Index object holding the column labels of df.
    # Then len() gets the length of it.

Getting the number of rows:

    len(df.index)
    # It's similar: df.index holds the row labels.
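As a quick check of what these two attributes are, the sketch below uses a small made-up frame (the column names x and y are assumptions) and confirms that both are pandas Index objects whose length can be taken with len():

```python
import pandas as pd

# Hypothetical frame for illustration only.
df = pd.DataFrame({"x": [10, 20, 30], "y": ["a", "b", "c"]})

# Both attributes are Index objects (df.index is a RangeIndex by default).
columns_are_index = isinstance(df.columns, pd.Index)
rows_are_index = isinstance(df.index, pd.Index)

n_cols = len(df.columns)  # number of columns
n_rows = len(df.index)    # number of rows

print(n_rows, n_cols)  # 3 2
```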

User · Answer

Building on Jan-Philip Gehrcke's answer, here is the reason why len(df) or len(df.index) is faster than df.shape[0]. Look at the code: df.shape is a @property that runs a DataFrame method calling len twice.

    df.shape??
    Type:        property
    String form: <property object at 0x1127b33c0>
    Source:
    # df.shape.fget
    @property
    def shape(self):
        """
        Return a tuple representing the dimensionality of the DataFrame.
        """
        return len(self.index), len(self.columns)

And beneath the hood of len(df):

    df.__len__??
    Signature: df.__len__()
    Source:
        def __len__(self):
            """Returns length of info axis, but here we use the index"""
            return len(self.index)
    File:      miniconda2/lib/python2.7/site-packages/pandas/core/frame.py
    Type:      instancemethod

len(df.index) will be slightly faster than len(df) since it has one less function call, and either one is faster than df.shape[0].
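To see the relative cost on your own machine, a rough comparison with the standard-library timeit module could look like the sketch below. The frame size and the number of repetitions are arbitrary assumptions, the lambda wrappers add their own overhead, and absolute numbers will differ by pandas version and hardware.

```python
import timeit

import numpy as np
import pandas as pd

# Arbitrary test frame: 1000 rows, 3 columns.
df = pd.DataFrame(np.arange(3000).reshape(1000, 3))

candidates = {
    "len(df.index)": lambda: len(df.index),
    "len(df)": lambda: len(df),
    "df.shape[0]": lambda: df.shape[0],
}

repetitions = 20_000
timings = {}
for label, fn in candidates.items():
    # Total seconds for `repetitions` calls of each candidate.
    timings[label] = timeit.timeit(fn, number=repetitions)

for label, seconds in timings.items():
    print(f"{label:>14}: {seconds / repetitions * 1e9:.0f} ns per call")
```

All three return the same row count; only the call overhead differs.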

User · Answer

Suppose the dataset file is "data", the DataFrame is named data_fr, and the number of rows in data_fr is stored in nu_rows.

    import pandas as pd

    # Import the data frame. The extension could be different, such as csv, xlsx, etc.
    data_fr = pd.read_csv('data.csv')

    # Print the number of rows
    nu_rows = data_fr.shape[0]
    print(nu_rows)

User · Answer

You can use the .shape property or just len(DataFrame.index). However, there are notable performance differences (len(DataFrame.index) is fastest).

Code to reproduce the plot:

    import numpy as np
    import pandas as pd
    import perfplot

    perfplot.save(
        "out.png",
        setup=lambda n: pd.DataFrame(np.arange(n * 3).reshape(n, 3)),
        n_range=[2**k for k in range(25)],
        kernels=[
            lambda data: data.shape[0],
            lambda data: data[0].count(),
            lambda data: len(data.index),
        ],
        labels=["data.shape[0]", "data[0].count()", "len(data.index)"],
        xlabel="data rows",
    )

As Dan Allen noted in the comments, len(df.index) and df[0].count() are not interchangeable, as count excludes NaNs.
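The point about count excluding NaNs is easy to demonstrate. The sketch below assumes a one-column frame with two missing values; the data is made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical column labelled 0, with two NaN entries out of four rows.
df = pd.DataFrame({0: [1.0, np.nan, 3.0, np.nan]})

rows = len(df.index)      # counts every row
non_null = df[0].count()  # counts only non-NaN values in column 0

print(rows, non_null)  # 4 2
```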

User · Answer

I'm not sure if this would work (data could be omitted), but this may work: call dataframe_name.tail(1), then run the snippet and look at the index label of the row it returns. With the default zero-based integer index, that label is the row count minus one. With any other index, the label does not tell you the row count.
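Reading the last index label only gives a row count when the frame has the default zero-based integer index. A small sketch with two made-up frames shows where it works and where it does not:

```python
import pandas as pd

# Hypothetical frames: same data, different index labels.
default_idx = pd.DataFrame({"v": [1, 2, 3]})
custom_idx = pd.DataFrame({"v": [1, 2, 3]}, index=[10, 20, 30])

last_label_default = default_idx.tail(1).index[0]  # 2, i.e. row count - 1
last_label_custom = custom_idx.tail(1).index[0]    # 30, unrelated to row count

print(last_label_default + 1, len(default_idx))  # 3 3
print(last_label_custom, len(custom_idx))        # 30 3
```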

User · Answer

An alternative way to find the number of rows in a dataframe, which I think is the most readable variant, is pandas.Index.size, i.e. df.index.size.

Do note that, as I commented on the accepted answer: I suspected pandas.Index.size would actually be faster than len(df.index), but timeit on my computer tells me otherwise (about 150 ns slower per loop).
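A minimal sketch of the size attribute next to len(), assuming a made-up five-row frame:

```python
import pandas as pd

# Hypothetical frame for illustration only.
df = pd.DataFrame({"v": range(5)})

n_via_size = df.index.size  # attribute access, no call
n_via_len = len(df.index)   # built-in len() on the same Index

print(n_via_size, n_via_len)  # 5 5
```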

User · Answer

Use len(df). __len__() is documented with "Returns length of index".

Timing info, set up the same way as in root's answer:

    In [7]: timeit len(df.index)
    1000000 loops, best of 3: 248 ns per loop

    In [8]: timeit len(df)
    1000000 loops, best of 3: 573 ns per loop

Due to one additional function call, it is a tiny bit slower than calling len(df.index) directly. This should not matter in most cases.

User · Answer

Apart from the previous answers, you can use df.axes to get a list holding the row index and the column index, and then use the len() function:

    total_rows = len(df.axes[0])
    total_cols = len(df.axes[1])
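A short sketch of the axes approach, assuming a made-up frame with two rows and three columns. In pandas, df.axes is a list of two Index objects, rows first and columns second.

```python
import pandas as pd

# Hypothetical frame for illustration only.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

row_axis, col_axis = df.axes  # [row index, column index]

total_rows = len(row_axis)
total_cols = len(col_axis)

print(total_rows, total_cols)  # 2 3
```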

User · Answer

Suppose df is your dataframe, then:

    count_row = df.shape[0]  # Gives number of rows
    count_col = df.shape[1]  # Gives number of columns

Or, more succinctly:

    r, c = df.shape

User · Answer

TL;DR: Short, clear and clean: use len(df).

len() is your friend, and it can be used for row counts as len(df).

Alternatively, you can access all rows by df.index and all columns by df.columns, and as you can use len(anyList) for getting the count of a list, use len(df.index) for getting the number of rows and len(df.columns) for the column count.

Or, you can use df.shape, which returns the number of rows and columns together. If you want only the number of rows, use df.shape[0]. For only the number of columns, use df.shape[1].

User · Answer

How do I get the row count of a Pandas DataFrame?

This table summarises the different situations in which you'd want to count something in a DataFrame (or Series, for completeness), along with the recommended method(s).

    What to count                    Recommended method(s)
    Row count of a DataFrame         len(df), df.shape[0], len(df.index)
    Column count of a DataFrame      df.shape[1], len(df.columns)
    Row count of a Series            len(s), s.size, len(s.index)
    Non-null row count               DataFrame.count, Series.count
    Group-wise row count             GroupBy.size
    Group-wise non-null row count    GroupBy.count

Footnotes

1. DataFrame.count returns counts for each column as a Series since the non-null count varies by column.
2. DataFrameGroupBy.size returns a Series, since all columns in the same group share the same row-count.
3. DataFrameGroupBy.count returns a DataFrame, since the non-null count could differ across columns in the same group. To get the group-wise non-null count for a specific column, use df.groupby(...)['x'].count() where "x" is the column to count.

Minimal Code Examples

Below, I show examples of each of the methods described in the table above. First, the setup:

    df = pd.DataFrame({'A': list('aabbc'), 'B': ['x', 'x', np.nan, 'x', np.nan]})
    s = df['B'].copy()

    df
       A    B
    0  a    x
    1  a    x
    2  b  NaN
    3  b    x
    4  c  NaN

    s
    0      x
    1      x
    2    NaN
    3      x
    4    NaN
    Name: B, dtype: object

Row Count of a DataFrame: len(df), df.shape[0], or len(df.index)

    len(df)
    # 5
    df.shape[0]
    # 5
    len(df.index)
    # 5

It seems silly to compare the performance of constant time operations, especially when the difference is on the level of "seriously, don't worry about it". But this seems to be a trend with other answers, so I'm doing the same for completeness. Of the three methods above, len(df.index), as mentioned in other answers, is the fastest.

Note: All the methods above are constant time operations as they are simple attribute lookups. df.shape (similar to ndarray.shape) is an attribute that returns a tuple of (# Rows, # Cols). For example, df.shape returns (5, 2) for the example here.

Column Count of a DataFrame: df.shape[1], len(df.columns)

    df.shape[1]
    # 2
    len(df.columns)
    # 2

Analogous to len(df.index), len(df.columns) is the faster of the two methods (but takes more characters to type).

Row Count of a Series: len(s), s.size, len(s.index)

    len(s)
    # 5
    s.size
    # 5
    len(s.index)
    # 5

s.size and len(s.index) are about the same in terms of speed. But I recommend len(s).

Note: size is an attribute, and it returns the number of elements (which equals the count of rows for any Series). DataFrames also define a size attribute, which returns the same result as df.shape[0] * df.shape[1].

Non-Null Row Count: DataFrame.count and Series.count

The methods described here only count non-null values (meaning NaNs are ignored). Calling DataFrame.count will return non-NaN counts for each column:

    df.count()
    A    5
    B    3
    dtype: int64

For Series, use Series.count to similar effect:

    s.count()
    # 3

Group-wise Row Count: GroupBy.size

For DataFrames, use DataFrameGroupBy.size to count the number of rows per group.

    df.groupby('A').size()
    A
    a    2
    b    2
    c    1
    dtype: int64

Similarly, for Series, you'll use SeriesGroupBy.size.

    s.groupby(df.A).size()
    A
    a    2
    b    2
    c    1
    Name: B, dtype: int64

In both cases, a Series is returned. This makes sense for DataFrames as well since all groups share the same row-count.

Group-wise Non-Null Row Count: GroupBy.count

Similar to above, but use GroupBy.count, not GroupBy.size. Note that size always returns a Series, while count returns a Series if called on a specific column, or else a DataFrame.

The following methods return the same thing:

    df.groupby('A')['B'].size()
    df.groupby('A').size()
    A
    a    2
    b    2
    c    1
    Name: B, dtype: int64

Meanwhile, for count, we have

    df.groupby('A').count()
       B
    A
    a  2
    b  1
    c  0

called on the entire GroupBy object, vs.

    df.groupby('A')['B'].count()
    A
    a    2
    b    1
    c    0
    Name: B, dtype: int64

called on a specific column.
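The size attribute is the one method above shown without a DataFrame example. A minimal sketch, reusing the same df and s as in the setup, shows that it counts every cell, NaN included, unlike count:

```python
import numpy as np
import pandas as pd

# Same setup as in the answer above.
df = pd.DataFrame({"A": list("aabbc"), "B": ["x", "x", np.nan, "x", np.nan]})
s = df["B"].copy()

df_cells = df.size     # rows * columns, NaN cells included
s_cells = s.size       # number of elements in the Series, NaN included
s_non_null = s.count() # NaN excluded

print(df_cells, s_cells, s_non_null)  # 10 5 3
```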

User · Answer

Either of these can do it (df is the name of the DataFrame):

Method 1: Using the len function. len(df) will give the number of rows in a DataFrame named df.

Method 2: Using the count function. df[col].count() will count the number of non-null values in a given column col, and df.count() will give the non-null count for every column. These match the row count only when the column contains no NaN values.

User · Answer

For dataframe df, a printed, comma-formatted row count used while exploring data:

    def nrow(df):
        print("{:,}".format(df.shape[0]))

Example:

    nrow(my_df)
    12,456,789
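A variant that returns the formatted string instead of printing it can be easier to reuse in log messages. This is only a sketch: the function name format_row_count and the test frame are hypothetical, not part of the answer above.

```python
import pandas as pd


def format_row_count(df: pd.DataFrame) -> str:
    """Return the row count of df with thousands separators (hypothetical helper)."""
    return f"{len(df):,}"


# Hypothetical frame with 1,234,567 rows.
my_df = pd.DataFrame({"v": range(1_234_567)})

print(format_row_count(my_df))  # 1,234,567
```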

User · Answer

You can do this also. Let's say df is your dataframe. Then df.shape gives you the shape of your dataframe, i.e. (row, col). Thus, assign as below to get the required values:

    row = df.shape[0]
    col = df.shape[1]

User · Answer

In case you want to get the row count in the middle of a chained operation, you can use:

    df.pipe(len)

Example:

    row_count = (
        pd.DataFrame(np.random.rand(3, 4))
        .reset_index()
        .pipe(len)
    )

This can be useful if you don't want to put a long statement inside a len() function.

You could use __len__() instead, but __len__() looks a bit weird.
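The benefit of pipe(len) shows up more clearly when the chain filters rows first. The sketch below assumes a made-up sales table; the column names and the filtering steps are illustrative only.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"city": ["A", "B", "A", "C"], "sales": [10, 0, 5, 7]})

row_count = (
    df
    .loc[lambda d: d["sales"] > 0]   # keep rows with positive sales
    .drop_duplicates(subset="city")  # keep the first row per city
    .pipe(len)                       # count what is left
)

print(row_count)  # 2
```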
