...building on Jan-Philip Gehrcke's answer.
The reason why len(df) or len(df.index) is faster than df.shape[0]:
Look at the code. df.shape is a @property that runs a DataFrame method calling len twice.
df.shape??
Type: property
String form: <property object at 0x1127b33c0>
Source:
# df.shape.fget
@property
def shape(self):
    """
    Return a tuple representing the dimensionality of the DataFrame.
    """
    return len(self.index), len(self.columns)
And beneath the hood of len(df):
df.__len__??
Signature: df.__len__()
Source:
def __len__(self):
    """Returns length of info axis, but here we use the index """
    return len(self.index)
File: ~/miniconda2/lib/python2.7/site-packages/pandas/core/frame.py
Type: instancemethod
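To make the difference in work concrete, here is a minimal sketch that counts how many axis-level len calls each access triggers. ToyFrame and CountingAxis are hypothetical stand-ins written for this illustration, not pandas classes; only the bodies of shape and __len__ are copied from the two listings above.

```python
# Hypothetical stand-ins (not pandas code) that mirror the two listings above
# so the number of len() calls on the axes can be counted.

class CountingAxis:
    """An axis-like object that records how often len() is called on it."""

    def __init__(self, n):
        self._n = n
        self.len_calls = 0

    def __len__(self):
        self.len_calls += 1
        return self._n


class ToyFrame:
    def __init__(self, n_rows, n_cols):
        self.index = CountingAxis(n_rows)
        self.columns = CountingAxis(n_cols)

    @property
    def shape(self):
        # Same body as the quoted property: len() on both axes.
        return len(self.index), len(self.columns)

    def __len__(self):
        # Same body as the quoted __len__: len() on the index only.
        return len(self.index)

    def total_len_calls(self):
        return self.index.len_calls + self.columns.len_calls


def count_len_calls(access):
    """Run one access on a fresh ToyFrame and return the axis len() count."""
    frame = ToyFrame(1000, 5)
    access(frame)
    return frame.total_len_calls()


calls_shape = count_len_calls(lambda f: f.shape[0])
calls_len_df = count_len_calls(lambda f: len(f))
calls_len_index = count_len_calls(lambda f: len(f.index))

print(calls_shape, calls_len_df, calls_len_index)  # prints: 2 1 1
```

The counter only tracks len calls on the axes. len(df) and len(df.index) both hit the index once; the difference between them is the extra ToyFrame.__len__ call that wraps it.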
len(df.index) will be slightly faster than len(df) since it has one fewer function call, but either one is faster than df.shape[0].
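If you want to check this on your own setup, a rough timing sketch with timeit could look like the following. The frame contents, its size, and the repeat count are arbitrary assumptions, and the absolute numbers (possibly even the ordering) will depend on your pandas and Python versions.

```python
import timeit

import pandas as pd

# Arbitrary example frame; the size and column names are assumptions.
df = pd.DataFrame({"a": range(1000), "b": range(1000)})

# All three expressions report the same row count.
row_counts = (len(df.index), len(df), df.shape[0])

n = 100_000  # number of repetitions per expression
timings = {
    "len(df.index)": timeit.timeit(lambda: len(df.index), number=n),
    "len(df)": timeit.timeit(lambda: len(df), number=n),
    "df.shape[0]": timeit.timeit(lambda: df.shape[0], number=n),
}

for expr, seconds in timings.items():
    # Average cost of a single evaluation, in nanoseconds.
    print(f"{expr:15s} {seconds / n * 1e9:8.1f} ns per call")
```

Each lambda adds the same small call overhead to all three measurements, so the comparison between them stays fair even though the absolute figures are slightly inflated.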