There are a number of ways to get information on the attributes of your DataFrame or Series.
Create Sample DataFrame and Series
df = pd.DataFrame({'a':[5, 2, np.nan], 'b':[ 9, 2, 4]})
df
a b
0 5.0 9
1 2.0 2
2 NaN 4
s = df['a']
s
0 5.0
1 2.0
2 NaN
Name: a, dtype: float64
shape
AttributeThe shape
attribute returns a two-item tuple of the number of rows and the number of columns in the DataFrame. For a Series, it returns a one-item tuple.
df.shape
(3, 2)
s.shape
(3,)
len
functionTo get the number of rows of a DataFrame or get the length of a Series, use the len
function. An integer will be returned.
len(df)
3
len(s)
3
size
attributeTo get the total number of elements in the DataFrame or Series, use the size
attribute. For DataFrames, this is the product of the number of rows and the number of columns. For a Series, this will be equivalent to the len
function:
df.size
6
s.size
3
ndim
attributeThe ndim
attribute returns the number of dimensions of your DataFrame or Series. It will always be 2 for DataFrames and 1 for Series:
df.ndim
2
s.ndim
1
count
methodThe count
method can be used to return the number of non-missing values for each column/row of the DataFrame. This can be very confusing, because most people normally think of count as just the length of each row, which it is not. When called on a DataFrame, a Series is returned with the column names in the index and the number of non-missing values as the values.
df.count() # by default, get the count of each column
a 2
b 3
dtype: int64
df.count(axis='columns') # change direction to get count of each row
0 2
1 2
2 1
dtype: int64
For a Series, there is only one axis for computation and so it just returns a scalar:
s.count()
2
info
method for retrieving metadataThe info
method returns the number of non-missing values and data types of each column
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 2 columns):
a 2 non-null float64
b 3 non-null int64
dtypes: float64(1), int64(1)
memory usage: 128.0 bytes