How can I get the number of the row in a dataframe that contains a certain value in a certain column using Pandas? For example, I have the following dataframe:
   ClientID LastName
0        34  Johnson
1        67    Smith
2        53    Brows
How can I find the number of the row that has 'Smith' in the 'LastName' column?
len(df[df["LastName"] == "Smith"].values)
count_smiths = (df['LastName'] == 'Smith').sum()
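Both one-liners above count the matching rows rather than locate them. A minimal sketch, assuming the sample dataframe from the question is rebuilt by hand (the names `mask` and `count_via_len` are illustrative, not from the original answers):

```python
import pandas as pd

# Rebuild the sample dataframe from the question.
df = pd.DataFrame({
    "ClientID": [34, 67, 53],
    "LastName": ["Johnson", "Smith", "Brows"],
})

# Boolean Series: True where LastName equals 'Smith'.
mask = df["LastName"] == "Smith"

# True counts as 1, so summing the mask counts the matching rows.
count_smiths = int(mask.sum())

# Filtering first and taking the length gives the same number.
count_via_len = len(df[mask])

print(count_smiths, count_via_len)
```

Summing the mask avoids building the filtered dataframe, which may matter on large data.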
You can simply use the shape attribute:
df[df['LastName'] == 'Smith'].shape
Output
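(1, 2)
This indicates 1 row and 2 columns, so the first element is the number of matching rows. Calling shape on the unfiltered dataframe gives the size of the whole dataset in the same way.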
The code above follows this general pattern:
DataframeName[DataframeName['Column_name'] == 'Value to match in column']
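The pattern can be made concrete as follows. This is only a sketch, assuming the sample dataframe from the question; the variable names `matches`, `n_rows`, and `n_cols` are illustrative:

```python
import pandas as pd

# Rebuild the sample dataframe from the question.
df = pd.DataFrame({
    "ClientID": [34, 67, 53],
    "LastName": ["Johnson", "Smith", "Brows"],
})

# The inner comparison builds a boolean mask; the outer df[...] keeps
# only the rows where the mask is True.
matches = df[df["LastName"] == "Smith"]

# shape is a (rows, columns) tuple, so the first element is the
# number of matching rows.
n_rows, n_cols = matches.shape

print(n_rows, n_cols)
```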
df.index[df.LastName == 'Smith']
Or
df.query('LastName == "Smith"').index
Both return all row indices where LastName is Smith:
Int64Index([1], dtype='int64')
df.loc[df.LastName == 'Smith']
will return the row
   ClientID LastName
1        67    Smith
and
df.loc[df.LastName == 'Smith'].index
will return the index
Int64Index([1], dtype='int64')
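Note that the index holds row labels, which match row positions only when the dataframe uses the default 0..n-1 index. The sketch below assumes a hypothetical variant of the sample data with custom index labels to show the difference. Using `np.flatnonzero` for positions is one possible approach, not something taken from the answers above:

```python
import numpy as np
import pandas as pd

# Hypothetical variant of the sample data with non-default index labels.
df = pd.DataFrame(
    {"ClientID": [34, 67, 53], "LastName": ["Johnson", "Smith", "Brows"]},
    index=[10, 20, 30],
)

mask = df["LastName"] == "Smith"

# Index labels of the matching rows, as a plain Python list.
labels = df.index[mask].tolist()

# Zero-based positions of the matching rows, independent of the labels.
positions = np.flatnonzero(mask.to_numpy()).tolist()

print(labels, positions)
```

If exactly one match is expected, `labels[0]` or `positions[0]` yields a single integer, but it is worth checking the list length first since the value may be absent or repeated.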
NOTE: The column names 'LastName', 'Last Name', and 'lastname' are three distinct names. The best practice is to check the exact name first using df.columns. If you really need to strip all spaces from the column names, you can first do
df.columns = [x.strip().replace(' ', '') for x in df.columns]
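As a quick illustration of that cleanup, here is a sketch that assumes a hypothetical dataframe whose column names contain stray spaces (the messy names are invented for the example):

```python
import pandas as pd

# Hypothetical dataframe whose column names carry stray spaces.
df = pd.DataFrame({
    " Client ID": [34, 67, 53],
    "Last Name ": ["Johnson", "Smith", "Brows"],
})

# Inspect the exact names before filtering on them.
original_names = list(df.columns)

# Remove every space from each column name.
df.columns = [x.strip().replace(" ", "") for x in df.columns]

# The lookup from the answers above now works with the cleaned name.
rows = df.index[df["LastName"] == "Smith"].tolist()

print(original_names, list(df.columns), rows)
```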
Source: Stackoverflow.com