[python] Pandas get the most frequent values of a column

i have this dataframe:

0 name data
1 alex asd
2 helen sdd
3 alex dss
4 helen sdsd
5 john sdadd

so i am trying to get the most frequent value or values(in this case its values) so what i do is:

dataframe['name'].value_counts().idxmax()

but it returns only the value: Alex even if it Helen appears two times as well.

This question is related to python pandas dataframe

The answer is


n is used to get the number of top frequent used items

n = 2

a=dataframe['name'].value_counts()[:n].index.tolist()

dataframe["name"].value_counts()[a]

df['name'].value_counts()[:5].sort_values(ascending=False)

The value_counts will return a count object of pandas.core.series.Series and sort_values(ascending=False) will get you the highest values first.


my best solution to get the first is

 df['my_column'].value_counts().sort_values(ascending=False).argmax()

Use:

df['name'].mode()

or

df['name'].value_counts().idxmax()

Here's one way:

df['name'].value_counts()[df['name'].value_counts() == df['name'].value_counts().max()]

which prints:

helen    2
alex     2
Name: name, dtype: int64

Simply use this..

dataframe['name'].value_counts().nlargest(n)

The functions for frequencies largest and smallest are:

  • nlargest() for mostfrequent 'n' values
  • nsmallest() for least frequent 'n' values

To get the top five most common names:

dataframe['name'].value_counts().head()

You could use .apply and pd.value_counts to get a count the occurrence of all the names in the name column.

dataframe['name'].apply(pd.value_counts)

You could try argmax like this:

dataframe['name'].value_counts().argmax() Out[13]: 'alex'

The value_counts will return a count object of pandas.core.series.Series and argmax could be used to achieve the key of max values.


You can use this to get a perfect count, it calculates the mode a particular column

df['name'].value_counts()

Not Obvious, But Fast

f, u = pd.factorize(df.name.values)
counts = np.bincount(f)
u[counts == counts.max()]

array(['alex', 'helen'], dtype=object)

To get the n most frequent values, just subset .value_counts() and grab the index:

# get top 10 most frequent names
n = 10
dataframe['name'].value_counts()[:n].index.tolist()

to get top 5:

dataframe['name'].value_counts()[0:5]

Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to pandas

xlrd.biffh.XLRDError: Excel xlsx file; not supported Pandas Merging 101 How to increase image size of pandas.DataFrame.plot in jupyter notebook? Trying to merge 2 dataframes but get ValueError Python Pandas User Warning: Sorting because non-concatenation axis is not aligned How to show all of columns name on pandas dataframe? Pandas/Python: Set value of one column based on value in another column Python Pandas - Find difference between two data frames Pandas get the most frequent values of a column Python convert object to float

Examples related to dataframe

Trying to merge 2 dataframes but get ValueError How to show all of columns name on pandas dataframe? Python Pandas - Find difference between two data frames Pandas get the most frequent values of a column Display all dataframe columns in a Jupyter Python Notebook How to convert column with string type to int form in pyspark data frame? Display/Print one column from a DataFrame of Series in Pandas Binning column with python pandas Selection with .loc in python Set value to an entire column of a pandas dataframe