Python Pandas: Counting the Occurrences of a Specific Value

Question

I am trying to find the number of times a certain value appears in one column. I have made the dataframe with

    data = pd.DataFrame.from_csv('data/DataSet2.csv')

and now I want to find the number of times something appears in a column. How is this done?

I thought it was the below, where I am looking in the education column and counting the number of times a value occurs. The code below shows that I am trying to find the number of times 9th appears, and the error is what I get when I run the code.

Code:

    missing2 = df.education.value_counts()['9th']
    print(missing2)

Error:

    KeyError: '9th'

User · Answer

A couple of ways, using count or sum:

    In [338]: df
    Out[338]:
      col1 education
    0    a       9th
    1    b       9th
    2    c       8th

    In [335]: df.loc[df.education == '9th', 'education'].count()
    Out[335]: 2

    In [336]: (df.education == '9th').sum()
    Out[336]: 2

    In [337]: df.query('education == "9th"').education.count()
    Out[337]: 2
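As a minimal sketch of the same idea, the snippet below rebuilds the small sample frame from this answer and also shows a lookup that does not raise when the value is absent. One possible cause of the KeyError in the question is that the exact label '9th' is not present in the column (for instance because of stray whitespace); that is an assumption, not something the question confirms. The variable names and the "10th" label are illustrative only.

```python
import pandas as pd

# Sample frame mirroring the one shown in the answer above.
df = pd.DataFrame({"col1": ["a", "b", "c"],
                   "education": ["9th", "9th", "8th"]})

# Summing a boolean mask counts the True entries.
count_sum = int((df["education"] == "9th").sum())

# value_counts() is indexed by the values actually present, so indexing
# an absent label with [] raises KeyError; .get() returns a default instead.
counts = df["education"].value_counts()
count_get = int(counts.get("9th", 0))
count_missing = int(counts.get("10th", 0))  # "10th" never appears (hypothetical label)

print(count_sum, count_get, count_missing)
```

If the labels might carry surrounding whitespace, calling df["education"].str.strip() before counting is one way to normalise them.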

User · Answer

You can create a subset of the data with your condition and then use shape or len:

    print(df)
      col1 education
    0    a       9th
    1    b       9th
    2    c       8th

    print(df.education == '9th')
    0     True
    1     True
    2    False
    Name: education, dtype: bool

    print(df[df.education == '9th'])
      col1 education
    0    a       9th
    1    b       9th

    print(df[df.education == '9th'].shape[0])
    2
    print(len(df[df['education'] == '9th']))
    2

Performance is interesting: the fastest solution is to compare the underlying NumPy array and sum the result.

Code:

    import string

    import numpy as np
    import pandas as pd
    import perfplot

    np.random.seed(123)

    # Each kernel counts the rows where education == 'a' in a different way.
    def shape(df):
        return df[df.education == 'a'].shape[0]

    def len_df(df):
        return len(df[df['education'] == 'a'])

    def query_count(df):
        return df.query('education == "a"').education.count()

    def sum_mask(df):
        return (df.education == 'a').sum()

    def sum_mask_numpy(df):
        return (df.education.values == 'a').sum()

    # Build a one-column frame of n random ASCII letters.
    def make_df(n):
        L = list(string.ascii_letters)
        df = pd.DataFrame(np.random.choice(L, size=n), columns=['education'])
        return df

    perfplot.show(
        setup=make_df,
        kernels=[shape, len_df, query_count, sum_mask, sum_mask_numpy],
        n_range=[2**k for k in range(2, 25)],
        logx=True,
        logy=True,
        equality_check=False,
        xlabel='len(df)')
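For readers without perfplot installed, a rough comparison can be sketched with the standard-library timeit module. This is not the benchmark from the answer: the frame size (100,000 rows), the number of repetitions, and the function names are assumptions chosen for illustration, and timings will vary by machine and library version.

```python
import string
import timeit

import numpy as np
import pandas as pd

# Random single-letter column, similar in spirit to make_df in the answer.
rng = np.random.default_rng(123)
letters = list(string.ascii_letters)
df = pd.DataFrame({"education": rng.choice(letters, size=100_000)})

def sum_mask(frame):
    # Compare the pandas Series and sum the boolean result.
    return int((frame["education"] == "a").sum())

def sum_mask_numpy(frame):
    # Compare the underlying NumPy array instead of the Series.
    return int((frame["education"].to_numpy() == "a").sum())

def len_subset(frame):
    # Materialise the filtered frame, then take its length.
    return len(frame[frame["education"] == "a"])

for fn in (sum_mask, sum_mask_numpy, len_subset):
    seconds = timeit.timeit(lambda: fn(df), number=20)
    print(f"{fn.__name__}: {seconds:.4f}s for 20 runs")
```

All three functions return the same count; only the amount of intermediate work differs.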

User · Answer

Easy, but not efficient:

    list(df.education).count('9th')

User · Answer

Try this:

    (df['education'] == '9th').sum()

User · Answer

To count a specific value in a column you can use either of the methods below, whichever you prefer.

The first method:

    df.col_name.value_counts().Value_you_are_looking_for

Taking the Titanic dataset as an example:

    df.Sex.value_counts().male

This gives the count of all males on the ship. However, if the value you want to count is numeric, the attribute-style lookup above does not work, because a number is not a valid attribute name. For that case you can use the second method, which counts rows of the data frame that match a condition:

    df[(df['Survived'] == 1) & (df['Sex'] == 'male')].count()

This is not as efficient as value_counts(), but it will help if you want to count values across a data frame. Hope this helps.
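A small sketch of both cases follows. The frame is a made-up stand-in for the Titanic data (the rows are illustrative, not real records), and it assumes the column names Survived and Sex used above. It shows that value_counts() can be queried for a numeric label with .get(), and that a combined condition can be counted by summing the mask.

```python
import pandas as pd

# Tiny made-up stand-in for the Titanic data; values are illustrative only.
df = pd.DataFrame({"Survived": [1, 0, 1, 1, 0],
                   "Sex": ["male", "male", "female", "male", "female"]})

# .get() (or [] indexing) looks up any label, including numbers.
survived_count = int(df["Survived"].value_counts().get(1, 0))

# Combine conditions with &, wrapping each one in parentheses,
# then sum the boolean mask to count matching rows.
male_survivors = int(((df["Survived"] == 1) & (df["Sex"] == "male")).sum())

print(survived_count, male_survivors)
```

Summing the mask returns a single number, whereas calling .count() on the filtered frame returns one non-null count per column.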

User · Answer

An elegant way to count the occurrences of '?' (or any symbol) in any column is to use the built-in isin method of a DataFrame object.

Suppose that we have loaded the 'Automobile' dataset into a df object. We do not know which columns contain the missing-value symbol ('?'), so let's do:

    df.isin(['?']).sum(axis=0)

For DataFrame.isin(values), the official documentation says it returns a boolean DataFrame showing whether each element in the DataFrame is contained in values.

Note that isin accepts an iterable as input, so we need to pass a list containing the target symbol to this function. df.isin(['?']) will return a boolean DataFrame as follows:

       symboling  normalized-losses   make  fuel-type  aspiration-ratio
    0      False               True  False      False             False
    1      False               True  False      False             False
    2      False               True  False      False             False
    3      False              False  False      False             False
    4      False              False  False      False             False
    5      False               True  False      False             False

To count the number of occurrences of the target symbol in each column, take the sum over all the rows of the above DataFrame by specifying axis=0. The final (truncated) result shows what we expect:

    symboling             0
    normalized-losses    41
    ...
    bore                  4
    stroke                4
    compression-ratio     0
    horsepower            2
    peak-rpm              2
    city-mpg              0
    highway-mpg           0
    price                 4
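The same pattern can be tried on a miniature frame. The sketch below uses made-up values (not the real Automobile data) and assumes "?" is the marker being counted; the column names are borrowed from the output above purely for illustration.

```python
import pandas as pd

# Made-up miniature frame using "?" as the marker to count.
df = pd.DataFrame({"make": ["audi", "bmw", "audi"],
                   "price": ["13950", "?", "?"],
                   "bore": ["3.19", "?", "3.50"]})

# isin takes an iterable of target values and returns a boolean frame;
# summing along axis=0 gives one count per column.
per_column = df.isin(["?"]).sum(axis=0)

# Summing the per-column counts gives the total across the whole frame.
total = int(per_column.sum())

print(per_column)
print(total)
```

Passing several symbols, for example df.isin(["?", "NA"]), counts any of them in one pass.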
