Find the item with maximum occurrences in a list

Question

In Python, I have a list:

    L = [1, 2, 45, 55, 5, 4, 4, 4, 4, 4, 4, 5456, 56, 6, 7, 67]

I want to identify the item that occurs the highest number of times. I am able to solve it, but I need the fastest way to do so. I know there is a nice Pythonic answer to this.

User · Answer

    from collections import Counter
    most_common, num_most_common = Counter(L).most_common(1)[0]  # 4, 6 times

For older Python versions (< 2.7), you can use this recipe to create the Counter class.
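As a minimal sketch, the Counter approach can be wrapped in a small helper. The name most_frequent and the choice to return None for an empty list are illustrative assumptions, not part of the answer above.

```python
from collections import Counter

# The list from the question.
L = [1, 2, 45, 55, 5, 4, 4, 4, 4, 4, 4, 5456, 56, 6, 7, 67]


def most_frequent(items):
    """Return (item, count) for the most common item, or None if items is empty.

    Hypothetical helper name; the empty-input behaviour is an assumption.
    """
    counts = Counter(items)
    if not counts:
        return None
    # most_common(1) returns a one-element list of (item, count) pairs.
    return counts.most_common(1)[0]


print(most_frequent(L))  # (4, 6)
```

Without the emptiness check, indexing the result of most_common(1) with [0] raises IndexError on an empty list.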

User · Answer

My (simple) code (three months studying Python):

    def more_frequent_item(lst):
        new_lst = []
        times = 0
        for item in lst:
            count_num = lst.count(item)
            new_lst.append(count_num)
            times = max(new_lst)
        key = max(lst, key=lst.count)
        print("In the list: ")
        print(lst)
        print("The most frequent item is " + str(key) + ". Appears " + str(times) + " times in this list.")

    more_frequent_item([1, 2, 45, 55, 5, 4, 4, 4, 4, 4, 4, 5456, 56, 6, 7, 67])

The output will be:

    In the list:
    [1, 2, 45, 55, 5, 4, 4, 4, 4, 4, 4, 5456, 56, 6, 7, 67]
    The most frequent item is 4. Appears 6 times in this list.

User · Answer

I obtained the best results with groupby from the itertools module, with this function using Python 3.5.2:

    from itertools import groupby

    a = [1, 2, 45, 55, 5, 4, 4, 4, 4, 4, 4, 5456, 56, 6, 7, 67]

    def occurrence():
        occurrence, num_times = 0, 0
        # groupby only groups consecutive equal items
        for key, values in groupby(a, lambda x: x):
            val = len(list(values))
            # compare against the best run length so far, not against the key
            if val > num_times:
                occurrence, num_times = key, val
        return occurrence, num_times

    occurrence, num_times = occurrence()
    print("%d occurred %d times which is the highest number of times" % (occurrence, num_times))

Output:

    4 occurred 6 times which is the highest number of times

Test with timeit from the timeit module. I used this script for my test with number=20000:

    from itertools import groupby

    def occurrence():
        a = [1, 2, 45, 55, 5, 4, 4, 4, 4, 4, 4, 5456, 56, 6, 7, 67]
        occurrence, num_times = 0, 0
        for key, values in groupby(a, lambda x: x):
            val = len(list(values))
            if val > num_times:
                occurrence, num_times = key, val
        return occurrence, num_times

    if __name__ == '__main__':
        from timeit import timeit
        print(timeit("occurrence()", setup="from __main__ import occurrence", number=20000))

Output (the best one):

    0.1893607140000313
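One caveat worth illustrating: itertools.groupby groups only consecutive equal items, so without sorting it measures the longest run rather than the total count. The sketch below uses hypothetical helper names (longest_run, most_frequent_sorted) and a made-up list in which the equal items are scattered.

```python
from itertools import groupby


def longest_run(seq):
    """Return (item, length) of the longest run of consecutive equal items."""
    best = (None, 0)
    for key, group in groupby(seq):
        n = sum(1 for _ in group)  # length of this run
        if n > best[1]:
            best = (key, n)
    return best


def most_frequent_sorted(seq):
    """Sort first so equal items are adjacent, then take the longest run."""
    return longest_run(sorted(seq))


scattered = [4, 1, 4, 2, 4, 3, 3]
print(longest_run(scattered))           # (3, 2): the 4s are not adjacent
print(most_frequent_sorted(scattered))  # (4, 3)
```

Sorting costs O(n log n) and requires the items to be mutually comparable, which is a trade-off against the dictionary-based approaches.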

User · Answer

Perhaps the most_common() method?

User · Answer

If you are using numpy in your solution, use this for faster computation:

    import numpy as np
    x = np.array([2, 5, 77, 77, 77, 77, 77, 77, 77, 9, 0, 3, 3, 3, 3, 3])
    # bincount only accepts non-negative integers
    y = np.bincount(x, minlength=max(x))
    y = np.argmax(y)
    print(y)  # outputs 77

User · Answer

A simple way without any libraries or sets:

    def mcount(l):
        n = []                      # To store count of each element
        for x in l:
            count = 0
            for i in range(len(l)):
                if x == l[i]:
                    count += 1
            n.append(count)
        a = max(n)                  # largest in counts list
        for i in range(len(n)):
            if n[i] == a:
                return (l[i], a)    # element, frequency
        return                      # if something goes wrong
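The nested loops above compare every element with every other element. As a hedged alternative that still needs no imports, a plain dict can collect the counts in a single pass. The function name most_frequent_no_imports and the None return for empty input are assumptions made for this sketch.

```python
def most_frequent_no_imports(items):
    """Return (item, count) using one pass over items and a plain dict."""
    counts = {}
    for item in items:
        # dict.get supplies 0 the first time an item is seen
        counts[item] = counts.get(item, 0) + 1
    if not counts:
        return None
    best = max(counts, key=counts.get)  # key with the largest count
    return best, counts[best]


L = [1, 2, 45, 55, 5, 4, 4, 4, 4, 4, 4, 5456, 56, 6, 7, 67]
print(most_frequent_no_imports(L))  # (4, 6)
```

This requires the items to be hashable, which the nested-loop version does not.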

User · Answer

The following is the solution I came up with for the case where multiple characters in the string all have the highest frequency.

    mystr = input("enter string: ")
    # define dictionary to store characters and their frequencies
    mydict = {}
    # get the unique characters
    unique_chars = sorted(set(mystr), key=mystr.index)
    # store the characters and their respective frequencies in the dictionary
    for c in unique_chars:
        ctr = 0
        for d in mystr:
            if d != " " and d == c:
                ctr = ctr + 1
        mydict[c] = ctr
    print(mydict)
    # store the maximum frequency
    max_freq = max(mydict.values())
    print("the highest frequency of occurrence: ", max_freq)
    # print all characters with highest frequency
    print("the characters are:")
    for k, v in mydict.items():
        if v == max_freq:
            print(k)

Input: hello people

Output:

    {'o': 2, 'p': 2, 'h': 1, ' ': 0, 'e': 3, 'l': 3}
    the highest frequency of occurrence:  3
    the characters are:
    e
    l
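The same idea, returning every item that shares the highest frequency, can be sketched more compactly with collections.Counter. The function name all_most_frequent and its return shape (a list of items plus the shared count) are illustrative assumptions.

```python
from collections import Counter


def all_most_frequent(items):
    """Return (list_of_items, count) for every item tied at the top frequency."""
    counts = Counter(items)
    if not counts:
        return [], 0
    top = max(counts.values())
    # Keep every item whose count equals the maximum.
    winners = [item for item, n in counts.items() if n == top]
    return winners, top


# Spaces are removed first, mirroring the answer above, which does not count them.
print(all_most_frequent("hello people".replace(" ", "")))

tied = [1, 2, 45, 55, 5, 4, 4, 4, 4, 4, 4, 5456, 7, 7, 7, 7, 7, 56, 6, 7, 67]
print(all_most_frequent(tied))
```

For the string this reports 'e' and 'l' with a count of 3, and for the list it reports both 4 and 7 with a count of 6.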

User · Answer

I am surprised no-one has mentioned the simplest solution, max() with the key list.count:

    max(lst, key=lst.count)

Example:

    >>> lst = [1, 2, 45, 55, 5, 4, 4, 4, 4, 4, 4, 5456, 56, 6, 7, 67]
    >>> max(lst, key=lst.count)
    4

This works in Python 3 or 2, but note that it only returns the most frequent item and not also the frequency. Also, in the case of a draw (i.e. joint most frequent item) only a single item is returned.

Although the time complexity of using max() is worse than using Counter.most_common(1), as PM 2Ring comments, the approach benefits from a rapid C implementation, and I find this approach is fastest for short lists but slower for larger ones (Python 3.6 timings shown in IPython 5.3):

    In [1]: from collections import Counter
       ...:
       ...: def f1(lst):
       ...:     return max(lst, key = lst.count)
       ...:
       ...: def f2(lst):
       ...:     return Counter(lst).most_common(1)
       ...:
       ...: lst0 = [1,2,3,4,3]
       ...: lst1 = lst0[:] * 100
       ...:

    In [2]: %timeit -n 10 f1(lst0)
    10 loops, best of 3: 3.32 us per loop

    In [3]: %timeit -n 10 f2(lst0)
    10 loops, best of 3: 26 us per loop

    In [4]: %timeit -n 10 f1(lst1)
    10 loops, best of 3: 4.04 ms per loop

    In [5]: %timeit -n 10 f2(lst1)
    10 loops, best of 3: 75.6 us per loop

User · Answer

If you're using Python 3.4 or above, you can use statistics.mode():

    >>> import statistics
    >>> L = [1, 2, 45, 55, 5, 4, 4, 4, 4, 4, 4, 5456, 56, 6, 7, 67]
    >>> statistics.mode(L)
    4

Note that this will throw a statistics.StatisticsError if the list is empty. In Python versions before 3.8 it also throws if there is not exactly one most common value; from 3.8 onwards it returns the first such value encountered.
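When ties matter, statistics.multimode (added in Python 3.8) returns every most common value instead of a single one. The sketch below assumes Python 3.8 or later; the list named tied is made up for illustration.

```python
import statistics

L = [1, 2, 45, 55, 5, 4, 4, 4, 4, 4, 4, 5456, 56, 6, 7, 67]
tied = [1, 1, 2, 2, 3]  # hypothetical data with two most common values

print(statistics.mode(L))          # 4
print(statistics.multimode(tied))  # [1, 2]
print(statistics.multimode([]))    # []

try:
    statistics.mode([])
except statistics.StatisticsError as exc:
    # mode() still raises on empty input
    print("empty input:", exc)
```

Unlike mode(), multimode() returns an empty list for empty input rather than raising.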

User · Answer

In your question, you asked for the fastest way to do it. As has been demonstrated repeatedly, particularly with Python, intuition is not a reliable guide: you need to measure.

Here's a simple test of several different implementations (Python 2):

    import sys
    from collections import Counter, defaultdict
    from itertools import groupby
    from operator import itemgetter
    from timeit import timeit

    L = [1,2,45,55,5,4,4,4,4,4,4,5456,56,6,7,67]

    def max_occurrences_1a(seq=L):
        "dict iteritems"
        c = dict()
        for item in seq:
            c[item] = c.get(item, 0) + 1
        return max(c.iteritems(), key=itemgetter(1))

    def max_occurrences_1b(seq=L):
        "dict items"
        c = dict()
        for item in seq:
            c[item] = c.get(item, 0) + 1
        return max(c.items(), key=itemgetter(1))

    def max_occurrences_2(seq=L):
        "defaultdict iteritems"
        c = defaultdict(int)
        for item in seq:
            c[item] += 1
        return max(c.iteritems(), key=itemgetter(1))

    def max_occurrences_3a(seq=L):
        "sort groupby generator expression"
        return max(((k, sum(1 for i in g)) for k, g in groupby(sorted(seq))), key=itemgetter(1))

    def max_occurrences_3b(seq=L):
        "sort groupby list comprehension"
        return max([(k, sum(1 for i in g)) for k, g in groupby(sorted(seq))], key=itemgetter(1))

    def max_occurrences_4(seq=L):
        "counter"
        return Counter(seq).most_common(1)[0]

    versions = [max_occurrences_1a, max_occurrences_1b,
                max_occurrences_2, max_occurrences_3a,
                max_occurrences_3b, max_occurrences_4]

    print sys.version, "\n"

    for vers in versions:
        print vers.__doc__, vers(), timeit(vers, number=20000)

The results on my machine:

    2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34)
    [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)]

    dict iteritems (4, 6) 0.202214956284
    dict items (4, 6) 0.208412885666
    defaultdict iteritems (4, 6) 0.221301078796
    sort groupby generator expression (4, 6) 0.383440971375
    sort groupby list comprehension (4, 6) 0.402786016464
    counter (4, 6) 0.564319133759

So it appears that the Counter solution is not the fastest. And, in this case at least, groupby is faster. defaultdict is good, but you pay a little bit for its convenience; it's slightly faster to use a regular dict with a get.

What happens if the list is much bigger? Adding L *= 10000 to the test above and reducing the repeat count to 200:

    dict iteritems (4, 60000) 10.3451900482
    dict items (4, 60000) 10.2988479137
    defaultdict iteritems (4, 60000) 5.52838587761
    sort groupby generator expression (4, 60000) 11.9538850784
    sort groupby list comprehension (4, 60000) 12.1327362061
    counter (4, 60000) 14.7495789528

Now defaultdict is the clear winner. So perhaps the cost of the 'get' method and the loss of the in-place add adds up (an examination of the generated code is left as an exercise).

But with the modified test data, the number of unique item values did not change, so presumably dict and defaultdict have an advantage there over the other implementations. So what happens if we use the bigger list but substantially increase the number of unique items? Replacing the initialization of L with:

    LL = [1,2,45,55,5,4,4,4,4,4,4,5456,56,6,7,67]
    L = []
    for i in xrange(1,10001):
        L.extend(l * i for l in LL)

gives:

    dict iteritems (2520, 13) 17.9935798645
    dict items (2520, 13) 21.8974409103
    defaultdict iteritems (2520, 13) 16.8289561272
    sort groupby generator expression (2520, 13) 33.853593111
    sort groupby list comprehension (2520, 13) 36.1303369999
    counter (2520, 13) 22.626899004

So now Counter is clearly faster than the groupby solutions but still slower than the iteritems versions of dict and defaultdict.

The point of these examples isn't to produce an optimal solution. The point is that there often isn't one optimal general solution. Plus there are other performance criteria. The memory requirements will differ substantially among the solutions and, as the size of the input goes up, memory requirements may become the overriding factor in algorithm selection.

Bottom line: it all depends and you need to measure.
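The test script above targets Python 2 (iteritems, xrange, the print statement). As a rough sketch for repeating the measurement on Python 3, something like the following could be used. The function names, the benchmark helper, and the default repeat count are assumptions for illustration, and no timings are claimed here because they depend on the machine and Python version.

```python
from collections import Counter, defaultdict
from operator import itemgetter
from timeit import timeit

L = [1, 2, 45, 55, 5, 4, 4, 4, 4, 4, 4, 5456, 56, 6, 7, 67]


def with_dict(seq):
    """Plain dict with get()."""
    c = {}
    for item in seq:
        c[item] = c.get(item, 0) + 1
    return max(c.items(), key=itemgetter(1))


def with_defaultdict(seq):
    """defaultdict(int) with in-place add."""
    c = defaultdict(int)
    for item in seq:
        c[item] += 1
    return max(c.items(), key=itemgetter(1))


def with_counter(seq):
    """collections.Counter."""
    return Counter(seq).most_common(1)[0]


def benchmark(seq, number=1000):
    """Time each candidate on seq; returns {function name: seconds}."""
    candidates = (with_dict, with_defaultdict, with_counter)
    # fn=fn binds the current function to each lambda
    return {fn.__name__: timeit(lambda fn=fn: fn(seq), number=number)
            for fn in candidates}


if __name__ == "__main__":
    for name, seconds in benchmark(L).items():
        print(name, seconds)
```

Running it on lists of different sizes and with different numbers of unique items, as in the answer above, is what shows how the ranking shifts.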

User · Answer

Maybe something like this:

    testList = [1, 2, 3, 4, 2, 2, 1, 4, 4]
    print(max(set(testList), key=testList.count))

User · Answer

Here is a defaultdict solution that will work with Python versions 2.5 and above:

    from collections import defaultdict

    L = [1,2,45,55,5,4,4,4,4,4,4,5456,56,6,7,67]
    d = defaultdict(int)
    for i in L:
        d[i] += 1
    result = max(d.iteritems(), key=lambda x: x[1])
    print result
    # (4, 6)
    # The number 4 occurs 6 times

Note: if L = [1, 2, 45, 55, 5, 4, 4, 4, 4, 4, 4, 5456, 7, 7, 7, 7, 7, 56, 6, 7, 67] then there are six 4s and six 7s. However, the result will be (4, 6), i.e. six 4s.

User · Answer

Simple and best code:

    def max_occ(lst, x):
        count = 0
        for i in lst:
            if (i == x):
                count = count + 1
        return count

    lst = [1, 2, 45, 55, 5, 4, 4, 4, 4, 4, 4, 5456, 56, 6, 7, 67]
    x = max(lst, key=lst.count)
    print(x, "occurs ", max_occ(lst, x), "times")

Output: 4 occurs  6 times

User · Answer

I want to throw in another solution that looks nice and is fast for short lists.

    def mc(seq=L):
        "max/count"
        max_element = max(seq, key=seq.count)
        return (max_element, seq.count(max_element))

You can benchmark this with the code provided by Ned Deily, which will give you these results for the smallest test case:

    3.5.2 (default, Nov  7 2016, 11:31:36)
    [GCC 6.2.1 20160830]

    dict iteritems (4, 6) 0.2069783889998289
    dict items (4, 6) 0.20462976200065896
    defaultdict iteritems (4, 6) 0.2095775119996688
    sort groupby generator expression (4, 6) 0.4473949929997616
    sort groupby list comprehension (4, 6) 0.4367636879997008
    counter (4, 6) 0.3618192010007988
    max/count (4, 6) 0.20328268999946886

But beware: it is inefficient and thus gets really slow for large lists!
