Find the most common element in a list

Question

What is an efficient way to find the most common element in a Python list? My list items may not be hashable, so I can't use a dictionary. Also, in case of draws, the item with the lowest index should be returned. Example:

    >>> most_common(['duck', 'duck', 'goose'])
    'duck'
    >>> most_common(['goose', 'duck', 'duck', 'goose'])
    'goose'

User · Answer

This is the obvious slow solution (O(n^2)) if neither sorting nor hashing is feasible, but equality comparison (==) is available:

    def most_common(items):
        if not items:
            raise ValueError
        fitems = []  # entries are [item, count, order of first appearance]
        best_idx = 0
        for item in items:
            item_missing = True
            i = 0
            for fitem in fitems:
                if fitem[0] == item:
                    fitem[1] += 1
                    d = fitem[1] - fitems[best_idx][1]
                    if d > 0 or (d == 0 and fitems[best_idx][2] > fitem[2]):
                        best_idx = i
                    item_missing = False
                    break
                i += 1
            if item_missing:
                fitems.append([item, 1, i])
        # best_idx is a position in fitems, not in items
        return fitems[best_idx][0]

But making your items hashable or sortable (as recommended by other answers) would almost always make finding the most common element faster if the length of your list (n) is large: O(n) on average with hashing, and O(n log n) at worst for sorting.
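As a rough sketch of the hashing route mentioned above, assuming the unhashable items are lists whose contents are themselves hashable, each item can be mapped to a tuple key before counting. The name most_common_hashed and its key parameter are made up for illustration and are not part of any library:

```python
from collections import Counter

def most_common_hashed(items, key=tuple):
    """Return the most common item, counting by a hashable key (hypothetical helper)."""
    if not items:
        raise ValueError("items must not be empty")
    keys = [key(item) for item in items]
    counts = Counter(keys)
    # max() returns the first maximal element it meets, so scanning positions
    # in their original order breaks ties in favour of the lowest index.
    best = max(range(len(items)), key=lambda i: counts[keys[i]])
    return items[best]

print(most_common_hashed([[1, 2], [3], [3], [1, 2]]))  # [1, 2]
```

Counting is O(n) on average here, at the cost of building one key per item.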

User · Answer

    def most_common(lst):
        if max([lst.count(i) for i in lst]) == 1:
            return False
        else:
            return max(set(lst), key=lst.count)

User · Answer

Building on Luiz's answer, but satisfying the "in case of draws the item with the lowest index should be returned" condition:

    from statistics import mode, StatisticsError

    def most_common(l):
        try:
            return mode(l)
        except StatisticsError as e:
            # will only return the first element if no unique mode found
            if 'no unique mode' in e.args[0]:
                return l[0]
            # this is for "StatisticsError: no mode for empty data"
            # after calling mode([])
            raise

Example:

    >>> most_common(['a', 'b', 'b'])
    'b'
    >>> most_common([1, 2])
    1
    >>> most_common([])
    StatisticsError: no mode for empty data

User · Answer

This is an O(n) solution:

    mydict = {}
    cnt, itm = 0, ''
    for item in reversed(lst):
        mydict[item] = mydict.get(item, 0) + 1
        if mydict[item] >= cnt:
            cnt, itm = mydict[item], item

    print(itm)

(reversed is used to make sure that it returns the lowest index item.)
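The same single-pass idea can also be written without reversed, by recording each item's first index explicitly. This is only a sketch under the assumption that the items are hashable; the function name most_common_first_index is made up for illustration:

```python
def most_common_first_index(lst):
    """Most common hashable item; ties go to the lowest first index (sketch)."""
    if not lst:
        raise ValueError("lst must not be empty")
    stats = {}  # item -> [count, index of first occurrence]
    for index, item in enumerate(lst):
        if item in stats:
            stats[item][0] += 1
        else:
            stats[item] = [1, index]
    # Highest count wins; among equal counts, the smaller first index wins.
    return max(stats, key=lambda item: (stats[item][0], -stats[item][1]))

print(most_common_first_index(['goose', 'duck', 'duck', 'goose']))  # goose
```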

User · Answer

A one-liner:

    def most_common(lst):
        return max(((item, lst.count(item)) for item in set(lst)), key=lambda a: a[1])[0]

User · Answer

Without the requirement about the lowest index, you can use collections.Counter for this:

    from collections import Counter

    a = [1936, 2401, 2916, 4761, 9216, 9216, 9604, 9801]

    c = Counter(a)

    print(c.most_common(1))  # the one most common element... 2 would mean the 2 most common
    # [(9216, 2)]  a list containing a tuple of the element and its count in 'a'

User · Answer

Use Decorate-Sort-Undecorate to solve the problem:

    def most_common(iterable):
        # Make a list with tuples: (item, index)
        # The index will be used later to break ties for most common item.
        lst = [(x, i) for i, x in enumerate(iterable)]
        lst.sort()

        # lst_final will also be a list of tuples: (count, index, item)
        # Sorting on this list will find us the most common item, and the index
        # will break ties so the one listed first wins.  Count is negative so
        # largest count will have lowest value and sort first.
        lst_final = []

        # Get an iterator for our new list...
        itr = iter(lst)

        # ...and pop the first tuple off.  Setup current state vars for loop.
        count = 1
        tup = next(itr)
        x_cur, i_cur = tup

        # Loop over sorted list of tuples, counting occurrences of item.
        for tup in itr:
            # Same item again?
            if x_cur == tup[0]:
                # Yes, same item; increment count
                count += 1
            else:
                # No, new item, so write previous current item to lst_final...
                t = (-count, i_cur, x_cur)
                lst_final.append(t)
                # ...and reset current state vars for loop.
                x_cur, i_cur = tup
                count = 1

        # Write final item after loop ends
        t = (-count, i_cur, x_cur)
        lst_final.append(t)

        lst_final.sort()
        answer = lst_final[0][2]

        return answer

    print(most_common(['x', 'e', 'a', 'e', 'a', 'e', 'e']))  # prints 'e'
    print(most_common(['goose', 'duck', 'duck', 'goose']))  # prints 'goose'

User · Answer

    >>> li = ['goose', 'duck', 'duck']
    >>> def foo(li):
    ...     st = set(li)
    ...     mx = -1
    ...     for each in st:
    ...         temp = li.count(each)
    ...         if mx < temp:
    ...             mx = temp
    ...             h = each
    ...     return h
    ...
    >>> foo(li)
    'duck'

User · Answer

Borrowing from here, this can be used with Python 2.7:

    from collections import Counter

    def Most_Common(lst):
        data = Counter(lst)
        return data.most_common(1)[0][0]

Works around 4-6 times faster than Alex's solutions, and is 50 times faster than the one-liner proposed by newacct.

To retrieve the element that occurs first in the list in case of ties:

    def most_common(lst):
        data = Counter(lst)
        return max(lst, key=data.get)

User · Answer

Here:

    def most_common(l):
        max = 0
        maxitem = None
        for x in set(l):
            count = l.count(x)
            if count > max:
                max = count
                maxitem = x
        return maxitem

I have a vague feeling there is a method somewhere in the standard library that will give you the count of each element, but I can't find it.

User · Answer

Hi, this is a very simple solution. Note that it is O(n^2) rather than O(n), because L.count(i) scans the whole list on every iteration:

    L = [1, 4, 7, 5, 5, 4, 5]

    def mode_f(L):
        counter = 0
        number = L[0]
        for i in L:
            amount_times = L.count(i)
            if amount_times > counter:
                counter = amount_times
                number = i

        return number

Here number is the element in the list that repeats most often.

User · Answer

A simpler one-liner:

    def most_common(lst):
        return max(set(lst), key=lst.count)

User · Answer

You probably don't need this anymore, but this is what I did for a similar problem. (It looks longer than it is because of the comments.)

    itemList = ['hi', 'hi', 'hello', 'bye']

    counter = {}
    maxItemCount = 0
    for item in itemList:
        try:
            # Referencing this will cause a KeyError exception
            # if it doesn't already exist
            counter[item]
            # ... meaning if we get this far it didn't happen so
            # we'll increment
            counter[item] += 1
        except KeyError:
            # If we got a KeyError we need to create the
            # dictionary key
            counter[item] = 1

        # Keep overwriting maxItemCount with the latest number,
        # if it's higher than the existing itemCount
        if counter[item] > maxItemCount:
            maxItemCount = counter[item]
            mostPopularItem = item

    print(mostPopularItem)

User · Answer

    def mostCommonElement(list):
        count = {}     # dict holder
        max = 0        # keep track of the count by key
        result = None  # holder when count is greater than max
        for i in list:
            if i not in count:
                count[i] = 1
            else:
                count[i] += 1
            if count[i] > max:
                max = count[i]
                result = i
        return result

    mostCommonElement(["a", "b", "a", "c"])  # -> "a"

User · Answer

With so many solutions proposed, I'm amazed nobody's proposed what I'd consider an obvious one (for non-hashable but comparable elements): itertools.groupby. itertools offers fast, reusable functionality, and lets you delegate some tricky logic to well-tested standard library components. Consider for example:

    import itertools
    import operator

    def most_common(L):
        # get an iterable of (item, iterable) pairs
        SL = sorted((x, i) for i, x in enumerate(L))
        # print('SL:', SL)
        groups = itertools.groupby(SL, key=operator.itemgetter(0))
        # auxiliary function to get "quality" for an item
        def _auxfun(g):
            item, iterable = g
            count = 0
            min_index = len(L)
            for _, where in iterable:
                count += 1
                min_index = min(min_index, where)
            # print('item %r, count %r, minind %r' % (item, count, min_index))
            return count, -min_index
        # pick the highest-count/earliest item
        return max(groups, key=_auxfun)[0]

This could be written more concisely, of course, but I'm aiming for maximal clarity. The two print statements can be uncommented to better see the machinery in action; for example, with prints uncommented:

    print(most_common(['goose', 'duck', 'duck', 'goose']))

emits:

    SL: [('duck', 1), ('duck', 2), ('goose', 0), ('goose', 3)]
    item 'duck', count 2, minind 1
    item 'goose', count 2, minind 0
    goose

As you see, SL is a list of pairs, each pair an item followed by the item's index in the original list (to implement the key condition that, if the "most common" items with the same highest count are > 1, the result must be the earliest-occurring one).

groupby groups by the item only (via operator.itemgetter). The auxiliary function, called once per grouping during the max computation, receives and internally unpacks a group: a tuple with two items (item, iterable), where the iterable's items are also two-item tuples, (item, original index) (the items of SL).

Then the auxiliary function uses a loop to determine both the count of entries in the group's iterable and the minimum original index; it returns those as a combined "quality key", with the min index sign-changed so the max operation will consider "better" those items that occurred earlier in the original list.

This code could be much simpler if it worried a little less about big-O issues in time and space, e.g.:

    def most_common(L):
        groups = itertools.groupby(sorted(L))
        def _auxfun(g):
            item, iterable = g
            return len(list(iterable)), -L.index(item)
        return max(groups, key=_auxfun)[0]

Same basic idea, just expressed more simply and compactly... but, alas, at the cost of an extra O(N) auxiliary space (to embody the groups' iterables to lists) and O(N squared) time (to get the L.index of every item). While premature optimization is the root of all evil in programming, deliberately picking an O(N squared) approach when an O(N log N) one is available just goes too much against the grain of scalability!-)

Finally, for those who prefer "oneliners" to clarity and performance, a bonus 1-liner version with suitably mangled names :-)

    from itertools import groupby as g
    def most_common_oneliner(L):
        return max(g(sorted(L)), key=lambda xv: (len(list(xv[1])), -L.index(xv[0])))[0]

User · Answer

Simple one-line solution:

    moc = max([(lst.count(chr), chr) for chr in set(lst)])

It will return the most frequent element together with its frequency.

User · Answer

If they are not hashable, you can sort them and do a single loop over the result counting the items (identical items will be next to each other). But it might be faster to make them hashable and use a dict.

    def most_common(lst):
        cur_length = 0
        max_length = 0
        cur_i = 0
        max_i = 0
        cur_item = None
        max_item = None
        for i, item in sorted(enumerate(lst), key=lambda x: x[1]):
            if cur_item is None or cur_item != item:
                if cur_length > max_length or (cur_length == max_length and cur_i < max_i):
                    max_length = cur_length
                    max_i = cur_i
                    max_item = cur_item
                cur_length = 1
                cur_i = i
                cur_item = item
            else:
                cur_length += 1
        if cur_length > max_length or (cur_length == max_length and cur_i < max_i):
            return cur_item
        return max_item

User · Answer

What you want is known in statistics as the mode, and Python's standard library of course has a function to do exactly that for you:

    >>> from statistics import mode
    >>> mode([1, 2, 2, 3, 3, 3, 3, 3, 4, 5, 6, 6, 6])
    3

Note that if there is no "most common element", such as cases where the top two are tied, this will raise StatisticsError on Python versions before 3.8, because statistically speaking, there is no mode in this case. From Python 3.8 onward, mode returns the first mode encountered instead.
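For the tie-breaking rule in the question, one possible sketch uses statistics.multimode, which is available from Python 3.8 and returns the most frequent values in the order they were first encountered. It assumes the items are hashable; the wrapper name most_common is just for illustration:

```python
from statistics import multimode

def most_common(data):
    """Return the earliest of the most frequent values (sketch; Python 3.8+)."""
    modes = multimode(data)  # modes in order of first appearance
    if not modes:
        raise ValueError("no mode for empty data")
    return modes[0]

print(most_common(['goose', 'duck', 'duck', 'goose']))  # goose
```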

User · Answer

Sort a copy of the list and find the longest run. You can decorate the list with the index of each element before sorting it, and then choose the run that starts with the lowest index in the case of a tie.
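A minimal sketch of that description, assuming the items can be ordered with < and compared with ==. The function and variable names are illustrative only:

```python
def most_common(lst):
    """Longest run in a sorted, index-decorated copy (sketch; items must be orderable)."""
    if not lst:
        raise ValueError("lst must not be empty")
    # Decorate each item with its original index, then sort by (item, index).
    decorated = sorted((item, index) for index, item in enumerate(lst))
    best_item, best_start = decorated[0]
    best_len = 0
    run_item, run_start = decorated[0]
    run_len = 0
    for item, index in decorated:
        if item == run_item:
            run_len += 1
        else:
            # A new run begins; its first entry carries the run's lowest index.
            run_item, run_start, run_len = item, index, 1
        # A longer run wins; an equally long run wins only if it starts earlier.
        if run_len > best_len or (run_len == best_len and run_start < best_start):
            best_item, best_start, best_len = run_item, run_start, run_len
    return best_item

print(most_common(['goose', 'duck', 'duck', 'goose']))  # goose
```

The sort dominates the cost, so this runs in O(n log n) time with O(n) extra space for the decorated copy.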

User · Answer

    def popular(L):
        C = {}
        for a in L:
            C[a] = L.count(a)
        for b in C.keys():
            if C[b] == max(C.values()):
                return b

    L = [2, 3, 5, 3, 6, 3, 6, 3, 6, 3, 7, 467, 4, 7, 4]
    print(popular(L))

User · Answer

I needed to do this in a recent program. I'll admit it, I couldn't understand Alex's answer, so this is what I ended up with:

    def mostPopular(l):
        mpEl = None
        mpIndex = 0
        mpCount = 0
        curEl = None
        curCount = 0
        for i, el in sorted(enumerate(l), key=lambda x: (x[1], x[0]), reverse=True):
            curCount = curCount + 1 if el == curEl else 1
            curEl = el
            if curCount > mpCount \
            or (curCount == mpCount and i < mpIndex):
                mpEl = curEl
                mpIndex = i
                mpCount = curCount
        return mpEl, mpCount, mpIndex

I timed it against Alex's solution and it's about 10-15% faster for short lists, but once you go over 100 elements or more (tested up to 200000) it's about 20% slower.

User · Answer

I am doing this using the scipy.stats module and a lambda:

    import scipy.stats

    lst = [1, 2, 3, 4, 5, 6, 7, 5]
    most_freq_val = lambda x: scipy.stats.mode(x)[0][0]
    print(most_freq_val(lst))

Result:

    most_freq_val = 5

User · Answer

    ans = [1, 1, 0, 0, 1, 1]
    all_ans = {ans.count(ans[i]): ans[i] for i in range(len(ans))}
    print(all_ans)                 # {4: 1, 2: 0}
    max_key = max(all_ans.keys())  # 4
    print(all_ans[max_key])        # 1
