Index of duplicate items in a Python list

Question

Does anyone know how I can get the index position of duplicate items in a Python list? I have tried doing this and it keeps giving me only the index of the first occurrence of the item in the list.

    List = ['A', 'B', 'A', 'C', 'E']

I want it to give me:

    index 0: A
    index 2: A

User · Answer

I'll mention the more obvious way of dealing with duplicates in lists. In terms of complexity, dictionaries are the way to go because each lookup is O(1). You can be more clever if you're only interested in duplicates.

    my_list = [1, 1, 2, 3, 4, 5, 5]
    my_dict = {}
    for ind, elem in enumerate(my_list):
        if elem in my_dict:
            my_dict[elem].append(ind)
        else:
            my_dict.update({elem: [ind]})

    for key, value in my_dict.items():  # iteritems() in Python 2
        if len(value) > 1:
            print("key(%s) has indices (%s)" % (key, value))

which prints the following:

    key(1) has indices ([0, 1])
    key(5) has indices ([5, 6])
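As a minimal sketch of the same dictionary idea, the loop can be wrapped in a function that returns only the values occurring more than once. The name `duplicate_indices` is hypothetical (it does not appear in the answer), and the code assumes Python 3 and hashable list items.

```python
def duplicate_indices(items):
    """Map each repeated value to the list of positions where it occurs."""
    positions = {}
    for index, value in enumerate(items):
        # setdefault creates the empty list the first time a value is seen
        positions.setdefault(value, []).append(index)
    # keep only values that were seen at more than one position
    return {value: idxs for value, idxs in positions.items() if len(idxs) > 1}


print(duplicate_indices(['A', 'B', 'A', 'C', 'E']))  # {'A': [0, 2]}
```

Because each element is visited once and each dictionary operation is O(1) on average, the whole pass stays linear in the length of the list.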

User · Answer

In a single line with pandas 1.2.2 and numpy:

    import numpy as np
    import pandas as pd

    idx = np.where(pd.DataFrame(List).duplicated(keep=False))

The argument keep=False will mark every duplicate as True, and np.where() will return an array with the indices where the element in the array was True.
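For readers without pandas, the effect of keep=False can be approximated in plain Python. This is only a sketch of the idea, not how pandas implements it: `duplicated_mask` is a hypothetical helper, and it assumes the items are hashable.

```python
from collections import Counter


def duplicated_mask(items):
    """Return one boolean per item: True if its value occurs more than once."""
    counts = Counter(items)
    return [counts[value] > 1 for value in items]


List = ['A', 'B', 'A', 'C', 'E']
mask = duplicated_mask(List)
# positions where the mask is True, analogous to what np.where reports
idx = [i for i, flag in enumerate(mask) if flag]
print(mask)  # [True, False, True, False, False]
print(idx)   # [0, 2]
```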

User · Answer

I just make it simple:

    i = [1, 2, 1, 3]
    k = 0
    for ii in i:
        if ii == 1:
            print("index of 1 =", k)
        k = k + 1

Output:

    index of 1 = 0
    index of 1 = 2

User · Answer

Wow, everyone's answer is so long. I simply used a pandas dataframe, masking, and the duplicated function (keep=False marks all duplicates as True, not just first or last):

    import pandas as pd
    import numpy as np
    np.random.seed(42)  # make results reproducible

    int_df = pd.DataFrame({'int_list': np.random.randint(1, 20, size=10)})
    dupes = int_df['int_list'].duplicated(keep=False)
    print(int_df['int_list'][dupes].index)

This should return Int64Index([0, 2, 3, 4, 6, 7, 9], dtype='int64').

User · Answer

I think I found a simple solution after a lot of irritation:

    if elem in string_list:
        counter = 0
        elem_pos = []
        for i in string_list:
            if i == elem:
                elem_pos.append(counter)
            counter = counter + 1
        print(elem_pos)

This prints a list giving you the indexes of a specific element, elem.

User · Answer

    from collections import Counter, defaultdict

    def duplicates(lst):
        cnt = Counter(lst)
        return [key for key in cnt.keys() if cnt[key] > 1]

    def duplicates_indices(lst):
        dup, ind = duplicates(lst), defaultdict(list)
        for i, v in enumerate(lst):
            if v in dup:
                ind[v].append(i)
        return ind

    lst = ['a', 'b', 'a', 'c', 'b', 'a', 'e']
    print(duplicates(lst))          # ['a', 'b']
    print(duplicates_indices(lst))  # ..., {'a': [0, 2, 5], 'b': [1, 4]})

A slightly more orthogonal (and thus more useful) implementation would be:

    from collections import Counter, defaultdict

    def duplicates(lst):
        cnt = Counter(lst)
        return [key for key in cnt.keys() if cnt[key] > 1]

    def indices(lst, items=None):
        items, ind = set(lst) if items is None else items, defaultdict(list)
        for i, v in enumerate(lst):
            if v in items:
                ind[v].append(i)
        return ind

    lst = ['a', 'b', 'a', 'c', 'b', 'a', 'e']
    print(indices(lst, duplicates(lst)))  # ..., {'a': [0, 2, 5], 'b': [1, 4]})

User · Answer

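    def find_duplicate(lst):
        duplicate_list = []
        for k in range(len(lst)):
            if lst[k] in duplicate_list:
                continue  # this value has already been reported
            for j in range(len(lst)):
                if k == j:
                    continue
                if lst[k] == lst[j]:
                    if lst[k] not in duplicate_list:
                        duplicate_list.append(lst[k])
                    # print the actual positions; lst.index() would only
                    # ever return the first occurrence
                    print("duplicate " + str(lst[k]) + ": " + str(k) + " " + str(j))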

User · Answer

You want to pass in the optional second parameter to index, the location where you want index to start looking. After you find each match, reset this parameter to the location just after the match that was found.

    def list_duplicates_of(seq, item):
        start_at = -1
        locs = []
        while True:
            try:
                loc = seq.index(item, start_at + 1)
            except ValueError:
                break
            else:
                locs.append(loc)
                start_at = loc
        return locs

    source = "ABABDBAAEDSBQEWBAFLSAFB"
    print(list_duplicates_of(source, 'B'))

Prints:

    [1, 3, 5, 11, 15, 22]

You can find all the duplicates at once in a single pass through source, by using a defaultdict to keep a list of all seen locations for any item, and returning those items that were seen more than once.

    from collections import defaultdict

    def list_duplicates(seq):
        tally = defaultdict(list)
        for i, item in enumerate(seq):
            tally[item].append(i)
        return ((key, locs) for key, locs in tally.items()
                if len(locs) > 1)

    for dup in sorted(list_duplicates(source)):
        print(dup)

Prints:

    ('A', [0, 2, 6, 7, 16, 20])
    ('B', [1, 3, 5, 11, 15, 22])
    ('D', [4, 9])
    ('E', [8, 13])
    ('F', [17, 21])
    ('S', [10, 19])

If you want to do repeated testing for various keys against the same source, you can use functools.partial to create a new function variable, using a "partially complete" argument list, that is, specifying the seq, but omitting the item to search for:

    from functools import partial
    dups_in_source = partial(list_duplicates_of, source)

    for c in "ABDEFS":
        print(c, dups_in_source(c))

Prints:

    A [0, 2, 6, 7, 16, 20]
    B [1, 3, 5, 11, 15, 22]
    D [4, 9]
    E [8, 13]
    F [17, 21]
    S [10, 19]

User · Answer

    import collections

    # L is the input list
    dups = collections.defaultdict(list)
    for i, e in enumerate(L):
        dups[e].append(i)

    for k, v in sorted(dups.items()):  # iteritems() in Python 2
        if len(v) >= 2:
            print('%s: %r' % (k, v))

And extrapolate from there.

User · Answer

    string_list = ['A', 'B', 'C', 'B', 'D', 'B']
    pos_list = []
    for i in range(len(string_list)):
        if string_list[i] == 'B':
            pos_list.append(i)
    print(pos_list)

User · Answer

    >>> def duplicates(lst, item):
    ...     return [i for i, x in enumerate(lst) if x == item]
    ...
    >>> duplicates(List, "A")
    [0, 2]

To get all duplicates, you can use the below method, but it is not very efficient. If efficiency is important you should consider Ignacio's solution instead.

    >>> dict((x, duplicates(List, x)) for x in set(List) if List.count(x) > 1)
    {'A': [0, 2]}

As for solving it using the index method of list instead, that method takes a second optional argument indicating where to start, so you could just repeatedly call it with the previous index plus 1.

    >>> List.index("A")
    0
    >>> List.index("A", 1)
    2

EDIT: Fixed issue raised in comments.
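The "previous index plus 1" idea can be sketched as a small generator. The name `iter_indices` is hypothetical, and the sketch assumes the sequence supports `index(value, start)`, as lists and strings do.

```python
def iter_indices(seq, item):
    """Yield every position of item in seq by repeatedly calling index()."""
    start = 0
    while True:
        try:
            found = seq.index(item, start)
        except ValueError:
            return  # no further occurrence
        yield found
        start = found + 1  # resume just after the previous match


List = ['A', 'B', 'A', 'C', 'E']
print(list(iter_indices(List, 'A')))  # [0, 2]
```

A generator lets the caller stop early, for example after the second occurrence, without scanning the rest of the list.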

User · Answer

    a = [2, 3, 4, 5, 6, 2, 3, 2, 4, 2]
    search = 2
    pos = 0
    positions = []

    while search in a:
        pos += a.index(search)
        positions.append(pos)
        a = a[a.index(search) + 1:]
        pos += 1

    print("search found at:", positions)

User · Answer

Using the new "Counter" class in the collections module, based on lazyr's answer:

    >>> import collections
    >>> def duplicates(n):  # n="123123123"
    ...     counter = collections.Counter(n)  # {'1': 3, '3': 3, '2': 3}
    ...     dups = [i for i in counter if counter[i] != 1]  # ['1', '3', '2']
    ...     result = {}
    ...     for item in dups:
    ...         result[item] = [i for i, j in enumerate(n) if j == item]
    ...     return result
    ...
    >>> duplicates("123123123")
    {'1': [0, 3, 6], '3': [2, 5, 8], '2': [1, 4, 7]}

User · Answer

I made a benchmark of all solutions suggested here and also added another solution to this problem (described at the end of the answer).

Benchmarks

First, the benchmarks. I initialize a list of n random ints within a range [1, n/2] and then call timeit over all algorithms. The solutions of Paul McGuire and Ignacio Vazquez-Abrams work about twice as fast as the rest on the list of 100 ints:

    Testing algorithm on the list of 100 items using 10000 loops
    Algorithm: dupl_eat                Timing: 1.46247477189
    Algorithm: dupl_utdemir            Timing: 2.93324529055
    Algorithm: dupl_lthaulow           Timing: 3.89198786645
    Algorithm: dupl_pmcguire           Timing: 0.583058259784
    Algorithm: dupl_ivazques_abrams    Timing: 0.645062989076
    Algorithm: dupl_rbespal            Timing: 1.06523873786

If you change the number of items to 1000, the difference becomes much bigger (BTW, I'll be happy if someone could explain why):

    Testing algorithm on the list of 1000 items using 1000 loops
    Algorithm: dupl_eat                Timing: 5.46171654555
    Algorithm: dupl_utdemir            Timing: 25.5582547323
    Algorithm: dupl_lthaulow           Timing: 39.284285326
    Algorithm: dupl_pmcguire           Timing: 0.56558489513
    Algorithm: dupl_ivazques_abrams    Timing: 0.615980005148
    Algorithm: dupl_rbespal            Timing: 1.21610942322

On the bigger lists, the solution of Paul McGuire continues to be the most efficient and my algorithm begins having problems:

    Testing algorithm on the list of 1000000 items using 1 loops
    Algorithm: dupl_pmcguire           Timing: 1.5019953958
    Algorithm: dupl_ivazques_abrams    Timing: 1.70856155898
    Algorithm: dupl_rbespal            Timing: 3.95820421595

The full code of the benchmark is here.

Another algorithm

Here is my solution to the same problem:

    def dupl_rbespal(c):
        alreadyAdded = False
        dupl_c = dict()
        # sort incoming list but save the indexes of sorted items
        sorted_ind_c = sorted(range(len(c)), key=lambda x: c[x])

        # loop over indexes of sorted items (xrange in Python 2)
        for i in range(len(c) - 1):
            # if two consecutive indexes point to the same value,
            # add it to the duplicates
            if c[sorted_ind_c[i]] == c[sorted_ind_c[i + 1]]:
                if not alreadyAdded:
                    dupl_c[c[sorted_ind_c[i]]] = [sorted_ind_c[i], sorted_ind_c[i + 1]]
                    alreadyAdded = True
                else:
                    dupl_c[c[sorted_ind_c[i]]].append(sorted_ind_c[i + 1])
            else:
                alreadyAdded = False
        return dupl_c

Although it's not the best, it allowed me to generate a slightly different structure needed for my problem (I needed something like a linked list of indexes of the same value).
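One plausible reason for the widening gap, offered here as an assumption rather than a measured fact about the answers above: an approach that rescans the list for every value does work roughly proportional to n squared, while a single pass with a dictionary does work proportional to n. A minimal sketch of such a comparison with timeit follows. The functions `dupl_quadratic`, `dupl_single_pass`, and `run_benchmark` are hypothetical stand-ins, not the benchmarked implementations, and the loop counts are arbitrary.

```python
import random
import timeit
from collections import defaultdict


def dupl_quadratic(seq):
    """Rescan the whole list for every distinct value (about n*n work)."""
    return {x: [i for i, y in enumerate(seq) if y == x]
            for x in set(seq) if seq.count(x) > 1}


def dupl_single_pass(seq):
    """Visit each element once, collecting positions per value (about n work)."""
    tally = defaultdict(list)
    for i, item in enumerate(seq):
        tally[item].append(i)
    return {key: locs for key, locs in tally.items() if len(locs) > 1}


def run_benchmark(n, loops):
    """Time both functions on n random ints drawn from [1, n // 2]."""
    random.seed(0)  # fixed seed so both functions see the same data
    data = [random.randint(1, max(1, n // 2)) for _ in range(n)]
    return {
        func.__name__: timeit.timeit(lambda: func(data), number=loops)
        for func in (dupl_quadratic, dupl_single_pass)
    }


if __name__ == "__main__":
    for n, loops in ((100, 200), (1000, 5)):
        print(n, run_benchmark(n, loops))
```

Timings depend on the machine and Python version, so no expected numbers are given; the point is only that the ratio between the two functions should grow as n grows.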

User · Answer

    def index(arr, num):
        for i, x in enumerate(arr):
            if x == num:
                print(x, i)

    index(List, 'A')

User · Answer

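Here is one that works for multiple duplicates, and you don't need to specify any values:

    List = ['A', 'B', 'A', 'C', 'E', 'B']  # two 'A's and two 'B's

    ix_list = []
    for i in range(len(List)):
        try:
            # search from i + 1 onwards, then add (i + 1) back to get the real index
            dup_ix = List[(i + 1):].index(List[i]) + (i + 1)
            ix_list.extend([i, dup_ix])  # no error means a duplicate was found, so add i as well
        except ValueError:
            pass

    ix_list.sort()

    print(ix_list)

Output:

    [0, 1, 2, 5]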
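One caveat worth checking: when a value occurs three or more times, the loop above adds some indices more than once (for ['A', 'A', 'A'] it would produce [0, 1, 1, 2]). A possible variant, sketched here under the assumption that each index should appear once, collects the indices in a set; `all_duplicate_indices` is a hypothetical name.

```python
def all_duplicate_indices(items):
    """Return the sorted positions of every item whose value occurs again."""
    found = set()  # a set ignores an index that is added twice
    for i, value in enumerate(items):
        try:
            # index(value, i + 1) searches only after position i and
            # returns an absolute index, so no offset needs to be added
            dup_ix = items.index(value, i + 1)
        except ValueError:
            continue  # value does not occur again after position i
        found.update((i, dup_ix))
    return sorted(found)


print(all_duplicate_indices(['A', 'B', 'A', 'C', 'E', 'B']))  # [0, 1, 2, 5]
print(all_duplicate_indices(['A', 'A', 'A']))                 # [0, 1, 2]
```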
