[python] Identify duplicate values in a list in Python

Is it possible to get which values are duplicates in a list using python?

I have a list of items:

    mylist = [20, 30, 25, 20]

I know the best way of removing the duplicates is set(mylist), but is it possible to know what values are being duplicated? As you can see, in this list the duplicates are the first and last values. [0, 3].

Is it possible to get this result or something similar in python? I'm trying to avoid making a ridiculously big if elif conditional statement.

This question is related to python arrays list

The answer is


It looks like you want the indices of the duplicates. Here is some short code that will find those in O(n) time, without using any packages:

dups = {}
[dups.setdefault(v, []).append(i) for i, v in enumerate(mylist)]
dups = {k: v for k, v in dups.items() if len(v) > 1}
# dups now has keys for all the duplicate values
# and a list of matching indices for each

# The second line produces an unused list. 
# It could be replaced with this:
for i, v in enumerate(mylist):
    dups.setdefault(v, []).append(i)

You can use list compression and set to reduce the complexity.

my_list = [3, 5, 2, 1, 4, 4, 1]
opt = [item for item in set(my_list) if my_list.count(item) > 1]

The following list comprehension will yield the duplicate values:

[x for x in mylist if mylist.count(x) >= 2]

I tried below code to find duplicate values from list

1) create a set of duplicate list

2) Iterated through set by looking in duplicate list.

glist=[1, 2, 3, "one", 5, 6, 1, "one"]
x=set(glist)
dup=[]
for c in x:
    if(glist.count(c)>1):
        dup.append(c)
print(dup)

OUTPUT

[1, 'one']

Now get the all index for duplicate element

glist=[1, 2, 3, "one", 5, 6, 1, "one"]
x=set(glist)
dup=[]
for c in x:
    if(glist.count(c)>1):
        indices = [i for i, x in enumerate(glist) if x == c]
        dup.append((c,indices))
print(dup)

OUTPUT

[(1, [0, 6]), ('one', [3, 7])]

Hope this helps someone


The following code will fetch you desired results with duplicate items and their index values.

  for i in set(mylist):
    if mylist.count(i) > 1:
         print(i, mylist.index(i))

mylist = [20, 30, 25, 20]

kl = {i: mylist.count(i) for i in mylist if mylist.count(i) > 1 }

print(kl)

def checkduplicate(lists): 
 a = []
 for i in lists:
    if  i in a:
        pass   
    else:
        a.append(i)
 return i          
            
print(checkduplicate([1,9,78,989,2,2,3,6,8]))

Here's a list comprehension that does what you want. As @Codemonkey says, the list starts at index 0, so the indices of the duplicates are 0 and 3.

>>> [i for i, x in enumerate(mylist) if mylist.count(x) > 1]
[0, 3]

You can print duplicate and Unqiue using below logic using list.

def dup(x):
    duplicate = []
    unique = []
    for i in x:
        if i in unique:
            duplicate.append(i)
        else:
            unique.append(i)
    print("Duplicate values: ",duplicate)
    print("Unique Values: ",unique)

list1 = [1, 2, 1, 3, 2, 5]
dup(list1)

simplest way without any intermediate list using list.index():

z = ['a', 'b', 'a', 'c', 'b', 'a', ]
[z[i] for i in range(len(z)) if i == z.index(z[i])]
>>>['a', 'b', 'c']

and you can also list the duplicates itself (may contain duplicates again as in the example):

[z[i] for i in range(len(z)) if not i == z.index(z[i])]
>>>['a', 'b', 'a']

or their index:

[i for i in range(len(z)) if not i == z.index(z[i])]
>>>[2, 4, 5]

or the duplicates as a list of 2-tuples of their index (referenced to their first occurrence only), what is the answer to the original question!!!:

[(i,z.index(z[i])) for i in range(len(z)) if not i == z.index(z[i])]
>>>[(2, 0), (4, 1), (5, 0)]

or this together with the item itself:

[(i,z.index(z[i]),z[i]) for i in range(len(z)) if not i == z.index(z[i])]
>>>[(2, 0, 'a'), (4, 1, 'b'), (5, 0, 'a')]

or any other combination of elements and indices....


That's the simplest way I can think for finding duplicates in a list:

my_list = [3, 5, 2, 1, 4, 4, 1]

my_list.sort()
for i in range(0,len(my_list)-1):
               if my_list[i] == my_list[i+1]:
                   print str(my_list[i]) + ' is a duplicate'

You should sort the list:

mylist.sort()

After this, iterate through it like this:

doubles = []
for i, elem in enumerate(mylist):
    if i != 0:
        if elem == old:
            doubles.append(elem)
            old = None
            continue
    old = elem

m = len(mylist)
for index,value in enumerate(mylist):
        for i in xrange(1,m):
                if(index != i):
                    if (L[i] == L[index]):
                        print "Location %d and location %d has same list-entry:  %r" % (index,i,value)

This has some redundancy that can be improved however.


Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to arrays

PHP array value passes to next row Use NSInteger as array index How do I show a message in the foreach loop? Objects are not valid as a React child. If you meant to render a collection of children, use an array instead Iterating over arrays in Python 3 Best way to "push" into C# array Sort Array of object by object field in Angular 6 Checking for duplicate strings in JavaScript array what does numpy ndarray shape do? How to round a numpy array?

Examples related to list

Convert List to Pandas Dataframe Column Python find elements in one list that are not in the other Sorting a list with stream.sorted() in Java Python Loop: List Index Out of Range How to combine two lists in R How do I multiply each element in a list by a number? Save a list to a .txt file The most efficient way to remove first N elements in a list? TypeError: list indices must be integers or slices, not str Parse JSON String into List<string>