[python] How to remove all duplicate items from a list

How would I use python to check a list and delete all duplicates? I don't want to have to specify what the duplicate item is - I want the code to figure out if there are any and remove them if so, keeping only one instance of each. It also must work if there are multiple duplicates in a list.

For example, in my code below, the list lseparatedOrbList has 12 items - one is repeated six times, one is repeated five times, and there is only one instance of one. I want it to change the list so there are only three items - one of each, and in the same order they appeared before. I tried this:

for i in lseparatedOrbList:
   for j in lseparatedOrblist:
        if lseparatedOrbList[i] == lseparatedOrbList[j]:
            lseparatedOrbList.remove(lseparatedOrbList[j])

But I get the error:

Traceback (most recent call last):
  File "qchemOutputSearch.py", line 123, in <module>
    for j in lseparatedOrblist:
NameError: name 'lseparatedOrblist' is not defined

I'm guessing because it's because I'm trying to loop through lseparatedOrbList while I loop through it, but I can't think of another way to do it.

This question is related to python list

The answer is


No, it's simply a typo, the "list" at the end must be capitalized. You can nest loops over the same variable just fine (although there's rarely a good reason to).

However, there are other problems with the code. For starters, you're iterating through lists, so i and j will be items not indices. Furthermore, you can't change a collection while iterating over it (well, you "can" in that it runs, but madness lies that way - for instance, you'll propably skip over items). And then there's the complexity problem, your code is O(n^2). Either convert the list into a set and back into a list (simple, but shuffles the remaining list items) or do something like this:

seen = set()
new_x = []
for x in xs:
    if x in seen:
        continue
    seen.add(x)
    new_xs.append(x)

Both solutions require the items to be hashable. If that's not possible, you'll probably have to stick with your current approach sans the mentioned problems.


This should be faster and will preserve the original order:

seen = {}
new_list = [seen.setdefault(x, x) for x in my_list if x not in seen]

If you don't care about order, you can just:

new_list = list(set(my_list))

In this way one can delete a particular item which is present multiple times in a list : Try deleting all 5

list1=[1,2,3,4,5,6,5,3,5,7,11,5,9,8,121,98,67,34,5,21]
print list1
n=input("item to be deleted : " )
for i in list1:
    if n in list1:
        list1.remove(n)
print list1

This should do it for you:

new_list = list(set(old_list))

set will automatically remove duplicates. list will cast it back to a list.


There is a faster way to fix this:

list = [1, 1.0, 1.41, 1.73, 2, 2, 2.0, 2.24, 3, 3, 4, 4, 4, 5, 6, 6, 8, 8, 9, 10]
list2=[]

for value in list:
    try:
        list2.index(value)
    except:
        list2.append(value)
list.clear()
for value in list2:
    list.append(value)
list2.clear()
print(list)
print(list2)

Use set():

woduplicates = set(lseparatedOrblist)

Returns a set without duplicates. If you, for some reason, need a list back:

woduplicates = list(set(lseperatedOrblist))

This will, however, have a different order than your original list.


for unhashable lists. It is faster as it does not iterate about already checked entries.

def purge_dublicates(X):
    unique_X = []
    for i, row in enumerate(X):
        if row not in X[i + 1:]:
            unique_X.append(row)
    return unique_X

The modern way to do it that maintains the order is:

>>> from collections import OrderedDict
>>> list(OrderedDict.fromkeys(lseparatedOrbList))

as discussed by Raymond Hettinger (python core dev) in this answer. In python 3.5 and above this is also the fastest way - see the linked answer for details. However the keys must be hashable (as is the case in your list I think)


You can do this like that:

x = list(set(x))

Example: if you do something like that:

x = [1,2,3,4,5,6,7,8,9,10,2,1,6,31,20]
x = list(set(x))
x

you will see the following result:

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 31]

There is only one thing you should think of: the resulting list will not be ordered as the original one (will lose the order in the process).


It's because you are missing a capital letter, actually.

Purposely dedented:

for i in lseparatedOrbList:   # capital 'L'
for j in lseparatedOrblist:   # lowercase 'l'

Though the more efficient way to do it would be to insert the contents into a set.

If maintaining the list order matters (ie, it must be "stable"), check out the answers on this question