[python] Remove all values within one list from another list?

I am looking for a way to remove all values within a list from another list.

Something like this:

a = range(1,10)  
a.remove([2,3,7])  
print a  
a = [1,4,5,6,8,9]  



The simplest way is

>>> a = range(1, 10)
>>> for x in [2, 3, 7]:
...  a.remove(x)
... 
>>> a
[1, 4, 5, 6, 8, 9]

One possible problem here is that each time you call remove(), all the items are shuffled down the list to fill the hole. So if a grows very large this will end up being quite slow.

The following builds a brand new list instead. The advantage is that it avoids all the shuffling of the first approach:

>>> removeset = set([2, 3, 7])
>>> a = [x for x in a if x not in removeset]

If you want to modify a in place, only one small change is required:

>>> removeset = set([2, 3, 7])
>>> a[:] = [x for x in a if x not in removeset]
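To make the difference between the two forms concrete, here is a minimal sketch, written for Python 3 (so range() is wrapped in list()). The names alias, c and other are made up for illustration. Slice assignment keeps the same list object, so any other reference to it sees the change; plain assignment only rebinds the name.

```python
removeset = {2, 3, 7}

# Slice assignment: the existing list object is updated in place.
a = list(range(1, 10))
alias = a                                    # second name for the same list
a[:] = [x for x in a if x not in removeset]
print(alias)                                 # → [1, 4, 5, 6, 8, 9]

# Plain assignment: c is rebound to a new list, the old one is untouched.
c = list(range(1, 10))
other = c
c = [x for x in c if x not in removeset]
print(other)                                 # → [1, 2, 3, 4, 5, 6, 7, 8, 9]
```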

a = range(1,10)
itemsToRemove = set([2, 3, 7])
b = filter(lambda x: x not in itemsToRemove, a)

or

b = [x for x in a if x not in itemsToRemove]

Don't create the set inside the lambda or inside the comprehension. If you do, it'll be recreated on every iteration, defeating the point of using a set at all.
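One way to make sure the set is built only once is to wrap the filtering in a small helper. This is just a sketch for Python 3; remove_all and lookup are hypothetical names, and it assumes the items to remove are hashable.

```python
def remove_all(items, to_remove):
    """Return a new list with every value in to_remove filtered out."""
    lookup = set(to_remove)      # built once, before the comprehension runs
    return [x for x in items if x not in lookup]

print(remove_all(range(1, 10), [2, 3, 7]))   # → [1, 4, 5, 6, 8, 9]
```

Order and duplicates of the remaining items are kept, since only the lookup side is turned into a set.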


>>> a=range(1,10)
>>> for i in [2,3,7]: a.remove(i)
...
>>> a
[1, 4, 5, 6, 8, 9]

>>> a=range(1,10)
>>> b=map(a.remove,[2,3,7])
>>> a
[1, 4, 5, 6, 8, 9]
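A caveat if you try the map version on Python 3: map() returns a lazy iterator there, so nothing is removed until the iterator is consumed. A small sketch (before and lazy are illustrative names):

```python
a = list(range(1, 10))
lazy = map(a.remove, [2, 3, 7])   # Python 3: no call to a.remove has run yet
before = list(a)
print(before)                     # → [1, 2, 3, 4, 5, 6, 7, 8, 9]

list(lazy)                        # consuming the iterator performs the removals
print(a)                          # → [1, 4, 5, 6, 8, 9]
```

The plain for loop behaves the same on both versions, which is one reason to prefer it here.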

Others have suggested ways to build a new list by filtering, e.g.

newl = [x for x in l if x not in [2,3,7]]

or

newl = filter(lambda x: x not in [2,3,7], l) 

but from your question it looks like you want in-place modification. For that you can do the following, which will also be much faster if the original list is long and the items to be removed are few:

l = range(1,10)
for o in set([2,3,7,11]):
    try:
        l.remove(o)
    except ValueError:
        pass

print l

output: [1, 4, 5, 6, 8, 9]

I am catching the ValueError exception so that it works even if some items are not in the original list.

Also, if you do not need in-place modification, the solution by S.Mark is simpler.
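One thing to keep in mind with this loop: list.remove() deletes only the first matching item, so duplicates in the original list survive. A short Python 3 sketch with made-up data (first_only is an illustrative name) shows the difference from filtering:

```python
l = [1, 2, 2, 3, 7, 7, 7]
for o in {2, 7, 11}:
    try:
        l.remove(o)               # removes only the first occurrence of o
    except ValueError:
        pass                      # 11 is not in the list; ignore it
first_only = list(l)
print(first_only)                 # → [1, 2, 3, 7, 7]

# Filtering in place drops every occurrence instead.
l[:] = [x for x in l if x not in {2, 7, 11}]
print(l)                          # → [1, 3]
```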


I was looking for a fast way to do this, so I ran some experiments with the suggested approaches. The results surprised me, so I want to share them.

The experiments were done using the pythonbenchmark tool with

a = range(1,50000) # Source list
b = range(1,15000) # Items to remove

Results:

def comprehension(a, b):
    return [x for x in a if x not in b]

5 tries, average time 12.8 sec

def filter_function(a, b):
    return filter(lambda x: x not in b, a)

5 tries, average time 12.6 sec

def modification(a,b):
    for x in b:
        try:
            a.remove(x)
        except ValueError:
            pass
    return a

5 tries, average time 0.27 sec

def set_approach(a,b):
    return list(set(a)-set(b))

5 tries, average time 0.0057 sec

I also ran another measurement with larger inputs for the last two functions:

a = range(1,500000)
b = range(1,100000)

And the results:

For modification (remove method) - average time is 252 seconds
For set approach - average time is 0.75 seconds

So you can see that the set approach is significantly faster than the others. Yes, it doesn't keep duplicate items (and a set does not guarantee the original order), but if you don't need that, it's for you. There is almost no difference between the list comprehension and the filter function. Using remove is ~50 times faster than those, but it modifies the source list. The best choice is using sets: it's more than 1000 times faster than the list comprehension!
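Note that in these tests b is a list, so every `x not in b` check scans b from the start, which probably explains most of the time spent in the comprehension and filter versions. A variant that was not part of the measurements above is to keep the comprehension but look values up in a set. The sketch below is for Python 3 and uses the standard timeit module instead of pythonbenchmark; comprehension_with_set is a hypothetical name, and no timings are claimed here, so run it to see the numbers on your machine.

```python
import timeit

a = list(range(1, 50000))   # source list, same sizes as the first experiment
b = list(range(1, 15000))   # items to remove

def comprehension_with_set(a, b):
    lookup = set(b)                       # one-time conversion for fast lookups
    return [x for x in a if x not in lookup]

def set_approach(a, b):
    return list(set(a) - set(b))

result = comprehension_with_set(a, b)     # keeps order and duplicates of a

# Print the total time of 5 runs for each function.
print(timeit.timeit(lambda: comprehension_with_set(a, b), number=5))
print(timeit.timeit(lambda: set_approach(a, b), number=5))
```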


If you don't have repeated values, you could use set difference.

x = set(range(10))
y = x - set([2, 3, 7])
# y = set([0, 1, 4, 5, 6, 8, 9])

and then convert back to list, if needed.
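For the conversion back, a minimal Python 3 sketch could look like this. Since sets are unordered, sorted() is used here to get a predictable list; that is an assumption that sorted order is what you want, and as_list and deduped are illustrative names.

```python
x = set(range(10))
y = x - {2, 3, 7}
as_list = sorted(y)               # sorted() returns a list in ascending order
print(as_list)                    # → [0, 1, 4, 5, 6, 8, 9]

# Repeated values collapse when the list goes through a set.
deduped = sorted(set([1, 1, 2, 3]) - {3})
print(deduped)                    # → [1, 2]
```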


>>> a = range(1, 10)
>>> [x for x in a if x not in [2, 3, 7]]
[1, 4, 5, 6, 8, 9]