Remove all the elements that occur in one list from another

Question

Let's say I have two lists, l1 and l2. I want to perform l1 - l2, which returns all elements of l1 not in l2.

I can think of a naive loop approach to doing this, but that is going to be really inefficient. What is a pythonic and efficient way of doing this?

As an example, if I have l1 = [1,2,6,8] and l2 = [2,3,5,8], l1 - l2 should return [1,6].

User · Answer

One way is to use sets:

    >>> set([1,2,6,8]) - set([2,3,5,8])
    set([1, 6])

Note, however, that sets do not preserve the order of elements, and cause any duplicated elements to be removed. The elements also need to be hashable. If these restrictions are tolerable, this may often be the simplest and highest performance option.
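To make those caveats concrete, here is a small sketch. The sample lists are made up for illustration (they are not from the question): l1 contains a duplicate and is unsorted. It compares the plain set difference with a variant that uses the set only for lookups and therefore keeps order and duplicates.

```python
# Hypothetical sample data: l1 has a duplicate (1) and is not sorted.
l1 = [8, 1, 6, 1, 2]
l2 = [2, 3, 5, 8]

# Plain set difference: duplicates collapse and order is not guaranteed.
as_set = set(l1) - set(l2)

# Order- and duplicate-preserving variant: the set is used only for lookups.
exclude = set(l2)
as_list = [x for x in l1 if x not in exclude]

print(sorted(as_set))  # [1, 6]
print(as_list)         # [1, 6, 1]
```

Which of the two results is "right" depends on whether the input is meant to be treated as a set or as a sequence.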

User · Answer

Performance Comparisons

Comparing the performance of all the answers mentioned here on Python 3.9.1 and Python 2.7.16.

Python 3.9.1

Answers are listed in order of performance:

1. Arkku's set difference using the subtraction "-" operation (91.3 nsec per loop)

       mquadri$ python3 -m timeit -s "l1 = set([1,2,6,8]); l2 = set([2,3,5,8]);" "l1 - l2"
       5000000 loops, best of 5: 91.3 nsec per loop

2. Moinuddin Quadri's using set().difference() (133 nsec per loop)

       mquadri$ python3 -m timeit -s "l1 = set([1,2,6,8]); l2 = set([2,3,5,8]);" "l1.difference(l2)"
       2000000 loops, best of 5: 133 nsec per loop

3. Moinuddin Quadri's list comprehension with set based lookup (366 nsec per loop)

       mquadri$ python3 -m timeit -s "l1 = [1,2,6,8]; l2 = set([2,3,5,8]);" "[x for x in l1 if x not in l2]"
       1000000 loops, best of 5: 366 nsec per loop

4. Donut's list comprehension on plain list (489 nsec per loop)

       mquadri$ python3 -m timeit -s "l1 = [1,2,6,8]; l2 = [2,3,5,8];" "[x for x in l1 if x not in l2]"
       500000 loops, best of 5: 489 nsec per loop

5. Daniel Pryden's generator expression with set based lookup and type-casting to list (583 nsec per loop). The result is explicitly type-cast to list to get the final object as a list, as requested by the OP. If the generator expression is replaced with a list comprehension, it becomes the same as Moinuddin Quadri's list comprehension with set based lookup.

       mquadri$ python3 -m timeit -s "l1 = [1,2,6,8]; l2 = set([2,3,5,8]);" "list(x for x in l1 if x not in l2)"
       500000 loops, best of 5: 583 nsec per loop

6. Moinuddin Quadri's using filter() and explicitly type-casting to list (681 nsec per loop). The explicit type-cast is needed because in Python 3.x filter() returns an iterator.

       mquadri$ python3 -m timeit -s "l1 = [1,2,6,8]; l2 = set([2,3,5,8]);" "list(filter(lambda x: x not in l2, l1))"
       500000 loops, best of 5: 681 nsec per loop

7. Akshay Hazari's using a combination of functools.reduce + filter (3.36 usec per loop). The result is explicitly type-cast to list because from Python 3.x filter() returns an iterator. We also need to import functools to use reduce in Python 3.x.

       mquadri$ python3 -m timeit "from functools import reduce; l1 = [1,2,6,8]; l2 = [2,3,5,8];" "list(reduce(lambda x,y : filter(lambda z: z!=y,x) ,l1,l2))"
       100000 loops, best of 5: 3.36 usec per loop

Python 2.7.16

Answers are listed in order of performance:

1. Arkku's set difference using the subtraction "-" operation (0.0783 usec per loop)

       mquadri$ python -m timeit -s "l1 = set([1,2,6,8]); l2 = set([2,3,5,8]);" "l1 - l2"
       10000000 loops, best of 3: 0.0783 usec per loop

2. Moinuddin Quadri's using set().difference() (0.117 usec per loop)

       mquadri$ python -m timeit -s "l1 = set([1,2,6,8]); l2 = set([2,3,5,8]);" "l1.difference(l2)"
       10000000 loops, best of 3: 0.117 usec per loop

3. Moinuddin Quadri's list comprehension with set based lookup (0.246 usec per loop)

       mquadri$ python -m timeit -s "l1 = [1,2,6,8]; l2 = set([2,3,5,8]);" "[x for x in l1 if x not in l2]"
       1000000 loops, best of 3: 0.246 usec per loop

4. Donut's list comprehension on plain list (0.372 usec per loop)

       mquadri$ python -m timeit -s "l1 = [1,2,6,8]; l2 = [2,3,5,8];" "[x for x in l1 if x not in l2]"
       1000000 loops, best of 3: 0.372 usec per loop

5. Moinuddin Quadri's using filter() (0.593 usec per loop)

       mquadri$ python -m timeit -s "l1 = [1,2,6,8]; l2 = set([2,3,5,8]);" "filter(lambda x: x not in l2, l1)"
       1000000 loops, best of 3: 0.593 usec per loop

6. Daniel Pryden's generator expression with set based lookup and type-casting to list (0.964 usec per loop). The result is explicitly type-cast to list to get the final object as a list, as requested by the OP. If the generator expression is replaced with a list comprehension, it becomes the same as Moinuddin Quadri's list comprehension with set based lookup.

       mquadri$ python -m timeit -s "l1 = [1,2,6,8]; l2 = set([2,3,5,8]);" "list(x for x in l1 if x not in l2)"
       1000000 loops, best of 3: 0.964 usec per loop

7. Akshay Hazari's using a combination of functools.reduce + filter (2.78 usec per loop)

       mquadri$ python -m timeit "l1 = [1,2,6,8]; l2 = [2,3,5,8];" "reduce(lambda x,y : filter(lambda z: z!=y,x) ,l1,l2)"
       100000 loops, best of 3: 2.78 usec per loop
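The timings above use four-element lists, so fixed overhead accounts for much of each measurement. As a rough sketch for repeating the list-versus-set lookup comparison on larger inputs, something like the following could be used. The input sizes, the repeat count and the function names with_list_lookup and with_set_lookup are arbitrary choices for illustration, not part of the original benchmark.

```python
import timeit

# Hypothetical larger inputs; the sizes are arbitrary.
l1 = list(range(2000))
l2 = list(range(0, 2000, 2))   # the even numbers, to be removed from l1
s2 = set(l2)                   # same values, but with hashed membership tests

def with_list_lookup():
    # Each "not in l2" scans the list, so the work grows with len(l1) * len(l2).
    return [x for x in l1 if x not in l2]

def with_set_lookup():
    # Each "not in s2" is a hash lookup, so the work grows with len(l1).
    return [x for x in l1 if x not in s2]

t_list = timeit.timeit(with_list_lookup, number=5)
t_set = timeit.timeit(with_set_lookup, number=5)
print("list lookup: %.4f s, set lookup: %.4f s" % (t_list, t_set))
```

The absolute numbers will differ by machine and Python version, so no expected output is shown here.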

User · Answer

Expanding on Donut's answer and the other answers here, you can get even better results by using a generator comprehension instead of a list comprehension, and by using a set data structure (since the in operator is O(n) on a list but O(1) on a set).

So here's a function that would work for you:

    def filter_list(full_list, excludes):
        s = set(excludes)
        return (x for x in full_list if x not in s)

The result will be an iterable that will lazily fetch the filtered list. If you need a real list object (e.g. if you need to do a len() on the result), then you can easily build a list like so:

    filtered_list = list(filter_list(full_list, excludes))
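One consequence of the lazy result is worth seeing in a short sketch: a generator can only be consumed once. The variable names first_pass and second_pass below are just for illustration.

```python
def filter_list(full_list, excludes):
    # Build the lookup set once, then yield matching items lazily.
    s = set(excludes)
    return (x for x in full_list if x not in s)

result = filter_list([1, 2, 6, 8], [2, 3, 5, 8])

first_pass = list(result)    # consumes the generator
second_pass = list(result)   # the generator is already exhausted

print(first_pass)    # [1, 6]
print(second_pass)   # []
```

If the filtered values are needed more than once, converting to a list up front is the safer choice.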

User · Answer

Alternate Solution:

    reduce(lambda x,y : filter(lambda z: z!=y,x) ,[2,3,5,8],[1,2,6,8])
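As written, that one-liner relies on Python 2 behaviour, where reduce is a builtin and filter returns a list. A possible Python 3 adaptation, assuming the goal is still to remove the elements of l2 from l1, might look like this:

```python
from functools import reduce  # reduce is no longer a builtin in Python 3

l1 = [1, 2, 6, 8]
l2 = [2, 3, 5, 8]

# l1 is the initial value; each y taken from l2 is filtered out of the
# accumulated result. filter() is lazy in Python 3, so list() is needed
# at the end to materialize the values.
l3 = list(reduce(lambda x, y: filter(lambda z: z != y, x), l2, l1))

print(l3)  # [1, 6]
```

This makes one pass over the data per element of l2, which is consistent with it being the slowest option in the timings reported elsewhere in this thread.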

User · Answer

Sets versus list comprehension benchmark on Python 3.8 (adding to Moinuddin Quadri's benchmarks)

tldr: Use Arkku's set solution, it's even faster than promised in comparison!

Checking existing files against a list

In my example I found it to be 40 times (!) faster to use Arkku's set solution than the pythonic list comprehension for a real world application of checking existing filenames against a list.

List comprehension:

    %%time
    import glob
    import os
    existing = [int(os.path.basename(x).split(".")[0]) for x in glob.glob("*.txt")]
    wanted = list(range(1, 100000))
    [i for i in wanted if i not in existing]

    Wall time: 28.2 s

Sets:

    %%time
    import glob
    import os
    existing = [int(os.path.basename(x).split(".")[0]) for x in glob.glob("*.txt")]
    wanted = list(range(1, 100000))
    set(wanted) - set(existing)

    Wall time: 689 ms

User · Answer

Use the Python set type. That would be the most Pythonic.

Also, since it's native, it should be the most optimized method too.

See:

http://docs.python.org/library/stdtypes.html#set

http://docs.python.org/library/sets.htm (for older python)

    # Using Python 2.7 set literal format.
    # Otherwise, use: l1 = set([1,2,6,8])
    #
    l1 = {1,2,6,8}
    l2 = {2,3,5,8}
    l3 = l1 - l2

User · Answer

Try this:

    l1 = [1,2,6,8]
    l2 = [2,3,5,8]
    r = []
    for x in l1:
        # skip anything that also appears in l2
        if x in l2:
            continue
        r.append(x)
    print(r)

User · Answer

Using set.difference()

You can use set.difference() to get a new set with the elements of the set that are not in the others, i.e. set(A).difference(B) will return a set with the items present in A but not in B. For example:

    >>> set([1,2,6,8]).difference([2,3,5,8])
    {1, 6}

It is a functional approach to the set difference mentioned in Arkku's answer (which uses the arithmetic subtraction - operator for set difference).

Since sets are unordered, you'll lose the ordering of elements from the initial list. (Continue reading the next section if you want to maintain the ordering of elements.)

Using List Comprehension with set based lookup

If you want to maintain the ordering from the initial list, then Donut's list comprehension based answer will do the trick. However, you can improve on the performance of the accepted answer by using a set internally to check whether an element is present in the other list. For example:

    l1, l2 = [1,2,6,8], [2,3,5,8]
    s2 = set(l2)  # Type-cast `l2` to `set`

    l3 = [x for x in l1 if x not in s2]
    #                             ^ Doing membership checking on `set` s2

If you are interested in knowing why membership checking is faster on a set than on a list, please read this: What makes sets faster than lists?

Using filter() and lambda expression

Here's another alternative using filter() with a lambda expression. I am adding it here just for reference; it is not performance efficient:

    >>> l1 = [1,2,6,8]
    >>> l2 = set([2,3,5,8])

    #     v  `filter` returns an iterator object. Here I'm type-casting
    #     v  it to `list` in order to display the resultant value
    >>> list(filter(lambda x: x not in l2, l1))
    [1, 6]

User · Answer

Use a set comprehension {x for x in l2} or set(l2) to get a set, then use a list comprehension to get the list:

    l2set = set(l2)
    l3 = [x for x in l1 if x not in l2set]

Benchmark test code:

    import time

    l1 = list(range(1000*10 * 3))
    l2 = list(range(1000*10 * 2))

    l2set = {x for x in l2}

    tic = time.time()
    l3 = [x for x in l1 if x not in l2set]
    toc = time.time()
    diffset = toc-tic
    print(diffset)

    tic = time.time()
    l3 = [x for x in l1 if x not in l2]
    toc = time.time()
    difflist = toc-tic
    print(difflist)

    print("speedup %fx"%(difflist/diffset))

Benchmark test result:

    0.0015058517456054688
    3.968189239501953
    speedup 2635.179227x

User · Answer

Python has a language feature called List Comprehensions that is perfectly suited to making this sort of thing extremely easy. The following statement does exactly what you want and stores the result in l3:

    l3 = [x for x in l1 if x not in l2]

l3 will contain [1, 6].
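One situation where the plain list comprehension is the practical option is when the elements are not hashable, since membership tests on a list only need equality. A minimal sketch, using made-up nested lists as the data:

```python
# Hypothetical data: the elements are lists, which are not hashable.
l1 = [[1, 2], [3, 4], [5, 6]]
l2 = [[3, 4]]

# "not in" on a list compares with ==, so unhashable elements are fine.
l3 = [x for x in l1 if x not in l2]
print(l3)  # [[1, 2], [5, 6]]

# The set-based approaches cannot be used directly on this data.
try:
    set(l1)
except TypeError as exc:
    print("set() failed:", exc)
```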
