Python list subtraction operation

Question

I want to do something similar to this:

>>> x = [1,2,3,4,5,6,7,8,9,0]
>>> x
[1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
>>> y = [1,3,5,7,9]
>>> y
[1, 3, 5, 7, 9]
>>> x - y   # should return [2,4,6,8,0]

But this is not supported by Python lists. What is the best way of doing it?
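
For reference, here is a minimal sketch of what the interpreter does with that expression today, assuming CPython 3. The variable name `error` is only there for illustration.

```python
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
y = [1, 3, 5, 7, 9]

error = None
try:
    x - y  # lists define + and *, but not -
except TypeError as exc:
    error = exc
    print(type(exc).__name__, exc)
```

So any "subtraction" has to be spelled out explicitly, and the answers differ mainly in how they treat ordering and duplicates.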

User · Answer

Use set difference:

>>> z = list(set(x) - set(y))
>>> z
[0, 8, 2, 4, 6]

Or you might just have x and y be sets so you don't have to do any conversions.
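
A set has no defined order, so the result above can come back shuffled. If the original ordering matters and the values are unique, one possible sketch is to sort the difference by each value's first position in x. The `position` dict is a helper introduced here for illustration, not part of the answer.

```python
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
y = [1, 3, 5, 7, 9]

# Remember where each value first appears in x.
position = {}
for index, value in enumerate(x):
    position.setdefault(value, index)

# Sort the (unordered) set difference back into x's order.
z = sorted(set(x) - set(y), key=position.__getitem__)
print(z)  # [2, 4, 6, 8, 0]
```

Duplicates in x still collapse to a single value, since they pass through a set.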

User · Answer

Try this:

def subtract_lists(a, b):
    """ Subtracts two lists. Throws ValueError if b contains items not in a """
    # Terminate if b is empty, otherwise remove b[0] from a and recurse
    return a if len(b) == 0 else [a[:i] + subtract_lists(a[i+1:], b[1:])
                                  for i in [a.index(b[0])]][0]

>>> x = [1,2,3,4,5,6,7,8,9,0]
>>> y = [1,3,5,7,9]
>>> subtract_lists(x,y)
[2, 4, 6, 8, 0]
>>> x = [1,2,3,4,5,6,7,8,9,0,9]
>>> subtract_lists(x,y)
[2, 4, 6, 8, 0, 9]     #9 is only deleted once
>>>
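
As written, the recursive version appears to search for each item of b only in the part of a that follows the previous match, so b has to list its items in the same order as a. A loop-based sketch of the same idea that does not depend on that order (the name `subtract_lists_iter` is made up for this example):

```python
def subtract_lists_iter(a, b):
    """Return a copy of a with one occurrence removed per item of b."""
    result = list(a)          # work on a copy so a is left untouched
    for item in b:
        result.remove(item)   # removes the first match; ValueError if absent
    return result

print(subtract_lists_iter([1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 9], [9, 1, 3, 5, 7]))
# [2, 4, 6, 8, 0, 9]
```

This still does O(len(a) * len(b)) work, because every list.remove() call scans the list.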

User · Answer

The other solutions have one of a few problems:

1. They don't preserve order, or
2. They don't remove a precise count of elements, e.g. for x = [1, 2, 2, 2] and y = [2, 2] they convert y to a set, and either remove all matching elements (leaving [1] only) or remove one of each unique element (leaving [1, 2, 2]), when the proper behavior would be to remove 2 twice, leaving [1, 2], or
3. They do O(m * n) work, where an optimal solution can do O(m + n) work

Alain was on the right track with Counter to solve #2 and #3, but that solution will lose ordering. The solution that preserves order (removing the first n copies of each value for n repetitions in the list of values to remove) is:

from collections import Counter

x = [1,2,3,4,3,2,1]  
y = [1,2,2]  
remaining = Counter(y)

out = []
for val in x:
    if remaining[val]:
        remaining[val] -= 1
    else:
        out.append(val)
# out is now [3, 4, 3, 1], having removed the first 1 and both 2s.

To make it remove the last copies of each element, just change the for loop to for val in reversed(x): and add out.reverse() immediately after exiting the for loop.

Constructing the Counter is O(n) in terms of y's length, iterating x is O(n) in terms of x's length, and Counter membership testing and mutation are O(1), while list.append is amortized O(1) (a given append can be O(n), but for many appends, the overall big-O averages O(1) since fewer and fewer of them require a reallocation), so the overall work done is O(m + n).

You can also determine whether any elements in y were not removed from x by testing:

remaining = +remaining  # Removes all keys with zero counts from Counter
if remaining:
    # remaining contained elements with non-zero counts
    pass  # handle the values of y that were never found in x
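
Putting the pieces of this answer together, here is a hedged sketch of a reusable function. The name `subtract_ordered` and the `from_end` flag are assumptions made for this example; the logic is the loop, the reversed variant, and the leftover check described above.

```python
from collections import Counter

def subtract_ordered(x, y, from_end=False):
    """Remove one item of x per item of y, keeping the survivors in order.

    Returns (result, leftover), where leftover counts the items of y
    that were never found in x.
    """
    remaining = Counter(y)
    source = reversed(x) if from_end else x
    out = []
    for val in source:
        if remaining[val]:
            remaining[val] -= 1   # consume one pending removal
        else:
            out.append(val)
    if from_end:
        out.reverse()             # restore the original left-to-right order
    return out, +remaining        # unary + drops keys whose count is zero

result, leftover = subtract_ordered([1, 2, 3, 4, 3, 2, 1], [1, 2, 2, 7])
print(result)    # [3, 4, 3, 1]
print(leftover)  # Counter({7: 1})
```

With from_end=True and y = [1, 2, 2], the same input gives [1, 3, 4, 3], because the last copies are removed instead of the first.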

User · Answer

Looking up values in a set is faster than looking them up in a list:

ys = set(y)  # build the set once; written inline, set(y) is rebuilt for every item
[item for item in x if item not in ys]

I believe this will scale slightly better than:

[item for item in x if item not in y]

Both preserve the order of the lists.
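
To check the scaling claim on your own machine, a rough timing sketch with timeit follows. The list sizes are arbitrary and the function names are made up for this example; the set is bound to a name before the comprehension so that it is built only once.

```python
import timeit

x = list(range(1000))
y = list(range(0, 1000, 2))

def with_list():
    return [item for item in x if item not in y]     # scans y for every item

def with_set():
    ys = set(y)                                       # built once
    return [item for item in x if item not in ys]    # hash lookup per item

assert with_list() == with_set()
print("list:", timeit.timeit(with_list, number=5))
print("set: ", timeit.timeit(with_set, number=5))
```

The absolute numbers depend on the machine, but the gap should widen as y grows.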

User · Answer

I think the easiest way to achieve this is by using set():

>>> x = [1,2,3,4,5,6,7,8,9,0]
>>> y = [1,3,5,7,9]
>>> list(set(x) - set(y))
[0, 2, 4, 6, 8]

User · Answer

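This example subtracts two lists:

# List of pairs of points (named points so the built-in list is not shadowed)
points = []
points.append([(602, 336), (624, 365)])
points.append([(635, 336), (654, 365)])
points.append([(642, 342), (648, 358)])
points.append([(644, 344), (646, 356)])
points.append([(653, 337), (671, 365)])
points.append([(728, 13), (739, 32)])
points.append([(756, 59), (767, 79)])

items_to_remove = []
items_to_remove.append([(642, 342), (648, 358)])
items_to_remove.append([(644, 344), (646, 356)])

print("Initial List Size:", len(points))

for a in items_to_remove:
    for b in points[:]:  # iterate over a copy, since points is modified in the loop
        if a == b:
            points.remove(b)

print("Final List Size:", len(points))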
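
The nested loops can also be written as a single comprehension that builds a new list instead of removing items in place. This is only a sketch with a shortened data set; the names `points`, `to_remove` and `kept` are illustrative.

```python
points = [
    [(602, 336), (624, 365)],
    [(642, 342), (648, 358)],
    [(644, 344), (646, 356)],
    [(728, 13), (739, 32)],
]
to_remove = [
    [(642, 342), (648, 358)],
    [(644, 344), (646, 356)],
]

# Each pair is a list of tuples, so it is not hashable as-is; this keeps
# the plain == comparison that the nested loops rely on.
kept = [pair for pair in points if pair not in to_remove]
print(len(points), "->", len(kept))  # 4 -> 2
```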

User · Answer

If duplicates and item ordering are a problem:

[i for i in a if not i in b or b.remove(i)]

a = [1,2,3,3,3,3,4]
b = [1,3]
result: [2, 3, 3, 3, 4]

Note that this consumes b: every matched item is removed from it.
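
Because list.remove() is called on b itself, b is modified as a side effect. A small sketch that keeps b intact by working on a scratch copy (`pending` is a name chosen for this example):

```python
a = [1, 2, 3, 3, 3, 3, 4]
b = [1, 3]

pending = list(b)  # scratch copy; list.remove() below mutates it, not b
result = [i for i in a if i not in pending or pending.remove(i)]
print(result)  # [2, 3, 3, 3, 4]
print(b)       # [1, 3]
```

The trick works because list.remove() returns None, which is falsy, so a matched item is dropped right after it has been removed from the copy.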

User · Answer

That is a "set subtraction" operation. Use the set data structure for that.

In Python 2.7:

x = {1,2,3,4,5,6,7,8,9,0}
y = {1,3,5,7,9}
print x - y

Output:

>>> print x - y
set([0, 8, 2, 4, 6])
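
The same idea in Python 3, where print is a function. This is a minimal sketch; sorted() is used only to make the printed order predictable.

```python
x = {1, 2, 3, 4, 5, 6, 7, 8, 9, 0}
y = {1, 3, 5, 7, 9}

difference = x - y           # same as x.difference(y)
print(sorted(difference))    # [0, 2, 4, 6, 8]
```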

User · Answer

If the lists allow duplicate elements, you can use Counter from collections:

from collections import Counter
result = list((Counter(x)-Counter(y)).elements())

If you need to preserve the order of elements from x:

result = [ v for c in [Counter(y)] for v in x if not c[v] or c.subtract([v]) ]
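
A short sketch contrasting the two variants on a list with duplicates. The sample data and the names `grouped` and `ordered` are chosen for illustration.

```python
from collections import Counter

x = [1, 2, 3, 4, 3, 2, 1]
y = [1, 2, 2]

# Multiset difference: equal values come back grouped together.
grouped = list((Counter(x) - Counter(y)).elements())

# Order-preserving variant: drop the first matching copies while walking x.
c = Counter(y)
ordered = [v for v in x if not c[v] or c.subtract([v])]

print(sorted(grouped))  # [1, 3, 3, 4]
print(ordered)          # [3, 4, 3, 1]
```

Both remove exactly one 1 and two 2s; only the second keeps the survivors where they were in x.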

User · Answer

For many use cases, the answer you want is:

ys = set(y)
[item for item in x if item not in ys]

This is a hybrid between aaronasterling's answer and quantumSoup's answer.

aaronasterling's version does len(y) item comparisons for each element in x, so it takes quadratic time. quantumSoup's version uses sets, so it does a single constant-time set lookup for each element in x, but, because it converts both x and y into sets, it loses the order of your elements.

By converting only y into a set, and iterating x in order, you get the best of both worlds: linear time, and order preservation.*

However, this still has a problem from quantumSoup's version: it requires your elements to be hashable. That's pretty much built into the nature of sets.** If you're trying to, e.g., subtract a list of dicts from another list of dicts, but the list to subtract is large, what do you do?

If you can decorate your values in some way that they're hashable, that solves the problem. For example, with a flat dictionary whose values are themselves hashable:

ys = {tuple(item.items()) for item in y}
[item for item in x if tuple(item.items()) not in ys]

If your types are a bit more complicated (e.g., often you're dealing with JSON-compatible values, which are hashable, or lists or dicts whose values are recursively the same type), you can still use this solution. But some types just can't be converted into anything hashable.

If your items aren't, and can't be made, hashable, but they are comparable, you can at least get log-linear time (O(N*log M), which is a lot better than the O(N*M) time of the list solution, but not as good as the O(N+M) time of the set solution) by sorting and using bisect:

import bisect

ys = sorted(y)
def bisect_contains(seq, item):
    # bisect_left returns the index of the first element >= item
    index = bisect.bisect_left(seq, item)
    return index < len(seq) and seq[index] == item
[item for item in x if not bisect_contains(ys, item)]

If your items are neither hashable nor comparable, then you're stuck with the quadratic solution.

* Note that you could also do this by using a pair of OrderedSet objects, for which you can find recipes and third-party modules. But I think this is simpler.

** The reason set lookups are constant time is that all it has to do is hash the value and see if there's an entry for that hash. If it can't hash the value, this won't work.
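
The "decorate your values" idea can be sketched as a small helper that takes a key function. The names `subtract_by_key` and `freeze` are hypothetical, and frozenset is swapped in for the tuple used above so that two equal dicts match even if their keys were inserted in a different order.

```python
def subtract_by_key(x, y, key):
    """Drop every item of x whose key(item) also occurs among the keys of y."""
    seen = {key(item) for item in y}
    return [item for item in x if key(item) not in seen]

def freeze(d):
    # Hypothetical helper: a flat dict with hashable values becomes a
    # frozenset of its (key, value) pairs, so key order does not matter.
    return frozenset(d.items())

x = [{"id": 1, "tag": "a"}, {"id": 2, "tag": "b"}, {"id": 3, "tag": "c"}]
y = [{"tag": "b", "id": 2}]
print(subtract_by_key(x, y, freeze))
# [{'id': 1, 'tag': 'a'}, {'id': 3, 'tag': 'c'}]
```

This keeps the linear-time behavior of the set solution as long as computing the key is cheap.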

User · Answer

The answer provided by @aaronasterling looks good; however, it is not compatible with the default interface of list: x = MyList(1, 2, 3, 4) vs x = MyList([1, 2, 3, 4]). Thus, the code below can be used as a more Python-list-friendly version:

class MyList(list):
    def __init__(self, *args):
        super(MyList, self).__init__(*args)

    def __sub__(self, other):
        return self.__class__([item for item in self if item not in other])

Example:

x = MyList([1, 2, 3, 4])
y = MyList([2, 5, 2])
z = x - y
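
A runnable sketch of that example showing what z ends up as. The `__init__` override is left out here because it only forwards to list, which is an assumption about what this answer intends.

```python
class MyList(list):
    def __sub__(self, other):
        # Keep the items of self that do not occur anywhere in other.
        return self.__class__([item for item in self if item not in other])

x = MyList([1, 2, 3, 4])
y = MyList([2, 5, 2])
z = x - y
print(z, type(z).__name__)  # [1, 3, 4] MyList
print(x - [1, 4])           # [2, 3]
```

The right-hand operand can be any container that supports `in`, such as a plain list; the result stays a MyList.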

User · Answer

Use a list comprehension:

[item for item in x if item not in y]

If you want to use the - infix syntax, you can just do:

class MyList(list):
    def __init__(self, *args):
        super(MyList, self).__init__(args)

    def __sub__(self, other):
        return self.__class__(*[item for item in self if item not in other])

You can then use it like:

x = MyList(1, 2, 3, 4)
y = MyList(2, 5, 2)
z = x - y

But if you don't absolutely need list properties (for example, ordering), just use sets as the other answers recommend.
