Subtracting two lists in Python

Question

In Python, how can one subtract two non-unique, unordered lists? Say we have a = [0, 1, 2, 1, 0] and b = [0, 1, 1]. I'd like to do something like c = a - b and have c be [2, 0] or [0, 2]; order doesn't matter to me. This should throw an exception if a does not contain all elements in b.

Note this is different from sets! I'm not interested in finding the difference of the sets of elements in a and b; I'm interested in the difference between the actual collections of elements in a and b.

I can do this with a for loop, looking up the first element of b in a and then removing the element from b and from a, etc. But this doesn't appeal to me: it would be very inefficient (on the order of O(n^2) time), while it should be no problem to do this in O(n log n) time.
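For reference, one way to reach the O(n log n) bound mentioned in the question is to sort both lists and walk them in step. This is only a sketch: the name `subtract_sorted` is made up for illustration, and it assumes the elements can be ordered with `<`.

```python
def subtract_sorted(a, b):
    """Multiset difference a - b via sorting; raises ValueError if b is not contained in a."""
    a_sorted = sorted(a)
    b_sorted = sorted(b)
    result = []
    j = 0  # index of the next element of b_sorted still waiting for a match
    for x in a_sorted:
        if j < len(b_sorted) and b_sorted[j] < x:
            # b_sorted[j] is smaller than everything left in a, so it can never match
            raise ValueError("%r is not available in a" % (b_sorted[j],))
        if j < len(b_sorted) and b_sorted[j] == x:
            j += 1  # x cancels one occurrence from b
        else:
            result.append(x)
    if j < len(b_sorted):
        raise ValueError("%r is not available in a" % (b_sorted[j],))
    return result


print(subtract_sorted([0, 1, 2, 1, 0], [0, 1, 1]))  # [0, 2]
```

The two sorts dominate the cost; the merge-style walk afterwards is linear, and neither input list is modified.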

User · Answer

I know "for" is not what you want, but it's simple and clear:

    for x in b:
        a.remove(x)

Or, if members of b might not be in a, then use:

    for x in b:
        if x in a:
            a.remove(x)

User · Answer

You can try something like this:

    class mylist(list):

        def __sub__(self, b):
            result = self[:]
            b = b[:]
            while b:
                try:
                    result.remove(b.pop())
                except ValueError:
                    raise Exception("Not all elements found during subtraction")
            return result

    a = mylist([0, 1, 2, 1, 0])
    b = mylist([0, 1, 1])

    >>> a - b
    [2, 0]

You have to define what [1, 2, 3] - [5, 6] should output, though. I guess you want [1, 2, 3]; that's why I ignore the ValueError.

Edit: Now I see you wanted an exception if a does not contain all elements, so I added it instead of passing the ValueError.

User · Answer

To prove jkp's point that "anything on one line will probably be hellishly complex to understand", I created a one-liner. Please do not mod me down, because I understand this is not a solution that you should actually use. It is just for demonstration purposes.

The idea is to add the values in a one by one, as long as the total number of times you have added that value is smaller than the number of times this value is in a minus the number of times it is in b:

    [ value for counter,value in enumerate(a) if a.count(value) >= b.count(value) + a[counter:].count(value) ]

The horror! But perhaps someone can improve on it. Is it even bug free?

Edit: Seeing Devin Jeanpierre's comment about using a dictionary data structure, I came up with this one-liner:

    sum([ [value]*count for value,count in {value:a.count(value)-b.count(value) for value in set(a)}.items() ], [])

Better, but still unreadable.

User · Answer

Python 2.7 and 3.1 and later have collections.Counter (a.k.a. multiset). The documentation links to Recipe 576611: Counter class for Python 2.5:

    from operator import itemgetter
    from heapq import nlargest
    from itertools import repeat, ifilter

    class Counter(dict):
        '''Dict subclass for counting hashable objects.  Sometimes called a bag
        or multiset.  Elements are stored as dictionary keys and their counts
        are stored as dictionary values.

        >>> Counter('zyzygy')
        Counter({'y': 3, 'z': 2, 'g': 1})

        '''

        def __init__(self, iterable=None, **kwds):
            '''Create a new, empty Counter object.  And if given, count elements
            from an input iterable.  Or, initialize the count from another mapping
            of elements to their counts.

            >>> c = Counter()                           # a new, empty counter
            >>> c = Counter('gallahad')                 # a new counter from an iterable
            >>> c = Counter({'a': 4, 'b': 2})           # a new counter from a mapping
            >>> c = Counter(a=4, b=2)                   # a new counter from keyword args

            '''
            self.update(iterable, **kwds)

        def __missing__(self, key):
            return 0

        def most_common(self, n=None):
            '''List the n most common elements and their counts from the most
            common to the least.  If n is None, then list all element counts.

            >>> Counter('abracadabra').most_common(3)
            [('a', 5), ('r', 2), ('b', 2)]

            '''
            if n is None:
                return sorted(self.iteritems(), key=itemgetter(1), reverse=True)
            return nlargest(n, self.iteritems(), key=itemgetter(1))

        def elements(self):
            '''Iterator over elements repeating each as many times as its count.

            >>> c = Counter('ABCABC')
            >>> sorted(c.elements())
            ['A', 'A', 'B', 'B', 'C', 'C']

            If an element's count has been set to zero or is a negative number,
            elements() will ignore it.

            '''
            for elem, count in self.iteritems():
                for _ in repeat(None, count):
                    yield elem

        # Override dict methods where the meaning changes for Counter objects.

        @classmethod
        def fromkeys(cls, iterable, v=None):
            raise NotImplementedError(
                'Counter.fromkeys() is undefined.  Use Counter(iterable) instead.')

        def update(self, iterable=None, **kwds):
            '''Like dict.update() but add counts instead of replacing them.

            Source can be an iterable, a dictionary, or another Counter instance.

            >>> c = Counter('which')
            >>> c.update('witch')           # add elements from another iterable
            >>> d = Counter('watch')
            >>> c.update(d)                 # add elements from another counter
            >>> c['h']                      # four 'h' in which, witch, and watch
            4

            '''
            if iterable is not None:
                if hasattr(iterable, 'iteritems'):
                    if self:
                        self_get = self.get
                        for elem, count in iterable.iteritems():
                            self[elem] = self_get(elem, 0) + count
                    else:
                        dict.update(self, iterable)  # fast path when counter is empty
                else:
                    self_get = self.get
                    for elem in iterable:
                        self[elem] = self_get(elem, 0) + 1
            if kwds:
                self.update(kwds)

        def copy(self):
            'Like dict.copy() but returns a Counter instance instead of a dict.'
            return Counter(self)

        def __delitem__(self, elem):
            'Like dict.__delitem__() but does not raise KeyError for missing values.'
            if elem in self:
                dict.__delitem__(self, elem)

        def __repr__(self):
            if not self:
                return '%s()' % self.__class__.__name__
            items = ', '.join(map('%r: %r'.__mod__, self.most_common()))
            return '%s({%s})' % (self.__class__.__name__, items)

        # Multiset-style mathematical operations discussed in:
        #       Knuth TAOCP Volume II section 4.6.3 exercise 19
        #       and at http://en.wikipedia.org/wiki/Multiset
        #
        # Outputs guaranteed to only include positive counts.
        #
        # To strip negative and zero counts, add-in an empty counter:
        #       c += Counter()

        def __add__(self, other):
            '''Add counts from two counters.

            >>> Counter('abbb') + Counter('bcc')
            Counter({'b': 4, 'c': 2, 'a': 1})

            '''
            if not isinstance(other, Counter):
                return NotImplemented
            result = Counter()
            for elem in set(self) | set(other):
                newcount = self[elem] + other[elem]
                if newcount > 0:
                    result[elem] = newcount
            return result

        def __sub__(self, other):
            ''' Subtract count, but keep only results with positive counts.

            >>> Counter('abbbc') - Counter('bccd')
            Counter({'b': 2, 'a': 1})

            '''
            if not isinstance(other, Counter):
                return NotImplemented
            result = Counter()
            for elem in set(self) | set(other):
                newcount = self[elem] - other[elem]
                if newcount > 0:
                    result[elem] = newcount
            return result

        def __or__(self, other):
            '''Union is the maximum of value in either of the input counters.

            >>> Counter('abbb') | Counter('bcc')
            Counter({'b': 3, 'c': 2, 'a': 1})

            '''
            if not isinstance(other, Counter):
                return NotImplemented
            _max = max
            result = Counter()
            for elem in set(self) | set(other):
                newcount = _max(self[elem], other[elem])
                if newcount > 0:
                    result[elem] = newcount
            return result

        def __and__(self, other):
            ''' Intersection is the minimum of corresponding counts.

            >>> Counter('abbb') & Counter('bcc')
            Counter({'b': 1})

            '''
            if not isinstance(other, Counter):
                return NotImplemented
            _min = min
            result = Counter()
            if len(self) < len(other):
                self, other = other, self
            for elem in ifilter(self.__contains__, other):
                newcount = _min(self[elem], other[elem])
                if newcount > 0:
                    result[elem] = newcount
            return result


    if __name__ == '__main__':
        import doctest
        print doctest.testmod()

Then you can write:

    a = Counter([0,1,2,1,0])
    b = Counter([0, 1, 1])
    c = a - b
    print list(c.elements())  # [0, 2]

User · Answer

    list(set([x for x in a if x not in b]))

Leaves a and b untouched. Is a unique set of "a - b". Done!
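To make the difference from the question's requirement concrete, here is a small comparison on the question's own data. It is only an illustration; the names `filtered` and `remaining` are chosen for this example.

```python
a = [0, 1, 2, 1, 0]
b = [0, 1, 1]

# Membership filter: drops every 0 and every 1, because both occur somewhere in b.
filtered = list(set(x for x in a if x not in b))

# Multiset difference: remove one occurrence from a copy of a per element of b.
remaining = list(a)
for x in b:
    remaining.remove(x)  # raises ValueError if x is no longer present

print(filtered)   # [2]
print(remaining)  # [2, 0]
```

The filter answers "which distinct values of a never appear in b", whereas the question asks for the leftover occurrences, so the surplus 0 is lost in the first result.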

User · Answer

To use a list comprehension, [i for i in a if not i in b or b.remove(i)] would do the trick. It would change b in the process, though. But I agree with jkp and Dyno Fu that using a for loop would be better.

Perhaps someone can create a better example that uses a list comprehension but still is KISS?
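A variant that leaves b untouched can keep a per-value budget of removals in a `collections.Counter` (available from Python 2.7 and 3.1). This is a sketch under the assumption that the elements are hashable; `subtract_keep_order` is a name invented for the example. Like the comprehension, it silently ignores elements of b that are missing from a.

```python
from collections import Counter


def subtract_keep_order(a, b):
    """Return a minus b as a list, keeping the surviving elements in a's order."""
    to_remove = Counter(b)  # how many occurrences of each value still have to be dropped
    result = []
    for x in a:
        if to_remove[x] > 0:
            to_remove[x] -= 1  # drop this occurrence instead of keeping it
        else:
            result.append(x)
    return result


a = [0, 1, 2, 1, 0]
b = [0, 1, 1]
print(subtract_keep_order(a, b))  # [2, 0]
print(b)                          # [0, 1, 1], unchanged
```

Each lookup in the counter takes constant time on average, so the whole pass is linear rather than quadratic.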

User · Answer

    c = [i for i in b if i not in a]

User · Answer

I attempted to find a more elegant solution, but the best I could do was basically the same thing that Dyno Fu said:

    from copy import copy

    def subtract_lists(a, b):
        """
        >>> a = [0, 1, 2, 1, 0]
        >>> b = [0, 1, 1]
        >>> subtract_lists(a, b)
        [2, 0]

        >>> import random
        >>> size = 10000
        >>> a = [random.randrange(100) for _ in range(size)]
        >>> b = [random.randrange(100) for _ in range(size)]
        >>> c = subtract_lists(a, b)
        >>> assert all((x in a) for x in c)
        """
        a = copy(a)
        for x in b:
            if x in a:
                a.remove(x)
        return a

User · Answer

Python 2.7 and 3.1 added the collections.Counter class, which is a dictionary subclass that maps elements to the number of occurrences of the element. This can be used as a multiset. You can do something like this:

    from collections import Counter
    a = Counter([0, 1, 2, 1, 0])
    b = Counter([0, 1, 1])
    c = a - b  # ignores items in b missing in a

    print(list(c.elements()))  # -> [0, 2]

As well, if you want to check that every element in b is in a:

    # a[key] returns 0 if key not in a, instead of raising an exception
    assert all(a[key] >= b[key] for key in b)

But since you are stuck with 2.5, you could try importing it and define your own version if that fails. That way you will be sure to get the latest version if it is available, and fall back to a working version if not. You will also benefit from speed improvements if it gets converted to a C implementation in the future.

    try:
        from collections import Counter
    except ImportError:
        class Counter(dict):
            ...

You can find the current Python source here.
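Putting the subtraction and the containment check together gives the exception behaviour the question asks for. The following is a minimal sketch, assuming hashable elements; the function name `subtract_lists` and the error message are illustrative choices, not part of the `collections` API.

```python
from collections import Counter


def subtract_lists(a, b):
    """Multiset difference a - b; raises ValueError if b is not contained in a."""
    count_a = Counter(a)
    count_b = Counter(b)
    missing = count_b - count_a  # whatever b has more of than a
    if missing:
        raise ValueError("not in a: %r" % list(missing.elements()))
    return list((count_a - count_b).elements())


print(sorted(subtract_lists([0, 1, 2, 1, 0], [0, 1, 1])))  # [0, 2]
```

Counter subtraction keeps only positive counts, which is why `count_b - count_a` is non-empty exactly when some element of b is not covered by a.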

User · Answer

I would do it in an easier way:

    a_b = [e for e in a if not e in b]

...as wich wrote, this is wrong: it works only if the items are unique in the lists. And if they are, it's better to use

    a_b = list(set(a) - set(b))

User · Answer

You can use the map construct to do this. It looks quite OK, but beware that the map line itself will return a list of Nones.

    a = [1, 2, 3]
    b = [2, 3]

    map(lambda x: a.remove(x), b)
    a
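A caveat for readers on Python 3: there `map` returns a lazy iterator, so the `map` line on its own removes nothing until the iterator is consumed. The snippet below is a small illustration of that behaviour; the names `lazy` and `untouched` are made up for the example.

```python
a = [1, 2, 3]
b = [2, 3]

lazy = map(a.remove, b)  # Python 3: only builds an iterator, nothing is removed yet
untouched = list(a)      # snapshot taken after the map call

for x in b:              # an explicit loop performs the removals immediately
    a.remove(x)

print(untouched)  # [1, 2, 3]
print(a)          # [1]
```

Using `map` purely for its side effects is fragile for this reason, which is one more argument for the plain for loop.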

User · Answer

Here's a relatively long but efficient and readable solution. It's O(n).

    def list_diff(list1, list2):
        counts = {}
        # Count how often each element occurs in list1.
        for x in list1:
            try:
                counts[x] += 1
            except KeyError:
                counts[x] = 1
        # Cancel one occurrence for every element of list2.
        for x in list2:
            try:
                counts[x] -= 1
            except KeyError:
                raise ValueError('All elements of list2 not in list1')
            if counts[x] < 0:
                raise ValueError('All elements of list2 not in list1')
        result = []
        for k, v in counts.iteritems():  # Python 2; use counts.items() on Python 3
            result += v*[k]
        return result

    a = [0, 1, 1, 2, 0]
    b = [0, 1, 1]
    %timeit list_diff(a, b)
    %timeit list_diff(1000*a, 1000*b)
    %timeit list_diff(1000000*a, 1000000*b)
    100000 loops, best of 3: 4.8 µs per loop
    1000 loops, best of 3: 1.18 ms per loop
    1 loops, best of 3: 1.21 s per loop

User · Answer

I'm not sure what the objection to a for loop is: there is no multiset in Python, so you can't use a builtin container to help you out.

Seems to me anything on one line (if possible) will probably be hellishly complex to understand. Go for readability and KISS. Python is not C.
