[python] Determine if 2 lists have the same elements, regardless of order?

Sorry for the simple question, but I'm having a hard time finding the answer.

When I compare 2 lists, I want to know if they are "equal" in that they have the same contents, but in different order.

Ex:

x = ['a', 'b']
y = ['b', 'a']

I want x == y to evaluate to True.

This question is related to python list equality python-2.x

The answer is


As mentioned in comments above, the general case is a pain. It is fairly easy if all items are hashable or all items are sortable. However I have recently had to try solve the general case. Here is my solution. I realised after posting that this is a duplicate to a solution above that I missed on the first pass. Anyway, if you use slices rather than list.remove() you can compare immutable sequences.

def sequences_contain_same_items(a, b):
    for item in a:
        try:
            i = b.index(item)
        except ValueError:
            return False
        b = b[:i] + b[i+1:]
    return not b

This seems to work, though possibly cumbersome for large lists.

>>> A = [0, 1]
>>> B = [1, 0]
>>> C = [0, 2]
>>> not sum([not i in A for i in B])
True
>>> not sum([not i in A for i in C])
False
>>> 

However, if each list must contain all the elements of other then the above code is problematic.

>>> A = [0, 1, 2]
>>> not sum([not i in A for i in B])
True

The problem arises when len(A) != len(B) and, in this example, len(A) > len(B). To avoid this, you can add one more statement.

>>> not sum([not i in A for i in B]) if len(A) == len(B) else False
False

One more thing, I benchmarked my solution with timeit.repeat, under the same conditions used by Aaron Hall in his post. As suspected, the results are disappointing. My method is the last one. set(x) == set(y) it is.

>>> def foocomprehend(): return not sum([not i in data for i in data2])
>>> min(timeit.repeat('fooset()', 'from __main__ import fooset, foocount, foocomprehend'))
25.2893661496
>>> min(timeit.repeat('foosort()', 'from __main__ import fooset, foocount, foocomprehend'))
94.3974742993
>>> min(timeit.repeat('foocomprehend()', 'from __main__ import fooset, foocount, foocomprehend'))
187.224562545

Determine if 2 lists have the same elements, regardless of order?

Inferring from your example:

x = ['a', 'b']
y = ['b', 'a']

that the elements of the lists won't be repeated (they are unique) as well as hashable (which strings and other certain immutable python objects are), the most direct and computationally efficient answer uses Python's builtin sets, (which are semantically like mathematical sets you may have learned about in school).

set(x) == set(y) # prefer this if elements are hashable

In the case that the elements are hashable, but non-unique, the collections.Counter also works semantically as a multiset, but it is far slower:

from collections import Counter
Counter(x) == Counter(y)

Prefer to use sorted:

sorted(x) == sorted(y) 

if the elements are orderable. This would account for non-unique or non-hashable circumstances, but this could be much slower than using sets.

Empirical Experiment

An empirical experiment concludes that one should prefer set, then sorted. Only opt for Counter if you need other things like counts or further usage as a multiset.

First setup:

import timeit
import random
from collections import Counter

data = [str(random.randint(0, 100000)) for i in xrange(100)]
data2 = data[:]     # copy the list into a new one

def sets_equal(): 
    return set(data) == set(data2)

def counters_equal(): 
    return Counter(data) == Counter(data2)

def sorted_lists_equal(): 
    return sorted(data) == sorted(data2)

And testing:

>>> min(timeit.repeat(sets_equal))
13.976069927215576
>>> min(timeit.repeat(counters_equal))
73.17287588119507
>>> min(timeit.repeat(sorted_lists_equal))
36.177085876464844

So we see that comparing sets is the fastest solution, and comparing sorted lists is second fastest.


Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to list

Convert List to Pandas Dataframe Column Python find elements in one list that are not in the other Sorting a list with stream.sorted() in Java Python Loop: List Index Out of Range How to combine two lists in R How do I multiply each element in a list by a number? Save a list to a .txt file The most efficient way to remove first N elements in a list? TypeError: list indices must be integers or slices, not str Parse JSON String into List<string>

Examples related to equality

Python if not == vs if != Comparing arrays for equality in C++ Correct way to override Equals() and GetHashCode() Determine if 2 lists have the same elements, regardless of order? equals vs Arrays.equals in Java What is the difference between == and equals() in Java? What's the difference between equal?, eql?, ===, and ==? bash string equality jQuery object equality String comparison in Python: is vs. ==

Examples related to python-2.x

Combine several images horizontally with Python How to print variables without spaces between values How to return dictionary keys as a list in Python? How to add an element to the beginning of an OrderedDict? How to read a CSV file from a URL with Python? Malformed String ValueError ast.literal_eval() with String representation of Tuple Relative imports for the billionth time How do you use subprocess.check_output() in Python? write() versus writelines() and concatenated strings How to select a directory and store the location using tkinter in Python