[python] Compare object instances for equality by their attributes

I have a class MyClass, which contains two member variables foo and bar:

class MyClass:
    def __init__(self, foo, bar):
        self.foo = foo
        self.bar = bar

I have two instances of this class, each of which has identical values for foo and bar:

x = MyClass('foo', 'bar')
y = MyClass('foo', 'bar')

However, when I compare them for equality, Python returns False:

>>> x == y
False

How can I make python consider these two objects equal?

This question is related to python equality

The answer is


I tried the initial example (see 7 above) and it did not work in ipython. Note that cmp(obj1,obj2) returns a "1" when implemented using two identical object instances. Oddly enough when I modify one of the attribute values and recompare, using cmp(obj1,obj2) the object continues to return a "1". (sigh...)

Ok, so what you need to do is iterate two objects and compare each attribute using the == sign.


class Node:
    def __init__(self, value):
        self.value = value
        self.next = None

    def __repr__(self):
        return str(self.value)

    def __eq__(self,other):
        return self.value == other.value

node1 = Node(1)
node2 = Node(1)

print(f'node1 id:{id(node1)}')
print(f'node2 id:{id(node2)}')
print(node1 == node2)
>>> node1 id:4396696848
>>> node2 id:4396698000
>>> True

Implement the __eq__ method in your class; something like this:

def __eq__(self, other):
    return self.path == other.path and self.title == other.title

Edit: if you want your objects to compare equal if and only if they have equal instance dictionaries:

def __eq__(self, other):
    return self.__dict__ == other.__dict__

When comparing instances of objects, the __cmp__ function is called.

If the == operator is not working for you by default, you can always redefine the __cmp__ function for the object.

Edit:

As has been pointed out, the __cmp__ function is deprecated since 3.0. Instead you should use the “rich comparison” methods.


Depending on your specific case, you could do:

>>> vars(x) == vars(y)
True

See Python dictionary from an object's fields


If you're dealing with one or more classes which you can't change from the inside, there are generic and simple ways to do this that also don't depend on a diff-specific library:

Easiest, unsafe-for-very-complex-objects method

pickle.dumps(a) == pickle.dumps(b)

pickle is a very common serialization lib for Python objects, and will thus be able to serialize pretty much anything, really. In the above snippet I'm comparing the str from serialized a with the one from b. Unlike the next method, this one has the advantage of also type checking custom classes.

The biggest hassle: due to specific ordering and [de/en]coding methods, pickle may not yield the same result for equal objects, specially when dealing with more complex ones (e.g. lists of nested custom-class instances) like you'll frequently find in some third-party libs. For those cases, I'd recommend a different approach:

Thorough, safe-for-any-object method

You could write a recursive reflection that'll give you serializable objects, and then compare results

from collections.abc import Iterable

BASE_TYPES = [str, int, float, bool, type(None)]


def base_typed(obj):
    """Recursive reflection method to convert any object property into a comparable form.
    """
    T = type(obj)
    from_numpy = T.__module__ == 'numpy'

    if T in BASE_TYPES or callable(obj) or (from_numpy and not isinstance(T, Iterable)):
        return obj

    if isinstance(obj, Iterable):
        base_items = [base_typed(item) for item in obj]
        return base_items if from_numpy else T(base_items)

    d = obj if T is dict else obj.__dict__

    return {k: base_typed(v) for k, v in d.items()}


def deep_equals(*args):
    return all(base_typed(args[0]) == base_typed(other) for other in args[1:])

Now it doesn't matter what your objects are, deep equality is assured to work

>>> from sklearn.ensemble import RandomForestClassifier
>>>
>>> a = RandomForestClassifier(max_depth=2, random_state=42)
>>> b = RandomForestClassifier(max_depth=2, random_state=42)
>>> 
>>> deep_equals(a, b)
True

The number of comparables doesn't matter as well

>>> c = RandomForestClassifier(max_depth=2, random_state=1000)
>>> deep_equals(a, b, c)
False

My use case for this was checking deep equality among a diverse set of already trained Machine Learning models inside BDD tests. The models belonged to a diverse set of third-party libs. Certainly implementing __eq__ like other answers here suggest wasn't an option for me.

Covering all the bases

You may be in a scenario where one or more of the custom classes being compared do not have a __dict__ implementation. That's not common by any means, but it is the case of a subtype within sklearn's Random Forest classifier: <type 'sklearn.tree._tree.Tree'>. Treat these situations in a case by case basis - e.g. specifically, I decided to replace the content of the afflicted type with the content of a method that gives me representative information on the instance (in this case, the __getstate__ method). For such, the second-to-last row in base_typed became

d = obj if T is dict else obj.__dict__ if '__dict__' in dir(obj) else obj.__getstate__()

Edit: for the sake of organization, I replaced the hideous oneliner above with return dict_from(obj). Here, dict_from is a really generic reflection made to accommodate more obscure libs (I'm looking at you, Doc2Vec)

def isproperty(prop, obj):
    return not callable(getattr(obj, prop)) and not prop.startswith('_')


def dict_from(obj):
    """Converts dict-like objects into dicts
    """
    if isinstance(obj, dict):
        # Dict and subtypes are directly converted
        d = dict(obj)

    elif '__dict__' in dir(obj):
        # Use standard dict representation when available
        d = obj.__dict__

    elif str(type(obj)) == 'sklearn.tree._tree.Tree':
        # Replaces sklearn trees with their state metadata
        d = obj.__getstate__()

    else:
        # Extract non-callable, non-private attributes with reflection
        kv = [(p, getattr(obj, p)) for p in dir(obj) if isproperty(p, obj)]
        d = {k: v for k, v in kv}

    return {k: base_typed(v) for k, v in d.items()}

Do mind none of the above methods yield True for objects with the same key-value pairs in differing order, as in

>>> a = {'foo':[], 'bar':{}}
>>> b = {'bar':{}, 'foo':[]}
>>> pickle.dumps(a) == pickle.dumps(b)
False

But if you want that you could use Python's built-in sorted method beforehand anyway.


If you want to get an attribute-by-attribute comparison, and see if and where it fails, you can use the following list comprehension:

[i for i,j in 
 zip([getattr(obj_1, attr) for attr in dir(obj_1)],
     [getattr(obj_2, attr) for attr in dir(obj_2)]) 
 if not i==j]

The extra advantage here is that you can squeeze it one line and enter in the "Evaluate Expression" window when debugging in PyCharm.


You override the rich comparison operators in your object.

class MyClass:
 def __lt__(self, other):
      # return comparison
 def __le__(self, other):
      # return comparison
 def __eq__(self, other):
      # return comparison
 def __ne__(self, other):
      # return comparison
 def __gt__(self, other):
      # return comparison
 def __ge__(self, other):
      # return comparison

Like this:

    def __eq__(self, other):
        return self._id == other._id

Instance of a class when compared with == comes to non-equal. The best way is to ass the cmp function to your class which will do the stuff.

If you want to do comparison by the content you can simply use cmp(obj1,obj2)

In your case cmp(doc1,doc2) It will return -1 if the content wise they are same.


With Dataclasses in Python 3.7 (and above), a comparison of object instances for equality is an inbuilt feature.

A backport for Dataclasses is available for Python 3.6.

(Py37) nsc@nsc-vbox:~$ python
Python 3.7.5 (default, Nov  7 2019, 10:50:52) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from dataclasses import dataclass
>>> @dataclass
... class MyClass():
...     foo: str
...     bar: str
... 
>>> x = MyClass(foo="foo", bar="bar")
>>> y = MyClass(foo="foo", bar="bar")
>>> x == y
True

As a summary :

  1. It's advised to implement __eq__ rather than __cmp__, except if you run python <= 2.0 (__eq__ has been added in 2.1)
  2. Don't forget to also implement __ne__ (should be something like return not self.__eq__(other) or return not self == other except very special case)
  3. Don`t forget that the operator must be implemented in each custom class you want to compare (see example below).
  4. If you want to compare with object that can be None, you must implement it. The interpreter cannot guess it ... (see example below)

    class B(object):
      def __init__(self):
        self.name = "toto"
      def __eq__(self, other):
        if other is None:
          return False
        return self.name == other.name
    
    class A(object):
      def __init__(self):
        self.toto = "titi"
        self.b_inst = B()
      def __eq__(self, other):
        if other is None:
          return False
        return (self.toto, self.b_inst) == (other.toto, other.b_inst)
    

I wrote this and placed it in a test/utils module in my project. For cases when its not a class, just plan ol' dict, this will traverse both objects and ensure

  1. every attribute is equal to its counterpart
  2. No dangling attributes exist (attrs that only exist on one object)

Its big... its not sexy... but oh boi does it work!

def assertObjectsEqual(obj_a, obj_b):

    def _assert(a, b):
        if a == b:
            return
        raise AssertionError(f'{a} !== {b} inside assertObjectsEqual')

    def _check(a, b):
        if a is None or b is None:
            _assert(a, b)
        for k,v in a.items():
            if isinstance(v, dict):
                assertObjectsEqual(v, b[k])
            else:
                _assert(v, b[k])

    # Asserting both directions is more work
    # but it ensures no dangling values on
    # on either object
    _check(obj_a, obj_b)
    _check(obj_b, obj_a)

You can clean it up a little by removing the _assert and just using plain ol' assert but then the message you get when it fails is very unhelpful.


Below works (in my limited testing) by doing deep compare between two object hierarchies. In handles various cases including the cases when objects themselves or their attributes are dictionaries.

def deep_comp(o1:Any, o2:Any)->bool:
    # NOTE: dict don't have __dict__
    o1d = getattr(o1, '__dict__', None)
    o2d = getattr(o2, '__dict__', None)

    # if both are objects
    if o1d is not None and o2d is not None:
        # we will compare their dictionaries
        o1, o2 = o1.__dict__, o2.__dict__

    if o1 is not None and o2 is not None:
        # if both are dictionaries, we will compare each key
        if isinstance(o1, dict) and isinstance(o2, dict):
            for k in set().union(o1.keys() ,o2.keys()):
                if k in o1 and k in o2:
                    if not deep_comp(o1[k], o2[k]):
                        return False
                else:
                    return False # some key missing
            return True
    # mismatched object types or both are scalers, or one or both None
    return o1 == o2

This is a very tricky code so please add any cases that might not work for you in comments.


You should implement the method __eq__:

 class MyClass:
      def __init__(self, foo, bar, name):
           self.foo = foo
           self.bar = bar
           self.name = name

      def __eq__(self,other):
           if not isinstance(other,MyClass):
                return NotImplemented
           else:
                #string lists of all method names and properties of each of these objects
                prop_names1 = list(self.__dict__)
                prop_names2 = list(other.__dict__)

                n = len(prop_names1) #number of properties
                for i in range(n):
                     if getattr(self,prop_names1[i]) != getattr(other,prop_names2[i]):
                          return False

                return True