[python] Comparing two dictionaries and checking how many (key, value) pairs are equal

I have two dictionaries, but for simplification, I will take these two:

>>> x = dict(a=1, b=2)
>>> y = dict(a=2, b=2)

Now, I want to compare whether each key, value pair in x has the same corresponding value in y. So I wrote this:

>>> for x_values, y_values in zip(x.iteritems(), y.iteritems()):
        if x_values == y_values:
            print 'Ok', x_values, y_values
        else:
            print 'Not', x_values, y_values

And it works since a tuple is returned and then compared for equality.

My questions:

Is this correct? Is there a better way to do this? Better not in speed, I am talking about code elegance.

UPDATE: I forgot to mention that I have to check how many key, value pairs are equal.

This question is related to python dictionary comparison

The answer is


>>> hash_1
{'a': 'foo', 'b': 'bar'}
>>> hash_2
{'a': 'foo', 'b': 'bar'}
>>> set_1 = set (hash_1.iteritems())
>>> set_1
set([('a', 'foo'), ('b', 'bar')])
>>> set_2 = set (hash_2.iteritems())
>>> set_2
set([('a', 'foo'), ('b', 'bar')])
>>> len (set_1.difference(set_2))
0
>>> if (len(set_1.difference(set_2)) | len(set_2.difference(set_1))) == False:
...    print "The two hashes match."
...
The two hashes match.
>>> hash_2['c'] = 'baz'
>>> hash_2
{'a': 'foo', 'c': 'baz', 'b': 'bar'}
>>> if (len(set_1.difference(set_2)) | len(set_2.difference(set_1))) == False:
...     print "The two hashes match."
...
>>>
>>> hash_2.pop('c')
'baz'

Here's another option:

>>> id(hash_1)
140640738806240
>>> id(hash_2)
140640738994848

As you can see, the two ids are different, so these are two distinct objects. The rich comparison operators still do the trick:

>>> hash_1 == hash_2
True
>>>
>>> hash_2
{'a': 'foo', 'b': 'bar'}
>>> set_2 = set (hash_2.iteritems())
>>> if (len(set_1.difference(set_2)) | len(set_2.difference(set_1))) == False:
...     print "The two hashes match."
...
The two hashes match.
>>>

What you want to do is simply x==y

What you do is not a good idea, because the items in a dictionary are not supposed to have any order. You might be comparing [('a',1),('b',1)] with [('b',1), ('a',1)] (same dictionaries, different order).

For example, see this:

>>> x = dict(a=2, b=2,c=3, d=4)
>>> x
{'a': 2, 'c': 3, 'b': 2, 'd': 4}
>>> y = dict(b=2,c=3, d=4)
>>> y
{'c': 3, 'b': 2, 'd': 4}
>>> zip(x.iteritems(), y.iteritems())
[(('a', 2), ('c', 3)), (('c', 3), ('b', 2)), (('b', 2), ('d', 4))]

The difference is only one item, but your algorithm will see that all items are different
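A minimal sketch of a key-based comparison in Python 3, which also covers the update in the question (how many pairs are equal). The helper name count_equal_pairs is made up for illustration; it looks each key of x up in y instead of relying on iteration order.

```python
def count_equal_pairs(x, y):
    """Count the (key, value) pairs of x that also appear in y."""
    # Look each key up in y; a missing key simply does not count as a match.
    return sum(1 for key, value in x.items() if key in y and y[key] == value)


x = dict(a=2, b=2, c=3, d=4)
y = dict(b=2, c=3, d=4)
print(count_equal_pairs(x, y))  # 3
```

With the dictionaries from the question, dict(a=1, b=2) and dict(a=2, b=2), the same call returns 1.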


Code

def equal(a, b):
    type_a = type(a)
    type_b = type(b)
    
    if type_a != type_b:
        return False
    
    if isinstance(a, dict):
        if len(a) != len(b):
            return False
        for key in a:
            if key not in b:
                return False
            if not equal(a[key], b[key]):
                return False
        return True

    elif isinstance(a, list):
        if len(a) != len(b):
            return False
        # Work on a copy so the caller's lists are not emptied by the comparison.
        remaining = list(b)
        for x in a:
            index = indexof(x, remaining)
            if index == -1:
                return False
            del remaining[index]
        return True
        
    else:
        return a == b

def indexof(x, a):
    for i in range(len(a)):
        if equal(x, a[i]):
            return i
    return -1

Test

>>> a = {
    'number': 1,
    'list': ['one', 'two']
}
>>> b = {
    'list': ['two', 'one'],
    'number': 1
}
>>> equal(a, b)
True

The function is fine IMO, clear and intuitive. But just to give you (another) answer, here is my go:

def compare_dict(dict1, dict2):
    for x1 in dict1.keys():
        z = dict1.get(x1) == dict2.get(x1)
        if not z:
            print('key', x1)
            print('value A', dict1.get(x1), '\nvalue B', dict2.get(x1))
            print('-----\n')

Can be useful for you or for anyone else..

EDIT:

I have created a recursive version of the one above, which I have not seen in the other answers. It relies on NumPy (np.all):

import numpy as np

def compare_dict(a, b):
    # Compares two dictionaries recursively.
    # Prints the entries that are not equal.
    res_compare = []
    for k in set(list(a.keys()) + list(b.keys())):
        if k not in a or k not in b:
            # A key missing on one side counts as a difference.
            z0 = False
        elif isinstance(a[k], dict) and isinstance(b[k], dict):
            z0 = compare_dict(a[k], b[k])
        else:
            z0 = a[k] == b[k]

        z0_bool = bool(np.all(z0))
        res_compare.append(z0_bool)
        if not z0_bool:
            print(k, a.get(k), b.get(k))
    return bool(np.all(res_compare))

A simple comparison with == is enough (tested on Python 3.8). It works even when the same items were inserted in a different order (last example). The best thing is, you don't need a third-party package to accomplish this.

a = {'one': 'dog', 'two': 'cat', 'three': 'mouse'}
b = {'one': 'dog', 'two': 'cat', 'three': 'mouse'}

c = {'one': 'dog', 'two': 'cat', 'three': 'mouse'}
d = {'one': 'dog', 'two': 'cat', 'three': 'mouse', 'four': 'fish'}

e = {'one': 'cat', 'two': 'dog', 'three': 'mouse'}
f = {'one': 'dog', 'two': 'cat', 'three': 'mouse'}

g = {'two': 'cat', 'one': 'dog', 'three': 'mouse'}
h = {'one': 'dog', 'two': 'cat', 'three': 'mouse'}


print(a == b) # True
print(c == d) # False
print(e == f) # False
print(g == h) # True
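As a small illustration (not part of the original answer), == also compares nested values recursively, so nested dicts need no helper function as long as the order of any nested lists is meant to matter. The variable names are arbitrary.

```python
left = {'number': 1, 'nested': {'x': [1, 2], 'y': {'z': None}}}
right = {'nested': {'y': {'z': None}, 'x': [1, 2]}, 'number': 1}
reordered_list = {'number': 1, 'nested': {'x': [2, 1], 'y': {'z': None}}}

print(left == right)           # True: key order is irrelevant at every level
print(left == reordered_list)  # False: lists are compared element by element, in order
```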

Being late in my response is better than never!

Checking for inequality can be more efficient than checking for equality, because the function can return as soon as it finds one mismatch: two dicts are not equal if any key/value pair in one is not found in the other. The code below takes into account that you may be comparing a defaultdict, and therefore uses get instead of item access with [].

The key itself is passed as the default in the get call, as a kind of sentinel, in case one dict has None as a value and that key does not exist in the other. The get != condition is also checked before the not in condition for efficiency, because it checks keys and values from both sides at the same time.

def Dicts_Not_Equal(first,second):
    """ return True if both do not have same length or if any keys and values are not the same """
    if len(first) == len(second): 
        for k in first:
            if first.get(k) != second.get(k,k) or k not in second: return (True)
        for k in second:         
            if first.get(k,k) != second.get(k) or k not in first: return (True)
        return (False)   
    return (True)

def dict_compare(d1, d2):
    d1_keys = set(d1.keys())
    d2_keys = set(d2.keys())
    shared_keys = d1_keys.intersection(d2_keys)
    added = d1_keys - d2_keys
    removed = d2_keys - d1_keys
    modified = {o : (d1[o], d2[o]) for o in shared_keys if d1[o] != d2[o]}
    same = set(o for o in shared_keys if d1[o] == d2[o])
    return added, removed, modified, same

x = dict(a=1, b=2)
y = dict(a=2, b=2)
added, removed, modified, same = dict_compare(x, y)

I am using this solution that works perfectly for me in Python 3


import logging
log = logging.getLogger(__name__)

...

    def deep_compare(self, left, right, level=0):
        if type(left) != type(right):
            log.info("Exit 1 - Different types")
            return False

        elif type(left) is dict:
            # Dict comparison: both sides must have the same keys
            if len(left) != len(right):
                log.info("Exit 2 - different number of keys")
                return False
            for key in left:
                if key not in right:
                    log.info("Exit 2 - missing {} in right".format(key))
                    return False
                else:
                    # Look the key up as-is (no str() cast) and recurse via self
                    if not self.deep_compare(left[key], right[key], level + 1):
                        log.info("Exit 3 - different children")
                        return False
            return True
        elif type(left) is list:
            # List comparison: same length, and every left item must occur in right
            if len(left) != len(right):
                log.info("Exit 4 - different lengths")
                return False
            for key in left:
                if key not in right:
                    log.info("Exit 4 - missing {} in right".format(key))
                    return False
                else:
                    if not self.deep_compare(left[left.index(key)], right[right.index(key)], level + 1):
                        log.info("Exit 5 - different children")
                        return False
            return True
        else:
            # Other comparison
            return left == right

It compares dicts, lists, and any other types that implement the "==" operator themselves. If you need to compare something else, add a new branch to the "if tree".

Hope that helps.


Since it seems nobody mentioned deepdiff, I will add it here for completeness. I find it very convenient for getting diff of (nested) objects in general:

Installation

pip install deepdiff

Sample code

import deepdiff
import json

dict_1 = {
    "a": 1,
    "nested": {
        "b": 1,
    }
}

dict_2 = {
    "a": 2,
    "nested": {
        "b": 2,
    }
}

diff = deepdiff.DeepDiff(dict_1, dict_2)
print(json.dumps(diff, indent=4))

Output

{
    "values_changed": {
        "root['a']": {
            "new_value": 2,
            "old_value": 1
        },
        "root['nested']['b']": {
            "new_value": 2,
            "old_value": 1
        }
    }
}

Note about pretty-printing the result for inspection: The above code works if both dicts have the same attribute keys (with possibly different attribute values, as in the example). However, if an "extra" attribute is present in one of the dicts, json.dumps() fails with

TypeError: Object of type PrettyOrderedSet is not JSON serializable

Solution: use diff.to_json() and json.loads() / json.dumps() to pretty-print:

import deepdiff
import json

dict_1 = {
    "a": 1,
    "nested": {
        "b": 1,
    },
    "extra": 3
}

dict_2 = {
    "a": 2,
    "nested": {
        "b": 2,
    }
}

diff = deepdiff.DeepDiff(dict_1, dict_2)
print(json.dumps(json.loads(diff.to_json()), indent=4))  

Output:

{
    "dictionary_item_removed": [
        "root['extra']"
    ],
    "values_changed": {
        "root['a']": {
            "new_value": 2,
            "old_value": 1
        },
        "root['nested']['b']": {
            "new_value": 2,
            "old_value": 1
        }
    }
}

Alternative: use pprint, results in a different formatting:

import pprint

# same code as above

pprint.pprint(diff, indent=4)

Output:

{   'dictionary_item_removed': [root['extra']],
    'values_changed': {   "root['a']": {   'new_value': 2,
                                           'old_value': 1},
                          "root['nested']['b']": {   'new_value': 2,
                                                     'old_value': 1}}}

Why not just iterate through one dictionary and check the other in the process (assuming both dictionaries have the same keys)?

x = dict(a=1, b=2)
y = dict(a=2, b=2)

for key, val in x.items():
    if val == y[key]:
        print ('Ok', val, y[key])
    else:
        print ('Not', val, y[key])

Output:

Not 1 2
Ok 2 2

see dictionary view objects: https://docs.python.org/2/library/stdtypes.html#dict

This way you can subtract one view from the other, and it will return the set of key/value pairs that are in the first view but not in the second (here, the pairs that changed in the updated dict):

original = {'one':1,'two':2,'ACTION':'ADD'}
originalView=original.viewitems()
updatedDict = {'one':1,'two':2,'ACTION':'REPLACE'}
updatedDictView=updatedDict.viewitems()
delta=updatedDictView - originalView
print delta
>>set([('ACTION', 'REPLACE')])

You can intersect, union, difference (shown above), symmetric difference these dictionary view objects.
Better? Faster? - not sure, but part of the standard library - which makes it a big plus for portability
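In Python 3, viewitems() is gone and items() itself returns a set-like view, so the same idea can be sketched as follows. This assumes all values are hashable; the variable names are chosen for illustration only.

```python
original = {'one': 1, 'two': 2, 'ACTION': 'ADD'}
updated = {'one': 1, 'two': 2, 'ACTION': 'REPLACE'}

# Pairs present in `updated` but not in `original`.
delta = updated.items() - original.items()
print(delta)  # {('ACTION', 'REPLACE')}

# Pairs that are identical in both dicts, and how many there are.
same = original.items() & updated.items()
print(len(same))  # 2
```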


The code below will help you compare lists of dicts in Python (written for Python 2):

def compate_generic_types(object1, object2):
    if isinstance(object1, str) and isinstance(object2, str):
        return object1 == object2
    elif isinstance(object1, unicode) and isinstance(object2, unicode):
        return object1 == object2
    elif isinstance(object1, bool) and isinstance(object2, bool):
        return object1 == object2
    elif isinstance(object1, int) and isinstance(object2, int):
        return object1 == object2
    elif isinstance(object1, float) and isinstance(object2, float):
        return object1 == object2
    elif isinstance(object1, float) and isinstance(object2, int):
        return object1 == float(object2)
    elif isinstance(object1, int) and isinstance(object2, float):
        return float(object1) == object2

    # Any other combination of types: fall back to == instead of assuming a match
    return object1 == object2

def deep_list_compare(object1, object2):
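    retval = True
    # Lists of different length can never match (and would raise IndexError below)
    if len(object1) != len(object2):
        return False
    count = len(object1)
    object1 = sorted(object1)
    object2 = sorted(object2)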
    for x in range(count):
        if isinstance(object1[x], dict) and isinstance(object2[x], dict):
            retval = deep_dict_compare(object1[x], object2[x])
            if retval is False:
                print "Unable to match [{0}] element in list".format(x)
                return False
        elif isinstance(object1[x], list) and isinstance(object2[x], list):
            retval = deep_list_compare(object1[x], object2[x])
            if retval is False:
                print "Unable to match [{0}] element in list".format(x)
                return False
        else:
            retval = compate_generic_types(object1[x], object2[x])
            if retval is False:
                print "Unable to match [{0}] element in list".format(x)
                return False

    return retval

def deep_dict_compare(object1, object2):
    retval = True

    if len(object1) != len(object2):
        return False

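    for k in object1.iterkeys():
        # A key missing from object2 is a mismatch, not a KeyError
        if k not in object2:
            print "Unable to match [{0}]".format(k)
            return False
        obj1 = object1[k]
        obj2 = object2[k]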
        if isinstance(obj1, list) and isinstance(obj2, list):
            retval = deep_list_compare(obj1, obj2)
            if retval is False:
                print "Unable to match [{0}]".format(k)
                return False

        elif isinstance(obj1, dict) and isinstance(obj2, dict):
            retval = deep_dict_compare(obj1, obj2)
            if retval is False:
                print "Unable to match [{0}]".format(k)
                return False
        else:
            retval = compate_generic_types(obj1, obj2)
            if retval is False:
                print "Unable to match [{0}]".format(k)
                return False

    return retval

I'm new to python but I ended up doing something similar to @mouad

unmatched_item = set(dict_1.items()) ^ set(dict_2.items())
len(unmatched_item) # should be 0

On sets, the ^ operator computes the symmetric difference: every item that appears in both sets is eliminated, so the result is empty when the two dicts are the same.


import json

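# sort_keys=True makes the comparison independent of key insertion order
if json.dumps(dict1, sort_keys=True) == json.dumps(dict2, sort_keys=True):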
    print("Equal")

dic1 == dic2

From python docs:

The following examples all return a dictionary equal to {"one": 1, "two": 2, "three": 3}:

>>> a = dict(one=1, two=2, three=3)
>>> b = {'one': 1, 'two': 2, 'three': 3}
>>> c = dict(zip(['one', 'two', 'three'], [1, 2, 3]))
>>> d = dict([('two', 2), ('one', 1), ('three', 3)])
>>> e = dict({'three': 3, 'one': 1, 'two': 2})
>>> a == b == c == d == e
True

Providing keyword arguments as in the first example only works for keys that are valid Python identifiers. Otherwise, any valid keys can be used.

Valid for both py2 and py3.


To test if two dicts are equal in keys and values:

def dicts_equal(d1,d2):
    """ return True if all keys and values are the same """
    return all(k in d2 and d1[k] == d2[k]
               for k in d1) \
        and all(k in d1 and d1[k] == d2[k]
               for k in d2)

If you want to return the values which differ, write it differently:

def dict1_minus_d2(d1, d2):
    """ return the subset of d1 where the keys don't exist in d2 or
        the values in d2 are different, as a dict """
    return {k: v for k, v in d1.items() if k not in d2 or v != d2[k]}

You would have to call it twice, once in each direction, i.e.

only_in_d1 = dict1_minus_d2(d1, d2)
only_in_d2 = dict1_minus_d2(d2, d1)

In Python 3.6, it can be done as:

ret = len(dict_1) == len(dict_2) and all(item in dict_2.items() for item in dict_1.items())

The ret variable will be True if the dicts have the same length and all the items of dict_1 are present in dict_2.


for python3:

data_set_a = dict_a.items()
data_set_b = dict_b.items()

difference_set = data_set_a ^ data_set_b

In PyUnit there's a method which compares dictionaries beautifully. I tested it using the following two dictionaries, and it does exactly what you're looking for.

d1 = {1: "value1",
      2: [{"subKey1":"subValue1",
           "subKey2":"subValue2"}]}
d2 = {1: "value1",
      2: [{"subKey2":"subValue2",
           "subKey1": "subValue1"}]
      }


def assertDictEqual(self, d1, d2, msg=None):
        self.assertIsInstance(d1, dict, 'First argument is not a dictionary')
        self.assertIsInstance(d2, dict, 'Second argument is not a dictionary')

        if d1 != d2:
            standardMsg = '%s != %s' % (safe_repr(d1, True), safe_repr(d2, True))
            diff = ('\n' + '\n'.join(difflib.ndiff(
                           pprint.pformat(d1).splitlines(),
                           pprint.pformat(d2).splitlines())))
            standardMsg = self._truncateMessage(standardMsg, diff)
            self.fail(self._formatMessage(msg, standardMsg))

I'm not recommending importing unittest into your production code. My thought is the source in PyUnit could be re-tooled to run in production. It uses pprint which "pretty prints" the dictionaries. Seems pretty easy to adapt this code to be "production ready".
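For a quick look at the kind of message this produces, here is a sketch that calls the method on a bare unittest.TestCase instance (which Python 3 allows without defining a test method) and catches the AssertionError. The sample dicts d3 and d4 are made up for illustration.

```python
import unittest

d3 = {1: "value1", 2: [{"subKey1": "subValue1"}]}
d4 = {1: "value1", 2: [{"subKey1": "changed"}]}

case = unittest.TestCase()
try:
    case.assertDictEqual(d3, d4)
    message = None
except AssertionError as exc:
    # The message includes an ndiff of the pretty-printed dicts.
    message = str(exc)

print(message)
```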


The easiest way (and one of the more robust at that) to do a deep comparison of two dictionaries is to serialize them in JSON format, sorting the keys, and compare the string results:

import json
if json.dumps(x, sort_keys=True) == json.dumps(y, sort_keys=True):
   ... Do something ...
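A small sketch of what this does and where it can bite, assuming the dicts hold only JSON-serializable values. JSON object keys are always strings, so the int key 1 and the string key '1' serialize identically even though the dicts differ under ==. The variable names are arbitrary.

```python
import json

x = {'b': [1, 2], 'a': {'d': 4, 'c': 3}}
y = {'a': {'c': 3, 'd': 4}, 'b': [1, 2]}

# sort_keys=True makes the output independent of insertion order, at every level.
print(json.dumps(x, sort_keys=True) == json.dumps(y, sort_keys=True))  # True

# Caveat: non-string keys are converted to strings during serialization.
p = {1: 'one'}
q = {'1': 'one'}
print(p == q)                                                          # False
print(json.dumps(p, sort_keys=True) == json.dumps(q, sort_keys=True))  # True
```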

Just use (Python 2 only, since cmp() was removed in Python 3):

assert cmp(dict1, dict2) == 0

>>> x = {'a':1,'b':2,'c':3}
>>> x
{'a': 1, 'b': 2, 'c': 3}

>>> y = {'a':2,'b':4,'c':3}
>>> y
{'a': 2, 'b': 4, 'c': 3}

METHOD 1:

>>> common_item = x.items() & y.items()  # intersection of the two item views
>>> common_item
{('c', 3)}

METHOD 2:

 >>> for i in x.items():
        if i in y.items():
           print('true')
        else:
           print('false')


false
false
true

@mouad 's answer is nice if you assume that both dictionaries contain simple values only. However, if you have dictionaries that contain dictionaries you'll get an exception as dictionaries are not hashable.

Off the top of my head, something like this might work:

from numpy import atleast_1d  # used for the element-wise comparison below

def compare_dictionaries(dict1, dict2):
     if dict1 is None or dict2 is None:
        print('Nones')
        return False

     if (not isinstance(dict1, dict)) or (not isinstance(dict2, dict)):
        print('Not dict')
        return False

     shared_keys = set(dict1.keys()) & set(dict2.keys())

     if not ( len(shared_keys) == len(dict1.keys()) and len(shared_keys) == len(dict2.keys())):
        print('Not all keys are shared')
        return False


     dicts_are_equal = True
     for key in dict1.keys():
         if isinstance(dict1[key], dict) or isinstance(dict2[key], dict):
             dicts_are_equal = dicts_are_equal and compare_dictionaries(dict1[key], dict2[key])
         else:
             dicts_are_equal = dicts_are_equal and all(atleast_1d(dict1[key] == dict2[key]))

     return dicts_are_equal

Here is my answer, using a recursive approach:

def dict_equals(da, db):
    if not isinstance(da, dict) or not isinstance(db, dict):
        return False
    if len(da) != len(db):
        return False
    for da_key in da:
        if da_key not in db:
            return False
        if not isinstance(db[da_key], type(da[da_key])):
            return False
        if isinstance(da[da_key], dict):
            res = dict_equals(da[da_key], db[da_key])
            if res is False:
                return False
        elif da[da_key] != db[da_key]:
            return False
    return True

a = {1:{2:3, 'name': 'cc', "dd": {3:4, 21:"nm"}}}
b = {1:{2:3, 'name': 'cc', "dd": {3:4, 21:"nm"}}}
print(dict_equals(a, b))

Hope that helps!


Yet another possibility, in line with the OP's last note, is to compare the hashes (SHA or MD) of the dicts dumped as JSON. With a cryptographic hash, equal digests mean the source strings are equal as well, apart from collisions that are vanishingly unlikely in practice. This is very fast and, for practical purposes, sound.

import json
import hashlib

def hash_dict(d):
    # hashlib needs bytes, so encode the JSON string first (required on Python 3)
    return hashlib.sha1(json.dumps(d, sort_keys=True).encode('utf-8')).hexdigest()

x = dict(a=1, b=2)
y = dict(a=2, b=2)
z = dict(a=1, b=2)

print(hash_dict(x) == hash_dict(y))
print(hash_dict(x) == hash_dict(z))
