[python] Should I use 'has_key()' or 'in' on Python dicts?

I wonder what is better to do:

d = {'a': 1, 'b': 2}
'a' in d
True

or:

d = {'a': 1, 'b': 2}
d.has_key('a')
True

This question is related to python dictionary

The answer is


Solution to dict.has_key() is deprecated, use 'in' -- sublime text editor 3

Here I have taken an example of dictionary named 'ages' -

ages = {}

# Add a couple of names to the dictionary
ages['Sue'] = 23

ages['Peter'] = 19

ages['Andrew'] = 78

ages['Karren'] = 45

# use of 'in' in if condition instead of function_name.has_key(key-name).
if 'Sue' in ages:

    print "Sue is in the dictionary. She is", ages['Sue'], "years old"

else:

    print "Sue is not in the dictionary"

If you have something like this:

t.has_key(ew)

change it to below for running on Python 3.X and above:

key = ew
if key not in t

Use dict.has_key() if (and only if) your code is required to be runnable by Python versions earlier than 2.3 (when key in dict was introduced).


According to python docs:

has_key() is deprecated in favor of key in d.


in wins hands-down, not just in elegance (and not being deprecated;-) but also in performance, e.g.:

$ python -mtimeit -s'd=dict.fromkeys(range(99))' '12 in d'
10000000 loops, best of 3: 0.0983 usec per loop
$ python -mtimeit -s'd=dict.fromkeys(range(99))' 'd.has_key(12)'
1000000 loops, best of 3: 0.21 usec per loop

While the following observation is not always true, you'll notice that usually, in Python, the faster solution is more elegant and Pythonic; that's why -mtimeit is SO helpful -- it's not just about saving a hundred nanoseconds here and there!-)


has_key is a dictionary method, but in will work on any collection, and even when __contains__ is missing, in will use any other method to iterate the collection to find out.


There is one example where in actually kills your performance.

If you use in on a O(1) container that only implements __getitem__ and has_key() but not __contains__ you will turn an O(1) search into an O(N) search (as in falls back to a linear search via __getitem__).

Fix is obviously trivial:

def __contains__(self, x):
    return self.has_key(x)

Expanding on Alex Martelli's performance tests with Adam Parkin's comments...

$ python3.5 -mtimeit -s'd=dict.fromkeys(range( 99))' 'd.has_key(12)'
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/timeit.py", line 301, in main
    x = t.timeit(number)
  File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/timeit.py", line 178, in timeit
    timing = self.inner(it, self.timer)
  File "<timeit-src>", line 6, in inner
    d.has_key(12)
AttributeError: 'dict' object has no attribute 'has_key'

$ python2.7 -mtimeit -s'd=dict.fromkeys(range(  99))' 'd.has_key(12)'
10000000 loops, best of 3: 0.0872 usec per loop

$ python2.7 -mtimeit -s'd=dict.fromkeys(range(1999))' 'd.has_key(12)'
10000000 loops, best of 3: 0.0858 usec per loop

$ python3.5 -mtimeit -s'd=dict.fromkeys(range(  99))' '12 in d'
10000000 loops, best of 3: 0.031 usec per loop

$ python3.5 -mtimeit -s'd=dict.fromkeys(range(1999))' '12 in d'
10000000 loops, best of 3: 0.033 usec per loop

$ python3.5 -mtimeit -s'd=dict.fromkeys(range(  99))' '12 in d.keys()'
10000000 loops, best of 3: 0.115 usec per loop

$ python3.5 -mtimeit -s'd=dict.fromkeys(range(1999))' '12 in d.keys()'
10000000 loops, best of 3: 0.117 usec per loop