I wonder what is better to do:
d = {'a': 1, 'b': 2}
'a' in d
True
or:
d = {'a': 1, 'b': 2}
d.has_key('a')
True
This question is related to
python
dictionary
Solution to dict.has_key() is deprecated, use 'in' -- sublime text editor 3
Here I have taken an example of dictionary named 'ages' -
ages = {}
# Add a couple of names to the dictionary
ages['Sue'] = 23
ages['Peter'] = 19
ages['Andrew'] = 78
ages['Karren'] = 45
# use of 'in' in if condition instead of function_name.has_key(key-name).
if 'Sue' in ages:
print "Sue is in the dictionary. She is", ages['Sue'], "years old"
else:
print "Sue is not in the dictionary"
If you have something like this:
t.has_key(ew)
change it to below for running on Python 3.X and above:
key = ew
if key not in t
Use dict.has_key()
if (and only if) your code is required to be runnable by Python versions earlier than 2.3 (when key in dict
was introduced).
According to python docs:
has_key()
is deprecated in favor ofkey in d
.
in
wins hands-down, not just in elegance (and not being deprecated;-) but also in performance, e.g.:
$ python -mtimeit -s'd=dict.fromkeys(range(99))' '12 in d'
10000000 loops, best of 3: 0.0983 usec per loop
$ python -mtimeit -s'd=dict.fromkeys(range(99))' 'd.has_key(12)'
1000000 loops, best of 3: 0.21 usec per loop
While the following observation is not always true, you'll notice that usually, in Python, the faster solution is more elegant and Pythonic; that's why -mtimeit
is SO helpful -- it's not just about saving a hundred nanoseconds here and there!-)
has_key
is a dictionary method, but in
will work on any collection, and even when __contains__
is missing, in
will use any other method to iterate the collection to find out.
There is one example where in
actually kills your performance.
If you use in
on a O(1) container that only implements __getitem__
and has_key()
but not __contains__
you will turn an O(1) search into an O(N) search (as in
falls back to a linear search via __getitem__
).
Fix is obviously trivial:
def __contains__(self, x):
return self.has_key(x)
Expanding on Alex Martelli's performance tests with Adam Parkin's comments...
$ python3.5 -mtimeit -s'd=dict.fromkeys(range( 99))' 'd.has_key(12)'
Traceback (most recent call last):
File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/timeit.py", line 301, in main
x = t.timeit(number)
File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/timeit.py", line 178, in timeit
timing = self.inner(it, self.timer)
File "<timeit-src>", line 6, in inner
d.has_key(12)
AttributeError: 'dict' object has no attribute 'has_key'
$ python2.7 -mtimeit -s'd=dict.fromkeys(range( 99))' 'd.has_key(12)'
10000000 loops, best of 3: 0.0872 usec per loop
$ python2.7 -mtimeit -s'd=dict.fromkeys(range(1999))' 'd.has_key(12)'
10000000 loops, best of 3: 0.0858 usec per loop
$ python3.5 -mtimeit -s'd=dict.fromkeys(range( 99))' '12 in d'
10000000 loops, best of 3: 0.031 usec per loop
$ python3.5 -mtimeit -s'd=dict.fromkeys(range(1999))' '12 in d'
10000000 loops, best of 3: 0.033 usec per loop
$ python3.5 -mtimeit -s'd=dict.fromkeys(range( 99))' '12 in d.keys()'
10000000 loops, best of 3: 0.115 usec per loop
$ python3.5 -mtimeit -s'd=dict.fromkeys(range(1999))' '12 in d.keys()'
10000000 loops, best of 3: 0.117 usec per loop
Source: Stackoverflow.com