[python] Understanding .get() method in Python

sentence = "The quick brown fox jumped over the lazy dog."
characters = {}

for character in sentence:
    characters[character] = characters.get(character, 0) + 1 

print(characters)

I don't understand what characters.get(character, 0) + 1 is doing; the rest all seems pretty straightforward.




I see this is a fairly old question, but this looks like one of those times when something's been written without knowledge of a language feature. The collections module in the standard library exists for exactly this kind of counting.

from collections import Counter
letter_counter = Counter()
for letter in 'The quick brown fox jumps over the lazy dog':
    letter_counter[letter] += 1

>>> letter_counter
Counter({' ': 8, 'o': 4, 'e': 3, 'h': 2, 'r': 2, 'u': 2, 'T': 1, 'a': 1, 'c': 1, 'b': 1, 'd': 1, 'g': 1, 'f': 1, 'i': 1, 'k': 1, 'j': 1, 'm': 1, 'l': 1, 'n': 1, 'q': 1, 'p': 1, 's': 1, 't': 1, 'w': 1, 'v': 1, 'y': 1, 'x': 1, 'z': 1})

In this example the spaces are being counted, obviously, but whether or not you want those filtered is up to you.
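If you do want the spaces left out, here is a minimal sketch of one way to do it. It assumes you pass the characters straight to Counter instead of looping by hand; the name ch is just illustrative.

```python
from collections import Counter

sentence = 'The quick brown fox jumps over the lazy dog'

# Counter accepts any iterable; the generator skips the space characters.
letter_counter = Counter(ch for ch in sentence if ch != ' ')

print(letter_counter[' '])  # 0 (a Counter returns 0 for keys it has never seen)
print(letter_counter['o'])  # 4
```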

As for the dict.get(a_key, default_value), there have been several answers to this particular question -- this method returns the value of the key, or the default_value you supply. The first argument is the key you're looking for, the second argument is the default for when that key is not present.
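A small sketch of both cases, using a made-up dictionary called counts (not from the question):

```python
# Hypothetical dictionary used only for illustration.
counts = {"a": 2}

present = counts.get("a", 0)  # key exists, so its stored value is returned
missing = counts.get("z", 0)  # key is absent, so the supplied default is returned

print(present)  # 2
print(missing)  # 0
print(counts)   # {'a': 2}  (get() does not add the missing key)
```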


To understand what is going on, let's take one letter (one that is repeated more than once) in the sentence string and follow what happens when it goes through the loop.

Remember that we start off with an empty characters dictionary

characters = {}

I will pick the letter 'e'. Let's pass the character 'e' (found in the word The) for the first time through the loop. I will assume it's the first character to go through the loop and I'll substitute the variables with their values:

for 'e' in "The quick brown fox jumped over the lazy dog.":
    {}['e'] = {}.get('e', 0) + 1 

characters.get('e', 0) tells Python to look for the key 'e' in the dictionary. If it's not found, it returns 0. Since this is the first time 'e' is passed through the loop, the character 'e' is not in the dictionary yet, so the get method returns 0. This 0 is then added to the 1 (present in the characters[character] = characters.get(character, 0) + 1 statement). After the first iteration with the 'e' character, we have an entry in the dictionary like this: {'e': 1}

The dictionary is now:

characters = {'e': 1}

Now, let's pass the second 'e' (found in the word jumped) through the same loop. I'll assume it's the second character to go through the loop and I'll update the variables with their new values:

for 'e' in "The quick brown fox jumped over the lazy dog.":
    {'e': 1}['e'] = {'e': 1}.get('e', 0) + 1

Here the get method finds a key entry for 'e' and returns its value, which is 1. We add this to the other 1 in characters.get(character, 0) + 1 and get 2 as the result.

When we apply this in the characters[character] = characters.get(character, 0) + 1 statement:

characters['e'] = 2

It should be clear that the last statement assigns the new value 2 to the already present 'e' key. Therefore the dictionary is now:

characters = {'e': 2}
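If you'd like to watch this happen, here is a rough sketch that runs the loop from the question and records the count for 'e' each time an 'e' is seen. The list e_history is my own addition for illustration. Note that the real dictionary also holds the other characters; the walkthrough above only follows 'e'.

```python
sentence = "The quick brown fox jumped over the lazy dog."
characters = {}
e_history = []  # hypothetical helper: the count of 'e' after each time it is seen

for character in sentence:
    characters[character] = characters.get(character, 0) + 1
    if character == "e":
        e_history.append(characters["e"])

print(e_history)  # [1, 2, 3, 4]
```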

Start here http://docs.python.org/tutorial/datastructures.html#dictionaries

Then here http://docs.python.org/library/stdtypes.html#mapping-types-dict

Then here http://docs.python.org/library/stdtypes.html#dict.get

characters.get( key, default )

key is a character

default is 0

If the character is a key in the dictionary, characters, you get the value stored for it (here, its count so far).

If not, you get 0.


Syntax:

get(key[, default])

Return the value for key if key is in the dictionary, else default. If default is not given, it defaults to None, so that this method never raises a KeyError.
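To see the difference from a plain square-bracket lookup, here is a short sketch with a made-up dictionary called counts:

```python
counts = {"a": 2}  # hypothetical dictionary for illustration

result = counts.get("z")  # no default supplied, so None is returned
print(result)             # None

try:
    counts["z"]           # square-bracket lookup raises for a missing key
except KeyError as error:
    print("KeyError:", error)  # KeyError: 'z'
```

This is also why the code in the question passes 0 explicitly: without it, get would return None for a new character, and None + 1 raises a TypeError.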


If d is a dictionary, then d.get(k, v) means, give me the value of k in d, unless k isn't there, in which case give me v. It's being used here to get the current count of the character, which should start at 0 if the character hasn't been encountered before.
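Put another way, the one-liner behaves like an explicit if/else. The sketch below runs both versions on the sentence from the question and checks that they agree; the names with_get and explicit are just illustrative.

```python
sentence = "The quick brown fox jumped over the lazy dog."

# Version from the question, using get() with a default of 0.
with_get = {}
for character in sentence:
    with_get[character] = with_get.get(character, 0) + 1

# Spelled-out version: check for the key first, then increment or start at 1.
explicit = {}
for character in sentence:
    if character in explicit:
        explicit[character] = explicit[character] + 1
    else:
        explicit[character] = 1

print(with_get == explicit)  # True
```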