Python: List of dict, if exists increment a dict value, if not append a new dict

Question

I would like to do something like this:

list_of_urls = ['http://www.google.fr/', 'http://www.google.fr/',
                'http://www.google.cn/', 'http://www.google.com/',
                'http://www.google.fr/', 'http://www.google.fr/',
                'http://www.google.fr/', 'http://www.google.com/',
                'http://www.google.fr/', 'http://www.google.com/',
                'http://www.google.cn/']

urls = [{'url': 'http://www.google.fr/', 'nbr': 1}]

for url in list_of_urls:
    if url in [f['url'] for f in urls]:
        urls[??]['nbr'] += 1
    else:
        urls.append({'url': url, 'nbr': 1})

How can I do this? I don't know if I should take the tuple to edit it or figure out the tuple indices.

Any help?

User · Accepted Answer

That is a very strange way to organize things. If you stored the counts in a dictionary, this is easy:

# This example should work in any version of Python.
# urls_d will contain URL keys, with counts as values, like: {'http://www.google.fr/' : 1 }
urls_d = {}
for url in list_of_urls:
    if url not in urls_d:
        urls_d[url] = 1
    else:
        urls_d[url] += 1

This code for updating a dictionary of counts is a common "pattern" in Python. It is so common that there is a special data structure, defaultdict, created just to make this even easier:

from collections import defaultdict  # available in Python 2.5 and newer

urls_d = defaultdict(int)
for url in list_of_urls:
    urls_d[url] += 1

If you access the defaultdict using a key, and the key is not already in the defaultdict, the key is automatically added with a default value. The defaultdict takes the callable you passed in, and calls it to get the default value. In this case, we passed in class int; when Python calls int() it returns a zero value. So, the first time you reference a URL, its count is initialized to zero, and then you add one to the count.
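To make the default-value mechanism concrete, here is a small sketch. The variable name `counts` and the sample keys are chosen for illustration only and are not part of the original example.

```python
from collections import defaultdict

# int() called with no arguments returns 0; this is the default used below.
assert int() == 0

counts = defaultdict(int)  # hypothetical name, for illustration only

# Merely reading a missing key inserts it with the default value.
first_look = counts["http://www.google.fr/"]

# Incrementing a missing key therefore starts from 0.
counts["http://www.google.cn/"] += 1
counts["http://www.google.cn/"] += 1

print(first_look)    # 0
print(dict(counts))  # {'http://www.google.fr/': 0, 'http://www.google.cn/': 2}
```

Note that a plain lookup creates the key as a side effect. To test for a key without inserting it, use `key in counts` or `counts.get(key)`.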

But a dictionary full of counts is also a common pattern, so Python provides a ready-to-use class: collections.Counter. You just create a Counter instance by calling the class, passing in any iterable; it builds a dictionary where the keys are values from the iterable, and the values are counts of how many times each key appeared in the iterable. The above example then becomes:

from collections import Counter  # available in Python 2.7 and newer

urls_d = Counter(list_of_urls)
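As a rough illustration of what the resulting Counter looks like, the sketch below uses a shortened copy of the question's data. The names `sample_urls` and `url_counts`, and the `example.invalid` key, are assumptions made for this example.

```python
from collections import Counter

# Sample data, shortened from the question's list.
sample_urls = [
    "http://www.google.fr/",
    "http://www.google.fr/",
    "http://www.google.cn/",
    "http://www.google.com/",
    "http://www.google.fr/",
]

url_counts = Counter(sample_urls)

# A Counter is a dict subclass, so normal lookups work.
fr_count = url_counts["http://www.google.fr/"]

# Missing keys report 0 without being added to the Counter.
missing_count = url_counts["http://example.invalid/"]

# most_common(n) returns the n highest counts as (key, count) pairs.
top_one = url_counts.most_common(1)

print(fr_count, missing_count, top_one)
# 3 0 [('http://www.google.fr/', 3)]
```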

If you really need to do it the way you showed, the easiest and fastest way would be to use any one of these three examples, and then build the one you need.

from collections import defaultdict  # available in Python 2.5 and newer

urls_d = defaultdict(int)
for url in list_of_urls:
    urls_d[url] += 1

urls = [{"url": key, "nbr": value} for key, value in urls_d.items()]

If you are using Python 2.7 or newer you can do it in a one-liner:

from collections import Counter

urls = [{"url": key, "nbr": value} for key, value in Counter(list_of_urls).items()]

User · Answer

Use defaultdict:

from collections import defaultdict

urls = defaultdict(int)

for url in list_of_urls:
    urls[url] += 1

User · Answer

Using the default works, but so does:

urls[url] = urls.get(url, 0) + 1

Using .get, you can get a default return value if the key doesn't exist. By default it's None, but in the case I sent you, it would be 0.
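A minimal sketch of the `.get` approach as a complete loop; the function name `count_with_get` and the sample input are hypothetical.

```python
def count_with_get(items):
    """Count occurrences using dict.get with a default of 0."""
    counts = {}
    for item in items:
        # get() returns the stored count, or 0 when the key is absent;
        # it never modifies the dictionary itself.
        counts[item] = counts.get(item, 0) + 1
    return counts

result = count_with_get(["a", "b", "a"])
print(result)  # {'a': 2, 'b': 1}

print({}.get("missing"))     # None
print({}.get("missing", 0))  # 0
```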

User · Answer

To do it exactly your way? You could use the for...else structure:

for url in list_of_urls:
    for url_dict in urls:
        if url_dict['url'] == url:
            url_dict['nbr'] += 1
            break
    else:
        urls.append(dict(url=url, nbr=1))

But it is quite inelegant. Do you really have to store the visited urls as a LIST? If you store them as a dict, indexed by url string, for example, it would be way cleaner:

urls = {'http://www.google.fr': dict(url='http://www.google.fr', nbr=1)}

for url in list_of_urls:
    if url in urls:
        urls[url]['nbr'] += 1
    else:
        urls[url] = dict(url=url, nbr=1)

A few things to note in that second example:

See how using a dict for urls removes the need for going through the whole urls list when testing for one single url. This approach will be faster.

Using dict() instead of braces makes your code shorter.

Using list_of_urls, urls and url as variable names makes the code quite hard to parse. It's better to find something clearer, such as urls_to_visit, urls_already_visited and current_url. I know, it's longer. But it's clearer.

And of course I'm assuming that dict(url='http://www.google.fr', nbr=1) is a simplification of your own data structure, because otherwise, urls could simply be:

urls = {'http://www.google.fr': 1}

for url in list_of_urls:
    if url in urls:
        urls[url] += 1
    else:
        urls[url] = 1

Which can get very elegant with defaultdict:

urls = collections.defaultdict(int)
for url in list_of_urls:
    urls[url] += 1
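For readers unfamiliar with for...else: the else branch runs only when the loop finishes without hitting break. The sketch below wraps the list-based version from the start of this answer in a function; the name `count_into_list` and the shortened sample data are assumptions for illustration.

```python
def count_into_list(list_of_urls):
    """Build a list of {'url': ..., 'nbr': ...} dicts, one per distinct URL."""
    urls = []
    for url in list_of_urls:
        for url_dict in urls:
            if url_dict["url"] == url:
                url_dict["nbr"] += 1
                break  # found: skip the else branch below
        else:
            # Reached only if the inner loop ended without break,
            # i.e. no existing entry matched this URL.
            urls.append({"url": url, "nbr": 1})
    return urls

sample = ["http://www.google.fr/", "http://www.google.cn/", "http://www.google.fr/"]
print(count_into_list(sample))
# [{'url': 'http://www.google.fr/', 'nbr': 2}, {'url': 'http://www.google.cn/', 'nbr': 1}]
```

The inner loop scans the whole result list for every input URL, which is why the dict-based versions are likely to be faster on large inputs.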

User · Answer

Except for the first time, each time a word is seen the if statement's test fails. If you are counting a large number of words, many will probably occur multiple times. In a situation where the initialization of a value is only going to occur once and the augmentation of that value will occur many times, it is cheaper to use a try statement:

urls_d = {}
for url in list_of_urls:
    try:
        urls_d[url] += 1
    except KeyError:
        urls_d[url] = 1

You can read more about this: https://wiki.python.org/moin/PythonSpeed/PerformanceTips
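One way to check the performance claim on your own data is a quick timeit comparison. This is only a sketch: the function names and the workload (three distinct keys repeated many times) are made up, and the timings depend on the Python version and on how often keys repeat, so no expected numbers are shown.

```python
import timeit

def count_with_if(items):
    """Count using an explicit membership test."""
    counts = {}
    for item in items:
        if item in counts:
            counts[item] += 1
        else:
            counts[item] = 1
    return counts

def count_with_try(items):
    """Count by attempting the increment and handling KeyError."""
    counts = {}
    for item in items:
        try:
            counts[item] += 1
        except KeyError:
            counts[item] = 1
    return counts

# Made-up workload: few distinct keys, many repeats.
items = ["http://www.google.fr/", "http://www.google.cn/", "http://www.google.com/"] * 1000

# Both strategies must produce identical counts.
assert count_with_if(items) == count_with_try(items)

if_time = timeit.timeit(lambda: count_with_if(items), number=100)
try_time = timeit.timeit(lambda: count_with_try(items), number=100)
print(f"if/else: {if_time:.4f}s  try/except: {try_time:.4f}s")
```

If most keys are distinct, the exception is raised often and the try version can end up slower, so it is worth measuring with data that resembles the real input.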

User · Answer

This always works fine for me:

for url in list_of_urls:
    urls.setdefault(url, 0)
    urls[url] += 1
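A short sketch of what setdefault does at each step, assuming `urls` starts as an empty dict; the names `returned_first` and `returned_second` are only there to show the return values.

```python
urls = {}

# setdefault inserts the key with the given value only when it is absent,
# and returns whatever value is stored under the key afterwards.
returned_first = urls.setdefault("http://www.google.fr/", 0)
urls["http://www.google.fr/"] += 1

# Key already present: the stored value (1) is returned, not overwritten.
returned_second = urls.setdefault("http://www.google.fr/", 0)

print(returned_first, returned_second, urls)
# 0 1 {'http://www.google.fr/': 1}
```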
