Flatten nested dictionaries, compressing keys

Question

Suppose you have a dictionary like:

    {'a': 1,
     'c': {'a': 2,
           'b': {'x': 5,
                 'y': 10}},
     'd': [1, 2, 3]}

How would you go about flattening that into something like:

    {'a': 1,
     'c_a': 2,
     'c_b_x': 5,
     'c_b_y': 10,
     'd': [1, 2, 3]}

User · Answer

Or, if you are already using pandas, you can do it with json_normalize() like so:

    import pandas as pd

    d = {'a': 1,
         'c': {'a': 2, 'b': {'x': 5, 'y': 10}},
         'd': [1, 2, 3]}

    df = pd.json_normalize(d, sep='_')

    print(df.to_dict(orient='records')[0])

Output:

    {'a': 1, 'c_a': 2, 'c_b_x': 5, 'c_b_y': 10, 'd': [1, 2, 3]}

User · Answer

If you want to flatten a nested dictionary and only need the list of all unique keys, here is a solution:

    def flat_dict_return_unique_key(data, unique_keys=None):
        # Create the set per call; a set() default would be shared between calls.
        if unique_keys is None:
            unique_keys = set()
        if isinstance(data, dict):
            unique_keys.update(data.keys())
            for each_v in data.values():
                if isinstance(each_v, dict):
                    flat_dict_return_unique_key(each_v, unique_keys)
        return list(unique_keys)
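To make the behaviour concrete, here is a minimal sketch of the same idea applied to the dictionary from the question. The name `unique_keys` and its `seen` parameter are illustrative, not taken from the answer above.

```python
def unique_keys(data, seen=None):
    """Collect every key that appears at any nesting level."""
    if seen is None:
        seen = set()
    if isinstance(data, dict):
        for key, value in data.items():
            seen.add(key)
            unique_keys(value, seen)  # non-dict values are ignored by the check above
    return seen

d = {'a': 1, 'c': {'a': 2, 'b': {'x': 5, 'y': 10}}, 'd': [1, 2, 3]}
keys = sorted(unique_keys(d))
print(keys)  # ['a', 'b', 'c', 'd', 'x', 'y']
```

Note that the repeated key 'a' is reported once, and the information about where each key lived is lost.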

User · Answer

My Python 3.3 solution using generators:

    def flattenit(pyobj, keystring=''):
        if type(pyobj) is dict:
            # Append the separator only once a prefix exists.
            keystring = keystring + "_" if keystring else keystring
            for k in pyobj:
                yield from flattenit(pyobj[k], keystring + k)
        else:
            # Lists and scalars are treated as leaves.
            yield keystring, pyobj

    my_obj = {'a': 1, 'c': {'a': 2, 'b': {'x': 5, 'y': 10}}, 'd': [1, 2, 3]}

    # your flattened dictionary object
    flattened = {k: v for k, v in flattenit(my_obj)}
    print(flattened)

    # result: {'c_b_y': 10, 'd': [1, 2, 3], 'c_a': 2, 'a': 1, 'c_b_x': 5}

User · Answer

Not exactly what the OP asked, but lots of folks are coming here looking for ways to flatten real-world nested JSON data, which can have nested key-value JSON objects, arrays, JSON objects inside the arrays, and so on. JSON doesn't include tuples, so we don't have to fret over those.

I found an implementation of the list-inclusion comment by @roneo to the answer posted by @Imran:

https://github.com/ScriptSmith/socialreaper/blob/master/socialreaper/tools.py#L8

    import collections.abc

    def flatten(dictionary, parent_key=False, separator='.'):
        """
        Turn a nested dictionary into a flattened dictionary
        :param dictionary: The dictionary to flatten
        :param parent_key: The string to prepend to dictionary's keys
        :param separator: The string used to separate flattened keys
        :return: A flattened dictionary
        """
        items = []
        for key, value in dictionary.items():
            new_key = str(parent_key) + separator + key if parent_key else key
            # collections.MutableMapping was removed in Python 3.10; use collections.abc.
            if isinstance(value, collections.abc.MutableMapping):
                items.extend(flatten(value, new_key, separator).items())
            elif isinstance(value, list):
                for k, v in enumerate(value):
                    # Pass the separator along so a custom one is honoured inside lists.
                    items.extend(flatten({str(k): v}, new_key, separator).items())
            else:
                items.append((new_key, value))
        return dict(items)

Test it:

    flatten({'a': 1, 'c': {'a': 2, 'b': {'x': 5, 'y': 10}}, 'd': [1, 2, 3]})

    >> {'a': 1, 'c.a': 2, 'c.b.x': 5, 'c.b.y': 10, 'd.0': 1, 'd.1': 2, 'd.2': 3}

And that does the job I need done: I throw any complicated JSON at this and it flattens it out for me.

All credits to https://github.com/ScriptSmith.
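The case this answer is really about is a list that contains further objects. The following is a small self-contained sketch of that behaviour, not the socialreaper code itself; the name `flatten_with_lists` and the sample record are made up for illustration.

```python
from collections.abc import MutableMapping

def flatten_with_lists(obj, parent_key="", sep="."):
    """Flatten nested mappings and lists; list positions become key parts.

    Assumption of this sketch: empty dicts and empty lists simply disappear.
    """
    if isinstance(obj, MutableMapping):
        pairs = obj.items()
    elif isinstance(obj, list):
        pairs = enumerate(obj)
    else:
        return {parent_key: obj}  # leaf value
    flat = {}
    for key, value in pairs:
        new_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        flat.update(flatten_with_lists(value, new_key, sep))
    return flat

record = {"user": {"name": "Ann", "tags": ["a", "b"]},
          "orders": [{"id": 1}, {"id": 2}]}
flat = flatten_with_lists(record)
print(flat)
```

Each list index shows up as its own path segment, so `orders.0.id` and `orders.1.id` stay distinct.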

User · Answer

This is not restricted to dictionaries, but works for every mapping type that implements .items(). Furthermore, it is faster, as it avoids an if condition. Nevertheless, credits go to Imran:

    def flatten(d, parent_key=''):
        items = []
        for k, v in d.items():
            try:
                # Anything with .items() is treated as a nested mapping.
                items.extend(flatten(v, '%s%s_' % (parent_key, k)).items())
            except AttributeError:
                items.append(('%s%s' % (parent_key, k), v))
        return dict(items)

User · Answer

Here's a solution using a stack. No recursion.

    def flatten_nested_dict(nested):
        stack = list(nested.items())
        ans = {}
        while stack:
            key, val = stack.pop()
            if isinstance(val, dict):
                for sub_key, sub_val in val.items():
                    stack.append((f"{key}_{sub_key}", sub_val))
            else:
                ans[key] = val
        return ans
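One practical consequence of avoiding recursion is that nesting depth is not bounded by the interpreter's recursion limit. The sketch below assumes a separator parameter `sep` (an addition for illustration) and builds an artificially deep dictionary to show this.

```python
import sys

def flatten_iterative(nested, sep="_"):
    """Stack-based flatten; never recurses, whatever the nesting depth."""
    stack = list(nested.items())
    flat = {}
    while stack:
        key, value = stack.pop()
        if isinstance(value, dict):
            for sub_key, sub_value in value.items():
                stack.append((f"{key}{sep}{sub_key}", sub_value))
        else:
            flat[key] = value
    return flat

# Build a dictionary nested twice as deep as the recursion limit allows.
depth = sys.getrecursionlimit() * 2
deep = {"leaf": 0}
for _ in range(depth):
    deep = {"k": deep}

flat = flatten_iterative(deep)
print(len(flat))  # 1
```

A recursive flattener would raise RecursionError on the same input unless the limit were raised first.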

User · Answer

I actually wrote a package called cherrypicker recently to deal with this exact sort of thing, since I had to do it so often!

I think the following code would give you exactly what you're after:

    from cherrypicker import CherryPicker

    dct = {
        'a': 1,
        'c': {
            'a': 2,
            'b': {
                'x': 5,
                'y': 10
            }
        },
        'd': [1, 2, 3]
    }

    picker = CherryPicker(dct)
    picker.flatten().get()

You can install the package with:

    pip install cherrypicker

...and there's more docs and guidance at https://cherrypicker.readthedocs.io.

Other methods may be faster, but the priority of this package is to make such tasks easy. If you do have a large list of objects to flatten though, you can also tell CherryPicker to use parallel processing to speed things up.

User · Answer

I always prefer to access dict objects via .items(), so for flattening dicts I use the following recursive generator flat_items(d). If you would like to have a dict again, simply wrap it like this: flat = dict(flat_items(d))

    def flat_items(d, key_separator='.'):
        """
        Flattens the dictionary containing other dictionaries like here:
        https://stackoverflow.com/questions/6027558/flatten-nested-python-dictionaries-compressing-keys

        >>> example = {'a': 1, 'c': {'a': 2, 'b': {'x': 5, 'y': 10}}, 'd': [1, 2, 3]}
        >>> flat = dict(flat_items(example, key_separator='_'))
        >>> assert flat['c_b_y'] == 10
        """
        for k, v in d.items():
            if type(v) is dict:
                for k1, v1 in flat_items(v, key_separator=key_separator):
                    yield key_separator.join((k, k1)), v1
            else:
                yield k, v

User · Answer

Davoud's solution is very nice but doesn't give satisfactory results when the nested dict also contains lists of dicts. His code can be adapted for that case:

    def flatten_dict(d):
        items = []
        for k, v in d.items():
            try:
                if isinstance(v, list):
                    # A list of dicts: flatten each element in turn.
                    for l in v:
                        items.extend(flatten_dict(l).items())
                else:
                    items.extend(flatten_dict(v).items())
            except AttributeError:
                # v (or a list element) has no .items(), so keep v as a leaf.
                items.append((k, v))
        return dict(items)

User · Answer

    def flatten_nested_dict(_dict, _str=''):
        '''
        recursive function to flatten a nested dictionary json
        '''
        ret_dict = {}
        for k, v in _dict.items():
            if isinstance(v, dict):
                ret_dict.update(flatten_nested_dict(v, _str='_'.join([_str, k]).strip('_')))
            elif isinstance(v, list):
                for index, item in enumerate(v):
                    if isinstance(item, dict):
                        ret_dict.update(flatten_nested_dict(item, _str='_'.join([_str, k, str(index)]).strip('_')))
                    else:
                        ret_dict['_'.join([_str, k, str(index)]).strip('_')] = item
            else:
                ret_dict['_'.join([_str, k]).strip('_')] = v
        return ret_dict

User · Answer

Here is a kind of a "functional", "one-liner" implementation. It is recursive, and based on a conditional expression and a dict comprehension.

    def flatten_dict(dd, separator='_', prefix=''):
        return {prefix + separator + k if prefix else k: v
                for kk, vv in dd.items()
                for k, v in flatten_dict(vv, separator, kk).items()
                } if isinstance(dd, dict) else {prefix: dd}

Test:

    In [2]: flatten_dict({'abc': 123, 'hgf': {'gh': 432, 'yu': 433}, 'gfd': 902, 'xzxzxz': {"432": {'0b0b0b': 231}, "43234": 1321}})
    Out[2]:
    {'abc': 123,
     'gfd': 902,
     'hgf_gh': 432,
     'hgf_yu': 433,
     'xzxzxz_432_0b0b0b': 231,
     'xzxzxz_43234': 1321}

User · Answer

Here's an algorithm for elegant, in-place replacement. Tested with Python 2.7 and Python 3.5. Using the dot character as a separator.

    def flatten_json(json):
        if type(json) == dict:
            for k, v in list(json.items()):
                if type(v) == dict:
                    flatten_json(v)
                    json.pop(k)
                    for k2, v2 in v.items():
                        json[k + "." + k2] = v2

Example:

    d = {'a': {'b': 'c'}}
    flatten_json(d)
    print(d)
    unflatten_json(d)
    print(d)

Output:

    {'a.b': 'c'}
    {'a': {'b': 'c'}}

I published this code here along with the matching unflatten_json function.

User · Answer

The answers above work really well. Just thought I'd add the unflatten function that I wrote:

    def unflatten(d):
        ud = {}
        for k, v in d.items():
            context = ud
            for sub_key in k.split('_')[:-1]:
                if sub_key not in context:
                    context[sub_key] = {}
                context = context[sub_key]
            context[k.split('_')[-1]] = v
        return ud

Note: This doesn't account for '_' already present in keys, much like the flatten counterparts.
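A round trip makes both the usefulness and the caveat visible. The sketch below pairs a simple recursive `flatten` with an `unflatten` along the lines of the one above; the `sep` parameter and the sample data are assumptions added for illustration.

```python
def flatten(d, parent_key="", sep="_"):
    flat = {}
    for k, v in d.items():
        new_key = f"{parent_key}{sep}{k}" if parent_key else k
        if isinstance(v, dict):
            flat.update(flatten(v, new_key, sep))
        else:
            flat[new_key] = v
    return flat

def unflatten(d, sep="_"):
    nested = {}
    for key, value in d.items():
        *parents, last = key.split(sep)
        context = nested
        for part in parents:
            # Walk down, creating intermediate dicts as needed.
            context = context.setdefault(part, {})
        context[last] = value
    return nested

original = {"a": 1, "c": {"a": 2, "b": {"x": 5, "y": 10}}, "d": [1, 2, 3]}
round_trip = unflatten(flatten(original))
print(round_trip == original)  # True

# A key that already contains the separator does not survive the round trip.
lossy = unflatten(flatten({"first_name": "Ann"}))
print(lossy)  # {'first': {'name': 'Ann'}}
```

Choosing a separator that cannot occur in the keys, or keeping key paths as tuples, avoids the second outcome.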

User · Answer

If you're using pandas, there is a function hidden in pandas.io.json._normalize [1] called nested_to_record which does this exactly.

    from pandas.io.json._normalize import nested_to_record

    flat = nested_to_record(my_dict, sep='_')

[1] In pandas versions 0.24.x and older use pandas.io.json.normalize (without the _)

User · Answer

Using generators:

    def flat_dic_helper(prepand, d):
        if len(prepand) > 0:
            prepand = prepand + "_"
        for k in d:
            i = d[k]
            if type(i).__name__ == 'dict':
                r = flat_dic_helper(prepand + k, i)
                for j in r:
                    yield j
            else:
                yield (prepand + k, i)

    def flat_dic(d):
        return dict(flat_dic_helper("", d))

    d = {'a': 1, 'c': {'a': 2, 'b': {'x': 5, 'y': 10}}, 'd': [1, 2, 3]}
    print(flat_dic(d))

    >> {'a': 1, 'c_a': 2, 'c_b_x': 5, 'd': [1, 2, 3], 'c_b_y': 10}

User · Answer

Using dict.popitem() in straightforward nested-list-like recursion:

    def flatten(d):
        if d == {}:
            return d
        else:
            k, v = d.popitem()
            if dict != type(v):
                return {k: v, **flatten(d)}
            else:
                flat_kv = flatten(v)
                for k1 in list(flat_kv.keys()):
                    flat_kv[k + '_' + k1] = flat_kv[k1]
                    del flat_kv[k1]
                return {**flat_kv, **flatten(d)}
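Because popitem() removes entries as it goes, this style of flattening consumes its input. The sketch below is a compact variant written for illustration (the `sep` parameter is an assumption) and shows how passing a deep copy keeps the original intact.

```python
import copy

def flatten(d, sep="_"):
    """popitem-based flatten; empties d and every nested dict it visits."""
    if not d:
        return {}
    k, v = d.popitem()
    if not isinstance(v, dict):
        return {k: v, **flatten(d, sep)}
    flat_kv = {f"{k}{sep}{k1}": v1 for k1, v1 in flatten(v, sep).items()}
    return {**flat_kv, **flatten(d, sep)}

data = {"a": 1, "c": {"a": 2, "b": {"x": 5, "y": 10}}}

flat = flatten(copy.deepcopy(data))  # the copy is consumed, data is untouched
print(flat)

flatten(data)                        # this call consumes data itself
print(data)                          # {}
```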

User · Answer

I was thinking of a subclass of UserDict to automagically flatten the keys.

    from collections import UserDict

    class FlatDict(UserDict):
        def __init__(self, *args, separator='.', **kwargs):
            self.separator = separator
            super().__init__(*args, **kwargs)

        def __setitem__(self, key, value):
            if isinstance(value, dict):
                # Flatten the nested dict first, then store each entry under the joined key.
                for k1, v1 in FlatDict(value, separator=self.separator).items():
                    super().__setitem__(f"{key}{self.separator}{k1}", v1)
            else:
                super().__setitem__(key, value)

The advantage is that keys can be added on the fly, or using standard dict instantiation, without surprise:

    >>> fd = FlatDict(
    ...     {
    ...         'person': {
    ...             'sexe': 'male',
    ...             'name': {
    ...                 'first': 'jacques',
    ...                 'last': 'dupond'
    ...             }
    ...         }
    ...     }
    ... )
    >>> fd
    {'person.sexe': 'male', 'person.name.first': 'jacques', 'person.name.last': 'dupond'}
    >>> fd['person'] = {'name': {'nickname': 'Bob'}}
    >>> fd
    {'person.sexe': 'male', 'person.name.first': 'jacques', 'person.name.last': 'dupond', 'person.name.nickname': 'Bob'}
    >>> fd['person.name'] = {'civility': 'Dr'}
    >>> fd
    {'person.sexe': 'male', 'person.name.first': 'jacques', 'person.name.last': 'dupond', 'person.name.nickname': 'Bob', 'person.name.civility': 'Dr'}

User · Answer

There are two big considerations that the original poster needs to consider:

1. Are there keyspace clobbering issues? For example, {'a_b': {'c': 1}, 'a': {'b_c': 2}} would result in {'a_b_c': ???}. The below solution evades the problem by returning an iterable of pairs.
2. If performance is an issue, does the key-reducer function (which I hereby refer to as 'join') require access to the entire key-path, or can it just do O(1) work at every node in the tree? If you want to be able to say joinedKey = '_'.join(*keys), that will cost you O(N^2) running time. However, if you're willing to say nextKey = previousKey + '_' + thisKey, that gets you O(N) time. The solution below lets you do both (since you could merely concatenate all the keys, then postprocess them).

(Performance is not likely an issue, but I'll elaborate on the second point in case anyone else cares: in implementing this, there are numerous dangerous choices. If you do this recursively and yield and re-yield, or anything equivalent which touches nodes more than once (which is quite easy to accidentally do), you are doing potentially O(N^2) work rather than O(N). This is because maybe you are calculating a key a then a_1 then a_1_i..., and then calculating a then a_1 then a_1_ii..., but really you shouldn't have to calculate a_1 again. Even if you aren't recalculating it, re-yielding it (a 'level-by-level' approach) is just as bad. A good example is to think about the performance on {1: {1: {1: {1: ...(N times)... {1: SOME_LARGE_DICTIONARY_OF_SIZE_N} ...}}}}.)

Below is a function I wrote, flattenDict(d, join=..., lift=...), which can be adapted to many purposes and can do what you want. Sadly it is fairly hard to make a lazy version of this function without incurring the above performance penalties (many Python builtins like chain.from_iterable aren't actually efficient, which I only realized after extensive testing of three different versions of this code before settling on this one).

    from collections.abc import Mapping  # collections.Mapping on old Python versions
    from itertools import chain
    from operator import add

    _FLAG_FIRST = object()

    def flattenDict(d, join=add, lift=lambda x: x):
        results = []
        def visit(subdict, results, partialKey):
            for k, v in subdict.items():
                # `is` rather than ==, so arbitrary join results are never compared to the sentinel.
                newKey = lift(k) if partialKey is _FLAG_FIRST else join(partialKey, lift(k))
                if isinstance(v, Mapping):
                    visit(v, results, newKey)
                else:
                    results.append((newKey, v))
        visit(d, results, _FLAG_FIRST)
        return results

To better understand what's going on, below is a diagram for those unfamiliar with reduce(left), otherwise known as "fold left". Sometimes it is drawn with an initial value in place of k0 (not part of the list, passed into the function). Here, J is our join function. We preprocess each kn with lift(k).

                   [k0,k1,...,kN].foldleft(J)
                               /    \
                             ...    kN
                             /
           J(k0,J(k1,J(k2,k3)))
                           /  \
                          /    \
               J(J(k0,k1),k2)   k3
                        /   \
                       /     \
                 J(k0,k1)    k2
                     /  \
                    /    \
                   k0     k1

This is in fact the same as functools.reduce, but where our function does this to all key-paths of the tree.

    >>> reduce(lambda a, b: (a, b), range(5))
    ((((0, 1), 2), 3), 4)

Demonstration (which I'd otherwise put in docstring):

    >>> testData = {
            'a': 1,
            'b': 2,
            'c': {
                'aa': 11,
                'bb': 22,
                'cc': {
                    'aaa': 111
                }
            }
        }
    from pprint import pprint as pp

    >>> pp(dict(flattenDict(testData, lift=lambda x: (x,))))
    {('a',): 1,
     ('b',): 2,
     ('c', 'aa'): 11,
     ('c', 'bb'): 22,
     ('c', 'cc', 'aaa'): 111}

    >>> pp(dict(flattenDict(testData, join=lambda a, b: a + '_' + b)))
    {'a': 1, 'b': 2, 'c_aa': 11, 'c_bb': 22, 'c_cc_aaa': 111}

    >>> pp(dict((v, k) for k, v in flattenDict(testData, lift=hash, join=lambda a, b: hash((a, b)))))
    {1: 12416037344,
     2: 12544037731,
     11: 5470935132935744593,
     22: 4885734186131977315,
     111: 3461911260025554326}

Performance:

    from functools import reduce
    def makeEvilDict(n):
        # list(range(n)) is needed on Python 3, where range is not a list.
        return reduce(lambda acc, x: {x: acc}, [{i: 0 for i in range(n)}] + list(range(n)))

    import timeit
    def time(runnable):
        t0 = timeit.default_timer()
        _ = runnable()
        t1 = timeit.default_timer()
        print('took {:.2f} seconds'.format(t1 - t0))

    >>> pp(makeEvilDict(8))
    {7: {6: {5: {4: {3: {2: {1: {0: {0: 0,
                                     1: 0,
                                     2: 0,
                                     3: 0,
                                     4: 0,
                                     5: 0,
                                     6: 0,
                                     7: 0}}}}}}}}}

    import sys
    sys.setrecursionlimit(1000000)

    forget = lambda a, b: ''

    >>> time(lambda: dict(flattenDict(makeEvilDict(10000), join=forget)))
    took 0.10 seconds
    >>> time(lambda: dict(flattenDict(makeEvilDict(100000), join=forget)))
    [1]    12569 segmentation fault  python

... sigh, don't think that one is my fault ...

[unimportant historical note due to moderation issues]

Regarding the alleged duplicate of "Flatten a dictionary of dictionaries (2 levels deep) of lists in Python":

That question's solution can be implemented in terms of this one by doing sorted(sum(flatten(...), [])). The reverse is not possible: while it is true that the values of flatten(...) can be recovered from the alleged duplicate by mapping a higher-order accumulator, one cannot recover the keys. (Edit: it also turns out that the alleged duplicate owner's question is completely different, in that it only deals with dictionaries exactly 2 levels deep, though one of the answers on that page gives a general solution.)
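The keyspace clobbering concern raised in this answer is easy to reproduce. The following sketch is not the flattenDict function; it is a small illustrative generator (the name `flatten_pairs` is made up) that yields key paths as tuples, so the same data can be viewed both with joined string keys and with tuple keys.

```python
def flatten_pairs(d, prefix=()):
    """Yield (key_path_tuple, value) pairs for every leaf."""
    for k, v in d.items():
        path = prefix + (k,)
        if isinstance(v, dict):
            yield from flatten_pairs(v, path)
        else:
            yield path, v

data = {"a_b": {"c": 1}, "a": {"b_c": 2}}
pairs = list(flatten_pairs(data))

# Joining with '_' maps both leaves to the same key, so one value is lost.
joined = {"_".join(path): value for path, value in pairs}
print(joined)   # {'a_b_c': 2}

# Tuple keys keep the two paths apart.
by_path = dict(pairs)
print(by_path)  # {('a_b', 'c'): 1, ('a', 'b_c'): 2}
```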

User · Answer

How about a functional and performant solution in Python 3.5?

    from functools import reduce


    def _reducer(items, key, val, pref):
        if isinstance(val, dict):
            return {**items, **flatten(val, pref + key)}
        else:
            return {**items, pref + key: val}

    def flatten(d, pref=''):
        return reduce(
            lambda new_d, kv: _reducer(new_d, *kv, pref),
            d.items(),
            {}
        )

This is even more performant:

    def flatten(d, pref=''):
        return reduce(
            lambda new_d, kv:
                isinstance(kv[1], dict) and
                {**new_d, **flatten(kv[1], pref + kv[0])} or
                {**new_d, pref + kv[0]: kv[1]},
            d.items(),
            {}
        )

In use:

    my_obj = {'a': 1, 'c': {'a': 2, 'b': {'x': 5, 'y': 10}}, 'd': [1, 2, 3]}

    print(flatten(my_obj))
    # {'d': [1, 2, 3], 'cby': 10, 'cbx': 5, 'ca': 2, 'a': 1}

User · Answer

Code:

    test = {'a': 1, 'c': {'a': 2, 'b': {'x': 5, 'y': 10}}, 'd': [1, 2, 3]}

    def parse_dict(init, lkey=''):
        ret = {}
        for rkey, val in init.items():
            key = lkey + rkey
            if isinstance(val, dict):
                ret.update(parse_dict(val, key + '_'))
            else:
                ret[key] = val
        return ret

    print(parse_dict(test, ''))

Results:

    $ python test.py
    {'a': 1, 'c_a': 2, 'c_b_x': 5, 'd': [1, 2, 3], 'c_b_y': 10}

I am using Python 3.2; update for your version of Python.

User · Answer

A variation of this "Flatten nested dictionaries, compressing keys" answer, with max_level and a custom reducer:

    from functools import reduce

    def flatten(d, max_level=None, reducer='tuple'):
        if reducer == 'tuple':
            reducer_seed = tuple()
            reducer_func = lambda x, y: (*x, y)
        else:
            raise ValueError(f'Unknown reducer: {reducer}')

        def impl(d, pref, level):
            return reduce(
                lambda new_d, kv:
                    (max_level is None or level < max_level)
                    and isinstance(kv[1], dict)
                    and {**new_d, **impl(kv[1], reducer_func(pref, kv[0]), level + 1)}
                    or {**new_d, reducer_func(pref, kv[0]): kv[1]},
                d.items(),
                {}
            )

        return impl(d, reducer_seed, 0)
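The effect of max_level is easier to see in a plain recursive form. The sketch below is an illustrative rewrite, not the reduce-based code above; it assumes tuple keys and the same rule that a dict is only descended into while the current level is below max_level.

```python
def flatten_limited(d, max_level=None, prefix=(), level=0):
    """Flatten to tuple keys, descending at most max_level levels (None = no limit)."""
    flat = {}
    for k, v in d.items():
        path = prefix + (k,)
        can_descend = max_level is None or level < max_level
        if can_descend and isinstance(v, dict):
            flat.update(flatten_limited(v, max_level, path, level + 1))
        else:
            flat[path] = v  # leaf, or a dict left intact because the limit was reached
    return flat

data = {"a": 1, "c": {"a": 2, "b": {"x": 5, "y": 10}}}
level0 = flatten_limited(data, max_level=0)
level1 = flatten_limited(data, max_level=1)
full = flatten_limited(data)
print(level0)
print(level1)
print(full)
```

With max_level=1 the inner dictionary under ('c', 'b') is kept as a value rather than being expanded.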

User · Answer

Basically the same way you would flatten a nested list: you just have to do the extra work of iterating the dict by key/value, creating new keys for your new dictionary, and creating the dictionary at the final step.

    import collections.abc

    def flatten(d, parent_key='', sep='_'):
        items = []
        for k, v in d.items():
            new_key = parent_key + sep + k if parent_key else k
            # collections.MutableMapping was removed in Python 3.10; use collections.abc.
            if isinstance(v, collections.abc.MutableMapping):
                items.extend(flatten(v, new_key, sep=sep).items())
            else:
                items.append((new_key, v))
        return dict(items)

    >>> flatten({'a': 1, 'c': {'a': 2, 'b': {'x': 5, 'y': 10}}, 'd': [1, 2, 3]})
    {'a': 1, 'c_a': 2, 'c_b_x': 5, 'd': [1, 2, 3], 'c_b_y': 10}

User · Answer

    def flatten(unflattened_dict, separator='_'):
        flattened_dict = {}

        for k, v in unflattened_dict.items():
            if isinstance(v, dict):
                sub_flattened_dict = flatten(v, separator)
                for k2, v2 in sub_flattened_dict.items():
                    flattened_dict[k + separator + k2] = v2
            else:
                flattened_dict[k] = v

        return flattened_dict

User · Answer

This is similar to both imran's and ralu's answers. It does not use a generator, but instead employs recursion with a closure:

    def flatten_dict(d, separator='_'):
        final = {}

        def _flatten_dict(obj, parent_keys=()):
            for k, v in obj.items():  # .iteritems() on Python 2
                if isinstance(v, dict):
                    _flatten_dict(v, parent_keys + (k,))
                else:
                    key = separator.join(parent_keys + (k,))
                    final[key] = v

        _flatten_dict(d)
        return final

    >>> print(flatten_dict({'a': 1, 'c': {'a': 2, 'b': {'x': 5, 'y': 10}}, 'd': [1, 2, 3]}))
    {'a': 1, 'c_a': 2, 'c_b_x': 5, 'd': [1, 2, 3], 'c_b_y': 10}

User · Answer

Simple function to flatten nested dictionaries. For Python 3, replace .iteritems() with .items().

    def flatten_dict(init_dict):
        res_dict = {}
        if type(init_dict) is not dict:
            return res_dict

        for k, v in init_dict.iteritems():
            if type(v) == dict:
                res_dict.update(flatten_dict(v))
            else:
                res_dict[k] = v

        return res_dict

The idea/requirement was: get flat dictionaries without keeping parent keys.

Example of usage:

    dd = {'a': 3,
          'b': {'c': 4, 'd': 5},
          'e': {'f':
                    {'g': 1, 'h': 2}
               },
          'i': 9,
         }

    flatten_dict(dd)

    >> {'a': 3, 'c': 4, 'd': 5, 'g': 1, 'h': 2, 'i': 9}

Keeping parent keys is simple as well.
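Dropping the parent keys has a side effect worth seeing once: leaf keys that repeat at different levels overwrite each other. This is a minimal Python 3 sketch of the same idea, with an illustrative name (`flatten_leaves`) and made-up data.

```python
def flatten_leaves(d):
    """Keep only leaf keys; parent keys are discarded."""
    flat = {}
    for k, v in d.items():
        if isinstance(v, dict):
            flat.update(flatten_leaves(v))
        else:
            flat[k] = v
    return flat

data = {"a": 1, "c": {"a": 2, "b": {"x": 5}}}
flat = flatten_leaves(data)
print(flat)  # {'a': 2, 'x': 5}: the top-level 'a': 1 was overwritten
```

This is fine when leaf names are known to be unique, and lossy otherwise.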

User · Answer

If you do not mind recursive functions, here is a solution. I have also taken the liberty of including an exclusion parameter, in case there are one or more values you wish to keep nested.

Code:

    def flatten_dict(dictionary, exclude=(), delimiter='_'):
        flat_dict = dict()
        for key, value in dictionary.items():
            if isinstance(value, dict) and key not in exclude:
                flatten_value_dict = flatten_dict(value, exclude, delimiter)
                for k, v in flatten_value_dict.items():
                    flat_dict[f"{key}{delimiter}{k}"] = v
            else:
                flat_dict[key] = value
        return flat_dict

Usage:

    d = {'a': 1, 'b': [1, 2], 'c': 3, 'd': {'a': 4, 'b': {'a': 7, 'b': 8}, 'c': 6}, 'e': {'a': 1, 'b': 2}}
    flat_d = flatten_dict(dictionary=d, exclude=['e'], delimiter='.')
    print(flat_d)

Output:

    {'a': 1, 'b': [1, 2], 'c': 3, 'd.a': 4, 'd.b.a': 7, 'd.b.b': 8, 'd.c': 6, 'e': {'a': 1, 'b': 2}}

User · Answer

Utilizing recursion, keeping it simple and human readable:

    def flatten_dict(dictionary, accumulator=None, parent_key=None, separator="."):
        if accumulator is None:
            accumulator = {}

        for k, v in dictionary.items():
            k = f"{parent_key}{separator}{k}" if parent_key else k
            if isinstance(v, dict):
                # Pass the separator along so a custom one also applies to nested levels.
                flatten_dict(dictionary=v, accumulator=accumulator, parent_key=k, separator=separator)
                continue

            accumulator[k] = v

        return accumulator

The call is simple:

    new_dict = flatten_dict(dictionary)

or

    new_dict = flatten_dict(dictionary, separator="_")

if we want to change the default separator.

A little breakdown:

When the function is first called, it is called passing only the dictionary we want to flatten. The accumulator parameter is here to support recursion, which we see later. So, we instantiate accumulator to an empty dictionary where we will put all of the nested values from the original dictionary.

    if accumulator is None:
        accumulator = {}

As we iterate over the dictionary's values, we construct a key for every value. The parent_key argument will be None for the first call, while for every nested dictionary it will contain the key pointing to it, so we prepend that key.

    k = f"{parent_key}{separator}{k}" if parent_key else k

In case the value v that the key k is pointing to is a dictionary, the function calls itself, passing the nested dictionary, the accumulator (which is passed by reference, so all changes done to it are done on the same instance), and the key k so that we can construct the concatenated key. Notice the continue statement. We want to skip the next line, outside of the if block, so that the nested dictionary doesn't end up in the accumulator under key k.

    if isinstance(v, dict):
        flatten_dict(dictionary=v, accumulator=accumulator, parent_key=k, separator=separator)
        continue

So, what do we do in case the value v is not a dictionary? Just put it unchanged inside the accumulator.

    accumulator[k] = v

Once we're done, we just return the accumulator, leaving the original dictionary argument untouched.

NOTE: This will work only with dictionaries that have strings as keys. It will work with hashable objects implementing the __repr__ method, but will yield unwanted results.
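The "unwanted results" with non-string keys can be shown concretely. The sketch below repeats the accumulator approach in self-contained form (with the separator passed through on recursion) and feeds it integer keys; the sample data is made up for illustration.

```python
def flatten_dict(dictionary, accumulator=None, parent_key=None, separator="."):
    if accumulator is None:
        accumulator = {}
    for k, v in dictionary.items():
        k = f"{parent_key}{separator}{k}" if parent_key else k
        if isinstance(v, dict):
            flatten_dict(v, accumulator, k, separator)
            continue
        accumulator[k] = v
    return accumulator

mixed = {
    1: {2: "int path"},          # becomes the string key '1.2'
    "1": {"2": "str path"},      # also '1.2', so it overwrites the entry above
    0: {"x": "falsy parent"},    # parent key 0 is falsy, so the prefix is dropped
}
flat = flatten_dict(mixed)
print(flat)  # {'1.2': 'str path', 'x': 'falsy parent'}
```

Comparing parent_key against None instead of testing its truthiness would fix the third case; the first two collide whenever keys are stringified.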

User · Answer

I tried some of the solutions on this page, though not all, but those I tried failed to handle a nested list of dicts.

Consider a dict like this:

    d = {
        "owner": {
            "name": {"first_name": "Steven", "last_name": "Smith"},
            "lottery_nums": [1, 2, 3, 'four', '11', None],
            "address": {},
            "tuple": (1, 2, 'three'),
            "tuple_with_dict": (1, 2, 'three', {'is_valid': False}),
            "set": {1, 2, 3, 4, 'five'},
            "children": [
                {"name": {"first_name": "Jessica",
                          "last_name": "Smith"},
                 "children": []
                 },
                {"name": {"first_name": "George",
                          "last_name": "Smith"},
                 "children": []
                 }
            ]
        }
    }

Here's my makeshift solution:

    def flatten_dict(input_node: dict, key_: str = '', output_dict: dict = None):
        # Create the result dict per call; a {} default would be shared between calls.
        if output_dict is None:
            output_dict = {}
        if isinstance(input_node, dict):
            for key, val in input_node.items():
                new_key = f"{key_}.{key}" if key_ else f"{key}"
                flatten_dict(val, new_key, output_dict)
        elif isinstance(input_node, list):
            for idx, item in enumerate(input_node):
                flatten_dict(item, f"{key_}.{idx}", output_dict)
        else:
            output_dict[key_] = input_node
        return output_dict

which produces:

    {
        owner.name.first_name: Steven,
        owner.name.last_name: Smith,
        owner.lottery_nums.0: 1,
        owner.lottery_nums.1: 2,
        owner.lottery_nums.2: 3,
        owner.lottery_nums.3: four,
        owner.lottery_nums.4: 11,
        owner.lottery_nums.5: None,
        owner.tuple: (1, 2, 'three'),
        owner.tuple_with_dict: (1, 2, 'three', {'is_valid': False}),
        owner.set: {1, 2, 3, 4, 'five'},
        owner.children.0.name.first_name: Jessica,
        owner.children.0.name.last_name: Smith,
        owner.children.1.name.first_name: George,
        owner.children.1.name.last_name: Smith,
    }

It is a makeshift solution and it's not perfect. NOTE:

1. It doesn't keep empty dicts such as the address: {} k/v pair.
2. It won't flatten dicts in nested tuples, though it would be easy to add using the fact that Python tuples act similarly to lists.
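The tuple case mentioned in the note can be covered by widening the list check. This is a sketch of that extension under the same dotted-key convention; the name `flatten_any` is illustrative, and empty dicts are still dropped.

```python
def flatten_any(node, key="", out=None):
    """Flatten dicts, lists and tuples; sets and scalars are stored as leaves."""
    if out is None:
        out = {}
    if isinstance(node, dict):
        for k, v in node.items():
            flatten_any(v, f"{key}.{k}" if key else str(k), out)
    elif isinstance(node, (list, tuple)):
        for idx, item in enumerate(node):
            flatten_any(item, f"{key}.{idx}" if key else str(idx), out)
    else:
        out[key] = node
    return out

data = {"owner": {"tuple_with_dict": (1, 2, "three", {"is_valid": False}),
                  "address": {}}}
flat = flatten_any(data)
print(flat)
```

The dict inside the tuple now contributes `owner.tuple_with_dict.3.is_valid`, while `owner.address` still produces no entry.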
