Convert list of dictionaries to a pandas DataFrame

Question

I have a list of dictionaries like this:

    [{'points': 50, 'time': '5:00', 'year': 2010},
     {'points': 25, 'time': '6:00', 'month': "february"},
     {'points': 90, 'time': '9:00', 'month': 'january'},
     {'points_h1': 20, 'month': 'june'}]

And I want to turn this into a pandas DataFrame like this:

          month  points  points_h1  time  year
    0       NaN      50        NaN  5:00  2010
    1  february      25        NaN  6:00   NaN
    2   january      90        NaN  9:00   NaN
    3      june     NaN         20   NaN   NaN

Note: Order of the columns does not matter.

How can I turn the list of dictionaries into a pandas DataFrame as shown above?

User · Answer

You can also use pd.DataFrame.from_dict(d) as:

    In [8]: d = [{'points': 50, 'time': '5:00', 'year': 2010},
       ...:      {'points': 25, 'time': '6:00', 'month': "february"},
       ...:      {'points': 90, 'time': '9:00', 'month': 'january'},
       ...:      {'points_h1': 20, 'month': 'june'}]

    In [12]: pd.DataFrame.from_dict(d)
    Out[12]:
          month  points  points_h1  time    year
    0       NaN    50.0        NaN  5:00  2010.0
    1  february    25.0        NaN  6:00     NaN
    2   january    90.0        NaN  9:00     NaN
    3      june     NaN       20.0   NaN     NaN
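
The output above shows the integer columns coming back as floats (50.0, 2010.0), because NaN is a float. A minimal sketch of one way to keep them as integers, assuming pandas' nullable "Int64" dtype is acceptable for your use case (the names int_cols and df_int are illustrative, not part of the original answer):

```python
import pandas as pd

d = [
    {"points": 50, "time": "5:00", "year": 2010},
    {"points": 25, "time": "6:00", "month": "february"},
    {"points": 90, "time": "9:00", "month": "january"},
    {"points_h1": 20, "month": "june"},
]

df = pd.DataFrame.from_dict(d)

# Missing keys become NaN, which forces the integer columns to float64.
float_dtype = str(df["points"].dtype)

# Assumption: nullable integers are fine downstream. "Int64" (capital I)
# stores missing values as pd.NA instead of NaN.
int_cols = ["points", "points_h1", "year"]
df_int = df.astype({col: "Int64" for col in int_cols})
```

Whether this is worth doing depends on what consumes the frame; some libraries expect plain NumPy dtypes.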

User · Answer

In pandas 0.16.2, I had to do pd.DataFrame.from_records(d) to get this to work.

User · Answer

Python 3:

Most of the solutions listed previously work. However, there are instances when the row number of the dataframe is not required and each row (record) has to be written individually.

The following method is useful in that case.

    import csv

    # Raw string, so the backslashes in the Windows path are not treated as escapes.
    my_file = r'C:\Users\John\Desktop\export_dataframe.csv'

    records_to_save = data2  # used as in the thread

    # colnames holds the keys of the first record only; keys that appear only
    # in later records are not written. Values are written in the order of
    # these keys, and "None" is written in case of a missing value.
    colnames = list(records_to_save[0].keys())

    with open(my_file, 'w', newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(colnames)
        for d in records_to_save:
            writer.writerow([d.get(r, "None") for r in colnames])
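
A variation on the same idea, sketched with csv.DictWriter: it collects the keys from every record rather than only the first one, and lets the writer fill in missing values through restval. The sample records and the in-memory buffer are assumptions for illustration; in practice the buffer would be a file opened with newline="".

```python
import csv
import io

records = [
    {"points": 50, "time": "5:00", "year": 2010},
    {"points": 25, "time": "6:00", "month": "february"},
    {"points_h1": 20, "month": "june"},
]

# Collect every key that appears in any record, keeping first-seen order.
fieldnames = list(dict.fromkeys(key for rec in records for key in rec))

buffer = io.StringIO()  # stand-in for a real file opened with newline=""
writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="None")
writer.writeheader()
writer.writerows(records)  # missing keys are filled with restval

csv_text = buffer.getvalue()
```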

User · Answer

Supposing d is your list of dicts, simply:

    df = pd.DataFrame(d)

Note: this does not work with nested data.
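
To illustrate the note about nested data, here is a small sketch with made-up records (the names nested, plain and flat are hypothetical). The plain constructor keeps each inner dict as a single object in one column, whereas pd.json_normalize (available as a top-level function since pandas 1.0) expands it into dotted column names.

```python
import pandas as pd

nested = [
    {"name": "a", "info": {"x": 1, "y": 2}},
    {"name": "b", "info": {"x": 3}},
]

# The inner dicts are stored as-is in a single "info" column.
plain = pd.DataFrame(nested)

# json_normalize flattens them into "info.x" and "info.y".
flat = pd.json_normalize(nested)
```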

User · Answer

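The easiest way I have found to do it is like this:

    dict_count = len(dict_list)
    df = pd.DataFrame(dict_list[0], index=[0])
    for i in range(1, dict_count):
        df = df.append(dict_list[i], ignore_index=True)

Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so this only runs on older versions.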
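
On pandas versions without DataFrame.append, the same row-by-row idea can be sketched with pd.concat. This is an illustrative alternative, not the answer author's code; dict_list here is the question's sample data, and frames is a hypothetical helper name.

```python
import pandas as pd

dict_list = [
    {"points": 50, "time": "5:00", "year": 2010},
    {"points": 25, "time": "6:00", "month": "february"},
    {"points": 90, "time": "9:00", "month": "january"},
    {"points_h1": 20, "month": "june"},
]

# Build one single-row frame per record, then concatenate once.
# Concatenating once avoids copying the growing frame on every iteration.
frames = [pd.DataFrame([record]) for record in dict_list]
df = pd.concat(frames, ignore_index=True)
```

For a plain list of flat dicts, pd.DataFrame(dict_list) is simpler; the loop form is mainly useful when records arrive one at a time.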

User · Answer

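To convert a list of dictionaries to a pandas DataFrame, you can use append. We have a dictionary called dic, and dic has 30 list items (list1, list2, ..., list30).

Step 1: define a variable for keeping your result (e.g. total_df).
Step 2: initialize total_df with list1.
Step 3: use a for loop to append all lists to total_df.

    import numpy as np
    import pandas as pd

    # Assumes each entry of dic is a list of dicts, and pandas older than 2.0
    # (DataFrame.append was removed in pandas 2.0).
    total_df = pd.DataFrame(dic['list1'])
    nums = pd.Series(np.arange(start=2, stop=31))
    for num in nums:
        total_df = total_df.append(dic['list' + str(num)], ignore_index=True)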
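
If every value in dic is a list of record dicts (an assumption; the answer does not say what the lists contain), the loop can be avoided by flattening the lists first and building the DataFrame in one call. A minimal sketch with a three-entry stand-in for dic:

```python
from itertools import chain

import pandas as pd

# Hypothetical stand-in for `dic`: each value is a list of record dicts.
dic = {
    "list1": [{"A": 1, "B": 2}],
    "list2": [{"A": 3}, {"B": 4, "C": 5}],
    "list3": [{"C": 6}],
}

# Flatten the lists in insertion order, then construct the frame once.
all_records = list(chain.from_iterable(dic.values()))
total_df = pd.DataFrame(all_records)
```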

User · Answer

How do I convert a list of dictionaries to a pandas DataFrame?

The other answers are correct, but not much has been explained in terms of advantages and limitations of these methods. The aim of this post will be to show examples of these methods under different situations, discuss when to use (and when not to use) them, and suggest alternatives.

DataFrame(), DataFrame.from_records(), and .from_dict()

Depending on the structure and format of your data, there are situations where either all three methods work, or some work better than others, or some don't work at all.

Consider a very contrived example.

    import numpy as np
    import pandas as pd

    np.random.seed(0)
    data = pd.DataFrame(
        np.random.choice(10, (3, 4)), columns=list('ABCD')).to_dict('records')

    print(data)
    [{'A': 5, 'B': 0, 'C': 3, 'D': 3},
     {'A': 7, 'B': 9, 'C': 3, 'D': 5},
     {'A': 2, 'B': 4, 'C': 7, 'D': 6}]

This list consists of "records" with every key present. This is the simplest case you could encounter.

The following methods all produce the same output:

    pd.DataFrame(data)
    pd.DataFrame.from_dict(data)
    pd.DataFrame.from_records(data)

       A  B  C  D
    0  5  0  3  3
    1  7  9  3  5
    2  2  4  7  6

Word on Dictionary Orientations: orient='index'/'columns'

Before continuing, it is important to distinguish between the different types of dictionary orientations and how pandas supports them. There are two primary types: "columns" and "index".

orient='columns'

Dictionaries with the "columns" orientation will have their keys correspond to columns in the equivalent DataFrame.

For example, data above is in the "columns" orient.

    data_c = [
        {'A': 5, 'B': 0, 'C': 3, 'D': 3},
        {'A': 7, 'B': 9, 'C': 3, 'D': 5},
        {'A': 2, 'B': 4, 'C': 7, 'D': 6}]

    pd.DataFrame.from_dict(data_c, orient='columns')

       A  B  C  D
    0  5  0  3  3
    1  7  9  3  5
    2  2  4  7  6

Note: If you are using pd.DataFrame.from_records, the orientation is assumed to be "columns" (you cannot specify otherwise), and the dictionaries will be loaded accordingly.

orient='index'

With this orient, keys are assumed to correspond to index values. This kind of data is best suited for pd.DataFrame.from_dict.

    data_i = {
        0: {'A': 5, 'B': 0, 'C': 3, 'D': 3},
        1: {'A': 7, 'B': 9, 'C': 3, 'D': 5},
        2: {'A': 2, 'B': 4, 'C': 7, 'D': 6}}

    pd.DataFrame.from_dict(data_i, orient='index')

       A  B  C  D
    0  5  0  3  3
    1  7  9  3  5
    2  2  4  7  6

This case is not considered in the OP, but is still useful to know.

Setting Custom Index

If you need a custom index on the resultant DataFrame, you can set it using the index=... argument.

    pd.DataFrame(data, index=['a', 'b', 'c'])
    # pd.DataFrame.from_records(data, index=['a', 'b', 'c'])

       A  B  C  D
    a  5  0  3  3
    b  7  9  3  5
    c  2  4  7  6

This is not supported by pd.DataFrame.from_dict.

Dealing with Missing Keys/Columns

All methods work out-of-the-box when handling dictionaries with missing keys/column values. For example,

    data2 = [
        {'A': 5, 'C': 3, 'D': 3},
        {'A': 7, 'B': 9, 'F': 5},
        {'B': 4, 'C': 7, 'E': 6}]

    # The methods below all produce the same output.
    pd.DataFrame(data2)
    pd.DataFrame.from_dict(data2)
    pd.DataFrame.from_records(data2)

         A    B    C    D    E    F
    0  5.0  NaN  3.0  3.0  NaN  NaN
    1  7.0  9.0  NaN  NaN  NaN  5.0
    2  NaN  4.0  7.0  NaN  6.0  NaN

Reading Subset of Columns

"What if I don't want to read in every single column?" You can easily specify this using the columns=... parameter.

For example, from the example dictionary of data2 above, if you wanted to read only columns 'A', 'D', and 'F', you can do so by passing a list:

    pd.DataFrame(data2, columns=['A', 'D', 'F'])
    # pd.DataFrame.from_records(data2, columns=['A', 'D', 'F'])

         A    D    F
    0  5.0  3.0  NaN
    1  7.0  NaN  5.0
    2  NaN  NaN  NaN

This is not supported by pd.DataFrame.from_dict with the default orient "columns".

    pd.DataFrame.from_dict(data2, orient='columns', columns=['A', 'B'])

    ValueError: cannot use columns parameter with orient='columns'

Reading Subset of Rows

Not supported by any of these methods directly. You will have to iterate over your data and perform a reverse delete in-place as you iterate. For example, to extract only the 0th and 2nd rows from data2 above, you can use:

    rows_to_select = {0, 2}
    for i in reversed(range(len(data2))):
        if i not in rows_to_select:
            del data2[i]

    pd.DataFrame(data2)
    # pd.DataFrame.from_dict(data2)
    # pd.DataFrame.from_records(data2)

         A    B  C    D    E
    0  5.0  NaN  3  3.0  NaN
    1  NaN  4.0  7  NaN  6.0

The Panacea: json_normalize for Nested Data

A strong, robust alternative to the methods outlined above is the json_normalize function, which works with lists of dictionaries (records) and can also handle nested dictionaries.

    pd.json_normalize(data)

       A  B  C  D
    0  5  0  3  3
    1  7  9  3  5
    2  2  4  7  6

    # data2 was modified in place by the row deletion above.
    pd.json_normalize(data2)

         A    B  C    D    E
    0  5.0  NaN  3  3.0  NaN
    1  NaN  4.0  7  NaN  6.0

Again, keep in mind that the data passed to json_normalize needs to be in the list-of-dictionaries (records) format.

As mentioned, json_normalize can also handle nested dictionaries. Here's an example taken from the documentation.

    data_nested = [
        {'counties': [{'name': 'Dade', 'population': 12345},
                      {'name': 'Broward', 'population': 40000},
                      {'name': 'Palm Beach', 'population': 60000}],
         'info': {'governor': 'Rick Scott'},
         'shortname': 'FL',
         'state': 'Florida'},
        {'counties': [{'name': 'Summit', 'population': 1234},
                      {'name': 'Cuyahoga', 'population': 1337}],
         'info': {'governor': 'John Kasich'},
         'shortname': 'OH',
         'state': 'Ohio'}
    ]

    pd.json_normalize(data_nested,
                      record_path='counties',
                      meta=['state', 'shortname', ['info', 'governor']])

             name  population    state shortname info.governor
    0        Dade       12345  Florida        FL    Rick Scott
    1     Broward       40000  Florida        FL    Rick Scott
    2  Palm Beach       60000  Florida        FL    Rick Scott
    3      Summit        1234     Ohio        OH   John Kasich
    4    Cuyahoga        1337     Ohio        OH   John Kasich

For more information on the meta and record_path arguments, check out the documentation.

Summarising

Here's a table of all the methods discussed above, along with supported features/functionality.

* Use orient='columns' and then transpose to get the same effect as orient='index'.
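
The footnote about transposing can be sketched as follows, reusing the index-oriented dictionary from this answer (the names by_index and transposed are illustrative). The default constructor treats the outer keys as column labels, so transposing the result turns them back into the index.

```python
import pandas as pd

data_i = {
    0: {"A": 5, "B": 0, "C": 3, "D": 3},
    1: {"A": 7, "B": 9, "C": 3, "D": 5},
    2: {"A": 2, "B": 4, "C": 7, "D": 6},
}

# Reference result: outer keys become the index.
by_index = pd.DataFrame.from_dict(data_i, orient="index")

# Outer keys become columns first; .T swaps rows and columns.
transposed = pd.DataFrame(data_i).T
```

One caveat: if the inner values have mixed types (say, numbers and strings), the transposed columns may end up with object dtype, so from_dict with orient='index' is usually the cleaner choice when it is available.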
