How to append rows in a pandas dataframe in a for loop

Question

I have the following for loop:

    for i in links:
        data = urllib2.urlopen(str(i)).read()
        data = json.loads(data)
        data = pd.DataFrame(data.items())
        data = data.transpose()
        data.columns = data.iloc[0]
        data = data.drop(data.index[[0]])

Each dataframe so created has most columns in common with the others, but not all of them. Moreover, they all have just one row. What I need to do is add to one dataframe all the distinct columns and each row from each dataframe produced by the for loop. I tried pandas concatenate or similar, but nothing seemed to work. Any idea? Thanks.

User · Accepted Answer

Suppose your data looks like this:

    import pandas as pd
    import numpy as np

    np.random.seed(2015)
    df = pd.DataFrame([])
    for i in range(5):
        data = dict(zip(np.random.choice(10, replace=False, size=5),
                        np.random.randint(10, size=5)))
        data = pd.DataFrame(data.items())
        data = data.transpose()
        data.columns = data.iloc[0]
        data = data.drop(data.index[[0]])
        df = df.append(data)
    print('{}\n'.format(df))

    0   0   1   2   3   4   5   6   7   8   9
    1   6 NaN NaN   8   5 NaN NaN   7   0 NaN
    1 NaN   9   6 NaN   2 NaN   1 NaN NaN   2
    1 NaN   2   2   1   2 NaN   1 NaN NaN NaN
    1   6 NaN   6 NaN   4   4   0 NaN NaN NaN
    1 NaN   9 NaN   9 NaN   7   1   9 NaN NaN

Then it could be replaced with:

    np.random.seed(2015)
    data = []
    for i in range(5):
        data.append(dict(zip(np.random.choice(10, replace=False, size=5),
                             np.random.randint(10, size=5))))
    df = pd.DataFrame(data)
    print(df)

In other words, do not form a new DataFrame for each row. Instead, collect all the data in a list of dicts, and then call df = pd.DataFrame(data) once at the end, outside the loop.

Each call to df.append requires allocating space for a new DataFrame with one extra row, copying all the data from the original DataFrame into the new DataFrame, and then copying data into the new row. All that allocation and copying makes calling df.append in a loop very inefficient: the time cost of copying grows quadratically with the number of rows. Not only is the call-DataFrame-once code easier to write, its performance will be much better, because the time cost of copying grows only linearly with the number of rows.
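
To tie this back to the question, here is a minimal sketch of the list-of-dicts approach applied to records whose keys only partly overlap. The `records` list is a hypothetical stand-in for the parsed JSON responses (it is not from the question); in the real loop each dict would come from `json.loads(...)`.

```python
import pandas as pd

# Hypothetical stand-ins for the decoded JSON responses; the keys
# overlap only partly, as described in the question.
records = [
    {"id": 1, "name": "a", "price": 9.5},
    {"id": 2, "name": "b", "stock": 3},
    {"id": 3, "price": 1.0, "stock": 7},
]

rows = []
for record in records:
    # In the question's loop this would be the dict returned by json.loads.
    rows.append(record)

# Build the DataFrame once, outside the loop. Columns are the union of
# all keys; values missing from a record become NaN.
df = pd.DataFrame(rows)
print(df)
```

Each record becomes one row, and no per-row DataFrame, transpose, or drop step is needed.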

User · Answer

I created the data frame in a for loop with the help of a temporary empty data frame. On every iteration of the for loop a new data frame is created, overwriting the contents of the previous iteration.

Hence I need to move the contents of each new data frame into the empty data frame that was created beforehand. It's as simple as that. We just need to use the .append function, as shown below:

    temp_df = pd.DataFrame()  # Temporary empty dataframe
    for sent in Sentences:
        # Create a new dataframe holding the tokenized words of the input sentence
        New_df = pd.DataFrame({'words': sent.words})
        # Move the contents of the newly created dataframe into the temporary dataframe
        temp_df = temp_df.append(New_df, ignore_index=True)

Outside the for loop, you can copy the contents of the temporary data frame into the master data frame and then delete the temporary data frame if you don't need it.
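
Note that `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0, so on current versions this pattern needs `pd.concat`. A minimal sketch follows, assuming the per-iteration frames are collected in a list and concatenated once; the `sentences` data is a hypothetical stand-in for the tokenized `sent.words` values.

```python
import pandas as pd

# Hypothetical tokenized sentences standing in for `sent.words`.
sentences = [["append", "rows"], ["in", "a", "loop"]]

frames = []
for words in sentences:
    # One small DataFrame per iteration, kept in a plain Python list.
    frames.append(pd.DataFrame({"words": words}))

# A single concatenation after the loop replaces the repeated append calls.
result = pd.concat(frames, ignore_index=True)
print(result)
```

Concatenating once at the end also avoids re-copying the accumulated rows on every iteration.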

User · Answer

There are two reasons you may append rows in a loop: (1) to add to an existing df, and (2) to create a new df.

To create a new df, I think it's well documented that you should either create your data as a list and then create the data frame:

    cols = ['c1', 'c2', 'c3']
    lst = []
    for a in range(2):
        lst.append([1, 2, 3])
    df1 = pd.DataFrame(lst, columns=cols)
    df1
    Out[3]:
       c1  c2  c3
    0   1   2   3
    1   1   2   3

OR create the dataframe with an index and then add to it:

    cols = ['c1', 'c2', 'c3']
    df2 = pd.DataFrame(columns=cols, index=range(2))
    for a in range(2):
        df2.loc[a, 'c1'] = 4
        df2.loc[a, 'c2'] = 5
        df2.loc[a, 'c3'] = 6
    df2
    Out[4]:
      c1 c2 c3
    0  4  5  6
    1  4  5  6

If you want to add to an existing dataframe, you could use either method above and then append the dfs together (with or without the index):

    df3 = df2.append(df1, ignore_index=True)
    df3
    Out[6]:
      c1 c2 c3
    0  4  5  6
    1  4  5  6
    2  1  2  3
    3  1  2  3

Or you can also create a list of dictionary entries and append those, as in the answer above:

    lst_dict = []
    for a in range(2):
        lst_dict.append({'c1': 2, 'c2': 2, 'c3': 3})
    df4 = df1.append(lst_dict)
    df4
    Out[7]:
       c1  c2  c3
    0   1   2   3
    1   1   2   3
    0   2   2   3
    1   2   2   3

Using dict(zip(cols, vals)):

    lst_dict = []
    for a in range(2):
        vals = [7, 8, 9]
        lst_dict.append(dict(zip(cols, vals)))
    df5 = df1.append(lst_dict)
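
For the "add to an existing df" case on pandas 2.0 or later, where `DataFrame.append` is no longer available, one possible equivalent is to turn the list of dicts into a DataFrame and join it with `pd.concat`. This is a sketch reusing the names from the answer. `ignore_index=True` is my choice here to get a fresh 0..n-1 index; leaving it out keeps the original labels, which would repeat 0 and 1.

```python
import pandas as pd

cols = ["c1", "c2", "c3"]

# Existing frame, built once from a list of rows.
df1 = pd.DataFrame([[1, 2, 3], [1, 2, 3]], columns=cols)

# New rows collected in the loop as dicts keyed by column name.
lst_dict = []
for a in range(2):
    vals = [7, 8, 9]
    lst_dict.append(dict(zip(cols, vals)))

# Convert the collected rows once and concatenate with the existing frame.
df5 = pd.concat([df1, pd.DataFrame(lst_dict)], ignore_index=True)
print(df5)
```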

User · Answer

A more compact and efficient way would perhaps be:

    cols = ['frame', 'count']
    N = 4
    dat = pd.DataFrame(columns=cols)
    for i in range(N):
        dat = dat.append({'frame': str(i), 'count': i}, ignore_index=True)

The output would be:

    >>> dat
      frame count
    0     0     0
    1     1     1
    2     2     2
    3     3     3

User · Answer

First, create an empty DataFrame with column names. After that, inside the for loop, define a dictionary (a row) with the data to append:

    df = pd.DataFrame(columns=['A'])
    for i in range(5):
        df = df.append({'A': i}, ignore_index=True)
    df
       A
    0  0
    1  1
    2  2
    3  3
    4  4

If you want to add a row with more columns, the code will look like this:

    df = pd.DataFrame(columns=['A', 'B', 'C'])
    for i in range(5):
        df = df.append({'A': i,
                        'B': i * 2,
                        'C': i * 3,
                       }, ignore_index=True)
    df
       A  B   C
    0  0  0   0
    1  1  2   3
    2  2  4   6
    3  3  6   9
    4  4  8  12
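
The same three-column table can be produced without any per-row append, which may be preferable on pandas 2.0 or later, where `DataFrame.append` has been removed. A minimal sketch, assuming the row dicts can be built up front:

```python
import pandas as pd

# One dict per row; the keys become the column names.
rows = [{"A": i, "B": i * 2, "C": i * 3} for i in range(5)]

# Passing `columns` fixes the column order explicitly.
df = pd.DataFrame(rows, columns=["A", "B", "C"])
print(df)
```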
