Creating a Pandas DataFrame from a Numpy array: How do I specify the index column and column headers?

Question

I have a Numpy array consisting of a list of lists, representing a two-dimensional array with row labels and column names, as shown below:

    data = array([['', 'Col1', 'Col2'], ['Row1', 1, 2], ['Row2', 3, 4]])

I'd like the resulting DataFrame to have Row1 and Row2 as index values, and Col1, Col2 as header values.

I can specify the index as follows:

    df = pd.DataFrame(data, index=data[:, 0])

However, I am unsure how best to assign the column headers.

User · Answer

Here is a simple example that creates a pandas DataFrame from numpy arrays:

    import numpy as np
    import pandas as pd

    # create two 1-D arrays of length 20
    var1 = np.arange(start=1, stop=21, step=1).reshape(-1)
    var2 = np.random.rand(20, 1).reshape(-1)
    print(var1.shape)
    print(var2.shape)

    # start from an empty frame and add one column per array
    dataset = pd.DataFrame()
    dataset['col1'] = var1
    dataset['col2'] = var2
    dataset.head()

User · Answer

I think this is a simple and intuitive method:

    data = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    reward = np.array([1, 0, 1, 0])

    dataset = pd.DataFrame()
    dataset['StateAttributes'] = data.tolist()  # each cell holds a list
    dataset['reward'] = reward.tolist()

    dataset

But there are performance implications, detailed here: How to set the value of a pandas column as list

User · Answer

Here is an easy-to-understand solution:

    >>> import numpy as np
    >>> import pandas as pd

    # Creating a 2-dimensional numpy array
    >>> data = np.array([[5.8, 2.8], [6.0, 2.2]])
    >>> data
    array([[5.8, 2.8],
           [6. , 2.2]])

    # Creating a pandas DataFrame from the numpy array
    >>> dataset = pd.DataFrame({'Column1': data[:, 0], 'Column2': data[:, 1]})
    >>> print(dataset)
       Column1  Column2
    0      5.8      2.8
    1      6.0      2.2

User · Answer

You need to specify data, index and columns to the DataFrame constructor, as in:

    >>> pd.DataFrame(data=data[1:, 1:],    # values
    ...              index=data[1:, 0],    # 1st column as index
    ...              columns=data[0, 1:])  # 1st row as the column names

Edit: as in the @joris comment, you may need to change the above to np.int_(data[1:, 1:]) to have the correct data type.
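To make the constructor call above concrete, here is a minimal self-contained sketch using the array from the question. It assumes the top-left cell is an empty string and that the values are integers; astype(int) is used here in place of np.int_ for the conversion, which is my substitution rather than part of the answer.

```python
import numpy as np
import pandas as pd

# Mixed labels and numbers: NumPy stores every cell as a string.
# The empty top-left cell is an assumption about the question's data.
data = np.array([['', 'Col1', 'Col2'],
                 ['Row1', 1, 2],
                 ['Row2', 3, 4]])

df = pd.DataFrame(
    data=data[1:, 1:].astype(int),  # values, converted from strings to integers
    index=data[1:, 0],              # 1st column as index
    columns=data[0, 1:],            # 1st row as the column names
)

print(df)
print(df.dtypes)
```

Without the conversion, the columns would hold strings rather than numbers, because the array itself has a string dtype.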

User · Answer

Adding to @behzad.nouri's answer: we can create a helper routine to handle this common scenario:

    import pandas as pd

    def csvDf(dat, **kwargs):
        from numpy import array
        data = array(dat)
        # nothing to build from an empty input
        if data is None or len(data) == 0 or len(data[0]) == 0:
            return None
        else:
            # first row -> column names, first column -> index
            return pd.DataFrame(data[1:, 1:], index=data[1:, 0], columns=data[0, 1:], **kwargs)

Let's try it out:

    data = [['', 'a', 'b', 'c'],
            ['row1', 'row1cola', 'row1colb', 'row1colc'],
            ['row2', 'row2cola', 'row2colb', 'row2colc'],
            ['row3', 'row3cola', 'row3colb', 'row3colc']]

    In [61]: csvDf(data)
    Out[61]:
                 a         b         c
    row1  row1cola  row1colb  row1colc
    row2  row2cola  row2colb  row2colc
    row3  row3cola  row3colb  row3colc

User · Answer

I agree with Joris; it seems like you should be doing this differently, like with numpy record arrays. Modifying "option 2" from this great answer, you could do it like this:

    import pandas
    import numpy

    dtype = [('Col1', 'int32'), ('Col2', 'float32'), ('Col3', 'float32')]
    values = numpy.zeros(20, dtype=dtype)
    index = ['Row' + str(i) for i in range(1, len(values) + 1)]

    df = pandas.DataFrame(values, index=index)
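As a rough illustration of the same idea applied to the question's data, the sketch below builds a small structured array by hand. The field names and int32 types are assumptions chosen to match the Col1/Col2 example; pandas takes the column headers from the field names, so only the index has to be passed separately.

```python
import numpy as np
import pandas as pd

# Structured array: each field carries its own name and dtype.
# Field names and types are illustrative assumptions.
dtype = [('Col1', 'int32'), ('Col2', 'int32')]
values = np.array([(1, 2), (3, 4)], dtype=dtype)

# Column headers come from the field names; row labels are given explicitly.
df = pd.DataFrame(values, index=['Row1', 'Row2'])
print(df)
```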

User · Answer

It's not so short, but maybe it can help you.

Creating the array:

    import numpy as np
    import pandas as pd

    data = np.array([['col1', 'col2'], [4.8, 2.8], [7.0, 1.2]])

    >>> data
    array([['col1', 'col2'],
           ['4.8', '2.8'],
           ['7.0', '1.2']], dtype='<U4')

Creating the data frame:

    df = pd.DataFrame(i for i in data).transpose()
    df.drop(0, axis=1, inplace=True)
    df.columns = data[0]
    df

    >>> df
      col1 col2
    0  4.8  7.0
    1  2.8  1.2
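For comparison, a shorter route is sketched below. It assumes the first row of the array holds the column names and the remaining rows hold numeric strings. Note that it keeps the original row orientation, so its result is the transpose of the frame above, and that the float conversion is my addition rather than part of the answer.

```python
import numpy as np
import pandas as pd

# Same data as above; NumPy stores every cell as a string.
data = np.array([['col1', 'col2'], [4.8, 2.8], [7.0, 1.2]])

# First row becomes the header, the remaining rows become the values.
df = pd.DataFrame(data[1:], columns=data[0]).astype(float)
print(df)
```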

User · Answer

This can be done simply by using from_records of pandas DataFrame:

    import numpy as np
    import pandas as pd

    # Creating a numpy array
    x = np.arange(1, 10, 1).reshape(-1, 1)
    dataframe = pd.DataFrame.from_records(x)

User · Answer

    >>> import pandas as pd
    >>> import numpy as np
    >>> data.shape
    (480, 193)
    >>> type(data)
    numpy.ndarray
    >>> df = pd.DataFrame(data=data[0:, 0:],
    ...                   index=[i for i in range(data.shape[0])],
    ...                   columns=['f' + str(i) for i in range(data.shape[1])])
    >>> df.head()
