Get list from pandas dataframe column or row

Question

I have a dataframe df imported from an Excel document like this:

cluster  load_date  budget  actual  fixed_price
A        1/1/2014   1000    4000    Y
A        2/1/2014   12000   10000   Y
A        3/1/2014   36000   2000    Y
B        4/1/2014   15000   10000   N
B        4/1/2014   12000   11500   N
B        4/1/2014   90000   11000   N
C        7/1/2014   22000   18000   N
C        8/1/2014   30000   28960   N
C        9/1/2014   53000   51200   N

I want to be able to return the contents of column 1, df['cluster'], as a list, so I can run a for-loop over it and create an Excel worksheet for every cluster.

Is it also possible to return the contents of a whole column or row to a list? e.g.

list = [], list[column1] or list[df.ix(row1)]

User · Accepted Answer

Pandas DataFrame columns are pandas Series when you pull them out, and you can call x.tolist() on a Series to turn it into a Python list. Alternatively, you can cast it with list(x).

import pandas as pd

data_dict = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
             'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}

df = pd.DataFrame(data_dict)

print(f"DataFrame:\n{df}\n")
print(f"column types:\n{df.dtypes}")

col_one_list = df['one'].tolist()

col_one_arr = df['one'].to_numpy()

print(f"\ncol_one_list:\n{col_one_list}\ntype:{type(col_one_list)}")
print(f"\ncol_one_arr:\n{col_one_arr}\ntype:{type(col_one_arr)}")

Output:

DataFrame:
   one  two
a  1.0    1
b  2.0    2
c  3.0    3
d  NaN    4

column types:
one    float64
two      int64
dtype: object

col_one_list:
[1.0, 2.0, 3.0, nan]
type:<class 'list'>

col_one_arr:
[ 1.  2.  3. nan]
type:<class 'numpy.ndarray'>
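The question also asks about rows and uses df.ix, which was removed in pandas 1.0. As a minimal sketch (the small frame below is a hypothetical, abbreviated stand-in for the question's data), a row can be selected with .iloc (by position) or .loc (by index label) and converted the same way as a column:

```python
import pandas as pd

# Hypothetical, abbreviated stand-in for the question's table.
df = pd.DataFrame(
    {
        "cluster": ["A", "A", "B"],
        "budget": [1000, 12000, 15000],
        "actual": [4000, 10000, 10000],
    }
)

# Column -> list
cluster_list = df["cluster"].tolist()

# Row by position -> list (replacement for the removed df.ix)
first_row = df.iloc[0].tolist()

# Row by index label -> list
row_label_2 = df.loc[2].tolist()

# Distinct cluster names, e.g. to loop over when creating one worksheet each
clusters = df["cluster"].unique().tolist()

print(cluster_list, first_row, row_label_2, clusters, sep="\n")
```

Here the index labels happen to equal the positions, so .loc[2] and .iloc[2] would select the same row; with a non-default index they can differ.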

User · Answer

amount = []
for col in df.columns:
    val = list(df[col])
    for v in val:
        amount.append(v)
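The loop above collects every value of the frame into one flat list, column by column. A possible vectorized equivalent, shown here only as a sketch on a hypothetical all-integer frame, is to flatten the underlying array in column-major order:

```python
import pandas as pd

# Hypothetical two-column frame used only for illustration.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Nested-loop version from the answer above.
amount = []
for col in df.columns:
    for v in list(df[col]):
        amount.append(v)

# order="F" walks the array column by column, matching the loop's order.
flat = df.to_numpy().ravel(order="F").tolist()

print(amount)
print(flat)
```

With mixed column types, to_numpy() falls back to an object array, so the loop may be the more predictable choice there.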

User · Answer

Example conversion: NumPy array -> pandas DataFrame -> list from one pandas column

import numpy as np
import pandas as pd

NumPy array:

data = np.array([[10, 20, 30], [20, 30, 60], [30, 60, 90]])

Convert the NumPy array into a pandas DataFrame:

dataPd = pd.DataFrame(data=data)

print(dataPd)
    0   1   2
0  10  20  30
1  20  30  60
2  30  60  90

Convert one pandas column to a list (the column labels are the integers 0, 1, 2):

pdToList = list(dataPd[2])

User · Answer

As this question attained a lot of attention and there are several ways to fulfill your task, let me present several options.

(Those are all one-liners, by the way.)

Starting with:

df
  cluster load_date budget actual fixed_price
0       A  1/1/2014   1000   4000           Y
1       A  2/1/2014  12000  10000           Y
2       A  3/1/2014  36000   2000           Y
3       B  4/1/2014  15000  10000           N
4       B  4/1/2014  12000  11500           N
5       B  4/1/2014  90000  11000           N
6       C  7/1/2014  22000  18000           N
7       C  8/1/2014  30000  28960           N
8       C  9/1/2014  53000  51200           N

Overview of potential operations:

ser_aggCol (collapse each column to a list)
cluster          [A, A, A, B, B, B, C, C, C]
load_date      [1/1/2014, 2/1/2014, 3/1/2...
budget         [1000, 12000, 36000, 15000...
actual         [4000, 10000, 2000, 10000,...
fixed_price      [Y, Y, Y, N, N, N, N, N, N]
dtype: object

ser_aggRows (collapse each row to a list)
0     [A, 1/1/2014, 1000, 4000, Y]
1    [A, 2/1/2014, 12000, 10000...
2    [A, 3/1/2014, 36000, 2000, Y]
3    [B, 4/1/2014, 15000, 10000...
4    [B, 4/1/2014, 12000, 11500...
5    [B, 4/1/2014, 90000, 11000...
6    [C, 7/1/2014, 22000, 18000...
7    [C, 8/1/2014, 30000, 28960...
8    [C, 9/1/2014, 53000, 51200...
dtype: object

df_gr (here you get lists for each cluster)
                             load_date                 budget                 actual fixed_price
cluster
A        [1/1/2014, 2/1/2014, 3/1/2...   [1000, 12000, 36000]    [4000, 10000, 2000]   [Y, Y, Y]
B        [4/1/2014, 4/1/2014, 4/1/2...  [15000, 12000, 90000]  [10000, 11500, 11000]   [N, N, N]
C        [7/1/2014, 8/1/2014, 9/1/2...  [22000, 30000, 53000]  [18000, 28960, 51200]   [N, N, N]

a list of separate dataframes for each cluster

df for cluster A
  cluster load_date budget actual fixed_price
0       A  1/1/2014   1000   4000           Y
1       A  2/1/2014  12000  10000           Y
2       A  3/1/2014  36000   2000           Y

df for cluster B
  cluster load_date budget actual fixed_price
3       B  4/1/2014  15000  10000           N
4       B  4/1/2014  12000  11500           N
5       B  4/1/2014  90000  11000           N

df for cluster C
  cluster load_date budget actual fixed_price
6       C  7/1/2014  22000  18000           N
7       C  8/1/2014  30000  28960           N
8       C  9/1/2014  53000  51200           N

just the values of column load_date
0    1/1/2014
1    2/1/2014
2    3/1/2014
3    4/1/2014
4    4/1/2014
5    4/1/2014
6    7/1/2014
7    8/1/2014
8    9/1/2014
Name: load_date, dtype: object

just the values of column number 2
0     1000
1    12000
2    36000
3    15000
4    12000
5    90000
6    22000
7    30000
8    53000
Name: budget, dtype: object

just the values of row number 7
cluster               C
load_date      8/1/2014
budget            30000
actual            28960
fixed_price           N
Name: 7, dtype: object

============================== JUST FOR COMPLETENESS ==============================

you can convert a series to a list
['C', '8/1/2014', '30000', '28960', 'N']
<class 'list'>

you can convert a dataframe to a nested list
[['A', '1/1/2014', '1000', '4000', 'Y'], ['A', '2/1/2014', '12000', '10000', 'Y'], ['A', '3/1/2014', '36000', '2000', 'Y'], ['B', '4/1/2014', '15000', '10000', 'N'], ['B', '4/1/2014', '12000', '11500', 'N'], ['B', '4/1/2014', '90000', '11000', 'N'], ['C', '7/1/2014', '22000', '18000', 'N'], ['C', '8/1/2014', '30000', '28960', 'N'], ['C', '9/1/2014', '53000', '51200', 'N']]
<class 'list'>

the content of a dataframe can be accessed as a numpy.ndarray
[['A' '1/1/2014' '1000' '4000' 'Y']
 ['A' '2/1/2014' '12000' '10000' 'Y']
 ['A' '3/1/2014' '36000' '2000' 'Y']
 ['B' '4/1/2014' '15000' '10000' 'N']
 ['B' '4/1/2014' '12000' '11500' 'N']
 ['B' '4/1/2014' '90000' '11000' 'N']
 ['C' '7/1/2014' '22000' '18000' 'N']
 ['C' '8/1/2014' '30000' '28960' 'N']
 ['C' '9/1/2014' '53000' '51200' 'N']]
<class 'numpy.ndarray'>

code:

# prefix ser refers to pd.Series object
# prefix df refers to pd.DataFrame object
# prefix lst refers to list object

import pandas as pd
import numpy as np

df = pd.DataFrame([
        ['A', '1/1/2014', '1000',  '4000',  'Y'],
        ['A', '2/1/2014', '12000', '10000', 'Y'],
        ['A', '3/1/2014', '36000', '2000',  'Y'],
        ['B', '4/1/2014', '15000', '10000', 'N'],
        ['B', '4/1/2014', '12000', '11500', 'N'],
        ['B', '4/1/2014', '90000', '11000', 'N'],
        ['C', '7/1/2014', '22000', '18000', 'N'],
        ['C', '8/1/2014', '30000', '28960', 'N'],
        ['C', '9/1/2014', '53000', '51200', 'N']
        ], columns=['cluster', 'load_date', 'budget', 'actual', 'fixed_price'])
print('df', df, sep='\n', end='\n\n')

ser_aggCol = df.aggregate(lambda x: [x.tolist()], axis=0).map(lambda x: x[0])
print('ser_aggCol (collapse each column to a list)', ser_aggCol, sep='\n', end='\n\n\n')

ser_aggRows = pd.Series(df.values.tolist())
print('ser_aggRows (collapse each row to a list)', ser_aggRows, sep='\n', end='\n\n\n')

df_gr = df.groupby('cluster').agg(lambda x: list(x))
print('df_gr (here you get lists for each cluster)', df_gr, sep='\n', end='\n\n\n')

lst_dfFiltGr = [df.loc[df['cluster'] == val, :] for val in df['cluster'].unique()]
print('a list of separate dataframes for each cluster', sep='\n', end='\n\n')
for dfTmp in lst_dfFiltGr:
    print('df for cluster ' + str(dfTmp.loc[dfTmp.index[0], 'cluster']), dfTmp, sep='\n', end='\n\n')

ser_singleColLD = df.loc[:, 'load_date']
print('just the values of column load_date', ser_singleColLD, sep='\n', end='\n\n\n')

ser_singleCol2 = df.iloc[:, 2]
print('just the values of column number 2', ser_singleCol2, sep='\n', end='\n\n\n')

ser_singleRow7 = df.iloc[7, :]
print('just the values of row number 7', ser_singleRow7, sep='\n', end='\n\n\n')

print('=' * 30 + ' JUST FOR COMPLETENESS ' + '=' * 30, end='\n\n\n')

lst_fromSer = ser_singleRow7.tolist()
print('you can convert a series to a list', lst_fromSer, type(lst_fromSer), sep='\n', end='\n\n\n')

lst_fromDf = df.values.tolist()
print('you can convert a dataframe to a nested list', lst_fromDf, type(lst_fromDf), sep='\n', end='\n\n')

arr_fromDf = df.values
print('the content of a dataframe can be accessed as a numpy.ndarray', arr_fromDf, type(arr_fromDf), sep='\n', end='\n\n')

As pointed out by cs95, other methods should be preferred over the pandas .values attribute from pandas version 0.24 on. I use it here because most people will (by 2019) still have an older version, which does not support the new recommendations. You can check your version with print(pd.__version__).
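For the per-cluster worksheet goal in the question, iterating over groupby may be a compact alternative to filtering once per unique value. The following is a sketch only: the frame is a hypothetical, shortened version of the data, and the Excel part is left as a comment because it needs an Excel engine such as openpyxl and a file name of your choosing.

```python
import pandas as pd

# Hypothetical, shortened version of the question's data.
df = pd.DataFrame(
    {
        "cluster": ["A", "A", "B", "C"],
        "budget": [1000, 12000, 15000, 22000],
    }
)

# groupby yields (key, sub-frame) pairs, one per distinct cluster value.
frames_by_cluster = {name: group for name, group in df.groupby("cluster")}

# One list of budgets per cluster.
budgets_by_cluster = df.groupby("cluster")["budget"].agg(list).to_dict()

print(budgets_by_cluster)

# Writing one worksheet per cluster could then look like this
# ("clusters.xlsx" is a placeholder file name):
#
# with pd.ExcelWriter("clusters.xlsx") as writer:
#     for name, group in frames_by_cluster.items():
#         group.to_excel(writer, sheet_name=str(name), index=False)
```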

User · Answer

Assuming the name of the dataframe after reading the Excel sheet is df, take an empty list (e.g. dataList), iterate through the dataframe row by row, and append each row to the list:

dataList = []  # empty list
for index, row in df.iterrows():
    mylist = [row.cluster, row.load_date, row.budget, row.actual, row.fixed_price]
    dataList.append(mylist)

Or:

dataList = []  # empty list
for row in df.itertuples():
    mylist = [row.cluster, row.load_date, row.budget, row.actual, row.fixed_price]
    dataList.append(mylist)

Now, if you print dataList, you will get each row as a list inside dataList.
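If every column is wanted in its original order, the explicit attribute list can probably be avoided. A minimal sketch, using a hypothetical two-row frame, compares a generic itertuples loop with converting the whole frame at once:

```python
import pandas as pd

# Hypothetical two-row frame with a subset of the question's columns.
df = pd.DataFrame(
    {
        "cluster": ["A", "B"],
        "load_date": ["1/1/2014", "4/1/2014"],
        "budget": [1000, 15000],
    }
)

# index=False drops the index from each tuple, leaving only the column values.
rows_loop = [list(row) for row in df.itertuples(index=False)]

# Whole frame -> nested list in one call.
rows_vectorized = df.to_numpy().tolist()

print(rows_loop)
print(rows_vectorized)
```

The loop form remains useful when only some columns are needed or when each row requires extra processing.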

User · Answer

If your column will only have one value, something like pd.Series.tolist() will produce an error. To guarantee that it will work for all cases, use the code below:

(
    df
        .filter(['column_name'])
        .values
        .reshape(1, -1)
        .ravel()
        .tolist()
)
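The answer does not state exactly which situation triggers the error, so the following is only a sanity check on a hypothetical one-row frame. It runs the chain above next to a plain column selection so the two results can be compared on your own pandas version:

```python
import pandas as pd

# Hypothetical frame whose column holds a single value.
single = pd.DataFrame({"column_name": [42]})

# The chain from the answer: select, take the 2-D array, flatten, convert.
via_chain = (
    single
    .filter(["column_name"])
    .values
    .reshape(1, -1)
    .ravel()
    .tolist()
)

# Plain column selection followed by tolist().
via_series = single["column_name"].tolist()

print(via_chain, via_series)
```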

User · Answer

This returns a numpy array:

arr = df["cluster"].to_numpy()

This returns a numpy array of unique values:

unique_arr = df["cluster"].unique()

You can also use numpy to get the unique values, although there are differences between the two methods:

arr = df["cluster"].to_numpy()
unique_arr = np.unique(arr)
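One of the differences is ordering: Series.unique() keeps values in order of first appearance, while np.unique returns them sorted. A small sketch with a hypothetical, deliberately unsorted column makes this visible:

```python
import numpy as np
import pandas as pd

# Hypothetical column whose values are not in sorted order.
s = pd.Series(["C", "A", "B", "A", "C"])

# pandas: order of first appearance
pandas_unique = s.unique().tolist()

# NumPy: sorted order
numpy_unique = np.unique(s.to_numpy()).tolist()

print(pandas_unique)
print(numpy_unique)
```

If worksheets should be created in the order the clusters appear in the file, the pandas variant is likely the better fit.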
