Pandas DataFrame column to list

Question

I am pulling a subset of data from a column based on conditions in another column being met.

I can get the correct values back, but the result is a pandas.core.frame.DataFrame. How do I convert that to a list?

import pandas as pd

tst = pd.read_csv('C:\\SomeCSV.csv')

lookupValue = tst['SomeCol'] == "SomeValue"
ID = tst[lookupValue][['SomeCol']]
# How to convert ID to a list?

User · Answer

The above solution is good if all the data is of the same dtype. NumPy arrays are homogeneous containers, so when you call df.values the output is a single NumPy array with one dtype. If the data mixes int and float columns, every value is upcast to a common dtype (float in this case) and the columns lose their original dtypes. Consider df:

     a  b
0  1.0  4
1  2.0  5
2  3.0  6

a    float64
b      int64
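
To see the upcasting concretely, here is a minimal sketch that rebuilds the example frame (the values come from the table above; building it from a dict is my assumption) and inspects what df.values.tolist() gives back:

```python
import pandas as pd

# Rebuild the example frame: 'a' is float64, 'b' is int64
df = pd.DataFrame({'a': [1.0, 2.0, 3.0], 'b': [4, 5, 6]})

# .values collapses both columns into one homogeneous NumPy array
values = df.values
print(values.dtype)

rows = values.tolist()
print(rows)

# Every element is now a float, including the ones that came from 'b'
print(all(isinstance(v, float) for row in rows for v in row))
```

The integers from column b come back as floats, which is the dtype loss described above.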

So if you want to keep the original dtypes, you can do something like:

row_list = df.to_csv(None, header=False, index=False).split('\n')

This returns each row as a string, plus a trailing empty string after the final newline:

['1.0,4', '2.0,5', '3.0,6', '']

Then split each row to get a list of lists. Each element after splitting is a string, so we need to convert it to the required data type:

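def f(row_str):
    # Split one CSV row and restore each column's original type
    row_list = row_str.split(',')
    return [float(row_list[0]), int(row_list[1])]

# Skip the trailing empty string; in Python 3, map() returns an iterator, so wrap it in list()
df_list_of_list = list(map(f, row_list[:-1]))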

[[1.0, 4], [2.0, 5], [3.0, 6]]
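
If the string round trip feels heavy, one alternative (not the approach described above, just a sketch of a different technique) is to iterate over the rows with DataFrame.itertuples. It reads each column separately, so the integer column is not upcast to float. The frame below is the same example df, rebuilt from a dict as an assumption:

```python
import pandas as pd

df = pd.DataFrame({'a': [1.0, 2.0, 3.0], 'b': [4, 5, 6]})

# index=False drops the index; name=None yields plain tuples
df_list_of_list = [list(row) for row in df.itertuples(index=False, name=None)]
print(df_list_of_list)

# 'a' values stay floats, 'b' values stay integers
print(type(df_list_of_list[0][0]), type(df_list_of_list[0][1]))
```

This avoids hard-coding a converter per column, at the cost of a Python-level loop over the rows.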

User · Answer

I'd like to clarify a few things:

1. As other answers have pointed out, the simplest thing to do is use pandas.Series.tolist(). I'm not sure why the top-voted answer leads off with pandas.Series.values.tolist() since, as far as I can tell, it adds syntax and confusion with no added benefit.
2. tst[lookupValue][['SomeCol']] is a dataframe (as stated in the question), not a series (as stated in a comment to the question). This is because tst[lookupValue] is a dataframe, and slicing it with [['SomeCol']] asks for a list of columns (a list that happens to have a length of 1), resulting in a dataframe being returned. If you remove the extra set of brackets, as in tst[lookupValue]['SomeCol'], then you are asking for just that one column rather than a list of columns, and thus you get a series back.
3. You need a series to use pandas.Series.tolist(), so you should definitely skip the second set of brackets in this case. FYI, if you ever end up with a one-column dataframe that isn't easily avoidable like this, you can use pandas.DataFrame.squeeze() to convert it to a series.
4. tst[lookupValue]['SomeCol'] gets a subset of a particular column via chained slicing. It slices once to get a dataframe with only certain rows left, and then it slices again to get a certain column. You can get away with it here since you are just reading, not writing, but the proper way to do it is tst.loc[lookupValue, 'SomeCol'] (which returns a series).
5. Using the syntax from #4, you could reasonably do everything in one line: ID = tst.loc[tst['SomeCol'] == 'SomeValue', 'SomeCol'].tolist()

Demo Code:

import pandas as pd

df = pd.DataFrame({'colA': [1, 2, 1],
                   'colB': [4, 5, 6]})
filter_value = 1

print("df")
print(df)
print(type(df))

# Boolean mask: True for the rows to keep
rows_to_keep = df['colA'] == filter_value
print("\ndf['colA'] == filter_value")
print(rows_to_keep)
print(type(rows_to_keep))

# Chained slicing with a single label returns a Series
result = df[rows_to_keep]['colB']
print("\ndf[rows_to_keep]['colB']")
print(result)
print(type(result))

# A list of labels returns a DataFrame, even for one column
result = df[rows_to_keep][['colB']]
print("\ndf[rows_to_keep][['colB']]")
print(result)
print(type(result))

# squeeze() turns the one-column DataFrame into a Series
result = df[rows_to_keep][['colB']].squeeze()
print("\ndf[rows_to_keep][['colB']].squeeze()")
print(result)
print(type(result))

# Preferred: select rows and column in one .loc call
result = df.loc[rows_to_keep, 'colB']
print("\ndf.loc[rows_to_keep, 'colB']")
print(result)
print(type(result))

result = df.loc[df['colA'] == filter_value, 'colB']
print("\ndf.loc[df['colA'] == filter_value, 'colB']")
print(result)
print(type(result))

ID = df.loc[rows_to_keep, 'colB'].tolist()
print("\ndf.loc[rows_to_keep, 'colB'].tolist()")
print(ID)
print(type(ID))

ID = df.loc[df['colA'] == filter_value, 'colB'].tolist()
print("\ndf.loc[df['colA'] == filter_value, 'colB'].tolist()")
print(ID)
print(type(ID))

Result:

df
   colA  colB
0     1     4
1     2     5
2     1     6
<class 'pandas.core.frame.DataFrame'>

df['colA'] == filter_value
0     True
1    False
2     True
Name: colA, dtype: bool
<class 'pandas.core.series.Series'>

df[rows_to_keep]['colB']
0    4
2    6
Name: colB, dtype: int64
<class 'pandas.core.series.Series'>

df[rows_to_keep][['colB']]
   colB
0     4
2     6
<class 'pandas.core.frame.DataFrame'>

df[rows_to_keep][['colB']].squeeze()
0    4
2    6
Name: colB, dtype: int64
<class 'pandas.core.series.Series'>

df.loc[rows_to_keep, 'colB']
0    4
2    6
Name: colB, dtype: int64
<class 'pandas.core.series.Series'>

df.loc[df['colA'] == filter_value, 'colB']
0    4
2    6
Name: colB, dtype: int64
<class 'pandas.core.series.Series'>

df.loc[rows_to_keep, 'colB'].tolist()
[4, 6]
<class 'list'>

df.loc[df['colA'] == filter_value, 'colB'].tolist()
[4, 6]
<class 'list'>
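
A quick sketch of why the chained form is only safe for reading. The frame and filter_value mirror the demo above; assigning 0 is purely illustrative. With a boolean mask, df[mask] returns a new object, so a chained assignment writes to that temporary and leaves df alone, while .loc writes into df itself. Depending on the pandas version, the chained line emits a warning, which is silenced here:

```python
import warnings

import pandas as pd

df = pd.DataFrame({'colA': [1, 2, 1], 'colB': [4, 5, 6]})
filter_value = 1
rows_to_keep = df['colA'] == filter_value

# Chained indexing: the write lands on a temporary copy, not on df
with warnings.catch_warnings():
    warnings.simplefilter('ignore')
    df[rows_to_keep]['colB'] = 0
after_chained = df['colB'].tolist()

# .loc selects rows and column in one step, so the write reaches df
df.loc[rows_to_keep, 'colB'] = 0
after_loc = df['colB'].tolist()

print(after_chained)
print(after_loc)
```

So for reads either form gives the same values, but getting into the habit of .loc avoids surprises once you start assigning.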

User · Answer

You can use the Series.to_list method.

For example:

import pandas as pd

df = pd.DataFrame({'a': [1, 3, 5, 7, 4, 5, 6, 4, 7, 8, 9],
                   'b': [3, 5, 6, 2, 4, 6, 7, 8, 7, 8, 9]})

print(df['a'].to_list())

Output:

[1, 3, 5, 7, 4, 5, 6, 4, 7, 8, 9]

To drop duplicates you can do one of the following:

>>> df['a'].drop_duplicates().to_list()
[1, 3, 5, 7, 4, 6, 8, 9]
>>> list(set(df['a']))  # as pointed out by EdChum
[1, 3, 4, 5, 6, 7, 8, 9]
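
One thing worth spelling out about the two de-duplication options: drop_duplicates() keeps values in first-seen order, whereas a set has no guaranteed order, so sorting it explicitly makes the result predictable. A small sketch using the same column values as above (Series.unique() is an extra option not mentioned in the answer):

```python
import pandas as pd

s = pd.Series([1, 3, 5, 7, 4, 5, 6, 4, 7, 8, 9], name='a')

# Keeps the first occurrence of each value, in original order
in_order = s.drop_duplicates().tolist()

# unique() also preserves order of appearance; it returns a NumPy array
also_in_order = s.unique().tolist()

# A set is unordered, so sort it if a stable order matters
sorted_unique = sorted(set(s.tolist()))

print(in_order)
print(also_in_order)
print(sorted_unique)
```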

User · Answer

You can use pandas.Series.tolist, e.g.:

import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

Run:

>>> df['a'].tolist()

You will get:

[1, 2, 3]
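
Tying this back to the question, here is a hedged end-to-end sketch. The CSV contents, the in-memory StringIO standing in for the real file, and the OtherCol column name are all made up for illustration; only SomeCol and SomeValue come from the question.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV file in the question
csv_text = "SomeCol,OtherCol\nSomeValue,10\nAnotherValue,20\nSomeValue,30\n"
tst = pd.read_csv(io.StringIO(csv_text))

lookupValue = tst['SomeCol'] == 'SomeValue'

# A single column label gives a Series, which has tolist()
ID = tst.loc[lookupValue, 'OtherCol'].tolist()
print(ID)

# If you already hold a one-column DataFrame, pick the column out first
one_col_frame = tst[lookupValue][['OtherCol']]
ID_from_frame = one_col_frame['OtherCol'].tolist()
print(ID_from_frame)
```

Both routes end in a plain Python list; the first is shorter and avoids the intermediate DataFrame.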
