Get list from pandas DataFrame column headers

Question

I want to get a list of the column headers from a pandas DataFrame. The DataFrame will come from user input, so I won't know how many columns there will be or what they will be called.

For example, if I'm given a DataFrame like this:

    >>> my_dataframe
        y  gdp  cap
    0   1    2    5
    1   2    3    9
    2   8    7    2
    3   3    4    7
    4   6    7    7
    5   4    8    3
    6   8    2    8
    7   9    9   10
    8   6    6    4
    9  10   10    7

I would get a list like this:

    >>> header_list
    ['y', 'gdp', 'cap']

User · Answer

Did some quick tests, and perhaps unsurprisingly the built-in version using dataframe.columns.values.tolist() is the fastest:

    In [1]: %timeit [column for column in df]
    1000 loops, best of 3: 81.6 µs per loop

    In [2]: %timeit df.columns.values.tolist()
    10000 loops, best of 3: 16.1 µs per loop

    In [3]: %timeit list(df)
    10000 loops, best of 3: 44.9 µs per loop

    In [4]: %timeit list(df.columns.values)
    10000 loops, best of 3: 38.4 µs per loop

(I still really like the list(dataframe) though, so thanks EdChum!)
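For anyone who wants to rerun this kind of comparison outside IPython, here is a minimal sketch using the standard-library timeit module. The small df and the candidates dictionary are made up for illustration, and the absolute numbers will differ by machine and pandas version, so treat the output as a rough indication only.

```python
import timeit

import pandas as pd

# Hypothetical sample frame; any DataFrame would do.
df = pd.DataFrame({"y": [1, 2, 8], "gdp": [2, 3, 7], "cap": [5, 9, 2]})

# The four expressions compared in the answer above, wrapped as callables.
candidates = {
    "[column for column in df]": lambda: [column for column in df],
    "df.columns.values.tolist()": lambda: df.columns.values.tolist(),
    "list(df)": lambda: list(df),
    "list(df.columns.values)": lambda: list(df.columns.values),
}

# Every variant should produce the same column names.
results = {label: func() for label, func in candidates.items()}

number = 10_000
for label, func in candidates.items():
    seconds = timeit.timeit(func, number=number)
    print(f"{label:<30} {seconds / number * 1e6:.2f} µs per call")
```

Whatever the timings turn out to be, all four calls return the same names in the same order, so the choice is mostly about speed and readability.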

User · Answer

In the Notebook

For data exploration in the IPython notebook, my preferred way is this:

    sorted(df)

This produces an easy-to-read, alphabetically ordered list.

In a code repository

In code I find it more explicit to do

    df.columns

because it tells others reading your code what you are doing.

User · Answer

I feel the question deserves additional explanation.

As @fixxxer noted, the answer depends on the pandas version you are using in your project, which you can get from the pd.__version__ attribute.

If, like me, you are for some reason using a version of pandas older than 0.16.0 (on Debian Jessie I use 0.14.1), then you need to use:

    df.keys().tolist()

because there is no df.columns method implemented yet.

The advantage of this keys method is that it works even in newer versions of pandas, so it's more universal.
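As a quick sketch of the two steps described above (check the installed version, then use the keys-based call), something like the following should work; the df here is a made-up example.

```python
import pandas as pd

# Hypothetical frame for illustration.
df = pd.DataFrame({"y": [1], "gdp": [2], "cap": [5]})

# The installed pandas version, as a string.
print(pd.__version__)

# keys() returns the column labels; tolist() turns them into a plain list.
headers = df.keys().tolist()
print(headers)
```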

User · Answer

    >>> list(my_dataframe)
    ['y', 'gdp', 'cap']

To list the columns of a dataframe while in debugger mode, use a list comprehension:

    >>> [c for c in my_dataframe]
    ['y', 'gdp', 'cap']

By the way, you can get a sorted list simply by using sorted:

    >>> sorted(my_dataframe)
    ['cap', 'gdp', 'y']

User · Answer

Even though the solution that was provided above is nice, I would also expect something like frame.column_names() to be a function in pandas. Since it is not, maybe it would be nice to use the following syntax. It somehow preserves the feeling that you are using pandas in a proper way by calling the "tolist" function:

    frame.columns.tolist()

User · Answer

There is a built-in method which is the most performant:

    my_dataframe.columns.values.tolist()

.columns returns an Index, .columns.values returns an array, and this has a helper function .tolist to return a list.

If performance is not as important to you, Index objects define a .tolist() method that you can call directly:

    my_dataframe.columns.tolist()

The difference in performance is obvious:

    %timeit df.columns.tolist()
    16.7 µs ± 317 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

    %timeit df.columns.values.tolist()
    1.24 µs ± 12.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

For those who hate typing, you can just call list on df, as so:

    list(df)
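To make the chain above concrete, here is a small sketch that prints the type at each step. The df is a made-up example, and the exact array type behind .values can depend on the pandas version and the dtype of the labels, so the printed names are not guaranteed.

```python
import pandas as pd

# Hypothetical frame for illustration.
df = pd.DataFrame({"y": [1, 2], "gdp": [2, 3], "cap": [5, 9]})

index_obj = df.columns           # a pandas Index
array_obj = df.columns.values    # the array backing that Index
via_values = array_obj.tolist()  # plain Python list built from the array
via_index = index_obj.tolist()   # plain Python list built from the Index

for obj in (index_obj, array_obj, via_values, via_index):
    print(type(obj).__name__)

# Both routes should end in the same list of names.
print(via_values == via_index)
```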

User · Answer

That's available as my_dataframe.columns.

User · Answer

This gives us the names of columns in a list:

    list(my_dataframe.columns)

Another function called tolist() can be used too:

    my_dataframe.columns.tolist()

User · Answer

    %timeit final_df.columns.values.tolist()
    948 ns ± 19.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

    %timeit list(final_df.columns)
    14.2 µs ± 79.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

    %timeit list(final_df.columns.values)
    1.88 µs ± 11.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

    %timeit final_df.columns.tolist()
    12.3 µs ± 27.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

    %timeit list(final_df.head(1).columns)
    163 µs ± 20.6 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

User · Answer

For a quick, neat, visual check, try this:

    for col in df.columns:
        print(col)

User · Answer

Surprised I haven't seen this posted so far, so I'll just leave this here.

Extended Iterable Unpacking (Python 3.5+): [*df] and Friends

Unpacking generalizations (PEP 448) were introduced with Python 3.5. So, the following operations are all possible.

    df = pd.DataFrame('x', columns=['A', 'B', 'C'], index=range(5))
    df

       A  B  C
    0  x  x  x
    1  x  x  x
    2  x  x  x
    3  x  x  x
    4  x  x  x

If you want a list:

    [*df]
    # ['A', 'B', 'C']

Or, if you want a set:

    {*df}
    # {'A', 'B', 'C'}

Or, if you want a tuple:

    *df,  # Please note the trailing comma
    # ('A', 'B', 'C')

Or, if you want to store the result somewhere:

    *cols, = df  # A wild comma appears, again
    cols
    # ['A', 'B', 'C']

If you're the kind of person who converts coffee to typing sounds, well, this is going to consume your coffee more efficiently ;)

P.S.: if performance is important, you will want to ditch the solutions above in favour of

    df.columns.to_numpy().tolist()
    # ['A', 'B', 'C']

This is similar to Ed Chum's answer, but updated for v0.24, where .to_numpy() is preferred to the use of .values. See this answer (by me) for more information.

Visual Check

Since I've seen this discussed in other answers, you can utilise iterable unpacking (no need for explicit loops).

    print(*df)
    A B C

    print(*df, sep='\n')
    A
    B
    C

Critique of Other Methods

Don't use an explicit for loop for an operation that can be done in a single line (list comprehensions are okay).

Next, using sorted(df) does not preserve the original order of the columns. For that, you should use list(df) instead.

Next, list(df.columns) and list(df.columns.values) are poor suggestions (as of the current version, v0.24). Both Index (returned from df.columns) and NumPy arrays (returned by df.columns.values) define a .tolist() method, which is faster and more idiomatic.

Lastly, listification, i.e. list(df), should only be used as a concise alternative to the aforementioned methods for Python <= 3.4, where extended unpacking is not available.
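The point about ordering is easy to miss when the column names already happen to be alphabetical, so here is a small sketch with the columns deliberately out of order. The frame and its labels are made up for illustration.

```python
import pandas as pd

# Hypothetical frame whose columns are NOT in alphabetical order.
df = pd.DataFrame("x", columns=["C", "A", "B"], index=range(5))

as_list = [*df]      # unpack the column labels into a list
as_set = {*df}       # ... into a set (no ordering)
as_tuple = (*df,)    # ... into a tuple; the trailing comma is required
*cols, = df          # starred assignment; cols is a list

by_numpy = df.columns.to_numpy().tolist()
alphabetical = sorted(df)

print(as_list)
print(alphabetical)
```

The unpacking forms and list(df) keep the order in which the columns appear in the frame, while sorted(df) reorders them.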

User · Answer

It's interesting, but df.columns.values.tolist() is almost 3 times faster than df.columns.tolist(), although I thought that they were the same:

    In [97]: %timeit df.columns.values.tolist()
    100000 loops, best of 3: 2.97 µs per loop

    In [98]: %timeit df.columns.tolist()
    10000 loops, best of 3: 9.67 µs per loop

User · Answer

This solution lists all the columns of your object my_dataframe:

    print(list(my_dataframe))

User · Answer

    list(df.columns)

This gives you the list of column names of a data frame df.

User · Answer

As answered by Simeon Visser, you could do

    list(my_dataframe.columns.values)

or

    list(my_dataframe)  # for less typing

But I think the sweet spot is:

    list(my_dataframe.columns)

It is explicit and, at the same time, not unnecessarily long.

User · Answer

A DataFrame follows the dict-like convention of iterating over the "keys" of the objects.

    my_dataframe.keys()

Create a list of keys/columns, either with the object method to_list() or the pythonic way:

    my_dataframe.keys().to_list()
    list(my_dataframe.keys())

Basic iteration on a DataFrame returns column labels:

    [column for column in my_dataframe]

Do not convert a DataFrame into a list just to get the column labels. Do not stop thinking while looking for convenient code samples.

    xlarge = pd.DataFrame(np.arange(100000000).reshape(10000, 10000))
    list(xlarge)         # compute time and memory consumption depend on dataframe size - O(N)
    list(xlarge.keys())  # constant time operation - O(1)
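A small sketch of the dict-like behaviour described above, using a made-up three-column frame: keys() hands back the column Index, and iterating the frame walks over the same labels.

```python
import pandas as pd

# Hypothetical frame for illustration.
df = pd.DataFrame({"y": [1, 2], "gdp": [2, 3], "cap": [5, 9]})

keys_index = df.keys()               # the column labels, as an Index
via_method = df.keys().to_list()     # Index method
via_builtin = list(df.keys())        # built-in list()
via_iteration = [column for column in df]  # iterating the frame itself

print(keys_index.equals(df.columns))
print(via_method, via_builtin, via_iteration)
```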

User · Answer

    n = []
    for i in my_dataframe.columns:
        n.append(i)
    print(n)

User · Answer

It gets even simpler (as of pandas 0.16.0):

    df.columns.tolist()

will give you the column names in a nice list.

User · Answer

You can get the values as a list by doing:

    list(my_dataframe.columns.values)

Also, you can simply use (as shown in Ed Chum's answer):

    list(my_dataframe)

User · Answer

If the DataFrame happens to have an Index or MultiIndex and you want those included as column names too:

    names = list(filter(None, df.index.names + df.columns.values.tolist()))

It avoids calling reset_index(), which has an unnecessary performance hit for such a simple operation.

I've run into needing this more often because I'm shuttling data from databases where the dataframe index maps to a primary/unique key, but is really just another "column" to me. It would probably make sense for pandas to have a built-in method for something like this (totally possible I've missed it).
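Here is a minimal sketch of that idea on a made-up frame with a named index. As a small variation, the index names are wrapped in list() before concatenating, so the result does not rely on how pandas' own container type handles +. One caveat worth knowing: filter(None, ...) drops every falsy label, so a column literally named 0 or "" would also be removed.

```python
import pandas as pd

# Hypothetical frame whose index plays the role of a primary key.
df = pd.DataFrame(
    {"id": [1, 2, 3], "y": [1, 2, 8], "gdp": [2, 3, 7]}
).set_index("id")

# Index names first, then column names; unnamed levels (None) are filtered out.
names = list(filter(None, list(df.index.names) + df.columns.values.tolist()))
print(names)

# With a default, unnamed index only the column names survive.
unnamed = pd.DataFrame({"y": [1, 2]})
unnamed_names = list(
    filter(None, list(unnamed.index.names) + unnamed.columns.values.tolist())
)
print(unnamed_names)
```

For this example the result matches df.reset_index().columns.tolist(), without building a new frame.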
