Renaming columns in Pandas

Question

I have a pandas DataFrame whose column labels I need to replace. I'd like to change the column names in a DataFrame A where the original column names are:

    ['$a', '$b', '$c', '$d', '$e']

to

    ['a', 'b', 'c', 'd', 'e']

I have the edited column names stored in a list, but I don't know how to replace the column names.

User · Answer

You could use str.slice for that:

    df.columns = df.columns.str.slice(1)
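
One thing to keep in mind is that slicing drops the first character whether or not it is the prefix you want to remove. A minimal sketch of the difference, assuming a made-up frame in which one column ("id") has no `$` prefix:

```python
import pandas as pd

# Hypothetical frame: two labels carry the '$' prefix, one does not.
df = pd.DataFrame({"$a": [1, 2], "$b": [3, 4], "id": [5, 6]})

# str.slice(1) removes the first character unconditionally.
sliced = list(df.columns.str.slice(1))

# str.lstrip("$") only removes leading '$' characters.
stripped = list(df.columns.str.lstrip("$"))

print(sliced)
print(stripped)
```

If every label is known to start with the prefix, both give the same result.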

User · Answer

The rename method can take a function, for example:

    In [11]: df.columns
    Out[11]: Index([u'$a', u'$b', u'$c', u'$d', u'$e'], dtype=object)

    In [12]: df.rename(columns=lambda x: x[1:], inplace=True)

    In [13]: df.columns
    Out[13]: Index([u'a', u'b', u'c', u'd', u'e'], dtype=object)

User · Answer

Note that the approaches in previous answers do not work for a MultiIndex. For a MultiIndex, you need to do something like the following:

    >>> df = pd.DataFrame({('$a', '$x'): [1, 2], ('$b', '$y'): [3, 4], ('e', 'f'): [5, 6]})
    >>> df
       $a $b  e
       $x $y  f
    0   1  3  5
    1   2  4  6
    >>> rename = {('$a', '$x'): ('a', 'x'), ('$b', '$y'): ('b', 'y')}
    >>> df.columns = pd.MultiIndex.from_tuples([
    ...     rename.get(item, item) for item in df.columns.tolist()])
    >>> df
       a  b  e
       x  y  f
    0  1  3  5
    1  2  4  6
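
If there is no explicit mapping at hand, the same rebuild can be done by cleaning each level of every column tuple. This is only a sketch and assumes the unwanted character is a leading `$`:

```python
import pandas as pd

df = pd.DataFrame({("$a", "$x"): [1, 2], ("$b", "$y"): [3, 4], ("e", "f"): [5, 6]})

# Rebuild every column tuple with the leading '$' removed from each level.
df.columns = pd.MultiIndex.from_tuples(
    [tuple(label.lstrip("$") for label in col) for col in df.columns]
)

print(df.columns.tolist())
```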

User · Answer

Use:

    old_names = ['$a', '$b', '$c', '$d', '$e']
    new_names = ['a', 'b', 'c', 'd', 'e']
    df.rename(columns=dict(zip(old_names, new_names)), inplace=True)

This way you can manually edit new_names as you wish. It works well when you need to rename only a few columns to correct misspellings, remove accents or special characters, etc.

User · Answer

Pandas 0.21+ Answer

There have been some significant updates to column renaming in version 0.21.

The rename method has added the axis parameter, which may be set to columns or 1. This update makes this method match the rest of the pandas API. It still has the index and columns parameters, but you are no longer forced to use them.

The set_axis method with inplace set to False enables you to rename all the index or column labels with a list.

Examples for Pandas 0.21+

Construct sample DataFrame:

    df = pd.DataFrame({'$a': [1, 2], '$b': [3, 4],
                       '$c': [5, 6], '$d': [7, 8],
                       '$e': [9, 10]})

       $a  $b  $c  $d  $e
    0   1   3   5   7   9
    1   2   4   6   8  10

Using rename with axis='columns' or axis=1

    df.rename({'$a': 'a', '$b': 'b', '$c': 'c', '$d': 'd', '$e': 'e'}, axis='columns')

or

    df.rename({'$a': 'a', '$b': 'b', '$c': 'c', '$d': 'd', '$e': 'e'}, axis=1)

Both result in the following:

       a  b  c  d   e
    0  1  3  5  7   9
    1  2  4  6  8  10

It is still possible to use the old method signature:

    df.rename(columns={'$a': 'a', '$b': 'b', '$c': 'c', '$d': 'd', '$e': 'e'})

The rename function also accepts functions that will be applied to each column name.

    df.rename(lambda x: x[1:], axis='columns')

or

    df.rename(lambda x: x[1:], axis=1)

Using set_axis with a list and inplace=False

You can supply a list to the set_axis method that is equal in length to the number of columns (or index). In version 0.21, inplace defaults to True, with the default slated to change to False in later releases. In pandas 2.0 and later the inplace argument of set_axis no longer exists and the method always returns a new object, so omit inplace=False there.

    df.set_axis(['a', 'b', 'c', 'd', 'e'], axis='columns', inplace=False)

or

    df.set_axis(['a', 'b', 'c', 'd', 'e'], axis=1, inplace=False)

Why not use df.columns = ['a', 'b', 'c', 'd', 'e']?

There is nothing wrong with assigning columns directly like this. It is a perfectly good solution.

The advantage of using set_axis is that it can be used as part of a method chain and that it returns a new copy of the DataFrame. Without it, you would have to store your intermediate steps of the chain in another variable before reassigning the columns.

    # new for pandas 0.21+
    df.some_method1()
      .some_method2()
      .set_axis()
      .some_method3()

    # old way
    df1 = df.some_method1()
            .some_method2()
    df1.columns = columns
    df1.some_method3()
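
To make the chaining point concrete, here is a small sketch. The sort and the `total` column are made-up steps standing in for the placeholder methods, and the call omits `inplace`, which assumes a pandas version where set_axis returns a new object by default:

```python
import pandas as pd

df = pd.DataFrame({"$a": [1, 2], "$b": [3, 4], "$c": [5, 6]})

result = (
    df.sort_values("$a", ascending=False)          # stand-in for some_method1
      .set_axis(["a", "b", "c"], axis="columns")   # rename mid-chain
      .assign(total=lambda d: d["a"] + d["b"])     # later steps see the new names
)

print(result)
print(list(df.columns))  # the original frame keeps its old labels
```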

User · Answer

Assuming you can use a regular expression, this solution removes the need to write out the new names by hand:

    import pandas as pd
    import re

    srch = re.compile(r"\w+")

    data = pd.read_csv("CSV_FILE.csv")
    cols = data.columns
    new_cols = list(map(lambda v: v.group(), (list(map(srch.search, cols)))))
    data.columns = new_cols

User · Answer

Let's say your DataFrame has the columns '$a' through '$e' from the question. You can rename the columns using two methods.

Using dataframe.columns = [list]

    df.columns = ['a', 'b', 'c', 'd', 'e']

The limitation of this method is that even if only one column has to be changed, the full column list has to be passed. Also, this method is not applicable to index labels. For example, if you passed this:

    df.columns = ['a', 'b', 'c', 'd']

This will throw an error: Length mismatch: Expected axis has 5 elements, new values have 4 elements.

Another method is the pandas rename() method, which is used to rename any index, column or row:

    df = df.rename(columns={'$a': 'a'})

Similarly, you can change any rows or columns.
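
Both behaviours can be checked on a small made-up frame. This is only a sketch; the three-column frame is an assumption for illustration:

```python
import pandas as pd

df = pd.DataFrame({"$a": [1], "$b": [2], "$c": [3]})

# Direct assignment needs exactly one label per column.
try:
    df.columns = ["a", "b"]
except ValueError as exc:
    message = str(exc)
    print(message)

# rename() only touches the labels named in the mapping.
renamed = df.rename(columns={"$a": "a"})
print(list(renamed.columns))
```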

User · Answer

Since you only want to remove the $ sign in all column names, you could just do:

    df = df.rename(columns=lambda x: x.replace('$', ''))

OR

    df.rename(columns=lambda x: x.replace('$', ''), inplace=True)

User · Answer

My method is generic: you can future-proof it by adding further delimiters, comma-separated, to the delimiters variable.

Working Code

    import pandas as pd
    import re

    df = pd.DataFrame({'$a': [1, 2], '$b': [3, 4], '$c': [5, 6], '$d': [7, 8], '$e': [9, 10]})

    delimiters = '$'
    matchPattern = '|'.join(map(re.escape, delimiters))
    df.columns = [re.split(matchPattern, i)[1] for i in df.columns]

Output:

    >>> df
       $a  $b  $c  $d  $e
    0   1   3   5   7   9
    1   2   4   6   8  10

    >>> df
       a  b  c  d   e
    0  1  3  5  7   9
    1  2  4  6  8  10

User · Answer

    df = pd.DataFrame({'$a': [1], '$b': [1], '$c': [1], '$d': [1], '$e': [1]})

If your new list of columns is in the same order as the existing columns, the assignment is simple:

    new_cols = ['a', 'b', 'c', 'd', 'e']
    df.columns = new_cols
    >>> df
       a  b  c  d  e
    0  1  1  1  1  1

If you had a dictionary keyed on old column names to new column names, you could do the following:

    d = {'$a': 'a', '$b': 'b', '$c': 'c', '$d': 'd', '$e': 'e'}
    df.columns = df.columns.map(lambda col: d[col])  # Or `.map(d.get)` as pointed out by @PiRSquared.
    >>> df
       a  b  c  d  e
    0  1  1  1  1  1

If you don't have a list or dictionary mapping, you could strip the leading $ symbol via a list comprehension:

    df.columns = [col[1:] if col[0] == '$' else col for col in df]
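
When the dictionary covers only some of the columns, `d[col]` raises a KeyError for the rest. A sketch of one way around that, assuming a made-up column `keep` that should stay as it is, is to fall back to the original label:

```python
import pandas as pd

df = pd.DataFrame({"$a": [1], "$b": [1], "keep": [1]})
d = {"$a": "a", "$b": "b"}  # partial mapping: 'keep' is deliberately absent

# d.get(col, col) returns the old label when no replacement is defined.
df.columns = [d.get(col, col) for col in df.columns]

print(list(df.columns))
```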

User · Answer

It is really simple. Just use:

    df.columns = ['Name1', 'Name2', 'Name3']

The names are assigned in the order you list them, so the list needs one entry per column.

User · Answer

    df.rename(index=str, columns={'A': 'a', 'B': 'b'})

See the pandas.DataFrame.rename documentation.

User · Answer

Another way we could replace the original column labels is by stripping the unwanted characters (here '$') from the original column labels. This could be done by running a for loop over df.columns and collecting the stripped labels in a new list. Instead, we can do this neatly in a single statement by using a list comprehension like below:

    df.columns = [col.strip('$') for col in df.columns]

(The strip method in Python strips the given character from the beginning and end of the string.)
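
Because strip works on both ends, a label that legitimately ends in `$` loses that character too. A short sketch of the difference against lstrip, using a made-up label `$b$`:

```python
import pandas as pd

df = pd.DataFrame({"$a": [1], "$b$": [2], "c": [3]})

# strip removes the character from both ends of each label.
both_ends = [col.strip("$") for col in df.columns]

# lstrip removes it from the beginning only.
leading_only = [col.lstrip("$") for col in df.columns]

print(both_ends)
print(leading_only)
```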

User · Answer

Just assign it to the .columns attribute:

    >>> df = pd.DataFrame({'$a': [1, 2], '$b': [10, 20]})
    >>> df
       $a  $b
    0   1  10
    1   2  20

    >>> df.columns = ['a', 'b']
    >>> df
       a   b
    0  1  10
    1  2  20

User · Answer

Let's understand renaming by a small example.

Renaming columns using mapping:

    df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})  # Creating a df with column names A and B
    df.rename({"A": "new_a", "B": "new_b"}, axis='columns', inplace=True)  # Renaming column A to 'new_a' and B to 'new_b'

Output:

       new_a  new_b
    0      1      4
    1      2      5
    2      3      6

Renaming index/Row_Name using mapping:

    df.rename({0: "x", 1: "y", 2: "z"}, axis='index', inplace=True)  # Row names are replaced by 'x', 'y', and 'z'

Output:

       new_a  new_b
    x      1      4
    y      2      5
    z      3      6

User · Answer

Here's a nifty little function I like to use to cut down on typing:

    def rename(data, oldnames, newname):
        if type(oldnames) == str:  # Input can be a string or list of strings
            oldnames = [oldnames]  # When renaming multiple columns
            newname = [newname]  # Make sure you pass the corresponding list of new names
        i = 0
        for name in oldnames:
            oldvar = [c for c in data.columns if name in c]
            if len(oldvar) == 0:
                raise ValueError("Sorry, couldn't find that column in the dataset")
            if len(oldvar) > 1:  # Doesn't have to be an exact match
                print("Found multiple columns that matched " + str(name) + ": ")
                for c in oldvar:
                    print(str(oldvar.index(c)) + ": " + str(c))
                ind = input('Please enter the index of the column you would like to rename: ')
                oldvar = oldvar[int(ind)]
            if len(oldvar) == 1:
                oldvar = oldvar[0]
            data = data.rename(columns={oldvar: newname[i]})
            i += 1
        return data

Here is an example of how it works:

    In [2]: df = pd.DataFrame(np.random.randint(0, 10, size=(10, 4)), columns=['col1', 'col2', 'omg', 'idk'])
    # First list = existing variables
    # Second list = new names for those variables
    In [3]: df = rename(df, ['col', 'omg'], ['first', 'ohmy'])
    Found multiple columns that matched col:
    0: col1
    1: col2

    Please enter the index of the column you would like to rename: 0

    In [4]: df.columns
    Out[4]: Index(['first', 'col2', 'ohmy', 'idk'], dtype='object')

User · Answer

I needed to rename features for XGBoost, and it didn't like any of these:

    import re
    regex = r"[!\"#$%&'()*+,\-.\/:;<=>?@[\\\]^_`{|}~ ]+"
    X_trn.columns = X_trn.columns.str.replace(regex, '_', regex=True)
    X_tst.columns = X_tst.columns.str.replace(regex, '_', regex=True)
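
A narrower variant of the same idea, as a sketch only: it assumes the characters XGBoost objects to in feature names are `[`, `]` and `<`, and it uses made-up column names. The `>` is included in the pattern just for symmetry.

```python
import pandas as pd

# Hypothetical feature frame with awkward labels.
X_trn = pd.DataFrame({"price [usd]": [1.0], "a<b": [2.0], "ok": [3.0]})

# Replace each bracket or angle character with an underscore.
X_trn.columns = X_trn.columns.str.replace(r"[\[\]<>]", "_", regex=True)

print(list(X_trn.columns))
```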

User · Answer

If you've got the dataframe, df.columns gives you the labels, which you can manipulate as a list and then reassign to your dataframe as the column names:

    old_columns = df.columns
    new_columns = [row.replace("$", "") for row in old_columns]
    df.rename(columns=dict(zip(old_columns, new_columns)), inplace=True)
    df.head()  # To validate the output

Best way? I don't know. A way - yes.

A better way of evaluating all the main techniques put forward in the answers to the question is below, using cProfile to gauge execution time. @kadee, @kaitlyn, and @eumiro had the functions with the fastest execution times, though these functions are so fast we're comparing the rounding of 0.000 and 0.001 seconds for all the answers. Moral: my answer above likely isn't the 'best' way.

    import pandas as pd
    import cProfile, pstats, re

    old_names = ['$a', '$b', '$c', '$d', '$e']
    new_names = ['a', 'b', 'c', 'd', 'e']
    col_dict = {'$a': 'a', '$b': 'b', '$c': 'c', '$d': 'd', '$e': 'e'}

    df = pd.DataFrame({'$a': [1, 2], '$b': [10, 20], '$c': ['bleep', 'blorp'], '$d': [1, 2], '$e': ['texa$', '']})

    df.head()

    def eumiro(df, nn):
        df.columns = nn
        # This direct renaming approach is duplicated in methodology in several other answers
        return df

    def lexual1(df):
        return df.rename(columns=col_dict)

    def lexual2(df, col_dict):
        return df.rename(columns=col_dict, inplace=True)

    def Panda_Master_Hayden(df):
        return df.rename(columns=lambda x: x[1:], inplace=True)

    def paulo1(df):
        return df.rename(columns=lambda x: x.replace('$', ''))

    def paulo2(df):
        return df.rename(columns=lambda x: x.replace('$', ''), inplace=True)

    def migloo(df, on, nn):
        return df.rename(columns=dict(zip(on, nn)), inplace=True)

    def kadee(df):
        return df.columns.str.replace('$', '', regex=False)

    def awo(df):
        old_columns = df.columns
        new_columns = [row.replace("$", "") for row in old_columns]
        return df.rename(columns=dict(zip(old_columns, new_columns)), inplace=True)

    def kaitlyn(df):
        df.columns = [col.strip('$') for col in df.columns]
        return df

    print('eumiro')
    cProfile.run('eumiro(df, new_names)')
    print('lexual1')
    cProfile.run('lexual1(df)')
    print('lexual2')
    cProfile.run('lexual2(df, col_dict)')
    print('andy hayden')
    cProfile.run('Panda_Master_Hayden(df)')
    print('paulo1')
    cProfile.run('paulo1(df)')
    print('paulo2')
    cProfile.run('paulo2(df)')
    print('migloo')
    cProfile.run('migloo(df, old_names, new_names)')
    print('kadee')
    cProfile.run('kadee(df)')
    print('awo')
    cProfile.run('awo(df)')
    print('kaitlyn')
    cProfile.run('kaitlyn(df)')

User · Answer

RENAME SPECIFIC COLUMNS

Use the df.rename() function and refer to the columns to be renamed. Not all the columns have to be renamed:

    df = df.rename(columns={'oldName1': 'newName1', 'oldName2': 'newName2'})
    # Or rename the existing DataFrame (rather than creating a copy)
    df.rename(columns={'oldName1': 'newName1', 'oldName2': 'newName2'}, inplace=True)

Minimal Code Example

    df = pd.DataFrame('x', index=range(3), columns=list('abcde'))
    df

       a  b  c  d  e
    0  x  x  x  x  x
    1  x  x  x  x  x
    2  x  x  x  x  x

The following methods all work and produce the same output:

    df2 = df.rename({'a': 'X', 'b': 'Y'}, axis=1)  # new method
    df2 = df.rename({'a': 'X', 'b': 'Y'}, axis='columns')
    df2 = df.rename(columns={'a': 'X', 'b': 'Y'})  # old method

    df2

       X  Y  c  d  e
    0  x  x  x  x  x
    1  x  x  x  x  x
    2  x  x  x  x  x

Remember to assign the result back, as the modification is not in-place. Alternatively, specify inplace=True:

    df.rename({'a': 'X', 'b': 'Y'}, axis=1, inplace=True)
    df

       X  Y  c  d  e
    0  x  x  x  x  x
    1  x  x  x  x  x
    2  x  x  x  x  x

From v0.25, you can also specify errors='raise' to raise errors if an invalid column-to-rename is specified. See the v0.25 rename() docs.

REASSIGN COLUMN HEADERS

Use df.set_axis() with axis=1 and inplace=False (to return a copy):

    df2 = df.set_axis(['V', 'W', 'X', 'Y', 'Z'], axis=1, inplace=False)
    df2

       V  W  X  Y  Z
    0  x  x  x  x  x
    1  x  x  x  x  x
    2  x  x  x  x  x

This returns a copy, but you can modify the DataFrame in-place by setting inplace=True (this is the default behaviour for versions <=0.24 but is likely to change in the future). Note that pandas 2.0 removed the inplace argument from set_axis, so on current versions call it without inplace.

You can also assign headers directly:

    df.columns = ['V', 'W', 'X', 'Y', 'Z']
    df

       V  W  X  Y  Z
    0  x  x  x  x  x
    1  x  x  x  x  x
    2  x  x  x  x  x
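
The effect of errors='raise' is easiest to see with a mapping that contains a label the frame does not have. In this sketch, 'zzz' is a made-up missing column:

```python
import pandas as pd

df = pd.DataFrame("x", index=range(3), columns=list("abcde"))
mapping = {"a": "X", "zzz": "Y"}  # 'zzz' is not a column of df

# Default behaviour: unknown labels in the mapping are silently ignored.
ignored = df.rename(columns=mapping)
print(list(ignored.columns))

# With errors='raise', the unknown label triggers a KeyError instead.
raised = False
try:
    df.rename(columns=mapping, errors="raise")
except KeyError:
    raised = True
print(raised)
```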

User · Answer

Column names vs Names of Series

I would like to explain a bit what happens behind the scenes.

DataFrames are a set of Series. A Series in turn wraps a NumPy array and carries a .name attribute. This is the name of the Series. Pandas seldom respects this attribute, but it lingers in places and can be used to hack some pandas behaviors.

Naming the list of columns

A lot of answers here talk about the df.columns attribute being a list when in fact it is an Index object. This means it has a .name attribute.

This is what happens if you decide to fill in the name of the columns Index:

    df.columns = ['column_one', 'column_two']
    df.columns.names = ['name of the list of columns']
    df.index.names = ['name of the index']

    name of the list of columns  column_one  column_two
    name of the index
    0                                     4           1
    1                                     5           2
    2                                     6           3

Note that the name of the index always appears one row lower.

Artefacts that linger

The .name attribute lingers on sometimes. If you set df.columns = ['one', 'two'] then df.one.name will be 'one'.

If you set df.one.name = 'three' then df.columns will still give you ['one', 'two'], and df.one.name will give you 'three'.

BUT

pd.DataFrame(df.one) will return

       three
    0      1
    1      2
    2      3

Because pandas reuses the .name of the already defined Series.

Multi-level column names

Pandas has ways of doing multi-layered column names. There is not so much magic involved, but I wanted to cover this in my answer too since I don't see anyone picking up on this here.

       one
       one  two
    0    4    1
    1    5    2
    2    6    3

This is easily achievable by setting columns to lists, like this:

    df.columns = [['one', 'one'], ['one', 'two']]
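
A short sketch of the naming behaviour described above, using made-up data. It holds the column in a variable rather than relying on attribute access, which is an assumption made to keep the example independent of how pandas caches column Series:

```python
import pandas as pd

df = pd.DataFrame({"column_one": [4, 5, 6], "column_two": [1, 2, 3]})

# df.columns is an Index, and it can be given a name of its own.
print(type(df.columns).__name__)
df.columns.name = "name of the list of columns"
df.index.name = "name of the index"

# A Series keeps its own .name, independent of the frame's column labels.
s = df["column_one"]
s.name = "three"
rebuilt = pd.DataFrame(s)

print(list(df.columns))       # still the original labels
print(list(rebuilt.columns))  # picks up the Series name
```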

User · Answer

If you have to deal with loads of columns named by a providing system outside your control, I came up with the following approach, which combines a general rule and specific replacements in one go.

First create a dictionary from the dataframe column names, using a regular expression to throw away certain suffixes of the column names, and then add specific replacements to the dictionary to name core columns as expected later in the receiving database.

This is then applied to the dataframe in one go.

    mapping = dict(zip(df.columns, df.columns.str.replace(r'(:S$|:C1$|:L$|:D$|\.Serial:L$)', '', regex=True)))
    mapping['brand_timeseries:C1'] = 'BTS'
    mapping['respid:L'] = 'RespID'
    mapping['country:C1'] = 'CountryID'
    mapping['pim1:D'] = 'pim_actual'
    df.rename(columns=mapping, inplace=True)

User · Answer

In addition to the solutions already provided, you can replace all the columns while you are reading the file. We can use names and header=0 to do that.

First, we create a list of the names that we would like to use as our column names:

    import pandas as pd

    ufo_cols = ['city', 'color reported', 'shape reported', 'state', 'time']

    ufo = pd.read_csv('link to the file you are using', names=ufo_cols, header=0)

In this case, all the column names will be replaced with the names you have in your list.
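
The same call can be tried without a file by reading from an in-memory string. This is a sketch; the CSV content below is made up for illustration:

```python
import io

import pandas as pd

# Made-up CSV text standing in for the real file.
csv_text = (
    "City,Colors Reported,Shape Reported,State,Time\n"
    "Ithaca,,TRIANGLE,NY,6/1/1930 22:00\n"
    "Willingboro,,OTHER,NJ,6/30/1930 20:00\n"
)

ufo_cols = ["city", "color reported", "shape reported", "state", "time"]

# header=0 marks the first row as the header to discard; names supplies the replacements.
ufo = pd.read_csv(io.StringIO(csv_text), names=ufo_cols, header=0)

print(list(ufo.columns))
print(len(ufo))
```

Without header=0, the original header row would be read as a data row.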

User · Answer

Another option is to rename using a regular expression:

    import pandas as pd
    import re

    df = pd.DataFrame({'$a': [1, 2], '$b': [3, 4], '$c': [5, 6]})

    df = df.rename(columns=lambda x: re.sub(r'\$', '', x))
    >>> df
       a  b  c
    0  1  3  5
    1  2  4  6

User · Answer

    df.columns = ['a', 'b', 'c', 'd', 'e']

It will replace the existing names with the names you provide, in the order you provide them.

User · Answer

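As documented in Working with text data:

    df.columns = df.columns.str.replace('$', '', regex=False)

Passing regex=False makes sure the '$' is treated as a literal character rather than as the regular-expression end-of-string anchor.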
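
A quick sketch of the two spellings that treat `$` as a literal character, either a plain-string replacement or an escaped pattern. The two-column frame is made up:

```python
import pandas as pd

df = pd.DataFrame({"$a": [1], "$b": [2]})

# Plain string replacement: '$' is matched literally.
literal = list(df.columns.str.replace("$", "", regex=False))

# Regular expression: escape '$' and anchor it to the start of the label.
anchored = list(df.columns.str.replace(r"^\$", "", regex=True))

print(literal)
print(anchored)
```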

User · Answer

Renaming columns in pandas is an easy task:

    df.rename(columns={'$a': 'a', '$b': 'b', '$c': 'c', '$d': 'd', '$e': 'e'}, inplace=True)

User · Answer

One line or Pipeline solutions

I'll focus on two things:

OP clearly states "I have the edited column names stored it in a list, but I don't know how to replace the column names." I do not want to solve the problem of how to replace '$' or strip the first character off of each column header. OP has already done this step. Instead I want to focus on replacing the existing columns object with a new one given a list of replacement column names.

df.columns = new, where new is the list of new column names, is as simple as it gets. The drawback of this approach is that it requires editing the existing dataframe's columns attribute and it isn't done inline. I'll show a few ways to perform this via pipelining without editing the existing dataframe.

Setup 1

To focus on the need to rename or replace column names with a pre-existing list, I'll create a new sample dataframe df with initial column names and unrelated new column names.

    df = pd.DataFrame({'Jack': [1, 2], 'Mahesh': [3, 4], 'Xin': [5, 6]})
    new = ['x098', 'y765', 'z432']

    df

       Jack  Mahesh  Xin
    0     1       3    5
    1     2       4    6

Solution 1: pd.DataFrame.rename

It has been said already that if you had a dictionary mapping the old column names to new column names, you could use pd.DataFrame.rename.

    d = {'Jack': 'x098', 'Mahesh': 'y765', 'Xin': 'z432'}
    df.rename(columns=d)

       x098  y765  z432
    0     1     3     5
    1     2     4     6

However, you can easily create that dictionary and include it in the call to rename. The following takes advantage of the fact that when iterating over df, we iterate over each column name.

    # Given just a list of new column names
    df.rename(columns=dict(zip(df, new)))

       x098  y765  z432
    0     1     3     5
    1     2     4     6

This works great if your original column names are unique. But if they are not, then this breaks down.

Setup 2: Non-unique columns

    df = pd.DataFrame(
        [[1, 3, 5], [2, 4, 6]],
        columns=['Mahesh', 'Mahesh', 'Xin']
    )
    new = ['x098', 'y765', 'z432']

    df

       Mahesh  Mahesh  Xin
    0       1       3    5
    1       2       4    6

Solution 2: pd.concat using the keys argument

First, notice what happens when we attempt to use solution 1:

    df.rename(columns=dict(zip(df, new)))

       y765  y765  z432
    0     1     3     5
    1     2     4     6

We didn't map the new list as the column names. We ended up repeating y765. Instead, we can use the keys argument of the pd.concat function while iterating through the columns of df.

    pd.concat([c for _, c in df.items()], axis=1, keys=new)

       x098  y765  z432
    0     1     3     5
    1     2     4     6

Solution 3: Reconstruct

This should only be used if you have a single dtype for all columns. Otherwise, you'll end up with dtype object for all columns, and converting them back requires more dictionary work.

Single dtype

    pd.DataFrame(df.values, df.index, new)

       x098  y765  z432
    0     1     3     5
    1     2     4     6

Mixed dtype

    pd.DataFrame(df.values, df.index, new).astype(dict(zip(new, df.dtypes)))

       x098  y765  z432
    0     1     3     5
    1     2     4     6

Solution 4

This is a gimmicky trick with transpose and set_index. pd.DataFrame.set_index allows us to set an index inline, but there is no corresponding set_columns. So we can transpose, then set_index, and transpose back. However, the same single dtype versus mixed dtype caveat from solution 3 applies here.

Single dtype

    df.T.set_index(np.asarray(new)).T

       x098  y765  z432
    0     1     3     5
    1     2     4     6

Mixed dtype

    df.T.set_index(np.asarray(new)).T.astype(dict(zip(new, df.dtypes)))

       x098  y765  z432
    0     1     3     5
    1     2     4     6

Solution 5

Use a lambda in pd.DataFrame.rename that cycles through each element of new. In this solution, we pass a lambda that takes x but then ignores it. It also takes a y but doesn't expect it. Instead, an iterator is given as a default value, and I can then use that to cycle through one at a time without regard to what the value of x is.

    df.rename(columns=lambda x, y=iter(new): next(y))

       x098  y765  z432
    0     1     3     5
    1     2     4     6

And as pointed out to me by the folks in sopython chat, if I add a * in between x and y, I can protect my y variable. Though, in this context I don't believe it needs protecting. It is still worth mentioning.

    df.rename(columns=lambda x, *, y=iter(new): next(y))

       x098  y765  z432
    0     1     3     5
    1     2     4     6
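
The duplicate-label problem can be reproduced in a few lines. As an additional option that is not part of the solutions above, set_axis also assigns labels by position and returns a new frame; the sketch assumes a pandas version where set_axis returns a new object when inplace is not passed:

```python
import pandas as pd

df = pd.DataFrame([[1, 3, 5], [2, 4, 6]], columns=["Mahesh", "Mahesh", "Xin"])
new = ["x098", "y765", "z432"]

# A dict keeps one entry per key, so both 'Mahesh' columns get the same new name.
via_dict = df.rename(columns=dict(zip(df, new)))
print(list(via_dict.columns))

# Positional assignment is unaffected by duplicate labels.
via_set_axis = df.set_axis(new, axis=1)
print(list(via_set_axis.columns))
```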
