[r] Rename multiple columns by names

Someone should have asked this already, but I couldn't find an answer. Say I have:

x = data.frame(q=1,w=2,e=3, ...and many many columns...)  

what is the most elegant way to rename an arbitrary subset of columns, whose position I don't necessarily know, into some other arbitrary names?

e.g. Say I want to rename "q" and "e" into "A" and "B", what is the most elegant code to do this?

Obviously, I can do a loop:

oldnames = c("q","e")
newnames = c("A","B")
for(i in 1:2) names(x)[names(x) == oldnames[i]] = newnames[i]

But I wonder if there is a better way? Maybe using some of the packages? (plyr::rename etc.)

This question is related to r dataframe rename r-faq

The answer is


So I recently ran into this myself, if you're not sure if the columns exist and only want to rename those that do:

existing <- match(oldNames,names(x))
names(x)[na.omit(existing)] <- newNames[which(!is.na(existing))]

names(x)[names(x) %in% c("q","e")]<-c("A","B")

There are a few answers mentioning the functions dplyr::rename_with and rlang::set_names already. By they are separate. this answer illustrates the differences between the two and the use of functions and formulas to rename columns.

rename_with from the dplyr package can use either a function or a formula to rename a selection of columns given as the .cols argument. For example passing the function name toupper:

library(dplyr)
rename_with(head(iris), toupper, starts_with("Petal"))

Is equivalent to passing the formula ~ toupper(.x):

rename_with(head(iris), ~ toupper(.x), starts_with("Petal"))

When renaming all columns, you can also use set_names from the rlang package. To make a different example, let's use paste0 as a renaming function. pasteO takes 2 arguments, as a result there are different ways to pass the second argument depending on whether we use a function or a formula.

rlang::set_names(head(iris), paste0, "_hi")
rlang::set_names(head(iris), ~ paste0(.x, "_hi"))

The same can be achieved with rename_with by passing the data frame as first argument .data, the function as second argument .fn, all columns as third argument .cols=everything() and the function parameters as the fourth argument .... Alternatively you can place the second, third and fourth arguments in a formula given as the second argument.

rename_with(head(iris), paste0, everything(), "_hi")
rename_with(head(iris), ~ paste0(.x, "_hi"))

rename_with only works with data frames. set_names is more generic and can also perform vector renaming

rlang::set_names(1:4, c("a", "b", "c", "d"))

Here is the most efficient way I have found to rename multiple columns using a combination of purrr::set_names() and a few stringr operations.

library(tidyverse)

# Make a tibble with bad names
data <- tibble(
    `Bad NameS 1` = letters[1:10],
    `bAd NameS 2` = rnorm(10)
)

data 
# A tibble: 10 x 2
   `Bad NameS 1` `bAd NameS 2`
   <chr>                 <dbl>
 1 a                    -0.840
 2 b                    -1.56 
 3 c                    -0.625
 4 d                     0.506
 5 e                    -1.52 
 6 f                    -0.212
 7 g                    -1.50 
 8 h                    -1.53 
 9 i                     0.420
 10 j                     0.957

# Use purrr::set_names() with annonymous function of stringr operations
data %>%
    set_names(~ str_to_lower(.) %>%
                  str_replace_all(" ", "_") %>%
                  str_replace_all("bad", "good"))

# A tibble: 10 x 2
   good_names_1 good_names_2
   <chr>               <dbl>
 1 a                  -0.840
 2 b                  -1.56 
 3 c                  -0.625
 4 d                   0.506
 5 e                  -1.52 
 6 f                  -0.212
 7 g                  -1.50 
 8 h                  -1.53 
 9 i                   0.420
10 j                   0.957

Lot's of sort-of-answers, so I just wrote the function so you can copy/paste.

rename <- function(x, old_names, new_names) {
    stopifnot(length(old_names) == length(new_names))
    # pull out the names that are actually in x
    old_nms <- old_names[old_names %in% names(x)]
    new_nms <- new_names[old_names %in% names(x)]

    # call out the column names that don't exist
    not_nms <- setdiff(old_names, old_nms)
    if(length(not_nms) > 0) {
        msg <- paste(paste(not_nms, collapse = ", "), 
            "are not columns in the dataframe, so won't be renamed.")
        warning(msg)
    }

    # rename
    names(x)[names(x) %in% old_nms] <- new_nms
    x
}

 x = data.frame(q = 1, w = 2, e = 3)
 rename(x, c("q", "e"), c("Q", "E"))

   Q w E
 1 1 2 3

Building on @user3114046's answer:

x <- data.frame(q=1,w=2,e=3)
x
#  q w e
#1 1 2 3

names(x)[match(oldnames,names(x))] <- newnames

x
#  A w B
#1 1 2 3

This won't be reliant on a specific ordering of columns in the x dataset.


With dplyr you would do:

library(dplyr)

df = data.frame(q = 1, w = 2, e = 3)
    
df %>% rename(A = q, B = e)

#  A w B
#1 1 2 3

Or if you want to use vectors, as suggested by @Jelena-bioinf:

library(dplyr)

df = data.frame(q = 1, w = 2, e = 3)

oldnames = c("q","e")
newnames = c("A","B")

df %>% rename_at(vars(oldnames), ~ newnames)

#  A w B
#1 1 2 3

L. D. Nicolas May suggested a change given rename_at is being superseded by rename_with:

df %>% 
  rename_with(~ newnames[which(oldnames == .x)], .cols = oldnames)

#  A w B
#1 1 2 3

If one row of the data contains the names you want to change all columns to you can do

names(data) <- data[row,]

Given data is your dataframe and row is the row number containing the new values.

Then you can remove the row containing the names with

data <- data[-row,]

You can get the name set, save it as a list, and then do your bulk renaming on the string. A good example of this is when you are doing a long to wide transition on a dataset:

names(labWide)
      Lab1    Lab10    Lab11    Lab12    Lab13    Lab14    Lab15    Lab16
1 35.75366 22.79493 30.32075 34.25637 30.66477 32.04059 24.46663 22.53063

nameVec <- names(labWide)
nameVec <- gsub("Lab","LabLat",nameVec)

names(labWide) <- nameVec
"LabLat1"  "LabLat10" "LabLat11" "LabLat12" "LabLat13" "LabLat14""LabLat15"    "LabLat16" " 

This is the function that you need: Then just pass the x in a rename(X) and it will rename all values that appear and if it isn't in there it won't error

rename <-function(x){
  oldNames = c("a","b","c")
  newNames = c("d","e","f")
  existing <- match(oldNames,names(x))
  names(x)[na.omit(existing)] <- newNames[which(!is.na(existing))]
  return(x)
}

If the table contains two columns with the same name then the code goes like this,

rename(df,newname=oldname.x,newname=oldname.y)

Another solution for dataframes which are not too large is (building on @thelatemail answer):

x <- data.frame(q=1,w=2,e=3)

> x
  q w e
1 1 2 3

colnames(x) <- c("A","w","B")

> x
  A w B
1 1 2 3

Alternatively, you can also use:

names(x) <- c("C","w","D")

> x
  C w D
1 1 2 3

Furthermore, you can also rename a subset of the columnnames:

names(x)[2:3] <- c("E","F")

> x
  C E F
1 1 2 3

This would change all the occurrences of those letters in all names:

 names(x) <- gsub("q", "A", gsub("e", "B", names(x) ) )

You can use a named vector.

With base R (maybe somewhat clunky):

x = data.frame(q = 1, w = 2, e = 3) 

rename_vec <- c(q = "A", e = "B")

names(x) <- ifelse(is.na(rename_vec[names(x)]), names(x), rename_vec[names(x)])

x
#>   A w B
#> 1 1 2 3

Or a dplyr option with !!!:

library(dplyr)

rename_vec <- c(A = "q", B = "e") # the names are just the other way round than in the base R way!

x %>% rename(!!!rename_vec)
#>   A w B
#> 1 1 2 3

The latter works because the 'big-bang' operator !!! is forcing evaluation of a list or a vector.

?`!!`

!!! forces-splice a list of objects. The elements of the list are spliced in place, meaning that they each become one single argument.


Update dplyr 1.0.0

The newest dplyr version became more flexible by adding rename_with() where _with refers to a function as input. The trick is to reformulate the character vector newnames into a formula (by ~), so it would be equivalent to function(x) return (newnames).

In my subjective opinion, that is the most elegant dplyr expression.

# shortest & most elegant expression
df %>% rename_with(~ newnames, oldnames)

A w B
1 1 2 3

Side note:

If you reverse the order, argument .fn must be specified as .fn is expected before .cols argument.

df %>% rename_with(oldnames, .fn = ~ newnames)

A w B
1 1 2 3

Sidenote, if you want to concatenate one string to all of the column names, you can just use this simple code.

colnames(df) <- paste("renamed_",colnames(df),sep="")

Examples related to r

How to get AIC from Conway–Maxwell-Poisson regression via COM-poisson package in R? R : how to simply repeat a command? session not created: This version of ChromeDriver only supports Chrome version 74 error with ChromeDriver Chrome using Selenium How to show code but hide output in RMarkdown? remove kernel on jupyter notebook Function to calculate R2 (R-squared) in R Center Plot title in ggplot2 R ggplot2: stat_count() must not be used with a y aesthetic error in Bar graph R multiple conditions in if statement What does "The following object is masked from 'package:xxx'" mean?

Examples related to dataframe

Trying to merge 2 dataframes but get ValueError How to show all of columns name on pandas dataframe? Python Pandas - Find difference between two data frames Pandas get the most frequent values of a column Display all dataframe columns in a Jupyter Python Notebook How to convert column with string type to int form in pyspark data frame? Display/Print one column from a DataFrame of Series in Pandas Binning column with python pandas Selection with .loc in python Set value to an entire column of a pandas dataframe

Examples related to rename

Gradle - Move a folder from ABC to XYZ How to rename a component in Angular CLI? How do I completely rename an Xcode project (i.e. inclusive of folders)? How to rename a directory/folder on GitHub website? How do I rename both a Git local and remote branch name? Convert row to column header for Pandas DataFrame, Renaming files using node.js Replacement for "rename" in dplyr Rename multiple columns by names Rename specific column(s) in pandas

Examples related to r-faq

What does "The following object is masked from 'package:xxx'" mean? What does "Error: object '<myvariable>' not found" mean? How do I deal with special characters like \^$.?*|+()[{ in my regex? What does %>% function mean in R? How to plot a function curve in R Use dynamic variable names in `dplyr` Error: unexpected symbol/input/string constant/numeric constant/SPECIAL in my code How should I deal with "package 'xxx' is not available (for R version x.y.z)" warning? How to select the row with the maximum value in each group R data formats: RData, Rda, Rds etc