[r] How can I add a row to a data frame in R?

In R, how do you add a new row to a data frame once the data frame has already been initialized?

So far I have this:

df <- data.frame("hi", "bye")
names(df) <- c("hello", "goodbye")

#I am trying to add "hola" and "ciao" as a new row
de <- data.frame("hola", "ciao")

merge(df, de) # Adds to the same row as new columns

# Unfortunately, I couldn't find an rbind() solution that wouldn't give me an error

Any help would be appreciated.


The answer is


To formalize what someone else used setNames for:

add_row <- function(original_data, new_vals_list) {
  # Appends a row to the dataset, assuming the new values are ordered and classed appropriately.
  # new_vals_list must be a list, not a single vector.
  rbind(
    original_data,
    setNames(data.frame(new_vals_list), colnames(original_data))
  )
}

It preserves each column's class when the new value can be coerced to it, and otherwise lets R's own warnings or errors surface.

m <- mtcars[ ,1:3]
m$cyl <- as.factor(m$cyl)
str(m)

#'data.frame':  32 obs. of  3 variables:
# $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
# $ cyl : Factor w/ 3 levels "4","6","8": 2 2 1 2 3 2 3 1 1 2 ...
# $ disp: num  160 160 108 258 360 ...

Factor preserved when adding 4, even though it was passed as a numeric.

str(add_row(m, list(20,4,160)))
#'data.frame':  33 obs. of  3 variables:
# $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
# $ cyl : Factor w/ 3 levels "4","6","8": 2 2 1 2 3 2 3 1 1 2 ... 
# $ disp: num  160 160 108 258 360 ...

Passing a value other than 4, 6, or 8 does not raise an error: R warns that the factor level is invalid and stores NA instead.

str(add_row(m, list(20,3,160)))
# 'data.frame': 33 obs. of  3 variables:
# $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
# $ cyl : Factor w/ 3 levels "4","6","8": 2 2 1 2 3 2 3 1 1 2 ...
# $ disp: num  160 160 108 258 360 ...
Warning message:
In `[<-.factor`(`*tmp*`, ri, value = 3) :
  invalid factor level, NA generated
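
If the new value is a legitimate category that the factor simply has not seen yet, one option is to add the level before appending. A minimal sketch, reusing the add_row() helper from this answer (redefined here so the snippet runs on its own); the level "3" is only an illustration:

```r
# Helper from the answer above, repeated so this snippet is self-contained.
add_row <- function(original_data, new_vals_list) {
  rbind(
    original_data,
    setNames(data.frame(new_vals_list), colnames(original_data))
  )
}

m <- mtcars[, 1:3]
m$cyl <- as.factor(m$cyl)

# Register "3" as a valid level first, so the new value is not turned into NA.
levels(m$cyl) <- c(levels(m$cyl), "3")

m2 <- add_row(m, list(20, 3, 160))
tail(m2, 1)
```

Whether silently extending the levels is desirable depends on the data; the warning is often a useful signal that a value is unexpected.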

There is now add_row() from the tibble package (part of the tidyverse).

library(tidyverse)
df %>% add_row(hello = "hola", goodbye = "ciao")

Unspecified columns get an NA.
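
A short sketch of that behaviour, assuming the tibble package is installed. The data frame mirrors the one in the question, and .before is an add_row() argument that controls where the new row goes:

```r
library(tibble)

df <- data.frame(hello = "hi", goodbye = "bye", stringsAsFactors = FALSE)

# Only 'hello' is given, so 'goodbye' is filled with NA in the new row.
partial <- add_row(df, hello = "hola")

# .before = 1 places the new row ahead of the existing first row.
on_top <- add_row(df, hello = "hola", goodbye = "ciao", .before = 1)

partial
on_top
```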


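I needed to add stringsAsFactors=FALSE when creating the data frame. Without it (in R versions before 4.0.0, where strings become factors by default), the new values turn into NA: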

> df <- data.frame("hello"= character(0), "goodbye"=character(0))
> df
[1] hello   goodbye
<0 rows> (or 0-length row.names)
> df[nrow(df) + 1,] = list("hi","bye")
Warning messages:
1: In `[<-.factor`(`*tmp*`, iseq, value = "hi") :
  invalid factor level, NA generated
2: In `[<-.factor`(`*tmp*`, iseq, value = "bye") :
  invalid factor level, NA generated
> df
  hello goodbye
1  <NA>    <NA>
> 

With stringsAsFactors=FALSE, it works:

> df <- data.frame("hello"= character(0), "goodbye"=character(0), stringsAsFactors=FALSE)
> df
[1] hello   goodbye
<0 rows> (or 0-length row.names)
> df[nrow(df) + 1,] = list("hi","bye")
> df[nrow(df) + 1,] = list("hola","ciao")
> df[nrow(df) + 1,] = list(hello="hallo",goodbye="auf wiedersehen")
> df
  hello         goodbye
1    hi             bye
2  hola            ciao
3 hallo auf wiedersehen
> 

Let's make it simple:

df[nrow(df) + 1,] = c("v1","v2")

Not terribly elegant, but:

data.frame(rbind(as.matrix(df), as.matrix(de)))

From documentation of the rbind function:

For rbind column names are taken from the first argument with appropriate names: colnames for a matrix...
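
The error in the question most likely comes from this name matching: de has auto-generated column names, which do not match those of df. A small sketch, assuming the intent is to append de by position, copies the names over before binding:

```r
df <- data.frame(hello = "hi", goodbye = "bye", stringsAsFactors = FALSE)
de <- data.frame("hola", "ciao", stringsAsFactors = FALSE)

# Give 'de' the same column names as 'df' so rbind() can line the columns up.
names(de) <- names(df)

df <- rbind(df, de)
df
```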


I like list instead of c because it handles mixed data types better. Adding an additional column to the original poster's question:

#Create an empty data frame
df <- data.frame(hello=character(), goodbye=character(), volume=double())
de <- list(hello="hi", goodbye="bye", volume=3.0)
df = rbind(df,de, stringsAsFactors=FALSE)
de <- list(hello="hola", goodbye="ciao", volume=13.1)
df = rbind(df,de, stringsAsFactors=FALSE)

Note that some additional control is required if the string/factor conversion is important.

Or using the original variables with the solution from MatheusAraujo/Ytsen de Boer:

df[nrow(df) + 1,] = list(hello="hallo",goodbye="auf wiedersehen", volume=20.2)

Note that this solution doesn't work well when the string columns are factors (the default before R 4.0.0), unless the data frame already contains data with matching levels.


Make certain to specify stringsAsFactors=FALSE when creating the dataframe:

> rm(list=ls())
> trigonometry <- data.frame(character(0), numeric(0), stringsAsFactors=FALSE)
> colnames(trigonometry) <- c("theta", "sin.theta")
> trigonometry
[1] theta     sin.theta
<0 rows> (or 0-length row.names)
> trigonometry[nrow(trigonometry) + 1, ] <- c("0", sin(0))
> trigonometry[nrow(trigonometry) + 1, ] <- c("pi/2", sin(pi/2))
> trigonometry
  theta sin.theta
1     0         0
2  pi/2         1
> typeof(trigonometry)
[1] "list"
> class(trigonometry)
[1] "data.frame"

Failing to use stringsAsFactors=FALSE when creating the data frame (in R versions before 4.0.0) results in the following warning when adding the new row, and NA is stored instead of the value:

> trigonometry[nrow(trigonometry) + 1, ] <- c("0", sin(0))
Warning message:
In `[<-.factor`(`*tmp*`, iseq, value = "0") :
  invalid factor level, NA generated

Or, as inspired by @MatheusAraujo:

df[nrow(df) + 1,] = list("v1","v2")

This would allow for mixed data types.
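
To make the difference concrete, here is a small sketch (the variable names with_c and with_list are made up for illustration). c() coerces all values to one type, so a numeric column turns into character, while list() keeps each value's own type:

```r
trig <- data.frame(theta = "0", sin.theta = sin(0), stringsAsFactors = FALSE)

# c() builds a character vector, so the numeric column is coerced to character.
with_c <- trig
with_c[nrow(with_c) + 1, ] <- c("pi/2", sin(pi / 2))

# list() keeps the string and the number as separate types.
with_list <- trig
with_list[nrow(with_list) + 1, ] <- list("pi/2", sin(pi / 2))

class(with_c$sin.theta)
class(with_list$sin.theta)
```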


There is a simpler way to append a record from one data frame to another if you know that the two data frames share the same columns and types. To append one row from xx to yy, do the following, where i is the index of the row in xx.

yy[nrow(yy)+1,] <- xx[i,]

Simple as that. No messy binds. If you need to append all of xx to yy, then either call a loop or take advantage of R's sequence abilities and do this:

yy[(nrow(yy)+1):(nrow(yy)+nrow(xx)),] <- xx[1:nrow(xx),]
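
A small self-contained sketch of both forms. The data frames xx and yy and their columns are made up for illustration, and seq_len() is used here to build the range of new row indices:

```r
xx <- data.frame(id = 1:2, name = c("a", "b"), stringsAsFactors = FALSE)
yy <- data.frame(id = 3:4, name = c("c", "d"), stringsAsFactors = FALSE)

# Append a single row of xx (here the first) to yy.
yy[nrow(yy) + 1, ] <- xx[1, ]

# Append every row of xx to yy in one assignment.
yy[nrow(yy) + seq_len(nrow(xx)), ] <- xx

yy
```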

If you want to make an empty data frame and add contents in a loop, the following may help:

# Number of students in class
student.count <- 36

# Gather data about the students
student.age <- sample(14:17, size = student.count, replace = TRUE)
student.gender <- sample(c('male', 'female'), size = student.count, replace = TRUE)
student.marks <- sample(46:97, size = student.count, replace = TRUE)

# Create empty data frame
student.data <- data.frame()

# Populate the data frame using a for loop
for (i in 1 : student.count) {
    # Get the row data
    age <- student.age[i]
    gender <- student.gender[i]
    marks <- student.marks[i]

    # Populate the row
    new.row <- data.frame(age = age, gender = gender, marks = marks)

    # Add the row
    student.data <- rbind(student.data, new.row)
}

# Print the data frame
student.data
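
Calling rbind() inside the loop copies the growing data frame on every iteration, which can get slow for many rows. One common alternative, sketched here with the same made-up student data, collects the rows in a list and binds them once at the end:

```r
set.seed(42)  # only to make the sample data reproducible
student.count <- 36
student.age <- sample(14:17, size = student.count, replace = TRUE)
student.gender <- sample(c("male", "female"), size = student.count, replace = TRUE)
student.marks <- sample(46:97, size = student.count, replace = TRUE)

# Pre-allocate a list with one slot per row.
rows <- vector("list", student.count)

for (i in seq_len(student.count)) {
  rows[[i]] <- data.frame(
    age = student.age[i],
    gender = student.gender[i],
    marks = student.marks[i],
    stringsAsFactors = FALSE
  )
}

# Bind all one-row data frames together in a single call.
student.data <- do.call(rbind, rows)
head(student.data)
```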

Hope it helps :)