[r] Calculate row means on subset of columns

Given a sample data frame:

C1<-c(3,2,4,4,5)
C2<-c(3,7,3,4,5)
C3<-c(5,4,3,6,3)
DF<-data.frame(ID=c("A","B","C","D","E"),C1=C1,C2=C2,C3=C3)

DF
    ID C1 C2 C3
  1  A  3  3  5
  2  B  2  7  4
  3  C  4  3  3
  4  D  4  4  6
  5  E  5  5  3

What is the best way to create a second data frame that would contain the ID column and the mean of each row? Something like this:

ID  Mean
A    3.66
B    4.33
C    3.33
D    4.66
E    4.33

Something similar to:

RM<-rowMeans(DF[,2:4])

I'd like to keep the means aligned with their ID's.

This question is related to r dataframe

The answer is


(Another solution using pivot_longer & pivot_wider from latest Tidyr update)

You should try using pivot_longer to get your data from wide to long form Read latest tidyR update on pivot_longer & pivot_wider (https://tidyr.tidyverse.org/articles/pivot.html)

library(tidyverse)
C1<-c(3,2,4,4,5)
C2<-c(3,7,3,4,5)
C3<-c(5,4,3,6,3)
DF<-data.frame(ID=c("A","B","C","D","E"),C1=C1,C2=C2,C3=C3)

Output here

  ID     mean
  <fct> <dbl>
1 A      3.67
2 B      4.33
3 C      3.33
4 D      4.67
5 E      4.33

Using dplyr:

library(dplyr)

# exclude ID column then get mean
DF %>%
  transmute(ID,
            Mean = rowMeans(select(., -ID)))

Or

# select the columns to include in mean
DF %>%
  transmute(ID,
            Mean = rowMeans(select(., C1:C3)))

#   ID     Mean
# 1  A 3.666667
# 2  B 4.333333
# 3  C 3.333333
# 4  D 4.666667
# 5  E 4.333333

Starting with your data frame DF, you could use the data.table package:

library(data.table)

## EDIT: As suggested by @MichaelChirico, setDT converts a
## data.frame to a data.table by reference and is preferred
## if you don't mind losing the data.frame
setDT(DF)

# EDIT: To get the column name 'Mean':

DF[, .(Mean = rowMeans(.SD)), by = ID]

#      ID     Mean
# [1,]  A 3.666667
# [2,]  B 4.333333
# [3,]  C 3.333333
# [4,]  D 4.666667
# [5,]  E 4.333333

You can create a new row with $ in your data frame corresponding to the Means

DF$Mean <- rowMeans(DF[,2:4])

rowMeans is nice, but if you are still trying to wrap your head around the apply family of functions, this is a good opprotunity to begin understanding it.

DF['Mean'] <- apply(DF[,2:4], 1, mean)

Notice I'm doing a slightly different assignment than the first example. This approach makes it easier to incorporate it into for loops.