How can I get the average (mean) of selected columns

29

I would like to get the average for certain columns for each row.

I have this data:

w=c(5,6,7,8)
x=c(1,2,3,4)
y=c(1,2,3)
length(y)=4
z=data.frame(w,x,y)

Which returns:

  w x  y
1 5 1  1
2 6 2  2
3 7 3  3
4 8 4 NA

I would like to get the mean for certain columns, not all of them. My problem is that there are a lot of NAs in my data. So if I wanted the mean of x and y, this is what I would like to get back:

  w x  y mean
1 5 1  1    1
2 6 2  2    2
3 7 3  3    3
4 8 4 NA    4

I guess I could do something like z$mean=(z$x+z$y)/2 but the last row for y is NA so obviously I do not want the NA to be calculated and I should not be dividing by two. I tried cumsum but that returns NAs when there is a single NA in that row. I guess I am looking for something that will add the selected columns, ignore the NAs, get the number of selected columns that do not have NAs and divide by that number. I tried ??mean and ??average and am completely stumped.

ETA: Is there also a way I can add a weight to a specific column?

This question is tagged with r

~ Asked on 2012-02-28 22:05:17

The Best Answer is


46

Here are some examples:

> z$mean <- rowMeans(subset(z, select = c(x, y)), na.rm = TRUE)
> z
  w x  y mean
1 5 1  1    1
2 6 2  2    2
3 7 3  3    3
4 8 4 NA    4

weighted mean

> z$y <- rev(z$y)
> z
  w x  y mean
1 5 1 NA    1
2 6 2  3    2
3 7 3  2    3
4 8 4  1    4
> 
> weight <- c(1, 2) # x * 1/3 + y * 2/3
> z$wmean <- apply(subset(z, select = c(x, y)), 1, function(d) weighted.mean(d, weight, na.rm = TRUE))
> z
  w x  y mean    wmean
1 5 1 NA    1 1.000000
2 6 2  3    2 2.666667
3 7 3  2    3 2.333333
4 8 4  1    4 2.000000

~ Answered on 2012-02-28 22:20:02


23

Try using rowMeans:

z$mean=rowMeans(z[,c("x", "y")], na.rm=TRUE)

  w x  y mean
1 5 1  1    1
2 6 2  2    2
3 7 3  3    3
4 8 4 NA    4

~ Answered on 2012-02-28 22:20:25


Most Viewed Questions: