Assuming that your original dataset is similar to the one you created (i.e. with NA
as character
. You could specify na.strings
while reading the data using read.table
. But, I guess NAs would be detected automatically.
The price
column is factor
which needs to be converted to numeric
class. When you use as.numeric
, all the non-numeric elements (i.e. "NA"
, FALSE) gets coerced to NA
) with a warning.
library(dplyr)
df %>%
mutate(price=as.numeric(as.character(price))) %>%
group_by(company, year, product) %>%
summarise(total.count=n(),
count=sum(is.na(price)),
avg.price=mean(price,na.rm=TRUE),
max.price=max(price, na.rm=TRUE))
I am using the same dataset
(except the ...
row) that was showed.
df = tbl_df(data.frame(company=c("Acme", "Meca", "Emca", "Acme", "Meca","Emca"),
year=c("2011", "2010", "2009", "2011", "2010", "2013"), product=c("Wrench", "Hammer",
"Sonic Screwdriver", "Fairy Dust", "Kindness", "Helping Hand"), price=c("5.67",
"7.12", "12.99", "10.99", "NA",FALSE)))