While many options above function well, I found coercion of non-target variables to chr
problematic. Using ifelse
and grepl
within lapply
resolves this off-target effect (in limited testing). Using slarky's regular expression in grepl
:
set.seed(42)
x1 <- sample(c("a","b"," ", "a a", NA), 10, TRUE)
x2 <- sample(c(rnorm(length(x1),0, 1), NA), length(x1), TRUE)
df <- data.frame(x1, x2, stringsAsFactors = FALSE)
The problem of coercion to character class:
df2 <- lapply(df, function(x) gsub("^$|^ $", NA, x))
lapply(df2, class)
$x1
[1] "character"
$x2 [1] "character"
Resolution with use of ifelse:
df3 <- lapply(df, function(x) ifelse(grepl("^$|^ $", x)==TRUE, NA, x))
lapply(df3, class)
$x1
[1] "character"
$x2 [1] "numeric"