I just benchmarked these two approaches on data frame with 663,552 rows using the following code:
system.time(
resultsByLevel$subject <- sapply(resultsByLevel$variable, function(x) {
s <- strsplit(x, ".", fixed=TRUE)[[1]]
s[length(s)]
})
)
user system elapsed
3.722 0.000 3.594
and
system.time(
resultsByLevel$subject <- sapply(resultsByLevel$variable, function(x) {
s <- strsplit(x, ".", fixed=TRUE)[[1]]
tail(s, n=1)
})
)
user system elapsed
28.174 0.000 27.662
So, assuming you're working with vectors, accessing the length position is significantly faster.