This is a known issue. One possible remedy is drop.levels() in the gdata
package, with which your example becomes:
> drop.levels(subdf)
  letters numbers
1       a       1
2       b       2
3       c       3
> levels(drop.levels(subdf)$letters)
[1] "a" "b" "c"
There is also the dropUnusedLevels() function in the Hmisc package. However,
it only works by altering the subset operator `[`, so it is not applicable here.
Alternatively, a direct approach on a per-column basis is a simple as.factor(as.character(data)):
> levels(subdf$letters)
[1] "a" "b" "c" "d" "e"
> subdf$letters <- as.factor(as.character(subdf$letters))
> levels(subdf$letters)
[1] "a" "b" "c"
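
If the data frame has several factor columns, the same per-column conversion can be applied to all of them at once. The following is only a minimal sketch: the df and subdf objects are rebuilt here as an assumption about what your data looks like, and is_fac is a helper name made up for illustration.

```r
# Assumed stand-in for the data in the question: five letters,
# then a subset that keeps only the first three rows.
df <- data.frame(letters = factor(c("a", "b", "c", "d", "e")),
                 numbers = 1:5)
subdf <- subset(df, numbers <= 3)

# Find the factor columns (is_fac is a hypothetical helper name).
is_fac <- vapply(subdf, is.factor, logical(1))

# Rebuild each factor column from its character values, so that only
# the levels still present in the data are kept.
subdf[is_fac] <- lapply(subdf[is_fac],
                        function(x) as.factor(as.character(x)))

levels(subdf$letters)
```

Note that going through as.character() re-sorts the levels alphabetically, so a custom level order would be lost; calling factor(x) on the column instead should keep the existing order. Recent versions of base R also ship a droplevels() function that can be called on a factor or on a whole data frame, which may make the extra package unnecessary.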