Solution #2 of @Ben's answer is not reproducible in other more generic examples. It happens to give the correct lookup in the example because the unique HouseType
in houses
appear in increasing order. Try this:
hous <- read.table(header = TRUE, stringsAsFactors = FALSE, text="HouseType HouseTypeNo
Semi 1
ECIIsHome 17
Single 2
Row 3
Single 2
Apartment 4
Apartment 4
Row 3")
largetable <- data.frame(HouseType = as.character(sample(unique(hous$HouseType), 1000, replace = TRUE)), stringsAsFactors = FALSE)
lookup <- unique(hous)
Bens solution#2 gives
housenames <- as.numeric(1:length(unique(hous$HouseType)))
names(housenames) <- unique(hous$HouseType)
base2 <- data.frame(HouseType = largetable$HouseType,
HouseTypeNo = (housenames[largetable$HouseType]))
which when
unique(base2$HouseTypeNo[ base2$HouseType=="ECIIsHome" ])
[1] 2
when the correct answer is 17 from the lookup table
The correct way to do it is
hous <- read.table(header = TRUE, stringsAsFactors = FALSE, text="HouseType HouseTypeNo
Semi 1
ECIIsHome 17
Single 2
Row 3
Single 2
Apartment 4
Apartment 4
Row 3")
largetable <- data.frame(HouseType = as.character(sample(unique(hous$HouseType), 1000, replace = TRUE)), stringsAsFactors = FALSE)
housenames <- tapply(hous$HouseTypeNo, hous$HouseType, unique)
base2 <- data.frame(HouseType = largetable$HouseType,
HouseTypeNo = (housenames[largetable$HouseType]))
Now the lookups are performed correctly
unique(base2$HouseTypeNo[ base2$HouseType=="ECIIsHome" ])
ECIIsHome
17
I tried to edit Bens answer but it gets rejected for reasons I cannot understand.