You can learn more about AUROC in this blog post by Miron Kursa:
He provides a fast function for AUROC:
# By Miron Kursa https://mbq.me
auroc <- function(score, bool) {
n1 <- sum(!bool)
n2 <- sum(bool)
U <- sum(rank(score)[!bool]) - n1 * (n1 + 1) / 2
return(1 - U / n1 / n2)
}
Let's test it:
set.seed(42)
score <- rnorm(1e3)
bool <- sample(c(TRUE, FALSE), 1e3, replace = TRUE)
pROC::auc(bool, score)
mltools::auc_roc(score, bool)
ROCR::performance(ROCR::prediction(score, bool), "auc")@y.values[[1]]
auroc(score, bool)
0.51371668847094
0.51371668847094
0.51371668847094
0.51371668847094
auroc()
is 100 times faster than pROC::auc()
and computeAUC()
.
auroc()
is 10 times faster than mltools::auc_roc()
and ROCR::performance()
.
print(microbenchmark(
pROC::auc(bool, score),
computeAUC(score[bool], score[!bool]),
mltools::auc_roc(score, bool),
ROCR::performance(ROCR::prediction(score, bool), "auc")@y.values,
auroc(score, bool)
))
Unit: microseconds
expr min
pROC::auc(bool, score) 21000.146
computeAUC(score[bool], score[!bool]) 11878.605
mltools::auc_roc(score, bool) 5750.651
ROCR::performance(ROCR::prediction(score, bool), "auc")@y.values 2899.573
auroc(score, bool) 236.531
lq mean median uq max neval cld
22005.3350 23738.3447 22206.5730 22710.853 32628.347 100 d
12323.0305 16173.0645 12378.5540 12624.981 233701.511 100 c
6186.0245 6495.5158 6325.3955 6573.993 14698.244 100 b
3019.6310 3300.1961 3068.0240 3237.534 11995.667 100 ab
245.4755 253.1109 251.8505 257.578 300.506 100 a