You did not post the code generated by the compiler, so there' some guesswork here, but even without having seen it, one can say that this:
test rax, 1
jpe even
... has a 50% chance of mispredicting the branch, and that will come expensive.
The compiler almost certainly does both computations (which costs neglegibly more since the div/mod is quite long latency, so the multiply-add is "free") and follows up with a CMOV. Which, of course, has a zero percent chance of being mispredicted.