Just step through the machine code in your IDE, and you'll see why it's faster (if it's faster). It leaves out a lot of hand-holding. Chances are your Cxx can also be told to leave it out too, in which case it should be about the same.
Compiler optimizations are overrated, as are almost all perceptions about language speed.
Optimization of generated code only makes a difference in hotspot code, that is, tight algorithms devoid of function calls (explicit or implicit). Anywhere else, it achieves very little.