[java] How do I write a correct micro-benchmark in Java?

Tips about writing micro benchmarks from the creators of Java HotSpot:

Rule 0: Read a reputable paper on JVMs and micro-benchmarking. A good one is Brian Goetz, 2005. Do not expect too much from micro-benchmarks; they measure only a limited range of JVM performance characteristics.

Rule 1: Always include a warmup phase which runs your test kernel all the way through, enough to trigger all initializations and compilations before the timing phase(s). (Fewer iterations are OK in the warmup phase; the rule of thumb is several tens of thousands of inner-loop iterations.)
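A minimal sketch of Rule 1, assuming a made-up kernel that sums integers. The class name, the kernel, and the iteration counts (20,000 warmup calls, 100,000 timed calls) are illustrative assumptions, not values from the original tips. The checksum is returned and printed so the JIT has less room to discard the work as dead code.

```java
public class WarmupExample {
    // Hypothetical test kernel: sums the integers in [0, n).
    public static long kernel(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    // Calls the kernel `calls` times and returns a checksum of all results.
    public static long runPhase(int calls, int n) {
        long checksum = 0;
        for (int c = 0; c < calls; c++) {
            checksum += kernel(n);
        }
        return checksum;
    }

    public static void main(String[] args) {
        // Warmup phase: same code path as the timed phase, result is kept but not timed.
        // 20,000 calls is an assumed figure in the "tens of thousands" range.
        long warm = runPhase(20_000, 1_000);

        // Timing phase: only this region is measured.
        long start = System.nanoTime();
        long timed = runPhase(100_000, 1_000);
        long elapsed = System.nanoTime() - start;

        // Reporting happens after the timed region has ended.
        System.out.println("checksums: " + warm + " " + timed);
        System.out.println("ns per call: " + (double) elapsed / 100_000);
    }
}
```

Whether the warmup was actually long enough is something to confirm with the flags from Rule 2 rather than assume.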

Rule 2: Always run with -XX:+PrintCompilation, -verbose:gc, etc., so you can verify that the compiler and other parts of the JVM are not doing unexpected work during your timing phase.

Rule 2.1: Print messages at the beginning and end of timing and warmup phases, so you can verify that there is no output from Rule 2 during the timing phase.
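One way Rules 2 and 2.1 might look in practice is sketched below. The marker format ("### BEGIN ..."), the class name, and the kernel are all hypothetical; only the two JVM flags in the comment come from the rules above. The idea is that any compilation or GC line appearing between the "BEGIN timing" and "END timing" markers signals a contaminated measurement.

```java
public class PhaseMarkers {
    // Builds a marker line that is easy to spot among the JVM's diagnostic output.
    public static String marker(String phase, boolean begin) {
        return "### " + (begin ? "BEGIN " : "END ") + phase;
    }

    // Hypothetical kernel standing in for the code under test: sum of squares below n.
    public static long kernel(int n) {
        long acc = 0;
        for (int i = 0; i < n; i++) {
            acc += (long) i * i;
        }
        return acc;
    }

    // Prints a marker and flushes so it shows up promptly in the output.
    static void mark(String phase, boolean begin) {
        System.out.println(marker(phase, begin));
        System.out.flush();
    }

    public static void main(String[] args) {
        // Suggested invocation: java -XX:+PrintCompilation -verbose:gc PhaseMarkers
        long sink = 0;

        mark("warmup", true);
        for (int i = 0; i < 20_000; i++) {
            sink += kernel(1_000);
        }
        mark("warmup", false);

        // The BEGIN marker is printed before the clock starts, the END marker after it stops.
        mark("timing", true);
        long t0 = System.nanoTime();
        for (int i = 0; i < 20_000; i++) {
            sink += kernel(1_000);
        }
        long elapsed = System.nanoTime() - t0;
        mark("timing", false);

        System.out.println("elapsed ns: " + elapsed + " (sink " + sink + ")");
    }
}
```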

Rule 3: Be aware of the difference between -client and -server, and OSR and regular compilations. The -XX:+PrintCompilation flag reports OSR compilations with an at-sign to denote the non-initial entry point, for example: Trouble$1::run @ 2 (41 bytes). Prefer server to client, and regular to OSR, if you are after best performance.
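The OSR distinction in Rule 3 can be illustrated with two ways of structuring the same work. This is a sketch under assumptions: the class and method names are invented, and which compilations actually occur depends on the JVM version and flags, so the -XX:+PrintCompilation output should be checked rather than taken from the comments here.

```java
public class OsrVersusRegular {
    // Hypothetical unit of work.
    public static int step(int x) {
        return x * 31 + 7;
    }

    // A single call containing one long loop. The method is entered only once, so the
    // JVM can move it to compiled code only by replacing it mid-loop (an OSR compilation).
    public static int longLoop(int iterations) {
        int x = 0;
        for (int i = 0; i < iterations; i++) {
            x = step(x);
        }
        return x;
    }

    // A short loop that is meant to be called many times.
    public static int shortLoop(int iterations, int seed) {
        int x = seed;
        for (int i = 0; i < iterations; i++) {
            x = step(x);
        }
        return x;
    }

    // Many calls of the short kernel: once shortLoop is compiled, later calls can
    // enter it through its regular entry point.
    public static int manyCalls(int calls, int perCall) {
        int x = 0;
        for (int c = 0; c < calls; c++) {
            x ^= shortLoop(perCall, c);
        }
        return x;
    }

    public static void main(String[] args) {
        // Run with -XX:+PrintCompilation and look for entries containing an at-sign.
        int a = longLoop(50_000_000);
        int b = manyCalls(50_000, 1_000);
        System.out.println(a + " " + b);
    }
}
```

Structuring the kernel as a method that is invoked repeatedly, as in manyCalls, is one way to favour regular compilation of the code being measured.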

Rule 4: Be aware of initialization effects. Do not print for the first time during your timing phase, since printing loads and initializes classes. Do not load new classes outside of the warmup phase (or final reporting phase), unless you are testing class loading specifically (and in that case load only the test classes). Rule 2 is your first line of defense against such effects.
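A sketch of one way to keep initialization work out of the timed region, as Rule 4 asks: samples go into a preallocated array, and all printing and string formatting is deferred to a final reporting phase. The class name, kernel, and run counts are assumptions made for illustration.

```java
public class DeferredReporting {
    // Written to after each run so the kernel's result is observably used.
    static volatile long sink;

    // Hypothetical kernel: sums the integers in [0, n).
    public static long kernel(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    // Times `runs` executions of the kernel. Nothing is printed or formatted here;
    // the only allocation is the sample array, created before the first measurement.
    public static long[] timeRuns(int runs, int n) {
        long[] samples = new long[runs];
        for (int r = 0; r < runs; r++) {
            long t0 = System.nanoTime();
            long result = kernel(n);
            samples[r] = System.nanoTime() - t0;
            sink = result;
        }
        return samples;
    }

    public static void main(String[] args) {
        // First print happens here, before any timing, so the I/O classes are already loaded.
        System.out.println("starting");

        timeRuns(20_000, 1_000);                 // warmup, samples discarded
        long[] samples = timeRuns(20, 1_000);    // timing phase

        // Final reporting phase: string building and printing happen only now.
        StringBuilder report = new StringBuilder();
        for (long s : samples) {
            report.append(s).append(" ns\n");
        }
        System.out.print(report);
    }
}
```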

Rule 5: Be aware of deoptimization and recompilation effects. Do not take any code path for the first time in the timing phase, because the compiler may junk and recompile the code, based on an earlier optimistic assumption that the path was not going to be used at all. Rule 2 is your first line of defense against such effects.
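To make Rule 5 concrete, the sketch below uses a hypothetical kernel with a rarely taken branch. If the warmup inputs never hit that branch, the compiler may treat it as unused, and the first negative input in the timing phase could then trigger a deoptimization. Feeding the warmup a mix of inputs that covers both branches is one way to reduce that risk; the input arrays and round counts are assumed values.

```java
public class BothPathsWarmup {
    // Hypothetical kernel with a common path and a rare path.
    public static int kernel(int x) {
        if (x < 0) {
            return -x * 3;   // rare path
        }
        return x + 1;        // common path
    }

    // Applies the kernel to every input, `rounds` times, and returns a checksum.
    public static long run(int[] inputs, int rounds) {
        long checksum = 0;
        for (int r = 0; r < rounds; r++) {
            for (int x : inputs) {
                checksum += kernel(x);
            }
        }
        return checksum;
    }

    public static void main(String[] args) {
        // Warmup inputs deliberately include a negative value so the rare
        // branch has been executed before the timing phase begins.
        int[] warmupInputs = {1, 2, 3, -4, 5, 6, 7, 8};
        long warm = run(warmupInputs, 20_000);

        // Timed inputs take only paths that the warmup has already exercised.
        int[] timedInputs = {8, -7, 6, 5, 4, 3, 2, 1};
        long t0 = System.nanoTime();
        long timed = run(timedInputs, 20_000);
        long elapsed = System.nanoTime() - t0;

        System.out.println("checksums: " + warm + " " + timed + ", elapsed ns: " + elapsed);
    }
}
```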

Rule 6: Use appropriate tools to read the compiler's mind, and expect to be surprised by the code it produces. Inspect the code yourself before forming theories about what makes something faster or slower.

Rule 7: Reduce noise in your measurements. Run your benchmark on a quiet machine, and run it several times, discarding outliers. Use -Xbatch to serialize the compiler with the application, and consider setting -XX:CICompilerCount=1 to prevent the compiler from running in parallel with itself. Try your best to reduce GC overhead: set -Xmx (large enough) equal to -Xms, and use -XX:+UseEpsilonGC if it is available.
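The "run it several times, discarding outliers" part of Rule 7 could be handled with a median or a trimmed mean, as in the sketch below. These two statistics are one possible choice, not something the original tips prescribe, and the class name, sample values, and trim count are assumptions.

```java
import java.util.Arrays;

public class TrimmedTimings {
    // Median of the samples; the input array is not modified.
    public static double median(long[] samples) {
        long[] s = samples.clone();
        Arrays.sort(s);
        int mid = s.length / 2;
        return s.length % 2 == 1 ? s[mid] : (s[mid - 1] + s[mid]) / 2.0;
    }

    // Mean after dropping the `trim` smallest and the `trim` largest samples.
    public static double trimmedMean(long[] samples, int trim) {
        if (trim < 0 || samples.length <= 2 * trim) {
            throw new IllegalArgumentException("not enough samples to trim " + trim + " from each end");
        }
        long[] s = samples.clone();
        Arrays.sort(s);
        long sum = 0;
        for (int i = trim; i < s.length - trim; i++) {
            sum += s[i];
        }
        return (double) sum / (s.length - 2 * trim);
    }

    public static void main(String[] args) {
        // Made-up timings in nanoseconds; 100 stands in for a run disturbed by noise.
        long[] samples = {5, 1, 100, 3, 4};
        System.out.println("median: " + median(samples));
        System.out.println("trimmed mean: " + trimmedMean(samples, 1));
    }
}
```

With the made-up samples above, the disturbed run of 100 ns does not affect either statistic, whereas a plain mean would be pulled far upward by it.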

Rule 8: Use a library for your benchmark, as it is probably more efficient and has already been debugged for this sole purpose. Examples include JMH, Caliper, and Bill and Paul's Excellent UCSD Benchmarks for Java.
