Friday, November 30, 2018

jvm - Why do the execution time of independent code blocks depend on execution order in Scala?




I have a program written in Scala. I wanted to measure the execution time of different independent code blocks. When I did it in the obvious way (i.e. inserting System.nanoTime() before and after each block), I observed that the the execution time depends on the ordering of the blocks. The first some blocks always took more time than the others.



I created a minimalistic example that reproduces this behaviour. All code blocks are the same and call hashCode() for an array of integers, for simplicity.




package experiments

import scala.util.Random

/**
* Measuring execution time of a code block
*
* Minimalistic example
*/

object CodeBlockMeasurement {

def main(args: Array[String]): Unit = {
val numRecords = args(0).toInt
// number of independent measurements
val iterations = args(1).toInt

// Changes results a little bit, but not too much
// val records2 = Array.fill[Int](1)(0)
// records2.foreach(x => {})


for (_ <- 1 to iterations) {
measure(numRecords)
}
}

def measure(numRecords: Int): Unit = {
// using a new array every time
val records = Array.fill[Int](numRecords)(new Random().nextInt())
// block of code to be measured

def doSomething(): Unit = {
records.foreach(k => k.hashCode())
}
// measure execution time of the code-block
elapsedTime(doSomething(), "HashCodeExperiment")
}

def elapsedTime(block: => Unit, name: String): Unit = {
val t0 = System.nanoTime()
val result = block

val t1 = System.nanoTime()
// print out elapsed time in milliseconds
println(s"$name took ${(t1 - t0).toDouble / 1000000} ms")
}
}


After running the program with numRecords = 100000 and iterations = 10, my console looks like this:





HashCodeExperiment took 14.630283 ms
HashCodeExperiment took 7.125693 ms
HashCodeExperiment took 0.368151 ms
HashCodeExperiment took 0.431628 ms
HashCodeExperiment took 0.086455 ms
HashCodeExperiment took 0.056458 ms
HashCodeExperiment took 0.055138 ms
HashCodeExperiment took 0.062997 ms
HashCodeExperiment took 0.063736 ms
HashCodeExperiment took 0.056682 ms




Can somebody explain why is that? Shouldn't all be the same? Which is the real execution time?



Thanks a lot,
Peter




Environment parameters:
OS: ubuntu 14.04 LTS (64 bit)
IDE: IntelliJ IDEA 2016.1.1 (IU-145.597)
Scala: 2.11.7




Answer



It's Java's JIT kicking in. Initially the plain bytecode is executed but after some time (1.5k/10k invocations by default for Oracle JVM, see -XX:CompileThreshold) the optimizations start processing the actual executed native code which usually results in quite drastic performance improvements.



As Ivan mentions, then there's caching of intermediate bytecode/native code and various other technologies involved, one of the most significant ones being the Garbage Collector itself which cause even more variance to individual results. Depending how heavily the code allocates new objects this might absolutely trash performance whenever GC occurs, but that's a separate issue.



To remove such outlier results when microbenchmarking it is recommendable that you benchmark multiple iterations of the action and discard the bottom and top 5..10% results and do the performance estimation based on remaining samples.


No comments:

Post a Comment

plot explanation - Why did Peaches&#39; mom hang on the tree? - Movies &amp; TV

In the middle of the movie Ice Age: Continental Drift Peaches' mom asked Peaches to go to sleep. Then, she hung on the tree. This parti...