
I put together a microbenchmark that seemed to show that the following types of calls took roughly the same amount of time across many iterations after warmup.

static.method(arg);
static.finalAnonInnerClassInstance.apply(arg);
static.modifiedNonFinalAnonInnerClassInstance.apply(arg);

Has anyone found evidence that these different types of calls will, in the aggregate, have different performance characteristics? My finding is that they don't, but that surprised me a little (especially since the bytecode is quite different, at least for the static call), so I want to find out whether others have evidence either way.

If they do indeed have exactly the same performance, that would mean there is no penalty for that level of indirection in the modified non-final case.

I know the standard optimization advice is "write your code and profile", but I'm writing a framework code-generation kind of thing, so there is no specific code to profile, and the choice between static and non-final is fairly important for both flexibility and possibly performance. I am using framework code in the microbenchmark, which is why I can't include it here.

My test was run on Windows with JDK 1.7.0_06.
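For concreteness, here is a valid-Java sketch of the three call shapes being compared. All names here are mine, invented for illustration; they are not the framework's actual code:

```java
// Hypothetical illustration of the three call shapes; names are invented.
interface Fn {
    int apply(int x);
}

class Calls {
    // Shape 1: plain static method (invokestatic)
    static int staticMethod(int x) { return x + 1; }

    // Shape 2: final static field holding an anonymous inner class instance
    // (getstatic of a final field + invokeinterface)
    static final Fn FINAL_FN = new Fn() {
        public int apply(int x) { return x + 1; }
    };

    // Shape 3: non-final static field, reassignable at runtime
    // (getstatic of a mutable field + invokeinterface)
    static Fn mutableFn = new Fn() {
        public int apply(int x) { return x + 1; }
    };

    static int run(int arg) {
        int a = staticMethod(arg);
        int b = FINAL_FN.apply(arg);
        int c = mutableFn.apply(arg);
        return a + b + c;
    }
}
```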

  • This level of microbenchmarking is pretty much pointless. My guess is what you're seeing is the JIT inlining the call chains, as they're an obvious hotspot if they're in the middle of the benchmark loop. The reason why the advice is "write your code and profile" isn't laziness, it's because so many environmental factors affect performance that you're unlikely to gain any tangible benefit by microoptimising. The best way to replicate these environmental factors is to measure on a realistic dataset and hardware. Commented Nov 22, 2012 at 0:58
  • 1
    Function calls take few instructions at best. This only matters if all you're doing is calling functions (and they cannot be inlined). Commented Nov 22, 2012 at 0:59
  • Further to what @millimoose says.... The JVM will compile bytecode at run-time to machine code (via the Just In Time - JIT compiler). All future invocations of the code will be very fast. Well, fast compared with interpreting bytecode. Commented Nov 22, 2012 at 1:01
  • I'm afraid I cannot figure out what those "calls" mean. (For a start, they are not valid Java.) Unless you show us the actual code, we can only guess what it is doing, and our explanations are then guesswork based on guesswork ... Commented Nov 22, 2012 at 1:24

1 Answer


If you benchmark it in a tight loop, the JVM will cache the instance, so there's no apparent difference.

If the code is executed in a real application,

  1. If it's expected to be executed back-to-back very quickly (for example, String.length() used in for(int i=0; i<str.length(); i++){ short_code; }), the JVM will optimize it; no worries.

  2. If it's executed frequently enough that the instance is most likely in the CPU's L1 cache, the extra load of the instance is very fast; no worries.

  3. Otherwise, there is a non-trivial overhead, but the code is executed so infrequently that the overhead is almost impossible to detect amid the overall cost of the application; no worries.
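What "the JVM will cache the instance" amounts to can be sketched by hand: the JIT can hoist the field load of a stable static field out of a hot loop, effectively turning the first form below into the second. This is an illustrative sketch with invented names, not the answerer's code; both forms compute the same result, and after JIT compilation they typically perform the same:

```java
// Sketch: hoisting the getstatic load out of a loop, which is what the
// JIT can do automatically for a field that doesn't change mid-loop.
class Hoist {
    interface Fn {
        int apply(int x);
    }

    static Fn fn = new Fn() {
        public int apply(int x) { return x * 2; }
    };

    // As written: the field is re-read on every iteration.
    static int naive(int n) {
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum += fn.apply(i);
        }
        return sum;
    }

    // Manually hoisted: one field load, then a loop over a local.
    static int hoisted(int n) {
        Fn local = fn; // load once
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum += local.apply(i);
        }
        return sum;
    }
}
```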


