
I have a Spring Boot microservice running inside a Docker container. I pass it a long list of JVM flags so that I can measure the latency of the service under each flag combination.

I am using this command to start the container:

docker run --rm --name factorialorialContainer -p 8080:8080 -e JAVA_OPTIONS="$(cat /Users/sulekahelmini/Documents/fyp/fyp_work/MLscripts/flags.txt)" suleka96/factorial:latest

The flags.txt looks like this:

-XX:-ResizePLAB -XX:+ResizeOldPLAB -XX:+AlwaysPreTouch -XX:-ParallelRefProcEnabled -XX:-ParallelRefProcBalancingEnabled -XX:-UseTLAB -XX:-ResizeTLAB -XX:+ZeroTLAB -XX:+FastTLABRefill -XX:-UseAutoGCSelectPolicy -XX:-UseAdaptiveSizePolicy -XX:+UsePSAdaptiveSurvivorSizePolicy -XX:-UseAdaptiveGenerationSizePolicyAtMinorCollection -XX:+UseAdaptiveGenerationSizePolicyAtMajorCollection -XX:-UseAdaptiveSizePolicyWithSystemGC -XX:+UseAdaptiveGCBoundary -XX:+UseAdaptiveSizePolicyFootprintGoal -XX:YoungPLABSize=4354 -XX:OldPLABSize=1141 -XX:GCTaskTimeStampEntries=255 -XX:TargetPLABWastePct=5 -XX:PLABWeight=73 -XX:OldPLABWeight=21 -XX:MarkStackSize=5927008 -XX:MarkStackSizeMax=579749070 -XX:RefDiscoveryPolicy=1 -XX:InitiatingHeapOccupancyPercent=49 -XX:ErgoHeapSizeLimit=0 -XX:MaxRAMFraction=3 -XX:MinRAMFraction=3 -XX:InitialRAMFraction=74 -XX:AutoGCSelectPauseMillis=2934 -XX:AdaptiveSizeThroughPutPolicy=0 -XX:AdaptiveSizePausePolicy=0 -XX:AdaptiveSizePolicyInitializingSteps=24 -XX:AdaptiveSizePolicyOutputInterval=4060 -XX:AdaptiveSizePolicyWeight=11 -XX:AdaptiveTimeWeight=69 -XX:PausePadding=39200 -XX:PromotedPadding=3 -XX:SurvivorPadding=1 -XX:ThresholdTolerance=11 -XX:AdaptiveSizePolicyCollectionCostMargin=68 -XX:YoungGenerationSizeIncrement=28 -XX:YoungGenerationSizeSupplement=81 -XX:YoungGenerationSizeSupplementDecay=9 -XX:TenuredGenerationSizeIncrement=48 -XX:TenuredGenerationSizeSupplement=63 -XX:TenuredGenerationSizeSupplementDecay=2 -XX:MaxGCPauseMillis=16157174462788231168 -XX:GCPauseIntervalMillis=3009 -XX:GCTimeRatio=77 -XX:AdaptiveSizeDecrementScaleFactor=4 -XX:AdaptiveSizeMajorGCDecayTimeScale=9 -XX:MinSurvivorRatio=4 -XX:InitialSurvivorRatio=4 -XX:BaseFootPrintEstimate=463170010 -XX:GCHeapFreeLimit=4 -XX:ProcessDistributionStride=4 -Xms338224019 -Xmx1103493303

When I run the docker command above, I get the following error:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f9b014fe3c3, pid=1, tid=0x00007f9b10633b10
#
# JRE version: OpenJDK Runtime Environment (8.0_212-b04) (build 1.8.0_212-b04)
# Java VM: OpenJDK 64-Bit Server VM (25.212-b04 mixed mode linux-amd64 compressed oops)
# Derivative: IcedTea 3.12.0
# Distribution: Custom build (Sat May 4 17:33:35 UTC 2019)
# Problematic frame:
# j  java.net.URLStreamHandler.toExternalForm(Ljava/net/URL;)Ljava/lang/String;+88
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /usr/src/factorial/hs_err_pid1.log
Compiled method (c1) 1054 256 1 java.net.URL::getRef (5 bytes)
 total in heap  [0x00007f9b0166bad0,0x00007f9b0166bd60] = 656
 relocation     [0x00007f9b0166bbf8,0x00007f9b0166bc20] = 40
 main code      [0x00007f9b0166bc20,0x00007f9b0166bca0] = 128
 stub code      [0x00007f9b0166bca0,0x00007f9b0166bd30] = 144
 scopes data    [0x00007f9b0166bd30,0x00007f9b0166bd38] = 8
 scopes pcs     [0x00007f9b0166bd38,0x00007f9b0166bd58] = 32
 dependencies   [0x00007f9b0166bd58,0x00007f9b0166bd60] = 8
#
# If you would like to submit a bug report, please include
# instructions on how to reproduce the bug and visit:
#   https://icedtea.classpath.org/bugzilla
#

What am I doing wrong here?

  • Do you have line breaks in your txt file? You can use a shell variable instead: export SOME_VARS="-XX:..." and then ... -e JAVA_OPTIONS="${SOME_VARS}". Also try removing flags from your text file until only one is left, then add them back one by one... Commented Feb 14, 2020 at 22:13
  • Just wow! Do you understand even 1/10 of the flags here? Some are duplicated, some are deprecated, some make no sense, and I am only talking about the ones I know. Fantastic! Commented Feb 15, 2020 at 5:08
  • Do you have a Dockerfile or similar for us to reproduce this issue with? A minimal example? Commented Feb 15, 2020 at 15:42
  • @Eugene there are no duplicates, only very similar-looking (sometimes contradictory) options. My favorite is -XX:MaxGCPauseMillis=16157174462788231168. I didn't even know that the option parser handles values larger than Long.MAX_VALUE. But if you ever encounter a GC pause longer than 512 million years, it's a good idea to put a limit on it (but mind that this is not a hard limit)… Commented Feb 17, 2020 at 11:01
  • @Holger I waited for the weekend to pass before asking for your wisdom here, if that is not too much to ask. Commented Feb 17, 2020 at 11:16

2 Answers

The culprit is the -XX:+ZeroTLAB option, which is broken. Just don't use this flag.

I guess a random option generator is involved here for the purpose of performance tuning. The idea itself is fine, and there are known cases [1, 2] where machine learning with Bayesian optimization helped to find better values for JVM options. The key point, however, is to include in the experiments only those options whose meaning and implications you understand well.

The experiments should also keep each option within a reasonable range and respect the relationships between related options.

However, the option list in the question looks totally random and does not make much sense (except, maybe, for testing the JVM itself). It is no surprise that such a configuration may produce unpredictable results, including a JVM crash.
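To illustrate the point about reasonable ranges, here is a minimal Python sketch of a sanity check that a flag generator could run before launching the JVM. The function name out_of_range_flags and the limits argument are hypothetical, and the only bound built in is Java's Long.MAX_VALUE; the real valid range of each option depends on the JVM version and is not encoded here.

```python
import re

LONG_MAX = 2**63 - 1  # Java's Long.MAX_VALUE

# Matches numeric options of the form -XX:Name=123 (no size suffixes such as k/m/g).
NUMERIC_FLAG = re.compile(r"^-XX:(\w+)=(\d+)$")


def out_of_range_flags(flag_line, limits=None):
    """Return (name, value) pairs whose value exceeds the allowed maximum.

    `limits` maps an option name to a caller-chosen upper bound (hypothetical
    values, not taken from the JVM); unlisted options fall back to LONG_MAX.
    """
    limits = limits or {}
    suspicious = []
    for token in flag_line.split():
        match = NUMERIC_FLAG.match(token)
        if match is None:
            continue  # boolean flags (-XX:+Foo) and -Xms/-Xmx are not checked here
        name, value = match.group(1), int(match.group(2))
        if value > limits.get(name, LONG_MAX):
            suspicious.append((name, value))
    return suspicious


flags = "-XX:+ZeroTLAB -XX:GCTimeRatio=77 -XX:MaxGCPauseMillis=16157174462788231168"
print(out_of_range_flags(flags))
```

With the three sample flags, only MaxGCPauseMillis is reported, because its value does not fit into a signed 64-bit integer. A check like this catches only gross range errors; it says nothing about whether a combination of options makes sense.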


[1] Automated Tuning of the JVM with Bayesian Optimization
[2] Performance tuning Twitter services with Graal and Machine Learning


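I ran into something very similar. I was trying to run Cassandra, and it failed immediately, printing nothing but "Quit". The exit code was 131, which by the usual shell convention (128 + signal number) means the process was terminated by signal 3 (SIGQUIT), matching the "Quit" message. These were the flags: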

-XX:+AlwaysPreTouch -XX:CompileCommandFile=/etc/cassandra/hotspot_compiler -XX:G1RSetUpdatingPauseTimePercent=5 -XX:GCLogFileSize=10485760 -XX:+HeapDumpOnOutOfMemoryError -XX:InitialHeapSize=17179869184 -XX:+ManagementServer -XX:MaxGCPauseMillis=500 -XX:MaxHeapSize=17179869184 -XX:MaxNewSize=2147483648 -XX:MaxRAMFraction=2 -XX:NewSize=2147483648 -XX:NumberOfGCLogFiles=5 -XX:OnOutOfMemoryError=kill -9 %p -XX:+PerfDisableSharedMem -XX:+PrintCommandLineFlags -XX:+PrintGC -XX:+PrintGCApplicationConcurrentTime -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -XX:+PrintPromotionFailure -XX:+PrintTenuringDistribution -XX:+ResizeTLAB -XX:StringTableSize=1000003 -XX:ThreadPriorityPolicy=42 -XX:ThreadStackSize=256 -XX:+UnlockExperimentalVMOptions -XX:-UseBiasedLocking -XX:+UseCGroupMemoryLimitForHeap -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseG1GC -XX:+UseGCLogFileRotation -XX:+UseTLAB -XX:+UseThreadPriorities

After carefully removing flags a few at a time, I narrowed it down to "-XX:+AlwaysPreTouch".

Sigh.
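The removal process can be automated. Below is a minimal Python sketch of a leave-one-out search, not the exact procedure I followed. The names culprit_flags and runs_ok are hypothetical: runs_ok stands for whatever check you use to start the application with a given flag list and report whether it survived, and it is replaced here by a fake so that the example runs without a JVM.

```python
def culprit_flags(flags, runs_ok):
    """Return every flag whose removal alone lets the run succeed.

    `runs_ok` is a caller-supplied function (hypothetical here) that takes a
    list of flags, starts the application with them, and returns True if it
    survived.
    """
    culprits = []
    for index, flag in enumerate(flags):
        remaining = flags[:index] + flags[index + 1:]
        if runs_ok(remaining):
            culprits.append(flag)
    return culprits


# Stand-in for a real launch: pretend the run fails whenever this flag is present.
def fake_runs_ok(flags):
    return "-XX:+AlwaysPreTouch" not in flags


candidates = ["-XX:+UseG1GC", "-XX:+AlwaysPreTouch", "-XX:+ResizeTLAB"]
print(culprit_flags(candidates, fake_runs_ok))
```

Leave-one-out needs one launch per flag and finds a single offending flag. If two flags each cause the failure on their own, removing just one of them never helps and the result is empty; in that case, start from an empty list and add flags back one by one instead.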
