JVM Performance Tuning Tips
-> Problems
– Same as client side, default collector used most of the time
– heaps are sized bigger
– stop the world collection
-> Symptoms seen
– larger pauses as heap sizes are bigger
– GC frequency dependent on load
– CPUs are idle during collection
– scalability problems
Solution
Performance Tuning
-> Manual tuning by application modeling
– What is application modeling
– GC Portal
– General Tuning Tips
-> Automatic tuning
– Ergonomics
Application Modeling From A GC Perspective
Application Modeling and Performance Analysis
-> What is it?
-> Recommendations based on the model
-> Empirical modeling
-> Theoretical projections
Application Modeling And Performance Tuning: GC Perspective
-> Goal : remove the unpredictable behavior of an application
-> Construct a mathematical model by mining the verbose GC log files
-> The model takes into account
– Incoming load information
– Data in verbosegc log files
The Model
Incoming load information
-> Transaction Rate (Allocation Rate)
-> Active Transaction Duration
– Lifetimes of short and long lived data
-> Size of objects per transaction
The Model (1)
Data in the verbose GC log files
-> GC pauses
– Young and Old generation pauses
– Time to start-stop application threads
– Application time
-> GC frequency
– Young and old generation periodicity
-> Rate of allocation/promotion of objects
-> Direct allocation of objects in old generation
The Model (2)
Data in the verbosegc log files (Contd.)
-> Total
– GC time, Application time
– Objects promoted
– Garbage collected
-> Heap sizes
– Size of Young generation (Eden, Semi-Space)
– Size of Old generation
– Initial and Final Size of old generation
– Size of Permanent generation
– Average occupancy, and heap thresholds for GC
Verbosegc Log Sample
0.740905: [GC {Heap before GC invocations=8:
Heap
 def new generation total 1536K, used 1055K [0xf2c00000, 0xf2e00000, 0xf2e00000)
  eden space 1024K, 99% used [0xf2c00000, 0xf2cfdfe0, 0xf2d00000)
  from space 512K, 7% used [0xf2d00000, 0xf2d09c50, 0xf2d80000)
  to space 512K, 0% used [0xf2d80000, 0xf2d80000, 0xf2e00000)
 concurrent mark-sweep generation total 59392K, used 540K [0xf2e00000, 0xf6800000, 0xf6800000)
 concurrent-mark-sweep perm gen total 4096K, used 1158K [0xf6800000, 0xf6c00000, 0xfa800000)
0.741773: [DefNew
Desired survivor size 262144 bytes, new threshold 1 (max 31)
age 1: 280048 bytes, 280048 total
age 2: 40016 bytes, 320064 total
: 1055K->312K(1536K), 0.0048282 secs] 1595K->853K(60928K) Heap after GC invocations=9:
Heap
 def new generation total 1536K, used 312K [0xf2c00000, 0xf2e00000, 0xf2e00000)
  eden space 1024K, 0% used [0xf2c00000, 0xf2c00000, 0xf2d00000)
  from space 512K, 61% used [0xf2d80000, 0xf2dce240, 0xf2e00000)
  to space 512K, 0% used [0xf2d00000, 0xf2d00000, 0xf2d80000)
 concurrent mark-sweep generation total 59392K, used 540K [0xf2e00000, 0xf6800000, 0xf6800000)
 concurrent-mark-sweep perm gen total 4096K, used 1158K [0xf6800000, 0xf6c00000, 0xfa800000)
} , 0.0063803 secs]
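To illustrate what "mining" such a log can look like, here is a minimal Java sketch that pulls the young generation occupancy change and pause time out of the DefNew entry in the sample above. The class name GcLogLine and its fields are hypothetical, and the regular expression is an assumption that covers only the "before->after(capacity), secs" pattern seen in this sample, not every verbosegc format.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the JDK or of GC Portal.
class GcLogLine {
    // Matches fragments such as "1055K->312K(1536K), 0.0048282 secs".
    private static final Pattern COLLECTION =
        Pattern.compile("(\\d+)K->(\\d+)K\\((\\d+)K\\), ([0-9.]+) secs");

    final long beforeK;     // occupancy before the collection, in KB
    final long afterK;      // occupancy after the collection, in KB
    final long capacityK;   // capacity of the collected space, in KB
    final double pauseSecs; // reported collection time, in seconds

    GcLogLine(long beforeK, long afterK, long capacityK, double pauseSecs) {
        this.beforeK = beforeK;
        this.afterK = afterK;
        this.capacityK = capacityK;
        this.pauseSecs = pauseSecs;
    }

    // Returns the first matching collection record in the text, or null if none.
    static GcLogLine parseFirst(String text) {
        Matcher m = COLLECTION.matcher(text);
        if (!m.find()) {
            return null;
        }
        return new GcLogLine(Long.parseLong(m.group(1)), Long.parseLong(m.group(2)),
                             Long.parseLong(m.group(3)), Double.parseDouble(m.group(4)));
    }

    // Space freed (or promoted out of this generation) by the collection, in KB.
    long reclaimedK() {
        return beforeK - afterK;
    }

    public static void main(String[] args) {
        String sample = ": 1055K->312K(1536K), 0.0048282 secs] 1595K->853K(60928K)";
        GcLogLine g = parseFirst(sample);
        System.out.println(g.reclaimedK() + "K left the young generation in "
                           + g.pauseSecs + " secs");
    }
}
```

Summing these records over a whole log gives the totals (GC time, objects promoted, garbage collected) that the model is built from.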
Data Calculated
-> GC sequential overhead (Directly related to application throughput)
-> GC concurrent overhead
-> Average size of objects
-> Active data duration (long and short term objects)
-> Actual throughput
-> Application efficiency
-> Speedup (Amdahl’s law)
-> % CPU utilization
-> Memory Leak detection
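As a rough illustration of two of these derived values, the Java sketch below computes GC sequential overhead and a promotion rate from log totals. The definitions used here (overhead as stop-the-world GC time divided by GC time plus application time, promotion rate as promoted KB per elapsed second) are assumptions made for the example; the slides do not give the exact formulas. The class name GcOverhead is hypothetical.

```java
// Hypothetical helper for illustration; the formulas are assumed definitions.
class GcOverhead {
    // Fraction of elapsed time spent in stop-the-world collections.
    static double sequentialOverhead(double gcSecs, double appSecs) {
        double total = gcSecs + appSecs;
        if (total <= 0) {
            throw new IllegalArgumentException("no elapsed time");
        }
        return gcSecs / total;
    }

    // Average amount of data promoted to the old generation per second, in KB.
    static double promotionRateKPerSec(long promotedK, double elapsedSecs) {
        if (elapsedSecs <= 0) {
            throw new IllegalArgumentException("no elapsed time");
        }
        return promotedK / elapsedSecs;
    }

    public static void main(String[] args) {
        // Example totals: 5 s of GC pauses and 95 s of application time.
        System.out.println("sequential overhead = " + sequentialOverhead(5.0, 95.0));
        System.out.println("promotion rate = " + promotionRateKPerSec(2000, 100.0) + " KB/s");
    }
}
```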
General Recommendations based on the model (1)
-> General JVM Tuning and Sizing methodology
– Size of old generation = Call rate * active call duration * long lived data/call
– Size of young generation = Call rate * expected periodicity of GC * short lived data/call
-> for desired pause and frequency
-> Reduce GC pauses
-> Reduce GC sequential overhead
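The two sizing formulas above can be tried out with a minimal Java sketch such as the following. The class and parameter names are hypothetical, and the input numbers are invented for the example; real values would come from load information and the verbosegc logs.

```java
// Hypothetical helper for illustration; implements the two sizing formulas as stated.
class HeapSizing {
    // Old generation = call rate * active call duration * long lived data per call.
    static long oldGenBytes(double callsPerSec, double activeCallSecs,
                            long longLivedBytesPerCall) {
        return Math.round(callsPerSec * activeCallSecs * longLivedBytesPerCall);
    }

    // Young generation = call rate * expected GC periodicity * short lived data per call.
    static long youngGenBytes(double callsPerSec, double gcPeriodSecs,
                              long shortLivedBytesPerCall) {
        return Math.round(callsPerSec * gcPeriodSecs * shortLivedBytesPerCall);
    }

    public static void main(String[] args) {
        // Assumed load: 100 calls/s, each active for 30 s and holding 50 KB of long lived data.
        long oldGen = oldGenBytes(100, 30, 50 * 1024);
        // Assumed target: one young collection every 10 s, 200 KB of short lived data per call.
        long youngGen = youngGenBytes(100, 10, 200 * 1024);
        System.out.println("old generation   ~ " + oldGen / (1024 * 1024) + " MB");
        System.out.println("young generation ~ " + youngGen / (1024 * 1024) + " MB");
    }
}
```

The results are starting points for the desired pause and frequency, to be checked against measured behavior.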
General Recommendations
Based On The Model (2)
-> Size the young and old generation heaps to handle a given load
-> Detect memory leaks
-> Choice of collector
-> Choice of the different JVM options and switches
Empirical Modeling
-> Rank the Application runs based on data analyzed from the verbosegc logs
-> Choose the optimum JVM environment based on criteria:
– Heap sizes
– No. of Processors
– GC sequential overhead
– GC concurrent overhead, etc.
– Application efficiency
Theoretical Projections For Tuning Based On The Model
-> “What-if” scenarios could be tried
– How GC behavior changes with change in Application and JVM parameters
-> “What-if” input parameters include:
– Size of young generation
– Size of old generation
– Request rate/Load
– Garbage/request
– No. of processors
Theoretical Projections For Tuning
-> Projection output shows:
– What the following values could be
-> GC pause (latency)
-> GC frequency
-> GC sequential load (bandwidth)
-> % CPU utilization, Speedup
-> Application efficiency
-> Allocation rate, Promotion rate
-> Size and duration of Short lived data
-> Size and duration of Long lived data
GC Portal
Offers application modeling and performance tuning from a GC perspective as a service.
-> Implemented in J2EE
-> Allows developers to submit log files, and analyze application behavior
-> The portal can be used to tune performance and to size an application to run optimally under lean, peak, and burst conditions
GC Portal
-> Plots and displays graphically GC behavior over time. Parameters include:
– GC pauses (Max. and Average)
– GC frequency
– GC sequential load
– GC concurrent load
– Garbage Allocation rate
– Garbage Promotion rate
Snapshot from the GC portal Graphical Engine
GC Portal
-> Provides General Recommendations
-> Projections for sizing and tuning via “what-if” scenarios
-> Empirical modeling
Snapshot from the GC portal
What-if scenarios
GC General Tuning Tips
Reducing Collection Times
-> Use -Xconcgc for low pause applications
-> Use -XX:+AggressiveHeap for throughput applications
– Use -XX:+PrintCommandLineFlags to see the options that AggressiveHeap sets, and use this to tune further
-> Size Permanent Generation
-> Reduce pooled objects
-> Using NIO
-> Avoid System.gc() and distributed RMI GC
– Use -XX:+DisableExplicitGC
-> Replace immutables with mutables where values are modified repeatedly
– String -> StringBuffer for String manipulation, and maybe storage
-> Avoid undersizing the old generation
– An undersized old generation reduces collection time, but leads to many other problems such as fragmentation and triggering Full GCs
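The String versus StringBuffer tip can be illustrated with a small Java sketch. Both methods below produce the same result, but the first creates a new String (plus a temporary buffer) on every loop iteration, while the second appends into one growing buffer. The class and method names are hypothetical.

```java
// Hypothetical demo class; compares two ways of building the same string.
class ConcatDemo {
    // Each += allocates a fresh String, leaving the previous one as garbage.
    static String joinWithString(String[] parts) {
        String result = "";
        for (String p : parts) {
            result += p;
        }
        return result;
    }

    // A single mutable buffer is reused, so far fewer temporary objects are created.
    static String joinWithBuffer(String[] parts) {
        StringBuffer sb = new StringBuffer();
        for (String p : parts) {
            sb.append(p);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] parts = {"young", "-", "old", "-", "perm"};
        System.out.println(joinWithString(parts));
        System.out.println(joinWithBuffer(parts));
    }
}
```

From J2SE 5.0 onward, StringBuilder offers the same interface without synchronization and is usually preferred when the buffer is not shared between threads.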
Reducing Frequency Of GC
-> Frequency of a collection is dependent on
– Size of young and old generations
– Incoming load
– Object life time
-> Increase young generation to decrease frequency of collection but this will increase pause
– Choose a size where pause is tolerable
-> Increase in load will fill up the heap faster so increases collection frequency
– Increase heap to reduce frequency
-> Increase in lifetime of objects increases frequency as live objects take space
– Keep live objects to the needed minimum
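The relationship between generation size, load, and collection frequency can be sketched with a simple estimate: the young generation fills at roughly request rate times garbage per request, so the interval between young collections is the young generation size divided by that rate. This is a simplified assumption (it ignores survivor spaces and objects that stay live), and the names below are hypothetical.

```java
// Hypothetical estimator; assumes the young generation fills at a constant rate.
class GcFrequency {
    // Approximate seconds between young collections.
    static double youngGcIntervalSecs(long youngBytes, double requestsPerSec,
                                      long garbageBytesPerRequest) {
        double fillRate = requestsPerSec * garbageBytesPerRequest; // bytes per second
        if (fillRate <= 0) {
            throw new IllegalArgumentException("fill rate must be positive");
        }
        return youngBytes / fillRate;
    }

    public static void main(String[] args) {
        long young = 64L * 1024 * 1024; // 64 MB young generation (example value)
        long garbage = 64 * 1024;       // 64 KB of garbage per request (example value)
        System.out.println("100 req/s: " + youngGcIntervalSecs(young, 100, garbage) + " s");
        System.out.println("200 req/s: " + youngGcIntervalSecs(young, 200, garbage) + " s");
        System.out.println("200 req/s, 128 MB: "
                           + youngGcIntervalSecs(2 * young, 200, garbage) + " s");
    }
}
```

Doubling the load halves the interval, and doubling the young generation restores it, at the cost of a potentially longer pause per collection.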
Sizing The Heap
-> Heap size influences the following
– GC frequency and collection times
– Number of short and long term objects
– Fragmentation and locality problems
-> Undersized heap with concurrent collector
– leads to Full GCs with increase in load
– Fragmentation problems
-> Oversized heap
– leads to increased collection times
– locality problems (smear problem)
– Use ISM and variable page sizes to reduce smear problem
-> Size heap to handle peak and burst loads
Improving Execution Efficiency
-> GC Portal computes execution efficiency
-> Efficiency calculated using Amdahl’s law
-> Translates to CPU utilization
-> The higher this value, the better
-> Increase efficiency by reducing serial parts
– Reducing GC pause & frequency
– Reducing long term objects and increasing short term objects
– Creating only needed objects like using NIO, mutables
– Avoiding Full GCs, e.g. those caused by RMI DGC or undersized heaps
– Choosing optimum heap size to reduce smear effect
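Amdahl's law itself is easy to sketch in Java. The example below treats the GC sequential overhead as the serial fraction of the workload, which is an assumption made for illustration; the slides do not spell out the exact formula GC Portal uses. The class name Amdahl is hypothetical.

```java
// Hypothetical helper for illustration; standard Amdahl's law formulas.
class Amdahl {
    // Speedup on n processors when serialFraction of the work cannot run in parallel.
    static double speedup(double serialFraction, int processors) {
        if (serialFraction < 0 || serialFraction > 1 || processors < 1) {
            throw new IllegalArgumentException("invalid serial fraction or processor count");
        }
        return 1.0 / (serialFraction + (1.0 - serialFraction) / processors);
    }

    // Efficiency: achieved speedup as a fraction of the ideal speedup n.
    static double efficiency(double serialFraction, int processors) {
        return speedup(serialFraction, processors) / processors;
    }

    public static void main(String[] args) {
        // Assumed case: 5% of elapsed time is stop-the-world GC on a 4 CPU machine.
        System.out.println("speedup    = " + speedup(0.05, 4));
        System.out.println("efficiency = " + efficiency(0.05, 4));
    }
}
```

With a 5% serial fraction on 4 CPUs the speedup is about 3.48 rather than 4, which shows why reducing the serial (GC pause) part raises efficiency.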
Other Ways To Improve Performance On Solaris
-> Using the Solaris RT (real-time) scheduling class
-> Using the alternate thread library (/usr/lib/lwp)
– default thread library on Solaris 9
-> Using hires_tick to change clock resolution
-> Using processor sets
-> Binding process to a CPU
-> Turning off interrupts
-> Modifying the dispatch table
-> Use large page sizes
-> Use multi-threaded malloc library
Automatic Tuning In J2SE 1.5
Ergonomics
-> What is ergonomics?
-> Why do it?
-> When is it used?
-> What does it do?
-> How does it work?
What Is Ergonomics?
-> JVM™ automatically selects
– Compiler
– Garbage collector
– Heap size
-> User specifies behavior
-> GC dynamically does tuning
– AKA GC ergonomics
Why Do Ergonomics?
-> Better Performance
– Hand tuned performance is good
-> Ease of Use
– Hand tuning is hard
-> Better Resource Usage
– Use what you need
When Is Ergonomics Used?
-> Server class machines
– At least 2 CPUs and 2 Gbytes of physical memory
-> Exceptions
– Microsoft Windows ia32
What Does Ergonomics Do?
-> Server compiler
-> Parallel GC collector
-> Maximum heap
– Smaller of
-> ¼ physical memory
-> 1 Gbyte
-> Initial heap
– Smaller of
-> 1/64 physical memory
-> 1 Gbyte
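The default heap rules listed above can be written down as a short Java sketch. It only restates the slide's arithmetic (smaller of a fraction of physical memory and 1 Gbyte) plus the server class test of 2 CPUs and 2 Gbytes, read here as minimums; the class and method names are hypothetical and nothing below queries the real JVM.

```java
// Hypothetical helper for illustration; mirrors the defaults stated on the slide.
class ErgoDefaults {
    static final long GB = 1L << 30;

    // Server class machine: at least 2 CPUs and 2 Gbytes of physical memory (assumed minimums).
    static boolean isServerClass(int cpus, long physicalBytes) {
        return cpus >= 2 && physicalBytes >= 2 * GB;
    }

    // Default maximum heap: smaller of 1/4 of physical memory and 1 Gbyte.
    static long defaultMaxHeap(long physicalBytes) {
        return Math.min(physicalBytes / 4, GB);
    }

    // Default initial heap: smaller of 1/64 of physical memory and 1 Gbyte.
    static long defaultInitialHeap(long physicalBytes) {
        return Math.min(physicalBytes / 64, GB);
    }

    public static void main(String[] args) {
        long phys = 2 * GB; // example machine with 2 Gbytes of memory
        System.out.println("server class: " + isServerClass(2, phys));
        System.out.println("max heap:     " + defaultMaxHeap(phys) / (1 << 20) + " MB");
        System.out.println("initial heap: " + defaultInitialHeap(phys) / (1 << 20) + " MB");
    }
}
```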
What is GC Ergonomics?
-> User specifies
– Maximum pause time goal
– Throughput goal
– Assumes minimum footprint goal
-> GC tunes
– Young generation size
– Old generation size
– Survivor space sizes
– Tenuring threshold
Why Do GC Ergonomics?
-> Common complaints
– Pauses are too long
– GC is too frequent
-> Solution
– Hand tune the GC
What Does GC Ergonomics Do?
-> Goals, not guarantees
-> User specified behavior
– Maximum pause time goal
-> Reduce size of generations
– Throughput goal
-> Increase size of generations
– Minimum footprint
-> Reduce size of generations
-> Again, goals, not guarantees
Ergonomics Usage
-> Use -XX:+UseParallelGC with the options below
-> Throughput Goal
– -XX:GCTimeRatio=nnn
-> Sets the ratio of GC time to application time to 1:nnn
-> The GC time goal is 1 / (1 + nnn) of total time. E.g. nnn = 19 gives a goal of 5% of total time spent in GC
-> Pause Time Goal
– -XX:MaxGCPauseMillis=nnn
-> A hint to the JVM to keep pauses below this value (in milliseconds)
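The GCTimeRatio arithmetic can be checked with a few lines of Java. The sketch converts between the flag value nnn and the fraction of total time targeted for GC, using the 1 / (1 + nnn) relation given above; the class and method names are hypothetical.

```java
// Hypothetical helper for illustration; implements the 1 / (1 + nnn) relation.
class GcTimeRatio {
    // Fraction of total time the collector aims to spend in GC for a given flag value.
    static double gcFraction(int nnn) {
        if (nnn < 0) {
            throw new IllegalArgumentException("nnn must be non-negative");
        }
        return 1.0 / (1 + nnn);
    }

    // Flag value that corresponds to a desired GC fraction, rounded to the nearest integer.
    static int ratioFor(double gcFraction) {
        if (gcFraction <= 0 || gcFraction > 1) {
            throw new IllegalArgumentException("fraction must be in (0, 1]");
        }
        return (int) Math.round(1.0 / gcFraction - 1);
    }

    public static void main(String[] args) {
        System.out.println("-XX:GCTimeRatio=19 -> " + gcFraction(19) * 100 + "% GC time goal");
        System.out.println("1% GC time goal    -> -XX:GCTimeRatio=" + ratioFor(0.01));
    }
}
```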
Ergonomics Strategy
-> Use throughput strategy, and set desired throughput
-> Change maximum heap size if throughput cannot be achieved
-> Once the throughput goal is achieved, set a pause time goal if pauses are too long
JVM Monitoring & Management in J2SE 1.5
Java Monitoring and Management API
-> Provides a way to manage and monitor a JVM
– Information about loaded classes and threads
– Memory usage
– Garbage collection statistics
– Low memory detection & thresholds
-> Provides monitoring utilities
– jconsole
– jstat
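As an illustration of low memory detection, the Java sketch below sets a usage threshold on each heap memory pool that supports one, using the java.lang.management API. The 80% figure and the class name LowMemoryWatch are assumptions for the example; which pools exist and support thresholds depends on the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Hypothetical helper for illustration; uses only standard java.lang.management calls.
class LowMemoryWatch {
    // Sets a usage threshold at the given fraction of each heap pool's maximum size.
    // Returns the number of pools on which a threshold was set.
    static int armThresholds(double fraction) {
        if (fraction <= 0 || fraction > 1) {
            throw new IllegalArgumentException("fraction must be in (0, 1]");
        }
        int armed = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP || !pool.isUsageThresholdSupported()) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null || usage.getMax() <= 0) {
                continue; // pool is invalid or has no defined maximum
            }
            pool.setUsageThreshold((long) (usage.getMax() * fraction));
            armed++;
        }
        return armed;
    }

    public static void main(String[] args) {
        int armed = armThresholds(0.8); // 80% is an arbitrary example value
        System.out.println("thresholds set on " + armed + " heap pool(s)");
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.isUsageThresholdSupported() && pool.getUsageThreshold() > 0) {
                System.out.println(pool.getName() + " exceeded: "
                                   + pool.isUsageThresholdExceeded());
            }
        }
    }
}
```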
Java Monitoring and Management API
-> Provides MBeans
– GarbageCollectorMXBean
– MemoryManagerMXBean
– MemoryMXBean
– MemoryPoolMXBean
– Other MBeans
-> MBeans can be accessed through
– jconsole
-> jconsole jvmpid
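Besides jconsole, the same MBeans can be read from inside the application. The Java sketch below prints collection counts and times from each GarbageCollectorMXBean and the current heap usage from the MemoryMXBean. The class name GcStats is hypothetical, and the collector names printed depend on which collector the JVM is running.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper for illustration; uses only standard java.lang.management calls.
class GcStats {
    // Builds a short text report of GC activity and heap usage for this JVM.
    static String summary() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName())
              .append(": collections=").append(gc.getCollectionCount())
              .append(", timeMs=").append(gc.getCollectionTime())
              .append('\n');
        }
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        sb.append("heap used=").append(heap.getUsed())
          .append(" committed=").append(heap.getCommitted())
          .append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(summary());
    }
}
```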
jconsole – GarbageCollection
jstat
-> A utility to obtain JVM statistics dynamically
– Compiler statistics
– Class loader statistics
– GC statistics
-> GC statistics include
– cause of GC
– generation information
-> capacity
-> utilization
jstat Usage
-> jstat -gc jvmid
– Provides statistics on the behavior of the garbage collected heap
-> jstat -gcutil jvmid
– Provides a concise summary of garbage collection statistics.
-> jstat -gcutil 21891
  S0     S1     E      O      P     YGC    YGCT   FGC   FGCT    GCT
 12.44   0.00  27.20   9.49  96.70   78    0.176    5   0.495   0.672
 12.44   0.00  62.16   9.49  96.70   78    0.176    5   0.495   0.672
 12.44   0.00  83.97   9.49  96.70   78    0.176    5   0.495   0.672
  0.00   7.74   0.00   9.51  96.70   79    0.177    5   0.495   0.673
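Output like the sample above is easy to post-process. The Java sketch below pairs a jstat -gcutil header line with one data row and derives the average young collection time (YGCT / YGC). The class name JstatGcutil is hypothetical, and the parser assumes every column is numeric, as in this sample.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; assumes whitespace-separated numeric columns.
class JstatGcutil {
    // Maps each column name in the header to the value in the same position of the row.
    static Map<String, Double> parse(String header, String row) {
        String[] names = header.trim().split("\\s+");
        String[] values = row.trim().split("\\s+");
        if (names.length != values.length) {
            throw new IllegalArgumentException("header and row have different column counts");
        }
        Map<String, Double> out = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            out.put(names[i], Double.parseDouble(values[i]));
        }
        return out;
    }

    public static void main(String[] args) {
        String header = "S0 S1 E O P YGC YGCT FGC FGCT GCT";
        String row = "12.44 0.00 27.20 9.49 96.70 78 0.176 5 0.495 0.672";
        Map<String, Double> stats = parse(header, row);
        double avgYoung = stats.get("YGCT") / stats.get("YGC");
        System.out.println("eden utilization: " + stats.get("E") + "%");
        System.out.println("average young GC time: " + avgYoung + " s");
    }
}
```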
Summary
-> Introduced low pause and throughput collectors
-> Performance problems seen with garbage collection
-> Improving performance using manual and automatic tuning
-> Introduction to the new monitoring & management API
Resources
-> http://java.sun.com/docs/hotspot/index.html
-> http://java.sun.com/docs/hotspot/gc1.4.2/
-> http://developers.sun.com/techtopics/mobility/midp/articles/garbagecollection2/
-> http://java.sun.com/developer/technicalArticles/Programming/turbo/
-> http://java.sun.com/docs/hotspot/VMOptions.html
-> http://sdc.sun.com/gcportal/
-> http://java.sun.com/developer/technicalArticles/Programming/GCPortal/index.html