Java Performance Tuning Tips
Why is it slow?
The virtual machine layer that abstracts Java away from the underlying hardware adds overhead.
This overhead can cause a Java application to run slower than an equivalent application written in a lower-level language.
Java’s advantages (platform independence, memory management, powerful exception checking, built-in multithreading, dynamic resource loading, and security checks) all add costs.
The tuning game
Performance tuning is similar to playing a strategy game.
Your target is to get a better score than the last score after each attempt.
You are playing with, not against, the computer, the programmer, the design, and the compiler.
Techniques include switching compilers, turning on optimizations, using a different VM, and finding two or three bottlenecks in the code that have simple fixes.
System limitations
Three resources limit all applications:
CPU speed and availability
System memory
Disk (and network) input/output
The first step in tuning is to determine which of these is causing your application to run slowly.
When you fix a bottleneck, it is normal for the next bottleneck to come from a different one of these limitations.
A tuning strategy
1. Identify the main bottlenecks (look for about the top five bottlenecks).
2. Choose the quickest and easiest one to fix, and address it.
3. Repeat from Step 1.
Advantage:
- once a bottleneck has been eliminated, the characteristics of the application change, and the topmost bottleneck may no longer need to be addressed.
Identifying bottlenecks
1. Measure the performance by using profilers and benchmark suites.
2. Identify the location of any bottlenecks.
3. Think of a hypothesis for the cause of the bottleneck.
4. Consider any factors that may refute your hypothesis.
5. Create a test to isolate the factor identified by the hypothesis.
6. Test the hypothesis
7. Alter the application to reduce the bottleneck
8. Test that the alteration improves performance, and measure the improvement
9. Repeat from Step 1.
Perceived Performance
Users have a particular view of performance that allows you to cut some corners.
Ex: A browser that gives a running countdown of the amount left to be downloaded from a server is seen as faster than one that just sits there until all the data is downloaded.
Rules:
If an application is unresponsive for more than 2 seconds, it is seen as slow.
Users are not aware of response-time improvements of less than 20%.
How to appear quicker?
Threading: ensure that your application remains responsive to the user, even while it is executing some other function.
Streaming: display a partial result of the activity while continuing to compile more results in the background (very useful in distributed systems).
Caching: caching techniques help speed up data access. For example, the read-ahead algorithm used in disk hardware makes reading forward through a file fast.
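The threading and streaming ideas can be combined in a small sketch. The code below is illustrative only: the class StreamingSearch, the method searchAsync, and the sample data are hypothetical names invented for this example, and it assumes Java 8 or later. The work runs on a background thread and each partial result is handed to a callback as soon as it is found, so the calling thread stays free.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical example class; not part of the original slides.
class StreamingSearch {

    /**
     * Searches items on a background thread. Every match is passed to
     * onResult immediately (streaming); the Future yields the final count.
     */
    static Future<Integer> searchAsync(List<String> items, String needle,
                                       Consumer<String> onResult,
                                       ExecutorService executor) {
        return executor.submit(() -> {
            int found = 0;
            for (String item : items) {
                if (item.contains(needle)) {
                    onResult.accept(item); // partial result, delivered early
                    found++;
                }
            }
            return found;
        });
    }

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        List<String> items = Arrays.asList("alpha", "beta", "gamma", "delta");
        Future<Integer> total = searchAsync(items, "l",
                match -> System.out.println("partial result: " + match), executor);
        // The calling thread is not blocked while the search runs.
        System.out.println("caller thread is still free to handle user input");
        System.out.println("total matches: " + total.get());
        executor.shutdown();
    }
}
```

In a GUI application the callback would normally hand the partial result to the UI thread rather than print it.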
Starting to tune
User agreements: you should agree with your users on what the performance of the application is expected to be: response times, system-wide throughput, maximum number of users, data, …
Setting benchmarks: these are precise specifications stating what part of the code needs to run in what amount of time.
How much faster, in which parts, and for how much effort?
Without clear performance objectives, tuning will never be completed.
Taking Measurements
Each run of your benchmarks needs to be under conditions that are as identical as possible.
The benchmark should be run multiple times, and the full list of results retained, not just the average and deviation.
Run an initial benchmark to specify how far you need to go and to highlight how much you have achieved when you finish tuning.
Make your benchmark long enough (over 5 seconds).
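A minimal sketch of such a measurement loop is shown here. The class BenchmarkRunner and its methods are hypothetical names for illustration, the workload is a placeholder, and the code assumes Java 8 or later. It runs a task several times and keeps every timing, so the full list is available alongside any summary.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical harness; names are illustrative, not from the original slides.
class BenchmarkRunner {

    /** Runs the task the given number of times; returns every wall-clock duration in ms. */
    static List<Long> time(Runnable task, int runs) {
        List<Long> results = new ArrayList<>();
        for (int i = 0; i < runs; i++) {
            long start = System.currentTimeMillis();
            task.run();
            results.add(System.currentTimeMillis() - start);
        }
        return results; // the full list is retained, not just a summary
    }

    /** Arithmetic mean of the timings, or 0.0 for an empty list. */
    static double mean(List<Long> values) {
        long sum = 0;
        for (long v : values) {
            sum += v;
        }
        return values.isEmpty() ? 0.0 : (double) sum / values.size();
    }

    public static void main(String[] args) {
        // Placeholder workload; replace it with the code under test and
        // size it so that a single run lasts several seconds.
        Runnable workload = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 1_000_000; i++) {
                sb.append(i % 10);
            }
            if (sb.length() != 1_000_000) {
                throw new IllegalStateException("unexpected length");
            }
        };
        List<Long> timings = time(workload, 5);
        System.out.println("All runs (ms): " + timings);
        System.out.println("Mean (ms): " + mean(timings));
    }
}
```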
What to measure?
Main: the wall-clock time (System.currentTimeMillis())
CPU time: time allocated on the CPU for a particular procedure
Memory size
Disk throughput
Network traffic, throughput, and latency
Java doesn’t provide mechanisms for measuring these values directly.
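For CPU time specifically, JDK 5 and later include the java.lang.management package, whose ThreadMXBean can report per-thread CPU time on VMs that support it. The sketch below is an illustration under that assumption; the class WallVsCpuTime and the method measure are hypothetical names. It measures the same task both ways so the two figures can be compared.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical example class; not part of the original slides.
class WallVsCpuTime {

    /**
     * Returns { wallMillis, cpuMillis } for one run of the task.
     * cpuMillis is -1 when the VM cannot report thread CPU time.
     */
    static long[] measure(Runnable task) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        boolean cpuSupported = threads.isCurrentThreadCpuTimeSupported();

        long wallStart = System.currentTimeMillis();
        long cpuStart = cpuSupported ? threads.getCurrentThreadCpuTime() : -1;
        task.run();
        long cpuEnd = cpuSupported ? threads.getCurrentThreadCpuTime() : -1;
        long wall = System.currentTimeMillis() - wallStart;

        // getCurrentThreadCpuTime() reports nanoseconds, or -1 if measurement is disabled.
        long cpu = (cpuStart < 0 || cpuEnd < 0) ? -1 : (cpuEnd - cpuStart) / 1_000_000;
        return new long[] { wall, cpu };
    }

    public static void main(String[] args) {
        // A sleeping task takes wall-clock time but almost no CPU time.
        long[] t = measure(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.println("wall ms: " + t[0] + ", cpu ms: " + t[1]);
    }
}
```

A large gap between the two values suggests the task is waiting (on I/O, locks, or sleep) rather than computing.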
Profiling Tools
Measurements and timings
Garbage collection
Method calls
Object-creation profiling
Monitoring gross memory usage
Measurements and Timings
Any profiler slows down the application it is profiling.
Using currentTimeMillis() is the only reliable way to take timings.
The OS interferes with the results by allocating different priorities to the process.
On certain OSes, foreground processes are given maximum priority.
Some cache effects can lead to wrong results.
Garbage Collection
Some of the commercial profilers provide statistics showing what the garbage collector is doing. Alternatively, use the -verbose:gc option with the VM.
With VM 1.4: java -Xloggc:<file>
The printout includes explicit synchronous calls to the garbage collector and asynchronous executions of the garbage collector when the free memory available gets low.
The important items that all -verbose:gc output includes are:
-the size of the heap after garbage collection
-the time taken to run the garbage collection
-the number of bytes reclaimed by the garbage collection.
Interesting values:
-Cost of GC to your application (percentage)
-Cost of the GC in the application’s processing time
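The percentage cost can also be estimated from inside a running VM. The sketch below is an assumption-laden illustration, not the method used in these slides: it relies on the java.lang.management API available from JDK 5 onward, and the class GcCost and its methods are hypothetical names. Note that the collection time a VM reports is approximate and, for concurrent collectors, is not the same as pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical example class; not part of the original slides.
class GcCost {

    /** Sums the approximate collection time (ms) reported by all collectors so far. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** GC time as a percentage of elapsed time; 0.0 when no time has elapsed. */
    static double gcPercent(long gcMillis, long elapsedMillis) {
        return elapsedMillis <= 0 ? 0.0 : 100.0 * gcMillis / elapsedMillis;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gc = totalGcMillis();
        System.out.printf("GC: %d ms of %d ms uptime (%.2f%%)%n",
                gc, uptime, gcPercent(gc, uptime));
    }
}
```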
GCViewer
Supported verbose:gc formats are:
Sun JDK 1.3.1/1.4 with the option -verbose:gc
Sun JDK 1.4 with the option -Xloggc:<file> (preferred)
IBM JDK 1.3.0/1.2.2 with the option -verbose:gc
GCViewer shows a number of lines:
Full GC Lines: Black vertical line at every Full GC
Inc GC Lines: Cyan vertical line at every Incremental GC
GC Times Line: Green line that shows the length of all GCs
Total Heap: Red line that shows heap size
Used Heap: Blue line that shows used heap size
GCViewer also provides some metrics:
Acc Pauses: Sum of all pauses due to GC
Avg Pause: Average length of a GC pause
Min Pause: Shortest GC pause
Max Pause: Longest GC pause
Total Time: Time data was collected for (only Sun 1.4 and IBM 1.3.0/1.2.2)
Footprint: Maximal amount of memory allocated
Throughput: Time percentage the application was NOT busy with GC
Freed Memory: Total amount of memory that has been freed
Freed Mem/Min: Amount of memory that has been freed per minute
Method Calls
1. Method profilers show where the bottlenecks in your code are and help you decide where to target your efforts.
2. Most method profilers work by sampling the call stack at regular intervals and recording the methods on the stack.
3. The JDK comes with a minimal profiler, obtained by using the -Xrunhprof option (depends on the JDK). This option produces a profile data file (java.hprof.txt).
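The sampling idea in point 2 can be sketched in a few lines. This is a toy illustration, not how hprof or any commercial profiler is implemented; the class StackSampler and its methods are hypothetical names, and the code assumes Java 8 or later. It periodically reads a thread's stack and counts which method is on top.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy sampler; not part of the original slides.
class StackSampler {

    /** Returns "Class.method" for the innermost frame, or null for an empty stack. */
    static String topFrame(StackTraceElement[] stack) {
        if (stack.length == 0) {
            return null;
        }
        return stack[0].getClassName() + "." + stack[0].getMethodName();
    }

    /** Samples the target thread until it finishes, counting top-of-stack methods. */
    static Map<String, Integer> sample(Thread target, long intervalMillis)
            throws InterruptedException {
        Map<String, Integer> hits = new HashMap<>();
        while (target.isAlive()) {
            String method = topFrame(target.getStackTrace());
            if (method != null) {
                hits.merge(method, 1, Integer::sum);
            }
            Thread.sleep(intervalMillis);
        }
        return hits;
    }

    public static void main(String[] args) throws InterruptedException {
        // Placeholder workload: spin for about half a second.
        Thread worker = new Thread(() -> {
            long end = System.currentTimeMillis() + 500;
            double x = 0;
            while (System.currentTimeMillis() < end) {
                x += Math.sqrt(x + 1);
            }
        });
        worker.start();
        System.out.println(sample(worker, 10));
    }
}
```

Methods with the most hits are where the sampled thread spent most of its time, which is the same signal a method profiler reports.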
Rolf’s Profile Viewer
For each method:
-a count of the number of times the method is invoked
-a short form of the class and method name itself
-the time spent in that method (in seconds)
-a bar graph of the time.
-All the methods that call the current method are listed in the caller pane.
-All the methods that the current method itself invokes are listed in the callee pane.
Object creation
Determine object numbers
Identifying where particular objects are created in the code.
The JDK provides only very rudimentary object-creation statistics.
Use a commercial tool instead of the one in the SDK.
Monitoring Gross Memory Usage
The JDK provides two methods for monitoring the amount of memory used by the runtime system: freeMemory() and totalMemory() in the java.lang.Runtime class.
totalMemory() returns a long, which is the number of bytes currently allocated to the runtime system for this particular VM process.
freeMemory() returns a long, which is the number of bytes available to the VM to create objects from the section of memory it controls.
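A minimal sketch of how the two calls combine: the memory currently in use is the total minus the free amount. The class HeapUsage and the method usedBytes are hypothetical names for illustration, and the figures are only approximate because the garbage collector may run at any moment.

```java
// Hypothetical example class; not part of the original slides.
class HeapUsage {

    /** Approximate number of bytes currently occupied by objects in this VM. */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long before = usedBytes();

        // Allocate roughly 10 MB and keep it reachable so it cannot be collected yet.
        byte[][] blocks = new byte[100][];
        for (int i = 0; i < blocks.length; i++) {
            blocks[i] = new byte[100_000];
        }

        long after = usedBytes();
        System.out.println("used before: " + before + " bytes");
        System.out.println("used after:  " + after + " bytes");
        System.out.println("blocks held: " + blocks.length);
    }
}
```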
Tools
(commercial) Optimizeit from Borland
(commercial) JProbe from Quest Software
(commercial) JProfiler from ej-technologies
(commercial) WebSphere Studio from IBM
(free) HPjmeter from Hewlett-Packard
(free) HPjtune
Tuning IO performance
The example consists of reading lines from a large file.
We compare different methods on 2 files:
small file with long lines
long file with short lines
We test our methods with five JVM configurations:
JVM 1.2.2
JVM 1.3.1
JVM 1.4.1
JVM 1.4.1 -server
JVM 1.5
Method 1 : Unbuffered input stream
Use the deprecated readLine() method of DataInputStream.
// Every read goes straight to the file: no buffering.
DataInputStream in = new DataInputStream(new FileInputStream(file));
String line;
while ((line = in.readLine()) != null) {
    doSomething(line);
}
in.close();
Method 2 : Buffered input stream
Use a BufferedInputStream to wrap the FileInputStream.
// Default-sized buffer between the file and the DataInputStream.
DataInputStream in = new DataInputStream(new BufferedInputStream(new FileInputStream(file)));
String line;
while ((line = in.readLine()) != null) {
    doSomething(line);
}
in.close();
Method 3 : 8K buffered input stream
Set the size of the buffer to 8192 bytes.
// Same as Method 2, with an explicit 8192-byte buffer.
DataInputStream in = new DataInputStream(new BufferedInputStream(new FileInputStream(file), 8192));
String line;
while ((line = in.readLine()) != null) {
    doSomething(line);
}
in.close();
Method 4 : Buffered reader
Use Readers instead of InputStreams, as recommended by the Javadoc for full portability, etc.
// Readers decode bytes to chars; BufferedReader.readLine() is not deprecated.
BufferedReader in = new BufferedReader(new FileReader(file));
String line;
while ((line = in.readLine()) != null) {
    doSomething(line);
}
in.close();
Method 5 : Custom-built reader
Let’s get down to some real tuning.
You know from general tuning practices that creating objects is overhead.
Up until now, we have used the readLine() method, which returns a string.
Suppose we avoid the String creation.
Better still, why not work directly on the underlying char array?
We need to implement the readLine() functionality with our own buffer, passing the buffer to the method that does the string processing.
Our implementation uses its own char array buffer.
It reads in characters to fill the buffer, then runs through the buffer looking for ends of lines.
Each time the end of a line is found, the buffer together with the start and end index of the line in that buffer, is passed to the doSomething() method.
This implementation avoids both String-creation overhead and the subsequent String-processing overhead.
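The slides do not show the implementation, so the following is only one possible sketch of the approach described above. The class CharBufferLineReader, the interface LineHandler, and the buffer-growing strategy are assumptions made for this example; it assumes Java 8 or later and treats '\n' (optionally preceded by '\r') as the line terminator.

```java
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.util.Arrays;

// Hypothetical sketch; the original implementation is not shown in the slides.
class CharBufferLineReader {

    /** Receives each line as the slice [start, end) of a shared buffer. */
    interface LineHandler {
        void doSomething(char[] buffer, int start, int end);
    }

    /** Reads all lines from in, passing buffer slices instead of Strings. bufferSize must be > 0. */
    static void readLines(Reader in, int bufferSize, LineHandler handler) throws IOException {
        char[] buf = new char[bufferSize];
        int filled = 0; // number of valid chars currently in buf
        int n;
        while ((n = in.read(buf, filled, buf.length - filled)) != -1) {
            int scanFrom = filled; // chars before this were already scanned
            filled += n;
            int lineStart = 0;
            for (int i = scanFrom; i < filled; i++) {
                if (buf[i] == '\n') {
                    // Drop a '\r' that directly precedes the '\n'.
                    int end = (i > lineStart && buf[i - 1] == '\r') ? i - 1 : i;
                    handler.doSomething(buf, lineStart, end);
                    lineStart = i + 1;
                }
            }
            // Move the unfinished tail of the last line to the front of the buffer.
            filled -= lineStart;
            System.arraycopy(buf, lineStart, buf, 0, filled);
            if (filled == buf.length) {
                // A single line is longer than the buffer: grow it.
                buf = Arrays.copyOf(buf, buf.length * 2);
            }
        }
        if (filled > 0) {
            handler.doSomething(buf, 0, filled); // last line without a terminator
        }
    }

    public static void main(String[] args) throws IOException {
        final int[] counts = new int[2]; // [0] = lines, [1] = chars
        try (Reader in = new FileReader(args[0])) {
            readLines(in, 8192, (buf, start, end) -> {
                counts[0]++;
                counts[1] += end - start;
            });
        }
        System.out.println(counts[0] + " lines, " + counts[1] + " chars");
    }
}
```

The handler must finish with the slice before returning, because the buffer contents are overwritten by the next read.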
Method 6 : Custom reader and converter
Better still, perform the byte-to-char conversion ourselves.
Change the FileReader to a FileInputStream and add a byte array buffer of the same size as the char array buffer.
Create a convert() method that converts the byte buffer to the char buffer.
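The slides do not show convert(), so this sketch makes a strong assumption: the file uses a single-byte encoding such as ISO-8859-1 (or plain ASCII), where each byte maps to the char with the same numeric value. The class name ByteToCharConverter and the method signature are hypothetical. For other encodings this shortcut gives wrong results.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch; assumes ISO-8859-1 (one byte per char) input.
class ByteToCharConverter {

    /** Converts the first length bytes into chars; returns the number of chars written. */
    static int convert(byte[] bytes, int length, char[] chars) {
        for (int i = 0; i < length; i++) {
            // Mask to treat the byte as unsigned (0..255) before widening.
            chars[i] = (char) (bytes[i] & 0xFF);
        }
        return length;
    }

    public static void main(String[] args) throws IOException {
        byte[] byteBuf = new byte[8192];
        char[] charBuf = new char[byteBuf.length]; // same size as the byte buffer
        long total = 0;
        try (InputStream in = new FileInputStream(args[0])) {
            int n;
            while ((n = in.read(byteBuf)) != -1) {
                total += convert(byteBuf, n, charBuf);
                // charBuf[0..n) can now be scanned for line ends as in Method 5.
            }
        }
        System.out.println("chars converted: " + total);
    }
}
```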
Results with small file
The file contains 10000 lines of 100 characters (977 KB).
Results with long file
The file contains 35000 lines of 50 characters (1.7 MB).