Java Performance Tuning Tips

Why is it slow?

The virtual machine layer that abstracts Java away from the underlying hardware adds overhead.

This overhead can cause a Java application to run slower than an equivalent application written in a lower-level language.

Java’s advantages (platform independence, memory management, powerful exception checking, built-in multi-threading, dynamic resource loading and security checks) all add costs.

The tuning game

Performance tuning is similar to playing a strategy game.

Your target is to get a better score than the last score after each attempt.

You are playing with, not against, the computer, the programmer, the design, the compiler.

Techniques include switching compilers, turning on optimizations, using a different VM, and finding two or three bottlenecks in the code that have simple fixes.

System limitations

Three resources limit all applications:

CPU speed and availability

System memory

Disk (and network) input/output

The first step in tuning is to determine which of these is causing your application to run slowly.

When you fix a bottleneck, it is normal for the next bottleneck to come from a different one of these limitations.

A tuning strategy
1. Identify the main bottlenecks (look for about the top five bottlenecks).
2. Choose the quickest and easiest one to fix, and address it.
3. Repeat from Step 1.

Advantage:
– once a bottleneck has been eliminated, the characteristics of the application change, and the topmost bottleneck may no longer need to be addressed.

Identifying bottlenecks
1. Measure the performance by using profilers and benchmark suites.
2. Identify the location of any bottlenecks.
3. Think of a hypothesis for the cause of the bottleneck.
4. Consider any factors that may refute your hypothesis.
5. Create a test to isolate the factor identified by the hypothesis.
6. Test the hypothesis.
7. Alter the application to reduce the bottleneck.
8. Test that the alteration improves performance, and measure the improvement.
9. Repeat from Step 1.

Perceived Performance

Users have a particular view of performance that allows you to cut some corners.

Ex: A browser that gives a running countdown of the amount left to be downloaded from a server is seen as faster than one that just sits there until all the data is downloaded.

Rules:
If the application is unresponsive for more than 2 seconds, it is seen as slow.
Users are not aware of response-time improvements of less than 20%.

 

How to appear quicker?

Threading: ensure that your application remains responsive to the user, even while it is executing some other function.

Streaming: display a partial result of the activity while continuing to compile more results in the background (very useful in distributed systems).

Caching: caching techniques speed up data access. For example, the read-ahead algorithms used in disk hardware make reading forward through a file fast.
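As a rough illustration of the threading point, the sketch below hands a slow computation to a background thread so the foreground loop stays free to report progress. The class name ResponsiveTask, the method slowSum() and the polling interval are all hypothetical, and the java.util.concurrent classes it uses require Java 5 or later.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative only: names and workload are assumptions, not part of the original article.
class ResponsiveTask {

    // Hypothetical slow computation standing in for real work.
    static long slowSum(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += i;
        }
        return sum;
    }

    // Starts the computation on a pool thread and returns immediately.
    static Future<Long> startInBackground(ExecutorService pool, final int n) {
        return pool.submit(new Callable<Long>() {
            public Long call() {
                return slowSum(n);
            }
        });
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<Long> result = startInBackground(pool, 100000000);
        // The foreground thread stays free to give the user feedback.
        while (!result.isDone()) {
            System.out.println("Working...");
            Thread.sleep(100);
        }
        System.out.println("Result: " + result.get());
        pool.shutdown();
    }
}
```

In a GUI application the same idea applies, with the progress message replaced by whatever keeps the interface alive.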

Starting to tune

User agreements: you should agree with your users on what the performance of the application is expected to be: response times, systemwide throughput, maximum number of users, data, …

Setting benchmarks: these are precise specifications stating what part of the code needs to run in what amount of time.

How much faster, in which parts, and for how much effort?

Without clear performance objectives, tuning will never be completed.

 

Taking Measurements

Each run of your benchmarks needs to be under conditions that are as identical as possible.

The benchmark should be run multiple times, and the full list of results retained, not just the average and deviation.

Run an initial benchmark to specify how far you need to go and to highlight how much you have achieved when you finish tuning.

Make your benchmark long enough (over 5 seconds).
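A minimal sketch of such a benchmark run might look like the following. The class name BenchmarkRunner and the placeholder workload are assumptions made for illustration; the point is only that every timing is kept, not just a summary.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative harness: class name, method name and workload are hypothetical.
class BenchmarkRunner {

    // Runs the task `runs` times and returns every wall-clock timing in milliseconds.
    static List<Long> time(Runnable task, int runs) {
        List<Long> timings = new ArrayList<Long>();
        for (int i = 0; i < runs; i++) {
            long start = System.currentTimeMillis();
            task.run();
            timings.add(System.currentTimeMillis() - start);
        }
        return timings;
    }

    public static void main(String[] args) {
        // Placeholder workload; a real benchmark should run for several seconds.
        Runnable workload = new Runnable() {
            public void run() {
                StringBuilder sb = new StringBuilder();
                for (int i = 0; i < 200000; i++) {
                    sb.append(i);
                }
            }
        };
        // Keep and print the full list of results, not only an average.
        List<Long> timings = time(workload, 5);
        System.out.println("All timings (ms): " + timings);
    }
}
```

The first runs are often slower than later ones while the VM warms up, which is one reason to keep the whole list.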

 

What to measure?

Main: the wall-clock time (System.currentTimeMillis())

CPU time: time allocated on the CPU for a particular procedure

Memory size

Disk throughput

Network traffic, throughput, and latency

Java doesn’t provide mechanisms for measuring these values directly.
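On newer VMs (Java 5 and later), the java.lang.management package exposes per-thread CPU time where the platform supports it. The sketch below compares wall-clock time and CPU time for one task. The class name CpuVersusWallClock and the sleeping workload are hypothetical, and whether CPU timing is available depends on the VM.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Illustrative only: assumes a Java 5+ VM; names are not from the original article.
class CpuVersusWallClock {

    // Returns {wallMillis, cpuMillis} for one run of the task on the current thread.
    // cpuMillis is -1 when the VM does not support per-thread CPU timing.
    static long[] measure(Runnable task) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        boolean cpuSupported = threads.isCurrentThreadCpuTimeSupported();

        long wallStart = System.currentTimeMillis();
        long cpuStart = cpuSupported ? threads.getCurrentThreadCpuTime() : 0L;
        task.run();
        long cpuEnd = cpuSupported ? threads.getCurrentThreadCpuTime() : 0L;
        long wallMillis = System.currentTimeMillis() - wallStart;

        // getCurrentThreadCpuTime() reports nanoseconds.
        long cpuMillis = cpuSupported ? (cpuEnd - cpuStart) / 1000000L : -1L;
        return new long[] { wallMillis, cpuMillis };
    }

    public static void main(String[] args) {
        // Sleeping uses wall-clock time but almost no CPU time.
        long[] result = measure(new Runnable() {
            public void run() {
                try {
                    Thread.sleep(200);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        System.out.println("wall ms = " + result[0] + ", cpu ms = " + result[1]);
    }
}
```

A large gap between the two numbers suggests the code is waiting (on I/O, locks or sleep) rather than computing.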

 

Profiling Tools

Measurements and timings

Garbage collection

Method calls

Object-creation profiling

Monitoring gross memory usage

 

Measurements and Timings

Any profiler slows down the application it is profiling.

Timing the code yourself with System.currentTimeMillis() is the only reliable way to measure elapsed time.

The OS interferes with the results by allocating different priorities to the process.

On certain operating systems, foreground processes are given maximum priority.

Some cache effects can lead to wrong results.

Garbage Collection

Some of the commercial profilers provide statistics showing what the garbage collector is doing. Alternatively, use the -verbosegc option with the VM.

With VM 1.4: java -Xloggc:<file>

The printout includes explicit synchronous calls to the garbage collector and asynchronous executions of the garbage collector when available free memory gets low.

The important items in all -verbosegc output are:

-the size of the heap after garbage collection

-the time taken to run the garbage collection

-the number of bytes reclaimed by the garbage collection.

Interesting values:

-Cost of GC to your application (percentage)

-Cost of the GC in the application’s processing time
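The GC cost can also be estimated from inside the program. The sketch below is one possible approach, assuming a Java 5 or later VM where GarbageCollectorMXBean is available. The class name GcCost and the allocation loop are hypothetical, and the collection times reported by the VM are approximate.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Illustrative only: assumes Java 5+; names and workload are not from the original article.
class GcCost {

    // Sums the approximate accumulated collection time (ms) over all collectors.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when undefined for this collector
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    // GC cost as a percentage of elapsed time.
    static double gcCostPercent(long gcMillis, long elapsedMillis) {
        return elapsedMillis <= 0 ? 0.0 : 100.0 * gcMillis / elapsedMillis;
    }

    public static void main(String[] args) {
        long gcBefore = totalGcMillis();
        long start = System.currentTimeMillis();

        // Placeholder workload that creates many short-lived objects.
        long checksum = 0;
        for (int i = 0; i < 2000000; i++) {
            checksum += new int[16].length;
        }

        long elapsed = System.currentTimeMillis() - start;
        long gcMillis = totalGcMillis() - gcBefore;
        System.out.println("elapsed ms = " + elapsed + ", GC ms = " + gcMillis
                + ", GC cost = " + gcCostPercent(gcMillis, elapsed) + "% (checksum " + checksum + ")");
    }
}
```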

GC Viewer

Supported verbose:gc formats are:

Sun JDK 1.3.1/1.4 with the option -verbose:gc

Sun JDK 1.4 with the option -Xloggc:<file> (preferred)

IBM JDK 1.3.0/1.2.2 with the option -verbose:gc

GCViewer shows a number of lines :

Full GC Lines: Black vertical line at every Full GC

Inc GC Lines: Cyan vertical line at every Incremental GC

GC Times Line: Green line that shows the length of all GCs

Total Heap: Red line that shows heap size

Used Heap: Blue line that shows used heap size

GCViewer also provides some metrics:

Acc Pauses: Sum of all pauses due to GC

Avg Pause: Average length of a GC pause

Min Pause: Shortest GC pause

Max Pause: Longest GC pause

Total Time: Time data was collected for (only Sun 1.4 and IBM 1.3.0/1.2.2)

Footprint: Maximal amount of memory allocated

Throughput: Percentage of time the application was NOT busy with GC

Freed Memory: Total amount of memory that has been freed

Freed Mem/Min: Amount of memory that has been freed per minute

GC viewer

 

Method Calls
1. Method profilers show where the bottlenecks in your code are and help you decide where to target your efforts.
2. Most method profilers work by sampling the call stack at regular intervals and recording the methods on the stack.
3. The JDK comes with a minimal profiler, obtained by using the -Xrunhprof option (depends on the JDK). This option produces a profile data file (java.hprof.txt).
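To make the sampling idea concrete, here is a toy sampler. It is only a sketch of the technique, not how hprof or any commercial profiler is implemented. It assumes Java 5 or later for Thread.getStackTrace(), and the class name StackSampler and the busy-loop worker are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Toy sampling profiler: illustrative only, names are not from the original article.
class StackSampler {

    // Samples the target thread's stack `samples` times, `intervalMillis` apart,
    // and counts how often each method is found at the top of the stack.
    static Map<String, Integer> sample(Thread target, int samples, long intervalMillis) {
        Map<String, Integer> hits = new HashMap<String, Integer>();
        for (int i = 0; i < samples; i++) {
            StackTraceElement[] stack = target.getStackTrace();
            if (stack.length > 0) {
                String method = stack[0].getClassName() + "." + stack[0].getMethodName();
                Integer old = hits.get(method);
                hits.put(method, old == null ? 1 : old + 1);
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical busy worker to be profiled.
        Thread worker = new Thread(new Runnable() {
            public void run() {
                double x = 0;
                while (!Thread.currentThread().isInterrupted()) {
                    x += Math.sqrt(x + 1);
                }
            }
        });
        worker.setDaemon(true);
        worker.start();
        Map<String, Integer> hits = sample(worker, 50, 10);
        worker.interrupt();
        System.out.println("Samples per method: " + hits);
    }
}
```

Methods that appear in many samples are where the thread spends most of its time, which is the information a method profiler reports.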

Rolf’s Profile Viewer
For each method

-a count of the number of times the method is invoked

-a short form of the class and method name itself

-the time spent in that method (in seconds)

-a bargraph of the time.

-All the methods which call the current method are listed in the caller pane

-All the methods that the current method itself invokes are listed in the callee pane.

Heap View, Heap Tree view, Rolf’s Profile Viewer

 

Object creation

Determine object numbers

Identifying where particular objects are created in the code.

The JDK provides very rudimentary object-creation statistics.

Use a commercial tool in place of the SDK.

 

Monitoring Gross Memory Usage

The JDK provides two methods for monitoring the amount of memory used by the runtime system: freeMemory() and totalMemory() in the java.lang.Runtime class.

totalMemory() returns a long, which is the number of bytes currently allocated to the runtime system for this particular VM process.

freeMemory() returns a long, which is the number of bytes available to the VM to create objects from the section of memory it controls.
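Subtracting one value from the other gives a rough figure for the memory currently in use. The sketch below shows this. The class name MemoryMonitor and the sample allocation are hypothetical, and the numbers are only approximate because the garbage collector can run at any time.

```java
// Illustrative only: class name and allocation sizes are assumptions.
class MemoryMonitor {

    // Bytes currently in use: allocated to the VM minus what is still free.
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long before = usedBytes();

        // Hypothetical allocation of about 4 MB, kept reachable while we measure.
        byte[][] blocks = new byte[64][];
        for (int i = 0; i < blocks.length; i++) {
            blocks[i] = new byte[64 * 1024];
        }

        long after = usedBytes();
        System.out.println("used before = " + before + " bytes, used after = " + after
                + " bytes (" + blocks.length + " blocks allocated)");
    }
}
```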

Tools

(commercial) Optimizeit from Borland

(commercial) JProbe from Quest Software

(commercial) JProfiler from ej-technologies

(commercial) WebSphere Studio from IBM

(free) HPjmeter from Hewlett-Packard

(free) HPjtune

 

Tuning IO performance

The example consists of reading lines from a large file.

We compare different methods on two files:

small file with long lines

long file with short lines

We test our methods with five JVM configurations:

JVM 1.2.2

JVM 1.3.1

JVM 1.4.1

JVM 1.4.1 -server

JVM 1.5

 

Method 1 : Unbuffered input stream

Use the deprecated readLine() method of DataInputStream.

DataInputStream in = new DataInputStream(new FileInputStream(file));
String line;
// Every read goes straight to the file: there is no buffering.
while ((line = in.readLine()) != null) {
    doSomething(line);
}
in.close();

 

Method 2 : Buffered input stream

Use a BufferedInputStream to wrap the FileInputStream.

DataInputStream in = new DataInputStream(
    new BufferedInputStream(new FileInputStream(file)));
String line;
// Reads are now served from the stream's default-size buffer.
while ((line = in.readLine()) != null) {
    doSomething(line);
}
in.close();

Method 3 : 8K buffered input stream

Set the size of the buffer to 8192 bytes.

DataInputStream in = new DataInputStream(
    new BufferedInputStream(new FileInputStream(file), 8192));
String line;
// Same as Method 2, but with an explicit 8K buffer.
while ((line = in.readLine()) != null) {
    doSomething(line);
}
in.close();

Method 4 : Buffered reader

Use Readers instead of InputStreams, as the Javadoc recommends for reading text portably.

BufferedReader in = new BufferedReader(new FileReader(file));
String line;
// BufferedReader.readLine() is the non-deprecated way to read lines of text.
while ((line = in.readLine()) != null) {
    doSomething(line);
}
in.close();

Method 5 : Custom-built reader

Let’s get down to some real tuning.

You know from general tuning practices that creating objects adds overhead.

Up until now, we have used the readLine() method, which returns a String.

Suppose we avoid the String creation.

Better still, why not work directly on the underlying char array?
We need to implement the readLine() functionality with our own buffer, passing that buffer to the method that does the string processing.

Our implementation uses its own char array buffer.

It reads in characters to fill the buffer, then runs through the buffer looking for ends of lines.

Each time the end of a line is found, the buffer, together with the start and end index of the line in that buffer, is passed to the doSomething() method.

This implementation avoids both String-creation overhead and the subsequent String-processing overhead.
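The description above could be sketched as follows. This is not the article's actual implementation, which is not shown. The class name CharBufferLineReader, the LineHandler interface and the buffer-growing strategy are assumptions, and only "\n" and "\r\n" line endings are handled.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Illustrative sketch: names and details are assumptions, not the article's code.
class CharBufferLineReader {

    // Hypothetical callback standing in for the article's doSomething().
    interface LineHandler {
        void doSomething(char[] buf, int start, int end); // end is exclusive
    }

    static void readLines(Reader in, int bufferSize, LineHandler handler) throws IOException {
        char[] buf = new char[bufferSize];
        int filled = 0; // number of valid chars currently in buf
        int n;
        while ((n = in.read(buf, filled, buf.length - filled)) != -1) {
            filled += n;
            int lineStart = 0;
            // Scan the buffer for line ends and hand each line over without creating a String.
            for (int i = 0; i < filled; i++) {
                if (buf[i] == '\n') {
                    int end = (i > lineStart && buf[i - 1] == '\r') ? i - 1 : i;
                    handler.doSomething(buf, lineStart, end);
                    lineStart = i + 1;
                }
            }
            // Move the unfinished line to the front of the buffer.
            filled -= lineStart;
            System.arraycopy(buf, lineStart, buf, 0, filled);
            // A line longer than the buffer: grow the buffer so reading can continue.
            if (filled == buf.length) {
                char[] bigger = new char[buf.length * 2];
                System.arraycopy(buf, 0, bigger, 0, filled);
                buf = bigger;
            }
        }
        if (filled > 0) {
            handler.doSomething(buf, 0, filled); // last line without a terminator
        }
    }

    public static void main(String[] args) throws IOException {
        Reader in = new StringReader("first line\nsecond line\r\nlast line");
        readLines(in, 8192, new LineHandler() {
            public void doSomething(char[] buf, int start, int end) {
                System.out.println("line of " + (end - start) + " chars");
            }
        });
    }
}
```

The handler must finish with the buffer contents before it returns, because the same array is reused for the next read.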

Method 6 : Custom reader and converter

Better still, perform the byte-to-char conversion ourselves.

Change the FileReader to a FileInputStream and add a byte array buffer of the same size as the char array buffer.

Create a convert() method that converts the byte buffer to the char buffer.
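One simple form such a convert() method could take is shown below. It assumes the file is encoded in ISO-8859-1 (or plain ASCII), so that each byte maps to exactly one char. That assumption is not stated in the article, and the sketch would corrupt multi-byte encodings such as UTF-8. The class name Latin1Converter is hypothetical.

```java
// Illustrative only: assumes single-byte ISO-8859-1/ASCII input.
class Latin1Converter {

    // Copies `length` bytes into the char buffer, one char per byte.
    // Masking with 0xFF keeps bytes above 127 from turning into negative values.
    static int convert(byte[] bytes, int length, char[] chars) {
        for (int i = 0; i < length; i++) {
            chars[i] = (char) (bytes[i] & 0xFF);
        }
        return length;
    }

    public static void main(String[] args) {
        byte[] bytes = { 'J', 'a', 'v', 'a' };
        char[] chars = new char[bytes.length];
        int n = convert(bytes, bytes.length, chars);
        System.out.println(new String(chars, 0, n)); // prints Java
    }
}
```

Skipping the general-purpose character decoder is where the time is saved, at the cost of supporting only this one encoding.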

Results with small file

Custom reader and converter test results

The file contains 10000 lines of 100 characters (977 KB).

 

Results with long file

 

The file contains 35000 lines of 50 characters (1.7 MB).

 

Reference Links

www.javaperformancetuning.com

www-2.cs.cmu.edu/~jch/java/optimization.html

www.cs.utexas.edu/users/toktb/J-Breeze/javaperform.tips.html

www.javagrande.com

http://java.sun.com/j2se/1.4.1/docs/guide/jvmpi/jvmpi.html

www.run.montefiore.ulg.ac.be/~skivee/java-perf/
