Item 68: Tune the JVM

With the release of JDK 1.3, Sun made an important enhancement to the Java Virtual Machine: specifically, they released Hotspot, an optimizing virtual machine that promised better garbage collection, better JIT compilation to native code, and a whole slew of other enhancements intended to make Java applications run faster. In fact, two such VMs were created, one for client-based applications and one for server-based applications. The client VM was optimized for short-running applications (like your average Swing application), favoring short-term optimizations over long-term ones; the server VM was optimized in the opposite direction.

Amazing, then, isn't it, that most default installations of Java-based J2EE containers (as opposed to native-code implementations) don't make use of the VM tuned specifically for long-running server operations?

To understand what I mean, we have to take a quick side trip into the JVM invocation code. You can get this as part of the standard J2SDK download if you turn on the Install Sources checkbox during the install. When the installation program finishes installing, a roughly 10MB-sized zip file shows up in the root of the J2SDK directory called src.zip. Exploding that zip file yields, among other things, a "launcher" directory, which includes the source code to the java.exe launcher. In particular, the core of what we're interested in is the java.c source file. (The other C source file, java_md.c, is the machine-dependent support routine that varies based on platform: Linux, Win32, or Solaris, depending on which JDK you're currently using.)

The aspects of the java launcher that we're interested in come fairly early in the C file:






/*
 * Entry point.
 */
int
main(int argc, char ** argv)
{
  JavaVM *vm = 0;
  JNIEnv *env = 0;
  char *jarfile = 0;
  char *classname = 0;
  char *s = 0;
  jclass mainClass;
  jmethodID mainID;
  jobjectArray mainArgs;
  int ret;
  InvocationFunctions ifn;
  char *jvmtype = 0;
  jlong start, end;
  char jrepath[MAXPATHLEN], jvmpath[MAXPATHLEN];
  char ** original_argv = argv;

  /* ... Code elided for brevity ... */

  /* Find out where the JRE is that we will be using. */
  if (!GetJREPath(jrepath, sizeof(jrepath)))
  {
    fprintf(stderr,
      "Error: could not find Java 2 Runtime Environment.\n");
    return 2;
  }

  /* ... Code elided for brevity ... */

  /* Find the specified JVM type */
  if (ReadKnownVMs(jrepath) < 1)
  {
    fprintf(stderr,
      "Error: no known VMs. (check for corrupt jvm.cfg file)\n");
    exit(1);
  }
  jvmtype = CheckJvmType(&argc, &argv);

  jvmpath[0] = '\0';
  if (!GetJVMPath(jrepath, jvmtype, jvmpath, sizeof(jvmpath)))
  {
    fprintf(stderr,
      "Error: no '%s' JVM at '%s'.\n", jvmtype, jvmpath);
    return 4;
  }
  /* If we got here, jvmpath has been correctly initialized. */

  /* ... Rest of main() elided for brevity ... */
}


For those readers who are a bit rusty in C (who isn't, these days?), main essentially calls out to a utility function, also inside this file, called ReadKnownVMs to determine what the available VM options are within the JDK, then compares them against the parameters passed on the java.exe command line to discover which one the user wishes to use. Once the VM type, stored in the local variable jvmtype, is known, the launcher resolves it to a library path (stored in jvmpath) and uses the standard JNI Invocation API to create the JVM, find the class to load, and invoke its main method.

Drilling into ReadKnownVMs (not shown here) reveals that the list of VMs known to the JDK is controlled by nothing more technically complex than a text file—in this particular case, a text file called jvm.cfg, stored in a CPU-specific directory (e.g., for Win32 it's called i386) underneath the lib subdirectory of the Java Runtime Environment (JRE). This subdirectory structure and filename are hard-coded into the launcher code, by the way; there's no way to change this without writing your own launcher. Under JDK 1.4.1, for example, the jvm.cfg file looks like the following:






#
# @(#)jvm.cfg  1.6 01/12/03
#
# Copyright 2002 Sun Microsystems, Inc. All rights reserved.
# SUN PROPRIETARY/CONFIDENTIAL. Use is subject to license terms.
#
# List of JVMs that can be used as an option to java, javac, etc.
# Order is important -- first in this list is the default JVM.
# NOTE that both this file and its format are UNSUPPORTED and
# WILL GO AWAY in a future release.
#
# You may also select a JVM in an arbitrary location with the
# "-XXaltjvm=<jvm_dir>" option, but that too is unsupported
# and may not be available in a future release.
#
-client KNOWN
-server KNOWN
-hotspot ALIASED_TO -client
-classic WARN
-native ERROR
-green ERROR


And, in case the comments in the file don't give away the punchline, the key here is that the first option in the list is the default option. This means that unless you specify -server as part of the J2EE server invocation command line, you're running with the client Hotspot VM. (In earlier releases of the JDK, Hotspot was the product name given to the highly optimized JVM, and it was generally recommended for use on client applications; as of JDK 1.3, Hotspot comes in two possible configurations, client and server, and both are called Hotspot.)

If you're not quite sure which JVM is being invoked, you can see the options the java.exe launcher uses by turning on an undocumented environment variable, _JAVA_LAUNCHER_DEBUG, which will display a collection of interesting details about the JVM and the options used to create it:






C:\Prg\Test>set _JAVA_LAUNCHER_DEBUG=1

C:\Prg\Test>java Hello
----_JAVA_LAUNCHER_DEBUG----
JRE path is C:\Prg\java\j2sdk1.4.1\jre
jvm.cfg[0] = ->-client<-
jvm.cfg[1] = ->-server<-
jvm.cfg[2] = ->-hotspot<-
jvm.cfg[3] = ->-classic<-
jvm.cfg[4] = ->-native<-
jvm.cfg[5] = ->-green<-
41768 micro seconds to parse jvm.cfg
JVM path is C:\Prg\java\j2sdk1.4.1\jre\bin\client\jvm.dll
9816 micro seconds to LoadJavaVM
JavaVM args:
    version 0x00010002, ignoreUnrecognized is JNI_FALSE,
                        nOptions is 2
    option[ 0] = '-Djava.class.path=.'
    option[ 1] = '-Dsun.java.command=Hello'
250409 micro seconds to InitializeJVM
Main-Class is 'Hello'
Apps' argc is 0
317641 micro seconds to load main class
----_JAVA_LAUNCHER_DEBUG----
Hello, world!

C:\Prg\Test>java -server Hello
----_JAVA_LAUNCHER_DEBUG----
JRE path is C:\Prg\java\j2sdk1.4.1\jre
jvm.cfg[0] = ->-client<-
jvm.cfg[1] = ->-server<-
jvm.cfg[2] = ->-hotspot<-
jvm.cfg[3] = ->-classic<-
jvm.cfg[4] = ->-native<-
jvm.cfg[5] = ->-green<-
31218 micro seconds to parse jvm.cfg
JVM path is C:\Prg\java\j2sdk1.4.1\jre\bin\server\jvm.dll
25056 micro seconds to LoadJavaVM
JavaVM args:
    version 0x00010002, ignoreUnrecognized is JNI_FALSE,
                        nOptions is 2
    option[ 0] = '-Djava.class.path=.'
    option[ 1] = '-Dsun.java.command=Hello'
288153 micro seconds to InitializeJVM
Main-Class is 'Hello'
Apps' argc is 0
398426 micro seconds to load main class
----_JAVA_LAUNCHER_DEBUG----
Hello, world!

C:\Prg\Test>


The giveaway here is the JVM path is... line. When running with the client VM, the JVM used is the DLL stored in jre/bin/client/jvm.dll, whereas when invoked explicitly with the -server flag, the JVM used is jre/bin/server/jvm.dll.

To fix this, if you can get at the command line used to invoke the JVM, add the -server option into the command-line parameters. If this is somehow hidden from you, simply rearrange the options found in the JDK's jvm.cfg file to list the -server option first. Bear in mind, however, the comment at the top of the jvm.cfg file: this file format is subject to change and may be replaced by a different mechanism sometime in the future. (Case in point: the JDK 1.3 mechanism didn't make use of the keywords following each entry: KNOWN, ALIASED_TO, and so forth.) Future J2SDK releases may change this mechanism, so be prepared to do a little spelunking when J2SDK 1.5 is released.
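Concretely, under the JDK 1.4.1 file shown earlier, making -server the default is just a matter of swapping the first two lines (the comment block is omitted here for brevity):

```
-server KNOWN
-client KNOWN
-hotspot ALIASED_TO -client
-classic WARN
-native ERROR
-green ERROR
```

With this in place, a bare java invocation loads jre/bin/server/jvm.dll, and clients that actually want the client VM must now ask for it explicitly with -client.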

Note that you can also pick up a slight one-time performance boost at startup by stripping the extraneous options out of the file (-classic, -native, and -green, plus the huge comment block at the top); this will reduce the average parse time from around 30,000 microseconds to around 9,000. Not much, but every microsecond counts sometimes.

It's also possible to specify the location of an alternate VM via the -XXaltjvm command-line option, but until alternative JVMs become more of a commodity than they are as of this writing, this won't be of practical use. This is also a Sun JVM-only feature (and undocumented to boot) and may go away in a future release, so use it with caution.

Note that the Hotspot JVM isn't the only option here; in particular, BEA acquired JRockit, a JRE replacement tuned for server operations. JRockit essentially replaces the Sun JRE, so making use of it simply involves pointing your PATH at the java launcher in the JRockit installation tree instead of the Sun JRE.

Once the right JVM is executing, and you've profiled your code to find any obvious bottlenecks or choke points, at some point it becomes attractive to configure the JVM itself to work better with your system's garbage collection, threading policies, and so on. In some cases, this knowledge will reduce the portability of the system, since your implementation may come to depend on the behavior of the underlying JVM, but this may or may not be a bad thing, depending on your reaction to Item 11.

To begin, the JVM spends a tremendous amount of time allocating memory when the system is starting up since the default starting memory footprint for the JVM is a measly 2MB. This is fine for such worthy applications as "Hello, world!" and "Goodbye, world!" but hardly satisfying for an enterprise application running a servlet container or EJB container. Pass a reasonable value to the -Xms option, based on realistic profiling statistics: write a simple servlet to spit back the value from Runtime.totalMemory, deploy that to your servlet container, shut it down, start it back up, and hit that servlet right away. This will give a rough estimation of the servlet container's overhead requirements.
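The measurement itself is trivial; everything interesting comes from java.lang.Runtime. A minimal sketch of the reporting logic follows, written as a plain class (the class name is mine) so you can run it from the command line; in the servlet version you'd write the same values into the response from doGet instead:

```java
public class MemoryReport {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // totalMemory: heap currently allocated from the OS (what -Xms influences)
        System.out.println("total: " + rt.totalMemory() + " bytes");
        // freeMemory: the unused portion of that allocated heap
        System.out.println("free:  " + rt.freeMemory() + " bytes");
        // maxMemory: the ceiling the heap may grow to (what -Xmx sets)
        System.out.println("max:   " + rt.maxMemory() + " bytes");
    }
}
```

Run it once with java MemoryReport and again with, say, java -Xms64m -Xmx256m MemoryReport to watch the flags take effect.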

Similarly, the JVM establishes a maximum heap size it will grow to. Many Java resources state that this is the value at which the JVM will begin garbage collection, but the truth is a bit more complex than that. What is true, however, is that the JVM will never grow beyond this value; by default, the JVM establishes 64MB as the upper limit. Use the -Xmx option to bump this value up to something more reasonable, based on profiling statistics (gathered either by using a commercial profiler or the servlet mentioned above) during a reasonable approximation of peak-load testing on your system.

Be very careful with passing large values to -Xmx because as the heap grows larger, it becomes exponentially harder to scan for objects; realistically, if you're looking at maximum heap size parameters of larger than 1GB to the JVM, you might want to experiment with running multiple JVM processes of smaller size rather than one large one. If you're worried about taking the overhead of the JVM multiple times (once per process), be cheered by the fact that (a) most operating systems will silently map code segments in multiple processes to the same physical memory, thus obviating the need to load duplicate code twice into physical memory, and (b) the J2SE 1.5 release is introducing "code sharing" into the JVM itself, to try to avoid the overhead of the loaded-and-defined classes, for example, among other things. Again, you'll want to profile this approach (see Item 10) before committing to it in production.

Another option is to set the starting heap size to the same value as the maximum heap size to avoid resizing the heap as it grows beyond the starting heap size, but this assumes that the JVM will never actually release allocated heap memory when requirements shrink. Although this is the case for Sun JVMs through release 1.4, it's supposedly something that will get fixed "sooner or later," and since it's rare that the JVM is the only application executing on that machine, it's usually polite to return something (like memory) back to its owner (the operating system) when you're not using it anymore.

While we're on the subject of releasing things, sometimes developers will put in calls to System.gc in order to force finalization of objects that have external resources that need release. On top of being incredibly wasteful (this will trigger a full garbage collection sweep), the garbage collection pass is never guaranteed to find the unused objects and collect them—as explained in Item 72, garbage collection algorithms may in fact not release the object right away despite its unreachable status. Instead, the far better approach is to explicitly release the object's resources using its close or dispose method, assuming it has one. (If it doesn't have one, and it's not using Reference objects or shutdown hooks to ensure release, drop it like a hot potato, fire off an angry e-mail to the support department of the vendor you got it from, and find something else to use in its place.)
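In other words, release the resource deterministically instead of nudging the collector. A minimal sketch, using FileInputStream as a stand-in for any resource-holding object with a close method (the class and method names here are mine):

```java
import java.io.FileInputStream;
import java.io.IOException;

public class ExplicitRelease {
    /** Reads the first byte of a file, guaranteeing the handle is released. */
    public static int firstByte(String filename) throws IOException {
        FileInputStream in = new FileInputStream(filename);
        try {
            return in.read();  // use the resource
        } finally {
            in.close();  // the OS file handle is returned right here,
                         // not at some future GC pass that may never run
        }
    }
}
```

The object itself is still reclaimed by the garbage collector whenever it gets around to it; what close guarantees is that the scarce external resource (the file handle) comes back immediately, with no System.gc call and no full collection sweep.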

One tuning parameter that you can also play with comes from the RMI plumbing. RMI over JRMP, thanks to its distributed garbage collection behavior, forces explicit garbage collections periodically, as controlled by the JVM system properties sun.rmi.dgc.client.gcInterval and sun.rmi.dgc.server.gcInterval, for client and server respectively, and both are set to 1 minute (60,000 milliseconds) by default. In order to avoid such frequent garbage collection passes, set these values higher (recognizing that this will also leave RMI objects that are no longer in use alive for longer before being reclaimed). Think you're not using RMI? Guess again—remember, RMI is the preferred RPC protocol over which EJB operates. (By default, it should be RMI-over-IIOP, but most servers seem to prefer using RMI-over-JRMP since it's more Java-friendly.)
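These are ordinary system properties, so the natural place to set them is the launcher command line (e.g., -Dsun.rmi.dgc.client.gcInterval=3600000), since they must be in place before the RMI DGC classes read them. The sketch below shows the same settings made programmatically at the top of main, before any RMI classes load; the one-hour value is an arbitrary illustration, not a recommendation:

```java
public class DgcTuning {
    public static void main(String[] args) {
        // Must run before any RMI/DGC classes are loaded, or the 60,000 ms
        // defaults will already have been read; the command-line -D form
        // is the safer route in a real server.
        System.setProperty("sun.rmi.dgc.client.gcInterval", "3600000"); // 1 hour
        System.setProperty("sun.rmi.dgc.server.gcInterval", "3600000"); // 1 hour

        System.out.println("client interval: "
            + System.getProperty("sun.rmi.dgc.client.gcInterval") + " ms");
    }
}
```

Remember the trade-off the text describes: the longer the interval, the longer unreferenced remote objects linger before the distributed garbage collector reclaims them.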

Additionally, as described in more detail in Item 72, a number of garbage collection parameters can be used to make the garbage collector operate in a more efficient manner over your code; using these is an option of secondary resort, however—start playing with these only if you have profiled the code (or seen -verbose:gc output that seems to indicate a problem) and you're not planning to ship the following day.

In general, tuning the JVM is a coarse-grained brush: you're making statements about how your code should execute in broad fashion, and you should expect results of a similar nature. Choosing the client over the server JVM, for example, will have dramatic differences that may (or, ironically enough, may not, depending on your particular application) yield better performance in your enterprise applications. Don't expect orders-of-magnitude results, but don't be surprised during the one or two times when that happens, either.

