Reloading Java Classes 401: HotSwap and JRebel — Behind the Scenes

Sunday, January 24th, 2010

In this article we’ll review how classes can be reloaded without dynamic class loaders. We will take a look at the JVM HotSwap class reloading support, Instrumentation API and ZeroTurnaround’s JRebel.

Other Articles in the Reloading Java Classes Series

HotSwap and Instrumentation

In 2002, Sun introduced a new experimental technology into the Java 1.4 JVM, called HotSwap. It was incorporated within the Debugger API, and allowed debuggers to update class bytecode in place, using the same class identity. This meant that all objects could refer to an updated class and execute new code when their methods were called, preventing the need to reload a container whenever class bytecode was changed. All modern IDEs (including Eclipse, IDEA and NetBeans) support it. As of Java 5 this functionality is also available directly to Java applications, through the Instrumentation API.

hotswap
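To make the Instrumentation route concrete, here is a minimal sketch of an agent that keeps hold of the Instrumentation instance and exposes a redefinition helper. The class name HotSwapAgent and the redefine() helper are our own illustration, not part of any tool discussed here; only the java.lang.instrument calls are standard API.

```java
import java.lang.instrument.ClassDefinition;
import java.lang.instrument.Instrumentation;
import java.lang.instrument.UnmodifiableClassException;

// Hypothetical agent class (a real agent class would typically be public).
// The agent jar's manifest must name it in Premain-Class and set
// Can-Redefine-Classes: true, and the JVM must be started with -javaagent.
final class HotSwapAgent {
  private static volatile Instrumentation instrumentation;

  // Invoked by the JVM before the application's main() when the agent is attached.
  public static void premain(String agentArgs, Instrumentation inst) {
    instrumentation = inst;
  }

  // Replaces the bytecode of an already loaded class, keeping its identity.
  // The JVM rejects changes it does not support (for example an added method)
  // with an UnsupportedOperationException.
  static void redefine(Class<?> target, byte[] newBytecode)
      throws ClassNotFoundException, UnmodifiableClassException {
    Instrumentation inst = instrumentation;
    if (inst == null) {
      throw new IllegalStateException("JVM was not started with this agent");
    }
    if (!inst.isRedefineClassesSupported()) {
      throw new UnsupportedOperationException("class redefinition is not enabled");
    }
    inst.redefineClasses(new ClassDefinition(target, newBytecode));
  }
}
```

Existing objects of the target class keep their state; only the code that runs on the next method call changes.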

Unfortunately, this redefinition is limited to changing method bodies: it cannot add methods or fields, or change anything else about the class. This limits the usefulness of HotSwap, which also suffers from other problems:

  • The Java compiler will often create synthetic methods or fields even if you have changed only a method body (e.g. when you add a class literal or an anonymous or inner class).
  • Running in debug mode will often slow the application down or introduce other problems.

This causes HotSwap to be used less than, perhaps, it should be.
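You can see synthetic members for yourself with plain reflection. The sketch below is ours (SyntheticDemo is a made-up name) and uses a lambda, which is a modern case of the same effect rather than one of the examples listed above: javac typically compiles the lambda body into a hidden method on the enclosing class, so editing "just a method body" to add one changes the set of methods.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

final class SyntheticDemo {
  // The compiler moves this lambda body into a generated method on SyntheticDemo.
  static final Supplier<String> GREETING = () -> "hello";

  // Lists the names of compiler-generated (synthetic) methods declared by a class.
  static List<String> syntheticMethodNames(Class<?> type) {
    List<String> names = new ArrayList<>();
    for (Method method : type.getDeclaredMethods()) {
      if (method.isSynthetic()) {
        names.add(method.getName());
      }
    }
    return names;
  }

  public static void main(String[] args) {
    // The exact generated names depend on the compiler, so none are assumed here.
    System.out.println(syntheticMethodNames(SyntheticDemo.class));
  }
}
```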

Why is HotSwap limited to method bodies?

This question has been asked a lot during the almost 10 years since the introduction of HotSwap. One of the most voted for bugs for the JVM calls for supporting a whole array of changes, but so far it has not been implemented.

A disclaimer: I do not claim to be a JVM expert. I have a good general idea of how the JVM is implemented and over the years I have talked to a few (ex-)Sun engineers, but I haven’t verified everything I’m saying here against the source code. That said, I do have some ideas as to the reasons why this bug is still open (but if you know the reasons better, feel free to correct me).

The JVM is a heavily optimized piece of software, running on multiple platforms. Performance and stability are the highest priorities. To support them in different environments the Sun JVM features:

  • Two heavily optimized Just-In-Time compilers (-client and -server)
  • Several multi-generational garbage collectors

These features make evolving the class schema a considerable challenge. To understand why, we need to look more closely at what exactly is necessary to support adding methods and fields (and, even more advanced, changing the inheritance hierarchy).

When loaded into the JVM, an object is represented by a structure in memory, occupying a contiguous region of memory with a specific size (its fields plus metadata). In order to add a field, we would need to resize that structure, but since nearby regions may already be occupied, we would need to relocate the whole structure to a different region where there is enough free space to fit it in. Now, since we’re actually updating a class (and not just a single object) we would have to do this to every object of that class.

In itself this would not be hard to achieve — Java garbage collectors already relocate objects all the time. The problem is that the abstraction of one “heap” is just that, an abstraction. The actual layout of memory depends on the garbage collector that is currently active and, to be compatible with all of them, the relocation should probably be delegated to the active garbage collector. The JVM will also need to be suspended for the time of relocation, so doing GC at the same time makes sense.

Adding a method does not require updating the object structure, but it does require updating the class structure, which is also present on the heap. But consider this: the moment after a class has been loaded it is essentially frozen forever. This enables the JIT to perform the main optimization that the JVM does: inlining. Most of the method calls in your application hot spots are eliminated and the code is copied to the calling method. A simple check is inserted to ensure that the target object is indeed what we think it is.

Here’s the punchline: the moment we can add methods to classes this “simple check” is not enough. We would need a considerably more complicated check that ensures that no methods with the same name were added to the target class or to any of its superclasses. Alternatively we could track all the inlined spots and their dependencies and deoptimize them when a class is updated. Either way it has a cost in either performance or complexity.

On top of that, consider that we’re talking about multiple platforms with varying memory models and instruction sets that probably require at least some specific handling, and you get yourself an expensive problem with not much return on investment.
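To picture the inlining and its “simple check”, here is a hand-written analogy in plain Java. It is a conceptual sketch only: the real transformation happens in machine code inside the JIT, and the Greeter, English, French and InliningSketch names are invented for the illustration.

```java
interface Greeter {
  String greet();
}

final class English implements Greeter {
  public String greet() { return "hello"; }
}

final class French implements Greeter {
  public String greet() { return "bonjour"; }
}

final class InliningSketch {
  // What the source code says: an ordinary virtual call.
  static String virtualCall(Greeter greeter) {
    return greeter.greet();
  }

  // Roughly what the call site behaves like once the JIT has only ever seen
  // English receivers: the method body is copied in behind a cheap type check.
  static String guardedInline(Greeter greeter) {
    if (greeter.getClass() == English.class) { // the "simple check"
      return "hello";                          // inlined body of English.greet()
    }
    return greeter.greet();                    // guard failed: take the slow path
  }
}
```

The guard is only valid while English.greet() keeps the body that was copied; that is exactly the assumption that adding or changing methods at runtime would break.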

Introducing JRebel

In 2007, ZeroTurnaround announced the availability of a tool called JRebel (then JavaRebel) that could update classes without dynamic class loaders and with very few limitations. Unlike HotSwap, which is dependent on IDE integration, the tool works by monitoring the actual compiled .class files on disk and updating the classes whenever the files are updated. This means that you can use JRebel with a text editor and command-line compiler if so willing. Of course, it’s also integrated neatly into Eclipse, IntelliJ, and NetBeans. Unlike dynamic classloaders, JRebel preserves the identity and state of all existing objects and classes, allowing developers to continue using their application without delay.

jrebel-agent

How does this work?

For starters, JRebel works on a different level of abstraction than HotSwap. Whereas HotSwap works at the virtual machine level and is dependent on the inner workings of the JVM, JRebel makes use of two remarkable features of the JVM — abstract bytecode and classloaders. Classloaders allow JRebel to recognize the moment when a class is loaded, then translate the bytecode on-the-fly to create another layer of abstraction between the virtual machine and the executed code.

Others have used these features to enable profilers, performance monitoring, continuations, software transactional memory and even distributed heap. Combining bytecode abstraction with classloaders is powerful, and can be used to implement a variety of features even more exotic than class reloading. As we examine the issue closer, we’ll see that the challenge is not just in reloading classes, but also doing so without a visible degradation in performance and compatibility.

As we reviewed in Reloading Java Classes 101 the problem in reloading classes is that once a class has been loaded it cannot be unloaded or changed; but we are free to load new classes as we please. To understand how we could theoretically reload classes, let’s take a look at dynamic languages on the Java platform. Specifically, let’s take a look at JRuby (we’ll simplify a lot, so don’t crucify anyone important).

Although JRuby features “classes”, at runtime each object is dynamic and new fields and methods can be added at any moment. This means that a JRuby object is not much more than a Map from method names to their implementations and from field names to their values. The implementations for those methods are contained in anonymously named classes that are generated when the method is encountered. If you add a method, all JRuby has to do is generate a new anonymous class that includes the body of that method. As each anonymous class has a unique name there are no issues loading it and as a result the application is updated on-the-fly.
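The “object as a Map” idea is easy to sketch in plain Java. The following is our own simplification and not how JRuby is actually implemented; DynamicObject and its methods are hypothetical names, and a Function stands in for the generated method body class.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class DynamicObject {
  // Field names to values.
  private final Map<String, Object> fields = new HashMap<>();
  // Method names to implementations; each implementation receives the object itself.
  private final Map<String, Function<DynamicObject, Object>> methods = new HashMap<>();

  void setField(String name, Object value) {
    fields.put(name, value);
  }

  Object getField(String name) {
    return fields.get(name);
  }

  // Adding or replacing a method is just a map update, so it works at any time.
  void defineMethod(String name, Function<DynamicObject, Object> body) {
    methods.put(name, body);
  }

  // Every call pays for a lookup and an indirect invocation.
  Object call(String name) {
    Function<DynamicObject, Object> body = methods.get(name);
    if (body == null) {
      throw new IllegalArgumentException("undefined method: " + name);
    }
    return body.apply(this);
  }
}
```

For example, after setField("name", "world") and defineMethod("greet", self -> "hello " + self.getField("name")), calling call("greet") returns "hello world", and a later defineMethod("greet", ...) takes effect on the very next call.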

Theoretically, since bytecode translation is usually used to modify the class bytecode, there is no reason why we can’t use the information in that class and just create as many classes as necessary to fulfill its function. We could then use the same transformation as JRuby and split all Java classes into a holder class and method body classes. Unfortunately, such an approach would be subject to (at least) the following problems:

  • Performance. Such a setup would mean that each method invocation would be subject to indirection. We could optimize, but the application would be at least an order of magnitude slower. Memory use would also skyrocket, as so many classes are created.
  • Java SDK classes. The classes in the Java SDK are considerably harder to process than the ones in the application or libraries. Also they often are implemented in native code and cannot be transformed in the “JRuby” way. However if we leave them as is, then we’ll cause numerous incompatibility errors, which are likely not possible to work around.
  • Compatibility. Although Java is a static language it includes some dynamic features like reflection and dynamic proxies. If we apply the “JRuby” transformation none of those features will work unless we replace the Reflection API with our own classes, aware of the transformation.

Therefore, JRebel does not take such an approach. Instead it uses a much more complicated approach, based on advanced compilation techniques, that leaves us with one master class and several anonymous support classes, backed by the JIT transformation runtime, that allow modifications to take place without any visible degradation in performance or compatibility. It also:

  • Leaves as many method invocations intact as possible. This means that JRebel minimizes its performance overhead, making it lightweight.
  • Avoids instrumenting the Java SDK except in a few places that are necessary to preserve compatibility.
  • Tweaks the results of the Reflection API, so that we can correctly include the added/removed members in these results. This also means that the changes to Annotations are visible to the application.

Beyond Class Reloading – Archives

Reloading classes is something Java developers have complained about for a long time, but once we solved it, other problems turned up.

The Java EE standard was developed without much concern for development Turnaround (the time it takes between making a change to code and seeing the effects of that change in an application). It expects that all applications and their modules be packaged into archives (JARs, WARs and EARs), meaning that before you can update any file in your application, you need to update the archive, which is usually an expensive operation involving a build system like Ant or Maven. As we discussed in Reloading Java Classes 301 this can be minimized by using exploded development and incremental IDE builds, but for large applications this is commonly not a viable option.

To solve this problem in JRebel 2.x we developed a way for the user to map archived applications and modules back to the workspace — our users create a rebel.xml configuration file in each application and module that tells JRebel where the source files can be found. JRebel integrates with the application server, and when a class or resource is updated it is read from the workspace instead of the archive.

workspace-map

This allows for instant updates of not just classes, but any kind of resources like HTML, XML, JSP, CSS, .properties and so on. Maven users don’t even need to create a rebel.xml file, since our Maven plugin will generate it automatically.

Beyond Class Reloading – Configurations and Metadata

En route to eliminating Turnaround, another issue becomes obvious: Nowadays, applications are not just classes and resources, they are wired together by extensive configuration and metadata. When that configuration changes it should be reflected in the running application. However it’s not enough to make the changes to the configuration files visible, the specific framework must reload it and reflect the changes in the application.

conf

To support these kinds of changes in JRebel we developed an open source API that allows our team and third party contributors to make use of JRebel’s features and propagate changes in configuration to the framework, using framework-specific plugins. E.g. we support adding beans and dependencies in Spring on-the-fly as well as a wide variety of changes in other frameworks.

Conclusions

This article sums up the methods to reload Java classes without dynamic class loaders. We also discuss the reasons for HotSwap’s limitations, how JRebel works behind the scenes and the problems that arise when class reloading is solved.

Other articles in the series include:

Reloading Java Classes 301: Classloaders in Web Development — Tomcat, GlassFish, OSGi, Tapestry 5 and so on

Thursday, January 14th, 2010

In this article we’ll review how dynamic classloaders are used in real servers, containers and frameworks to reload Java classes and applications.  We’ll also touch on how to get faster reloads and redeploys by using them in optimal ways.

Java EE (web) applications

In order for a Java EE web application to run, it has to be packaged into an archive with a .WAR extension and deployed to a servlet container like Tomcat. This makes sense in production, as it gives you a simple way to assemble and deploy the application, but when developing that application you usually just want to edit the application’s files and see the changes in the browser.

A Java EE enterprise application has to be packaged into an archive with an .EAR extension and deployed to an application container. It can contain multiple web applications and EJB modules, so it often takes a while to assemble and deploy it. Recently, 1100+ EE developers told us how much time it takes them, and we compiled the results into the Redeploy and Restart Report.
Spoiler: Avg redeploy & restart time is 2.5 minutes – which is higher than we expected.

In Reloading Java Classes 101, we examined how dynamic classloaders can be used to reload Java classes and applications. In this article we will take a look at how servers and frameworks use dynamic classloaders to speed up the development cycle. We’ll use Apache Tomcat as the primary example and comment when behavior differs in other containers (Tomcat is also directly relevant for JBoss and GlassFish as these containers embed Tomcat as the servlet container).

Redeployment

To make use of dynamic classloaders we must first create them. When deploying your application, the server will create one classloader for each application (and each application module in the case of an enterprise application). The classloaders form a hierarchy as illustrated:

classloaders-jee

In Tomcat each .WAR application is managed by an instance of the StandardContext class that creates an instance of WebappClassLoader used to load the web application classes. When a user presses “reload” in the Tomcat Manager the following will happen:

tomcat-cl-reload

Calling Servlet.init() recreates the “initialized” application state with the updated classes loaded using the new classloader instance. The main problem with this approach is that to recreate the “initialized” state we run the initialization from scratch, which usually includes loading and processing metadata/configuration, warming up caches, running all kinds of checks and so on. In a sufficiently large application this can take many minutes, but in a small application it often takes just a few seconds and is fast enough to seem instant, as commonly demonstrated in the Glassfish v3 promotional demos.
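Stripped of everything container-specific, a reload boils down to dropping one classloader and creating another, then initializing again. The sketch below is ours: MiniContext is a hypothetical stand-in, not Tomcat’s StandardContext, and a counter takes the place of the real initialization work.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;

final class MiniContext {
  private final URL[] classpath;
  private URLClassLoader loader;
  private int initCount;

  MiniContext(URL[] classpath) {
    this.classpath = classpath.clone();
    start();
  }

  // Creates a fresh classloader and runs initialization from scratch.
  private void start() {
    loader = new URLClassLoader(classpath, MiniContext.class.getClassLoader());
    initCount++; // stands in for Servlet.init(): configuration, caches, checks...
  }

  // Drops the old classloader and starts over; classes are read in again on demand.
  void reload() throws IOException {
    loader.close();
    start();
  }

  ClassLoader loader() {
    return loader;
  }

  int initCount() {
    return initCount;
  }
}
```

Note that the cost of reload() is dominated by whatever start() has to redo, which is why the size of the application matters so much.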

If your application is deployed as an .EAR archive, many servers allow you to also redeploy each application module separately, when it is updated. This saves you the time you would otherwise spend waiting for non-updated modules to reinitialize after the redeployment.

Hot Deployment

Web containers commonly have a special directory (e.g. “webapps” in Tomcat, “deploy” in JBoss) that is periodically scanned for new web applications or changes to the existing ones. When the scanner detects that a deployed .WAR is updated, the scanner causes a redeploy to happen (in Tomcat it calls the StandardContext.reload() method). Since this happens without any additional action on the user’s side it is commonly referred to as “Hot Deployment”.

Hot Deployment is supported by all wide-spread application servers under different names: autodeployment, rapid deployment, autopublishing, hot reload, and so on. In some containers, instead of moving the archive to a predefined directory you can configure the server to monitor the archive at a specific path. Often the redeployment can be triggered from the IDE (e.g. when the user saves a file) thus reloading the application without any additional user involvement. Although the application is reloaded transparently to the user, it still takes the same amount of time as when hitting the “Reload” button in the admin console, so code changes are not immediately visible in the browser, for example.
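The scanning itself can be as simple as comparing file timestamps between passes. Here is a minimal sketch of one scan pass, assuming a flat directory of .war files; DeployScanner is a hypothetical class of our own, and real containers are more careful (partially copied archives, descriptors, exploded directories).

```java
import java.io.File;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class DeployScanner {
  private final File deployDir;
  // Last modification time seen for each archive, keyed by file name.
  private final Map<String, Long> lastSeen = new HashMap<>();

  DeployScanner(File deployDir) {
    this.deployDir = deployDir;
  }

  // One scan pass: returns the names of archives that are new or have changed
  // since the previous pass. A container would redeploy each of them.
  List<String> scan() {
    List<String> changed = new ArrayList<>();
    File[] archives = deployDir.listFiles((dir, name) -> name.endsWith(".war"));
    if (archives == null) {
      return changed; // directory missing or unreadable
    }
    for (File archive : archives) {
      long modified = archive.lastModified();
      Long previous = lastSeen.put(archive.getName(), modified);
      if (previous == null || previous != modified) {
        changed.add(archive.getName());
      }
    }
    return changed;
  }
}
```

Calling scan() from a timer every few seconds gives the “periodically scanned” behavior; the redeploy it triggers still costs a full reinitialization.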

Another problem with redeployment in general and hot deployment in particular is classloader leaks. As we reviewed in Reloading Java Classes 201, it is amazingly easy to leak a classloader and quickly run out of heap causing an OutOfMemoryError. As each deployment creates new classloaders, it is common to run out of memory in just a few redeploys on a large enough application (whether in development or in production).

Exploded Deployment

An additional feature supported by the majority of web containers is the so called “exploded deployment”, also known as “unpackaged” or “directory” deployment. Instead of deploying a .WAR archive, one can deploy a directory with exactly the same layout as the .WAR archive:

exploded

Why bother? Well, packaging an archive is an expensive operation, so deploying the directory can save quite a bit of time during build. Moreover, it is often possible to set up the project directory with exactly the same layout as the .WAR archive. This means an added benefit of editing files in place, instead of copying them to the server. Unfortunately, as Java classes cannot be reloaded without a redeploy, changing a .java file still means waiting for the application to reinitialize.

With some servers it makes sense to find out exactly what triggers the hot redeploy in the exploded directory. Sometimes the redeploy will be triggered only when the “web.xml” timestamp changes, or as in the case of GlassFish only when a special “.reload” file timestamp changes. In most servers any change to deployment descriptors or compiled classes will cause a hot redeploy.

If your server only supports deploying by copying to a special directory (e.g. Tomcat “webapps”, JBoss “deploy” directories) you can skip the copying by creating a symlink from that special directory to your project workspace. On Linux and Mac OS X you can use the common “ln -s” command to do that, whereas on Windows you should download the Sysinternals “junction” utility.

If you use Maven, then it’s quite complicated to set up exploded development from your workspace. If you have a solo web application you can use the Maven Jetty plugin, which uses classes and resources directly from Maven source and target project directories. Unfortunately, the Maven Jetty plugin does not support deploying multiple web applications, EJB modules or EARs so in the latter case you’re stuck doing artifact builds.

Session Persistence

Since we’re on the topic of reloading classes, and redeploying involves reinitializing an application, it makes sense to talk about session state. An HTTP session usually holds information like login credentials and conversational state. Losing that session when developing a web application means spending time logging in and browsing to the changed page – something that most web containers have tried to solve by serializing all of the objects in the HttpSession map and then deserializing them in the new classloader. Essentially, they copy all of the session state. This requires that all session attributes implement Serializable (ensuring session attributes can be written to a database or a file for later use), which is not restricting in most cases.

hot-deploy-session

Session persistence has been present in most major containers for many years (e.g. Restart Persistence in Tomcat), but was notoriously absent in Glassfish before v3.
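The serialize-then-deserialize trick can be sketched with standard Java serialization. The key detail is that the bytes must be read back with classes resolved against the new classloader. SessionCopier and LoaderAwareInput are hypothetical names of ours, and containers have their own implementations of this idea.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.util.HashMap;
import java.util.Map;

final class SessionCopier {

  // ObjectInputStream that resolves classes against a chosen classloader.
  static final class LoaderAwareInput extends ObjectInputStream {
    private final ClassLoader loader;

    LoaderAwareInput(InputStream in, ClassLoader loader) throws IOException {
      super(in);
      this.loader = loader;
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
        throws IOException, ClassNotFoundException {
      try {
        return Class.forName(desc.getName(), false, loader);
      } catch (ClassNotFoundException e) {
        return super.resolveClass(desc); // e.g. primitive types
      }
    }
  }

  // Before the old classloader is dropped: turn the attributes into bytes.
  static byte[] save(HashMap<String, Object> attributes) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(attributes);
    }
    return bytes.toByteArray();
  }

  // After the redeploy: rebuild the attributes using classes from the new loader.
  @SuppressWarnings("unchecked")
  static Map<String, Object> restore(byte[] data, ClassLoader newLoader)
      throws IOException, ClassNotFoundException {
    try (ObjectInputStream in =
             new LoaderAwareInput(new ByteArrayInputStream(data), newLoader)) {
      return (Map<String, Object>) in.readObject();
    }
  }
}
```

An attribute that does not implement Serializable makes save() fail with a NotSerializableException, which is where the Serializable requirement comes from.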

OSGi

There is a lot of misunderstanding surrounding what exactly OSGi does and doesn’t do. If we ignore the aspects irrelevant to the current issue, OSGi is basically a collection of modules each wrapped in its own classloader, which can be dropped and recreated at will. When a module is recreated, it is reinitialized exactly the same way a web application is.

osgi

The difference between OSGi and a web container is that OSGi is something that is exposed to your application, that you use to split your application into arbitrarily small modules. Therefore, by design, these modules will likely be much smaller than the monolithic web applications we are used to building. And since each of these modules is smaller and we can “redeploy” them one-by-one, re-initialization takes less time. The time depends on how you design your application (and can still be significant).

Tapestry 5, RIFE & Grails

Recently, some web frameworks, such as Tapestry 5, RIFE and Grails, have taken a different approach, taking advantage of the fact that they already need to maintain application state. They’ll ensure that state will be serializable, or otherwise easily re-creatable, so that after dropping a classloader, there is no need to reinitialize anything.

This means that application developers use frameworks’ components and the lifecycle of those components is handled by the framework. The framework will initialize (based on some configuration, either xml or annotation based), run and destroy the components.

As the lifecycle of the components is managed by the framework, it is easy to recreate a component in a new classloader without user intervention and thus create the effect of reloading code. In the background, the old component is destroyed (classloader is dropped) and a new one created (in a new classloader where the classes are read in again) and the old state is either deserialized or created based on the configuration.

component

This has the obvious advantage of being very quick, as components are small and the classloaders are granular. Therefore the code is reloaded instantly, giving a smooth experience in developing the application. However such an approach is not always possible as it requires the component to be completely managed by the framework. It also leads to incompatibilities between the different class versions causing, among others, ClassCastExceptions.

We’ve Covered a Lot – and simplified along the way

It’s worth mentioning that using classloaders for code reloading really isn’t as smooth as we have described here – this is an introductory article series. Especially with the more granular approaches (such as frameworks that have per component classloaders, manual classloader dropping and recreating, etc), when you start getting a mixture of older and newer classes all hell can break loose. You can hold all kinds of references to old objects and classes, which will conflict with the newly loaded ones (a common problem is getting a ClassCastException), so watch what you’re doing along the way. As a side note: Groovy is actually somewhat better at handling this, as all calls through the Meta-Object Protocol are not subject to such problems.

This article addressed the following questions:

  • How are dynamic classloaders used to reload Java classes and applications?
  • How do Tomcat, GlassFish (incl v3), and other servers reload Java classes and applications?
  • How does OSGi improve reload and redeploy times?
  • How do frameworks (incl Tapestry 5, RIFE, Grails) reload Java classes and applications?

Coming up next, we continue our explanation of classloaders and the redeploy process with an investigation into HotSwap and JRebel, two tools used to reduce time spent reloading and redeploying. Stay tuned!

Other articles in the series:

JRebel 2.2.1 Released

Monday, December 21st, 2009

We’re glad to announce the JRebel 2.2.1 release. It is a maintenance release incorporating all the bugfixes that have been made since the 2.2 release. You can see the details from the full changelog.

Pick up the new version at our download page.

JRebel 2.2 “Easy Peasy” Released

Tuesday, December 15th, 2009

It is our great pleasure to announce JRebel 2.2, the “Easy Peasy” release. In this release we have focused heavily on ease of installation, configuration and use. The main new feature is the semi-automatic installer and configuration wizard, which makes installing JRebel and configuring your application a snap. We have also included a configuration utility that supplements all those funky system properties with a centralized GUI configuration. For those who prefer the Zen of the Command Line we have compiled a comprehensive reference manual about all things JRebel included in the distribution.

We also invested heavily into making the JRebel IDE integration as seamless as possible. The new NetBeans plugin now supports debugging when the JRebel agent is enabled. The numerous updates to Eclipse and IntelliJ IDEA plugins allow you to run the application with JRebel effortlessly from inside the IDE.

Take a look at the full changelog, download now or check out the screenshots:

jrebel-config-wizard Picture 2 Picture 3

launch cartoon

JRebel 2.2 Feature Preview Available

Thursday, December 10th, 2009

Next week we plan to release the 2.2 version of JRebel. The two main features of this release are:

  • The new semi-automatic installer and configuration wizard, which make installing and configuring JRebel a snap.
  • New or updated releases of IDE plugins for Eclipse, IntelliJ and NetBeans (both 6.5 and 6.7).

While we’re finishing the polish on the release itself, we have already uploaded the IDE plugins to their respective repositories. Just hit “Update” and you can start using them right away. We have also uploaded a preview version of the Installer-enabled JRebel distribution, which you can now get from Downloads.

Reloading Java Classes 201: How do ClassLoader leaks happen?

Thursday, December 10th, 2009

For the full article series on Reloading Java Classes, see:

From ClassLoaders to Classes

If you have programmed in Java for some time you know that memory leaks do happen. Usually it’s the case of a collection somewhere with references to objects (e.g. listeners) that should have been cleared, but never were. Classloaders are a very special case of this, and unfortunately, with the current state of the Java platform, these leaks are both inevitable and costly, routinely causing OutOfMemoryErrors in production applications after just a few redeploys.

Let’s get started. Recalling RJC101: to reload a class we threw away the old classloader and created a new one, copying the object graph as best we could:

reloading-object

Every object had a reference to its class, which in turn had a reference to its classloader. However we didn’t mention that every classloader in turn has a reference to each of the classes it has loaded, each of which holds static fields defined in the class:

classloader-refs

This means that

  1. If a classloader is leaked it will hold on to all its classes and all their static fields. Static fields commonly hold caches, singleton objects, and various configuration and application states. Even if your application doesn’t have any large static caches, it doesn’t mean that the framework you use doesn’t hold them for you (e.g. Log4J is a common culprit as it’s often put in the server classpath). This explains why leaking a classloader can be so expensive.
  2. To leak a classloader it’s enough to leave a reference to any object, created from a class, loaded by that classloader. Even if that object seems completely harmless (e.g. doesn’t have a single field), it will still hold on to its classloader and all the application state. A single place in the application that survives the redeploy and doesn’t do a proper cleanup is enough to sprout the leak. In a typical application there will be several such places, some of them almost impossible to fix due to the way third-party libraries are built. Leaking a classloader is therefore quite common.
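The chain from an object to its classloader is visible through ordinary API calls. The small sketch below (ReferenceChain and Harmless are names we made up) shows that even an object without a single field leads back to the loader of its class.

```java
final class ReferenceChain {
  // An object that looks completely harmless: it has no fields at all.
  static final class Harmless { }

  // Object -> class -> classloader: two hops that every object carries with it.
  static ClassLoader loaderOf(Object object) {
    return object.getClass().getClassLoader();
  }

  public static void main(String[] args) {
    Object object = new Harmless();
    // Whoever keeps 'object' reachable also keeps this loader, and with it
    // every class the loader has defined and their static fields.
    System.out.println(object.getClass().getName() + " -> " + loaderOf(object));
  }
}
```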

To examine this from a different perspective let’s return to the code example from our previous article. Breeze through it to quickly catch up.

Introducing the Leak

We will use the exact same Main class as before to show what a simple leak could look like:

public class Main {
  private static IExample example1;
  private static IExample example2;

  public static void main(String[] args) throws InterruptedException {
    // First instance: there is no previous state to copy from.
    example1 = ExampleFactory.newInstance().copy(null);

    while (true) {
      // Load Example in a fresh classloader and copy the state
      // (including the Leak chain) over from the previous instance.
      example2 = ExampleFactory.newInstance().copy(example2);

      System.out.println("1) " +
        example1.message() + " = " + example1.plusPlus());
      System.out.println("2) " +
        example2.message() + " = " + example2.plusPlus());
      System.out.println();

      Thread.sleep(3000);
    }
  }
}

The ExampleFactory class is also exactly the same, but here’s where things get leaky. Let’s introduce a new class called Leak and a corresponding interface ILeak:

interface ILeak {
}
 
public class Leak implements ILeak {
  private ILeak leak;
 
  public Leak(ILeak leak) {
    this.leak = leak;
  }
}

As you can see it’s not a terribly complicated class: it just forms a chain of objects, with each doing nothing more than holding a reference to the previous one. We will modify the Example class to include a reference to the Leak object and throw in a large array to take up memory (it represents a large cache). Let’s omit some methods shown in the previous article for brevity:

public class Example implements IExample {
  private int counter;
  private ILeak leak;
 
  private static final long[] cache = new long[1000000];
 
  /* message(), counter(), plusPlus() impls */
 
  public ILeak leak() {
    return new Leak(leak);
  }
 
  public IExample copy(IExample example) {
    if (example != null) {
      counter = example.counter();
      leak = example.leak();
    }
    return this;
  }
}

The important things to note about Example class are:

  1. Example holds a reference to Leak, but Leak has no references to Example.
  2. When Example is copied (method copy() is called) a new Leak object is created holding a reference to the previous one.

If you try to run this code an OutOfMemoryError will be thrown after just a few iterations:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
	at example.Example.<clinit>(Example.java:8)

With the right tools, we can look deeper and see how this happens.

Post Mortem

Since Java 5.0, we’ve been able to use the jmap command line tool included in the JDK distribution to dump the heap of a running application (or for that matter even extract the Java heap from a core dump). However, since our application is crashing we will need a feature that was introduced in Java 6.0: dumping the heap on OutOfMemoryError. To do that we only need to add -XX:+HeapDumpOnOutOfMemoryError to the JVM command line:

java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid37266.hprof ...
Heap dump file created [57715044 bytes in 1.707 secs]
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
	at example.Example.<clinit>(Example.java:8)
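If restarting the JVM with a new flag is inconvenient, the same option can usually be inspected and switched on from inside a running HotSpot JVM. This is a sketch under the assumption that you are on a HotSpot-based JDK (the com.sun.management API is not part of the Java SE standard); HeapDumpFlag is a hypothetical helper name.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

final class HeapDumpFlag {
  private static final String FLAG = "HeapDumpOnOutOfMemoryError";

  private static HotSpotDiagnosticMXBean diagnostics() {
    return ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
  }

  // Reports whether the JVM will currently dump the heap on OutOfMemoryError.
  static boolean isEnabled() {
    return Boolean.parseBoolean(diagnostics().getVMOption(FLAG).getValue());
  }

  // Turns the option on at runtime, equivalent to -XX:+HeapDumpOnOutOfMemoryError.
  static void enable() {
    diagnostics().setVMOption(FLAG, "true");
  }
}
```

The same bean can also be reached remotely over JMX, which helps when the leaking application is a server you would rather not restart.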

After we have the heap dump we can analyze it. There are a number of tools (including jhat, a small web-based analyzer included with the JDK), but here we will use the more sophisticated Eclipse Memory Analyzer (EMA).

After loading the heap dump into the EMA we can look at the Dominator Tree analysis. It usually identifies the biggest memory consumers in the heap and the objects that hold references to them. In our case it seems quite obvious that the Leak class is the one that consumes most of the heap:

classloader-dominator-tree.png

Now let’s run a search for all of the Leak objects and see what they are holding on to. To do that we run a search List objects -> with outgoing references for “example.Leak”:

ema-search

The results include several Leak objects. Expanding the outgoing references we can see that each of them holds on to a separate instance of Example through a bunch of intermediate objects:

ema-analysis

You may notice that one of the intermediate objects is ExampleFactory$1, which refers to the anonymous subclass of URLClassLoader we created in the ExampleFactory class. In fact what is happening is exactly the situation we described in the beginning of the article:

  • Each Leak object is leaking, and each one holds on to its classloader.
  • Each classloader holds on to the Example class it has loaded:

classloader-leak

Conclusions

Though this example is slightly contrived, the main idea to take away is that it’s easy to leak a single object in Java. Each leak has the potential to leak the whole classloader if the application is redeployed or otherwise a new classloader is created. Since preventing such leaks is very challenging, it’s a better idea to use Eclipse Memory Analyzer and your understanding of classloaders to hunt them down after you get an OutOfMemoryError on redeploy.

This article addressed the following questions:

  • How does reloading a class cause the classloader to leak?
  • What are some consequences of leaking classloaders?
  • What tools can be used to troubleshoot these memory leaks?


JRebel 2.1.1 Released

Friday, November 13th, 2009

We’re glad to announce the JRebel 2.1.1 release. It is a maintenance release incorporating all the bugfixes that have accumulated during the past month or so. It also includes the new Log4J plugin, support for Jetty 7 and GlassFish v3 Preview (apparently Prelude and Preview differ a lot, go figure).

Changes include:

  • Support for Jetty 7
  • Preliminary support for GlassFish v3 Preview
  • Log4J plugin now reloads changes to log configuration on-the-fly, contributed by Julien Richard.
  • Fixed an issue causing the annotations on constructors to disappear after class reload.
  • Fixed an issue with Google Web Toolkit client side classes in hosted mode
  • Fixed an issue with FileNotFoundException thrown by JavaRebelResourceServlet
  • Fixed an issue with ClassCastException when defining web services in web.xml
  • Fixed several issues in the Wicket plugin.

You can pick up the new version on our download page, by choosing the standard download.

Reloading Java Classes 101: Objects, Classes and ClassLoaders

Tuesday, November 10th, 2009

Welcome to the Turnaround article series from ZeroTurnaround.

In this article we will review how to reload a Java class using a dynamic classloader. To get there we’ll see how objects, classes and classloaders are tied to each other and the process required to make changes. We begin with a bird’s eye view of the problem, explain the reloading process, and then proceed to a specific example to illustrate typical problems and solutions.

A Bird’s Eye View

The first thing to understand when talking about reloading Java code is the relation between classes and objects. All Java code is associated with methods contained in classes. Simplified, you can think of a class as a collection of methods, that receive “this” as the first argument. The class with all its methods is loaded into memory and receives a unique identity. In the Java API this identity is represented by an instance of java.lang.Class that you can access using the MyObject.class expression.

Every object created gets a reference to this identity accessible through the Object.getClass() method. When a method is called on an object, the JVM consults the class reference and calls the method of that particular class. That is, when you call mo.method() (where mo is an instance of MyObject), then the JVM will call mo.getClass().getDeclaredMethod("method").invoke(mo) (this is not what the JVM actually does, but the result is the same).
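The equivalence described above can be checked with a small sketch. The MyObject class here is a stand-in invented for illustration, and the reflective call is only a model of method dispatch, not what the JVM does internally.

```java
import java.lang.reflect.Method;

// Stand-in for the MyObject class mentioned in the text.
class MyObject {
  public String method() {
    return "1";
  }
}

class DispatchDemo {
  // Looks up method() through the object's own Class reference and calls it.
  static String viaReflection(Object mo) {
    try {
      Method m = mo.getClass().getDeclaredMethod("method");
      return (String) m.invoke(mo);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    MyObject mo = new MyObject();
    System.out.println(mo.method());                     // direct call
    System.out.println(viaReflection(mo));               // reflective call, same result
    System.out.println(mo.getClass() == MyObject.class); // same Class identity
  }
}
```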

object

Every Class object is in turn associated with its classloader (MyObject.class.getClassLoader()). The main role of the class loader is to define a class scope — where the class is visible and where it isn’t. This scoping allows for classes with the same name to exist as long as they are loaded in different classloaders. It also allows loading a newer version of the class in a different classloader.

reloading-object

The main problem with code reloading in Java is that although you can load a new version of a class, it will get a completely different identity and the existing objects will keep referring to the previous version of the class. So when a method is called on those objects it will execute the old version of the method.

Let’s assume that we load a new version of the MyObject class. Let’s refer to the old version as MyObject_1 and to the new one as MyObject_2. Let’s also assume that MyObject.method() returns “1” in MyObject_1 and “2” in MyObject_2. Now if mo2 is an instance of MyObject_2:

  • mo.getClass() != mo2.getClass()
  • mo.getClass().getDeclaredMethod("method").invoke(mo)
    != mo2.getClass().getDeclaredMethod("method").invoke(mo2)
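  • mo.getClass().getDeclaredMethod("method").invoke(mo2) throws an IllegalArgumentException, because mo2 is not an instance of the class that declares the method: the Class identities of mo and mo2 do not match. (For the same reason, casting mo2 to MyObject_1 would throw a ClassCastException.)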
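The identity mismatch can be reproduced in a single file. This is a minimal sketch under assumptions: IdentityDemo, IsolatingLoader and the nested MyObject are hypothetical names, the class bytes are read from the classpath with InputStream.readAllBytes() (Java 9 or later), and the same bytes are loaded twice instead of two different versions. Note that Method.invoke() reports a receiver from the wrong class with an IllegalArgumentException, whereas a plain cast between the two classes would fail with a ClassCastException.

```java
import java.io.IOException;
import java.io.InputStream;

class IdentityDemo {
  // Stand-in for the article's MyObject, nested to keep the sketch in one file.
  public static class MyObject {
    public String method() {
      return "1";
    }
  }

  // Defines MyObject itself instead of delegating to the parent, so every
  // loader instance produces a distinct Class with the same name.
  static class IsolatingLoader extends ClassLoader {
    IsolatingLoader() {
      super(IdentityDemo.class.getClassLoader());
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
        throws ClassNotFoundException {
      if (!name.equals(MyObject.class.getName())) {
        return super.loadClass(name, resolve);
      }
      Class<?> loaded = findLoadedClass(name);
      if (loaded != null) {
        return loaded;
      }
      String path = name.replace('.', '/') + ".class";
      try (InputStream in = getParent().getResourceAsStream(path)) {
        if (in == null) {
          throw new ClassNotFoundException(name);
        }
        byte[] bytes = in.readAllBytes();
        return defineClass(name, bytes, 0, bytes.length);
      } catch (IOException e) {
        throw new ClassNotFoundException(name, e);
      }
    }
  }

  // Creates a MyObject instance whose class lives in a fresh loader.
  static Object newIsolated() {
    try {
      Class<?> c = new IsolatingLoader().loadClass(MyObject.class.getName());
      return c.getDeclaredConstructor().newInstance();
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException(e);
    }
  }

  // True if calling owner's method() on 'receiver' is rejected by reflection.
  static boolean rejects(Object owner, Object receiver) {
    try {
      owner.getClass().getDeclaredMethod("method").invoke(receiver);
      return false;
    } catch (IllegalArgumentException e) {
      return true;
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    Object mo = newIsolated();
    Object mo2 = newIsolated();
    System.out.println(mo.getClass().getName().equals(mo2.getClass().getName()));
    System.out.println(mo.getClass() != mo2.getClass());
    System.out.println(rejects(mo, mo));  // own receiver is accepted
    System.out.println(rejects(mo, mo2)); // receiver from the other loader
  }
}
```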

This means that any useful solution must create a new instance of mo2 that is an exact copy of mo and replace all references to mo with it. To understand how hard it is, remember the last time you had to change your phone number. It’s easy enough to change the number itself, but then you have to make sure that everyone you know will use the new number, which is quite a hassle. It’s just as difficult with objects (in fact, it’s actually impossible, unless you control the object creation yourself), and we’re talking about many objects that you must update at the same time.

Down and Dirty

Let’s see how this would look in code. Remember, what we’re trying to do here is load a newer version of a class, in a different classloader. We’ll use an Example class that looks like this:

public class Example implements IExample {
  private int counter;
  public String message() {
    return "Version 1";
  }
  public int plusPlus() {
    return counter++;
  }
  public int counter() {
    return counter;
  }
}

We’ll use a main() method that will loop infinitely and print out the information from the Example class. We’ll also need two instances of the Example class: example1 that is created once in the beginning and example2 that is recreated on every roll of the loop:

public class Main {
  private static IExample example1;
  private static IExample example2;
 
  public static void main(String[] args) throws Exception {
    example1 = ExampleFactory.newInstance();
 
    while (true) {
      example2 = ExampleFactory.newInstance();
 
      System.out.println("1) " +
        example1.message() + " = " + example1.plusPlus());
      System.out.println("2) " +
        example2.message() + " = " + example2.plusPlus());
      System.out.println();
 
      Thread.sleep(3000);
    }
  }
}

IExample is an interface with all the methods from Example. This is necessary because we’ll be loading Example in an isolated classloader, so Main cannot use it directly (otherwise we’d get a ClassCastException).

public interface IExample {
  String message();
  int plusPlus();
  int counter();
}

From this example, you might be surprised to see how easy it is to create a dynamic class loader. If we remove the exception handling it boils down to this:

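import java.net.URL;
import java.net.URLClassLoader;

public class ExampleFactory {
  public static IExample newInstance() throws Exception {
    URLClassLoader tmp =
      new URLClassLoader(new URL[] {getClassPath()}) {
        @Override
        public Class<?> loadClass(String name) throws ClassNotFoundException {
          // Load Example in this loader instead of delegating to the parent
          if ("example.Example".equals(name))
            return findClass(name);
          return super.loadClass(name);
        }
      };

    // A fresh loader on every call means a fresh Example class every time
    return (IExample)
      tmp.loadClass("example.Example").newInstance();
  }
}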

The method getClassPath() for the purposes of this example could return the hardcoded classpath. However, in the full source code (available in the Resources section below) you can see how we can use the ClassLoader.getResource() API to automate that.

Now let’s run Main.main and see the output after waiting for a few loop rolls:

1) Version 1 = 3
2) Version 1 = 0

As expected, while the counter in the first instance is updated, the second stays at “0”. If we change the Example.message() method to return “Version 2”, the output changes as follows:

1) Version 1 = 4
2) Version 2 = 0

As we can see, the first instance continues incrementing the counter but still uses the old version of the class to print the version. The second instance uses the updated class, but all of its state is lost.

To remedy this, let’s try to reconstruct the state for the second instance. To do that we can just copy it from the previous iteration.

First we add a new copy() method to the Example class (and the corresponding interface method):

  public IExample copy(IExample example) {
    if (example != null)
      counter = example.counter();  
    return this;
  }

Next we update the line in the Main.main() method that creates the second instance:

example2 = ExampleFactory.newInstance().copy(example2);

Now waiting for a few iterations yields:

1) Version 1 = 3
2) Version 1 = 3

And changing the Example.message() method to return “Version 2” yields:

1) Version 1 = 4
2) Version 2 = 4

As you can see even though it’s possible for the end user to see that the second instance is updated and all its state is preserved, it involves managing that state by hand. Unfortunately, there is no way in the Java API to just update the class of an existing object or even reliably copy its state, so we will always have to resort to complicated workarounds.
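To illustrate why generic state copying is not reliable, here is a minimal sketch of one such workaround: copying same-named fields reflectively from an old object to a new one. StateCopier, OldExample and NewExample are hypothetical names that are not part of the article’s source code, and the approach silently drops any state whose field was renamed, removed or changed type.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

class StateCopier {
  // Copies every non-static field of 'from' into the same-named field of 'to'.
  // Fields that are missing or have an incompatible type on the target are
  // skipped, so their state is lost without any warning.
  static <T> T copyFields(Object from, T to) {
    for (Field src : from.getClass().getDeclaredFields()) {
      if (Modifier.isStatic(src.getModifiers())) {
        continue;
      }
      try {
        Field dst = to.getClass().getDeclaredField(src.getName());
        if (!dst.getType().isAssignableFrom(src.getType())) {
          continue;
        }
        src.setAccessible(true);
        dst.setAccessible(true);
        dst.set(to, src.get(from));
      } catch (NoSuchFieldException e) {
        // The field no longer exists in the new version: nothing to copy into.
      } catch (IllegalAccessException e) {
        throw new IllegalStateException(e);
      }
    }
    return to;
  }
}

// Two unrelated classes standing in for the old and new versions of Example.
class OldExample {
  private int counter = 4;
  private String note = "only in the old version";
}

class NewExample {
  private int counter;

  int counter() {
    return counter;
  }
}
```

Calling StateCopier.copyFields(new OldExample(), new NewExample()) carries the counter over but loses the note field, which is the kind of gap that makes hand-written copy() methods necessary.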

In subsequent articles we’ll review how web containers, OSGi, Tapestry 5, Grails and others confront the problem of managing state when reloading classes, then we’ll dig into how HotSwap, Dynamic Languages, and the Instrumentation API work, and go behind the scenes with JRebel as well.

Resources

Screencast: Speedy Struts 1 and Struts 2 with JRebel

Tuesday, October 27th, 2009

On September 30th, 2009, Apache released Struts 2.1.8 for general availability. Though we couldn’t find much info on the differences between 2.1.6 and 2.1.8, here’s what Musachy Barroso said about “Why web developers should choose Struts 2” in his interview on InfoQ.

“Struts 2 is probably the most loosely-coupled framework available. Out of the box, many features are usable with little or no customization and it is easy to learn. The same knowledge can then be applied to add plugins to override default behaviors. The loose coupling also allows business logic to be written with no knowledge of the existence of Struts. Despite this, Struts scales up really well and is currently powering some very high-traffic sites.”

We like the evolution of Struts, and wanted to do our part to help minimize the time between writing Struts code & seeing the changes, so with the release of JRebel 2.1, we are proud to present extended support for Struts 1.x & 2.x, with full support for reloading action mappings. Combined with JRebel’s previously released features (skipping redeploys by reloading changes to Java classes in the running application, and skipping builds by mapping the project workspace to the deployed WARs or EARs), JRebel is now an even more potent time-saving tool for Struts 1 and Struts 2 users.

See the screencast for a demonstration of coding with Struts 2 and JRebel:

If you’re not familiar with JRebel, catch up in under 3 mins with this screencast, which shows other JRebel features, including support for the Spring framework (or take a look at the feature list):

We’re glad to see the success of the Struts 2 framework, and happy to support the community.  Did you know that on average there are more than 2 million project downloads of Struts per month, since March of 2008?

The Build Tool Report: Turnaround Times using Ant, Maven, Eclipse, IntelliJ, and NetBeans

Wednesday, October 21st, 2009

Some time ago we ran a survey asking a few questions about the build process, specifically the tools that are used to do incremental builds and how much time those builds take. We had over 600 responses, so now it’s time to count the results.

This is the first time that we’ve published results on the incremental build process, so the information is more likely to serve as a guide than an authoritative information source. That being said, the information is still quite interesting, and if it serves to start a conversation that improves the process of even one team, then we’re proud to have helped out. If you haven’t answered the 3-question survey yet, take two minutes and go for it – and do let your community know about it – as more answers trickle in we’ll update this post with the new data. If you’d like to play with the results on your own we’ve provided all the data and our calculations in a handy Excel sheet that you can download here.

The first question in the survey was “What build tool do you use for incremental builds on your largest current project?” The breakdown follows:

Chart 1: “Which build tool is used most often for incremental builds?”

chart1
This does not include tools that scored less than 10 answers in the survey. Those are:

  • Buildr
  • Shell scripts
  • javac
  • Make
  • NAnt
  • Savant
  • Hudson
  • PHP

Maven and Ant are responsible for over half of the incremental builds in the wild. It also seems that Maven has overtaken Ant by popularity, although you should take it with a grain of salt as the questions were not phrased for this purpose. Unfortunately I couldn’t find any external data to confirm or contradict these results. Please do let us know if you know of other surveys on this topic.

The other half of incremental builds are done inside IDEs. Although some of those are just driving Ant or Maven inside the IDE, there is also a large number of developers who use the IDE as the primary build tool during development (we’ll analyze this in detail at the end of the report).

It might be valuable to reiterate here that this is a poll where respondents are self-selecting, and therefore it may not accurately reflect the actual marketplace when it comes to determining market share. Since we’re more interested in the incremental build process itself, that’s something we can live with. That being said, there is a chance that people who have fast (or slow) builds may be more likely to complete a poll like this. Since we’re not sure if this is serious enough to sway any results, we’ll just display the data, and let you decide. So, while 53% of developers are using Ant or Maven for their incremental builds, everyone else uses their IDEs: Eclipse is dominating the IDE landscape with 32%, followed by IntelliJ IDEA at 10% and NetBeans at 5%.

With that in mind, we asked, “How long does an incremental build take?”

Chart 2: “How long does an incremental build take?”

chart2

This is relatively good news. Nearly half of our respondents (44%) indicated that their incremental build process takes less than 30 seconds. 40% of respondents have incremental builds lasting from 1 to 3 minutes. Only 16% of incremental builds last over 4 minutes.

The average length of a build is 1.9 minutes, with a standard deviation of 2.8 minutes.

To finish up, we asked, “In an hour of coding, how many times do you run an incremental build?”

Chart 3: “How often do Java developers run incremental builds?”

chart3

A healthy 31% of respondents don’t have to run the build at all (e.g. it’s run automatically on save). The rest of the numbers are all over the place, so we’ll wait with the analysis until we can put them in context.

The average number of incremental builds an hour is 3.9 times, with a standard deviation of 4.1.

It’s time to crunch some data. We assigned numeric values to each of the intervals (e.g. “2.5” for the “2-3” interval) and multiplied the number of incremental builds an hour by the amount of time one incremental build takes (basically, Chart 2 times Chart 3), thus finding the approximate amount of time respondents spend building in each hour of development. This was done per respondent, so if someone said that their build takes 4-5 minutes and they build twice an hour, we’d see a result of 9 minutes per hour. We broke down the data into the following intervals:
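The per-respondent calculation can be sketched in a few lines. This is only an illustration of the arithmetic described above: the BuildTime class and its method names are made up, and the actual calculations live in the spreadsheet mentioned earlier.

```java
// Hypothetical helper illustrating the "midpoint times frequency" estimate.
class BuildTime {
  // Midpoint of an interval such as "2-3" (2.5) or "4-5" (4.5).
  static double midpoint(String interval) {
    String[] bounds = interval.split("-");
    return (Double.parseDouble(bounds[0]) + Double.parseDouble(bounds[1])) / 2.0;
  }

  // Minutes spent building per hour of coding for one respondent.
  static double minutesPerHour(String buildLengthMinutes, double buildsPerHour) {
    return midpoint(buildLengthMinutes) * buildsPerHour;
  }

  public static void main(String[] args) {
    // A 4-5 minute build run twice an hour: 4.5 * 2 = 9 minutes per hour.
    System.out.println(minutesPerHour("4-5", 2));
  }
}
```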

Chart 4: “Time spent on incremental builds during an hour of coding”

chart4

The average total time taken by incremental builds in an hour is exactly 6 minutes, but the standard deviation is 10.1, rendering this number unreliable. We can, however, divide the respondents in three quite well defined groups:

  • Less than 1 minute an hour. 34% of respondents basically don’t spend any time on the incremental build process.
  • 1 to 5 minutes an hour. 34% of respondents spend a “reasonable” amount of time on incremental builds – under 5 minutes an hour. This group spends an average of 3 minutes an hour on incremental builds, which corresponds to about 5% of total development time.
  • Over 5 minutes an hour. 32% of respondents spend over 5 minutes an hour on incremental builds. The weighted average in this group is over 13 minutes an hour. This group of developers is spending about 22% of their development time on incremental builds.

Chart 5: “Time spent on incremental builds, per hour, by build tool”

chart5

It is clear from this chart that Ant and Maven take significantly more time than IDE builds. Both take about 8 minutes an hour, which corresponds to 13% of total development time. There seems to be little difference between the two, perhaps because the projects where you have to use Ant or Maven for incremental builds are large and complex.

Eclipse is definitely the fastest with 2.9 minutes an hour, which corresponds to about 5% of total development time. Eclipse is the only IDE supporting true incremental build on save using a fast embedded compiler, so these results are expected. It is likely that the true number is even lower, as some of the respondents use Eclipse to launch the Ant or Maven builds (we’ll return to this with the next chart).

IntelliJ IDEA falls in between with 5.7 minutes an hour and about 10% of total development time. It does not have true background compilation and often the IDE builds just launch Ant or Maven behind the scenes.

Chart 6: “Incremental build length breakdown per build tool”

chart6

This final chart shows the proportion of respondents in one of the three groups we defined with Chart 4 broken down per build tool.

Finally some things become clear. About 61% of Eclipse builds happen instantly (taking less than 1 minute per hour). We can assume that those are the respondents using compile-on-save, whereas the rest use Eclipse to launch Ant or Maven builds. Considering that 32% of respondents are using Eclipse (see Chart 1), this means that about 20% of all of the respondents use Eclipse with compile-on-save and thus benefit from the instant incremental builds.

The breakdowns of Ant and Maven times are quite similar, with Maven being slightly slower. Moreover the breakdown for IntelliJ IDEA is also similar, supporting the hypothesis that IntelliJ IDEA builds launch Ant or Maven in the background. The proportion itself likely corresponds to the size and complexity of projects, with smaller ones building quickly and larger ones taking build time of over 5 minutes an hour.

It is hard to draw any simple conclusions from this survey, as different groups have different problems, so instead let’s put together the most interesting and reliable numbers we got:

  • 56% of respondents have builds that last over half a minute, each.
  • 20% of respondents use Eclipse with compile-on-save and thus benefit from the instant incremental builds.
  • 34% of respondents spend an average of 3 minutes an hour on incremental builds, which corresponds to about 5% of total development time, or 1.5 weeks per year (40-hour weeks).
  • 32% of respondents spend more than 5 minutes per hour on incremental builds.  Of the developers spending more than 5 mins per hour, 13 minutes per hour is the average amount of time spent.  13 mins per hour equals 22% of total development time, or 6.5 weeks per year (40 hour workweeks).
  • Weeks spent on incremental builds are calculated by assuming 48 work weeks per year (minus vacation) and an average of 5 hours of development time per day.
  • Java developers spend 1.5 to 6.5 work weeks a year (with an average of 3.8 work weeks,  or 152 hours, annually) waiting for builds, unless they are using Eclipse with compile-on-save.
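The yearly figures above follow from simple arithmetic, sketched below. The BuildCost class is hypothetical, and the 5-day work week is an assumption: the list states only 48 work weeks, 5 development hours per day and 40-hour weeks, and 5 days per week is what reproduces the 1.5 and 6.5 week figures.

```java
// Hypothetical helper reproducing the "weeks per year" estimates.
class BuildCost {
  static final int WORK_WEEKS_PER_YEAR = 48;
  static final int WORK_DAYS_PER_WEEK = 5;   // assumption, not stated in the list
  static final int DEV_HOURS_PER_DAY = 5;
  static final int HOURS_PER_WORK_WEEK = 40;

  // Hours per year spent on builds, given minutes of build time per coding hour.
  static double hoursPerYear(double buildMinutesPerHour) {
    double devHoursPerYear =
        WORK_WEEKS_PER_YEAR * WORK_DAYS_PER_WEEK * DEV_HOURS_PER_DAY; // 1200
    return devHoursPerYear * buildMinutesPerHour / 60.0;
  }

  // The same cost expressed in 40-hour work weeks.
  static double weeksPerYear(double buildMinutesPerHour) {
    return hoursPerYear(buildMinutesPerHour) / HOURS_PER_WORK_WEEK;
  }

  public static void main(String[] args) {
    System.out.println(weeksPerYear(3));  // the "1 to 5 minutes" group
    System.out.println(weeksPerYear(13)); // the "over 5 minutes" group
  }
}
```

With these inputs, 3 minutes an hour comes to 60 hours (1.5 weeks) a year and 13 minutes an hour comes to 260 hours (6.5 weeks) a year.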

