Java troubleshooting: out of memory errors

Mark, Founder & Expert Button Pusher of Teaspoon Consulting

Running out of memory is one of the most common issues affecting Java (and other JVM language) applications in production. This article explains how to recognise low memory issues, and uses a small program to demonstrate some of the tools you can use to find what's using all your memory.

Overview

A typical OutOfMemoryError: java.lang.OutOfMemoryError: Java heap space

Memory issues are an unfortunate part of the Java landscape. If you're running programs on the Java Virtual Machine (JVM) and haven't seen an error like the one shown above, count yourself lucky. For everyone else, these sorts of issues are all too common, often dealt with by increasing heap sizes or trying random permutations of JVM switches until they go away.

We tend to see these sorts of issues with the JVM because it runs with a fixed maximum heap size, set when the JVM first starts. As the person running an application, it's up to you to estimate the peak amount of memory your application will require, then to size the JVM heap accordingly. This is tough to get right and often requires some level of experimentation.
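
As a quick sanity check, it can help to see what heap your JVM actually ended up with. Here's a minimal sketch using the standard Runtime API (the class name HeapReport is just mine); it reports the heap ceiling and current usage:

 public class HeapReport
 {
     public static void main(String[] args) {
         Runtime rt = Runtime.getRuntime();
         long mb = 1024 * 1024;

         // maxMemory() is the ceiling set by -Xmx, totalMemory() is the heap
         // the JVM has actually claimed so far, and freeMemory() is the
         // unused portion of that claimed heap.
         System.out.println("Max heap:   " + (rt.maxMemory() / mb) + "MB");
         System.out.println("Total heap: " + (rt.totalMemory() / mb) + "MB");
         System.out.println("Free heap:  " + (rt.freeMemory() / mb) + "MB");
     }
 }

Comparing what it prints against your -Xmx setting is a cheap way to confirm your switches are actually being picked up.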

Complicating this, low memory issues are not always obvious. If you're lucky, your application crashes outright when it runs out of memory. But just as often it continues limping along, with the garbage collector constantly trying to free some space. This often causes memory issues to be misdiagnosed as application performance issues: top reports high CPU usage from the application's JVM, but it's actually the garbage collector thread doing all the work.

This article focuses on two topics: how to know when you're running out of memory, and how to track down the spot in your application that's using it all.

Identifying low memory

If you're lucky, you know your application is running out of memory because it has told you. If you see this in your log files:

 Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

Then you're running out of memory. A popular option at this point is to try increasing your JVM heap size with a JVM switch like:

 -Xmx1g

and then hope your problem goes away. This might actually be the right thing to do! If your application's working set peaks at around 1GB for normal usage, then there's no problem here: set your JVM options and take the rest of the day off.

For those still here, we need to do some more digging. An excellent starting point is to run your application with the -verbose:gc JVM switch. This tells the garbage collector to log diagnostic information after each garbage collection run. Once you enable this, you'll see lines like the following in your console (or log file):

 [GC 52082K->44062K(116544K), 0.0040520 secs]
 [GC 58654K->45190K(116544K), 0.0042610 secs]
 [GC 59782K->47526K(116544K), 0.0015520 secs]
 [GC 71590K->52294K(116544K), 0.0010260 secs]
 [GC 66886K->93618K(116544K), 0.0074450 secs]
 [Full GC 93618K->42708K(116544K), 0.0126130 secs]
 [GC 56599K->49312K(116544K), 0.0040300 secs]
 [GC 63904K->50976K(116544K), 0.0069590 secs]
 [GC 65568K->53248K(116544K), 0.0014260 secs]
 [GC 77056K->57888K(116544K), 0.0019720 secs]
 [Full GC 99310K->47201K(116544K), 0.0139520 secs]
 [GC 61793K->54476K(116544K), 0.0044440 secs]
 [GC 69068K->56180K(116544K), 0.0054920 secs]
 [GC 70772K->58452K(116544K), 0.0019430 secs]
 [GC 82260K->63060K(116544K), 0.0008430 secs]
 [Full GC 99119K->50845K(116544K), 0.0137490 secs]
 [GC 65437K->58441K(116544K), 0.0050000 secs]
 [GC 73033K->60161K(116544K), 0.0054100 secs]

Each line records the details of a garbage collection run. The lines marked with GC represent garbage collections of the JVM's young generation, while lines with Full GC represent garbage collections of the JVM's old generation.

We won't describe the complete workings of the JVM's memory management here, so here's the short version: the young generation and old generation are different areas of memory where objects are stored, and the JVM garbage collects each area using a different method. The young generation garbage collector is optimised for lots of objects that live fast and die young; the old generation garbage collector is optimised for long-lived objects that are more likely to stick around than not. Objects are allocated from the young generation when first created and, if they live long enough, eventually wind up in the old generation where they live out their days listening to the radio and complaining about the young generation. Good.
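
If you'd like to watch the two collectors in action, a throwaway sketch like the one below will do it (this is my own illustration, not part of the example we'll debug later, and the class name GenerationDemo is made up). Run it with -verbose:gc and a small heap, say -Xmx64m: the short-lived arrays are swept up cheaply by the young generation collector, the ones we keep get promoted to the old generation, and once that fills up you'll start seeing Full GC lines (and, eventually, an OutOfMemoryError, which is rather the point).

 import java.util.ArrayList;
 import java.util.List;

 public class GenerationDemo
 {
     public static void main(String[] args) {
         // Run with: java -verbose:gc -Xmx64m GenerationDemo
         List<byte[]> survivors = new ArrayList<byte[]>();

         for (int i = 0; ; i++) {
             // Most of these arrays become garbage immediately, so they're
             // reclaimed by cheap young generation collections ("GC" lines).
             byte[] scratch = new byte[64 * 1024];

             // Every hundredth array is kept. These live long enough to be
             // promoted to the old generation, which slowly fills up and
             // eventually forces "Full GC" runs.
             if (i % 100 == 0) {
                 survivors.add(scratch);
             }
         }
     }
 }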

So we tend to see more GC lines than Full GC lines, but they're both telling us the same stuff. Picking one line at random:

 [GC 52082K->44062K(116544K), 0.0040520 secs]

This tells us that during this garbage collection run against the young generation:

- heap usage before the collection was 52082K (roughly 51MB);
- heap usage after the collection was 44062K (roughly 43MB), meaning around 8MB of garbage was freed;
- the total size of the heap was 116544K (roughly 114MB);
- the whole collection took 0.0040520 seconds.

What low memory problems look like

When an application is in trouble, you'll start to see a relatively high number of Full GC messages. That's a sign that objects are sticking around and filling up the old generation area:

 [GC 85682K->73458K(104448K), 0.0010870 secs]
 [GC 88050K->75730K(116160K), 0.0017100 secs]
 [Full GC 100370K->80306K(116160K), 0.0110770 secs]
 [Full GC 95730K->81808K(116160K), 0.0256440 secs]
 [Full GC 97232K->76495K(116160K), 0.0116220 secs]
 [Full GC 90013K->77613K(116160K), 0.0114290 secs]
 [Full GC 93037K->82221K(116160K), 0.0060700 secs]
 [Full GC 91372K->82222K(116160K), 0.0081970 secs]
 [Full GC 97646K->84525K(116160K), 0.0097500 secs]
 [Full GC 98993K->87661K(116160K), 0.0202760 secs]

People often misinterpret garbage collection logs by focusing on the number on the left of the arrow (heap utilisation before the collection). A high number there isn't necessarily a problem: the JVM's normal mode of operation is to accumulate garbage until some threshold is reached, then run the garbage collector to clear it. What matters is how much memory remains in use after each collection.

If we take some garbage collection logs and plot, over time, the numbers on either side of the arrows (showing heap utilisation before and after garbage collection), a healthy application looks like this:

A healthy GC memory pattern

Here we see a sawtooth pattern, where the points at the top represent heap utilisation prior to garbage collection, and the points at the bottom represent heap utilisation immediately after. It's the points at the bottom that matter here: we see that the garbage collector is successfully freeing memory and reverting heap utilisation back to a fairly consistent baseline.
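
If you want to produce a plot like this from your own logs, the lines are easy enough to pull apart. Here's a rough sketch of a log scraper (GCLogToCSV is my own name for it, and it only understands the simple [GC before->after(total), time] format shown above; more recent JVMs use a different GC log format). It reads -verbose:gc output on standard input and writes the before and after figures as CSV for whatever plotting tool you prefer:

 import java.io.BufferedReader;
 import java.io.InputStreamReader;
 import java.util.regex.Matcher;
 import java.util.regex.Pattern;

 public class GCLogToCSV
 {
     public static void main(String[] args) throws Exception {
         // Matches lines like: [GC 52082K->44062K(116544K), 0.0040520 secs]
         Pattern gcLine =
             Pattern.compile("\\[(Full )?GC (\\d+)K->(\\d+)K\\((\\d+)K\\)");

         BufferedReader in =
             new BufferedReader(new InputStreamReader(System.in));

         System.out.println("collection,full,before_kb,after_kb,total_kb");

         String line;
         int collection = 0;
         while ((line = in.readLine()) != null) {
             Matcher m = gcLine.matcher(line);
             if (m.find()) {
                 boolean isFull = (m.group(1) != null);
                 System.out.println(collection++ + "," + isFull + "," +
                                    m.group(2) + "," + m.group(3) + "," +
                                    m.group(4));
             }
         }
     }
 }

Plotting the before_kb and after_kb columns against the collection number gives you the sawtooth (or, if you're unlucky, the upward creep) described here.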

If we plot the garbage collection log lines shown above, things aren't so rosy:

An application in trouble!

This application seems to be consuming more memory over time. There's a lot of variability in how much memory the garbage collector frees during each run, and a long-term trend of freeing less memory with each collection.

To summarise, you might have low memory problems if:

- your application is throwing java.lang.OutOfMemoryError;
- your garbage collection logs show frequent Full GC runs;
- the heap utilisation immediately after each garbage collection is trending upwards over time.

If that's the case, it's time to start tracking down your memory leak.

Finding memory leaks

In this section we'll look at some of the tools you can use for finding memory leaks in your code. In the spirit of contrived examples, here's a contrived example! This example has two parts:

- a GibberishGenerator, which produces very long lines of random gibberish "words"; and
- a main method, which generates 1,000 of those lines and keeps the first "good" word (one with as many vowels as consonants) it finds on each line.

Here's the code:

 import java.util.ArrayList;
 import java.util.List;
 import java.util.Iterator;
 import java.util.Random;


 public class MemoryLeak
 {
     // Generates long lines of gibberish words.
     static class GibberishGenerator implements Iterator<String>
     {
         private int MAX_WORD_LENGTH = 20;
         private int WORDS_PER_LINE = 250000;
         private String alphabet = ("abcdefghijklmnopqrstuvwxyz" +
                                    "ABCDEFGHIJKLMNOPQRSTUVWXYZ");

         public boolean hasNext() {
             return true;
         }

         public String next() {
             StringBuffer result = new StringBuffer();
             for (int i = 0; i < WORDS_PER_LINE; i++) {
                 if (i > 0) { result.append(" "); }
                 result.append(generateWord(MAX_WORD_LENGTH));
             }

             return result.toString();
         }

         public void remove() {
             // not implemented
         }


         private String generateWord(int maxLength) {
             int length = (int)(Math.random() * (maxLength - 1)) + 1;
             StringBuffer result = new StringBuffer(length);
             Random r = new Random();

             for (int i = 0; i < length; i++) {
                 result.append(alphabet.charAt(r.nextInt(alphabet.length())));
             }

             return result.toString();
         }
     }


     // A "good" word has as many vowels as consonants and is more than two
     // letters long.
     private static String vowels = "aeiouAEIOU";

     private static boolean isGoodWord(String word) {
         int vowelCount = 0;
         int consonantCount = 0;

         for (int i = 0; i < word.length(); i++) {
             if (vowels.indexOf(word.charAt(i)) >= 0) {
                 vowelCount++;
             } else {
                 consonantCount++;
             }
         }

         return (vowelCount > 2 && vowelCount == consonantCount);
     }


     // Generate 1,000 lines of gibberish, keeping the first "good" word
     // found on each line.
     public static void main(String[] args) {
         GibberishGenerator g = new GibberishGenerator();
         List<String> goodWords = new ArrayList<String>();

         for (int i = 0; i < 1000; i++) {
             String line = g.next();
             for (String word : line.split(" ")) {
                 if (isGoodWord(word)) {
                     goodWords.add(word);

                     System.out.println("Found a good word: " + word);
                     break;
                 }
             }
         }
     }
 }

This code has a memory leak (gasp!). Let's give it a run with a modest JVM heap and see what happens:

 $ java -verbose:gc -Xmx64m MemoryLeak
 ...
 Found a good word: EDeGOG
 [Full GC 44032K->32382K(51136K), 0.0088630 secs]
 [GC 39742K->33566K(58368K), 0.0027320 secs]
 [GC 44382K->36990K(58304K), 0.0041990 secs]
 [Full GC 45982K->44686K(58304K), 0.0224270 secs]
 Found a good word: eZjovI
 [Full GC 51071K->38076K(58304K), 0.0059460 secs]
 [GC 45436K->39228K(58304K), 0.0007690 secs]
 [GC 46588K->40412K(53440K), 0.0009190 secs]
 [Full GC 52380K->42684K(53440K), 0.0085100 secs]
 [Full GC 50044K->42684K(53440K), 0.0086750 secs]
 [Full GC 50044K->42313K(53440K), 0.0092640 secs]
 [Full GC 47351K->42313K(53440K), 0.0077370 secs]
 [GC 42313K->42313K(57984K), 0.0008850 secs]
 [Full GC 42313K->42297K(57984K), 0.0151990 secs]
 Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
     at java.util.Arrays.copyOf(Arrays.java:2882)
     at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
     at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
     at java.lang.StringBuffer.append(StringBuffer.java:224)
     at MemoryLeak$GibberishGenerator.next(MemoryLeak.java:24)
     at MemoryLeak.main(MemoryLeak.java:74)

It finds a few words and then runs out of memory. Let's look at some tools for finding out more.

Using jmap

The jmap tool comes as standard with the Java SDK, so there's a good chance you have it already. Using jmap, you can connect to a running JVM and walk its memory pools, printing a summary of the number of instances of each class and how many bytes of heap each is using.

To start, get the process ID of the JVM you're debugging:

 $ ps -ef | grep MemoryLeak
 mst      14979 14511 99 13:53 pts/29   00:00:03 java -verbose:gc -Xmx64m MemoryLeak

Then read the warning below and instruct jmap to attach to that JVM and walk its heap:

 $ jmap -F -histo 14979
 Attaching to process ID 14916, please wait...
 Debugger attached successfully.
 Server compiler detected.
 JVM version is 20.12-b01
 Iterating over heap. This may take a while...
 Object Histogram:

 num       #instances    #bytes  Class description
 --------------------------------------------------------------------
 1:              952     37806240        char[]
 2:              4997    684344  * MethodKlass
 3:              4997    617648  * ConstMethodKlass
 4:              344     381264  * ConstantPoolKlass
 5:              7790    346488  * SymbolKlass

Warning: The target JVM will be suspended while jmap does its thing, so don't do this on an application that people are actually using!

Looking at the top few lines of jmap's output, there is already a clue: at the moment we took our sample, there were 952 character arrays occupying some 37MB of our 64MB JVM heap. Seeing big character arrays at the top of the list is often a good sign that there are some big strings being stored somewhere. Interesting...

A handy trick for running jmap (or any command!) at the exact moment your JVM throws an OutOfMemoryError is to use the -XX:OnOutOfMemoryError JVM switch. You can invoke it like this:

 java -Xmx64m -XX:OnOutOfMemoryError='jmap -histo -F %p' MemoryLeak

The JVM's process ID will automatically be substituted for the %p placeholder.

Working with heap dumps

Sometimes seeing the jmap histogram is enough to make you realise where the memory leak is—if the topmost entry is one of your own classes (like MyBrilliantCacheEntry) then just curse yourself and move on. But, unfortunately, the largest user of space is very often either char[] or byte[], so all you have is the not-that-helpful knowledge that your memory is being used by... data?

At this point, a heap dump can be helpful. A heap dump gives you a static snapshot of your application's memory and, using tools like jvisualvm or the Eclipse Memory Analyzer, you can inspect your heap dump and look at what's in there.

Getting a heap dump using jmap

The jmap tool can connect to a running JVM and produce a heap dump. Here's how that works:

 $ jmap -F -dump:format=b,file=heap.bin 14979
 Attaching to process ID 16610, please wait...
 Debugger attached successfully.
 Server compiler detected.
 JVM version is 20.12-b01
 Dumping heap to heap.bin ...
 Heap dump file created

This produces a file called heap.bin that can be read by several analysis tools. We'll see how that works in the following sections.

Getting a heap dump on OutOfMemoryError

If you can run your JVM with special switches, you can ask it to produce a heap dump whenever an OutOfMemoryError is thrown. Just run your JVM with:

 java -Xmx64m \
      -XX:+HeapDumpOnOutOfMemoryError \
      -XX:HeapDumpPath=/tmp/heap.bin  \
      MemoryLeak

As before, this will produce a heap dump file called heap.bin (written to /tmp this time, as specified by the HeapDumpPath switch).

Producing a heap dump from a core file

Usually one of the previous two ways is sufficient to get a heap dump, but here's a bit of trivia. If you connect to your JVM process with gdb:

 $ gdb -p 14979
 GNU gdb (GDB) 7.4.1-debian
 ...
 (gdb)

you can use it to dump a standard Unix core file:

 (gdb) gcore
 Saved corefile core.14979

and then use jmap to create a heap dump from the resulting core file:

 $ jmap -dump:format=b,file=heap.bin `which java` core.14979
 Attaching to core core.14979 from executable /usr/local/bin/java, please wait...
 Debugger attached successfully.
 Server compiler detected.
 JVM version is 20.12-b01
 Dumping heap to heap.bin ...
 Finding object size using Printezis bits and skipping over...
 Heap dump file created

This also works under Solaris for core files produced with the gcore command.

Exploring heap dumps

Whichever method you chose, you should now have a file called heap.bin containing a snapshot of your application's memory from around the time it was having problems.

There are several tools that you can use to analyse heap dumps. If you're feeling old-school, there's jhat, which comes with the Java SDK. Run it against your dump file like this:

 $ jhat heap.bin
 Reading from heap.bin...
 Dump file created Tue Oct 29 14:19:49 EST 2013
 Snapshot read, resolving...
 Resolving 29974 objects...
 Chasing references, expect 5 dots.....
 Eliminating duplicate references.....
 Snapshot resolved.
 Started HTTP server on port 7000
 Server is ready.

and then browse to http://localhost:7000/ to explore the heap dump.

For a long time, jhat was my go-to for this sort of thing. But these days, the interface provided by jvisualvm is my weapon of choice. The Java SDK ships with jvisualvm as standard, so you should be able to start it up by running jvisualvm (or just visualvm on some systems). You should see something like:

JVisualVM

We'll load our application's heap dump using File → Load, selecting the heap.bin file we created earlier.

You'll see your heap dump open in the main jvisualvm pane:

Heap dump loaded

Clicking the Classes tab, we see pretty much what jmap showed us before: lots of char[] instances using all of our memory.

The JVisualVM classes display

From this screen, we can drill down and look at the instances themselves. Double-click on the entry for char[] and you'll be transferred to the Instances tab.

The JVisualVM instances display

At the top of the list on the left, we see several instances, each about 5MB. That's a lot of character data! Looking at the pane on the right gives the game away:

 qUqCMAzhjybtUitV vgiUVYuTcROHFygG gjoYlXqOhKdQkvOqot gEQlgdcINvhqjxJ ...

Welsh?! No... gibberish.

An epiphany!

What jvisualvm is showing us is seven large strings, each around 5MB, and each containing lots of gibberish words. That's the clue we needed: something in our program is holding the input lines of gibberish in memory, stopping the garbage collector from freeing them.

Now that you know the cause, you might like to return to the original code and see if you can spot the problem. Spoiler below...




Spoiler

Here's the culprit:

             String line = g.next();
             for (String word : line.split(" ")) {
                 if (isGoodWord(word)) {
                     goodWords.add(word);

We take each (5MB) line of gibberish, split it into words, check each word, and keep the first good word we find. What's wrong with that?

It's actually a curiosity of Java: strings are immutable, so there's no harm if two strings share the same character array behind the scenes. In fact, Java uses this to good effect to make taking substrings very fast. If we have two strings like:

 String s1 = "hello world";
 String s2 = "hello world".substring(0, 5);

Then s2 can be created without the need for an additional character array at all: it just stores a reference to the character array of s1, along with the appropriate offset and length information to capture the fact that it's a "slice" of the original string.

In the case of our code, split creates substrings of the original 5MB input line, and each substring we keep holds a reference to the character array backing that original line. Even after the original string is long gone and garbage collected, its 5MB character array lives on inside the tiny substring we stored in goodWords! It's sort of beautiful, really.

We can fix it by explicitly cloning the substring:

             String line = g.next();
             for (String word : line.split(" ")) {
                 if (isGoodWord(word)) {
                     goodWords.add(new String(word));

and now our program runs without blowing up, and this concludes our emotional journey.

Update: As of Java 1.7 update 6, String behaviour has changed so that taking a substring no longer shares the original character array: the substring gets a copy of just the characters it needs. That's a bit of a blow to my contrived example, but the overall approach to finding the issue still holds.

Update 2: Since around 2017 I've used YourKit for this sort of stuff. It's a great general-purpose profiler and its tools for memory analysis are hard to beat. It costs money (like you) and if you do this sort of thing often it's well worth it.

Links

Some of the tools mentioned in this article:

- jmap and jhat (included with the Java SDK)
- jvisualvm / VisualVM
- Eclipse Memory Analyzer
- YourKit

Need a hand? Find me at mark@teaspoon-consulting.com.