Friday, May 23, 2008

Reliability testing – Detecting and fixing memory leaks

Last month, I ran into an interesting memory leak in production. I would like to share how we diagnosed and fixed the problem.

While I was working for a client, we released a new version of the application and observed some strange behavior: every Monday, the application threw an “Out of Memory” exception. After talking to the users, it became clear that the load on the system was 50–60% higher on Mondays. This strongly suggested a memory leak that exhausted the heap only under the heavier Monday load.

The client had not invested in a dedicated QA environment, so we could not reproduce or diagnose the problem outside production. We started looking for a free tool that could help us diagnose the system in the production environment itself.

We finally settled on JConsole, which ships with JDK 5.0. It is a free tool for monitoring memory utilization over time. It clearly shows how many threads are running and when garbage collection happens. It even lets you trigger a garbage collection on demand.
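JConsole gets these figures from the platform MXBeans in `java.lang.management`, so the same numbers can also be read from inside the application. The following is only a minimal sketch of that idea, not something we ran in the project; the class name `HeapWatch` and its method names are made up for illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper: reads the same kind of data JConsole displays.
class HeapWatch {

    /** Heap bytes currently in use, as reported by the memory MXBean. */
    static long usedHeapBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        return heap.getUsed();
    }

    /** Number of live threads (daemon and non-daemon). */
    static int liveThreadCount() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    /** One-line summary: heap in use, thread count, and per-collector GC counts. */
    static String snapshot() {
        StringBuilder sb = new StringBuilder();
        sb.append("heapUsed=").append(usedHeapBytes());
        sb.append(" threads=").append(liveThreadCount());
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionCount() returns -1 if the collector does not report a count.
            sb.append(' ').append(gc.getName())
              .append(".count=").append(gc.getCollectionCount());
        }
        return sb.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        // Print a few samples one second apart.
        for (int i = 0; i < 3; i++) {
            System.out.println(snapshot());
            Thread.sleep(1000);
        }
        // Request a collection, similar to asking for a GC from the JConsole window.
        ManagementFactory.getMemoryMXBean().gc();
        System.out.println("after GC request: " + snapshot());
    }
}
```

If the used heap keeps climbing from sample to sample even after a requested collection, that is the kind of pattern that points toward a leak rather than ordinary garbage.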

Using this tool, we could narrow down the portion of the code where the problem was likely to be. We then did a thorough walkthrough of that portion and found that each call to certain functions was being made by starting a new session. The same problem existed in multiple places in the application. After making the changes, we could see the difference in JConsole.
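I cannot share the client's code, so the following is a purely illustrative sketch of how starting a new session per call can leak when the session is never released. Everything in it is hypothetical: the `Session` class, its static registry, and the `leakyCall` and `safeCall` methods stand in for whatever your application actually uses, and the `try`/`finally` cleanup shown is one common remedy, not necessarily the exact change we made.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a per-call session leak (not thread-safe, for brevity).
class SessionLeakSketch {

    /** Stand-in session type: every open session stays reachable until closed. */
    static class Session {
        private static final List<Session> OPEN = new ArrayList<Session>();
        private final byte[] buffer = new byte[1024]; // memory held per session

        static Session open() {
            Session s = new Session();
            OPEN.add(s); // the registry keeps the session alive
            return s;
        }

        void close() {
            OPEN.remove(this); // releasing the reference lets the GC reclaim it
        }

        static int openCount() {
            return OPEN.size();
        }

        int call() {
            return buffer.length;
        }
    }

    /** Leaky pattern: a new session per call that is never closed. */
    static int leakyCall() {
        Session s = Session.open();
        return s.call();
    }

    /** Safer pattern: the session is always closed, even if the call throws. */
    static int safeCall() {
        Session s = Session.open();
        try {
            return s.call();
        } finally {
            s.close();
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            leakyCall();
        }
        System.out.println("open sessions after leaky calls: " + Session.openCount());
        for (int i = 0; i < 1000; i++) {
            safeCall();
        }
        System.out.println("open sessions after safe calls: " + Session.openCount());
    }
}
```

In a sketch like this, the leaky variant leaves one more session reachable per call, which is why the problem shows up sooner on a busier day.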

More details of JConsole are available at the following location:
http://java.sun.com/developer/technicalArticles/J2SE/jconsole.html