How easily can you cause OutOfMemoryException?


OutOfMemoryException despite GC

This post will be a rather short note than a fully fledged article. In a few words I will show you (and warn you) how easily you can cause an OutOfMemoryException despite the GC doing its great job. The example comes from my own short struggle with said issue.

Background

A few months ago I was indexing over 16 million XML documents containing geospatial data into Elasticsearch. It involved the great elastic4s library (which I personally recommend when interacting with ES from Scala) and Observable from Monix.

Code

The main for-comprehension included three steps: creating a backup of the index, indexing the data from scratch and cleaning up afterwards. It looked something like this:

val geoFetcherObs: Observable[File] =
  GeoFetcher.fetchGeoDataObs(tempDataPath, fetchingParallelism)

for {
  _       <- createBackup(client)
  _       <- processData(client, geoFetcherObs).toListL
  result  <- realiasAndCleanup(client)
} yield result

The type returned by the second step was somewhat complicated (and actually crucial): List[Either[Throwable, RequestSuccess[BulkResponse]]].

That type originated from the processData function, whose body was as follows:

def processData(client: EsClient, fetcherObservable: Observable[File]):
    Observable[Either[Throwable, RequestSuccess[BulkResponse]]] = {
      fetcherObservable
        .transform(extractZipFiles)
        .transform(parseXmls)
        .transform(toIndexable)
        .bufferIntrospective(conf.batchSize)
        .transform(indexBatch(client))
  }
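Setting elastic4s and Monix aside, a stage like extractZipFiles above boils down to unpacking each downloaded archive into its XML entries. Below is a minimal, Monix-free sketch using only java.util.zip from the JDK; the File-based shape mirrors the snippet above, but the function name and body are my assumption, not the post's actual code:

```scala
import java.io.{File, FileOutputStream}
import java.util.zip.ZipFile
import scala.jdk.CollectionConverters._

// Extract every .xml entry of a zip archive into the target directory
// and return the created files. A plain-Scala stand-in for one stage
// of the Observable pipeline (hypothetical helper, for illustration).
def extractZipFile(zip: File, targetDir: File): List[File] = {
  val archive = new ZipFile(zip)
  try {
    archive.entries.asScala
      .filter(e => !e.isDirectory && e.getName.endsWith(".xml"))
      .map { entry =>
        // Flatten any directory structure inside the archive.
        val out = new File(targetDir, new File(entry.getName).getName)
        val in  = archive.getInputStream(entry)
        val os  = new FileOutputStream(out)
        try in.transferTo(os)
        finally { in.close(); os.close() }
        out
      }
      .toList // materialise before the archive is closed
  } finally archive.close()
}
```

In the real pipeline each such stage is an Observable transformer, so files flow through one at a time instead of being collected up front.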

As you can see, the stages of the stream processing involved:
– getting zip file from stream,
– deflating it to separate XML files,
– parsing XML files,
– converting parsed data to an indexable form,
– dividing into batches,
– finally, indexing each batch of data into an Elasticsearch index.
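The batching step deserves a word: bufferIntrospective groups incoming elements into batches of up to conf.batchSize, so each bulk request carries many documents at once. A rough stdlib analogy (bufferIntrospective's batch sizes actually depend on downstream back-pressure, unlike the fixed-size grouped below) is:

```scala
// Rough analogy for bufferIntrospective: chop a stream of documents
// into chunks of at most `batchSize`, one chunk per bulk request.
val docs      = (1 to 10).iterator // stand-in for the document stream
val batchSize = 4
val batches   = docs.grouped(batchSize).toList
// Three batches: two full ones of 4 elements, one remainder of 2.
```

Each batch would then be handed to indexBatch, which fires a single bulk request against Elasticsearch.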

Outcome

At first, after starting the stream, everything looked great. Files with geospatial data were downloaded and extracted. Then, after XML parsing, thousands of documents were landing safely in the index. Well, until an OutOfMemoryException happened…

Applying many potential solutions (which I will spare you) didn't work at all. The next logical step was looking into the JVM and heap state during the application run. In such cases VisualVM is your best friend. To make my application finish successfully I needed 64GB of RAM, which was a screaming symptom that something was really wrong. Here is a screenshot from VisualVM (you can click it to see it bigger):

VisualVM screenshot - before fix

As you can see, CPU usage was acceptable but the Garbage Collector was struggling to reclaim as much heap as it could. In fact, at the end of the application run there was 20GB of heap space that couldn't be released at all. Maybe some memory leak? Maybe not in my code but in the elastic4s library? (Yeah, for sure… ( ͡° ͜ʖ ͡°))

After some more time it suddenly hit me what I was accumulating. Do you still remember that .toListL from the first code snippet, producing List[Either[Throwable, RequestSuccess[BulkResponse]]]? Well, that list contained the statuses of successful indexing for roughly 16 million documents and occupied … 20GB of RAM.
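A quick back-of-envelope check makes the 20GB plausible. Assuming each retained RequestSuccess[BulkResponse] keeps on the order of 1.3KB of response data alive (my rough estimate, not a measured number), 16 million of them add up to roughly the leaked amount:

```scala
// Rough, assumed numbers: ~1.3KB retained per indexing status.
val documents      = 16000000L
val bytesPerStatus = 1300L // assumption, not measured
val totalGb        = documents * bytesPerStatus / math.pow(1024, 3)
// On the order of 19-20 GB: the same magnitude as the unreleasable heap.
```

The exact per-status footprint doesn't matter much; at 16 million elements, even a modest per-element cost multiplies into tens of gigabytes.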

Problem solved

Finally, when the (shameful) problem was located, I realized I didn't need those statuses at all because I wasn't doing anything with them. All I needed was to wait until indexing finished. The solution couldn't be easier: I just changed .toListL to .completedL which, in the for-comprehension, returns just Unit. And that's really all. Here is the VisualVM screenshot after applying the fix (again, you can click it to see it in a bigger resolution):
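The same trap is easy to reproduce with plain collections: materialising a stream into a List pins every element in memory until the List itself dies, while consuming it for its side effects lets each element be collected immediately. A rough stdlib analogy of .toListL vs .completedL (the sizes below are illustrative only):

```scala
// An iterator of "indexing statuses"; each Array stands in for one
// RequestSuccess[BulkResponse]. `def` so each call gets a fresh iterator.
def statuses: Iterator[Array[Byte]] =
  Iterator.fill(1000)(new Array[Byte](1024))

// Like .toListL: every status stays reachable until the List is dropped,
// so ~1MB is pinned on the heap at once.
val accumulated: List[Array[Byte]] = statuses.toList

// Like .completedL: each status becomes garbage as soon as it is
// consumed, so only ~1KB is live at any moment.
val done: Unit = statuses.foreach(_ => ())
```

The GC was never at fault: it simply cannot free what the program still holds a reference to.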

VisualVM screenshot - after fix

What you can see is 128MB of RAM being perfectly enough for the whole application lifespan. That's 512 times less than the initial 64GB!

Summary

Despite the Garbage Collector living inside the JVM and doing its dirty job, you can still hurt yourself and cause an OutOfMemoryException. In my case the GC could not release the heap occupied by indexing statuses because I explicitly requested to keep all of them. It happened when I inattentively invoked .toListL. Beware of such practices, even in the JVM world. Good luck! 🙂