Monday, November 10, 2008

Hibernate Query Cache: A Dirty Little Secret

You Mean, Memory Is Not Infinite?

We're working hard getting MySQL Enterprise Monitor 2.0, featuring Query Analyzer, ready for release. As part of that, we started really ramping up the number of MySQL servers reporting in query data to see how we could scale. Not surprisingly (to me, anyway), the first efforts did not go so well. My old friend OutOfMemoryError reared its ugly head once again.

Query Cache -- It's More Than Just Results!

We're big (ab)users of Hibernate query caching and, more importantly to us, the natural id optimized query cache. Firing up the profiler, I was not shocked to see that the second level (let's call it L2) and query caches were hogging the majority of memory. But something didn't smell right...

What I was seeing was tons of object/entity references for our persistent objects. However, the Hibernate cache does not really work that way. Hibernate 'dehydrates' query results and persistent objects into their primitive components and identifiers. It stores this decomposed data in the L2 and query results cache, and on a cache hit, it rehydrates/recomposes them into the requested persistent object.

But the profiler does not lie. One of our objects, for which there are conceptually 60 some odd instances, had over 20,000 referenced instances on the heap. Yikes. Obviously, we're doing something wrong, or are we... ?

Ok, Mr. Profiler, who is holding the references on the heap? Drill down a bit through the query cache, past some HashMap entries, into the... keys... keys, you say? Hrm, not values. Interesting. Well, looky here. Hibernate's QueryKey is holding on to an Object[] values array, which is holding the majority of our persistent object references. Ouch. In addition to that, it has a map inside of it whose values also contain a non-trivial number of references to our entities.

Well, nuts. Code spelunking ensues. QueryKey is just as it sounds -- an object that can act as a unique key for query results. This means it includes stuff like the SQL query text itself as well as any parameters (positional or named) that specifically identify a unique result set.
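To make that retention concrete, here is a minimal sketch under stated assumptions. SketchQueryKey and SketchQueryCache are hypothetical stand-ins invented for illustration, not Hibernate's classes; the only behavior borrowed from the description above is that a key carries the query text plus its parameter values, and that the cache holds the key for as long as the entry lives.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for a query cache key. NOT Hibernate's QueryKey.
final class SketchQueryKey {
    final String queryText;
    final Object[] values; // positional parameters, held by strong reference

    SketchQueryKey(String queryText, Object... values) {
        this.queryText = queryText;
        this.values = values.clone();
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof SketchQueryKey)) {
            return false;
        }
        SketchQueryKey other = (SketchQueryKey) o;
        return queryText.equals(other.queryText)
                && Arrays.equals(values, other.values);
    }

    @Override
    public int hashCode() {
        return 31 * queryText.hashCode() + Arrays.hashCode(values);
    }
}

// Hypothetical cache: maps keys to the ids of the matching rows.
final class SketchQueryCache {
    private final Map<SketchQueryKey, List<Long>> entries =
            new HashMap<SketchQueryKey, List<Long>>();

    void put(SketchQueryKey key, List<Long> resultIds) {
        entries.put(key, resultIds);
    }

    List<Long> get(SketchQueryKey key) {
        return entries.get(key);
    }

    // True if some cached key still references this exact instance.
    boolean retains(Object candidate) {
        for (SketchQueryKey key : entries.keySet()) {
            for (Object value : key.values) {
                if (value == candidate) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

In this sketch the cached values are only ids, yet whatever object is passed as a parameter stays reachable through the key until the entry is removed. That matches what the profiler showed: references hanging off the keys, not the values.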

Objects, Objects, Everywhere

Now, silly me, since we are using an object relational mapping, I was using objects for the parameters in my HQL. Something along the lines of:
final Cat mate = ...;
final String hql = "from Cat as cat where cat.mate = ?";
final Query q = session.createQuery(hql);
q.setParameter(0, mate);
q.setCacheable(true);
In this case, the entire Cat mate (and everything he references) would be held in perpetuity. Well, until either the query cache exceeds its configured limits and the entry is evicted, or the table is modified and the results become dirty.

Let's not forget our friends the Criteria queries, either. Because it is only through criteria that we can get our friend the natural id cache optimization. (and please pardon the contrived-ness of the cat natural id example)
final String foo = "something";
final Cat mate = ...;

final Criteria crit;
final NaturalIdentifier natId;

crit = session.createCriteria(Cat.class);
natId = Restrictions.naturalId();
natId.set("mate", mate);
natId.set("foo", foo);
crit.add(natId);
crit.setCacheable(true);
In the same fashion as the HQL, this will result in hard references to 'mate' and 'foo' held for the cache-life of the query results.

How To Make a Bad Problem Worse

Even worse, in our case, was the fact that we would do the equivalent of loading the same 'mate' over and over again (maybe this cat is severely non-monogamous). And whether loaded from L2 cache or directly from the database, the mate Cat now existed as multiple instances, even though they are hashCode()/equals() equivalent. But QueryKey in the query cache doesn't know that. He only knows what he is handed. And he is handed equivalent duplicates over and over and over again, and only lets go of them on cache eviction. So, not only do we end up with essentially unnecessary references to objects held by the query keys in the cache, we also instantiate multiple copies of the same object and hold on to those, too. Bear with me as I bow my head in shame...
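The duplication can be illustrated with a small sketch. Everything here is an assumption made for the example: DemoCat is a hypothetical entity with id-based equals()/hashCode(), load() simulates fetching the same row in separate sessions by returning a fresh instance each time, and the Object[] arrays stand in for the parameter values that separate cached query keys would retain.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical entity with id-based equality, standing in for a mapped Cat.
final class DemoCat {
    private final long id;

    DemoCat(long id) {
        this.id = id;
    }

    long getId() {
        return id;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof DemoCat && ((DemoCat) o).id == id;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(id);
    }
}

final class DuplicateInstanceDemo {
    // Simulates loading the same row in separate sessions: a new instance per load.
    static DemoCat load(long id) {
        return new DemoCat(id);
    }

    // Builds the parameter arrays that several distinct cached queries would retain.
    // Each query uses an equal 'mate' but a different second parameter.
    static List<Object[]> retainedAfter(int queries) {
        List<Object[]> retained = new ArrayList<Object[]>();
        for (int i = 0; i < queries; i++) {
            retained.add(new Object[] { load(42L), "foo-" + i });
        }
        return retained;
    }

    // Counts DemoCat instances by identity (==), not by equals().
    static int distinctCatInstances(List<Object[]> retainedValues) {
        Set<Object> seen =
                Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
        for (Object[] values : retainedValues) {
            for (Object value : values) {
                if (value instanceof DemoCat) {
                    seen.add(value);
                }
            }
        }
        return seen.size();
    }
}
```

With three such queries, the sketch retains three separate DemoCat instances that are all equal to one another. Scale that up to thousands of cached queries and a handful of conceptual entities can turn into a very large number of live objects.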

Fix Attempt 1: Make a Smarter Cache

I've been down this road before. I tried to be smarter than Hibernate once before. It did not end well. Unsullied by prior defeat, I resolved to attempt being smarter than Hibernate once again!

Hibernate's query cache implementation is pluggable. So I'm going to write my own. Ok, I'm not going to write my own -- from scratch. My going theory is that I can at least eliminate the duplication of the equivalent objects referenced in memory. I'm going to decorate hibernate's StandardQueryCache and do the following: For each QueryKey coming in a cache put(), introspect the Object[] values (which are positional parameters to the query). For each object in values[], see if an equivalent canonical object has already been seen (same hashCode/equals()). If so, use the canonical object. Else, initialize the canonical store with this newly seen object.

Notice we only have to do this on put(). A get() can use whatever objects already come in, as they are assumed to be hashCode/equals equivalent. Hell, it HAS to work that way, otherwise QueryKey would just be broken from the start. Here are some snippets of the relevant code, from a class that implements org.hibernate.cache.QueryCache.
public boolean put(QueryKey key, Type[] returnTypes,
        @SuppressWarnings("unchecked") List result, boolean isNaturalKeyLookup,
        SessionImplementor session) throws HibernateException {

    // duplicate natural key shortcut for space and time efficiency
    if (isNaturalKeyLookup && result.isEmpty()) {
        return false;
    }

    canonicalizeValuesInKey(key);

    return queryCache.put(key, returnTypes, result, isNaturalKeyLookup,
            session);
}

private void canonicalizeValuesInKey(QueryKey key) {
    try {
        final Field valuesField;
        valuesField = key.getClass().getDeclaredField("values");
        valuesField.setAccessible(true);
        final Object[] values = (Object[]) valuesField.get(key);
        canonicalizeValues(values);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}

private void canonicalizeValues(Object[] values) {
    synchronized (canonicalObjects) {
        for (int i = 0; i < values.length; i++) {
            Object object = values[i];
            Object co = canonicalObjects.get(object);
            if (co == null) {
                co = object;
                canonicalObjects.put(object, co);
            } else if (co != object) {
                // System.out.println("using pre-existing canonical object "
                // + co);
            }
            values[i] = co;
        }
    }
}
It's pretty much what I described. I didn't even attempt to get permission to post the whole thing, because it is probably not worth my time. The only things missing are a HashMap of the canonical objects and the instantiation of the StandardQueryCache queryCache. You'll also need to implement org.hibernate.cache.QueryCacheFactory to create this smarter query cache, and then plug that into your Hibernate config.
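For readers who want to try the canonicalization step in isolation, here is a self-contained sketch with no Hibernate dependency. CanonicalStore is a hypothetical name; it simply pairs the canonicalObjects map with the same replace-in-place loop shown above, and is not the actual class from our code base.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: remembers the first instance seen for each
// equals()/hashCode() equivalence class and hands it back for later duplicates.
final class CanonicalStore {
    private final Map<Object, Object> canonicalObjects =
            new HashMap<Object, Object>();

    // Replaces each element, in place, with its canonical equivalent.
    void canonicalizeValues(Object[] values) {
        synchronized (canonicalObjects) {
            for (int i = 0; i < values.length; i++) {
                Object candidate = values[i];
                Object canonical = canonicalObjects.get(candidate);
                if (canonical == null) {
                    // First time this value is seen: it becomes the canonical one.
                    canonical = candidate;
                    canonicalObjects.put(candidate, canonical);
                }
                values[i] = canonical;
            }
        }
    }

    // Number of distinct (by equals) values remembered so far.
    int size() {
        synchronized (canonicalObjects) {
            return canonicalObjects.size();
        }
    }
}
```

A usage example: canonicalize two arrays that each contain an equal but distinct object, and the second array ends up pointing at the instance from the first. One design caveat of this sketch is that the plain HashMap never releases its entries, so the store itself keeps one instance per distinct value alive indefinitely. That trades many duplicates for one long-lived copy, which may or may not be acceptable depending on how many distinct parameter values an application sees.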

This did work as expected. My 'outstanding' objects on the heap were greatly reduced. Unfortunately, it was not good enough. I still had thousands of these guys on the heap, essentially unused except to at some point fetch their numeric id to be used by hibernate's generated SQL. And this didn't take care of the named parameters, which are stored in a map of string names to some other internal hibernate class, which I no longer felt like introspecting via reflection. So anything using named parameters was still potentially duplicated.

Fix Attempt 2: Objects? Who Needs Objects?

Hrm, the last paragraph stirred a thought -- Hibernate only needs the ids from these objects. While the academic in me enjoyed the exercise in decorating the query cache to be smarter about duplicate references, the idiot in me said "well, duh. if you only need the id, why not just use the id?" Because we're supposed to use objects! Oh well. It occurred to me that I could rewrite any and all HQL to reference object ids instead of the object property reference itself. It should end up as the same SQL eventually sent to the database. Seems like a cheesy code monkey workaround, but the theory is that Hibernate's QueryKey will only be holding onto Integer and Long references instead of entire object entities.

So, I hunt down all of our cacheable queries. I change them all to object id queries. The previous Cat business can now look like this:
final Cat mate = <...>;
final String hql = "from Cat as cat where cat.mate.id = ?";
final Query q = session.createQuery(hql);
q.setParameter(0, mate.getId());
q.setCacheable(true);
cat.mate becomes cat.mate.id. The mapped parameter becomes mate.getId(). It could be a named parameter just as well. I didn't find a single HQL in our application that I could not convert this way. Good.

But what about our friend Criteria? He requires objects for the properties, right? At first, I thought this was true. And I was consoled by the fact that my smart query cache would do its best to keep duplication at a minimum. In fact, it was the next day (after resting my brain) that another "I'm an idiot" moment came to light. The Criteria API still just takes strings for property references. Perhaps it follows the same property resolution rules as HQL? In other words, instead of "mate", can I say "mate.id"? And the answer is, YES, yes I can! Woo hoo! Absolutely no more object references for me! Here is what the criteria would look like:
final String foo = "something";
final Cat mate = ...;

final Criteria crit;
final NaturalIdentifier natId;

crit = session.createCriteria(Cat.class);
natId = Restrictions.naturalId();
natId.set("mate.id", mate.getId());
natId.set("foo", foo);
crit.add(natId);
crit.setCacheable(true);
Subtle, yes. But trust me, it makes a huge difference. Interestingly, Fix Attempt 2 likely alleviates the need for Fix Attempt 1. At worst, we end up with lots and lots of Integer/Long object references and even duplicates of them. The profiler says that does not amount to very much in my limited testing. But I decide to leave the smarter cache in, because it appears to be working, and it does reduce memory.

Lessons Learned

If you use hibernate query caching, and actually want to use memory for caching useful results, and waste as little as possible with overhead, follow some simple advice:
  • Write your HQL queries to use identifiers in any substitutable parameters: WHERE clauses, IN lists, etc. Using full objects results in the objects being kept on the heap for the life of the cache entry.
  • Write your Criteria restrictions to use identifiers as well.
  • Use the 'smart' query cache implementation skeleton in this article to eliminate duplicate objects used in query keys. This helps a little if you use the ids in your HQL/Criteria, but if you still must use objects then it helps a lot.
  • If you use a smart query cache implementation, prefer '?' positional parameters over named parameters. I know, I know. I love named parameters too. But the smart query cache implementation in this article only detects duplicates for positional parameters (or params in criteria). Alternatively, feel free to extend the example to locate duplicates in the named parameter map and replace them with canonical ones as well.

I'm Not the Only One Who Noticed...

In doing some final research for this post, I came across Hibernate core issue HHH-3383. Let's keep our eye on it to see if the wonderful Hibernate devs can fix this down at their level, so we don't have to change our code. Also, the issue states that Criteria cannot be fixed with the same 'use id' workaround. Since I was able to, I wonder if the bug submitter did not realize you can reference dotted property paths in criteria restrictions exactly as you can in HQL. Perhaps I shall start a dialog with him.