capricious diatribes: Hibernate: Cache Queries the Natural Id Way

Wednesday, June 18, 2008

Hibernate: Cache Queries the Natural Id Way

I work on the MySQL Enterprise Tools team, formerly of MySQL and now with Sun Microsystems. The 2.0 version of the Enterprise Monitor is well under way. As part of this, the Java server backend has been refactored to utilize Spring and Hibernate. Honestly, I didn't know either one of those technologies before starting this project. Oh, what a fun road it has been...

A big draw of an off-the-shelf ORM was that we wouldn't have to write our own (kind of bad and slightly wrong -- those darn transactions) caching implementations, as we had for the custom one-off ORM that existed previously. A lot of our internal meta-model is very static, so clearly caching would be a HUGE win for performance, right?

Not so fast, turbo. Let me continue...

The headline feature for the 2.0 Monitor is "Query Analysis." Coupled with the MySQL Proxy, the Monitor receives captured query data to/from a MySQL server. Once at the monitor, the data can be aggregated, analyzed, and reported upon. What better test for this feature than to use it on ourselves, to tune ourselves!

And this brings me back to Hibernate caching. In the course of monitoring ourselves, I noticed that a certain query was happening WAY more often than my gut said it should. The query in question loaded an object that was generally static -- save for one value that represents how often some data should be collected. It's the only mutable value, and once in place, it rarely changes.

Hrm... how to debug. First, we checked the cache settings. Whoops -- WAY too low for both the cache expiry timeout and the max cache elements. Fix that. Still sucks. Some cursory perusal of the Hibernate source and logs showed that the cache for these objects was being invalidated at a rapid rate. Yes, the entire cache. Even though the objects are essentially static, the query cache takes the safe route: any change to a related table invalidates every cached query result that references that table. This makes sense, from a generic cache standpoint.

But I thought to myself -- surely there has to be a way. *I* am smarter than Hibernate in this case, and *I* can more rightly determine when the query results should be invalidated. Lucky for me, Hibernate allows you to extend the StandardQueryCache "up to date" policy checks. w00t. I implemented one, overrode the timeout policy for the object(s) in question, and re-ran tests. FAILURE. Turns out I am not smarter than Hibernate.
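To make the table-level invalidation rule concrete, here is a rough toy model of it in plain Java. This is only a sketch of the idea as I understood it from the debugger, not Hibernate's code: the class `ToyQueryCache`, its methods, and the logical clock are all made up for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of a table-timestamp query cache. Hypothetical names, NOT Hibernate's classes. */
class ToyQueryCache {
    /** A cached result, the tables it references, and the logical time it was stored. */
    private static final class Entry {
        final List<Long> ids;
        final Set<String> tables;
        final long cachedAt;

        Entry(List<Long> ids, Set<String> tables, long cachedAt) {
            this.ids = ids;
            this.tables = tables;
            this.cachedAt = cachedAt;
        }
    }

    private final Map<String, Entry> results = new HashMap<>();
    private final Map<String, Long> lastWrite = new HashMap<>(); // table -> time of last write
    private long clock = 0; // logical clock keeps the example deterministic

    /** Record that some row in the table was inserted, updated, or deleted. */
    void tableWritten(String table) {
        lastWrite.put(table, ++clock);
    }

    void put(String query, Set<String> tables, List<Long> ids) {
        results.put(query, new Entry(ids, tables, ++clock));
    }

    /** Returns the cached ids, or null if absent or if any referenced table changed since caching. */
    List<Long> get(String query) {
        Entry entry = results.get(query);
        if (entry == null) {
            return null;
        }
        for (String table : entry.tables) {
            if (lastWrite.getOrDefault(table, 0L) > entry.cachedAt) {
                results.remove(query); // stale: ANY write to the table counts
                return null;
            }
        }
        return entry.ids;
    }

    public static void main(String[] args) {
        ToyQueryCache cache = new ToyQueryCache();
        String query = "from Thing where name = 'a'";
        cache.put(query, Collections.singleton("thing"), Arrays.asList(1L));
        System.out.println(cache.get(query)); // prints [1]
        cache.tableWritten("thing");          // an insert of some unrelated row
        System.out.println(cache.get(query)); // prints null
    }
}
```

The point of the sketch: the cache has no idea which rows a write touched, so a single insert into the table throws away results for rows that never changed.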

However, in the process of implementing the custom query cache policy, I had debugged through some more Hibernate code and noticed that "natural id" queries are treated specially. Some more Google-fu, and I quickly came across "Queries by Natural Identifier" in the Hibernate docs.

Now, the docs just aren't very clear on what optimizations can be made internally when you use a Criteria query with a natural id restriction. But, as I was just in that section of code, I could correlate it. Here's the meaty bit -- if you make a natural id / key lookup, and Hibernate recognizes it as such, it can bypass the table timestamp invalidation and go directly to the second-level cache to fetch the object. Hibernate knows, with an immutable and unique natural key, that a table modification will not affect the composition of the object in question (of course, an object modification would, and the object would have been evicted from the L2 cache).

I cannot overemphasize the utility of this discovery. You see, we were making frequent inserts into the table, but existing objects (rows) were almost never changed. Without the natural key lookup, the inserts invalidated all results in the query cache. That is why I was seeing way more selects for the same objects than I had anticipated.
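Here is a similarly rough sketch of why the natural key lookup survives inserts. Again, this is a toy under my own assumptions, not Hibernate's implementation: `ToyNaturalIdCache`, its maps, and the `selects` counter are hypothetical names. The model keeps a natural key to primary key mapping plus a per-object second-level cache, and only an update to a specific object evicts anything.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a natural-key lookup. Hypothetical names, NOT Hibernate's classes. */
class ToyNaturalIdCache {
    private final Map<String, Long> keyToId = new HashMap<>();     // cached natural key -> primary key
    private final Map<Long, String> secondLevel = new HashMap<>(); // cached primary key -> object state
    private final Map<Long, String> table = new HashMap<>();       // stand-in for the database table
    private final Map<String, Long> uniqueIndex = new HashMap<>(); // stand-in for the unique index
    int selects = 0; // number of simulated SELECT statements

    /** Inserting a row evicts nothing: an immutable, unique key cannot start meaning another row. */
    void insert(long id, String naturalKey, String state) {
        table.put(id, state);
        uniqueIndex.put(naturalKey, id);
    }

    /** Modifying an object evicts only that object from the second-level cache. */
    void update(long id, String state) {
        table.put(id, state);
        secondLevel.remove(id);
    }

    String findByNaturalKey(String key) {
        Long id = keyToId.get(key);
        if (id == null) {
            selects++; // SELECT ... WHERE natural_key = ?
            id = uniqueIndex.get(key);
            if (id == null) {
                return null; // misses are not cached in this model
            }
            keyToId.put(key, id);
            secondLevel.put(id, table.get(id));
        }
        String state = secondLevel.get(id);
        if (state == null) {
            selects++; // SELECT ... WHERE id = ?
            state = table.get(id);
            secondLevel.put(id, state);
        }
        return state;
    }

    public static void main(String[] args) {
        ToyNaturalIdCache cache = new ToyNaturalIdCache();
        cache.insert(1L, "a", "every 60s");
        cache.findByNaturalKey("a");          // first lookup hits the database
        cache.insert(2L, "b", "every 30s");   // frequent inserts of other rows...
        cache.findByNaturalKey("a");          // ...do not cost another select
        System.out.println(cache.selects);    // prints 1
    }
}
```

Compare that with a table-timestamp cache, where the second insert would have forced the lookup for "a" back to the database.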

Some quick assurances that we mapped the natural id correctly, some quick refactoring of the HQL into Criteria queries with natural id restrictions, and whammo, we're good. Let's run the tests and query analysis again... ruh roh. OH COME ON! (not my exact reaction, but I think you can guess what it was really like).

I was confident that the natural id cache lookup optimization was what I really, really, really wanted, so there had to be something else going on. More debuggage ensued. I set a breakpoint near the same area in StandardQueryCache where I had first noticed the query cache optimization. Lo and behold, the Hibernate metadata for saying "I am a natural key lookup" was returning false.

I am not amused. I am confident my Hibernate mapping is correct, because the unique index is present in the schema. Think. Think. Think. Well, I had recently been on an effort to move from the hbm XML mappings to Hibernate Annotations mappings. @NaturalId support was, in fact, the very reason I had recently upgraded the annotations jar. On a hunch, I reverted the persistence mapping back to the XML form. Debug, the metadata returns the correct value... test, and YES, finally -- the queries issued are in line with my expectations and the rows present in the database. I. Have. Won.

Being the good open-source citizen, I made a Hibernate forum post that detailed my findings, including simplified sample code demonstrating the problem. The good folks on the Hibernate forum (after questioning the silliness of my contrived example) were quick to recognize the problem, and I got a Hibernate JIRA issue opened.

The workaround, obviously, is to leave the XML mapping in place until the fix makes it into a Hibernate release. Not too bad of a deal, I guess, considering the overall win I now have in my cache hit ratio.

In conclusion -- if it makes sense for your data model, the natural id query cache optimization can be a huge performance win for your app. If you have immutable or rarely changed objects with a constant natural key lookup, look into the Criteria natural id restriction. And use the XML mapping until the bug is fixed.

P.S. -- there is one other performance note to consider, actually. If a natural id query returns no rows, this NULL result will not be cached. So, if you have more of these than 'object/row found' results, you will still get tons of queries that you don't expect. Either stop using the natural key optimization (if 'not found' is more common), or extend your object/schema to include a 'not supported' field. In our case, the lack of a row meant "not supported", and we already had a "not supported" flag for the case where something was supported but then went away. In those cases where something was frequently "not supported", I simply went ahead and created the object/row and set the flag to mark it as not supported -- thus ensuring the natural key optimization was not subverted.
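The "not found" problem and the placeholder-row workaround can be sketched the same way. This is a toy model with made-up names (`ToySettingsLookup`, `Row`, and the `supported` field are hypothetical), assuming only what is described above: found rows are cached, empty results are not.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model: found rows are cached, "not found" results are not. Hypothetical names. */
class ToySettingsLookup {
    /** A row looked up by natural key; supported == false is the placeholder for "not supported". */
    static final class Row {
        final String naturalKey;
        final boolean supported;

        Row(String naturalKey, boolean supported) {
            this.naturalKey = naturalKey;
            this.supported = supported;
        }
    }

    private final Map<String, Row> table = new HashMap<>(); // stand-in for the database table
    private final Map<String, Row> cache = new HashMap<>(); // stand-in for the cache
    int selects = 0; // number of simulated SELECT statements

    void insert(Row row) {
        table.put(row.naturalKey, row);
    }

    Row find(String key) {
        Row cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        selects++; // every uncached lookup goes to the database
        Row row = table.get(key);
        if (row != null) {
            cache.put(key, row); // only real rows are cached, never a null result
        }
        return row;
    }

    public static void main(String[] args) {
        ToySettingsLookup lookup = new ToySettingsLookup();
        for (int i = 0; i < 3; i++) {
            lookup.find("x");                 // no row: each call costs a select
        }
        System.out.println(lookup.selects);   // prints 3
        lookup.insert(new Row("x", false));   // placeholder row meaning "not supported"
        for (int i = 0; i < 3; i++) {
            lookup.find("x");                 // one select, then cache hits
        }
        System.out.println(lookup.selects);   // prints 4
    }
}
```

Whether the placeholder row is worth it depends on how often the "not found" case actually occurs in your data.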

