Monday, May 4, 2009

Hibernate Wars: The Query Cache Strikes Back

Not so long ago, in a galaxy not very far away... a middle-aged programmer battled to free memory from the dark side of the hibernate query cache.  He was successful -- or so it seemed -- for the query cache's memory problems had risen from the ashes, stronger and more bloated than ever...


What's With All This Garbage?


We hit a case at work (again) where the java server process came to a grinding halt.  It wasn't dead; it just wasn't making much forward progress either.  A quick look at the heap stats showed we were nearly at capacity and that garbage collection was eating all CPU time, thus not allowing any real work to happen.  Looks like it is time to whip out the ol' memory profiler.  This time I went with Eclipse's Memory Analyzer Tool (MAT).  It's pretty slick.  I suggest you try it.

Query Cache Waste Redux


Not terribly unexpectedly, the hibernate caches were the top consumer of heap space.  I have previously explored ways to trim down the memory used by the query cache, but we depend heavily on it and size it pretty big, so I wasn't too surprised to see the caches at the top of the heap report.  Drilling down a little further showed that it was not the contents of the query cache results that were causing the problem.  That was the unexpected part.  Rats -- our friend QueryKey is once again a giant waster of memory.



What is it this time?  In A Dirty Little Secret, I showed that you should use object identifiers in your HQL and as parameters so that full objects are not stored as part of the QueryKey.  This was a major win.  I also showed that by providing a decorator query cache implementation, you can reduce all duplicate stored parameters by replacing them with canonical representations.  Using MAT (the Memory Analyzer Tool), I proved that this was still working as expected.



What I hadn't previously accounted for was the QueryKey.sqlQueryString field.  Since we use a lot of natural id query cache optimizations, we have tens of thousands (and in some cases, well over 100,000) copies of identical queries tucked away in QueryKey as the sqlQueryString (differentiated only by the query parameters).  And since hibernate-generated SQL is not exactly terse, we have a nice formula for memory explosion.

Different Tune, Same Dance


We're already using my decorated query cache, which eliminates duplicates in the parameters.  So I decided to modify it to also swap out the sqlQueryString for a canonical representation.  One caveat is that sqlQueryString is private and final.  Lo and behold, you can modify private final fields with reflection in Java 1.5!  Granted, you could really do some silly stuff, fake out the JVM, and screw up compiler optimizations, but we're only replacing the field with another String that should be functionally equivalent, so hopefully any 'weirdness' is mitigated.  Again, for licensing reasons I won't paste the whole query cache decorator.  Creating the full one (and the factory to instantiate it) is left as an exercise for the reader.



The modified query cache decorator looks like this:

 
private final Map<Object, Object> canonicalObjects = new HashMap<Object, Object>();

public boolean put(QueryKey key, Type[] returnTypes,
        @SuppressWarnings("unchecked") List result, boolean isNaturalKeyLookup,
        SessionImplementor session) throws HibernateException {

    // duplicate natural key shortcut for space and time efficiency
    if (isNaturalKeyLookup && result.isEmpty()) {
        return false;
    }

    canonicalizeKey(key);

    return queryCache.put(key, returnTypes, result, isNaturalKeyLookup,
            session);
}

private void canonicalizeKey(QueryKey key) {
    try {
        synchronized (canonicalObjects) {
            canonicalizeParamValues(key);
            canonicalizeQueryString(key);
        }
    } catch (Exception e) {
        throw Exceptions.toRuntime(e);
    }
}

private void canonicalizeParamValues(QueryKey key)
        throws NoSuchFieldException, IllegalAccessException {
    final Field valuesField;
    valuesField = key.getClass().getDeclaredField("values");
    valuesField.setAccessible(true);
    final Object[] values = (Object[]) valuesField.get(key);
    canonicalizeValues(values);
}

private void canonicalizeQueryString(QueryKey key)
        throws NoSuchFieldException, IllegalAccessException {
    final Field sqlQueryString;
    sqlQueryString = key.getClass().getDeclaredField("sqlQueryString");
    sqlQueryString.setAccessible(true);
    Object sql = sqlQueryString.get(key);
    Object co = ensureCanonicalObject(sql);
    if (co != sql) {
        sqlQueryString.set(key, co);
    }
}

private void canonicalizeValues(Object[] values) {
    for (int i = 0; i < values.length; i++) {
        Object object = values[i];
        Object co = ensureCanonicalObject(object);
        values[i] = co;
    }
}

// assumes canonicalObjects is locked. TODO: consider a concurrent hash
// map and putIfAbsent().
private Object ensureCanonicalObject(Object object) {
    Object co = canonicalObjects.get(object);
    if (co == null) {
        co = object;
        canonicalObjects.put(object, co);
    } else if (co != object) {
        // System.out.println("using pre-existing canonical object "
        // + co);
    }
    return co;
}



As you can see, we simply change the sqlQueryString via reflection just like we do the param values.
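For completeness, wiring the decorator in might look roughly like the sketch below.  Treat it as a sketch only: SmarterQueryCache is just my placeholder name for the decorator above, and the QueryCacheFactory/StandardQueryCache signatures are what I remember from Hibernate 3.2, so verify them against the version you actually run.

import java.util.Properties;

import org.hibernate.HibernateException;
import org.hibernate.cache.QueryCache;
import org.hibernate.cache.QueryCacheFactory;
import org.hibernate.cache.StandardQueryCache;
import org.hibernate.cache.UpdateTimestampsCache;
import org.hibernate.cfg.Settings;

public class SmarterQueryCacheFactory implements QueryCacheFactory {

    public QueryCache getQueryCache(String regionName,
            UpdateTimestampsCache updateTimestampsCache, Settings settings,
            Properties props) throws HibernateException {
        // build the standard query cache, then wrap it in the decorator
        QueryCache standard = new StandardQueryCache(settings, props,
                updateTimestampsCache, regionName);
        return new SmarterQueryCache(standard);
    }
}

Register the factory's fully qualified class name under the query cache factory property (hibernate.cache.query_cache_factory, if memory serves -- the same setting StandardQueryCacheFactory occupies by default) and hibernate will hand out the decorated cache.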

Your Mileage May Vary

How much of a win this is for you depends on your use case.  As I said, since we heavily use the natural id query optimization, we had tons of repeated sql strings.  So, exactly how much memory you 'reclaim' in this fashion totally depends on the variety and uniqueness of the queries you run through the hibernate query cache.

Bonus: L2 Cache Reduction


While we're at it, I noticed a few other things that seemed odd in the L2 cache.  There was more HashMap overhead than there was actual content in the cache itself.  I poked around the source a bit and saw that every entry in the L2 cache was being turned into a property->value map before it was stored in the L2 cache provider (and the inverse conversion occurs on the way out).  This seemed odd to me, as we already have a decomposed CacheEntry object, which is an array of the Serializable[] properties from the persistent entity.  Why create another (less efficient) representation and introduce unnecessary conversions?  After some google-fu, I realized you can bypass this conversion by setting hibernate.cache.use_structured_entries to false in your hibernate configuration.



Any docs I found on hibernate.cache.use_structured_entries merely seemed to suggest that it stores the properties in a more 'human friendly format.'  And who wouldn't want that?  All the examples we built on when first starting with hibernate turned it on, so... so did we.  What they don't mention is what it actually does, and the penalty you pay for it -- which is apparently too much HashMap overhead for what should be pretty simple in-memory storage.
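For reference, turning it off can be as simple as the sketch below -- shown programmatically here, though the equivalent property line in hibernate.cfg.xml or hibernate.properties works just as well:

import org.hibernate.SessionFactory;
import org.hibernate.cfg.Configuration;

public class SessionFactoryBuilder {

    public static SessionFactory build() {
        Configuration cfg = new Configuration().configure();
        // skip the per-entry property map; cache the flat dehydrated state instead
        cfg.setProperty("hibernate.cache.use_structured_entries", "false");
        return cfg.buildSessionFactory();
    }
}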



However, be aware -- this only works for non-clustered, in-memory use of the L2 cache.  Apparently, if you cluster your JVMs and need L2 cache clustering, the property order cannot be guaranteed to match between JVMs.  Thus, you have to use structured entries in the cache so they can be re-hydrated properly by property name.

Right-Size Your Caches


We moved to the 1.6 release of Ehcache, so this may only apply to that version.  But I noticed that whatever you configure as 'max elements' for ehcache is used as the 'min capacity' for the backing ConcurrentHashMap.  Needless to say, if you size your caches for your potential largest usage, but then deploy to a smaller-usage environment, you can end up with some waste in the overhead for the hash tables.  It didn't seem terribly significant in my case, but it did show up on the radar.

Wrap-Up


Even in today's multi-gigabyte server (heck, even desktop...) environments, memory is still a precious resource.  Even if you have lots of it available, wouldn't you rather put that memory to use instead of having it wasted?  Freeing up wasted memory means there is more room for the 'transient' objects that come and go quickly.  There's less pressure to garbage collect and hunt for available heap.  And there's more memory available to do smart things with useful caching.  In short:



  • Use a query cache decorator to reduce parameter and SQL string duplication.

  • If you are in a single JVM using an in-memory cache only, set hibernate.cache.use_structured_entries=false in your hibernate configuration.

  • Right-size your caches to reduce hash table overhead.


Tuesday, April 14, 2009

Arbitrary Replication/Proxy Topologies with MySQL Sandbox

How would you like to use MySQL Sandbox to deploy and test any replication topology imaginable? Do you want to describe load balancer and proxy configurations, too? If you already love sandbox (or even if you don't), and you want to play with arbitrarily complex replication/proxy topologies... read on.

Sandbox Is Your Friend


MySQL Sandbox is a great tool. As of this writing, version 3 is feature complete -- and the new features are greatly desired. Replication support has been there a while for master/slave setups with as many slaves as you want, and for replication rings of any size. Version 3 adds multi-level replication trees on top of a multi-sandbox using sbtool.

But what if the provided replication setups do not quite match what you need? Granted, I'll argue they are sufficient for 99% of setups. But there's always that 1%... Then throw MySQL Proxy in the mix and things really start to get interesting. Sandbox (at the moment) supports only very limited proxy configuration.

It's Just a Directed Graph


Giuseppe and I had some conversations a while back on ways to describe replication topologies. In its purest form, a replication topology is a directed graph with some limitations imposed by MySQL replication technology itself (a slave cannot have two masters). One can either think of it as Master->Slave (conceptual data flow direction) or Slave->Master (connection initiation direction). I choose to think of it as Master->Slave, because for my wee bitty brain it helps to think in terms of the event flow.

After a modicum of research, I realized the Graphviz DOT language seems very well suited to describing graph/network topologies. Sure, the intention of dot is mostly for visualization. But the language itself is generic enough to describe graphs, nodes, edges, and any properties you want on any of those components. Perfect.

Enter Sandbox

So, we have a generic graph description language. We have a tool that can deploy nodes and configure them. Sandbox is written in Perl. There is a Graph::Reader::Dot on CPAN. We. Are. Golden. In short order, I've got the latest MySQL::Sandbox and Graph::Reader::Dot installed into my perl environment. It doesn't take long to start reading in dot files and deploying mysqls with sandbox. The rest is just SMOP™ -- a Simple Matter of Programming™.

Enough with the History Lesson

The perl code is here: make_sandbox_topology. Get the Perl prerequisites from CPAN: MySQL::Sandbox and Graph::Reader::Dot. Then, in typical sandbox fashion, you invoke it as

./make_sandbox_topology { VERSION | tarball } topology.dot

Some example dot topologies follow now...

Master/Slave
digraph MasterSlave
{
    Master->Slave;
}

Master, Slaves

digraph MasterSlaves
{
    Master->Slave1;
    Master->Slave2;
}
... and repeat with as many slaves as you want.

Tree
digraph Tree
{
    A->B;
    A->C;
    B->D;
    B->E;
    C->F;
    C->G;
    D->H;
    D->I;
    E->J;
    E->K;
    F->L;
    F->M;
    G->N;
    G->O;
}

Master/Master

digraph MasterMaster
{
    A->B->A;
}

Ring

digraph Ring
{
    A->B->C->D->A;
}

Ring With Slaves

digraph Ethanol
{
    C1->C2->C1;
    C1->H11;
    C1->H12;
    C1->H13;
    C2->H21;
    C2->H22;
    C2->O->H;
}

digraph Benzene
{
    C1->C2->C3->C4->C5->C6->C1;
    C1->H1;
    C2->H2;
    C3->H3;
    C4->H4;
    C5->H5;
    C6->H6;
}

Ok. Enough silliness. I think you get the point.

Let There Be Proxy

Proxies are just another node in the topology. But we have to identify them to sandbox as a different type, so they are not deployed as mysql instances. In dot notation, this becomes Proxy [class=proxy]. How about a simple load-balanced setup with two slaves?
digraph LoadBalanced
{
    Master->SlaveA;
    Master->SlaveB;

    LoadBalancer [class=proxy];
    LoadBalancer->SlaveA;
    LoadBalancer->SlaveB;
}

The LoadBalancer, when deployed and started, is now configured for the two slave backends. Notice, however, that there is no lua goodness for the proxy (yet)...

Let's do one more with a combination of load balancing and pure proxy.

digraph ProxyExample
{
    Master->SlaveA;
    Master->SlaveB;
    Master->StandBy;

    LoadBalancer [class=proxy];
    LoadBalancer->SlaveA;
    LoadBalancer->SlaveB;

    ProxyM [class=proxy];
    ProxyM->Master;

    ProxyA [class=proxy];
    ProxyB [class=proxy];

    ProxyA->SlaveA;
    ProxyB->SlaveB;
}
In this case, SlaveA and SlaveB are load balanced, and every other mysql node has a proxy in front of him (for who knows what -- query analysis, perhaps?) except for the standby slave.

Node Configuration

The previous section introduced the first concept of a node's configuration. A node's class defaults to mysql, but can be specified as proxy. All other attribute=value pairs are passed as configuration to the deployed type.

Topology Level
digraph Simple
{
    baseport=22000;
    Master->Slave;
}

Currently, the only supported 'topology'-level attribute is baseport. It, as you might guess, defines the starting port for port assignment via sandbox.

MySQL Class

Any argument that is valid for a mysqld command line or my.cnf configuration file is fair game for a mysql class node configuration. Want to hard code the port? Want to make a slave read-only? No problem.

Slave [port=9999, read_only=1]

Now the slave will be deployed via sandbox with the given port 9999 and read_only set in its my.sandbox.cnf.

Proxy Class

Any argument that is valid for mysql-proxy command line or ini file setting may be set as node configuration. The class=proxy attribute is not forwarded, as this is a recognized 'internal' node config option.

Proxy [class=proxy, "log-level"=debug, "proxy.proxy-lua-script"="share/mysql-proxy/lb.lua"]

Notice that the quotes around some of the keys and values are due to dot grammar limitations.

Default Node Attributes

Obviously, a lot of configuration could end up being repeated. And that sort of defeats the purpose of a lot of the automation that sandbox provides. Dot provides a default node attribute concept that we can build on.

node ["proxy.log-level"=message, "proxy.proxy-lua-script" = "script.lua",
"mysql.read_only"=1, "mysql.default-character-set"=utf8 ]

"node" is literally n o d e. That is the dot syntax for "every node should contain these attributes." Every proxy will have log-level as message and the script.lua lua script, unless the node definition itself overrides that same attribute. Likewise for mysql nodes. Every one would be read only and utf8, unless the node itself is explicit about those attributes.

Since the options are really passed to ALL nodes, I had to scope the attributes to the classes they really belong to. To me, it makes things slightly more readable at a 'default attribute' level anyway, but one could envision being smarter about which attributes belong to which classes and doing some automagic scoping for you. The only problem then would be collisions -- which could be resolved with explicit scoping in those cases.

Future

I've already joined sandbox-developers on launchpad. The next step is to work with the team, get it onto the blueprints, and then into a sandbox release.

Once on launchpad, we'll start soliciting user ideas for features.

Issues and Limitations
  • Dot language. Attribute key and value grammar is pretty limited, so remember to quote them. I don't catch parse errors and die (yet), so if you get funky errors later remember to check your dot syntax.
  • Only valid replication topologies are allowed. If you try to configure two masters for one slave, we'll catch it. Wait, this is actually a feature, not a limitation...
  • Master/slave binlog events are not proxied, so putting a proxy between a master and a slave is caught and disallowed. Maybe proxy will proxy binlogs... soon?
  • Currently it does not honor all sandbox options (like user and password, or port conflict detection and resolution).
  • the Blogger editor REFUSES to maintain proper whitespace in my dot examples. Ideally, I like to use 4-space indent. But it keeps removing them for me, so I'm tired of adding them back in. sorry.
Please remember, this is just a rough cut at the concept. The architecture and design could use some serious refactoring. But it's out there to use and abuse -- and we'll see about getting it into MySQL::Sandbox proper.

If desired, I can follow up in further blogs with actual usage examples and walk-throughs. Perhaps the best place for documentation, though, is on MySQL Forge.

Monday, November 10, 2008

Hibernate Query Cache: A Dirty Little Secret

You Mean, Memory Is Not Infinite?

We're working hard getting MySQL Enterprise Monitor 2.0, featuring Query Analyzer, ready for release. As part of that, we started really ramping up the number of MySQL servers reporting in query data to see how we could scale. Not surprisingly (to me, anyway), the first efforts did not go so well. My old friend OutOfMemoryError reared its ugly head once again.

Query Cache -- It's More Than Just Results!

We're big (ab)users of hibernate query caching and, more importantly to us, of the natural id optimized query cache. Firing up the profiler, I was not shocked to see that the second level (let's call it L2) and query caches were hogging the majority of memory. But something didn't smell right...

What I was seeing was tons of object/entity references for our persistent objects. However, the hibernate cache does not really work that way. Hibernate 'dehydrates' query results and persistent objects into their primitive components and identifiers. It stores this decomposed data in the L2 and query results caches, and on a cache hit, it rehydrates/recomposes them into the requested persistent object.

But the profiler does not lie. One of our objects, for which there are conceptually 60-some-odd instances, had over 20,000 referenced instances on the heap. Yikes. Obviously we're doing something wrong. Or are we...?

Ok, Mr. Profiler, who is holding the references on the heap? Drill down a bit through the query cache, past some HashMap entries, into the... keys... keys, you say? Hrm, not values. Interesting. Well, looky here. Hibernate's QueryKey is holding on to an Object[] values array, which holds the majority of our persistent object references. Ouch. In addition to that, it has a map inside of it whose values also contain a non-trivial number of references to our entities.

Well, nuts. Code spelunking ensues. QueryKey is just as it sounds -- an object that can act as a unique key for query results. This means it includes stuff like the SQL query text itself as well as any parameters (positional or named) that specifically identify a unique result set.

Objects, Objects, Everywhere

Now, silly me -- since we are using an object-relational mapping, I was using objects for the parameters in my HQL. Something along the lines of:
final Cat mate = ...;
final String hql = "from Cat as cat where cat.mate = ?";
final Query q = session.createQuery(hql);
q.setParameter(0, mate);
q.setCacheable(true);
In this case, the entire Cat mate (and everything he references) would be held in perpetuity -- well, until either the query cache exceeds its configured limits and the entry is evicted, or the table is modified and the results become dirty.

Let's not forget our friends the Criteria queries, either, because it is only through criteria that we can get our friend the natural id cache optimization. (And please pardon the contrivedness of the cat natural id example.)
final String foo = "something";
final Cat mate = ...;

final Criteria crit;
final NaturalIdentifier natId;

crit = session.createCriteria(Cat.class);
natId = Restrictions.naturalId();
natId.set("mate", mate);
natId.set("foo", foo);
crit.add(natId);
crit.setCacheable(true);
In the same fashion as the HQL, this will result in hard references to 'mate' and 'foo' held for the cache-life of the query results.

How To Make a Bad Problem Worse

Even worse, in our case, was the fact that we would do the equivalent of loading the same 'mate' over and over again (maybe this cat is severely non-monogamous). And whether loaded from the L2 cache or directly from the database, the mate Cat now existed as multiple instances, even though they are hashCode()/equals() equivalent. But QueryKey in the query cache doesn't know that. He only knows what he is handed. And he is handed equivalent duplicates over and over and over again, and only lets go of them on cache eviction. So, not only do we end up with essentially unnecessary references to objects held onto by the query keys in the cache, we also instantiate multiple copies of the same object and hold on to those, too. Bear with me as I bow my head in shame...

Fix Attempt 1: Make a Smarter Cache

I've been down this road before. I tried to be smarter than Hibernate once before. It did not end well. Undaunted by prior defeat, I resolved to attempt being smarter than Hibernate once again!

Hibernate's query cache implementation is pluggable. So I'm going to write my own. Ok, I'm not going to write my own -- from scratch. My going theory is that I can at least eliminate the duplication of the equivalent objects referenced in memory. I'm going to decorate hibernate's StandardQueryCache and do the following: For each QueryKey coming in a cache put(), introspect the Object[] values (which are positional parameters to the query). For each object in values[], see if an equivalent canonical object has already been seen (same hashCode/equals()). If so, use the canonical object. Else, initialize the canonical store with this newly seen object.

Notice we only have to do this on put(). A get() can use whatever objects already come in, as they are assumed to be hashCode/equals equivalent. Hell, it HAS to work that way, otherwise QueryKey would just be broken from the start. Here are some snippets of relevant code that implement org.hibernate.cache.QueryCache.
public boolean put(QueryKey key, Type[] returnTypes,
        @SuppressWarnings("unchecked") List result, boolean isNaturalKeyLookup,
        SessionImplementor session) throws HibernateException {

    // duplicate natural key shortcut for space and time efficiency
    if (isNaturalKeyLookup && result.isEmpty()) {
        return false;
    }

    canonicalizeValuesInKey(key);

    return queryCache.put(key, returnTypes, result, isNaturalKeyLookup,
            session);
}

private void canonicalizeValuesInKey(QueryKey key) {
    try {
        final Field valuesField;
        valuesField = key.getClass().getDeclaredField("values");
        valuesField.setAccessible(true);
        final Object[] values = (Object[]) valuesField.get(key);
        canonicalizeValues(values);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}

private void canonicalizeValues(Object[] values) {
    synchronized (canonicalObjects) {
        for (int i = 0; i < values.length; i++) {
            Object object = values[i];
            Object co = canonicalObjects.get(object);
            if (co == null) {
                co = object;
                canonicalObjects.put(object, co);
            } else if (co != object) {
                // System.out.println("using pre-existing canonical object "
                // + co);
            }
            values[i] = co;
        }
    }
}
It's pretty much what I described. I didn't even attempt to get permission to post the whole thing, because it is probably not worth my time. The only things missing are a HashMap of the canonical objects and the instantiation of the StandardQueryCache queryCache. You'll also need to implement org.hibernate.cache.QueryCacheFactory to create this smarter query cache, and then plug that factory into your hibernate config.
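To round that out, the missing bits would look something like this (names are mine, and the delegation is trimmed -- every other QueryCache method simply forwards to the wrapped cache):

// the canonical object store, plus the wrapped StandardQueryCache that this
// decorator forwards to (handed in by the QueryCacheFactory)
private final Map<Object, Object> canonicalObjects = new HashMap<Object, Object>();
private final QueryCache queryCache;

public SmarterQueryCache(QueryCache queryCache) {
    this.queryCache = queryCache;
}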

This did work as expected. My 'outstanding' objects on the heap were greatly reduced. Unfortunately, it was not good enough. I still had thousands of these guys on the heap, essentially unused except to at some point fetch their numeric id to be used by hibernate's generated SQL. And this didn't take care of the named parameters, which are stored in a map of string names to some other internal hibernate class, which I no longer felt like introspecting via reflection. So anything using named parameters was still potentially duplicated.

Fix Attempt 2: Objects? Who Needs Objects?

Hrm, the last paragraph stirred a thought -- hibernate only needs the ids from these objects. While the academic in me enjoyed the exercise of decorating the query cache to be smarter about duplicate references, the idiot in me said "well, duh. if you only need the id, why not just use the id?" Because we're supposed to use objects! Oh well. It occurred to me that I could rewrite any and all HQL to reference object ids instead of the object property reference itself. It should end up as the same SQL eventually sent to the database. Seems like a cheesy code monkey workaround, but the theory is that hibernate's QueryKey will only be holding onto Integer and Long references instead of entire object entities.

So, I hunt down all of our cacheable queries. I change them all to object id queries. The previous Cat business can now look like this:
final Cat mate = <...>;
final String hql = "from Cat as cat where cat.mate.id = ?";
final Query q = session.createQuery(hql);
q.setParameter(0, mate.getId());
q.setCacheable(true);
cat.mate becomes cat.mate.id. The mapped parameter becomes mate.getId(). It could be a named parameter just as well. I didn't find a single HQL in our application that I could not convert this way. Good.
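A named-parameter version of the same id-based query might look roughly like this (the :mateId name is just an arbitrary choice for illustration):

final Cat mate = ...;
final String hql = "from Cat as cat where cat.mate.id = :mateId";
final Query q = session.createQuery(hql);
q.setParameter("mateId", mate.getId());
q.setCacheable(true);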

But what about our friend Criteria? He requires objects for the properties, right? At first, I thought this was true. And I was consoled by the fact that my smart query cache would do its best to keep duplication at a minimum. In fact, it was the next day (after resting my brain) that another "I'm an idiot" moment came to light. The Criteria API still just takes strings for property references. Perhaps it follows the same property resolution rules as HQL? In other words, instead of "mate", can I say "mate.id"? And the answer is YES, yes I can! Woo hoo! Absolutely no more object references for me! Here is what the criteria would look like:
final String foo = "something";
final Cat mate = ...;

final Criteria crit;
final NaturalIdentifier natId;

crit = session.createCriteria(Cat.class);
natId = Restrictions.naturalId();
natId.set("mate.id", mate.getId());
natId.set("foo", foo);
crit.add(natId);
crit.setCacheable(true);
Subtle, yes. But trust me, it makes a huge difference. Interestingly, Fix Attempt 2 likely alleviates the need for Fix Attempt 1. At worst, we end up with lots and lots of Integer/Long object references and even duplicates of them. The profiler says it is not very much in my limited testing. But I decided to leave the smarter cache in, because it appears to be working, and it does reduce memory.

Lessons Learned

If you use hibernate query caching, and actually want to use memory for caching useful results, and waste as little as possible with overhead, follow some simple advice:
  • Write your HQL queries to use identifiers in any substitutable parameters: WHERE clauses, IN lists, etc. Using full objects results in the objects being kept on the heap for the life of the cache entry.
  • Write your Criteria restrictions to use identifiers as well.
  • Use the 'smart' query cache implementation skeleton in this article to eliminate duplicate objects used in query keys. This helps a little if you use the ids in your HQL/Criteria, but if you still must use objects then it helps a lot.
  • If you use a smart query cache implementation, prefer '?' positional parameters over named parameters. I know, I know. I love named parameters too. But the smart query cache implementation in this article only detects duplicates for positional parameters (or params in criteria). Alternatively, feel free to extend the example to locate duplicates in the named parameter map and replace them with canonical ones as well.

I'm Not the Only One Who Noticed...

In doing some final research for this post, I came across hibernate core issue HHH-3383. Let's keep our eye on it to see if the wonderful hibernate devs can fix this down at their level, so we don't have to change our code. Also, the issue lists that Criteria cannot be fixed with the same 'use id' workaround. Since I was able to, I wonder if the bug submitter did not realize you can reference dotted property paths in criteria restrictions exactly as you can in HQL. Perhaps I shall start a dialog with him.

Friday, August 29, 2008

MySQL Query Analyzer: A Peek Under the Hood

I had a hard time coming up with a title for this blog post. I really wanted to capture its essence. Several discarded titles include, but are not limited to:
  • How It Works
  • The Gory Details
  • Everything You Never Wanted to Know
  • The Insider's Perspective
I picked the lamest one, hoping that it is catchy enough to ensnare the unfortunate soul that comes across my blog. Just in case you don't know by now, I work for MySQL (ok, Sun) on the Enterprise Tools team. We're approaching (some sort of) release of MySQL Enterprise Monitor 2.0, with the headline feature Query Analysis.

What I want to achieve with this article is to give definitions, data, flow, etc., that describe how we look at things internally in the architecture of the application. This is intended to give the hapless reader an idea of what Query Analysis can do, and consequently what it cannot.

What is a query?


I actually call them statements inside the application. To me, query seemed to imply asking about something. And we track more than that. Is an INSERT or UPDATE a query? Is a CREATE TABLE a query? Not in my feeble mind. So, I chose statement because it is generic enough to encompass them all, yet seems still to have ties to database vernacular. Too bad "statement analysis" doesn't quite have that sexy marketing appeal.

A unique statement/query is an application's query that is normalized and executes against a given database. By normalized, I mean all literals are replaced with '?', (some) comments are stripped, the SQL keywords themselves are upper-cased, and extraneous whitespace is collapsed. So, for example, ... WHERE tag = 'foo' and ... WHERE tag = 'bar' would both normalize down to ... WHERE tag = ?.

There is a subtle phrase in there that is very important -- "a unique statement ... executes against a given database." Basically, we identify a statement key as {database, normalized text}. That means if you have statement SELECT 1 and execute it in the context of database 'test', switch databases to webapp_foo, and execute it again, in the mind of MySQL Query Analyzer it is a different statement.

This works well for things like replication scale-out, where a slave farm will have bunches of queries executing against the same database(s). It breaks down if, somehow, your application uses multiple databases but you really want the queries to be treated the same for aggregation. We haven't found this to be the typical use case (yet), but if we get feedback we can revisit it in future releases.

A statement that executes against a particular MySQL server instance I call a 'statement summary,' because it is at this level that we actually track and summarize execution statistics. More on this later...

Where does the data come from?

Enter the MySQL Proxy, with some enterprise-only plugins and lua scripts. The proxy sits between your application and your MySQL database server. All queries passing through it are captured, normalized, and stored until they are reported back to the Monitor server. The infamous agent is now a plugin that shares a process inside the chassis, alongside the proxy. Proxy and agent communicate, in this case via lua, in order to push query data to the agent plugin so the agent can send it up to the monitor in the same fashion as any other data it collects.

There are potentially other sources of data... but that's a story for a different day.

What data is tracked?


For a given statement, we track:
  • number of executions
  • execution time
  • min/max execution time(s)
  • rows returned/affected
  • min/max row(s)
  • result set size
  • min/max result set size
The normalized statement is tracked by the proxy. Each statement that fits in that normalized bucket is added to the ongoing statistics. The Monitor backend periodically asks for a snapshot (currently every minute), at which point the proxy resets its statistical aggregation (think "read-clear" semantics).

Also, some extra metadata is kept with the query, for housekeeping reasons:
  • database (which can actually be null - not every query executes in a database context)
  • normalized query text
  • a hash of the text
  • query type (select, insert, update, etc.)
What does the agent/proxy actually send to the Monitor?

Actually? You don't want to know. When you get your hands on it, you can turn up the agent log level to debug and see for yourself. It's not like it's a secret. It's just not worth mentioning here.

Conceptually? Now there's a better topic... Although we don't take full advantage of it in all agent communications, we've moved to a more RESTful approach to agent interaction. Internally, we take the legacy xml agent protocol and turn that into sexy 'representations' that we put to internal REST APIs. An example JSON representation for the query data looks something like this:
PUT /instance/mysql/statementsummary/f8ccc1d2-f63b-4c30-9d42-1d383e626510.em2.02a6ed17a7627d1777f33a6062826951
{
    "name": "f8ccc1d2-f63b-4c30-9d42-1d383e626510.em2.02a6ed17a7627d1777f33a6062826951",
    "values": {
        "bytes": 325,
        "count": 8,
        "database": "em2",
        "exec_time": 3174,
        "max_bytes": 45,
        "max_exec_time": 613,
        "max_rows": 1,
        "min_bytes": 38,
        "min_exec_time": 394,
        "min_rows": 1,
        "query_type": "SELECT",
        "rows": 8,
        "text": "SELECT hibtag0_ . tag_id AS tag1_18_ , hibtag0_ . tag AS tag18_ FROM tags hibtag0_ WHERE hibtag0_ . tag = ?",
        "text_hash": "02a6ed17a7627d1777f33a6062826951"
    }
}
This is our conceptual representation of a statement summary (remember, a statement summary is a summarized statement execution on a single mysql instance). Statements themselves do not actually exist on the proxy/agent side or in the communications, since the agents are always tracking a single mysql instance. The statement itself is purely a Monitor-server-side concept that we track for rollup purposes. The UUID you see in the RESTy URI is the mysql.inventory unique identifier, something we had to introduce for 2.0 to identify, without any uncertainty, a specific mysql instance we are monitoring. Then, it has the database name and the text hash. The uuid.db.text_hash triple defines the key for this statement summary.

What does the Monitor do with the statement summaries?


From the statement summary, we pull out the pieces which define a statement and which happen to be immutable: the database, query text, text hash, and query type. After storing that, we create a statement summary, which basically just ties the statement to a server instance. We also store the creation time of this object/row, so we can track the first time this query was ever seen on a server (or ever seen, period, when you take the min and group by the statement). Finally, we take the ongoing snapshot data and store that.

Tell me how the aggregation works.

Aggregation works on two levels. Because we split out immutable statement data, and statement summary (per server) data, we can then aggregate on each of them.

Let's revisit what these snapshots mean. Each statement summary data instance is a snapshot of the summarized statistics for the normalized query over the time period since the last time it was reported. This gives us the powerful ability to then aggregate these snapshots over an arbitrary time range chosen by the end user.

Consider the single server aggregation case. If count is the number of executions in a given minute, and we store this count snapshot over many minutes, then the total count for an arbitrary time range is merely the sum of the counts (to the minimum granularity of our collection frequency, which is currently one minute). Rows and result set size aggregation work exactly the same way.

The mins and maxes aren't magical either. The mins/maxes per snapshot are for that one-minute range. If you want the mins/maxes over the larger requested range, just take the min of the mins and the max of the maxes.

There is one "trick" on the monitor-side to show averages over the arbitrary range. We show average execution time, average rows returned/affected, and average result set size. What does an 'average' mean in this case? We want an average per query execution (not per unit time). Average execution time would be the total execution time divided by the total executions. Well, we aggregate execution time, and we aggregate count... so we just take those columns in the query we build, and write an expression column of time/count. There's your average. Same for rows and bytes -- just take the aggregated totals and divide by the count.
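To make the arithmetic concrete, here is a rough sketch of that rollup in Java -- the field names are invented for illustration, and in the product the same math is simply expressed as SQL aggregate and expression columns:

import java.util.List;

public class SnapshotRollup {

    // one per-minute snapshot for a statement summary
    static class Snapshot {
        long count;          // executions in the interval
        long execTimeMs;     // total execution time in the interval
        long minExecTimeMs;
        long maxExecTimeMs;
    }

    static void rollUp(List<Snapshot> snapshots) {
        long totalCount = 0, totalTimeMs = 0;
        long min = Long.MAX_VALUE, max = Long.MIN_VALUE;
        for (Snapshot s : snapshots) {
            totalCount += s.count;                  // sum of counts
            totalTimeMs += s.execTimeMs;            // sum of execution times
            min = Math.min(min, s.minExecTimeMs);   // min of the mins
            max = Math.max(max, s.maxExecTimeMs);   // max of the maxes
        }
        // average per execution over the whole range = total time / total count
        double avgMs = (totalCount == 0) ? 0 : (double) totalTimeMs / totalCount;
        System.out.printf("count=%d min=%d max=%d avg=%.2f%n",
                totalCount, min, max, avgMs);
    }
}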

Now, what about summaries across multiple servers? Since we have the immutable statement data broken out, it's just another aggregation on top of the per-server one, grouped by the statement identity. And the same sum, min, max, and average tricks still apply.

Can I see statistics per user?


Not at this time. User stats might be pretty easy to tack on in the next release. It would just become another element of the {database, query} key. Then we could still selectively aggregate at the user level, the server level, or the statement level, or any combination.

Sounds neat! When can I see it?

Currently, query analysis with Monitor 2.0 is in limited beta to select enterprise customers. Email beta-enterprise@mysql.com for more information. There is also a special MySQL Query Analysis forum. It has some good links, and as people get their hands on it, I bet it becomes the central place to go for all things Query Analysis.

What does Query Analysis in MySQL Enterprise Monitor 2.0 look like?

Check out our product manager's blog. He's better at doing touchy feely marketing flap like screen shots. ;)

Conclusion


Here, I covered all the gory details of what data we collect, where we collect it from, and how we conceptually view and aggregate it on our backend. I'd like to touch on how the UI interacts with the data -- specifically group views, server views, sorting, searching, and aliasing -- but I think this article has hit capacity. Perhaps a blog for another day...

Friday, July 25, 2008

Code Formatting Manifesto

First, let me say that there is only "one true format" for any given programming language. And, of course, that is my format. Then again, the one true format for you -- is your preferred format.

How do we solve this problem today? For any personal projects, one uses his/her one true format, as they are the masters of their own destiny. For shared projects, job related or otherwise, we often rely on code formatting standards.

Now, I'm not anti-coding standard by any means. I'm also not strict on what I feel is the one true format in any given situation. We have human minds that are adaptable. We are trained in the syntax and layout of code, and what it means. If a brace starts on this line or the next, I can grok it. However, there is something to be said for consistency within a project. I believe it boosts productivity, and that's why I'm not anti-coding standard(s).

But why does this have to be so? The meaning of the code itself doesn't change depending on where you put the brace (as long as the semantic structure doesn't change). The compiler doesn't care.* Why should we?

I think it all stems from the fact that the format of code is inherently tied to how it is stored -- as unstructured text documents. Can we do better than that?

How many years now has the visual representation of some document been isolated from its storage format? I'm too young to remember the beginnings of TeX and other typesetting languages. Throw in WYSIWYG word processing next. The office word processor that shall remain nameless used a binary storage format for a long time, only recently switching to some XML internal structure (perhaps compressed, but still XML). And OpenOffice.org definitely uses structured text (xml, compressed on disk), to describe documents. In the case of (most) WYSIWYG word processing, though, you are describing layout instead of structure, but there are good examples where that is not the case (I leave that as an exercise to the reader).

I give you the quintessential example -- HTML. HTML describes documents, usually intended for human viewing, in a structured fashion. How they are *visually* represented is completely up to the renderer! Hell, it doesn't even have to be visual. Sight-impaired folk can have their HTML documents read to them.

An HTML page rendered on my phone has the same content as one rendered for Firefox, but they look completely different. Even HTML on a given website can be rendered differently, if the designers were so forward thinking as to make the page skinnable via dynamic CSS changes. Again, the content is the same. Only the presentation changes.

Now, I come full circle. Why can't this *very same* (old) idea be applied to code? Let's remove the storage format from the presentation/editing format. I argue that we should be able to store, for example, Java code with all unnecessary whitespace removed. Load it up in your fancy new rendering editor, and your rendering/formatting preferences are applied for the visualization and editing of the file. When you save it, the editor does the opposite -- remove any formatting-only specific text, and save it in "canonical" form.

Syntax highlighting is arguably the (minimalist) first step. The colors on your keywords are not stored in the file. They are added by the editor/IDE as part of visualizing the code. Some editors are simple text-matching stupid, but others grok the structure and "do the right thing."

Next, there are plenty of "pretty printer" reformatting tools out there. Eclipse does it. And I believe there are other tools for other languages that do it. People use them to enforce coding standards as an "on commit" step into the source code repository. Code checked in is automatically run thru a formatter and is committed in canonical form.

Well, I say screw all that jumping thru hoops. Let's make this the editor's job. If we can already syntax highlight, and thusly grok real code structure, and already auto-format to configurable specifications, then let's take it to the next step. Let the editor do it on every load and save.

The one argument I see against this is a potential for losing nice diff-ability. If we store in 'canonical' format (perhaps compressed to minimal whitespace) and I want to diff revisions, it becomes slightly more difficult. The diff tool would need to understand the rendering process as well, so you might likely have to use a diff feature built into the editor. Otherwise, your diff tool of choice would need the same renderability from the canonical format to your rendered view of choice. Again, I'll use eclipse as my argument -- it provides more of a structural diff view anyway (not just +++ --- text added/removed stuff). Since it already understands the code structure of new vs. old (regardless of view and storage format), it shouldn't have any problems if the stored format is not the same as the viewed format. The idea actually plays BETTER with this kind of diff, because you see actual structural changes and not just text format changes. Line-ending changes and JoeBob Teamate's goofy reformatting no longer show up as diffs, and potentially don't even need to be saved, because no *content* has actually changed. This is a good thing.

Anyway, I'm curious what the hapless reader of this blog thinks. I tried some cursory googling, but nothing matching my idea comes up in terms of actual programming. There are plenty of pretty printers and web code-sample displayers, etc. These all have the same end goal as my idea, but none take it back to the actual editing step. Do you know of such a tool with the features I desire?

If I ever get magical "free time" I might play with some eclipse code to see if my idea would work. The pieces all appear to be there... just gotta knit them together. Yay, open source!


* This argument only works for stream oriented languages. I'll ignore python for the moment, but any language where whitespace/indenting is meaningful doesn't deserve acknowledgment anyway.

MySQL and the Missing Rows

I was doing some multi-threaded, multi-transactional testing for the backend of the MySQL Enterprise Monitor. I came across a weird failure where it appeared I was able to successfully insert a row, and then (in the same transaction) a select from the same table did not return any rows.

Consider the following transactions:


mysql> create table t1 (id integer primary key auto_increment, name varchar(32) unique) engine=innodb;

a> begin;

b> begin;
b> select * from t1;

a> insert into t1 values(null, 'oldag');

b> insert into t1 values(null, 'oldag') on duplicate key update id=LAST_INSERT_ID(id);
(b blocks)

a> commit;

(b is released)
...
Query OK, 0 rows affected (0.00 sec)
b> select * from t1;
Empty set (0.00 sec)


what, what, what?!?!

So, the gist is that the insert appears to succeed (notice the ok, no warnings or errors), but it doesn't affect any rows. Surely, this cannot be correct. So, let's perform a thought experiment.

B begins a transaction and selects from t1. He locks in his transaction read view (repeatable read isolation level) at whatever it was right then. A then makes a modification in another transaction. B then does the insert, but it is *not* actually an insert. A normal insert would have failed with a duplicate key constraint failure. But the lovely "on duplicate key update" feature turns that into an update request on the existing row with the constraint. But wait -- notice the update is designed to essentially be a no-op. The LAST_INSERT_ID trick is used so that hibernate gets the correct id for the now-not-inserted row (call it a poor man's lazy create). Additionally, MySQL has the optimization that, if an update really wouldn't change anything, then by god -- don't change anything! Hence, the "0 rows affected."

Where does that leave us? We have a frozen transaction read view. We have an apparently successful insert (data manipulation) that should be visible in the same transaction. Due to internal trickery, no data manipulation actually occurred. Thusly, the 'B' transaction still has the same frozen read view as before the insert, and no rows appear in the select.

So, you can argue with me all day about the technical internals of what the database engine is doing, and why it is correct, and why it would be hard to change. But I'll argue back that the client experience is unexpected and therefore wrong. An insert succeeds with no errors (or warnings), yet it is not visible immediately after that successful return. I don't care that there were zero rows affected. That is accurate. There WERE zero rows affected. So what. Insert. Select. Show me my rows!

So, what actually may make it impossible... Consider if transaction A had modified/inserted a lot more rows, but B only inserts a subset of those. How do you make only those rows visible (the ones B inserted/updated) but not the ones that B didn't 'touch?' Granted, it's a hard problem, and in pure transactional database land, it might just be impossible. And "on duplicate key update" is a mysql-ism that throws a kink in the whole works.

Ok, now that I've told you it is impossible, I will tell you how to make it work. ;)

There is a workaround. Remember how I said that the update is a no-op if nothing is actually updated? Well, what if we actually FORCE some kind of update to happen? In this case, I added a 'dummy' insert_count column, starting at zero. I then changed the statement to "on duplicate key update id=LAST_INSERT_ID(id), insert_count = insert_count + 1". This forces an update to occur. The data manipulation is recognized, and the row becomes visible in the transaction. It occurred to me soon after that I should probably be doing hibernate 'versioning' on the objects with optimistic locking anyway, and maybe that would play nicely with this.

Wednesday, July 9, 2008

Oh Noes -- My Mac is infected!

... with a Windows virus, or with Windows?

While running skype on my macbook pro, I got this lovely little instant message...

WINDOWS REQUIRES IMMEDIATE ATTENTION
=============================

ATTENTION ! Security Center has detected
malware on your computer !

Affected Software:

Microsoft Windows Vista
Microsoft Windows XP
Microsoft Windows 2000
Microsoft Windows NT Server 4.0
Microsoft Windows Win98
Microsoft Windows Server 2003

Impact of Vulnerability: Remote Code Execution / Virus Infection /
Unexpected shutdowns

Recommendation: Users running vulnerable version should install a repair
utility immediately

Your system IS affected, download the patch from the address below !
Failure to do so may result in severe computer malfunction.

[url removed]
I find that just absolutely hilarious. What a grand scheme -- spam random skype addresses with some windows virus message. For what it is worth, I also get the same kind of random Windows-specific virus warnings when visiting certain websites.

Good thing I'm a smug mac bastard and don't have to worry about such things (for now).