Jay Walters

QBE Cache

Headlines

A QBE Cache for JavaBeans

A Brief Critique of the J2EE model for database connections...

Where web services are going...

What I think about web services...

How Important is Database Portability...

QBE Cache

I've been playing with a simple Javabean caching solution. This was driven by some weird office politics and turned a bit into a challenge for me to see what it takes to do this. First I looked for some type of open source solution, but there is no open source cache with secondary indices. I didn't turn anything up simpler with secondary indices than an in-memory database which just seemed too heavy for me.

So I built my own little cache, and implemented it on top of the in memory databases as well as on top of the ever present HashMap. This has allowed to me to measure the overhead of using an in-memory database vs the HashMap version. In general my intuition has born out, all the in-memory database solutions have been more expensive.

More Details

The problem QBECache solves is providing a thread-safe write through in-memory cache for java beans, where the beans can be looked up using an equality query on any combination of their attributes. A business case for this is cross referencing securities, where you might want o lookup a security by CUSIP, SEDOL, ISIN, TICKER or internal ID.

Query By Example (QBE) is a simple paradigm which supports equality lookup for any attribute. It is implemented by allowing the client to provide a template object which has values for the fields to use in the lookup, and nulls in the other attributes. Inside QBECache all the joins use the nested loop strategy, and statistics on selectivity are used to start with the most restrictive condition first in order to optimize the join order.

You can download the code here. It includes several different implementations of the QBECache for java beans:

Each implementation is very simple, there is a top wrapper object which implements the Cache interface and an implementation of the store which is hidden from the client.

Right now all the implementations use regular JavaBean introspection to identify the attributes to be indexed. A future enhancement would be to provide an extension mechanism using either a Property class or an xml file to define which attributes should be indexed.

The intern implementation has a reduced memory footprint at the cost of slower insertions (the object interning happens then). It uses attribute set operations if they are provided to perfrorm the interning, and it will intern any object field which is indexed. Once the insertion is over, performance is about the same between the standard implementation and intern implementation. Both of the database implementations are slower by close to a factor of 10. This can be attributed to the longer code paths required to traverse the jdbc drivers and then the database kernel, where the two HashMap implementation have very short code paths.

Locking in both of the HashMap implementations is managed using read/write locking which allows multiple readers but only one writer at a time. The structures are exclusively locked during writing, but given that the write operation is very fast and the use case is many more reads than writes, this seems an acceptable tradeoff. Loading the cache from a collection replaces the existing cache, but this operation will not impact queries which are in progress against the old cache.

The Cache API is very simple, it contains the following public API:

About Us | Site Map | Contact Us | ©2000-2008 Jay Walters