The subtle art of cache configuration

Summary: It’s deceptively simple to enable caching in Spring, but you must investigate actual production usage in order to configure it properly. There are four areas of concern: peak-time load, uniqueness of requests, savings in time and resources, and the longevity of cacheable values. An example project shows these concerns in action.

I like the topic of caching in Java code. It’s a technique where the Platonic world of clean code meets actual usage. You can never tell just by looking at code whether caching is a good idea or unnecessary optimization. You need to measure or at least predict realistic production loads.

Maybe the Spring framework has made it a little too easy. With minimal boilerplate you can configure any managed object to return cached responses upon repeated requests. Call it once to run the method body, call it twice and the framework intervenes and returns the result of the first call. You can (well, must) plug in a full-featured third-party implementation like Caffeine or Ehcache to enable things like disk overflow and automatic eviction.

public Person getPersonById(long id){ ... } 

There are lots of good tutorials that explain the technical nitty gritty of caching, but they tend to gloss over how important it is to carefully consider your use cases. Caches are essentially key/value stores. The aspect-oriented approach in Spring evaluates method arguments to an identifier that is used as the key to a cached value. It is all deceptively simple to implement, but there is no single solution that fits all use cases, certainly not the default concurrent map implementation where values live forever and keys grow unchecked. That is usually worse than no cache at all.
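
To make the key/value idea concrete, here is a minimal sketch in plain Java of what the default map-backed approach boils down to. It is not Spring's actual implementation; the class name NaiveCache and its loader function are made up for illustration. Note that nothing ever leaves the map, which is exactly the unchecked growth described above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical illustration: a cache with no size limit and no expiry.
final class NaiveCache<K, V> {
    private final Map<K, V> store = new ConcurrentHashMap<>();
    private final Function<K, V> loader;          // stands in for the expensive method body
    private final AtomicInteger loads = new AtomicInteger();

    NaiveCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // The argument is the key; the loader only runs when the key is not stored yet.
    V get(K key) {
        return store.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }

    int loadCount() {
        return loads.get();
    }

    // Grows by one for every distinct key ever requested; nothing is evicted.
    int size() {
        return store.size();
    }
}
```

Requesting the same id twice runs the loader once, while every new id adds an entry that stays for the lifetime of the application.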

Caching is supposed to improve overall performance, but it comes at a cost. The storage and housekeeping of cached entries put a constant strain on computing resources. You need to predict usage and figure out the best configuration. Let’s look at four aspects of the ideal use case and examine how most real-world scenarios are likely to differ.

Frequent (1) and identical (2) requests to an expensive (3) resource (in terms of duration or computation) whose output changes infrequently (4).

  1. The number of concurrent requests. The more the better. A server sitting idle will not benefit much from caching.
  2. The number of unique entries to be cached. The smaller the better, because the cache size will be smaller, and each entry will be read more frequently.
  3. The computing resources involved in the request. The heavier the better, because the bigger the savings.
  4. How long a cache entry remains valid, and whether we can serve expired values. The longer the better. Managing stale values is an expensive background process.

One: frequency of access

It’s obvious that caching can give a much-needed performance boost during peak traffic. If requests pile up and make the service noticeably slower or unresponsive it can be a good solution. It can still be of benefit even when the service performs acceptably without caching. The efficiency boost in cutting out expensive repetition means you can get by with lighter and cheaper server specs. In a cloud infrastructure this is done with the push of a button and the savings are immediate.

Two: uniqueness of requests

The greater the range of unique requests the bigger your store becomes. Individual keys will be requested rarely, thus making the cache less effective. You may however find that the distribution is skewed, and that is a good thing. If you have 20 million possible keys but 98% of requests are for the same 5000 keys, limit your cache size to roughly that number. The policy will be to throw out the least frequently read keys when the cache grows beyond that limit. Don’t worry about the extra database calls. You don’t want your cache to become a stale copy of your master database. Be aware, though, that keeping track of which keys are read, and when, is not free.
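
A size cap with eviction can be sketched in a few lines of plain Java. Note the swap: this example evicts the least recently used key, a simpler stand-in for the frequency-based policy mentioned above, and it is not how Caffeine works internally. The class name BoundedCache is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration: a map that forgets its least recently used entry
// once it holds more than maxSize entries. Not thread-safe.
final class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxSize;

    BoundedCache(int maxSize) {
        // accessOrder = true: iteration order runs from least to most recently accessed
        super(16, 0.75f, true);
        this.maxSize = maxSize;
    }

    // Called by LinkedHashMap after every insertion; returning true drops the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxSize;
    }
}
```

With a limit of two, putting "a" and "b", reading "a" and then putting "c" pushes out "b", because "a" was touched more recently. The bookkeeping that makes this possible (reordering on every read) is the administration cost the paragraph above warns about.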

Three: time and resources saved

Caching prevents repetitive expensive operations. When there is a database call involved in fetching a resource (and there usually is) the improvement of caching can be orders of magnitude. Reducing response time from 500 to 5 milliseconds is a hundredfold improvement, which means the server can handle more concurrent requests. It is perhaps not an improvement that the user will notice but it can become one if it’s part of a chain of incremental improvements. However, please consider the number and uniqueness of requests before caching raw database fetches. There are plenty of performance-enhancing tricks you can do at the database level itself that don’t carry the inherent problem of serving stale data. Which brings us to point four.
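
To put rough numbers on the savings of point three: the average response time is a weighted mix of hits and misses, so the gain depends as much on the hit ratio as on the cost of a single call. The sketch below is a back-of-the-envelope model only; it ignores the overhead of the cache itself, and the class and method names are made up for illustration.

```java
// Hypothetical illustration: expected response time as a weighted average of hits and misses.
final class CacheSavings {
    private CacheSavings() {
    }

    // hitRatio is the fraction of requests served from the cache, between 0.0 and 1.0.
    static double expectedMillis(double hitRatio, double hitMillis, double missMillis) {
        if (hitRatio < 0.0 || hitRatio > 1.0) {
            throw new IllegalArgumentException("hitRatio must be between 0 and 1");
        }
        return hitRatio * hitMillis + (1.0 - hitRatio) * missMillis;
    }
}
```

With the figures used in this article (5 ms for a hit, 500 ms for a miss), a hit ratio of 98% gives an average of roughly 14.9 ms, while a hit ratio of 50% still leaves you at about 252.5 ms.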

Four: cache entry longevity

How should the cache handle values that no longer reflect the latest state? First, it needs to be informed about those changes. Second, is it acceptable to return stale data, and if so, for how long? Take the example of a networked thermometer whose readings change constantly, albeit slightly. We can probably serve readings of five minutes ago, depending on the use case. To do so we configure the cache to evict every entry five minutes after creation. If we get a new request after that, the thermometer will be read afresh, and its value cached. Now take a cache of values that change at irregular intervals (like exchange rates). This time we are not allowed to serve stale values, so we invalidate the cached value every time a change occurs. This is not as simple as it sounds. The caching service has to be in constant contact with the original resource and be notified of any change. Whether this is worth it again comes down to the frequency and similarity of the requests. Caching is actually wasteful if the interval between requests for a short-lived entry is longer than its shelf life. Storing a cached value that expires before the next request comes along means you pay for the housekeeping with none of the benefits.
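
The thermometer case can be sketched as a time-based expiry in plain Java. This is a simplified illustration, not how Caffeine implements expiry: entries are only checked when they are read, and the class name ExpiringCache, the injected clock and the loader function are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical illustration: entries are reloaded once they are older than ttlMillis.
final class ExpiringCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long writtenAtMillis;

        Entry(V value, long writtenAtMillis) {
            this.value = value;
            this.writtenAtMillis = writtenAtMillis;
        }
    }

    private final Map<K, Entry<V>> store = new HashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock;      // injected so the passage of time can be simulated
    private final Function<K, V> loader;   // stands in for reading the thermometer
    private int loads;

    ExpiringCache(long ttlMillis, LongSupplier clock, Function<K, V> loader) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
        this.loader = loader;
    }

    synchronized V get(K key) {
        long now = clock.getAsLong();
        Entry<V> entry = store.get(key);
        // Reload when the key is unknown or the stored value has outlived its shelf life.
        if (entry == null || now - entry.writtenAtMillis >= ttlMillis) {
            entry = new Entry<>(loader.apply(key), now);
            store.put(key, entry);
            loads++;
        }
        return entry.value;
    }

    synchronized int loadCount() {
        return loads;
    }
}
```

If requests for a key arrive further apart than ttlMillis, every single call ends up in the loader and loadCount() equals the number of requests. That is the wasteful situation described above.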

The weather station project

Consider a heavily used service that returns a temperature reading from a number of remote weather stations based on a Dutch postcode. It is backed by a database that maps a postcode to a weather station ID and a service that retrieves the temperature from that station through a mock networked service. We are caching the postcode/weather station mapping as well as the temperature readings. The cache implementation of choice is the Caffeine project, based on the popular Google Guava cache. You can clone the project here:

Dutch postal codes consist of four digits and two letters. They are accurate to individual street level, and there are roughly half a million valid ones. We expect the vast majority of requests to come from the largest urban areas, perhaps 20K unique codes.

We make the cache size configurable with application parameters. Mappings between postal code and weather station are fixed, so we need no expiry mechanism. Keys can live forever. The true argument is the allowNullValues parameter. This means the cache can store a null value for a postal code that does not exist in the database. This saves a trip to the database if the same non-existing key is requested often.

private int postcodeMaxSize;

private CaffeineCache buildPostCodeCache() {
        return new CaffeineCache(POSTCODE_CACHE, Caffeine

For the weather station cache we are dealing with a small, fixed set of keys (the ids of the stations) that hold a value which goes stale the moment it is cached. We require no cap on the cache size and null values don’t come into play. We do however need to make sure that values are expired after a configurable number of milliseconds.

private int expiryTemperatureMilliSeconds;    

private CaffeineCache buildTemperatureCache() {
        return new CaffeineCache(TEMPERATURE_CACHE, Caffeine
                .expireAfterWrite(expiryTemperatureMilliSeconds, TimeUnit.MILLISECONDS)
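
The two snippets above are excerpts. As a rough guide, a complete configuration class along the same lines might look like the sketch below. It is an assumption-laden reconstruction, not the project's actual code: the class name CacheConfig, the cache name strings, the property keys and their default values are all made up, and the project may wire its caches differently. It needs Spring's cache support and Caffeine on the classpath.

```java
import com.github.benmanes.caffeine.cache.Caffeine;
import java.util.List;
import java.util.concurrent.TimeUnit;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.cache.CacheManager;
import org.springframework.cache.annotation.EnableCaching;
import org.springframework.cache.caffeine.CaffeineCache;
import org.springframework.cache.support.SimpleCacheManager;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableCaching
public class CacheConfig {

    // Cache names are assumptions; use whatever the @Cacheable annotations refer to.
    static final String POSTCODE_CACHE = "postcode";
    static final String TEMPERATURE_CACHE = "temperature";

    // Property keys and defaults are hypothetical.
    @Value("${cache.postcode.max-size:20000}")
    private int postcodeMaxSize;

    @Value("${cache.temperature.expiry-ms:300000}")
    private int expiryTemperatureMilliSeconds;

    @Bean
    public CacheManager cacheManager() {
        SimpleCacheManager manager = new SimpleCacheManager();
        manager.setCaches(List.of(buildPostCodeCache(), buildTemperatureCache()));
        return manager;
    }

    // Size-capped, never expires; 'true' allows caching null for unknown postcodes.
    private CaffeineCache buildPostCodeCache() {
        return new CaffeineCache(POSTCODE_CACHE, Caffeine.newBuilder()
                .maximumSize(postcodeMaxSize)
                .build(), true);
    }

    // No size cap; every reading is dropped a fixed time after it was written.
    private CaffeineCache buildTemperatureCache() {
        return new CaffeineCache(TEMPERATURE_CACHE, Caffeine.newBuilder()
                .expireAfterWrite(expiryTemperatureMilliSeconds, TimeUnit.MILLISECONDS)
                .build());
    }
}
```

Because both limits come in through properties, they can be overridden per environment without touching the code.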

Wrapping up

While it’s true that an effective caching implementation is harder than it seems, the flexibility of the framework makes it equally easy to twiddle the values, even at runtime without code changes. So make sure not to hard-code essential configuration parameters like cache size and expiry times, but set them through the Spring property mechanism, as in this example.

Happy experimenting!