Caching the results of Future-returning functions

When making read requests or queries to an external service, it often helps to cache the results, so that later requests for the same information can reuse the stored result rather than repeating the query to that service.

A simple cache can be implemented with a hash, keyed by the query, that stores each result. The //= operator gives a neat expression for managing this cache: return the result immediately if it is already defined in the hash, or else execute the actual query function and assign its result into the hash, ready for next time. For synchronous value-returning code, it might look like this:

my %usernames_by_id;

sub get_username_by_id_cached
{
    my ($id) = @_;
    return $usernames_by_id{$id} //= $db->get_username_by_id($id);
}
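
To see the cache at work, here is a minimal runnable sketch. The StubDB class is a hypothetical stand-in for the real database handle (it is not part of the original code); it simply counts how many queries reach it.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real database handle: it counts the
# queries it receives and invents a username from the ID.
{
    package StubDB;
    sub new { return bless { queries => 0 }, shift }
    sub get_username_by_id {
        my ($self, $id) = @_;
        $self->{queries}++;
        return "user$id";
    }
}

my $db = StubDB->new;
my %usernames_by_id;

sub get_username_by_id_cached
{
    my ($id) = @_;
    return $usernames_by_id{$id} //= $db->get_username_by_id($id);
}

# Three lookups for the same ID reach the stub only once.
my $name;
$name = get_username_by_id_cached(42) for 1 .. 3;
print "username: $name, queries made: $db->{queries}\n";
```

Because //= tests for definedness, a lookup whose result is legitimately undef would not be remembered and would be queried again on every call.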

Value cache

In asynchronous code that uses futures, such a database query method would typically return a Future that eventually provides the result. At first it might appear more awkward to put a cache in front of this. You would have to check separately whether the value is in the hash and, if so, return a Future->done() with the result. Otherwise you would perform the query and store its result in the hash for next time, perhaps by using an on_done side-effect observer. An initial attempt might look like this:

my %usernames_by_id;

sub get_username_by_id_cached
{
    my ($id) = @_;
    exists $usernames_by_id{$id} and
        return Future->done($usernames_by_id{$id});
        
    return $db->get_username_by_id($id)
        ->on_done(sub {
            my ($username) = @_;
            $usernames_by_id{$id} = $username;
        });
}

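This already looks far messier than the synchronous one, and it leaves one key question unanswered: where is the $db->get_username_by_id() future stored while it is pending? Until the query completes the hash holds nothing for that key, so a second call for the same ID in the meantime would start a second query.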
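
The sketch below illustrates that gap. It assumes a hypothetical StubDB whose queries return pending futures that are collected in @pending and completed by hand; both names, and the username "alice", are inventions for this demonstration only.

```perl
use strict;
use warnings;
use Future;

# Hypothetical stub for an asynchronous database client: each query returns
# a new pending Future, remembered in @pending so we can complete it by hand.
my @pending;
{
    package StubDB;
    sub new { return bless {}, shift }
    sub get_username_by_id {
        my ($self, $id) = @_;
        my $f = Future->new;
        push @pending, $f;
        return $f;
    }
}

my $db = StubDB->new;
my %usernames_by_id;

sub get_username_by_id_cached
{
    my ($id) = @_;
    exists $usernames_by_id{$id} and
        return Future->done($usernames_by_id{$id});

    return $db->get_username_by_id($id)
        ->on_done(sub {
            my ($username) = @_;
            $usernames_by_id{$id} = $username;
        });
}

# Two lookups for the same ID arrive before the first has completed,
# so each one starts its own outbound query.
my $f1 = get_username_by_id_cached(42);
my $f2 = get_username_by_id_cached(42);
print "outbound queries: ", scalar(@pending), "\n";   # 2

# Once the queries complete, later lookups are served from the hash.
$_->done("alice") for @pending;
my $f3 = get_username_by_id_cached(42);
print "outbound queries: ", scalar(@pending), "\n";   # still 2
```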

It turns out, though, that the nature of futures provides a far neater solution. Rather than mapping keys directly to values, the hash simply maps keys to the futures that provide those values. The source code is exactly the same as the synchronous version given earlier: if the $db->... method returns a Future, then so too does the cached lookup function. When that Future completes, it yields the query result to whoever is waiting on it. The now-completed Future stays in the hash, so it yields its result immediately the next time the function is invoked.

Since the hash is updated as soon as the query is started, rather than later when it eventually completes, any subsequent call that arrives after the first one can be given the same future instance. If another call to this caching function is made with the same user ID as a key, it returns the same pending future that the previous call started, rather than making a second external database query. When that query eventually succeeds, both callers receive the result even though only one database query was actually made.

By storing results as futures and accepting that the hash may well contain still-pending futures, we can collapse multiple overlapping requests for the same information into a single outbound query on the database. Often this has a big impact during application startup or other periods of heavy traffic, when many such requests are performed at once.
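
A small sketch of this sharing, under the same assumptions as before: StubDB, @pending and the username "alice" are hypothetical names used only so that the example runs on its own.

```perl
use strict;
use warnings;
use Future;
use Scalar::Util qw( refaddr );

# Hypothetical stub for an asynchronous database client: each query returns
# a new pending Future, remembered in @pending so we can complete it by hand.
my @pending;
{
    package StubDB;
    sub new { return bless {}, shift }
    sub get_username_by_id {
        my ($self, $id) = @_;
        my $f = Future->new;
        push @pending, $f;
        return $f;
    }
}

my $db = StubDB->new;
my %username_futures_by_id;

# Same body as the synchronous version; the hash now holds futures.
sub get_username_by_id_cached
{
    my ($id) = @_;
    return $username_futures_by_id{$id} //= $db->get_username_by_id($id);
}

# Two overlapping lookups share one pending future and one outbound query.
my $f1 = get_username_by_id_cached(42);
my $f2 = get_username_by_id_cached(42);
print "outbound queries: ", scalar(@pending), "\n";                  # 1
print "same future: ", (refaddr($f1) == refaddr($f2) ? "yes" : "no"), "\n";

# Completing the single query delivers the result to both callers.
$pending[0]->done("alice");
print "first caller got:  ", scalar($f1->get), "\n";
print "second caller got: ", scalar($f2->get), "\n";

# A later lookup receives the already-completed future from the hash.
my $f3 = get_username_by_id_cached(42);
```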

Clearing on failure

One behaviour of the synchronous version that we have now lost concerns lookups that fail. If the synchronous version used exceptions to indicate failure, the exception would abort the expression that assigns the result into the hash, so the hash would not be updated and the failure would not persist there. In our future-based version the hash is already storing the pending future, so if that operation eventually fails, the failure will persist in the cache.

As it is unlikely that we want to store failures, we'll have to take one additional step to remove the future from the hash if a failure happens, so that subsequent lookups will perform a fresh request on the external service and hopefully obtain a better result:

my %username_futures_by_id;

sub get_username_by_id_cached
{
    my ($id) = @_;
    return $username_futures_by_id{$id} //= $db->get_username_by_id($id)
        ->on_fail(sub { delete $username_futures_by_id{$id} });
}

This surprisingly-short code example captures a multitude of behaviours:

  • Provides a transparent read cache of single-keyed mappings.
  • Collapses multiple concurrent requests for the same key into a single outbound request.
  • Stores only successful lookups, not failed ones.
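
The failure handling can be checked with a short sketch. As before, StubDB, @pending, the failure message and the username "alice" are hypothetical, chosen only to make the example self-contained.

```perl
use strict;
use warnings;
use Future;
use Scalar::Util qw( refaddr );

# Hypothetical stub for an asynchronous database client: each query returns
# a new pending Future, remembered in @pending so we can complete it by hand.
my @pending;
{
    package StubDB;
    sub new { return bless {}, shift }
    sub get_username_by_id {
        my ($self, $id) = @_;
        my $f = Future->new;
        push @pending, $f;
        return $f;
    }
}

my $db = StubDB->new;
my %username_futures_by_id;

sub get_username_by_id_cached
{
    my ($id) = @_;
    return $username_futures_by_id{$id} //= $db->get_username_by_id($id)
        ->on_fail(sub { delete $username_futures_by_id{$id} });
}

# The first query fails, so its future is removed from the hash.
my $f1 = get_username_by_id_cached(42);
$pending[0]->fail("connection lost");
my $kept = exists $username_futures_by_id{42};
print "cached after failure: ", ($kept ? "yes" : "no"), "\n";   # no

# The next lookup therefore starts a fresh query, which succeeds.
my $f2 = get_username_by_id_cached(42);
$pending[1]->done("alice");

# The successful future stays cached; no third query is made.
my $f3 = get_username_by_id_cached(42);
print "outbound queries: ", scalar(@pending), "\n";             # 2
```

One edge case worth checking in your own code: on_fail invokes its callback immediately when the future has already failed. If the query method can return an already-failed future, the delete would run before the //= assignment takes place, and the failed future would then be stored after all.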