
A Sense of Closure

A discussion came up the other day on the subject of how best to structure a certain piece of code involving an IO::Async timer. The code in question was a watchdog, whose job is to alert the surrounding program that a connection appears to be dead because no traffic has been seen on it for a while. Central to this discussion were two alternative ways of constructing the code inside the timer's expiry callback function.

The initial suggestion involved creating a timer that simply invokes a method on its containing object when it expires, with a piece of code in the connection that looked something like this:

    $self->{watchdog} = IO::Async::Timer::Countdown->new(
        delay => $WATCHDOG_DELAY,
        on_expire => sub {
            $self->on_watchdog_expired;
        },
    );

At this point the keen-eyed observer might notice a problem. We've created a reference cycle here. Namely, the new countdown timer object stores a code reference which captures the $self lexical, which is our own connection instance, but this timer object is also stored as a value in the connection instance's own hash, thus forming a cycle from connection to timer to code reference and back to connection again. Under Perl's reference counting, none of the three can be freed while the cycle exists.
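
The effect of such a cycle can be seen without IO::Async at all. The following is a minimal sketch using two hypothetical stand-in classes, Demo::Timer and Demo::Connection (neither is part of IO::Async), with a DESTROY counter to show that the connection is not freed when the last outside reference to it goes away:

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the timer and connection; not IO::Async classes.
package Demo::Timer {
    sub new {
        my ( $class, %args ) = @_;
        my $self = { on_expire => $args{on_expire} };
        return bless $self, $class;
    }
}

package Demo::Connection {
    our $destroyed = 0;

    sub new {
        my ( $class ) = @_;
        my $self = bless {}, $class;
        # The callback captures $self strongly, and $self holds the timer:
        #   connection -> timer -> code reference -> connection
        $self->{watchdog} = Demo::Timer->new(
            on_expire => sub { $self->on_watchdog_expired },
        );
        return $self;
    }

    sub on_watchdog_expired { }

    sub DESTROY { $destroyed++ }
}

package main;

{
    my $conn = Demo::Connection->new;
}   # the only outside reference is dropped here

# The cycle keeps the object alive, so DESTROY has not run yet.
print "destroyed after scope exit: $Demo::Connection::destroyed\n";   # 0
```

The object is only cleaned up at global destruction when the program exits, which in a long-running program means it is leaked.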

Weakening A Cycle

A fairly standard solution to this cycle problem is to weaken one of the references, so that a cycle is no longer formed. Typically this would be done by weakening the reference to $self inside the callback function, so that it alone doesn't hold the connection instance alive. While that instance remains alive this reference continues to point at it, but if all other (strong) references to that instance go away, the connection instance is destroyed and any remaining weak references to it become undef.

    use Scalar::Util qw( weaken );
    ...
    weaken(my $weakself = $self);
    $self->{watchdog} = IO::Async::Timer::Countdown->new(
        delay => $WATCHDOG_DELAY,
        on_expire => sub {
            $weakself->on_watchdog_expired;
        },
    );
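
The behaviour of a weakened reference can be checked in isolation. This is a small sketch in which a plain hash stands in for the connection instance (the `name` key and the returned messages are invented for illustration). It also shows that a callback holding only a weak reference may find it undefined, so guarding against that case is a reasonable precaution:

```perl
use strict;
use warnings;
use Scalar::Util qw( weaken );

# A plain hash reference as a hypothetical stand-in for the connection.
my $conn = { name => "demo connection" };

weaken( my $weakconn = $conn );

# The callback holds only a weak reference, so it has to cope with undef.
my $on_expire = sub {
    return "connection gone" unless defined $weakconn;
    return "watchdog expired on $weakconn->{name}";
};

my $before = $on_expire->();
undef $conn;                  # drop the only strong reference
my $after = $on_expire->();

print "$before\n";   # watchdog expired on demo connection
print "$after\n";    # connection gone
```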

Lexical Closure

Both of these first two code examples have involved an anonymous function that captured an outer lexical variable - $self in the first case, and $weakself in the second. In each case that variable is created in the immediately containing scope, which is a method on the connection instance itself (whose full code I omitted from these short examples).

In both of these cases, the inner (anonymous) function we created that captured this lexical is stored by the timer object instance, and therefore can survive for longer than the current "activation" of the method it was created in. In this situation, Perl will store that captured variable as part of that new anonymous function instance stored in the timer, forming a structure known as a closure. This closure is the combination of a function, and any outer variables it had captured, giving the code some context to work with.

Because the closure has to store the surrounding context of its captured variables, each time the sub { ... } expression is encountered, Perl has to allocate some new memory to store it all in. While it doesn't matter too much in this case (as we are only about to store that closure reference into a new timer object anyway, and that will allocate more), there are sometimes occasions when that extra memory usage can start to grow more than is necessary. In those situations, there can sometimes be a better solution.
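
As a small illustration of both points, here is a sketch of the classic counter generator. The make_counter helper is hypothetical and has nothing to do with the watchdog code. Each call evaluates the sub { ... } expression afresh, yielding a separate closure with its own captured copy of $count:

```perl
use strict;
use warnings;

# make_counter is a hypothetical helper, purely for illustration.
sub make_counter {
    my $count = shift;
    # $count outlives this call because the returned function captures it.
    return sub { return $count++ };
}

my $from_one = make_counter(1);
my $from_ten = make_counter(10);

# Each closure advances its own private $count.
my @seen = ( $from_one->(), $from_one->(), $from_ten->(), $from_one->() );
print "@seen\n";    # 1 2 10 3

# Two evaluations of the sub { ... } expression gave two distinct code values.
my $distinct = ( $from_one != $from_ten ) ? 1 : 0;
print "distinct closures: $distinct\n";    # distinct closures: 1
```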

When A Closure Isn't

Looking a little further into the original code, we can find a different technique that, as well as saving us some memory, also avoids the need to perform that slightly awkward weaken() dance.

The connection instance itself is already an IO::Async::Notifier subclass (well, actually it wasn't, but let's say for the sake of argument that it could easily be made so, because that makes the following technique work), and to make the timer work the connection adds it as a child notifier instance:

    $self->{watchdog} = IO::Async::Timer::Countdown->new(
        delay => $WATCHDOG_DELAY,
        on_expire => sub {
            $self->on_watchdog_expired;
        },
    );
    $self->add_child($self->{watchdog});

Rather than using the $weakself trick we can instead apply a different idea. As IO::Async::Notifier instances can be formed into trees, they have to track their parent and child relationships. This has to be done with one direction being weakened, otherwise another reference cycle would be created. The general structure of programs suggests that notifiers should strongly hold their children, but only weakly hold their parents, so that is what is done here. This means that rather than needing to form a closure over a $weakself lexical, we can instead use the ->parent method to reach the connection. The timer instance is readily available because, as with all event callback functions on all IO::Async objects, it is passed in as the first argument to its event callback:

    $self->{watchdog} = IO::Async::Timer::Countdown->new(
        delay => $WATCHDOG_DELAY,
        on_expire => sub {
            $_[0]->parent->on_watchdog_expired;
        },
    );
    $self->add_child($self->{watchdog});

Like the earlier example with weaken(), this example also avoids a reference cycle, and so is well-behaved for memory usage. Because that weak reference is created by the IO::Async internals, we get to make use of it "for free", without needing to manage it ourselves, so the code is a little simpler too. We haven't needed to capture a lexical variable to store $self (being the connection instance) this time, because the watchdog timer is added as a child of that connection, and so the watchdog has access already via the ->parent method.
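
The shape of that arrangement can be sketched without IO::Async. Demo::Node below is a hypothetical miniature of a notifier tree, not the real IO::Async::Notifier implementation. The `expire` method and the `expired` key are invented for illustration. It holds children strongly and parents weakly, and passes the invoking object to its callback as the first argument:

```perl
use strict;
use warnings;
use Scalar::Util qw( weaken );

# Demo::Node is a hypothetical miniature of a notifier tree.
package Demo::Node {
    our $destroyed = 0;

    sub new {
        my ( $class, %args ) = @_;
        my $self = { children => [], %args };
        return bless $self, $class;
    }

    sub add_child {
        my ( $self, $child ) = @_;
        push @{ $self->{children} }, $child;    # strong: parent -> child
        weaken( $child->{parent} = $self );     # weak:   child  -> parent
        return $child;
    }

    sub parent { $_[0]{parent} }

    # Invoke the stored callback, passing this object as its first argument.
    sub expire { $_[0]{on_expire}->( $_[0] ) }

    sub DESTROY { $destroyed++ }
}

package main;

my $expired;
{
    my $conn  = Demo::Node->new;
    my $timer = $conn->add_child(
        # The callback captures no lexicals; it works only from its argument.
        Demo::Node->new( on_expire => sub { $_[0]->parent->{expired}++ } ),
    );
    $timer->expire;
    $expired = $conn->{expired};
}   # both objects become unreachable here

print "expired: $expired, destroyed: $Demo::Node::destroyed\n";
# expired: 1, destroyed: 2
```

Both objects are destroyed as soon as the block ends, which shows that the weak parent link leaves no cycle behind.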

There's another, far more interesting advantage to this code structure. The anonymous function inside the timer no longer has to capture any lexical variables at all. It operates purely on the value that is passed into it when it is invoked. Because of this, the function is not a closure, and Perl does not need to allocate new memory every time. All of the timer instances will share a reference to the same anonymous function, because it does not need to be copied.

Memory Savings

In fact we can demonstrate this advantage more directly with a little toy example program. In the following two cases we generate and store a list of five code references. In the first, we store true lexical closures that capture individual values of an outer variable. In the second, we store references to an anonymous function that does not capture any outer variables.

    use feature 'say';

    my @codes;
    foreach ( 1 .. 5 ) {
        my $self = [];
        # Captures $self, so each iteration creates a new closure
        push @codes, sub { $self->method };
    }
    say for @codes;

    CODE(0x559962290108)
    CODE(0x5599622902a0)
    CODE(0x55996229a2e8)
    CODE(0x559962299e08)
    CODE(0x55996229a330)

    use feature 'say';

    my @codes;
    foreach ( 1 .. 5 ) {
        my $self = [];
        # Captures nothing, so every iteration yields the same function
        push @codes, sub { $_[0]->parent->method };
    }
    say for @codes;

    CODE(0x560a864a16a0)
    CODE(0x560a864a16a0)
    CODE(0x560a864a16a0)
    CODE(0x560a864a16a0)
    CODE(0x560a864a16a0)

Here we can directly see that in the first case they all had different memory addresses, so we must have allocated five different objects. In the second case, the same memory address is printed five times, as it does not have to create new closures but can simply generate more references to the same anonymous function.

By carefully designing a callback structure and what arguments are passed in to invoked functions, we can often arrange to reduce the number of closures that are needed. This in turn helps us save memory, and with it improve overall performance, when those structures are used many times within a larger application.
