A Catchy Way to Handle Failure (part 1)

A two-part look at failure handling in Perl

Let's face it - failures sometimes happen. However well-laid our plans might be, there are times when the universe conspires against us, and we don't achieve the desired result. This is just as true for the programs we write.

As with many things, there can often be a variety of ways in which failure is indicated. At the lowest level in a Perl program, that of individual native Perl operators, failures are usually indicated by return value - often false or undef, with perhaps a side-effect of setting the value of the special $! global to indicate the nature of this particular failure. For example, the familiar pattern we often see around the open operator:

open my $fh, "<", $path or die "Cannot open $path - $!";

Error Codes

Perl's style of returning undef comes from the C and UNIX heritage behind the language, where functions often return some special value (sometimes known as a "rogue" or "sentinel" value) to indicate an error - typically -1 for integer-returning functions, or NULL for pointer-returning ones. Perl's $! variable is a direct accessor for libc's errno value, which performs the same role in C. This pattern is widely used throughout C and the system libraries provided with it.

As in Perl, using this style of failure indication means the caller has to check the result every time. While this seems convenient enough for a single function call, it rapidly becomes long and tedious in a larger example, where a function may invoke a number of others as part of its work. If all the error handling is written in this form, it soon starts to drown out the actual business logic of the function, and it becomes harder to see the essence of what the function is really doing for all the noise of the error-handling machinery.
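To make that noise concrete, here is a minimal sketch of a Perl function written entirely in the return-value style. The function name copy_first_line and its behaviour are hypothetical, invented only for illustration; the point is how many of its lines exist purely to pass a failure back to the caller.

```perl
use strict;
use warnings;

# Hypothetical helper: copy the first line of one file into another.
# Failure is reported by returning false; for the I/O operations the
# reason is left in $! for the caller to inspect.
sub copy_first_line {
    my ($from, $to) = @_;

    open my $in, "<", $from or return;

    my $line = <$in>;
    defined $line or return;    # empty file or read error

    close $in or return;

    open my $out, ">", $to or return;
    print {$out} $line or return;
    close $out or return;

    return 1;
}
```

The caller is then obliged to repeat the pattern once more, for example with copy_first_line($src, $dst) or warn "Copy failed - $!", and so on up the call chain.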

Another downside to this style of detecting failure is that it requires conscious effort on the part of the programmer to always remember to do it. If the results of function calls are used without checking whether there was an error, it can lead to further failures of different, or worse, kinds. A common class of error in C programs, one that can easily lead to crashes or potential security holes, is forgetting to check that malloc() returned a valid pointer before trying to use it:

struct Point *p = malloc(sizeof(struct Point));
p->x = 10;
p->y = 20;

If malloc() returns NULL then the second line here will typically crash the program immediately (strictly speaking the behaviour is undefined), because it tries to access a field through the NULL pointer. Typically the solution to this is to remember to check it, and have it cause the containing function itself to also yield its error sentinel value:

struct Point *p = malloc(sizeof(struct Point));
if(!p) return NULL;

As our programs get larger, we eventually find that this sort of error indication by simple return value becomes less flexible and useful, because the detail of the code in successful cases (i.e. the code we really want to read) gets obscured by the error handling. Additionally, we have observed it's easy to forget some case or other, and that can often lead to situations where because such failure isn't handled, it goes on to cause further things to break.

We need to find better ways to handle these failures.

Fatal Errors

Rather than returning an error condition when they fail, an alternative behaviour for functions is to cause the entire program to abort. This saves us from having to check for errors after calling every single library function, and thus neatens up the flow of the program's logic, because it isn't littered with error-handling noise. For example, the GLib library for C provides some wrappers around malloc() that cause the program to abort if the allocation fails, since in desktop software there generally isn't much else that's sensible to do if you run out of memory entirely.

struct Point *p = g_malloc0(sizeof(struct Point));
p->x = 10;
p->y = 20;

Now, we can't accidentally write to a NULL pointer because GLib's g_malloc0() wrapper will abort the entire program. It still crashes, but at least it does so safely without risk of performing any dangerous activities first.
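Perl offers something in a similar spirit through the core autodie pragma, which makes built-ins such as open and close raise a fatal error on failure instead of returning false. The following is a minimal sketch rather than anything from the GLib example above, and the helper name first_line is hypothetical. If nothing intercepts the error, the program terminates with a message describing what failed.

```perl
use strict;
use warnings;
use autodie;    # built-ins such as open and close now die on failure

# Hypothetical helper: read the first line of a file.
# No "or die" is needed; a failed open or close is fatal by itself.
sub first_line {
    my ($path) = @_;

    open my $fh, "<", $path;
    my $line = <$fh>;
    close $fh;

    return $line;
}
```

As with g_malloc0(), the successful path reads cleanly, at the cost that the function itself no longer gets to decide what happens on failure.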

Again this might work for smaller programs, but isn't very suitable for larger systems, especially in the parts of code that need to remain highly-available and keep running even under partial failure conditions.

Exceptional Control Flow

Instead of causing the entire program to abort, we can set up regions in which such a failure will cause that entire section to abort, while the overall program remains in control and is able to recover in some way.

For example, one of the main features that the C++ language introduces on top of C is the set of exception-handling keywords try, throw and catch. Exception handling had appeared in many languages before C++, but it was one of the first more "mainstream" languages to include the feature, so many languages since have largely used the same keywords with similar semantics behind them.

try {
    // ... code that may throw ...
}
catch (const std::exception &e) {
    cout << "It failed!" << endl;
}

In Perl we have the behaviour of throw given by the Perl keyword die, and while the spirit of try and catch can be implemented by using eval, it is usually frowned upon to do so directly. Instead, one of several CPAN modules should be used to provide exception-handling semantics by way of these more familiar keywords (because additionally, these modules take care of a number of awkward manual steps that otherwise need to be performed). Two popular modules are Try::Tiny and Syntax::Keyword::Try.

use Syntax::Keyword::Try;

try {
    # ... code that may die ...
}
catch {
    warn "Something went wrong - $@";
}

What we now have is a section of code (the try part) where reported failures will immediately stop the flow of logic within the block, and instead will jump to executing a new piece of code, specifically designated for handling such a failure (the catch part).
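For comparison, the awkward manual steps that these modules take care of look roughly like the following when written with a bare eval. This is only a sketch of one commonly used idiom; risky_operation is a hypothetical stand-in for real work, and here it always fails.

```perl
use strict;
use warnings;

# Hypothetical operation that always fails, standing in for real work.
sub risky_operation {
    die "disk is full\n";
}

my $result;
{
    local $@;    # avoid clobbering a value of $@ the caller may still need
    my $ok = eval {
        $result = risky_operation();
        1;       # a true value marks the block as having run to completion
    };
    if (!$ok) {
        my $error = $@ || "unknown error";    # copy before it can be overwritten
        chomp $error;
        $result = "recovered from: $error";
    }
}
```

Testing the return value of eval rather than $@ itself, copying $@ straight away, and localising it are all easy to forget, which is a large part of why the module-provided keywords are preferred.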

It is important to realise that this try block is dynamic, not lexical, in nature. Any failure that happens within the braces, or inside any function called there - recursively - will be trapped by this construction and sent to the catch block. It is this dynamic behaviour that provides the power and convenience of this construction.
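A small sketch can illustrate this dynamic behaviour. It is written with Perl's built-in eval purely so that it runs without any CPAN modules; a try/catch block from either module mentioned above behaves the same way in this respect. The functions parse_record and load_records are hypothetical. The failure is raised two calls below the handler, and the function in between contains no error handling at all.

```perl
use strict;
use warnings;

# Hypothetical innermost function: the only place a failure is raised.
sub parse_record {
    my ($text) = @_;
    die "malformed record '$text'\n" unless $text =~ /^\w+=\w+$/;
    return $text;
}

# Hypothetical middle layer: pure business logic, no error checks.
sub load_records {
    my (@lines) = @_;
    return map { parse_record($_) } @lines;
}

my @loaded;
my $ok = eval {
    @loaded = load_records("a=1", "oops", "b=2");
    1;
};
my $message = $ok
    ? "loaded " . scalar(@loaded) . " records"
    : "failed: $@";
```

Here the die inside parse_record unwinds through load_records and lands in the outer handler, so $message ends up describing the malformed record and @loaded is never assigned.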

Part 2

In part 2 we will take a look at how the concepts discussed in this part relate to asynchronous programming using futures, and how they can be implemented.