Coding with Style
Jeff Pinyan
Using Closures
Email comments to
japhy@pobox.com
Closures have quite a few purposes, but their use boils down to one concept in
general: lexically scoped variables (ones you create via my()) that
exist after their scope (the block they're defined in) is closed.
What is a Closure?
Luckily, the Perl FAQ (section 7) answers this for us:
perldoc -q closure
Found in perlfaq7.pod
What's a closure?
Closures are documented in perlref.
Closure is a computer science term with a precise but hard-to-explain
meaning. Closures are implemented in Perl as anonymous subroutines with lasting
references to lexical variables outside their own scopes. These lexicals
magically refer to the variables that were around when the subroutine was
defined (deep binding).
This means you can do something like:
{                           # $name exists ONLY in this block
    my $name = "Jeff";      # $name is lexically scoped
    sub whoami { $name }    # we use the lexical $name here...
}                           # $name goes out of scope
(Yes, that's a silly thing to do, but just assume you have reason to do it.)
What we have effectively done is made a variable that is private to the
whoami() function (and anything else we may have put in that block).
We can't change the value returned by the function now, unless we change the
function itself. But this example was simple... let's pick a more dynamic
example.
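As a quick check of the claim that $name is private to whoami(), here is a minimal sketch. It assumes strict mode, and the $visible variable and the string eval are only illustration devices, not part of the original example:

```perl
use strict;
use warnings;

{
    my $name = "Jeff";      # lexical, visible only inside this block
    sub whoami { $name }    # named sub that captures $name
}

print whoami(), "\n";       # Jeff

# Under strict, code outside the block cannot even compile a reference
# to $name, so the string eval fails and returns undef.
my $visible = eval(q{ $name = "Someone else"; 1 }) ? "yes" : "no";
print "Can outside code see \$name? $visible\n";    # no
print whoami(), "\n";       # still Jeff
```

The function keeps working after the block has closed, while everything outside the block has lost access to the variable.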
Simple Closures in Action
Let's say we want to make a counter function. Every time we call it, we want
to get a number one higher than before. And let's say we want to have MORE
than just one counter. We could make separate counter functions, or we could
make a counter function that gets an argument, and increments a particular
counter in an array:
{
    my @values;                # @values is seen only by counter()
    sub counter {
        my $which = shift;     # which element to increment
        ++$values[$which];     # return new value
    }
}
for (1..5) { print counter(0) }    # 12345
for (1..3) { print counter(1) }    # 123
for (1..2) { print counter(0) }    # 67
We could use a hash, instead, so that we could do:
$yet_another_user = counter("Perl");
$and_a_competitor = counter("Python");
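The hash-based version is not spelled out in the text, so the following is only a sketch of what it might look like. The name %values is carried over from the array version as an assumption:

```perl
use strict;
use warnings;

{
    my %values;                      # seen only by counter(), keyed by name
    sub counter {
        my $which = shift;           # name of the counter to increment
        return ++$values{$which};    # a missing key counts up from 0
    }
}

my $yet_another_user = counter("Perl");      # 1
my $and_a_competitor = counter("Python");    # 1
print counter("Perl"), "\n";                 # 2
```

Each distinct key gets its own running total, so counters can be named rather than numbered.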
We could change our function to allow a value which determines how much to
increment by, and add a function that sets the starting value.
{
    my %ctr;    # "shared" by start() and counter()
    sub start {
        my ($which,$value) = @_;
        $ctr{$which} = $value || 0;
    }
    sub counter {
        my ($which,$inc) = @_;
        $ctr{$which} += $inc;
    }
}
start("foo",5);
print counter("foo",10);    # 15
print counter("foo",10);    # 25
print counter("foo",10);    # 35
# starts at 0 by default
print counter("bar",-1);    # -1
print counter("bar",-1);    # -2
print counter("bar",-1);    # -3
This isn't very magical though -- the only thing special is the %ctr
variable that is shared by the two functions, and exists long after the block
it is defined in is closed. It would be more interesting if we could call a
function, and then define a function inside of that, that has something to do
with the arguments we passed to the original function:
sub make_counter {
    my ($start,$inc) = @_;
    sub ctr { $start += $inc }
}
But this doesn't work, for a couple of reasons. Subroutine declarations are
just that: declarations. They don't return a value (such as a subroutine
reference, or a subroutine itself, which isn't possible anyway). And the
ctr() function is defined only once, at compile time, so it stays bound to
the $start and $inc that exist before and during the first call to
make_counter().
So even if you tried getting around this problem by returning a reference to
the ctr() function:
#!/usr/bin/perl -w
sub make_counter {
    my ($start,$inc) = @_;
    sub ctr { $start += $inc }
    return \&ctr;
}
$one = make_counter(0,1);
$ten = make_counter(0,10);
print $one->(), $ten->();
you'd not get the expected results.
First, if you're using the -w switch to Perl (who wouldn't be?), you'll
get warned that 'Variable "$start" will not stay shared at inner line 6'
and 'Variable "$inc" will not stay shared at inner line 6'. Note that
these are compile-time warnings; Perl can tell you're up to no good. The next
thing you'll notice is what Perl prints: 12. That is, the number 1,
and then the number 2. Not, as we'd hope, 1 and then 10. This is because
ctr() is defined only once rather than redefined on each call to
make_counter() (if it were, Perl would warn about that, too), so it keeps
using the $start and $inc from the first call.
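To make the "defined only once" point concrete, here is a small sketch that checks whether the two calls hand back the same subroutine. It repeats the broken make_counter() from above and, as an assumption for readability, turns off the 'closure' warning category so the "will not stay shared" messages stay quiet:

```perl
use strict;
use warnings;
no warnings 'closure';    # silence "will not stay shared" for this demo

sub make_counter {
    my ($start, $inc) = @_;
    sub ctr { $start += $inc }    # compiled once, bound to the first $start and $inc
    return \&ctr;
}

my $one = make_counter(0, 1);
my $ten = make_counter(0, 10);

# Both references point at the one and only ctr().
print "same subroutine\n" if $one == $ten;

# ctr() still sees $start = 0 and $inc = 1 from the first call.
print $one->(), $ten->(), "\n";    # 12
```

Since $one and $ten are the same reference, calling either one just bumps the single shared counter.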
Closures Using Code References
The solution is to use an anonymous subroutine, a code reference:
$coderef = sub {
    # ...
};
There is no name given after sub (which is what makes it an anonymous
subroutine), and there is a semicolon after the closing brace since this is
an expression, and not a declaration. Using such a code reference, you can
create the closure we are looking for:
#!/usr/bin/perl -w
sub make_counter {
    my ($start,$inc) = @_;
    return sub { $start += $inc };
}
$one = make_counter(0,1);
$ten = make_counter(0,10);
print $one->(), $ten->();
The output is 110 (1, and then 10) as we'd like (and expect). For my
own nefarious reasons, let's say we want the make_counter() function
to take just the initial value, and the returned function reference to
handle the increment value.
sub make_counter {
    my $start = shift;
    return sub { $start += ($_[0] || 1) };
}
$ctr = make_counter(10);    # starting value of counter is 10
print $ctr->();             # defaults to incrementing by 1
print $ctr->(5);            # increments by 5
This prints 11 and then 16. But we can go even further. How about having an
empty call to the anonymous subroutine just use the previous increment value?
That way, we could send the increment value to the make_counter()
function, and then not need to send it ever again:
sub make_counter {
    my $start = shift;
    my $inc = shift || 1;
    return sub { $start += (@_ ? ($inc = $_[0]) : $inc) };
}
$ctr = make_counter(10,5);    # starting value of counter is 10
print $ctr->();               # defaults to incrementing by 5
print $ctr->(1);              # sets increment to 1
This prints 15 and then 16. Both $inc and $start are private
to the returned code reference (or closure, as we should be thinking of it
by now). The "noisiest" line of this code is probably the one returning the
subroutine reference. If the subroutine reference is passed an argument, then
$inc is set to it. Either way, the value of $inc is then
added to $start. This could have been written in a few other ways:
$start += ((@_ and $inc = $_[0]), $inc);  # inner parens needed: "," binds tighter than "and"
$start += ($inc = $_[0] || $inc); # doesn't allow for 0
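The "doesn't allow for 0" remark is easiest to see side by side. In this sketch the two factory names, make_counter_ternary() and make_counter_or(), are invented for the comparison; the bodies are the ternary form and the || form shown above:

```perl
use strict;
use warnings;

# Ternary form: any argument, including 0, becomes the new increment.
sub make_counter_ternary {
    my $start = shift;
    my $inc   = shift || 1;
    return sub { $start += (@_ ? ($inc = $_[0]) : $inc) };
}

# || form: a false argument such as 0 falls through to the old increment.
sub make_counter_or {
    my $start = shift;
    my $inc   = shift || 1;
    return sub { $start += ($inc = $_[0] || $inc) };
}

my $t = make_counter_ternary(10, 5);
my $o = make_counter_or(10, 5);

print $t->(0), "\n";    # 10: increment is now 0, so the counter stays put
print $o->(0), "\n";    # 15: the 0 is ignored and the old increment of 5 is used
```

Which behaviour is preferable depends on whether an increment of 0 is ever meaningful for the counter in question.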