by Jeff "japhy" Pinyan
Repeating Yourself
Email comments to
japhy@pobox.com
If you're concerned with writing redundant code in a program -- code that might
be expensive, time-wise -- then read on. I'll show you how to spot these, and
how to fix them.
What You Coded
Let's say you have an array of words. If the array contains words in ALL
CAPITAL LETTERS, you want to remove those words, and call foo(). If
not, you just want to call bar(). When you code this the first time,
you use the grep() function:
if (grep /^[A-Z]+$/, @list) {
@list = grep not /^[A-Z]+$/, @list;
foo();
}
else {
bar();
}
Do you see the redundancy here? We call grep() on the array once to
determine the course of action, and then we end up calling it again (doing the
opposite thing this time) if we need to remove elements!
Or, let's say you want to remove a substring from a string, using a regular
expression. Far too often I have seen people write:
if ($str =~ /%%\w+%%/) {
$str =~ s/%%(\w+)%%/$WORD{$1}/;
do_something();
}
There is little point in scanning $str for text, if you're only going
to re-scan for the text right away.
How to Fix It
Well, for my first example, let's go through a bit of logic here. If the array
has elements that are all in uppercase, then removing those elements would make
the array smaller than it was before. So, we should try to remove these
elements, and then check to see whether the size has changed!
$orig_size = @list;
@list = grep not /^[A-Z]+$/, @list;
$new_size = @list;
if ($orig_size != $new_size) { foo() }
else { bar() }
And we can beautify this code a bit:
$orig = @list;
if ($orig != (@list = grep not /^[A-Z]+$/, @list)) { foo() }
else { bar() }
What I did here was combine the modification of the the @list array
with the evaluation of the modified array in scalar context. First, it gets
modified, and then evaluated in scalar context (which returns the number of
elements in it).
For our regex example, the solution is much less involving. Simply remove the
initial scan, and use the substitution as the condition:
if ($str =~ s/%%(\w+)%%/$WORD{$1}/) { do_something() }
Or, if you're like me, you'll probably write:
do_something() if $str =~ s/%%(\w+)%%/$WORD{$1}/;
"Now never let that happen again!"
So how do you avoid this? Well, there are a couple steps. First, ask yourself
what a particular block of code is doing. If you hear yourself repeating
something, chances are your code repeats itself (even if not as obviously as
the examples above).
You should also look for surprising similar blocks of code -- sometimes the
repetition is useless, and sometimes it should be streamlined in the form of a
function.