by Jeff "japhy" Pinyan

Repeating Yourself

Email comments to japhy@pobox.com
If you're concerned with writing redundant code in a program -- code that might be expensive, time-wise -- then read on. I'll show you how to spot these, and how to fix them.

What You Coded

Let's say you have an array of words. If the array contains words in ALL CAPITAL LETTERS, you want to remove those words, and call foo(). If not, you just want to call bar(). When you code this the first time, you use the grep() function: if (grep /^[A-Z]+$/, @list) { @list = grep not /^[A-Z]+$/, @list; foo(); } else { bar(); } Do you see the redundancy here? We call grep() on the array once to determine the course of action, and then we end up calling it again (doing the opposite thing this time) if we need to remove elements!

Or, let's say you want to remove a substring from a string, using a regular expression. Far too often I have seen people write: if ($str =~ /%%\w+%%/) { $str =~ s/%%(\w+)%%/$WORD{$1}/; do_something(); } There is little point in scanning $str for text, if you're only going to re-scan for the text right away.

How to Fix It

Well, for my first example, let's go through a bit of logic here. If the array has elements that are all in uppercase, then removing those elements would make the array smaller than it was before. So, we should try to remove these elements, and then check to see whether the size has changed! $orig_size = @list; @list = grep not /^[A-Z]+$/, @list; $new_size = @list; if ($orig_size != $new_size) { foo() } else { bar() } And we can beautify this code a bit: $orig = @list; if ($orig != (@list = grep not /^[A-Z]+$/, @list)) { foo() } else { bar() } What I did here was combine the modification of the the @list array with the evaluation of the modified array in scalar context. First, it gets modified, and then evaluated in scalar context (which returns the number of elements in it).

For our regex example, the solution is much less involving. Simply remove the initial scan, and use the substitution as the condition: if ($str =~ s/%%(\w+)%%/$WORD{$1}/) { do_something() } Or, if you're like me, you'll probably write: do_something() if $str =~ s/%%(\w+)%%/$WORD{$1}/;

"Now never let that happen again!"

So how do you avoid this? Well, there are a couple steps. First, ask yourself what a particular block of code is doing. If you hear yourself repeating something, chances are your code repeats itself (even if not as obviously as the examples above).

You should also look for surprising similar blocks of code -- sometimes the repetition is useless, and sometimes it should be streamlined in the form of a function.