COMMON MISTAKES

The two most common mistakes made in constructing something like an array of arrays are accidentally counting the number of elements and taking a reference to the same memory location repeatedly. Here's the case where you just get the count instead of a nested array:

    for $i (1..10) {
        @list = somefunc($i);
        $LoL[$i] = @list;       # WRONG!
    }

That's just the simple case of assigning an array to a scalar and getting its element count. If that's what you really and truly want, then you might do well to consider being a tad more explicit about it, like this:

    for $i (1..10) {
        @list = somefunc($i);
        $counts[$i] = scalar @list;
    }
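A small self-contained check makes the difference visible. This is only a sketch: somefunc() is never defined in the text above, so the version here is a hypothetical stand-in that returns the integers from 1 to its argument.

```perl
use strict;
use warnings;

# Hypothetical stand-in for somefunc(): returns the list 1..$n.
sub somefunc {
    my ($n) = @_;
    return (1 .. $n);
}

my @LoL;
for my $i (1 .. 3) {
    my @list = somefunc($i);
    $LoL[$i] = @list;    # array in scalar context: stores the count
}

# Each slot is a plain number, not a reference to an array.
for my $i (1 .. 3) {
    my $kind = ref($LoL[$i]) ? "reference" : "plain scalar";
    print "slot $i holds $LoL[$i] ($kind)\n";
}
```

Every slot reports a plain scalar, which is the clue that the rows themselves were never stored.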

Here's the case of taking a reference to the same memory location again and again:

    for $i (1..10) {
        @list = somefunc($i);
        $LoL[$i] = \@list;      # WRONG!
    }

So, what's the big problem with that? It looks right, doesn't it? After all, I just told you that you need an array of references, so by golly, you've made me one!

Unfortunately, while this is true, it's still broken. All the references in @LoL refer to the very same place, and they will therefore all hold whatever was last in @list! It's similar to the problem demonstrated in the following C program:

    #include <stdio.h>
    #include <pwd.h>

    int main(void) {
        struct passwd *rp, *dp;
        rp = getpwnam("root");      /* may point into static storage */
        dp = getpwnam("daemon");    /* may overwrite that same storage */

        printf("daemon name is %s\nroot name is %s\n",
                dp->pw_name, rp->pw_name);
        return 0;
    }

On systems where getpwnam() returns a pointer to a single static buffer (as POSIX permits), this will print

    daemon name is daemon
    root name is daemon

The problem is that both rp and dp are pointers to the same location in memory! In C, you'd have to remember to malloc() yourself some new memory. In Perl, you'll want to use the array constructor [] or the hash constructor {} instead. Here's the right way to do the preceding broken code fragments:

    for $i (1..10) {
        @list = somefunc($i);
        $LoL[$i] = [ @list ];
    }

The square brackets make a reference to a new array with a copy of what's in @list at the time of the assignment. This is what you want.
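To see the broken and the corrected assignments side by side, here is a minimal sketch. As before, somefunc() is a hypothetical stand-in, and the names @shared and @copied are introduced only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for somefunc(): returns the list 1..$n.
sub somefunc {
    my ($n) = @_;
    return (1 .. $n);
}

my (@list, @shared, @copied);    # @list is deliberately declared outside the loop
for my $i (1 .. 3) {
    @list = somefunc($i);
    $shared[$i] = \@list;        # every slot refers to the one and only @list
    $copied[$i] = [ @list ];     # each slot gets a fresh anonymous array
}

print "shared: @{ $shared[1] } | @{ $shared[3] }\n";
print "copied: @{ $copied[1] } | @{ $copied[3] }\n";
```

The rows reached through @shared are all identical, because they are all the same array; the rows in @copied keep the contents @list had at the time of each assignment.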

Note that this will produce something similar, but it's much harder to read:

    for $i (1..10) {
        @list = 0 .. $i;
        @{$LoL[$i]} = @list;
    }

Is it the same? Well, maybe so--and maybe not. The subtle difference is that when you assign something in square brackets, you know for sure it's always a brand new reference with a new copy of the data. Something else could be going on in this new case with the @{$LoL[$i]} dereference on the left-hand side of the assignment. It all depends on whether $LoL[$i] had been undefined to start with, or whether it already contained a reference. If you had already populated @LoL with references, as in

    $LoL[3] = \@another_list;

then the assignment with the indirection on the left-hand side would use the existing reference that was already there:

    @{$LoL[3]} = @list;

Of course, this would have the "interesting" effect of clobbering @another_list. (Have you ever noticed how when a programmer says something is "interesting", that rather than meaning "intriguing", they're disturbingly more apt to mean that it's "annoying", "difficult", or both? :-)
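The two outcomes of assigning through @{ ... } can be shown in a few lines. This is a sketch with made-up sample data; slot 4 is added here only to contrast the undefined case with the already-populated one.

```perl
use strict;
use warnings;

my @another_list = ('a', 'b', 'c');
my @list         = (1, 2);
my @LoL;

$LoL[3] = \@another_list;    # slot 3 already holds a reference
@{ $LoL[3] } = @list;        # assigns through it, overwriting @another_list
@{ $LoL[4] } = @list;        # slot 4 was undefined, so a new array springs into being

print "another_list is now: @another_list\n";
print "slot 4 holds: @{ $LoL[4] }\n";
```

After this runs, @another_list no longer contains its original three letters, while slot 4 refers to a separate array that merely has the same contents as @list.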

So just remember always to use the array or hash constructors with [] or {}, and you'll be fine, although it's not always optimally efficient.

Surprisingly, the following dangerous-looking construct will actually work out fine:

    for $i (1..10) {
        my @list = somefunc($i);
        $LoL[$i] = \@list;
    }

That's because my() is more of a run-time statement than it is a compile-time declaration per se. This means that the my() variable is remade afresh each time through the loop. So even though it looks as though you stored the same variable reference each time, you actually did not! This is a subtle distinction that can produce more efficient code at the risk of misleading all but the most experienced of programmers. So I usually advise against teaching it to beginners. In fact, except for passing arguments to functions, I seldom like to see the gimme-a-reference operator (backslash) used much at all in code. Instead, I advise beginners that they (and most of the rest of us) should try to use the much more easily understood constructors [] and {} instead of relying upon lexical (or dynamic) scoping and hidden reference-counting to do the right thing behind the scenes.
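A quick way to convince yourself is to build a few rows this way and compare the references afterwards. In this sketch the range 1 .. $i stands in for somefunc($i), purely as an assumption to keep the example self-contained.

```perl
use strict;
use warnings;

my @LoL;
for my $i (1 .. 3) {
    my @list = (1 .. $i);    # a new @list on every pass through the loop
    $LoL[$i] = \@list;       # so each reference points to a different array
}

print "row $_ holds @{ $LoL[$_] }\n" for 1 .. 3;
print "rows 1 and 2 are ",
      ($LoL[1] == $LoL[2] ? "the same array" : "different arrays"), "\n";
```

Each row keeps its own contents, and comparing the references numerically confirms they are distinct arrays. Move the my declaration outside the loop and the rows collapse into one shared array again.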

In summary:

    $LoL[$i] = [ @list ];       # usually best
    $LoL[$i] = \@list;          # perilous; just how my() was that list?
    @{ $LoL[$i] } = @list;      # way too tricky for most programmers

Source: Perl Data Structures Cookbook
Copyright: Larry Wall, et al.
