There are several I/O operators you should know about.
A string enclosed by backticks (grave accents) first undergoes variable substitution just like a double quoted string. It is then interpreted as a command, and the output of that command is the value of the pseudo-literal, like in a shell. In a scalar context, a single string consisting of all the output is returned. In a list context, a list of values is returned, one for each line of output. (You can set $/ to use a different line terminator.) The command is executed each time the pseudo-literal is evaluated. The status value of the command is returned in $? (see the perlvar manpage for the interpretation of $?). Unlike in csh, no translation is done on the return data--newlines remain newlines.
Unlike in any of the shells, single quotes do not hide variable names in
the command from interpretation. To pass a $ through to the shell you need
to hide it with a backslash. The generalized form of backticks is qx//. (Because backticks always undergo shell expansion as well, see the perlsec manpage for security concerns.)
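As a minimal sketch of the context and quoting rules above, assuming a Unix-like shell in which echo is available (the variable names, and the shell variable FOO, are illustrative only):

```perl
use strict;
use warnings;

# Scalar context: all of the command's output arrives as one string.
my $output = `echo one; echo two`;

# List context: one element per line of output, newlines kept.
my @lines = `echo one; echo two`;

# The command's exit code is in the high byte of $?.
my $exit_code = $? >> 8;

# The backslash hides the $ from Perl, so the shell (not Perl)
# expands FOO.  Without it, Perl would look for a Perl variable $FOO.
my $from_shell = `FOO=bar; echo \$FOO`;

print "scalar: $output";
print "list has ", scalar(@lines), " elements\n";
print "exit code: $exit_code\n";
print "shell expanded: $from_shell";
```

The same commands can be written with the generalized form, for example qx{echo one; echo two}.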
Evaluating a filehandle in angle brackets yields the next line from that
file (newline, if any, included), or undef at end of file. Ordinarily you must assign that value to a variable, but
there is one situation where an automatic assignment happens. If and ONLY if the input symbol is the only thing inside the conditional of a while or
for(;;) loop, the value is automatically assigned to the variable $_.
The assigned value is then tested to see if it is defined. (This may seem
like an odd thing to you, but you'll use the construct in almost every Perl
script you write.) Anyway, the following lines are equivalent to each
other:
    while (defined($_ = <STDIN>)) { print; }
    while (<STDIN>) { print; }
    for (;<STDIN>;) { print; }
    print while defined($_ = <STDIN>);
    print while <STDIN>;
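The reason the test is for definedness rather than truth can be seen in a small sketch. It reads from an in-memory filehandle instead of STDIN so that it runs on its own; the sample text and variable names are assumptions made for the illustration:

```perl
use strict;
use warnings;

# The last line is "0" with no trailing newline: a false value in Perl,
# but still a defined one.
my $text = "alpha\nbeta\n0";
open(my $in, '<', \$text) or die "cannot open in-memory file: $!";

my @seen;
while (<$in>) {        # implicitly: while (defined($_ = <$in>))
    chomp;
    push @seen, $_;
}
close($in);

print join(",", @seen), "\n";
```

All three lines are collected, including the final "0", because the loop stops only when the read returns undef.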
The filehandles STDIN, STDOUT, and STDERR are predefined. (The filehandles stdin, stdout, and stderr will also work except in packages, where they would be interpreted as local identifiers rather than global.) Additional filehandles may be created with the open() function. See open() for details on this.
If a <FILEHANDLE> is used in a context that is looking for a list, a list consisting of all the input lines is returned, one line per list element. It's easy to make a LARGE data space this way, so use with care.
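A short sketch of the difference between the two contexts, again using an in-memory filehandle with made-up sample text so that it is self-contained:

```perl
use strict;
use warnings;

my $text = "first\nsecond\nthird\n";
open(my $in, '<', \$text) or die "cannot open in-memory file: $!";

my $first = <$in>;    # scalar context: reads exactly one line
my @rest  = <$in>;    # list context: reads every remaining line at once

close($in);

chomp($first);
print "got '$first', then ", scalar(@rest), " more lines in one read\n";
```

For a large file the list form holds every line in memory at the same time, which is the data space concern mentioned above.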
The null filehandle <> is special and can be used to emulate the behavior of sed and awk. Input from <> comes either from standard input, or from each file listed on the command line. Here's how it works: the first time <> is evaluated, the @ARGV array is checked, and if it is empty, $ARGV[0] is set to "-", which when opened gives you standard input. The @ARGV array is then processed as a list of filenames. The loop
    while (<>) {
        ... # code for each line
    }
is equivalent to the following Perl-like pseudo code:
    unshift(@ARGV, '-') unless @ARGV;
    while ($ARGV = shift) {
        open(ARGV, $ARGV);
        while (<ARGV>) {
            ... # code for each line
        }
    }
except that it isn't so cumbersome to say, and will actually work. It really does shift array @ARGV and put the current filename into variable $ARGV. It also uses filehandle ARGV internally--<> is just a synonym for <ARGV>, which is magical. (The pseudo code above doesn't work because it treats <ARGV> as non-magical.)
You can modify @ARGV before the first <> as long as the array ends up containing the list of filenames you really want. Line numbers ($.) continue as if the input were one big happy file. (But see example under eof() for how to reset line numbers on each file.)
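The following sketch shows $ARGV and $. in action. It assumes the core File::Temp module is available and writes two throwaway files to stand in for names given on the command line; the file contents and the "first"/"second" labels are invented for the illustration:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create two small temporary files to play the role of command-line arguments.
my @names;
for my $content ("a1\na2\n", "b1\n") {
    my ($fh, $name) = tempfile(UNLINK => 1);
    print {$fh} $content;
    close($fh) or die "cannot close $name: $!";
    push @names, $name;
}

# Modify @ARGV before the first <>, as described above.
@ARGV = @names;

my @report;
while (<>) {
    chomp;
    # $ARGV holds the name of the file currently being read.
    my $which = $ARGV eq $names[0] ? 'first' : 'second';
    push @report, "$which $.:$_";
}

print "$_\n" for @report;
```

The line from the second file is numbered 3, not 1, because $. keeps counting across files.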
If you want to set @ARGV to your own list of files, go right
ahead. If you want to pass switches into your script, you can use one of
the Getopts modules or put a loop on the front like this:
    while ($_ = $ARGV[0], /^-/) {
        shift;
        last if /^--$/;
        if (/^-D(.*)/) { $debug = $1 }
        if (/^-v/) { $verbose++ }
        ... # other switches
    }

    while (<>) {
        ... # code for each line
    }
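For comparison, here is a sketch of the module route, assuming the core Getopt::Std module. The -v and -D switches mirror the hand-written loop above, while the sample command line and the %opts hash name are made up for the illustration:

```perl
use strict;
use warnings;
use Getopt::Std qw(getopts);

# Stand-in for a real command line, so the sketch runs on its own.
@ARGV = ('-v', '-Dtrace', 'input.txt');

# 'v' is a plain flag; 'D:' means -D takes an argument.
my %opts;
getopts('vD:', \%opts) or die "usage: $0 [-v] [-Dvalue] [file ...]\n";

my $verbose = $opts{v} ? 1 : 0;
my $debug   = defined $opts{D} ? $opts{D} : '';

print "verbose=$verbose debug=$debug remaining=@ARGV\n";
```

getopts() removes the switches it recognizes from @ARGV, so a following while (<>) loop sees only the filenames.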
The <> symbol will return FALSE only once. If you call it again after this it will assume you are processing another @ARGV list, and if you haven't set @ARGV, will input from STDIN.
If the string inside the angle brackets is a simple scalar variable
(e.g., <$foo>), then that variable contains the name of the filehandle to input from, or
a reference to the same. For example:
    $fh = \*STDIN;
    $line = <$fh>;
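One common use of an indirect filehandle is passing a handle to a subroutine. The sketch below assumes a hypothetical helper named count_lines() and reads from in-memory files so that it does not depend on STDIN:

```perl
use strict;
use warnings;

# Hypothetical helper: counts the lines left on whatever handle it is given.
sub count_lines {
    my ($fh) = @_;
    my $count = 0;
    while (defined(my $line = <$fh>)) {
        $count++;
    }
    return $count;
}

my $text = "one\ntwo\nthree\n";

# A reference to a named filehandle, in the style of \*STDIN above.
open(SAMPLE, '<', \$text) or die "cannot open in-memory file: $!";
my $via_glob_ref = count_lines(\*SAMPLE);
close(SAMPLE);

# A lexical filehandle works the same way.
open(my $in, '<', \$text) or die "cannot open in-memory file: $!";
my $via_lexical = count_lines($in);
close($in);

print "glob reference: $via_glob_ref, lexical handle: $via_lexical\n";
```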
If the string inside angle brackets is not a filehandle or a scalar
variable containing a filehandle name or reference, then it is interpreted
as a filename pattern to be globbed, and either a list of filenames or the
next filename in the list is returned, depending on context. One level of $
interpretation is done first, but you can't say <$foo>
because that's an indirect filehandle as explained in the previous
paragraph. (In older versions of Perl, programmers would insert curly
brackets to force interpretation as a filename glob: <${foo}>. These days, it's considered cleaner to call the internal function
directly as glob($foo), which is probably the right way to have done it in the first place.)
Example:
    while (<*.c>) {
        chmod 0644, $_;
    }
is equivalent to
    open(FOO, "echo *.c | tr -s ' \t\r\f' '\\012\\012\\012\\012'|");
    while (<FOO>) {
        chop;
        chmod 0644, $_;
    }
In fact, it's currently implemented that way. (Which means it will not work
on filenames with spaces in them unless you have csh(1) on
your machine.) Of course, the shortest way to do the above is:
    chmod 0644, <*.c>;
Because globbing invokes a shell, it's often faster to call
readdir() yourself and do your own grep() on the
filenames. Furthermore, due to its current implementation of using a shell,
the glob() routine may get "Arg list too long" errors
(unless you've installed tcsh(1L) as /bin/csh).
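The readdir() alternative mentioned above might look like the following sketch. It assumes the core File::Temp module and builds a throwaway directory with invented filenames so that it can run anywhere; in a real script you would open the directory you actually care about:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Build a temporary directory with a few sample files.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(main.c util.c notes.txt)) {
    open(my $fh, '>', "$dir/$name") or die "cannot create $name: $!";
    close($fh) or die "cannot close $name: $!";
}

# Read the directory ourselves and keep the names ending in .c.
# Like the *.c pattern, this skips names that begin with a dot.
opendir(my $dh, $dir) or die "cannot opendir $dir: $!";
my @c_files = sort grep { /^[^.].*\.c\z/ } readdir($dh);
closedir($dh);

# readdir() returns bare names, so prepend the directory before use.
my $changed = chmod 0644, map { "$dir/$_" } @c_files;

print "changed $changed files: @c_files\n";
```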
A glob evaluates its (embedded) argument only when it is starting a new list. All values must be read before it will start over. In a list context this isn't important, because you automatically get them all anyway. In a scalar context, however, the operator returns the next value each time it is called, or a FALSE value if you've just run out. Again, FALSE is returned only once. So if you're expecting a single value from a glob, it is much better to say
    ($file) = <blurch*>;
than
    $file = <blurch*>;
because the latter will alternate between returning a filename and returning FALSE.
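The alternation can be observed with a small sketch. It assumes the core File::Temp module and a temporary directory whose path contains no spaces or glob metacharacters; the file name blurch1 is made up to match the pattern used above:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# A temporary directory holding exactly one file that matches blurch*.
my $dir = tempdir(CLEANUP => 1);
open(my $fh, '>', "$dir/blurch1") or die "cannot create file: $!";
close($fh) or die "cannot close file: $!";

# List assignment: the whole list is produced and the first name is kept.
my ($file) = glob("$dir/blurch*");

# Scalar context: the same glob, evaluated repeatedly, steps through its
# list, reports exhaustion once, and then starts over.
my @seen;
for my $try (1 .. 4) {
    my $next = glob("$dir/blurch*");
    push @seen, defined $next ? 'name' : 'undef';
}

print "list assignment found a file\n" if defined $file;
print "scalar calls returned: @seen\n";
```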
If you're trying to do variable interpolation, it's definitely better to
use the glob() function, because the older notation is easily
confused with the indirect filehandle notation.
    @files = glob("$dir/*.[ch]");
    @files = glob($files[$i]);
Source: Perl operators and precedence. Copyright: Larry Wall, et al.