study SCALAR
study
Takes extra time to study SCALAR ($_ if unspecified) in anticipation of doing many pattern matches on the string before it is next modified. This may or may not save time, depending on the nature and number of patterns you are searching on, and on the distribution of character frequencies in the string to be searched. You probably want to compare run times with and without it to see which runs faster. Those loops which scan for many short constant strings (including the constant parts of more complex patterns) will benefit most. You may have only one study active at a time: if you study a different scalar, the first is "unstudied".
(The way study works is this: a linked list of every character in the string to be searched is made, so we know, for example, where all the 'k' characters are. From each search string, the rarest character is selected, based on some static frequency tables constructed from some C programs and English text. Only those places that contain this "rarest" character are examined.)
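The mechanism described above can be sketched in plain Perl. This is only an illustration of the idea, not perl's internal implementation: the subroutine names (build_index, rarest_char, find_all) are hypothetical, the frequency table is a rough approximation of English letter frequencies chosen for this example, and a hash of position lists stands in for the linked list.

```perl
use strict;
use warnings;

# Assumed, approximate frequencies (percent) for illustration only.
# perl's real tables were static and built from C programs and English text.
my %freq = (
    ' ' => 18.0, e => 12.7, t => 9.1, a => 8.2, o => 7.5, i => 7.0,
    n => 6.7, s => 6.3, h => 6.1, r => 6.0, d => 4.3, l => 4.0,
    u => 2.8, c => 2.8, m => 2.4, w => 2.4, f => 2.2, g => 2.0,
    y => 2.0, p => 1.9, b => 1.5, v => 1.0, k => 0.8,
);

# Record, for every character, the positions at which it occurs.
sub build_index {
    my ($text) = @_;
    my %pos;
    for my $i (0 .. length($text) - 1) {
        push @{ $pos{ substr($text, $i, 1) } }, $i;
    }
    return \%pos;
}

# Return the least frequent character of $needle and its offset in $needle.
# Characters missing from the table are assumed to be rare (0.5).
sub rarest_char {
    my ($needle) = @_;
    my ($best_char, $best_off, $best_freq);
    for my $i (0 .. length($needle) - 1) {
        my $c = substr($needle, $i, 1);
        my $f = exists $freq{$c} ? $freq{$c} : 0.5;
        if (!defined $best_freq || $f < $best_freq) {
            ($best_char, $best_off, $best_freq) = ($c, $i, $f);
        }
    }
    return ($best_char, $best_off);
}

# Find every offset of the literal $needle, examining only those
# positions in $text that hold the needle's rarest character.
sub find_all {
    my ($text, $index, $needle) = @_;
    return () if $needle eq '';
    my ($c, $off) = rarest_char($needle);
    my @hits;
    for my $p (@{ $index->{$c} || [] }) {
        my $start = $p - $off;
        next if $start < 0;
        push @hits, $start
            if substr($text, $start, length $needle) eq $needle;
    }
    return @hits;
}

my $text  = "a fool and his foo are soon parted";
my $index = build_index($text);
print join(" ", find_all($text, $index, "foo")), "\n";   # prints: 2 15
```

Building the index costs one pass over the string, which is the "extra time" mentioned above. Each search then visits only the two 'f' positions instead of all 34 characters.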
For example, here is a loop that inserts index-producing entries before
any line containing a certain pattern:
    while (<>) {
        study;
        print ".IX foo\n" if /\bfoo\b/;
        print ".IX bar\n" if /\bbar\b/;
        print ".IX blurfl\n" if /\bblurfl\b/;
        ...
        print;
    }
In searching for /\bfoo\b/, only those locations in $_ that
contain "f" will be looked at, because "f" is rarer than "o". In
general, this is a big win except in pathological cases. The only question
is whether it saves you more time than it took to build the linked list in
the first place.
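One way to answer that question is to time the loop both ways with the core Benchmark module. The following is a minimal sketch: the sample data and the scan subroutine are made up for illustration, and results depend heavily on the input and on the Perl version (since Perl 5.16 study has been a no-op, so the two variants should be expected to perform about the same there).

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample input: 2000 lines, every 50th containing keywords.
my @lines = map {
    "line $_ of some filler text"
        . ($_ % 50 == 0 ? " with foo and bar" : "") . "\n"
} 1 .. 2000;

# Count pattern hits over all lines, optionally studying each line first.
sub scan {
    my ($use_study) = @_;
    my $count = 0;
    for (@lines) {                 # $_ is aliased to each line in turn
        study $_ if $use_study;
        $count++ if /\bfoo\b/;
        $count++ if /\bbar\b/;
        $count++ if /\bblurfl\b/;
    }
    return $count;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    with_study    => sub { scan(1) },
    without_study => sub { scan(0) },
});
```

For a meaningful comparison, replace @lines with a sample of the real data and use the real set of patterns.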
Note that if you have to look for strings that you don't know until runtime,
you can build an entire loop as a string and eval that to avoid recompiling
all your patterns all the time. Together with undefining $/ to input entire
files as one record, this can be very fast, often faster than specialized
programs like fgrep(1).
The following scans a list of files (@files) for a list of words (@words), and prints out the names of those files that contain a match:
    $search = 'while (<>) { study;';
    foreach $word (@words) {
        $search .= "++\$seen{\$ARGV} if /\\b$word\\b/;\n";
    }
    $search .= "}";
    @ARGV = @files;
    undef $/;
    eval $search;               # this screams
    $/ = "\n";                  # put back to normal input delimiter
    foreach $file (sort keys(%seen)) {
        print $file, "\n";
    }
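As an alternative to generating code as a string, patterns known only at runtime can be compiled once with the qr// operator and reused. The sketch below is a different technique from the eval loop above, not a drop-in equivalent: the subroutine name files_matching is hypothetical, it does not call study, and it uses quotemeta to treat each word as literal text, whereas the eval version interpolates the words as raw patterns.

```perl
use strict;
use warnings;

# Return the sorted names of the files in @$files that contain at least
# one of @$words as a whole word.
sub files_matching {
    my ($files, $words) = @_;

    # Compile each pattern exactly once. quotemeta makes the word literal
    # (an assumption of this sketch, see the note above).
    my @patterns = map { my $w = quotemeta($_); qr/\b$w\b/ } @$words;

    my %seen;
    for my $file (@$files) {
        open my $fh, '<', $file or do { warn "$file: $!\n"; next };
        my $text = do { local $/; <$fh> };   # read the whole file as one record
        close $fh;
        next unless defined $text;
        for my $re (@patterns) {
            if ($text =~ $re) { $seen{$file}++; last }
        }
    }
    return sort keys %seen;
}

# Usage: file names on the command line, with a fixed word list as an example.
print "$_\n" for files_matching(\@ARGV, [qw(foo bar blurfl)]);
```

Using local $/ inside the do block restores the input record separator automatically, so no explicit reset to "\n" is needed.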
Source: Perl builtin functions. Copyright: Larry Wall, et al.
(Corrections, notes, and links courtesy of RocketAware.com)
RocketAware.com is a service of Mib Software Copyright 2000, Forrest J. Cavalier III. All Rights Reserved. We welcome submissions and comments