For those preferring a higher-level interface to socket programming, the IO::Socket module provides an object-oriented approach. IO::Socket is included as part of the standard Perl distribution as of the 5.004 release. If you're running an earlier version of Perl, just fetch IO::Socket from CPAN, where you'll also find modules providing easy interfaces to the following systems: DNS, FTP, Ident (RFC 931), NIS and NISPlus, NNTP, Ping, POP3, SMTP, SNMP, SSLeay, Telnet, and Time, just to name a few.
Here's a client that creates a
TCP connection to the ``daytime'' service at port 13
of the host name ``localhost'' and prints out everything that the server
there cares to provide.
    #!/usr/bin/perl -w
    use IO::Socket;
    $remote = IO::Socket::INET->new(
                  Proto    => "tcp",
                  PeerAddr => "localhost",
                  PeerPort => "daytime(13)",
              )
        or die "cannot connect to daytime port at localhost";
    while ( <$remote> ) { print }
When you run this program, you should get something back that looks like this:
    Wed May 14 08:40:46 MDT 1997
Here's what those parameters to the new constructor mean:
- Proto
This is the protocol to use. In this case, the socket handle returned will be connected to a TCP socket, because we want a stream-oriented connection, that is, one that acts pretty much like a plain old file. Not all sockets are of this type. For example, the UDP protocol can be used to make a datagram socket, used for message-passing.
- PeerAddr
This is the name or Internet address of the remote host the server is running on. We could have specified a longer name like "www.perl.com", or an address like "204.148.40.9". For demonstration purposes, we've used the special hostname "localhost", which should always mean the current machine you're running on. The corresponding Internet address for localhost is "127.1" (short for "127.0.0.1"), if you'd rather use that.
- PeerPort
This is the service name or port number we'd like to connect to. We could have gotten away with using just "daytime" on systems with a well-configured system services file (/etc/services under Unix), but just in case, we've specified the port number (13) in parentheses. Using just the number would also have worked, but constant numbers make careful programmers nervous.
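Not every machine runs a daytime service these days, so here's a minimal sketch that exercises the same three constructor parameters against a stand-in listener created in the same script. The listener, the numeric loopback address, and the canned reply are assumptions made purely for illustration; they are not part of the client above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket;

# Stand-in server (an assumption for this sketch): listen on the
# loopback address and let the system pick a free port.
my $server = IO::Socket::INET->new(
    Proto     => "tcp",
    LocalAddr => "127.0.0.1",
    Listen    => 1,
) or die "cannot listen on 127.0.0.1: $@";

my $port = $server->sockport;   # the port the system chose

# The client side uses the same parameters as the daytime client,
# with a plain port number instead of a service name.
my $remote = IO::Socket::INET->new(
    Proto    => "tcp",
    PeerAddr => "127.0.0.1",
    PeerPort => $port,
) or die "cannot connect to port $port at 127.0.0.1: $@";

# Play the server's part: accept the connection, send one line, hang up.
my $conn = $server->accept or die "accept failed: $!";
print $conn "Wed May 14 08:40:46 MDT 1997\n";
close $conn;

# Back on the client side, read what the "server" cared to provide.
my $reply = <$remote>;
print $reply;

close $remote;
close $server;
```

When the constructor fails, IO::Socket::INET leaves its explanation in $@, which is why the die messages here include it.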
Notice how the return value from the new constructor is used as a filehandle in the while loop? That's what's called an indirect filehandle, a scalar variable containing a filehandle. You can use it the same way you would a normal filehandle. For example, you can read one line from it this way:
    $line = <$handle>;
all remaining lines from it this way:
    @lines = <$handle>;
and send a line of data to it this way:
    print $handle "some data\n";
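If you'd like to try those three operations without a network connection, the following sketch uses in-memory filehandles (available in Perl 5.8 and later) as stand-ins for the socket. The variable names $text, $sink, and $out are made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# An in-memory filehandle stands in for the socket in this sketch.
my $text = "first\nsecond\nthird\n";
open(my $handle, "<", \$text) or die "cannot open in-memory handle: $!";

my $line  = <$handle>;    # scalar context: read one line
my @lines = <$handle>;    # list context: read all remaining lines
close $handle;

# Writing through an indirect filehandle works the same way.
my $out = "";
open(my $sink, ">", \$out) or die "cannot open in-memory handle: $!";
print $sink "some data\n";
close $sink;

print "one line: $line";
print "remaining lines: ", scalar(@lines), "\n";
print "written: $out";
```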
Here's a simple client that takes a remote host to fetch a document from,
and then a list of documents to get from that host. This is a more
interesting client than the previous one because it first sends something
to the server before fetching the server's response.
    #!/usr/bin/perl -w
    use IO::Socket;
    unless (@ARGV > 1) { die "usage: $0 host document ..." }
    $host = shift(@ARGV);
    foreach $document ( @ARGV ) {
        $remote = IO::Socket::INET->new( Proto    => "tcp",
                                         PeerAddr => $host,
                                         PeerPort => "http(80)",
                                       );
        unless ($remote) { die "cannot connect to http daemon on $host" }
        $remote->autoflush(1);
        print $remote "GET $document HTTP/1.0\n\n";
        while ( <$remote> ) { print }
        close $remote;
    }
The web server handling the ``http'' service is assumed to be at its standard port, number 80. If the web server you're trying to connect to is at a different port (like 1080 or 8080), you should specify it as the named-parameter pair PeerPort => 8080. The autoflush method is used on the socket because otherwise the system would buffer up the output we sent it. (If you're on a pre-Unix Mac, where "\n" is not a line feed, you'll also need to change every "\n" in your code that sends data over the network to be a "\015\012" instead.)
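One way to sidestep the line-ending question on every platform is to spell out the bytes yourself. Here's a rough sketch; the $EOL variable and the http_get_request helper are names invented for this example, not part of IO::Socket.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Carriage return plus line feed, written as octal escapes so the
# bytes are the same no matter what "\n" means locally.
my $EOL = "\015\012";

# Hypothetical helper: build the request text for one document.
# The request line is followed by an empty line to end the request.
sub http_get_request {
    my ($document) = @_;
    return "GET $document HTTP/1.0" . $EOL . $EOL;
}

my $request = http_get_request("/guanaco.html");
print length($request), " bytes to send\n";
```

In the client above you would then write print $remote http_get_request($document); in place of the literal string.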
Connecting to the server is only the first part of the process: once you have the connection, you have to use the server's language. Each server on the network has its own little command language that it expects as input. The string that we send to the server starting with
``GET'' is in
HTTP syntax. In this case, we simply request each specified document. Yes, we really are making a new connection for each document, even though it's the same host. That's the way you always used to have to speak
HTTP. Recent versions of web browsers may request that the remote server leave the connection open a little while, but the server doesn't have to honor such a request.
Here's an example of running that program, which we'll call webget:
    shell_prompt$ webget www.perl.com /guanaco.html
    HTTP/1.1 404 File Not Found
    Date: Thu, 08 May 1997 18:02:32 GMT
    Server: Apache/1.2b6
    Connection: close
    Content-type: text/html

    <HEAD><TITLE>404 File Not Found</TITLE></HEAD>
    <BODY><H1>File Not Found</H1>
    The requested URL /guanaco.html was not found on this server.<P>
    </BODY>
Ok, so that's not very interesting, because it didn't find that particular
document. But a long response wouldn't have fit on this page.
For a more fully-featured version of this program, you should look to the lwp-request program included with the
LWP modules from
CPAN.
Well, that's all fine if you want to send one command and get one answer,
but what about setting up something fully interactive, somewhat like the
way telnet works? That way you can type a line, get the answer, type a line, get the
answer, etc.
This client is more complicated than the two we've done so far, but if you're on a system that supports the powerful fork call, the solution isn't that rough. Once you've made the connection to whatever service you'd like to chat with, call fork to clone your process. Each of these two identical processes has a very simple job to do: the parent copies everything from the socket to standard output, while the child simultaneously copies everything from standard input to the socket. To accomplish the same thing using just one process would be much harder, because it's easier to code two processes to do one thing than it is to code one process to do two things. (This keep-it-simple principle is one of the cornerstones of the Unix philosophy, and of good software engineering too, which is probably why it's spread to other systems as well.)
Here's the code:
    #!/usr/bin/perl -w
    use strict;
    use IO::Socket;
    my ($host, $port, $kidpid, $handle, $line);

    unless (@ARGV == 2) { die "usage: $0 host port" }
    ($host, $port) = @ARGV;

    # create a tcp connection to the specified host and port
    $handle = IO::Socket::INET->new(Proto    => "tcp",
                                    PeerAddr => $host,
                                    PeerPort => $port)
        or die "can't connect to port $port on $host: $!";

    $handle->autoflush(1);              # so output gets there right away
    print STDERR "[Connected to $host:$port]\n";

    # split the program into two processes, identical twins
    die "can't fork: $!" unless defined($kidpid = fork());

    # the if{} block runs only in the parent process
    if ($kidpid) {
        # copy the socket to standard output
        while (defined ($line = <$handle>)) {
            print STDOUT $line;
        }
        kill("TERM", $kidpid);          # send SIGTERM to child
    }
    # the else{} block runs only in the child process
    else {
        # copy standard input to the socket
        while (defined ($line = <STDIN>)) {
            print $handle $line;
        }
    }
The kill function at the end of the parent's if block is there to send a signal to our child process (currently running in the else block) as soon as the remote server has closed its end of the connection. That eliminates the child, which would otherwise sit there waiting for more standard input.
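To watch that kill step in isolation, here's a small sketch in which the child merely sleeps as a stand-in for the loop blocked on standard input. The waitpid call and the check of $? are additions for this illustration; the client above doesn't wait for its child.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(SIGTERM);

my $kidpid = fork();
die "can't fork: $!" unless defined $kidpid;

if ($kidpid == 0) {
    # child: stand-in for the loop that blocks reading standard input
    sleep 60;
    exit 0;
}

# parent: pretend the remote server has just closed its end
kill("TERM", $kidpid);          # send SIGTERM to child

# Collect the child's status; the low seven bits of $? hold the
# number of the signal that ended it, if any.
waitpid($kidpid, 0);
my $signal = $? & 127;

print "child was ended by SIGTERM\n" if $signal == SIGTERM;
```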
If the remote server sends data a byte at a time, and you need that data
immediately without waiting for a newline (which might not happen), you may
wish to replace the while loop in the parent with the following:
    my $byte;
    while (sysread($handle, $byte, 1) == 1) {
        print STDOUT $byte;
    }
Making a system call for each byte you want to read is not very efficient
(to put it mildly) but is the simplest to explain and works reasonably
well.
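If the per-byte cost bothers you, one alternative is to ask sysread for up to a larger number of bytes at once. On a socket or pipe it normally returns as soon as some data is available rather than waiting to fill the buffer. The sketch below demonstrates the loop on a pipe standing in for the socket; the 4096-byte size and the $reader, $writer, $chunk, and $received names are arbitrary choices for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A pipe stands in for the socket so the sketch runs by itself.
pipe(my $reader, my $writer) or die "can't create pipe: $!";
syswrite($writer, "no newline here") or die "can't write: $!";
close $writer;

my $chunk;
my $received = "";
while (1) {
    # Read up to 4096 bytes; sysread returns the number of bytes read,
    # 0 at end of file, or undef on error.
    my $n = sysread($reader, $chunk, 4096);
    die "read error: $!" unless defined $n;
    last if $n == 0;
    $received .= $chunk;        # the client would print STDOUT $chunk here
}
close $reader;

print "$received\n";
```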
Source: Perl interprocess communication (signals, fifos, pipes, Copyright: Larry Wall, et al.