Perl CGI Programming: No Experience Required
(Publisher: Sybex, Inc.)
Author(s): Erik Strom
ISBN: 0782121578
Publication Date: 11/01/97
Skill 9 Monitoring Web Site Activity
- Using log files and simple reports
- Extracting log file information
- Monitoring activity from a Web page
Web server software is written to connect with the whole wide world. Such grand ambition may not be completely fulfilled at your Web site, though you certainly can give it a try. But whether large or small, heavily visited or not, your Web server generates a nearly breathtaking amount of information about itself and what your visitors do when they connect with your site.
As a beginning Webmaster, you may not have looked at the logs created by your server or even know where the log files are. However, after completing this skill, you certainly will be able to find the logs and use them to your distinct advantage.
Using Log Files and Simple Reports
The logs maintained by your Web server contain a voluminous amount of information, which you can use in any way you like. Using a Perl script as your template, it isn't difficult to extract simple information from the log files and fashion the information into a report.
The benefits of keeping track of people who visit your Web site may not be immediately apparent. Also, when you consider that the standard server log records not just every visit to the site but every transaction made during the connection, you may begin to wonder what you're going to do with all of that information.
The benefits depend entirely on how you plan to use your Web site. When the World Wide Web began to become popular, most non-corporate Webmasters who could afford to hang a server on the Internet were running what could be termed vanity Web sites: "Hi, welcome to my Web site, here's some pictures of the kids." Increasingly, however, the Web is being regarded as a vehicle for commercial enterprises. Web sites are actually selling advertising space to businesses, which can then pop their messages up in your browser when you visit the sites. Most of these ads are set up as links to the advertisers' own Web sites.
Whether advertising on the Web will become a profitable venture may be arguable. But you can't argue with numbers, and most of the statistics you need to put a marketing profile together are available in your server logs. Because every transaction is recorded, you can see where the visitor came from and what stuff on your site was actually visited. You can take these two statistics and do anything you like with them, wrong or right. Your analyses may be way off the mark; that's up to you. But think about the advantage this gives you over other advertisers: Newspapers and other more traditional advertising media have to depend on surveys and polls to develop profiles of the type of person who reads a particular ad. They have to depend on the honesty of the people participating in the surveys and polls to determine who even looks at the ads.
On a Web site, you have no such vagaries to contend with. The information is there, always at your fingertips. All you have to do is take advantage of it.
Decoding a Log File
To the uninitiated, a Web server log file looks just plain weird. You'll learn in this section what each of the log entries means, and how to decode an entry.
The log files are text files; you can even edit them if you like, though this is not a recommended practice. There are a couple of formats and several log file locations to deal with, depending on the server.
TIP: Internet Information Server stores its log files in winnt\system32\logfiles with a different file for each day. The NCSA server puts its logging information in a file called access_log, which can be found where the HTTPD server was installed. The Sambar server puts its log files in a logs directory in its main installation location. Sambar logs several other aspects of a Web connection, including the types of software (browsers, mostly) that are used to connect with it and the local error messages from the server. It keeps its logging files in access.log.
Getting Information from IIS Logs
One of Internet Information Server's log files, opened with Notepad, is illustrated in Figure 9.1.
Figure 9.1: The raw information in an IIS log file
As you can see in Figure 9.1, it's not pretty. But all of it means something. Let's start by taking one line apart:
152.163.195.39, -, 6/13/97, 21:37:00, W3SVC, OWSLEY, 207.77.84.202,
⇒ 328, 60, 29, 304, 0, GET, /wml/homepage.gif, -,
Normally, log entries span only one line; the example is broken because that's the only way it will fit on the page.
Notice that each of the 15 entries is separated from the others with a comma. This is how you distinguish them, and this will also be Perl's hint for breaking out entries when you run the log information into a report.
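As a minimal sketch of that idea, and assuming the sample entry has been joined back into the single line it occupies in the real log, Perl's built-in split function can break the entry apart at the commas. The variable names here are illustrative only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The sample entry from the text, rejoined into one line.
my $line = '152.163.195.39, -, 6/13/97, 21:37:00, W3SVC, OWSLEY, 207.77.84.202, '
         . '328, 60, 29, 304, 0, GET, /wml/homepage.gif, -,';

# Split on a comma followed by optional white space. By default, split
# discards the empty trailing field left by the final comma.
my @fields = split /,\s*/, $line;

print scalar(@fields), " entries\n";   # prints "15 entries"
print "Client: $fields[0]\n";          # first entry (index 0)
print "Target: $fields[13]\n";         # fourteenth entry (index 13)
```

Because Perl arrays are numbered from zero, the first entry is $fields[0] and the fifteenth is $fields[14].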
The entries in IIS's log file are broken down in this way:
- IP address of the client
- Client's user name, which is always empty (-) in IIS
- Date of request
- Time of request
- Service name (always W3SVC in IIS)
- Web server's name (computer name)
- Server IP address
- Elapsed CPU time (in milliseconds) of the operation
- Number of bytes received
- Number of bytes sent
- Service status code
- Windows NT status code
- Operation requested
- Target of the operation
- Ender (always -)
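The list above can be turned into a set of labels for the pieces of a log line. The following is only a sketch: the field names and the parse_iis_line subroutine are made up for illustration and are not part of IIS or Perl. It assumes every entry has exactly the 15 comma-separated fields just described.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical names, one for each item in the list, in the same order.
my @names = qw(client_ip user_name date time service server_name
               server_ip elapsed_ms bytes_received bytes_sent
               service_status nt_status operation target ender);

# Turn one raw log line into a reference to a hash keyed by field name.
# Returns undef for any line that does not have exactly 15 fields.
sub parse_iis_line {
    my ($line) = @_;
    chomp $line;
    my @values = split /,\s*/, $line;
    return unless @values == @names;
    my %entry;
    @entry{@names} = @values;    # hash slice: pair each name with its value
    return \%entry;
}

my $sample = '152.163.195.39, -, 6/13/97, 21:37:00, W3SVC, OWSLEY, '
           . '207.77.84.202, 328, 60, 29, 304, 0, GET, /wml/homepage.gif, -,';

my $entry = parse_iis_line($sample);
print "$entry->{client_ip} asked for $entry->{target} ",
      "(status $entry->{service_status})\n" if $entry;
```

Working with names such as $entry->{target} rather than numbered positions makes a report script easier to read, at the cost of keeping the name list in step with the log format.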
As illustrated in Figure 9.2, each of the entries in our list is strung out in a line of text in the log file.
Figure 9.2: The format of an IIS log file entry