Perl CGI Programming: No experience required.
(Publisher: Sybex, Inc.)
Author(s): Erik Strom
ISBN: 0782121578
Publication Date: 11/01/97


Skill 9
Monitoring Web Site Activity

  Using log files and simple reports
  Extracting log file information
  Monitoring activity from a Web page

Web server software is written to connect with the whole wide world. Such grand ambition may not be completely fulfilled at your Web site, though you certainly can give it a try. But whether large or small, heavily visited or not, your Web server generates a nearly breathtaking amount of information about itself and what your visitors do when they connect with your site.

As a beginning Webmaster, you may not have looked at the logs created by your server or even know where the log files are. However, after completing this skill, you certainly will be able to find the logs—and use them to your distinct advantage.

Using Log Files and Simple Reports

The logs maintained by your Web server contain a voluminous amount of information, which you can use in any way you like. With a Perl script as your template, you can easily extract simple information from the log files and fashion it into a report.

The benefits of keeping track of people who visit your Web site may not be immediately apparent. Also, when you consider that the standard server log records not just every visit to the site but every transaction made during the connection, you may begin to wonder what you’re going to do with all of that information.

The benefits depend entirely on how you plan to use your Web site. When the World Wide Web began to become popular, most non-corporate Webmasters who could afford to hang a server on the Internet were running what could be termed “vanity” Web sites: “Hi, welcome to my Web site, here’s some pictures of the kids.” Increasingly, however, the Web is being regarded as a vehicle for commercial enterprises. Web sites are actually selling advertising space to businesses, which can then pop their messages up in your browser when you visit the sites. Most of these ads are set up as links to the advertisers’ own Web sites.

Whether advertising on the Web will become a profitable venture may be arguable. But you can’t argue with numbers, and most of the statistics you need to put a marketing profile together are available in your server logs. Because every transaction is recorded, you can see where the visitor came from and which pages and files on your site were actually requested. You can take these two statistics and do anything you like with them, wrong or right. Your analyses may be way off the mark; that’s up to you. But think about the advantage this gives you over other advertisers: Newspapers and other more traditional advertising media have to depend on surveys and polls to develop profiles of the type of person who reads a particular ad. They have to depend on the honesty of the people participating in the surveys and polls to determine who even looks at the ads.

On a Web site, you have no such vagaries to contend with. The information is there, always at your fingertips. All you have to do is take advantage of it.

Decoding a Log File

To the uninitiated, a Web server log file looks just plain weird. You’ll learn in this section what each of the log entries means, and how to decode an entry.

The log files are text files—you can even edit them if you like, though this is not a recommended practice. There are a couple of formats and several log file locations to deal with, depending on the server.


TIP:  Internet Information Server stores its log files in winnt\system32\logfiles with a different file for each day. The NCSA server puts its logging information in a file called access_log, which can be found where the HTTPD server was installed. The Sambar server puts its log files in a logs directory in its main installation location and keeps its access log in access.log. Sambar also logs several other aspects of a Web connection, including the types of software (browsers, mostly) that are used to connect with it and the local error messages from the server.

Getting Information from IIS Logs

One of Internet Information Server’s log files, opened with Notepad, is illustrated in Figure 9.1.


Figure 9.1:  The raw information in an IIS log file

As you can see in Figure 9.1, it’s not pretty. But all of it means something. Let’s start by taking one line apart:

   152.163.195.39, -, 6/13/97, 21:37:00, W3SVC, OWSLEY, 207.77.84.202,
   ⇒ 328, 60, 29, 304, 0, GET, /wml/homepage.gif, -,

Normally, log entries span only one line; the example is broken because that’s the only way it will fit on the page.

Notice that each of the 15 entries is separated from the others with a comma. This is how you distinguish them, and this will also be Perl’s hint for breaking out entries when you run the log information into a report.
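
As a minimal sketch of how Perl can use the commas as its hint, the following splits the sample entry into its 15 parts. The variable names are chosen here for illustration only, and the sketch assumes that no entry contains a comma of its own.

```perl
use strict;
use warnings;

# The sample entry from the text, joined back onto a single line.
my $line = '152.163.195.39, -, 6/13/97, 21:37:00, W3SVC, OWSLEY, 207.77.84.202, '
         . '328, 60, 29, 304, 0, GET, /wml/homepage.gif, -,';

# Split on a comma plus any whitespace that follows it. Perl's split
# drops trailing empty fields, so the final comma adds no extra entry.
my @entries = split /,\s*/, $line;

print scalar(@entries), " entries\n";    # prints "15 entries"
print "Client: $entries[0]\n";
print "Target: $entries[13]\n";
```

Array indexes start at zero, so the client address is $entries[0] and the target of the operation, the 14th entry, is $entries[13].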

The entries in IIS’s log file are broken down in this way:

  IP address of the client
  Client’s user name, which is empty (-) for anonymous visitors
  Date of request
  Time of request
  Service name (always W3SVC in IIS)
  Web server’s name (computer name)
  Server IP address
  Elapsed CPU time (in milliseconds) of the operation
  Number of bytes received
  Number of bytes sent
  Service status code
  Windows NT status code
  Operation requested
  Target of the operation
  Ender (always -)
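
One way to keep these positions straight in a script is to give each entry a name. The sketch below is an illustration, not part of IIS or of any Perl module: the field names and the parse_iis_line subroutine are hypothetical, made up to mirror the list above in order.

```perl
use strict;
use warnings;

# Hypothetical field names, one per item in the list above, in order.
my @FIELD_NAMES = qw(
    client_ip  user_name  date  time  service  server_name  server_ip
    elapsed_ms  bytes_received  bytes_sent  service_status  nt_status
    operation  target  ender
);

# Turn one raw log line into a hash reference keyed by field name.
# Returns nothing (undef in scalar context) unless the line holds
# exactly 15 entries.
sub parse_iis_line {
    my ($line) = @_;
    $line =~ s/\s+\z//;                  # strip the line ending
    my @values = split /,\s*/, $line;
    return unless @values == @FIELD_NAMES;
    my %entry;
    @entry{@FIELD_NAMES} = @values;      # hash slice: pair names with values
    return \%entry;
}

my $entry = parse_iis_line(
    '152.163.195.39, -, 6/13/97, 21:37:00, W3SVC, OWSLEY, 207.77.84.202, '
  . '328, 60, 29, 304, 0, GET, /wml/homepage.gif, -,'
);

print "$entry->{client_ip} asked for $entry->{target} ",
      "and got status $entry->{service_status}\n";
```

Working with $entry->{target} rather than a bare index such as 13 makes a report script easier to read, and the entry count check quietly rejects blank or damaged lines.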

As illustrated in Figure 9.2, each of the entries in our list is strung out in a line of text in the log file.


Figure 9.2:  The format of an IIS log file entry
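
The two statistics mentioned earlier, where visitors came from and what they requested, can be tallied from entries in this format. What follows is only a rough sketch: the subroutine names tally_log and print_report are hypothetical, and apart from the first line, which is the example from the text, the sample entries are invented purely for illustration. In a real script the lines would be read from one of the daily log files.

```perl
use strict;
use warnings;

# Count requests per target and per client from a list of log lines.
sub tally_log {
    my (@lines) = @_;
    my (%by_target, %by_client);
    for my $line (@lines) {
        my @f = split /,\s*/, $line;
        next unless @f == 15;       # skip blank or malformed lines
        $by_client{ $f[0] }++;      # entry 1: IP address of the client
        $by_target{ $f[13] }++;     # entry 14: target of the operation
    }
    return (\%by_target, \%by_client);
}

# Print one section of the report, busiest items first.
sub print_report {
    my ($title, $counts) = @_;
    print "$title\n";
    my @keys = sort { $counts->{$b} <=> $counts->{$a} || $a cmp $b }
               keys %$counts;
    printf "  %-25s %d\n", $_, $counts->{$_} for @keys;
}

my @sample = (
    # The example entry from the text:
    '152.163.195.39, -, 6/13/97, 21:37:00, W3SVC, OWSLEY, 207.77.84.202, '
      . '328, 60, 29, 304, 0, GET, /wml/homepage.gif, -,',
    # Invented entries, for illustration only:
    '192.0.2.10, -, 6/13/97, 21:38:10, W3SVC, OWSLEY, 207.77.84.202, '
      . '100, 60, 500, 200, 0, GET, /wml/index.htm, -,',
    '192.0.2.10, -, 6/13/97, 21:38:12, W3SVC, OWSLEY, 207.77.84.202, '
      . '90, 60, 29, 200, 0, GET, /wml/homepage.gif, -,',
    '',
);

my ($by_target, $by_client) = tally_log(@sample);
print_report('Requests by target', $by_target);
print_report('Requests by client', $by_client);
```

Hashes suit this kind of counting because a new key springs into existence the first time it is incremented, so the script needs no advance list of pages or visitors.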



Use of this site is subject to certain Terms & Conditions, Copyright © 1996-2000 EarthWeb Inc.
All rights reserved. Reproduction whole or in part in any form or medium without express written permission of EarthWeb is prohibited.