
Perl CGI Programming: No experience required.
(Publisher: Sybex, Inc.)
Author(s): Erik Strom
ISBN: 0782121578
Publication Date: 11/01/97



Here’s a recap of the Perl operators and functions we’ve covered so far, with their formal parameters:

  tr/SEARCH_LIST/REPLACE_LIST/ Translates each character found in SEARCH_LIST to the corresponding character in REPLACE_LIST. The lists are sets of characters, not regular expressions.
  hex (EXPRESSION) Interprets EXPRESSION as a hexadecimal number and returns the decimal value. For example, hex (10) would return 16.
  pack (TEMPLATE, EXPRESSION) Packs EXPRESSION into a binary structure based on TEMPLATE.
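
As a minimal sketch of how these three work together, the snippet below decodes one URL-encoded value. The sample value and the variable names are made up for illustration; they are not taken from the earlier examples.

```perl
use strict;
use warnings;

# Hypothetical form value as it might arrive in a URL.
my $value = 'Erik+Strom%21';

# tr maps each character in the first list to the one in the second,
# so every plus sign becomes a space.
(my $decoded = $value) =~ tr/+/ /;

# hex turns the two hexadecimal digits into a number; pack with the
# "C" template turns that number into the character with that code.
$decoded =~ s/%([0-9A-Fa-f]{2})/pack("C", hex($1))/ge;

print "$decoded\n";    # prints "Erik Strom!"
```

The /e modifier tells Perl to evaluate the replacement as code, and /g repeats the substitution for every %nn sequence in the string.
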
You’re Right: Regular Expressions Are Hard

One of Perl’s biggest strengths over bare-metal programming languages such as C and C++ comes from its ability to format text so easily. However, it uses regular expressions heavily to accomplish the formatting. Make no mistake about it, regular expressions are difficult.

But another of Perl’s strengths is that you can write a workable, useful program in it without knowing every detail of the language, something you’ve already demonstrated in the examples we’ve built so far. As you gain more proficiency with Perl, your programs will utilize the knowledge you’ve gained, too.

You will learn about some hairy details of regular expressions in this skill, but don’t be too concerned if you don’t get it right away. With practice, it’ll come to you.

Meanwhile, it’s helpful to think of regular expressions as nothing more than a search-and-replace function on steroids.

Regular Expressions in Detail

A regular expression describes a pattern to match in a string; the matched text can then, optionally, be replaced with something else. A pattern may offer several alternatives separated by a vertical bar (|). The alternatives are tried from left to right, and matching stops at the first one that succeeds.
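
The left-to-right rule matters when one alternative is a prefix of another. Here is a small sketch, using a made-up test string, that shows the order of the alternatives changing what gets matched:

```perl
use strict;
use warnings;

my $text = 'catalog';

# "cat" comes first and succeeds, so "catalog" is never tried.
my ($first) = $text =~ /(cat|catalog)/;

# With the longer alternative first, the whole word is matched.
my ($second) = $text =~ /(catalog|cat)/;

print "$first\n";     # prints "cat"
print "$second\n";    # prints "catalog"
```
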

The building blocks of a regular expression are the characters used to represent events or other characters.

  ^ stands for the beginning of a string.
  $ stands for the end of the string.
  \B is a non-word boundary.
  \b is a single word boundary (see \w and \W).
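
A short sketch of these four anchors at work follows. The sample line is invented for the example:

```perl
use strict;
use warnings;

my $line = 'perl scripts use perlish idioms';

my $at_start = ($line =~ /^perl/) ? 'yes' : 'no';    # line begins with "perl"
my $at_end   = ($line =~ /perl$/) ? 'yes' : 'no';    # line ends with "idioms"

# \b on both sides: "perl" only as a complete word.
my @whole_words = $line =~ /\bperl\b/g;

# \B after the word: "perl" only when more word characters follow.
my @prefixes = $line =~ /perl\B/g;

print "starts: $at_start, ends: $at_end\n";          # starts: yes, ends: no
print "whole words: ", scalar(@whole_words), "\n";   # whole words: 1
print "prefixes: ", scalar(@prefixes), "\n";         # prefixes: 1
```
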

Regular expressions may include quantifiers, which tell how many times an event or string must occur.

  {bottom,top} means the event must occur at least bottom times and no more than top times, where bottom and top are numbers.
  {number,} means it must occur at least number times.
  {number} means it must occur exactly number times.
  * is the same as {0,}.
  + is the same as {1,}.
  ? is the same as {0,1}.
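
To see the quantifiers side by side, here is a brief sketch. The helper name matches and the test strings are assumptions made up for this illustration:

```perl
use strict;
use warnings;

# Return "yes" or "no" depending on whether $string matches $pattern.
sub matches {
    my ($string, $pattern) = @_;
    return $string =~ $pattern ? 'yes' : 'no';
}

my @results = (
    matches('1997',   qr/^\d{4}$/),      # exactly four digits
    matches('97',     qr/^\d{2,4}$/),    # two to four digits
    matches('19970',  qr/^\d{2,4}$/),    # five digits is too many
    matches('19970',  qr/^\d{2,}$/),     # at least two digits
    matches('color',  qr/^colou?r$/),    # ? makes the u optional
    matches('colour', qr/^colou?r$/),
    matches('',       qr/^\d*$/),        # * allows zero digits
    matches('',       qr/^\d+$/),        # + requires at least one
);

print "@results\n";    # prints "yes yes no yes yes yes yes no"
```
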

The period character, or dot (.), is an often-used tool because it matches any character except the newline. For specific characters, you may include them in lists enclosed by square brackets; ranges are indicated with a hyphen as in A-Z.
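
The following sketch, built on made-up strings, shows the dot refusing to match a newline and a bracketed list accepting only the characters it names:

```perl
use strict;
use warnings;

my @results = (
    ('A7'   =~ /^[A-Z][0-9]$/) ? 'yes' : 'no',    # capital letter, then digit
    ('a7'   =~ /^[A-Z][0-9]$/) ? 'yes' : 'no',    # lowercase is outside A-Z
    ('a-b'  =~ /^a.b$/)        ? 'yes' : 'no',    # dot matches the hyphen
    ("a\nb" =~ /^a.b$/)        ? 'yes' : 'no',    # dot does not match newline
    ('gray' =~ /^gr[ae]y$/)    ? 'yes' : 'no',    # either "a" or "e"
);

print "@results\n";    # prints "yes no yes no yes"
```
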

The backslash (\) before a character gives it a special meaning. Table 5.3 illustrates the backslashed special characters.

Table 5.3: Special Characters in Regular Expressions

Character   Matches

\n          Newline
\r          Carriage return
\t          Tab
\f          Form feed
\d          A digit (0 through 9)
\D          A non-digit
\s          White space, such as space, tab, or newline
\S          Non-white space
\w          A word character: a letter, a digit, or the underscore
\W          A non-word character
\xnn        Where nn is a hex value, the character having that value
\0nn        Same as above, using octal (base 8) numbers
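
Several of the characters in Table 5.3 can be combined in one pattern. The sketch below pulls a label and a number out of an invented input line, then checks that the hex and octal forms name the same character:

```perl
use strict;
use warnings;

# Hypothetical input line: a label, a tab, a number, and a newline.
my $entry = "Phone:\t555 1234\n";

# \w+ takes the label, \s+ skips the tab, \d+\s\d+ takes the number,
# and \s*$ allows the trailing newline.
my ($label, $number) = $entry =~ /^(\w+):\s+(\d+\s\d+)\s*$/;

print "$label\n";     # prints "Phone"
print "$number\n";    # prints "555 1234"

# Hex 21 and octal 041 are both decimal 33, the exclamation point.
my $by_hex   = ('!' =~ /^\x21$/) ? 'yes' : 'no';
my $by_octal = ('!' =~ /^\041$/) ? 'yes' : 'no';

print "$by_hex $by_octal\n";    # prints "yes yes"
```
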

Another convention you’ll see often in regular expressions is the use of $1, $2, $3, etc. These scalar variables correspond, left to right, to the parenthesized groups in the pattern. What makes them especially valuable is that they keep their values after the match has finished. For example, the string 19 May 1997 could be split into its parts with this code snippet:

   $string =~ /(..) (...) (....)/;
   $day = $1;
   $month = $2;
   $year = $3;


Warning:  Inside a pattern, \1, \2, \3, etc. refer back to the same parenthesized groups that $1, $2, $3, etc. report once the match is done. Keep this in mind if you try to match a digit literally by “escaping” it and the results aren’t what you expect: \1 is a reference to the first group, not the character 1.
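
Here is a small sketch of the two notations working together: \1 inside the pattern and $1 outside it. The sample sentence is made up for the example.

```perl
use strict;
use warnings;

my $text = 'this is is a test';
my $repeated = '';

# Inside the pattern, \1 means "whatever the first group matched",
# so this finds a word followed by a space and the same word again.
if ($text =~ /\b(\w+) \1\b/) {
    $repeated = $1;    # after the match, $1 holds that word
}

# In the replacement half of a substitution, use $1.
(my $fixed = $text) =~ s/\b(\w+) \1\b/$1/;

print "$repeated\n";    # prints "is"
print "$fixed\n";       # prints "this is a test"
```
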


EXERCISE:  The Sambar Server: Building Your Web Site

You’ve done some simple forms in this skill. You can prepare yourself for the next skill by expanding on them. After all, HTML forms aren’t difficult to set up. It’s processing the information in the forms that presents the knotty problems. Try the following:

  Just as an experiment, create some more complex forms that use all of the available HTML controls. Use radio buttons, check boxes, and the rest—it doesn’t matter if you know what they’ll return. Play with them and see what comes back in the resulting URLs.
  Analyze the URLs that are created with each of the forms you create. Look at what appears on the Go To line of your Web browser. Your familiarity with these conventions will be invaluable later.
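
When you look at those URLs, it may help to see how the part after the question mark breaks into name and value pairs. The following is only a sketch: the query string, the field names, and the %form hash are invented for illustration, and a real form may send more than this handles.

```perl
use strict;
use warnings;

# Hypothetical query string, as it might appear after the "?" in a URL.
my $query = 'name=Erik+Strom&city=San%20Jose';

my %form;
for my $pair (split /&/, $query) {
    # Each pair looks like key=value; limit the split to two pieces
    # in case the value itself contains an equal sign.
    my ($key, $value) = split /=/, $pair, 2;
    $value = '' unless defined $value;

    $value =~ tr/+/ /;                                       # plus means space
    $value =~ s/%([0-9A-Fa-f]{2})/pack("C", hex($1))/ge;     # %nn means a character code

    $form{$key} = $value;
}

for my $key (sort keys %form) {
    print "$key: $form{$key}\n";
}
# prints "city: San Jose" and then "name: Erik Strom"
```
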

Moving On

This is enough about regular expressions for now. You’ll learn more about them as you become comfortable with what you’ve been exposed to so far.

In the next skill, you’ll use Perl to perform some more complicated tasks: creating a guest book and a quiz form for your Web site.

Are You Experienced?

Now you can…

  create a simple HTML form
  process what a visitor has typed into the form through CGI and a Perl program
  understand how characters in a URL are specially coded when they get to your Perl program
  decode the characters using Perl regular expressions
  understand some of the details of how regular expressions are built and evaluated in your Perl programs



Use of this site is subject to certain Terms & Conditions, Copyright © 1996-2000 EarthWeb Inc.
All rights reserved. Reproduction whole or in part in any form or medium without express written permission of EarthWeb is prohibited.