Perl CGI Programming: No experience required.
(Publisher: Sybex, Inc.)
Author(s): Erik Strom
ISBN: 0782121578
Publication Date: 11/01/97
Here's a recap of the Perl regular-expression functions we've covered so far, with their formal parameters:
- tr /SEARCH_LIST/REPLACE_LIST/ Translates each character in SEARCH_LIST to the corresponding character in REPLACE_LIST. Despite the slashes, both lists are sets of characters, not regular expressions.
- hex (EXPRESSION) Interprets EXPRESSION as a hexadecimal number and returns the decimal value. For example, hex (10) would return 16.
- pack (TEMPLATE, EXPRESSION) Packs EXPRESSION into a binary structure based on TEMPLATE.
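As a quick refresher, here is a minimal sketch of the three functions working on their own. The variable names and sample values are illustrative assumptions, not taken from the earlier examples.

```perl
use strict;
use warnings;

# tr: translate each "+" into a space, character by character.
my $field = "Erik+Strom";            # sample value, assumed for illustration
(my $spaced = $field) =~ tr/+/ /;    # copy $field, then translate the copy

# hex: interpret a string of hexadecimal digits as a number.
my $value = hex("10");               # 16

# pack: the "C" template packs a number into a single character.
my $char = pack("C", hex("41"));     # hex 41 is decimal 65, the letter "A"

print "$spaced\n";   # Erik Strom
print "$value\n";    # 16
print "$char\n";     # A
```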
You're Right: Regular Expressions Are Hard
One of Perl's biggest strengths over bare-metal programming languages such as C and C++ is how easily it formats text. However, it relies heavily on regular expressions to do that formatting. Make no mistake about it: regular expressions are difficult.
But another of Perl's strengths is that you can write a workable, useful program in it without knowing every detail of the language, something you've already demonstrated in the examples we've built so far. As you gain proficiency with Perl, your programs will put that knowledge to use, too.
You will learn about some hairy details of regular expressions in this skill, but don't be too concerned if you don't get it right away. With practice, it'll come to you.
Meanwhile, it's helpful to think of regular expressions as nothing more than a search-and-replace function on steroids.
Regular Expressions in Detail
A regular expression is used to match a pattern in a string and, possibly, replace it with another pattern. The string can match any of the alternatives of the regular expression; alternatives are separated with a vertical bar (|), are evaluated from left to right, and always stop on the first match.
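The left-to-right rule for alternatives can be seen in a small sketch like the following. The sample word and variable names are assumptions chosen for illustration.

```perl
use strict;
use warnings;

my $text = "catalog";   # sample string, assumed for illustration

# Alternatives are tried left to right; the first one that matches wins,
# even when a later alternative would match more of the string.
my ($short_first) = $text =~ /(cat|catalog)/;
my ($long_first)  = $text =~ /(catalog|cat)/;

print "$short_first\n";   # cat
print "$long_first\n";    # catalog
```

Putting the longer alternative first is one way to make sure it gets a chance to match.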
The building blocks of a regular expression are the characters used to represent events or other characters.
- ^ stands for the beginning of a string.
- $ stands for the end of the string.
- \B is a non-word boundary.
- \b is a single word boundary (see \w and \W).
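Here is a minimal sketch of these four building blocks in action. The sample line and variable names are assumptions made up for this illustration.

```perl
use strict;
use warnings;

my $line = "perl is superb";   # sample string, assumed for illustration

# Each test stores 1 for a match and 0 for no match.
my $at_start   = $line =~ /^perl/     ? 1 : 0;  # "perl" begins the string
my $at_end     = $line =~ /perl$/     ? 1 : 0;  # the string ends in "superb"
my $whole_word = $line =~ /\bperl\b/  ? 1 : 0;  # "perl" stands alone as a word
my $inside     = $line =~ /\Bper\B/   ? 1 : 0;  # "per" buried inside "superb"
my $per_alone  = $line =~ /\bper\b/   ? 1 : 0;  # "per" never stands alone

print "$at_start $at_end $whole_word $inside $per_alone\n";   # 1 0 1 1 0
```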
Regular expressions may include quantifiers, which tell how many times an event or string must occur.
- {bottom,top} where bottom and top are numbers, means the event must occur at least bottom times and no more than top times.
- {number,} means it has to happen at least number times.
- {number} means it has to happen exactly number times.
- * is the same as {0,}.
- + is the same as {1,}.
- ? is the same as {0,1}.
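The quantifiers above can be tried out with a short sketch such as this one. The helper subroutine matches is hypothetical, written only for this illustration, and the test strings are assumptions.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $string matches $pattern, otherwise 0.
sub matches {
    my ($string, $pattern) = @_;
    return $string =~ $pattern ? 1 : 0;
}

# {2,3}: at least two and no more than three "a" characters before the "b".
my $two_or_three = qr/^a{2,3}b$/;
print matches("ab",    $two_or_three), "\n";   # 0
print matches("aab",   $two_or_three), "\n";   # 1
print matches("aaab",  $two_or_three), "\n";   # 1
print matches("aaaab", $two_or_three), "\n";   # 0

# {2,} means two or more; {2} means exactly two.
print matches("aaaab", qr/^a{2,}b$/), "\n";    # 1
print matches("aaab",  qr/^a{2}b$/),  "\n";    # 0

# * allows zero occurrences, + requires at least one, ? allows zero or one.
print matches("b",   qr/^a*b$/), "\n";         # 1
print matches("b",   qr/^a+b$/), "\n";         # 0
print matches("aab", qr/^a?b$/), "\n";         # 0
```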
The period character, or dot (.), is an often-used tool because it matches any character except the newline. For specific characters, you may include them in lists enclosed by square brackets; ranges are indicated with a hyphen as in A-Z.
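A brief sketch of the dot and of bracketed lists follows. The test strings and variable names are assumptions for illustration only.

```perl
use strict;
use warnings;

# The dot matches any single character except the newline.
my $dot_letter  = "cat"   =~ /^c.t$/ ? 1 : 0;   # "a" fills the dot
my $dot_newline = "c\nt"  =~ /^c.t$/ ? 1 : 0;   # a newline does not

# Square brackets list the characters allowed at one position.
my $vowel = "e" =~ /^[aeiou]$/ ? 1 : 0;         # "e" is in the list
my $upper = "Q" =~ /^[A-Z]$/   ? 1 : 0;         # "Q" falls in the range A-Z
my $lower = "q" =~ /^[A-Z]$/   ? 1 : 0;         # "q" does not

print "$dot_letter $dot_newline $vowel $upper $lower\n";   # 1 0 1 1 0
```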
The backslash (\) before a character gives it a special meaning. Table 5.3 illustrates the backslashed special characters.
Table 5.3: Special Characters in Regular Expressions
| Character | Meaning |
| --- | --- |
| \n | Newline |
| \r | Carriage return |
| \t | Tab |
| \f | Form feed |
| \d | A digit, or single number |
| \D | A non-digit |
| \s | White space, such as space, tab, or newline |
| \S | Non-white space |
| \w | An alphanumeric character (including the underscore) |
| \W | Non-alphanumeric |
| \xnn | Where nn is a hex value, the character having that value |
| \0nn | Same as above, using octal (base 8) numbers |
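A few of the special characters from Table 5.3 can be tried with a sketch like this. The sample entry and variable names are assumptions invented for the illustration.

```perl
use strict;
use warnings;

my $entry = "Room 42\tis open";   # sample string with an embedded tab

# \d+ picks out the first run of digits.
my ($number) = $entry =~ /(\d+)/;

# \w+ with the g modifier collects every run of alphanumeric characters.
my @words = $entry =~ /(\w+)/g;

# \s+ treats the space and the tab alike when splitting.
my @parts = split /\s+/, $entry;

# \x41 is the character with hex value 41; \011 is octal 11, the tab.
my $hex_a   = "A"  =~ /^\x41$/ ? 1 : 0;
my $oct_tab = "\t" =~ /^\011$/ ? 1 : 0;

print "$number\n";               # 42
print join(",", @words), "\n";   # Room,42,is,open
print scalar(@parts), "\n";      # 4
print "$hex_a $oct_tab\n";       # 1 1
```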
Another convention you'll see often in regular expressions is the use of $1, $2, $3, etc. These scalar variables correspond, left to right, to the expressions in parentheses in SEARCH_LIST. What makes them especially valuable is that they keep their values outside of the regular expression. For example, the string 19 May 1997 could be split into its parts with this code snippet:
$string = "19 May 1997";
$string =~ /(..) (...) (....)/;
$day = $1;
$month = $2;
$year = $3;
Warning: Inside the regular expression itself, the same parenthesized matches are referred to as \1, \2, \3, etc., the equivalents of $1, $2, $3, etc. Keep this in mind if you mistakenly try to match a digit literally by escaping it and the results aren't what you expect.
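The difference between \1 inside a pattern and $1 afterward can be sketched as follows. The sample strings and variable names are assumptions for illustration.

```perl
use strict;
use warnings;

my $text = "this is is a test";   # sample string with a doubled word

# Inside the pattern, \1 means "whatever the first parentheses matched".
# After a successful match, the same text is available as $1.
my $doubled = "";
if ($text =~ /\b(\w+) \1\b/) {
    $doubled = $1;
}
print "$doubled\n";   # is

# An escaped digit is a backreference, not a literal digit.
my $back_aa = "aa" =~ /^(a)\1$/ ? 1 : 0;   # \1 repeats the captured "a"
my $back_a1 = "a1" =~ /^(a)\1$/ ? 1 : 0;   # \1 does not mean the digit 1
my $plain   = "a1" =~ /^(a)1$/  ? 1 : 0;   # an unescaped 1 is the digit 1

print "$back_aa $back_a1 $plain\n";   # 1 0 1
```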
EXERCISE: The Sambar Server: Building Your Web Site
You've done some simple forms in this skill. You can prepare yourself for the next skill by expanding on them. After all, HTML forms aren't difficult to set up. It's processing the information in the forms that presents the knotty problems. Try the following:
- Just as an experiment, create some more complex forms that use all of the available HTML controls. Use radio buttons, check boxes, and the rest; it doesn't matter if you know what they'll return. Play with them and see what comes back in the resulting URLs.
- Analyze the URLs that are created with each of the forms you create. Look at what appears on the Go To line of your Web browser. Your familiarity with these conventions will be invaluable later.
Moving On
This is enough about regular expressions for now. You'll learn more about them as you become comfortable with what you've been exposed to so far.
In the next skill, you'll use Perl to perform some more complicated tasks: creating a guest book and a quiz form for your Web site.
Are You Experienced?
Now you can
- create a simple HTML form
- process what a visitor has typed into the form through CGI and a Perl program
- understand how characters in a URL are specially coded when they get to your Perl program
- decode the characters using Perl regular expressions
- understand some of the details of how regular expressions are built and evaluated in your Perl programs