Previous | Table of Contents | Next |
The preceding sections describe Icon and explore various aspects of programming in Icon. This section presents a complete Icon program, including how it is designed and written.
6.3.5.1. A Recognizer Generator
The problem is to convert a description of a context-free language (Griswold & Griswold, 1996) into a recognizer for the language; that is, to write a program that writes another onea recognizer.
The context-free languages used here are written in a version of Backus-Naur Form in which nonterminal symbols are enclosed in angular brackets and alternatives are separated by vertical bars (Backus, 1959).
A simple grammar that illustrates this syntax-description notation is
plist> ::=[ ] | [<args>] | [<plist>] <args> ::= , | ,<plist> | ,<args>
Each line contains a definition, or production, that defines a nonterminal symbol. <plist> and <args> are nonterminal symbols, ::= stands for is defined to be, and vertical bars separate alternatives as mentioned previously. All other symbols (here brackets and commas) are terminal symbols that stand for themselves.
The language defined by <plist> contains strings such as [,[ ]], [,,,], and [[ ]].
An Icon recognizer for <plist> is
procedure plist_() suspend { (=[ ]) | (=[ || args() || =]) | (=[ || plist() || =]) } end procedure args_() suspend { (=,) | (=, || plist()) | (=, || args()) } end procedure main() while line := read() do { writes(image(line)) if line ? (plist() & pos(0)) then write(: accepted) else write(: rejected) } end
6.3.5.2. The Structure of a Production
The problem is a conceptually simple one: Analyze each production in the grammar, and construct a corresponding recognizing procedure.
String scanning is the obvious tool to use for the analysis of the productions, but some care is needed to avoid a messy, bug-ridden program.
The analysis should be organized around the syntax of productions, which have the form
production -> < name > ::= rhs
where rhs stands for right-hand side. A right-hand side, in turn, is a sequence of alternatives separated by vertical bars:
rhs -> alt | alt | | alt
and an alternative is a sequence of symbols:
alt -> sym sym sym
The point is that the hierarchical structure of a production should be reflected in the structure of its analysis.
6.3.5.3. Transforming a Production
The components of a production as described in the preceding section also need to be transformed to produce a recognizing procedure:
τ(< name > ::= rhs) = procedure name() suspend t(rhs) end τ(alt | alt | | alt) =τ(alt) | τ(alt) | | τ(alt) τ(sym sym sym) = (τ(sym)|| τ(sym) || || τ(sym)) τ(terminal) = =terminal τ(nonterminal) = nonterminal()
6.3.5.4. Organization of the Program
The hierarchical structure of a production is reflected in a hierarchy of procedures that analyze and transform a production:
6.3.5.5. Writing the Program
There are two main issues in actually writing the program:(1) how to perform the analysis and (2) how to produce the required transformed output.
The approach taken here uses string scanning exclusively, with matching procedures doing the work. These matching procedures operate in the context of a scanning environment that is established before they are called. For example, the transformation loop in the main procedure is
while line := read() do line ? transprod()
Thus, the procedure transprod() is called with the appropriate scanning environment already in place. transprod() needs to get the name of the nonterminal symbol being defined by the production, write a procedure heading, provide the shell for suspension, call tranalts() to transform the alternatives, and then finish off the procedure declaration:
procedure transprod() { =< & write(procedure , tab(many(nchars)), ()) & =>::= } | error() write( suspend {) transalts() write( }) write(end) return end
The global variable nchars contains a cset consisting of the characters that are acceptable in a nonterminal name. Note that in writing the procedure heading, the second argument of write() is the result of string scanning, inserting the name in the proper place in the procedure declaration. Conjunction is used to bind the three expressions involved in analyzing the production into a single unit, so that only one call of error() is needed. In this program, an error in the syntax of a production terminates program execution.
The procedure transalts() is next. It is called with the same scanning environment as for transprod() but with the position now at the first character of the right-hand side.
Because a right-hand side is a sequence of alternatives separated by vertical bars, a loop is the natural control structure to use. Since vertical bars separate alternatives, an alternative is matched by
tab(upto(|) | 0)
This construction is useful when items are separated by some marker but there is no marker after the last one. It matches up to the marker or else matches the remainder of the subject.
If there were no transformation to apply to the subject, the analysis loop would look like
repeat { tab(upto(|) | 0) ? transseq() move(1) | break }
where
move(1) | break
Previous | Table of Contents | Next |