

4.2. Patterns of IL Conception

There are primarily two “patterns” of intermediate language conception:

  Extension: An existing (source) language becomes an intermediate language of another translation system.
  Between: An intermediate language is interposed between components of a translation system.

4.2.1. The Extension Pattern

LaTeX and the early C++ compilers, which translated C++ into C, are examples of layering a new software system on top of an extant one. Thus, although TeX is a text-processing language in its own right, it is viewed as an intermediate language from the LaTeX perspective. Building systems in this fashion is convenient and economical because an already established language, for which robust translators have presumably already been written, is essentially adopted as an intermediate language.

If the source language L is recruited as an intermediate language, then the following problems can arise:

  L may lack facilities to ferry information from the new source language to L’s target. For example, there may be no mechanism for transmitting source program line numbers so that error messages can be reported with respect to the source program.
  Because L was in use as a source language, it is likely that programs supplied to L were written by humans. Ergonomic considerations may have influenced the engineering of some compilers for L. Consequently, compilers may fail when called upon to translate automatically generated L programs, because such programs can exhibit complexity well beyond what a human would author. For example, programs automatically generated in L may have
    Very deeply nested scopes
    Expressions involving hundreds of operands
    Excessively long methods
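
To make the scale concrete, the sketch below generates text of the kind a translator might emit. It is an illustration only: the C-like syntax merely stands in for L, the helper names wide_expression and nested_scopes are hypothetical, and the sizes 500 and 200 are arbitrary.

```python
def wide_expression(n_operands: int) -> str:
    """Return a sum of n_operands distinct variables, e.g. 'v0 + v1 + v2'."""
    return " + ".join(f"v{i}" for i in range(n_operands))


def nested_scopes(depth: int, body: str) -> str:
    """Wrap body in `depth` nested brace-delimited scopes, one brace per line."""
    lines = []
    for level in range(depth):
        lines.append("  " * level + "{")
    lines.append("  " * depth + body)
    for level in reversed(range(depth)):
        lines.append("  " * level + "}")
    return "\n".join(lines)


# A statement no human would write by hand: 500 operands, 200 scopes deep.
# (Both sizes are arbitrary illustration values, not limits of any compiler.)
program = nested_scopes(200, "r = " + wide_expression(500) + ";")
```

A few lines of generator code are enough to produce input far outside the range a compiler's authors are likely to have tested.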

When presented with such programs, a compiler’s “failures” can be manifested as internal compiler errors: The compiler may simply not be equipped to handle such inputs, even though they are legal programs in the L language.

The code generated by such compilers may also lack the optimizations envisioned when L was recruited as an IL. For example, L may lack support for non-local variables, whereas the new source language that uses L intermediately may require such support. Programs generated in L will then require code in support of access to non-local variables (Fischer & LeBlanc, 1991); such code might maintain static links or displays, using an array to implement a stack, so that a source statement such as x=x+y would be translated into
     stack[stack[tos]+5] =
         stack[stack[tos]+5] + stack[stack[tos]+13]

The compiler writer who uses L as an IL might expect that compilers for L would recognize that stack[tos]+5 need only be computed once in the preceding code, but such optimization may be difficult to obtain in a compiler for L.
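
The redundancy can be made explicit by writing the generated form next to a hand-optimized one. The following is a minimal sketch, assuming a Python list stands in for the run-time stack and that x and y live at offsets 5 and 13 from the frame base stored at stack[tos]; the function names and the toy frame layout are hypothetical.

```python
def add_naive(stack, tos):
    """x = x + y as generated directly: stack[tos] is loaded three times."""
    stack[stack[tos] + 5] = stack[stack[tos] + 5] + stack[stack[tos] + 13]


def add_cse(stack, tos):
    """The same statement with the common subexpression computed once."""
    base = stack[tos]        # frame base, loaded once
    x_addr = base + 5        # address of x, computed once
    stack[x_addr] = stack[x_addr] + stack[base + 13]


def make_stack():
    """Build a toy stack whose slot 0 holds a frame base of 10 (assumed layout)."""
    stack = [0] * 32
    stack[0] = 10            # stack[tos] -> frame base
    stack[10 + 5] = 7        # x
    stack[10 + 13] = 35      # y
    return stack


a, b = make_stack(), make_stack()
add_naive(a, 0)
add_cse(b, 0)
```

Both functions leave the stack in the same state; the second is the rewriting that a compiler writer using L as an IL would hope a compiler for L performs automatically.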

These problems are troublesome, particularly because the very reason for choosing L as an IL may have been to enhance portability. If L is a popular language, then good compilers for L may exist on many platforms. Yet these compilers may appear inconsistent with respect to their ability to handle programs automatically generated in L. In summary, it can be said that nothing stresses a compiler like another compiler.

4.2.2. The Between Pattern

In the “between” pattern, an intermediate language is interposed between software components, as shown in Figure 4.3(b). The IL could be created in an extant system by dividing the system into components that create and use the IL. More frequently, a “between” IL is conceived prior to software construction so that the software development process can take advantage of the IL.


Figure 4.3.  An IL can reduce the effort needed to re-source or re-target a compiler.

Suppose a compiler vendor produces a suite of compilers for s source languages (C, FTN, Ada, etc.) and currently supports these compilers on t target architectures (IBM-PC, Sun SPARC, DEC Alpha, etc.). If a different product is needed for each combination, then this company might develop and support s×t compilers, as shown in Figure 4.3(a); however, this work can be reduced to s+t components if an IL is introduced between the source and target specifications, as shown in Figure 4.3(b). Here, the company develops s front ends and t back ends. Each front end translates its source language to the IL; each back end translates the IL into native code for its architecture.
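
The arithmetic can be sketched with a toy pipeline. The code below is an illustration under assumed names: the FRONT_ENDS and BACK_ENDS tables, the string-based "IL", and the function compile_via_il are hypothetical stand-ins, not a real compiler.

```python
# Each front end maps source text to the shared IL (here, just a tagged string).
FRONT_ENDS = {
    "C": lambda src: f"IL({src})",
    "FTN": lambda src: f"IL({src})",
    "Ada": lambda src: f"IL({src})",
}

# Each back end maps IL to "native code" for one architecture.
BACK_ENDS = {
    "IBM-PC": lambda il: f"pc[{il}]",
    "SPARC": lambda il: f"sparc[{il}]",
    "Alpha": lambda il: f"alpha[{il}]",
}


def compile_via_il(source_lang, target, src):
    """Compose one front end with one back end through the IL."""
    return BACK_ENDS[target](FRONT_ENDS[source_lang](src))


s, t = len(FRONT_ENDS), len(BACK_ENDS)
components_written = s + t   # front ends plus back ends
pairs_supported = s * t      # every source/target combination
```

With three source languages and three targets, six components cover nine source/target pairs, and supporting a fourth target costs one additional back end rather than three additional compilers.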

