

2.3.7. Successors

C and even B have several direct descendants, although they do not rival Pascal in generating progeny. One side branch developed early. When Steve Johnson visited the University of Waterloo on sabbatical in 1972, he brought B with him. It became popular on the Honeywell machines there and later spawned Eh and Zed (the Canadian answers to “What follows B?”). When Johnson returned to Bell Labs in 1973, he was disconcerted to find that the language whose seeds he brought to Canada had evolved back home; even his own yacc program had been rewritten in C by Alan Snyder.

More recent descendants of C proper include Concurrent C (Gehani & Roome, 1989), Objective C (Cox & Novobilski, 1986), C* (Thinking Machines Corporation, 1990), and especially C++ (Stroustrup, 1986). The language is also widely used as an intermediate representation (essentially, as a portable assembly language) for a wide variety of compilers, both for direct descendants such as C++ and independent languages such as Modula-3 (Nelson, 1991) and Eiffel (Meyer, 1988).

2.3.8. Critique

Two ideas are most characteristic of C among languages of its class: the relationship between arrays and pointers and the way in which declaration syntax mimics expression syntax. They are also among its most frequently criticized features and often serve as stumbling blocks to the beginner. In both cases, historical accidents or mistakes have exacerbated their difficulty. The most important of these has been the tolerance of C compilers to errors in type. As should be clear from the history, C evolved from typeless languages. It did not suddenly appear to its earliest users and developers as an entirely new language with its own rules; instead, we continually had to adapt existing programs as the language developed and make allowance for an existing body of code. (Later, the ANSI X3J11 committee standardizing C would face the same problem.)

Compilers in 1977, and even well after, did not complain about usages such as assigning between integers and pointers or using objects of the wrong type to refer to structure members. Although the language definition presented in the first edition of K&R was reasonably (although not completely) coherent in its treatment of type rules, that book admitted that existing compilers didn’t enforce them. Moreover, some rules designed to ease early transitions contributed to later confusion. For example, the empty square brackets in the function declaration

   int f(a) int a[]; { … }

are a living fossil, a remnant of NB’s way of declaring a pointer; a is, in this special case only, interpreted in C as a pointer. The notation survived in part for the sake of compatibility, in part under the rationalization that it would allow programmers to communicate to their readers an intent to pass f a pointer generated from an array, rather than a reference to a single integer. Unfortunately, it serves as much to confuse the learner as to alert the reader.
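The pointer interpretation of a[] can be illustrated with a small sketch. The function names sum and set_first are hypothetical and do not come from the text; the code uses prototype-style definitions so that it compiles with current compilers.

```c
/* Illustrative sketch; sum and set_first are hypothetical names. */

/* Declared with empty brackets, but a is really a pointer to int. */
int sum(int a[], int n)
{
    int total = 0;
    while (n-- > 0)
        total += *a++;   /* incrementing a is legal only because a is a pointer */
    return total;
}

/* No copy of the caller's array is made: the write goes through the pointer. */
void set_first(int a[], int value)
{
    a[0] = value;
}
```

Because the parameter is a pointer, a caller may pass either an array name, as in sum(v, 3), or the address of a single integer, as in sum(&x, 1). The brackets record an intent but enforce nothing.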

In K&R C, supplying arguments of the proper type to a function call was the responsibility of the programmer, and the extant compilers did not check for type agreement. The failure of the original language to include argument types in the type signature of a function was a significant weakness, indeed the one that required the X3J11 committee’s boldest and most painful innovation to repair. The early design is explained (if not justified) by my avoidance of technological problems, especially cross-checking between separately compiled source files, and my incomplete assimilation of the implications of moving from an untyped to a typed language. The lint program, mentioned earlier, tried to alleviate the problem: among its other functions, lint checks the consistency and coherency of a whole program by scanning a set of source files, comparing the types of function arguments used in calls with those in their definitions.
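A minimal sketch of the difference that argument types in a function's signature make, assuming the prototype notation later standardized by X3J11. The functions half and half_of_three are hypothetical and exist only for illustration.

```c
/* Illustrative sketch; half and half_of_three are hypothetical names. */

/* Prototype: the parameter type is part of the function's type. */
double half(double x);

double half(double x)
{
    return x / 2.0;
}

double half_of_three(void)
{
    /* With the prototype in scope, the int constant 3 is converted to 3.0
       before the call. Under a K&R-style declaration with no parameter
       types, the int would be passed unconverted and the call would
       misbehave, with no complaint from the compiler. */
    return half(3);
}
```

With the prototype visible, the compiler can also reject a call such as half("three"), which is the kind of check that previously fell to the programmer or to lint.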

An accident of syntax contributed to the perceived complexity of the language. The indirection operator, spelled * in C, is syntactically a unary prefix operator, just as in BCPL and B. This works well in simple expressions, but in more complex cases, parentheses are required to direct the parsing. For example, to distinguish indirection through the value returned by a function from calling a function designated by a pointer, one writes *fp() and (*pf)(). The style used in expressions carries through to declarations, so the names might be declared

   int *fp();
   int (*pf)();

In more ornate but still realistic cases, things become worse:

   int *(*pfp)();

is a pointer to a function returning a pointer to an integer. There are two effects occurring. Most important, C has a relatively rich set of ways of describing types (compared, say, with Pascal). Declarations in languages as expressive as C (Algol 68, for example) describe objects equally hard to understand, simply because the objects themselves are complex. A second effect owes to details of the syntax. Declarations in C must be read in an inside-out style that many find difficult to grasp (Anderson, 1980). Sethi (1981) observed that many of the nested declarations and expressions would become simpler if the indirection operator had been taken as a postfix operator instead of prefix, but by then, it was too late to change.
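The three declarations discussed above can be exercised in a short sketch. The names value, seven, and use_all are hypothetical additions, and the empty parameter lists of the text are written as (void) here so that the code compiles cleanly with current compilers.

```c
/* Illustrative sketch; value, seven, and use_all are hypothetical names. */

static int value = 42;

/* fp: a function returning a pointer to int. */
int *fp(void)
{
    return &value;
}

/* An ordinary function returning int, for pf to point at. */
int seven(void)
{
    return 7;
}

/* pf: a pointer to a function returning int. */
int (*pf)(void) = seven;

/* pfp: a pointer to a function returning a pointer to int. */
int *(*pfp)(void) = fp;

int use_all(void)
{
    int a = *fp();       /* call fp, then indirect through its result */
    int b = (*pf)();     /* indirect through pf, then call the function */
    int c = *(*pfp)();   /* indirect through pfp, call, then indirect again */
    return a + b + c;
}
```

Each use in use_all has the same shape as the corresponding declaration, which is the sense in which declaration syntax mimics expression syntax. Reading a declaration from the name outward is the same as evaluating the matching expression.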

In spite of its difficulties, I believe that C’s approach to declarations remains plausible and I am comfortable with it; it is a useful unifying principle.

The other characteristic feature of C, its treatment of arrays, is more suspect on practical grounds, although it also has real virtues. Although the relationship between pointers and arrays is unusual, it can be learned. Moreover, the language shows considerable power to describe important concepts, for example, vectors whose length varies at run time, with only a few basic rules and conventions. In particular, character strings are handled by the same mechanisms as any other array, plus the convention that a null character terminates a string. It is interesting to compare C’s approach with that of two nearly contemporaneous languages, Algol 68 and Pascal (Jensen & Wirth, 1974). Arrays in Algol 68 either have fixed bounds, or are flexible: considerable mechanism is required both in the language definition and in compilers to accommodate flexible arrays (and not all compilers fully implement them). Original Pascal had only fixed-size arrays and strings, and this proved confining (Kernighan, 1981). Later, this was partially fixed, although the resulting language is not yet universally available.
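As a hedged illustration of these rules and conventions, the sketch below handles a string and a vector of run-time length using nothing beyond pointers and indexing. The names my_strlen and sum_n are hypothetical; my_strlen only mirrors what the standard library's strlen does.

```c
#include <stddef.h>

/* Illustrative sketch; my_strlen and sum_n are hypothetical names. */

/* A string is an array of char ended by a null character, so its length
   is found by walking a pointer until that terminator is reached. */
size_t my_strlen(const char *s)
{
    const char *p = s;
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}

/* A vector whose length is known only at run time: the caller passes a
   pointer to the first element together with the element count. */
int sum_n(const int *v, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];   /* v[i] is defined as *(v + i) */
    return total;
}
```

The same sum_n works on a whole array or on any contiguous slice of one, as in sum_n(v + 1, 2), because an array name used in an expression becomes a pointer to its first element. The cost is that the length travels separately from the data and nothing checks that the two agree.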

