The solution constituted the crucial jump in the evolutionary chain between typeless BCPL and typed C. It eliminated the materialization of the pointer in storage and instead caused the creation of the pointer when the array name is mentioned in an expression. The rule, which survives in today's C, is that values of array type are converted, when they appear in expressions, into pointers to the first of the objects making up the array.
This invention enabled most existing B code to continue to work, despite the underlying shift in the language's semantics. The few programs that assigned new values to an array name to adjust its origin (possible in B and BCPL, meaningless in C) were easily repaired. More important, the new language retained a coherent and workable (if unusual) explanation of the semantics of arrays while opening the way to a more comprehensive type structure.
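The conversion rule described above can be seen in a few lines of present-day C. This is only an illustrative sketch: the helper names sum and array_name_is_pointer_to_first are hypothetical and do not come from the original text.

```c
#include <stddef.h>

/* Hypothetical helper: sums n ints starting at p. */
int sum(const int *p, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += p[i];          /* p[i] means *(p + i) */
    return total;
}

/* Returns 1 when the array name, used in an expression,
   yields a pointer to the first element of the array. */
int array_name_is_pointer_to_first(void)
{
    int a[3] = {1, 2, 3};
    int *p = a;                 /* no pointer cell is stored with a;
                                   the pointer is created here */
    /* a = p;  would not compile: an array name cannot be assigned to */
    return p == &a[0] && sum(a, 3) == 6;
}
```

Passing a to sum hands over a pointer to a[0], which is why the function can be written against a plain pointer parameter.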
The second innovation that most clearly distinguishes C from its predecessors is this fuller type structure and especially its expression in the syntax of declarations. NB offered the basic types int and char, together with arrays of them, and pointers to them, but no further ways of composition. Generalization was required: Given an object of any type, it should be possible to describe a new object that gathers several into an array, yields it from a function, or is a pointer to it.
For each object of such a composed type, there was already a way to mention the underlying object: Index the array, call the function, and use the indirection operator on the pointer. Analogical reasoning led to a declaration syntax for names mirroring that of the expression syntax in which the names typically appear. Thus,
int i, *pi, **ppi;
declares an integer, a pointer to an integer, and a pointer to a pointer to an integer. The syntax of these declarations reflects the observation that i, *pi, and **ppi all yield an int type when used in an expression. Similarly,
int f(), *f(), (*f)();
declares a function returning an integer, a function returning a pointer to an integer, and a pointer to a function returning an integer;
int *api[10], (*pai)[10];
declares an array of pointers to integers and a pointer to an array of integers. In all these cases, the declaration of a variable resembles its usage in an expression whose type is the one named at the head of the declaration.
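A short sketch may make the "declaration mirrors use" idea concrete. It declares objects of the kinds listed above and then uses each in an expression of type int. The names seven and declaration_mirrors_use are hypothetical, and the function declarators use modern (void) prototypes rather than the older empty-parenthesis form shown in the text.

```c
/* Hypothetical function, present only so the function pointer
   has something to refer to. */
static int seven(void) { return 7; }

int declaration_mirrors_use(void)
{
    int i = 5, *pi = &i, **ppi = &pi;  /* i, *pi, **ppi are all int */
    int x = 1, y = 2;
    int *api[10] = { &x, &y };         /* *api[1] is an int */
    int arr[10] = { 0 };
    int (*pai)[10] = &arr;             /* (*pai)[3] is an int */
    int (*pf)(void) = seven;           /* (*pf)() is an int */

    (*pai)[3] = 4;

    /* Each operand below has the same shape as its declarator. */
    return i + *pi + **ppi + *api[1] + (*pai)[3] + (*pf)();
    /* 5 + 5 + 5 + 2 + 4 + 7 = 28 */
}
```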
The scheme of type composition adopted by C owes considerable debt to Algol 68, although it did not, perhaps, emerge in a form that Algol's adherents would approve of. The central notion I captured from Algol was a type structure based on atomic types (including structures), composed into arrays, pointers (references), and functions (procedures). Algol 68's concept of unions and casts also had an influence that appeared later.
After creating the type system, the associated syntax, and the compiler for the new language, I felt that it deserved a new name; NB seemed insufficiently distinctive. I decided to follow the single-letter style and called it C, leaving open the question whether the name represented a progression through the alphabet or through the letters in BCPL.
Rapid changes continued after the language had been named, for example, the introduction of the && and || operators. In BCPL and B, the evaluation of expressions depends on context: Within if and other conditional statements that compare an expression's value with zero, these languages place a special interpretation on the and (&) and or (|) operators. In ordinary contexts, they operate bitwise, but in the B statement
if (e1 & e2)
the compiler must evaluate e1 and, if it is nonzero, evaluate e2, and, if it too is nonzero, elaborate the statement dependent on the if. The requirement descends recursively on & and | operators within e1 and e2. The short-circuit semantics of the Boolean operators in such truth-value context seemed desirable, but the overloading of the operators was difficult to explain and use. At the suggestion of Alan Snyder, I introduced the && and || operators to make the mechanism more explicit.
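In C as it exists today, the difference between the two operators can be observed directly. The following sketch counts how often the right-hand operand is evaluated; the names calls, rhs, logical_and_calls, bitwise_and_calls, and differ_on_2_and_1 are hypothetical and exist only for illustration.

```c
static int calls;   /* counts evaluations of the right-hand operand */

static int rhs(int v) { calls++; return v; }

/* && skips the right operand when the left operand is zero. */
int logical_and_calls(int left)
{
    calls = 0;
    (void)(left && rhs(1));
    return calls;
}

/* & always evaluates both operands and combines them bitwise. */
int bitwise_and_calls(int left)
{
    calls = 0;
    (void)(left & rhs(1));
    return calls;
}

/* The results can differ too: 2 & 1 is 0, while 2 && 1 is 1. */
int differ_on_2_and_1(void)
{
    return (2 & 1) == 0 && (2 && 1) == 1;
}
```

With a zero left operand, logical_and_calls reports no evaluation of the right side, while bitwise_and_calls reports one.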
Their tardy introduction explains an infelicity of C's precedence rules. In B, one writes
if (a==b & c)
to check whether a equals b and c is nonzero; in such a conditional expression, it is better that & have lower precedence than ==. In converting from B to C, one wants to replace & with && in such a statement; to make the conversion less painful, we decided to keep the precedence of the & operator the same relative to == and merely split the precedence of && slightly from &. Today, it seems that it would have been preferable to move the relative precedences of & and == and thereby simplify a common C idiom: To test a masked value against another value, one must write
if ((a&mask) == b)
where the inner parentheses are required but easily forgotten.
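The effect of omitting the parentheses can be checked with a small sketch. The function names unparenthesized and parenthesized are hypothetical; the values in the comments are chosen only to make the two parses disagree.

```c
/* Without parentheses, == binds tighter than &, so this is
   a & (mask == b), which is rarely what was meant. */
int unparenthesized(int a, int mask, int b)
{
    return a & mask == b;
}

/* The intended test: mask first, then compare. */
int parenthesized(int a, int mask, int b)
{
    return (a & mask) == b;
}

/* With a = 6, mask = 4, b = 4:
   parenthesized:   (6 & 4) == 4  ->  4 == 4  ->  1
   unparenthesized: 6 & (4 == 4)  ->  6 & 1   ->  0 */
```

Many compilers issue a warning for the unparenthesized form when warnings are enabled, which is one practical guard against the slip.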
Many other changes occurred around 1972-1973, but the most important was the introduction of the preprocessor, partly at the urging of Alan Snyder (1974) but also in recognition of the utility of the file-inclusion mechanisms available in BCPL and PL/I. Its original version was exceedingly simple and provided only included files and simple string replacements: #include and #define of parameterless macros. Soon thereafter, it was extended, mostly by Mike Lesk and then by John Reiser, to incorporate macros with arguments and conditional compilation. The preprocessor was originally considered an optional adjunct to the language itself. Indeed, for some years, it was not even invoked unless the source program contained a special signal at its beginning. This attitude persisted and explains both the incomplete integration of the syntax of the preprocessor with the rest of the language and the imprecision of its description in early reference manuals.
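For readers unfamiliar with the facilities named above, the following sketch shows each one in its present-day form. It is not a reconstruction of the early preprocessor; the macro names BUFSIZE, MAX, TRACE, and TRACE_ENABLED and the three functions are hypothetical.

```c
#include <stddef.h>        /* file inclusion */

#define BUFSIZE 512        /* parameterless macro: simple text replacement */

/* Macro with arguments (one of the later extensions). */
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Conditional compilation (also a later extension). TRACE is assumed
   to be undefined unless supplied on the compiler command line. */
#ifdef TRACE
#define TRACE_ENABLED 1
#else
#define TRACE_ENABLED 0
#endif

size_t buffer_bytes(void) { return BUFSIZE * sizeof(char); }
int larger(int x, int y)  { return MAX(x, y); }
int tracing(void)         { return TRACE_ENABLED; }
```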