Deck: As you learn the language, learn its pitfalls as well
by Paul Conte
C is not really a bad language; it's just too often misused. As a language for writing low-level device drivers or operating system kernels, C is superb. It's also a great language for torturing student programmers. But for business applications or other software above the operating system level, C is a minefield: "Explosive" results await the unwary C programmer's misstep. Here's an example that requires you to pick your way carefully:
    if (xcnt < 2)
        return
    date = x[0];
    time = x[1];
This code appears to guard references to array x by checking the count of its elements first. But a semicolon is missing after the return, so the code really means:
    if (xcnt < 2) {
        return date = x[0];
    }
    time = x[1];
C's "flexibility" lets you freely combine most expressions and statements, such as this assignment expression within a return statement. Unfortunately, this flexibility also means C compilers can't detect many errors caused by simple typos.
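To watch the typo at work in a complete function, here is a minimal sketch. The function name load_stamp, the int element type, and the globals stamp_date and stamp_time (stand-ins for the chapter's date and time) are assumptions made for illustration; they are not taken from the original program.

```c
/* Hypothetical globals standing in for the chapter's date and time. */
static int stamp_date = 0;
static int stamp_time = 0;

/* Laid out the way the listing looks on the page. The return has no
   semicolon, so the statement on the next line becomes the expression
   that is returned. */
int load_stamp(const int x[], int xcnt)
{
    if (xcnt < 2)
        return                 /* missing semicolon */
    stamp_date = x[0];         /* really: return stamp_date = x[0]; */
    stamp_time = x[1];
    return 0;
}
```

Called with a valid two-element array, load_stamp leaves stamp_date untouched and sets only stamp_time. Called with a one-element array, it assigns x[0] to stamp_date and returns that value. Some compilers can warn about the misleading layout, but the code is legal C.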
Many "old hand" C programmers would say the solution is simply to add the missing semicolon (after a few hours of debugging!). But does the following correction give us a safe program?
    if (xcnt < 2)
        return;
    date = x[0];
    time = x[1];
What if we decide to add an error message?
    if (xcnt < 2)
        printf("Timestamp array is too small\n");
        return;
    date = x[0];
    time = x[1];
Indeed, this is the ultimate "safe" program: even for valid arrays, it never does anything but return, ending execution on the spot! The code's execution is identical to
    if (xcnt < 2) {
        printf("Timestamp array is too small\n");
    }
    return;
    date = x[0];
    time = x[1];
For quick coding, C lets you omit the { } around a conditional statement, a shortcut most published C programs take advantage of. You will be tempted to take this shortcut, too. Don't! Ever! Always enclose conditional code in braces. The errors introduced by incorrectly matched conditions and subordinate statements are very hard to ferret out. Our original example is better coded like this:
    if (xcnt < 2) {
        return;
    }
    date = x[0];
    time = x[1];
Note that using braces also lets the compiler catch a missing semicolon, so you get lots of protection by following this simple rule.
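The difference the braces make can be checked with a small sketch that puts the unbraced and braced forms side by side. The function names, the int element type, and the globals stamp_date and stamp_time (stand-ins for date and time) are hypothetical, chosen only for this illustration.

```c
#include <stdio.h>

/* Hypothetical globals standing in for the chapter's date and time. */
static int stamp_date = 0;
static int stamp_time = 0;

/* Unbraced: only the printf belongs to the if. The return runs on
   every call, so the assignments below it are never reached. */
void load_stamp_unbraced(const int x[], int xcnt)
{
    if (xcnt < 2)
        printf("Timestamp array is too small\n");
        return;
    stamp_date = x[0];
    stamp_time = x[1];
}

/* Braced: the message and the return are both governed by the if. */
void load_stamp_braced(const int x[], int xcnt)
{
    if (xcnt < 2) {
        printf("Timestamp array is too small\n");
        return;
    }
    stamp_date = x[0];
    stamp_time = x[1];
}
```

Passing the same valid two-element array to each function shows the problem: the unbraced version leaves both globals at zero, while the braced version stores both values.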
Unwinding this example also suggests a rule I mentioned in Chapter 2: Do all your assignments as separate statements, not as part of a more complex expression. Another helpful rule is: Use parentheses around expressions in return statements. For example, if you really did want to return the date after assigning it to a global variable, you might code
    if (xcnt < 2) {
        date = x[0];
        return (date);
    }
This doesn't solve the original problem we looked at, but it does show how to code return statements so you're less likely to be tripped up by other problems with complex expressions.
Another problem related to if statements is the improper matching of else clauses. (This problem is not unique to C; COBOL programmers have been bitten by the same type of "bugs.") Suppose we change our previous example so that array x must either have at least two elements to be assigned to date and time or be empty, in which case the program should do nothing. Other conditions should cause a return. The following fragment seems to do what we require:
    if (xcnt < 2)
        if (xcnt != 0) return;
    else {
        date = x[0];
        time = x[1];
    }
But C associates an else with the closest unmatched if inside the same pair of braces. The compiler treats the above code the same as the following:
    if (xcnt < 2) {
        if (xcnt != 0) {
            return;
        }
        else {
            date = x[0];
            time = x[1];
        }
    }
In other words, nothing at all happens when xcnt is 2 or greater, and the assignments run when xcnt is 0, reading elements the array does not have. Again, using braces for all conditional statements comes to the rescue:
    if (xcnt < 2) {
        if (xcnt != 0) {
            return;
        }
    }
    else {
        date = x[0];
        time = x[1];
    }
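The two pairings of the else can be compared directly in a short sketch. The function names, the int element type, and the globals stamp_date and stamp_time (stand-ins for date and time) are assumptions for illustration only.

```c
/* Hypothetical globals standing in for the chapter's date and time. */
static int stamp_date = 0;
static int stamp_time = 0;

/* Unbraced: despite the indentation, the else pairs with the inner if,
   so the assignments run only when xcnt is 0. Do not call this with an
   empty array; it would read x[0] and x[1]. */
void load_dangling(const int x[], int xcnt)
{
    if (xcnt < 2)
        if (xcnt != 0) return;
    else {
        stamp_date = x[0];
        stamp_time = x[1];
    }
}

/* Fully braced: the else pairs with the outer if, as intended. */
void load_braced(const int x[], int xcnt)
{
    if (xcnt < 2) {
        if (xcnt != 0) {
            return;
        }
    }
    else {
        stamp_date = x[0];
        stamp_time = x[1];
    }
}
```

Given a valid two-element array, load_dangling stores nothing, while load_braced stores both values.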
Although full use of braces increases the number of lines of source code, braces make future program modifications much easier and less error-prone. With braces delimiting segments of conditional code, adding and deleting subordinate statements requires less careful checking of how else clauses and subordinate statements match up with the if statements.
Another elegant solution is to define the following macros:
    #define IF     { if (
    #define THEN   ) {
    #define ELSE   } else {
    #define ELSEIF } else if (
    #define ENDIF  } }
We could then code our previous example as:
    IF xcnt < 2 THEN
        IF xcnt != 0 THEN
            return;
        ENDIF
    ELSE
        date = x[0];
        time = x[1];
    ENDIF
Once the macros are replaced with their corresponding definitions, the code is executed the same as the previous example. This style is guaranteed to cause traditional C programmers apoplexy, but they'll forget about it when they chase down their next bug. Meanwhile, your code can be readable and reliable.
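As a check that the macros expand to balanced braces, here is a minimal sketch that wraps the macro version in a function. The macro definitions are the ones given in the text; the function name load_with_macros, the int element type, and the globals stamp_date and stamp_time (stand-ins for date and time) are hypothetical.

```c
#define IF     { if (
#define THEN   ) {
#define ELSE   } else {
#define ELSEIF } else if (
#define ENDIF  } }

/* Hypothetical globals standing in for the chapter's date and time. */
static int stamp_date = 0;
static int stamp_time = 0;

/* After preprocessing, the body reads:
   { if ( xcnt < 2 ) { { if ( xcnt != 0 ) { return; } } }
     else { stamp_date = x[0]; stamp_time = x[1]; } }          */
void load_with_macros(const int x[], int xcnt)
{
    IF xcnt < 2 THEN
        IF xcnt != 0 THEN
            return;
        ENDIF
    ELSE
        stamp_date = x[0];
        stamp_time = x[1];
    ENDIF
}
```

With a two-element array the function stores both values; with a one-element array it returns without storing anything. Every IF needs its own ENDIF, because each one opens a brace that only ENDIF closes.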
As W. A. Wulf put it, "More computing sins are committed in the name of efficiency (without necessarily achieving it) than for any other single reason including blind stupidity." C's switch statement could be the all-time award winner in the "stupid efficiency" category. The error in the following code fragment may be obvious outside the context of a larger program; but in real programs, such errors are easy to make and hard to find.
    switch (color) {
        case 1:
            printf("red\n");
        case 2:
            printf("blue\n");
    }
Given this code, when color is 1, both "red" and "blue" are printed. The proper code is
    switch (color) {
        case 1:
            printf("red\n");
            break;
        case 2:
            printf("blue\n");
    }
Of course, when you add another color, you'd better add another break after the second case. C's switch is not what is generally recognized as a "case" multiway conditional control structure; it's nothing more than a jump table. The compiler evaluates the switch expression simply to determine the target of a jump (i.e., go to) operation into the code that follows. Unless you code a break, execution will continue sequentially through the code for cases that follow the case that matches the switch expression value.
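The fall-through can be observed with a small sketch that collects the color names in a buffer instead of printing them, so the result is easy to inspect. The function names and the buffer-based interface are assumptions for illustration; the caller is assumed to supply a buffer of at least 10 bytes.

```c
#include <string.h>

/* No break after case 1: execution continues into case 2. */
void name_fallthrough(int color, char *out)
{
    out[0] = '\0';
    switch (color) {
        case 1:
            strcat(out, "red ");
        case 2:
            strcat(out, "blue ");
    }
}

/* A break after every case, including the last one, so adding a new
   case later cannot introduce an accidental fall-through. */
void name_with_breaks(int color, char *out)
{
    out[0] = '\0';
    switch (color) {
        case 1:
            strcat(out, "red ");
            break;
        case 2:
            strcat(out, "blue ");
            break;
    }
}
```

For color 1, name_fallthrough fills the buffer with "red blue ", while name_with_breaks produces just "red ".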