C++ classes give programmers a way to define new types, along with fine-grained control over the behavior of those types. Along with that control comes a corresponding responsibility. If one class expects a particular behavior of another, combining those two classes successfully depends on the behavior of the latter class matching the expectations of the former.
One place where mismatches are particularly likely is when class objects are copied. There are two common styles for using objects (as pure objects and as values), and although each style has its own uses, the two do not mix terribly well. Moreover, as a convenience to the class author, the compiler sometimes makes assumptions about operations that the author has not defined. Whether these assumptions are right or wrong depends on the context, which means that programmers must be aware of that context.
Another source of mismatches is in defining comparison. Not everything that looks like a comparison operation is actually an appropriate order relation. Moreover, for reasons of hardware efficiency and C compatibility, pointer comparisons are defined in a way that does not always meet naive expectations.
C++ compilers check for many potential problems during compilation, but no compiler can detect all such problems in advance. Moreover, some problems that could be detected in principle are often deliberately left undetected in practice.
For example, most C++ compilers do not defend against programs that overrun array boundaries. To do so would impose an overhead that many programmers would find unacceptable. Moreover, a C++ programmer who does not want to incur the risk of unchecked indexing operations can write a class that does the checking explicitly, whether the compiler offers such a feature or not.
More generally, the ability to wrap potentially hazardous operations in a class definition, and then debug those operations only once, means that well-written C++ programs avoid many execution pitfalls as a side effect of their design. Accordingly, the pitfalls described in this section tend to occur at a low level of abstraction.
They are important, however, for two reasons.
First, library authors must ultimately use low-level abstractions as a way of implementing the higher-level abstractions that they offer to their users. It is easy to avoid bounds errors by using an appropriate class, but someone must write that class, and that someone must also understand how to avoid bounds errors.
Second, C++ is designed to grant access to low-level facilities because such facilities can be essential for writing efficient applications. Of course, it is usually wise to isolate the low-level parts of a program and give them special care, but it is often impossible to avoid them entirely.
There is one kind of array bounds violation that is both common and deliberate in C programs, even though C does not officially sanction such programs. The practical effect of such violations is much greater in C++ programs than in C programs, however, which means that this pitfall deserves special attention from C programmers who are beginning to use C++.
8.4.1.1. A C Problem Solved
Suppose we want to design a C data structure that represents a variable-length array of characters. The object is to have a single data structure that holds both the length of the array and the characters that constitute the array. Here is one way to do it:
    struct string1 {
        int length;
        char *data;
    };
Here, the length member says how many characters the structure represents and the data member is a pointer to the initial character of the array; that array is presumably dynamically allocated.
Copying such a data structure is a bit of a nuisance. Suppose we have a pointer to a string1 object:
struct string1 *p;
and we want to create a new string1 object that is a copy of the original. Then we must do something like this:
    struct string1 *q = malloc(sizeof(struct string1));
    q->length = p->length;
    q->data = malloc(p->length);
    strncpy(q->data, p->data, p->length);
Copying a string1 object requires an extra call to malloc. Similarly, freeing one of these objects requires an extra call to free to deallocate the memory to which the data member points.
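The copy and the matching cleanup can be wrapped in small helper functions. What follows is only a sketch under stated assumptions: the names string1_make, string1_copy, and string1_free are hypothetical and do not appear in the original text, and the sketch uses memcpy instead of strncpy because the length is already known. Unlike the fragment above, it also checks the results of malloc.

```c
#include <stdlib.h>
#include <string.h>

struct string1 {
    int length;
    char *data;
};

/* Hypothetical helper: build a string1 holding n characters copied from src.
   Returns NULL if either allocation fails. */
struct string1 *string1_make(const char *src, int n)
{
    struct string1 *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;

    /* Second allocation: the characters live in their own block.
       Ask for at least one byte so that n == 0 is unambiguous. */
    s->data = malloc(n > 0 ? (size_t)n : 1);
    if (s->data == NULL) {
        free(s);
        return NULL;
    }

    s->length = n;
    if (n > 0)
        memcpy(s->data, src, (size_t)n);
    return s;
}

/* Hypothetical helper: deep copy, so the copy owns its own character block. */
struct string1 *string1_copy(const struct string1 *p)
{
    return string1_make(p->data, p->length);
}

/* Hypothetical helper: two calls to free, mirroring the two calls to malloc. */
void string1_free(struct string1 *s)
{
    if (s != NULL) {
        free(s->data);
        free(s);
    }
}
```

A caller would pair each string1_make or string1_copy with exactly one string1_free. Forgetting the inner free leaks the character block, which is the nuisance the text describes.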
8.4.1.2. Another Apparent C Solution
Part of being a good programmer is creative laziness. Why do by hand what a computer can do instead? It is therefore no surprise that C programmers look for an easier way of doing things. Here is one common way:
    struct string2 {
        int length;
        char data[1];   /* [sic] */
    };
Of course, the character array is not just a single character long. Instead, when allocating a string2, we leave extra memory at the end. Anyone using the data member runs off the end of the officially defined array, but because the extra memory was allocated, that formal violation is harmless, at least in C.
Accordingly, if p2 points to a string2, we might copy it this way:
    struct string2 *q2 = malloc(sizeof(struct string2) + p2->length - 1);
    q2->length = p2->length;
    strncpy(q2->data, p2->data, p2->length);
This technique makes it slightly easier to copy a string2 than a string1. It makes it significantly easier to free it, too:
free(q2);
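For comparison with string1, the single-allocation idiom might be packaged as follows. This is a sketch, not a recommendation: the helper names string2_make and string2_copy are hypothetical, and the code deliberately relies on the same formal bounds violation that the text describes, so it works only on implementations that tolerate indexing past the declared end of data. As an added assumption, it never requests less memory than the structure itself occupies, even when the length is zero.

```c
#include <stdlib.h>
#include <string.h>

struct string2 {
    int length;
    char data[1];   /* declared as one character; really `length` of them */
};

/* Hypothetical helper: one malloc covers both the header and the characters. */
struct string2 *string2_make(const char *src, int n)
{
    /* data[1] already accounts for one character, so add n - 1 more.
       Clamp at zero so the block is never smaller than the struct. */
    size_t extra = n > 1 ? (size_t)n - 1 : 0;
    struct string2 *s = malloc(sizeof *s + extra);
    if (s == NULL)
        return NULL;

    s->length = n;
    if (n > 0)
        memcpy(s->data, src, (size_t)n);   /* runs past data[0] by design */
    return s;
}

/* Hypothetical helper: copying needs no second allocation. */
struct string2 *string2_copy(const struct string2 *p)
{
    return string2_make(p->data, p->length);
}
```

Because the characters share a block with the header, a single free(s) releases everything, and no separate cleanup helper is needed.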