Previous | Table of Contents | Next |
8.3.5.4. Object and Value Comparison
Are these all the comparison pitfalls we need to avoid? That is, suppose we have a < defined that always represents an order relation. Is that good enough to give us the behavior we want from our containers? Of course, the answer is noor I wouldnt have asked the question. Having a correctly defined < gives us what we asked for, but what we ask for is not always what we want.
One example of the difference between what we ask for and what we might want comes when we consider the definition of string literals such as abc. Recall that in C++, a string literal actually represents the address of the initial character of a null-terminated array; its type is pointer to char. That definition implies that comparing two string literals is equivalent to comparing two pointers; the pointers are equal if they point to the same memory location and unequal otherwise.
In both C and C++, it is up to the implementation whether or not two identical string literals are stored in the same location. Although it is guaranteed that
cat == dog
is false, the value of
cat == cat
is up to the implementation.
The implication of this definition of comparison is that sets of character pointers may well behave counterintuitively if those pointers represent (the initial characters of) string literals. For example, if we write
set<char*> s; s.insert(cat); s.insert(cat);
it is up to the implementation whether s now contains one element or two. If s contains two elements, those elements are pointers to distinct memory locations, each of which is the initial c of a distinct instance of cat.
The way to avoid this pitfall is to be careful when using sets of pointers, just as one should be careful when using pointers in other contexts. If, for example, we were to say
set<string> s; s.insert(cat); s.insert(cat);
then s is assured of containing only a single element because each insertion implicitly converts cat to a string, and the values of the strings then compare equal. In other words, s would have only a single element because
string(cat) == string(cat)
is assured of being true even if
cat == cat
is false. The point is that comparison of string values is defined in terms of comparing the characters that constitute the strings, not in terms of the addresses of those characters.
8.3.5.5. Containers and Comparisons Discussion
When connecting two pieces of software together, it is important to think about the interface between them and tempting to believe that there is some way to specify that interface that will ensure appropriate behavior.
The world is rarely as neat as that in practice. There may be more than one way of obtaining appropriate behavior, to the point that it is difficult to specify the behavior beyond saying it has to work if you use it in the following way. The behavior may be inappropriate in some cases, but by avoiding those cases, it may still be possible to make the combined software work as intended. Finally, your intentions may not be as obvious as they may appear at first. Even a value as simple as a built-in string literal may behave in a surprising way when used with the built-in comparison operators.
These real-world complexities are among the reasons we talk about software engineering instead of software mathematics.
Our next example comes from a finely tuned portability compromise that C++ inherited from C. In this particular case, the compromise is right for C most of the time but has the potential to cause trouble for C++. Accordingly, the C++ standard library circumvents the problem. As a result, most C++ programmers do not even need to know that the problem existsbut library designers, especially designers of container classes, may need to know about it.
To help us understand the problem, lets begin by asking what is wrong with this example:
int* p[100]; // ... sort(p, p+100);
On the surface, nothing is wrong. We have declared an array of pointers, called p, and called a function named sort, which presumably sorts the elements of the array.
Unfortunately, this example conceals a nasty pitfall, which is rendered all the nastier by the fact that it works without trouble on many machines. The problem is that what is being sorted is an array of pointers, and to sort the elements of the array, we must be able to compare them. Moreover, ordered comparison between pointers is defined only when the pointers both point to elements of the same array (or one past the end of the array, as in the case of p+100 in this example). For example, if we have two arrays
int x[10], y[10];
the mere evaluation of the expression x<y (which is equivalent to &x[0]<&y[0]) is undefined. It is not just that the implementation has a choice about whether to return true or false as the result of such a comparison; rather, the implementation is permitted to do anything it likes, including deleting all your files and sending hate mail in your name to all your friends.
Yet there is nothing wrong with expressions such as &x[0]<&x[3] (which is required to yield true) or even x==y (which is required to yield false). The point is that you are allowed to use == or != to compare arbitrary pointersand, if the pointers have well-defined values, those pointers compare equal if and only if they point to the same object (or are both zero)but the moment you use any of the other four relational operators (<, >, <=, or >=), the pointers must point to elements of the same array or all bets are off.
This restriction actually has its origin in C, but the C++ standard library offers a way around the restriction that does not work readily in C. We will first explore the reason for the restriction, then detail its effects, and finally explain the solution.
Previous | Table of Contents | Next |