Previous | Table of Contents | Next |
Generic programming in C++ is based on an engineering compromise that is aimed at one of the oldest controversies in programming language design: When is the right time to bind decisions about types? One school of thought, which pervades languages such as Fortran, Algol, and Pascal, is that types should be fixed at the time the program is written. The other, which pervades languages such as Lisp and Smalltalk, is that types should be determined dynamically as the program executes.
One argument in favor of determining types early is that it allows earlier error detection, which should make software more reliableat least in principle. Moreover, doing so enables the compiler to generate more efficient machine code, because it can take the types of variables into account. One argument in favor of determining types late is that programs can be more flexible, so late binding of types yields more expressive power.
The key idea behind C++ generic programming is that there is a third choice of when to fix the types in a program: when the program is compiled. This choice is particularly interesting in the context of libraries, because it permits programmers to write libraries without knowing the specific types of the objects with which those libraries will be used. When someone wants to use one of those libraries, that user will usually know the relevant types, so all the relevant information will still be available by the time the program is compiled.
As a typical example, consider the abs function that we used earlier as an example of overloading:
int abs(int x) { return x<0? -x: x; } short abs(short x) { return x<0? -x: x; } long abs(long x) { return x<0? -x: x; } float abs(float x) { return x<0? -x: x; } double abs(double x) { return x<0? -x: x; }
We had to define this function five times to cover even the most fundamental relevant types. What a nuisance! This is the kind of example that is often used to argue in favor of dynamic types.
In C++, we can abbreviate all these forms of abs by writing
template<class T> T abs(T x) { return x<0? -x: x; }
The use of the word class in the context of a template definition is slightly misleading: It really means any type, but templates are a relatively recent part of C++, and introducing type as a new keyword would have broken too many programs to introduce type as a new keyword. More recently, the usage
template<typename T> T abs(T x) { return x<0? -x: x; }
was defined to be equivalent, but the older usage of class is much more popular at present.
It is reasonable to think of either of these function definitions as an abbreviation for a finite, but unbounded, set of overloaded functions. The compiler will instantiate this function as needed, once for each relevant type T, after which using this abs is as easyand efficientas using the other overloaded versions would be. Moreover, the generic abs function will work with any type that can be compared with zero and supports unary negation.
One might argue that this version of abs is too general. For example, if we have a class that represents complex numbers, this version of abs does the wrong thing. Indeed, it does a senseless thing, because it is meaningless to ask whether a complex number is less than zero. For this and similar reasons, C++ allows template functions to be specialized by writing functions that handle specific types, such as
double abs(const Complex& z) { return sqrt(sq(z.re()) + sq(z.im())); }
which we have defined in terms of the sq function that we defined earlier. In effect, the combination of templates and specializations lets us tell the compiler how to do a limited form of dynamic computation on types while it is compiling the program, while still avoiding any of the execution-time inefficiencies associated with dynamic types.
Templates come in two forms: functions and classes. When we use a template function, the compiler can usually figure out the type to use for instantiation by looking at the argument(s).
So, for example, it is clear that if we call abs(3), we should instantiate abs with the type parameter T being int. When we use a template class, however, the value of the type parameter becomes part of the type itself.
The best way to make this distinction clear is with an example. Suppose, for instance, that we wanted to revise our String class to allow strings of any kind of character, not just plain char. We might begin by defining generic versions of strlen and strcpy:
template<class T> int strlen(const T* p) { int n = 0; while (*p++) ++n; return n; } template<class T> T* strcpy(T* p, const T* q) { T* r = p; while ((*p++ = *q++) != 0) ; return r; }
The only change from the previous versions, aside from turning them into templates, is that we changed the character literal \0 into plain 0 so as not to be biased in favor of any particular element type.
Then we could make our String class generic:
template <class T> class String { friend ostream& operator<<(ostream&, const String&); public: String() { init(null); } String(const T* p) { init(p); } String(const String& s) { init(s.data); } ~String() { destroy(); } String& operator=(const String& s) { if (this != &s) { destroy(); init(s.data); } return *this; } private: static T null[1]; T* data; void init(const T* p) { data = new T[strlen(p) + 1]; strcpy(data, p); } void destroy() { delete[] data; } }; template<class T> T String<T>::null[1]; template<class T> ostream& operator<<(ostream& ostrm, const String<T>& s) { ostrm << s.data; return ostrm; }
Previous | Table of Contents | Next |