7.3. Data Abstraction

The fundamental notion of data abstraction is that it should be possible to define and implement a data structure without exposing the implementation details to the people who use that data structure. For a programming language to support data abstraction, then, it must offer a way of distinguishing the parts of the program that use a data structure from the parts of the program that implement it. Moreover, it must be able to prevent the users of the data structure from getting at the details that are rightfully revealed only to the implementation.

C++ supports data abstraction by allowing the programmer to define data structures that have some of their components identified as private. Private components, as parts of the implementation, are off limits to ordinary uses of the data structure.

7.3.1. Structures and Classes

The main tool for constructing user-defined data types is the class, which can be defined in one of two ways:

   class name { member definitions };

   struct name { member definitions };

The only difference between struct and class is the access assumed by default for members: they are public in a class defined with struct, and private in a class defined with class.
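As a minimal sketch of that default, the following two definitions differ only in the keyword used and in the explicit public label that the second one needs before its member functions can be called from outside. The names OpenPair, ClosedPair, and open_sum are hypothetical and chosen only for illustration.

```cpp
// Hypothetical types, used only to contrast the two defaults.
struct OpenPair {
    int first;    // public by default, because OpenPair is defined with struct
    int second;
};

class ClosedPair {
    int first;    // private by default, because ClosedPair is defined with class
    int second;
public:
    // Members that follow the public label are accessible to every user.
    void set(int a, int b) { first = a; second = b; }
    int sum() const { return first + second; }
};

// Code outside OpenPair may read its members directly.
int open_sum(const OpenPair& p) { return p.first + p.second; }

// The equivalent function for ClosedPair would not compile, because
// first and second are private there:
//   int closed_sum(const ClosedPair& p) { return p.first + p.second; }
```

A user of ClosedPair has to go through set and sum, whereas a user of OpenPair can touch first and second directly.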

The member definitions look like other definitions and can include both variables and functions. It is probably most useful to think of member variables as describing the contents of each object of the class, and of member functions as describing the actions that it makes sense for such an object to perform.

As a simple example, consider the following definition:

   struct Point {
       int x, y;
   };

Here we have defined a new type, called Point, and said that every object of type Point has members x and y, each of which has type int. Thus, if we define a Point object by saying

   Point p;

we have implicitly said that the object p has members x and y, which we can refer to directly by mentioning p.x or p.y.

If we have a pointer to a Point object, such as

   Point* pt;

then we can refer to the x and y components of the object to which pt points by writing (*pt).x or (*pt).y, respectively. These usages are sufficiently common that we can abbreviate them as pt->x and pt->y.
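The two notations can be seen side by side in a short sketch. The function move_by is a hypothetical helper, not something defined in the text; it uses the explicit form for one member and the arrow form for the other only to show that they are interchangeable.

```cpp
struct Point {
    int x, y;
};

// Hypothetical helper: adds dx and dy to the members of the Point
// that pt points to.
void move_by(Point* pt, int dx, int dy) {
    (*pt).x += dx;   // dereference the pointer, then select the member
    pt->y += dy;     // the abbreviated form of (*pt).y
}
```

Calling move_by(&p, 3, 4) on a Point p changes p itself, because the function works through the pointer rather than on a copy.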

Once we have defined a structure, we can use it as if it were any other type. So, for example, we can create arrays of Points, pointers and references to Points, functions that take Points as arguments and return them as results, and so on.
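A brief sketch of these uses follows. The function names midpoint, reflect, and sum_x are assumptions made for the example and do not come from the text.

```cpp
struct Point {
    int x, y;
};

// Takes two Points by value and returns a new Point as the result.
Point midpoint(Point a, Point b) {
    Point m;
    m.x = (a.x + b.x) / 2;
    m.y = (a.y + b.y) / 2;
    return m;
}

// Takes a reference to a Point, so the caller's object is modified.
void reflect(Point& p) {
    int t = p.x;
    p.x = p.y;
    p.y = t;
}

// Walks an array of n Points through a pointer to its first element
// and adds up the x members.
int sum_x(const Point* pts, int n) {
    int total = 0;
    for (int i = 0; i < n; ++i)
        total += pts[i].x;
    return total;
}
```

An array such as Point corners[4]; can be passed to sum_x as sum_x(corners, 4), exactly as an array of int could be passed to a function expecting a pointer to int.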

In order to make it easier to use structures as parts of other structures, C++ ordinarily defines assignment and initialization on structures in terms of assignment and initialization of their elements. So, for example, if we write

   Point p;
   p.x = 7;
   p.y = 3;
   Point q = p;

we rely on being able to initialize the Point object called q from the Point object called p; that initialization is equivalent to initializing q.x from p.x and q.y from p.y independently. As another example, we can define

   struct Box {
       Point origin;
       Point corner;
   };

after which Box assignment and initialization will automatically be defined in terms of Point assignment and initialization, which in turn will automatically be defined in terms of the constituent ints.
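The following sketch exercises that memberwise behavior. The helper copy_and_shift is hypothetical; it exists only to show that initializing one Box from another copies both Points, and through them all four ints, while leaving the original untouched.

```cpp
struct Point {
    int x, y;
};

struct Box {
    Point origin;
    Point corner;
};

// Hypothetical helper: returns a copy of b moved dx units along x.
Box copy_and_shift(const Box& b, int dx) {
    Box result = b;          // initialization copies origin and corner,
                             // which in turn copies their x and y members
    result.origin.x += dx;   // changes the copy only; b is unaffected
    result.corner.x += dx;
    return result;
}
```

Assignment behaves the same way: after Box c; c = b; every int in c has the value of the corresponding int in b.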

Simple as they are, these structures are sufficient to define most kinds of classical data structures. However, as they stand, their structure is completely open. The fact that a Point is a pair of integers is available for the world to see. What we need is a way to hide how a Point is implemented, while still allowing Points to be useful. That is, we would like to define a version of Point that lets us put integers into a Point object and take them out again, but does not reveal how those integers are stored. In C++, we do that by defining member functions that correspond to the actions an object can perform, and then using protection labels to bar access to the parts of the object that constitute the implementation.
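One possible shape for such a Point is sketched below. This is an illustration under assumptions, not necessarily the definition the text goes on to develop: the member function names set, x, and y and the data member names xval and yval are all hypothetical.

```cpp
class Point {
public:
    // The actions a Point offers to its users.
    void set(int new_x, int new_y) { xval = new_x; yval = new_y; }
    int x() const { return xval; }
    int y() const { return yval; }

private:
    // The implementation: users cannot name these members, so the
    // representation could later change without affecting them.
    int xval, yval;
};
```

A user can write p.set(7, 3) and then read the values back with p.x() and p.y(), but an expression such as p.xval is rejected by the compiler outside the class.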
