

Traditional program organization represents how things work in computer software and not how things work in life generally. Technically speaking, all the contents of a computer memory constitute data, but a significant portion of what's stored in memory is a special kind of data called code. The code comprises instructions to the processor: add two numbers, jump to a new address, and so on. Other memory contents make up data as people normally use the term: data that stores information for you.

Code and data are well segregated inside computer memory, because (unlike a human brain) a computer is traditionally built around a single central processor, and this processor treats code and data differently. Large chunks of code have to be organized together into code segments, because the processor executes them sequentially, only occasionally jumping to a new location. Program organization reflects this fact. Typically, code is organized into a hierarchy of functions, each function being a collection of instructions executed as a block. Data is organized into records (structures), arrays, and tables.

Despite all this nice structure, the resulting program is a hierarchy of functions alongside a collection of unrelated data structures. The programming language does not enforce any connection between the two groups.
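A minimal sketch of this traditional layout, written in the C-compatible subset of C++. The names Account, deposit, and apply_bad_adjustment are hypothetical, invented here for illustration. The point to notice is that only convention ties deposit to Account; any other function can reach into the structure directly.

```cpp
// Data: a plain record. Every field is open to every function in the program.
struct Account {
    int  id;
    long balance_cents;
};

// Code: a free function. Only convention says it "belongs" to Account.
void deposit(Account& acct, long cents) {
    if (cents > 0) {              // reject zero or negative deposits
        acct.balance_cents += cents;
    }
}

// An unrelated function can bypass deposit() and its check entirely.
void apply_bad_adjustment(Account& acct) {
    acct.balance_cents = -1;      // compiles without complaint
}
```

Nothing stops a function like apply_bad_adjustment from being written anywhere in the program, which is why the programmer has to keep the function-to-data pairing in mind.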


Figure 1.1  Traditional program organization.

Is this segregation of code and data a problem? It's usually fine for simpler programs. But nothing except the programmer's memory records which data structures are intended to be used by which functions, and when that memory slips, errors arise.

Even the computer's own hardware doesn't fit the traditional model of software, as you'll see if you open your computer. The structure of the computer itself doesn't reflect the code/data distinction! Instead, the board consists of a number of independent chips that are wired to send signals back and forth to each other.


Figure 1.2  Life inside the system box.

In the object-oriented approach, the chips become objects. In C++ terminology, the make and model of a chip (for example, Pentium) is a class; an individual chip is an object. A software emulation of a chip is neither pure code nor pure data. Like a brain cell, a chip has both code (behavior) and data (state information).

If you were to set out to write a program to emulate the internals of computer hardware, you’d find that design and programming would be greatly aided by representing each chip as an object. This is where the object-oriented side of C++ most clearly differs from C: the fundamental unit of program organization becomes not code or data, but the object's class—which is a type containing both code and data.
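As a rough sketch of the idea, here is one way a very simple chip might be emulated. CounterChip is a hypothetical 4-bit counter made up for this example, not a model of any real part. The class plays the role of the make and model; each variable declared with that type is one individual chip.

```cpp
// Hypothetical chip model: a 4-bit counter.
class CounterChip {
public:
    CounterChip() : count_(0) {}

    // Behavior (code): one clock pulse advances the count, wrapping at 16.
    void clock() { count_ = (count_ + 1) % 16; }

    // Reads the chip's current output value (0 through 15).
    int output() const { return count_; }

private:
    // State (data): every CounterChip object carries its own copy.
    int count_;
};
```

Declaring two variables, say CounterChip a and CounterChip b, creates two objects of the same class. Clocking a does not affect b, just as two chips of the same model on a board each hold their own state.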


Figure 1.3  Object organization.

Encapsulation: Cure for a Programming Headache

One of the most important ideas in Figure 1.3 is the black box concept. Chips communicate with each other only through specific pins; except for these interfaces, each chip is a mystery to all the others. There is no way for one chip to reach in and interfere with the internal workings of another chip. Moreover, people who design computers need know nothing about the internal circuitry of a chip as long as the chip's input and output do precisely what the manufacturer says they do.

The benefits of this black box feature are substantial. The system works because each chip has a functional spec that states exactly how the chip behaves. As long as you can find another chip that adheres to the same specification, you can pop out an old chip and replace it with a new one, and the whole system will continue to work just as it did before. It may be that the internal circuitry of the new chip is completely different—but you don't care. As long as they interact in the same way with the rest of the system, any two chips are interchangeable.
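The same interchangeability can be sketched in software. In the example below, an abstract base class stands in for the functional spec; this is one of several ways to express a spec in C++ and is used here only as an illustration. All of the names (SummerSpec, LoopSummer, FormulaSummer, system_output) are hypothetical.

```cpp
// The "functional spec": what any chip of this kind must do.
class SummerSpec {
public:
    virtual ~SummerSpec() {}
    // Returns 1 + 2 + ... + n, for n >= 0.
    virtual long sum_to(long n) const = 0;
};

// Old chip: adds the numbers one at a time.
class LoopSummer : public SummerSpec {
public:
    long sum_to(long n) const {
        long total = 0;
        for (long i = 1; i <= n; ++i) total += i;
        return total;
    }
};

// New chip: completely different internals, same observable behavior.
class FormulaSummer : public SummerSpec {
public:
    long sum_to(long n) const { return n * (n + 1) / 2; }
};

// The rest of the system is written against the spec only.
long system_output(const SummerSpec& chip) {
    return chip.sum_to(100);
}
```

Because system_output depends only on the spec, a LoopSummer can be popped out and a FormulaSummer dropped in without touching the rest of the system.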

Nowhere is there a greater need for interchangeable parts than in software development. In a typical project, a software team constantly rewrites the internals of every part of the program. To make an analogy, the programmers are continually popping out old chips and replacing them with new ones to fix bugs, improve efficiency, or add new features. With the traditional approach, there is no clean division between different parts of the program. The result is often disaster. A change to any part of a program can potentially affect every other part. The work that programmer A is doing, for example, can reach in and refer to the internals of the part that B is writing. But as soon as B makes any changes, all of A’s assumptions become invalid, and errors happen.

To some extent, the features of the C language can be used to mitigate this problem. The connecting links between different modules—the interfaces—can be managed at the file level. Specifically, you have programmers A and B stick to working on different files and then be very careful about which data is shared. Certain facilities of C (such as the extern and static keywords) can be used to control which data and functions in one file are visible to other files.
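A small sketch of this file-level control, using only C features (it also compiles as C++). Imagine the code below is the whole of one source file belonging to programmer B; the names call_count, clamp_percent, set_volume, and volume_call_count are hypothetical.

```cpp
// 'static' at file level: these names are private to this file.
// Code in other files cannot refer to them.
static int call_count = 0;

static int clamp_percent(int value) {
    if (value < 0)   return 0;
    if (value > 100) return 100;
    return value;
}

// No 'static': these functions are visible to other files. Another file
// would declare them before use, for example:
//     extern int set_volume(int percent);
int set_volume(int percent) {
    ++call_count;
    return clamp_percent(percent);
}

int volume_call_count(void) {
    return call_count;
}
```

Programmer A can call set_volume and volume_call_count from another file, but has no way to name call_count or clamp_percent. The protection works, but only at the granularity of a whole file.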

C++ gives you much finer control over the sharing of data and functions between parts of the program. The unit of data protection is no longer the file; it is the class, which can be as large or as small as you want. With C++, you can make any functions or data (members) private at the class level. The rest of the program cannot access these private members; the compiler rejects any attempt to do so. The public members of an object make up its interface. These public members constitute the “pins” that are visible outside an object. The rest of the program can refer to these members, and the assumption is that they will not change. Meanwhile, you can rewrite the internals of a class to your heart's content.
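Here is a minimal sketch of the public/private split. BankAccount and its members are hypothetical names chosen for illustration; the public section is the set of pins, and the private section is the internal circuitry.

```cpp
class BankAccount {
public:
    // Interface: the "pins" the rest of the program may use.
    BankAccount() : balance_cents_(0) {}

    // Adds money to the account. Returns false, changing nothing,
    // if the amount is zero or negative.
    bool deposit(long cents) {
        if (cents <= 0) return false;
        balance_cents_ += cents;
        return true;
    }

    long balance_cents() const { return balance_cents_; }

private:
    // Internals: can be rewritten freely without affecting callers.
    long balance_cents_;
};

// Outside the class, the compiler rejects direct access such as:
//     acct.balance_cents_ = -1;   // error: balance_cents_ is private
```

Because callers can reach the balance only through deposit and balance_cents, the class author could later change how the balance is stored, and code written against the public members would not need to change.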

The clean division between interface and internals is called encapsulation, which is a fancy name for “protecting the insides of something.” C++'s approach to encapsulation not only gives you more control over access but also makes the public/private distinction explicit in the source code. The language is a great aid to documenting which part of a class is its interface and which part is internal.

When you’re first learning about C++ and object orientation, you may be tempted to use the terms object and class interchangeably. But the distinction is important: a class is a type, and an object is an instance of that type. I’ll have a lot more to say about that in Chapter 5, “A Touch of Class.”

