

Even though many changes were made, all the code that uses the CStr class still works. For example, the following code, used with the previous version of CStr, continues to work correctly without any change:

CStr string1, string2;

string1.cpy("My name is "); // Copy string to string1
string2.cpy("Bill.");       // Copy string to string2
string1.cat(string2.get()); // Concat string2 onto
                            // string1
puts(string1.get());        // Get string1 data and
                            // print

But even though the externals of CStr haven’t changed, the change to the class has been profound. CStr behaves the same way as before and looks the same to the outside world, but it now uses memory far more efficiently. The new CStr implementation uses the malloc library function to allocate exactly the number of bytes needed for the data at any given moment. The next section walks through how this works.

We’ve had a taste of object-oriented magic: you can alter large parts of the class-definition code all you want without worrying about introducing unforeseen errors into the rest of the program.

This magic is called encapsulation. It means that certain parts of the object are encapsulated, or protected, from the outside. There must be some interaction with the outside world, of course, or the class is useless. This external part is usually called the interface.


Figure 5.6  Encapsulation in the CStr class.

In standard C, any statement can reach into the internals of a data structure and access any part of it. That’s fine until you rewrite the data structure in some way. At that point, your debugging nightmares begin, because every piece of code that refers to the structure must be rewritten, and in a big program, you’ve probably forgotten all the places that refer to your data structure! Encapsulation prevents these nightmares.
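To make the contrast concrete, here is a minimal sketch. The names RawStr, WrappedStr, raw_capacity, and wrapped_length are hypothetical and do not come from the CStr listing; the 256-byte buffer simply echoes the fixed-size storage of the earlier version.

```cpp
#include <string.h>

// Hypothetical C-style structure: any statement can touch the field.
struct RawStr {
    char data[256];
};

// Hypothetical class with the same storage hidden behind an interface.
class WrappedStr {
    char data[256];                 // private by default: the internals
public:
    void cpy(const char *s) {       // public: the interface
        strncpy(data, s, sizeof(data) - 1);
        data[sizeof(data) - 1] = '\0';
    }
    const char *get() const { return data; }
};

// Depends on RawStr's layout. If data were changed to a char pointer,
// this would still compile but would report the size of the pointer.
size_t raw_capacity(const RawStr &r) { return sizeof(r.data); }

// Depends only on get(), so it keeps working if the internals change.
size_t wrapped_length(const WrappedStr &w) { return strlen(w.get()); }
```

Code like raw_capacity is the kind you would have to hunt down after changing the structure; code like wrapped_length is insulated from the change.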

The Dynamic-Memory Implementation: A Walk-Through

In the last section, I claimed that the new implementation for CStr is much more efficient. This is true even though there is more code to execute.

Specifically, the new class is far more efficient in its use of string memory. Every string occupies exactly the memory it needs at every moment. This memory management is not free: more statements are needed to manage the memory. Yet on the whole, the result is a much better string class. Probably the single greatest benefit is that there is no arbitrary limit of 256 bytes of storage. Beyond that, efficient storage is usually more important than execution speed when you make intensive use of strings.

The cpy and cat functions make use of the malloc library function. This function requests a number of bytes from the operating system; if successful, the function reserves a block of memory for use by the program and then returns a pointer to the beginning of the block. If the request fails, it returns a null pointer instead. Calls to malloc should use this general form:

pointer = (base_type *) malloc(total_size_requested);

After the memory is no longer needed, you can free this same block by calling the free function:

free(pointer);
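As a small illustration of the malloc and free pairing, the sketch below copies a string into a freshly allocated block. The helper name copy_string is hypothetical, and the null check is an addition of this sketch rather than something the CStr code does.

```cpp
#include <stdlib.h>
#include <string.h>

// Hypothetical helper: returns a heap-allocated copy of s, or NULL if
// malloc fails. The caller must release the copy with free().
char *copy_string(const char *s) {
    char *p = (char *) malloc(strlen(s) + 1);  // +1 for the null terminator
    if (p != NULL)
        strcpy(p, s);
    return p;
}
```

A caller would write something like char *p = copy_string("Bill."); use the string, and then call free(p) exactly once.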

Under our new implementation, the cpy member function copies a new string into the current string data area. But it has an important side effect. If the new string data is longer or shorter than the current string, the current string must grow or shrink in size. This is particularly crucial if the new string is longer, because there are no unused bytes in this implementation. A string cannot grow in place without corrupting other data.

The solution is to allocate a new block of memory and copy the new data there. The steps are as follows:

1.  Get the length of new string data.
2.  If the new length is different from the current length, then free the current memory block and allocate a new memory block large enough for the new string plus its null terminator.
3.  Copy the new string data into this memory block.

First, the function gets the length of the new string data and stores it in the local variable n:

n = strlen(s);

Next, this length is compared to the current length, and, if they are not equal, the current string data must grow or shrink to the new size—plus one for the null terminator. The easiest way to do that is to simply throw away (free) the current block and then allocate a new one of the correct size. Finally, the class variable nLength is updated to the new length.

if (nLength != n) {
    if (pData)
          free(pData);
    pData = (char*) malloc(n + 1);
    nLength = n;
}

The rest of the function copies the new string data to the object’s current string data (pointed to by class variable pData).

    strcpy(pData, s);
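Put together, the three steps might look like the sketch below. The class name CpySketch, the length function, the const on the parameter, the constructor and destructor, and the malloc failure check are all assumptions added so that the fragment compiles on its own; they are not part of the CStr listing. The sketch also allocates when pData is still null, so that copying an empty string into a fresh object has somewhere to go.

```cpp
#include <stdlib.h>
#include <string.h>

// Assumed stand-in for the CStr class; only cpy is of interest here.
class CpySketch {
    char *pData;    // Pointer to the string data
    int   nLength;  // Length of the string, not counting the terminator
public:
    CpySketch() : pData(NULL), nLength(0) {}  // Added for a defined start state
    ~CpySketch() { free(pData); }             // Added so the sketch doesn't leak
    void cpy(const char *s);
    char *get(void) { return pData; }
    int length(void) { return nLength; }
};

void CpySketch::cpy(const char *s) {
    int n = (int) strlen(s);             // Step 1: length of new data

    if (pData == NULL || nLength != n) { // Step 2: resize if needed
        free(pData);                     // free(NULL) does nothing
        pData = (char *) malloc(n + 1);  // +1 for the null terminator
        if (pData == NULL) {             // Assumption: give up on failure
            nLength = 0;
            return;
        }
        nLength = n;
    }
    strcpy(pData, s);                    // Step 3: copy the new data
}
```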

The cat (concatenate) member function is a little more complicated. It must create a new memory block but also copy data from the current string data, all before finally concatenating the new string. Without this preliminary work, there would be no room for the object’s string data to grow without corrupting other data. The steps are as follows:

1.  Get the length of the new string data. Return immediately if this string is zero length, because there’s nothing more to do.
2.  Allocate a memory block large enough to hold the combined strings.
3.  If there is current string data, copy this data to the new memory block. Then free the old memory block.
4.  Finally, concatenate the new string and update pData to point to the new memory block. Update nLength as well.

The first three lines get the length of the new string data and return immediately if the length is zero:

n = strlen(s);
if (n == 0)
     return;

The function then allocates a memory block large enough for the combined strings. Note that one extra byte is allocated to hold the null terminator.

   pTemp = (char*) malloc(n + nLength + 1);

We now have a new memory block, pointed to by pTemp, big enough to hold the combined strings. The next thing to do is to copy the current string data into this block. Then the memory block that held that data is freed:

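if (pData) {
    strcpy(pTemp, pData);
    free(pData);
} else {
    pTemp[0] = '\0';   // No current data: start with an empty
}                      // string so strcat has a terminator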

At this point, there is a memory block containing the current string, and it has the extra room needed to concatenate the new string data onto the end. After performing the concatenation, the function ends by updating pData and nLength:

   strcat(pTemp, s);
   pData = pTemp;
   nLength += n;

Life and Death of an Object: Constructors et al.

The CStr class has a couple of problems. First, until string data is assigned, the pData member doesn’t point to anything. This is risky, because the user may expect always to get a meaningful address even if it is only the address of an empty string. But the get function simply returns pData, which might be a null pointer.

char *CStr::get(void) { // Return ptr to string data.
   return pData;
}

The best solution is to initialize pData when an object is created. I’ve been relying on member variables being initialized to zero or (in the case of pointers) to NULL. But C++ guarantees this only for global and static objects; the members of a local object start out with unpredictable values. Relying on it is not good programming practice.

There’s a worse problem. Nothing frees the current memory block when the string is destroyed. The result is a memory leak: every time a string is created, initialized, and destroyed, it leaves behind a hole in memory. If you don’t have infinite memory, this could be a problem.
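One way to address both problems is with a constructor and a destructor. The following is only a sketch under stated assumptions, not the class's actual code: the class name CStrLifetime and the length function are hypothetical, and the choice to start with a one-byte empty string is one option among several.

```cpp
#include <stdlib.h>
#include <string.h>

// Hypothetical variant of CStr showing only creation and destruction.
class CStrLifetime {
    char *pData;    // Pointer to the string data
    int   nLength;  // Length of the string, not counting the terminator
public:
    CStrLifetime();
    ~CStrLifetime();
    char *get(void) { return pData; }
    int length(void) { return nLength; }
};

// Constructor: runs when an object is created. It gives pData a real,
// empty string to point to, so get() has a meaningful address to return
// (unless malloc itself fails, which this sketch does not handle).
CStrLifetime::CStrLifetime() {
    pData = (char *) malloc(1);
    if (pData)
        *pData = '\0';
    nLength = 0;
}

// Destructor: runs when an object is destroyed. It releases the
// current memory block, which closes the leak.
CStrLifetime::~CStrLifetime() {
    free(pData);
}
```

Note that this sketch does not handle copying one object to another; two objects sharing the same pData would both try to free it.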

