C++ From Scratch

Contents


6

Using Linked Lists


The problem with using arrays is that you must declare their size at compile time rather than at runtime.


compile time--When you compile the program

runtime--When you run the program


This means that you must guess, in advance, how much memory you need to allocate. If you guess wrong and you allocate too little, you run out of room and your program breaks. If, on the other hand, you allocate more than you need, you waste memory.

In Decryptix! this isn't a very big problem because we create only two arrays: one to hold the solution and one to hold the guess. We can just create a pair of arrays large enough to hold the biggest legal solutions and the largest possible guess, and let it go at that.

In other programs, however, fixed size arrays are so wasteful of memory as to be unusable. Software designers are often asked to consider how their program will scale: How will they perform as they become larger and handle more data? Programs that use fixed size arrays rarely scale well.


Scaling a program refers to the capability to do more: to handle larger and more complex data sets, more users, or more frequent access. When a program scales, it becomes bigger and typically more complex, and all the weaknesses in performance and maintainability surface.


To solve the problem of fixed size arrays, we need the capability to store data in some form of data structure or collection that grows dynamically, which means that it grows as it needs to while the program runs.

Dynamic Data Structures

Over the years, computer scientists have struggled with this issue. In the past, procedural programmers created complex data structures to hold data efficiently. Object-oriented programmers talk about collection classes, classes that are designed to hold collections of other objects.


collection class--A class designed to hold a collection of other objects


Collection classes are built on top of traditional data structures as higher-level abstractions, but the problem they are solving is the same: How do we efficiently deal with large sets of data or objects?

We need a variety of collection classes because our needs and priorities differ from program to program. Sometimes we care about adding objects to the collection quickly. Other times, we don't mind if there is a slight delay adding objects, but we want the capability to find objects quickly. In other programs, the emphasis is on using little memory or little disk space.

The Standard Template Library

The C++ Standard Library now offers a suite of collection classes called the Standard Template Library (STL), which is described in coming chapters. The STL classes are designed to hold collections of objects, including built-in objects such as characters and more complex (and dramatically larger) user-defined objects. Most importantly, the STL code has been optimized, debugged, and tested so that you don't have to do this work yourself.

Before considering the STL in detail, however, it is helpful to create our own rather simple collection class, at least once, to see what is involved.

We'll rewrite Decryptix! to use a linked list rather than an array. A linked list is a very simple data structure that consists of small containers that can be linked together as needed, and each of which is designed to hold one object.


linked list--A simple data structure in which each element in the list points to data and to the next element in the list


Each individual container is called a node. The first node in the list is called the head, and the last node in the list is called the tail.


node--An element in a data structure

head--The first node in a linked list

tail--The last node in a linked list


Lists come in three fundamental forms. From simplest to most complex, they are

In a singly linked list, each node points forward to the next one, but not backward. To find a particular node, start at the top and go from node to node, as in a treasure hunt ("The next node is under the sofa"). A doubly linked list enables you to move backward and forward in the chain. A tree is a complex structure built from nodes, each of which can point in two or three directions. Figure 6.1 shows these three fundamental structures.

Figure 6.1 Singly linked, doubly linked, and tree structures.

Linked Lists

We'll build the simplest form of linked list, a singly linked list that is not sorted. Characters are added in the order in which they are received (just as they are in an array).

We'll actually create the linked list three times. The first time we'll take a rather simplistic, traditional approach just to get a good sense of how a linked list works. The second time we'll design a somewhat more object-oriented linked list and see whether we can add some solid object-oriented design heuristics to our solution. Finally, we'll use the linked list to illustrate the concept of abstract data types.

Understanding Linked Lists

Our simplest linked list consists of nothing but nodes. A node is a tiny object with two members. The first member is a pointer to the thing we're storing, in our case a single character. The second member is a pointer to another node. By stringing nodes together, we create a linked list.

When there are no more nodes in the list, the last node points to NULL. Figure 6.2 shows what our linked list looks like. The first node in the list (the head node) points to its data (A) and also to the second node in the list. This second node in turn points to its data and also to the third node. The third node is the tail node, and it points to its data and to null, signifying that there are no more nodes in the list.

Figure 6.2 Simple linked list.

Let's implement this linked list and then see how we might use it, instead of an array, to hold our solution. To get started, however, we need only create the Node class and fill it with a list of characters. Listing 6.1 has the declaration for our Node class.


NOTE: During the development of a program, I'm often confronted with a new technology, in this case the linked list. Rather than trying to figure out how to use it in my program while also figuring out how to implement it, I usually first implement the technology with a simple driver program. That is, I'll take it out of context and create a very simple program that does nothing but exercise the new technology. After it is working, I'll go back and integrate it into the real program.


Listing 6.1 The Node Class Declaration

0:  class Node
1:  {
2:  public:
3:      Node(char c);
4:      ~Node();
5:      void      Display        ()         const;
6:      int        HowMany        (char c)    const;
7:      void    Insert        (char c);
8:  
9:  private:
10:      char    GetChar        ();
11:      Node *    GetNext        ();
12:      char myChar;
13:      Node *    nextNode;
14:  };

Let's start by looking at the constructor on line 3. A node is created by passing in a character by value. Rather than keeping a pointer to the character, our Node class keeps a copy of the character on line 12. With a tiny one-byte object, this is sensible. With larger objects, you'll want to keep a pointer to avoid making a copy.


NOTE: In C++, pointers are typically 4 bytes. With a 1-byte object, it is cheaper to keep a copy (one byte of memory used) than to keep a pointer (4 bytes of memory used). With large user-defined types, it can be far more expensive to make the copy, in which case a pointer or reference is used.


Node provides three methods in addition to its constructor and destructor. On line 5 we see Display(), whose job it is to print the characters that are stored in the list. The method HowMany() also takes a character and returns the number of times that character appears in the list. Finally, Insert() takes a character and inserts it into the list. Listing 6.2 shows the implementation of these simple methods.

Listing 6.2 Implementing Node

0:  #include <iostream>
1:  using namespace std;
2:  
3:  #include "Node.h"
4:  
5:  Node::Node(char c):
6:   myChar,nextNode(0)
7:  {
8:  }
9:  
10:  Node::~Node()
11:  {
12:      if ( nextNode )
13:          delete nextNode;
14:  }
15:  
16:  
17:  void Node::Display() const
18:  {
19:      cout << myChar;
20:      if ( nextNode )
21:          nextNode->Display();
22:  }
23:  
24:  
25:  int Node::HowMany(char theChar) const
26:  {
27:      int myCount = 0;
28:      if ( myChar == theChar )
29:          myCount++;
30:      if ( nextNode )
31:          return myCount + nextNode->HowMany(theChar);
32:      else
33:          return myCount;
34:  }
35:  
36:  void Node::Insert(char theChar)
37:  {
38:      if ( ! nextNode )
39:          nextNode = new Node(theChar);
40:      else
41:          nextNode->Insert(theChar);
42:  }

The constructor on line 5 receives a character and initializes its myChar member variable on line 6. The constructor also initializes its nextNode pointer to zero--that is, to null. When the Node is created, it points to nothing further along in the list.

The destructor on line 10 tests the pointer on line 12, and if the pointer is not NULL, the destructor deletes it.


NOTE: The destructor takes advantage of the C++ idiom that any nonzero value is considered true. Thus, if nextNode is null, its value is 0 and, therefore, false, and the if statement does not execute. If nextNode does point to another node, its value is nonzero and thus true, and that object is deleted.


Display(), on line 17, prints the character that is held by the current node on line 19, and then calls Display() on the nextNode in the list, if any (on line 20). In this way, by telling the first node to display itself, you cause every node in the list to display itself.

A Simple Driver Program

On line 25, HowMany() takes a character and returns the number of times that character exists in the list. The implementation of this is tricky and instructive because this type of implementation is common in C++. Explaining how this works in words is much less effective than tracing it in the debugger. To do that, we need a driver program, shown in Listing 6.3.

Listing 6.3 Driver Program

0:  #include "DefinedValues.h"
1:  #include "List0601_Node.h"
2:  
3:  int main()
4:  {
5:      Node head('a');
6:      head.Insert('b');
7:      int count = head.HowMany('a');
8:      cout << "There are " << count << " instances of a\n";
9:      count = head.HowMany('b');
10:      cout << "There are " << count << " instances of b\n";
11:      cout << "\n\nHere's the entire list: ";
12:      head.Display();
13:      cout << endl;
14:  
15:      return 0;
16:  }
There are 1 instances of a
There are 1 instances of b
Here's the entire list: ab

Before we examine HowMany, let's look at the driver. Its job is to generate two letters and add them to the list. To do this, it creates a first node, called the head node, on line 5, and initializes it with the value 'a'. It then tells the head node to insert one more letter ('b'), starting on line 6.

Our linked list now looks like Figure 6.3.

Figure 6.3 With two nodes.

On line 0 we #include DefinedValues.h, shown in Listing 6.4.

Listing 6.4 DefinedValues.h

1:    #ifndef DEFINED
2:    #define DEFINED
3:    
4:    #include <iostream>
5:    #include <vector>
6:    #include <iterator>
7:    #include <algorithm>
8:    #include <time.h>
9:    #include <utility>
10:    
11:    using namespace std;
12:    
13:    const char alpha[27] = "abcdefghijklmnopqrstuvwxyz";
14:    
15:    const int minPos = 2;
16:    const int maxPos = 10;
17:    const int minLetters = 2;
18:    const int maxLetters = 26;
19:    const int SecondsInMinute = 60;
20:    const int SecondsInHour = SecondsInMinute * 60;
21:    const int SecondsInDay = SecondsInHour * 24;
22:    const int GUESSES_PER_SECOND = 10000;
23:    
24:    const int szInt =  sizeof(int);
25:    const int szChar = sizeof(char);
26:    const int szBool = sizeof(bool);
27:    
28:    #endif

This file serves to include the necessary STL header files, and also to define constants we'll need in this program. We will use this same defineValues file throughout all the sample code for the rest of the book.

Let's not examine how Insert() works just yet, but rather assume that the letter b is in fact inserted into the list. We'll return to how this works in just a moment, but let's first focus on HowMany() works.

The HowMany() Method

On line 7 we ask how many instances of 'a' there are, and on line 9 we ask the same about how many instances of 'b' there are. Let's walk through the call to howMany on line 9. Put a break point on this line, and run the program to the break point.

The program runs as expected and stops at the following line:

     count = head.HowMany('b');

Stepping into this line of code brings you to the top of HowMany():

int Node::HowMany(char theChar) const
{

Let's step line by line. The first step initializes myCount to 0, which you can probably see in the local variables window of your debugger.

Which node are we looking at? We'll be entering the HowMany() method once for each node. How can we tell where we are? Remember that every nonstatic member method has a pointer called the this pointer, which holds the address of the object itself.

You can examine the value of the this pointer in your debugger. Take note of the address of the this pointer while you are here in HowMany(). On my machine, it is 0x0012ff6c, but yours might be different. The particular value doesn't matter--just write down whatever you have. This is the address of the node we're examining.

Step to the next line, where myChar is compared with theChar. Examine the myChar member variable ('a') and the local variable theChar ('b'), again in your local variables window.


NOTE: You might need to expand your this pointer to see the member variables, or you might need to click on a different debugger window to find them.


Clearly, these values are not the same, so the if statement fails. myCount remains at 0.

Step again to the next if statement. The nextNode pointer should be nonzero. On my machine, it is 0x004800a0. Your value will differ; again, although the absolute value doesn't matter, write down whatever you get.

Because nextNode is nonzero, the if statement evaluates true, and you step to the following line:

          return myCount + nextNode->HowMany(theChar);

What do you expect to happen if you step into this line? The first thing to be evaluated is

nextNode->HowMany(theChar);

This calls the howMany() method through the nextNode pointer. This, in fact, calls howMany() on the object to which nextNode points. Remember that nextNode had a value, the address of the next node in the list. Let's step in.

The debugger appears to go to the top of the same method. Where are we? Examine the this pointer in your debugging window (you might first have to step to the first line of the method). On my machine, the this pointer has changed to 0x004800a0, which was exactly the value previously held in the nextNode pointer. Aha! We're now in the second node in the list. We can imagine that our list looks like Figure 6.4.

Figure 6.4 Nodes with addresses.

We are running HowMany in the second node. Once again, HowMany() begins by initializing the local variable myCount, on line 27, to 0. Be careful here, the myCount we're looking at now is local to this iteration of HowMany(). The myCount back in the first node has its own local value.

HowMany() then tests the character that is passed in against the character it is storing on line 28; if they match, it increments the counter. In this case, they do match, so we compare myChar ('b') with theChar (also 'b'). They match, so myCount is incremented.

Stepping again brings us to the next if statement:

30:     if ( nextNode )

This time nextNode is NULL (you should see all zeros in its value in your local variables window). As expected, the second node's nextNode points to NULL. The if statement fails and the else statement executes, returning myCount, which has the value 1.

We step into this line and appear to be right back at the return statement. Examine the this pointer, however, and you'll find that we're back in the first node. The value that is returned (1) is added to the value in myCounter (now 0), and it is this combined value that is returned to the calling function, main().

As an exercise, try revising main() to insert the values a, b, c, b, and b. This produces the linked list that is shown in Figure 6.5. Make sure you understand why HowMany() returns the value 3 when passed in 'b'.

Figure 6.5 Linked list with abcbb.

Insert() in Detail

Now is the time to examine the implementation of Insert(), as shown on line 36 in Listing 6.2 and reproduced here for your convenience:

36:  void Node::Insert(char theChar)
37:  {
38:      if ( ! nextNode )
39:          nextNode = new Node(theChar);
40:      else
41:          nextNode->Insert(theChar);
42:  }

The goal of this method is to insert a new character (theChar) into the list.

Note that on line 39 we use the keyword new to create a new Node object. This is explained in full in just a few pages; for now, all you need to know is that this creates a new object of type Node.

Let's start over, creating the linked list from Figure 6.5, using the code shown in Listing 6.5.

Listing 6.5 Decryptix Driver Program

0:  #include "DefinedValues.h"
1:  #include "List0601_Node.h"
2:  
3:  int main()
4:  {
5:      Node head('a');
6:      head.Insert('b');
7:      head.Insert('c');
8:      head.Insert('b');
9:      head.Insert('b');
10:      int count = head.HowMany('a');
11:      cout << "There are " << count << " instances of a\n";
12:      count = head.HowMany('b');
13:      cout << "There are " << count << " instances of b\n";
14:      cout << "\n\nHere's the entire list: ";
15:      head.Display();
16:      cout << endl;
17:      
18:      return 0;
19:  }
There are 1 instances of a
There are 3 instances of b
Here's the entire list: abcbb

On line 5 we create the first Node object, which we call head. Set a break point on that line and run to the break point. Stepping in takes you to the constructor of the Node object:

 Node::Node(char c):
myChar,nextNode(0)
{
}

This does nothing but initialize the member variables. We now have a node whose myChar character variable contains 'a' and whose nextNode pointer is NULL.

Returning to main(), we step into the call to

     head.Insert('b');

Step into this code from Listing 6.2, which is once again reproduced for your convenience:

36:  void Node::Insert(char theChar)
37:  {
38:      if ( ! nextNode )
39:          nextNode = new Node(theChar);
40:      else
41:          nextNode->Insert(theChar);
42:  }

On line 38 we test to see whether nextNode is NULL. In this case it is, so we must create a new node. The last time we created a node, we simply declared it and passed in the value to store ('a'). This time we do something different, calling the new operator. Why?

Until now, all the objects you've created were created on the stack. The stack, you'll remember, is where all local variables are stored, along with the parameters to function calls. To understand why creating your new node object on the stack won't work, we need to talk a bit more about what the stack is and how it works.

Excursion: Understanding the Stack

The stack is a special area of memory that is allocated for your program to hold the data required by each of the functions in your program. It is called a stack because it is a last-in, first-out (LIFO) queue, much like a stack of dishes at a cafeteria (see Figure 6.6).

Figure 6.6 A LIFO stack.

LIFO means that whatever is added to the stack last will be the first thing that is taken off. Other queues are more like a line at a theater, which is called first in, first out (FIFO): The first one on line is the first one off.


LIFO--Last in, first out, like plates on a stack

FIFO--First in, first out, like people on line to buy tickets at a theater

Interestingly, most airplanes board and unboard coach like a FIFO stack. The people at the rear of the plane are the first to board and the last to get off. Of course, first class is a FIFO structure--first class passengers are the first ones in and the first ones out.


When data is pushed onto the stack, the stack grows; as data is popped off the stack, the stack shrinks. It isn't possible to pop a dish off the stack without first popping off all the dishes placed on after that dish, and it isn't possible to pop data off a stack without first popping all the data added above your data.

A stack of dishes is a fine analogy as far as it goes, but it is fundamentally wrong. A more accurate mental picture is of a series of cubbyholes, aligned top to bottom. The top of the stack is whatever cubby the stack pointer happens to be pointing to. The stack pointer is just a pointer whose job is to keep track of the top of the stack.


stack pointer--A pointer that keeps track of the top of the stack


Each of the cubbies has a sequential address, and one of those addresses is kept in the stack pointer register. Everything below that magic address, known as the top of the stack, is considered to be on the stack. Everything above the top of the stack is considered to be off the stack, and therefore invalid. Figure 6.7 illustrates this idea.

Figure 6.7 The instruction pointer.

When data is put on the stack, it is placed into a cubby above the stack pointer, and then the stack pointer is moved up to indicate that the new data is now on the stack.

When data is popped off the stack, all that really happens is that the address of the stack pointer is changed because it moves down the stack. Figure 6.8 makes this rule clear.

Figure 6.8 Moving the stack pointer.

The Stack and Functions

Here's what happens when a program that is running on a PC under DOS branches to a function:

1. The address in the instruction pointer is incremented to the next instruction past the function call. That address is then placed on the stack, and it will be the return address when the function returns.

2. Room is made on the stack for the return type you've declared. On a system with two-byte integers, if the return type is declared to be int, another two bytes are added to the stack, but no value is placed in these bytes.

3. The address of the called function, which is kept in a special area of memory that is set aside for that purpose, is loaded into the instruction pointer, so the next instruction executed will be in the called function.

4. The current top of the stack is noted and is held in a special pointer called the stack frame. Everything that is added to the stack from now until the function returns is considered local to the function.

5. All the arguments to the function are placed on the stack.

6. The instruction that is now in the instruction pointer executes, thus executing the first instruction in the function.

7. Local variables are pushed onto the stack as they are defined.

8. When the function is ready to return, the return value is placed in the area of the stack that is reserved at step 2.

9. The stack is then popped all the way up to the stack frame pointer, which effectively throws away all the local variables and the arguments to the function.

10. The return value is popped off the stack and assigned as the value of the function call itself.

11. The address that is stashed away in step 1 is retrieved and put into the instruction pointer.

12. The program resumes immediately after the function call, with the value of the function retrieved.

Some of the details of this process change from compiler to compiler, or between computers, but the essential ideas are consistent across environments. In general, when you call a function, the return address and parameters are put on the stack. During the life of the function, local variables are added to the stack. When the function returns, these are all removed by popping the stack.

For our purposes, the most important thing to note about this process is that when a function returns, all the local variables are popped off the stack and destroyed.

As was described previously, if we create the new node in InsertNode on the stack, when the function returns, that node is destroyed. Let's try it. We'll just change Insert to create a local node, and we'll stash away the address of that local Node in nextNode. Listing 6.6 illustrates the change.


WARNING: These changes compile and link, but will crash when you run the program.


Listing 6.6 Local Nodes

0:  void Node::Insert(char theChar)
1:  {
2:      if ( ! nextNode )
3:      {
4:          Node localNode(theChar);
5:          nextNode = &localNode;
6:      }
7:      else
8:          nextNode->Insert(theChar);
9:

When I run this program, it quickly crashes. Here's what happens: When we create the head Node, its nextNode pointer is null. When we call Insert() on the head node, the if statement returns true, and we enter the body of the if statement on line 4. We create a localNode object and assign its address to nextNode. We then return from Insert. At that moment, the stack unwinds, and the local node we created is destroyed.

Now, all that happens when that local node is destroyed is that its destructor runs and the memory is marked as reusable. Sometime later, we might assign that memory to a different object. Still later, we might use the nextNode pointer and bang! the program crashes.

Using new

This is a classic example of when you need to create an object on the heap. Objects that are created on the heap are not destroyed when the function returns. They live on until you delete them explicitly, which is just what we need.

Unlike objects on the stack, objects on the heap are unnamed. You create an object on the heap using the new operator, and what you get back from new is an address, which you must assign to a pointer so that you can manipulate (and later delete) that object.

Let's look again at Insert() from Listing 6.2:

36:  void Node::Insert(char theChar)
37:  {
38:      if ( ! nextNode )
39:          nextNode = new Node(theChar);
40:      else
41:          nextNode->Insert(theChar);
42:  }

The logic of this code is that we test to see whether the nextNode pointer is pointing to an object on line 38; if it is not, we create a new object on the heap and assign its address to nextNode.

When we create the new object, we call new, followed by the type of the object we are creating (Node) and any parameters we need to send to the constructor (in this case, theChar).

If this object does point to another node, we invoke Insert on that object, passing along the character we're trying to solve. Eventually we reach the end of the list--a node that does not point to any other node--and we can create a new node and tag it to the end of the list.

new and delete

Many details are involved in using new effectively, which we'll discuss as we come to them. There is one, however, that I want to discuss immediately. When you create an object on the heap using new, you own that object, and you must clean it up when you are done with it. If you create an object on the heap and then lose the pointer, that object continues to use up memory until your program ends, but you can't access that memory.

When you have an object that you can't access anymore but that continues to consume memory, we say it has leaked out of the program. Memory leaks are of significant concern to C++ programmers. You solve memory leaks by the judicious application of delete(). We see this in the destructor in Listing 6.2, copied here:

10:  Node::~Node()
11:  {
12:      if ( nextNode )
13:          delete nextNode;
14:  }

When we are ready to destroy the linked list, we call delete on the head node (implicitly by returning from the function in which the head node was created on the stack, or explicitly if the head node was created on the heap). The destructor examines its own nextNode, and if the nextNode pointer is not null, the destructor deletes the node to which it points. This mechanism knocks down all the dominoes, each object deleting the next object as part of its own sequence of destruction.

Let's modify main() to create the head node on the heap, and we'll delete it explicitly when we're done with the list. To make all this clear, we'll add printouts to the constructors and destructors to see our progress. Listing 6.7 is the entire program, which we'll walk through in some detail.

Listing 6.7a Node.h

1:    class Node
2:    {
3:    public:
4:        Node(char c);
5:        ~Node();
6:    
7:        void    Display        ()            const;
8:        int        HowMany        (char c)    const;
9:        void    Insert        (char c);
10:    
11:    private:
12:        char    myChar;
13:        Node *    nextNode;
14:    };

Listing 6.7b[em]Node.cpp

15:    #include <iostream.h>
16:    #include "List0603a_Node.h"
17:    
18:    Node::Node(char theChar):
19:    myChar(theChar),nextNode(0)
20:    {
21:        cout << "In constructor of Node(" << this << ")\n";
22:    }
23:    
24:    Node::~Node()
25:    {
26:        cout << "In destructor of Node(" << this << ")\n";;
27:        if ( nextNode )
28:            delete nextNode;
29:    }
30:    
31:    
32:    void Node::Display() const
33:    {
34:        cout << this << ": " << myChar << endl;
35:        if ( nextNode )
36:            nextNode->Display();
37:    }
38:    
39:    
40:    int Node::HowMany(char theChar) const
41:    {
42:        int myCount = 0;
43:        if ( myChar == theChar)
44:            myCount++;
45:        if ( nextNode )
46:            return myCount + nextNode->HowMany(theChar);
47:        else
48:            return myCount;
49:    }
50:    
51:    void Node::Insert(char theChar)
52:    {
53:        if ( ! nextNode )
54:            nextNode = new Node(theChar);
55:        else
56:            nextNode->Insert(theChar);
57:    }

Listing 6.7c Driver.cpp

58:    #include <iostream.h>
59:    #include "List0603a_Node.h"
60:    
61:    int main()
62:    {
63:        Node * pHead = new Node('a');
64:        pHead->Insert('b');
65:        pHead->Insert('c');
66:        pHead->Insert('b');
67:        pHead->Insert('b');
68:        int count = pHead->HowMany('a');
69:        cout << "There are " << count << " instances of a\n";
70:        count = pHead->HowMany('b');
71:        cout << "There are " << count << " instances of b\n";
72:        count = pHead->HowMany('c');
73:        cout << "There are " << count << " instances of c\n";
74:        cout << "\n\nHere's the entire list:\n";
75:        pHead->Display();
76:        cout << "Deleting pHead..." << endl;
77:        delete pHead;
78:        cout << "Exiting main()..." << endl;
79:        return 0;
80:    }
81:    In constructor of Node(0x00430060)
82:    In constructor of Node(0x00431DB0)
83:    In constructor of Node(0x00431D70)
84:    In constructor of Node(0x00431D30)
85:    In constructor of Node(0x00431CF0)
86:    There are 1 instances of a
87:    There are 3 instances of b
88:    There are 1 instances of c
89:    
90:    
91:    Here's the entire list:
92:    0x00430060: a
93:    0x00431DB0: b
94:    0x00431D70: c
95:    0x00431D30: b
96:    0x00431CF0: b
97:    Deleting pHead...
98:    In destructor of Node(0x00430060)
99:    In destructor of Node(0x00431DB0)
100:    In destructor of Node(0x00431D70)
101:    In destructor of Node(0x00431D30)
102:    In destructor of Node(0x00431CF0)
103:    Exiting main()...

The best way to follow the progress of this code is to put the code in the debugger and set a break point in main() at line 63. Run the program to the break point and note that we are going to create a new node and initialize it to hold the letter 'a'. The address of this new node is stashed away in the Node pointer pHead. Now, step in at line 63. You might step into the new operator. If so, just step back out and step in again, which brings you to the constructor for Node on line 18.

Sure enough, the character 'a' was passed in (now designated theChar), and this character is used to initialize the member variable myChar. The node's second member variable, nextNode, which is a pointer to a Node object, is also initialized with the value 0 or NULL. Finally, as you step through the constructor, a message is printed on line 21, the effect of which is shown on line 81.

Notice that the this pointer is not dereferenced, so its actual value is printed: That is, the address of the Node object on the heap whose constructor we are now discussing.

If you continue stepping, you'll return from the constructor back to main() on line 64, where we intend to call the Insert() method on that Node. We do so indirectly, using the pointer that holds its address, and we pass 'b' to the Insert method in the hope that a new Node will be created and appended to the list to hold this new value.

Step in at on line 30 and you enter the Insert() method of Node on line 51, where the parameter theChar holds the value 'b'. On line 53 you test the Node's nextNode pointer, which is NULL (or 0), the value to which you initialized it just a moment ago. The if statement returns true. Take a moment and reflect on why.

If a pointer has a nonzero value, it evaluates true. With a 0 value, it evaluates false. Thus, asking if a pointer is false is the same as asking if it is null.

The not operator turns false to true. Thus, (! nextNode) will evaluate true if nextNode is zero (false). Thus

if ( ! nextNode )

will evaluate true and the if statement will execute as long as nextNode points only to NULL (zero).

To most programmers, this is so idiomatic that

if ( ! nextNode )

really means "if nextNode doesn't yet point to anything..." and we don't much think through all the convoluted logic that makes that work.

In any case, the if statement executes by calling newNode, passing in theChar, and assigning the address that results to nextNode. Calling new immediately invokes the constructor, so stepping into this line brings us to the Node constructor on line 18. Once again, a message is printed on line 82), and we return from the constructor, assigning the address that is returned from new to the nextNode pointer of the first node.

Before leaving Insert, let's examine the this and the nextNode pointers. You should find that this has an address that is equal to the first address printed because we are now back in that first node. You should find that the nextNode pointer has the address of the object we just created. Sure enough, we now have a linked list with two objects in it.

Continuing causes us to return from Insert() back to main(), where the same logic is repeated to insert 'c', 'b', and once again 'b'.

If you don't want to work your way through the logic repeatedly, continue to step over these lines until you reach line 70. We are now ready to determine how many instances of 'b' exist in the program.

Step into this line of code. This takes you into Node::HowMany() on line 40, in which the parameter theChar has the value 'b'. On line 42 we'll initialize myCount to 0. On line 43 we test myChar, which has the value 'a', against theChar, which has the value 'b'. This test fails, so we fall through to line 45, where we test to see whether nextNode is nonzero; it is. This causes us to execute the body of the if statement:

          return myCount + nextNode->HowMany(theChar);

Step into this line. You find yourself in HowMany() for the second node. Continue stepping. myChar is 'b' this time, and it thus matches theChar. We increment myCount and test nextNode. Again it is nonzero, so again we step in, this time to the third node in the list.

In the third node, myChar is 'c', so it does not match myChar; but nextNode is nonzero, so we step into the fourth node.

In the fourth node, mychar is 'b', and we increment myCount to 1. Why is it set to 1 and not to 2, given that this is the second node with 'b'? The answer is that myCount is local to this invocation of HowMany() and therefore can't know about the previous values. Again, nextNode is nonzero, so we now step into the fifth node.

Take a look at the this pointer and expand it in your local variables window. You are now looking at the local member variables for the fifth node object. myChar is 'b', and nextPointer is 0. Thus, we increment myCount; then the test for nextPointer fails, so we return myCount.

We thus return the value 1 to the call from the fourth node. This value is added to the myCount variable (also 1), summing to 2, and this value is now returned to the third node. The third node's myCount is 0, so the value 2 is now returned to the second node. Its myCount variable is 1, so 3 is returned to the first node. The first node's myCount is 0, so 3 is returned to main().

It is very important that you understand how this was accomplished. You might find that using pencil and paper and drawing a picture (see Figure 6.9) makes this easier to understand.

Figure 6.9 Walking the list to get the answer.

As you continue to step out of the function calls, you'll find yourself popping up through the calls to HowMany(), unwinding your way from Node 5 to 4, 3, 2, and back to Node 1. Step into and out of this set of calls repeatedly until you can match what is happening in your debugger to the diagram in Figure 6.9.

When this is comfortable, continue stepping over code lines until you get to the call to Display on line 74. Here you are calling Display() on pHead. Step into this method call and you'll be in Display() for your first node on line 32. Step into the method and note the address of the this pointer, which confirms that you are in the first Node in the list. myChar is printed on line 34 (printing 'a'), and the nextNode pointer is checked on line 45. It returns true, so Display() is called on the second node in the list.

Step into this call, and you are back at line 32. Step in and notice that the this pointer now reflects the address of the second node, as you'd expect. On line 34, the member variable myChar is printed ('b'), and once again we call Display(), this time on the third node.

This continues until the fifth node prints its value. Because the fifth node does not point to another node, the if statement fails, and we return through the various Display() method invocations, back to main().

On line 77 we call delete on pHead. To see what this does, place a break point on line 24 and go until the break point. You find yourself in the destructor of the head node. On line 26 we print the address of the first (head) node, and then on line 27 we test nextNode, which points to the second node. We delete that object, causing us to come to the destructor for the second node, where the logic is repeated. The second node deletes the third node, the third Node deletes the fourth Node, and the fourth node deletes the fifth.

The client of the linked list, in this case main(), never had to call HowMany() or Display() on any node except the head node, and it doesn't have to delete any node except the head node. The maintenance of the list is managed by the list itself. Commands such as Display() or delete are passed along the list as required, each node falling like a domino into the next in the list.

Using Our Simple Linked List in Decryptix!

We are just about ready to change the type of the member variable solution in the Game class. Until now, it has been an array; we want it to be a linked list.

Let's examine all the places we use Solution to see what our linked list must be able to accomplish, and whether our list of Nodes is up to the task.

Following are the lines in Game in which we refer to the solution:

                  theGame.Display(theGame.GetSolution());
                  int howManyInAnswer = howMany (solution, alpha[i]);
                  if ( thisGuess[i] == solution[i] )

That is, we must have the capability to retrieve the solution and display it, count the instances of a particular character in the solution, and retrieve the character at a given offset.

Rather than expose the workings of the Node object to the Game, I've chosen to create a small class that will serve as an interface to the nodes, which I'll call LinkedList. Listing 6.8 shows the declaration of the LinkedList class.

Listing 6.8 LinkedList Declared

1:  
2:  class LinkedList
3:  {
4:  public:
5:      LinkedList();
6:      ~LinkedList();
7:      bool    Add            (char c, bool dupes = true);
8:      void    Display        ()        const;
9:      int        HowMany        (char c)    const;
10:      char    operator[]    (int offset);
11: private:
12:      Node *    headNode;
13:  };

To the client (in this case, Game), LinkedList is the linked list structure. The client is oblivious to the existence of nodes (note that headNode is private).

The best way to understand the implementation of these methods is to see them in action. Let's change Game to use a LinkedList as its solution, as shown in Listing 6.9.

Listing 6.9 The Game Class Declaration

0:  #include "List0606_LL.h"
1:  
2:  class Game
3:  {
4:  public:
5:      Game            ();
6:      ~Game            ()        {}
7:      void    Display    (const LinkedList * pList) const
7a:      { pList->Display(); }
8:      const LinkedList &  GetSolution    () const { return solution; }
9:      void    Play    ();
10:      void    Score    (const char * thisGuess, int &
10a:            correct, int & position);
11:  
12:  private:
13:      int        HowMany    (const char * theString, char theChar);
14:  
15:      bool        duplicates;
16:      int        howManyLetters;
17:      int        howManyPositions;
18:      int        round;
19:      LinkedList        solution;
20:  };

Game is unchanged except for the last line, where the solution member variable is changed to type LinkedList.


NOTE: When one class contains another, as Game contains LinkedList, it can do so by value or by reference. LinkedList contains Node by reference (see Figure 6.10).


Figure 6.10 Containing the node by reference.


NOTE: Game, on the other hand, contains LinkedList by value and is diagrammed in the UML as shown in Figure 6.11. The filled in diamond indicates by value.


Figure 6.11 Containing linked list by value.

Run it!

Let's run through one play of Decryptix! and see how the LinkedList is used. Our driver program is unchanged, as shown in Listing 6.10. We begin by instantiating a Game object, which brings us into Game's constructor as shown in Listing 6.11.


When you make an instance of an object, you are said to instantiate it.



NOTE: To save room, I've left out the beginning of Game's constructor, in which the member variables howManyLetters, howManyPositions, and duplicates are set because this logic is unchanged.


Listing 6.10 Decryptix!.cpp

{
while ( choice != 'y' && choice != 'n' )
0:  #include "DefinedValues.h"
1:  #include "List0607_Game.h"
2:  
3:  int main()
4:  {
5:      cout << "Decryptix. Copyright 1999 Liberty";
5a:        cout << Associates, Inc. Version 0.4\n\n" << endl;
6:      bool playAgain = true;
7:  
8:      while ( playAgain )
9:      {
10:          char choice = ' ';
11:          Game theGame;
12:          theGame.Play();
13:  
14:          cout << "\nThe answer: ";
15:          theGame.GetSolution().Display();
16:          cout << "\n\n" << endl;
17:  
18:          while ( choice != 'y' && choice != 'n' )
19:          {
20:              cout << "\nPlay again (y/n): ";
21:              cin >> choice;
22:          }
23:  
24:          playAgain = choice == 'y' ? true : false;
25:      }
26:          
27:      return 0;
28:  }

Listing 6.11 Implementing Game

1:    Game::Game():
2:        round(1),
3:        howManyPositions(0),
4:        howManyLetters(0),
5:        duplicates(false)
6:    {
7:    
8:    //...
9:    
10:        srand( (unsigned)time( NULL ) );
11:    
12:        for ( int i = 0; i < howManyPositions; )
13:        {
14:            int nextValue = rand() % (howManyLetters);
15:            char c = alpha[nextValue];
16:            if ( solution.Add(c, duplicates) )
17:                i++;
18:        }
19:    
20:        cout << "Exiting constructor. List: ";
21:        solution.Display();
22:        
23:    }

We pick up the logic on line 14, within the for loop in which we create our random numbers, turn them into characters on line 20, and then add them to solution on line 21. It is here, when we call solution.add(), that the logic of the LinkedList comes into play. This invokes LinkedList::Add(), as shown in Listing 6.12.

Listing 6.12 Implementing LinkedList

1:  
2:  bool LinkedList::Add(char theChar, bool dupes)
3:  {
4:      bool inserted = false;
5:  
6:      if ( ! headNode )
7:      {
8:          headNode = new Node(theChar);
9:          inserted = true;
10:      }
11:      else if ( dupes || HowMany(theChar) == 0 )
12:      {
13:          headNode->Insert(theChar);
14:          inserted = true;
15:      }
16:  
17:      return inserted;
18:  }
19:  
20:  int LinkedList::HowMany(char theChar) const
21:  {
22:      return headNode->HowMany(theChar);
23:  }
24:  
25:  char LinkedList::offset(int offSetValue) 
26:  {
27:      Node * pNode = headNode;
28:      for ( int i = 0; i < offSetValue && pNode; i++ )
29:          pNode = pNode->GetNext();
30:  
31:      ASSERT ( pNode );
32:      char c =  pNode->GetChar();
33:      return c;
34:  }
35:  
36:  char LinkedList::operator[](int offSetValue)
37:  {
38:      Node * pNode = headNode;
39:      for ( int i = 0; i < offSetValue && pNode; i++ )
40:          pNode = pNode->GetNext();
41:  
42:      ASSERT ( pNode );
43:      char c =  pNode->GetChar();
44:      return c;
45:  }

On line 6, LinkedList checks to see whether it already has a headNode. To do this, it checks its headNode pointer, which was initialized to NULL in LinkedList's constructor. If that is pointer is still NULL, no headNode exists, so one is now created, passing in the character to be stored.

The constructor to Node was considered earlier (refer to Listing 6.3). Remember that the Node's member variable myChar is initialized with the character passed in (theChar), and its member variable nextNode is initialized with NULL as shown in Listing 6.13.

Listing 6.13 Node Constructor

0:  Node::Node(char theChar):
1:  myChar(theChar),nextNode(0)
2:  {
3:  }

This effectively adds the character to the linked list, storing it in the head node.

If there already is a HeadNode, (that is, the pointer is not NULL), we have a list already and must decide whether we want to add the character. If we are accepting duplicates or if the character does not appear in the list already (line 11), we add it by calling Insert on the headNode (line 13). I already described HeadNode()::Insert in Listing 6.3.

We determine whether the character is in the list on line 11 by calling LinkedList::HowMany(), as shown in Listing 6.14.

Listing 6.14 LinkedList's HowMany Method

0:  int LinkedList::HowMany(char theChar) const
1:  {
2:      return headNode->HowMany(theChar);
3:  }

As you can see, LinkedList does nothing but pass along the character to the headNode, calling the logic that was considered earlier (in Listing 6.3). The LinkedList method HowMany() is considered a wrapper method: It wraps around the Node::HowMany() method, encapsulating its interface, but delegating all the work to the encapsulated method.


wrapper--A class is a wrapper for another when it provides a public interface but delegates all the work to the contained class.

method--A method is a wrapper for another method when it provides an encapsulating interface to that method but delegates all the work to the wrapped method.


Playing the Game

After the solution member linked list is populated, the Game object is fully constructed and the next line in main() is a call to the Game's Play() method. This was considered earlier, and you probably remember that Play() solicits a guess from the player and then calls Game::Score().

Game::Score() was also considered earlier, but because solution is now a linked list, this is worth another look. I've reproduced Game::Score() in Listing 6.15 for your convenience.

Listing 6.15 The Score Method

0:void Game::Score(const char * thisGuess, int & correct, int & position)
1:  {
2:      correct = 0;
3:      position = 0;
4:  
5:  
6:      for ( int i = 0; i < howManyLetters; i++)
7:      {
8:          int howManyInGuess = HowMany(thisGuess, alpha[i]);
9:          int howManyInAnswer = solution.HowMany(alpha[i]);
10:    correct += howManyInGuess < howManyInAnswer ?
10a:            howManyInGuess : howManyInAnswer;
11:      }
12:  
13:      for (  i = 0; i < howManyPositions; i++)
14:      {
15:          if ( thisGuess[i] == solution[i] )
16:              position++;
17:      }
18:  
19:      ASSERT ( position <= correct )
20:  
21:  }

On line 9, we must determine how many times each character appears in solution. We do this by calling HowMany() on solution, passing in each character in turn. This calls LinkedList::HowMany(), which, as we just saw, calls Node::HowMany().

On line 15, we compare the letter at each offset into thisGuess with the letter at the same offset in solution. The Node class does not provide an offset operator, but the LinkedList class must if we are to make this comparison.

Solving the Problem with a Member Method

You can solve this problem by implementing an offset method and calling that method by changing line 15 to read

15:        if ( thisGuess[i] == solution.offset(i) )

The implementation of the offset method is shown in Listing 6.16.

Listing 6.16[em]The offset Operator

0:  char LinkedList::offset(int offSetValue)
1:  {
2:      Node * pNode = headNode;
3:      for ( int i = 0; i < offSetValue && pNode; i++ )
4:          pNode = pNode->GetNext();
5:  
6:      ASSERT ( pNode );
7:      char c =  pNode->GetChar();
8:      return c;
9:  }

The offset is passed into this method as a parameter. We make a new, local pointer, pNode, and we assign to that pointer the address that is currently held in headNode. Thus, both pointers now point to the same object on the heap, as shown in Figure 6.12.

The goal of the for loop on line 3 in Listing 6.16 is to tick through the linked list, starting at the head node, and find the node that corresponds to offSetValue.

As we tick through each node, we must also check that pNode continues to point to a valid node (that is, we have not reached the end of the list). This protects us from trying to get an offset that is too large for the list.

We put the test for whether the offset continues to point to a valid Node right into the for loop by adding it to the test condition:

i < offSetValue && pNode;

This tests that the counter i is not greater than the offset value that was passed in (that is, we're still ticking through the list) and that pNode has a nonzero (that is, non-NULL) value.

On line 4 we assign to pNode the address returned by GetNext(). The call to Node::GetNext() simply returns the address that is stored in that node's nextNode pointer. The net effect of this is that pNode now points to the next node in the list.

This for loop continues until i is no longer less than offSetValue. Thus, if offSetValue is 5, this for loop causes pNode to point to the sixth node in the list, just as you want it to.

On line 6, we assert that we are still pointing to a valid object (belt and suspenders!), on line 7 we extract from that node the character it holds, and on line 8 we return that character.

The net effect of all this is that if you call offset(5), you get back the character at the fifth offset: that is, the sixth character in the list.

This works great, but it isn't how arrays work. You never write

myArray.offset(5);

You write

myArray[5];

It would be nice if our linked list supported the same syntax.

Operator Overloading

C++ provides the capability for the designer of a class to give that class the same operators that are available in the built-in class, such as +, -, ==, <, >, and so forth. This is called operator overloading. The goal of operator overloading is to allow your class to behave as if it were a built-in type.


operator overloading--The ability to program operators such as plus (+) or assignment (=) for classes


How You Accomplish Operator Overloading

C++ has a specific syntax dedicated to creating operators for your classes. We'll examine how the offset operator ([]) is overloaded because that is what we need right now in the code; we'll return to operator overloading throughout the book, however, because it is a powerful technique with many subtleties.

To create the offset operator, you use the following syntax:

char operator[](int offSetValue);

When you write solution[5], the compiler automatically converts this to solution.operator[](5).

The implementation for this overloaded operator is identical to the offset method shown in Listing 6.14, as illustrated in Listing 6.17. The only difference is in the signature of the method.

Listing 6.17 LinkedList's offset Operator

0:  char LinkedList::operator[](int offSetValue)
1:  {
2:      Node * pNode = headNode;
3:      for ( int i = 0; i < offSetValue && pNode; i++ )
4:          pNode = pNode->GetNext();
5:  
6:      ASSERT ( pNode );
7:      char c =  pNode->GetChar();
8:      return c;
9:  }

As you can see, the body is identical; it is just the signature on line 0 that is different:

0:  char LinkedList::operator[](int offSetValue)

Once again, the return value is a char, but this time we see the keyword operator, followed by the operator we're overloading, and then the parameter that is needed by that operator.

You invoke this method by writing

solution[5];

which the compiler translates into

solution.operator[](5);

setting the parameter offset to the value passed in (5).

With this implementation in place, the Play() method can test the value of solution[i], and thus the Play() method remains unchanged from when we were using arrays.

Passing Objects by Value

After the Game is fully constructed, we return to main(), where the Play() method is invoked. When we return from Play(), we see this line:

          theGame.GetSolution().Display();

This causes the Game's GetSolution() method to be called, returning a reference to a LinkedList; then the method Display() is called on that reference to a LinkedList object.

It is clearer to write

          const LinkedList & linkedListConstantReference = 
               theGame.GetSolution();
          linkedListConstantReference.Display();

This compiles equally well and makes a bit more explicit what we're up to.

We declare linkedListConstantReference to be a reference to a constant LinkedList object because that is what GetSolution() is declared to return in Game.h:

class Game
{
//...
     const LinkedList &  GetSolution () const { return solution; }
//...
}

Let's pick this apart:

Because the return value is declared to be a constant reference to a LinkedList, we actually don't return solution itself. Instead, we return a constant reference to solution.


NOTE: The fact that we return a reference to a constant object limits what you can do with that object. For example, you can only call constant member methods using that reference. If the reference is to a constant object, you can't change that object, and calling a method that is not constant risks changing the object. The compiler enforces this constraint. We're fine here because the only method we call with this reference is Display(), which is a constant member method of LinkedList.


Why Is it a Reference?

Why bother returning the LinkedList object by reference at all? Why not just return it by value?

     LinkedList GetSolution () const { return solution; }

This has the advantage of not being constant: You can do anything you want with this object. Because it is a copy of the original, you won't affect solution at all. The net effect in this case is the same: You can still call Display, only this time you'll call it on the copy.

The answer is that it is more expensive to make a copy, but to understand why, we must examine what happens when you pass an object by value, this is the subject of the next chapter, "Creating Object-Oriented Linked Lists."


Contents

© Copyright 1999, Macmillan Computer Publishing. All rights reserved.