Chapter 3 Classes, Strings, and Arrays
The last chapter explored Java's primitive data types. This chapter explores Java's reference data types. A primitive data type is one whose value is stored directly in memory. A reference data type stores only a reference to the place where the actual data can be found. There are two reference data types: objects and arrays. Objects and arrays are normally explained very abstractly and at a very high level. My goal in this chapter is to explain them very concretely and at a very low level. By understanding the low-level structure, you can make sure that you're working with Java rather than against it and substantially speed up your programs.
The Heap
The heap is a large block of memory that Java uses to store objects and arrays. Memory in the heap need not be allocated contiguously. When a new object or array is created, the space comes from somewhere in the heap. Exactly where isn't important, or even defined. When an object or array is garbage-collected, the memory that it occupied in the heap is freed. That is, the memory is marked as unused and made available for reuse by other objects.
An object has two parts: its fields and its methods. Each field requires memory to hold a value appropriate to its type. Each method requires memory to hold its arguments and return values and code. However, the memory for the method is needed only when the method is invoked. Furthermore, methods are the same for each instance of the class. Methods therefore are allocated on an as-needed basis in an area of memory called the stack.
Consider the following Point3D class:
public class Point3D {
    double x;
    double y;
    double z;
    // various methods...
}
This class has three double fields. Each double occupies eight bytes. Therefore, each instance of this class needs 24 bytes of memory in the heap. If there is one Point3D object in existence, then exactly 24 bytes of heap memory are needed. If there are two Point3D objects in existence, then 48 bytes of heap memory are needed. If there are three Point3D objects, then 72 bytes of heap memory are needed, and so on.
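The arithmetic above can be sketched in a few lines of Java. This is only an illustration: the names Point3D and FieldSizeSketch are hypothetical (a Java identifier cannot begin with a digit, so the point class is spelled Point3D here), and the calculation counts field data only, as the text does. A real virtual machine may add per-object bookkeeping that this sketch ignores.

```java
// Hypothetical stand-in for the three-field point class described in the text.
class Point3D {
    double x;
    double y;
    double z;
}

class FieldSizeSketch {
    // Bytes needed for the fields of the given number of points:
    // three doubles of Double.BYTES (8) bytes each per instance.
    // Counts field data only, following the arithmetic in the text.
    static long fieldBytes(int instances) {
        return (long) instances * 3 * Double.BYTES;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 3; n++) {
            System.out.println(n + " point(s): " + fieldBytes(n) + " bytes");
        }
    }
}
```

Running main prints 24, 48, and 72 bytes for one, two, and three points.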
Arrays are similar. To determine how much heap memory an array requires, multiply the length of the array by the width of the data type stored in the array. A float array of length 10 thus needs 40 bytes of heap memory; a char array of length 10 needs 20 bytes of heap memory; and a byte array of length 10 needs 10 bytes of heap memory.
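The same length-times-width rule can be written down directly. The class and method names below are made up for illustration; the element widths come from the standard constants Float.BYTES, Character.BYTES, and Byte.BYTES. As in the text, only the element data is counted.

```java
class ArraySizeSketch {
    // Element data for an array: length multiplied by the width of one element.
    static long elementBytes(int length, int widthInBytes) {
        return (long) length * widthInBytes;
    }

    public static void main(String[] args) {
        System.out.println("float[10]: " + elementBytes(10, Float.BYTES) + " bytes");
        System.out.println("char[10]:  " + elementBytes(10, Character.BYTES) + " bytes");
        System.out.println("byte[10]:  " + elementBytes(10, Byte.BYTES) + " bytes");
    }
}
```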
When a new object or array is created, the necessary amount of space is set aside for it in the heap. The new operator returns a reference to the block of memory in the heap where the object or array is stored. The virtual machine is responsible for managing the heap and making sure that the same block of memory is not used for two different objects or arrays at the same time.
The exact size of the heap is system-dependent. However, the heap is finite on all systems. In some Java implementations, the heap can grow if more space is needed. On others the size of the heap is fixed when the virtual machine starts up. Nonetheless, the heap is definitely smaller than the memory (physical or virtual) available on the host computer. If the heap fills up, the runtime system throws an OutOfMemoryError.
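If you are curious how large the heap is on your own system, the standard Runtime class reports it. The sketch below is one way to look; the class name HeapLimitsSketch and the helper usedBytes are invented for this example, and the numbers printed will differ from one machine and virtual machine to the next.

```java
class HeapLimitsSketch {
    // Heap currently occupied: reserved heap minus the free part of it.
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() is the most heap the virtual machine will attempt to use.
        System.out.println("Maximum heap:   " + rt.maxMemory() + " bytes");
        // totalMemory() is the heap reserved right now; it may grow toward the maximum.
        System.out.println("Reserved heap:  " + rt.totalMemory() + " bytes");
        System.out.println("Heap in use:    " + usedBytes() + " bytes");
    }
}
```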
Garbage collection attempts to prevent this from happening by purging objects and arrays from the heap when they're no longer necessary. Exactly how the garbage collector decides what can and cannot be purged from the heap is one of the topics in Chapter 6. For now, all you need to know is that the garbage collector is quite reliable and won't purge anything that you might actually need to use.
Objects of different types require different amounts of memory. The more fields an object has, the more memory it needs in the heap. Objects can contain other objects as fields. For example, consider this class:
public class GridPoint {
    Integer i1;
    Integer i2;
    // various methods...
}
The GridPoint class contains two Integer objects. A GridPoint object does not store the Integer objects themselves in its own block of memory; it stores only references to the Integer objects. References take up four bytes. Therefore, the GridPoint object needs eight bytes of heap memory, regardless of how much heap memory an Integer object requires. Of course, the total memory used by a program will include the memory used by all of the GridPoints, all of the Integers, and all of the other objects stored in the heap.
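One way to see that a GridPoint holds references rather than the Integer objects themselves is to let two GridPoints refer to the same Integer. The sketch below assumes a bare GridPoint with no methods, and the class SharedReferenceSketch is a hypothetical name used only for this demonstration.

```java
// Minimal version of the GridPoint class from the text, without its methods.
class GridPoint {
    Integer i1;
    Integer i2;
}

class SharedReferenceSketch {
    // Two GridPoints whose i1 fields refer to one and the same Integer object.
    static boolean sharesSameInteger() {
        Integer shared = Integer.valueOf(1000);
        GridPoint a = new GridPoint();
        GridPoint b = new GridPoint();
        a.i1 = shared;
        b.i1 = shared;
        // == on references compares identity, not numeric value.
        return a.i1 == b.i1;
    }

    public static void main(String[] args) {
        GridPoint fresh = new GridPoint();
        // A new GridPoint holds two empty references until they are assigned.
        System.out.println("fresh.i1 is null: " + (fresh.i1 == null));
        System.out.println("a.i1 and b.i1 are the same object: " + sharesSameInteger());
    }
}
```

Both lines print true. Only one Integer object exists in the heap, yet each GridPoint has its own reference to it.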
Pointers, Handles, and References
There's a lot of confusion about whether Java does or does not have pointers. If you've never programmed in a pointer-based language like C or Pascal, then you will probably never need to understand pointers. You can rest assured that Java lets you do everything that you normally use pointers to do, especially with respect to data structures. However, if you're accustomed to a pointer-based language like C, then you probably need to be convinced of this statement.
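As a small piece of evidence for the claim about data structures, here is a singly linked list, the classic pointer exercise, built with nothing but Java references. The Node and LinkedListSketch classes are invented for this illustration and are not part of any library.

```java
// One link in the chain: a value plus a reference to the next link.
class Node {
    int value;
    Node next; // null marks the end of the list

    Node(int value, Node next) {
        this.value = value;
        this.next = next;
    }
}

class LinkedListSketch {
    // Walks the list by following references, the job a pointer does in C.
    static int sum(Node head) {
        int total = 0;
        for (Node current = head; current != null; current = current.next) {
            total += current.value;
        }
        return total;
    }

    public static void main(String[] args) {
        Node list = new Node(1, new Node(2, new Node(3, null)));
        System.out.println("Sum of list: " + sum(list)); // prints "Sum of list: 6"
    }
}
```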
What is a pointer?
A pointer is the address of a particular byte of a computer's memory. For example, a computer with eight megabytes of memory has 8 * 1024 * 1024 = 8,388,608 bytes of memory. Therefore, the valid pointers on this system begin at zero and count up to 8,388,607. The first byte of memory has the address zero. The last byte of memory has the address 8,388,607. With a pointer, you can inspect the contents of any byte or group of bytes. Similarly, you can write any value you like at any point in memory. For example, in C, to write the int 768 in the four bytes starting with byte 4,324,682, you would write:
int n = 4324682;
int* m = (int*) n;
*m = 768;
No check is performed to make sure that it makes sense to put the value 768 at memory location 4324682. If you put the wrong value in the wrong place, it can crash your program, your machine, or worse.
These sorts of bugs are common in C programs. Java has eliminated pointers in order to prevent them. Furthermore, pointers open up many security holes, because they allow any program more or less unrestricted access to all parts of the system.
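For contrast with the C fragment, the following sketch shows what happens when a Java program tries to write somewhere it has no business writing, in this case past the end of an array. The class name BoundsCheckSketch is hypothetical. The point is only that the virtual machine checks the index and throws an exception instead of silently overwriting memory.

```java
class BoundsCheckSketch {
    // Attempts to write past the end of a small array and reports the outcome.
    static String writePastEnd() {
        int[] data = new int[4];
        try {
            data[10] = 768; // index 10 does not exist in a 4-element array
            return "write succeeded";
        } catch (ArrayIndexOutOfBoundsException e) {
            return "rejected";
        }
    }

    public static void main(String[] args) {
        System.out.println("Out-of-bounds write: " + writePastEnd());
    }
}
```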