

What is a handle?

A handle is a pointer to a pointer. That is, a handle points to a location in memory where the address of the actual data can be found. The advantage of handles over raw pointers is that an object can be moved in memory by updating only the single pointer stored at that location, while every handle the program holds remains valid. This has significant advantages for keeping memory compact and unfragmented.

For example, after a program has run for some time, many objects will have been constructed and garbage collected. This can make a heap very fragmented, as shown in Figure 3-1. Each block is a word. The gray blocks are words in use. The white blocks are free words. Of course, real heaps have many more words than this, but this is sufficient for a demonstration.


Figure 3-1  A fragmented heap.

Suppose, with the heap in this state, that you need four words for an object. There is plenty of space in the heap, but it’s fragmented. There is no one place where you can get four words of contiguous memory. To make space for the new object, you have to move some of the allocated blocks around in memory. However, this can cause problems if the running program has pointers straight into the heap. For example, consider Figure 3-2. This is the same heap, with object variables shown as ovals. The arrows are pointers into the heap. Each object has at least one pointer (to its own data), and some have multiple pointers if they themselves contain references to other objects. Furthermore, one object may be pointed to from several different places. This interconnected web of pointers makes it very difficult to move objects in the heap, because you have to update all of the different pointers that can exist in hundreds of different objects, methods, and threads.


Figure 3-2  A heap with pointers.

Now look at what happens if you just willy-nilly compact the heap by moving all the data down to the bottom. Figure 3-3 shows the result. Now there is space for a four-word object. However, many — perhaps most — of the pointers are broken. Some now point to the wrong object. Others point to nowhere in particular. The VM can try to identify every reference to each moved object in the running program and update it with the new address of its data, but there can be thousands of these, and the operation can be extremely time-consuming in a large program.


Figure 3-3  A compacted heap.
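The breakage is easy to reproduce in miniature. The following is only a toy sketch, not how any real VM lays out memory: the heap is assumed to be an int array in which 0 marks a free word, a "pointer" is simply an array index, and the class name DirectPointerDemo and the method compact are made up for this illustration.

```java
import java.util.Arrays;

// Toy model: the heap is an int array, 0 means a free word,
// and a raw "pointer" is just an index into that array.
class DirectPointerDemo {
    // Slide all words in use to the bottom of the heap, preserving their order.
    static int[] compact(int[] heap) {
        int[] result = new int[heap.length];
        int next = 0;
        for (int word : heap) {
            if (word != 0) {
                result[next++] = word;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int[] heap = {7, 0, 8, 0, 9, 0, 0, 0};
        int pointerToEight = 2; // raw address of the word holding 8
        int pointerToNine = 4;  // raw address of the word holding 9

        int[] compacted = compact(heap);
        System.out.println(Arrays.toString(compacted)); // [7, 8, 9, 0, 0, 0, 0, 0]
        System.out.println(compacted[pointerToEight]);  // 9: now points to the wrong object
        System.out.println(compacted[pointerToNine]);   // 0: now points to free space
    }
}
```

Nothing updated the two saved addresses when the data moved, so one of them silently reads another object's data and the other reads an empty word.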

How can the references be arranged in such a way that they don’t break when the heap is defragmented? One way to look at the problem is that references point to areas of different sizes in the heap. If you could somehow arrange it so that every object needed exactly the same amount of space in the heap, then fragmentation would not be a problem. As long as there was any free space at all, it could be used.

Of course, different objects do take different amounts of space, but references always take four bytes (one word). The solution is to insert an extra block of references, called the offset table, between the references in your source code and the heap. Each reference in the program points to an entry in the offset table, and that entry in turn points to the object's data in the heap. When an object is moved in the heap, only one link needs to be updated: the one between the offset table and the data in the heap. The many more pointers to the offset table do not need to be updated. Furthermore, it's relatively easy to find the pointers in the offset table that need to be updated. The VM does not need to search the entire memory space of the running program looking for anything that might be a pointer.

Figure 3-4 shows this scheme. To find an object’s data, you follow the first arrow into the offset table. Then you follow the second arrow out to the actual data in the heap.


Figure 3-4  A fragmented heap with handles.
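The two-step lookup can be sketched in a few lines of Java. This is a simplified model under stated assumptions, not the layout of any particular VM: every object is one word long, a handle is an index into the offset table, and the names HandleLookupDemo, allocateAt, read, and relocate are invented for the example.

```java
// Toy model of double indirection: handle -> offset table entry -> heap word.
class HandleLookupDemo {
    private final int[] heap = new int[8];        // one int per word
    private final int[] offsetTable = new int[4]; // handle -> current heap address
    private int handleCount = 0;

    // Store a one-word object at the given address and return a handle to it.
    int allocateAt(int address, int value) {
        heap[address] = value;
        offsetTable[handleCount] = address;
        return handleCount++;
    }

    // Follow the first arrow into the offset table, then the second into the heap.
    int read(int handle) {
        return heap[offsetTable[handle]];
    }

    // Move an object; only its single offset table entry has to change.
    void relocate(int handle, int newAddress) {
        int oldAddress = offsetTable[handle];
        if (oldAddress == newAddress) {
            return;
        }
        heap[newAddress] = heap[oldAddress];
        heap[oldAddress] = 0;
        offsetTable[handle] = newAddress;
    }

    public static void main(String[] args) {
        HandleLookupDemo vm = new HandleLookupDemo();
        int a = vm.allocateAt(2, 8);
        int copyOfA = a;             // a second reference to the same object
        int b = vm.allocateAt(4, 9);

        vm.relocate(a, 0);           // the data moves; neither handle is touched
        System.out.println(vm.read(a));       // 8
        System.out.println(vm.read(copyOfA)); // 8
        System.out.println(vm.read(b));       // 9
    }
}
```

The cost is one extra array lookup on every access. In exchange, any number of copies of a handle survive a move untouched.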

At first glance this appears more complicated than the method in Figure 3-2. However, consider what happens when the heap is compacted. Figure 3-5 shows the result. The object pointers don’t need to be changed. Only one pointer needs to be adjusted for each object, not one pointer for each reference, as in the previous case. Because there’s a one-to-one relationship between filled entries in the offset table and objects in the heap, once you’ve adjusted the pointer from the offset table to the object, you’ll never have to adjust another pointer to the same object later. If you’re moving only one object in the heap, you can stop looking as soon as you find the pointer to it in the offset table.


Figure 3-5  A compacted heap with handles.
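Here is one way compaction with an offset table might look, again as a toy rather than a real memory manager. It assumes each table entry records an object's address and length, and the names CompactingHeap, place, read, addressOf, and compact are hypothetical. A real VM would track object sizes and free space quite differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Toy heap whose objects are reached only through an offset table.
class CompactingHeap {
    private final int[] words;
    // One entry per object: {address, length}. A handle is an index into this list.
    private final List<int[]> table = new ArrayList<>();

    CompactingHeap(int size) {
        words = new int[size];
    }

    // Place an object at a caller-chosen address (to stage a fragmented heap).
    int place(int address, int... data) {
        System.arraycopy(data, 0, words, address, data.length);
        table.add(new int[] {address, data.length});
        return table.size() - 1;
    }

    int read(int handle, int offset) {
        return words[table.get(handle)[0] + offset];
    }

    int addressOf(int handle) {
        return table.get(handle)[0];
    }

    // Slide every object to the bottom of the heap; returns the first free address.
    int compact() {
        List<int[]> byAddress = new ArrayList<>(table);
        byAddress.sort(Comparator.comparingInt(e -> e[0]));
        int free = 0;
        for (int[] entry : byAddress) {
            System.arraycopy(words, entry[0], words, free, entry[1]);
            entry[0] = free; // the only pointer that changes for this object
            free += entry[1];
        }
        Arrays.fill(words, free, words.length, 0); // clear the reclaimed words
        return free;
    }

    public static void main(String[] args) {
        CompactingHeap heap = new CompactingHeap(10);
        int x = heap.place(1, 11, 12);
        int y = heap.place(5, 21);
        int z = heap.place(7, 31, 32, 33);

        int firstFree = heap.compact();
        System.out.println(firstFree);         // 6: four contiguous words are now free
        System.out.println(heap.addressOf(z)); // 3: the object moved
        System.out.println(heap.read(z, 2));   // 33: the handle z still works
    }
}
```

The loop in compact runs once per object, no matter how many copies of x, y, and z the program is holding.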

There are many optimizations that can be made to this scheme. For example, each object in the heap can contain the index of its pointer in the offset table, so when the memory manager needs to move the object, it can adjust the pointer in constant time.
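A rough sketch of that optimization follows. The two-word header used here (table index, then payload length) is purely an assumption made for the example, as are the names BackIndexHeap, place, read, and move. Real object headers are VM-specific.

```java
// Toy heap in which every object carries the index of its own offset table entry.
class BackIndexHeap {
    final int[] words = new int[16];
    final int[] offsetTable = new int[4];
    private int handleCount = 0;

    // Assumed layout: word 0 = offset table index, word 1 = payload length, then payload.
    int place(int address, int... payload) {
        int handle = handleCount++;
        words[address] = handle;
        words[address + 1] = payload.length;
        System.arraycopy(payload, 0, words, address + 2, payload.length);
        offsetTable[handle] = address;
        return handle;
    }

    int read(int handle, int offset) {
        return words[offsetTable[handle] + 2 + offset];
    }

    // Move the object starting at 'from' so that it starts at 'to'.
    void move(int from, int to) {
        int totalLength = 2 + words[from + 1];
        System.arraycopy(words, from, words, to, totalLength);
        // The header says which table entry to fix, so no search is needed.
        offsetTable[words[to]] = to;
    }

    public static void main(String[] args) {
        BackIndexHeap heap = new BackIndexHeap();
        int h = heap.place(6, 41, 42);
        heap.move(6, 0);
        System.out.println(heap.offsetTable[h]); // 0
        System.out.println(heap.read(h, 1));     // 42
    }
}
```

The memory manager here starts from an address in the heap, not from a handle, which is why storing the table index inside the object saves it a search.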


Secret:  This all happens behind the scenes, so you normally don’t need to worry about it. Sun’s virtual machines use handles, but this isn’t absolutely necessary. Microsoft’s VM implements references as pointers, not doubly indirected handles.

Of course, double indirection is useful not only in virtual machines. This scheme, or variants of it, can be used in situations where moving objects in the heap is very expensive but moving objects in the offset table is cheap. For example, if the heap is actually a file on disk but the offset table is in memory, then you can reorganize the structure of a file by changing the offset table. Variations on this scheme are used in most relational databases.

What is a reference?

Reference is strictly a Java term. There are no references in C or Pascal. A reference is an abstract identifier for a block of memory in the heap. Furthermore, a reference has a type like String or double[]. At the level of the non-virtual host machine, references may be implemented as handles, pointers, or something else entirely. However, references are not pointers; they are not handles; they are merely a means of identifying a particular block of memory in the heap.

How exactly the virtual machine implements references at the level of machine code is VM-dependent and completely hidden from the programmer in any case. Most VMs — including Sun’s — use handles, not pointers. Microsoft’s VM uses pointers rather than handles. Other schemes are possible.

Ninety percent of the time, you can ignore the difference between a reference to an object and the object itself. However, there is always that annoying 10 percent of the time when the difference becomes important. This 10 percent occurs mostly when passing arguments to methods.
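A short example shows where the distinction bites. Java passes a copy of the reference to a method, so the method can change the object the reference identifies but cannot make the caller's variable refer to a different object. The class and method names below are made up for the illustration.

```java
class ReferencePassingDemo {
    // Changes the object that both the caller and the method refer to.
    static void mutate(StringBuilder sb) {
        sb.append(" world");
    }

    // Only changes the method's local copy of the reference.
    static void reassign(StringBuilder sb) {
        sb = new StringBuilder("replaced");
    }

    public static void main(String[] args) {
        StringBuilder greeting = new StringBuilder("hello");

        mutate(greeting);
        System.out.println(greeting); // hello world

        reassign(greeting);
        System.out.println(greeting); // still hello world
    }
}
```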


Copyright (c) 1996-1999 EarthWeb Inc. All rights reserved. Reproduction in whole or in part in any form or medium without express written permission of EarthWeb is prohibited. Read EarthWeb's privacy statement.