
|
 |
|
Michael Abrash's Graphics Programming Black Book, Special Edition
by Michael Abrash
The Coriolis Group
ISBN: 1576101746 Pub Date: 07/01/97
|
32-Bit Addressing Modes
The 386 and 486 both support 32-bit addressing modes, in which any register may serve as the base memory addressing register, and almost any register may serve as the potentially scaled index register. For example,
mov al,BaseTable[ecx+edx*4]
uses a perfectly valid 32-bit address, with the byte accessed being the one at the offset in DS pointed to by the sum of EDX times 4 plus the offset of BaseTable plus ECX. This is a very powerful memory addressing scheme, far superior to 8088-style 16-bit addressing, but its not without its quirks and costs, so lets take a quick look at 32-bit addressing. (By the way, 32-bit addressing is not limited to protected mode; 32-bit instructions may be used in real mode, although each instruction that uses 32-bit addressing must have an address-size prefix byte, and the presence of a prefix byte costs a cycle on a 486.)
Any register may serve as the base register component of an address. Any register except ESP may also serve as the index register, which can be scaled by 1, 2, 4, or 8. (Scaling is very handy for performing lookups in arrays and tables.) The same register may serve as both base and index register, except for ESP, which can only be the base. Incidentally, it makes sense that ESP cant be scaled; ESP presumably always points to a valid stack, and I cant think of any reason youd want to use the stack pointer times 2, 4, or 8 in an address. ESP is, by its nature, a base rather than index pointer.
Thats all there is to the functionality of 32-bit addressing; its very simple, much simpler than 16-bit addressing, with its sharply limited memory addressing register combinations. The costs of 32-bit addressing are a bit more subtle. The only performance cost (apart from the aforementioned 1-cycle penalty for using 32-bit addressing in real mode) is a 1-cycle penalty imposed for using an index register. In this context, you use an index register when you use a register thats scaled, or when you use the sum of two registers to point to memory. MOV BL,[EBX*2] uses an index register and takes an extra cycle, as does MOV CL,[EAX+EDX]; MOV CL,[EAX+100H] is not indexed, however.
The other cost of 32-bit addressing is in instruction size. Old-style 16-bit addressing usually (except in a few special cases) uses one extra byte, which Intel calls the Mod-R/M byte, which is placed immediately after each instructions opcode to describe the memory addressing mode, plus 1 or 2 optional bytes of addressing displacementthat is, a constant value to add into the address. In many cases, 32-bit addressing continues to use the Mod-R/M byte, albeit with a different interpretation; in these cases, 32-bit addressing is no larger than 16-bit addressing, except when a 32-bit displacement is involved. For example, MOV AL, [EBX] is a 2-byte instruction; MOV AL, [EBX+10H] is a 3-byte instruction; and MOV AL, [EBX+10000H] is a 6-byte instruction.
 | Note that 1 and 4-byte displacements, but not 2-byte displacements, are supported for 32-bit addressing. Code size can be greatly improved by keeping stack frame variables within 128 bytes of EBP, and variables in pointed-to structures within 127 bytes of the start of the structure, so that displacements can be 1 rather than 4 bytes.
|
However, because 32-bit addressing supports many more addressing combinations than 16-bit addressing, the Mod-R/M byte cant describe all the combinations. Therefore, whenever an index register (as described above) is involved, a second byte, the SIB byte, follows the Mod-R/M byte to provide additional address information. Consequently, whenever you use a scaled memory addressing register or use the sum of two registers to point to memory, you automatically add 1 cycle and 1 byte to that instruction. This is not to say that you shouldnt use index registers when theyre needed, but if you find yourself using them inside key loops, you should see if its possible to move the index calculation outside the loop as, for example, in a loop like this:
LoopTop:
add ax,DataTable[ebx*2]
inc ebx
dec cx
jnz LoopTop
You could change this to the following for greater performance:
add ebx,ebx ;ebx*2
LoopTop:
add ax,DataTable[ebx]
add ebxX,2
dec cx
jnz LoopTop
shr ebx,1 ;ebx*2/2
Ill end this chapter with two more quirks of 32-bit addressing. First, as with 16-bit addressing, addressing that uses EBP as a base register both accesses the SS segment by default and always has a displacement of at least 1 byte. This reflects the common use of EBP to address a stack frame, but is worth keeping in mind if you should happen to use EBP to address non-stack memory.
Lastly, as I mentioned, ESP cannot be scaled. In fact, ESP cannot be an index register; it must be a base register. Ironically, however, ESP is the one register that cannot be used to address memory without the presence of an SIB byte, even if its used without an index register. This is an outcome of the way in which the SIB byte extends the capabilities of the Mod-R/M byte, and theres nothing to be done about it, but its at least worth noting that ESP-based, non-indexed addressing makes for instructions that are a byte larger than other non-indexed addressing (but not any slower; theres no 1-cycle penalty for using ESP as a base register) on the 486.
|