Chapter 2
Primitive Data Types

The Java virtual machine defines eight primitive data types: five integer types (byte, short, int, long, and char), two floating-point types (float and double), and one boolean type. This chapter explores how these primitive types are stored in memory and used in calculations. You’ll learn how one can be converted to another and what can go wrong in this conversion. You’ll also learn how to use the bit-level operators to reach down to the lowest level of the virtual machine and to change what you find there.

Bytes in Memory

All data in Java (or any digital computer) must be represented as a particular sequence of bits in the computer’s memory. A bit is an abstract quantity that can have exactly two values. These two values are commonly called 0 and 1. However, as you’ll see shortly, these are not the same as the numbers zero and one.

At the very low level of electronic circuits, a transistor that is charged to a particular value (generally 5.0 or 3.3 volts relative to ground) is said to be on and to have the value “one.” A transistor that is uncharged (at 0.0 volts relative to ground) is said to be off and to have the value “zero.” However, when you consider matters at this low a level, the real world is analog, not digital. It is possible for transistors to have voltages of 2.5 volts, 1.2 volts, -3.4 volts, or just about any other value you can imagine. Most digital electronic circuits have some tolerance so that a transistor that’s on at 3.3 volts will still be on at 3.2 volts. Past that tolerance, however, the transistor is said to be three-stating. This is a problem for the electrical engineers who design integrated circuits, but it shouldn’t be a problem for a software engineer. If your computer starts three-stating when it isn’t supposed to, send it back to the shop to be replaced.

Modern computers, including the Java virtual machine, organize bits into groups of eight called bytes. A group of eight bits is also sometimes referred to as an octet. The single byte is normally the lowest level at which you can interact with a computer’s memory. You always work with at least eight bits at a time. Bits are like hot dog buns. You can’t go to a grocery store and buy one hot dog bun or 13 hot dog buns. Because hot dog buns come in packs of 8, you can get 8, 16, 24, or any other multiple of 8, but not any number of buns that isn’t a multiple of 8. There is no keyword or operator in Java that enables you to read from or write to one bit of memory at a time. You have to work with at least seven more bits adjacent to the bit you’re interested in at the same time, even if you aren’t doing anything to those bits.
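Although you can't address a single bit directly, you can get at one by combining a whole byte with a mask. The following is a minimal sketch of that idea, not a standard API: the class name BitAccess and the methods getBit, setBit, and clearBit are hypothetical names made up for this example.

```java
// Sketch: reading and changing one bit of a byte with shifts and masks.
// BitAccess, getBit, setBit, and clearBit are hypothetical names for this example.
class BitAccess {
    // Returns bit number pos (0 = lowest-order bit, 7 = highest) of b as 0 or 1.
    static int getBit(byte b, int pos) {
        return (b >> pos) & 1;
    }

    // Returns a copy of b with bit pos set to 1; the other seven bits are unchanged.
    static byte setBit(byte b, int pos) {
        return (byte) (b | (1 << pos));
    }

    // Returns a copy of b with bit pos set to 0; the other seven bits are unchanged.
    static byte clearBit(byte b, int pos) {
        return (byte) (b & ~(1 << pos));
    }

    public static void main(String[] args) {
        byte b = 5;                           // bit pattern 00000101
        System.out.println(getBit(b, 0));     // 1
        System.out.println(getBit(b, 1));     // 0
        System.out.println(setBit(b, 1));     // 7 (00000111)
        System.out.println(clearBit(b, 0));   // 4 (00000100)
    }
}
```

Notice that every method still reads or produces all eight bits. The mask only decides which of them the operation is allowed to change.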


Note:  This wasn’t always the case. Some early computers used 12-bit words. However, these computers have long since become extinct.

Although you can buy as few as eight hot dog buns at a time, it’s sometimes cheaper to buy them by the case. The case size often depends on where you buy them. At the corner convenience mart, 32 hot dog buns probably cost you four times as much as eight hot dog buns. However, at Benny’s Super Discount Warehouse Store, buns may be cheaper by the gross. Similarly, different computers pack different numbers of bytes into a word. Computers based on the Intel 8088 chip move data to and from memory in 8-bit, 1-byte units, even though the chip’s internal registers are 16 bits wide. Computers based on the 286 architecture, however, move data in 16-bit words and can therefore move data around at (very roughly) twice the speed of an 8088 computer at the same clock rate. Most modern CPUs use 32-bit words. The 32-bit processors include the 80386, 80486, Pentium, Pentium Pro, Sparc, PowerPC 601, PowerPC 603, and PowerPC 604 CPUs. Some 64-bit processors are just starting to appear, including Digital’s Alpha line, Sun’s UltraSparc chip, and the forthcoming HP/Intel Merced. All of these chips can still run old 8-bit or 16-bit software, but they run faster and more efficiently with software that moves data around in words that match the native size of the processor.

So which is Java? 8-bit? 16-bit? 32-bit? In fact, it’s really none of the above. Because Java runs on a virtual machine, it needs to be able to run on any and all of the mentioned architectures without being tied to a particular word size. In one sense, you can argue that the Java virtual machine is an 8-bit machine because each opcode is exactly one byte long (although many instructions are followed by additional operand bytes). However, the native integer data type for Java is 32-bit, so in that respect, Java is a 32-bit computer. The interpreter or JIT will likely convert the Java instructions and data into whichever format is appropriate for the machine on which it’s running.
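You can see the 32-bit bias from inside a program. The sketch below assumes a Java 5 or later runtime, because the SIZE constants it prints were added to the wrapper classes after this chapter was written. The class name WordSizes and the method sumAsInt are hypothetical names used only for illustration.

```java
// Sketch: primitive widths are the same on every platform, and arithmetic
// on narrower integer types is carried out on 32-bit ints.
// WordSizes and sumAsInt are hypothetical names for this example.
class WordSizes {
    // Adds two bytes. Both operands are promoted to int before the addition,
    // so the result is an int and is not truncated to eight bits.
    static int sumAsInt(byte a, byte b) {
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(Byte.SIZE);       // 8
        System.out.println(Short.SIZE);      // 16
        System.out.println(Integer.SIZE);    // 32
        System.out.println(Long.SIZE);       // 64

        byte a = 100;
        byte b = 100;
        System.out.println(sumAsInt(a, b));  // 200, which does not fit in a byte
        System.out.println((byte) (a + b));  // -56 after narrowing back to 8 bits
    }
}
```

The cast in the last line is required. Without it the compiler rejects an assignment of a + b to a byte, because the sum is already an int.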

Variables, Values, and Identifiers

Variables, values, and identifiers are closely related to each other. In common use, the three words are used interchangeably. However, each word does have a slightly different meaning, and when you discuss computers at the CPU or virtual machine level, these differences become important.

Consider this Java statement:

     int j = 2;

The letter “j” is an identifier. It identifies a variable in Java source code. The identifier, however, does not appear in the compiled byte code. It is a mnemonic device to make programmers’ lives easier. The number 2 is the value of the variable. To be more precise, the bit pattern 00000000000000000000000000000010 is the value of the variable. The four bytes of memory where this pattern is stored are the variable.
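If you'd like to see that bit pattern for yourself, one possible approach is to pull the bits out one at a time with a shift and a mask. This is only a sketch. The class name BitPattern and the method bits are hypothetical names invented for this example.

```java
// Sketch: printing all 32 bits of an int, leading zeros included.
// BitPattern and bits are hypothetical names for this example.
class BitPattern {
    // Returns the 32 bits of value as a string of 0s and 1s,
    // highest-order bit first.
    static String bits(int value) {
        StringBuilder sb = new StringBuilder(32);
        for (int i = 31; i >= 0; i--) {
            // Shift bit i down to position 0, then mask off everything else.
            sb.append((value >>> i) & 1);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int j = 2;
        System.out.println(bits(j)); // 00000000000000000000000000000010
    }
}
```

The standard method Integer.toBinaryString(j) would print only 10, because it drops leading zeros. That's why the sketch builds the string by hand.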

A variable is a particular group of bytes in the computer’s memory. The value of a variable is the bit pattern stored in those bytes that make up the variable. How the bit pattern is interpreted depends on the type of the variable. The rest of this chapter discusses the interpretation of the bit patterns that make up different primitive data types.
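To make the role of the type concrete, here is a small sketch that reads one 32-bit pattern first as an int and then as a float. It relies on the standard library method Float.intBitsToFloat. The class name Interpretation and the method asFloat are hypothetical names chosen for this example.

```java
// Sketch: one bit pattern, two interpretations.
// Interpretation and asFloat are hypothetical names for this example.
class Interpretation {
    // Reinterprets the 32 bits of an int as a float without changing any bit.
    static float asFloat(int bits) {
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        int bits = 0x40000000;              // 01000000 00000000 00000000 00000000
        System.out.println(bits);           // 1073741824 when read as an int
        System.out.println(asFloat(bits));  // 2.0 when read as a float
    }
}
```

Nothing in memory changes between the two println calls. Only the rule used to interpret the bits is different.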

You can change the value of a variable by adjusting the bits that live in those bytes. This does not make it a new variable. Conversely, two different variables can have the same value.
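A short sketch may help keep variables and values apart. The class name SameValue and the method flipLowBit are hypothetical names used only here.

```java
// Sketch: two variables can share a value, and changing the bits of one
// variable neither creates a new variable nor affects the other.
// SameValue and flipLowBit are hypothetical names for this example.
class SameValue {
    // Returns value with its lowest-order bit flipped.
    static int flipLowBit(int value) {
        return value ^ 1;
    }

    public static void main(String[] args) {
        int j = 2;
        int k = 2;                    // a different variable with the same value
        System.out.println(j == k);   // true
        j = flipLowBit(j);            // same variable j, new bit pattern
        System.out.println(j);        // 3
        System.out.println(k);        // 2
    }
}
```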

An identifier is a name for a particular group of bytes in memory. Some programming languages allow a single variable to have more than one name. However, Java does not. In a Java program, an identifier always points to a particular area of memory. Once an identifier has been created, there is no way to change where it points.


Note:  This may sound a little strange to experienced Java programmers. In particular, you may think that this is true for primitive data types like int but not for object types like String. In fact, this is true for all Java data types. You’ll have to wait till the next chapter to see why.


Copyright (c) 1996-1999 EarthWeb Inc. All rights reserved. Reproduction in whole or in part in any form or medium without express written permission of EarthWeb is prohibited.