Java was first designed by Big-Endian engineers at Sun Microsystems. It was also designed for the Internet, where almost all protocols specify Big-Endian byte order. Therefore, it should come as no surprise that Java's virtual machine uses Big-Endian format for all data types. Little-Endian systems, like the x86, have to translate the Big-Endian data in Java byte code into their native Little-Endian format before executing it.
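As a minimal sketch of what this byte order looks like in practice, the following example writes an int with `DataOutputStream` and then reinterprets the same four bytes as Little-Endian. The class name `EndianDemo` and its helper methods are hypothetical names chosen for illustration; only the `java.io` and `java.nio` calls are standard library APIs.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class; not part of the original text.
class EndianDemo {

    // Serialize an int the way DataOutputStream does: most significant byte first.
    static byte[] bigEndianBytes(int value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(value);
        } catch (IOException e) {
            // An in-memory stream should not fail; rethrow unchecked just in case.
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Interpret the first four bytes as a Little-Endian int (least significant byte first).
    static int readLittleEndianInt(byte[] bytes) {
        return ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    public static void main(String[] args) {
        byte[] wire = bigEndianBytes(0x0A0B0C0D);
        for (byte b : wire) {
            System.out.printf("%02X ", b);   // prints 0A 0B 0C 0D
        }
        System.out.println();

        // The same bytes read with the opposite byte order give a different number.
        System.out.printf("%08X%n", readLittleEndianInt(wire));              // prints 0D0C0B0A
        // Integer.reverseBytes performs the same swap on a value already in memory.
        System.out.printf("%08X%n", Integer.reverseBytes(0x0A0B0C0D));       // prints 0D0C0B0A
    }
}
```

When reading data produced on a Little-Endian system, either approach (a `ByteBuffer` with an explicit order, or `Integer.reverseBytes` after a normal `readInt()`) is one reasonable way to recover the intended value.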
Unsigned integers

Many traditional programming languages, notably C, allow the use of unsigned quantities. An unsigned number uses its high-order bit for data, so it can count twice as high as a number that has to reserve one bit for the sign. However, it can only count positive numbers, not negative numbers. Recall that the largest signed byte is 01111111, which is 127 in decimal. As a signed byte, 11111111 is not 255 but rather -1. However, by reading 11111111 as an unsigned quantity, the leading 1 bit is interpreted as 128 rather than as the sign. Thus, as an unsigned quantity, 11111111 is indeed 255. On the other hand, there's no way to express negative numbers as unsigned numbers.

All Java integer data types except char are signed. However, you're likely to run across data from programs written in other languages that do have unsigned integers. java.io.DataInputStream has two methods that read unsigned quantities. readUnsignedByte() reads a single byte off the stream and returns an int between 0 and 255. An int is returned instead of a byte or a short because a byte can go only as high as 127, whereas an unsigned byte can go as high as 255. Similarly, readUnsignedShort() reads two bytes from the input stream and returns an int between 0 and 65,535.

There is no similar readUnsignedInt() method. If you want one, it's easy enough to write it yourself. You'll need to read four bytes and return a long between 0 and 4,294,967,295. Again, the most efficient way to do this uses bit-level operators, so we'll defer the details until the end of this chapter. An unsigned long (that is, an 8-byte unsigned integer) is relatively uncommon in practice. No primitive Java data type is large enough to handle unsigned longs. You can, however, use the java.math.BigInteger class instead.

Integer widths

You've probably heard a lot of hype about 32-bit computing and 32-bit clean code. You'll be hearing more about 64-bit platforms in the near future, if you haven't already.
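Before going further into integer widths, the unsigned reads described in the preceding section can be sketched in code. This is only one possible approach, not necessarily the one presented at the end of the chapter. The class `UnsignedReads` and the helpers `open`, `readUnsignedInt`, and `readUnsignedLong` are hypothetical names; `readUnsignedByte()`, `readUnsignedShort()`, `readInt()`, `readFully()`, and the `BigInteger(int, byte[])` constructor are standard.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.math.BigInteger;
import java.util.Arrays;

// Hypothetical helper class; not part of the original text.
class UnsignedReads {

    // Wrap a byte array in a DataInputStream for the demonstrations below.
    static DataInputStream open(byte[] data) {
        return new DataInputStream(new ByteArrayInputStream(data));
    }

    // Read four Big-Endian bytes and return them as a long between 0 and 4,294,967,295.
    // Masking with 0xFFFFFFFFL discards the sign extension of the int.
    static long readUnsignedInt(DataInputStream in) throws IOException {
        return in.readInt() & 0xFFFFFFFFL;
    }

    // Read eight Big-Endian bytes as an unsigned quantity.
    // No primitive type is wide enough, so the result is a BigInteger.
    static BigInteger readUnsignedLong(DataInputStream in) throws IOException {
        byte[] raw = new byte[8];
        in.readFully(raw);
        return new BigInteger(1, raw);   // signum 1: treat the bytes as a non-negative magnitude
    }

    public static void main(String[] args) throws IOException {
        byte[] ones = new byte[8];
        Arrays.fill(ones, (byte) 0xFF);  // every bit set

        System.out.println(open(ones).readByte());           // prints -1
        System.out.println(open(ones).readUnsignedByte());   // prints 255
        System.out.println(open(ones).readUnsignedShort());  // prints 65535
        System.out.println(readUnsignedInt(open(ones)));     // prints 4294967295
        System.out.println(readUnsignedLong(open(ones)));    // prints 18446744073709551615
    }
}
```

The same all-ones bit pattern yields -1 when read as a signed byte and the maximum value of each width when read as unsigned.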
These bit counts refer, very roughly, to the preferred size of an integer on a given computer architecture and the number of bits that can be transferred from main memory to the CPU in one clock cycle. Generally, the higher the number of bits, the faster the computer will run. However, you need to rewrite (or at least recompile) the software to accommodate the proper bit width before you can see the performance gain.

Much legacy code is written in languages like C that do not guarantee the width of an integer. The same C program may use 32-bit ints on a Sparc, 16-bit ints on a Mac, and 64-bit ints on a DEC Alpha. Although these all have Java equivalents, you have to know which one you're dealing with before you write the code to handle it! Trying to read 16-bit ints with Java's readInt() method is a sure path to failure.

There's no guaranteed way to look at a file in the absence of outside information and tell solely from its contents whether it was written using 16-bit integers or 32-bit integers. Similarly, you can't tell whether it uses Big-Endian or Little-Endian data. In an ideal world, you'd have access to a specification that describes the data format used. If you don't, perhaps you have access to the source code that was used to write the file. If not, you'll have to do some testing. Try to read the file as 16-bit ints. Do the results make sense? What if you read it as 32-bit ints? Do those results make sense? If you seem to have an excessive number of zeroes appearing in your data, especially if they tend to alternate with non-zero values, that may indicate that you are reading the data using too short an integer. For example, if the data file is full of numbers mostly between 10 and 1000 and it's written with 32-bit ints, the high two bytes of each int will be zero, so reading the file as 16-bit ints produces a zero alternating with each real value.
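The zero-counting test described above can be sketched as follows. This is a rough heuristic under stated assumptions, not a reliable format detector: it assumes Big-Endian data, and the class `WidthProbe` and its methods are hypothetical names invented for this illustration.

```java
import java.nio.ByteBuffer;

// Hypothetical helper class; not part of the original text.
class WidthProbe {

    // Encode each value as a 32-bit Big-Endian int, as a C program with 32-bit ints might.
    static byte[] writeAsInts(int... values) {
        ByteBuffer buf = ByteBuffer.allocate(4 * values.length);  // Big-Endian by default
        for (int v : values) {
            buf.putInt(v);
        }
        return buf.array();
    }

    // Encode each value as a 16-bit Big-Endian short.
    static byte[] writeAsShorts(int... values) {
        ByteBuffer buf = ByteBuffer.allocate(2 * values.length);
        for (int v : values) {
            buf.putShort((short) v);
        }
        return buf.array();
    }

    // Read the data as 16-bit values and report what fraction of them are zero.
    static double zeroFraction16(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        int total = 0;
        int zeros = 0;
        while (buf.remaining() >= 2) {
            if (buf.getShort() == 0) {
                zeros++;
            }
            total++;
        }
        return total == 0 ? 0.0 : (double) zeros / total;
    }

    public static void main(String[] args) {
        // Written with 32-bit ints, read as 16-bit: 0, 10, 0, 500, 0, 1000.
        System.out.println(zeroFraction16(writeAsInts(10, 500, 1000)));    // prints 0.5
        // Written with 16-bit ints, read as 16-bit: 10, 500, 1000.
        System.out.println(zeroFraction16(writeAsShorts(10, 500, 1000)));  // prints 0.0
    }
}
```

A zero fraction near one half when reading 16-bit values suggests the file may really hold 32-bit ints, but small legitimate values or a different byte order can produce similar patterns, so the result is only a hint.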