Java Developer Connection(SM)
Technical Tips

Tech Tips
July 23, 1998

This issue presents tips, techniques, and sample code for the following topics:

Stream Tokenizing
Division By Zero

Stream Tokenizing
In Tech Tips: June 23, 1998, an example of string tokenization was presented, using the class java.util.StringTokenizer.

There is another way to tokenize, using java.io.StreamTokenizer. StreamTokenizer operates on input streams rather than strings. When it reads from a byte stream, each byte is regarded as a character in the range '\u0000' through '\u00FF', and this is also the range of characters that its syntax table covers.

StreamTokenizer is lower level than StringTokenizer, but offers more control over the tokenization process. The class uses an internal table to control how tokens are parsed, and this syntax table can be modified to change the parsing rules. Here's an example of how StreamTokenizer works:

import java.io.*;
import java.util.*;

public class streamtoken {
  public static void main(String args[])
  {
    if (args.length == 0) {
      System.err.println("missing input filename");
      System.exit(1);
    }

    // the keys of this table are the unique words found
    Hashtable<String, Object> wordlist =
      new Hashtable<String, Object>();

    try {
      FileReader fr = new FileReader(args[0]);
      BufferedReader br = new BufferedReader(fr);

      StreamTokenizer st = new StreamTokenizer(br);
      //StreamTokenizer st =
      //    new StreamTokenizer(new StringReader(
      //    "this is a test"));

      // clear the syntax table, then make letters the only word characters
      st.resetSyntax();
      st.wordChars('A', 'Z');
      st.wordChars('a', 'z');

      int type;
      Object dummy = new Object();
      while ((type = st.nextToken()) !=
        StreamTokenizer.TT_EOF) {
        if (type == StreamTokenizer.TT_WORD)
          wordlist.put(st.sval, dummy);
      }
      br.close();
    }
    catch (IOException e) {
      System.err.println(e);
    }

    // "enum" became a reserved word in Java 5, so the variable is named "words"
    Enumeration<String> words = wordlist.keys();
    while (words.hasMoreElements())
      System.out.println(words.nextElement());
  }
}

In this example, a StreamTokenizer is created on top of a FileReader / BufferedReader pair that represents a text file. Note that a StreamTokenizer can also be made to read from a String by using StringReader as illustrated in the commented-out code shown above (StringBufferInputStream also works, although this class has been deprecated).

The method resetSyntax clears the internal syntax table, so that StreamTokenizer forgets its default rules for parsing tokens and treats every character as an ordinary, single-character token. Then wordChars declares that the upper and lower case letters A-Z and a-z form words. Every other character still comes back from nextToken as a token of its own, with the character's value as the token type, so the program keeps only the tokens whose type is TT_WORD.
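The effect of resetSyntax followed by wordChars can be seen on a small string. The following is a minimal sketch, not part of the original program: the class name WordTokens and the method words are made up for illustration, and the input is read through a StringReader rather than a file.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, used only for this illustration.
class WordTokens {
    // Returns the letter-only words found in text, in order of appearance.
    static List<String> words(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        st.resetSyntax();        // every character becomes "ordinary"
        st.wordChars('A', 'Z');  // letters are the only word characters
        st.wordChars('a', 'z');

        List<String> result = new ArrayList<>();
        try {
            int type;
            while ((type = st.nextToken()) != StreamTokenizer.TT_EOF) {
                if (type == StreamTokenizer.TT_WORD) {
                    result.add(st.sval);
                }
                // Any other type is an ordinary character (space, digit,
                // punctuation) returned as its own token; it is skipped.
            }
        } catch (IOException e) {
            // Not expected for a StringReader; rethrown unchecked.
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [this, is, a, test, isn, t, it]
        System.out.println(words("this is a test, isn't it?"));
    }
}
```

Note that the apostrophe splits "isn't" into two words, because after resetSyntax it is an ordinary character like any other punctuation mark.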

nextToken is called repeatedly to retrieve words, and each resulting word is found in the public instance variable "st.sval". The words are inserted into a Hashtable as keys, so a word that occurs more than once still produces only one entry. At the end of processing the contents of the table are displayed, using an Enumeration as illustrated in Tech Tips: June 23, 1998. So the action of this program is to find all the unique words in a text file and display them.

StreamTokenizer also has special facilities for parsing numbers, quoted strings, and comments. It's a useful alternative to StringTokenizer, and is especially applicable if you are tokenizing input streams, or wish to exercise finer control over the tokenization process.
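As a rough illustration of those facilities, the sketch below leaves the default syntax table in place instead of calling resetSyntax. The class name DefaultSyntaxDemo, the method describe, and the sample input are invented for this example.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, used only for this illustration.
class DefaultSyntaxDemo {
    // Labels each token that the default syntax table produces for text.
    static List<String> describe(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        st.slashSlashComments(true);  // recognize // comments explicitly

        List<String> out = new ArrayList<>();
        try {
            int type;
            while ((type = st.nextToken()) != StreamTokenizer.TT_EOF) {
                if (type == StreamTokenizer.TT_NUMBER) {
                    out.add("number " + st.nval);   // nval is a double
                } else if (type == StreamTokenizer.TT_WORD) {
                    out.add("word " + st.sval);
                } else if (type == '"') {
                    out.add("string " + st.sval);   // body of the quoted string
                } else {
                    out.add("char " + (char) type); // ordinary character
                }
            }
        } catch (IOException e) {
            // Not expected for a StringReader; rethrown unchecked.
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [word total, char =, number 42.0, string two words]
        System.out.println(describe("total = 42 \"two words\" // ignored"));
    }
}
```

Whitespace is skipped, the number arrives in nval, the quoted string arrives in sval with the quote character as the token type, and the comment never reaches the caller.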

Division By Zero
Suppose you have a Java language program in which the following expression appears:

   1.0 / 0.0

What happens? Does this usage cause an exception to be thrown? Is the result undefined?

In the Java programming language, integer division by zero throws an ArithmeticException. Floating-point division by zero throws no exception (in C++, by contrast, the result of division by zero is undefined). The result of 1.0 / 0.0 is positive infinity, indicated by the constant Double.POSITIVE_INFINITY (which, in fact, is defined by performing this division).
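The difference between the two kinds of division can be checked directly. This is a small sketch; the class name DivisionDemo and its two methods are made up for the illustration.

```java
// Hypothetical helper class, used only for this illustration.
class DivisionDemo {
    // Integer division: a zero divisor throws ArithmeticException.
    static String intDivide(int a, int b) {
        try {
            return String.valueOf(a / b);
        } catch (ArithmeticException e) {
            return "ArithmeticException";
        }
    }

    // Floating-point division: a zero divisor never throws.
    static double doubleDivide(double a, double b) {
        return a / b;
    }

    public static void main(String[] args) {
        System.out.println(intDivide(1, 0));          // prints ArithmeticException
        System.out.println(doubleDivide(1.0, 0.0));   // prints Infinity
        System.out.println(doubleDivide(-1.0, 0.0));  // prints -Infinity
        System.out.println(doubleDivide(1.0, 0.0)
            == Double.POSITIVE_INFINITY);             // prints true
    }
}
```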

This example illustrates an important point, which is that Java language floating-point arithmetic operates according to a well-defined standard, known as IEEE 754.

Another related idea is that of NaN (not a number), used to represent the results of certain arithmetic operations such as the following:

   0.0 / 0.0

There is also a "Double.NaN" constant defined, which is analogous to the constant Double.POSITIVE_INFINITY. NaN is interesting in that it has the following property:

   NaN != NaN

In other words, NaN is unequal to itself, and this fact is used to implement methods such as Double.isNaN.
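A short sketch of this property follows. The class name NaNDemo and its isNaN method are invented here to mirror the self-comparison idea; real code would simply call Double.isNaN.

```java
// Hypothetical helper class, used only for this illustration.
class NaNDemo {
    // NaN is the only double value that is unequal to itself.
    static boolean isNaN(double v) {
        return v != v;
    }

    public static void main(String[] args) {
        double nan = 0.0 / 0.0;

        System.out.println(nan == nan);         // prints false
        System.out.println(nan != nan);         // prints true
        System.out.println(isNaN(nan));         // prints true
        System.out.println(isNaN(1.0));         // prints false
        System.out.println(Double.isNaN(nan));  // prints true
    }
}
```

A practical consequence is that a test such as x == Double.NaN is always false, so Double.isNaN is the way to check for NaN.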

To make these ideas more concrete, here is an example that uses negative infinity, positive infinity, and NaN:

public class number {
  public static void main(String args[])
  {
    long neg_inf_bits =
      Double.doubleToLongBits(-1.0 / 0.0);
    long pos_inf_bits =
      Double.doubleToLongBits(1.0 / 0.0);
    long nan_bits =
      Double.doubleToLongBits(0.0 / 0.0);
   
    System.out.println(Long.toHexString(neg_inf_bits));
    System.out.println(Long.toHexString(pos_inf_bits));
    System.out.println(Long.toHexString(nan_bits));
  }
}

The output of the program is the following:

   fff0000000000000
   7ff0000000000000
   7ff8000000000000

These numbers are the hexadecimal 64-bit values that represent negative infinity, positive infinity, and NaN respectively. In other words, particular bit patterns for a double value indicate specific special values such as NaN.
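The mapping also works in the other direction. As a sketch, Double.longBitsToDouble turns each of these bit patterns back into the corresponding special value; the class name SpecialBits and the method fromBits are made up for this illustration.

```java
// Hypothetical helper class, used only for this illustration.
class SpecialBits {
    // Interprets a 64-bit pattern as a double value.
    static double fromBits(long bits) {
        return Double.longBitsToDouble(bits);
    }

    public static void main(String[] args) {
        System.out.println(fromBits(0xfff0000000000000L));  // prints -Infinity
        System.out.println(fromBits(0x7ff0000000000000L));  // prints Infinity
        System.out.println(fromBits(0x7ff8000000000000L));  // prints NaN
    }
}
```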

Finally, knowing how floating-point arithmetic behaves is quite important in particular applications, and the Java language specification goes to some lengths to pin down behavior in this area.


[ This page was updated: 21-Sep-2000 ]
Sun Microsystems, Inc.
Copyright © 1995-2000 Sun Microsystems, Inc.
All Rights Reserved. Terms of Use. Privacy Policy.