
Java Developer Connection(SM)
Technical Tips


Tech Tips
March 17, 1999

WELCOME to the Java Developer Connection(SM) Tech Tips, Vol. 2 No. 8. This issue covers collators and BigDecimal.

Collators

If you've used Java much at all, you've probably had occasion to compare strings using the String.compareTo method. This method does lexicographical comparison, which is a fancy way of saying that the numerical values of corresponding Unicode characters in the strings are compared. For example, the letter "a" has a numeric value of 0x61, "b" 0x62, and so on.
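As a small illustration of that ordering, the sketch below prints the sign of a few String.compareTo results. The class name CompareToDemo and the helper sign are made up for this example and are not part of any library.

```java
class CompareToDemo {
  // Reduce the result of compareTo to its sign: -1, 0, or 1.
  static int sign(String a, String b) {
    return Integer.signum(a.compareTo(b));
  }

  public static void main(String[] args) {
    // 'a' is 0x61 and 'b' is 0x62, so "a" sorts before "b".
    System.out.println(sign("a", "b"));        // -1

    // 'B' is 0x42, which is less than 'a' (0x61), so upper case
    // "B" sorts before lower case "a".
    System.out.println(sign("B", "a"));        // -1

    // LATIN SMALL LETTER A WITH GRAVE is 0xe0, greater than 'z' (0x7a),
    // so the accented letter sorts after the whole unaccented alphabet.
    System.out.println(sign("\u00e0", "z"));   // 1
  }
}
```

The last two results are the kind that tend to surprise users who expect dictionary order.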

Such comparisons are useful but not always adequate, particularly in an internationalization context. Suppose, for example, that you'd like lower and upper case characters to compare as identical, or you want accents on letters to be ignored. The collator classes in java.text serve this purpose: they let you build locale-sensitive string comparisons.

To see how collators work, consider an example:

  
import java.text.*;
import java.util.*;
  
public class collate {
  public static void main(String args[])
  {
    Collator coll = Collator.getInstance(Locale.US);

    // compare returns an int: negative, zero, or positive.
    // Zero means the strings are equal at the current strength.
  
    coll.setStrength(Collator.TERTIARY);
    System.out.println(coll.compare("a","A"));      // nonzero: case matters
    coll.setStrength(Collator.SECONDARY);
    System.out.println(coll.compare("a","A"));      // 0: case ignored
  
    coll.setStrength(Collator.SECONDARY);
    System.out.println(coll.compare("a","\u00e0")); // nonzero: accent matters
    coll.setStrength(Collator.PRIMARY);
    System.out.println(coll.compare("a","\u00e0")); // 0: accent ignored
  
    coll.setStrength(Collator.IDENTICAL);
    System.out.println(coll.compare("a","b"));      // negative: "a" sorts first

    // keys for the same collator can be compared with each other
    CollationKey key1 = coll.getCollationKey("abc");
    CollationKey key2 = coll.getCollationKey("def");
    System.out.println(key1.compareTo(key2));       // negative: "abc" sorts first
  }
}
The first line of main:
Collator coll = Collator.getInstance(Locale.US);
retrieves a collator for the locale settings applicable to the United States.

Then a series of string comparisons is done, in each case setting a strength before performing the comparison. A strength specifies what level of difference is considered significant in the comparison. Four strengths are defined, from weakest to strictest: PRIMARY, SECONDARY, TERTIARY, and IDENTICAL. The meaning of each strength depends on the specific locale. For example, in the US locale, upper versus lower case is considered a TERTIARY difference, less important than a SECONDARY difference. Case is therefore significant only if the strength is set to TERTIARY or IDENTICAL. An example of a SECONDARY difference is accents on letters. The Unicode letter "\u00e0", defined to be:

00E0;LATIN SMALL LETTER A WITH GRAVE
is considered different from "a" when comparing using a SECONDARY strength setting, but identical when using a PRIMARY one. These rules may of course be different for some other locale.
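When only a yes-or-no answer is needed, the Collator.equals(String, String) method can be used instead of testing the result of compare against zero. The following is a minimal sketch of the strength levels described above for the US locale; the class name StrengthDemo and the helper same are invented for this illustration.

```java
import java.text.Collator;
import java.util.Locale;

class StrengthDemo {
  // Reports whether a and b are considered equal by a US collator
  // at the given strength (hypothetical helper for this sketch).
  static boolean same(String a, String b, int strength) {
    Collator coll = Collator.getInstance(Locale.US);
    coll.setStrength(strength);
    return coll.equals(a, b);
  }

  public static void main(String[] args) {
    // Case is a TERTIARY difference in the US locale.
    System.out.println(same("a", "A", Collator.TERTIARY));       // false
    System.out.println(same("a", "A", Collator.SECONDARY));      // true

    // An accent is a SECONDARY difference.
    System.out.println(same("a", "\u00e0", Collator.SECONDARY)); // false
    System.out.println(same("a", "\u00e0", Collator.PRIMARY));   // true

    // Different base letters differ at every strength.
    System.out.println(same("a", "b", Collator.PRIMARY));        // false
  }
}
```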

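A final point about this example concerns efficiency. If you compare the same strings repeatedly using collators, for example when sorting a list, it may be more efficient to use CollationKey objects instead of Collator.compare. A CollationKey holds a string's collation data in a precomputed form, so two keys from the same collator can be compared quickly with CollationKey.compareTo, without reapplying the collation rules each time.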
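One way this might look in practice is sketched below: each string is converted to a CollationKey once, the keys are sorted, and the original strings are read back with getSourceString. The class name KeySortDemo and the method sortByKey are assumptions made for this example, not library names.

```java
import java.text.CollationKey;
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

class KeySortDemo {
  // Sorts words in US collation order using precomputed keys
  // (hypothetical helper for this sketch).
  static String[] sortByKey(String[] words) {
    Collator coll = Collator.getInstance(Locale.US);

    // Compute each key once, up front.
    CollationKey[] keys = new CollationKey[words.length];
    for (int i = 0; i < words.length; i++) {
      keys[i] = coll.getCollationKey(words[i]);
    }

    // CollationKey implements Comparable, so the keys sort directly.
    Arrays.sort(keys);

    // Recover the original strings in sorted order.
    String[] sorted = new String[keys.length];
    for (int i = 0; i < keys.length; i++) {
      sorted[i] = keys[i].getSourceString();
    }
    return sorted;
  }

  public static void main(String[] args) {
    String[] words = { "cherry", "Banana", "apple" };

    // Collation order: apple, Banana, cherry
    System.out.println(Arrays.toString(sortByKey(words)));

    // Plain String.compareTo order puts upper case first:
    // Banana, apple, cherry
    String[] plain = words.clone();
    Arrays.sort(plain);
    System.out.println(Arrays.toString(plain));
  }
}
```

Keys are only comparable when they come from the same collator, so the sketch creates all of them from a single Collator instance.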

If you are developing applications that operate in an international context, collation needs to be considered wherever strings are compared or sorted.

BigDecimal

The java.math package contains the classes BigInteger and BigDecimal. BigInteger represents arbitrary-precision integers, with arithmetic operations such as addition and division supported, along with comparison and hashing methods. A BigDecimal consists of an arbitrary-precision integer along with a scale, where the scale is the number of digits to the right of the decimal point.
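The short sketch below illustrates both points: a BigInteger value that does not fit in a long, and the two parts of a BigDecimal as reported by unscaledValue and scale. The class name BigParts and the helper twoToThe are made up for this illustration.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

class BigParts {
  // Computes 2 raised to the power n as a BigInteger
  // (hypothetical helper for this sketch).
  static BigInteger twoToThe(int n) {
    return BigInteger.valueOf(2).pow(n);
  }

  public static void main(String[] args) {
    // 2^64 overflows a long but is exact as a BigInteger.
    System.out.println(twoToThe(64));          // 18446744073709551616

    // 123.45 is stored as the integer 12345 with a scale of 2.
    BigDecimal d = new BigDecimal("123.45");
    System.out.println(d.unscaledValue());     // 12345
    System.out.println(d.scale());             // 2
  }
}
```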

These classes can be used in applications requiring high-precision numbers. Financial applications sometimes require such precision, as do some kinds of numerical programming problems. An example of one of these is computing numerical constants to a high degree of precision.
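To make the precision issue concrete, here is a minimal sketch that adds 0.1 and 0.2 first as double values and then as BigDecimal values built from strings. The class name MoneyDemo and its two methods are invented for this example.

```java
import java.math.BigDecimal;

class MoneyDemo {
  // Sum computed in binary floating point.
  static double doubleSum() {
    return 0.1 + 0.2;
  }

  // Sum computed in decimal arithmetic. The String constructor is
  // used so that the inputs are exactly 0.1 and 0.2.
  static BigDecimal decimalSum() {
    return new BigDecimal("0.1").add(new BigDecimal("0.2"));
  }

  public static void main(String[] args) {
    System.out.println(doubleSum());   // 0.30000000000000004
    System.out.println(decimalSum());  // 0.3
  }
}
```

The double result is off because 0.1 and 0.2 have no exact binary representation, while BigDecimal stores them exactly as 1 and 2 with a scale of 1.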

The mathematical constant "e" can be defined as the sum of the infinite series:

1/0! + 1/1! + 1/2! + 1/3! + ...

A program that uses BigDecimal to compute this constant to 40 places is:

  
import java.math.*;
  
public class bige {
  public static void main(String args[])
  {
    BigDecimal one = new BigDecimal("1");
    BigDecimal curfact = new BigDecimal("1");
    BigDecimal factmul = new BigDecimal("1");
    BigDecimal curval = new BigDecimal("0");

    String curout = "";

    // number of desired decimal places
    final int NP = 40;

    for (;;) {
      // divide 1 by the current factorial
      BigDecimal x = one.divide(curfact, NP + 1,
          BigDecimal.ROUND_HALF_EVEN);

      // add the result to the accumulated value
      curval = curval.add(x);

      // move to the next factorial value
      curfact = curfact.multiply(factmul);
      factmul = factmul.add(one);

      // check convergence of the current value
      String s =
          curval.toString().substring(0, NP + 2);
      if (s.equals(curout)) {
        System.out.println(s);
        break;
      }
      curout = s;
    }
  }
}

During the calculation, one extra digit is carried beyond the 40 requested, so that rounding in the last place of each term does not disturb the digits that are printed. The rounding behavior itself is ROUND_HALF_EVEN, which means "round toward the nearest digit", or if the two neighboring digits are equidistant, round toward the even one. For example, division using this rounding mode, assuming one decimal place, comes out like this:

  
755 / 100 = 7.6 (7.5 and 7.6 are equidistant, 
                 round toward 6)

745 / 100 = 7.4 (7.4 and 7.5 are equidistant, 
                 round toward 4)
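Because ties go up about as often as they go down, this rounding mode statistically minimizes cumulative error when it is applied repeatedly over a sequence of calculations.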
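The two divisions above can be checked with a short sketch. It assumes a JDK that provides the java.math.RoundingMode enum (Java 5 or later), whose HALF_EVEN value corresponds to the BigDecimal.ROUND_HALF_EVEN constant used in the program. The class name RoundingDemo and the helper div are made up for this example.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class RoundingDemo {
  // Divides a by b, keeping one decimal place and rounding with the
  // given mode (hypothetical helper for this sketch).
  static String div(int a, int b, RoundingMode mode) {
    return new BigDecimal(a).divide(new BigDecimal(b), 1, mode).toString();
  }

  public static void main(String[] args) {
    // 7.55 is halfway between 7.5 and 7.6; 6 is even.
    System.out.println(div(755, 100, RoundingMode.HALF_EVEN)); // 7.6

    // 7.45 is halfway between 7.4 and 7.5; 4 is even.
    System.out.println(div(745, 100, RoundingMode.HALF_EVEN)); // 7.4

    // For contrast, HALF_UP always rounds a tie away from zero.
    System.out.println(div(745, 100, RoundingMode.HALF_UP));   // 7.5
  }
}
```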

The output of the program is:

2.7182818284590452353602874713526624977572
which is a correct value for "e" to 40 places.

The JDC Tech Tips are written by Glen McCluskey.


[ This page was updated: 21-Sep-2000 ]
Sun Microsystems, Inc.
Copyright © 1995-2000 Sun Microsystems, Inc.
All Rights Reserved. Terms of Use. Privacy Policy.