Java Developer Connection(SM)
Technical Tips

Tech Tips
January 20, 1998

This issue presents tips, techniques, and sample code for the following topics:

String versus StringBuffer
Using javap
Followup on Tech Tips: December 16, 1997

String versus StringBuffer.
For this tip, suppose you have an application in which you're detabbing text for later display, using a method like drawString in java.awt.Graphics. In such an application, it's necessary to go through the text character by character and expand each tab into the number of spaces required to reach the next tab stop (typically set at intervals of 8 characters).
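
The tab-stop arithmetic described above can be sketched in isolation. This is a minimal illustration, not code from the original tip: the class name TabStops and the method spacesToNextStop are hypothetical, and the interval of 8 is the one mentioned in the text.

```java
public class TabStops {
    static final int TAB_WIDTH = 8;   // tab stop interval taken from the text

    // Number of spaces needed to move from column pos (0-based) to the
    // next tab stop. A tab at a position that is already on a tab stop
    // advances a full TAB_WIDTH columns.
    static int spacesToNextStop(int pos) {
        return TAB_WIDTH - (pos % TAB_WIDTH);
    }

    public static void main(String[] args) {
        System.out.println(spacesToNextStop(0));  // prints 8
        System.out.println(spacesToNextStop(3));  // prints 5
        System.out.println(spacesToNextStop(7));  // prints 1
        System.out.println(spacesToNextStop(8));  // prints 8
    }
}
```

For instance, a tab that follows "abc" starts at position 3 and expands to 5 spaces, so the next character lands at position 8.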

What's the fastest way to do this? One approach is to set up an empty string and append characters to it in turn, using the += operator for appending, as shown in the following example:


   String result = "";

   ...

   result += 'x';

This approach works, but overlooks the fact that in the Java language, strings are immutable (that is, they never change). Therefore, the operation shown above consists of copying the current value of "result" to a temporary buffer, appending the character, and creating a new String object that result then references. Another way of saying this is that in:


   String string1 = "abc";
   String string2 = string1;

   string1 = "xyz";
   System.out.println(string2);

the printed result will be "abc", because reassigning string1 doesn't change the reference from string2 to the string "abc".
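
The difference between rebinding a String reference and modifying a buffer in place can be seen in a small sketch. The class and method names here (ImmutableDemo, appendWithString, appendWithBuffer) are hypothetical and chosen only for illustration.

```java
public class ImmutableDemo {
    // Appends with +=. The caller's original String is never modified;
    // a new String object is created and the local variable is rebound.
    static String appendWithString(String s, char c) {
        String result = s;
        result += c;
        return result;
    }

    // Appends with StringBuffer. The buffer passed in is modified in place.
    static void appendWithBuffer(StringBuffer sb, char c) {
        sb.append(c);
    }

    public static void main(String[] args) {
        String original = "ab";
        String appended = appendWithString(original, 'x');
        System.out.println(original);              // prints "ab"
        System.out.println(appended);              // prints "abx"
        System.out.println(original == appended);  // prints "false"

        StringBuffer sb = new StringBuffer("ab");
        StringBuffer alias = sb;                   // second reference, same buffer
        appendWithBuffer(sb, 'x');
        System.out.println(alias);                 // prints "abx"
    }
}
```

With String, each append yields a distinct object and the earlier value stays intact. With StringBuffer, every reference to the buffer sees the appended character.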

A faster approach: Detabbing runs much faster with StringBuffer, a class that supports mutable strings. StringBuffer directly supports operations such as appending, and the result can be converted to a String at any time.

To show how this approach works in practice, here is an example of detabbing text using String (detab1) and StringBuffer (detab2):


public class perf {
  public static String detab1(String s)
  {
    if (s.indexOf('\t') == -1)
      return s;
    String res = "";
    int len = s.length();
    int pos = 0;
    int i = 0;
    for (; i < len && s.charAt(i) == '\t'; i++) 
    {
      res += "        ";
      pos += 8;
    }
    for (; i < len; i++) 
    {
      char c = s.charAt(i);
      if (c == '\t') {
        do {
          res += " ";
          pos++;
        } while (pos % 8 != 0);
      }
      else {
        res += c;
        pos++;
      }
    }
    return res;
  }
      
  public static String detab2(String s)
  {
    if (s.indexOf('\t') == -1)
      return s;
    StringBuffer sb = new StringBuffer();
    int len = s.length();
    int pos = 0;
    int i = 0;
    for (; i < len && s.charAt(i) == '\t'; i++) 
    {
      sb.append("        ");
      pos += 8;
    }
    for (; i < len; i++) {
      char c = s.charAt(i);
      if (c == '\t') {
        do {
          sb.append(' ');
          pos++;
        } while (pos % 8 != 0);
      }
      else {
        sb.append(c);
        pos++;
      }
    }
    return sb.toString();
  }
        
  public static String testlist[] = {
    "",
    "\t",
    "\t\t\tabc",
    "abc\tdef",
    "1234567\t8",
    "12345678\t9",
    "123456789\t"
  };
        
  public static void main(String args[])
  {
    // Check that both methods produce the same output for each test case.
    for (int i = 0; i < testlist.length; i++) {
      String tc = testlist[i];
      if (!detab1(tc).equals(detab2(tc)))
        System.err.println(tc);
    }

    String test_string =
      "\t\tthis is a test\tof detabbing performance";
    int N = 5000;
    int i = 0;

    // Time N calls of the String version.
    long ct = System.currentTimeMillis();
    for (i = 1; i <= N; i++)
      detab1(test_string);
    long elapsed = System.currentTimeMillis() - ct;
    System.out.println("String time = " + elapsed);

    // Time N calls of the StringBuffer version.
    ct = System.currentTimeMillis();
    for (i = 1; i <= N; i++)
      detab2(test_string);
    elapsed = System.currentTimeMillis() - ct;
    System.out.println("StringBuffer time = "
      + elapsed);
  }
}

This example first runs some test data through both methods to confirm that they produce identical output, and then times 5000 calls of each. In this test, the second method, using StringBuffer, runs approximately six times as fast as the first method, using String.

When you're tuning programs that do a lot of character processing, keep an eye on how much work the program does for each character. The example above shows how much that per-character cost depends on the choice between String and StringBuffer.

Using javap
javap is a tool supplied with JDK(TM) 1.0.x, and is used to dump out various types of information about .class files. By default, javap displays declarations for all the non-private members of a class. For example, if you run javap on the performance example above, by typing:


   javap perf

the result is:


   Compiled from perf.java
   public synchronized class perf extends java.lang.Object
   /* ACC_SUPER bit set */
   {
      public static java.lang.String testlist[];
      public static java.lang.String detab1(java.lang.String);
      public static java.lang.String detab2(java.lang.String);
      public static void main(java.lang.String[]);
      public perf();
      static static {};
   }

Other options include


        javap -c perf

to print Java(1) Virtual Machine bytecodes,


        javap -l perf

to display line number and local variable tables (you need to compile perf.java with -g for this to work), and


        javap -p perf

to print private methods and fields in addition to the public ones.

javap is a handy tool for digging beneath the surface to find out what's really going on in a .class file.
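
To try the options above on something smaller than the performance example, a class with both public and private members is enough. The class below is hypothetical (JavapDemo and its members are not part of the original tip), and the exact javap output format depends on the JDK release, so no output is shown here.

```java
// Hypothetical class for experimenting with javap.
public class JavapDemo {
    public static final int TAB_WIDTH = 8;   // public field: listed by default
    private int count;                       // private field: listed only with -p

    public int next() {                      // public method: listed by default
        return bump(1);
    }

    private int bump(int delta) {            // private method: listed only with -p
        count += delta;
        return count;
    }

    public static void main(String[] args) {
        JavapDemo d = new JavapDemo();
        d.next();
        System.out.println(d.next());        // prints 2
    }
}
```

After compiling with javac -g JavapDemo.java, comparing javap JavapDemo with javap -p JavapDemo should show count and bump only in the second listing, and javap -c JavapDemo prints the bytecodes for each method.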

Followup on Tech Tips: December 16, 1997
Thanks to everyone who provided feedback on the performance tip in Tech Tips: December 16, 1997. The tip caused some confusion because the problem to be solved was not stated clearly enough.

The application is one where Java source code is being parsed. The parse tree contains non-terminal and terminal nodes, and there are several hundred different node types, representing grammar productions in Java (such as "Primary Expression").

These different types must be represented in some way. Two approaches are to use a "type" field, as was done in the example, or to define a subclass for each type. The latter approach implies several hundred classes, which is not manageable.

The other issue is whether to factor the three different cases out into distinct classes: one for a terminal node, one for a node with a single child, and one for a node with multiple children. These cases can be handled in a single class, as shown in the example, but they can also be represented via an abstract superclass with three subclasses:


   public abstract class Node {...}
   
   public class StringNode extends Node {...}
   
   public class UnaryNode extends Node {...}
   
   public class ArrayNode extends Node {...}

This approach will work, and is in many ways more elegant than the approach presented in Tech Tips: December 16, 1997. But there are a couple of tradeoffs to be made. One is that you have four classes instead of one, which makes the application somewhat harder to manage.

The other issue is one of performance. Because Node is now a superclass, it cannot be declared "final". A method such as getType or getChild, invoked through a Node superclass reference, must be dispatched at run time and will generally be slower than an equivalent method invoked on a single final class of type Node, where the call can be bound statically.
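
As a rough sketch of the subclass design discussed above, the four classes might be filled in as follows. Only the names Node, StringNode, UnaryNode, ArrayNode, getType, and getChild come from the text. The fields, constructors, getChildCount, the type constants, and the NodeDemo driver are assumptions made for illustration, and the node classes are package-private here only so that the sketch fits in one source file.

```java
// Sketch only: everything except the class names and getType/getChild
// is a hypothetical detail, not the design from the earlier tip.
abstract class Node {
    private final int type;                 // which grammar production this node is
    protected Node(int type) { this.type = type; }
    public int getType() { return type; }
    public abstract int getChildCount();
    public abstract Node getChild(int i);   // dispatched at run time through Node
}

final class StringNode extends Node {       // terminal node: text, no children
    private final String text;
    StringNode(int type, String text) { super(type); this.text = text; }
    public String getText() { return text; }
    public int getChildCount() { return 0; }
    public Node getChild(int i) {
        throw new IndexOutOfBoundsException("terminal node has no children");
    }
}

final class UnaryNode extends Node {        // node with exactly one child
    private final Node child;
    UnaryNode(int type, Node child) { super(type); this.child = child; }
    public int getChildCount() { return 1; }
    public Node getChild(int i) {
        if (i != 0) throw new IndexOutOfBoundsException("index " + i);
        return child;
    }
}

final class ArrayNode extends Node {        // node with any number of children
    private final Node[] children;
    ArrayNode(int type, Node[] children) {
        super(type);
        this.children = (Node[]) children.clone();
    }
    public int getChildCount() { return children.length; }
    public Node getChild(int i) { return children[i]; }
}

public class NodeDemo {
    // Hypothetical type constants standing in for grammar productions.
    static final int IDENTIFIER = 1;
    static final int PRIMARY_EXPRESSION = 2;
    static final int ARGUMENT_LIST = 3;

    // Walks the tree through the Node superclass and counts terminal nodes.
    static int countTerminals(Node n) {
        if (n instanceof StringNode) return 1;
        int total = 0;
        for (int i = 0; i < n.getChildCount(); i++)
            total += countTerminals(n.getChild(i));
        return total;
    }

    public static int sampleTerminalCount() {
        Node b = new UnaryNode(PRIMARY_EXPRESSION, new StringNode(IDENTIFIER, "b"));
        Node root = new ArrayNode(ARGUMENT_LIST,
            new Node[] { new StringNode(IDENTIFIER, "a"), b });
        return countTerminals(root);
    }

    public static void main(String[] args) {
        System.out.println(sampleTerminalCount());   // prints 2
    }
}
```

Every call in countTerminals goes through a Node reference, which is the kind of call the performance point above is concerned with. The subclasses can still be declared final even though Node itself cannot.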

_______
(1) As used on this web site, the terms "Java virtual machine" or "JVM" mean a virtual machine for the Java platform.


[ This page was updated: 21-Sep-2000 ]
Sun Microsystems, Inc.
Copyright © 1995-2000 Sun Microsystems, Inc.
All Rights Reserved. Terms of Use. Privacy Policy.