Java Technology Home Page
A-Z Index

Java Developer Connection(SM)
Technical Tips

Downloads, APIs, Documentation
Java Developer Connection
Tutorials, Tech Articles, Training
Online Support
Community Discussion
News & Events from Everywhere
Products from Everywhere
How Java Technology is Used Worldwide
Print Button
 
Tech Tips index

Tech Tips
December 16, 1997

This issue presents tips, techniques, and sample code for the following topics:

Performance -- using Object to represent disparate types.
This tip is a little tricky, but it recently came up in an actual application, and illustrates how Java language features are used to efficiently represent a large data structure.

The application is one where a very large tree structure, consuming millions of bytes, is built up. Some of the nodes in the tree reference child nodes (non-terminals), while others are leaf nodes (terminals) and have no children, but contain String information. The application involves parsing a large Java program and representing it internally via a tree.

One simple approach to this problem is to define a Node class such as the following:


        public class Node {
                private int type;
                private Node child[];
                private String info;
        }

If the node is a leaf node, then info is used. Otherwise, child refers to the children of the node, and child.length to the number of children.

This approach works pretty well, but uses a lot of memory. Only one of child and info are used at any one time, meaning that the other field is wasted. child is an array, with attendant overhead, for example, in storing the dimensions of the array for subscript checking. For certain large inputs, the parser program runs out of memory.

The first refinement of this approach is to collapse child and info:


        public class Node {
                private int type;
                private Object info;
        }

In this scheme, info can refer to either a String, for a leaf node, or to a child node array. Object is the root of the Java class hierarchy, so that for example, the following:


        class A {}

implicitly means:


        class A extends Object {}

An instance of a subclass of Object, such as String, can be assigned to an Object reference. An array of Nodes can likewise be assigned to an Object. The instanceof operator can be used to determine the actual type of an Object reference.

In the parser application, using Object to represent both data types is not good enough because it still takes up too much memory. So a further change has been implemented. After doing some research, it was found that the child array consisted of a single Node element about 95 percent of the time. So it's possible to represent one-child cases directly using an Object reference to the child node, rather than a reference to a one-long array of child nodes.

This representation is complicated, and it's useful to define a method for encapsulating the abstraction as in the following example:


public class Node {
  private int type;
  private Object info;

// constructors, other methods here ...

// gets the i-th child reference

  public Node getChild(int i)
  {
    if (info instanceof String)
      return null;
    else if (info instanceof Node && i == 0)
      return (Node)info;
    else
      return ((Node[])info)[i];
   }
}

getChild returns the i-th child, or null for leaf nodes. If there is exactly one child, then info is of type Node, referencing that child. If there is more than one child, info is of type Node[], and a cast to Node[] is done, followed by a retrieval and return of the child reference.

In the parser application, this change is enough to tip the scales, so that the application would not run out of memory. The internal representation in this example is tricky, but it can be hidden via methods such as getChild. In general, it's wise to avoid tricky coding, but useful to know how to do it when the need arises.

The example also illustrates the utility of using one Object reference to represent several different data types. In C/C++ similar techniques would use void* pointers or unions.

Please see Tech Tips: January 20, 1998 for a followup on this topic.

Reflection.
An important Java language feature is reflection, a feature also known as "introspection." Reflection is the ability to query a Java class about its properties, and to operate on methods and fields by name for a given object instance.

You can use reflection to set object fields or invoke particular object methods by name. For example, given an object instance "obj," and a method name "f5" specified at program execution time, the method can be invoked on the instance.

To see how this works, look at this simple example:


import java.lang.reflect.*;

public class DumpMethods {
  public static void main(String args[])
  {
    try {
      Class c = Class.forName(args[0]);
      Method m[] = c.getDeclaredMethods();
      for (int i = 0; i < m.length; i++)
        System.out.println(m[i].toString());             
    }
              
    catch (Throwable e) {
      System.err.println(e);
    }
  }
}

For an invocation of:


        java DumpMethods java.util.Stack

the output is:


public java.lang.Object 
         java.util.Stack.push(java.lang.Object)
public synchronized 
         java.lang.Object java.util.Stack.pop()
public synchronized 
         java.lang.Object java.util.Stack.peek()
public boolean java.util.Stack.empty()
public synchronized int 
         java.util.Stack.search(java.lang.Object)

That is, the method names of class java.util.Stack are listed, along with their fully qualified return types and parameter types.

This program loads the specified class using class.forName, and then calls getDeclaredMethods to retrieve the list of methods defined in the class. The class java.lang.reflect.Method represents a single class method.

The reflection feature may not seem like much at first, but it's impossible to do in other languages such as C, C++, or Fortran. The names and properties of functions in these other languages are gone by the time the program is executed. In technical terms, these other languages have "earlybinding," whereas the Java language has "late binding."


Print Button
[ This page was updated: 21-Sep-2000 ]
Products & APIs | Developer Connection | Docs & Training | Online Support
Community Discussion | Industry News | Solutions Marketplace | Case Studies
Glossary | Feedback | A-Z Index
For more information on Java technology
and other software from Sun Microsystems, call:
(800) 786-7638
Outside the U.S. and Canada, dial your country's AT&T Direct Access Number first.
Sun Microsystems, Inc.
Copyright © 1995-2000 Sun Microsystems, Inc.
All Rights Reserved. Terms of Use. Privacy Policy.