WELCOME to the Java Developer Connection(sm) (JDC) Tech Tips,
February 29, 2000. This issue focuses on serialization.
The tip has four parts:

  * Serialization in the Real World
  * Serialization and Class Versioning
  * Serialization and Secure Data
  * Serialization and the Complete Class Rewrite

This issue of the JDC Tech Tips is written by Stuart Halloway,
a Java specialist at DevelopMentor.

These tips were developed using Java(TM) 2 SDK, Standard Edition,
v 1.2.2, and are not guaranteed to work with other versions.

SERIALIZATION IN THE REAL WORLD

The Java(TM) serialization mechanism illustrates two of the best
characteristics of the Java(TM) programming language: simplicity
and flexibility. Serialization allows you to create persistent
objects, that is, objects that can be stored and then
reconstituted for later use. You might want to do this, for
example, if you want to use an object with a program and then
use the object again with a later invocation of the same program.

The basic mechanism of serialization is simple. And it's flexible
enough for you to customize default serialization as needed. This
tip shows you how to serialize objects. It then shows you three
situations where you can take advantage of the mechanism's
flexibility: introducing a new version of a class, securing
protected data, and completely rewriting a class.

First, take a look at this basic example:

    import java.io.*;

    public class Person implements Serializable {
        public String firstName;
        public String lastName;
        private String password;
        transient Thread worker;

        public Person(String firstName,
                      String lastName, String password) {
            this.firstName = firstName;
            this.lastName = lastName;
            this.password = password;
        }

        public String toString() {
            return new String(
                lastName + ", " + firstName);
        }
    }

    class WritePerson {
        public static void main(String [] args) {
            Person p = new Person("Fred",
                "Wesley", "cantguessthis");
            ObjectOutputStream oos = null;
            try {
                oos = new ObjectOutputStream(
                    new FileOutputStream(
                        "Person.ser"));
                oos.writeObject(p);
            }
            catch (Exception e) {
                e.printStackTrace();
            }
            finally {
                if (oos != null) {
                    try {oos.flush();}
                    catch (IOException ioe) {}
                    try {oos.close();}
                    catch (IOException ioe) {}
                }
            }
        }
    }

    class ReadPerson {
        public static void main(String [] args) {
            ObjectInputStream ois = null;
            try {
                ois = new ObjectInputStream(
                    new FileInputStream(
                        "Person.ser"));
                Object o = ois.readObject();
                System.out.println(
                    "Read object " + o);
            }
            catch (Exception e) {
                e.printStackTrace();
            }
            finally {
                if (ois != null) {
                    try {ois.close();}
                    catch (IOException ioe) {}
                }
            }
        }
    }

Person is a class that represents data you'd like to make
persistent. You might want to archive it to disk and reload it in
a later session. Java technology makes this easy. All you need
to do is declare that the Person class implements the
java.io.Serializable interface. The Serializable interface does
not have any methods. It's simply a "signal" interface that
indicates to the Java(TM) virtual machine[1] that you want to use
the default serialization mechanism.

Compile Person and then test the code by first running
WritePerson. WritePerson wraps a FileOutputStream for the file
Person.ser in an ObjectOutputStream and writes the Person object
to it. This means it formats the object as a stream of bytes and
saves it in the Person.ser file. Then, execute ReadPerson. This
wraps a FileInputStream for Person.ser in an ObjectInputStream.
In other words, it reads the byte stream from Person.ser and
reconstitutes the Person object from it. ReadPerson then prints
the object. You should see:

    Read object Wesley, Fred

The serialization mechanism you just used is capable of handling
a wide variety of situations. When you serialize an object, you
save the complete state of the object, including all of its
fields. This even includes fields marked private, such as the
password field in the Person example. However, there are times
when you don't want a field to be persistent. In the Person
example, the worker thread is tied to resources that are specific
to this session of the virtual machine. It does not make any sense
to serialize the thread for later use. Fortunately, the Java(TM)
programming language includes the transient modifier. A field
marked transient is not saved when an object is serialized.
Notice that the worker thread is declared transient, so it is not
saved when the Person object is serialized.
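
The effect of private and transient fields can also be checked
without touching the file system. The following is a minimal
sketch, not part of the original tip: the Session class and the
roundTrip method are hypothetical names, the object is serialized
to a byte array instead of Person.ser, and the try-with-resources
syntax assumes Java 7 or later rather than the 1.2.2 SDK.

```java
import java.io.*;

public class TransientDemo {
    // Hypothetical stand-in for the tip's Person class.
    static class Session implements Serializable {
        private static final long serialVersionUID = 1L;
        String user;               // serialized
        private String password;   // private, but still serialized
        transient Thread worker;   // transient, so skipped

        Session(String user, String password) {
            this.user = user;
            this.password = password;
            this.worker = new Thread();
        }
    }

    // Writes a Session to a byte array, reads it back, and
    // describes the fields of the reconstituted copy.
    public static String roundTrip() {
        Session original = new Session("fred", "cantguessthis");
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(original);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        try (ObjectInputStream ois = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            Session copy = (Session) ois.readObject();
            return copy.user + " / " + copy.password + " / " + copy.worker;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints: fred / cantguessthis / null
        System.out.println(roundTrip());
    }
}
```

The private password survives the round trip, while the transient
worker comes back as null, the default value for an object
reference.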

SERIALIZATION AND CLASS VERSIONING

Default serialization often runs into trouble when you make
a simple enhancement to a class. Imagine that after shipping your
Person class, you decide to track a Person's age. The modification
to the Person class is straightforward:

    import java.io.*;

    public class Person implements Serializable {
        public String firstName;
        public String lastName;
        int age;
        private String password;
        transient Thread worker;

        public Person(String firstName,
                      String lastName,
                      String password,
                      int age) {
            this.firstName = firstName;
            this.lastName = lastName;
            this.password = password;
            this.age = age;
        }

        public String toString() {
            return new String(lastName + ", " +
                firstName + " age " + age);
        }
    }

    class WritePerson {
        public static void main(String [] args) {
            Person p = new Person("Fred", "Wesley",
                "cantguessthis", 31);
            //everything past this point is
            //the same as the original...

What happens if somebody tries to use this new version of Person
to stream in an old Person.ser file? Try it by executing
ReadPerson again. (Don't run WritePerson first, or you will
overwrite the old Person.ser file.) Notice that you can no longer
read the file. Instead, you get a java.io.InvalidClassException.
This is because the Java serialization mechanism is very cautious
with modified classes. When a class is serialized, a 64-bit
"fingerprint" for the class is calculated. This fingerprint, which
is called the serialVersionUID, is based on several pieces of
class data, including all the serializable fields. Because you
added a new field (age) to the class, the serialVersionUID no
longer matches, and you cannot read your old Person.ser file.

The cautious approach is nice, because it prevents nasty bugs that
might appear if two versions of a class were truly incompatible
in some way. However, you might reasonably argue that the new
Person class is compatible with the old one. Also, your code is
aware that the age value might not be set correctly when loading
Person in its original format. In this situation, you need a way
to tell Java that two classes are compatible. You can do this by
explicitly setting the serialVersionUID for the Person class. If
you add a line of the form:

    static final long serialVersionUID = /* some long integer */;

to a class, Java serialization will use that ID, instead of
calculating one for you. Of course, this piece of information is
coming a little late, since you already saved the original Person
using some Java-generated ID. Despair not. The serialver
command-line tool in JDK(TM) 1.2 lets you extract a
serialVersionUID from an existing class. Recompile the original
Person class, and issue the command:

    serialver Person

In response, you should see:

    static final long serialVersionUID = 4070409649129120458L;

Add this entire line to the new version of Person, and recompile.
Now you can successfully load the original Person by running
ReadPerson. The age is not correct (it's set to a default value,
0), because the original format didn't have an age field. At the
very least, you have access to all of the data you serialized with
the first version of the Person class.
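
The ID that serialization uses for a class can also be read from
inside a program through java.io.ObjectStreamClass. The sketch
below is an illustration only: the Computed and Pinned classes and
the uidOf method are hypothetical names that do not appear in the
tip, and the generic Class<?> type assumes Java 5 or later.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class SerialVersionDemo {
    // No explicit ID: the runtime calculates one from the class data.
    public static class Computed implements Serializable {
        String name;
    }

    // Explicit ID: the runtime uses the declared value as-is.
    public static class Pinned implements Serializable {
        private static final long serialVersionUID = 4070409649129120458L;
        String name;
        int age;
    }

    // Returns the serialVersionUID that serialization uses for the class.
    public static long uidOf(Class<?> type) {
        ObjectStreamClass desc = ObjectStreamClass.lookup(type);
        if (desc == null) {
            // lookup returns null for classes that are not serializable
            throw new IllegalArgumentException(type + " is not serializable");
        }
        return desc.getSerialVersionUID();
    }

    public static void main(String[] args) {
        // The calculated value depends on the class's shape.
        System.out.println("Computed: " + uidOf(Computed.class) + "L");
        // The declared value is returned unchanged.
        System.out.println("Pinned:   " + uidOf(Pinned.class) + "L");
    }
}
```

Adding or removing a field in Computed changes its calculated
value, whereas Pinned keeps reporting the declared value no matter
how its fields evolve.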

SERIALIZATION AND SECURE DATA

Earlier, you saw that Java serialization works even with private
data. This is necessary because private data is usually an
essential part of an object's state. Without the private fields,
serialization would be meaningless. However, this presents
a problem. In the Person example above, Fred Wesley trusts that
nobody can see his password, since the password field is private.
With serialization, you can bypass this protection by dumping
Fred's Person instance to a file. Open the Person.ser file in
a hex editor, and Fred's password ("cantguessthis") is visible to
all the world.
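
If no hex editor is at hand, the same exposure can be confirmed in
code. This is a rough sketch under stated assumptions: the Login
class and the passwordVisible method are hypothetical names, the
object is serialized to a byte array rather than to Person.ser,
and the syntax assumes Java 7 or later.

```java
import java.io.*;
import java.nio.charset.StandardCharsets;

public class PlaintextLeakDemo {
    // Hypothetical cut-down class with a single private field.
    static class Login implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String password;

        Login(String password) {
            this.password = password;
        }
    }

    // Serializes a Login and reports whether the password can be
    // found, as plain text, in the resulting bytes.
    public static boolean passwordVisible(String password) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(new Login(password));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        // ISO-8859-1 maps each byte to exactly one character, much
        // like the text column of a hex editor.
        String raw = new String(bytes.toByteArray(),
            StandardCharsets.ISO_8859_1);
        return raw.contains(password);
    }

    public static void main(String[] args) {
        // prints: Password visible in stream: true
        System.out.println("Password visible in stream: "
            + passwordVisible("cantguessthis"));
    }
}
```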
To fix this security exposure, you need to control the way that
data is written to the stream. The Java programming language allows
you to do this with the following two methods:

    private void writeObject(
        ObjectOutputStream stream)
        throws IOException;

    private void readObject(
        ObjectInputStream stream)
        throws IOException,
               ClassNotFoundException;

For serializable objects, the writeObject method allows a class to
control the serialization of its own fields. The readObject method
allows a class to control the deserialization of its own fields.
What this means is that if you implement these methods in a
Serializable class, they will replace the normal serialization
behavior. Using writeObject and readObject allows you to do almost
anything with the stream. But in the Person case all you really
need to do is encrypt the password. Once the password is
encrypted, you can let the normal serialization mechanism take
over. You can defer to the normal mechanism by calling the methods
defaultReadObject and defaultWriteObject.

Here's a Person class that puts this all together:

    import java.io.*;

    public class Person
        implements Serializable {
        public String firstName;
        public String lastName;
        int age;
        private String password;
        transient Thread worker;
        static final long serialVersionUID =
            4070409649129120458L;

        //This is not a serious encryption
        //algorithm! It works
        //but you should substitute
        //something better.
        static String crypt(String input,
                            int offset) {
            StringBuffer sb =
                new StringBuffer();
            for (int n=0;
                 n<input.length(); n++) {
                sb.append((char)(
                    offset+input.charAt(n)));
            }
            return sb.toString();
        }

        //In a real application, you should
        //synchronize access to
        //password and you should not print
        //the password to System.out!
        private void writeObject(
            ObjectOutputStream stream)
            throws IOException {
            password = crypt(password, 3);
            System.out.println(
                "Password encrypted as " +
                password);
            stream.defaultWriteObject();
            password = crypt(password, -3);
        }

        private void readObject(
            ObjectInputStream stream)
            throws IOException,
                   ClassNotFoundException {
            stream.defaultReadObject();
            password = crypt(password, -3);
            System.out.println(
                "Password decrypted to " +
                password);
        }

        public Person(String firstName,
                      String lastName,
                      String password,
                      int age) {
            this.firstName = firstName;
            this.lastName = lastName;
            this.password = password;
            this.age = age;
        }

        public String toString() {
            return new String(lastName + ", " +
                firstName + " age " + age);
        }
    }

Notice that writeObject encrypts the password before it invokes
the default serialization mechanism with
stream.defaultWriteObject. The readObject method reverses the
process. Use the WritePerson and ReadPerson classes to test this
new version. You'll see that the password is no longer visible as
plain text. You can also try viewing the Person.ser file.
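
The effect of the custom writeObject and readObject methods can
also be checked in memory. The sketch below is an illustration
under stated assumptions: Account, passwordVisible, and
restoredPassword are hypothetical names, the crypt method is the
same toy character shift used in the tip (not real encryption),
and the syntax assumes Java 7 or later.

```java
import java.io.*;
import java.nio.charset.StandardCharsets;

public class MaskedPasswordDemo {
    // Hypothetical cut-down version of the tip's Person class.
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        private String password;

        Account(String password) {
            this.password = password;
        }

        // Toy character shift, as in the tip. Not real encryption.
        static String crypt(String input, int offset) {
            StringBuilder sb = new StringBuilder();
            for (int n = 0; n < input.length(); n++) {
                sb.append((char) (offset + input.charAt(n)));
            }
            return sb.toString();
        }

        private void writeObject(ObjectOutputStream stream)
                throws IOException {
            password = crypt(password, 3);    // mask before writing
            stream.defaultWriteObject();
            password = crypt(password, -3);   // restore in-memory value
        }

        private void readObject(ObjectInputStream stream)
                throws IOException, ClassNotFoundException {
            stream.defaultReadObject();
            password = crypt(password, -3);   // unmask after reading
        }
    }

    static byte[] serialize(Object obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(obj);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    // True if the plain-text password appears in the serialized bytes.
    public static boolean passwordVisible(String password) {
        byte[] data = serialize(new Account(password));
        return new String(data, StandardCharsets.ISO_8859_1)
            .contains(password);
    }

    // Serializes and deserializes an Account, then returns its password.
    public static String restoredPassword(String password) {
        byte[] data = serialize(new Account(password));
        try (ObjectInputStream ois = new ObjectInputStream(
                new ByteArrayInputStream(data))) {
            return ((Account) ois.readObject()).password;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints: visible=false restored=cantguessthis
        System.out.println("visible=" + passwordVisible("cantguessthis")
            + " restored=" + restoredPassword("cantguessthis"));
    }
}
```

The plain text no longer appears in the stream, yet the reader
still recovers the original password.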

SERIALIZATION AND THE COMPLETE CLASS REWRITE

What happens if you need to change some existing fields in the
Person class? Assume that you decide to eliminate the lastName
and firstName fields in favor of a single fullName field. The
Person class now looks like this:

    import java.io.*;

    public class Person implements Serializable {
        public String fullName;
        int age;
        private String password;
        transient Thread worker;
        static final long serialVersionUID =
            4070409649129120458L;

        //This is not a serious encryption
        //algorithm! It works
        //but you should substitute
        //something better.
        static String crypt(String input,
                            int offset) {
            StringBuffer sb =
                new StringBuffer();
            for (int n=0;
                 n<input.length(); n++) {
                sb.append((char)(
                    offset+input.charAt(n)));
            }
            return sb.toString();
        }

        //In a real application, you should
        //synchronize access to
        //password and you should not print
        //the password to System.out!
        private void writeObject(
            ObjectOutputStream stream)
            throws IOException {
            password = crypt(password, 3);
            System.out.println(
                "Password encrypted as " +
                password);
            stream.defaultWriteObject();
            password = crypt(password, -3);
        }

        private void readObject(
            ObjectInputStream stream)
            throws IOException,
                   ClassNotFoundException {
            stream.defaultReadObject();
            password = crypt(password, -3);
            System.out.println(
                "Password decrypted to " +
                password);
        }

        public Person(String firstName,
                      String lastName,
                      String password,
                      int age) {
            this.fullName = lastName + ", " +
                firstName;
            this.password = password;
            this.age = age;
        }

        public String toString() {
            return new String(fullName +
                " age " + age);
        }
    }

Now try reloading an existing Person.ser file, by executing
ReadPerson. Because you have set the serialVersionUID, the code
doesn't crash. But it doesn't do anything useful. The new class
has no field names that match the lastName and firstName fields
in the Person.ser file, so these fields are ignored. Conversely,
the fullName field does not exist in the Person.ser file, so
a correct fullName value isn't materialized (instead it's set to
the default, a null value).

The problem is with defaultReadObject, which tries to match stream
fields by name to fields in the class. Normally this saves you
a lot of trouble, but in this case the field names no longer match.
So you need to manage how fields are read from storage. You can
explicitly name the fields you expect to find in the stream by
using the nested class ObjectInputStream.GetField. Here's how you
can use ObjectInputStream.GetField in the Person class:

    //Replace readObject with
    //this new version:
    private void readObject(
        ObjectInputStream ois)
        throws IOException,
               ClassNotFoundException {
        ObjectInputStream.GetField gf =
            ois.readFields();
        //Hope that we have the new version...
        fullName = (String) gf.get(
            "fullName", null);
        if (fullName == null) {
            //Uh-oh.
            //Old version.
            //Calculate fullName:
            String lastName = (String) gf.get(
                "lastName", null);
            String firstName =
                (String) gf.get(
                    "firstName", null);
            fullName = lastName + ", " +
                firstName;
        }
        age = gf.get("age", 0);
        password = (String) gf.get("password", null);
        password = crypt(password, -3);
        System.out.println(
            "Password decrypted to " + password);
    }

First, the GetField object is accessed by calling readFields on
the ObjectInputStream. Then the code attempts to read the stream
field "fullName" into the class field fullName. The second
parameter to the get method indicates a default value (null) to
use if no fullName field exists. If the default value is returned,
the method assumes that the stream is in the old format. It uses
the get method to read the stream fields lastName and firstName
into local variables, and then calculates the fullName value.

Try this class with both an old and new version of Person.ser.
It will now work with either one.
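
Testing both formats normally takes two compiled versions of the
class. The sketch below squeezes the experiment into one program
and is only an illustration: PersonV1, PersonV2, and the helper
methods are hypothetical names, and the renameClass step is
a demo-only trick (it rewrites the class name inside the byte
stream so that bytes written by PersonV1 look like an old-format
PersonV2). It assumes both class names have the same length, both
classes declare the same serialVersionUID, and Java 7 or later.

```java
import java.io.*;
import java.nio.charset.StandardCharsets;

public class GetFieldDemo {
    // Old layout: separate name fields.
    static class PersonV1 implements Serializable {
        private static final long serialVersionUID = 1L;
        String firstName;
        String lastName;

        PersonV1(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }
    }

    // New layout: a single fullName field, same serialVersionUID.
    static class PersonV2 implements Serializable {
        private static final long serialVersionUID = 1L;
        String fullName;

        PersonV2(String fullName) {
            this.fullName = fullName;
        }

        private void readObject(ObjectInputStream ois)
                throws IOException, ClassNotFoundException {
            ObjectInputStream.GetField gf = ois.readFields();
            fullName = (String) gf.get("fullName", null);
            if (fullName == null) {
                // Old layout: rebuild fullName from the old stream fields.
                String lastName = (String) gf.get("lastName", null);
                String firstName = (String) gf.get("firstName", null);
                fullName = lastName + ", " + firstName;
            }
        }
    }

    static byte[] serialize(Object obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(obj);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data) {
        try (ObjectInputStream ois = new ObjectInputStream(
                new ByteArrayInputStream(data))) {
            return ois.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Demo-only trick: swap one class name for another of equal
    // length inside the stream. ISO-8859-1 keeps every byte intact.
    static byte[] renameClass(byte[] data, String from, String to) {
        if (from.length() != to.length()) {
            throw new IllegalArgumentException("names must match in length");
        }
        String raw = new String(data, StandardCharsets.ISO_8859_1);
        return raw.replace(from, to).getBytes(StandardCharsets.ISO_8859_1);
    }

    // Reads a stream that contains firstName and lastName only.
    public static String readOldFormat() {
        byte[] old = serialize(new PersonV1("Fred", "Wesley"));
        byte[] patched = renameClass(old,
            PersonV1.class.getName(), PersonV2.class.getName());
        return ((PersonV2) deserialize(patched)).fullName;
    }

    // Reads a stream that already contains fullName.
    public static String readNewFormat() {
        byte[] data = serialize(new PersonV2("Wesley, Fred"));
        return ((PersonV2) deserialize(data)).fullName;
    }

    public static void main(String[] args) {
        System.out.println("Old format: " + readOldFormat());
        System.out.println("New format: " + readNewFormat());
    }
}
```

Both calls yield "Wesley, Fred". One caveat of using null as the
old-format signal: a new-format stream whose fullName really is
null would send the code down the old-format branch, where asking
for a field that exists in neither the stream nor the class fails
with an IllegalArgumentException. The GetField.defaulted method is
a more direct way to ask whether a field was absent from the
stream.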

CONCLUSION

The final Person class does a lot more work than the original,
which simply implemented the Serializable interface. The final
version specifies a serialVersionUID, manages the state of the
password field, and names the fields you want to read. This
additional work is the price you pay for a major benefit: the
ability to evolve your code over time. With these techniques, your
persistent classes become backwards compatible, that is, they add
new capabilities without losing capabilities they already had.

To learn more about Java serialization, check out the Java
serialization specification.
_______
[1] As used on this web site, the terms "Java virtual machine" or
"JVM" mean a virtual machine for the Java platform.