The following chapter is excerpted from JavaSpaces™ Principles, Patterns, and Practice, recently published by Addison-Wesley as part of the Jini™ Technology Series from Sun Microsystems, Inc.
Note from the authors:
JavaSpaces™ is a powerful Jini™ service that provides a high-level tool for
creating collaborative and distributed applications in Java. You might be
familiar with other approaches to building distributed programs, such as message
passing and remote method invocation. The JavaSpaces model is quite
different from these techniques: It provides persistent object exchange areas
through which remote Java processes coordinate their actions and exchange data.
This alternative approach can significantly ease the design and coding involved
in building distributed applications, and is a valuable tool to have in your
distributed programming repertoire.
The book, JavaSpaces™ Principles, Patterns, and Practice, is a comprehensive guide to the
technology, providing details of the JavaSpaces API and numerous examples that
teach you how to develop advanced distributed computing applications with the
technology. Chapter 1, excerpted here, provides an introduction to the
JavaSpaces model and gets you started with space-based programming by showing
you how to build a basic "Hello World" application. -- Eric Freeman and
Susanne Hupfer
A language that doesn't affect the way you think about programming, is not
worth knowing.
-- Alan Perlis, Epigrams on Programming
The Java™ programming language is arguably the most popular
programming language in computing history--never before has a language so
quickly achieved widespread use among computing professionals, students, and
hobbyists worldwide. The Java programming language owes its success in part to
its clean syntax, object-oriented nature, and platform independence, but also to
its unique position as the first general programming language expressly designed
to work over networks, in particular the Internet. As a result, many programmers
are now being exposed to network programming for the first time. Indeed, with
the computer industry moving toward increasingly network-centric systems, no
programmer's toolbox will be complete without the means and know-how to design
and construct distributed systems.
The JavaSpaces™ technology is a new tool
for building distributed systems. By providing a high-level coordination
mechanism for Java, it significantly eases the burden of creating such systems.
JavaSpaces technology is first and foremost designed to be simple:
space-based programming requires learning only a handful of operations. At the
same time, it is expressive: throughout this book we will see that a
large class of distributed problems can be approached using this simple
framework. The benefit for you, the developer, is that the combination of these
two features can significantly reduce the design effort and code needed to
create collaborative and distributed applications.
Before getting into the details of the JavaSpaces technology, let's take a
look at why you might want to build your next application as a distributed one,
as well as some of the trouble spots you might encounter along the way.
1.1 Benefits of Distributed Computing
The 1980s slogan of Sun Microsystems, Inc., "The Network is the Computer™," seems truly prophetic in light of the changes in the Internet and intranets over the last several years.
By early in the new millennium, a large class of computational devices--from
desktop machines to small appliances and portable devices--will be
network-enabled.
This trend not only impacts the way we use computers, but also changes
the way we create applications for them: distributed applications are becoming
the natural way to build software. "Distributed computing" is all about
designing and building applications as a set of processes that are distributed
across a network of machines and work together as an ensemble to solve
a common problem. There are many compelling reasons for building applications
this way.
Performance: There is a limit to how many cycles you can squeeze out
of one CPU. When you've optimized your application and still need better
performance, there is only one thing left to do: add another computer.
Fortunately, many problems can be decomposed into a number of smaller ones. Once
decomposed, we can distribute them over one or more computers to be computed in
parallel. In principle, the more computers we add, the faster the job gets done.
In reality, adding processors rarely results in perfect speedup (often the
overhead of communication gets in our way). Nevertheless, for a large class of
problems, adding more machines to the computation can significantly reduce its
running time.
This class of problems is limited to those in which the time spent communicating tasks and results is small compared to the time spent computing them (in other words, the computation/communication ratio is high). For example, a task that computes for ten seconds but spends only a tenth of a second shipping its inputs and results parallelizes well; reverse those numbers and each added machine spends most of its time waiting on the network. We will return to this topic in Chapter 6.
Scalability: When we write a standalone application, our
computational ability is limited to the power and resources of a single machine.
If instead we design the application to work over any number of processors, we
not only improve performance, but we also create an application that scales: If
the problem is too much work for the team of computers to handle, we simply add
another machine to the mix, without having to redesign our application. Our
"distributed computing engine" can grow (or shrink) to match the size of the
problem.
Resource sharing: Data and resources are distributed, just as people
are. Some computational resources are expensive (such as supercomputers or
sophisticated telescopes) or difficult to redistribute (such as large or
proprietary data sets); it isn't feasible for each end user to have local
access. With a distributed system, however, we can support and coordinate remote
access to such data and services. We could, for instance, build a distributed
application that continually collects data from a telescope in California, pipes
it to a supercomputer in New York for number crunching, adds the processed data
to a large astronomical data set in New Mexico, and at the same time graphs the
data on our workstation monitor in Connecticut.
Fault tolerance and availability: Nondistributed systems typically
have little tolerance for failure; if a standalone application fails, it
terminates and remains unavailable until it is restarted. Distributed systems,
on the other hand, can tolerate a limited amount of failure, since they are
built from multiple, independent processes--if some fail, others can continue.
By designing a distributed application carefully, we can reduce "down time" and
maximize its availability.
Elegance: For many problems, software solutions are most naturally
and easily expressed as distributed systems. Solutions often resemble the
dynamics of an organization (many processes working asynchronously and
coordinating) more than the following of a recipe (one process following
step-by-step instructions). This shouldn't be surprising, since the world at
large, along with most of its organizations, is a distributed system.
Instructing a single worker to sequentially assemble a car or run a government
is the wrong approach; the worker would be overly complex and hard to maintain.
These activities are better carried out by specialists that can handle specific
parts of the larger job. In general, it is often simpler and more elegant to
specify a design as a set of relatively independent services that individual
processes can provide--in other words, as a distributed system.
1.2 Challenges of Distributed Computing
Despite their benefits, distributed applications can be notoriously difficult to
design, build, and debug. The distributed environment introduces many
complexities that aren't concerns when writing standalone applications. Perhaps
the most obvious complexity is the variety of machine architectures and software
platforms over which a distributed application must commonly execute. In the
past, this heterogeneity problem has thwarted the development and proliferation
of distributed applications: developing an application entailed porting it to
every platform it would run on, as well as managing the distribution of
platform-specific code to each machine. More recently, the Java virtual machine
has eased this burden by providing automatic loading of class files across a
network, along with a common virtual machine[1] that runs on most platforms and allows applications to achieve "Write Once, Run Anywhere™" status.
The realities of a networked environment present many challenges beyond
heterogeneity. By their very nature, distributed applications are built from
multiple (potentially faulty) components that communicate over (potentially slow
and unreliable) network links. These characteristics force us to address issues
such as latency, synchronization, and partial failure
that simply don't occur in standalone applications. These issues have a
significant impact on distributed application design and development. Let's take
a closer look at each one:
Latency: In order to collaborate, processes in a distributed
application need to communicate. Unfortunately, over networks, communication can
take a long time relative to the speed of processors. This time lag, called
latency, is typically several orders of magnitude greater than communication
time between local processes on the same machine. As much as we'd like to sweep
this disparity under the rug, ignoring it is likely to lead to poor application
performance. As a designer, you must account for latency in order to write
efficient applications.
Synchronization: To cooperate with each other, processes in a
distributed application need not only to communicate, but also to synchronize
their actions. For example, a distributed algorithm might require processes to
work in lock step--all need to complete one phase of an algorithm before
proceeding to the next phase. Processes also need to synchronize (essentially,
wait their turn) in accessing and updating shared data. Synchronizing
distributed processes is challenging, since the processes are truly
asynchronous--running independently at their own pace and communicating, without
any centralized controller. Synchronization is an important consideration in
distributed application design.
Partial failure: Perhaps the greatest challenge you will face when
developing distributed systems is partial failure: the longer an application
runs and the more processes it includes, the more likely it is that one or more
components will fail or become disconnected from the execution (due to machine
crashes or network problems). From the perspective of other participants in a
distributed computation, a failed process is simply "missing in action," and the
reasons for failure can't be determined. Of course, in the case of a standalone
application, partial failure is not an issue--if a single component fails,
then the entire computation fails, and we either restart the application
or reboot the machine. A distributed system, on the other hand, must be
able to adapt gracefully in the face of partial failure, and it is your
job as the designer to ensure that an application maintains a consistent
global state (a tricky business).
These challenges are often difficult to overcome and can consume a
significant amount of time in any distributed programming project. These
difficulties extend beyond design and initial development; they can plague a
project with bugs that are difficult to diagnose. We'll spend a fair amount of
time in this book discussing features and techniques the JavaSpaces technology
gives us for approaching these
challenges, but first we need to lay a bit of groundwork.
1.3 What Is JavaSpaces Technology?
JavaSpaces technology is a high-level coordination tool for gluing processes
together into a distributed application. It is a departure from conventional
distributed tools, which rely on passing messages between processes or invoking
methods on remote objects. JavaSpaces technology provides a fundamentally
different programming model that views an application as a collection of
processes cooperating via the flow of objects into and out of one or more
spaces. This space-based model of distributed computing has its roots
in the Linda coordination language developed by Dr. David Gelernter at Yale
University. We provide several references to this work in Chapter 12.
A space is a shared, network-accessible repository for objects.
Processes use the repository as a persistent object storage and exchange
mechanism; instead of communicating directly, they coordinate by exchanging
objects through spaces. As shown in Figure 1.1, processes perform simple
operations to write new objects into a space, take objects
from a space, or read (make a copy of) objects in a space.
When taking or reading objects, processes use a simple value-matching lookup
to find the objects that matter to them. If a matching object isn't found
immediately, then a process can wait until one arrives. Unlike conventional
object stores, processes don't modify objects in the space or invoke their
methods directly--while there, objects are just passive data. To modify
an object, a process must explicitly remove it, update it, and reinsert
it into the space.
Figure 1.1. Processes use spaces and simple operations to
coordinate.
To build space-based applications,
we design distributed data structures and distributed
protocols that operate
over them. A distributed data structure is made up of multiple objects
that are stored in one or more spaces. For example, an ordered list of
items might be represented by a set of objects, each of which holds the
value and position of a single list item. Representing data as a collection
of objects in a shared space allows multiple processes to concurrently
access and modify the data structure.
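As a hedged sketch of this idea (the ListItem class and its field names are our own invention for illustration; Chapter 3 develops distributed data structures properly), each item of such a shared, ordered list might be represented by an entry like this:

import net.jini.core.entry.Entry;

// Hypothetical entry holding one item of a shared, ordered list;
// a process can read or take the item at a given position by
// matching on the position field.
public class ListItem implements Entry {
    public String value;     // the list item's content
    public Integer position; // the item's index within the list

    public ListItem() {      // entries need a public no-arg constructor
    }
}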
Distributed protocols define the way participants in an application share and
modify these data structures in a coordinated way. For example, if our ordered
list represents a queue of printing tasks for multiple printers, then our
protocol must specify the way printers coordinate with each other to avoid
duplicating efforts. Our protocol must also handle errors: otherwise a jammed
printer, for example, could cause many users to wait unnecessarily for jobs to
complete, even though other printers may be available. While this is a simple
example, it is representative of many of the issues that crop up in more
advanced distributed protocols.
Distributed protocols written using spaces have the advantage of being
loosely coupled: because processes interact indirectly through a space
(and not directly with other processes), data senders and receivers aren't
required to know each other's identities or even to be active at the same time.
Conventional network tools require that all messages be sent to a particular
process (who), on a particular machine (where), at a particular time (when).
Instead, using a JavaSpaces system, we can write an object into a space with the
expectation that someone, somewhere, at some time, will take the object and make
use of it according to the distributed protocol. Uncoupling senders and
receivers leads to protocols that are simple, flexible, and reliable. For
instance, in our printing example, we can drop printing requests into the space
without specifying a particular printer or worrying about which printers are up
and running, since any free printer can pick up a task.
The JavaSpaces technology's shared,
persistent object store encourages the use of distributed data structures,
and its loosely coupled nature simplifies the development of distributed
protocols. These topics form the major theme of this book--before diving
in and building our first space-based application, let's get a better idea
of the key features of the technology and how spaces can be used for a
variety of distributed and collaborative applications.
1.3.1 Key Features
The JavaSpaces programming interface is simple, to the point of being minimal:
applications interact with a space through a handful of operations. On the one
hand, this is good--it minimizes the number of operations you need to learn
before writing real applications. On the other hand, it raises the question: how
can we do such powerful things with only a few operations? The answer lies in
the space itself, which provides a unique set of key features:
Spaces are shared: Spaces are network-accessible "shared memories"
that many remote processes can interact with concurrently. A space itself
handles the details of concurrent access, leaving you to focus on the design of
your clients and the protocols between them. The "shared memory" also allows
multiple processes to simultaneously build and access distributed data
structures, using objects as building blocks. Distributed data structures will
be a major theme of Chapter 3.
Spaces are persistent: Spaces
provide reliable storage for objects. Once stored in the space, an object
will remain there until a process explicitly removes it. Processes can
also specify a "lease" time for an object, after which it will be automatically
destroyed and removed from the space (we will cover leases in detail in Chapter
7).
Because objects are persistent,
they may outlive the processes that created them, remaining in the space
even after the processes have terminated. This property is significant
and necessary for supporting uncoupled protocols between processes. Persistence
allows processes to communicate even if they run at non-overlapping times.
For example, we can build a distributed "chat" application that stores
messages as persistent objects in the space and allows processes to carry
on a conversation even if they are never around at the same time (similar
to email or voice mail). Object persistence can also be used to store preference
information for an application between invocations--even if the application
is run from a different location on the network each time.
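As a brief, hedged sketch of a lease request (Chapter 7 covers leases properly; the ten-minute figure, and the msg and space variables, are assumptions for illustration):

// Request a ten-minute lease on the entry; the space may grant a
// shorter one. Lease.FOREVER requests indefinite storage instead.
Lease lease = space.write(msg, null, 10 * 60 * 1000);
System.out.println("Lease expires at: " + lease.getExpiration());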
Spaces are associative: Objects in a space are located via
associative lookup, rather than by memory location or by identifier.
Associative lookup provides a simple means of finding the objects you're
interested in according to their content, without having to know what the object
is called, who has it, who created it, or where it is stored. To look up an
object, we create a template (an object with some or all of its fields set to specific values, and the others left as null to act as wildcards). An object in the space matches a template if it matches the template's specified fields exactly. We'll see that with associative lookup, we can easily express queries for objects such as: "Are there any tasks to compute?" or "Are there any answers to the prime factorization I asked for?" We will cover the details of matching in the next chapter.
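To make matching concrete, here is a small hedged sketch using the two-field Message entry developed later in this chapter (the space variable is assumed to be a JavaSpace obtained as in Section 1.4):

// Match any Message whose content is exactly "Hello World",
// regardless of its counter value (null acts as a wildcard).
Message template = new Message();
template.content = "Hello World";
template.counter = null; // wildcard

Message result = (Message)space.read(template, null, Long.MAX_VALUE);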
Spaces are transactionally secure: The JavaSpaces technology
provides a transaction model that ensures that an operation on a space is atomic
(either the operation is applied, or it isn't). Transactions are supported for
single operations on a single space, as well as multiple operations over one or
more spaces (either all the operations are applied, or none
are). As we will see in Chapter 9, transactions are an important way to deal
with partial failure.
Spaces allow us to exchange executable content: While in the space,
objects are just passive data--we can't modify them or invoke their methods.
However, when we read or take an object from a space, a local copy of the object
is created. Like any other local object, we can modify its public fields as well
as invoke its methods, even if we've never seen an object like it before. This
capability gives us a powerful mechanism for extending the behavior of our
applications through a space.
1.3.2 JavaSpaces Technology in Context
To give you a sense of how distributed applications can be modeled as objects
flowing into and out of spaces, let's look at a few simple use scenarios.
Consider a space that has been set up to act as an "auction room" through which
buyers and sellers interact. Sellers deposit for-sale items with descriptions
and asking prices (in the form of objects) into the space. Buyers monitor the
space for items that interest them, and whenever they find some, they write bid
objects into the space. In turn, sellers monitor the space for bids on their
offerings and keep track of the highest bidders; when an item's sale period
expires, the seller marks the object as "sold" and writes it back into the space
(or perhaps into the winning buyer's space) to close the sale.
Now consider a computer animation production house. To produce an animation
sequence, computer artists create a model that must then be rendered for every
frame of a scene (a compute-intensive job). The rendering is often performed by
a network of expensive graphics workstations. Using the JavaSpaces technology, a
series of tasks--for instance, one task per frame that needs to be rendered--are
written into the space. Each participating graphics workstation searches the
space for a rendering task, removes it, executes it, drops the result back into
the space and continues looking for more tasks. This approach scales
transparently: it works the same way whether there are ten graphics workstations
available or a thousand. Furthermore, the approach "load balances" dynamically:
each worker picks up exactly as much work as it can handle, and if new tasks
get added to the space (say another animator deposits tasks), workers will
begin to compute tasks from both animation sequences.
Last, consider a simple multiuser chat system. A space can serve as a "chat
area" that holds all the messages making up a discussion. To "talk," a
participant deposits message objects
into the space. All chat members wait for new message objects to appear,
read them, and display their contents. The list of attendees can also be
kept in the space and gets updated whenever someone joins or leaves the
conversation. Late arrivals can examine the existing message objects in
the space to review previous discussion. In fact, since the space is persistent,
a new participant can view the discussion long after everyone else has
gone away, and participants can even come back much later to pick up the
conversation where they left off.
These examples illustrate some of the possible uses of spaces, from workflow
systems, to parallel compute servers, to collaborative systems. While they leave
lots of details to the imagination (such as how we achieve ordering on chat
messages), we'll fill them in later in the book.
1.4 JavaSpaces Technology Overview
Now we are going to dive into our first example by building the obligatory
"Hello World" application. Our aim here is to introduce you to the JavaSpaces
programming interface, but we will save the nitty-gritty details for the next
chapter. We are going to step through the construction of the application piece
by piece, and then, once it is all together, make it a little more interesting.
1.4.1 Entries and Operations
A space stores entries. An entry is a collection of typed objects that
implements the Entry interface. Here is an example "message" entry, which contains one field--the content of the message:
import net.jini.core.entry.Entry;

public class Message implements Entry {
    public String content;

    public Message() {
    }
}
We can instantiate a Message entry and set its content to "Hello World" like this:
Message msg = new Message();
msg.content = "Hello World";
With an entry in hand, we can interact with a space using a few basic operations: write, read, and take (and a couple of others that we will get to in the next chapter). The write method places one copy of an entry into a space. If we call write multiple times with the same entry, then multiple copies of the entry are placed into the space. Let's obtain a space object and then invoke its write method to place one copy of the entry into the space:
JavaSpace space = SpaceAccessor.getSpace();
space.write(msg, null, Lease.FOREVER);
Here we call the getSpace method of the SpaceAccessor class, which returns an instance of an object that implements the JavaSpace interface (which we will refer to as a "space object" or "space" throughout this book). We then call write on the space object, which places one copy of the entry into the space. We will define a SpaceAccessor class and cover the details of the write method in the next chapter; for now, it's enough to know that getSpace returns a JavaSpace object and that write places the entry into the space.
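Until then, here is one minimal, hedged sketch of how such a SpaceAccessor might find a space through a unicast Jini lookup (the jini://localhost URL and the error handling are assumptions; your lookup service host will differ, and the book's own version in Chapter 2 may be organized differently):

import java.rmi.RMISecurityManager;
import net.jini.core.discovery.LookupLocator;
import net.jini.core.lookup.ServiceRegistrar;
import net.jini.core.lookup.ServiceTemplate;
import net.jini.space.JavaSpace;

public class SpaceAccessor {
    public static JavaSpace getSpace() {
        try {
            // Downloaded service proxies require a security manager.
            if (System.getSecurityManager() == null) {
                System.setSecurityManager(new RMISecurityManager());
            }
            // Ask the lookup service on a known host for any JavaSpace.
            LookupLocator locator = new LookupLocator("jini://localhost");
            ServiceRegistrar registrar = locator.getRegistrar();
            ServiceTemplate tmpl = new ServiceTemplate(
                null, new Class[] { JavaSpace.class }, null);
            return (JavaSpace)registrar.lookup(tmpl);
        } catch (Exception e) {
            e.printStackTrace();
            return null;
        }
    }
}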
Now that our entry exists in the space, any process with access to the space can read it. To read an entry we use a template, which is an entry that may have one or more of its fields set to null. An entry matches a template if the entry has the same type as the template (or is a subtype), and if, for every specified (non-null) field in the template, the entry's corresponding field matches exactly. The null fields act as wildcards and match any value. We will get back to the details of matching in the next chapter, but for now let's create a template:
Message template = new Message();
That was easy. It is important to point out that the content field of the template is by default set to null (as Java does with all uninitialized object fields upon creation). Now let's use our template to perform a read on the space:
Message result = (Message)space.read(template, null, Long.MAX_VALUE);
Because the template's content field is a wildcard (null), the template will match any Message entry in the space (regardless of its contents). We're assuming here that this space is our own private space and that other processes are not writing or reading Message entries. So, when we execute our read operation on the space, it matches and returns a copy of the entry we wrote there previously and assigns it to result. Now that we have our entry back, let's print its content:
System.out.println(result.content);
Sure enough, we get:
Hello World
For the sake of completeness, the take operation is just like read, except that it withdraws the matching entry from the space. In our code example, suppose we issue a take instead of a read:
Message result = (Message)space.take(template, null, Long.MAX_VALUE);
We would see the same output
as before. However, in this case, the entry would have been removed from
the space.
So, in just a few steps we've
written a basic space-based "Hello World" program. Let's pull all these
code fragments together into a complete application:
import net.jini.core.lease.Lease;
import net.jini.space.JavaSpace;

public class HelloWorld {
    public static void main(String[] args) {
        try {
            Message msg = new Message();
            msg.content = "Hello World";

            JavaSpace space = SpaceAccessor.getSpace();
            space.write(msg, null, Lease.FOREVER);

            Message template = new Message();
            Message result = (Message)space.read(
                template, null, Long.MAX_VALUE);
            System.out.println(result.content);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
In this code we've kept things simple by wrapping the code in a try/catch statement that catches all exceptions. We also left out the implementation of the SpaceAccessor. We will return to both of these topics in the next chapter. However, note that you can find the complete source for each example at the book's web site http://java.sun.com/docs/books/jini/javaspaces.
Before moving on, let's step back a bit--with a small bit of simple code, we've managed to send a message using spaces. Our HelloWorld places a message into a space, in effect broadcasting a "Hello World" message to anyone who will listen (everyone with access to the space who is looking for Message entries). Right now, HelloWorld is by design the only listener; it reads the entry and prints out its contents. In the next section we will change that.
1.4.2 Going Further
Let's take our example and make it a little more interesting. In doing so,
you'll get a glimpse of the key features that make the JavaSpaces technology an
ideal tool for building distributed applications.
We'll begin by modifying the Message class to hold not only a message but also a count of how many times it has been read. So far, our HelloWorld application is the only process reading the message entry, so we'll also create a HelloWorldClient to read the entry. We will also enhance our HelloWorld application so that it can monitor the entry's popularity by keeping track of how many times it has been read. Let's start with the new Message entry:
import net.jini.core.entry.Entry;

public class Message implements Entry {
    public String content;
    public Integer counter;

    public Message() {
    }

    public Message(String content, int initVal) {
        this.content = content;
        counter = new Integer(initVal);
    }

    public String toString() {
        return content + " read " + counter + " times.";
    }

    public void increment() {
        counter = new Integer(counter.intValue() + 1);
    }
}
We've added an Integer field called counter, and a new constructor that sets the content and counter fields to values passed in as parameters. We've also added a toString method, which prints the values of the fields, and a method called increment, which increments the counter by one.
Note that in all our examples, we've been violating a common practice of
object-oriented programming by declaring our entry fields to be public. In fact,
fields of an entry must be public in order to be useful; if they are instead
declared private or protected, then processes that take or read the entry from a
space won't be able to access their values. We'll return to this subject in the
next chapter and explain it more thoroughly.
Now let's modify the HelloWorld class to keep track of the number of times the Message entry has been read by other processes:
import net.jini.core.lease.Lease;
import net.jini.space.JavaSpace;

public class HelloWorld {
    public static void main(String[] args) {
        try {
            Message msg = new Message("Hello World", 0);

            JavaSpace space = SpaceAccessor.getSpace();
            space.write(msg, null, Lease.FOREVER);

            Message template = new Message();
            for (;;) {
                Message result = (Message)space.read(
                    template, null, Long.MAX_VALUE);
                System.out.println(result);
                Thread.sleep(1000);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Following along in our main method, we first make use of our Message entry's new constructor, which takes a String parameter and an initial counter value and assigns them to the content field and the counter field respectively. Next we obtain a space object and write the Message entry into the space. As in the previous version of HelloWorld, we then create a template (with null fields) to match the entry.
Now things become more interesting: we enter a for loop that continually reads the message entry from the space using template. Each time we read a Message entry, we print out the value of its counter by calling println, which implicitly calls the message's toString method. The loop then takes a short breather by sleeping for one second before continuing. If we now run this version, it will continually print a counter value of zero because we are waiting for other processes to read the entry and, so far, there are none.
So, let's write a HelloWorldClient that will take Message entries, increment their counters, and place them back in the space:
import net.jini.core.lease.Lease;
import net.jini.space.JavaSpace;

public class HelloWorldClient {
    public static void main(String[] args) {
        try {
            JavaSpace space = SpaceAccessor.getSpace();

            Message template = new Message();
            for (;;) {
                Message result = (Message)space.take(
                    template, null, Long.MAX_VALUE);
                result.increment();
                space.write(result, null, Lease.FOREVER);
                Thread.sleep(1000);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Just as in the HelloWorld application, HelloWorldClient first creates a template using the default constructor (both fields are set to null and act as wildcards). Rather than reading (as in HelloWorld) a message from the space, we take it out of the space and assign it to result. We then call result's increment method, which increments the counter by one, and write the result back into the space. Like HelloWorld, we then sleep for one second and repeat the entire process.
So let's now run HelloWorld and then start up a few HelloWorldClients. The output for a typical run might look something like:
Hello World read 0 times.
Hello World read 1 times.
Hello World read 5 times.
Hello World read 10 times.
...
Let's trace through the whole scenario to understand exactly what has happened. First, we started up HelloWorld, which deposits its Message entry into the space and enters a loop that reads the entry and prints the value of the counter. The first time through the loop, the counter field's value is still zero (no other processes have yet updated the counter). We also start up several HelloWorldClient applications, which each begin searching the space using a template of type Message (with both its fields set to null to act as wildcards). Since the system is asynchronous, the HelloWorldClients access the Message entry in an unpredictable order. If a client tries to take the entry but it has already been removed, then the client simply blocks and waits until the entry shows up in the space. Once a client manages to take an entry, it calls the entry's increment method to update the counter, and then it returns the modified entry to the space.
Our output indicates that, by the second time HelloWorld reads the counter, one client process has accessed the entry and incremented its counter. By the third time, four more clients have managed to access the entry. Finally, by the fourth time, five more clients have accessed it. In general, the more clients we add, the faster counter gets incremented (although only so many processes can take and write the entry in the one-second interval).
1.5 Putting It All Together
Even though our "Hello World" example is simple, it demonstrates the key
features of space-based programming and ties together many of the topics we've
covered in this chapter. JavaSpaces technology is simple and expressive: with
very little code (and only four lines that contain JavaSpace operations) we've
implemented a simple distributed application that provides concurrent access to
a shared resource (in this case a shared object). Because spaces provide a
high-level coordination mechanism, we didn't need to worry about multithreaded
server implementation, low-level synchronization issues, or network
communication protocols--usual requirements of distributed application design.
Instead, our example concretely illustrates what we said earlier in this
chapter--that we build space-based applications by designing distributed data
structures along with distributed protocols that operate over them.
HelloWorld uses a very simple distributed data structure: a shared object that acts as a counter. It also uses a simple protocol: clients take the Message entry to gain exclusive access to it, increment the counter, and then write the entry back to the space to share it once again. This protocol deserves a closer look.
Note that the protocol is loosely coupled--HelloWorld writes an entry into the space without worrying about the specifics of which clients will access it, how many, from where, or when. Likewise, the HelloWorldClients don't care who generated the entry, where, or when; they simply use associative lookup to find it. They don't even care if the entry exists in the space or not, but are content to wait until it shows up. Because the entry is persistent, clients can even show up at much later times (possibly even after HelloWorld has terminated) to retrieve it.
In our example, processes use entries to exchange not only data (a counter)
but also behavior. Processes that create entries also supply proper methods of
dealing with them, removing that burden from the processes that look up the
entries. When a HelloWorldClient retrieves a Message entry, it simply calls the object's increment method (without needing to know how it works).
Our distributed protocol also relies on synchronization. Without coordinated
access to a shared resource--in this case, the counter--there would be no way to
ensure that only one process at a time has access to it, and processes could
inadvertently corrupt it by overwriting each other's changes. Here, to alter an
entry, a process must remove it, modify it, and then return it to the space.
While the process holds the entry locally, no other processes can access or
update it. Transactional security of spaces also plays a key part in
guaranteeing this exclusive access: If a process succeeds at a take
operation, the entry is removed and returned atomically, and the process is
guaranteed to have the only copy of the entry.
This isn't to say our simple example covers everything. Although we can trust
a single space operation to be transactionally secure (either it completes or it
doesn't), there is nothing in our current example to prevent the Message entry from being irretrievably lost if a client crashes or gets
cut off from the network after taking the message entry from the space (as often
happens in the presence of partial failure). In cases like this, to ensure the
integrity of our applications, we will need to group multiple space operations
into a transaction to ensure that either all operations complete (in our
example, the entry gets removed and returned to the space) or none occur (the
entry still exists in the space). We'll revisit the topic of transactions in
greater detail in Chapter 9.
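As a hedged preview of that chapter (the SafeIncrement class is our own sketch, obtaining a TransactionManager requires a Jini lookup that we gloss over here by passing it in, and the one-minute transaction lease is arbitrary), the take-modify-write cycle can be guarded like this:

import net.jini.core.lease.Lease;
import net.jini.core.transaction.Transaction;
import net.jini.core.transaction.TransactionFactory;
import net.jini.core.transaction.server.TransactionManager;
import net.jini.space.JavaSpace;

public class SafeIncrement {
    // Increment the counter under a transaction: if this process dies
    // between the take and the write, the transaction aborts and the
    // entry reappears in the space rather than being lost.
    public static void increment(JavaSpace space, TransactionManager mgr)
            throws Exception {
        Transaction.Created trc =
            TransactionFactory.create(mgr, 60 * 1000); // one-minute lease
        Transaction txn = trc.transaction;
        try {
            Message result = (Message)space.take(
                new Message(), txn, Long.MAX_VALUE);
            result.increment();
            space.write(result, txn, Lease.FOREVER);
            txn.commit(); // both operations take effect atomically
        } catch (Exception e) {
            txn.abort();  // undoes the take; the entry stays in the space
            throw e;
        }
    }
}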
1.6 Advantages of JavaSpaces Technology
We hope that in this introduction you've gained a sense for why you might want
to build your next distributed application using spaces. If your application can
be modeled as a flow of objects into and out of spaces (as many can), then the
JavaSpaces technology offers a number of compelling advantages over other
network-based software tools and libraries:
It is simple.
The technology doesn't require learning a complex programming interface;
it consists of a handful of simple operations.
It is expressive.
Using a small set of operations, we can build a large class of distributed
applications without writing a lot of code.
It supports loosely coupled protocols.
By uncoupling senders and receivers, spaces support protocols that are simple,
flexible, and reliable. Uncoupling facilitates the composition of large
applications (we can easily add components without redesigning the entire
application), supports global analysis (we can examine local computation and
remote coordination separately), and enhances software reuse (we can replace any
component with another, as long as they abide by the same protocol).
It eases the burden of writing client/server systems.
When writing a server, features
such as concurrent access by multiple clients, persistent storage, and
transactions are reinvented time and time again. JavaSpaces technology
provides these functionalities for free; in most cases, we only need to
write client code, and the rest is handled by the space itself.
The beauty of JavaSpaces technology is that it can be grasped easily and used
in powerful ways. In comparison to other distributed programming tools,
space-based programming will, in many cases, ease design, reduce coding and
debugging time, and result in applications that are more robust, easier to
maintain, and easier to integrate
with other applications.
1.7 Chapter Preview
This book is about building distributed and collaborative applications with the
JavaSpaces technology. As with any programming methodology, a number of general
principles and patterns have emerged from the use of spaces, and we will spend
the bulk of this book covering them. Our aim is to help you explore new ways of
thinking about, designing, and building distributed applications with spaces
(and in short order so that you can quickly begin to create your own distributed
applications). The following is a roadmap to what you'll find as you make
your way through this book:
Chapters
Chapter 2--JavaSpaces Application Basics--lays the foundation you
will need to understand and experiment with the examples in the rest of the
book. In a tutorial style, we cover the mechanics of creating a space-based
application and introduce the syntax and semantics of the JavaSpaces API
and class library.
Chapter 3--Building Blocks--presents basic "distributed data
structures" that recur in space-based applications, and describes common
paradigms for using them. Code segments are given to illustrate the examples,
which include shared variables, bags, and indexed structures. This chapter lays
the foundation for the next two: Synchronization and Communication.
Chapter 4--Synchronization--builds upon Chapter 3 and describes
techniques for synchronizing the actions of multiple processes. We start with
the simple idea of a space-based semaphore and incrementally present more
complex examples of synchronization, from sharing resources fairly, to
controlling a group in lockstep, to managing multiple readers and writers.
Chapter 5--Communication--also builds upon Chapter 3 and describes
common
communication patterns that can be created using distributed data structures.
We first introduce space-based message passing and then explore the principles
behind space-based communication (which provides a number of advantages
over conventional communication libraries). We then present a "channel"
as a basic distributed data structure that can be used for many common
communication patterns.
Chapter 6--Application Patterns--introduces several common
application
patterns that are used in space-based programming, including the
replicated-worker pattern, the command pattern, and the marketplace pattern. In
each case, we develop a simple example application that makes use of the
pattern. We also provide a general discussion of more ad hoc patterns.
Chapter 7--Leases--begins the book's coverage of more advanced
topics.
Spaces use leases as a means of allocating resources for a fixed period
of time. This chapter explores how to manipulate and manage the leases
created from writing entries into a space. The techniques covered for managing
leases are also applicable to distributed events and transactions, which
are covered in the next two chapters.
Chapter 8--Distributed Events--introduces the Jini distributed event
model and shows how applications can make use of remote events in conjunction
with spaces.
Chapter 9--Transactions--introduces the idea of a transaction as a
tool for counteracting the effects of partial failure in distributed
applications. This chapter covers the mechanics as well as the semantics of
using transactions.
Chapter 10--A Collaborative Application--explores the creation of a
distributed interactive messenger service using spaces. This collaborative
application makes use of the full JavaSpaces API, and also some of the advanced
topics encountered in previous chapters, namely leases, events, and
transactions.
Chapter 11--A Parallel Application--explores parallel computing with spaces. We first build a simple compute server and then a parallel application that runs on top of it. Both are used to explore issues that arise when developing space-based parallel applications. Like the collaborative application, in this chapter we make full use of the JavaSpaces API and its advanced features.
Chapter 12--Further Exploration--provides a set of references
(historical and current) that you can use as a basis for further exploration.
Appendices A, B, and C--contain the official Jini Entry
Specification, Jini Entry Utilities Specification,
and JavaSpaces Specification written by the Jini product team at Sun
Microsystems, Inc.
Online Supplement
The online supplement to this book can be accessed at the World Wide Web site
http://java.sun.com/docs/books/jini/javaspaces.
The supplement includes the following:
- Full source code to examples
- Links to sites with related information, including
links to specifications, whitepapers, and demonstration programs
- Corrections, supplements, and commentary generated
after this book went to press
1.8 Exercises
- List all the applications you can think of that
can't be represented as the flow of objects into and out of a space. Review
your list. Are you sure?
- Rerun the second version of HelloWorld; this time run several HelloWorldClients along with two HelloWorld applications. What happens? What is the cause of this output?
- Change HelloWorld and HelloWorldClient so that multiple instances of the application can be run independently by different users at the same time, within the same space. Was it difficult?
- Write a program using the standard Java API networking classes (from the java.net package) that has the same functionality as our first version of HelloWorld. Compare the two.
[1] As used on this web site, the terms "Java virtual machine" or "JVM" mean a virtual machine for the Java platform.
About the Authors
Eric Freeman is co-founder and CTO of Mirror Worlds
Technologies, a Java and Jini-based software company. Dr. Freeman previously
worked at Yale University on space-based systems, and is a Fellow at Yale's
Center for Internet Studies. Susanne Hupfer is Director of
Product Development for Mirror Worlds Technologies and a fellow of the Yale
University Center for Internet Studies. Dr. Hupfer previously taught Java
network programming as an Assistant Professor of Computer Science at Trinity
College. Ken Arnold is the lead engineer of the JavaSpaces
product at Sun. He is one of the original architects of the Jini platform and
is co-author of The Java™ Programming Language, Second Edition.