PAL: Phylogenetic Analysis Library
A Java library for molecular evolution and phylogenetics
Version 1.4 (January 20, 2002)
The PAL project (http://www.pal-project.org) is
a collaborative effort dedicated to providing a
high-quality Java library
for use in molecular evolution and phylogenetics.
This package may be distributed under the terms of the
GNU Lesser General Public License
Copyright (c) 1999-2002 PAL Development Core Team.
Core Team (with CVS rights):
Alexei Drummond
(2000-)
School of Biological Sciences, University of Auckland,
3A Symonds Street, Auckland, New Zealand.
Ed Buckler
(2001-)
Department of Genetics,
North Carolina State University,
2523 Gardner Hall, Box 7614,
Raleigh, NC 27695-7614, USA.
Korbinian Strimmer
(1999-)
Department of Statistics, University of Munich, Ludwigstrasse 33, 80539 Munich, Germany.
Other Contributions:
Matthew G. Goode, University of
Auckland (GUI, codon and alignment classes, utilities)
Sean Luke, University of Maryland (Random number generator)
Oliver G. Pybus, University of Oxford
(Demographic models)
Andrew Rambaut, University of Oxford
(Yang codon model, demographic models)
Jesse Stone, ?
(Optimization code)
Overview:
The PAL project is a collaborative effort to provide a
high-quality Java library
for use in molecular evolution and phylogenetics.
Updates of PAL are released at regular intervals.
At present (version 1.4) PAL consists of approximately 200 public
classes/interfaces in 16 packages
with a total of more than
35,000 lines of Java code. Please refer to the
API documentation for a detailed description
of all classes and methods available, and to the
release history for an overview
of the development history of PAL.
Please feel free to use PAL or parts of it in your own programs;
the terms and conditions for copying, distribution and
modification of the GNU Lesser General Public License
apply. (Note that previous versions of PAL were licensed under the
GNU GPL rather than the GNU LGPL.)
Contributing code to PAL is greatly encouraged; please read the following
guidelines.
If you wish to cite PAL please use
-
Drummond, A., and K. Strimmer.
2001. PAL: An object-oriented programming library for molecular
evolution and phylogenetics. Bioinformatics 17: 662-663.
This paper provides an overview of PAL version 1.1.
User Interfaces:
PAL is designed as an object-oriented programming library, and as such
it has no direct user interface. To access the methods and objects available
in PAL you need to program in Java.
However, there are user-friendly programs available that
rely (in part) on PAL to do their job. Be aware, however, that they will usually
restrict functionality to some core features of PAL:
-
Vanilla
aims at providing a simple bare bones text interface to some selected features in
PAL. It basically consists of a set of command line applications, most of them
in PHYLIP CUI style. Best run in a UNIX (incl. MacOS X) or MSDOS shell.
This package is also useful to see how you could use
PAL in your own program.
-
Pebble
(codenamed vCEBL) is much more broadly targeted. It not only offers a GUI
for PAL but also comes with its own functional command language. Pebble has
an easy-to-use installation program.
Three other ways of accessing the functionality in PAL without requiring
Java programming experience are planned:
-
It should be rather straightforward to use features available in PAL from
within the
Mesquite
program, a modular system for evolutionary analysis written in Java by Wayne and Dave
Maddison.
-
In collaboration with Andrew Rambaut (University of Oxford) it is planned to
develop a general XML interface definition for evolutionary analysis, which will be
implemented for PAL and for Andrew Rambaut's
Nautilus C++
libraries.
-
Java provides the opportunity to write web-based applications (applets)
that can be executed in a web browser. In collaboration with
Andrew Rambaut it is planned to write simple
applets, e.g. for skyline plot analysis, tree viewing etc.
Main Features:
PAL is entirely written in the Java language.
This allows for a clean object-oriented design while avoiding the complexities of C++.
Moreover, Java class code runs without needing recompilation on a wide range of
platforms. Additionally, PAL compiles into native code on Unix systems
(just like C++) using the
GNU compiler for Java (gcj),
part of recent releases of the
GNU compiler collection (gcc).
Corresponding makefiles are included with this distribution of PAL.
PAL consists of a rich variety of objects to facilitate
the construction of special-purpose tools for phylogenetic analysis.
PAL contains, e.g., ready-to-use objects for:
-
reading and writing sequence alignments, distance matrices, and trees
-
a large variety of substitution models for nucleotides and amino acids
(REV, TN, HKY, F84, F81, JC; Dayhoff, JTT, MTREV24, BLOSUM, VT, WAG, CPREV)
as well as for codons (Yang codon model)
-
various models for rate variation over sites (invariable sites, Gamma)
-
efficient maximum-likelihood estimation of pairwise distances and
of branch lengths in a tree
(for unconstrained, clock, and dated-tips clock trees)
-
simulating coalescence intervals and estimation of demographic
parameters
-
likelihood ratio and chi-square tests, and tests for the
comparison of phylogenetic hypotheses
(e.g., Kishino-Hasegawa and Shimodaira-Hasegawa tests)
-
manipulating alignments (e.g., bootstrapping)
and trees and simulating data
- optimizing uni- and multivariate functions by various methods,
computing numerical derivatives, random numbers (simulation quality),
sorting etc.
-
creating formatted input and output from/to files,
standard io streams, and strings, through convenience classes that extend the
standard Java IO library
-
constructing neighbor-joining, UPGMA and SUPGMA trees, and estimating least-squares
branch lengths on trees (weighted and unweighted LS)
- translating nucleotide sequences to amino acid sequences
- accessing mathematical special functions (gamma, error, binomial)
and pdf, cdf, and quantile functions of statistical distributions
(gamma, exponential, chi-square, normal, Pareto)
-
creating split systems from trees and computing partition distances
between trees (Robinson-Foulds distance)
-
an XML interface for PAL objects (this uses the org.w3c.dom library, which
is included with PAL; please see the
copyright info of this library)
- ...
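To make two of the items above more concrete (pairwise distances under the JC model, and bootstrapping of alignments), here is a minimal sketch in plain Java. It does not use any PAL classes: the class name JukesCantorSketch and all method names are invented for this illustration and are not part of the PAL API. It relies on the fact that, under the Jukes-Cantor model, the distance has the closed form d = -(3/4) ln(1 - 4p/3), where p is the proportion of differing sites. The two short sequences are made-up toy data.

```java
import java.util.Random;

/** Illustrative sketch only; plain Java, no PAL classes are used. */
class JukesCantorSketch {

    /** Proportion of sites at which two aligned sequences differ. */
    public static double pDistance(String a, String b) {
        if (a.length() != b.length() || a.isEmpty()) {
            throw new IllegalArgumentException("sequences must be aligned and non-empty");
        }
        int diff = 0;
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) != b.charAt(i)) {
                diff++;
            }
        }
        return (double) diff / a.length();
    }

    /** Jukes-Cantor distance d = -(3/4) ln(1 - 4p/3); infinite when p >= 3/4. */
    public static double jcDistance(double p) {
        if (p >= 0.75) {
            return Double.POSITIVE_INFINITY;
        }
        return -0.75 * Math.log(1.0 - 4.0 * p / 3.0);
    }

    /** Resamples alignment columns with replacement; both rows use the same columns. */
    public static String[] bootstrap(String a, String b, Random rng) {
        int n = a.length();
        StringBuilder ra = new StringBuilder(n);
        StringBuilder rb = new StringBuilder(n);
        for (int i = 0; i < n; i++) {
            int col = rng.nextInt(n);
            ra.append(a.charAt(col));
            rb.append(b.charAt(col));
        }
        return new String[] { ra.toString(), rb.toString() };
    }

    public static void main(String[] args) {
        // Toy data (assumption): two aligned sequences of length 20.
        String s1 = "ACGTACGTACGTACGTACGT";
        String s2 = "ACGTACGAACGTACTTACGT";

        double p = pDistance(s1, s2);
        System.out.println("p-distance  = " + p);
        System.out.println("JC distance = " + jcDistance(p));

        // Five bootstrap replicates with a fixed seed for reproducibility.
        Random rng = new Random(42L);
        for (int rep = 1; rep <= 5; rep++) {
            String[] r = bootstrap(s1, s2, rng);
            System.out.println("replicate " + rep + ": JC distance = "
                    + jcDistance(pDistance(r[0], r[1])));
        }
    }
}
```

The spread of the replicate distances gives a rough impression of the sampling uncertainty of the estimate. For other substitution models no closed form is available in general, which is where numerical maximum-likelihood estimation comes in.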
All classes are members of one of the 16 PAL packages
(alignment, coalescent, datatype, distance, eval, gui, io, math, mep, misc,
popgen, statistics,
substmodel, tree, util, xml). A detailed list of these packages
along with a description of the public and protected
interfaces and functions and the purpose of each single class in the library
is available in the
API Documentation.
Download and Installation:
By design, PAL can be installed and run on any platform where Java 1.1 (or later)
is available.
PAL is distributed in two variants, one for Unix/MSDOS (includes makefiles
for jikes/gcj) and one for Macintosh (includes a project file for CW 6).
PAL is also available from Don Gilbert's IUBio archive (Indiana University).
After uncompressing using appropriate tools
the following simple directory structure will be created:
       pal-1.4
    ______|______
    |           |
   doc         src
The "doc" folder contains this page, the API manual
and some other documentation, and the "src" folder contains the complete source code.
Note that there are no precompiled class files.
To actually compile the sources you need to set the class path properly to
the folder "pal-1.4/src". Please consult the manual for your Java
development kit (or your system administrator) for details.
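Once the class path is set, a quick way to confirm that the compiled classes are actually visible to the Java runtime is to try to locate one of them by name. The following is a minimal sketch, not part of PAL: the class ClasspathCheck is invented for this illustration, and the default class name "pal.tree.Tree" is an assumption; substitute the fully qualified name of any class listed in the API documentation.

```java
/** Illustrative helper (not part of PAL): checks whether a class is visible on the class path. */
class ClasspathCheck {

    /** Returns true if the named class can be located by this class's class loader. */
    public static boolean isOnClasspath(String className) {
        try {
            // initialize = false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "pal.tree.Tree" is an assumed example name; pass another name as the first argument.
        String name = args.length > 0 ? args[0] : "pal.tree.Tree";
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
        System.out.println(name + (isOnClasspath(name) ? " found" : " NOT found"));
    }
}
```

If the class is reported as not found, the printed value of java.class.path shows which folders the runtime is actually searching.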
Compilation of PAL into native code on Unix has been successfully tested
using the following software (or any later version):
Makefiles for compilation are provided with the PAL sources.
Note that gcc version 3.0 (June 2001) includes native Java compilation
by default (no need to download libgcj separately).
For compilation on Macintosh a Metrowerks Codewarrior project
file (suitable for CW 6) is included.
Older versions of PAL are also available; please refer to the
release history.
Related (Java) Projects:
To our knowledge, PAL is the only project that aims at
providing a collaborative Java library for molecular evolution and phylogenetics.
Similar "private" projects are, however,
currently under way, e.g.,
by Andrew Rambaut and Mike Charleston
(their Nautilus project)
or by the Felsenstein lab (both in C++).
Other Java projects in molecular evolution and bioinformatics include:
-
Mesquite,
a modular system for evolutionary analysis, is developed by Wayne Maddison
and David Maddison. Mesquite is designed to allow third-party components
and plug-in modules and provides a nice graphical user interface.
-
The BioJava project is a collaborative
project that aims at creating a Java library for general bioinformatics
applications (BLAST, CORBA etc.).
-
Christian Zmasek's program FORESTER
(BSD license) is a Java project aimed at phylogenomics.
ATV is part of
FORESTER and is a useful tree viewer that can also display trees written
in the NHX format (an extended NH format).
-
John Brzustowski distributes
qclust,
a set of Java classes implementing a variety of clustering methods, including
UPGMA and neighbor-joining.
-
Ed Buckler has developed a number of Java applications
for phylogenetic and linkage analysis
(e.g., Phylogeographer and TASSEL). Ed is now a co-developer of PAL.
-
Don Gilbert distributes a number of his Java programs in his IUBio Archive,
for example the sequence editor
SeqPup
and the tree drawing program
Phylodendron.
- David Posada and collaborators have written
GEODIS,
a Java program for cladistic nested analysis of the geographical distribution
of genetic haplotypes,
and
TCS,
a Java program to estimate gene genealogies
using statistical parsimony.
Finally, the general issue of scientific programming in Java is discussed, e.g., on the
Java Numerics
web page.
Acknowledgments:
We thank Oliver Pybus, Andrew Rambaut, Rick Ree, Allen Rodrigo,
and Wayne Maddison for discussion and valuable suggestions.
We also thank Allen Rodrigo for providing hardware through his NIH grant #59174
and Don Gilbert for distributing PAL in his software archive.
This work is also supported by
a Bright Future's Scholarship of FRST to A.D.
and an Emmy-Noether-Fellowship of the DFG to K.S.
Last modified: January, 2002