Speakers: James Duncan Davidson (duncan) and Rajiv Mordani (rajiv)
Moderator Edward Ort (MDR-edO)
This is a moderated forum.
MDR-edO: Welcome to Java Live! Our topic today is the Java API for XML Parsing
(JAXP). Our guests are James Davidson and Rajiv Mordani, both JAXP and XML
gurus. So let's begin. Who has the first question?
JAT: Is there any way to get a nicely indented display of a DocumentFragment
or, more generally, a Node?
rajiv: There are internal APIs available in the com.sun.xml.tree package. If
you look at the documentation available at XML you will see how to do that.
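For readers without the com.sun.xml.tree classes, an indented view of any Node can also be produced by hand. The following is a minimal sketch that uses only the standard DOM interfaces; it is not the internal API rajiv refers to, and the class name IndentedDump and its helpers are made up for illustration. It prints an outline of node names, attributes, and text rather than well-formed XML.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of JAXP or Project X.
class IndentedDump {
    /** Parses an XML string into a DOM Document (demo helper). */
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Returns an indented outline of the subtree rooted at node. */
    static String dump(Node node) {
        StringBuilder out = new StringBuilder();
        dump(node, 0, out);
        return out.toString();
    }

    private static void dump(Node node, int depth, StringBuilder out) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            String text = node.getNodeValue().trim();
            if (text.isEmpty()) {
                return; // skip whitespace-only text between elements
            }
            indent(depth, out);
            out.append(text).append('\n');
            return;
        }
        indent(depth, out);
        out.append(node.getNodeName());
        NamedNodeMap attrs = node.getAttributes(); // null for non-element nodes
        if (attrs != null) {
            for (int i = 0; i < attrs.getLength(); i++) {
                Node a = attrs.item(i);
                out.append(' ').append(a.getNodeName())
                   .append("=\"").append(a.getNodeValue()).append('"');
            }
        }
        out.append('\n');
        // Recurse into children, one indentation level deeper.
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            dump(c, depth + 1, out);
        }
    }

    private static void indent(int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<order id=\"7\"><item>pen</item></order>");
        System.out.print(dump(doc.getDocumentElement()));
    }
}
```

Because dump accepts any Node, the same call works for a DocumentFragment or a whole Document.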
mike: Does the JAXP API support DTD parsing?
duncan: Mike: DTD Parsing is something that's supported by the underlying
parser directly. There aren't hooks directly in the JAXP API for associating
DTDs with a doctype at this time.
Mike Nowak: What's the best way to deploy the JAXP software in
an applet?
duncan: Mike Now: Applet Parsing... Unfortunately, you've hit a bug in
the JAXP 1.0 RI. We've got a 1.0.1 on tap that will let you use JAXP in applets.
krishna: When will JAXP be with J2EE, if I want to use JAXP
with J2EE 1.2? What are the kludges I need to do?
duncan: Krishna: As far as using JAXP in J2EE, it's simple enough. Just
put the JARs on the classpath and use the classes. Easy enough. As far as
inclusion into the J2EE core, we're looking at doing that in a future release of
the J2EE. But, you don't have to wait. Just use the jar files.
john: We have a DTD with 99 elements, of which only 10
are REQUIRED. The other elements are optional. We are using a DOM parser, and
parsing XML documents, and validating them against this DTD. Whenever the XML
documents contain most of the optional elements, we have no problems. However,
when the XML documents only contain the required elements, and only a few of the
optional elements, we find that our DOM parser slows tremendously, and
eventually hangs the system. As more documents containing fewer of the optional
elements are parsed, we find the DOM parser takes longer (and even hangs). What
is causing this? Is there any better way to do parsing/validating against a DTD
where there are a lot of optional elements?
rajiv: If your documents can get big you may want to use SAX as an
alternative. However, DOM shouldn't be hanging in any case. If you can send the
details of the file to xml-feedback@java.sun.com we will
get back to you with any suggestions.
Erik D.: How would you define JAXP with respect to DOM, SAX (and
DOM2, SAX2) as well as xml.apache.org efforts (Xerces/Xalan)?
Mike Nowak: Aren't the com.sun.xml.tree classes private for the most part?
duncan: ErikD: So, JAXP sits on top of SAX and DOM providing a bit of
glue support. Not much more. DOM2/SAX2 are coming along with JAXP 1.1. As far as
Xerces, we're working with the Apache implementation to make sure that JAXP 1.0
support is in there.
rajiv: Mike, all the classes that need to be used in com.sun.xml.tree are
exposed. The others that are package private are designed to be so. You should
be fine in using the public APIs.
jon2: What's the best way to copy an Xml document?
Ed: I noticed that some JAXP classes access system properties
that are restricted from an applet. Thus, I cannot build an applet that uses
the JAXP without modifying the client policy file. Was this intentional? If
so, why? If not, can we make it so that we can use the JAXP from an applet?
duncan: So, that's that bug in JAXP 1.0 that we're going to fix in JAXP
1.0.1. It wasn't intentional that it was restricted out. Sorry about that. (For
now, yes, editing the client policy works. And we're working to get that bug fix
out ASAP)
rajiv: jon2, you can use the DOM API's cloneNode method. It takes a boolean,
deep, which if set to true will clone the whole XML document for you.
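In the standard DOM interfaces the method in question is Node.cloneNode(boolean deep). The sketch below shows a deep copy of a whole Document; the class and method names are illustrative only. One assumption to flag: the DOM specification leaves cloning of Document nodes implementation dependent, so this relies on the parser in use supporting it (the parser bundled with current JDKs does).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class.
class CloneDocument {
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Deep-copies a document; true means "clone all descendants too". */
    static Document deepCopy(Document original) {
        return (Document) original.cloneNode(true);
    }

    public static void main(String[] args) throws Exception {
        Document original = parse("<a><b>text</b></a>");
        Document copy = deepCopy(original);

        // Changing the copy leaves the original untouched.
        copy.getDocumentElement().setAttribute("copied", "yes");
        System.out.println(original.getDocumentElement().hasAttribute("copied"));
        System.out.println(copy.getDocumentElement().hasAttribute("copied"));
    }
}
```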
Russell: What is the status/schedule for schema support?
duncan: Russell: schema support, like DTD support, is handled by the
parser directly. So if the XML defines use of a schema, you'll have to use a
parser that groks them. Right now the JAXP 1.0 RI doesn't support them. There's
preliminary support in the Xerces parser, which means that when Xerces supports
JAXP, you'll be able to have transparent support using that combo.
JP: Why was ElementNode in Project X defined so
you cannot use appendNode but have to cast the ElementNode to either an
ElementEx or an Element? When should you use Element, ElementEx and ElementNode
to describe an XML element?
aah: Comparing JAXP to IBM XML4J, which one has the more mature API?
rajiv: JP: I am not quite sure what your question here is. ElementNode
implements ElementEx, which is just an interface to provide more support like
Namespaces in DOM Level 1. You should be able to typecast to ElementNode and
appendNode.
duncan: IBM XML4J vs. JAXP... JAXP is an API like DOM or SAX that XML4J
(built on the Xerces source base) can support. So there's not a direct
comparison. They are orthogonal. We're working to have JAXP support in Xerces
which hopefully means that XML4J will have support in the future.
Christophe Zeila: A question from Paris to our XML guru: do you
have any news about XML's extensible linking language and its fantastic
capabilities? Many thanks, Christophe.
Mike Nowak: Thanks, and thanks for putting this software out,
it's really helpful. Any idea when 1.0.1 will be available?
rajiv: Mike Now: 1.0.1 is ready from an engineering point of view. It is
currently going through the QA cycle and will be available soon.
duncan: Bonjour Christophe. It's late in Paris, isn't it? XLink is
working its way through the W3C. We're interested in it and looking into how we
can support it best. So no news yet on our front, but we're looking at it.
raygao: What about DataBind? When will that be in
beta and available for download?
Shikha Arora: I have read about XML and have tried to use it with
an ASP database and Access. I was getting the data from the database and
creating an XML document, rendering it through a style sheet and creating my
output. How can I start on something that basic using Java? What is the best
resource? Is there an example available I can download or look at to get
started?
duncan: Raygao: Project Adelard (DataBindings) will be out in Beta spec
and RI this summer, and you'll be able to grab it from our site at that time.
aah: Is the JAXP API for individual use only, or is it free for commercial use?
rajiv: Shikha A: If you download the JAXP RI there are a few basic
examples available. There is also a tutorial available. These two places
should be a good start.
duncan: JAXP is free for commercial use. Anybody can implement it. And
anybody can take the RI and bundle it however you want as long as you don't
change the javax.xml.* classes, of course.
donald: Are non-Sun suppliers of XML Parsers adapting their
software to conform to the JAXP interface? Specifically Oracle, IBM/Apache?
Ted Beers: The Project X implementation allows a node's parent
to be changed with XmlDocument.changeNodeOwner(), which is critical
for moving nodes from one Document to another. This feature appears to be
missing from the document standard, though. Is there some other means to take
nodes from one document and put them into a different (but compatible) document?
duncan: Yep. Non-Sun people are in the fray. Lots of partners are
implementing right now. Apache Xerces, we're working directly in there to do it.
Both Rajiv and I are on the Xerces project. And, there's a prelim implementation
of JAXP in the Xerces source base checked in by Pier F. If you keep up with
Xerces, you'll know who he is.
Nancy2: Is there any performance improvement over DOM or SAX
with JAXP?
rajiv: Ted Beer: DOM Level 2 has that support. The JSR for JAXP 1.1 (JSR
063) is already filed for supporting SAX 2.0, DOM Level 2 and XSLT. Till then
you can use the internal APIs provided through com.sun.xml.tree.*
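The DOM Level 2 support rajiv mentions is Document.importNode(Node, boolean), which copies a node into another document. The sketch below is an illustration under stated assumptions: the class and method names are invented, and since importNode copies rather than moves, the original node is detached afterwards to get the effect of a move.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class.
class MoveNode {
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /**
     * Moves node under newParent, which may live in a different document.
     * newParent must be an element (or other non-Document node), because
     * getOwnerDocument() returns null for the Document node itself.
     */
    static Node move(Node node, Node newParent) {
        Document targetDoc = newParent.getOwnerDocument();
        Node copy = targetDoc.importNode(node, true); // deep copy owned by targetDoc
        newParent.appendChild(copy);
        if (node.getParentNode() != null) {
            node.getParentNode().removeChild(node);   // detach the original
        }
        return copy;
    }

    public static void main(String[] args) throws Exception {
        Document source = parse("<inbox><msg id=\"1\">hello</msg></inbox>");
        Document target = parse("<archive/>");

        Node msg = source.getDocumentElement().getFirstChild();
        Node moved = move(msg, target.getDocumentElement());

        System.out.println(moved.getOwnerDocument() == target);
        System.out.println(source.getDocumentElement().hasChildNodes());
    }
}
```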
duncan: JAXP sits on top of and uses DOM/SAX as underlying APIs, so any
use of JAXP will by necessity be tied to a DOM or SAX implementation with the
same performance characteristics.
JP: You cannot, though, because appendNode is in ParentNode, which is not
exposed, so you get an exception unless you use the cast. Maybe it is a bug?
duncan: Performance is a top issue of ours and one of the things we're
doing with the Xerces folks is working to find out how all the different types
of documents that people use affect performance and how we can look at helping
parsers get faster and faster.
rajiv: As long as you don't try to typecast to ParentNode you are fine. If you
typecast to ElementNode and call appendNode it should work fine. There is
however a bug in the older JDKs where you will get IllegalAccessError. See the
release notes for JAXP 1.0 for details about that.
duncan: Somebody asked about JAXP into J2EE. I just want to add that
we're looking at getting JAXP into J2SE so that it's there in all Java runtimes.
No concrete news yet, but it's something that we're working on.
zankl: Is the parser's performance an issue if I try to digest
huge amounts of XML? Does a factory delivering a native parser implementation
make sense?
rajiv: Performance on the server side is critical. However, using a
native parser may not be necessary. Java parsers can perform equally or maybe
even better. If you look at xerces-j and xerces-c, the Java parser performs
better than the C parser.
Christophe Zeila: Another question: is there anyone you know in
the US working on a new graphical software GUI able to dynamically manipulate
objects like text, image, sound, and video and generate the XML DTD and XSL
documents? I know somebody working on that software GUI in France. Are you
interested in that kind of project?
duncan: Christophe, I've seen several people working on software that
generates XML. For example, the latest thing that I've been playing with is a
plug-in to Illustrator that dumps out SVG. I'm sure that people are doing the
same with SMIL et al.
jon2: Please summarize the additions/changes since JAXP was
Java Project X technology release 2.
srp: Do you know of any API that can be used to parse a DTD
(not the XML) and hand me back something I can traverse (kind of like DOM)?
duncan: Jon2: Basically the differences between JAXP 1.0 and Project X
TR2 were the JAXP classes, the private support classes implementing the JAXP
interfaces, and a lot of bug fixes and performance tweaks.
rajiv: xerces-j has some kind of DTD parsing support. You should take a
look at that.
Christian: Greetings! Just jumped on... right place at the right
time, I guess! But forgive me if this has been asked and answered... I am new
to JAXP, and am a bit frustrated with the JAXP tutorial. I haven't been able to
discover an answer to this question: I have an XML file, and I need to turn this
data into serialized objects. Which parser is most appropriate for this task,
SAX or DOM? Is there a paper or tutorial that could start me down the road to
this approach? Thanks much!
dinusha: I am currently working on MPEG7 work, which involves XML
schema. Can JAXP support this?
rajiv: Christia: Serialization of DOM is in the requirements list for
Level 3. Take a look at www.w3.org/DOM. You
can also take a look at Project Adelard (Java
Data binding project). Adelard beta should be available this summer.
Lassie Jorge: Is there going to be infrastructure built in to JAXP to
easily switch to a different parser implementation, like in JDBC?
duncan: Sure, JAXP will work fine with parsers that support Schema. Once
Xerces ships a build that has the JAXP classes in it, you'll probably be able
to validate this sort of work with it. Point is that JAXP supports the concept
of configuring validation or not but leaves it to the parser to support
different validation types. Of course, DTD validation is in the XML 1.0 spec and
Schema is in early days still. But nothing keeps a parser from doing JAXP and
Schema support together.
gbellows: You mentioned preliminary support for schema with
Xerces. But what about generating Java classes automatically that can parse and
generate structures to access the data?
duncan: What you are really talking about is Data Bindings. Project
Adelard, out this summer in a beta version. It takes an XML DTD and creates code
that will transparently parse in XML into objects and back again. There probably
is more info on our website.
rajiv: SAX is an event driven API and DOM can be built as an application
of SAX. SAX is serial whereas you can traverse the resulting object tree in any
fashion using the DOM APIs.
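The contrast can be sketched in a few lines. In the illustration below (class and method names are made up), the SAX handler only sees elements in document order as the parser reports them, while the DOM tree can be walked in any direction, here backwards. Note that DefaultHandler belongs to SAX 2, which arrived with JAXP 1.1 after this chat; JAXP 1.0 used the SAX 1 HandlerBase in the same role.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class.
class SaxVersusDom {
    static final String XML =
            "<list><item>a</item><item>b</item><item>c</item></list>";

    /** SAX: records element names in the order the parser reports them. */
    static String saxOrder(String xml) throws Exception {
        final StringBuilder seen = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                seen.append(qName).append(' ');
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return seen.toString().trim();
    }

    /** DOM: walks the root's children from last to first. */
    static String domReverseText(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        StringBuilder text = new StringBuilder();
        for (Node n = doc.getDocumentElement().getLastChild();
             n != null; n = n.getPreviousSibling()) {
            // Each <item> holds a single text node in this sample document.
            text.append(n.getFirstChild().getNodeValue());
        }
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(saxOrder(XML));       // list item item item
        System.out.println(domReverseText(XML)); // cba
    }
}
```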
rajiv: JAXP is designed to make it easy to switch between different
parser implementations. As the JAXP documentation describes, you just need to
set a System property to do that.
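For DOM parsing the property is named javax.xml.parsers.DocumentBuilderFactory (SAX has a matching javax.xml.parsers.SAXParserFactory). The sketch below is only an illustration: the class name SwitchParser is invented, and com.example.NoSuchFactory is a deliberately nonexistent class used to show that the factory lookup really consults the property. In practice the value would be the factory class of the parser you want, usually passed with -D on the command line.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.FactoryConfigurationError;

// Hypothetical demo class.
class SwitchParser {
    static final String KEY = "javax.xml.parsers.DocumentBuilderFactory";

    /** Name of the factory implementation JAXP currently selects. */
    static String currentFactoryClass() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }

    /**
     * Points the property at a class that does not exist and checks that
     * the lookup fails, which shows the property is honored.
     */
    static boolean lookupHonorsProperty() {
        String previous = System.getProperty(KEY);
        System.setProperty(KEY, "com.example.NoSuchFactory"); // hypothetical name
        try {
            DocumentBuilderFactory.newInstance();
            return false; // property was ignored
        } catch (FactoryConfigurationError e) {
            return true;  // JAXP tried to load the configured class
        } finally {
            // Restore the previous setting so later code is unaffected.
            if (previous == null) {
                System.clearProperty(KEY);
            } else {
                System.setProperty(KEY, previous);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("Default factory: " + currentFactoryClass());
        System.out.println("Property honored: " + lookupHonorsProperty());
    }
}
```

Application code keeps calling DocumentBuilderFactory.newInstance() unchanged; only the property value decides which parser sits underneath.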
nsherman: Could you please point to a resource that explains the
differences and interplay between SAX and DOM? It had seemed in looking at the
APIs that SAX was a parsing API, and DOM the resulting object model, but from
context here it seems that the distinction is not so clear-cut. Thanks!
Christian: Sirs, what are your most valued/treasured XML
resources? Books primarily, or any web pages that are worth a darn.
duncan: XML Resources. I keep the following bookmarked:
www.w3.org
www.xml.org/
www.xmlhack.com
www.xml.com
java.sun.com/xml :)
Erik D.: How do you mean, "JAXP 1.0 support is in there"?
Providing a unified parsing API for whatever parser is used in the back? A.k.a.
making Xerces a JAXP plugin?
aah: What is Xerces? How is that related to JAXP? Thank you!
duncan: Erik... Sort of.. More like implementing the 4 JAXP classes in
Xerces. It's providing glue logic so that when you create a
DocumentBuilderFactory, an underlying XercesDocumentBuilderFactory is actually
created which knows how to perform functionality with Xerces. These classes are
checked into the source base, but haven't yet made it into a build.
theo_MTA: Does the JAXP parser do content validation? That is, if an
attribute according to the DTD can have an enumeration of values, does the
parser throw an exception if the XML document has that attribute with a
different value? Is there any way of doing content validation when creating a
DOM document?
duncan: Right now in JAXP there's not a convenience API to validate a
DOM. This is part of the work of DOM level 3 that the W3C is undertaking.
rajiv: Xerces is the Apache parser implementation. The source code for
the JAXP implementation has been checked into the repository but not made
available through the builds.
duncan: theo... Yes, you can set the setValidating option on the
ParserFactories so that the parser will do content validation. If the content
isn't valid, the parser calls methods on the given ErrorHandler, allowing your
code to react sanely.
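As a rough sketch of that flow, the example below turns on validation via DocumentBuilderFactory.setValidating(true) and collects what the parser reports through an ErrorHandler. The class name, the sample DTD, and the choice to gather messages in a list are all illustrative assumptions; the sample mirrors theo's case of an enumerated attribute given a value outside its list.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical demo class.
class ValidatingParse {
    // Sample DTD: the color attribute must be "red" or "green".
    static final String DTD =
            "<!DOCTYPE light [\n"
          + "  <!ELEMENT light EMPTY>\n"
          + "  <!ATTLIST light color (red|green) #REQUIRED>\n"
          + "]>\n";

    /** Parses with DTD validation on and returns the reported problems. */
    static List<String> validate(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true);
        DocumentBuilder builder = factory.newDocumentBuilder();

        final List<String> problems = new ArrayList<String>();
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) {
                problems.add("warning: " + e.getMessage());
            }
            public void error(SAXParseException e) {
                // Validity errors are recoverable; record and keep parsing.
                problems.add("error: " + e.getMessage());
            }
            public void fatalError(SAXParseException e) throws SAXException {
                throw e; // not well-formed; parsing cannot continue
            }
        });
        builder.parse(new InputSource(new StringReader(xml)));
        return problems;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(validate(DTD + "<light color=\"red\"/>"));
        System.out.println(validate(DTD + "<light color=\"blue\"/>"));
    }
}
```

Without an ErrorHandler, the parser's default behavior for validity errors varies by implementation, which is why installing one is the usual way to react to invalid content.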
gonzo: What is needed to support XLink and XPointer using JAXP?
Do I get them for free? Thanks.
duncan: Depends on the parser implementation. Yes, I know that I keep
saying that, but the point of JAXP is to be a pluggability layer to the parser.
:) Anyway, if your parser does XLink/XPointer, then you'll get it. Now, there's
a bigger XPointer item that we'll have to look at with XSLT support in
JAXP.next, but we're still figuring that out.
MDR-edO: Well, our hour has quickly drawn to a close. I want to thank all
of you who have participated, and I want to especially thank our guests James
and Rajiv.
rajiv: Bye everybody. Thank you for the questions. If you have any
further questions please send them to
xml-feedback@java.sun.com and
visit java.sun.com/xml for further updates
to the various XML technologies.
duncan: By the way, we've had lots of questions that we haven't gotten
to. Our chat system actually got pretty loaded down for a while there. So,
apologies to those that we didn't get to. We'll be around in various places
though. Upcoming places where we'll be talking: WWW9 in Amsterdam, XML Europe,
and of course JavaOne. In addition, we're out on the mailing lists.
MDR-edO: Last moderator (me) signing off. The forum is now unmoderated.