By now, you'll probably have come across my notes on
processing XML using SAX.
If not, do take a look.
This page describes my initial explorations of the second
low-level XML API, known as DOM (Document Object Model).
DOM converts an XML document into a tree structure. This is clearly
useful for random access into modestly sized XML documents, but it is
less suitable than SAX for large documents, because the entire XML
document is read into memory to create the tree.
Using the methods defined in the Java org.w3c.dom package,
I developed a Java program to read an XML file into a DOM structure,
and print the DOM in a simple tree format.
This involved a recursive function, printtree(), which prints a Node and
then its children (indented one tab stop). For an Element node, the tag
name was extracted using getTagName(), and the list of attributes using
getAttributes(). A single call to printtree(document) in main() then
printed the entire XML document specified on the command line.
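The recursive printing described above might look roughly like the following. This is only a minimal sketch, not the original program: the class name DomTree, the output labels ("Element:", "Text:"), the decision to skip whitespace-only text nodes, and collecting the output in a StringBuilder are all assumptions made for illustration. Only getTagName(), getAttributes() and the name printtree come from the notes above.

```java
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class DomTree {

    // Append one line for this node, then recurse into its children
    // with one extra tab of indentation.
    static void printtree(Node node, String indent, StringBuilder out) {
        if (node instanceof Element) {
            Element el = (Element) node;
            out.append(indent).append("Element: ").append(el.getTagName());
            NamedNodeMap attrs = el.getAttributes();
            for (int i = 0; i < attrs.getLength(); i++) {
                Node a = attrs.item(i);
                out.append(' ').append(a.getNodeName())
                   .append("=\"").append(a.getNodeValue()).append('"');
            }
            out.append('\n');
        } else if (node.getNodeType() == Node.TEXT_NODE) {
            // Assumption: whitespace-only text nodes are not worth printing.
            String text = node.getNodeValue().trim();
            if (text.length() > 0) {
                out.append(indent).append("Text: ").append(text).append('\n');
            }
        } else {
            // Document, comment and other node types: print the generic name.
            out.append(indent).append(node.getNodeName()).append('\n');
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            printtree(children.item(i), indent + "\t", out);
        }
    }

    // Convenience wrapper: return the whole tree as a string.
    static String printtree(Node node) {
        StringBuilder out = new StringBuilder();
        printtree(node, "", out);
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("Usage: java DomTree file.xml");
            return;
        }
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document document = builder.parse(new File(args[0]));
        System.out.print(printtree(document));
    }
}
```

Running java DomTree somefile.xml should print the document node first, followed by each element one tab stop deeper than its parent.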
Details of the process:
-
I looked first at the
DOM tutorial from Sun Microsystems
which is actually part of their JAXP tutorial set.
-
I then downloaded the
JAXP software
from Sun Microsystems and unpacked the file jaxp-1_1.zip using the unzip command.
-
I made sure PATH was set to
include the Java utilities from the copy of Sun's JDK I had downloaded:
PATH=/usr/local/jdk1.2.2/bin:$PATH; export PATH
This overrides the older versions of the Java utilities supplied with
my Linux installation.
-
I worked through the JAXP DOM example, to the point where an
XML document can be turned into a DOM structure.
In order to compile the sample code, it's necessary to
specify the classpath, which was done as follows:
javac -classpath /mnt/DOS_hda1/Linux/jaxp1.1/jaxp-1.1/jaxp.jar:\
/mnt/DOS_hda1/Linux/jaxp1.1/jaxp-1.1/crimson.jar \
DomEcho.java
This should create the Java class file DomEcho.class.
-
I then remembered that the
Xerces parser
implements the classes necessary to get this far,
including the required parts of the JAXP API.
In other words, with no code changes required, I could build the
same DomEcho.class file using the command:
javac -classpath /mnt/DOS_hda1/Linux/xerces1.3.1/xerces-1_3_1/xerces.jar \
DomEcho.java
-
I developed the remainder of the Java program using the
Xerces API documentation
for reference.
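The point made above, that the same source builds against either parser, follows from the code talking only to the JAXP factory classes and never naming a parser directly. The sketch below illustrates that idea. It is not the DomEcho sample: the class name ParserCheck and the inline test document are made up for illustration, and which factory class gets reported depends entirely on what is on the classpath (or on the parser bundled with the JDK in use).

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class ParserCheck {

    // Parse an XML string into a DOM Document using whichever JAXP
    // implementation the factory lookup finds at run time.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));
    }

    public static void main(String[] args) throws Exception {
        // Report which implementation was picked up; no parser-specific
        // class is referenced anywhere in this source file.
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        System.out.println("Factory: " + factory.getClass().getName());

        Document doc = parse("<note><to>reader</to></note>");
        System.out.println("Root element: "
            + doc.getDocumentElement().getTagName());
    }
}
```

Swapping the jar files named in -classpath should change the first line of output, while the source and the second line stay the same.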