University of Oxford
Computing Services

Processing XML using Java

Author: Barry Cornelius
Date: 1st May 2001
<table border=1>
  <tr>
    <td><b>inkjet cartridges</b></td>
    <td><em>HP Deskjet print cartridge 51645A</em></td>
    <td bgcolor="yellow">19.50</td>
  </tr>
  ...
</table>

<consumables>
  <product>
    <category>inkjet cartridges</category>
    <item>HP Deskjet print cartridge 51645A</item>
    <price>19.50</price>
  </product>
  ...
</consumables>
<price>19.50</price>
is called an element. It is introduced by a start tag. In this example, the start tag is:
<price>
And it is terminated by an end tag:
</price>
A start tag may have one or more attributes as in:
<price units="pounds" country="UK">19.50</price>
An element may be empty as in:
<price amount="19.50"></price>
and this can be abbreviated to:
<price amount="19.50"/>
Another example is:
<unknown_price></unknown_price>
and this can be abbreviated to:
<unknown_price/>
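The terms above (element, attribute, empty element) can be seen from a program's point of view in the following minimal sketch. It uses the DOM parser that ships with the JDK rather than any parser named in these notes; the class name ElementDemo and the method parseRoot are illustrative assumptions.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// ElementDemo and parseRoot are illustrative names, not part of the original notes.
final class ElementDemo {

    // Parses a small XML document held in a string and returns its root element.
    static Element parseRoot(String pXml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(pXml)))
                    .getDocumentElement();
        } catch (Exception pException) {
            throw new IllegalArgumentException("could not parse: " + pXml, pException);
        }
    }

    public static void main(String[] pArgs) {
        Element tPrice = parseRoot("<price units=\"pounds\" country=\"UK\">19.50</price>");
        System.out.println(tPrice.getTagName());             // prints price
        System.out.println(tPrice.getAttribute("units"));    // prints pounds
        System.out.println(tPrice.getAttribute("country"));  // prints UK
        System.out.println(tPrice.getTextContent());         // prints 19.50

        // The long and the abbreviated form of an empty element look the same
        // to a program: no child nodes and the same attribute value.
        Element tLong = parseRoot("<price amount=\"19.50\"></price>");
        Element tShort = parseRoot("<price amount=\"19.50\"/>");
        System.out.println(tLong.hasChildNodes());            // prints false
        System.out.println(tShort.hasChildNodes());           // prints false
        System.out.println(tShort.getAttribute("amount"));    // prints 19.50
    }
}
```

Both spellings of the empty element produce the same tree, which is why the abbreviated form can be used freely.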
As I see it, there are three main goals of XML:
<!ELEMENT consumables (product*)>
<!ELEMENT product (category, item, price)>
<!ELEMENT category (#PCDATA)>
<!ELEMENT item (#PCDATA)>
<!ELEMENT price (#PCDATA)>
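To illustrate what a DTD like the one above buys us, here is a minimal sketch that asks the JDK's validating parser to check a document against it. The DTD is embedded as an internal subset so that the example is self-contained; the class name DtdCheck and the method validate are assumptions made for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// DtdCheck is an illustrative name; the DTD text is the one shown above.
final class DtdCheck {

    static final String DOCTYPE =
          "<!DOCTYPE consumables [\n"
        + "<!ELEMENT consumables (product*)>\n"
        + "<!ELEMENT product (category, item, price)>\n"
        + "<!ELEMENT category (#PCDATA)>\n"
        + "<!ELEMENT item (#PCDATA)>\n"
        + "<!ELEMENT price (#PCDATA)>\n"
        + "]>\n";

    // Returns the validity errors found in pBody (empty if it obeys the DTD).
    static List<String> validate(String pBody) {
        final List<String> tErrors = new ArrayList<>();
        try {
            DocumentBuilderFactory tFactory = DocumentBuilderFactory.newInstance();
            tFactory.setValidating(true);   // check against the DTD while parsing
            DocumentBuilder tBuilder = tFactory.newDocumentBuilder();
            tBuilder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException pException) {
                    tErrors.add(pException.getMessage());   // a validity error
                }
            });
            tBuilder.parse(new InputSource(new StringReader(DOCTYPE + pBody)));
        } catch (Exception pException) {
            tErrors.add(pException.getMessage());           // e.g. not well-formed
        }
        return tErrors;
    }

    public static void main(String[] pArgs) {
        String tGood = "<consumables><product><category>inkjet cartridges</category>"
                     + "<item>HP Deskjet print cartridge 51645A</item>"
                     + "<price>19.50</price></product></consumables>";
        String tBad  = "<consumables><product><category>inkjet cartridges</category>"
                     + "<item>HP Deskjet print cartridge 51645A</item>"
                     + "</product></consumables>";          // price is missing
        System.out.println(validate(tGood).isEmpty());      // prints true
        System.out.println(validate(tBad).isEmpty());       // prints false
    }
}
```

The second document is well-formed XML but omits the price element that the DTD requires in every product, so the parser reports it as invalid.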
More recently, XML schemas have been introduced. They can be used to define the structure of some XML (instead of using a DTD). Two of the main differences are:
<xsl:template match="consumables">
  <html>
    <body>
      <table>
        <xsl:apply-templates/>
      </table>
    </body>
  </html>
</xsl:template>
Three approaches for processing an XML document will now be considered.
We could write a program to process an XML document. The program could be used to produce HTML which we could then make available on the WWW. More details about this are given in Section 5.
The problem with using a tool is that we have to remember to run it every time the XML document is updated. The remaining two approaches enable the user of a browser to work directly on the XML document.
If a webserver supports the running of server-side programs, then a browser visiting a page could trigger a webserver to run a program that reads an XML document and produces HTML. Although this could be done using a CGI program or by running a PHP script, Section 6 looks at how a Java program could be run by a webserver.
XMLReader tXMLReader = new org.apache.xerces.parsers.SAXParser();
Handler tHandler = new Handler();
tXMLReader.setContentHandler(tHandler);
FileReader tFileReader = new FileReader("consumables.xml");
InputSource tInputSource = new InputSource(tFileReader);
tXMLReader.parse(tInputSource);
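import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class Handler extends DefaultHandler {
    // Called by the parser each time it meets a start tag.
    public void startElement(String pURI, String pLocalName,
                             String pQualifiedName, Attributes pAttributes) {
        System.out.println(pLocalName);
    }
}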
The full code of a program that produces HTML from the consumables.xml file is given at http://www.dur.ac.uk/barry.cornelius/java/xml.processing/code/ConvertConsumables.java and http://www.dur.ac.uk/barry.cornelius/java/xml.processing/code/Handler.java.
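For readers without access to those files, the following is a minimal sketch of how a SAX handler might turn the consumables document into an HTML table. It is not the ConvertConsumables program referred to above. It uses the JDK's SAXParserFactory instead of naming the Xerces parser class directly, and because that factory is not namespace-aware by default it uses the qualified name rather than the local name. The class name ConsumablesToHtml, and the decision to treat every element other than consumables and product as a table cell, are assumptions.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// ConsumablesToHtml is an illustrative name, not the author's program.
final class ConsumablesToHtml {

    static String convert(String pXml) {
        final StringBuilder tHtml = new StringBuilder();
        DefaultHandler tHandler = new DefaultHandler() {
            private boolean iInCell = false;

            @Override
            public void startElement(String pURI, String pLocalName,
                                     String pQualifiedName, Attributes pAttributes) {
                switch (pQualifiedName) {
                    case "consumables": tHtml.append("<table>\n"); break;
                    case "product":     tHtml.append("<tr>");      break;
                    default:            tHtml.append("<td>"); iInCell = true;
                }
            }

            @Override
            public void characters(char[] pChars, int pStart, int pLength) {
                if (!iInCell) return;   // ignore white space between elements
                for (int tIndex = pStart; tIndex < pStart + pLength; tIndex++) {
                    char tChar = pChars[tIndex];
                    if (tChar == '<')      tHtml.append("&lt;");
                    else if (tChar == '&') tHtml.append("&amp;");
                    else                   tHtml.append(tChar);
                }
            }

            @Override
            public void endElement(String pURI, String pLocalName, String pQualifiedName) {
                switch (pQualifiedName) {
                    case "consumables": tHtml.append("</table>\n"); break;
                    case "product":     tHtml.append("</tr>\n");    break;
                    default:            tHtml.append("</td>"); iInCell = false;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(pXml)), tHandler);
        } catch (Exception pException) {
            throw new IllegalArgumentException("could not parse document", pException);
        }
        return tHtml.toString();
    }

    public static void main(String[] pArgs) {
        System.out.print(convert(
              "<consumables><product><category>inkjet cartridges</category>"
            + "<item>HP Deskjet print cartridge 51645A</item>"
            + "<price>19.50</price></product></consumables>"));
    }
}
```

The handler never holds the whole document in memory: it reacts to each start tag, run of characters and end tag as the parser reports them, which is the essence of the SAX approach.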
java org.apache.xalan.xslt.Process -in consumables.xml -xsl consumables.xsl -out consumables.html
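The same kind of transformation can also be started from within a Java program. The sketch below uses the standard javax.xml.transform API (which Xalan implements) with whatever XSLT processor the JDK supplies. The stylesheet in it is a small hypothetical one written for this example: only its consumables template is taken from these notes, and the class name ApplyStylesheet is also an assumption.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// ApplyStylesheet is an illustrative name, not part of the original notes.
final class ApplyStylesheet {

    // Hypothetical stylesheet: one table row per product, one cell per child.
    static final String XSL =
          "<xsl:stylesheet version=\"1.0\""
        + "    xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"html\"/>"
        + "<xsl:template match=\"consumables\">"
        + "<html><body><table><xsl:apply-templates/></table></body></html>"
        + "</xsl:template>"
        + "<xsl:template match=\"product\"><tr><xsl:apply-templates/></tr></xsl:template>"
        + "<xsl:template match=\"category|item|price\">"
        + "<td><xsl:value-of select=\".\"/></td>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet text pXsl to the document text pXml.
    static String transform(String pXml, String pXsl) {
        try {
            Transformer tTransformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(pXsl)));
            StringWriter tWriter = new StringWriter();
            tTransformer.transform(new StreamSource(new StringReader(pXml)),
                                   new StreamResult(tWriter));
            return tWriter.toString();
        } catch (TransformerException pException) {
            throw new IllegalStateException("transformation failed", pException);
        }
    }

    public static void main(String[] pArgs) {
        String tXml = "<consumables><product><category>inkjet cartridges</category>"
                    + "<item>HP Deskjet print cartridge 51645A</item>"
                    + "<price>19.50</price></product></consumables>";
        System.out.println(transform(tXml, XSL));
    }
}
```

To work on files, as the command line above does, the StreamSource and StreamResult objects could be constructed from java.io.File objects instead of from strings.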
XML parsers (including the SAX and DOM APIs) and XSLT processors are available from:
Some articles and books about SAX, DOM and XSLT include:
<HTML>
<BODY>
<FORM METHOD="POST"
      ACTION="http://altair.dur.ac.uk:8080/barry.cornelius/servlet/choice">
  Please enter your choice<BR>
  <INPUT TYPE="text" NAME="choice">
  <BR>
  <INPUT TYPE="submit" VALUE="Submit choice">
  <INPUT TYPE="reset">
</FORM>
</BODY>
</HTML>
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.ServletException;

public class choice extends HttpServlet {
    public void doPost(final HttpServletRequest request,
                       final HttpServletResponse response)
            throws ServletException, IOException {
        final String tChoiceString = request.getParameter("choice");
        response.setContentType("text/html");
        final PrintWriter tResponsePrintWriter = response.getWriter();
        StringBuffer tStringBuffer = new StringBuffer();
        tStringBuffer.append("<html>\n");
        tStringBuffer.append("<title>Reply</title>\n");
        tStringBuffer.append("Your choice was: " + tChoiceString + "\n");
        tStringBuffer.append("</html>\n");
        tResponsePrintWriter.println(tStringBuffer);
        tResponsePrintWriter.close();
    }
}
<HTML>
<BODY>
<FORM METHOD="POST"
      ACTION="http://altair.dur.ac.uk:8080/barry.cornelius/prices/prices.jsp">
  What is your choice of price?<BR>
  <INPUT TYPE="text" NAME="testprice">
  <BR>
  <INPUT TYPE="submit" VALUE="Submit price">
  <INPUT TYPE="reset">
</FORM>
</BODY>
</HTML>
<%@page language="java" import="java.sql.*" %>
<%
    String tMaxPriceString = "1000000000000.0";
    if (request.getParameter("testprice") != null) {
        tMaxPriceString = (String)request.getParameter("testprice");
    }
    final Driver tDriver =
        (Driver)Class.forName("org.gjt.mm.mysql.Driver").newInstance();
    final Connection tConnection = DriverManager.getConnection(
        "jdbc:mysql://www.dur.ac.uk/Pdcl0bjc_prices", "", "");
    // The price is bound as a parameter rather than pasted into the SQL text,
    // so that whatever the user typed cannot alter the query itself.
    final PreparedStatement tPreparedStatement = tConnection.prepareStatement(
        "SELECT * FROM consum WHERE price < ? ORDER BY price");
    tPreparedStatement.setDouble(1, Double.parseDouble(tMaxPriceString));
    tPreparedStatement.setQueryTimeout(0);
    final ResultSet tResultSet = tPreparedStatement.executeQuery();
%>
<html>
<head>
<title>Access to the prices database</title>
</head>
<body bgcolor="#FFFFFF">
<table>
<% while (tResultSet.next()) { %>
  <tr>
    <td>
      <%= ((tResultSet.getObject("price")!=null)?tResultSet.getObject("price"):"") %>
    </td>
    <td>
      <%= ((tResultSet.getObject("goods")!=null)?tResultSet.getObject("goods"):"") %>
    </td>
  </tr>
<% } %>
</table>
</body>
</html>
<%
    tResultSet.close();
    tConnection.close();
%>
In a JSP page such as this one, the Java code is enclosed in <% and %> brackets.
<html><body>
<p>start</p>
<jsp:useBean id="tAdder" scope="page" class="addition.Adder" />
<p>effectively this applies setFirst(27) to tAdder</p>
<jsp:setProperty name="tAdder" property="first" value="27" />
<p>effectively this applies setSecond(42) to tAdder</p>
<jsp:setProperty name="tAdder" property="second" value="42" />
<p> now apply getSum to tAdder </p>
<p>sum is <jsp:getProperty name="tAdder" property="sum" /> </p>
<p>effectively this applies setFirst(5) to tAdder</p>
<jsp:setProperty name="tAdder" property="first" value="5" />
<p> now apply getSum to tAdder again </p>
<p>sum is <jsp:getProperty name="tAdder" property="sum" /> </p>
<p>finish</p>
</body></html>
package addition;

public class Adder {
    private String iX = "0";
    private String iY = "0";

    public void setFirst(String pFirst) {
        iX = pFirst;
    }

    public void setSecond(String pSecond) {
        iY = pSecond;
    }

    public String getSecond() {
        return iY;
    }

    public String getSum() {
        int tResult = Integer.parseInt(iX) + Integer.parseInt(iY);
        return "" + tResult;
    }
}
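The effect of the jsp:setProperty and jsp:getProperty actions in the page above can be reproduced in plain Java by making the corresponding method calls directly. The following sketch does so; it contains a nested copy of the Adder class only so that it is self-contained, and the class name AdderDemo is an assumption.

```java
// AdderDemo is an illustrative name; Adder is a copy of the bean shown above.
final class AdderDemo {

    static class Adder {
        private String iX = "0";
        private String iY = "0";
        public void setFirst(String pFirst) { iX = pFirst; }
        public void setSecond(String pSecond) { iY = pSecond; }
        public String getSecond() { return iY; }
        public String getSum() {
            int tResult = Integer.parseInt(iX) + Integer.parseInt(iY);
            return "" + tResult;
        }
    }

    public static void main(String[] pArgs) {
        Adder tAdder = new Adder();          // jsp:useBean id="tAdder"
        tAdder.setFirst("27");               // jsp:setProperty property="first"
        tAdder.setSecond("42");              // jsp:setProperty property="second"
        System.out.println("sum is " + tAdder.getSum());   // prints sum is 69
        tAdder.setFirst("5");                // only the first value changes
        System.out.println("sum is " + tAdder.getSum());   // prints sum is 47
    }
}
```

The bean keeps its state between the two calls of getSum, which is why the second sum still includes the value 42.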
<%@page language="java" %>
<%@taglib uri="http://jakarta.apache.org/taglibs/xsl-1.0" prefix="xsltlib" %>
<html>
<head>
<title>Consumables example</title>
</head>
<body>
<xsltlib:apply xml="/xml.processing/consumables.xml"
               xsl="/xml.processing/consumables.xsl" />
</body>
</html>
As time moves on, WWW browsers play catch-up with new developments. We now look at the support for XML (and XSL) that is provided in recent versions of browsers: we look at version 5.5 of Internet Explorer (IE5.5) and Milestone 18 of Mozilla augmented by the code of Mozilla's XSLT project (M18XSLT). Milestone 18 was released in October 2000. Since the release of Mozilla 0.9.1 on 6th June 2001, the code of the XSLT project has formed part of Mozilla. The latest release of Mozilla can be downloaded from http://www.mozilla.org/releases/.
Both IE5.5 and M18XSLT provide support for XML with CSS1 (version 1 of Cascading Style Sheets). To use CSS, you need to augment an XML document with an xml-stylesheet processing instruction. This results in the text that is in the file http://www.dur.ac.uk/barry.cornelius/java/xml.processing/code/css1.xml.
<?xml version="1.0" standalone="no"?>
<?xml-stylesheet type="text/css" href="css1.css"?>
<!DOCTYPE consumables SYSTEM "consumables.dtd">
<consumables>
  ...
</consumables>
The type attribute of this xml-stylesheet processing instruction says that this XML document is to be transformed using a CSS stylesheet and the href attribute gives the URL of the file containing the actual CSS instructions.
The file css1.css might contain:
consumables { display: block }
product     { display: block }
category    { font-size: x-large; color: black }
item        { font-size: large; color: red }
price       { background-color: yellow; color: blue }
CSS2 introduces some new aspects like the handling of tables. So instead we could use the file http://www.dur.ac.uk/barry.cornelius/java/xml.processing/code/css2.xml.
<?xml version="1.0" standalone="no"?>
<?xml-stylesheet type="text/css" href="css2.css"?>
<!DOCTYPE consumables SYSTEM "consumables.dtd">
<consumables>
  ...
</consumables>
where the file css2.css contains:
consumables { display: table }
product     { display: table-row }
category    { display: table-cell; font-size: x-large; color: black }
item        { display: table-cell; font-size: large; color: red; text-indent: 0.1in }
price       { display: table-cell; background-color: yellow; color: blue; text-align: right; text-indent: 0.1in }
Although these instructions are obeyed correctly by M18XSLT, they are not understood by IE5.5.
Both IE5.5 and M18XSLT can use XSL to process an XML document. You need to insert the following xml-stylesheet processing instruction in your XML document:
<?xml-stylesheet type="text/xsl" href="consumables.xsl"?>
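A program can see this processing instruction too. The following minimal sketch uses the JDK's DOM parser to find the xml-stylesheet instruction in a document and to pick out its href value with a regular expression. The class name StylesheetLink and its two methods are assumptions made for this illustration.

```java
import java.io.StringReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.ProcessingInstruction;
import org.xml.sax.InputSource;

// StylesheetLink is an illustrative name, not part of the original notes.
final class StylesheetLink {

    // Returns the data of the first xml-stylesheet processing instruction
    // that appears before the root element, or null if there is none.
    static String stylesheetData(String pXml) {
        Document tDocument;
        try {
            tDocument = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(pXml)));
        } catch (Exception pException) {
            throw new IllegalArgumentException("could not parse document", pException);
        }
        for (Node tNode = tDocument.getFirstChild(); tNode != null;
                tNode = tNode.getNextSibling()) {
            if (tNode.getNodeType() == Node.PROCESSING_INSTRUCTION_NODE) {
                ProcessingInstruction tInstruction = (ProcessingInstruction) tNode;
                if ("xml-stylesheet".equals(tInstruction.getTarget())) {
                    return tInstruction.getData();
                }
            }
        }
        return null;
    }

    // Extracts the href value from the instruction's data, or null if absent.
    static String href(String pData) {
        if (pData == null) return null;
        Matcher tMatcher = Pattern.compile("href=\"([^\"]*)\"").matcher(pData);
        return tMatcher.find() ? tMatcher.group(1) : null;
    }

    public static void main(String[] pArgs) {
        String tXml = "<?xml version=\"1.0\"?>"
                    + "<?xml-stylesheet type=\"text/xsl\" href=\"consumables.xsl\"?>"
                    + "<consumables/>";
        String tData = stylesheetData(tXml);
        System.out.println(tData);        // prints type="text/xsl" href="consumables.xsl"
        System.out.println(href(tData));  // prints consumables.xsl
    }
}
```

Note that type and href are not attributes in the XML sense: the parser hands over everything after the target name as a single string, so the program has to take it apart itself.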
Inconveniently, there are two important differences in the XSL that is understood by IE5.5 and M18XSLT:
<xsl:template match="/"> <xsl:apply-templates/> </xsl:template>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">
whereas an M18XSLT XSL document requires:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html"/>
This is because IE5.5 has an old version of Microsoft's XML Parser (MSXML). The WWW page http://www.netcrucible.com/xslt/msxml-faq.htm gives details of how you can obtain Version 3.0 of MSXML (November 2000) and get IE5.5 to use it.
The files http://www.dur.ac.uk/barry.cornelius/java/xml.processing/code/xsl.xml and http://www.dur.ac.uk/barry.cornelius/java/xml.processing/code/consumables.xsl contain the texts that can be used by both M18XSLT and the modified version of IE5.5.
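Since the two dialects are distinguished by the namespace declared on the xsl:stylesheet element, a program can tell which dialect a stylesheet is written in before deciding what to do with it. The following sketch does this with the JDK's namespace-aware DOM parser; the class name XslDialect and its methods are assumptions made for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

// XslDialect is an illustrative name, not part of the original notes.
final class XslDialect {

    static final String WD_XSL = "http://www.w3.org/TR/WD-xsl";
    static final String XSLT_1_0 = "http://www.w3.org/1999/XSL/Transform";

    // Returns the namespace URI of the root element of a stylesheet.
    static String namespaceOf(String pXsl) {
        try {
            DocumentBuilderFactory tFactory = DocumentBuilderFactory.newInstance();
            tFactory.setNamespaceAware(true);   // needed for getNamespaceURI
            return tFactory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(pXsl)))
                    .getDocumentElement()
                    .getNamespaceURI();
        } catch (Exception pException) {
            throw new IllegalArgumentException("could not parse stylesheet", pException);
        }
    }

    // Describes the dialect that a namespace URI corresponds to.
    static String describe(String pNamespace) {
        if (WD_XSL.equals(pNamespace)) return "old working-draft XSL";
        if (XSLT_1_0.equals(pNamespace)) return "XSLT 1.0";
        return "not an XSL stylesheet";
    }

    public static void main(String[] pArgs) {
        String tOld = "<xsl:stylesheet xmlns:xsl=\"" + WD_XSL + "\"/>";
        String tNew = "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"" + XSLT_1_0 + "\"/>";
        System.out.println(describe(namespaceOf(tOld)));   // prints old working-draft XSL
        System.out.println(describe(namespaceOf(tNew)));   // prints XSLT 1.0
    }
}
```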