In the world of enterprise computing there’s always been one big problem: interoperability. How do you get all your business systems to communicate with one another, when they’re using different data types, not to mention different operating systems? With organisations becoming larger and larger, and more and more distributed, this problem is getting bigger – not smaller. And now, another element has entered the equation: your customers and suppliers. How can you work directly with their systems, without costing them and you large sums of money?
One option is to use the World Wide Web. The combination of browser and web server is a powerful and easy-to-use tool for delivering information to users on many different platforms over local and wide area networks. Unfortunately, the very flexibility of the HTML tags used to display web pages means that they’re not suitable for delivering structured information. HTML tags only describe how data is to be displayed, not what the data is. As business data is normally structured information, this means that HTML is a poor tool for handling business-to-business communications and application integration.
If web technologies are to become the key method of building business applications, then we need to use structured data. Otherwise the tools required to deal with information shared over the web will need to be extremely complex, as they will be working with the equivalent of raw text. Whilst it is possible to develop parsing technologies based on keywords, a new generation of document tags is required, one that allows documents to describe their own contents. This concept is known as “metadata”: information about information. Using metadata it is possible to take a structured text document, parse it, and deliver the content to a database or to an application.
The solution to this problem lies in the history of the World Wide Web. The initial versions of HTML were based on SGML, the international standard mark-up language for document formatting (in fact, HTML is itself defined as an SGML DTD, or Document Type Definition). The structure of SGML is designed to allow users to create structured documents that contain metadata. By using SGML, documents are able to define their own grammar, providing embedded rules for describing document layout, so that SGML documents can be designed for specific purposes. This is achieved by providing definitions for the tags used in the documents.
As web technologies link more and more businesses and provide front ends to business systems, metadata will become increasingly important. Organisations have different ways of displaying and encoding information, and the use of metadata allows information transfer systems to be implemented. The World Wide Web Consortium (W3C) has understood the importance of metadata to the future development of the Web, and has been working to produce a specification for allowing structured information to be distributed over the Internet. This has been recently released as the specification for XML 1.0 - the eXtensible Mark-up Language.
By allowing developers to create data schema for XML documents, it’s possible to create general-purpose document and message structures that can then be used by businesses to transfer information. As the rules for parsing XML documents have been defined, and implemented as parsers for most major platforms, XML can be used to link incompatible computer systems by formatting data as an XML message, transmitting it via any possible means, and then extracting the information and using it. By implementing such a system, the only development required will be the export and import tools.
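The export and import tools mentioned above can be sketched in a few lines. This is a minimal illustration in Python, using its standard xml.etree.ElementTree module, not a definitive implementation. The ORDER message layout and the function names export_order and import_order are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def export_order(order: dict) -> bytes:
    """Export tool: turn an application record into an XML message."""
    root = ET.Element("ORDER")  # hypothetical message type
    for field, value in order.items():
        child = ET.SubElement(root, field.upper())
        child.text = str(value)
    # Serialise to bytes so the message can travel over any transport.
    return ET.tostring(root, encoding="utf-8")


def import_order(message: bytes) -> dict:
    """Import tool: parse an XML message back into a plain record."""
    root = ET.fromstring(message)
    return {child.tag.lower(): child.text for child in root}


if __name__ == "__main__":
    sent = {"customer": "XML Publishing Ltd.", "item": "OX123", "quantity": 3}
    message = export_order(sent)      # sending system
    received = import_order(message)  # receiving system
    print(received)
```

Every value comes back as text, so the receiving system has to convert fields such as the quantity to its own data types. This is where an agreed schema earns its keep.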
XML alone is a powerful tool, but when combined with the XSL, eXtensible Style sheet Language, filtering and formatting system it becomes even more useful. By allowing organisations to create their own XSL style sheets they can not only format and display the contents of an XML document in a standard web browser, they can also use XSL as part of any XML processing sub-system. XSL contains a concise document filtering and sorting language, which allows it to be used as a significant part of the business logic of any XML parsing component – reducing the complexity of any XML processing software, and allowing rapid change without requiring rewriting of any applications.
XML was originally designed to enable document sharing over the web, but it’s more powerful than that, as it also provides the framework for a new generation of electronic commerce applications. Initially XML is most likely to be used in messaging applications, and in integrating messaging frameworks into business systems.
The most obvious area where XML will be used is in developing web-based EDI systems. Instead of delivering carefully formatted messages to specific suppliers or customers, a business will be able to define <ORDER> or <INVOICE> documents with their associated tags. Simple applications written in Visual Basic or Java, linked to widely available XML parsers, could then be used to interpret the XML pages and transfer the information directly into business systems, whilst the messages themselves could be displayed in a web browser or an off-the-shelf office suite. By building XML into an inter-organisation message framework, a company based around one specific OS and application platform can integrate its operations with one based around a different set of tools. This type of application can also be moved inside the organisation, where internal procurement systems and ERP systems can be enhanced with the use of messaging over distributed asynchronous networks.
XML can also form the basis of more complex web applications, especially in the development of brokerage systems. A good example of this type of application is a quotation system for an insurance broker. A traditional web application can be used to capture user details and requirements, and then collate them into an XML document. This can then be delivered to the quotation systems of various underwriters. The information held in the XML document can then be parsed and translated for use. A similar process can be used to deliver the resulting quotations to the brokerage application for display. By using XML, the brokerage can give its users the same single point of contact as a direct insurance service.
Another area where XML is likely to become important is in the development of web site content control systems. Large corporate sites are complicated systems, where information needs to be categorised before publication. By using XML tags to describe a page as <NEWS> or as <SUPPORT> a content management tool can place the item in the correct place in the site. It’s also possible to tag an item as <ENGLISH> or <GERMAN>, and then use server-side script driven content negotiation techniques to determine the localisation of a browser, and then deliver appropriate content without any user intervention.
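The tagging scheme just described can be sketched as follows. This is a simplified Python illustration built on assumptions: the nesting of language elements inside the category element, the SECTIONS and LANGUAGE_TAGS mappings, and both function names are hypothetical. Real content negotiation would also honour the quality values in the browser's Accept-Language header, which this sketch ignores.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from a document's root tag to a section of the site.
SECTIONS = {"NEWS": "/news/", "SUPPORT": "/support/"}

# Hypothetical mapping from browser language codes to language tags.
LANGUAGE_TAGS = {"en": "ENGLISH", "de": "GERMAN"}


def section_for(document: str) -> str:
    """Return the site section for an XML item, based on its root tag."""
    root = ET.fromstring(document)
    return SECTIONS.get(root.tag, "/unsorted/")


def pick_language(document: str, accept_language: str) -> str:
    """Return the text of the first language variant the browser accepts.

    accept_language is the value of the HTTP Accept-Language header.
    Entries are tried in the order given; q-values are ignored here.
    """
    root = ET.fromstring(document)
    variants = {child.tag: (child.text or "") for child in root}
    for entry in accept_language.split(","):
        code = entry.split(";")[0].strip().lower()[:2]
        tag = LANGUAGE_TAGS.get(code)
        if tag in variants:
            return variants[tag]
    # Fall back to English when nothing matches.
    return variants.get("ENGLISH", "")


if __name__ == "__main__":
    item = "<NEWS><ENGLISH>New release</ENGLISH><GERMAN>Neue Version</GERMAN></NEWS>"
    print(section_for(item))
    print(pick_language(item, "de-DE,de;q=0.9,en;q=0.5"))
```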
XML isn’t limited to the web, and is also moving into the world of the desktop application, where up to now proprietary file formats have held sway. Microsoft is using XML as a key feature of Office 2000, and XML features strongly in the next release of Corel’s WordPerfect suite. Documents can be saved as HTML, with XML being used to store the standard document properties information, and can then be shared using an intranet – without requiring every user to have a copy of Excel or Word to read, or edit, the documents. This will also allow tools like Netscape’s Compass search engine and Microsoft’s Index Server to effectively catalogue and categorise intranet information resources, as part of a knowledge management implementation.
Creating your XML document and schema is only the first part of implementing an XML solution. However, it’s a very important part indeed! There are a rapidly growing number of XML document development tools on the market. These range from simple modified text editors to complex graphical development environments.
Initially you’ll probably want to put together your first XML documents in a text editor, so you can get to grips with the structure of an XML file. There are plenty of books and web sites devoted to XML, so you shouldn’t have any problems getting started.
It’s a good idea to create a DTD for your XML, as this helps you define your document’s schema. The following XML document is designed to transfer data in the form of a payment request from one server to another, as part of a business-to-business e-commerce application. The XML document is easy enough to read, and parse. However, it’s only when associated with its DTD that the full meaning of the document structure is apparent, and the message can be validated and used by an application.
Listing 1: XML document
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE PAYMENT_REQUEST SYSTEM "http://www.myserver/dtds/invoice.dtd">
<PAYMENT_REQUEST>
  <!-- This is a sample XML file -->
  <INVOICE APPROVED="Simon Bisson" DATE="19/07/1999" LEVEL="URGENT" SIGNED="Simon Bisson">
    <CUSTOMER>XML Publishing Ltd.</CUSTOMER>
    <PURCHASE_ORDER>OX123</PURCHASE_ORDER>
    <AMOUNT>£36.30</AMOUNT>
  </INVOICE>
</PAYMENT_REQUEST>
Listing 2: XML DTD
<!-- invoice.dtd: the external DTD named in the DOCTYPE declaration of Listing 1 -->
<!-- A payment request holds one or more invoices -->
<!ELEMENT PAYMENT_REQUEST (INVOICE)+>
<!ELEMENT INVOICE (CUSTOMER, PURCHASE_ORDER, AMOUNT)>
<!ATTLIST INVOICE APPROVED CDATA #REQUIRED>
<!ATTLIST INVOICE SIGNED CDATA #REQUIRED>
<!ATTLIST INVOICE DATE CDATA #IMPLIED>
<!ATTLIST INVOICE LEVEL CDATA #IMPLIED>
<!-- The leaf elements contain plain character data -->
<!ELEMENT CUSTOMER (#PCDATA)>
<!ELEMENT PURCHASE_ORDER (#PCDATA)>
<!ELEMENT AMOUNT (#PCDATA)>
Whilst you can develop your XML documents in a text editor, an alternative is to use a dedicated XML editor. These can vary from simple text editor variants such as Microsoft’s XML Notepad to complex tools from SGML vendors, including SoftQuad’s XMetaL. An interesting tool is Vervet Logic’s XML Pro, which is a Java application and uses the IBM XML parser.
An alternative method of displaying the document is to use an XSL style sheet to format and display your XML data in a web browser. By defining an XSL style sheet, the payment request XML message can be displayed as a pro forma invoice.
XML is an important part of any modern object-oriented application development environment. Whether you’re using CORBA or COM, Java or Visual Basic, you’ll be able to use an XML parser in your applications.
Using an XML parser in an application can be very easy. The following code, from a VBScript ASP application, will instantiate Microsoft’s XML parser twice. The first instance holds an XML document “xmlfile”, whilst the second holds an XSL style sheet “xslfile”. By calling the transformNode method of the document, the XSL style sheet is used to filter and format the XML document, ready for display.
Listing 3: XML object usage
' Load the XML document synchronously, so that it is complete before use
Set doc = Server.CreateObject("Microsoft.XMLDOM")
doc.async = False
doc.load(Server.MapPath(xmlfile))
' Load the XSL style sheet in the same way
Set style = Server.CreateObject("Microsoft.XMLDOM")
style.async = False
style.load(Server.MapPath(xslfile))
' Apply the style sheet to the document and send the result to the browser
result = doc.transformNode(style.documentElement)
Response.Write(result)
Whilst this code section is designed to display its results as a dynamically generated HTML document, the basic techniques can be used to import data into databases or applications, or even to create a new XML document. Similar techniques can be used to create valid XML documents as an output from an application.
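As an illustration of importing XML data into a database, the sketch below parses the payment request from Listing 1 and stores each invoice as a database row. It is written in Python with the standard sqlite3 and xml.etree.ElementTree modules rather than VBScript and the Microsoft parser, so treat it as an analogous technique, not the article's own method. The invoices table layout and the import_invoices function are assumptions made for this example, and the DOCTYPE line is omitted because ElementTree does not validate against a DTD.

```python
import sqlite3
import xml.etree.ElementTree as ET

# The payment request from Listing 1, without its DOCTYPE declaration.
PAYMENT_REQUEST = """<PAYMENT_REQUEST>
  <INVOICE APPROVED="Simon Bisson" DATE="19/07/1999" LEVEL="URGENT" SIGNED="Simon Bisson">
    <CUSTOMER>XML Publishing Ltd.</CUSTOMER>
    <PURCHASE_ORDER>OX123</PURCHASE_ORDER>
    <AMOUNT>£36.30</AMOUNT>
  </INVOICE>
</PAYMENT_REQUEST>"""


def import_invoices(xml_text: str, db: sqlite3.Connection) -> int:
    """Parse a payment request and insert one row per INVOICE element."""
    # Hypothetical table layout; a real system would map to its own schema.
    db.execute(
        "CREATE TABLE IF NOT EXISTS invoices "
        "(customer TEXT, purchase_order TEXT, amount TEXT, approved TEXT)"
    )
    root = ET.fromstring(xml_text)
    rows = [
        (
            invoice.findtext("CUSTOMER"),
            invoice.findtext("PURCHASE_ORDER"),
            invoice.findtext("AMOUNT"),
            invoice.get("APPROVED"),
        )
        for invoice in root.findall("INVOICE")
    ]
    db.executemany("INSERT INTO invoices VALUES (?, ?, ?, ?)", rows)
    db.commit()
    return len(rows)


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    count = import_invoices(PAYMENT_REQUEST, connection)
    print(count, "invoice(s) imported")
```

The amount is stored as text, currency symbol included. Converting it to a numeric type is left to the receiving application.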
By using these simple XML components in your applications, an XML gateway can be created quickly and easily, allowing you to translate the output from your applications into cross-platform XML documents.
One of the most interesting roles for XML is in distributed application development, where it is likely to change the way systems are built. Instead of implementing an n-tier application framework, where objects communicate via dedicated protocols, XML allows applications to be decoupled, and implemented in messaging frameworks.
By using XML messaging, applications can be distributed over wider areas than previously practical, and can also operate over unreliable links. Using these technologies a salesman on the road can use a simple handheld PC as if it were part of the core business systems of his organisation. When he takes an order, he can fill it in a database application on his handheld device, and hit a single button: “Send Order”. As far as he is concerned, he has taken an order, and it is now on its way to the organisation’s main systems, even though he’s disconnected from them. The order has in fact been encoded as an XML message, and is now sitting in an assured message queuing system – such as Microsoft’s MSMQ or IBM’s MQSeries. When the salesman connects his handheld by modem to the office to check his email, the message queue delivers the order to the company’s business systems, allowing them to start work on processing the XML message, and fulfilling the order. An XML-driven messaging framework has allowed his handheld, whether it’s a fully-fledged laptop PC or a PC companion-style device, to act as part of his company’s distributed computing environment, even when completely disconnected from it.
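The store-and-forward behaviour in this scenario can be sketched with a toy in-memory queue. This Python illustration is not how MSMQ or MQSeries work internally: real assured queues persist messages to disk and offer transactional delivery, which this sketch does not attempt. The OutboundQueue class, its method names, and the ORDER message layout are all hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import deque


class OutboundQueue:
    """Toy stand-in for an assured message queue on the handheld device."""

    def __init__(self):
        self._pending = deque()

    def send_order(self, customer: str, item: str, quantity: int) -> None:
        """Encode the order as XML and queue it; this works while offline."""
        order = ET.Element("ORDER")
        ET.SubElement(order, "CUSTOMER").text = customer
        ET.SubElement(order, "ITEM").text = item
        ET.SubElement(order, "QUANTITY").text = str(quantity)
        self._pending.append(ET.tostring(order, encoding="unicode"))

    def pending(self) -> int:
        """Number of messages still waiting for a connection."""
        return len(self._pending)

    def flush(self, deliver) -> int:
        """On reconnection, pass each queued message to deliver() in order.

        A message is removed only after deliver() returns without raising,
        so a dropped link leaves it queued for the next attempt.
        """
        delivered = 0
        while self._pending:
            deliver(self._pending[0])
            self._pending.popleft()
            delivered += 1
        return delivered


if __name__ == "__main__":
    queue = OutboundQueue()
    queue.send_order("XML Publishing Ltd.", "OX123", 2)  # salesman is offline
    print(queue.pending(), "order(s) waiting")
    queue.flush(print)  # connection made: deliver to the back office
```

Removing a message only after a successful delivery is the design choice that matters here, as it is what lets the salesman treat “Send Order” as final even over an unreliable link.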
One of the most interesting messaging frameworks currently under development is Microsoft’s BizTalk business-to-business e-commerce environment. This is an application framework designed around agreed and defined XML document schema. The BizTalk web site, www.biztalk.org, is a clearing-house for groupings of organisations with common interests. A high-level server is currently under development that will allow organisations to define trading relationships, before using XML as a tool for integrating their business systems. The BizTalk server is also intended to act as a tool for configuring message transport systems, so that messages can be transmitted by the most appropriate means – using technologies like email and HTTP as well as assured message delivery queues. Unusually, Microsoft has publicly committed itself to a cross-platform BizTalk implementation, one that will run as happily on systems built using Linux, Apache and Perl as on Windows NT.
James Utzschneider, Microsoft’s Director of Business Frameworks, has said that BizTalk is currently being used to define schema for, amongst others, legal filings, flight data recordings, accounting and reporting, value chain systems, document sharing and the Human Genome Project.
