Copyright © 2005 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark, document use rules apply.
This document describes use cases for evaluating the potential benefits of an efficient serialization format for XML. The use cases are documented here to understand the constraints involved in environments for which XML employment may be problematic because of one or more characteristics of XML. Desirable properties of XML and alternative formats to address the use cases are derived and discussed in a separate publication of the XML Binary Characterization Working Group (XBC WG) [XBC Properties].
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This is a Working Group Note, produced by the XML Binary Characterization Working Group as part of the XML Activity.
This document is part of a set of documents produced according to the Working Group's charter, in which the Working Group has been determining Use Cases, characterizing the Properties that are required by those Use Cases, and establishing objective, shared Measurements to help judge whether XML 1.x and alternate binary encodings provide the required properties.
The XML Binary Characterization Working Group has ended its work. This document is not expected to become a Recommendation later. It will be maintained as a WG Note.
Discussion of this document takes place on the public public-xml-binary@w3.org mailing list (public archives).
Publication as a Working Group Note does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
1 Introduction
2 Use Case Structure
3 Documented Use Cases
3.1 Metadata in Broadcast Systems
3.1.1 Description
3.1.2 Domain & Stakeholders
3.1.3 Justification
3.1.4 Analysis
3.1.5 Properties
3.1.5.1 Must Have
3.1.5.2 Should Have
3.1.5.3 Nice To Have
3.1.6 Alternatives
3.1.7 References
3.2 Floating Point Arrays in the Energy Industry
3.2.1 Description
3.2.2 Domain
3.2.3 Justification
3.2.4 Analysis
3.2.5 Properties
3.2.5.1 Must Have
3.2.5.2 Should Have
3.2.5.3 Nice To Have
3.2.6 Alternatives
3.2.7 References
3.3 X3D Graphics Model Compression, Serialization and Transmission
3.3.1 Description
3.3.2 Domain & Stakeholders
3.3.3 Justification
3.3.4 Analysis
3.3.5 Properties
3.3.5.1 Must Have
3.3.5.2 Should Have
3.3.5.3 Nice To Have
3.3.6 Alternatives
3.3.7 References
3.4 Web Services for Small Devices
3.4.1 Description
3.4.2 Domain
3.4.3 Justification
3.4.4 Analysis
3.4.5 Properties
3.4.5.1 Must Have
3.4.5.2 Should Have
3.4.5.3 Nice To Have
3.4.6 Alternatives
3.4.7 References
3.5 Web Services within the Enterprise
3.5.1 Description
3.5.2 Domain
3.5.3 Justification
3.5.4 Analysis
3.5.5 Properties
3.5.5.1 Must Have
3.5.5.2 Should Have
3.5.5.3 Nice To Have
3.5.6 Alternatives
3.5.7 References
3.6 Electronic Documents
3.6.1 Description
3.6.2 Domain & Stakeholders
3.6.3 Justification
3.6.4 Analysis
3.6.5 Properties
3.6.5.1 Must Have
3.6.5.2 Should Have
3.6.5.3 Nice To Have
3.6.6 Alternatives
3.6.7 References
3.7 FIXML in the Securities Industry
3.7.1 Description
3.7.2 Domain
3.7.3 Justification
3.7.4 Analysis
3.7.5 Properties
3.7.5.1 Must Have
3.7.5.2 Should Have
3.7.5.3 Nice To Have
3.7.6 Alternatives
3.7.7 References
3.8 Multimedia XML Documents for Mobile Handsets
3.8.1 Description
3.8.2 Domain
3.8.3 Justification
3.8.4 Analysis
3.8.5 Properties
3.8.5.1 Must Have
3.8.5.2 Should Have
3.8.5.3 Nice To Have
3.8.6 Alternatives
3.8.7 References
3.9 Intra/Inter Business Communication
3.9.1 Description
3.9.2 Domain & Stakeholders
3.9.3 Justification
3.9.4 Analysis
3.9.5 Properties
3.9.5.1 Must Have
3.9.5.2 Should Have
3.9.5.3 Nice To Have
3.9.6 Alternatives
3.9.7 References
3.10 XMPP Instant Messaging Compression
3.10.1 Description
3.10.2 Domain & Stakeholders
3.10.3 Analysis
3.10.4 Justification
3.10.5 Properties
3.10.5.1 Must Have
3.10.5.2 Should Have
3.10.5.3 Nice To Have
3.10.6 Alternatives
3.10.7 References
3.11 XML Documents in Persistent Store
3.11.1 Description
3.11.2 Domain & Stakeholders
3.11.3 Justification
3.11.4 Analysis
3.11.5 Properties
3.11.5.1 Must Have
3.11.5.2 Should Have
3.11.5.3 Nice To Have
3.11.6 Alternatives
3.11.7 References
3.12 Business and Knowledge Processing
3.12.1 Description
3.12.2 Domain & Stakeholders
3.12.3 Justification
3.12.4 Analysis
3.12.5 Properties
3.12.5.1 Must Have
3.12.5.2 Should Have
3.12.5.3 Nice To Have
3.12.6 Alternatives
3.12.7 References
3.13 XML Content-based Routing and Publish Subscribe
3.13.1 Description
3.13.2 Domain & Stakeholders
3.13.3 Justification
3.13.4 Analysis
3.13.5 Properties
3.13.5.1 Must Have
3.13.5.2 Should Have
3.13.5.3 Nice To Have
3.13.6 Alternatives
3.13.7 References
3.14 Web Services Routing
3.14.1 Description
3.14.2 Domain & Stakeholders
3.14.3 Justification
3.14.4 Analysis
3.14.5 Properties
3.14.5.1 Must Have
3.14.5.2 Should Have
3.14.5.3 Nice To Have
3.14.6 Alternatives
3.14.7 References
3.15 Military Information Interoperability
3.15.1 Description
3.15.2 Domain & Stakeholders
3.15.3 Justification
3.15.4 Analysis
3.15.5 Properties
3.15.5.1 Must Have
3.15.5.2 Should Have
3.15.5.3 Nice To Have
3.15.6 Alternatives
3.15.7 References
3.16 Sensor Processing and Communication
3.16.1 Description
3.16.2 Domain & Stakeholders
3.16.3 Justification
3.16.4 Analysis
3.16.5 Properties
3.16.5.1 Must Have
3.16.5.2 Should Have
3.16.5.3 Nice To Have
3.16.6 Alternatives
3.16.7 References
3.17 SyncML for Data Synchronization
3.17.1 Description
3.17.2 Domain & Stakeholders
3.17.3 Justification
3.17.4 Analysis
3.17.5 Properties
3.17.5.1 Must Have
3.17.5.2 Should Have
3.17.5.3 Nice To Have
3.17.6 Alternatives
3.17.7 References
3.18 Supercomputing and Grid Processing
3.18.1 Description
3.18.2 Domain & Stakeholders
3.18.3 Justification
3.18.4 Analysis
3.18.5 Properties
3.18.5.1 Must Have
3.18.5.2 Should Have
3.18.5.3 Nice To Have
3.18.6 Alternatives
3.18.7 References
4 References
While XML has been enormously successful as a markup language for documents and data, the overhead associated with generating, parsing, transmitting, storing, or accessing XML-based data has hindered its employment in some environments. The question has been raised as to whether some optimized serialization of XML is appropriate to satisfy the constraints present in such environments. In order to address this question, a compatible means of classifying the requirements posed by specific use cases and the applicable characteristics of XML must be devised. This allows a characterization of the potential gap between what XML currently supports and use case requirements. In addition, it also provides a way to compare use case requirements to determine the degree to which an alternate serialization could be beneficial.
Use cases describing situations where some characteristics of XML may prevent its effective use are presented in this document. The XBC WG has made efforts through internal discussion and dialog with the XML community to define a comprehensive set of use cases to examine and compare potential solutions, including alternate serializations of XML as well as other available means. Comments are invited, especially if important use cases or solutions to address them have been omitted.
In this section we elaborate on the template used to present the use cases. All the use cases collected by this WG are listed in 3 Documented Use Cases.
Description: Provides an overview of the use case.
Domain & Stakeholders: Identifies the functional and/or business area(s) and users to which the use case pertains.
Justification: This is a discussion of why (or why not) a standard solution for efficient XML encoding should be pursued.
Analysis: Examines why XML is appropriate for addressing the use case, and the limitations of XML which hinder its use. Requirements for the use case are discussed in this section.
Properties: References/discusses the properties required to support the use case. For each use case the properties are partitioned into three categories:
Must Have: This is the set of properties which must be supported for a format to be adopted in the use case domain. This is intended to be a high bar in that an unsupported "must have" property would not simply make a format undesirable but actually unusable.
Should Have: This is the set of properties which are important, but not critical, to the use case. A format which did not support "should have" properties would be significantly less desirable than one that did. However, formats not supporting "should have" properties would still be usable for that use case.
Nice To Have: This is the set of properties which are not important, but supporting them brings some benefit to the use case. However, the benefit is generally minor and would be traded off to support "should have" or "must have" properties for that use case.
Alternatives: Presents alternatives to XML for addressing the use case.
References: Lists references to industries, standards, etc. that are related to the use case.
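The three property categories imply a simple evaluation rule: a candidate format is unusable for a use case as soon as one "must have" property is missing, while missing "should have" or "nice to have" properties only reduce its desirability. The following is a minimal sketch of that rule. The class name, the function name, and the assignment of properties to categories are hypothetical and chosen purely for illustration; they are not taken from any use case in this document.

```python
from dataclasses import dataclass, field


@dataclass
class UseCaseProperties:
    """Property names for one use case, split into the three categories."""
    must_have: set = field(default_factory=set)
    should_have: set = field(default_factory=set)
    nice_to_have: set = field(default_factory=set)


def assess(format_properties, use_case):
    """Compare the properties a format supports with a use case's needs.

    A format is usable only if no "must have" property is missing; missing
    "should have" or "nice to have" properties only make it less desirable.
    """
    supported = set(format_properties)
    return {
        "usable": use_case.must_have <= supported,
        "missing_must": use_case.must_have - supported,
        "missing_should": use_case.should_have - supported,
        "missing_nice": use_case.nice_to_have - supported,
    }


# Hypothetical categorization, for illustration only.
example = UseCaseProperties(
    must_have={"Streamable", "Platform Neutrality"},
    should_have={"Roundtrip Support"},
    nice_to_have={"Embedding Support"},
)

complete = assess({"Streamable", "Platform Neutrality", "Signable"}, example)
partial = assess({"Roundtrip Support", "Streamable"}, example)
```

Here `complete` is usable but less desirable (it lacks a "should have" property), whereas `partial` is unusable because one "must have" property is missing.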
The use cases identified by or submitted to the working group are documented below, in accordance with the meta data defined in 2 Use Case Structure.
The constant progress of digital TV, the multiplication of channels, the competition and convergence with the Web, and the widespread deployment of a variety of set-top boxes (notably Personal Video Recorders, or PVRs for short) call for services on TV sets that extend beyond simply broadcasting audio/video content.
For instance, above a certain number of channels, broadcasters find themselves having to provide EPG (Electronic Program Guide) services to their users, who would otherwise be overwhelmed by the sheer amount of available content. These EPGs also allow PVRs to automatically pick up recording schedules for given programs, based on user-defined criteria that match against metadata broadcast alongside the data.
Similarly, efforts are under way to define a world-wide standard for timed text systems under the leadership of the W3C Timed Text Working Group. Timed text is an essential service for a large variety of video use cases, such as broadcasting subtitles for foreign movies, providing accessible video information, or implementing karaoke systems.
However, there are constraints that cause problems when trying to deploy such services, all of which rely on XML, to television sets and other video devices:
Bandwidth: TV bandwidth is extremely expensive, and how much data you use for services directly constrains the number of channels that you are able to broadcast. In addition to the potential technical issues, there are strong economic motivations to reduce bandwidth usage as much as possible. One week of electronic program guide data easily amounts to roughly 30Mb of XML data.
Processing Power: Most set-top boxes are inexpensive, and the low-end ones suffer from low processing power. Unlike mobile devices, a box the size of the average set-top box (STB) faces few physical limits on the processing power it can house; notably, heat and battery life are of little or no concern. However, the large-scale deployment of STBs and similar devices into households requires them to be extremely inexpensive and therefore as limited as possible (an average mid-range STB may have up to 16Mb of memory and processors up to 80MHz, but frequently less, and systems anticipated for deployment in 2008 may have up to 120MHz processors and up to 64Mb of memory). Convergence with mobile devices also remains a prime motivator for the television industry, and constraints applicable to mobile devices apply equally to broadcast XML metadata, notably because of efforts such as DVB-H (Digital Video Broadcast for Handsets).
Unidirectional Network: This being broadcast, there is not typically a way for TVs to request data. Instead, it is being continuously streamed and re-streamed to them, a process which is called carouselling (the data itself being 'on a carousel'). Some set-top boxes do in fact have a return channel but the vast majority don't and it is not expected that most would support one in the near future.
Change Resilience: Upgrading several million STBs is often very impractical. Therefore, it must be possible to evolve the broadcast format without breaking older hardware. While XML is perfectly suited to this, the above issues make it unusable. It is thus required that the binary XML format replacing it be resilient to changes in the schema.
These constraints translate into a set of requirements: a decoder should be able to begin decoding at any point in the stream without waiting for an entire document to have been received and be able to reconstruct the document progressively; if a decoder fails to receive a fragment, then, within a limited time duration, it should be able to receive a repeated copy of that same fragment, knowing that some fragments may be more important than others and therefore repeated more frequently; and the transmission size of the data as well as the amount of processing power required to process it should be minimized.
As a result, MPEG-7 BiM (a binary encoding of XML originally created to carry video metadata), has been integrated into a number of broadcasting standards, notably ARIB, DVB, and TV Anytime.
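The carousel requirements described above (join at any point, rebuild the document progressively, recover a lost fragment from a later repetition, repeat important fragments more often) can be illustrated with a toy model. This is only a sketch under stated assumptions: it does not implement MPEG-7 BiM or any broadcast standard, and the function names, the index-based fragment addressing, and the repetition counts are hypothetical.

```python
from itertools import cycle, islice


def carousel(fragments, repeats):
    """Endlessly yield (index, payload); fragment i appears repeats[i] times per cycle."""
    schedule = [i for i, n in enumerate(repeats) for _ in range(n)]
    return ((i, fragments[i]) for i in cycle(schedule))


def receive(stream, total, join_offset=0, lost=()):
    """Join the stream at join_offset and rebuild the document progressively.

    Slots listed in `lost` (counted from the join point) simulate fragments
    that failed to arrive. Returns (document, slots_consumed) once every
    fragment has been received at least once.
    """
    received = {}
    for slot, (index, payload) in enumerate(islice(stream, join_offset, None)):
        if slot in lost:
            continue  # this transmission was missed; wait for a repetition
        received.setdefault(index, payload)
        if len(received) == total:
            return "".join(received[i] for i in range(total)), slot + 1


fragments = ["<epg>", "<programme/>", "</epg>"]
repeats = [2, 1, 1]  # hypothetical: the first fragment is sent twice per cycle

# A decoder joining mid-cycle still obtains the whole document.
doc, slots = receive(carousel(fragments, repeats), len(fragments), join_offset=2)

# Losing the first slot after joining only delays completion.
doc_lossy, slots_lossy = receive(
    carousel(fragments, repeats), len(fragments), join_offset=2, lost={0}
)
```

In this model the lossy receiver needs five slots instead of three, which shows why repetition frequency and fragment size both affect how quickly a device can start using the metadata.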
This use case is relevant to the entirety of the television distribution industry, comprising content providers, broadcast infrastructure deployers, television and set-top box manufacturers, and of course the broadcasting companies themselves.
It also covers similar requirements found in digital radio broadcasting, where one equally needs to broadcast EPG metadata to very limited devices and to integrate with mobile devices (for instance by sending SVG ads as part of the radio stream).
Finally, convergence with TV is considered to be a major next step in mobile services, and participants on both sides of the fence are presently very active in making television available anywhere, at any time, and on any device. Quite naturally, this leads to the need for common technology between the TV industry and the rest of the Web. As such, the major stakeholders in the domain have expressed interest in reusing a solution commonly agreed upon across several industries.
Television is a very large market that has a strong need for program metadata, and is increasingly converging with the Web at large (with a strong emphasis on mobile devices at first), notably using technologies such as XHTML, SVG, XForms, SMIL, and Timed Text.
Deployed systems already use binary XML, currently standardized as part of ISO MPEG and industry fora such as ARIB, DVB, or TV-Anytime, but have expressed interest in using more broadly adopted technologies.
XML is appropriate for these situations because:
existing specifications based on XML are being reused wholesale;
most major TV standards in the area are already XML-based, and the industry has no wish to go through another standards cycle;
XML is well-suited to describing structured information such as metadata;
XML has proven to be a good format for specifying user interfaces, notably with XHTML, SVG, or XForms. These are needed for TV applications, and they need to be broadcast;
the industry wishes to publish its data, especially the Electronic Program Guides, to as many media as possible. XML enables it to publish directly to desktops, mobile devices, and TVs using off-the-shelf or Open Source software.
DVB EIT schedules provide a small subset of the required functionality and do not integrate well with a more generic information ecosystem.
Without a W3C format of binary XML, application domains will likely adopt different formats, a phenomenon that has already started in the broadcast domain. This would further cluster the market. A convergence (which can be recognized right now) between the broadcast and the mobile communication services would be hindered by this clustering. In this concrete case, mobile devices would be required to implement codecs of both domains to enable value-added services like interactive and location-aware TV broadcast. This drives up the initial investment, which translates into a great obstacle for a converged service in the marketplace.
The upstream segment of the energy industry is concerned with exploration for and production of oil and gas. XML-based techniques have made very little penetration into the upstream technology part of the energy industry. The most basic reason for this is the nature of the data, which does not at this time lend itself to being represented usefully in XML.
There are basically two core types of data in this industry: well logs and seismic data. Well logs are moderately large datasets while seismic datasets are very large, typically in the order of gigabytes. Although the Petrotechnical Open Standards Consortium [POSC] has produced an XML schema for well logs [WellLogML], it has not been widely adopted by the industry. At the time of writing, we are not aware of anyone even considering a schema definition for seismic data.
One example of the magnitude of data and processing needs is data from marine seismic surveys. A typical data collection arrangement includes a ship, traveling at a fairly slow speed, trailing a cable behind it that is about two miles long and which contains about a hundred listening devices (geophones) spaced evenly along the cable. An air gun array on the boat fires every thirty seconds or so and the echoes from the subsurface received by each geophone are recorded for about six seconds at a two millisecond sampling rate. That's 36 million words of floating point data per hour. The ship travels back and forth covering in a systematic way an area several miles on a side, resulting in a highly redundant 3D sampling of echoes from the subsurface. The redundancy, loosely speaking, results from the fact that the cable is moving so that, for example, a given subsurface point might be sampled by geophone 10 on shot 100, geophone 14 on shot 101, geophone 18 on shot 102 and so on. The high degree of redundancy later results in a large number of processing operations involving reordering and sub-setting the data. The communication of this data and the necessity of these operations influence the formats and structures that are appropriate.
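The figure of 36 million words per hour follows directly from the survey parameters quoted above. The short calculation below reproduces it, treating the approximate values in the text as exact. The final line assumes 4-byte (32-bit) words, which is an assumption made here for illustration and is not stated in the text.

```python
geophones = 100            # listening devices on the cable
record_seconds = 6         # recording time per shot
sample_interval_ms = 2     # sampling interval
shot_interval_s = 30       # time between air gun shots

samples_per_trace = record_seconds * 1000 // sample_interval_ms   # per geophone, per shot
shots_per_hour = 3600 // shot_interval_s
words_per_hour = geophones * samples_per_trace * shots_per_hour

# Assumption (not from the text): one word is a 4-byte float.
bytes_per_hour = words_per_hour * 4

print(samples_per_trace, shots_per_hour, words_per_hour, bytes_per_hour)
```

Each shot therefore yields 3,000 samples per geophone, and 120 shots per hour across 100 geophones give 36,000,000 floating point words.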
Both seismic and well log data include control data, easily represented in XML, as well as large arrays of floating point numbers, not easily represented efficiently in XML. Although in practice an XML representation is not used, such data may be represented as shown in the following fragment (with a whole document consisting of a large number of these fragments):
<seisdata>
  <lineheader>...</lineheader>
  <header>
    <linename>westcam 2811</linename>
    <ntrace>1207</ntrace>
    <nsamp>3001</nsamp>
    <tstart>0.0</tstart>
    <tinc>4.0</tinc>
    <shot>120</shot>
    <geophone>7</geophone>
  </header>
  <trace>0.0 0.0 468.34 3.245672E04 6.9762345E05 ... (3001 floats)</trace>
  <header> ... </header>
  <trace> ... </trace>
  ...
</seisdata>
The scope within the Energy Industry as discussed above is very broad, encompassing a very large number of technical issues and usage scenarios involving, for example, integration of drilling information, processing of seismic and well data, integration of seismic and well data into interpretation systems, and so on.
There are a number of dominant technology vendors in this sector as well as a number of small companies that "work around the edges". The dominant technology vendors in this field provide proprietary solutions that do not interoperate easily with each other. Providing communications between these products within a company, or between companies, is a constant problem: this is the main motivator to develop Web service interfaces for these products. A second motivator for a standard is that it will open the door for smaller companies to provide useful add-on products. Large budgets in this sector are allocated to the purchase of software packages and display devices, but these budgets are small compared to the leverage of mass-market devices, so a longer term objective is to encourage a situation where more technologies with mass-market cost leverage can be used.
Given that this scenario involves interoperability between companies using disparate systems, XML is a natural choice due to its ubiquity and tool availability.
The main shortcoming of XML for this application is the processing expense incurred while converting floating point data to and from a character representation, as well as the extra size of some representations. Thus, the main requirement for this use case is the ability to represent sequences of scalars like floating point numbers in a native binary format in order to facilitate efficient use in application processing. In the example shown above, the header information would still have a textual value representation (useful for any infoset-based processing), but the data composed of floating point numbers would appear as a binary stream that is as directly usable as possible.
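To make the contrast concrete, the sketch below encodes the first few trace values from the fragment above in both ways: as a character string, as they would appear in the content of a trace element, and as a native binary block. It assumes 32-bit IEEE 754 floats in little-endian order; the text does not specify a width or byte order, so both are illustrative choices.

```python
import math
import struct

samples = [0.0, 0.0, 468.34, 3.245672e04, 6.9762345e05]

# Character representation: every value is formatted on write and parsed on read.
text = " ".join(repr(x) for x in samples)
from_text = [float(token) for token in text.split()]

# Native binary representation (assumed here: little-endian 32-bit floats).
# The packed bytes can be used by the application without any text parsing.
binary = struct.pack(f"<{len(samples)}f", *samples)
from_binary = list(struct.unpack(f"<{len(samples)}f", binary))

print(len(text), "characters as text")
print(len(binary), "bytes as binary")
```

The binary block has a fixed size of four bytes per sample, whereas the length of the text form depends on the digits of each value. Note that 32-bit floats carry only about seven significant decimal digits, so values read back from the binary block match the originals to within that precision.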
A candidate format must support floating point numbers in multiple native formats representative of common architectures with appropriate type indication. This allows reader-makes-right conversion only when needed and direct memory load otherwise. In practice, most operations involve moving data between machines with the same floating point formats so the solution should not impose undue overhead on the most common situation in order to handle the less common ones. It is generally considered too expensive to incur processing overhead of conversion between floating point and character representation for this data. The ecosystem of data communication and processing involving multiple independently developed applications indicates the need for exchange formats that are very processing efficient and space efficient in ways that are not processing intensive. This implies that a directly random access, random update, and efficient update capable format would be very useful and that transmitting low-level deltas might make sense to support update or repetitive communication.
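The reader-makes-right approach can be sketched as follows: the writer emits floats in its own native byte order and records that order in a type indication, and the reader converts only when the indicated order differs from its own. The one-byte tag, the block layout, and the function names below are hypothetical, and the sketch assumes that the platform's C float is 4 bytes wide.

```python
import struct
import sys
from array import array

TAGS = {"little": b"L", "big": b"B"}  # hypothetical one-byte type indication


def write_block(values, byteorder=sys.byteorder):
    """Emit tag + 4-byte count + 32-bit floats in the writer's byte order."""
    prefix = "<" if byteorder == "little" else ">"
    header = TAGS[byteorder] + struct.pack(">I", len(values))
    return header + struct.pack(f"{prefix}{len(values)}f", *values)


def read_block(block):
    """Return (values, converted); bytes are swapped only if the orders differ."""
    order = "little" if block[:1] == b"L" else "big"
    (count,) = struct.unpack(">I", block[1:5])
    values = array("f")                      # assumes a 4-byte C float
    values.frombytes(block[5:5 + 4 * count])  # direct load, no per-value parsing
    converted = order != sys.byteorder
    if converted:
        values.byteswap()                    # the less common case pays the cost
    return values, converted


trace = [0.0, 1.5, -2.25, 468.5]  # values exactly representable as 32-bit floats
same_order, converted_same = read_block(write_block(trace))
from_little, _ = read_block(write_block(trace, "little"))
from_big, _ = read_block(write_block(trace, "big"))
```

When writer and reader share a byte order, which the text identifies as the most common situation, the payload is loaded into memory as is and no conversion takes place.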
Developers working in this industry find it most desirable to use a coherent library interface to get efficient access to the usually-native scalar data in the most direct way after receiving a block of data in the format. Similar efficiency is needed in the creation or update of an instance of the format. This need is not fulfilled by something that operates like a traditional parser. A parser-driven architecture usually involves considering each byte of the entire object first, generating many parse events, and then building a memory representation involving memory allocation and data copies, often through interfaces that must be invoked repeatedly. The goal of infrastructure for applications in this industry is to have the overall minimum net overhead in processing. To support adoption as a common interchange format, the industry needs a standard format that can support this lightweight low-overhead processing model.
Notes: Specialized Codecs applies if it is the property that allows direct representation of binary scalars. Platform Neutrality is needed in the sense that, if the format has a Single Conformance Class, every implementation must support native scalars from all needed platforms.
Data Compression: One expert in this area has said, "For us, binary compression is probably not that important because transmission speeds are constantly improving. The additional time needed to compress and decompress seismic data would probably slow things down. We also place a greater value in the message structures than the transmission mechanics". Or, in more picturesque words, again from an expert in the field when asked about compressing seismic data, "Been there, done that, doesn't work, not interested". Bear in mind that this epigram encapsulates decades of experience and highly sophisticated R&D.
CORBA: There is, in fact, a CORBA-based integration platform currently deployed (although perhaps not widely) in this space. Without diving into technical details, it is clear that some companies would prefer an approach based on Web services.
XML Protocol Attachments: It is possible to represent seismic data control information in XML and to put the floating point arrays in a binary attachment using XOP. This data architecture is certainly viable, assuming that the issues involving floating point numbers are addressed, as evidenced by the fact that many of the proprietary vendor data formats work this way. It is, however, less flexible than the header-trace architecture described above, which is probably one reason why the latter is used in industry-wide seismic data standards (e.g. SEGY). Nonetheless, Web services that return data using XOP are an attractive alternative for dealing with seismic data.
Extensible 3D (X3D) Graphics [Extensible 3D (X3D) Graphics] is an XML-enabled ISO Standard 3D file format to enable real-time communication of 3D data across all applications and networks. It is used for commercial applications, engineering and scientific visualization, medical imaging, training, modeling and simulation (M&S), multimedia, entertainment, education, and more. [X3D Markets] Computer-Aided Design (CAD) and architecture scenes are also supported, though because they have larger file sizes and are more complex they are not typically streamed for web-delivered viewing.
Web-delivered file sizes in this use case typically range from 1-1000 KB of data while CAD files may run to several hundred megabytes apiece. An optimized serialization of the X3D data may be performed in concert with application-specific compression (e.g. combining coplanar polygons, quantizing colors, etc.). Lossy geometric compression is sometimes acceptable. Due to interaction requirements, the latency time associated with deserialization, decompression and parsing must be minimal. Digital signature and encryption compatibilities are also important for protecting digital content assets.
Support of Web-based interchange, rendering and interactivity for 3D graphics scenes. Stakeholders include tool builders, content authors, application developers and end users of 3D graphics models.
The X3D Compressed Binary Encoding Request For Proposals (RFP) from the Web3D Consortium [X3D RFP] lists and justifies ten separate technical requirements. Many of these have parallels to a general optimized serialization format. The Web3D Consortium and X3D designers see great value in aligning with W3C standardization efforts in this area. This serves as significant evidence of the need for an industry standard.
Taken together, the following technical requirements for the X3D Compressed Binary Encoding RFP (indicated by emphasized type) include many requirements for an optimized serialization format. Strictly speaking, such an efficient serialization is not necessarily "compressed", though it is very likely to be more compact. Other factors, such as speed of parsing or databinding performance, may override the desire for compact representation for some applications, such as CAD. The Web3D Consortium's RFP has chosen to include the ability to perform application-specific size optimizations, both lossless and lossy, as part of the process of converting the document from XML to the optimized serialization. However, the optimized format can still be represented as XML; in other words, a lossy geometric compression to an efficient format can be translated back to a lossy XML representation.
X3D Compatibility: The compressed binary encoding shall be able to encode all of the abstract functionality described in X3D Abstract Specification. Since X3D is expressed in XML, any optimized serialization that encodes all XML features will also be capable of encoding an X3D document.
Interoperability: The compressed binary encoding shall contain identical information to the other X3D encodings (XML and Classic VRML). It shall support an identical round-trip conversion between the X3D encodings. This corresponds to the Roundtrip Support property.
Multiple, separable data types: The compressed binary encoding shall support multiple, separable media data types, including all node (element) and field (attribute) types in X3D. The RFP allows the possibility of performing domain-specific compressions or encodings of data in the XML document, for example of polygons and textures. The ability to make use of Specialized Codecs is essential to meeting this requirement.
Processing Performance: The compressed binary encoding shall be easy and efficient to process in a runtime environment. Because the data for interactive applications is delivered across the web, low latency is important, as is the ability to process documents quickly. This matters even more for CAD files, which may be several hundred megabytes in size. 3D files often contain long arrays of numeric data. With the XML format, a reader must extract the string information and convert it to a binary representation, which is computationally expensive. Experimental data has shown that an optimized format can be 20 times as fast as XML. The ability to rapidly process arrays of floating point data is also important. Thus the Accelerated Sequential Access property is also relevant to this use case.
Ease of Implementation: Binary compression algorithms shall be easy to implement. This corresponds to Implementation Cost.
Streaming: Compressed binary encoding will operate in a variety of network-streaming environments. X3D documents are often streamed in web environments, and portions of the 3D scene are rendered as they arrive on the client computer. This corresponds to the Streamable property.
Compression: Compressed binary encoding algorithms will together enable effective compression of diverse datatypes. 3D data often consists of large arrays of floating point data that can be compressed in various ways. The ability to employ Specialized Codecs, either lossless or lossy, would meet this requirement.
Security: Compressed binary encoding will optionally enable security, content protection, privacy preferences and metadata such as encryption, conditional access, and watermarking. This corresponds to the Signable property.
Bundling: Mechanisms for bundling multiple files (e.g. X3D scene, inlined sub-scenes, image files, audio file, etc.) into a single archive file will be considered. This corresponds to Embedding Support.
Intellectual Property Rights (IPR): Technology submissions must meet the Web3D Consortium IPR policy. The W3C Patent Policy [W3C PP] is compatible with this.
GZIP is the compression scheme specified by the Virtual Reality Modeling Language (VRML 97) specification, the second-generation ISO predecessor to X3D. GZIP is not type-aware and does not compress large sets of floating-point numbers well. GZIP allows staged decompression of 64KB blocks, which might be used to support streaming capabilities. GZIP outputs are strings and require a second pass for any parsing, thus degrading parsing and loading performance. A GZIP-compressed file would not gain the parsing speed and databinding advantages of an optimized format.
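The two-pass cost mentioned above can be seen in a small sketch. A GZIP-compressed text document must first be decompressed back into characters and only then parsed into numbers, so compression reduces size without removing the text-to-number conversion. The sample values are synthetic and chosen only for illustration.

```python
import gzip

# Synthetic coordinate-like data (illustrative values only).
samples = [i * 0.25 for i in range(1000)]
text = " ".join(repr(x) for x in samples).encode("ascii")

compressed = gzip.compress(text)

# Pass 1: decompression yields characters, not numbers.
restored_text = gzip.decompress(compressed)

# Pass 2: the characters still have to be tokenized and converted.
restored = [float(token) for token in restored_text.decode("ascii").split()]

print(len(text), "bytes of text,", len(compressed), "bytes after GZIP")
```

The second pass is the same parsing work that an uncompressed text document requires, which is why the text notes that a GZIP-compressed file does not gain the parsing speed of an optimized format.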
Numerous piecemeal, incompatible proprietary solutions exist in the 3D graphics industry for Web-page plug-ins. None address the breadth of technical capabilities that might be enabled by a general purpose optimized serialization format.
An X3D-specific compression and serialization algorithm for XML is certainly feasible and demonstrated. Compatibility with a general recommendation for an optimized format is desirable in order to maximize interoperability with other XML technologies, and reduce implementation cost. Many of these issues are common to other use-case domains; broad mutual benefits become possible via a common recommendation.
As Web services become more and more ubiquitous, there is a greater demand to use this technology as a way to deliver content to small devices such as PDAs, pagers and mobile phones. These devices often share the following characteristics:
They have limited memory and limited processing power.
Battery life is at a premium.
They are connected to low-bandwidth, high-latency networks which in some cases are regulated by "pay-per-byte" policies.
XML-based messaging is at the heart of the current Web services technology. XML's self-describing nature has significant advantages, but they come at the price of bandwidth and performance. XML-based messages are larger and require more processing than other protocols, and are therefore not well suited for a domain having the characteristics outlined above. Increased bandwidth usage affects wireless networks due to bandwidth restrictions allotted for communication by each device. In addition, the larger the message the higher the probability of a retransmission as a result of an on-the-air collision.
The target platforms for this use case include a broad range of PDAs, handhelds and mobile handsets, including mass market devices that limit code size to 64K and heap size to 230K. The transport packet size may vary from network to network, but it is typically measured in bytes (e.g. 128 bytes).
Small devices connected to low-bandwidth, high-latency networks. Two examples in this domain are cellular phone networks and PDA networks employed by the military.
XML is the fundamental technology underlying a Web services infrastructure, and it is also one of the main reasons why Web services are not being deployed in the mobile space. A number of alternative serializations have already been developed to deliver XML content to small devices; however, many of these are not interoperable. This lack of interoperability results in fragmentation and the need for specialized gateways to transcode proprietary formats.
In order to satisfy the requirements of this use case, an alternative serialization must be faster to process and must produce smaller packets. Faster processing will result in lower battery consumption while smaller packets will result in reduced latency as well as, assuming a pay-per-byte model, a more cost-effective service. In addition to being small and fast, an alternative serialization should also be streamable, i.e. it should be possible for the client application to operate on any prefix of the serialized data.
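As an illustration of what "operating on a prefix" means, the following sketch feeds a document to an incremental parser in 128-byte packets and records the packet at which each element becomes available. It uses textual XML and the Python standard library pull parser purely as a stand-in; the `items_by_packet` function and the `feed`/`item` vocabulary are hypothetical.

```python
import xml.etree.ElementTree as ET


def items_by_packet(document: bytes, packet_size: int = 128):
    """Feed the document packet by packet; note when each <item> completes."""
    parser = ET.XMLPullParser(events=("end",))
    seen = []  # list of (packet index, item id)

    def drain(packet_index):
        # Collect every element that has been completed so far.
        for _event, element in parser.read_events():
            if element.tag == "item":
                seen.append((packet_index, element.get("id")))

    packet_index = 0
    for start in range(0, len(document), packet_size):
        packet_index = start // packet_size
        parser.feed(document[start:start + packet_size])
        drain(packet_index)
    parser.close()
    drain(packet_index)  # anything the parser held back until the end
    return seen


# Hypothetical message: a feed of 20 short items.
doc = (
    "<feed>"
    + "".join(f'<item id="{i}">headline {i}</item>' for i in range(20))
    + "</feed>"
).encode("ascii")

seen = items_by_packet(doc)
print(seen[0], seen[-1])
```

The first item can be handed to the application after the first packet, long before the closing tag of the document has been received. A streamable alternative serialization would need to allow the same pattern.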
Assuming that the same amount of information is encoded in an alternative serialization, a way to quantify efficiency is to consider the instruction to data ratio, that is, the amount of effort needed to produce or consume a unit of data. Even though this is an implementation requirement, an alternative serialization must enable the creation of "thin" stacks with a low instruction to data ratio.
The reduction in latency that results from improving parsing speed may or may not be noticeable to the consumer, depending on the transport latency of the network, since transport latency is the dominant factor in many existing networks. Nevertheless, a more efficient parsing method will improve battery life on the device as well as throughput on the server.
Proprietary solutions result in the so-called gatewayed networks, where communication is always routed through a single point that translates to and from XML. This architecture not only creates a single point of failure within a network but also fragments the entire network by creating non-interoperable, domain-specific solutions.
Message size reductions are attainable via the use of standard data compression techniques. Even though in general decompression is less expensive than compression, it is still too costly for most small devices. Additionally, the extra burden of compressing packets has a negative impact on the overall system throughput.
In addition to the added cost, redundancy-based compression algorithms tend to perform very poorly on small messages, in many cases resulting in larger messages. Mobile clients often carry on dialogs with servers which consist of a large number of small messages. Examples of this include: data synchronization, stateful web services, multi-player games, querying and browsing data. In all of these use cases, the cumulative stream of messages that make up the dialog can grow very large even though all of the individual messages are rather small. Thus, there is still a need to reduce the amount of data exchanged, but doing this by compressing each message individually is not a viable solution.
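The effect described above can be reproduced with a short sketch. The message vocabulary is hypothetical, and the shared compressor at the end is shown only as a point of comparison for the cumulative stream; it is not a technique proposed by this document, and it assumes both ends keep compression state for the whole dialog.

```python
import gzip
import zlib

# Hypothetical dialog: 200 small game-move messages.
messages = [
    f'<move player="p{i % 4}" x="{i}" y="{i * 2}"/>'.encode("ascii")
    for i in range(200)
]

raw_total = sum(len(m) for m in messages)

# Each message compressed on its own: fixed overhead, little redundancy.
per_message_total = sum(len(gzip.compress(m)) for m in messages)

# One compressor shared across the dialog, flushed after every message
# so that each message could still be sent as soon as it is produced.
shared = zlib.compressobj()
shared_total = 0
for m in messages:
    shared_total += len(shared.compress(m))
    shared_total += len(shared.flush(zlib.Z_SYNC_FLUSH))
shared_total += len(shared.flush())

print(raw_total, per_message_total, shared_total)
```

Compressing each message individually makes the dialog larger, because every message pays the container overhead and contains almost no internal repetition. The redundancy lies between messages, which is why the cumulative stream is still worth reducing.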
A large number of existing enterprise systems are built using distributed technologies such as RMI, DCOM and CORBA. As the industry moves from distributed object systems to Service Oriented Architectures (SOAs), the use of Web services technologies becomes more significant even within the confines of a single enterprise. Many of the concepts behind SOAs are applicable to divisions within a corporation, so it is only natural to extend the applicability of Web services to intranet systems.
A stumbling block that several re-architected systems are facing is that XML-based messages are larger and require more processing than those from existing protocols: data is represented inefficiently and binding requires more computation. It has been shown that an RMI service can perform up to an order of magnitude faster than an equivalent Web Service due to the processing required to parse and bind XML data into programmatic objects.
The domain is that of distributed systems, typically based on binary protocols, which for technical reasons (e.g., interoperability) or for economic reasons (e.g., reduction of software licenses) need to be re-architected as Web services. An important constraint for this type of re-deployment is to maintain (or improve) the system's performance, a task that has been found challenging given the additional processing requirements of XML-based protocols.
There are some important economic reasons that support the use of Web services as an alternative to existing technologies for building distributed systems. First, preliminary results show more powerful hardware is needed to re-deploy existing systems using Web services technologies given the additional processing requirements of an XML messaging system. Second, assuming the company in question already develops (or is planning to develop) Web services to communicate outside their firewall, there is the extra incentive in using the same set of tools and the same development team to build intranet applications. This reduces both software fees (e.g., by reusing application servers and development tools) as well as training costs associated with having separate development teams for each technology. Third, some companies that have successfully deployed CORBA-based systems, but are not planning on deploying Web services, may find an additional incentive to do so if a more efficient serialization is standardized.
Intranet Web services differ from Internet Web services especially in the areas of deployment and security: deployments are easier to manage and security is typically defined by a single domain. The requirements for intranet systems are somewhat different from those for Internet systems, permitting the use of certain optimizations in the former which would be difficult or simply impossible to implement in the latter. Consequently, in many cases the degree of coupling of the systems can be adjusted if this helps in achieving the desired performance goals.
The main requirement for this use case is reducing XML processing time in order to achieve a level of performance comparable to the existing systems. Due to the availability of high-speed networks in these scenarios, reducing message sizes is of a lesser priority. It is worth pointing out that not all systems re-deployed using Web services will be unable to achieve their performance requirements. Therefore, this use case applies only to a subset of the aforementioned re-deployments.
In some cases, it may be possible to re-design the system's interfaces to make them more coarse grained in order to reduce the number of messages exchanged. Although this is technically feasible in most cases, the costs associated with this effort can be prohibitive.
Documents are the most basic form of recorded human communication, dating back thousands of years. Electronic documents are the transition of this invention to the online, computerized world. Books, forms, contracts, e-mails, spreadsheets, and Web pages are only some of the forms in which electronic documents are used. Unlike paper-based documents, electronic documents are not limited to static text and images. Electronic documents regularly contain static content, dynamic content (e.g., animations, video), and interactive content (e.g., form fields). This wide range of content has a great effect on selecting an appropriate representation format and must be considered in evaluating this use case.
Documents are first created in some authoring environment. During the creation process the author may elect to include text, fonts, images, videos, or other resources which are to be rendered more than once when the document is displayed. For example, a company logo may appear in the header of each page of a document, but this should not require adding the logo to the document more than once.
In a special case of document creation, new documents are created by assembling a set of existing documents into a single aggregate document. For example, this may be done to combine a basic product manual with additional documentation for optional product accessories into a customized manual for an individual purchaser. When documents are bound together in this way it may be important that the data in the original documents is not modified, so as to preserve signatures or other properties of the file, or it may be desirable to identify and eliminate duplicated resources, such as fonts.
After a document has been created it is usually read, in whole or in part. Documents are not necessarily read front to back; a particular reader may select a different order or read only part of a document. A reader may, for example, obtain the document by traversing a hyperlink which points to a specific location within the document. It is important that rendering a document for reading be fast, even when starting at an arbitrary location in this way, and even when documents are large (millions of pages). This implies that it must be possible to navigate to specific sections within a document quickly, as well as follow links to shared resources within the document, as mentioned above under document creation. Finally, if a document is being retrieved over a slow link, it may be useful to fetch portions of the document in the order in which they are being rendered and read (e.g., starting at page 700), as opposed to document order (i.e., starting at page 1).
Documents often contain information of a sensitive or proprietary nature and so can be secured using encryption technologies. Encrypting the document can serve to keep the contents confidential, to allow only certain operations ("rights") on the document in conjunction with the rendering application, or both. Typically a description of any rights granted is embedded within the document itself when it is encrypted. It is often desirable that only portions of a document be encrypted so that intermediaries can access some portion of the data in the file.
Documents, and especially those used in business transactions, are often signed to indicate authenticity of, consent to, or agreement with the document. In electronic documents, this is implemented by digitally signing the document. The digital signature must itself be stored in the document. Multiple signatures may be applied to a document, each one signing those which came before it. Additional information is sometimes added to a document after it has been signed, in the same way one can initial a correction to a paper document. This must not invalidate an existing signature, and it must be clear that any subsequent changes were not present when the pre-existing signatures were applied. In some cases signatures should apply to only part of a document, leaving other parts for later modification. Finally, it must be possible for a recipient to validate all of these signatures.
Documents are often long-lived and, during the course of their lives, used in different environments with varying constraints. For example, when a document is being published for general consumption, it might be most desirable to select an encoding such as XML which is widely understood. If, however, the same document is being transmitted between partners with known expectations a more compact format such as [XOP] might be preferred. Thus, a single document may sometimes be transformed between different encodings at different times and for different purposes. Such transformations should preserve the information in the document, but these operations cannot be expected to be compatible with encryption mechanisms used to secure documents.
Even when various encodings are available documents tend to push the available storage and bandwidth of the devices on which they are created, stored, transmitted, and read. In other words, as device capabilities increase, users respond by creating larger documents. Note that these documents rarely contain only text; they generally contain larger elements such as fonts and images and, increasingly, video and 3D models which these same enhanced devices make possible.
Electronic documents, like their paper counterparts, can be modified or re-purposed. In electronic documents, this typically occurs when pages, images, videos, and so forth are either copied out of a document to be used elsewhere or removed from a document to produce an altered version of that document. Again, these operations should be efficient: removing any one page from a one million page document should not take significantly longer than doing the same to a ten page document. Documents may also be modified by their recipients to include comments of various types (editors' marks, sticky notes, etc.), usually intended to communicate responses back to the author. These comments may be stored within the document itself; both adding them to and extracting them from the document should be efficient.
Finally, some documents are designed to be interactive beyond the limited interactions of rendering, signing, and annotating. These documents may contain form fields, GUI widgets such as buttons and listboxes, or other active elements, data islands bound to these widgets, and code, scripts, or declarative logic to validate input to these elements, enable or disable the elements, transmit the document, modify the document, interact with the rendering application, and so forth. It must be possible to describe and access all of these elements within the document itself.
Electronic documents are used extensively throughout government, business, and personal domains as well as in the interchange between these entities.
XML is in its roots a syntax for marking documents, and so the electronic document use cases seem highly relevant. Interestingly, XML has a number of shortcomings (discussed below) with respect to many of the requirements derived from this use case. Arguably, these occur because XML (and SGML) were focused largely on textual documents, but such documents represent a decreasing fraction of all electronic documents. Thus, Binary XML as a natural extension of XML to handle new document types, and documents containing new content, seems particularly relevant.
Documents are almost always exchanged between two or more people, and often between larger entities such as corporations or governments. It is, therefore, extremely desirable that an electronic document format should be easily consumable by all parties involved. XML, as a widely accepted, implemented, and used format, fits this need quite well.
Unfortunately there are a number of requirements imposed by electronic documents which XML fails to address:
Documents frequently contain embedded resources such as fonts, images, and video which are themselves encoded in binary formats. It must be possible to efficiently embed these resources in documents.
It must be possible to navigate to and render a specified location in better than linear time with respect to the size of the document.
The document encoding must be efficient with respect to space, that is, it must have low redundancy.
In order to make updates efficient, it must be possible to update a document in time proportional to the size of the update rather than the size of the document.
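The navigation requirement in this list can be pictured with a small sketch of an offset index. The layout (a page body plus a table of byte offsets) and the function names are hypothetical and are not taken from any existing document format.

```python
import io


def build_indexed_document(pages):
    """Concatenate pages and record where each one starts and how long it is."""
    body = io.BytesIO()
    index = []  # one (offset, length) entry per page
    for page in pages:
        data = page.encode("utf-8")
        index.append((body.tell(), len(data)))
        body.write(data)
    return body.getvalue(), index


def read_page(blob, index, page_number):
    """Jump straight to one page without scanning the pages before it."""
    offset, length = index[page_number]
    stream = io.BytesIO(blob)
    stream.seek(offset)
    return stream.read(length).decode("utf-8")


# Hypothetical 1000-page document; pages are numbered from zero here.
pages = [f"<page n='{n}'>text of page {n}</page>" for n in range(1000)]
blob, index = build_indexed_document(pages)
print(read_page(blob, index, 700))
```

With such an index, the cost of opening page 700 depends on the size of that page rather than on the 700 pages that precede it. Over a slow link, the same offsets could be used to fetch only the byte range being read.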
There are a number of requirements which XML does address, but which are enumerated here as well because they would also be requirements on any Binary XML encoding:
The format should be widely accepted, available, and implemented.
Re-usable resources may appear, or be referenced from, multiple locations within the document. In order to maintain reasonable document sizes, it must be possible for these resources to be used by reference, rather than by duplication.
It must be possible to efficiently assemble even large documents.
It must be possible to assemble signed documents in such a way that their signatures are preserved.
It must be possible for a document to contain multiple signatures, full or selective, from one or more signers.
It must be possible to read a secured (encrypted) document without suffering an unreasonable delay when first viewing the document, without unreasonably exposing the decrypted contents of the document, and while obeying rights associated with the document.
It must be possible to efficiently extract data from the document (i.e., a document fragment) and without modification to the extracted data.
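Use by reference during assembly might look like the sketch below, which stores each distinct resource once and lets pages refer to it by digest. The in-memory document model (a list of pages, each with text and a list of resource bytes) is a hypothetical simplification.

```python
import hashlib


def assemble(documents):
    """Combine documents, keeping a single copy of each distinct resource."""
    resources = {}  # digest -> resource bytes, stored once
    pages = []      # (page text, [digest, ...]) in assembly order
    for document in documents:
        for text, page_resources in document:
            references = []
            for data in page_resources:
                digest = hashlib.sha256(data).hexdigest()
                resources.setdefault(digest, data)
                references.append(digest)
            pages.append((text, references))
    return pages, resources


# Hypothetical inputs: a manual and an add-on that share the company logo.
logo = b"\x89PNG...logo bytes..."
font = b"OTTO...font bytes..."
manual = [("Introduction", [logo]), ("Setup", [logo])]
addon = [("Optional accessory", [logo, font])]

pages, resources = assemble([manual, addon])
print(len(pages), len(resources))
```

The logo appears on three pages but is stored once. Note that rewriting references in this way modifies the assembled data, so it would not suit the case where signatures over the original documents must be preserved.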
Finally, the introduction of multiple formats (i.e., XML and Binary XML), implies the following desirable requirement:
The conversion of a document between different encodings must preserve all information in the document, including digital signatures.
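One way to see why this requirement is demanding: a digest computed over serialized bytes changes whenever the serialization changes, even if the information does not. The sketch below uses a SHA-256 digest as a stand-in for a signature and XML canonicalization from the Python standard library (Python 3.8 or later) as a stand-in for an encoding-independent form; both are assumptions made for illustration.

```python
import hashlib
import xml.etree.ElementTree as ET


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# Two serializations carrying the same information (hypothetical vocabulary).
first = '<order id="1" side="buy"/>'
second = "<order side='buy' id='1'></order>"

# Digests over the raw bytes differ, so a signature over the bytes would break.
raw_digests_match = digest(first.encode()) == digest(second.encode())

# Digests over a canonical form agree.
canonical_first = ET.canonicalize(xml_data=first)
canonical_second = ET.canonicalize(xml_data=second)
canonical_digests_match = (
    digest(canonical_first.encode()) == digest(canonical_second.encode())
)

print(raw_digests_match, canonical_digests_match)
```

A signature can survive conversion between encodings only if it is computed over something that every encoding can reproduce exactly.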
The current de facto standard for interchange of electronic documents is Adobe's Portable Document Format, or PDF. PDF meets all of the requirements stated here except that it is not based on a widely accepted, implemented, and available format, and so while widely deployed for document viewing is not sufficiently easy to use for general-purpose interchange.
Earlier formats for electronic documents, such as TIFF, DVI, RTF, AFP, and Postscript do not, among other shortcomings, support the full range of required document features, such as dynamic and interactive content.
HTML/XHTML, SVG, XSL-FO, and other XML-based formats can, in combination, provide coverage for most of the required document features. However, they fail to meet certain file format requirements as described under Analysis, above.
There are other proprietary formats, such as Microsoft Word and SWF, which meet many of the functional requirements described here but, due to their proprietary nature, also lack sufficiently widespread acceptance, implementation, and availability.
The Securities industry has cooperated to define a standard protocol and a common messaging language called FIX which allows real-time, vendor/platform neutral electronic exchange of securities transactions between financial institutions.
The original definition of FIX was as a tag-value pair format. Due to increased competition by the year 1999, and to better accommodate business models of emerging initiatives, an XML-based message format for application-layer messages called FIXML was devised. Even though FIXML was designed to have minimum impact on existing systems, in order to protect investments in traditional FIX systems and processes, it soon became evident that the new messages were as much as 6 times larger than their tag-value predecessors, a condition that prevented key participants in the industry from integrating FIXML into their systems. This problem, together with some positive findings made through experiments, spurred the discussion of size reduction for FIXML messages, which culminated in a new format called Transport Optimized FIXML (TO-FIXML) in FIXML version 4.4. TO-FIXML is essentially a collection of XML Schema definitions that uses name abbreviations, as well as attributes instead of elements wherever possible, to reduce the size of FIXML messages by up to a factor of 4.
The securities industry engaged in capital markets such as derivatives, equity and fixed-income markets, where the FIX protocol is applicable and which is moving towards Service Oriented Architectures based on FIXML. Major roles played in the industry include brokers, exchanges and clearing houses.
Even though TO-FIXML has been designed to minimize message sizes, some industry participants still consider it to be a sub-optimal solution and envisage the possibility of further optimization by studying binary-compatible XML formats.
XML was the natural choice for the securities industry in light of its expandability and flexibility, which was required for the continuous and rapid evolution of the FIX protocol. There was also a demand for cross-industry interoperability given the broad adoption of XML by other financial industries.
XML Schema is the point of agreement for multiple parties to share a common transport format. However, the bloated size of the XML instances resulted in artificial changes to the schemas, with the sole purpose of reducing the number of bytes on the wire. The methods used for this purpose include the use of name abbreviations and the use of attributes in favor of elements wherever possible. Clearly, XML Schema is not the right place to tackle this problem given that the syntax verbosity is a property exclusive to the XML serialization. Stated differently, XML Schema is the point of agreement in terms of vocabulary and structure, not in terms of syntax.
Shown below are two sample FIXML order messages. The two messages carry the same information, yet their appearance is quite different. The first one is in FIXML version 4.3 format, while the second is in TO-FIXML format (FIXML version 4.4). As shown below, the one in TO-FIXML is much more compact than its equivalent FIXML 4.3 message. Some items such as 'Sender' and 'TransactTime' that were elements in FIXML 4.3 became attributes in TO-FIXML with abbreviated names 'SID' and 'TxnTm', respectively.
<FIXML DTDVersion="2.0.0" FIXVersion="4.3">
  <FIXMLMessage>
    <Header>
      <Sender><CompID>CAT</CompID></Sender>
      <Target><CompID>DOG</CompID></Target>
      <SendingTime>2004-10-13T12:00:00</SendingTime>
    </Header>
    <Order>
      <ClOrdID>123456</ClOrdID>
      <Instrument>
        <Symbol>XYZ</Symbol>
      </Instrument>
      <Side Value="1"/>
      <TransactTime>2004-10-13T12:00:00</TransactTime>
      <OrderQtyData>
        <OrderQty>100</OrderQty>
      </OrderQtyData>
      <OrdType Value="2"/>
      <Price>85.00</Price>
    </Order>
  </FIXMLMessage>
</FIXML>

<FIXML v="4.4" r="20030618" s="20031218">
  <Order ID="123456" Side="1" TxnTm="2004-10-13T12:00:00" Typ="2" Px="85.00">
    <Hdr SID="CAT" TID="DOG" Snt="2004-10-13T12:00:00"/>
    <Instrmt Sym="XYZ"/>
    <OrdQty Qty="100"/>
  </Order>
</FIXML>
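The two sample messages can be compared mechanically. The sketch below stores both without insignificant whitespace, checks that they carry the same field values, and measures their sizes. The correspondence between old and new names, beyond the 'Sender' and 'TransactTime' examples given in the text, is read off the samples and should be treated as an assumption.

```python
import xml.etree.ElementTree as ET

FIXML_43 = (
    '<FIXML DTDVersion="2.0.0" FIXVersion="4.3"><FIXMLMessage>'
    '<Header><Sender><CompID>CAT</CompID></Sender>'
    '<Target><CompID>DOG</CompID></Target>'
    '<SendingTime>2004-10-13T12:00:00</SendingTime></Header>'
    '<Order><ClOrdID>123456</ClOrdID>'
    '<Instrument><Symbol>XYZ</Symbol></Instrument>'
    '<Side Value="1"/>'
    '<TransactTime>2004-10-13T12:00:00</TransactTime>'
    '<OrderQtyData><OrderQty>100</OrderQty></OrderQtyData>'
    '<OrdType Value="2"/><Price>85.00</Price></Order>'
    '</FIXMLMessage></FIXML>'
)
TO_FIXML = (
    '<FIXML v="4.4" r="20030618" s="20031218">'
    '<Order ID="123456" Side="1" TxnTm="2004-10-13T12:00:00" Typ="2" Px="85.00">'
    '<Hdr SID="CAT" TID="DOG" Snt="2004-10-13T12:00:00"/>'
    '<Instrmt Sym="XYZ"/><OrdQty Qty="100"/>'
    '</Order></FIXML>'
)

# Field values of the FIXML 4.3 message (element-based).
old_root = ET.fromstring(FIXML_43)
old_header = old_root.find("FIXMLMessage/Header")
old_order = old_root.find("FIXMLMessage/Order")
old_fields = {
    "order_id": old_order.findtext("ClOrdID"),
    "sender": old_header.findtext("Sender/CompID"),
    "target": old_header.findtext("Target/CompID"),
    "sent": old_header.findtext("SendingTime"),
    "symbol": old_order.findtext("Instrument/Symbol"),
    "side": old_order.find("Side").get("Value"),
    "time": old_order.findtext("TransactTime"),
    "quantity": old_order.findtext("OrderQtyData/OrderQty"),
    "type": old_order.find("OrdType").get("Value"),
    "price": old_order.findtext("Price"),
}

# Field values of the TO-FIXML message (attribute-based, abbreviated names).
new_order = ET.fromstring(TO_FIXML).find("Order")
new_header = new_order.find("Hdr")
new_fields = {
    "order_id": new_order.get("ID"),
    "sender": new_header.get("SID"),
    "target": new_header.get("TID"),
    "sent": new_header.get("Snt"),
    "symbol": new_order.find("Instrmt").get("Sym"),
    "side": new_order.get("Side"),
    "time": new_order.get("TxnTm"),
    "quantity": new_order.find("OrdQty").get("Qty"),
    "type": new_order.get("Typ"),
    "price": new_order.get("Px"),
}

print(old_fields == new_fields, len(FIXML_43), len(TO_FIXML))
```

The saving comes entirely from shorter names and from replacing element pairs with attributes; the data values themselves occupy the same number of characters in both messages.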
Message size alone can be substantially reduced by standard compression methods. However, there is a study that shows compression of FIXML instances increases round trip time over 10 Mbps networks. Compression may be useful for considerably slower networks, which is not the typical case in FIXML. The same study also suggests that marshalling/unmarshalling costs do not seem to make tangible performance differences in those data sets typically seen in FIX scenarios.
The Service Enabler standard for mobile handsets benefits from extensive use of XML-based technologies for interoperability. For example, SMIL, SVG and XHTML are used as document formats for mobile content services such as:
Multimedia Messaging Services (MMS): MMS in 3G consists of multiple XML documents, such as SMIL, SVG and XHTML. The handset is required to parse and render multi-namespaced XML documents.
Map Services: Map data delivered to a handset is split into multiple chunks based on region and level of detail; handsets retrieve additional chunks in response to user zooms and scrolls. Additional data, such as restaurant information supplied by other content providers, can also be overlaid on top.
XML documents in these services can be quite large. For instance, the map data represented in SVG could be 100 KB or more. Rich content MMS could also be very large. Even on today's high-end handsets with 120 MHz 32-bit RISC processors, parsing a raw 100 KB XML document takes approximately 10 seconds.
XML is required for maximum interoperability. In fact, XML technology is already widely adopted in the mobile services space. As this area requires a solution for narrow band and limited footprint devices, the importance of this use case should be considered high.
This use case requires the following capabilities of XML to be preserved:
Interoperability
Multiple namespace support
In-memory, random access using a DOM
Interoperability is mandatory as the same documents must be shared among different handsets. Moreover, for map services, the layering of multiple source map data requires interoperability among the providers. Support for multiple namespaces is a must in order to deliver multi-format messages (e.g. HTML + SVG) to the devices. DOM access is required to support ECMA scripting as well as for efficient rendering of formats such as SVG.
The requirements not satisfied by current XML solutions that must be addressed are:
Efficient transmission of XML documents by reducing their sizes
Efficient access to a DOM, i.e. efficient DOM parsing
The WAP Forum defined a WAP Binary XML format as an alternative serialization for XML. However, this format has a number of shortcomings, the biggest of which is the lack of support for multi-namespace documents due to the use of a "single dimension" system of 6-bit tags.
A large business communicates via XML with a number of remote businesses, some of which can be small business partners. These remote or small businesses often have access only to slow transmission lines and have limited hardware and technical expertise. The large business cannot expect the smaller partners to upgrade often or to use expensive technology. The primary illustrations of this use case come from the energy, banking, and retail industries.
In the energy industry, the major upstream (exploration and production) operations of oil companies are largely in developing countries and it is a common problem to have very slow and perhaps unreliable communications between the main office and remote sites. It's not that the oil companies don't know how to set up a satellite feed, it's that they are often required by the local governments to use the communication facilities provided by that government, and these communications can be technically low-end and expensive. So the common problem is one where there is plenty of processing power and bandwidth at both central and remote sites, but the communication between the two is slow.
Although many scenarios illustrating this problem have to do with upstream operations, this specific example will be from downstream (refining and marketing). It involves transmission of Point of Sale (POS) information back and forth between back office systems and remote sites. The data flowing to the remote sites includes the "incremental price book" for dry goods and wet stock, currency exchange rates, promotion codes/rates/groups and so on. The data coming back includes raw sales transactions data, tank data, etc. One might have 1000 transactions per day per site with an average file size of 3 KB, for a total of 3 MB, typically broken up into 12 documents (transmittal every 2 hours, referred to as "trickle feed"). Each document would then average 250 KB.
Currently the scope is for many thousands of sites connected to several regional back-office hubs. Connectivity ranges from VSAT to 32 kbps analog connections. The 32 kbps connections would only communicate once a day. This downstream situation includes a factor which is not common in upstream operations. Not only are there communication limitations, but in this case some of the remote sites also have limited processing capabilities because they are small businesses with limited resources.
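The figures quoted for this scenario can be worked through as follows. The sketch takes 1 KB as 1000 bytes and ignores protocol overhead, retransmissions, and the price book data flowing in the other direction; those simplifications are assumptions, not statements from the scenario.

```python
# Figures from the scenario; 1 KB is taken as 1000 bytes (an assumption).
TRANSACTIONS_PER_DAY = 1000
BYTES_PER_TRANSACTION = 3 * 1000
DOCUMENTS_PER_DAY = 12            # one transmittal every 2 hours
LINK_BITS_PER_SECOND = 32 * 1000  # the slowest connection mentioned

# Daily volume returned by one site.
daily_bytes = TRANSACTIONS_PER_DAY * BYTES_PER_TRANSACTION

# Average size of one trickle-feed document.
bytes_per_document = daily_bytes // DOCUMENTS_PER_DAY

# Time to send the whole day's data uncompressed over the slowest link.
daily_transfer_seconds = daily_bytes * 8 / LINK_BITS_PER_SECOND

print(daily_bytes, bytes_per_document, daily_transfer_seconds)
```

Under these assumptions a site on the slowest link needs about twelve and a half minutes of line time for its once-a-day upload, before any compression is applied.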
In the banking industry, there are typically one or more main data centers and several connected branch offices, ATMs, and business partners. The main data center may have the latest in technology; however, the connected branch offices, ATMs and partners are often without access to high speed connections and powerful hardware. Communicating between the various entities can be accomplished with XML Web Services; however, the size and speed issues of XML are troublesome for those without access to high speed lines and/or powerful machines.
These same issues affect the retail industry as stores often are connected to the main data center over less than optimal links. In addition, the retail store needs to perform real-time purchase/return transactions that require round-trip communications with the main center.
Retailing operations of large companies, particularly those where the actual retail outlets are SMEs (small to medium-sized enterprises), and large companies with various small business partners and/or branch offices. The belief is that the experience gained in this situation is likely to be directly applicable to a number of other scenarios in the industry.
Note that the players in this use case have rather different situations and needs. The large company has significant sunk investment in complex back-office systems, lots of hardware and a team of IT professionals. The objective here is to integrate the solution into a complex, high-tech environment. The SME, partner, or branch office, may have very limited hardware and technical resources and is probably highly motivated toward a simple, low-cost solution, preferably one that plugs-and-plays off the shelf without extensive configuration or integration. This creates a tension between flexible and capable on one hand and simple and cheap on the other.
The need for compression is clear, but it is not so clear that a binary XML standard is required. One solution is to use a standard compression technique, like GZIP, on the XML and transmit it that way. This may not be an effective solution for all cases. One problem is that the SME, branch office, business partner, and/or ATM machine location might not have enough computing power to make the GZIP solution attractive. In addition, the data center, which may have a lot of computing power, may not want to absorb the increased CPU load that a GZIP solution requires. A binary solution that compressed the message and did not increase the CPU load could be beneficial in these cases.
XML’s verbosity and size detract from its use in these scenarios. An alternate encoding would be required to be more compact than the original XML encoding with no loss of data. This would enable the branch office and other entities that operate over less than optimal transmission lines to take better advantage of XML and Web Services.
This is a case where there is a need to compress entire data files, which are composed of many tags with relatively short data fields; there are no huge arrays of floating point numbers causing special problems. Also note that the energy industry expects no particular problem with processing on either end, just the transmission, so the overhead of using compression algorithms is not a problem.
The energy industry currently plans to use native VSAT compression and probably one of many standard compression algorithms like ZIP for other transmission mechanisms. Initial tests with freeware ZIP compression software yielded a compression of 39:1, which is plenty. One problem that did occur, however, on small machines such as might be used by a small business, is that the compression algorithm may need to read the entire document into memory and work on it globally. On small machines this can cause paging and the resulting performance difficulties can be painful. Some sort of compression algorithm that works in a streaming mode or on "chunks" of data would obviously be preferable in these cases.
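Compression in a streaming mode, as preferred above, can be sketched with an incremental compressor that only ever holds one chunk of input in memory. The helper names and the 64 KB chunk size are arbitrary choices made for illustration.

```python
import io
import zlib


def compress_stream(source, sink, chunk_size=64 * 1024):
    """Compress from one file-like object to another, one chunk at a time."""
    compressor = zlib.compressobj()
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        sink.write(compressor.compress(chunk))
    sink.write(compressor.flush())


def decompress_stream(source, sink, chunk_size=64 * 1024):
    """Reverse of compress_stream, also working chunk by chunk."""
    decompressor = zlib.decompressobj()
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        sink.write(decompressor.decompress(chunk))
    sink.write(decompressor.flush())


# Hypothetical sales document; in practice the streams would be files.
record = b"<sale><item>4711</item><qty>2</qty><price>19.99</price></sale>"
original = b"<sales>" + record * 4000 + b"</sales>"

compressed = io.BytesIO()
compress_stream(io.BytesIO(original), compressed)

restored = io.BytesIO()
decompress_stream(io.BytesIO(compressed.getvalue()), restored)

print(len(original), len(compressed.getvalue()))
```

Because the input is consumed in fixed-size chunks, the compressor's memory use does not grow with document size, which avoids the paging problem on small machines.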
There are cases, however, where the SME is unwilling to use the CPU required by the compression. This has been encountered in cases where the small business has a computer, perhaps a mainframe, that is overburdened by other routine tasks. In this case binary serialization may be an attractive alternative assuming it can be done without extra CPU cycles.
The idea is that if one is going to have to parse the XML anyway (which may or may not be the case, depending on the business process), the CPU required to do that parsing is a "sunk cost". Once parsed, a binary serialization of the XML will probably be smaller than the usual text serialization because the tags are not repeated in text and some of the data fields (e.g. some numbers) may be smaller in their binary representation. For typical business documents one might expect a reduction on the order of a factor of two from binary serialization. This moderate reduction in file size may hit the "sweet spot" in cases where CPU is a big problem and file size a moderate concern. It seems likely, however, that this scenario will be less common than the case where reasonable computational capability is available in the small business and the slow transmission lines are the big problem. In these cases, as documented above, compression via standard techniques of the conventional text serialization of XML is probably the preferred solution.
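As a rough illustration of why tags need not be repeated in a binary serialization, the following toy sketch stores each tag name once and lets elements refer to it by index. The byte layout is invented for this example, handles only element names and text (no attributes, namespaces, or mixed content), and is not a proposed format.

```python
import struct
import xml.etree.ElementTree as ET

def encode(root):
    """Toy encoding: each tag name is stored once; elements refer to it by index."""
    tags = []
    body = bytearray()

    def visit(elem):
        if elem.tag not in tags:
            tags.append(elem.tag)
        text_bytes = (elem.text or "").encode("utf-8")
        children = list(elem)
        # tag index (1 byte), child count (2 bytes), text length (2 bytes), then text
        body.extend(struct.pack(">BHH", tags.index(elem.tag), len(children), len(text_bytes)))
        body.extend(text_bytes)
        for child in children:
            visit(child)

    visit(root)
    table = "\0".join(tags).encode("utf-8")
    return struct.pack(">H", len(table)) + table + bytes(body)

def decode(data):
    """Rebuild the element tree from the toy encoding."""
    (table_len,) = struct.unpack_from(">H", data, 0)
    tags = data[2:2 + table_len].decode("utf-8").split("\0")
    pos = 2 + table_len

    def read():
        nonlocal pos
        tag_idx, n_children, text_len = struct.unpack_from(">BHH", data, pos)
        pos += 5
        elem = ET.Element(tags[tag_idx])
        elem.text = data[pos:pos + text_len].decode("utf-8") or None
        pos += text_len
        for _ in range(n_children):
            elem.append(read())
        return elem

    return read()

# Hypothetical business document with repeated short fields.
xml_text = b"<order>" + b"<item><sku>A1</sku><quantity>2</quantity></item>" * 50 + b"</order>"
root = ET.fromstring(xml_text)
binary = encode(root)
```

The saving here comes entirely from replacing repeated start and end tags with one-byte indices; how large the reduction is for real documents depends on their structure.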
Note that in both cases the needs of both small and large businesses can potentially be met. The large business gets the XML document it needs in order to integrate with its complex systems. The small business either uses inexpensive compression software or the parser outputs the binary serialization directly, so the complexity of the solution from their viewpoint is minimized.
The Extensible Messaging and Presence Protocol (XMPP) [RFC 3920] [RFC 3921] formalizes the core Jabber instant messaging and presence protocols [Jabber] for the IETF. In addition, the Jabber Software Foundation develops extensions to XMPP, known as Jabber Enhancement Proposals [JEPs]. These protocols use XML as an underlying mechanism for instant messaging, group chat, presence, and other near-real-time functionality such as publish-subscribe and alternative bindings for SOAP and XML-RPC. XMPP is a streaming XML protocol and does not deal directly with XML documents. However, an XMPP session is effectively an XML document that is built up over time as the participants exchange XML "stanzas" (XML document fragments) during an open-ended conversation. The architecture is client-server; clients authenticate to a server, and servers authenticate with each other. Messages sent to a user's ID on a server are forwarded to the client. There is a large and active installed base of XMPP/Jabber clients and servers.
XML is used in XMPP because it offers flexibility, ease of debugging, and extensibility. New XML semantics can be added to the protocol, and it is simple to examine the contents of messages and determine their meaning. In a very general sense, it can be said that XMPP acts as a sort of router for XML messages.
Instant messaging and chat applications, presence servers, publish-subscribe systems (e.g., content syndication), SOAP and XML-RPC exchange.
XML may require more computational power to parse than equivalent binary protocols used in other instant messaging and chat protocols. When an XML stanza is received at an XMPP server it is typically routed to an intended recipient other than the server itself (e.g., another server or a client connected to the server). Routing requires parsing enough of the XML stanza to make a routing decision. Typically only part of the XML stanza at a well-known location needs to be examined by the server, and the message payload remains opaque to the server. Therefore, being able to quickly retrieve only the information needed is a significant benefit.
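The idea of parsing only enough of a stanza to route it can be sketched with Python's incremental pull parser. The sketch assumes the routing address is carried in a "to" attribute on the stanza's start tag; the function name and the sample stanza are illustrative.

```python
import xml.etree.ElementTree as ET

def routing_target(chunks):
    """Return the 'to' address, stopping as soon as the parser reports a start tag."""
    parser = ET.XMLPullParser(events=("start",))
    for chunk in chunks:
        parser.feed(chunk)
        for _event, elem in parser.read_events():
            return elem.get("to")  # the first start tag is the stanza element
    return None

# Hypothetical stanza arriving in three network reads.
stanza = [
    b"<message to='juliet@example.com' from='romeo@example.net'>",
    b"<body>Wherefore art thou?</body>",
    b"</message>",
]
target = routing_target(stanza)
```

The payload is never inspected by this function, which mirrors the observation that it can remain opaque to the server.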
The traffic profile is that of a great number of small XML messages that must be parsed quickly. Because a single server may support up to hundreds of thousands of concurrent users in high-traffic environments, parsing speed may become a limiting factor in the number of users a server can support.
In order to satisfy this use case, parsing the data required to make a routing decision must be faster using a binary format than using a text XML format.
When XMPP data is sent over bandwidth-limited links, it may be helpful for the XML stanzas being exchanged to be more compact than is possible using the text XML format. An XMPP server being accessed across a bandwidth-limited link using uncompressed XML will run out of bandwidth faster than a server using a compact binary format.
An XMPP server often sends out slightly modified messages to many people, for example when a client's presence status changes (notifications must be sent to all entities on that person's contact list). In this situation nearly identical messages that differ only in a few respects, such as address, should be sent. The ability to efficiently update an XML message with new data would be useful in this situation.
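One way to picture this kind of update is to serialize the shared part of the message once and splice in only the field that differs. The sketch below does this for a text serialization, assuming a simplified presence stanza in which only the "to" attribute changes per recipient; the function name and addresses are hypothetical.

```python
from xml.sax.saxutils import escape, quoteattr

def presence_fanout(sender, status, contacts):
    """Serialize the shared part of a presence stanza once, then splice in each address."""
    # Hypothetical stanza layout; only the 'to' attribute differs per recipient.
    head = f"<presence from={quoteattr(sender)} to=".encode("utf-8")
    tail = f"><status>{escape(status)}</status></presence>".encode("utf-8")
    for contact in contacts:
        yield head + quoteattr(contact).encode("utf-8") + tail

stanzas = list(presence_fanout(
    "romeo@example.net/orchard",
    "away",
    ["juliet@example.com", "mercutio@example.org"],
))
```

A format with explicit support for efficient update could apply the same idea without relying on byte-level knowledge of where the changing field sits.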
Any server or client that uses an XML binary standard must maintain the original XML format in order to preserve compatibility with the XMPP RFC standards. Thus, any binary format must be transcodable back to an equivalent textual XML format.
When the server receives an XML stanza it must parse the document and decide what clients or servers should receive the message. Also, across bandwidth-limited links, XML is more verbose than binary formats. Binary formats have the potential to improve these aspects of XMPP.
The Jabber Software Foundation has addressed compact representations of XML streams through an experimental add-on extension to XMPP that allows the streams to negotiate compression using zlib [Jabber Stream Compression]. The XMPP protocol may also use Transport Layer Security (TLS), which can optionally include zlib compression of encrypted traffic. Using zlib places a somewhat larger computational load on the server above and beyond the parsing duties it must perform. Again, with large numbers of subscribed users or heavy traffic the computational load may become a limiting factor. A compact binary format may be smaller than a zlib compressed stanza, while retaining high processing speed. The Jabber Stream Compression protocol defined in JEP-0138 can easily be modified or extended to cover alternative formats, such as a binary XML format or a binary format that is also compressed.
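A minimal sketch of compressing a long-lived stream stanza by stanza follows. It assumes one zlib context per direction with a sync flush after each stanza so the receiver can decode it immediately; this is an illustration of the general technique, not the negotiation defined in JEP-0138, and the stanzas are invented.

```python
import zlib

compressor = zlib.compressobj()
decompressor = zlib.decompressobj()

def send(stanza: bytes) -> bytes:
    """Compress one stanza and flush so the receiver can decode it right away."""
    return compressor.compress(stanza) + compressor.flush(zlib.Z_SYNC_FLUSH)

# Hypothetical stanzas exchanged over one open stream.
stanzas = [
    b"<presence from='romeo@example.net/orchard'><show>away</show></presence>",
    b"<presence from='romeo@example.net/orchard'><show>dnd</show></presence>",
    b"<presence from='romeo@example.net/orchard'><show>chat</show></presence>",
]
wire = [send(s) for s in stanzas]
received = [decompressor.decompress(w) for w in wire]
```

Because the compression history carries over between stanzas, later stanzas that resemble earlier ones tend to produce fewer bytes on the wire, at the cost of the extra CPU work described above.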
Documents are stored in the persistent store, where they are searched and updated. Supporting this requires, besides compression, multiple techniques such as selective indexing and data access using groups of pre-compiled XPath expressions. The size of documents varies depending on application domains and particular customer needs within those domains. The persistent store management system has to deal with many scenarios, from small documents to extremely large ones. The ratio of the space taken by actual data to that taken by structural information (tags, attribute names, and namespace definitions) also varies significantly.
When documents are stored in the persistent store or retrieved from it different kinds of transformations and processing can be requested by different applications. Thus, the persistent store management system has to deal with various tools interfacing to it. The ways documents will be manipulated are never known in advance.
Since documents can be large or very large they have to be processed incrementally in many cases. The incremental processing can be of two types: stream like or lazy DOM like. The stream like processing requires documents to be transferred by pieces where each piece can be fully processed without information in the following pieces. The lazy DOM processing allows client applications to request only a part, subtree (full or partial), of the document.
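The stream-like style can be sketched with an incremental parser that handles each piece as soon as it is complete and then releases it. The element names and the summing task below are hypothetical, chosen only to show the pattern.

```python
import io
import xml.etree.ElementTree as ET

def sum_amounts(source):
    """Stream-like pass: handle each <record> as it completes, then release it."""
    total = 0
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "record":
            total += int(elem.findtext("amount"))
            elem.clear()  # release the record's children and text
    return total

# Hypothetical large document made of independent records.
doc = io.BytesIO(b"<ledger>" + b"<record><amount>5</amount></record>" * 1000 + b"</ledger>")
total = sum_amounts(doc)
```

Each record is processed without information from the records that follow it, which is the property the stream-like mode depends on.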
Schema aware documents present multiple opportunities for various optimizations. These optimizations affect different aspects of the alternative XML serialization: compactness, efficiency of random access, efficiency of sequential processing, etc. The alternative XML serialization format must include means to apply these optimizations to the representation of XML documents. At the same time it should allow schema aware documents to be represented, on request, as non-schema aware documents.
Handling schemas in the persistent store has certain properties that are not necessarily present in other use cases. The most important one is the need to support schema evolutions. Schema evolutions include additions of new elements and attributes, modifications by extension of simple data types, etc. In this respect, the forward compatibility of the alternative XML serialization format is very important for persistently stored documents.
There are a couple of important and interesting aspects that need more discussion. One of them is versioning. There are two different kinds of versioning: versioning of schemas for schema aware documents and versioning of documents (in the course of different modifications made to the document) for all XML documents. Both kinds of versioning present technical and usability challenges.
The XQuery Data Model provides an abstract representation of one or more XML documents or document fragments. The data model is based on the notion of a sequence. A sequence is an ordered collection of zero or more items. An item may be a node (document, element, attribute, namespace, text, processing-instruction or comment) or an atomic value. The input and output of XQuery are defined in terms of the XQuery Data Model. Because of this, the efficient handling of XQuery Data Model instances, when storing, retrieving, or searching, is very important. While an XML level serialization of XQuery Data Model instances is defined, it is not sufficient for applying an alternative XML serialization to XQuery Data Model instances. Thus it is important that an alternative XML serialization is rich enough to handle not only the XML serialization but also the XQuery Data Model serialization.
Stakeholders are all providers of persistent store management systems and all providers of applications using and relying on persistent store management systems.
Persistent store technologies are used widely and globally. XML documents have been stored, queried, and manipulated in persistent stores for a number of years. The use of XML-capable persistent stores is growing rapidly, and no decline in this trend is expected.
The efficiency of XML handling in the persistent store is becoming critically important. The experience of persistent store management system providers shows that the use of alternative XML serializations of documents significantly improves performance, in terms of both size and processing efficiency.
A standard for alternative XML serialization is important for several reasons. Persistent stores interoperate with multiple software components (application servers, various client applications, mobile devices, etc.) produced by multiple vendors. Experience shows that conversions between multiple proprietary formats can be quite expensive. In addition, proprietary formats tend to change rapidly. This increases development costs. Also, having to handle patent-related issues does not usually help in delivering products on time.
While persistent store vendors usually handle several data representations, the necessity to handle XML documents is a given. It is not a question of whether applications using persistent stores can use other data representations. It is the reality of persistent store customers using XML in more and more applications. The use of XML in the persistent store, as described above, leads to the following requirements:
Efficiently support the alternative XML serialization format for schema aware documents and be friendly to schema evolutions that maintain the validity of documents
Efficiently support certain operations in the persistent store and in the memory. These operations should include querying, updating, indexing, access to fragments, and fragment extraction.
Efficiently support XML schema datatypes. Transferring data in native binary format is typically more compact. For example, xsd:hexBinary data represented as binary bytes is half the size of the corresponding textual hexadecimal representation.
Efficiently support incremental processing where both partial loading and partial transmission are included.
The use of an alternative XML serialization should require minimal changes to existing application layers (only the underlying layers need to be changed). The impact to developers should be minimal.
Efficiently support multiple ordering. It should be possible to process documents both sequentially and based on subtrees.
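The xsd:hexBinary figure in the datatype requirement above can be checked with a short worked example. The payload is arbitrary sample data.

```python
payload = bytes(range(256)) * 4            # 1024 bytes of arbitrary binary data
hex_text = payload.hex().upper()           # textual xsd:hexBinary form, two characters per byte
text_size = len(hex_text.encode("ascii"))  # size when carried as text
binary_size = len(payload)                 # size when carried as raw bytes
restored = bytes.fromhex(hex_text)         # converting back loses nothing
```

Here text_size is exactly twice binary_size, since each byte needs two hexadecimal digits in the textual form.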
Large XML documents flow through a business process. During the flow of a document, various business processes perform different, disjoint tasks. In addition, each distinct business process may only require portions of the entire XML document to complete its task. For example, a purchase order document may contain various customer information, shipping information, payment and billing information, etc. A business process is then defined where this document is passed to various entities, some being serviced by outside vendors, to approve and fill the purchase order. Another example is a business workflow scenario where XML documents that contain the relevant workflow information are shared between business partners.
These business processes vary greatly in complexity. One successful strategy for the implementation and maintenance of complex applications is the concentration of business rules and business logic in an engine or subsystem of some kind. This subsystem is referenced in one or more ways by other parts of an application while supporting the architectural principle of non-redundant implementation of changing and evolving business aspects. These applications include web-based transactional applications implemented as one or more tiers or component layers which can be distributed in more than one thread, process, or server system. The thread of execution of an application for a specific user is usually represented by a "context" which is maintained for the entire user "session". Distributed application clusters frequently employ methods for distribution of session context data for load balancing and failure recovery. This means that the data that represents a session needs to be serialized and copied to another server or to a database on every transaction.
Business rules may be simple validation descriptions and global variable settings or they may consist of complex business logic or declarative logic programming. Logic and other intelligent programming can consist of expert system production rules, constraint languages, ontology processing, and the integration of multiple dissimilar processing engines. The key considerations are the efficient representation, exchange, maintenance, and evolution of data and knowledge. Often, a knowledge base references or is created from a large template and activity in the application produces many distributed incremental changes to represent state and acquired knowledge. Many standards and research efforts are layered on the representation of entities, relationships, and attributes in RDF and related methods. Optimization for this body of work would be very beneficial to this use case.
The representation of knowledge is a difficult problem to solve generally. It is however important to strive in this direction as a major technique in complex applications is the combination of engines that use different methods but cooperatively work with common data. Representation of knowledge can differ qualitatively from simple data because of the maintenance of meta-data such as "unknown" status, probability, history, and representation of what rules have changed, could change, or are dependent on a particular 'fact'. The alternation of activation of different knowledge or other application modules requires that all or part of the current, updated, session state be made available to those modules. This means there would be frequent interchange of large amounts of changing data, optionally represented as many small changes to a large template, between separately developed modules that need to access the same data in a standard fashion.
A specific example of this use case is of a web insurance or loan application. These kinds of applications are typically hosted on distributed web application clusters. Each processing step, usually a web form submission, involves reception by a web server, processing by an application agent, transmission via queues or web services to an application engine, and processing through many tiers of applications, servers, and communication links.
Application data in these kinds of applications will involve complex relationships and involve complex updates. The organization of data is very likely to be different from the multiple relationships between data elements which leads to a need to efficiently express arbitrary data structures. This need, along with the need to be able to efficiently update data objects while also supporting delta layers, results in a requirement to support explicit pointers that are automatically managed in the format. These pointers, which can be considered sticky virtual pointers, need to be able to be created and maintained within a format instance and have a fixed reference convention. Pointers need to be dereferenced about as quickly as referencing data items via direct paths. The pointers also need to be able to be represented as element or attribute values both internal and external to a format instance.
This Use Case pertains to small, medium, and large businesses that utilize XML to support intra- and inter-business process workflow. An appropriate domain for this use case is users of complex applications, especially web, n-tier, or component based architectures with distributed processing. This use case is particularly suited for knowledge processing systems that include multiple processing subsystems. Important emerging work related to this use case includes the Service Oriented Architecture (SOA), Web Services (WS), and Business Process Execution Language (BPEL) related activities.
Business processes often utilize a workflow where each step in the process only needs and processes certain subsets of the entire document. This results in the different steps in the business process performing disjoint tasks on random parts of the overall document. These disjoint tasks, since they are only processing a subset of the document, do not require the entire schema to perform their task.
Even though the entire document is not required at each step in the business process, the entire document is passed each time and, with current methods, fully parsed and processed. In addition, each business process requires a distinct and disjoint subset of the entire document to perform its task.
Existing methods, including data binding methods of rule engines, methods of distributing and replicating context, and communication models fall far short of possible efficiency levels. Lack of rich data format standardization with any efficiency is holding back intelligent component integration in production systems.
The document passed to each entity can be large, meaning that large amounts of potentially unused data are passed to each endpoint. GZIP may be an option; however, the cost of compressing and decompressing would be paid at each endpoint, potentially negating its benefits. In addition, each endpoint may require random access into the document. If the document were compressed with GZIP, this type of access would be impossible without first decompressing it.
To avoid the potential zip and unzip problem and the bandwidth problem, a binary encoding that represented the data in a more compact fashion could be used. The encoding would also need to allow each endpoint to randomly access and randomly update a subset of the entire original document.
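To make the random access requirement concrete, the following sketch lays out named document sections behind a small offset index, so that one section can be fetched without decoding the others. The container layout, function names, and section names are assumptions made for this illustration only.

```python
import struct

def pack_sections(sections):
    """Lay out named sections behind an index of (name, offset, length) entries."""
    index = bytearray(struct.pack(">H", len(sections)))
    body = bytearray()
    for name, data in sections.items():
        encoded = name.encode("utf-8")
        index += struct.pack(">B", len(encoded)) + encoded
        index += struct.pack(">II", len(body), len(data))
        body += data
    return struct.pack(">I", len(index)) + bytes(index) + bytes(body)

def read_section(blob, wanted):
    """Fetch one section by walking the index only; other payloads are not decoded."""
    (index_len,) = struct.unpack_from(">I", blob, 0)
    body_start = 4 + index_len
    (count,) = struct.unpack_from(">H", blob, 4)
    pos = 6
    for _ in range(count):
        (name_len,) = struct.unpack_from(">B", blob, pos)
        name = blob[pos + 1:pos + 1 + name_len].decode("utf-8")
        offset, length = struct.unpack_from(">II", blob, pos + 1 + name_len)
        pos += 1 + name_len + 8
        if name == wanted:
            return blob[body_start + offset:body_start + offset + length]
    raise KeyError(wanted)

# Hypothetical purchase order split into the parts different steps care about.
sections = {
    "customer": b"<customer><name>ACME</name></customer>",
    "shipping": b"<shipping><city>Boston</city></shipping>",
    "billing": b"<billing><terms>NET30</terms></billing>",
}
blob = pack_sections(sections)
shipping = read_section(blob, "shipping")
```

A workflow step that only needs shipping information reads the index and one slice of the body; supporting in-place update of a section would additionally require the format to manage changes in section length.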
In addition, the document can be modified along the way. This requires that each endpoint have a way to quickly modify a part of the document, then send it to the next step in the workflow process. Modifying the document in place is desirable because creating a DOM, making the change, and writing it back out would be too costly. To support a much wider variety of processing and data structure needs, an ability to support direct, maintained, non-computed pointers is needed.
In some cases, the requirements include that the alternate form of the data be more compact than the original XML. In other cases, processing speed improvement is paramount and compactness is a nice-to-have feature. The more compact form must be lossless. This means the alternate form can be converted back into the original XML with no differences. In addition, the creation of the alternate form and conversion back to XML must be efficient such that the entire business process does not take more time than it did with XML. Furthermore, the alternate encoding must allow for efficient random access and random update into the document such that the entire document does not have to be processed only to access a small subset contained at some specified location.
Passing the XML document in uncompressed form or in gzip-compressed form are the two existing options, and both have drawbacks. Passing the uncompressed document can congest the network, whereas passing the gzipped document can alleviate the bandwidth concerns but may be too costly to compress and decompress at each intermediary. An efficient binary encoding could mitigate the bandwidth issues while at the same time avoiding the compress/decompress steps at each intermediary.
The ability to perform full XML message content-inspection and decision-making based on rules applied to the content is a requirement for a wide array of message processing systems. Routing, firewall (which is a filtering router), auditing gateway (a router that records), a trusted gateway (a firewall that strongly implements security policy), and the rich, high-level communications model that underlies IM/Presence (a router with sessions, virtualized endpoints, identity management, login, high level point to point and pubsub communication all at once) are all related by a number of requirements and similar operations. One strong part of the essence of presence/IM is that it relies on pub/sub of presence information.
Some of these routers and routers+ simply inspect messages, possibly just a header or possibly deeper, before sending them to their new destination. Simpler types of routing include channel-based routing, where a named channel is established by participants, and topic-based routing, where certain keywords are used to make decisions. These types of routing situations may require less raw processing performance than those requiring deep content inspection, but their requirements still drive network infrastructure beyond current capacity and cost. Even very simple pure routing situations, like NAT/PAT, modify, repackage, or otherwise update messages, requiring extremely efficient XML processing. Header-based routing as described in [WS-Routing] is another form of lighter routing. This form of routing is described in 3.14 Web Services Routing. We note that the concept of header may be problematic. With XML, it often simply means earlier XML ranges or subtrees rather than going deeper. If a reasonably opaque payload is used, the concept of header becomes more concrete, but that subset is not a relevant distinction for most XML messaging systems.
Layer 7 application-oriented routing devices and content publish subscribe systems are currently in use in the financial sector for performance-critical applications such as trade routing, where transactions may be routed from the front office to the back office, to brokers, and to exchanges. Routing or subscription criteria may be based on information in XML messages such as trade, client, task or ticket type, for example. Another application example is the dissemination of data services by mobile carriers based on granular subscriptions by a mobile user. In general, content-based routing and publish subscribe are extremely useful paradigms for all kinds of XML message dissemination and are relevant for Web Services, distributed computing, instant messaging, tactical military operations, government services, homeland security, and many types of online publishing. XML-based routing and publish subscribe as filtering technologies can be critical to reducing bandwidth utilization on LANs as well as WANs.
This use case applies to many domains requiring deep content inspection of XML messages in a message processing environment including firewall, routing, publish/subscribe, security gateways, and instant messaging. These applications are relevant to almost all commercial and public entities concerned with enterprise-scale information and transactional infrastructures.
There exists no approach in software today which can meet all usage scenarios of this use case using XML. An alternate XML format could make such an implementation in software feasible by increasing the potential performance of the operations needed in a content-based routing or publish subscribe system. Reduction or elimination of parsing time and the ability to very rapidly find and test XML infoset items is necessary. For example, an alternative XML format with support for random access could enable extraction of data from the message with only a minimal lookup time and could potentially support the low latency needed by routing and the execution of the large numbers of rules needed for content-based publish subscribe. The Random Access property happens to map very strongly to this use case since XPath queries can be used for expressing the logical address of indexed content and the routing and subscriber rules. Another property of an alternate XML format which could assist in reducing parsing time is Accelerated Sequential Access. Compatibility with data binding could accelerate the matching of routing and subscriber rules. For example, such rules often have integer, floating point, and data type comparisons which would be more rapidly evaluated if the cost of conversion to these types in the programming language could be minimized.
XML content-based routing and publish subscribe systems may require additional processing such as message transformation. For example, it may be necessary to delete, insert, or change information used to route the message to ensure that the message does not loop back to the same processing node. WS-Eventing specifies the possibility of inserting a subscriber-specified SOAP header in a message before it is passed to the subscriber. Efficient Update is therefore a critical property of the XML format in such scenarios. In some cases, the Fragmentable property is required to support extraction and processing of a sub-segment of the message. One example might be filtering text or attachments through virus filters. A situation in which both transformation and handling of fragments is critical is in compound event detection. Here a router monitors a sequence of messages for interesting subsequences. If such a subsequence is identified a single compound message composed from them is transmitted. This requires extraction of fragments and transformation on these fragments to construct the compound message. Message security may also be a concern in routing and publish subscribe systems. An alternate XML format which supports Signable and Encryptable properties in an efficient manner is indispensable in these types of scenarios. In many situations any part of the content may be used to make routing decisions, so those parts need to be decryptable by the routers making those decisions. In this case it is necessary to be able to decrypt only fragments of the message, and the needed level of fragmentation may not be known beforehand. It is important to note that mere compatibility with existing XML Signature and XML Encryption Recommendations without considerable advances in the security processing speed of text XML implementations will not be sufficient.
Finally, having an efficient way to ensure validity of the message according to a schema is highly desirable in many content-based routing and publish subscribe systems since this may make routing rules simpler to specify and less costly to evaluate.
Routers at the edge of the LAN responsible for distributing traffic coming from the WAN may operate with a relatively limited set of content inspection rules and small routing tables, but must deal with considerable quantities of message traffic and require very short message processing latency. Publish subscribe applications within the LAN will typically deal with less volume and may have more relaxed latency requirements compared to routers, but may need to deal with a large number of subscribers generating a very large set of rules to be evaluated against every published XML message.
Both XML routers and XML publish subscribe systems have in common that the greatest part of processing overhead comes from simply parsing each XML message. Typically, each message is converted into an in-memory representation (such as DOM) prior to evaluating routing or subscriber rules.
Evaluation of the rules may also require a great amount of processing power; most existing systems cannot, in fact, support complex rules because of the processing cost. XML Routers may be confined to processing routing information in message headers. XML publish subscribe applications may be unable to achieve dynamic subscription matching with full decoupling of publishers and subscribers. Adequate performance can only be achieved when subscribers are limited to matching a fixed set of keywords recognized by the publisher. The inability to perform full content inspection because of its cost in performance has limited architectures to static, inflexible, and narrowly-targeted systems which do not meet current needs.
XPath is sometimes used for expressing routing and subscriber rules. Its ability to locate any content item in an XML message and to apply additional node filtering criteria corresponds closely to the requirements for rule-based classification of XML content. As there are many tools for evaluating XPaths over DOM, we can conceptualize the problem of XML content-based routing and publish subscribe as a process of making decisions from the results of XPath evaluation.
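The decision process described above can be sketched as a rule table evaluated against a parsed message. The sketch uses the limited XPath subset supported by Python's ElementTree (predicates on text content need Python 3.7 or later); the rules, destinations, and message are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules: (XPath in the subset ElementTree understands, destination)
RULES = [
    (".//trade[@type='equity']", "equities-desk"),
    (".//trade[@type='bond']", "fixed-income-desk"),
    (".//client[.='ACME']", "acme-audit"),
]

def destinations(message: bytes):
    """Parse the whole message into a tree, then test every rule against it."""
    root = ET.fromstring(message)
    return [dest for path, dest in RULES if root.find(path) is not None]

message = b"<order><client>ACME</client><trade type='bond'><qty>100</qty></trade></order>"
matched = destinations(message)
```

Note that the full message is parsed before any rule is evaluated, which is the per-message overhead this use case identifies as the dominant cost.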
Streaming XPath is a more efficient XPath implementation usually built on a SAX parser and limited to some subset of XPath expressions which can be evaluated in a single pass over an XML document. Streaming XPath may be an alternative for improving the performance of routing or publish-subscribe rules described with XPath expressions. The performance gain may be sufficient for some usage scenarios but still will not approach the network speed required for inspection and direction of XML messages entering into a LAN.
Purpose-built hardware devices can perform XML content inspection and make firewall, routing, or publish decisions at throughput rates which may be sufficient for this use case. This alternative is proprietary and costly and is therefore of limited applicability. In addition, the problem is sufficiently challenging even for hardware-based solutions. An alternate XML format could have the effect of increasing the performance of hardware-based solutions, further extending the capacity of such solutions and also lowering their cost.
When SOAP messages flow in a network, rarely do they flow from the source system to the destination system directly. More often than not, they flow through various intermediaries between the two endpoints. The intermediaries route the message between systems for various reasons. These include routing the SOAP message to the appropriate endpoint in order to fulfill the request, routing the message based on system availability, and so on.
This use case applies to medium to large businesses that utilize XML and/or Web Services in SOAP messaging applications.
In a system that is dynamically routing messages to support a real-time messaging application, the size of each message as well as the time it takes to determine the next intermediary is important for the overall system performance and response time. The smaller the message, the less time it takes to transmit it between intermediaries, thus freeing up the remaining bandwidth. In addition, an alternate encoding has the potential to make the process of determining the next intermediary more efficient.
In some messaging systems, the endpoint for a message is not determined by the client, but rather in real time by intermediaries. An intermediary will route the message based on various factors. These factors include, but are not limited to, message type, message content, system availability, etc.
An intermediary looks only at the SOAP header of the message to determine the route the message will take. This is more efficient than reading the entire message, creating an in-memory representation (DOM) and digging into the message to determine its route: the intermediary examines only some header information at the beginning of the message in order to make an intelligent decision about how to route it.
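A minimal sketch of header-only inspection with today's XML follows. It assumes a SOAP 1.2 envelope and a hypothetical routing header block `to` in the made-up namespace `urn:example:routing`; neither the header block nor the `next_hop` function is defined by SOAP or by this document.

```python
import io
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2 envelope namespace
HEADER = "{%s}Header" % SOAP_NS
BODY = "{%s}Body" % SOAP_NS
ROUTE_TAG = "{urn:example:routing}to"  # hypothetical routing header block

def next_hop(stream):
    """Return the routing target from the SOAP header, or None if absent."""
    for event, elem in ET.iterparse(stream, events=("start", "end")):
        if event == "end" and elem.tag == HEADER:
            hop = elem.find(ROUTE_TAG)
            # Stop here: no further input is requested from the stream.
            return hop.text.strip() if hop is not None and hop.text else None
        if event == "start" and elem.tag == BODY:
            return None  # Body reached without a Header
    return None
```

The function returns as soon as the Header element closes, so no DOM is built for the body. The parser reads its input in chunks, however, so part of the body may already have been scanned by that point.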
It is important to note that the message, when received by an intermediary for routing, could be digitally signed and/or encrypted. These issues need to be considered for the alternate encoding in the same ways they are considered for XML today.
An alternate encoding needs to support this type of message routing, at a minimum, without loss of performance. The intermediary needs to examine the alternate encoding’s header to determine the routing of the message. This, ideally, requires random access into the message header to determine the routing information.
Various other technologies can be used to achieve routing. However, they would be more brittle and often more proprietary compared to XML. Being able to express routing information in the header of a message in XML ensures a level of interoperability and non-brittleness that XML is known for.
The United States Department of Defense and its allies are very large enterprises and, as such, operate in one of the most challenging information sharing environments. To be effective in today's military environment, thousands of dissimilar information systems, developed independently by different organizations in different countries must share timely, accurate information in a form that is mutually understandable. They must do so within the competing force's decision cycle despite differences in cooperating force cultures, command structures, operational procedures and languages. Information must flow reliably and seamlessly between command centers, mobile devices and embedded systems, including aircraft, ships, submarines, satellites, handheld devices, remote sensors, unmanned vehicles, etc.
To address this challenge, the United States and its allies have invested a great deal of energy devising technologies and formalizing data standards to automate information exchange between command and control (C2) systems and reduce the ambiguity of natural language. In the late 1960s and early 1970s, the U.S. and its allies developed some of the first hierarchically structured data encoding standards, similar to XML, but more compact, for exchanging both text and binary representations of critical defense information. These military standards have been deployed worldwide on a diverse range of platforms and are implemented by almost all U.S. and allies' C2 systems. However, as military proprietary standards, they are expensive to implement, are supported by a limited number of tools and do not evolve at the same pace as technologies driven by the dynamics of a highly competitive commercial marketplace. In addition, training materials, on-line communities and software development tools are sparse and add to the associated expense.
With the advent of XML, the U.S. and its allies quickly recognized the potential of the commercial marketplace to produce low cost, high quality, rapidly evolving technologies for sharing information between C2 systems (e.g., see [NATO XML]). Key national and international military data standards have adopted XML and it is being widely deployed in military C2 systems. Based on the success of these initiatives, the DoD has issued high level policies, directives and programs specifying the adoption of XML across the DoD enterprise.
Unfortunately, the overhead associated with XML’s text syntax makes it impractical or impossible to deploy on systems where efficiency is critical or bandwidth is limited. This creates an interoperability rift between the systems that can effectively process XML and those that cannot, making it difficult to share information between these systems without an expensive series of proprietary gateways and the associated risk that information fidelity or accuracy will be lost.
The military information environment includes a very diverse set of usage scenarios, data types, systems and networks. Important characteristics of the military operating environments, data and systems are described below.
Bandwidth capacity and cost: The tactical networking environment is challenging and involves, for example, high speed aircraft, dramatic terrains, varying electromagnetic conditions, intentional jamming of frequency bands, communications interception risks, etc. In addition, it is not possible to rely on an in-place communications infrastructure. Each network participant brings a portion of the communications infrastructure with them [Airborne Networking]. Due to these characteristics, bandwidth is both limited and expensive. To maintain current performance characteristics, information representations used in this environment must be competitive in size with existing hand optimized binary formats, which is to say their size must be close to the information theoretic minimum. In addition, it must be possible to transmit small subsets of larger information items, e.g. specified by queries or changes to the underlying data.
Diverse data and documents: As a massive enterprise, the military uses a very diverse range of documents and data. This includes highly structured, semi-structured and loosely structured data. It includes very large documents, such as aircraft maintenance manuals, and high volume streams of very small messages, such as position reports. It includes documents designed for both human and machine consumption. It also includes documents that integrate and fuse several different types of data from different sources.
Information Flexibility: Information sources are often used by a variety of information consumers for different purposes. Not all consumers of an information source are known at the time the information source is designed. In addition, information sources and information consumers must be able to evolve independently without being tightly coupled. Therefore, it is not generally desirable or even possible to build many separate schemas to define the interfaces between individual information producers and consumers. Rather, information producers must be able to make information available in a consumer independent form and allow consumers to query that information according to their individual needs.
Schemas and validity: In many cases, the information exchange requirements for C2 systems are well understood and relatively stable. However, real world data exchanges do not always fit precisely within the bounds of their associated schemas, which are often designed well before systems are deployed that implement them. Therefore, it is important for systems to be able to deal flexibly with schema deviations. In addition, there are times when a schema does not exist or cannot be relied upon and scenarios where typed information and untyped information are interspersed in a single document.
Platform constraints: There is a diverse range of systems connected to defense networks, including high powered workstations, low powered remote sensors, small mobile devices and hardened systems embedded in military air, land, sea and space craft. Many of these systems have very limited processing power, memory capacity and/or battery life. To maintain current performance characteristics, the code footprint and time / space complexity of parsing and serialization algorithms must be competitive with that of current hand optimized encoders and decoders.
Upgrade time and cost: Upgrading hardware and software on tactical systems is time-consuming and costly. In many cases, any configuration change triggers expensive testing (e.g., flight tests). In addition, the number of systems that can be taken out of service simultaneously is restricted because operational readiness must be maintained. Thus, system upgrades are tied to maintenance cycles, which are not synchronized, and may even be delayed one or more cycles due to competing priorities and budget limitations. Consequently, deployed systems often implement different versions of the same schema.
The DoD would like a single binary XML standard that works well for the diverse range of data, systems and scenarios specified above as opposed to an incompatible set of binary XML standards optimized for vertical industries. We would also like a binary XML standard that leverages schema information when it is available to improve performance, but does not depend on its existence or accuracy to work. We have sponsored commercial research and development that demonstrates that what we want is both possible and practical [Extending XML]. Additional information about our objectives and requirements can be found in the references section below.
The United States military, government agencies, U.S. allies and others with which the U.S. military exchanges information.
The US DoD maintains a large number of military facilities, mobile devices, sensors and vehicle systems that require timely, accurate information to achieve their objectives. These are being modernized to leverage web technologies [NCES], [GIG-ES]. Where possible, the DoD is buying commercial off-the-shelf hardware and software products, including web servers, mobile devices, databases, messaging solutions and security infrastructure, from a wide range of industries to reduce development costs. Having a common data representation that is widely supported across these industries and military systems drastically reduces the cost and complexity of military systems, while increasing their interoperability and tempo of evolution. For this reason, many of these military components have already adopted XML based solutions.
A large number of systems, however, do not have sufficient bandwidth or computing resources to support XML and accomplish this transition. A major objective is to extend the benefits of XML to the edge of military tactical environments where its use is currently impossible or impractical due to bandwidth and platform constraints.
The value of XML to the DoD and its allies is a function of its wide adoption. The dynamics of the competitive commercial marketplace have produced a wide range of high quality, inexpensive, rapidly evolving XML products that can be easily integrated into a wide variety of military systems. In addition, it has created a vast pool of XML developers, communities, books and training materials. Consequently, using XML drastically reduces the time and cost required to develop and maintain military systems. These benefits create a natural incentive for independently developed military systems and data standards to gravitate toward a common information representation, increasing interoperability across the military enterprise.
Unfortunately, it is not possible or practical to deploy XML on a wide variety of military systems due to the characteristics of military operating environments, data and systems described above. The systems that cannot currently use XML are most often located on the edge of the network where the most time critical data is both digitized and consumed. It is critical for the military network to efficiently connect the observers that detect real world events, the decision makers that allocate and direct resources and the actors that can change the course of those events. A widely adopted efficient XML encoding would enable timely XML information to flow to and from the edge of the network in an efficient, interoperable and cost effective way.
Below are the properties a binary XML standard must, should and may have to support this use case. We have included properties we believe are in scope for a binary XML standard and have omitted properties that reinvent or otherwise duplicate functionality at other layers of the technology stack. For example, security features were omitted from the list of binary XML properties because they already exist in the XML technology stack (i.e., XML signatures and XML encryption). To the extent possible, we would like to reuse existing XML technologies with binary XML instead of developing duplicate capabilities specific to binary XML. This will result in a clean separation of concerns, increasing interoperability and reducing the cost and complexity of binary XML.
Increased use of sensors in military and commercial environments is being driven by the need for better intelligence data and by technology which provides smaller, less costly and more capable sensors. Sensing architectures based on computer networks lead to increased communication to combine (fuse) information. Fixed sensors may have stable communication networks, while mobile sensors, especially those on the ground, may have communication networks which form and reform on the fly. Network links can be publish/subscribe or point-to-point. Sensor packaging and missions can vary widely:
Some can be deployed as large, complex sensor assemblies which have considerable on-board capabilities.
Unattended sensors with limited battery power, processing and communication capabilities. The software stack which supports receiving commands and sending reports must be the smallest and most efficient possible.
Sensors with hard real-time requirements, such as those protecting or navigating a platform.
Sensor communication is typically two-way. Sensor reports and status are sent to collectors or specific consumers. Commands and control messages are sent to sensors. Both kinds of communications may need to be persisted. Persisted data may be used for investigations, analyses, training and simulation inputs. Synthetic sensor communication, for training, testing and simulation, may need to be generated. Tools should be able to transform sensor communications into human-readable forms and human-readable forms into sensor communications. In some cases sensors may be combined with safety-critical devices. Sensing systems may need to collaborate with other external, heterogeneous systems. In this case communication may be a unidirectional flow of sensor reports to the external system.
Sensors report data and are controlled over communication networks. High latency and low bandwidth may not be constant features of the communication, but could emerge at any time. Any organization which uses sensors is a stakeholder. Military organizations have a strong interest. Other stakeholders include homeland defense, commercial organizations which explore for natural resources and government agencies which monitor or investigate the environment.
XML has advantages in this domain: it is standards-based, easier for humans to review and interpret and has the ability to bridge heterogeneous networks and protocols. The disadvantages are higher bandwidth requirements and possible increases in required processing power and time.
For XML to be a viable syntax for creating a sensor language, it would need to be able to produce small packets which are inexpensive to encode and decode. It would need to be at least as efficient as existing binary sensor message formats. It would need to be seen as providing other benefits such as standardization and commercial tool support.
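To make the size gap concrete, the sketch below encodes the same sensor report as a fixed-layout binary record and as textual XML. The report fields, the record layout and the element names are all hypothetical; they do not correspond to any existing sensor message format or to any proposed binary XML encoding.

```python
import struct

# Hypothetical fixed layout, network byte order:
# uint16 sensor id, uint32 timestamp, then three float32 values.
REPORT_FORMAT = "!HIfff"

def encode_binary(sensor_id, timestamp, lat, lon, reading):
    """Pack a report into a fixed-size binary record."""
    return struct.pack(REPORT_FORMAT, sensor_id, timestamp, lat, lon, reading)

def decode_binary(data):
    """Unpack a record produced by encode_binary into a tuple of fields."""
    return struct.unpack(REPORT_FORMAT, data)

def encode_xml(sensor_id, timestamp, lat, lon, reading):
    """Serialize the same report as textual XML (hypothetical vocabulary)."""
    return (
        f'<report sensor="{sensor_id}" time="{timestamp}">'
        f"<lat>{lat}</lat><lon>{lon}</lon><reading>{reading}</reading>"
        "</report>"
    ).encode("utf-8")
```

With this layout the binary record is always 18 bytes, while the XML form of the same report is several times larger because element and attribute names are repeated in every message. An efficient XML encoding would have to close most of that gap to compete with hand-built formats.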
Small packet size is important not only for communication, but also for persistence. The sensor traffic for a large-scale exercise, recorded for later playback, could be quite large. Persisted data should be susceptible to rapid search mechanisms, although this capability is mostly a function of storage and retrieval tools. Because sensor reports and commands are comparatively small, rapid access to data within them is not critical.
Encoding, decoding and packetizing of sensor reports, status messages and commands need to be inexpensive. A worst case is unattended sensors which have limited power, limited battery life and limited processing resources. In all cases sensor communication should support high levels of information assurance. This is especially important in safety-critical scenarios. The communication format should not add any additional cost to encrypting/decrypting and signing/verifying sensor reports and commands.
Existing binary sensor message formats and protocols will continue to be used and new single-solution formats developed.
With the emergence of mobile computing and communications devices, users want ubiquitous access to their information and applications from any of their devices. The ability to use applications and information on one device, then to synchronize any updates with the applications and information back at the office, or on the network, is key to the utility and popularity of this pervasive, disconnected way of computing. Before 2000, there was a proliferation of different, proprietary data synchronization protocols for mobile devices. Each of these protocols was (and still is) available only for selected transports, implemented on a selected subset of devices, and able to access a small set of networked data.
The OMA (Open Mobile Alliance) Data Sync protocol (also known as SyncML) is a universal and powerful XML-based synchronization protocol that operates on any device (phones, handhelds, PCs, etc.) over wireless and conventional networks. The SyncML initiative has been sponsored by a wide range of mobile industry leaders, and a large number of small devices are now SyncML-compliant. Small devices, such as mobile phones, PDAs or pagers, have restricted memory, processing power and battery life, and are connected to low-bandwidth, high-latency networks.
Until now, SyncML was commonly used to synchronize contacts, to-do lists, and schedule information. With the growing demand for data synchronization services, a new release of the protocol offers better support for email, file and folder synchronization. This will considerably increase the average size of SyncML messages exchanged over the network.
This use case applies to data synchronization services, and is particularly significant in the context of small mobile handsets. Stakeholders include users of synchronizable data sources (e.g., PDA and mobile phone users), and providers of platforms and devices allowing such synchronization.
One of the main criticisms of the SyncML protocol, made by manufacturers and service providers who choose to adopt another proprietary data synchronization protocol, concerns its lack of performance. XML-based messages are larger and typically require more processing than other protocols. Whatever the type of device used (small or not), the synchronization process has to be efficient with regard to time and resources. This promotes user convenience and also minimizes user costs, since the user very often has to pay for the synchronization communication over the network.
The popularity of SyncML synchronization and the size and variety of data commonly synchronized is growing. Combining this with the limitations of small devices mentioned above makes the need for a more efficient XML serialization crucial.
SyncML is based on XML in order to maximize interoperability. The speed and efficiency of synchronization is key to the adoption of the SyncML synchronization protocol, particularly in the context of bandwidth-limited wireless networks and devices with limited processing power and other resources.
The main requirements this use case imposes on an alternate XML serialization would be faster processing and reduced message size. Considering that the data conveyed via SyncML messages is often the user's personal data, it is also important that data privacy be preserved. So an encryption capability is also important.
Supercomputing and grid processing involve multiple computing nodes and often involve massive amounts of data and processing. Supercomputers and supercomputer clusters usually have very low latency and high bandwidth network interconnections. Grid systems may have any type of network connectivity depending on the grid's goal and users. Supercomputing applications process large amounts of data that consist of floating point arrays, genetic sequence strings, physical particle or molecular systems, images, 3D models, knowledge representation and queries, or simulation events. Existing systems make use of XML, custom binary formats, and evolving shared access file formats such as HDF5 (Hierarchical Data Format 5) from NCSA (National Center for Supercomputing Applications). HDF5, and its predecessors including CDF (Common Data Format from NASA) and netCDF (Network Common Data Form from the Unidata Program Center), are file formats that allow efficient production, use, and sharing of data. These are container formats that are self-describing, architecture independent, directly accessible, appendable, and sharable (one writer, many readers simultaneously).
Supercomputing applications include weather forecasting, physical simulation, Monte Carlo simulation of complex systems, 3D rendering, data discovery, and knowledge processing. Applications that are heading toward supercomputing include massively multiplayer online role-playing games (MMORPG), expanded search engine services, and expanded retail services. Distributed grid applications include searching for protein folding solutions, testing cryptographic strength, and searching for extra-terrestrial signals. Weather forecasting, as an example of the data interchange needed, receives real-time telemetry from thousands of stations all over the world. Each node communicates results frequently with other nodes. Continuous results are output, processed, and communicated in real time to many organizations. Communication between cluster or grid nodes takes a number of forms and includes MPI or other interfaces for message queuing and communication, shared filesystems, and non-uniform shared memory (cache coherent (ccNUMA) and non-cache coherent memory).
Users of supercomputers, supercomputing clusters, and computational and storage grids. This includes weather forecasting, physics research, virtual weapons testing and simulation, manufacturers, 3D rendering, and companies and developers that communicate with services using computational grid technology.
Supercomputing developers have created file formats with many of the characteristics needed for a more complete binary XML standard format. While this has interesting properties, it isn't usable by the rest of the IT market as a substitute for XML. Supercomputing applications nearly always need to consume large amounts of data from other systems and produce results that must be used by external systems. Much of this data is now XML at points in the lifecycle. A binary XML standard that addressed the needs of supercomputing applications and is usable for efficient data input and output could greatly improve efficiency, integratability, and standardization.
A binary XML format that supports efficient lifecycle data production, use, and modification would be a very important mechanism for intra-grid communication and for data input and output. Formats like HDF5 have advanced container characteristics but have limited management of actual data payloads beyond typing. It is probable that the binary XML and HDF5 work will influence each other in important ways.
There are many types of grid and supercomputing applications. While some are classic hard physics particle simulations, others focus on business-processing simulations or analysis. The data involved varies from very large floating point arrays to complex, tree or graph structured data. A solution to the binary XML requirements may provide a very attractive tool for these applications.
Container file formats such as HDF5 provide a partial solution, often used with raw arrays of floating point data that must be managed by applications. In addition, most applications use either a custom binary data format or XML. When the advantages of XML are needed, the performance problems involved are quite apparent to users of these performance sensitive applications.
The Use Cases have been gathered by the XBC Working Group contributors: Robin Berjon (Expway), Carine Bournez (W3C), Don Brutzman (Web3D), Mike Cokus (MITRE), Roger Cutler (ChevronTexaco), Ed Day (Objective Systems), Fabrice Desré (France Telecom), Seamus Donohue (Cape Clear), Olivier Dubuisson (France Telecom), Oliver Goldman (Adobe), Peter Haggar (IBM), Takanari Hayama (KDDI), Jörg Heuer (Siemens), Misko Hevery (Adobe), Alan Hudson (Web3D), Takuki Kamiya (Fujitsu), Jaakko Kangasharju (University of Helsinki), Arei Kobayashi (KDDI), Eugene Kuznetsov (DataPower), Terence Lammers (Boeing), Kelvin Lawrence (IBM), Eric Lemoine (Tarari), Dmitry Lenkov (Oracle), Michael Leventhal (Tarari), Don McGregor (Web3D), Ravi Murthy (Oracle), Mark Nottingham (BEA), Santiago Pericas-Geertsen (Sun), Liam Quin (W3C), Kimmo Raatikainen (Nokia), Rich Salz (DataPower), Paul Sandoz (Sun), John Schneider (AgileDelta), Claude Seyrat (Expway), Paul Thorpe (OSS Nokalva), Alessandro Triglia (OSS Nokalva), Stephen D. Williams (Invited Expert).