Next: 2.2 Deliverables and Work
Up: 2. Programme and Methodology
Previous: 2. Programme and Methodology
Although the overall shape of the desired architecture for data
publishing and interchange via XML is clear, and many more or less
ad hoc efforts are already under way to instantiate it for
particular application/programming language pairs (see e.g. Reinhold
1999, Box 1999), what is really wanted is support for a declarative
specification of the relation between an application data model and an
XML Schema, each independently defined. In concrete terms this
support should yield implementations of language-independent
marshalling and unmarshalling, that is, bi-directional conversion
between XML instance and application data. Some aspects of a solution
are already clear in outline; others will require exploring how
research results from related disciplines might be applied.
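To make the target concrete, the following is a minimal sketch, not a proposed design, of what a declarative mapping could look like when it is held separately from both the application data model and the XML vocabulary. The `Person` class, the `PERSON_MAPPING` table and its tuple layout, and the `marshal`/`unmarshal` helpers are all hypothetical names introduced for illustration. Python is used only as an example host language.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Person:
    """Application data model, defined independently of any XML vocabulary."""
    name: str
    age: int


# Hypothetical declarative mapping: one entry per field, stating where the
# value lives in the XML and how to convert it in each direction.
PERSON_MAPPING = {
    "element": "person",
    "fields": [
        # (field, kind, XML name, text -> value, value -> text)
        ("name", "child", "name", str, str),
        ("age", "attribute", "age", int, str),
    ],
}


def unmarshal(xml_text, cls, mapping):
    """XML instance -> application data, driven only by the mapping table."""
    root = ET.fromstring(xml_text)
    if root.tag != mapping["element"]:
        raise ValueError(f"expected <{mapping['element']}>, got <{root.tag}>")
    values = {}
    for field, kind, xml_name, from_text, _ in mapping["fields"]:
        raw = root.get(xml_name) if kind == "attribute" else root.findtext(xml_name)
        if raw is None:
            raise ValueError(f"missing {kind} {xml_name!r}")
        values[field] = from_text(raw)
    return cls(**values)


def marshal(obj, mapping):
    """Application data -> XML instance, using the same mapping table."""
    root = ET.Element(mapping["element"])
    for field, kind, xml_name, _, to_text in mapping["fields"]:
        text = to_text(getattr(obj, field))
        if kind == "attribute":
            root.set(xml_name, text)
        else:
            ET.SubElement(root, xml_name).text = text
    return ET.tostring(root, encoding="unicode")
```

Because both directions read the same table, a value that is unmarshalled and then marshalled again should survive the round trip, for example `unmarshal(marshal(p, PERSON_MAPPING), Person, PERSON_MAPPING) == p`. The sketch sidesteps the harder cases (nesting, repetition, optionality, element order) that a real solution would need to address.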
We see the proposed research as necessary preparation for
standardisation work in this area: member companies of the W3C
have recently requested that it undertake work on standardising
XML protocols (Larry Masinter, personal communication), while at
the same time making clear that XML-encoded RPC is not
what is required: such a move would leave unsolved the problem of
how XML structure and application structure correspond.
The following questions each need to be addressed to arrive at the
desired architecture:
- Is the mapping to be specified by annotations within an
individual XML Schema, e.g. by adding mapping information
to each element and attribute declaration? Alternatively, should
the mapping be specified externally, possibly exploiting XSLT?
- The directionality aspect of the mechanism deserves special
consideration: both XML Schema and XSLT are by
design good matches for the XML -> application
(unmarshalling) direction. What kind of conditions must
be imposed on either type of solution to guarantee reversibility,
that is, the application -> XML (marshalling) direction?
Again, XML Schema and XSLT both imply a control structure from
which an implementation of unmarshalling naturally emerges. What
control structure would be required for marshalling?
- What are the tradeoffs between specifying the application side
of the mapping in implementation-level terms (e.g. Java class
instance/variable or relational table/row) versus specifying it in
more abstract terms (e.g. Entity-Relationship, EXPRESS, or UML;
see http://www.uml-zone.com/umlfaq.asp)?
- Would specifying an abstract mapping in the schema, and concrete
language-specific bindings from abstract model to implementation
independently of the schema, give the right modularity?
- When only one or the other model is specified in advance (i.e.
XML Schema or application data model), can we automatically derive
the other? If so, what conventions should be used in doing so?
- Does the intermediate position between XML and traditional
databases occupied by semi-structured data offer any leverage for
the solution to this problem?
- What constraints, if any, are required to allow an
implementation of unmarshalling to work in a streaming fashion, i.e.
to build application structures as an XML document is processed,
without building a complete internal representation of the document
before application structure construction can begin?
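The streaming question in the last item can be illustrated with a small sketch. It assumes a hypothetical document shape in which each `order` element carries everything needed to build one `Order` object, so that the object can be constructed as soon as the element's end tag is seen. The element names, the `Order` class and the `stream_orders` function are illustrative assumptions, and the standard-library `iterparse` stands in for whatever event-based parser an implementation might use.

```python
import io
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Order:
    id: str
    total: float


def stream_orders(source):
    """Yield one Order per <order> element as soon as its end tag is parsed.

    Works only because each order is self-contained: nothing outside the
    element is needed to build the corresponding application object.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "order":
            yield Order(id=elem.get("id"), total=float(elem.findtext("total")))
            # Drop the converted element's children and attributes; the
            # emptied element itself stays attached to its parent.
            elem.clear()


# Usage: application objects become available while parsing is in progress.
doc = (b'<orders><order id="a1"><total>9.50</total></order>'
       b'<order id="b2"><total>20</total></order></orders>')
orders = list(stream_orders(io.BytesIO(doc)))
```

A mapping that let an object depend on content arriving later in the document, such as a forward reference, would break this scheme. That is the kind of constraint the question asks about.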
The work proposed here aims at answering these questions, structuring the
effort in terms of three broad goals:
- Design a declarative approach based as far as possible on
existing public standards which supports automatic generation of
implementation-language-appropriate unmarshalling of XML documents
into application data structures;
- Extend the above to support marshalling in a congruent way, i.e.
the construction of XML documents from application data structures;
- Explore approaches to (semi-)automatically generating
appropriately annotated XML schemas from schemas expressed in one or
more data modelling languages.
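As a rough illustration of the third goal, the sketch below derives a schema from a data model by fixed conventions. A Python dataclass stands in for a schema expressed in a data modelling language. The `TYPE_CONVENTIONS` table, the lower-cased element name and the `derive_schema` function are assumptions made for the example. The namespace used is that of the later XML Schema Recommendation, and mapping annotations are omitted.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, fields

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Assumed conventions relating application types to schema datatypes.
TYPE_CONVENTIONS = {
    str: "xs:string",
    int: "xs:integer",
    float: "xs:double",
    bool: "xs:boolean",
}


@dataclass
class Person:
    name: str
    age: int


def derive_schema(cls):
    """Derive an element declaration from a dataclass: one child per field."""
    ET.register_namespace("xs", XSD_NS)
    schema = ET.Element(f"{{{XSD_NS}}}schema")
    element = ET.SubElement(schema, f"{{{XSD_NS}}}element",
                            name=cls.__name__.lower())
    complex_type = ET.SubElement(element, f"{{{XSD_NS}}}complexType")
    sequence = ET.SubElement(complex_type, f"{{{XSD_NS}}}sequence")
    for f in fields(cls):
        # Unknown field types raise KeyError: the conventions must cover them.
        ET.SubElement(sequence, f"{{{XSD_NS}}}element",
                      name=f.name, type=TYPE_CONVENTIONS[f.type])
    return ET.tostring(schema, encoding="unicode")


schema_text = derive_schema(Person)
```

Every choice hard-coded here (child elements rather than attributes, field order as sequence order, the type table) is one of the conventions that would need to be settled.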
Henry Thompson
2000-09-13