W3C NOTE-schema-req-19990107

XML Schema Requirements

DRAFT Version 0.1

NOTE-schema-req-19980107

World Wide Web Consortium Note

Current Edition: 1998/12/21

This version:
http://www.w3.org/XML/Group/1998/12/NOTE-schema-req-19981221.html
Previous version:
None.
Latest version:
http://www.w3.org/XML/Group/1998/12/NOTE-schema-req.html
Editors:
Ashok Malhotra -- (petsa@us.ibm.com) for IBM
Murray Maloney -- (murray@muzmo.com) for Veo Systems Inc.

Status of this document

This is a draft of what is scheduled to become a W3C Note in January 1999, produced as a deliverable of the XML Schema WG according to its charter. The current version of this document is an internal WG draft for comment by XML Schema WG members. It is inappropriate to use W3C draft Notes as reference material or to cite them as other than work in progress. A list of current W3C working drafts and notes can be found at http://www.w3.org/TR .

According to the XML Schema WG Charter, the document you are reading was scheduled to be published as a W3C Note in December 1998. This document offers an edited selection of requirements already discussed by the group, upon which we hope everyone can agree. It has not yet been approved by the XML Schema WG or the XML Plenary group, and has not yet been made public.

In preparing this list of requirements, the editors have attempted to include all the essential requirements suggested in the WG discussions so far. However, this requirements document has been intentionally designed to leave most design questions undecided. So, if WG members agree to this document, the WG will have an opportunity to argue all the remaining issues during the design phase. See the referenced list of the XML Schema WG issues at "Candidate Requirements for XML Schema and Datatyping from Nov. 14 and 15 meeting".

Please consider the following document as a best effort at a baseline set of requirements. As the XML Schema work continues, the concrete implications of these requirements for the design will be worked out and documented.

Comments about this document should be addressed to the XML Schema WG.

Abstract

This document specifies the basic usage scenarios, design principles, and base requirements for an XML schema language.

Table of Contents

  1. Overview
  2. Purpose
  3. Usage Scenarios
  4. Design Principles
  5. Requirements

1. Overview

This document lists a base set of agreed requirements for an XML schema language.

The XML 1.0 specification defines the concepts of well-formedness and validity; it is very simple to check a document for well-formedness, while validation requires more work but allows the user to define more powerful constraints on document structure. XML validity requires that a document follow the constraints expressed in its document type definition, which provides the rough equivalent of a context-free grammar for a document type.
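
The distinction between well-formedness and validity can be illustrated with a minimal sketch. Python's standard xml.etree.ElementTree parser checks well-formedness only and does not validate against a DTD, so the validity check below is hand-written for a hypothetical memo document type; the element names and the content rule are assumptions made for illustration, not part of any W3C specification.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the text parses as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Hypothetical content rule standing in for a DTD declaration such as
# <!ELEMENT memo (to, from, body)>
MEMO_CHILDREN = ["to", "from", "body"]

def is_valid_memo(text: str) -> bool:
    """Check the hypothetical memo content model; requires well-formedness."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        return False
    return root.tag == "memo" and [child.tag for child in root] == MEMO_CHILDREN

good = "<memo><to>A</to><from>B</from><body>Hi</body></memo>"
reordered = "<memo><from>B</from><to>A</to><body>Hi</body></memo>"
broken = "<memo><to>A</memo>"
```

Here good is both well-formed and valid, reordered is well-formed but violates the content rule, and broken is not well-formed at all because its tags do not nest properly.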

In some contexts, applications may need definitions of markup constructs that are more informative than those which can be expressed using document type definitions as defined in XML 1.0, or constraints on document structure that are tighter than, looser than, or simply different from them. There is also a widespread desire to allow markup constructs and constraints to be specified in an XML-based syntax, in order to allow tools for XML documents to be used on the specifications.

By charter, the XML Schema WG is assigned to address the following issues:

primitive data typing
    integers, dates, and the like, based on experience with SQL, Java primitives, etc.; byte sequences ("binary data") also need to be considered

structural schemas
    a mechanism somewhat analogous to DTDs for constraining document structure (order, occurrence of elements, attributes). Specific goals beyond DTD functionality are
      • integration with namespaces
      • definition of incomplete constraints on the content of an element type
      • integration of structural schemas with primitive data types
      • inheritance: Existing mechanisms use content models to specify part-of relations. But they only specify kind-of relations implicitly or informally. Making kind-of relations explicit would make both understanding and maintenance easier.

conformance
    The relation of schemata to XML document instances, and obligations on schema-aware processors, must be defined. The WG will define a process for checking to see that the constraints expressed in a schema are obeyed in a document (schema-validation); the relationship between schema-validity and validity as defined in XML 1.0 will be defined.

The XML Schema work is interdependent with several other areas of W3C activity. These are listed below under Design Principles.

2. Purpose

The purpose of the XML schema language is to provide an inventory of XML markup constructs with which to write schemas.

The purpose of a schema is to define and describe a class of XML documents by using these constructs to constrain and document the meaning, usage and relationships of their constituent parts: elements and their content, attributes and their values, entities and their contents and notations. The definitions of XML markup constructs document and restrict the meaning, usage, and function of elements, attributes and datatypes, their contents, and how they may be combined. Schema constructs may also provide for the specification of implicit information such as default values. Schemas document their own meaning, usage, and function.

Thus, the XML schema language can be used to define, describe and catalogue XML vocabularies for classes of XML documents.

Any application of XML can use the Schema formalism to express syntactic, structural and value constraints applicable to its document instances. The Schema formalism will allow a useful level of constraint checking to be described and validated for a wide spectrum of XML applications. For applications which require other, arbitrary or complicated constraints, the application must perform its own additional validations.
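
The division of labor described above, where a schema covers structural and value constraints and the application adds its own checks, might be sketched as follows. The ORDER_SCHEMA table, the element names, and the cross-field rule are all hypothetical and invented for this sketch; they do not represent a proposed schema syntax.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical "schema": required child element name -> value parser.
ORDER_SCHEMA = {
    "quantity": int,
    "orderDate": date.fromisoformat,
    "shipDate": date.fromisoformat,
}

def schema_errors(root):
    """Structural and value checks that a schema could plausibly express."""
    errors = []
    for name, parse in ORDER_SCHEMA.items():
        child = root.find(name)
        if child is None:
            errors.append(f"missing <{name}>")
            continue
        try:
            parse((child.text or "").strip())
        except ValueError:
            errors.append(f"bad value in <{name}>")
    return errors

def application_errors(root):
    """A cross-field rule that this sketch leaves to the application."""
    ordered = date.fromisoformat(root.findtext("orderDate").strip())
    shipped = date.fromisoformat(root.findtext("shipDate").strip())
    return ["shipDate precedes orderDate"] if shipped < ordered else []

doc = ET.fromstring(
    "<order><quantity>3</quantity>"
    "<orderDate>1998-12-21</orderDate>"
    "<shipDate>1998-12-01</shipDate></order>"
)
```

In this sketch doc passes every schema-level check, since each required element is present and each value parses, yet it still fails the application-level rule because the ship date falls before the order date.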

3. Usage Scenarios

The following usage scenarios describe XML applications that should benefit from XML schemas. They represent a wide range of activities and needs that are representative of the problem space to be addressed. They are intended to be used during the development of XML schemas as design cases that should be reviewed when critical decisions are made. These usage scenarios should also prove useful in helping non-members of the XML Schema WG understand the intent and goals of the project.

  1. Publishing and syndication.

    Distribution of information through publishing and syndication services. Involves collections of XML documents with complex relations among them. Structural schemas describe the properties of headlines, news stories, thumbnail images, cross-references, etc. Document views under control of different versions of a schema.

  2. Electronic commerce transaction processing.

    Libraries of schemas define business transactions within markets and between parties. A schema-aware processor is used to validate a business document, and to provide access to its information set.

  3. Supervisory control and data acquisition.

    The management and use of network devices involves the exchange of data and control messages. Schemas can be used by a server to ensure outgoing message validity, or by the client to allow it to determine what part of a message it understands. In a multivendor environment, schemas allow an application to discriminate among data governed by different schemas (industry-standard, vendor-specific), to know when it is safe to ignore information not understood and when an error should be raised instead, and to provide transparency control. Applications include media devices, security systems, plant automation, and process control.

  4. Traditional document authoring/editing governed by schema constraints.

    One important class of application uses a schema definition to guide an author in the development of documents. A simple example might be a memo, whereas a more sophisticated example is the technical service manuals for a wide-body intercontinental aircraft. The application can ensure that the author always knows whether to enter a date or a part-number, and might even ensure that the data entered is valid.

  5. Use schema to help query formulation and optimization.

    A query interface inspects XML schemas to guide a user in the formulation of queries. Any given database can emit a schema of itself to inform other systems what counts as legitimate and useful queries.
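
As an illustration of scenario 3, the following minimal sketch shows how a client might separate the parts of a message it understands from vendor-specific parts it can safely ignore, and raise an error for anything else. The namespace URIs, the element names, and the policy sets KNOWN and IGNORABLE are hypothetical; a real schema language would need to define how such policies are declared.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs and policy sets, invented for this sketch.
STANDARD_NS = "http://example.org/ns/industry"
KNOWN = {STANDARD_NS}
# Namespaces the client does not understand but may skip safely.
IGNORABLE = {"http://example.org/ns/vendor-a"}

def namespace_of(element):
    """Return the namespace URI from ElementTree's '{uri}local' tag form."""
    tag = element.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""

def triage(message: str):
    """Split a message's child elements into understood and ignored tags."""
    understood, ignored = [], []
    for child in ET.fromstring(message):
        ns = namespace_of(child)
        if ns in KNOWN:
            understood.append(child.tag)
        elif ns in IGNORABLE:
            ignored.append(child.tag)
        else:
            # Not safe to ignore: report instead of silently dropping it.
            raise ValueError(f"element from unrecognized namespace: {child.tag}")
    return understood, ignored

message = (
    '<msg xmlns:std="http://example.org/ns/industry" '
    'xmlns:va="http://example.org/ns/vendor-a">'
    "<std:setpoint>72</std:setpoint><va:ledColor>blue</va:ledColor></msg>"
)
```

Calling triage(message) places the industry-standard setpoint element in the understood list and the vendor-specific ledColor element in the ignored list; a child from any other namespace raises ValueError.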

4. Design Principles

In the design of any language, trade-offs in the solution space are necessary. To aid in making these trade-offs the following design principles are used. They are subject to comment and revision.

The XML schema language shall be:

  1. more expressive than XML DTDs;
  2. self-describing and expressed in XML;
  3. usable by a wide variety of applications that employ XML;
  4. straightforwardly usable on the Internet;
  5. optimized for interoperability;
  6. simple enough to implement with modest resources;
  7. compatible/coordinated with relevant W3C specs.

The XML schema language specification shall:

  1. be prepared quickly;
  2. be precise, concise, human-readable, and illustrated with examples.

5. Requirements

Structural requirements

The XML schema language must define:

  1. mechanisms for constraining document structure (namespaces, elements, attributes) and content (datatypes, entities, notations);
  2. mechanisms to enable inheritance for element, attribute, and datatype definitions;
  3. mechanism for URI reference to standard semantic understanding of a construct;
  4. mechanism for embedded documentation;
  5. mechanism for application-specific constraints and descriptions;
  6. mechanism for managing evolving document and information models.

Datatype requirements

The XML schema language must:

  1. provide for primitive data typing, including integers, dates, byte sequences, and the like, informed by experience with SQL and Java primitives;
  2. define a type system adequate for import/export from traditional (e.g. relational) databases;
  3. distinguish requirements relating to lexical (serialized) data representation, vs. requirements (if any) governing the underlying information set;
  4. allow user-defined datatypes as specializations of existing datatypes by constraining properties such as range, precision, length or mask;
  5. allow creation of user-defined datatypes, e.g. datatypes created by constraining existing datatypes with properties such as range, precision, length or mask.
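
One way to picture requirements 3 through 5 is the minimal sketch below, in which a user-defined datatype is derived from an existing one by constraining range, length, or mask. The Datatype class, its field names, and the restrict method are hypothetical names chosen for illustration; the sketch treats a mask as a regular expression over the lexical form, which is an assumption, and it does not enforce that a derived type only narrows its base.

```python
import re
from dataclasses import dataclass, replace
from typing import Callable, Optional

@dataclass(frozen=True)
class Datatype:
    """A datatype: a lexical-to-value parser plus optional constraints."""
    name: str
    parse: Callable[[str], object]
    minimum: Optional[object] = None
    maximum: Optional[object] = None
    max_length: Optional[int] = None
    mask: Optional[str] = None  # regular expression on the lexical form

    def restrict(self, name: str, **constraints) -> "Datatype":
        """Derive a new datatype by overriding constraint properties."""
        return replace(self, name=name, **constraints)

    def accepts(self, lexical: str) -> bool:
        """Check lexical constraints first, then value-level constraints."""
        if self.max_length is not None and len(lexical) > self.max_length:
            return False
        if self.mask is not None and not re.fullmatch(self.mask, lexical):
            return False
        try:
            value = self.parse(lexical)
        except ValueError:
            return False
        if self.minimum is not None and value < self.minimum:
            return False
        if self.maximum is not None and value > self.maximum:
            return False
        return True

integer = Datatype("integer", int)
percent = integer.restrict("percent", minimum=0, maximum=100)
part_number = Datatype("string", str).restrict("partNumber", mask=r"[A-Z]{2}-\d{4}")
```

The sketch also separates the lexical representation from the underlying value: the strings "007" and "7" differ lexically but parse to the same integer, so length and mask apply to the serialized form while range applies to the parsed value.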

Conformance

The XML schema language must:

  1. describe the responsibilities of conforming processors;
  2. define the relationship between schemas and XML documents;
  3. define the relationship between schema validity and XML validity;
  4. define the relationship between schemas and XML DTDs, and their information sets;
  5. define the relationship among schemas, namespaces, and validity.