We published the first freely available research corpus annotated using the TEI Guidelines for SGML markup of such material (Burnard and Sperberg-McQueen 1994, Bard et al. 1992), developed a simplified version of SGML for use in language processing, and distributed a free API and toolkit based on it (Thompson et al. 1995). Normalised SGML, as we called this simplification, was an important input into the design process for XML itself, in which the proposer participated as one of the original members of what was initially called the SGML Working Group of the W3C in 1997.
The LTG migrated its normalised SGML tools to XML and started developing a suite of XML-aware tools for use in data-intensive applications (for an overview, see McKelvie et al. 1998). The core of these tools is LT XML (Thompson et al. 1998), an integrated set of XML tools and a developers' toolkit, including a C-based API, running on UNIX and WIN32 platforms. LT XML has now been licensed to over 4500 individuals worldwide in academic and industrial environments. The validating XML parser at the heart of that system, RXP (Tobin 1999), is widely acknowledged as one of the two best high-performance validating parsers available.
The proposer has been involved from the very start with the core XML technologies of the proposed work, XSLT and XML Schema (see below). He collaborated in the initial research and design work (Adler et al. 1997) which led to the formation of the XSL Working Group, which wrote the XSLT Recommendation. He similarly published an early exposition of the need for what became XML Schema, which included a number of key design ideas now incorporated in the XML Schema draft recommendation (Thompson et al. 2000), of which he is a co-editor.
Support for our work on XML and related markup technologies has come from EPSRC (project NSCOPE, GR/L29125, with the proposer as Principal Investigator and the named RA as a major contributor), the European Union Project MATE (on a Multilevel Annotation Tool for corpora; see McKelvie et al. 2000) and the ESRC (the Core Grant to the Human Communication Research Centre, 1989-1999). Our work in this area has also attracted industrial support, with direct grants from Sun Microsystems and Microsoft.