Language in Edinburgh

Henry S. Thompson's Home Page

HST travel plans
Accelerated Natural Language Processing (ANLP), Winter Term 2018
Foundations of Natural Language Processing (FNLP), Spring Term 2019
Contacting me (including PGP key)
Index of all documents on this site of potential general interest


Context

I'm based in the Institute for Language, Cognition and Computation of the School of Informatics at the University of Edinburgh, with the title "Professor of Web Informatics". I'm a Fellow of the Alan Turing Institute. I'm interested in the Architecture of the Web, Markup Languages, the Foundations of Cognitive Science, as well as Computational Linguistics, Data-Intensive Linguistics, Language Corpora and Corpus Management.

Much of my recent work originates in my membership in the W3C Technical Architecture Group between 2005 and 2012 and my work within W3C working groups:

  1. Making sense of how the Web works (and doesn't work), with particular reference to the nature of URIs;
  2. In particular, I recently spent a lot of time developing tools and techniques for studying the Web empirically, based on both static (very large-scale web datasets) and dynamic (proxy logs) evidence;
  3. Developing XML and Web standards and related tools (see below); I was involved in the W3C SGML Working Group, whose work led to the XML recommendation: Extensible Markup Language (XML), and was a member of the XML Core, XML Schema and XML Processing Model working groups;
  4. I also contribute to IETF work in the above areas.

If you're interested in pursuing an MSc or PhD in one of these areas, please see the Informatics postgraduate prospectus and review my entries in the ILCC PhD topics pages, then get in touch with me.

Outside my University time I do consulting and business mentoring via Markup Systems.


XML Tools (LT XML, XED and XSV)

Version 1.2 of LT XML, a fully compliant XML tool kit and API for WIN32 and UN*X platforms, is available.

The beta of XED, my XML document instance editor, is still available.

The current version of XSV, an XML Schema validator, is available via a web interface.

PullFromSAX.py: an add-on to Python's SAX functionality providing a simple 'pull'-style interface.
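To illustrate what a 'pull'-style interface over SAX looks like, here is a minimal sketch. It does not use PullFromSAX.py, whose API is not described on this page; instead it assumes the standard library's xml.dom.pulldom module as a stand-in. The function name iter_start_tags and the sample document are hypothetical.

```python
from xml.dom import pulldom


def iter_start_tags(xml_text):
    """Yield (tag name, attribute dict) for each start tag, on demand.

    With plain SAX the parser pushes events into handler callbacks.
    Here the caller pulls events from an iterator, so ordinary control
    flow (loops, early exit) decides how much of the document is read.
    """
    for event, node in pulldom.parseString(xml_text):
        if event == pulldom.START_ELEMENT:
            yield node.tagName, dict(node.attributes.items())


# Hypothetical sample document, for illustration only.
doc = '<doc><p id="a">one</p><p id="b">two</p></doc>'
tags = list(iter_start_tags(doc))
```

Because the events come from an iterator, the caller can stop as soon as it has what it needs, which is awkward to arrange with SAX callbacks.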

The beta version of xslj, an old (non-standard) XSL to DSSSL translator, is still available.


Address

Postal:
Henry S. Thompson
4.22 Informatics Forum
10 Crichton Street
Edinburgh EH8 9AB
SCOTLAND

Email:
ht@inf.ed.ac.uk
PGP key:
HST's GnuPG Public Key
Tel:
+44 (0)131 650 4440
Mobile:
+44 (0)7866 471 388
Fax:
+44 (0)131 651 1426
 
Photo (if you must)
My wife Catharine runs the OPENspace Research Centre.
We intend to develop ShutYourFacebook.com as a website for promoting outdoor activities.
We are fortunate to have inherited access to a holiday home on the coast of Maine, which we make available to rent.
I'm starting to move some material to a personal server

The following sections are of historical interest only at this point -- I haven't worked on this stuff for years.

XML Linking Architectures

I helped launch the use of standoff markup to improve annotation management in complex datasets: the underlying technology is described in my SGML Europe '97 paper. My presentation to the COCOSDA meeting in Rhodes discusses the application of this technology to spoken language transcripts, available as Powerpoint v.7 version, Powerpoint v.4 version and quick and dirty HTML from Powerpoint outline.
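For readers unfamiliar with the idea, here is a rough sketch of the general principle behind standoff markup: annotations live apart from the base data and point into it, so several independent annotation layers can share one unmodified source. This is a simplification under stated assumptions, not the mechanism from the SGML Europe '97 paper: it uses plain character offsets rather than links between XML documents, and the Annotation class, the annotated_spans function and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    start: int  # inclusive character offset into the base text
    end: int    # exclusive character offset into the base text
    label: str


def annotated_spans(text, annotations):
    """Resolve each standoff annotation to the stretch of text it covers."""
    return [(a.label, text[a.start:a.end]) for a in annotations]


# The base text is never edited; each layer is a separate list of pointers.
base_text = "the cat sat"
words = [Annotation(0, 3, "w"), Annotation(4, 7, "w"), Annotation(8, 11, "w")]
phrases = [Annotation(0, 7, "np")]

word_spans = annotated_spans(base_text, words)
phrase_spans = annotated_spans(base_text, phrases)
```

Keeping the layers separate means a new annotation layer can be added, or an old one revised, without touching the base text or the other layers.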


DSSSL Tools (DSC)

DSC version 2.0, an online syntax checker, normaliser and implementation framework for DSSSL, based on embedding a full R4RS Scheme interpreter in James Clark's SP parser, is available for downloading. For more information, see the release announcement, which describes dsc in more detail.

Version 2.0, as demonstrated at SGML/XML '97 in November 1997, provides a much richer implementation framework than previous versions, including the full query language and the transformation language.

DSSSL users might find my index to DSSSL procedures by prototype useful. I've also produced a summary of information about the copyright status of the DSSSL standard and pointers to various electronic versions thereof.

For DSSSL/SGML implementation mavens, here's an illustrated example of an SGML source grove.