The Semantic Web and Knowledge Representation:

An informal comparison

Henry S. Thompson
28 Jan 2006

1.   Introduction

There's a difficult choice confronting anyone embarking on a project in which large-scale knowledge resources are required: the choice of knowledge representation. In addition to first-order logic and the many specialised knowledge representation languages developed over the last thirty years, the apparently radical alternative offered by the Semantic Web must now also be considered. In what follows I'll introduce the KR technologies of the Semantic Web project, and draw out some important contrasts with more traditional approaches.

2.   The Semantic Web

The Semantic Web is the name given to a collection of technologies whose design and development were initiated by Tim Berners-Lee and the World Wide Web Consortium. Its stated goal is to reproduce, at the level of automatic processing, the success of the World Wide Web as a medium for human beings to exchange information: that is, to enable distributed development of information resources which can be discovered, merged and exploited computationally without human participation. To this end it has reconstructed some basic elements of knowledge representation systems in order to make them 'web-friendly'. To date its efforts have been focussed in two areas:

RDF
a syntactic low-level relational graph data model for recording assertions
OWL
a semantic higher-level collection of primitives for use in describing ontologies

To date there has been less work on inference or reasoning. Connections have been achieved with description-logic-based inference engines, and work has recently begun on a rule language.

3.   What's special about the Semantic Web?

A KR language whose core model is two-place relations, understood as forming a graph, is nothing special. A KR language with an XML serialisation is nothing special. And designing the language first and then playing 'catch-up' with respect to inference isn't new either, alas.

What is new is the focus on KR for the Web. This in turn has brought with it a key aspect of the architecture of the Web: everything has a universal name. The thing which sets the Semantic Web technologies apart from every other KR technology is the use of URIs (Uniform Resource Identifiers) for naming not only individuals, but also relations. Every RDF graph is constructed from subject-relation-object triples, and all three constituents can be named with URIs (although an object may instead be a literal value, and RDF also allows unnamed nodes).

In principle this allows for the direct merger of two independently-constructed RDF-encoded knowledge bases -- graph nodes named with the same URI (and URI comparison is trivial and well-defined) can simply be merged. And since it is the relations in a factual knowledge base which are typically the nodes (subjects or objects) in an ontology, it is likewise easy to imagine merging multiple ontologies.
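The merger described above can be sketched in a few lines. This is a toy illustration, not an RDF implementation: a graph is modelled as a Python set of (subject, relation, object) tuples of URI strings, and every example.org URI, along with the helper names merge and nodes, is an assumption made up for the example.

```python
# A graph is a set of (subject, relation, object) triples of URI strings.
# Only the Dublin Core 'creator' URI is real; the example.org URIs are hypothetical.
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"
EX = "http://example.org/"

graph_a = {
    (EX + "doc/essay", DC_CREATOR, EX + "people/hst"),
}
graph_b = {
    (EX + "people/hst", EX + "terms/worksAt", EX + "org/informatics"),
    (EX + "doc/essay", DC_CREATOR, EX + "people/hst"),  # also asserted in graph_a
}


def merge(*graphs):
    """Merge graphs by set union; triples stated more than once collapse."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def nodes(graph):
    """Return every URI that appears as a subject or an object."""
    return {s for s, _, _ in graph} | {o for _, _, o in graph}


merged = merge(graph_a, graph_b)
print(len(merged), "triples,", len(nodes(merged)), "nodes")
```

Because node identity is plain string equality on URIs, the node for people/hst that is an object in one graph and a subject in the other becomes a single node in the result without any further work.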

In practice there are problems. Independently developed knowledge bases or ontologies are unlikely to have chosen the same URIs for the same individuals or concepts. Even where they have, or where approximate equivalence of the place of one URI in one graph with the place of another URI in another graph can be detected, identity of real-world denotation or ontological functionality cannot be assumed.
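A small sketch of the difficulty, again with graphs as sets of URI triples. All the a.example.org and b.example.org URIs are hypothetical, and the same_as table stands in for a co-reference judgement that somebody has to make by hand: nothing in URI comparison supplies it, and nothing guarantees it is right.

```python
# Two knowledge bases that (we suppose) describe the same person under different URIs.
# Only the Dublin Core 'creator' URI is real; all other URIs are hypothetical.
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"
A = "http://a.example.org/"
B = "http://b.example.org/"

kb_a = {(A + "essay", DC_CREATOR, A + "people/hst")}
kb_b = {(B + "staff/thompson", B + "terms/worksAt", B + "org/informatics")}


def nodes(graph):
    """Return every URI that appears as a subject or an object."""
    return {s for s, _, _ in graph} | {o for _, _, o in graph}


# A naive merge finds nothing to join: no node URI is shared.
shared = nodes(kb_a) & nodes(kb_b)

# An asserted (and possibly mistaken) equivalence between the two names.
same_as = {B + "staff/thompson": A + "people/hst"}


def rename(graph, mapping):
    """Rewrite every URI that has an entry in the mapping."""
    return {
        (mapping.get(s, s), mapping.get(r, r), mapping.get(o, o))
        for s, r, o in graph
    }


linked = kb_a | rename(kb_b, same_as)
print("shared before mapping:", len(shared))
print("nodes after mapping:", len(nodes(linked)))
```

The two graphs join up only once the equivalence has been supplied from outside, which is exactly the step that cannot be taken for granted.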

What does follow from the use of URIs for everything, however, is the availability of the 'follow your nose' strategy: by definition, a URI identifies a resource, and there should be a description of that resource retrievable from it. So for example when a human or a computer encounters an RDF graph with a relation in it of the form

{ http://purl.org/dc/elements/1.1/ , creator }

dereferencing the URI yields both human- and machine-readable information about the predicate: on the one hand, its official definition in natural language; on the other hand, its place in an RDF-encoded ontology (i.e. an OWL schema).
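As a sketch of the first step in following your nose: the namespace and local name in the pair above combine, by simple concatenation, into a single URI that can then be dereferenced. The code below only builds the HTTP request and does not send it. The helper names expand and describe_request are made up for the example, and asking for RDF/XML through an Accept header is my assumption about how a client might request the machine-readable description; what a given server actually returns is up to that server.

```python
from urllib.request import Request


def expand(namespace, local_name):
    """Combine a namespace URI and a local name into one URI by concatenation."""
    return namespace + local_name


def describe_request(uri, accept="application/rdf+xml"):
    """Build, but do not send, a GET request for a description of the resource.

    The Accept header is an assumed way of stating a preference for a
    machine-readable representation.
    """
    return Request(uri, headers={"Accept": accept})


creator_uri = expand("http://purl.org/dc/elements/1.1/", "creator")
req = describe_request(creator_uri)

print(req.get_method(), req.full_url)
print("Accept:", req.get_header("Accept"))
```

Passing the request to urllib.request.urlopen would perform the actual retrieval; that step is left out here so that the sketch does not depend on the network.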

What the creators of the Semantic Web are hoping is that the universal ability to 'follow your nose' and see what's behind any subject or relation you encounter in RDF form will encourage speedy de facto standardisation on a single ontology for any given subject area, and that this in turn can bootstrap the detection and exploitation of co-reference in separately-developed knowledge bases.

4.   Conclusions

The history of work on knowledge representation over the last thirty-five years sends one clear message to the Semantic Web community: good data models are necessary, but not sufficient, for success. The ability to compute with the represented knowledge is the real test. In other words, ignore the inference engine problem at your peril.

It remains to be seen whether the Semantic Web will succeed as a knowledge representation system in its own right, or whether it can bring about the more ambitious dream of the emergence of a virtual world-wide knowledge base. What is clear is that its focus on extending the boundaries of knowledge bases beyond the site at which they are developed is new and important. So the clear message to the KR community is that any knowledge resource creation effort, particularly any one aimed at a large-scale and cooperatively developed knowledge resource, should look very carefully at the ideology and the technology of the Semantic Web.

5.   Acknowledgements

Although I'm fortunate to work with and learn from some of the proprietors of the Semantic Web, most notably Tim Berners-Lee and Dan Connolly, they're not in any way responsible for the opinions presented here. My colleague Michael Sperberg-McQueen first enumerated for me RDF's particular qualities in the way they are presented above, and got me started on thinking about what significance this might have.