I owe much of what I understand about the Web to Tim Berners-Lee, along with Dan Connolly, Harry Halpin and Larry Masinter
Brian Cantwell Smith has hugely influenced my approach to all things computational
Jonathan and Jeni, my co-authors, haven't vetted these slides, and so can't be held responsible for anything contained herein with which you disagree
The URIs this discussion focusses on are limited:
The scheme is http:
A GET request for the URI will result in an HTTP 200 response
Restricted, but a very substantial part of the usage of URIs in data
And the only HTTP operation we address is GET
By considering the use of URIs in data, we can exemplify two common usage patterns which reveal more-or-less covert allegiance to two distinct extensions
Sometimes information in data involving a URI is evidently about what you can retrieve from that URI:
{ "@id": "http://www.w3.org/People/Berners-Lee/",
"last modified": "2012-06-08" }
{ "@id": "http://www.websci13.org/files/2013/04/ht-317x370.jpg",
"resolution": "72x72"}
Sometimes information in data involving a URI is evidently about what is described/depicted by what you can retrieve from that URI:
{ "@id": "http://www.w3.org/People/Berners-Lee/",
"birthday": "1955-06-08" }
{ "@id": "http://www.websci13.org/files/2013/04/ht-317x370.jpg",
"surname": "Thompson"}
The key word here is 'evidently' -- it's evident to us, but not, at least not without help, to computers
The value proposition of the Web:
What happens if data based on two different extensions are aggregated?
{ "@id": "http://www.w3.org/People/Berners-Lee/",
"birthday": "1955-06-08",
"last modified": "2012-06-08" }
At best this is confused, at worst potentially destructive of reliable inference
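A minimal sketch of how such an aggregate can arise, assuming a naive merge keyed on "@id". The function name `aggregate` and the two source records are illustrative assumptions, not part of any particular toolkit:

```python
from collections import defaultdict


def aggregate(records):
    """Naively merge records that share an "@id" into one dictionary per URI.

    Hypothetical helper: it assumes that every property attached to a URI
    describes the same thing, which is exactly the assumption at issue.
    """
    merged = defaultdict(dict)
    for record in records:
        uri = record["@id"]
        for key, value in record.items():
            merged[uri][key] = value
    return dict(merged)


# One source uses the URI for the person described by the page ...
about_person = {"@id": "http://www.w3.org/People/Berners-Lee/",
                "birthday": "1955-06-08"}
# ... another uses the same URI for the page itself.
about_page = {"@id": "http://www.w3.org/People/Berners-Lee/",
              "last modified": "2012-06-08"}

combined = aggregate([about_person, about_page])
print(combined["http://www.w3.org/People/Berners-Lee/"])
```

Nothing in the merged record says which property belongs to the page and which to the person, so a consumer has no basis for keeping the two apart.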
First, some terminology around the case where data containing URIs is about whatever is described/depicted by what you can retrieve from them
So for example http://www.w3.org/People/Berners-Lee/ is a proxy page for Berners-Lee and http://www.w3.org/2011/05/w3cteam.html is a landing page for http://www.w3.org/2011/05/team-photo.jpg
The image and the descriptions are all retrievables
How do we (even we humans) understand data involving the URIs of landing pages?
For example
{ "@id": "http://www.w3.org/2011/05/w3cteam.html",
"last modified": "2012-06-08" }
Does the information here apply to the landing page, or to the image?
This is not a hypothetical: metadata found on e.g. Flickr landing pages is often, but not always, about the image linked from that page, not about the page itself
Consider what we have seen in terms of relationships:
last_modified
birthday
Machine-readable documentation of properties as to whether they are immediate or shorthand would enable interoperability
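One way such documentation might be consumed, sketched under stated assumptions: the table `PROPERTY_KIND` and the function `split_by_subject` are hypothetical, and the classification of each property simply follows the examples earlier in this talk rather than any published vocabulary:

```python
# Hypothetical property documentation: "immediate" properties are about what
# you can retrieve from the URI, "shorthand" properties are about what is
# described/depicted by what you can retrieve.
PROPERTY_KIND = {
    "last modified": "immediate",
    "resolution": "immediate",
    "birthday": "shorthand",
    "surname": "shorthand",
}


def split_by_subject(record):
    """Separate a record's properties according to what they are about.

    Returns three dictionaries: properties of the retrievable, properties of
    what it describes/depicts, and properties with no documentation.
    """
    about_retrievable, about_described, undocumented = {}, {}, {}
    for key, value in record.items():
        if key == "@id":
            continue
        kind = PROPERTY_KIND.get(key)
        if kind == "immediate":
            about_retrievable[key] = value
        elif kind == "shorthand":
            about_described[key] = value
        else:
            undocumented[key] = value
    return about_retrievable, about_described, undocumented


aggregated = {"@id": "http://www.w3.org/People/Berners-Lee/",
              "birthday": "1955-06-08",
              "last modified": "2012-06-08"}

page_facts, subject_facts, unknown = split_by_subject(aggregated)
print(page_facts)
print(subject_facts)
```

With the properties documented, the aggregated record from earlier can be separated again: the modification date goes with the page and the birthday with the person it describes.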
This talk has been an attempt at a proof-of-concept
And an introduction to an analysis of usage which provides the basis for an approach to interoperability based on documentation of properties