An introduction to linked data

Let's start with some egg sucking

The internet

...provides a means to connect machines

The internet ≠ the web

World Wide Web

The web

...provides a means to connect documents

The web = the internet

The web = the internet + links

The web = the internet + links + documents

...or...

The web = the internet + http + html

Web standards

http://en.wikipedia.org/wiki/Web_standards#Common_usage

When a web site or web page is described as complying with web standards, it usually means that the site or page has valid or nearly valid HTML, CSS and JavaScript. The HTML should also meet accessibility and semantic guidelines.

We tend to obsess over the documents:

...at the expense of the links:

The trouble is...

...HTML has always been...

Everything that's good about the web comes from links

If you can point at something you can talk about it and share it

The web = the internet + http + html

Magazines are made of pages...

...websites are made of links

One problem with the web

We need to get from this...

a web of documents

...to this

a web of things

The other problem with the web...

...people can parse documents and extract meaning...

meaning

...but machines can't

no meaning

We need to help machines to understand the web...

...so machines can help us to understand things

The semantic web

Mk 1

RDF

The RDF data model

RDF

More examples

Triples combine to form a graph...

A graph of the relationships in the previous slide
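
A minimal sketch of how triples combine into a graph, assuming plain strings stand in for the URIs that real RDF would use. The triples and the predicate names (founded, released, recorded, signedTo) are illustrative and not taken from the slides.

```python
from collections import defaultdict

# Illustrative (subject, predicate, object) triples. Real RDF would use URIs.
triples = [
    ("Tony Wilson", "founded", "Factory Records"),
    ("Factory Records", "released", "Unknown Pleasures"),
    ("Joy Division", "recorded", "Unknown Pleasures"),
    ("Joy Division", "signedTo", "Factory Records"),
]


def build_graph(triples):
    """Index triples by subject: subject -> list of (predicate, object)."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph


def neighbours(graph, node):
    """Return the nodes reachable from `node` by following one edge."""
    return sorted({obj for _, obj in graph.get(node, [])})


graph = build_graph(triples)
print(neighbours(graph, "Joy Division"))
```

Each triple is one labelled edge. Because subjects and objects share names, separate triples join up into a single graph with no schema defined up front.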

...which is fundamentally different from the set-based approach of relational databases

A venn diagram of the relationships in the previous slide

Graphs are web-like, so easily expandable and scalable

An expanded graph

One night in Manchester - birth of Factory Records

An expanded graph

So what happened

Semweb mk 1 = the internet + http + rdf

REST

REST

If I ask for a document about Factory Benelux...

The key is

Content negotiation - what I want / what I accept

I'd like this resource about Factory Benelux, I speak English but I can just about get by in French and I'd like it for my mobile, please

Content negotiation - what I'm given

Can't do you English, but I've got French and can send it as XHTML-MP. Here you go

Not always successful

Content negotiation - what I want / what I accept

I'd like this resource about Factory Benelux, I speak English but I can just about get by in German and I'd like it for my mobile, please

Content negotiation - what I'm not given

I've got that resource but can only do French. 406
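
The two exchanges above can be sketched as a toy language negotiation, assuming a simplified Accept-Language parser that ignores wildcards and regional subtags such as en-GB. The function names parse_accept_language and negotiate are hypothetical, and a real server may choose to send a default representation rather than a 406.

```python
def parse_accept_language(header):
    """Return [(language, q)] sorted from most to least preferred."""
    prefs = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        lang, _, params = part.partition(";")
        params = params.strip()
        # A missing q parameter means full preference (q=1.0).
        q = float(params[2:]) if params.startswith("q=") else 1.0
        prefs.append((lang.strip().lower(), q))
    return sorted(prefs, key=lambda item: item[1], reverse=True)


def negotiate(header, available):
    """Pick the best available language, or answer 406 Not Acceptable."""
    for lang, q in parse_accept_language(header):
        if q > 0 and lang in available:
            return 200, lang
    return 406, None


# "I speak English but I can just about get by in French"
print(negotiate("en, fr;q=0.3", {"fr"}))
# "I speak English but I can just about get by in German"
print(negotiate("en, de;q=0.3", {"fr"}))
```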

Back to linked data

Linked data

Linked data = the internet + http + rdf

Linked data = web standards

Design issues for linked data

Use URIs as names for things (my emphasis)

The map is not the territory

Non-information resources

We want to be able to make different claims about the thing and the document about the thing

So we need URIs for non-information resources - stuff that you can't send down wires

What happens if someone asks for a non-information resource?

I'd like Factory Records, and by the way I speak English but I can just about get by in French

What happens if someone asks for a non-information resource?

Factory Records will not fit down the wires but (303) I can give you some information about it in English

Designing URIs for non-information resources

Slash URIs

Slash URIs in pictures: slash + 303 + conneg

slash uris

You need to be able to configure your server for 303s and content negotiation.
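
A rough sketch of the slash + 303 + conneg flow, assuming a hypothetical URI layout in which /id/... names the thing and /doc/... names documents about it. The respond function, the paths and the suffix table are all illustrative, and the Accept header is matched exactly rather than parsed with q-values.

```python
# Hypothetical mapping from media type to a document suffix.
REPRESENTATIONS = {
    "text/html": ".html",
    "application/rdf+xml": ".rdf",
}


def respond(path, accept="text/html"):
    """Return (status, headers) the way a toy linked data server might."""
    if path.startswith("/id/"):
        # The thing itself cannot be sent, so redirect to a document about it.
        return 303, {"Location": "/doc/" + path[len("/id/"):]}
    if path.startswith("/doc/"):
        suffix = REPRESENTATIONS.get(accept)
        if suffix is None:
            return 406, {}
        return 200, {
            "Content-Type": accept,
            "Content-Location": path + suffix,
            "Vary": "Accept",
        }
    return 404, {}


print(respond("/id/factory-records"))
print(respond("/doc/factory-records", "application/rdf+xml"))
```

The client makes two round trips: one to be told where the document lives, and one to fetch the representation it asked for.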

Hash URIs

Hash URIs in pictures: hash + conneg

hash uris

Cheaper setup - no need to set up for 303s, although you still need content negotiation. Fewer round trips to the server.
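
Hash URIs need no 303 because a client strips the fragment before making the request, so the server only ever sees the document URI. A small sketch using the standard library, with a hypothetical example.org URI:

```python
from urllib.parse import urldefrag

# Hypothetical hash URI naming the thing (not the document).
thing_uri = "http://example.org/factory-records#org"

# The part before '#' is what actually gets requested over HTTP.
document_uri, fragment = urldefrag(thing_uri)

print(document_uri)
print(fragment)
```

The thing and the document about it still have different URIs, so claims about each can be kept apart.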

RDFa URIs in pictures: hash only

RDFa uris

Cheapest setup - no need to set up for 303s or content negotiation.

So, what's the point?

Different people know (or claim to know) different things about the same topic

Linked data is a web-scale database
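
One way to picture this: if two publishers use the same URI for the same thing, their claims merge by simple set union. The sketch below assumes two hypothetical sources and made-up predicate names. Nothing here is a real dataset.

```python
# Hypothetical shared URI for the thing both sources describe.
FACTORY = "http://example.org/id/factory-records"

# Claims published independently by two different (hypothetical) sources.
source_a = {(FACTORY, "foundedIn", "Manchester")}
source_b = {(FACTORY, "released", "http://example.org/id/unknown-pleasures")}

# Merging graphs is just a union of their triples.
merged = source_a | source_b


def describe(graph, subject):
    """Return every (predicate, object) claimed about `subject`."""
    return sorted((p, o) for s, p, o in graph if s == subject)


for predicate, obj in describe(merged, FACTORY):
    print(predicate, obj)
```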

A special mention for owl:sameAs

owl:sameAs

So if we say...

When sameAs goes wrong

An example stolen from Tom Heath

When we declare sameAs we need to be careful
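
A sketch of why care is needed, assuming a naive consumer that "smushes" every pair of sameAs URIs into a single node. The ex: names, the predicates and the smush helper are hypothetical. The point is only that a careless sameAs between a thing and a document about it makes every claim about one apply to the other.

```python
def smush(triples, same_as):
    """Rewrite every URI to one representative per sameAs group."""
    canon = {}

    def find(uri):
        # Follow links until we reach the group's representative.
        while canon.get(uri, uri) != uri:
            uri = canon[uri]
        return uri

    for a, b in same_as:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            canon[root_b] = root_a
    return {(find(s), p, find(o)) for s, p, o in triples}


# Hypothetical claims: one about the record label, one about a web page.
triples = {
    ("ex:factory", "foundedIn", "1978"),
    ("ex:factory-page", "lastModified", "2009"),
}

# A careless statement that the thing and the page are the same resource.
merged = smush(triples, [("ex:factory", "ex:factory-page")])

# The record label now appears to have been "last modified" in 2009.
print(sorted(merged))
```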

When using sameAs you need to decide

Ceci n'est pas une pipe...

...and this is not Hamlet

Photo of book cover of Hamlet

It is...

Once you've minted a URI for a non-information resource

Linked data can describe anything

There are vocabularies available for

And if an ontology doesn't exist

Time for Turtle and SPARQL