Bart learns SPARQL

Turtle is an RDF serialization

Serialization formats

in addition to the XML serialization of RDF there are several other options

generally better options although RDF/XML is sometimes better supported

all of these options are based on Notation 3 (N3), which actually extends past the expressiveness of RDF

N-Triples ⊂ Turtle ⊂ N3

what we've seen so far has all been Turtle (Terse RDF Triple Language), and we'll pretty much stick with that

Back to Bill Evans


Simple Turtle example

Let's revisit Bill Evans

<http://dbpedia.org/resource/Bill_Evans>

<http://dbpedia.org/ontology/genre>

<http://dbpedia.org/resource/Modal_jazz> .

this is valid Turtle (and N-Triples and N3)

again, this triple states "Bill Evans has genre modal jazz"

simple! but these URIs are annoyingly long :-(

Let's add some prefixes

We can use the @prefix keyword to create some shortcuts


@prefix dbpedia: <http://dbpedia.org/resource/> .
@prefix dbpedia-owl: <http://dbpedia.org/ontology/> .
		
dbpedia:Bill_Evans dbpedia-owl:genre dbpedia:Modal_jazz .
	

notice we remove the < and > characters surrounding the URIs

URIs in this form are compact URIs or CURIEs
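To make the shortcut concrete, here is a minimal Python sketch of what a Turtle parser does with a prefixed name: it looks up the prefix and glues the namespace URI onto the local part. The `PREFIXES` table and the `expand` helper are illustrative names, not part of any library.

```python
# Prefix table mirroring the two @prefix lines from the slide.
PREFIXES = {
    "dbpedia": "http://dbpedia.org/resource/",
    "dbpedia-owl": "http://dbpedia.org/ontology/",
}

def expand(name, prefixes=PREFIXES):
    """Expand a prefixed name such as 'dbpedia:Bill_Evans' to a full URI."""
    prefix, local = name.split(":", 1)
    return prefixes[prefix] + local

triple = ("dbpedia:Bill_Evans", "dbpedia-owl:genre", "dbpedia:Modal_jazz")
full = tuple(expand(term) for term in triple)
for uri in full:
    print("<" + uri + ">")
```

The expanded URIs are the same three long URIs from the first Bill Evans example.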

More on prefixes

Note the name of the prefix is arbitrary and we define it locally in our serialization

we could use any prefix we want


@prefix foo: <http://dbpedia.org/resource/> .
@prefix bar: <http://dbpedia.org/ontology/> .

foo:Bill_Evans bar:genre foo:Modal_jazz .
	

and the meaning is exactly the same

however...

Even more on prefixes

There are some vocabularies and namespaces that are commonly associated with a particular prefix by convention


@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix mo: <http://purl.org/ontology/mo/> .

WARNING: such prefixes must still be specified locally

try using http://prefix.cc to look up prefixes

Back to Bill


@prefix dbpedia: <http://dbpedia.org/resource/> .
@prefix dbpo: <http://dbpedia.org/ontology/> .
@prefix mo: <http://purl.org/ontology/mo/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

dbpedia:Bill_Evans dbpo:genre dbpedia:Modal_jazz .
dbpedia:Bill_Evans dbpo:birthName "William John Evans"@en .
dbpedia:Bill_Evans dbpo:birthdate "1929-08-16"^^xsd:date .
dbpedia:Bill_Evans rdf:type mo:MusicArtist .
dbpedia:Bill_Evans rdf:type foaf:Person .

Notice we provide his birthName and birthdate as literals

Some more literals

Datatypes for literals are taken from XML Schema


	"this is a basic string"
	"this is in English"@en
	"und auf Deutsch"@de
	"3.14"^^xsd:float
	"2009-10-26T21:32:52"^^xsd:dateTime
	
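As a rough illustration of what the datatype buys us, the sketch below maps a literal's lexical form and datatype URI to a native Python value. The `CONVERTERS` table and `to_python` helper are hypothetical names, and only a handful of XML Schema datatypes are covered.

```python
from datetime import date, datetime

XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical converter table; real toolkits cover many more datatypes.
CONVERTERS = {
    XSD + "float": float,
    XSD + "integer": int,
    XSD + "date": date.fromisoformat,
    XSD + "dateTime": datetime.fromisoformat,
}

def to_python(lexical, datatype=None):
    """Convert a literal's lexical form to a Python value.

    Literals without a datatype (plain or language-tagged strings)
    and literals with an unknown datatype are returned as str.
    """
    convert = CONVERTERS.get(datatype, str)
    return convert(lexical)

print(to_python("3.14", XSD + "float"))
print(to_python("2009-10-26T21:32:52", XSD + "dateTime"))
print(to_python("this is a basic string"))
```

Language tags such as @en are ignored here; a fuller version would keep the tag alongside the string.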

Bill Evans final ttl


@prefix dbpedia: <http://dbpedia.org/resource/> .
@prefix dbpo: <http://dbpedia.org/ontology/> .
@prefix mo: <http://purl.org/ontology/mo/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

dbpedia:Bill_Evans dbpo:genre dbpedia:Modal_jazz ;
  dbpo:birthName "William John Evans"@en ;
  dbpo:birthdate "1929-08-16"^^xsd:date ;
  a mo:MusicArtist, foaf:Person .

  • rdf:type can be replaced by a
  • we use the ; to chain multiple predicate-object pairs to one subject
  • we use the , to chain multiple objects to one subject-predicate pair
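The three shorthands are pure syntax: the abbreviated description still stands for five separate triples. A small Python sketch, assuming a nested dictionary as a stand-in for the `;` and `,` grouping (the `description` structure and `expand_triples` helper are made up for illustration), shows the expansion:

```python
RDF_TYPE = "rdf:type"

# subject -> predicate -> list of objects, the same shape that
# ';' (more predicates) and ',' (more objects) express in Turtle.
description = {
    "dbpedia:Bill_Evans": {
        "dbpo:genre": ["dbpedia:Modal_jazz"],
        "dbpo:birthName": ['"William John Evans"@en'],
        "dbpo:birthdate": ['"1929-08-16"^^xsd:date'],
        "a": ["mo:MusicArtist", "foaf:Person"],
    }
}

def expand_triples(description):
    """Flatten the nested description into full (s, p, o) triples."""
    triples = []
    for subject, predicates in description.items():
        for predicate, objects in predicates.items():
            if predicate == "a":  # 'a' is shorthand for rdf:type
                predicate = RDF_TYPE
            for obj in objects:
                triples.append((subject, predicate, obj))
    return triples

triples = expand_triples(description)
for s, p, o in triples:
    print(s, p, o, ".")
```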

wait a minute,

What is DBpedia?

DBpedia

from dbpedia.org

DBpedia is a community effort to extract structured information from Wikipedia and to make this information available on the Web. DBpedia allows you to ask sophisticated queries against Wikipedia, and to link other data sets on the Web to Wikipedia data.


Time to shine with SPARQL

Let's get to SPARQL

SPARQL is a really exciting query language for RDF

  • pronounced "sparkle"
  • SPARQL Protocol and RDF Query Language
  • On 15 January 2008, SPARQL became an official W3C Recommendation
  • allows one to perform complex joins of disparate databases in a single, simple query
  • query for triple patterns using conjunction, disjunction, and optional patterns

SPARQL implementations

There are SPARQL implementations for most popular programming languages

  • Redland's Rasqal (C with bindings for Perl, Python, Ruby, PHP and others)
  • Jena's ARQ (Java)
  • RDFLib (Python)
  • and many others

SPARQL store implementations

Even better, there are several open-source, scalable RDF stores that support SPARQL

A simple SPARQL query example

SPARQL syntax is a relatively intuitive mashup of traditional SQL and Turtle

		
SELECT ?foo ?bar 
  WHERE { <http://dbpedia.org/resource/Bill_Evans> ?foo ?bar . }

this returns the predicate and object of every triple in the store that has Bill Evans as its subject

terms beginning with ? are variables; each result binds them to values that make the pattern match a triple in the store
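To see what matching a triple pattern means, here is a toy in-memory matcher in Python. It is only a sketch of the idea, not how a real SPARQL engine is implemented; the `TRIPLES` list is made-up sample data (not fetched from DBpedia) and `match` is a hypothetical helper.

```python
# Made-up sample data using prefixed names as plain strings.
TRIPLES = [
    ("dbpedia:Bill_Evans", "dbpo:genre", "dbpedia:Modal_jazz"),
    ("dbpedia:Bill_Evans", "rdf:type", "mo:MusicArtist"),
    ("dbpedia:Miles_Davis", "dbpo:genre", "dbpedia:Modal_jazz"),
]

def match(pattern, triples):
    """Yield one dict of variable bindings per triple matching the pattern.

    Terms starting with '?' are variables; all other terms must be equal
    to the corresponding term of the triple.
    """
    for triple in triples:
        bindings = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice must bind to the same value.
                if bindings.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield bindings

results = list(match(("dbpedia:Bill_Evans", "?foo", "?bar"), TRIPLES))
for row in results:
    print(row["?foo"], row["?bar"])
```

Each result row corresponds to one way of filling in ?foo and ?bar so that the pattern matches a stored triple.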

let's try it out at http://dbpedia.org/snorql

SPARQL - basic syntax

  • Prefix declarations, for abbreviating URIs
  • A result clause, identifying what information to return from the query
  • Dataset definition, stating what RDF graph(s) are being queried
  • The query pattern, specifying what to query for in the underlying dataset
  • Query modifiers, slicing, ordering, and otherwise rearranging query results

SPARQL - basic syntax


# prefix declarations
PREFIX foo: <http://example.com/resources/>
...
# result clause
SELECT ...
# dataset definition
FROM ...
# query pattern
WHERE {
    ...
}
# query modifiers
ORDER BY ...
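The parts of a query can also be assembled programmatically. The Python sketch below uses a hypothetical `build_query` helper (not from any library) that joins the parts in the order the SPARQL grammar expects, with the result clause before the dataset definition. It does no validation or escaping, so treat it as an illustration only.

```python
def build_query(select, where, prefixes=None, from_graph=None,
                order_by=None, limit=None):
    """Assemble a SPARQL SELECT query string from its parts."""
    lines = []
    # prefix declarations
    for name, uri in (prefixes or {}).items():
        lines.append(f"PREFIX {name}: <{uri}>")
    # result clause
    lines.append(f"SELECT {select}")
    # dataset definition (optional)
    if from_graph:
        lines.append(f"FROM <{from_graph}>")
    # query pattern
    lines.append("WHERE {")
    lines.append(f"  {where}")
    lines.append("}")
    # query modifiers (optional)
    if order_by:
        lines.append(f"ORDER BY {order_by}")
    if limit is not None:
        lines.append(f"LIMIT {limit}")
    return "\n".join(lines)

query = build_query(
    select="?p ?o",
    where="dbpedia:Bill_Evans ?p ?o .",
    prefixes={"dbpedia": "http://dbpedia.org/resource/"},
    order_by="?p",
    limit=10,
)
print(query)
```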

SPARQL query example

as in Turtle we can use the keyword PREFIX


PREFIX dbpedia: <http://dbpedia.org/resource/>
SELECT ?p ?o 
  WHERE { dbpedia:Bill_Evans ?p ?o . }

notice that, unlike Turtle's @prefix, the SPARQL PREFIX keyword takes no leading @ and no trailing .

Challenge

Can you find all the jazz artists from New Jersey who play piano?

Photo by http://www.flickr.com/photos/mcsimon/

Challenge answer

Note there is more than one way to get this answer


	PREFIX dbpo: <http://dbpedia.org/ontology/>
	PREFIX dbpp: <http://dbpedia.org/property/>
	PREFIX : <http://dbpedia.org/resource/>


	SELECT *
	WHERE { 
	?artist a dbpo:MusicalArtist ;
	    dbpo:genre :Jazz ;
	    dbpp:instrument :Piano ;
	    dbpo:homeTown :New_Jersey.
	}
	

A quick note on Python

We can make SPARQL queries against endpoints in Python with SPARQLWrapper


	$ pip install SPARQLWrapper
	

a quick example of syntax


from SPARQLWrapper import SPARQLWrapper2

# SPARQLWrapper2 wraps the results so each bound term exposes .value
sparql = SPARQLWrapper2('http://dbpedia.org/sparql')
query = 'SELECT * WHERE { ?s ?p ?o } LIMIT 20'
sparql.setQuery(query)
response = sparql.query()
# one binding per result row, keyed by variable name
for binding in response.bindings:
    print(binding['s'].value)
    print(binding['p'].value)
    print(binding['o'].value)
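Under the hood, a SPARQL endpoint is just HTTP: the query travels in a `query` parameter and the Accept header asks for a result format. The sketch below builds such a request with the Python standard library only. The `build_request` helper is a hypothetical name, and the commented-out part needs network access and assumes the endpoint returns the standard SPARQL JSON results format.

```python
from urllib.parse import urlencode, urlsplit, parse_qs
from urllib.request import Request

ENDPOINT = "http://dbpedia.org/sparql"

def build_request(query, endpoint=ENDPOINT):
    """Build (but do not send) an HTTP GET request for a SPARQL query."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})

request = build_request("SELECT * WHERE { ?s ?p ?o } LIMIT 20")
print(request.full_url)

# To actually run it (requires network access):
#   import json
#   from urllib.request import urlopen
#   with urlopen(request) as response:
#       data = json.load(response)
#   for row in data["results"]["bindings"]:
#       print(row["s"]["value"], row["p"]["value"], row["o"]["value"])
```

SPARQLWrapper takes care of these details, which is why the wrapper version is shorter.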

Detailed examples

There are detailed examples of how to use SPARQLWrapper in the GitHub repository

Using SPARQLWrapper and RDFLib to build a Jamendo artist recommender:

http://github.com/kurtjx/MatWoD/blob/master/examples/scripts/recommend_by_location/

Also some examples using Ruby

"Find all music artists from Detroit":

http://github.com/kurtjx/MatWoD/blob/master/examples/scripts/artists_based_in_detroit/

see your handout or the website for a listing of more endpoints

Warning!

In your handout - there are two example queries

the second query uses blank nodes [] and a UNION statement to select all the properties and classes contained in an endpoint

while this is a valid query, it is very expensive and will cause most endpoints to choke; it is better to break it into two separate queries with limits


	SELECT DISTINCT ?type 
	WHERE { [] a ?type . } LIMIT 10	
	

	SELECT DISTINCT ?prop
	WHERE { [] ?prop [] . } LIMIT 10
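If ten results are not enough, one gentle way to explore further is to page through the results in small windows. This Python sketch uses a hypothetical `paged` helper that appends LIMIT and OFFSET to a base query. Note the assumption: without an ORDER BY, SPARQL does not guarantee a stable order, so pages may overlap or skip results on some endpoints.

```python
def paged(query, page_size, pages):
    """Return one query string per page, each with its own LIMIT/OFFSET window."""
    return [
        f"{query} LIMIT {page_size} OFFSET {page * page_size}"
        for page in range(pages)
    ]

# The two exploratory queries from the slide, without their LIMIT.
TYPES = "SELECT DISTINCT ?type WHERE { [] a ?type . }"
PROPS = "SELECT DISTINCT ?prop WHERE { [] ?prop [] . }"

for q in paged(TYPES, page_size=10, pages=3):
    print(q)
```

Each generated string can be sent to the endpoint on its own, pausing between requests if needed.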
	

with SPARQL

all of Linked Data becomes one unified Web API

photo credits

Bart SPARQL photo from Bob DuCharme's blog

New Jersey skyline photo by http://www.flickr.com/photos/mcsimon/

Now Yves will discuss

the Music Ontology