Limit a SPARQL query to one dataset - sparql

I'm working with the following SPARQL query, which is an example on the web-based front end of my institution's SPARQL endpoint:
SELECT ?building_number ?name ?occupants WHERE {
?site a org:Site ;
rdfs:label "Highfield Campus" .
?building spacerel:within ?site ;
skos:notation ?building_number ;
rdfs:label ?name .
OPTIONAL {
?building soton:buildingOccupants ?occ .
?occ rdfs:label ?occupants .
} .
} ORDER BY ?name
The problem is that as well as getting data from 'Buildings and Places', the Dataset I'm interested in, and would expect the example to use, it also gets data from the 'Facilities and Equipment' dataset, which isn't relevant. You should see this if you follow the link.
I suspect the example may pre-date the addition of the Facilities and Equipment dataset, but even with the research I've done into SPARQL, I can't see a clear way to define which datasets to include.
Can anyone recommend a starting point to limit it to just show 'Buildings' or, more specifically, results from the 'Buildings and Places' dataset?
Thanks

First things first, you really need to use SELECT DISTINCT, as otherwise you'll get repeated results.
To answer your question, you can use GRAPH { ... } to restrict certain parts of a SPARQL query to only match data from a specific dataset. This only works if the SPARQL endpoint is divided up into named graphs (this one is). The solution you asked for isn't the best choice, as it assumes that things within sites in the 'places' dataset will always be restricted to buildings. That's risky, as the dataset might end up containing trees and signposts at some time in the future.
Step one is to just find out what graphs are in play:
SELECT DISTINCT ?g1 ?building_number ?name ?occupants WHERE {
?site a org:Site ;
rdfs:label "Highfield Campus" .
GRAPH ?g1 { ?building spacerel:within ?site ;
skos:notation ?building_number ;
rdfs:label ?name .
}
OPTIONAL {
?building soton:buildingOccupants ?occ .
?occ rdfs:label ?occupants .
} .
} ORDER BY ?name
Try it here: http://is.gd/WdRAGX
From this you can see that http://id.southampton.ac.uk/dataset/places/latest and http://id.southampton.ac.uk/dataset/places/facilities are the two relevant ones.
To only look for things 'within' a site according to the "places" graph, use:
SELECT DISTINCT ?building_number ?name ?occupants WHERE {
?site a org:Site ;
rdfs:label "Highfield Campus" .
GRAPH <http://id.southampton.ac.uk/dataset/places/latest> {
?building spacerel:within ?site ;
skos:notation ?building_number ;
rdfs:label ?name .
}
OPTIONAL {
?building soton:buildingOccupants ?occ .
?occ rdfs:label ?occupants .
} .
} ORDER BY ?name
Alternate solutions:
Using rdf:type
Above I've answered your question, but that isn't the answer to your problem. The following solution is more semantic, as it actually says 'only give me buildings within the campus', which is what you really mean.
Instead of filtering by graph, which is not very 'semantic', you could restrict ?building to be of class 'building', which research facilities are not. Facilities are still sometimes listed as 'within' a site, usually when the university has only published which campus they are on but not which building.
?building a rooms:Building
Using FILTER
In extreme cases you may not have data in different GRAPHS and there may not be an elegant relationship to use to filter your results. In this case you can use a FILTER and turn the building URI into a string and use a regular expression to match acceptable ones:
FILTER regex(str(?building), "^http://id.southampton.ac.uk/building/")
This is by far the worst option; don't use it unless you have to.
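If you do end up filtering on the URI and the endpoint supports SPARQL 1.1, a plain prefix test avoids the regular expression engine. In the regex form the unescaped dots in the host name match any character, which a prefix test sidesteps. A sketch only, assuming the same building URI prefix as above and that the prefixes (org:, spacerel:, skos:) are predefined by the endpoint, as in the other queries here:

```sparql
SELECT DISTINCT ?building_number ?name WHERE {
  ?site a org:Site ;
        rdfs:label "Highfield Campus" .
  ?building spacerel:within ?site ;
            skos:notation ?building_number ;
            rdfs:label ?name .
  # Keep only URIs that begin with the building namespace (SPARQL 1.1 function)
  FILTER STRSTARTS(STR(?building), "http://id.southampton.ac.uk/building/")
} ORDER BY ?name
```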
Belt and Braces
You can use any of these restrictions together. Restricting the GRAPH and also ensuring that every ?building really is a building would be my recommended solution.
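As a rough sketch of that combination, assuming the rooms: prefix is predefined on the endpoint like the others used here, and that the rdf:type triples are visible outside the 'places' graph (if they are not, move that pattern inside the GRAPH block):

```sparql
SELECT DISTINCT ?building_number ?name ?occupants WHERE {
  ?site a org:Site ;
        rdfs:label "Highfield Campus" .
  # Only accept 'within' relationships asserted by the places dataset
  GRAPH <http://id.southampton.ac.uk/dataset/places/latest> {
    ?building spacerel:within ?site ;
              skos:notation ?building_number ;
              rdfs:label ?name .
  }
  # ...and only things that are explicitly typed as buildings
  ?building a rooms:Building .
  OPTIONAL {
    ?building soton:buildingOccupants ?occ .
    ?occ rdfs:label ?occupants .
  }
} ORDER BY ?name
```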

Related

Getting all articles with "Point-in time" property in SPARQL

I would like to query all wikipedia articles that have a property P585 (point in time). Unfortunately, some of these are very obscure and are under sub-sub properties, like on the picture. I would like to be able to filter for these dates, no matter under what property they are.
The query: "?item p:P585/ps:P585 ?date. " only gives back results where point in time is the root category, and no matter what I try, I can't get all articles that have "point in time" somewhere. Or if not possible, the very least I would like to be able to specify "significant event/*/point in time"...
Thank you!
The issue is that you want 'point in time' to be a qualifying property as opposed to the property of a statement as you currently wrote. Let me clarify what I mean by considering the following query about Q76 (Barack Obama) and P26 (spouse):
SELECT *
WHERE {
wd:Q76 p:P26 ?statement .
?statement ?p ?o
}
ps:P26 tells us who the object of the statement is, i.e. Michelle Obama.
Note: Often you will find that for every ?a p:Px/ps:Px ?b there is ?a wdt:Px ?b (this depends on the rank of the statement; e.g. there are two statements about Obama's place of birth, one listing it as Hawaii, the other as Kenya, but only the statement that lists Hawaii is ranked high enough to warrant a direct link with wdt).
However if we want to find the qualifying properties about the statement (e.g. when the Obamas were married, where etc), then we need a different namespace, in this case pq, where 'q' stands for qualifier.
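To make the ps/pq split concrete, here is a small sketch for the same Obama example. The qualifier P580 (start time) is my own choice for illustration and is not taken from the question; the prefixes are assumed to be predefined, as they are on the Wikidata query service:

```sparql
SELECT ?spouse ?start
WHERE {
  wd:Q76 p:P26 ?statement .                  # one statement node per spouse claim
  ?statement ps:P26 ?spouse .                # main value of the statement
  OPTIONAL { ?statement pq:P580 ?start . }   # qualifier on that statement, if present
}
```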
How does this relate to your example then?
In your example, the 'main' property you are looking for is 'significant event', so P793, and the qualifier is P585.
Thus your query should be like this:
SELECT *
WHERE {
?subject p:P793 ?statement .
?statement pq:P585 ?date .
}
Or for short:
SELECT *
WHERE {
?subject p:P793/pq:P585 ?date .
}
If you are further interested in the object of the significant event (in your example this is 'rocket launch' Q797476), then you may specify this like so:
SELECT *
WHERE {
?subject p:P793 ?statement .
?statement pq:P585 ?date ;
ps:P793 wd:Q797476 .
}
Notice the role that the namespaces and the property numbers play.
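If you really want every statement that carries a 'point in time' qualifier, whatever the main property is, one possible sketch is to leave the main property as a variable. This is an untested assumption about what the question is after, and an unrestricted pattern like this can be expensive, hence the LIMIT:

```sparql
SELECT ?subject ?p ?date
WHERE {
  ?subject ?p ?statement .       # ?p is left open, so any main property can match
  ?statement pq:P585 ?date .     # the statement is qualified with 'point in time'
}
LIMIT 100
```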

Retrieving the wider dbpedia vocabulary for tagging pictures

I'm trying to develop a tool in JS for tagging pictures, so I need a set of possible "things" from DBpedia. I already tried to retrieve them this way:
select ?s ?l {
?s a owl:Class .
?s rdf:type ?l
FILTER regex(str(?s), "House", "i").
}
http://dbpedia.org/snorql/?query=select+%3Fs+%3Fl+%7B%0D%0A+++%3Fs+a+owl%3AClass+.%0D%0A+++%3Fs+rdf%3Atype+%3Fl%0D%0A+++FILTER+regex%28str%28%3Fs%29%2C+%22House%22%2C+%22i%22%29.%0D%0A%7D
And also this way:
select ?label
WHERE {
?concept a skos:Concept.
?concept skos:prefLabel ?label.
FILTER regex(str(?label), "^House", "i").
}
http://dbpedia.org/snorql/?query=select+%3Flabel+%0D%0AWHERE+%7B%0D%0A++%3Fconcept+a+skos%3AConcept.%0D%0A++%3Fconcept+skos%3AprefLabel+%3Flabel.%0D%0A++FILTER+regex%28str%28%3Flabel%29%2C+%22%5EHouse%22%2C+%22i%22%29.%0D%0A%7D
In the first case, I only get "instances" of the house "thing", but not the "House" class itself. In the second one, I never retrieve "house"; the closest match is "houses". Is there an alternative for retrieving a better vocabulary based on the DBpedia dataset?
If you don't need to restrict yourself to owl:Thing or to skos:Concept, you can just get things that have a label that contains "house". Rather than using regex, I chose to use contains and lcase, since a string containment test could be less expensive than invoking a full regular expression processor.
select ?thing ?label where {
?thing rdfs:label ?label .
filter contains(lcase(?label), "house")
}
SPARQL results (limited to 200)
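If you do want to stay within classes, the same containment test can be combined with the type restriction from the question. A sketch, assuming the classes on the endpoint carry rdfs:label values and that English labels are enough for tagging:

```sparql
select ?class ?label where {
  ?class a owl:Class ;
         rdfs:label ?label .
  # English labels only, case-insensitive substring match
  filter ( langMatches(lang(?label), "en") && contains(lcase(?label), "house") )
}
```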

DBpedia Sparql by page template

I am trying to run a query on DBpedia using SPARQL to look for all pages that use a certain template. It doesn't seem to work; I am looking for all pages with dbpprop:wikiPageUsesTemplate. Does anyone know how to correct this to properly look for templates?
SELECT ?name ?member_Of ?country ?lat ?lng ?link
WHERE {
?x dbpprop:wikiPageUsesTemplate "dbpedia:Template:Infobox_settlement" .
?x a <http://dbpedia.org/ontology/Settlement> .
?x foaf:name ?name .
?x dbpedia-owl:isPartOf ?member_Of.
?x dbpedia-owl:country ?country.
?x geo:lat ?lat .
?x geo:long ?lng .
?x foaf:isPrimaryTopicOf ?link .
}
LIMIT 2500 OFFSET 0
I've also attempted to run it with just the dbpprop property, to no avail.
SELECT * WHERE { ?page dbpprop:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:Infobox_settlement> . ?page dbpedia:name ?name .}
If anyone is trying to do a similar thing, it is also possible via the wiki API, where you can paginate over all results. http://en.wikipedia.org/w/api.php?action=query&list=embeddedin&eititle=Template:Infobox_settlement
There are at least two problems: (i) you need to use IRIs in places, and not strings; and (ii) you need to use properties that DBpedia uses.
Use IRIs
In
?x a <http://dbpedia.org/ontology/Settlement> .
and
?x foaf:isPrimaryTopicOf ?link .
you've demonstrated that you know that URIs need to be written in full with < and >, or abbreviated with a prefix. However,
?x dbpprop:wikiPageUsesTemplate "dbpedia:Template:Infobox_settlement" .
certainly isn't going to work. It's legal SPARQL, because a string can be the object of a triple, but you almost certainly want an IRI.
Use DBpedia's vocabulary
A query with dbpprop:wikiPageUsesTemplate like this returns no results:
select distinct ?template where {
[] dbpprop:wikiPageUsesTemplate ?template
}
That's easy enough to check, and quickly confirms that there's no data that can possibly match your query. Where did you find this property? Have you seen it used somewhere? I'm not confident that you can query DBpedia based on infobox templates. DBpedia is not the same as Wikipedia, and even if the Wikipedia API supports it, it doesn't mean that DBpedia has a counterpart. There is a note on DBpedia Data Set Properties that says:
http://xx.dbpedia.org/property/wikiPageUsesTemplate (will be changed to http://dbpedia.org/ontology/wikiPageUsesTemplate)
but that latter property doesn't seem to be in use on the endpoints either. See my answer to Syntax for Sparql query for pages with specific infobox for more details.

How to retrieve abstract for a DBpedia resource?

I need to find all DBpedia categories and articles whose abstracts include a specific word.
I know how to write a SPARQL query that queries the label like the following:
SELECT ?uri ?txt WHERE {
?uri rdfs:label ?txt .
?txt bif:contains "Machine" .
}
but I have not figured out yet how to search the abstract.
I've tried with the following but it seems not to be correct.
SELECT ?uri ?txt WHERE {
?uri owl:abstract ?txt .
?txt bif:contains "Machine" .
}
How can I retrieve the abstract in order to query its text?
Since you already know how to search a string for text content, this question is really about how to get the abstract. If you retrieve any DBpedia resource in a web browser, e.g., http://dbpedia.org/resource/Mount_Monadnock (which will redirect to http://dbpedia.org/page/Mount_Monadnock), you can see the triples of which it's a subject or object. In this case, you'll see that the property is dbpedia-owl:abstract. Thus you can do things like
select * where {
?s dbpedia-owl:abstract ?abstract .
?abstract bif:contains "Monadnock" .
filter langMatches(lang(?abstract),"en")
}
limit 10
Instead of visiting the page for the resource, which not all endpoints will support, you could have simply retrieved all the triples for the subject, and looked at which ones relate it to its abstract. Since you know the abstract is a literal, you could even restrict it to triples where the object is a literal, and perhaps with a language that you want. E.g.,
select ?p ?o where {
dbpedia:Mount_Monadnock ?p ?o .
filter ( isLiteral(?o) && langMatches(lang(?o),'en') )
}
This also clearly shows that the property you want is http://dbpedia.org/ontology/abstract. When you have a live query interface that you can use to pull down arbitrary data, it's very easy to find out what parts of the data you want. Just pull down more than you want at first, and then refine to get just what you want.

Get nationality of person using DBPedia and SPARQL

I have the following SPARQL query:
SELECT ?nationalityLabel WHERE {
dbpedia:Henrik_Ibsen dbpedia-owl:nationality ?nationality .
?nationality rdfs:label ?nationalityLabel .
}
I have checked that Henrik Ibsen exists and that he has the nationality ontology/property on him:
http://dbpedia.org/page/Henrik_Ibsen
And this is an ontology:
http://dbpedia.org/ontology/nationality
A very similar query to this listed here works:
https://stackoverflow.com/a/10248653/1680130
The problem I have is that the query doesn't return any result.
If I could get help solving this it would be great.
Summarized solution:
Both answers were great, so I upvoted both, but I settled on Joshua's in the end because it points out that dbpedia-owl is cleaner. The optimal solution, in my opinion:
First check with dbpedia-owl for birth-place:
select ?label {
dbpedia:Henrik_Ibsen
dbpedia-owl:birthPlace
[ a dbpedia-owl:Country ;
rdfs:label ?label ]
filter langMatches(lang(?label),"en")
}
If found then get the demonym:
select ?label {
dbpedia:Norway dbpedia-owl:demonym ?label
filter langMatches(lang(?label),"en")
}
If above fails then do the "dirty" query:
SELECT
?nationality
WHERE {
dbpedia:Henrik_Ibsen dbpprop:nationality ?nationality .
filter langMatches(lang(?nationality),"en")
}
Of course, if "dirty" only means that the data is correct but less often present, the order might be better the other way around, because people can be born in one country but be nationals of another.
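The three steps above could also be folded into one query. This is only a sketch, assuming the endpoint supports SPARQL 1.1 (BIND and COALESCE) and the same prefixes as the queries above; ?raw is a variable name I made up for the fallback value:

```sparql
select ?nationality {
  # Preferred: English demonym of the country of birth (dbpedia-owl data)
  optional {
    dbpedia:Henrik_Ibsen dbpedia-owl:birthPlace ?country .
    ?country a dbpedia-owl:Country ;
             dbpedia-owl:demonym ?demonym .
    filter langMatches(lang(?demonym), "en")
  }
  # Fallback: the raw infobox value
  optional { dbpedia:Henrik_Ibsen dbpprop:nationality ?raw . }
  # Use the demonym when it is bound, otherwise the raw value
  bind ( coalesce(?demonym, ?raw) as ?nationality )
}
```

Swapping the two arguments of coalesce gives the other order of preference.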
Kristian's answer is right that the property Henrik Ibsen has is dbpprop:nationality. You're right that there is a dbpedia-owl:nationality property, too, but Henrik Ibsen doesn't have a value for it, unfortunately. The value of dbpprop:nationality that Henrik Ibsen has, though, is a string, which is a literal, and literals cannot be the subjects of triples in RDF, so ?nationality rdfs:label ?nationalityLabel in your query will never match.
The DBpedia ontology data (dbpedia-owl) tends to be cleaner than the dbpprop data, so you might prefer a solution using dbpedia-owl properties that Henrik Ibsen does have. In this case, you might look to dbpedia-owl:birthPlace. Then you could get the name of the country of the birth place:
select ?label {
dbpedia:Henrik_Ibsen
dbpedia-owl:birthPlace
[ a dbpedia-owl:Country ;
rdfs:label ?label ]
}
You might want to narrow the permissible languages:
select ?label {
dbpedia:Henrik_Ibsen
dbpedia-owl:birthPlace
[ a dbpedia-owl:Country ;
rdfs:label ?label ]
filter langMatches(lang(?label),"en")
}
Those queries will produce the name of the country, but if you wanted the corresponding demonym, you can get the dbpedia-owl:demonym value of the country, if it's available. It's probably best to make the demonym optional, since a cursory investigation suggests that lots of countries in DBpedia don't have a value for it, so the name of the country may be the only option. E.g.,
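select ?name ?demonym {
dbpedia:Henrik_Ibsen dbpedia-owl:birthPlace ?country .
?country a dbpedia-owl:Country ; rdfs:label ?name .
filter langMatches(lang(?name),"en")
# The language test sits inside the optional block: applied outside, it would
# fail on an unbound ?demonym and drop the rows the optional is meant to keep.
optional { ?country dbpedia-owl:demonym ?demonym
filter langMatches(lang(?demonym),"en") }
}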
Two things are wrong with the query:
It's dbpprop:nationality
The label doesn't appear to exist, and unless you make that variable optional, it will eliminate the row altogether. EDIT: Joshua Taylor's answer reminded me that the label doesn't exist because the dbpprop:nationality value is a literal, which cannot be used as a subject resource; therefore, there will never be a label for a dbpprop:nationality value. Instead, where the data exists, you would use dbpedia-owl:nationality, which you did originally. It just so happens that Henrik_Ibsen has no dbpedia-owl:nationality value associated with him.
Updated query (updated).
SELECT
#### ?label #### See Edit
?nationality
WHERE {
dbpedia:Henrik_Ibsen dbpprop:nationality ?nationality .
#### OPTIONAL { ?nationality rdfs:label ?label . } #### See Edit.
}