Get nationality of person using DBPedia and SPARQL - sparql

I have the following SPARQL query:
SELECT ?nationalityLabel WHERE {
dbpedia:Henrik_Ibsen dbpedia-owl:nationality ?nationality .
?nationality rdfs:label ?nationalityLabel .
}
I have checked that Henrik Ibsen exists and that he has the nationality ontology/property on him:
http://dbpedia.org/page/Henrik_Ibsen
And this is an ontology:
http://dbpedia.org/ontology/nationality
A very similar query to this listed here works:
https://stackoverflow.com/a/10248653/1680130
The problem I have is that the query doesn't return any result.
If I could get help solving this it would be great.
Summarized solution:
Both answers were great so upvote to both but landed on Joshua's in the end because informing about dbpedia-owl being cleaner. Optimal solution in my opinion:
First check with dbpedia-owl for birth-place:
select ?label {
dbpedia:Henrik_Ibsen
dbpedia-owl:birthPlace
[ a dbpedia-owl:Country ;
rdfs:label ?label ]
filter langMatches(lang(?label),"en")
}
If found then get the demonym:
select ?label {
dbpedia:Norway dbpedia-owl:demonym ?label
filter langMatches(lang(?label),"en")
}
If above fails then do the "dirty" query:
SELECT
?nationality
WHERE {
dbpedia:Henrik_Ibsen dbpprop:nationality ?nationality .
filter langMatches(lang(?nationality),"en")
}
Of course is "dirty" means data being correct but not so often present the order might be better other way around because people can be born in a country but from a different.

Kristian's answer is right that the property is dbpprop:nationality that Henrik Ibsen has. You're right that there is a dbpedia-owl:nationality property, too, but Henrik Ibsen doesn't have a value for it, unfortunately. The value of dbpprop:nationality that Henrik Ibsen has, though, is a string, which is a literal, and literals cannot be the subjects of triples in RDF, so ?nationality rdfs:label ?nationalityLabel in your query will never match.
The DBpedia ontology data (dbpedia-owl) tends to be cleaner than the dbpprop data, so you might prefer a solution using dbpedia-owl properties that Henrik Ibsen does have. In this case, you might look to the dbpedia-owl:birthPlace. Then you could get the name the country of the birth places:
select ?label {
dbpedia:Henrik_Ibsen
dbpedia-owl:birthPlace
[ a dbpedia-owl:Country ;
rdfs:label ?label ]
}
SPARQL results
You might want to narrow the permissible languages:
select ?label {
dbpedia:Henrik_Ibsen
dbpedia-owl:birthPlace
[ a dbpedia-owl:Country ;
rdfs:label ?label ]
filter langMatches(lang(?label),"en")
}
SPARQL results
Those queries will produce the name of the country, but it wanted the corresponding demonym, you can get the dbpedia-owl:demonym value of the country, if it's available. It's probably best to make the demonym optional, since a cursory investigation suggests that lots of countries in DBpedia don't have a value for it, so the name of the country may be the only option. E.g.,
select ?name ?demonym {
dbpedia:Henrik_Ibsen dbpedia-owl:birthPlace ?country .
?country a dbpedia-owl:Country ; rdfs:label ?name .
optional { ?country dbpedia-owl:demonym ?demonym }
filter langMatches(lang(?name),"en")
filter langMatches(lang(?demonym),"en")
}
SPARQL results

Two things are wrong with the query:
It's dbpprop:nationality
The label doesn't appear to exist, and unless you make that variable optional, it will eliminate the row altogether. EDIT: *Joshua Taylor's answer reminded me that the label doesn't exist because the dbprop:nationality value is a literal, which cannot be used as a subject resource, therefore, there will never be a label for dbpprop:nationality. Instead, where the data exists, you would use dbpedia-owl:nationality, which you did originally. it just so happens that Henrik_Ibsen has no dbpedia-owl:nationality value associated with him*
Updated query (updated).
SELECT
#### ?label #### See Edit
?nationality
WHERE {
dbpedia:Henrik_Ibsen dbpprop:nationality ?nationality .
#### OPTIONAL { ?nationality rdfs:label ?label . } #### See Edit.
}

Related

dbpedia sparql query for Deathplace someone

I'm trying to get the Deathplace for a unique language. My query is this but I have errors...
SELECT ?place ?placeLabel
WHERE {
<https://dbpedia.org/page/Alfonso_XII> rdfs:label ?person
?place dbo:deathPlace ?person;
rdfs:label ?placeLabel.
FILTER (LANG(?placeLabel) = "en")
}
You have a few issues with this query.
You are using the URL of the resource, not the URI
A place of death links a person to a place, not the other way round.
The right URI prefix for deathPlace is dbp:
Places should be resources indeed but in DBpedia that's a strong assumption
Since you don't ask for the person level, it seems you don't need it in the query
SELECT ?place ?placeLabel
WHERE {
dbr:Alfonso_XII dbp:deathPlace ?place .
OPTIONAL {?place rdfs:label ?placeLabel .
FILTER (LANG(?placeLabel) = "en") }
}

How can I join climate data to cities with dbpedia?

I am trying to get a dataset that gives me all the data available in a city's climate table but I'm having some trouble.
I was able to get this to work and felt pretty good about myself. When I plug this in on dbpedia's virtuoso client this gives me all the cities that dbpedia has, and all of their countries.
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
SELECT DISTINCT ?city_name ?country
WHERE { ?city rdf:type dbpedia-owl:City ;
rdfs:label ?city_name;
dbpedia-owl:country ?country
FILTER (langMatches(lang(?city_name), "EN")) .
}
Update: I have found properties that seem to give what I'm looking for (e.g. dbpedia.org/property/aprHighC) but I'm having trouble adding them to my output.
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
SELECT DISTINCT ?city_name ?country ?aprHighC
WHERE { ?city rdf:type dbpedia-owl:City .
?city rdfs:label ?city_name .
?city dbpedia-owl:country ?country
FILTER (langMatches(lang(?city_name), "EN")) .
}
Gives an error: Variable 'aprHighC' is used in the query result set but not assigned. How do I assign it?
For the query to get results in the second query, a city has to have a three properties: rdfs:label, dbpedia-owl:country and dbpedia-owl:climate. Your query pretty much proves that DBPedia data has cities with label and country properties, but not climate. Try the following to see just what properties are found for members of dbpedia-owl:City:
SELECT DISTINCT ?p
WHERE {
?city rdf:type dbpedia-owl:City ;
?p ?o .
}
Note that not all members of dbpedia-owl:City will have these properties, but it gives you a range of what properties are used.
Looking at it the other way, you can ask what entities use the dbpedia-owl:climate property:
SELECT ?s
WHERE {
?s dbpedia-owl:climate ?climate
}
I didn't find any, so it could be the case that the prefix is different than the one you are using? I'd suggest double-checking the property name.
Regardless, it's a good idea to use SPARQL to find what is actually in the data store. And use LIMIT to look at parts of the data without overwhelming the system.
The following query gives the January average daily high (°C). Adding other climate items is as simple as copying the line beginning "OPTIONAL" and changing the item and variable name from janHighC to whatever you are trying to get.
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
SELECT * {
{ ?city rdf:type dbo:City .
?city rdf:type schema:City .
?city rdfs:label ?name
}
OPTIONAL {?city dbp:janHighC ?janHighC .}
}
I will note, however, that most cities don't have this information. I had to give up on getting the data this way.

Retrieving the wider dbpedia vocabulary for tagging pictures

I'm trying to develop a tool in JS for tagging pictures, so I need a set of possible "things" from dbpedia. I already tryed to retrieve this way:
select ?s ?l {
?s a owl:Class .
?s rdf:type ?l
FILTER regex(str(?s), "House", "i").
}
http://dbpedia.org/snorql/?query=select+%3Fs+%3Fl+%7B%0D%0A+++%3Fs+a+owl%3AClass+.%0D%0A+++%3Fs+rdf%3Atype+%3Fl%0D%0A+++FILTER+regex%28str%28%3Fs%29%2C+%22House%22%2C+%22i%22%29.%0D%0A%7D
And also this way:
select ?label
WHERE {
?concept a skos:Concept.
?concept skos:prefLabel ?label.
FILTER regex(str(?label), "^House", "i").
}
http://dbpedia.org/snorql/?query=select+%3Flabel+%0D%0AWHERE+%7B%0D%0A++%3Fconcept+a+skos%3AConcept.%0D%0A++%3Fconcept+skos%3AprefLabel+%3Flabel.%0D%0A++FILTER+regex%28str%28%3Flabel%29%2C+%22%5EHouse%22%2C+%22i%22%29.%0D%0A%7D
In the first case, I just have "instances" of the house "thing", but not the "House" class itself. In the second one, I never retrieve the "house" and the similar thing is "houses". Any alternative for retrieving a better vocabulary based in dbpedia dataset?
If you don't bother to restrict yourself to owl:Thing or to skos:Concept, you can just get things that have a label that contains "house". Rather than using regex, I chose to use contains and lcase, since a string containment could be less expensive than invoking a full regular expression processor.
select ?thing ?label where {
?thing rdfs:label ?label .
filter contains(lcase(?label), "house")
}
SPARQL results (limited to 200)

Limit a SPARQL query to one dataset

I'm working with the following SPARQL query, which is an example on the web-based end of my institution's SPARQL endpoint;
SELECT ?building_number ?name ?occupants WHERE {
?site a org:Site ;
rdfs:label "Highfield Campus" .
?building spacerel:within ?site ;
skos:notation ?building_number ;
rdfs:label ?name .
OPTIONAL {
?building soton:buildingOccupants ?occ .
?occ rdfs:label ?occupants .
} .
} ORDER BY ?name
The problem is that as well as getting data from 'Buildings and Places', the Dataset I'm interested in, and would expect the example to use, it also gets data from the 'Facilities and Equipment' dataset, which isn't relevant. You should see this if you follow the link.
I suspect the example may pre-date the addition of the Facilities and Equipment dataset, but even with the research I've done into SPARQL, I can't see a clear way to define which datasets to include.
Can anyone recommend a starting point to limit it to just show 'Buildings', or, more specifically, results from the 'Buildings and Places' dataset.
Thanks
First things first, you really need to use SELECT DISTINCT, as otherwise you'll get repeated results.
To answer your question, you can use GRAPH { ... } to filter certain parts of a SPARQL query to only match data from a specific dataset. This only works if the SPARQL endpoint is divided up into GRAPHs (this one is). The solution you asked for isn't the best choice, as it assumes that things within sites in the 'places' dataset will always be resticted to buildings... That's risky -- as it might end up containing trees and signposts at some time in the future.
Step one is to just find out what graphs are in play:
SELECT DISTINCT ?g1 ?building_number ?name ?occupants WHERE {
?site a org:Site ;
rdfs:label "Highfield Campus" .
GRAPH ?g1 { ?building spacerel:within ?site ;
skos:notation ?building_number ;
rdfs:label ?name .
}
OPTIONAL {
?building soton:buildingOccupants ?occ .
?occ rdfs:label ?occupants .
} .
} ORDER BY ?name
Try it here: http://is.gd/WdRAGX
From this you can see that http://id.southampton.ac.uk/dataset/places/latest and http://id.southampton.ac.uk/dataset/places/facilities are the two relevant ones.
To only look for things 'within' a site according to the "places" graph, use:
SELECT DISTINCT ?building_number ?name ?occupants WHERE {
?site a org:Site ;
rdfs:label "Highfield Campus" .
GRAPH <http://id.southampton.ac.uk/dataset/places/latest> {
?building spacerel:within ?site ;
skos:notation ?building_number ;
rdfs:label ?name .
}
OPTIONAL {
?building soton:buildingOccupants ?occ .
?occ rdfs:label ?occupants .
} .
} ORDER BY ?name
Alternate solutions:
Using rdf:type
Above I've answered your question, but it's not the answer to your problem. This solution is more semantic as it actually says 'only give me buildings within the campus' which is what you really mean.
Instead of filtering by graph, which is not very 'semantic' you could also restrict ?building to be of class 'building' which research facilities are not. They are still sometimes listed as 'within' a site. Usually when the uni has only published what campus they are on but not which building.
?building a rooms:Building
Using FILTER
In extreme cases you may not have data in different GRAPHS and there may not be an elegant relationship to use to filter your results. In this case you can use a FILTER and turn the building URI into a string and use a regular expression to match acceptable ones:
FILTER regex(str(?building), "^http://id.southampton.ac.uk/building/")
This is bar far the worst option and don't use it if you have to.
Belt and Braces
You can use any of these restictions together and a combination of restricting the GRAPH plus ensuring that all ?buildings really are buildings would be my recommended solution.

How to match exact string literals in SPARQL?

I have this query. It matches anything which has "South" in its name. But I only want the one whose foaf:name is exactly "South".
SELECT Distinct ?TypeLabel
WHERE
{
?a foaf:name "South" .
?a rdf:type ?Type .
?Type rdfs:label ?TypeLabel .
}
a bit late but anyway... I think this is what your looking for:
SELECT Distinct ?TypeLabel Where {
?a foaf:name ?name .
?a rdf:type ?Type .
?Type rdfs:label ?TypeLabel .
FILTER (?name="South"^^xsd:string)
}
you can use FILTER with the xsd types in order to restrict the result.
hope this helps...
cheers!
(Breaking out of comments for this)
Data issues
The issue is the data, not your query. If use the following query:
SELECT DISTINCT ?a
WHERE {
?a foaf:name "Imran Khan" .
}
You find (as you say) "Imran Khan Niazy".
But looking at the dbpedia entry for Imran Khan, you'll see both:
foaf:name "Imran Khan Niazy"
foaf:name "Imran Khan"
This is because RDF allows repeated use of properties.
Cause
"South" had the same issue (album, artist, and oddly 'South Luton'). These are cases where there are both familiar names ("Imran Khan", "South"), and more precise names ("Imran Khan Niazy", "South (album)") for the purposes of correctness or disambiguation.
Resolution
If you want a more precise match try adding a type (e.g. http://dbpedia.org/ontology/MusicalWork for the album).
Beware
Be aware that DBpedia derives from Wikipedia, and the extraction process isn't perfect. This is an area alive with wonky data, so don't assume your query has gone wrong.
That query should match exactly the literal South and not literals merely containing South as a substring. For partial matches you'd go to FILTER with e.g. REGEX(). Your query engine is broken in this sense - which query engine you are working with?