Convert xsd:date to xsd:gYear in SPARQL

Hello, I'm trying to retrieve from DBpedia the persons who were born in Lyon after 1900 ("1900"^^xsd:gYear).
Here is my query:
prefix dbo: <http://dbpedia.org/ontology/>
prefix foaf: <http://xmlns.com/foaf/0.1/>
prefix dbr: <http://dbpedia.org/resource/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
select $n $birthDate
{
$p a dbo:Person.
$p dbo:birthPlace dbr:Lyon.
$p foaf:name $n.
$p dbo:birthDate $birthDate.
filter($birthDate > "1980"^^xsd:gYear).
}
And I'm getting the following results:
[Image: query results]

From my understanding, the problem is in the data: many birth dates are not valid xsd:date literals, so the comparison fails. It also fails for filter($birthDate > "1980-01-01"^^xsd:date). According to XML Schema, an xsd:date literal must have the form "YYYY-MM-DD", and unfortunately many dates in DBpedia do not. For valid dates, the comparison works perfectly.
Workaround (extract the year from the string form and compare it as an integer):
prefix dbo: <http://dbpedia.org/ontology/>
prefix foaf: <http://xmlns.com/foaf/0.1/>
prefix dbr: <http://dbpedia.org/resource/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
select $n $birthDate ?year
{
$p a dbo:Person.
$p dbo:birthPlace dbr:Lyon.
$p foaf:name $n.
$p dbo:birthDate $birthDate.
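# Direct comparison is disabled because it fails on ill-formed date literals:
# filter($birthDate > "1980-01-01"^^xsd:date).
# Extract the year from the lexical form and compare it as an integer instead.
bind(replace(str($birthDate),"(\\d+)-\\d*-\\d*", "$1") as ?year)
filter(xsd:integer(?year) > 1980)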
}
Note, it might fail for other kinds of ill-formed dates. I didn't check all corner cases.
Somebody should report this to the DBpedia community. It should be fixed.
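To probe the corner cases mentioned above without hitting the endpoint, here is a minimal Python sketch that applies the same pattern as the replace() call in the workaround. The sample literals are hypothetical and were not taken from DBpedia, and Python's re module is only assumed to behave like the SPARQL regex engine for this simple pattern.

```python
import re

# Same pattern as the SPARQL replace() call: keep the leading digits, drop month and day.
YEAR_PATTERN = re.compile(r"(\d+)-\d*-\d*")


def extract_year(lexical):
    """Return the year of a date-like string as an int, or None if no year can be read."""
    year_text = YEAR_PATTERN.sub(r"\1", lexical)
    try:
        return int(year_text)
    except ValueError:
        # Roughly what happens in SPARQL when the xsd:integer cast fails:
        # the FILTER raises an error and the row is dropped.
        return None


# Hypothetical sample values, not actual DBpedia data.
samples = ["1985-03-12", "1985-0-0", "1985", "circa 1985", "-0100-01-01"]
for sample in samples:
    print(sample, "->", extract_year(sample))
```

Values the pattern does not match are passed through unchanged, so a string such as "circa 1985" cannot be cast and is treated as having no year.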

Related

SPARQL query all persons who lived through the 20th century in DBPedia

I want to query all persons who lived through the entire 20th century in DBpedia.
I'm using https://dbpedia.org/sparql/ to process my query. I have limited the output to 20 results.
The query I've tried is the following:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?personName ?birthDate ?deathDate
WHERE {
FILTER (?birthDate < "1900-01-01"^^xsd:date AND ?deathDate > "1999-12-31"^^xsd:date).
?p rdf:type dbo:Person.
?p dbp:name ?personName.
?p dbp:birthDate ?birthDate.
?p dbp:deathDate ?deathDate.
}
LIMIT 20.
In the output, all persons died after 1999-12-31, but they weren't born before 1900-01-01.
Why is my query wrong? How can I fix it?
Thanks in advance for your time.
This issue is due to integer values (rather than xsd:date values) being included in the query results. It can be resolved with the DATATYPE() function, either in the same FILTER() or in a new one.
For example:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?personName ?birthDate ?deathDate
WHERE
{
?person rdf:type dbo:Person;
dbp:name ?personName;
dbp:birthDate ?birthDate;
dbp:deathDate ?deathDate.
#This FILTER ensures that only xsd:date values are returned
FILTER(DATATYPE(?birthDate) = xsd:date && DATATYPE(?deathDate) = xsd:date).
FILTER (?birthDate < "1900-01-01"^^xsd:date && ?deathDate > "1999-12-31"^^xsd:date).
}
LIMIT 20
Live Query Definition
Live Query Result
DATATYPE() Documentation
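The effect of the DATATYPE() guard can be sketched in plain Python. This is only an illustration of the logic, assuming each value arrives as a (lexical form, datatype IRI) pair; the rows are hypothetical and the function name is made up for this example.

```python
from datetime import date

XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def lived_through_20th_century(birth, death):
    """birth and death are (lexical form, datatype IRI) pairs."""
    # Mirrors FILTER(DATATYPE(?birthDate) = xsd:date && DATATYPE(?deathDate) = xsd:date):
    # anything that is not typed as xsd:date is rejected before any comparison.
    if birth[1] != XSD_DATE or death[1] != XSD_DATE:
        return False
    try:
        born = date.fromisoformat(birth[0])
        died = date.fromisoformat(death[0])
    except ValueError:
        # Typed as xsd:date but not a well-formed date.
        return False
    # Mirrors the second FILTER with the two date bounds.
    return born < date(1900, 1, 1) and died > date(1999, 12, 31)


# Hypothetical rows, not actual DBpedia data.
rows = [
    ("Person A", ("1899-05-01", XSD_DATE), ("2001-02-03", XSD_DATE)),
    ("Person B", ("1950", XSD_INTEGER), ("2005-06-07", XSD_DATE)),
    ("Person C", ("1920-01-01", XSD_DATE), ("2010-01-01", XSD_DATE)),
]
for name, birth, death in rows:
    print(name, lived_through_20th_century(birth, death))
```

Checking the datatype first means an integer-typed value is never compared against a date at all, which is the point of putting the DATATYPE() test in the query.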

SPARQL Syntax correction

It's my first time using SPARQL.
Using DBpedia, I want to list the band names of all rock bands from 1992.
The code I've already developed runs without errors but gives 0 results. Am I filtering the date incorrectly?
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/resource/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT DISTINCT ?bandname where {
?band dbo:genre dbp:Rock_music .
?band foaf:name ?bandname .
FILTER (dbo:activeYearsStartYear >= xsd:date("1992-01-01") && dbo:activeYearsStartYear <= xsd:date("1992-12-31"))
}
I expect only the band names in a column.
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/resource/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT DISTINCT ?bandname ?date where {
?band dbo:genre dbp:Rock_music .
?band foaf:name ?bandname .
?band dbo:activeYearsStartYear ?date .
FILTER (?date >= "1992-01-01"^^xsd:dateTime && ?date <= "1992-12-31"^^xsd:dateTime)
}
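This does the trick. The original FILTER compared the property IRI dbo:activeYearsStartYear itself with a date; the value first has to be bound to a variable (?date) with a triple pattern and the variable is then filtered. Note that dbo:activeYearsStartYear holds only the year, so the day and month in the FILTER bounds are not used.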

How to gracefully handle dbpedia queries of birthDate in different ontologies

I'm trying to extract the date of birth of a number of people from dbpedia.org. However some queries fail because the data is in a different attribute. For example:
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX dbo: <http://dbpedia.org/ontology/>
select ?birthDate where {
dbr:Alan_Turing dbo:birthDate ?birthDate
}
returns 1912-06-23 as it should but:
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX dbo: <http://dbpedia.org/ontology/>
select ?birthDate where {
dbr:Grace_Hopper dbo:birthDate ?birthDate
}
returns an empty result.
EDIT TO ADD:
The original problem was confused by the disparity between live.dbpedia.org and dbpedia.org: the non-live version has dbo:birthDate for both people, but live.dbpedia.org does not. If you compare Alan Turing with Grace Hopper there, you see that their birth dates are stored under two different properties, dbo:birthDate and dbp:birthDate. So the problem is now how to handle both gracefully.
The answer is to use the UNION operator to find an answer from whichever property has it:
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>
select ?birthDate where {
{ dbr:Alan_Turing dbo:birthDate ?birthDate }
UNION
{ dbr:Alan_Turing dbp:birthDate ?birthDate }
}
This gives us two results, both the same. And for Grace Hopper:
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>
select ?birthDate where {
{ dbr:Grace_Hopper dbo:birthDate ?birthDate }
UNION
{ dbr:Grace_Hopper dbp:birthDate ?birthDate }
}
We only get one result.
As live.dbpedia.org already has a number of namespace prefixes predefined, and as suggested by @AKSW, we can simplify the query even further with a property path. The distinct keyword means identical results from the two properties are merged together:
select distinct ?birthDate {
dbr:Grace_Hopper dbo:birthDate|dbp:birthDate ?birthDate
}
Giving this result.
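When the same lookup is needed for many people, the property-path query can be generated from a small helper. The following Python sketch only builds the query text; the function name birth_date_query is hypothetical, and the prefixes are declared explicitly so the query does not rely on the endpoint's predefined ones.

```python
PREFIXES = """PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>
"""


def birth_date_query(resource_name):
    """Build a query that reads birthDate from either property.

    resource_name is a local name such as 'Grace_Hopper'. Names containing
    characters that are not allowed in a prefixed name (for example
    parentheses or commas) would need the full IRI in angle brackets instead.
    """
    return (
        PREFIXES
        + "SELECT DISTINCT ?birthDate WHERE {\n"
        + f"  dbr:{resource_name} dbo:birthDate|dbp:birthDate ?birthDate\n"
        + "}"
    )


for name in ["Alan_Turing", "Grace_Hopper"]:
    print(birth_date_query(name))
    print()
```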

Different order of results in DBpedia through SNORQL vs. SERVICE query

I am trying to retrieve some data about movies from DBpedia. This is my query:
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX : <http://dbpedia.org/resource/>
PREFIX dbpedia2: <http://dbpedia.org/property/>
PREFIX dbpedia: <http://dbpedia.org/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX onto: <http://dbpedia.org/ontology/>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT *
{{SELECT *
WHERE {
SERVICE<http://dbpedia.org/sparql>{
?movie dcterms:subject <http://dbpedia.org/resource/Category:American_films> ;
a onto:Film ;
rdfs:label ?title ;
dbpedia2:gross ?revenue .
?movie onto:starring ?actorUri .
?actorUri rdfs:label ?actor .
OPTIONAL {
?movie onto:imdbId ?imdbId .
}
BIND(xsd:integer(?revenue) as ?intRevenue) .
FILTER ((datatype(?revenue) = 'http://dbpedia.org/datatype/usDollar') && (LANGMATCHES(LANG(?title), 'en')) && (LANGMATCHES(LANG(?actor), 'en'))) .
}
}
}}
ORDER BY DESC (?intRevenue)
LIMIT 40000
OFFSET 0
Running this query on http://dbpedia.org/snorql/ (without the SERVICE keyword) returns the correct result. However, running it from a third-party triplestore doesn't yield the same order (for example, Hobbit and Lord of the Rings are missing).
What do I need to change in the query to get identical results?
The best way to overcome this specific limitation is to have your own DBpedia mirror, on which you can set your own limits (including none), and which you can then use as either your primary or remote data store and/or query engine.
(ObDisclaimer: OpenLink Software provides the public DBpedia SPARQL endpoint, produces Virtuoso and the DBpedia Mirror AMI, and employs me.)

DBPedia SPARQL query should return results but is empty

I want to get all pages which have a specified category and a specified key. My query:
PREFIX dbpedia: <http://dbpedia.org/>
PREFIX dbpedia2: <http://dbpedia.org/property/>
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
PREFIX dcterms: <http://dublincore.org/2010/10/11/dcterms.rdf#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX grs: <http://www.georss.org/georss/point>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?page ?cat ?key ?value (lang(?value) as ?language) WHERE {
?page dcterms:subject ?cat .
?page ?key ?value .
FILTER(
regex(?cat, "category:Amusement_parks_in_the_Netherlands") &&
?key = foaf:depiction
)
}
But it returns zero results. See here: SNORQL query
It should at least return this page: http://dbpedia.org/page/Duinrell (because it matches the criteria).
Any ideas?
There are a couple of things wrong with your query. First of all, don't use a regex when comparing URIs. Instead of:
regex(?cat, "category:Amusement_parks_in_the_Netherlands")
use:
?cat = <http://dbpedia.org/resource/Category:Amusement_parks_in_the_Netherlands>
Note the capital C in Category: IRIs are case-sensitive, and DBpedia category IRIs are written with Category:.
Second: your namespace prefix for Dublin Core seems wrong. You are querying the property dcterms:subject, and in your query, the prefix dcterms is mapped to the namespace http://dublincore.org/2010/10/11/dcterms.rdf#. However, the actual property in DBpedia is http://purl.org/dc/terms/subject, so your namespace prefix should map to http://purl.org/dc/terms/ instead.
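Putting both corrections together, the query might look like the string in the following Python sketch, which also builds a GET request URL using the standard query parameter of the SPARQL protocol. This is an untested simplification: the SELECT list is reduced to ?page and ?value, the endpoint address https://dbpedia.org/sparql is taken as an assumption, and no request is actually sent.

```python
from urllib.parse import urlencode

# Assumed endpoint address; adjust if you query a mirror.
ENDPOINT = "https://dbpedia.org/sparql"

# The category IRI is matched directly in the triple pattern (no regex),
# and dcterms is mapped to http://purl.org/dc/terms/.
CORRECTED_QUERY = """PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?page ?value WHERE {
  ?page dcterms:subject <http://dbpedia.org/resource/Category:Amusement_parks_in_the_Netherlands> ;
        foaf:depiction ?value .
}"""


def request_url(query):
    """Return a GET URL that carries the query in the 'query' parameter."""
    return ENDPOINT + "?" + urlencode({"query": query})


print(request_url(CORRECTED_QUERY))
```

Using the IRI directly in the triple pattern also makes the separate ?key = foaf:depiction test unnecessary, because the property is fixed in the pattern itself.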