I am having trouble executing SPARQL queries against dbpedia.org using Jena.
The queries are of the form:
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX p: <http://dbpedia.org/property/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?album ?name ?dateofrelease
WHERE {
?album p:artist <http://dbpedia.org/resource/SomeArtist> .
?album rdf:type <http://dbpedia.org/ontology/Album> .
?album rdf:type <http://schema.org/MusicAlbum> .
?album p:name ?name .
?album <http://dbpedia.org/ontology/releaseDate> ?dateofrelease .
FILTER(xsd:dateTime(?dateofrelease) >= '2009-01-01T00:00:00Z'^^xsd:dateTime)
} LIMIT 5
where http://www.dbpedia.org/resource/SomeArtist is a valid artist URI, e.g. http://dbpedia.org/resource/Wilco, and URL-encoded properly before sent.
The queries are executed with the following standard code:
Query query = QueryFactory.create(queryString);
QueryExecution qexec = QueryExecutionFactory.sparqlService("http://dbpedia.org/sparql", query);
ResultSet results = queryExecution.execSelect();
The program is doing about 30 queries on the same form, but some of them "hangs" for about a minute or two before Jena throws a
com.hp.hpl.jena.sparql.resultset.ResultSetException: Not an ResultSet result
If I catch the exception and go on, some queries hang and some returns a result set quickly with either 0 or more results. Doing this several times, it's random what queries that returns or "hangs". Using the SPARQL DBpedia with a identical query sometimes works, sometimes hangs in the same way in the web browser.
Am I constructing the query wrong, making it in some way time consuming for dbpedia.org so that the query timeout at the sever? I'm quite new to the Semantic Web and Jena, but I thought initially my query not could be very time consuming since I'm using a absolute URI for the object part in the
?album p:artist <http://www.dbpedia.org/resource/SomeArtist>
statement.
Is there some number of requests from-one-source/per-time-unit limit to dbpedia.org that I'm not aware of?
(Using Jena 2.6.4)
I can confirm that the query is slow.
I tried a few variations that might effect the speed, but it didn't help. I can't see any obvious reason why that type of query would be particularly slow, sorry.
One alternative, to get more consistent response times would be to download the DBPedia dump, grep out the predicates that you're interested in, and load them into a local triplestore, e.g. Jena.
Related
I am trying to get alternative names of given names in WikiData with the following simple query:
PREFIX ps: <http://www.wikidata.org/prop/direct/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
CONSTRUCT {?s rdfs:label ?o}
WHERE { ?s ps:P31 wd:Q202444. ?s rdfs:label ?o}
LIMIT 1000
Initially, the query was much more complex, but I was getting time-outs on the public WikiData SPARQL endpoint. I decided to use Linked Data Fragments to offload some filtering from the server to the client.
$comunica-sparql "https://query.wikidata.org/bigdata/ldf" -f query > given_names.n3
Could not retrieve https://query.wikidata.org/bigdata/ldf?subject=http%3A%2F%2Fwww.wikidata.org%2Fentity%2FQ21147790&predicate=http%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23label&page=3 (500: unknown error)
(where query is a file with the SPARQL query shown above). Unfortunately, the client tries to get output from the 3rd page, I am getting the error. Following the link in fact returns HTTP 500 error with
The link points to the 3rd page. It works if you try to go to the second page.
Is this a bug or a limitation of a service?
This is definitely a bug; the server does not fulfill the Triple Pattern Fragments specification if pagination does not work.
I'm executing the below SPARQL query
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?resource WHERE {
?resource a <http://dbpedia.org/ontology/Place> .
{?resource rdfs:label 'Paris'#en.} UNION { ?resource rdfs:label 'France'#en.}
}
I'm executing this here.
Sometimes I am getting required result and sometimes it returns 502 error ( I get the message that website is under maintenance.....)
Can you please let me know why result is not consistent and how can I avoid this?
Also inconsistent behavior when I execute through java code :
Query query = QueryFactory.create(sb.toString());
QueryExecution qexec =
QueryExecutionFactory.sparqlService("http://dbpedia.org/sparql", query);
The experimental DBpedia SPARQL endpoint is made available as a public service, but as such it has no Quality-of-Service (QoS) guarantees.
You can set up your own DBpedia mirror in the Amazon AWS cloud, by spinning up the pre-populated and dynamically updated AMI. You can also set up a local mirror, by manually loading the DBpedia datasets into your own Virtuoso or other triple or quad store instance.
(ObDisclaimer: OpenLink Software produces Virtuoso and the DBpedia AMI, provides and maintains the public DBpdia endpoint, and employs me.)
I tried to make this query on http://sparql.sindice.com/
PREFIX rev: <http://purl.org/stuff/rev#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT *
WHERE
{
?thing rdfs:label ?name .
?thing rev:hasReview ?review .
filter regex(str(?name), "harlem", "i")
} LIMIT 10
And it returns 504 Gateway Time-out
The server didn't respond in time.
What i'm doing wrong?
Thanks.
You made a query that was too hard for the endpoint to answer in a timely fashion hence why you got a timeout response. Note that there website states the following:
all queries are time and resource limited. notice that this means that
sometime you will get incomplete or even no results. If this is
happening often for you or you really want to run more complex queries
please contact us
Your query essentially selects a vast swathe of data and then makes the engine run a regular expression over ever possible value which is extremely slow.
I believe Sindice use Virtuoso as their SPARQL implementation so you can cheat and use Virtuoso specific full text query extension like so:
PREFIX rev: <http://purl.org/stuff/rev#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT *
WHERE
{
?thing rdfs:label ?name .
?thing rev:hasReview ?review .
?name bif:contains "harlem" .
}
LIMIT 10
However this query also seems to timeout, if you can add more conditions to constrain your query further you will have more chance of getting results in a timely fashion.
I am trying to retrieve the value of the dbpedia-owl:influenced in this page e.g: Andy_Warhol
The query I write is:
PREFIX rsc : http://dbpedia.org/resource
PREFIX dbpedia-owl :http://dbpedia.org/ontology
SELECT ?o WHERE {
rsc:Andy_Warhol dbpedia-owl:infuenced ?o .
}
but it is EMPTY.
Strange is that when I have the same query for another property from the ontology type like "birthPlace", the sparql engine gives the result back:
SELECT ?o WHERE {
rsc:Andy_Warhol dbpedia-owl:birthplace ?o .
}
which is a link to another resource:
dbpedia.org/resource/Pittsburgh
I am just confused how to write this query?
besides several formal errors addressed in the answer of #Joshua, there is also the semantic problem that the properties you are looking for - in this case - seem to be found on the entities that were influenced.
this query might give you the desired results
PREFIX rsc: <http://dbpedia.org/resource/>
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
SELECT ?s WHERE {
?s dbpedia-owl:influencedBy rsc:Andy_Warhol .
}
run query
There are a few issues here. One is that the SPARQL, as presented, isn't correct. I edited to make the prefix syntax legal, but the prefixes were still wrong (they didn't end with a final slash). You don't want to be querying for http://dbpedia.org/resourceAndy_Warhol after all; you want to query for http://dbpedia.org/resource/Andy_Warhol. Some standard namespaces for DBpedia are listed on their SPARQL endpoint. Using those namespaces and the SPARQL endpoint, we can ask for all the triples that have http://dbpedia.org/resource/Andy_Warhol as the subject with this query:
SELECT * WHERE {
dbpedia:Andy_Warhol ?p ?o .
}
In the results produced there, you'll see the one using http://dbpedia.org/ontology/birthPlace (note the captial P in birthPlace), but you won't see any triples with the predicate http://dbpedia.org/ontology/infuenced, so it makes sense that your first query has no results. Do you have some reason to suppose that there should be some results?
HI there is a query which was working untill yesterday :
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT DISTINCT(?film_link) ?film_abstract ?film_name ?wikipage
WHERE {
?film_link rdf:type <http://dbpedia.org/ontology/Film> .
?film_link rdfs:comment ?film_abstract
FILTER (langMatches( lang(?film_abstract), "EN")) .
?film_link foaf:name ?film_name .
?film_title foaf:page ?wikipage .
}
but today it is showing: Virtuoso 42000 Error The estimated execution time 99232592 (sec) exceeds the limit of 1500 (sec).
I have seen this error earlier also but when i ran such query again after some time it runs..
Could anyone explain the meaning of the error?
It means that Virtuoso's query planner (Virtuoso is the triplestore that DBPedia runs on) has estimated how long it would take to evaluate your query and it thinks it will take too long so it has refused to run the query.
I suspect the problem is the last triple pattern in your query:
?film_title foaf:page ?wikipage
You never use either of those variables before that point so what you've asked Virtuoso to do is cross product every possible triple with foaf:page in the predicate position with the results in the rest of your query.
If you change this to the following it should work fine:
?film_link foaf:page ?wikipage
I suspect this is what you meant to write anyway and this works for though it is still pretty slow because your query is pretty broad and FILTER clauses are often quite sluggish to evaluate.