German DBpedia endpoint in HTTP Sparql Queries returns no result - sparql

I am using the Jena Java framework for querying DBpedia end point using SPARQL, to get the type for all points of interest in German cities. I am facing no issue for places that have English DBpedia entries. But, when it comes to place names to be queried from the German DBpedia endpoint (http://de.dbpedia.org/resource/Schloß_Nymphenburg), this query returns no result. This problem is also mentioned over here (http://mail-archives.apache.org/mod_mbox/jena-users/201110.mbox/%3C4E877C8A.4050705#apache.org%3E). Even after referring to this, I am unable to solve the problem. I don't know how to work with QueryEngineHTTP. I am adding two code snippets - one that works (first one - query for Allianz Arena : which has an English entry in DBpedia) and one that doesn't work (second one - for Schloß Nymphenburg, that has a German entry).
This might be a very trivial issue, but I am unable to solve it. Any pointers to a solution would be very very helpful.
Thanks a lot!
Code 1 - working :
String service = "http://dbpedia.org/sparql";
final ParameterizedSparqlString query = new ParameterizedSparqlString(
"PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>" +
"PREFIX dbo: <http://dbpedia.org/ontology/>" +
"PREFIX dcterms: <http://purl.org/dc/terms/>" +
"SELECT * WHERE {" +
"?s geo:lat ?lat ." +
"?s geo:long ?long ." +
"?s dcterms:subject ?sub}");
query.setIri("?s", "http://dbpedia.org/resource/Allianz_Arena");
QueryExecution qe = QueryExecutionFactory.sparqlService(service, query.toString());
ResultSet results = qe.execSelect();
ResultSetFormatter.out(System.out, results);
Code 2 - not working :
String service = "http://dbpedia.org/sparql";
final ParameterizedSparqlString query = new ParameterizedSparqlString(
"PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>" +
"PREFIX dbo: <http://dbpedia.org/ontology/>" +
"PREFIX dcterms: <http://purl.org/dc/terms/>" +
"SELECT * WHERE {" +
"?s geo:lat ?lat ." +
"?s geo:long ?long ." +
"?s dcterms:subject ?sub}");
query.setIri("?s", "http://de.dbpedia.org/resource/Schloß_Nymphenburg");
QueryExecution qe = QueryExecutionFactory.sparqlService(service, query.toString());
ResultSet results = qe.execSelect();
ResultSetFormatter.out(System.out, results);

I don't think this is an issue with jena at all. Trying:
SELECT * WHERE {
<http://de.dbpedia.org/resource/Schloß_Nymphenburg> ?p ?o }
at http://dbpedia.org/sparql I get no results: try it yourself.
SELECT * WHERE {
<http://de.dbpedia.org/resource/Schloss_Nymphenburg> ?p ?o }
by contrast returns something, even if it's just a bunch of cross links.

Related

SPARQL execdescribe query in jena only giving prefix as output

Below SPARQL execdescribe query giving me only two prefixes as output while running with Jena query but when I run this query on virtuoso SPARQL endpoint it's giving perfect output.
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX db: <http://dbpedia.org/ontology/>
PREFIX prop: <http://dbpedia.org/property/>
DESCRIBE ?movie ?author ?genre
WHERE {
?movie rdf:type db:Film ;
prop:author ?author ;
prop:genre ?genre .
}
LIMIT 2
OFFSET 0
When I run with Jena I am getting only two line output like this,
#prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
#prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
Below is my code which I am using but with some query its working fine,
String queryString = "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>" +
"PREFIX db: <http://dbpedia.org/ontology/> " +
"PREFIX prop: <http://dbpedia.org/property/>" +
"DESCRIBE ?movie ?author ?genre" +
"WHERE { " +
"?movie rdf:type db:Film ;" +
"prop:author ?author ;" +
"prop:genre ?genre ." +
"}" +
"LIMIT 2" +
"OFFSET 0";
QueryExecution qexec = QueryExecutionFactory.sparqlService("http://localhost:8890/sparql", queryString);
Model results = qexec.execDescribe();
results.write(System.out,"TTL");
Its perfectly giving me output on virtuoso SPARQL end point.
below is screen shot,
As suggested in comments by #AKSW, There is a space issue in the query. So after edited my query, it will perfectly work fine.

SPARQL Matching Literals with **ANY** Language Tags without run into timeout

I need to select the entity that have a "taxon rank (P105)" of "species (Q7432)" which have a label that match a literal string such as "Topinambur".
I'm testing the queries on https://query.wikidata.org;
this query goes fine and return the entity to me with satisfying response time:
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT * WHERE {
?entity rdfs:label "Topinambur"#de .
?entity wdt:P105 wd:Q7432.
}
LIMIT 100
The problem here is that my requisite is not to specify the language but the lexical forms of the labels in the underlying dataset ( wikidata) has language tags so i need a way to get Literal Equality for any language.
I tried some possible solution but I didn't find any query that didn't result in the following:
TIMEOUT message com.bigdata.bop.engine.QueryTimeoutException: Query deadline is expired
Here the list of what I tried (..and I always get TIMEOUT) :
1) based on this answer I tried:
SELECT * WHERE {
?entity rdfs:label ?label FILTER ( str( ?label ) = "Topinambur") .
?entity wdt:P105 wd:Q7432.
}
LIMIT 100
2) based on some other documentation I tried:
SELECT * WHERE {
?entity wdt:P105 wd:Q7432.
?entity rdfs:label ?label FILTER regex(?label, "^Topinambur") .
}
LIMIT 100
3) and
SELECT * WHERE {
?entity wdt:P105 wd:Q7432.
?entity rdfs:label ?label .
FILTER langMatches( lang(?label), "*" )
FILTER (?label = "Topinambur")
}
LIMIT 100
What I'm looking for is a performant solution or some SPARQL syntax the doesn't end up to a TIMEOUT message.
PS: with reference to http://www.rfc-editor.org/rfc/bcp/bcp47.txt I don't understand if language ranges or ```wildcards`` could help in some way.
EDIT
I successfully tested (without falling timeout) a similar query in DbPedia by using virtuoso query editor at:
https://dbpedia.org/sparql
Default Data Set Name (Graph IRI):http://dbpedia.org
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?resource
WHERE {
?resource rdfs:label ?label . FILTER ( str( ?label ) = "Topinambur").
?resource rdf:type dbo:Species
}
LIMIT 100
I am still very interested in understanding the performance problem that I experience on Wikidata and what is the best syntax to use.
I solved similar problem - want to find entity with label string in any language. I recommend do not use FILTER, because it is too slow. Rather use UNION like this:
SELECT ?entity WHERE {
?entity wdt:P105 wd:Q7432.
{ ?entity rdfs:label "Topinambur"#de . }
UNION { ?entity rdfs:label "Topinambur"#en . }
UNION { ?entity rdfs:label "Topinambur"#fr . }
}
GROUP BY ?entity
LIMIT 100
Try it!
This solution is not perfect, because you have to enumater all languages, but is fast and reliable. List of all available wikidata language are here.
This answer proposes three options:
Be more specific.
The ?entity wdt:P171+ wd:Q25314 pattern seems to be sufficiently selective in your case.
Wait until they implement full-text search.
Use Quarry (example query).
Another option is to use Virtuoso full-text search capabilities on wikidata.dbpedia.org:
SELECT ?s WHERE {
?resource rdfs:label ?label .
?label bif:contains "'topinambur'" .
BIND ( IRI ( REPLACE ( STR(?resource),
"http://wikidata.dbpedia.org/resource",
"http://www.wikidata.org/entity"
)
) AS ?s
)
}
Try it!
It seems that even the query below sometime works on wikidata.dbpedia.org without falling into timeout:
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?resource WHERE {
?resource rdfs:label ?label .
FILTER ( STR(?label) = "Topinambur" ) .
}
Try it!
Two hours ago I've removed this statement on Wikidata:
wd:Q161378 rdfs:label "topinambur"#ru .
I'm not a botanist, but 'topinambur' is definitely not a word in Russian.
Working further on from #quick's answer, and showing it for lexemes rather than labels. First identifying relevant language codes:
SELECT (GROUP_CONCAT(?mword; separator=" ") AS ?mwords) {
BIND(1 AS ?dummy)
VALUES ?word { "topinambur" }
{
SELECT (COUNT(?lexeme) AS ?count) ?language_code {
?lexeme dct:language / wdt:P424 ?language_code .
}
GROUP BY ?language_code
HAVING (?count > 100)
ORDER BY DESC(?count)
}
BIND(CONCAT('"', ?word, '"#', ?language_code) AS ?mword)
}
GROUP BY ?dummy
Try it!
Followed by the verbose query
SELECT (COUNT(?lexeme) AS ?count) ?language (GROUP_CONCAT(?word; separator=" ") AS ?words) {
VALUES ?word { "topinambur"#eo "topinambur"#ko "topinambur"#bfi "topinambur"#nl "topinambur"#uk "topinambur"#cy "topinambur"#pt "topinambur"#zh "topinambur"#br "topinambur"#bg "topinambur"#ms "topinambur"#tg "topinambur"#se "topinambur"#ta "topinambur"#non "topinambur"#it "topinambur"#zh-min-nan "topinambur"#nan "topinambur"#fi "topinambur"#jbo "topinambur"#ml "topinambur"#ja "topinambur"#ku "topinambur"#bn "topinambur"#ar "topinambur"#nb "topinambur"#es "topinambur"#pl "topinambur"#nn "topinambur"#sk "topinambur"#da "topinambur"#de "topinambur"#cs "topinambur"#fr "topinambur"#sv "topinambur"#eu "topinambur"#he "topinambur"#la "topinambur"#en "topinambur"#ru }
?lexeme dct:language ?language ;
ontolex:lexicalForm / ontolex:representation ?word .
}
GROUP BY ?language
Try it!
For querying on labels, do something similar to:
SELECT (COUNT(?item) AS ?count) ?language (GROUP_CONCAT(?word; separator=" ") AS ?words) {
VALUES ?word { "topinambur"#eo "topinambur"#ko "topinambur"#bfi "topinambur"#nl "topinambur"#uk "topinambur"#cy "topinambur"#pt "topinambur"#zh "topinambur"#br "topinambur"#bg "topinambur"#ms "topinambur"#tg "topinambur"#se "topinambur"#ta "topinambur"#non "topinambur"#it "topinambur"#zh-min-nan "topinambur"#nan "topinambur"#fi "topinambur"#jbo "topinambur"#ml "topinambur"#ja "topinambur"#ku "topinambur"#bn "topinambur"#ar "topinambur"#nb "topinambur"#es "topinambur"#pl "topinambur"#nn "topinambur"#sk "topinambur"#da "topinambur"#de "topinambur"#cs "topinambur"#fr "topinambur"#sv "topinambur"#eu "topinambur"#he "topinambur"#la "topinambur"#en "topinambur"#ru }
?item rdfs:label ?word ;
}
GROUP BY ?language

SPARQL query with FILTER returning correct values for its variables BUT only the given value for the variable being filtered

I'm using Jena to make a SPARQL query to search for all a document's properties by its subject. But the documents can have more than one subject, and when I do the search, it doesn't return me all the documents' properties, including all the documents' subjects, but even if it has 3 subjects (for example) it returns me all the documents properties + only the subject I set at FILTER.
I'd like to have as a return all the properties from the found document + all the subjects (that belong to the found document) and not only the one at FILTER.
Query (this.subject is a variable that has its value set in a JSF page):
String queryString = "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> " +
"PREFIX dc: <http://purl.org/dc/elements/1.1/> " +
"PREFIX foaf: <http://xmlns.com/foaf/0.1/> SELECT ?document ?subject" +
" ?title ?description ?language WHERE { " +
"?document dc:title ?title." +
"?document dc:subject ?subject." +
"?document dc:description ?description." +
"?document dc:language ?language." +
"FILTER ( regex(?subject, replace( \"" + this.subject + "\", ' ', '|' ), 'i' )). }";
Thank you!
You likely want to use a sub-query to restrict to the documents matching the FILTER and then select the rest of the stuff you are actually interesting in e.g.
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dc : <http://purl.org/dc/elements/1.1/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?document ?subject ?title ?description ?language
WHERE
{
{
SELECT ?document
WHERE
{
?document dc:subject ?subject .
FILTER(REGEX(?subject, REPLACE("search term", " ", "|"), "i"))
}
}
?document dc:title ?title ;
dc:description ?description ;
dc:subject ?subject ;
dc:language ?language .
}
Note that this is still going to give you a row for each document-subject combination so if you have a document with 3 subjects you'll get three rows for that document. If you want to combine documents into a single row then you can use GROUP BY and then a GROUP_CONCAT aggregate, there are other questions already on Stack Overflow that detail how to do that.
Notes
Also note that using simple string concatenation to inject constants into your query is ill advised, take a look at Jena's ParameterizedSparqlString for a more user friendly and SPARQL injection proof API for building queries.

DBpedia query returns results with Virtuoso but not with Jena

I'm trying to create a SPARQL query with Jena to query DBpedia. The query is working when I use Virtuoso but when I plug it into the following Java code, it returns an empty set.
String sr="Christopher_Nolan";
String sparqlQueryString1 = "PREFIX dbont: <http://dbpedia.org/ontology/> "+
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>"+
"PREFIX foaf: <http://xmlns.com/foaf/0.1/> "+
" SELECT distinct ?s"+
" WHERE { "+
"?s foaf:name ?label ." +
"filter(?label=\"Christpher_Nolan\"#en)." +
" }";
Query query = QueryFactory.create(sparqlQueryString1);
QueryExecution qexec = QueryExecutionFactory.sparqlService("http://dbpedia.org/sparql", query);
ResultSet results = qexec.execSelect();
ResultSetFormatter.out(System.out,results, query);
qexec.close() ;
I don't think this is a problem with Jena, but with your particular query. Your query, when run on the DBpedia SPARQL endpoint, produces no results
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT distinct ?s
WHERE {
?s foaf:name ?label .
filter(?label="Christpher_Nolan"#en)
}
SPARQL results (no results)
However, if you add an o to the name Christopher, and change the underscore to a space, you get three results:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT distinct ?s
WHERE {
?s foaf:name ?label .
filter(?label="Christopher Nolan"#en)
}
SPARQL results (3 results)
s
http://dbpedia.org/resource/Christopher_Nolan_(author)
http://dbpedia.org/resource/Christopher_Nolan
http://dbpedia.org/resource/Chris_Nolan_(musician)
I'd also point out that this is a rather unusual use of filter. If you want to select triples that use a certain value, just put that value into the triple pattern:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT distinct ?s
WHERE {
?s foaf:name "Christopher Nolan"#en
}
SPARQL results (same 3 results)

get latitude and longitude of a place dbpedia

I want to get the latitude and longitude of a place whose name I already know by
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT * WHERE {
?s a dbo:Place .
?s geo:lat ?lat .
?s geo:long ?long .
}
where the name of the place (?s) is something like Graves Park.
How would one go about implementing the same in Jena where the name of the place might vary?
You can use Jena's ARQ to execute queries against remote SPARQL endpoints. The process is described in ARQ — Querying Remote SPARQL Services.
Using ParameterizedSparqlStrings in SELECT queries
To do this for different places that you might not know until it is time to execute the query, you can use a ParameterizedSparqlString to hold the query and then inject the value(s) once you have them. Here's an example. The query is the one you provided. I put it into a ParameterizedSparqlString, and then used setIri to set ?s to http://dbpedia.org/resource/Mount_Monadnock.
import com.hp.hpl.jena.query.ParameterizedSparqlString;
import com.hp.hpl.jena.query.QueryExecution;
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.query.ResultSet;
import com.hp.hpl.jena.query.ResultSetFormatter;
public class DBPediaQuery {
public static void main( String[] args ) {
final String dbpedia = "http://dbpedia.org/sparql";
final ParameterizedSparqlString queryString
= new ParameterizedSparqlString(
"PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>"+
"PREFIX dbo: <http://dbpedia.org/ontology/>" +
"SELECT * WHERE {" +
" ?s a dbo:Place ." +
" ?s geo:lat ?lat ." +
" ?s geo:long ?long ." +
"}" );
queryString.setIri( "?s", "http://dbpedia.org/resource/Mount_Monadnock");
QueryExecution exec = QueryExecutionFactory.sparqlService( dbpedia, queryString.toString() );
ResultSet results = exec.execSelect();
ResultSetFormatter.out( System.out, results );
}
}
The results printed by this are:
--------------------------------------------------------------------------------------------------------------
| lat | long |
==============================================================================================================
| "42.8608"^^<http://www.w3.org/2001/XMLSchema#float> | "-72.1081"^^<http://www.w3.org/2001/XMLSchema#float> |
--------------------------------------------------------------------------------------------------------------
Once you have the ResultSet, you can iterate through the rows of the solution and extract the values. The values here are Literals, and from a Literal you can extract the lexical form (the string value), or the value as the corresponding Java type (in the case of numbers, strings, booleans, &c.). You could do the the following to print the latitude and longitude instead of using the ResultSetFormatter:
while ( results.hasNext() ) {
QuerySolution solution = results.next();
Literal latitude = solution.getLiteral( "?lat" );
Literal longitude = solution.getLiteral( "?long" );
String sLat = latitude.getLexicalForm();
String sLon = longitude.getLexicalForm();
float fLat = latitude.getFloat();
float fLon = longitude.getFloat();
System.out.println( "Strings: " + sLat + "," + sLon );
System.out.println( "Floats: " + fLat + "," + fLon );
}
The output after this change is:
Strings: 42.8608,-72.1081
Floats: 42.8608,-72.1081
Using ParameterizedSparqlStrings in CONSTRUCT queries
Based some of the comments, it may also be useful to use CONSTRUCT queries to save the results from each query, and to aggregate them into a larger model. Here's code that uses a construct query to retrieve the latitude and longitude of Mount Monadnock and Mount Lafayette, and stores them in a single model. (Here we're just using CONSTRUCT WHERE {…}, so the model that is returned is exactly the same as the part of the graph that matched. You can get different results by using CONSTRUCT {…} WHERE {…}.)
import com.hp.hpl.jena.query.ParameterizedSparqlString;
import com.hp.hpl.jena.query.QueryExecution;
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
public class DBPediaQuery {
public static void main( String[] args ) {
final String dbpedia = "http://dbpedia.org/sparql";
final ParameterizedSparqlString queryString
= new ParameterizedSparqlString(
"PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>"+
"PREFIX dbo: <http://dbpedia.org/ontology/>" +
"CONSTRUCT WHERE {" +
" ?s a dbo:Place ." +
" ?s geo:lat ?lat ." +
" ?s geo:long ?long ." +
"}" );
Model allResults = ModelFactory.createDefaultModel();
for ( String mountain : new String[] { "Mount_Monadnock", "Mount_Lafayette" } ) {
queryString.setIri( "?s", "http://dbpedia.org/resource/" + mountain );
QueryExecution exec = QueryExecutionFactory.sparqlService( dbpedia, queryString.toString() );
Model results = exec.execConstruct();
allResults.add( results );
}
allResults.setNsPrefix( "geo", "http://www.w3.org/2003/01/geo/wgs84_pos#" );
allResults.setNsPrefix( "dbo", "http://dbpedia.org/ontology/" );
allResults.setNsPrefix( "dbr", "http://dbpedia.org/resource/" );
allResults.write( System.out, "N3" );
}
}
The output shows triples from both queries:
#prefix dbr: <http://dbpedia.org/resource/> .
#prefix geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> .
#prefix dbo: <http://dbpedia.org/ontology/> .
dbr:Mount_Lafayette
a dbo:Place ;
geo:lat "44.1607"^^<http://www.w3.org/2001/XMLSchema#float> ;
geo:long "-71.6444"^^<http://www.w3.org/2001/XMLSchema#float> .
dbr:Mount_Monadnock
a dbo:Place ;
geo:lat "42.8608"^^<http://www.w3.org/2001/XMLSchema#float> ;
geo:long "-72.1081"^^<http://www.w3.org/2001/XMLSchema#float> .