SPARQL queries give different answers on different servers

I am trying to run the following SPARQL query:
PREFIX dct: <http://purl.org/dc/terms/>
select distinct ?subject
where
{
?concept rdfs:label 'Artificial intelligence'@en .
?concept ^dct:subject ?subject .
}
LIMIT 100
When I run this on DBpedia's public server, I get the following results:
http://dbpedia.org/sparql?default-graph-uri=&query=PREFIX++dct%3A++%3Chttp%3A%2F%2Fpurl.org%2Fdc%2Fterms%2F%3E+select+distinct+%3Fsubject+where+%7B+%3Fconcept+rdfs%3Alabel+%27Artificial+intelligence%27%40en+.+%3Fconcept+%5Edct%3Asubject+%3Fsubject+.+%7D++LIMIT+100&format=text%2Fhtml&CXML_redir_for_subjs=121&CXML_redir_for_hrefs=&timeout=30000&debug=on
However, running the same query on a locally hosted instance of DBpedia yields: http://34.195.108.80:8891/sparql?default-graph-uri=&query=PREFIX++dct%3A++%3Chttp%3A%2F%2Fpurl.org%2Fdc%2Fterms%2F%3E+select+distinct+%3Fsubject+where+%7B+%3Fconcept+rdfs%3Alabel+%27Artificial+intelligence%27%40en+.+%3Fconcept+%5Edct%3Asubject+%3Fsubject+.+%7D++LIMIT+100&format=text%2Fhtml&CXML_redir_for_subjs=121&CXML_redir_for_hrefs=&timeout=30000&debug=on
Why do the answers differ so much that the two result sets look entirely different?

I don't know what you mean by "different", but without ORDER BY the results are returned in an arbitrary order that depends only on the underlying system. There is not even a guarantee that running the same query twice on the same server returns the results in the same order. Your query returns only 100 results because of the LIMIT 100, so each server may hand back a different 100-row subset of the same result set.
The total number of results is the same for both queries, 271:
PREFIX dct: <http://purl.org/dc/terms/>
SELECT (COUNT(DISTINCT ?subject) AS ?count) WHERE {
?concept rdfs:label 'Artificial intelligence'@en ;
^dct:subject ?subject .
}
To compare the results of the two servers, you have to use ORDER BY:
PREFIX dct: <http://purl.org/dc/terms/>
SELECT ?subject WHERE {
?concept rdfs:label 'Artificial intelligence'@en ;
^dct:subject ?subject .
}
ORDER BY ?subject
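The effect described in this answer can be sketched without touching any endpoint. The following Python snippet is only an illustration: the 271 URIs, the seeds, and the helper names `server_results` and `limit` are made up, and it assumes both servers hold the same result set but return it in a different internal order.

```python
import random

# Hypothetical stand-in for the 271 distinct ?subject values both servers hold.
all_subjects = [f"http://example.org/resource/R{i:03d}" for i in range(271)]

def server_results(seed):
    """Simulate a server returning the same rows in its own internal order."""
    rows = list(all_subjects)
    random.Random(seed).shuffle(rows)
    return rows

def limit(rows, n=100):
    """Equivalent of LIMIT n: keep the first n rows in the order they arrive."""
    return rows[:n]

public = server_results(seed=1)
local = server_results(seed=2)

# Without ORDER BY, each server hands back a different 100-row page.
unordered_match = limit(public) == limit(local)

# With ORDER BY (sorted here), the first 100 rows agree on both servers.
ordered_match = limit(sorted(public)) == limit(sorted(local))

print(unordered_match, ordered_match)
```

The two unordered pages differ even though the full result sets are identical, which is why comparing servers only makes sense on ordered (or complete) results.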

Related

How to construct a list in SPARQL

I have a ttl file that looks like this:
ex:Shape1
  a sh:NodeShape ;
  sh:property ex:Property-1 ;
  rdfs:label "Shape 1" .
ex:Property-1
  a sh:PropertyShape ;
  sh:path ex:property1 ;
  sh:in (
    "Option 1"
    "Option 2"
  ) ;
  sh:name "Property 1" .
ex:property1
  a owl:DatatypeProperty .
After loading the above data into my triple store (which contains many shapes already), what query can I use to retrieve the same data back?
This query gets everything I need except for the list. For the list it only gives a blank node.
PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX ex: <http://example.com/#>
CONSTRUCT {
?subject ?predicate ?object
}
WHERE {
{
bind(ex:Shape1 as ?subject)
ex:Shape1 ?predicate ?object
}
UNION
{
ex:Shape1 sh:property ?subject .
?subject ?predicate ?object
}
UNION
{
ex:Shape1 sh:property/sh:path ?subject .
?subject ?predicate ?object
}
}
First of all, why do you need to query to get the same data back? Do you mean you need to get it in the same syntax and formatting?
To get the same data in the form of triples, you can simply execute SELECT * WHERE { ?s ?p ?o } and it will give you all the triples. It also depends on your triplestore.
The standard query to fetch the items in your list would be the following:
PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX ex: <http://example.com/#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?item
WHERE {
ex:Property-1 sh:in/rdf:rest*/rdf:first ?item
}
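You can read more about how the list items are stored internally (as a chain of nodes linked by rdf:first and rdf:rest, which is what the property path above walks). This link will be helpful.
You can bind ex:Property-1 in the same way, in a UNION branch or in the WHERE clause of your CONSTRUCT query, to get the desired result.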
I hope it is helpful. Good luck.
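To make the internal list representation more concrete, here is a small Python sketch of what the path sh:in/rdf:rest*/rdf:first walks. It is not tied to any triple store: the blank-node labels `_:b1` and `_:b2` and the helpers `objects` and `list_items` are assumptions made for illustration.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SH = "http://www.w3.org/ns/shacl#"
EX = "http://example.com/#"

# Triples a store might hold after parsing: sh:in ( "Option 1" "Option 2" ).
# "_:b1" and "_:b2" are hypothetical blank-node labels.
triples = {
    (EX + "Property-1", SH + "in", "_:b1"),
    ("_:b1", RDF + "first", "Option 1"),
    ("_:b1", RDF + "rest", "_:b2"),
    ("_:b2", RDF + "first", "Option 2"),
    ("_:b2", RDF + "rest", RDF + "nil"),
}

def objects(subject, predicate):
    """Return all objects of triples matching (subject, predicate, ?o)."""
    return [o for s, p, o in triples if s == subject and p == predicate]

def list_items(head):
    """Follow rdf:rest from the head node and collect each rdf:first value."""
    items = []
    node = head
    while node != RDF + "nil":
        items.extend(objects(node, RDF + "first"))
        rest = objects(node, RDF + "rest")
        if not rest:
            break  # malformed list: no rdf:rest on this node
        node = rest[0]
    return items

head = objects(EX + "Property-1", SH + "in")[0]
items = list_items(head)
print(head, items)
```

The CONSTRUCT query in the question only reaches `head`, the first blank node, which is why the list appears as a bare blank node unless the rdf:first and rdf:rest triples along the chain are also matched.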

SPARQL Federated Query takes too long to respond

I have a SPARQL federated query that joins data from Wikidata and DBpedia. When I run the first two sub-queries, the query takes a reasonable amount of time. However, when I add the third SERVICE block, it takes far too long. In the third sub-query I take the entities obtained from the first two and keep only those that are a 'subclass of' 'percussion instrument'.
Here is my query (it should return percussion instruments from the Middle East):
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX bd: <http://www.bigdata.com/rdf#>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX dbp: <http://dbpedia.org/property/>
SELECT DISTINCT
?instrument
(?countryDbpediaID)
(?country as ?wikidataID)
(?countryLabel as ?origin)
WHERE {
SERVICE <https://query.wikidata.org/sparql>
{
SELECT DISTINCT ?country ?countryLabel
WHERE {
?country wdt:P361 wd:Q7204 .
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
}
SERVICE <https://dbpedia.org/sparql>
{
SELECT DISTINCT ?intermediateEntityDbpediaID
?intermediateEntityWikidataUri
?intermediateEntityWikidataID
?countryDbpediaID ?description
FROM <http://dbpedia.org>
WHERE { ?countryDbpediaID owl:sameAs ?country;
rdfs:label ?label ;
foaf:depiction ?image;
rdfs:comment ?description .
?intermediateEntityDbpediaID dbp:origin ?countryDbpediaID;
rdfs:label ?intermediateEntityLabel ;
owl:sameAs ?intermediateEntityWikidataUri .
FILTER (LANG(?label) = "en")
FILTER (LANG(?intermediateEntityLabel) = "en")
FILTER (STRSTARTS(STR(?intermediateEntityWikidataUri), STR('http://www.wikidata.org')))
FILTER (LANG(?description) = "en")
BIND(REPLACE(STR(?intermediateEntityWikidataUri),"http://www.wikidata.org/entity/","","i") AS ?intermediateEntityWikidataID)
}
}
SERVICE <https://query.wikidata.org/sparql>
{
SELECT DISTINCT ?instrument
WHERE {
?instrument wdt:P279 wd:Q133163 .
FILTER (?instrument in (URI(?intermediateEntityWikidataUri)))
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
}
}
I found this related question, but it didn't help me: SPARQL Speed up federated query
Is there any way to optimize this query?
The three federated queries here do not have any shared join variables. The only variables being returned from them (in the sub-queries' SELECT DISTINCT clauses) are all disjoint. That means that evaluation is performing two cartesian joins.
The three sub-queries return 21, 504, and 0 results, respectively. So I think the end result would be zero rows returned. But the query engine may be taking a very sub-optimal route towards that answer and timing out.
Update:
Given the repeated use of variables like ?intermediateEntityWikidataUri, I suspect these are intended to be used to join data across the federated sub-queries. But as written, the query can't do that. For example, given SPARQL's bottom-up semantics, you can't use ?intermediateEntityWikidataUri in the FILTER of the third query without that variable being bound in the same scope.
No matter what else is in the query, this:
SERVICE <https://query.wikidata.org/sparql>
{
SELECT DISTINCT ?instrument
WHERE {
?instrument wdt:P279 wd:Q133163 .
FILTER (?instrument in (URI(?intermediateEntityWikidataUri)))
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
}
will result in zero results, because the filter expression is evaluated using an unbound variable (which will filter out all results).
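The scoping problem can be mimicked in a few lines of Python. This is only a rough model of FILTER semantics (an expression that raises an error removes the solution), not a SPARQL engine; the helper names and the placeholder identifiers `wd:Q_A` and `wd:Q_B` are hypothetical.

```python
class UnboundVariable(Exception):
    """Raised when an expression reads a variable that has no binding."""

def value(solution, var):
    """Look up a variable in a solution, erroring if it is unbound."""
    if var not in solution:
        raise UnboundVariable(var)
    return solution[var]

def apply_filter(solutions, condition):
    """Keep a solution only if the condition is true; an error drops it."""
    kept = []
    for solution in solutions:
        try:
            if condition(solution):
                kept.append(solution)
        except UnboundVariable:
            pass  # error in FILTER: the solution is eliminated
    return kept

def in_filter(solution):
    # Models: FILTER (?instrument IN (URI(?intermediateEntityWikidataUri)))
    return value(solution, "instrument") == value(
        solution, "intermediateEntityWikidataUri"
    )

# Solutions of the third sub-query evaluated on its own (bottom-up):
# only ?instrument is bound inside that scope.
inner = [{"instrument": "wd:Q_A"}, {"instrument": "wd:Q_B"}]
isolated = apply_filter(inner, in_filter)

# The same filter applied where the other variable is bound in the same scope.
joined = [dict(s, intermediateEntityWikidataUri="wd:Q_A") for s in inner]
in_scope = apply_filter(joined, in_filter)

print(len(isolated), len(in_scope))
```

In this model the isolated sub-query keeps nothing, while the filter behaves as intended once both variables live in the same scope.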

Aggregate inside Subquery for SPARQL

I am using Virtuoso with DBpedia as the endpoint.
My goal is to retrieve all movies that have more actors than the mean number of actors across all movies.
I thought the following query would work:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT
DISTINCT ?film
COUNT(?actor) AS ?numActors
WHERE{
?film rdf:type dbp:Film .
?film dbp:starring ?actor .
{
SELECT
AVG(?numActors) AS ?avgNumActors
WHERE{
SELECT
?Sfilm
COUNT(?Sactor) AS ?numActors
WHERE{
?Sfilm rdf:type dbp:Film .
?Sfilm dbp:starring ?Sactor
}
}
}
}
GROUP BY ?film
HAVING (COUNT(?actor) > ?avgNumActors)
LIMIT 20
but I receive the following error:
Variable ?avgNumActors is used in the result set outside aggregate and not mentioned in GROUP BY clause
What am I doing wrong?

Dbpedia sparql gives empty result to my query, what is wrong?

I'm running a SPARQL query on this site.
It gives me an empty result, what is wrong with my query?
prefix foaf: <http://xmlns.com/foaf/0.1/>
select * where {
?s rdf:type foaf:Person.
} LIMIT 100
This query is OK, but when I add a second pattern, I get an empty result:
?s foaf:name 'Abraham_Robinson'.
prefix foaf: <http://xmlns.com/foaf/0.1/>
select * where {
?s rdf:type foaf:Person.
?s foaf:name 'Abraham_Robinson'.
} LIMIT 100
How can I correct my query so that the result includes this record:
http://dbpedia.org/resource/Abraham_Robinson
I guess Kingsley misread this as a freetext search, rather than a simple string comparison.
The query Kingsley posted and linked earlier delivers no solution, for the same reasons as the original query failed, as identified in the comment by AKSW, i.e. --
foaf:name values don't generally replace spaces with underscores, as the original query assumed; i.e., 'Abraham_Robinson' should have been 'Abraham Robinson'
foaf:name strings are typically langtagged, and this one is, so the value actually needs to be 'Abraham Robinson'@en
Incorporating AKSW's fixes with this line ?s foaf:name 'Abraham Robinson'@en., the query works.
All that said, you may prefer an alternative query, which will deliver results whether or not the foaf:name value is langtagged and whether or not the spaces are replaced by underscores. The following one is Virtuoso-specific and produces results faster, because the bif:contains function uses Virtuoso's free-text indexes:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT *
WHERE
{
?s rdf:type foaf:Person ;
foaf:name ?name .
?name bif:contains "'Abraham Robinson'" .
}
LIMIT 100
Generic SPARQL using a REGEX FILTER works against both Virtuoso and other RDF stores, but produces results more slowly because REGEX does not leverage the free-text indexes, as in --
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT *
WHERE
{
?s rdf:type foaf:Person ;
foaf:name ?name .
FILTER ( REGEX ( ?name, 'Abraham Robinson' ) ) .
}
LIMIT 100
The following query should work. Right now it isn't working due to a need to refresh the text index associated with this instance (currently in progress):
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT * where {
?s rdf:type foaf:Person.
?s foaf:name "'Abraham Robinson'".
}
LIMIT 100
Note how the phrase is placed within single-quotes that are within double-quotes.
If the literal content is language-tagged, as is the case in DBpedia, the exact-match query would take the form (already clarified in @TallTed's response):
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT * where {
?s rdf:type foaf:Person.
?s foaf:name 'Abraham Robinson'@en.
}
LIMIT 100
This SPARQL Results Page Link should produce a solution when the index update completes.
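The exact-match failures discussed in these answers come down to how literals are compared: a triple pattern matches a literal only if both the text and the language tag agree. The following Python sketch models that with a hypothetical `Literal` tuple; it is an illustration of the comparison rule, not of any store's implementation.

```python
import re
from typing import NamedTuple, Optional

class Literal(NamedTuple):
    """Minimal model of an RDF literal: lexical form plus optional language tag."""
    lexical: str
    lang: Optional[str] = None

# The value as described in the answers: space-separated and tagged @en.
stored = Literal("Abraham Robinson", "en")

candidates = {
    "underscore, no tag": Literal("Abraham_Robinson"),
    "space, no tag": Literal("Abraham Robinson"),
    "space, @en": Literal("Abraham Robinson", "en"),
}

# A triple pattern matches only when lexical form AND language tag are equal.
matches = {name: lit == stored for name, lit in candidates.items()}

# A REGEX-style filter looks at the lexical form only, so the tag is irrelevant.
regex_match = re.search("Abraham Robinson", stored.lexical) is not None

print(matches, regex_match)
```

Only the last candidate matches exactly, whereas the regex-style check succeeds regardless of the tag, mirroring the trade-off between the exact-match and REGEX queries.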

How to get dct:subjects for all entities using SPARQL

I am using the following query
PREFIX dct: <http://purl.org/dc/terms/>
SELECT ?subject WHERE {
?concept rdfs:label 'Exoskeleton'@en ;
^dct:subject ?subject .
}
ORDER BY ?subject
This doesn't give me any results. However, the rdfs:label exists. (See http://dbpedia.org/page/Exoskeleton.)
When I query with a different label, it works:
PREFIX dct: <http://purl.org/dc/terms/>
SELECT ?subject WHERE {
?concept rdfs:label 'Machine learning'@en ;
^dct:subject ?subject .
}
ORDER BY ?subject
The above query works and gives me the results. (See http://dbpedia.org/page/Machine_learning.)
What do I change, such that the first query works too?
The dct:subject predicate is used between a page and a category it belongs to. So, your second query is giving you results that are in Category:Machine learning. But since there is no Category:Exoskeleton, your first query gives you no results. This also means the two pages you linked to are irrelevant to your queries.
I don't know how to change the first query so that it works, because I don't understand what "working" would entail.
The devil is in the details:
PREFIX dct: <http://purl.org/dc/terms/>
SELECT ?concept WHERE {
?concept rdfs:label 'Machine learning'@en.
}
ORDER BY ?concept
Returns two results:
http://dbpedia.org/resource/Category:Machine_learning
http://dbpedia.org/resource/Machine_learning
While Exoskeleton has only the page resource and no corresponding category concept:
http://dbpedia.org/resource/Exoskeleton
Thus your inverse property path finds resources under the http://dbpedia.org/resource/Category:Machine_learning concept, but none under the http://dbpedia.org/resource/Machine_learning or http://dbpedia.org/resource/Exoskeleton pages.
If you drop the inverse modifier,
PREFIX dct: <http://purl.org/dc/terms/>
SELECT ?subject WHERE {
?concept rdfs:label 'Exoskeleton'@en ;
dct:subject ?subject .
}
ORDER BY ?subject
will return the categories (subjects) of the resources that carry the given label.
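The difference between dct:subject and ^dct:subject can be sketched on toy data in Python. The triples below are illustrative only and are not actual DBpedia content: `ex:SomePage` and `ex:SomeCategory` are hypothetical placeholders, and the helpers `forward` and `inverse` are names invented for this sketch.

```python
# Toy labels: a page and a category share the label "Machine learning",
# while "Exoskeleton" labels only a page.
labels = {
    "dbr:Machine_learning": "Machine learning",
    "dbc:Machine_learning": "Machine learning",
    "dbr:Exoskeleton": "Exoskeleton",
}

# Toy (page, dct:subject, category) links.
subject_links = [
    ("ex:SomePage", "dbc:Machine_learning"),
    ("dbr:Exoskeleton", "ex:SomeCategory"),
]

def concepts(label):
    """All resources carrying the given label."""
    return [r for r, l in labels.items() if l == label]

def inverse(label):
    """?concept ^dct:subject ?subject: pages filed under a labelled category."""
    return [page for page, cat in subject_links if cat in concepts(label)]

def forward(label):
    """?concept dct:subject ?subject: categories of a labelled page."""
    return [cat for page, cat in subject_links if page in concepts(label)]

print(inverse("Machine learning"))  # pages in the category
print(inverse("Exoskeleton"))       # nothing: no category has this label
print(forward("Exoskeleton"))       # categories of the page
```

With the inverse path, a label only yields results if a category carries it; with the forward path, the label has to belong to a page that is filed under some category.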