Aggregate inside Subquery for SPARQL - sparql

Im a using Virtuoso and DBpedia as an endpoint.
My purpose is to retrieve all movies which have a greater amount of actor than the mean number of actors for all movies.
I thought the following query would work:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT
DISTINCT ?film
COUNT(?actor) AS ?numActors
WHERE{
?film rdf:type dbp:Film .
?film dbp:starring ?actor .
{
SELECT
AVG(?numActors) AS ?avgNumActors
WHERE{
SELECT
?Sfilm
COUNT(?Sactor) AS ?numActors
WHERE{
?Sfilm rdf:type dbp:Film .
?Sfilm dbp:starring ?Sactor
}
}
}
}
GROUP BY ?film
HAVING (COUNT(?actor) > ?avgNumActors)
LIMIT 20
but I receveice the following error
Variable ?avgNumActors is used in the result set outside aggregate and not mentioned in GROUP BY clause
What am I doing wrong?

Related

Dbpedia sparql gives empty result to my query, what is wrong?

I'm doing sparql query in this site.
It gives me an empty result, what is wrong with my query?
prefix foaf: <http://xmlns.com/foaf/0.1/>
select * where {
?s rdf:type foaf:Person.
} LIMIT 100
This query is ok, but when I add the second pattern, I got empty result.
?s foaf:name 'Abraham_Robinson'.
prefix foaf: <http://xmlns.com/foaf/0.1/>
select * where {
?s rdf:type foaf:Person.
?s foaf:name 'Abraham_Robinson'.
} LIMIT 100
How to correct my query so the result includes this record:
http://dbpedia.org/resource/Abraham_Robinson
I guess Kingsley misread this as a freetext search, rather than a simple string comparison.
The query Kingsley posted and linked earlier delivers no solution, for the same reasons as the original query failed, as identified in the comment by AKSW, i.e. --
foaf:name values don't generally replaces spaces with underscores as in the original value; i.e., 'Abraham_Robinson' should have been 'Abraham Robinson'
foaf:name strings are typically langtagged, and it is in this case, so that actually needs to be 'Abraham Robinson'#en
Incorporating AKSW's fixes with this line ?s foaf:name 'Abraham Robinson'#en., the query works.
All that said -- you may prefer an alternative query, which will deliver results whether or not the foaf:name value is langtagged and whether or not the spaces are replaced by underscores. This one is Virtuoso-specific, and produces results faster because the bif:contains function uses its free-text indexes, would be --
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT *
WHERE
{
?s rdf:type foaf:Person ;
foaf:name ?name .
?name bif:contains "'Abraham Robinson'" .
}
LIMIT 100
Generic SPARQL using a REGEX FILTER works against both Virtuoso and other RDF stores, but produces results more slowly because REGEX does not leverage the free-text indexes, as in --
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT *
WHERE
{
?s rdf:type foaf:Person ;
foaf:name ?name .
FILTER ( REGEX ( ?name, 'Abraham Robinson' ) ) .
}
LIMIT 100
The following query should work. Right now it isn't working due to a need to refresh the text index associated with this instance (currently in progress):
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT * where {
?s rdf:type foaf:Person.
?s foaf:name "'Abraham Robinson'".
}
LIMIT 100
Note how the phrase is placed within single-quotes that are within double-quotes.
If the literal content is language-tagged, as is the case in DBpedia the exact-match query would take the form (already clarified in #TallTed's response):
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT * where {
?s rdf:type foaf:Person.
?s foaf:name 'Abraham Robinson'#en.
}
LIMIT 100
This SPARQL Results Page Link should produce a solution when the index update completes.

Sparql find entities with dbo type and sort them by count

I have to do a sparql query to the dbpedia endpoint which needs to:
Find all the entities containing "vienna" in the label and "city" in the abstract
Filter them keeping only the ones that have at least one dbo rdf:type
Sort the results by count of dbo types (e.g. if an entity has 5 dbo rdf:type it has to be shown before entities with 4 dbo rdf:type)
I did several attempts, the closest to the result is:
select distinct (str(?s) as ?s) count(?t) as ?total where {{ ?s rdfs:label "vienna"#en. ?s rdf:type ?t.}
UNION { ?s rdfs:label ?l. ?s rdf:type ?t . ?l <bif:contains> '("vienna")'
. FILTER EXISTS { ?s dbo:abstract ?cc. ?cc <bif:contains> '("city")'. FILTER(lang(?cc) = "en").}}
FILTER (!strstarts(str(?s), str("http://dbpedia.org/resource/Category:")))
. FILTER (!strstarts(str(?s), str("http://dbpedia.org/property/")))
. FILTER (!strstarts(str(?s), str("http://dbpedia.org/ontology/")))
. FILTER (strstarts(str(?t), str("http://dbpedia.org/ontology/"))).}
LIMIT 50
Which will (wrongly) count the rdf:type before actually filtering it. I don't want to count rdf:type that are not dbo (ontology).
The idea is to use a subquery in which you search for the entities and to do the counting in the outer query:
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?s (count(*) AS ?cnt)
WHERE
{ { SELECT DISTINCT ?s
WHERE
{ ?s rdfs:label ?l .
?l <bif:contains> '"vienna"'
FILTER langMatches(lang(?l), "en")
FILTER EXISTS { ?s dbo:abstract ?cc .
?cc <bif:contains> '"city"'
FILTER langMatches(lang(?cc), "en")
}
?s rdf:type ?t
FILTER ( ! strstarts(str(?s), str("http://dbpedia.org/resource/Category:")) )
FILTER ( ! strstarts(str(?s), str("http://dbpedia.org/property/")) )
FILTER ( ! strstarts(str(?s), str(dbo:)) )
FILTER strstarts(str(?t), str(dbo:))
}
}
?s ?p ?o
FILTER strstarts(str(?p), str(dbo:))
}
GROUP BY ?s
ORDER BY DESC(?cnt)

SPARQL query to find "notable" people

I am using live Dbpedia (http://dbpedia-live.openlinksw.com/sparql/) to get basic details of notable people. My query is:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
SELECT DISTINCT ?x0 ?name ?dob WHERE {
?x0 rdf:type foaf:Person.
?x0 rdfs:label ?name.
?x0 dbpedia-owl:birthDate ?dob.
FILTER REGEX(?name,"^[A-Z]","i").
} LIMIT 200
This works and I use LIMIT 200 to limit the output to a small number of people. My problem is the 200 people are random, and I want some way of measuring 'notability' such that I return 200 notable people, rather than 200 random people. There are over 500,000 people in Dbpedia.
My question is, how can I measure 'notability' and limit the query to return notable people only? I realize there is no 'notability' property and it is very subjective. I am happy to use any indirect or approximate measure such as number of links or number of references. But I don't know how to do this.
Edit : As a result of the helpful comments I improved the query to include page ranks:
PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo:<http://dbpedia.org/ontology/>
PREFIX vrank:<http://purl.org/voc/vrank#>
SELECT DISTINCT ?s ?name2 ?dob ?v
FROM <http://dbpedia.org>
FROM <http://people.aifb.kit.edu/ath/#DBpedia_PageRank>
WHERE {
?s rdf:type foaf:Person.
?s rdfs:label ?name.
?s dbo:birthDate ?dob.
?s vrank:hasRank/vrank:rankValue ?v.
FILTER REGEX(?name,"^[A-Z].*").
BIND (str(?name) AS ?name2)
} ORDER BY DESC(?v) LIMIT 100
The problem now is there are lots of duplicates, even though I am using DISTINCT.

My SPARQL query doesn't work at all

I am currently trying to run my query but I keep getting the error that in line 0 the parentheses are not balanced at '}'
I have checked my whole code multiple times, but I don't seem to get it fixed. I am currently using the dbpedia endpoint.
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX yago: <http://dbpedia.org/class/yago/>
PREFIX dbp: <http://dbpedia.org/property/>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?country ?government ?population
WHERE{ ?country dct:subject <http://dbpedia.org/resource>/Category:Countries_in_Europe> ;
rdfs:label ?country;
dbo:government ?government.
?government rdfs:label ?government.
?population rdfs:subClassOf* dbo:PopulatedPlace
rdf:type dbpedia-owl:Country;
rdfs:label ?country ;
prop:populationEstimate ?population .
FILTER (?population < 3000000) .
FILTER ( lang(?country) AND (lang(?(government = 'en')
}
Three rows in the graph should be shown, First with the country as a title, second with the governmenttypes of the countries as a title and the 3rd should be a row with the population descending from the total of 3000000.
Thanks alot in advance for helping me out!
You have multiple errors in this query.
Several things that pop out at me.
Thing 1 --
?government rdfs:label ?government.
You've got several similar ?subject ?predicate ?subject constructions.
Thing 2 --
?population rdfs:subClassOf* dbo:PopulatedPlace
rdf:type dbpedia-owl:Country;
I think you need a semicolon after dbo:PopulatedPlace
Thing 3 --
FILTER ( lang(?country) AND (lang(?(government = 'en')
That FILTER breaks syntax several ways. I think this will do what you intend --
FILTER ( lang(?country) = 'en') .
FILTER ( lang(?government) = 'en') .
Thing 4 --
<http://dbpedia.org/resource>/Category:Countries_in_Europe>
You've got an extra > in mid-string.
Thing 5 --
dbpedia-owl:Country
I think that should be dbo:Country
Thing 6 --
prop:populationEstimate
I think that should be dbp:populationEstimate
There are MANY more issues... I am not sure you're really trying.

How to combine the results using SPARQL (dbpedia.org with linkedmdb.org or freebase.com)

I need to get all awarded movies on 80th Award Ceremony
I tried to write SPARQL query:
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix dbpedia-owl: <http://dbpedia.org/ontology/>
prefix movie: <http://data.linkedmdb.org/resource/movie/>
prefix award: <http://data.linkedmdb.org/page/film_awards_ceremony/180/>
select distinct ?film ?award where {
{ ?film a movie:film.
?award a movie:film_awards_ceremony.
} union
{ ?film a dbpedia-owl:Film }
?film rdfs:label ?label .
}
But the result is full movies list.
I found the data I need also here: https://www.freebase.com/m/02pgky2
How to combine (union) these entities in a right way ?
If is not possible - How to get the result from freebase using SPARQL and dbpedia.org?