Wikidata SPARQL query returns wrong results - sparql

Why does this query return Gabriel Heinze but not Cristiano Ronaldo? Both satisfy the criteria.
SELECT DISTINCT ?person ?personLabel WHERE {
?person wdt:P54 wd:Q18656.
?person wdt:P54 wd:Q75729.
?person wdt:P54 wd:Q8682.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}

When you query the clubs Cristiano Ronaldo has been a member of using the wdt: prefix, only Real Madrid is returned, because that statement has a higher rank than the others and wdt: only matches the best-ranked statements for a property (preferred rank if any statement has it, otherwise normal rank).
Unfortunately, there is no direct substitute for wdt: that would include lower-ranked statements, but you can use a combination of p: and ps: instead:
the fixed query to find all clubs Ronaldo has been part of
your query fixed
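The linked queries are not reproduced here. As a rough sketch of what the fix presumably looks like (not necessarily the answerer's exact query), each wdt:P54 triple can be replaced by a p:P54/ps:P54 path. The path goes through the statement node, so it matches statements of any rank:

```sparql
SELECT DISTINCT ?person ?personLabel WHERE {
  # p: links the item to a statement node, ps: links that node to its value.
  # Unlike wdt:, this is not restricted to the best-ranked statements.
  ?person p:P54/ps:P54 wd:Q18656 .
  ?person p:P54/ps:P54 wd:Q75729 .
  ?person p:P54/ps:P54 wd:Q8682 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```

This assumes the prefixes predefined by the Wikidata Query Service. Note that the path also picks up deprecated statements; to exclude them, the statement node would have to be bound to a variable and filtered on its wikibase:rank.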
Thanks for asking: researching, I finally learned what the t stands for in wdt: truthy \o/

Related

Duplicates in SPARQL clause VALUES lead to unexplainable result

I accidentally noticed that if you write a SPARQL query like this
SELECT ?id ?idLabel WHERE{
VALUES ?id { wd:Q1 wd:Q1 }.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
the result is kinda strange: four rows instead of the expected two.
I get that it's lame to write duplicates in the VALUES clause, but I'm just wondering why it works like this. Could someone explain, please?
It's a bug in the non-standard label service. When running this query you get two results for the wikibase:label of wd:Q1, and these are then joined to each wd:Q1 row. So with 2 identical values you get 4 rows, with 3 identical values you get 9, and so on.
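One way to sidestep the label service, sketched here under the assumption that labels are exposed as rdfs:label (as they are on the Wikidata Query Service), is to match the label with an ordinary triple pattern:

```sparql
SELECT ?id ?idLabel WHERE {
  VALUES ?id { wd:Q1 wd:Q1 }
  # Plain join: each VALUES row is matched against the label triple on its own
  ?id rdfs:label ?idLabel .
  FILTER(LANG(?idLabel) = "en")
}
```

Under standard SPARQL semantics this should give one row per VALUES entry. The trade-off is that, unlike the label service, it returns no row at all for items that lack an English label.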

SPARQL query for semantic similarity in Wikidata times out

I'd like to find entities "similar" to John Harrison in Wikidata. My naive SPARQL query always times out.
SELECT ?similar ?similarLabel (COUNT(?p) AS ?similarity) WHERE {
wd:Q314335 ?p ?o.
?similar ?p ?o.
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
GROUP BY ?similar ?similarLabel
HAVING (?similarity > 5)
ORDER BY DESC(?similarity)
I've tried limiting the number of properties from a sub query, but it still times out.
Is there a more efficient SPARQL query that might succeed? Are there any other Blazegraph extensions (e.g., GAS) that might help here?

Finding specific person in dbpedia using sparql query

So I'm learning SPARQL and trying to get a specific person out of DBpedia using a SPARQL query. I'm using their endpoint at https://dbpedia.org/sparql to do this query. The person I'm trying to find is Albert Einstein. I found on his page that he has the property dbp:name. So I wrote a query to find Albert Einstein by his name:
select ?person where {?person <http://dbpedia.org/property/name> "Albert Einstein"} LIMIT 100
Unfortunately, this gives me just an empty result set (I think), as I'm just seeing an empty table with the header 'person'.
What's going wrong?
I'm not sure why your query doesn't work, but this one does:
select ?person where {?person foaf:name "Albert Einstein"@en} LIMIT 100
Though this returns all entities named "Albert Einstein", which includes the scientist, but also an album of the same name. If you want just people, you can use:
select ?person
where {
?person foaf:name "Albert Einstein"@en.
?person a foaf:Person.
}
LIMIT 100
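One likely reason the original query returns nothing, offered as an assumption since the answer leaves it open: in RDF, a plain literal "Albert Einstein" and a language-tagged literal "Albert Einstein"@en are different terms, so a pattern using the untagged form does not match a tagged value. If DBpedia stores dbp:name with an English tag, a sketch like this should match it:

```sparql
PREFIX dbp: <http://dbpedia.org/property/>

SELECT ?person WHERE {
  # Assumption: the stored value carries the @en language tag
  ?person dbp:name "Albert Einstein"@en .
}
LIMIT 100
```

If the tag is unknown, binding the value to a variable and comparing its lexical form with FILTER(STR(?name) = "Albert Einstein") is a tag-independent alternative, though it may be slower.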

Sparql - Traverse to top (broadest)

I have an internal triples dataset.
I am trying to implement a typeahead feature for an application using the dataset.
I am trying to figure out how I can conditionally traverse to the top concept.
In this picture, if I searched for Dreamworks, it would be 3 layers down (Business Organizations -> Private Company -> Dreamworks).
If I do something else, obviously it will be at different layers of the graph.
I am trying to get the value "Organization" or "Person" as the top level.
This very straightforward query works.
SELECT DISTINCT ?subjectPrefLabel ?a ?b ?o
WHERE
{
?subject skosxl:prefLabel/skosxl:literalForm ?subjectPrefLabel .
?subject skos:broader/skos:broader/skos:broader/skosxl:prefLabel/skosxl:literalForm ?a
FILTER regex(?subjectPrefLabel, "Dreamworks", 'i')
}
ORDER BY ?subjectPrefLabel
However, this is obviously unsustainable, since the developer would need to know how many levels down the hierarchy the concept sits in order to issue the correct number of skos:broader steps.
Is there any way I can conditionally discover the highest level of the hierarchy?
As this is an internal Ontology, I cannot share any data, so I know it will be hard to work with, any advice is appreciated.
Using the Kleene star operator * in combination with a FILTER requiring that there is no broader concept should return the top concepts only:
SELECT DISTINCT ?subjectPrefLabel ?a ?b ?o WHERE {
?subject skosxl:prefLabel/skosxl:literalForm ?subjectPrefLabel .
?subject skos:broader* ?concept.
?concept skosxl:prefLabel/skosxl:literalForm ?a
FILTER NOT EXISTS { ?concept skos:broader ?supConcept }
FILTER regex(?subjectPrefLabel, "Dreamworks", 'i')
}
ORDER BY ?subjectPrefLabel
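For reference, here is a self-contained sketch of the same idea with the standard SKOS and SKOS-XL prefixes declared and only the variables that are actually bound in the SELECT. The variable names ?top, ?topLabel and ?parent are illustrative, and since the dataset is internal this has not been run against it:

```sparql
PREFIX skos:   <http://www.w3.org/2004/02/skos/core#>
PREFIX skosxl: <http://www.w3.org/2008/05/skos-xl#>

SELECT DISTINCT ?subjectPrefLabel ?topLabel WHERE {
  ?subject skosxl:prefLabel/skosxl:literalForm ?subjectPrefLabel .
  FILTER regex(?subjectPrefLabel, "Dreamworks", "i")
  # Follow zero or more skos:broader links from the matched concept
  ?subject skos:broader* ?top .
  # Keep only ancestors that have no broader concept themselves
  FILTER NOT EXISTS { ?top skos:broader ?parent }
  ?top skosxl:prefLabel/skosxl:literalForm ?topLabel .
}
ORDER BY ?subjectPrefLabel
```

Because * also matches a path of length zero, a matched concept that has no broader concept is reported as its own top. Using skos:broader+ instead would require at least one step up.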

select all people in dbpedia associated with particular concept/occupation/profession

I am trying to select all people in Wikipedia who have a particular profession or occupation, or who are known for a particular field (i.e. the value contains the name of the occupation or profession).
SELECT * WHERE {
?p a <http://dbpedia.org/ontology/Person> .
?p <http://dbpedia.org/ontology/influenced> ?influenced.
?p <http://dbpedia.org/ontology/knownFor> ?knownFor.
}
LIMIT 100
This produces roughly what I want, but I also want to include each person's profession or occupation (bound to ?profession or ?occupation). However, the following doesn't return anything:
SELECT * WHERE {
?p a <http://dbpedia.org/ontology/Person> .
?p <http://dbpedia.org/ontology/influenced> ?influenced.
?p <http://dbpedia.org/ontology/knownFor> ?knownFor.
?p <http://dbpedia.org/ontology/profession> ?profession.
}
LIMIT 100
Including ?occupation doesn't work either, as some people don't have a profession entry but do have an occupation. I just want all people (and the list of people they influenced) for a particular profession/occupation, e.g. "psychology" or "architect". Basically, I want a table:
Person | Profession/Occupation | Known For | Influenced Person | Profession/Occupation | Known For
The below is obviously wrong but something like this....
SELECT * WHERE {
?p a <http://dbpedia.org/ontology/Person> .
?s <http://dbpedia.org/ontology/profession> ?profession. (or ?fields/?occupation)
?p <http://dbpedia.org/ontology/influenced> ?influenced.
?p <http://dbpedia.org/ontology/knownFor> ?knownFor.
?s <http://dbpedia.org/ontology/profession> ?profession. (or ?fields/?occupation)
}
LIMIT 100
I don't know the exact reason*, but it seems that people in DBpedia who have dbo:influenced or dbo:knownFor don't have dbo:profession or dbo:occupation.
So, your query won't work, and you will have to find some other way to find the information you need, possibly by looking at dbp:fields or rdf:type.
* My guess is that dbo:influenced and dbo:knownFor are used for scientists, and scientists don't have an occupation or profession in DBpedia.
DBpedia is extracted from Wikipedia infoboxes. If you take one example from the result of the first query and check it in Wikipedia, you will see that no profession or occupation is given in the infobox. For example, look at http://dbpedia.org/page/Jerry_Coyne, whose Wikipedia page is https://en.wikipedia.org/wiki/Jerry_Coyne. On the Wikipedia page, the infobox contains no information about profession or occupation. This is because the page uses https://en.wikipedia.org/wiki/Template:Infobox_scientist as its template, and that template has no occupation/profession field. The https://en.wikipedia.org/wiki/Template:Infobox_person template, on the other hand, has both occupation and known for. Therefore, the query below will work:
SELECT * WHERE {
?p a <http://dbpedia.org/ontology/Person> .
?p <http://dbpedia.org/ontology/knownFor> ?knownFor.
?p <http://dbpedia.org/ontology/occupation> ?occupation.
}
LIMIT 100
But this query only covers data extracted from pages that use https://en.wikipedia.org/wiki/Template:Infobox_person and will not return any page that uses https://en.wikipedia.org/wiki/Template:Infobox_scientist.
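If the goal is to keep people from both kinds of infobox in one result, a possible sketch is to make the job-related properties optional and to accept either dbo:profession or dbo:occupation through a property path alternative. The variable name ?job is illustrative, and whether the endpoint returns useful rows depends on the data described above:

```sparql
PREFIX dbo: <http://dbpedia.org/ontology/>

SELECT ?p ?knownFor ?influenced ?job WHERE {
  ?p a dbo:Person ;
     dbo:knownFor ?knownFor .
  # OPTIONAL keeps the person even when the property is missing
  OPTIONAL { ?p dbo:influenced ?influenced }
  # Either property may supply the value; ?job stays unbound if neither exists
  OPTIONAL { ?p dbo:profession|dbo:occupation ?job }
}
LIMIT 100
```

With this shape, scientists without an occupation or profession still appear, just with an empty ?job column. A FILTER on ?job or ?knownFor could then narrow the result to a particular field.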