I'd like to find entities "similar" to John Harrison in Wikidata. My naive SPARQL query always times out.
SELECT ?similar ?similarLabel (COUNT(?p) AS ?similarity) WHERE {
wd:Q314335 ?p ?o.
?similar ?p ?o.
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
GROUP BY ?similar ?similarLabel
HAVING (?similarity > 5)
ORDER BY DESC(?similarity)
I've tried limiting the number of properties from a sub query, but it still times out.
Is there a more efficient SPARQL query that might succeed? Are there any other Blazegraph extensions (e.g., GAS) that might help here?
Related
I wanted to retrieve subjects that have fewer than three properties. I tried the following query.
SELECT ?s
WHERE {
?s ?p ?o .
}
group by ?s
having (count(?p) < 3)
limit 10
But it takes quite a long time. Is there a more optimized way to do the same thing?
I noticed that when a query does not have an 'order by' statement at the end, the results are ordered in a last-updated-first manner.
SELECT ?subjectLabel ?subject WHERE {
?subject wdt:P31 wd:Q11424.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
} LIMIT 50
What if I wanted to control this and get all items that have been updated during the month of April 2021 only? I would like to limit the results in some way but I can't find the proper way of doing that. Thanks for your help!
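One possible approach, as a rough sketch: Wikidata's RDF export attaches a schema:dateModified timestamp to each item, so you can filter on it. The date bounds below just encode "April 2021" from the question. Note that this value only records the most recent edit, so an item edited in April 2021 and again later will not match, and on a class this large the filter may still hit the timeout.

```sparql
SELECT ?subject ?subjectLabel ?modified WHERE {
  ?subject wdt:P31 wd:Q11424 ;
           schema:dateModified ?modified .   # timestamp of the item's last edit
  # Keep only items whose last edit falls within April 2021.
  FILTER(?modified >= "2021-04-01T00:00:00Z"^^xsd:dateTime &&
         ?modified <  "2021-05-01T00:00:00Z"^^xsd:dateTime)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY DESC(?modified)
LIMIT 50
```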
I accidentally noticed that if you write a SPARQL query like this
SELECT ?id ?idLabel WHERE{
VALUES ?id { wd:Q1 wd:Q1 }.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
the result is kinda strange: four rows instead of expected two.
I get it that it's lame to write duplicates in the VALUES clause, but I'm just wondering why it works like this. Could someone explain please?
It's a bug in the non-standard label service. When running this query, you get two results for the wikibase:label of wd:Q1, and these are then joined to each wd:Q1. So with 2 identical values you get 4 rows, with 3 identical values you get 9, and so on.
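If you want to sidestep the label service here, a minimal sketch is to read the label through plain rdfs:label and filter by language yourself. This assumes you only need the English label. Each duplicate in VALUES should then join with a single label, giving one row per listed value instead of the squared count.

```sparql
SELECT ?id ?idLabel WHERE {
  VALUES ?id { wd:Q1 wd:Q1 }
  # Standard triple pattern instead of SERVICE wikibase:label.
  ?id rdfs:label ?idLabel .
  FILTER(LANG(?idLabel) = "en")
}
```

Adding DISTINCT to the SELECT is another option if you do not care about keeping the duplicates at all.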
Why does this query return Gabriel Heinze and not Cristiano Ronaldo? Both satisfy the criteria.
SELECT DISTINCT ?person ?personLabel WHERE {
?person wdt:P54 wd:Q18656.
?person wdt:P54 wd:Q75729.
?person wdt:P54 wd:Q8682.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Querying the clubs Cristiano Ronaldo is a member of with wdt as the property prefix returns only Real Madrid FC, because that statement has a higher priority rank and wdt only returns the statements with the highest rank.
Unfortunately, there is no direct substitute for wdt that would include lower-priority statements, but you can use a combination of p and ps:
the fixed query to find all clubs Ronaldo has been part of
your query fixed
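The linked queries are not reproduced here, so as a sketch of what the p/ps rewrite of the question's query could look like (not necessarily the exact linked version): p:P54 goes from the item to the statement node and ps:P54 from the statement node to its value, regardless of whether the statement would count as truthy.

```sparql
SELECT DISTINCT ?person ?personLabel WHERE {
  # p:P54/ps:P54 walks item -> statement -> value, so statements that
  # wdt:P54 hides because of rank are matched as well.
  ?person p:P54/ps:P54 wd:Q18656 .
  ?person p:P54/ps:P54 wd:Q75729 .
  ?person p:P54/ps:P54 wd:Q8682 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```

Be aware that this also picks up deprecated statements; if that matters, you would have to filter on the statement's wikibase:rank.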
Thanks for asking: researching, I finally learned what the t stands for in wdt: truthy \o/
I am trying to select all people in Wikipedia who have a particular profession or occupation, or who are known for a particular field (i.e. the value contains the string of the occupation or profession).
SELECT * WHERE {
?p a <http://dbpedia.org/ontology/Person> .
?p <http://dbpedia.org/ontology/influenced> ?influenced.
?p <http://dbpedia.org/ontology/knownFor> ?knownFor.
}
LIMIT 100
This produces roughly what I want, but I also want to include their profession or occupation (as ?profession or ?occupation). The following doesn't return anything:
SELECT * WHERE {
?p a <http://dbpedia.org/ontology/Person> .
?p <http://dbpedia.org/ontology/influenced> ?influenced.
?p <http://dbpedia.org/ontology/knownFor> ?knownFor.
?p <http://dbpedia.org/ontology/profession> ?profession.
}
LIMIT 100
Including ?occupation doesn't work either, as some people don't have a profession entry but do have an occupation. I just want all people (and the list of people they influenced) for a particular profession/occupation, e.g. "psychology" or "architect". Basically I want a table:
Person | Profession/Occupation | Known For | Influenced Person | Profession/Occupation | Known For
The below is obviously wrong but something like this....
SELECT * WHERE {
?p a <http://dbpedia.org/ontology/Person> .
?s <http://dbpedia.org/ontology/profession> ?profession. (or ?fields/?occupation)
?p <http://dbpedia.org/ontology/influenced> ?influenced.
?p <http://dbpedia.org/ontology/knownFor> ?knownFor.
?s <http://dbpedia.org/ontology/profession> ?profession. (or ?fields/?occupation)
}
LIMIT 100
I don't know the exact reason*, but it seems that people in DBpedia who have dbo:influenced or dbo:knownFor don't have dbo:profession or dbo:occupation.
So, your query won't work, and you will have to find some other way to find the information you need, possibly by looking at dbp:fields or rdf:type.
* My guess is that dbo:influenced and dbo:knownFor are used for scientists, and scientists don't have an occupation or profession in DBpedia.
DBpedia is an extraction of Wikipedia infoboxes. If you take one example from the result of the first query and check it in Wikipedia, you will see that no profession or occupation is used in the infobox. For example, take http://dbpedia.org/page/Jerry_Coyne, whose Wikipedia page is https://en.wikipedia.org/wiki/Jerry_Coyne. The infobox on that page has no information about profession or occupation. This is because it uses https://en.wikipedia.org/wiki/Template:Infobox_scientist as its template, and that template has no occupation/profession field. The https://en.wikipedia.org/wiki/Template:Infobox_person template, on the other hand, has both occupation and known for. Therefore, the query below will work:
SELECT * WHERE {
?p a <http://dbpedia.org/ontology/Person> .
?p <http://dbpedia.org/ontology/knownFor> ?knownFor.
?p <http://dbpedia.org/ontology/occupation> ?occupation.
}
LIMIT 100
But this one only queries data extracted from pages that use https://en.wikipedia.org/wiki/Template:Infobox_person and will not return any page that uses https://en.wikipedia.org/wiki/Template:Infobox_scientist.
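If you would rather not lose the Infobox_scientist pages, one option is to make the role optional and accept either property. The following is only a sketch: it reuses dbo:occupation, dbo:profession and dbo:influenced from the question, ?role is just an illustrative variable name, and whether the role column gets filled still depends on what the infobox extraction produced for each person.

```sparql
PREFIX dbo: <http://dbpedia.org/ontology/>

SELECT ?p ?role ?knownFor ?influenced WHERE {
  ?p a dbo:Person ;
     dbo:knownFor ?knownFor .
  # Accept either property; people with neither are still returned,
  # with ?role left unbound.
  OPTIONAL { ?p dbo:occupation|dbo:profession ?role . }
  OPTIONAL { ?p dbo:influenced ?influenced . }
}
LIMIT 100
```

To restrict to one field such as "architect", you could add a FILTER on ?role, but then rows without a role drop out again.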