Hierarchical query in SPARQL to fetch all children of the entity organisation (wd:Q43229)

I would like to write the equivalent of a recursive query in SPARQL that returns all types of organisation on Wikidata that roll up to wd:Q43229 (organisation).
For example, for the entity Q4926947, the output should look like this:
entity|entityLabel|path
Q4926947|Blitz Arcade| Q210167->Q112042224->Q1058914->Q4830453->Q43229
Q4926947|Blitz Arcade| Q210167->Q112042224->Q783794->Q43229
Q4926947|Blitz Arcade| Q210167->Q112042224->Q18388277->Q6881511->Q4830453->Q43229
Q4926947|Blitz Arcade| Q210167->Q112042224->Q18388277->Q6881511->Q362482->Q679206->Q43229
There are several paths that lead to Q43229.
In my query, I would like to specify only the root (Q43229) and have it return all the leaf nodes that link to Q43229.
This is what I've got so far, but it is far from the desired result. Any help is appreciated.
SELECT ?item ?itemLabel (group_concat(?linkTo; separator=",") as ?org_path) {
wd:Q43229 ^wdt:P279* ?item
OPTIONAL { ?item ^wdt:P279* ?linkTo }
SERVICE wikibase:label {bd:serviceParam wikibase:language "en" }
} group by ?item ?itemLabel
limit 5
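SPARQL 1.1 property paths such as wdt:P279* only test whether two nodes are connected; they do not bind the intermediate nodes, so the path itself cannot be read out of them directly. One possible workaround is to fetch the individual subclass edges and enumerate the paths on the client. The following is a minimal Python sketch of that idea, not a definitive solution: the SUBCLASS_OF dictionary is a hypothetical in-memory copy of the edges implied by the example paths above (not verified against live Wikidata), and paths_to_root is an illustrative helper name.

```python
from typing import Dict, List

# Hypothetical child -> parents map for "subclass of" (P279), reconstructed
# from the four example paths in the question (assumption, not live data).
SUBCLASS_OF: Dict[str, List[str]] = {
    "Q210167": ["Q112042224"],
    "Q112042224": ["Q1058914", "Q783794", "Q18388277"],
    "Q1058914": ["Q4830453"],
    "Q4830453": ["Q43229"],
    "Q783794": ["Q43229"],
    "Q18388277": ["Q6881511"],
    "Q6881511": ["Q4830453", "Q362482"],
    "Q362482": ["Q679206"],
    "Q679206": ["Q43229"],
}


def paths_to_root(start: str, root: str, edges: Dict[str, List[str]]) -> List[str]:
    """Return every acyclic path from start up to root, formatted as 'A->B->C'."""
    found: List[str] = []

    def walk(node: str, trail: List[str]) -> None:
        if node == root:
            found.append("->".join(trail))
            return
        for parent in edges.get(node, []):
            if parent not in trail:  # guard against cycles in the class graph
                walk(parent, trail + [parent])

    walk(start, [start])
    return found


for path in paths_to_root("Q210167", "Q43229", SUBCLASS_OF):
    print(path)
```

With the edges above, this prints the four paths listed in the desired output. On real data the edge map would have to be filled from a query that returns plain (?child, ?parent) pairs for wdt:P279.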

Related

Limiting diversity of attributes in Wikidata / SPARQL Query

I want to create a dataset of 100 paintings querying from Wikidata.
With my current query, for each painting I retrieve the artist (creator), the movement, the year it was painted (inception) and so on.
I would like to limit the number of distinct movements in my dataset, and likewise the number of artists (i.e. I would have only paintings from n random movements, in unspecified proportions).
So for example if I limit to 2 movements I would have only Pop-Art and Cubist paintings or Neoclassic and Impressionist, etc.
SELECT ?itemLabel ?pic ?movementLabel ?creatorLabel ?inception
WHERE
{
?item wdt:P31 wd:Q3305213 .
?item wdt:P135 ?movement .
?item wdt:P170 ?creator .
?item wdt:P18 ?pic .
?item wdt:P571 ?inception
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
}
LIMIT 100

Wikidata Query misses results

I am trying to query all the countries from Wikidata:
SELECT ?item WHERE { ?item wdt:P31 wd:Q6256. }
Unfortunately, the results are missing for example Switzerland (Q39):
https://www.wikidata.org/wiki/Q39
Looking at the Switzerland data, it has the triple: instance of (P31) country (Q6256).
Could you help me understand why Q39 is not present in the results then?
Thanks!
In Wikidata, facts are stored as statements, and each statement can carry qualifiers and a rank.
In the case of Switzerland being a country, that statement has rank 'normal' rather than 'preferred'.
The preferred-rank classification for Switzerland is sovereign state (wd:Q3624078), and Wikidata generates a direct wdt:P31 triple only for the best-ranked statements of a property: the preferred-rank ones if any exist, otherwise the normal-rank ones.
I think the ranks are set this way because 'country' is a more generic concept, e.g. Wales is a country but not a sovereign state.
Fear not, however, as this query:
SELECT DISTINCT ?item
WHERE {
?item p:P31/ps:P31 wd:Q6256.
}
also returns wd:Q39. This query navigates from Switzerland to country via the statement node.
That is, the data contains:
wd:Q39 p:P31 wds:Q39-fbe1ac75-4a8a-93c4-6009-81055d79f9cb .
wds:Q39-fbe1ac75-4a8a-93c4-6009-81055d79f9cb ps:P31 wd:Q6256 .
but not:
wd:Q39 wdt:P31 wd:Q6256 .
Try this:
SELECT DISTINCT *
WHERE {
?item p:P31 ?y .
?y ps:P31 wd:Q6256 ;
?p ?o .
VALUES ?item {wd:Q39}
}
to see for yourself.
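The rank rule can be mimicked in a few lines of Python. This is only a sketch of the behaviour described above, under the assumption that a direct wdt: triple is produced for the preferred-rank statements when any exist and for the normal-rank ones otherwise; the function names and the sample data are illustrative, not part of any Wikidata API.

```python
from typing import List, Tuple

# A statement is modelled as (value, rank). The sample data mirrors what the
# answer describes for Switzerland (Q39); it is not fetched from Wikidata.
Statement = Tuple[str, str]


def truthy_values(statements: List[Statement]) -> List[str]:
    """Values that would get a direct wdt: triple (best non-deprecated rank)."""
    preferred = [value for value, rank in statements if rank == "preferred"]
    if preferred:
        return preferred
    return [value for value, rank in statements if rank == "normal"]


def all_values(statements: List[Statement]) -> List[str]:
    """Values reachable through p:/ps:, regardless of rank."""
    return [value for value, _ in statements]


switzerland_p31 = [("Q3624078", "preferred"), ("Q6256", "normal")]

print(truthy_values(switzerland_p31))  # ['Q3624078']
print(all_values(switzerland_p31))     # ['Q3624078', 'Q6256']
```

Q6256 appears only in the second list, which matches the observation that the wdt:P31 pattern misses Switzerland while p:P31/ps:P31 finds it.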

SPARQL query returning much more rows than expected

I'm using the following SPARQL query to get a list of all US presidents together with the start and end dates of their presidency:
SELECT ?person $personLabel $start $end
WHERE {
?person wdt:P39 wd:Q11696.
?person p:P39 ?statement.
?position_held_statement ps:P39 wd:Q11696.
?statement pq:P580 ?start.
?statement pq:P582 ?end
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY DESC($start)
Why does it return so many rows?
EDIT: I know I can use SELECT DISTINCT to get only distinct results, but I want to learn how I can find out the reason for the duplicates. Moreover, there are entries which state that Barack Obama's presidency lasted from January 2009 to January 2017, while others state his two terms separately.
You have ?position_held_statement there. That should be ?statement.
When you unexpectedly have a lot of results in SPARQL, it’s usually because of mismatched variable names turning what should be a join into a cartesian product.
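To see why a mismatched variable name inflates the row count, here is a small Python sketch that imitates the two triple patterns with toy bindings. The names and statement IDs are made up for illustration; only the shape of the join matters.

```python
from itertools import product

# Toy bindings (hypothetical data).
# Pattern 1: ?person p:P39 ?statement
person_statements = [("Obama", "s1"), ("Bush", "s2")]
# Pattern 2: ?<variable> ps:P39 wd:Q11696
president_statements = ["s1", "s2"]

# Different variable names: nothing links the two patterns, so every
# row of pattern 1 is paired with every row of pattern 2.
unjoined = [
    (person, stmt)
    for (person, stmt), other in product(person_statements, president_statements)
]

# Same variable name: rows are kept only when the statement IDs agree.
joined = [
    (person, stmt)
    for (person, stmt), other in product(person_statements, president_statements)
    if stmt == other
]

print(len(unjoined), len(joined))  # 4 2
```

With n matching statements on each side, the unjoined version grows as n × n while the joined version stays at n, which is the pattern to look for when a result set is unexpectedly large.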

wikidata query missing out countries in Europe

I am using the following query against Wikidata:
SELECT ?country ?countryLabel
WHERE
{
?country wdt:P30 wd:Q46;
wdt:P31 wd:Q6256.
SERVICE wikibase:label { bd:serviceParam wikibase:language
"[AUTO_LANGUAGE],en". }
}
where P30 is continent, Q46 is Europe, P31 is instance of, and Q6256 is country:
https://query.wikidata.org/#SELECT%20%3Fcountry%20%3FcountryLabel%0A%20%20%20%20%20%20WHERE%0A%20%20%20%20%20%20%7B%0A%20%20%20%20%20%20%20%20%3Fcountry%20%20%20wdt%3AP30%20wd%3AQ46%3B%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20wdt%3AP31%20wd%3AQ6256.%0A%20%20%20%20%20%20%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%0A%20%20%20%20%20%20%20%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%7D%0A%20%20%20%20%20%20%7D
Yet this query returns only 15 European countries. For instance, Sweden is not returned, even though it appears to match the query at https://www.wikidata.org/wiki/Q34
So although the query seems correct, it misses many countries. Any ideas on how to resolve this?
Comparing the entries for Germany and Sweden (which do not show up) with Norway (which does), the difference I could find is that Germany and Sweden have a preferred rank for sovereign state and only a normal rank for country. This could be why the wdt: pattern in the WHERE clause matches only the preferred-rank statement when one exists and skips the rest. If so, and I suspect it is, I wonder whether there is a way to search through all statements with either preferred or normal rank.
I get a better selection of countries by bypassing the truthy (wdt:) triples and going through statement nodes instead, which return all statements, including those with normal rank.
SELECT DISTINCT ?country ?countryLabel
WHERE
{
?country wdt:P30 wd:Q46.
?country p:P31 ?country_instance_of_statement .
?country_instance_of_statement ps:P31 wd:Q6256 .
SERVICE wikibase:label { bd:serviceParam wikibase:language
"[AUTO_LANGUAGE],en".
}
filter not exists{?country p:P31/ps:P31 wd:Q3024240 }
}
order by ?countryLabel
A few extra countries still show up, such as the German Empire, but I think that's a different problem to fix.
https://query.wikidata.org/#SELECT%20distinct%20%3Fcountry%20%3Fcountry_instance_of_statement%20%3FcountryLabel%0A%20%20%20%20%20%20WHERE%0A%20%20%20%20%20%20%7B%0A%20%20%20%20%20%20%20%20%3Fcountry%20%20%20wdt%3AP30%20wd%3AQ46.%0A%20%20%20%20%20%20%20%20%3Fcountry%20p%3AP31%20%3Fcountry_instance_of_statement%20.%0A%20%20%20%20%20%20%20%20%3Fcountry_instance_of_statement%20ps%3AP31%20wd%3AQ6256%20.%0A%20%20%20%20%20%20%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%0A%20%20%20%20%20%20%20%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%0A%20%20%20%20%20%20%20%20%7D%0A%20%20%20%20%20%20%20%20filter%20not%20exists%7B%3Fcountry%20p%3AP31%2Fps%3AP31%20wd%3AQ3024240%20%7D%0A%20%20%20%20%20%20%7D%20%0A
Note that ?country_instance_of_statement captures all the statements irrespective of rank. Once I have those, I use 'ps:P31 wd:Q6256' to pull out the ones that have country (wd:Q6256) as the object.
I have added in suggestions from #AKSW above.
For those who want another approach, using the end time qualifier on the country statement, this is the SPARQL:
SELECT distinct ?country ?countryLabel
WHERE
{
?country wdt:P30 wd:Q46.
?country p:P31 ?country_instance_of_statement .
?country_instance_of_statement ps:P31 wd:Q6256 .
filter not exists {?country_instance_of_statement pq:P582 ?endTime }
SERVICE wikibase:label { bd:serviceParam wikibase:language
"[AUTO_LANGUAGE],en".
}
}
order by ?countryLabel
https://query.wikidata.org/#SELECT%20distinct%20%3Fcountry%20%3Fcountry_instance_of_statement%20%3FcountryLabel%0A%20%20%20%20%20%20WHERE%0A%20%20%20%20%20%20%7B%0A%20%20%20%20%20%20%20%20%3Fcountry%20%20%20wdt%3AP30%20wd%3AQ46.%0A%20%20%20%20%20%20%20%20%3Fcountry%20p%3AP31%20%3Fcountry_instance_of_statement%20.%0A%20%20%20%20%20%20%20%20%3Fcountry_instance_of_statement%20ps%3AP31%20wd%3AQ6256%20.%0A%20%20%20%20%20%20%20%20filter%20not%20exists%20%7B%3Fcountry_instance_of_statement%20pq%3AP582%20%3FendTime%20%7D%0A%20%20%20%20%20%20%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%0A%20%20%20%20%20%20%20%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%0A%20%20%20%20%20%20%20%20%7D%0A%20%20%20%20%20%20%7D%20%0A

Wikidata SPARQL - Countries and their (still existing) neighbours

I want to query the neighbours to a country with SPARQL from Wikidata like this:
SELECT ?country ?countryLabel WHERE {
?country wdt:P47 wd:Q183 .
FILTER NOT EXISTS{ ?country wdt:P576 ?date } # don't count dissolved country - at least filters German Democratic Republic
SERVICE wikibase:label {
bd:serviceParam wikibase:language "en" .
}
}
My issue is that, e.g. in this example for the neighbours of Germany, entities that no longer exist are still shown, such as:
Kingdom of Denmark or
Saarland.
Already tried
I could already reduce the number with the FILTER statement.
Question
How can I change the query so that it returns only the 9 countries?
(Dividing the result into land borders and sea borders would also be great.)
Alternative
Filtering at this API would also be fine for me: https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q35
a database or lists or prepared HashMaps whatever with all countries of the world with neighbours
You could check entities of type wd:Q133346 ('border') or wd:Q12413618 ('international border'):
SELECT ?border ?borderLabel ?country1Label ?country2Label ?isLandBorder ?isMaritimeBorder ?constraint {
VALUES (?country1) {(wd:Q183)}
?border wdt:P31 wd:Q12413618 ;
wdt:P17 ?country1 , ?country2 .
FILTER (?country1 != ?country2)
BIND (EXISTS {?border wdt:P31 wd:Q15104814} AS ?isLandBorder)
BIND (EXISTS {?border wdt:P31 wd:Q3089219} AS ?isMaritimeBorder)
BIND ((?isLandBorder || ?isMaritimeBorder) AS ?constraint)
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
} ORDER BY ?country1Label
In some sense, records are duplicated: for the Afghanistan–Uzbekistan border, the list contains both (?country1=Afghanistan, ?country2=Uzbekistan) and (?country1=Uzbekistan, ?country2=Afghanistan).
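If the mirrored rows get in the way, one option is to collapse them after fetching the results. The Python sketch below treats each pair as unordered; the sample rows and the helper name unique_borders are illustrative assumptions, not output of the query above.

```python
from typing import List, Tuple

# Sample result rows (hypothetical): each border appears once per direction.
rows: List[Tuple[str, str]] = [
    ("Afghanistan", "Uzbekistan"),
    ("Uzbekistan", "Afghanistan"),
    ("Germany", "France"),
    ("France", "Germany"),
]


def unique_borders(pairs: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Keep one row per unordered country pair, in first-seen order."""
    seen = set()
    result = []
    for a, b in pairs:
        key = frozenset((a, b))  # same key for (a, b) and (b, a)
        if key not in seen:
            seen.add(key)
            result.append(tuple(sorted((a, b))))
    return result


print(unique_borders(rows))
# [('Afghanistan', 'Uzbekistan'), ('France', 'Germany')]
```

Each surviving pair is sorted alphabetically so that the output does not depend on which direction happened to come first.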
Regarding "a database or lists or prepared HashMaps whatever with all countries of the world with neighbours": you could ask on https://opendata.stackexchange.com.