Limiting diversity of attributes in Wikidata / SPARQL Query

I want to create a dataset of 100 paintings by querying Wikidata.
With my current query, for each painting I retrieve the artist (creator), the movement, the year it was painted (inception) and so on.
I would like to limit the number of different movements in my dataset, and likewise the number of artists (i.e. I will have only paintings from n random movements, in unspecified proportions).
So, for example, if I limit it to 2 movements, I would have only Pop Art and Cubist paintings, or only Neoclassical and Impressionist ones, etc.
SELECT ?itemLabel ?pic ?movementLabel ?creatorLabel ?inception
WHERE
{
?item wdt:P31 wd:Q3305213 .
?item wdt:P135 ?movement .
?item wdt:P170 ?creator .
?item wdt:P18 ?pic .
?item wdt:P571 ?inception
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
}
LIMIT 100
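One possible approach, as a minimal sketch that has not been verified against the live endpoint: pick the movements first in a subquery with its own LIMIT, then join the paintings against that short list. Note that LIMIT returns an arbitrary set of movements, not a truly random one, and the helper variable ?painting is only introduced here for illustration.

```sparql
SELECT ?itemLabel ?pic ?movementLabel ?creatorLabel ?inception
WHERE {
  # Inner query: pick 2 movements that occur on at least one painting.
  # LIMIT yields an arbitrary pair, not a random one.
  {
    SELECT DISTINCT ?movement WHERE {
      ?painting wdt:P31 wd:Q3305213 ;
                wdt:P135 ?movement .
    }
    LIMIT 2
  }
  # Outer query: only paintings whose movement is one of the 2 picked above.
  ?item wdt:P31 wd:Q3305213 ;
        wdt:P135 ?movement ;
        wdt:P170 ?creator ;
        wdt:P18 ?pic ;
        wdt:P571 ?inception .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
}
LIMIT 100
```

The same pattern should work for artists by adding a second subquery over wdt:P170. Limiting both at once may return few or no rows if the chosen artists did not paint in the chosen movements.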

Related

Select footballers from two or more countries with SPARQL wikidata query

I'm trying to get the date of birth of all the footballers on Wikipedia, per country. So far I've managed to get the ones for one country, but I would like to get the ones for several countries at the same time, something like:
SELECT ?person ?dateOfBirth
WHERE {
?person wdt:P1532 wd:Q16 OR ?person wdt:P1532 wd:Q414;
wdt:P569 ?dateOfBirth.
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
Q16 is Canada and Q414 is Argentina.
Thanks!
To avoid limiting your query to a single country, replace wd:Q16 with a variable, i.e. ?person wdt:P1532 ?country.
Then you need to restrict your query to football players.
This can be done with the following line
?person wdt:P106 wd:Q937857.
You probably also want the names of the players, not only the ID, so add ?personLabel after SELECT.
Combined, the query reads
SELECT ?person ?personLabel ?country ?countryLabel ?dateOfBirth
WHERE {
?person wdt:P106 wd:Q937857;
wdt:P1532 ?country;
wdt:P569 ?dateOfBirth.
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
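If only a fixed set of countries is wanted rather than all of them, a VALUES clause is one way to express the "OR" from the question. This is a sketch under the assumption that wd:Q16 is Canada and wd:Q414 is Argentina; swap in whichever items you need.

```sparql
SELECT ?person ?personLabel ?country ?countryLabel ?dateOfBirth
WHERE {
  # Restrict ?country to an explicit list (assumed: Q16 = Canada, Q414 = Argentina).
  VALUES ?country { wd:Q16 wd:Q414 }
  ?person wdt:P106 wd:Q937857 ;
          wdt:P1532 ?country ;
          wdt:P569 ?dateOfBirth .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
```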

Simple Wikidata SPARQL query for cities, countries, and country property times out

I want to fetch a sample of big cities (Q1549591) along with their countries and the countries' ISO 3166-1 alpha-3 codes (P298). This simple query works as expected without the ISO codes, but when I add that part the query times out. I don't understand what the problem is.
SELECT ?cityLabel ?countryLabel ?iso
WHERE {
?city wdt:P31 wd:Q1549591 .
?city wdt:P17 ?country .
?country wdt:P298 ?iso .
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 10
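One thing worth trying, as a sketch that has not been verified against the live endpoint: move the triple patterns and the LIMIT into a subquery, so that the label service only has to label the 10 rows that survive. The assumption here is that the timeout comes from how the label service interacts with the rest of the query; wdt:P298 is used for the alpha-3 code named in the question text.

```sparql
SELECT ?cityLabel ?countryLabel ?iso
WHERE {
  # Inner query: find 10 matching rows first, without any labels.
  {
    SELECT ?city ?country ?iso WHERE {
      ?city wdt:P31 wd:Q1549591 ;
            wdt:P17 ?country .
      ?country wdt:P298 ?iso .
    }
    LIMIT 10
  }
  # Labels are resolved only for the rows returned by the subquery.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```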

How can I use SPARQL to "join table A to table B" and take only those rows in B with a MAX value?

The following SPARQL query asks Wikidata for all European countries (sovereign states, Q3624078) and loads, for each country, the last few known population numbers (P1082, population):
SELECT DISTINCT ?item ?itemLabel ?population ?popDate
WHERE {
?item wdt:P31 wd:Q3624078. #select only sovereign states
?item wdt:P706/wdt:P361*|wdt:P361*|wdt:P30 wd:Q46. #select items which are geographically in Europe
?item p:P31 ?statement.
?statement ps:P31 wd:Q3624078.
FILTER NOT EXISTS { ?statement pq:P582 ?end. } #filter out items which are not sovereign states anymore
FILTER NOT EXISTS { ?item wdt:P31 wd:Q3024240. } #filter out historical countries
OPTIONAL {
?item p:P1082 ?popStatement.
?popStatement ps:P1082 ?population;
pq:P585 ?popDate.
FILTER ( YEAR(?popDate) >= 2014 )
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ASC(?itemLabel)
How can I limit the population data to the most recent population number?
I tried limiting it to a specific year (FILTER (YEAR(?popDate) = 2018)), but this does not work because some countries don't have a population estimate for 2018.
This is similar to an SQL query like SELECT * FROM country c INNER JOIN population p ON p.CountryId = c.Id. In SQL, you would need to create a subquery with a MAX statement to find the most recent population data by country. How can I achieve the same in SPARQL?
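The SQL idea carries over fairly directly: a SPARQL subquery can compute MAX per country with GROUP BY, and the outer query then joins back on that date. The following is a minimal sketch, with the Europe test simplified to wdt:P30 wd:Q46 and the other filters from the query above left out for brevity; ?date is a helper variable introduced only for this example.

```sparql
SELECT ?item ?itemLabel ?population ?popDate
WHERE {
  # Inner query: the most recent population date per country.
  {
    SELECT ?item (MAX(?date) AS ?popDate) WHERE {
      ?item wdt:P31 wd:Q3624078 ;
            wdt:P30 wd:Q46 ;
            p:P1082/pq:P585 ?date .
    }
    GROUP BY ?item
  }
  # Outer query: keep only the population statement carrying that date.
  ?item p:P1082 ?popStatement .
  ?popStatement ps:P1082 ?population ;
                pq:P585 ?popDate .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?itemLabel
```

If a country has two population statements with the same latest date, both rows are returned. Countries without any dated population statement drop out, unlike with the OPTIONAL in the original query.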

wikidata query missing out countries in Europe

I am using the following query against Wikidata:
SELECT ?country ?countryLabel
WHERE
{
?country wdt:P30 wd:Q46;
wdt:P31 wd:Q6256.
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
Here P30 is continent, Q46 is Europe, P31 is instance of, and Q6256 is country.
https://query.wikidata.org/#SELECT%20%3Fcountry%20%3FcountryLabel%0A%20%20%20%20%20%20WHERE%0A%20%20%20%20%20%20%7B%0A%20%20%20%20%20%20%20%20%3Fcountry%20%20%20wdt%3AP30%20wd%3AQ46%3B%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20wdt%3AP31%20wd%3AQ6256.%0A%20%20%20%20%20%20%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%0A%20%20%20%20%20%20%20%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%7D%0A%20%20%20%20%20%20%7D
Yet this query returns only 15 European countries. For instance, Sweden is not returned, even though it appears to match the query (see https://www.wikidata.org/wiki/Q34).
The query seems correct, yet it misses many countries. Any ideas on how to resolve this?
Comparing the entries for Germany and Sweden (which do not show up) with the one for Norway (which does), the difference I could find is that Germany and Sweden have a preferred-rank "instance of" statement for sovereign state and only a normal-rank one for country. That would explain the result if the truthy wdt: triples contain only the preferred-rank statements whenever one exists and skip the rest. If that is the case, and I suspect it is, is there a way to make the query consider all statements, whether they have preferred or normal rank?
I get a better selection of countries when I bypass the truthy triples and query the statement nodes instead. These return all statements, including those with normal rank.
SELECT DISTINCT ?country ?countryLabel
WHERE
{
?country wdt:P30 wd:Q46.
?country p:P31 ?country_instance_of_statement .
?country_instance_of_statement ps:P31 wd:Q6256 .
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
filter not exists{?country p:P31/ps:P31 wd:Q3024240 }
}
order by ?countryLabel
I still have a few extra countries showing up, such as the German Empire, but I think that's a different problem to fix.
https://query.wikidata.org/#SELECT%20distinct%20%3Fcountry%20%3Fcountry_instance_of_statement%20%3FcountryLabel%0A%20%20%20%20%20%20WHERE%0A%20%20%20%20%20%20%7B%0A%20%20%20%20%20%20%20%20%3Fcountry%20%20%20wdt%3AP30%20wd%3AQ46.%0A%20%20%20%20%20%20%20%20%3Fcountry%20p%3AP31%20%3Fcountry_instance_of_statement%20.%0A%20%20%20%20%20%20%20%20%3Fcountry_instance_of_statement%20ps%3AP31%20wd%3AQ6256%20.%0A%20%20%20%20%20%20%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%0A%20%20%20%20%20%20%20%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%0A%20%20%20%20%20%20%20%20%7D%0A%20%20%20%20%20%20%20%20filter%20not%20exists%7B%3Fcountry%20p%3AP31%2Fps%3AP31%20wd%3AQ3024240%20%7D%0A%20%20%20%20%20%20%7D%20%0A
Note that ?country_instance_of_statement captures all the statements irrespective of rank. Once I have those, I use 'ps:P31 wd:Q6256' to pull out the ones that have country (wd:Q6256) as the object.
I have added the suggestions from @AKSW above.
For those who want another approach, using the end time qualifier on the country statement, this is the SPARQL:
SELECT distinct ?country ?countryLabel
WHERE
{
?country wdt:P30 wd:Q46.
?country p:P31 ?country_instance_of_statement .
?country_instance_of_statement ps:P31 wd:Q6256 .
filter not exists {?country_instance_of_statement pq:P582 ?endTime }
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
order by ?countryLabel
https://query.wikidata.org/#SELECT%20distinct%20%3Fcountry%20%3Fcountry_instance_of_statement%20%3FcountryLabel%0A%20%20%20%20%20%20WHERE%0A%20%20%20%20%20%20%7B%0A%20%20%20%20%20%20%20%20%3Fcountry%20%20%20wdt%3AP30%20wd%3AQ46.%0A%20%20%20%20%20%20%20%20%3Fcountry%20p%3AP31%20%3Fcountry_instance_of_statement%20.%0A%20%20%20%20%20%20%20%20%3Fcountry_instance_of_statement%20ps%3AP31%20wd%3AQ6256%20.%0A%20%20%20%20%20%20%20%20filter%20not%20exists%20%7B%3Fcountry_instance_of_statement%20pq%3AP582%20%3FendTime%20%7D%0A%20%20%20%20%20%20%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%0A%20%20%20%20%20%20%20%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%0A%20%20%20%20%20%20%20%20%7D%0A%20%20%20%20%20%20%7D%20%0A

Wikidata SPARQL - Countries and their (still existing) neighbours

I want to query the neighbours of a country from Wikidata with SPARQL, like this:
SELECT ?country ?countryLabel WHERE {
?country wdt:P47 wd:Q183 .
FILTER NOT EXISTS{ ?country wdt:P576 ?date } # don't count dissolved country - at least filters German Democratic Republic
SERVICE wikibase:label {
bd:serviceParam wikibase:language "en" .
}
}
My issue is that, e.g. in this example for the neighbours of Germany, countries that no longer exist are still shown, such as:
Kingdom of Denmark or
Saarland.
Already tried
I could already reduce the number with the FILTER statement.
Question
How can I change the query so that it returns only the 9 current neighbouring countries?
(Splitting the result into land borders and sea borders would also be great.)
Alternative
Filtering the output of this API would also be fine for me: https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q35
A database, list, or prepared HashMap of all countries of the world with their neighbours would work too.
You could check entities of type wd:Q133346 ('border') or wd:Q12413618 ('international border'):
SELECT ?border ?borderLabel ?country1Label ?country2Label ?isLandBorder ?isMaritimeBorder ?constraint {
VALUES (?country1) {(wd:Q183)}
?border wdt:P31 wd:Q12413618 ;
wdt:P17 ?country1 , ?country2 .
FILTER (?country1 != ?country2)
BIND (EXISTS {?border wdt:P31 wd:Q15104814} AS ?isLandBorder)
BIND (EXISTS {?border wdt:P31 wd:Q3089219} AS ?isMaritimeBorder)
BIND ((?isLandBorder || ?isMaritimeBorder) AS ?constraint)
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
} ORDER BY ?country1Label
In some sense, records are duplicated: for the Afghanistan–Uzbekistan border, the list contains both (?country1=Afghanistan,?country2=Uzbekistan) and (?country1=Uzbekistan,?country2=Afghanistan).
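When the query is run for all countries (that is, without the VALUES line that pins ?country1), one way to keep a single row per border is to impose an order on the pair. This is a sketch, not verified against the live endpoint; comparing the IRIs as strings is just an arbitrary but stable tie-breaker.

```sparql
SELECT ?border ?borderLabel ?country1Label ?country2Label {
  ?border wdt:P31 wd:Q12413618 ;
          wdt:P17 ?country1 , ?country2 .
  # Keep only one ordering of each pair; this also excludes ?country1 = ?country2.
  FILTER (STR(?country1) < STR(?country2))
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
} ORDER BY ?country1Label
```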
As for "a database or lists or prepared HashMaps whatever with all countries of the world with neighbours": you could ask on https://opendata.stackexchange.com.