Multiple populations for a city in Wikidata (SPARQL)

I am querying city populations and would like to calculate each city's population as a percentage of its country's population. However, each city has multiple population values, one per year, so I would like to use only the most recent one.
# Cities with their population and their share of the country's population
SELECT ?country ?countryLabel ?countryPopulation ?city ?cityLabel ?cityPopulation (?cityPopulation / ?countryPopulation AS ?ratio) #(year(?date) as ?dateyear)
WHERE
{
?city wdt:P31/wdt:P279* wd:Q515 . # find instances of subclasses of city
?city wdt:P17 ?country . # Also find the country of the city
?city wdt:P1082 ?cityPopulation . # city population
?country wdt:P1082 ?countryPopulation .
SERVICE wikibase:label {
bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
}
}
#ORDER BY DESC(?count)
LIMIT 100
For example, I get multiple rows with different population values for wd:Q9954.
UPDATE
Based on my understanding of a similar question, I tried to update my query, but I am still not confident that it is formed correctly. I would also like to calculate city_population / country_population, which still does not seem to work after filtering. I would appreciate your comments.
SELECT ?country ?countryLabel ?countryPopulation ?city ?cityLabel ?cityPopulation ?ciy_date
WHERE
{
#?city wdt:P31/wdt:P279* wd:Q515 .
#?city wdt:P1082 ?cityPopulation .
#?country wdt:P1082 ?countryPopulation .
?city wdt:P31/wdt:P279* wd:Q515 .
?city p:P1082 ?populationStatement .
?populationStatement ps:P1082 ?cityPopulation;
pq:P585 ?ciy_date .
?city wdt:P17 ?country .
?country wdt:P1082 ?countryPopulation .
FILTER NOT EXISTS {
?country p:P1081/pq:P585 ?hdi_country_date_ .
?city p:P1082/pq:P585 ?hdi_city_date_ .
FILTER (?hdi_city_date_ > ?ciy_date)
FILTER (?hdi_country_date_ > ?hdi_country_date_)
}
SERVICE wikibase:label {
bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
}
}
#ORDER BY DESC(?count)
LIMIT 100
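A possible way to keep only the latest population, sketched under some assumptions: every population statement of interest carries a point in time (P585) qualifier, and the country population is filtered the same way as the city population. The variable names ?cityStatement, ?countryStatement, ?cityDate, ?countryDate, ?laterCityDate and ?laterCountryDate are made up for this sketch. Note that the attempt above looks at P1081 rather than P1082 for the country and compares ?hdi_country_date_ with itself, so its second FILTER can never be true and the NOT EXISTS block removes nothing.

```sparql
SELECT ?country ?countryLabel ?countryPopulation ?city ?cityLabel ?cityPopulation ?cityDate
       (?cityPopulation / ?countryPopulation AS ?ratio)
WHERE {
  ?city wdt:P31/wdt:P279* wd:Q515 ;   # instance of (a subclass of) city
        wdt:P17 ?country ;            # country of the city
        p:P1082 ?cityStatement .      # full population statements, not just values
  ?cityStatement ps:P1082 ?cityPopulation ;
                 pq:P585 ?cityDate .
  # Keep a city statement only if the city has no population statement with a later date.
  FILTER NOT EXISTS {
    ?city p:P1082/pq:P585 ?laterCityDate .
    FILTER (?laterCityDate > ?cityDate)
  }
  ?country p:P1082 ?countryStatement .
  ?countryStatement ps:P1082 ?countryPopulation ;
                    pq:P585 ?countryDate .
  # Same idea for the country population.
  FILTER NOT EXISTS {
    ?country p:P1082/pq:P585 ?laterCountryDate .
    FILTER (?laterCountryDate > ?countryDate)
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
}
LIMIT 100
```

Statements without a P585 qualifier are dropped by this pattern, and two statements sharing the same latest date would both be returned. The wdt:P31/wdt:P279* path may also be slow on the public endpoint.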

Related

SPARQL wikidata group-by: get grouped items in the answer (rivers with multiple cities)

The following query returns rivers with at least 3 big cities on them.
try it on wikidata query service
# rivers with at least 3 big cities
SELECT ?river ?riverLabel (COUNT(?city) AS ?citycount)
WHERE {
?river wdt:P31 wd:Q4022. # ... is a river
?city wdt:P31 wd:Q1549591. # ... is a big city
?city wdt:P206 ?river. # ... located in or next to body of water
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]".}
}
GROUP BY ?river ?riverLabel
HAVING (?citycount >= 3) # ← important line
ORDER BY DESC(?citycount)
How can I include the actual cities (and their labels) in the answer? The ordering should stay the same, i.e. the first 13 results should be the cities on the Yangtze, the next 11 those on the Rhine, etc.
You can use GROUP_CONCAT to collect each river's cities into a single value:
# rivers and cities
SELECT ?river ?riverLabel (COUNT(?city) AS ?citycount)
(GROUP_CONCAT(CONCAT(STR(?city), "--> ", ?cityLabel); SEPARATOR="; ") AS ?cities)
WHERE {
?river wdt:P31 wd:Q4022. # ... is a river
?city wdt:P31 wd:Q1549591. # ... is a big city
?city wdt:P206 ?river. # ... located in or next to body of water
?city rdfs:label ?cityLabel .
FILTER(LANG(?cityLabel) = 'en')
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]".}
}
GROUP BY ?river ?riverLabel
HAVING (?citycount >= 3)
ORDER BY DESC(?citycount)
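If you need one row per city rather than a concatenated string, a subquery is another option. This is only a sketch and has not been checked against the live endpoint: the inner query (using the made-up helper variable ?c) counts the big cities per river, and the outer pattern then lists the cities of the rivers that qualify.

```sparql
# one row per city, for rivers with at least 3 big cities
SELECT ?river ?riverLabel ?citycount ?city ?cityLabel
WHERE {
  {
    # inner query: count big cities per river and keep rivers with 3 or more
    SELECT ?river (COUNT(?c) AS ?citycount)
    WHERE {
      ?river wdt:P31 wd:Q4022 .        # ... is a river
      ?c wdt:P31 wd:Q1549591 ;         # ... is a big city
         wdt:P206 ?river .             # ... located in or next to body of water
    }
    GROUP BY ?river
    HAVING (COUNT(?c) >= 3)
  }
  # outer pattern: fetch the individual cities of each qualifying river
  ?city wdt:P31 wd:Q1549591 ;
        wdt:P206 ?river .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
}
ORDER BY DESC(?citycount) ?river
```

Sorting by ?river as a second key keeps the rows of rivers with the same count together.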

SPARQL wikidata group-by: rivers with multiple cities

I want to ask Wikidata which rivers have multiple big cities on them. The following query gives me a list in which ?citycount is an integer ranging from 1 to 13.
# rivers and cities
SELECT ?river ?riverLabel (COUNT(?city) AS ?citycount)
WHERE {
?river wdt:P31 wd:Q4022. # ... is a river
?city wdt:P31 wd:Q1549591. # ... is a big city
?city wdt:P206 ?river. # ... located in or next to body of water
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]".}
}
GROUP BY ?river ?riverLabel
ORDER BY DESC(?citycount)
LIMIT 100
How can I restrict the results to those rivers with ?citycount bigger than, say, 3?
Turning the comment of @UninformedUser into an answer:
# rivers and cities
SELECT ?river ?riverLabel (COUNT(?city) AS ?citycount)
WHERE {
?river wdt:P31 wd:Q4022. # ... is a river
?city wdt:P31 wd:Q1549591. # ... is a big city
?city wdt:P206 ?river. # ... located in or next to body of water
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]".}
}
GROUP BY ?river ?riverLabel
HAVING (?citycount > 3) # ← important line
ORDER BY DESC(?citycount)
see: https://www.w3.org/TR/sparql11-query/#having

Select footballers from two or more countries with SPARQL wikidata query

I'm trying to get the date of birth of all the footballers on Wikipedia, per country. So far I've managed to get them for one country, but I would like to get them for several countries at the same time, something like:
SELECT ?person ?dateOfBirth
WHERE {
?person wdt:P1532 wd:Q16 OR ?person wdt:P1532 wd:Q141;
wdt:P569 ?dateOfBirth.
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
Q16 is Canada and Q141 is Argentina.
Thanks!
To avoid limiting your query to a single country, replace wd:Q16 with a variable, i.e. ?person wdt:P1532 ?country.
Then you need to restrict your query to football players.
This can be done with the following line
?person wdt:P106 wd:Q937857.
You probably also want the names of the players, not only their IDs. To get them, add ?personLabel after SELECT.
Combined, the query reads
SELECT ?person ?personLabel ?country ?countryLabel ?dateOfBirth
WHERE {
?person wdt:P106 wd:Q937857;
wdt:P1532 ?country;
wdt:P569 ?dateOfBirth.
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
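If you only want specific countries, as in the question, a VALUES clause can list them. A minimal sketch: wd:Q16 is Canada as stated in the question, while wd:Q414 is my assumption for Argentina (the question writes Q141), so check the ID before relying on it.

```sparql
SELECT ?person ?personLabel ?country ?countryLabel ?dateOfBirth
WHERE {
  # restrict ?country to an explicit list of items
  VALUES ?country { wd:Q16 wd:Q414 }   # Canada; Argentina (assumed ID, please verify)
  ?person wdt:P106 wd:Q937857 ;        # occupation: association football player
          wdt:P1532 ?country ;         # country for sport
          wdt:P569 ?dateOfBirth .      # date of birth
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
}
```

Adding more countries only means adding more items inside the braces of VALUES.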

Recreate Wikipedia list of states by GDP with SPARQL query

I am trying to recreate this list:
https://en.wikipedia.org/wiki/List_of_states_and_territories_of_the_United_States_by_GDP
with a Wikidata SPARQL query.
I can find states by population with this query
Additionally, the fields:
population (P1082)
GDP (P2131)
And some extra ones, like unemployment (P1198)
are covered by WikiProject Economics, though only at the country level.
That said, seeing the "List of states and territories of the United States by GDP" article makes me think at least P2131 may be available at the state level.
I have tried the following query.
SELECT DISTINCT ?state ?stateLabel ?population ?gdp
{
?state wdt:P31 wd:Q35657 ;
wdt:P1082 ?population ;
wdt:P2131 ?gdp ;
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
GROUP BY ?state ?population ?stateLabel ?gdp
ORDER BY DESC(?population)
And see no matches. What's the right query for this?
(Point in time for a given year would be excellent, like how the table gives me 2019, 2020, etc., but I'll settle for learning the vanilla first.)
Because of an internal Wikidata convention, I had to upload the GDP data to the items about the states' economies, which are linked through property P8744.
E.g., for the State of Maine you'll find the data in economy of Maine.
This is the correct query for obtaining what you want:
SELECT DISTINCT ?state ?stateLabel ?population ?gdp (year(?date) as ?year)
{
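?state wdt:P31 wd:Q35657 ;
wdt:P1082 ?population ;
wdt:P8744 ?economy . # the state's "economy of ..." item
?economy wdt:P2131 ?gdp ; # best-rank GDP value
p:P2131 ?gdpStmt . # statement node on that same economy item
?gdpStmt ps:P2131 ?gdp ;
pq:P585 ?date .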
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
GROUP BY ?state ?population ?stateLabel ?gdp ?date
ORDER BY DESC(?year) DESC(?population)
It also considers grouping by year.

Simple Wikidata SPARQL query for cities, countries, and country property times out

I want to fetch a sample of big cities (Q1549591) along with their countries and the countries' ISO 3166-1 alpha-3 codes (P298). This simple query works as expected without the ISO codes, but when I add that part the query times out. I don't understand what the problem is.
SELECT ?cityLabel ?countryLabel ?iso
WHERE {
?city wdt:P31 wd:Q1549591 .
?city wdt:P17 ?country .
?country wdt:P298 ?iso .
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 10
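One commonly suggested workaround for timeouts of this kind, offered as a guess and not as a confirmed diagnosis of this query, is to apply the LIMIT in a subquery so that the label service only has to resolve labels for the ten rows that survive. The sketch below uses P298, the alpha-3 property named in the question text.

```sparql
SELECT ?cityLabel ?countryLabel ?iso
WHERE {
  {
    # pick the sample first, without any labels
    SELECT ?city ?country ?iso
    WHERE {
      ?city wdt:P31 wd:Q1549591 ;   # big city
            wdt:P17 ?country .      # country of the city
      ?country wdt:P298 ?iso .      # ISO 3166-1 alpha-3 code
    }
    LIMIT 10
  }
  # labels are looked up only for the rows selected above
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
```

If this still times out, the cause lies elsewhere, for example in how the query engine orders the joins.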