SPARQL wikidata group-by: rivers with multiple cities

I want to ask Wikidata which rivers have multiple big cities on them. The following query returns a list in which ?citycount is an integer ranging from 1 to 13.
# rivers and cities
SELECT ?river ?riverLabel (COUNT(?city) AS ?citycount)
WHERE {
?river wdt:P31 wd:Q4022. # ... is a river
?city wdt:P31 wd:Q1549591. # ... is a big city
?city wdt:P206 ?river. # ... located in or next to body of water
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]".}
}
GROUP BY ?river ?riverLabel
ORDER BY DESC(?citycount)
LIMIT 100
How can I restrict the results to rivers whose ?citycount is greater than, say, 3?

Turning the comment of @UninformedUser into an answer:
# rivers and cities
SELECT ?river ?riverLabel (COUNT(?city) AS ?citycount)
WHERE {
?river wdt:P31 wd:Q4022. # ... is a river
?city wdt:P31 wd:Q1549591. # ... is a big city
?city wdt:P206 ?river. # ... located in or next to body of water
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]".}
}
GROUP BY ?river ?riverLabel
HAVING (?citycount > 3) # ← important line
ORDER BY DESC(?citycount)
see: https://www.w3.org/TR/sparql11-query/#having
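To make the role of HAVING concrete, here is a minimal Python sketch of the same group, count, filter, and sort steps. The rows are made-up placeholders, not Wikidata results, and the function name is hypothetical.

```python
from collections import Counter

# Made-up (river, city) pairs standing in for the query's solution rows.
rows = [
    ("river_a", "city_1"), ("river_a", "city_2"),
    ("river_a", "city_3"), ("river_a", "city_4"),
    ("river_b", "city_5"), ("river_b", "city_6"),
    ("river_c", "city_7"),
]

def rivers_with_more_than(rows, threshold):
    """Mimic GROUP BY ?river, HAVING (count > threshold), ORDER BY DESC(count)."""
    # GROUP BY ?river with COUNT(?city)
    counts = Counter(river for river, _city in rows)
    # HAVING: the filter runs on the groups, after counting
    kept = [(river, n) for river, n in counts.items() if n > threshold]
    # ORDER BY DESC(?citycount)
    return sorted(kept, key=lambda item: item[1], reverse=True)

result = rivers_with_more_than(rows, 3)
```

The point of the sketch is the order of operations: a plain FILTER sees individual rows, while HAVING sees whole groups. The examples in the SPARQL 1.1 specification repeat the aggregate inside HAVING, as in HAVING (COUNT(?city) > 3), which may be the more portable spelling if an engine does not accept the ?citycount alias there.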

Related

SPARQL wikidata group-by: get grouped items in the answer (rivers with multiple cities)

The following query returns rivers with at least 3 big cities on them.
# rivers with at least 3 big cities
SELECT ?river ?riverLabel (COUNT(?city) AS ?citycount)
WHERE {
?river wdt:P31 wd:Q4022. # ... is a river
?city wdt:P31 wd:Q1549591. # ... is a big city
?city wdt:P206 ?river. # ... located in or next to body of water
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]".}
}
GROUP BY ?river ?riverLabel
HAVING (?citycount >= 3) # ← important line
ORDER BY DESC(?citycount)
How can I include the actual cities (and their labels) in the answer? The ordering should stay the same, i.e. the first 13 results should be the cities located on the Yangtze, the next 11 results those on the Rhine, etc.
You should try using GROUP_CONCAT, see below.
# rivers and cities
SELECT ?river ?riverLabel (COUNT(?city) AS ?citycount)
(GROUP_CONCAT(CONCAT(STR(?city), "--> ", ?cityLabel); SEPARATOR="; ") AS ?cities)
WHERE {
?river wdt:P31 wd:Q4022. # ... is a river
?city wdt:P31 wd:Q1549591. # ... is a big city
?city wdt:P206 ?river. # ... located in or next to body of water
?city rdfs:label ?cityLabel .
FILTER(LANG(?cityLabel) = 'en')
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]".}
}
GROUP BY ?river ?riverLabel
HAVING (?citycount >= 3)
ORDER BY DESC(?citycount)
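GROUP_CONCAT returns one row per river with all cities packed into a single string. If one row per city is wanted instead, ordered by the river's city count as the question describes, the logic can be sketched in Python as follows. The rows and the function name are made up for illustration; this is not the answerer's method.

```python
from collections import defaultdict

# Made-up (river, city) pairs standing in for query results.
rows = [
    ("river_a", "city_1"), ("river_a", "city_2"), ("river_a", "city_3"),
    ("river_b", "city_4"), ("river_b", "city_5"),
    ("river_b", "city_6"), ("river_b", "city_7"),
    ("river_c", "city_8"),
]

def cities_by_river_size(rows, minimum):
    """Return (river, count, city) rows, rivers with the most cities first."""
    by_river = defaultdict(list)
    for river, city in rows:
        by_river[river].append(city)
    # Keep rivers with at least `minimum` cities (the HAVING step).
    big = [(r, cities) for r, cities in by_river.items() if len(cities) >= minimum]
    # Largest river first; ties broken by river name for a stable order.
    big.sort(key=lambda item: (-len(item[1]), item[0]))
    # Expand each group back into one row per city.
    return [(r, len(cities), c) for r, cities in big for c in sorted(cities)]

ordered = cities_by_river_size(rows, 3)
```

In SPARQL the same shape would typically need the count computed in a subquery and joined back to the city rows; the sketch only shows the intended ordering.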

Recreate Wikipedia list of states by GDP with SPARQL query

I am trying to recreate this list:
https://en.wikipedia.org/wiki/List_of_states_and_territories_of_the_United_States_by_GDP
with a Wikidata SPARQL query.
I can find states by population with this query
Additionally, the fields:
population (P1082)
GDP (P2131)
And some extra ones, like unemployment (P1198)
are covered by WikiProject Economics, though only at the country level.
That said, seeing the "List of states and territories of the United States by GDP" article makes me think at least P2131 may be available at the state level.
I have tried the following query.
SELECT DISTINCT ?state ?stateLabel ?population ?gdp
{
?state wdt:P31 wd:Q35657 ;
wdt:P1082 ?population ;
wdt:P2131 ?gdp ;
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
GROUP BY ?state ?population ?stateLabel ?gdp
ORDER BY DESC(?population)
And see no matches. What's the right query for this?
(Point in time for a given year would be excellent, like how the table gives me 2019, 2020, etc., but I'll settle for learning the vanilla first.)
Because of an internal Wikidata convention, I had to upload the GDP data to the items about the states' economies, which are linked to the states through property P8744.
E.g., for the State of Maine you'll find the data in the item "economy of Maine".
This query returns what you want:
SELECT DISTINCT ?state ?stateLabel ?population ?gdp (year(?date) as ?year)
{
?state wdt:P31 wd:Q35657 ;
wdt:P1082 ?population ;
wdt:P8744 ?economy . # item about the state's economy
?economy p:P2131 ?gdpStmt . # GDP statements on that item
?gdpStmt ps:P2131 ?gdp ;
pq:P585 ?date . # point in time of the GDP value
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
GROUP BY ?state ?population ?stateLabel ?gdp ?date
ORDER BY DESC(?year) DESC(?population)
It also extracts the year of each GDP value and sorts the results by year.

Multiple population for a cities in WikiData

I am querying city populations and would like to calculate each city's population as a percentage of its country's population. But each city has multiple population values, one per year, so I would like to use only the latest one.
# Cities with their population as a share of the country's population
SELECT ?country ?countryLabel ?countryPopulation ?city ?cityLabel ?cityPopulation (?cityPopulation / ?countryPopulation AS ?ratio) #(year(?date) as ?dateyear)
WHERE
{
?city wdt:P31/wdt:P279* wd:Q515 . # find instances of subclasses of city
?city wdt:P17 ?country . # Also find the country of the city
?city wdt:P1082 ?cityPopulation . # city population
?country wdt:P1082 ?countryPopulation .
SERVICE wikibase:label {
bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
}
}
#ORDER BY DESC(?count)
LIMIT 100
For example, here I get multiple entries with different values for the population of wd:Q9954.
UPDATE
Based on my understanding of a similar question, I tried to update my query, but I am still not confident that I am forming it correctly. I would also like to calculate city_population / country_population, which still does not seem to work even after filtering. I would appreciate your comments.
SELECT ?country ?countryLabel ?countryPopulation ?city ?cityLabel ?cityPopulation ?ciy_date
WHERE
{
#?city wdt:P31/wdt:P279* wd:Q515 .
#?city wdt:P1082 ?cityPopulation .
#?country wdt:P1082 ?countryPopulation .
?city wdt:P31/wdt:P279* wd:Q515 .
?city p:P1082 ?populationStatement .
?populationStatement ps:P1082 ?cityPopulation;
pq:P585 ?ciy_date .
?city wdt:P17 ?country .
?country wdt:P1082 ?countryPopulation .
FILTER NOT EXISTS {
?country p:P1081/pq:P585 ?hdi_country_date_ .
?city p:P1082/pq:P585 ?hdi_city_date_ .
FILTER (?hdi_city_date_ > ?ciy_date)
FILTER (?hdi_country_date_ > ?hdi_country_date_)
}
SERVICE wikibase:label {
bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
}
}
#ORDER BY DESC(?count)
LIMIT 100
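The intended "latest value only" logic, applied to both the city and the country before dividing, can be sketched in Python. The numbers and dates below are made up, and the helper names are hypothetical; the sketch only illustrates what the FILTER NOT EXISTS block is meant to achieve (no statement with a later date exists).

```python
def latest(statements):
    """Return the value of the (value, iso_date) pair with the newest date.

    ISO 8601 date strings compare correctly as plain strings.
    """
    return max(statements, key=lambda pair: pair[1])[0]

def city_share(city_statements, country_statements):
    """Latest city population divided by latest country population."""
    return latest(city_statements) / latest(country_statements)

# Made-up population statements: (value, point in time).
city = [(900, "2001-01-01"), (1000, "2011-01-01")]
country = [(40000, "2001-01-01"), (50000, "2011-01-01")]

share = city_share(city, country)
```

Note that both sides need their own "no later date" check. In the query above, the country comparison tests a variable against itself, which can never be true, and it looks at P1081 rather than the population property P1082, so those two lines would be the first places to check.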

Simple Wikidata SPARQL query for cities, countries, and country property times out

I want to fetch a sample of big cities (Q1549591) along with their countries and the countries' ISO 3166-1 alpha-3 codes (P298). The query works as expected without the ISO codes, but when I add that part it times out. I don't understand what the problem is.
SELECT ?cityLabel ?countryLabel ?iso
WHERE {
?city wdt:P31 wd:Q1549591 .
?city wdt:P17 ?country .
?country wdt:P298 ?iso . # ISO 3166-1 alpha-3 code
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 10

Wikidata SPARQL - Countries and their (still existing) neighbours

I want to query the neighbours of a country with SPARQL from Wikidata like this:
SELECT ?country ?countryLabel WHERE {
?country wdt:P47 wd:Q183 .
FILTER NOT EXISTS{ ?country wdt:P576 ?date } # don't count dissolved country - at least filters German Democratic Republic
SERVICE wikibase:label {
bd:serviceParam wikibase:language "en" .
}
}
My issue is that the results for Germany's neighbours, for example, still include countries that do not exist anymore, such as:
Kingdom of Denmark or
Saarland.
Already tried
The FILTER statement already reduces the number of results.
Question
How can I change the query so that it returns only the 9 current neighbouring countries?
(Distinguishing land borders from sea borders would also be great.)
Alternative
Filtering through this API would also be fine for me: https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q35
So would a database, lists, prepared HashMaps, or anything else that covers all countries of the world with their neighbours.
You could check entities of type wd:Q133346 ('border') or wd:Q12413618 ('international border'):
SELECT ?border ?borderLabel ?country1Label ?country2Label ?isLandBorder ?isMaritimeBorder ?constraint {
VALUES (?country1) {(wd:Q183)}
?border wdt:P31 wd:Q12413618 ;
wdt:P17 ?country1 , ?country2 .
FILTER (?country1 != ?country2)
BIND (EXISTS {?border wdt:P31 wd:Q15104814} AS ?isLandBorder)
BIND (EXISTS {?border wdt:P31 wd:Q3089219} AS ?isMaritimeBorder)
BIND ((?isLandBorder || ?isMaritimeBorder) AS ?constraint)
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
} ORDER BY ?country1Label
In some sense, records are duplicated: for the Afghanistan–Uzbekistan border, the list contains both (?country1=Afghanistan, ?country2=Uzbekistan) and (?country1=Uzbekistan, ?country2=Afghanistan).
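If the mirrored pairs are unwanted, they can be collapsed after the fact. The following Python sketch shows one way to do that on made-up placeholder pairs; the function name is hypothetical.

```python
def unique_borders(pairs):
    """Collapse (a, b) and (b, a) into a single pair, keeping first-seen order."""
    seen = set()
    result = []
    for a, b in pairs:
        # Sorting the two names gives one canonical key per border.
        key = tuple(sorted((a, b)))
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result

# Made-up pairs standing in for (?country1Label, ?country2Label) rows.
pairs = [
    ("country_x", "country_y"),
    ("country_y", "country_x"),
    ("country_x", "country_z"),
]

borders = unique_borders(pairs)
```

Inside SPARQL, a common way to get the same effect is an ordering filter such as FILTER (STR(?country1) < STR(?country2)) in place of the != test. That only makes sense when ?country1 is not pinned to a single country by VALUES, since otherwise it would drop pairs rather than deduplicate them.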
Regarding "a database or lists or prepared HashMaps whatever with all countries of the world with neighbours": you could ask on https://opendata.stackexchange.com.