Wikidata SPARQL query is missing part of expected results

On Wikidata's SPARQL endpoint I tried to find all cities worldwide with a population greater than 100000.
My query returns lots of correct results, but when I checked for some particular cities, they didn't show up in the list.
My query:
SELECT DISTINCT ?cityLabel ?population ?coord ?countryLabel ?shortCountry ?city WHERE {
?city (wdt:P31/(wdt:P279*)) wd:Q515;
wdt:P1082 ?population;
wdt:P625 ?coord.
FILTER(?population > 100000 )
?city wdt:P17 ?country.
?country wdt:P298 ?shortCountry.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ASC (?shortCountry)
I looked for the following cities in the results:
Berlin
Beirut
Lyon
But they didn't show up.

OK, I figured it out myself:
I dug a little deeper into the Wikidata items of the cities I mentioned and found that they are not instances of "city" (Q515).
Berlin is an instance of e.g. "capital", "city with millions of inhabitants" and "metropolis".
Lyon is an instance of "commune of France" and "big city".
Beirut is an instance of e.g. "capital" and "big city".
But none of them is listed as a "city". That is surely not intentional, because other large cities, e.g. Paris, are instances of "city".
So in conclusion:
Check all the classes you're filtering for, and check that your expected results actually belong to them.
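A quick way to run that check is to list the "instance of" (P31) values of the items in question. The sketch below assumes that Q64, Q456 and Q3820 are the item IDs of Berlin, Lyon and Beirut; those IDs are not given in the question, so verify them on Wikidata before relying on the output.

```sparql
# List every class the given items are a direct instance of.
# Assumption: Q64 = Berlin, Q456 = Lyon, Q3820 = Beirut.
SELECT ?city ?cityLabel ?class ?classLabel WHERE {
  VALUES ?city { wd:Q64 wd:Q456 wd:Q3820 }
  ?city wdt:P31 ?class .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?cityLabel ?classLabel
```

Any item whose classes are neither wd:Q515 nor one of its subclasses will be missing from the original result list.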

Related

How to get the number of languages a Wikipedia page is available in? (SPARQL query)

I'm trying to get a list of Italian books from 1980 onwards, together with the number of languages their Wikipedia page is available in. For instance, I would like to have:
Book, number
The Name of the Rose, 5
Here 5 is the number of language editions of Wikipedia that have a page for The Name of the Rose: for instance, there is an English page, a Dutch one, a Spanish one, a French one and a Greek one.
Here is the query so far:
SELECT ?item ?label
WHERE
{
VALUES ?type {wd:Q571 wd:Q7725634} # book or literary work
?item wdt:P31 ?type .
?item wdt:P577 ?date FILTER (?date > "1980-01-01T00:00:00Z"^^xsd:dateTime) . # from 1980 on
?item rdfs:label ?label filter (lang(?label) = "it")
?item wdt:P495 wd:Q38 .
}
I get the list of books, but I can't find the right property to look for.
Did you actually get The Name of the Rose in your example? Its date of publication is set to 1980 (which is stored as 1980-01-01 for our purposes), so I had to change your query to a >= comparison.
Then, using COUNT() and GROUP BY as mentioned in the comment gets you what you want. But if you really just need the number of sitelinks, there is a shortcut that may be useful: wikibase:sitelinks. It was added because that number is often used as a good proxy for an item's popularity, and using the precomputed number is vastly more efficient than getting all links, grouping, and counting.
SELECT ?book ?bookLabel ?sitelinks ?date WHERE {
VALUES ?type { wd:Q571 wd:Q47461344 wd:Q7725634 }
?book wdt:P31 ?type;
wdt:P577 ?date;
wdt:P495 wd:Q38;
wikibase:sitelinks ?sitelinks.
FILTER((?date >= "1980-01-01T00:00:00Z"^^xsd:dateTime) && (?date < "1981-01-01T00:00:00Z"^^xsd:dateTime))
SERVICE wikibase:label { bd:serviceParam wikibase:language "it,en". }
}
Note that the values may slightly differ from the version with group & count because here sites such as commons or wikiquote are also included.
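For comparison, the COUNT() and GROUP BY variant could look roughly like the sketch below. It assumes that sitelinks are modelled as article nodes with schema:about pointing at the item and schema:isPartOf pointing at the site, and that Wikipedia sites carry wikibase:wikiGroup "wikipedia", as in the Wikidata RDF export. Treat it as an illustration, not as the exact query from the comment.

```sparql
# Count only Wikipedia pages (no Commons, Wikiquote, etc.) per book.
SELECT ?book ?bookLabel (COUNT(DISTINCT ?article) AS ?wikipediaPages) WHERE {
  VALUES ?type { wd:Q571 wd:Q47461344 wd:Q7725634 }
  ?book wdt:P31 ?type;
        wdt:P577 ?date;
        wdt:P495 wd:Q38.
  # Same date window as the shortcut query above.
  FILTER((?date >= "1980-01-01T00:00:00Z"^^xsd:dateTime) && (?date < "1981-01-01T00:00:00Z"^^xsd:dateTime))
  # One ?article per sitelink; keep only those hosted on a Wikipedia.
  ?article schema:about ?book;
           schema:isPartOf ?site.
  ?site wikibase:wikiGroup "wikipedia".
  SERVICE wikibase:label { bd:serviceParam wikibase:language "it,en". }
}
GROUP BY ?book ?bookLabel
ORDER BY DESC(?wikipediaPages)
```

COUNT(DISTINCT ?article) keeps a book with several types or publication dates from being counted more than once per page.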

Simple Wikidata SPARQL query for cities, countries, and country property times out

I want to fetch a sample of big cities (Q1549591) along with their countries and the countries' ISO 3166-1 alpha-3 codes (P298). This simple query works as expected without the ISO codes, but when I add that part, the query times out. I don't understand what the problem is.
SELECT ?cityLabel ?countryLabel ?iso
WHERE {
?city wdt:P31 wd:Q1549591 .
?city wdt:P17 ?country .
?country wdt:P298 ?iso .
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 10
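One workaround that is often suggested for this kind of timeout is to pick the sample in a subquery with its own LIMIT, so that the label service only has to resolve ten rows. The sketch below assumes the label service is what slows the query down, which the question does not confirm; it uses P298, the alpha-3 property named in the question text.

```sparql
SELECT ?cityLabel ?countryLabel ?iso WHERE {
  {
    # Inner query: find ten matching rows without fetching any labels.
    SELECT ?city ?country ?iso WHERE {
      ?city wdt:P31 wd:Q1549591 .
      ?city wdt:P17 ?country .
      ?country wdt:P298 ?iso .
    }
    LIMIT 10
  }
  # Outer query: resolve labels only for the ten selected rows.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```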

Wikidata query missing out countries in Europe

I am using the following query against Wikidata:
SELECT ?country ?countryLabel
WHERE
{
?country wdt:P30 wd:Q46;
wdt:P31 wd:Q6256.
SERVICE wikibase:label { bd:serviceParam wikibase:language
"[AUTO_LANGUAGE],en". }
}
where P30 is "continent", Q46 is Europe, P31 is "instance of" and Q6256 is "country".
https://query.wikidata.org/#SELECT%20%3Fcountry%20%3FcountryLabel%0A%20%20%20%20%20%20WHERE%0A%20%20%20%20%20%20%7B%0A%20%20%20%20%20%20%20%20%3Fcountry%20%20%20wdt%3AP30%20wd%3AQ46%3B%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20wdt%3AP31%20wd%3AQ6256.%0A%20%20%20%20%20%20%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%0A%20%20%20%20%20%20%20%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%7D%0A%20%20%20%20%20%20%7D
Yet this query returns only 15 European countries. For instance, Sweden is not returned even though it appears to match the query; see https://www.wikidata.org/wiki/Q34
So although the query seems correct, it misses many countries. Any ideas on how to resolve this?
Comparing the entries for Germany and Sweden (which do not show up) with the one for Norway (which does), the difference I could find is that Germany and Sweden have a preferred-rank "instance of" statement for "sovereign state" and only a normal-rank one for "country". This could be the reason, since the wdt: ("truthy") triples contain only the best-ranked statements of a property: when a preferred-rank statement exists, the normal-rank ones are skipped. If that is the case, and I suspect it is, I wonder whether there is a way to make the query search through all statements, whether of preferred or normal rank.
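The rank difference can be inspected directly by going through the statement nodes and reading wikibase:rank. This is a minimal sketch; Q34 (Sweden) and Q183 (Germany) appear elsewhere on this page, while Q20 for Norway is my assumption and should be checked.

```sparql
# Show every "instance of" statement of the three countries with its rank.
# Assumption: Q20 = Norway.
SELECT ?country ?countryLabel ?class ?classLabel ?rank WHERE {
  VALUES ?country { wd:Q34 wd:Q183 wd:Q20 }
  ?country p:P31 ?statement .
  ?statement ps:P31 ?class ;
             wikibase:rank ?rank .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?countryLabel ?rank
```

Statements listed with wikibase:NormalRank next to a wikibase:PreferredRank one for the same country are the ones a wdt:P31 pattern will not see.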
I get a better selection of countries when I bypass the truthy triples and query the statement nodes instead. These return all statements, including those with normal rank.
SELECT DISTINCT ?country ?countryLabel
WHERE
{
?country wdt:P30 wd:Q46.
?country p:P31 ?country_instance_of_statement .
?country_instance_of_statement ps:P31 wd:Q6256 .
SERVICE wikibase:label { bd:serviceParam wikibase:language
"[AUTO_LANGUAGE],en".
}
filter not exists{?country p:P31/ps:P31 wd:Q3024240 }
}
order by ?countryLabel
I still have a few extra countries showing up; such as German Empire. But I think that's a different problem to fix.
https://query.wikidata.org/#SELECT%20distinct%20%3Fcountry%20%3Fcountry_instance_of_statement%20%3FcountryLabel%0A%20%20%20%20%20%20WHERE%0A%20%20%20%20%20%20%7B%0A%20%20%20%20%20%20%20%20%3Fcountry%20%20%20wdt%3AP30%20wd%3AQ46.%0A%20%20%20%20%20%20%20%20%3Fcountry%20p%3AP31%20%3Fcountry_instance_of_statement%20.%0A%20%20%20%20%20%20%20%20%3Fcountry_instance_of_statement%20ps%3AP31%20wd%3AQ6256%20.%0A%20%20%20%20%20%20%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%0A%20%20%20%20%20%20%20%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%0A%20%20%20%20%20%20%20%20%7D%0A%20%20%20%20%20%20%20%20filter%20not%20exists%7B%3Fcountry%20p%3AP31%2Fps%3AP31%20wd%3AQ3024240%20%7D%0A%20%20%20%20%20%20%7D%20%0A
Note that ?country_instance_of_statement captures all the statements irrespective of rank. Once I have those, I use 'ps:P31 wd:Q6256' to pull out the ones that have country ("wd:Q6256") as the object.
I have incorporated the suggestions from @AKSW above.
For those who want another approach, using the end time of the country statement, this is the SPARQL:
SELECT distinct ?country ?countryLabel
WHERE
{
?country wdt:P30 wd:Q46.
?country p:P31 ?country_instance_of_statement .
?country_instance_of_statement ps:P31 wd:Q6256 .
filter not exists {?country_instance_of_statement pq:P582 ?endTime }
SERVICE wikibase:label { bd:serviceParam wikibase:language
"[AUTO_LANGUAGE],en".
}
}
order by ?countryLabel
https://query.wikidata.org/#SELECT%20distinct%20%3Fcountry%20%3Fcountry_instance_of_statement%20%3FcountryLabel%0A%20%20%20%20%20%20WHERE%0A%20%20%20%20%20%20%7B%0A%20%20%20%20%20%20%20%20%3Fcountry%20%20%20wdt%3AP30%20wd%3AQ46.%0A%20%20%20%20%20%20%20%20%3Fcountry%20p%3AP31%20%3Fcountry_instance_of_statement%20.%0A%20%20%20%20%20%20%20%20%3Fcountry_instance_of_statement%20ps%3AP31%20wd%3AQ6256%20.%0A%20%20%20%20%20%20%20%20filter%20not%20exists%20%7B%3Fcountry_instance_of_statement%20pq%3AP582%20%3FendTime%20%7D%0A%20%20%20%20%20%20%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%0A%20%20%20%20%20%20%20%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%0A%20%20%20%20%20%20%20%20%7D%0A%20%20%20%20%20%20%7D%20%0A

Wikidata SPARQL - Countries and their (still existing) neighbours

I want to query the neighbours of a country with SPARQL from Wikidata like this:
SELECT ?country ?countryLabel WHERE {
?country wdt:P47 wd:Q183 .
FILTER NOT EXISTS{ ?country wdt:P576 ?date } # don't count dissolved country - at least filters German Democratic Republic
SERVICE wikibase:label {
bd:serviceParam wikibase:language "en" .
}
}
My issue is that, e.g. in this example for the neighbours of Germany, countries that no longer exist are still shown, such as:
Kingdom of Denmark or
Saarland.
Already tried
I could already reduce the number with the FILTER statement.
Question
How can I change the query so that it returns only the 9 current neighbouring countries?
(Distinguishing land borders from sea borders would also be great.)
Alternative
Filtering via this API would also be fine for me: https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q35
So would a database, lists or prepared HashMaps, whatever, with all countries of the world and their neighbours.
You could check entities of type wd:Q133346 ('border') or wd:Q12413618 ('international border'):
SELECT ?border ?borderLabel ?country1Label ?country2Label ?isLandBorder ?isMaritimeBorder ?constraint {
VALUES (?country1) {(wd:Q183)}
?border wdt:P31 wd:Q12413618 ;
wdt:P17 ?country1 , ?country2 .
FILTER (?country1 != ?country2)
BIND (EXISTS {?border wdt:P31 wd:Q15104814} AS ?isLandBorder)
BIND (EXISTS {?border wdt:P31 wd:Q3089219} AS ?isMaritimeBorder)
BIND ((?isLandBorder || ?isMaritimeBorder) AS ?constraint)
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
} ORDER BY ?country1Label
In some sense, records are duplicated: for the Afghanistan–Uzbekistan border, the list contains both (?country1=Afghanistan, ?country2=Uzbekistan) and (?country1=Uzbekistan, ?country2=Afghanistan).
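If the query is run for all countries (that is, without the VALUES line), one way to keep only one row per border is to impose an arbitrary order on the two country IRIs. This is a sketch of that idea, not part of the original answer:

```sparql
SELECT ?border ?borderLabel ?country1Label ?country2Label {
  ?border wdt:P31 wd:Q12413618 ;
          wdt:P17 ?country1 , ?country2 .
  # Keep only the pair whose first IRI sorts before the second;
  # this also drops rows where both variables bind to the same country.
  FILTER (STR(?country1) < STR(?country2))
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
} ORDER BY ?country1Label
```

The comparison replaces the != filter, because a strict "less than" already excludes equal values.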
Regarding "a database or lists or prepared HashMaps whatever with all countries of the world with neighbours": you could ask on https://opendata.stackexchange.com.

How to convert some queries from sql to sparql?

I am just learning SPARQL and I have the following tables:
Countries, European Country, City and Capital
I would like to know how to write the following queries, because I don't understand how to express them:
1) Which country has the city "Paris" as its capital?
2) Print all the European countries.
3) Print all the countries and their capitals.
Thank you very much in advance.
This said, there is DBpedia, which offers information similar to what is described and is a good playground. To query it, use http://yasgui.org (a better SPARQL editor) or http://factforge.net/sparql (our own integration, a few months old, but with some extra goodies).
Which country has the city "Paris" as capital
select * {
?x a dbo:Country; dbo:capital dbr:Paris
}
The yasgui results will surprise you:
dbr:Bourbon_Restoration
dbr:France
dbr:Francia
dbr:French_Fifth_Republic
dbr:Kingdom_of_France
dbr:Office_International_d'Hygiène_Publique
dbr:Second_French_Empire
dbr:West_Francia
I understand all the historic kingdoms, but dbr:Office_International_d'Hygiène_Publique is a bit of a shock. The reason is that it's an international organization that uses Infobox Former International Organization, which redirects to https://en.wikipedia.org/wiki/Template:Infobox_former_country, which in turn "is currently being merged with Template:Infobox country". See http://mappings.dbpedia.org/index.php/Cleaning_up_Countries
factforge returns even more results from linked datasets (all these mean just France):
geodata:3017382/
http://ontologi.es/place/FR
http://psi.oasis-open.org/iso/3166/#250
leaks:country-FRA
wfr:fr
All the European Countries
That's better because there happens to be a Wikipedia category:
select * {
?x a dbo:Country; dct:subject dbc:Countries_in_Europe
}
All countries and their capitals
select * {
?x a dbo:Country; dbo:capital ?capital
}
This returns a bunch of historic countries and capitals, and again some international organizations, e.g.:
League_of_Nations
International_Authority_for_the_Ruhr
Japanese_occupation_of_British_Borneo
So then, you may have better luck with Wikidata (https://query.wikidata.org/). At https://twitter.com/search?q=wikidatafacts%20country you can find a bunch of interesting queries related to countries.
Countries and capitals on Wikidata
select ?country ?countryLabel ?city ?cityLabel {
?country wdt:P36 ?city
filter exists {?country wdt:P31/wdt:P279* wd:Q6256}
SERVICE wikibase:label {
bd:serviceParam wikibase:language "en,pl,ru,es" .
}
} order by ?countryLabel
It uses a bunch of Qnn and Pnn identifiers, but you can decode them when you mouse over them.
How did I find that wdt:P36 means "capital"? Search for property:capital
Why filter exists? Because a country may be an instance of several subtypes of wd:Q6256 "country", and if I put that pattern in the main query, it returns several results per country. This way it returns only one.
Map
You can also easily display it on a map:
#defaultView:Map
select ?country ?countryLabel ?city ?cityLabel ?coords {
?country wdt:P36 ?city
filter exists {?country wdt:P31/wdt:P279* wd:Q6256}
?city wdt:P625 ?coords
SERVICE wikibase:label {bd:serviceParam wikibase:language "en,pl,ru,es"}
}
See a couple of screenshots at https://twitter.com/valexiev1/status/844870994942603264