SPARQL question: how to return property labels and associated date qualifiers from Wikidata - properties

I am trying to return results for a set of persons (Edinburgh University alumni) who have held political office. I would like to return the title label of the office held, along with the start and end dates for each office, with many individuals holding multiple positions. I seem to be able to get one or the other or can get it to work if the person only held one position, but can't get the two to come together where multiple offices were held.
My current version of the query is below. This gives me the start and end dates, but rather than the label of the political office, such as Member of the [x] Parliament of the United Kingdom, ?officeLabel returns a value such as: statement/Q4668868-E3734C7D-40F0-4D4A-8208-E3D6B8C944CB
SELECT DISTINCT ?alumni ?fullName ?roleLabel ?officeLabel ?start ?end WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
  ?alumni wdt:P69 wd:Q160302.
  ?alumni rdfs:label ?fullName.
  ?alumni wdt:P106 ?role.
  # Use VALUES to separate out politicians - Q82955
  VALUES (?role) {
    (wd:Q82955)
  }
  # Select only where position of office is stated but make dates optional
  ?alumni p:P39 ?office.
  OPTIONAL { ?office pq:P580 ?start. }
  OPTIONAL { ?office pq:P582 ?end. }
  FILTER(LANGMATCHES(LANG(?fullName), "en"))
  FILTER(NOT EXISTS { FILTER(LANGMATCHES(LANG(?fullName), "en-ca")) })
  FILTER(NOT EXISTS { FILTER(LANGMATCHES(LANG(?fullName), "en-gb")) })
}
ORDER BY ?fullName
LIMIT 10

Yeah, I still get tripped up on qualifiers and the Wikidata Data Model too.
[Diagram: the Wikidata RDF mapping. By Michael F. Schönitzer - Own work, based on File:Rdf mapping.svg, CC BY 4.0, https://commons.wikimedia.org/w/index.php?curid=63880194]
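Going the "p: route" from the "item" (p:P39) takes you to a statement node, not to the office itself, which is why ?officeLabel came back as a statement URI. From that statement node you need the "ps: route" (ps:P39) to get back to the "simple value", while the pq: qualifiers (start and end time) hang off the same statement node.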
So, using this to slightly modify your query gives the results I think you want.
SELECT DISTINCT ?alumni ?fullName ?roleLabel ?officeLabel ?start ?end WHERE {
  ?alumni wdt:P69 wd:Q160302.
  ?alumni rdfs:label ?fullName.
  ?alumni wdt:P106 ?role.
  VALUES (?role) {
    (wd:Q82955)
  }
  # p:P39 leads from the person to the statement node of each "position held" claim
  ?alumni p:P39 ?officeStmnt.
  # ps:P39 leads from the statement node to the office itself
  ?officeStmnt ps:P39 ?office.
  # The qualifiers are attached to the statement node, not to the office
  OPTIONAL { ?officeStmnt pq:P580 ?start. }
  OPTIONAL { ?officeStmnt pq:P582 ?end. }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
  FILTER(LANGMATCHES(LANG(?fullName), "en"))
  FILTER(NOT EXISTS { FILTER(LANGMATCHES(LANG(?fullName), "en-ca")) })
  FILTER(NOT EXISTS { FILTER(LANGMATCHES(LANG(?fullName), "en-gb")) })
}
ORDER BY ?fullName
LIMIT 10
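When the statement node itself is not needed in the results, the same item-to-statement-to-value route can be written more compactly with SPARQL's blank-node syntax. The following is only a sketch of that shorthand, not a replacement for the query above: because nothing here is OPTIONAL, it returns only offices that have a start time, and the selected variable names are illustrative.

```sparql
# Sketch: the square brackets stand for the anonymous statement node.
SELECT ?alumni ?office ?start WHERE {
  ?alumni wdt:P69 wd:Q160302;
          p:P39 [ ps:P39 ?office;     # statement node -> office
                  pq:P580 ?start ].   # statement node -> start-time qualifier
}
LIMIT 10
```

Keeping an explicit ?officeStmnt variable, as in the query above, remains the better choice whenever the qualifiers have to be optional.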

Related

SPARQL Distance problem WIKIDATA - problem with distance measuring

I am working on a SPARQL exercise: I want to find all places that don't have an airport within a 100 km range.
Right now I am stuck, because I wanted to UNION the two result sets in order to filter the data, but that isn't working.
Please help me to understand how to connect the data so that I can filter on the distance :)
To simplify the code I changed all the cities to only Berlin.
SELECT ?placeCoor WHERE {
  {
    SELECT ?placeCoor WHERE {
      ?place wdt:P31/wdt:P279* wd:Q1248784.
      ?place wdt:P625 ?placeCoor.
    }
  }
  UNION
  {
    SELECT DISTINCT ?berlinLoc WHERE {
      wd:Q64 wdt:P625 ?berlinLoc .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }
  }
  FILTER (geof:distance(?placeCoor, ?berlinLoc) > 5)
}

Wikidata: Filter post codes only current valid

I would like to filter post codes to show only those that are currently valid.
The problem is that there are cities with old post codes (Example).
My current query also shows the old post codes:
SELECT ?city ?cityLabel ?postcode ?federal_stateLabel ?federal_state_nr WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],de". }
  ?city (wdt:P31/(wdt:P279*)) wd:Q7930989;
        wdt:P17 wd:Q183;
        wdt:P281 ?postcode;
        wdt:P131 ?federal_state.
  ?federal_state wdt:P439 ?federal_state_nr.
}
ORDER BY (?postcode)
LIMIT 10
(query.wikidata.org)
I would have to use start time P580 and end time P582 but I don't see how.
You can use this for filtering out the claims which have an end time:
?city p:P281 ?postCodeStmt .
?postCodeStmt ps:P281 ?postcode .
FILTER NOT EXISTS { ?postCodeStmt pq:P582 ?endTime . }
The whole query becomes:
SELECT ?city ?cityLabel ?postcode ?federal_stateLabel ?federal_state_nr WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],de". }
  ?city (wdt:P31/(wdt:P279*)) wd:Q7930989;
        wdt:P17 wd:Q183;
        p:P281 ?postCodeStmt;
        wdt:P131 ?federal_state.
  ?federal_state wdt:P439 ?federal_state_nr.
  ?postCodeStmt ps:P281 ?postcode.
  FILTER NOT EXISTS { ?postCodeStmt pq:P582 ?endTime . } # Filtering out old postal codes
}
ORDER BY (?postcode)
LIMIT 100
See also Wikidata:SPARQL tutorial§Qualifiers.
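Filtering on the mere presence of an end time also drops a post code whose end time lies in the future. If that matters, one possible variant is to exclude only statements whose end time has already passed. This is a sketch under the assumption that the P582 values are ordinary xsd:dateTime literals that can be compared with NOW(); the federal state columns are left out to keep it short.

```sparql
SELECT ?city ?cityLabel ?postcode WHERE {
  ?city (wdt:P31/(wdt:P279*)) wd:Q7930989;
        wdt:P17 wd:Q183;
        p:P281 ?postCodeStmt.
  ?postCodeStmt ps:P281 ?postcode.
  # Drop a statement only if it has an end time that is not in the future
  FILTER NOT EXISTS {
    ?postCodeStmt pq:P582 ?endTime.
    FILTER(?endTime <= NOW())
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],de". }
}
ORDER BY ?postcode
LIMIT 100
```

Statements without any end time still pass, since the inner pattern then has nothing to match.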

Identify entity of a Wikipedia page

My question is related to a similar question/comment which unfortunately never received an answer.
Given a list of multiple Wikipedia pages, e.g.:
https://en.wikipedia.org/wiki/Donald_Trump
https://en.wikipedia.org/wiki/The_Matrix
https://en.wikipedia.org/wiki/Tiger
...
how can I find out what type of entity each of these articles refers to? Ideally I would want something on a higher level, e.g. person, movie, animal, etc.
My best guess so far was to use the Wikidata API with SPARQL to walk back up the instance_of or subclass tree. However, this did not lead to meaningful results.
SELECT ?lemma ?item ?itemLabel ?itemDescription ?instance ?instanceLabel ?subclassLabel WHERE {
  VALUES ?lemma {
    "Donald Trump"@en
    "The Matrix"@en
    "Tiger"@en
  }
  ?sitelink schema:about ?item;
            schema:isPartOf <https://en.wikipedia.org/>;
            schema:name ?lemma.
  ?item wdt:P31* ?instance.
  ?item wdt:P279* ?subclass.
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en,da,sv".
  }
}
The result can be seen here: https://w.wiki/ZmQ
One option would of course also be to look at the itemDescription, but I'm afraid that this is too granular to build meaningful groups from larger lists and count frequencies later on.
Does anyone have a hint/idea on how to get more general entity categories? Maybe also from the mediawiki API?
Any input would be highly appreciated!
Here are three possibilities, side-by-side:
SELECT ?lemma ?item
       (GROUP_CONCAT(DISTINCT ?instanceLabel; SEPARATOR = " ") AS ?a)
       (GROUP_CONCAT(DISTINCT ?subclassLabel; SEPARATOR = " ") AS ?b)
       (GROUP_CONCAT(DISTINCT ?isaLabel; SEPARATOR = " ") AS ?c)
WHERE {
  VALUES ?lemma {
    "Donald Trump"@en
    "The Matrix"@en
    "Tiger"@en
  }
  ?sitelink schema:about ?item;
            schema:isPartOf <https://en.wikipedia.org/>;
            schema:name ?lemma.
  # ?a: every class reachable via instance-of, then up the subclass tree
  OPTIONAL { ?item (wdt:P31/(wdt:P279*)) ?instance. }
  # ?b: direct superclasses of the item
  OPTIONAL { ?item wdt:P279 ?subclass. }
  # ?c: direct instance-of values only
  OPTIONAL { ?item wdt:P31 ?isa. }
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en,da,sv".
    ?instance rdfs:label ?instanceLabel.
    ?subclass rdfs:label ?subclassLabel.
    ?isa rdfs:label ?isaLabel.
  }
  # Here, you could add: FILTER(?instanceLabel in ("mammal"@en, "movie"@en, "musical"@en (and so on...)))
}
GROUP BY ?lemma ?item
If you're looking at labels such as "film" and "mammal", i.e. a couple dozen at most, you could explicitly list them in order of preference, then use the first one that occurs.
Note that you may be running into this bug: https://www.wikidata.org/wiki/Wikidata:SPARQL_tutorial#wikibase:Label_and_aggregations_bug
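The idea of listing the categories in order of preference and taking the first one that occurs could be sketched roughly as follows. The preference list, the ?key strings and the ?category column are made up for illustration; wd:Q5 (human), wd:Q11424 (film) and wd:Q16521 (taxon) are only example classes, and you would substitute your own. The trick is that each class carries a sortable key, so MIN picks the most preferred match per page.

```sparql
SELECT ?lemma ?item (MIN(?key) AS ?category) WHERE {
  VALUES ?lemma {
    "Donald Trump"@en
    "The Matrix"@en
    "Tiger"@en
  }
  # Hypothetical preference list: the leading digit encodes the priority
  VALUES (?class ?key) {
    (wd:Q5     "1 person")
    (wd:Q11424 "2 film")
    (wd:Q16521 "3 taxon")
  }
  ?sitelink schema:about ?item;
            schema:isPartOf <https://en.wikipedia.org/>;
            schema:name ?lemma.
  # Keep only the listed classes that the item is an instance of (directly or via subclasses)
  ?item wdt:P31/wdt:P279* ?class.
}
GROUP BY ?lemma ?item
```

Because the keys are compared as strings, a list with more than nine entries would need zero-padded prefixes ("01", "02", ...). Pages that match none of the listed classes simply do not appear in the result.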

Filtering URL value

I am trying to select the entity that satisfies a certain condition. If the condition is a URL, how can I write a query that returns only the result where ?v1 = "https://www.getmagicbullet.com/"?
The commented-out lines are the different approaches I tried, all of which failed. How could I adjust the query to get the right answer?
Thank you for your help.
SELECT DISTINCT ?iLabel ?p ?v1
WHERE {
  ?i wdt:P31 wd:Q212920.
  ?i wdt:P856 ?v1.
  # FILTER (?v1Label = "https://www.getmagicbullet.com/")
  # { ?v1 rdfs:label "https://www.getmagicbullet.com/"@en }
  # UNION { ?v1 skos:altLabel "https://www.getmagicbullet.com/"@en }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "ko,en,[AUTO_LANGUAGE]". }
}
LIMIT 1000

SPARQL bordering countries example

Can anyone show me a SPARQL query to get all bordering countries of all countries from http://www4.wiwiss.fu-berlin.de/factbook/sparql?
For example Afghanistan has:
factbook:landboundary db:China,
factbook:landboundary db:Iran,
factbook:landboundary db:Pakistan,
factbook:landboundary db:Tajikistan,
factbook:landboundary db:Turkmenistan
My try of getting data:
SELECT ?country ?name ?neighbour
WHERE {
  ?country rdf:type factbook:Country .
  ?country rdfs:label ?name.
  OPTIONAL {
    ?country factbook:landboundary ?neighbour.
  }
}
ended with following message:
rethrew: de.fuberlin.wiwiss.d2rq.D2RQException: Table 'factbook.neighbors' doesn't exist: SELECT DISTINCT `T0_neighbors`.`name_encoded` FROM `bordercountries` AS `T0_bordercountries`, `neighbors` AS `T0_neighbors`, `countries` AS `T0_countries` WHERE `T0_bordercountries`.`Landboundaries_bordercountries_title` = `T0_neighbors`.`Name` AND `T0_bordercountries`.`Name` = `T0_countries`.`Name` AND `T0_countries`.`name_encoded` = 'Aruba' (E0)
I've asked the same question on http://answers.semanticweb.com but had no luck yet, so I'm trying my luck here.
The failure seems to be caused by an internal system error. Your SPARQL query does not have any syntax errors and the predicates you provided are valid according to the data.
However, I don't understand how your query is supposed to return the neighbors of one specific country. Maybe you want to try something like this:
SELECT DISTINCT ?neighbor
WHERE {
  ?neighbor rdf:type factbook:Country .
  ?neighbor factbook:landboundary db:Afghanistan .
}
This comes very much later and is not exactly an answer to this question (the SPARQL endpoint of the CIA Factbook seems to be down at the moment), but Wikidata has a few examples of how to get bordering countries from its own data set at https://query.wikidata.org/
E.g. if you open the examples and search for "border", you get a query for "countries sharing a border with Cameroon":
#Population of countries sharing a border with Cameroon
#defaultView:LineChart
SELECT ?country ?year ?population ?countryLabel WHERE {
  {
    SELECT ?country ?year (AVG(?population) AS ?population) WHERE {
      {
        SELECT ?country (str(YEAR(?date)) AS ?year) ?population WHERE {
          ?country wdt:P47 wd:Q1009; # shares border with Cameroon
                   p:P1082 ?populationStatement.
          ?populationStatement ps:P1082 ?population;
                               pq:P585 ?date.
        }
      }
    }
    GROUP BY ?country ?year
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
To understand this one has to know (or figure out) that wd:Q1009 is actually Cameroon. Not sure how to do this.
The Cameroon example also charts the population of the surrounding countries by year, which is imo not very useful here.
A simpler version without the extra data is:
SELECT ?country ?countryLabel WHERE {
  ?country wdt:P47 wd:Q1009. # shares border with Cameroon
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en".
  }
}
(The SERVICE wikibase:label is a Wikidata extension.)
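As for working out that wd:Q1009 is Cameroon: one way that should work on the same endpoint is to ask for the item's label directly. This is a small sketch using the standard rdfs:label and schema:description properties; the variable names are arbitrary.

```sparql
SELECT ?label ?description WHERE {
  wd:Q1009 rdfs:label ?label.
  FILTER(LANG(?label) = "en")
  # The description is optional so that items without one still return a row
  OPTIONAL {
    wd:Q1009 schema:description ?description.
    FILTER(LANG(?description) = "en")
  }
}
```

Going the other way, from a name to an ID, can be done by matching rdfs:label against a language-tagged literal such as "Cameroon"@en.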
Finally all bordering neighbours for all countries might be:
SELECT ?country ?countryLabel ?neighbourLabel ?neighbour WHERE {
  ?country wdt:P31 wd:Q6256;
           wdt:P47 ?neighbour.
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en".
  }
} ORDER BY ?countryLabel ?neighbourLabel