I found that the Wikidata API returns data in some kind of relevance order. For example, look at the occupation values of https://www.wikidata.org/wiki/Q22686 and compare them with the list of occupation values returned by this query:
VALUES (?company) {(wd:Q22686)}
?company ?p ?statement .
?statement ?ps ?ps_ .
?wd wikibase:claim ?p.
?wd wikibase:statementProperty ?ps.
?statement wikibase:rank ?rank.
OPTIONAL {
?statement ?pq ?pq_ .
?wdpq wikibase:qualifier ?pq .
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
Is there a way to preserve the order of property values in a SPARQL query?
I am trying to retrieve sample coordinates from Wikidata via SPARQL, but I am having a very difficult time achieving it. I want to get only a single pair of coordinates per place, shown in one column, with the latitude and longitude of that sample in their own columns.
The query below works, but it does not return the coordinate values in Point(5.936111111 51.21) format. When I replace p:P625 with wdt:P625, no items are retrieved. Additionally, Borculo (Q1025685) appears twice in the results, with two different coordinates:
SELECT DISTINCT ?place ?placeLabel (SAMPLE(?temp1) AS ?coords_sample) ?lat ?long {
?place p:P31 ?instanceOf.
?instanceOf ps:P31/wdt:P279* wd:Q2039348.
?place p:P625 ?temp1.
?temp1 psv:P625 ?temp2.
?temp2 wikibase:geoLatitude ?lat.
?temp2 wikibase:geoLongitude ?long.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
} GROUP BY ?place ?placeLabel ?lat ?long
ORDER BY ?placeLabel
Use ps:P625 to obtain the coordinates in the desired format (see also the manual on Wikibooks).
Also, sampling the coordinates statement is not sufficient if you also group by ?lat and ?long, because every distinct latitude/longitude pair then forms its own group. It is better to sample the statement in a subquery.
Final result:
SELECT DISTINCT ?place ?placeLabel ?coords ?lat ?long {
?place p:P31/ps:P31/wdt:P279* wd:Q2039348 ;
p:P625 ?coords_sample .
{
SELECT (SAMPLE(?coords_stmt) AS ?coords_sample) {
?place p:P31/ps:P31/wdt:P279* wd:Q2039348 ;
p:P625 ?coords_stmt .
} GROUP BY ?place
}
?coords_sample ps:P625 ?coords;
psv:P625 [
wikibase:geoLatitude ?lat;
wikibase:geoLongitude ?long
] .
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?placeLabel
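If the separate ?lat and ?long columns are not needed in the query itself, the same numbers can be recovered on the client side from the ?coords literal. The following is a minimal sketch, assuming the literal has the plain Point(longitude latitude) shape shown in the question; the function name parse_wkt_point is hypothetical and not part of any Wikidata library.

```python
import re
from typing import Tuple

# A decimal number, optionally signed, optionally with an exponent.
_NUM = r"[-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?"

# Matches a WKT point literal such as "Point(5.936111111 51.21)".
_POINT_RE = re.compile(
    r"^\s*Point\(\s*(" + _NUM + r")\s+(" + _NUM + r")\s*\)\s*$",
    re.IGNORECASE,
)


def parse_wkt_point(literal: str) -> Tuple[float, float]:
    """Return (latitude, longitude) from a WKT point literal.

    Assumption: the first number in the literal is the longitude and the
    second is the latitude, as in the example from the question.
    """
    match = _POINT_RE.match(literal)
    if match is None:
        raise ValueError(f"not a WKT point: {literal!r}")
    longitude, latitude = (float(group) for group in match.groups())
    return latitude, longitude


if __name__ == "__main__":
    lat, lon = parse_wkt_point("Point(5.936111111 51.21)")
    print(lat, lon)
```

Note that the literal lists the longitude first, so the values have to be swapped to get the usual latitude/longitude order.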
I would like to access child properties of Wikidata entities. An example is property P1033, which is a child of P4952, for an entity such as Q49546. How can I do this dynamically in a SPARQL query?
Using the query builder provided by the online Wikidata Query Service, I can construct a simple query that works for normal properties (in the linked example: mass), but not for the desired sub-properties (in the linked example: NFPA code for health hazard), which come back empty even though they are clearly set in the web view. Side note: this is a different example from the one in the first paragraph.
The dynamic query I am aiming for is as follows:
SELECT ?p ?item ?itemDescription ?prop ?value ?valueLabel ?itemLabel ?itemAltLabel ?propLabel WHERE {
BIND(wd:Q138809 AS ?item)
?prop wikibase:directClaim ?p.
#?item ?p ?value.
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en".
?value rdfs:label ?valueLabel.
?prop rdfs:label ?propLabel.
?item rdfs:label ?itemLabel;
skos:altLabel ?itemAltLabel;
schema:description ?itemDescription.
}
}
ORDER BY DESC(?prop)
LIMIT 10
With line 4 commented out, I get ?propLabel as desired but no value. With line 4 active, I get only the properties that are set at the first level, not the child properties.
Thanks to @AKSW, here is the final query that solved my problem:
SELECT ?item ?itemLabel ?itemDescription ?itemAltLabel ?prop ?propertyLabel ?propertyValue ?propertyValueLabel ?qualifier ?qualifierLabel ?qualifierValue
{
VALUES (?item) {(wd:Q138809)}
?item ?prop ?statement .
?statement ?ps ?propertyValue .
?property wikibase:claim ?prop .
?property wikibase:statementProperty ?ps .
OPTIONAL { ?statement ?pq ?qualifierValue . ?qualifier wikibase:qualifier ?pq . }
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
The key step for me was to understand that child properties are actually called qualifiers.
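Because of the OPTIONAL qualifier pattern, a statement with several qualifiers comes back as several flat rows. A minimal sketch of regrouping such rows into one entry per statement could look like the following. It assumes that ?statement is added to the SELECT list so that rows can be keyed by it; the dictionary keys and the function name nest_qualifiers are hypothetical, and the sample rows are invented placeholders rather than real query output.

```python
def nest_qualifiers(rows):
    """Group flat result rows into one entry per statement.

    Each row is a dict with the (hypothetical) keys "statement",
    "property", "value", "qualifier" and "qualifier_value". Rows for a
    statement without qualifiers carry None in the last two keys.
    """
    statements = {}
    for row in rows:
        entry = statements.setdefault(
            row["statement"],
            {"property": row["property"], "value": row["value"], "qualifiers": []},
        )
        if row.get("qualifier") is not None:
            entry["qualifiers"].append((row["qualifier"], row["qualifier_value"]))
    return statements


if __name__ == "__main__":
    # Invented placeholder rows: one statement with two qualifiers and
    # one statement without any.
    sample_rows = [
        {"statement": "s1", "property": "P4952", "value": "v1",
         "qualifier": "P1033", "qualifier_value": "q1"},
        {"statement": "s1", "property": "P4952", "value": "v1",
         "qualifier": "P-other", "qualifier_value": "q2"},
        {"statement": "s2", "property": "P-plain", "value": "v2",
         "qualifier": None, "qualifier_value": None},
    ]
    for statement, entry in nest_qualifiers(sample_rows).items():
        print(statement, entry)
```

Keying by the statement rather than by the property and value keeps two statements with the same value apart.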
In my Wikidata query, I actually have two problems:
1. Date of birth, gender and description are not returned.
2. GROUP BY does not aggregate based on the name (Bad Aggregate error).
SELECT ?person ?label ?dob ?gender ?description
WHERE
{
?person wdt:P31 wd:Q5;
rdfs:label ?label.
FILTER(STRSTARTS(?label, "Michael Bloom")).
}
GROUP BY ?person ?label
LIMIT 5
You must query every field you want returned (SPARQL cannot guess where to get them from), with the exception of labels and descriptions when you use the Wikibase label service.
Every non-aggregated variable in your SELECT must also appear in the GROUP BY clause. Variables wrapped in an aggregate, such as GROUP_CONCAT, which concatenates all values of a group into one field, are the exception.
Bad news: searching for humans by anything other than their exact label is hard in the Wikidata Query Service, as you will often hit the 60-second timeout. A solution, as shown by AKSW, is to use the Wikibase MediaWiki API and do an entity search.
Here is your query rewritten along those lines:
SELECT ?person ?personLabel ?personDescription (GROUP_CONCAT(DISTINCT ?dob ; SEPARATOR = ' / ') AS ?dobs) (GROUP_CONCAT(DISTINCT ?gender ; SEPARATOR = ' / ') AS ?genders)
WHERE {
SERVICE wikibase:mwapi {
bd:serviceParam wikibase:api "EntitySearch" .
bd:serviceParam wikibase:endpoint "www.wikidata.org" .
bd:serviceParam mwapi:search "Michael Bloom" .
bd:serviceParam mwapi:language "en" .
?person wikibase:apiOutputItem mwapi:item .
}
?person wdt:P31 wd:Q5 .
OPTIONAL { ?person wdt:P569 ?dob }
OPTIONAL { ?person wdt:P21 ?gender }
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
GROUP BY ?person ?personLabel ?personDescription
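The entity search used inside the query can also be called directly over HTTP, which may help when checking what a search term matches before writing the SPARQL. The following is a minimal sketch that only builds the request URL for the wbsearchentities action and does not send it; the helper name build_entity_search_url and the default limit are assumptions made for illustration.

```python
from urllib.parse import urlencode

# MediaWiki API endpoint of Wikidata.
API_ENDPOINT = "https://www.wikidata.org/w/api.php"


def build_entity_search_url(search, language="en", limit=5):
    """Build a wbsearchentities request URL for an item search."""
    params = {
        "action": "wbsearchentities",
        "search": search,
        "language": language,
        "type": "item",
        "limit": limit,
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_entity_search_url("Michael Bloom"))
```

The resulting URL can be opened in a browser or fetched with any HTTP client. The search results are not restricted to humans, so the wdt:P31 wd:Q5 check from the query is still needed afterwards.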
I'm having trouble extracting location attributes of company HQs.
My query finds all companies or subclasses of company, and returns some basic properties such as ISIN and URL, plus the headquarters location.
I have tried to use this example to extend the headquarters part of the query so that it returns location information such as city, country, and the latitude and longitude of the coordinates. However, I am stuck on pulling the values or labels through.
Thank you
SELECT
?item ?itemLabel ?web ?isin ?hq ?hqloc ?inception
# valueLabel is only useful for properties with item-datatype
WHERE
{
?item p:P31/ps:P31/wdt:P279* wd:Q783794.
OPTIONAL{?item wdt:P856 ?web.} # get item
OPTIONAL{?item wdt:P946 ?isin.} # get item
OPTIONAL{?item wdt:P571 ?inception.} # get item
OPTIONAL{?item wdt:P159 ?hq.}
OPTIONAL{?item p:P159 ?hqItem. # get property
?hqItem ps:P159 wd:Q515. # get property-statement wikidata-entity
?hqItem pq:P17 ?hqloc. # get country of city
}
?article schema:about ?item .
?article schema:inLanguage "en" .
?article schema:isPartOf <https://en.wikipedia.org/>.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 10
A simpler query that selects some of the values you mentioned:
SELECT
?company ?companyLabel ?isin ?web ?country ?countryLabel ?inception
WHERE
{
?article schema:inLanguage "en" .
?article schema:isPartOf <https://en.wikipedia.org/>.
?article schema:about ?company .
?company p:P31/ps:P31/wdt:P279* wd:Q783794.
?company wdt:P946 ?isin.
OPTIONAL {?company wdt:P856 ?web.}
OPTIONAL {?company wdt:P571 ?inception.}
OPTIONAL {?company wdt:P17 ?country.}
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
} LIMIT 10
What I changed:
I renamed some variables to be more explicit (e.g. "?item" -> "?company").
I used P17 to select the country directly.
I removed the OPTIONAL on ISIN to show that some values do exist. You did not get a result because many company instances on Wikidata seem to lack that information.
From here, selecting the other values should be easy.
I am trying to recover the cast list for movies from Wikidata.
My SPARQL query for Dr. No is as follows:
SELECT ?actor ?actorLabel WHERE {
?movie wdt:P161 ?actor .
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
FILTER(?movie = wd:Q102754)
}
LIMIT 1000
I can try it out at query.wikidata.org, but the results are not in the order that I want. It gives 'Sean Connery', 'Zena Marshall', 'Ursula Andress'.
The database has the data in the required order: https://www.wikidata.org/wiki/Q102754 shows the cast list in order (Sean Connery, Ursula Andress, Joseph Wiseman). Generally the cast list is given in billing order, and that is the order I want to recover.
SPARQL can order results with ORDER BY.
The ordering in your example is based on the number of references of a statement. Here is a non-optimized version that does what you want:
SELECT ?actor ?actorLabel WHERE {
?movie p:P161 ?statement .
?statement ps:P161 ?actor .
OPTIONAL {?statement prov:wasDerivedFrom ?ref . }
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
FILTER(?movie = wd:Q102754)
}
GROUP BY ?movie ?actor ?actorLabel
ORDER BY DESC(count(?ref)) ASC(?actorLabel)
LIMIT 1000
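To make the sort key explicit, here is a minimal Python sketch of what ORDER BY DESC(count(?ref)) ASC(?actorLabel) does to the grouped rows. The function name order_cast is hypothetical, and the reference counts in the sample are invented for illustration; they are not the real counts on Q102754.

```python
def order_cast(rows):
    """Sort (actor_label, reference_count) pairs.

    Mirrors ORDER BY DESC(count(?ref)) ASC(?actorLabel): most references
    first, ties broken alphabetically by label.
    """
    return sorted(rows, key=lambda row: (-row[1], row[0]))


if __name__ == "__main__":
    # Invented reference counts, for illustration only.
    sample = [
        ("Zena Marshall", 0),
        ("Sean Connery", 2),
        ("Ursula Andress", 1),
        ("Joseph Wiseman", 1),
    ]
    for label, ref_count in order_cast(sample):
        print(label, ref_count)
```

Statements with the same number of references fall back to alphabetical order here, so ties will not follow billing order.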