Wikidata SPARQL query by generic property: return object instead of statement - sparql

I want to expand a query outward from a starting node, following each node's properties to further nodes, as in the query below.
SELECT ?item ?itemLabel ?prop ?object ?objectLabel ?prop2 ?object2
WHERE
{
VALUES (?item) { (wd:Q12418)}
?item ?prop ?object.
?object ?prop2 ?object2
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". } # Helps get the label in your language, if not, then en language
}
Here I start with wd:Q12418 (Mona Lisa) and retrieve all of its properties; then, for each object, I want to retrieve all of its properties as well.
The problem is that the ?item ?prop ?object line returns a statement rather than the object itself. So it points to one of Mona Lisa's statements, like this: https://www.wikidata.org/wiki/Q12418#Q12418$9c08a93a-4591-561d-36a9-78083ae0a3fa
In this case the second part (?object ?prop2 ?object2) operates on that statement, so it returns something only if the statement has sub-properties. What I'd like to get is the object (e.g. the painter Leonardo da Vinci) rather than the statement, so that I can iterate through the object's properties.
How can I achieve that?

You want ?prop to be restricted to predicates with the wdt: prefix, whose values are the objects themselves rather than statements. To do that, you can add this filter function:
filter(strstarts(str(?prop), str(wdt:)))
and, if you need, an analogous one for ?prop2.
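
Putting the filter into the original query gives something like the sketch below. It is a minimal Python helper that only builds the query text; the function name build_two_hop_query is made up for illustration, and sending the query to the endpoint is left out.

```python
def build_two_hop_query(qid: str) -> str:
    """Return a two-hop query that only follows wdt: (direct) predicates.

    `qid` is a Wikidata item ID such as "Q12418". The function name and
    parameter are illustrative, not part of any library.
    """
    if not (qid.startswith("Q") and qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item ID: {qid!r}")
    return f"""SELECT ?item ?itemLabel ?prop ?object ?objectLabel ?prop2 ?object2
WHERE
{{
  VALUES (?item) {{ (wd:{qid}) }}
  ?item ?prop ?object .
  # keep only direct claims, whose objects are the values themselves
  FILTER(STRSTARTS(STR(?prop), STR(wdt:)))
  ?object ?prop2 ?object2 .
  # same restriction for the second hop
  FILTER(STRSTARTS(STR(?prop2), STR(wdt:)))
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }}
}}"""


print(build_two_hop_query("Q12418"))
```

The second filter is the "analogous one for ?prop2"; drop it if statement nodes are wanted on the second hop.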

Related

Return "instances of" property for a wikidata item using SPARQL

I have a list of Wikidata items I wish to extract the "instance of" property from. For example, looking up Q1339 I can see that it has a single instance type (P31) labelled "human" (Q5). I have tried to write a simple query that would extract that but I am not getting any records returned. I am very new to SPARQL so it's very likely I'm missing something obvious.
SELECT ?item ?itemLabel
WHERE
{
?item wdt:P31 wd:Q1339.
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
}
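
In the query above, the triple asks for items whose "instance of" value is Q1339, which is the reverse of what the question describes. A sketch with the item as the subject, extended to a list of items via VALUES, might look like the following. The helper name and the "en" fallback language are assumptions for illustration; only the query text is built, nothing is sent.

```python
def build_instance_of_query(qids):
    """Build a query returning the "instance of" (P31) values of each item.

    `qids` is an iterable of item IDs such as ["Q1339"]. Illustrative helper,
    not part of any library.
    """
    qids = list(qids)
    for qid in qids:
        if not (qid.startswith("Q") and qid[1:].isdigit()):
            raise ValueError(f"not a Wikidata item ID: {qid!r}")
    values = " ".join(f"wd:{qid}" for qid in qids)
    return f"""SELECT ?item ?itemLabel ?class ?classLabel
WHERE
{{
  VALUES ?item {{ {values} }}
  ?item wdt:P31 ?class .   # the item is the subject, its class is the object
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }}
}}"""


print(build_instance_of_query(["Q1339"]))
```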

On Wikidata SPARQL, get coordinates in degrees?

I want to get coordinates via SPARQL from Wikidata displayed as degrees (e.g. 54°54'36"N). They are displayed in Wikidata like this, so I suspect there is a built-in function for this purpose, but I cannot find it.
Example query:
SELECT DISTINCT ?countryLabel ?lat
{
?country wdt:P31 wd:Q6256 ;
p:P1332 [ psv:P1332 [ wikibase:geoLatitude ?lat ] ].
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
This gives the latitude as a number (e.g. -30.08). I can calculate the desired output format from this result but would prefer to get it directly from the query.
Thanks.
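
If the conversion ends up being done on the client side, as the question mentions, the calculation is short. A minimal Python sketch, assuming rounding to whole seconds is acceptable (the function name to_dms is made up for illustration):

```python
def to_dms(value: float, is_latitude: bool = True) -> str:
    """Format decimal degrees as degrees, minutes, seconds plus hemisphere."""
    if is_latitude:
        hemisphere = "N" if value >= 0 else "S"
    else:
        hemisphere = "E" if value >= 0 else "W"
    # Work in whole seconds to avoid float artefacts such as 59.999... seconds.
    total_seconds = round(abs(value) * 3600)
    degrees, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{degrees}°{minutes}'{seconds}\"{hemisphere}"


print(to_dms(54.91))    # 54°54'36"N
print(to_dms(-30.08))   # 30°4'48"S
```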

How to get property labels from Wikidata using SPARQL

I am using SPARQLWrapper to send SPARQL queries to Wikidata.
At the moment I am trying to find all properties for an entity, e.g. with a simple triple pattern such as wd:Q11663 ?a ?b. This in itself works, but I am trying to find human-readable labels for the returned properties and entities.
Although SERVICE wikibase:label works using Wikidata's GUI interface, this does not work with SPARQLWrapper - which insists on returning identical values for a variable and its 'label'.
Querying on the property rdfs:label works for the entity (?b), but this approach does not work with the property (?a).
It would appear the property is being returned as a full URI such as http://www.wikidata.org/prop/direct/P1536 . Using the GUI I can successfully query wd:P1536 ?a ?b. This works with SPARQLWrapper if I send it as a second query, but not in the first query.
Here is my code:
from SPARQLWrapper import SPARQLWrapper, JSON
sparql = SPARQLWrapper("https://query.wikidata.org/sparql")
sparql.setQuery("""
SELECT ?a ?aLabel ?propLabel ?b ?bLabel
WHERE
{
wd:Q11663 ?a ?b.
# Doesn't work with SPARQLWrapper
#SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
#?prop wikibase:directClaim ?p
# but this does (and is more portable)
?b rdfs:label ?bLabel. filter(lang(?bLabel) = "en").
# doesn't work
#?a rdfs:label ?aLabel.
# property code can be extracted successfully
BIND( strafter(str(?a), "prop/direct/") AS ?propLabel).
#BIND( CONCAT("wd:", strafter(str(?a), "prop/direct/") ) AS ?propLabel).
# No matches, even if I concat 'wd:' to ?propLabel
?propLabel rdfs:label ?aLabel
# generic search for any properties also fails
#?propLabel ?zz ?aLabel.
}
""")
# However, this returns a label for P1536 - which is one of wd:Q11663's properties
sparql.setQuery("""SELECT ?b WHERE
{
wd:P1536 rdfs:label ?b.
}
""")
So how can I get the labels for the properties in one query (which should be more efficient)?
[aside: yes I'm a bit rough & ready with the EN filter - often dropping it if I'm not getting anything back]

I was having problems with two approaches, and the code above contains a mixture of both. SPARQLWrapper itself was not the problem.
The first approach using the wikibase Label service should be like this:
SELECT ?a ?aLabel ?propLabel ?b ?bLabel
WHERE
{
?item rdfs:label "weather"@en.
?item ?a ?b.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
?prop wikibase:directClaim ?a .
}
This code also includes a lookup from the label ('weather') to the query entity (?item).
The SERVICE was working, but if there isn't an rdfs:label definition then it just returns the entity. The GUI and SPARQLWrapper (to the SPARQL endpoint) were simply returning the results in a different order - so it looked like I was seeing lots of 'failed' output (ie. entities and failed labels both being reported as the same).
This became clear when I started adding an OPTIONAL clause to the approach below.
The ?prop wikibase:directClaim ?a . line turns out to be pretty simple. Wikibase uses directClaim to link each property entity (e.g. wd:P1536) to the predicate used in direct claims (e.g. http://www.wikidata.org/prop/direct/P1536). Triples about the property itself, such as its label, are attached to the property entity. Many other ontologies just use the same identifier for both.
My second (more generic) approach is the one you find in many of the books and online tutorials. The problem here is that Wikibase's properties come back as the full direct-claim URL, and I needed to convert them into the property entity. I tried string manipulation, but this produces a string literal, not an entity. The solution is to use directClaim again:
?prop wikibase:directClaim ?a .
?prop rdfs:label ?propLabel. filter(lang(?propLabel) = "en").
Note that this only returns a result if rdfs:label is defined. Adding an OPTIONAL will return results even if there is no label defined.

Efficient filter query in Wikidata

I am trying to form an efficient filter query in SPARQL on Wikidata. Let me explain my process:
I query the search-entities API using keywords, e.g. (Apple, Orange).
The API query returns a list of relevant item IDs, e.g. (wd:Q629269, wd:Q154950, wd:Q312, wd:Q95, wd:Q4878289, wd:Q10817602).
With this list of IDs, I then run a SPARQL query to return the items that are instances of a certain class or of one of its subclasses, e.g. (p:P31/ps:P31/wdt:P279* wd:Q43229), which matches anything that is an organisation or an instance of a subclass thereof.
Then, for the items in the list that belong to that class, return additional fields if they exist (using OPTIONAL).
I am new to SPARQL. My question is: is this the most efficient method to achieve this output? It seems to me to be quite inefficient and I cannot find a similar type of problem in the tutorial examples.
You can try the query here.
SELECT distinct ?item ?itemLabel ?itemDescription ?web ?inception ?ISIN
WHERE{
FILTER (?item IN (wd:Q629269, wd:Q154950, wd:Q312, wd:Q95, wd:Q4878289, wd:Q10817602))
?item p:P31/ps:P31/wdt:P279* wd:Q43229.
OPTIONAL {
?item wdt:P856 ?web. # get item-web
?item wdt:P571 ?inception. # get item-inception
?item wdt:P946 ?ISIN. # get item-isin
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
}
LIMIT 10
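
One variation that may be worth trying is to bind the IDs with VALUES instead of FILTER ... IN, and to give each optional field its own OPTIONAL block so that an item missing one field can still show the others (this changes the results compared with the single OPTIONAL block above). Whether it is faster on the Wikidata endpoint is an assumption to verify, not a measured result. A minimal Python sketch that builds such a query from the ID list returned by the search step (the function name is illustrative):

```python
def build_org_query(qids, limit: int = 10) -> str:
    """Build the organisation query for a list of item IDs such as "Q312"."""
    qids = list(qids)
    for qid in qids:
        if not (qid.startswith("Q") and qid[1:].isdigit()):
            raise ValueError(f"not a Wikidata item ID: {qid!r}")
    values = " ".join(f"wd:{qid}" for qid in qids)
    return f"""SELECT DISTINCT ?item ?itemLabel ?itemDescription ?web ?inception ?ISIN
WHERE {{
  VALUES ?item {{ {values} }}                 # bind the candidates up front
  ?item p:P31/ps:P31/wdt:P279* wd:Q43229 .    # organisation or subclass
  OPTIONAL {{ ?item wdt:P856 ?web . }}        # website
  OPTIONAL {{ ?item wdt:P571 ?inception . }}  # inception
  OPTIONAL {{ ?item wdt:P946 ?ISIN . }}       # ISIN
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }}
}}
LIMIT {limit}"""


ids = ["Q629269", "Q154950", "Q312", "Q95", "Q4878289", "Q10817602"]
print(build_org_query(ids))
```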

Query to get properties of quantity for a certain entity class

What I am trying to do is get the properties of quantity type that are used on items of a certain class (such as City, Country, Human, River, Region, Mountain, etc.). The query below works fine for some classes such as Country (wd:Q6256), but for many other classes it exceeds the time limit. How can I optimize the query below to get the result? Or is there another way to get the properties of Quantity type used in a certain class?
SELECT DISTINCT ?p_ ?pLabel ?pAltLabel
WHERE {
VALUES (?class) {(wd:Q515)}
?x ?p_ [].
?x p:P31/ps:P31 ?class.
?p wikibase:claim ?p_.
?p wikibase:directClaim ?pwdt.
?p wikibase:propertyType ?pType.
FILTER (?pType = wikibase:Quantity)
SERVICE wikibase:label { bd:serviceParam wikibase:language "ko,en". }
}

Attempt 1: Optimizing the query
Some observations:
Instead of p:P31/ps:P31, you could use wdt:P31 which is faster by avoiding the two-property hop, but finds only the truthy statements
The expensive part is the call to the label service at the end, as can be seen by commenting that line out by placing # at the start of the line
The query retrieves every claim on every city (many!), gets the properties of the claims (few!), and only removes the duplicates in the end (with DISTINCT)
As a result, the label service is called many times for the same property, once per claim! This is the big problem with the query
This can be avoided by moving the retrieval of properties with the DISTINCT into a subquery, and calling the label service only at the end on the few properties
After that change it should be fast, but is still slow because the query optimiser seems to evaluate the query in the wrong order. Following hints from this page, we can turn the query optimiser off.
This works for me:
SELECT ?p ?pLabel ?pAltLabel {
hint:Query hint:optimizer "None" .
{
SELECT DISTINCT ?p_ {
VALUES ?class { wd:Q515 }
?x wdt:P31 ?class.
?x ?p_ [].
}
}
?p wikibase:claim ?p_.
?p wikibase:propertyType ?pType.
FILTER (?pType = wikibase:Quantity)
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Attempt 2: Splitting the task into multiple queries
Having learned that the approach above doesn't work for some of the biggest categories (like wd:Q5, “human”), I tried a different approach. This will get all results, but not in a single query. It requires sending ~25 individual queries and combining the results afterwards:
We start by listing the quantity properties. There are, as of today, 503 of them.
We want to keep only those properties that are actually used on an item of type “human”.
Because that check is so slow (it needs to look at millions of items), we start by only checking the first 20 properties from our list.
In the second query, we're going to check the next 20, and so on.
This is the query that tests the first 20 properties:
SELECT DISTINCT ?p ?pLabel ?pAltLabel {
hint:Query hint:optimizer "None" .
{
SELECT ?p ?pLabel ?pAltLabel {
?p wikibase:propertyType wikibase:Quantity.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
OFFSET 0 LIMIT 20
}
?p wikibase:claim ?p_.
?x ?p_ [].
?x wdt:P31 wd:Q5.
}
Increase the OFFSET to 20, 40, 60 and so on, up to 500, to test all properties.
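
The paging described above can be scripted. The sketch below generates one query per block of 20 properties and merges result rows by property URI. The total of 503 is the count quoted above and will drift over time, the helper names are made up for illustration, and running the queries is left out. Note that OFFSET/LIMIT without an ORDER BY relies on the endpoint returning the properties in a stable order, which is an assumption.

```python
QUERY_TEMPLATE = """SELECT DISTINCT ?p ?pLabel ?pAltLabel {{
  hint:Query hint:optimizer "None" .
  {{
    SELECT ?p ?pLabel ?pAltLabel {{
      ?p wikibase:propertyType wikibase:Quantity .
      SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
    }}
    OFFSET {offset} LIMIT {page_size}
  }}
  ?p wikibase:claim ?p_ .
  ?x ?p_ [] .
  ?x wdt:P31 wd:{class_qid} .
}}"""


def paged_queries(class_qid: str = "Q5", total: int = 503, page_size: int = 20):
    """Return one query per page of quantity properties."""
    return [
        QUERY_TEMPLATE.format(offset=offset, page_size=page_size, class_qid=class_qid)
        for offset in range(0, total, page_size)
    ]


def merge_rows(pages):
    """Combine rows from all pages, keeping one row per property URI."""
    merged = {}
    for rows in pages:
        for row in rows:
            merged.setdefault(row["p"], row)
    return list(merged.values())


queries = paged_queries()
print(len(queries), "queries to send")
```

Each element of pages passed to merge_rows is expected to be a list of dicts with at least a "p" key, for example rows flattened from the JSON results of one query.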