How do I address a file in a Wikidata SPARQL query? - sparql

How do I address a Commons media file?
In Wikidata, I want to use a SPARQL query to find all items that have a specific Commons media file attached through a given property.
Example:
SELECT ?item
{
# get all objects which have this file as a LocatorMap
?item wdt:P242 commons:LocationPeru.svg
}
This does not work, and neither do variants such as wd:commons:LocationPeru.svg, p:P242 [ps:P242 psv:commons:LocationCuba.svg] or anything else I tried. I also tried a FILTER on the label, but rdfs:label ?label did not work for me on the file object.
So, what is the correct syntax?
Thanks.
Actually, I want to filter out specific ones from a more complex query with MINUS, but reduced my problem to this.

You can use the image's full URL:
SELECT ?item
{
?item wdt:P242 <http://commons.wikimedia.org/wiki/Special:FilePath/LocationPeru.svg>
}
Actually, the query above returns no item with LocationPeru.svg as its locator map. This is because Q419#P242 has a higher-ranked locator map, and the wdt: prefix only matches the best-ranked statements.
To ignore the statements' ranking, you can use:
SELECT ?item
{
?item p:P242 [ ps:P242 <http://commons.wikimedia.org/wiki/Special:FilePath/LocationPeru.svg> ]
}
To match only normal-rank statements, you can use:
SELECT ?item
{
?item p:P242 [
ps:P242 <http://commons.wikimedia.org/wiki/Special:FilePath/LocationPeru.svg> ;
wikibase:rank wikibase:NormalRank
]
}
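The full-URL form can also be generated from a plain file name. The following is a minimal Python sketch, not part of the original answer: the helper names `commons_file_iri` and `locator_map_query` are hypothetical, and it assumes that file names are percent-encoded in the IRI (for example, a space becoming %20), which is worth checking against an actual query result for file names with unusual characters.

```python
from urllib.parse import quote

# Prefix taken from the queries above.
FILE_PATH_PREFIX = "http://commons.wikimedia.org/wiki/Special:FilePath/"


def commons_file_iri(file_name: str) -> str:
    """Return the IRI term (in angle brackets) for a Commons file name."""
    # Assumption: the file name is percent-encoded, so a space becomes %20.
    return "<" + FILE_PATH_PREFIX + quote(file_name) + ">"


def locator_map_query(file_name: str) -> str:
    """Build a query for items whose P242 statement (any rank) uses the file."""
    return (
        "SELECT ?item {\n"
        f"  ?item p:P242 [ ps:P242 {commons_file_iri(file_name)} ]\n"
        "}"
    )


print(locator_map_query("LocationPeru.svg"))
```

The printed text can be pasted into the query service; the same IRI term also works inside a MINUS block.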

Related

Fail to retrieve a wikidata item through SPARQL

I tried to search for an item with a certain label, Teodor Bogdanov, and get its ID. I can find this name through the Wikidata website, but I failed to do so through SPARQL. The code is here.
I also copied it here:
SELECT distinct ?item ?itemLabel ?itemDescription WHERE{
?item ?label "Teodor Bogdanov".
}
The same thing happens for Félix Anaut
Could anyone help me fix this issue? Thank you in advance.
There are two issues in your query. First, the predicate in your triple pattern is a variable (?label), while it should be the constant rdfs:label; as written, the query asks for anything linked to "Teodor Bogdanov" by any property. The second issue is the missing language tag. When these two are fixed, you get the following query:
SELECT distinct ?item {
?item rdfs:label "Teodor Bogdanov"@en .
}
For "Félix Anaut", while this spelling with an acute accent is used in English Wikipedia, that's not the case in Wikidata.
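In SPARQL, a language-tagged literal is written as the quoted text followed by @ and the language code. A small Python sketch of building such a label lookup for any name is shown here; the function names `escape_literal` and `label_query` are hypothetical helpers, and the default language "en" is an assumption.

```python
def escape_literal(text: str) -> str:
    """Escape backslashes and double quotes for a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def label_query(label: str, lang: str = "en") -> str:
    """Build a query for items whose rdfs:label equals `label` in `lang`."""
    return (
        "SELECT DISTINCT ?item {\n"
        f'  ?item rdfs:label "{escape_literal(label)}"@{lang} .\n'
        "}"
    )


print(label_query("Teodor Bogdanov"))
```

The match is exact, so a label that is spelled differently in Wikidata (accents, capitalisation) will still return nothing.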

Get chemical compound data in SI units from wikidata

For a project I need to get data about chemical compounds like density, mass, boiling point and melting point in SI units (meters, kg, Degree Celsius,...) via the CAS number of the compound.
With the Query builder and some testing I managed to achieve some of it with the following code (CAS-Number is the property P231 and I am searching for e.g. 67-64-1):
SELECT DISTINCT ?itemLabel ?melting_point ?boiling_point ?mass ?density WHERE {
{
SELECT DISTINCT ?item WHERE {
?item p:P231 ?statement0.
?statement0 (ps:P231) "67-64-1".
}
}
OPTIONAL { ?item wdt:P2101 ?melting_point. }
OPTIONAL { ?item wdt:P2102 ?boiling_point. }
OPTIONAL { ?item wdt:P2054 ?density. }
OPTIONAL { ?item wdt:P2067 ?mass. }
SERVICE wikibase:label { bd:serviceParam wikibase:language "de". }
}
The problem is that I can't restrict the temperatures to degrees Celsius; I also get values in Fahrenheit.
This is a very interesting question, as Wikidata offers plenty of tools to disambiguate -- this is why it's so powerful.
But these tools come with some learning that the user needs to do before using them.
Before starting, let me make three points about your query:
1-You don't actually need an inner query to select acetone, as you would in SQL. This is one of the reasons why SPARQL is so great compared to SQL -- you don't have to navigate an endless field of keys but your data is still 'normalised'.
2-You don't need the ?statement0 variable, as it is not used to disambiguate. You can just use the wdt:P231 property, which directly links acetone with its CAS registry number.
3-Since you do need to disambiguate the values of the physical quantities associated with acetone, you will need to go through a disambiguation statement.
Now, here is a query that works:
SELECT DISTINCT ?itemLabel ?melting_point ?boiling_point ?density ?mass
WHERE {
?item wdt:P231 "67-64-1".
OPTIONAL {
?item p:P2101 ?ps1 .
?ps1 ps:P2101 ?melting_point;
psv:P2101/wikibase:quantityUnit/wdt:P31/wdt:P279* wd:Q61610698
}
OPTIONAL {
?item p:P2102 ?ps2 .
?ps2 ps:P2102 ?boiling_point;
psv:P2102/wikibase:quantityUnit/wdt:P31/wdt:P279* wd:Q61610698
}
OPTIONAL {
?item p:P2054 ?ps3 .
?ps3 ps:P2054 ?density;
psv:P2054/wikibase:quantityUnit/wdt:P31/wdt:P279* wd:Q61610698
}
OPTIONAL {
?item p:P2067 ?ps4 .
?ps4 ps:P2067 ?mass;
psv:P2067/wikibase:quantityUnit/wdt:P31/wdt:P279* wd:Q61610698
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "de". }
}
To begin with, I removed the inner query and the statement variable mentioned in points 1 and 2 above.
Then, I retrieve the statement about the melting point by using the p:P2101/ps:P2101 combination.
This allows me to distinguish between Celsius and Fahrenheit values.
Now, since we have multiple physical quantities to look for (i.e. not just temperature), and we want these to be SI, we can use a property path (see below for explanation) to restrict the values that we return as being SI (as opposed to returning Celsius, kg/m^3, kg specifically and individually, although this would be perfectly valid too, just more complex).
For reference, a property path is just a way to shorten a query so:
SELECT ?person ?grandparent
WHERE {
?person :hasParent ?parent .
?parent :hasParent ?grandparent .
}
can be shortened to:
SELECT ?person ?grandparent
WHERE {
?person :hasParent/:hasParent ?grandparent .
}
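To see why the two forms return the same rows, here is a toy Python sketch that evaluates both over a tiny in-memory list of triples. The data, the names and the helper functions are all made up for illustration; a real SPARQL engine does this work for you.

```python
# Toy triples; the people and the hasParent predicate are invented.
TRIPLES = [
    ("alice", "hasParent", "bob"),
    ("bob", "hasParent", "carol"),
    ("dave", "hasParent", "alice"),
]


def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is in the data."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def two_patterns():
    """?person :hasParent ?parent . ?parent :hasParent ?grandparent ."""
    rows = []
    for person, predicate, parent in TRIPLES:
        if predicate != "hasParent":
            continue
        for grandparent in objects(parent, "hasParent"):
            rows.append((person, grandparent))
    return rows


def sequence_path(predicates):
    """?start :p1/:p2/... ?end, following one predicate per hop."""
    rows = []
    for start in sorted({s for s, _, _ in TRIPLES}):
        frontier = [start]
        for predicate in predicates:
            frontier = [o for node in frontier for o in objects(node, predicate)]
        rows.extend((start, end) for end in frontier)
    return rows


print(sorted(two_patterns()))
print(sequence_path(["hasParent", "hasParent"]))
```

Both calls list the same (person, grandparent) pairs; the path form simply hides the intermediate ?parent variable.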
Now, let's get back to the melting point being returned in Celsius and Fahrenheit.
The two statements that give us different units use a psv:P2101 property to tell us more about the value mentioned in the statement.
From this we can use the wikibase:quantityUnit property to determine the unit.
We will then want to make sure the unit is an SI unit or belongs to any subclass thereof. So wdt:P31 tells us that the unit is "an instance of" some class, and wdt:P279* wd:Q61610698 (wdt:P279* is another property path) tells us that the class is either the class of SI units (wd:Q61610698 = SI units), or a direct or indirect subclass of SI units.
I added a picture of what the data looks like (although, confusingly, there are two Celsius melting points for acetone for some reason).
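The "instance of, then zero or more subclass hops" check can be pictured with a toy Python sketch. The graph below uses invented identifiers rather than real Wikidata IDs, and the function names are hypothetical; it only illustrates what a path like wdt:P31/wdt:P279* asks for.

```python
# Toy class graph; all identifiers here are invented for illustration.
INSTANCE_OF = {
    "degree_celsius": {"si_derived_unit"},
    "degree_fahrenheit": {"imperial_unit"},
}
SUBCLASS_OF = {
    "si_derived_unit": {"si_unit"},
    "si_unit": {"unit"},
    "imperial_unit": {"unit"},
}


def superclasses(cls):
    """cls itself plus every class reachable by subclass hops (the * part)."""
    seen = {cls}
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in SUBCLASS_OF.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_instance_of(unit, target):
    """Mimic `unit wdt:P31/wdt:P279* target` on the toy graph."""
    return any(target in superclasses(cls) for cls in INSTANCE_OF.get(unit, ()))


print(is_instance_of("degree_celsius", "si_unit"))
print(is_instance_of("degree_fahrenheit", "si_unit"))
```

Because the star allows zero hops, a unit that is a direct instance of the target class passes the check as well.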

How to construct SPARQL query for a list of Wikidata items

First off, I'm not a developer, and I'm new to writing SPARQL queries. Mostly I've been looking up existing queries and trying to tweak them to get what I need. The issue is that most documentation on query construction has to do with getting new data you don't have, rather than retrieving or extending existing data. And when you do find tips for retrieving existing data, they tend to be for ONE item at a time instead of a full data set of many items.
I mostly use OpenRefine for this. I start by loading up my existing list of names, and use the Wikidata extension service to reconcile the names to existing Wikidata IDs. So now, this is where I am, vs. where I want to go:
1 - We have a list of Wikidata IDs for reconciled matches;
2 - We have used OpenRefine to get most of the data we need from those;
3 - We don't have the label, description, or Wikipedia links (English), which are extremely valuable;
4 - I have figured out how to construct a query for the label and description of just ONE Wikidata Item:
SELECT ?itemLabel ?itemDescription WHERE {
VALUES ?item { wd:Q15485689 }
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
5 - I have figured out how to construct a query to extract the Wikipedia English URL for just ONE Wikidata item:
SELECT ?article ?lang ?name WHERE {
?article schema:about wd:Q15485689;
schema:inLanguage ?lang;
schema:name ?name;
schema:isPartOf _:b13.
_:b13 wikibase:wikiGroup "wikipedia".
FILTER(?lang IN("en"))
FILTER(!(CONTAINS(?name, ":")))
OPTIONAL { ?article wdt:P31 ?instance_of. }
}
The questions are:
How do I modify either query to generate these same results for MORE THAN ONE* Wikidata item?
How do I modify the query to give me all three at once, for more than one* Wikidata item?
*we have 667, but I could do smaller batches if that's too much for the service to handle
Ideally, the query would generate something that allowed me to download a CSV file looking much like this (so I can match on and import the new data into our Airtable base which feeds the website application):
ideal CSV output
If anyone can lead me in the right direction here, I'd appreciate it.
I should also note that if OpenRefine has a way of retrieving these I'm all ears! But since these three don't have a property code, I couldn't see how to snag them from OR.
Something like the query below should work. See how many QIDs you can get away with in the VALUES clause; probably all of them in one go. This query gives you the URL and the article title; you can drop the article title column if you do not want it. Note also https://www.wikidata.org/wiki/Wikidata:Request_a_query, which is Wikidata's own place for questions such as these.
SELECT ?item ?itemLabel ?itemDescription ?sitelink ?article
WHERE
{
VALUES ?item {wd:Q105848230 wd:Q6697407 wd:Q2344502 wd:Q1698206}
OPTIONAL {
?article schema:about ?item ;
schema:isPartOf <https://en.wikipedia.org/> ;
schema:name ?sitelink .
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Yes, a VALUES statement in SPARQL can relay not only hundreds but even thousands of items. I regularly do this when cross-checking to see how Wikidata matches up to an existing data set. Some other things you could do as well that take lists of Wikidata items:
Petscan - https://petscan.wmflabs.org/
TABernacle - https://tabernacle.toolforge.org/
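With several hundred IDs, typing the VALUES list by hand is error-prone, so it may help to generate the query text. Here is a minimal Python sketch under stated assumptions: the helper names `chunked` and `details_query` are hypothetical, the batch size of 2 is only for demonstration, and the query body is copied from the answer above.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def details_query(qids):
    """Build the label/description/English Wikipedia query for a list of QIDs."""
    values = " ".join(f"wd:{qid}" for qid in qids)
    return (
        "SELECT ?item ?itemLabel ?itemDescription ?sitelink ?article\n"
        "WHERE {\n"
        f"  VALUES ?item {{ {values} }}\n"
        "  OPTIONAL {\n"
        "    ?article schema:about ?item ;\n"
        "             schema:isPartOf <https://en.wikipedia.org/> ;\n"
        "             schema:name ?sitelink .\n"
        "  }\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}"
    )


qids = ["Q105848230", "Q6697407", "Q2344502", "Q1698206"]
queries = [details_query(batch) for batch in chunked(qids, 2)]
print(queries[0])
```

Each generated query can be run in the query service and its result downloaded as CSV; if one batch is accepted for all IDs, a single query is enough.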

Efficient filter query in Wikidata

I am trying to form an efficient filter query in SPARQL on Wikidata. Let me explain my process:
I query the search-entities API using keywords, e.g. (Apple, Orange).
The API query returns a list of relevant item IDs, e.g. (wd:Q629269, wd:Q154950, wd:Q312, wd:Q95, wd:Q4878289, wd:Q10817602).
With this list of IDs, I then query SPARQL to return the items that are an instance of certain classes or of their subclasses, e.g. (p:P31/ps:P31/wdt:P279* wd:Q43229), which returns every item that is an organisation or a subclass thereof.
Then, for the items in the list of IDs that are of the right class, I return additional information if it exists, e.g. (OPTIONAL).
I am new to SPARQL. My question is: is this the most efficient way to achieve this output? It seems quite inefficient to me, and I cannot find a similar type of problem in the tutorial examples.
You can try the query here.
SELECT distinct ?item ?itemLabel ?itemDescription ?web ?inception ?ISIN
WHERE{
FILTER (?item IN (wd:Q629269, wd:Q154950, wd:Q312, wd:Q95, wd:Q4878289, wd:Q10817602))
?item p:P31/ps:P31/wdt:P279* wd:Q43229.
OPTIONAL {
?item wdt:P856 ?web. # get item-web
?item wdt:P571 ?inception. # get item-inception
?item wdt:P946 ?ISIN. # get item-isin
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
}
LIMIT 10
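One rewrite that is often suggested for this kind of query, not benchmarked here, is to bind the IDs with a VALUES clause instead of FILTER ... IN. Separately, note that putting all three patterns in a single OPTIONAL block means they are only bound when all three exist for an item; one OPTIONAL per property avoids that. The Python sketch below builds such a query; the function name `organisation_query` and its parameters are hypothetical.

```python
def organisation_query(qids, optional_props):
    """Build the query with VALUES and one OPTIONAL block per property.

    `optional_props` maps a variable name to a property ID, e.g. {"web": "P856"}.
    """
    values = " ".join(f"wd:{qid}" for qid in qids)
    variables = " ".join(f"?{var}" for var in optional_props)
    optionals = "\n".join(
        f"  OPTIONAL {{ ?item wdt:{prop} ?{var}. }}"
        for var, prop in optional_props.items()
    )
    return (
        f"SELECT DISTINCT ?item ?itemLabel ?itemDescription {variables}\n"
        "WHERE {\n"
        f"  VALUES ?item {{ {values} }}\n"
        "  ?item p:P31/ps:P31/wdt:P279* wd:Q43229.\n"
        f"{optionals}\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }\n'
        "}"
    )


query = organisation_query(
    ["Q629269", "Q154950", "Q312", "Q95", "Q4878289", "Q10817602"],
    {"web": "P856", "inception": "P571", "ISIN": "P946"},
)
print(query)
```

Whether this is actually faster on the public endpoint would need to be measured; the sketch only changes how the same constraints are expressed.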

How to query Wikidata (SPARQL) by official website property P856?

How can you search the Wikidata SPARQL query service for a record whose official website (property P856) matches a specific value?
For example, fetch all records whose official website is "https://www.google.com". This should return Q95 and Q9366, but instead the following query yields no results:
SELECT ?entity
WHERE {
?entity wdt:P856 "https://www.google.com"
}
I can get it to work with a FILTER, but it's very slow:
SELECT ?entity
WHERE {
?entity wdt:P856 ?url FILTER(str(?url)="https://www.google.com")
}
Using FILTER seems very inefficient. Why doesn't the first query work?
Look at the values you get from this query --
SELECT ?entity ?value
WHERE
{
?entity wdt:P856 ?value
}
LIMIT 25
Note that they're all wrapped in <>, marking them as URIs. To compare a URI with a string literal, you have to convert the URI to a string, as you are doing in the second query; a quoted literal never matches a URI.
In other words -- there are no entities with an official website of "https://www.google.com", but there are some entities with an official website of <https://www.google.com>.
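Since the stored values are URIs, the triple pattern can name the URI directly, which avoids the FILTER altogether. A minimal Python sketch that builds such a query is shown here; `website_query` is a hypothetical helper, and the URL has to match the stored value exactly (including any trailing slash), which is an assumption to verify against the actual data.

```python
# Characters that may not appear inside a SPARQL IRI written as <...>.
FORBIDDEN = set('<>"{}|^`\\ ')


def website_query(url: str) -> str:
    """Build a query matching entities whose P856 value is exactly `url`."""
    if any(ch in FORBIDDEN or ord(ch) < 0x20 for ch in url):
        raise ValueError(f"not usable inside <...>: {url!r}")
    return (
        "SELECT ?entity\n"
        "WHERE {\n"
        f"  ?entity wdt:P856 <{url}>\n"
        "}"
    )


print(website_query("https://www.google.com"))
```

The only difference from the first query in the question is the angle brackets in place of the double quotes.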