In the Wikidata Query Service, I want to query the entity ethanol (i.e. Q153).
I can do this by querying for its CAS number using the following SPARQL:
SELECT ?cas ?casLabel
WHERE {
?cas wdt:P231 "64-17-5".
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
However, I can't find a way to query the entity by its name. Is there a property such as name or compound that matches the string "ethanol", as below? Or should such a query be constructed differently?
SELECT ?compound ?compoundLabel
WHERE {
?compound wdt:???? "ethanol".
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Another way to search for the string "ethanol" would be to use the Wikidata API. However, this does not go through the SPARQL endpoint:
https://www.wikidata.org/w/api.php?action=wbsearchentities&format=json&type=item&language=en&limit=50&search=ethanol
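The API request above can also be assembled programmatically. Here is a minimal Python sketch that only rebuilds the URL from the parameters shown; the helper name `build_search_url` is hypothetical and nothing is sent over the network.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://www.wikidata.org/w/api.php"


def build_search_url(term, language="en", limit=50):
    """Return a wbsearchentities URL for items matching `term`.

    `build_search_url` is a made-up helper name; the parameters are
    the ones that appear in the URL quoted in the question.
    """
    params = {
        "action": "wbsearchentities",
        "format": "json",
        "type": "item",
        "language": language,
        "limit": limit,
        "search": term,
    }
    # urlencode keeps the insertion order of the dict and escapes values.
    return f"{API_ENDPOINT}?{urlencode(params)}"


url = build_search_url("ethanol")
print(url)
```

Building the query string with `urlencode` rather than string concatenation keeps search terms containing spaces or non-ASCII characters correctly escaped.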
I am interested in visualizing the Wikidata class hierarchy to create graphs like the following: [figure: example graph]
I know how I can get direct superclasses of a Wikidata entity. For this I use SPARQL code like:
SELECT ?item ?itemLabel
WHERE
{
wd:Q125977 wdt:P279 ?item.
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
where wdt:P279 denotes the "subclass of" property.
However, this direct method requires many single requests to the Wikidata API.
How is it possible to get the same information with a single SPARQL query?
(Note that the example graph above only shows an abbreviated version. The final desired graph of all superclasses is 13 levels deep and has 69 nodes, which means 68 single requests; see this Jupyter notebook if interested.)
You could use a query like this to create your taxonomy (with labels) as triples directly.
CONSTRUCT {
?item1 wdt:P279 ?item2.
?item1 rdfs:label ?item1Label.
?item2 rdfs:label ?item2Label.
}
WHERE {
SELECT ?item1 ?item2 ?item1Label ?item2Label
WHERE {
wd:Q125977 (wdt:P279*) ?item1, ?item2.
FILTER(EXISTS { ?item1 wdt:P279 ?item2. })
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
}
I think you need a query like the following:
SELECT ?class ?classLabel ?superclass ?superclassLabel
WHERE
{
wd:Q125977 wdt:P279* ?class.
?class wdt:P279 ?superclass.
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
where wdt:P279* is a zero-or-more property path that connects a class with itself and with each of its direct or indirect superclasses.
This will generate a "class->superclass" mapping containing everything you need to build the graph that you illustrated.
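To illustrate how such a "class->superclass" mapping turns into a graph, here is a minimal Python sketch. It assumes the query results have already been fetched and reduced to (class, superclass) pairs; the function name `build_superclass_graph` and the single-letter identifiers are placeholders, not real Wikidata items.

```python
def build_superclass_graph(rows):
    """Group (class, superclass) pairs into an adjacency mapping.

    Every node appears as a key, including top-level classes that
    have no superclass in the result set.
    """
    graph = {}
    for cls, superclass in rows:
        graph.setdefault(cls, set()).add(superclass)
        graph.setdefault(superclass, set())
    return graph


# Placeholder rows standing in for the ?class / ?superclass columns.
rows = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
graph = build_superclass_graph(rows)

edge_count = sum(len(parents) for parents in graph.values())
print(len(graph), "nodes,", edge_count, "edges")
```

The resulting dictionary can be passed to whatever plotting library you prefer; storing the superclasses in sets means a pair returned more than once by the query does not create a duplicate edge.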
I would like to know the simplest way to handle units of distance in SPARQL. In this Wikidata example I would like to select all the mountains over 8000 m; however, when I run the query below it also includes all of the mountains over 8000 ft. Is it possible to specify that the unit I am interested in is meters, and is a conversion to meters necessary for the mountains whose height is given in feet?
#Mountains Over 8000m
SELECT ?mountainLabel ?height
WHERE
{
?mountain wdt:P31 wd:Q8502.
?mountain wdt:P2044 ?height.
FILTER(8000 <= ?height)
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en, de". }
}
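The psn: prefix gives access to normalized statement values, in which quantities are converted to a common base unit (metres for heights), so no manual conversion is needed. Your query can look like this: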
SELECT ?mountainLabel ?height
WHERE
{
?mountain wdt:P31 wd:Q8502.
?mountain p:P2044/psn:P2044/wikibase:quantityAmount ?height .
FILTER(8000 <= ?height)
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en, de". }
}
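To see why the unnormalized query lets mountains measured in feet through, it helps to spell out the conversion. The Python sketch below only illustrates the arithmetic; the function name `to_metres` and the unit labels are made up for this example, and it is not a description of how the query service implements normalization.

```python
FEET_TO_METRES = 0.3048  # exact, by definition of the international foot


def to_metres(amount, unit):
    """Convert a height to metres; `unit` is an illustrative label."""
    if unit == "metre":
        return float(amount)
    if unit == "foot":
        return amount * FEET_TO_METRES
    raise ValueError(f"unsupported unit: {unit}")


# A raw amount of 8000 passes FILTER(8000 <= ?height) whatever its unit,
# but only the value in metres tells you whether it is an eight-thousander.
raw_amount = 8000
height_if_feet = to_metres(raw_amount, "foot")
print(height_if_feet >= 8000)
```

Comparing the normalized amount instead of the raw one applies this kind of conversion before the filter runs.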
It seems that wdt:P40+ is not exactly the same as wdt:P40/wdt:P40*.
Example:
SELECT ?ancetre ?ancetreLabel
WHERE
{
?ancetre wdt:P40+ wd:Q346
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
}
It returns 852 results. (10 June 2019)
SELECT ?ancetre ?ancetreLabel
WHERE
{
?ancetre wdt:P40/wdt:P40* wd:Q346
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
}
The latter returns 1046 results. (10 June 2019, same date as above)
I would expect the same results for both queries.
Could someone explain this?
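I think I have found an explanation:
wdt:P40/wdt:P40* can return more results because it produces duplicates: an ancestor is returned once for each of its children that leads to wd:Q346, whereas wdt:P40+ returns each ancestor only once.
So if we replace SELECT with SELECT DISTINCT, the two queries return the same results.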
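The duplicate effect can be reproduced on a toy family tree. The Python sketch below is a simplified model of the two path forms, assuming a made-up graph in which ancestor A has two children (B and C) who both have the target T as a child; it is not how the query engine evaluates paths internally.

```python
def reaches(graph, start, target):
    """True if `target` is reachable from `start` in zero or more steps."""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return False


def one_or_more(graph, target):
    """Model of P40+: each ancestor appears at most once."""
    return {n for n in graph
            if any(reaches(graph, child, target) for child in graph[n])}


def step_then_zero_or_more(graph, target):
    """Model of P40/P40*: one row per (ancestor, child) pair."""
    return [n for n in graph for child in graph[n]
            if reaches(graph, child, target)]


# Made-up parent -> children edges; T plays the role of wd:Q346.
graph = {"A": ["B", "C"], "B": ["T"], "C": ["T"], "T": []}

plus_rows = one_or_more(graph, "T")
seq_rows = step_then_zero_or_more(graph, "T")
print(len(plus_rows), len(seq_rows), len(set(seq_rows)))
```

In this toy graph A shows up twice in the second result because it reaches T through both B and C; de-duplicating the rows brings the two results back in line.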
For example, take these three cases:
Triangle Shirtwaist Factory fire (Q867316) : instance of (P31) : disaster (Q3839081)
disaster (Q3839081) : subclass of (P279) : occurrence (Q1190554)
occurrence (Q1190554) : subclass of (P279) : temporal entity (Q26907166)
World's Fair (Q172754) : subclass of (P279) : exhibition (Q464980)
exhibition (Q464980) : subclass of (P279) : event (Q1656682)
event (Q1656682) : subclass of (P279) : occurrence (Q1190554)
occurrence (Q1190554) : subclass of (P279) : temporal entity (Q26907166)
Peloponnesian War (Q33745) : instance of (P31) : war (Q198)
war (Q198) : subclass of (P279) : occurrence (Q1190554)
occurrence (Q1190554) : subclass of (P279) : temporal entity (Q26907166)
I would like all the descendants of temporal entity stopping before the instances (Triangle Shirtwaist Factory fire, World's Fair, Peloponnesian War).
Is there a way to do this with SPARQL or the API?
If I understand you correctly, you just want to see how instances are classified on Wikidata.
So starting with the example #AKSW gave:
SELECT DISTINCT ?event_type ?event_typeLabel {
?event_type wdt:P279* wd:Q26907166
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 100
The * path operator is expensive to evaluate, and at the time of writing Wikidata has close to 50 million items. That is why I added the LIMIT: without it I was getting timeouts.
Graphing it
To get a feel for the data, I like to look at it in the Wikidata graph builder, because it shows the clustering nicely.
https://angryloki.github.io/wikidata-graph-builder/?property=P279&item=Q26907166&iterations=2&mode=reverse
As you can see there are already a lot of classifications after 2 iterations. So we might also already be happy with this query:
SELECT DISTINCT ?event_type ?event_typeLabel {
?event_type wdt:P279/wdt:P279 wd:Q26907166
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Note that it follows the property P279 exactly twice. At the moment this gives me 281 items.
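If you want to experiment with other fixed depths, the path can be generated rather than typed out. Here is a small Python sketch that only builds the query text; the helper names `fixed_depth_path` and `build_depth_query` are hypothetical, and the query is not executed here.

```python
def fixed_depth_path(prop, depth):
    """Repeat `prop` `depth` times, joined with the sequence operator."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    return "/".join([prop] * depth)


def build_depth_query(root, depth, prop="wdt:P279"):
    """Return a query for classes exactly `depth` steps below `root`."""
    path = fixed_depth_path(prop, depth)
    return (
        "SELECT DISTINCT ?event_type ?event_typeLabel {\n"
        f"  ?event_type {path} {root} .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}"
    )


query = build_depth_query("wd:Q26907166", 2)
print(query)
```

Each call covers one level only, so collecting levels 1 to n means running n queries (or combining the paths with the | alternative operator).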
If you really need to traverse the tree in full, you can exclude items that have "instance of" (P31) statements using FILTER NOT EXISTS. The problem is that this currently always runs into timeouts:
SELECT DISTINCT ?event_type ?event_typeLabel {
?event_type wdt:P279* wd:Q26907166 .
FILTER NOT EXISTS { ?event_type wdt:P31 [] }
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 100
With a subquery you can limit the results from the tree, but you will get incomplete data:
SELECT ?event_type ?event_typeLabel
WHERE
{
{
SELECT DISTINCT ?event_type
WHERE
{
?event_type wdt:P279* wd:Q26907166 .
}
LIMIT 1000
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
FILTER NOT EXISTS { ?event_type wdt:P31 [] }
}
I would like to find the first common superclass(es) between several Wikidata entities.
Let's take a bridge and a cemetery. What is their "smallest" common superclass?
A bridge is a subclass of "architectural structure".
A cemetery is a subclass of "place of worship", which is a subclass of "architectural structure".
---> Their most specialized common class is "architectural structure".
This SPARQL query is close to the solution:
SELECT ?classe ?classeLabel WHERE {
wd:Q12280 wdt:P279* ?classe .
FILTER EXISTS { wd:Q39614 wdt:P279* ?classe .}
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
Problem: it returns all common classes between both items, not just the first ones. How could I filter the answer to get what I want?
If this question is of interest to someone else, here is the SPARQL query I finally used to get the least common subsumers of more than two items. It is a mix of #AKSW's response in the comments and the answer to that previous question on SO.
SELECT ?lcs ?lcsLabel WHERE {
?lcs ^wdt:P279* wd:Q32815, wd:Q34627, wd:Q16970, wd:Q16560 .
filter not exists {
?sublcs ^wdt:P279* wd:Q32815, wd:Q34627, wd:Q16970, wd:Q16560 ;
wdt:P279 ?lcs .
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
Try it.
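The logic of that query (keep the common superclasses, then drop any that have a direct subclass which is also common) can be checked offline. Here is a Python sketch on the bridge/cemetery example from the question; the function names are made up, and the entry "hypothetical top class" is invented toy data added only so that there is something to filter out.

```python
def ancestors(graph, node):
    """Return `node` plus all its direct and indirect superclasses."""
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph.get(current, ()))
    return seen


def least_common_subsumers(graph, items):
    """Common superclasses that have no direct subclass in the common set."""
    common = set.intersection(*(ancestors(graph, item) for item in items))
    return {
        cls for cls in common
        if not any(cls in graph.get(sub, ()) for sub in common)
    }


# class -> direct superclasses; the last entry is invented toy data.
graph = {
    "bridge": ["architectural structure"],
    "cemetery": ["place of worship"],
    "place of worship": ["architectural structure"],
    "architectural structure": ["hypothetical top class"],
}

lcs = least_common_subsumers(graph, ["bridge", "cemetery"])
print(lcs)
```

As in the SPARQL version, the result can contain several classes when the hierarchy is not a tree, because two unrelated common superclasses may both lack a common subclass.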