The examples for counting instances at query.wikidata.org (and elsewhere) use the wdt:P31/wdt:P279* property path. For example, the number of humans in Wikidata:
SELECT (COUNT(?item) AS ?count)
WHERE {
?item wdt:P31/wdt:P279* wd:Q5 .
}
so that instances of a subclass of human (I don't think this type exists, but think "Nobel laureate", etc.) are included. But then other examples, such as "women with most sitelinks and no image born in 1921 or later", which is truncated here:
SELECT ?s ?desc ?linkcount
WHERE
{
?s wdt:P31 wd:Q5 ; # human
wdt:P21 wd:Q6581072 ; # gender: female
wdt:P569 ?born .
FILTER (?born >= "1921-01-01T00:00:00Z"^^xsd:dateTime) .
...
don't use the wdt:P31/wdt:P279* property path. Are these generally oversights (or perhaps done for speed?), or is the subclass property path not actually needed in these cases and I'm too dense to see why?
I want to find all children's fiction writers using a Wikidata SPARQL query, but I couldn't figure out how. Can someone help, please? The following is my approach, but I don't think it is the correct way.
SELECT ?item ?itemLabel {
?item wdt:P31 wd:Q5. #find humans
?item wdt:P106 wd: #humans whose occupation is a novelist
[another condition needed] #children's fiction.
SERVICE wikibase:label {bd:serviceParam wikibase:language 'en'.}
} LIMIT 10
There is not one correct way, especially not in Wikidata where not all items of the same kind necessarily have the same properties.
One way would be to find the authors of works that are intended for (P2360) children:
# it’s a literary work (incl. any subclasses)
?book wdt:P31/wdt:P279* wd:Q7725634 .
# the literary work is intended for children
?book wdt:P2360 wd:Q7569 .
# the literary work has an author
?book wdt:P50 ?author .
# the author is a US citizen
?author wdt:P27 wd:Q30 .
Instead of getting all works that belong to the class "literary work" or any of its subclasses, you could decide to use only the class "fiction literature" (Q38072107), at the risk that not all relevant works use this class.
Another way would be to find all authors that have "children’s writer" (Q4853732), or any of its subclasses, as occupation:
?author wdt:P106/wdt:P279* wd:Q4853732 .
?author wdt:P27 wd:Q30 .
As the different ways might find different results, you could use them in the same query, using UNION:
SELECT DISTINCT ?author ?authorLabel
WHERE {
{
# way 1
}
UNION
{
# way 2
}
UNION
{
# way 3
}
SERVICE wikibase:label {bd:serviceParam wikibase:language 'en'.}
}
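The UNION skeleton above can be filled in mechanically. The following is only a sketch in Python, assuming the two graph patterns shown earlier are used as "way 1" and "way 2"; the helper name build_union_query is made up for illustration, and the resulting string would still have to be sent to the query service by whatever client you use.

```python
import textwrap

# "Way 1": authors of literary works intended for children (patterns from above).
WAY_1 = """\
?book wdt:P31/wdt:P279* wd:Q7725634 .
?book wdt:P2360 wd:Q7569 .
?book wdt:P50 ?author .
?author wdt:P27 wd:Q30 ."""

# "Way 2": authors whose occupation is children's writer or a subclass of it.
WAY_2 = """\
?author wdt:P106/wdt:P279* wd:Q4853732 .
?author wdt:P27 wd:Q30 ."""

LABEL_SERVICE = "SERVICE wikibase:label {bd:serviceParam wikibase:language 'en'.}"


def build_union_query(branches):
    """Wrap each graph pattern in braces and join the groups with UNION.

    Hypothetical helper: it only assembles text and does not validate SPARQL.
    """
    if not branches:
        raise ValueError("at least one branch is required")
    groups = ["{\n" + textwrap.indent(b, "  ") + "\n}" for b in branches]
    body = "\nUNION\n".join(groups) + "\n" + LABEL_SERVICE
    return (
        "SELECT DISTINCT ?author ?authorLabel\n"
        "WHERE {\n" + textwrap.indent(body, "  ") + "\n}"
    )


query = build_union_query([WAY_1, WAY_2])
print(query)
```

SELECT DISTINCT matters here because an author found by both ways would otherwise appear twice.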
In Wikidata, I want to find an item's country. Either directly if the item has a country directly, or by climbing up the P131s (located in the administrative territorial entity) until I find a country. Here is the query:
?item wdt:P131*/wdt:P17 ?country.
The query above works fine... except when a sub-division used to belong to another country, like for Q25270 (Prishtina). In such case, the result can be anachronistic. That's what I want to fix.
Great news: in such cases we should only consider the unique P131 (located in the administrative territorial entity) that has no P582 (end time) sub-property attached to it, and the problem is solved!
My question: how to alter my query above to achieve that?
Example: Let's say MyItem is in MyStreet is in MyTown is in MyRegion is in MyCountry, I must make sure that MyStreet, MyTown, and MyRegion do not have a P582 (end time).
(If "sub-property" is not the correct term, please let me know the right term and I will fix the question, thanks!)
An attempt
The query below works in most cases, but unfortunately it has a bug: It finds the wrong country in cases where the current country was also the country in the past (for instance Alsace belonged to France until 1871 then to Germany and currently to France again).
SELECT DISTINCT ?country WHERE {
wd:Q6556803 wdt:P131* ?area .
?area wdt:P17 ?country .
OPTIONAL {
wd:Q6556803 wdt:P131*/p:P131 [
pq:P582 ?endTime; ps:P131/wdt:P131* ?area
] .
} .
FILTER( !BOUND( ?endTime ) ) .
}
Wikidata uses different properties for direct links and links with extra information. So, for the statement "Prishtina is located in the administrative territorial entity Socialist Autonomous Province of Kosovo", there's the simple triple:
wd:Q25270 wdt:P131 wd:Q646035
And the long form with additional information (the end time):
wd:Q25270 p:P131 wds:Q25270-7df79cec-4938-8b6d-4e11-4dde6f72d73b .
wds:Q25270-7df79cec-4938-8b6d-4e11-4dde6f72d73b ps:P131 wd:Q646035 ;
pq:P582 "1990-01-01T00:00:00Z"
So, we need to filter out all paths with an end time (pq:P582):
SELECT DISTINCT ?s ?sLabel ?country ?countryLabel {
VALUES ?s {
wd:Q25270
}
?s wdt:P131* ?area .
?area wdt:P17 ?country .
FILTER NOT EXISTS {
?s p:P131/(ps:P131/p:P131)* ?statement .
?statement ps:P131 ?area .
?s p:P131/(ps:P131/p:P131)* ?intermediateStatement .
?intermediateStatement (ps:P131/p:P131)* ?statement .
?intermediateStatement pq:P582 ?endTime .
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
}
limit 50
Here, ?intermediateStatement is a statement with an end time on the path from ?s to a country.
This query does seem to time out if there is more than one value set for ?s. Also, the query does not take into account that there might exist multiple links from an item to an area where one has a timestamp and the other doesn't (both paths will be filtered out).
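If the query is too slow or the duplicate-link case matters, one alternative is to fetch the P131 statements with their end-time qualifiers and do the walk on the client. The following Python sketch only illustrates that filtering logic on hand-made data; every identifier in it is hypothetical, and it assumes the statements have already been retrieved.

```python
from collections import deque

# One tuple per P131 statement: (subject, area, end_time).
# end_time None means the statement has no P582 qualifier. All names are made up.
LOCATED_IN = [
    ("my_street", "my_town", "2000-01-01"),     # outdated duplicate link
    ("my_street", "my_town", None),             # current link to the same area
    ("my_town", "old_province", "1990-01-01"),  # ended link, must be ignored
    ("my_town", "my_region", None),             # current link
]

# P17 (country) values, keyed by item.
COUNTRY = {
    "old_province": "OldCountry",
    "my_region": "MyCountry",
}


def current_countries(item):
    """Follow only P131 statements without an end time, then read P17."""
    seen = {item}
    queue = deque([item])
    countries = set()
    while queue:
        node = queue.popleft()
        if node in COUNTRY:  # P131* allows zero steps, so every visited node is checked
            countries.add(COUNTRY[node])
        for subject, area, end_time in LOCATED_IN:
            if subject == node and end_time is None and area not in seen:
                seen.add(area)
                queue.append(area)
    return countries


print(current_countries("my_street"))
```

Because each statement is judged individually, an area linked twice (once with an end time, once without) is still reached through the current statement.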
What does the asterisk mean in this SPARQL query?
SELECT ?uri ?type
WHERE{
?uri a ?type.
?type rdfs:subClassOf* example:Device.
}
Does it mean "subclass of a subclass"?
Can I use it with other predicates?
An asterisk (*) after a path element means “zero or more of this element”.
If there are no other elements in the path, ?a something* ?b means that ?b might also just be ?a directly, with no path elements between them at all.
?item wdt:P31/wdt:P279* ?class.
# means:
?item wdt:P31 ?class
# or
?item wdt:P31/wdt:P279 ?class
# or
?item wdt:P31/wdt:P279/wdt:P279 ?class
# or
?item wdt:P31/wdt:P279/wdt:P279/wdt:P279 ?class
See here for a more detailed answer.
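The expansion listed above can also be checked mechanically. This is a minimal Python sketch on a tiny hand-made set of triples, not how a SPARQL engine actually evaluates paths; the item and class names are invented for illustration. It takes one P31 step and then zero or more P279 steps.

```python
# Toy triples (subject, predicate, object); all names are hypothetical.
TRIPLES = {
    ("alice", "P31", "nobel_laureate"),   # alice is an instance of nobel_laureate
    ("bob", "P31", "human"),              # bob is a direct instance of human
    ("nobel_laureate", "P279", "human"),  # nobel_laureate is a subclass of human
    ("human", "P279", "mammal"),          # human is a subclass of mammal
}


def objects(subject, predicate):
    """Return all o such that (subject, predicate, o) is in TRIPLES."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def classes_of(item):
    """Emulate `item P31/P279* ?class`: one P31 step, then zero or more P279 steps."""
    found = set()
    frontier = objects(item, "P31")  # the mandatory P31 step
    while frontier:
        cls = frontier.pop()
        if cls not in found:
            found.add(cls)           # zero P279 steps already counts as a match
            frontier |= objects(cls, "P279")
    return found


def instances_of(target):
    """Emulate `?item P31/P279* target`."""
    subjects = {s for s, p, o in TRIPLES if p == "P31"}
    return {s for s in subjects if target in classes_of(s)}


print(sorted(instances_of("human")))
```

With a plain P31 match only bob would be found for "human"; the `*` step is what brings in alice through nobel_laureate.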
The asterisk after a predicate means that you want to follow a property path with zero or more occurrences of rdfs:subClassOf.
Your phrase "subclass of a subclass" is about right, although I would say "subclasses of subclasses," because the * property path is recursive. As you can see from the technical document in AKSW's comment, there are several other property path operators that go in either direction, with or without a limit on the number of occurrences (or depth).
Here's a pretty good example from MarkLogic... I think this should work with any SPARQL 1.1 endpoint.
https://developer.marklogic.com/features/semantics/path-examples
Yes, property paths are applicable to any predicate/property, not just rdfs:subClassOf.
How can I list properties with their values for any given DBpedia class? I'm new to this and have looked at several other questions on this but I haven't found exactly what I'm looking for.
What I'm trying to do is providing some relevant additional information to topics of conversation I have got from text mining.
Say for example the topic of conversation in a certain community is iPhones. I would like to use this word to query the DBpedia page for this word, IPhone, to get an output such as:
Type: Smartphone
Operating System: IOS
Manufacturer: Foxconn
EDIT:
Using the query from AKSW I can print the p (property?) and o (object?), although I'm still not getting the output I want. Instead of getting something like:
weight: 133.0
I get
http://dbpedia.org/property/weight:133.0
Is there a way to just get the name of the property instead of the DBpedia link?
Classes do not "have" properties with values. Instances (also called resources or individuals) are related via a property to some value, which can be another individual, a literal, or an anonymous instance (a blank node). And instances belong to a class, e.g. Berlin belongs to the class City.
What you want is to get all outgoing values of a given resource in DBpedia:
SELECT * WHERE { <http://dbpedia.org/resource/IPhone> ?p ?o }
Alternatively, you can use SPARQL DESCRIBE, which returns the data in the form of an RDF graph, i.e. a set of RDF triples:
DESCRIBE <http://dbpedia.org/resource/IPhone>
This might also return incoming information because it's not really specified in the W3C recommendation what has to be returned.
As stated by AKSW, properties often link to other classes rather than values. If you want all properties and their values, including other classes, then the query below gives you the property label and filters by language (put the language code you need where I have put "en").
SELECT DISTINCT ?label ?o
WHERE {
<http://dbpedia.org/resource/IPhone> ?p ?o.
?p <http://www.w3.org/2000/01/rdf-schema#label> ?label .
FILTER(LANG(?label) = "" || LANGMATCHES(LANG(?label), "en"))
}
If you don't want any properties that link to other classes, then you only want datatype properties so this code could help:
SELECT DISTINCT ?label ?o
WHERE {
<http://dbpedia.org/resource/IPhone> ?p ?o.
?p <http://www.w3.org/2000/01/rdf-schema#label> ?label .
?p a owl:DatatypeProperty .
FILTER(LANG(?label) = "" || LANGMATCHES(LANG(?label), "en"))
}
Obviously this gives you far less information and functionality, but it might just be what you're after?
Edit: In reply to your comment, it is also possible to get the labels for the values, using the same technique:
SELECT DISTINCT ?label ?oLabel
WHERE {
<http://dbpedia.org/resource/IPhone> ?p ?o.
?p <http://www.w3.org/2000/01/rdf-schema#label> ?label .
?o <http://www.w3.org/2000/01/rdf-schema#label> ?oLabel
FILTER(LANG(?label) = "" || LANGMATCHES(LANG(?label), "en"))
}
Note that http://www.w3.org/2000/01/rdf-schema#label is often shortened to rdfs:label by defining prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
So you could also do:
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT ?label ?oLabel
WHERE {
<http://dbpedia.org/resource/IPhone> ?p ?o.
?p rdfs:label ?label .
?o rdfs:label ?oLabel
FILTER(LANG(?label) = "" || LANGMATCHES(LANG(?label), "en"))
}
and get exactly the same result but possibly easier to read.
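Not every property is guaranteed to carry an rdfs:label, so the join above can silently drop rows. As a fallback on the client side, the property name can be cut out of the URI itself. This is a small Python sketch under that assumption; local_name and format_pair are made-up helper names, not part of any DBpedia or SPARQL library.

```python
def local_name(uri):
    """Return the part of a URI after the last '#' or, failing that, the last '/'."""
    for sep in ("#", "/"):
        _head, found, tail = uri.rpartition(sep)
        if found and tail:
            return tail
    return uri  # nothing to strip, so return the input unchanged


def format_pair(prop_uri, value):
    """Render one result row as 'name: value'."""
    return f"{local_name(prop_uri)}: {value}"


print(format_pair("http://dbpedia.org/property/weight", "133.0"))
```

This gives the raw property name (for example "weight"), whereas rdfs:label gives a human-readable, language-tagged label when one exists.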
I'm trying to develop a tool in JS for tagging pictures, so I need a set of possible "things" from DBpedia. I already tried to retrieve them this way:
select ?s ?l {
?s a owl:Class .
?s rdf:type ?l
FILTER regex(str(?s), "House", "i").
}
http://dbpedia.org/snorql/?query=select+%3Fs+%3Fl+%7B%0D%0A+++%3Fs+a+owl%3AClass+.%0D%0A+++%3Fs+rdf%3Atype+%3Fl%0D%0A+++FILTER+regex%28str%28%3Fs%29%2C+%22House%22%2C+%22i%22%29.%0D%0A%7D
And also this way:
select ?label
WHERE {
?concept a skos:Concept.
?concept skos:prefLabel ?label.
FILTER regex(str(?label), "^House", "i").
}
http://dbpedia.org/snorql/?query=select+%3Flabel+%0D%0AWHERE+%7B%0D%0A++%3Fconcept+a+skos%3AConcept.%0D%0A++%3Fconcept+skos%3AprefLabel+%3Flabel.%0D%0A++FILTER+regex%28str%28%3Flabel%29%2C+%22%5EHouse%22%2C+%22i%22%29.%0D%0A%7D
In the first case, I only get "instances" of the house "thing", but not the "House" class itself. In the second one, I never retrieve "house"; the closest match is "houses". Is there an alternative for retrieving a better vocabulary based on the DBpedia dataset?
If you don't need to restrict yourself to owl:Thing or skos:Concept, you can just get things that have a label containing "house". Rather than using regex, I chose contains and lcase, since a string containment check could be less expensive than invoking a full regular expression processor.
select ?thing ?label where {
?thing rdfs:label ?label .
filter contains(lcase(?label), "house")
}