How to follow a path depending on a qualifier in SPARQL

I would like to query all people who are connected to each other, but filter by the qualifier value on the path.
For example, the query below gets all humans related to Putin. But his spouse, whose statement has the qualifier "end time", should not be followed.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?pep ?pepLabel ?relation ?relationLabel ?relatedPerson ?relatedPersonLabel ?endtimequalifier
WHERE
{
VALUES ?pep {wd:Q7747}
?relatedPerson wdt:P31 wd:Q5.
?pep ?relation ?relatedPerson.
#What should I put here for the query to ignore the spouse since the endtimequalifier is available
OPTIONAL{
?pep p:P26 [ps:P26 ?spouse; pq:P582 ?endtimequalifier ].
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
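One possible way to do this, offered as a sketch rather than a definitive answer: the wdt: triples used above carry no qualifiers, so the path has to go through the statement nodes instead. The query below assumes the Wikibase ontology predicates wikibase:claim and wikibase:statementProperty to map each property entity to its p: and ps: predicates; the variable names ?claim, ?statement and ?ps are illustrative and not part of the original query.

```sparql
SELECT ?pep ?pepLabel ?relation ?relationLabel ?relatedPerson ?relatedPersonLabel
WHERE
{
  VALUES ?pep { wd:Q7747 }
  # Go through the statement node so that qualifiers are visible
  ?pep ?claim ?statement .
  ?relation wikibase:claim ?claim ;
            wikibase:statementProperty ?ps .
  ?statement ?ps ?relatedPerson .
  ?relatedPerson wdt:P31 wd:Q5 .
  # Drop every statement that has an "end time" (P582) qualifier
  FILTER NOT EXISTS { ?statement pq:P582 ?endtime . }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
```

Here ?relation is bound to the property entity itself, so ?relationLabel can be resolved by the label service. Note that this sketch does not look at statement ranks, so deprecated statements would also be followed unless they are filtered out separately.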

Related

How can I pull the Wikidata description?

I'm trying to pull the wikidata description of biblical figures.
For example, for David, this would pull: king of Israel and Judah.
Here's what I started with:
select ?person ?personLabel
where {
?person wdt:P31 wd:Q20643955.
SERVICE wikibase:label {
bd:serviceParam wikibase:language "en" .
}
}
From Wikidata Query Service/User Manual § Label_service:
You can fetch the label, alias, or description [...] In automatic mode, you only need to specify the service template, e.g.:
PREFIX wikibase: <http://wikiba.se/ontology#>
SERVICE wikibase:label {
bd:serviceParam wikibase:language "en" .
}
and WDQS will automatically generate labels as follows:
If an unbound variable in SELECT is named ?NAMELabel, then WDQS produces the label (rdfs:label) for the entity in variable ?NAME.
If an unbound variable in SELECT is named ?NAMEAltLabel, then WDQS produces the alias (skos:altLabel) for the entity in variable ?NAME.
If an unbound variable in SELECT is named ?NAMEDescription, then WDQS produces the description (schema:description) for the entity in variable ?NAME.
So you can just specify ?personDescription among your selected variables.
Here is an example.
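Applied to the query from the question, this could look like the following minimal sketch. The only change is the added ?personDescription variable, which the label service fills in automatically:

```sparql
SELECT ?person ?personLabel ?personDescription
WHERE {
  ?person wdt:P31 wd:Q20643955.
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en" .
  }
}
```

Items without an English description should simply come back with ?personDescription unbound.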

Identify entity of a Wikipedia page

My question is related to a similar question/comment which unfortunately never received an answer.
Given a list of multiple Wikipedia pages, e.g.:
https://en.wikipedia.org/wiki/Donald_Trump
https://en.wikipedia.org/wiki/The_Matrix
https://en.wikipedia.org/wiki/Tiger
...
how can I find out what type of entity these articles refer to? Ideally, I would want something at a higher level, e.g. person, movie, animal, etc.
My best guess so far was to use SPARQL against Wikidata to walk up the instance-of or subclass-of tree. However, this did not lead to meaningful results.
SELECT ?lemma ?item ?itemLabel ?itemDescription ?instance ?instanceLabel ?subclassLabel WHERE {
VALUES ?lemma {
"Donald Trump"@en
"The Matrix"@en
"Tiger"@en
}
?sitelink schema:about ?item;
schema:isPartOf <https://en.wikipedia.org/>;
schema:name ?lemma.
?item wdt:P31* ?instance.
?item wdt:P279* ?subclass.
SERVICE wikibase:label {
bd:serviceParam wikibase:language "en,da,sv".}
}
The result can be seen here: https://w.wiki/ZmQ
One option would of course also be to look at the itemDescription, but I'm afraid that this is too granular to build meaningful groups from larger lists and count frequencies later on.
Does anyone have a hint/idea on how to get more general entity categories? Maybe also from the mediawiki API?
Any input would be highly appreciated!
Here are three possibilities, side-by-side:
SELECT ?lemma ?item (GROUP_CONCAT(DISTINCT ?instanceLabel; SEPARATOR = " ") AS ?a) (GROUP_CONCAT(DISTINCT ?subclassLabel; SEPARATOR = " ") AS ?b) (GROUP_CONCAT(DISTINCT ?isaLabel; SEPARATOR = " ") AS ?c) WHERE {
VALUES ?lemma {
"Donald Trump"@en
"The Matrix"@en
"Tiger"@en
}
?sitelink schema:about ?item;
schema:isPartOf <https://en.wikipedia.org/>;
schema:name ?lemma.
OPTIONAL { ?item (wdt:P31/(wdt:P279*)) ?instance. }
OPTIONAL { ?item wdt:P279 ?subclass. }
OPTIONAL { ?item wdt:P31 ?isa. }
SERVICE wikibase:label {
bd:serviceParam wikibase:language "en,da,sv".
?instance rdfs:label ?instanceLabel.
?subclass rdfs:label ?subclassLabel.
?isa rdfs:label ?isaLabel.
}
# Here, you could add: FILTER(?instanceLabel in ("mammal"@en, "movie"@en, "musical"@en (and so on...)))
}
GROUP BY ?lemma ?item
Live here.
If you're looking at labels such as "film" and "mammal", i.e. a couple dozen at most, you could explicitly list them in order of preference, then use the first one that occurs.
Note that you may be running into this bug: https://www.wikidata.org/wiki/Wikidata:SPARQL_tutorial#wikibase:Label_and_aggregations_bug

How to make a sparql query inside another sparql query?

I am trying to make a sparql query inside another sparql query. In sql, we can do it like this:
SELECT column_name(s)
FROM table_name
WHERE column_name IN (SELECT STATEMENT);
I want to do the same thing in a SPARQL query. Specifically, I have two SPARQL queries and I want to combine them. My end goal is to find the subsidiaries of 'Siemens PLM Software Company'. To do this, I first need to find the company's ID and then look for its subsidiaries.
Q1: Finds the unique identity of 'Siemens PLM Software Company'
SELECT DISTINCT ?item ?label ?articleLabel WHERE {
?item ?label "Siemens PLM Software"@en;
wdt:P31 wd:Q4830453.
?article schema:about ?item;
schema:inLanguage "en".
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Returns Q15898201
Q2: Find the subsidiary of 'Siemens PLM Software Company'
SELECT ?Subsidiary ?SubsidiaryLabel ?parent_organization ?parent_organizationLabel WHERE {
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
?Subsidiary wdt:P749 wd:Q15898201.
OPTIONAL { ?Subsidiary wdt:P749 ?parent_organization. }
}
Returns Siemens
I would like to combine them together to something like this:
SELECT ?Subsidiary ?SubsidiaryLabel ?parent_organization ?parent_organizationLabel WHERE {
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
?Subsidiary wdt:P749 wd:
{SELECT DISTINCT ?item ?label ?articleLabel WHERE {
?item ?label "Siemens PLM Software"@en;
wdt:P31 wd:Q4830453.
?article schema:about ?item;
schema:inLanguage "en".
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
}.
OPTIONAL { ?Subsidiary wdt:P749 ?parent_organization. }
Do you have any idea how I can do this? Thank you!
There were several things not quite clear:
Did you really mean to return the rdfs:label property as ?label? I don't understand why you'd want to do that, so I assume you mean to just directly match on the rdfs:label property without returning it.
Why return ?articleLabel? It's not a real label anyway but just the value of ?label as a literal. I assume you mean to just return ?article.
It seems you have the parent organization relationship backwards? Your Q2 asks for subsidiaries of Siemens PLM Software. None exist. To ask for Siemens, you need to ask for
wd:Q15898201 wdt:P749 ?parent_organization
and not
?Subsidiary wdt:P749 wd:Q15898201
With that out of the way: There is no need for a subquery here. The query can be achieved simply by writing out the graph pattern for the desired graph structure, using OPTIONAL for parts that may not exist, and making sure that the variable names match up correctly throughout the query:
SELECT ?item ?itemLabel ?parent_organization ?parent_organizationLabel WHERE {
# Find business by label
?item rdfs:label "Siemens PLM Software"@en;
wdt:P31 wd:Q4830453.
# Find English-language articles about the business
?article schema:about ?item;
schema:inLanguage "en".
# Find the business' parent organization, if one exists
OPTIONAL { ?item wdt:P749 ?parent_organization. }
# For any variable ?xxx, add variable ?xxxLabel with the label
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
That being said, you can use a subquery if you really want to:
SELECT ?item ?itemLabel ?parent_organization ?parent_organizationLabel WHERE {
{
SELECT ?item {
?item rdfs:label "Siemens PLM Software"@en;
wdt:P31 wd:Q4830453.
}
}
?article schema:about ?item;
schema:inLanguage "en".
OPTIONAL { ?item wdt:P749 ?parent_organization. }
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
This is equivalent to the first query, but runs much slower because the query optimiser is not as good at handling subqueries.

Querying wikidata for "property constraint"

TL;DR
How to query (sparql) about properties of a property?
Or..
So as part of my project I need to find the properties in wikidata that have any time constraint, to be specific both "start time" and "end time".
I tried this query:
SELECT DISTINCT ?prop WHERE {
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
?person wdt:P31 wd:Q5.
?person ?prop ?statement.
?statement pq:P580 ?starttime.
?statement pq:P582 ?endtime.
}
LIMIT 200
(Yes, the properties should be related to humans.)
Anyway, I do get some good results like:
http://www.wikidata.org/prop/P26
http://www.wikidata.org/prop/P39
But I also get some other properties that are definitely wrong.
So basically, what I'm trying to do is get a list of properties that have the property constraint (P2302) "allowed qualifiers constraint" (Q21510851) with start time (P580) and end time (P582).
Is that even possible?
I tried some queries like:
SELECT DISTINCT ?property ?propertyLabel ?propertyDescription ?subpTypeOf ?subpTypeOfLabel
WHERE
{
?property rdf:type wikibase:Property .
?property wdt:P2302 ?subpTypeOf.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
but it does not return the results I wanted.
Is it even possible to query this kind of thing?
Thanks
Qualifiers are used on property pages too. Your second query should be:
SELECT DISTINCT ?prop ?propLabel {
?prop p:P2302 [ ps:P2302 wd:Q21510851 ; pq:P2306 wd:P580, wd:P582 ] ;
p:P2302 [ ps:P2302 wd:Q21503250 ; pq:P2308 wd:Q5 ; pq:P2309 wd:Q21503252 ] .
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
} ORDER BY ASC(xsd:integer(strafter(str(?prop), concat(str(wd:), "P"))))
Try it!
Your first query is correct, but note that this is an 'as-is' query: it reflects how qualifiers are actually used in the data. For example, wd:P410 does not have the respective constraints, but look at wd:Q83855.

Filter by type in Wikidata

This SPARQL request looks for all cities called "Berlin" in Wikidata:
SELECT DISTINCT ?item ?itemLabel ?itemDescription WHERE {
?type (a | wdt:P279) wd:Q515. # Sub-type of city
?item wdt:P31 ?type.
?item rdfs:label "Berlin"@en.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
PROBLEM: It returns zero results.
Meanwhile, the request below correctly finds Q64 (capital and city-state of Germany), but it also returns a lot of other things called Berlin, so I want to filter on cities (then in a future phase I will order these cities by population, but that is outside the scope of this question):
SELECT DISTINCT ?item ?itemLabel ?itemDescription WHERE {
?item rdfs:label "Berlin"@en.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Note: My code for getting instances of subclasses of city (Berlin is a big city which is subclass of city) seems to work correctly, as illustrated by the results of this query.
It was a Wikidata bug.
According to Wikidata's Jura1, it was a bug in Wikidata caused by someone's experiments with "preferred rank".
Discussion at https://www.wikidata.org/wiki/Wikidata:Project_chat/Archive/2016/09#P31_inconsistency
The bug has been fixed just now.
You can only query for data that is contained in the dataset.
If you try an alternative of your query
SELECT DISTINCT ?item ?itemLabel ?itemDescription ?type1 ?type2 WHERE {
?item rdfs:label "Berlin"@en.
optional{?item rdf:type ?type1 }
optional{?item wdt:P279 ?type2 }
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
it returns no types, neither connected by rdf:type nor wdt:P279.
If you have a look at the entity of the capital and city-state Berlin, you can see that there is information about "instance of", but this property is supposed to be https://www.wikidata.org/wiki/Property:P31. None of its values links to wd:Q515, so I'm wondering where you got this idea from.
But to be honest, I don't know that much about Wikidata and to me, it's not clear why no rdf:type is used, but a common pattern for RDF datasets is to use
?s rdf:type/rdfs:subClassOf* SUPER_CLASS .
if we assume that there is rdf:type information available.
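On Wikidata, the counterpart of this pattern is usually written with wdt:P31 (instance of) in place of rdf:type and wdt:P279 (subclass of) in place of rdfs:subClassOf. A minimal sketch for the question's case, assuming the item is linked to wd:Q515 (city) or to one of its subclasses:

```sparql
SELECT DISTINCT ?item ?itemLabel ?itemDescription WHERE {
  # English label "Berlin", and an instance of city or of any subclass of city
  ?item rdfs:label "Berlin"@en ;
        wdt:P31/wdt:P279* wd:Q515 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```

The property path wdt:P31/wdt:P279* follows one "instance of" step and then zero or more "subclass of" steps. It can only return an item if those links are actually present in the data.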
If you check the types wd:Q64 is an instance of
SELECT DISTINCT ?type ?typeLabel WHERE {
wd:Q64 (a | wdt:P31) ?type.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?typeLabel
None of them are City (wd:Q515) or a sub-class of it.
Looks like a data issue. Perhaps you should contact Wikidata.