How to filter wikidata labels in concept search? - sparql

I am using the following code to get the wikidata labels for the given concept (e.g., network analysis).
SELECT ?item {
VALUES ?searchTerm { "network analysis" }
SERVICE wikibase:mwapi {
bd:serviceParam wikibase:api "EntitySearch".
bd:serviceParam wikibase:endpoint "www.wikidata.org".
bd:serviceParam wikibase:limit 3 .
bd:serviceParam mwapi:search ?searchTerm.
bd:serviceParam mwapi:language "en".
?item wikibase:apiOutputItem mwapi:item.
?num wikibase:apiOrdinal true.
}
?item (wdt:P279|wdt:P31) ?type
}
ORDER BY ?searchTerm ?num
This returns the following wikidata labels.
wd:Q618079 --> related to electronics
wd:Q4417999 --> related to graph theory (computer science)
wd:Q60640547 --> related to scholarly article
I would like to get the wikidata labels that are only related to computer science (i.e. wd:Q4417999 in the above example).
In DBpedia I ran the below query to identify if a word is in computer science.
sparql.setQuery(" ASK { dbc:Network_analysis skos:broader{1,7} dbc:Computer_science } ")
Is it possible to do the same in Wikidata, i.e. check whether computer science is an ancestor of the given concept and return only that Wikidata label?
If there is a better solution than an ancestral search, please suggest it.
I am happy to provide more details if needed.

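The SPARQL query below solved my issue. The added FILTER EXISTS keeps only items whose type reaches computer science (wd:Q21198) through subclass-of (P279) and part-of (P361) links.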
SELECT DISTINCT ?item {
VALUES ?searchTerm { "network analysis"}
SERVICE wikibase:mwapi {
bd:serviceParam wikibase:api "EntitySearch".
bd:serviceParam wikibase:endpoint "www.wikidata.org".
bd:serviceParam wikibase:limit 3 .
bd:serviceParam mwapi:search ?searchTerm.
bd:serviceParam mwapi:language "en".
?item wikibase:apiOutputItem mwapi:item.
?num wikibase:apiOrdinal true.
}
?item (wdt:P279|wdt:P31) ?type
filter exists {?type wdt:P279*/wdt:P361* wd:Q21198}
}
ORDER BY ?searchTerm ?num
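Since the question already drives its queries from Python, the query above can be generated for any search term and ancestor item. This is only a minimal sketch: `build_concept_query` and `run_query` are hypothetical helper names, the `User-Agent` string is a placeholder, and quoting the term with `json.dumps` is an assumption that holds for ordinary search terms.

```python
import json
import re
import urllib.parse
import urllib.request

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_concept_query(search_term, ancestor_qid, limit=3):
    """Build the EntitySearch query, filtered to descendants of ancestor_qid."""
    if not re.fullmatch(r"Q[1-9][0-9]*", ancestor_qid):
        raise ValueError(f"not a Wikidata item ID: {ancestor_qid!r}")
    # Assumption: a JSON-quoted string is an acceptable SPARQL literal here.
    term = json.dumps(search_term)
    lines = [
        "SELECT DISTINCT ?item {",
        f"  VALUES ?searchTerm {{ {term} }}",
        "  SERVICE wikibase:mwapi {",
        '    bd:serviceParam wikibase:api "EntitySearch".',
        '    bd:serviceParam wikibase:endpoint "www.wikidata.org".',
        f"    bd:serviceParam wikibase:limit {int(limit)} .",
        "    bd:serviceParam mwapi:search ?searchTerm.",
        '    bd:serviceParam mwapi:language "en".',
        "    ?item wikibase:apiOutputItem mwapi:item.",
        "    ?num wikibase:apiOrdinal true.",
        "  }",
        "  ?item (wdt:P279|wdt:P31) ?type",
        f"  FILTER EXISTS {{ ?type wdt:P279*/wdt:P361* wd:{ancestor_qid} }}",
        "}",
        "ORDER BY ?searchTerm ?num",
    ]
    return "\n".join(lines)


def run_query(query, user_agent="concept-filter-sketch/0.1 (placeholder)"):
    """Send the query to the Wikidata Query Service and return the item URIs."""
    url = WDQS_ENDPOINT + "?" + urllib.parse.urlencode(
        {"query": query, "format": "json"}
    )
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request) as response:
        data = json.load(response)
    return [row["item"]["value"] for row in data["results"]["bindings"]]


# Building the query needs no network access; run_query does.
print(build_concept_query("network analysis", "Q21198"))
```

Calling `run_query(build_concept_query("network analysis", "Q21198"))` would send the request. Swapping `Q21198` for another item ID restricts the results to a different field.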

Related

How to get IMDb links of all films in a category page on Wikipedia?

I want to get the IMDb links of all films in a category page on Wikipedia, for example: https://en.wikipedia.org/wiki/Category:American_historical_films. I have heard of Wikidata Query Service and PetScan, but honestly they are too complicated for me to learn at the moment. Could you please show me how to request the IMDb links using these tools? Thanks in advance.
Here is the Wikidata query code
SELECT ?item ?title ?imdb WHERE {
{
SELECT ?item ?title WHERE {
SERVICE wikibase:mwapi {
bd:serviceParam wikibase:api "Generator".
bd:serviceParam wikibase:endpoint "en.wikipedia.org".
bd:serviceParam mwapi:generator "categorymembers".
bd:serviceParam mwapi:gcmtitle "category:American_historical_films";
mwapi:gcmprop "ids|title";
mwapi:gcmtype "page";
mwapi:gcmlimit "max".
?title wikibase:apiOutput mwapi:title .
?item wikibase:apiOutputItem mwapi:item.
}
}LIMIT 100
}
?item wdt:P345 ?imdb.
} LIMIT 100
Wikidata query URL with result limited to 100
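The query returns IMDb identifiers (the values of P345) rather than full links. A small sketch of turning those identifiers into URLs follows; `imdb_url` is a hypothetical helper, and it only handles the `tt` (title) and `nm` (name) prefixes, returning `None` for anything else.

```python
# Map the two-letter IMDb ID prefix to the URL path segment.
IMDB_PATHS = {"tt": "title", "nm": "name"}


def imdb_url(imdb_id):
    """Return the IMDb page URL for a title or name ID, or None if unrecognised."""
    path = IMDB_PATHS.get(imdb_id[:2])
    if path is None or not imdb_id[2:].isdigit():
        return None
    return f"https://www.imdb.com/{path}/{imdb_id}/"


# Example with placeholder IDs standing in for the ?imdb column of the result.
for sample_id in ["tt0000001", "nm0000001", "unknown"]:
    print(sample_id, imdb_url(sample_id))
```

Since the category contains films, the `?imdb` values should normally be `tt` identifiers. The `None` branch guards against other kinds of values.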

Wikidata mwapi Search multiple terms at the same time

Is it possible to search for multiple terms at the same time with the MediaWiki API service at https://query.wikidata.org/, using something like an OR operator? For example, I want to search for Chicago OR New York City.
Below is an example that searches for just Chicago. Also, how can I search for an exact term? I don't want "123 Chicago" to be returned, only "Chicago".
SELECT ?item ?itemLabel WHERE {
SERVICE wikibase:mwapi {
bd:serviceParam wikibase:endpoint "www.wikidata.org";
wikibase:api "EntitySearch";
mwapi:search "Chicago";
mwapi:language "en".
?item wikibase:apiOutputItem mwapi:item.
}
SERVICE wikibase:label {bd:serviceParam wikibase:language "en".}
}
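One possible approach, not confirmed anywhere in this thread, is to bind the terms with VALUES (as the first query on this page does for a single term) and to keep only items whose label equals the term. The sketch below generates such a query; `build_multi_search_query` is a hypothetical helper, and the exact-match filter on `rdfs:label` is an assumption about what "exact" should mean here.

```python
import json


def build_multi_search_query(terms, language="en", exact=True):
    """Build an EntitySearch query that runs once per term bound by VALUES."""
    # Assumption: JSON-quoted strings are acceptable SPARQL literals here.
    values = " ".join(json.dumps(term) for term in terms)
    lines = [
        "SELECT DISTINCT ?searchTerm ?item ?itemLabel WHERE {",
        f"  VALUES ?searchTerm {{ {values} }}",
        "  SERVICE wikibase:mwapi {",
        '    bd:serviceParam wikibase:endpoint "www.wikidata.org";',
        '      wikibase:api "EntitySearch";',
        "      mwapi:search ?searchTerm;",
        f'      mwapi:language "{language}".',
        "    ?item wikibase:apiOutputItem mwapi:item.",
        "  }",
    ]
    if exact:
        # Keep only items whose label in the given language equals the term.
        lines += [
            "  ?item rdfs:label ?label.",
            f'  FILTER(LANG(?label) = "{language}" && STR(?label) = ?searchTerm)',
        ]
    lines += [
        f'  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{language}". }}',
        "}",
    ]
    return "\n".join(lines)


print(build_multi_search_query(["Chicago", "New York City"]))
```

Passing `exact=False` drops the label filter and returns whatever EntitySearch matches for each term.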

How to find items with no value for a property

With the Wikidata Query Service (which I am new to), I am trying to find items that have no value for a property. In this case, I am looking for instances of (P31) humans (Q5) with no sex or gender (P21). My code is really basic:
SELECT ?item ?itemLabel WHERE {
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
?item wdt:P21 wd:Q6581072.
?item wdt:P31 wd:Q5.
}
LIMIT 100
Line 3 restricts it to finding things with female as the sex or gender. What could I replace it with that would make it only find things with no value for P21? The guides that I've found and a bit of googling don't seem to have stuff about looking for things without a value for a given property.
As discussed in the comments...
SELECT ?item ?itemLabel
WHERE
{
SERVICE wikibase:label
{ bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
FILTER NOT EXISTS { ?item wdt:P21 ?val }
?item wdt:P31 wd:Q5
}
LIMIT 100

How to make a sparql query inside another sparql query?

I am trying to make a sparql query inside another sparql query. In sql, we can do it like this:
SELECT column_name(s)
FROM table_name
WHERE column_name IN (SELECT STATEMENT);
I want to do the same thing in a SPARQL query. Specifically, I have two SPARQL queries and I want to combine them. My end goal is to find the subsidiaries of the 'Siemens PLM Software' company. To do this, I first need to find the company's ID and then look for its subsidiaries.
Q1: Finds the unique identity of 'Siemens PLM Software Company'
SELECT DISTINCT ?item ?label ?articleLabel WHERE {
?item ?label "Siemens PLM Software"@en;
wdt:P31 wd:Q4830453.
?article schema:about ?item;
schema:inLanguage "en".
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Returns Q15898201
Q2: Find the subsidiary of 'Siemens PLM Software Company'
SELECT ?Subsidiary ?SubsidiaryLabel ?parent_organization ?parent_organizationLabel WHERE {
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
?Subsidiary wdt:P749 wd:Q15898201.
OPTIONAL { ?Subsidiary wdt:P749 ?parent_organization. }
}
Returns Siemens
I would like to combine them together to something like this:
SELECT ?Subsidiary ?SubsidiaryLabel ?parent_organization ?parent_organizationLabel WHERE {
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
?Subsidiary wdt:P749 wd:
{SELECT DISTINCT ?item ?label ?articleLabel WHERE {
?item ?label "Siemens PLM Software"@en;
wdt:P31 wd:Q4830453.
?article schema:about ?item;
schema:inLanguage "en".
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
}.
OPTIONAL { ?Subsidiary wdt:P749 ?parent_organization. }
Do you have any idea how I can do this? Thank you!
There were several things not quite clear:
Did you really mean to return the rdfs:label property as ?label? I don't understand why you'd want to do that, so I assume you mean to just directly match on the rdfs:label property without returning it
Why return ?articleLabel? It's not a real label anyway but just the value of ?label as a literal. I assume you mean to just return ?article.
It seems you have the parent organization relationship backwards? Your Q2 asks for subsidiaries of Siemens PLM Software. None exist. To ask for Siemens, you need to ask for
wd:Q15898201 wdt:P749 ?parent_organization
and not
?Subsidiary wdt:P749 wd:Q15898201
With that out of the way: There is no need for a subquery here. The query can be achieved simply by writing out the graph pattern for the desired graph structure, using OPTIONAL for parts that may not exist, and making sure that the variable names match up correctly throughout the query:
SELECT ?item ?itemLabel ?parent_organization ?parent_organizationLabel WHERE {
# Find business by label
?item rdfs:label "Siemens PLM Software"@en;
wdt:P31 wd:Q4830453.
# Find English-language articles about the business
?article schema:about ?item;
schema:inLanguage "en".
# Find the business' parent organization, if one exists
OPTIONAL { ?item wdt:P749 ?parent_organization. }
# For any variable ?xxx, add variable ?xxxLabel with the label
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
That being said, you can use a subquery if you really want to:
SELECT ?item ?itemLabel ?parent_organization ?parent_organizationLabel WHERE {
{
SELECT ?item {
?item rdfs:label "Siemens PLM Software"@en;
wdt:P31 wd:Q4830453.
}
}
?article schema:about ?item;
schema:inLanguage "en".
OPTIONAL { ?item wdt:P749 ?parent_organization. }
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
This is equivalent to the first query, but runs much slower because the query optimiser is not as good at handling subqueries.

Filter by type in Wikidata

This SPARQL request looks for all cities called "Berlin" in Wikidata:
SELECT DISTINCT ?item ?itemLabel ?itemDescription WHERE {
?type (a | wdt:P279) wd:Q515. # Sub-type of city
?item wdt:P31 ?type.
?item rdfs:label "Berlin"@en.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
PROBLEM: It returns zero results.
Meanwhile, the request below correctly finds Q64 (capital and city-state of Germany), but it also returns a lot of other things called Berlin, so I want to filter on cities (then in a future phase I will order these cities by population, but that is outside the scope of this question):
SELECT DISTINCT ?item ?itemLabel ?itemDescription WHERE {
?item rdfs:label "Berlin"@en.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Note: My code for getting instances of subclasses of city (Berlin is a big city which is subclass of city) seems to work correctly, as illustrated by the results of this query.
It was a Wikidata bug.
According to Wikidata's Jura1, it was a bug in Wikidata caused by someone's experiments with "preferred rank".
Discussion at https://www.wikidata.org/wiki/Wikidata:Project_chat/Archive/2016/09#P31_inconsistency
The bug has been fixed just now.
You can only query for data that is contained in the dataset.
If you try an alternative of your query
SELECT DISTINCT ?item ?itemLabel ?itemDescription ?type1 ?type2 WHERE {
?item rdfs:label "Berlin"@en.
optional{?item rdf:type ?type1 }
optional{?item wdt:P279 ?type2 }
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
it returns no types, neither connected by rdf:type nor wdt:P279.
If you have a look at the entity of the capital and city-state Berlin, you can see that there is information about "instance of", but this property is supposed to be https://www.wikidata.org/wiki/Property:P31. None of its values links to wd:Q515, so I'm wondering where you got this idea from.
But to be honest, I don't know that much about Wikidata and to me, it's not clear why no rdf:type is used, but a common pattern for RDF datasets is to use
?s rdf:type/rdfs:subClassOf* SUPER_CLASS .
if we assume that there is rdf:type information available.
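On Wikidata, the usual counterparts of rdf:type and rdfs:subClassOf are wdt:P31 (instance of) and wdt:P279 (subclass of), so the same pattern becomes `wdt:P31/wdt:P279*`. The sketch below generates such a query for a label and a class; `build_typed_label_query` is a hypothetical helper, and whether it finds a given item still depends on the P31 and P279 statements present in the data.

```python
import json
import re


def build_typed_label_query(label, class_qid, language="en"):
    """Build a query for items with this label that are instances of the class
    or of any of its direct or indirect subclasses."""
    if not re.fullmatch(r"Q[1-9][0-9]*", class_qid):
        raise ValueError(f"not a Wikidata item ID: {class_qid!r}")
    # Assumption: a JSON-quoted string is an acceptable SPARQL literal here.
    literal = f"{json.dumps(label)}@{language}"
    lines = [
        "SELECT DISTINCT ?item ?itemLabel ?itemDescription WHERE {",
        f"  ?item rdfs:label {literal}.",
        f"  ?item wdt:P31/wdt:P279* wd:{class_qid}.",
        f'  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{language}". }}',
        "}",
    ]
    return "\n".join(lines)


print(build_typed_label_query("Berlin", "Q515"))
```

The `*` makes the subclass step optional and repeatable, so direct instances of the class match as well as instances of its subclasses at any depth.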
If you check the types wd:Q64 is an instance of
SELECT DISTINCT ?type ?typeLabel WHERE {
wd:Q64 (a | wdt:P31) ?type.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?typeLabel
None of them are City (wd:Q515) or a sub-class of it.
Looks like a data issue. Perhaps you should contact Wikidata.