Multiple SELECT query for getting all children of a given root - sparql

I want to define the root categories corresponding to interests of users. Then I need to return all other potential interests under given root directory.
I tried the following query, but it looks like it enters into a loop (the query is executing internally).
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.myweb.com/myontology.owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?user ?othercat
WHERE
{
  ?othercat rdfs:subClassOf ?root .
  {
    SELECT ?user ?retailcat ?root
    WHERE {
      ?user rdf:type owl:User .
      ?user owl:hasUserProfile ?userprofile .
      ?userprofile rdf:type owl:UserProfile .
      ?userprofile owl:interestedIn ?retailcat .
      ?entity rdf:type ?type .
      ?type rdfs:subClassOf* ?retailcat .
      ?retailcat rdfs:subClassOf ?root .
    }
  }
}
Indeed, when I execute the subquery on its own, it works fine, but it returns only the current interests of a user, without the other child concepts of the same root.
How can I solve this issue?

"I tried the following query, but it looks like it enters into a loop (the query is executing internally)."
There shouldn't be a way for a SPARQL query to enter into a loop. Subqueries are executed first, and then the enclosing query is executed. There's no way to re-execute the inner part or anything like that. (Of course, a query could be expensive and take a long time, but that's not a loop, and is still bounded, in principle.)
As an aside, using owl: as prefix for something other than the standard OWL namespace is somewhat confusing, and likely to mislead other developers when they see your query. There's nothing incorrect about it per se, but you might want to consider using a different prefix for your namespace.
You do have one part of your query that could make things rather expensive. You have the pattern
?entity rdf:type ?type .
?type rdfs:subClassOf* ?retailcat .
where ?entity isn't connected to anything else, and isn't used anywhere else. That means that you'll have a subquery solution for every possible value of ?entity, and that just means you're multiplying the number of results by the number of possible values of ?entity.
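To make the cost concrete, here is a minimal Python sketch (plain Python, not SPARQL, and not how any particular engine evaluates queries). The users, categories, and entities are made up for illustration. It shows that a pattern whose variable is never selected still produces one solution per match, so the projected rows are only duplicates.

```python
# Hypothetical data: each user's category of interest and, per category,
# the entities whose type falls under that category.
interests = [("alice", "Mustang"), ("bob", "SUV")]
entities_under = {
    "Mustang": ["car1", "car2", "car3"],
    "SUV": ["car4", "car5"],
}

# Without the ?entity pattern: one solution per (user, category) pair.
without_entity = [(user, cat) for user, cat in interests]

# With the ?entity pattern: one solution per matching entity, even though
# ?entity is never selected, so the projected rows repeat.
with_entity = [(user, cat) for user, cat in interests for _ in entities_under[cat]]

print(len(without_entity))    # 2
print(len(with_entity))       # 5
print(len(set(with_entity)))  # 2
```

With realistic data the multiplier is the number of matching entities, which can be very large.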
If I understand your query, you're trying to go from a user's categories of interest, up the category tree to some root concepts and then find other categories under that root. You don't actually need a subquery for that; you can do it with a non-nested query:
select ?user ?othercat {
?user rdf:type owl:User .
?user owl:hasUserProfile ?userprofile .
?userprofile rdf:type owl:UserProfile .
?userprofile owl:interestedIn ?retailcat .
?retailcat rdfs:subClassOf ?root .
?othercat rdfs:subClassOf ?root .
}
That will find values of ?othercat that are siblings of ?retailcat, along with ?retailcat itself. If you want to avoid ?retailcat, you can add
filter(?othercat != ?retailcat)
but that shouldn't really impact performance much; that's just one result to filter out.
The only other factor that you might want to consider is that you're not really finding a "root" of the category tree with rdfs:subClassOf; you're just going up one level. E.g., if your category tree looks like
Automobile
  SUV
  SportsCar
    Corvette
    Mustang
and a user is interested in Mustang, then you'll go up to SportsCar and find Corvette, but you won't be going any farther up the tree. (If you have inference available, you may actually go farther up the tree, but I'm assuming for the moment that you don't.) To follow the subclass links up the tree, you can add * to the path to follow the chain:
?retailcat rdfs:subClassOf* ?root .
?othercat rdfs:subClassOf ?root .
Then you'd get all the classes in the tree (except the very top level one, Automobile).
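The effect of adding * can be traced by hand on the example tree. The following is a rough Python sketch, assuming each class has a single parent and that no inference is applied; the function names are invented for illustration.

```python
# Hypothetical rdfs:subClassOf links for the example tree (child -> parent).
parent = {
    "SUV": "Automobile",
    "SportsCar": "Automobile",
    "Corvette": "SportsCar",
    "Mustang": "SportsCar",
}

def ancestors_or_self(cls):
    """Mimic rdfs:subClassOf*: the class itself plus every class above it."""
    found = [cls]
    while found[-1] in parent:
        found.append(parent[found[-1]])
    return found

def other_categories(interest):
    """Direct subclasses of any class reachable from `interest` via subClassOf*."""
    roots = set(ancestors_or_self(interest))
    return sorted(child for child, p in parent.items() if p in roots)

print(ancestors_or_self("Mustang"))  # ['Mustang', 'SportsCar', 'Automobile']
print(other_categories("Mustang"))   # ['Corvette', 'Mustang', 'SUV', 'SportsCar']
```

Automobile is missing from the second list because nothing in the data declares it a subclass of another class.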


using bind concat in construct query

I have the following query
CONSTRUCT{
?entity a something;
a label ?label .
}
WHERE
{
?entity a something;
a label ?label .
BIND(CONCAT(STR( ?label ), " | SOME ADDITIONAL TEXT I WOULD LIKE TO APPEND MANUALLY") ) AS ?label ) .
}
I simply want to concatenate some text with ?label, however when running the query I get the following error:
BIND clause alias '?label' was previously used
I only want to return a single instance of ?label hence, I defined it in the construct clause.
The error message is accurate, but it is only the first of many you will get with this query. The usual advice applies: take a look at some SPARQL learning resources to at least understand the basics of triple-based graph pattern matching. Here are a couple of hints on what to look for. CONSTRUCT isn't a bad place to start, and the following should almost do what I think you intend:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex: <http://example.org/example#>
CONSTRUCT {
  ?entity rdfs:label ?label .
}
WHERE
{
  ?entity a ex:something ;
          rdfs:label ?oldlabel .
  BIND(CONCAT(STR(?oldlabel), " | SOME ADDITIONAL TEXT I WOULD LIKE TO APPEND MANUALLY") AS ?label) .
}
There are quite a few differences in that query, so take a look to see if it accurately does what you want. One hint is the syntactic difference between using '.' and ';' to separate the triple patterns. Another is that each position in a triple pattern holds either an IRI, written as a prefixed name in the example, or a variable, prefixed by '?' (objects may also be literals). Bare words such as 'label' and 'something' are none of these, so neither is valid.
I say "almost" because CONSTRUCT only returns a set of triples. To modify the labels, which I think is the intent, you need to use SPARQL Update, i.e.:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex: <http://example.org/example#>
DELETE {
?entity rdfs:label ?oldlabel .
}
INSERT{
?entity rdfs:label ?label .
}
WHERE
{
?entity a ex:something .
?entity rdfs:label ?oldlabel .
BIND(CONCAT(STR( ?oldlabel ), " | SOME ADDITIONAL TEXT I WOULD LIKE TO APPEND MANUALLY") AS ?label ) .
}
Note how the triple pattern finds matches for ?oldlabel and deletes them, inserting the newly bound ?label instead. This query assumes a default graph is defined that holds both the original data and the target for updates. If not then the graph needs to be specified using WITH or GRAPH. (Also included another hint on the syntactic difference between using '.' and ';' to separate triple patterns.)
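As a rough illustration of what the DELETE/INSERT does, here is a Python sketch over a small in-memory set of triples. The data (ex:a, ex:b and their labels) is hypothetical, and a real store evaluates the update differently; the point is only the order of steps: match first, then remove the old label and add the new one.

```python
SUFFIX = " | SOME ADDITIONAL TEXT I WOULD LIKE TO APPEND MANUALLY"

# Hypothetical graph as a set of (subject, predicate, object) triples.
graph = {
    ("ex:a", "rdf:type", "ex:something"),
    ("ex:a", "rdfs:label", "Alpha"),
    ("ex:b", "rdf:type", "ex:other"),
    ("ex:b", "rdfs:label", "Beta"),
}

# WHERE: collect (entity, oldlabel) pairs for entities of type ex:something.
matches = [
    (s, o)
    for (s, p, o) in graph
    if p == "rdfs:label" and (s, "rdf:type", "ex:something") in graph
]

# DELETE the old label triples, then INSERT the concatenated ones.
for entity, old_label in matches:
    graph.discard((entity, "rdfs:label", old_label))
    graph.add((entity, "rdfs:label", old_label + SUFFIX))

print(sorted(o for (s, p, o) in graph if p == "rdfs:label"))
```

Only ex:a is relabelled; ex:b keeps its label because it is not of type ex:something.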

Removing unwanted superclass answers in SPARQL

I have an OWL file that includes a taxonomic hierarchy, and I want to write a query whose answer includes each individual and its immediate taxonomic parent. Here's an example (the full query is rather messier).
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix : <urn:ex:> .
:fido rdf:type :Dog .
:Dog rdfs:subClassOf :Mammal .
:Mammal rdfs:subClassOf :Vertebrate .
:Vertebrate rdfs:subClassOf :Animal .
:fido :hasToy :bone .
:kitty rdf:type :Cat .
:Cat rdfs:subClassOf :Mammal .
:kitty :hasToy :catnipMouse .
And this query does what I want.
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX : <urn:ex:>
SELECT ?individual ?type
WHERE {
?individual :hasToy :bone .
?individual rdf:type ?type .
}
The problem is that I'd rather use a reasoned-over version of the OWL file, which unsurprisingly includes additional statements:
:fido rdf:type :Mammal .
:fido rdf:type :Vertebrate .
:fido rdf:type :Animal .
:kitty rdf:type :Mammal .
:kitty rdf:type :Vertebrate .
:kitty rdf:type :Animal .
And now the query results in additional answers about Fido being a Mammal, etc. I could just give up on using the reasoned version of the file, or, since the SPARQL queries are called from java, I could do a bunch of additional queries to find the least inclusive type that appears. My question is whether there is a reasonable pure SPARQL solution to only returning the Dog solution.
A generic solution is to make sure you ask for the direct type only. A class C is the direct type of an instance X if:
1. X is of type C, and
2. there is no C' such that:
   - X is of type C'
   - C' is a subclass of C
   - C' is not equal to C
(That last condition is necessary, by the way, because in RDF/OWL the subclass relation is reflexive: every class is a subclass of itself.)
In SPARQL, this becomes something like this:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX : <urn:ex:>
SELECT ?individual ?type
WHERE {
?individual :hasToy :bone .
?individual a ?type .
FILTER NOT EXISTS { ?individual a ?other .
?other rdfs:subClassOf ?type .
FILTER(?other != ?type)
}
}
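The same direct-type rule can be checked by hand on the reasoned Fido/Kitty data. This is only a Python sketch of the logic, with the inferred types and subclass pairs written out as assumptions; it is not a substitute for running the query against your store.

```python
# Hypothetical reasoned data: every inferred rdf:type of each individual.
types = {
    "fido": {"Dog", "Mammal", "Vertebrate", "Animal"},
    "kitty": {"Cat", "Mammal", "Vertebrate", "Animal"},
}

# Reasoned rdfs:subClassOf pairs, written as (subclass, superclass).
subclass_of = {
    ("Dog", "Mammal"), ("Dog", "Vertebrate"), ("Dog", "Animal"),
    ("Cat", "Mammal"), ("Cat", "Vertebrate"), ("Cat", "Animal"),
    ("Mammal", "Vertebrate"), ("Mammal", "Animal"),
    ("Vertebrate", "Animal"),
}
# A reasoner also makes every class a subclass of itself.
subclass_of |= {(c, c) for pair in subclass_of for c in pair}

def direct_types(individual):
    """Keep a type only if no *other* type of the individual is a subclass of it."""
    own = types[individual]
    return {
        t for t in own
        if not any((o, t) in subclass_of and o != t for o in own)
    }

print(direct_types("fido"))   # {'Dog'}
print(direct_types("kitty"))  # {'Cat'}
```

Dropping the `o != t` check would reject every type, since each class is a subclass of itself; that is the role of the FILTER in the query.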
Depending on which API/triplestore/library you use to execute these queries, there may also be other, tool-specific solutions. For example, the Sesame API (disclosure: I am on the Sesame dev team) has the option to disable reasoning for the purpose of a single query:
// Prepare the query, then exclude inferred statements for this query only.
TupleQuery query = conn.prepareTupleQuery(QueryLanguage.SPARQL, "SELECT ...");
query.setIncludeInferred(false);
TupleQueryResult result = query.evaluate();
Sesame also offers an optional additional inferencer (called the 'direct type inferencer') which introduces additional 'virtual' properties you can query, such as sesame:directType, sesame:directSubClassOf, etc. Other tools will undoubtedly have similar options.

Sparql to recover the Type of a DBpedia resource

I need a SPARQL query to recover the type of a specific DBpedia resource. E.g.:
pt.DBpedia resource: http://pt.dbpedia.org/resource/Argentina
Expected type: Country (as can be seen at http://pt.dbpedia.org/page/Argentina)
Using pt.DBpedia Sparql Virtuoso Interface (http://pt.dbpedia.org/sparql) I have the query below:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select ?l ?t where {
?l rdfs:label "Argentina"@pt .
?l rdf:type ?t .
}
But it does not return anything; it just prints the variable names. See the Virtuoso answer.
Actually, I do not need to return the label (?l) either.
Can anyone fix it, or help me write the correct query?
http in graph name
I'm not sure how you generated your query string, but when I copy and paste your query into the endpoint and run it, I get results, and the resulting URL looks like:
http://pt.dbpedia.org/sparql?default-graph-uri=http%3A%2F%2Fpt.dbpedia.org&sho...
However, the link in your question is:
http://pt.dbpedia.org/sparql?default-graph-uri=pt.dbpedia.org%2F&should-sponge...
If you look carefully, you'll see that the default-graph-uri parameters are different:
yours: pt.dbpedia.org%2F
mine: http%3A%2F%2Fpt.dbpedia.org
I'm not sure how you got a URL like the one you did, but it's not right; the default-graph-uri needs to be http://pt.dbpedia.org, not pt.dbpedia.org/.
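If you build the request URL yourself, letting a library do the percent-encoding avoids this kind of mistake. A small Python sketch using only the standard library follows; it builds just the default-graph-uri parameter, not the full request.

```python
from urllib.parse import urlencode

# The graph name must be the full IRI, scheme included.
correct = urlencode({"default-graph-uri": "http://pt.dbpedia.org"})
# Dropping the scheme and adding a trailing slash gives a different graph name.
wrong = urlencode({"default-graph-uri": "pt.dbpedia.org/"})

print(correct)  # default-graph-uri=http%3A%2F%2Fpt.dbpedia.org
print(wrong)    # default-graph-uri=pt.dbpedia.org%2F
```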
The query is fine
When I run the query you've provided at the endpoint you've linked to, I get the results that I'd expect. It's worth noting that the label here is the literal "Argentina"@pt, and that what you've called ?l is the individual, not the label. The individual ?l has the label "Argentina"@pt.
We can simplify your query a bit, using ?i instead of ?l (to suggest individual):
select ?i ?type where {
?i rdfs:label "Argentina"@pt ;
a ?type .
}
When I run this at the Portuguese endpoint, I get these results:
If you don't want the individual in the results, you don't have to select it:
select ?type where {
?i rdfs:label "Argentina"@pt ;
a ?type .
}
or even:
select ?type where {
[ rdfs:label "Argentina"@pt ; a ?type ]
}
If you know the identifier of the resource, and don't need to retrieve it by using its label, you can even just do:
select ?type where {
dbpedia-pt:Argentina a ?type
}
type
==========================================
http://www.w3.org/2002/07/owl#Thing
http://www.opengis.net/gml/_Feature
http://dbpedia.org/ontology/Place
http://dbpedia.org/ontology/PopulatedPlace
http://dbpedia.org/ontology/Country
http://schema.org/Place
http://schema.org/Country

Director Filmography - all queries are returning empty

I'm trying to get a director's filmography from DBpedia.
Both of the queries below (and other attempts not shown) return empty sets. The first query:
PREFIX d: <http://dbpedia.org/ontology/>
SELECT ?filmName WHERE {
?film d:director :woody_allen .
?film rdfs:label ?filmName .
}
Or (this is from) :
PREFIX m: <http://data.linkedmdb.org/resource/movie/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?filmTitle WHERE {
?film rdfs:label ?filmTitle.
?film m:director ?dir.
?dir m:director_name "Sofia Coppola".
}
Not sure what would be the problem with such simple queries. Any ideas here?
The problem with your first query is the use of :woody_allen. Besides the fact that you haven't defined the default prefix (so the query is technically illegal SPARQL), the term doesn't appear in the data as written.
Try rewriting your query like so:
PREFIX d: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?filmName WHERE {
  ?film d:director <http://dbpedia.org/resource/Woody_Allen> .
  ?film rdfs:label ?filmName .
}
The above does give results.
As for your second query, DBpedia does not use the Linked MDB ontologies, so that query can't match anything.

How to match exact string literals in SPARQL?

I have this query. It matches anything which has "South" in its name. But I only want the one whose foaf:name is exactly "South".
SELECT Distinct ?TypeLabel
WHERE
{
?a foaf:name "South" .
?a rdf:type ?Type .
?Type rdfs:label ?TypeLabel .
}
A bit late, but anyway, I think this is what you're looking for:
SELECT Distinct ?TypeLabel Where {
?a foaf:name ?name .
?a rdf:type ?Type .
?Type rdfs:label ?TypeLabel .
FILTER (?name="South"^^xsd:string)
}
You can use FILTER with the xsd datatypes to restrict the results.
Hope this helps.
Cheers!
(Breaking out of comments for this)
Data issues
The issue is the data, not your query. If you use the following query:
SELECT DISTINCT ?a
WHERE {
?a foaf:name "Imran Khan" .
}
You find (as you say) "Imran Khan Niazy".
But looking at the dbpedia entry for Imran Khan, you'll see both:
foaf:name "Imran Khan Niazy"
foaf:name "Imran Khan"
This is because RDF allows repeated use of properties.
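A tiny Python sketch of the same effect, with made-up resource identifiers: an exact match on one name still returns a resource that also carries a longer name, while a partial string matches nothing.

```python
# Hypothetical data: one resource with two foaf:name values.
names = {
    "dbr:Imran_Khan": ["Imran Khan Niazy", "Imran Khan"],
    "dbr:Someone_Else": ["Someone Else"],
}

def exact_match(wanted):
    """Resources having at least one name exactly equal to `wanted`."""
    return [res for res, values in names.items() if wanted in values]

print(exact_match("Imran Khan"))  # ['dbr:Imran_Khan']
print(exact_match("Imran"))       # []
```

So a result that displays the longer name does not mean the match was a substring match; the shorter name is simply another value of the same property.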
Cause
"South" had the same issue (album, artist, and oddly 'South Luton'). These are cases where there are both familiar names ("Imran Khan", "South"), and more precise names ("Imran Khan Niazy", "South (album)") for the purposes of correctness or disambiguation.
Resolution
If you want a more precise match try adding a type (e.g. http://dbpedia.org/ontology/MusicalWork for the album).
Beware
Be aware that DBpedia derives from Wikipedia, and the extraction process isn't perfect. This is an area alive with wonky data, so don't assume your query has gone wrong.
That query should match exactly the literal South, not literals that merely contain South as a substring. For partial matches you'd use FILTER with e.g. REGEX(). Your query engine is broken in this sense. Which query engine are you working with?