SPARQL if an instance has a property, others must as well

I have a specific instance and a SPARQL query that retrieves other instances similar to that one. By similar, I mean that the other instances share at least one property and value with the specific instance, or share a class with it.
Now, I'd like to extend the query such that if the specific instance has a value for a "critical" property, then the only instances that are considered similar are those that also have that critical property (as opposed to just having at least one property and value in common).
For instance, here is some sample data in which instance1 has a value for predicate2 which is a subproperty of isCriticalPredicate.
@prefix : <http://example.org/rs#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
:instance1 a :class1.
:instance1 :predicate1 :instance3.
:instance1 :predicate2 :instance3. # (#) critical property
:instance2 a :class1.
:instance2 :predicate1 :instance3.
:instance4 :predicate2 :instance3.
:predicate1 :hasSimilarityValue 0.6.
:predicate2 rdfs:subPropertyOf :isCriticalPredicate.
:predicate2 :hasSimilarityValue 0.6.
:class1 :hasSimilarityValue 0.4.
Here is a query in which ?x is the specific instance, instance1. The query retrieves just instance4, which is correct. However, if I remove the critical property from instance1 (line labeled with #), I get no results, but should get instance2, since it has a property in common with instance1. How can I fix this?
PREFIX : <http://example.org/rs#>
select ?item (SUM(?similarity * ?importance * ?levelImportance) as ?summedSimilarity)
(group_concat(distinct ?becauseOf ; separator = " , ") as ?reason) where
{
values ?x {:instance1}
bind (4/7 as ?levelImportance)
{
values ?instanceImportance {1}
?x ?p ?instance.
?item ?p ?instance.
?p :hasSimilarityValue ?similarity
bind (?p as ?becauseOf)
bind (?instanceImportance as ?importance)
}
union
{
values ?classImportance {1}
?x a ?class.
?item a ?class.
?class :hasSimilarityValue ?similarity
bind (?class as ?becauseOf)
bind (?classImportance as ?importance)
}
filter (?x != ?item)
?x :isCriticalPredicate ?y.
?item :isCriticalPredicate ?y.
}
group by ?item

I've said it before and I'll say it again: minimal data is very helpful. From your past questions, I know that your working project has similarity values on properties and the like, but none of that really matters for the problem at hand. Here's some data that just has a few instances, property values, and one property designated as critical:
@prefix : <urn:ex:> .
:p a :criticalProperty .
:a :p :u ; #-- :a has a critical property, so
:q :v . #-- only :d can be similar to it.
:c :q :v ; #-- :c has no critical properties, so
:r :w . #-- both :a and :d can be similar to it.
:d :p :u ;
:q :v .
The trick in a query like this is to filter out the results that have the problem, not to try to select the ones that don't. Logically, those mean the same thing, but in writing the query, it's easier to think about constraint violation, and to try to filter out the results that violate the constraint. In this case, you want to filter out any results where the instance has a critical property and value but the similar instance doesn't.
prefix : <urn:ex:>
select ?i (group_concat(distinct ?j) as ?js) where {
#-- :a has a critical property, but
#-- :c does not, so these are useful
#-- starting points
values ?i { :a :c }
#-- get ?j instances that have a value
#-- in common with ?i.
?i ?property ?value .
?j ?property ?value .
#-- make sure that ?i and ?j aren't
#-- the same instance
filter (?i != ?j)
#-- make sure that there is no critical
#-- property value that ?i has that
#-- ?j does not also have
filter not exists {
?i ?criticalProperty ?criticalValue .
?criticalProperty a :criticalProperty .
filter not exists {
?j ?criticalProperty ?criticalValue .
}
}
}
group by ?i
----------------------------
| i | js |
============================
| :a | "urn:ex:d" |
| :c | "urn:ex:d urn:ex:a" |
----------------------------
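The same nested filter can be dropped into the query from the question. This is only a sketch: it assumes critical properties are marked with rdfs:subPropertyOf :isCriticalPredicate, as in the question's sample data, and that no reasoner is running, so the subproperty triple is matched explicitly. The variable names ?criticalP and ?criticalValue are made up for illustration.

```sparql
PREFIX : <http://example.org/rs#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

select ?item (SUM(?similarity * ?importance * ?levelImportance) as ?summedSimilarity)
       (group_concat(distinct ?becauseOf ; separator = " , ") as ?reason)
where {
  values ?x { :instance1 }
  bind (4/7 as ?levelImportance)
  {
    #-- common property and value
    values ?instanceImportance { 1 }
    ?x ?p ?instance .
    ?item ?p ?instance .
    ?p :hasSimilarityValue ?similarity .
    bind (?p as ?becauseOf)
    bind (?instanceImportance as ?importance)
  }
  union
  {
    #-- common class
    values ?classImportance { 1 }
    ?x a ?class .
    ?item a ?class .
    ?class :hasSimilarityValue ?similarity .
    bind (?class as ?becauseOf)
    bind (?classImportance as ?importance)
  }
  filter (?x != ?item)
  #-- reject ?item if ?x has some critical property value
  #-- that ?item does not also have
  filter not exists {
    ?x ?criticalP ?criticalValue .
    ?criticalP rdfs:subPropertyOf :isCriticalPredicate .
    filter not exists {
      ?item ?criticalP ?criticalValue .
    }
  }
}
group by ?item
```

With the question's data this should keep :instance4 and drop :instance2. Once the triple marked (#) is removed, :instance1 has no critical values, the outer filter never matches, and :instance2 should be returned.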
Related
There are some other questions that also touch on constraint satisfaction/violation that might be useful reading. While not all of these use nested filter not exists, most of them do have a pattern of filter not exists { … filter <negative condition> }.
Find lists containing ALL values in a set? (This actually uses a nested filter not exists.)
Is it possible to express a recursive definition in SPARQL?
Count values that satisfy a constraint and return 0 for those that don't satisfy it in SPARQL
SPARQL: return all intersections fulfilled by specified or equivalent classes (There's another nested filter not exists, in the last query in the answer.)


Type of returned value in SPARQL Query

Is it possible to know the type of the return values in a SPARQL query?
For example, is there a function to define the type of ?x ?price ?p
in the following query?
SELECT DISTINCT ?x ?price ?p
WHERE {
?x a :Product .
?x :price ?price .
?x ?p ?o .
}
I want to know that
typeOf(x) = resource
typeOf(?p) = property
typeOf(?price) = property target etc.
datatype(?x)
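On Virtuoso, which serves the DBpedia endpoint used below, the datatype function will tell you whether a result is a resource or a literal, and in the latter case which datatype it has. (In standard SPARQL 1.1, datatype is defined only for literals; isIRI, isBlank, and isLiteral are the portable tests for the kind of term.)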
For example, the following query on https://dbpedia.org/sparql...
SELECT DISTINCT ?x ?code ?p datatype(?x) datatype(?code) datatype(?p)
WHERE {
?x a dbo:City.
?x dbo:areaCode ?code .
?x ?p ?o .
} limit 1
...will return:
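---------------------------------------------------------------
| variable  | value                                           |
===============================================================
| x         | http://dbpedia.org/resource/Aconchi             |
| code      | "+52 623"                                       |
| p         | http://www.w3.org/1999/02/22-rdf-syntax-ns#type |
| callret-3 | http://www.w3.org/2001/XMLSchema#anyURI         |
| callret-4 | http://www.w3.org/2001/XMLSchema#string         |
| callret-5 | http://www.w3.org/2001/XMLSchema#anyURI         |
---------------------------------------------------------------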
However this will not differentiate between "resource" and "property" because a resource may be a property. What you probably mean is "individual" and "property" but even a property can be treated as an individual, for example in the triple rdfs:label rdfs:label "label".
However you can always query the rdf:type of a resource, which may give you rdf:Property, owl:DatatypeProperty or owl:ObjectProperty.
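As a rough sketch of how the two ideas combine in portable SPARQL 1.1, the query below classifies ?x with isIRI/isBlank, reports the datatype of ?price only when it is a literal, and looks up any declared rdf:type of the predicate. The namespace bound to the empty prefix and the result variable names (?xKind, ?priceKind, ?pType) are assumptions for illustration; the question does not give them.

```sparql
PREFIX : <http://example.org/>   # hypothetical namespace, not given in the question

SELECT DISTINCT ?x ?price ?p
       (IF(isIRI(?x), "IRI", IF(isBlank(?x), "blank node", "literal")) AS ?xKind)
       (IF(isLiteral(?price), str(datatype(?price)), "not a literal") AS ?priceKind)
       ?pType
WHERE {
  ?x a :Product ;
     :price ?price ;
     ?p ?o .
  # rdf:type of the predicate, if the data declares one
  # (e.g. rdf:Property, owl:DatatypeProperty, owl:ObjectProperty)
  OPTIONAL { ?p a ?pType }
}
```

?pType stays unbound for predicates that the data does not describe.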

Classes and subclasses in OWL for SPARQL queries with inference

I have a problem with OWL and SPARQL that I can't resolve. I have defined several classes, but only three matter for this question: People, Men and Women. Their definitions are the following:
<#People> a owl:Class ;
    rdfs:label "People"@en .
<#Men> a owl:Class ;
    rdfs:subClassOf <#People> ;
    rdfs:label "Men"@en .
<#Women> a owl:Class ;
    rdfs:subClassOf <#People> ;
    rdfs:label "Women"@en .
And then some data in RDF, for example:
<rdf:Description rdf:about="Registration#1">
<rdfs:label>ARCHEOLOGY GRADUATE</rdfs:label>
<ex:BranchKnowledge>ARTS AND HUMANITIES</ex:BranchKnowledge>
<ex:Degree>ARCHEOLOGY GRADUATE</ex:Degree>
<ex:Men rdf:datatype="http://www.w3.org/2001/XMLSchema#nonNegativeInteger">63</ex:Men>
<ex:Women rdf:datatype="http://www.w3.org/2001/XMLSchema#nonNegativeInteger">99</ex:Women>
<dcterms:coverage>2015/2016</dcterms:coverage>
</rdf:Description>
If I want to obtain the number of Men and Women for each record, I can get it with the following query:
SELECT ?X ?degree ?branch ?men ?women
WHERE {
?X ex:Degree ?degree .
?X ex:BranchKnowledge ?branch .
?X ex:Men ?men .
?X ex:Women ?women
}
If I now want to get the total number of Men and Women inferring that both are People, I had thought that since both are subclasses of People, I could make the following query:
SELECT ?X ?degree ?branch ?people
WHERE {
?X ex:Degree ?degree .
?X ex:BranchKnowledge ?branch .
?X ex:People ?people
}
However I don't get any results.
Have I misconstructed the relationship between classes and subclasses for what I want to do or what would be the problem? (I'm working on a Virtuoso server).
You should be aware of the fact that you're using punning, i.e. the same URI for Men and Women as OWL classes and OWL data properties. I don't understand why you're doing this. Why don't you introduce properties like numberOfMen, numberOfWomen and numberOfPeople?
Next, you're using ex:People as a property, thus, you would have to define that ex:Men and ex:Women are sub-properties of ex:People.
<#People> a owl:DatatypeProperty ;
    rdfs:label "People"@en .
<#Men> a owl:DatatypeProperty ;
    rdfs:subPropertyOf <#People> ;
    rdfs:label "Men"@en .
<#Women> a owl:DatatypeProperty ;
    rdfs:subPropertyOf <#People> ;
    rdfs:label "Women"@en .
And then you have to enable reasoning in the triple store if supported or use SPARQL 1.1 property paths.
But, more important, this would only lead to two rows, which means you have to do the aggregation in your SPARQL query by using the sum function, e.g.:
SELECT ?X ?degree ?branch (sum(?_people) as ?people)
WHERE {
?X ex:Degree ?degree .
?X ex:BranchKnowledge ?branch .
?p rdfs:subPropertyOf* ex:People .
?X ?p ?_people
}
GROUP BY ?X ?degree ?branch
Note, this query only works if there is just a single degree and branch, otherwise it would sum duplicate values.
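If only the total is needed and every record carries both counts, a simpler alternative is to add the two values directly, with no sub-properties, reasoning, or aggregation involved. This is a sketch under that assumption; records missing either ex:Men or ex:Women would not match. The ex: prefix declaration is left out because the question does not give its namespace.

```sparql
# ex: must be declared as in the asker's data; its namespace is not given in the question.
SELECT ?X ?degree ?branch (?men + ?women AS ?people)
WHERE {
  ?X ex:Degree ?degree ;
     ex:BranchKnowledge ?branch ;
     ex:Men ?men ;
     ex:Women ?women .
}
```

For the sample record above, ?people would be 63 + 99 = 162.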

SPARQL is it possible to bind two variables with their labels with a static text?

This is my query
PREFIX : <http://example.org/rs#>
select ?item (SUM(?similarity) as ?summedSimilarity)
(group_concat(distinct ?becauseOf ; separator = " , ") as ?reason) where
{
values ?x {:instance1}
{
?x ?p ?instance.
?item ?p ?instance.
?p :hasSimilarityValue ?similarity
bind (?p as ?becauseOf)
}
union
{
?x a ?class.
?item a ?class.
?class :hasSimilarityValue ?similarity
bind (?class as ?becauseOf)
}
filter (?x != ?item)
}
group by ?item
In my first bind clause, I would like to bind not just the variable ?p but also the variable ?instance, joined by some static text such as "that is why".
So the first bind should produce results of the form:
?p that is why ?instance
Is that possible in SPARQL?
Please don't worry about whether the data makes sense; it is just a query to illustrate my question.
If I understand you correctly, you're just looking for the concat function. As I've mentioned before, you should really browse through the SPARQL 1.1 standard, at least through the table of contents. You don't need to memorize it, but it will give you an idea of what things are possible, and an idea of where to look. Additionally, it's very helpful if you provide sample data that we can work with, because it makes it much clearer to figure out what you're trying to do. The phrasing of your title was not particularly clear, and the question doesn't really provide an example of what you're trying to accomplish. Only because I've seen some of your past questions did I have an idea of what you were aiming for. At any rate, here's some data:
@prefix : <urn:ex:> .
:p :hasSimilarity 0.3 .
:A :hasSimilarity 0.6 .
:a :p :b ; #-- :a is related to :b
a :A . #-- and is an :A .
:c :p :b . #-- :c is also related to :b
:d a :A . #-- :d is also an :A .
:e :p :b ; #-- :e is related to :b
a :A . #-- and is also an :A .
And here's the query and its results. You just use concat to join the str form of your variables with the appropriate strings and then bind the result to the variable.
prefix : <urn:ex:>
select ?item
(sum(?factor_) as ?factor)
(group_concat(distinct ?reason_; separator=", ") as ?reason)
{
values ?x { :a }
{ ?x ?p ?instance .
?item ?p ?instance .
?p :hasSimilarity ?factor_ .
bind(concat("has common ",str(?p)," value ",str(?instance)) as ?reason_) }
union
{ ?x a ?class.
?item a ?class.
?class :hasSimilarity ?factor_ .
bind(concat("has common class ",str(?class)) as ?reason_)
}
filter (?x != ?item)
}
group by ?item
-----------------------------------------------------------------------------------
| item | factor | reason |
===================================================================================
| :c | 0.3 | "has common urn:ex:p value urn:ex:b" |
| :d | 0.6 | "has common class urn:ex:A" |
| :e | 0.9 | "has common urn:ex:p value urn:ex:b, has common class urn:ex:A" |
-----------------------------------------------------------------------------------
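For the exact wording asked for in the question ("?p that is why ?instance"), the same concat call applies. A minimal sketch against the sample data above, without the class branch or the grouping:

```sparql
prefix : <urn:ex:>

select ?item ?reason {
  values ?x { :a }
  ?x ?p ?instance .
  ?item ?p ?instance .
  filter (?x != ?item)
  #-- str() turns the IRIs into plain strings so concat can join them
  bind(concat(str(?p), " that is why ", str(?instance)) as ?reason)
}
```

Because ?p is unconstrained here, rdf:type also counts as a shared property, so rows for the common class :A appear alongside the rows for :p.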

SPARQL: query for a complete list

I have such a query:
CONSTRUCT {
?p a :IndContainer .
?p :contains ?ind .
} WHERE{
:ClassContainer_1 :contains ?class .
?ind a ?class .
BIND (IRI(...) AS ?p) .
}
An individual ClassContainer_1 relates to some classes. I get these classes and try to find individuals of them. Then I try to create an IndContainer that stores the individuals found (the dots in IRI(...) are only a simplification). So, I want to:
Create an individual of IndContainer only when individuals have been found for all bindings of ?class;
Create individuals of IndContainer for all possible sets of individuals from ?ind (i.e. when some ?class has a number of individuals).
Is it possible to create such a SPARQL query? Or it is necessary to use some rule engine?
EDIT (add illustration):
Positive example. Have:
test:ClassContainer_1
rdf:type test:ClassContainer ;
test:contains test:Type1 ;
test:contains test:Type2 ;
.
test:Type1_1
rdf:type test:Type1 ;
.
test:Type1_2
rdf:type test:Type1 ;
.
test:Type2_1
rdf:type test:Type2 ;
.
Want to receive:
test:IndContainer_1
rdf:type test:IndContainer ;
test:contains test:Type1_1 ;
test:contains test:Type2_1 ;
.
test:IndContainer_2
rdf:type test:IndContainer ;
test:contains test:Type1_2 ;
test:contains test:Type2_1 ;
.
Negative example: the same as the positive one, except that there are no individuals of class Type2, so no individuals of IndContainer should be generated.
EDIT 2 (problem essence):
We may look at this problem as one of composing combinations. Each combination has two positions (in my example). The number of positions is determined by the number of classes each ClassContainer depends on. Each position must be filled with one individual of the class that corresponds to that position. So in my example the first position must be filled with one individual of class Type1 and the second with one individual of class Type2 (the order does not matter). We have two individuals of the first class and one individual of the second class. To get the number of combinations we can use the rule of product from combinatorics: 2*1 = 2, i.e. {Type1_1,Type2_1} is the first combination and {Type1_2,Type2_1} is the second. An IndContainer individual must be generated for each combination.
If I understand your question correctly, you want a "container" for each class that is contained in a "class container" that contains the individuals that belong to that class. That's not too hard to do, as long as you can construct the IRI of the container from the IRI of the class. Here's some sample data with two classes, A and B, and a few instances (some of just A, some of just B, and some of A and B):
@prefix : <urn:ex:> .
:container a :ClassContainer ;
:contains :A, :B .
:w a :A . # an :A
:x a :A . # another :A
:y a :B . # a :B
:z a :A, :B . # both an :A and a :B
Your query is already pretty close. Here's one that works, along with its result:
prefix : <urn:ex:>
construct {
?indContainer a :IndContainer ;
:contains ?ind .
}
where {
:container a :ClassContainer ;
:contains ?class .
?ind a ?class .
bind(IRI(concat(str(?class),"-container")) as ?indContainer)
}
@prefix : <urn:ex:> .
:B-container a :IndContainer ;
:contains :y , :z .
:A-container a :IndContainer ;
:contains :w , :x , :z .
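The first requirement in the question (generate containers only when every contained class has at least one individual) can be layered on top of this with a nested filter not exists. The following is a sketch under the same assumptions as the query above, including the "-container" IRI scheme; ?otherClass and ?otherInd are illustrative names. It does not attempt the second requirement of one container per combination of individuals.

```sparql
prefix : <urn:ex:>

construct {
  ?indContainer a :IndContainer ;
                :contains ?ind .
}
where {
  :container a :ClassContainer ;
             :contains ?class .
  ?ind a ?class .
  #-- produce nothing at all if some contained class
  #-- has no individuals
  filter not exists {
    :container :contains ?otherClass .
    filter not exists { ?otherInd a ?otherClass }
  }
  bind(IRI(concat(str(?class),"-container")) as ?indContainer)
}
```

On the sample data both :A and :B have instances, so the result should match the one shown above; removing :y and :z's membership in :B would leave :B empty and the query should then construct no triples.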

retrieving most specific classes of instances

Is it possible to get a definition of a resource (from DBpedia) with a SPARQL query? I want something like the TBox and ABox shown in (Conceptual) Clustering methods for the Semantic Web: issues and applications (slides 10–11). For example, for the DBpedia resource Stephen King, I would like to have:
Stephen_King : Person ⊓ Writer ⊓ Male ⊓ … (most specific classes)
You can use a query like the following to ask for the classes of which Stephen King is an instance which have no subclasses of which Stephen King is also an instance. This seems to align well with the idea of “most specific classes.” However, since (as far as I know) there's no reasoner attached to the DBpedia SPARQL endpoint, there may be subclass relationships that could be inferred but which aren't explicitly present in the data.
select distinct ?type where {
dbr:Stephen_King a ?type .
filter not exists {
?subtype ^a dbr:Stephen_King ;
rdfs:subClassOf ?type .
}
}
SPARQL results
Actually, since every class is an rdfs:subClassOf itself, you might want to add another line to that query to exclude the case where ?subtype and ?type are the same:
select distinct ?type where {
dbr:Stephen_King a ?type .
filter not exists {
?subtype ^a dbr:Stephen_King ;
rdfs:subClassOf ?type .
filter ( ?subtype != ?type )
}
}
SPARQL results
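Since no reasoner is attached, one way to catch subclass relationships that hold only through a chain of rdfs:subClassOf triples is to use a property path in the inner pattern. This is a sketch rather than a tested replacement: whether the endpoint evaluates rdfs:subClassOf+ efficiently is an assumption, and the prefixes are declared explicitly even though the DBpedia endpoint predefines them.

```sparql
PREFIX dbr:  <http://dbpedia.org/resource/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

select distinct ?type where {
  dbr:Stephen_King a ?type .
  filter not exists {
    dbr:Stephen_King a ?subtype .
    #-- one or more subClassOf steps, so intermediate classes
    #-- need not be asserted as types of the instance
    ?subtype rdfs:subClassOf+ ?type .
    filter ( ?subtype != ?type )
  }
}
```

This still only follows subclass triples that are explicitly present in the data.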
If you actually want a result string like the one shown in those slides, you could use values to bind a variable to dbr:Stephen_King, and then use some grouping and string concatenation to get something nicer looking (sort of):
select
(concat( str(?person), " =\n", group_concat(str(?type); separator=" AND\n")) as ?sentence)
where {
values ?person { dbr:Stephen_King }
?type ^a ?person .
filter not exists {
?subtype ^a ?person ;
rdfs:subClassOf ?type .
filter ( ?subtype != ?type )
}
}
group by ?person
SPARQL results
http://dbpedia.org/resource/Stephen_King =
http://dbpedia.org/class/yago/AuthorsOfBooksAboutWritingFiction AND
http://dbpedia.org/ontology/Writer AND
http://schema.org/Person AND
http://xmlns.com/foaf/0.1/Person AND
http://dbpedia.org/class/yago/AmericanSchoolteachers AND
http://dbpedia.org/class/yago/LivingPeople AND
http://dbpedia.org/class/yago/PeopleFromBangor,Maine AND
http://dbpedia.org/class/yago/PeopleFromPortland,Maine AND
http://dbpedia.org/class/yago/PeopleFromSarasota,Florida AND
http://dbpedia.org/class/yago/PeopleSelf-identifyingAsAlcoholics AND
http://umbel.org/umbel/rc/Artist AND
http://umbel.org/umbel/rc/Writer AND
http://dbpedia.org/class/yago/20th-centuryNovelists AND
http://dbpedia.org/class/yago/21st-centuryNovelists AND
http://dbpedia.org/class/yago/AmericanHorrorWriters AND
http://dbpedia.org/class/yago/AmericanNovelists AND
http://dbpedia.org/class/yago/AmericanShortStoryWriters AND
http://dbpedia.org/class/yago/CthulhuMythosWriters AND
http://dbpedia.org/class/yago/HorrorWriters AND
http://dbpedia.org/class/yago/WritersFromMaine AND
http://dbpedia.org/class/yago/PeopleFromDurham,Maine AND
http://dbpedia.org/class/yago/PeopleFromLisbon,Maine AND
http://dbpedia.org/class/yago/PostmodernWriters