Require individual's property values to be a superset of another's? - semantic-web

I have defined an ontology with the following classes, properties, and individuals with object property assertions:
Class: Employee > Individuals: { EmployeeA }
Class: Skill > Individuals: { Skill1, Skill2, Skill3 }
Class: Job > Individuals: { DBA }
hasSkill > Domain (Employee) and Range (Skill)
isAskillBy > Domain (Skill) and Range (Employee) <inverse of hasSkill>
requireSkill > Domain (Job) and Range (Skill)
isAskillrequiredBy > Domain (Skill) and Range (Job) <inverse of requireSkill>
Individual: EmployeeA, object property assertion: hasSkill Skill1
hasSkill Skill2
, types : hasSkill only ({Skill1,Skill2}) <to close OWA
, Negative object property assertion: hasSkill Skill3
Individual: DBA, object property assertion: requireSkill Skill1
requireSkill Skill2
requireSkill Skill3
, types : requireSkill only ({Skill1,Skill2, Skill3}) <to close OWA
To classify whether an Employee is qualified for a job (in this case, the DBA position), I created the class Fit and made it equivalent to:
Employee and (hasSkill only (isAskillrequiredBy value DBA))
When I run a reasoner in Protege, the reasoner classifies EmployeeA under the class Fit, but with the closure axioms to address the Open World Assumption (OWA), EmployeeA should not be classified as Fit, since he does not have all three skills needed for the DBA position.

First, recall what the "universal" quantification in OWL means. The class expression
p only C
is the individuals x such that if x is related to y by property p, then y must be a C. That is, p(x,y) implies C(y). Your query is returning the correct results, but the query doesn't mean exactly what you want it to mean. The query
Employee and (hasSkill only (isAskillrequiredBy value DBA))
says that something must be an Employee, and if the employee has a skill, then the skill must be one that is required by the DBA position. EmployeeA certainly meets that definition, since the skills that EmployeeA has are Skill1 and Skill2, both of which are required for the DBA position.
The problem in this query is the only. It will manifest in two different ways: (i) if someone is qualified for DBA position (i.e., has all the necessary skills), but has additional skills, then you won't retrieve them, since they have skills that aren't required for the DBA position (but maybe you don't want overqualified people); and (ii) it retrieves people all of whose skills are required for the DBA position, but doesn't require that the people actually have all the skills needed for the DBA position.
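To make the two readings concrete, here is a minimal Python sketch (not OWL, and not what a reasoner does internally). It treats the skill lists as closed sets, which is the assumption the closure axioms are meant to provide. EmployeeC and Skill4 are hypothetical additions to show the overqualified case.

```python
# Skills required by the DBA position, as asserted in the question.
required = {"Skill1", "Skill2", "Skill3"}

# EmployeeA and EmployeeB follow the question and the answer;
# EmployeeC and Skill4 are hypothetical, added to show case (i).
has_skill = {
    "EmployeeA": {"Skill1", "Skill2"},
    "EmployeeB": {"Skill1", "Skill2", "Skill3"},
    "EmployeeC": {"Skill1", "Skill2", "Skill3", "Skill4"},
}

def only_required(skills, required):
    """Reading of 'hasSkill only (...)': every skill held is a required one."""
    return skills <= required

def covers_required(skills, required):
    """Intended reading: every required skill is held."""
    return required <= skills

for name in sorted(has_skill):
    skills = has_skill[name]
    print(name, only_required(skills, required), covers_required(skills, required))
```

EmployeeA passes the first check but fails the second, which is the behaviour reported in the question, while the hypothetical EmployeeC fails the first check despite having every required skill.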
What you actually want to ask for is individuals that have all the skills necessary for the DBA position. What you'd need to check is whether an individual lacks any skills that are required for the DBA. The skills required for the DBA position can be found with:
(inverse requiresSkill) value DBA
The skills not needed for the DBA are simply the negation of that:
not ((inverse requiresSkill) value DBA)
What you'd like to say is that any skills that an individual doesn't have must be from that class. Currently, you don't have a property that connects an individual to the skills that they don't have. If you did, then you could define your "Fit" class as
Employee and (lacksSkill only (not ((inverse requiresSkill) value DBA)))
Now, the question is whether you can define a property lacksSkill that's true exactly when an employee doesn't have a skill. I'm not sure whether you can do exactly that, but you can define
lacksSkill disjointProperty hasSkill
which will mean that if hasSkill(x,y) is true, then lacksSkill(x,y) must be false. It's not exactly the same as a negation, because they can both be false at the same time, but they can't both be true.
That's enough to make this query work, and it removes the need for some of the closure axioms. You'll still need to close what the position requires, but you don't need to close the skills of the employees. That is, you don't need to say, e.g., that EmployeeA has only the skills Skill1, Skill2. I created an ontology in Protege that you can use to see this in action. It has an EmployeeA who has Skill1 and Skill2 (not enough to qualify for the job), and an EmployeeB who has all three skills, which is enough to qualify.
@prefix : <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix example: <http://example.org/> .
example:Skill a owl:Class .
example:Position a owl:Class .
example:Skill3 a owl:NamedIndividual , example:Skill .
example:hasSkill a owl:ObjectProperty ;
owl:propertyDisjointWith example:lacksSkill .
example:requiresSkill
a owl:ObjectProperty .
example: a owl:Ontology .
example:DBA a owl:NamedIndividual , example:Position ;
a [ a owl:Restriction ;
owl:allValuesFrom [ a owl:Class ;
owl:oneOf ( example:Skill3 example:Skill2 example:Skill1 )
] ;
owl:onProperty example:requiresSkill
] ;
example:requiresSkill example:Skill1 , example:Skill2 , example:Skill3 .
example:Skill2 a owl:NamedIndividual , example:Skill .
example:Employee a owl:Class .
example:EmployeeA a owl:NamedIndividual , example:Employee ;
example:hasSkill example:Skill1 , example:Skill2 .
example:Skill1 a owl:NamedIndividual , example:Skill .
example:lacksSkill a owl:ObjectProperty .
example:EmployeeB a owl:NamedIndividual , example:Employee ;
example:hasSkill example:Skill1 , example:Skill2 , example:Skill3 .

Related

SPARQL "Different Than"

Background
I have a graph database where nodes represent individual people and the edges are predicates hasMother and hasFather. I'd like to identify all of the nodes that share a mother, but don't share the same father.
Attempt
There are a few SPARQL terms that immediately pop into my mind when thinking about this problem. Namely Filter, != and NOT EXISTS.
I first set out and defined some sets...
Let M be the set of all winiks that have mother id, mother_id.
Let F be the set of all winiks that have father id, father_id.
The following query returns a graph of M ∩ F.
PREFIX ex: <https://ex.com#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
CONSTRUCT WHERE {
?winik_id ex:hasMother ?mother_id.
?winik_id ex:hasFather ?father_id.
}
My hiccup is writing the exclusion part (it's actually hard for me to verbalize the exact part I'm stuck on), specifically how to deal with how general the query is (across all identifiers).
This should return pairs of people that have the same mother but different father.
SELECT ?p1 ?p2 WHERE {
?p1 ex:hasMother ?m .
?p1 ex:hasFather ?f .
?p2 ex:hasMother ?m .
?p2 ex:hasFather ?f2 .
FILTER (?f != ?f2)
}
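The same join-then-filter logic can be sketched in plain Python. The names and parent assignments below are hypothetical sample data, not taken from the question. One difference to be aware of: the SPARQL query returns each pair in both orders (?p1/?p2 and swapped), while this sketch reports each unordered pair once.

```python
from itertools import combinations

# Hypothetical sample data: person -> (mother, father).
parents = {
    "ann": ("mia", "tom"),
    "bob": ("mia", "tom"),
    "cat": ("mia", "sam"),
    "dan": ("zoe", "sam"),
}

def half_siblings(parents):
    """Return pairs that share a mother but have different fathers."""
    pairs = []
    for p1, p2 in combinations(sorted(parents), 2):
        m1, f1 = parents[p1]
        m2, f2 = parents[p2]
        if m1 == m2 and f1 != f2:  # same ?m, and the FILTER (?f != ?f2) condition
            pairs.append((p1, p2))
    return pairs

print(half_siblings(parents))
```

If the duplicated orderings are unwanted in SPARQL as well, an extra condition such as FILTER (STR(?p1) < STR(?p2)) should keep one row per pair, assuming the people are identified by IRIs.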

extract labels with different languages out of a bound variable in sparql

As the title suggests, I have preferred labels in English and German. I am trying to extract each label by its language and bind it to a (newly created) variable.
SELECT ?copyC ?enLabel ?defaultLabel
WHERE {
?targetC skos:prefLabel ?prefUri .
?prefUri skosxl:literalForm ?b
# handle languages - english and german
bind(if(langMatches(lang(?b),"en"),?b,?_) as ?enLabel)
bind(if(langMatches(lang(?b),"de"),?b, ?_) as ?defaultLabel)
####
.
# create some uris
BIND(IRI(CONCAT("http://example.com/", REPLACE(STR(?enLabel), "\\W", "", "i") )) AS ?copyC ) .
BIND(IRI(CONCAT("http://example.com/", REPLACE(STR(?enLabel), "\\W", "", "i"),"_prefLabel_en" )) AS ?enLabelc ) .
BIND(IRI(CONCAT("http://example.com/", REPLACE(STR(?defaultLabel), "\\W", "", "i"),"_prefLabel_de" )) AS ?defaultLabelc ) .
####
}
A toy example model:
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix skosxl: <http://www.w3.org/2008/05/skos-xl#> .
<http://example_copy.com/thing1> a skos:Concept ;
skosxl:prefLabel <http://example_copy.com/some_german_label>, <http://example>
<http://example_copy.com/some_german_label> a skosxl:Label ;
skosxl:literalForm "some german label"@de .
<http://example_copy.com/some_english_label> a skosxl:Label ;
skosxl:literalForm "some english label"@en .
results:
?c ?enLabel ?defaultLabel
1 <http://example.com/thing1> <http:/example.com/some_english_label>
2 <http:/example.com/some_german_label>
above query follows this post:
SPARQL filter language if possible in multiple value context
The ?englishLabelc is correctly associated with the bound ?copyC variable, but the ?defaultLabelc is not. I want both the English and the German label to be correctly associated with the URI bound to ?copyC. I assume there is a scope problem in how I am BINDing the ?copyC variable, but I am not sure how to go about troubleshooting. If I simply use the ?b variable to create ?copyC, two concepts will be created, one for the English label and one for the German label, which is what I want. Can someone help me figure this out?

DBPedia: all population fields?

I am trying to extract all entities in DBPedia that have a population. However, I have found that there are different field names for population depending on the entity. For instance, http://dbpedia.org/page/Boston has the field populationTotal while http://dbpedia.org/page/Alaska has the field 2010pop. Is there a complete list of the population fields that I can query for?
Solution via @AKSW above: query for all properties whose IRI contains "pop", and list the range of each where one is declared.
SELECT ?p ?range {
?p a rdf:Property
FILTER(regex(str(?p), "pop"))
OPTIONAL {?p rdfs:range ?range}
}
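Note that regex(str(?p), "pop") is unanchored and case-sensitive, so it matches "pop" anywhere in the IRI. A small Python sketch of the same filter shows the effect; the IRI list is a hypothetical sample (only populationTotal and 2010pop are named in the question), not the output of the DBpedia endpoint.

```python
import re

# Hypothetical sample of property IRIs. A real list would come from
# running the query above against the endpoint.
properties = [
    "http://dbpedia.org/ontology/populationTotal",
    "http://dbpedia.org/property/2010pop",
    "http://dbpedia.org/ontology/areaTotal",
    "http://dbpedia.org/property/popularity",
]

def matching(iris, pattern="pop"):
    """Keep IRIs where the pattern occurs anywhere, like an unanchored regex()."""
    return [iri for iri in iris if re.search(pattern, iri)]

for iri in matching(properties):
    print(iri)
```

The hypothetical popularity property slips through as well, so the candidate list probably still needs a manual pass, or a stricter pattern.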

SPARQL if an instance has a property, others must as well

I have a specific instance and a SPARQL query that retrieves other instances similar to that one. By similar, I mean that the other instances have at least one property and value in common with the specific instance, or have a class in common with it.
Now, I'd like to extend the query such that if the specific instance has a value for a "critical" property, then the only instances that are considered similar are those that also have that critical property (as opposed to just having at least one property and value in common).
For instance, here is some sample data in which instance1 has a value for predicate2 which is a subproperty of isCriticalPredicate.
@prefix : <http://example.org/rs#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
:instance1 a :class1.
:instance1 :predicate1 :instance3.
:instance1 :predicate2 :instance3. # (#) critical property
:instance2 a :class1.
:instance2 :predicate1 :instance3.
:instance4 :predicate2 :instance3.
:predicate1 :hasSimilarityValue 0.6.
:predicate2 rdfs:subPropertyOf :isCriticalPredicate.
:predicate2 :hasSimilarityValue 0.6.
:class1 :hasSimilarityValue 0.4.
Here is a query in which ?x is the specific instance, instance1. The query retrieves just instance4, which is correct. However, if I remove the critical property from instance1 (line labeled with #), I get no results, but should get instance2, since it has a property in common with instance1. How can I fix this?
PREFIX : <http://example.org/rs#>
select ?item (SUM(?similarity * ?importance * ?levelImportance) as ?summedSimilarity)
(group_concat(distinct ?becauseOf ; separator = " , ") as ?reason) where
{
values ?x {:instance1}
bind (4/7 as ?levelImportance)
{
values ?instanceImportance {1}
?x ?p ?instance.
?item ?p ?instance.
?p :hasSimilarityValue ?similarity
bind (?p as ?becauseOf)
bind (?instanceImportance as ?importance)
}
union
{
values ?classImportance {1}
?x a ?class.
?item a ?class.
?class :hasSimilarityValue ?similarity
bind (?class as ?becauseOf)
bind (?classImportance as ?importance)
}
filter (?x != ?item)
?x :isCriticalPredicate ?y.
?item :isCriticalPredicate ?y.
}
group by ?item
I've said it before and I'll say it again: minimal data is very helpful. From your past questions, I know that your working project has similarity values on properties and the like, but none of that really matters for the problem at hand. Here's some data that just has a few instances, property values, and one property designated as critical:
@prefix : <urn:ex:> .
:p a :criticalProperty .
:a :p :u ; #-- :a has a critical property, so
:q :v . #-- only :d can be similar to it.
:c :q :v ; #-- :c has no critical properties, so
:r :w . #-- both :a and :d can be similar to it.
:d :p :u ;
:q :v .
The trick in a query like this is to filter out the results that have the problem, not to try to select the ones that don't. Logically, those mean the same thing, but in writing the query, it's easier to think about constraint violation, and to try to filter out the results that violate the constraint. In this case, you want to filter out any results where the instance has a critical property and value but the similar instance doesn't.
prefix : <urn:ex:>
select ?i (group_concat(distinct ?j) as ?js) where {
#-- :a has a critical property, but
#-- :c does not, so these are useful
#-- starting points
values ?i { :a :c }
#-- get ?j instances that have a value
#-- in common with ?i.
?i ?property ?value .
?j ?property ?value .
#-- make sure that ?i and ?j aren't
#-- the same instance
filter (?i != ?j)
#-- make sure that there is no critical
#-- property value that ?i has that
#-- ?j does not also have
filter not exists {
?i ?criticalProperty ?criticalValue .
?criticalProperty a :criticalProperty .
filter not exists {
?j ?criticalProperty ?criticalValue .
}
}
}
group by ?i
----------------------------
| i | js |
============================
| :a | "urn:ex:d" |
| :c | "urn:ex:d urn:ex:a" |
----------------------------
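The nested filter not exists is a double negation: "there is no critical value of ?i that ?j lacks". As a rough cross-check, the same logic can be written in Python over the sample data. This is a sketch of the idea only, not of how a SPARQL engine evaluates the query. The critical set stands in for the :p a :criticalProperty triple.

```python
# The sample data above as (subject, property, value) triples.
triples = {
    ("a", "p", "u"), ("a", "q", "v"),
    ("c", "q", "v"), ("c", "r", "w"),
    ("d", "p", "u"), ("d", "q", "v"),
}
critical = {"p"}  # stands in for ":p a :criticalProperty"

def facts(subject):
    """All (property, value) pairs asserted for a subject."""
    return {(p, o) for (s, p, o) in triples if s == subject}

def similar(i):
    """Instances sharing a value with i and violating no critical constraint."""
    result = set()
    for j in {s for (s, _, _) in triples} - {i}:
        shares_value = bool(facts(i) & facts(j))
        # Outer "not exists": look for a critical value of i ...
        # inner "not exists": ... that j does not also have.
        violation = any(
            p in critical and (p, o) not in facts(j)
            for (p, o) in facts(i)
        )
        if shares_value and not violation:
            result.add(j)
    return result

for i in ("a", "c"):
    print(i, sorted(similar(i)))
```

This gives d for a, and both a and d for c, in line with the result table above.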
Related
There are some other questions that also touch on constraint satisfaction/violation that might be useful reading. While not all of these use nested filter not exists, most of them do have a pattern of filter not exists { … filter <negative condition> }.
Find lists containing ALL values in a set? (This actually uses a nested filter not exists.)
Is it possible to express a recursive definition in SPARQL?
Count values that satisfy a constraint and return 0 for those that don't satisfy it in SPARQL
SPARQL: return all intersections fulfilled by specified or equivalent classes (There's another nested filter not exists, in the last query in the answer.)

SPARQL: query for a complete list

I have such a query:
CONSTRUCT {
?p a :IndContainer .
?p :contains ?ind .
} WHERE{
:ClassContainer_1 :contains ?class .
?ind a ?class .
BIND (IRI(...) AS ?p) .
}
An individual ClassContainer_1 relates to some classes. I get these classes and try to find individuals for them. Then I try to create an IndContainer that should store the individuals found (dots are used only for simplification). So, I want to:
Create individual of IndContainer only when individuals for all bindings of ?class have been found;
Create individuals of IndContainer for all possible sets of individuals from ?ind (i.e. when some of ?class has a number of individuals).
Is it possible to create such a SPARQL query? Or it is necessary to use some rule engine?
EDIT (add illustration):
Positive example. Have:
test:ClassContainer_1
rdf:type test:ClassContainer ;
test:contains test:Type1 ;
test:contains test:Type2 ;
.
test:Type1_1
rdf:type test:Type1 ;
.
test:Type1_2
rdf:type test:Type1 ;
.
test:Type2_1
rdf:type test:Type2 ;
.
Want to receive:
test:IndContainer_1
rdf:type test:IndContainer ;
test:contains test:Type1_1 ;
test:contains test:Type2_1 ;
.
test:IndContainer_2
rdf:type test:IndContainer ;
test:contains test:Type1_2 ;
test:contains test:Type2_1 ;
.
Negative example: the same as the positive one, except that there are no individuals of class Type2, and so no individuals of IndContainer should be generated.
EDIT 2 (problem essence):
We may look at this problem from the perspective of combination composing. We have two positions (in my example) in each combination. The number of positions is determined by the number of classes each ClassContainer depends on. Each position must be filled in with one individual of a class that correspond to that position. So in my example first position must be filled with one individual of Type1 class, the second - with Type2 class (but the order does not matter). We have two individuals for the first class and one individual for the second class. To get the number of combinations we may use the rule of product from combinatorics - 2*1 = 2, i.e. {Type1_1,Type2_1} - is the first combination and {Type1_2,Type2_1} - is the second combination. For each combination it is necessary to generate IndContainer individual.
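To pin down the expected output, the rule of product can be sketched in Python with itertools.product. This only illustrates which combinations are wanted. It says nothing about whether a single SPARQL query can produce them. The dictionary layout is an assumption made for the sketch.

```python
from itertools import product

# Individuals per class, following the positive example above.
members = {
    "Type1": ["Type1_1", "Type1_2"],
    "Type2": ["Type2_1"],
}

def containers(members):
    """One set per combination, taking one individual from each class."""
    classes = sorted(members)
    return [set(combo) for combo in product(*(members[c] for c in classes))]

for number, content in enumerate(containers(members), start=1):
    print(f"IndContainer_{number}", sorted(content))

# Negative example: a class without individuals yields no combinations.
print(containers({"Type1": ["Type1_1", "Type1_2"], "Type2": []}))
```

With two individuals of Type1 and one of Type2 this produces 2*1 = 2 containers, and an empty class makes the product empty, which covers the negative example.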
If I understand your question correctly, you want a "container" for each class that is contained in a "class container" that contains the individuals that belong to that class. That's not too hard to do, as long as you can construct the IRI of the container from the IRI of the class. Here's some sample data with two classes, A and B, and a few instances (some of just A, some of just B, and some of A and B):
@prefix : <urn:ex:> .
:container a :ClassContainer ;
:contains :A, :B .
:w a :A . # an :A
:x a :A . # another :A
:y a :B . # a :B
:z a :A, :B . # both an :A and a :B
Your query is already pretty close. Here's one that works, along with its result:
prefix : <urn:ex:>
construct {
?indContainer a :IndContainer ;
:contains ?ind .
}
where {
:container a :ClassContainer ;
:contains ?class .
?ind a ?class .
bind(IRI(concat(str(?class),"-container")) as ?indContainer)
}
@prefix : <urn:ex:> .
:B-container a :IndContainer ;
:contains :y , :z .
:A-container a :IndContainer ;
:contains :w , :x , :z .
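For comparison, the grouping that this CONSTRUCT query performs (one container per class, named by appending "-container") can be sketched in Python. The dictionary layout is an assumption made for the sketch. The class and instance names follow the sample data.

```python
from collections import defaultdict

# Classes listed by :container, and the types of each instance.
contained = ["A", "B"]
types = {"w": ["A"], "x": ["A"], "y": ["B"], "z": ["A", "B"]}

def build_containers(contained, types):
    """Group instances under '<class>-container' for each contained class."""
    result = defaultdict(set)
    for individual, classes in types.items():
        for cls in classes:
            if cls in contained:
                result[cls + "-container"].add(individual)
    return dict(result)

for name, content in sorted(build_containers(contained, types).items()):
    print(name, sorted(content))
```

An instance of both classes, like z, lands in both containers, just as in the query result.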