Classes and subclasses in OWL for SPARQL queries with inference - sparql

I have a question with OWL and SPARQL that I can't resolve. I have defined several classes, but for the question in question only 3 are the important ones: People, Men and Women; and its definitions would be the following:
<#People> a owl:Class ;
rdfs:label "People"#en .
<#Men> a owl:Class ;
rdfs:subClassOf <#People> ;
rdfs:label "Men"#en .
<#Women> a owl:Class ;
rdfs:subClassOf <#People> ;
rdfs:label "Women"#en .
And then a data in RDF for example:
<rdf:Description rdf:about="Registration#1">
<rdfs:label>ARCHEOLOGY GRADUATE</rdfs:label>
<ex:BranchKnowledge>ARTS AND HUMANITIES</ex:BranchKnowledge>
<ex:Degree>ARCHEOLOGY GRADUATE</ex:Degree>
<ex:Men rdf:datatype="http://www.w3.org/2001/XMLSchema#nonNegativeInteger">63</ex:Men>
<ex:Women rdf:datatype="http://www.w3.org/2001/XMLSchema#nonNegativeInteger">99</ex:Women>
<dcterms:coverage>2015/2016</dcterms:coverage>
</rdf:Description>
If I wanted to obtain the number of Men and Women of each data, I get it with the following query:
SELECT ?X ?degree ?branch ?men ?women
WHERE {
?X ex:Degree ?degree .
?X ex:BranchKnowledge ?branch .
?X ex:Men ?men .
?X ex:Women ?women
}
If I now want to get the total number of Men and Women inferring that both are People, I had thought that since both are subclasses of People, I could make the following query:
SELECT ?X ?degree ?branch ?people
WHERE {
?X ex:Degree ?degree .
?X ex:BranchKnowledge ?branch .
?X ex:People ?people
}
However I don't get any results.
Have I misconstructed the relationship between classes and subclasses for what I want to do or what would be the problem? (I'm working on a Virtuoso server).

You should be aware of the fact that you're using punning, i.e. the same URI for Men and Women as OWL classes and OWL data properties. I don't understand why you're doing this. Why don't you introduce properties like numberOfMen, numberOfWomen and numberOfPeople?
Next, you're using ex:People as a property, thus, you would have to define that ex:Men and ex:Women are sub-properties of ex:People.
<#People> a owl:DatatypeProperty ;
rdfs:label "People"#en .
<#Men> a owl:DatatypeProperty ;
rdfs:subPropertyOf <#People> ;
rdfs:label "Men"#en .
<#Women> a owl:DatatypeProperty ;
rdfs:subPropertyOf <#People> ;
rdfs:label "Women"#en .
And then you have to enable reasoning in the triple store if supported or use SPARQL 1.1 property paths.
But, more important, this would only lead to two rows, which means you have to do the aggregation in your SPARQL query by using the sum function, e.g.:
SELECT ?X ?degree ?branch (sum(?_people) as ?people)
WHERE {
?X ex:Degree ?degree .
?X ex:BranchKnowledge ?branch .
?p rdfs:subPropertyOf* ex:People .
?X ?p ?_people
}
GROUP BY ?X ?degree ?branch
Note, this query only works if there is just a single degree and branch, otherwise it would sum duplicate values.

Related

Type of returned value in SPARQL Query

Is it possible to know the type of the return values in a SPARQL query?
For example, is there a function to define the type of ?x ?price ?p
in the following query?
SELECT DISTINCT ?x ?price ?p
WHERE {
?x a :Product .
?x :price ?price .
?x ?p ?o .
}
I want to know that
typeOf(x) = resource
typeOf(?p) = property
typeOf(?price) = property target etc.
datatype(?x)
The datatype function will tell you whether a result is a resource or a literal and in the latter case tell you which datatype it has exactly.
For example, the following query on https://dbpedia.org/sparql...
SELECT DISTINCT ?x ?code ?p datatype(?x) datatype(?code) datatype(?p)
WHERE {
?x a dbo:City.
?x dbo:areaCode ?code .
?x ?p ?o .
} limit 1
...will return:
x
code
p
callret-3
callret-4
callret-5
http://dbpedia.org/resource/Aconchi
"+52 623"
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://www.w3.org/2001/XMLSchema#anyURI
http://www.w3.org/2001/XMLSchema#string
http://www.w3.org/2001/XMLSchema#anyURI
However this will not differentiate between "resource" and "property" because a resource may be a property. What you probably mean is "individual" and "property" but even a property can be treated as an individual, for example in the triple rdfs:label rdfs:label "label".
However you can always query the rdf:type of a resource, which may give you rdf:Property, owl:DatatypeProperty or owl:ObjectProperty.

i want to get the names of similar types using sparql queries from dbpedia

I need to find the names of similar types from DBpedia so I'm trying to figure out a query which can return me the names of entities which have same subject type in its dct:subject (example I want to find similar types of white house so i want to write a query for same . I'm considering the dct:subject to find them ). If there is any other approach please mention it
Previously I tried it for rdf:type but the result are not so good and some time it shows time out
I have done my problem by the query mentioned below and now i want to consider dct:subject instead of rdf:type
select distinct ?label ?resource count(distinct ?type) as ?score where {
values ?type { dbo:Thing dbo:Organization yago:WikicatIslam-relatedControversies yago:WikicatIslamistGroups yago:WikicatRussianFederalSecurityServiceDesignatedTerroristOrganizations yago:Abstraction100002137 yago:Act100030358 yago:Cabal108241798 yago:Group100031264 yago:Movement108464601 yago:PoliticalMovement108472335
}
?resource rdfs:label ?label ;
foaf:name ?name ;
a ?type .
FILTER (lang(?label) = 'en').
}
ORDER BY DESC(?score)

dbpedia sparql query returns 0 result

I’m new to query languages and linked data so thanks a lot for the help. I also have a similar question about sparql on wikidata
Wikidata sparql query returns 0 result
I would like to look up all the art movements in dbpedia with the associated artists (associated with the movement -peopleNam, famous for the movement-famousName, influence the movement-influenceName) here is my query with lots of redundant prefix (that I could not enter because of the amount of links)
SELECT ?movementName ?famousName ?peopleName ?influenceName
WHERE {
?m dc:subject <http://dbpedia.org/resource/Category:Art_movements> .
?m rdfs:label ?movementName .
FILTER(LANG(?movementName) = "en")
?m dbp:knownFor ?a .
?a rdfs:label ?famousName .
FILTER(LANG(?famousName) = "en")
?m dbp:movement ?b .
?b rdfs:label ?peopleName .
FILTER(LANG(?peopleName) = "en")
?m dbp:influencedBy ?d .
?d rdfs:label ?influenceName .
FILTER(LANG(?influenceName) = "en")
}'
When I run the query with only the movement name I get 235 results but whenever I add any other artist name, I get 0 result. Could you show me what went wrong?
A similar query is done in wikidata with some encouraging results but I haven’t tried to match the two sources yet.
Thanks

SPARQL if an instance has a property, others must as well

I have a specific instance and a SPARQL query that retrieves other instances similar to that one. By similar, I mean that the other instances have at least one common property and value in common with the specific instance, or have a class in common with the specific instance.
Now, I'd like to extend the query such that if the specific instance has a value for a "critical" property, then the only instances that are considered similar are those that also have that critical property (as opposed to just having at least one property and value in common).
For instance, here is some sample data in which instance1 has a value for predicate2 which is a subproperty of isCriticalPredicate.
#prefix : <http://example.org/rs#>
#prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
:instance1 a :class1.
:instance1 :predicate1 :instance3.
:instance1 :predicate2 :instance3. # (#) critical property
:instance2 a :class1.
:instance2 :predicate1 :instance3.
:instance4 :predicate2 :instance3.
:predicate1 :hasSimilarityValue 0.6.
:predicate2 rdfs:subPropertyOf :isCriticalPredicate.
:predicate2 :hasSimilarityValue 0.6.
:class1 :hasSimilarityValue 0.4.
Here is a query in which ?x is the specific instance, instance1. The query retrieves just instance4, which is correct. However, if I remove the critical property from instance1 (line labeled with #), I get no results, but should get instance2, since it has a property in common with instance1. How can I fix this?
PREFIX : <http://example.org/rs#>
select ?item (SUM(?similarity * ?importance * ?levelImportance) as ?summedSimilarity)
(group_concat(distinct ?becauseOf ; separator = " , ") as ?reason) where
{
values ?x {:instance1}
bind (4/7 as ?levelImportance)
{
values ?instanceImportance {1}
?x ?p ?instance.
?item ?p ?instance.
?p :hasSimilarityValue ?similarity
bind (?p as ?becauseOf)
bind (?instanceImportance as ?importance)
}
union
{
values ?classImportance {1}
?x a ?class.
?item a ?class.
?class :hasSimilarityValue ?similarity
bind (?class as ?becauseOf)
bind (?classImportance as ?importance)
}
filter (?x != ?item)
?x :isCriticalPredicate ?y.
?item :isCriticalPredicate ?y.
}
group by ?item
I've said it before and I'll say it again: minimal data is very helpful. From your past questions, I know that your working project has similarity values on properties and the like, but none of that really matters for the problem at hand. Here's some data that just has a few instances, property values, and one property designated as critical:
#prefix : <urn:ex:>
:p a :criticalProperty .
:a :p :u ; #-- :a has a critical property, so
:q :v . #-- only :d can be similar to it.
:c :q :v ; #-- :c has no critical properties, so
:r :w . #-- both :a and :d can be similar to it.
:d :p :u ;
:q :v .
The trick in a query like this is to filter out the results that have the problem, not to try to select the ones that don't. Logically, those mean the same thing, but in writing the query, it's easier to think about constraint violation, and to try to filter out the results that violate the constraint. In this case, you want to filter out any results where the instance has a critical property and value but the similar instance doesn't.
prefix : <urn:ex:>
select ?i (group_concat(distinct ?j) as ?js) where {
#-- :a has a critical property, but
#-- :c does not, so these are useful
#-- starting points
values ?i { :a :c }
#-- get ?j instances that have a value
#-- in common with ?i.
?i ?property ?value .
?j ?property ?value .
#-- make sure that ?i and ?j aren't
#-- the same instance
filter (?i != ?j)
#-- make sure that there is no critical
#-- property value that ?i has that
#-- ?j does not also have
filter not exists {
?i ?criticalProperty ?criticalValue .
?criticalProperty a :criticalProperty .
filter not exists {
?j ?criticalProperty ?criticalValue .
}
}
}
group by ?i
----------------------------
| i | js |
============================
| :a | "urn:ex:d" |
| :c | "urn:ex:d urn:ex:a" |
----------------------------
Related
There are some other questions that also touch on constaint satisfaction/violation that might be useful reading. While not all of these use nested filter not exists, most of them do have a pattern of filter not exists { … filter <negative condition> }.
Find lists containing ALL values in a set? (This actually uses a nested filter not exists.)
Is it possible to express a recursive definition in SPARQL?
Count values that satisfy a constraint and return 0 for those that don't satisfy it in SPARQL
SPARQL: return all intersections fulfilled by specified or equivalent classes (There's another nested filter not exists, in the last query in the answer.)

Dbpedia/sparql: get population & lat/lng of all cities/towns/villages in UK

I'm entering the following query at http://dbpedia.org/sparql:
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
SELECT ?s ?name ?value ?lat ?lng
WHERE {
?s a <http://dbpedia.org/ontology/PopulatedPlace> .
?s <http://dbpedia.org/property/name> ?name .
?s <http://dbpedia.org/property/populationTotal> ?value .
FILTER (?lng > -8.64 AND ?lng < 2.1 AND ?lat < 61.1 AND ?lat > 49.35 )
?s geo:lat ?lat .
?s geo:long ?lng .
}
(The bounding box is intended to be for the UK, the other option is to add <http://dbpedia.org/ontology/country> <http://dbpedia.org/resource/United_Kingdom> ., but there's a possibility that some places might not have been tagged with UK as the country).
The problem is that it doesn't seem to be pulling back many places (around 290).
Swapping population for populationTotal gives 1588 places, and I can't figure out (semantically) which one should be used.
Is this a limitation with the underlying data, or is there something that could be improved in the way I'm formulating the query?
Note: this question is mainly academic now as I got the info from http://download.geonames.org/export/dump/GB.zip, but I'd much prefer to use open data and the semantic web, so posting up this question to see if there was something I was missing, or to find out if there is a shortcoming in how the data is being scraped from Wikipedia and whether I can muck in.
Your query is only returning locations that have a value for populationTotal. For example, if Town A has "10,000" for populationTotal in the database, and Town B has NULL, only Town A will be returned.
If you want to return all locations in the UK, then you need to specify population as an optional parameter. This query will show you all the locations, as well as the populations for the ones that have that data.
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
SELECT ?s ?name ?value ?lat ?lng
WHERE {
?s a <http://dbpedia.org/ontology/PopulatedPlace> .
?s <http://dbpedia.org/property/name> ?name .
OPTIONAL { ?s <http://dbpedia.org/property/populationTotal> ?value . }
FILTER (?lng > -8.64 AND ?lng < 2.1 AND ?lat < 61.1 AND ?lat > 49.35 )
?s geo:lat ?lat .
?s geo:long ?lng .
}