Assume the following triples:
@prefix : <http://example/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
:alice rdf:type foaf:Person .
:alice foaf:name "Alice" .
:bob rdf:type foaf:Person .
We then run three SPARQL 1.1 queries:
Q1:
SELECT ?s
WHERE
{
?s ?p ?o .
FILTER NOT EXISTS { ?s foaf:name ?y }
}
Q2:
SELECT ?s
WHERE
{
?s ?p ?o .
FILTER NOT EXISTS { ?x foaf:name ?y }
}
Q3:
SELECT ?s
WHERE
{
?s ?p ?o .
FILTER NOT EXISTS { ?x foaf:mailbox ?y }
}
These three queries return three different solutions. Could anyone help me figure out why Q2 evaluates to no query solution in contrast to Q1 and Q3? Many thanks in advance :)
Q2 returns no solution because in your data, there exists a statement that matches ?x foaf:name ?y: ?x = :alice and ?y = "Alice". You've put no further constraints on either ?x or ?y. So no matter what the other variables in your query (?s, ?p and ?o) are bound to, the NOT EXISTS condition will always fail and therefore the query returns no result.
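To make the difference between the three queries concrete, here is a minimal Python sketch that imitates FILTER NOT EXISTS on the data above. It is not a SPARQL engine: the helper names match and select_s are made up for illustration, and the only behaviour it models is that the bindings of each outer solution are substituted into the inner pattern before it is tested.

```python
# Toy data from the question, as (subject, predicate, object) tuples.
TRIPLES = [
    (":alice", "rdf:type", "foaf:Person"),
    (":alice", "foaf:name", '"Alice"'),
    (":bob", "rdf:type", "foaf:Person"),
]

def match(pattern, triples, binding):
    """Yield every extension of `binding` under which `pattern` matches a triple.
    Terms starting with '?' are variables (hypothetical helper)."""
    for triple in triples:
        new = dict(binding)
        ok = True
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable that is already bound must agree with the triple.
                if new.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield new

def select_s(not_exists_pattern):
    """Imitates: SELECT ?s WHERE { ?s ?p ?o . FILTER NOT EXISTS { pattern } }"""
    results = []
    for row in match(("?s", "?p", "?o"), TRIPLES, {}):
        # The outer row's bindings are carried into the inner pattern.
        if not any(match(not_exists_pattern, TRIPLES, row)):
            results.append(row["?s"])
    return results

q1 = select_s(("?s", "foaf:name", "?y"))     # shares ?s with the outer pattern
q2 = select_s(("?x", "foaf:name", "?y"))     # shares nothing, matches :alice's name
q3 = select_s(("?x", "foaf:mailbox", "?y"))  # shares nothing, matches no triple
print(q1, q2, q3)
```

In Q1 the shared ?s ties the inner pattern to each row, so only the :alice rows are dropped. In Q2 nothing is shared, so the inner pattern succeeds for every row. In Q3 it never succeeds, so every row is kept.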
I'm trying to return subjects based on the relative position of their objects in an ordered list.
A subject can be associated with multiple objects (via a single predicate) and all objects are in an ordered list. Given a reference object in this list I'd like to return the subjects in order of relative distance of their objects from the reference object.
:a : :x
:b : :v
:b : :z
:c : :v
:c : :y
:ls :list (:v :w :x :y :z)
Taking x as our starting object in the list, the code below returns
:a :x :0
:c :y :1
:b :v :2
:b :z :2
:c :v :2
Instead of returning all positions I would like only the objects relating to the subject's minimum object 'distance' to be returned (which may mean up to two objects per subject - both up and down the list). So I'd like to return
:a :x :0
:c :y :1
:b :v :2
:b :z :2
The code so far...
(with a lot of help from Find lists containing ALL values in a set? and Is it possible to get the position of an element in an RDF Collection in SPARQL?)
SELECT ?s ?o (abs(?refPos-?pos) as ?dif)
WHERE {
:ls :list/rdf:rest*/rdf:first ?o .
?s : ?o .
{
SELECT ?o (count(?mid) as ?pos) ?refPos
WHERE {
[] :list/rdf:rest* ?mid . ?mid rdf:rest* ?node .
?node rdf:first ?o .
{
SELECT ?o (count(?mid2) as ?refPos)
WHERE {
[] :list/rdf:rest* ?mid2 . ?mid2 rdf:rest* ?node2 .
?node2 rdf:first :x .
}
}
}
GROUP BY ?o
}
}
GROUP BY ?s ?o
ORDER BY ?dif
I've been trying to get a minimum ?dif (difference/distance) by grouping by ?s but because I then have to apply this (something like ?dif = ?minDif) to the ?s ?o grouping from earlier I don't know how to go back and forward between these two groupings.
Thanks for any assistance you can provide
All you need to assemble a solution is yet another of Joshua Taylor's answers: this or this.
Below I'm using Jena property functions, but I hope the idea is clear.
Query 1
PREFIX list: <http://jena.apache.org/ARQ/list#>
SELECT ?s ?el ?dif {
?s : ?el .
:ls :list/list:index (?pos ?el) .
:ls :list/list:index (?ref :x) .
BIND (ABS(?pos -?ref) AS ?dif)
{
SELECT ?s (MIN (?dif_) AS ?dif) WHERE {
?s : ?el_ .
:ls :list/list:index (?pos_ ?el_) .
:ls :list/list:index (?ref_ :x) .
BIND (ABS(?pos_ - ?ref_) AS ?dif_)
} GROUP by ?s
}
}
Query 2
PREFIX list: <http://jena.apache.org/ARQ/list#>
SELECT ?s ?el ?dif {
?s : ?el .
:ls :list/list:index (?pos ?el) .
:ls :list/list:index (?ref :x) .
BIND (ABS(?pos -?ref) AS ?dif)
FILTER NOT EXISTS {
?s : ?el_ .
:ls :list/list:index (?pos_ ?el_) .
BIND (ABS(?pos_ - ?ref) AS ?dif_) .
FILTER(?dif_ < ?dif)
}
}
Update
Query 1 can be rewritten in this way:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?s ?el ?dif {
?s : ?el
{ select (count(*) as ?pos) ?el {[] :list/rdf:rest*/rdf:rest*/rdf:first ?el} group by ?el }
{ select (count(*) as ?ref) {[] :list/rdf:rest*/rdf:rest*/rdf:first :x} }
BIND (ABS(?pos - ?ref) AS ?dif)
{
SELECT ?s (MIN(?dif_) AS ?diff) {
?s : ?el_
{ select (count(*) as ?pos_) ?el_ {[] :list/rdf:rest*/rdf:rest*/rdf:first ?el_} group by ?el_ }
{ select (count(*) as ?ref_) {[] :list/rdf:rest*/rdf:rest*/rdf:first :x} }
BIND (ABS(?pos_ - ?ref_) AS ?dif_)
} GROUP by ?s
}
FILTER (?dif = ?diff)
}
Notes
As you can see, this is not what SPARQL was designed for. For example, Blazegraph supports Gremlin...
Possibly this is not what RDF was designed for either. Or try another modeling approach: do you really need RDF lists?
I haven't tested the above query in Virtuoso.
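As a cross-check of what the queries are meant to compute, here is a small Python sketch of the same logic outside SPARQL. The function name nearest and the tie-breaking sort order are my own assumptions. The data is the example from the question.

```python
# Ordered list and subject/object links from the question.
LIST = [":v", ":w", ":x", ":y", ":z"]
LINKS = [(":a", ":x"), (":b", ":v"), (":b", ":z"), (":c", ":v"), (":c", ":y")]

def nearest(links, ordered, reference):
    """Return (subject, object, distance) rows, keeping for each subject only
    the objects at its minimum distance from `reference` (hypothetical helper)."""
    ref = ordered.index(reference)
    # Distance of every linked object from the reference position.
    dist = {(s, o): abs(ordered.index(o) - ref) for s, o in links}
    # Minimum distance per subject (the inner GROUP BY ?s with MIN).
    best = {}
    for (s, _), d in dist.items():
        best[s] = min(d, best.get(s, d))
    # Keep only rows whose distance equals the subject's minimum.
    rows = [(s, o, d) for (s, o), d in dist.items() if d == best[s]]
    return sorted(rows, key=lambda r: (r[2], r[0], r[1]))

rows = nearest(LINKS, LIST, ":x")
for row in rows:
    print(*row)
```

The two steps map onto Query 1: the best dictionary plays the role of the grouped subquery, and the final comprehension plays the role of joining (or filtering) on the minimum.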
This is a follow-up to my question Compare models for identity, but with variables? Construct with minus?. Either I forgot what I learned then, or I didn't learn as much as I had thought.
I have triples like this:
prefix : <http://example.com/>
:rose :color :red .
:violet :color :blue .
:rose a :flower .
:flower rdfs:subClassOf :plant .
:dogs :love :trucks .
I want to discover any triples in my triplestore that don't satisfy at least one of these rules:
take :rose as the subject
take :rose's parent class as the subject
use any of the predicates that are used in any triple with :rose as the subject
take, as their object, the object of any triple with :rose as the subject.
Thus, in this case, the exception-discovery query (either select or construct) should only return
:dogs :love :trucks .
This query shows what should be in the triplestore:
PREFIX : <http://example.com/>
construct where {
:rose ?p ?o .
:rose a ?c .
?c rdfs:subClassOf ?super .
?s ?p ?x1 .
?x2 ?x3 ?o
}
+---+---------+-----------------+---------+
|   | subject | predicate       | object  |
+---+---------+-----------------+---------+
| A | :flower | rdfs:subClassOf | :plant  |
| B | :rose   | :color          | :red    |
| C | :rose   | rdf:type        | :flower |
| D | :violet | :color          | :blue   |
+---+---------+-----------------+---------+
Is there a way to subtract that pattern out of everything in the triplestore, { ?s ?p ?o }, even though I'm using variable names other than ?s, ?p and ?o in the construct statement?
I have seen this post with strategies for comparing RDF, but I'd like to do it with standard SPARQL.
To tie it together with my earlier post, this final query erroneously suggests that several desired triples violate the rules set out at the top of the message.
+---+---------+-----------------+---------+
| E | :dogs   | :love           | :trucks |
| A | :flower | rdfs:subClassOf | :plant  |
| D | :violet | :color          | :blue   |
+---+---------+-----------------+---------+
Triple E is indeed undesired. But A is desired because it has :rose's class as its subject (rule 2), and triple D is desired because its predicate is also used in some triples with :rose as the subject (rule 3).
PREFIX : <http://example.com/>
CONSTRUCT
{
?s ?p ?o .
}
WHERE
{ SELECT ?s ?p ?o
WHERE
{ { ?s ?p ?o }
MINUS
{ :rose ?p ?o ;
rdf:type ?c .
?c rdfs:subClassOf ?super .
?s ?p ?x1 .
?x2 ?x3 ?o
}
}
}
If I understand this correctly, you want to allow four types of triples in your data. If a triple (s,p,o) is in your data, it should satisfy at least one of the following criteria:
s = rose (About rose)
p = rdfs:subClassOf, and data contains (rose,a,s) (About a type)
The data also contains (rose,p,x) (Shared predicate)
The data also contains (rose,q,o) (Shared object)
It's easy enough to write a pattern for each one of those. You just need to find each triple (s,p,o) and filter out the ones that match none of those criteria. I think you can do it like this:
select ?s ?p ?o {
?s ?p ?o
filter not exists {
{ values ?s { :rose } } #-- (1)
union { values ?p { rdfs:subClassOf } :rose a ?s } #-- (2)
union { :rose ?p ?x } #-- (3)
union { :rose ?x ?o } #-- (4)
}
}
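If it helps to see the four criteria spelled out procedurally, here is a small Python sketch that applies them to the example data. The function name allowed is made up for illustration, and `a` is written out as rdf:type.

```python
# Example data from the question; `a` is written out as rdf:type.
TRIPLES = [
    (":rose", ":color", ":red"),
    (":violet", ":color", ":blue"),
    (":rose", "rdf:type", ":flower"),
    (":flower", "rdfs:subClassOf", ":plant"),
    (":dogs", ":love", ":trucks"),
]

def allowed(triple, data):
    """True if the triple satisfies at least one of the four criteria
    (hypothetical helper, mirrors the union inside FILTER NOT EXISTS)."""
    s, p, o = triple
    rose = [t for t in data if t[0] == ":rose"]
    about_rose = s == ":rose"                                       # (1)
    about_type = (p == "rdfs:subClassOf"
                  and (":rose", "rdf:type", s) in data)             # (2)
    shared_predicate = any(rp == p for _, rp, _ in rose)            # (3)
    shared_object = any(ro == o for _, _, ro in rose)               # (4)
    return about_rose or about_type or shared_predicate or shared_object

# Triples matching none of the criteria are the exceptions.
exceptions = [t for t in TRIPLES if not allowed(t, TRIPLES)]
print(exceptions)
```

The point is the same as in the query: test each triple against the alternatives one at a time, rather than subtracting one big conjunctive pattern, which requires all parts to match at once.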
I want to extract graphs for 5 individuals that are films (movies) from DBpedia.
My query is:
ParameterizedSparqlString qs = new ParameterizedSparqlString( "" +
"construct{?s ?p ?o} "+
"where{?s a <http://dbpedia.org/ontology/Film> . "+
"?s ?p ?o} OFFSET 0 LIMIT 5" );
I get the following result:
1- http://dbpedia.org/resource/1001_Inventions_and_the_World_of_Ibn_Al-Haytham http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://dbpedia.org/ontology/Film .
2- http://dbpedia.org/resource/1001_Inventions_and_the_World_of_Ibn_Al-Haytham http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.w3.org/2002/07/owl#Thing .
3- http://dbpedia.org/resource/1001_Inventions_and_the_World_of_Ibn_Al-Haytham http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.wikidata.org/entity/Q386724 .
4- http://dbpedia.org/resource/1001_Inventions_and_the_World_of_Ibn_Al-Haytham http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://dbpedia.org/ontology/Wikidata:Q11424 .
5- http://dbpedia.org/resource/1001_Inventions_and_the_World_of_Ibn_Al-Haytham http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://dbpedia.org/ontology/Work .
Problem:
The same film is returned 5 times because the classes Film, Thing, Q386724, Wikidata:Q11424, and Work are all equivalent classes (or a subclass relation exists between them).
My question:
I want to return the following triple only once:
<http://dbpedia.org/resource/1001_Inventions_and_the_World_of_Ibn_Al-Haytham>
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
<http://dbpedia.org/ontology/Film> .
and filter out the other 4 triples.
How can I do this?
Thank you in advance
I think the following should work for you:
CONSTRUCT {?s ?p ?o}
WHERE {
{ SELECT DISTINCT ?s
WHERE {
?s a <http://dbpedia.org/ontology/Film> .
} LIMIT 5
}
?s ?p ?o .
}
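The reason the subquery helps is that LIMIT counts solution rows (here, triples), not distinct subjects. A rough Python sketch of the difference follows. The six films and their triples are invented toy data, not DBpedia content, and a real endpoint gives no ordering guarantee without ORDER BY, so take this as an illustration only.

```python
from itertools import islice

# Hypothetical toy data: six films, each with two type triples and a label.
FILMS = [f"film{i}" for i in range(1, 7)]
TRIPLES = [
    (f, p, o)
    for f in FILMS
    for p, o in (("rdf:type", "dbo:Film"),
                 ("rdf:type", "owl:Thing"),
                 ("rdfs:label", f.title()))
]

def flat_limit(n):
    """Like the original query: LIMIT applies to the joined rows (triples)."""
    rows = (t for t in TRIPLES if (t[0], "rdf:type", "dbo:Film") in TRIPLES)
    return list(islice(rows, n))

def subquery_limit(n):
    """Like the subquery version: pick n distinct films, then all their triples."""
    subjects = []
    for s, p, o in TRIPLES:
        if p == "rdf:type" and o == "dbo:Film" and s not in subjects:
            subjects.append(s)
    chosen = subjects[:n]
    return [t for t in TRIPLES if t[0] in chosen]

flat = flat_limit(5)
nested = subquery_limit(5)
print(len({s for s, _, _ in flat}), "films in", len(flat), "triples")
print(len({s for s, _, _ in nested}), "films in", len(nested), "triples")
```

With the flat limit, the five rows are used up by the first couple of films. With the nested limit, five films are chosen first and every triple about them is returned.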
I think you want this query
construct {?s a <http://dbpedia.org/ontology/Film> .}
where { ?s a <http://dbpedia.org/ontology/Film>. }
limit 5
First of all, I didn't make a minimal example because I think that my problem can be understood without it.
Second, I didn't give you the data because I think that my problem can be solved without it. However, I'm happy to provide it if you ask.
This is my query:
select distinct (?x as ?likedItem) (?item as ?suggestedItem) ?similarity ?becauseOf ((?similarity * ?importance * ?levelImportance) as ?finalSimilarity)
{
values ?user {bo:ania}
#the variable ?x is bound to the items the user :ania has liked.
?user rs:hasRated ?ratings.
?ratings a rs:Likes.
?ratings rs:aboutItem ?x.
?ratings rs:ratesBy ?ratingValue.
#level 0 class similarities
{
#extract all the items that are from the same class (type) as the liked items.
#I assumed the being from the same class accounts for 50% of the similarities.
#This value can be changed according to the test or the application domain.
values ?classImportance {0.5} #class level
bind (?classImportance as ?importance)
bind( 4/7 as ?levelImportance)
?x a ?class.
?class rdfs:subClassOf ?mainClass .
?mainClass rdfs:subClassOf rs:RecommendableClass .
?mainClass rs:hasSimilarityConfiguration ?similarityConfiguration .
?similarityConfiguration rs:hasClassSimilarity ?classSimilarity .
?classSimilarity rs:appliedOnClass ?class .
?classSimilarity rs:hasClassSimilarityValue ?similarity .
?item a ?class.
bind (concat("it shares the same class, which is ", strafter(str(?class), "#"), ", with ", strafter(str(?x), "#")) as ?becauseOf)
}
union
#level 0 instance similarities
{
#extract the items that share the same value for important predicates with the already liked items..
#I assumed that having the same instance for important predicates account for 100% of the similarities.
#This value can be changed according to the test or the application domain.
values ?instanceImportance {1} #instance level
bind (?instanceImportance as ?importance)
bind( 4/7 as ?levelImportance)
?x a ?class.
?class rdfs:subClassOf ?mainClass .
?mainClass rdfs:subClassOf rs:RecommendableClass .
?mainClass rs:hasSimilarityConfiguration ?similarityConfiguration .
?similarityConfiguration rs:hasPropertySimilarity ?propertySimilarity .
?propertySimilarity rs:appliedOnProperty ?property .
?propertySimilarity rs:hasPropertySimilarityValue ?similarity .
?x ?property ?value .
?item ?property ?value .
bind (concat("it shares ", strafter(str(?value), "#"), " for predicate ", strafter(str(?property), "#"), " with ", strafter(str(?x), "#")) as ?becauseOf)
}
filter (?x != ?item)
}
This is the result:
As you can see, the result contains many rows for the same suggestedItem. I want to group by suggestedItem and sum the values of finalSimilarity.
I tried this:
select ?item (SUM(?similarity * ?importance * ?levelImportance ) as ?finalSimilarity) (group_concat(distinct ?x) as ?likedItem) (group_concat(?becauseOf ; separator = " ,and ") as ?reason) where
{
values ?user {bo:ania}
#the variable ?x is bound to the items the user :ania has liked.
?user rs:hasRated ?ratings.
?ratings a rs:Likes.
?ratings rs:aboutItem ?x.
?ratings rs:ratesBy ?ratingValue.
#level 0 class similarities
{
#extract all the items that are from the same class (type) as the liked items.
#I assumed the being from the same class accounts for 50% of the similarities.
#This value can be changed according to the test or the application domain.
values ?classImportance {0.5} #class level
bind (?classImportance as ?importance)
bind( 4/7 as ?levelImportance)
?x a ?class.
?class rdfs:subClassOf ?mainClass .
?mainClass rdfs:subClassOf rs:RecommendableClass .
?mainClass rs:hasSimilarityConfiguration ?similarityConfiguration .
?similarityConfiguration rs:hasClassSimilarity ?classSimilarity .
?classSimilarity rs:appliedOnClass ?class .
?classSimilarity rs:hasClassSimilarityValue ?similarity .
?item a ?class.
bind (concat("it shares the same class, which is ", strafter(str(?class), "#"), ", with ", strafter(str(?x), "#")) as ?becauseOf)
}
union
#level 0 instance similarities
{
#extract the items that share the same value for important predicates with the already liked items..
#I assumed that having the same instance for important predicates account for 100% of the similarities.
#This value can be changed according to the test or the application domain.
values ?instanceImportance {1} #instance level
bind (?instanceImportance as ?importance)
bind( 4/7 as ?levelImportance)
?x a ?class.
?class rdfs:subClassOf ?mainClass .
?mainClass rdfs:subClassOf rs:RecommendableClass .
?mainClass rs:hasSimilarityConfiguration ?similarityConfiguration .
?similarityConfiguration rs:hasPropertySimilarity ?propertySimilarity .
?propertySimilarity rs:appliedOnProperty ?property .
?propertySimilarity rs:hasPropertySimilarityValue ?similarity .
?x ?property ?value .
?item ?property ?value .
bind (concat("it shares ", strafter(str(?value), "#"), " for predicate ", strafter(str(?property), "#"), " with ", strafter(str(?x), "#")) as ?becauseOf)
}
filter (?x != ?item)
}
group by ?item
order by desc(?finalSimilarity)
but the result is:
Something is wrong with my approach: the finalSimilarity value here is 1.7, whereas summing the values from the first query manually gives 0.62.
Could you help me find the mistake?
Please note that the two queries are the same; only the select statements differ.
Hint
I am already able to solve it using two selects like this:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rs: <http://www.SemanticRecommender.com/rs#>
PREFIX bo: <http://www.BookOntology.com/bo#>
PREFIX :<http://www.SemanticBookOntology.com/sbo#>
select ?suggestedItem ( SUM (?finalSimilarity) as ?summedFinalSimilarity) (group_concat(distinct strafter(str(?likedItem), "#")) as ?becauseYouHaveLikedThisItem) (group_concat(?becauseOf ; separator = " ,and ") as ?reason)
where {
select distinct (?x as ?likedItem) (?item as ?suggestedItem) ?similarity ?becauseOf ((?similarity * ?importance * ?levelImportance) as ?finalSimilarity)
where
{
values ?user {bo:ania}
#the variable ?x is bound to the items the user :ania has liked.
?user rs:hasRated ?ratings.
?ratings a rs:Likes.
?ratings rs:aboutItem ?x.
?ratings rs:ratesBy ?ratingValue.
#level 0 class similarities
{
#extract all the items that are from the same class (type) as the liked items.
#I assumed the being from the same class accounts for 50% of the similarities.
#This value can be changed according to the test or the application domain.
values ?classImportance {0.5} #class level
bind (?classImportance as ?importance)
bind( 4/7 as ?levelImportance)
?x a ?class.
?class rdfs:subClassOf ?mainClass .
?mainClass rdfs:subClassOf rs:RecommendableClass .
?mainClass rs:hasSimilarityConfiguration ?similarityConfiguration .
?similarityConfiguration rs:hasClassSimilarity ?classSimilarity .
?classSimilarity rs:appliedOnClass ?class .
?classSimilarity rs:hasClassSimilarityValue ?similarity .
?item a ?class.
bind (concat("it shares the same class, which is ", strafter(str(?class), "#"), ", with ", strafter(str(?x), "#")) as ?becauseOf)
}
union
#level 0 instance similarities
{
#extract the items that share the same value for important predicates with the already liked items..
#I assumed that having the same instance for important predicates account for 100% of the similarities.
#This value can be changed according to the test or the application domain.
values ?instanceImportance {1} #instance level
bind (?instanceImportance as ?importance)
bind( 4/7 as ?levelImportance)
?x a ?class.
?class rdfs:subClassOf ?mainClass .
?mainClass rdfs:subClassOf rs:RecommendableClass .
?mainClass rs:hasSimilarityConfiguration ?similarityConfiguration .
?similarityConfiguration rs:hasPropertySimilarity ?propertySimilarity .
?propertySimilarity rs:appliedOnProperty ?property .
?propertySimilarity rs:hasPropertySimilarityValue ?similarity .
?x ?property ?value .
?item ?property ?value .
bind (concat("it shares ", strafter(str(?value), "#"), " for predicate ", strafter(str(?property), "#"), " with ", strafter(str(?x), "#")) as ?becauseOf)
}
filter (?x != ?item)
}
}
group by ?suggestedItem
order by desc(?summedFinalSimilarity)
But to me that is a stupid solution, and there must be a cleverer one where I can get the aggregated data using a single select.
Without seeing your data, it's impossible to say, and with a query this big, it's probably not worth trying to debug the exact problem, but it's easy for this to happen if you can have duplicates (which would be easy to get, especially if you're using unions where some condition could match both parts). For instance, suppose you have data like this:
@prefix : <urn:ex:> .
:x :similar [ :sim 0.10 ; :mult 2 ] ,
[ :sim 0.12 ; :mult 1 ] ,
[ :sim 0.12 ; :mult 1 ] , # yup, a duplicate
[ :sim 0.15 ; :mult 4 ] .
Then if you run this query, you'll get four result rows:
prefix : <urn:ex:>
select ?sim ((?sim * ?mult) as ?final) {
:x :similar [ :sim ?sim ; :mult ?mult ] .
}
----------------
| sim | final |
================
| 0.15 | 0.60 |
| 0.12 | 0.12 |
| 0.12 | 0.12 |
| 0.10 | 0.20 |
----------------
However, if you select distinct, you'll only see three:
select distinct ?sim ((?sim * ?mult) as ?final) {
:x :similar [ :sim ?sim ; :mult ?mult ] .
}
----------------
| sim | final |
================
| 0.15 | 0.60 |
| 0.12 | 0.12 |
| 0.10 | 0.20 |
----------------
Once you start to group by and sum, those non-distinct values will both get included:
select (sum(?sim * ?mult) as ?final) {
:x :similar [ :sim ?sim ; :mult ?mult ] .
}
---------
| final |
=========
| 1.04 |
---------
That sum is the sum of all four terms, not the three distinct ones. Even if the data doesn't have the duplicate values, the union can introduce the duplicate results:
@prefix : <urn:ex:> .
:x :similar [ :sim 0.10 ; :mult 2 ] ,
[ :sim 0.12 ; :mult 1 ] ,
[ :sim 0.15 ; :mult 4 ] .
prefix : <urn:ex:>
select (sum(?sim * ?mult) as ?final) {
{ :x :similar [ :sim ?sim ; :mult ?mult ] }
union
{ :x :similar [ :sim ?sim ; :mult ?mult ] }
}
---------
| final |
=========
| 1.84 |
---------
Since you found the need to use group_concat(distinct …), I wouldn't be surprised if there are duplicates of that nature.
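The arithmetic behind those three totals can be reproduced with a few lines of Python. This is only a sketch of the row counting, not of a SPARQL engine. The variable names are mine, and Decimal is used just to avoid float rounding noise.

```python
from decimal import Decimal as D

# (sim, mult) rows from the first dataset, including the duplicate.
rows = [(D("0.10"), 2), (D("0.12"), 1), (D("0.12"), 1), (D("0.15"), 4)]

# Plain SUM sees all four solution rows.
total_all = sum(sim * mult for sim, mult in rows)

# SELECT DISTINCT ?sim ?final collapses the two identical rows first,
# which is what the nested select distinct does before the outer SUM.
distinct_rows = {(sim, sim * mult) for sim, mult in rows}
total_distinct = sum(final for _, final in distinct_rows)

# Second dataset (no duplicate), but matched by both branches of a UNION.
base = [(D("0.10"), 2), (D("0.12"), 1), (D("0.15"), 4)]
union_rows = base + base
total_union = sum(sim * mult for sim, mult in union_rows)

print(total_all, total_distinct, total_union)
```

The single-select version sums the first or third kind of total, while the two-select version with select distinct sums the second, which would explain a gap like 1.7 versus 0.62 if your union branches overlap.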