Preferred combination SPARQL

I am trying to write an elegant SPARQL query that gives me one solution out of several possible matches. I have a number of subjects and a number of predicates, and I want to receive a single object. It is very uncertain whether any particular combination has a solution, so I supply multiple options. If I am not mistaken, this can be done with the following query:
SELECT ?object
WHERE {
:subjA|:subjB|:subjC :propA|:propB|:propC ?object.
}
LIMIT 1
The real problem is that I do not want just any solution. I want the candidates ordered by subject first (starting with :subjA) and then by predicate (starting with :propA). To make it clear: I want the first solution that exists in the following list of combinations:
:subjA :propA
:subjA :propB
:subjA :propC
:subjB :propA
:subjB :propB
:subjB :propC
:subjC :propA
:subjC :propB
:subjC :propC
How can I rewrite my query to get the first possible solution?

This is definitely not a valid SPARQL query. You can only use | in the predicate position, where it forms a property path, but then you lose the variable binding for the predicate.
SPARQL returns a set of rows with no inherent order, but you can sort them, for example lexicographically. I am not sure whether this is what you want:
sample data
@prefix : <http://ex.org/> .
:subjA :propA :o1 .
:subjA :propC :o1 .
:subjB :propB :o1 .
:subjC :propB :o1 .
:subjC :propA :o1 .
:subjA :propC :o2 .
:subjB :propB :o2 .
:subjB :propC :o2 .
query
PREFIX : <http://ex.org/>
SELECT ?s ?p ?o
WHERE
{ VALUES ?s { :subjA :subjB :subjC }
VALUES ?p { :propA :propB :propC }
?s ?p ?o
}
ORDER BY ?s ?p
result
-------------------------
| s | p | o |
=========================
| :subjA | :propA | :o1 |
| :subjA | :propC | :o1 |
| :subjA | :propC | :o2 |
| :subjB | :propB | :o1 |
| :subjB | :propB | :o2 |
| :subjB | :propC | :o2 |
| :subjC | :propA | :o1 |
| :subjC | :propB | :o1 |
-------------------------
Update
Since a user-defined order is wanted, an index for the entities can be used as a workaround (I swapped the order of :subjB and :subjC, and of :propB and :propC, to show the difference from the lexicographical order):
query
PREFIX : <http://ex.org/>
SELECT ?s ?p ?o
WHERE
{ VALUES (?sid ?s) { (1 :subjA) (2 :subjC) (3 :subjB) }
VALUES (?pid ?p) { (1 :propA) (2 :propC) (3 :propB) }
?s ?p ?o
}
ORDER BY ?sid ?pid
result
-------------------------
| s | p | o |
=========================
| :subjA | :propA | :o1 |
| :subjA | :propC | :o1 |
| :subjA | :propC | :o2 |
| :subjC | :propA | :o1 |
| :subjC | :propB | :o1 |
| :subjB | :propC | :o2 |
| :subjB | :propB | :o1 |
| :subjB | :propB | :o2 |
-------------------------
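The ranking idea behind the indexed VALUES blocks can also be sketched outside SPARQL. The Python snippet below is only an illustration, assuming the sample triples have already been fetched into a list of tuples; the names `triples`, `subject_rank`, `predicate_rank` and `first_preferred` are hypothetical and not part of any library.

```python
# Sample triples from the answer above, as (subject, predicate, object) tuples.
triples = [
    ("subjA", "propA", "o1"),
    ("subjA", "propC", "o1"),
    ("subjB", "propB", "o1"),
    ("subjC", "propB", "o1"),
    ("subjC", "propA", "o1"),
    ("subjA", "propC", "o2"),
    ("subjB", "propB", "o2"),
    ("subjB", "propC", "o2"),
]

# The position in each list is the preference rank (mirrors ?sid and ?pid).
subject_rank = {s: i for i, s in enumerate(["subjA", "subjC", "subjB"], start=1)}
predicate_rank = {p: i for i, p in enumerate(["propA", "propC", "propB"], start=1)}


def first_preferred(rows):
    """Return the best-ranked triple, or None if nothing matches."""
    candidates = [
        (s, p, o) for s, p, o in rows
        if s in subject_rank and p in predicate_rank
    ]
    if not candidates:
        return None
    # Compare by subject rank, then predicate rank, then object as a tie-break.
    return min(
        candidates,
        key=lambda t: (subject_rank[t[0]], predicate_rank[t[1]], t[2]),
    )


print(first_preferred(triples))  # ('subjA', 'propA', 'o1')
```

In SPARQL, the counterpart of taking the minimum here is to append LIMIT 1 after the ORDER BY ?sid ?pid clause, which returns only the first row of the ordered result.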

You can do it in two ways, depending on whether you need an alphabetical sort or a custom one.
First, for an alphabetical sort:
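SELECT ?object WHERE {
  # Smallest subject (by IRI order) that has any of the properties
  { SELECT (MIN(?sub) AS ?msub) WHERE {
      ?sub :propA|:propB|:propC ?obj1 .
      FILTER (?sub IN (:subjA, :subjB, :subjC))
  } }
  # Smallest property (by IRI order) used by that subject
  { SELECT ?msub (MIN(?prop) AS ?mprop_by_msub) WHERE {
      ?msub ?prop ?obj2 .
      FILTER (?prop IN (:propA, :propB, :propC))
  } GROUP BY ?msub }
  ?msub ?mprop_by_msub ?object .
}
LIMIT 1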
Second way for custom sort:
Select ?object where {
{:subjA :propA ?object}
Union
{:subjA :propB ?object}
.......
} limit 1
I wrote all of this from memory, so there may be some errors in the code.
A third way:
You can add something like this to your RDF store:
:subjA :order_num 1 .
:subjB :order_num 2 .
:subjC :order_num 3 .
:propA :order_num 1 .
:propB :order_num 2 .
:propC :order_num 3 .
And then use a query like this:
select ?object where{
?sub :order_num ?ord_num_sub .
?prop :order_num ?ord_num_prop .
?sub ?prop ?object
} order by ?ord_num_sub ?ord_num_prop
limit 1
Or, if you want to restrict the query to a subset of the subjects and properties, you can add a filter:
filter ( (?ord_num_sub = 2 || ?ord_num_sub = 3) && (?ord_num_prop = 1 || ?ord_num_prop = 2) )

Related

Column result unexpectedly returning nothing

I have a SPARQL query that returns nothing for Airport_Name when I believe it should.
PREFIX nas: <https://data.nasa.gov/ontologies/atmonto/NAS#>
PREFIX gen: <https://data.nasa.gov/ontologies/atmonto/general#>
SELECT *
WHERE{
{
{SELECT ?Airport_Name{
?Airport rdf:type nas:Airport ;
nas:airportName ?Airport_Name .
}
}
}UNION{
?Location rdf:type gen:PointLocation;
gen:longitude ?X ;
gen:latitude ?Y .
}
?Location rdf:type gen:PointLocation;
gen:longitude ?X ;
gen:latitude ?Y .
FILTER (?X > -80 && ?X < -64 && ?Y > 18 && ?Y < 32)
}
The expected result should contain Location, X, Y, and Airport_Name.
When I try to return Airport_Name at the bottom, the Location columns then appear blank, so I'm not sure what I have done wrong. If anyone could point me in the right direction, that would be much appreciated.
Your query has several issues. The correct query would be something like the one below.
Query
SELECT *
WHERE
{ ?airport rdf:type/rdfs:subClassOf* nas:Airport ;
nas:airportName ?Airport_Name ;
nas:airportLocation ?location .
?location gen:longitude ?X ;
gen:latitude ?Y
FILTER ( ?X > "-80"^^xsd:float && ?X < "-64"^^xsd:float && ?Y > "18"^^xsd:float && ?Y < "32"^^xsd:float )
}
Result (sample with LIMIT 10)
--------------------------------------------------------------------------------------------------------------------------
| airport | Airport_Name | location | X | Y |
==========================================================================================================================
| nas:MYEMairport | "Governors Harbour" | nas:MYEMcoordinates | "-76.330178"^^xsd:float | "25.283586"^^xsd:float |
| nas:MDPCairport | "Punta Cana Intl" | nas:MDPCcoordinates | "-68.366186"^^xsd:float | "18.570781"^^xsd:float |
| nas:MUGTairport | "Mariana Grajales" | nas:MUGTcoordinates | "-75.158333"^^xsd:float | "20.085278"^^xsd:float |
| nas:MDSTairport | "Cibao Intl" | nas:MDSTcoordinates | "-70.604689"^^xsd:float | "19.406092"^^xsd:float |
| nas:MYLSairport | "Stella Maris" | nas:MYLScoordinates | "-75.268778"^^xsd:float | "23.583047"^^xsd:float |
| nas:MBNCairport | "North Caicos" | nas:MBNCcoordinates | "-71.939658"^^xsd:float | "21.917486"^^xsd:float |
| nas:MUMOairport | "Orestes Acosta" | nas:MUMOcoordinates | "-74.922222"^^xsd:float | "20.653889"^^xsd:float |
| nas:MYRPairport | "New Port Nelson" | nas:MYRPcoordinates | "-74.836186"^^xsd:float | "23.684378"^^xsd:float |
| nas:MYEGairport | "George Town" | nas:MYEGcoordinates | "-75.781670"^^xsd:float | "23.466667"^^xsd:float |
| nas:MUBYairport | "Carlos Manuel De Cespedes" | nas:MUBYcoordinates | "-76.621389"^^xsd:float | "20.396389"^^xsd:float |
--------------------------------------------------------------------------------------------------------------------------
Note, the query uses the property path rdf:type/rdfs:subClassOf* which only works if you've also loaded the ontology (NAS.ttl). The property path is necessary because the instance data (airportInst.ttl) uses subclasses of nas:Airport like nas:InternationAirport or nas:CanadianAirport.
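To make the effect of rdf:type/rdfs:subClassOf* more concrete, here is a small Python sketch of the same idea: an instance counts as an airport if its asserted class reaches nas:Airport through zero or more subclass steps. The two subclass names come from the note above; the instance identifiers (`ex:airport1` and so on) and the helper names are made up for illustration only.

```python
from collections import deque

# Subclass axioms, as they would come from the ontology (child -> parents).
sub_class_of = {
    "nas:InternationAirport": ["nas:Airport"],
    "nas:CanadianAirport": ["nas:Airport"],
}

# Hypothetical instance data: instance -> asserted class.
rdf_type = {
    "ex:airport1": "nas:CanadianAirport",
    "ex:airport2": "nas:InternationAirport",
    "ex:point1": "gen:PointLocation",
}


def superclasses(cls, hierarchy):
    """Classes reachable via zero or more subClassOf steps (includes cls)."""
    seen = {cls}
    queue = deque([cls])
    while queue:
        current = queue.popleft()
        for parent in hierarchy.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def instances_of(target, hierarchy):
    """Instances whose asserted class is the target or one of its subclasses."""
    return sorted(
        inst for inst, cls in rdf_type.items()
        if target in superclasses(cls, hierarchy)
    )


print(instances_of("nas:Airport", sub_class_of))  # ['ex:airport1', 'ex:airport2']
print(instances_of("nas:Airport", {}))            # [] when the ontology is missing
```

The second call mirrors the situation where only the instance data is loaded: without the subclass axioms, nothing is typed directly as nas:Airport, so nothing matches.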

SPARQL filter/aggregate results by subgraph

I would like to write a query which filters out some results by using other properties of the graph. I will explain with an example.
I am modelling a permissions system. There are subjects, resources and permissions. Subjects have certain permissions on resources, and some permissions imply other permissions. Below is a simplified example with 3 permissions and 3 resources; the single subject has one of those permissions on each resource (two in one case).
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX : <http://schema#>
PREFIX d: <http://data#>
INSERT DATA {
:Permission rdf:type rdf:Property .
:Read rdf:type rdf:Property ;
rdfs:subPropertyOf :Permission .
:Write rdf:type rdf:Property ;
rdfs:subPropertyOf :Permission ;
:implies :Read .
:Admin rdf:type rdf:Property ;
rdfs:subPropertyOf :Permission ;
:implies :Write ,
:Read .
d:s1 rdf:type :Subject .
d:r1 rdf:type :Resource .
d:r2 rdf:type :Resource .
d:r3 rdf:type :Resource .
d:s1 :Admin d:r1 .
d:s1 :Read d:r1 .
d:s1 :Write d:r2 .
d:s1 :Read d:r3 .
}
I can easily write a query to see for a given subject, what the implied permissions are on some resources.
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX : <http://schema#>
PREFIX d: <http://data#>
SELECT DISTINCT ?subject ?resource ?implied (?permission AS ?implied_by)
WHERE {
BIND(d:s1 AS ?subject)
?resource rdf:type :Resource .
?subject rdf:type :Subject .
?permission rdfs:subPropertyOf :Permission ;
:implies* ?implied .
?subject ?permission ?resource .
}
This will give some results like:
+----------------+----------------+---------------------+---------------------+
| subject | resource | implied | implied_by |
+----------------+----------------+---------------------+---------------------+
| http://data#s1 | http://data#r1 | http://schema#Admin | http://schema#Admin |
| http://data#s1 | http://data#r1 | http://schema#Read | http://schema#Admin |
| http://data#s1 | http://data#r1 | http://schema#Write | http://schema#Admin |
| http://data#s1 | http://data#r1 | http://schema#Read | http://schema#Read |
| http://data#s1 | http://data#r3 | http://schema#Read | http://schema#Read |
| http://data#s1 | http://data#r2 | http://schema#Write | http://schema#Write |
| http://data#s1 | http://data#r2 | http://schema#Read | http://schema#Write |
+----------------+----------------+---------------------+---------------------+
What I would like to do is write a query which only returns the minimum list of permissions per resource which imply all the others. Any ideas on how I would achieve this efficiently would be most welcome. This seems a bit like a MAX aggregation, but where a graph is actually needed to determine the results. For example, the ideal results would be:
+----------------+----------------+---------------------+---------------------+
| subject | resource | implied | implied_by |
+----------------+----------------+---------------------+---------------------+
| http://data#s1 | http://data#r1 | http://schema#Admin | http://schema#Admin |
| http://data#s1 | http://data#r3 | http://schema#Read | http://schema#Read |
| http://data#s1 | http://data#r2 | http://schema#Write | http://schema#Write |
+----------------+----------------+---------------------+---------------------+
Note that the data and queries are actually more complicated than this, so it is absolutely necessary for me to use the property path, even though I then want to get rid of most of the information it introduces.
?permission rdfs:subPropertyOf :Permission ;
:implies* ?implied .
Note also that I thought of adding an integer value to each permission so that I could simply get the minimum value of this field, but my permission structure is not single rooted in reality, so sometimes this would not be appropriate.
Effectively the question is how to remove any permissions (for a resource) which are implied by other results (for that resource)
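To pin down the filtering being asked for, here is a minimal Python sketch of the intended result, not a SPARQL solution. It assumes the :implies graph is acyclic and uses the sample data above with the prefixes dropped; the names `implies`, `granted`, `implied_closure` and `minimal_permissions` are hypothetical.

```python
# :implies edges from the sample data (permission -> directly implied permissions).
implies = {
    "Write": {"Read"},
    "Admin": {"Write", "Read"},
}

# Permissions that subject s1 holds directly on each resource.
granted = {
    "r1": {"Admin", "Read"},
    "r2": {"Write"},
    "r3": {"Read"},
}


def implied_closure(permission):
    """Permissions reachable through one or more :implies steps."""
    seen = set()
    stack = list(implies.get(permission, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(implies.get(current, ()))
    return seen


def minimal_permissions(perms):
    """Drop every permission that another held permission already implies."""
    covered = set()
    for permission in perms:
        covered |= implied_closure(permission)
    return perms - covered


result = {res: sorted(minimal_permissions(perms)) for res, perms in granted.items()}
print(result)  # {'r1': ['Admin'], 'r2': ['Write'], 'r3': ['Read']}
```

Because the check is "implied by some other held permission" rather than a numeric rank, it does not require the permission structure to be single rooted.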

How to get grand total in a sparql group by query

I am following up on a query in which the schema.org database is used to find the number of children of a class. The answer gives the number of children for each class. In my application I need the grand total of all children (i.e. the sum of the counts over all groups) in order to compute, for each group, its percentage of the total number of children.
The query I got from the previous question is:
prefix schema: <http://schema.org/>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select ?child (count(?grandchild) as ?nGrandchildren)
from <http://localhost:3030/testDB/data/schemaOrg>
where {
?child rdfs:subClassOf schema:Event .
optional {
?grandchild rdfs:subClassOf ?child
}
} group by ?child
which gives the expected answer (events and number of children).
How to get the total number?
I tried a nested query as:
prefix schema: <http://schema.org/>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select ?child (count(?grandchild) as ?nGrandchildren)
from <http://localhost:3030/testDB/data/schemaOrg>
where {
select (count(?grandchild) as ?grandTotal)
{?child rdfs:subClassOf schema:Event .
optional {
?grandchild rdfs:subClassOf ?child
}
}
} group by ?child
but got a single answer: " " -> 0.
This query uses two sub-SELECTs:
* the first computes the number of grandchildren per child
* the second returns the total number of grandchildren
prefix schema: <http://schema.org/>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select ?child ?nGrandchildren
(round(?nGrandchildren/?totalGrandchildren * 100) as ?percentageGrandchildren) {
# compute number per child
{
select ?child (count(?grandchild) as ?nGrandchildren) where {
?child rdfs:subClassOf schema:Event .
optional {
?grandchild rdfs:subClassOf ?child
}
}
group by ?child
}
# compute total number
{
select (count(?grandchild) as ?totalGrandchildren) where {
?child rdfs:subClassOf schema:Event .
optional {
?grandchild rdfs:subClassOf ?child
}
}
}
}
Output
-----------------------------------------------------------------------------------------------
| child | nGrandchildren | percentageGrandchildren |
===============================================================================================
| schema:UserInteraction | 9 | "82"^^<http://www.w3.org/2001/XMLSchema#decimal> |
| schema:FoodEvent | 0 | "0"^^<http://www.w3.org/2001/XMLSchema#decimal> |
| schema:MusicEvent | 0 | "0"^^<http://www.w3.org/2001/XMLSchema#decimal> |
| schema:PublicationEvent | 2 | "18"^^<http://www.w3.org/2001/XMLSchema#decimal> |
| schema:LiteraryEvent | 0 | "0"^^<http://www.w3.org/2001/XMLSchema#decimal> |
| schema:SportsEvent | 0 | "0"^^<http://www.w3.org/2001/XMLSchema#decimal> |
| schema:DanceEvent | 0 | "0"^^<http://www.w3.org/2001/XMLSchema#decimal> |
| schema:ScreeningEvent | 0 | "0"^^<http://www.w3.org/2001/XMLSchema#decimal> |
| schema:DeliveryEvent | 0 | "0"^^<http://www.w3.org/2001/XMLSchema#decimal> |
| schema:ExhibitionEvent | 0 | "0"^^<http://www.w3.org/2001/XMLSchema#decimal> |
| schema:EducationEvent | 0 | "0"^^<http://www.w3.org/2001/XMLSchema#decimal> |
| schema:SaleEvent | 0 | "0"^^<http://www.w3.org/2001/XMLSchema#decimal> |
| schema:VisualArtsEvent | 0 | "0"^^<http://www.w3.org/2001/XMLSchema#decimal> |
| schema:CourseInstance | 0 | "0"^^<http://www.w3.org/2001/XMLSchema#decimal> |
| schema:ChildrensEvent | 0 | "0"^^<http://www.w3.org/2001/XMLSchema#decimal> |
| schema:BusinessEvent | 0 | "0"^^<http://www.w3.org/2001/XMLSchema#decimal> |
| schema:Festival | 0 | "0"^^<http://www.w3.org/2001/XMLSchema#decimal> |
| schema:ComedyEvent | 0 | "0"^^<http://www.w3.org/2001/XMLSchema#decimal> |
| schema:TheaterEvent | 0 | "0"^^<http://www.w3.org/2001/XMLSchema#decimal> |
| schema:SocialEvent | 0 | "0"^^<http://www.w3.org/2001/XMLSchema#decimal> |
-----------------------------------------------------------------------------------------------
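The arithmetic that the two sub-SELECTs combine can be checked with a few lines of Python. This is only a sketch of the calculation, using three of the rows from the output above (the omitted rows all have a count of 0 and do not change the total); `counts` and `percentages` are illustrative names.

```python
# Per-child counts, as produced by the first sub-SELECT (subset of the rows).
counts = {
    "schema:UserInteraction": 9,
    "schema:PublicationEvent": 2,
    "schema:FoodEvent": 0,
}


def percentages(per_group):
    """Rounded share of the grand total for each group."""
    total = sum(per_group.values())  # what the second sub-SELECT returns
    if total == 0:
        # Avoid dividing by zero when no group has any members.
        return {group: 0 for group in per_group}
    return {group: round(n / total * 100) for group, n in per_group.items()}


print(percentages(counts))
# {'schema:UserInteraction': 82, 'schema:PublicationEvent': 18, 'schema:FoodEvent': 0}
```

The grand total is a single value, so joining it with the per-group rows simply repeats it on every row, which is what makes the per-row division possible in the outer SELECT.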

Following namespaces in SPARQL query

Given this data structure
#prefix core <http://www.w3.org/2004/02/skos/core#>
<http://localhost/myvocab/1> core#notation 1 .
<http://localhost/myvocab/1> <http://www.w3.org/2006/time#inDateTime> <http://localhost/myvocab/item1#DateDescription> .
<http://localhost/myvocab/item1#DateDescription> <http://www.w3.org/2006/time#year> 2016 ;
<http://www.w3.org/2006/time#month> "June"
If I run
select * where {?s ?p ?o .
<http://www.w3.org/2004/02/skos/core#notation> 1 . }
then only the first 2 triples (:hasID and :hasTime) are returned. Is there a sparql query (preferably not using a regex filter for matching the id) to return all triples from the child namespaces too?
I'm hoping I can achieve a result something along the lines of
-------------------------------------------------------------------------------------------------------------------------------------------------------
| s | p | o |
=======================================================================================================================================================
| <http://localhost/myvocab/item1> | <http://www.w3.org/2006/time#inDateTime> | <http://localhost/myvocab/item1#DateDescription> |
| <http://localhost/myvocab/item1> | <http://www.w3.org/2004/02/skos/core#notation> | 1 |
| <http://localhost/myvocab/item1#DateDescription> | <http://www.w3.org/2006/time#month> | "June" |
| <http://localhost/myvocab/item1#DateDescription> | <http://www.w3.org/2006/time#year> | 2016 |
-------------------------------------------------------------------------------------------------------------------------------------------------------
It always helps if you provide data that we can actually load and properly formed queries that you've tried. The data that you've shown isn't actually legal, nor is the query that you provided. Here's some sample data that's actually loadable:
@prefix core: <http://www.w3.org/2004/02/skos/core#> .
<http://localhost/myvocab/1> core:notation 1 ;
<http://www.w3.org/2006/time#inDateTime> <http://localhost/myvocab/item1#DateDescription> .
<http://localhost/myvocab/item1#DateDescription> <http://www.w3.org/2006/time#year> 2016 ;
<http://www.w3.org/2006/time#month> "June" .
If I understand you correctly, you want to follow the properties of the DateDescription object too. You can do that with a property path that follows the properties and values of the starting resource, then the properties of those values, and so on. In this query, I use (<>|!<>) as a wildcard, since every property either is <> or it isn't. The * at the end means to follow paths of arbitrary length, including length zero, which is why the starting resource's own triples are returned as well.
prefix core: <http://www.w3.org/2004/02/skos/core#>
select ?s ?p ?o where {
?s_ core:notation 1 ;
(<>|!<>)* ?s .
?s ?p ?o .
}
--------------------------------------------------------------------------------------------------------------------------------------------------
| s | p | o |
==================================================================================================================================================
| <http://localhost/myvocab/1> | <http://www.w3.org/2006/time#inDateTime> | <http://localhost/myvocab/item1#DateDescription> |
| <http://localhost/myvocab/1> | core:notation | 1 |
| <http://localhost/myvocab/item1#DateDescription> | <http://www.w3.org/2006/time#month> | "June" |
| <http://localhost/myvocab/item1#DateDescription> | <http://www.w3.org/2006/time#year> | 2016 |
--------------------------------------------------------------------------------------------------------------------------------------------------
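What the wildcard path computes is a plain reachability search over the graph. The Python sketch below shows that logic on the sample data, with the time: IRIs abbreviated and one extra, made-up triple (`<http://localhost/myvocab/2>`) added to show that unrelated resources are not returned; the function name `reachable_triples` is hypothetical.

```python
ITEM = "<http://localhost/myvocab/1>"
DATE = "<http://localhost/myvocab/item1#DateDescription>"

triples = [
    (ITEM, "core:notation", 1),
    (ITEM, "time:inDateTime", DATE),
    (DATE, "time:year", 2016),
    (DATE, "time:month", "June"),
    # Hypothetical unrelated resource, not reachable from ITEM.
    ("<http://localhost/myvocab/2>", "core:notation", 2),
]


def reachable_triples(start, graph):
    """All triples whose subject is reachable from start via any properties."""
    seen = {start}
    stack = [start]
    found = []
    while stack:
        node = stack.pop()
        for s, p, o in graph:
            if s != node:
                continue
            found.append((s, p, o))
            if o not in seen:  # follow the value as a potential subject
                seen.add(o)
                stack.append(o)
    return found


# Start from every resource with core:notation 1, like ?s_ in the query.
starts = [s for s, p, o in triples if p == "core:notation" and o == 1]
for start in starts:
    for triple in reachable_triples(start, triples):
        print(triple)
```

This prints the two triples about the starting resource and the two about its DateDescription, and skips the unrelated resource.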

Sesame 2.8.4 subquery limit bug fix?

It seems that there is a bug in Sesame 2.8.4.
If I have the following data set:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
@prefix : <http://example.org/>.
:a rdf:type :AClass .
:a :hasName "a"^^xsd:string .
:a :hasProperty :xa .
:a :hasProperty :ya .
:a :hasProperty :za .
:b rdf:type :AClass .
:b :hasName "b"^^xsd:string .
:b :hasProperty :xb .
:b :hasProperty :yb .
:c rdf:type :AClass .
:c :hasName "c"^^xsd:string .
:c :hasProperty :xc .
and run the following query on it:
prefix : <http://example.org/>
select ?s ?p ?o {
#-- first, select two instance of :AClass
{ select ?s { ?s a :AClass } limit 2 }
#-- then, select all the triples of
#-- which they are subjects
?s ?p ?o
}
The result I get back is this:
--------------------------------------------------------------------
| s | p | o |
====================================================================
| :a | <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> | :AClass |
| :a | :hasName | "a" |
--------------------------------------------------------------------
Instead of this one which is the correct result:
--------------------------------------------------------------------
| s | p | o |
====================================================================
| :a | <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> | :AClass |
| :a | :hasName | "a" |
| :a | :hasProperty | :xa |
| :a | :hasProperty | :ya |
| :a | :hasProperty | :za |
| :b | <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> | :AClass |
| :b | :hasName | "b" |
| :b | :hasProperty | :xb |
| :b | :hasProperty | :yb |
--------------------------------------------------------------------
Does anyone know about this bug? Has anybody encountered the same problem? Or is there another version of Sesame with this bug fixed?
The next line was added after my question was answered:
To avoid any confusion: the bug is in the Workbench and NOT in the query engine.
The query engine works perfectly.
The problem you are seeing is not a bug in the query engine, but a bug in the Workbench client application (which also explains why I couldn't reproduce it earlier, as I was using the command-line client). For some reason Workbench incorrectly renders the result, showing only 2 rows despite saying in the header that there are 9 results to be shown.
The problem is related to the result paging functionality in the Workbench: when you change that setting, it suddenly does show the complete result.
This issue has now been logged as a bug in Sesame's issue tracker: SES-2307.