How to return already specified properties for UNION in the SPARQL query result (as well as variables)?

Is there a (good) way to return already specified properties for UNION in the SPARQL query result (as well as variables)?
Any endpoint is fine but Wikidata is used as an example below (https://query.wikidata.org/sparql). For example,
SELECT DISTINCT ?item ?person
WHERE {
{ ?item wdt:P170 ?person . }
UNION { ?item wdt:P50 ?person . }
}
This returns something like:
| ?item | ?person |
| http://www.wikidata.org/entity/Q51136119 | http://www.wikidata.org/entity/Q736847 |
| http://www.wikidata.org/entity/Q51136131 | http://www.wikidata.org/entity/Q736847 |
What I need is to get wdt:P170 and wdt:P50 in the results, so I can see which property is used for each relation:
| ?item | ?property | ?person |
| http://www.wikidata.org/entity/Q51136119 | wdt:P170 | http://www.wikidata.org/entity/Q736847 |
| http://www.wikidata.org/entity/Q51136131 | wdt:P50 | http://www.wikidata.org/entity/Q736847 |
Note
The example is simplified. It has only one UNION and two results, but the real query will have more UNIONs, so it is important to know every property that was used.
It is no problem if the property is returned as a full URI (e.g. http://www.wikidata.org/prop/direct/P50).
Thank you!

You could use VALUES instead of UNION:
SELECT DISTINCT ?item ?property ?person
WHERE {
VALUES ?property {
wdt:P170
wdt:P50
}
?item ?property ?person .
}


Is there a "DISTINCT ON" equivalent in SPARQL?

My data is basically an event log in RDF. I have cases and events, the latter belong to the former. Events have timestamps and an actor who triggered them.
For each case I now need the latest event, when it happened, and who triggered it.
This is roughly my current query:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ex: <http://example.org/>
SELECT ?case ?event ?timestamp ?actor
WHERE {
?case rdf:type ex:Case ;
ex:hasEvent ?event .
?event ex:timestamp ?timestamp ;
ex:hasActor ?actor .
}
ORDER BY ASC(?case) DESC(?timestamp)
Which yields something like this:
| case | event | timestamp | actor |
=================================================================================
| ex:case1 | ex:event1 | "2020-01-01T02:00:00Z"^^xsd:dateTimeStamp | ex:Alice |
| ex:case1 | ex:event2 | "2020-01-01T01:00:00Z"^^xsd:dateTimeStamp | ex:Bob |
| ex:case2 | ex:event3 | "2020-01-01T03:00:00Z"^^xsd:dateTimeStamp | ex:Charlie |
| ex:case2 | ex:event4 | "2020-01-01T02:00:00Z"^^xsd:dateTimeStamp | ex:Dan |
However, I would like to get only the first and third rows, as they correspond to the latest event for each case. Like this:
| case | event | timestamp | actor |
=================================================================================
| ex:case1 | ex:event1 | "2020-01-01T02:00:00Z"^^xsd:dateTimeStamp | ex:Alice |
| ex:case2 | ex:event3 | "2020-01-01T03:00:00Z"^^xsd:dateTimeStamp | ex:Charlie |
To achieve this, I tried SELECT ?case ?event (MAX(?timestamp) AS ?latest) ?actor combined with GROUP BY ?case. However, SPARQL complains that I need to group by ?event and ?actor as well, which is of course not what I want.
I am aware that PostgreSQL has DISTINCT ON which would solve my problem, but I need to do it in SPARQL. Is there a nice way to achieve this?
Self-answer based on @UninformedUser's comment:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ex: <http://example.org/>
SELECT ?case ?event (?latest as ?timestamp) ?actor WHERE {
?case ex:hasEvent ?event .
?event ex:timestamp ?latest ;
ex:hasActor ?actor .
{ SELECT ?case (MAX(?timestamp) AS ?latest) {
?case rdf:type ex:Case ;
ex:hasEvent ?event .
?event ex:timestamp ?timestamp }
group by ?case }
}
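To make the two steps of the subquery approach explicit, here is a minimal Python sketch of the same logic on the sample rows from the question. It is only an illustration: the `Event` tuple and the `latest_per_case` helper are hypothetical names, and the timestamps are assumed to be ISO 8601 strings in the same time zone, so that plain string comparison orders them correctly.

```python
from collections import namedtuple

# Hypothetical record type mirroring the four query variables.
Event = namedtuple("Event", "case event timestamp actor")


def latest_per_case(events):
    """Keep, for each case, the event(s) carrying the maximum timestamp."""
    # Step 1 (the subquery): maximum timestamp per case.
    latest = {}
    for e in events:
        if e.case not in latest or e.timestamp > latest[e.case]:
            latest[e.case] = e.timestamp
    # Step 2 (the outer pattern): join back on case and timestamp.
    return [e for e in events if e.timestamp == latest[e.case]]


events = [
    Event("ex:case1", "ex:event1", "2020-01-01T02:00:00Z", "ex:Alice"),
    Event("ex:case1", "ex:event2", "2020-01-01T01:00:00Z", "ex:Bob"),
    Event("ex:case2", "ex:event3", "2020-01-01T03:00:00Z", "ex:Charlie"),
    Event("ex:case2", "ex:event4", "2020-01-01T02:00:00Z", "ex:Dan"),
]

result = latest_per_case(events)
for e in result:
    print(e.case, e.event, e.timestamp, e.actor)
```

One consequence of joining back on the maximum: if two events of the same case share the latest timestamp, both are returned, which differs from DISTINCT ON in PostgreSQL.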

Select literal in SPARQL?

I want to construct a SPARQL query populated with values I'm setting as literals.
e.g.
SELECT 'A', 'B', attribute
FROM
TABLE
Would return a table that might look like this:
A | B | attribute
-------|---------|--------------
A | B | Mary
A | B | has
A | B | a
A | B | little
A | B | lamb
What I want to do is run a query like this to get all the object types in a triplestore:
select distinct ?o ("class" as ?item_type)
where {
?s rdf:type ?o.
}
and then (ideally) UNION it with a second query that pulls out all the distinct predicate values:
select distinct ?p ("predicate" as ?item_type)
where {
?s ?p ?o.
}
the results of which might look like:
item | item_type
-----------------|-----------------
a_thing | class
another_thing | class
a_relation | predicate
another_relation | predicate
But a UNION in SPARQL only links in an additional where clause, which means I can't specify the item_type literal I want to inject into my results-set.
I think the following should get you what you want:
SELECT DISTINCT ?item ?item_type
WHERE {
{ ?s a ?item .
BIND("class" AS ?item_type)
}
UNION
{ ?s ?item ?o
BIND("predicate" AS ?item_type)
}
}

How to transpose the query result in SPARQL

I am using TopBraid Composer for writing SPARQL queries. I have queried the following result:
| Header | Total |
|-------- |------- |
| A | 5 |
| B | 6 |
| C | 7 |
| D | 8 |
Now my humble question is whether we can transpose the result somehow as follows:
| Header | A | B | C | D |
|-------- |--- |--- |--- |--- |
| Total | 5 | 6 | 7 | 8 |
Yes and no. The first notation you use is essential for understanding SPARQL SELECT - each row represents a separate graph pattern match on the data where the first column shows the binding for ?Header and the second column shows the binding for ?Total, per your unstated query. E.g. in one of the matches, ?Header is bound to "A" and ?Total is bound to "5". Another match is ?Header = "B" and ?Total = "6", etc. (I'd suggest doing some homework on SPARQL)
From that, whatever language you use to execute the SPARQL query will have some means of iterating over the result set, and you can place the values in a transposed table as you show.
So, no, SPARQL can't do that (look into SPARQL graph pattern matching), but whatever language you are using should be able to iterate over the result set to get what you are looking for.
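As a minimal sketch of that client-side approach, assume the result rows have already been fetched into a list of dictionaries keyed by the variable names Header and Total. The `rows` list and the `transpose` helper below are hypothetical and not part of any SPARQL library; the pivot itself is a few lines of Python:

```python
def transpose(rows, key="Header", value="Total"):
    """Turn one (key, value) row per binding into a single wide row."""
    header = [key] + [row[key] for row in rows]
    values = [value] + [row[value] for row in rows]
    return header, values


# Hypothetical result set, mirroring the table in the question.
rows = [
    {"Header": "A", "Total": 5},
    {"Header": "B", "Total": 6},
    {"Header": "C", "Total": 7},
    {"Header": "D", "Total": 8},
]

header, values = transpose(rows)
print(" | ".join(str(cell) for cell in header))
print(" | ".join(str(cell) for cell in values))
```

The column order follows the order of the rows, so an ORDER BY in the query is the natural place to fix it.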
You can use a filtered left outer join (OPTIONAL with a FILTER) to build your own transposed table (also known as a pivot table).
PREFIX wd: <http://cocreate-cologne.wiki.opencura.com/entity/>
PREFIX wdt: <http://cocreate-cologne.wiki.opencura.com/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX p: <http://cocreate-cologne.wiki.opencura.com/prop/>
PREFIX ps: <http://cocreate-cologne.wiki.opencura.com/prop/statement/>
PREFIX pq: <http://cocreate-cologne.wiki.opencura.com/prop/qualifier/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX bd: <http://www.bigdata.com/rdf#>
select ?item ?itemLabel ?enthalten_in1Label ?enthalten_in2Label ?enthalten_in3Label {
SELECT ?item ?itemLabel ?enthalten_in1Label ?enthalten_in2Label ?enthalten_in3Label WHERE {
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],de". }
?item p:P3 ?statement.
?statement ps:P3 wd:Q15.
?statement pq:P13 wd:Q17.
OPTIONAL { ?item wdt:P11 ?buendnis. FILTER (?buendnis in (wd:Q32)) }
OPTIONAL { ?item wdt:P11 ?sdgKarte. FILTER (?sdgKarte in (wd:Q14)) }
OPTIONAL { ?item wdt:P11 ?agora. FILTER (?agora in (wd:Q3)) }
BIND(?buendnis as ?enthalten_in1).
BIND(?sdgKarte as ?enthalten_in2).
BIND(?agora as ?enthalten_in3).
#debug
#Filter (?item in (wd:Q1))
}
LIMIT 2000
} ORDER BY ?itemLabel

The SPARQL query - the closest blond ancestor

Let us consider two classes: Person and its subclass BlondePerson.
Let us consider a relationship, isParent, where one Person is the parent of another Person.
Let us define the relationship isAncestor, which holds wherever there is a sequence of isParent relationships.
There might be many BlondePerson instances among my ancestors.
My question: how do I write a SPARQL query that finds my closest blond ancestor? The closest means a parent if possible; if not, a grandparent; otherwise a great-grandparent, and so on.
How do I compose a SPARQL query for that? How can I ensure that I get the ancestor who is the closest one?
Thank you.
This isn't too hard; you can use the same technique demonstrated in Is it possible to get the position of an element in an RDF Collection in SPARQL?. The idea is essentially to treat the ancestry as a sequence, from which you can get the "closeness" of each ancestor, and select the closest ancestor from a given class. If we create some sample data, we end up with something like this:
@prefix : <urn:ex:> .
:a a :Person ; :hasParent :b .
:b a :Person ; :hasParent :c .
:c a :Person, :Blond ; :hasParent :d .
:d a :Person, :Blond ; :hasParent :e .
:e a :Person .
prefix : <urn:ex:>
select distinct
?person
?ancestor
(count(distinct ?mid) as ?closeness)
?isBlond
where {
values ?person { :a }
?person :hasParent+ ?mid .
?mid a :Person .
?mid :hasParent* ?ancestor .
?ancestor a :Person .
bind( if( exists { ?ancestor a :Blond }, true, false ) as ?isBlond )
}
group by ?person ?ancestor ?isBlond
order by ?person ?closeness
-------------------------------------------
| person | ancestor | closeness | isBlond |
===========================================
| :a | :b | 1 | false |
| :a | :c | 2 | true |
| :a | :d | 3 | true |
| :a | :e | 4 | false |
-------------------------------------------
That's actually more information than we needed; I just included it to show how this works. Now we can simply require that ?ancestor is blond, order by closeness, and limit the results to the first (and thus the closest):
prefix : <urn:ex:>
select distinct
?person
?ancestor
(count(distinct ?mid) as ?closeness)
where {
values ?person { :a }
?person :hasParent+ ?mid .
?mid a :Person .
?mid :hasParent* ?ancestor .
?ancestor a :Person, :Blond .
}
group by ?person ?ancestor
order by ?person ?closeness
limit 1
---------------------------------
| person | ancestor | closeness |
=================================
| :a | :c | 2 |
---------------------------------
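For comparison, the same answer can be cross-checked outside SPARQL. The following Python sketch uses a different technique from the query above, a breadth-first search over the parent links, so it is a sanity check rather than a translation. The `parents` and `blond` structures and the `closest_blond_ancestor` helper are hypothetical and simply restate the sample data.

```python
from collections import deque

# Hypothetical in-memory copy of the sample data.
parents = {":a": [":b"], ":b": [":c"], ":c": [":d"], ":d": [":e"], ":e": []}
blond = {":c", ":d"}


def closest_blond_ancestor(person, parents, blond):
    """Breadth-first search upward; returns (ancestor, distance) or None."""
    seen = {person}
    queue = deque((p, 1) for p in parents.get(person, []))
    while queue:
        node, distance = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        if node in blond:
            # BFS visits ancestors in order of distance, so the first hit is closest.
            return node, distance
        queue.extend((p, distance + 1) for p in parents.get(node, []))
    return None


print(closest_blond_ancestor(":a", parents, blond))
```

On this sample chain the distance agrees with the ?closeness value in the query result.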

How to rank values in SPARQL?

I would like to create a ranking of observations using SPARQL. Suppose I have:
@prefix : <http://example.org#> .
:A :value 60 .
:B :value 23 .
:C :value 89 .
:D :value 34 .
The ranking should be: :C = 1 (the highest), :A = 2, :D = 3, :B = 4. So far, I have been able to solve it using the following query:
prefix : <http://example.org#>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?x ?v ?ranking {
?x :value ?v .
{ SELECT (GROUP_CONCAT(?x;separator="") as ?ordered) {
{ SELECT ?x {
?x :value ?v .
} ORDER BY DESC(?v)
}
}
}
BIND (str(?x) as ?xName)
BIND (strbefore(?ordered,?xName) as ?before)
BIND ((strlen(?before) / strlen(?xName)) + 1 as ?ranking)
} ORDER BY ?ranking
But that query only works if the URIs for ?x all have the same length. A better solution would be to have a function similar to strpos in PHP or indexOf in Java, but as far as I know, no such function is available in SPARQL 1.1. Are there simpler solutions?
One way of doing this is to take as the ranking for a value the number of values which are greater than or equal to it. This might be inefficient for larger data sets, since for each value it has to check all the other values. It doesn't require string manipulation though.
PREFIX : <http://example.org#>
SELECT ?x ?v (COUNT(*) as ?ranking) WHERE {
?x :value ?v .
[] :value ?u .
FILTER( ?v <= ?u )
}
GROUP BY ?x ?v
ORDER BY ?ranking
---------------------
| x | v | ranking |
=====================
| :C | 89 | 1 |
| :A | 60 | 2 |
| :D | 34 | 3 |
| :B | 23 | 4 |
---------------------
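The counting idea is easy to check outside SPARQL. The following Python sketch applies the same rule to the sample data, mirroring the comparison used in the FILTER; the `rank_by_count` helper and the `values` dictionary are hypothetical names used only for illustration.

```python
def rank_by_count(values):
    """Rank each key by how many values are greater than or equal to its own."""
    return {
        key: sum(1 for other in values.values() if other >= v)
        for key, v in values.items()
    }


# Sample data from the question.
values = {":A": 60, ":B": 23, ":C": 89, ":D": 34}

ranking = rank_by_count(values)
for key in sorted(ranking, key=ranking.get):
    print(key, values[key], ranking[key])
```

Under this rule, tied values share the larger rank: two items tied for the top value both get rank 2, because each one counts itself and the other.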