SPARQL for unique set of values from all columns and rows - sparql

I have a query that returns several columns, i.e.:
SELECT ?a ?b ?c
WHERE { ... }
Every column variable is an IRI. Obviously this returns a unique row for every combination of column values (note that values may not be unique to a column):
<urn:id:x:1> <urn:id:a:2> <urn:id:j:3>
<urn:id:x:1> <urn:id:a:2> <urn:id:j:4>
<urn:id:x:1> <urn:id:j:4> <urn:id:k:5>
<urn:id:y:2> <urn:id:j:4> <urn:id:k:6>
...
However, all I need are the unique IRIs spanning all rows and columns. i.e.:
<urn:id:x:1>
<urn:id:a:2>
<urn:id:j:3>
<urn:id:j:4>
<urn:id:k:5>
<urn:id:y:2>
<urn:id:k:6>
...
Is it possible to achieve this using SPARQL, or do I need to post-process the results to merge and deduplicate the values? Order is unimportant.

SELECT DISTINCT ?d {
...
VALUES ?i { 1 2 3 }
BIND (if(?i=1, ?a, if(?i=2, ?b, ?c)) AS ?d)
}
What does this do?
The VALUES clause creates three copies of each solution and numbers them with a variable ?i
The BIND clause creates a new variable ?d whose value is ?a, ?b or ?c, depending on whether ?i is 1, 2 or 3 in the given solution
The SELECT DISTINCT ?d returns only ?d and removes duplicates
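To try the three steps above without a dataset, here is a minimal sketch that replaces the elided "..." pattern with an inline VALUES block holding the four sample rows from the question. The inline rows are only a stand-in for the real graph pattern, which the question does not show.

```sparql
SELECT DISTINCT ?d
WHERE {
  # Stand-in for the original pattern: the four sample rows from the question
  VALUES (?a ?b ?c) {
    (<urn:id:x:1> <urn:id:a:2> <urn:id:j:3>)
    (<urn:id:x:1> <urn:id:a:2> <urn:id:j:4>)
    (<urn:id:x:1> <urn:id:j:4> <urn:id:k:5>)
    (<urn:id:y:2> <urn:id:j:4> <urn:id:k:6>)
  }
  # Three copies of every row, numbered 1 to 3
  VALUES ?i { 1 2 3 }
  # Pick column ?a, ?b or ?c depending on the copy number
  BIND (IF(?i = 1, ?a, IF(?i = 2, ?b, ?c)) AS ?d)
}
```

On these four rows the query should return the seven distinct IRIs listed in the question, in no particular order.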

Related

SPARQL same predicates with different values

I am using SPARQL to query my data from the triplestore.
In the triplestore I have a form containing a table, the table has different table entries. Each entry has a number of predicates. I want to use the predicates index and cost. My query needs to catch the first 2 rows, indices 0 and 1 and save the cost under a specific variable for each.
So I have tried the following (prefix ext is omitted):
SELECT DISTINCT ?entry ?priceOne ?priceTwo
WHERE {
?form ext:table ?table.
?table ext:entry ?entry.
?entry ext:index 0;
ext:cost ?priceOne.
?entry ext:index 1;
ext:cost ?priceTwo.
}
This, however, does not return any values. If I remove the second part (with index 1), then I do get ?priceOne. How can I get both values?
Your current query finds each ?entry that has both ext:index values, 0 and 1. You could avoid this by using something like ?entry0 and ?entry1, essentially duplicating your triple patterns.
But typically, you would match alternatives with UNION:
SELECT DISTINCT ?entry ?priceOne ?priceTwo
WHERE {
# a shorter way to specify this, if you don’t need the ?table variable
?form ext:table/ext:entry ?entry .
{
?entry ext:index 0 ;
ext:cost ?priceOne .
}
UNION
{
?entry ext:index 1 ;
ext:cost ?priceTwo .
}
}
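Note that with UNION each result row binds only one of ?priceOne and ?priceTwo. If both costs should appear in the same row, the first suggestion (separate ?entry0 and ?entry1 variables) might look like the sketch below. The ext: namespace IRI is hypothetical, since the question omits it, and ?table is selected instead of ?entry because there are now two entries per row.

```sparql
PREFIX ext: <http://example.org/ext#>  # hypothetical namespace; the question omits the real one

SELECT DISTINCT ?table ?priceOne ?priceTwo
WHERE {
  ?form  ext:table ?table .
  # Two different entries of the same table
  ?table ext:entry ?entry0 , ?entry1 .
  ?entry0 ext:index 0 ;
          ext:cost  ?priceOne .
  ?entry1 ext:index 1 ;
          ext:cost  ?priceTwo .
}
```

This only returns tables that have both an index-0 and an index-1 entry; wrapping the ?entry1 patterns in OPTIONAL would relax that.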

Retrieve only one label from set of individuals with multiple labels

Suppose I have the following set of individuals, where some of them have more than one rdfs:label:
Individual    Label
ind_01        "ind-001"
ind_02        "ind-002"
ind_02        "ind-2"
ind_03        "label3"
ind_04        "ind-4"
ind_04        "ind-04"
...           ...
I would like to run a SPARQL query that retrieves only one label per individual, no matter which one (i.e., the choice can be totally arbitrary). Thus, a suitable output based on the above dataset would be:
Individual    Label
ind_01        "ind-001"
ind_02        "ind-002"
ind_03        "label3"
ind_04        "ind-4"
...           ...
You could use SAMPLE, which "returns an arbitrary value from the multiset passed to it".
SELECT ?individual (SAMPLE(?label) AS ?label_sample)
WHERE {
?individual rdfs:label ?label .
}
GROUP BY ?individual
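SAMPLE may return a different label from one run or engine to the next. If a repeatable choice is preferred, one option is to swap in MIN, as in this sketch; the variable name ?label_min is only illustrative, and the result assumes the labels are plain strings without mixed language tags or datatypes.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?individual (MIN(?label) AS ?label_min)
WHERE {
  ?individual rdfs:label ?label .
}
# One row per individual, keeping the label that sorts first
GROUP BY ?individual
```

With the sample data above, ind_02 would get "ind-002" and ind_04 would get "ind-04", since "0" sorts before "2" and "4".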

SPARQL Restrict Number of Results for Specific Variable

Suppose I want to look for some first degree neighbors of Berlin. I ask the following query:
select ?s ?p where {
?s ?p dbr:Berlin.
}
Is it possible to put a restriction on the return result, such that there are at most 5 results for each unique value of ?p?
My attempts with subqueries all time out...
As a potentially useful, if not perfect, workaround, maybe GROUP_CONCAT, MAX/MIN or SAMPLE are of use?
SELECT
?writer (GROUP_CONCAT(?namestring; SEPARATOR = " ") AS ?namestrings)
(MIN(?namestring) AS ?min_name)
(MAX(?namestring) AS ?max_name)
(SAMPLE(?namestring) AS ?random_name)
(SAMPLE(?namestring) AS ?another_random_name_that_may_unfortunately_be_the_same_again)
WHERE {
?writer wdt:P31 wd:Q5;
wdt:P166 wd:Q37922;
wdt:P735 ?firstname.
?firstname wdt:P1705 ?namestring.
}
GROUP BY ?writer
HAVING ((COUNT(?writer)) > 2 )
LIMIT 20
See it live here.
And, as you can see, SAMPLE is apparently evaluated only once, so using it repeatedly does not get you closer to five (different) samples.
(You can leave out the HAVING for your use. I only included it to restrict the results to useful examples.)
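Applied to the Berlin query from the question, the same aggregates could be used as in the sketch below. It does not reach five results per ?p: it returns a count plus at most three values of ?s for each predicate (MIN, MAX and one SAMPLE, which may coincide). The output variable names are made up for illustration, and how MIN and MAX order IRIs can depend on the engine.

```sparql
PREFIX dbr: <http://dbpedia.org/resource/>

SELECT ?p
       (COUNT(?s)  AS ?total)    # how many subjects use this predicate
       (MIN(?s)    AS ?first_s)  # subject that sorts first
       (MAX(?s)    AS ?last_s)   # subject that sorts last
       (SAMPLE(?s) AS ?any_s)    # arbitrary subject, may repeat one of the above
WHERE {
  ?s ?p dbr:Berlin .
}
GROUP BY ?p
```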

Aggregate functions in Sparql query with empty records

I've been trying to run a sparql query against https://landregistry.data.gov.uk/app/qonsole# to yield some sold properties result.
The query is the following:
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX ukhpi: <http://landregistry.data.gov.uk/def/ukhpi/>
SELECT sum(?ukhpi_salesVolume)
WHERE
{ { SELECT ?ukhpi_refMonth ?item
WHERE
{ ?item ukhpi:refRegion <http://landregistry.data.gov.uk/id/region/haringey> ;
ukhpi:refMonth ?ukhpi_refMonth
FILTER ( ?ukhpi_refMonth >= "2019-03"^^xsd:gYearMonth )
FILTER ( ?ukhpi_refMonth < "2020-03"^^xsd:gYearMonth )
}
}
OPTIONAL
{ ?item ukhpi:salesVolume ?ukhpi_salesVolume }
}
The problem is, the result from this is empty. However, if I run the same query without the SUM on the 4th line, I can see there are 11 integer records.
My thoughts are that there is a 12th, empty record which causes all the issues in the SUM operation, but SPARQL is not my strongest side, so I'm not sure how to filter this (and remove any empty records) if that's really the problem.
I've also noticed that most of the other aggregate functions (min, max, avg) do not work either. Count does, and returns 11.
I actually solved this myself: all that was needed was COALESCE, which apparently exists in SPARQL too.
So:
SELECT sum(COALESCE(?ukhpi_salesVolume, 0))
instead of just
SELECT sum(?ukhpi_salesVolume)
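For reference, here is a sketch of the full query with that change. The likely reason it helps is that the row without a salesVolume leaves the variable unbound, and an error inside SUM leaves the whole aggregate unbound, whereas COUNT simply skips such rows. Two things in this sketch are not from the original: the subquery is flattened for brevity, and the aggregate is written in the standard `(expression AS ?variable)` form with a made-up name, ?totalSalesVolume.

```sparql
PREFIX xsd:   <http://www.w3.org/2001/XMLSchema#>
PREFIX ukhpi: <http://landregistry.data.gov.uk/def/ukhpi/>

# ?totalSalesVolume is an illustrative name, not from the original query
SELECT (SUM(COALESCE(?ukhpi_salesVolume, 0)) AS ?totalSalesVolume)
WHERE {
  ?item ukhpi:refRegion <http://landregistry.data.gov.uk/id/region/haringey> ;
        ukhpi:refMonth  ?ukhpi_refMonth .
  FILTER ( ?ukhpi_refMonth >= "2019-03"^^xsd:gYearMonth )
  FILTER ( ?ukhpi_refMonth <  "2020-03"^^xsd:gYearMonth )
  # Months without a sales volume stay in the result, counted as 0
  OPTIONAL { ?item ukhpi:salesVolume ?ukhpi_salesVolume }
}
```

Dropping the OPTIONAL keyword would be another way to exclude the empty record, since rows without a salesVolume would then not match at all.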

How to Add and Subtract numbers using SPARQL?

Is there any function to add and subtract two numbers (integer or long data types) from two different classes using SPARQL? How do I use it?
As AKSW suggests in the comments, here is an example query that adds two variables and projects the result as the ?total variable.
SELECT ?total
{ ?x ns:itemcount ?xcount .
?y ns:itemcount ?ycount.
BIND (?xcount + ?ycount AS ?total)
}
See SPARQL 1.1 for further details.
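Subtraction works the same way with the `-` operator. The sketch below uses inline VALUES with two made-up numbers (7 and 3) instead of the ns:itemcount pattern, so that it can be run without any data; ?difference is an illustrative variable name.

```sparql
SELECT ?total ?difference
WHERE {
  # Made-up sample counts standing in for ?x ns:itemcount ... / ?y ns:itemcount ...
  VALUES (?xcount ?ycount) { (7 3) }
  BIND (?xcount + ?ycount AS ?total)       # 7 + 3 = 10
  BIND (?xcount - ?ycount AS ?difference)  # 7 - 3 = 4
}
```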