Aggregate functions in SPARQL query with empty records - sum

I've been trying to run a SPARQL query against https://landregistry.data.gov.uk/app/qonsole# to get results on sold properties.
The query is the following:
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX ukhpi: <http://landregistry.data.gov.uk/def/ukhpi/>
SELECT sum(?ukhpi_salesVolume)
WHERE
  { { SELECT ?ukhpi_refMonth ?item
      WHERE
        { ?item ukhpi:refRegion <http://landregistry.data.gov.uk/id/region/haringey> ;
                ukhpi:refMonth ?ukhpi_refMonth
          FILTER ( ?ukhpi_refMonth >= "2019-03"^^xsd:gYearMonth )
          FILTER ( ?ukhpi_refMonth < "2020-03"^^xsd:gYearMonth )
        }
    }
    OPTIONAL
      { ?item ukhpi:salesVolume ?ukhpi_salesVolume }
  }
The problem is that the result is empty. However, if I run the same query without the SUM in the SELECT clause, I can see there are 11 integer records.
My guess is that there is a 12th, empty record that breaks the SUM operation, but SPARQL is not my strongest side, so I'm not sure how to filter it out (and remove any empty records), if that's really the problem.
I've also noticed that most of the other aggregate functions (MIN, MAX, AVG) do not work either. COUNT does, and returns 11.

I actually solved this myself: all that was needed was COALESCE, which apparently exists in SPARQL too.
So:
SELECT sum(COALESCE(?ukhpi_salesVolume, 0))
instead of just
SELECT sum(?ukhpi_salesVolume)
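For reference, here is a sketch of the whole query with that fix applied. It assumes the nested SELECT can be flattened into the outer pattern (it only restricts region and month), and it writes the projection as (expression AS ?variable), the form standard SPARQL 1.1 expects. ?totalSales is a name made up for this example.

```sparql
PREFIX xsd:   <http://www.w3.org/2001/XMLSchema#>
PREFIX ukhpi: <http://landregistry.data.gov.uk/def/ukhpi/>

# ?totalSales is an illustrative variable name, not part of the original query.
SELECT (SUM(COALESCE(?ukhpi_salesVolume, 0)) AS ?totalSales)
WHERE {
  ?item ukhpi:refRegion <http://landregistry.data.gov.uk/id/region/haringey> ;
        ukhpi:refMonth ?ukhpi_refMonth .
  FILTER (?ukhpi_refMonth >= "2019-03"^^xsd:gYearMonth)
  FILTER (?ukhpi_refMonth <  "2020-03"^^xsd:gYearMonth)
  # OPTIONAL leaves ?ukhpi_salesVolume unbound for months without a figure;
  # COALESCE substitutes 0 so SUM never sees an unbound value.
  OPTIONAL { ?item ukhpi:salesVolume ?ukhpi_salesVolume }
}
```

If months without a sales figure should simply be skipped, making the salesVolume pattern required (dropping OPTIONAL) should give the same sum, since rows without a value would then never reach the aggregate.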

Related

SPARQL Restrict Number of Results for Specific Variable

Suppose I want to look for some first degree neighbors of Berlin. I ask the following query:
select ?s ?p where {
?s ?p dbr:Berlin.
}
Is it possible to put a restriction on the return result, such that there are at most 5 results for each unique value of ?p?

My attempts with subqueries all time out...
But as a potentially useful, if not exactly perfect, solution, maybe GROUP_CONCAT, MAX/MIN or SAMPLE are of use?
SELECT
  ?writer
  (GROUP_CONCAT(?namestring; SEPARATOR = " ") AS ?namestrings)
  (MIN(?namestring) AS ?min_name)
  (MAX(?namestring) AS ?max_name)
  (SAMPLE(?namestring) AS ?random_name)
  (SAMPLE(?namestring) AS ?another_random_name_that_may_unfortunately_be_the_same_again)
WHERE {
  ?writer wdt:P31 wd:Q5;
          wdt:P166 wd:Q37922;
          wdt:P735 ?firstname.
  ?firstname wdt:P1705 ?namestring.
}
GROUP BY ?writer
HAVING ((COUNT(?writer)) > 2 )
LIMIT 20
As you can see when running it, SAMPLE is apparently evaluated only once, so using it repeatedly does not get you closer to five (different) samples.
(You can leave out the HAVING for your use. I only included it to restrict the results to useful examples.)

SPARQL subquery with xsd:dateTime

I've made one query which returns the date of the Hiroshima bombing.
SELECT ?date
WHERE {
history:the_bombing_of_hiroshima history:hasStartDate ?date .
}
Result: 1945-08-06T00:00:00
And another query which returns all of the administrations that ruled the Soviet Union.
SELECT ?administration
WHERE {
?administration history:rulesCountry "The Soviet Union"^^xsd:string .
}
Now, I'm wondering: an administration in our ontology has "startDate" and "endDate" data properties, both containing xsd:dateTime values. I want to pass the date received from the first query into the second query, so that we get the administration that ruled the Soviet Union during the bombing of Hiroshima.
I've tried to read some SPARQL subquery examples but none of them concern dateTime filtering (check to see if a date falls between two others) and many of them are a bit confusing to me.
I suspect the query must be something like
SELECT ?administration
WHERE {
?administration history:rulesCountry "The Soviet Union"^^xsd:string .
?administration history:hasStartDate > [RESULT FROM QUERY1 HERE] && history:hasEndDate < [RESULT FROM QUERY1 HERE]
}
I'd be very grateful for any answers or pointers towards resources and tutorials I can read to get this query to work.
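One possible approach, as a rough sketch rather than a tested answer: a subquery isn't strictly needed, because both patterns can live in one WHERE clause and share ?date. The sketch assumes administrations use history:hasStartDate and history:hasEndDate, as in the pseudo-query above, and the history: namespace IRI is a placeholder. Note that the comparison is start <= date <= end, the reverse of the guess in the pseudo-query.

```sparql
PREFIX xsd:     <http://www.w3.org/2001/XMLSchema#>
# Placeholder namespace: replace with the ontology's real IRI.
PREFIX history: <http://example.org/history#>

SELECT ?administration
WHERE {
  # Bind the date of the bombing (the first query).
  history:the_bombing_of_hiroshima history:hasStartDate ?date .

  # Administrations of the Soviet Union and their terms (the second query).
  ?administration history:rulesCountry "The Soviet Union"^^xsd:string ;
                  history:hasStartDate ?start ;
                  history:hasEndDate   ?end .

  # Keep only administrations whose term contains the date.
  FILTER (?start <= ?date && ?date <= ?end)
}
```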

Wrong query evaluation with aggregation subquery

Please refer to the Factforge endpoint to execute this query. The subquery doesn't return any results. ?myVar is projected out to the containing query and then joined with the triple pattern ?myVar ?p ?o. Since there are no results from the inner SELECT, the join should produce nothing. However, this is not what happens when the query is executed. Isn't this a bug?
SELECT ?myVar ?p ?o
WHERE {
  {
    SELECT ?myVar
    WHERE {
      ?myVar <http://www.example.com/arbitraryNonExistent> ?xx.
    }
    GROUP BY ?myVar
  }
  ?myVar ?p ?o.
}
LIMIT 10

It is the expected behaviour. According to https://www.w3.org/TR/sparql11-query/#aggregateAlgebra if there is a GROUP BY:
Group(exprlist, Ω) = { ... | μ in Ω }
and we have no matches, then Ω is empty, so:
Group(exprlist, {}) = {}
The effect is that the subquery returns a single solution where ?myVar is unbound and the join with the next statement pattern matches everything for ?myVar. At the end you are getting a lot of solutions for the whole query.
There is even a W3C SPARQL conformance testcase covering the exact scenario:
https://www.w3.org/2009/sparql/docs/tests/data-sparql11/aggregates/agg-empty-group.rq
https://www.w3.org/2009/sparql/docs/tests/data-sparql11/aggregates/agg-empty-group.srx
And also an old discussion at http://answers.semanticweb.com/questions/17410/semantics-of-sparql-aggregates.

Remove time from datetime in SPARQL

I am writing a query in SPARQL and I want to compare the date value without the time. Currently, I am getting a datetime value such as 2014-08-14T13:00:00Z. However, I want to do a filter on the date such as
FILTER (?date = "2014-08-15"^^xsd:dateTime)
I am new to SPARQL, so I need some help. Thanks.
EDITED
Thanks for the response guys. My apologies for the xsd:dateTime
FILTER (?date = "2014-08-15"^^xsd:date)
I have decided to try the following although I wanted a much 'prettier' solution.
FILTER (?date >= "2014-08-15T00:00:00Z"^^xsd:dateTime && ?date <= "2014-08-15T24:00:00Z"^^xsd:dateTime)
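A small variation on that range filter, as a sketch: in XML Schema, 24:00:00 denotes midnight at the start of the following day, so the <= upper bound also admits a value of exactly 2014-08-16T00:00:00Z. A half-open interval avoids that. The ex: namespace and the ex:date property are placeholders for whatever pattern binds ?date.

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
# Placeholder namespace and property, for illustration only.
PREFIX ex:  <http://example.org/>

SELECT ?event ?date
WHERE {
  ?event ex:date ?date .
  # Half-open interval: from midnight on the 15th up to,
  # but not including, midnight on the 16th.
  FILTER (?date >= "2014-08-15T00:00:00Z"^^xsd:dateTime &&
          ?date <  "2014-08-16T00:00:00Z"^^xsd:dateTime)
}
```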

You can do this, but not exactly the way your question asks. The literal forms for dateTimes have to have all the fields (except the timezone), so "2014-08-15"^^xsd:dateTime isn't actually a legal dateTime. See the definition for more about the date time format.
That said, it's easy enough to pull out the year, month, and day from a datetime and put them back together into a date that you can compare with other dates:
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
select ?dt ?date where {
  values ?dt { "2011-01-10T14:45:13.815-05:00"^^xsd:dateTime }
  bind(xsd:date(concat(str(year(?dt)),"-",
                       str(month(?dt)),"-",
                       str(day(?dt))))
       as ?date)
}
--------------------------------------------------------------------------
| dt | date |
==========================================================================
| "2011-01-10T14:45:13.815-05:00"^^xsd:dateTime | "2011-01-10"^^xsd:date |
--------------------------------------------------------------------------
If you want to include the timezone, you can do that too; they're permitted in xsd:dates.
If you wanted to filter without creating the new date, you could also do something like
filter (year(?dt) = 2015 &&
month(?dt) = 01 &&
day(?dt) = 10)
That might be a fairly clean solution.
A note about your filter, though. You can filter the value of a variable against a constant like you did, but that often (but not always) suggests an easier way. For instance, instead of:
select ?s where {
  ?s a ?o .
  filter ( ?o = <something> )
}
you'd usually just use the value in place, or use values to specify the value of a variable:
select ?s where {
  ?s a <something> .
}

select ?s where {
  values ?o { <something> }
  ?s a ?o .
}

You could try a simple cast to xsd:date, but the SPARQL engine will likely retain the time. So it becomes a matter of parsing. One way is to just use SUBSTR, as the number of characters is known:
FILTER (xsd:date(SUBSTR(str(?date), 1, 10)) = "2014-08-15"^^xsd:date)
Another is to build the date from datetime format:
FILTER (xsd:date(CONCAT(str(YEAR(?date)), "-", str(MONTH(?date)), "-", str(DAY(?date)))) = "2014-08-15"^^xsd:date)
Perhaps not as convenient given that CONCAT requires string conversion, but the general idea is to build the string from the datetime value and cast to xsd:date.
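One caveat with the CONCAT approach: MONTH and DAY return integers, so str(MONTH(?date)) gives "1" rather than "01", and a string like "2011-1-10" is not a valid xsd:date lexical form. Some engines may accept it anyway, while others may fail the cast, so padding is the safer route. A sketch, assuming the engine supports the xsd:date cast (as the snippets above already do) and that years have four digits. ?m and ?d are helper variables introduced here.

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?dt ?date
WHERE {
  VALUES ?dt { "2011-01-10T14:45:13.815-05:00"^^xsd:dateTime }
  # Helper variables (names chosen for this example).
  BIND (STR(MONTH(?dt)) AS ?m)
  BIND (STR(DAY(?dt))   AS ?d)
  # Left-pad month and day to two digits before casting.
  BIND (xsd:date(CONCAT(STR(YEAR(?dt)), "-",
                        IF(STRLEN(?m) < 2, CONCAT("0", ?m), ?m), "-",
                        IF(STRLEN(?d) < 2, CONCAT("0", ?d), ?d)))
        AS ?date)
}
```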

Group By not working in DBPedia

I am working with DBpedia and am having a problem when using GROUP BY in SPARQL.
I have this code:
SELECT ?res ?titulo
WHERE {
  ?res rdf:type <http://dbpedia.org/class/yago/JaguaresDeChiapasFootballers> .
  ?res rdfs:label ?titulo .
}
GROUP BY (?res)
LIMIT 15
I want to return a list of all resources of this type, but only one row for each URI. I added the GROUP BY, but it doesn't work and I really don't know why.
Can someone help me?

Your original query isn't legal SPARQL. If you paste it into the SPARQL query validator at sparql.org, you'll get the following message:
Non-group key variable in SELECT: ?titulo
If you group by ?res, then you can't select the non-group variable ?titulo. If you just want one ?titulo value per ?res, then you can use …
select ?res (sample(?titulo) as ?title)
…
If you want a list of the titles, then you can use group_concat to concatenate the titles:
select ?res (group_concat(?titulo;separator=', ') as ?title)
…
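Putting the first suggestion together with the query from the question gives something like the following sketch. The prefix declarations are added so it stands alone; whether the class IRI still returns data on DBpedia has not been checked here.

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# One row per resource, with an arbitrary one of its labels.
SELECT ?res (SAMPLE(?titulo) AS ?title)
WHERE {
  ?res rdf:type <http://dbpedia.org/class/yago/JaguaresDeChiapasFootballers> ;
       rdfs:label ?titulo .
}
GROUP BY ?res
LIMIT 15
```

If the duplicate rows come from labels in several languages, which is presumably the case here, adding FILTER (LANG(?titulo) = "en") inside the WHERE clause is another option worth trying.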
