SPARQL query: OR in FILTER?

I would like to search court cases by their short title, but I've noticed in the RDF records that this information is sometimes stored under one property (cdm:expression_case-law_parties) and sometimes under another (cdm:expression_title_alternative). I would like to filter on both simultaneously. The query below, where I'm trying to use an OR (||) in the FILTER, does not work. What is the appropriate way?
PREFIX cdm: <http://publications.europa.eu/ontology/cdm#>
SELECT ?work ?expression ?ecli ?celex ?alttitle ?parties ?title
WHERE {
?work a ?class.
?expression cdm:expression_belongs_to_work ?work.
?expression cdm:expression_title ?title.
?expression cdm:expression_uses_language <http://publications.europa.eu/resource/authority/language/ENG>.
?work cdm:case-law_ecli ?ecli.
?work cdm:resource_legal_id_celex ?celex.
OPTIONAL{?expression cdm:expression_case-law_parties ?parties}
OPTIONAL{?expression cdm:expression_title_alternative ?alttitle}
FILTER(?class in (<http://publications.europa.eu/ontology/cdm#judgement>))
FILTER CONTAINS (?alttitle, "France v Commission") || (?parties, "France v Commission")}
LIMIT 15

From Stanislav Kralin's comment:
FILTER (CONTAINS (?alttitle, "France v Commission") || CONTAINS(?parties, "France v Commission"))
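For reference, the whole query with that filter applied might look like the sketch below. Each operand of || has to be a complete boolean expression, so CONTAINS is repeated for the second variable. Since both properties are matched with OPTIONAL, one of the variables can be unbound; in SPARQL an error on one side of || still lets the filter pass when the other side is true. As a simplification, the class test is folded into the triple pattern (?work a cdm:judgement), which is an assumption about what the original FILTER ... IN was meant to do. This has not been run against the endpoint.

```sparql
PREFIX cdm: <http://publications.europa.eu/ontology/cdm#>
SELECT ?work ?expression ?ecli ?celex ?alttitle ?parties ?title
WHERE {
  ?work a cdm:judgement ;
        cdm:case-law_ecli ?ecli ;
        cdm:resource_legal_id_celex ?celex .
  ?expression cdm:expression_belongs_to_work ?work ;
              cdm:expression_title ?title ;
              cdm:expression_uses_language <http://publications.europa.eu/resource/authority/language/ENG> .
  # Either property may be missing, so both stay optional
  OPTIONAL { ?expression cdm:expression_case-law_parties ?parties }
  OPTIONAL { ?expression cdm:expression_title_alternative ?alttitle }
  # Both sides of || are full CONTAINS calls
  FILTER ( CONTAINS(?alttitle, "France v Commission")
        || CONTAINS(?parties,  "France v Commission") )
}
LIMIT 15
```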

Related

SPARQL Restrict Number of Results for Specific Variable

Suppose I want to look for some first degree neighbors of Berlin. I ask the following query:
select ?s ?p where {
?s ?p dbr:Berlin.
}
Is it possible to put a restriction on the return result, such that there are at most 5 results for each unique value of ?p?
My attempts with subqueries all time out...
But as a potentially useful, if not exactly perfect, solution, maybe GROUP_CONCAT, MAX/MIN or SAMPLE are of use?
SELECT
?writer (GROUP_CONCAT(?namestring; SEPARATOR = " ") AS ?namestrings)
(MIN(?namestring) AS ?min_name)
(MAX(?namestring) AS ?max_name)
(SAMPLE(?namestring) AS ?random_name)
(SAMPLE(?namestring) AS ?another_random_name_that_may_unfortunately_be_the_same_again)
WHERE {
?writer wdt:P31 wd:Q5;
wdt:P166 wd:Q37922;
wdt:P735 ?firstname.
?firstname wdt:P1705 ?namestring.
}
GROUP BY ?writer
HAVING ((COUNT(?writer)) > 2 )
LIMIT 20
And, as you can see, SAMPLE is apparently evaluated only once, so using it repeatedly does not get you closer to five (different) samples.
(You can leave out the HAVING for your use. I only included it to restrict the results to useful examples.)

Aggregate functions in Sparql query with empty records

I've been trying to run a SPARQL query against https://landregistry.data.gov.uk/app/qonsole# to get some results on sold properties.
The query is the following:
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX ukhpi: <http://landregistry.data.gov.uk/def/ukhpi/>
SELECT sum(?ukhpi_salesVolume)
WHERE
{ { SELECT ?ukhpi_refMonth ?item
WHERE
{ ?item ukhpi:refRegion <http://landregistry.data.gov.uk/id/region/haringey> ;
ukhpi:refMonth ?ukhpi_refMonth
FILTER ( ?ukhpi_refMonth >= "2019-03"^^xsd:gYearMonth )
FILTER ( ?ukhpi_refMonth < "2020-03"^^xsd:gYearMonth )
}
}
OPTIONAL
{ ?item ukhpi:salesVolume ?ukhpi_salesVolume }
}
The problem is that the result of this query is empty. However, if I run the same query without the SUM in the SELECT clause, I can see there are 11 integer records.
My guess is that there is a 12th, empty record that breaks the SUM operation, but SPARQL is not my strongest side, so I'm not sure how to filter it out (and remove any empty records) if that's really the problem.
I've also noticed that most of the other aggregate functions (MIN, MAX, AVG) do not work either. COUNT does, and returns 11.
I actually solved this myself: all that was needed was COALESCE, which apparently exists in SPARQL too.
So:
SELECT sum(COALESCE(?ukhpi_salesVolume, 0))
instead of just
SELECT sum(?ukhpi_salesVolume)
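In standard SPARQL 1.1 an aggregate in the SELECT clause is written as (expression AS ?variable), so the complete query could be sketched as follows. The variable name ?totalSales is made up for illustration, and the inner subquery from the question is flattened into a single group, which should not change the result. A likely explanation, consistent with COUNT returning 11, is that the OPTIONAL leaves ?ukhpi_salesVolume unbound for one month, and an unbound value makes SUM produce no result, while COUNT only counts bound values.

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX ukhpi: <http://landregistry.data.gov.uk/def/ukhpi/>
# ?totalSales is an illustrative name, not from the original query
SELECT (SUM(COALESCE(?ukhpi_salesVolume, 0)) AS ?totalSales)
WHERE {
  ?item ukhpi:refRegion <http://landregistry.data.gov.uk/id/region/haringey> ;
        ukhpi:refMonth ?ukhpi_refMonth .
  FILTER ( ?ukhpi_refMonth >= "2019-03"^^xsd:gYearMonth )
  FILTER ( ?ukhpi_refMonth <  "2020-03"^^xsd:gYearMonth )
  # Months without a sales volume leave the variable unbound;
  # COALESCE substitutes 0 for those rows before summing
  OPTIONAL { ?item ukhpi:salesVolume ?ukhpi_salesVolume }
}
```

An alternative, if months without a sales volume should simply be ignored, would be to make the salesVolume pattern mandatory instead of OPTIONAL, so those rows never reach the aggregate.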

How to "rotate" SPARQL results in a "loop"?

Problem
I want to select an arbitrary subset of size n for each element of an answer.
For one specific element, like the city of Leipzig, I can solve it like this in DBpedia (http://www.dbpedia.org/sparql):
Example Query Single Element
select ?p
{
?p dbo:birthPlace dbr:Leipzig.
} limit 3
Output Single Element
http://dbpedia.org/resource/Walter_Ulbricht
http://dbpedia.org/resource/Anita_Berber
http://dbpedia.org/resource/Martin_Benno_Schmidt
But I want to rotate the output and do this for all (or a certain number of) cities:
Desired Output Multiple Elements
City Person1 Person2 Person3
dbr:Leipzig dbr:Walter_Ulbricht dbr:Anita_Berber dbr:Martin_Benno_Schmidt
dbr:Bonn dbr:Anton_Schumacher dbr:Hermann_Wallich dbr:Fritz_Simrock
dbr:Paris dbr:Adrien-Marie_Legendre dbr:André_Malraux dbr:Anselme_Payen
...
I tried to solve this with the following query:
SELECT ?city SAMPLE(?p1) SAMPLE(?p2) SAMPLE(?p3)
{
?city ^dbo:birthPlace ?p1,?p2,?p3.
?city a dbo:City.
FILTER(?p1<?p2&&?p2<?p3) # prevent permutations and duplicates
} GROUP BY ?city # only one line per city
LIMIT 10
However I am not sure if this is the best solution and I have a few questions:
With larger n this way of writing the query gets cumbersome, is there a more elegant option (e.g. using subqueries)?
Does this query give me all the results I want, i.e. does it sample the whole row, or does it lose results by sampling each variable separately and then skipping valid solutions?
If it does return all the results I would have gotten by repeating single element queries, does it have the same efficiency or does it run through a large number of permutations before filtering them out? If it is not, is there a way to write it more efficiently?
Here is a reasonably elegant and efficient solution:
In a subquery, use GROUP BY with the group_concat aggregate to merge the URIs for all people of one city into one long string.
Outside of the subquery, use string functions to break apart the long string and take the first n items.
Done here for the first 100 cities with five people per city:
SELECT ?c ?p1 ?p2 ?p3 ?p4 ?p5 {
{
SELECT ?c (group_concat(?p; separator=' ') AS ?list1) {
{
SELECT ?c { ?c a dbo:City } LIMIT 100
}
OPTIONAL { ?p dbo:birthPlace ?c }
}
GROUP BY ?c
}
BIND (if(?list1, strAfter(?list1, ' '), undef) AS ?list2)
BIND (if(?list2, strAfter(?list2, ' '), undef) AS ?list3)
BIND (if(?list3, strAfter(?list3, ' '), undef) AS ?list4)
BIND (if(?list4, strAfter(?list4, ' '), undef) AS ?list5)
BIND (if(?list1, if(contains(?list1, ' '), IRI(strBefore(?list1, ' ')), IRI(?list1)), undef) AS ?p1)
BIND (if(?list2, if(contains(?list2, ' '), IRI(strBefore(?list2, ' ')), IRI(?list2)), undef) AS ?p2)
BIND (if(?list3, if(contains(?list3, ' '), IRI(strBefore(?list3, ' ')), IRI(?list3)), undef) AS ?p3)
BIND (if(?list4, if(contains(?list4, ' '), IRI(strBefore(?list4, ' ')), IRI(?list4)), undef) AS ?p4)
BIND (if(?list5, if(contains(?list5, ' '), IRI(strBefore(?list5, ' ')), IRI(?list5)), undef) AS ?p5)
}

SPARQL group by a substring and average

I am querying a large data set (temperatures recorded hourly for nearly 20 years) and I'd rather get a summary, e.g. daily temperatures.
An example query is here:
http://www.boisvert.me.uk/opendata/sparql_aq+.html?pasteid=hu5rbc7W
PREFIX opensheff: <uri://opensheffield.org/properties#>
select ?time ?temp where {
?m opensheff:sensor <uri://opensheffield.org/datagrid/sensors/Weather_Mast/Weather_Mast.ic> ;
opensheff:rawValue ?temp ;
<http://purl.oclc.org/NET/ssnx/ssn#endTime> ?time .
FILTER (str(?time) > "2011-09-24")
}
ORDER BY ASC(?time)
And the results look like this:
time temp
"2011-09-24T00:00Z" 12.31
"2011-09-24T01:00Z" 11.68
"2011-09-24T02:00Z" 11.92
"2011-09-24T03:00Z" 11.59
Now I would like to group by a part of the date string, so as to get a daily average temperature:
time temp
"2011-09-24" 12.3 # or whatever
"2011-09-23" 11.7
"2011-09-22" 11.9
"2011-09-21" 11.6
So, how do I group by a substring of ?time ?
Eventually solved it. Running here:
http://www.boisvert.me.uk/opendata/sparql_aq+.html?pasteid=j8m0Qk6s
Code:
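PREFIX opensheff: <uri://opensheffield.org/properties#>
# An aggregate in SELECT must be wrapped as (expression AS ?variable)
select ?d (AVG(?temp) AS ?day_temp)
where {
?m opensheff:sensor <uri://opensheffield.org/datagrid/sensors/Weather_Mast/Weather_Mast.ic> ;
opensheff:rawValue ?temp ;
<http://purl.oclc.org/NET/ssnx/ssn#endTime> ?time .
# STR() guards against ?time being a typed literal; the first 10 characters are the date
BIND( SUBSTR(STR(?time), 1, 10) AS ?d ) .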
}
GROUP BY ?d
ORDER BY ASC(?d)
We use BIND to set a new variable to the substring required, and then grouping and averaging by that variable is simple enough.

Group By not working in DBPedia

I am working with DBpedia and am having a problem when using GROUP BY in SPARQL.
I have this code:
SELECT ?res ?titulo
WHERE {
?res rdf:type <http://dbpedia.org/class/yago/JaguaresDeChiapasFootballers> .
?res rdfs:label ?titulo .
}
GROUP BY (?res)
LIMIT 15
I want to return a list of all resources of this type, but with only one row for each URI. I added the GROUP BY, but it doesn't work and I really don't know why.
Can someone help me?
Your original query isn't legal SPARQL. If you paste it into the SPARQL query validator at sparql.org, you'll get the following message:
Non-group key variable in SELECT: ?titulo
If you group by ?res, then you can't select the non-group variable ?titulo. If you just want one ?titulo value per ?res, then you can use …
select ?res (sample(?titulo) as ?title)
…
If you want a list of the titles, then you can use group_concat to concatenate the titles:
select ?res (group_concat(?titulo;separator=', ') as ?title)
…
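Put together with the pattern from the question, the SAMPLE variant could look like the sketch below. The prefix declarations are added here as an assumption (the DBpedia endpoint predefines them, other endpoints may not), and the WHERE clause is simply the one from the question.

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
# One row per ?res; SAMPLE picks an arbitrary label from each group
SELECT ?res (SAMPLE(?titulo) AS ?title)
WHERE {
  ?res rdf:type <http://dbpedia.org/class/yago/JaguaresDeChiapasFootballers> .
  ?res rdfs:label ?titulo .
}
GROUP BY ?res
LIMIT 15
```

Swapping the projected expression for (GROUP_CONCAT(?titulo; separator=', ') AS ?title) gives the concatenated variant instead.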