Order SPARQL query results by length of a string? - sparql

I'm trying to autocomplete what the user writes in an input, with terms in DBpedia, similar to this jsFiddle example. Try writing dog in the input of that jsFiddle, and you will see the 'Dog' term in the suggestions.
I have the following code, and the problem is that the 10-term list I got as a result does not contains the "Dog" alternative. So, if I could order the list by the length of the (string representation of) ?concept, then I could get that term. Is this possible?
SELECT DISTINCT ?concept
WHERE {
?concept a skos:Concept .
FILTER regex(str(?concept), "dog", "i")
}
ORDER BY ASC(?concept) LIMIT 10

So, if I could order the list by the lenght of the ?concept it is possible to get the term. But I can't find the right statement to do it. Is it possible?
It sounds like you're looking for strlen.
order by strlen(str(?concept))
E.g.,
select distinct ?concept where {
?concept a skos:Concept .
filter regex(str(?concept), "dog", "i")
}
order by strlen(str(?concept))
limit 10
SPARQL results
That said, if you're just checking string membership, you don't need all the power of regular expressions, and it might be more efficient to use contains and lcase to check whether the lowercased ?concept contains "dog" with a filter like:
filter contains(lcase(str(?concept)), "dog")
The table of contents in the SPARQL spec has a big list of functions that you can browse. In particular, you'd want to look at the subsections of 17.4 Function Definitions.

Related

SPARQL Restrict Number of Results for Specific Variable

Suppose I want to look for some first degree neighbors of Berlin. I ask the following query:
select ?s ?p where {
?s ?p dbr:Berlin.
}
Is it possible to put a restriction on the return result, such that there are at most 5 results for each unique value of ?p?
My attempts with subqueries all time out...
But, as potentially useful if not exactly perfect solution, maybe GROUP_CONCAT, MAX/MIN or SAMPLE are of use?
SELECT
?writer (GROUP_CONCAT(?namestring; SEPARATOR = " ") AS ?namestrings)
(MIN(?namestring) AS ?min_name)
(MAX(?namestring) AS ?max_name)
(SAMPLE(?namestring) AS ?random_name)
(SAMPLE(?namestring) AS ?another_random_name_that_may_unfortunately_be_the_same_again)
WHERE {
?writer wdt:P31 wd:Q5;
wdt:P166 wd:Q37922;
wdt:P735 ?firstname.
?firstname wdt:P1705 ?namestring.
}
GROUP BY ?writer
HAVING ((COUNT(?writer)) > 2 )
LIMIT 20
See it live here.
And, as you can see, SAMPLE is apparently evaluated only once, so using it repeatedly does not get you closer to five (different) samples.
(You can leave out the HAVING for your use. I only included it to restrict it to useful examples))

SPARQL subquery wtih xsd:dateTime

I've made one query which returns the date of the Hiroshima bombing.
SELECT ?date
WHERE {
history:the_bombing_of_hiroshima history:hasStartDate ?date .
}
Result: 1945-08-06T:00:00:00
And another query which returns all of the administration that ruled the Soviet Union.
SELECT ?administration
WHERE {
?administration history:rulesCountry "The Soviet Union"^^xsd:string .
}
Now, I'm wondering. An administration in our ontology has a "startDate" and "endDate" data property. Both containing xsd:dateTime values. I want to pass the date, received from the first query into the second query so that we can get the administration that ruled the Soviet Union during the bombing of Hiroshima.
I've tried to read some SPARQL subquery examples but none of them concern dateTime filtering (check to see if a date falls between two others) and many of them are a bit confusing to me.
I suspect the query must be something like
SELECT ?administration
WHERE {
?administration history:rulesCountry "The Soviet Union"^^xsd:string .
?administration history:hasStartDate > [RESULT FROM QUERY1 HERE] && history:hasEndDate < [RESULT FROM QUERY1 HERE]
}
I'd be very grateful for any answers or pointers towards resources and tutorials I can read to get this query to work.

Why filter doesn't work in this context?

This is the query and the result:
As you see, I am filtering out the users that are bo:ania, so why do they still appear?
However, if I remove the widecard and select just the users ?user, bo:ania doesn't appear
I didn't provide a minimum data example because this is a question about how filter and wildcard work, not about a problem in extracting some data from a data set. However, if you need a minimum data, I'm more than happy to provide it.
?specificUser is bound to bo:ania by your VALUES statement. ?user is an entirely different binding defined by the other triple patterns. Your FILTER says to filter out results where ?user = bo:ania, and it appears to be doing that correctly, seeing that ?user is not bound to bo:ania in any of the results.
BTW, there isn't a need to use VALUES in this case unless you want to inspect multiple values. If it's just the one value, then the following would work, and not have you wondering why the binding to bo:ania is included in the result set:
SELECT *
WHERE {
?user a rs:user .
?user rs:hasRated ?rating .
?rating rs:hasRatingDate ?ratngDate .
FILTER (?ratingDates >= (now() -"P10000F"^^xsd:duration) )
FILTER (?user != bo:ania)
}

How to customize ORDER BY condition in SPARQL

I want to order the output of the SPARQL queries not alphabetically, but in a certain order that I define myself. I cannot find information on how to do it.
For example, I have a query:
SELECT DISTINCT ?a ?b ?c
WHERE { ... }
ORDER BY ?c
The variable c can have a limited range of values: "s", "m", "m1", "m2", "p" and "w".
I want the ordering to be as follows:
1) s
2) m
3) m1
4) m2
4) p
5) w
So, it is not an alphabetical order.
How to force the ORDER BY to order output in this order?
I use SPARQL endpoint Fuseki to query a turtle file, and jinja2 templates to render the results.
In general, you'd need to write a custom function, as described in Jeen's answer. However, when you have a limited number of values, and you know the specific ordering, you can simply include a values block that associates each value with its sort index and then sort on that. E.g., if you wanted to sort some sizes, you could do something like this:
select ?item where {
values (?size ?size_) { ("small" 1) ("medium" 2) ("large" 3) }
?item :hasSize ?size
}
order by ?size_
You can achieve any custom sorting order by writing a custom function that implements the sorting logic and injecting that function into your SPARQL engine. See this tutorial on how to create a custom function for the Sesame SPARQL engine - other SPARQL engines have similar functionality.
Once you have this function, you can use it in the ORDER BY argument:
SELECT ....
ORDER BY ex:customFunction(?c)

Group By not working in DBPedia

I am working in DBPedia. I am having a problem when using group by in Sparql.
I have this code:
SELECT ?res ?titulo
WHERE {
?res rdf:type <http://dbpedia.org/class/yago/JaguaresDeChiapasFootballers> .
?res rdfs:label ?titulo .
}
GROUP BY (?res)
LIMIT 15
I want to return a list of all in this type. But I only want to return one for each URI, I put the group by and it doesn’t work and I really don’t know why?
Can someone help me?
Your original query isn't legal SPARQL. If you paste it into the SPARQL query validator at sparql.org, you'll get the following message:
Non-group key variable in SELECT: ?titulo
If you group by ?res, then you can't select the non-group variable ?titulo. If you just want one ?titulo value per ?res, then you can use …
select ?res (sample(?titulo) as ?title)
…
SPARQL results
If you want a list of the titles, then you can use group_concat to concatenate the titles:
select ?res (group_concat(?titulo;separator=', ') as ?title)
…
SPARQL results