SPARQL extension functions: where are they added to the SPARQL grammar?

When I create SPARQL extension functions with Apache Jena ARQ, where are they added to the SPARQL grammar, both for property functions and for filter functions?

The grammar does not change: SPARQL already allows a function call in an expression to be named by any IRI.
A new expression function has a URI and it's invoked as
BIND(my:function(?x,?y) AS ?newValue)
or in FILTER, in SELECT expressions etc.
Register with FunctionRegistry.get().put(....) or use <java:...> for auto-loading.
A property function is a property in a triple pattern:
?S my:propertyFunction ?O .
Register with PropertyFunctionRegistry.get().put(....)

Related

SPARQL - Returning label when object is URI or string when object is Literal

I would like to get the labels (rdfs:label) of objects when the object is a URI. But I would also like to get the string values when the object is a literal string. The issue is, I do not know beforehand if the object is storing a literal or a URI, and in some cases I see a mix of both literals and URIs, like in the image attached.
Any suggestions on how I can return the strings, and if there's an object, return the rdfs:label?
Thanks for your help!
First, this problem shouldn't occur in an ideal world because a property is supposed to be either an object property or a datatype property.
However, when dealing with low quality data where this does occur, I suggest the following workaround:
SELECT ?x ?desc
{
?x dbp:keyPeople ?y.
{?y rdfs:label ?desc. FILTER(isIRI(?y))} UNION
{BIND(STR(?y) AS ?desc). FILTER(!isIRI(?y))}
}
Warning
This is not thoroughly tested. I could only verify that it works correctly on DBpedia Live, which uses Virtuoso 08.03.3319. On the default DBpedia endpoint https://dbpedia.org/sparql, which runs Virtuoso 07.20.3235, it does not seem to work correctly. Also, you need to uncheck both "Strict checking of void variables" and "Strict checking of variable names used in multiple clauses but not logically connected to each other". The query relies on ?y being visible inside the second UNION branch, which standard SPARQL scoping does not guarantee, so other engines may return nothing for the literal case.

Jena Fuseki and Blazegraph behave differently with respect to 'type strictness' for string literals

I'm playing with Blazegraph (2.1.5) and Jena Fuseki (3.10.0). First I insert two triples with the following query:
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
insert data {
<http://s> <http://untyped> 'abc' .
<http://s> <http://typed> 'abc'^^xsd:string .
}
The triples have objects with the same string value, but one of them is untyped and the other is typed as xsd:string.
Then I execute the following query:
select * where { ?s ?p 'abc' }
Jena Fuseki finds both triples, while Blazegraph only finds the 'untyped' one.
The same happens if I specifically ask for a typed version:
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
select * where { ?s ?p 'abc'^^xsd:string }
Jena Fuseki again finds both triples, while Blazegraph only finds the 'typed' one.
The behavior is clearly different.
Here are my questions:
Which behavior (just one of them, or both) is consistent with the SPARQL 1.1 specification?
If Jena Fuseki's behavior is the only one consistent with the specification, is it possible to configure Blazegraph to behave like Fuseki?
If Blazegraph behavior is the 'correct' one, is there a way to find both triples without using a UNION or FILTER?
This is an interesting question because the answer is not obvious at all. Current triplestores implement the query and update language SPARQL 1.1, standardised in 2013. It is a query language for RDF, but for the version of RDF in place at the time, that is, RDF 1.0, standardised in 2004.
In RDF 2004, literals could be plain literals or typed literals. Plain literals were a Unicode string with an optional language tag. Typed literals were a Unicode string with a datatype URI.
SPARQL calls plain literals without a language tag "simple literals". A simple literal, being a single Unicode string, is never the same as a typed literal, which is a pair in all cases. So "some text" and "some text"^^xsd:string are different literals in RDF 2004 and in SPARQL 1.1.
Then, in 2014, a new version of RDF, RDF 1.1, appeared in which all literals have a datatype IRI, including literals with language tags. Language-tagged strings do not have to mention their datatype IRI in concrete syntaxes (the presence of a language tag is sufficient to identify the datatype IRI as rdf:langString). Literals typed with xsd:string may be written without the datatype IRI in concrete syntax. Consequently, "some text" in Turtle or N-Triples syntax truly means "some text"^^xsd:string, according to RDF 1.1.
The problem related to your question appears when you use an RDF API conforming to RDF 1.1, together with a SPARQL 1.1 implementation. If you load an RDF document that contains:
<subject> <predicate> "some text" .
should it be interpreted according to the RDF 1.1 spec, or should it be loaded following the SPARQL 1.1 specification? In principle, this:
INSERT DATA {
<http://s> <http://untyped> 'abc' .
<http://s> <http://typed> 'abc'^^xsd:string .
}
is SPARQL 1.1, so it should be understood to contain two triples, one with a simple literal and one with a typed literal. But SPARQL implementations are built on RDF APIs, so mixing RDF 1.1 and SPARQL 1.1 can lead to behaviour that differs between systems. In practice, you can only rely on the documentation of your specific implementation and on testing it.
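The difference between the two readings can be sketched in a few lines of Python. This is only a toy model, not how Fuseki or Blazegraph are implemented: the `Literal` class and the `normalize_rdf11` helper are hypothetical names introduced for illustration, and language-tagged literals are left untouched for brevity.

```python
from dataclasses import dataclass
from typing import Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


@dataclass(frozen=True)
class Literal:
    """Toy RDF literal: lexical form plus optional datatype IRI and language tag."""
    lexical: str
    datatype: Optional[str] = None
    lang: Optional[str] = None


def normalize_rdf11(lit: Literal) -> Literal:
    """RDF 1.1 reading: a literal with no datatype and no language tag is an xsd:string."""
    if lit.datatype is None and lit.lang is None:
        return Literal(lit.lexical, XSD_STRING)
    return lit  # typed or language-tagged literals are returned unchanged in this sketch


untyped = Literal("abc")            # 'abc'
typed = Literal("abc", XSD_STRING)  # 'abc'^^xsd:string

# RDF 2004 / SPARQL 1.1 reading: a simple literal and a typed literal are different terms.
same_in_rdf10 = untyped == typed

# RDF 1.1 reading: both denote the same xsd:string literal after normalization.
same_in_rdf11 = normalize_rdf11(untyped) == normalize_rdf11(typed)
```

A store that normalizes on load behaves like the second comparison and matches both triples; a store that keeps the terms distinct behaves like the first.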

Does using a URI as a predicate make it a property?

I have only the following Turtle statement:
x isAuthorOf y
If I have only this statement, in which isAuthorOf is used as a predicate, can I conclude that isAuthorOf is a property even without an explicit declaration (isAuthorOf rdf:type rdf:Property)?
Thanks in advance.
Yes, an IRI used as a property implies it is a property. However, without a declaration you won't know whether it is a datatype property or an object property.
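As a rough illustration of that inference, the sketch below derives an rdf:type rdf:Property triple for every IRI that appears in predicate position. It assumes triples are plain Python tuples of strings; the function name `infer_properties` is made up for this example and is not part of any RDF library.

```python
def infer_properties(triples):
    """For each triple (s, p, o), infer that p is an rdf:Property."""
    return {(p, "rdf:type", "rdf:Property") for (_s, p, _o) in triples}


data = [("x", "isAuthorOf", "y")]
inferred = infer_properties(data)
```

Nothing in the inferred triple says whether isAuthorOf is an object property or a datatype property; that still needs an explicit declaration.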

How do I retrieve variables from an OPTIONAL statement in SPARQL?

I give my SPARQL query a known item that could be one of a number of different types. Depending on the type, I want to get a certain property from it (the way to do this differs for each type), but once I have found this property I want to perform the same operation on it.
Currently I have the following pseudocode:
?object rdf:hasProperty "known Property Type"
OPTIONAL {
?object rdf:hasProperty "property type 1"
#do this thing and store thing of interest in ?variableOfInterest
}
OPTIONAL{
?object rdf:hasProperty "property type 2"
#do different thing and store thing of interest in ?variableOfInterest
}
?thingIAmActuallyInterestedIn rdf:has type ?variableOfInterest
#now do long query
My problem is that ?variableOfInterest does not get passed out of the OPTIONAL statement; instead, ?thingIAmActuallyInterestedIn is just a list of all objects.
I could put the 'long query' in both of the optional blocks but that would be a huge amount of code replication.
Is there a way to output the ?variableOfInterest from the optional statement rather than it being a dummy variable?
Ensure that ?variableOfInterest is in the SELECT clause, or that the SELECT clause is *.
?variableOfInterest will be bound in result rows when the OPTIONAL matched, and unbound when the OPTIONAL did not match.
How that gets reflected to your code depends on the API of your SPARQL engine.
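The bound/unbound behaviour described above can be modelled as a left join over solution rows. The Python sketch below is a simplified model, not a SPARQL engine: rows are dictionaries, `left_join` is a hypothetical helper, and the join is done on a single shared variable.

```python
def left_join(solutions, optional_matches, key):
    """Model of OPTIONAL: extend a row with each compatible optional match,
    or keep the row unchanged when nothing matches."""
    result = []
    for sol in solutions:
        matches = [opt for opt in optional_matches if opt[key] == sol[key]]
        if matches:
            result.extend({**sol, **opt} for opt in matches)
        else:
            result.append(dict(sol))  # variable stays unbound (absent) in this row
    return result


required = [{"object": "a"}, {"object": "b"}]
optional = [{"object": "a", "variableOfInterest": "v1"}]

rows = left_join(required, optional, "object")
```

In the first row ?variableOfInterest is bound; in the second it is simply absent, which is how an unbound variable usually surfaces in a result-set API.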

Why does this variable never have a value?

I am checking whether an instance has a value for a specific predicate. If it does, I want to bind that value to a specific variable; otherwise, I want to bind that variable to the integer value 1. This is my code:
select ?boosted where {
:r1 a ?x
optional
{
?item rs:boostedBy ?boostedOptional
bind (if(bound(?boostedOptional), ?boostedOptional, "1"^^xsd:integer) as ?boosted)
}
}
The value of ?boosted is always empty. Why?
Note
I don't think you need data to see why my code is not working, because to me it looks like a general mistake in how I use bound. However, if you want data, I can provide it.
Note 2
There is no rs:boostedBy predicate in the data in the first place, so I was expecting to always get the default value, which is the integer 1 in this case.
The IF needs to be outside of the optional graph pattern:
SELECT ?boosted WHERE {
:r1 a ?x
OPTIONAL{ ?item rs:boostedBy ?boostedOptional . }
BIND (IF(bound(?boostedOptional), ?boostedOptional, "1"^^xsd:integer) as ?boosted)
}
Secondly, I don't see the relationship between the rs:boostedBy property and the {:r1 a ?x} triple pattern. I.e. are you trying to see if the subject has a boostedBy property? In that case :r1 and ?item should be the same, i.e. both should be :r1 or both should be ?item, if I'm understanding your intent here.
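Why the position of the BIND matters can be seen in a small model of the evaluation order. The Python sketch below is an approximation under stated assumptions: rows are dictionaries, `optional` and `bind_boosted` are hypothetical helpers, the OPTIONAL shares no variables with the required pattern (as in the question), and the data contains no rs:boostedBy triples.

```python
DEFAULT = 1


def bind_boosted(row):
    """Model of BIND(IF(bound(?boostedOptional), ?boostedOptional, 1) AS ?boosted)."""
    return {**row, "boosted": row.get("boostedOptional", DEFAULT)}


def optional(rows, optional_rows):
    """Model of OPTIONAL with no shared variables: combine each row with every
    optional row, or keep the row unchanged when the optional part has no solutions."""
    if not optional_rows:
        return [dict(r) for r in rows]
    return [{**r, **o} for r in rows for o in optional_rows]


required = [{"x": "ex:SomeClass"}]  # solutions of `:r1 a ?x`
boosted_by_matches = []             # no rs:boostedBy triples in the data

# BIND inside OPTIONAL: it only runs on optional solutions, and there are none.
inside = optional(required, [bind_boosted(o) for o in boosted_by_matches])

# BIND after OPTIONAL: it runs on every row, so the default is applied.
outside = [bind_boosted(r) for r in optional(required, boosted_by_matches)]
```

With the BIND inside the OPTIONAL, the row comes back without ?boosted; with the BIND after it, ?boosted is 1.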