SPIN representation to SPARQL

Is there an API that could help convert SPIN representation (of a SPARQL query) back to its SPARQL query form?
From:
[ a <http://spinrdf.org/sp#Select> ;
<http://spinrdf.org/sp#where> ( [ <http://spinrdf.org/sp#object> [ <http://spinrdf.org/sp#varName>
"o"^^<http://www.w3.org/2001/XMLSchema#string> ] ;
<http://spinrdf.org/sp#predicate>
[ <http://spinrdf.org/sp#varName>
"p"^^<http://www.w3.org/2001/XMLSchema#string> ] ;
<http://spinrdf.org/sp#subject>
[ <http://spinrdf.org/sp#varName>
"s"^^<http://www.w3.org/2001/XMLSchema#string> ]
] )
] .
To:
SELECT *
WHERE {
?s ?p ?o .
}
Thanks in advance.

I know of two Jena-based APIs for working with SPIN.
You can use either org.topbraid:shacl:1.0.1, which is based on jena-arq:3.0.4, or org.spinrdf:spinrdf:3.0.0-SNAPSHOT (mentioned in the comments), which is a fork of the first with changed namespaces and updated dependencies.
Note that the first (original) API may also work with a modern Jena (3.13.x); at least your task can be solved that way.
The second API has no Maven release yet, but it can be included in your project via JitPack.
To solve the problem, find the root org.apache.jena.rdf.model.Resource and cast it to org.topbraid.spin.model.Select (or org.spinrdf.model.Select) using Jena polymorphism, i.e. the operation org.apache.jena.rdf.model.RDFNode#as(Class).
Then #toString() will return the desired query with the model's prefixes.
Note that all personalities are already included in the model via static initialization.
A demonstration of this approach is SpinTransformer from the ONT-API test scope, which transforms SPARQL-based queries to an equivalent form with sp:text.


RDFlib Blank node update query

I am trying to update the object of a triple that has a blank node as its subject, using RDFlib. I first select the blank node in the first function and then insert it into the update query in the second function; however, this doesn't give me the required output. I can't use the add() method or initBindings, as I need to save the SPARQL query executed for the user.
Sample data
@prefix rr: <http://www.w3.org/ns/r2rml#> .
[ rr:objectMap [ rr:column "age" ;
rr:language "dhhdhd"] ].
Code
from rdflib import Graph
from rdflib.plugins.sparql.processor import processUpdate

mapping_graph = Graph().parse("valid_mapping.ttl", format="ttl")

# find the blank node for the update query
def find_om_IRI():
    query = """SELECT ?om
               WHERE {
                 ?om rr:language 'dhhdhd' .
               }
            """
    qres = mapping_graph.query(query)
    for row in qres:
        return row[0]

# insert blank node as subject to update query
def change_language_tag():
    om_IRI = find_om_IRI()
    update_query = """
        PREFIX rr: <http://www.w3.org/ns/r2rml#>
        DELETE DATA{
            _:%s rr:language 'dhhdhd' .
        }
        """ % (om_IRI)
    processUpdate(mapping_graph, update_query)
    print(update_query)
    print(mapping_graph.serialize(format="ttl").decode("utf-8"))
    return update_query

change_language_tag()
This, however, returns the following output, leaving the graph unchanged.
@prefix rr: <http://www.w3.org/ns/r2rml#> .
[ rr:objectMap [ rr:column "age" ;
rr:language "dhhdhd"] ].
If you filter based on the blank node value, the update goes through. This is the final query I came up with.
PREFIX rr: <http://www.w3.org/ns/r2rml#>
DELETE { ?om rr:language "dhhdhd" }
INSERT { ?om rr:language "en-fhfhfh" }
WHERE {
SELECT ?om
WHERE {
?om rr:language "dhhdhd" .
FILTER(str(?om) = "ub1bL24C24").
}
}
Indeed, as commenter @TallTed says, "Blank nodes cannot be directly referenced in separate queries". You are trying to do something with blank nodes for which they are expressly not designed, namely persisting their absolute identity, e.g. across separate queries. You should instead use relative identification (locate the blank node with reference to identified nodes, i.e. nodes with URIs) or a single SPARQL query. So this is an RDF/SPARQL question, not an RDFlib question.
You said: "I can't combine the queries as there could be other object maps with the same language tag". So if you cannot deterministically refer to a node due to its lack of distinctness, you will have to change your data, but I suspect you can - see the suggestion at the end.
Then you said "I have figured out the solution and have updated the question accordingly. Its a hack really..." Yes, don't do this! You should have a solution that doesn't depend on the quirks of RDFlib. RDF and the Semantic Web in general are all about universally defined, standard data and querying, so don't rely on a particular toolkit for a data question like this. Use RDFlib only as an implementation, one that should be replicable in another language. I personally model all my RDFlib triple adding/deleting/selecting code as standard SPARQL queries first, so that my RDFlib code is then just a standard function equivalent.
In your own answer you said "If you filter based on the blank node value...". Don't do this either!
My suggestion is to change your underlying data to include representations of things (named nodes etc.) that you can fix on for querying. If you cannot distinguish between the things you want to change without resorting to hacks, then you have a data modelling problem that needs solving. I do think you can distinguish the object maps, though.
In your data, you must be able to fix on the particular object map for which you are changing the language. Is the object map unique per column and is the column uniquely identified by its rr:column value? If so:
SELECT ?lang
WHERE {
  ?om rr:column ?col .  ?om rr:language ?lang .
  FILTER (?col = "age")
}
This matches the object map for the column "age" and returns its current language. So, to change it:
DELETE {
  ?om rr:language ?lang .
}
INSERT {
  ?om rr:language "new-language" .
}
WHERE {
  ?om rr:column ?col .  ?om rr:language ?lang .
  FILTER (?col = "age")
}

How to compare values, ignoring diacritics, with SPARQL

I've been trying (with no success so far) to filter values with a "broader equals" condition. That is, ignoring diacritics.
select * where {
?s per:surname1 ?t.
bind (fn:starts-with(str(?t),'Maria') as ?noAccent1) .
bind (fn:translate(str(?t),"áéíóú","aeiou") as ?noAccent2) .
} limit 100
So far I've tried the XPath functions fn:contains, fn:compare, fn:translate and fn:starts-with, but none of them seems to work.
Is there any other way (other than chaining replaces) to add collation into these functions or achieve the same goal?
The XPath functions you mention are not part of the SPARQL standard, so, as you found out, you can't rely on them being supported out of the box (though some vendors may provide them as an add-on).
However, GraphDB (which is based on RDF4J) allows you to create your own custom functions in SPARQL. It is a matter of writing a Java class that implements the org.eclipse.rdf4j.query.algebra.evaluation.function.Function interface, and registering it in the RDF4J engine by packaging it as a Java Service Provider Interface (SPI) implementation.
SPARQL and REGEX do not support efficient transliteration via character maps. If you want an efficient implementation, you would need a custom RDF4J function as described by Jeen.
If you want a quick and dirty solution, use this code sample:
PREFIX fn: <http://www.w3.org/2005/xpath-functions#>
PREFIX spif: <http://spinrdf.org/spif#>
select * where {
  BIND("Mariana" as ?t) .
  BIND("Márénísótú" as ?t2) .
  BIND (regex(str(?t),'^Maria') as ?noAccent1) .
  BIND (spif:replaceAll(
          spif:replaceAll(
            spif:replaceAll(
              spif:replaceAll(
                spif:replaceAll(str(?t2),"á","a"),
              "é","e"),
            "í","i"),
          "ó","o"),
        "ú","u") as ?noAccent2) .
}
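If the chain of replacements has to cover more characters, writing it by hand gets error-prone. A small sketch of one way around that is to generate the nested expression from a character map on the client side. It uses the standard SPARQL 1.1 REPLACE function rather than spif:replaceAll; the helper name nested_replace and the ACCENTS map are illustrative, not part of any library.

```python
# Character map to fold; extend it as needed for your data.
ACCENTS = {"á": "a", "é": "e", "í": "i", "ó": "o", "ú": "u"}


def nested_replace(expr, mapping):
    """Wrap expr in one SPARQL REPLACE() call per (accented, plain) pair."""
    for accented, plain in mapping.items():
        expr = 'REPLACE(%s, "%s", "%s")' % (expr, accented, plain)
    return expr


expr = nested_replace("STR(?t2)", ACCENTS)

# Assemble a query equivalent in spirit to the hand-written sample.
query = (
    "SELECT ?noAccent2 WHERE {\n"
    '  BIND("Márénísótú" AS ?t2) .\n'
    "  BIND(%s AS ?noAccent2) .\n"
    "}" % expr
)
print(query)
```

This only saves typing; the endpoint still evaluates one REPLACE per character, so it is no more efficient than the hand-written chain.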

How Do I Query Against Data.gov

I am trying to teach myself this weekend how to run API queries against a data source, in this case data.gov. At first I thought I'd use a simple SQL variant, but it seems that in this case I have to use SPARQL.
I've read through the documentation, downloaded Twinkle, and can't seem to quite get it to run. Here is an example of a query I'm running. I'm basically trying to find all gas stations that are null around Denver, CO.
PREFIX station: https://api.data.gov/nrel/alt-fuel-stations/v1/nearest.json?api_key=???location=Denver+CO
SELECT *
WHERE
{ ?x station:network ?network like "null"
}
Any help would be very much appreciated.
SPARQL is a graph pattern language for RDF triples. A query consists of a set of "basic graph patterns" described by triple patterns of the form <subject>, <predicate>, <object>. RDF defines the subject and predicate with URIs, and the object is either a URI (object property) or a literal (datatype or language-tagged property). Each triple pattern in a query must therefore have three entities.
Since we don't have any examples of your data, I'll provide a way to explore the data a bit. Let's assume your prefix is correctly defined, which I doubt - it will not be the REST API URL, but the URI of the entity itself. Then you can try the following:
PREFIX station: <http://api.data.gov/nrel...>
SELECT *
WHERE
{ ?s station:network ?network .
}
...setting the PREFIX to correctly represent the namespace for network. Then look at the binding for ?network and find out how they represent null. Let's say it is a string as you show. Then the query would look like:
PREFIX station: <http://api.data.gov/nrel...>
SELECT ?s
WHERE
{ ?s station:network "null" .
}
There is no LIKE operator in SPARQL, but you could use a FILTER clause with regex or one of SPARQL's other string-matching functions.
And please, please, please google "SPARQL" and "RDF". There is lots of information about SPARQL, and the W3C's SPARQL 1.1 Query Language Recommendation is a comprehensive source with many good examples.

DBpedia get all cities in the world - missing a few

I use this SPARQL query to get as many cities as possible:
select * where {
?city rdf:type dbo:PopulatedPlace
}
However, some expected ones are missing e.g.
http://dbpedia.org/resource/Heidelberg
(neither that nor one of its wikiRedirects)
which is a dbo:PopulatedPlace, as this query returns true (in JSON):
ask {
:Heidelberg a dbo:PopulatedPlace
}
I need that list to be exhaustive because later I will add constraints based on user input.
I use http://dbpedia.org/snorql/ to test the queries.
Any help is appreciated.
UPDATE:
One of the devs told me the public endpoint limits results (to about 1K).
I'll come up with a paginated solution and see if it contains the 'outlier'.
UPDATE2:
The outlier is definitely in the result set of rdf:type dbo:Town.
Using dbo:PopulatedPlace yields too many results to check by hand, though.
The public endpoint limits results to about 1K. Pagination or use of a smaller subclass of dbo:PopulatedPlace yields the result.
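A rough sketch of the pagination idea: append LIMIT and OFFSET to the base query and keep requesting pages until one comes back short. The function fetch_all and the run_query callable are hypothetical; in real use run_query would send the query to the endpoint, while here a stand-in (fake_endpoint, serving 2500 made-up rows) keeps the sketch self-contained. The ORDER BY is added because, without a fixed order, consecutive pages are not guaranteed to line up.

```python
import re

BASE_QUERY = "SELECT ?city WHERE { ?city rdf:type dbo:PopulatedPlace }"


def fetch_all(run_query, base_query, page_size=1000):
    """Collect rows page by page until a short page signals the end."""
    rows = []
    offset = 0
    while True:
        page_query = "%s\nORDER BY ?city\nLIMIT %d OFFSET %d" % (
            base_query, page_size, offset)
        page = run_query(page_query)
        rows.extend(page)
        if len(page) < page_size:
            return rows
        offset += page_size


# Stand-in for a real endpoint: 2500 made-up rows, sliced by LIMIT/OFFSET.
CITIES = ["city%04d" % i for i in range(2500)]


def fake_endpoint(query):
    limit = int(re.search(r"LIMIT (\d+)", query).group(1))
    offset = int(re.search(r"OFFSET (\d+)", query).group(1))
    return CITIES[offset:offset + limit]


all_rows = fetch_all(fake_endpoint, BASE_QUERY)
print(len(all_rows))
```

The page size of 1000 mirrors the "about 1K" figure mentioned above; the actual cap of a given endpoint may differ.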

SPARQL-Query results invalid?

I run a Virtuoso server and was missing a number of results when making a SPARQL SELECT request. I tracked it down and found some really strange behaviour that I cannot explain.
But to start from the beginning.
The endpoint I query can be found at http://creativeartefact.org/sparql
I) Check for a specific triple:
ASK WHERE {
<http://creativeartefact.org/mbrainzImport/f18e677a-4051-486a-aa64-d9a3bfef90af>
<http://creativeartefact.org/ontology/represents>
<http://creativeartefact.org/mbrainzImport/35ed9f2a-6ce4-44ca-9c7a-967377b0e007>. }
The query returns TRUE
II) Now getting a bit more unspecific:
ASK WHERE {
?s
<http://creativeartefact.org/ontology/represents>
<http://creativeartefact.org/mbrainzImport/35ed9f2a-6ce4-44ca-9c7a-967377b0e007>. }
If the first returns true, the second should as well, shouldn't it? But it doesn't. It returns FALSE!
If I replace the predicate or the object with a variable, it returns true as expected. Only when I use a variable for the subject does it return false.
That the data really exists in the triple store can be tested by running the query
SELECT * WHERE {
?s
?p
<http://creativeartefact.org/mbrainzImport/35ed9f2a-6ce4-44ca-9c7a-967377b0e007>. }
You will see that both results come with p = http://creativeartefact.org/ontology/represents, which is exactly the predicate I am asking for in the former query.
To make it even stranger, there ARE queries of the aforementioned form that do return triples:
select * {
?s
<http://creativeartefact.org/ontology/represents>
<http://creativeartefact.org/mbrainzImport/e4003568-5645-4ee1-abd0-2e8156272e59>. }
Any idea what is happening here?
Thanks in advance,
Frank
The Virtuoso being used is an original 07.00.3203 build from 2013.
I would suggest upgrading to the latest Virtuoso 07.10.3211, open source or commercial, depending on which is in use here, and see if the problem persists ...