search for literal with different language code - sparql

I have two graphs: in one the literals are language-tagged (@de), in the other they are untagged. I need a join between the two. The trivial solution with a filter runs very slowly.
WHERE {
?tok nlp:lemma ?lem .
?tok2 wn:form ?t .
filter (str(?lem) = ?t) .
...
An improved version, which works with Fuseki, is:
WHERE {
?tok nlp:lemma ?lem .
Bind (str(?lem) as ?lems) .
?lu :orthForm ?lems .
...
I tried ?lu :xx (str(?lem)) . but this is flagged as an error. Why?
The same happens when using value ?lems {str(?lem)}.
I naively assume that the BIND does not create much overhead, so the above solution is probably OK.
Would the same approach work for searching when the language codes are different (my previous question)?

The only things allowed in a triple pattern are variables, URIs, literals (in the object position) and blank nodes. Hence, instead of the pattern ?lu :xx (str(?lem)), you will need to use a BIND or a projection expression to convert the value to a string. Taking the first example:
WHERE {
?lu :xx ?langLem .
BIND(str(?langLem) AS ?lem)
}
Or, using projection:
SELECT (str(?langLem) AS ?lem)
WHERE {
?lu :xx ?langLem .
}
I assume you are trying to use the VALUES statement in value ?lems {str(?lem)}. VALUES is normally used to bind variables to a set of values, e.g.:
VALUES ?lem { :Euclid :Gauss }
?lem rdfs:label ?label .
...binds ?lem to :Euclid and :Gauss and executes the query, returning the union of results. I.e. it is the same as:
{ :Euclid rdfs:label ?label }
UNION
{ :Gauss rdfs:label ?label }
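Applied to the join in the question, a minimal sketch could look like this. The prefix IRIs are placeholders (the question does not give them), and the "en" tag in the commented variant is only an assumed example of a differing language code. STR and STRLANG are standard SPARQL 1.1 functions.

```sparql
# Placeholder namespaces; substitute the real ones.
PREFIX nlp: <http://example.org/nlp#>
PREFIX wn:  <http://example.org/wn#>

SELECT ?tok ?tok2
WHERE {
  ?tok nlp:lemma ?lem .              # tagged literal, e.g. "Haus"@de
  BIND(STR(?lem) AS ?plain)          # lexical form only: "Haus"
  ?tok2 wn:form ?plain .             # joins against the untagged graph

  # Variant for a graph that uses a different tag (assumed "en" here):
  # BIND(STRLANG(STR(?lem), "en") AS ?retagged)
  # ?tok2 wn:form ?retagged .
}
```

Because the BIND produces an ordinary RDF term, the following triple pattern can be matched by index lookup instead of comparing every pair of rows, which is probably why it is faster than the FILTER version.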

Related

How to bind a string with a variable in SPARQL

The following query
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX edm: <http://www.europeana.eu/schemas/edm/>
PREFIX ore: <http://www.openarchives.org/ore/terms/>
SELECT *
{
SERVICE <http://sparql.europeana.eu> {
?CHO ore:proxyIn ?proxy;
dc:title ?title ;
dc:creator ?creator ;
dc:date ?date .
FILTER REGEX(str(?creator),"corelli","i").
?proxy edm:isShownBy ?mediaURL .
}
}
returns some titles of works by "corelli" from the Europeana knowledge base. But when I try to bind a variable ?surname to the string "corelli", it returns nothing.
E.g.
SELECT *
{
values ?surname { "corelli" }
SERVICE <http://sparql.europeana.eu> {
?CHO ore:proxyIn ?proxy;
dc:title ?title ;
dc:creator ?creator ;
dc:date ?date .
FILTER REGEX(str(?creator),?surname,"i").
?proxy edm:isShownBy ?mediaURL .
}
}
The FILTER REGEX expression does not seem to work there (I am using a Blazegraph instance to run this query). Using str(?surname) instead of ?surname does not help either.
Why is this so?
Does anyone know how to set ?surname to some string values (which I might want to gather from another endpoint) and still have the query find data?
Most likely you don't see results because the query times out. It's too heavy for the public endpoint you're using. Actually, both queries are too heavy, and if you make a few attempts with different string values you might notice that sometimes the first query times out too, and sometimes the second query returns some results.
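Timeouts aside, it may also be worth checking where ?surname gets bound. The SERVICE group is evaluated by the remote endpoint, and whether bindings made outside it are passed along depends on the local engine. If they are not, ?surname is unbound remotely, REGEX raises an error and the FILTER rejects every row. A sketch that moves the VALUES clause inside the SERVICE block, assuming the remote endpoint supports SPARQL 1.1 VALUES:

```sparql
PREFIX dc:  <http://purl.org/dc/elements/1.1/>
PREFIX edm: <http://www.europeana.eu/schemas/edm/>
PREFIX ore: <http://www.openarchives.org/ore/terms/>

SELECT *
{
  SERVICE <http://sparql.europeana.eu> {
    # Bound inside the remote query, so the FILTER below can see it.
    VALUES ?surname { "corelli" }
    ?CHO ore:proxyIn ?proxy ;
         dc:title ?title ;
         dc:creator ?creator ;
         dc:date ?date .
    FILTER REGEX(STR(?creator), ?surname, "i")
    ?proxy edm:isShownBy ?mediaURL .
  }
}
```

This does not make the query any lighter, so it can still time out for the reasons given above.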

RDFising data with SPARQL and SPIN

I want to RDFise some data. Using a SPARQL query (I'm using SPIN), I need to construct an object (Book) with two properties (Title and Author). All books have a "Title", but some do not have an "Author".
When that happens, the "Book" is not created at all, and I want it to be created with just its "Title".
I'm using GraphDB and this is the query:
prefix spif: <http://spinrdf.org/spif#>
prefix pres: <http://example.com/pruebardf/>
CONSTRUCT {
?rdfIRI a pres:Book ;
pres:Author ?author .
}
WHERE {
SERVICE <http://localhost:7200/rdf-bridge/1683716393221> {
?bookRow a <urn:Row> ;
<urn:col:Author> ?author ;
<urn:col:Title> ?title .
}
BIND(IRI(CONCAT("http://example.com/", spif:encodeURL(?title))) AS ?rdfIRI)
}
Is there a solution? I am open to using other SPARQL syntax.
Use OPTIONAL in the SERVICE part to make the pattern not fail when <urn:col:Author> is missing.
The CONSTRUCT will then simply not put in the ?rdfIRI pres:Author ?author triple but will include ?rdfIRI a pres:Book.
If you want to set ?author when it is missing in the data, look at using COALESCE.
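A minimal sketch of that suggestion, reusing the query from the question. The pres:Title property is an assumption (the question mentions a Title property but the original CONSTRUCT does not emit one), and the "Unknown" default in the commented line is purely illustrative.

```sparql
PREFIX spif: <http://spinrdf.org/spif#>
PREFIX pres: <http://example.com/pruebardf/>

CONSTRUCT {
  ?rdfIRI a pres:Book ;
          pres:Title ?title ;     # assumed property name
          pres:Author ?author .   # skipped for rows where ?author is unbound
}
WHERE {
  SERVICE <http://localhost:7200/rdf-bridge/1683716393221> {
    ?bookRow a <urn:Row> ;
             <urn:col:Title> ?title .
    # Author is optional, so rows without it still match.
    OPTIONAL { ?bookRow <urn:col:Author> ?author }
  }
  BIND(IRI(CONCAT("http://example.com/", spif:encodeURL(?title))) AS ?rdfIRI)

  # To emit a default instead of omitting the triple, bind a new variable
  # and use it in the template in place of ?author:
  # BIND(COALESCE(?author, "Unknown") AS ?authorOrDefault)
}
```

A CONSTRUCT template triple that contains an unbound variable is simply left out of the result, which is why the other triples for the book are still produced.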

using bind concat in construct query

I have the following query
CONSTRUCT{
?entity a something;
a label ?label .
}
WHERE
{
?entity a something;
a label ?label .
BIND(CONCAT(STR( ?label ), " | SOME ADDITIONAL TEXT I WOULD LIKE TO APPEND MANUALLY") ) AS ?label ) .
}
I simply want to concatenate some text with ?label, however when running the query I get the following error:
BIND clause alias '?label' was previously used
I only want to return a single instance of ?label hence, I defined it in the construct clause.
The error message is accurate, but it is only the first of many you will get with this query. The usual advice applies: take a look at some SPARQL learning resources to at least understand the basics of triple-based graph pattern matching. Here are a couple of hints on what to look for. CONSTRUCT isn't a bad place to start, and the following should almost do what I think you intend:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex: <http://example.org/example#>
CONSTRUCT{
?entity rdfs:label ?label .
}
WHERE
{
?entity a ex:something ;
rdfs:label ?oldlabel .
BIND(CONCAT(STR( ?oldlabel ), " | SOME ADDITIONAL TEXT I WOULD LIKE TO APPEND MANUALLY") AS ?label ) .
}
There are quite a few things different about that query, so take a look to see if it accurately does what you want. One hint is the syntactic difference between using '.' and ';' to separate the triple patterns. Another is that each term in a triple pattern is either a URI, written as a qname in the example, or a variable, prefixed by a '?'. Neither 'label' nor 'something' is valid.
I say "almost" because CONSTRUCT only returns a set of triples. To modify the labels, which I think is the intent, you need to use SPARQL Update, i.e.:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex: <http://example.org/example#>
DELETE {
?entity rdfs:label ?oldlabel .
}
INSERT{
?entity rdfs:label ?label .
}
WHERE
{
?entity a ex:something .
?entity rdfs:label ?oldlabel .
BIND(CONCAT(STR( ?oldlabel ), " | SOME ADDITIONAL TEXT I WOULD LIKE TO APPEND MANUALLY") AS ?label ) .
}
Note how the triple pattern finds matches for ?oldlabel and deletes them, inserting the newly bound ?label instead. This query assumes a default graph is defined that holds both the original data and the target for updates. If not then the graph needs to be specified using WITH or GRAPH. (Also included another hint on the syntactic difference between using '.' and ';' to separate triple patterns.)
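If the data lives in a named graph, a sketch of the same update using WITH might look like this. The graph IRI is hypothetical, and the appended text is shortened for readability.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex:   <http://example.org/example#>

# WITH makes the named graph the target of DELETE and INSERT
# and the graph matched by WHERE.
WITH <http://example.org/graph/labels>   # hypothetical graph IRI
DELETE {
  ?entity rdfs:label ?oldlabel .
}
INSERT {
  ?entity rdfs:label ?label .
}
WHERE {
  ?entity a ex:something ;
          rdfs:label ?oldlabel .
  BIND(CONCAT(STR(?oldlabel), " | SOME ADDITIONAL TEXT") AS ?label)
}
```

Running the update a second time appends the text again, since the new label then matches as ?oldlabel.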

Get string without the language tag

A SPARQL query like:
SELECT distinct * where {
?x dc:title ?title .
}
is always returning ?title with a language tag. How can I obtain the plain string of an RDF language-tagged literal, e.g. return "English"@en as just "English"?
I guess you want to show the results from one language only. If that is the case, you can strip the language tag with:
SELECT distinct ?stripped_title where {
?x dc:title ?title .
BIND (STR(?title) AS ?stripped_title)
}
but this is meaningful only after you filter your results for the desired language, e.g.
FILTER ( LANG(?title) = "en" )
Otherwise the results can be confusing to read: for example, you might get seemingly duplicate answers when the label just happens to be the same in two different languages.
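Putting the two pieces together, a minimal sketch might be the following. The dc: namespace is assumed to be Dublin Core elements, since the question does not show its prefix declarations.

```sparql
PREFIX dc: <http://purl.org/dc/elements/1.1/>

SELECT DISTINCT ?stripped_title
WHERE {
  ?x dc:title ?title .
  # Keep only English titles; LANGMATCHES(LANG(?title), "en")
  # would also accept regional tags such as "en-GB".
  FILTER ( LANG(?title) = "en" )
  # STR drops the language tag and keeps the lexical form.
  BIND ( STR(?title) AS ?stripped_title )
}
```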

Limit a SPARQL query to one dataset

I'm working with the following SPARQL query, which is an example on the web front end of my institution's SPARQL endpoint:
SELECT ?building_number ?name ?occupants WHERE {
?site a org:Site ;
rdfs:label "Highfield Campus" .
?building spacerel:within ?site ;
skos:notation ?building_number ;
rdfs:label ?name .
OPTIONAL {
?building soton:buildingOccupants ?occ .
?occ rdfs:label ?occupants .
} .
} ORDER BY ?name
The problem is that, as well as getting data from 'Buildings and Places' (the dataset I'm interested in, and the one I would expect the example to use), it also gets data from the 'Facilities and Equipment' dataset, which isn't relevant. You should see this if you follow the link.
I suspect the example may pre-date the addition of the Facilities and Equipment dataset, but even with the research I've done into SPARQL, I can't see a clear way to define which datasets to include.
Can anyone recommend a starting point to limit it to just show 'Buildings' or, more specifically, results from the 'Buildings and Places' dataset?
Thanks
First things first, you really need to use SELECT DISTINCT, as otherwise you'll get repeated results.
To answer your question, you can use GRAPH { ... } to restrict certain parts of a SPARQL query to only match data from a specific dataset. This only works if the SPARQL endpoint is divided up into GRAPHs (this one is). The solution you asked for isn't the best choice, as it assumes that things within sites in the 'places' dataset will always be restricted to buildings... That's risky -- as it might end up containing trees and signposts at some time in the future.
Step one is to just find out what graphs are in play:
SELECT DISTINCT ?g1 ?building_number ?name ?occupants WHERE {
?site a org:Site ;
rdfs:label "Highfield Campus" .
GRAPH ?g1 { ?building spacerel:within ?site ;
skos:notation ?building_number ;
rdfs:label ?name .
}
OPTIONAL {
?building soton:buildingOccupants ?occ .
?occ rdfs:label ?occupants .
} .
} ORDER BY ?name
Try it here: http://is.gd/WdRAGX
From this you can see that http://id.southampton.ac.uk/dataset/places/latest and http://id.southampton.ac.uk/dataset/places/facilities are the two relevant ones.
To only look for things 'within' a site according to the "places" graph, use:
SELECT DISTINCT ?building_number ?name ?occupants WHERE {
?site a org:Site ;
rdfs:label "Highfield Campus" .
GRAPH <http://id.southampton.ac.uk/dataset/places/latest> {
?building spacerel:within ?site ;
skos:notation ?building_number ;
rdfs:label ?name .
}
OPTIONAL {
?building soton:buildingOccupants ?occ .
?occ rdfs:label ?occupants .
} .
} ORDER BY ?name
Alternate solutions:
Using rdf:type
Above I've answered your question, but it's not the answer to your problem. The following solution is more semantic, as it actually says 'only give me buildings within the campus', which is what you really mean.
Instead of filtering by graph, which is not very 'semantic', you could restrict ?building to be of class 'building', which research facilities are not. Facilities are still sometimes listed as 'within' a site, usually when the uni has only published which campus they are on but not which building.
?building a rooms:Building
Using FILTER
In extreme cases you may not have data in different GRAPHS and there may not be an elegant relationship to use to filter your results. In this case you can use a FILTER and turn the building URI into a string and use a regular expression to match acceptable ones:
FILTER regex(str(?building), "^http://id.southampton.ac.uk/building/")
This is by far the worst option; don't use it unless you have to.
Belt and Braces
You can use any of these restrictions together, and a combination of restricting the GRAPH plus ensuring that all ?buildings really are buildings would be my recommended solution.
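A sketch of that combination is below. The prefix IRIs are assumptions, because the original example relies on prefixes predefined by the endpoint, so check them against the endpoint's own list. It also assumes the rdf:type triple for each building is stored in the places graph; if it is not, move that pattern outside the GRAPH block.

```sparql
# Assumed prefix IRIs; verify against the endpoint before use.
PREFIX rdfs:     <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos:     <http://www.w3.org/2004/02/skos/core#>
PREFIX org:      <http://www.w3.org/ns/org#>
PREFIX spacerel: <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/>
PREFIX rooms:    <http://vocab.deri.ie/rooms#>
PREFIX soton:    <http://id.southampton.ac.uk/ns/>

SELECT DISTINCT ?building_number ?name ?occupants WHERE {
  ?site a org:Site ;
        rdfs:label "Highfield Campus" .
  # Belt: only trust the 'places' dataset for what is within the site.
  GRAPH <http://id.southampton.ac.uk/dataset/places/latest> {
    # Braces: and only accept things typed as buildings.
    ?building a rooms:Building ;
              spacerel:within ?site ;
              skos:notation ?building_number ;
              rdfs:label ?name .
  }
  OPTIONAL {
    ?building soton:buildingOccupants ?occ .
    ?occ rdfs:label ?occupants .
  }
} ORDER BY ?name
```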