RDFising data with SPARQL and SPIN

I want to RDFise data. With a SPARQL query (I'm using SPIN), I need to construct an object (Book) with two properties (Title and Author). All books have a "Title", but some have no "Author".
When this happens, the "Book" is not created at all, and I want it to be created with just the "Title".
I'm using GraphDB and this is the query:
prefix spif: <http://spinrdf.org/spif#>
prefix pres: <http://example.com/pruebardf/>
CONSTRUCT {
?rdfIRI a pres:Book ;
pres:Author ?author .
}
WHERE {
SERVICE <http://localhost:7200/rdf-bridge/1683716393221> {
?bookRow a <urn:Row> ;
<urn:col:Author> ?author ;
<urn:col:Title> ?title .
}
BIND(IRI(CONCAT("http://example.com/", spif:encodeURL(?title))) AS ?rdfIRI)
}
Is there a solution? I can use other SPARQL syntax.

Use OPTIONAL in the SERVICE part to make the pattern not fail when <urn:col:Author> is missing.
The CONSTRUCT will then simply omit the ?rdfIRI pres:Author ?author triple for such rows, but will still include ?rdfIRI a pres:Book.
If you want to set ?author when it is missing in the data, look at using COALESCE.
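A minimal sketch of the OPTIONAL approach applied to the query from the question, assuming the same rdf-bridge endpoint and column IRIs. The commented-out COALESCE line uses "Unknown" as a hypothetical default value that does not come from the original data.

```sparql
prefix spif: <http://spinrdf.org/spif#>
prefix pres: <http://example.com/pruebardf/>
CONSTRUCT {
  ?rdfIRI a pres:Book ;
          pres:Author ?author .
}
WHERE {
  SERVICE <http://localhost:7200/rdf-bridge/1683716393221> {
    # Required part: every row has a title.
    ?bookRow a <urn:Row> ;
             <urn:col:Title> ?title .
    # Optional part: rows without an author still match.
    OPTIONAL { ?bookRow <urn:col:Author> ?author . }
  }
  BIND(IRI(CONCAT("http://example.com/", spif:encodeURL(?title))) AS ?rdfIRI)
  # Hypothetical fallback, only if a default value is wanted; the template
  # would then use ?authorOrDefault instead of ?author:
  # BIND(COALESCE(?author, "Unknown") AS ?authorOrDefault)
}
```

Note that the template, as in the question, never emits the title itself. Adding a triple such as pres:Title ?title (a hypothetical property name) to the CONSTRUCT part would include it.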

Related

SPARQL CONSTRUCT command order of triples in created resource

My SPARQL code out of the Learning SPARQL book:
CONSTRUCT
{
?s dm:problem dm:prob29 .
dm:prob29 rdfs:label "Location value must be a URI." .
}
WHERE
{
?s dm:location ?city .
FILTER (!(isURI(?city)))
}
-- creates a file like this:
dm:prob29 rdfs:label "Location value must be a URI." .
d:item693 dm:problem dm:prob29 .
Why is the "Location value must be a URI" triple output first, when the other triple (?s dm:problem dm:prob29) comes first in the CONSTRUCT template? I am not really sure how this works.
The order of the triples in a CONSTRUCT query's output is arbitrary and carries no meaning: the result is an RDF graph, which is an unordered set of triples.

How to bind a string with a variable in SPARQL

The following query
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX edm: <http://www.europeana.eu/schemas/edm/>
PREFIX ore: <http://www.openarchives.org/ore/terms/>
SELECT *
{
SERVICE <http://sparql.europeana.eu> {
?CHO ore:proxyIn ?proxy;
dc:title ?title ;
dc:creator ?creator ;
dc:date ?date .
FILTER REGEX(str(?creator),"corelli","i").
?proxy edm:isShownBy ?mediaURL .
}
}
returns the titles of some works by "corelli" from the Europeana knowledge base. But when I try to bind a variable ?surname to the string "corelli", it returns nothing.
For example:
SELECT *
{
values ?surname { "corelli" }
SERVICE <http://sparql.europeana.eu> {
?CHO ore:proxyIn ?proxy;
dc:title ?title ;
dc:creator ?creator ;
dc:date ?date .
FILTER REGEX(str(?creator),?surname,"i").
?proxy edm:isShownBy ?mediaURL .
}
}
The FILTER REGEX expression does not seem to work there (I am using a Blazegraph instance to run this query). It does not work with str(?surname) instead of ?surname either.
Why is this so?
Does anyone know how to set ?surname to some string values (which I might want to gather from another endpoint) so that the query finds data?
Most likely you don't see results because the query times out. It's too heavy for the public endpoint you're using. Actually, both queries are too heavy, and if you make a few attempts with different string values you might notice that sometimes the first query times out too, and sometimes the second query returns some results.

Get movie(s) based on book(s) from DBpedia

I am new to SPARQL and am trying to fetch a movie adapted from a specific book from DBpedia. This is what I have so far:
PREFIX onto: <http://dbpedia.org/ontology/>
SELECT *
WHERE
{
<http://dbpedia.org/page/2001:_A_Space_Odyssey> a ?type.
?type onto:basedOn ?book .
?book a onto:Book
}
I can't get any results. How can I do that?
When using any web resource, in your case the property :basedOn, you need to make sure that you have declared the right prefix. If you are querying from the DBpedia SPARQL endpoint, you can use dbo:basedOn directly, even without declaring it, as dbo: is among the predefined prefixes. Alternatively, if you want to use your own prefix, or if you are using another SPARQL client, make sure that whatever short name you choose is declared as the prefix for http://dbpedia.org/ontology/.
Next, to get more results, you may prefer not to restrict the type of the subject of this triple pattern, as there could be movies that are not actually typed as such. So, a query like this
select distinct *
{
?movie dbo:basedOn ?book .
?book a dbo:Book .
}
will give you lots of good results, but not all. For example, the resource from your example will be missing. You can easily check the available properties between these two resources with a query like this:
select ?p
{
{<http://dbpedia.org/resource/2001:_A_Space_Odyssey_(film)> ?p <http://dbpedia.org/resource/2001:_A_Space_Odyssey> }
UNION
{ <http://dbpedia.org/resource/2001:_A_Space_Odyssey> ?p <http://dbpedia.org/resource/2001:_A_Space_Odyssey_(film)>}
}
You'll get only one result:
http://www.w3.org/2000/01/rdf-schema#seeAlso
(note that the URI is with 'resource', not with 'page')
Then you may search for any path between the two resources, using the method described here, or find a combination of other patterns that would increase the number of results.

How to get composer of a symphony in dbpedia?

This is my query
select *
{
?symphonies_by_composer <http://www.w3.org/2004/02/skos/core#broader> <http://dbpedia.org/resource/Category:Symphonies_by_composer> .
?symphony <http://purl.org/dc/terms/subject> ?symphonies_by_composer .
}
I run it against the DBpedia endpoint http://dbpedia.org/sparql/
It gives me many symphonies. I want to construct my own triples, adding my own property, mo:composedBy, like this:
PREFIX mo: <http:blablabla.com/mo#>
construct
{
?symphony mo:composedBy ?composer .
?symphony a mo:Symphony
}
{
?symphonies_by_composer <http://www.w3.org/2004/02/skos/core#broader> <http://dbpedia.org/resource/Category:Symphonies_by_composer> .
?symphony <http://purl.org/dc/terms/subject> ?symphonies_by_composer .
}
but I don't know how to get the binding for the ?composer variable.
Do you know how?
(I'm aware that there might be no way to get it. If you think there is no way, kindly let me know and I will, unfortunately, skip that data.)
There seems to be no explicit relation in DBpedia connecting these symphonies to an actual resource that represents the composer.
A possible workaround is to extract the name of the composer from the prefLabel of the category, by snipping off the first bit ("Symphonies by"):
PREFIX mo: <http://example.com/mo#>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
CONSTRUCT
{
?symphony mo:composedBy ?composer_name .
?symphony a mo:Symphony
}
WHERE
{
?symphonies_by_composer skos:broader <http://dbpedia.org/resource/Category:Symphonies_by_composer> ;
skos:prefLabel ?label .
?symphony dct:subject ?symphonies_by_composer .
BIND(SUBSTR(STR(?label), (STRLEN("Symphonies by ") + 1)) AS ?composer_name)
}
This will give you back the name of each composer as a literal value.
A second possible step is to try to reconstruct the actual IRI of the resource identifying the composer from the name. For example, in the case of "Hans Werner Henze", the actual resource identifying the person is http://dbpedia.org/resource/Hans_Werner_Henze, so a further string operation or two, replacing spaces with underscores and concatenating with the DBpedia base IRI, will resolve this. However, this is brittle: there is no guarantee that the resource exists, and even if it does, no guarantee that it actually identifies the composer (there might be more than one Hans Werner Henze, for instance).
Of course, you can expand this further by doing followup queries to verify that the resource exists and is the correct one, but it will require some additional trial and error. If the goal is simply the name of the composer, the first example query should work fine for most instances.
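The second step can be sketched as follows, under the caveats already stated: the composer's IRI is guessed by replacing spaces with underscores and prepending the DBpedia resource base. The FILTER EXISTS check against dbo:Person is an added assumption, meant to drop guesses that do not resolve to a person. It does not prove that the person is the right one.

```sparql
PREFIX mo: <http://example.com/mo#>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX dbo: <http://dbpedia.org/ontology/>
CONSTRUCT
{
  ?symphony mo:composedBy ?composer .
  ?symphony a mo:Symphony
}
WHERE
{
  ?symphonies_by_composer skos:broader <http://dbpedia.org/resource/Category:Symphonies_by_composer> ;
                          skos:prefLabel ?label .
  ?symphony dct:subject ?symphonies_by_composer .
  # Strip the leading "Symphonies by " from the category label.
  BIND(SUBSTR(STR(?label), (STRLEN("Symphonies by ") + 1)) AS ?composer_name)
  # Guess the resource IRI: "Hans Werner Henze" -> .../resource/Hans_Werner_Henze
  BIND(IRI(CONCAT("http://dbpedia.org/resource/", REPLACE(?composer_name, " ", "_"))) AS ?composer)
  # Assumed sanity check: keep only guesses that are typed as a person.
  FILTER EXISTS { ?composer a dbo:Person }
}
```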

Limit a SPARQL query to one dataset

I'm working with the following SPARQL query, which is an example on the web-based front end of my institution's SPARQL endpoint:
SELECT ?building_number ?name ?occupants WHERE {
?site a org:Site ;
rdfs:label "Highfield Campus" .
?building spacerel:within ?site ;
skos:notation ?building_number ;
rdfs:label ?name .
OPTIONAL {
?building soton:buildingOccupants ?occ .
?occ rdfs:label ?occupants .
} .
} ORDER BY ?name
The problem is that, as well as getting data from 'Buildings and Places' (the dataset I'm interested in, and the one I would expect the example to use), it also gets data from the 'Facilities and Equipment' dataset, which isn't relevant. You should see this if you follow the link.
I suspect the example may pre-date the addition of the Facilities and Equipment dataset, but even with the research I've done into SPARQL, I can't see a clear way to define which datasets to include.
Can anyone recommend a starting point to limit it to just show 'Buildings', or, more specifically, results from the 'Buildings and Places' dataset.
Thanks
First things first, you really need to use SELECT DISTINCT, as otherwise you'll get repeated results.
To answer your question, you can use GRAPH { ... } to restrict certain parts of a SPARQL query to only match data from a specific dataset. This only works if the SPARQL endpoint is divided up into named graphs (this one is). The solution you asked for isn't the best choice, as it assumes that things within sites in the 'places' dataset will always be restricted to buildings. That's risky, as the dataset might end up containing trees and signposts at some time in the future.
Step one is to just find out what graphs are in play:
SELECT DISTINCT ?g1 ?building_number ?name ?occupants WHERE {
?site a org:Site ;
rdfs:label "Highfield Campus" .
GRAPH ?g1 { ?building spacerel:within ?site ;
skos:notation ?building_number ;
rdfs:label ?name .
}
OPTIONAL {
?building soton:buildingOccupants ?occ .
?occ rdfs:label ?occupants .
} .
} ORDER BY ?name
Try it here: http://is.gd/WdRAGX
From this you can see that http://id.southampton.ac.uk/dataset/places/latest and http://id.southampton.ac.uk/dataset/places/facilities are the two relevant ones.
To only look for things 'within' a site according to the "places" graph, use:
SELECT DISTINCT ?building_number ?name ?occupants WHERE {
?site a org:Site ;
rdfs:label "Highfield Campus" .
GRAPH <http://id.southampton.ac.uk/dataset/places/latest> {
?building spacerel:within ?site ;
skos:notation ?building_number ;
rdfs:label ?name .
}
OPTIONAL {
?building soton:buildingOccupants ?occ .
?occ rdfs:label ?occupants .
} .
} ORDER BY ?name
Alternate solutions:
Using rdf:type
Above I've answered your question, but that's not the answer to your problem. The following solution is more semantic, as it actually says 'only give me buildings within the campus', which is what you really mean.
Instead of filtering by graph, which is not very 'semantic', you could restrict ?building to be of class 'building', which research facilities are not. Facilities are still sometimes listed as 'within' a site, usually when the university has only published which campus they are on but not which building.
?building a rooms:Building
Using FILTER
In extreme cases you may not have data in different GRAPHS and there may not be an elegant relationship to use to filter your results. In this case you can use a FILTER and turn the building URI into a string and use a regular expression to match acceptable ones:
FILTER regex(str(?building), "^http://id.southampton.ac.uk/building/")
This is by far the worst option; don't use it unless you have to.
Belt and Braces
You can use any of these restrictions together. A combination of restricting the GRAPH and ensuring that all ?building values really are buildings would be my recommended solution.
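A sketch of that combined query, reusing the 'places' graph IRI and the rooms:Building class from the steps above. Like the other queries in this answer, it assumes the endpoint predefines the org:, rdfs:, spacerel:, skos:, soton: and rooms: prefixes.

```sparql
SELECT DISTINCT ?building_number ?name ?occupants WHERE {
  ?site a org:Site ;
        rdfs:label "Highfield Campus" .
  # Only accept 'within' statements made in the places dataset.
  GRAPH <http://id.southampton.ac.uk/dataset/places/latest> {
    ?building spacerel:within ?site ;
              skos:notation ?building_number ;
              rdfs:label ?name .
  }
  # And insist that the thing really is a building.
  ?building a rooms:Building .
  OPTIONAL {
    ?building soton:buildingOccupants ?occ .
    ?occ rdfs:label ?occupants .
  }
} ORDER BY ?name
```

The type pattern is kept outside the GRAPH block so that it matches regardless of which graph holds the rdf:type statement.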