A SPARQL query like:
SELECT distinct * where {
?x dc:title ?title .
}
is always returning ?title with a language tag. How can I obtain the plain string of an RDF language-tagged literal, e.g. return "English"@en as just "English"?
I guess you want to show results in one language only. If that is the case, you can strip the language tag with:
SELECT distinct ?stripped_title where {
?x dc:title ?title .
BIND (STR(?title) AS ?stripped_title)
}
but it would be meaningful only after you filter your results for the desired language, e.g.
FILTER ( LANG(?title) = "en" )
Otherwise the results can be confusing: for example, you might get seemingly duplicate answers when the title just happens to be identical in two different languages.
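Putting the two pieces together, a minimal sketch of the complete query could look like the following. The dc: prefix is assumed to be the Dublin Core elements namespace, and "en" is just an example language.

```sparql
PREFIX dc: <http://purl.org/dc/elements/1.1/>

SELECT DISTINCT ?stripped_title WHERE {
  ?x dc:title ?title .
  # keep only titles tagged exactly "en"
  FILTER ( LANG(?title) = "en" )
  # STR() returns the lexical form, i.e. the literal without its language tag
  BIND ( STR(?title) AS ?stripped_title )
}
```

If regional variants such as "en-GB" should match as well, langMatches(LANG(?title), "en") can be used in the FILTER instead of the equality test.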
The following query
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX edm: <http://www.europeana.eu/schemas/edm/>
PREFIX ore: <http://www.openarchives.org/ore/terms/>
SELECT *
{
SERVICE <http://sparql.europeana.eu> {
?CHO ore:proxyIn ?proxy;
dc:title ?title ;
dc:creator ?creator ;
dc:date ?date .
FILTER REGEX(str(?creator),"corelli","i").
?proxy edm:isShownBy ?mediaURL .
}
}
retrieves the titles of some works by "corelli" from the Europeana knowledge base. But when I try to bind a variable ?surname to the string "corelli", it returns nothing.
E.g.
SELECT *
{
values ?surname { "corelli" }
SERVICE <http://sparql.europeana.eu> {
?CHO ore:proxyIn ?proxy;
dc:title ?title ;
dc:creator ?creator ;
dc:date ?date .
FILTER REGEX(str(?creator),?surname,"i").
?proxy edm:isShownBy ?mediaURL .
}
}
The FILTER REGEX expression does not seem to work there (I am using a Blazegraph instance to run this query). It does not work with str(?surname) in place of ?surname either.
Why is this so?
Does anyone know how to set ?surname to some string values (which I might want to gather from another endpoint) and still have the query find data?
Most likely you don't see results because the query times out. It's too heavy for the public endpoint you're using. Actually, both queries are too heavy, and if you make a few attempts with different string values you might notice that sometimes the first query times out too, and sometimes the second query returns some results.
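Separately from the timeout issue, one thing that may be worth trying is to put the VALUES block inside the SERVICE group, so that ?surname is bound in the same group as the FILTER that uses it when the remote endpoint evaluates the pattern. This is only a sketch: it assumes the remote endpoint accepts VALUES (SPARQL 1.1), and it does not make the query any lighter, so it can still time out.

```sparql
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX edm: <http://www.europeana.eu/schemas/edm/>
PREFIX ore: <http://www.openarchives.org/ore/terms/>

SELECT *
{
  SERVICE <http://sparql.europeana.eu> {
    # the binding is declared in the same group as the FILTER below
    VALUES ?surname { "corelli" }
    ?CHO ore:proxyIn ?proxy ;
         dc:title ?title ;
         dc:creator ?creator ;
         dc:date ?date .
    FILTER REGEX(STR(?creator), ?surname, "i")
    ?proxy edm:isShownBy ?mediaURL .
  }
}
```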
I am new to SPARQL and am trying to fetch the movie adapted from a specific book from DBpedia. This is what I have so far:
PREFIX onto: <http://dbpedia.org/ontology/>
SELECT *
WHERE
{
<http://dbpedia.org/page/2001:_A_Space_Odyssey> a ?type.
?type onto:basedOn ?book .
?book a onto:Book
}
I can't get any results. How can I do that?
When using any web resource, in your case the property :basedOn, you need to make sure that you have declared the right prefix. If you are querying the DBpedia SPARQL endpoint, you can use dbo:basedOn directly, even without declaring the prefix, as it is among the predefined ones. Alternatively, if you want to use your own prefix, or if you are using another SPARQL client, make sure that whatever short name you choose for this property, you declare it as a prefix for http://dbpedia.org/ontology/.
Next, to get more results, you may prefer not to restrict the type of the subject of this triple pattern, as there could be movies that are not actually typed as such. So, a query like this
select distinct *
{
?movie dbo:basedOn ?book .
?book a dbo:Book .
}
will give you lots of good results, but not all. For example, the resource from your example will be missing. You can easily check which properties link these two resources with a query like this:
select ?p
{
{<http://dbpedia.org/resource/2001:_A_Space_Odyssey_(film)> ?p <http://dbpedia.org/resource/2001:_A_Space_Odyssey> }
UNION
{ <http://dbpedia.org/resource/2001:_A_Space_Odyssey> ?p <http://dbpedia.org/resource/2001:_A_Space_Odyssey_(film)>}
}
You'll get only one result:
http://www.w3.org/2000/01/rdf-schema#seeAlso
(note that the URI is with 'resource', not with 'page')
Then you can search for any path between the two resources, using the method described here, or find a combination of other patterns that would increase the number of results.
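As a rough illustration of what such a search can look like (not necessarily the method referred to above), the following sketch looks for paths of length two from the film to the book through one intermediate node. It only follows properties in the forward direction; other directions or longer paths would need additional patterns, and the LIMIT is an arbitrary safeguard.

```sparql
SELECT DISTINCT ?p1 ?mid ?p2
{
  # film --?p1--> ?mid --?p2--> book
  <http://dbpedia.org/resource/2001:_A_Space_Odyssey_(film)> ?p1 ?mid .
  ?mid ?p2 <http://dbpedia.org/resource/2001:_A_Space_Odyssey> .
}
LIMIT 100
```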
I have two graphs: in one the literals are language-tagged (@de), in the other they are untagged. I need a join between the two. The trivial solution with a filter runs very slowly.
WHERE {
?tok nlp:lemma ?lem .
?tok2 wn:form ?t .
filter (?tok2 = ?t) .
...
An improved version which works with fuseki is
WHERE {
?tok nlp:lemma ?lem .
Bind (str(?lem) as ?lems) .
?lu :orthForm ?lems .
...
I tried ?lu :xx (str(?lem)) . but this is flagged as an error. Why?
The same happens when using value ?lems {str(?lem)}.
I naively assume that the BIND does not create much overhead, so the above solution is probably OK.
Would the same approach work for searching when the language codes are different (see my previous question)?
The only things allowed in a triple pattern are variables, URIs, literals (in the object position) and bnodes. Hence, instead of the pattern ?lu :xx (str(?lem)), you will need to use a BIND or a projection to convert the variable to a string. Taking the first example:
WHERE {
?lu :xx ?langLem .
BIND(str(?langLem) AS ?lem)
}
Or, using projection:
SELECT (str(?langLem) AS ?lem)
WHERE {
?lu :xx ?langLem .
}
I assume you are trying to use the VALUES statement in value ?lems {str(?lem)}. VALUES is normally used to bind variables to a set of values, e.g.:
VALUES ?lem { :Euclid :Gauss }
?lem rdfs:label ?label .
...binds ?lem to :Euclid and :Gauss and executes the query, returning the union of results. I.e. it is the same as:
{ :Euclid rdfs:label ?label }
UNION
{ :Gauss rdfs:label ?label }
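VALUES accepts only constant terms (or UNDEF), so an expression such as str(?lem) cannot appear inside it. For the opposite direction of the join, or when the language codes differ, one option is to rebuild a language-tagged literal with STRLANG. The sketch below assumes the predicate names from the question; the namespace IRIs are placeholders, since the question does not give them, and "de" is the assumed target tag.

```sparql
# placeholder namespaces, not the real vocabularies
PREFIX nlp: <http://example.org/nlp#>
PREFIX wn:  <http://example.org/wn#>

SELECT ?tok ?lu
WHERE {
  ?lu wn:orthForm ?form .                  # untagged literal
  # STR() drops any existing tag, STRLANG() attaches the wanted one
  BIND ( STRLANG(STR(?form), "de") AS ?lem )
  ?tok nlp:lemma ?lem .                    # tagged literal, e.g. "Haus"@de
}
```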
I need to find all DBpedia categories and articles whose abstracts include a specific word.
I know how to write a SPARQL query that queries the label like the following:
SELECT ?uri ?txt WHERE {
?uri rdfs:label ?txt .
?txt bif:contains "Machine" .
}
but I have not figured out yet how to search the abstract.
I've tried with the following but it seems not to be correct.
SELECT ?uri ?txt WHERE {
?uri owl:abstract ?txt .
?txt bif:contains "Machine" .
}
How can I retrieve the abstract in order to query its text?
Since you already know how to search a string for text content, this question is really about how to get the abstract. If you retrieve any DBpedia resource in a web browser, e.g., http://dbpedia.org/resource/Mount_Monadnock (which will redirect to http://dbpedia.org/page/Mount_Monadnock), you can see the triples of which it's a subject or object. In this case, you'll see that the property is dbpedia-owl:abstract. Thus you can do things like
select * where {
?s dbpedia-owl:abstract ?abstract .
?abstract bif:contains "Monadnock" .
filter langMatches(lang(?abstract),"en")
}
limit 10
Instead of visiting the page for the resource, which not all endpoints will support, you could have simply retrieved all the triples for the subject and looked at which ones relate it to its abstract. Since you know the abstract is a literal, you could even restrict the query to triples where the object is a literal, and perhaps to a language that you want. E.g.,
select ?p ?o where {
dbpedia:Mount_Monadnock ?p ?o .
filter ( isLiteral(?o) && langMatches(lang(?o),'en') )
}
This also clearly shows that the property you want is http://dbpedia.org/ontology/abstract. When you have a live query interface that you can use to pull down arbitrary data, it's very easy to find out what parts of the data you want. Just pull down more than you want at first, and then refine to get just what you want.
I'm working with the following SPARQL query, which is an example on the web-based front end of my institution's SPARQL endpoint:
SELECT ?building_number ?name ?occupants WHERE {
?site a org:Site ;
rdfs:label "Highfield Campus" .
?building spacerel:within ?site ;
skos:notation ?building_number ;
rdfs:label ?name .
OPTIONAL {
?building soton:buildingOccupants ?occ .
?occ rdfs:label ?occupants .
} .
} ORDER BY ?name
The problem is that, as well as getting data from 'Buildings and Places' (the dataset I'm interested in, and the one I would expect the example to use), it also gets data from the 'Facilities and Equipment' dataset, which isn't relevant. You should see this if you follow the link.
I suspect the example may pre-date the addition of the Facilities and Equipment dataset, but even with the research I've done into SPARQL, I can't see a clear way to define which datasets to include.
Can anyone recommend a starting point to limit it to just show 'Buildings', or, more specifically, results from the 'Buildings and Places' dataset.
Thanks
First things first, you really need to use SELECT DISTINCT, as otherwise you'll get repeated results.
To answer your question, you can use GRAPH { ... } to restrict certain parts of a SPARQL query to only match data from a specific dataset. This only works if the data behind the SPARQL endpoint is divided up into named graphs (this one is). The solution you asked for isn't the best choice, as it assumes that things within sites in the 'places' dataset will always be restricted to buildings. That's risky, as the dataset might end up containing trees and signposts at some time in the future.
Step one is to just find out what graphs are in play:
SELECT DISTINCT ?g1 ?building_number ?name ?occupants WHERE {
?site a org:Site ;
rdfs:label "Highfield Campus" .
GRAPH ?g1 { ?building spacerel:within ?site ;
skos:notation ?building_number ;
rdfs:label ?name .
}
OPTIONAL {
?building soton:buildingOccupants ?occ .
?occ rdfs:label ?occupants .
} .
} ORDER BY ?name
Try it here: http://is.gd/WdRAGX
From this you can see that http://id.southampton.ac.uk/dataset/places/latest and http://id.southampton.ac.uk/dataset/places/facilities are the two relevant ones.
To only look for things 'within' a site according to the "places" graph, use:
SELECT DISTINCT ?building_number ?name ?occupants WHERE {
?site a org:Site ;
rdfs:label "Highfield Campus" .
GRAPH <http://id.southampton.ac.uk/dataset/places/latest> {
?building spacerel:within ?site ;
skos:notation ?building_number ;
rdfs:label ?name .
}
OPTIONAL {
?building soton:buildingOccupants ?occ .
?occ rdfs:label ?occupants .
} .
} ORDER BY ?name
Alternate solutions:
Using rdf:type
Above I've answered your question, but it's not the answer to your problem. The following solution is more semantic, as it actually says 'only give me buildings within the campus', which is what you really mean.
Instead of filtering by graph, which is not very 'semantic', you could also restrict ?building to be of class 'building', which research facilities are not. Facilities are still sometimes listed as 'within' a site, usually when the university has only published which campus they are on but not which building.
?building a rooms:Building
Using FILTER
In extreme cases you may not have data in different GRAPHS and there may not be an elegant relationship to use to filter your results. In this case you can use a FILTER and turn the building URI into a string and use a regular expression to match acceptable ones:
FILTER regex(str(?building), "^http://id.southampton.ac.uk/building/")
This is by far the worst option; don't use it unless you have to.
Belt and Braces
You can use any of these restrictions together, and a combination of restricting the GRAPH plus ensuring that all ?buildings really are buildings would be my recommended solution.
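A sketch of that combination is shown here. As in the queries above, prefix declarations are omitted on the assumption that the endpoint's web form predefines them. The rdf:type pattern is deliberately left outside the GRAPH block, because which graph holds the type triples is not stated above.

```sparql
SELECT DISTINCT ?building_number ?name ?occupants WHERE {
  ?site a org:Site ;
        rdfs:label "Highfield Campus" .
  # only accept 'within' statements made in the places dataset
  GRAPH <http://id.southampton.ac.uk/dataset/places/latest> {
    ?building spacerel:within ?site ;
              skos:notation ?building_number ;
              rdfs:label ?name .
  }
  # and make sure the thing really is a building
  ?building a rooms:Building .
  OPTIONAL {
    ?building soton:buildingOccupants ?occ .
    ?occ rdfs:label ?occupants .
  }
} ORDER BY ?name
```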