How to get all properties of a instance of a given class - sparql

Just started studying sparql and I'm not getting a single result back, no matter what syntax I try.
I've been told to, as the title say, get all properties of instances of a class inside http://dbpedia.org/ontology/
I've been trying several examples seen on this page and other tutorials, but either I get an error, for bad syntax most of the time, or I get results that do not show any relation with the class I'm trying to see its instances.
Example of queries I've tried:
SELECT *
WHERE {
?ind rdf:type ?type .
OPTIONAL { ?type rdfs:subClassOf ?Politician }
}
LIMIT 50
If we take this accepted answer in another thread:
SELECT ?entity
WHERE {
?entity rdf:type ?type.
?type rdfs:subClassOf* :C.
}
I fail to see what changes should I made to specify a particular class. Like, that doesn't work for me, it throws an error (Virtuoso 37000 Error SP030: SPARQL compiler, line 6: Undefined namespace prefix at '' before '.' ), but it's an accepted answer so what do I know.
I got somewhere closer with this adaptation of the previous query
SELECT ?entity
WHERE {
?entity rdf:type ?type.
}
But, in this case, the problem is I get a whole lot of results that either don't make sense or I can't checker whether or not they are of instances of a certain class.
Same case with this one:
SELECT ?s ?politician WHERE { ?s a ?politician }
I changed 1 single word and I'm getting so many results I don't understand where they come that it seems difficult to make such a simple thing.

Related

DBPedia SPARQL, return certain number of relevant page URIs for entity EXCEPT the URIs where the entity belongs to a set of subclasses of Owl:Thing

Looking for SPARQL query to do the following:
For example, I have the word Apple. Apple may refer to the organization Apple_Inc or the Species of Plants class as per the ontology. Owl: Thing has a subclass called Species, so I want to return those most relevant/maximum-hit URIs where the keyword Apple does not belong to the Species subclass. So when you return all the URIs, http://dbpedia.org/page/Apple should not be one of them, neither must ANY relevant link that comes under Species subclass.
By maximum-hit/most relevant I mean the top returned results that match the query! Like when you access the PrefixSearch (i.e. Autocomplete) API, it has the parameter called MaxHits.
For example http://lookup.dbpedia.org/api/search/PrefixSearch?QueryClass=&MaxHits=2&QueryString=berl is a link where you want to return the top 2 URIs that match the QueryString=berl.
Like I'm actually really struggling to even explain the work I've done so far because I'm not able to understand the structure and how to formulate a proper query..
with respect to negation in SPARQL, I found a relevant portion of the documentation in the link here.. But I do not know how and where to proceed from there, and cannot understand why keywords like ?person are used.. I can understand the person is used to selected well.. PEOPLE names, but I would like to know how and where to find these keywords like ?person, ?name to represent a specific entity..
SELECT ?uri ?label
WHERE {
?uri rdfs:label ?label .
filter(?label="car"#en)
}
I would really appreciate if someone could link me the part of the documentation I can clearly read and understand that ?uri is used to select a URI in the form www.dbpedia.org'/page/SomeEntity and what these ?person, ?name, ?label represent.
I'm actually so lost.. I will go up and start eating one elephant at a time. For now, I'll be very grateful if I get an answer to this.
If there is anyway you know where I can avoid learning and using SPARQL, that would work too! I know Python well enough, so leveraging an API to pull this information is also fine by me. This question was posted by me.
Answer posted by #Stanislav-Kravin --
SELECT DISTINCT ?s
WHERE
{ ?s a owl:Thing .
?s rdfs:label ?label .
FILTER ( LANGMATCHES ( LANG ( ?label ), 'en' ) )
?label bif:contains '"apple"' .
FILTER NOT EXISTS { ?s rdf:type/rdfs:subClassOf* dbo:Species }
}

Querying WikiData, difference between p and wdt default prefix

I am new to wikidata and I can't figure out when I should use -->
wdt prefix (http://www.wikidata.org/prop/direct/)
and when I should use -->
p prefix (http://www.wikidata.org/prop/).
in my sparql queries. Can someone explain what each of these mean and what is the difference?
Things in the p: namespace are used to select statements. Things in the wdt: namespace are used to select entites. Entity selection, with wdt:, allows you to simplify or summarize more complex queries involving statement selection.
When you see a p: you are usually going to see a ps: or pq: shortly following. This is because you rarely want a list of statements; you usually want to know something about those statements.
This example is a two-step process showing you all the graffiti in Wikidata:
SELECT ?graffiti ?graffitiLabel
WHERE
{
?graffiti p:P31 ?statement . # entities that are statements
?statement ps:P31 wd:Q17514 . # which state something is graffiti
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Two different versions of the P31 property are used here, housed in different namespaces. Each version comes with different expectations about how it will connect to other items. Things in the p: namespace connect entities to statements, and things in the ps: namespace connect statements to values. In the example, p:P31 is used to select statements about an entity. The entity will be graffiti, but we do not specify that until the next line, where ps:P31 is used to select the values (subjects) of the statements, specifying that those values should be graffiti.
So, that's kind of complicated! The wdt: namespace is supposed to make this kind of query simper. The example could be rewritten as:
SELECT ?graffiti ?graffitiLabel
WHERE
{
?graffiti wdt:P31 wd:Q17514 . # entities that are graffiti
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
This is now one line shorter because we are no longer looking for statements about graffiti, but for graffiti itself. The dual p: and ps: linkages are summarized with a wdt: version of the same P31 property. However, be aware:
This technique only works for statements that are true or false in nature, like, is a thing graffiti or not. (The "t" in wdt: stands for "truthy").
Information available to wdt: is just missing some facts, sometimes. Often in my experience a p: and ps: query will return a few more results than a wdt: query.
If you go to the Wikidata item page for Barack Obama at https://www.wikidata.org/wiki/Q76 and scroll down, you see the entry for the "spouse" property P26:
Think of the p: prefix as a way to get to the entire white box on the right side of the image.
In order to get to the information inside the white box, you need to dig deeper.
In order to get to the main part of the information ("Michelle Obama"), you combine the p: prefix with the ps: prefix like this:
SELECT ?spouse WHERE {
wd:Q76 p:P26 ?s .
?s ps:P26 ?spouse .
}
The variable ?s is an abstract statement node (aka the white box).
You can get the same information with only one triple in the body of the query by using wdt::
SELECT ?spouse WHERE {
wd:Q76 wdt:P26 ?spouse .
}
So why would you ever use p:?
You might have noticed that the white box also contains meta information ("start time" and "place of marriage").
In order to get to the meta information, you combine the p: prefix with the pq: prefix.
The following example query returns all the information together with the statement node:
SELECT ?s ?spouse ?time ?place WHERE {
wd:Q76 p:P26 ?s .
?s ps:P26 ?spouse .
?s pq:P580 ?time .
?s pq:P2842 ?place .
}
They're simply XML namespace prefixes, basically a shortcut for full URIs. So given wdt:Apples, the full URI is http://www.wikidata.org/prop/direct/Apples and given p:fruitType the URI is http://www.wikidata.org/prop/fruitType.
Prefixes/namespaces have no other meaning, they are simply ways to define the name of something with URL format. However conventions, such as defining properties in http://www.wikidata.org/prop/, are useful to separate the meanings of terms, so 'direct' is likely a sub-type of property as well (in this case having to do with wikipedia dumps).
For the specifics, you'd need to hope the authors have exposed some naming convention, or be caught in a loop of "was it p:P51 or p:P15 or maybe wdt:P51?". And may luck be with you because the "semantics" of semantic technology have been lost.

How to retrieve abstract for a DBpedia resource?

I need to find all DBpedia categories and articles that their abstract include a specific word.
I know how to write a SPARQL query that queries the label like the following:
SELECT ?uri ?txt WHERE {
?uri rdfs:label ?txt .
?txt bif:contains "Machine" .
}
but I have not figured out yet how to search the abstract.
I've tried with the following but it seems not to be correct.
SELECT ?uri ?txt WHERE {
?uri owl:abstract ?txt .
?txt bif:contains "Machine" .
}
How can I retrieve the abstract in order to query its text?
Since you already know how to search a string for text content, this question is really about how to get the abstract. If you retrieve any DBpedia resource in a web browser, e.g., http://dbpedia.org/resource/Mount_Monadnock (which will redirect to http://dbpedia.org/page/Mount_Monadnock), you can see the triples of which it's a subject or predicate. In this case, you'll see that the property is dbpedia-owl:abstract. Thus you can do things like
select * where {
?s dbpedia-owl:abstract ?abstract .
?abstract bif:contains "Monadnock" .
filter langMatches(lang(?abstract),"en")
}
limit 10
SPARQL results
Instead of visiting the page for the resource, which not endpoints will support, you could have simply retrieved all the triples for the subject, and looked at which ones relate it to its abstract. Since you know the abstract is a literal, you could even restrict it to triples where the object is a literal, and perhaps with a language that you want. E.g.,
select ?p ?o where {
dbpedia:Mount_Monadnock ?p ?o .
filter ( isLiteral(?o) && langMatches(lang(?o),'en') )
}
SPARQL results
This also clearly shows that the property you want is http://dbpedia.org/ontology/abstract. When you have a live query interface that you can use to pull down arbitrary data, it's very easy to find out what parts of the data you want. Just pull down more than you want at first, and then refine to get just what you want.

Filtering results based on specific properties with specific values (cause timeout connection to DBpedia)

I'm trying to make a SPARQL query using Prolog and DBpedia. My objective is to tag in text all Persons, so for retrieving famous people I made this query that remove all results like Music groups(Band) and Organization, since I want to tag only real people and not abstract
select ?person where{
{
?person a dbpedia-owl:Person; rdfs:label "Name Surname" #it.
}
UNION
{
?person a dbpedia-owl:Person; foaf:name "Name"#it; foaf:surname "Surname"#it.
}
UNION
{
?person a dbpedia-owl:Person; foaf:name "Name Surname"#it.
}
FILTER NOT EXISTS {
{ ?subject <http://airpedia.org/ontology/type_with_conf#10> dbpedia-owl:Band .
?subject rdfs:label ?artistName .
FILTER ( str(?artistName) = "Name Surname" )
}
UNION
{
?subject <http://airpedia.org/ontology/type_with_conf#10> dbpedia-owl:Organisation .
?subject rdfs:label ?artistName .
FILTER ( str(?artistName) = "Name Surname" )
}
}
}
I use It. version of Dbpedia if you run this query use this version although the results will not be good for me.
So for example if I search "Metallica" as a person i don't want to get results cause is it a Band or(for me, but in this case is Metallica are an Organisation too) an Organisation
and it works good this are the results Metallica Query Results and those are for "Michael Jackson" Michael Jackson Query results
My problem is when i put someone that is not a Singer or a Music band for example if i try something like "Jim Carrey" i get " error transction timed out Jim Carrey.
I think I got this problem because those properties are Undefined for Jim Carrey, but i tried an to put an OPTIONAL marker in each subquery in the first filter, but i get too the same error
I put the code in a pastebin file so you can find all three query
I know that i should not use Static String in a query or there are a lot of better mode but i need that since i compose the query with prolog and than send to sparql online so i must do in this way.
TO #Joshua I tried to remove the FILTER(String) in the NOT EXIST (Filter) But I will not work anymore thanks however for helping me
Excuse me for too much editing but i resolved some part of the starting problem but didn't find a solution
First problem :Filtering results based on specific properties with specific values. (Works)
Second : The first works only for Things with that specific property (as show above) like(Metallica,Michael Jackson, The Beatles, ...) but not for thos without the properties in the filter.
(i can't use more than two link because I'm a newbe so i will put a link in the comments with a pastebin links with the 3 Query and the results of they)

Limit a SPARQL query to one dataset

I'm working with the following SPARQL query, which is an example on the web-based end of my institution's SPARQL endpoint;
SELECT ?building_number ?name ?occupants WHERE {
?site a org:Site ;
rdfs:label "Highfield Campus" .
?building spacerel:within ?site ;
skos:notation ?building_number ;
rdfs:label ?name .
OPTIONAL {
?building soton:buildingOccupants ?occ .
?occ rdfs:label ?occupants .
} .
} ORDER BY ?name
The problem is that as well as getting data from 'Buildings and Places', the Dataset I'm interested in, and would expect the example to use, it also gets data from the 'Facilities and Equipment' dataset, which isn't relevant. You should see this if you follow the link.
I suspect the example may pre-date the addition of the Facilities and Equipment dataset, but even with the research I've done into SPARQL, I can't see a clear way to define which datasets to include.
Can anyone recommend a starting point to limit it to just show 'Buildings', or, more specifically, results from the 'Buildings and Places' dataset.
Thanks
First things first, you really need to use SELECT DISTINCT, as otherwise you'll get repeated results.
To answer your question, you can use GRAPH { ... } to filter certain parts of a SPARQL query to only match data from a specific dataset. This only works if the SPARQL endpoint is divided up into GRAPHs (this one is). The solution you asked for isn't the best choice, as it assumes that things within sites in the 'places' dataset will always be resticted to buildings... That's risky -- as it might end up containing trees and signposts at some time in the future.
Step one is to just find out what graphs are in play:
SELECT DISTINCT ?g1 ?building_number ?name ?occupants WHERE {
?site a org:Site ;
rdfs:label "Highfield Campus" .
GRAPH ?g1 { ?building spacerel:within ?site ;
skos:notation ?building_number ;
rdfs:label ?name .
}
OPTIONAL {
?building soton:buildingOccupants ?occ .
?occ rdfs:label ?occupants .
} .
} ORDER BY ?name
Try it here: http://is.gd/WdRAGX
From this you can see that http://id.southampton.ac.uk/dataset/places/latest and http://id.southampton.ac.uk/dataset/places/facilities are the two relevant ones.
To only look for things 'within' a site according to the "places" graph, use:
SELECT DISTINCT ?building_number ?name ?occupants WHERE {
?site a org:Site ;
rdfs:label "Highfield Campus" .
GRAPH <http://id.southampton.ac.uk/dataset/places/latest> {
?building spacerel:within ?site ;
skos:notation ?building_number ;
rdfs:label ?name .
}
OPTIONAL {
?building soton:buildingOccupants ?occ .
?occ rdfs:label ?occupants .
} .
} ORDER BY ?name
Alternate solutions:
Using rdf:type
Above I've answered your question, but it's not the answer to your problem. This solution is more semantic as it actually says 'only give me buildings within the campus' which is what you really mean.
Instead of filtering by graph, which is not very 'semantic' you could also restrict ?building to be of class 'building' which research facilities are not. They are still sometimes listed as 'within' a site. Usually when the uni has only published what campus they are on but not which building.
?building a rooms:Building
Using FILTER
In extreme cases you may not have data in different GRAPHS and there may not be an elegant relationship to use to filter your results. In this case you can use a FILTER and turn the building URI into a string and use a regular expression to match acceptable ones:
FILTER regex(str(?building), "^http://id.southampton.ac.uk/building/")
This is bar far the worst option and don't use it if you have to.
Belt and Braces
You can use any of these restictions together and a combination of restricting the GRAPH plus ensuring that all ?buildings really are buildings would be my recommended solution.