How to use "participant" vs. "participant of" correctly? - sparql

I am trying to figure out the correct usage of "participant of" (P1344) and "participant" (P710).
As an example, I want the participants of the US Civil War.
The first query:
SELECT ?label WHERE {
wd:Q8676 wdt:P710 ?subj.
?subj rdfs:label ?label.
FILTER((LANG(?label)) = "en")
}
returns the Union and the CSA.
So I tried the "reverse" query with "participant of":
SELECT ?label WHERE {
?subj wdt:P1344 wd:Q8676.
?subj rdfs:label ?label.
FILTER((LANG(?label)) = "en")
}
This gives me a list of 9 names and the CSA, but not the Union.
So I am a bit confused: 1) why are several people listed, especially since I don't know any of them (my guess is that those are unaffiliated ones); 2) why is the Union missing; and 3) what does the correct query for "participant of" look like?

Inverse properties aren't kept in sync, so you can find a lot of statements going in one direction without the inverse statement. Keeping inverse statements in sync has been discussed here and there, but never done, and maybe that is for the better, as it would be a big mess: in your example, if every participant of the American Civil War or of WWII were added to those pages, we would potentially get very, very, VERY overloaded pages. Using the property conflict (P607), I found:
2536 known participants in the American Civil War
31698 known participants in WWII
(btw, see how you can use SERVICE wikibase:label to find labels instead of using filters)
So there seems to be a convention to link from the "small entity" to the "big entity", and to keep properties such as participant (P710) for entities that are especially notable relative to the subject: the Union and the CSA, rather than every single known general and soldier.
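As a minimal sketch of both points, assuming the queries are run on the Wikidata Query Service (the PREFIX lines are spelled out only to keep the query self-contained, and ?participant is an arbitrary variable name), the two directions can be combined with a UNION and labelled through SERVICE wikibase:label:

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd: <http://www.bigdata.com/rdf#>

SELECT DISTINCT ?participant ?participantLabel WHERE {
  # Direction 1: the war (Q8676) lists the entity as participant (P710).
  { wd:Q8676 wdt:P710 ?participant . }
  UNION
  # Direction 2: the entity lists the war under "participant of" (P1344).
  { ?participant wdt:P1344 wd:Q8676 . }
  # Fills ?participantLabel with the English label, no FILTER needed.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```

A count like the ones quoted above can be obtained along these lines; the number returned today will probably differ from the figures in this answer:

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

# Count entities that list the American Civil War (Q8676) under conflict (P607).
SELECT (COUNT(DISTINCT ?person) AS ?count) WHERE {
  ?person wdt:P607 wd:Q8676 .
}
```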

Related

DBPedia SPARQL, return certain number of relevant page URIs for entity EXCEPT the URIs where the entity belongs to a set of subclasses of Owl:Thing

I am looking for a SPARQL query to do the following.
For example, I have the word Apple. Apple may refer to the organization Apple_Inc or, per the ontology, to a member of the Species (plant) class. owl:Thing has a subclass called Species, so I want to return the most relevant (maximum-hit) URIs for the keyword Apple that do not belong to the Species subclass. So when all the URIs are returned, http://dbpedia.org/page/Apple should not be one of them, and neither should ANY relevant link that falls under the Species subclass.
By maximum-hit/most relevant I mean the top returned results that match the query, like the MaxHits parameter of the PrefixSearch (i.e. Autocomplete) API.
For example, http://lookup.dbpedia.org/api/search/PrefixSearch?QueryClass=&MaxHits=2&QueryString=berl returns the top 2 URIs that match QueryString=berl.
I'm really struggling to even explain the work I've done so far, because I don't understand the structure or how to formulate a proper query.
With respect to negation in SPARQL, I found a relevant portion of the documentation (linked here), but I do not know how to proceed from there, and I cannot understand why names like ?person are used. I can see that ?person is used to select, well, people's names, but I would like to know how and where to find these keywords, like ?person and ?name, that represent a specific entity.
SELECT ?uri ?label
WHERE {
?uri rdfs:label ?label .
filter(?label="car"@en)
}
I would really appreciate it if someone could link me to the part of the documentation that clearly explains that ?uri is used to select a URI of the form www.dbpedia.org/page/SomeEntity, and what ?person, ?name and ?label represent.
I'm actually so lost. I will go back and start eating the elephant one bite at a time. For now, I'll be very grateful if I get an answer to this.
If you know of any way I can avoid learning and using SPARQL, that would work too! I know Python well enough, so leveraging an API to pull this information is also fine by me. This question was posted by me.
Answer posted by @Stanislav-Kravin:
SELECT DISTINCT ?s
WHERE
{ ?s a owl:Thing .
?s rdfs:label ?label .
FILTER ( LANGMATCHES ( LANG ( ?label ), 'en' ) )
?label bif:contains '"apple"' .
FILTER NOT EXISTS { ?s rdf:type/rdfs:subClassOf* dbo:Species }
}
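On the side question about ?uri, ?person and ?name: these are not keywords. Any name starting with ? is a variable chosen by the query author, and what it matches depends only on its position in the triple pattern. A small sketch with deliberately made-up variable names, which apart from the added LIMIT asks the same thing as the "car" query in the question (with the language tag written as @en):

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# ?anything is in subject position, so it binds to resource URIs;
# ?text is in object position of rdfs:label, so it binds to label literals.
SELECT ?anything ?text
WHERE {
  ?anything rdfs:label ?text .
  FILTER (?text = "car"@en)
}
LIMIT 10
```

Renaming ?anything to ?uri or ?person changes nothing about the results, only the column headers.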

Querying WikiData, difference between p and wdt default prefix

I am new to Wikidata and I can't figure out when I should use the
wdt prefix (http://www.wikidata.org/prop/direct/)
and when I should use the
p prefix (http://www.wikidata.org/prop/)
in my SPARQL queries. Can someone explain what each of these means and what the difference is?
Things in the p: namespace are used to select statements. Things in the wdt: namespace are used to select entities. Entity selection, with wdt:, allows you to simplify or summarize more complex queries involving statement selection.
When you see a p: you are usually going to see a ps: or pq: shortly following. This is because you rarely want a list of statements; you usually want to know something about those statements.
This example is a two-step process showing you all the graffiti in Wikidata:
SELECT ?graffiti ?graffitiLabel
WHERE
{
?graffiti p:P31 ?statement . # P31 statements attached to an entity
?statement ps:P31 wd:Q17514 . # which state something is graffiti
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Two different versions of the P31 property are used here, housed in different namespaces. Each version comes with different expectations about how it will connect to other items. Things in the p: namespace connect entities to statements, and things in the ps: namespace connect statements to values. In the example, p:P31 is used to select statements about an entity. The entity will be graffiti, but we do not specify that until the next line, where ps:P31 is used to select the values (objects) of the statements, specifying that those values should be graffiti.
So, that's kind of complicated! The wdt: namespace is supposed to make this kind of query simpler. The example could be rewritten as:
SELECT ?graffiti ?graffitiLabel
WHERE
{
?graffiti wdt:P31 wd:Q17514 . # entities that are graffiti
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
This is now one line shorter because we are no longer looking for statements about graffiti, but for graffiti itself. The dual p: and ps: linkages are summarized with a wdt: version of the same P31 property. However, be aware:
wdt: only returns "truthy" statements (the "t" in wdt: stands for "truthy"): the best-ranked, non-deprecated statements for a property, without their qualifiers or references.
Because of that, wdt: sometimes misses facts. In my experience a p: and ps: query will often return a few more results than a wdt: query, since it also matches statements that are not best-ranked.
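One way to see where such extra results can come from, sketched under the assumption that the endpoint follows the standard Wikidata RDF mapping (where best-ranked statements carry the type wikibase:BestRank): restricting the statement query to best-rank statements should give the same set as the wdt: version.

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd: <http://www.bigdata.com/rdf#>

SELECT ?graffiti ?graffitiLabel
WHERE
{
  ?graffiti p:P31 ?statement .          # any P31 statement on the entity
  ?statement ps:P31 wd:Q17514 ;         # whose value is graffiti
             a wikibase:BestRank .      # keep only best-rank ("truthy") statements
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```

Dropping the wikibase:BestRank line brings back statements of any rank, including deprecated ones.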
If you go to the Wikidata item page for Barack Obama at https://www.wikidata.org/wiki/Q76 and scroll down, you see the entry for the "spouse" property P26:
Think of the p: prefix as a way to get to the entire white box on the right side of the image.
In order to get to the information inside the white box, you need to dig deeper.
In order to get to the main part of the information ("Michelle Obama"), you combine the p: prefix with the ps: prefix like this:
SELECT ?spouse WHERE {
wd:Q76 p:P26 ?s .
?s ps:P26 ?spouse .
}
The variable ?s is an abstract statement node (aka the white box).
You can get the same information with only one triple in the body of the query by using wdt::
SELECT ?spouse WHERE {
wd:Q76 wdt:P26 ?spouse .
}
So why would you ever use p:?
You might have noticed that the white box also contains meta information ("start time" and "place of marriage").
In order to get to the meta information, you combine the p: prefix with the pq: prefix.
The following example query returns all the information together with the statement node:
SELECT ?s ?spouse ?time ?place WHERE {
wd:Q76 p:P26 ?s .
?s ps:P26 ?spouse .
?s pq:P580 ?time .
?s pq:P2842 ?place .
}
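Note that a pattern like the one above only returns statements that have both qualifiers. A hedged variant, assuming you also want spouse statements where "start time" (P580) or "place of marriage" (P2842) is missing, wraps each qualifier in OPTIONAL:

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>

SELECT ?s ?spouse ?time ?place WHERE {
  wd:Q76 p:P26 ?s .                  # statement node for "spouse"
  ?s ps:P26 ?spouse .                # main value of the statement
  OPTIONAL { ?s pq:P580 ?time . }    # qualifier "start time", if present
  OPTIONAL { ?s pq:P2842 ?place . }  # qualifier "place of marriage", if present
}
```

Rows for statements without a given qualifier simply leave ?time or ?place unbound.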
They're simply namespace prefixes, basically shortcuts for full URIs. So given wdt:Apples, the full URI is http://www.wikidata.org/prop/direct/Apples, and given p:fruitType the URI is http://www.wikidata.org/prop/fruitType.
Prefixes/namespaces have no other meaning, they are simply ways to define the name of something with URL format. However conventions, such as defining properties in http://www.wikidata.org/prop/, are useful to separate the meanings of terms, so 'direct' is likely a sub-type of property as well (in this case having to do with wikipedia dumps).
For the specifics, you'd need to hope the authors have exposed some naming convention, or be caught in a loop of "was it p:P51 or p:P15 or maybe wdt:P51?". And may luck be with you because the "semantics" of semantic technology have been lost.

Find number of some entity type

What is the SPARQL query that finds the count of some entity type? For example, on the Linked Movie Database, if I want to find the count of actors or films, how can I get it?
I tried this
SELECT (count ( ?Film)){?entity rdf:type ?Film}
but got wrong number.
There's a whole lot missing from this question (e.g., where you ran the query, what you expected as a result, etc.) but I think we can pinpoint the problem even without those details. First, let's rewrite your query using proper syntax (the formatting is optional; the important thing is count(?Film) as ?count):
select (count(?Film) as ?count) {
?entity rdf:type ?Film
}
?Film here is a variable, so you're asking "find me things and their types, and then count how many types were found." If you were trying to count the number of things of some particular film type, though, you probably wanted a query like:
select (count(?entity) as ?numberOfFilms) {
?entity rdf:type :Film .
}
Where :Film is some particular IRI, not a variable. Also note that you can abbreviate rdf:type with a, so you can make this even shorter and fit it nicely on one line again, if you want:
select (count(?entity) as ?numberOfFilms) { ?entity a :Film }
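If you are not sure which IRI the dataset uses for films, a sketch like the following lists every type together with its instance count, so you can pick the right class before counting. It assumes nothing about the dataset beyond rdf:type being used; ?type and ?n are arbitrary variable names:

```sparql
# One row per class, with the number of distinct instances of that class.
SELECT ?type (COUNT(DISTINCT ?entity) AS ?n)
WHERE {
  ?entity a ?type .
}
GROUP BY ?type
ORDER BY DESC(?n)
```

On a large endpoint this scans every type triple, so it may be slow or hit a timeout; adding a LIMIT only trims the output, not the work.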

Filtering results based on specific properties with specific values (cause timeout connection to DBpedia)

I'm trying to make a SPARQL query using Prolog and DBpedia. My objective is to tag all persons in a text, so to retrieve famous people I wrote this query, which removes results such as music groups (Band) and organisations, since I want to tag only real people and not abstract entities:
select ?person where{
{
?person a dbpedia-owl:Person; rdfs:label "Name Surname"@it.
}
UNION
{
?person a dbpedia-owl:Person; foaf:name "Name"@it; foaf:surname "Surname"@it.
}
UNION
{
?person a dbpedia-owl:Person; foaf:name "Name Surname"@it.
}
FILTER NOT EXISTS {
{ ?subject <http://airpedia.org/ontology/type_with_conf#10> dbpedia-owl:Band .
?subject rdfs:label ?artistName .
FILTER ( str(?artistName) = "Name Surname" )
}
UNION
{
?subject <http://airpedia.org/ontology/type_with_conf#10> dbpedia-owl:Organisation .
?subject rdfs:label ?artistName .
FILTER ( str(?artistName) = "Name Surname" )
}
}
}
I use the Italian (it.) version of DBpedia. If you run this query, use that version; otherwise the results will differ from mine.
So, for example, if I search for "Metallica" as a person, I don't want to get any results, because it is a Band or an Organisation (in this case Metallica is typed as an Organisation too).
This works well: these are the results for "Metallica" (Metallica Query Results) and those are for "Michael Jackson" (Michael Jackson Query results).
My problem comes when I search for someone who is not a singer or a music band. For example, if I try "Jim Carrey", I get a "transaction timed out" error.
I think I get this problem because those properties are undefined for Jim Carrey. I tried putting an OPTIONAL marker in each subquery of the first filter, but I get the same error.
I put the code in a pastebin file so you can find all three queries.
I know that I should not use static strings in a query and that there are much better ways, but I need this because I compose the query with Prolog and then send it to the online SPARQL endpoint, so I must do it this way.
To @Joshua: I tried removing the FILTER on the string inside the FILTER NOT EXISTS, but then it does not work anymore. Thanks for helping me anyway.
Excuse me for editing so much; I resolved part of the original problem but haven't found a full solution.
First problem: filtering results based on specific properties with specific values. (Works.)
Second problem: the first part works only for things that have that specific property (as shown above), like Metallica, Michael Jackson, The Beatles, ..., but not for those without the properties used in the filter.
(I can't use more than two links because I'm a newbie, so I will put a pastebin link with the three queries and their results in the comments.)

why is a sports "team" a "person" linked in dbpedia through a strict person-person query?

I have the following query:
SELECT DISTINCT ?person1 ?person2
WHERE {
?person1 ?p ?person2.
?person1 a foaf:Person.
?person2 a foaf:Person.
}
ORDER BY ?person1
LIMIT 1000
OFFSET 0
If you scroll down the results of the query here: http://dbpedia.org/snorql/
You'll see ice hockey teams etc. listed e.g.
:%C3%81g%C3%BAst_Hauksson :Iceland_national_under-21_football_team
Why are these people? How can I remove them?
I also get results like:
:%C3%84ngelholms_FF__Jakob_Augustsson__1 :Jakob_Augustsson
:%C3%84ngelholms_FF__Joakim_Alriksson__1 :Joakim_Alriksson
:%C3%84ngelholms_FF__Johan_Eiswohld__1 :Johan_Eiswohld
These just reference the same person. Is there a way to remove these sorts of self-references in the original query?
The resource http://dbpedia.org/resource/Iceland_national_under-21_football_team is typed a foaf:Person (and dbpedia-owl:Person etc.), which is why it shows up in the result set.
Looking at the statements, I see that this resource is also a dbpedia-owl:SportsTeamMember, which is a subclass of dbpedia-owl:Person, which is an owl:equivalentClass of foaf:Person. This shows how the sports team was inferred to be a person.
The information in DBpedia is extracted from Wikipedia using templates, as described here. In general, mapping templates map the information in Wikipedia infoboxes and other templates to DBpedia resource properties. An article having a certain infobox (or, for other mappings, a 'normal' template) is then said to be of a specific RDF class.
For example, the Infobox football club mapping template creates resources of type dbpedia-owl:SoccerClub from articles that have this infobox. (This doesn't apply to the Iceland team, though.)
It looks like the mapping Football squad player may have been responsible for the assertion that the Iceland team is typed a person. The template is used to list the team players, but the version of the Wikipedia page used to create the DBpedia resource has typos that could have broken the process. I'm not completely sure, but it may explain why not all national football teams are typed foaf:Person.
You can't remove specific statements from DBpedia, but you can correct errors in the source Wikipedia article, or correct, update or create mappings for DBpedia.
To remove self references, you can add a FILTER statement to your WHERE clause like this:
WHERE {
?person1 ?p ?person2.
?person1 a foaf:Person.
?person2 a foaf:Person.
FILTER (?person1 != ?person2).
}
If you are looking for specific types of relations between pairs of foaf:Persons, you can of course specify them:
WHERE {
...
?person1 foaf:knows ?person2.
...
}
Edit 2: I later realised you were asking for a different type of self-reference. From DBPedia: What's the meaning of '__1' (double underscores) in URIs? I understand these are URIs for intermediate nodes, created to avoid having to use blank nodes. For example, :%C3%84ngelholms_FF__Jakob_Augustsson__1 is the (prefixed) URI for Jakob Augustsson within (the description of) :%C3%84ngelholms_FF. For the football example, you could add FILTER (?p != dbpedia-owl:currentMember) to exclude these results.
Edit 1: added a few hyperlinks.
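Putting the two filters together, a sketch of the full query could look like the following. The last filter is an assumption that is not confirmed above: it only helps if the team resources also carry the type dbpedia-owl:SportsTeam, so treat it as a workaround to try rather than a guaranteed fix.

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>

SELECT DISTINCT ?person1 ?person2
WHERE {
  ?person1 ?p ?person2 .
  ?person1 a foaf:Person .
  ?person2 a foaf:Person .
  # Drop pairs where both sides are the same resource.
  FILTER (?person1 != ?person2)
  # Drop the intermediate "__1" membership nodes linked via currentMember.
  FILTER (?p != dbpedia-owl:currentMember)
  # Assumption: mistyped teams are also typed as dbpedia-owl:SportsTeam.
  FILTER NOT EXISTS { ?person1 a dbpedia-owl:SportsTeam }
  FILTER NOT EXISTS { ?person2 a dbpedia-owl:SportsTeam }
}
ORDER BY ?person1
LIMIT 1000
```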