List all wikidata properties of a specific entity with SPARQL - sparql

I'm trying to get the list of all properties of a specific Wikidata entity, starting from its id (e.g. Q39659506). I've tried querying Wikidata's API with wbsearchentities and wbgetentities, but this is too slow because of the number of requests and the preprocessing needed to build a pandas DataFrame that lists properties and values like the following:
[Image: example of the DataFrame of Wikidata properties I aim to get]
I've also tried the wikidata library for Python, which iterates over properties, but it is too slow compared to SPARQL. I am therefore asking for suggestions on how to build a query that returns each property's URL and label, and each value's URL (when available) and value, with the unit attached for quantities. So far I've only been able to get the properties whose values are themselves entities, not quantities or dates. The query is below:
# Specifying we get our info from entity DB
PREFIX entity: <http://www.wikidata.org/entity/>
SELECT ?propUrl ?propLabel ?valUrl ?valLabel ?valDesc
WHERE {
# Select the entity
entity:Q39659506 ?propUrl ?valUrl .
# Specify the property
?property ?ref ?propUrl;
rdf:type wikibase:Property;
rdfs:label ?propLabel .
# Gather values label and description
?valUrl rdfs:label ?valLabel;
schema:description ?valDesc.
# Only english language
FILTER (lang(?valLabel) = 'en') .
FILTER (lang(?propLabel) = 'en' )
FILTER (lang(?valDesc) = 'en' )
}
Thank you so much!
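One possible direction, offered as a sketch rather than a tested answer: the query above requires every value to have an rdfs:label and a schema:description, which literal values such as dates and quantities never have, so those rows are dropped. Making the label pattern OPTIONAL keeps them. The Python helper below builds such a query and flattens a SPARQL JSON response into rows that can be passed to pandas.DataFrame. The function names and the sample response are made up for illustration, the wd:, wikibase: and rdfs: prefixes are assumed to be predefined by the endpoint, and quantity units are not covered, since they live on the full statement value nodes rather than on the direct values.

```python
import re
from typing import Dict, List


def build_properties_query(entity_id: str, lang: str = "en") -> str:
    """Build a query listing the direct claims of one Wikidata item (hypothetical helper)."""
    if not re.fullmatch(r"Q\d+", entity_id):
        raise ValueError(f"not a Wikidata item id: {entity_id!r}")
    if not re.fullmatch(r"[a-z-]+", lang):
        raise ValueError(f"unexpected language code: {lang!r}")
    return f"""
SELECT ?prop ?propLabel ?val ?valLabel WHERE {{
  wd:{entity_id} ?propUrl ?val .
  # Link the direct predicate back to its property entity
  ?prop wikibase:directClaim ?propUrl ;
        rdfs:label ?propLabel .
  FILTER (lang(?propLabel) = '{lang}')
  # Literals (dates, quantities, strings) have no label, so this part is optional
  OPTIONAL {{
    ?val rdfs:label ?valLabel .
    FILTER (lang(?valLabel) = '{lang}')
  }}
}}
"""


def flatten_bindings(response: dict) -> List[Dict[str, str]]:
    """Turn a SPARQL JSON response into one flat dict per row; unbound variables become ''."""
    variables = response["head"]["vars"]
    return [
        {var: binding.get(var, {}).get("value", "") for var in variables}
        for binding in response["results"]["bindings"]
    ]


# Hand-written sample in the SPARQL JSON results format (not real query output)
SAMPLE = {
    "head": {"vars": ["prop", "propLabel", "val", "valLabel"]},
    "results": {"bindings": [
        {
            "prop": {"type": "uri", "value": "http://www.wikidata.org/entity/P31"},
            "propLabel": {"type": "literal", "xml:lang": "en", "value": "instance of"},
            "val": {"type": "uri", "value": "http://www.wikidata.org/entity/Q5"},
            "valLabel": {"type": "literal", "xml:lang": "en", "value": "human"},
        },
        {
            "prop": {"type": "uri", "value": "http://www.wikidata.org/entity/P569"},
            "propLabel": {"type": "literal", "xml:lang": "en", "value": "date of birth"},
            "val": {"type": "literal", "value": "1970-01-01T00:00:00Z"},
        },
    ]},
}

if __name__ == "__main__":
    print(build_properties_query("Q39659506"))
    for row in flatten_bindings(SAMPLE):
        print(row)
```

The list returned by flatten_bindings can be handed directly to pandas.DataFrame, which avoids one API request per property.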

Related

Sparql query to read from all named graphs without knowing the names

I am looking to run a SPARQL query over any dataset. We don't know the names of the named graphs in the datasets.
There is lots of documentation and many examples of selecting from named graphs when you know their names. There are also examples that show how to list named graphs.
We are running Jena from Java, so it would be possible to run two queries: the first gets the named graphs, and we inject these into the second.
But surely you can write a single query that reads from all named graphs when you don't know their names?
Note: we are looking to stay away from using default graph/s as their behaviour seems implementation dependent.
Example:
{
?s foaf:name ?name ;
vCard:nickname ?nickName .
}
If you want the pattern to match within one graph and wish to try each graph, use the GRAPH ?g form.
GRAPH ?g
{ ?s foaf:name ?name ;
vc:nickname ?nickName .
}
If you want the pattern to match across named graphs (e.g. foaf:name in one graph and vCard:nickname in another, for the same subject), set the union default graph option, tdb2:unionDefaultGraph true. The default graph as seen by the query is then the union (actually the RDF merge, with no duplicates) of all the named graphs. Use the pattern as originally given.
Fuseki configuration file extract:
:dataset_tdb2 rdf:type tdb2:DatasetTDB2 ;
tdb2:location "DB2" ;
## Optional - with union default for query and update WHERE matching.
tdb2:unionDefaultGraph true ;
.
In code, not Fuseki, the application can use Dataset.getUnionModel().
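To make the difference between the two approaches concrete, here is a toy model in plain Python. It does not use Jena; the graph names, the data and the function names are all made up for illustration. It only mimics the matching behaviour: GRAPH ?g requires the whole pattern to match inside one named graph, while the union default graph lets the pattern match across graphs.

```python
from typing import Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical dataset: graph name -> set of (subject, predicate, object) triples
DATASET: Dict[str, Set[Triple]] = {
    "urn:g1": {("ex:alice", "foaf:name", "Alice"), ("ex:alice", "vc:nickname", "Al")},
    "urn:g2": {("ex:bob", "foaf:name", "Bob")},
    "urn:g3": {("ex:bob", "vc:nickname", "Bobby")},
}


def match_name_and_nick(triples: Iterable[Triple]) -> List[Tuple[str, str, str]]:
    """Match { ?s foaf:name ?name ; vc:nickname ?nickName } against one set of triples."""
    triples = list(triples)
    return sorted(
        (s1, name, nick)
        for (s1, p1, name) in triples if p1 == "foaf:name"
        for (s2, p2, nick) in triples if p2 == "vc:nickname" and s1 == s2
    )


def per_graph_matches(dataset: Dict[str, Set[Triple]]) -> List[Tuple[str, str, str, str]]:
    """Like GRAPH ?g { ... }: try each named graph separately."""
    return [(g, *row) for g in sorted(dataset) for row in match_name_and_nick(dataset[g])]


def union_matches(dataset: Dict[str, Set[Triple]]) -> List[Tuple[str, str, str]]:
    """Like the union default graph: match against the merge of all named graphs."""
    return match_name_and_nick(set().union(*dataset.values()))


if __name__ == "__main__":
    print(per_graph_matches(DATASET))  # only ex:alice, whose triples share one graph
    print(union_matches(DATASET))      # ex:alice and ex:bob
```

In this toy data, ex:bob has his name and nickname in different graphs, so he only shows up in the union case.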

Get movie(s) based on book(s) from DBpedia

I am new to SPARQL and trying to fetch a movie adapted from a specific book from DBpedia. This is what I have so far:
PREFIX onto: <http://dbpedia.org/ontology/>
SELECT *
WHERE
{
<http://dbpedia.org/page/2001:_A_Space_Odyssey> a ?type.
?type onto:basedOn ?book .
?book a onto:Book
}
I can't get any results. How can I do that?
When using any web resource, and in your case the property :basedOn, you need to make sure that you have declared the right prefix. If you are querying the DBpedia SPARQL endpoint, you can use dbo:basedOn directly, even without declaring it, as dbo: is among the predefined prefixes. Alternatively, if you want to use your own prefix, or if you are using another SPARQL client, make sure that whatever short name you choose is declared as the prefix for http://dbpedia.org/ontology/.
Then, to get more results, you should not restrict the type of the subject of this triple pattern, as there could be movies that are not actually typed as such. So, a query like this
select distinct *
{
?movie dbo:basedOn ?book .
?book a dbo:Book .
}
will give you lots of good results, but not all. For example, the resource from your example will be missing. You can easily check which properties link these two resources with a query like this:
select ?p
{
{<http://dbpedia.org/resource/2001:_A_Space_Odyssey_(film)> ?p <http://dbpedia.org/resource/2001:_A_Space_Odyssey> }
UNION
{ <http://dbpedia.org/resource/2001:_A_Space_Odyssey> ?p <http://dbpedia.org/resource/2001:_A_Space_Odyssey_(film)>}
}
You'll get only one result:
http://www.w3.org/2000/01/rdf-schema#seeAlso
(note that the URI is with 'resource', not with 'page')
Then you may search for any path between the two resources, using the method described here, or find a combination of other patterns that would increase the number of results.
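Since the 'page' versus 'resource' mix-up is easy to make, a small Python sketch may help. It rewrites a DBpedia page URL into the matching resource URI and builds the two-way property check shown above for any pair of resources. Both function names are hypothetical, and the code only builds the query string; sending it to the endpoint is left out.

```python
PAGE_PREFIX = "http://dbpedia.org/page/"
RESOURCE_PREFIX = "http://dbpedia.org/resource/"


def to_resource_uri(uri: str) -> str:
    """Rewrite an HTML page URL into the resource URI used in the data."""
    if uri.startswith(PAGE_PREFIX):
        return RESOURCE_PREFIX + uri[len(PAGE_PREFIX):]
    return uri


def build_link_query(a: str, b: str) -> str:
    """Build a query listing every property that links a to b or b to a."""
    a, b = to_resource_uri(a), to_resource_uri(b)
    return (
        "SELECT ?p WHERE {\n"
        f"  {{ <{a}> ?p <{b}> }}\n"
        "  UNION\n"
        f"  {{ <{b}> ?p <{a}> }}\n"
        "}"
    )


if __name__ == "__main__":
    print(build_link_query(
        "http://dbpedia.org/page/2001:_A_Space_Odyssey",
        "http://dbpedia.org/resource/2001:_A_Space_Odyssey_(film)",
    ))
```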

SPARQL query to search by name for a person

I need some help in a query.
I have a list of names, and I want to write a Python program that sends a query for each person on the list, looks up their information in DBpedia and Wikidata, and returns some of it.
Can someone help me with this one?
Thanks
The following working Python code takes the (slightly modified) query from @coder and uses it in a small demo program.
It will search for people with "Andreotti" as part of their name, and return one or more URLs each pointing to the page for one person found.
from SPARQLWrapper import SPARQLWrapper, JSON
NAME_FRAGMENT = "Andreotti"
QUERY = f"""
SELECT DISTINCT ?uri
WHERE {{
?uri a foaf:Person.
?uri ?p ?person_full_name.
FILTER(?p IN(dbo:birthName,dbp:birthName ,dbp:fullname,dbp:name)).
?uri rdfs:label ?person_name .
?person_name bif:contains "{NAME_FRAGMENT}" .
FILTER(langMatches(lang(?person_full_name), "en")) .
}}
LIMIT 100
"""
# Specify the DBPedia endpoint
sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery(QUERY)
sparql.setReturnFormat(JSON)
# Run the query
result = sparql.query().convert()
# The returned data contains "bindings" (a list of dictionaries)
for link in result["results"]["bindings"]:
    # We want the "value" attribute of the "uri" field
    print(link["uri"]["value"])
Please note that, as far as I understand, the names are always scraped from the English Wikipedia pages, so you will only find the English rendition (I stand to be corrected if this is not true).
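Since the question asks about a whole list of names, here is a rough sketch of how the demo could be wrapped in a loop. The query is a simplified variant of the one above, and all function names are hypothetical. The network call is passed in as a function (run_query), so the looping logic can be tried without contacting the endpoint. One assumption to check against the Virtuoso documentation: the search text is wrapped in single quotes inside the bif:contains string so that multi-word names are treated as a phrase.

```python
from typing import Callable, Dict, List


def build_person_query(name_fragment: str) -> str:
    """Build a simplified people-search query for one name fragment."""
    if any(ch in name_fragment for ch in "\"'\\"):
        raise ValueError("quotes and backslashes are not supported in this sketch")
    return (
        "SELECT DISTINCT ?uri WHERE {\n"
        "  ?uri a foaf:Person ;\n"
        "       rdfs:label ?person_name .\n"
        f"  ?person_name bif:contains \"'{name_fragment}'\" .\n"
        "}\n"
        "LIMIT 100"
    )


def uris_from_result(result: dict) -> List[str]:
    """Pull the ?uri values out of a SPARQL JSON result."""
    return [binding["uri"]["value"] for binding in result["results"]["bindings"]]


def search_people(names: List[str], run_query: Callable[[str], dict]) -> Dict[str, List[str]]:
    """Run one query per name; run_query sends a query string and returns parsed JSON."""
    return {name: uris_from_result(run_query(build_person_query(name))) for name in names}


if __name__ == "__main__":
    # Stand-in for a real endpoint call, returning a hand-written response
    def fake_run_query(query: str) -> dict:
        return {"results": {"bindings": [
            {"uri": {"type": "uri", "value": "http://dbpedia.org/resource/Giulio_Andreotti"}}
        ]}}

    print(search_people(["Andreotti"], fake_run_query))
```

In a real program, run_query would wrap the SPARQLWrapper calls (setQuery, setReturnFormat, query().convert()) from the demo.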

dbpedia sparql movie query sql

Hi, I'm trying to learn how to query DBpedia using SPARQL. I can't find any website or source that shows me how to do this, and I'm finding it difficult to learn how to use all the properties (like the ones available at http://mappings.dbpedia.org/index.php?title=Special%3AAllPages&from=&to=&namespace=202 ). Is there any good source I can learn from?
So for example if I want to check if the wikipedia page http://en.wikipedia.org/wiki/Inception is a movie (property film) or not, how do I do that?
The wikipedia URL http://en.wikipedia.org/wiki/Inception maps to the dbpedia URI http://dbpedia.org/resource/Inception. Dbpedia has a SPARQL endpoint at: http://dbpedia.org/sparql, which you may use to run queries either programmatically or via the html interface.
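As a small illustration of that mapping, the Python sketch below turns an English Wikipedia article URL into the corresponding DBpedia resource URI. The function name is made up, and the sketch assumes the simple case only: it does not resolve Wikipedia redirects or handle other language editions.

```python
from urllib.parse import urlparse


def wikipedia_to_dbpedia(url: str) -> str:
    """Map an English Wikipedia article URL to a DBpedia resource URI (simple case only)."""
    parsed = urlparse(url)
    if parsed.netloc != "en.wikipedia.org" or not parsed.path.startswith("/wiki/"):
        raise ValueError(f"not an English Wikipedia article URL: {url!r}")
    title = parsed.path[len("/wiki/"):]
    return "http://dbpedia.org/resource/" + title


if __name__ == "__main__":
    print(wikipedia_to_dbpedia("http://en.wikipedia.org/wiki/Inception"))
```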
To check if http://dbpedia.org/page/Inception is a "movie", you have many options. To give you an idea:
If you know the URI of "movie" in dbpedia (it is http://schema.org/Movie), then run an ASK query to check against that type. ASK will return true/false based on whether the pattern in the where clause is valid against the data:
ASK where {
<http://dbpedia.org/resource/Inception> a <http://schema.org/Movie>
}
If you don't know the URI of "movie" then you have a number of options. For example:
Execute an ASK query with a filter on whether the resource has a type that contains the word "movie" somewhere in its uri (or its associated rdfs:label, or both). You would use a regular expression for this:
ASK where {
<http://dbpedia.org/resource/Inception> a ?type .
FILTER regex(str(?type), "^.*movie", "i")
}
Same idea, but return all matches and post-process the results (programmatically, I presume) to see if they match your request:
select distinct ?type where {
<http://dbpedia.org/resource/Inception> a ?type .
FILTER regex(str(?type), "^.*movie", "i")
}
Return all the types of the resource without applying a filter and post-process to see if they match your request:
select distinct ?type where {
<http://dbpedia.org/resource/Inception> a ?type
}
Many options. The SPARQL spec is your number one resource.
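For the post-processing options, a minimal Python sketch could look like the following. It assumes the endpoint returns the standard SPARQL JSON results format, where an ASK response carries a top-level "boolean" field and a SELECT response carries "results" and "bindings". The function names and the sample responses are made up for illustration.

```python
import re
from typing import List


def ask_result(response: dict) -> bool:
    """Read the true/false answer out of an ASK response."""
    return bool(response["boolean"])


def movie_like_types(response: dict, word: str = "movie") -> List[str]:
    """Keep only the ?type URIs that contain the given word, ignoring case."""
    pattern = re.compile(re.escape(word), re.IGNORECASE)
    return [
        binding["type"]["value"]
        for binding in response["results"]["bindings"]
        if pattern.search(binding["type"]["value"])
    ]


# Hand-written sample responses (not real endpoint output)
ASK_SAMPLE = {"head": {}, "boolean": True}
TYPES_SAMPLE = {
    "head": {"vars": ["type"]},
    "results": {"bindings": [
        {"type": {"type": "uri", "value": "http://www.w3.org/2002/07/owl#Thing"}},
        {"type": {"type": "uri", "value": "http://schema.org/Movie"}},
        {"type": {"type": "uri", "value": "http://dbpedia.org/ontology/Film"}},
    ]},
}

if __name__ == "__main__":
    print(ask_result(ASK_SAMPLE))
    print(movie_like_types(TYPES_SAMPLE))
    print(movie_like_types(TYPES_SAMPLE, word="film"))
```

Filtering on the client side like this makes it easy to accept several spellings ("movie", "film") without rewriting the query.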
First I suggest you start reading up on what exactly SPARQL is. There are tons of really good tutorials such as: this.
If you want to write SPARQL queries on DBpedia, there are various endpoints that you can use. They don't always accept all features that are supported by SPARQL, but if you don't want to go through the trouble of installing one locally, they can be a relatively reliable test environment. The queries below have been tested on a Virtuoso endpoint.
Let's say you want to find all the movies in dbpedia. You first need to know what is the URI for a movie type in dbpedia. If you open Inception in dbpedia, you can see that the type dbpedia-owl:Film is associated to it. So if you want to get the first 100 movies, you just need to call:
select distinct *
where {
?s ?p dbpedia-owl:Film
} LIMIT 100
If you want to know more about each of these movies, you just need to expand your query by adding triple patterns.
select distinct *
where {
?s ?p dbpedia-owl:Film.
?s ?x ?y.
} LIMIT 100

Extract Named Entities that are listed in dbpedia pages but there is no information about them

I am trying to collect albums and the list of songs per album. For example, for the album River_of_Tuoni, the dbpprop:title property provides the list of the songs, but only two of them are in DBpedia.
So when I run the following query, I only get two (My_Only_car, River_of_Tuoni_(song)).
SELECT ?songs
WHERE {
:River_of_Tuoni dbpprop:title ?songs. ?songs a dbpedia-owl:Song .
}
How can I get all the songs (Lullaby, Sunrise, ...)?
"dbpprop:title property provides the list of the songs but only two of them are in DBpedia."
The dbpprop:title property doesn't provide any list; there just happen to be a number of triples with that property. All the data is in DBpedia; the values you've shown in the screenshot indicate that two of the values happen to be URIs, while the other values happen to be strings. In general, the data with the dbpprop: properties is "dirtier" than the data with the dbpedia-owl: properties, so getting two different types of values is not very surprising.
The simple query
select ?title where {
dbpedia:River_of_Tuoni dbpprop:title ?title
}
returns all the titles. However, because the dbpedia-owl: properties have cleaner data, I'd usually suggest that you consider using one of those instead of dbpprop:title, but in this case it doesn't look like there's an alternative. You might consider this alternate query to get just strings, though:
select ?title where {
dbpedia:River_of_Tuoni (dbpprop:title/rdfs:label)|dbpprop:title ?title
filter isLiteral(?title)
}
For the cases where the value is another resource, taking the rdfs:label of the resource provides a string. (I think that this should also be achievable with dbpprop:title/(rdfs:label?), but that doesn't work with the DBpedia endpoint.)
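If the query-side fix is not an option, the mixed values can also be tidied up on the client. The Python sketch below is an alternative technique, not the property-path approach above: it walks a SPARQL JSON response for the simple ?title query and turns URI values into readable strings by taking the last path segment. That is only a heuristic stand-in for the real rdfs:label, and the function name and sample response are made up for illustration.

```python
from typing import List
from urllib.parse import unquote


def title_strings(response: dict) -> List[str]:
    """Return every ?title as a plain string, whether it was a URI or a literal."""
    titles = []
    for binding in response["results"]["bindings"]:
        term = binding["title"]
        if term["type"] == "uri":
            # Heuristic: use the last path segment of the URI as a readable name
            local_name = term["value"].rsplit("/", 1)[-1]
            titles.append(unquote(local_name).replace("_", " "))
        else:
            titles.append(term["value"])
    return titles


# Hand-written sample response (not real endpoint output)
SAMPLE = {
    "head": {"vars": ["title"]},
    "results": {"bindings": [
        {"title": {"type": "uri", "value": "http://dbpedia.org/resource/River_of_Tuoni_(song)"}},
        {"title": {"type": "literal", "xml:lang": "en", "value": "Lullaby"}},
        {"title": {"type": "literal", "xml:lang": "en", "value": "Sunrise"}},
    ]},
}

if __name__ == "__main__":
    print(title_strings(SAMPLE))
```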