How to form SPARQL queries that refer to multiple resources - sparql

This is a follow-up to my first question about SPARQL.
My earlier SPARQL query returned a list of Mountain objects.
From those results I picked one resource.
Now I want to get the values of the "is dbpedia-owl:highestPlace of" records for the chosen Mountain object,
that is, the names of the mountain ranges of which this mountain is the highest place.
I suspect this is complex, not only because I do not know the required syntax, but also because I get two objects back.
One of them is Mont Blanc massif, which is of type "place".
The other is Western Alps, which is of type "mountain range"; this is the record I want.
I need record 2 but not record 1. I know record 1 is also relevant, but such records do not always follow the same pattern: sometimes they appear to be of a YAGO type, which can be totally misleading. To be safe, I simply want to discard records whenever there is a type mismatch.
How can I form my SPARQL query so that it gets these "is dbpedia-owl:highestPlace of" records and also filters by type?

You can use the query below. Note, however, that in your example Mont_Blanc_massif is both a dbpedia-owl:Place and a dbpedia-owl:MountainRange.
select * where {
  ?place dbpedia-owl:highestPlace :Mont_Blanc .
  ?place rdf:type dbpedia-owl:MountainRange .
}
Edit after comment: filtering
It is not really clear what you want to filter out (YAGO types?). Technically, you can filter like this, for example:
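select * where {
  ?place dbpedia-owl:highestPlace :Mont_Blanc .
  ?place rdf:type dbpedia-owl:MountainRange .
  # Drop any ?place that has a value whose string form contains "yago".
  FILTER NOT EXISTS {
    ?place ?pred ?obj .
    # str() is needed because regex() expects a string, not an IRI.
    FILTER (regex(str(?obj), "yago"))
  }
}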
This filters out results that have any object whose URL contains 'yago'.

Extending the result from the previous answer, the appropriate query would be:
select * where {
  ?mountain a dbpedia-owl:Mountain ;
            dbpedia-owl:abstract ?abstract ;
            foaf:depiction ?depiction .
  ?range a dbpedia-owl:MountainRange ;
         dbpedia-owl:highestPlace ?mountain .
  FILTER(langMatches(lang(?abstract), "EN"))
}
LIMIT 10
This selects mountains that have an English abstract and at least one depiction (or else the pattern wouldn't match) and that are the highest place of some mountain range. Without the parts from the earlier question, if you just want to retrieve mountains that are the highest place of a range, you can use a query like this:
select * where {
  ?mountain a dbpedia-owl:Mountain .
  ?range a dbpedia-owl:MountainRange ;
         dbpedia-owl:highestPlace ?mountain .
}
LIMIT 10

Related

Why does the DISTINCT keyword lead to a different entity for these two queries?

Query 1
PREFIX ns: <http://rdf.freebase.com/ns/>
SELECT DISTINCT ?x
WHERE {
FILTER (!isLiteral(?x) OR lang(?x) = '' OR langMatches(lang(?x), 'en'))
?x ns:type.object.type ns:religion.religious_leadership_title .
?x ns:religion.religious_leadership_title.leaders ?c0 .
?c0 ns:religion.religious_organization_leadership.start_date ?sk0 .
}
ORDER BY ?sk0
LIMIT 1
Query 2
PREFIX ns: <http://rdf.freebase.com/ns/>
SELECT ?x
WHERE {
FILTER (!isLiteral(?x) OR lang(?x) = '' OR langMatches(lang(?x), 'en'))
?x ns:type.object.type ns:religion.religious_leadership_title .
?x ns:religion.religious_leadership_title.leaders ?c0 .
?c0 ns:religion.religious_organization_leadership.start_date ?sk0 .
}
ORDER BY ?sk0
LIMIT 1
So the only difference between Q1 and Q2 is the DISTINCT keyword in the SELECT ?x clause of Q1. However, Q1 gives the answer m.01h_90 while Q2 gives the answer m.05rd8.
Ideally, this should not lead to different results. As I understand it, the purpose of DISTINCT is only to remove duplicates from the result set, so if the original results contain no duplicates, adding DISTINCT should make no difference.
You have a tie on the value you're ordering by. Specifying DISTINCT causes a different execution plan, which orders the rows differently: they are still ordered by the one column as requested, but a different row comes out first. Add the output column to the ORDER BY clause and you should see consistent results between the two queries.
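The effect of a tie can be reproduced outside any SPARQL engine. The sketch below is plain Python with made-up rows: only the two entity IDs come from the question, and the dates are hypothetical, chosen so that both rows share the same sort key. With the date as the only key, the first row depends on the order in which the rows happened to be produced; adding the output column as a second key pins the result down.

```python
# Hypothetical rows (x, start_date); the two IDs are from the question,
# the dates are invented so that both rows tie on the sort key.
plan_a = [("m.05rd8", "1900-01-01"), ("m.01h_90", "1900-01-01")]
plan_b = list(reversed(plan_a))  # same rows, produced in another order


def first_row(rows, key):
    """Emulate ORDER BY ... LIMIT 1 with a stable sort."""
    return sorted(rows, key=key)[0]


# ORDER BY ?sk0 only: the tie is broken by whatever order the plan produced.
by_date = lambda row: row[1]
print(first_row(plan_a, by_date)[0])
print(first_row(plan_b, by_date)[0])

# ORDER BY ?sk0 ?x: the second key breaks the tie, so both plans agree.
by_date_then_x = lambda row: (row[1], row[0])
print(first_row(plan_a, by_date_then_x)[0])
print(first_row(plan_b, by_date_then_x)[0])
```

The first two prints differ, while the last two agree, which is the behaviour the answer predicts for ORDER BY ?sk0 ?x.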

How to use the Count query to return results from two different places?

Here is my query:
prefix dbc: <http://dbpedia.org/resource/Category:>
SELECT COUNT (?x) AS ?numberIn_Horror, COUNT (?y) AS ?numberIn_Action
WHERE { ?x dct:subject dbc:Horror_film .
?y dct:subject dbc:Action_film .}
I get a numeric output for this query; however, it does not match the actual number of movies in these genres.
Is there a way to change the query so that it returns the number of movies accurately?
Additionally, I am unsure what value it is currently returning.
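As for what the query currently returns: the two triple patterns share no variable, so every horror film ?x is paired with every action film ?y, and both COUNTs count the rows of that cross product. A minimal Python sketch with invented film names (hypothetical data, not DBpedia results) shows the effect and what separate counts give instead:

```python
from itertools import product

# Hypothetical category members; the names are invented for illustration.
horror = ["h1", "h2", "h3"]
action = ["a1", "a2"]

# Two patterns with no shared variable behave like a cross product:
# every ?x is combined with every ?y.
rows = list(product(horror, action))

# COUNT(?x) and COUNT(?y) both count rows, so both report 3 * 2 = 6.
count_x = sum(1 for x, _ in rows)
count_y = sum(1 for _, y in rows)
print(count_x, count_y)  # 6 6

# Counting each category on its own gives the intended numbers.
print(len(horror), len(action))  # 3 2
```

In SPARQL terms, this suggests counting each pattern separately, for example in two subqueries, or using COUNT(DISTINCT ?x) and COUNT(DISTINCT ?y) so that the repeated pairings are not counted again.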

Get a variable number of columns for output in SPARQL

Is there a way to get a variable number of columns for a given predicate? Essentially, I want to turn this:
title note
A. 1
A. 2
A. 3
B. 4
B. 5
into
title note1 note2 note3
A. 1 2 3
B. 4 5 null
For example, can I set the number of columns to the maximum number of notes in the query results? Thanks.
There are several ways you can approach this. One way is to change your query. Now, in the general case it is not possible to do a SELECT query that does exactly what you want. However, if you happen to know in advance what the maximum number of notes per title is, you can sort of do this.
Supposing your original query was something like this:
SELECT ?title ?note
WHERE { ?title :hasNote ?note }
And supposing you know titles have at most 3 notes, you could probably (untested) do something like this:
SELECT ?title ?note1 ?note2 ?note3
WHERE {
?title :hasNote ?note1 .
OPTIONAL { ?title :hasNote ?note2 . FILTER (?note2 != ?note1) }
OPTIONAL { ?title :hasNote ?note3 . FILTER (?note3 != ?note1 && ?note3 != ?note2) }
}
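As you can see, this is not a very nice solution: it doesn't scale, it is probably very inefficient to process, and because the filters only require the notes to differ, it returns one row for every ordering of a title's notes rather than a single row per title.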
Alternatives are various forms of post-processing. To make it simpler to post-process you could use an aggregate operator to get all notes for a single item on a single line at least:
SELECT ?title (GROUP_CONCAT(?note) as ?notes)
WHERE { ?title :hasNote ?note }
GROUP BY ?title
result:
title notes
A. "1 2 3"
B. "4 5"
You could then post-process the values of the ?notes variable to split them into the separate notes again.
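That post-processing step might look like the following Python sketch. It assumes the rows have already been fetched as (title, notes) pairs, that GROUP_CONCAT used its default single-space separator, and that no note contains a space itself. The `rows` data simply mirrors the example above, and `pivot` is a hypothetical helper name.

```python
# Rows as they might come back from the GROUP_CONCAT query (assumed shape).
rows = [("A.", "1 2 3"), ("B.", "4 5")]


def pivot(rows, separator=" "):
    """Split each concatenated string and pad to the widest row with None."""
    split_rows = [(title, notes.split(separator)) for title, notes in rows]
    width = max((len(notes) for _, notes in split_rows), default=0)
    return [
        (title, *notes, *[None] * (width - len(notes)))
        for title, notes in split_rows
    ]


for row in pivot(rows):
    print(row)
# ('A.', '1', '2', '3')
# ('B.', '4', '5', None)
```

If notes can contain spaces, a different separator can be requested in the query with GROUP_CONCAT(?note; separator="|") and passed to `pivot` accordingly.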
Another solution is that instead of using a SELECT query, you use a CONSTRUCT query to give you back an RDF graph, rather than a table, and work directly with that in your code. Tables are kinda weird in an RDF world if you think about it: you're querying a graph model, why is the query result not a graph but a table?
CONSTRUCT
WHERE { ?title :hasNote ?note }
...and then process the result in whatever API you're using to do the queries.

Fast publication date lookup with Wikidata Query Service

Is there a way to look up publication dates quickly in Wikidata Query Service's SPARQL to find publications of a certain date, e.g., today?
I was hoping that something like this query would be quick:
SELECT * WHERE {
?work wdt:P577 ?datetime .
BIND("2018-09-28T00:00:00Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> as ?now_datetime)
FILTER (?datetime = ?now_datetime)
}
LIMIT 10
However, it times out when using it on the SPARQL endpoint at https://query.wikidata.org
A range query does not seem to be quick either. The query below returns after almost 30 seconds:
SELECT * WHERE {
?work wdt:P577 ?datetime .
FILTER (?datetime > "2018-09-28T00:00:00Z"^^xsd:dateTime)
}
LIMIT 1
The trick is to avoid a full scan and use indexes:
VALUES:
SELECT * WHERE {
VALUES (?datetime) {("2018-09-28T00:00:00Z"^^xsd:dateTime)}
?work wdt:P577 ?datetime .
} LIMIT 10
hint:rangeSafe:
SELECT * WHERE {
VALUES (?datetime) {("2018-09-28T00:00:00Z"^^xsd:dateTime)}
?work wdt:P577 ?date_time .
hint:Prior hint:rangeSafe true .
FILTER (?date_time > ?datetime)
} LIMIT 10
[The rangeSafe hint] declare[s] that the data touched by the query for a specific triple pattern is strongly typed, thus allowing a range filter to be pushed down onto an index.
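Since the goal was to look up today's publications, the literal in the VALUES clause has to be generated for each day. A small Python sketch of that step follows. `build_query` is a hypothetical helper, and it assumes the dates of interest are stored at midnight UTC, as in the example literal above. It only builds the query text and does not contact the endpoint.

```python
from datetime import date


def build_query(day: date, limit: int = 10) -> str:
    """Return a VALUES-based lookup for works published on `day`."""
    # Midnight UTC is an assumption carried over from the example literal.
    literal = f'"{day.isoformat()}T00:00:00Z"^^xsd:dateTime'
    return (
        "SELECT * WHERE {\n"
        f"  VALUES (?datetime) {{({literal})}}\n"
        "  ?work wdt:P577 ?datetime .\n"
        f"}} LIMIT {limit}"
    )


print(build_query(date(2018, 9, 28)))
# For the current day, pass date.today() instead.
```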

SPARQL Query to find results which do not meet a certain criteria

I am trying to write a SPARQL query which will return a set of patient identifier codes (?Crid) which have associated with them a specific diagnosis code (?ICD9) and DO NOT have associated with them a specific medication AND which have an order date (?OrderDate) prior to their recruitment date (?RecruitDate). I have incorporated the OBIB ontology into my graph.
Here is what I have so far (a bit simplified and with a few steps through the graph omitted for readability/sensitivity):
SELECT DISTINCT ?Crid WHERE
{?Crid a obib:CRID .
#-- Return CRIDs with a diagnosis
?Crid obib:hasPart ?ICD9 .
?ICD9 a obib:diagnosis .
#-- Return CRIDs with a medical prescription record
?Crid obib:hasPart ?medRecord .
?medRecord a obib:medicalRecord .
#-- Return CRIDs with an order date
?medRecord obib:hasPart ?OrderDate .
?OrderDate a obib:dateOfDataEntry .
#-- Return CRIDs with a recruitment date
?Crid obib:hasPart ?FormFilling .
?FormFilling a obib:formFilling .
?RecruitDate obib:isAbout ?FormFilling .
?RecruitDate a obib:dateOfDataEntry .
#-- Filter results for specific ICD9 codes
FILTER (?ICD9 = '1')
#-- Subtract Results with Certain Medication and Order Date Prior to Recruitment
#-- This is the part that I think is giving me a problem
MINUS {
FILTER (regex (?medRecord, "medication_1", "i"))
FILTER (?RecruitDate-?OrderDate < "P0D"^^xsd:dayTimeDuration)
}
}
My gut feeling is that I am not using MINUS correctly. This query returns mostly the right results: I am expecting 10 results and it is returning 12. The extraneous 2 results did take "medication_1" and have order dates before their recruitment dates, so I do not want them to be included in the set.
In case it matters, I am using a Stardog endpoint to run this query and to store my graph data.
Instead of
#-- Subtract Results with Certain Medication and Order Date Prior to Recruitment
#-- This is the part that I think is giving me a problem
MINUS {
FILTER (regex (?medRecord, "medication_1", "i"))
FILTER (?RecruitDate-?OrderDate < "P0D"^^xsd:dayTimeDuration)
}
}
I'd probably just write this without MINUS as:
FILTER (!regex(?medRecord, "medication_1", "i"))
FILTER (?RecruitDate-?OrderDate >= "P0D"^^xsd:dayTimeDuration)
I'd also probably consider whether REGEX is the right tool here (would a simple string comparison work?), but that's a different issue.
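As for why the original MINUS removed nothing: in SPARQL 1.1, MINUS only discards a solution when the pattern inside the braces binds at least one variable that the solution also has, and the block in the question contains only FILTERs, so it binds none. The following Python sketch models solutions as dictionaries to illustrate that rule. The patient IDs are invented, and `minus` is a simplified stand-in for the algebra operator, not an engine implementation.

```python
def compatible(left, right):
    """Two solutions are compatible if they agree on every shared variable."""
    return all(left[var] == right[var] for var in left.keys() & right.keys())


def minus(solutions, removed):
    """Keep a solution unless some right-hand solution shares a variable
    with it and is compatible with it (simplified SPARQL 1.1 MINUS)."""
    return [
        s for s in solutions
        if not any(s.keys() & r.keys() and compatible(s, r) for r in removed)
    ]


# Invented patients: p2 is the one that should be excluded.
solutions = [{"Crid": "p1"}, {"Crid": "p2"}]

# A MINUS block with only FILTERs binds no variables, so nothing is shared
# with the outer solutions and every one of them survives.
print(minus(solutions, [{}]))

# A block that binds ?Crid itself can actually remove p2.
print(minus(solutions, [{"Crid": "p2"}]))
```

This matches the symptom in the question, where the two unwanted patients were still returned. For MINUS to take effect, the block would presumably need to repeat the triple patterns that bind ?Crid, ?medRecord, ?OrderDate and ?RecruitDate alongside the FILTERs.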