Graph restrictions to reduce dataset in SPARQL - sparql

I am writing a SPARQL query for a dataset containing thousands of graphs, and I want to compute on the fly which graphs are to be included my search. As an example, I might only want to include graphs written by me this year. This can be done easily enough by a restriction on the graph URI when the query is straightforward:
SELECT ?name WHERE {
GRAPH ?g { [ foaf:name ?name ] }
?g dc:creator ex:me ; dc:created ?date
FILTER( xsd:dateTime(?date) >= xsd:dateTime("2015-01-01") )
}
But suppose my query is more complex and I want a list of pairs of acquaintances. The naïve implementation is something like this:
SELECT ?name WHERE {
GRAPH ?g { ?name1 ^foaf:name/foaf:knows/foaf:name ?name2 }
?g dc:creator ex:me ; dc:created ?date
FILTER( xsd:dateTime(?date) >= xsd:dateTime("2015-01-01") )
}
This works fine if all three FOAF triples are in the same graph. But if any is in a different graph, it fails because ?g binds to a single graph in each result. I can explicitly write each of the three FOAF triples in their own GRAPH block, but then I have to associate each with their own graph URI variable, and repeat the graph restriction for each:
SELECT ?name WHERE {
GRAPH ?g1 { ?p1 foaf:name ?name1 }
?g1 dc:creator ex:me ; dc:created ?date1
FILTER( xsd:dateTime(?date1) >= xsd:dateTime("2015-01-01") )
GRAPH ?g2 { ?p1 foaf:knows ?p2 }
?g2 dc:creator ex:me ; dc:created ?date2
FILTER( xsd:dateTime(?date2) >= xsd:dateTime("2015-01-01") )
GRAPH ?g3 { ?p2 foaf:name ?name2 }
?g3 dc:creator ex:me ; dc:created ?date3
FILTER( xsd:dateTime(?date3) >= xsd:dateTime("2015-01-01") )
}
That code now does the right thing, but it rapidly becomes untenable as the query becomes more complex. If the main query has m triples and the graph restriction has n, the complete query ends up with m×n triples.
Is there a better solution within standard SPARQL 1.1? I'm aware that some SPARQL engines will fetch a graph from its URI, and then you can make that URI the URL of a GET request to a SPARQL endpoint, but that's not standard. I had hoped the federated query mechanism might help, but it doesn't seem to.

As I don't have your data at hand, I can't test my query reliably. However, you seem to need subqueries.
Subqueries are a way to embed SPARQL queries within other queries, normally to achieve results which cannot otherwise be achieved, such as limiting the number of results from some sub-expression within the query.
In your case, the objective is:
Getting the list of graphs for which you are the dc:creator
For each of those graphs, find some possibly useful triples
What you want as a result of your query isn't very clear, since you mention only ?name after SELECT... although it's nowhere in the rest of the query. Here is how you could try using subqueries and maybe find inspiration to solve your problem:
SELECT ?name1 ?name2 WHERE {
{GRAPH ?graph { ?p1 foaf:name ?name1 }}
UNION {GRAPH ?graph { ?p1 foaf:knows ?p2 }}
UNION {GRAPH ?graph { ?p2 foaf:name ?name2 }}
{ select ?graph where { #here we get the list of graphs you created
?graph dc:creator ex:me ; dc:created ?date
FILTER( xsd:dateTime(?date) >= xsd:dateTime("2015-01-01") )
}
}
}

Related

How do I recursively follow blank nodes to get all triples about an object? [duplicate]

This question already has answers here:
How do I construct get the whole sub graph from a given resource in RDF Graph?
(3 answers)
Closed 10 months ago.
Here is a data object for a user with ID 123, that is age 50 and likes spaghetti.
[ a :user;
:user_id 123;
:demographics [ :age 50 ];
:favorite_food [ :italian [ :spaghetti true ]]
].
Is there any sparql query where I could get back all the triples related to this user without having to do a UNION for each possible blank node value follow?
I'm trying to convert json into triples, then turn the triples back into json,
so it would be very convenient be able to query of objects of unknown nesting depth without adding a parent ID property to every child element of the json.
This seems to work but I would have to do a UNION for every layer of nesting
select ?e1 ?a1 ?v1 { ?e :user_id 123.
{ ?e ?a1 ?v1. BIND(?e AS ?e1) }
UNION { ?e ?a ?v. ?v ?a1 ?v1. BIND(?v AS ?e1). } }
UninformedUser has a pointed out a solution. From the look of the query it seems like the recursive property path trick wouldn't grab the first level before nesting, but it does grab all data.
select ?s ?p ?o { ?e :user_id 123. ?e (<>|!<>)* ?s. ?s ?p ?o }
#UninformedUser provided a solution in a comment. Though the OP thought this query using the recursive property path trick wouldn't grab the first level before nesting, it does grab all their desired data.
SELECT ?s ?p ?o
{ ?e :user_id 123 .
?e (<>|!<>)* ?s .
?s ?p ?o
}

i want to get the names of similar types using sparql queries from dbpedia

I need to find the names of similar types from DBpedia so I'm trying to figure out a query which can return me the names of entities which have same subject type in its dct:subject (example I want to find similar types of white house so i want to write a query for same . I'm considering the dct:subject to find them ). If there is any other approach please mention it
Previously I tried it for rdf:type but the result are not so good and some time it shows time out
I have done my problem by the query mentioned below and now i want to consider dct:subject instead of rdf:type
select distinct ?label ?resource count(distinct ?type) as ?score where {
values ?type { dbo:Thing dbo:Organization yago:WikicatIslam-relatedControversies yago:WikicatIslamistGroups yago:WikicatRussianFederalSecurityServiceDesignatedTerroristOrganizations yago:Abstraction100002137 yago:Act100030358 yago:Cabal108241798 yago:Group100031264 yago:Movement108464601 yago:PoliticalMovement108472335
}
?resource rdfs:label ?label ;
foaf:name ?name ;
a ?type .
FILTER (lang(?label) = 'en').
}
ORDER BY DESC(?score)

Property paths in Sparql

I am posing a Sparql query to Dbpedia endpoint with property paths:
select (COUNT(distinct ?s2) AS ?count) WHERE{
?s2 skos:broader{0,2} dbc:Countries_in_Europe
}
I want to pose the same query without property paths:
select (COUNT(distinct ?s2) AS ?count) (COUNT(distinct ?s1) AS ?count1) WHERE{
?s2 skos:broader dbc:Countries_in_Europe.
?s1 skos:broader ?s2.
}
I have two questions:
Is it possible to get ?s1+?s2 for the second query?
For the second query, I expect the sum of the count numbers +1 (dbc:Countries_in_Europe) should be the same with the first query. But they are not. What is wrong?
Thanks in advance.
You're using non-standard SPARQL, i.e. restricting the depth did not make it to the final version, see the W3C specs
I guess the first query is supposed to returns sub-categories of the given one up to a depth 2, right? Your second query doesn't do the same. You have to use a UNION of each distance, i.e. one for the direct sub-categories, and one for the other levels.
SELECT (COUNT(distinct ?s) AS ?count) WHERE {
{
?s skos:broader dbc:Countries_in_Europe
} UNION {
?s1 skos:broader dbc:Countries_in_Europe.
?s skos:broader ?s1
}
}
Note, in your first query you used {0,2} which means due to 0 distance the category dbc:Countries_in_Europe itself is also part of the result. If you need it, you should add +1 to the result of the second query.
Update
As per #JohuaTaylor's comment below, a more compact syntax would be
SELECT (COUNT(distinct ?s) AS ?count) WHERE {
?s skos:broader/skos:broader? dbc:Countries_in_Europe
}

How to return all S->P->O triples from a starting resource to a specified path depth?

My goal is to graphically represent the S->P->O relations within a depth two edges from the specified resource, p:Person_1. I want all relations within that path length to be returned from my query as ?s, ?p, ?o for further processing in my graphical application.
I tried the first query below which gives me my first set of ?s ?p ?o with repeats, then ?p2, ?o2, ?p3, ?o3 as additional columns in the result. I want to bind ?p2 and ?p3 to ?p, ?o2 and ?o3 to ?o.
SELECT *
WHERE {
p:Person_1 ?p ?o .
BIND("p:Person_1" as ?s)
OPTIONAL{
?o ?p2 ?o2 .
}
OPTIONAL{
?o2 ?p3 ?o3 .
}
}
Then, based on How do I construct get the whole sub graph from a given resource in RDF Graph?, I tried using CONSTRUCT to return the graph.
PREFIX p: <http://www.example.org/person/>
PREFIX x: <example.org/foo/>
construct { ?s ?p ?o }
FROM <http://localhost:8890/MYGRAPH>
where { p:Person_1 (x:|!x:)* ?s .
?s ?p ?o .
}
I am using Virtuoso and I get the error:
Virtuoso 37000 Error SP031: SPARQL compiler: Variable ?_::trans_subj_9_3 in T_IN list is not a value from some triple
I could post-process the result from my first query but I want to learn how to do this correctly with SPARQL, preferably on Virtuoso.
Update after testing the advice from #AKSW :
Both CONSTRUCT and SELECT statements work with the pattern suggested.
CONSTRUCT { ?s ?p ?o }
FROM <http://localhost:8890/MYGRAPH>
where { p:Person_1 (x:foo|!x:bar)* ?s .
?s ?p ?o .
} LIMIT 100
and:
SELECT s ?p ?o
FROM <http://localhost:8890/MYGRAPH>
where { p:Person_1 (x:foo|!x:bar)* ?s .
?s ?p ?o .
} LIMIT 100
The SELECT results in several duplicates that cannot be removed using DISTINCT, which results in an error that I assume is due to the 'datatype' of some of the returned values.
Virtuoso 22023 Error SR066: Unsupported case in CONVERT (DATETIME -> IRI_ID)
It appears some post-SPARQL processing is in order.
This gets me most of the way there. Still hoping I can find a solution for SPARQL that is like Cypher's "number of hops away" :
OPTIONAL MATCH path=s-[*1..3]-(o)
Here is a SPARQL query that works in Virtuoso. Note the SPARQL W3C standard does not support this syntax and it will fail in other triplestores.
PREFIX p: <http://www.example.org/person/>
PREFIX x: <example.org/foo/>
# CONSTRUCT {?s ?p ?o} # If you wish to return the graph
SELECT ?s ?p ?o # To return the triples
FROM <http://localhost:8890/MYGRAPH>
where { p:Person_1 (x:foo|!x:bar){1,3} ?s .
?s ?p ?o .
}LIMIT 100
See also K. Idehen's wiki entry here: http://linkedwiki.com/exampleView.php?ex_id=141
And thanks to #Joshua Taylor for advice in the same area.
Working Drafts of SPARQL 1.1 Property Paths included the {n,m} operator for handling this issue, which was implemented (and will remain supported) in Virtuoso. Here's a tweak to #tim's response.
Live SPARQL Query Results Page using the DBpedia endpoint (which is a Virtuoso instance).
Live SPARQL Query Definition Page that opens up query source code in the default DBpedia query editor.
Actual Query Example:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?s AS ?Entity
?o AS ?Category
WHERE {
?s rdf:type <http://dbpedia.org/ontology/AcademicJournal> ;
rdf:type{1,3} ?o
}
LIMIT 100
Should you be looking for LinkedIn-like presentation of Contact Networks and Degrees of Separation between individuals, here is an example using Virtuoso-specific SPARQL Extensions that solve this particular issue:
SELECT ?o AS ?WebID
((SELECT COUNT (*) WHERE {?o foaf:knows ?xx})) AS ?contact_network_size
?dist AS ?DegreeOfSeparation
<http://www.w3.org/People/Berners-Lee/card#i> AS ?knowee
WHERE
{
{
SELECT ?s ?o
WHERE
{
?s foaf:knows ?o
}
} OPTION (TRANSITIVE, t_distinct, t_in(?s), t_out(?o), t_min (1), t_max (4), t_step ('step_no') AS ?dist) .
FILTER (?s= <http://www.w3.org/People/Berners-Lee/card#i>)
FILTER (isIRI(?s) and isIRI(?o))
}
ORDER BY ?dist DESC (?contact_network_size)
LIMIT 500
Note: this approach is the only way (at the current time) to expose actual relational hops between entities in an Entity Relationship Graph that includes Transitive relations.
Live Link to Query Results
Live Link to Query Source Code
Bearing in mind that the r{n,m} operator was deprecated in the final SPARQL 1.1 (but will remain supported in Virtuoso), you can use r/r?/r? instead of r{1,3}, if you want to work strictly off the current spec:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?s AS ?Entity
?o AS ?Category
WHERE {
?s rdf:type <http://dbpedia.org/ontology/AcademicJournal> ;
rdf:type / rdf:type? / rdf:type? ?o
}
LIMIT 100
Here's a live example, against the DBpedia instance hosted in Virtuoso.

retrieving most specific classes of instances

Is it possible to have a definition of an resource (from DBpedia) with a SPARQL query? I want to have something like the TBox and ABox that are shown in (Conceptual) Clustering methods for
the Semantic Web: issues and applications (slides 10–11). For example, for DBpedia resource Stephen King, I would like to have:
Stephen_King : Person &sqcap; Writer &sqcap; Male &sqcap; … (most specific classes)
You can use a query like the following to ask for the classes of which Stephen King is an instance which have no subclasses of which Stephen King is also an instance. This seems to align well with the idea of “most specific classes.” However, since (as far as I know) there's no reasoner attached to the DBpedia SPARQL endpoint, there may be subclass relationships that could be inferred but which aren't explicitly present in the data.
select distinct ?type where {
dbr:Stephen_King a ?type .
filter not exists {
?subtype ^a dbr:Stephen_King ;
rdfs:subClassOf ?type .
}
}
SPARQL results
Actually, since every class is an rdfs:subClassOf itself, you might want to add another line to that query to exclude the case where ?subtype and ?type are the same:
select distinct ?type where {
dbr:Stephen_King a ?type .
filter not exists {
?subtype ^a dbr:Stephen_King ;
rdfs:subClassOf ?type .
filter ( ?subtype != ?type )
}
}
SPARQL results
If you actually want a result string like the one shown in those slides, you could use values to bind a variable to dbr:Stephen_King, and then use some grouping and string concatenation to get something nicer looking (sort of):
select
(concat( ?person, " =\n", group_concat(?type; separator=" AND\n")) as ?sentence)
where {
values ?person { dbr:Stephen_King }
?type ^a ?person .
filter not exists {
?subtype ^a ?person ;
rdfs:subClassOf ?type .
filter ( ?subtype != ?type )
}
}
group by ?person
SPARQL results
http://dbpedia.org/resource/Stephen_King =
http://dbpedia.org/class/yago/AuthorsOfBooksAboutWritingFiction AND
http://dbpedia.org/ontology/Writer AND
http://schema.org/Person AND
http://xmlns.com/foaf/0.1/Person AND
http://dbpedia.org/class/yago/AmericanSchoolteachers AND
http://dbpedia.org/class/yago/LivingPeople AND
http://dbpedia.org/class/yago/PeopleFromBangor,Maine AND
http://dbpedia.org/class/yago/PeopleFromPortland,Maine AND
http://dbpedia.org/class/yago/PeopleFromSarasota,Florida AND
http://dbpedia.org/class/yago/PeopleSelf-identifyingAsAlcoholics AND
http://umbel.org/umbel/rc/Artist AND
http://umbel.org/umbel/rc/Writer AND
http://dbpedia.org/class/yago/20th-centuryNovelists AND
http://dbpedia.org/class/yago/21st-centuryNovelists AND
http://dbpedia.org/class/yago/AmericanHorrorWriters AND
http://dbpedia.org/class/yago/AmericanNovelists AND
http://dbpedia.org/class/yago/AmericanShortStoryWriters AND
http://dbpedia.org/class/yago/CthulhuMythosWriters AND
http://dbpedia.org/class/yago/HorrorWriters AND
http://dbpedia.org/class/yago/WritersFromMaine AND
http://dbpedia.org/class/yago/PeopleFromDurham,Maine AND
http://dbpedia.org/class/yago/PeopleFromLisbon,Maine AND
http://dbpedia.org/class/yago/PostmodernWriters