How to join two queries in SPARQL with Python (sparql-client)

I am writing a SPARQL query to get the name and expertise from two RDF files. Using rdflib's Graph() I can parse and query one RDF file at a time, but I don't see how to join them. How can I parse two RDF files and run a SPARQL query that joins them (for example with UNION) from a Python client? Please share your ideas. Thank you.
name.rdf
<rdf:RDF
xmlns:researche="http://example.com/researche#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about="http://example.app.web/researche#researcher1">
<researche:name>John Snow</researche:name>
<researche:hasExpertise rdf:resource="http://example.com/researche#expertise1"/>
<researche:hasExpertise rdf:resource="http://example.com/researche#expertise3"/>
</rdf:Description>
</rdf:RDF>
expertise.rdf
<rdf:RDF
xmlns:researche="http://example.com/researche#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about="http://example.com/researche#expertise1">
<researche:name>Web engineering</researche:name>
</rdf:Description>
<rdf:Description rdf:about="http://example.com/researche#expertise2">
<researche:name>DevOps</researche:name>
</rdf:Description>
</rdf:RDF>
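Here is my query.py file:
from rdflib import Graph

g = Graph()
g2 = Graph()
g.parse("name.rdf")
g2.parse("expertise.rdf")
# Only g (name.rdf) is queried here; g2 is parsed but not yet used.
qres = g.query(
"""PREFIX researche: <http://example.com/researche#>
SELECT DISTINCT ?name ?expertise
WHERE {
?a researche:name ?name .
?a researche:hasExpertise ?expertise .
}""")
N.B.: query.py is incomplete.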

Related

Slow SPARQL query with rdf:Seq

I'm trying to retrieve all elements of a rdf:Seq with SPARQL. The RDF structure is as follows. A subproject with a rdf:Seq of timeclaims and the individual timeclaim information. The list of timeclaims for a subproject can be of any length:
<rdf:Description rdf:about="http://www.example.com/resource/subproject/2017-nieuw-1">
<rdf:type rdf:resource="http://www.example.com/ontologie/example/Subproject"/>
<rdfs:label>Subproject label</rdfs:label>
<pbl:subproject_timeclaims rdf:resource="http://www.example.com/resource/list/5853abbfdcc97"/>
</rdf:Description>
<rdf:Description rdf:about="http://www.example.com/resource/list/5853abbfdcc97">
<rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Seq"/>
<rdf:_1 rdf:resource="http://www.example.com/resource/timeclaim/5853abbfd6aa4"/>
<rdf:_7 rdf:resource="http://www.example.com/resource/timeclaim/5853abbfd957b"/>
<rdf:_6 rdf:resource="http://www.example.com/resource/timeclaim/5853abbfd8e68"/>
<rdf:_14 rdf:resource="http://www.example.com/resource/timeclaim/5853abbfdc541"/>
<rdf:_5 rdf:resource="http://www.example.com/resource/timeclaim/5853abbfd879f"/>
<rdf:_2 rdf:resource="http://www.example.com/resource/timeclaim/5853abbfd71db"/>
<rdf:_3 rdf:resource="http://www.example.com/resource/timeclaim/5853abbfd78be"/>
<rdf:_4 rdf:resource="http://www.example.com/resource/timeclaim/5853abbfd7f92"/>
<rdf:_8 rdf:resource="http://www.example.com/resource/timeclaim/5853abbfd9c4c"/>
<rdf:_9 rdf:resource="http://www.example.com/resource/timeclaim/5853abbfda31c"/>
<rdf:_10 rdf:resource="http://www.example.com/resource/timeclaim/5853abbfdaa08"/>
<rdf:_11 rdf:resource="http://www.example.com/resource/timeclaim/5853abbfdb0e6"/>
<rdf:_12 rdf:resource="http://www.example.com/resource/timeclaim/5853abbfdb7bd"/>
<rdf:_13 rdf:resource="http://www.example.com/resource/timeclaim/5853abbfdbe7f"/>
</rdf:Description>
<rdf:Description rdf:about="http://www.example.com/resource/timeclaim/5853abbfdc541">
<rdf:type rdf:resource="http://www.example.com/ontologie/example/Timeclaim"/>
<pbl:timeclaim_description>Description</pbl:timeclaim_description>
<pbl:timeclaim_hours>25</pbl:timeclaim_hours>
<pbl:timeclaim_employee
rdf:resource="http://www.example.com/resource/employee/2222333334444"/>
</rdf:Description>
Starting from the timeclaims I'm trying to retrieve the information of the subproject above (and filter on it). But the query is taking forever. Eventually the data is returned but I have the feeling it could be quicker.
SELECT *
WHERE {
?tc_item a :Timeclaim .
?tc_list ?p ?tc_item .
?subproject pbl:subproject_timeclaims ?tc_list
}
Could you point out any mistakes in the SPARQL query and better ways of doing this? Or maybe the RDF structure could be improved? The numbering in this case is not really relevant but the same list structure with rdf:Seq is present in more places in the database (and the order is important in those cases).

SPARQL: Specify datatype in INSERT query

Let's say I want to add these triples to my graph via a SPARQL query:
<?xml version="1.0"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:my="http://example.com/ontology#">
<rdf:Description rdf:about="http:/example.com/entity/thisEntity">
<rdf:type rdf:resource="http://example.com/ontology#Identification" />
<my:someID rdf:datatype="http://www.w3.org/2001/XMLSchema#string">003</my:someID>
<my:hasSource rdf:resource="http:/example.com/entity/thisSource" />
</rdf:Description>
</rdf:RDF>
How am I supposed to specify the data type of the third triple?
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX my: <http://example.com/ontology#>
INSERT DATA {
<http:/example.com/entity/thisEntity> rdf:type <http://example.com/ontology#Identification>;
my:hasSource <http:/example.com/entity/thisSource>;
my:someID "003"@xsd:string.
}
You're almost there. The @ is for language tags. Datatypes are appended with the ^^ separator instead, so use "003"^^xsd:string.
You will also need to add the namespace prefix to your update:
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
As an aside: in RDF 1.1, xsd:string is the default datatype for any literal that does not have an explicitly defined datatype nor a language tag. So "003" and "003"^^xsd:string are both the same, string-typed, literal.

SPARQL query does not find individual in imported ontology when run in jena

I have 3 ontology files where the first imports the second and the second imports the third:
The first ontology imports the second one:
<?xml version="1.0"?>
<rdf:RDF xmlns="http://www.example.com/user/rainer/ontologies/2016/1/usecase_individuals#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:owl="http://www.w3.org/2002/07/owl#"
xmlns:xml="http://www.w3.org/XML/1998/namespace"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:uc="http://www.example.com/user/rainer/ontologies/2016/1/usecase#">
<owl:Ontology rdf:about="http://www.example.com/user/rainer/ontologies/2016/1/usecase_individuals">
<owl:imports rdf:resource="http://www.example.com/user/rainer/ontologies/2016/1/usecase"/>
</owl:Ontology>
....
The second ontology imports the third one:
<?xml version="1.0"?>
<rdf:RDF xmlns="http://www.example.com/user/rainer/ontologies/2016/1/usecase#"
xml:base="http://www.example.com/user/rainer/ontologies/2016/1/usecase"
xmlns:fgcm="http://www.example.com/user/rainer/ontologies/2016/1/fgcm#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:owl="http://www.w3.org/2002/07/owl#"
xmlns:xml="http://www.w3.org/XML/1998/namespace"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:uc="http://www.example.com/user/rainer/ontologies/2016/1/usecase#">
<owl:Ontology rdf:about="http://www.example.com/user/rainer/ontologies/2016/1/usecase">
<owl:imports rdf:resource="http://www.boeing.com/user/rainer/ontologies/2016/1/fgcm"/>
</owl:Ontology>
....
And the third ontology (created in Protégé) asserts an individual:
<?xml version="1.0"?>
<rdf:RDF xmlns="http://www.boeing.com/user/rainer/ontologies/2016/1/fgcm#"
...
<owl:NamedIndividual rdf:about="http://www.boeing.com/user/rainer/ontologies/2016/1/fgcm#admin">
<rdf:type rdf:resource="http://www.boeing.com/user/rainer/ontologies/2016/1/fgcm#User"/>
<userName>admin</userName>
</owl:NamedIndividual>
...
When I open the first ontology in Protégé and execute the SPARQL query
PREFIX fgcm: <http://www.example.com/user/rainer/ontologies/2016/1/fgcm#>
SELECT ?subject ?name WHERE { ?subject fgcm:userName ?name}
it finds the individual in the third ontology without a problem. When I run the same SPARQL query from code in Jena, I don't get that individual. The query is run against an OntModel that was created with the default settings.
I know that Jena is able to load and import the ontologies because I can access classes and properties from the imported ontologies, both in SPARQL queries and directly using the Jena API. My problem appears to be limited to the individuals that are asserted in the imported ontology.
I have searched for settings (when loading the ontology such as the different OntModelSpecs or when creating/ running the query) that might change this behavior but haven't found any solutions.
It turned out that I was mistaken about Jena successfully loading the imported ontologies. (Not getting an error does not imply that the ontology that should be imported was actually found).
The SPARQL queries returned the expected result after using an OntDocumentManager and telling it where to find the ontology files that needed to be imported. This is the code snippet that worked for me:
// Package names as of Apache Jena 3.x; older releases used com.hp.hpl.jena.*
import org.apache.jena.ontology.OntDocumentManager;
import org.apache.jena.ontology.OntModel;
import org.apache.jena.ontology.OntModelSpec;
import org.apache.jena.rdf.model.ModelFactory;
// Map each imported ontology IRI to the local file that holds it.
OntDocumentManager mgr = new OntDocumentManager();
mgr.addAltEntry("http://www.boeing.com/user/rainer/ontologies/2016/1/usecase", "file:C:\\Dev\\luna_workspace\\fgcm_translate\\usecase.owl");
mgr.addAltEntry("http://www.boeing.com/user/rainer/ontologies/2016/1/fgcm", "file:C:\\Dev\\luna_workspace\\fgcm_translate\\fgcm.owl");
OntModelSpec spec = new OntModelSpec(OntModelSpec.OWL_DL_MEM_TRANS_INF);
spec.setDocumentManager(mgr);
OntModel model = ModelFactory.createOntologyModel(spec);
I hope this helps someone if they run into a similar problem.

OWL ontology: SPARQL query a range or domain of an ObjectProperty when they're unionOf classes

I have to query an ontology, version 1.4, through a SPARQL 1.1 endpoint, so I can't use OWL 2 semantics such as ClassAssertion.
Several properties in the ontology have this format:
<owl:ObjectProperty rdf:about="&km4c;hasGeometry">
<rdfs:comment>some services and all railway elements have a specific geometry like polygons or linestrings</rdfs:comment>
<rdfs:range rdf:resource="&gis;Geometry"/>
<rdfs:domain>
<owl:Class>
<owl:unionOf rdf:parseType="Collection">
<rdf:Description rdf:about="&km4c;RailwayElement"/>
<rdf:Description rdf:about="&km4c;Service"/>
</owl:unionOf>
</owl:Class>
</rdfs:domain>
</owl:ObjectProperty>
with a domain or range that is a union of more than one class. The problem is that I want to retrieve all the properties with a certain domain and range, but with the following query:
SELECT DISTINCT ?p
{
?p rdfs:range gis:Geometry.
?p rdfs:domain km4c:Service
}
I get no result, instead of km4c:hasGeometry.
Is there a way to somehow look inside the Collection with this kind of goal?
First, note that the semantics of union of domains and ranges may not be what you expect. In OWL, when you say that class D is a domain of property P, it means that whenever you have an assertion P(x,y), you can infer that D(x). That means that if a domain of P is a union C ⊔ D, then from P(x,y), you can infer that x is an element of C ⊔ D; i.e., that x is either a C or a D, but you don't necessarily know which. For instance, you might define:
hasWings rdfs:domain (Airplane ⊔ Bird)
Then, from hasWings(x,2), you could infer that x is an Airplane or a Bird, but you still wouldn't know which.
Anyhow, if you still want a union class as a domain, you can do that. In the RDF serialization of the mapping of the OWL ontology, the unioned classes are in an RDF list. It's a little bit more complicated to query over those, but you can certainly do it. Since you didn't provide a complete OWL ontology, we can't query over your actual data (in the future, please provide complete, minimal working data that we can use), but we can create a simple ontology. There are two classes, A and B, and two properties, p and q. The domain of p is A, and the domain of q is A or B:
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://example.org/"
xmlns:owl="http://www.w3.org/2002/07/owl#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#">
<owl:Ontology rdf:about="http://example.org/"/>
<owl:Class rdf:about="http://example.org/#A"/>
<owl:Class rdf:about="http://example.org/#B"/>
<owl:ObjectProperty rdf:about="http://example.org/#q">
<rdfs:domain>
<owl:Class>
<owl:unionOf rdf:parseType="Collection">
<owl:Class rdf:about="http://example.org/#A"/>
<owl:Class rdf:about="http://example.org/#B"/>
</owl:unionOf>
</owl:Class>
</rdfs:domain>
</owl:ObjectProperty>
<owl:ObjectProperty rdf:about="http://example.org/#p">
<rdfs:domain rdf:resource="http://example.org/#A"/>
</owl:ObjectProperty>
</rdf:RDF>
SPARQL syntax is much more like the N3/Turtle serialization of RDF, so it's helpful to see that serialization, too. The unionOf list is much clearer here:
@prefix : <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
<http://example.org/#A>
a owl:Class .
<http://example.org/#p>
a owl:ObjectProperty ;
rdfs:domain <http://example.org/#A> .
<http://example.org/#B>
a owl:Class .
<http://example.org/#q>
a owl:ObjectProperty ;
rdfs:domain [ a owl:Class ;
owl:unionOf ( <http://example.org/#A> <http://example.org/#B> )
] .
: a owl:Ontology .
Now you can use a query like this to find properties and their domains, or the unioned classes if one of the domains is a union class:
prefix : <http://example.org/>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix owl: <http://www.w3.org/2002/07/owl#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select ?p ?d where {
?p rdfs:domain/(owl:unionOf/rdf:rest*/rdf:first)* ?d
filter isIri(?d)
}
-----------------------------------------------------
| p                       | d                       |
=====================================================
| <http://example.org/#q> | <http://example.org/#A> |
| <http://example.org/#q> | <http://example.org/#B> |
| <http://example.org/#p> | <http://example.org/#A> |
-----------------------------------------------------
The interesting parts of that query are:
?p rdfs:domain/(owl:unionOf/rdf:rest*/rdf:first)* ?d
which says you follow a path from ?p to ?d, and the path:
starts with rdfs:domain
and is followed by zero or more repetitions of:
owl:unionOf
followed by zero or more rdf:rest
followed by a single rdf:first
It's not exactly related to this question, but you may find the discussion of querying RDF lists in this answer (disclosure: my answer) to Is it possible to get the position of an element in an RDF Collection in SPARQL? useful.
Then, I also added
filter isIri(?d)
because otherwise we get the node that represents the union class, but that's a blank node that you (probably) don't want.

MINUS or NOT EXISTS in SPARQL

I have a query and do not want it to show some information:
SELECT ?Recipe
WHERE {
?Ingredient <http://linkedrecipes.org/schema/ingredientOf> ?Recipe .
MINUS {
<http://linkedrecipes.org/schema#Milk> <http://linkedrecipes.org/schema/ingredientOf> ?Recipe .
}
}
I want to select all recipes in which Milk is not an ingredient.
Running this query just gives me an error.
My data is:
<?xml version="1.0"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rcp="http://linkedrecipes.org/schema/">
<rdf:Description rdf:about="http://linkedrecipes.org/schema#Milk">
<rcp:ingredientOf rdf:resource="http://linkedrecipes.org/schema#SaladUniqueID"/>
<rcp:ingredientOf rdf:resource="http://linkedrecipes.org/schema#CoffeeUniqueID"/>
</rdf:Description>
<rdf:Description rdf:about="http://linkedrecipes.org/schema#Salt">
<rcp:ingredientOf rdf:resource="http://linkedrecipes.org/schema#SoupUniqueID"/>
</rdf:Description>
</rdf:RDF>
In the result I want to have "SoupUniqueID".
Using FILTER NOT EXISTS will be easier.