Assume we have the triples
a1 blah <http://foo.com/f>
a1 pt c1
c1 pt d1
How would I construct a SPARQL query to select a1, c1, or d1? I need to be able to walk one blah predicate or multiple pt predicates.
So a1 can be part of a triple like: a1 blah <http://foo.com/f>,
but I don't know how to specify the case where it's not a1 but an object connected to a1 through another path in the other direction. I also need to retain the path using blah, since that's what connects a1 (or f1) to the object.
Here is how I got it to work, but it's not pretty:
select * where
{
  # other sparql stuff before it
  { ?s blah <http://foo.com/f> .
    ?s pt+ <c1> }
  UNION
  { <c1> blah <http://foo.com/f> }
}
I can change 'c1' to 'a1' or 'd1' and the above SPARQL still works.
AKSW's comment was exactly right: you need to use a sequence, not an alternative. Based on your example, it sounds like you actually want subjects that are connected to a specific object by a chain of zero or one occurrences of "blah", and then zero or more occurrences of "pt".
If you want to do this for specific predicates p and q that you can specify with IRI references, then it's just
?x p?/q* ?y
The * following the q means zero or more, and the ? following the p means zero or one.
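For instance, dropped into a complete query using the predicate names from the question (the ex: namespace is an assumption here, since blah and pt aren't given as full IRIs):
PREFIX ex: <http://example.org/>

SELECT ?x ?y
WHERE {
  # zero or one ex:blah, followed by zero or more ex:pt
  ?x ex:blah?/ex:pt* ?y .
}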
However, if you don't know the specific IRIs in advance, e.g., you want to use variables, in something like:
?x (?p)?/(?q)* ?y
then you may be out of luck. SPARQL property paths don't support variables. You can get wildcards in property paths with something like s|!s, since every URI is either s or it isn't, but there's no provision for putting repeatable variables into property paths.
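For what it's worth, the s|!s trick looks like this in practice (reusing the hypothetical ex: prefix from above; ex:s can be any IRI at all):
# matches any single forward predicate, since every IRI either is ex:s or is not
?x (ex:s|!ex:s) ?y .
# matches any chain of one or more arbitrary predicates
?x (ex:s|!ex:s)+ ?y .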
Suppose I have the following set of individuals, where some of them have more than one rdfs:label:
Individual    Label
ind_01        "ind-001"
ind_02        "ind-002"
ind_02        "ind-2"
ind_03        "label3"
ind_04        "ind-4"
ind_04        "ind-04"
...           ...
I would like to run a SPARQL query that retrieves only one label per individual, no matter which one (i.e., the choice can be totally arbitrary). Thus, a suitable output based on the above dataset would be:
Individual    Label
ind_01        "ind-001"
ind_02        "ind-002"
ind_03        "label3"
ind_04        "ind-4"
...           ...
You could use SAMPLE, which "returns an arbitrary value from the multiset passed to it".
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?individual (SAMPLE(?label) AS ?label_sample)
WHERE {
  ?individual rdfs:label ?label .
}
GROUP BY ?individual
I'm testing SPARQL with Protégé on this data file
https://raw.githubusercontent.com/miranda-zhang/cloud-computing-schema/master/example/sparql-generate/result/gcloud_vm.ttl
I validated that the following works:
PREFIX cocoon: <https://raw.githubusercontent.com/miranda-zhang/cloud-computing-schema/master/ontology_dev/cocoon.ttl>
SELECT ?VM ?cores
WHERE {
?VM a cocoon:VM ;
cocoon:numberOfCores ?cores .
}
For example, it returns something like:
https://w3id.org/cocoon/data/vm/gcloud/CP-COMPUTEENGINE-VMIMAGE-N1-ULTRAMEM-80-PREEMPTIBLE "80"#
https://w3id.org/cocoon/data/vm/gcloud/CP-COMPUTEENGINE-VMIMAGE-N1-HIGHCPU-64-PREEMPTIBLE "64"#
https://w3id.org/cocoon/data/vm/gcloud/CP-COMPUTEENGINE-VMIMAGE-N1-STANDARD-2 "2"#
https://w3id.org/cocoon/data/vm/gcloud/CP-COMPUTEENGINE-VMIMAGE-F1-MICRO "shared"#
https://w3id.org/cocoon/data/vm/gcloud/CP-COMPUTEENGINE-VMIMAGE-N1-HIGHCPU-8-PREEMPTIBLE "8"#
https://w3id.org/cocoon/data/vm/gcloud/CP-COMPUTEENGINE-VMIMAGE-N1-HIGHCPU-32 "32"#
https://w3id.org/cocoon/data/vm/gcloud/CP-COMPUTEENGINE-VMIMAGE-N1-HIGHMEM-16-PREEMPTIBLE "16"#
https://w3id.org/cocoon/data/vm/gcloud/CP-COMPUTEENGINE-VMIMAGE-N1-STANDARD-96-PREEMPTIBLE "96"#
https://w3id.org/cocoon/data/vm/gcloud/CP-COMPUTEENGINE-VMIMAGE-N1-STANDARD-4 "4"#
I'm not sure if I can apply a filter on ?cores. I tried the following, but they returned nothing:
cocoon:numberOfCores "shared" .
Or
FILTER(?cores = "4") .
I'd also like to apply a numeric filter on ?cores (e.g. > 4 and < 8), so do I have to make it an xsd:integer? But then I'd have to get rid of "shared", which stands for less than 1 core.
Thanks AKSW, impressive knowledge about Protégé.
In the end, I changed my data type to xsd:decimal. Seems to be enough for now.
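For reference, once the values are typed as xsd:decimal, a numeric range filter along these lines should work (just a sketch reusing the question's prefix and property; the datatype() guard drops non-numeric values such as "shared"):
PREFIX cocoon: <https://raw.githubusercontent.com/miranda-zhang/cloud-computing-schema/master/ontology_dev/cocoon.ttl>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?VM ?cores
WHERE {
  ?VM a cocoon:VM ;
      cocoon:numberOfCores ?cores .
  # keep only decimal-typed values strictly between 4 and 8; "shared" fails the datatype test
  FILTER(datatype(?cores) = xsd:decimal && ?cores > 4 && ?cores < 8)
}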
I'm trying to autocomplete what the user writes in an input, with terms in DBpedia, similar to this jsFiddle example. Try writing dog in the input of that jsFiddle, and you will see the 'Dog' term in the suggestions.
I have the following code, and the problem is that the 10-term list I got as a result does not contain the "Dog" alternative. So, if I could order the list by the length of the (string representation of) ?concept, then I could get that term. Is this possible?
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT DISTINCT ?concept
WHERE {
  ?concept a skos:Concept .
  FILTER regex(str(?concept), "dog", "i")
}
ORDER BY ASC(?concept)
LIMIT 10
So, if I could order the list by the length of ?concept, it would be possible to get the term, but I can't find the right expression to do it. Is it possible?
It sounds like you're looking for strlen.
order by strlen(str(?concept))
E.g.,
prefix skos: <http://www.w3.org/2004/02/skos/core#>

select distinct ?concept where {
  ?concept a skos:Concept .
  filter regex(str(?concept), "dog", "i")
}
order by strlen(str(?concept))
limit 10
(You can see the SPARQL results by running this against the DBpedia endpoint.)
That said, if you're just checking string membership, you don't need all the power of regular expressions, and it might be more efficient to use contains and lcase to check whether the lowercased ?concept contains "dog" with a filter like:
filter contains(lcase(str(?concept)), "dog")
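Dropped into the same query, that looks like:
prefix skos: <http://www.w3.org/2004/02/skos/core#>

select distinct ?concept where {
  ?concept a skos:Concept .
  # case-insensitive substring test, without the overhead of a regular expression
  filter contains(lcase(str(?concept)), "dog")
}
order by strlen(str(?concept))
limit 10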
The table of contents in the SPARQL spec has a big list of functions that you can browse. In particular, you'd want to look at the subsections of 17.4 Function Definitions.
I want to order the output of my SPARQL queries not alphabetically, but in a certain order that I define myself. I cannot find information on how to do this.
For example, I have a query:
SELECT DISTINCT ?a ?b ?c
WHERE { ... }
ORDER BY ?c
The variable ?c can have a limited range of values: "s", "m", "m1", "m2", "p" and "w".
I want the ordering to be as follows:
1) s
2) m
3) m1
4) m2
5) p
6) w
So, it is not an alphabetical order.
How can I force ORDER BY to produce output in this order?
I use the Fuseki SPARQL endpoint to query a Turtle file, and Jinja2 templates to render the results.
In general, you'd need to write a custom function, as described in Jeen's answer. However, when you have a limited number of values, and you know the specific ordering, you can simply include a values block that associates each value with its sort index and then sort on that. E.g., if you wanted to sort some sizes, you could do something like this:
select ?item where {
values (?size ?size_) { ("small" 1) ("medium" 2) ("large" 3) }
?item :hasSize ?size
}
order by ?size_
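Applied to the values from the question, that pattern might look like this (a sketch; your existing graph patterns binding ?a, ?b and ?c go where the comment is, and it assumes ?c is bound to plain string literals so the join with the VALUES block succeeds):
SELECT DISTINCT ?a ?b ?c
WHERE {
  # ... your existing graph patterns binding ?a ?b ?c ...
  VALUES (?c ?cOrder) { ("s" 1) ("m" 2) ("m1" 3) ("m2" 4) ("p" 5) ("w" 6) }
}
ORDER BY ?cOrder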
You can achieve any custom sorting order by writing a custom function that implements the sorting logic and injecting that function into your SPARQL engine. See this tutorial on how to create a custom function for the Sesame SPARQL engine - other SPARQL engines have similar functionality.
Once you have this function, you can use it in the ORDER BY argument:
SELECT ....
ORDER BY ex:customFunction(?c)
I have my model stored in a triple store (persistent storage).
I want to select all individuals related to a given document name.
I can do it in two ways.
1) A SPARQL query:
PREFIX base:<http://example#>
select ?s2 ?p2 ?o2
where {
{?doc base:fullName <file:/c:/1.txt>; ?p1 ?o1
} UNION {
?s2 ?p2 ?o2 ;
base:documentID ?doc }
}
Question: how do I create a Jena Model from the ResultSet?
2) StmtIterator stmtIterator = model.listStatements(...)
The problem with this approach is that I need to call the model.listStatements(...) operation several times:
a) get the document URI by document name
b) get a list of individual IDs related to this document URI
c) finally, get a collection of individuals
I'm concerned about performance: running model.listStatements(...) three times means many database requests.
Or is all the data read into memory from the database (I doubt it) during model creation:
Dataset ds = RdfStoreFactory.connectDataset(store, conn);
GraphStore graphStore = GraphStoreFactory.create(ds) ;
?
You need to back up a little and think more clearly about what you are trying to do. Your SPARQL query, once it's corrected (see below), will do a perfectly good job of producing an iterator over the result set, which will give you the properties of each of the documents you're looking for. Specifically, you get one set of bindings for ?s2, ?p2 and ?o2 for each row in the result set. That's what you ask for when you specify select ?s2 ?p2 ?o2, and it's normally what you want: usually we select some values out of the triple store in order to process them in some way (e.g. rendering them into a list in the UI), and for that an iterator over the results is exactly what we need.
You can have the query return a model rather than a result set by using a SPARQL CONSTRUCT or DESCRIBE query. However, you then need to iterate over the resources in that model, so you aren't much further forward (except that the model is smaller and in-memory).
Your query, incidentally, can be improved. The variables p1 and o1 make the query engine do useless work since you never use them, and there's no need for a union. Corrected, your query should be:
PREFIX base: <http://example#>
select ?s2 ?p2 ?o2
where {
  ?doc base:fullName <file:/c:/1.txt> .
  ?s2 base:documentID ?doc ;
      ?p2 ?o2 .
}
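If you do want a Jena Model back rather than a ResultSet, a CONSTRUCT form of the same pattern would look something like this (a sketch using the same base: prefix):
PREFIX base: <http://example#>
construct { ?s2 ?p2 ?o2 }
where {
  ?doc base:fullName <file:/c:/1.txt> .
  ?s2 base:documentID ?doc ;
      ?p2 ?o2 .
}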
To execute any query, select, describe or construct, from Java see the Jena documentation.
You can efficiently achieve the same results as your query using the model API. For example (untested):
Model m = ..... ; // your model here
String baseNS = "http://example#";
Resource fileName = m.createResource( "file:/c:/1.txt" );

// create RDF property objects for the properties we need. This can be done in
// a vocab class, or automated with schemagen
Property fullName = m.createProperty( baseNS + "fullName" );
Property documentID = m.createProperty( baseNS + "documentID" );

// find the ?doc with the given fullName
for (ResIterator i = m.listSubjectsWithProperty( fullName, fileName ); i.hasNext(); ) {
    Resource doc = i.next();

    // find all of the ?s2 whose documentID is ?doc
    for (StmtIterator j = m.listStatements( null, documentID, doc ); j.hasNext(); ) {
        Resource s2 = j.next().getSubject();

        // list the properties of ?s2
        for (StmtIterator k = s2.listProperties(); k.hasNext(); ) {
            Statement stmt = k.next();
            Property p2 = stmt.getPredicate();
            RDFNode o2 = stmt.getObject();
            // do something with s2 p2 o2 ...
        }
    }
}
Note that your schema design makes this more complex than it needs to be. If, for example, the document full name resource had a base:isFullNameOf property, then you could simply do a lookup to get the doc. Similarly, it's not clear why you need to distinguish between doc and s2: why not simply have the document properties attached to the doc resource?
Finally: no, opening a database connection does not load the entire DB into memory. However, TDB in particular does make extensive use of caching of regions of the graph in order to make queries more efficient.