How to recursively query with SPARQL?

I have RDF like this:
<companyA> <heldBy> <companyB> .
<companyA> <heldBy> <companyC> .
<companyB> <heldBy> <companyD> .
It's a stockholder relation. I want to query all holders of companyA, direct and indirect, and label each one with its level.
For example, companyB is level 1 and companyD is level 2.
How can I do this with SPARQL, or is it even doable with SPARQL?
Thanks for the help!
Following the comments, I think this is doable by first querying all holders with select ?holder where { <companyA> <heldBy>+ ?holder }.
Then I can calculate the length of the path to each of these nodes and sort them into levels.
It might even be possible with SPARQL alone; I'm working on it.
Thanks again for the help!
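The two-step idea above (collect every holder, then rank by path length) can be sketched client-side in Python. This is only a minimal sketch, assuming the heldBy triples have already been fetched into a dictionary; HELD_BY and holder_levels are hypothetical names, not part of any SPARQL library, and "level" is taken to mean the shortest path length.

```python
from collections import deque

# Hypothetical in-memory copy of the heldBy triples from the question:
# company -> list of its direct holders.
HELD_BY = {
    "companyA": ["companyB", "companyC"],
    "companyB": ["companyD"],
}

def holder_levels(start, held_by):
    """Return {holder: level}; level is the shortest heldBy path length from start."""
    levels = {}
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        company, depth = queue.popleft()
        for holder in held_by.get(company, []):
            if holder not in seen:  # also protects against ownership cycles
                seen.add(holder)
                levels[holder] = depth + 1
                queue.append((holder, depth + 1))
    return levels

print(holder_levels("companyA", HELD_BY))
```

Breadth-first search visits holders in order of distance, so the first time a holder is seen is also its lowest level.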

Related

Extract all rdfs:type from a specific DBPedia entry

I'm trying to write a simple SPARQL query generator that fetches all rdf:type relations of a specific DBpedia resource.
query = """SELECT * WHERE {{ <""" + resource + """> rdfs:type ?subject.}}"""
This yields the query
SELECT * WHERE {{ <http://dbpedia.org/page/Energy> rdfs:type ?subject.}}
But the query returns empty. What am I doing wrong? The DBPedia entry clearly has rdfs:type relations:
owl:Thing
dbo:Building
yago:Abstraction100002137
yago:Assets113329641
yago:NaturalResource113332009
yago:Possession100032613
yago:Relation100031921
yago:Resource113331778
yago:WikicatNaturalResources
Thanks in advance!
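Change the address from page to resource: http://dbpedia.org/page/Energy is the HTML page describing the resource, while http://dbpedia.org/resource/Energy is the resource itself. I also suggest using the keyword a, which is shorthand for rdf:type (note that the property is rdf:type, not rdfs:type). The query then looks like this: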
SELECT * WHERE {{ <http://dbpedia.org/resource/Energy> a ?subject.}}
To avoid this issue, check the exact resource addresses in a raw data format. For example, you can review the XML triples in a web browser: on the DBpedia page http://dbpedia.org/page/Energy, the top bar has a button named formats.
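For the query generator, the page-to-resource fix can be applied before the query string is built. The following is a minimal sketch; to_resource_iri and build_type_query are hypothetical helper names, and the sketch assumes every input URL follows the http://dbpedia.org/page/... pattern shown above.

```python
def to_resource_iri(url):
    """Map a DBpedia /page/ URL to the /resource/ IRI used in the RDF data."""
    return url.replace("http://dbpedia.org/page/", "http://dbpedia.org/resource/", 1)

def build_type_query(url):
    """Build a query that lists the rdf:type values of one resource."""
    resource = to_resource_iri(url)
    # Plain concatenation, so single braces are enough; doubled braces
    # are only needed when the string goes through str.format().
    return "SELECT ?type WHERE { <" + resource + "> a ?type . }"

print(build_type_query("http://dbpedia.org/page/Energy"))
```

A URL that already uses /resource/ passes through unchanged.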

Analysis with SPARQL

I am trying to accomplish some relatively simple analysis with a specific graph.
In MarkLogic, SPARQL property paths are built with the following patterns:
path+ (one or more repetitions of the path link)
path* (zero or more repetitions of the path link)
path? (zero or one path link)
path1/path2 (traversing two different links in sequence)
From here, one analysis I would like to achieve is retrieving all nodes that fulfill a specific condition between node X and node Y. Based on this, my query would be something like
?nodeX <nodeID> 1 .
?nodeY <nodeID> 250 .
?nodeX <nodeLink>* ?nodeY .
This does not seem correct to me, as I don't think it lets me retrieve the path linking nodeX to nodeY.
I would also like to know whether it is possible to compute measures such as:
Betweenness centrality, which measures the number of times a vertex lies on the shortest path between each pair of other vertices in the graph.
Closeness centrality, which measures the distance from one vertex to all other reachable vertices in the graph.
==Update==
Based on the suggestion I have managed to retrieve the path using the following query.
?nodeX <nodeID> "1" .
?nodeY <nodeID> "250" .
?nodeX <nodeLink>* ?v .
?v ?p ?u .
?u <nodeLink>* ?nodeY .
When I attempted to use <p> | !<p> in my query, an error occurred stating that ! was not a valid expression. However, I believe I can achieve the same thing with a variable in the predicate position (?p above), which accepts any predicate.
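Measures such as closeness centrality may be easier to compute outside SPARQL once the nodeLink edges have been retrieved. Below is a minimal Python sketch under stated assumptions: EDGES is hypothetical sample data, edges are treated as directed, and closeness is taken as the number of reachable vertices divided by the sum of their shortest-path distances, which is only one possible formulation of the measure described above.

```python
from collections import deque

def bfs_distances(source, edges):
    """Shortest hop counts from source to every reachable vertex (directed edges)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def closeness(source, edges):
    """Reachable vertex count divided by the sum of their distances (0.0 if none)."""
    dist = bfs_distances(source, edges)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

# Hypothetical nodeLink edges, keyed by source vertex.
EDGES = {"n1": ["n2", "n3"], "n2": ["n4"], "n3": ["n4"], "n4": []}
print(closeness("n1", EDGES))
```

The same bfs_distances helper gives the shortest distance between two specific nodes, for example bfs_distances("n1", EDGES)["n4"].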

SPARQL path traversing

I am trying to create a query using SPARQL on a ttl file where I have part of the graph representing links as follows:
Is it possible to search for the type Debit and get all the literals associated with its parent, i.e. R494Vol1D2, Salvo, Vassallo?
Do I need to use paths?
As AKSW correctly said, RDF is about directed graphs. So I created a small n-triples file based on your image of the graph. I assume that the dataset looks like this:
<http://natarchives.com.mt/deed/R494Vol1-D2> <http://purl.org/dc/terms/type> "Debit".
<http://natarchives.com.mt/deed/R494Vol1-D2> <http://purl.org/dc/terms/identifier> "R494Vol1D2".
<http://natarchives.com.mt/deed/R494Vol1-D2> <http://data.archiveshub.ac.uk/def/associatedWith> <http://natarchives.com.mt/person/person796>.
<http://natarchives.com.mt/person/person796> <http://xmlns.com/foaf/0.1/firstName> "Salvo".
<http://natarchives.com.mt/person/person796> <http://xmlns.com/foaf/0.1/family_name> "Vassallo".
Also, I did not know the prefix locah, but according to http://prefix.cc it stands for http://data.archiveshub.ac.uk/def/.
So if this dataset is correct you could use the following query:
1 SELECT ?literal WHERE{
2 ?start <http://purl.org/dc/terms/type> "Debit".
3 ?start <http://data.archiveshub.ac.uk/def/associatedWith>* ?parent.
4 ?parent ?hasLiteral ?literal.
5 FILTER(isLiteral(?literal) && ?literal != "Debit" )
6 }
In line 2 we define the starting point of our path: every vertex that has the type "Debit". In line 3 we look for all vertices that are connected to ?start by zero or more edges labelled <http://data.archiveshub.ac.uk/def/associatedWith> (because of the *, ?start itself is included) and bind them to ?parent. In line 4 we look for all triples that have ?parent as subject and store the object in ?literal. In line 5 we filter out everything that is not a literal, as well as "Debit" itself, which leaves the desired outcome.
If I modeled the direction of <http://data.archiveshub.ac.uk/def/associatedWith> wrongly, you could change line 3 of the query to:
?start ^<http://data.archiveshub.ac.uk/def/associatedWith>* ?parent
This would change the direction of the edge.
And to answer the question of whether you need to use paths: if you do not know how long the chain of edges labelled <http://data.archiveshub.ac.uk/def/associatedWith> will be, then in my opinion yes, you will have to use either * or + from property paths.
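To see what the query computes, its steps can be mirrored over the five sample triples in plain Python. This is an illustrative sketch only, not how a SPARQL engine evaluates property paths; the Lit marker class, the shortened node names (deed:..., person:...), and literals_for_type are all hypothetical.

```python
DCT = "http://purl.org/dc/terms/"
FOAF = "http://xmlns.com/foaf/0.1/"
ASSOC = "http://data.archiveshub.ac.uk/def/associatedWith"

class Lit(str):
    """Marks a value as an RDF literal; plain str values stand for IRIs."""

# The assumed dataset, with node IRIs shortened for readability.
TRIPLES = [
    ("deed:R494Vol1-D2", DCT + "type", Lit("Debit")),
    ("deed:R494Vol1-D2", DCT + "identifier", Lit("R494Vol1D2")),
    ("deed:R494Vol1-D2", ASSOC, "person:person796"),
    ("person:person796", FOAF + "firstName", Lit("Salvo")),
    ("person:person796", FOAF + "family_name", Lit("Vassallo")),
]

def literals_for_type(triples, type_value):
    # Line 2 of the query: every subject whose dcterms:type is the given literal.
    starts = [s for s, p, o in triples
              if p == DCT + "type" and isinstance(o, Lit) and o == type_value]
    # Line 3: associatedWith*, i.e. each start node plus everything reachable from it.
    reached, stack = set(starts), list(starts)
    while stack:
        node = stack.pop()
        for s, p, o in triples:
            if s == node and p == ASSOC and o not in reached:
                reached.add(o)
                stack.append(o)
    # Lines 4 and 5: keep literal objects of reached nodes, except the type itself.
    return sorted(o for s, p, o in triples
                  if s in reached and isinstance(o, Lit) and o != type_value)

print(literals_for_type(TRIPLES, "Debit"))
```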

How Do I Query Against Data.gov

I am trying to teach myself this weekend how to run API queries against a data source, in this case data.gov. At first I thought I'd use a simple SQL variant, but it seems that in this case I have to use SPARQL.
I've read through the documentation and downloaded Twinkle, but I can't quite get it to run. Here is an example of a query I'm running. I'm basically trying to find all gas stations around Denver, CO whose network is null.
PREFIX station: https://api.data.gov/nrel/alt-fuel-stations/v1/nearest.json?api_key=???location=Denver+CO
SELECT *
WHERE
{ ?x station:network ?network like "null"
}
Any help would be very much appreciated.
SPARQL is a graph pattern language for RDF triples. A query consists of a set of "basic graph patterns" described by triple patterns of the form <subject>, <predicate>, <object>. RDF identifies the subject and predicate with URIs, and the object is either a URI (object property) or a literal (datatype or language-tagged property). Each triple pattern in a query must therefore have three terms.
Since we don't have any examples of your data, I'll provide a way to explore the data a bit. Let's assume your prefix is correctly defined, which I doubt - it will not be the REST API URL, but the URI of the entity itself. Then you can try the following:
PREFIX station: <http://api.data.gov/nrel...>
SELECT *
WHERE
{ ?s station:network ?network .
}
...setting the PREFIX to correctly represent the namespace for network. Then look at the binding for ?network and find out how they represent null. Let's say it is a string as you show. Then the query would look like:
PREFIX station: <http://api.data.gov/nrel...>
SELECT ?s
WHERE
{ ?s station:network "null" .
}
There is no like in SPARQL, but you could use a FILTER clause using regex or other string matching features of SPARQL.
And please, please, please google "SPARQL" and "RDF". There is lots of information about SPARQL, and the W3C's SPARQL 1.1 Query Language Recommendation is a comprehensive source with many good examples.

SPARQL prefix wildcard

I'm attempting to write a SPARQL query which would allow me to find all nodes which are reachable from a given node. At the moment every edge has the prefix http://www.foo.com/edge# and there are 3 possible edges (uses, extends, implements). While I can get the correct result from "?start (edge:uses | edge:implements | edge:extends)* ?reached " I would like to reduce that down to one statement, some kind of wildcard after edge:, so that if I add more edge types then I wouldn't need to extend the query. Is this possible?
See the related question "SPARQL - Restricting Result Resource to Certain Namespace(s)".
If you know it's always going to be in the same namespace, you could have something looking like:
?start ?edge ?reached .
FILTER(REGEX(STR(?edge), "^http://www.foo.com/edge#"))
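Note that a triple pattern with a variable predicate matches a single edge, and SPARQL 1.1 property paths do not accept variables, so the filtered pattern alone does not follow chains of edges. One option is to fetch the filtered edges and compute the closure client-side. Here is a minimal Python sketch, assuming the triples are already available as tuples; reachable, the node names A to E, and the other# namespace are hypothetical.

```python
EDGE_NS = "http://www.foo.com/edge#"

def reachable(start, triples, namespace=EDGE_NS):
    """Nodes reachable from start over edges whose predicate IRI starts with namespace.

    The start node is included, mirroring the zero-or-more semantics of '*'.
    """
    reached, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for s, p, o in triples:
            if s == node and p.startswith(namespace) and o not in reached:
                reached.add(o)
                stack.append(o)
    return reached

# Hypothetical sample data: the last edge is outside the edge# namespace.
TRIPLES = [
    ("A", EDGE_NS + "uses", "B"),
    ("B", EDGE_NS + "extends", "C"),
    ("B", EDGE_NS + "implements", "E"),
    ("C", "http://www.foo.com/other#seeAlso", "D"),
]
print(sorted(reachable("A", TRIPLES)))
```

Any new edge type added under the same namespace is picked up without changing the code.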