SPARQL: Count how often an object property is used - sparql

I have an RDF graph and I want to count how many times the skos:broader object property is actually used. Is there a simple (and ideally efficient) SPARQL query I can use for that?

Trivially simple answer:
SELECT (COUNT(*) as ?cnt)
WHERE { ?s skos:broader ?o }

Related

MarkLogic seems not to follow the RDF specification

Write the following RDF data into MarkLogic:
#prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
<http://John> <http://have> 10 .
<http://Peter> <http://have> "10"^^xsd:double .
Then execute the following query:
SELECT *
WHERE { ?s ?p 10. }
The query result is:
?s
?p
<http://John>
<http://have>
<http://Peter>
<http://have>
In RDF document, RDF term is equal if and only if the value and data type are both the same.
Literal term equality: Two literals are term-equal (the same RDF literal) if and only if the two lexical forms, the two datatype IRIs, and the two language tags (if any) compare equal, character by character. Thus, two literals can have the same value without being the same RDF term.
https://www.w3.org/TR/rdf11-concepts/#section-Graph-Literal
It seems that MarkLogic does not follow the RDF specification?
Expected query result:
The expected query result is:
?s
?p
<http://John>
<http://have>
Alternatives
Executing the SparQL query in Query Console and with Java Client API will both get the unexpected query result.
This query can return the expected query result in Apache Jena and RDF4j.
Can someone give me an answer or a hint about it?
A triple store that does that is not automatically wrong.
Try:
{ ?s ?p ?x . FILTER (sameTerm(?x, 10)}
{ ?s ?p ?x . FILTER (?x = 10) }
and also
Also try:
with 010 and +10, "010^^xsd:integer
and explore the lexical forms:
FILTER(str(?x) = "10")
The use cases for both value-based matching and term-based matching exist.
SPARQL query does not say whether the data loaded is exactly the data queried (by design). Several store do variations of value-processing as default behaviour. Jena does for TDB2 but keeps datatypes apart (TDB1 conflating datatypes was less popular).
It depends on what use case they are targetting. If you want a particular effect, then use a FILTER form of the query assuming data was not pre-processed on loading.
See the answer on Unexpected data writing result in MarkLogic
See https://www.w3.org/TR/sparql11-entailment/#CanonicalLit for the discussion that D-entailment gives many, many answers even with a restriction to canonical lexical forms.

Is there a better way to do case insensitive queries? [duplicate]

I am using Jena ARQ to write a SPARQL query against a large ontology being read from Jena TDB in order to find the types associated with concepts based on rdfs label:
SELECT DISTINCT ?type WHERE {
?x <http://www.w3.org/2000/01/rdf-schema#label> "aspirin" .
?x <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?type .
}
This works pretty well and is actually quite speedy (<1 second). Unfortunately, for some terms, I need to perform this query in a case-insensitive way. For instance, because the label "Tylenol" is in the ontology, but not "tylenol", the following query comes up empty:
SELECT DISTINCT ?type WHERE {
?x <http://www.w3.org/2000/01/rdf-schema#label> "tylenol" .
?x <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?type .
}
I can write a case-insensitive version of this query using FILTER syntax like so:
SELECT DISTINCT ?type WHERE {
?x <http://www.w3.org/2000/01/rdf-schema#label> ?term .
?x <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?type .
FILTER ( regex (str(?term), "tylenol", "i") )
}
But now the query takes over a minute to complete! Is there any way to write the case-insensitive query in a more efficient manner?
From all the the possible string operators that you can use in SPARQL, regex is probably the most expensive one. Your query might run faster if you avoid regex and you use UCASE or LCASE on both sides of the test instead. Something like:
SELECT DISTINCT ?type WHERE {
?x <http://www.w3.org/2000/01/rdf-schema#label> ?term .
?x <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?type .
FILTER (lcase(str(?term)) = "tylenol")
}
This might be faster but in general do not expect great performance for text search with any triple store. Triple stores are very good at graph matching and not so good at string matching.
The reason the query with the FILTER query runs slower is because ?term is unbound it requires scanning the PSO or POS index to find all statements with the rdfs:label predicate and filter them against the regex. When it was bound to a concrete resource (in your first example), it could use a OPS or POS index to scan over only statements with the rdfs:label predicate and the specified object resource, which would have a much lower cardinality.
The common solution to this type of text searching problem is to use an external text index. In this case, Jena provides a free text index called LARQ, which uses Lucene to perform the search and joins the results with the rest of the query.

How to list all all graphs in Virtuoso?

When I go to http://localhost:8890/sparql/, there are two fields: Default Data Set Name (Graph IRI) and query. How can I list what all graphs (that go in the former field) are available in my DB? The field is not mandatory and I can just run a query against all namespaces. But I would like to know how to list the graphs available.
The only non-empty graph I was able to run was http://localhost:8890/sparql
For example, in a relational database environment, I believe this kind of info could be retrieved from system tables.
As noted in comments, this query will get you a list of all Named Graphs (which, as also noted, are not the same as "namespaces") in the targeted store --
SELECT DISTINCT ?g
WHERE { GRAPH ?g {?s ?p ?o} }
ORDER BY ?g
You can see live results (limited here to 100 graph names) on the DBpedia endpoint (a very short list, as you would expect) and on URIBurner (a much longer and more varied list).
I know this is an old question, but I was facing the same problem and thought someone else might benefit from the solution I found.
I was trying to execute this query to list all graphs and it was taking about 5 minutes to return a result:
SELECT DISTINCT ?g
WHERE {
GRAPH ?g {?s ?p ?o}
}
Instead, you should try the following query suggested in this documentation from Virtuoso:
SELECT DISTINCT ?g
WHERE {
GRAPH ?g {?s a ?o}
}
This query finished in less than 1 second and returned the list of graphs in the store. Of course, this query would only return graphs which have at least one triple with the predicate rdf:type, but it's still a big improvement on the one suggested by TallTed.

Sparql query dbpedia istance's property and values in a domain

I want to write a SPARQL query to retrieve all properties and values for an instance in a particular domain. For example, I want all properties and values of the singer "Sting" only in "MusicalArtist" domain. How can I do this?
If I understand you correctly, this is a pretty basic SPARQL question, and you'd be better off starting with a good tutorial on SPARQL, or sitting down with the SPARQL 1.1 Query Language document. Just skim through it, you don't need to memorize everything right away, but just get familiar with it. After all, Stack Overflow isn't for reproducing standard documentation; you're expected to already be familiar with that.
As you've shown in a previous question, you can ask for all properties of a given subject with a query like:
select ?p ?o where {
dbpedia:Barry_White ?p ?o
}
You can select all instances of a particular class, e.g., dbpedia-owl:MusicalWork with a query like
select ?o where {
?o a dbpedia-owl:MusicalWork
}
It sounds like you're simply asking how to combine the two. It's just
select ?p ?o where {
dbpedia:Barry_White ?p ?o .
?o a dbpedia-owl:MusicalWork
}
SPARQL results (0 results)
That query doesn't actually have any results, because Barry White isn't related to anything that's a MusicalWork, evidently. However, if you just ask for Works that he's associated with, you'll get a few results:
select ?p ?o where {
dbpedia:Barry_White ?p ?o .
?o a dbpedia-owl:Work
}
SPARQL results (3 results)

How to list all different properties in DBPedia

I have a burning question concerning DBpedia. Namely, I was wondering how I could search for all the properties in DBpedia per page. The URI http://nl.dbpedia.org/property/einde concerns the property "einde". I would like to get all existing property/ pages. This does not seem too hard, but I don't know anything about SPARQL, so that's why I want to ask for some help. Perhaps there is some kind of dump of it, but I honestly don't know.
Rather than asking for pages whose URLs begin with, e.g., http://nl.dbpedia.org/property/, we can express the query by asking “for which values of ?x is there a triple ?x rdf:type rdf:Property in DBpedia?” This is a pretty simple SPARQL query to write. Because I expected that there would be lots of properties in DBPedia, I first wrote a query to count how many there are, and afterward wrote a query to actually list them.
There are 48292 things in DBpedia declared to be of rdf:type rdf:Property, as reported by this SPARQL query, run against one of DBpedia's SPARQL endpoints:
select COUNT( ?property ) where {
?property a rdf:Property
}
SPARQL Results
You can get the list by selecting ?property instead of COUNT( ?property ):
select ?property where {
?property a rdf:Property
}
SPARQL Results
I second Joshua Taylor's answer, however if you want to limit the properties to the Dutch DBpedia, you need to change the default-graph-uri query parameter to nl.dbpedia.org and set the SPARQL endpoint to nl.dbpedia.org/sparql, as in the following query. You will get a result-set of just above 8000 elements.
SELECT DISTINCT ?pred WHERE {
?pred a rdf:Property
}
ORDER BY ?pred
run query
These are the Dutch translations of the properties that have been mapped from Wikipedia so far. The full English list is also available. According to mappings.dbpedia.org, there are ~1700 properties with missing Dutch translations.