I'm trying to look up a company based on its ticker symbol using SPARQL.
This query lists businesses and their tickers (basic query):
SELECT DISTINCT ?id ?idLabel ?ticker
WHERE {
?id wdt:P31/wdt:P279* wd:Q4830453 .
?id wdt:P249 ?ticker .
?id rdfs:label ?idLabel
FILTER(LANG(?idLabel) = 'en').
}
But IBM is not included, because IBM has its stock ticker placed 'inside' the P414 property (stock exchange).
https://www.wikidata.org/wiki/Q37156
How can I expand this list to include companies whose P249 ticker is placed "inside" the P414 statement?
Here is how I can show that IBM is not included:
SELECT DISTINCT ?id ?idLabel ?ticker
WHERE {
?id wdt:P31/wdt:P279* wd:Q4830453 .
?id wdt:P249 ?ticker . FILTER(LCASE(STR(?ticker)) = 'ibm') .
?id rdfs:label ?idLabel
FILTER(LANG(?idLabel) = 'en').
}
So the answer, based on AKSW's and Stanislav's comments, is that THIS query will list all the stocks on the New York Stock Exchange (as long as the ticker is listed 'under' the exchange):
SELECT DISTINCT ?id ?idLabel ?exchange ?ticker
WHERE {
?id wdt:P31/wdt:P279* wd:Q4830453 .
?id p:P414 ?exchange .
?exchange ps:P414 wd:Q13677 .
?exchange pq:P249 ?ticker .
?id rdfs:label ?idLabel
FILTER(LANG(?idLabel) = 'en').
}
And THIS query will find a specific stock (IBM) on the New York Stock Exchange:
SELECT DISTINCT ?id ?idLabel ?exchange ?ticker
WHERE {
?id wdt:P31/wdt:P279* wd:Q4830453 .
?id p:P414 ?exchange .
?exchange ps:P414 wd:Q13677 .
?exchange pq:P249 ?ticker . FILTER(LCASE(STR(?ticker)) = 'ibm') .
?id rdfs:label ?idLabel
FILTER(LANG(?idLabel) = 'en').
}
And THIS query will find a specific stock on ANY stock exchange, OR referenced directly (shown here with three different stock tickers, one per branch, to illustrate the search). This query is quite long because Wikidata sometimes has the stock exchange as a sub-field UNDER the ticker, and sometimes the other way around. Oh, and sometimes they are two different fields altogether (not linked). Oh joy.
SELECT DISTINCT ?id ?idLabel ?exchange ?exchangeLabel ?ticker
WHERE {
?id wdt:P31/wdt:P279* wd:Q4830453 .
{
# Find cases where the ticker is the main attribute, and the exchange may be below it.
?id wdt:P249 ?ticker . FILTER(LCASE(STR(?ticker)) = 'nsu') .
?id p:P249 ?tickersub .
?tickersub pq:P414 ?exchange
}
UNION {
# Find cases where the exchange is the main attribute, and the ticker is below it.
?id wdt:P414 ?exchange .
?id p:P414 ?exchangesub .
?exchangesub pq:P249 ?ticker . FILTER(LCASE(STR(?ticker)) = 'ibm') .
}
UNION {
# Find cases where the exchange and the ticker are two separate, unlinked attributes.
?id wdt:P414 ?exchange .
?id wdt:P249 ?ticker . FILTER(LCASE(STR(?ticker)) = 'frme') .
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
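In practice you will usually search for one ticker at a time, not three different ones. A minimal sketch of how the query above could be parameterized from Python follows. The helper name `build_ticker_query` and the `__TICKER__` placeholder are made up for this illustration, and the script only builds the query text; it does not send it to the endpoint.

```python
import re

# The three-branch query from above, with the ticker left as a placeholder.
# "__TICKER__" is a hypothetical marker chosen for this sketch.
QUERY_TEMPLATE = """
SELECT DISTINCT ?id ?idLabel ?exchange ?exchangeLabel ?ticker
WHERE {
  ?id wdt:P31/wdt:P279* wd:Q4830453 .
  {
    # Ticker is the main attribute; the exchange may be below it.
    ?id wdt:P249 ?ticker . FILTER(LCASE(STR(?ticker)) = '__TICKER__') .
    ?id p:P249 ?tickersub .
    ?tickersub pq:P414 ?exchange
  }
  UNION {
    # Exchange is the main attribute; the ticker is below it.
    ?id wdt:P414 ?exchange .
    ?id p:P414 ?exchangesub .
    ?exchangesub pq:P249 ?ticker . FILTER(LCASE(STR(?ticker)) = '__TICKER__') .
  }
  UNION {
    # Exchange and ticker are two separate, unlinked attributes.
    ?id wdt:P414 ?exchange .
    ?id wdt:P249 ?ticker . FILTER(LCASE(STR(?ticker)) = '__TICKER__') .
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
"""


def build_ticker_query(ticker: str) -> str:
    """Return the query text with the same lower-cased ticker in all three branches."""
    # Reject anything that could break out of the quoted SPARQL string.
    if not re.fullmatch(r"[A-Za-z0-9.\-]+", ticker):
        raise ValueError(f"unexpected characters in ticker: {ticker!r}")
    return QUERY_TEMPLATE.replace("__TICKER__", ticker.lower())


if __name__ == "__main__":
    print(build_ticker_query("IBM"))
```

The printed text can be pasted into https://query.wikidata.org/ as is. The character whitelist is a deliberately strict assumption; loosen it if you need tickers with other characters.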
I created the following SPARQL query for Wikidata. Its results are records for the states of Germany, but as you can see, each result occurs four times in a row (you can test it here: https://query.wikidata.org/). I suspect the problem is related to the geo coordinates or the languages, but I can't resolve it. What is wrong with this query, and how can I fix it to get results without repetition?
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX schema: <http://schema.org/>
PREFIX psv: <http://www.wikidata.org/prop/statement/value/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wd: <http://www.wikidata.org/entity/>
SELECT DISTINCT ?subject ?featureCode ?countryCode ?name ?latitude ?longitude ?description ?iso31662
WHERE
{ ?subject wdt:P31 wd:Q1221156 ;
rdfs:label ?name ;
wdt:P17 ?countryClass .
?countryClass
wdt:P297 ?countryCode .
?subject wdt:P31/(wdt:P279)* ?adminArea .
?adminArea wdt:P2452 "A.ADM1" ;
wdt:P2452 ?featureCode .
?subject wdt:P300 ?iso31662
OPTIONAL
{ ?subject schema:description ?description
FILTER ( lang(?description) = "en" )
?subject p:P625 ?coordinate .
?coordinate psv:P625 ?coordinateNode .
?coordinateNode
wikibase:geoLatitude ?latitude ;
wikibase:geoLongitude ?longitude
}
FILTER ( lang(?name) = "en" )
FILTER EXISTS { ?subject wdt:P300 ?iso31662 }
}
ORDER BY lcase(?name)
OFFSET 0
LIMIT 200
In short, "9.0411111111111"^^xsd:double and "9.0411111111111"^^xsd:decimal are distinct RDF terms, though they might be equal in some sense, so DISTINCT does not merge them.
To see the datatypes, try this SELECT clause:
SELECT DISTINCT ?subject ?featureCode ?countryCode ?name ?description ?iso31662
(datatype(?latitude) AS ?lat)
(datatype(?longitude) AS ?long)
and to cast both coordinates to a single datatype, this one:
SELECT DISTINCT ?subject ?featureCode ?countryCode ?name ?description ?iso31662
(xsd:decimal(?latitude) AS ?lat)
(xsd:decimal(?longitude) AS ?long)
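To make the effect concrete, here is a toy model in Python (not a SPARQL engine). It treats a literal as a (lexical form, datatype) pair, which is how DISTINCT sees it. The latitude value is made up, and the idea that both coordinates come back with both datatypes is an assumption; if it holds, 2 x 2 combinations would explain the fourfold repetition.

```python
from decimal import Decimal
from itertools import product

XSD_DOUBLE = "xsd:double"
XSD_DECIMAL = "xsd:decimal"

# Each literal is modelled as (lexical form, datatype).
# The latitude "50.6" is a hypothetical value for illustration only.
latitude_terms = [("50.6", XSD_DOUBLE), ("50.6", XSD_DECIMAL)]
longitude_terms = [("9.0411111111111", XSD_DOUBLE), ("9.0411111111111", XSD_DECIMAL)]

# Every latitude term pairs with every longitude term, as in a join.
rows = [("Hesse", lat, lon) for lat, lon in product(latitude_terms, longitude_terms)]


def distinct(result_rows):
    """Drop exact duplicate rows while keeping order, like SELECT DISTINCT."""
    return list(dict.fromkeys(result_rows))


def cast_row(row):
    """Mimic xsd:decimal(?latitude) and xsd:decimal(?longitude)."""
    name, (lat_lexical, _), (lon_lexical, _) = row
    return (name, Decimal(lat_lexical), Decimal(lon_lexical))


rows_before_cast = distinct(rows)
rows_after_cast = distinct([cast_row(row) for row in rows])

if __name__ == "__main__":
    print(len(rows_before_cast))  # 4: the datatypes differ, so no row is a duplicate
    print(len(rows_after_cast))   # 1: after the cast all four rows are identical
```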
I want to get the names of all pizzas that have cheese toppings, but the result only shows (_:b0), which is some kind of OWL restriction. This is my query:
PREFIX pizza: <http://www.co-ode.org/ontologies/pizza/pizza.owl#>
SELECT ?X WHERE {
?X rdfs:subClassOf* [
owl:onProperty pizza:hasTopping ;
owl:someValuesFrom pizza:CheeseTopping
]
}
I am using the Pizza ontology from Stanford.
This works (without reasoning enabled):
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX pizza: <http://www.co-ode.org/ontologies/pizza/pizza.owl#>
SELECT ?X ?topping WHERE {
?X rdfs:subClassOf ?Y .
?Y owl:someValuesFrom ?topping .
?topping rdfs:subClassOf* pizza:CheeseTopping
}
ORDER BY ?X
Some are listed more than once as they could contain more than one CheeseTopping. To remove duplicates:
PREFIX pizza: <http://www.co-ode.org/ontologies/pizza/pizza.owl#>
SELECT DISTINCT ?X WHERE {
?X rdfs:subClassOf ?Y .
?Y owl:someValuesFrom ?topping .
?topping rdfs:subClassOf* pizza:CheeseTopping
}
ORDER BY ?X
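If it helps to see why the first version returns some pizzas more than once, here is a toy model of the same pattern in plain Python. The class hierarchy and the pizza names are a small hand-written fragment invented for this sketch, not data loaded from the ontology.

```python
# child class -> parent class (hand-written fragment, for illustration only)
SUBCLASS_OF = {
    "MozzarellaTopping": "CheeseTopping",
    "ParmesanTopping": "CheeseTopping",
    "CheeseTopping": "PizzaTopping",
    "TomatoTopping": "VegetableTopping",
    "VegetableTopping": "PizzaTopping",
}

# One (pizza, topping) pair per "hasTopping some <topping>" restriction.
# The pizza names here are hypothetical.
RESTRICTIONS = [
    ("TwoCheesePizza", "MozzarellaTopping"),
    ("TwoCheesePizza", "ParmesanTopping"),
    ("OneCheesePizza", "MozzarellaTopping"),
    ("OneCheesePizza", "TomatoTopping"),
    ("TomatoOnlyPizza", "TomatoTopping"),
]


def is_subclass_or_self(cls, ancestor):
    """Walk up the hierarchy; matches zero or more steps, like rdfs:subClassOf*."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


# Same shape as the first query: one row per matching (pizza, topping) pair.
rows = [(pizza, topping) for pizza, topping in RESTRICTIONS
        if is_subclass_or_self(topping, "CheeseTopping")]

# Same shape as the DISTINCT query: each pizza once, sorted like ORDER BY ?X.
distinct_pizzas = sorted({pizza for pizza, _ in rows})

if __name__ == "__main__":
    print(rows)
    print(distinct_pizzas)
```

`TwoCheesePizza` shows up twice in `rows` because two of its toppings are cheese toppings, and only once in `distinct_pizzas`.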
This works if you enable a reasoner:
PREFIX pizza: <http://www.co-ode.org/ontologies/pizza/pizza.owl#>
SELECT DISTINCT ?X WHERE {
?X rdfs:subClassOf pizza:CheeseyPizza
}
Ref:
Used the pizza ontology from here: http://protege.stanford.edu/ontologies/pizza/pizza.owl
This query works, but it is really complex and might be incomplete, because some pizzas use complex OWL constructs:
PREFIX pizza: <http://www.co-ode.org/ontologies/pizza/pizza.owl#>
SELECT DISTINCT ?pizza WHERE {
{
?pizza rdfs:subClassOf* pizza:Pizza .
?pizza owl:equivalentClass|rdfs:subClassOf [
rdf:type owl:Restriction ;
owl:onProperty pizza:hasTopping ;
owl:someValuesFrom/rdfs:subClassOf* pizza:CheeseTopping
]
} UNION {
?pizza owl:equivalentClass _:b0 .
_:b0 rdf:type owl:Class ;
owl:intersectionOf _:b1 .
_:b1 (rdf:rest)*/rdf:first ?otherClass.
?otherClass rdf:type owl:Restriction ;
owl:onProperty pizza:hasTopping ;
owl:someValuesFrom/rdfs:subClassOf* pizza:CheeseTopping
}
}
First of all, I didn't make a minimal example because I think that my problem can be understood without it.
Second, I didn't give you the data because I think that my problem can be solved without it. However, I'm happy to provide it if you ask.
This is my query:
select distinct (?x as ?likedItem) (?item as ?suggestedItem) ?similarity ?becauseOf ((?similarity * ?importance * ?levelImportance) as ?finalSimilarity)
{
values ?user {bo:ania}
#the variable ?x is bound to the items the user :ania has liked.
?user rs:hasRated ?ratings.
?ratings a rs:Likes.
?ratings rs:aboutItem ?x.
?ratings rs:ratesBy ?ratingValue.
#level 0 class similarities
{
#extract all the items that are from the same class (type) as the liked items.
#I assumed the being from the same class accounts for 50% of the similarities.
#This value can be changed according to the test or the application domain.
values ?classImportance {0.5} #class level
bind (?classImportance as ?importance)
bind( 4/7 as ?levelImportance)
?x a ?class.
?class rdfs:subClassOf ?mainClass .
?mainClass rdfs:subClassOf rs:RecommendableClass .
?mainClass rs:hasSimilarityConfiguration ?similarityConfiguration .
?similarityConfiguration rs:hasClassSimilarity ?classSimilarity .
?classSimilarity rs:appliedOnClass ?class .
?classSimilarity rs:hasClassSimilarityValue ?similarity .
?item a ?class.
bind (concat("it shares the same class, which is ", strafter(str(?class), "#"), ", with ", strafter(str(?x), "#")) as ?becauseOf)
}
union
#level 0 instance similarities
{
#extract the items that share the same value for important predicates with the already liked items..
#I assumed that having the same instance for important predicates account for 100% of the similarities.
#This value can be changed according to the test or the application domain.
values ?instanceImportance {1} #instance level
bind (?instanceImportance as ?importance)
bind( 4/7 as ?levelImportance)
?x a ?class.
?class rdfs:subClassOf ?mainClass .
?mainClass rdfs:subClassOf rs:RecommendableClass .
?mainClass rs:hasSimilarityConfiguration ?similarityConfiguration .
?similarityConfiguration rs:hasPropertySimilarity ?propertySimilarity .
?propertySimilarity rs:appliedOnProperty ?property .
?propertySimilarity rs:hasPropertySimilarityValue ?similarity .
?x ?property ?value .
?item ?property ?value .
bind (concat("it shares ", strafter(str(?value), "#"), " for predicate ", strafter(str(?property), "#"), " with ", strafter(str(?x), "#")) as ?becauseOf)
}
filter (?x != ?item)
}
This is the result:
As you can see, the result contains many values for the same suggestedItem. I want to group by suggestedItem and sum the values of finalSimilarity.
I tried this:
select ?item (SUM(?similarity * ?importance * ?levelImportance ) as ?finalSimilarity) (group_concat(distinct ?x) as ?likedItem) (group_concat(?becauseOf ; separator = " ,and ") as ?reason) where
{
values ?user {bo:ania}
#the variable ?x is bound to the items the user :ania has liked.
?user rs:hasRated ?ratings.
?ratings a rs:Likes.
?ratings rs:aboutItem ?x.
?ratings rs:ratesBy ?ratingValue.
#level 0 class similarities
{
#extract all the items that are from the same class (type) as the liked items.
#I assumed the being from the same class accounts for 50% of the similarities.
#This value can be changed according to the test or the application domain.
values ?classImportance {0.5} #class level
bind (?classImportance as ?importance)
bind( 4/7 as ?levelImportance)
?x a ?class.
?class rdfs:subClassOf ?mainClass .
?mainClass rdfs:subClassOf rs:RecommendableClass .
?mainClass rs:hasSimilarityConfiguration ?similarityConfiguration .
?similarityConfiguration rs:hasClassSimilarity ?classSimilarity .
?classSimilarity rs:appliedOnClass ?class .
?classSimilarity rs:hasClassSimilarityValue ?similarity .
?item a ?class.
bind (concat("it shares the same class, which is ", strafter(str(?class), "#"), ", with ", strafter(str(?x), "#")) as ?becauseOf)
}
union
#level 0 instance similarities
{
#extract the items that share the same value for important predicates with the already liked items..
#I assumed that having the same instance for important predicates account for 100% of the similarities.
#This value can be changed according to the test or the application domain.
values ?instanceImportance {1} #instance level
bind (?instanceImportance as ?importance)
bind( 4/7 as ?levelImportance)
?x a ?class.
?class rdfs:subClassOf ?mainClass .
?mainClass rdfs:subClassOf rs:RecommendableClass .
?mainClass rs:hasSimilarityConfiguration ?similarityConfiguration .
?similarityConfiguration rs:hasPropertySimilarity ?propertySimilarity .
?propertySimilarity rs:appliedOnProperty ?property .
?propertySimilarity rs:hasPropertySimilarityValue ?similarity .
?x ?property ?value .
?item ?property ?value .
bind (concat("it shares ", strafter(str(?value), "#"), " for predicate ", strafter(str(?property), "#"), " with ", strafter(str(?x), "#")) as ?becauseOf)
}
filter (?x != ?item)
}
group by ?item
order by desc(?finalSimilarity)
but the result is:
Something is wrong with my approach: the finalSimilarity here is 1.7, but if you sum the values from the first query manually, you get 0.62. So I did something wrong.
Could you help me find it?
Please note that the two queries are the same; only the select statements are different.
Hint
I am already able to solve it using two selects like this:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rs: <http://www.SemanticRecommender.com/rs#>
PREFIX bo: <http://www.BookOntology.com/bo#>
PREFIX :<http://www.SemanticBookOntology.com/sbo#>
select ?suggestedItem ( SUM (?finalSimilarity) as ?summedFinalSimilarity) (group_concat(distinct strafter(str(?likedItem), "#")) as ?becauseYouHaveLikedThisItem) (group_concat(?becauseOf ; separator = " ,and ") as ?reason)
where {
select distinct (?x as ?likedItem) (?item as ?suggestedItem) ?similarity ?becauseOf ((?similarity * ?importance * ?levelImportance) as ?finalSimilarity)
where
{
values ?user {bo:ania}
#the variable ?x is bound to the items the user :ania has liked.
?user rs:hasRated ?ratings.
?ratings a rs:Likes.
?ratings rs:aboutItem ?x.
?ratings rs:ratesBy ?ratingValue.
#level 0 class similarities
{
#extract all the items that are from the same class (type) as the liked items.
#I assumed the being from the same class accounts for 50% of the similarities.
#This value can be changed according to the test or the application domain.
values ?classImportance {0.5} #class level
bind (?classImportance as ?importance)
bind( 4/7 as ?levelImportance)
?x a ?class.
?class rdfs:subClassOf ?mainClass .
?mainClass rdfs:subClassOf rs:RecommendableClass .
?mainClass rs:hasSimilarityConfiguration ?similarityConfiguration .
?similarityConfiguration rs:hasClassSimilarity ?classSimilarity .
?classSimilarity rs:appliedOnClass ?class .
?classSimilarity rs:hasClassSimilarityValue ?similarity .
?item a ?class.
bind (concat("it shares the same class, which is ", strafter(str(?class), "#"), ", with ", strafter(str(?x), "#")) as ?becauseOf)
}
union
#level 0 instance similarities
{
#extract the items that share the same value for important predicates with the already liked items..
#I assumed that having the same instance for important predicates account for 100% of the similarities.
#This value can be changed according to the test or the application domain.
values ?instanceImportance {1} #instance level
bind (?instanceImportance as ?importance)
bind( 4/7 as ?levelImportance)
?x a ?class.
?class rdfs:subClassOf ?mainClass .
?mainClass rdfs:subClassOf rs:RecommendableClass .
?mainClass rs:hasSimilarityConfiguration ?similarityConfiguration .
?similarityConfiguration rs:hasPropertySimilarity ?propertySimilarity .
?propertySimilarity rs:appliedOnProperty ?property .
?propertySimilarity rs:hasPropertySimilarityValue ?similarity .
?x ?property ?value .
?item ?property ?value .
bind (concat("it shares ", strafter(str(?value), "#"), " for predicate ", strafter(str(?property), "#"), " with ", strafter(str(?x), "#")) as ?becauseOf)
}
filter (?x != ?item)
}
}
group by ?suggestedItem
order by desc(?summedFinalSimilarity)
but to me that is a clumsy solution, and there must be a smarter one where I can get the aggregated data using a single select.
Without seeing your data, it's impossible to say, and with a query this big, it's probably not worth trying to debug the exact problem, but it's easy for this to happen if you can have duplicates (which would be easy to get, especially if you're using unions where some condition could match both parts). For instance, suppose you have data like this:
@prefix : <urn:ex:> .
:x :similar [ :sim 0.10 ; :mult 2 ] ,
[ :sim 0.12 ; :mult 1 ] ,
[ :sim 0.12 ; :mult 1 ] , # yup, a duplicate
[ :sim 0.15 ; :mult 4 ] .
Then if you run this query, you'll get four result rows:
prefix : <urn:ex:>
select ?sim ((?sim * ?mult) as ?final) {
:x :similar [ :sim ?sim ; :mult ?mult ] .
}
----------------
| sim | final |
================
| 0.15 | 0.60 |
| 0.12 | 0.12 |
| 0.12 | 0.12 |
| 0.10 | 0.20 |
----------------
However, if you select distinct, you'll only see three:
select distinct ?sim ((?sim * ?mult) as ?final) {
:x :similar [ :sim ?sim ; :mult ?mult ] .
}
----------------
| sim | final |
================
| 0.15 | 0.60 |
| 0.12 | 0.12 |
| 0.10 | 0.20 |
----------------
Once you start to group by and sum, those non-distinct values will both get included:
select (sum(?sim * ?mult) as ?final) {
:x :similar [ :sim ?sim ; :mult ?mult ] .
}
---------
| final |
=========
| 1.04 |
---------
That sum is the sum of all four terms, not the three distinct ones. Even if the data doesn't have the duplicate values, the union can introduce the duplicate results:
@prefix : <urn:ex:> .
:x :similar [ :sim 0.10 ; :mult 2 ] ,
[ :sim 0.12 ; :mult 1 ] ,
[ :sim 0.15 ; :mult 4 ] .
prefix : <urn:ex:>
select (sum(?sim * ?mult) as ?final) {
{ :x :similar [ :sim ?sim ; :mult ?mult ] }
union
{ :x :similar [ :sim ?sim ; :mult ?mult ] }
}
---------
| final |
=========
| 1.84 |
---------
Since you found the need to use group_concat(distinct …), I wouldn't be surprised if there are duplicates of that nature.
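If you want to check the arithmetic outside of a SPARQL engine, the three sums above can be reproduced with a few lines of Python. This is only a sketch of the counting argument; it uses Decimal so the totals print exactly, and the variable names are made up for the illustration.

```python
from decimal import Decimal


def total(rows):
    """Sum ?sim * ?mult over all rows, like sum(?sim * ?mult)."""
    return sum(sim * mult for sim, mult in rows)


# The four (sim, mult) rows from the first data set, duplicate included.
rows_with_duplicate = [
    (Decimal("0.10"), 2),
    (Decimal("0.12"), 1),
    (Decimal("0.12"), 1),  # the duplicate
    (Decimal("0.15"), 4),
]

sum_over_all_rows = total(rows_with_duplicate)            # what a plain sum sees
sum_over_distinct_rows = total(set(rows_with_duplicate))  # what summing after select distinct sees

# The second data set has no duplicate, but the union matches every row twice.
rows_without_duplicate = [
    (Decimal("0.10"), 2),
    (Decimal("0.12"), 1),
    (Decimal("0.15"), 4),
]
sum_over_union = total(rows_without_duplicate + rows_without_duplicate)

if __name__ == "__main__":
    print(sum_over_all_rows)       # 1.04
    print(sum_over_distinct_rows)  # 0.92
    print(sum_over_union)          # 1.84
```

The gap between 1.04 and 0.92 is the same kind of gap as between your one-select sum and the sum over the select distinct subquery.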
I am trying to query DBpedia to get some data about people. I don't have subjects, just the names of the people I want to query and their birth/death dates.
I am trying to write a query along these lines. I want the name, birth date, death date and thumbnail of everyone with the surname Presley. I then intend to loop through the returned results and find the best match for Elvis Presley 1935-1977, which is the data I have.
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?Name ?thumbnail ?birthDate ?deathDate WHERE {
  ?person dbo:name ?Name ;
          dbo:birthDate ?birthDate ;
          dbo:deathDate ?deathDate ;
          dbo:thumbnail ?thumbnail .
  FILTER contains(str(?Name), "Presley")
}
What is the best way to construct my sparql query?
UPDATE:
I have put together this query which seems to work to some extent but I don't entirely understand it, and I can't figure out the contains, but it does at least run and return results.
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?subject ?thumbnail ?birthdate ?deathdate WHERE {
{
?subject rdfs:label "Elvis Presley"@en ;
dbo:thumbnail ?thumbnail ;
dbo:birthDate ?birthdate ;
dbo:deathDate ?deathdate ;
a owl:Thing .
}
UNION
{
?altName rdfs:label "Elvis Presley"@en ;
dbo:thumbnail ?thumbnail ;
dbo:birthDate ?birthdate ;
dbo:deathDate ?deathdate ;
dbo:wikiPageRedirects ?s .
}
}
Some entities might not have all of that information, so it's better to use optional. You can use foaf:surname to check for surname directly.
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbo: <http://dbpedia.org/ontology/>
select * where {
  ?s foaf:surname "Presley"@en .
  optional { ?s dbo:name ?name }
  optional { ?s dbo:birthDate ?birth }
  optional { ?s dbo:deathDate ?death }
  optional { ?s dbo:thumbnail ?thumb }
}
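For the "loop through the results and find the best match" step, a minimal sketch in Python could look like the following. It assumes the rows have already been fetched and turned into dictionaries keyed by the variable names of the query above; the function names, the scoring rule and the sample rows are made up for illustration. Because of the optional clauses, any field may be missing.

```python
def year_of(date_string):
    """Return the leading four-digit year of a date string, or None if unavailable."""
    if not date_string:
        return None
    try:
        return int(date_string[:4])
    except ValueError:
        return None


def best_match(rows, name, birth_year, death_year):
    """Pick the row that agrees with the most known facts (name, birth year, death year)."""
    def score(row):
        points = 0
        if name.lower() in (row.get("name") or "").lower():
            points += 1
        if year_of(row.get("birth")) == birth_year:
            points += 1
        if year_of(row.get("death")) == death_year:
            points += 1
        return points

    # default=None covers an empty result set; ties go to the earliest row.
    return max(rows, key=score, default=None)


# Hand-written sample rows; the second and third are placeholders, not real data.
sample_rows = [
    {"s": "ex:SomeOtherPresley", "name": "Some Other Presley", "birth": "1950-01-01"},
    {"s": "ex:ElvisPresley", "name": "Elvis Presley",
     "birth": "1935-01-08", "death": "1977-08-16"},
    {"s": "ex:NoDatesPresley", "name": "No Dates Presley"},
]

if __name__ == "__main__":
    print(best_match(sample_rows, "Elvis Presley", 1935, 1977))
```

Scoring by the number of agreeing fields is just one simple choice; you may want to weight the dates more heavily than the name.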