SPARQL - Concat values on multiple rows

I want a list of all currencies used by El Salvador, together with their subdivisions.
I use this query:
SELECT ?currency ?currencyLabel ?currencyIso4217 ?subdivisionLabel {
  ?currency wdt:P498 ?currencyIso4217 .
  ?currency wdt:P9059 ?subdivision .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
  {
    ?country wdt:P38 ?currency .
    BIND(wd:Q792 AS ?country).
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
  }
}
This works, but rows 2 and 3 of the result are the same currency; that currency simply has several names for its subdivision. I want those names concatenated so that each currency appears on a single row.
Is that possible?

Grouping in SPARQL works similarly to SQL.
The GROUP BY clause combines results that have identical values for the grouping variables into groups. An aggregate function is then applied to the remaining (non-identical) values of each group.
Typical aggregate functions are COUNT, SUM, MIN, MAX, AVG, GROUP_CONCAT, and SAMPLE.
For your case, GROUP_CONCAT is the one of interest. It concatenates the string values of a group. With the optional separator argument you can set the string placed between the values (the default is a single space). The order of the concatenated strings is not guaranteed.
The syntax of GROUP BY and GROUP_CONCAT was already given to you in the comment by UninformedUser, but I repeat it here in a slightly adapted form:
SELECT ?currency ?currencyLabel ?currencyIso4217 (GROUP_CONCAT(?subdivisionLabel; separator = ", ") AS ?subdivisionLabels) {
  BIND(wd:Q792 AS ?country).
  ?country wdt:P38 ?currency .
  ?currency wdt:P498 ?currencyIso4217 .
  ?currency wdt:P9059 ?subdivision .
  ?subdivision rdfs:label ?subdivisionLabel .
  FILTER(lang(?subdivisionLabel) = 'en')
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
} GROUP BY ?currency ?currencyLabel ?currencyIso4217
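To see the grouping in isolation, here is a minimal sketch that runs on inline data instead of Wikidata. The VALUES rows and the variable names ?code, ?name and ?names are made up for illustration. DISTINCT is added so that a repeated name is concatenated only once.

```sparql
# Hypothetical inline data: one row per (currency code, subdivision name).
SELECT ?code (GROUP_CONCAT(DISTINCT ?name; separator = ", ") AS ?names)
WHERE {
  VALUES (?code ?name) {
    ("USD" "cent")
    ("USD" "penny")
    ("USD" "cent")      # duplicate row, collapsed by DISTINCT
    ("SVC" "centavo")
  }
}
GROUP BY ?code
```

This should return one row per ?code. The order of the names inside ?names is still not guaranteed.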

Related

How to get only the first value from an optional property?

As with the SQL aggregates MAX, MIN or FIRST, I want only one value per item, without duplicated rows.
Real Wikidata case
Here the OPTIONAL clause expands the result from 253 to 257 rows:
# Countries and its codes
SELECT ?code ?item ?itemLabel ?osmId
WHERE
{
  ?item wdt:P297 ?code.
  OPTIONAL { ?item wdt:P402 ?osmId . }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?code
I need only one (any) osmId. How can I do something like FIRST{OPTIONAL{?item wdt:P402 ?osmId .}}?
NOTES:
This is not a duplicate of How to get only the most recent value from a Wikidata property?
This is not a duplicate of Why does this Wikidata SPARQL query only work for the first element in a list?
... neither exactly covers the simple "any first value" need.

Here is a community wiki answer (please feel free to edit and improve it).
# Countries and its codes
SELECT ?code ?item ?itemLabel
       (MAX(?osmId) AS ?osmId_max) (COUNT(?code) AS ?osmId_n)
WHERE
{
  ?item wdt:P297 ?code.
  OPTIONAL { ?item wdt:P402 ?osmId . }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?code ?item ?itemLabel
ORDER BY ?code
COUNT(?code) is there only to flag the rows where osmId was not a unique ID.
Is there another simple way to keep only the first option?
Using SAMPLE
As @ValerioCocchi suggested, we can use SAMPLE instead of MAX:
SELECT ?code ?item ?itemLabel (SAMPLE(?osmId) AS ?osmId_sample)
WHERE
{
  ?item wdt:P297 ?code.
  OPTIONAL { ?item wdt:P402 ?osmId . }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?code ?item ?itemLabel
ORDER BY ?code
SAMPLE may use slightly less CPU time, but the main reason to use it is that you don't care which value is returned. On Wikidata this fits properties whose value is meant to be unique but has a few erroneous duplicates that you can ignore.
NOTE about osmId: the advantage of MAX in this particular query, where the numeric ID follows a temporal sequence, is that it can return a "fresher" ID. But in OpenStreetMap (OSM) the strategy can be the inverse: the oldest ID is the most stable one. So SAMPLE also makes sense when you don't know which strategy is better.
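One caveat when choosing between MIN and MAX: as far as I know, Wikidata stores external identifiers such as P402 as plain string literals, so MAX compares them as text rather than as numbers. A minimal sketch on made-up inline data (the ?item and ?id values are hypothetical) that shows both comparisons side by side by casting to xsd:integer:

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

# Hypothetical inline data: item "A" has two IDs, item "B" has one.
SELECT ?item
       (MAX(?id) AS ?max_as_text)                 # string comparison
       (MAX(xsd:integer(?id)) AS ?max_as_number)  # numeric comparison
       (COUNT(?id) AS ?n)                         # how many IDs were grouped
WHERE {
  VALUES (?item ?id) {
    ("A" "9")
    ("A" "10")
    ("B" "7")
  }
}
GROUP BY ?item
```

For item A the text maximum should be "9" while the numeric maximum should be 10, so the cast matters if "freshest" or "oldest" is what you are after.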
Using FILTER
@StanislavKralin's suggestion:
SELECT ?code ?item ?itemLabel ?osmId
WHERE
{
  ?item wdt:P297 ?code.
  OPTIONAL {
    ?item wdt:P402 ?osmId .
    FILTER NOT EXISTS {
      ?item wdt:P402 ?osmId, ?osmId_ .
      FILTER (?osmId_ > ?osmId)
    }
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?code
This seems more verbose.

Filtering URL value

I am trying to select an entity that meets a certain condition. If the condition is a URL, how can I write a query that returns only the result where ?v1 = "https://www.getmagicbullet.com/"?
The commented-out lines are the different approaches I tried, all of which failed. How should I adjust the query to get the right answer?
Thank you for your help.
SELECT DISTINCT ?iLabel ?p ?v1
WHERE {
  ?i wdt:P31 wd:Q212920.
  ?i wdt:P856 ?v1.
  # FILTER (?v1Label = "https://www.getmagicbullet.com/")
  # { ?v1 rdfs:label "https://www.getmagicbullet.com/"@en }
  # UNION { ?v1 skos:altLabel "https://www.getmagicbullet.com/"@en }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "ko,en,[AUTO_LANGUAGE]". }
}
LIMIT 1000
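One possible approach, offered as a sketch: as far as I know, P856 (official website) values are IRIs in the Wikidata RDF, not strings and not entities. They therefore carry no rdfs:label or skos:altLabel, and ?v1Label never holds the URL. Comparing the string form of the IRI is one option. The query below keeps the original variables, adds ?i to the output, and drops ?p, which was never bound.

```sparql
SELECT DISTINCT ?i ?iLabel ?v1
WHERE {
  ?i wdt:P31 wd:Q212920 ;
     wdt:P856 ?v1 .
  # STR() turns the IRI into a plain string so it can be compared with a literal.
  FILTER(STR(?v1) = "https://www.getmagicbullet.com/")
  SERVICE wikibase:label { bd:serviceParam wikibase:language "ko,en,[AUTO_LANGUAGE]". }
}
LIMIT 1000
```

Writing the IRI directly in the triple, ?i wdt:P856 <https://www.getmagicbullet.com/> ., should be equivalent and is usually faster. Either form only matches if the stored URL is character-for-character identical, including the scheme and the trailing slash.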

SPARQL query for finding films originating from and released in the United States

I have the following SPARQL query that appears to correctly produce the films produced in the US (country of origin) and released in the US (place of publication) in 2018. The issue I'm having is that one row is produced for each release even though the other releases are outside of the US. I've added a limit to reduce the size of the response.
Here is the query:
SELECT ?item ?name ?publication_date ?placeLabel WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
  ?item rdfs:label ?name;
        wdt:P31 wd:Q11424;
        wdt:P495 wd:Q30; # -> country of origin US
        wdt:P577 ?publication_date.
  ?item p:P577 ?publication_statement.
  ?publication_statement pq:P291 ?place.
  FILTER(xsd:date(?publication_date) > "2018-01-01"^^xsd:date)
  FILTER(
    (LANG(?name)) = "en"
    && ?place=wd:Q30) # -> place of publication
}
ORDER BY ?name
LIMIT 10
I would like to change it so that it produces one row per movie IF it had a release in the US in 2018.
Thanks for your help. Comments on the use of FILTER or other non-idiomatic SPARQL are also welcome.

You can use GROUP BY:
SELECT ?item (SAMPLE(?name) AS ?Name) (SAMPLE(?publication_date) AS ?Date) WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
  ?item rdfs:label ?name;
        wdt:P31 wd:Q11424;
        wdt:P495 wd:Q30; # -> country of origin US
        wdt:P577 ?publication_date.
  ?item p:P577 ?publication_statement.
  ?publication_statement pq:P291 ?place.
  FILTER(xsd:date(?publication_date) > "2018-01-01"^^xsd:date)
  FILTER(
    (LANG(?name)) = "en"
    && ?place=wd:Q30) # -> place of publication
}
GROUP BY ?item
ORDER BY ?Name
LIMIT 10
Note the change to the SELECT line: once you group, every selected variable must either be a GROUP BY key or be wrapped in an aggregate such as SAMPLE, because the values of the non-key variables are otherwise indeterminate.
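A possible refinement, offered as a sketch rather than a tested fix: in the queries above, ?publication_date comes from wdt:P577, which is not tied to the statement that carries the P291 qualifier, so any release date of the film can be paired with the US release. Reading the date from the same statement node via ps:P577 keeps date and place together. YEAR(?date) = 2018 is my reading of "released in 2018", and ?statement, ?date and ?us_release are names chosen for illustration.

```sparql
SELECT ?item ?itemLabel (MIN(?date) AS ?us_release) WHERE {
  ?item wdt:P31 wd:Q11424 ;      # instance of: film
        wdt:P495 wd:Q30 ;        # country of origin: US
        p:P577 ?statement .      # one publication-date statement
  ?statement ps:P577 ?date ;     # the date of that same statement
             pq:P291 wd:Q30 .    # place of publication: US
  FILTER(YEAR(?date) = 2018)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
GROUP BY ?item ?itemLabel
ORDER BY ?itemLabel
LIMIT 10
```

Fixing the place in the triple pattern replaces the ?place = wd:Q30 filter, and letting the label service supply ?itemLabel removes the need for the LANG(?name) filter. MIN picks the earliest 2018 US release if a film has more than one.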

How to query Wikidata for "also known as"

I would like to know how to query Wikidata by using the alias ("also known as").
Right now I am trying
SELECT ?item
WHERE
{
  ?item rdfs:aliases ?alias.
  FILTER(CONTAINS(?alias, "Angela Kasner"@en))
}
LIMIT 5
This is the same query that works when I use rdfs:label in place of rdfs:aliases.
I am trying this because Help:Aliases says that aliases are searchable in the same way as labels, but I can't find any other resource on that, nor can I find an example.

This query might help someone who wants the "also known as" values of properties:
SELECT ?property ?propertyLabel ?propertyDescription (GROUP_CONCAT(DISTINCT(?altLabel); separator = ", ") AS ?altLabel_list) WHERE {
  ?property a wikibase:Property .
  OPTIONAL { ?property skos:altLabel ?altLabel . FILTER (lang(?altLabel) = "en") }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" .}
}
GROUP BY ?property ?propertyLabel ?propertyDescription
LIMIT 5000
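For items rather than properties, the same skos:altLabel predicate applies; as far as I know, rdfs:aliases does not exist in the Wikidata RDF model. A minimal sketch, assuming the alias is stored exactly as "Angela Kasner" with the English language tag:

```sparql
SELECT ?item ?itemLabel
WHERE
{
  # Exact match on the alias, including its language tag.
  ?item skos:altLabel "Angela Kasner"@en .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 5
```

An exact match like this avoids scanning every alias, which a CONTAINS filter over an unconstrained ?alias would have to do and which is likely to time out on the public endpoint.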

Recover the "original" order

I am trying to recover the cast list for movies from wikidata.
My SPARQL query for Dr. No is as follows:
SELECT ?actor ?actorLabel WHERE {
?movie wdt:P161 ?actor .
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
FILTER(?movie = wd:Q102754)
}
LIMIT 1000
I can try it out at query.wikidata.org but the results are not in the order that I want. It gives 'Sean Connery', 'Zena Marshall', 'Ursula Andress'.
The database has the data in the required order: https://www.wikidata.org/wiki/Q102754 lists the cast in order (Sean Connery, Ursula Andress, Joseph Wiseman). The cast list is generally given in billing order, and that is the order I want to recover.

SPARQL provides ordering of results through ORDER BY.
The ordering in your example is based on the number of references of a statement. Here is a non-optimized version that does what you want:
SELECT ?actor ?actorLabel WHERE {
  ?movie p:P161 ?statement .
  ?statement ps:P161 ?actor .
  OPTIONAL { ?statement prov:wasDerivedFrom ?ref . }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
  FILTER(?movie = wd:Q102754)
}
GROUP BY ?movie ?actor ?actorLabel
ORDER BY DESC(COUNT(?ref)) ASC(?actorLabel)
LIMIT 1000
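A slightly tidier variant of the same idea, as a sketch: the movie is written directly as the subject instead of being filtered afterwards, and the reference count is exposed as ?refCount (a name chosen here) so the sort key is visible in the results. Whether the reference count really reproduces billing order is the assumption of the answer above, not something this query can confirm.

```sparql
SELECT ?actor ?actorLabel (COUNT(?ref) AS ?refCount) WHERE {
  wd:Q102754 p:P161 ?statement .                       # cast member statements of Dr. No
  ?statement ps:P161 ?actor .
  OPTIONAL { ?statement prov:wasDerivedFrom ?ref . }   # references, if any
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?actor ?actorLabel
ORDER BY DESC(?refCount) ASC(?actorLabel)
LIMIT 1000
```

Actors whose statements have no reference get a count of 0 and are sorted alphabetically at the end.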
