SPARQL: use subquery as result/variable for outer query

I want to use the result of a nested query as a variable/subgraph for the outer query.
My use case: I have a one-to-many relation between products and offers, and I want to get all (or selected) products together with the offer count, minimum price and condition from the offer records.
Here is my query:
SELECT ?ProductID ?priceMIN ?total ?condition
WHERE {
{
SELECT ?ProductID (COUNT(?cond) AS ?total) (MIN(?pr) AS ?priceMIN ) ?condition WHERE{
?ProductID ^mod:isOfferOf ?oid.
?oid dprop:priceMIN ?pr.
?oid dprop:total ?cond.
BIND (if( ?cond = 1, "New", "Used") AS ?condition).
}
}
VALUES ?ProductID { prod:Rl5RVl5R prod:Rl5RVl5Q prod:Rl5RVl5W prod:Rl5RVl5Y prod:Rl5RVl5U }
}
and I am getting this data:
ProductID |priceMIN |condition |total
product:Rl5RVl5R | 3267 | Used | 1
product:Rl5RVl5R | 3216 | New | 4
product:Rl5RVl5Y | 327 | New | 1
product:Rl5RVl5Q | 323 | New | 1
product:Rl5RVl5Q | 3268 | Used | 1
product:Rl5RVl5W | 326 | New | 1
product:Rl5RVl5W | 3271 | Used | 4
product:Rl5RVl5U | 325 | New | 2
product:Rl5RVl5U | 3270 | Used | 1
Now I want to assign these values like this:
product:Rl5RVl5U dprop:newTotal ?total
product:Rl5RVl5U dprop:newMin ?priceMIN
product:Rl5RVl5U dprop:usedTotal ?total
product:Rl5RVl5U dprop:usedMin ?priceMIN
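To make the target shape concrete, here is a minimal sketch of the reshaping in Python. It is not a SPARQL solution; it only shows how the two rows per product (one "New", one "Used") would collapse into the four properties listed above. The function name pivot_offers is hypothetical, and the rows are copied from the result table in the question.

```python
# Rows copied from the result table above: (product, priceMIN, condition, total).
rows = [
    ("product:Rl5RVl5R", 3267, "Used", 1),
    ("product:Rl5RVl5R", 3216, "New", 4),
    ("product:Rl5RVl5Y", 327, "New", 1),
    ("product:Rl5RVl5Q", 323, "New", 1),
    ("product:Rl5RVl5Q", 3268, "Used", 1),
    ("product:Rl5RVl5W", 326, "New", 1),
    ("product:Rl5RVl5W", 3271, "Used", 4),
    ("product:Rl5RVl5U", 325, "New", 2),
    ("product:Rl5RVl5U", 3270, "Used", 1),
]


def pivot_offers(rows):
    """Collapse one row per (product, condition) into one record per product."""
    pivoted = {}
    for product, price_min, condition, total in rows:
        # "New" rows feed newTotal/newMin, everything else feeds usedTotal/usedMin.
        prefix = "new" if condition == "New" else "used"
        record = pivoted.setdefault(product, {})
        record[f"dprop:{prefix}Total"] = total
        record[f"dprop:{prefix}Min"] = price_min
    return pivoted


pivoted = pivot_offers(rows)
for product, props in pivoted.items():
    print(product, props)
```

A product that only has "New" offers (Rl5RVl5Y in the sample) simply ends up without the two used properties.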

Related

Create a nested json with column values as key-value pairs

I am trying to build a JSON from the following tables
table : car_makers
+------+-------------+---------+
| cmid | companyname | country |
+------+-------------+---------+
| 1 | Toyota | Japan |
| 2 | Volkswagen | Germany |
| 3 | Nissan | Japan |
+------+-------------+---------+
Table : cars
+------+---------+-----------+
| cmid | carname | cartype |
+------+---------+-----------+
| 1 | Camry | Sedan |
| 1 | Corolla | Sedan |
| 2 | Golf | Hatchback |
| 2 | Tiguan | SUV |
| 3 | Qashqai | SUV |
+------+---------+-----------+
I am trying to create a nested JSON of this structure :
{
"companyName": "Volkswagen",
"carType": "Germany",
"cars": {
"Tiguan": "SUV",
"Golf": "Hatchback"
}
}
but the best I could do with this query
select json_build_object('companyName',companyName, 'carType', country, 'cars', JSON_AGG(json_build_object('carName', carName, 'carType', carType) ))
from car_makers cm
join cars c on c.cmid = cm.cmid
group by companyName,country
is this -
{
"companyName": "Volkswagen",
"carType": "Germany",
"cars": [
{
"carName": "Tiguan",
"carType": "SUV"
},
{
"carName": "Golf",
"carType": "Hatchback"
}
]
}
So, how can I correct my current query to replace the nested JSON array with a JSON object of key-value pairs built from the column values?
here is the fiddle with sample data and the query I have tried
You can use json_object_agg:
select json_build_object('companyName', c.companyName,
'country', c.country, 'cars', json_object_agg(c1.carName, c1.carType))
from car_makers c join cars c1 on c.cmid = c1.cmid
group by c.companyName, c.country
See fiddle.
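For readers who want to see what json_object_agg does to the grouped rows, here is a rough Python sketch of the same aggregation over the sample tables. It is only an illustration of the resulting shape, not part of the PostgreSQL answer; build_nested is a hypothetical helper name.

```python
import json

# Sample rows from the two tables in the question.
car_makers = [(1, "Toyota", "Japan"), (2, "Volkswagen", "Germany"), (3, "Nissan", "Japan")]
cars = [
    (1, "Camry", "Sedan"),
    (1, "Corolla", "Sedan"),
    (2, "Golf", "Hatchback"),
    (2, "Tiguan", "SUV"),
    (3, "Qashqai", "SUV"),
]


def build_nested(car_makers, cars):
    # Per maker, collect carname -> cartype pairs into one object.
    # This mirrors json_object_agg(carName, carType) under GROUP BY.
    cars_by_maker = {}
    for cmid, carname, cartype in cars:
        cars_by_maker.setdefault(cmid, {})[carname] = cartype
    # Like the inner join, makers without any car are skipped.
    return [
        {"companyName": name, "country": country, "cars": cars_by_maker[cmid]}
        for cmid, name, country in car_makers
        if cmid in cars_by_maker
    ]


nested = build_nested(car_makers, cars)
print(json.dumps(nested[1], indent=2))
```

The key point is that each car name becomes a key of a single object, instead of each car becoming its own object inside an array.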

Athena: compare columns consisting array<strings> in two different tables

I have 2 external tables (parquet files in S3) in Athena, each of them has a column which is array of strings. One of the tables is a subset and I need to compare these array values with the other table having the superset array. I believe the problem would be clearer with the below illustration. Both tables do not have any duplicate records.
Table 1 (Sample Subset table)
+----+-----------+---------------------------+
| no | prod_name | article_list              |
+----+-----------+---------------------------+
| 1  | sofa      | ['ABC','PQR']             |
| 2  | cupboard  | ['LMN','DEF','XYZ']       |
| 3  | table     | ['DEF']                   |
| 4  | chair     | ['DEF','PQR','ABC']       |
| 5  | dresser   | ['LMN','IJK','WXY','STU'] |
+----+-----------+---------------------------+
Table 2 (Sample Superset table)
+---+---------+--------------+---------------------------------------------------+
|no | wh_code | restock_date | article_list |
+---+---------+--------------+---------------------------------------------------+
| 1 |WH0001 | 2020-01-12 | ['ABC','BCE','CDE','DEF','JKL','PQR','QRS','STU'] |
| 2 |WH0001 | 2020-04-15 | ['ABC','CDE','DEF','IJK','LMN','PQR','STU','XYZ'] |
| 3 |WH0002 | 2021-03-17 | ['BCE','DEF','IJK','LMN','PQR','RST','STU','WXY'] |
| 4 |WH0003 | 2021-08-20 | ['ABC','IJK','LMN','NOP','PQR','RST','STU','WXY'] |
| 5 |WH0003 | 2022-03-26 | ['DEF','IJK','LMN','NOP','PQR','RST','STU','XYZ'] |
+---+---------+--------------+---------------------------------------------------+
Required result
+------------------------+---------+--------------+
| article_list (table 1) | wh_code | restock_date |
+------------------------+---------+--------------+
| ['ABC','PQR']          | WH0001  | 2020-01-12   |
| ['ABC','PQR']          | WH0001  | 2020-04-15   |
| ['ABC','PQR']          | WH0003  | 2021-08-20   |
| ['LMN','DEF','XYZ']    | WH0001  | 2020-04-15   |
| ['LMN','DEF','XYZ']    | WH0003  | 2021-08-20   |
| ['DEF']                | WH0001  | 2020-01-12   |
| ['DEF']                | WH0001  | 2020-04-15   |
| ['DEF']                | WH0002  | 2021-03-17   |
| ['DEF']                | WH0003  | 2022-03-26   |
| .                      | .       | .            |
| .                      | .       | .            |
| .                      | .       | .            |
+------------------------+---------+--------------+
The following query in Athena works to find a particular combination (['ABC', 'PQR']) in table 2 consisting of the superset array. It results in the first 3 rows of the required result.
SELECT ARRAY['ABC', 'PQR'] as article_list,
wh_code,
restock_date
FROM "table_2"
WHERE filter(ARRAY ['ABC', 'PQR'], x -> NOT CONTAINS(article_list, x)) = ARRAY[]
group by wh_code, restock_date
I would appreciate help writing a generic query (covering all the combinations from table 1) that produces the desired result.
Join the two tables on the required condition. You can also use array_except to simplify the query (cardinality counts the number of elements in the resulting array):
-- sample data
with table1(no, prod_name, article_list ) as (
values ( 1, 'sofa', array['ABC','PQR']),
( 2, 'cupboard', array['LMN','DEF','XYZ'] )
),
table2 (no, wh_code, restock_date, article_list) as (
values (1, 'WH0001', date '2020-01-12', array['ABC','BCE','CDE','DEF','JKL','PQR','QRS','STU']),
(2, 'WH0001', date '2020-04-15', array['ABC','CDE','DEF','IJK','LMN','PQR','STU','XYZ']),
(3, 'WH0002', date '2021-03-17', array['BCE','DEF','IJK','LMN','PQR','RST','STU','WXY']),
(4, 'WH0003', date '2021-08-20', array['ABC','IJK','LMN','NOP','PQR','RST','STU','WXY'])
)
-- query
select t1.article_list, t2.wh_code, t2.restock_date
from table1 t1
join table2 t2 on cardinality(array_except(t1.article_list, t2.article_list)) = 0;
Output:
article_list | wh_code | restock_date
[ABC, PQR] | WH0001 | 2020-01-12
[ABC, PQR] | WH0001 | 2020-04-15
[ABC, PQR] | WH0003 | 2021-08-20
[LMN, DEF, XYZ] | WH0001 | 2020-04-15
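The join condition is a subset test: removing table 2's articles from table 1's list leaves nothing exactly when every wanted article is stocked. As a rough sketch of that logic outside Athena, here is the same check in Python using set difference over the sample data from the answer. subset_join is a hypothetical name, and the set difference stands in for array_except.

```python
from datetime import date

# Same sample data as in the query above.
table1 = [
    (1, "sofa", ["ABC", "PQR"]),
    (2, "cupboard", ["LMN", "DEF", "XYZ"]),
]
table2 = [
    (1, "WH0001", date(2020, 1, 12), ["ABC", "BCE", "CDE", "DEF", "JKL", "PQR", "QRS", "STU"]),
    (2, "WH0001", date(2020, 4, 15), ["ABC", "CDE", "DEF", "IJK", "LMN", "PQR", "STU", "XYZ"]),
    (3, "WH0002", date(2021, 3, 17), ["BCE", "DEF", "IJK", "LMN", "PQR", "RST", "STU", "WXY"]),
    (4, "WH0003", date(2021, 8, 20), ["ABC", "IJK", "LMN", "NOP", "PQR", "RST", "STU", "WXY"]),
]


def subset_join(table1, table2):
    """Pair every table1 row with each table2 row whose list contains all of its articles."""
    result = []
    for _, _, wanted in table1:
        for _, wh_code, restock_date, stocked in table2:
            # An empty difference means no wanted article is missing from stock.
            if not set(wanted) - set(stocked):
                result.append((wanted, wh_code, restock_date))
    return result


matches = subset_join(table1, table2)
for wanted, wh_code, restock_date in matches:
    print(wanted, wh_code, restock_date)
```

Like the SQL join, this compares every pair of rows, so the cost grows with the product of the two table sizes.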
UPD
Try the next one. Depending on the size of the data, you may need to partition the queries:
-- query
select arbitrary(article_list), wh_code, restock_date
from (select no, article_list, article
from table1, unnest (article_list) as t(article)) as t1
join (select no, wh_code, restock_date, article
from table2, unnest (article_list) as t(article)) as t2 on t1.article = t2.article
group by t1.no, wh_code, restock_date
having count(t1.article) = cardinality(arbitrary(article_list));
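The idea behind this second query is to unnest both arrays, join on the single article, and keep only the groups where the number of matched articles equals the size of the table 1 list. Here is a small sketch of that counting technique using SQLite through Python, with the arrays already stored one article per row. The table names t1_articles and t2_articles and the sz subquery are assumptions made for the illustration, and it assumes no article is repeated within a list.

```python
import sqlite3

# Sample data from the answer: table1 keyed by no, table2 keyed by (wh_code, restock_date).
table1 = {1: ["ABC", "PQR"], 2: ["LMN", "DEF", "XYZ"]}
table2 = {
    ("WH0001", "2020-01-12"): ["ABC", "BCE", "CDE", "DEF", "JKL", "PQR", "QRS", "STU"],
    ("WH0001", "2020-04-15"): ["ABC", "CDE", "DEF", "IJK", "LMN", "PQR", "STU", "XYZ"],
    ("WH0002", "2021-03-17"): ["BCE", "DEF", "IJK", "LMN", "PQR", "RST", "STU", "WXY"],
    ("WH0003", "2021-08-20"): ["ABC", "IJK", "LMN", "NOP", "PQR", "RST", "STU", "WXY"],
}

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1_articles (no INTEGER, article TEXT);
    CREATE TABLE t2_articles (wh_code TEXT, restock_date TEXT, article TEXT);
""")
# One row per array element, which is what unnest produces in Athena.
conn.executemany(
    "INSERT INTO t1_articles VALUES (?, ?)",
    [(no, a) for no, articles in table1.items() for a in articles],
)
conn.executemany(
    "INSERT INTO t2_articles VALUES (?, ?, ?)",
    [(wh, d, a) for (wh, d), articles in table2.items() for a in articles],
)

query = """
    SELECT t1.no, t2.wh_code, t2.restock_date
    FROM t1_articles AS t1
    JOIN (SELECT no, COUNT(*) AS n FROM t1_articles GROUP BY no) AS sz ON sz.no = t1.no
    JOIN t2_articles AS t2 ON t2.article = t1.article
    GROUP BY t1.no, sz.n, t2.wh_code, t2.restock_date
    HAVING COUNT(*) = sz.n          -- every article of the table1 row was matched
    ORDER BY t1.no, t2.restock_date
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because the join is an equality on a single column, the engine can use a hash join instead of comparing every pair of arrays.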

SQL Multiple Joining

I'm trying to join two tables and, at the same time, get the value of a certain column using inner joins. Joining works up to the 3rd diagram, but when I add the 4th, rows with null values are no longer displayed. How can I display the values of the 4th column, including the null values?
Here's the SQL code:
SELECT
betl.user_id,
betl.agent_id,
ah1.parent_id,
ah2.user_id,
ah3.user_id AS parent_of_agent
FROM
wpc16_02.bets_logs betl
INNER JOIN
wpc16_02.agent_heirarchy ah1 ON betl.agent_id = ah1.user_id
INNER JOIN
wpc16_02.agent_heirarchy ah2 ON ah1.parent_id = ah2.id
INNER JOIN
wpc16_02.agent_heirarchy ah3 ON ah2.parent_id = ah3.id
WHERE
fight_id = 1930 AND agent_income = 0
Here's what I'm trying to achieve using inner joins:
Here's the result I got when joining up to the 3rd diagram:
user_id | agent_id | parent_id | user_id_of_parent
15012 | 2212 | 96 | 160
227097 | 22061 | 266 | 64
465174 | 464899 | 126 | 211
505094 | 504767 | 980 | 5358
241158 | 8281 | 18 | 67
463344 | 462715 | 751 | 3420
184396 | 29870 | 502 | 2123
486847 | 43225 | 164 | 234
482120 | 482023 | 4430 | 46469
369628 | 217212 | 8283 | 109697
When joining up to the 4th diagram:
user_id | agent_id | parent_id | user_id_of_parent | master_uid
184396 | 29870 | 502 | 2123 | 160
482120 | 482023 | 4430 | 46469 | 699
369628 | 217212 | 8283 | 109697 | 71
97287 | 93996 | 7332 | 93866 | 3114
113287 | 113228 | 2714 | 20652 | 4050
366287 | 361918 | 17603 | 235880 | 234
439935 | 236147 | 3776 | 40054 | 103
480201 | 436936 | 1041 | 5761 | 160
456400 | 456248 | 32901 | 431900 | 240
502877 | 497592 | 2571 | 20845 | 3918
Notice that the other rows have been removed, because some of the results are null once I join the 4th diagram.
You seem to want LEFT JOIN. It is a little unclear what the exact query is, because your question doesn't have information such as which columns are in which tables.
But the idea is:
SELECT . . .
FROM wpc16_02.bets_logs betl LEFT JOIN
wpc16_02.agent_heirarchy ah1
ON betl.agent_id = ah1.user_id LEFT JOIN
wpc16_02.agent_heirarchy ah2
ON ah1.parent_id = ah2.id LEFT JOIN
wpc16_02.agent_heirarchy ah3
ON ah2.parent_id = ah3.id
WHERE betl.fight_id = 1930 AND betl.agent_income = 0
This assumes that fight_id and agent_income are from the first table. If they are in one of the hierarchy tables, then the conditions should go in the appropriate ON clause.
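To see the difference on a small scale, here is a sketch that runs the same three-level self-join with INNER JOIN and with LEFT JOIN against an in-memory SQLite database. The rows are made up for the illustration (they are not the asker's data), and the tables are reduced to the columns used in the join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bets_logs (user_id INTEGER, agent_id INTEGER);
    CREATE TABLE agent_heirarchy (id INTEGER, user_id INTEGER, parent_id INTEGER);

    -- Hypothetical hierarchy: row 1 is a top-level agent with no parent.
    INSERT INTO agent_heirarchy VALUES (1, 100, NULL);
    INSERT INTO agent_heirarchy VALUES (2, 200, 1);
    INSERT INTO agent_heirarchy VALUES (3, 300, 2);
    INSERT INTO agent_heirarchy VALUES (4, 400, 1);

    -- Bet 1 goes through agent 300 (three levels), bet 2 through agent 400 (two levels).
    INSERT INTO bets_logs VALUES (1, 300);
    INSERT INTO bets_logs VALUES (2, 400);
""")

QUERY = """
    SELECT betl.user_id, betl.agent_id, ah1.parent_id, ah2.user_id, ah3.user_id
    FROM bets_logs AS betl
    {join} agent_heirarchy AS ah1 ON betl.agent_id = ah1.user_id
    {join} agent_heirarchy AS ah2 ON ah1.parent_id = ah2.id
    {join} agent_heirarchy AS ah3 ON ah2.parent_id = ah3.id
    ORDER BY betl.user_id
"""

# INNER JOIN drops the bet whose chain ends before the third level.
inner_rows = conn.execute(QUERY.format(join="INNER JOIN")).fetchall()
# LEFT JOIN keeps it and fills the missing level with NULL (None in Python).
left_rows = conn.execute(QUERY.format(join="LEFT JOIN")).fetchall()

print(inner_rows)
print(left_rows)
```

With this data the inner-join version returns only the first bet, while the left-join version returns both, with None in the last column for the second bet.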

Get all Wikidata items that have more than 10 languages?

I'm trying to get the most famous movies in the world from Wikidata with SPARQL.
I have the following query:
SELECT ?item WHERE {
?item wdt:P31 wd:Q11424.
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
which returns ALL movies (about 214143).
I basically only need movies that have, let's say, more than 10 language entries on wikipedia, as I'm guessing these will be the most famous ones.
Is there a way to do this inside the query itself, without checking all entries ?
A naive answer to your question is:
SELECT ?movie (count(?wikipage) AS ?count) WHERE {
hint:Query hint:optimizer "None" .
?movie wdt:P31 wd:Q11424 .
?wikipage schema:about ?movie .
?wikipage schema:isPartOf/wikibase:wikiGroup "wikipedia"
} GROUP BY ?movie HAVING (?count > 10) ORDER BY DESC(?count)
Alternatively, you could consider total number of sitelinks. Sitelinks include links to Wikipedia and also links to Wikiquote, Wikivoyage etc. The advantage is that total number of sitelinks is precomputed.
SELECT ?movie ?sitelinks WHERE {
?movie wdt:P31 wd:Q11424 .
?movie wikibase:sitelinks ?sitelinks .
FILTER (?sitelinks > 10)
} ORDER BY DESC(?sitelinks)
See also these questions:
Get Wikipedia URLs (sitelinks) in Wikidata SPARQL query
Wikidata results sorted by something similar to a PageRank
As @TallTed and @AKSW have pointed out, the number of labels in different languages may differ from the number of Wikipedia articles in different languages. Here is a comparison.
Top 5 movies by Wikipedia articles
| title | articles | sitelinks | labels |
|---------------------|----------|-----------|--------|
| Avatar | 92 | 103 | 99 |
| Titanic | 86 | 100 | 101 |
| The Godfather | 79 | 103 | 82 |
| Slumdog Millionaire | 72 | 75 | 80 |
| Forrest Gump | 71 | 101 | 84 |
Top 5 movies by sitelinks
| title | articles | sitelinks | labels |
|---------------|----------|-----------|--------|
| Avatar | 92 | 103 | 99 |
| The Godfather | 79 | 103 | 82 |
| Forrest Gump | 71 | 101 | 84 |
| Titanic | 86 | 100 | 101 |
| The Matrix | 67 | 94 | 77 |
Top 5 movies by labels
| title | articles | sitelinks | labels |
|------------------------------|----------|-----------|--------|
| The 25th Reich | 2 | 2 | 227 |
| Time Is But Brief | 0 | 0 | 224 |
| Michael Moore in TrumpLand | 6 | 6 | 222 |
| Magnus - The Mozart of Chess | 1 | 1 | 221 |
| Lee Chong Wei | 1 | 1 | 196 |

How to select Multiple Rows based on one Column

So I have looked around the internet, and couldn't find anything that could be related to my issue.
This is part of my DB:
ID | English | Pun | SID | Writer |
=======================================================
1 | stuff | stuff | 1 | Full |
2 | stuff | stuff | 1 | Rec. |
3 | stuff | stuff | 2 | Full |
4 | stuff | stuff | 2 | Rec. |
Now, how would I get all rows with SID equal to 1?
Like this:
ID | English | Pun | SID | Writer |
=======================================================
1 | stuff | stuff | 1 | Full |
2 | stuff | stuff | 1 | Rec. |
Or when I want to get all rows with SID being equal to 2.
ID | English | Pun | SID | Writer |
=======================================================
3 | stuff | stuff | 2 | Full |
4 | stuff | stuff | 2 | Rec. |
This is my current SQL Query using SQLite:
SELECT * FROM table_name WHERE SID = 1
And I only get the first row. How would I be able to get all of the rows?
Here is my PHP Code:
class GurDB extends SQLite3
{
function __construct()
{
$this->open('gurbani.db3');
}
}
$db = new GurDB();
$mode = $_GET["mode"];
if($mode == "2") {
$shabadnum = $_GET["shabadNo"];
$result = $db->query("SELECT * FROM table_name WHERE SID = $shabadnum");
$array = $result->fetchArray(SQLITE3_ASSOC);
print_r($array);
}
fetchArray() only gives you one row per call, so you need to loop over the result, something like this:
$rows = [];
while ($row = $result->fetchArray(SQLITE3_ASSOC))
{
$rows[] = $row;
}
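The same one-row-per-call behaviour exists in other SQLite bindings. As a sketch for comparison, here is the equivalent loop with Python's sqlite3 module on a copy of the sample table. rows_for_sid is a hypothetical helper, and the value is passed as a bound parameter rather than interpolated into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_name (ID INTEGER, English TEXT, Pun TEXT, SID INTEGER, Writer TEXT)"
)
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?, ?, ?, ?)",
    [
        (1, "stuff", "stuff", 1, "Full"),
        (2, "stuff", "stuff", 1, "Rec."),
        (3, "stuff", "stuff", 2, "Full"),
        (4, "stuff", "stuff", 2, "Rec."),
    ],
)


def rows_for_sid(conn, sid):
    """Fetch every row for one SID by calling fetchone() until it returns None."""
    cursor = conn.execute(
        "SELECT * FROM table_name WHERE SID = ? ORDER BY ID", (sid,)
    )
    rows = []
    while True:
        row = cursor.fetchone()  # one row per call, like fetchArray() in PHP
        if row is None:
            break
        rows.append(row)
    return rows


print(rows_for_sid(conn, 1))
print(rows_for_sid(conn, 2))
```

A single fetch call returns only the first matching row, which is the symptom described in the question; the loop collects the rest.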