Returning a complex SQL query with joins as an array of JSON objects - sql

Using a SQL query, how can I return an array of JSON objects that looks like this:
{
  "result": [
    {
      "RentBookRegistrationId": 1,
      "date": "15-08-2022",
      "PersonName": "Peter",
      "Books": [
        { "name": "Ulysses" },
        { "name": "Hamlet" }
      ],
      "Processes": [
        { "no": 1, "name": "Online booking" },
        { "no": 2, "name": "Reserved beforehand" },
        { "no": 4, "name": "Vending machine used" }
      ]
    }
  ]
}
From a SQL Server database that looks like this:
Table: RentBookRegistration
+----+------------+-----------------------------+
| id | date | person |
+----+------------+-----------------------------+
| 1 | 15-08-2022 | {"no": 1, "name": "Peter"} |
+----+------------+-----------------------------+
| 2 | 16-08-2022 | {"no": 2, "name": "Anna"} |
+----+------------+-----------------------------+
| 3 | 17-08-2022 | {"no": 1, "name": "Peter"} |
+----+------------+-----------------------------+
| 4 | 17-08-2022 | {"no": 2, "name": "Mark"} |
+----+------------+-----------------------------+
Table: BookData
+----+------------------------+-----------------------------------------------------------+
| id | rentBookRegistrationId | book |
+----+------------------------+-----------------------------------------------------------+
| 1 | 1 | {"name": "Ulysses", "author": "James Joyce", "year": 1918}|
+----+------------------------+-----------------------------------------------------------+
| 2 | 1 | {"name": "Hamlet", "author": "Shakespeare", "year": 1601} |
+----+------------------------+-----------------------------------------------------------+
| 3 | 2 | {"name": "Dune", "author": "Frank Herbert", "year": 1965} |
+----+------------------------+-----------------------------------------------------------+
| 4 | 3 | {"name": "Hamlet", "author": "Shakespeare", "year": 1601} |
+----+------------------------+-----------------------------------------------------------+
| 5 | 4 | {"name": "Hamlet", "author": "Shakespeare", "year": 1601} |
+----+------------------------+-----------------------------------------------------------+
Table: ProcessData
+----+------------------------+-----------------------------------------------------------+
| id | rentBookRegistrationId | processUsed |
+----+------------------------+-----------------------------------------------------------+
| 1 | 1 | {"no": 1, "name": "Online booking"} |
+----+------------------------+-----------------------------------------------------------+
| 2 | 1 | {"no": 2, "name": "Reserved beforehand"} |
+----+------------------------+-----------------------------------------------------------+
| 3  | 1                      | {"no": 4, "name": "Vending machine used"}                 |
+----+------------------------+-----------------------------------------------------------+
| 4  | 2                      | {"no": 1, "name": "Online booking"}                       |
+----+------------------------+-----------------------------------------------------------+
| 5  | 2                      | {"no": 4, "name": "Vending machine used"}                 |
+----+------------------------+-----------------------------------------------------------+
| 6  | 3                      | {"no": 2, "name": "Reserved beforehand"}                  |
+----+------------------------+-----------------------------------------------------------+
The table layout might seem a bit contrived, but it has been simplified to keep the question straightforward.
This is how far I've come:
select … from RentBookRegistration R where R.PersonName = 'Peter'

This isn't pretty. You are storing JSON data and then want to consume that JSON data and turn it into different JSON data. Honestly, you would be better off storing your data in a normalised format; then you could produce the JSON far more easily.
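For example, a normalised layout might look something like this (a sketch only; the table and column names here are illustrative, not taken from the question):
-- people, books and processes become rows instead of JSON strings
CREATE TABLE Person (no int PRIMARY KEY, name nvarchar(50));
CREATE TABLE Book (id int PRIMARY KEY, name nvarchar(50), author nvarchar(50), year int);
CREATE TABLE Process (no int PRIMARY KEY, name nvarchar(50));
CREATE TABLE RentBookRegistration (id int PRIMARY KEY, date date,
       personNo int REFERENCES Person (no));
CREATE TABLE BookData (id int PRIMARY KEY,
       rentBookRegistrationId int REFERENCES RentBookRegistration (id),
       bookId int REFERENCES Book (id));
CREATE TABLE ProcessData (id int PRIMARY KEY,
       rentBookRegistrationId int REFERENCES RentBookRegistration (id),
       processNo int REFERENCES Process (no));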
Saying that, you can do this, but you need to consume your JSON data first, with OPENJSON and then turn it back into JSON with FOR JSON:
SELECT (SELECT RBR.id AS RentBookRegistrationId,
               RBR.date,
               p.name,
               (SELECT b.name
                FROM dbo.BookData BD
                     CROSS APPLY OPENJSON(BD.book)
                                 WITH (name nvarchar(50)) b
                WHERE BD.rentBookRegistrationId = RBR.id
                FOR JSON AUTO) AS Books,
               (SELECT pU.no,
                       pU.name
                FROM dbo.ProcessData PD
                     CROSS APPLY OPENJSON(PD.processUsed)
                                 WITH (no int,
                                       name nvarchar(50)) pU
                WHERE PD.rentBookRegistrationId = RBR.id
                FOR JSON AUTO) AS Processes
        FROM OPENJSON(RBR.person)
             WITH (no int,
                   name nvarchar(50)) p
        FOR JSON PATH, ROOT('result'))
FROM dbo.RentBookRegistration RBR;
I assume you want 1 row per person.
db<>fiddle

If you can't fix the table structure, you can use JSON_VALUE and JSON_QUERY to do what you want, but I recommend fixing the table structures: better performance, better resource use, and much simpler queries that are easier to read.
https://learn.microsoft.com/en-us/sql/relational-databases/json/json-data-sql-server?view=sql-server-ver16
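For completeness, a minimal sketch of the JSON_VALUE approach against the unchanged tables (reusing the column names from the question):
SELECT R.id AS RentBookRegistrationId,
       R.date,
       JSON_VALUE(R.person, '$.name') AS PersonName  -- extracts a scalar from the JSON column
FROM dbo.RentBookRegistration R
WHERE JSON_VALUE(R.person, '$.name') = 'Peter';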

Related

How to extract elements from Presto ARRAY(MAP(VARCHAR, VARCHAR))

I have an array of maps whose type is ARRAY(MAP(VARCHAR, VARCHAR)); I'd like to extract the "product" and "amount" entries from this "Item_Details" column:
+---------+---------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Company | Country | Item_Details                                                                                                                                                   |
+---------+---------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Apple   | US      | [{"created":"2019-09-15","product":"apple watch", "amount": "$7,900"},{"created":"2022-09-19","product":"iPhone", "amount": "$78,300"},{"created":"2021-01-13","product":"Macbook Pro", "amount": "$163,980"}] |
| Google  | US      | [{"created":"2020-07-15","product":"Nest", "amount": "$78,300"},{"created":"2021-07-15","product":"Google phone", "amount": "$178,900"}]                       |
+---------+---------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
My expected output would be:
+---------+---------+----------------------------------------------------------------------------------------------------+
| Company | Country | Item_Details                                                                                         |
+---------+---------+----------------------------------------------------------------------------------------------------+
| Apple   | US      | {"product": ["apple watch", "iPhone", "Macbook Pro"], "amount": ["$7,900", "$78,300", "$163,980"]}  |
| Google  | US      | {"product": ["Nest", "Google phone"], "amount": ["$78,300", "$178,900"]}                            |
+---------+---------+----------------------------------------------------------------------------------------------------+
or
+---------+---------+-------------+----------+
| Company | Country | Product     | Amount   |
+---------+---------+-------------+----------+
| Apple   | US      | apple watch | $7,900   |
| Apple   | US      | iPhone      | $78,300  |
| Apple   | US      | Macbook Pro | $163,980 |
...
+---------+---------+-------------+----------+
I tried element_at(Item_Details, 'product') and json_extract_scalar(Item_Details, '$.product') but received the error "Unexpected parameters (array(map(varchar,varchar)), varchar(23)) for function element_at."
Any suggestions are much appreciated! Thank you in advance.
For the second output you can unnest the array and access the elements of each map:
-- sample data
WITH dataset(Company, Country, Item_Details) AS (
    values ('Google', 'US', array[
        map(array['created', 'product', 'amount'], array['2019-09-15', 'Nest', '$78,300']),
        map(array['created', 'product', 'amount'], array['2019-09-16', 'Nest1', '$79,300'])
    ])
)
-- query
select Company,
       Country,
       m['product'] product,
       m['amount'] amount
from dataset d,
     unnest(Item_Details) as t(m);
Output:
Company | Country | product | amount
--------+---------+---------+--------
Google  | US      | Nest    | $78,300
Google  | US      | Nest1   | $79,300
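For the first output shape, you could aggregate the unnested values back into arrays (a sketch against the same sample data; the column aliases are illustrative):
-- collapse each company's products and amounts into arrays
select Company,
       Country,
       array_agg(m['product']) as products,
       array_agg(m['amount'])  as amounts
from dataset d,
     unnest(Item_Details) as t(m)
group by Company, Country;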

PostgreSQL: How to query a jsonb column having data in an array

I have a table customrule with this structure:
id - int
name - varchar
actions - jsonb
I have already read about the -> operator, but it doesn't seem to work in my case, as my data is stored as an array.
+------+----+------------------------------------------------------------------------+
| Name | Id | Actions                                                                |
+------+----+------------------------------------------------------------------------+
| CR-1 | 1  | [{"name": "Action1", "count": "1"},{"name": "Action2", "count": "2"}]  |
| CR-2 | 2  | [{"name": "Action5", "count": "1"},{"name": "Action4", "count": "2"}]  |
| CR-3 | 3  | [{"name": "Action1", "count": "1"},{"name": "Action1", "count": "2"}]  |
+------+----+------------------------------------------------------------------------+
I want to query this data and get all records that use Action1 in the actions column, which should return rows 1 and 3.
You need to use the containment operator @> with an array parameter:
select id, name, actions
from customrule
where actions @> '[{"name": "Action1"}]'
Online example
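If you run this kind of containment query often, note that a plain GIN index on the jsonb column supports the @> operator (standard PostgreSQL behaviour; the index name here is illustrative):
-- speeds up jsonb containment (@>) lookups on actions
create index customrule_actions_gin on customrule using gin (actions);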

How to modify the following cypher syntax in AgensGraph?

MATCH (wu:wiki_user)
OPTIONAL MATCH (n:wiki_doc{author:wu.uid}), (o:wiki_doc{editor:wu.uid})
RETURN wu.uid AS User_id, wu.org AS Organization, wu.email AS email, wu.token AS balance,
count(n) AS Writing, count(o) AS Modifying;
user_id | organization | email           | balance | writing | modifying
--------+--------------+-----------------+---------+---------+-----------
"ailee" | "Org2"       | "hazel@gbc.com" | 5       | 0       | 0
"hazel" | "Org1"       | "hazel@gbc.com" | 5       | 2       | 2
match (n:wiki_doc{editor:'hazel'}) return n;
n
wiki_doc[9.11]
{"bid": "hazel_doc1", "cid": "Basic", "org": "Org1", "title": "Hello world!",
"author": "hazel", "editor": "hazel", "revnum": 1, "created": "2018-09-25
09:00:000", "hasfile": 2, "contents": "I was wrong", "modified": "2018-09-25
10:00:000"}
(1 row)
In fact, hazel's number of edits is 1, but the query above returns 2.
How can the query be modified so that the correct counts are shown?
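The two patterns in the OPTIONAL MATCH are matched independently, so their results combine as a cartesian product per user, which inflates the counts; counting distinct vertex ids removes those duplicates: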
MATCH (wu:wiki_user)
OPTIONAL MATCH (n:wiki_doc{author:wu.uid}), (o:wiki_doc{editor:wu.uid})
RETURN wu.uid AS User_id, wu.org AS Organization, wu.email AS email, wu.token AS balance,
count(distinct id(n)) as Writing, count(distinct id(o)) as Modifying;
user_id | organization | email           | balance | writing | modifying
--------+--------------+-----------------+---------+---------+-----------
"ailee" | "Org2"       | "hazel@gbc.com" | 5       | 0       | 0
"hazel" | "Org1"       | "hazel@gbc.com" | 5       | 2       | 1
(2 rows)

Postgresql join on array and transform to json

I would like to make a join on an array containing ids and transform the result of this subselect into JSON (a JSON array).
I have the following model:

The lnam_refs column contains identifiers that refer to values in the lnam column.
I would like to transform the column lnam_refs into something like [row_to_json(), row_to_json()] or [] or [row_to_json()] or …
I have tried several methods but cannot achieve a clean result…
To try to be clearer :
Table in input:
id | label | lnam | lnam_refs
--------+----------------------+----------+-----------------------
1 | 'master1' | 11111111 | {33333333}
2 | 'master2' | 22222222 | {44444444,55555555}
3 | 'slave1' | 33333333 | {}
4 | 'slave2' | 44444444 | {}
5 | 'slave3' | 55555555 | {}
6 | 'master3' | 66666666 | {}
Results Expected:
id | label | lnam | lnam_refs | slaves
--------+----------------------+----------+-----------------------+---------------------------------
1 | 'master1' | 11111111 | {33333333} | [ {id: 3, label: 'slave1', lnam: 33333333, lnam_refs: []} ]
2 | 'master2' | 22222222 | {44444444,55555555} | [ {id: 4, label: 'slave2', lnam: 44444444, lnam_refs: []}, {id: 5, label: 'slave3', lnam: 55555555, lnam_refs: []} ]
6 | 'master3' | 66666666 | {} | []
Thanks for your help!
Here's one way to do it. (I created a table called t with that data you supplied.)
SELECT *,
       (SELECT JSON_AGG(ROW_TO_JSON(t2))
        FROM t t2
        WHERE label LIKE 'slave%'
          AND lnam = ANY(t1.lnam_refs)) AS slaves
FROM t t1
WHERE label LIKE 'master%'
I use the label field in the WHERE clause as I don't know how else you're determining which records should be master etc.
Result:
1;master1;11111111;{33333333};[{"id":3,"label":"slave1","lnam":33333333,"lnam_refs":[]}]
2;master2;22222222;{44444444,55555555};[{"id":4,"label":"slave2","lnam":44444444,"lnam_refs":[]}, {"id":5,"label":"slave3","lnam":55555555,"lnam_refs":[]}]
6;master3;66666666;{};
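Note that master3 comes back with NULL rather than the [] from the expected output; if that matters, you can wrap the subquery in COALESCE, e.g.:
SELECT *,
       COALESCE((SELECT JSON_AGG(ROW_TO_JSON(t2))
                 FROM t t2
                 WHERE label LIKE 'slave%'
                   AND lnam = ANY(t1.lnam_refs)), '[]'::json) AS slaves
FROM t t1
WHERE label LIKE 'master%'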

Rank over partition from PostgreSQL in Elasticsearch

We are facing a problem migrating a large data set from Postgres into Elasticsearch (from a backup or otherwise).
We have a schema similar to this:
+---------------+--------------+------------+-----------+
| user_id | created_at | latitude | longitude |
+---------------+--------------+------------+-----------+
| 5 | 23.1.2015 | 12.49 | 20.39 |
+---------------+--------------+------------+-----------+
| 2 | 23.1.2015 | 12.42 | 20.32 |
+---------------+--------------+------------+-----------+
| 2 | 24.1.2015 | 12.41 | 20.31 |
+---------------+--------------+------------+-----------+
| 5 | 25.1.2015 | 12.45 | 20.32 |
+---------------+--------------+------------+-----------+
| 1 | 23.1.2015 | 12.43 | 20.34 |
+---------------+--------------+------------+-----------+
| 1 | 24.1.2015 | 12.42 | 20.31 |
+---------------+--------------+------------+-----------+
We are able to find the latest position per user by created_at thanks to the rank function in SQL:
... WITH locations AS (
    select user_id, latitude, longitude,
           rank() over (partition by user_id order by created_at desc) as r
    FROM locations)
SELECT user_id, latitude, longitude FROM locations WHERE r = 1
and the result is only the newest location for each user:
+---------------+--------------+------------+-----------+
| user_id | created_at | latitude | longitude |
+---------------+--------------+------------+-----------+
| 2 | 24.1.2015 | 12.41 | 20.31 |
+---------------+--------------+------------+-----------+
| 5 | 25.1.2015 | 12.45 | 20.32 |
+---------------+--------------+------------+-----------+
| 1 | 24.1.2015 | 12.42 | 20.31 |
+---------------+--------------+------------+-----------+
After we import the data into elasticsearch, our document model looks like:
{
  "location" : { "lat" : 12.45, "lon" : 46.84 },
  "user_id" : 5,
  "created_at" : "2015-01-24T07:55:20.606+00:00"
}
etc...
I am looking for an equivalent of this SQL query in Elasticsearch. I think it must be possible, but I have not found out how yet.
You can achieve this using field collapsing combined with inner_hits:
{
  "collapse": {
    "field": "user_id",
    "inner_hits": {
      "name": "order by created_at",
      "size": 1,
      "sort": [
        { "created_at": "desc" }
      ]
    }
  }
}
Detailed Article: https://blog.francium.tech/sql-window-function-partition-by-in-elasticsearch-c2e3941495b6
It is simple: if you want to find the newest record (for a given id), you just need the records for which no newer ones (with the same id) exist. (This assumes that, for a given id, no two records share the same created_at date.)
SELECT * FROM locations ll
WHERE NOT EXISTS (
SELECT * FROM locations nx
WHERE nx.user_id = ll.user_id
AND nx.created_at > ll.created_at
);
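In PostgreSQL specifically, DISTINCT ON is another common way to express this (a sketch against the same table):
-- keep only the newest row per user_id
SELECT DISTINCT ON (user_id) *
FROM locations
ORDER BY user_id, created_at DESC;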
EDITED (it appears the OP wants the newest observation, not the oldest)
Use top_hits:
"aggs": {
"user_id": {
"terms": {"field": "user_id"},
"aggs": {
"top_location": {
"top_hits": {
"size": 1,
"sort": { "created_at": "asc" },
"_source": []
}
}
}
}
}
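Note the trade-off between the two approaches: collapse runs in the query phase and paginates like a normal search, while top_hits is an aggregation, so the number of users covered is bounded by the size of the enclosing terms aggregation (which defaults to 10 buckets).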