How to group multiple values to only two groups? - sql

So, I have 2 tables.
Type table
id
Name
1.
General
2.
Mostly Used
3.
Low
Component table
id
Name
typeId
1.
Component 1
1
2.
Component 2
1
4.
Component 4
2
6.
Component 6
2
7.
Component 5
3
There can be numerous types but I want to get only 'General' and 'Others' as types along with the component as follows:
[{
"General": [{
"id": "1",
"name": "General",
"component": [{
"id": 1,
"name": "component 1",
"componentTypeId": 1
}, {
"id": 2,
"name": "component 2",
"componentTypeId": 1
}]
}],
"Others": [{
"id": "2",
"name": "Mostly Used",
"component": [{
"id": 4,
"name": "component 4",
"componentTypeId": 2
}, {
"id": 6,
"name": "component 6",
"componentTypeId": 2
}]
},
{
"id": "3",
"name": "Low",
"component": [{
"id": 7,
"name": "component 5",
"componentTypeId": 3
}]
}
]
}]
WITH CTE_TYPES AS (
SELECT
CASE WHEN t. "name" <> 'General' THEN
'Others'
ELSE
'General'
END AS TYPE,
t.id,
t.name
FROM
type AS t
GROUP BY
TYPE,
t.id
),
CTE_COMPONENT AS (
SELECT
c.id,
c.name,
c.typeid
FROM
component c
)
SELECT
JSON_AGG(jsonb_build_object ('id', CT.id, 'name', CT.name, 'type', CT.type, 'component', CC))
FROM
CTE_COMPONENTTYPES CT
INNER JOIN CTE_COMPONENT CC ON CT.id = CC.tradingplancomponenttypeid
GROUP BY
CT.type
I get 2 types from the query as I expected but the components are not grouped together
Can you also point to resources to learn advanced SQL queries?

Here after is a solution to get your expected result as specified in your question :
First part
The first part of the query aggregates all the components with the same TypeId into a jsonb array. It also calculates the new type column with the value 'Others' for all the type names different from General or with the value 'General' :
SELECT CASE WHEN t.name <> 'General' THEN 'Others' ELSE 'General' END AS type
, t.id, t.name
, jsonb_build_object('id', t.id, 'name', t.name, 'component', jsonb_agg(jsonb_build_object('id', c.id, 'name', c.name, 'componentTypeId', c.typeid))) AS list
FROM component AS c
INNER JOIN type AS t
ON t.id = c.typeid
GROUP BY t.id, t.name
jsonb_build_object builds a jsonb object from a set of key/value arguments
jsonb_agg aggregates jsonb objects into a single jsonb array.
Second part
The second part of the query is much more complex because of the structure of your expected result where you want to nest the types which are different from General with their components inside each other according to the TypeId order, ie Low type with TypeId = 3 is nested inside Mostly Used type with TypeId = 2 :
{ "id": "2",
, "name": "Mostly Used"
, "component": [ { "id": 4
, "name": "component 4"
, "componentTypeId": 2
}
, { ... }
, { "id": "3"
, "name": "Low" --> 'Low' type is nested inside 'Mostly Used' type
, "component": [ { "id": 7
, "name": "component 5"
, "componentTypeId": 3
}
, { ... }
]
}
]
}
To do such a nested structure with a random number of TypeId, you could create a recursive query, but I prefer here to create a user-defined aggregate function which will make the query much more simple and readable, see the manual. The aggregate function jsonb_set_inv_agg is based on the user-defined function jsonb_set_inv which inserts the jsonb object x inside the existing jsonb object z according to the path p. This function is based on the jsonb_set standard function :
CREATE OR REPLACE FUNCTION jsonb_set_inv(x jsonb, p text[], z jsonb, b boolean)
RETURNS jsonb LANGUAGE sql IMMUTABLE AS
$$
SELECT jsonb_set(z, p, COALESCE(z#>p || x, z#>p), b) ;
$$ ;
CREATE AGGREGATE jsonb_set_inv_agg(p text[], z jsonb, b boolean)
( sfunc = jsonb_set_inv
, stype = jsonb
) ;
Based on the newly created aggregate function jsonb_set_inv_agg and the jsonb_agg and jsonb_build_object standard functions already seen above, the final query is :
SELECT jsonb_agg(jsonb_build_object('General', x.list)) FILTER (WHERE x.type = 'General')
|| jsonb_build_object('Others', jsonb_set_inv_agg('{component}', x.list, true ORDER BY x.id DESC) FILTER (WHERE x.type = 'Others'))
FROM
( SELECT CASE WHEN t.name <> 'General' THEN 'Others' ELSE 'General' END AS type
, t.id, t.name
, jsonb_build_object('id', t.id, 'name', t.name, 'component', jsonb_agg(jsonb_build_object('id', c.id, 'name', c.name, 'componentTypeId', c.typeid))) AS list
FROM component AS c
INNER JOIN type AS t
ON t.id = c.typeid
GROUP BY t.id, t.name
) AS x
see the full test result in dbfiddle.

Related

Querying over PostgreSQL JSONB column

I have a table "blobs" with a column "metadata" in jsonb data-type,
Example:
{
"total_count": 2,
"items": [
{
"name": "somename",
"metadata": {
"metas": [
{
"id": "11258",
"score": 6.1,
"status": "active",
"published_at": "2019-04-20T00:29:00",
"nvd_modified_at": "2022-04-06T18:07:00"
},
{
"id": "9251",
"score": 5.1,
"status": "active",
"published_at": "2018-01-18T23:29:00",
"nvd_modified_at": "2021-01-08T12:15:00"
}
]
}
]
}
I want to identify statuses in the "metas" array that match with certain, given strings. I have tried the following so far but without results:
SELECT * FROM blobs
WHERE metadata is not null AND
(
SELECT count(*) FROM jsonb_array_elements(metadata->'metas') AS cn
WHERE cn->>'status' IN ('active','reported')
) > 0;
It would also be sufficient if I could compare the string with "status" in the first array object.
I am using PostgreSQL 9.6.24
for some clarity I usually break code into series of WITH statements. My idea for your problem would be to use json path (https://www.postgresql.org/docs/12/functions-json.html#FUNCTIONS-SQLJSON-PATH) and function jsonb_path_query.
Below code gives a list of counts, I will leave the rest to you, to get final data.
I've added ID column just to have something to join on. Otherwise join on metadata.
Also, note additional " in where condition. Left join in blob_ext is there just to have null value if metadata is not present or that path does not work.
with blob as (
select row_number() over()"id", * from (VALUES
(
'{
"total_count": 2,
"items": [
{
"name": "somename",
"metadata": {
"metas": [
{
"id": "11258",
"score": 6.1,
"status": "active",
"published_at": "2019-04-20T00:29:00",
"nvd_modified_at": "2022-04-06T18:07:00"
},
{
"id": "9251",
"score": 5.1,
"status": "active",
"published_at": "2018-01-18T23:29:00",
"nvd_modified_at": "2021-01-08T12:15:00"
}
]
}
}
]}'::jsonb),
(null::jsonb)) b(metadata)
)
, blob_ext as (
select bb.*, blob_sts.status
from blob bb
left join (
select
bb2.id,
jsonb_path_query (bb2.metadata::jsonb, '$.items[*].metadata.metas[*].status'::jsonpath)::character varying "status"
FROM blob bb2
) as blob_sts ON
blob_sts.id = bb.id
)
select bbe.id, count(*) cnt, bbe.metadata
from blob_ext bbe
where bbe.status in ('"active"', '"reported"')
group by bbe.id, bbe.metadata;
A way is to peel one layer at a time with jsonb_extract_path() and jsonb_array_elements():
with cte_items as (
select id,
metadata,
jsonb_extract_path(jx.value,'metadata','metas') as metas
from blobs,
lateral jsonb_array_elements(jsonb_extract_path(metadata,'items')) as jx),
cte_metas as (
select id,
metadata,
jsonb_extract_path_text(s.value,'status') as status
from cte_items,
lateral jsonb_array_elements(metas) s)
select distinct
id,
metadata
from cte_metas
where status in ('active','reported');

Joining tables with filter condition on multiple columns in Oracle DBMS

{
"description": "test",
"id": "1",
"name": "test",
"prod": [
{
"id": "1",
"name": "name",
"re": [
{
"name": "name1",
"value": "1"
},
{
"name": "name2",
"value": "1"
},
{
"name": "name3",
"value": "0"
},
{
"name": "name4",
"value": "0"
}
]
}
]
}
Here is the best I can do with your JSON input and your sample output.
Note that your document has a unique "id" and "name" ("1" and "test" in your example). Then it has an array named "productSpecificationRelationship". Each element of this array is an object with its own "id" - in the query, I show this id with the column name PSR_ID (PSR for Product Specification Relationship). Also, each object in this first-level array contains a sub-array (second level), with objects with "name" ("name" again!) and "value" keys. (This looks very much like an entity-attribute-value model - very poor practice.) In the intermediate step in my query (before pivoting), I call these RC_NAME and RC_VALUE (RC for Relationship Characteristic).
In your sample output you have more than one value in the ID and NAME columns. I don't see how that is possible; perhaps from unpacking more than one document? The JSON document you shared with us has "id" and "name" as top-level attributes.
In the output, I understand (or rather, assume, since I didn't understand too much from your question) that you should also include the PSR_ID - there is only one in your document, with value "10499", but in principle there may be more than one, and the output will have one row per such id.
Also, I assume the "name" values are limited to the four you mentioned (or, if there can be more, you are only interested in those four in the output).
With all that said, here is the query. Note that I called the table ES for simplicity. Also, you will see that I had to go to nested path twice (since your document includes an array of arrays, and I wanted to pick up the PSR_ID from the outer array and the tokens from the nested arrays).
TABLE SETUP
create table es (payloadentityspecification clob
check (payloadentityspecification is json) );
insert into es (payloadentityspecification) values (
'{
"description": "test",
"id": "1",
"name": "test",
"productSpecificationRelationship": [
{
"id": "10499",
"relationshipType": "channelRelation",
"relationshipCharacteristic": [
{
"name": "out_of_home",
"value": "1"
},
{
"name": "out_of_home_ios",
"value": "1"
},
{
"name": "out_of_home_android",
"value": "0"
},
{
"name": "out_of_home_web",
"value": "0"
}
]
}
]
}');
commit;
QUERY
with
prep (id, name, psr_id, rc_name, rc_value) as (
select id, name, psr_id, rc_name, rc_value
from es,
json_table(payloadentityspecification, '$'
columns (
id varchar2(10) path '$.id',
name varchar2(40) path '$.name',
nested path '$.productSpecificationRelationship[*]'
columns (
psr_id varchar2(10) path '$.id',
nested path '$.relationshipCharacteristic[*]'
columns (
rc_name varchar2(50) path '$.name',
rc_value varchar2(50) path '$.value'
)
)
)
)
)
select id, name, psr_id, ooh, ooh_android, ooh_ios, ooh_web
from prep
pivot ( min(case rc_value when '1' then 'TRUE'
when '0' then 'FALSE' else 'UNDEFINED' end)
for rc_name in ( 'out_of_home' as ooh,
'out_of_home_android' as ooh_android,
'out_of_home_ios' as ooh_ios,
'out_of_home_web' as ooh_web
)
)
;
OUTPUT
ID NAME PSR_ID OOH OOH_ANDROID OOH_IOS OOH_WEB
-- ---- ------ ----------- ----------- ----------- -----------
1 test 10499 TRUE FALSE TRUE FALSE
Conditional aggregation might be used in order to pivot the result set after extracting the values by using JSON_TABLE() and JSON_VALUE() functions such as
SELECT JSON_VALUE(payloadentityspecification, '$.name') AS channel_map_name,
MAX(CASE WHEN name = 'out_of_home' THEN
DECODE(value,1,'TRUE',0,'FALSE','UNDEFINED')
END) AS ooh,
MAX(CASE WHEN name = 'out_of_home_android' THEN
DECODE(value,1,'TRUE',0,'FALSE','UNDEFINED')
END) AS ooh_android,
MAX(CASE WHEN name = 'out_of_home_ios' THEN
DECODE(value,1,'TRUE',0,'FALSE','UNDEFINED')
END) AS ooh_ios,
MAX(CASE WHEN name = 'out_of_home_web' THEN
DECODE(value,1,'TRUE',0,'FALSE','UNDEFINED')
END) AS ooh_web
FROM EntitySpecification ES,
JSON_TABLE (payloadentityspecification, '$.productSpecificationRelationship[*]'
COLUMNS ( NESTED PATH '$.relationshipCharacteristic[*]'
COLUMNS (
description VARCHAR2(250) PATH '$.description',
name VARCHAR2(250) PATH '$.name',
value VARCHAR2(250) PATH '$.value'
)
)) jt
WHERE payloadentityspecification IS JSON
GROUP BY JSON_VALUE(payloadentityspecification, '$.name')
Demo

In BigQuery, how do I check if two ARRAY of STRUCTs are equal

I have a query that outputs two array of structs:
SELECT modelId, oldClassCounts, newClassCounts
FROM `xyz`
GROUP BY 1
How do I create another column that is TRUE if oldClassCounts = newClassCounts?
Here is a sample result in JSON:
[
{
"modelId": "FBF21609-65F8-4076-9B22-D6E277F1B36A",
"oldClassCounts": [
{
"id": "A041EBB1-E041-4944-B231-48BC4CCE025B",
"count": "33"
},
{
"id": "B8E4812B-A323-47DD-A6ED-9DF877F501CA",
"count": "82"
}
],
"newClassCounts": [
{
"id": "A041EBB1-E041-4944-B231-48BC4CCE025B",
"count": "33"
},
{
"id": "B8E4812B-A323-47DD-A6ED-9DF877F501CA",
"count": "82"
}
]
}
]
I want the equality column to be TRUE if oldClassCounts and newClassCounts are exactly the same like the output above.
Anything else should be false.
I would go about with this solution
#standardSQL
WITH xyz AS (
SELECT "FBF21609-65F8-4076-9B22-D6E277F1B36A" AS modelId,
[STRUCT("A041EBB1-E041-4944-B231-48BC4CCE025B" as id, "33" as count),
STRUCT("B8E4812B-A323-47DD-A6ED-9DF877F501CA" as id, "82" as count)] AS oldClassCounts,
[STRUCT("A041EBB1-E041-4944-B231-48BC4CCE025B" as id, "33" as count),
STRUCT("B8E4812B-A323-47DD-A6ED-9DF877F501CA" as id, "82" as count)] as newClassCounts),
o as (SELECT modelId, id, count, array_length(oldClassCounts) as len FROM xyz, UNNEST(oldClassCounts) as old_c),
n as (SELECT modelId, id, count, array_length(newClassCounts) as len FROM xyz, UNNEST(newClassCounts) as new_c),
uneq as (select * from o except distinct select * from n)
select xyz.*, IF(uneq.modelId is not null, false, true) as equal from xyz left join (select distinct modelId from uneq) uneq on xyz.modelId = uneq.modelId
It works regardless of the order or having duplicates within the arrays. The idea is that we treat each of the arrays as a separate temporary table removing all elements that exist in one but not the other (using except distinct) and then have an extra check for the length of the arrays in case there are duplicates e.g.
"FBF21609-65F8-4076-9B22-D6E277F1B36A" AS modelId,
[STRUCT("A041EBB1-E041-4944-B231-48BC4CCE025B" as id, "33" as count),
STRUCT("B8E4812B-A323-47DD-A6ED-9DF877F501CA" as id, "82" as count),
STRUCT("B8E4812B-A323-47DD-A6ED-9DF877F501CA" as id, "82" as count)]
I would consider comparing the result of TO_JSON_STRING function applied on both of these arrays.
In the query it would be done in the following way:
SELECT modelId,
oldClassCounts,
newClassCounts,
CASE WHEN TO_JSON_STRING(oldClassCounts) = TO_JSON_STRING(newClassCounts)
THEN true
ELSE false
END
FROM `xyz`;
I'm not sure about GROUP BY 1 part, because non of the fields are grouped or aggregated.
It is not going to work, if the order of elements in the array is going to be different. This solution is not perfect, but worked for the data you provided.

SQL to JSON - Grouping Results into JSON Array

I am trying to come up with an SQL solution for arranging output to match an expected JSON format.
I have some simple SQL to highlight where the issue is coming from;
SELECT TOP 1 'Surname' AS 'name.family'
,'Forename, Middle Name' AS 'name.given'
,'Title' AS 'name.prefix'
,getDATE() AS 'birthdate'
,'F' AS 'gender'
,'Yes' AS 'active'
,'work' AS 'telecom.use'
,'phone' AS 'telecom.system'
,'12344556' AS 'telecom.value'
FROM tblCustomer
FOR json path
Which will return JSON as;
[
{
"name": {
"family": "Surname",
"given": "Forename, Middle Name",
"prefix": "Title"
},
"birthdate": "2019-02-13T12:06:45.490",
"gender": "F",
"active": "Yes",
"telecom": {
"use": "work",
"system": "phone",
"value": "12344556"
}
}
]
What I need to is to add extra objects into the "telecome" array so it would appear as;
[
{
"name": {
"family": "Surname",
"given": "Forename, Middle Name",
"prefix": "Title"
},
"birthdate": "2019-02-13T12:06:45.490",
"gender": "F",
"active": "Yes",
"telecom": {
"use": "work",
"system": "phone",
"value": "12344556"
},
{
"use": "work",
"system": "home",
"value": "12344556"
},
}
]
I have incorrectly assume I could keep adding to my SQL as follows;
SELECT TOP 1 'Surname' AS 'name.family'
,'Forename, Middle Name' AS 'name.given'
,'Title' AS 'name.prefix'
,getDATE() AS 'birthdate'
,'F' AS 'gender'
,'Yes' AS 'active'
,'work' AS 'telecom.use'
,'phone' AS 'telecom.system'
,'12344556' AS 'telecom.value'
,'home' AS 'telecom.use'
FROM tblCustomer
FOR json path
And it would nest the items as per my naming indents however;
Property 'telecom.use' cannot be generated in JSON output due to a
conflict with another column name or alias. Use different names and
aliases for each column in SELECT list.
Is there a way to handle this nesting with SQL or will I need to create separate for JSON queries and merge them?
Thanks
Using ##Version Microsoft SQL Server 2017 (RTM) - 14.0.1000.169 (X64)
Aug 22 2017 17:04:49 Copyright (C) 2017 Microsoft Corporation
Express Edition (64-bit) on Windows Server 2012 R2 Datacenter 6.3
(Build 9600: ) (Hypervisor)
Small edit to the question to use dynamic values rather than forced static members.
SELECT TOP 1 'Surname' AS 'name.family'
,'Forename, Middle Name' AS 'name.given'
,'Title' AS 'name.prefix'
,getDATE() AS 'birthdate'
,'F' AS 'gender'
,'Yes' AS 'active'
,'work' AS 'telecom.use'
,'phone' AS 'telecom.system'
,customerWorkTelephone AS 'telecom.value'
,'home' AS 'telecom.use'
,'phone' AS 'telecom.system'
,customerHomeTelephone AS 'telecom.value'
FROM tblCustomer
FOR json path
The "value" items will be taken from columns within the tblCustomer table. I've tried to make good on the responses below but cant get the logic quite correct in the sub query.
Thanks again
FURTHER EDIT
I have some SQL that is giving me the output I expect however I am not sure its the best that it could be, is my approach less than optimal?
SELECT TOP 1 [name.family] = 'Surname'
,[name.given] = 'Forename, Middle Name'
,[name.prefix] = 'Title'
,[birthdate] = GETDATE()
,[gender] = 'F'
,[active] = 'Yes'
,[telecom] = (
SELECT [use] = V.used
,[system] = 'phone'
,[value] = CASE V.used
WHEN 'work'
THEN cu.customerWorkTelephone
WHEN 'home'
THEN cu.customerHomeTelephone
when 'mobile'
then cu.customerMobileTelephone
END
FROM (
VALUES ('work')
,('home')
,('mobile')
) AS V(used)
FOR json path
)
FROM tblCustomer cu
FOR JSON PATH
Using a subselect with a few hard-coded rows:
SELECT TOP 1
'Surname' AS 'name.family'
,'Forename, Middle Name' AS 'name.given'
,'Title' AS 'name.prefix'
,getDATE() AS 'birthdate'
,'F' AS 'gender'
,'Yes' AS 'active'
,'telecom' = (
SELECT
'work' AS 'use'
,V.system AS 'system'
,'12344556' AS 'value'
FROM
(VALUES
('phone'),
('home')) AS V(system)
FOR JSON PATH)
FROM tblCustomer
FOR JSON PATH
Note the lack of the telecom. prefix inside the subquery.
Results (without the table reference):
[
{
"name": {
"family": "Surname",
"given": "Forename, Middle Name",
"prefix": "Title"
},
"birthdate": "2019-02-13T12:53:08.400",
"gender": "F",
"active": "Yes",
"telecom": [
{
"use": "work",
"system": "phone",
"value": "12344556"
},
{
"use": "work",
"system": "home",
"value": "12344556"
}
]
}
]
PD: Particularly for SQL Server I find using the alias on the left side more readable:
SELECT TOP 1
[name.family] = 'Surname',
[name.given] = 'Forename, Middle Name',
[name.prefix] = 'Title',
[birthdate] = GETDATE(),
[gender] = 'F',
[active] = 'Yes',
[telecom] = (
SELECT
[use] = 'work',
[system] = V.system,
[value] = '12344556'
FROM
(VALUES ('phone'), ('home')) AS V(system)
FOR JSON
PATH)
FROM tblCustomer
FOR JSON
PATH
SELECT
EMP.ID,
EMP.NAME,
DEP.NAME
FROM EMPLOYEE EMP INNER JOIN DEPARTMENT DEP ON EMP.DEPID=DEP.DEPID
WHERE EMP.SALARY>1000
FOR JSON PATH

jsonb LIKE query on nested objects in an array

My JSON data looks like this:
[{
"id": 1,
"payload": {
"location": "NY",
"details": [{
"name": "cafe",
"cuisine": "mexican"
},
{
"name": "foody",
"cuisine": "italian"
}
]
}
}, {
"id": 2,
"payload": {
"location": "NY",
"details": [{
"name": "mbar",
"cuisine": "mexican"
},
{
"name": "fdy",
"cuisine": "italian"
}
]
}
}]
given a text "foo" I want to return all the tuples that have this substring. But I cannot figure out how to write the query for the same.
I followed this related answer but cannot figure out how to do LIKE.
This is what I have working right now:
SELECT r.res->>'name' AS feature_name, d.details::text
FROM restaurants r
, LATERAL (SELECT ARRAY (
SELECT * FROM json_populate_recordset(null::foo, r.res#>'{payload,
details}')
)
) AS d(details)
WHERE d.details #> '{cafe}';
Instead of passing the whole text of cafe I want to pass ca and get the results that match that text.
Your solution can be simplified some more:
SELECT r.res->>'name' AS feature_name, d.name AS detail_name
FROM restaurants r
, jsonb_populate_recordset(null::foo, r.res #> '{payload, details}') d
WHERE d.name LIKE '%oh%';
Or simpler, yet, with jsonb_array_elements() since you don't actually need the row type (foo) at all in this example:
SELECT r.res->>'name' AS feature_name, d->>'name' AS detail_name
FROM restaurants r
, jsonb_array_elements(r.res #> '{payload, details}') d
WHERE d->>'name' LIKE '%oh%';
db<>fiddle here
But that's not what you asked exactly:
I want to return all the tuples that have this substring.
You are returning all JSON array elements (0-n per base table row), where one particular key ('{payload,details,*,name}') matches (case-sensitively).
And your original question had a nested JSON array on top of this. You removed the outer array for this solution - I did the same.
Depending on your actual requirements the new text search capability of Postgres 10 might be useful.
I ended up doing this(inspired by this answer - jsonb query with nested objects in an array)
SELECT r.res->>'name' AS feature_name, d.details::text
FROM restaurants r
, LATERAL (
SELECT * FROM json_populate_recordset(null::foo, r.res#>'{payload, details}')
) AS d(details)
WHERE d.details LIKE '%oh%';
Fiddle here - http://sqlfiddle.com/#!15/f2027/5