SQL GROUP BY and create Element for each GROUP - sql

UPDATE:
I was able to solve this with INSERT INTO:
INSERT INTO newClass
FROM
SELECT newClassID AS id, newClassName AS name
FROM oldClass
GROUP BY newClassID
I'm using SQL with OrientDB version 2.2.
I want to group the elements of one class by the attribute newClassID. Multiple elements in this class share the same newClassID; say I have 20 elements in the first class and 3 unique newClassID values. Each group has a unique newClassID and also a unique newClassName, as these attributes go hand in hand.
For each group I want to create an element of another class (newClass) with id = newClassID and name = newClassName. In the above case this results in 3 new elements in newClass.
The elements of the first class look like this:
Elem  newClassID  newClassName
----  ----------  ------------
1     id1         name1
2     id2         name2
3     id1         name1
4     id3         name3
5     id2         name2
And I want to create 3 new elements in newClass:
Elem  ID   Name
----  ---  -----
1     id1  name1
2     id2  name2
3     id3  name3
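To make the goal concrete, here is a minimal runnable sketch of the "one new element per group" idea. It uses Python's built-in sqlite3 module as a stand-in, which is an assumption made only so the example runs anywhere; it is not OrientDB, and OrientDB's INSERT syntax differs. The table and column names mirror the ones in the question.

```python
import sqlite3

# Hypothetical in-memory stand-in for the two OrientDB classes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE oldClass (elem INTEGER, newClassID TEXT, newClassName TEXT)")
conn.execute("CREATE TABLE newClass (id TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO oldClass VALUES (?, ?, ?)",
    [(1, "id1", "name1"), (2, "id2", "name2"), (3, "id1", "name1"),
     (4, "id3", "name3"), (5, "id2", "name2")],
)

# Each distinct (newClassID, newClassName) pair becomes exactly one newClass row.
conn.execute(
    "INSERT INTO newClass (id, name) "
    "SELECT newClassID, newClassName FROM oldClass "
    "GROUP BY newClassID, newClassName"
)

new_rows = conn.execute("SELECT id, name FROM newClass ORDER BY id").fetchall()
print(new_rows)
```

Grouping by both columns is a deliberate choice here: since id and name go hand in hand, it yields the same groups while avoiding a non-aggregated column in the SELECT list.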
With this query I get id and name as attributes:
BEGIN
let pattern = SELECT newClassID, newClassName
FROM oldClass GROUP BY newClassID
COMMIT
RETURN { 'id' : $pattern.newClassID, 'name' : $pattern.newClassName }
Result:
[
{
"#type": "d",
"#version": 0,
"name": [
"name1",
"name2",
"name3"
],
"id": [
"id1",
"id2",
"id3"
],
"#fieldTypes": "name=e,id=e"
}
]
Now I want to call this function (called GetNewClass()) and create a newClass element for each entry in id. I cannot use FOREACH, as it is only available from version 3.x. To be honest, I have no clue how to do this.
I tried:
BEGIN
INSERT INTO newClass FROM SELECT GetNewClass()
let newElements = SELECT FROM newClass
RETURN $newElements
But this gives me just one element with:
[
{
"#type": "d",
"#rid": "#41:1",
"#version": 1,
"#class": "newClass",
"GetProdID": {
"name": [
"name1",
"name2",
"name3"
],
"id": [
"id1",
"id2",
"id3"
]
}
}
]
Thanks a lot!

Related

How can I modify all values that match a condition inside a json array?

I have a table which has a JSON column called people like this:
Id  people
--  ---------------------------------------
1   [{ "id": 6 }, { "id": 5 }, { "id": 3 }]
2   [{ "id": 2 }, { "id": 3 }, { "id": 1 }]
...and I need to update the people column and put a 0 in the path $[*].id where id = 3, so after executing the query, the table should end like this:
Id  people
--  ---------------------------------------
1   [{ "id": 6 }, { "id": 5 }, { "id": 0 }]
2   [{ "id": 2 }, { "id": 0 }, { "id": 1 }]
There may be more than one match per row.
Honestly, I haven't tried any query since I cannot figure out how to loop inside a field, but my idea was something like this:
UPDATE mytable
SET people = JSON_SET(people, '$[*].id', 0)
WHERE /* ...something should go here */
This is my version
SELECT VERSION()
+-----------------+
| version()       |
+-----------------+
| 10.4.22-MariaDB |
+-----------------+
If the id values in people are unique, you can use a combination of JSON_SEARCH and JSON_REPLACE to change the values:
UPDATE mytable
SET people = JSON_REPLACE(people, JSON_UNQUOTE(JSON_SEARCH(people, 'one', 3)), 0)
WHERE JSON_SEARCH(people, 'one', 3) IS NOT NULL
Note that the WHERE clause is necessary to prevent the query replacing values with NULL when the value is not found due to JSON_SEARCH returning NULL (which then causes JSON_REPLACE to return NULL as well).
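For comparison, if the rows can be read and written back by application code, the same change can be made by parsing the array instead of searching for a path. This is only a sketch in Python; `zero_matching_ids` is a hypothetical helper name, not part of any library. Unlike JSON_SEARCH with 'one', it touches every matching element, and rows without a match come back unchanged rather than as NULL.

```python
import json

def zero_matching_ids(people_json, target, replacement=0):
    """Return the JSON array text with every "id" equal to target replaced."""
    people = json.loads(people_json)
    for person in people:
        if person.get("id") == target:
            person["id"] = replacement
    return json.dumps(people)

# One match in the first row, two matches in the second.
row1 = zero_matching_ids('[{ "id": 6 }, { "id": 5 }, { "id": 3 }]', 3)
row2 = zero_matching_ids('[{ "id": 3 }, { "id": 3 }, { "id": 1 }]', 3)
# No match: the array is returned with the same content.
row3 = zero_matching_ids('[{ "id": 7 }]', 3)
print(row1, row2, row3)
```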
If the id values are not unique, you will have to rely on string replacement, preferably using REGEXP_REPLACE to deal with possible differences in spacing in the values (and to avoid replacing the 3 in, for example, 23 or 34). Here \\1 re-inserts the captured "id": prefix and is followed by the new value 0:
UPDATE mytable
SET people = REGEXP_REPLACE(people, '("id"\\s*:\\s*)3\\b', '\\10')
Demo on dbfiddle
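The behaviour of that kind of pattern can be checked outside the database. Below is a small Python sketch using the question's values (3 replaced by 0); the function name `replace_id_three` is made up for illustration. Note that Python's re module needs `\g<1>` in the replacement, because a plain `\10` would be read as group 10.

```python
import re

# Capture the '"id" :' prefix with flexible spacing, then require the digit 3
# followed by a word boundary so that 34 is left alone.
ID_THREE = re.compile(r'("id"\s*:\s*)3\b')

def replace_id_three(people_text):
    # \g<1> re-inserts the captured prefix, followed by the literal 0.
    return ID_THREE.sub(r"\g<1>0", people_text)

sample = '[{ "id": 23 }, { "id": 3 }, {"id":34}, { "id" : 3 }]'
result = replace_id_three(sample)
print(result)
```

The value 23 is safe because the 3 must come directly after the prefix, and 34 is safe because of the word boundary.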
As stated in the official documentation, MariaDB stores JSON documents in a string column (the JSON type is an alias for LONGTEXT), so you can use either the JSON functions such as JSON_SET or any string function.
For your specific task, applying the REPLACE string function may suit your case:
UPDATE
mytable
SET
people = REPLACE(people, CONCAT('"id": ', 3, ' '), CONCAT('"id": ',0, ' '))
WHERE
....;

How to group multiple values to only two groups?

So, I have 2 tables.
Type table
id  Name
--  -----------
1   General
2   Mostly Used
3   Low
Component table
id  Name         typeId
--  -----------  ------
1   Component 1  1
2   Component 2  1
4   Component 4  2
6   Component 6  2
7   Component 5  3
There can be numerous types but I want to get only 'General' and 'Others' as types along with the component as follows:
[{
"General": [{
"id": "1",
"name": "General",
"component": [{
"id": 1,
"name": "component 1",
"componentTypeId": 1
}, {
"id": 2,
"name": "component 2",
"componentTypeId": 1
}]
}],
"Others": [{
"id": "2",
"name": "Mostly Used",
"component": [{
"id": 4,
"name": "component 4",
"componentTypeId": 2
}, {
"id": 6,
"name": "component 6",
"componentTypeId": 2
}]
},
{
"id": "3",
"name": "Low",
"component": [{
"id": 7,
"name": "component 5",
"componentTypeId": 3
}]
}
]
}]
WITH CTE_TYPES AS (
SELECT
CASE WHEN t."name" <> 'General' THEN
'Others'
ELSE
'General'
END AS TYPE,
t.id,
t.name
FROM
type AS t
GROUP BY
TYPE,
t.id
),
CTE_COMPONENT AS (
SELECT
c.id,
c.name,
c.typeid
FROM
component c
)
SELECT
JSON_AGG(jsonb_build_object ('id', CT.id, 'name', CT.name, 'type', CT.type, 'component', CC))
FROM
CTE_TYPES CT
INNER JOIN CTE_COMPONENT CC ON CT.id = CC.typeid
GROUP BY
CT.type
I get 2 types from the query as I expected, but the components are not grouped together.
Can you also point to resources to learn advanced SQL queries?
Here is a solution that produces the expected result specified in your question:
First part
The first part of the query aggregates all the components with the same TypeId into a jsonb array. It also computes a new type column, which is 'General' for the General type and 'Others' for every other type name:
SELECT CASE WHEN t.name <> 'General' THEN 'Others' ELSE 'General' END AS type
, t.id, t.name
, jsonb_build_object('id', t.id, 'name', t.name, 'component', jsonb_agg(jsonb_build_object('id', c.id, 'name', c.name, 'componentTypeId', c.typeid))) AS list
FROM component AS c
INNER JOIN type AS t
ON t.id = c.typeid
GROUP BY t.id, t.name
jsonb_build_object builds a jsonb object from a set of key/value arguments
jsonb_agg aggregates jsonb objects into a single jsonb array.
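The grouping logic of this first part can be sketched in plain Python for readers who want to trace it by hand. This is an illustration under assumptions, not the answer's method: the tuples stand in for the two tables, `group_components` is a hypothetical name, and the result keeps the types as siblings under "Others", as in the question's expected output.

```python
from collections import defaultdict

types = [(1, "General"), (2, "Mostly Used"), (3, "Low")]
components = [
    (1, "Component 1", 1), (2, "Component 2", 1),
    (4, "Component 4", 2), (6, "Component 6", 2),
    (7, "Component 5", 3),
]

def group_components(types, components):
    # Counterpart of jsonb_agg(...) GROUP BY t.id: one component list per type.
    by_type = defaultdict(list)
    for comp_id, comp_name, type_id in components:
        by_type[type_id].append(
            {"id": comp_id, "name": comp_name, "componentTypeId": type_id}
        )
    # Counterpart of the CASE expression: non-General types go to "Others".
    buckets = {"General": [], "Others": []}
    for type_id, type_name in types:
        if type_id not in by_type:
            continue  # an INNER JOIN drops types that have no components
        bucket = "General" if type_name == "General" else "Others"
        buckets[bucket].append(
            {"id": type_id, "name": type_name, "component": by_type[type_id]}
        )
    return buckets

grouped = group_components(types, components)
print(grouped)
```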
Second part
The second part of the query is much more complex because of the structure of your expected result: the types other than General must be nested inside each other, together with their components, in TypeId order. For example, the Low type (TypeId = 3) is nested inside the Mostly Used type (TypeId = 2):
{ "id": "2"
, "name": "Mostly Used"
, "component": [ { "id": 4
, "name": "component 4"
, "componentTypeId": 2
}
, { ... }
, { "id": "3"
, "name": "Low" --> 'Low' type is nested inside 'Mostly Used' type
, "component": [ { "id": 7
, "name": "component 5"
, "componentTypeId": 3
}
, { ... }
]
}
]
}
To build such a nested structure for an arbitrary number of TypeId values, you could write a recursive query, but here I prefer to create a user-defined aggregate function, which makes the query much simpler and more readable (see the manual). The aggregate function jsonb_set_inv_agg is based on the user-defined function jsonb_set_inv, which inserts the jsonb object x into the existing jsonb object z at the path p. This function is in turn based on the standard jsonb_set function:
CREATE OR REPLACE FUNCTION jsonb_set_inv(x jsonb, p text[], z jsonb, b boolean)
RETURNS jsonb LANGUAGE sql IMMUTABLE AS
$$
SELECT jsonb_set(z, p, COALESCE(z#>p || x, z#>p), b) ;
$$ ;
CREATE AGGREGATE jsonb_set_inv_agg(p text[], z jsonb, b boolean)
( sfunc = jsonb_set_inv
, stype = jsonb
) ;
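What the aggregate does can be pictured as a left fold: the state accumulated so far is appended to the component array of the next row. The Python sketch below mirrors that behaviour under simplifying assumptions (a single top-level key instead of a general path, and shortened component objects); `set_inv` is a hypothetical name chosen to echo jsonb_set_inv.

```python
from functools import reduce

def set_inv(state, obj, key="component"):
    """Append the accumulated state to obj[key], like jsonb_set_inv does."""
    merged = dict(obj)
    # COALESCE(z#>p || x, z#>p): with no state yet, the list stays unchanged.
    merged[key] = list(obj[key]) + ([] if state is None else [state])
    return merged

others = [
    {"id": 2, "name": "Mostly Used", "component": [{"id": 4}, {"id": 6}]},
    {"id": 3, "name": "Low", "component": [{"id": 7}]},
]
# ORDER BY x.id DESC: the highest id is folded first, so it ends up innermost.
ordered = sorted(others, key=lambda t: t["id"], reverse=True)
nested = reduce(set_inv, ordered, None)
print(nested)
```

With the sample data, the Low object ends up as the last element of the Mostly Used component list.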
Based on the newly created aggregate function jsonb_set_inv_agg and the jsonb_agg and jsonb_build_object standard functions already seen above, the final query is :
SELECT jsonb_agg(jsonb_build_object('General', x.list)) FILTER (WHERE x.type = 'General')
|| jsonb_build_object('Others', jsonb_set_inv_agg('{component}', x.list, true ORDER BY x.id DESC) FILTER (WHERE x.type = 'Others'))
FROM
( SELECT CASE WHEN t.name <> 'General' THEN 'Others' ELSE 'General' END AS type
, t.id, t.name
, jsonb_build_object('id', t.id, 'name', t.name, 'component', jsonb_agg(jsonb_build_object('id', c.id, 'name', c.name, 'componentTypeId', c.typeid))) AS list
FROM component AS c
INNER JOIN type AS t
ON t.id = c.typeid
GROUP BY t.id, t.name
) AS x
See the full test result in dbfiddle.

Joining tables with filter condition on multiple columns in Oracle DBMS

{
"description": "test",
"id": "1",
"name": "test",
"prod": [
{
"id": "1",
"name": "name",
"re": [
{
"name": "name1",
"value": "1"
},
{
"name": "name2",
"value": "1"
},
{
"name": "name3",
"value": "0"
},
{
"name": "name4",
"value": "0"
}
]
}
]
}
Here is the best I can do with your JSON input and your sample output.
Note that your document has a unique "id" and "name" ("1" and "test" in your example). Then it has an array named "productSpecificationRelationship". Each element of this array is an object with its own "id" - in the query, I show this id with the column name PSR_ID (PSR for Product Specification Relationship). Also, each object in this first-level array contains a sub-array (second level), with objects with "name" ("name" again!) and "value" keys. (This looks very much like an entity-attribute-value model - very poor practice.) In the intermediate step in my query (before pivoting), I call these RC_NAME and RC_VALUE (RC for Relationship Characteristic).
In your sample output you have more than one value in the ID and NAME columns. I don't see how that is possible; perhaps from unpacking more than one document? The JSON document you shared with us has "id" and "name" as top-level attributes.
In the output, I understand (or rather, assume, since I didn't understand too much from your question) that you should also include the PSR_ID - there is only one in your document, with value "10499", but in principle there may be more than one, and the output will have one row per such id.
Also, I assume the "name" values are limited to the four you mentioned (or, if there can be more, you are only interested in those four in the output).
With all that said, here is the query. Note that I called the table ES for simplicity. Also, you will see that I had to use a nested path twice (since your document includes an array of arrays, and I wanted to pick up the PSR_ID from the outer array and the name/value pairs from the nested arrays).
TABLE SETUP
create table es (payloadentityspecification clob
check (payloadentityspecification is json) );
insert into es (payloadentityspecification) values (
'{
"description": "test",
"id": "1",
"name": "test",
"productSpecificationRelationship": [
{
"id": "10499",
"relationshipType": "channelRelation",
"relationshipCharacteristic": [
{
"name": "out_of_home",
"value": "1"
},
{
"name": "out_of_home_ios",
"value": "1"
},
{
"name": "out_of_home_android",
"value": "0"
},
{
"name": "out_of_home_web",
"value": "0"
}
]
}
]
}');
commit;
QUERY
with
prep (id, name, psr_id, rc_name, rc_value) as (
select id, name, psr_id, rc_name, rc_value
from es,
json_table(payloadentityspecification, '$'
columns (
id varchar2(10) path '$.id',
name varchar2(40) path '$.name',
nested path '$.productSpecificationRelationship[*]'
columns (
psr_id varchar2(10) path '$.id',
nested path '$.relationshipCharacteristic[*]'
columns (
rc_name varchar2(50) path '$.name',
rc_value varchar2(50) path '$.value'
)
)
)
)
)
select id, name, psr_id, ooh, ooh_android, ooh_ios, ooh_web
from prep
pivot ( min(case rc_value when '1' then 'TRUE'
when '0' then 'FALSE' else 'UNDEFINED' end)
for rc_name in ( 'out_of_home' as ooh,
'out_of_home_android' as ooh_android,
'out_of_home_ios' as ooh_ios,
'out_of_home_web' as ooh_web
)
)
;
OUTPUT
ID NAME PSR_ID OOH         OOH_ANDROID OOH_IOS     OOH_WEB
-- ---- ------ ----------- ----------- ----------- -----------
1  test 10499  TRUE        FALSE       TRUE        FALSE
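To see what the nested paths and the pivot do to the document, here is a rough Python equivalent. It is a sketch only: `pivot_document`, `COLUMNS` and `LABELS` are names invented for this example, and the payload is a trimmed copy of the one inserted above.

```python
import json

# Map each characteristic name to its output column, as in the PIVOT clause.
COLUMNS = {
    "out_of_home": "OOH",
    "out_of_home_android": "OOH_ANDROID",
    "out_of_home_ios": "OOH_IOS",
    "out_of_home_web": "OOH_WEB",
}
LABELS = {"1": "TRUE", "0": "FALSE"}

def pivot_document(doc):
    """Return one dict per productSpecificationRelationship entry."""
    rows = []
    for psr in doc.get("productSpecificationRelationship", []):
        row = {"ID": doc["id"], "NAME": doc["name"], "PSR_ID": psr["id"]}
        # Columns stay None when the characteristic is absent.
        row.update({col: None for col in COLUMNS.values()})
        for rc in psr.get("relationshipCharacteristic", []):
            col = COLUMNS.get(rc["name"])
            if col is not None:
                row[col] = LABELS.get(rc["value"], "UNDEFINED")
        rows.append(row)
    return rows

payload = json.loads("""
{"id": "1", "name": "test", "productSpecificationRelationship": [
  {"id": "10499", "relationshipCharacteristic": [
    {"name": "out_of_home", "value": "1"},
    {"name": "out_of_home_ios", "value": "1"},
    {"name": "out_of_home_android", "value": "0"},
    {"name": "out_of_home_web", "value": "0"}]}]}
""")
rows = pivot_document(payload)
print(rows)
```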
Conditional aggregation can be used to pivot the result set after extracting the values with the JSON_TABLE() and JSON_VALUE() functions, for example:
SELECT JSON_VALUE(payloadentityspecification, '$.name') AS channel_map_name,
MAX(CASE WHEN name = 'out_of_home' THEN
DECODE(value,1,'TRUE',0,'FALSE','UNDEFINED')
END) AS ooh,
MAX(CASE WHEN name = 'out_of_home_android' THEN
DECODE(value,1,'TRUE',0,'FALSE','UNDEFINED')
END) AS ooh_android,
MAX(CASE WHEN name = 'out_of_home_ios' THEN
DECODE(value,1,'TRUE',0,'FALSE','UNDEFINED')
END) AS ooh_ios,
MAX(CASE WHEN name = 'out_of_home_web' THEN
DECODE(value,1,'TRUE',0,'FALSE','UNDEFINED')
END) AS ooh_web
FROM EntitySpecification ES,
JSON_TABLE (payloadentityspecification, '$.productSpecificationRelationship[*]'
COLUMNS ( NESTED PATH '$.relationshipCharacteristic[*]'
COLUMNS (
description VARCHAR2(250) PATH '$.description',
name VARCHAR2(250) PATH '$.name',
value VARCHAR2(250) PATH '$.value'
)
)) jt
WHERE payloadentityspecification IS JSON
GROUP BY JSON_VALUE(payloadentityspecification, '$.name')
Demo

SQL query to match and return all occurrences of search string

I have a JSON document in a column (record) of a table (TABLE), as below. I need to write a SQL query that returns all occurrences of the values of the fields "a", "b" and "k" within aaagroup.
Result should be:
NAME1  age1  comment1
NAME2  age2
NAME3        comment3
JSON data:
{
"reportfile": {
"aaa": {
"aaagroup": [{
"a": "NAME1",
"b": "age1",
"k": "comment1"
},
{
"a": "NAME2",
"b": "age2"
},
{
"a": "NAME3",
"k": "comment3"
}]
},
"dsa": {
"dsagroup": [{
"j": "Name"
},
{
"j": "Title"
}]
}
}
}
I used the below query for a single occurrence:
Data:
{"reportfile":{"aaa":{"aaagroup":[{"a":"NAME1","k":"age1"}]},"dsa":{"dsagroup":[{"j":"USERNAME"}],"l":"1","m":"1"}}}
Query:
select
substr(cc.BUS_NME, 1, strpos(cc.BUS_NME,'"')-1) as BUS_NME,
substr(cc.AGE, 1, strpos(cc.AGE,'"')-1) as AGE
from
(select substr(bb.aaa,strpos(bb.aaa,'"a":"')+5) as BUS_NME,
substr(bb.aaa,strpos(bb.aaa,'"k":"')+5) as AGE
from
(select substr(aa.G, strpos(aa.G,'"aaagroup'),strpos(aa.G,'},')) as aaa
from
(select substr(record, strpos(record,'"aaagroup')) as G
from TABLE) aa) bb) cc
If I understand your question correctly, you have an external table like the one below, and you can try the following query to get the desired result from it.
Sample external table:
CREATE EXTERNAL TABLE Ext_JSON_data(
reportfile string
)
ROW FORMAT SERDE
'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES (
'serialization.format' = '1'
)
LOCATION
's3://bucket/folder/'
Query to fetch the desired result:
WITH the_table AS (
SELECT CAST(social AS MAP(VARCHAR, JSON)) AS social_data
FROM (
VALUES
(JSON '{"aaa": {"aaagroup": [{"a": "NAME1","b": "age1","k": "comment1"},{"a": "NAME2","b": "age2"},{"a": "NAME3","k": "comment3"}]},"dsa": {"dsagroup": [{"j": "Name"},{"j": "Title"}]}}')
) AS t (social)
),
cte_first_level as
(
SELECT
first_level_key
,CAST(first_level_value AS MAP(VARCHAR, JSON)) AS first_level_value
FROM the_table
CROSS JOIN UNNEST (social_data) AS t (first_level_key, first_level_value)
),
cte_second_level as
(
Select
first_level_key
,SECOND_level_key
,SECOND_level_value
from
cte_first_level
CROSS JOIN UNNEST (first_level_value) AS t (SECOND_level_key, SECOND_level_value)
)
SELECT
first_level_key
,SECOND_level_key
,SECOND_level_value
,items
,items['a'] value_of_a
,items['b'] value_of_b
,items['k'] value_of_k
from
cte_second_level
cross join unnest(cast(json_extract(SECOND_level_value, '$') AS ARRAY<MAP<VARCHAR, VARCHAR>>)) t (items)
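The shape of the rows that the final UNNEST produces for aaagroup can be reproduced in a few lines of Python, which may help when checking the SQL against sample data. This is only a sketch; `aaagroup_rows` is a hypothetical helper, and missing keys come back as None, much like the NULLs returned by the map lookups in the query.

```python
import json

record = """
{"reportfile": {
  "aaa": {"aaagroup": [
    {"a": "NAME1", "b": "age1", "k": "comment1"},
    {"a": "NAME2", "b": "age2"},
    {"a": "NAME3", "k": "comment3"}]},
  "dsa": {"dsagroup": [{"j": "Name"}, {"j": "Title"}]}}}
"""

def aaagroup_rows(record_text):
    """Return one (a, b, k) tuple per element of reportfile.aaa.aaagroup."""
    doc = json.loads(record_text)
    group = doc["reportfile"]["aaa"]["aaagroup"]
    # dict.get returns None for keys that an element does not have.
    return [(item.get("a"), item.get("b"), item.get("k")) for item in group]

rows = aaagroup_rows(record)
for row in rows:
    print(row)
```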

How to unnest bigquery field that is stored as a string?

I am trying to unnest a field but something is wrong with my query.
Sample data in my table
'1234', '{ "id" : "123" , "items" : [ { "quantity" : 1 , "product" : { "id" : "p1" , "categories" : [ "cat1","cat2","cat3"] }}] }'
There are 2 fields in the dataset: row_id and parts, where parts is a dictionary object containing list items (categories), but the datatype of parts is string. I would like the output to have an individual row for each category.
This is what I have tried but I am not getting any result back.
#standardSQL
with t as (
select "1234" as row_id, '{ "id" : "123" , "items" : [ { "quantity" : 1 , "product" : { "id" : "p1" , "categories" : [ "cat1","cat2","cat3"] }}] }' as parts
)
select row_id, _categories
from t,
UNNEST(REGEXP_EXTRACT_ALL(JSON_EXTRACT(parts, '$.items'), r'"categories":"(.+?)"')) _categories
expected result
id, _categories
1234, cat1
1234, cat2
1234, cat3
Below is for BigQuery Standard SQL
#standardSQL
WITH t AS (
SELECT "1234" AS row_id, '{ "id" : "123" , "items" : [ { "quantity" : 1 , "product" : { "id" : "p1" , "categories" : [ "cat1","cat2","cat3"] }}] }' AS parts
)
SELECT row_id, REPLACE(_categories, '"', '') _categories
FROM t, UNNEST(SPLIT(REGEXP_EXTRACT(
JSON_EXTRACT(parts, '$.items'),
r'"categories":\[(.+?)]'))
) _categories
and produces expected result
Row  row_id  _categories
1    1234    cat1
2    1234    cat2
3    1234    cat3
Update
The solution above mostly focused on fixing the regexp used in the extract, but did not address the more generic case of multiple products. The solution below addresses that more generic case:
#standardSQL
WITH t AS (
SELECT "1234" AS row_id, '''{ "id" : "123" , "items" : [
{ "quantity" : 1 , "product" : { "id" : "p1" , "categories" : [ "cat1","cat2","cat3"] }},
{ "quantity" : 2 , "product" : { "id" : "p2" , "categories" : [ "cat4","cat5","cat6"] }}
] }''' AS parts
)
SELECT row_id, REPLACE(category, '"', '') category
FROM t, UNNEST(REGEXP_EXTRACT_ALL(parts, r'"categories" : \[(.+?)]')) categories,
UNNEST(SPLIT(categories)) category
with result
Row  row_id  category
1    1234    cat1
2    1234    cat2
3    1234    cat3
4    1234    cat4
5    1234    cat5
6    1234    cat6
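As a cross-check of the expected rows, the same flattening can be done by parsing the string as JSON rather than with regular expressions. The Python sketch below is an illustration outside BigQuery; `explode_categories` is a made-up name, and the input repeats the two-product sample from the update.

```python
import json

def explode_categories(row_id, parts_text):
    """Return one (row_id, category) pair per category of every item."""
    parts = json.loads(parts_text)
    return [
        (row_id, category)
        for item in parts.get("items", [])
        for category in item.get("product", {}).get("categories", [])
    ]

parts = """{ "id" : "123" , "items" : [
  { "quantity" : 1 , "product" : { "id" : "p1" , "categories" : [ "cat1","cat2","cat3"] }},
  { "quantity" : 2 , "product" : { "id" : "p2" , "categories" : [ "cat4","cat5","cat6"] }}
] }"""

pairs = explode_categories("1234", parts)
for pair in pairs:
    print(pair)
```

Parsing the structure avoids depending on the exact spacing around the colon, which is what made the original regexp return nothing.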