Oracle 12c Json split - sql

This is the result I currently get in Oracle 12c:
Id | Start Date Range | End Date Range
-: | :--------------- | :-------------
1 | [ "2019-01-07","2019-02-17","2019-03-17"] | [ "2019-01-14","2019-02-21","2019-03-21"]
And this is what I want:
Id | Start Date Range | End Date Range
-: | :--------------- | :-------------
1 | 2019-01-07 | 2019-01-14
1 | 2019-02-17 | 2019-02-21
1 | 2019-03-17 | 2019-03-21
Earlier I asked this question for a single-column split; the link is below:
How to replace special characters and then break line in oracle
But when I add another column, I get a Cartesian product.

You can use json_table to extract the strings from the JSON arrays, presumably as actual dates:
select t.id, s.n, s.start_date, e.end_date
from your_table t
cross apply json_table (
t.start_range, '$[*]'
columns
n for ordinality,
start_date date path '$'
) s
join json_table (
t.end_range, '$[*]'
columns
n for ordinality,
end_date date path '$'
) e
on e.n = s.n
The for ordinality clauses provide an index into each array, and the join then matches up the 'related' array entries.
ID | N | START_DATE | END_DATE
-: | -: | :--------- | :--------
1 | 1 | 07-JAN-19 | 14-JAN-19
1 | 2 | 17-FEB-19 | 21-FEB-19
1 | 3 | 17-MAR-19 | 21-MAR-19
If you want strings rather than dates for some reason, you can just change the data type in the columns clauses.
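The pairing-by-position idea is not specific to Oracle. As a minimal sketch in Python (not part of the original answer; the function name `pair_ranges` is hypothetical), each array is parsed, entries are matched by index much as `for ordinality` plus the join does, and the strings are converted to dates:

```python
import json
from datetime import date


def pair_ranges(start_json: str, end_json: str):
    """Match the n-th start date with the n-th end date (hypothetical helper)."""
    starts = json.loads(start_json)
    ends = json.loads(end_json)
    # zip pairs entries by position and, like the inner join on n,
    # drops entries that have no counterpart in the other array.
    return [
        (n, date.fromisoformat(s), date.fromisoformat(e))
        for n, (s, e) in enumerate(zip(starts, ends), start=1)
    ]


rows = pair_ranges(
    '[ "2019-01-07","2019-02-17","2019-03-17"]',
    '[ "2019-01-14","2019-02-21","2019-03-21"]',
)
for n, start, end in rows:
    print(n, start, end)
```

Because the match is purely positional, two arrays of different lengths silently lose their unmatched tail, both here and in the inner join.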
db<>fiddle

You can also do it with regexp_substr and connect by after removing the [, ] and " characters (replacing them with empty strings).
Schema and insert statements:
create table testtable(Id int, Start_Date_Range varchar(500), End_Date_Range varchar(500));
insert into testtable values(1 ,'[ "2019-01-07","2019-02-17","2019-03-17"]', '[ "2019-01-14","2019-02-21","2019-03-21"]');
Query:
select distinct id, trim(regexp_substr(replace(replace(replace(Start_Date_Range,'"',''),'[',''),']',''),'[^,]+', 1, level) ) Start_Date_Range,
trim(regexp_substr(replace(replace(replace(end_Date_Range,'"',''),'[',''),']',''),'[^,]+', 1, level) ) End_Date_Range,
level
from testtable
connect by regexp_substr(Start_Date_Range, '[^,]+', 1, level) is not null
order by id, level;
Output:
ID | START_DATE_RANGE | END_DATE_RANGE | LEVEL
-: | :--------------- | :------------- | ----:
1 | 2019-01-07 | 2019-01-14 | 1
1 | 2019-02-17 | 2019-02-21 | 2
1 | 2019-03-17 | 2019-03-21 | 3
db<>fiddle here

In a comment to the OP, I pointed out that the data model is not quite right. The values in the two JSON arrays are related; such data should be encoded in a single object, not two. There should be a single array of objects, each object having two members: start date and end date.
To illustrate the data model I am suggesting, I start with a sample input table (with an additional id) and use Alex Poole's answer exactly as is to generate the table the OP asked about. I then follow that with JSON generation functions to put the data back into JSON format, to show what I think the input data should look like. (The provider of the JSON string should create the JSON in this format, rather than sending two separate JSON arrays of strings representing dates.)
What I do not show here is how to use a single call to JSON_TABLE to split the data from the single array of objects created at the end of my query. That is a lot simpler than the query to get data out of two separate JSON arrays.
NOTE - this is not really an answer; I wrote it as an answer because it obviously wouldn't fit in a comment.
with
t (id, start_date_range, end_date_range) as (
select 1, '["2019-01-07","2019-02-17","2019-03-17"]',
'["2019-01-14","2019-02-21","2019-03-21"]' from dual union all
select 5, '["2020-04-23","2020-06-15"]',
'["2020-04-30","2020-06-19"]' from dual
)
, shown_as_table(id, n, start_date, end_date) as (
select t.id, s.n, to_char(s.start_date, 'yyyy-mm-dd'),
to_char(e.end_date, 'yyyy-mm-dd')
from t
cross apply json_table (
t.start_date_range, '$[*]'
columns
n for ordinality,
start_date date path '$'
) s
join json_table (
t.end_date_range, '$[*]'
columns
n for ordinality,
end_date date path '$'
) e
on e.n = s.n
)
select id, json_arrayagg(
json_object('start' value start_date, 'end' value end_date)
format json
order by n
) as date_range_array
from shown_as_table
group by id
;
Output:
ID DATE_RANGE_ARRAY
-- -------------------------------------------------------------------------------------------------------------------------------
1 [{"start":"2019-01-07","end":"2019-01-14"},{"start":"2019-02-17","end":"2019-02-21"},{"start":"2019-03-17","end":"2019-03-21"}]
5 [{"start":"2020-04-23","end":"2020-04-30"},{"start":"2020-06-15","end":"2020-06-19"}]
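To make the "a lot simpler" claim concrete, here is a hedged sketch in Python rather than JSON_TABLE (the helper name `split_ranges` is hypothetical). With one array of objects, a single pass over the array yields each start/end pair, and no index matching between two arrays is needed:

```python
import json


def split_ranges(range_json: str):
    """Return (n, start, end) tuples from a single array of objects."""
    return [
        (n, item["start"], item["end"])
        for n, item in enumerate(json.loads(range_json), start=1)
    ]


# The DATE_RANGE_ARRAY value for id 5 from the output above.
doc = (
    '[{"start":"2020-04-23","end":"2020-04-30"},'
    '{"start":"2020-06-15","end":"2020-06-19"}]'
)
ranges = split_ranges(doc)
print(ranges)
```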

Related

Oracle - Split the parameter by comma and check if the parameter exist in Column

I am new to Oracle and not sure if there are any inbuilt functions to do this task.
I have a column that contains Product_ID's separated by comma.
Product_ID
123,234,546,789,487
I am passing a list of Product_ID's separated by a comma as varchar2.
so, I am passing "234,789" as varchar2.
I want to find if 234 and 789 exist in that column and if exists get that row.
How can I do this?
If you want to check that all the values in your input list are included in the column then you can use:
SELECT *
FROM table_name t
WHERE EXISTS (
WITH input ( value ) AS (
SELECT '123,789' FROM DUAL -- Your input value
)
SELECT 1
FROM input
WHERE ','||t.product_id||',' LIKE '%,' || REGEXP_SUBSTR( value, '[^,]+', 1, LEVEL ) || ',%'
CONNECT BY LEVEL <= REGEXP_COUNT( value, '[^,]+' )
HAVING COUNT(*) = REGEXP_COUNT( value, '[^,]+' )
)
Which, for the sample data:
CREATE TABLE table_name ( Product_ID ) AS
SELECT '123,234,546,789,487' FROM DUAL
Outputs:
| PRODUCT_ID |
| :------------------ |
| 123,234,546,789,487 |
If you want to check that at least one value in your input list is in the column then you can use the same query without the line containing the HAVING clause.
db<>fiddle here
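The comma-wrapping trick in the LIKE condition is easy to get wrong, so here is the same logic as a small Python sketch (an illustration only; `contains_ids` and its `require_all` flag are hypothetical names). Wrapping both sides in commas prevents a partial match such as 23 inside 123, and the flag plays the role of keeping or dropping the HAVING line:

```python
def contains_ids(column_value: str, wanted: str, require_all: bool = True) -> bool:
    """Check whether IDs from `wanted` occur in the delimited `column_value`."""
    # Wrap in commas so that '23' cannot match inside '123' or '234'.
    haystack = f",{column_value},"
    hits = [f",{item}," in haystack for item in wanted.split(",") if item]
    # all() corresponds to the HAVING COUNT(*) check, any() to omitting it.
    return all(hits) if require_all else any(hits)


product_ids = "123,234,546,789,487"
print(contains_ids(product_ids, "234,789"))
print(contains_ids(product_ids, "234,999"))
print(contains_ids(product_ids, "234,999", require_all=False))
```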
Here's one way - making the comma-separated number lists into JSON arrays so that we can split them using json_table, then re-aggregating as nested tables so that we can compare with the submultiset operator:
create type table_of_pid as table of number;
/
with
sample_data (product_id) as (
select '123,234,546,789,487' from dual union all
select '333,444,555,666,888' from dual
)
, user_input (product_list) as (
select '234,789' from dual
)
select *
from sample_data
where ( select cast(collect(pid) as table_of_pid) as input_pid
from user_input cross apply
json_table('[' || product_list || ']', '$[*]'
columns pid number path '$')
)
submultiset
( select cast(collect(pid) as table_of_pid) as input_pid
from json_table('[' || product_id || ']', '$[*]'
columns pid number path '$')
)
;
PRODUCT_ID
-------------------
123,234,546,789,487
Your inputs violate First Normal Form, the most basic sanity requirement in a relational database. If the data was in normal form, you wouldn't need any of the JSON trickery. Still, the aggregation into collection and the submultiset comparison would be the correct approach even if the data was already in normal form.
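For readers unfamiliar with submultiset, the comparison can be sketched in Python with `collections.Counter` (an analogy only, not Oracle semantics in every detail; `is_submultiset` is a hypothetical name). Unlike a plain set test, a multiset test also respects how many times each value occurs:

```python
from collections import Counter


def is_submultiset(needle: str, haystack: str) -> bool:
    """True if every value in `needle` occurs at least as often in `haystack`."""
    need = Counter(int(x) for x in needle.split(","))
    have = Counter(int(x) for x in haystack.split(","))
    # Counter subtraction keeps only positive counts, so an empty result
    # means nothing in `need` is missing from `have`.
    return not (need - have)


print(is_submultiset("234,789", "123,234,546,789,487"))
print(is_submultiset("234,789", "333,444,555,666,888"))
```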
It is a bad idea to store comma-separated values in a single column.
One option to do what you asked for is in the following example; read comments within code. Note that - for large tables - performance WILL suffer.
SQL> set ver off
SQL>
SQL> with
2 test (product_id) as
3 -- your sample table
4 (select '123,234,546,789,487' from dual union all
5 select '111,222,333' from dual
6 ),
7 test_split as
8 -- you have to split it into rows
9 (select product_id,
10 regexp_substr(product_id, '[^,]+', 1, column_value) val
11 from test
12 cross join table(cast(multiset(select level from dual
13 connect by level <= regexp_count(product_id, ',') + 1
14 ) as sys.odcinumberlist))
15 ),
16 parameter_split as
17 -- split input parameter into rows as well
18 (select regexp_substr('&&par_id', '[^,]+', 1, level) val
19 from dual
20 connect by level <= regexp_count('&&par_id', ',') + 1
21 )
22 -- join split values, return the result
23 select distinct t.product_id
24 from test_split t join parameter_split p on p.val = t.val;
Enter value for par_id: 123,546
PRODUCT_ID
-------------------
123,234,546,789,487
SQL> undefine par_id
SQL> /
Enter value for par_id: 333
PRODUCT_ID
-------------------
111,222,333
SQL>

Split SQL string with specific string instead of separator?

I have table that looks like:
|ID | String
|546 | 1,2,1,5,7,8
|486 | 2,4,8,1,5,1
|465 | 18,11,20,1,4,18,11
|484 | 11,10,11,12,50,11
I want to split the string to this:
|ID | String
|546 | 1,2
|546 | 1,5
|486 | 1,5,1
|486 | 1
|465 | 1,4
My goal is to show ID and all the strings starting with 1 with just the next number after them.
I filtered all rows without '%1,%' and I don't know how to continue.
If you use SQL Server 2016+, you may try to use a JSON-based approach. You need to transform the data into a valid JSON array and parse the JSON array with OPENJSON(). Note that STRING_SPLIT() is not an option here, because as is mentioned in the documentation, the output rows might be in any order and the order is not guaranteed to match the order of the substrings in the input string.
Table:
CREATE TABLE Data (
ID int,
[String] varchar(100)
)
INSERT INTO Data
(ID, [String])
VALUES
(546, '1,2,1,5,7,8'),
(486, '2,4,8,1,5,1'),
(465, '18,11,20,1,4,18,11'),
(484, '11,10,11,12,50,11')
Statement:
SELECT
ID,
CONCAT(FirstValue, ',', SecondValue) AS [String]
FROM (
SELECT
d.ID,
j.[value] As FirstValue,
LEAD(j.[value]) OVER (PARTITION BY d.ID ORDER BY CONVERT(int, j.[key])) AS SecondValue
FROM Data d
CROSS APPLY OPENJSON(CONCAT('[', d.[String], ']')) j
) t
WHERE t.FirstValue = '1'
Result:
----------
ID String
----------
465 1,4
486 1,5
486 1,
546 1,2
546 1,5
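The statement relies on two things: the array index from OPENJSON preserves the original order, and LEAD() fetches the next element. A minimal Python sketch of the same logic (illustrative only; `pairs_after` is a hypothetical name) shows why the last row for ID 486 ends in a bare comma:

```python
def pairs_after(value_list: str, marker: str = "1"):
    """Pair every element equal to `marker` with the element that follows it."""
    parts = value_list.split(",")
    result = []
    for i, part in enumerate(parts):
        if part == marker:
            # Like LEAD(): take the following element, or nothing at the end,
            # which mirrors CONCAT() treating NULL as an empty string.
            following = parts[i + 1] if i + 1 < len(parts) else ""
            result.append(f"{part},{following}")
    return result


for row_id, text in [(546, "1,2,1,5,7,8"), (486, "2,4,8,1,5,1")]:
    print(row_id, pairs_after(text))
```

The comparison is on the whole element, so 11 and 18 are not treated as starting with 1.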
Something like this might work:
SELECT ID, S.value
FROM Data
CROSS APPLY STRING_SPLIT(REPLACE(',' + String, ',1,', '#1,'), '#') AS S
WHERE value LIKE '1,%'

How to use REGEXP_SUBSTR properly?

Currently in my select statement I have id and value. The value is json which looks like this:
{"layerId":"nameOfLayer","layerParams":{some unnecessary data}}
I would like to have in my select id and nameOfLayer so the output would be for example:
1, layerName
2, layerName2
etc.
The JSON always has the same structure, so layerId is always the first key.
Could you tell me how can I use REGEXP_SUBSTR properly in my select query which looks like this now?
select
id,
value
from
...
where
table1.id = table2.bookmark_id
and ...;
In Oracle 11g, you can extract the layerId using the following regular expression, where js is the name of your JSON column:
regexp_replace(js, '^.*"layerId":"([^"]+).*$', '\1')
This basically extracts the string between double quotes after "layerId":.
In more recent versions, you would add a check constraint on the table to ensure that the document is valid JSON, and then use the dot notation to access the object attribute as follows:
create table mytable (
id int primary key,
js varchar2(200),
constraint ensure_js_is_json check (js is json)
);
insert into mytable values (1, '{"layerId":"nameOfLayer","layerParams":{} }');
select id, t.js.layerId from mytable t;
Demo on DB Fiddle:
ID | LAYERID
-: | :----------
1 | nameOfLayer
Don't use regular expressions; use a JSON_TABLE or JSON_VALUE to parse JSON:
Oracle 18c Setup:
CREATE TABLE test_data (
id INTEGER,
value VARCHAR2(4000)
);
INSERT INTO test_data ( id, value )
SELECT 1, '{"layerId":"nameOfLayer","layerParams":{"some":"unnecessary data"}}' FROM DUAL UNION ALL
SELECT 2, '{"layerParams":{"layerId":"NOT THIS ONE!"},"layerId":"nameOfLayer"}' FROM DUAL UNION ALL
SELECT 3, '{"layerId":"Name with \"Quotes\"","layerParams":{"layerId":"NOT THIS ONE!"}}' FROM DUAL;
Query 1:
SELECT t.id,
j.layerId
FROM test_data t
CROSS JOIN
JSON_TABLE(
t.value,
'$'
COLUMNS (
layerId VARCHAR2(50) PATH '$.layerId'
)
) j
Query 2:
If you only want a single value you could, alternatively, use JSON_VALUE:
SELECT id,
JSON_VALUE( value, '$.layerId' ) AS layerId
FROM test_data
Both queries output:
ID | LAYERID
-: | :-----------------
1 | nameOfLayer
2 | nameOfLayer
3 | Name with "Quotes"
Query 3:
You can try regular expressions but they do not always work as expected:
SELECT id,
REPLACE(
REGEXP_SUBSTR( value, '[{,]"layerId":"((\\"|[^"])*)"', 1, 1, NULL, 1 ),
'\"',
'"'
) AS layerID
FROM test_data
Output:
ID | LAYERID
-: | :-----------------
1 | nameOfLayer
2 | NOT THIS ONE!
3 | Name with "Quotes"
So if you can guarantee that no-one is going to put data into the database where the JSON is in a different order then this may work; however the JSON specification allows key-value pairs to be in any order so regular expressions are not a general solution that will parse every JSON string. You should be using a proper JSON parser and there are 3rd party solutions available for Oracle 11g or you can upgrade to Oracle 12c where there is a native solution.
db<>fiddle here
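The difference between the parser and the regular expression can be reproduced outside Oracle. The following Python sketch (an illustration, not the answer's code; `by_parser`, `by_regex` and `LAYER_RE` are hypothetical names) runs both approaches over the three sample documents. The pattern is a port of the one in Query 3 to Python's `re` syntax:

```python
import json
import re

samples = [
    '{"layerId":"nameOfLayer","layerParams":{"some":"unnecessary data"}}',
    '{"layerParams":{"layerId":"NOT THIS ONE!"},"layerId":"nameOfLayer"}',
    r'{"layerId":"Name with \"Quotes\"","layerParams":{"layerId":"NOT THIS ONE!"}}',
]

# Port of the Query 3 pattern: a layerId key right after '{' or ','.
LAYER_RE = re.compile(r'[{,]"layerId":"((?:\\"|[^"])*)"')


def by_parser(doc: str) -> str:
    """Read the top-level layerId with a real JSON parser."""
    return json.loads(doc)["layerId"]


def by_regex(doc: str):
    """Take the first textual match, wherever it sits in the document."""
    match = LAYER_RE.search(doc)
    return match.group(1).replace('\\"', '"') if match else None


for doc in samples:
    print(by_parser(doc), "|", by_regex(doc))
```

The second document is the one where the two disagree: the regular expression finds the nested key first because it has no notion of nesting depth.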
I think you can use regexp_substr like this:
regexp_substr(str, '[^"]+',1,2) as layer_id,
regexp_substr(str, '[^"]+',1,4) as layername
Db<>fiddle demo
Cheers!!

(SQLite) Selecting # separated data as multiple rows

I have a table which contains foreign keys concatenated (separator #) in a field. I want to transform the data to one row per FK so that I can do a join on the data.
My table looks like this:
ARCHE
id_a | str_ids
The str_ids field contains concatenated FKs as follows: #id1#id2#id4#id7#
(There is a different number of aggregated IDs for each row)
I am not really familiar with SQLite, and I have trouble finding the equivalent. I understood I have to do this "with recursive" but it seems I can't get the hang of this.
The oracle equivalent of what I am looking for is as follows:
select
id_a
,trim(regexp_substr(str_ids, '[^#]+', 1, LEVEL)) as id_b
from arche
connect by trim(regexp_substr(str_ids, '[^#]+', 1, LEVEL)) IS NOT NULL
In SQLite you can use a recursive common table expression to solve this. The recursive CTE selects from the original table and splits the strings into subparts, which the main query selects from.
WITH RECURSIVE cte(id, val, etc) AS(
SELECT id_a, '', str_ids FROM arche
UNION ALL
SELECT
id
, SUBSTR(etc, 0, INSTR(etc, '#'))
, SUBSTR(etc, INSTR(etc, '#')+1)
FROM cte
WHERE etc <> ''
)
SELECT id AS id_a, REPLACE(val, 'id', '') AS id_b
FROM cte
WHERE val <> ''
ORDER BY id, val
Here is an example :
Schema (SQLite v3.26)
Query #1
WITH RECURSIVE cte(id, val, etc) AS(
SELECT 1, '', '#id1#id2#id4#id7#'
UNION ALL
SELECT
id
, SUBSTR(etc, 0, INSTR(etc, '#'))
, SUBSTR(etc, INSTR(etc, '#')+1)
FROM cte
WHERE etc <> ''
)
SELECT id AS id_a, val AS id_b
FROM cte
WHERE val <> ''
ORDER BY id, val;
| id_a | id_b |
| ---- | ---- |
| 1 | id1 |
| 1 | id2 |
| 1 | id4 |
| 1 | id7 |
View on DB Fiddle
NB:
REGEXP_REPLACE does not exist in SQLite, so I replaced it with REPLACE.
You need a # at the end of the string for this to work (having two # is OK, too).
This is not a very performant approach; if you have lots of rows to process, it might not scale well.
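To try the recursive CTE without a fiddle, it can be run against an in-memory database with Python's standard `sqlite3` module. This is a sketch under assumptions: the second sample row is invented for illustration, and the query is a slight variant that appends a # to every string, so a missing trailing separator does not cause endless recursion.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE arche (id_a INTEGER, str_ids TEXT)")
conn.executemany(
    "INSERT INTO arche VALUES (?, ?)",
    # The second row is made up and deliberately lacks the trailing '#'.
    [(1, "#id1#id2#id4#id7#"), (2, "#id3#id9")],
)

SPLIT_SQL = """
WITH RECURSIVE cte(id, val, etc) AS (
    -- Append '#' so the remaining text always contains a separator.
    SELECT id_a, '', str_ids || '#' FROM arche
    UNION ALL
    SELECT id,
           SUBSTR(etc, 1, INSTR(etc, '#') - 1),  -- text before the next '#'
           SUBSTR(etc, INSTR(etc, '#') + 1)      -- text after the next '#'
    FROM cte
    WHERE etc <> ''
)
SELECT id AS id_a, val AS id_b
FROM cte
WHERE val <> ''
ORDER BY id, val
"""

rows = conn.execute(SPLIT_SQL).fetchall()
for row in rows:
    print(row)
```

Each recursion step peels one ID off the front of `etc`, so the number of steps per row equals the number of separators in that row.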

Convert an array into a Map

I have a table with a column like
[{"key":"e","value":["253","203","204"]},{"key":"st","value":["mi"]},{"key":"k2","value":["1","2"]}]
Which is of the format array<struct<key:string,value:array<string>>>
I want to convert the column into below format :
{"e":["253","203","204"],"st":["mi"],"k2":["1","2"]}
which is of the type map<string,array<string>>
I have tried exploding the array, but that does not work. Any ideas how I can do this in Hive?
Without use of external libraries it's impossible. Please refer to brickhouse or create your own UDAF.
Note: the code below provides snippets that reproduce the problem and solve the part of it that Hive's built-in functions can handle, i.e. producing map<string,string>, not map<string,array<string>>.
-- reproducing the problem
CREATE TABLE test_table(id INT, input ARRAY<STRUCT<key:STRING,value:ARRAY<STRING>>>);
INSERT INTO TABLE test_table
SELECT
1 AS id,
ARRAY(
named_struct("key","e", "value", ARRAY("253","203","204")),
named_struct("key","st", "value", ARRAY("mi")),
named_struct("key","k2", "value", ARRAY("1", "2"))
) AS input;
SELECT id, input FROM test_table;
+-----+-------------------------------------------------------------------------------------------------------+--+
| id | input |
+-----+-------------------------------------------------------------------------------------------------------+--+
| 1 | [{"key":"e","value":["253","203","204"]},{"key":"st","value":["mi"]},{"key":"k2","value":["1","2"]}] |
+-----+-------------------------------------------------------------------------------------------------------+--+
With exploding and using STRUCT features, we can split the keys and values.
SELECT id, exploded_input.key, exploded_input.value
FROM (
SELECT id, exploded_input
FROM test_table LATERAL VIEW explode(input) d AS exploded_input
) x;
+-----+------+----------------------+--+
| id | key | value |
+-----+------+----------------------+--+
| 1 | e | ["253","203","204"] |
| 1 | st | ["mi"] |
| 1 | k2 | ["1","2"] |
+-----+------+----------------------+--+
The idea is to use your UDAF to "collect" a map while aggregating on id.
What Hive can do with built-in functions is generate a map<string,string>: convert each row to a string using one special delimiter, aggregate the rows using another special delimiter, and then apply the built-in str_to_map function with those delimiters.
SELECT
id,
str_to_map(
-- outputs: e:253,203,204#st:mi#k2:1,2 with delimiters between aggregated rows
concat_ws('#', collect_list(list_to_string)),
'#', -- first delimiter
':' -- second delimiter
) mapped_output
FROM (
SELECT
id,
-- outputs 3 rows: (e:253,203,204), (st:mi), (k2:1,2)
CONCAT(exploded_input.key,':' , CONCAT_WS(',', exploded_input.value)) as list_to_string
FROM (
SELECT id, exploded_input
FROM test_table LATERAL VIEW explode(input) d AS exploded_input
) x
) y
GROUP BY id;
Which outputs a string to string map like:
+-----+-------------------------------------------+--+
| id | mapped_output |
+-----+-------------------------------------------+--+
| 1 | {"e":"253,203,204","st":"mi","k2":"1,2"} |
+-----+-------------------------------------------+--+
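The two-delimiter round trip is easier to follow outside Hive. Below is a rough Python analogue (an illustration only; this `str_to_map` is a hypothetical stand-in that mimics the idea of the Hive function, not its exact behaviour). It shows both the result and the limitation: the values come back as comma-joined strings, not arrays.

```python
def str_to_map(text: str, pair_delim: str = "#", kv_delim: str = ":") -> dict:
    """Split `text` into pairs, then split each pair into key and value."""
    result = {}
    for pair in text.split(pair_delim):
        key, _, value = pair.partition(kv_delim)
        result[key] = value
    return result


# The exploded rows from the query above: one (key, values) pair per row.
rows = [("e", ["253", "203", "204"]), ("st", ["mi"]), ("k2", ["1", "2"])]

# Equivalent of CONCAT / CONCAT_WS per row followed by concat_ws('#', collect_list(...)).
flattened = "#".join(f"{key}:{','.join(values)}" for key, values in rows)
mapped = str_to_map(flattened)
print(flattened)
print(mapped)
```

The approach breaks if a key or value itself contains one of the delimiters, so the delimiters have to be chosen with the data in mind.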
with input_set as (
select array(named_struct('key','e','value',array('253','203','204')),named_struct('key','st','value',array('mi')),named_struct('key','k2','value',array('1','2'))) as input_array
), break_input_set as (
select y.col_num as y_col_num,y.col_value as y_col_value from input_set lateral view posexplode(input_set.input_array) y as col_num, col_value
), create_map as (
select map(y_col_value.key,y_col_value.value) as final_map from break_input_set
)
select * from create_map;
// Do not name the variable Array: that would shadow the built-in Array constructor.
const entries = [{"key":"e","value":["253","203","204"]},{"key":"st","value":["mi"]},{"key":"k2","value":["1","2"]}];
const obj = {};
for (const entry of entries) {
  obj[entry.key] = entry.value;
}
obj will be in the required format