delimit the output of xml extract function in oracle [duplicate] - sql

I have a CLOB column that contains XML type data. For example XML data is:
<A><B>123</b><C>456</C><B>789</b></A>
I have tried the concat function:
concat(xmltype (a.xml).EXTRACT ('//B/text()').getStringVal (),';'))
or
xmltype (a.xml).EXTRACT (concat('//B/text()',';').getStringVal ()))
But they are giving ";" at end only not after each <B> tag.
I am currently using
xmltype (a.xml).EXTRACT ('//B/text()').getStringVal ()
I want to concatenate all <B> with ; and expected result should be 123;789
Please suggest me how can I concatenate my data.

The concat() SQL function concatenates two values, so it's just appending the semicolon to each extracted value independently. But you're really trying to do string aggregation of the results (which could, presumably, really be more than two extracted values).
You can use XMLQuery instead of extract, and use an XPath string-join() function to do the concatentation:
XMLQuery('string-join(/A/B, ";")' passing xmltype(a.xml) returning content)
Demo with fixed XMl end-node tags:
-- CTE for sample data
with a (xml) as (
select '<A><B>123</B><C>456</C><B>789</B></A>' from dual
)
-- actual query
select XMLQuery('string-join(/A/B, ";")' passing xmltype(a.xml) returning content) as result
from a;
RESULT
------------------------------
123;789
You could also extract all of the individual <B> values using XMLTable, and then use SQL-level aggregation:
-- CTE for sample data
with a (xml) as (
select '<A><B>123</B><C>456</C><B>789</B></A>' from dual
)
-- actual query
select listagg(x.b, ';') within group (order by null) as result
from a
cross join XMLTable('/A/B' passing xmltype(a.xml) columns b number path '.') x;
RESULT
------------------------------
123;789
which gives you more flexibility and would allow grouping by other node values more easily, but that doesn't seem to be needed here based on your example value.

Related

Hive sql extract one to multiple values from key value pairs

I have a column that looks like:
[{"key_1":true,"key_2":true,"key_3":false},{"key_1":false,"key_2":false,"key_3":false},...]
There can be 1 to many items described by parameters in {} in the column.
I would like to extract values only of parameters described by key_1. Is there a function for that? I tried so far json related functions (json_tuple, get_json_object) but each time I received null.
Consider below json path.
WITH sample_data AS (
SELECT '[{"key_1":true,"key_2":true,"key_3":false},{"key_1":false,"key_2":false,"key_3":false}]' json
)
SELECT get_json_object(json, '$[*].key_1') AS key1_values FROM sample_data;
Query results

Parsing within a field using SQL

We are receiving data in one column where further parsing is needed. In this example the separator is ~.
Goal is to grab the pass or fail value from its respective pair.
SL
Data
1
"PARAM-0040,PASS~PARAM-0045,PASS~PARAM-0070,PASS"
2
"PARAM-0040,FAIL~PARAM-0045,FAIL~PARAM-0070,PASS"
Required outcome:
SL
PARAM-0040
PARAM-0045
PARAM-0070
1
PASS
PASS
PASS
2
FAIL
FAIL
PASS
This will be a part of a bigger SQL query where we are selecting many other columns, and these three columns are to be picked up from the source as well and passed in the query as selected columns.
E.g.
Select Column1, Column2, [ Parse code ] as PARAM-0040, [ Parse code ] as PARAM-0045, [ Parse code ] as PARAM-0070, Column6 .....
Thanks
You can do that with a regular expression. But regexps are non-standard.
This is how it is done in postgresql: REGEXP_MATCHES()
https://www.postgresqltutorial.com/postgresql-regexp_matches/
In postgresql regexp_matches returns zero or more values. So then it has to be broken down (thus the {})
A simpler way, also in postgresql is to use substring.
substring('foobar' from 'o(.)b')
Like:
select substring('PARAM-0040,PASS~PARAM-0045,PASS~PARAM-0070,PASS' from 'PARAM-0040,([^~]+)~');
substring
-----------
PASS
(1 row)
You may use the str_to_map function to split your data and subsequently extract each param's value. This example will first split each param/value pair by ~ before splitting the parameter and value by ,.
Reproducible example with your sample data:
WITH my_table AS (
SELECT 1 as SL, "PARAM-0040,PASS~PARAM-0045,PASS~PARAM-0070,PASS" as DATA
UNION ALL
SELECT 2 as SL, "PARAM-0040,FAIL~PARAM-0045,FAIL~PARAM-0070,PASS" as DATA
),
param_mapped_data AS (
SELECT SL, str_to_map(DATA,"~",",") param_map FROM my_table
)
SELECT
SL,
param_map['PARAM-0040'] AS PARAM0040,
param_map['PARAM-0045'] AS PARAM0045,
param_map['PARAM-0070'] AS PARAM0070
FROM
param_mapped_data
Actual code assuming your table is named my_table
WITH param_mapped_data AS (
SELECT SL, str_to_map(DATA,"~",",") param_map FROM my_table
)
SELECT
SL,
param_map['PARAM-0040'] AS PARAM0040,
param_map['PARAM-0045'] AS PARAM0045,
param_map['PARAM-0070'] AS PARAM0070
FROM
param_mapped_data
Outputs:
sl
param0040
param0045
param0070
1
PASS
PASS
PASS
2
FAIL
FAIL
PASS

unnest() not exploding array, returns error Column alias list has 1 entries but 't' has 2 columns available

I have some json data which includes a property 'characters' and it looks like this:
select json_data['characters'] from latest_snapshot_events
Returns: [{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":60,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":10,"shards":0,"CHAR_TPIECES":0,"CHAR_A5_LVL":0,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":3},{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":50,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":39,"shards":0,"CHAR_TPIECES":0,"CHAR_A5_LVL":0,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":2},{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":80,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":6801450488388220,"shards":0,"CHAR_TPIECES":0,"CHAR_A5_LVL":1,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":4},{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":85,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":8355588830097610,"shards":0,"CHAR_TPIECES":5,"CHAR_A5_LVL":0,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":4}]
This is returned on a single row. I would like a single row for each item within the array.
I found several SO posts and other blogs advising me to use unnest(). I've tried this several times and cannot get a result to return. For example, here is the documentation from presto. The bottom covers unnest as a stand in for hive's lateral view explode:
SELECT student, score
FROM tests
CROSS JOIN UNNEST(scores) AS t (score);
So I tried to apply this to my table:
characters as (
select
jdata.characters
from latest_snapshot_events
cross join unnest(json_data) as t(jdata)
)
select * from characters;
where json_data is the field in latest_snapshot_events that contains the the property 'characters' which is an array like the one shown above.
This returns an error:
[Simba]AthenaJDBC An error has been thrown from the AWS Athena client. SYNTAX_ERROR: line 69:12: Column alias list has 1 entries but 't' has 2 columns available
How can I unnest/explode latest_snapshot_events.json_data['characters'] onto multiple rows?
Since characters is a JSON array in textual representation, you'll have to:
Parse the JSON text with json_parse to produce a value of type JSON.
Convert the JSON value into a SQL array using CAST.
Explode the array using UNNEST.
For instance:
WITH data(characters) AS (
VALUES '[{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":60,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":10,"shards":0,"CHAR_TPIECES":0,"CHAR_A5_LVL":0,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":3},{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":50,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":39,"shards":0,"CHAR_TPIECES":0,"CHAR_A5_LVL":0,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":2},{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":80,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":6801450488388220,"shards":0,"CHAR_TPIECES":0,"CHAR_A5_LVL":1,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":4},{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":85,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":8355588830097610,"shards":0,"CHAR_TPIECES":5,"CHAR_A5_LVL":0,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":4}]'
)
SELECT entry
FROM data, UNNEST(CAST(json_parse(characters) AS array(json))) t(entry)
which produces:
entry
-----------------------------------------------------------------------
{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":60,"CHAR_A3_LVL":1,...
{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":50,"CHAR_A3_LVL":1,...
{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":80,"CHAR_A3_LVL":1,...
{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":85,"CHAR_A3_LVL":1,...
In the example above, I convert the JSON value into an array(json), but
you can further convert it to something more concrete if the values inside each
array entry have a regular schema. For example, for your data, it is
possible to cast it to an array(map(varchar, json)) since every element in the
array is a JSON object.
json_parse works if your initial data is a JSON string. However, for array(row) types (i.e. an array of objects/dictionaries), casting to array(json) will convert each row into an array, removing all keys from the object and preventing you from using dot notation or json_extract functions.
To unnest array(row) data, the syntax is much simpler:
CROSS JOIN UNNEST(my_array) AS my_row
I got stuck with this error trying to unpivot data.
This might help someone:
SELECT a_col, b_col
FROM
(
SELECT MAP(
ARRAY['a', 'b', 'c', 'd'],
ARRAY[1, 2, 3, 4]
) my_col
) CROSS JOIN UNNEST(my_col) as t(a_col, b_col)
t() allows you define multiple columns as outputs.

Split Clob containing XML

I have a table containing, in a clob column, a value like this: <root><node><a>text1a</a><b>text1b</b></node><node><a>text2a</a><b>text2b</b></node></root>
Using PL/SQL I need to query it and obtain this output in two rows:
<node><a>text1a</a><b>text1b</b></node>
<node><a>text2a</a><b>text2b</b></node>
It could be more the 4000 chars each one.
Tag must be included in the output.
Convert clob to xmltype and use xmltable to parse it:
with s as (select '<root><node><a>text1a</a><b>text1b</b></node><node><a>text2a</a><b>text2b</b></node></root>' c from dual)
select x.node node_xml, x.node.getclobval() node_clob
from s,
xmltable(
'/root/node'
passing xmltype(s.c)
columns
node xmltype path '.'
) x;
NODE_XML NODE_CLOB
------------------------------------------ ------------------------------------------
<node><a>text1a</a><b>text1b</b></node> <node><a>text1a</a><b>text1b</b></node>
<node><a>text2a</a><b>text2b</b></node> <node><a>text2a</a><b>text2b</b></node>
Original data is not in varchar; it is in some LOB. If you are ok with returning 2 lobs, look up PLSQL table functions (to put a function in from clause) and utilizing dbms_lob package return 2 rows where each row is a LOB as well.
If you want to return it as varchar data, then you have the 4000 limit. All you can do is return in multiple rows of 4000 and combine them all in your client software.
You can see this link where there is a solution to split blob into 4000 byte strings. https://medium.com/#thesaadahmad/a-blobs-journey-from-the-database-to-the-browser-98884261e137

How to extract a part of a JSON String from an SQL string Base64

Situation:
I have a column of encoded Base64 strings that I would like to decode and extract a specific part of it. It seems the decoded value is in JSON format (not familiar with JSON at all)
Objective:
How can I extract the value of a dictionary part of a JSON string.
Current query:
SELECT
CONVERT(VARCHAR(MAX),CAST('' AS XML).value('xs:base64Binary(sql:column("BASE64_COLUMN"))', 'VARBINARY(MAX)')) AS RESULT
FROM
(SELECT [value] AS BASE64_COLUMN FROM #test1) as testtable
Output:
The output looks like this:
RESULT
{"a":1,"b":2,"c":3,"d":4}
Desired Output:
How can i extract the value of the first dictionary: 1 ?
I feel like we don't have all the pieces here, but, using that JSON string you can get the value for the key 'a' by simply using OPENJSON:
DECLARE #JSON nvarchar(MAX) = N'{"a":1,"b":2,"c":3,"d":4}';
SELECT [value]
FROM OPENJSON(#JSON)
WHERE [key] = 'a';
I, of course, am assuming you are using SQL Server 2016. Otherwise SQL Server does not support JSON parsing, and you'll be better using an application to do so.
For 2014-, if you just want the first node's value then you could do
SELECT V.JSON, SUBSTRING([JSON],CHARINDEX(':',[JSON])+1,CHARINDEX(',',[JSON])-(CHARINDEX(':',[JSON])+1))
FROM (VALUES(#JSON))V([JSON]);
So, for a blind guess against your table (as I can't test):
SELECT SUBSTRING([JSONString],CHARINDEX(':',[JSONString])+1,CHARINDEX(',',[JSONString])-(CHARINDEX(':',[JSONString])+1))
FROM (SELECT [value] AS BASE64_COLUMN FROM #test1) as testtable
CROSS APPLY (VALUES(CONVERT(VARCHAR(MAX),CAST('' AS XML).value('xs:base64Binary(sql:column("BASE64_COLUMN"))', 'VARBINARY(MAX)')))V(JSONString);
This assumes that every string from your expression contains both at least 1 : character and , character. If it dosen't, then we need more (representative) sample data.
I don't know how permanent/clean this solution is (in fact it's quite ugly), and in most cases, you should reconsider using json-encoded strings for these values or parse it in an application as #Larnu suggests. In any case, you could use PATINDEX() and SUBSTRING() to get the first value:
SELECT
SUBSTRING (
CONVERT(VARCHAR(MAX),CAST('' AS XML).value('xs:base64Binary(sql:column("BASE64_COLUMN"))', 'VARBINARY(MAX)')),
6,
PATINDEX(
'%,"b":%',
CONVERT(VARCHAR(MAX),CAST('' AS XML).value('xs:base64Binary(sql:column("BASE64_COLUMN"))', 'VARBINARY(MAX)'))
) -6
)
FROM
(SELECT [value] AS BASE64_COLUMN FROM #test1) as testtable;