I'm having a problem joining two tables using IN.
Example:
with nodes(node_id, mpath) as (
SELECT node_id, drugs_cls_node_view.mpath
FROM drugs_cls_entries_view
inner join drugs_cls_node_view on drugs_cls_node_view.id = node_id
WHERE mnn_id in (13575)
)
select DISTINCT n.node_id, drugs_cls_node_view.*
from nodes n
inner join drugs_cls_node_view
on drugs_cls_node_view.id in (array_replace(string_to_array(n.mpath, '/'), '', '0')::bigint[])
I get the exception:
ERROR: operator does not exist: bigint = bigint[]
With
on drugs_cls_node_view.id in
(array_replace(string_to_array(n.mpath, '/'), '', '0')::bigint[])
you look for the ID in a set containing just one element. This element is an array. The ID can never equal the array, hence the error.
You must unnest the array to have single values to compare with:
on drugs_cls_node_view.id in
(select unnest(array_replace(string_to_array(n.mpath, '/'), '', '0')::bigint[]))
Or use ANY on the array instead of IN:
on drugs_cls_node_view.id = ANY
(array_replace(string_to_array(n.mpath, '/'), '', '0')::bigint[])
There may be syntactical errors in my code, as I am no postgres guy, but it should do with maybe a little correction here or there :-)
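The difference between the two comparisons can be sketched outside the database. This is only an analogy in Python, with a hypothetical mpath value of '/1/5/13'; note that Python quietly returns False where Postgres raises the type error shown above.

```python
def parse_mpath(mpath):
    """Mimic array_replace(string_to_array(mpath, '/'), '', '0')::bigint[]."""
    parts = mpath.split("/")
    # An empty segment (from the leading '/') becomes 0, as in the query.
    return [int(p) if p != "" else 0 for p in parts]


path_ids = parse_mpath("/1/5/13")  # hypothetical mpath value

# IN (array): a one-element set whose only member is the whole array.
in_single_array = 5 in [path_ids]

# = ANY (array) or IN (select unnest(array)): compare against each element.
in_elements = 5 in path_ids

print(path_ids, in_single_array, in_elements)
```

Only the second form can ever match, which is why the ID has to be compared with the unnested values.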
I need to remove a few records (those that contain t) before I parse/flatten the data column. The query inside the CTE that creates 'tab' works on its own, but when I use it in the CTE I get the same JSON parsing error as if I had not tried to filter out the culprit rows.
with tab as (
select * from table
where data like '%t%')
select b.value::string, a.* from tab a,
lateral flatten( input => PARSE_JSON( a.data) ) b;
error:
Error parsing JSON: unknown keyword "test123", pos 8
example data:
Date Data
1-12-12 {id: 13-43}
1-12-14 {id: 43-43}
1-11-14 {test12}
1-11-14 {test2}
1-02-14 {id: 44-43}
You can replace PARSE_JSON(a.data) with TRY_PARSE_JSON(a.data), which produces NULL instead of an error for invalid input.
More at: the Snowflake documentation for TRY_PARSE_JSON
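The "NULL instead of an error" behaviour can be illustrated with a small Python sketch. The function name try_parse_json and the sample rows are hypothetical, and Python's json module is stricter than Snowflake's parser, so this shows the idea rather than the exact parsing rules.

```python
import json


def try_parse_json(text):
    """Return the parsed value, or None when the input is not valid JSON."""
    try:
        return json.loads(text)
    except (TypeError, ValueError):  # json.JSONDecodeError is a ValueError
        return None


# Hypothetical rows: two valid JSON strings and one malformed one.
rows = ['{"id": "13-43"}', '{test12}', '{"id": "44-43"}']

parsed = [try_parse_json(r) for r in rows]
valid = [p for p in parsed if p is not None]

print(parsed)
print(valid)
```

The malformed row simply yields None, so it can be skipped afterwards instead of aborting the whole query.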
I am trying to extract the following JSON into its own rows, like the table below, in a Presto query. The issue is that the key (the AV engine name) is different for each entry, and I am stuck on how to extract and iterate over the keys without knowing their names in advance.
The json is a value of a table row
{
  "Bkav":
  {
    "detected": false,
    "result": null
  },
  "Lionic":
  {
    "detected": true,
    "result": "Trojan.Generic.3611249"
  },
...
AV Engine Name   Detected Virus   Result
Bkav             false            null
Lionic           true             Trojan.Generic.3611249
I have tried to use json_extract following the documentation here https://teradata.github.io/presto/docs/141t/functions/json.html but there is no mention of extraction if we don't know the key :( I am trying to find a solution that works in both presto & hive query, is there a common query that is applicable to both?
You can cast your json to map(varchar, json) and process it with unnest to flatten:
-- sample data
WITH dataset (json_str) AS (
VALUES (
'{"Bkav":{"detected": false,"result": null},"Lionic":{"detected": true,"result": "Trojan.Generic.3611249"}}'
)
)
--query
select k "AV Engine Name", json_extract_scalar(v, '$.detected') "Detected Virus", json_extract_scalar(v, '$.result') "Result"
from (
select cast(json_parse(json_str) as map(varchar, json)) as m
from dataset
)
cross join unnest (map_keys(m), map_values(m)) t(k, v)
Output:
AV Engine Name   Detected Virus   Result
Bkav             false
Lionic           true             Trojan.Generic.3611249
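The same idea, iterating over key/value pairs without knowing the keys up front, can be sketched in Python for comparison. The helper name flatten_engines is hypothetical; the input is the sample string from the query above.

```python
import json

json_str = (
    '{"Bkav":{"detected": false,"result": null},'
    '"Lionic":{"detected": true,"result": "Trojan.Generic.3611249"}}'
)


def flatten_engines(raw):
    """Turn {engine: {detected, result}} into one (engine, detected, result) row per key."""
    engines = json.loads(raw)
    return [
        (name, info.get("detected"), info.get("result"))
        for name, info in engines.items()  # keys are discovered at run time
    ]


rows = flatten_engines(json_str)
for row in rows:
    print(row)
```

Each top-level key becomes one row, which is what the cast to map plus unnest achieves in the Presto query.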
The Presto query suggested by @Guru works, but for Hive there is no easy way.
I had to extract the JSON,
parse it with replace to remove some characters and brackets,
then convert it back to a map, and repeat once more to get the nested values out:
SELECT
av_engine,
str_to_map(regexp_replace(engine_result, '\\}', ''),',', ':') AS output_map
FROM (
SELECT
str_to_map(regexp_replace(regexp_replace(get_json_object(raw_response, '$.scans'), '\"', ''), '\\{',''),'\\},', ':') AS key_val_map
FROM restricted_antispam.abuse_malware_scanning
) AS S
LATERAL VIEW EXPLODE(key_val_map) temp AS av_engine, engine_result
I am currently figuring out how to do a bit more complex data migration in my database and whether it is even possible to do in SQL (not very experienced SQL developer myself).
Let's say that I store JSON in one of the text columns of a Postgres table, with roughly the following format:
{"type":"something","params":[{"value":"00de1be5-f75b-4072-ba30-c67e4fdf2333"}]}
Now, I would like to migrate the value part to a bit more complex format:
{"type":"something","params":[{"value":{"id":"00de1be5-f75b-4072-ba30-c67e4fdf2333","path":"/hardcoded/string"}}]}
Furthermore, I also need to reason whether the value contains a UUID pattern, and if not, use slightly different structure:
{"type":"something-else","params":[{"value":"not-id"}]} ---> {"type":"something-else","params":[{"value":{"value":"not-id","path":""}}]}
I know I can define a procedure and use REGEXP_REPLACE: REGEXP_REPLACE(source, pattern, replacement_string [, flags]), but I have no idea how to approach the reasoning about whether the content contains an ID or not. Could someone suggest at least some direction or hint on how to do this?
You can use the jsonb functions to extract the data and change it. At the end, merge the rebuilt params array back into the original object.
Sample data structure and query result: dbfiddle
select
(t.data::jsonb || jsonb_build_object(
'params',
jsonb_agg(
jsonb_build_object(
'value',
case
when e.value->>'value' ~* '^[0-9A-F]{8}-[0-9A-F]{4}-4[0-9A-F]{3}-[89AB][0-9A-F]{3}-[0-9A-F]{12}$' then
jsonb_build_object('id', e.value->>'value', 'path', '/hardcoded/string')
else
jsonb_build_object('value', e.value->>'value', 'path', '')
end
)
)
))::text
from
test t
cross join jsonb_array_elements(t.data::jsonb->'params') e
group by t.data
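PS:
If your table has an id or another unique field, you can group by it instead of t.data (Postgres accepts this when id is the primary key, because t.data is then functionally dependent on it):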
select
(t.data::jsonb || jsonb_build_object(
'params',
jsonb_agg(
jsonb_build_object(
'value',
case
when e.value->>'value' ~* '^[0-9A-F]{8}-[0-9A-F]{4}-4[0-9A-F]{3}-[89AB][0-9A-F]{3}-[0-9A-F]{12}$' then
jsonb_build_object('id', e.value->>'value', 'path', '/hardcoded/string')
else
jsonb_build_object('value', e.value->>'value', 'path', '')
end
)
)
))::text
from
test t
cross join jsonb_array_elements(t.data::jsonb->'params') e
group by t.id
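For checking the expected result of the migration outside the database, the per-element decision can be sketched in Python. The function name migrate is hypothetical; the regular expression is the same version-4 UUID pattern used in the query, and the two inputs are the examples from the question.

```python
import json
import re

# Same UUID (version 4) pattern as in the SQL, matched case-insensitively like ~*.
UUID_V4 = re.compile(
    r"^[0-9A-F]{8}-[0-9A-F]{4}-4[0-9A-F]{3}-[89AB][0-9A-F]{3}-[0-9A-F]{12}$",
    re.IGNORECASE,
)


def migrate(raw):
    """Rewrite every params[].value into the new object format."""
    doc = json.loads(raw)
    for param in doc["params"]:
        value = param["value"]
        if UUID_V4.match(value):
            param["value"] = {"id": value, "path": "/hardcoded/string"}
        else:
            param["value"] = {"value": value, "path": ""}
    # Compact separators reproduce the formatting used in the question.
    return json.dumps(doc, separators=(",", ":"))


with_id = migrate(
    '{"type":"something","params":[{"value":"00de1be5-f75b-4072-ba30-c67e4fdf2333"}]}'
)
without_id = migrate('{"type":"something-else","params":[{"value":"not-id"}]}')

print(with_id)
print(without_id)
```

Note that the pattern only accepts version 4 UUIDs; if other UUID versions occur in the data, they would fall into the non-ID branch.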
To replace values at any depth, you can use a recursive CTE to run replacements for each value of a value key, using a conditional to check if the value is a UUID, and producing the proper JSON object accordingly:
with recursive cte(v, i, js) as (
select (select array_to_json(array_agg(distinct t.i))
from (select (regexp_matches(js, '"value":("[\w\-]+")', 'g'))[1] i) t), 0, js from (select '{"type":"something","params":[{"value":"00de1be5-f75b-4072-ba30-c67e4fdf2333"}, {"value":"sdfsa"}]}' js) t1
union all
select c.v, c.i+1, regexp_replace(
regexp_replace(c.js, regexp_replace((c.v -> c.i)::text, '[\\"]+', '', 'g'),
case when not ((c.v -> c.i)::text ~ '\w+\-\w+\-\w+\-\w+\-\w+') then
json_build_object('value', regexp_replace((c.v -> c.i)::text, '[\\"]+', '', 'g'), 'path', '')::text
else json_build_object('id', regexp_replace((c.v -> c.i)::text, '[\\"]+', '', 'g'), 'path', '/hardcoded/path')::text end, 'g'),
'(")(?=\{)|(?<=\})(")', '', 'g')
from cte c where c.i < json_array_length(c.v)
)
select js from cte order by i desc limit 1
Output:
{"type":"something","params":[{"value":{"id" : "00de1be5-f75b-4072-ba30-c67e4fdf2333", "path" : "/hardcoded/path"}}, {"value":{"value" : "sdfsa", "path" : ""}}]}
On a more complex JSON input string:
{"type":"something","params":[{"value":"00de1be5-f75b-4072-ba30-c67e4fdf2333"}, {"value":"sdfsa"}, {"more":[{"additional":[{"value":"00f41be5-g75b-4072-ba30-c67e4fdf3777"}]}]}]}
Output:
{"type":"something","params":[{"value":{"id" : "00de1be5-f75b-4072-ba30-c67e4fdf2333", "path" : "/hardcoded/path"}}, {"value":{"value" : "sdfsa", "path" : ""}}, {"more":[{"additional":[{"value":{"id" : "00f41be5-g75b-4072-ba30-c67e4fdf3777", "path" : "/hardcoded/path"}}]}]}]}
I need help extracting the data below from XML messages. I have a table that stores the XML message in a CLOB column. I am trying the query below, but it does not return any data. I need to extract all the values from the XML message.
<iORDERS:iORDERS xmlns:iORDERS="urn:iORDERS-abcdonline-com:Integration:v1">
<ORDER_NOTIFY>
<MESSAGE_DATETIME>2017-06-13T12:20:51+10:00</MESSAGE_DATETIME>
<MESSAGE_SEQ>1</MESSAGE_SEQ>
<MESSAGE_TYPE>PLACED</MESSAGE_TYPE>
<ORDER_HEAD>
<ORDER_ID>1111</ORDER_ID>
<DROP_SHIP_ORDER_NO></DROP_SHIP_ORDER_NO>
<CUSTOMER_ORDER_NO>22222</CUSTOMER_ORDER_NO>
<DISPATCH_LOCATION>
<SKU>2323234</SKU>
<UPC>4549432533626</UPC>
<REQUESTED_QTY>1</REQUESTED_QTY>
<DISPATCH_ASSIGNMENT>7777</DISPATCH_ASSIGNMENT>
<PROVIDER_ID>100</PROVIDER_ID>
<PKG_TYPE>SAT</PKG_TYPE>
</DISPATCH_LOCATION>
</ORDER_HEAD>
</ORDER_NOTIFY>
</iORDERS:iORDERS>
Query:
select wo.batch_no,wo.web_service_no,x.*
from web_orders wo
cross join XMLTABLE (
XMLNAMESPACES(DEFAULT 'urn:iORDERS-abcdonline-com:Integration:v1'),
'iORDERS/ORDER_NOTIFY/ORDER_HEAD/DISPATCH_LOCATION'
passing xmltype(wo.xml_message)
columns
MESSAGE_TYPE varchar(120) path './../../../MESSAGE_TYPE') x;
You need to provide the named namespace identifier rather than a default, and your column path is going up one too many levels:
select wo.batch_no,wo.web_service_no,x.*
from web_orders wo
cross join XMLTABLE (
XMLNAMESPACES('urn:iORDERS-abcdonline-com:Integration:v1' as "iORDERS"),
'iORDERS:iORDERS/ORDER_NOTIFY/ORDER_HEAD/DISPATCH_LOCATION'
passing xmltype(wo.xml_message)
columns
MESSAGE_TYPE varchar(120) path './../../MESSAGE_TYPE') x;
BATCH_NO WEB_SERVICE_NO MESSAGE_TYPE
---------- -------------- ------------------------------------------------------------------------------------------------------------------------
1 2 PLACED
Presumably you're planning on getting more information than that from the XML, and/or expect to have multiple nodes; otherwise, to just get the message type you could simplify to:
select wo.batch_no,wo.web_service_no,x.*
from web_orders wo
cross join XMLTABLE (
XMLNAMESPACES('urn:iORDERS-abcdonline-com:Integration:v1' as "iORDERS"),
'iORDERS:iORDERS/ORDER_NOTIFY'
passing xmltype(wo.xml_message)
columns
MESSAGE_TYPE varchar(120) path 'MESSAGE_TYPE') x;
or even, with a single node:
select wo.batch_no,wo.web_service_no,XMLQuery(
'declare namespace iORDERS="urn:iORDERS-abcdonline-com:Integration:v1"; (: :)
iORDERS:iORDERS/ORDER_NOTIFY/MESSAGE_TYPE/text()'
passing xmltype(wo.xml_message)
returning content).getStringVal() as message_type
from web_orders wo;
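Why a default namespace finds nothing can be seen with any namespace-aware parser. The sketch below uses Python's ElementTree on a trimmed copy of the message (only a few elements kept for brevity): the root element is in the iORDERS namespace, but its children are in no namespace at all.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the message from the question.
xml_message = """<iORDERS:iORDERS xmlns:iORDERS="urn:iORDERS-abcdonline-com:Integration:v1">
  <ORDER_NOTIFY>
    <MESSAGE_TYPE>PLACED</MESSAGE_TYPE>
    <ORDER_HEAD>
      <ORDER_ID>1111</ORDER_ID>
    </ORDER_HEAD>
  </ORDER_NOTIFY>
</iORDERS:iORDERS>"""

ns = {"iORDERS": "urn:iORDERS-abcdonline-com:Integration:v1"}
root = ET.fromstring(xml_message)

# Only the root carries the namespace.
root_tag = root.tag

# Children are un-namespaced, so plain names match them.
message_type = root.findtext("ORDER_NOTIFY/MESSAGE_TYPE")

# Applying the namespace to a child step (what a default namespace does) finds nothing.
wrong = root.find("iORDERS:ORDER_NOTIFY", ns)

print(root_tag, message_type, wrong)
```

This mirrors the corrected XPath: the prefix is applied to the root step only, and the remaining steps use bare element names.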
I have the following query:
DECLARE @AccString varchar(max)
SET @AccString=''
SELECT @AccString=@AccString + description + ' [ ] '
FROM tl_sb_accessoryInventory ai
JOIN tl_sb_accessory a on a.accessoryID = ai.accessoryID
WHERE userID=6
SELECT userID, serviceTag, model, @AccString AS ACCESSORIES FROM tl_sb_oldLaptop ol
JOIN tl_sb_laptopType lt ON ol.laptopTypeID = lt.laptopTypeID
WHERE userID=6
which outputs this:
What I want to be able to do is run this for every userID in a table tl_sb_user.
The statement to get the userIDs is:
Select userID from tl_sb_user
How can I get this to output a row as above for each user?
You are trying to do a string concatenation subquery. In SQL Server, you need to do the string concatenation using a correlated subquery with for xml path. Arcane, but it generally works.
The result is something like this:
SELECT userID, serviceTag, model,
stuff((select ' [ ] ' + description
from tl_sb_accessoryInventory ai join
tl_sb_accessory a
on a.accessoryID = ai.accessoryID
where a.userId = ol.UserId
for xml path ('')
), 1, 5, '') as accessories
FROM tl_sb_oldLaptop ol JOIN
tl_sb_laptopType lt
ON ol.laptopTypeID = lt.laptopTypeID;
You don't have table aliases identifying where the columns come from, so I am just guessing that a.userId = ol.UserId references the right tables.
Also, this substitutes certain characters with their XML entity forms. Notably '<' and '>' turn into '&lt;' and '&gt;'. When I encounter this problem, I use replace() to restore the original characters.
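The substitution and its reversal can be illustrated with Python's standard XML helpers. This is only an analogy for what for xml path does to the concatenated text; the sample description is hypothetical.

```python
from xml.sax.saxutils import escape, unescape

# Hypothetical accessory description containing XML-special characters.
description = "Dock <USB-C> & charger"

# What XML serialization does to the text: &, < and > become entities.
escaped = escape(description)

# The equivalent of the nested replace() calls that restore the characters.
restored = unescape(escaped)

print(escaped)
print(restored)
```

In T-SQL the restoring step is a chain of replace() calls, one per entity, applied to the concatenated string.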
Simply leave out the WHERE clause.