Joining a JSON array with another table (SQL)

My data is in array format, like [1,2,3]. How do I join it with another table? Here is the query I am trying:
select
    RCM.header_details->'auditAssertion' as auditassertion
from masters."RCM" RCM
left join reference."AUDIT_ASSERTION_APPLICATION" as AAA
    on AAA.id = RCM.header_details->'auditAssertion'

You can use the ? operator to check if a value belongs to a json(b) array:
select m.header_details->'auditAssertion' as auditassertion
from masters.rcm m
left join reference.audit_assertion_application a
    on m.header_details->'auditAssertion' ? a.id::text
For performance, Postgres supports the following index:
create index on masters.rcm using gin ((header_details->'auditAssertion'));
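One caveat worth verifying against your data: the jsonb ? operator only matches string elements of an array, so numeric ids such as [1,2,3] will not match unless they are stored as strings:

```sql
select '["1","2","3"]'::jsonb ? '2';  -- true: '2' is a string element
select '[1,2,3]'::jsonb ? '2';        -- false: ? does not match numeric elements
```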

Related

Select rows with jsonb that have and only have certain keys in postgresql

I have a jsonb column called data in a table called people. The json's values are arrays. It looks like this:
{"bar":["def"],"foo":["abc","hij"]}
In the above example, this jsonb has 2 keys, "bar" and "foo", and both values are arrays containing several elements. I am trying to query using several key-value pairs, where the values in the query are single strings. I want the results to contain exactly the keys in the query (no more, no fewer), and each queried value must exist in the corresponding array in the json.
For example, using
{"bar":"def", "foo":"abc"} or {"bar":"def", "foo":"hij"}
, I should be able to get the result.
But if using
{"bar":"def"} or {"foo":"abc"} or {"bar":"def", "foo":"abc", "xyz":"123"}
, I shouldn't get the result since the keys don't match exactly.
I have tried using data->'bar' @> '["def"]' AND data->'foo' @> '["abc"]' to make sure the key-value pairs in the query exist in the data jsonb, but I don't know how to filter out rows that have more keys than the query. I was thinking about converting all the keys in the jsonb into an array and checking containment against the query's keys as an array, but couldn't work out how to do it properly. If there is a better solution, please share your thoughts.
You can full outer join the keys of your objects, check that a key match exists, and then verify the target value exists in the array of possibilities:
create or replace function js_match(record jsonb, template jsonb) returns bool as $$
  select not exists (
    select 1
    from jsonb_each(record) t1
    full outer join jsonb_each(template) t2 on t1.key = t2.key
    where t1.key is null
       or t2.key is null
       or not exists (select 1 from jsonb_array_elements(t1.value) v where v = t2.value)
  )
$$ language sql;
Usage:
select * from people where js_match(data, '{"bar":"def", "foo":"abc"}'::jsonb)
This answer uses a function to make the comparisons easier during the main selection; however, below is a pure query version:
select *
from people p
where not exists (
  select 1
  from jsonb_each(p.data) t1
  full outer join jsonb_each('{"bar":"def", "foo":"abc"}'::jsonb) t2 on t1.key = t2.key
  where t1.key is null
     or t2.key is null
     or not exists (select 1 from jsonb_array_elements(t1.value) v where v = t2.value)
)
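As a quick sanity check (assuming the js_match function above has been created), the sample object from the question behaves as required:

```sql
select js_match('{"bar":["def"],"foo":["abc","hij"]}'::jsonb,
                '{"bar":"def", "foo":"abc"}'::jsonb);  -- true: keys match exactly
select js_match('{"bar":["def"],"foo":["abc","hij"]}'::jsonb,
                '{"bar":"def"}'::jsonb);               -- false: key "foo" is missing
```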

BigQuery - How to unnest multiple nested values

I am trying to select two nested values, attributes.price.list.item.net and attributes.price.list.item.listPrice.gross.
I am using the following snippet but it just flattens the whole list array and returns every column within. If I try to unnest any other way, I only get errors. How can I unnest multiple nested arrays like this?
SELECT attributes.price.list
FROM my_table LEFT JOIN UNNEST(attributes.price.list)
Consider the approach below:
SELECT
el.item.net,
el.item.listPrice.gross
FROM my_table
LEFT JOIN UNNEST(attributes.price.list) el
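For reference, here is a self-contained sketch of that pattern against a hypothetical inline table whose schema mirrors the one assumed in the answer (field names and values are guesses from the question):

```sql
WITH my_table AS (
  SELECT STRUCT(STRUCT([
    STRUCT(STRUCT(9.99 AS net, STRUCT(12.50 AS gross) AS listPrice) AS item),
    STRUCT(STRUCT(4.25 AS net, STRUCT(5.00 AS gross) AS listPrice) AS item)
  ] AS list) AS price) AS attributes
)
SELECT
  el.item.net,
  el.item.listPrice.gross
FROM my_table
LEFT JOIN UNNEST(attributes.price.list) el
```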

Databricks spark sql to show associated strings from hashed strings

I'm using a query in Databricks like this:
select * from thisdata where hashed_string in (sha2("mystring1", 512),sha2("mystring2", 512),sha2("mystring3", 512))
This works well and gives me the data I need, but is there a way to show the associated string to the hashed string?
example
mystring1 - 1494219340aa5fcb224f6b775782f297ba5487
mystring2 - 5430af17738573156426276f1e01fc3ff3c9e1
Probably not, as there's a reason for it to be hashed, but I'm just checking if there is a way.
If you have a table with the string and its corresponding hash as columns, then you can perform an inner join instead of using the IN clause. After joining, the concat_ws function gives the required result.
Let's say you create a table named hashtable with columns mystring and hashed_mystring, and the other table is named maintable.
You can use below query to join and extract the result in the required format.
select concat_ws('-', h.mystring, m.hashed_string)
from maintable m
inner join hashtable h on m.hashed_string = h.hashed_mystring
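If such a lookup table does not exist yet, one way to build it in Databricks is to hash the known strings yourself (a sketch; the table and column names here are assumptions):

```sql
-- Build a lookup pairing each plain string with its SHA-512 hash
CREATE OR REPLACE TABLE hashtable AS
SELECT mystring, sha2(mystring, 512) AS hashed_mystring
FROM VALUES ('mystring1'), ('mystring2'), ('mystring3') AS t(mystring);
```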

jsonb gin index not being used in postgresql

I create the following indexes on jsonb columns in my table:
CREATE INDEX idx_gin_accounts ON t1 USING GIN (accounts jsonb_path_ops);
CREATE INDEX idx_gin_stocks ON t1 USING GIN (stocks jsonb_path_ops);
CREATE INDEX idx_gin_stocks_value ON t1 USING GIN ((stocks-> 'value'));
CREATE INDEX idx_gin_stocks_type ON t1 USING GIN ((stocks-> 'type'));
My query is like this:
SELECT
    t.accounts ->> 'name' as account_name
    -- other columns
FROM t1 t
left join lateral jsonb_array_elements(t.accounts) a(accounts)
    on a.accounts @> '{"role": "ADVISOR"}'
left join lateral jsonb_array_elements(t.stocks) s(stocks)
    on s.stocks @> '{"type": "RIC"}'
WHERE (s.stocks -> 'value' ? 'XXX')
When I analyse with EXPLAIN ANALYSE I do not see these indexes being used in the query plan.
Should different indexes be created? Or How can I use these ones to speed up the search?
Say, when I pass (s.stocks -> 'value' ? 'XXX') in the where condition, I want the search to be optimal.
You cannot index the results of a set-returning function (other than by making a materialized view).
We can reason out that if a.accounts @> '{"role": "ADVISOR"}' then necessarily t.accounts @> '[{"role": "ADVISOR"}]'. PostgreSQL can't reason that out, but we can.
However, this also won't help, because you are doing left joins. If every single row of t1 is getting returned, what do you expect an index to accomplish?
With your added WHERE clause, you can use a JSONPATH (if you are using the latest version of PostgreSQL) to get the rows of t1 that you seem to want. It will use the index on t1 (stocks), either with or without the jsonb_path_ops:
WHERE (s.stocks -> 'value' ? 'XXX') AND
  t.stocks @? '$[*] ? (@.type == "RIC") ? (exists (@.value.XXX))';
However, the index is not actually very efficient if almost all entries have type RIC, so this is a Pyrrhic victory.
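To see the @? jsonpath predicate in isolation (PostgreSQL 12+; the sample value is made up for illustration):

```sql
select '[{"type":"RIC","value":{"XXX":1}},{"type":"OTHER","value":{}}]'::jsonb
       @? '$[*] ? (@.type == "RIC") ? (exists (@.value.XXX))';  -- true
```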

Unnesting 3rd level dependency in Google BigQuery

I'm trying to replace the schema in an existing table using BQ. Certain fields in BQ have 3-5 levels of schema nesting.
For example, comsalesorders.comSalesOrdersInfo.storetransactionid is a field nested under two others.
Since I'm using this to replace the existing table, I cannot change the field names in the query.
The query looks similar to this
SELECT * REPLACE(comsalesorders.comSalesOrdersInfo.storetransactionid AS STRING)
FROM CentralizedOrders_streaming.orderStatusUpdated,
UNNEST(comsalesorders) AS comsalesorders,
UNNEST(comsalesorders.comSalesOrdersInfo) AS comsalesorders.comSalesOrdersInfo
BQ allows unnesting the first schema level but presents a problem for the second level of nesting.
What changes do I need to make to this query to use UNNEST() for such dependent schemas?
Since you haven't provided a schema, here is a generalized answer. Note the difference between the two queries below.
-- Provide an alias for each unnest (as if each is a separate table)
select c.stuff
from table
left join unnest(table.first_level_nested) a
left join unnest(a.second_level_nested) b
left join unnest(b.third_level_nested) c
-- b and c won't work here because you are 'double unnesting'
select c.stuff
from table
left join unnest(table.first_level_nested) a
left join unnest(first_level_nested.second_level_nested) b
left join unnest(first_level_nested.second_level_nested.third_level_nested) c
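The first pattern can be verified end-to-end with a hypothetical three-level nested sample (all names here are invented for illustration):

```sql
-- Each UNNEST gets its own alias, and each level references the previous alias
WITH t AS (
  SELECT [STRUCT([STRUCT([1, 2] AS third_level_nested)] AS second_level_nested)]
         AS first_level_nested
)
SELECT c AS stuff
FROM t
LEFT JOIN UNNEST(t.first_level_nested) a
LEFT JOIN UNNEST(a.second_level_nested) b
LEFT JOIN UNNEST(b.third_level_nested) c
```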
I'm not sure I understand your question, but my guess is that you want to change one column's type to another, such as STRING.
The UNNEST function is only used with columns of array type, for example:
"comsalesorders":[{"comSalesOrdersInfo":{}}, {"comSalesOrdersInfo":{}}, {"comSalesOrdersInfo":{}}]
But not with this kind of columns:
"comSalesOrdersInfo":{"storeTransactionID":"X1056-943462","ItemsWarrenty":0,"currencyCountry":"USD"}
Therefore, if I didn't misunderstand your question, I would write a query like this:
SELECT *, CAST(A.comSalesOrdersInfo.storeTransactionID as STRING)
FROM `TABLE`, UNNEST(comsalesorders) as A