In BigQuery, how do I check if two ARRAY of STRUCTs are equal

In BigQuery, how do I check if two ARRAY of STRUCTs are equal - sql

I have a query that outputs two array of structs:
SELECT modelId, oldClassCounts, newClassCounts
FROM `xyz`
GROUP BY 1
How do I create another column that is TRUE if oldClassCounts = newClassCounts?
Here is a sample result in JSON:
[
{
"modelId": "FBF21609-65F8-4076-9B22-D6E277F1B36A",
"oldClassCounts": [
{
"id": "A041EBB1-E041-4944-B231-48BC4CCE025B",
"count": "33"
},
{
"id": "B8E4812B-A323-47DD-A6ED-9DF877F501CA",
"count": "82"
}
],
"newClassCounts": [
{
"id": "A041EBB1-E041-4944-B231-48BC4CCE025B",
"count": "33"
},
{
"id": "B8E4812B-A323-47DD-A6ED-9DF877F501CA",
"count": "82"
}
]
}
]
I want the equality column to be TRUE if oldClassCounts and newClassCounts are exactly the same like the output above.
Anything else should be false.

I would go about with this solution
#standardSQL
WITH xyz AS (
SELECT "FBF21609-65F8-4076-9B22-D6E277F1B36A" AS modelId,
[STRUCT("A041EBB1-E041-4944-B231-48BC4CCE025B" as id, "33" as count),
STRUCT("B8E4812B-A323-47DD-A6ED-9DF877F501CA" as id, "82" as count)] AS oldClassCounts,
[STRUCT("A041EBB1-E041-4944-B231-48BC4CCE025B" as id, "33" as count),
STRUCT("B8E4812B-A323-47DD-A6ED-9DF877F501CA" as id, "82" as count)] as newClassCounts),
o as (SELECT modelId, id, count, array_length(oldClassCounts) as len FROM xyz, UNNEST(oldClassCounts) as old_c),
n as (SELECT modelId, id, count, array_length(newClassCounts) as len FROM xyz, UNNEST(newClassCounts) as new_c),
uneq as (select * from o except distinct select * from n)
select xyz.*, IF(uneq.modelId is not null, false, true) as equal from xyz left join (select distinct modelId from uneq) uneq on xyz.modelId = uneq.modelId
It works regardless of the order or having duplicates within the arrays. The idea is that we treat each of the arrays as a separate temporary table removing all elements that exist in one but not the other (using except distinct) and then have an extra check for the length of the arrays in case there are duplicates e.g.
"FBF21609-65F8-4076-9B22-D6E277F1B36A" AS modelId,
[STRUCT("A041EBB1-E041-4944-B231-48BC4CCE025B" as id, "33" as count),
STRUCT("B8E4812B-A323-47DD-A6ED-9DF877F501CA" as id, "82" as count),
STRUCT("B8E4812B-A323-47DD-A6ED-9DF877F501CA" as id, "82" as count)]

I would consider comparing the result of TO_JSON_STRING function applied on both of these arrays.
In the query it would be done in the following way:
SELECT modelId,
oldClassCounts,
newClassCounts,
CASE WHEN TO_JSON_STRING(oldClassCounts) = TO_JSON_STRING(newClassCounts)
THEN true
ELSE false
END
FROM `xyz`;
I'm not sure about GROUP BY 1 part, because non of the fields are grouped or aggregated.
It is not going to work, if the order of elements in the array is going to be different. This solution is not perfect, but worked for the data you provided.

Related

SingleStore (MemSQL) How to flatten array using JSON_TO_ARRAY

I have a table source:
data
{ "results": { "rows": [ { "title": "A", "count": 61 }, { "title": "B", "count": 9 } ] }}
{ "results": { "rows": [ { "title": "C", "count": 43 } ] }}
And I want a table dest:
title
count
A
61
B
9
C
43
I found there is JSON_TO_ARRAY function that might be helpful, but got stuck how to apply it.
How to correctly flatten the json array from the table?

I have the following that works on your example but it might help you with the syntax.
In this query I created a table called json_tab with a column called jsondata.
With t as (
select table_col AS title FROM json_tab join TABLE(JSON_TO_ARRAY(jsondata::results::rows)))
SELECT t.title::$title title,t.title::$count count FROM t
I took example from the code snippet to work with Nested Arrays in a JSON Column
https://github.com/singlestore-labs/singlestoredb-samples/blob/main/JSON/Analyzing_nested_arrays.sql

Three options I came up with, which are essentially the same:
INSERT INTO dest
WITH t AS(
SELECT table_col AS arrRows FROM source JOIN TABLE(JSON_TO_ARRAY(data::results::rows))
)
SELECT arrRows::$title as title, arrRows::%count as count FROM t;
INSERT INTO dest
SELECT arrRows::$title as title, arrRows::%count as count FROM
(SELECT table_col AS arrRows FROM source JOIN TABLE(JSON_TO_ARRAY(data::results::rows)));
INSERT INTO dest
SELECT t.table_col::$title as title, t.table_col::%count as count
FROM source JOIN TABLE(json_to_array(data::results::rows)) t;

GET last element of array in json column of my Transact SQL table

Thanks for helping.
I have my table CONVERSATIONS structured in columns like this :
[ ID , JSON_CONTENT ]
In the column ID i have a simple id in Varchar
In the column JSON_CONTENT i something like this :
{
"id_conversation" : "25bc8cbffa8b4223a2ed527e30d927bf",
"exchanges": [
{
"A" : "...",
"B": "..."
},
{
"A" : "...",
"B": "..."
},
{
"A" : "...",
"Z" : "..."
}
]
}
I would like to query and get the id and the last element of exchanges :
[ ID , LAST_ELT_IN_EXCHANGE_IN_JSON_CONTENT]
I wanted to do this :
select TOP 3 ID, JSON_QUERY(JSON_CONTENT, '$.exchange[-1]')
from CONVERSATION
But of course Transact SQL is not Python.
I saw theses answers, but i don't know how to applicate to my problem.
Select last value from Json array
Thanks for helping <3

If I understand you correctly, you need an additional APPLY operator and a combination of OPENJSON() and ROW_NUMBER(). The result from the OPENJSON() call is a table with columns key, value and type and when the JSON content is an array, the key column returns the index of the element in the specified array:
Table:
SELECT ID, JSON_CONTENT
INTO CONVERSATION
FROM (VALUES
(1, '{"id_conversation":"25bc8cbffa8b4223a2ed527e30d927bf","exchanges":[{"A":"...","B":"..."},{"A":"...","B":"..."},{"A":"...","Z":"..."}]}')
) v (ID, JSON_CONTENT)
Statement:
SELECT c.ID, j.[value]
FROM CONVERSATION c
OUTER APPLY (
SELECT [value], ROW_NUMBER() OVER (ORDER BY CONVERT(int, [key]) DESC) AS rn
FROM OPENJSON(c.JSON_CONTENT, '$.exchanges')
) j
WHERE j.rn = 1
Result:
ID value
------------------------
1 {
"A" : "...",
"Z" : "..."
}
Notice, that -1 is not a valid array index in your path expression, but you can access the item in a JSON array by index (e.g. '$.exchanges[2]').

Get data from any items in array

PosgreSQL 9.5
Field type: jsonb
Here json
{
"options": [
{
"name": "method"
},
{
"name": "flavor"
},
{
"name": "weight",
"value": {
"name": "300g"
}
}
]
}
And here query that get value of item (weight) with index = 2 from array:
SELECT
id,
product.data #>'{title,en}' AS title_en,
product.data #>>'{options, 2, value, name }' as options_weight_value
FROM product
Nice. It's work fine.
But the problem that weight can be in any index in array. First or second and so on.
So I need to get value of name (300g) in node "weight" .
I need smt like this:
SELECT
id,
product.data #>'{title,en}' AS title_en,
product.data #>>'{options, *, value, name, weight }' as options_weight_value
FROM product
Is it possible ?

I think I found solution:
SELECT
id,
p.data #>'{title,en}' AS title_en,
p.data #>'{weight,qty}' AS weight_qty,
(select *
from jsonb_array_elements(p.data -> 'options') AS options_array
where
options_array ->> 'name' = 'weight'
) #>'{value,name}' as options_weight
from product p
And now find value of weight(if exist) in any array's item. In this example it = 300g

Postgresql search if exists in nested jsonb

I'm new with jsonb request and i got a problem. Inside an 'Items' table, I have 'id' and 'data' jsonb. Here is what can look like a data:
[
{
"paramId": 3,
"value": "dog"
},
{
"paramId": 4,
"value": "cat"
},
{
"paramId": 5,
"value": "fish"
},
{
"paramId": 6,
"value": "",
"fields": [
{
"paramId": 3,
"value": "cat"
},
{
"paramId": 4,
"value": "dog"
}
]
},
{
"paramId": 6,
"value": "",
"fields": [
{
"paramId": 5,
"value": "cat"
},
{
"paramId": 3,
"value": "dog"
}
]
}
]
The value in data is always an array with object inside but sometimes the object can have a 'fields' value with objects inside. It is maximum one level deep.
How can I select the id of the items which as for example an object containing "paramId": 3 and "value": "cat" and also have an object with "paramId": 5 and "value" LIKE '%ish%'.
I already have found a way to do that when the object is on level 0
SELECT i.*
FROM items i
JOIN LATERAL jsonb_array_elements(i.data) obj3(val) ON obj.val->>'paramId' = '3'
JOIN LATERAL jsonb_array_elements(i.data) obj5(val) ON obj2.val->>'paramId' = '5'
WHERE obj3.val->>'valeur' = 'cat'
AND obj5.val->>'valeur' LIKE '%ish%';
but I don't know how to search inside the fields array if fields exists.
Thank you in advance for you help.
EDIT:
It looks like my question is not clear. I will try to make it better.
What I want to do is to find all the 'item' having in the 'data' column objects who match my search criteria. This without looking if the objects are at first level or inside a 'fields' key of an object.
Again for example. This record should be selected if I search:
'paramId': 3 AND 'value': 'cat
'paramId': 4 AND 'value': LIKE '%og%'
the matching ones are in the 'fields' key of the object with 'paramId': 6 and I don't know how to do that.

This can be expressed using a JSON/Path expression without the need for unnesting everything
To search for paramId = 3 and value = 'cat'
select *
from items
where data #? '$[*] ? ( (#.paramId == 3 && #.value == "cat") || exists( #.fields[*] ? (#.paramId == 3 && #.value == "cat")) )'
The $[*] part iterates over all elements of the first level array. To check the elements in the fields array, the exists() operator is used to nest the expression. #.fields[*] iterates over all elements in the fields array and applies the same expression again. I don't see a way how repeating the values could be avoided though.
For a "like" condition, you can use like_regex:
select *
from items
where data #? '$[*] ? ( (#.paramId == 4 && #.value like_regex ".*og.*") || exists( #.fields[*] ? (#.paramId == 4 && #.value like_regex ".*og.*")) )'

For now I have found a solution but it is not really clean and I don't know how it will perform in production with 10M records.
SELECT i.id, i.data
FROM ( -- A;
select it.id, it.data, i as value
from items it,
jsonb_array_elements(it.data) i
union
select it.id, it.data, f as value
from items it,
jsonb_array_elements(it.data) i,
jsonb_array_elements(i -> 'fields') f
) as i
WHERE (i.value ->> 'paramId' = '5' -- B1;
AND i.value ->> 'value' LIKE '%ish%')
OR (i.value ->> 'paramId' = '3' -- B2;
AND i.value ->> 'value' = 'cat')
group by i.id, i.data
having COUNT(*) >= 2; -- C;
A: I "flatten" the first and second level (second level is in 'fields' key)
B1, B2: These are my search criteria
C: I make sure the fields have all the criteria matching. If 3 criteria --> COUNT(*) >=3
It really doesn't look clean to me. It is working for dev purpose but I think there is a better way to do it.
If somebody have an idea Big thanks to him/her!

jsonb LIKE query on nested objects in an array

My JSON data looks like this:
[{
"id": 1,
"payload": {
"location": "NY",
"details": [{
"name": "cafe",
"cuisine": "mexican"
},
{
"name": "foody",
"cuisine": "italian"
}
]
}
}, {
"id": 2,
"payload": {
"location": "NY",
"details": [{
"name": "mbar",
"cuisine": "mexican"
},
{
"name": "fdy",
"cuisine": "italian"
}
]
}
}]
given a text "foo" I want to return all the tuples that have this substring. But I cannot figure out how to write the query for the same.
I followed this related answer but cannot figure out how to do LIKE.
This is what I have working right now:
SELECT r.res->>'name' AS feature_name, d.details::text
FROM restaurants r
, LATERAL (SELECT ARRAY (
SELECT * FROM json_populate_recordset(null::foo, r.res#>'{payload,
details}')
)
) AS d(details)
WHERE d.details #> '{cafe}';
Instead of passing the whole text of cafe I want to pass ca and get the results that match that text.

Your solution can be simplified some more:
SELECT r.res->>'name' AS feature_name, d.name AS detail_name
FROM restaurants r
, jsonb_populate_recordset(null::foo, r.res #> '{payload, details}') d
WHERE d.name LIKE '%oh%';
Or simpler, yet, with jsonb_array_elements() since you don't actually need the row type (foo) at all in this example:
SELECT r.res->>'name' AS feature_name, d->>'name' AS detail_name
FROM restaurants r
, jsonb_array_elements(r.res #> '{payload, details}') d
WHERE d->>'name' LIKE '%oh%';
db<>fiddle here
But that's not what you asked exactly:
I want to return all the tuples that have this substring.
You are returning all JSON array elements (0-n per base table row), where one particular key ('{payload,details,*,name}') matches (case-sensitively).
And your original question had a nested JSON array on top of this. You removed the outer array for this solution - I did the same.
Depending on your actual requirements the new text search capability of Postgres 10 might be useful.

I ended up doing this(inspired by this answer - jsonb query with nested objects in an array)
SELECT r.res->>'name' AS feature_name, d.details::text
FROM restaurants r
, LATERAL (
SELECT * FROM json_populate_recordset(null::foo, r.res#>'{payload, details}')
) AS d(details)
WHERE d.details LIKE '%oh%';
Fiddle here - http://sqlfiddle.com/#!15/f2027/5

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

In BigQuery, how do I check if two ARRAY of STRUCTs are equal - sql

Related

SingleStore (MemSQL) How to flatten array using JSON_TO_ARRAY

GET last element of array in json column of my Transact SQL table

Get data from any items in array

Postgresql search if exists in nested jsonb

jsonb LIKE query on nested objects in an array

Categories

Resources