SQL: how to select rows based on a JSON field value?

I have a table in which one of the fields is in JSON format:
my_table
id json_field
1 { "to_status": 7, "to_status_name": "In Progress", "role": "admin"}
2 { "to_status": 3, "to_status_name": "Completed", "role": "admin"}
3 { "to_status": 2, "to_status_name": "Completed", "role": "customer"}
How can I select all rows where "to_status" is 3 or 2?
Can anyone help?

With MySQL-style JSON functions, you can use
SELECT * FROM my_table
WHERE JSON_EXTRACT(json_field, '$.to_status') IN (3, 2)
or
SELECT * FROM my_table
WHERE json_field->>'$.to_status' IN (3, 2)
to select what you need.

Values extracted with ->> are strings (the syntax below is PostgreSQL), so you need to either compare against string values or cast the extracted to_status value to an integer:
SELECT * FROM my_table WHERE json_field->>'to_status' in ('2','3');
or
SELECT * FROM my_table WHERE (json_field->>'to_status')::int in (2,3);
With this sample data, you can also get the same result by filtering on to_status_name:
SELECT * FROM my_table WHERE json_field->>'to_status_name' = 'Completed';
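To try the filter without a MySQL or PostgreSQL server, the same idea can be sketched with Python's built-in sqlite3 module. This is only a stand-in: it assumes the local SQLite build ships the JSON functions, and SQLite's typing differs from both engines above (json_extract returns a native integer for a JSON number), so it shows the shape of the query rather than either dialect's exact behavior.

```python
import sqlite3

# In-memory SQLite database standing in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, json_field TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [
        (1, '{"to_status": 7, "to_status_name": "In Progress", "role": "admin"}'),
        (2, '{"to_status": 3, "to_status_name": "Completed", "role": "admin"}'),
        (3, '{"to_status": 2, "to_status_name": "Completed", "role": "customer"}'),
    ],
)

# Filter on a key inside the JSON document, as in the first answer.
matching_ids = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM my_table "
        "WHERE json_extract(json_field, '$.to_status') IN (3, 2) "
        "ORDER BY id"
    )
]
print(matching_ids)
```

Only the rows whose to_status is 2 or 3 are selected; row 1 (to_status 7) is filtered out.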

Related

Postgresql, JSONB, how to query a joined string from an array of objects

I've a JSONB inside a PostgreSQL table with this structure (more or less)
{
  "obj1": {
    "obj2": {
      "obj3": [
        {
          "obj4": {
            "obj": "A"
          }
        },
        {
          "obj4": {
            "obj": "B"
          }
        }
      ]
    }
  }
}
So obj3 is an array of objects, and I want the obj values inside each obj4, separated by commas.
What I really need is something like:
1 | A,B
2 | C,D
3 | NULL
I'm using PostgreSQL 14. Any help would be appreciated.
This is what I have so far:
SELECT t.id,
jsonb_path_query(t.b,
'$."obj1"."obj2"."obj3"[*]."obj4"."obj"' ::jsonpath) AS obj5
FROM (VALUES(1,
'{"obj1":{"obj2":{"obj3":[{"obj4":{"obj":"A"}},{"obj4":{"obj":"B"}}]}}}'
::jsonb), (2,
'{"obj1":{"obj2":{"obj3":[{"obj4":{"obj":"C"}},{"obj4":{"obj":"D"}}]}}}'
::jsonb), (3, '{}' ::jsonb)) t(id, b);
But jsonb_path_query multiplies the rows and also drops the rows where nothing is found...
You need to group the resulting rows by t.id, so that A and B (and C and D) end up on the same row, and use the string_agg function to concatenate them into a single column with ',' as the separator.
To do so, you first need to move the jsonb_path_query function from the SELECT clause to the FROM clause and introduce a LEFT JOIN, so that rows for which jsonb_path_query returns nothing are kept.
The solution is:
select t.id, string_agg(obj5->>'obj', ',') AS result
from (
values (1, '{"obj1":{"obj2":{"obj3":[{"obj4":{"obj":"A"}},{"obj4":{"obj":"B"}}]}}}'::jsonb),
(2, '{"obj1":{"obj2":{"obj3":[{"obj4":{"obj":"C"}},{"obj4":{"obj":"D"}}]}}}'::jsonb),
(3, '{}'::jsonb)
) t(id, b)
left join lateral jsonb_path_query(t.b, '$.obj1.obj2.obj3[*].obj4') as obj5
on TRUE
group by t.id;
see dbfiddle
Inside-out: climb the object tree, flatten the array and then select/aggregate. DB fiddle
select id, (
select string_agg(j->'obj4'->>'obj', ',')
from jsonb_array_elements(b->'obj1'->'obj2'->'obj3') as j
) as objlist
from the_table;
id | objlist
1 | A,B
2 | C,D
3 | NULL
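The inside-out shape (a correlated subquery that flattens the array and aggregates it per row) can be tried locally with Python's sqlite3 module. This is a sketch under assumptions: SQLite's json_each and group_concat stand in for PostgreSQL's jsonb_array_elements and string_agg, and the table name the_table is taken from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE the_table (id INTEGER PRIMARY KEY, b TEXT)")
conn.executemany(
    "INSERT INTO the_table VALUES (?, ?)",
    [
        (1, '{"obj1":{"obj2":{"obj3":[{"obj4":{"obj":"A"}},{"obj4":{"obj":"B"}}]}}}'),
        (2, '{"obj1":{"obj2":{"obj3":[{"obj4":{"obj":"C"}},{"obj4":{"obj":"D"}}]}}}'),
        (3, "{}"),
    ],
)

# The scalar subquery runs once per outer row, so no row is multiplied
# or dropped; an empty subquery simply aggregates to NULL.
query = """
SELECT id,
       (SELECT group_concat(json_extract(j.value, '$.obj4.obj'), ',')
        FROM json_each(the_table.b, '$.obj1.obj2.obj3') AS j) AS objlist
FROM the_table
ORDER BY id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Because the aggregation lives in a scalar subquery, no GROUP BY or LEFT JOIN is needed to keep the row with id 3.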
For the sake of clarity/reuse, I'd create a function to convert the jsonb array to a Postgres array.
CREATE OR REPLACE FUNCTION jsonb_text_array(jsonb)
RETURNS text[]
LANGUAGE sql
PARALLEL SAFE
LEAKPROOF
STRICT
AS $$
SELECT array_agg(t)
FROM jsonb_array_elements_text($1) x(t)
;
$$;
Then this query should return what you want.
SELECT t.id
, jsonb_text_array(jsonb_path_query_array(t.b, '$.obj1.obj2.obj3.obj4.obj')) AS obj5
FROM ( VALUES
(1, '{"obj1":{"obj2":{"obj3":[{"obj4":{"obj":"A"}},{"obj4":{"obj":"B"}}]}}}'::jsonb)
, (2, '{"obj1":{"obj2":{"obj3":[{"obj4":{"obj":"C"}},{"obj4":{"obj":"D"}}]}}}'::jsonb)
, (3, '{}'::jsonb)
) t(id, b);
If you really want a string returned instead of an array, change the function to return text instead of text[] and use string_agg(t, ',') instead of array_agg(t).

Redshift Postgresql - How to Parse Nested JSON

I am trying to parse a JSON text using JSON_EXTRACT_PATH_TEXT() function.
JSON sample:
{
"data":[
{
"name":"ping",
"idx":0,
"cnt":27,
"min":16,
"max":33,
"avg":24.67,
"dev":5.05
},
{
"name":"late",
"idx":0,
"cnt":27,
"min":8,
"max":17,
"avg":12.59,
"dev":2.63
}
]
}
I tried JSON_EXTRACT_PATH_TEXT(event , '{"name":"late"}', 'avg') function to get 'avg' for name = "late", but it returns blank.
Can anyone help, please?
Thanks
This is a rather complicated task in Redshift, which, unlike Postgres, has poor JSON support and no function to unnest arrays.
Here is one way to do it using a numbers table. You need to populate the table with incrementing numbers starting at 0, with at least as many rows as the longest array you expect:
create table nums as
select 0 i union all select 1 union all select 2 union all select 3
union all select 4 union all select 5 union all select 6
union all select 7 union all select 8 union all select 9
;
Once the table is created, you can use it to walk the JSON array using json_extract_array_element_text(), and check its content with json_extract_path_text():
select json_extract_path_text(item, 'avg') as my_avg
from (
select json_extract_array_element_text(t.items, n.i, true) as item
from (
select json_extract_path_text(mycol, 'data', true ) as items
from mytable
) t
inner join nums n on n.i < json_array_length(t.items, true)
) t
where json_extract_path_text(item, 'name') = 'late';
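The numbers-table technique itself is not specific to Redshift, so it can be sketched with Python's sqlite3 module. The assumptions here: SQLite's json_extract and json_array_length stand in for Redshift's json_extract_array_element_text and json_array_length, and the sample document is trimmed to the name and avg keys.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")

# Numbers table: one row per array index we may need to visit.
conn.execute("CREATE TABLE nums (i INTEGER)")
conn.executemany("INSERT INTO nums VALUES (?)", [(i,) for i in range(10)])

doc = {"data": [{"name": "ping", "avg": 24.67}, {"name": "late", "avg": 12.59}]}
conn.execute("CREATE TABLE mytable (mycol TEXT)")
conn.execute("INSERT INTO mytable VALUES (?)", (json.dumps(doc),))

# Joining on i < array length yields one row per array element;
# the index is spliced into the JSON path to pick that element.
query = """
SELECT json_extract(item, '$.avg') AS my_avg
FROM (
    SELECT json_extract(t.mycol, '$.data[' || n.i || ']') AS item
    FROM mytable t
    JOIN nums n ON n.i < json_array_length(t.mycol, '$.data')
) AS items
WHERE json_extract(item, '$.name') = 'late'
"""
my_avgs = [row[0] for row in conn.execute(query)]
print(my_avgs)
```

Arrays longer than the numbers table are silently truncated, which is why the table must cover the longest expected array.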
If you are on plain PostgreSQL rather than Redshift (json_array_elements is not available in Redshift), you can use json_array_elements for that:
select obj->'avg'
from foo f, json_array_elements(f.event->'data') obj
where obj->>'name' = 'late';
Working example
create table foo (id int, event json);
insert into foo values (1,'{
"data":[
{
"name":"ping",
"idx":0,
"cnt":27,
"min":16,
"max":33,
"avg":24.67,
"dev":5.05
},
{
"name":"late",
"idx":0,
"cnt":27,
"min":8,
"max":17,
"avg":12.59,
"dev":2.63
}]}');

How to parse JSON in Standard SQL BigQuery?

After streaming some json data into BQ, we have a record that looks like:
"{\"Type\": \"Some_type\", \"Identification\": {\"Name\": \"First Last\"}}"
How would I extract the type from this? E.g. I would like to get Some_type.
I tried all possible combinations shown in https://cloud.google.com/bigquery/docs/reference/standard-sql/json_functions without success, namely, I thought:
SELECT JSON_EXTRACT_SCALAR(raw_json , "$[\"Type\"]") as parsed_type FROM `table` LIMIT 1000
is what I need. However, I get:
Invalid token in JSONPath at: ["Type"]
The example below is for BigQuery Standard SQL:
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1 id, "{\"Type\": \"Some_type\", \"Identification\": {\"Name\": \"First Last\"}}" raw_json UNION ALL
SELECT 2, '{"Type": "Some_type", "Identification": {"Name": "First Last"}}'
)
SELECT id, JSON_EXTRACT_SCALAR(raw_json , "$.Type") AS parsed_type
FROM `project.dataset.table`
with result
Row id parsed_type
1 1 Some_type
2 2 Some_type
See the updated example below. Take a look at the third record, which I think mimics your case:
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1 id, "{\"Type\": \"Some_type\", \"Identification\": {\"Name\": \"First Last\"}}" raw_json UNION ALL
SELECT 2, '''{"Type": "Some_type", "Identification": {"Name": "First Last"}}''' UNION ALL
SELECT 3, '''"{\"Type\": \"
null1\"}"
'''
)
SELECT id,
JSON_EXTRACT_SCALAR(REGEXP_REPLACE(raw_json, r'^"|"$', '') , "$.Type") AS parsed_type
FROM `project.dataset.table`
with result
Row id parsed_type
1 1 Some_type
2 2 Some_type
3 3 null1
Note: I use null1 instead of null so you can easily see that it is not a NULL but rather string null1
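The effect of the REGEXP_REPLACE step can be checked outside BigQuery with a few lines of Python. This is a sketch, not BigQuery's implementation: extract_type is a hypothetical helper, and it assumes the stored value is a JSON object that may be wrapped in one extra pair of double quotes.

```python
import json
import re


def extract_type(raw_json: str):
    """Hypothetical helper: strip one leading/trailing double quote, then read Type."""
    # Same pattern as the REGEXP_REPLACE call in the query above.
    unwrapped = re.sub(r'^"|"$', "", raw_json)
    return json.loads(unwrapped).get("Type")


plain = '{"Type": "Some_type", "Identification": {"Name": "First Last"}}'
wrapped = '"' + plain + '"'  # the same document with stray outer quotes

types = [extract_type(plain), extract_type(wrapped)]
print(types)
```

A document without the extra quotes passes through the regular expression unchanged, which is why the same query works for all rows.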

How do I find elements in an array in BigQuery

I am trying to search for a row that has certain key value pairs in an array. A row in my BigQuery table would look something like this.
{
  "ip": "192.168.1.1",
  "cookie": [
    {
      "key": "apple",
      "value": "red"
    },
    {
      "key": "orange",
      "value": "orange"
    },
    {
      "key": "grape",
      "value": "purple"
    }
  ]
}
I thought about using implicit UNNEST or CROSS JOIN like the following, but it didn't work because unnesting it would just create multiple different rows.
SELECT ip
FROM table t, t.cookie c
WHERE (c.key = "grape" AND c.value ="purple") AND (c.key = "orange" AND c.value ="orange")
This link is really close to what I want to do, except they are using legacy SQL and not standardSQL
#standardSQL
SELECT ip
FROM yourTable
WHERE (
SELECT COUNT(1)
FROM UNNEST(cookie) AS pair
WHERE pair IN (('grape', 'purple'), ('orange', 'orange'))
) >= 2
You can test it with the dummy data below:
#standardSQL
WITH yourTable AS (
SELECT '192.168.1.1' AS ip, [('apple', 'red'), ('orange', 'orange'), ('grape', 'purple')] AS cookie UNION ALL
SELECT '192.168.1.2', [('abc', 'xyz')]
)
SELECT ip
FROM yourTable
WHERE (
SELECT COUNT(1)
FROM UNNEST(cookie) AS pair
WHERE pair IN (('grape', 'purple'), ('orange', 'orange'))
) >= 2
If you need to output the ip when at least one of the pairs is in the array, change >= 2 to >= 1 in the WHERE clause.
Mikhail's solution is good if it is guaranteed that there are no duplicate pairs in the cookie array. But if there could be duplicates, here is the alternative solution:
#standardSQL
WITH yourTable AS (
SELECT
'192.168.1.1' AS ip,
[('apple', 'red'), ('orange', 'orange'), ('grape', 'purple')] AS cookie UNION ALL
SELECT
'192.168.1.2',
[('abc', 'xyz'), ('orange', 'orange'), ('orange', 'orange')]
)
SELECT ip
FROM yourTable t
WHERE (
('grape', 'purple') IN UNNEST(t.cookie) AND
('orange', 'orange') IN UNNEST(t.cookie) )
Results in only
ip
-----------
192.168.1.1
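The difference between the two answers can be seen in a small Python model of the same dummy data. This is only an illustration of the logic, with Python lists of tuples standing in for the BigQuery array of structs.

```python
# ip -> cookie pairs, mirroring the dummy data in the second answer.
rows = {
    "192.168.1.1": [("apple", "red"), ("orange", "orange"), ("grape", "purple")],
    "192.168.1.2": [("abc", "xyz"), ("orange", "orange"), ("orange", "orange")],
}
wanted = [("grape", "purple"), ("orange", "orange")]

# COUNT-based check: counts matching elements, so a duplicated pair counts twice.
by_count = [
    ip for ip, cookie in rows.items()
    if sum(pair in wanted for pair in cookie) >= 2
]

# Membership-based check: every wanted pair must be present at least once.
by_membership = [
    ip for ip, cookie in rows.items()
    if all(pair in cookie for pair in wanted)
]

print(by_count)
print(by_membership)
```

The count-based check also accepts 192.168.1.2, because its duplicated ('orange', 'orange') pair is counted twice; the membership check accepts only 192.168.1.1.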

Output data after updating SQL

Imagine there are two tables:
T_Customer (p_customer_id, name, prename, country, age)
and
T_SomeInfo (f_customer_id, somebit, otherbit)
Now I want to set somebit on one random row and OUTPUT the T_Customer row that belongs to the f_customer_id of the affected row.
At the moment I have the following statement:
UPDATE randombit SET randombit.somebit= 1
OUTPUT inserted.f_customer_id
FROM
(
SELECT TOP 1 * FROM T_SomeInfo
WHERE somebit= 0 AND otherbit = 0
ORDER BY NEWID()
) AS randombit
So I get the f_customer_id of my updated row.
But I'm not able to build a valid statement to OUTPUT a value from another table.
This is a statement I tried without success:
UPDATE randombit SET randombit.somebit= 1
OUTPUT customer.*
FROM T_Customer AS customer
WHERE customer.f_customer_id = inserted.f_customer_id
FROM
(
SELECT TOP 1 * FROM T_SomeInfo
WHERE somebit= 0 AND otherbit = 0
ORDER BY NEWID()
) AS randombit
Is there any way to update and output (with INNER JOIN or SELECT) in one statement?
EDIT, as an example:
There are two customers:
T_Customer (1, "Smith", "John", "country", 10)
T_Customer (2, "John", "William", "country2", 20)
Currently, this update
UPDATE randombit SET randombit.somebit= 1
OUTPUT inserted.f_customer_id
FROM
(
SELECT TOP 1 * FROM T_SomeInfo
WHERE somebit= 0 AND otherbit = 0
ORDER BY NEWID()
) AS randombit
will output (if customer 1 is the randomly picked row):
1
But I want to see
1, "Smith", "John", "country", 10