JSONB sort aggregation - sql

Thanks to another answer, I found this query, which suits my needs: it sorts the elements of a JSON array by one of their fields.
(Fake, generated random data)
SELECT jsonb_agg(elem)
FROM (
SELECT *
FROM jsonb_array_elements('[{
"id": "1",
"first_name": "Maximo",
"last_name": "Sambiedge",
"email": "msambiedge0#economist.com",
"gender": "Male",
"ip_address": "242.145.232.65"
}, {
"id": "2",
"first_name": "Maria",
"last_name": "Selland",
"email": "aselland1#sitemeter.com",
"gender": "Female",
"ip_address": "184.174.58.32"
}]') a(elem)
ORDER BY (elem->>'email') -- order by the text value of "email"
) sub;
As we can see, this works with hardcoded data, which doesn't quite fit my needs. I can't figure out how to replace the JSON literal with the jsonb column in my table.
My attempt below yields 'data is not defined':
SELECT jsonb_agg(elem), (SELECT data FROM file_metadata)
FROM (
SELECT *
FROM jsonb_array_elements(data) a(elem)
ORDER BY (elem->>'email')
) sub;
I suspect that a subquery is needed inside the FROM clause?
Here is a SQLFiddle of my issue to help describe the table and how the structure is defined: http://sqlfiddle.com/#!17/41102/92

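You are almost there. You just need to bring the original table into the FROM clause, so that jsonb_array_elements can reference its data column (an implicit lateral join), like so: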
SELECT jsonb_agg(elem)
FROM (
SELECT elem
FROM file_metadata, jsonb_array_elements(data) a(elem)
ORDER BY (elem->>'email')
) sub;
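To make the semantics of the query above concrete, here is a minimal Python sketch of the same steps: expand each row's array, sort all elements by the email text, and collect them into one list. The file_metadata list and its sample values are hypothetical stand-ins for the table, and Python compares strings by code point, which may differ from your database collation. Like the query above, it merges the elements of all rows into a single array.

```python
import json

# Hypothetical stand-in for the file_metadata table: one JSON array per row.
file_metadata = [
    '[{"id": "1", "email": "msambiedge0@example.com"},'
    ' {"id": "2", "email": "aselland1@example.com"}]',
    '[{"id": "3", "email": "bexample@example.com"}]',
]

def sorted_elements(rows, key="email"):
    """Expand every row's array, then sort all elements by one text field."""
    # Equivalent of: FROM file_metadata, jsonb_array_elements(data) a(elem)
    elements = [elem for row in rows for elem in json.loads(row)]
    # Equivalent of: ORDER BY (elem->>'email'), followed by jsonb_agg(elem)
    return sorted(elements, key=lambda elem: elem[key])

result = sorted_elements(file_metadata)
print([elem["id"] for elem in result])
```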

Related

PostgreSQL array of object intersection

Given I have rows in my database, with a JSONB column that holds an array of items as such:
[
{"type": "human", "name": "Alice"},
{"type": "dog", "name": "Fido"},
{"type": "dog", "name": "Pluto"}
]
I need to be able to query rows based on this column. The query I want to write is a check to see if my array argument intersects, at any point, with this column.
Eg:
If I search for [{"type": "human", "name": "Alice"}], I should get a hit.
If I search for [{"type": "human", "name": "Alice"}, {"type": "dog", "name": "Doggy"}], I should also get a hit (since one of the objects intersects).
I've tried using the ?| operator, but according to the docs, it only compares keys. I need to match the entire jsonb object.
You can use exists with cross join:
select t.* from tbl t where exists (select 1 from jsonb_array_elements(t.items) v
cross join jsonb_array_elements('[{"type": "human", "name": "Alice"}, {"type": "dog", "name": "Doggy"}]'::jsonb) v1
where v.value = v1.value)
As a function:
create or replace function get_results(param jsonb)
returns table(items jsonb)
as $$
select t.* from tbl t where exists (select 1 from jsonb_array_elements(t.items) v
cross join jsonb_array_elements(param) v1
where v.value = v1.value)
$$ language sql;
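The exists/cross join check above amounts to "does any object in the row's array equal any object in the search array". A minimal Python sketch of that logic follows; the rows list is a hypothetical stand-in for tbl.items. Python dict equality ignores key order, which mirrors how jsonb compares objects.

```python
import json

def arrays_intersect(items, search):
    """Return True if any object in `items` equals any object in `search`."""
    # Same idea as the cross join with "where v.value = v1.value".
    return any(a == b for a in items for b in search)

# Hypothetical stand-in for tbl.items: one parsed JSON array per row.
rows = [
    json.loads('[{"type": "human", "name": "Alice"},'
               ' {"type": "dog", "name": "Fido"},'
               ' {"type": "dog", "name": "Pluto"}]'),
    json.loads('[{"type": "cat", "name": "Tom"}]'),
]

# Key order differs from the stored object on purpose; it still matches.
search = json.loads('[{"name": "Alice", "type": "human"},'
                    ' {"type": "dog", "name": "Doggy"}]')

hits = [row for row in rows if arrays_intersect(row, search)]
print(len(hits))
```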

In Snowflake, how do I parse an unnamed JSON array and access each key with key value rather than array slicing methods?

I have a JSON array in Snowflake that I need to parse into a table, i.e., convert all of its information into a single row. Assume my table has two columns: id and json_tag.
An example row is as below:
json_tag
[
{
"key": "app.name",
"value": "myapp1"
},
{
"key": "device.name",
"value": "myiPhone11"
},
{
"key": "iOS.dist",
"value": "latestDist5"
}
]
This structure is not fixed: another row can have five or even six entries, with new key names. An example is below:
json_tag
[
{
"key": "app.name",
"value": "myapp2"
},
{
"key": "app.cost",
"value": "$2.99"
},
{
"key": "device.name",
"value": "myiPhoneX"
},
{
"key": "device.color",
"value": "gold"
},
{
"key": "iOS.dist",
"value": "latestDist4.9"
}
]
What I want is to query the table (without creating a new one) so that each JSON row is split into columns, as below:
id  app.name  app.cost  device.name  device.color  iOS.dist
1   myapp1    null      myiPhone11   null          latestDist5
2   myapp2    $2.99     myiPhoneX    gold          latestDist4.9
I tried the following snippet:
with parsed_tb as (
select id,
to_variant(parse_json(json_tag)) as parsed_json_tag
from mytable
)
select parsed_json_tag[0]:value::varchar as app_name,
parsed_json_tag[1]:value::varchar as app_cost,
parsed_json_tag[2]:value::varchar as device_name
from parsed_tb;
As you can imagine, the snippet above does not work when app.cost is missing, or when rows differ in the number of keys and values.
I tried the lateral flatten command in Snowflake, but it creates many rows and I cannot figure out how to put them into columns of the same row. I also tried a recursive query, but could not achieve it.
So my questions are:
1. How can I access a key by its name rather than by slicing the array as I do above? I guess this would solve my problem.
2. If the solution to #1 does not fix it, how can I obtain the table above?
I have found a solution to this problem with the help of the following thread:
Mysql, reshape data from long / tall to wide
The solution is to first parse the JSON column alongside the unique identifier, then lateral flatten it, and finally group the results by the unique identifier (reshaping the table from long to wide, as described in the thread above).
So for the examples I provided above, the solution would be as follows:
with tb as (
-- parse the raw JSON text once, keeping the row identifier
select id,
to_variant(parse_json(json_tag)) as parsed_json_tags
from mytable
),
flattened as (
-- one output row per {key, value} object in the array
select id,
VALUE:value::string as value,
VALUE:key::string as key
from tb, lateral flatten(input => tb.parsed_json_tags)
)
select id,
MAX( IFF( key='app.name', value, NULL ) ) AS app_name,
MAX( IFF( key='app.cost', value, NULL ) ) AS app_cost,
MAX( IFF( key='device.name', value, NULL ) ) AS device_name,
MAX( IFF( key='device.color', value, NULL ) ) AS device_color,
MAX( IFF( key='iOS.dist', value, NULL ) ) AS ios_dist
from flattened
group by 1;
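The flatten-then-group approach above is a long-to-wide reshape: each {key, value} pair becomes a lookup entry, and every wanted key is read by name, with a missing key yielding NULL. A minimal Python sketch of that idea follows; the mytable list and the WANTED_KEYS name are hypothetical, introduced only for illustration.

```python
import json

# Column list chosen up front, like the MAX(IFF(key = '...')) expressions.
WANTED_KEYS = ["app.name", "app.cost", "device.name", "device.color", "iOS.dist"]

# Hypothetical stand-in for mytable: (id, json_tag) pairs.
mytable = [
    (1, '[{"key": "app.name", "value": "myapp1"},'
        ' {"key": "device.name", "value": "myiPhone11"},'
        ' {"key": "iOS.dist", "value": "latestDist5"}]'),
    (2, '[{"key": "app.name", "value": "myapp2"},'
        ' {"key": "app.cost", "value": "$2.99"},'
        ' {"key": "device.name", "value": "myiPhoneX"},'
        ' {"key": "device.color", "value": "gold"},'
        ' {"key": "iOS.dist", "value": "latestDist4.9"}]'),
]

def to_wide(rows, wanted=WANTED_KEYS):
    """Turn each row's list of {key, value} pairs into one dict per id."""
    wide = {}
    for row_id, json_tag in rows:
        # "Flatten" step: index the pairs by key name instead of position.
        lookup = {entry["key"]: entry["value"] for entry in json.loads(json_tag)}
        # "Group" step: one output record per id; absent keys become None.
        wide[row_id] = {key: lookup.get(key) for key in wanted}
    return wide

wide = to_wide(mytable)
print(wide[1]["app.cost"], wide[2]["device.color"])
```

Keys outside the chosen list are simply ignored, so, as in the SQL, a new key name needs a new column expression.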

How to parse this array of dicts and extract key columns from a big query external table

I have a giant array of dicts loaded from JSON into a date-partitioned BigQuery external table, with the table structure below:
Field name  Type     Mode
meta        Record   Nullable
Messages    String   Repeated
date        Integer  Nullable
Every "Messages" field is in its own row/record in my BigQuery table (newline-delimited JSON).
I am trying to parse the "Messages" field/column to extract the fields Key1 and Key2, which are inside an array (of dicts). For the sake of simplicity, below is a snippet of the JSON in which "Messages" is the field I am trying to unnest/explode.
Ignore this schema; updated schema below.
[
{
"meta": {
"table": "FEED",
"source": "CP1"
},
"Messages": [
"{
"Key1":"2022-01-10",
"Key2":"H21257061"
}"
],
"date": "20220110"
},
{
"meta": {
"table": "FEED",
"source": "CP1"
},
"Messages": [
"{
"Key1":"2022-01-11",
"Key2":"H21257062"
}"
],
"date": "20220111"
}
]
Updated schema on 01/17:
{
"meta": {
"table": "FEED",
"source": "CP1"
},
"Messages": [
"{
"Key1":"2022-01-10",
"Key2":"H21257061"
}",
"{
"Key1":"2022-01-10",
"Key2":"H21257062"
}"
],
"date": "20220110"
},
So far I have tried the query below, but key1 and key2 both come back as NULL:
WITH table AS (SELECT Messages as array_column FROM `project.dataset.table` )
SELECT
json_extract_scalar(flattened_array, '$.Messages.key1') as key1,
json_extract_scalar(flattened_array, '$.Messages.key2') as key2
FROM table t
CROSS JOIN UNNEST(t.array_column) AS flattened_array
The question is still a little ambiguous, so I am assuming your table matches the structure/schema shown in your question.
If that assumption is correct, consider the approach below:
select * except(id) from (
select to_json_string(t) id, kv[offset(0)] as key, kv[safe_offset(1)] as value
from your_table t,
t.messages as message,
unnest([struct( split(translate(message, '"', ''), ':') as kv)])
)
pivot (min(value) for key in ('Key1', 'Key2'))
Edit: trying to help you further. Judging by your update, it looks like messages is a single string in your table.
In that case, try the version below (a light modification of the previous one):
select * except(id) from (
select to_json_string(t) id, kv[offset(0)] as key, kv[safe_offset(1)] as value
from your_table t,
unnest(regexp_extract_all(messages, r'"[^"]+":"[^"]+"')) as message,
unnest([struct( split(translate(message, '"', ''), ':') as kv)])
)
pivot (min(value) for key in ('Key1', 'Key2'))
But obviously, I would use the simplest approach below:
select
json_extract_scalar(messages, '$.Key1') as Key1,
json_extract_scalar(messages, '$.Key2') as Key2
from your_table
SELECT
JSON_QUERY(message,"$.Key1") as Key1,
JSON_QUERY(message,"$.Key2") as Key2
FROM
`project.dataset.table` as table
CROSS JOIN UNNEST(table.Messages) as message
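The CROSS JOIN with UNNEST flattens the array, returning one row per message.
After that, JSON_QUERY extracts the needed values from each JSON string. Note that JSON_QUERY returns JSON-formatted strings, so string values keep their double quotes; JSON_VALUE returns the bare scalar instead.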
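As a rough illustration of the unnest-then-extract approach above, the Python sketch below treats each element of Messages as its own JSON document and reads Key1 and Key2 from it directly. The row dict is a hypothetical record shaped like the updated schema. It also hints at a likely reason the attempt in the question returned NULLs: the path there went through $.Messages even though each unnested element is already the message, and it used lowercase key names, while JSON keys are case-sensitive.

```python
import json

# Hypothetical record shaped like the updated schema: Messages is a
# repeated field whose elements are JSON-encoded strings.
row = {
    "meta": {"table": "FEED", "source": "CP1"},
    "Messages": [
        '{"Key1":"2022-01-10","Key2":"H21257061"}',
        '{"Key1":"2022-01-10","Key2":"H21257062"}',
    ],
    "date": "20220110",
}

def extract_keys(record):
    """Return one (Key1, Key2) tuple per message, like UNNEST plus extraction."""
    extracted = []
    for message in record["Messages"]:   # one output row per array element
        parsed = json.loads(message)     # each element is a JSON string
        extracted.append((parsed.get("Key1"), parsed.get("Key2")))
    return extracted

extracted = extract_keys(row)

# Looking up the lowercase name finds nothing, because keys are case-sensitive.
lowercase_lookup = [json.loads(m).get("key1") for m in row["Messages"]]
print(extracted, lowercase_lookup)
```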

Query all objects within json array

I'm looking to do a query on a column in my database but the column is of type jsonb. This is an example of the structure:
select json_column->>'left' from schema.table;
[{"id": 123, "name": "Joe"},
{"id": 456, "name": "Jane"},
{"id": 789, "name": "John"},
{"id": 159, "name": "Jess"}]
Essentially I'm just trying to return all the name fields from this but I can't figure it out.
I have tried
select json_column->'left'->>'name' from schema.table
But this just returns a blank value.
I have also tried:
select elem->>'name'
from schema.table m,
jsonb_array_elements(json_column->'left') elem;
But that gives me:
ERROR: cannot extract elements from an object
This seems to work when I have a where clause inserted, for example:
select elem->>'name'
from schema.table m,
jsonb_array_elements(json_column->'left') elem
where m.id = 1;
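That the error disappears once the query is restricted to one row suggests (as an assumption, since the other rows are not shown) that some rows hold an object rather than an array under 'left'. The Python sketch below shows the idea of expanding only the rows whose value really is an array; the rows list is hypothetical. In PostgreSQL, the analogous guard would be a condition on jsonb_typeof(json_column->'left') = 'array'.

```python
import json

# Hypothetical rows: id 1 stores an array under "left", id 2 stores an object.
rows = [
    (1, '{"left": [{"id": 123, "name": "Joe"}, {"id": 456, "name": "Jane"}]}'),
    (2, '{"left": {"id": 789, "name": "John"}}'),
]

def names_from_arrays(table_rows):
    """Collect every name, skipping rows whose "left" value is not an array."""
    names = []
    for _row_id, raw in table_rows:
        left = json.loads(raw).get("left")
        if isinstance(left, list):  # only arrays can be expanded element-wise
            names.extend(elem.get("name") for elem in left)
    return names

names = names_from_arrays(rows)
print(names)
```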

Postgres JSONB do a select from an array of data

I'm using a Postgres database and I'm trying to use the new JSONB type. I have a table named employees with a column named previous_companies that contains the following JSON data:
[{"company":"Facebook", "link": "www.facebook.com"}, {"company":"Google", "link": "www.google.com"}, {"company":"Some Face", "link": "www.someface.com"}]
I'm trying to select all the employees that have a certain string in the "company" field. For example:
If I want all the employees that worked at a company that has "face" in its name, I would get:
[{"company":"Facebook", "link": "www.facebook.com"}, {"company":"Some Face", "link": "www.someface.com"}]
I was able to do a query for the EXACT string, like this:
SELECT * FROM employees WHERE previous_companies @> '[{"company":"Facebook"}]'
but it returns this: [{"company":"Facebook", "link": "www.facebook.com"}]
As you can see this does not support querying for incomplete strings.
Thanks!
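The jsonb_array_elements() function may be helpful for querying a JSONB column that holds an array. Expand the array into one row per element, filter the elements, then aggregate them back per employee: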
SELECT
id,
to_jsonb(array_agg(previous_company)) AS previous_companies
FROM (
SELECT
id,
jsonb_array_elements(previous_companies) AS previous_company
FROM ( VALUES
('id1', '[{"company":"Facebook", "link": "www.facebook.com"},{"company":"Google", "link": "www.google.com"}, {"company":"Some Face", "link": "www.someface.com"}]'::jsonb),
('id2', '[{"company":"Some Face", "link": "www.someface.com"}]'::jsonb),
('id3', '[{"company":"Google", "link": "www.google.com"}]'::jsonb)
) employees (id, previous_companies)
) T
WHERE
lower(previous_company->>'company') LIKE '%face%'
GROUP BY
id
;
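The expand, filter, and regroup steps of the query above can be sketched in Python as follows. The employees list reuses the sample values from the VALUES clause, and the function name companies_matching is hypothetical. Employees with no matching company are left out, as they are by the WHERE clause.

```python
import json

# Same sample data as the VALUES clause above: (id, previous_companies).
employees = [
    ("id1", '[{"company":"Facebook", "link": "www.facebook.com"},'
            ' {"company":"Google", "link": "www.google.com"},'
            ' {"company":"Some Face", "link": "www.someface.com"}]'),
    ("id2", '[{"company":"Some Face", "link": "www.someface.com"}]'),
    ("id3", '[{"company":"Google", "link": "www.google.com"}]'),
]

def companies_matching(rows, needle):
    """Keep, per employee, the companies whose name contains `needle`."""
    needle = needle.lower()
    matches = {}
    for emp_id, raw in rows:
        # Case-insensitive substring test, like lower(...) LIKE '%face%'.
        kept = [c for c in json.loads(raw) if needle in c["company"].lower()]
        if kept:  # employees without any match produce no output row
            matches[emp_id] = kept
    return matches

matches = companies_matching(employees, "face")
print(sorted(matches))
```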