How to iterate on json data with sql/knexjs query - sql

I'm using a PostgreSQL database. I have a table named 'offers' with a column 'validity' that contains the following data in JSON format:
[{"end_date": "2019-12-31", "program_id": "4", "start_date": "2019-10-27"},
{"end_date":"2020-12-31", "program_id": "6", "start_date": "2020-01-01"},
{"end_date": "2020-01-01", "program_id": "3", "start_date": "2019-10-12"}]
Now I want to get all records where the 'validity' column contains an entry with
program_id = 4 and end_date > current_date.
How can I write a SQL or Knex.js query to achieve this?
Thanks in advance

You can use an EXISTS condition:
select o.*
from offers o
where exists (select *
              from jsonb_array_elements(o.validity) as v(item)
              where v.item ->> 'program_id' = '4'
                and (v.item ->> 'end_date')::date > current_date)
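If it helps to sanity-check the logic outside the database, here is a rough plain-Python sketch of what the EXISTS subquery computes. The offer rows and the fixed "today" date are made-up illustration data, not your real table:

```python
from datetime import date

# Hypothetical rows mirroring the 'offers' table; 'validity' holds the
# parsed JSON array from the question.
offers = [
    {"id": 1, "validity": [
        {"end_date": "2019-12-31", "program_id": "4", "start_date": "2019-10-27"},
        {"end_date": "2020-12-31", "program_id": "6", "start_date": "2020-01-01"},
    ]},
    {"id": 2, "validity": [
        {"end_date": "2020-01-01", "program_id": "3", "start_date": "2019-10-12"},
    ]},
]

def offer_matches(offer, program_id, today):
    # EXISTS: keep the offer if ANY element of the validity array has
    # the wanted program_id and an end_date after "today".
    return any(
        v["program_id"] == program_id
        and date.fromisoformat(v["end_date"]) > today
        for v in offer["validity"]
    )

today = date(2019, 11, 1)  # fixed so the example stays deterministic
selected = [o["id"] for o in offers if offer_matches(o, "4", today)]
print(selected)  # [1]
```

From Knex.js, a query like this is typically passed through `knex.raw()` or `whereRaw()`, since the JSON-path parts have no query-builder equivalent.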

Related

BigQuery unnest to new columns instead of new rows

I have a BigQuery table with nested and repeated columns that I'm having trouble unnesting to columns instead of rows.
This is what the table looks like:
{
  "dateTime": "Tue Mar 01 2022 11:11:11 GMT-0800 (Pacific Standard Time)",
  "Field1": "123456",
  "Field2": true,
  "Matches": [
    {
      "ID": "abc123",
      "FieldA": "asdfljadsf",
      "FieldB": "0.99"
    },
    {
      "ID": "def456",
      "FieldA": "sdgfhgdkghj",
      "FieldB": "0.92"
    },
    {
      "ID": "ghi789",
      "FieldA": "tfgjyrtjy",
      "FieldB": "0.64"
    }
  ]
},
What I want is to return a table with each one of the nested fields as an individual column so I have a clean dataframe to work with:
{
  "dateTime": "Tue Mar 01 2022 11:11:11 GMT-0800 (Pacific Standard Time)",
  "Field1": "123456",
  "Field2": true,
  "Matches_1_ID": "abc123",
  "Matches_1_FieldA": "asdfljadsf",
  "Matches_1_FieldB": "0.99",
  "Matches_2_ID": "def456",
  "Matches_2_FieldA": "sdgfhgdkghj",
  "Matches_2_FieldB": "0.92",
  "Matches_3_ID": "ghi789",
  "Matches_3_FieldA": "tfgjyrtjy",
  "Matches_3_FieldB": "0.64"
},
I tried using UNNEST as below; however, it creates new rows with only one set of additional columns, which is not what I'm looking for.
SELECT *
FROM table,
UNNEST(Matches) as items
Any solution for this? Thank you in advance.
Consider the approach below:
select * from (
  select t.* except(Matches), match.*, offset + 1 as offset
  from your_table t, t.Matches match with offset
)
pivot (min(ID) as id, min(FieldA) as FieldA, min(FieldB) as FieldB for offset in (1, 2, 3))
Applied to the sample data in your question, this produces the desired output.
If there are no matches, that entire row is not included in the output table, whereas I want to keep that record with the fields it does have. How can I modify this to keep all records?
Use a left join instead, as below:
select * from (
  select t.* except(Matches), match.*, offset + 1 as offset
  from your_table t left join t.Matches match with offset
)
pivot (min(ID) as id, min(FieldA) as FieldA, min(FieldB) as FieldB for offset in (1, 2, 3))
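To see what the unnest-plus-pivot combination computes, here is a plain-Python sketch of the same reshaping. The sample row is invented, and the column naming simply follows the desired output in the question:

```python
# Flatten the repeated Matches field into numbered columns,
# mirroring "t.* except(Matches)" plus the pivot on offset.
row = {
    "Field1": "123456",
    "Matches": [
        {"ID": "abc123", "FieldA": "asdfljadsf", "FieldB": "0.99"},
        {"ID": "def456", "FieldA": "sdgfhgdkghj", "FieldB": "0.92"},
    ],
}

flat = {k: v for k, v in row.items() if k != "Matches"}  # t.* except(Matches)
for offset, match in enumerate(row["Matches"], start=1):  # "with offset", 1-based
    for key, value in match.items():
        flat[f"Matches_{offset}_{key}"] = value  # pivot: one column per offset/key

print(flat["Matches_2_ID"])  # def456
```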

Select items from jsonb array in postgres 12

I'm trying to pull elements from JSONB column.
I have table like:
id NUMBER
data JSONB
data structure is:
[{
  "id": "abcd",
  "validTo": "timestamp"
}, ...]
I'm querying that row with SELECT * FROM testtable WHERE data @> '[{"id": "abcd"}]', and it almost works like I want it to.
The trouble is data column is huge, like 100k records, so I would like to pull only data elements I'm looking for.
For example, if I query
SELECT * FROM testtable WHERE data @> '[{"id": "abcd"}]' OR data @> '[{"id": "abcde"}]', I expect the data column to contain only records with id abcd or abcde, like this:
[
{"id": "abcd"},
{"id": "abcde"}
]
It would also be okay if the query returned separate entries, each with a single data record.
I have no idea how to solve this; I've been trying different options for days.
To get separate output rows for records that have multiple matches:
with a (id, data) as (
  values
    (1, '[{"id": "abcd", "validTo": 2}, {"id": "abcde", "validTo": 4}]'::jsonb),
    (2, '[{"id": "abcd", "validTo": 3}, {"id": "abc", "validTo": 6}]'::jsonb),
    (3, '[{"id": "abc", "validTo": 5}]'::jsonb)
)
select id,
       jsonb_array_elements(jsonb_path_query_array(data, '$[*] ? (@.id == "abcd" || @.id == "abcde")'))
from a;
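As a cross-check of the intent, here is a rough plain-Python analogue of this "one output row per matched element" shape, using the same made-up sample rows as the query above:

```python
import json

# Sample (id, data) rows approximating the CTE values above.
rows = [
    (1, '[{"id": "abcd", "validTo": 2}, {"id": "abcde", "validTo": 4}]'),
    (2, '[{"id": "abcd", "validTo": 3}, {"id": "abc", "validTo": 6}]'),
    (3, '[{"id": "abc", "validTo": 5}]'),
]

wanted = {"abcd", "abcde"}
result = [
    (row_id, elem)               # jsonb_array_elements over the filtered array
    for row_id, data in rows
    for elem in json.loads(data)
    if elem["id"] in wanted      # the jsonpath filter $[*] ? (@.id == ...)
]
print(result)
```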
You will need to unnest, filter and aggregate back:
select t.id, j.*
from testtable t
join lateral (
  select jsonb_agg(e.x) as data
  from jsonb_array_elements(t.data) as e(x)
  where e.x @> '{"id": "abcd"}'
     or e.x @> '{"id": "abcde"}'
) as j on true
With Postgres 12 you could use jsonb_path_query_array() as an alternative, but that would require repeating the conditions:
select t.id,
       jsonb_path_query_array(data, '$[*] ? (@.id == "abcd" || @.id == "abcde")')
from testtable t
where t.data @> '[{"id": "abcd"}]'
   or t.data @> '[{"id": "abcde"}]'
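For comparison, a small plain-Python sketch of the unnest-filter-aggregate pattern. The sample rows are invented; note that rows with no matching element drop out, just as with the inner lateral join:

```python
import json

# Made-up rows approximating testtable's (id, data).
rows = [
    (1, '[{"id": "abcd", "validTo": 2}, {"id": "abcde", "validTo": 4}]'),
    (2, '[{"id": "abcd", "validTo": 3}, {"id": "abc", "validTo": 6}]'),
    (3, '[{"id": "abc", "validTo": 5}]'),
]

wanted = {"abcd", "abcde"}
result = []
for row_id, data in rows:
    # unnest, filter, aggregate back - mirroring the lateral subquery
    kept = [e for e in json.loads(data) if e["id"] in wanted]
    if kept:  # rows with no match are dropped, like the inner join
        result.append((row_id, kept))

print(result)
```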
I didn't quite get your question. Are you asking for the result to contain only the data column, without the id column? Then I think this is the query:
select data from testtable where id = 'abcd' or id = 'abcde';

JSONB subset of array

I have the following JSONB field on my posts table: comments.
comments looks like this:
[
  {"id": 1, "text": "My comment"},
  {"id": 2, "text": "My other comment"}
]
I would like to select some information about each comments.
SELECT comments->?????? FROM posts WHERE posts.id = 1;
Is there a way for me to select only the id fields of my JSON. Eg. the result of my SQL query should be:
[{"id": 1}, {"id": 2}]
Thanks!
You can use jsonb_to_recordset to split each comment into its own row. Only columns you specify will end up in the row, so you can use this to keep only the id column. Then you can aggregate the comments for one post into an array using json_agg:
select json_agg(c)
from posts p
cross join lateral
jsonb_to_recordset(comments) c(id int) -- Only keep id
where p.id = 1
This results in:
[{"id":1},{"id":2}]
Example at db-fiddle.com
You may use json_build_object + json_agg
select json_agg(json_build_object('id', (j ->> 'id')::int))
from posts cross join jsonb_array_elements(comments) as j
where posts.id = 1;
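Outside the database, the projection both answers perform boils down to keeping only the id key of each element. A minimal Python sketch, using the sample comments from the question:

```python
# Keep only the "id" key from each comment object, mirroring
# jsonb_to_recordset(...) c(id int) or json_build_object('id', ...).
comments = [
    {"id": 1, "text": "My comment"},
    {"id": 2, "text": "My other comment"},
]

ids_only = [{"id": c["id"]} for c in comments]
print(ids_only)  # [{'id': 1}, {'id': 2}]
```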

Extracting data from an array of JSON objects for specific object values

In my table, there is a column of JSON type which contains an array of objects describing time offsets:
[
  {
    "type": "start",
    "time": 1.234
  },
  {
    "type": "end",
    "time": 50.403
  }
]
I know that I can extract these with JSON_EACH() and JSON_EXTRACT():
CREATE TEMPORARY TABLE Items(
  id INTEGER PRIMARY KEY,
  timings JSON
);
INSERT INTO Items(timings) VALUES
  ('[{"type": "start", "time": 12.345}, {"type": "end", "time": 67.891}]'),
  ('[{"type": "start", "time": 24.56}, {"type": "end", "time": 78.901}]');
SELECT
  JSON_EXTRACT(Timings.value, '$.type'),
  JSON_EXTRACT(Timings.value, '$.time')
FROM
  Items,
  JSON_EACH(timings) AS Timings;
This returns a table like:
start  12.345
end    67.891
start  24.56
end    78.901
What I really need though is to:
Find the timings of specific types. (Find the first object in the array that matches a condition.)
Take this data and select it as a column with the rest of the table.
In other words, I'm looking for a table that looks like this:
id start end
-----------------------------
0 12.345 67.891
1 24.56 78.901
I'm hoping for some sort of query like this:
SELECT
id,
JSON_EXTRACT(timings, '$.[type="start"].time'),
JSON_EXTRACT(timings, '$.[type="end"].time')
FROM Items;
Is there some way to use path in the JSON functions to select what I need? Or, some other way to pivot what I have in the first example to apply to the table?
One possibility:
WITH cte(id, json) AS
(SELECT Items.id
, json_group_object(json_extract(j.value, '$.type'), json_extract(j.value, '$.time'))
FROM Items
JOIN json_each(timings) AS j ON json_extract(j.value, '$.type') IN ('start', 'end')
GROUP BY Items.id)
SELECT id
, json_extract(json, '$.start') AS start
, json_extract(json, '$.end') AS "end"
FROM cte
ORDER BY id;
which gives
id start end
---------- ---------- ----------
1 12.345 67.891
2 24.56 78.901
Another one, that uses the window functions added in sqlite 3.25 and avoids creating intermediate JSON objects:
SELECT DISTINCT Items.id
, max(json_extract(j.value, '$.time'))
FILTER (WHERE json_extract(j.value, '$.type') = 'start') OVER ids AS start
, max(json_extract(j.value, '$.time'))
FILTER (WHERE json_extract(j.value, '$.type') = 'end') OVER ids AS "end"
FROM Items
JOIN json_each(timings) AS j ON json_extract(j.value, '$.type') IN ('start', 'end')
WINDOW ids AS (PARTITION BY Items.id)
ORDER BY Items.id;
The key is using the ON clause of the JOIN to limit results to just the two objects in each array that you care about, and then merging those up to two rows for each Items.id into one with a couple of different approaches.
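Since Python ships with SQLite, the first (CTE) approach can be verified end to end with the standard sqlite3 module. This assumes a build with the JSON functions enabled, which is the case for virtually all modern SQLite builds:

```python
import sqlite3

# In-memory database seeded with the sample data from the question.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Items(id INTEGER PRIMARY KEY, timings JSON);
INSERT INTO Items(timings) VALUES
('[{"type": "start", "time": 12.345}, {"type": "end", "time": 67.891}]'),
('[{"type": "start", "time": 24.56}, {"type": "end", "time": 78.901}]');
""")

# The json_group_object CTE from the answer above.
rows = con.execute("""
WITH cte(id, json) AS (
  SELECT Items.id,
         json_group_object(json_extract(j.value, '$.type'),
                           json_extract(j.value, '$.time'))
  FROM Items
  JOIN json_each(timings) AS j
    ON json_extract(j.value, '$.type') IN ('start', 'end')
  GROUP BY Items.id
)
SELECT id,
       json_extract(json, '$.start') AS start,
       json_extract(json, '$.end') AS "end"
FROM cte
ORDER BY id;
""").fetchall()
print(rows)
```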

How to query all entries with a value in a nested bigquery table

I generated a BigQuery table using an existing BigTable table, and the result is a multi-nested dataset that I'm struggling to query from. Here's the format of an entry from that BigQuery table just doing a simple select * from my_table limit 1:
[
  {
    "rowkey": "XA_1234_0",
    "info": {
      "column": [],
      "somename": {
        "cell": [
          {
            "timestamp": "1514357827.321",
            "value": "1234"
          }
        ]
      },
      ...
    }
  },
  ...
]
What I need is to be able to get all entries from my_table where the value of somename is X, for instance. There will be multiple rowkeys where the value of somename will be X and I need all the data from each of those rowkey entries.
OR
If I could have a query where rowkey contains X, so to get "XA_1234_0", "XA_1234_1"... The "XA" and the "0" can change but the middle numbers to be the same. I've tried doing a where rowkey like "$_1234_$" but the query goes on for over a minute and is way too long for some reason.
I am using standard SQL.
EDIT: Here's an example of a query I tried that didn't work (with error: Cannot access field value on a value with type ARRAY<STRUCT<timestamp TIMESTAMP, value STRING>>), but best describes what I'm trying to achieve:
SELECT * FROM `my_dataset.mytable` where info.field_name.cell.value=12345
I want to get all records whose value in field_name equals some value.
From the sample Firebase Analytics dataset:
#standardSQL
SELECT *
FROM `firebase-analytics-sample-data.android_dataset.app_events_20160607`
WHERE EXISTS(
SELECT * FROM UNNEST(user_dim.user_properties)
WHERE key='powers' AND value.value.string_value='20'
)
LIMIT 1000
Below is for BigQuery Standard SQL
#standardSQL
SELECT t.*
FROM `my_dataset.mytable` t,
UNNEST(info.somename.cell) c
WHERE c.value = '1234'
The above assumes the specific value can appear in each record just once - hopefully that is true for your case.
If that is not the case, the below should work:
#standardSQL
SELECT *
FROM `yourproject.yourdadtaset.yourtable`
WHERE EXISTS(
  SELECT *
  FROM UNNEST(info.somename.cell)
  WHERE value = '1234'
)
which, I just realised, is pretty much the same as Felipe's version - just using your table / schema.
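As a last sanity check, here is a plain-Python sketch of what the EXISTS-over-UNNEST filter does, on invented records loosely modelled on the sample in the question:

```python
# Hypothetical records with the nested info.somename.cell structure.
records = [
    {"rowkey": "XA_1234_0",
     "info": {"somename": {"cell": [
         {"timestamp": "1514357827.321", "value": "1234"}]}}},
    {"rowkey": "XB_9999_0",
     "info": {"somename": {"cell": [
         {"timestamp": "1514357827.321", "value": "5678"}]}}},
]

def cell_has_value(record, wanted):
    # EXISTS(SELECT * FROM UNNEST(info.somename.cell) WHERE value = wanted)
    return any(c["value"] == wanted for c in record["info"]["somename"]["cell"])

matching = [r["rowkey"] for r in records if cell_has_value(r, "1234")]
print(matching)  # ['XA_1234_0']
```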