Postgres query nested JSONB - sql

I have a JSONB column containing a list of objects.
Here's the table schema:
Column Name | Datatype
----------------------
timestamp   | timestamp
data        | JSONB
Sample Data
1.
timestamp: 2020-02-02 19:01:21.571429+00
data: [
  { "tracker_id": "5",  "position": 1 },
  { "tracker_id": "11", "position": 2 },
  { "tracker_id": "4",  "position": 1 }
]
2.
timestamp: 2020-02-02 19:01:23.571429+00
data: [
  { "tracker_id": "7", "position": 3 },
  { "tracker_id": "4", "position": 2 }
]
3.
timestamp: 2020-02-02 19:02:23.571429+00
data: [
  { "tracker_id": "5", "position": 2 },
  { "tracker_id": "4", "position": 1 }
]
I need to find the count of transitions of a tracker_id from position 1 to position 2.
Here, the output will be 2, since tracker_ids 4 and 5 changed their position from 1 to 2.
Note
The transitions should be evaluated in ascending order of timestamp.
The position change need not occur in consecutive records.
I'm using the timescaledb extension.
So far I've tried querying the objects in the list of an individual record, but I'm not sure how to merge the list objects across records and query them.
What would be the query for this? Should I write a stored procedure instead?

I don't use the timescaledb extension, so I would choose a pure SQL solution based on unnesting the JSON:
with t (timestamp, data) as (values
  (timestamp '2020-02-02 19:01:21.571429+00', '[
    { "tracker_id": "5",  "position": 1 },
    { "tracker_id": "11", "position": 2 },
    { "tracker_id": "4",  "position": 1 }
  ]'::jsonb),
  (timestamp '2020-02-02 19:01:23.571429+00', '[
    { "tracker_id": "7", "position": 3 },
    { "tracker_id": "4", "position": 2 }
  ]'::jsonb),
  (timestamp '2020-02-02 19:02:23.571429+00', '[
    { "tracker_id": "5", "position": 2 },
    { "tracker_id": "4", "position": 1 }
  ]'::jsonb)
), unnested as (
  select t.timestamp, r.tracker_id, r.position
  from t
  cross join lateral jsonb_to_recordset(t.data) as r(tracker_id text, position int)
)
select count(*)
from unnested u1
join unnested u2
  on u1.tracker_id = u2.tracker_id
 and u1.position = 1
 and u2.position = 2
 and u1.timestamp < u2.timestamp;
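One caveat, a judgment call beyond the question's sample: count(*) counts every (position 1, later position 2) pair, so a tracker that makes the 1 → 2 move more than once is counted more than once. If each tracker should be counted once, as the expected output of 2 suggests, count distinct trackers instead:
select count(distinct u1.tracker_id)
from unnested u1
join unnested u2
  on u1.tracker_id = u2.tracker_id
 and u1.position = 1
 and u2.position = 2
 and u1.timestamp < u2.timestamp;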

There are various functions that will help compose several database rows into a single JSON structure: row_to_json(), array_to_json(), and array_agg().
You would then use the usual SELECT with an ORDER BY clause to get the timestamps/JSON data you want, and use the above functions to create a single JSON structure.
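A minimal sketch of that idea (assuming the table is simply named t, as in the answer above): collapse all rows, ordered by timestamp, into one JSON array of row objects.
select array_to_json(array_agg(row_to_json(t) order by t.timestamp))
from t;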

Related

How to unpack Array to Rows in Snowflake?

I have a table that looks like the following in Snowflake:
ID | CODES
2 | [ { "list": [ { "item": "CODE1" }, { "item": "CODE2" } ] } ]
And I want to make it into:
ID | CODES
2 | 'CODE1'
2 | 'CODE2'
So far I've tried
SELECT ID,CODES[0]:list
FROM MY_TABLE
But that only gets me as far as:
ID | CODES
2 | [ { "item": "CODE1" }, { "item": "CODE2" } ]
How can I break out every 'item' element from every index of this list into its own row with each CODE as a string?
Update: here is the answer I got working at the same time as the answer below; it looks like we both used FLATTEN:
SELECT ID,f.value:item
FROM MY_TABLE,
lateral flatten(input => MY_TABLE.CODES[0]:list) f
So, as you note, you have hard-coded your access into the codes via codes[0], which gives you only the first item from that array; if you use FLATTEN you can access all of the objects of the outer array.
WITH my_table(id, codes) AS (
  SELECT 2, parse_json('[ { "list": [ { "item": "CODE1" }, { "item": "CODE2" } ] } ]')
)
SELECT ID, c.*
FROM my_table,
     table(flatten(codes)) c;
gives:
2 1 [0] 0 { "list": [ { "item": "CODE1" }, { "item": "CODE2" }]} [ { "list": [{"item": "CODE1"}, { "item": "CODE2" }]}]
So now you want to loop across the items in list, and for that we use another FLATTEN:
WITH my_table(id, codes) AS (
  SELECT 2, parse_json('[ { "list": [ { "item": "CODE1" }, { "item": "CODE2" } ] } ]')
)
SELECT ID, c.value, l.value
FROM my_table,
     table(flatten(codes)) c,
     table(flatten(c.value:list)) l;
gives:
2 {"list":[{"item": "CODE1"},{"item":"CODE2"}]} {"item":"CODE1"}
2 {"list":[{"item": "CODE1"},{"item":"CODE2"}]} {"item":"CODE2"}
so you can pull that l.value apart however you need to access the parts you want.
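To land exactly on the output in the question, one plain string per row, a small extension of the same query should do it; the only piece not shown above is the ::string cast on the variant value:
SELECT ID, l.value:item::string AS code
FROM my_table,
     table(flatten(codes)) c,
     table(flatten(c.value:list)) l;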

How to filter Cosmos DB data based on value of an element in an array of values Using SQL API

I have a Cosmos DB collection with the below data in it.
I have to find the data only for the EVENT named ABC, and its value, using a SQL query.
[
  {
    "ID": "01XXXXX",
    "EVENTS": [
      { "Name": "ABC", "Value": 0 },
      { "Name": "XYZ", "Value": 4 },
      { "Name": "PQR", "Value": 5 }
    ]
  },
  {
    "ID": "02XXXXX",
    "EVENTS": [
      { "Name": "ABC", "Value": 1 },
      { "Name": "XYZ", "Value": 2 },
      { "Name": "PQR", "Value": 3 }
    ]
  }
]
I have tried the below query, but it is not working since EVENTS is an array.
SELECT * FROM c where c.EVENTS.Name = 'ABC'
Is there any way to filter out only the data with an event Name of 'ABC' using SQL?
Try using a join:
SELECT c FROM c
join l in c.EVENTS
where l.Name = 'ABC'
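An alternative worth noting (not from the original answer): ARRAY_CONTAINS with its third argument set to true does partial matching against the objects in the array, and it avoids the duplicate documents a JOIN can return when several events match:
SELECT * FROM c
WHERE ARRAY_CONTAINS(c.EVENTS, { "Name": "ABC" }, true)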

Delete element in a deeply nested array in jsonb column - Postgres

I have a table my_table with a jsonb column that contains some data, for instance, in a single row, the column can contain the following data:
[
  {
    "x_id": "1",
    "type": "t1",
    "parts": [
      { "part_id": "1", "price": 400 },
      { "part_id": "2", "price": 500 },
      { "part_id": "3", "price": 0 }
    ]
  },
  {
    "x_id": "2",
    "type": "t1",
    "parts": [
      { "part_id": "1", "price": 1000 },
      { "part_id": "3", "price": 60 }
    ]
  },
  {
    "x_id": "3",
    "type": "t2",
    "parts": [
      { "part_id": "1", "price": 100 },
      { "part_id": "3", "price": 780 },
      { "part_id": "2", "price": 990 }
    ]
  }
]
I need help finding how to delete an element from the parts array, given an x_id and a part_id.
Example
Given x_id=2 and part_id=1, I need the data to be updated to become:
[
  {
    "x_id": "1",
    "type": "t1",
    "parts": [
      { "part_id": "1", "price": 400 },
      { "part_id": "2", "price": 500 },
      { "part_id": "3", "price": 0 }
    ]
  },
  {
    "x_id": "2",
    "type": "t1",
    "parts": [
      { "part_id": "3", "price": 60 }
    ]
  },
  {
    "x_id": "3",
    "type": "t2",
    "parts": [
      { "part_id": "1", "price": 100 },
      { "part_id": "3", "price": 780 },
      { "part_id": "2", "price": 990 }
    ]
  }
]
PS1: this data cannot be normalized, so that's not a possible solution.
PS2: I'm running PostgreSQL 9.6.
PS3: I have checked this question and this question, but my data structure seems too complex compared to the other questions, so I can't apply the given answers.
Edit 1: the JSON data can be big, especially the parts array, which can have anywhere from 0 elements to thousands.
I think you can use the #- operator (see functions-json); you just need to find the path of the array element to remove:
select data #- p.path
from test as t
cross join lateral (
  select array[(a.i-1)::text, 'parts', (b.i-1)::text]
  from jsonb_array_elements(t.data) with ordinality as a(data, i),
       jsonb_array_elements(a.data->'parts') with ordinality as b(data, i)
  where a.data ->> 'x_id' = '2'
    and b.data ->> 'part_id' = '1'
) as p(path)
or
update test as t set
  data = data #- (
    select array[(a.i-1)::text, 'parts', (b.i-1)::text]
    from jsonb_array_elements(t.data) with ordinality as a(data, i),
         jsonb_array_elements(a.data->'parts') with ordinality as b(data, i)
    where a.data ->> 'x_id' = '2'
      and b.data ->> 'part_id' = '1'
  )
db<>fiddle demo
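For reference, #- deletes the field or array element at the given text-array path, so a standalone example of the operator looks like this:
select '[{"a": [1, 2, 3]}]'::jsonb #- '{0,a,1}';
-- result: [{"a": [1, 3]}]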
Update: there is a reasonable comment that the update above works incorrectly if the given path doesn't exist in the data. In that case you can either duplicate the expression in the where clause:
update test as t set
  data = data #- (
    select array[(a.i-1)::text, 'parts', (b.i-1)::text]
    from jsonb_array_elements(t.data) with ordinality as a(data, i),
         jsonb_array_elements(a.data->'parts') with ordinality as b(data, i)
    where a.data ->> 'x_id' = '2'
      and b.data ->> 'part_id' = '23222'
  )
where exists (
  select *
  from jsonb_array_elements(t.data) as a(data),
       jsonb_array_elements(a.data->'parts') as b(data)
  where a.data ->> 'x_id' = '2'
    and b.data ->> 'part_id' = '23222'
)
db<>fiddle demo
or you can use a self-join:
update test as t2 set
  data = t.data #- p.path
from test as t
cross join lateral (
  select array[(a.i-1)::text, 'parts', (b.i-1)::text]
  from jsonb_array_elements(t.data) with ordinality as a(data, i),
       jsonb_array_elements(a.data->'parts') with ordinality as b(data, i)
  where a.data ->> 'x_id' = '2'
    and b.data ->> 'part_id' = '23232'
) as p(path)
where t.ctid = t2.ctid
db<>fiddle demo
This should work; you just need another unique column (usually the primary key).
Create the test table:
create table test_tab (
  id serial primary key,
  j jsonb
);

insert into test_tab (j) values
('[
  {
    "x_id": "1",
    "type": "t1",
    "parts": [
      { "part_id": "1", "price": 400 },
      { "part_id": "2", "price": 500 },
      { "part_id": "3", "price": 0 }
    ]
  },
  {
    "x_id": "2",
    "type": "t1",
    "parts": [
      { "part_id": "1", "price": 1000 },
      { "part_id": "3", "price": 60 }
    ]
  },
  {
    "x_id": "3",
    "type": "t2",
    "parts": [
      { "part_id": "1", "price": 100 },
      { "part_id": "3", "price": 780 },
      { "part_id": "2", "price": 990 }
    ]
  }
]');
Then split the JSON, filter out the unwanted data, and recreate the JSON again:
select id,
       jsonb_agg(jsonb_build_object(
         'x_id', xid,
         'type', type,
         'parts', case when inner_arr = '[null]'::jsonb
                       then parts_arr::jsonb
                       else inner_arr end))
from (
  select id,
         value->>'x_id' as xid,
         jsonb_agg(inner_arr) as inner_arr,
         max(value->>'parts') as parts_arr,
         max(value->>'type') as type
  from (
    select *,
           case when value->>'x_id' = '2'
                then jsonb_array_elements(value->'parts')
                else NULL end as inner_arr
    from test_tab
    join lateral jsonb_array_elements(j) on true
  ) t
  where inner_arr->>'part_id' is distinct from '1'
  group by id, value->>'x_id'
) t
group by id
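As an alternative sketch, not from either answer above and assuming the same test_tab(id, j) table: since jsonb_set is available from 9.5 on, you can also rebuild the outer array in place, replacing only the matching object's parts:
update test_tab as t set
  j = (
    select jsonb_agg(
             case when e ->> 'x_id' = '2'
                  then jsonb_set(e, '{parts}',
                         coalesce((select jsonb_agg(p)
                                   from jsonb_array_elements(e -> 'parts') as p
                                   where p ->> 'part_id' <> '1'),
                                  '[]'::jsonb))
                  else e
             end
             order by ord)
    from jsonb_array_elements(t.j) with ordinality as x(e, ord)
  );
-- note: as written this rewrites every row; add a WHERE clause in practice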

Removing null/empty values from an array

I'm struggling to understand arrays and structs in BigQuery. When I run this query in Standard SQL:
with t1 as (
  select 1 as id, [1,2] as orders
  union all
  select 2 as id, null as orders
)
select id, orders
from t1
order by 1
I get this result in json:
[
  {
    "id": "1",
    "orders": ["1", "2"]
  },
  {
    "id": "2",
    "orders": []
  }
]
I want to remove the orders value for id = 2 so that I instead get:
[
  {
    "id": "1",
    "orders": ["1", "2"]
  },
  {
    "id": "2"
  }
]
How can I do this? Do I need to add another CTE to remove the null values, and if so, how?
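One possible direction, offered only as a sketch since a NULL array surfaces as [] in BigQuery query results: if the JSON rendering itself is the goal, build the JSON strings explicitly with TO_JSON_STRING and pick the struct shape per row:
with t1 as (
  select 1 as id, [1,2] as orders
  union all
  select 2 as id, null as orders
)
select if(array_length(orders) > 0,
          to_json_string(struct(id, orders)),
          to_json_string(struct(id))) as row_json
from t1
order by id
-- yields {"id":1,"orders":[1,2]} and {"id":2}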

RavenDb facet search in string array field with wildcard

Is it possible to have a RavenDB faceted search, on a string[] field, where I would want to show facets (counts) only for values starting with a particular string, rather than for a range?
I'll try to explain myself better with a simple example: imagine having an index with the below entries.
ID | Values
-------------------------
1 | CatPersian, CatNormal, DogLabrador
2 | CatPersian, Camel, DogPoodle
3 | CatNormal, CatBengali, DogNormal
4 | DogNormal
I would perform a query on the above documents, and the facet search would include a range of 'Cat*' on the 'Values' field. Is this possible? I would then get a result based on just the different values for cats, like:
CatPersian [2]
CatNormal [2]
CatBengali [1]
Yes, you can do that. Index the array, and then just use facets normally.
Let's see the full example. You have the following documents:
{
  "Name": "John",
  "FavoriteAnimals": ["Cats", "Dogs", "Snails"],
  "@metadata": {
    "@collection": "Kids"
  }
}
{
  "Name": "Jane",
  "FavoriteAnimals": ["Cats", "Rabbits"],
  "@metadata": {
    "@collection": "Kids"
  }
}
Now, you create the following index:
from k in docs.Kids
from animal in k.FavoriteAnimals
select new { Animal = animal }
And run this query:
from index 'YourIndex'
where startsWith(Animal, 'ca')
select facet('Animal')
And the result will be:
{
  "Name": "Animal",
  "Values": [
    {
      "Count": 2,
      "Range": "cats"
    }
  ]
}
Alternatively, you can use this index:
from k in docs.Kids
select new { k.FavoriteAnimals }
And run this query:
from index 'YourIndex'
where startsWith(FavoriteAnimals, 'ca')
select facet('FavoriteAnimals')
The difference here is that you'll get facet counts for all values in the documents that have a match.
So in this case:
{
  "Name": "Animal",
  "Values": [
    {
      "Count": 2,
      "Range": "cats"
    },
    {
      "Count": 1,
      "Range": "dogs" // also snails, rabbits
    }
  ]
}