Select with filters on nested JSON array - sql

Postgres 10: I have a table and a query below:
CREATE TABLE individuals (
   uid  character varying(10) PRIMARY KEY,
   data jsonb
);

SELECT data->'files' FROM individuals WHERE uid = 'PDR7073706';
It returns this structure:
[
{"date":"2017-12-19T22-35-49","type":"indiv","name":"PDR7073706_indiv_2017-12-19T22-35-49.jpeg"},
{"date":"2017-12-19T22-35-49","type":"address","name":"PDR7073706_address_2017-12-19T22-35-49.pdf"}
]
I'm struggling with adding two filters, by type and by latest date. Like (illegal pseudo-code!):
WHERE 'type' = "indiv"
or like:
WHERE 'type' = "indiv" AND max('date')
It is probably easy, but I can't crack this nut, and need your help!

The column is declared jsonb, so jsonb operators and functions apply.
Use the containment operator @> for the first clause (WHERE 'type' = "indiv"):
SELECT data->'files'
FROM   individuals
WHERE  uid = 'PDR7073706'
AND    data -> 'files' @> '[{"type":"indiv"}]';
Can be supported with various kinds of indexes. See:
Query for array elements inside JSON type
Index for finding an element in a JSON array
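For illustration, a minimal sketch of a matching expression index, assuming the typical choice of a GIN index with the jsonb_path_ops operator class (the index name is my own):

CREATE INDEX individuals_files_gin_idx ON individuals
USING gin ((data -> 'files') jsonb_path_ops);

jsonb_path_ops indexes are smaller and faster than the default jsonb_ops, but they only support the containment operator @> (plus the jsonpath operators in Postgres 12+).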
The second clause (AND max('date')) is more tricky. Assuming you mean:
Get rows where the JSON array element with "type":"indiv" also has the latest "date".
SELECT i.*
FROM   individuals i
JOIN   LATERAL (
   SELECT *
   FROM   jsonb_array_elements(data->'files')
   ORDER  BY to_timestamp(value ->> 'date', 'YYYY-MM-DD"T"HH24-MI-SS') DESC NULLS LAST
   LIMIT  1
   ) sub ON sub.value -> 'type' = '"indiv"'::jsonb
WHERE  uid = 'PDR7073706'
AND    data -> 'files' @> '[{"type":"indiv"}]';  -- optional; may help performance
to_timestamp(value ->> 'date', 'YYYY-MM-DD"T"HH24-MI-SS') is my educated guess on your undeclared timestamp format. Details in the manual here.
The last filter is redundant and optional, but it may help performance (a lot) if it is selective (only few rows qualify) and you have a matching index as advised:
AND data -> 'files' @> '[{"type":"indiv"}]'
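To see why the containment filter matches, here is a quick standalone sanity check:

SELECT '[{"type":"indiv","name":"x"}]'::jsonb @> '[{"type":"indiv"}]'::jsonb;  -- true

For jsonb arrays, @> is true if every element of the right-hand array is contained in some element of the left-hand array, so extra keys like "name" and "date" do not prevent a match.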
Related:
Optimize GROUP BY query to retrieve latest record per user
Select first row in each GROUP BY group?
Update nth element of array using a WHERE clause

Related

How to perform ILIKE query against JSONB type column?

I have a column "category_products" with data type JSONB. The column holds an array of objects, and each of those objects in turn contains an array of objects.
Here I need to perform an ILIKE query against product_name.
example
category_products
-----------------
[{"products":[{product_name: product_one, price: 123}, {product_name: product_two, price: 999}]]
You may first flatten your data using a lateral join with jsonb_path_query and then apply an ILIKE in a WHERE clause as you need. Here is an illustration.
See the demo.
select id, l, l ->> 'product_name' as prod
from the_table,
lateral jsonb_path_query(category_products, '$[*].products[*]') as l;
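For instance, a sketch of applying the ILIKE filter on top of the flattened rows (the pattern '%one%' is just an example):

select id, l ->> 'product_name' as prod
from the_table,
lateral jsonb_path_query(category_products, '$[*].products[*]') as l
where l ->> 'product_name' ilike '%one%';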
Please note that your sample data are not valid JSON at all.
Unrelated but this would be so much easier and cleaner with a normalized data design.
Edit
Since jsonb_path_query does not exist in pre-PG12 versions, here is an alternative, and a new demo.
select id, l, l ->> 'product_name' as prod
from the_table,
lateral jsonb_array_elements(category_products) as arr_ex,
lateral jsonb_array_elements(arr_ex -> 'products') as l;

function to sum all first value of Results SQL

I have a table with "Number", "Name" and "Result" columns. Result is a 2D text array, and I need to create a column named "Average" that sums all first values of the Result array and divides the sum by 2. Can somebody please help me? I must use CREATE FUNCTION for this. It looks like this:
Table1

Number | Name  | Result                       | Average
-------+-------+------------------------------+--------
01     | Kevin | {{2.0,10},{3.0,50}}          | 2.5
02     | Max   | {{1.0,10},{4.0,30},{5.0,20}} | 5.0

Average = (2.0 + 3.0) / 2 = 2.5
        = (1.0 + 4.0 + 5.0) / 2 = 5.0
First of all: you should always avoid storing arrays in a table; generate them in a subquery instead, if they are needed at all. Normalize the data, it makes life much easier in nearly every single use case.
Second: you should avoid multi-dimensional arrays. They are very hard to handle. See Unnest array by one level
However, in your special case you could do something like this:
demo:db<>fiddle
SELECT
    number,
    name,
    SUM(value::numeric) FILTER (WHERE idx % 2 = 1) / 2 AS average -- 2
FROM mytable,
    unnest(result) WITH ORDINALITY as elements(value, idx)        -- 1
GROUP BY number, name
unnest() expands the array elements into one element per record. But this is not a one-level expansion: it expands ALL elements in depth. To keep track of your elements, you can add an index using WITH ORDINALITY.
Because you have nested two-element arrays, the unnested data can be used as follows: you want to sum the first of each pair of elements, which is every element at an odd position. Using the FILTER clause in the aggregation lets you aggregate exactly these elements; see the illustration below.
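A quick standalone illustration of the depth-first expansion, using one of the sample arrays (the alias names are my own):

SELECT value, idx
FROM unnest('{{2.0,10},{3.0,50}}'::numeric[]) WITH ORDINALITY AS elements(value, idx);

-- value | idx
-- ------+-----
--   2.0 |   1
--    10 |   2
--   3.0 |   3
--    50 |   4

The odd positions (idx % 2 = 1) are exactly the first elements of the inner arrays.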
However: if the array is itself the result of a subquery, you should think about doing this operation BEFORE the array aggregation (if that aggregation is really necessary at all). That makes things easier.
Assumptions:
number column is Primary key.
result column is text or varchar type
Here are the steps for your requirements:
Add the column to your table using the following query (you can skip this step if the column already exists):
alter table table1 add column average decimal;
Update the calculated value using the query below:
update table1 t1
set average = t2.value_
from (
    select
        number,
        sum(t::decimal) / 2 as value_
    from table1
    cross join lateral unnest((result::text[][])[1:999][1]) as t
    group by 1
) t2
where t1.number = t2.number;
Explanation: here unnest((result::text[][])[1:999][1]) returns the first value of each child array (assuming you can have up to 999 child arrays in your 2D array; you can increase or decrease that bound as per your requirement).
DEMO
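A standalone illustration of that slice, using the first sample row:

SELECT unnest(('{{2.0,10},{3.0,50}}'::text[][])[1:999][1]);

-- unnest
-- ------
-- 2.0
-- 3.0

The slice [1:999][1] keeps rows 1..999 but only the first column, yielding {{2.0},{3.0}}, which unnest then flattens.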
Now you can create your function as per your requirements based on the above query.

Is there any way to search in json postgres column using matching clause?

I'm trying to search for a record in a Postgres JSON column. The stored data has a structure like this:
{
"contract_shipment_date": "2015-06-25T19:00:00.000Z",
"contract_product_grid": [
{
"product_name": "Axele",
"quantity": 22.58
},
{
"product_name": "Bell",
"quantity": 52.58
}
],
"lc_status": "Awaited"
}
My table name is heap and the column name is contract_product_grid. Also, the contract_product_grid array can contain multiple product records.
I found this documentation but was not able to get the desired output.
The required case is: I have a filter in which users can select a product_name, and on the basis of the entered name, the matching records should be fetched and returned using a matching clause.
Suppose you entered Axele as the product_name input and want to return the matching value for the quantity key.
Then use:
SELECT js2 -> 'quantity' AS quantity
FROM (
    SELECT JSON_ARRAY_ELEMENTS(value::json) AS js2
    FROM heap,
         JSON_EACH_TEXT(contract_product_grid) AS js
    WHERE key = 'contract_product_grid'
) q
WHERE js2 ->> 'product_name' = 'Axele'
The inner query expands the outermost JSON into key & value pairs through JSON_EACH_TEXT(json) and splits the array within the matching value into individual elements through the JSON_ARRAY_ELEMENTS(value::json) function.
Then the main query filters by the specific product_name.
Demo
P.S. Don't forget to wrap the JSON column's value in curly braces.
SELECT *
FROM (
    SELECT JSON_ARRAY_ELEMENTS(contract_product_grid::json) AS js2
    FROM heaps
) q
WHERE js2 ->> 'product_name' IN ('Axele', 'Bell')
As I mentioned in the question, my column name is contract_product_grid and I only have to search within it.
Using this query, I'm able to get the contract_product_grid information using an IN clause with the entered product names.
You need to unnest the array in order to be able to use a LIKE condition on each value:
select h.*
from heap h
where exists (select *
from jsonb_array_elements(h.contract_product_grid -> 'contract_product_grid') as p(prod)
where p.prod ->> 'product_name' like 'Axe%')
If you don't really need a wildcard search (so = instead of LIKE), you can use the containment operator @>, which is a lot more efficient:
select h.*
from heap h
where h.contract_product_grid -> 'contract_product_grid' @> '[{"product_name": "Axele"}]';
This can also be used to search for multiple products (a row matches only if it contains all of them):
select h.*
from heap h
where h.contract_product_grid -> 'contract_product_grid' @> '[{"product_name": "Axele"}, {"product_name": "Bell"}]';
If you are using Postgres 12 you can simplify that a bit using a JSON path expression:
select *
from heap
where jsonb_path_exists(contract_product_grid, '$.contract_product_grid[*].product_name ? (@ starts with "Axe")')
Or using a regular expression:
select *
from heap
where jsonb_path_exists(contract_product_grid, '$.contract_product_grid[*].product_name ? (@ like_regex "axe.*" flag "i")')
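As a side note, a hedged sketch: the same condition can be written with the @? operator, which, unlike a call to jsonb_path_exists(), can be supported by a GIN index on the column:

select *
from heap
where contract_product_grid @? '$.contract_product_grid[*] ? (@.product_name starts with "Axe")'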

How can I aggregate Jsonb columns in postgres using another column type

I have the following data in a Postgres table, where data is a jsonb column. I would like to get the result as:
[
{field_type: "Design", briefings_count: 1, meetings_count: 13},
{field_type: "Engineering", briefings_count: 1, meetings_count: 13},
{field_type: "Data Science", briefings_count: 0, meetings_count: 3}
]
Explanation
Use the jsonb_each_text function to extract the data from the jsonb column named data. Then aggregate the rows using GROUP BY to get one row per distinct field_type. Each aggregate also needs to include the meetings and briefings counts, which is done by selecting the maximum value with a CASE expression so that the two counts land in two separate columns. On top of that, apply the coalesce function to return 0 instead of NULL when some information is missing (in your example that would be briefings for Data Science).
At the outer level of the statement, now that we have the results as a table of fields, we build a jsonb object per row and aggregate them all into one row. For that we use jsonb_build_object, passing pairs that consist of a field name and a value. That leaves us with 3 rows of data, each holding a separate jsonb object. Since we want only one row (an aggregated json array) in the output, we apply jsonb_agg on top of that. This produces the result you're looking for.
Code
Check LIVE DEMO to see how it works.
select
jsonb_agg(
jsonb_build_object('field_type', field_type,
'briefings_count', briefings_count,
'meetings_count', meetings_count
)
) as agg_data
from (
select
j.k as field_type
, coalesce(max(case when t.count_type = 'briefings_count' then j.v::int end),0) as briefings_count
, coalesce(max(case when t.count_type = 'meetings_count' then j.v::int end),0) as meetings_count
from tbl t,
jsonb_each_text(data) j(k,v)
group by j.k
) t
You can aggregate columns like this and then insert the data into another table:
select array_agg(data)
from the_table
Or use one of the built-in JSON functions to create a new JSON array instead, for example jsonb_agg(expression).
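For instance, a minimal sketch of the insert step; the target table summary and its column are hypothetical:

-- hypothetical target table holding one aggregated jsonb array
CREATE TABLE summary (agg_data jsonb);

INSERT INTO summary (agg_data)
SELECT jsonb_agg(data)
FROM the_table;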

How to sort a PostgreSQL JSON field [duplicate]

This question already has answers here:
How to reorder array in JSONB type column
(1 answer)
JSONB column: sort only content of arrays stored in column with mixed JSONB content
(1 answer)
Closed 4 years ago.
I have a table with a json field. I want to group & count that field to understand its usage frequency. But the data in this field is stored in random order, so I can get wrong results. That's why I've decided to sort that field before grouping, but I failed. How can I do it?
JSON_FIELD
["one", "two"]
["two"]
["two", "one"]
["two"]
["three"]
I've tried this code:
SELECT JSON_FIELD, COUNT(1) FROM my_table GROUP BY JSON_FIELD;
But the result w/o sorting is wrong:
JSON_FIELD COUNT
["one", "two"] 1
["two"] 2
["two", "one"] 1
["three"] 1
But if I could sort it somehow, the expected (and correct) result would be:
JSON_FIELD COUNT
["one", "two"] 2
["two"] 2
["three"] 1
My question is very similar to How to convert json array into postgres int array in postgres 9.3
A bit messy but works:
SELECT ja::TEXT::JSON, COUNT(*)
FROM (
    SELECT JSON_AGG(je ORDER BY je::TEXT) AS ja
    FROM (
        SELECT JSON_ARRAY_ELEMENTS(j) je, ROW_NUMBER() OVER () AS r
        FROM (
            VALUES
                ('["one", "two"]'::JSON),
                ('["two"]'),
                ('["two", "one"]'),
                ('["two"]'),
                ('["three"]')
        ) v(j)
    ) el
    GROUP BY r
) x
GROUP BY ja::TEXT
Result: the expected counts from above, i.e. ["one", "two"] with 2, ["two"] with 2, and ["three"] with 1.
BTW the casting of JSON values to TEXT is because (at least in PG 9.3) there are no JSON equality operators, so I cast to TEXT in order to be able to GROUP BY or ORDER BY, then cast back to JSON.
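On a newer version, a hedged alternative, assuming the column can be cast to jsonb (which, unlike json, does have an equality operator, so no round-trip through text is needed):

SELECT sorted, COUNT(*)
FROM my_table,
LATERAL (
    -- sort each array's elements, then re-aggregate into a jsonb array
    SELECT jsonb_agg(e ORDER BY e) AS sorted
    FROM jsonb_array_elements_text(json_field::jsonb) AS t(e)
) s
GROUP BY sorted;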