How to sort a PostgreSQL JSON field [duplicate] - sql

This question already has answers here:
How to reorder array in JSONB type column
(1 answer)
JSONB column: sort only content of arrays stored in column with mixed JSONB content
(1 answer)
Closed 4 years ago.
I have a table with a JSON field. I want to group and count that field to understand how frequently each value occurs. But the array elements in this field are stored in arbitrary order, so grouping alone can give the wrong result. That's why I decided to sort the field before grouping, but I couldn't get it to work. How can I do it?
JSON_FIELD
["one", "two"]
["two"]
["two", "one"]
["two"]
["three"]
I've tried this code:
SELECT JSON_FIELD, COUNT(1) FROM my_table GROUP BY JSON_FIELD;
But the result w/o sorting is wrong:
JSON_FIELD COUNT
["one", "two"] 1
["two"] 2
["two", "one"] 1
["three"] 1
But if I could sort it somehow, the expected (and correct) result would be:
JSON_FIELD COUNT
["one", "two"] 2
["two"] 2
["three"] 1
My question is very similar to How to convert json array into postgres int array in postgres 9.3

A bit messy but works:
SELECT ja::TEXT::JSON, COUNT(*)
FROM (
    SELECT JSON_AGG(je ORDER BY je::TEXT) AS ja
    FROM (
        SELECT JSON_ARRAY_ELEMENTS(j) AS je, ROW_NUMBER() OVER () AS r
        FROM (
            VALUES
                ('["one", "two"]'::JSON),
                ('["two"]'),
                ('["two", "one"]'),
                ('["two"]'),
                ('["three"]')
        ) v(j)
    ) el
    GROUP BY r
) x
GROUP BY ja::TEXT
Result:
ja	count
["one", "two"]	2
["two"]	2
["three"]	1
BTW, the casting of the JSON values to TEXT is because (at least in PG 9.3) there is no JSON equality operator, so I cast to TEXT to be able to GROUP BY or ORDER BY, then cast back to JSON.
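For what it's worth, jsonb (available since 9.4) has built-in equality and ordering, so the TEXT round-trip is unnecessary there. A minimal sketch, assuming a table my_table with a jsonb column json_field (names taken from the question):

```sql
-- Sort each array's elements, then group by the normalized array.
-- jsonb supports = and ORDER BY directly, so no ::TEXT casts are needed.
SELECT sorted, COUNT(*)
FROM (
    SELECT jsonb_agg(elem ORDER BY elem) AS sorted
    FROM my_table t
    CROSS JOIN LATERAL jsonb_array_elements(t.json_field) AS a(elem)
    GROUP BY t.ctid   -- one group per source row (use the primary key if there is one)
) s
GROUP BY sorted;
```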


Unnesting repeated records to a single row in Big Query

I have a dataset that includes repeated (nested) records. When I unnest them, I get 2 rows, one per nested record.
Before unnesting, the raw data looks like this:
After unnesting with this query:
SELECT
    eventTime,
    participant.id
FROM
    `public.table`,
    UNNEST(people) AS participant
WHERE
    verb = 'event'
These are actually 2 rows that get expanded to 4. I've been trying to unnest into a single row so that I have 3 columns:
eventTime, buyer.Id, seller.Id.
I've been trying to use REPLACE to build a struct from the unnested content, but I cannot figure out how to do it. Any pointers, documentation, or steps that could help me out?
Consider the approach below:
SELECT * EXCEPT(key) FROM (
    SELECT
        eventTime,
        participant.id,
        personEventRole,
        TO_JSON_STRING(t) AS key
    FROM `public.table` t,
        UNNEST(people) AS participant
    WHERE verb = 'event'
)
PIVOT (MIN(id) FOR personEventRole IN ('buyer', 'seller'))
If applied to the sample data in your question, the output is:
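To make the mechanics concrete, here is a self-contained sketch with hypothetical sample data (the STRUCT fields and values are assumptions based on the question). TO_JSON_STRING(t) acts as a per-source-row key, so the two unnested rows regroup into one row before PIVOT spreads personEventRole into buyer/seller columns:

```sql
WITH sample AS (
    SELECT TIMESTAMP '2023-01-01 10:00:00' AS eventTime,
           'event' AS verb,
           [STRUCT('u1' AS id, 'buyer'  AS personEventRole),
            STRUCT('u2' AS id, 'seller' AS personEventRole)] AS people
)
SELECT * EXCEPT(key) FROM (
    SELECT eventTime, participant.id, participant.personEventRole,
           TO_JSON_STRING(t) AS key
    FROM sample t, UNNEST(people) AS participant
    WHERE verb = 'event'
)
PIVOT (MIN(id) FOR personEventRole IN ('buyer', 'seller'))
-- one row per original event: eventTime plus buyer and seller columns
```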

Is there any way to search in json postgres column using matching clause?

I'm trying to search for a record in a Postgres JSON column. The stored data has a structure like this:
{
    "contract_shipment_date": "2015-06-25T19:00:00.000Z",
    "contract_product_grid": [
        {
            "product_name": "Axele",
            "quantity": 22.58
        },
        {
            "product_name": "Bell",
            "quantity": 52.58
        }
    ],
    "lc_status": "Awaited"
}
My table name is Heap and the column name is contract_product_grid. The contract_product_grid column can contain multiple product records.
I found this documentation but was not able to get the desired output.
The use case: there is a filter in which users can select a product_name, and based on the entered name a matching clause should fetch and return the records.
Suppose you entered Axele as the product_name input and want to return the matching value for the quantity key. Then use:
SELECT js2 -> 'quantity' AS quantity
FROM (
    SELECT JSON_ARRAY_ELEMENTS(value::json) AS js2
    FROM heap,
         JSON_EACH_TEXT(contract_product_grid) AS js
    WHERE key = 'contract_product_grid'
) q
WHERE js2 ->> 'product_name' = 'Axele'
This expands the outermost JSON into key/value pairs through JSON_EACH_TEXT(json), then splits the elements of the nested array with the JSON_ARRAY_ELEMENTS(value::json) function.
The main query then filters by the specific product_name.
P.S. Don't forget to wrap the JSON column's value in curly braces.
SELECT *
FROM (
    SELECT JSON_ARRAY_ELEMENTS(contract_product_grid::json) AS js2
    FROM heaps
) q
WHERE js2 ->> 'product_name' IN ('Axele', 'Bell')
As I mentioned in the question, my column name is contract_product_grid and I only have to search within it.
Using this query, I'm able to get the contract_product_grid information via an IN clause with the entered product names.
You need to unnest the array in order to be able to use a LIKE condition on each value:
select h.*
from heap h
where exists (select *
              from jsonb_array_elements(h.contract_product_grid -> 'contract_product_grid') as p(prod)
              where p.prod ->> 'product_name' like 'Axe%')
If you don't really need a wildcard search (so = instead of LIKE) you can use the containment operator @> which is a lot more efficient:
select h.*
from heap h
where h.contract_product_grid -> 'contract_product_grid' @> '[{"product_name": "Axele"}]';
This can also be used to search for multiple products:
select h.*
from heap h
where h.contract_product_grid -> 'contract_product_grid' @> '[{"product_name": "Axele"}, {"product_name": "Bell"}]';
If you are using Postgres 12 you can simplify that a bit using a JSON path expression:
select *
from heap
where jsonb_path_exists(contract_product_grid, '$.contract_product_grid[*].product_name ? (@ starts with "Axe")')
Or using a regular expression:
select *
from heap
where jsonb_path_exists(contract_product_grid, '$.contract_product_grid[*].product_name ? (@ like_regex "axe.*" flag "i")')
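If you want the matching quantity rather than the whole row, jsonb_path_query (also PostgreSQL 12+) can apply an accessor after the filter. A sketch against the table from the question:

```sql
-- Returns the quantity of every array element whose product_name matches.
SELECT jsonb_path_query(
           contract_product_grid,
           '$.contract_product_grid[*] ? (@.product_name == "Axele").quantity'
       ) AS quantity
FROM heap;
```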

Postgresql, jsonb with multiple keys is returning a single row for each key

Here's my situation: I have rows with a JSON column, and what I've been trying to do is get all the values for all the keys of that JSON in just one row.
Let's say I have a row with the JSON value:
{"key1": "a", "key2": "b"}
Now, is it possible to extract the values as such: ["a", "b"]?
I attempted this so far:
select ---- some sum() fields ----,
b.match_data::json -> jsonb_object_keys(b.match_data) as "Course"
from --- tables ---
join -- tables ---
where -- condition ---
group by -- sum() fields ----, b.match_data
The problem with this is that for json with multiple keys, it is returning multiple rows.
demo: db<>fiddle
WITH jsondata AS (
    SELECT '{"key1": "a", "key2": "b"}'::jsonb AS data -- A
)
SELECT jsonb_agg(value) -- C
FROM jsondata, jsonb_each(data) -- B
Postgres JSON functions, Postgres (JSON) aggregate functions
A: CTE to work with your data
B: jsonb_each expands your data; result:
key value
key1 "a"
key2 "b"
C: jsonb_agg aggregates the value column into a json array with the expected result: ["a", "b"].
If you do not want the result as a JSON array but as a normal text array, change jsonb_each to jsonb_each_text and jsonb_agg to array_agg (see fiddle).
I used jsonb as type. Of course all functions exist for type json as well.
(Postgres JSON types)
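The text-array variant mentioned above, as a sketch on the same sample data:

```sql
WITH jsondata AS (
    SELECT '{"key1": "a", "key2": "b"}'::jsonb AS data
)
SELECT array_agg(value) AS vals   -- text[] result: {a,b}
FROM jsondata, jsonb_each_text(data);
```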
S-Man's answer pointed me toward aggregators, and after some trial and error I got my answer:
(select string_agg((select value from jsonb_array_elements_text(value)), ',')
from jsonb_each(b.match_data)) "Course"
It collects and displays values as a, b,... in one single row.

How can I aggregate Jsonb columns in postgres using another column type

I have the following data in a postgres table, where data is a jsonb column. I would like to get the result as:
[
    {"field_type": "Design", "briefings_count": 1, "meetings_count": 13},
    {"field_type": "Engineering", "briefings_count": 1, "meetings_count": 13},
    {"field_type": "Data Science", "briefings_count": 0, "meetings_count": 3}
]
Explanation
Use the jsonb_each_text function to extract data from the jsonb column named data. Then aggregate rows using GROUP BY to get one row for each distinct field_type. Each aggregate also needs to include the meetings and briefings counts, which is done by selecting the maximum value with a CASE expression so that you get two separate columns for the different counts. On top of that, apply the coalesce function to return 0 instead of NULL if some information is missing; in your example that would be briefings for Data Science.
At the outer level of the statement, now that we have the results as a plain table of fields, we need to build a jsonb object per row and aggregate them all into one row. For that we use jsonb_build_object, passing it pairs that consist of a field name and a value. That leaves us with 3 rows of data, each row holding a separate jsonb value. Since we want only one row (an aggregated JSON array) in the output, we apply jsonb_agg on top of that. This produces the result you're looking for.
Code
select
    jsonb_agg(
        jsonb_build_object(
            'field_type', field_type,
            'briefings_count', briefings_count,
            'meetings_count', meetings_count
        )
    ) as agg_data
from (
    select
        j.k as field_type,
        coalesce(max(case when t.count_type = 'briefings_count' then j.v::int end), 0) as briefings_count,
        coalesce(max(case when t.count_type = 'meetings_count' then j.v::int end), 0) as meetings_count
    from tbl t,
         jsonb_each_text(data) j(k, v)
    group by j.k
) t
You can aggregate columns like this and then insert the data into another table:
select array_agg(data)
from the_table
Or use one of the built-in JSON functions to create a new JSON array, for example jsonb_agg(expression).
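A minimal sketch of the jsonb_agg route (table and column names taken from the snippet above):

```sql
-- Collapses every row's jsonb value into a single JSON array.
SELECT jsonb_agg(data) AS all_data
FROM the_table;
```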

Select with filters on nested JSON array

Postgres 10: I have a table and a query below:
CREATE TABLE individuals (
    uid  character varying(10) PRIMARY KEY,
    data jsonb
);
SELECT data->'files' FROM individuals WHERE uid = 'PDR7073706'
It returns this structure:
[
{"date":"2017-12-19T22-35-49","type":"indiv","name":"PDR7073706_indiv_2017-12-19T22-35-49.jpeg"},
{"date":"2017-12-19T22-35-49","type":"address","name":"PDR7073706_address_2017-12-19T22-35-49.pdf"}
]
I'm struggling with adding two filters by date and time. Like (illegal pseudo-code!):
WHERE 'type' = "indiv"
or like:
WHERE 'type' = "indiv" AND max('date')
It is probably easy, but I can't crack this nut, and need your help!
Assuming data type jsonb for lack of info.
Use the containment operator @> for the first clause (WHERE 'type' = "indiv"):
SELECT data->'files'
FROM individuals
WHERE uid = 'PDR7073706'
AND data -> 'files' @> '[{"type":"indiv"}]';
Can be supported with various kinds of indexes. See:
Query for array elements inside JSON type
Index for finding an element in a JSON array
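As a concrete example of such index support, a GIN index on the nested array lets the planner use the containment filter (the expression-index form is an assumption; a plain GIN index on the whole data column works too):

```sql
CREATE INDEX individuals_files_gin_idx
    ON individuals USING gin ((data -> 'files'));

-- Queries of this form can then use the index:
-- WHERE data -> 'files' @> '[{"type":"indiv"}]'
```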
The second clause (AND max('date')) is more tricky. Assuming you mean:
Get rows where the JSON array element with "type":"indiv" also has the latest "date".
SELECT i.*
FROM individuals i
JOIN LATERAL (
    SELECT *
    FROM jsonb_array_elements(data -> 'files')
    ORDER BY to_timestamp(value ->> 'date', 'YYYY-MM-DD"T"HH24-MI-SS') DESC NULLS LAST
    LIMIT 1
) sub ON sub.value -> 'type' = '"indiv"'::jsonb
WHERE uid = 'PDR7073706'
AND data -> 'files' @> '[{"type":"indiv"}]' -- optional; may help performance
to_timestamp(value ->> 'date', 'YYYY-MM-DD"T"HH24-MI-SS') is my educated guess on your undeclared timestamp format. Details in the manual here.
The last filter is redundant but optional; it may help performance (a lot) if it is selective (only few rows qualify) and you have a matching index as advised:
AND data -> 'files' @> '[{"type":"indiv"}]'
Related:
Optimize GROUP BY query to retrieve latest record per user
Select first row in each GROUP BY group?
Update nth element of array using a WHERE clause