Extract Earliest Date from Postgres jsonb array

Suppose I have a jsonb array in a postgres column like
[{"startDate": "2019-09-01"}, {"startDate": "2019-07-22"}, {"startDate": "2019-08-08"}]
Is there a way to extract the earliest startDate from the jsonb array? I have tried using jsonb_array_elements but don't see how to loop through all the elements.

You can use a scalar sub-query:
select (select (e.element ->> 'startDate')::date as start_date
from jsonb_array_elements(t.the_column) as e(element)
order by start_date -- ascending, so the earliest date comes first
limit 1) as start_date
from the_table t
You need to replace the_table and the_column with the actual table and column name you are using.
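As a rough illustration of the scalar sub-query approach, here is a self-contained sketch that builds a hypothetical two-row table inline with VALUES (the id column and the sample rows are assumptions, not part of the question). It also shows that a row with an empty array is still returned, with NULL as its date:

```sql
-- Hypothetical sample data; the_table, id and the_column are placeholder names.
WITH the_table(id, the_column) AS (
  VALUES (1, '[{"startDate": "2019-09-01"}, {"startDate": "2019-07-22"}]'::jsonb),
         (2, '[]'::jsonb)
)
SELECT t.id,
       (SELECT (e.element ->> 'startDate')::date AS start_date
        FROM jsonb_array_elements(t.the_column) AS e(element)
        ORDER BY start_date   -- ascending: earliest date first
        LIMIT 1) AS start_date
FROM the_table t
ORDER BY t.id;
-- id 1 -> 2019-07-22
-- id 2 -> NULL (empty array, but the row is still returned)
```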

You can apply the MIN() aggregate directly after casting the extracted value to date:
SELECT MIN((elm ->> 'startDate')::date)
FROM t
CROSS JOIN jsonb_array_elements(jsdata) AS j(elm)
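The query above returns one minimum for the whole table. If one earliest date per row is wanted, a possible variant groups by a row key. The sketch below assumes a key column named id and uses made-up sample rows:

```sql
-- Hypothetical sample data; id and jsdata are assumed column names.
WITH t(id, jsdata) AS (
  VALUES
    (1, '[{"startDate": "2019-09-01"}, {"startDate": "2019-07-22"}, {"startDate": "2019-08-08"}]'::jsonb),
    (2, '[{"startDate": "2020-03-15"}, {"startDate": "2020-01-10"}]'::jsonb)
)
SELECT t.id,
       MIN((elm ->> 'startDate')::date) AS earliest_start
FROM t
CROSS JOIN jsonb_array_elements(t.jsdata) AS j(elm)  -- one output row per array element
GROUP BY t.id
ORDER BY t.id;
-- id 1 -> 2019-07-22
-- id 2 -> 2020-01-10
```

Unlike the scalar sub-query, the cross join drops rows whose array is empty, since they produce no elements to aggregate.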

Related

How to SUM numbers from a plain jsonb array?

I'm facing issues with a jsonb ARRAY column in PostgreSQL.
I need to sum this column for each row.
Expected Result:
index | sum(snx_wavelengths)
1     | 223123
2     | 223123
You can solve this ...
... with a subquery, then aggregate:
SELECT index, sum(nr) AS wavelength_sum
FROM (
SELECT index, jsonb_array_elements(snx_wavelengths)::numeric AS nr
FROM tbl
) sub
GROUP BY 1
ORDER BY 1; -- optional?
... with an aggregate in a correlated subquery:
SELECT index
, (SELECT sum(nr::numeric) FROM jsonb_array_elements(snx_wavelengths) nr) AS wavelength_sum
FROM tbl
ORDER BY 1; -- optional?
... or with an aggregate in a LATERAL subquery:
SELECT t.index, js.wavelength_sum
FROM tbl t
LEFT JOIN LATERAL (
SELECT sum(nr::numeric) AS wavelength_sum
FROM jsonb_array_elements(t.snx_wavelengths) nr
) js ON true
ORDER BY 1; -- optional?
See:
What is the difference between a LATERAL JOIN and a subquery in PostgreSQL?
Your screenshot shows fractional digits. Cast to the type numeric to get exact results. A floating point type like real or float can introduce rounding errors.
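A minimal sketch of the rounding point, using an inline array instead of the asker's table (the values 0.1 and 0.2 are made up for illustration, and jsonb_array_elements_text is used here so the cast goes through text):

```sql
-- Compare an exact numeric sum with a binary floating point sum.
SELECT sum(v::numeric) = 0.3         AS numeric_is_exact,  -- true
       sum(v::float8)  = 0.3::float8 AS float_is_exact     -- false: 0.1 + 0.2 is not exactly 0.3 in float8
FROM jsonb_array_elements_text('[0.1, 0.2]'::jsonb) AS t(v);
```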
You need to expand the jsonb array into its elements with the jsonb_array_elements function before summing them. Here’s an example:
SELECT SUM(w::float) AS wavelength_sum
FROM (
SELECT jsonb_array_elements(snx_wavelengths) AS w
FROM my_table
) AS sub; -- a derived table needs an alias before Postgres 16
This should work if I remember correctly (remember to change my_table to your table name). Note that it returns a single total for the whole table rather than one sum per row. More info here: https://www.postgresql.org/docs/9.5/functions-json.html

Not getting unique values in spite of using DISTINCT

I am using the code below to return a set of distinct UUIDs, each with the date of the first action taken on that UUID. The raw data has non-distinct UUIDs, each with the date on which an action was taken. I am trying to extract the unique UUIDs and the first date on which an action was taken, represented by date1. Can someone point out where I am going wrong?
The output I get is the same as the raw data: the UUIDs are not unique and there are many duplicates.
with raw_data as (
select UUID, cast(datestring as timestamp) as date1
from raw)
select
distinct UUID,
date_trunc('week', date1)
from raw_data
Use the min() aggregation function:
select UUID,
min(date_trunc('week', cast(datestring as timestamp)))
from raw
group by UUID;
This should do everything your query is doing. There is no need for a subquery or CTE.
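The reason DISTINCT did not help is that it applies to the whole selected row, so a UUID that appears with two different weeks yields two distinct rows. A small sketch with made-up sample rows (the uuid and datestring values are hypothetical):

```sql
-- Hypothetical sample rows standing in for the raw table.
WITH raw(uuid, datestring) AS (
  VALUES ('a', '2021-01-04 10:00'),
         ('a', '2021-01-12 09:00'),
         ('b', '2021-01-05 08:00')
)
SELECT uuid,
       min(date_trunc('week', cast(datestring AS timestamp))) AS first_week
FROM raw
GROUP BY uuid
ORDER BY uuid;
-- One row per uuid: a -> 2021-01-04 00:00:00, b -> 2021-01-04 00:00:00
-- SELECT DISTINCT uuid, date_trunc('week', ...) would return three rows here,
-- because 'a' appears with two different weeks (2021-01-04 and 2021-01-11).
```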

GROUP BY in Postgres - no equality for JSON data type?

I have the following data in a matches table:
5;{"Id":1,"Teams":[{"Name":"TeamA","Players":[{"Name":"AAA"},{"Name":"BBB"}]},{"Name":"TeamB","Players":[{"Name":"CCC"},{"Name":"DDD"}]}],"TeamRank":[1,2]}
6;{"Id":2,"Teams":[{"Name":"TeamA","Players":[{"Name":"CCC"},{"Name":"BBB"}]},{"Name":"TeamB","Players":[{"Name":"AAA"},{"Name":"DDD"}]}],"TeamRank":[1,2]}
I want to select each last distinct Team in the table by their name. i.e. I want a query that will return:
6;{"Name":"TeamA","Players":[{"Name":"CCC"},{"Name":"BBB"}]}
6;{"Name":"TeamB","Players":[{"Name":"AAA"},{"Name":"DDD"}]}
So each team from last time that team appears in the table.
I have been using the following (from here):
WITH t AS (SELECT id, json_array_elements(match->'Teams') AS team FROM matches)
SELECT MAX(id) AS max_id, team FROM t GROUP BY team->'Name';
But this returns:
ERROR: could not identify an equality operator for type json
SQL state: 42883
Character: 1680
I understand that Postgres doesn't have equality for JSON. I only need equality for the team's name (a string), the players on that team don't need to be compared.
Can anyone suggest an alternative way to do this?
For reference:
SELECT id, json_array_elements(match->'Teams') AS team FROM matches
returns:
5;"{"Name":"TeamA","Players":[{"Name":"AAA"},{"Name":"BBB"}]}"
5;"{"Name":"TeamB","Players":[{"Name":"CCC"},{"Name":"DDD"}]}"
6;"{"Name":"TeamA","Players":[{"Name":"CCC"},{"Name":"BBB"}]}"
6;"{"Name":"TeamB","Players":[{"Name":"AAA"},{"Name":"DDD"}]}"
EDIT: I cast to text and following this question, I used DISTINCT ON instead of GROUP BY. Here's my full query:
WITH t AS (SELECT id, json_array_elements(match->'Teams') AS team
FROM matches ORDER BY id DESC)
SELECT DISTINCT ON (team->>'Name') id, team FROM t;
Returns what I wanted above. Does anyone have a better solution?
Shorter, faster and more elegant with a LATERAL join:
SELECT DISTINCT ON (t.team->>'Name') t.team
FROM matches m, json_array_elements(m.match->'Teams') t(team)
ORDER BY t.team->>'Name', m.id DESC; -- to get the "last"
If you just want distinct teams, the ORDER BY can go. Related:
Query for element of array in JSON column
Query for array elements inside JSON type
JSON and equality
There is no equality operator for the json data type in Postgres, but there is one for jsonb (Postgres 9.4+):
How to query a json column for empty objects?
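To try the DISTINCT ON approach without creating a table, the two rows from the question can be supplied inline. This is only a sketch: the CTE stands in for the real matches table, and m.id is added to the select list so the result matches the output the question asks for:

```sql
-- Inline copy of the question's sample rows (TeamRank omitted for brevity).
WITH matches(id, match) AS (
  VALUES
    (5, '{"Id":1,"Teams":[{"Name":"TeamA","Players":[{"Name":"AAA"},{"Name":"BBB"}]},{"Name":"TeamB","Players":[{"Name":"CCC"},{"Name":"DDD"}]}]}'::json),
    (6, '{"Id":2,"Teams":[{"Name":"TeamA","Players":[{"Name":"CCC"},{"Name":"BBB"}]},{"Name":"TeamB","Players":[{"Name":"AAA"},{"Name":"DDD"}]}]}'::json)
)
SELECT DISTINCT ON (t.team->>'Name') m.id, t.team
FROM matches m, json_array_elements(m.match->'Teams') AS t(team)
ORDER BY t.team->>'Name', m.id DESC;  -- highest id per team name wins
-- Both TeamA and TeamB come back with id 6.
```

The key point is that the grouping expression is text (->> returns text), so no equality operator on json is needed.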

Oracle to PostgreSQL query conversion with string_to_array()

I have below query in Oracle:
SELECT to_number(a.v_VALUE), b.v_VALUE
FROM TABLE(inv_fn_splitondelimiter('12;5;25;10',';')) a
JOIN TABLE(inv_fn_splitondelimiter('10;20;;', ';')) b
ON a.v_idx = b.v_idx
which gives me a result like:
I want to convert the query to Postgres. I have tried a query like:
SELECT UNNEST(String_To_Array('10;20;',';'))
I have also tried:
SELECT a,b
FROM (select UNNEST(String_To_Array('12;5;25;10;2',';'))) a
LEFT JOIN (select UNNEST(String_To_Array('12;5;25;10',';'))) b
ON a = b
But didn't get a correct result.
I don't know how to write a query that's fully equivalent to the Oracle version. Anyone?
Starting with Postgres 9.4 you can use unnest() with multiple arrays to unnest them in parallel:
SELECT *
FROM unnest('{12,5,25,10,2}'::int[]
, '{10,20}' ::int[]) AS t(col1, col2);
That's all. NULL values are filled in automatically for missing elements to the right.
If parameters are provided as strings, convert with string_to_array() first. Like:
SELECT *
FROM unnest(string_to_array('12;5;25;10', ';')
, string_to_array('10;20' , ';')) AS t(col1, col2);
More details and an alternative solution for older versions:
Unnest multiple arrays in parallel
Split given string and prepare case statement
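The Oracle example passes '10;20;;', where the trailing delimiters produce empty strings rather than missing elements. One way to handle that, sketched here under the assumption that the values should end up as integers, is to map empty strings to NULL before casting:

```sql
-- Requires Postgres 9.4+ (multi-argument unnest and WITH ORDINALITY).
SELECT t.idx,
       t.a::int            AS col1,
       NULLIF(t.b, '')::int AS col2   -- '' from the trailing ';;' becomes NULL
FROM unnest(string_to_array('12;5;25;10', ';'),
            string_to_array('10;20;;',    ';')) WITH ORDINALITY AS t(a, b, idx)
ORDER BY t.idx;
-- (1, 12, 10), (2, 5, 20), (3, 25, NULL), (4, 10, NULL)
```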
In the expression select a, the a is not a column but the name of the table alias. Consequently, that expression selects a complete row-tuple (albeit with just a single column), not a single column.
You need to define proper column aliases for the derived tables. It is also recommended to use set returning functions only in the from clause, not in the select list.
If you are not on 9.4 you need to generate the "index" using a window function. If you are on 9.4 then Erwin's answer is much better.
SELECT a.v_value, b.v_value
FROM (
select row_number() over () as idx, -- generate an index for each element
i as v_value
from UNNEST(String_To_Array('12;5;25;10;2',';')) i
) as a
JOIN (
select row_number() over() as idx,
i as v_value
from UNNEST(String_To_Array('10;20;;',';')) i
) as b
ON a.idx = b.idx;
An alternative way in 9.4 would be to use the with ordinality option to generate the row index in case you do need the index value:
select a.v_value, b.v_value
from regexp_split_to_table('12;5;25;10;2',';') with ordinality as a(v_value, idx)
left join regexp_split_to_table('10;20;;',';') with ordinality as b(v_value, idx)
on a.idx = b.idx

How to replace a nested select with "group by" with a "having" clause?

My SQL is getting somewhat rusty and the only way I have managed to retrieve from a table the ids of the newest records (based on a date field) of the same type is with a nested select, but I suspect that there must be a way to do the same with a having clause or something more efficient.
Supposing that the only columns are ID, TYPE and DATE, my current query is:
select ID from MY_TABLE,
(select TYPE as GROUP_TYPE,
max(DATE) as MAX_DATE
from MY_TABLE group by TYPE)
where TYPE = GROUP_TYPE
and DATE = MAX_DATE
(I'm writing it from my memory, maybe there are some syntax errors, but you get the idea)
I'd prefer to stick to pure standard SQL without proprietary extensions.
Then there is no "more efficient" way to write this query. Not in standard ANSI-SQL. The problem is that you are trying to compare an AGGREGATE column (Max-date) against a base column (date) to return another base column (ID). The HAVING clause cannot handle this type of comparison.
There are ways using ROW_NUMBER (windowing function) or MySQL (group by hack) to do it, but those are not portable across database systems.
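For reference, the ROW_NUMBER variant mentioned above might look like the following sketch. The sample rows are made up, and the date column is called dater here only to avoid using DATE as a column name; whether the window function is available depends on the database in use:

```sql
-- Hypothetical sample data in place of MY_TABLE.
WITH my_table(id, type, dater) AS (
  VALUES (1, 'a', DATE '2020-01-01'),
         (2, 'a', DATE '2020-02-01'),
         (3, 'b', DATE '2020-01-15')
)
SELECT id
FROM (
  SELECT id,
         ROW_NUMBER() OVER (PARTITION BY type ORDER BY dater DESC) AS rn  -- 1 = newest row in each type
  FROM my_table
) ranked
WHERE rn = 1
ORDER BY id;
-- returns ids 2 and 3
```

Note that ROW_NUMBER picks exactly one row per type even when two rows share the maximum date, whereas the nested select with max() returns all tied rows; RANK() in place of ROW_NUMBER() would keep the ties.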
SELECT a.id, a.type, a.dater
from my_table a inner join
(
select type, max(dater) as dater2
from my_table
group by type
) b
on a.type= b.type and a.dater= b.dater2
This should get you closer depending on your data
select ID from MY_TABLE
where DATE = (select max(DATE) from MY_TABLE as X
where X.TYPE = MY_TABLE.TYPE)