Split array by portions in PostgreSQL - sql

I need to split an array into two-element portions made of adjacent values only.
For example, I have the following array:
select array[1,2,3,4,5]
And I want to get 4 rows with the following values:
{1,2}
{2,3}
{3,4}
{4,5}
Can I do this with an SQL query?

select a
from (
select array[e, lead(e) over()] as a
from unnest(array[1,2,3,4,5]) u(e)
) a
where not exists (
select 1
from unnest(a) u (e)
where e is null
);
a
-------
{1,2}
{2,3}
{3,4}
{4,5}
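The query above pairs each element with its successor via lead() and then drops the final pair, whose second member is NULL. As a rough illustration of that logic outside the database, a minimal Python sketch (the function name adjacent_pairs is hypothetical) could look like this:

```python
def adjacent_pairs(values):
    """Pair each element with its successor, like array[e, lead(e) over()]."""
    # zip stops at the shorter input, so the last element never gets a
    # partner; this plays the role of filtering out the NULL from lead().
    return [[current, following] for current, following in zip(values, values[1:])]


print(adjacent_pairs([1, 2, 3, 4, 5]))  # [[1, 2], [2, 3], [3, 4], [4, 5]]
```

Unlike the SQL version, this relies on the list already being in the desired order, much as the query relies on unnest returning elements in array order.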

One option is a recursive CTE that starts at the first position in the array and advances one position at a time up to the last.
with recursive cte(a,val,strt,ed,l) as
(select a,a[1:2] as val,1 strt,2 ed,cardinality(a) as l
from t
union all
select a,a[strt+1:ed+1],strt+1,ed+1,l
from cte where ed<l
)
select val from cte
Here a in the CTE is the array column, and t is the table that holds it.
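To make the recursion easier to follow, here is a small Python sketch of the same window walk. It is only an analogy, not PostgreSQL code: the names sliding_slices and width are assumptions, and the loop stands in for the recursive step.

```python
def sliding_slices(a, width=2):
    """Walk a fixed-width window over a list, mirroring strt/ed in the CTE."""
    slices = []
    start, end = 1, width        # 1-based and inclusive, as in PostgreSQL
    length = len(a)              # plays the role of cardinality(a)
    while True:
        # a[strt:ed] in PostgreSQL corresponds to a[start - 1:end] in Python
        slices.append(a[start - 1:end])
        if end >= length:        # the CTE only recurses while ed < l
            break
        start, end = start + 1, end + 1
    return slices


print(sliding_slices([1, 2, 3, 4, 5]))  # [[1, 2], [2, 3], [3, 4], [4, 5]]
```

Note that, like the anchor row of the CTE, the sketch always emits the first slice, so a one-element list yields a single one-element slice rather than nothing.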
Another option, if you know the maximum length of the array, is to use generate_series to produce all numbers from 1 to that maximum and cross join them with the array table, keeping only the numbers up to the array's cardinality. Then use lead to build slices of the array and omit the last one (lead on the last row of a partition is null, so that slice is null as well).
with nums(n) as (select * from generate_series(1,10))
select a,res
from (select a,t.a[nums.n:lead(nums.n) over(partition by t.a order by nums.n)] as res
from nums
cross join t
where cardinality(t.a)>=nums.n
) tbl
where res is not null

Related

How to unnest BigQuery nested records into multiple columns

I am trying to unnest the table below.
I am using the following unnest query to flatten the table:
SELECT
id,
name ,keyword
FROM `project_id.dataset_id.table_id`
,unnest (`groups` ) as `groups`
where id = 204358
The problem is that this duplicates the rows (except for name), as is always the case when flattening a table.
How can I modify the query to put the names in two different columns rather than in rows?
Expected output below:
That's because the comma is a cross join - in combination with an unnested array it is a lateral cross join. You repeat the parent row for every row in the array.
One problem with pivoting arrays is that an array can have a variable number of elements, but a table must have a fixed number of columns.
So you need a way to decide which array element becomes which column.
For example:
SELECT
id,
-- ordinal() raises an error if the array is shorter; safe_ordinal() returns NULL instead
`groups`[ordinal(1)].name as firstArrayEntry,
`groups`[ordinal(2)].name as secondArrayEntry,
keyword
FROM `project_id.dataset_id.table_id`
where id = 204358
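The idea of mapping array positions to a fixed set of columns can be sketched in Python as follows. This is an illustration only: pivot_first_n is a hypothetical helper, the name values are made up, and padding with None corresponds to the behaviour of safe_ordinal rather than ordinal.

```python
def pivot_first_n(names, n=2):
    """Spread the first n list entries over n fixed slots, padding with None."""
    return {
        f"name_{position}": names[position - 1] if position <= len(names) else None
        for position in range(1, n + 1)
    }


# Hypothetical row: the id and keyword come from the question, the names do not.
row = {"id": 204358, "keyword": "OVG", "groups": ["first name", "second name", "third name"]}
flat = {"id": row["id"], "keyword": row["keyword"], **pivot_first_n(row["groups"])}
print(flat)
```

Entries beyond the chosen n are silently dropped, which is the price of a fixed column count.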
If your array had a key-value pair you could decide using the key. E.g.
SELECT
id,
(select value from unnest(`groups`) where key='key1') as key1,
keyword
FROM `project_id.dataset_id.table_id`
where id = 204358
But that doesn't seem to be the case with your table ...
A third option could be PIVOT in combination with your cross-join solution, but this one has restrictions too, and I'm not sure how computation-heavy it is.
Consider the simple solution below:
select * from (
select id, name, keyword, offset
from `project_id.dataset_id.table_id`,
unnest(`groups`) with offset
) pivot (max(name) name for offset + 1 in (1, 2))
If applied to the sample data in your question, the output is:
Note: when you apply this to your real case, you just need to know how many such name_NNN columns to expect and extend the list accordingly, for example for offset + 1 in (1, 2, 3, 4, 5)) if you expect 5 such columns.
If for whatever reason you want to improve on this, use the version below, where everything is built dynamically so you don't need to know in advance how many columns the output will have:
execute immediate (select '''
select * from (
select id, name, keyword, offset
from `project_id.dataset_id.table_id`,
unnest(`groups`) with offset
) pivot (max(name) name for offset + 1 in (''' || string_agg('' || pos, ', ') || '''))
'''
from (select pos from (
select max(array_length(`groups`)) cnt
from `project_id.dataset_id.table_id`
), unnest(generate_array(1, cnt)) pos
))
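The dynamic statement above is essentially string building: it generates the list 1, 2, ..., cnt and splices it into the PIVOT clause. If the query text were assembled on the client side instead, a minimal Python sketch might look like the following. The function name build_pivot_query is an assumption, and max_len would have to be fetched separately (for example with the max(array_length(...)) subquery shown above).

```python
def build_pivot_query(max_len, table="project_id.dataset_id.table_id"):
    """Build the PIVOT query text for positions 1..max_len."""
    # Same role as string_agg('' || pos, ', ') in the SQL version.
    positions = ", ".join(str(pos) for pos in range(1, max_len + 1))
    return (
        "select * from (\n"
        "  select id, name, keyword, offset\n"
        f"  from `{table}`,\n"
        "  unnest(`groups`) with offset\n"
        f") pivot (max(name) name for offset + 1 in ({positions}))"
    )


print(build_pivot_query(3))
```

Only integers derived from the data end up in the generated text, so there is no user-supplied string to escape here.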
Your question is a little unclear, because it does not specify what to do with other keywords or other columns. If you specifically want the first two values in the array for keyword "OVG", you can unnest the array and pull out the appropriate names:
SELECT id,
(SELECT g.name
FROM UNNEST(t.groups) g WITH OFFSET n
WHERE key = 'OVG'
ORDER BY n
LIMIT 1
) as name_1,
(SELECT g.name
FROM UNNEST(t.groups) g WITH OFFSET n
WHERE key = 'OVG'
ORDER BY n
LIMIT 1 OFFSET 1
) as name_2,
'OVG' as keyword
FROM `project_id.dataset_id.table_id` t
WHERE id = 204358;

Filter records using JSON function for JSON array

There is a table where data is stored in JSON format. I need to find how many records there are where a quote is required (quote_required is "Yes").
JSON
[{"id":14,"desc":"Job is incomplete.","quote_required":"Yes"},
{"id":14,"desc":"appointment need to rebook","quote_required":"Yes","start-date":"2021-11-20"}]
I am trying to achieve this using JSON_CONTAINS() and JSON_EXTRACT() as below:
SELECT COUNT(*)
FROM `products`
WHERE JSON_CONTAINS( JSON_EXTRACT(submit_report, "$.quote_required"), '"Yes"' )
But I am getting 0 results here.
You can check each element of the array for quote_required equal to Yes by addressing it with an index value running from 0 up to the length of the array minus 1, generating the index values with a recursive common table expression such as:
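WITH RECURSIVE cte AS
(
SELECT 0 AS n
UNION ALL
SELECT n + 1
FROM cte
-- MAX() keeps the subquery scalar when the table has more than one row
WHERE cte.n < ( SELECT MAX(JSON_LENGTH(submit_report)) - 1 FROM `products` )
)
-- counts matching array elements; indexes past the end of a shorter array give NULL, which SUM ignores
SELECT SUM(JSON_CONTAINS(JSON_EXTRACT(submit_report, CONCAT("$[",n,"].quote_required")),
'"Yes"')) AS count
FROM cte
CROSS JOIN `products`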
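The original query most likely returns 0 because the stored document is an array, so the path $.quote_required, which addresses a key of a top-level object, matches nothing. The per-element check can be sketched in Python as below. This is only an illustration of the logic, not MySQL code: count_quote_required is a hypothetical name, and the second report is made-up sample data.

```python
import json


def count_quote_required(reports):
    """Count reports in which at least one array element has quote_required == "Yes"."""
    total = 0
    for raw in reports:
        items = json.loads(raw)
        # Look inside every element of the array, not at the top level.
        if any(item.get("quote_required") == "Yes" for item in items):
            total += 1
    return total


reports = [
    # The sample document from the question.
    '[{"id":14,"desc":"Job is incomplete.","quote_required":"Yes"},'
    '{"id":14,"desc":"appointment need to rebook","quote_required":"Yes","start-date":"2021-11-20"}]',
    # Hypothetical second record without a required quote.
    '[{"id":15,"desc":"done","quote_required":"No"}]',
]
print(count_quote_required(reports))  # 1
```

This counts records, whereas a SUM over individual elements counts every matching element, so the two numbers differ when one record contains several matches.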

PostgreSQL: How to return a subarray dynamically using array slices in postgresql

I need to sum a subarray from an array using postgresql.
I need to create a postgresql query that will dynamically do this as the upper and lower indexes will be different for each array.
These indexes will come from two other columns within the same table.
I had the below query that will get the subarray:
SELECT
SUM(t) AS summed_index_values
FROM
(SELECT UNNEST(int_array_column[34:100]) AS t
FROM array_table
WHERE id = 1) AS t;
...but I then realised I couldn't use variables or SELECT statements when using array slices to make the query dynamic:
int_array_column[SELECT array_index_lower FROM array_table WHERE id = 1; : SELECT array_index_upper FROM array_table WHERE id = 1;]
...does anyone know how I can achieve this query dynamically?
No need for sub-selects, just use the column names:
SELECT SUM(t) AS summed_index_values
FROM (
SELECT UNNEST(int_array_column[tb.array_index_lower:tb.array_index_upper]) AS t
FROM array_table tb
WHERE id = 1
) AS t;
Note that it's not recommended to use set-returning functions (unnest) in the SELECT list. It's better to put that into the FROM clause:
SELECT sum(t.val)
FROM (
SELECT t.val
FROM array_table tb
cross join UNNEST(tb.int_array_column[tb.array_index_lower:tb.array_index_upper]) AS t(val)
WHERE tb.id = 1
) AS t;
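For readers more used to zero-based slicing, the following Python sketch shows what the slice bounds mean. PostgreSQL array slices are 1-based and include both ends, so the lower bound has to be shifted by one. The function name sum_slice and the sample values are assumptions.

```python
def sum_slice(values, lower, upper):
    """Sum values[lower:upper] using 1-based, inclusive bounds."""
    # Clamp the bounds so out-of-range values shrink the slice
    # instead of wrapping around to the end of the list.
    start = max(lower, 1) - 1
    stop = max(upper, 0)
    return sum(values[start:stop])


print(sum_slice([10, 20, 30, 40, 50], 2, 4))  # 20 + 30 + 40 = 90
```

One difference worth knowing: for an empty slice this sketch returns 0, while SUM over no rows in SQL yields NULL.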

Finding most common elements of column of arrays in Presto

I would like to find the most common elements within a column of arrays in presto.
For example...
col1
[A,B,C]
[A,B]
[A,D]
with output of...
col1 - col2
A - 3
B - 2
C - 1
D - 1
I have tried using flatten and unnest. I am able to get it into a single array using
select flatten(array_agg(col1))
from tablename;
but I am then not sure how to group and count by the distinct elements. I also am struggling to get this to run on all of my data because of the large amount of memory required.
Thanks for any help!
You can use unnest() to flatten the array and then GROUP BY to group the unique values.
The query below generates the data set for your case. You can replace this part with your own SELECT in the final query:
with dataset AS (
SELECT ARRAY[
ARRAY['A','B','C'],
ARRAY['A','B'],
ARRAY['A','D']
] AS data
)
select dt from dataset
CROSS JOIN UNNEST(data) AS t(dt)
Output:
------
dt
------
[A,B,C]
------
[A,B]
------
[A,D]
In the final query we first flatten this data to pull the individual values out of every row, and then group those values to get each unique value and its count.
Final query:
with da AS(
with dataset AS (
SELECT ARRAY[
ARRAY['A','B','C'],
ARRAY['A','B'],
ARRAY['A','D']
] AS data
)
select dt from dataset
CROSS JOIN UNNEST(data) AS t(dt)
)
select daVal,count(*) from da
CROSS JOIN UNNEST(dt) AS t(daVal)
GROUP BY daVal
You can unnest() and aggregate:
select u.col, count(*)
from t cross join
unnest(col1) u(col)
group by u.col;
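The unnest-then-group pattern has a direct counterpart in Python, which may help when checking results on a small sample. A minimal sketch, using the example rows from the question (the function name element_counts is an assumption):

```python
from collections import Counter


def element_counts(rows):
    """Flatten a list of lists and count how often each element occurs."""
    # The generator plays the role of CROSS JOIN UNNEST; Counter does the GROUP BY.
    return Counter(element for row in rows for element in row)


rows = [["A", "B", "C"], ["A", "B"], ["A", "D"]]
print(element_counts(rows).most_common())
# [('A', 3), ('B', 2), ('C', 1), ('D', 1)]
```

Because the elements are streamed one at a time, nothing like flatten(array_agg(col1)) has to be materialised first, which is also why the unnest query is lighter on memory than the approach tried in the question.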

Returning the lowest integer not in a list in SQL

Suppose you have a table T(A) with only positive integers allowed, like:
1,1,2,3,4,5,6,7,8,9,11,12,13,14,15,16,17,18
In the above example, the result is 10. We can always use ORDER BY and DISTINCT to sort and remove duplicates. However, to find the lowest integer not in the list, I came up with the following SQL query:
select list.x + 1
from (select x from (select distinct a as x from T order by a)) as list, T
where list.x + 1 not in T limit 1;
My idea is to start a counter at 1 and check whether that counter is in the list: if it is not, return it; otherwise increment and look again. However, I have to start that counter at 1 and then increment. The query works in most cases, but there are some corner cases, such as when 1 itself is missing. How can I accomplish this in SQL, or should I take a completely different approach to this problem?
Because SQL works on sets, the intermediate SELECT DISTINCT a AS x FROM t ORDER BY a is redundant.
The basic technique of looking for a gap in a column of integers is to find where the current entry plus 1 does not exist. This requires a self-join of some sort.
Your query is not far off, but I think it can be simplified to:
SELECT MIN(a) + 1
FROM t
WHERE a + 1 NOT IN (SELECT a FROM t)
The NOT IN acts as a sort of self-join. On an empty table this returns NULL, and it never reports 1 as the missing value, but it should be OK otherwise.
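To see how the simplified query behaves on the sample data and on the edge cases, it can be run against an in-memory SQLite database. The sketch below assumes SQLite as the engine (the question does not name one), and the helper lowest_gap is hypothetical.

```python
import sqlite3

QUERY = "SELECT MIN(a) + 1 FROM t WHERE a + 1 NOT IN (SELECT a FROM t)"


def lowest_gap(values):
    """Load values into a throwaway table t(a) and run the NOT IN query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (a INTEGER NOT NULL)")
    conn.executemany("INSERT INTO t (a) VALUES (?)", [(v,) for v in values])
    result = conn.execute(QUERY).fetchone()[0]
    conn.close()
    return result


sample = [1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17, 18]
print(lowest_gap(sample))   # 10
print(lowest_gap([]))       # None: MIN over no rows is NULL
print(lowest_gap([2, 3]))   # 4: the missing 1 is not detected
```

The last call shows the corner case from the question: the query only looks upward from existing values, so a gap below the smallest value goes unnoticed.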
select min(y.a) as a
from
t x
right join
(
select a + 1 as a from t
union
select 1
) y on y.a = x.a
where x.a is null
This works even on an empty table.
SELECT min(t.a) - 1
FROM t
LEFT JOIN t t1 ON t1.a = t.a - 1
WHERE t1.a IS NULL
AND t.a > 1; -- exclude 0
This finds the smallest number greater than 1, where the next-smaller number is not in the same table. That missing number is returned.
This works even for a missing 1. There are multiple answers checking in the opposite direction. All of them would fail with a missing 1.
You can do the following, although you may also want to define a range, in which case you might need a couple of UNIONs:
SELECT x.id+1
FROM my_table x
LEFT JOIN my_table y
ON x.id+1 = y.id
WHERE y.id IS NULL
ORDER BY x.id
LIMIT 1;
You can always create a table with all of the numbers from 1 to X and then join that table with the table you are comparing. Then just find the TOP value in your SELECT statement that isn't present in the table you are comparing
SELECT TOP 1 table_with_all_numbers.number, table_with_missing_numbers.number
FROM table_with_all_numbers
LEFT JOIN table_with_missing_numbers
ON table_with_missing_numbers.number = table_with_all_numbers.number
WHERE table_with_missing_numbers.number IS NULL
ORDER BY table_with_all_numbers.number ASC;
In SQLite 3.8.3 or later, you can use a recursive common table expression to create a counter.
Here, we stop counting when we find a value not in the table:
WITH RECURSIVE counter(c) AS (
SELECT 1
UNION ALL
SELECT c + 1 FROM counter WHERE c IN t)
SELECT max(c) FROM counter;
(This works for an empty table or a missing 1.)
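Since Python ships with SQLite, the recursive counter is easy to try out. The sketch below runs the query above unchanged against an in-memory database; the helper name first_missing is an assumption, and it requires the bundled SQLite to be 3.8.3 or later.

```python
import sqlite3

QUERY = """
WITH RECURSIVE counter(c) AS (
    SELECT 1
    UNION ALL
    SELECT c + 1 FROM counter WHERE c IN t)
SELECT max(c) FROM counter;
"""


def first_missing(values):
    """Load values into a throwaway table t(a) and run the recursive counter."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (a INTEGER NOT NULL)")
    conn.executemany("INSERT INTO t (a) VALUES (?)", [(v,) for v in values])
    result = conn.execute(QUERY).fetchone()[0]
    conn.close()
    return result


sample = [1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17, 18]
print(first_missing(sample))   # 10
print(first_missing([]))       # 1
print(first_missing([2, 3]))   # 1
```

The counter always starts at 1, which is why the empty table and the missing-1 case need no special handling.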
This query ranks (starting from rank 1) each distinct number in ascending order and selects the lowest rank that's less than its number. If no rank is lower than its number (i.e. there are no gaps in the table) the query returns the max number + 1.
select coalesce(min(number),1) from (
select min(cnt) number
from (
select
number,
(select count(*) from (select distinct number from numbers) b where b.number <= a.number) as cnt
from (select distinct number from numbers) a
) t1 where number > cnt
union
select max(number) + 1 number from numbers
) t1
http://sqlfiddle.com/#!7/720cc/3
Just another method, using EXCEPT this time:
SELECT a + 1 AS missing FROM T
EXCEPT
SELECT a FROM T
ORDER BY missing
LIMIT 1;
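As written, the EXCEPT query only considers a + 1 for values already in the table, so it cannot return 1 when 1 is missing and returns no row for an empty table. One possible variation, sketched here under the assumption of SQLite (where compound operators are evaluated left to right), seeds the candidate set with 1 through a UNION. The helper name lowest_missing is hypothetical.

```python
import sqlite3

# (a + 1 values UNION the constant 1) EXCEPT the values already present.
QUERY = """
SELECT a + 1 AS missing FROM t
UNION
SELECT 1
EXCEPT
SELECT a FROM t
ORDER BY missing
LIMIT 1;
"""


def lowest_missing(values):
    """Load values into a throwaway table t(a) and run the seeded EXCEPT query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (a INTEGER NOT NULL)")
    conn.executemany("INSERT INTO t (a) VALUES (?)", [(v,) for v in values])
    row = conn.execute(QUERY).fetchone()
    conn.close()
    return row[0] if row else None


sample = [1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17, 18]
print(lowest_missing(sample))   # 10
print(lowest_missing([]))       # 1
print(lowest_missing([2, 3]))   # 1
```

In engines where set operators have different precedence, the UNION part may need explicit parentheses or a subquery to get the same grouping.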