Join Postgresql array to table - sql

I have following tables
create table top100
(
id integer not null,
top100ids integer[] not null
);
create table top100_data
(
id integer not null,
data_string text not null
);
Rows in table top100 look like:
1, {1,2,3,4,5,6...100}
Rows in table top100_data look like:
1, 'string of text, up to 500 chars'
I need to get the text values from table top100_data and join them with table top100.
So the result will be:
1, {'text1','text2','text3',...'text100'}
I am currently doing this on the application side: selecting from top100, iterating over all array items, then selecting from top100_data and iterating again to transform the ids into their _data text values.
This can be very slow on large data sets.
Is it possible to get the same result with a single SQL query?

You can unnest() and re-aggregate:
select t100.id, array_agg(t100d.data_string order by top100id)
from top100 t100 cross join
unnest(top100ids) as top100id join
top100_data t100d
on t100d.id = top100id
group by t100.id;
Or if you want to keep the original ordering:
select t100.id, array_agg(t100d.data_string order by top100id.n)
from top100 t100 cross join
unnest(top100ids) with ordinality as top100id(id, n) join
top100_data t100d
on t100d.id = top100id.id
group by t100.id;

Just use the unnest and array_agg functions in PostgreSQL. Your final SQL could look like this:
with core as (
select
id,
unnest(top100ids) as top_id
from
top100
)
select
c.id,
array_agg(t1.data_string) as text_datas
from
top100_data t1
join
core c on t1.id = c.top_id
group by
c.id;
An example of unnest:
postgres=# select * from my_test;
id | top_ids
----+--------------
1 | {1,2,3,4,5}
2 | {6,7,8,9,10}
(2 rows)
postgres=# select id, unnest(top_ids) from my_test;
id | unnest
----+--------
1 | 1
1 | 2
1 | 3
1 | 4
1 | 5
2 | 6
2 | 7
2 | 8
2 | 9
2 | 10
(10 rows)
An example of array_agg:
postgres=# select * from my_test_1 ;
id | content
----+---------
1 | a
1 | b
1 | c
1 | d
2 | x
2 | y
(6 rows)
postgres=# select id,array_agg(content) from my_test_1 group by id;
id | array_agg
----+-----------
1 | {a,b,c,d}
2 | {x,y}
(2 rows)

Related

How to convert JSONB array of pair values to rows and columns?

Given that I have a jsonb column with an array of pair values:
[1001, 1, 1002, 2, 1003, 3]
I want to turn each pair into a row, with each pair values as columns:
| a | b |
|------|---|
| 1001 | 1 |
| 1002 | 2 |
| 1003 | 3 |
Is something like that even possible in an efficient way?
I found a few inefficient (slow) ways, like using LEAD(), or joining the same table with the value from next row, but queries take ~ 10 minutes.
DDL:
CREATE TABLE products (
id int not null,
data jsonb not null
);
INSERT INTO products VALUES (1, '[1001, 1, 1002, 2, 1003, 3]');
DB Fiddle: https://www.db-fiddle.com/f/2QnNKmBqxF2FB9XJdJ55SZ/0
Thanks!
This is not an elegant approach from a declarative standpoint, but can you please see whether this performs better for you?
with indexes as (
select id, generate_series(1, jsonb_array_length(data) / 2) - 1 as idx
from products
)
select p.id, p.data->>(2 * i.idx) as a, p.data->>(2 * i.idx + 1) as b
from indexes i
join products p on p.id = i.id;
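To see the index arithmetic of the query above in isolation, here is a minimal Python sketch. The function name pairs_from_flat is hypothetical and not part of the original answer; it only mirrors the idea that pair number idx is read from positions 2*idx and 2*idx + 1 of the flat array.

```python
def pairs_from_flat(flat):
    """Split a flat [a0, b0, a1, b1, ...] list into (a, b) tuples."""
    # Same arithmetic as the SQL: idx runs over 0 .. len/2 - 1, and
    # pair idx is read from positions 2*idx and 2*idx + 1.
    # A trailing unpaired element is ignored, as integer division does in the query.
    return [(flat[2 * idx], flat[2 * idx + 1]) for idx in range(len(flat) // 2)]


rows = pairs_from_flat([1001, 1, 1002, 2, 1003, 3])
print(rows)
```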
This query
SELECT j.data
FROM products
CROSS JOIN jsonb_array_elements(data) j(data)
should run faster if you just need to unpivot all elements within the query.
Or, even shorter, call the function directly in the select list:
SELECT jsonb_array_elements(data)
FROM products
Or, if you need to return the result like this
| a | b |
|------|---|
| 1001 | 1 |
| 1002 | 2 |
| 1003 | 3 |
with each pair split into two columns, then use:
SELECT MAX(CASE WHEN mod(q.rn, 2) = 1 THEN q.elem END) AS a,
MAX(CASE WHEN mod(q.rn, 2) = 0 THEN q.elem END) AS b
FROM
(
-- WITH ORDINALITY numbers the elements of each array in array order, starting at 1
SELECT p.id, j.elem, j.rn
FROM products p
CROSS JOIN jsonb_array_elements_text(p.data) WITH ORDINALITY AS j(elem, rn)) q
GROUP BY q.id, (q.rn + 1) / 2
ORDER BY q.id, (q.rn + 1) / 2

SQL How to filter table with values having more than one unique value of another column

I have data table Customers that looks like this:
ID | Sequence No |
1 | 1 |
1 | 2 |
1 | 3 |
2 | 1 |
2 | 1 |
2 | 1 |
3 | 1 |
3 | 2 |
I would like to filter the table so that only IDs with more than 1 distinct count of Sequence No remain.
Expected output:
ID | Sequence No |
1 | 1 |
1 | 2 |
1 | 3 |
3 | 1 |
3 | 2 |
I tried
select ID, Sequence No
from Customers
where count(distinct Sequence No) > 1
order by ID
but I'm getting error. How to solve this?
You can get the desired result by using the below query. This is similar to what you were trying -
Sample Table & Data
Declare @Data table
(Id int, [Sequence No] int)
Insert into @Data
values
(1 , 1 ),
(1 , 2 ),
(1 , 3 ),
(2 , 1 ),
(2 , 1 ),
(2 , 1 ),
(3 , 1 ),
(3 , 2 )
Query
Select * from @Data
where ID in(
select ID
from @Data
Group by ID
Having count(distinct [Sequence No]) > 1
)
Using analytic functions, we can try:
WITH cte AS (
SELECT *, MIN([Sequence No]) OVER (PARTITION BY ID) min_seq,
MAX([Sequence No]) OVER (PARTITION BY ID) max_seq
FROM Customers
)
SELECT ID, [Sequence No]
FROM cte
WHERE min_seq <> max_seq
ORDER BY ID, [Sequence No];
We are checking for a distinct count of sequence number by asserting that the minimum and maximum sequence numbers are not the same for a given ID. The above query could benefit from the following index:
CREATE INDEX idx ON Customers (ID, [Sequence No]);
This would let the min and max values be looked up faster.
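As a rough check that the two approaches above agree on the sample data, here is a sketch that runs both through Python's built-in sqlite3 module instead of SQL Server. That swap is an assumption made purely so the example is self-contained; the window-function query needs SQLite 3.25 or newer, and the min/max trick assumes [Sequence No] is never NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (ID INTEGER, [Sequence No] INTEGER)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [(1, 1), (1, 2), (1, 3), (2, 1), (2, 1), (2, 1), (3, 1), (3, 2)],
)

# Approach 1: keep IDs whose group has more than one distinct sequence number.
having_rows = conn.execute(
    """
    SELECT ID, [Sequence No] FROM Customers
    WHERE ID IN (
        SELECT ID FROM Customers
        GROUP BY ID
        HAVING COUNT(DISTINCT [Sequence No]) > 1
    )
    ORDER BY ID, [Sequence No]
    """
).fetchall()

# Approach 2: an ID has more than one distinct value exactly when its
# minimum and maximum sequence numbers differ.
window_rows = conn.execute(
    """
    WITH cte AS (
        SELECT ID, [Sequence No],
               MIN([Sequence No]) OVER (PARTITION BY ID) AS min_seq,
               MAX([Sequence No]) OVER (PARTITION BY ID) AS max_seq
        FROM Customers
    )
    SELECT ID, [Sequence No] FROM cte
    WHERE min_seq <> max_seq
    ORDER BY ID, [Sequence No]
    """
).fetchall()

print(having_rows == window_rows)
print(having_rows)
```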

SQL select all rows in a single row's "history"

I have a table that looks like this:
ID | PARENT_ID
--------------
0 | NULL
1 | 0
2 | NULL
3 | 1
4 | 2
5 | 4
6 | 3
Being an SQL noob, I'm not sure if I can accomplish what I would like in a single command.
What I would like is to start at row 6, and recursively follow the "history", using the PARENT_ID column to reference the ID column.
The result (in my mind) should look something like:
6|3
3|1
1|0
0|NULL
I already tried something like this:
SELECT T1.ID
FROM Table T1, Table T2
WHERE T1.ID = 6
OR T1.PARENT_ID = T2.PARENT_ID;
but that just gave me a strange result.
With a recursive cte.
If you want to start from the maximum id:
with recursive cte (id, parent_id) as (
select t.*
from (
select *
from tablename
order by id desc
limit 1
) t
union all
select t.*
from tablename t inner join cte c
on t.id = c.parent_id
)
select * from cte
If you want to start specifically from id = 6:
with recursive cte (id, parent_id) as (
select *
from tablename
where id = 6
union all
select t.*
from tablename t inner join cte c
on t.id = c.parent_id
)
select * from cte;
Results:
| id | parent_id |
| --- | --------- |
| 6 | 3 |
| 3 | 1 |
| 1 | 0 |
| 0 | |
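The recursive query can be tried out without a database server. The sketch below uses Python's built-in sqlite3 module (an assumption for portability; the question does not say which database is in use) and a hypothetical helper named ancestry. It also adds a depth column, which is not in the answer above, so that the output order is explicit instead of relying on how the engine happens to emit rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (id INTEGER PRIMARY KEY, parent_id INTEGER)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?)",
    [(0, None), (1, 0), (2, None), (3, 1), (4, 2), (5, 4), (6, 3)],
)


def ancestry(conn, start_id):
    """Return (id, parent_id) rows from start_id up to the root of its chain."""
    return conn.execute(
        """
        WITH RECURSIVE cte (id, parent_id, depth) AS (
            -- anchor member: the row we start from
            SELECT id, parent_id, 0 FROM tablename WHERE id = ?
            UNION ALL
            -- recursive member: the parent of the row found in the previous step
            SELECT t.id, t.parent_id, c.depth + 1
            FROM tablename t JOIN cte c ON t.id = c.parent_id
        )
        SELECT id, parent_id FROM cte ORDER BY depth
        """,
        (start_id,),
    ).fetchall()


history = ancestry(conn, 6)
print(history)
```

The recursion stops on its own once a row with a NULL parent_id is reached, because no row can join on NULL.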

Select records not in another table with additional criteria

I am working on an ACCESS DB.
I have 1 table (tblData) with 1 column ( DataId) and 3 entries:
tblData (A)
+--------+
| DataId |
+--------+
| 1 |
| 2 |
| 3 |
+--------+
Another table (tblSelections) contains 3 columns (id, dataid, userid) and has 3 entries:
tblSelections (B)
+----+--------+---------+
| id | dataid | userid |
+----+--------+---------+
| 1 | 1 | 5 |
| 2 | 2 | 5 |
| 3 | 3 | 2 |
+----+--------+---------+
How can I select the records from table A (tblData) which are not in tbl B (tblSelections) for a certain 'userid'?
For 'userid' 5 the query must return 'DataId' 3 from table A as dataid 1 & 2 are already present in table B for userid 5.
For 'userid' 2 the query must return 'DataId' 1 & 2 from table A as dataid 3 is already present in table B for userid 2.
For 'userid' 1 the query must return 'DataId' 1, 2 & 3 from table A as no records are present in table B for userid 1
Use NOT EXISTS or NOT IN for queries like yours:
SELECT *
FROM tblData
WHERE DataId NOT IN
(
SELECT dataid
FROM tblSelections
WHERE userid = 5
);
SELECT *
FROM tblData
WHERE NOT EXISTS
(
SELECT *
FROM tblSelections
WHERE tblSelections.dataid = tblData.DataId AND tblSelections.userid = 5
);
You can use an outer join to select all records, then put a condition in the where clause that a non-nullable column in b is null. This will give you all records in a that do not have a matching row in b according to the join conditions.
This query assumes that you have a parameter or variable named @userid that represents the user ID to search against.
select
a.*
from tblData a
left join tblSelections b on b.dataid = a.dataid and b.userid = @userid
where b.id is null
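To compare the NOT EXISTS and outer-join forms against the three cases from the question, here is a sketch using Python's built-in sqlite3 module in place of Access (an assumption made so the example runs anywhere; Access SQL syntax differs in places). The helper names unselected_not_exists and unselected_left_join are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblData (DataId INTEGER)")
conn.execute("CREATE TABLE tblSelections (id INTEGER, dataid INTEGER, userid INTEGER)")
conn.executemany("INSERT INTO tblData VALUES (?)", [(1,), (2,), (3,)])
conn.executemany(
    "INSERT INTO tblSelections VALUES (?, ?, ?)",
    [(1, 1, 5), (2, 2, 5), (3, 3, 2)],
)


def unselected_not_exists(conn, userid):
    """DataIds with no tblSelections row for this user, via NOT EXISTS."""
    sql = """
        SELECT a.DataId FROM tblData a
        WHERE NOT EXISTS (
            SELECT 1 FROM tblSelections b
            WHERE b.dataid = a.DataId AND b.userid = ?
        )
        ORDER BY a.DataId
    """
    return [row[0] for row in conn.execute(sql, (userid,))]


def unselected_left_join(conn, userid):
    """Same result via LEFT JOIN, keeping only the rows that found no match."""
    sql = """
        SELECT a.DataId FROM tblData a
        LEFT JOIN tblSelections b
               ON b.dataid = a.DataId AND b.userid = ?
        WHERE b.id IS NULL
        ORDER BY a.DataId
    """
    return [row[0] for row in conn.execute(sql, (userid,))]


for uid in (5, 2, 1):
    print(uid, unselected_not_exists(conn, uid), unselected_left_join(conn, uid))
```

The user filter has to sit in the ON clause of the outer join; moving it to WHERE would discard the unmatched rows the query is looking for.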

SELECT query with cross rows WHERE statement

I'll try to explain the type of the query that I want:
Assume I have a table like this:
| ID | someID | Number |
|----|--------|--------|
| 1 | 1 | 10 |
| 2 | 1 | 11 |
| 3 | 1 | 14 |
| 4 | 2 | 10 |
| 5 | 2 | 13 |
Now I want to find the someID whose numbers are exactly a given set. For example, a query for numbers 10, 11, 14 returns someID 1, and a query for numbers 10, 13 returns someID 2. But if a someID contains all the query numbers plus additional ones, it must not be returned (for example, a query for 10, 11 returns nothing).
Is it possible?
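SELECT t1.someId
FROM yourTable t1
WHERE t1.number IN (10,14,11)
GROUP BY t1.someID
-- 3 is the number of values queried for: every one of them must be present ...
HAVING COUNT(DISTINCT t1.number) = 3
-- ... and the someID must have no rows outside the queried values
AND COUNT(DISTINCT t1.ID) = (SELECT COUNT(DISTINCT t2.ID) FROM yourTable t2 WHERE t1.someID=t2.someID)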
select someID
from yourtable
where number in (10,11,14)
and not exists (select * from yourtable t2 where number not in(10,11,14)
and t2.someid=yourtable.someid)
group by someID
having count(distinct ID) = 3
Where 3 is the number of items you are querying for
Yes, once you get the query numbers into a table variable (say it's called @QNums, with one column named QNum), try
Select distinct someId
From yourTable t
Where exists (Select * from @QNums
where QNum = t.Number)
And not Exists (Select * From yourTable t2
Where someId = t.someId
And not exists(Select * From @QNums
where QNum = t2.Number))
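The exact-match requirement can also be written with plain aggregation. This is a variant, not one of the answers above: a someID qualifies when its count of distinct numbers equals the size of the queried set and none of its numbers fall outside that set. The sketch runs on Python's built-in sqlite3 module (an assumption; the question names no database), and the helper name exact_match is hypothetical. It expects a non-empty list of numbers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (ID INTEGER, someID INTEGER, Number INTEGER)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?)",
    [(1, 1, 10), (2, 1, 11), (3, 1, 14), (4, 2, 10), (5, 2, 13)],
)


def exact_match(conn, numbers):
    """Return someIDs whose set of Number values equals the given set exactly."""
    wanted = sorted(set(numbers))
    marks = ", ".join("?" for _ in wanted)  # one placeholder per queried number
    sql = f"""
        SELECT someID
        FROM yourTable
        GROUP BY someID
        HAVING COUNT(DISTINCT Number) = ?          -- as many distinct numbers as queried
           AND SUM(Number NOT IN ({marks})) = 0    -- and none outside the queried set
        ORDER BY someID
    """
    return [row[0] for row in conn.execute(sql, [len(wanted), *wanted])]


print(exact_match(conn, [10, 11, 14]))
print(exact_match(conn, [10, 13]))
print(exact_match(conn, [10, 11]))
```

Together the two HAVING conditions force set equality: no number outside the queried set, and just as many distinct numbers as were queried.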