Iterate through singly-linked list in PostgreSQL - sql

I have a table with a primary (unique) key id and a foreign key next_id which points to another (next) entry in this same table (or to none if it's the last entry).
So the example data may look like this:
| id | next_id |
| -------- | ---------- |
| 1 | 3 |
| 2 | null |
| 3 | 2 |
What's the most efficient SQL query to iterate through such a linked list in Postgres and return the list of linked ids (which would be [1, 3, 2] for the example table)?

I figured it out using a recursive query, as suggested in the comments by @Laurenz Albe:
WITH RECURSIVE tmp_table AS (
    SELECT id, next_id -- next_id must be carried along so the join below can use it
    FROM main_table
    WHERE id = 1
    UNION
    SELECT m.id, m.next_id
    FROM main_table m
    INNER JOIN tmp_table t ON t.next_id = m.id
)
SELECT id FROM tmp_table;
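A recursive CTE does not formally guarantee row order, so to get the ids back as a single ordered array it may help to carry a position counter and aggregate on it. A minimal sketch, assuming the same main_table(id, next_id) as above; the names chain, depth and ids are made up for illustration:

```sql
WITH RECURSIVE chain AS (
    -- anchor: the head of the list, at position 1
    SELECT id, next_id, 1 AS depth
    FROM main_table
    WHERE id = 1
    UNION ALL
    -- step: follow next_id to the following row and bump the position
    SELECT m.id, m.next_id, c.depth + 1
    FROM main_table m
    JOIN chain c ON c.next_id = m.id
)
SELECT array_agg(id ORDER BY depth) AS ids
FROM chain;
```

For the example table this should give {1,3,2}. Note that with UNION ALL a list containing a cycle would recurse forever, so this assumes the list really ends in a null next_id.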

Related

How to traverse postgresql in the form of linked list?

I have a table in the form of linked list.
| unique_id | next |
| -------- | ------- |
| 1 | 3 |
| 2 | null |
| 3 | 2 |
Here unique_id is the id of the row, and next is the id of the row it points to. Another table keeps track of the head; let's say the row with unique_id = 1 is the head. I want a query that takes the head from headTable and returns the rows in linked-list order, 1->3->2, as an array of rows.
Expected result: [{unique_id:1, next:3}, {unique_id:3, next:2}, {unique_id:2, next:null}]
Sorry, I couldn't get the table above to render properly, which is why it is formatted as code.
Try a recursive query - and add a "path" column to the query:
CREATE TABLE
-- your input ....
indata(unique_id,next) AS (
SELECT 1,3
UNION ALL SELECT 2,null
UNION ALL SELECT 3,2
)
;
\pset null NULL
WITH RECURSIVE recursion AS (
  SELECT
    unique_id
  , next
  , (unique_id::TEXT || '>' || next::TEXT) AS path -- TEXT so both branches agree on the column type
  FROM indata
  WHERE unique_id=1 -- need a starting filter
  UNION ALL
  SELECT
    c.unique_id
  , c.next
  , p.path || '>' || COALESCE(c.next::TEXT, 'NULL') -- COALESCE is PostgreSQL's counterpart of NVL
  FROM recursion p
  JOIN indata c ON p.next = c.unique_id
)
SELECT * FROM recursion;
-- out unique_id | next | path
-- out -----------+------+------------
-- out 1 | 3 | 1>3
-- out 3 | 2 | 1>3>2
-- out 2 | NULL | 1>3>2>NULL
A bit late with the answer, but here is my version, without adding columns.
with recursive headtablelist as (
select unique_id, next
from headtable
where unique_id = 1
union
select e.unique_id, e.next
from headtable e
inner join headtablelist s on s.next = e.unique_id
)
select * from headtablelist;
Demo in sqldaddy.io
More information about recursive queries can be found here.
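The question also asks for the starting row to come from the head table and for the result to be an array of rows. A rough sketch of how that could look in PostgreSQL 9.4 or later; the names list_table, head_table and head_id are hypothetical, since the question does not give the real table or column names, and the head table is assumed to hold exactly one row:

```sql
WITH RECURSIVE ordered AS (
    -- anchor: the row that the (hypothetical) head table points to
    SELECT l.unique_id, l.next, 1 AS pos
    FROM list_table l
    JOIN head_table h ON h.head_id = l.unique_id
    UNION ALL
    -- step: follow the next pointer and record the position in the list
    SELECT l.unique_id, l.next, o.pos + 1
    FROM list_table l
    JOIN ordered o ON o.next = l.unique_id
)
SELECT json_agg(
           json_build_object('unique_id', o.unique_id, 'next', o.next)
           ORDER BY o.pos
       ) AS list_rows
FROM ordered o;
```

The pos counter is only there to make the order of the aggregated array explicit.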

ARRAY_AGG without duplicates

In a PostgreSQL database I have a table with columns such as ITEM_ID and PARENT_ITEM_ID.
| ITEM_ID | ITEM_NAME | PARENT_ITEM_ID |
|---------|-----------|----------------|
| 1 | A | 0 |
| 2 | B | 0 |
| 3 | C | 1 |
My task is to take all values from these columns and put them into one array, removing all duplicates at the same time. I started with the SQL query below, but what is the best way to remove the duplicates?
SELECT
ARRAY_AGG(ITEM_ID || ',' || PARENT_ITEM_ID)
FROM
ITEMS_RELATIONSHIP
GROUP BY
ITEM_ID
I want this result:
[1,0,2,3]
Right now I get this result:
|{1,0}|
|{2,0}|
|{3,1}|
If you want one array of all item IDs, don't group by item_id. Something like this might be what you want:
select
    array_agg(item_id) as itemlist -- array_agg takes no delimiter argument
from
(
    select item_id from items_relationship
    union
    select parent_item_id from items_relationship
) as allitems;
Here is one method to get the parent item ids in with the other item ids:
select array_agg(distinct v.item_id) -- qualified, because ir.item_id is also in scope
from items_relationship ir cross join lateral
     (values (ir.item_id), (ir.parent_item_id)) v(item_id);
This unpivots the data using a lateral join and then aggregates.

replace select table by an update

I have an intermediate table:
text_mining_molecule
|text_mining_id| molecule_id |
| -------------| ---------- |
| ID | ID |
and two other tables:
Table molecule:
id | main_name | others …
--- | --------- | ------
1 | caféine | others …
Table json_text_mining:
id | title | molecule_name | others …
---|------- |-------------------------------------|------
1 | title1 | colchicine, cellulose, acid, caféine| others …
text_mining_molecule needs to be filled, when a choice is selected in a list, with IDs from the two other tables, json_text_mining and molecule.
Currently there is a dropdown that already inserts all rows from json_text_mining into text_mining when a score under 4 is chosen.
INSERT INTO text_mining (id, solrid, originalpaper, annotatedfile, title, keyword, importantsentence, version, klimischscore, moleculename, synonymname, validation)
SELECT id, solrid, originalpaper, annotatedfile, title, keyword, importantsentence, version, klimischscore, molecule_name, synonym_name, validation
FROM json_text_mining WHERE klimischscore < 4
This works, but I also need text_mining_molecule to be filled with the related IDs, so I also have this piece of code:
SELECT s.id, m.id
FROM (SELECT id, regexp_split_to_table(molecule_name,', ') AS m_name
FROM json_text_mining) AS s, molecule m
WHERE m.main_name = s.m_name;
How can I fill the text_mining_molecule table directly with an INSERT instead of a SELECT?
Use a CTE. For example, if text_mining_molecule.molecule references molecule.id, it would be something like:
with c as (
SELECT s.id sid, m.id mid
FROM (SELECT id, regexp_split_to_table(molecule_name,', ') AS m_name
FROM json_text_mining) AS s, molecule m
WHERE m.main_name = s.m_name
)
update text_mining_molecule t
set sid = c.sid
from c
where t.molecule = c.mid
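Since the goal is to fill text_mining_molecule rather than change existing rows, the SELECT from the question can also feed an INSERT directly. A minimal sketch, assuming the column names text_mining_id and molecule_id from the table shown in the question, and assuming the same klimischscore < 4 filter should apply as for text_mining:

```sql
INSERT INTO text_mining_molecule (text_mining_id, molecule_id)
SELECT s.id, m.id
FROM (
    -- one row per molecule name listed in the comma-separated column
    SELECT id, regexp_split_to_table(molecule_name, ', ') AS m_name
    FROM json_text_mining
    WHERE klimischscore < 4 -- assumption: only rows that were copied to text_mining
) AS s
JOIN molecule m ON m.main_name = s.m_name;
```

If text_mining_molecule has foreign keys to text_mining, this would need to run after the INSERT into text_mining.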

Delete rows except for one for every id

I have a dataset with multiple ids. For every id there are multiple entries. Like this:
--------------
| ID | Value |
--------------
| 1 | 3 |
| 1 | 4 |
| 1 | 2 |
| 2 | 1 |
| 2 | 2 |
| 3 | 3 |
| 3 | 5 |
--------------
Is there a SQL DELETE query to delete (random) rows for every id, except for one (random rows would be nice but is not essential)? The resulting table should look like this:
--------------
| ID | Value |
--------------
| 1 | 2 |
| 2 | 1 |
| 3 | 5 |
--------------
Thanks!
It doesn't look like HSQLDB fully supports OLAP functions (in this case row_number() over (partition by ...)), so you'll need to use a derived table to identify the one value you want to keep for each ID. It certainly won't be random, but I don't think anything else will be either.
This query will give you the first part:
select
id,
min(value) as minval
from
<your table>
group by id
Then you can delete from your table where you don't match:
delete from
<your table> t1
inner join
(
select
id,
min(value) as minval
from
<your table>
group by id
) t2
on t1.id = t2.id
and t1.value <> t2.value
Try this:
alter ignore table a add unique(id);
Here a is the table name
This should do what you want:
SELECT ID, Value
FROM (SELECT ID, Value, ROW_NUMBER() OVER(PARTITION BY ID ORDER BY NEWID()) AS RN
FROM #Table) AS A
WHERE A.RN = 1
I tried the given answers with HSQLDB, but it refused to execute those queries for different reasons (a join is not allowed in a DELETE query, and IGNORE is not allowed in an ALTER statement). Thanks to Andrew I came up with this solution, which is a bit more roundabout but does allow deleting random rows:
Add a new column for random values:
ALTER TABLE <table> ADD COLUMN rand INT
Fill this column with random data:
UPDATE <table> SET rand = RAND() * 1000000
Delete all rows which don't have the minimum random value for their id:
DELETE FROM <table> WHERE rand NOT IN (SELECT MIN(rand) FROM <table> GROUP BY id)
Drop the random column:
ALTER TABLE <table> DROP rand
For larger tables you probably should ensure that the random values are unique, but this worked perfectly for me.
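One way to sidestep collisions between different ids is to compare each row only with the minimum of its own id. A sketch, untested on HSQLDB, assuming a table named my_table (a hypothetical name) that already has the rand column from the steps above:

```sql
-- keep, for every id, only the row holding that id's smallest rand value
DELETE FROM my_table
WHERE rand > (
    SELECT MIN(t2.rand)
    FROM my_table t2
    WHERE t2.id = my_table.id
);
```

Rows that tie for the minimum within one id would all be kept, so the random values still need to be unique per id.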

Choose rows based on two connected column values in one statement - ORACLE

First, I'm not sure the title describes the issue well; any better suggestion is welcome. My problem is that I have the following table:
+----+----------+-------+-----------------+
| ID | SUPPLIER | BUYER | VALIDATION_CODE |
+----+----------+-------+-----------------+
| 1 | A | Z | 937886521 |
| 2 | A | X | 937886521 |
| 3 | B | Z | 145410916 |
| 4 | C | V | 775709785 |
+----+----------+-------+-----------------+
I need to show suppliers A and B, which have buyers Z and X. However, I want this condition to be a one-to-one relationship rather than one-to-many. That is, for supplier A, I want to show the rows with ID: 1, 2. For supplier B, I want to show row 3 only. The following script shows supplier A with all possible buyers (which I do not want):
SELECT *
FROM validation
WHERE supplier IN ( 'A', 'B' )
AND buyer IN ( 'X', 'Z');
This will show the following pairs: (A,Z), (A,X), (B, Z). I need to show only the following: (A,X)(B,Z) in one statement.
The desired result should be like this:
+----+----------+-------+-----------------+
| ID | SUPPLIER | BUYER | VALIDATION_CODE |
+----+----------+-------+-----------------+
| 2 | A | X | 937886521 |
| 3 | B | Z | 145410916 |
+----+----------+-------+-----------------+
You can update the WHERE clause to filter on the desired pairs:
select *
from sample
where (upper(supplier),upper(buyer))
in (('A','X'),('A','Y'),('A','Z'),('B','X'),('B','Y'),('B','Z'));
I used the UPPER function based on your mixed case examples.
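If only specific supplier/buyer combinations should come back, the tuple list can be narrowed to exactly those pairs. A minimal sketch, assuming the table is named validation as in the question and that the stored values are already upper case:

```sql
SELECT *
FROM validation
WHERE (supplier, buyer) IN (('A', 'X'), ('B', 'Z'));
```

Against the sample data this should return the rows with ID 2 and 3, which matches the desired result.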
See if this is what you need:
SELECT MAX(id),
supplier,
MAX(buyer),
MAX(validation_code)
FROM
(SELECT *
FROM Validation
WHERE supplier IN ( 'A', 'B' ) AND buyer IN ( 'X', 'Z')
) filtered
GROUP BY supplier;
SQL Fiddle
I used GROUP BY supplier to flatten the table and included maximum values of ID, Buyer, and Validation_Code.
Alternatively, you could try this:
SELECT id
     , supplier
     , buyer
     , validation_code
FROM (SELECT id
           , max(id) OVER (PARTITION BY supplier) AS maxid
           , supplier
           , buyer
           , validation_code
      FROM sample) x -- Oracle does not accept AS before a table alias
WHERE x.id = x.maxid
You can look at the results of the inner SQL statement to see what it does.
Try this query:
select ID, SUPPLIER, BUYER, VALIDATION_CODE from
  (select
     t2.*, t1.counter
   from
     validation t2,
     (select supplier, count(supplier) as counter from validation group by supplier) t1
   where
     t1.supplier = t2.supplier) t3
where t3.supplier in ('A', 'B') and
  id = case when t3.counter > 1 then
    (select max(id) from validation t4 where t4.supplier = t3.supplier) else t3.id end;