quickly take all correlated rows from the table

quickly take all correlated rows from the table - sql

I have a large table with multiple columns representing linked events. This includes columns id and nextId, where id means id of some event1, and nextId suggest in which another event this event1 was used. However, there is no column 'prev_id' which would say which event0 contributed to event1. Is it possible to build a query which will generate for me such a table without taking a very long running time?
Here is an example of what I mean:
id | nextId
10 | 34
5 | 67
22 | 23
2 | 10
16 | 22
4 | 5
What I want to have is the following:
prev_id | id | next_id
2 | 10 | 34
4 | 5 | 67
16 | 22 | 23

You can use a join:
select t.id as prev_id, t.nextid as id, tnext.nextid as next_id
from t join
t tnext
on tnext.id = t.nextid;

You only need a self join.
But for the sake of readability, I would recommend using a CTE:
with prevs as (
select nextid as id, id as previd from ids
)
select previd, id, nextid
from ids
join prevs using(id)
;
previd | id | nextid
--------+----+--------
4 | 5 | 67
2 | 10 | 34
16 | 22 | 23
(3 rows)

In addition to what others have said, you can also accomplish this using hierarchical queries.
WITH test_data AS (
SELECT 10 AS ID,34 AS nextID FROM DUAL
UNION SELECT 5,67 FROM DUAL
UNION SELECT 22,23 FROM DUAL
UNION SELECT 2,10 FROM DUAL
UNION SELECT 16,22 FROM DUAL
UNION SELECT 4,5 FROM DUAL
)
SELECT h.*
FROM (
SELECT PRIOR t.ID AS prevID,
t.ID,
t.nextID
FROM test_data t
CONNECT BY t.ID = PRIOR t.nextID
) h
WHERE h.prevID IS NOT NULL
ORDER BY h.prevID

Related

Find top parent of child, multiple levels

ENTRY TABLE
__________________
| ID | PARENT_ID |
| 1 | null |
| 2 | 1 |
| 3 | 2 |
| 4 | null |
| 5 | 4 |
| 6 | 5 |
...
I make copies of the entries in some cases and they are conneted by parent ID.
Each entry can have one copy:
THIS WONT HAPPEN
__________________
| ID | PARENT_ID |
| 1 | null |
| 2 | 1 |
| 3 | 1 |
...
Sometimes I need to take a copy and query for it's top level parent. I need to find the top parent entries for all the entries I search for.
For example, if I query for the parents of ID 6 and 3, I would get ID 4 and 1.
If I query for the parents of ID 5 and 2, I would get ID 4 and 1.
But also If I query for ID 5 and 1, it should return ID 4 and 1 because the entry ID 1 is already the top parent itself.
I don't know where to begin since I don't know how to recursively query in such case.
Can anyone point me in the right direction?
I know that the query below will just return the child elemements (ID 6 and 3), but I don't know where to go from here honestly.
I am using OracleSQL by the way.
SELECT * FROM entry WHERE id IN (6, 3);

You can use a hierarchical query and CONNECT_BY_ROOT.
Either starting at the root of the hierarchy and working down:
SELECT id,
CONNECT_BY_ROOT(id) AS root_id
FROM entry
WHERE id IN (6, 3)
START WITH parent_id IS NULL
CONNECT BY PRIOR id = parent_id;
Or, from the entry back up to the root:
SELECT CONNECT_BY_ROOT(id) AS id,
id AS root_id
FROM entry
WHERE parent_id IS NULL
START WITH id IN (6, 3)
CONNECT BY PRIOR parent_id = id;
Which, for the sample data:
CREATE TABLE entry( id, parent_id ) AS
SELECT 1, NULL FROM DUAL UNION ALL
SELECT 2, 1 FROM DUAL UNION ALL
SELECT 3, 2 FROM DUAL UNION ALL
SELECT 4, NULL FROM DUAL UNION ALL
SELECT 5, 4 FROM DUAL UNION ALL
SELECT 6, 5 FROM DUAL UNION ALL
SELECT 7, 6 FROM DUAL
Both output:
ID
ROOT_ID
3
1
6
4
db<>fiddle here

You can use recursive CTE to walk the graph and find the initial parent. For example:
with
n (starting_id, current_id, parent_id, v) as (
select id, id, parent_id, 0 from entry where id in (6, 3)
union all
select n.starting_id, e.id, e.parent_id, n.v - 1
from n
join entry e on e.id = n.parent_id
)
select starting_id, current_id as initial_id
from (
select n.*, row_number() over(partition by starting_id order by v) as rn
from n
) x
where rn = 1
Result:
STARTING_ID INITIAL_ID
------------ ----------
3 1
6 4
See running example at db<>fiddle.

SQL Server 2012 - Find a steadily rising value of a column

I have a table like below:
ID | Name | Ratio | Miles
____________________________________
1 | ABC | 45 | 21
1 | ABC | 46 | 24
1 | ABC | 46 | 25
2 | PQR | 41 | 19
2 | PQR | 39 | 17
3 | XYZ | 27 | 13
3 | XYZ | 26 | 11
4 | DEF | 40 | 18
4 | DEF | 40 | 18
4 | DEF | 42 | 20
I want to write a query that will find an ID whose Miles value has been steadily rising.
For instance,
Miles values of Name 'ABC' and 'DEF' are steadily rising.
It's fine if the Miles value drops by up to 5% and rises again.
It should also include this Name.
I tried self join on this table but it gives me Cartesian product.
Can anyone help me with this?
I am using SQL server 2012.
Thanks in advance!

SQL tables represent unordered sets. Let me assume that you have a column that specifies the ordering. Then, you can use lag() and some logic:
select id, name
from (select t.*,
lag(miles) over (partition by id order by orderingcol) as prev_miles
from t
) t
group by id, name
having min(case when prev_miles is null or miles >= prev_miles * 0.95 then 1 else 0 end) = 1;
The having clause is simply determining if all the rows meet your specific condition.

try this:
Note: 5% case is not handled here
create table #tmp(ID INT,Name VARCHAR(50),Ratio INT,Miles INT)
INSERT INTO #tmp
SELECT 1,'ABC',45,21
union all
SELECT 1,'ABC',46,24
union all
SELECT 1,'ABC',46,25
union all
SELECT 2,'PQR',41,19
union all
SELECT 2,'PQR',39,17
union all
SELECT 3,'XYZ',27,13
union all
SELECT 3,'XYZ',26,11
union all
SELECT 4,'DEF',40,18
union all
SELECT 4,'DEF',40,18
union all
SELECT 4,'DEF',42,21
Select *,CASE WHEN Miles<=LEAD(Miles,1,Miles) OVER(partition by ID Order by ID) THEN 1
--NEED ADD 5%condition Here
ELSE 0 END AS nextMiles
into #tmp2
from #tmp
;with cte
AS(
select * , ROW_NUMBER() OVER (partition by ID,nextMiles order by ID) rn from #tmp2
)
SELECT DISTINCT ID,Name FROM cte WHERE rn>1
Drop table #tmp
Drop table #tmp2

How to count all the connected nodes (rows) in a graph on Postgres?

My table has account_id and device_id. One account_id could have multiple device_ids and vice versa. I am trying to count the depth of each connected many-to-many relationship.
Ex:
account_id | device_id
1 | 10
1 | 11
1 | 12
2 | 10
3 | 11
3 | 13
3 | 14
4 | 15
5 | 15
6 | 16
How do I construct a query that knows to combine accounts 1-3 together, 4-5 together, and leave 6 by itself? All 7 entries of accounts 1-3 should be grouped together because they all touched the same account_id or device_id at some point. I am trying to group them together and output the count.
Account 1 was used on device's 10, 11, 12. Those devices used other accounts too so we want to include them in the group. They used additional accounts 2 and 3. But account 3 was further used by 2 more devices so we will include them as well. The expansion of the group brings in any other account or device that also "touched" an account or device already in the group.
A diagram is shown below:

You can use a recursive cte:
with recursive t(account_id, device_id) as (
select 1, 10 union all
select 1, 11 union all
select 1, 12 union all
select 2, 10 union all
select 3, 11 union all
select 3, 13 union all
select 3, 14 union all
select 4, 15 union all
select 5, 15 union all
select 6, 16
),
a as (
select distinct t.account_id as a, t2.account_id as a2
from t join
t t2
on t2.device_id = t.device_id and t.account_id >= t2.account_id
),
cte as (
select a.a, a.a2 as mina
from a
union all
select a.a, cte.a
from cte join
a
on a.a2 = cte.a and a.a > cte.a
)
select grp, array_agg(a)
from (select a, min(mina) as grp
from cte
group by a
) a
group by grp;
Here is a SQL Fiddle.

You can GROUP BY the device_id and then aggregate together the account_id into a Postgres array. Here is an example query, although I'm not sure what your actual table name is.
SELECT
device_id,
array_agg(account_id) as account_ids
FROM account_device --or whatever your table name is
GROUP BY device_id;
Results - hope it's what you're looking for:
16 | {6}
15 | {4,5}
13 | {3}
11 | {1,3}
14 | {3}
12 | {1}
10 | {1,2}

-- \i tmp.sql
CREATE TABLE graph(
account_id integer NOT NULL --references accounts(id)
, device_id integer not NULL --references(devices(id)
,PRIMARY KEY(account_id, device_id)
);
INSERT INTO graph (account_id, device_id)VALUES
(1,10) ,(1,11) ,(1,12)
,(2,10)
,(3,11) ,(3,13) ,(3,14)
,(4,15)
,(5,15)
,(6,16)
;
-- SELECT* FROM graph ;
-- Find the (3) group leaders
WITH seed AS (
SELECT row_number() OVER () AS cluster_id -- give them a number
, g.account_id
, g.device_id
FROM graph g
WHERE NOT EXISTS (SELECT*
FROM graph nx
WHERE (nx.account_id = g.account_id OR nx.device_id = g.device_id)
AND nx.ctid < g.ctid
)
)
-- SELECT * FROM seed;
;
WITH recursive omg AS (
--use the above CTE in a sub-CTE
WITH seed AS (
SELECT row_number()OVER () AS cluster_id
, g.account_id
, g.device_id
, g.ctid AS wtf --we need an (ordered!) canonical id for the tuples
-- ,just to identify and exclude them
FROM graph g
WHERE NOT EXISTS (SELECT*
FROM graph nx
WHERE (nx.account_id = g.account_id OR nx.device_id = g.device_id) AND nx.ctid < g.ctid
)
)
SELECT s.cluster_id
, s.account_id
, s.device_id
, s.wtf
FROM seed s
UNION ALL
SELECT o.cluster_id
, g.account_id
, g.device_id
, g.ctid AS wtf
FROM omg o
JOIN graph g ON (g.account_id = o.account_id OR g.device_id = o.device_id)
-- AND (g.account_id > o.account_id OR g.device_id > o.device_id)
AND g.ctid > o.wtf
-- we still need to exclude duplicates here
-- (which could occur if there are cycles in the graph)
-- , this could be done using an array
)
SELECT *
FROM omg
ORDER BY cluster_id, account_id,device_id
;
Results:
DROP SCHEMA
CREATE SCHEMA
SET
CREATE TABLE
INSERT 0 10
cluster_id | account_id | device_id
------------+------------+-----------
1 | 1 | 10
2 | 4 | 15
3 | 6 | 16
(3 rows)
cluster_id | account_id | device_id | wtf
------------+------------+-----------+--------
1 | 1 | 10 | (0,1)
1 | 1 | 11 | (0,2)
1 | 1 | 12 | (0,3)
1 | 1 | 12 | (0,3)
1 | 2 | 10 | (0,4)
1 | 3 | 11 | (0,5)
1 | 3 | 13 | (0,6)
1 | 3 | 14 | (0,7)
1 | 3 | 14 | (0,7)
2 | 4 | 15 | (0,8)
2 | 5 | 15 | (0,9)
3 | 6 | 16 | (0,10)
(12 rows)
Newer version (I added an Id column to the table)
-- for convenience :set of all adjacent nodes.
CREATE TEMP VIEW pair AS
SELECT one.id AS one
, two.id AS two
FROM graph one
JOIN graph two ON (one.account_id = two.account_id OR one.device_id = two.device_id)
AND one.id <> two.id
;
WITH RECURSIVE flood AS (
SELECT g.id, g.id AS parent_id
, 0 AS lev
, ARRAY[g.id]AS arr
FROM graph g
UNION ALL
SELECT c.id, p.parent_id AS parent_id
, 1+p.lev AS lev
, p.arr || ARRAY[c.id] AS arr
FROM graph c
JOIN flood p ON EXISTS (
SELECT * FROM pair WHERE p.id = pair.one AND c.id = pair.two)
AND p.parent_id < c.id
AND NOT p.arr #> ARRAY[c.id] -- avoid cycles/loops
)
SELECT g.*, a.parent_id
, dense_rank() over (ORDER by a.parent_id)AS group_id
FROM graph g
JOIN (SELECT id, MIN(parent_id) AS parent_id
FROM flood
GROUP BY id
) a
ON g.id = a.id
ORDER BY a.parent_id, g.id
;
New results:
CREATE VIEW
id | account_id | device_id | parent_id | group_id
----+------------+-----------+-----------+----------
1 | 1 | 10 | 1 | 1
2 | 1 | 11 | 1 | 1
3 | 1 | 12 | 1 | 1
4 | 2 | 10 | 1 | 1
5 | 3 | 11 | 1 | 1
6 | 3 | 13 | 1 | 1
7 | 3 | 14 | 1 | 1
8 | 4 | 15 | 8 | 2
9 | 5 | 15 | 8 | 2
10 | 6 | 16 | 10 | 3
(10 rows)

Select statement that duplicates rows based on N value of column

I have a Power table that stores building circuit details. A circuit can be 1 phase or 3 phase but is always represented as 1 row in the circuit table.
I want to insert the details of the circuits into a join table which joins panels to circuits
My current circuit table has the following details
CircuitID | Voltage | Phase | PanelID | Cct |
1 | 120 | 1 | 1 | 1 |
2 | 208 | 3 | 1 | 3 |
3 | 208 | 2 | 1 | 8 |
Is it possible to create a select where by when it sees a 3 phase row it selects 3 rows (or 2 select 2 rows) and increments the Cct column by 1 each time or do I have to create a loop?
CircuitID | PanelID | Cct |
1 | 1 | 1 |
2 | 1 | 3 |
2 | 1 | 4 |
2 | 1 | 5 |
3 | 1 | 8 |
3 | 1 | 9 |

Here is one way to do it
First generate numbers using tally table(best possible way). Here is one excellent article about generating number without loops. Generate a set or sequence without loops
Then join the numbers table with yourtable where phase value of each record should be greater than sequence number in number's table
;WITH e1(n) AS
(
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), -- 10
e2(n) AS (SELECT 1 FROM e1 CROSS JOIN e1 AS b), -- 10*10
e3(n) AS (SELECT 1 FROM e1 CROSS JOIN e2), -- 10*100
numbers as ( SELECT n = ROW_NUMBER() OVER (ORDER BY n) FROM e3 )
SELECT CircuitID,
PanelID,
Cct = Cct + ( n - 1 )
FROM Yourtable a
JOIN numbers b
ON a.Phase >= b.n

You can do this with a one recursive cte.
WITH cte AS
(
SELECT [CircuitID], [Voltage], [Phase], [PanelID], [Cct], [Cct] AS [Ref]
FROM [Power]
UNION ALL
SELECT [CircuitID], [Voltage], [Phase], [PanelID], [Cct] + 1, [Ref]
FROM cte
WHERE [Cct] + 1 < [Phase] + [Ref]
)
SELECT [CircuitID], [PanelID], [Cct]
FROM cte
ORDER BY [CircuitID]

Simplest way,
Select y.* from (
Select 1 CircuitID,120 Voltage,1 Phase,1 PanelID, 1 Cct
union
Select 2,208,3,1,3
union
Select 3,208,2,1,8)y,
(Select 1 x
union
Select 2 x
union
Select 3 x)x
Where x.x <= y.Phase
Directly copy paste this and try, it will run 100%. After that, just replace my 'y' table with your real table.

SQL query update by grouping

I'm dealing with some legacy data in an Oracle table and have the following
--------------------------------------------
| RefNo | ID |
--------------------------------------------
| FOO/BAR/BAZ/AAAAAAAAAA | 1 |
| FOO/BAR/BAZ/BBBBBBBBBB | 1 |
| FOO/BAR/BAZ/CCCCCCCCCC | 1 |
| FOO/BAR/BAZ/DDDDDDDDDD | 1 |
--------------------------------------------
For each of the /FOO/BAR/BAZ/% records I want to make the ID a Unique incrementing number.
Is there a method to do this in SQL?
Thanks in advance
EDIT
Sorry for not being specific. I have several groups of records /FOO/BAR/BAZ/, /FOO/ZZZ/YYY/. The same transformation needs to occur for each of these other (example) groups. The recnum can't be used I want ID to start from 1, incrementing, for each group of records I have to change.
Sorry for making a mess of my first post. Output should be
--------------------------------------------
| RefNo | ID |
--------------------------------------------
| FOO/BAR/BAZ/AAAAAAAAAA | 1 |
| FOO/BAR/BAZ/BBBBBBBBBB | 2 |
| FOO/BAR/BAZ/CCCCCCCCCC | 3 |
| FOO/BAR/BAZ/DDDDDDDDDD | 4 |
| FOO/ZZZ/YYY/AAAAAAAAAA | 1 |
| FOO/ZZZ/YYY/BBBBBBBBBB | 2 |
--------------------------------------------

Let's try something like this(Oracle version 10g and higher):
SQL> with t1 as(
2 select 'FOO/BAR/BAZ/AAAAAAAAAA' as RefNo, 1 as ID from dual union all
3 select 'FOO/BAR/BAZ/BBBBBBBBBB', 1 from dual union all
4 select 'FOO/BAR/BAZ/CCCCCCCCCC', 1 from dual union all
5 select 'FOO/BAR/BAZ/DDDDDDDDDD', 1 from dual union all
6 select 'FOO/ZZZ/YYY/AAAAAAAAAA', 1 from dual union all
7 select 'FOO/ZZZ/YYY/BBBBBBBBBB', 1 from dual union all
8 select 'FOO/ZZZ/YYY/CCCCCCCCCC', 1 from dual union all
9 select 'FOO/ZZZ/YYY/DDDDDDDDDD', 1 from dual
10 )
11 select row_number() over(partition by ComPart order by DifPart) as id
12 , RefNo
13 From (select regexp_substr(RefNo, '[[:alpha:]]+$') as DifPart
14 , regexp_substr(RefNo, '([[:alpha:]]+/)+') as ComPart
15 , RefNo
16 , Id
17 from t1
18 ) q
19 ;
ID REFNO
---------- -----------------------
1 FOO/BAR/BAZ/AAAAAAAAAA
2 FOO/BAR/BAZ/BBBBBBBBBB
3 FOO/BAR/BAZ/CCCCCCCCCC
4 FOO/BAR/BAZ/DDDDDDDDDD
1 FOO/ZZZ/YYY/AAAAAAAAAA
2 FOO/ZZZ/YYY/BBBBBBBBBB
3 FOO/ZZZ/YYY/CCCCCCCCCC
4 FOO/ZZZ/YYY/DDDDDDDDDD
I think that actual updating the ID column wouldn't be a good idea. Every time you add new groups of data you would have to run the update statement again. The better way would be creating a view and you will see desired output every time you query it.

rownum can be used as an incrementing ID?
UPDATE legacy_table
SET id = ROWNUM;
This will assign unique values to all records in the table. This link contains documentation about Oracle Pseudocolumn.

You can run the following:
update <table_name> set id = rownum where descr like 'FOO/BAR/BAZ/%'

This is pretty rough and I'm not sure if your RefNo is a single value column or you just made it like that for simplicity.
select
sub.RefNo
row_number() over (order by sub.RefNo) + (select max(id) from TABLE),
from (
select FOO+'/'+BAR+'/'+BAZ+'/'+OTHER as RefNo
from TABLE
group by FOO+'/'+BAR+'/'+BAZ+'/'+OTHER
) sub

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

quickly take all correlated rows from the table - sql

You can use a join: select t.id as prev_id, t.nextid as id, tnext.nextid as next_id from t join t tnext on tnext.id = t.nextid;

You only need a self join. But for the sake of readability, I would recommend using a CTE: with prevs as ( select nextid as id, id as previd from ids ) select previd, id, nextid from ids join prevs using(id) ; previd | id | nextid --------+----+-------- 4 | 5 | 67 2 | 10 | 34 16 | 22 | 23 (3 rows)

Related

Find top parent of child, multiple levels

SQL Server 2012 - Find a steadily rising value of a column

How to count all the connected nodes (rows) in a graph on Postgres?

Select statement that duplicates rows based on N value of column

SQL query update by grouping

Categories

Resources