How to get first 2 children of a parent in SQL? - sql

I have a table that looks like the below
ID Name ParentID
333 UK NULL
124 Wales 333
126 Swansea 124
127 Llanrhidian 126
As you can see all of the parent and children are in the same table. I need to create a view from this which shows the ID and name for each bottom level Child, the ChildID and Name of the one above it and then the highest level parent of them. An output of the above for Wales would look like the following
ChildID1 Child1Name ChildID2 Child2Name ParentID ParentName
127 Llanrhidian 126 Swansea 333 UK
The number of ancestors above a child can vary. In the example above, ChildID 127 has 3 ancestors. There can sometimes be more, but we will always need to see the lowest 2 levels.
Does this make sense? Can someone help me with this?

You can use a recursive CTE together with conditional aggregation.
The first CTE starts at the bottom-level rows and walks up through the ancestors, numbering the levels from 1 at the leaf.
The second CTE uses the MAX window function to find the highest level number, which identifies the top-level parent.
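;WITH CTE AS (
    -- anchor: leaf rows, i.e. rows that no other row names as its parent
    SELECT t1.ID AS LeafID, t1.ID, t1.Name, t1.ParentID, 1 AS num
    FROM T t1 LEFT JOIN T t2 ON t1.ID = t2.ParentID
    WHERE t2.ID IS NULL
    UNION ALL
    -- recursive step: move up one level, remembering the leaf we started from
    SELECT t1.LeafID, t2.ID, t2.Name, t2.ParentID, t1.num + 1
    FROM CTE t1 JOIN T t2 ON t1.ParentID = t2.ID
), CTE2 AS (
    -- highest level per leaf, so several leaves do not get mixed together
    SELECT *, MAX(num) OVER (PARTITION BY LeafID) AS maxNum
    FROM CTE
)
SELECT MAX(CASE WHEN num = 1 THEN ID END) AS ChildID1,
       MAX(CASE WHEN num = 1 THEN Name END) AS Child1Name,
       MAX(CASE WHEN num = 2 THEN ID END) AS ChildID2,
       MAX(CASE WHEN num = 2 THEN Name END) AS Child2Name,
       MAX(CASE WHEN num = maxNum THEN ID END) AS ParentID,
       MAX(CASE WHEN num = maxNum THEN Name END) AS ParentName
FROM CTE2
GROUP BY LeafID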
sqlfiddle

In your SQL, use: SELECT TOP 2 * FROM ...
This will only fetch the first 2 rows it finds (TOP is SQL Server syntax). Without an ORDER BY, which 2 rows you get is not guaranteed.

Related

Filter rows and select in to another columns in SQL?

I have a table like below.
If(OBJECT_ID('tempdb..#temp') Is Not Null)
Begin
Drop Table #Temp
End
create table #Temp
(
Type int,
Code Varchar(50),
)
Insert Into #Temp
SELECT 1,'1'
UNION
SELECT 1,'2'
UNION
SELECT 1,'3'
UNION
SELECT 2,'4'
UNION
SELECT 2,'5'
UNION
SELECT 2,'6'
select * from #Temp
And would like to get the below result.
| Type_1 | Code_1 | Type_2 | Code_2 |
|--------|--------|--------|--------|
| 1 | 1 | 2 | 4 |
| 1 | 2 | 2 | 5 |
| 1 | 3 | 2 | 6 |
I have tried with union and inner join, but not getting desired result. Please help.
You can use full outer join and cte as follows:
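With cte as
(Select type, code,
Row_number() over (partition by type order by code) as rn
From your_table t)
Select t1.type, t1.code, t2.type, t2.code
-- filter each side before the full join; a type filter inside ON would
-- return the unmatched rows of the other type as extra NULL-padded rows
From (Select * From cte Where type = 1) t1
full join (Select * From cte Where type = 2) t2
On t1.rn = t2.rn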
Here is a query which will produce the output you expect:
WITH cte AS (
SELECT t.[Type], t.Code
, rn = ROW_NUMBER() OVER (PARTITION BY t.[Type] ORDER BY t.Code)
FROM #Temp t
)
SELECT Type_1 = t1.[Type], Code_1 = t1.Code
, Type_2 = t2.[Type], Code_2 = t2.Code
FROM cte t1
JOIN cte t2 ON t1.rn = t2.rn AND t2.[Type] = 2
AND t1.[Type] = 1
This query will filter out any Type_1 records which do not have a matching Type_2 record, and vice versa. This means that if there is an unequal number of Type_1 and Type_2 records, the extra records will be eliminated.
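To make the point about dropped rows concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for SQL Server (so no `#Temp` table or `[Type]` brackets). The table name `temp_codes` is made up, and the data deliberately has one Type 2 row fewer than in the question. It assumes a SQLite build with window functions (3.25 or later). The inner join loses the unmatched Type 1 row, while a LEFT JOIN from the Type 1 side keeps it:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE temp_codes (type INTEGER, code TEXT);  -- hypothetical stand-in for #Temp
    INSERT INTO temp_codes VALUES (1,'1'),(1,'2'),(1,'3'),(2,'4'),(2,'5');
""")

# Same numbering idea as in the answer: restart the counter for every type.
CTE = """
WITH cte AS (
    SELECT type, code,
           ROW_NUMBER() OVER (PARTITION BY type ORDER BY code) AS rn
    FROM temp_codes
)
"""

# Inner join: only row numbers present on both sides survive.
inner_rows = conn.execute(CTE + """
    SELECT t1.type, t1.code, t2.type, t2.code
    FROM cte t1
    JOIN cte t2 ON t1.rn = t2.rn AND t2.type = 2 AND t1.type = 1
    ORDER BY t1.rn
""").fetchall()

# Left join: every Type 1 row is kept; a missing partner shows up as NULLs.
left_rows = conn.execute(CTE + """
    SELECT t1.type, t1.code, t2.type, t2.code
    FROM cte t1
    LEFT JOIN cte t2 ON t1.rn = t2.rn AND t2.type = 2
    WHERE t1.type = 1
    ORDER BY t1.rn
""").fetchall()

print(inner_rows)
print(left_rows)
```

Note that in the LEFT JOIN version the `t1.type = 1` filter has to move to the WHERE clause; left in the ON clause it would no longer restrict the left side.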
Explanation:
Since there is no obvious way to join the two sets of data, because there is no shared key between them, we need to create one.
So we use this query:
SELECT t.[Type], t.Code
, rn = ROW_NUMBER() OVER (PARTITION BY t.[Type] ORDER BY t.Code)
FROM #Temp t
This assigns a ROW_NUMBER to every row. It restarts the numbering for every Type value, and it orders the numbering by the Code.
So it will produce:
| Type | Code | rn |
|------|------|----|
| 1 | 1 | 1 |
| 1 | 2 | 2 |
| 1 | 3 | 3 |
| 2 | 4 | 1 |
| 2 | 5 | 2 |
| 2 | 6 | 3 |
Now you can see that we have assigned a key to each row of Type 1's and Type 2's which we can use for the joining process.
In order for us to re-use this output, we can stick it in a CTE and perform a self join (not an actual type of join, it just means we want to join a table to itself).
That's what this query is doing:
SELECT *
FROM cte t1
JOIN cte t2 ON t1.rn = t2.rn AND t2.[Type] = 2
AND t1.[Type] = 1
It's saying, "give me a list of all Type 1 records, and then join all Type 2 records to that using the new ROW_NUMBER we've generated".
Note: All of this works based on the assumption that you always want to join the Type 1's and Type 2's based on the order of their Code.
You can also do this using aggregation:
select max(case when type = 1 then type end) as type_1,
max(case when type = 1 then code end) as code_1,
max(case when type = 2 then type end) as type_2,
max(case when type = 2 then code end) as code_2
from (select type, code,
row_number() over (partition by type order by code) as seqnum
from your_table t
) t
group by seqnum;
It would be interesting to know which is faster -- a join approach or aggregation.
Here is a db<>fiddle.

Finding Missing Numbers series when Data Is Grouped in sql server

I need to write a query that finds the missing numbers in a sequence, together with their count, when the data is "grouped". The data is in multiple groups, and each group is a consecutive sequence.
For example, I have number series like 1001-1050, 1245-1270, 4571-4590. All numbers of each series (1001, 1002, 1003, ... 1050) are stored in Table1, and some of those numbers are also stored in another table, Table2 (e.g. 1001, 1002, 1003, 1004, 1005).
I want to get output like this:
Utilized Numbers | Balance Numbers |
----------- -------------------------
1001 - 1005 = 5 | 1006 - 1050 = 45 |
1245 - 1251 = 7 | 1252 - 1270 = 19 |
4571 - 4573 = 3 | 4574 - 4590 = 17 |
Each number is stored in a single field, which exists in both tables.
You haven't really explained your data, but I'm guessing that "Utilized" means the numbers found in both Table1 and Table2, and "Balance" means the numbers found only in Table1.
You can get the result at least this way, it's a little bit messy, mostly because of formatting the results:
Edit: This is a new version that does not use lag.
select
min (case when C2 = 1 then MINID end), max (case when C2 = 1 then MAXID end), max(case when C2=1 then ROWS end),
min (case when C2 = 0 then MINID end), max (case when C2 = 0 then MAXID end), max(case when C2=0 then ROWS end)
from (
select min(ID) as MINID, max(ID) as MAXID, count(*) as ROWS, C2, row_number() over (partition by C2 order by min(ID)) as GRP3 from (
select *, ID - RN as GRP1, ID - RN2 as GRP2 from (
select
T1.ID, row_number() over (order by T1.ID) as RN,
case when T2.ID is NULL then 0 else 1 end as C2,
row_number() over (partition by case when T2.ID is NULL then 0 else 1 end order by T1.ID) as RN2,
T2.ID as ID2
from #Table1 T1
left outer join #Table2 T2 on T1.ID = T2.ID
) X
) Y
group by GRP1, GRP2, C2
) Z
group by GRP3
order by 1
The idea is to compute a row number ordered by Table1.ID and subtract it from Table1.ID: within a run of consecutive IDs the difference stays constant, and when it changes a new group starts. The same logic is applied a second time, partitioned by whether the row exists in Table2, to handle the switch between "Utilized" and "Balance".
From those groupings you can get the min and max value + number of rows. There's one additional grouping with min/max and case to format the result into 2 columns.
See the demo.
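The "ID minus row number" trick can be tried in isolation. The following is only a sketch, using Python's sqlite3 module (SQLite 3.25 or later assumed) rather than SQL Server, with a made-up single table `used` holding the "Utilized" numbers from the question. It shows just the grouping step, not the full two-column report:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE used (id INTEGER PRIMARY KEY)")  # hypothetical table name

# The utilized numbers from the question: three runs of consecutive ids.
ids = list(range(1001, 1006)) + list(range(1245, 1252)) + list(range(4571, 4574))
conn.executemany("INSERT INTO used VALUES (?)", [(i,) for i in ids])

# Within a run of consecutive ids, id - ROW_NUMBER() is constant,
# so grouping by that difference yields one row per run.
islands = conn.execute("""
    SELECT MIN(id) AS min_id, MAX(id) AS max_id, COUNT(*) AS n
    FROM (
        SELECT id, id - ROW_NUMBER() OVER (ORDER BY id) AS grp
        FROM used
    ) AS x
    GROUP BY grp
    ORDER BY min_id
""").fetchall()

for min_id, max_id, n in islands:
    print(f"{min_id} - {max_id} = {n}")
```

The answer's query applies the same step twice (GRP1 and GRP2) so that a run is also split where rows switch between being present in and absent from Table2.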

Consolidate, Combine, Merge Rows

Every search I do leads me to results for people seeking array_agg to combine multiple columns in a row into column. That's not what I am trying to figure out here, and maybe I am not using the right search terms (e.g., consolidate, combine, merge).
I am trying to combine rows by populating values in fields ... I am not sure the best way to describe this other than with an example:
Current:
--------------------------------
id num_1 num_2 num_3 num_4
--------------------------------
1 111 222 0 0
2 111 333 0 0
3 111 0 0 444
4 0 222 555 0
5 777 999 0 0
6 0 999 888 0
After Processing:
--------------------------------
id num_1 num_2 num_3 num_4
--------------------------------
1 111 222 555 444
2 111 333 555 444
3 111 333 555 444
4 111 222 555 444
5 777 999 888 0
6 777 999 888 0
After Deleting Duplicate Rows:
--------------------------------
id num_1 num_2 num_3 num_4
--------------------------------
1 111 222 555 444
2 111 333 555 444
3 777 999 888 0
This will likely be a 2 step process ... first fill in the blanks, and then find/delete the duplicate. I can do the second step, but having trouble figuring how to first populate the 0 values with values from another row where you might have two different values (id 1/2 for num_2 column) but only one value for num_1 (e.g., 111)
I can do it in PHP, but would like to figure out how to do it using only Postgres.
EDIT: My example table is a relations table. I have multiple datasets with similar information (e.g., username) but different registration ID numbers. So, I do an inner join on table 1 and table 2 (for example) where the username is the same. Then I take the registration IDs (which are different) from each table and insert that as a row into my relations table. In my example tables above, Row 1 has two different registration IDs from the two tables I joined … the values 111 (num_1) and 222 (num_2) are inserted into the table and zeros inserted for num_3 and num_4. Then I compare table 1 and table 4 and the values 111 (num_1) and 444 (num_4) get inserted into the relations table and zeros for num_2 and num_3. Since registration ID 111 is related to registration ID 222 and registration ID 111 is related to registration ID 444, then registration IDs 111, 222, and 444 are all related (meaning the username is the same for each of those registration IDs). Does that help to clarify?
EDIT 2: I corrected Tables 2 and 3. Hopefully now it makes sense. The username column is not unique. So, I have 4 tables like this:
Table 1:
bob - 111
mary - 777
Table 2:
bob - 222
bob - 333
mary - 999
Table 3:
bob - 555
mary - 888
Table 4:
bob - 444 -- mary does not exist in this table
So, in my relations table I should end up with 3 rows as given in example Table 3 above.
It seems like you started in the middle of a presumed solution, forgetting to present the initial problem. Based on your added information I suggest a completely different, much simpler solution. You have:
CREATE TABLE table1 (username text, registration_id int);
CREATE TABLE table2 (LIKE table1);
CREATE TABLE table3 (LIKE table1);
CREATE TABLE table4 (LIKE table1);
INSERT INTO table1 VALUES ('bob', 111), ('mary', 777);
INSERT INTO table2 VALUES ('bob', 222), ('bob', 333), ('mary', 999);
INSERT INTO table3 VALUES ('bob', 555), ('mary', 888);
INSERT INTO table4 VALUES ('bob', 444); -- no mary
Solution
What you really seem to need is FULL [OUTER] JOIN. Details in the manual on FROM and JOIN.
-- CREATE TABLE relations AS
SELECT username
, t1.registration_id AS reg1
, t2.registration_id AS reg2
, t3.registration_id AS reg3
, t4.registration_id AS reg4
FROM table1 t1
FULL JOIN table2 t2 USING (username)
FULL JOIN table3 t3 USING (username)
FULL JOIN table4 t4 USING (username)
ORDER BY username;
That's all. Produces your desired result directly.
username reg1 reg2 reg3 reg4
---------------------------------
bob 111 222 555 444
bob 111 333 555 444
mary 777 999 888 (null)
Your given example would work with LEFT JOIN as well, since all missing entries are to the right. But that would fail in other constellations. I added some more revealing test cases in the fiddle:
SQL Fiddle.
I assume you are aware that multiple entries in multiple tables will produce a huge number of output rows:
Two SQL LEFT JOINS produce incorrect result
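A small sketch of that row multiplication, using Python's sqlite3 module instead of Postgres and plain inner joins instead of FULL JOIN (older SQLite versions lack FULL JOIN, and the effect is the same for matching rows). The value 556 is a made-up second entry for bob in table3, added only to show the effect:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (username TEXT, registration_id INTEGER);
    CREATE TABLE table2 (username TEXT, registration_id INTEGER);
    CREATE TABLE table3 (username TEXT, registration_id INTEGER);
    INSERT INTO table1 VALUES ('bob', 111);
    INSERT INTO table2 VALUES ('bob', 222), ('bob', 333);
    INSERT INTO table3 VALUES ('bob', 555), ('bob', 556);  -- 556 is hypothetical
""")

# One row in table1, two in table2, two in table3 for the same username:
# the joins return every combination, i.e. 1 * 2 * 2 = 4 rows.
rows = conn.execute("""
    SELECT t1.username,
           t1.registration_id AS reg1,
           t2.registration_id AS reg2,
           t3.registration_id AS reg3
    FROM table1 t1
    JOIN table2 t2 ON t2.username = t1.username
    JOIN table3 t3 ON t3.username = t1.username
    ORDER BY reg2, reg3
""").fetchall()

for row in rows:
    print(row)
```

With more duplicates per table, the row count grows as the product of the per-table counts for each username.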
If your values are always increasing (as in the example), then just use cumulative maximum and then select distinct:
select row_number() over (order by min(id)) as id,
t.num1, t.num2, t.num3, t.num4
from (select id,
max(num1) over (order by id) as num1,
max(num2) over (order by id) as num2,
max(num3) over (order by id) as num3,
max(num4) over (order by id) as num4
from t
) t
group by t.num1, t.num2, t.num3, t.num4;
If max() doesn't work, then what you really want is lag( . . . ignore nulls). That is not yet available. Perhaps the simplest method is then correlated subqueries for each column:
select row_number() over (order by min(id)) as id,
t.num1, t.num2, t.num3, t.num4
from (select id,
(select t2.num1 from t t2 where t2.id <= t.id and t2.num1 <> 0 order by t2.id desc limit 1
) as num1,
(select t2.num2 from t t2 where t2.id <= t.id and t2.num2 <> 0 order by t2.id desc limit 1
) as num2,
(select t2.num3 from t t2 where t2.id <= t.id and t2.num3 <> 0 order by t2.id desc limit 1
) as num3,
(select t2.num4 from t t2 where t2.id <= t.id and t2.num4 <> 0 order by t2.id desc limit 1
) as num4
from t
) t
group by t.num1, t.num2, t.num3, t.num4;
This version would not be very efficient on even medium sized tables.
A more efficient version is more complicated:
select row_number() over (order by min(t.id)) as id,
       t1.num1, t2.num2, t3.num3, t4.num4
from (select min(id) as id, num1_id, num2_id, num3_id, num4_id
      -- for each row, the id of the latest row so far with a non-zero value per column
      from (select id,
                   max(case when num1 > 0 then id end) over (order by id) as num1_id,
                   max(case when num2 > 0 then id end) over (order by id) as num2_id,
                   max(case when num3 > 0 then id end) over (order by id) as num3_id,
                   max(case when num4 > 0 then id end) over (order by id) as num4_id
            from t
           ) t
      group by num1_id, num2_id, num3_id, num4_id
     ) t left join
     t t1
     on t1.id = t.num1_id left join
     t t2
     on t2.id = t.num2_id left join
     t t3
     on t3.id = t.num3_id left join
     t t4
     on t4.id = t.num4_id
group by t1.num1, t2.num2, t3.num3, t4.num4;
EDIT:
That was a little silly. There is an easier way using first_value() (which Postgres unfortunately does not support as an aggregation function):
select row_number() over (order by min(id)) as id,
num1, num2, num3, num4
from (select id,
first_value(num1) over (order by (case when num1 is not null then id end) nulls last
) as num1,
first_value(num2) over (order by (case when num2 is not null then id end) nulls last
) as num2,
first_value(num3) over (order by (case when num3 is not null then id end) nulls last
) as num3,
first_value(num4) over (order by (case when num4 is not null then id end) nulls last
) as num4
from t
) t
group by num1, num2, num3, num4;

Pivot a Hierarchy table with no aggregate

I have a table [Departments] with 2 columns:
[IdDepartment]
[IdSubDepartment]
The table is a kind of hierarchy:
IdDepartment | IdSubDepartment
1 | 2
1 | 3
2 | 4
3 | 5
If I search for department 5 I want to get the following 5 -> 3 -> 1
(I only need the X level every time - not always the root).
I have written a query that takes a department ID and returns its 3rd-level ancestor (say I enter ID 5 and get back 1). It is fast for a single ID. The problem is that when I run it for 7K departments, it gets stuck.
I want to convert the table to a pivot like this:
IdDepartment0 | IdDepartment1 | IdDepartment2 ...
1 2 4
1 3 5
Important: I know the level of each department.
So when I get department 5, I know it is on level 2 (IdDepartment2),
and I can query my new table in no time and get any level I want for that department.
How do I convert to the new table?
thanks in advance
Eran
This snippet can be expanded to include deeper nesting.
It can probably be optimized further.
;WITH cteLvl AS
(
SELECT IdDepartment, IdSubDepartment, 0 AS Lvl
FROM Department
WHERE IdDepartment NOT IN (SELECT IdSubDepartment FROM Department WHERE IdSubDepartment IS NOT NULL)
UNION ALL
SELECT B.IdDepartment, B.IdSubDepartment, A.Lvl + 1
FROM cteLvl A
INNER JOIN Department B ON B.IdDepartment = A.IdSubDepartment
)
, cteLeaf AS
(
SELECT *, ROW_NUMBER() OVER(ORDER BY IdDepartment) AS GroupId
FROM Department
WHERE IdSubDepartment IS NULL
UNION ALL
SELECT B.IdDepartment, B.IdSubDepartment, A.GroupId
FROM cteLeaf A
INNER JOIN Department B ON A.IdDepartment = B.IdSubDepartment
)
, cteCombined AS
(
SELECT A.IdDepartment, A.GroupId, B.Lvl FROM cteLeaf A
INNER JOIN (SELECT DISTINCT IdDepartment, Lvl FROM cteLvl) B ON A.IdDepartment = B.IdDepartment
)
--SELECT * FROM cteCombined
SELECT GroupId, [0] AS Dep0, [1] AS Dep1, [2] AS Dep2, [3] AS Dep3, [4] AS Dep4
FROM
(SELECT GroupId, Lvl, IdDepartment
FROM cteCombined) P
PIVOT
(
SUM(IdDepartment)
FOR Lvl IN
( [0], [1], [2], [3], [4] )
) AS V
Same effect without using the PIVOT construct:
SELECT
GroupId,
MAX(CASE Lvl WHEN 0 THEN IdDepartment END) AS Dep0,
MAX(CASE Lvl WHEN 1 THEN IdDepartment END) AS Dep1,
MAX(CASE Lvl WHEN 2 THEN IdDepartment END) AS Dep2,
MAX(CASE Lvl WHEN 3 THEN IdDepartment END) AS Dep3
FROM cteCombined
GROUP BY GroupId
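Note that cteLeaf above starts from rows whose IdSubDepartment is NULL, which the sample data in the question does not contain. As an alternative sketch that works directly on the parent/child pairs from the question, a single recursive CTE can carry the path columns along as it descends. This is written against SQLite through Python's sqlite3 module, not SQL Server; the table name `departments`, the column names `d0`..`d2` and the fixed depth of three levels are assumptions for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (IdDepartment INTEGER, IdSubDepartment INTEGER);
    INSERT INTO departments VALUES (1, 2), (1, 3), (2, 4), (3, 5);
""")

# Start at the roots (departments that are nobody's sub-department) and
# descend one level per recursion step, filling the next path column.
paths = conn.execute("""
    WITH RECURSIVE paths(d0, d1, d2, lvl, tail) AS (
        SELECT DISTINCT IdDepartment, NULL, NULL, 0, IdDepartment
        FROM departments
        WHERE IdDepartment NOT IN (SELECT IdSubDepartment FROM departments)
        UNION ALL
        SELECT p.d0,
               CASE WHEN p.lvl = 0 THEN d.IdSubDepartment ELSE p.d1 END,
               CASE WHEN p.lvl = 1 THEN d.IdSubDepartment ELSE p.d2 END,
               p.lvl + 1,
               d.IdSubDepartment
        FROM paths p
        JOIN departments d ON d.IdDepartment = p.tail
        WHERE p.lvl < 2            -- assumed maximum depth of this sketch
    )
    -- keep only complete paths, i.e. those ending in a department without children
    SELECT d0, d1, d2
    FROM paths
    WHERE tail NOT IN (SELECT IdDepartment FROM departments)
    ORDER BY d0, d1, d2
""").fetchall()

for row in paths:
    print(row)
```

Deeper hierarchies would need one more column and one more CASE expression per level, much like the extra `[n]` entries in the PIVOT list.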

How can I get the first result for each account in this SQL query?

I'm trying to write a query that follows this logic:
Find the first following status code of an account that had a previous status code of X.
So if I have a table of:
id account_num status_code
64 1 X
82 1 Y
72 2 Y
87 1 Z
91 2 X
103 2 Z
The results would be:
id account_num status_code
82 1 Y
103 2 Z
I've come up with a couple of solutions, but I'm not all that great with SQL, so they've been pretty inelegant thus far. I was hoping that someone here might be able to point me in the right direction.
View:
SELECT account_number, id
FROM table
WHERE status_code = 'X'
Query:
SELECT account_number, min(id)
FROM table
INNER JOIN view
ON table.account_number = view.account_number
WHERE table.id > view.id
At this point I have the id that I need, but I'd have to write ANOTHER query that uses the id to look up the status_code.
Edit: To add some context, I'm trying to find calls that have a status_code of X. If a call has a status_code of X, we want to dial it a different way the next time we make an attempt. The aim of this query is to provide a report that shows the results of the second dial if the first dial resulted in an X status code.
Here's a SQL Server solution.
UPDATE
The idea is to avoid the NESTED LOOP joins of the solution proposed by Olaf, because they have roughly O(N * M) complexity and are therefore very bad for performance. A MERGE JOIN has O(N log(N) + M log(M)) complexity, which is much better in real-world scenarios.
The query below works as follows:
RankedCTE is a subquery that assigns a row number to each id, partitioned by account and sorted by id descending (the id stands in for time). So for the data below, the output of this
SELECT
id,
account_num,
status_code,
ROW_NUMBER() OVER (PARTITION BY account_num ORDER BY id DESC) AS item_rank
FROM dbo.Test
would be:
id account_num status_code item_rank
----------- ----------- ----------- ----------
87 1 Z 1
82 1 Y 2
64 1 X 3
103 2 Z 1
91 2 X 2
72 2 Y 3
Once we have them numbered we join the result on itself like this:
WITH RankedCTE AS
(
SELECT
id,
account_num,
status_code,
ROW_NUMBER() OVER (PARTITION BY account_num ORDER BY id DESC) AS item_rank
FROM dbo.Test
)
SELECT
*
FROM
RankedCTE A
INNER JOIN RankedCTE B ON
A.account_num = B.account_num
AND A.item_rank = B.item_rank - 1
which will give us an event and its preceding event in the same row
id account_num status_code item_rank id account_num status_code item_rank
----------- ----------- ----------- ----------- ----------- ----------- ----------- -----------
87 1 Z 1 82 1 Y 2
82 1 Y 2 64 1 X 3
103 2 Z 1 91 2 X 2
91 2 X 2 72 2 Y 3
Finally, we just have to keep the pairs where the preceding event has code "X" and the event itself does not:
WITH RankedCTE AS
(
SELECT
id,
account_num,
status_code,
ROW_NUMBER() OVER (PARTITION BY account_num ORDER BY id DESC) AS item_rank
FROM dbo.Test
)
SELECT
A.id,
A.account_num,
A.status_code
FROM
RankedCTE A
INNER JOIN RankedCTE B ON
A.account_num = B.account_num
AND A.item_rank = B.item_rank - 1
AND A.status_code <> 'X'
AND B.status_code = 'X'
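On SQL Server 2012 or later, the same "event plus preceding event" pairing can also be expressed with LAG instead of a self join. The following is a sketch of that alternative, not part of the original answer, run here against SQLite (3.25 or later assumed) through Python's sqlite3 module with the data from the setup script; the lowercase table name `test` and the CTE name `with_prev` are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test (
        id INTEGER PRIMARY KEY,
        account_num INTEGER NOT NULL,
        status_code TEXT
    );
    INSERT INTO test VALUES
        (64, 1, 'X'), (82, 1, 'Y'), (72, 2, 'Y'),
        (87, 1, 'Z'), (91, 2, 'X'), (103, 2, 'Z');
""")

# LAG looks at the previous row of the same account (ordered by id),
# so each row carries the status code of the attempt before it.
rows = conn.execute("""
    WITH with_prev AS (
        SELECT id, account_num, status_code,
               LAG(status_code) OVER (PARTITION BY account_num ORDER BY id) AS prev_code
        FROM test
    )
    SELECT id, account_num, status_code
    FROM with_prev
    WHERE prev_code = 'X' AND status_code <> 'X'
    ORDER BY account_num, id
""").fetchall()

for row in rows:
    print(row)
```

Like the self join, this returns every non-X event that directly follows an X, so an account with several X codes can appear more than once.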
Query plans for this query and @Olaf Dietsche's solution (one of the versions) are below.
Data setup script
CREATE TABLE dbo.Test
(
id int not null PRIMARY KEY,
account_num int not null,
status_code nchar(1)
)
GO
INSERT dbo.Test (id, account_num, status_code)
SELECT 64 , 1, 'X' UNION ALL
SELECT 82 , 1, 'Y' UNION ALL
SELECT 72 , 2, 'Y' UNION ALL
SELECT 87 , 1, 'Z' UNION ALL
SELECT 91 , 2, 'X' UNION ALL
SELECT 103, 2, 'Z'
SQL Fiddle with subselect
select id, account_num, status_code
from mytable
where id in (select min(t1.id)
from mytable t1
join mytable t2 on t1.account_num = t2.account_num
and t1.id > t2.id
and t2.status_code = 'X'
group by t1.account_num)
and SQL Fiddle with join, both for MS SQL Server 2012, both returning the same result.
select id, account_num, status_code
from mytable
join (select min(t1.id) as min_id
from mytable t1
join mytable t2 on t1.account_num = t2.account_num
and t1.id > t2.id
and t2.status_code = 'X'
group by t1.account_num) t on id = min_id
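SELECT A.ID, A.ACCOUNT_NUM, A.STATUS_CODE
FROM ACCOUNT A
JOIN (
    -- smallest ID per account that comes after an 'X' row;
    -- joining back on it returns the STATUS_CODE of exactly that row
    SELECT MIN(A1.ID) AS MIN_ID
    FROM ACCOUNT A1
    WHERE EXISTS
        (SELECT 1
         FROM ACCOUNT A2
         WHERE A1.ACCOUNT_NUM = A2.ACCOUNT_NUM
           AND A2.STATUS_CODE = 'X'
           AND A2.ID < A1.ID)
    GROUP BY A1.ACCOUNT_NUM
) SUB ON A.ID = SUB.MIN_ID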
Here's an SQLFIDDLE
Here's a query, with your data, checked under PostgreSQL:
SELECT t0.*
FROM so13594339 t0 JOIN
(SELECT min(t1.id), t1.account_num
FROM so13594339 t1, so13594339 t2
WHERE t1.account_num = t2.account_num AND t1.id > t2.id AND t2.status_code = 'X'
GROUP BY t1.account_num
) z
ON t0.id = z.min AND t0.account_num = z.account_num;