PostgreSQL: copy data within a tree table - sql

I have a table with a tree structure; its columns are id, category and parent_id.
I need to copy a node and its children under another node. The copies must keep the same category but get new id and parent_id values.
My input will be the node to copy and the destination node to copy it under.
I have illustrated the tree structure in the image file.
I need a function to do this.
PostgreSQL version 9.1.2
Column | Type | Modifiers
-----------+---------+-------------------------------------------------
id | integer | not null default nextval('t1_id_seq'::regclass)
category | text |
parent_id | integer |
Indexes:
"t1_pkey" PRIMARY KEY, btree (id)
Foreign-key constraints:
"fk_t1_1" FOREIGN KEY (parent_id) REFERENCES t1(id)
Referenced by:
TABLE "t1" CONSTRAINT "fk_t1_1" FOREIGN KEY (parent_id) REFERENCES t1(id)

(tested under PostgreSQL 8.4.3)
The following query assigns new IDs to the sub-tree under node 4 (see the nextval) and then finds the corresponding new IDs of parents (see the LEFT JOIN).
WITH RECURSIVE CTE AS (
SELECT *, nextval('t1_id_seq') new_id FROM t1 WHERE id = 4
UNION ALL
SELECT t1.*, nextval('t1_id_seq') new_id FROM CTE JOIN t1 ON CTE.id = t1.parent_id
)
SELECT C1.new_id, C1.category, C2.new_id new_parent_id
FROM CTE C1 LEFT JOIN CTE C2 ON C1.parent_id = C2.id
Result (on your test data):
new_id category new_parent_id
------ -------- -------------
9 C4
10 C5 9
11 C6 9
12 C7 10
Once you have that, it's easy to insert it back to the table, you just have to be careful to reconnect the sub-tree root with the appropriate parent (8 in this case, see the COALESCE(new_parent_id, 8)):
INSERT INTO t1
SELECT new_id, category, COALESCE(new_parent_id, 8) FROM (
WITH RECURSIVE CTE AS (
SELECT *, nextval('t1_id_seq') new_id FROM t1 WHERE id = 4
UNION ALL
SELECT t1.*, nextval('t1_id_seq') new_id FROM CTE JOIN t1 ON CTE.id = t1.parent_id
)
SELECT C1.new_id, C1.category, C2.new_id new_parent_id
FROM CTE C1 LEFT JOIN CTE C2 ON C1.parent_id = C2.id
) Q1
After that, the table contains the following data:
id category parent_id
------ -------- -------------
1 C1
2 C2 1
3 C3 1
4 C4 2
5 C5 4
6 C6 4
7 C7 5
8 C8 3
9 C4 8
10 C5 9
11 C6 9
12 C7 10
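The question asks for a reusable function, which the query above does not provide on its own. The following is a minimal sketch of one, written in Python with the standard-library sqlite3 module standing in for PostgreSQL so that it can be run anywhere. The function name copy_subtree and the in-memory test table are assumptions made for illustration; SQLite has no nextval, so the new IDs come from the auto-assigned row ID instead of the sequence.

```python
import sqlite3

def copy_subtree(conn, source_id, dest_id):
    """Copy node source_id and all its descendants under dest_id.

    Hypothetical helper: returns a dict mapping each old id to its new id.
    """
    rows = conn.execute(
        """
        WITH RECURSIVE sub(id, category, parent_id, depth) AS (
            SELECT id, category, parent_id, 0 FROM t1 WHERE id = ?
            UNION ALL
            SELECT t1.id, t1.category, t1.parent_id, sub.depth + 1
            FROM sub JOIN t1 ON t1.parent_id = sub.id
        )
        SELECT id, category, parent_id FROM sub ORDER BY depth, id
        """,
        (source_id,),
    ).fetchall()

    new_ids = {}
    for old_id, category, old_parent in rows:
        # Rows arrive parents-first, so a copied parent is always known here.
        # Only the sub-tree root has a parent outside the copy; it is
        # reconnected to the destination node (the COALESCE step above).
        new_parent = new_ids.get(old_parent, dest_id)
        cur = conn.execute(
            "INSERT INTO t1 (category, parent_id) VALUES (?, ?)",
            (category, new_parent),
        )
        new_ids[old_id] = cur.lastrowid
    return new_ids

# Test data from the question: eight nodes, C1..C8.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t1 (id INTEGER PRIMARY KEY, category TEXT, "
    "parent_id INTEGER REFERENCES t1(id))"
)
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?, ?)",
    [(1, "C1", None), (2, "C2", 1), (3, "C3", 1), (4, "C4", 2),
     (5, "C5", 4), (6, "C6", 4), (7, "C7", 5), (8, "C8", 3)],
)
mapping = copy_subtree(conn, 4, 8)
copied = conn.execute(
    "SELECT id, category, parent_id FROM t1 WHERE id > 8 ORDER BY id"
).fetchall()
```

Because the sub-tree is read completely before any row is inserted, copying a node underneath one of its own descendants does not cause the copy to recurse into itself.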

Related

Self join queries in hive

I need a Hive query to fetch the hierarchy in which my product was sold.
Considering the records below, the end customers are 1 and 6, since their SoldTo column value is NULL.
CustomerID SoldTo
--------------------
1 NULL
2 1
3 2
4 3
5 4
6 NULL
7 1
8 6
My output should look like :
c1 c2 c3 c4 c5
-------------------
5 4 3 2 1 (c1 (5) - first customer who bought product and c5(1) -last customer)
8 6 (c1 (8) - first customer , c2 (6)- Last customer)
7 1
Hive has no real support for recursive CTEs or hierarchical data structures. You can do this using multiple joins -- but the depth of the hierarchy is fixed.
-- start from the end customers (SoldTo is NULL) and walk toward earlier buyers
select t1.CustomerId as c1, t2.CustomerId as c2, t3.CustomerId as c3,
t4.CustomerId as c4, t5.CustomerId as c5
from t t1 left join
t t2
on t2.SoldTo = t1.CustomerId left join
t t3
on t3.SoldTo = t2.CustomerId left join
t t4
on t4.SoldTo = t3.CustomerId left join
t t5
on t5.SoldTo = t4.CustomerId
where t1.SoldTo is null;
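When the depth of the hierarchy is not known in advance, one option is to follow the SoldTo links in client code after pulling the two columns out of Hive. The sketch below is in plain Python; the function name sale_chains and the dictionary representation are assumptions for illustration, and it presumes the data contains no cycles.

```python
def sale_chains(sold_to):
    """Build one chain per first buyer, ending at the end customer.

    sold_to maps CustomerID -> SoldTo (None for an end customer).
    """
    # Customers that appear in some SoldTo value are not starting points.
    referenced = {target for target in sold_to.values() if target is not None}
    chains = []
    for customer in sorted(sold_to):
        if customer in referenced:
            continue
        chain = [customer]
        # Follow SoldTo until the end customer (SoldTo is None) is reached.
        while sold_to[chain[-1]] is not None:
            chain.append(sold_to[chain[-1]])
        chains.append(chain)
    return chains

# The rows from the question.
sold_to = {1: None, 2: 1, 3: 2, 4: 3, 5: 4, 6: None, 7: 1, 8: 6}
chains = sale_chains(sold_to)
```

Each inner list corresponds to one output row of the question (c1, c2, ...), with as many entries as the chain is long.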

Find next row with specific value in a given row

The table I have now looks something like this. Each row has a time value (on which the table is sorted in ascending order), and two values which can be replicated across rows:
Key TimeCall R_ID S_ID
-------------------------------------------
1 100 40 A
2 101 50 B
3 102 40 C
4 103 50 D
5 104 60 A
6 105 40 B
I would like to return something like this: for each row, a JOIN is applied so that the S_ID and TimeCall of the next row sharing that row's R_ID are displayed (or NULL if that row is the last instance of a given R_ID). Example:
Key TimeCall R_ID S_ID NextTimeCall NextS_ID
----------------------------------------------------------------------
1 100 40 A 102 C
2 101 50 B 103 D
3 102 40 C 105 B
4 103 50 D NULL NULL
5 104 60 A NULL NULL
6 105 40 B NULL NULL
Any advice on how to do this would be much appreciated. Right now I'm joining the table on itself and staggering the key on which I'm joining, but I know this won't work for the instance that I've outlined above:
SELECT TOP 10 Table.*, Table2.TimeCall AS NextTimeCall, Table2.S_ID AS NextS_ID
FROM tempdb..#Table AS Table
INNER JOIN tempdb..#Table AS Table2
ON Table.TimeCall + 1 = Table2.TimeCall
So if anyone could show me how to do this such that it can call rows that aren't just consecutive, much obliged!
Use LEAD() function:
SELECT *
, LEAD(TimeCall) OVER (PARTITION BY R_ID ORDER BY [Key]) AS NextTimeCall
, LEAD(S_ID) OVER (PARTITION BY R_ID ORDER BY [Key]) AS NextS_ID
FROM Table2
ORDER BY [Key]
SQLFiddle DEMO
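To check the LEAD() approach against the sample rows from the question, here is a small self-contained sketch using Python's standard-library sqlite3 module instead of SQL Server. The table name calls is an assumption, the Key column is quoted with double quotes rather than brackets, and window functions require SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE calls ("Key" INTEGER, TimeCall INTEGER, R_ID INTEGER, S_ID TEXT)'
)
conn.executemany(
    "INSERT INTO calls VALUES (?, ?, ?, ?)",
    [(1, 100, 40, "A"), (2, 101, 50, "B"), (3, 102, 40, "C"),
     (4, 103, 50, "D"), (5, 104, 60, "A"), (6, 105, 40, "B")],
)

# LEAD looks one row ahead within each R_ID partition, ordered by Key,
# and yields NULL (None in Python) when there is no later row.
next_rows = conn.execute(
    """
    SELECT "Key", TimeCall, R_ID, S_ID,
           LEAD(TimeCall) OVER (PARTITION BY R_ID ORDER BY "Key") AS NextTimeCall,
           LEAD(S_ID)     OVER (PARTITION BY R_ID ORDER BY "Key") AS NextS_ID
    FROM calls
    ORDER BY "Key"
    """
).fetchall()
```

The rows in next_rows line up with the desired output table in the question, with None where the question shows NULL.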
This is only a test example I had close at hand, but I think it could help you; just adapt it to your case. It uses LAG and LEAD, and it is for SQL Server.
if object_id('tempdb..#Test') IS NOT NULL drop table #Test
create table #Test (id int, value int)
insert into #Test (id, value)
values
(1, 1),
(1, 2),
(1, 3)
select id,
value,
-- order by value: every row has the same id, so ordering by id alone is not deterministic
lag(value, 1, 0) over (order by value) as [PreviousValue],
lead(value, 1, 0) over (order by value) as [NextValue]
from #Test
Results are
id value PreviousValue NextValue
1 1 0 2
1 2 1 3
1 3 2 0
Use an OUTER APPLY to select the top 1 row that has the same R_ID as the outer row and a higher Key value.
Just change TableName to the actual name of your table in both parts of the query.
SELECT a.*, b.TimeCall AS NextTimeCall, b.S_ID AS NextS_ID
FROM TableName AS a
OUTER APPLY
(
-- the next row for the same R_ID: smallest Key greater than the outer row's Key
SELECT TOP 1 n.TimeCall, n.S_ID
FROM TableName AS n
WHERE n.R_ID = a.R_ID
AND n.[Key] > a.[Key]
ORDER BY n.[Key] ASC
) AS b
Hope this helps! :)
For older versions, here is one trick using Outer Apply
SELECT a.*,
nexttimecall,
nexts_id
FROM table1 a
OUTER apply (SELECT TOP 1 timecall,s_id
FROM table1 b
WHERE a.r_id = b.r_id
AND a.[key] < b.[key]
ORDER BY [key] ASC) oa (nexttimecall, nexts_id)
LIVE DEMO
Note: it is better to avoid reserved keywords (such as Key) as column or table names.

Sql delete parent from child/delete whole tree

Delete parent and child in loop
Table 1 (Parent table)
Id int
Table 2 (Relationship table)
Id1 int FOREIGN KEY (Id1) REFERENCES Table1 (Id)
Id2 int FOREIGN KEY (Id2) REFERENCES Table1 (Id)
Id - Id1 one to one or one to zero relationship
Id - Id2 one to many
Data in table 1
Id
1
2
3
4
5
6
7
8
9
10
Data in table 2
Id1 Id2
2 1
3 1
4 2
5 2
6 4
7 4
8 5
9 5
So it is like a tree with 1 as the root:
1 has two children, 2 and 3
2 has two children, 4 and 5
4 has two children, 6 and 7
5 has two children, 8 and 9
3, 6, 7, 8, 9 and 10 have no children
What is the best way to achieve the following?
Deleting 1 => deletes everything in table2 and table1 (except 10 in table1)
Try
-- remove the relationship rows first so that no foreign key still references table1
delete from table2;
delete from table1 where id <> 10;
You can do this using Recursive CTE
;WITH cte
AS (SELECT Id1,
Id2,
id2 AS parent
FROM Yourtable
UNION ALL
SELECT a.Id1,
a.Id2,
b.Id2
FROM cte a
JOIN Yourtable b
ON a.parent = b.id1)
SELECT *
FROM cte
WHERE parent = 1
OPTION (maxrecursion 0)
--DELETE FROM Yourtable
--WHERE id1 IN (SELECT id1
-- FROM cte
-- WHERE parent = 1)
--OPTION (maxrecursion 0)
If the SELECT returns the expected rows, comment out the SELECT and uncomment the DELETE.
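The recursive approach can also be sketched end to end. The example below uses Python's standard-library sqlite3 module rather than SQL Server, with the tables and data from the question; the helper name delete_subtree is an assumption. It collects the root and all its descendants first, then deletes the relationship rows before the parent rows so the foreign keys are never violated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
conn.execute("CREATE TABLE table1 (Id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE table2 (Id1 INTEGER REFERENCES table1(Id), "
    "Id2 INTEGER REFERENCES table1(Id))"
)
conn.executemany("INSERT INTO table1 VALUES (?)", [(i,) for i in range(1, 11)])
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?)",
    [(2, 1), (3, 1), (4, 2), (5, 2), (6, 4), (7, 4), (8, 5), (9, 5)],
)

def delete_subtree(conn, root_id):
    """Delete root_id and every node below it; returns the deleted ids."""
    doomed = [row[0] for row in conn.execute(
        """
        WITH RECURSIVE sub(id) AS (
            SELECT ?
            UNION ALL
            SELECT t.Id1 FROM table2 t JOIN sub ON t.Id2 = sub.id
        )
        SELECT id FROM sub
        """,
        (root_id,),
    )]
    marks = ",".join("?" * len(doomed))
    # Relationship rows go first, then the nodes they referenced.
    conn.execute(
        f"DELETE FROM table2 WHERE Id1 IN ({marks}) OR Id2 IN ({marks})",
        doomed + doomed,
    )
    conn.execute(f"DELETE FROM table1 WHERE Id IN ({marks})", doomed)
    return sorted(doomed)

deleted = delete_subtree(conn, 1)
remaining = [row[0] for row in conn.execute("SELECT Id FROM table1")]
links_left = conn.execute("SELECT COUNT(*) FROM table2").fetchone()[0]
```

Deleting node 1 leaves only row 10 in table1 and empties table2, which is the outcome the question asks for; passing a different root removes just that branch.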

SQL - Find the Top Level Parent and Multiply Quantities

I have two tables which track part numbers as well as the hierarchy of assemblies.
Table: Config
ConfigNum AssemblyNum Qty
1 A 1
1 B 2
1 C 2
2 A 1
2 C 1
Table: SubAssembly
SubAssembly PartNum Qty
A AA 2
A BB 4
A CC 2
A DD 4
B EE 4
B FF 8
AA AAA 2
I would like to create a flat version of these tables which shows the ConfigNum (Top level parent) with all associated assembly and part numbers, for each ConfigNum in the Config table. The column Config.AssemblyNum is equivalent to SubAssembly.SubAssembly.
The SubAssembly table shows the parent-to-child relationship between parts. For example, assembly 'A' has a child assembly 'AA'. Since 'AA' exists in the SubAssembly column, it is itself an assembly, and as you can see it has a child part 'AAA'. 'AAA' does not exist in the SubAssembly column, therefore it is the last child in the series.
I would also like to have an accurate quantity count of each part based on multiplying of parent to child down the chain.
For example in the output:
(Total Qty of AAA) = (Qty A) x (Qty AA) x (Qty AAA)
4 = 1 x 2 x 2
Example Output table: (for Config 1)
ConfigNum SubAssembly PartNum TotalQty
1 A AA 2
1 A BB 4
1 A CC 2
1 A DD 4
1 B EE 8
1 B FF 16
1 A AAA 4
Any suggestion on how to complete this task would be greatly appreciated.
EDIT: I have been able to create this code based on suggestions made in the answers.
I am still having trouble getting the quantities to multiply down.
I have received the error "Types don't match between the anchor and the recursive part in column "PartQty" of recursive query "RCTE"."
;WITH RCTE (AssemblyNum, PartNum, PartQty, Lvl) AS
(
SELECT AssemblyNum, PartNum, PartQty, 1 AS Lvl
FROM SP_SubAssembly r1
WHERE EXISTS (SELECT * FROM SP_SubAssembly r2 WHERE r1.AssemblyNum = r2.PartNum)
UNION ALL
SELECT rh.AssemblyNum, rc.PartNum, (rc.PartQty * rh.PartQty), Lvl+1 AS Lvl
FROM dbo.SP_SubAssembly rh
INNER JOIN RCTE rc ON rh.PartNum = rc.AssemblyNum
)
SELECT CB.ID, CB.ConfigNum, CB.PartNum, CB.PartQty, r.AssemblyNum, r.PartNum, SUM(r.PartQty * COALESCE(CB.PartQty,1)) AS TotalQty
FROM SP_ConfigBOM CB
FULL OUTER JOIN RCTE r ON CB.PartNum = r.AssemblyNum
WHERE CB.ConfigNum IS NOT NULL
ORDER BY CB.ConfigNum
Thanks,
For this problem I think you must use a recursive query. In fact, I think the SubAssembly table should have some ProductID field besides SubAssembly to easily identify the main product that contains the assemblies.
You can find a similar example in the SQL Server documentation.
You can check it here: http://rextester.com/FQYI80157
Change the Qty in the Config table to change the final result.
create table #t1 (cfg int, part varchar(10), qty int);
create table #t2 (part varchar(10), sasm varchar(10), qty int);
insert into #t1 values (1,'A',2);
insert into #t2 values ('A','AA',2);
insert into #t2 values ('A','BB',4);
insert into #t2 values ('A','CC',2);
insert into #t2 values ('A','DD',4);
insert into #t2 values ('AA','AAA',2);
WITH cte(sasm, part, qty)
AS (
SELECT sasm, part, qty
FROM #t2 WHERE part = 'A'
UNION ALL
SELECT p.sasm, p.part, p.qty * pr.qty
FROM cte pr, #t2 p
WHERE p.part = pr.sasm
)
SELECT #t1.cfg, cte.part, cte.sasm, SUM(cte.qty * COALESCE(#t1.qty,1)) as total_quantity
FROM cte
left join #t1 on cte.part = #t1.part
GROUP BY #t1.cfg, cte.part, cte.sasm;
This is the result:
+------+------+----------------+
| part | sasm | total_quantity |
+------+------+----------------+
| A | AA | 4 |
+------+------+----------------+
| A | DD | 8 |
+------+------+----------------+
| AA | AAA | 4 |
+------+------+----------------+
| A | BB | 8 |
+------+------+----------------+
| A | CC | 4 |
+------+------+----------------+
-- SELECT ... INTO creates #Temp, so no separate INSERT statement is needed
SELECT A.[ConfigNum] ,
A.[AssemblyNum],
B.[PartNum],
A.[Qty],
A.QTY * B.Qty TotalQty
INTO #Temp
FROM [Config] A,
[SubAssembly] B
WHERE A.[AssemblyNum] = B.[SubAssembly]
SELECT A.[ConfigNum] ,
A.[AssemblyNum],
A.[PartNum],
A.[Qty],
A.TotalQty
FROM #Temp A
union
SELECT A.[ConfigNum] ,
A.[AssemblyNum],
B.[PartNum],
A.[Qty],
A.TotalQty * B.Qty
FROM #Temp A,
[SubAssembly] B
WHERE
A.[PartNum] = B.[SubAssembly]
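For comparison, here is a sketch that multiplies the quantities down to any depth with a single recursive query, using the Config and SubAssembly data from the question. It runs on Python's standard-library sqlite3 module rather than SQL Server, and the CTE name bom and its column TopAssembly are assumptions for illustration. The anchor joins each configured assembly to its direct parts; the recursive step treats each part found so far as a possible sub-assembly and multiplies the running total by the child quantity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Config (ConfigNum INTEGER, AssemblyNum TEXT, Qty INTEGER)")
conn.execute("CREATE TABLE SubAssembly (SubAssembly TEXT, PartNum TEXT, Qty INTEGER)")
conn.executemany(
    "INSERT INTO Config VALUES (?, ?, ?)",
    [(1, "A", 1), (1, "B", 2), (1, "C", 2), (2, "A", 1), (2, "C", 1)],
)
conn.executemany(
    "INSERT INTO SubAssembly VALUES (?, ?, ?)",
    [("A", "AA", 2), ("A", "BB", 4), ("A", "CC", 2), ("A", "DD", 4),
     ("B", "EE", 4), ("B", "FF", 8), ("AA", "AAA", 2)],
)

flat = conn.execute(
    """
    WITH RECURSIVE bom(ConfigNum, TopAssembly, PartNum, TotalQty) AS (
        -- anchor: parts directly under each configured assembly
        SELECT c.ConfigNum, c.AssemblyNum, s.PartNum, c.Qty * s.Qty
        FROM Config c JOIN SubAssembly s ON s.SubAssembly = c.AssemblyNum
        UNION ALL
        -- recursive step: a part that is itself an assembly contributes its children
        SELECT b.ConfigNum, b.TopAssembly, s.PartNum, b.TotalQty * s.Qty
        FROM bom b JOIN SubAssembly s ON s.SubAssembly = b.PartNum
    )
    SELECT ConfigNum, TopAssembly, PartNum, TotalQty
    FROM bom
    WHERE ConfigNum = 1
    ORDER BY TopAssembly, PartNum
    """
).fetchall()
```

The rows in flat match the example output for Config 1 in the question, including AAA with a total of 4. If the same part could be reached through more than one path, the final SELECT would need a SUM with GROUP BY to combine the rows.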

select rows from main table based on highest date in child table between a date range

Sorry for the confusing title.
I've this table:
ApplicantID Applicant Name
-------------------------------
1 Sandeep
2 Thomas
3 Philip
4 Jerin
Along with this child table, which is connected with the above table:
DetailsID ApplicantID CourseName Dt
---------------------------------------------------------------------
1 1 C1 10/5/2014
2 1 C2 10/18/2014
3 1 c3 7/3/2014
4 2 C1 3/2/2014
5 2 C2 10/18/2014
6 2 c3 1/1/2014
7 3 C1 1/5/2014
8 3 C2 4/18/2014
9 3 c3 2/23/2014
10 4 C1 3/15/2014
11 4 C2 2/20/2014
12 4 C2 2/20/2014
I want to get the ApplicantIDs; for example, when I specify a date range from
3/5/2014 to 4/20/2014 I should get:
ApplicantID Applicant Name
-------------------------------
3 Philip
4 Jerin
That is, I want the applicants from the main table that also appear in the second table and whose highest date in the second table falls within the specified date range. I hope the scenario is clear.
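You can use the ROW_NUMBER window function to pick each applicant's latest row, then keep it only if its date falls in the given range. The date filter must be applied after numbering; filtering first would match any applicant with some date in the range, not just those whose highest date is in it.
select T1.[ApplicantID], [Applicant Name]
from Table1 T1
join ( select [ApplicantID], Dt,
ROW_NUMBER() over ( partition by [ApplicantID] order by Dt desc) as rn
from Table2
) T
on T1.[ApplicantID] = T.[ApplicantID]
and T.rn = 1
and T.Dt BETWEEN '3/5/2014' AND '4/20/2014'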
You will need to pull the MAX per ApplicantId with a GROUP BY in a sub-query, then JOIN to that result. This should work for you:
Select A.ApplicantId, A.[Applicant Name]
From ApplicantTableName A
Join
(
Select D.ApplicantId, Max(D.Dt) DT
From DetailsTableName D
Group By D.ApplicantId
) B On A.ApplicantId = B.ApplicantId
Where B.DT Between '03/05/2014' And '04/20/2014'
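The MAX-per-applicant idea can be checked against the sample data with a short sketch. It uses Python's standard-library sqlite3 module rather than SQL Server; the table names Applicants and Details and the column ApplicantName are assumptions, and the dates are stored as ISO strings (YYYY-MM-DD) so that text comparison orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Applicants (ApplicantID INTEGER PRIMARY KEY, ApplicantName TEXT)")
conn.execute(
    "CREATE TABLE Details (DetailsID INTEGER PRIMARY KEY, ApplicantID INTEGER, "
    "CourseName TEXT, Dt TEXT)"
)
conn.executemany(
    "INSERT INTO Applicants VALUES (?, ?)",
    [(1, "Sandeep"), (2, "Thomas"), (3, "Philip"), (4, "Jerin")],
)
conn.executemany(
    "INSERT INTO Details VALUES (?, ?, ?, ?)",
    [(1, 1, "C1", "2014-10-05"), (2, 1, "C2", "2014-10-18"), (3, 1, "c3", "2014-07-03"),
     (4, 2, "C1", "2014-03-02"), (5, 2, "C2", "2014-10-18"), (6, 2, "c3", "2014-01-01"),
     (7, 3, "C1", "2014-01-05"), (8, 3, "C2", "2014-04-18"), (9, 3, "c3", "2014-02-23"),
     (10, 4, "C1", "2014-03-15"), (11, 4, "C2", "2014-02-20"), (12, 4, "C2", "2014-02-20")],
)

# HAVING filters on the per-applicant maximum, so an applicant qualifies
# only if the latest date itself lies inside the range.
matches = conn.execute(
    """
    SELECT a.ApplicantID, a.ApplicantName
    FROM Applicants a
    JOIN Details d ON d.ApplicantID = a.ApplicantID
    GROUP BY a.ApplicantID, a.ApplicantName
    HAVING MAX(d.Dt) BETWEEN ? AND ?
    ORDER BY a.ApplicantID
    """,
    ("2014-03-05", "2014-04-20"),
).fetchall()
```

Applicants 1 and 2 are excluded because their latest date is 10/18/2014, outside the range, which leaves Philip and Jerin as in the expected result.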