Hierarchy SQL server joining two different tables - sql

I'm having a problem making a hierarchy query.
I have the following in MS SQL DB:
Table A - Orders with order code, article, qty:
OP | ART | QTY
A | X |100
B | Y |200
Table B holds the assembly references of articles. An article CAN itself be made from other articles when a child reference exists (this may need to go 3 levels deep):
ART | ART2 |QTY
X | U | 20
X | O | 10
X | Z | 30
Y | Q | 20
Y | W | 15
Y | E | 30
U | Z | 10
And I want to get something like this:
A.OP |LEVEL| ART | B.ART2 |QTY
A | 2 | X | Z |(100*20*10)=20000
A | 1 | X | O |(100*10) =1000
A | 1 | X | Z |(100*30) = 3000
B | 1 | Y | Q |(200*20) = 4000
B | 1 | Y | W |(200*15) = 3000
B | 1 | Y | E |(200*30) = 6000
B | 1 | Y | Z |(200*10) = 2000
I've already made one thing:
WITH X AS (
SELECT
firstlvl.ART,
1 AS LEVEL,
firstlvl.ART2,
firstlvl.QTY,
QTY AS PARENTQTY
FROM B AS firstlvl
WHERE firstlvl.ART='X'
UNION ALL
SELECT secondlevel.ART,
EL.LEVEL + 1,
secondlevel.ART2,
secondlevel.QTY,
EL.PARENTQTY AS PARENTQTY
FROM B AS secondlevel
INNER JOIN X AS EL
ON secondlevel.ART = EL.ART2)
SELECT * FROM X
But now I don't know how to join the quantities with table A, or how to run this query for every item in the first table.
Can anyone help me please?
Many Thanks!

with AllData as
( select op as art, art as art2, qty from a
union all
select * from b),
Tree(RootLvl, Father, Child, qty, lvl) as (
select art, art, art2, qty, 0 from AllData
where art in ( 'A' , 'B')
union all
select C.RootLvl, C.Child, AllData.art2, AllData.qty * c.qty, C.lvl + 1 from Tree C
join AllData
on C.Child = AllData.art
)
select C1.RootLvl as OP, C1.Lvl, C1.Child, C1.Qty from Tree c1
left join Tree c2
on c1.child = c2.father and
c1.rootLvl = c2.RootLvl
where C2.RootLvl is null
order by 1, 2 desc
SQL Fiddle demo
Tables A and B have almost the same shape, so I combined them into one set (AllData) to simplify things. A recursive CTE then builds the tree, and the final left join keeps only the leaves of that tree.
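As a rough way to try the recursive approach outside SQL Server, the sketch below loads the sample rows into an in-memory SQLite database and runs an adapted version of the query. It is only an illustration under a few assumptions: SQLite syntax (WITH RECURSIVE) replaces T-SQL, the anchor picks up every order from table a instead of the hard-coded 'A' and 'B', and order codes are assumed never to collide with article codes.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (op TEXT, art TEXT, qty INTEGER);
    CREATE TABLE b (art TEXT, art2 TEXT, qty INTEGER);
    INSERT INTO a VALUES ('A','X',100), ('B','Y',200);
    INSERT INTO b VALUES ('X','U',20), ('X','O',10), ('X','Z',30),
                         ('Y','Q',20), ('Y','W',15), ('Y','E',30),
                         ('U','Z',10);
""")

query = """
WITH RECURSIVE
AllData(art, art2, qty) AS (
    -- Treat each order as a parent of its article.
    SELECT op, art, qty FROM a
    UNION ALL
    SELECT art, art2, qty FROM b
),
Tree(RootLvl, Father, Child, qty, lvl) AS (
    -- Anchor: one row per order (assumption: taken from table a).
    SELECT art, art, art2, qty, 0 FROM AllData
    WHERE art IN (SELECT op FROM a)
    UNION ALL
    -- Step: descend one level and multiply the quantities.
    SELECT c.RootLvl, c.Child, d.art2, d.qty * c.qty, c.lvl + 1
    FROM Tree AS c
    JOIN AllData AS d ON c.Child = d.art
)
-- Keep only leaves: rows whose child never appears as a father.
SELECT c1.RootLvl, c1.lvl, c1.Child, c1.qty
FROM Tree AS c1
LEFT JOIN Tree AS c2
       ON c1.Child = c2.Father AND c1.RootLvl = c2.RootLvl
WHERE c2.RootLvl IS NULL
ORDER BY c1.RootLvl, c1.lvl DESC, c1.Child
"""

leaves = conn.execute(query).fetchall()
for row in leaves:
    print(row)
```

With the sample data, the level-2 row for order A multiplies 100 * 20 * 10, and every other leaf sits at level 1.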

Related

SQL - Intersections

I have table like below:
|Group|User|
| 1 | X |
| 1 | Y |
| 1 | Z |
| 2 | X |
| 2 | Y |
| 2 | Z |
| 3 | X |
| 3 | Z |
| 4 | X |
I want to calculate intersections of groups: 1&2, 1&2&3, 1&2&3&4.
Then I want to see users in each intersection and how many of them are there.
In the given example should be:
1&2 -> 3
1&2&3 -> 2
1&2&3&4 -> 1
Is it possible using SQL?
If yes, how can I start?
I work with Python on a daily basis, but now I have to translate this code into SQL and I'm having a huge problem with it.
In PostgreSQL you can use JOIN or INTERSECT. Example with JOIN:
select count(*)
from t a
join t b on b.usr = a.usr
where a.grp = 1 and b.grp = 2
Then:
select count(*)
from t a
join t b on b.usr = a.usr
join t c on c.usr = a.usr
where a.grp = 1 and b.grp = 2 and c.grp = 3
And:
select count(*)
from t a
join t b on b.usr = a.usr
join t c on c.usr = a.usr
join t d on d.usr = a.usr
where a.grp = 1 and b.grp = 2 and c.grp = 3 and d.grp = 4
These queries return 3, 2, and 1 respectively.
EDIT - You can also do:
select count(*)
from (
select usr, count(*)
from t
where grp in (1, 2, 3, 4)
group by usr
having count(*) = 4 -- number of groups
) x
See running example at db<>fiddle.
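The GROUP BY / HAVING variant generalises to any list of groups. Here is a small sketch of that idea, run against SQLite from Python so that both the users and the counts are visible. The table name t and the columns grp and usr follow the answer above; the helper intersection_users is a hypothetical name introduced only for this illustration.

```python
import sqlite3

# Sample data from the question: (group, user) pairs.
rows = [(1, "X"), (1, "Y"), (1, "Z"),
        (2, "X"), (2, "Y"), (2, "Z"),
        (3, "X"), (3, "Z"),
        (4, "X")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (grp INTEGER, usr TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", rows)


def intersection_users(conn, groups):
    """Return the users that belong to every group in `groups`."""
    placeholders = ",".join("?" * len(groups))
    sql = f"""
        SELECT usr
        FROM t
        WHERE grp IN ({placeholders})
        GROUP BY usr
        HAVING COUNT(DISTINCT grp) = ?   -- user must appear in all groups
        ORDER BY usr
    """
    return [r[0] for r in conn.execute(sql, [*groups, len(groups)])]


# Intersections 1&2, 1&2&3, 1&2&3&4.
results = {}
for k in (2, 3, 4):
    groups = list(range(1, k + 1))
    results[k] = intersection_users(conn, groups)
    print(groups, results[k], len(results[k]))
```

COUNT(DISTINCT grp) is used instead of COUNT(*) so that a duplicated (grp, usr) row would not inflate the count.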

SQL cross match IDs to create new cross-platform ID -> how to optimize

I have a Redshift table with two columns that shows which IDs are connected, that is, which belong to the same person. I would like to create a mapping (an extra column) with a unique person ID using SQL.
The problem is similar to this one: SQL: creating unique id for item with several ids
However, in my case the IDs in the two columns are of different kinds, and therefore the suggested joining solution (t1.epid = t2.pid, etc.) will not work.
In below example there are 4 individual persons using 9 IDs of type 1 and 10 IDs of type 2.
ID_type1 | ID_type2
---------+--------
1 | A
1 | B
2 | C
3 | C
4 | D
4 | E
5 | E
6 | F
7 | G
7 | H
7 | I
8 | I
8 | J
9 | J
9 | B
What I am looking for is an extra column with a mapping to a unique ID for the person. The difficulty is in correctly identifying the IDs related to persons like x & z, who have multiple IDs of both types. The result could look something like this:
ID_type1 | ID_type2 | ID_real
---------+---------------------
1 | A | z
1 | B | z
2 | C | y
3 | C | y
4 | D | x
4 | E | x
5 | E | x
6 | F | w
7 | G | z
7 | H | z
7 | I | z
8 | I | z
8 | J | z
9 | J | z
9 | B | z
I wrote the query below, which goes up to 4 loops and does the job for a small dataset. However, it struggles with larger sets because the number of rows after joining grows very fast with each loop. I am stuck finding a way to do this more efficiently.
WITH
T1 AS(
SELECT DISTINCT
l1.ID_type1 AS ID_type1,
r1.ID_type1 AS ID_type1_overlap
FROM crossmatch_example l1
LEFT JOIN crossmatch_example r1 USING(ID_type2)
ORDER BY 1,2
),
T2 AS(
SELECT DISTINCT
l1.ID_type1,
r1.ID_type1_overlap
FROM T1 l1
LEFT JOIN T1 r1 on l1.ID_type1_overlap = r1.ID_type1
ORDER BY 1,2
),
T3 AS(
SELECT DISTINCT
l1.ID_type1,
r1.ID_type1_overlap
FROM T2 l1
LEFT JOIN T2 r1 on l1.ID_type1_overlap = r1.ID_type1
ORDER BY 1,2
),
T4 AS(
SELECT DISTINCT
l1.ID_type1,
r1.ID_type1_overlap
FROM T3 l1
LEFT JOIN T3 r1 on l1.ID_type1_overlap = r1.ID_type1
ORDER BY 1,2
),
mapping AS(
SELECT ID_type1,
min(ID_type1_overlap) AS mapped
FROM T4
GROUP BY 1
ORDER BY 1
),
output AS(
SELECT DISTINCT
l1.ID_type1::INT AS ID_type1,
l1.ID_type2,
FUNC_SHA1(r1.mapped) AS ID_real
FROM crossmatch_example l1
LEFT JOIN mapping r1 on l1.ID_type1 = r1.ID_type1
ORDER BY 1,2)
SELECT * FROM output
What you're trying to do is called transitive closure. There are articles about how to implement it in SQL.
Here is an example written in Spark's LINQ-like DSL: https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/SparkTC.scala.
The solution is iterative, and fully resolving the graph may take more iterations than a fixed number of joins allows. What can be optimised is the input to each iteration. I remember working on it once, but cannot recall the details.
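For comparison, here is a minimal sketch of the same grouping done outside the database with a union-find (disjoint-set) structure. This is a different technique from the iterative SQL joins discussed above, not something Redshift provides: it treats every row as an edge between a type-1 ID and a type-2 ID and merges the two into one component. The function names and the ("t1", ...) / ("t2", ...) tagging are assumptions made for this illustration.

```python
def find(parent, node):
    """Return the representative of node's component (with path halving)."""
    while parent[node] != node:
        parent[node] = parent[parent[node]]
        node = parent[node]
    return node


def connected_components(pairs):
    """Map every ID (tagged by type) to the representative of its component."""
    parent = {}
    for id1, id2 in pairs:
        # Tag the IDs so a type-1 value can never clash with a type-2 value.
        a, b = ("t1", id1), ("t2", id2)
        parent.setdefault(a, a)
        parent.setdefault(b, b)
        root_a, root_b = find(parent, a), find(parent, b)
        if root_a != root_b:
            parent[root_b] = root_a   # merge the two components
    return {node: find(parent, node) for node in parent}


# Sample rows from the question.
pairs = [(1, "A"), (1, "B"), (2, "C"), (3, "C"), (4, "D"), (4, "E"),
         (5, "E"), (6, "F"), (7, "G"), (7, "H"), (7, "I"), (8, "I"),
         (8, "J"), (9, "J"), (9, "B")]

roots = connected_components(pairs)
# One person ID per type-1 ID; the representative stands in for ID_real.
person_of = {id1: roots[("t1", id1)] for id1, _ in pairs}
for id1, id2 in pairs:
    print(id1, id2, person_of[id1])
```

On the sample data this yields four components, matching the four persons in the expected output. Each row is processed once, so the work grows roughly linearly with the number of rows rather than multiplying with every join.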

SQL Match group of records to another group of records

Is there an SQL statement to match up multiple records to an exact match of multiple records in another table?
Let's say I have table A
ID | List# | Item
1 | 5 | A
2 | 5 | C
3 | 5 | B
4 | 6 | A
5 | 6 | D
*I purposely made Items 'ABC' out of order as the order of the records I receive may be out of order.
Table B
ID | Group | Item
1 | AAA | A
2 | AAA | B
3 | AAA | C
4 | AAA | D
5 | BBB | A
6 | BBB | B
7 | BBB | C
8 | DDD | A
If looking at the first table, I would want List# 5 to return a match only for group 'BBB', as all (and only) three records match.
The simplest way is to aggregate into a string or array and join. Standard SQL supports listagg(), so you can do:
select a.list, b.grp, a.items
from (select a.list, listagg(item, ',') within group (order by item) as items
      from a
      group by a.list
     ) a join
     (select b.grp, listagg(item, ',') within group (order by item) as items
      from b
      group by b.grp
     ) b
     on a.items = b.items;
Not all databases support listagg(). Many -- but not all -- have similar functionality. This is simpler than the "standard" SQL approach.
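You can also treat this as relational division. It's a little cumbersome, but here it is. Note that counting only the matching items is not enough for an exact match (group AAA also contains A, B and C), so the group's total item count is checked as well:
with
x as (
  select item
  from a
  where a.list = 5
),
y as (
  select b.grp,
         count(*) as total,      -- all items in the group
         count(x.item) as cnt    -- items that also appear in list 5
  from b
  left join x on x.item = b.item
  group by b.grp
)
select grp
from y
where cnt = (select count(*) from x)  -- every list item is present
  and total = cnt                     -- and the group has nothing extra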
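To see the "all and only" requirement checked against the sample data, the sketch below runs a count-based exact match in SQLite from Python. The column names list and grp stand in for List# and Group, as in the answers above, and the helper matching_groups is a hypothetical name. It assumes an item appears at most once per list and once per group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, list INTEGER, item TEXT);
    CREATE TABLE b (id INTEGER, grp TEXT, item TEXT);
    INSERT INTO a VALUES (1,5,'A'), (2,5,'C'), (3,5,'B'), (4,6,'A'), (5,6,'D');
    INSERT INTO b VALUES (1,'AAA','A'), (2,'AAA','B'), (3,'AAA','C'),
                         (4,'AAA','D'), (5,'BBB','A'), (6,'BBB','B'),
                         (7,'BBB','C'), (8,'DDD','A');
""")


def matching_groups(conn, list_no):
    """Groups in b whose item set equals the item set of the given list."""
    sql = """
        SELECT b.grp
        FROM b
        LEFT JOIN a ON a.item = b.item AND a.list = ?
        GROUP BY b.grp
        HAVING COUNT(*) = COUNT(a.item)          -- group has no extra items
           AND COUNT(a.item) = (SELECT COUNT(*)  -- and covers the whole list
                                FROM a WHERE list = ?)
        ORDER BY b.grp
    """
    return [r[0] for r in conn.execute(sql, (list_no, list_no))]


print(matching_groups(conn, 5))
print(matching_groups(conn, 6))
```

List 5 matches only BBB, because AAA carries the extra item D; list 6 (A, D) matches no group at all.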

Find all possible pairs based on transitivity via T-SQL

I have a problem that has been bothering me for several days.
The collective mind is the last resort I can rely on.
Assume we have a table with two columns. Actually, the values are GUIDs, but for the sake of simplicity let's take them as letters.
| a | b |
|---|---|
| x | y |
| y | x |
| y | z |
| z | y |
| m | n |
| m | z |
I need to create a T-SQL query that will present all the possible pairs implied by transitivity, i.e. if x=y and y=z, then x=z. Symmetry has to hold as well, i.e. if there is x=y, then there should be y=x too.
In this particular case, I believe there is a "full house", meaning that every letter is connected to all the others through intermediates. But I need a query that will show that.
Here is what I have so far (SQLFiddle fails to run it):
WITH
t AS
(SELECT 'x' AS a, 'y' AS b
UNION ALL
SELECT 'y' AS a, 'x' AS b
UNION ALL
SELECT 'y' AS a, 'z' AS b
UNION ALL
SELECT 'z' AS a, 'y' AS b
UNION ALL
SELECT 'm' AS a, 'n' AS b
UNION ALL
SELECT 'm' AS a, 'z' AS b),
coupled_reflective AS --for reflective couples we take either of them
(SELECT t2.a, t2.b
FROM t t1
JOIN t t2 ON t1.a=t2.b
AND t1.b!=t2.a),
reversive_coupled_reflective AS --that's another half of the above couples (reversed)
(SELECT t2.b, t2.a
FROM t t1
JOIN t t2 ON t1.a=t2.b
AND t1.b!=t2.a),
rs AS -- reduce the initial set (t)
(SELECT *
FROM coupled_reflective
UNION
SELECT *
FROM t
EXCEPT
SELECT *
FROM reversive_coupled_reflective),
cte AS -- recursively iterate through the set to find transitive values (get linked by the left field)
(SELECT a, b
FROM rs
UNION ALL
SELECT rs.b, cte.b
FROM rs
JOIN cte ON rs.a=cte.a
AND rs.b!=cte.b),
cte2 AS -- recursively iterate through the set to find transitive values (get linked by the right field)
(SELECT a, b
FROM rs
UNION ALL
SELECT rs.a, cte.a
FROM rs
JOIN cte ON rs.b=cte.b
AND rs.a!=cte.a)
SELECT a, b FROM cte2
UNION
SELECT a, b FROM cte
UNION
SELECT a, b FROM t
UNION
SELECT b, a FROM t
But that doesn't do the trick, unfortunately.
The desired result should be
| a | b |
|---|---|
| x | y |
| y | x |
| y | z |
| z | y |
| m | n |
| m | z |
| n | m |
| z | m |
| x | z |
| z | x |
| x | m |
| m | x |
| x | n |
| n | x |
| y | m |
| m | y |
| y | n |
| n | y |
| z | n |
| n | z |
Is there a SQL-gifted buddy out there who can help me here, please?
Thanks.
You can use recursive CTEs, but you need a list of already visited nodes. You can implement that using a string:
with cte as (
select a, b, cast('{' + a + '}{' + b + '}' as varchar(max)) as visited
from t
union all
select cte.a, t.b,
(visited + '{' + t.b + '}')
from cte join
t
on cte.b = t.a
where cte.visited not like '%{' + t.b + '}%'
)
select distinct a, b
from cte;
Note:
The above follows the directed links in the graph. If you want undirected links, then include both:
with t as (
select a, b from yourtable
union
select b, a from yourtable
),
The rest of the logic follows using t.
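The visited-list idea can be tried on the sample data with SQLite from Python. This is a sketch under stated assumptions rather than the T-SQL itself: SQLite uses WITH RECURSIVE and || for string concatenation instead of +, and the table name yourtable is the placeholder from the note above. The edges are made undirected first, and the values are assumed to contain no LIKE wildcards (% or _).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (a TEXT, b TEXT)")
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?)",
    [("x", "y"), ("y", "x"), ("y", "z"), ("z", "y"), ("m", "n"), ("m", "z")],
)

query = """
WITH RECURSIVE
t(a, b) AS (
    -- Undirected links: include every edge in both directions.
    SELECT a, b FROM yourtable
    UNION
    SELECT b, a FROM yourtable
),
cte(a, b, visited) AS (
    -- Anchor: each edge, remembering both endpoints as visited.
    SELECT a, b, '{' || a || '}{' || b || '}' FROM t
    UNION ALL
    -- Step: extend the path by one edge to a node not yet visited.
    SELECT cte.a, t.b, cte.visited || '{' || t.b || '}'
    FROM cte
    JOIN t ON cte.b = t.a
    WHERE cte.visited NOT LIKE '%{' || t.b || '}%'
)
SELECT DISTINCT a, b FROM cte ORDER BY a, b
"""

pairs = conn.execute(query).fetchall()
for a, b in pairs:
    print(a, b)
```

Because all five letters end up in one connected group, the result holds every ordered pair of distinct letters, 5 * 4 = 20 rows. The visited string also keeps a node from being paired with itself.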

SQL return max value from child for each parent row

I have 2 tables - 1 with parent records, 1 with child records. For each parent record, I'm trying to return a single child record with the MAX(SalesPriceEach).
Additionally I'd like to only return a value when there is more than 1 child record.
parent - SalesTransactions table:
+-------------------+---------+
|SalesTransaction_ID| text |
+-------------------+---------+
| 1 | Blah |
| 2 | Blah2 |
| 3 | Blah3 |
+-------------------+---------+
child - SalesTransactionLines table
+--+-------------------+---------+--------------+
|id|SalesTransaction_ID|StockCode|SalesPriceEach|
+--+-------------------+---------+--------------+
| 1| 1 | 123 | 99 |
| 2| 1 | 35 | 50 |
| 3| 2 | 15 | 75 |
+--+-------------------+---------+--------------+
desired results
+-------------------+---------+--------------+
|SalesTransaction_ID|StockCode|SalesPriceEach|
+-------------------+---------+--------------+
| 1 | 123 | 99 |
| 2 | 15 | 75 |
+-------------------+---------+--------------+
I found a very similar question here and based my query on its answer, but I am not seeing the results I expect.
WITH max_feature AS (
SELECT c.StockCode,
c.SalesTransaction_ID,
MAX(c.SalesPriceEach) as feature
FROM SalesTransactionLines c
GROUP BY c.StockCode, c.SalesTransaction_ID)
SELECT p.SalesTransaction_ID,
mf.StockCode,
mf.feature
FROM SalesTransactions p
LEFT JOIN max_feature mf ON mf.SalesTransaction_ID = p.SalesTransaction_ID
This query returns multiple rows for each parent, and not even with the highest value first!
select stl.SalesTransaction_ID, stl.StockCode, ss.MaxSalesPriceEach
from SalesTransactionLines stl
inner join
(
select stl2.SalesTransaction_ID, max(stl2.SalesPriceEach) MaxSalesPriceEach
from SalesTransactionLines stl2
group by stl2.SalesTransaction_ID
having count(*) > 1
) ss on (ss.SalesTransaction_ID = stl.SalesTransaction_ID and
ss.MaxSalesPriceEach = stl.SalesPriceEach)
OR, alternatively:
SELECT stl1.*
FROM SalesTransactionLines AS stl1
LEFT OUTER JOIN SalesTransactionLines AS stl2
ON (stl1.SalesTransaction_ID = stl2.SalesTransaction_ID
AND stl1.SalesPriceEach < stl2.SalesPriceEach)
WHERE stl2.SalesPriceEach IS NULL;
I know I'm a year late to this party but I always prefer using Row_Number in these situations. It solves the problem when there are two rows that meet your Max criteria and makes sure that only one row is returned:
with z as (
select
st.SalesTransaction_ID
,row=ROW_NUMBER() OVER(PARTITION BY st.SalesTransaction_ID ORDER BY stl.SalesPriceEach DESC)
,stl.StockCode
,stl.SalesPriceEach
from
SalesTransactions st
inner join SalesTransactionLines stl on stl.SalesTransaction_ID = st.SalesTransaction_ID
)
select * from z where row = 1
SELECT SalesTransactions.SalesTransaction_ID,
SalesTransactionLines.StockCode,
MAX(SalesTransactionLines.SalesPriceEach)
FROM SalesTransactions RIGHT JOIN SalesTransactionLines
ON SalesTransactions.SalesTransaction_ID = SalesTransactionLines.SalesTransaction_ID
GROUP BY SalesTransactions.SalesTransaction_ID, SalesTransactionLines.StockCode;
select a.SalesTransaction_ID, a.StockCode, a.SalesPriceEach
from SalesTransactionLines as a
inner join (select SalesTransaction_ID, MAX(SalesPriceEach) as SalesPriceEach
from SalesTransactionLines group by SalesTransaction_ID) as b
on a.SalesTransaction_ID = b.SalesTransaction_ID
and a.SalesPriceEach = b.SalesPriceEach
The subquery returns each transaction ID with its maximum price, so just join it back to the lines table (where StockCode and SalesPriceEach live) on those two values.
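The grouped-subquery join can be checked against the sample rows with SQLite from Python. The sketch below is illustrative only: the helper max_line_per_transaction and its min_children parameter are hypothetical names. The parameter is there because the question asks for a result only when there is more than one child, while its desired output also lists transaction 2, which has a single line.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SalesTransactionLines (
        id INTEGER,
        SalesTransaction_ID INTEGER,
        StockCode INTEGER,
        SalesPriceEach INTEGER
    );
    INSERT INTO SalesTransactionLines VALUES
        (1, 1, 123, 99), (2, 1, 35, 50), (3, 2, 15, 75);
""")


def max_line_per_transaction(conn, min_children=1):
    """Highest-priced line per transaction with at least min_children lines."""
    sql = """
        SELECT stl.SalesTransaction_ID, stl.StockCode, stl.SalesPriceEach
        FROM SalesTransactionLines AS stl
        JOIN (
            -- One row per transaction: its maximum price.
            SELECT SalesTransaction_ID, MAX(SalesPriceEach) AS MaxPrice
            FROM SalesTransactionLines
            GROUP BY SalesTransaction_ID
            HAVING COUNT(*) >= ?
        ) AS ss
          ON ss.SalesTransaction_ID = stl.SalesTransaction_ID
         AND ss.MaxPrice = stl.SalesPriceEach
        ORDER BY stl.SalesTransaction_ID
    """
    return conn.execute(sql, (min_children,)).fetchall()


print(max_line_per_transaction(conn))                  # every transaction
print(max_line_per_transaction(conn, min_children=2))  # more than one line
```

If two lines of the same transaction tie on the maximum price, this join returns both of them; the ROW_NUMBER approach above is the one that guarantees a single row.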