SQL Group related strings when overlap (transitive relationship?) - sql

I have the following result sets.
The values come from these relational tables:
ProductId, GroupId
1 | 4
2 | 4
2 | 5
3 | 4
3 | 5
CategoryId | ProductId
1 | 1
1 | 2
1 | 3
All of the "Id" values below are the category of those products.
Example 1:       Example 2:         Example 3:
| Id | Group |   | Id | Group   |   | Id | Group   |
--------------   ----------------   ----------------
| 1  | 4     |   | 1  | 4,5     |   | 1  | 3,5     |
| 1  | 4,5   |   | 1  | 3,4,5,6 |   | 1  | 3,4,5,6 |
| 1  | 5,7   |   | 1  | 5,7     |   | 1  | 4,5     |
--------------   ----------------   ----------------
I need to process those tables to get the following results
Result 1:        Result 2:          Result 3:
| Id | Group |   | Id | Group   |   | Id | Group   |
--------------   ----------------   ----------------
| 1  | 4,5   |   | 1  | 3,4,5,6 |   | 1  | 3,4,5,6 |
| 1  | 4,5   |   | 1  | 3,4,5,6 |   | 1  | 3,4,5,6 |
| 1  | 5,7   |   | 1  | 5,7     |   | 1  | 3,4,5,6 |
--------------   ----------------   ----------------
Explanation: those columns indicate where the price of an item should be placed, and all related prices should appear in the same table where possible. So when one group can be merged into another, the merged table simply has empty cells in the columns that did not originally apply to that product:
Using example 1 this is the final result:
| G4 | G5 |
--------------------
Product1 | 10 | |
Product2 | | 15 |
Product3 | 14 | 18 |
--------------------
| G5 | G7 |
--------------------
Product1 | 10 | 25 |
Product2 | | 15 |
--------------------
Using the example 3 this is the final result:
| G3 | G4 | G5 | G6 |
------------------------------
Product1 | 10 | | 15 | 20 |
Product2 | | | 17 | |
Product3 | 14 | 18 | | |
------------------------------
But I'm completely clueless about how to do those group joins (the empty spaces in the result set are not a problem).

I don't fully understand the problem yet, but this might help you determine which sets of groups are subsets of larger ones. As best I can tell, that's the transitive relationship you indicated in the title.
select t1.id as id1, t2.id as id2
from T t1 full outer join T t2 on t2.grp = t1.grp
group by t1.id, t2.id
having count(distinct t1.grp) < count(distinct t2.grp) and count(t1.id) = count(*)
Now that you have an adjacency list or a hierarchy, you could try some of the approaches from the question "Finding a Top Level Parent in SQL" to find all the "maximal" or top-level sets.
If you have a limit on the number of possible groups then Gordon's answer there may be sufficient without all the recursive complications.
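To make the subset idea concrete outside SQL, here is a minimal Python sketch. It assumes the rule implied by the three examples in the question: each group list is replaced by the largest list in the same result set that contains it. The function name `absorb_subsets` is hypothetical, and if two equally large supersets exist the sketch just takes the first one.

```python
def absorb_subsets(groups):
    """Replace each comma-separated group list by its largest superset in the same list."""
    sets = [frozenset(int(g) for g in text.split(",")) for text in groups]
    merged = []
    for current in sets:
        # Every set is a superset of itself, so max() always has a candidate.
        largest = max((other for other in sets if current <= other), key=len)
        merged.append(",".join(str(g) for g in sorted(largest)))
    return merged


example_1 = absorb_subsets(["4", "4,5", "5,7"])
example_2 = absorb_subsets(["4,5", "3,4,5,6", "5,7"])
example_3 = absorb_subsets(["3,5", "3,4,5,6", "4,5"])
print(example_1)
print(example_2)
print(example_3)
```

Note that 5,7 stays separate in examples 1 and 2 because it merely overlaps the other lists and is not contained in any of them.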

Related

Inserting set of rows for every ID in another table

This is the initial table (just a part of a larger table where the ArticleIDs can vary). The database is MS SQL Server.
-----------------------------------
|ArticleID | GroupID |
-----------------------------------
| 1 | NULL |
-----------------------------------
| 2 | NULL |
-----------------------------------
| 3 | NULL |
-----------------------------------
| 4 | NULL |
-----------------------------------
Set of rows that should be entered for each ArticleID looks something like this:
------------------------
| GroupID |
------------------------
| A |
------------------------
| B |
------------------------
| C |
------------------------
| D |
------------------------
Result table should look something like this:
-----------------------------------
|ArticleID | GroupID |
-----------------------------------
| 1 | NULL |
-----------------------------------
| 1 | A |
-----------------------------------
| 1 | B |
-----------------------------------
| 1 | C |
-----------------------------------
| 1 | D |
-----------------------------------
| 2 | NULL |
-----------------------------------
| 2 | A |
-----------------------------------
| 2 | B |
-----------------------------------
| 2 | C |
-----------------------------------
| 2 | D |
-----------------------------------
| 3 | NULL |
-----------------------------------
| 3 | A |
-----------------------------------
| 3 | B |
-----------------------------------
| 3 | C |
-----------------------------------
| 3 | D |
-----------------------------------
| 4 | NULL |
-----------------------------------
| 4 | A |
-----------------------------------
| 4 | B |
-----------------------------------
| 4 | C |
-----------------------------------
| 4 | D |
-----------------------------------
Any suggestions on how to insert these rows efficiently?
Thanks a lot for your suggestions.
Regards
This is a cross join between two sets.
with a as (
select * from(values (1),(2),(3),(4))v(ArticleId)
), g as (
select * from(values (null),('A'),('B'),('C'),('D'))v(GroupId)
)
select *
from a cross join g;
To insert into the original table you could do:
with g as (select * from(values('A'),('B'),('C'),('D'))v(GroupId))
insert into t
select t.ArticleId, g.GroupId
from t cross join g;
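As a quick check of the cross-join insert, here is a small sketch that runs the same idea against an in-memory SQLite database from Python. SQLite stands in for MS SQL Server here, so the group list is written as a `UNION ALL` subquery instead of a `values` constructor. The `WHERE t.GroupID IS NULL` filter is an added assumption so that rerunning the insert against rows that already have a group does not multiply them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (ArticleID INTEGER, GroupID TEXT)")
conn.executemany("INSERT INTO t VALUES (?, NULL)", [(1,), (2,), (3,), (4,)])

# Pair every existing article (the NULL rows) with every group and insert the pairs.
conn.execute("""
    INSERT INTO t (ArticleID, GroupID)
    SELECT t.ArticleID, g.GroupID
    FROM t
    CROSS JOIN (SELECT 'A' AS GroupID UNION ALL SELECT 'B'
                UNION ALL SELECT 'C' UNION ALL SELECT 'D') AS g
    WHERE t.GroupID IS NULL
""")

rows = conn.execute(
    "SELECT ArticleID, GroupID FROM t ORDER BY ArticleID, GroupID"
).fetchall()
print(len(rows))
print(rows[:5])
```

The table ends up with the original NULL row plus one row per group for each article.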

How to update table 2 from the inserted data in table 1?

Can you help me with the query I need to update one table with data from another?
I have 2 tables for example:
tbl_med_take
| id | name | med | qty |
---------------------------------
| 1 | jayson | med2 | 3 |
| 2 | may | med2 | 4 |
| 3 | jenny | med3 | 6 |
| 4 | joel | med3 | 4 |
tbl_med
| id | med | stocks |
-----------------------------
| 1 | med1 | 20 |
| 2 | med2 | 17 |
| 3 | med3 | 24 |
The output that I want in tbl_med:
tbl_med
| id | med | stocks |
-----------------------------
| 1 | med1 | 20 |
| 2 | med2 | 10 |
| 3 | med3 | 14 |
First get the total consumed per medicine from tbl_med_take using
select med,sum(qty) as total from tbl_med_take group by med
Then you can left join that with tbl_med and subtract.
select m.id,m.med,(m.stocks-ISNULL(n.total,0)) as stocks from tbl_med m
left join
(select med,sum(qty) as total from tbl_med_take group by med) n
on m.med=n.med
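The query above only selects the new stock values. Since the question asks for an update, here is a hedged sketch of one way to write the totals back, run against an in-memory SQLite database from Python. SQLite stands in for the asker's database, so it uses a correlated subquery and `COALESCE` rather than `ISNULL`; the table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_med_take (id INTEGER, name TEXT, med TEXT, qty INTEGER);
    CREATE TABLE tbl_med (id INTEGER, med TEXT, stocks INTEGER);
    INSERT INTO tbl_med_take VALUES
        (1, 'jayson', 'med2', 3), (2, 'may', 'med2', 4),
        (3, 'jenny', 'med3', 6), (4, 'joel', 'med3', 4);
    INSERT INTO tbl_med VALUES (1, 'med1', 20), (2, 'med2', 17), (3, 'med3', 24);
""")

# Subtract the total taken per medicine; medicines never taken subtract 0.
conn.execute("""
    UPDATE tbl_med
    SET stocks = stocks - COALESCE(
        (SELECT SUM(qty) FROM tbl_med_take
         WHERE tbl_med_take.med = tbl_med.med), 0)
""")

stocks = conn.execute("SELECT id, med, stocks FROM tbl_med ORDER BY id").fetchall()
print(stocks)
```

Running the update twice would subtract the quantities twice, so in practice it needs to be tied to newly inserted rows only.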

Adjusting composite key when deleting database elements in SQL

Consider the following table T
------------------------------
| CountryID | Obs | Event |
------------------------------
| 1 | 1 | 10 |
| 1 | 2 | 20 |
| 1 | 3 | 30 |
| 2 | 1 | 20 |
| 2 | 2 | 30 |
| 2 | 3 | 10 |
| 3 | 1 | 30 |
| 3 | 2 | 10 |
| 3 | 3 | 20 |
------------------------------
I would like to delete all rows where Event = 20, and then update Obs so that the values within each CountryID are still consecutive, starting from 1.
For example if I run SELECT * FROM T WHERE Event != 20, I would get
------------------------------
| CountryID | Obs | Event |
------------------------------
| 1 | 1 | 10 |
| 1 | 3 | 30 |
| 2 | 2 | 30 |
| 2 | 3 | 10 |
| 3 | 1 | 30 |
| 3 | 2 | 10 |
------------------------------
but instead I want
------------------------------
| CountryID | Obs | Event |
------------------------------
| 1 | 1 | 10 |
| 1 | 2 | 30 |
| 2 | 1 | 30 |
| 2 | 2 | 10 |
| 3 | 1 | 30 |
| 3 | 2 | 10 |
------------------------------
What query do I need to achieve this?
First, in SQLite, there is a pseudo-column called rowid that uniquely identifies each row. You can do what you want by using a correlated subquery:
update t
set obs = (select count(*)
from t t2
where t2.countryid = t.countryid and t2.rowid <= t.rowid
);
That said, this is quite inefficient and shouldn't be run on anything other than baby tables. If this is an operation that you regularly want to do, you might consider a more powerful database than SQLite.
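For reference, here is a small sketch of the whole sequence (delete, then renumber) run against an in-memory SQLite database from Python. It assumes the table has no declared primary key and that rowid order matches the original Obs order within each country, which holds here only because the rows are inserted in that order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (countryid INTEGER, obs INTEGER, event INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 1, 10), (1, 2, 20), (1, 3, 30),
     (2, 1, 20), (2, 2, 30), (2, 3, 10),
     (3, 1, 30), (3, 2, 10), (3, 3, 20)],
)

conn.execute("DELETE FROM t WHERE event = 20")

# New obs = number of surviving rows in the same country up to and including this one.
conn.execute("""
    UPDATE t
    SET obs = (SELECT COUNT(*)
               FROM t AS t2
               WHERE t2.countryid = t.countryid AND t2.rowid <= t.rowid)
""")

remaining = conn.execute(
    "SELECT countryid, obs, event FROM t ORDER BY countryid, obs"
).fetchall()
print(remaining)
```

The subquery only reads countryid and rowid, neither of which the update changes, so the counts stay stable while the rows are being rewritten.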

Aggregate values by parents recursively in Oracle

Consider the following example structure:
DEPARTMENT
    ID
    PARENT_ID
    NAME
    DEPTH
PROJECT
    ID
    NAME
    COST
    DEPARTMENT_ID
Some data, just for the sake of the examples below:
| ID | PARENT_ID | NAME | DEPTH |
|----|-----------|-------|-------|
| 1 | NULL | DEPT1 | 1 |
| 2 | 1 | DEPT2 | 2 |
| 3 | 1 | DEPT3 | 2 |
| 4 | 2 | DEPT4 | 3 |
| 5 | 3 | DEPT5 | 3 |
| 6 | NULL | DEPT6 | 1 |
| 7 | 6 | DEPT7 | 2 |
| ID | NAME | COST | DEPARTMENT_ID |
|------|--------|-------|---------------|
| 1 | PRJ1 | 100 | 1 |
| 2 | PRJ2 | 200 | 2 |
| 3 | PRJ3 | 300 | 3 |
| 4 | PRJ4 | 400 | 4 |
| 5 | PRJ5 | 500 | 5 |
| 6 | PRJ6 | 600 | 6 |
| 7 | PRJ7 | 700 | 7 |
Now, I need to aggregate the costs of the projects for one department and then for each of its direct children.
If the chosen filter is DEPT1, the intended result is:
| LINE | DEPARTMENT_ID | PARENT_ID | NAME | AGGREGATE_COST |
|------|----------------|-----------|--------|----------------|
| 1 | 1 | NULL | DEPT1 | 1500 |
| 2 | 2 | 1 | DEPT2 | 600 |
| 3 | 3 | 1 | DEPT3 | 800 |
Where:
Line 3 aggregate is PRJ5 (of DEPT5, which is a child of DEPT3) + PRJ3 (of DEPT3) cost
Line 2 aggregate is PRJ4 (of DEPT4, which is a child of DEPT2) + PRJ2 (of DEPT2) cost
Line 1 aggregate is the sum of its children's aggregates plus its own PRJ1 cost.
PRJ6 and PRJ7 costs are ignored because they are from DEPT6 and DEPT7, which are not in the hierarchy of DEPT1 (DEPT6 is its sibling, not its child)
EDIT:
| ID | NAME | COST | DEPARTMENT_ID |
|------|--------|-------|---------------|
| 1 | PRJ1 | 1 | 1 |
| 2 | PRJ2 | 1 | 1 |
| 3 | PRJ3 | 1 | 2 |
| 4 | PRJ4 | 1 | 2 |
| 5 | PRJ5 | 1 | 4 |
In this scenario, the solution ivanzg presented doesn't seem to work.
I get doubled results for the projects in the higher ranks.
If I get the aggregate for DEPT1, it returns something similar to this:
| LINE | DEPARTMENT_ID | PARENT_ID | NAME | AGGREGATE_COST |
|------|----------------|-----------|--------|----------------|
| 1 | 1 | NULL | DEPT1 | 8 |
| 2 | 2 | NULL | DEPT1 | 4 |
You can tag rows in a hierarchy query (to later create groups) by using the CONNECT_BY_ROOT hierarchy operator. Leaving out START WITH makes every row a root row, so the hierarchy query produces every root/descendant combination; the outer query then keeps only the requested roots and aggregates them. For your test data this returns what you specified.
SELECT ROOT_DEPT AS DEPARTMENT_ID
      ,ROOT_PARENT AS PARENT_ID
      ,ROOT_NAME AS NAME
      ,SUM(COST) AS AGGREGATE_COST
FROM (SELECT COST
            ,CONNECT_BY_ROOT DEPARTMENT_ID ROOT_DEPT
            ,CONNECT_BY_ROOT PARENT_ID ROOT_PARENT
            ,CONNECT_BY_ROOT NAME ROOT_NAME
      FROM (SELECT B.DEPARTMENT_ID
                  ,NVL(A.PARENT_ID,'0') PARENT_ID
                  ,A.NAME
                  ,SUM(B.COST) COST
            FROM DEPARTMENT A
            JOIN PROJECT B
              ON A.ID = B.DEPARTMENT_ID
            --> GROUP COST OF PROJECTS IN THE SAME DEPARTMENT IF THERE ARE ANY
            GROUP BY B.DEPARTMENT_ID
                    ,NVL(A.PARENT_ID,'0')
                    ,A.NAME
           )
      --> MAKE ALL ROWS ROOT ROWS
      CONNECT BY PRIOR DEPARTMENT_ID = PARENT_ID
     )
WHERE ROOT_DEPT = 1 OR ROOT_PARENT = 1
GROUP BY ROOT_DEPT
        ,ROOT_PARENT
        ,ROOT_NAME
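The same "every row is a root" idea can be tried without an Oracle instance. The sketch below swaps CONNECT BY for a recursive CTE and runs it on an in-memory SQLite database from Python, so it illustrates the technique rather than reproducing the Oracle query above. The CTE name `tree` and its columns are hypothetical. It walks the DEPARTMENT table on its own and joins PROJECT afterwards, on the assumption that this keeps departments without projects in the hierarchy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (id INTEGER, parent_id INTEGER, name TEXT);
    CREATE TABLE project (id INTEGER, name TEXT, cost INTEGER, department_id INTEGER);
    INSERT INTO department VALUES
        (1, NULL, 'DEPT1'), (2, 1, 'DEPT2'), (3, 1, 'DEPT3'), (4, 2, 'DEPT4'),
        (5, 3, 'DEPT5'), (6, NULL, 'DEPT6'), (7, 6, 'DEPT7');
    INSERT INTO project VALUES
        (1, 'PRJ1', 100, 1), (2, 'PRJ2', 200, 2), (3, 'PRJ3', 300, 3),
        (4, 'PRJ4', 400, 4), (5, 'PRJ5', 500, 5), (6, 'PRJ6', 600, 6),
        (7, 'PRJ7', 700, 7);
""")

aggregates = conn.execute("""
    WITH RECURSIVE tree(root_id, id) AS (
        -- every department starts as the root of its own subtree
        SELECT id, id FROM department
        UNION ALL
        -- then each subtree is extended one level down at a time
        SELECT tree.root_id, d.id
        FROM tree JOIN department d ON d.parent_id = tree.id
    )
    SELECT r.id, r.parent_id, r.name, COALESCE(SUM(p.cost), 0) AS aggregate_cost
    FROM tree
    JOIN department r ON r.id = tree.root_id
    LEFT JOIN project p ON p.department_id = tree.id
    WHERE r.id = 1 OR r.parent_id = 1
    GROUP BY r.id, r.parent_id, r.name
    ORDER BY r.id
""").fetchall()
print(aggregates)
```

Because each project is joined once per (root, descendant) pair, several projects in the same department are summed rather than duplicated.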

SQL compare multiple rows or partitions to find matches

The database I'm working on is DB2 and I have a problem similar to the following scenario:
Table Structure
-------------------------------
| Teacher Seating Arrangement |
-------------------------------
| PK | seat_argmt_id |
| | teacher_id |
-------------------------------
-----------------------------
| Seating Arrangement |
-----------------------------
|PK FK | seat_argmt_id |
|PK | Row_num |
|PK | seat_num |
|PK | child_name |
-----------------------------
Table Data
------------------------------
| Teacher Seating Arrangement|
------------------------------
| seat_argmt_id | teacher_id |
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 4 | 1 |
| 5 | 2 |
------------------------------
---------------------------------------------------
| Seating Arrangement |
---------------------------------------------------
| seat_argmt_id | row_num | seat_num | child_name |
| 1 | 1 | 1 | Abe |
| 1 | 1 | 2 | Bob |
| 1 | 1 | 3 | Cat |
| | | | |
| 2 | 1 | 1 | Abe |
| 2 | 1 | 2 | Bob |
| 2 | 1 | 3 | Cat |
| | | | |
| 3 | 1 | 1 | Abe |
| 3 | 1 | 2 | Cat |
| 3 | 1 | 3 | Bob |
| | | | |
| 4 | 1 | 1 | Abe |
| 4 | 1 | 2 | Bob |
| 4 | 1 | 3 | Cat |
| 4 | 2 | 2 | Dan |
---------------------------------------------------
I want to see where there are duplicate seating arrangements for a teacher. And by duplicates I mean where the row_num, seat_num, and child_name are the same among different seat_argmt_id for one teacher_id. So with the data provided above, only seat id 1 and 2 are what I would want to pull back, as they are duplicates on everything but the seat id. If all the children on the 2nd table are exact (sans the primary & foreign key, which is seat_argmt_id in this case), I want to see that.
My initial thought was to do a count(*) group by row#, seat#, and child. Everything with a count of > 1 would mean it's a dupe and = 1 would mean it's unique. That logic only works if you are comparing single rows though. I need to compare multiple rows. I cannot figure out a way to do it via SQL. The solution I have involves going outside of SQL and works (probably). I'm just wondering if there is a way to do it in DB2.
Does this do what you want?
select d.teacher_id, sa.row_num, sa.seat_num, sa.child_name
from seatingarrangement sa join
data d
on sa.seat_argmt_id = d.seat_argmt_id
group by d.teacher_id, sa.row_num, sa.seat_num, sa.child_name
having count(*) > 1;
EDIT:
If you want to find two arrangements that are the same:
select sa1.seat_argmt_id, sa2.seat_argmt_id
from seatingarrangement sa1 join
     seatingarrangement sa2
     on sa1.seat_argmt_id < sa2.seat_argmt_id and
        sa1.row_num = sa2.row_num and
        sa1.seat_num = sa2.seat_num and
        sa1.child_name = sa2.child_name
group by sa1.seat_argmt_id, sa2.seat_argmt_id
having count(*) = (select count(*) from seatingarrangement sa where sa.seat_argmt_id = sa1.seat_argmt_id) and
       count(*) = (select count(*) from seatingarrangement sa where sa.seat_argmt_id = sa2.seat_argmt_id);
This finds the matches between two arrangements and then verifies that the counts are correct.
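To see the pairwise comparison at work on the sample data, here is a sketch that runs it on an in-memory SQLite database from Python (SQLite stands in for DB2 here). The table names `teacher_seating` and `seating` are hypothetical stand-ins for the two tables in the question, and the extra joins to `teacher_seating` are an addition that restricts matches to arrangements of the same teacher.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE teacher_seating (seat_argmt_id INTEGER, teacher_id INTEGER);
    CREATE TABLE seating (seat_argmt_id INTEGER, row_num INTEGER,
                          seat_num INTEGER, child_name TEXT);
    INSERT INTO teacher_seating VALUES (1, 1), (2, 1), (3, 1), (4, 1), (5, 2);
""")
conn.executemany("INSERT INTO seating VALUES (?, ?, ?, ?)", [
    (1, 1, 1, "Abe"), (1, 1, 2, "Bob"), (1, 1, 3, "Cat"),
    (2, 1, 1, "Abe"), (2, 1, 2, "Bob"), (2, 1, 3, "Cat"),
    (3, 1, 1, "Abe"), (3, 1, 2, "Cat"), (3, 1, 3, "Bob"),
    (4, 1, 1, "Abe"), (4, 1, 2, "Bob"), (4, 1, 3, "Cat"), (4, 2, 2, "Dan"),
])

# Count matching seats per pair of arrangements, then require that the match
# count equals the full size of both arrangements.
duplicates = conn.execute("""
    SELECT t1.teacher_id, sa1.seat_argmt_id, sa2.seat_argmt_id
    FROM seating sa1
    JOIN seating sa2
      ON sa1.seat_argmt_id < sa2.seat_argmt_id
     AND sa1.row_num = sa2.row_num
     AND sa1.seat_num = sa2.seat_num
     AND sa1.child_name = sa2.child_name
    JOIN teacher_seating t1 ON t1.seat_argmt_id = sa1.seat_argmt_id
    JOIN teacher_seating t2 ON t2.seat_argmt_id = sa2.seat_argmt_id
                           AND t2.teacher_id = t1.teacher_id
    GROUP BY t1.teacher_id, sa1.seat_argmt_id, sa2.seat_argmt_id
    HAVING COUNT(*) = (SELECT COUNT(*) FROM seating s
                       WHERE s.seat_argmt_id = sa1.seat_argmt_id)
       AND COUNT(*) = (SELECT COUNT(*) FROM seating s
                       WHERE s.seat_argmt_id = sa2.seat_argmt_id)
""").fetchall()
print(duplicates)
```

Arrangement 4 shares all three seats with arrangements 1 and 2 but has a fourth seat, so the second count check is what rules it out.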