Display titles which are missing association values partially or completely - sql

I am having trouble writing a SQL query for the following scenario and would appreciate some help.
I have the following 7 tables:
1) Titles
ID Title Author
-------------------------------------------------------------------------
1 The Hidden Language of Computer Hardware and Software Charles Petzold
2 Paths, Dangers, Strategies Nick Bostrom
3 The Smart Girl's Guide to Privacy Violet Blue
4 Introduction to Algorithms Thomas H. Cormen
5 Machine Learning in Action Peter Harrington
...
2) Themes
ID Name
------------------------------------------
1 Science Fiction
2 Biography
3 Painting
...
3) Subjects
ID Name
-----------------------------------
1 Science
2 Technology
3 Music
4 Geography
...
4) Grades
ID Name
------------------------------------
1 Grade 1
2 Grade 2
3 Grade 3
4 Grade 4
5 Grade 5
...
5) TitleThemeAssociation
TitleID ThemeID
------------------------------------------
1 1
1 3
4 2
4 3
...
6) TitleSubjectAssociaton
TitleID SubjectID
---------------------------------
1 1
1 3
2 1
2 3
4 1
4 2
...
7) TitleGradeAssociaton
TitleID GradeID
1 1
1 2
1 3
2 1
2 2
...
I need to write a query that displays only the titles that are missing any of the three values (Themes, Subjects, Grades), or that have no values assigned at all. A title should not be displayed if all three values are assigned. In the above data set, TitleID 1 has all three values, so it should not appear in the list. TitleID 2 has Subjects and Grades assigned but no Themes, so it should be displayed in the output. If a title has multiple values, they should be concatenated with a comma (,) separator.
So the final output of the above data set should be as below:
Output:
Title ID  Title                              Theme                Subject              Grade
--------  ---------------------------------  -------------------  -------------------  ----------------
2         Paths, Dangers, Strategies         -                    Science, Music       Grade 1, Grade 2
3         The Smart Girl's Guide to Privacy  -                    -                    -
4         Introduction to Algorithms         Biography, Painting  Science, Technology  -
5         Machine Learning in Action         -                    -                    -

There are essentially two questions here. The first is how to filter for titles where a Theme, Subject, or Grade is missing. The second is how to concatenate these items into a comma-separated list.
The following query should be what you're looking for:
Select
    T.Id As [Title ID],
    T.Title,
    H.Theme,
    S.Subject,
    G.Grade
From Titles T
Outer Apply
(
    -- Comma-separated theme names for this title; NULL when none are assigned
    Select Stuff(( Select ', ' + H.Name
                   From Themes H
                   Join TitleThemeAssociation TH On H.Id = TH.ThemeId
                   Where TH.TitleId = T.Id
                   For Xml Path('')), 1, 2, '') As Theme
) H
Outer Apply
(
    -- Comma-separated subject names for this title; NULL when none are assigned
    Select Stuff(( Select ', ' + S.Name
                   From Subjects S
                   Join TitleSubjectAssociaton TS On S.Id = TS.SubjectId
                   Where TS.TitleId = T.Id
                   For Xml Path('')), 1, 2, '') As Subject
) S
Outer Apply
(
    -- Comma-separated grade names for this title; NULL when none are assigned
    Select Stuff(( Select ', ' + G.Name
                   From Grades G
                   Join TitleGradeAssociaton TG On G.Id = TG.GradeId
                   Where TG.TitleId = T.Id
                   For Xml Path('')), 1, 2, '') As Grade
) G
-- Keep only titles that are missing at least one of the three lists
Where H.Theme Is Null
   Or S.Subject Is Null
   Or G.Grade Is Null
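To try the same "concatenate, then filter on NULL" idea outside SQL Server, here is a minimal sketch using Python's built-in sqlite3 module. It is not the query above: SQLite has no FOR XML PATH, so GROUP_CONCAT is swapped in for the concatenation, and the in-memory tables are just the sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Titles (ID INTEGER PRIMARY KEY, Title TEXT);
CREATE TABLE Themes (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Subjects (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Grades (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE TitleThemeAssociation (TitleID INTEGER, ThemeID INTEGER);
CREATE TABLE TitleSubjectAssociaton (TitleID INTEGER, SubjectID INTEGER);
CREATE TABLE TitleGradeAssociaton (TitleID INTEGER, GradeID INTEGER);

INSERT INTO Titles VALUES
  (1, 'The Hidden Language of Computer Hardware and Software'),
  (2, 'Paths, Dangers, Strategies'),
  (3, 'The Smart Girl''s Guide to Privacy'),
  (4, 'Introduction to Algorithms'),
  (5, 'Machine Learning in Action');
INSERT INTO Themes VALUES (1, 'Science Fiction'), (2, 'Biography'), (3, 'Painting');
INSERT INTO Subjects VALUES (1, 'Science'), (2, 'Technology'), (3, 'Music'), (4, 'Geography');
INSERT INTO Grades VALUES (1, 'Grade 1'), (2, 'Grade 2'), (3, 'Grade 3'), (4, 'Grade 4'), (5, 'Grade 5');
INSERT INTO TitleThemeAssociation VALUES (1, 1), (1, 3), (4, 2), (4, 3);
INSERT INTO TitleSubjectAssociaton VALUES (1, 1), (1, 3), (2, 1), (2, 3), (4, 1), (4, 2);
INSERT INTO TitleGradeAssociaton VALUES (1, 1), (1, 2), (1, 3), (2, 1), (2, 2);
""")

# Each correlated subquery builds one comma-separated list per title.
# GROUP_CONCAT returns NULL when there are no rows, which drives the filter.
query = """
SELECT * FROM (
    SELECT t.ID, t.Title,
           (SELECT GROUP_CONCAT(h.Name, ', ')
              FROM Themes h JOIN TitleThemeAssociation a ON a.ThemeID = h.ID
             WHERE a.TitleID = t.ID) AS Theme,
           (SELECT GROUP_CONCAT(s.Name, ', ')
              FROM Subjects s JOIN TitleSubjectAssociaton a ON a.SubjectID = s.ID
             WHERE a.TitleID = t.ID) AS Subject,
           (SELECT GROUP_CONCAT(g.Name, ', ')
              FROM Grades g JOIN TitleGradeAssociaton a ON a.GradeID = g.ID
             WHERE a.TitleID = t.ID) AS Grade
      FROM Titles t
)
WHERE Theme IS NULL OR Subject IS NULL OR Grade IS NULL
ORDER BY ID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Title 1 is filtered out because all three lists are non-NULL; titles 2 to 5 remain. The order of names inside each concatenated list is not guaranteed by SQLite.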

Hope this helps.
;WITH cte_Titles (ID,Title,Author) AS
(
SELECT 1,'The Hidden Language of Computer Hardware and Software','Charles Petzold' UNION ALL
SELECT 2,'Paths, Dangers, Strategies','Nick Bostrom' UNION ALL
SELECT 3,'The Smart Girls Guide to Privacy','Violet Blue' UNION ALL
SELECT 4,'Introduction to Algorithms','Thomas H. Cormen' UNION ALL
SELECT 5,'Machine Learning in Action','Peter Harrington'
),cte_Themes(ID,Name) AS
(
SELECT 1,'Science Fiction' UNION ALL
SELECT 2,'Biography' UNION ALL
SELECT 3,'Painting'
),cte_Subjects(ID,Name) AS
(
SELECT 1,'Science' UNION ALL
SELECT 2,'Technology' UNION ALL
SELECT 3,'Music' UNION ALL
SELECT 4,'Geography'
),cte_Grades(ID,Name) AS
(
SELECT 1,'Grade 1' UNION ALL
SELECT 2,'Grade 2' UNION ALL
SELECT 3,'Grade 3' UNION ALL
SELECT 4,'Grade 4' UNION ALL
SELECT 5,'Grade 5'
),cte_TitleThemeAssociation(TitleID,ThemeID) AS
(
SELECT 1,1 UNION ALL
SELECT 1,3 UNION ALL
SELECT 4,2 UNION ALL
SELECT 4,3
),cte_TitleSubjectAssociaton(TitleID,SubjectID) AS
(
SELECT 1, 1 UNION ALL
SELECT 1, 3 UNION ALL
SELECT 2, 1 UNION ALL
SELECT 2, 3 UNION ALL
SELECT 4, 1 UNION ALL
SELECT 4, 2
),cte_TitleGradeAssociaton(TitleID,GradeID) AS
(
SELECT 1, 1 UNION ALL
SELECT 1, 2 UNION ALL
SELECT 1, 3 UNION ALL
SELECT 2, 1 UNION ALL
SELECT 2, 2
)
,cte_ResultSet AS
(
SELECT DISTINCT t.ID AS TitleID,
t.Title,
th.NAME AS Theme,
s.NAME AS Subject,
g.NAME AS Grade
FROM cte_Titles t
LEFT JOIN cte_TitleThemeAssociation tta
ON t.ID = tta.TitleID
LEFT JOIN cte_Themes th
ON tta.ThemeID = th.ID
LEFT JOIN cte_TitleSubjectAssociaton tsa
ON tsa.TitleID = t.ID
LEFT JOIN cte_Subjects s
ON tsa.SubjectID = s.ID
LEFT JOIN cte_TitleGradeAssociaton tga
ON tga.TitleID = t.ID
LEFT JOIN cte_Grades g
ON g.ID = tga.GradeID
)
SELECT DISTINCT Title
, STUFF((SELECT DISTINCT ',' + SUB.Theme AS [text()]
FROM cte_ResultSet SUB
WHERE SUB.TitleID = CAT.TitleID
FOR XML PATH('')
), 1, 1, '' ) AS Theme
, STUFF((SELECT DISTINCT ',' + SUB.Subject AS [text()]
FROM cte_ResultSet SUB
WHERE SUB.TitleID = CAT.TitleID
FOR XML PATH('')
), 1, 1, '' ) AS Subject
, STUFF((SELECT DISTINCT ',' + SUB.Grade AS [text()]
FROM cte_ResultSet SUB
WHERE SUB.TitleID = CAT.TitleID
FOR XML PATH('')
), 1, 1, '' ) AS Grade
FROM cte_ResultSet CAT

Related

Oracle Finding a string match from multiple database tables

This is a somewhat complex problem to describe, but I'll try to explain it with an example. I thought I would be able to use the Oracle INSTR function to accomplish this, but it does not accept queries as parameters.
Here is a simplification of my data:
Table1
Person Qualities
Joe 5,6,7,8,9
Mary 7,8,10,15,20
Bob 7,8,9,10,11,12
Table2
Id Desc
5 Nice
6 Tall
7 Short
Table3
Id Desc
8 Angry
9 Sad
10 Fun
Table4
Id Desc
11 Boring
12 Happy
15 Cool
20 Mad
Here is somewhat of a query to give an idea of what I'm trying to accomplish:
select * from table1
where instr (Qualities, select Id from table2, 1,1) <> 0
and instr (Qualities, select Id from table3, 1,1) <> 0
and instr (Qualities, select Id from table4, 1,1) <> 0
I'm trying to figure out which people have at least 1 quality from each of the 3 groups of qualities (tables 2, 3, and 4).
So Joe would not be returned in the results because he does not have a quality from each of the 3 groups, but Mary and Bob would, since they have at least 1 quality from each group.
We are running Oracle 12, thanks!
Here's one option:
SQL> with
2 table1 (person, qualities) as
3 (select 'Joe', '5,6,7,8,9' from dual union all
4 select 'Mary', '7,8,10,15,20' from dual union all
5 select 'Bob', '7,8,9,10,11,12' from dual
6 ),
7 table2 (id, descr) as
8 (select 5, 'Nice' from dual union all
9 select 6, 'Tall' from dual union all
10 select 7, 'Short' from dual
11 ),
12 table3 (id, descr) as
13 (select 8, 'Angry' from dual union all
14 select 9, 'Sad' from dual union all
15 select 10, 'Fun' from dual
16 ),
17 table4 (id, descr) as
18 (select 11, 'Boring' from dual union all
19 select 12, 'Happy' from dual union all
20 select 15, 'Cool' from dual union all
21 select 20, 'Mad' from dual
22 ),
23 t1new (person, id) as
24 (select person, regexp_substr(qualities, '[^,]+', 1, column_value) id
25 from table1 cross join table(cast(multiset(select level from dual
26 connect by level <= regexp_count(qualities, ',') + 1
27 ) as sys.odcinumberlist))
28 )
29 select a.person,
30 count(b.id) bid,
31 count(c.id) cid,
32 count(d.id) did
33 from t1new a left join table2 b on a.id = b.id
34 left join table3 c on a.id = c.id
35 left join table4 d on a.id = d.id
36 group by a.person
37 having ( count(b.id) > 0
38 and count(c.id) > 0
39 and count(d.id) > 0
40 );
PERS BID CID DID
---- ---------- ---------- ----------
Bob 1 3 2
Mary 1 2 2
SQL>
What does it do?
lines #1 - 22 represent your sample data
the T1NEW CTE (lines #23 - 28) splits the comma-separated qualities into rows, one row per quality for every person
the final select (lines #29 - 40) outer joins t1new with each of the "description" tables (table2/3/4) and counts, per person, how many of that person's qualities (the rows from t1new) are found in each table
the having clause returns only the desired persons; each of those counts has to be a positive number
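The split-then-count logic described above can be sketched in plain Python. This is only an illustration of the idea, not Oracle code; the dictionaries hold the sample data from the question, and the names `split_ids`, `groups`, `counts` and `matches` are made up for this sketch.

```python
# Sample data from the question: person -> comma-separated quality ids
people = {
    "Joe": "5,6,7,8,9",
    "Mary": "7,8,10,15,20",
    "Bob": "7,8,9,10,11,12",
}

# Quality ids from table2, table3 and table4
groups = [{5, 6, 7}, {8, 9, 10}, {11, 12, 15, 20}]


def split_ids(csv):
    """Turn '5,6,7' into {5, 6, 7} (the job of the T1NEW CTE)."""
    return {int(part) for part in csv.split(",") if part.strip()}


# Per person, how many qualities fall into each group (BID, CID, DID)
counts = {
    name: [len(split_ids(csv) & group) for group in groups]
    for name, csv in people.items()
}

# The HAVING clause: every count must be positive
matches = sorted(name for name, c in counts.items() if all(n > 0 for n in c))

print(counts["Bob"], counts["Mary"], counts["Joe"])  # [1, 3, 2] [1, 2, 2] [3, 2, 0]
print(matches)  # ['Bob', 'Mary']
```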
Maybe this will help:
{1} Create a view that categorises all qualities and allows you to SELECT quality IDs and categories. {2} JOIN the view to TABLE1, using a join condition that "splits" the CSV value stored in TABLE1.
{1} View
create or replace view allqualities
as
select 1 as category, id as qid, descr from table2
union
select 2, id, descr from table3
union
select 3, id, descr from table4
;
select * from allqualities order by category, qid ;
CATEGORY QID DESCR
---------- ---------- ------
1 5 Nice
1 6 Tall
1 7 Short
2 8 Angry
2 9 Sad
2 10 Fun
3 11 Boring
3 12 Happy
3 15 Cool
3 20 Mad
{2} Query
-- JOIN CONDITION:
-- {1} add a comma at the start and at the end of T1.qualities
-- {2} remove all blanks (spaces) from T1.qualities
-- {3} use LIKE and the qid (of allqualities), wrapped in commas
--
-- inline view: use UNIQUE, otherwise we may get counts > 3
--
select person
from (
select unique person, category
from table1 T1
join allqualities A
on ',' || replace( T1.qualities, ' ', '' ) || ',' like '%,' || A.qid || ',%'
)
group by person
having count(*) = ( select count( distinct category ) from allqualities )
;
-- result
PERSON
Bob
Mary
Tested w/ Oracle 18c and 11g. DBfiddle here.
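The reason the join condition wraps both sides in commas can be seen in a small Python sketch. The function names `naive_match` and `has_quality` are hypothetical and only mirror the LIKE expression used in the query.

```python
def naive_match(qualities, qid):
    """Plain substring test: wrongly matches 1 inside 10 or 15."""
    return str(qid) in qualities


def has_quality(qualities, qid):
    """Mirror of: ',' || replace(qualities, ' ', '') || ',' like '%,' || qid || ',%'"""
    wrapped = "," + qualities.replace(" ", "") + ","
    return f",{qid}," in wrapped


mary = "7,8,10,15,20"
print(naive_match(mary, 1))   # True  (false positive from 10 and 15)
print(has_quality(mary, 1))   # False
print(has_quality(mary, 20))  # True  (last element still matches)
print(has_quality("7, 8,10", 8))  # True (blanks are removed first)
```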

Using the results of a STRING_AGG function with the IN operator in a WHERE clause

I have a column children_ids which contains PKs produced by a STRING_AGG function. I am trying to use this column in a WHERE clause with the IN operator to return total_pets, but it doesn't work. If I copy and paste the values directly into the IN operator, the query returns the correct info; otherwise no results are found.
Here are my data sets:
Parents
=======
id parent_name
----------------
1 Bob and Mary
2 Mick and Jo
Children
========
id child_name parent_id
-------------------------
1 Eddie 1
2 Frankie 1
3 Robbie 1
4 Duncan 2
5 Rick 2
6 Jen 2
Childrens Pets
===============
id pet_name child_id
-------------------------
1 Puppy 1
2 Piggy 2
3 Monkey 3
4 Lamb 4
5 Tiger 5
6 Bear 6
7 Zebra 6
Expected Output
===============
parent_id children_ids total_pets
-----------------------------------
1 1,2,3 3
2 4,5,6 4
Current [undesired] Output
==========================
parent_id children_ids total_pets
-----------------------------------
1 1,2,3 0
2 4,5,6 0
Here is the standard SQL to test for yourself:
# setup data with standardSQL
WITH `parents` AS (
SELECT 1 id, 'Bob and Mary' parent_names UNION ALL
SELECT 2, 'Mick and Jo'
),
`children` AS (
SELECT 1 id, 'Eddie' child_name, 1 parent_id UNION ALL
SELECT 2, 'Frankie', 1 UNION ALL
SELECT 3, 'Robbie', 1 UNION ALL
SELECT 4, 'Duncan', 2 UNION ALL
SELECT 5, 'Rick', 2 UNION ALL
SELECT 6, 'Jen', 2
),
`childrens_pets` AS (
SELECT 1 id, 'Puppy' pet_name, 1 child_id UNION ALL
SELECT 2, 'Piggy', 2 UNION ALL
SELECT 3, 'Monkey', 3 UNION ALL
SELECT 4, 'Lamb', 4 UNION ALL
SELECT 5, 'Tiger', 5 UNION ALL
SELECT 6, 'Bear', 6 UNION ALL
SELECT 7, 'Zebra', 6
)
And the query:
#standardSQL
select
parent_id
, children_ids
-- !!! This keeps returning 0 instead of the total pets for each parent based on their children
, (
select count(p1.id)
from childrens_pets p1
where cast(p1.child_id as string) in (children_ids)
) as total_pets
from
(
SELECT
p.id as parent_id
, (
select string_agg(cast(c1.id as string))
from children as c1
where c1.parent_id = p.id
) as children_ids
FROM parents as p
join children as c
on p.id = c.parent_id
join childrens_pets as cp
on cp.child_id = c.id
)
GROUP BY
parent_id
, children_ids
... but is there a way to do it using the IN operator as my query ...
Just fix one line and it will work for you!
Replace
WHERE CAST(p1.child_id AS STRING) IN (children_ids)
with
WHERE CAST(p1.child_id AS STRING) IN (SELECT * FROM UNNEST(SPLIT(children_ids)))
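Why the original line returns 0 can be illustrated with a short Python analogy (an analogy only, not BigQuery code): `IN (children_ids)` compares each child id against one single string such as '1,2,3', whereas splitting the string first yields a list of separate values to compare against. The function names are made up for this sketch; the data is from the question.

```python
# parent_id -> aggregated string, as produced by STRING_AGG
children_ids = {1: "1,2,3", 2: "4,5,6"}

# (pet_id, child_id) rows from childrens_pets
pets = [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6), (7, 6)]


def total_pets_single_value(ids):
    """Like IN (children_ids): the list has ONE element, the whole string."""
    return sum(1 for _, child in pets if str(child) in [ids])


def total_pets_split(ids):
    """Like IN (SELECT * FROM UNNEST(SPLIT(children_ids)))."""
    wanted = ids.split(",")
    return sum(1 for _, child in pets if str(child) in wanted)


for parent_id, ids in children_ids.items():
    print(parent_id, ids, total_pets_single_value(ids), total_pets_split(ids))
# 1 1,2,3 0 3
# 2 4,5,6 0 4
```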
Huh? This would seem to do what you want:
SELECT p.id as parent_id,
       string_agg(distinct cast(c.id as string)) as children_ids,
       count(distinct cp.id) as num_pets
FROM parents p JOIN
     children c
     ON p.id = c.parent_id JOIN
     childrens_pets cp
     ON cp.child_id = c.id
GROUP BY parent_id;

SQL Server Create Grouping For Related Records

I'm running into an interesting scenario trying to assign an arbitrary FamilyId to fields that are related to each other.
Here is the structure that we're currently working with:
DataId OriginalDataId
3 1
4 1
5 1
6 1
3 2
4 2
5 2
6 2
7 10
8 10
9 10
11 15
What we're attempting to do is add a FamilyId column to all DataIds that have a relationship between each other.
In this case, Id's 3, 4, 5, and 6 have a relationship to 1. But 3, 4, 5, and 6 also have a relationship with 2. So 1, 2, 3, 4, 5, and 6 should all be considered to be in the same FamilyId.
7, 8, and 9 only have a relationship to 10, which puts this into a separate FamilyId. Same for 11 and 15.
What I am expecting as a result from this are the following results:
DataId FamilyId
1 1
2 1
3 1
4 1
5 1
6 1
7 2
8 2
9 2
10 2
11 3
15 3
Sample data, structure, and queries:
Declare @Results_Stage Table
(
    DataId BigInt Not Null,
    OriginalDataId BigInt Null
)
Insert @Results_Stage
Values (3,1), (4,1), (5,1), (6,1), (3,2), (4,2), (5,2), (6,2), (7,10), (8, 10), (9, 10), (11, 15)
Select DataId, Row_Number() Over(Partition By DataId Order By OriginalDataId Asc) FamilyId
From @Results_Stage R
Union
Select OriginalDataId, Row_Number() Over(Partition By DataId Order By OriginalDataId Asc) FamilyId
From @Results_Stage
I'm positive my attempt is nowhere near correct, but I'm honestly not sure where to even start on this -- or if it's even possible in SQL Server.
Does anyone have an idea on how to tackle this issue, or at least, something to point me in the right direction?
Edit: Below is a query I've come up with so far to identify the other DataId records that should belong to the same FamilyId:
Declare @DataId BigInt = 1
;With Children As
(
    Select Distinct X.DataId
    From @Results_Stage S
    Outer Apply
    (
        Select Distinct DataId
        From @Results_Stage R
        Where R.OriginalDataId = S.DataId
           Or R.OriginalDataId = S.OriginalDataId
    ) X
    Where S.DataId = @DataId
       Or S.OriginalDataId = @DataId
)
Select Distinct O.OriginalDataId
From Children C
Outer Apply
(
    Select S.OriginalDataId
    From @Results_Stage S
    Where S.DataId = C.DataId
) O
Union
Select DataId
From Children
The following query, which employs FOR XML PATH:
SELECT R.OriginalDataId,
       STUFF((
           SELECT ', ' + CAST([DataId] AS VARCHAR(MAX))
           FROM @Results_Stage
           WHERE (OriginalDataId = R.OriginalDataId)
           FOR XML PATH(''),TYPE).value('(./text())[1]','VARCHAR(MAX)')
       ,1,2,'') AS GroupValues
FROM @Results_Stage R
GROUP BY R.OriginalDataId
can be used to produce this output:
OriginalDataId GroupValues
===========================
1 3, 4, 5, 6
2 3, 4, 5, 6
10 7, 8, 9
15 11
Using the above result set, we can easily identify each group and thus have something upon which DENSE_RANK() can be applied:
;WITH GroupedData AS (
    -- One row per OriginalDataId with the list of its DataIds
    SELECT R.OriginalDataId,
           STUFF((
               SELECT ', ' + CAST([DataId] AS VARCHAR(MAX))
               FROM @Results_Stage
               WHERE (OriginalDataId = R.OriginalDataId)
               FOR XML PATH(''),TYPE).value('(./text())[1]','VARCHAR(MAX)')
           ,1,2,'') AS GroupValues
    FROM @Results_Stage R
    GROUP BY R.OriginalDataId
), Families AS (
    -- Identical lists get the same rank, which becomes the FamilyId
    SELECT OriginalDataId, DENSE_RANK() OVER (ORDER BY GroupValues) AS FamilyId
    FROM GroupedData
)
SELECT OriginalDataId AS DataId, FamilyId
FROM Families
UNION
SELECT DataId, F.FamilyId
FROM @Results_Stage R
INNER JOIN Families F ON R.OriginalDataId = F.OriginalDataId
ORDER BY FamilyId
Output from above is:
DataId FamilyId
===================
11 1
15 1
1 2
2 2
3 2
4 2
5 2
6 2
7 3
8 3
9 3
10 3
Check this ... it doesn't look too nice but is doing the job :)
DECLARE @T TABLE (DataId INT, OriginalDataId INT)
INSERT INTO @T(DataId , OriginalDataId)
select 3,1
union all select 4,1
union all select 5,1
union all select 6,1
union all select 3,2
union all select 4,2
union all select 5,2
union all select 6,2
union all select 7,10
union all select 8,10
union all select 9,10
union all select 11,15
SELECT * FROM @T
;WITH f AS (
    SELECT DISTINCT OriginalDataId FROM @T
)
, m AS (
    -- For each DataId, the smallest OriginalDataId it is linked to
    SELECT DISTINCT
        DataId , OriginalDataId = MIN(OriginalDataId)
    FROM @T
    GROUP BY DataId
)
, m2 AS (
    -- Links that are not the "minimum" link for their DataId
    SELECT DISTINCT
        x.DataId , x.OriginalDataId
    FROM @T AS x
    LEFT OUTER JOIN m ON x.DataId = m.DataId AND x.OriginalDataId = m.OriginalDataId
    WHERE m.DataId IS NULL
)
, m3 AS (
    -- Map the other OriginalDataIds onto the minimum one
    SELECT DISTINCT DataId = x.OriginalDataId , m.OriginalDataId
    FROM m2 AS x
    INNER JOIN m ON x.DataId = m.DataId
)
, m4 AS (
    SELECT DISTINCT
        DataId = OriginalDataId , OriginalDataId
    FROM @T
    WHERE OriginalDataId NOT IN(SELECT DataId FROM m3)
    UNION
    SELECT DISTINCT
        x.DataId , f.OriginalDataId
    FROM f
    INNER JOIN m AS x on x.OriginalDataId = f.OriginalDataId
    WHERE x.DataId NOT IN(SELECT DataId FROM m3)
    UNION
    SELECT DataId , OriginalDataId FROM m3
)
, list AS (
    SELECT
        x.DataId, FamilyId = DENSE_RANK() OVER(ORDER BY x.OriginalDataId )
    FROM m4 AS x
)
SELECT * FROM list
-- OUTPUT
DataId FamilyId
1 1
2 1
3 1
4 1
5 1
6 1
7 2
8 2
9 2
10 2
11 3
15 3
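For comparison, the grouping asked for in the question (every id that is linked, directly or indirectly, ends up in the same family) can be viewed as finding connected components. The following is a minimal Python sketch of that idea using a union-find structure. It is a different technique from the SQL answers above, not a translation of them, and the names `find`, `union` and `family` are chosen only for this illustration.

```python
# (DataId, OriginalDataId) rows from the question
pairs = [(3, 1), (4, 1), (5, 1), (6, 1), (3, 2), (4, 2), (5, 2), (6, 2),
         (7, 10), (8, 10), (9, 10), (11, 15)]

parent = {}  # id -> parent id; a root points to itself


def find(x):
    """Return the root of x's component, shortening the path on the way."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def union(a, b):
    """Merge the components of a and b; the smaller id stays the root."""
    ra, rb = find(a), find(b)
    if ra != rb:
        parent[max(ra, rb)] = min(ra, rb)


for data_id, original_id in pairs:
    union(data_id, original_id)

# Number the components 1, 2, 3, ... in order of their smallest member
roots = sorted({find(x) for x in list(parent)})
family_of_root = {root: number for number, root in enumerate(roots, start=1)}
family = {x: family_of_root[find(x)] for x in sorted(parent)}

for data_id, family_id in family.items():
    print(data_id, family_id)
```

On the sample rows this assigns family 1 to ids 1 to 6, family 2 to ids 7 to 10, and family 3 to ids 11 and 15, which is the result the question asks for.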

Use a CTE to traverse to 2nd level in tree

I'm trying to use a CTE to traverse a tree in SQL Server. Ideally what I would like as output is a table which shows for each node in the tree the corresponding node that is second from the top in the tree.
I have some basic code to traverse the tree from a given node, but how can I modify it so it produces the desired output ?
DECLARE @temp TABLE
(
    Id INT
    , Name VARCHAR(50)
    , Parent INT
)
INSERT @temp
SELECT 1,' Great GrandFather Thomas Bishop', null UNION ALL
SELECT 2,'Grand Mom Elian Thomas Wilson' , 1 UNION ALL
SELECT 3, 'Dad James Wilson',2 UNION ALL
SELECT 4, 'Uncle Michael Wilson', 2 UNION ALL
SELECT 5, 'Aunt Nancy Manor', 2 UNION ALL
SELECT 6, 'Grand Uncle Michael Bishop', 1 UNION ALL
SELECT 7, 'Brother David James Wilson',3 UNION ALL
SELECT 8, 'Sister Michelle Clark', 3 UNION ALL
SELECT 9, 'Brother Robert James Wilson', 3 UNION ALL
SELECT 10, 'Me Steve James Wilson', 3
;WITH cte AS
(
    -- Anchor: start at the given node
    SELECT Id, Name, Parent, 1 as Depth
    FROM @temp
    WHERE Id = 8
    UNION ALL
    -- Recursive step: walk up to the parent
    SELECT t2.*, Depth + 1 as 'Depth'
    FROM cte t
    JOIN @temp t2 ON t.Parent = t2.Id
)
SELECT *
    , MAX(Depth) OVER() - Depth + 1 AS InverseDepth
FROM cte
As output I would like something like
Id Name depth2_id depth2_name
8 Sister Michelle .. 2 Grand Mom Elian ....
7 Brother David .. 2 Grand Mom Elian ....
4 Uncle Michael .. 2 Grand Mom Elian ...
Thanks for any tips or pointers.
It's a bit hard to tell what your goal is, but you can use something like this:
;with cte AS
(
    -- Anchor: the root node(s), depth 1
    select
        t.Id, t.Name, t.Parent, 1 as Depth,
        null as Depth2Parent
    from @temp as t
    where t.Parent is null
    union all
    -- Recursive step: walk down; children of the root become the depth-2 node,
    -- and everything below inherits it
    select
        t.Id, t.Name, t.Parent, c.Depth + 1 as 'Depth',
        isnull(c.Depth2Parent, case when c.Depth = 1 then t.Id end) as Depth2Parent
    from cte as c
    inner join @temp as t on t.Parent = c.Id
)
select *
from cte
sql fiddle demo
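To check what the CTE above should return for each node, here is a small Python sketch that walks the parent pointers upwards instead of recursing downwards. It assumes the same sample tree as the question; `nodes` and `depth2_ancestor` are names made up for this illustration.

```python
# Id -> Parent, taken from the sample data (None marks the root)
nodes = {1: None, 2: 1, 3: 2, 4: 2, 5: 2, 6: 1, 7: 3, 8: 3, 9: 3, 10: 3}


def depth2_ancestor(node_id):
    """Return the ancestor (or the node itself) that sits directly below the root."""
    path = [node_id]
    while nodes[path[-1]] is not None:
        path.append(nodes[path[-1]])
    # path[-1] is the root; the element before it is the depth-2 node.
    # The root itself has no depth-2 ancestor.
    return path[-2] if len(path) >= 2 else None


for node_id in (8, 7, 4, 6, 2, 1):
    print(node_id, depth2_ancestor(node_id))
# 8 2
# 7 2
# 4 2
# 6 6
# 2 2
# 1 None
```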

Find Missing Pairs in SQL

Assume there's a relational database with 3 tables:
Courses {name, id},
Students {name, id},
Student_Course {student_id, course_id}
I want to write an SQL query that gives me the student-course pairs that do NOT exist. If that is not feasible, it would at least be good to know whether there are missing pairs or not.
Also, since this is a small part of a larger problem I'd like to automate, seeing many different ways of doing it would be useful.
First find all pairs, then remove the pairs that are present (either by left join/is null or by not exists):
select s.id as student_id, c.id as course_id
from Courses as c
cross join Students as s
left join Student_Course as sc on sc.student_id = s.id and sc.course_id = c.id
where sc.course_id is null -- any sc field defined as "not null"
with Courses as(
select 1 as id,'Math' as name union all
select 2 as id,'English' as name union all
select 3 as id,'Physics' as name union all
select 4 as id,'Chemistry' as name),
Students as(
select 1 as id,'John' as name union all
select 2 as id,'Joseph' as name union all
select 3 as id,'George' as name union all
select 4 as id,'Michael' as name
),
studcrse as(
select 1 as studid, 1 as crseid union all
select 1 as studid, 2 as crseid union all
select 1 as studid, 3 as crseid union all
select 2 as studid, 3 as crseid union all
select 2 as studid, 4 as crseid union all
select 3 as studid, 1 as crseid union all
select 3 as studid, 2 as crseid union all
select 3 as studid, 4 as crseid union all
select 3 as studid, 3 as crseid union all
select 4 as studid, 4 as crseid )
SELECT A.ID AS studentId,a.name as studentname,b.id as crseid,b.name as crsename
from Students as a
cross join
Courses as b
where not exists
(
select 1 from studcrse as c
where c.studid=a.id
and c.crseid=b.id)
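Since the question asks for different ways of doing this, one more variant is to subtract the existing pairs from the full cross join with EXCEPT. The sketch below runs that in Python's built-in sqlite3 module on the sample rows from the answer above; the table and column names follow the question (Students, Courses, Student_Course), and whether EXCEPT is available under that name depends on the database in use (Oracle, for example, has traditionally called it MINUS).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Courses (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Students (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Student_Course (student_id INTEGER, course_id INTEGER);

INSERT INTO Courses VALUES (1, 'Math'), (2, 'English'), (3, 'Physics'), (4, 'Chemistry');
INSERT INTO Students VALUES (1, 'John'), (2, 'Joseph'), (3, 'George'), (4, 'Michael');
INSERT INTO Student_Course VALUES
  (1, 1), (1, 2), (1, 3),
  (2, 3), (2, 4),
  (3, 1), (3, 2), (3, 4), (3, 3),
  (4, 4);
""")

# All possible pairs minus the pairs that exist
missing = conn.execute("""
SELECT s.id AS student_id, c.id AS course_id
  FROM Students s CROSS JOIN Courses c
EXCEPT
SELECT student_id, course_id
  FROM Student_Course
ORDER BY 1, 2
""").fetchall()

print(missing)       # [(1, 4), (2, 1), (2, 2), (4, 1), (4, 2), (4, 3)]
print(bool(missing)) # True: at least one pair is missing
```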