How to combine three SELECT statements with very tricky requirements - sql

I have a SQL query with three SELECT statements. A picture of the data tables generated by these three SELECT statements is located at www.britestudent.com/pub/1.png. All three data tables have identical columns. I want to combine these three tables into one table such that:
(1) All rows in top table (Table1) are always included.
(2) Rows in the middle table (Table2) are included only when the values in column1 (UserName) and column4 (CourseName) do not match with any row from Table1. Both columns need to match for the row in Table2 to not be included.
(3) Rows in the bottom table (Table3) are included only when the value in column4 (CourseName) is not already in any row of the results from combining Table1 and Table2.
I have had success in implementing (1) and (2) with an SQL query like this:
SELECT DISTINCT
UserName AS UserName,
MAX(AmountUsed) AS AmountUsed,
MAX(AnsweredCorrectly) AS AnsweredCorrectly,
CourseName,
MAX(course_code) AS course_code,
MAX(NoOfQuestionsInCourse) AS NoOfQuestionsInCourse,
MAX(NoOfQuestionSetsInCourse) AS NoOfQuestionSetsInCourse
FROM
( "SELECT statement 1" UNION "SELECT statement 2" ) dt_derivedTable_1
GROUP BY CourseName, UserName
Where "SELECT statement 1" is the query that generates Table1 and "SELECT statement 2" is the query that generates Table2. A picture of the data table generated by this query is located at www.britestudent.com/pub/2.png. I can get away with using the MAX() function because values in the AmountUsed and AnsweredCorrectly columns in Table1 will always be larger than those in Table2 (and they are identical in the last three columns of both tables).
What I fail at is implementing (3). Any suggestions on how to do this will be appreciated. It is tricky because the UserName values in Table3 are null, and because the CourseName values in the combined Table1 and Table2 results are not unique (but they are unique in Table3).
After implementing (3), the final table should look like the table in picture 2.png with the addition of the last row from Table3 (the row with the CourseName value starting with "4. Klasse...").
I have tried to implement (3) using another derived table using SELECT, MAX() and UNION, but I could not get it to work. Below is my full SQL query with the lines from this failed attempt to implement (3) commented out.
Cheers,
Frederick
PS--I am new to this forum (and new to SQL as well), but I have had more of my previous problems answered by reading other people's posts on this forum than from reading any other forum or Web site. This forum is a great resource.
-- SELECT DISTINCT MAX(UserName), MAX(AmountUsed) AS AmountUsed, MAX(AnsweredCorrectly) AS AnsweredCorrectly, CourseName, MAX(course_code) AS course_code, MAX(NoOfQuestionsInCourse) AS NoOfQuestionsInCourse, MAX(NoOfQuestionSetsInCourse) AS NoOfQuestionSetsInCourse
-- FROM (
SELECT DISTINCT UserName AS UserName, MAX(AmountUsed) AS AmountUsed, MAX(AnsweredCorrectly) AS AnsweredCorrectly, CourseName, MAX(course_code) AS course_code, MAX(NoOfQuestionsInCourse) AS NoOfQuestionsInCourse, MAX(NoOfQuestionSetsInCourse) AS NoOfQuestionSetsInCourse
FROM (
-- Table 1 - All UserAccount/Course combinations that have had quizzes.
SELECT DISTINCT dbo.win_user.user_name AS UserName,
cast(dbo.GetAmountUsed(dbo.session_header.win_user_id, dbo.course.course_id, dbo.course.no_of_questionsets_in_course) as nvarchar(10)) AS AmountUsed,
Isnull(cast(dbo.GetAnswerCorrectly(dbo.session_header.win_user_id, dbo.course.course_id, dbo.question_set.no_of_questions) as nvarchar(10)),0) AS AnsweredCorrectly,
dbo.course.course_name AS CourseName,
dbo.course.course_code,
dbo.course.no_of_questions_in_course AS NoOfQuestionsInCourse,
dbo.course.no_of_questionsets_in_course AS NoOfQuestionSetsInCourse
FROM dbo.session_detail
INNER JOIN dbo.session_header ON dbo.session_detail.session_header_id = dbo.session_header.session_header_id
INNER JOIN dbo.win_user ON dbo.session_header.win_user_id = dbo.win_user.win_user_id
INNER JOIN dbo.win_user_course ON dbo.win_user_course.win_user_id = dbo.win_user.win_user_id
INNER JOIN dbo.question_set ON dbo.session_header.question_set_id = dbo.question_set.question_set_id
RIGHT OUTER JOIN dbo.course ON dbo.win_user_course.course_id = dbo.course.course_id
WHERE (dbo.session_detail.no_of_attempts = 1 OR dbo.session_detail.no_of_attempts IS NULL)
AND (dbo.session_detail.is_correct = 1 OR dbo.session_detail.is_correct IS NULL)
AND (dbo.win_user_course.is_active = 'True')
GROUP BY dbo.win_user.user_name, dbo.course.course_name, dbo.question_set.no_of_questions, dbo.course.no_of_questions_in_course,
dbo.course.no_of_questionsets_in_course, dbo.session_header.win_user_id, dbo.course.course_id, dbo.course.course_code
UNION ALL
-- Table 2 - All UserAccount/Course combinations that do or do not have quizzes but where the Course is selected for quizzes for that User Account.
SELECT dbo.win_user.user_name AS UserName,
-1 AS AmountUsed,
-1 AS AnsweredCorrectly,
dbo.course.course_name AS CourseName,
dbo.course.course_code,
dbo.course.no_of_questions_in_course AS NoOfQuestionsInCourse,
dbo.course.no_of_questionsets_in_course AS NoOfQuestionSetsInCourse
FROM dbo.win_user_course
INNER JOIN dbo.win_user ON dbo.win_user_course.win_user_id = dbo.win_user.win_user_id
RIGHT OUTER JOIN dbo.course ON dbo.win_user_course.course_id = dbo.course.course_id
WHERE (dbo.win_user_course.is_active = 'True')
GROUP BY dbo.win_user.user_name, dbo.course.course_name, dbo.course.no_of_questions_in_course,
dbo.course.no_of_questionsets_in_course, dbo.course.course_id, dbo.course.course_code
) dt_derivedTable_1
GROUP BY CourseName, UserName
-- UNION ALL
-- Table 3 - All Courses.
-- SELECT DISTINCT null AS UserName,
-- -2 AS AmountUsed,
-- -2 AS AnsweredCorrectly,
-- dbo.course.course_name AS CourseName,
-- dbo.course.course_code,
-- dbo.course.no_of_questions_in_course AS NoOfQuestionsInCourse,
-- dbo.course.no_of_questionsets_in_course AS NoOfQuestionSetsInCourse
-- FROM dbo.course
-- WHERE is_active = 'True'
-- ) dt_derivedTable_2
-- GROUP BY CourseName
-- ORDER BY CourseName

With such filtering requirements (depending on the rows of prior queries), I recommend a table variable.
DECLARE @MyTable TABLE
(
ID int PRIMARY KEY,
Name varchar(50),
QueryNumber int
)
-- Query 1: always included.
INSERT INTO @MyTable (ID, Name, QueryNumber)
SELECT CustomerID, CustomerName, 1
FROM Customer
WHERE CustomerName = 'Bob'
-- Query 2: only rows not already inserted by query 1.
INSERT INTO @MyTable (ID, Name, QueryNumber)
SELECT CustomerID, CustomerName, 2
FROM Customer
WHERE CustomerName = 'Joe' and CustomerID not in (SELECT ID FROM @MyTable)
-- Query 3: only rows not already inserted by queries 1 and 2.
INSERT INTO @MyTable (ID, Name, QueryNumber)
SELECT CustomerID, CustomerName, 3
FROM Customer
WHERE CustomerID not in (SELECT ID FROM @MyTable)
SELECT *
FROM @MyTable

Here is an Oracle flavored solution:
Select
*
from table1
UNION
select
*
from table2
where not exists(
select 'x'
from table1
where
table2.username = table1.username
and table2.coursename = table1.coursename
)
UNION
select
*
from table3
where
coursename not in (
Select
coursename
from table1
UNION /* the union operator implies distinct, so
there will be no duplicates */
select
coursename
from table2
where not exists(
select 'x'
from table1
where
table2.username = table1.username
and table2.coursename = table1.coursename
)
)
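The NOT EXISTS approach above can be tried out on toy data. What follows is a minimal sketch, not the asker's real schema: it assumes three hypothetical tables (table1, table2, table3) reduced to three columns, and it uses SQLite through Python's sqlite3 module only so that the SQL is runnable as-is. Because every Table2 row dropped by rule (2) has a course name that already appears in Table1, rule (3) can be checked against Table1 and Table2 directly. NOT EXISTS is used instead of NOT IN so that NULL values cannot silently empty the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, simplified stand-ins for the three result sets.
CREATE TABLE table1 (username TEXT, amountused INTEGER, coursename TEXT);
CREATE TABLE table2 (username TEXT, amountused INTEGER, coursename TEXT);
CREATE TABLE table3 (username TEXT, amountused INTEGER, coursename TEXT);
INSERT INTO table1 VALUES ('ann', 5, 'Math');
INSERT INTO table2 VALUES ('ann', -1, 'Math'), ('ann', -1, 'Art'), ('bob', -1, 'Math');
INSERT INTO table3 VALUES (NULL, -2, 'Math'), (NULL, -2, 'Art'), (NULL, -2, 'Music');
""")

query = """
-- (1) every row of table1
SELECT username, amountused, coursename FROM table1
UNION ALL
-- (2) table2 rows whose (username, coursename) pair is not in table1
SELECT username, amountused, coursename FROM table2 t2
WHERE NOT EXISTS (SELECT 1 FROM table1 t1
                  WHERE t1.username = t2.username
                    AND t1.coursename = t2.coursename)
UNION ALL
-- (3) table3 rows whose coursename appears in neither table1 nor table2
SELECT username, amountused, coursename FROM table3 t3
WHERE NOT EXISTS (SELECT 1 FROM table1 t1 WHERE t1.coursename = t3.coursename)
  AND NOT EXISTS (SELECT 1 FROM table2 t2 WHERE t2.coursename = t3.coursename)
ORDER BY coursename, username
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With this sample data, only the 'Music' course comes from table3, and the ('ann', 'Math') row of table2 is suppressed by the matching table1 row.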

Related

Display the names of the employees who work in both ‘IT’ and ‘SE’

Emp(sid(pk) : integer, sname: varchar(255))
Dep(sid(fk) : integer, dep : varchar(255))
SQL: How do I find the names of the employees who work in both ‘IT’ and ‘SE’?
To write a query that joins two tables and returns the rows they share on a common column (e.g. id), an INNER JOIN will help.
The INNER JOIN keyword selects records that have matching values in both tables.
Solution
SELECT Emp.sid, Emp.sname FROM Emp
INNER JOIN
(SELECT sid FROM Dep WHERE dep='IT'
INTERSECT
SELECT sid FROM Dep WHERE dep='SE') as A
ON Emp.sid = A.sid
References
SQL INNER JOIN
The way I understand it, the data situation is as below. emp with sid and sname, and dep, with sid - the foreign key to emp and dep, this time not as a table, but a column containing the department's abbreviation. And the combination, in the dep table, of sid and dep, is unique.
If that is the constellation, then join the two tables using sid and filter by dep in the set ('IT', 'SE'). Then put the two columns from emp into the column list, GROUP BY them, and finally apply the grouping filter HAVING COUNT(*) = 2 to keep only the groups that have two entries when filtered by the two departments.
WITH
emp(sid, sname) AS (
SELECT 42,'Arthur'
UNION ALL SELECT 43,'Ford'
UNION ALL SELECT 44,'Zaphod'
)
,
dep(sid, dep) AS (
SELECT 42,'IT'
UNION ALL SELECT 42,'SE'
UNION ALL SELECT 42,'AC'
UNION ALL SELECT 43,'IT'
UNION ALL SELECT 43,'AC'
UNION ALL SELECT 44,'SE'
UNION ALL SELECT 44,'SA'
)
SELECT
emp.sid
, emp.sname
FROM emp JOIN dep USING(sid)
WHERE dep.dep IN ('IT','SE')
GROUP BY
emp.sid
, emp.sname
HAVING COUNT(*) = 2;
-- out sid|sname
-- out 42|Arthur
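The HAVING COUNT(*) = 2 filter relies on (sid, dep) being unique in Dep. If that is not guaranteed, COUNT(DISTINCT dep) is a safer test. Below is a minimal sketch under that assumption, run in SQLite via Python's sqlite3 purely to make it executable; the sample rows (including the duplicated 'IT' row for Ford) are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Emp (sid INTEGER PRIMARY KEY, sname TEXT);
CREATE TABLE Dep (sid INTEGER REFERENCES Emp(sid), dep TEXT);
INSERT INTO Emp VALUES (42, 'Arthur'), (43, 'Ford'), (44, 'Zaphod');
-- Ford is listed twice in 'IT' to show why DISTINCT matters.
INSERT INTO Dep VALUES (42, 'IT'), (42, 'SE'), (42, 'AC'),
                       (43, 'IT'), (43, 'IT'), (44, 'SE');
""")

rows = conn.execute("""
SELECT e.sid, e.sname
FROM Emp e
JOIN Dep d ON d.sid = e.sid
WHERE d.dep IN ('IT', 'SE')
GROUP BY e.sid, e.sname
HAVING COUNT(DISTINCT d.dep) = 2   -- both departments must be present
ORDER BY e.sid
""").fetchall()
print(rows)
```

With COUNT(*) instead of COUNT(DISTINCT d.dep), Ford would be returned as well, because his two 'IT' rows also add up to 2.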

SQL Query – records within the SQL Select statement, but NOT in the table being queried

I have a large list of CustIDs that I need to query on to find if they are within the CUSTOMER table; I want the result to tell me which CustIDs ARE on the table and which CustIDs are NOT on the table.
I provided a short list below to give an idea of what I need to do.
Oracle database
Table: Customer
Primary Key: CustID
Scenario:
Customer table only has the following (2) CustID: ‘12345’, ‘56789’
Sql:
Select * from CUSTOMERS where CUSTID in ('12345', '56789', '01234');
I want the result to tell me that both ‘12345’ and ‘56789’ are in the table, AND that ‘01234’ is NOT.
select
v.CustID,
exists (select * from Customer where Customer.CustID = v.CustID)
from (values (12345), (56789), (01234)) v (CustID);
Results:
custid exists
12345 true
56789 true
1234 false
You need a left join or subquery for this. The precise syntax varies by database. Typical syntax is:
select i.custid,
(case when c.custid is not null then 1 else 0 end) as exists_flag
from (select '12345' as custid union all
select '56789' union all
select '01234'
) i left join
customers c
on c.custid = i.custid;
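To see the LEFT JOIN idea end to end, here is a minimal sketch using SQLite through Python's sqlite3 (chosen only so the example runs without a database server; Oracle would additionally need FROM dual in each inline SELECT). The customers table and its two rows mirror the scenario in the question, and custid is assumed to be stored as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (custid TEXT PRIMARY KEY);
INSERT INTO customers VALUES ('12345'), ('56789');
""")

rows = conn.execute("""
SELECT i.custid,
       CASE WHEN c.custid IS NOT NULL THEN 1 ELSE 0 END AS exists_flag
FROM (SELECT '12345' AS custid UNION ALL
      SELECT '56789' UNION ALL
      SELECT '01234') i              -- the list of IDs to check
LEFT JOIN customers c ON c.custid = i.custid
ORDER BY i.custid
""").fetchall()
print(rows)
```

Keeping the IDs as strings preserves the leading zero of '01234', which a numeric literal would drop.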

SQL Loop/Crawler

I am trying to figure out some ways to accomplish this script. I import an excel sheet and then I need to populate 5 different tables based on this excel sheet. However for this example I just need help with the initial loop then I think I can work through the rest.
select distinct Department from IPACS_New_MasterList
where Department is not null
This provides me a list of 7 different departments.
Dep1, Dep2, Dep3, Dep4, Dep5, Dep6, Dep7
For each of these departments I need to perform some code.
Step #1:
Insert the department into table_one
I then need to keep the SCOPE_IDENTITY() for the rest of the code.
Step #2
Perform the second loop (inserting all functions in that department into table2).
I'm not sure how to really do a foreach row in this select statement loop, or if I need to do something completely different. I've looked at several answers but can't seem to find exactly what I'm looking for.
Sample Data:
Source Table
Dep1, func1, process1, procedure1
dep1, func1, process1, procedure2
dep1, func1, process2, procedure3
dep1, func1, process2, procedure4
dep1, func1, process2, procedure5
dep1, func2, process3, procedure6
dep2, func3, process4, procedure7
My Tables:
My first table is a list of every department from the above query. With a key on the departmentID. Each department can have many functions.
My second table is a list of all functions with a key on functionID and a foreign key on departmentID. Each function must have 1 department and can have many processes
My third table is a list of all processes with a key on processID and a foreign key on functionID. Each process must have 1 function and can have many procedures.
There are two approaches you can use without a loop.
1) If you have candidate keys in your source (department name) just join your source table back to the table you inserted
e.g.
INSERT INTO Department
(Name)
SELECT DISTINCT Dep1
FROM SOURCE;
INSERT INTO Functions
(
Name,
DepartmentID)
SELECT DISTINCT
s.Func1,
d.DepartmentID
FROM
source s
INNER JOIN Department d
on s.dep1 = d.name;
INSERT INTO
processes
(
name,
FunctionID,
[Procedure]
)
SELECT
s.process1,
f.FunctionID,
s.procedure1
FROM
source s
INNER JOIN Department d
on s.dep1 = d.name
INNER JOIN Functions f
on d.DepartmentID = f.departmentID
and s.func1 = f.name;
2) If you don't have candidate keys in your source then you can use the output clause
For example here if a department weren't guaranteed to be unique this would correctly find only the newly add
-- Table variables that capture the identity values generated by each insert.
DECLARE @Department TABLE
(
DepartmentID INT
)
DECLARE @Functions TABLE
(
FunctionID INT
)
INSERT INTO Department
(Name)
OUTPUT INSERTED.DepartmentID INTO @Department
SELECT DISTINCT Dep1
FROM SOURCE;
INSERT INTO Functions
(
Name,
DepartmentID)
OUTPUT INSERTED.FunctionID INTO @Functions
SELECT DISTINCT
s.Func1,
d.DepartmentID
FROM
source s
INNER JOIN Department d
on s.dep1 = d.name
INNER JOIN @Department d2
ON d.departmentID = d2.departmentID;
INSERT INTO
processes
(
name,
FunctionID,
[Procedure]
)
SELECT
s.process1,
f.FunctionID,
s.procedure1
FROM
source s
INNER JOIN Department d
on s.dep1 = d.name
INNER JOIN Functions f
on d.DepartmentID = f.departmentID
and s.func1 = f.name
INNER JOIN @Functions f2
ON f.FunctionID = f2.FunctionID;
SELECT * FROM Department;
SELECT * FROm Functions;
SELECT * FROM processes;
If I am understanding what you are trying to do... yes, you can use a loop. It's not really talked about, and I bet I am going to get some feedback from other SQL developers that it's not a best practice. But if you really need to do a loop:
DECLARE @rowcount AS int
DECLARE @numberOfRows AS int
SET @rowcount = 0
SET @numberOfRows = (SELECT COUNT(*) FROM tablename) --put in anything to get the number of times to loop.
WHILE @rowcount < @numberOfRows
BEGIN
--Put whatever process you need to repeat here
SET @rowcount = @rowcount + 1
END
Assuming you have tables set up with an IDENTITY field set for the Primary Key, you can populate each successive table's foreign key by joining to the previous table and the source table, something like:
INSERT INTO Table1
SELECT DISTINCT Department
FROM SourceTable
GO
INSERT INTO Table2
SELECT DISTINCT b.Department_ID, a.Function
FROM SourceTable a
JOIN Table1 b
ON a.Department = b.Department
GO
INSERT INTO Table3
SELECT DISTINCT b.Function_ID, a.Process
FROM SourceTable a
JOIN Table2 b
ON a.Function = b.Function
GO
INSERT INTO Table4
SELECT DISTINCT b.Process_ID, a.Procedure
FROM SourceTable a
JOIN Table3 b
ON a.Process = b.Process
GO
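The set-based "insert, then join back on the natural key" idea from the answers above can be sketched on a cut-down version of the sample data. This is only an illustration: the table and column names (source, department, functions, department_id, function_id) are hypothetical, department names are assumed to be unique and consistently cased, and SQLite via Python's sqlite3 stands in for SQL Server so that the code runs as-is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Flat import table, analogous to the Excel sheet.
CREATE TABLE source (department TEXT, func TEXT, process TEXT);
INSERT INTO source VALUES
    ('dep1', 'func1', 'process1'),
    ('dep1', 'func1', 'process2'),
    ('dep1', 'func2', 'process3'),
    ('dep2', 'func3', 'process4');

CREATE TABLE department (
    department_id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT UNIQUE);
CREATE TABLE functions (
    function_id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT,
    department_id INTEGER REFERENCES department(department_id));

-- Step 1: one row per distinct department; the key is generated.
INSERT INTO department (name)
SELECT DISTINCT department FROM source;

-- Step 2: look the generated key up again by joining on the name.
INSERT INTO functions (name, department_id)
SELECT DISTINCT s.func, d.department_id
FROM source s
JOIN department d ON d.name = s.department;
""")

rows = conn.execute("""
SELECT d.name, f.name
FROM functions f
JOIN department d ON d.department_id = f.department_id
ORDER BY d.name, f.name
""").fetchall()
print(rows)
```

No loop and no SCOPE_IDENTITY() bookkeeping is needed, because each child insert finds its parent key through the join.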

SQL - select selective row multiple times

I need to produce mailing labels for my company and I thought I would do a query for that:
I have 2 tables - tblAddress , tblContact.
In tblContact I have "addressNum" which is a foreign key of address and "labelsNum" column that represents the number of times the address should appear in the labels sheet.
I need to create an inner join of tblcontact and tbladdress by addressNum,
but if labelsNum exists more than once it should be displayed as many times as labelsNum is.
I suggest using a recursive query to do the correct number of iterations for each row.
Here is the code (+ link to SQL fiddle):
;WITH recurs AS (
SELECT *, 1 AS LEVEL
FROM tblContact
UNION ALL
SELECT t1.*, LEVEL + 1
FROM tblContact t1
INNER JOIN
recurs t2
ON t1.addressnum = t2.addressnum
AND t2.labelsnum > t2.LEVEL
)
SELECT *
FROM recurs
ORDER BY addressnum
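A variant of the recursive idea, sketched under assumptions: tblContact is given a ContactID key and made-up sample rows, and the recursive step reads only from the CTE itself, so contacts that share an address do not multiply each other. SQLite via Python's sqlite3 is used just to make the sketch runnable; SQL Server omits the RECURSIVE keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblContact (
    ContactID INTEGER PRIMARY KEY,
    Contact TEXT,
    addressNum INTEGER,
    labelsNum INTEGER);
INSERT INTO tblContact VALUES (1, 'bar1', 1, 1), (2, 'bar2', 2, 3);
""")

rows = conn.execute("""
WITH RECURSIVE recurs(ContactID, Contact, addressNum, labelsNum, lvl) AS (
    -- anchor: every contact once
    SELECT ContactID, Contact, addressNum, labelsNum, 1
    FROM tblContact
    UNION ALL
    -- step: repeat the same row until lvl reaches labelsNum
    SELECT ContactID, Contact, addressNum, labelsNum, lvl + 1
    FROM recurs
    WHERE lvl < labelsNum
)
SELECT Contact, lvl
FROM recurs
ORDER BY ContactID, lvl
""").fetchall()
print(rows)
```

Joining the result to tblAddress on addressNum then gives one output row per label. Note that a contact whose labelsNum is 0 or NULL still appears once with this anchor.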
Wouldn't the script return multiple lines for different contacts anyway?
CREATE TABLE tblAddress (
AddressID int IDENTITY
, [Address] nvarchar(35)
);
CREATE TABLE tblContact (
ContactID int IDENTITY
, Contact nvarchar(35)
, AddressNum int
, labelsNum int
);
INSERT INTO tblAddress VALUES ('foo1');
INSERT INTO tblAddress VALUES ('foo2');
INSERT INTO tblContact VALUES ('bar1', 1, 1);
INSERT INTO tblContact VALUES ('bar2', 2, 2);
INSERT INTO tblContact VALUES ('bar3', 2, 2);
SELECT * FROM tblAddress a JOIN tblContact c ON a.AddressID = c.AddressNum
This yields 3 rows on my end. The labelsNum column seems redundant to me. If you add a third contact for address foo2, you would have to update the labelsNum column of every record referencing foo2 in order to keep things consistent.
The number of labels is already determined by the number of different contacts.
Or am I missing something?

SQL Multiple Duplicate Row Detection

I'm trying to determine a correct way to isolate rows within a table that have the same values in 2 columns.
There are two tables, one (Name) with the person's names and IDs, and the other one (Nation) with people's IDs and their nations. I join the two tables with inner join, and now the new table columns consist of an ID, first name, last name, and nation. If I want to find pairs of people who have the same last name and are from the same nation, why isn't
select ID, FName, LName, Nation
from (Name inner join Nation on Name.ID = Nation.ID)
group by Name, Nation
having count(Name) > 1 and count(Nation) > 1
working?
I'm aiming for the result to be a table with columns:
ID -------First--------------- Last ---------Nation
where the last names and nations will be identical pairs while first names will be different.
I feel like the group by part isn't appropriate, but is there even an alternate way? Thanks for any help.
If you are using MS SQL Server:
select
*
from
(
select
Name.*,
Nation.Nation,
cnt = count(*) over(partition by LName, Nation)
from Name
join Nation on Nation.ID = Name.ID
) t
where cnt > 1
Try this:
SELECT * FROM (
SELECT Name.ID, Name.FName, Name.LName, Nation.Nation
FROM Name
INNER JOIN Nation ON (Name.ID = Nation.ID)
) a
INNER JOIN (
SELECT Name.ID, Name.FName, Name.LName, Nation.Nation
FROM Name
INNER JOIN Nation ON (Name.ID = Nation.ID)
) b ON (a.LName = b.LName AND a.Nation = b.Nation)
WHERE a.ID < b.ID
As Simon Righarts hinted, something's not right with the design.
Scenario 1)
If a name can have multiple nations, you would have 3 tables implementing an n:m relationship.
CREATE TABLE name (name_id int, name text, ...);
CREATE TABLE nation (nation_id int, nation text, ...);
CREATE TABLE nationality (name_id int references name(name_id)
,nation_id int references nation(nation_id)
... );
Query for the scenario:
SELECT a.name_id, a.fname, a.lname, n.nation
FROM name a
JOIN nationality na USING (name_id)
JOIN nation n USING (nation_id)
JOIN (
SELECT a.lname, na.nation_id
FROM name a
JOIN nationality na USING (name_id)
GROUP BY 1,2
HAVING count(*) > 1) x USING (lname, nation_id)
Scenario 2)
If a name can only have one nation, there would be a column nation_id in the table name:
CREATE TABLE name (name_id int
,name text
,nation_id int references nation(nation_id), ...);
CREATE TABLE nation (nation_id int, nation text, ...);
Query for this scenario:
SELECT a.name_id, a.fname, a.lname, n.nation
FROM name a
JOIN nation n USING (nation_id)
JOIN (
SELECT a.lname, a.nation_id
FROM name a
GROUP BY 1,2
HAVING count(*) > 1) x USING (lname, nation_id);
All multiple occurrences are included here, not just "pairs" - assuming you meant that.
Your actual description doesn't fit either scenario.
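For the two-table layout described in the question (Name with ID, FName, LName and Nation with ID, Nation), the same "group, then join back" pattern can be sketched as follows. The sample rows are invented, and SQLite via Python's sqlite3 is used only so the query can be run directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Name (ID INTEGER PRIMARY KEY, FName TEXT, LName TEXT);
CREATE TABLE Nation (ID INTEGER REFERENCES Name(ID), Nation TEXT);
INSERT INTO Name VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Lee'),
                        (3, 'Cy', 'Lee'), (4, 'Di', 'Roy');
INSERT INTO Nation VALUES (1, 'KR'), (2, 'KR'), (3, 'US'), (4, 'KR');
""")

rows = conn.execute("""
SELECT n.ID, n.FName, n.LName, t.Nation
FROM Name n
JOIN Nation t ON t.ID = n.ID
JOIN (  -- (LName, Nation) combinations that occur more than once
      SELECT n2.LName, t2.Nation
      FROM Name n2
      JOIN Nation t2 ON t2.ID = n2.ID
      GROUP BY n2.LName, t2.Nation
      HAVING COUNT(*) > 1) dup
  ON dup.LName = n.LName AND dup.Nation = t.Nation
ORDER BY n.ID
""").fetchall()
print(rows)
```

The original attempt fails because it selects ID and FName without grouping by them. Grouping in a derived table and joining back keeps the per-person columns available.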