SQLite: Splitting multiple selects into different columns

I need to produce the following output in SQL:
chair | Secretary | Year
------------------------
Matt | Susan | 2006
Susan | Joe | 2005
From a database with tables Members and Leaders. Table Members looks as follows:
Name | num
-------------
Matt | 123
Susan | 456
Joe | 789
Table Leaders looks as follows:
Year | Chair | Secretary
-------------------
2006 | 123 | 456
2005 | 456 | 789
So far I've come up with something like this:
SELECT * FROM
(SELECT m.name FROM Members M, Leaders L
WHERE M.num = L.secretary
UNION ALL
SELECT m.name FROM Members M, Leaders L WHERE M.num = L.chair
UNION ALL
SELECT L.year from Leaders L);
However, this selects all of the wanted parameters as one column. My question is: How do I make it so that the names in particular are split into Chair and Secretary columns when they are derived from the same table?

In my opinion you could build your query in this way:
SELECT l.year, m1.Name, m2.Name
FROM leaders l
INNER JOIN members m1 ON l.chair = m1.num
INNER JOIN members m2 ON l.secretary = m2.num
You can find a test here

You have to use aliases, which are defined with the AS keyword (the keyword itself can be omitted). Aliases can be given to both tables and columns, as in the example below:
SELECT
ChairMembers.Name AS Chair,
SecretaryMembers.Name AS Secretary,
Year
FROM
Leaders
INNER JOIN
Members AS ChairMembers
ON
Leaders.Chair = ChairMembers.num
INNER JOIN
Members AS SecretaryMembers
ON
Leaders.Secretary = SecretaryMembers.num
For more information you can read this documentation.
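As a quick check of the two answers above, here is a minimal sketch that loads the sample Members and Leaders data into an in-memory SQLite database and runs the double join. The table aliases c and s, the ORDER BY clause, and the Python variable names are illustrative assumptions, not part of the original answers.

```python
import sqlite3

# In-memory database holding the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Members (Name TEXT, num INTEGER);
    CREATE TABLE Leaders (Year INTEGER, Chair INTEGER, Secretary INTEGER);
    INSERT INTO Members VALUES ('Matt', 123), ('Susan', 456), ('Joe', 789);
    INSERT INTO Leaders VALUES (2006, 123, 456), (2005, 456, 789);
""")

# Join Members twice under different aliases: once to resolve the
# chair number, once to resolve the secretary number.
leader_rows = conn.execute("""
    SELECT c.Name AS Chair, s.Name AS Secretary, l.Year
    FROM Leaders AS l
    INNER JOIN Members AS c ON l.Chair = c.num
    INNER JOIN Members AS s ON l.Secretary = s.num
    ORDER BY l.Year DESC
""").fetchall()

for row in leader_rows:
    print(row)
```

Each alias acts as an independent copy of Members, which is why the same table can supply two different output columns.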

Related

Compare Two Relations in SQL

I just started studying SQL and this is a demo given by the teacher in an online course and it works fine. The statement is looking for "students such that number of other students with same GPA is equal to number of other students with same sizeHS":
select *
from Student S1
where (
select count(*)
from Student S2
where S2.sID <> S1.sID and S2.GPA = S1.GPA
) = (
select count(*)
from Student S2
where S2.sID <> S1.sID and S2.sizeHS = S1.sizeHS
);
It seems that in this where clause we're comparing two relations (because the result of a subquery is a relation), but most of the time we are comparing attributes (as far as I've seen).
So I'm wondering whether there are requirements on how many attributes and how many tuples a relation may contain when two relations are compared. If not, how do we compare two relations that have multiple attributes or multiple tuples, and what do we get as the result?
Note:
Student relation has 4 attributes: sID, sName, GPA, sizeHS. And here's the data:
+-----+--------+-----+--------+
| sID | sName | GPA | sizeHS |
+-----+--------+-----+--------+
| 123 | Amy | 3.9 | 1000 |
| 234 | Bob | 3.6 | 1500 |
| 345 | Craig | 3.5 | 500 |
| 456 | Doris | 3.9 | 1000 |
| 567 | Edward | 2.9 | 2000 |
| 678 | Fay | 3.8 | 200 |
| 789 | Gary | 3.4 | 800 |
| 987 | Helen | 3.7 | 800 |
| 876 | Irene | 3.9 | 400 |
| 765 | Jay | 2.9 | 1500 |
| 654 | Amy | 3.9 | 1000 |
| 543 | Craig | 3.4 | 2000 |
+-----+--------+-----+--------+
and the result of this query is:
+-----+--------+-----+---------+
| sID | sName | GPA | sizeHS |
+-----+--------+-----+---------+
| 345 | Craig | 3.5 | 500 |
| 567 | Edward | 2.9 | 2000 |
| 678 | Fay | 3.8 | 200 |
| 789 | Gary | 3.4 | 800 |
| 765 | Jay | 2.9 | 1500 |
| 543 | Craig | 3.4 | 2000 |
+-----+--------+-----+---------+
"because the result of a subquery is a relation"
Relation is the scientific name for what we call a table in a database and I like the name "table" much better than "relation". A table is easy to imagine. We know them from our school time schedule for instance. Yes, we relate things here inside a table (day and time and the subject taught in school), but we can also relate tables to tables (pupils' timetables with the table of class rooms, the overall subject schedule, and the teacher's timetables). As such, tables in an RDBMS are also related to each other (hence the name relational database management system). I find the name relation for a table quite confusing (and many people use the word "relation" to describe the relations between tables instead).
So, yes, a query result itself is again a table ("relation"). And from tables we can of course select:
select * from (select * from b) as subq;
And then there are scalar queries that return exactly one row and one column. select count(*) from b is such a query. While this is still a table we can select from
select * from (select count(*) as cnt from b) as subq;
we can even use them where we usually have single values, e.g. in the select clause:
select a.*, (select count(*) from b) as cnt from a;
In your query you have two scalar subqueries in your where clause.
With subqueries there is another distinction to make: we have correlated and non-correlated subqueries. The last query I have just shown contains a non-correlated subquery. It selects the count of b rows for every single result row, no matter what that row otherwise contains. A correlated subquery, on the other hand, may look like this:
select a.*, (select count(*) from b where b.x = a.y) as cnt from a;
Here, the subquery is related to the main table. For every result row we look up the count of b rows matching the a row we are displaying via where b.x = a.y, so the count is different from row to row (but we'd get the same count for a rows sharing the same y value).
Your subqueries are also correlated. As with the select clause, the where clause deals with one row at a time (in order to keep or dismiss it). So we look at one student S1 at a time. For this student we count other students (S2, where S2.sID <> S1.sID) who have the same GPA (and S2.GPA = S1.GPA) and count other students who have the same sizeHS. We only keep students (S1) where there are exactly as many other students with the same GPA as there are with the same sizeHS.
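To make the row-by-row evaluation concrete, the sketch below loads the Student data from the question into an in-memory SQLite database and computes the two correlated counts as ordinary columns before comparing them. The column aliases same_gpa and same_size and the Python variable names are made up for illustration.

```python
import sqlite3

# Sample data copied from the question: (sID, sName, GPA, sizeHS).
students = [
    (123, "Amy", 3.9, 1000), (234, "Bob", 3.6, 1500),
    (345, "Craig", 3.5, 500), (456, "Doris", 3.9, 1000),
    (567, "Edward", 2.9, 2000), (678, "Fay", 3.8, 200),
    (789, "Gary", 3.4, 800), (987, "Helen", 3.7, 800),
    (876, "Irene", 3.9, 400), (765, "Jay", 2.9, 1500),
    (654, "Amy", 3.9, 1000), (543, "Craig", 3.4, 2000),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student (sID INTEGER, sName TEXT, GPA REAL, sizeHS INTEGER)"
)
conn.executemany("INSERT INTO Student VALUES (?, ?, ?, ?)", students)

# Each subquery yields exactly one value per S1 row, so it can be
# selected like any other column.
per_row = conn.execute("""
    SELECT S1.sID,
           (SELECT count(*) FROM Student S2
             WHERE S2.sID <> S1.sID AND S2.GPA = S1.GPA) AS same_gpa,
           (SELECT count(*) FROM Student S2
             WHERE S2.sID <> S1.sID AND S2.sizeHS = S1.sizeHS) AS same_size
    FROM Student S1
""").fetchall()

# Keeping the rows whose two counts match reproduces the WHERE clause.
kept_ids = sorted(sid for sid, same_gpa, same_size in per_row
                  if same_gpa == same_size)

# The original query from the question, for comparison.
where_ids = sorted(row[0] for row in conn.execute("""
    SELECT * FROM Student S1
    WHERE (SELECT count(*) FROM Student S2
            WHERE S2.sID <> S1.sID AND S2.GPA = S1.GPA)
        = (SELECT count(*) FROM Student S2
            WHERE S2.sID <> S1.sID AND S2.sizeHS = S1.sizeHS)
"""))

print(kept_ids)
print(where_ids)
```

Both lists contain the same six sID values shown in the result table of the question, which confirms that the comparison is between two single numbers per row, not between two multi-row tables.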
UPDATE
As to comparing tuples with more than one attribute, as in
select *
from Student S1
where (
select count(*), avg(grade)
from Student S2
where S2.sID <> S1.sID and S2.GPA = S1.GPA
) = (
select count(*), avg(grade)
from Student S2
where S2.sID <> S1.sID and S2.sizeHS = S1.sizeHS
);
this is possible in some DBMSs, but not in SQL Server, which doesn't support tuple comparisons.
But there are other means to achieve the same. You could just add two subqueries:
select * from student s1
where (...) = (...) -- compare counts here
and (...) = (...) -- compare averages here
Or get the data in the FROM clause and then deal with it. E.g.:
select *
from Student S1
cross apply
(
select count(*) as cnt, avg(grade) as avg_grade
from Student S2
where S2.sID <> S1.sID and S2.GPA = S1.GPA
) sx
cross apply
(
select count(*) as cnt, avg(grade) as avg_grade
from Student S2
where S2.sID <> S1.sID and S2.sizeHS = S1.sizeHS
) sy
where sx.cnt = sy.cnt and sx.avg_grade = sy.avg_grade;
There are relational operations:
The intersection operator produces the set of tuples that two relations share in common. Intersection is implemented in SQL in the form of the INTERSECT operator.
The difference operator acts on two relations and produces the set of tuples from the first relation that do not exist in the second relation. Difference is implemented in SQL in the form of the EXCEPT or MINUS operator.
So, in the context of SQL Server, for example, you can do:
SELECT *
FROM R1
EXCEPT
SELECT *
FROM R2
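to get the rows in R1 that are not in R2. Run the same query with R1 and R2 swapped to get the rows that are only in R2; together the two results give all the differences.
Of course, both sides must have the same attributes. If they do not, you need to list the matching attributes explicitly in each SELECT.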
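The set operators described above can be tried out with a small sketch. It uses SQLite through Python rather than SQL Server, and the tables R1 and R2 with columns id and label hold made-up data chosen only to show the three results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE R1 (id INTEGER, label TEXT);
    CREATE TABLE R2 (id INTEGER, label TEXT);
    INSERT INTO R1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO R2 VALUES (2, 'b'), (3, 'c'), (4, 'd');
""")

# Rows of R1 that do not appear in R2.
only_in_r1 = conn.execute(
    "SELECT id, label FROM R1 EXCEPT SELECT id, label FROM R2 ORDER BY id"
).fetchall()

# The reverse difference: rows of R2 that do not appear in R1.
only_in_r2 = conn.execute(
    "SELECT id, label FROM R2 EXCEPT SELECT id, label FROM R1 ORDER BY id"
).fetchall()

# Rows the two relations have in common.
in_both = conn.execute(
    "SELECT id, label FROM R1 INTERSECT SELECT id, label FROM R2 ORDER BY id"
).fetchall()

print(only_in_r1, only_in_r2, in_both)
```

Two relations hold the same rows exactly when both differences come back empty (duplicate rows aside, since EXCEPT and INTERSECT work on distinct rows).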

How can I find all columns A whose subcategories B are all related to the same column C?

I'm trying to better understand relational algebra and am having trouble solving the following type of question:
Suppose there is a column A (Department), a column B (Employees) and a column C (Managers). How can I find all of the departments who only have one manager for all of their employees? An example is provided below:
Department | Employees | Managers
-------------+-------------+----------
A | John | Bob
A | Sue | Sam
B | Jim | Don
B | Alex | Don
C | Jason | Xie
C | Greg | Xie
In this table, the result I should get is all tuples containing departments B and C, because all of their employees are managed by the same person (Don and Xie respectively). Department A, however, would not be returned because its employees have multiple managers.
Any help or pointers would be appreciated.
Such problems usually call for a self-join.
Joining the relation to itself on Department and keeping only the tuples where the Managers differ yields all the unwanted tuples, which we can then subtract from the original relation.
Here's how I'd do it:
First we take two copies of table T, call them T1 and T2, and form their cross product. From the result we select all the rows where T1.Managers <> T2.Managers but T1.Department = T2.Department, which yields these tuples:
T1.Department | T1.Employees| T1.Managers | T2.Managers | T2.Employees | T2.Department
--------------+-------------+-------------+-------------+--------------+--------------
A | John | Bob | Sam | Sue | A
A | Sue | Sam | Bob | John | A
Departments B and C aren't present because their T1.Managers always equals T2.Managers.
Then we project this result onto the T1 columns and subtract it from the original relation to get the answer.
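The self-join-and-subtract steps above can be sketched in SQL, here run against SQLite from Python. The table name T follows the answer; using EXCEPT for the subtraction and the ORDER BY clause are illustrative choices, not something the answer prescribes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T (Department TEXT, Employees TEXT, Managers TEXT);
    INSERT INTO T VALUES
        ('A', 'John', 'Bob'), ('A', 'Sue', 'Sam'),
        ('B', 'Jim', 'Don'), ('B', 'Alex', 'Don'),
        ('C', 'Jason', 'Xie'), ('C', 'Greg', 'Xie');
""")

# The second SELECT finds every tuple whose department also contains a
# different manager (the unwanted tuples); EXCEPT removes them from T.
single_manager_rows = conn.execute("""
    SELECT Department, Employees, Managers FROM T
    EXCEPT
    SELECT T1.Department, T1.Employees, T1.Managers
    FROM T AS T1
    JOIN T AS T2
      ON T1.Department = T2.Department
     AND T1.Managers <> T2.Managers
    ORDER BY Department, Employees
""").fetchall()

for row in single_manager_rows:
    print(row)
```

Only the tuples of departments B and C remain, because department A is the only one in which two different managers meet in the self-join.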
If your RDBMS supports common table expressions:
with C as (
select department, manager, count(*) as cnt
from A
group by department, manager
),
B as (
select department, count(*) as cnt
from A group by department
)
select A.*
from A
join C on A.department = C.department
join B on A.department = B.department
where B.cnt = C.cnt;

Concatenating multiple rows that differ in only one column

I'm writing an app and I have the following tables in SQLite:
course:
_id | name | a | b
university:
_id | name | c | d
course_university:
course_id | university_id
Course_university links the courses with the universities that offer them. It's a many-to-many relationship. I need a query that would give me the following
course._id | course.name | course.a | university.name | university.c
The query I thought would work was
SELECT c._id, c.name, c.a, u.name, u.c
FROM course AS c, university AS u, course_university AS cu
WHERE c._id=cu.course_id AND u._id=cu.university_id
The problem is that if a course is offered by more than one university, the above query will show it twice, the only difference being in the university column. Is there a way to concatenate the university names for one course, so instead of getting
20 | Calculus | 23 | Stanford | 5 |
20 | Calculus | 23 | Harvard | 5 |
I'd get
20 | Calculus | 23 | Stanford & Harvard | 5 |
In my case there might be more than 2 universities working together on one course, so if it accommodates for concatenating three rows then great. This is my first time dealing with SQL databases, so I'm not that aware of any more advanced methodology to solve this.
Here is an example of how you use group_concat():
SELECT c._id, c.name, c.a, group_concat(u.name, ' & ') AS universities, u.c
FROM course_university AS cu
JOIN course AS c
ON c._id = cu.course_id
JOIN university AS u
ON u._id = cu.university_id
GROUP BY c._id, c.name, c.a, u.c;
I also changed the query syntax to use explicit, ANSI standard join syntax.
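Here is a small runnable sketch of the group_concat() approach using Python's sqlite3 module. The sample rows follow the Calculus example from the question; the values stored in the b and d columns and the university ids are placeholders invented for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE course (_id INTEGER PRIMARY KEY, name TEXT, a INTEGER, b INTEGER);
    CREATE TABLE university (_id INTEGER PRIMARY KEY, name TEXT, c INTEGER, d INTEGER);
    CREATE TABLE course_university (course_id INTEGER, university_id INTEGER);
    -- b and d are placeholder values, not taken from the question.
    INSERT INTO course VALUES (20, 'Calculus', 23, 0);
    INSERT INTO university VALUES (1, 'Stanford', 5, 0), (2, 'Harvard', 5, 0);
    INSERT INTO course_university VALUES (20, 1), (20, 2);
""")

# One output row per course; the names of all linked universities are
# joined with ' & ' by the group_concat() aggregate.
course_rows = conn.execute("""
    SELECT c._id, c.name, c.a, group_concat(u.name, ' & ') AS universities, u.c
    FROM course_university AS cu
    JOIN course AS c ON c._id = cu.course_id
    JOIN university AS u ON u._id = cu.university_id
    GROUP BY c._id, c.name, c.a, u.c
""").fetchall()

for row in course_rows:
    print(row)
```

Two caveats: SQLite does not guarantee the order in which group_concat() joins the names, and because u.c is part of the GROUP BY, universities with different c values would still produce separate rows for the same course.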

Generating a hierarchy

I got the following question at a job interview and it completely stumped me, so I'm wondering if anybody out there can help explain it to me. Say I have the following table:
employees
--------------------------
id | name | reportsTo
--------------------------
1 | Alex | 2
2 | Bob | NULL
3 | Charlie | 5
4 | David | 2
5 | Edward | 8
6 | Frank | 2
7 | Gary | 8
8 | Harry | 2
9 | Ian | 8
The question was to write a SQL query that returned a table with a column for each employee's name and a column showing how many people are above that employee in the organization: i.e.,
hierarchy
--------------------------
name | hierarchyLevel
--------------------------
Alex | 1
Bob | 0
Charlie | 3
David | 1
Edward | 2
Frank | 1
Gary | 2
Harry | 1
Ian | 2
I can't even figure out where to begin writing this as a SQL query (a cursor, maybe?). Can anyone help me out in case I get asked a similar question to this again? Thanks.
The simplest example would be to use a (real or temporary) table, and add one level at a time (fiddle):
INSERT INTO hierarchy
SELECT id, name, 0
FROM employees
WHERE reportsTo IS NULL;
WHILE ((SELECT COUNT(1) FROM employees) <> (SELECT COUNT(1) FROM hierarchy))
BEGIN
INSERT INTO hierarchy
SELECT e.id, e.name, h.hierarchylevel + 1
FROM employees e
INNER JOIN hierarchy h ON e.reportsTo = h.id
AND NOT EXISTS(SELECT 1 FROM hierarchy hh WHERE hh.id = e.id)
END
Other solutions will be slightly different for each RDBMS. As one example, in SQL Server, you can use a recursive CTE to expand it (fiddle):
;WITH expanded AS
(
SELECT id, name, 0 AS level
FROM employees
WHERE reportsTo IS NULL
UNION ALL
SELECT e.id, e.name, level + 1 AS level
FROM expanded x
INNER JOIN employees e ON e.reportsTo = x.id
)
SELECT *
FROM expanded
ORDER BY id
Other solutions include recursive stored procedures, or even using dynamic SQL to iteratively increase the number of joins until everybody is accounted for.
Of course all these examples assume there are no cycles and everyone can be traced up the chain to a head honcho (reportsTo = NULL).
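For comparison, here is a sketch of the same recursive CTE in SQLite, run from Python. SQLite spells it WITH RECURSIVE; the CTE column name hierarchyLevel is borrowed from the expected output in the question, and the sketch relies on the same no-cycles assumption stated above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, reportsTo INTEGER);
    INSERT INTO employees VALUES
        (1, 'Alex', 2), (2, 'Bob', NULL), (3, 'Charlie', 5),
        (4, 'David', 2), (5, 'Edward', 8), (6, 'Frank', 2),
        (7, 'Gary', 8), (8, 'Harry', 2), (9, 'Ian', 8);
""")

# Anchor member: the employee with no manager sits at level 0.
# Recursive member: everyone reporting to an already expanded
# employee is placed one level below that employee.
levels = conn.execute("""
    WITH RECURSIVE expanded(id, name, hierarchyLevel) AS (
        SELECT id, name, 0
        FROM employees
        WHERE reportsTo IS NULL
        UNION ALL
        SELECT e.id, e.name, x.hierarchyLevel + 1
        FROM expanded AS x
        JOIN employees AS e ON e.reportsTo = x.id
    )
    SELECT name, hierarchyLevel
    FROM expanded
    ORDER BY id
""").fetchall()

for name, level in levels:
    print(name, level)
```

The printed levels match the hierarchy table in the question, for example Bob at 0 and Charlie at 3.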

Building a SQL query that doesn't include data based on hierarchy

I've spent quite a bit of time on Google and SO the last few days, but I can't seem to find an answer to my problem. Not knowing exactly how to phrase the issue into a reasonable question makes it a little more difficult. You don't know what you don't know, right?
Due to business limitations I can't post my exact code and database structure, so I'll do my best to give a solid example.
Customer Table - Holds customer data
[CustId] | [CustName]
---------------------
1 | John Smith
2 | Jane Doe
3 | John Doe
Code Table - Holds code data
[CodeId] | [CodeDesc]
---------------------
A | A Desc
B | B Desc
C | C Desc
D | D Desc
E | E Desc
CustomerCode Table - Combines customers with codes
[CustId] | [CodeId]
-------------------
1 | A
1 | B
2 | B
2 | C
2 | D
3 | C
3 | E
CodeHierarchy Table - Hierarchy of codes that shouldn't be included (DropCode) if a customer has a ConditionCode
[ConditionCode] | [DropCode]
----------------------------
A | B
B | C
B | D
Now I'll try to explain my actual question.
What I'm trying to accomplish is writing a query (view) that will list codes based on the CodeHierarchy table.
Results would be something like this:
[CustName] | [CodeId]
-------------------
John Smith | A
Jane Doe | B
John Doe | C
John Doe | E
Code B isn't listed for John Smith since he has code A. Codes C and D aren't listed for Jane Doe since she also has code B. John Doe has all codes listed (notice that E isn't even in the CodeHierarchy table).
I've tried a few different things (inner joins, left/right joins, subqueries, etc.) but I just can't get the results I'm looking for.
As a base query, this returns all codes:
SELECT
Customer.CustName,
Code.CodeDesc
FROM
Customer
INNER JOIN CustomerCode
ON Customer.CustId = CustomerCode.CustId
INNER JOIN Code
ON CustomerCode.CodeId = Code.CodeId
This only returns codes that are ConditionCodes (I understand why, but I thought it might be worth a shot at the time):
SELECT
Customer.CustName,
Code.CodeDesc
FROM
Customer
INNER JOIN CustomerCode
ON Customer.CustId = CustomerCode.CustId
INNER JOIN Code
ON CustomerCode.CodeId = Code.CodeId
INNER JOIN CodeHierarchy
ON CustomerCode.CodeId = CodeHierarchy.ConditionCode
AND CustomerCode.CodeId != CodeHierarchy.DropCode
I tried a subquery (don't have that code available) that ended up dropping all DropCodes, regardless if a member did or did not have a qualifying hierarchy (i.e. customer rows with B weren't returned even if they didn't have A)
I had an idea of making the base query above a subquery and then joining it with the CodeHierarchy table, but I'm stuck on how to write the query:
SELECT
*
FROM
(
base query (with all codes)
) CustomerCodesAll
INNER/LEFT JOIN CodeHierarchy
ON ?
I've also been doing some reading on CTEs, but I'm not sure how I could make use of that technique.
This will end up being a view to be queried against for reporting purposes. The customer table contains much more data including dob, gender, company status, etc. The view will be straightforward and pull everything. Queries against the view will include where clauses for dob, gender, etc.
Can anyone point me in the right direction?
Thanks for any help.
SELECT
Customer.CustName,
Code.CodeDesc
FROM
Customer
INNER JOIN CustomerCode AS posCustomerCode
ON Customer.CustId = posCustomerCode.CustId
INNER JOIN Code
ON posCustomerCode.CodeId = Code.CodeId
LEFT JOIN CodeHierarchy
ON posCustomerCode.CodeId = CodeHierarchy.DropCode
WHERE
CodeHierarchy.ConditionCode NOT IN (
SELECT CodeId
FROM CustomerCode AS negCustomerCode
WHERE negCustomerCode.CustId=posCustomerCode.CustId
)
OR CodeHierarchy.ConditionCode IS NULL
SQLfiddle
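To check the answer against the sample data, the sketch below rebuilds the four tables in an in-memory SQLite database and runs essentially the same query. It selects CodeId instead of CodeDesc so that the output lines up with the expected result table, drops the join to Code for that reason, and shortens the aliases to pos and neg; those changes and the ORDER BY clause are assumptions made for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (CustId INTEGER PRIMARY KEY, CustName TEXT);
    CREATE TABLE Code (CodeId TEXT PRIMARY KEY, CodeDesc TEXT);
    CREATE TABLE CustomerCode (CustId INTEGER, CodeId TEXT);
    CREATE TABLE CodeHierarchy (ConditionCode TEXT, DropCode TEXT);
    INSERT INTO Customer VALUES (1, 'John Smith'), (2, 'Jane Doe'), (3, 'John Doe');
    INSERT INTO Code VALUES ('A', 'A Desc'), ('B', 'B Desc'), ('C', 'C Desc'),
                            ('D', 'D Desc'), ('E', 'E Desc');
    INSERT INTO CustomerCode VALUES (1, 'A'), (1, 'B'), (2, 'B'), (2, 'C'),
                                    (2, 'D'), (3, 'C'), (3, 'E');
    INSERT INTO CodeHierarchy VALUES ('A', 'B'), ('B', 'C'), ('B', 'D');
""")

# A customer code is kept when it is not a DropCode at all (the LEFT
# JOIN finds no hierarchy row), or when the customer does not hold the
# ConditionCode that would cause it to be dropped.
visible_codes = conn.execute("""
    SELECT Customer.CustName, pos.CodeId
    FROM Customer
    INNER JOIN CustomerCode AS pos
        ON Customer.CustId = pos.CustId
    LEFT JOIN CodeHierarchy
        ON pos.CodeId = CodeHierarchy.DropCode
    WHERE CodeHierarchy.ConditionCode IS NULL
       OR CodeHierarchy.ConditionCode NOT IN (
            SELECT neg.CodeId
            FROM CustomerCode AS neg
            WHERE neg.CustId = pos.CustId
       )
    ORDER BY Customer.CustId, pos.CodeId
""").fetchall()

for row in visible_codes:
    print(row)
```

One limitation worth knowing: if the same DropCode were listed under several ConditionCodes, the LEFT JOIN would produce one row per hierarchy entry, so a NOT EXISTS formulation might be safer for such data.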