SQL count on multiple columns including 0 value - sql

I've joined up my tables such that every entry is unique and I want to get a COUNT() value for how many unique courses the teachers teach. I figured I would make a table of distinct courses then do a count based on the teacher's id, however this doesn't account for teachers who taught no courses and I wish to return a zero value in this case. How do I go about getting these zero values?
Table for reference:
id | name | course_id | sec_id | semester | year
-------+------------+-----------+--------+----------+------
33456 | A | | | |
10101 | B | CS-101 | 1 | Fall | 2009
76766 | C | BIO-301 | 1 | Summer | 2010
12121 | D | FIN-201 | 1 | Spring | 2010
10101 | B | CS-347 | 1 | Fall | 2009
76543 | E | | | |
83821 | F | CS-319 | 2 | Spring | 2010
83821 | F | CS-190 | 2 | Spring | 2009
98345 | G | EE-181 | 1 | Spring | 2009
10101 | B | CS-315 | 1 | Spring | 2010
22222 | H | PHY-101 | 1 | Fall | 2009
45565 | I | CS-101 | 1 | Spring | 2010
15151 | J | MU-199 | 1 | Spring | 2010
32343 | K | HIS-351 | 1 | Spring | 2010
83821 | F | CS-190 | 1 | Spring | 2009
45565 | I | CS-319 | 1 | Spring | 2010
76766 | C | BIO-101 | 1 | Summer | 2009
58583 | L | | | |
ps. I believe I am using PostgreSQL.
[EDIT] the expected result is a table of id's, names, and a number showing the amount of courses the teacher has taught, including 0 if they have not taught any course.
[EDIT 2] I only need a query on this table, all the other work is done. If there is no value for course_id, sec_id, semester, year then that teacher has not taught a course (in the case of teachers A, E, and L; who would have a count of 0). I only need a way to count these courses, nothing else.

let's assume the table name is t:
select distinct count(course_id) filter (where course_id is not null) over (partition by id,name),id, name
from t
order by name;
count | id | name
-------+-------+--------------
0 | 33456 | A
3 | 10101 | B
2 | 76766 | C
1 | 12121 | D
0 | 76543 | E
3 | 83821 | F
1 | 98345 | G
1 | 22222 | H
2 | 45565 | I
1 | 15151 | J
1 | 32343 | K
0 | 58583 | L
(12 rows)
https://www.postgresql.org/docs/current/static/sql-expressions.html
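As a cross-check, the same zeros fall out of a plain GROUP BY, because COUNT(column) skips NULLs. The sketch below is an illustration, not the query from the answer: it loads a few of the question's rows into an in-memory SQLite database (an assumption made only so the example runs anywhere; the table name t follows the answer). It also shows COUNT(DISTINCT ...), which collapses the two CS-190 sections taught by teacher F into one course.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER, name TEXT, course_id TEXT, "
    "sec_id INTEGER, semester TEXT, year INTEGER)"
)
# A subset of the rows from the question: A taught nothing, F taught CS-190 twice.
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?, ?)", [
    (33456, "A", None, None, None, None),
    (10101, "B", "CS-101", 1, "Fall", 2009),
    (10101, "B", "CS-347", 1, "Fall", 2009),
    (10101, "B", "CS-315", 1, "Spring", 2010),
    (83821, "F", "CS-319", 2, "Spring", 2010),
    (83821, "F", "CS-190", 2, "Spring", 2009),
    (83821, "F", "CS-190", 1, "Spring", 2009),
])

# COUNT(course_id) ignores NULLs, so a teacher with only NULL rows gets 0.
rows = conn.execute("""
    SELECT id, name,
           COUNT(course_id)          AS sections,
           COUNT(DISTINCT course_id) AS courses
    FROM t
    GROUP BY id, name
    ORDER BY name
""").fetchall()

for row in rows:
    print(row)
```

If "unique courses" should ignore repeated sections, the courses column is the one to use; the sections column matches the output shown above.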

I believe, this query should do the job:
SELECT COUNT(DISTINCT course_id), t.id, name FROM courses r Right Outer JOIN teachers t ON (r.id=t.id) group by t.id, name;
I assumed that not all teachers were in the table you provided, hence the outer join. Tested on a similar database in Oracle.

Join against a subquery that computes each teacher's course count (with t as the table name):
select t.name, coalesce(c.course_count, 0) as course_count
from t left join (
select name, count(distinct course_id) as course_count
from t
group by name
) c on c.name = t.name
group by t.name, c.course_count

Related

Is there an easier way to find the row with a max value?

I have a schema where these two tables exist (among others)
participation
+------+--------+------------------+
| movie| person | role |
+------+--------+------------------+
| 1 | 1 | "Regisseur" |
| 1 | 1 | "Schauspieler" |
| 1 | 2 | "Schauspielerin" |
| 2 | 3 | "Regisseur" |
| 3 | 4 | "Regisseur" |
| 3 | 5 | "Schauspieler" |
| 3 | 6 | "Schauspieler" |
| 4 | 7 | "Schauspielerin" |
| 4 | 8 | "Schauspieler" |
| 5 | 1 | "Schauspieler" |
| 5 | 8 | "Schauspieler" |
| 5 | 14 | "Schauspieler" |
+------+--------+------------------+
movie
+----+------------------------------+------+-----+
| id | title | year | fsk |
+----+------------------------------+------+-----+
| 1 | "Die Bruecke am Fluss" | 1995 | 12 |
| 2 | "101 Dalmatiner" | 1961 | 0 |
| 3 | "Vernetzt - Johnny Mnemonic" | 1995 | 16 |
| 4 | "Waehrend Du schliefst..." | 1995 | 6 |
| 5 | "Casper" | 1995 | 6 |
| 6 | "French Kiss" | 1995 | 6 |
| 7 | "Stadtgespraech" | 1995 | 12 |
| 8 | "Apollo 13" | 1995 | 6 |
| 9 | "Schlafes Bruder" | 1995 | 12 |
| 10 | "Assassins - Die Killer" | 1995 | 16 |
| 11 | "Braveheart" | 1995 | 16 |
| 12 | "Das Netz" | 1995 | 12 |
| 13 | "Free Willy 2" | 1995 | 6 |
+----+------------------------------+------+-----+
I want to get the movie with the highest number of people that participated. I figured out an SQL statement that actually does this, but looks super complicated. It looks like this:
SELECT title
FROM movie.movie
JOIN (SELECT *
FROM (SELECT Max(count_person) AS max_count_person
FROM (SELECT movie,
Count(person) AS count_person
FROM movie.participation
GROUP BY movie) AS countPersons) AS
maxCountPersons
JOIN (SELECT movie,
Count(person) AS count_person
FROM movie.participation
GROUP BY movie) AS countPersons
ON maxCountPersons.max_count_person =
countPersons.count_person)
AS maxPersonsmovie
ON maxPersonsmovie.movie = movie.id
The main problem is, that I can't find an easier way to select the row with the highest value. If I simply could make a selection on the inner table and pick the row with the highest value on count_person without losing the information about the movie itself, this would look so much simpler. Is there a way to simplify this, or is this really the easiest way to do this?
Here is a way without subqueries:
SELECT m.title
FROM movie.movie m JOIN
movie.participation p
ON m.id = p.movie
GROUP BY m.title
ORDER BY COUNT(*) DESC
FETCH FIRST 1 ROW ONLY;
You can use LIMIT 1 instead of FETCH, if you prefer.
Note: In the event of ties, this only returns one value. That seems consistent with your question.
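If ties matter, one option is to compare each movie's count against the overall maximum instead of taking the first row. The following is a rough sketch under stated assumptions: it uses an in-memory SQLite database, drops the movie. schema prefix, and loads only movies 1 to 5 from the question. It counts distinct persons, because person 1 appears twice for movie 1 with two different roles.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movie (id INTEGER, title TEXT)")
conn.execute("CREATE TABLE participation (movie INTEGER, person INTEGER, role TEXT)")
conn.executemany("INSERT INTO movie VALUES (?, ?)", [
    (1, "Die Bruecke am Fluss"), (2, "101 Dalmatiner"),
    (3, "Vernetzt - Johnny Mnemonic"), (4, "Waehrend Du schliefst..."),
    (5, "Casper"),
])
conn.executemany("INSERT INTO participation VALUES (?, ?, ?)", [
    (1, 1, "Regisseur"), (1, 1, "Schauspieler"), (1, 2, "Schauspielerin"),
    (2, 3, "Regisseur"),
    (3, 4, "Regisseur"), (3, 5, "Schauspieler"), (3, 6, "Schauspieler"),
    (4, 7, "Schauspielerin"), (4, 8, "Schauspieler"),
    (5, 1, "Schauspieler"), (5, 8, "Schauspieler"), (5, 14, "Schauspieler"),
])

# Keep every movie whose distinct-person count equals the maximum count.
top_titles = [title for (title,) in conn.execute("""
    SELECT m.title
    FROM movie m
    JOIN participation p ON p.movie = m.id
    GROUP BY m.id, m.title
    HAVING COUNT(DISTINCT p.person) = (
        SELECT MAX(n)
        FROM (SELECT COUNT(DISTINCT person) AS n
              FROM participation
              GROUP BY movie) AS per_movie
    )
    ORDER BY m.id
""")]

print(top_titles)
```

With the sample data, movies 3 and 5 both have three distinct participants, so both titles come back.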
You can use the rank window function to do this. Unlike the previous query, it returns every movie tied for first place.
SELECT title
FROM (SELECT m.title,rank() over(order by count(p.person) desc) as rnk
FROM movie.movie m
LEFT JOIN movie.participation p ON m.id=p.movie
GROUP BY m.title
) t
WHERE rnk=1
SELECT title
FROM movie.movie
WHERE id = (SELECT movie
FROM movie.participation
GROUP BY movie
ORDER BY count(*) DESC
LIMIT 1);

How to SUM rows with an outer join?

Question:
I have the following tables that I'd like to sum on two fields: HOURS and RATE. I also want to retrieve the NAME from the third table, joining all 3 tables on the field LINE_NUM.
If the LINE_NUM and CODE are the same, sum the fields of A with B.
Table EARNINGS A:
| EMPLOYEE_ID | LINE_NUM | REG_CODE | REG_HOURS | REG_RATE |
------------------------------------------------------------
| 0001 | 1 | C | 20 | 200 |
| 0002 | 1 | H | 0 | 0 |
Table OTH_EARNINGS B:
| LINE_NUM | OTH_CODE | OTH_HOURS | OTH_RATE |
----------------------------------------------
| 1 | A | 0 | 0 |
| 1 | B | 0 | 0 |
| 1 | C | 10 | 100 |
| 2 | A | 50 | 50 |
Table PAYCHECK C:
| EMPLOYEE_ID | LINE_NUM | NAME |
---------------------------------
| 0001 | 1 | Tom |
| 0001 | 2 | Tom |
| 0002 | 1 | John |
The result I'm looking for should be:
| EMPLOYEE_ID | LINE_NUM | CODE | HOURS | RATE | NAME |
-------------------------------------------------------
| 0001 | 1 | A | 0 | 0 | Tom |
| 0001 | 1 | B | 0 | 0 | Tom |
| 0001 | 1 | C | 30 | 300 | Tom |
| 0001 | 2 | A | 50 | 50 | Tom |
| 0002 | 1 | H | 0 | 0 | John |
Any idea how I can achieve this?
What I tried:
I've tried (table A with C) UNION (table B with C), but I can't get the sums to work.
SELECT C.EMPLOYEE_ID, A.REG_CODE, A.REG_HRS, SUM(A.REG_RATE)
FROM EARNINGS A, PAYCHECK C
WHERE A.LINE_NUM = C.LINE_NUM
GROUP BY C.EMPLOYEE_ID, A.REG_CODE, A.REG_HRS
UNION
SELECT D.EMPLOYEE_ID, B.OTH_CODE, B.OTH_HRS, SUM(B.OTH_RATE)
FROM OTH_EARNINGS B, PAYCHECK D
WHERE B.LINE_NUM = D.LINE_NUM
GROUP BY D.EMPLOYEE_ID, B.OTH_CODE, B.OTH_HRS
But I couldn't get the sum to work and it returned:
| EMPLOYEE_ID | LINE_NUM | CODE | HOURS | RATE | NAME |
-------------------------------------------------------
| 0001 | 1 | A | 0 | 0 | Tom |
| 0001 | 1 | B | 0 | 0 | Tom |
| 0001 | 1 | C | 10 | 100 | Tom |
| 0001 | 1 | C | 20 | 200 | Tom |
| 0001 | 2 | A | 50 | 50 | Tom |
| 0002 | 1 | H | 0 | 0 | John |
Your approach wasn't bad and you were almost there.
Nest the two UNIONed queries in a subquery and apply the GROUP BY to their combined result:
SELECT EMPLOYEE_ID, NAME, CODE, SUM(HRS), SUM(RATE)
FROM
(
SELECT C.EMPLOYEE_ID, C.NAME, A.REG_CODE AS CODE, A.REG_HRS AS HRS, A.REG_RATE AS RATE
FROM EARNINGS A
INNER JOIN PAYCHECK C ON A.LINE_NUM = C.LINE_NUM
UNION ALL
SELECT D.EMPLOYEE_ID, D.NAME, B.OTH_CODE AS CODE, B.OTH_HRS AS HRS, B.OTH_RATE AS RATE
FROM OTH_EARNINGS B
INNER JOIN PAYCHECK D ON B.LINE_NUM = D.LINE_NUM
)
GROUP BY EMPLOYEE_ID, NAME, CODE
However, this will return wrong results, because the JOINs on the PAYCHECK table return duplicates.
There's obviously something missing somewhere.
To identify the employee, you need to combine two columns: EMPLOYEE_ID and LINE_NUM. For the first query, on EARNINGS, that is not a problem because EMPLOYEE_ID is present in the table. For the second query, on OTH_EARNINGS, EMPLOYEE_ID is missing...
In theory you should have something like this (check the INNER JOIN ... ON conditions):
SELECT EMPLOYEE_ID, NAME, CODE, SUM(HRS), SUM(RATE)
FROM
(
SELECT C.EMPLOYEE_ID, C.NAME, A.REG_CODE AS CODE, A.REG_HRS AS HRS, A.REG_RATE AS RATE
FROM EARNINGS A
INNER JOIN PAYCHECK C ON A.LINE_NUM = C.LINE_NUM AND A.EMPLOYEE_ID = C.EMPLOYEE_ID
UNION ALL
SELECT D.EMPLOYEE_ID, D.NAME, B.OTH_CODE AS CODE, B.OTH_HRS AS HRS, B.OTH_RATE AS RATE
FROM OTH_EARNINGS B
INNER JOIN PAYCHECK D ON B.LINE_NUM = D.LINE_NUM AND B.EMPLOYEE_ID = D.EMPLOYEE_ID
)
GROUP BY EMPLOYEE_ID, NAME, CODE
I also changed two things from your initial query:
the JOINs, from implicit to explicit syntax.
the UNION, into a UNION ALL, as there's no reason here to remove duplicates (maybe I am wrong).
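To see the nested UNION ALL plus GROUP BY produce the expected totals, here is a small sketch. It is not a drop-in solution: it assumes OTH_EARNINGS has an EMPLOYEE_ID column (the question's table does not show one, so the values below are hypothetical), uses the column names from the table headers (REG_HOURS, OTH_HOURS), and runs on an in-memory SQLite database. LINE_NUM is added to the grouping so the result matches the layout the question asks for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE earnings (employee_id TEXT, line_num INTEGER, reg_code TEXT,
                       reg_hours INTEGER, reg_rate INTEGER);
-- employee_id on oth_earnings is an ASSUMPTION; the original table lacks it.
CREATE TABLE oth_earnings (employee_id TEXT, line_num INTEGER, oth_code TEXT,
                           oth_hours INTEGER, oth_rate INTEGER);
CREATE TABLE paycheck (employee_id TEXT, line_num INTEGER, name TEXT);

INSERT INTO earnings VALUES ('0001', 1, 'C', 20, 200), ('0002', 1, 'H', 0, 0);
INSERT INTO oth_earnings VALUES ('0001', 1, 'A', 0, 0), ('0001', 1, 'B', 0, 0),
                                ('0001', 1, 'C', 10, 100), ('0001', 2, 'A', 50, 50);
INSERT INTO paycheck VALUES ('0001', 1, 'Tom'), ('0001', 2, 'Tom'), ('0002', 1, 'John');
""")

# Stack both earnings sources, then sum once per employee, line and code.
totals = conn.execute("""
    SELECT employee_id, line_num, code, SUM(hours), SUM(rate), name
    FROM (
        SELECT c.employee_id, c.line_num, a.reg_code AS code,
               a.reg_hours AS hours, a.reg_rate AS rate, c.name
        FROM earnings a
        JOIN paycheck c ON a.line_num = c.line_num AND a.employee_id = c.employee_id
        UNION ALL
        SELECT d.employee_id, d.line_num, b.oth_code,
               b.oth_hours, b.oth_rate, d.name
        FROM oth_earnings b
        JOIN paycheck d ON b.line_num = d.line_num AND b.employee_id = d.employee_id
    ) AS combined
    GROUP BY employee_id, line_num, code, name
    ORDER BY employee_id, line_num, code
""").fetchall()

for row in totals:
    print(row)
```

The two code C rows for employee 0001 on line 1 collapse into a single row with 30 hours and a rate of 300.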

Joining to tables while linking with a crosswalk table

I am trying to write a query to de-identify one of my tables. To make distinct IDs for people, I used name, age and sex. However, the data in my main table has been collected for years, and the sex code changed from 1 meaning male and 2 meaning female to M meaning male and F meaning female. To make this uniform in my distinct individuals table, I used a crosswalk table to convert the sex code into the correct format before placing it into the distinct patients table.
I am now trying to write the query that matches the distinct patient IDs to their correct rows in the main table. The issue is that the sex code has now been changed for some rows. I know I could run an update statement on my main table and change all of the 1 and 2 values to M and F. However, I was wondering whether there is a way to match the old sex codes to the new ones so I would not have to make the update. Is there a way to join the main and distinct IDs tables in the query while using the sex code table to convert the sex codes again? Below are the example tables I am currently using.
This is the main table that I want to de-identify
----------------------------
| Name | age | sex | Toy |
----------------------------
| Stacy| 30 | 1 | Bat |
| Sue | 21 | 2 | Ball |
| Jim | 25 | 1 | Ball |
| Stacy| 30 | M | Ball |
| Sue | 21 | F | glove |
| Stacy| 18 | F | glove |
----------------------------
Sex code crosswalk table
-------------------
| SexOld | SexNew |
-------------------
| M | M |
| F | F |
| 1 | M |
| 2 | F |
-------------------
This is the table I used to populate IDs for the people I found to be distinct in my main table
--------------------------
| ID | Name | age | sex |
--------------------------
| 1 | Stacy| 30 | M |
| 2 | Jim | 25 | M |
| 3 | Stacy| 18 | F |
| 4 | Sue | 21 | F |
--------------------------
This is what I want my de-identified table to look like
---------------
| ID | Toy |
---------------
| 1 | Bat |
| 4 | Ball |
| 2 | Ball |
| 1 | Ball |
| 4 | glove |
| 3 | glove |
---------------
select c.ID, a.Toy
from maintable a
left join sexcodecrosswalk b on b.sexold = a.sex
left join peopleids c on c.Name = a.Name and c.age = a.age and c.Sex = b.sexNew
Here's a demonstration that this works:
http://sqlfiddle.com/#!3/a2d26/1
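For a self-contained check that does not depend on the fiddle link, the same two-step join can be sketched against an in-memory SQLite database. The table names follow the query above; the column types and the ORDER BY a.rowid (used only to keep the rows in insertion order) are assumptions for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE maintable (name TEXT, age INTEGER, sex TEXT, toy TEXT);
CREATE TABLE sexcodecrosswalk (sexold TEXT, sexnew TEXT);
CREATE TABLE peopleids (id INTEGER, name TEXT, age INTEGER, sex TEXT);

INSERT INTO maintable VALUES
    ('Stacy', 30, '1', 'Bat'), ('Sue', 21, '2', 'Ball'), ('Jim', 25, '1', 'Ball'),
    ('Stacy', 30, 'M', 'Ball'), ('Sue', 21, 'F', 'glove'), ('Stacy', 18, 'F', 'glove');
INSERT INTO sexcodecrosswalk VALUES ('M', 'M'), ('F', 'F'), ('1', 'M'), ('2', 'F');
INSERT INTO peopleids VALUES
    (1, 'Stacy', 30, 'M'), (2, 'Jim', 25, 'M'), (3, 'Stacy', 18, 'F'), (4, 'Sue', 21, 'F');
""")

# First translate the old sex code, then match on name, age and the new code.
deidentified = conn.execute("""
    SELECT c.id, a.toy
    FROM maintable a
    LEFT JOIN sexcodecrosswalk b ON b.sexold = a.sex
    LEFT JOIN peopleids c
           ON c.name = a.name AND c.age = a.age AND c.sex = b.sexnew
    ORDER BY a.rowid
""").fetchall()

for row in deidentified:
    print(row)
```

The main table is never updated; the crosswalk row for each old code supplies the value that peopleids expects.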

SQL join where only the max value should be returned

Looking for some help with SQL. I have the following 4 tables
Users Table
+-----------------------------+
| ID | First_Name | Last_Name |
+-----------------------------+
| 1 | Billy | O'Neal |
+----+------------+-----------+
| 2 | John | Skeet |
+----+------------+-----------+
| 3 | Ken | Stamp |
+----+------------+-----------+
| 4 | Doug | Feng |
+----+------------+-----------+
Book_CheckOut
+----+--------------+---------------+
| ID | User_ID | Book_ID |
+-----------------------------------+
| 1 | 1 | 1 |
+----+--------------+---------------+
| 2 | 2 | 3 |
+----+--------------+---------------+
| 3 | 2 | 1 |
+----+--------------+---------------+
| 4 | 2 | 2 |
+----+--------------+---------------+
| 5 | 3 | 1 |
+----+--------------+---------------+
| 6 | 1 | 4 |
+----+--------------+---------------+
| 7 | 1 | 0 |
+----+--------------+---------------+
Books
+---------+-------------+-------------+
| ID | Book_Name | Location_ID |
+-----------------------+-------------+
| 1 | Programming | 1 |
+---------+-------------+-------------+
| 2 | Cooking | 3 |
+---------+-------------+-------------+
| 3 | Dancing | 2 |
+---------+-------------+-------------+
| 4 | Sports | 1 |
+---------+-------------+-------------+
Location
+---------+-------------+
| ID | Loc_Name |
+-----------------------+
| 1 | Palo Alto |
+---------+-------------+
| 2 | San Jose |
+---------+-------------+
| 3 | Oakland |
+---------+-------------+
| 4 | Cupertino |
+---------+-------------+
What I am trying to get is every person together with the latest book they checked out. If a person has no check-out record at all, they should still show up. If the latest record matches no book (for example Book_ID 0, which means the person has returned all books), they should show up as well.
End results
Record
+-----------------+----------------+----------------+
| First_Name | Book_Name | Loc_Name |
+-----------------+----------------+----------------+
| Billy | | |
+-----------------+----------------+----------------+
| John | Cooking | Oakland |
+-----------------+----------------+----------------+
| Ken | Programming | Palo Alto |
+-----------------+----------------+----------------+
| Doug | | |
+-----------------+----------------+----------------+
Billy doesn't have anything since his last record in Book_CheckOut has Book_ID 0, and Doug doesn't have anything since there is no record of him in Book_CheckOut.
I have tried various joins with MAX() and GROUP BY, but none of them seems to satisfy everything I am looking for.
Any help is greatly appreciated.
try this:
select
u.first_name,
b.book_name,
l.loc_name
from users u
left join (select *
from book_checkout t0
where id = (select
max(id)
from book_checkout
where user_id = t0.user_id
)
) bc on bc.user_id = u.id
left join books b on b.id = bc.book_id
left join location l on l.id = b.location_id
The subquery inside the first join selects only the last record for each user. Note that this query assumes every user checks out only one book at a time.
Let me know if it works.
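The same LEFT JOIN chain can be tried against the question's sample data. This is a minimal sketch, assuming an in-memory SQLite database and lower-case table names; the ORDER BY u.id is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER, first_name TEXT, last_name TEXT);
CREATE TABLE book_checkout (id INTEGER, user_id INTEGER, book_id INTEGER);
CREATE TABLE books (id INTEGER, book_name TEXT, location_id INTEGER);
CREATE TABLE location (id INTEGER, loc_name TEXT);

INSERT INTO users VALUES (1, 'Billy', 'O''Neal'), (2, 'John', 'Skeet'),
                         (3, 'Ken', 'Stamp'), (4, 'Doug', 'Feng');
INSERT INTO book_checkout VALUES (1, 1, 1), (2, 2, 3), (3, 2, 1), (4, 2, 2),
                                 (5, 3, 1), (6, 1, 4), (7, 1, 0);
INSERT INTO books VALUES (1, 'Programming', 1), (2, 'Cooking', 3),
                         (3, 'Dancing', 2), (4, 'Sports', 1);
INSERT INTO location VALUES (1, 'Palo Alto'), (2, 'San Jose'),
                            (3, 'Oakland'), (4, 'Cupertino');
""")

# The derived table keeps only each user's check-out row with the highest id;
# the LEFT JOINs keep users with no row (Doug) or an unmatched book_id 0 (Billy).
latest = conn.execute("""
    SELECT u.first_name, b.book_name, l.loc_name
    FROM users u
    LEFT JOIN (SELECT *
               FROM book_checkout t0
               WHERE id = (SELECT MAX(id)
                           FROM book_checkout
                           WHERE user_id = t0.user_id)) bc ON bc.user_id = u.id
    LEFT JOIN books b ON b.id = bc.book_id
    LEFT JOIN location l ON l.id = b.location_id
    ORDER BY u.id
""").fetchall()

for row in latest:
    print(row)
```

Billy and Doug come back with NULL (None in Python) for the book and location, which corresponds to the blank cells in the expected result.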
SELECT LC.First_Name
, ISNULL(B.Book_Name, N'') AS BookName
, ISNULL(L.Loc_Name, N'') AS Loc_Name
FROM Books AS B
INNER JOIN Book_CheckOut AS BC ON B.ID = BC.Book_ID
INNER JOIN Location AS L ON B.Location_ID = L.ID
RIGHT OUTER JOIN (SELECT U.First_Name
, ISNULL(MAX(BC.ID), 0) AS BCID
FROM Users AS U
LEFT OUTER JOIN Book_CheckOut AS BC ON U.ID = BC.User_ID
GROUP BY U.First_Name) AS LC ON BC.ID = LC.BCID
The subquery shows Last CheckOut of all users.
select First_Name, Book_Name, Loc_Name
from Users U, (select * from Book_CheckOut where ID in (select max(ID) from Book_CheckOut group by User_ID) and Book_ID is not null order by ID) BC, Books B, Location L
where U.ID = BC.User_ID and B.ID = BC.Book_ID and L.ID = B.Location_ID;
The above query results:
John Cooking Oakland
Ken Programming Palo Alto

Finding number of types of accounts from each customer

I am having a lot of trouble with trying to construct a query that will give me the name of each customer and the number of different types of accounts each has. The three types are Checkings, Savings, and CD.
customers:
+--------+--------+
| cid | name |
+--------+--------+
| 1 | a |
| 2 | b |
| 3 | c |
+--------+--------+
accounts:
+-----------+-----------+
| aid | type |
+-----------+-----------+
| 1 | Checkings |
| 2 | Savings |
| 3 | Checkings |
| 4 | CD |
| 5 | CD |
| 6 | Checkings |
+-----------+-----------+
transactions:
+--------+--------+--------+
| tid | cid | aid |
+--------+--------+--------+
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 2 | 3 |
| 4 | 3 | 4 |
| 5 | 1 | 5 |
| 6 | 3 | 4 |
| 7 | 1 | 6 |
+--------+--------+--------+
The expected answer would be:
a, 3
b, 1
c, 1
Getting the names is simple enough, but how can I keep count of each individual's account as well as compare the accounts to make sure that it is not the same type?
Just add DISTINCT inside the COUNT:
SELECT a.cid, a.name, COUNT(DISTINCT c.type) totalCount
FROM customers a
INNER JOIN transactions b
ON a.cid = b.cid
INNER JOIN accounts c
ON b.aid = c.aid
GROUP BY a.cid, a.name
Query:
SELECT
a."name",
COUNT(DISTINCT c."type") totalCount
FROM customers a
INNER JOIN transactions b
ON a."cid" = b."cid"
INNER JOIN accounts c
ON b."aid" = c."aid"
GROUP BY a."cid", a."name"
ORDER BY totalCount DESC
Result:
| NAME | TOTALCOUNT |
---------------------
| a | 3 |
| b | 1 |
| c | 1 |
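To reproduce that result without a fiddle, the query can be run against the question's sample rows. This is a small sketch, assuming an in-memory SQLite database; the secondary sort on name is added here only so that the tie between b and c has a stable order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (cid INTEGER, name TEXT);
CREATE TABLE accounts (aid INTEGER, type TEXT);
CREATE TABLE transactions (tid INTEGER, cid INTEGER, aid INTEGER);

INSERT INTO customers VALUES (1, 'a'), (2, 'b'), (3, 'c');
INSERT INTO accounts VALUES (1, 'Checkings'), (2, 'Savings'), (3, 'Checkings'),
                            (4, 'CD'), (5, 'CD'), (6, 'Checkings');
INSERT INTO transactions VALUES (1, 1, 1), (2, 1, 2), (3, 2, 3), (4, 3, 4),
                                (5, 1, 5), (6, 3, 4), (7, 1, 6);
""")

# DISTINCT makes customer a's two Checkings accounts count as one type.
type_counts = conn.execute("""
    SELECT a.name, COUNT(DISTINCT c.type) AS totalCount
    FROM customers a
    INNER JOIN transactions b ON a.cid = b.cid
    INNER JOIN accounts c ON b.aid = c.aid
    GROUP BY a.cid, a.name
    ORDER BY totalCount DESC, a.name
""").fetchall()

print(type_counts)
```

Without DISTINCT, customer a would be counted four times (accounts 1, 2, 5 and 6) and customer c twice (two transactions on account 4).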