How to count all subquery results that returns only the most recent item? - sql

In this post (Adding a Query to a Subquery then produces no results) @D-Shih provided a great solution, which I would like to extend.
How do I add to the results returned, the count of reports by that teacher, even if the subquery is only finding the last one?
I'm trying to solve the <???> AS CountOfReports, line below, but my SQL skills are not that great.
SELECT
t.NAME,
t1.REPORTINGTYPE,
<???> AS CountOfReports, <<<< ****
t1.REPORTINGPERIOD
FROM
teachers AS t
INNER JOIN
(SELECT
*,
(SELECT COUNT(*) FROM REPORTS tt
WHERE tt.TEACHER_ID = t1.TEACHER_ID
AND tt.REPORTINGPERIOD >= t1.REPORTINGPERIOD) rn
FROM
REPORTS t1) AS t1 ON t1.TEACHER_ID = t.id AND rn = 1
ORDER BY
t.NAME

You can compute the count with a correlated subquery:
SELECT t.Name,
r.ReportingType,
max(r.ReportingPeriod),
(SELECT count(*)
FROM Reports r2
WHERE r2.Teacher_ID = r.Teacher_ID
) AS Reports
FROM Teachers t
JOIN Reports r ON t.ID = r.Teacher_ID
GROUP BY r.Teacher_ID;
NAME REPORTINGTYPE max(r.ReportingPeriod) Reports
-------------- ------------- ---------------------- ----------
Mr John Smith Final 2017-03 3
Ms Janet Smith Draft 2018-07 2
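If your database supports window functions, the count and the latest row can also be picked up in one pass over Reports. Below is a minimal sketch, run through Python's built-in sqlite3 module against an in-memory database (window functions need SQLite 3.25 or newer). The earlier reports for each teacher are invented sample rows, not data from the question; only the latest row per teacher mirrors the output above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Teachers (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Reports (Teacher_ID INTEGER, ReportingType TEXT, ReportingPeriod TEXT);
INSERT INTO Teachers VALUES (1, 'Mr John Smith'), (2, 'Ms Janet Smith');
-- Hypothetical sample rows: the older reports are assumptions for illustration.
INSERT INTO Reports VALUES
  (1, 'Draft',   '2016-11'), (1, 'Interim', '2017-01'), (1, 'Final', '2017-03'),
  (2, 'Interim', '2018-05'), (2, 'Draft',   '2018-07');
""")

rows = conn.execute("""
SELECT t.Name, r.ReportingType, r.ReportingPeriod, r.CountOfReports
FROM Teachers t
JOIN (SELECT Teacher_ID, ReportingType, ReportingPeriod,
             -- total number of reports per teacher, repeated on every row
             COUNT(*) OVER (PARTITION BY Teacher_ID) AS CountOfReports,
             -- 1 marks the most recent report per teacher
             ROW_NUMBER() OVER (PARTITION BY Teacher_ID
                                ORDER BY ReportingPeriod DESC) AS rn
      FROM Reports) r
  ON r.Teacher_ID = t.ID AND r.rn = 1
ORDER BY t.Name
""").fetchall()

for row in rows:
    print(row)
```

Unlike the GROUP BY version, this does not rely on how a particular engine handles non-aggregated columns next to max().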

Related

SQL Query: Count "id" occurrences in two tables

I have these 3 tables and I am trying to count how many "hints" and "quizzes" there are for a specific town id.
db_town
=======
id town
-- --------
1  New York
db_hint
=======
id town_id hint
-- ------- ----
1  1       test
db_quiz
=======
id town_id quiz
-- ------- ------
1  1       quiz 1
2  1       quiz 2
I am using this statement, but it does not work :(
SELECT count(q.id),count(h.id) FROM `db_town` t LEFT JOIN `db_quiz` q ON t.id = q.town_id LEFT JOIN `db_hint` h ON t.id = h.town_id WHERE t.id = 1 GROUP BY t.id
and it produces this result:
count(q.id) count(h.id)
----------- -----------
2           2
Do I need to use two statements? Or is it possible to query it in a single SQL statement? I am using MariaDB.
You can use union all and aggregation:
select town_id, sum(is_hint), sum(is_quiz)
from ((select town_id, 1 as is_hint, 0 as is_quiz
from db_hint
) union all
(select town_id, 0, 1
from db_quiz
)
) t
group by town_id;
Alternatively, you can use correlated subqueries:
select t.*,
(select count(*) from db_hint h where h.town_id = t.id),
(select count(*) from db_quiz q where q.town_id = t.id)
from db_town t;
Two things to look out for:
JOINs are likely to multiply rows and throw off the counts.
Getting 0 values if a town has no hints or quizzes.
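To see the row multiplication concretely, here is a small sketch that loads the question's sample rows into an in-memory SQLite database through Python's built-in sqlite3 module (the question uses MariaDB; SQLite is only an assumption to keep the example self-contained) and compares the two-JOIN query with the correlated subqueries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE db_town (id INTEGER PRIMARY KEY, town TEXT);
CREATE TABLE db_hint (id INTEGER PRIMARY KEY, town_id INTEGER, hint TEXT);
CREATE TABLE db_quiz (id INTEGER PRIMARY KEY, town_id INTEGER, quiz TEXT);
INSERT INTO db_town VALUES (1, 'New York');
INSERT INTO db_hint VALUES (1, 1, 'test');
INSERT INTO db_quiz VALUES (1, 1, 'quiz 1'), (2, 1, 'quiz 2');
""")

# Two LEFT JOINs: each quiz row is paired with each hint row (2 x 1 = 2 rows),
# so both counts see 2 rows.
joined = conn.execute("""
SELECT count(q.id), count(h.id)
FROM db_town t
LEFT JOIN db_quiz q ON t.id = q.town_id
LEFT JOIN db_hint h ON t.id = h.town_id
WHERE t.id = 1
GROUP BY t.id
""").fetchone()

# Correlated subqueries: each table is counted on its own.
separate = conn.execute("""
SELECT (SELECT count(*) FROM db_quiz q WHERE q.town_id = t.id),
       (SELECT count(*) FROM db_hint h WHERE h.town_id = t.id)
FROM db_town t
WHERE t.id = 1
""").fetchone()

print(joined)    # (2, 2): the hint count is inflated
print(separate)  # (2, 1): the real counts
```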
You can use COUNT(DISTINCT ...) if both the hint id and the quiz id are unique.
SELECT
count(distinct q.id),count(distinct h.id)
FROM `db_town` t
LEFT JOIN `db_quiz` q ON t.id = q.town_id
LEFT JOIN `db_hint` h ON t.id = h.town_id
WHERE t.id = 1 GROUP BY t.id

SQL subquery with latest record

I've read just about every question on here that I can find that is referencing getting the latest record from a subquery, but I just can't work out how to make it work in my situation.
I'm creating an SSRS report for use on SQL Server 2008.
In the database is a table of contacts and DBSdata. I want to pull up a list of contacts and the latest record (many of the fields from that row) from the DBSdata table (expiry date furthest in the future)
Contacts
========
PKContactID ContactName
----------- -----------
1 JONES Chris
2 SMITH Mary
3 GREY Jean
DBSdata
=======
Ordinal FKContactID ExpiryDate IssueDate DBSType
------- ----------- ---------- --------- -------
3 1 2021-09-01 2019-09-01 Internal
2 1 2019-08-31 2017-08-31 External
1 1 2017-07-01 2015-07-01 Internal
2 2 2021-04-15 2019-04-15 Internal
1 2 2019-05-05 2017-05-06 External
1 3 2018-01-03 2016-03-02 External
And the result I'd like is:
Latest DBS
==========
PKContactID ContactName ExpiryDate IssueDate DBSType
-------------------------------------------------------------------
3 GREY Jean 2018-01-03 2016-03-02 External
1 JONES Chris 2021-09-01 2019-09-01 Internal
2 SMITH Mary 2021-04-15 2019-04-15 Internal
[The DBSData table doesn't have its own Primary Key field - that's not something I have control over, unfortunately... And the ordinal increases per contact, so FKContactID+Ordinal is unique....]
This is the code I've kind of got to, but it isn't working. The system I'm uploading the SSRS to doesn't give me any useful error message at all, so I can't be more specific about what isn't working I'm afraid. I get none of the SSRS report displayed, just an error saying the dataset source isn't working.
SELECT
c.PKContactID, c.ContactName, d.ExpiryDate, d.IssueDate, d.DBSType
FROM
Contacts c
LEFT JOIN (
SELECT TOP 1 FKContactID, ExpiryDate, IssueDate, DBSType
FROM DBSData
WHERE FKContactID = c.PKContactID
ORDER BY ExpiryDate DESC
) d ON c.PKContactID = d.FKContactID
ORDER BY
c.ContactName
I suspect it's something to do with that WHERE in the subquery, but if I leave it out, the subquery runs against the WHOLE table and returns 1 row overall, not the top 1 for each contact.
Your method would work using APPLY, instead of JOIN:
SELECT c.PKContactID, c.ContactName,
d.ExpiryDate, d.IssueDate, d.DBSType
FROM Contacts c OUTER APPLY
(SELECT TOP 1 d.*
FROM DBSData d
WHERE d.FKContactID = c.PKContactID
ORDER BY d.ExpiryDate DESC
) d
ORDER BY c.ContactName;
Technically APPLY implements something called a lateral join. This is like a correlated subquery, but it can return multiple rows and multiple columns. Lateral joins are very powerful, and this is a good example for using them.
For performance, you want indexes on DBSData(FKContactID, ExpiryDate DESC) (perhaps including the other columns you want as well) and Contacts(ContactName).
With the right indexes, I would expect this to have performance at least as good as other methods.
An alternative that also typically has good performance is using a correlated subquery for filtering:
SELECT c.PKContactID, c.ContactName,
d.ExpiryDate, d.IssueDate, d.DBSType
FROM Contacts c LEFT JOIN
DBSData d
ON d.FKContactID = c.PKContactID AND
d.ExpiryDate = (SELECT MAX(d2.ExpiryDate)
FROM DBSData d2
WHERE d2.FKContactID = d.FKContactID
);
Note that to match the LEFT JOIN, the correlation condition needs to be in the ON clause, not the WHERE clause.
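The difference between ON and WHERE is easiest to see with a contact that has no DBS rows at all. The sketch below runs the same query both ways in an in-memory SQLite database via Python's sqlite3 module (SQLite rather than SQL Server is an assumption made to keep it runnable anywhere). Contact 4, 'WHITE Sam', is a hypothetical row added for illustration; the rest is the question's sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Contacts (PKContactID INTEGER PRIMARY KEY, ContactName TEXT);
CREATE TABLE DBSData (Ordinal INTEGER, FKContactID INTEGER,
                      ExpiryDate TEXT, IssueDate TEXT, DBSType TEXT);
INSERT INTO Contacts VALUES (1, 'JONES Chris'), (2, 'SMITH Mary'), (3, 'GREY Jean');
-- Hypothetical contact with no DBS rows, added only for this illustration.
INSERT INTO Contacts VALUES (4, 'WHITE Sam');
INSERT INTO DBSData VALUES
  (3, 1, '2021-09-01', '2019-09-01', 'Internal'),
  (2, 1, '2019-08-31', '2017-08-31', 'External'),
  (1, 1, '2017-07-01', '2015-07-01', 'Internal'),
  (2, 2, '2021-04-15', '2019-04-15', 'Internal'),
  (1, 2, '2019-05-05', '2017-05-06', 'External'),
  (1, 3, '2018-01-03', '2016-03-02', 'External');
""")

QUERY = """
SELECT c.PKContactID, d.ExpiryDate
FROM Contacts c
LEFT JOIN DBSData d
       ON d.FKContactID = c.PKContactID
{keyword} d.ExpiryDate = (SELECT MAX(d2.ExpiryDate)
                    FROM DBSData d2
                    WHERE d2.FKContactID = d.FKContactID)
ORDER BY c.PKContactID
"""

# Filter as part of the join condition: unmatched contacts are kept with NULLs.
on_rows = conn.execute(QUERY.format(keyword="AND")).fetchall()
# Filter after the join: the NULL row fails the comparison and is dropped.
where_rows = conn.execute(QUERY.format(keyword="WHERE")).fetchall()

print(on_rows)
print(where_rows)
```

With the condition in ON, contact 4 comes back with a NULL expiry date; with WHERE, it disappears and the LEFT JOIN behaves like an INNER JOIN.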
Finally, if you do use window functions, I would recommend a subquery for getting the first row:
SELECT c.PKContactID, c.ContactName,
d.ExpiryDate, d.IssueDate, d.DBSType
FROM Contacts c LEFT JOIN
(SELECT d.*,
ROW_NUMBER() OVER (PARTITION BY d.FKContactID ORDER BY d.ExpiryDate DESC) as seqnum
FROM DBSData d
) d
ON d.FKContactID = c.PKContactID AND
d.seqnum = 1;
Doing the subquery before the JOIN gives more opportunities for the optimizer to produce a better execution plan.
Here's one option using row_number():
SELECT *
FROM (
SELECT
c.PKContactID, c.ContactName, d.ExpiryDate, d.IssueDate, d.DBSType,
row_number() over (partition by c.PKContactID order by d.ExpiryDate desc) rn
FROM
Contacts c
LEFT JOIN DBSData d ON d.FKContactID = c.PKContactID
) t
WHERE rn = 1
ORDER BY ContactName
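This solution gives the expected result, and its performance is much better. Note that it picks the row with the highest Ordinal per contact (which is also the latest expiry date in the sample data) and, because of the inner join, leaves out contacts that have no DBSdata rows.
select c.PKContactID,c.ContactName,d.ExpiryDate, d.IssueDate, d.DBSType from Contacts c
inner join DBSdata d
on c.PKContactID=d.FKContactID
where d.Ordinal in (select max(d2.Ordinal) from DBSdata d2 where d2.FKContactID=c.PKContactID)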
order by c.ContactName

SQL - Select highest value when data across 3 tables

I have 3 tables:
Person (with a column PersonKey)
Telephone (with columns Tel_NumberKey, Tel_Number, Tel_NumberType e.g. 1=home, 2=mobile)
xref_Person+Telephone (columns PersonKey, Tel_NumberKey, CreatedDate, ModifiedDate)
I'm looking to get the most recent entry (i.e. the highest Tel_NumberKey) from xref_Person+Telephone for each Person and use that Tel_NumberKey to get the actual Tel_Number from the Telephone table.
The problem I am having is that I keep getting duplicates for the same Tel_NumberKey. I also need to be sure I get both the home and mobile from the Telephone table, which I've been looking to do via 2 individual joins for each Tel_NumberType - again getting duplicates.
Been trying the following but to no avail:
-- For HOME
SELECT
p.PersonKey, pn.Phone_Number, pn.Tel_NumberKey
FROM
Persons AS p
INNER JOIN
xref_Person+Telephone AS x ON p.PersonKey = x.PersonKey
INNER JOIN
Telephone AS pn ON x.Tel_NumberKey = pn.Tel_NumberKey
WHERE
pn.Tel_NumberType = 1 -- e.g. Home phone number
AND pn.Tel_NumberKey = (SELECT MAX(pn1.Tel_NumberKey) AS Tel_NumberKey
FROM Person AS p1
INNER JOIN xref_Person+Telephone AS x1 ON p1.PersonKey = x1.PersonKey
INNER JOIN Telephone AS pn1 ON x1.Tel_NumberKey = pn1.Tel_NumberKey
WHERE pn1.Tel_NumberType = 1
AND p1.PersonKey = p.PersonKey
AND pn1.Tel_Number = pn.Tel_Number)
ORDER BY
p.PersonKey
And have been looking over the following links but again keep getting duplicates.
SQL select max(date) and corresponding value
How can I SELECT rows with MAX(Column value), DISTINCT by another column in SQL?
SQL Server: SELECT only the rows with MAX(DATE)
I'm sure this must be possible, but I've been at this a couple of days and can't believe it's that difficult to get the most recent / highest value when referencing 3 tables. Any help greatly appreciated.
select *
from
( SELECT p.PersonKey, pn.Phone_Number, pn.Tel_NumberKey
, row_number() over (partition by p.PersonKey, pn.Phone_Number order by pn.Tel_NumberKey desc) rn
FROM
Persons AS p
INNER JOIN
xref_Person+Telephone AS x ON p.PersonKey = x.PersonKey
INNER JOIN
Telephone AS pn ON x.Tel_NumberKey = pn.Tel_NumberKey
WHERE
pn.Tel_NumberType = 1
) tt
where tt.rn = 1
ORDER BY
tt.PersonKey
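To get both the home and the mobile number in one result, one option is to partition by person and number type, keep the top row of each partition, and then pivot. This is a sketch only, run in an in-memory SQLite database through Python's sqlite3 module (needs SQLite 3.25+ for window functions). The table name xref_Person_Telephone (underscore instead of "+") and all phone numbers are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (PersonKey INTEGER PRIMARY KEY);
CREATE TABLE Telephone (Tel_NumberKey INTEGER PRIMARY KEY,
                        Tel_Number TEXT, Tel_NumberType INTEGER);
-- Hypothetical name: the question's table is called xref_Person+Telephone.
CREATE TABLE xref_Person_Telephone (PersonKey INTEGER, Tel_NumberKey INTEGER);
INSERT INTO Person VALUES (1), (2);
-- Hypothetical numbers; type 1 = home, 2 = mobile.
INSERT INTO Telephone VALUES
  (10, '555-0100', 1), (11, '555-0101', 1), (12, '555-0200', 2), (13, '555-0300', 1);
INSERT INTO xref_Person_Telephone VALUES (1, 10), (1, 11), (1, 12), (2, 13);
""")

rows = conn.execute("""
SELECT PersonKey,
       MAX(CASE WHEN Tel_NumberType = 1 THEN Tel_Number END) AS Home,
       MAX(CASE WHEN Tel_NumberType = 2 THEN Tel_Number END) AS Mobile
FROM (SELECT x.PersonKey, t.Tel_Number, t.Tel_NumberType,
             -- rn = 1 is the highest Tel_NumberKey per person and number type
             ROW_NUMBER() OVER (PARTITION BY x.PersonKey, t.Tel_NumberType
                                ORDER BY t.Tel_NumberKey DESC) AS rn
      FROM xref_Person_Telephone x
      JOIN Telephone t ON t.Tel_NumberKey = x.Tel_NumberKey) ranked
WHERE rn = 1
GROUP BY PersonKey
ORDER BY PersonKey
""").fetchall()

print(rows)
```

Person 1 has two home numbers here, and only the one with the higher key survives; person 2 has no mobile number, so that column is NULL.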
You have to use the max() function and then order by rownum in descending order, like this:
select f.empno
from(select max(empno) empno from emp e
group by rownum)f
order by rownum desc
It will list the employees from the highest employee number to the lowest. Now apply it to your case and let me know.

sql select records with matching subsets

There are two sets of employees: managers and grunts.
For each manager, there's a table manager_meetings that holds a list of which meetings each manager attended. A similar table grunt_meetings holds a list of which meetings each grunt attended.
So:
manager_meetings        grunt_meetings
managerID meetingID     gruntID meetingID
--------- ---------     ------- ---------
1         a             4       a
1         b             4       b
1         c             4       c
2         a             4       d
2         b             5       a
3         c             5       b
3         d             5       c
3         e             6       a
                        6       c
                        7       b
                        7       a
The owner doesn't like it when a manager and a grunt know exactly the same information. It makes his head hurt. He wants to identify this situation, so he can demote the manager to a grunt, or promote the grunt to a manager, or take them both golfing. The owner likes to golf.
The task is to list every combination of manager and grunt where both attended exactly the same meetings. If the manager attended more meetings than the grunt, no match. If the grunt attended more meetings than the manager, no match.
The expected results here are:
ManagerID GruntID
2 7
1 5
...because manager 2 and grunt 7 both attended (a,b), while manager 1 and grunt 5 both attended (a,b,c).
I can solve it in a clunky way, by pivoting up the subset of meetings in a subquery into XML, and comparing each grunt's XML list to each manager's XML. But that's horrible, and also I have to explain to the owner what XML is. And I don't like golfing.
Is there some better way to do "WHERE {subset1} = {subset2}"? It feels like I'm missing some clever kind of join.
Here is a version that works:
select m.mId, g.gId, count(*) --select m.mid, g.gid, mm.meetingid, gm.meetingid as gmm
from manager m cross join
grunt g left outer join
(select mm.*, count(*) over (partition by mm.mid) as cnt
from manager_meeting mm
) mm
on mm.mid = m.mId full outer join
(select gm.*, count(*) over (partition by gm.gid) as cnt
from grunt_meeting gm
) gm
on gm.gid = g.gid and gm.meetingid = mm.meetingid
group by m.mId, g.gId, mm.cnt, gm.cnt
having count(*) = mm.cnt and mm.cnt = gm.cnt;
The string comparison method is shorter, perhaps easier to understand, and probably faster.
EDIT:
For your particular case of getting exact matches, the query can be simplified:
select mm.mId, gm.gId
from (select mm.*, count(*) over (partition by mm.mid) as cnt
from manager_meeting mm
) mm join
(select gm.*, count(*) over (partition by gm.gid) as cnt
from grunt_meeting gm
) gm
on gm.meetingid = mm.meetingid and
mm.cnt = gm.cnt
group by mm.mId, gm.gId
having count(*) = max(mm.cnt);
This might be more competitive with the string version, both in terms of performance and clarity.
It counts the number of matches between a grunt and a manager. It then checks that this is all the meetings for each.
An attempt at avenging Aaron's defeat – a solution using EXCEPT:
SELECT
m.mID,
g.gID
FROM
manager AS m
INNER JOIN
grunt AS g
ON NOT EXISTS (
SELECT meetingID
FROM manager_meeting
WHERE mID = m.mID
EXCEPT
SELECT meetingID
FROM grunt_meeting
WHERE gID = g.gID
)
AND NOT EXISTS (
SELECT meetingID
FROM grunt_meeting
WHERE gID = g.gID
EXCEPT
SELECT meetingID
FROM manager_meeting
WHERE mID = m.mID
);
Basically, subtract a grunt's set of meetings from a manager's set of meetings, then the other way round. If neither result contains rows, the grunt and the manager attended the same set of meetings.
Please note that this query will match managers and grunts that never attended a single meeting.
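Here is a runnable sketch of the same double-EXCEPT check, using the question's meeting data in an in-memory SQLite database via Python's sqlite3 module (SQLite is an assumption; the pair condition is written as CROSS JOIN plus WHERE, which is the same logic). Manager 8 and grunt 9 are hypothetical additions with no meetings, included to show the caveat about empty sets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE manager (mID INTEGER PRIMARY KEY);
CREATE TABLE grunt (gID INTEGER PRIMARY KEY);
CREATE TABLE manager_meeting (mID INTEGER, meetingID TEXT);
CREATE TABLE grunt_meeting (gID INTEGER, meetingID TEXT);
-- Manager 8 and grunt 9 are hypothetical: they attended no meetings.
INSERT INTO manager VALUES (1), (2), (3), (8);
INSERT INTO grunt VALUES (4), (5), (6), (7), (9);
INSERT INTO manager_meeting VALUES
  (1,'a'), (1,'b'), (1,'c'), (2,'a'), (2,'b'), (3,'c'), (3,'d'), (3,'e');
INSERT INTO grunt_meeting VALUES
  (4,'a'), (4,'b'), (4,'c'), (4,'d'), (5,'a'), (5,'b'), (5,'c'),
  (6,'a'), (6,'c'), (7,'b'), (7,'a');
""")

pairs = conn.execute("""
SELECT m.mID, g.gID
FROM manager AS m
CROSS JOIN grunt AS g
WHERE NOT EXISTS (  -- nothing the manager attended is missing for the grunt
        SELECT meetingID FROM manager_meeting WHERE mID = m.mID
        EXCEPT
        SELECT meetingID FROM grunt_meeting WHERE gID = g.gID)
  AND NOT EXISTS (  -- nothing the grunt attended is missing for the manager
        SELECT meetingID FROM grunt_meeting WHERE gID = g.gID
        EXCEPT
        SELECT meetingID FROM manager_meeting WHERE mID = m.mID)
ORDER BY m.mID, g.gID
""").fetchall()

print(pairs)  # [(1, 5), (2, 7), (8, 9)]
```

The pair (8, 9) appears because two empty sets are equal; add a condition requiring at least one meeting if that is not wanted.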
An alternative version, though it requires another table. Basically, we give each meeting a distinct power of two as its 'value', then sum every manager's meeting values and every grunt's meeting values. Where the sums are the same, we have a match.
It should be possible to make the meeting_values table a TVF, but this is a little bit simpler.
Additional table:
CREATE TABLE meeting_values (value INT, meetingID CHAR(1));
INSERT INTO meeting_values VALUES
(1,'a'),(2,'b'),(4,'c'),(8,'d'),(16,'e');
And the query:
SELECT managemeets.mID, gruntmeets.gID
FROM
( SELECT gm.gID, sum(value) AS meeting_totals
FROM grunt_meeting gm
INNER JOIN
meeting_values mv ON gm.meetingID = mv.meetingID
GROUP BY gm.gID
) gruntmeets
INNER JOIN
( SELECT mm.mID, sum(value) AS meeting_totals
FROM manager_meeting mm
INNER JOIN
meeting_values mv ON mm.meetingID = mv.meetingID
GROUP BY mm.mID
) managemeets ON gruntmeets.meeting_totals = managemeets.meeting_totals

SQL display two results side-by-side

I have two tables, and am doing an ordered select on each of them. I would like to see the results of both orders in one result.
Example (simplified):
"SELECT * FROM table1 ORDER BY visits;"
name|# of visits
----+-----------
AA | 5
BB | 9
CC | 12
.
.
.
"SELECT * FROM table2 ORDER BY spent;"
name|$ spent
----+-------
AA | 20
CC | 30
BB | 50
.
.
.
I want to display the results as two columns so I can visually get a feeling if the most frequent visitors are also the best buyers. (I know this example is bad DB design and not a real scenario. It is an example)
I want to get this:
name by visits|name by spent
--------------+-------------
AA | AA
BB | CC
CC | BB
I am using SQLite.
Select A.Name as NameByVisits, B.Name as NameBySpent
From (Select C.*, RowId as RowNumber From (Select Name From Table1 Order by visits) C) A
Inner Join
(Select D.*, RowId as RowNumber From (Select Name From Table2 Order by spent) D) B
On A.RowNumber = B.RowNumber
Try this
select
ISNULL(ts.rn,tv.rn),
ts.name,
tv.name
from
(select *, (select count(*) rn from spent s where s.value>=spent.value ) rn from spent) ts
full outer join
(select *, (select count(*) rn from visits v where v.visits>=visits.visits ) rn from visits) tv
on ts.rn = tv.rn
order by ISNULL(ts.rn,tv.rn)
It creates a rank for each entry in the source table, and joins the two on their rank. If there are duplicate ranks they will return duplicates in the results.
I know it is not a direct answer, but I was searching for it so in case someone needs it: this is a simpler solution for when the results are only one per column:
select
(select roleid from role where rolename='app.roles/anon') roleid, -- the name of the subselect will be the name of the column
(select userid from users where username='pepe') userid; -- same here
Result:
roleid | userid
--------------------------------------+--------------------------------------
31aa33c4-4e66-4da3-8525-42689e46e635 | 12ad8c95-fbef-4287-9834-7458a4b250ee
For RDBMS that support common table expressions and window functions (e.g., SQL Server, Oracle, PostgreSQL), I would use:
WITH most_visited AS
(
SELECT ROW_NUMBER() OVER (ORDER BY num_visits) AS num, name, num_visits
FROM visits
),
most_spent AS
(
SELECT ROW_NUMBER() OVER (ORDER BY amt_spent) AS num, name, amt_spent
FROM spent
)
SELECT mv.name, ms.name
FROM most_visited mv INNER JOIN most_spent ms
ON mv.num = ms.num
ORDER BY mv.num
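Since the question mentions SQLite: it has supported window functions since version 3.25, so the same approach should work there too. A minimal sketch against the question's sample rows, run through Python's built-in sqlite3 module; the column names visits and spent are taken from the question's ORDER BY clauses, and the table layout is otherwise assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (name TEXT, visits INTEGER);
CREATE TABLE table2 (name TEXT, spent INTEGER);
INSERT INTO table1 VALUES ('AA', 5), ('BB', 9), ('CC', 12);
INSERT INTO table2 VALUES ('AA', 20), ('CC', 30), ('BB', 50);
""")

rows = conn.execute("""
WITH by_visits AS (
    SELECT name, ROW_NUMBER() OVER (ORDER BY visits) AS rn FROM table1
),
by_spent AS (
    SELECT name, ROW_NUMBER() OVER (ORDER BY spent) AS rn FROM table2
)
-- Pair the n-th name by visits with the n-th name by amount spent.
SELECT v.name AS name_by_visits, s.name AS name_by_spent
FROM by_visits v
JOIN by_spent s ON v.rn = s.rn
ORDER BY v.rn
""").fetchall()

for row in rows:
    print(row)
```

The inner join drops the extra rows if the two tables differ in length; a LEFT JOIN from the longer side would keep them.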
Just join table1 and table2 with name as the key, like below:
select a.name,
b.name,
a.NumOfVisitField,
b.TotalSpentField
from table1 a
left join table2 b on a.name = b.name