How do you properly query the result of a complex join statement in SQL? - sql

New to advanced SQL!
I'm trying to write a query that returns the COUNT(*) and SUM of the resulting columns from this query:
DECLARE #Id INT = 1000;
SELECT
*,
CASE
WHEN Id1 >= 6 THEN 1
ELSE 0
END AS Tier1,
CASE
WHEN Id1 >= 4 THEN 1
ELSE 0
END AS Tier2,
CASE
WHEN Id1 >= 2 THEN 1
ELSE 0
END AS Tier3
FROM (
SELECT
Org.OrgID,
App.AppID,
App.FirstName,
App.LastName,
MAX(AppSubmitU_Level.Id1) AS Id1
FROM Org
INNER JOIN AppEmployment
ON AppEmployment.OrgID = Org.OrgID
INNER JOIN App
ON App.AppID = AppEmployment.AppID
INNER JOIN AppSubmit
ON App.AppID = AppSubmit.AppID
INNER JOIN AppSubmitU_Level
ON AppSubmit.LevelID = AppSubmitU_Level.Id1
INNER JOIN AppEmpU_VerifyStatus
ON AppEmpU_VerifyStatus.VerifyStatusID = AppEmployment.VerifyStatusID
WHERE AppSubmitU_Level.SubmitTypeID = 1 -- Career
AND AppEmpU_VerifyStatus.StatusIsVerified = 1
AND AppSubmit.[ExpireDate] IS NOT NULL
AND AppSubmit.[ExpireDate] > GETDATE()
AND Org.OrgID = #Id
GROUP BY
Org.OrgID,
App.AppID,
App.FirstName,
App.LastName
) employees
I've tried to do so by moving the #Id outside the original query, and adding a SELECT(*), SUM, and SUM to the top, like so:
DECLARE #OrgID INT = 1000;
SELECT COUNT(*), SUM(employees.Tier1), SUM(employees.Tier2), SUM(employees.Tier3)
FROM
(SELECT *,
...
) AS employees
);
When I run the query, however, I'm getting the errors:
The multi-part identifier employees.Tier1 could not be bound
The same errors appear for the other identifiers in my SUM statements.
I'm assuming this has to do with the fact that the Tier1, Tier2, and Tier3 columns are being returned by the inner join query in my FROM(), and aren't values set by the existing tables that I'm querying. But I can't figure out how to rewrite it to initialize properly.
Thanks in advance for the help!

This is a scope problem: employees is defined in the subquery only, it is not available in the outer scope. You basically want to alias the outer query:
DECLARE #OrgID INT = 1000;
SELECT COUNT(*), SUM(employees.Tier1) TotalTier1, SUM(employees.Tier2) TotalTier2, SUM(employees.Tier3) TotalTier3
FROM (
SELECT *,
...
) AS employees
) AS employees;
--^ here
Note that I added column aliases to the outer query, which is a good practice in SQL.
It might be easier to understand what is going on if you use another alias for the outer query:
SELECT COUNT(*), SUM(e.Tier1), SUM(e.Tier2), SUM(e.Tier3)
FROM (
SELECT *,
...
) AS employees
) AS e;
Note that you don't actually need to qualify the column names in the outer query, since column names are unambigous anyway.
And finally: you don't actually need a subquery. You could write the query as:
SELECT
SUM(CASE WHEN Id1 >= 6 THEN 1 ELSE 0 END) AS TotalTier1,
SUM(CASE WHEN Id1 >= 4 THEN 1 ELSE 0 END) AS TotalTier2,
SUM(CASE WHEN Id1 >= 2 THEN 1 ELSE 0 END) AS TotalTier3
FROM (
SELECT
Org.OrgID,
App.AppID,
App.FirstName,
App.LastName,
MAX(AppSubmitU_Level.Id1) AS Id1
FROM Org
INNER JOIN AppEmployment
ON AppEmployment.OrgID = Org.OrgID
INNER JOIN App
ON App.AppID = AppEmployment.AppID
INNER JOIN AppSubmit
ON App.AppID = AppSubmit.AppID
INNER JOIN AppSubmitU_Level
ON AppSubmit.LevelID = AppSubmitU_Level.Id1
INNER JOIN AppEmpU_VerifyStatus
ON AppEmpU_VerifyStatus.VerifyStatusID = AppEmployment.VerifyStatusID
WHERE AppSubmitU_Level.SubmitTypeID = 1 -- Career
AND AppEmpU_VerifyStatus.StatusIsVerified = 1
AND AppSubmit.[ExpireDate] IS NOT NULL
AND AppSubmit.[ExpireDate] > GETDATE()
AND Org.OrgID = #Id
GROUP BY
Org.OrgID,
App.AppID,
App.FirstName,
App.LastName
) employees

Related

Rewrite a query with GROUP BY ALL

Microsoft has deprecated GROUP BY ALL and while the query might work now, I'd like to future-proof this query for future SQL upgrades.
Currently, my query is:
SELECT qt.QueueName AS [Queue] ,
COUNT ( qt.QueueName ) AS [#ofUnprocessedEnvelopes] ,
COUNT ( CASE WHEN dq.AssignedToUserID = 0 THEN 1
ELSE NULL
END
) AS [#ofUnassignedEnvelopes] ,
MIN ( dq.DocumentDate ) AS [OldestEnvelope]
FROM dbo.VehicleReg_Documents_QueueTypes AS [qt]
LEFT OUTER JOIN dbo.VehicleReg_Documents_Queue AS [dq] ON dq.QueueID = qt.QueueTypeID
WHERE dq.IsProcessed = 0
AND dq.PageNumber = 1
GROUP BY ALL qt.QueueName
ORDER BY qt.QueueName ASC;
And the resulting data set:
<table><tbody><tr><td>Queue</td><td>#ofUnprocessedEnvelopes</td><td>#ofUnassignedEnvelopes</td><td>OldestEnvelope</td></tr><tr><td>Cancellations</td><td>0</td><td>0</td><td>NULL</td></tr><tr><td>Dealer</td><td>26</td><td>17</td><td>2018-04-06</td></tr><tr><td>Matched to Registration</td><td>93</td><td>82</td><td>2018-04-04</td></tr><tr><td>New Registration</td><td>166</td><td>140</td><td>2018-03-21</td></tr><tr><td>Remaining Documents</td><td>2</td><td>2</td><td>2018-04-04</td></tr><tr><td>Renewals</td><td>217</td><td>0</td><td>2018-04-03</td></tr><tr><td>Transfers</td><td>296</td><td>245</td><td>2018-03-30</td></tr><tr><td>Writebacks</td><td>53</td><td>46</td><td>2018-04-09</td></tr></tbody></table>
I've tried various versions using CTE's and UNION's but I cannot get result set to generate correctly - the records that have no counts will not display or I will have duplicate records displayed.
Any suggestions on how to make this work without the GROUP BY ALL?
Below is one attempt where I tried a CTE with a UNION:
;WITH QueueTypes ( QueueTypeID, QueueName )
AS ( SELECT QueueTypeID ,
QueueName
FROM dbo.VehicleReg_Documents_QueueTypes )
SELECT qt.QueueName AS [Queue] ,
COUNT ( qt.QueueName ) AS [#ofUnprocessedEnvelopes] ,
COUNT ( CASE WHEN dq.AssignedToUserID = 0 THEN 1
ELSE NULL
END
) AS [#ofUnassignedEnvelopes] ,
CONVERT ( VARCHAR (8), MIN ( dq.DocumentDate ), 1 ) AS [OldestEnvelope]
FROM QueueTypes AS qt
LEFT OUTER JOIN dbo.VehicleReg_Documents_Queue AS dq ON dq.QueueID = qt.QueueTypeID
WHERE dq.IsProcessed = 0
AND dq.PageNumber = 1
GROUP BY qt.QueueName
UNION ALL
SELECT qt.QueueName AS [Queue] ,
COUNT ( qt.QueueName ) AS [#ofUnprocessedEnvelopes] ,
COUNT ( CASE WHEN dq.AssignedToUserID = 0 THEN 1
ELSE NULL
END
) AS [#ofUnassignedEnvelopes] ,
CONVERT ( VARCHAR (8), MIN ( dq.DocumentDate ), 1 ) AS [OldestEnvelope]
FROM QueueTypes AS qt
LEFT OUTER JOIN dbo.VehicleReg_Documents_Queue AS dq ON dq.QueueID = qt.QueueTypeID
GROUP BY qt.QueueName
But the results are not close to being correct:
Your current query doesn't work as it seems to work, because while you outer join table VehicleReg_Documents_Queue you dismiss all outer joined rows in the WHERE clause, so you are where you would have been with a mere inner join. You may want to consider either moving your criteria to the ON clause or make this an inner join right away.
It is also weird that you join queue type and queue not on the queue ID or the queue type ID, but on dq.QueueID = qt.QueueTypeID. That's like joining employees and addresses on employee number matching the house number. At least that's what it looks like.
(Then why does your queue type table have a queue name? Shouldn't the queue table contain the queue name instead? But this is not about your query, but about your data model.)
GROUP BY ALL means: "Please give us all QueueNames, even when the WHERE clause dismisses them. I see two possibilities for your query:
You do want an outer join actually. Then there is no WHERE clause and you can simply make this GROUP BY qt.QueueName.
You don't want an outer join. Then we want a row per QueueName in the table, which we might not get with simply changing GROUP BY ALL qt.QueueName to GROUP BY qt.QueueName.
In that second case we want all QueueNames first and outer join your query:
select
qn.QueueName AS [Queue],
q.[#ofUnassignedEnvelopes],
q.[OldestEnvelope]
FROM (select distinct QueueName from VehicleReg_Documents_QueueTypes) qn
LEFT JOIN
(
SELECT qt.QueueName,
COUNT ( qt.QueueName ) AS [#ofUnprocessedEnvelopes] ,
COUNT ( CASE WHEN dq.AssignedToUserID = 0 THEN 1
ELSE NULL
END
) AS [#ofUnassignedEnvelopes] ,
MIN ( dq.DocumentDate ) AS [OldestEnvelope]
FROM dbo.VehicleReg_Documents_QueueTypes AS [qt]
JOIN dbo.VehicleReg_Documents_Queue AS [dq] ON dq.QueueID = qt.QueueTypeID
WHERE dq.IsProcessed = 0
AND dq.PageNumber = 1
) q ON q.QueueName = qn.QueueName
GROUP BY ALL qn.QueueName
ORDER BY qn.QueueName ASC;
I think the best corollary here for a 'GROUP BY ALL' into something more ANSI compliant would be a CASE statement. Without knowing your data, it's hard to say for sure if this is 1:1, but I'm betting it's in the ballpark.
SELECT qt.QueueName AS [Queue]
,COUNT(CASE
WHEN dq.IsProcessed = 0
AND dq.PageNumber = 1
THEN qt.QueueName
END) AS [#ofUnprocessedEnvelopes]
,COUNT(CASE
WHEN dq.AssignedToUserID = 0
AND dq.IsProcessed = 0
AND dq.PageNumber = 1
THEN 1
ELSE NULL
END) AS [#ofUnassignedEnvelopes]
,MIN(CASE
WHEN dq.IsProcessed = 0
AND dq.PageNumber = 1
THEN dq.DocumentDate
END) AS [OldestEnvelope]
FROM dbo.VehicleReg_Documents_QueueTypes AS [qt]
LEFT OUTER JOIN dbo.VehicleReg_Documents_Queue AS [dq] ON dq.QueueID = qt.QueueTypeID
GROUP BY qt.QueueName
ORDER BY qt.QueueName ASC;
That's a bit uglier because every aggregate has to have the WHERE conditions inside a case statement, but at least you are future proof.

Counting records in a SQL subquery

I'm having difficult with a subquery. In plain English I'm trying to pick a random userID from the QCUsers table that has less than 20 records from the QCTier1_Assignments table. The problem is that my query below is only picking users where it meets the criteria of the inner query when I need it to pick any user from QCUsers table even if the user does not have any records at all in the QCTier1_Assignments table. I need something like this
AND (Sub.QCCount < 20 OR Sub.QCCount = 0 )
DECLARE #ReviewPeriodMonth varchar(10) = '10'
DECLARE #ReviewPeriodYear varchar(10) = '2015'
SELECT TOP 1
E1.UserID
,Sub.QCCount --Drawn from the subquery
FROM QCUsers E1
JOIN (SELECT
QCA.UserID,
COUNT(*) AS QCCount
FROM QCTier1_Assignments QCA
WHERE QCA.ReviewPeriodMonth = #ReviewPeriodMonth
AND QCA.ReviewPeriodYear = #ReviewPeriodYear
GROUP BY QCA.UserID
) Sub
ON E1.UserID = Sub.UserID
WHERE Active = 1
AND Grade = 12
AND Sub.QCCount < 20
ORDER BY NEWID()
I also tried it this way with no luck
DECLARE #ReviewPeriodMonth varchar(10) = '10'
DECLARE #ReviewPeriodYear varchar(10) = '2015'
SELECT TOP 1
E1.UserID
,Sub.QCCount --Drawn from the subquery
FROM QCUsers E1
RIGHT JOIN (SELECT
QCA.UserID,
ReviewPeriodMonth,
ReviewPeriodYear,
COUNT(*) AS QCCount
FROM QCTier1_Assignments QCA
GROUP BY
QCA.UserID,
ReviewPeriodMonth,
ReviewPeriodYear
) Sub
ON E1.UserID = Sub.UserID
WHERE Active = 1
AND Grade = 12
AND Sub.QCCount < 20
AND Sub.ReviewPeriodMonth = #ReviewPeriodMonth
AND Sub.ReviewPeriodYear = #ReviewPeriodYear
ORDER BY NEWID()
Try using your second query but change the WHERE clause to use COALESCE(Sub.QCCount, 0) instead of justSub.QCCount`
If the subquery returns no rows then with your RIGHT JOIN you'll at least still get the row, but the QCCount will be NULL which when compared to anything will result in a "false" effectively.
Also, you should look into the HAVING clause. It might allow you to do this without a subquery at all.
Here's an example with the HAVING clause. If it doesn't give the correct results please let me know as I'm not able to test this.
DECLARE
#ReviewPeriodMonth VARCHAR(10) = '10'
#ReviewPeriodYear VARCHAR(10) = '2015'
SELECT TOP 1
E1.UserID,
COUNT(QCA.UserID) AS QCCount
FROM
QCUsers E1
LEFT OUTER JOIN QCTier1_Assignments QCA ON
QCA.UserID = E1.UserID AND
QCA.ReviewPeriodMonth = #ReviewPeriodMonth AND
QCA.ReviewPeriodYear = #ReviewPeriodYear
WHERE
E1.Active = 1 AND
Grade = 12 AND
HAVING
COUNT(*) < 20
ORDER BY
NEWID()
You should use LEFT JOIN instead of JOIN(INNER JOIN), And you'd better to put the predicate to the outer query based on your practice, but I recommend the following way:
SELECT TOP1 ABC.UserID,ABC.QCCount
FROM
(
SELECT E1.UserID, COUNT(*) as QCCount
FROM QCUsers as E1
LEFT JOIN QCTier1_Assignments as QCA
ON QCA.UserID = E1.UserID
WHERE QCA.ReviewPeriodMonth = #ReviewPeriodMonth
AND QCA.ReviewPeriodYear = #ReviewPeriodYear
AND Active = 1
AND Grade = 12
GROUP BY E1.UserID
) as ABC
WHERE ABC.QCCount <20
ORDER BY NEWID()
I was able to work it out through a combination of responses here
DECLARE #ReviewPeriodMonth varchar(10) = '10'
DECLARE #ReviewPeriodYear varchar(10) = '2015'
SELECT TOP 1
QCUsers.UserID,
COUNT(QCTier1_Assignments.ReviewID) AS ReviewCount
FROM
QCTier1_Assignments RIGHT OUTER JOIN
QCUsers ON QCTier1_Assignments.UserID = QCUsers.UserID
WHERE
QCUsers.Active = 1
AND QCUsers.Grade = '12'
AND (ReviewPeriodMonth = #ReviewPeriodMonth OR ReviewPeriodMonth IS NULL)
AND (ReviewPeriodYear = #ReviewPeriodYear OR ReviewPeriodYear IS NULL)
GROUP BY
QCUsers.UserID
HAVING
(COALESCE(COUNT(QCTier1_Assignments.ReviewID),0) < 4)
ORDER BY NEWID()

Incorrect count value in subquery on table join column

The query I'm executing seems to be ignoring the where clause in the subquery
(select count(amazon) from orders where b.amazon = 2 and manifest = a.dbid)
column amazon is type INT
SQL SERVER 2014
If I run the query on its own and enter the value for manifest I get the correct result which I am expecting and is 1
select count(amazon) from orders where amazon = 2 and manifest = '211104'
Result Returns 1
When I run the query below I get a result of 5 which is the count of all orders where manifest = 211104 but the value of amazon is 1 in 4 results and 2 in 1 result.
Select distinct
top 30 DBID, today ,sum([amazon-orders])
From
(
SELECT [dbid], [today],
(select count(amazon) from orders
where b.amazon = 2 and manifest = a.dbid) as [amazon-orders]
FROM [manifest] a
join orders b on a.[dbid] = b.[manifest]
) t1
Group By
DBID, today
order by dbid desc
Can someone please help me.
Thanks
You have an extra join so you are counting multiple times... do this:
Select distinct
top 30 DBID, today ,sum([amazon-orders])
From
(
SELECT [dbid], [today],
(select count(amazon) from orders b
where b.amazon = 2 and manifest = a.dbid) as [amazon-orders]
FROM [manifest] a
) t1
Group By
DBID, today
order by dbid desc
or like this
SELECT [dbid], [today], count(o.amazon)
FROM [manifest] a
join orders o on a.dbid = o.manifest and o.amazon = 2
group by dbid, today
or this if you have columns you don't want to join (there is more going on than just this one join in your query and you need to use a left join):
SELECT [dbid], [today], sum(case when o.amazon is not null then 1 else 0 end)
FROM [manifest] a
left join orders o on a.dbid = o.manifest and o.amazon = 2
group by dbid, today
How's this?
SELECT top 30 [dbid], [today], sum(case when b.amazon = 2 then 1 else 0 end) as [amazon-orders]
FROM [manifest] a
join orders b on a.[dbid] = b.[manifest]
group by DBID, today
order by dbid desc
Pretty sure that because your query is in effect joining orders twice it's increasing the count.
Use this :
select a.DBID, a.today, count(b.amazon) from [manifest] a
join orders b on a.[dbid] = b.[manifest] and b.amazon = 2
Group By a.DBID, a.today

Joining 2 sql selects

I have two very similar sql statements
select instrumentuniqueid, count(levelid) as errors
from dbo.testevent
join dbo.test
on dbo.test.id = dbo.testevent.testid where dbo.test.runid = 20962 and dbo.testevent.levelid = 1
group by instrumentuniqueid
select instrumentuniqueid, count(levelid) as warnings
from dbo.testevent
join dbo.test
on dbo.test.id = dbo.testevent.testid where runid = 20962 and levelid = 2
group by instrumentuniqueid
The first one produces columns of instrumentuniqueid (aggregated) and the count
The second one produces columns of aggregated instrumentuniqueid with a different count.
How can I join them together so that the final table looks like:
Instrumentuniqueid | Errors | Warnings
Use conditional aggregation:
select instrumentuniqueid,
sum(case when te.levelid = 1 then 1 else 0 end) as errors,
sum(case when te.levelid = 2 then 1 else 0 end) as warnings
from dbo.testevent te join
dbo.test t
on t.id = t2.testid
where t.runid = 20962
group by instrumentuniqueid;
Table aliases also make a query easier to write and to read.

CTE with Group By returning the same results as basic query

I'm trying to select a count of product reviews by rating, where the rating can be 0 - 5.
The following basic select works but won't give a count of ratings that don't exist in the underlying table.
SELECT Rating, COUNT(*) AS 'Reviews' FROM ProductReviews
WHERE ProductID = 'product1'
GROUP BY Rating
I've tried using a CTE to generate the missing results, joined to Reviews table using an outer join, as soon as I try to include the "group by" expression the results fall back to match the results from the basic query.
(I've checked that the CTE does indeed generate the full range of required values).
BEGIN
DECLARE #START AS INT = 0;
DECLARE #END AS INT = 5;
WITH CTE_Ratings AS
(
SELECT #START as cte_rating
UNION ALL
SELECT 1 + cte_rating
FROM CTE_Ratings
WHERE cte_rating < #END
)
SELECT
cte_rating AS 'ReviewRating'
, ISNULL(COUNT(*), 0) AS 'ReviewCount'
FROM CTE_Ratings
LEFT OUTER JOIN Reviews ON Reviews.Rating = cte_rating
WHERE ProductReviews.ProductID = 'product1'
AND cte_rating BETWEEN #START AND #END
GROUP BY cte_rating
END
(I also tried building a temporary table containing the required values, joined to the Reviews table, with identical results).
In the case of both of the above queries the results are:
Rating Reviews
0 1
3 3
4 9
5 47
Whereas what I'm trying to get to for the same data is:
Rating Reviews
0 1
1 0
2 0
3 3
4 9
5 47
Can anyone suggest when the addition of the Group By aggregate function is causing the query to fail, or how it might be improved ?
The WHERE changes the OUTER JOIN to an INNER JOIN because you are filtering on optional rows from the outer table.So, move the outer table filter into the JOIN
Also, you are already filtering between start and end in the CTE
WITH ...
-- CTE here
SELECT
C.cte_rating AS 'ReviewRating'
, ISNULL(COUNT(PR.Rating), 0) AS 'ReviewCount'
FROM
CTE_Ratings C
LEFT OUTER JOIN
ProductReviews PR ON C.cte_rating= PR.Rating AND PR.ProductID = 'product1'
GROUP BY
C.cte_rating
More clearly, you are actually doing this
WITH ...
-- CTE here
SELECT
C.cte_rating AS 'ReviewRating'
, ISNULL(COUNT(PR.Rating), 0) AS 'ReviewCount'
FROM
CTE_Ratings C
LEFT OUTER JOIN
(
SELECT PR.Rating
FROM ProductReviews
WHERE ProductID = 'product1'
) PR ON C.cte_rating= PR.Rating
GROUP BY
C.cte_rating