SQL ANY as a function instead of an operator

I need to count users that match certain conditions. To do that I need to join some tables and check whether any of the grouped combinations matches the condition.
The way I implemented that now is by having a nested select that counts original matches and then counting the rows that have at least one result.
SELECT
COUNT(case when NestedCount1 > 0 then 1 else null end) as Count1,
COUNT(case when NestedCount2 > 0 then 1 else null end) as Count2,
COUNT(case when NestedCount3 > 0 then 1 else null end) as Count3
FROM
(SELECT
COUNT(case when Type = 1 then 1 else null end) as NestedCount1,
COUNT(case when Type = 2 then 1 else null end) as NestedCount2,
COUNT(case when Type = 2 AND Condition = 1 then 1 else null end) as NestedCount3
FROM [User]
LEFT JOIN [UserGroup] ON [User].Id = [UserGroup].UserId
LEFT JOIN [Group] ON [UserGroup].GroupId = [Group].Id
GROUP BY [User].Id) nested
What irks me is that the counts from the nested select are only used to check existence. However, since ANY in SQL is only an operator, I cannot think of a cleaner way to rewrite this.
The query returns correct results as is.
Is there a way to rewrite this that avoids intermediate results used only to check an existence condition?
Sample input: User.csv, Group.csv, UserGroup.csv
Expected results: 483, 272, 121

It might be possible to simplify that query.
I think the grouping on the user id can be avoided by using distinct conditional counts on the user id. Then there's no need for a sub-query.
SELECT
COUNT(DISTINCT case when [User].[Type] = 1 then [User].Id end) as Count1,
COUNT(DISTINCT case when [User].[Type] = 2 then [User].Id end) as Count2,
COUNT(DISTINCT case when [User].[Type] = 2 AND Condition = 1 then [User].Id end) as Count3
FROM [User]
LEFT JOIN [UserGroup] ON [UserGroup].UserId = [User].Id
LEFT JOIN [Group] ON [Group].Id = [UserGroup].GroupId;
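As a rough check of the idea above, here is a minimal sketch using Python's built-in sqlite3 module. The tables (users, grp, user_groups), the columns type and cond, and the sample rows are all made up for illustration; in particular, placing type and cond on the group table is an assumption, not something the question confirms. The sketch runs the nested form and the COUNT(DISTINCT ...) form and compares the results.

```python
import sqlite3

# Hypothetical miniature schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY);
CREATE TABLE grp (id INTEGER PRIMARY KEY, type INTEGER, cond INTEGER);
CREATE TABLE user_groups (user_id INTEGER, group_id INTEGER);
INSERT INTO users VALUES (1), (2), (3), (4);
INSERT INTO grp VALUES (10, 1, 0), (11, 1, 0), (20, 2, 1), (30, 2, 0);
INSERT INTO user_groups VALUES (1, 10), (1, 11), (1, 30),
                               (2, 20), (2, 30),
                               (3, 10);
""")

joins = """
FROM users u
LEFT JOIN user_groups ug ON ug.user_id = u.id
LEFT JOIN grp g ON g.id = ug.group_id
"""

# Nested form: count matches per user, then count users with at least one match.
nested = conn.execute(f"""
SELECT COUNT(CASE WHEN n1 > 0 THEN 1 END),
       COUNT(CASE WHEN n2 > 0 THEN 1 END),
       COUNT(CASE WHEN n3 > 0 THEN 1 END)
FROM (SELECT COUNT(CASE WHEN g.type = 1 THEN 1 END) AS n1,
             COUNT(CASE WHEN g.type = 2 THEN 1 END) AS n2,
             COUNT(CASE WHEN g.type = 2 AND g.cond = 1 THEN 1 END) AS n3
      {joins}
      GROUP BY u.id) nested
""").fetchone()

# Flat form: count each matching user id once, no sub-query needed.
flat = conn.execute(f"""
SELECT COUNT(DISTINCT CASE WHEN g.type = 1 THEN u.id END),
       COUNT(DISTINCT CASE WHEN g.type = 2 THEN u.id END),
       COUNT(DISTINCT CASE WHEN g.type = 2 AND g.cond = 1 THEN u.id END)
{joins}
""").fetchone()

print(nested, flat)
```

On this sample data both forms return the same three counts; user 1 belongs to two type-1 groups but is counted once in either form.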

SELECT
SUM(case when NestedCount1 > 0 then 1 else 0 end) as Count1,
SUM(case when NestedCount2 > 0 then 1 else 0 end) as Count2,
SUM(case when NestedCount3 > 0 then 1 else 0 end) as Count3
FROM
(
SELECT
[User].Id,
-- SUM rather than COUNT: COUNT would also count the rows where the CASE yields 0
SUM(case when Type = 1 then 1 else 0 end) as NestedCount1,
SUM(case when Type = 2 then 1 else 0 end) as NestedCount2,
SUM(case when Type = 2 AND Condition = 1 then 1 else 0 end) as NestedCount3
FROM [User]
LEFT JOIN [UserGroup] ON [UserGroup].UserId = [User].Id
LEFT JOIN [Group] ON [Group].Id = [UserGroup].GroupId
GROUP BY [User].Id
) nested

Related

SQL sum of two conditional aggregation

So yesterday I learned about conditional aggregation. I'm fairly new to SQL.
Here is my query:
select
Year_CW,
sum(case when col = 0 then 1 else 0 end) as "Total_sampled(Checked)",
sum(case when col = 1 then 1 else 0 end) as "Total_unsampled(Not_Checked)",
sum(case when col = 0 AND col2 = 'accepted' then 1 else 0 end) as "Accepted",
sum(case when col = 0 AND col2 = 'accepted with comments' then 1 else 0 end) as "Accepted with comments",
sum(case when col = 0 AND col2 = 'request for rework' then 1 else 0 end) as "Request for rework",
sum(case when col = 0 AND col2 = 'rejected' then 1 else 0 end) as "Rejected",
sum(case when col = 0 Or col = 1 then 1 else 0 end) as "Total_DS"
from
(select
Year_CW, SAMPLED as col, APPROVAL as col2
from
View_TEST tv) tv
group by
Year_CW
order by
Year_CW desc
I'm basically just calculating some KPIs grouped by week.
Look at the row for "Total_DS". It is essentially the sum of the first two sums, "Total_sampled(Checked)" and "Total_unsampled(Not_Checked)".
Is there a way that I can add the two columns from the first two sums to get the third one instead of trying to get the data all over again? I feel performance wise this would be terrible practice. It doesn't matter for this database but I don't want to learn bad code practice from the start.
Thanks for helping.
You probably won't see a significant performance hit from what you're doing now as you already have all the data available, you're just repeating the case evaluation.
But you can't refer to the column aliases for the first two columns within the same level of query.
If you can't do a simple count as @Zeki suggested, because you aren't sure whether there might be values other than zero and one (though this looks rather like a binary true/false equivalent, so there may well be a check constraint limiting you to those values), or if you're just interested in a more general case, you can use an inline view as @jarhl suggested:
select Year_CW,
"Total_sampled(Checked)",
"Total_unsampled(Not_Checked)",
"Accepted",
"Accepted with comments",
"Request for rework",
"Rejected",
"Total_sampled(Checked)" + "Total_unsampled(Not_Checked)" as "Total_DS"
from (
select Year_CW,
sum(case when col = 0 then 1 else 0 end) as "Total_sampled(Checked)",
sum(case when col = 1 then 1 else 0 end) as "Total_unsampled(Not_Checked)",
sum(case when col = 0 AND col2 = 'accepted' then 1 else 0 end) as "Accepted",
sum(case when col = 0 AND col2 = 'accepted with comments' then 1 else 0 end)
as "Accepted with comments",
sum(case when col = 0 AND col2 = 'request for rework' then 1 else 0 end)
as "Request for rework",
sum(case when col = 0 AND col2 = 'rejected' then 1 else 0 end) as "Rejected"
from (
select Year_CW, SAMPLED as col, APPROVAL as col2
from View_TEST tv
) tv
group by Year_CW
)
order by Year_CW desc;
The inner query gets the data and calculates the conditional aggregate values. The outer query just gets those values from the inner query, and also adds the Total_DS column to the result set by adding together the two values from the inner query.
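The inner/outer split can be tried on a toy table. The following is only a sketch using Python's built-in sqlite3 module; the table name samples, the shortened column aliases, and the rows are invented for illustration and stand in for View_TEST.

```python
import sqlite3

# Hypothetical stand-in for View_TEST with a handful of invented rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (year_cw TEXT, sampled INTEGER, approval TEXT)")
conn.executemany(
    "INSERT INTO samples VALUES (?, ?, ?)",
    [
        ("2019_01", 0, "accepted"),
        ("2019_01", 0, "rejected"),
        ("2019_01", 1, None),
        ("2019_02", 0, "accepted"),
        ("2019_02", 1, None),
        ("2019_02", 1, None),
    ],
)

# The inner query computes the conditional aggregates once;
# the outer query reuses their aliases to build total_ds.
weekly = conn.execute("""
SELECT year_cw,
       total_sampled,
       total_unsampled,
       total_sampled + total_unsampled AS total_ds
FROM (SELECT year_cw,
             SUM(CASE WHEN sampled = 0 THEN 1 ELSE 0 END) AS total_sampled,
             SUM(CASE WHEN sampled = 1 THEN 1 ELSE 0 END) AS total_unsampled
      FROM samples
      GROUP BY year_cw) t
ORDER BY year_cw DESC
""").fetchall()

for row in weekly:
    print(row)
```

Each CASE expression appears only once, and the total is derived from the already aggregated columns.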
You should generally avoid quoted identifiers, and if you really need them in your result set you should apply them at the last possible moment - so use unquoted identifiers in the inner query, and give them quoted aliases in the outer query. And personally, if the point of a query is to count things, I prefer to use a conditional count over a conditional sum. I'm also not sure why you already have a subquery against your view, which just changes the column names and makes the main query slightly more obscure. So I might do this as:
select year_cw,
total_sampled_checked as "Total_sampled(Checked)",
total_unsampled_not_checked as "Total_unsampled(Not_Checked)",
accepted as "Accepted",
accepted_with_comments as "Accepted with comments",
request_for_rework as "Request for rework",
rejected as "Rejected",
total_sampled_checked + total_unsampled_not_checked as "Total_DS"
from (
select year_cw,
count(case when sampled = 0 then 1 end) as total_sampled_checked,
count(case when sampled = 1 then 1 end) as total_unsampled_not_checked,
count(case when sampled = 0 and approval = 'accepted' then 1 end) as accepted,
count(case when sampled = 0 and approval = 'accepted with comments' then 1 end)
as accepted_with_comments,
count(case when sampled = 0 and approval = 'request for rework' then 1 end)
as request_for_rework,
count(case when sampled = 0 and approval = 'rejected' then 1 end) as rejected
from view_test
group by year_cw
)
order by year_cw desc;
Note that in the case expression, then 1 can be then <anything that isn't null>, so you could do then sampled or whatever. I've left out the implicit else null. As count() ignores nulls, all the case expression has to do is evaluate to any not-null value for the rows you want to include in the count.
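To make the point about nulls concrete, here is a small sketch using Python's built-in sqlite3 module. The table t and its three rows are invented for illustration. It compares a conditional count without an else branch, the same count with a different non-null value, a count with else 0, and a conditional sum.

```python
import sqlite3

# Hypothetical one-column table: two rows with sampled = 0, one with sampled = 1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (sampled INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(0,), (0,), (1,)])

counts = conn.execute("""
SELECT COUNT(CASE WHEN sampled = 0 THEN 1 END),          -- implicit ELSE NULL
       COUNT(CASE WHEN sampled = 0 THEN 'x' END),        -- any non-null value works
       COUNT(CASE WHEN sampled = 0 THEN 1 ELSE 0 END),   -- 0 is not null, so every row counts
       SUM(CASE WHEN sampled = 0 THEN 1 ELSE 0 END)      -- the SUM form needs the ELSE 0
FROM t
""").fetchone()

print(counts)
```

The third column is the one to watch: combining COUNT with else 0 counts every row, because 0 is a non-null value.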
You can try below
select Year_CW,
sum(case when col = 0 then 1 else 0 end) as "Total_sampled(Checked)",
sum(case when col = 1 then 1 else 0 end) as "Total_unsampled(Not_Checked)",
sum(case when col = 0 AND col2 = 'accepted' then 1 else 0 end) as "Accepted",
sum(case when col = 0 AND col2 = 'accepted with comments' then 1 else 0 end) as "Accepted with comments",
sum(case when col = 0 AND col2 = 'request for rework' then 1 else 0 end) as "Request for rework",
sum(case when col = 0 AND col2 = 'rejected' then 1 else 0 end) as "Rejected",
sum(case when col = 0 then 1 else 0 end) + sum(case when col = 1 then 1 else 0 end) as "Total_DS"
from (select Year_CW, SAMPLED as col, APPROVAL as col2
from View_TEST tv
) tv
group by Year_CW
order by Year_CW desc

Why does this query have two selects?

I have this query :
SELECT WorkId, RegisterDate, sum(RoomType1) As RoomType1, sum(RoomType2) As RoomType2, sum(RoomType3) As RoomType3, sum(RoomType4) As RoomType4, sum(RoomType5) As RoomType5, sum(RoomType6) As RoomType6, sum(RoomType7) As RoomType7, sum(RoomType8) As RoomType8
FROM (
SELECT dbo.[Work].WorkId, dbo.[Work].RegisterDate,
case dbo.Floor.RoomType when 1 then 1 else 0 end as RoomType1,
case dbo.Kat.RoomType when 2 then 1 else 0 end as RoomType2,
FROM dbo.Belediye INNER JOIN
dbo.[Is] ON dbo.Municipality.MunicipalityId= dbo.[Is].MunicipalityWorkId INNER JOIN
dbo.Look ON dbo.[Work].LookWorkId = dbo.Look.LookId ,
WHERE (dbo.Look.Location IS NOT NULL)
) E
GROUP BY WorkId, RegisterDate
This query works as expected, but I can't understand why it has two selects. Why does it need them? Please explain it to me. Thanks.
As you suspected, this query doesn't need two selects and can be rewritten without a sub-query:
SELECT i.IsId,
i.KayitTarihi,
SUM(case k.OdaTipi when 1 then 1 else 0 end) as RoomType1,
SUM(case k.OdaTipi when 2 then 1 else 0 end) as RoomType2,
SUM(case k.OdaTipi when 3 then 1 else 0 end) as RoomType3,
SUM(case k.OdaTipi when 4 then 1 else 0 end) as RoomType4,
SUM(case k.OdaTipi when 5 then 1 else 0 end) as RoomType5,
SUM(case k.OdaTipi when 6 then 1 else 0 end) as RoomType6,
SUM(case k.OdaTipi when 7 then 1 else 0 end) as RoomType7,
SUM(case k.OdaTipi when 8 then 1 else 0 end) as RoomType8
FROM dbo.Belediye b
INNER JOIN dbo.[Is] i
ON b.BelediyeId = i.BelediyeIsId
INNER JOIN dbo.YerGorme yg
ON i.YerGormeIsId = yg.YerGormeId
INNER JOIN dbo.Kat k
ON yg.YerGormeId = k.YerGorme_YerGormeId
WHERE yg.Lokasyon IS NOT NULL
GROUP BY i.IsId, i.KayitTarihi
Note: use table aliases
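A quick way to convince yourself that the two forms are equivalent is to run both on a toy table. This is only a sketch using Python's built-in sqlite3 module; the single table rooms and its rows are invented and replace the joined tables of the original query.

```python
import sqlite3

# Hypothetical flattened table standing in for the joined tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rooms (work_id INTEGER, room_type INTEGER)")
conn.executemany(
    "INSERT INTO rooms VALUES (?, ?)",
    [(1, 1), (1, 1), (1, 2), (2, 2)],
)

# Two selects: the inner one produces 0/1 flags, the outer one sums them.
two_selects = conn.execute("""
SELECT work_id, SUM(rt1) AS room_type1, SUM(rt2) AS room_type2
FROM (SELECT work_id,
             CASE room_type WHEN 1 THEN 1 ELSE 0 END AS rt1,
             CASE room_type WHEN 2 THEN 1 ELSE 0 END AS rt2
      FROM rooms) e
GROUP BY work_id
ORDER BY work_id
""").fetchall()

# One select: the CASE expression goes straight inside SUM.
one_select = conn.execute("""
SELECT work_id,
       SUM(CASE room_type WHEN 1 THEN 1 ELSE 0 END) AS room_type1,
       SUM(CASE room_type WHEN 2 THEN 1 ELSE 0 END) AS room_type2
FROM rooms
GROUP BY work_id
ORDER BY work_id
""").fetchall()

print(two_selects)
print(one_select)
```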

How to get count of a column value returned using subquery?

I want to get all hrc_acct_num which are not present in acct_key column of STAGING_CUST_ACCT table. The last outer select column is throwing an error. How can I get count of a column returned using a subquery?
SELECT source_sys_cd,
Count(CASE
WHEN is_delete = 0 THEN 1
END) [DEL IS 0],
Sum(CASE
WHEN trans_amt = 0 THEN 1
ELSE 0
END) [STG $0 TXN CNT],
Count(CASE
WHEN hrc_acct_num NOT IN(SELECT DISTINCT acct_key
FROM staging_cust_acct) THEN
hrc_acct_num
END)
FROM staging_transactions (nolock)
GROUP BY source_sys_cd
ORDER BY source_sys_cd
You can do a LEFT JOIN to the sub query and then do a SUM over the rows where acct_key is null.
SELECT source_sys_cd,
Count(CASE
WHEN is_delete = 0 THEN 1
END) [DEL IS 0],
Sum(CASE
WHEN trans_amt = 0 THEN 1
ELSE 0
END) [STG $0 TXN CNT],
SUM(CASE WHEN T.acct_key is NULL THEN 1 else 0 END ) CountNotIN
FROM staging_transactions (nolock) s
LEFT JOIN (SELECT DISTINCT acct_key
FROM staging_cust_acct) t
ON s.hrc_acct_num = t.acct_key
GROUP BY source_sys_cd
ORDER BY source_sys_cd
Here's a simplified demo
You can short circuit the subquery with NOT EXISTS. It's often more efficient than LEFT JOIN (SELECT DISTINCT ...), since you don't care about enumerating all the times it does exist.
SELECT source_sys_cd,
Count(CASE
WHEN is_delete = 0 THEN 1
END) [DEL IS 0],
Count(CASE
WHEN trans_amt = 0 THEN 1
END) [STG $0 TXN CNT],
Count(CASE
WHEN NOT EXISTS (SELECT 1
FROM staging_cust_acct
WHERE acct_key = hrc_acct_num) THEN 1
END)
FROM staging_transactions (nolock)
GROUP BY source_sys_cd
ORDER BY source_sys_cd
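The two approaches can be compared on a toy data set. The sketch below uses Python's built-in sqlite3 module; the rows are invented, and only the columns needed for the last count are included. One deliberate difference from the queries above: the NOT EXISTS test is evaluated per row in an inner query and summed outside, because some engines, SQL Server among them, reject a subquery placed inside an aggregate call.

```python
import sqlite3

# Hypothetical sample rows; accounts 3, 4 and 5 have no match in staging_cust_acct.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE staging_transactions (source_sys_cd TEXT, hrc_acct_num INTEGER);
CREATE TABLE staging_cust_acct (acct_key INTEGER);
INSERT INTO staging_transactions VALUES
    ('A', 1), ('A', 2), ('A', 3), ('A', 5), ('B', 1), ('B', 4);
INSERT INTO staging_cust_acct VALUES (1), (1), (2);  -- duplicate key on purpose
""")

# LEFT JOIN to the distinct keys, then count the rows that found no partner.
via_left_join = conn.execute("""
SELECT s.source_sys_cd,
       SUM(CASE WHEN t.acct_key IS NULL THEN 1 ELSE 0 END) AS count_not_in
FROM staging_transactions s
LEFT JOIN (SELECT DISTINCT acct_key FROM staging_cust_acct) t
       ON s.hrc_acct_num = t.acct_key
GROUP BY s.source_sys_cd
ORDER BY s.source_sys_cd
""").fetchall()

# NOT EXISTS flag computed per row, aggregated in the outer query.
via_not_exists = conn.execute("""
SELECT source_sys_cd, SUM(missing) AS count_not_in
FROM (SELECT s.source_sys_cd,
             CASE WHEN NOT EXISTS (SELECT 1
                                   FROM staging_cust_acct c
                                   WHERE c.acct_key = s.hrc_acct_num)
                  THEN 1 ELSE 0 END AS missing
      FROM staging_transactions s) flagged
GROUP BY source_sys_cd
ORDER BY source_sys_cd
""").fetchall()

print(via_left_join)
print(via_not_exists)
```

The DISTINCT in the LEFT JOIN version matters here: without it, the duplicated key 1 would multiply the matching transaction rows, which would distort any other aggregate computed in the same query.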

SQL query rewrite for prettification and or performance improvement

I have a query that essentially amounts to:
Select query 1
Union
Select query 2
where rowid not in query 1 rowids
Is there a prettier / more performant way to do this? I'm assuming the results of query 1 would be cached and thus utilized in the union... but it's also kinda oogly.
Update with the original query:
SELECT FruitType
, count(CASE WHEN Status = 0 THEN 1 ELSE 0 END) AS Fresh
, count(CASE WHEN Status = 1 THEN 1 ELSE 0 END) AS Ripe
, count(CASE WHEN Status = 2 THEN 1 ELSE 0 END) AS Moldy
FROM FruitTypes FT1
LEfT JOIN Fruits F on F.FTID = FT1.ID
where
Fruit.IsHighPriced = 0
GROUP BY FruitType
Union ALL
select FruitType, 0 as Fresh, 0 as Ripe, 0 as Moldy
FROM FruitTypes ft3
where
ft3.StoreID = @PassedInStoreID
and FruitType NOT IN
(
SELECT FruitType
, count(CASE WHEN Status = 0 THEN 1 ELSE 0 END) AS Fresh
, count(CASE WHEN Status = 1 THEN 1 ELSE 0 END) AS Ripe
, count(CASE WHEN Status = 2 THEN 1 ELSE 0 END) AS Moldy
FROM FruitTypes FT2
LEfT JOIN Fruits F on F.FTID = FT2.ID
where
Fruit.IsHighPriced = 0
GROUP BY FruitType
)
Thanks!
You don't need to repeat the CASE expressions in the NOT IN subquery; only the FruitType column matters there. And NOT EXISTS is often faster in SQL Server.
SELECT FruitType
-- SUM rather than COUNT: COUNT would also count the rows where the CASE yields 0
, sum(CASE WHEN Status = 0 THEN 1 ELSE 0 END) AS Fresh
, sum(CASE WHEN Status = 1 THEN 1 ELSE 0 END) AS Ripe
, sum(CASE WHEN Status = 2 THEN 1 ELSE 0 END) AS Moldy
FROM FruitTypes FT1
LEFT JOIN Fruits F on F.FTID = FT1.ID
where
F.IsHighPriced = 0 -- the Fruits table is aliased as F
GROUP BY FruitType
Union ALL
select FruitType, 0 as Fresh, 0 as Ripe, 0 as Moldy
FROM FruitTypes ft3
where
ft3.StoreID = @PassedInStoreID
and NOT EXISTS
(
SELECT *
FROM FruitTypes FT2
LEFT JOIN Fruits F on F.FTID = FT2.ID
where
F.IsHighPriced = 0
and ft3.FruitType = FT2.FruitType
)
The prettiest way of writing would probably be by turning query #1 into a view or a function, then using that view or function to call the repetitious code.
Performance could possibly be improved by using query #1 to fill a temp table or table variable, then using that temp table in place of the repetitious code.
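The view suggestion might look roughly like the following sketch, written with Python's built-in sqlite3 module. The snake_case table and column names, the view name fruit_counts, the store id 7, and the rows are all invented for illustration, and conditional sums are used for the three status columns.

```python
import sqlite3

# Hypothetical schema and data loosely modelled on the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fruit_types (id INTEGER PRIMARY KEY, fruit_type TEXT, store_id INTEGER);
CREATE TABLE fruits (ftid INTEGER, status INTEGER, is_high_priced INTEGER);
INSERT INTO fruit_types VALUES (1, 'apple', 7), (2, 'pear', 7), (3, 'kiwi', 7);
INSERT INTO fruits VALUES (1, 0, 0), (1, 1, 0), (1, 2, 0), (1, 0, 0),
                          (2, 0, 1);  -- the only pear is high priced

-- Query 1 wrapped in a view so that it is written only once.
CREATE VIEW fruit_counts AS
SELECT ft.fruit_type AS fruit_type,
       SUM(CASE WHEN f.status = 0 THEN 1 ELSE 0 END) AS fresh,
       SUM(CASE WHEN f.status = 1 THEN 1 ELSE 0 END) AS ripe,
       SUM(CASE WHEN f.status = 2 THEN 1 ELSE 0 END) AS moldy
FROM fruit_types ft
LEFT JOIN fruits f ON f.ftid = ft.id
WHERE f.is_high_priced = 0
GROUP BY ft.fruit_type;
""")

# The view supplies the counted rows and also drives the NOT EXISTS check
# that adds an all-zero row for every fruit type missing from it.
report = conn.execute("""
SELECT fruit_type, fresh, ripe, moldy FROM fruit_counts
UNION ALL
SELECT ft.fruit_type, 0, 0, 0
FROM fruit_types ft
WHERE ft.store_id = ?
  AND NOT EXISTS (SELECT 1 FROM fruit_counts fc
                  WHERE fc.fruit_type = ft.fruit_type)
ORDER BY 1
""", (7,)).fetchall()

for row in report:
    print(row)
```

Whether the engine evaluates the view once or twice is up to its optimizer, so this sketch addresses readability rather than performance.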

How to do addition and division of aliased columns in a query?

I am using SQL Server 2008.
I am trying to do some basic math in some basic queries. I need to add up wins, losses, total, and percentages. I usually ask for the raw numbers and then do the calculations once I return my query to page. I would like to give SQL Server the opportunity to work a little harder.
What I want to do is something like this:
SELECT SUM(case when vote = 1 then 1 else 0 end) as TotalWins,
SUM(case when vote = 0 then 1 else 0 end) as TotalLosses,
TotalWins + TotalLosses as TotalPlays,
TotalPlays / TotalWins as PctWins
Here's what I am doing now:
SELECT SUM(case when vote = 1 then 1 else 0 end) as TotalWins,
SUM(case when vote = 0 then 1 else 0 end) as TotalLosses,
SUM(case when vote = 1 then 1 else 0 end) + SUM(case when vote = 0 then 1 else 0 end) as Votes
What is the easiest, cleanest way to do simple math calculations like this in a query?
EDIT:
While I got some great answers, I didn't get what I was looking for.
The scores that I will be calculating are for a specific team, so, my results need to be like this:
TeamID Team Wins Losses Totals
1 A's 5 3 8
2 Bee's 7 9 16
3 Seas 1 3 4
SELECT T.TeamID,
T.Team,
V.TotalWins,
V.TotalLosses,
V.PctWins
FROM Teams T
JOIN
SELECT V.TeamID,
SUM(case when vote = 1 then 1 else 0 end) as V.TotWin,
SUM(case when vote = 0 then 1 else 0 end) as V.TotLoss
FROM Votes V
GROUP BY V.TeamID
I tried a bunch of things, but don't quite know what's wrong. I am sure the JOIN part is where the problem is, though. How do I bring these two result sets together?
One way is to wrap your query in an external one:
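SELECT TotalWins,
TotalLosses,
TotalWins + TotalLosses as TotalPlays,
-- TotalPlays is defined in this same SELECT list, so it cannot be referenced here
TotalWins * 100.0 / (TotalWins + TotalLosses) as PctWins
FROM
( SELECT SUM(case when vote = 1 then 1 else 0 end) as TotalWins,
SUM(case when vote = 0 then 1 else 0 end) as TotalLosses
FROM ...
) AS Calculation -- SQL Server requires an alias on a derived table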
Another way (suggested by @Mike Christensen) is to use Common Table Expressions (CTE):
; WITH Calculation AS
( SELECT SUM(case when vote = 1 then 1 else 0 end) as TotalWins,
SUM(case when vote = 0 then 1 else 0 end) as TotalLosses
FROM ...
)
SELECT TotalWins,
TotalLosses,
TotalWins + TotalLosses as TotalPlays,
TotalWins * 100.0 / (TotalWins + TotalLosses) as PctWins -- TotalPlays cannot be referenced in the SELECT list that defines it
FROM
Calculation
Sidenote: I have no idea whether this would make any performance difference in SQL Server, but you can also write these sums:
SUM(case when vote = 1 then 1 else 0 end)
as counts:
COUNT(case when vote = 1 then 1 end) --- the ELSE NULL is implied
try
select a, b, a+b as total
from (
select
case ... end as a,
case ... end as b
from realtable
) t
To answer your second question, this is the code you put forward with corrections to the syntax:
SELECT
T.TeamID,
T.Team,
V.TotalWins,
V.TotalLosses,
PctWins = V.TotalWins * 100 / CAST(V.TotalWins + V.TotalLosses AS float)
FROM Teams T
JOIN (
SELECT
TeamID,
SUM(case when vote = 1 then 1 else 0 end) as TotalWins,
SUM(case when vote = 0 then 1 else 0 end) as TotalLosses
FROM Votes
GROUP BY TeamID
) as V on T.TeamID = V.TeamID
Note the brackets around the inner select.
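To see the corrected join at work, here is a small sketch using Python's built-in sqlite3 module. The win and loss numbers are taken from the sample table in the question; the snake_case table and column names and the one-row-per-vote layout are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE teams (team_id INTEGER PRIMARY KEY, team TEXT)")
conn.execute("CREATE TABLE votes (team_id INTEGER, vote INTEGER)")
conn.executemany(
    "INSERT INTO teams VALUES (?, ?)",
    [(1, "A's"), (2, "Bee's"), (3, "Seas")],
)

# One row per vote: vote = 1 is a win, vote = 0 is a loss (assumed layout).
vote_rows = []
for team_id, wins, losses in [(1, 5, 3), (2, 7, 9), (3, 1, 3)]:
    vote_rows += [(team_id, 1)] * wins + [(team_id, 0)] * losses
conn.executemany("INSERT INTO votes VALUES (?, ?)", vote_rows)

# The aggregated derived table v is joined back to teams on team_id.
standings = conn.execute("""
SELECT t.team_id,
       t.team,
       v.total_wins,
       v.total_losses,
       v.total_wins + v.total_losses AS totals,
       v.total_wins * 100.0 / (v.total_wins + v.total_losses) AS pct_wins
FROM teams t
JOIN (SELECT team_id,
             SUM(CASE WHEN vote = 1 THEN 1 ELSE 0 END) AS total_wins,
             SUM(CASE WHEN vote = 0 THEN 1 ELSE 0 END) AS total_losses
      FROM votes
      GROUP BY team_id) v ON t.team_id = v.team_id
ORDER BY t.team_id
""").fetchall()

for row in standings:
    print(row)
```

Multiplying by 100.0 forces floating-point division here, which plays the same role as the CAST to float in the query above.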
It might help you if you're doing this sort of thing more than once to create a view...
CREATE VIEW [Totals] AS
SELECT
SUM(case when T.vote = 1 then 1 else 0 end) as TotalWins,
SUM(case when T.vote = 0 then 1 else 0 end) as TotalLosses,
T.SomeGroupColumn
FROM SomeTable T
GROUP BY T.SomeGroupColumn