SQL query rewrite for prettification and or performance improvement

SQL query rewrite for prettification and or performance improvement - sql

I have a query that essentially amounts to:
Select query 1
Union
Select query 2
where rowid not in query 1 rowids
Is there a prettier / more performant way to do this? I'm assuming the results of query 1 would be cached and thus utilized in the union... but it's also kinda oogly.
Update with the original query:
SELECT FruitType
, count(CASE WHEN Status = 0 THEN 1 ELSE 0 END) AS Fresh
, count(CASE WHEN Status = 1 THEN 1 ELSE 0 END) AS Ripe
, count(CASE WHEN Status = 2 THEN 1 ELSE 0 END) AS Moldy
FROM FruitTypes FT1
LEfT JOIN Fruits F on F.FTID = FT1.ID
where
Fruit.IsHighPriced = 0
GROUP BY FruitType
Union ALL
select FruitType, 0 as Fresh, 0 as Ripe, 0 as Moldy
FROM FruitTypes ft3
where
ft3.StoreID = #PassedInStoreID
and FruitType NOT IN
(
SELECT FruitType
, count(CASE WHEN Status = 0 THEN 1 ELSE 0 END) AS Fresh
, count(CASE WHEN Status = 1 THEN 1 ELSE 0 END) AS Ripe
, count(CASE WHEN Status = 2 THEN 1 ELSE 0 END) AS Moldy
FROM FruitTypes FT2
LEfT JOIN Fruits F on F.FTID = FT2.ID
where
Fruit.IsHighPriced = 0
GROUP BY FruitType
)
Thanks!

You don't need the second case statement in the NOT in clause. And not Exists is often faster in SQL Server.
SELECT FruitType
, count(CASE WHEN Status = 0 THEN 1 ELSE 0 END) AS Fresh
, count(CASE WHEN Status = 1 THEN 1 ELSE 0 END) AS Ripe
, count(CASE WHEN Status = 2 THEN 1 ELSE 0 END) AS Moldy
FROM FruitTypes FT1
LEfT JOIN Fruits F on F.FTID = FT1.ID
where
Fruit.IsHighPriced = 0
GROUP BY FruitType
Union ALL
select FruitType, 0 as Fresh, 0 as Ripe, 0 as Moldy
FROM FruitTypes ft3
where
ft3.StoreID = #PassedInStoreID
and NOT EXISTS
(
SELECT *
FROM FruitTypes FT2
LEfT JOIN Fruits F on F.FTID = FT2.ID
where
Fruit.IsHighPriced = 0
and ft3.FruitType = FT2.FruitType
)

The prettiest way of writing would probably be by turning query #1 into a view or a function, then using that view or function to call the repetitious code.
Performance could possibly be improved by using query #1 to fill a temp table or table variable, then using that temp table in place of the repititious code.

Related

Adding a dummy identifier to data that varies by position and value

I am working on a project in SQL Server with diagnosis codes and a patient can have up to 4 codes but not necessarily more than 1 and a patient cannot repeat a code more than once. However, codes can occur in any order. My goal is to be able to count how many times a Diagnosis code appears in total, as well as how often it appears in a set position.
My data currently resembles the following:
PtKey
Order #
Order Date
Diagnosis1
Diagnosis2
Diagnosis3
Diagnosis 4
345
1527
7/12/20
J44.9
R26.2
NULL
NULL
367
1679
7/12/20
R26.2
H27.2
G47.34
NULL
325
1700
7/12/20
G47.34
NULL
NULL
NULL
327
1710
7/12/20
I26.2
J44.9
G47.34
NULL
I would think the best approach would be to create a dummy column here that would match up the diagnosis by position. For example, Diagnosis 1 with A, and Diagnosis 2 with B, etc.
My current plan is to rollup the diagnosis using an unpivot:
UNPIVOT ( Diag for ColumnALL IN (Diagnosis1, Diagnosis2, Diagnosis3, Diagnosis4)) as unpvt
However, this still doesn’t provide a way to count the diagnoses by position on a sales order.
I want it to look like this:
Diagnosis
Total Count
Diag1 Count
Diag2 Count
Diag3 Count
Diag4 Count
J44.9
2
1
1
0
0
R26.2
1
1
0
0
0
H27.2
1
0
1
0
0
I26.2
1
1
0
0
0
G47.34
3
1
0
2
0

You can unpivot using apply and aggregate:
select v.diagnosis, count(*) as cnt,
sum(case when pos = 1 then 1 else 0 end) as pos_1,
sum(case when pos = 2 then 1 else 0 end) as pos_2,
sum(case when pos = 3 then 1 else 0 end) as pos_3,
sum(case when pos = 4 then 1 else 0 end) as pos_4
from data d cross apply
(values (diagnosis1, 1),
(diagnosis2, 2),
(diagnosis3, 3),
(diagnosis4, 4)
) v(diagnosis, pos)
where diagnosis is not null;

Another way is to use UNPIVOT to transform the columns into groupable entities:
SELECT Diagnosis, [Total Count] = COUNT(*),
[Diag1 Count] = SUM(CASE WHEN DiagGroup = N'Diagnosis1' THEN 1 ELSE 0 END),
[Diag2 Count] = SUM(CASE WHEN DiagGroup = N'Diagnosis2' THEN 1 ELSE 0 END),
[Diag3 Count] = SUM(CASE WHEN DiagGroup = N'Diagnosis3' THEN 1 ELSE 0 END),
[Diag4 Count] = SUM(CASE WHEN DiagGroup = N'Diagnosis4' THEN 1 ELSE 0 END)
FROM
(
SELECT * FROM #x UNPIVOT (Diagnosis FOR DiagGroup IN
([Diagnosis1],[Diagnosis2],[Diagnosis3],[Diagnosis4])) up
) AS x GROUP BY Diagnosis;
Example db<>fiddle

You can also manually unpivot via UNION before doing the conditional aggregation:
SELECT Diagnosis, COUNT(*) As Total Count
, SUM(CASE WHEN Position = 1 THEN 1 ELSE 0 END) As [Diag1 Count]
, SUM(CASE WHEN Position = 2 THEN 1 ELSE 0 END) As [Diag2 Count]
, SUM(CASE WHEN Position = 3 THEN 1 ELSE 0 END) As [Diag3 Count]
, SUM(CASE WHEN Position = 4 THEN 1 ELSE 0 END) As [Diag4 Count]
FROM
(
SELECT PtKey, Diagnosis1 As Diagnosis, 1 As Position
FROM [MyTable]
UNION ALL
SELECT PtKey, Diagnosis2 As Diagnosis, 2 As Position
FROM [MyTable]
WHERE Diagnosis2 IS NOT NULL
UNION ALL
SELECT PtKey, Diagnosis3 As Diagnosis, 3 As Position
FROM [MyTable]
WHERE Diagnosis3 IS NOT NULL
UNION ALL
SELECT PtKey, Diagnosis4 As Diagnosis, 4 As Position
FROM [MyTable]
WHERE Diagnosis4 IS NOT NULL
) d
GROUP BY Diagnosis
Borrowing Aaron's fiddle, to avoid needing to rebuild the schema from scratch, and we get this:
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=d1f7f525e175f0f066dd1749c49cc46d

SQL Group By with multiple counts

I'm trying to group a list of services together along with the number of applicants in each service, but I also need a count on the status each applicant is in.
Applicants table
serviceID clientID applicantID status
----------------------------------------------------
1 41 1 1 (Processing)
1 41 16 1 (Processing)
1 41 15 2 (Ready)
2 41 12 1 (Processing)
2 41 18 3 (Complete)
Service table:
serviceID serviceName
--------------------------
1 Full Service
2 Part Service
Results need to look like:
serviceName totalApplicants processingCount readyCount completeCount
---------------------------------------------------------------------------
Full Service 3 2 1 0
Part Service 2 1 0 1
I've got the following, but it's returning the same count in each of the columns:-
SELECT
Services.serviceName,
(COUNT(Applicants.applicantID)) AS totalApplicants,
ISNULL(SUM(CASE WHEN Applicants.status = 1 THEN 1 ELSE 0 END), 0) AS processingCount,
ISNULL(SUM(CASE WHEN Applicants.status = 2 THEN 1 ELSE 0 END), 0) AS readyCount,
ISNULL(SUM(CASE WHEN Applicants.status = 3 THEN 1 ELSE 0 END), 0) AS completeCount
FROM
Applicants
LEFT JOIN
Services ON Applicants.serviceID = Services.serviceID
WHERE
Applicants.clientID = #CompanyID
GROUP BY
Services.serviceName

You can do conditional aggregation:
select
s.serviceName,
count(s.serviceID) totalApplicants,
sum(case when status = 1 then 1 else 0 end) processingCount,
sum(case when status = 2 then 1 else 0 end) readyCount,
sum(case when status = 3 then 1 else 0 end) completeCount
from service s
left join applicants a on a.serviceID = s.serviceID AND a.clientID = #CompanyID
group by s.serviceID, s.serviceName
The conditional expression use standard case expression; depending on the database that you are actually using, neater alternatives may exists.

Your query should be fine, but it can be simplified to:
SELECT s.serviceName,
COUNT(a.AppicantId) AS totalApplicants,
SUM(CASE WHEN a.status = 1 THEN 1 ELSE 0 END) AS processingCount,
SUM(CASE WHEN a.status = 2 THEN 1 ELSE 0 END) AS readyCount,
SUM(CASE WHEN a.status = 3 THEN 1 ELSE 0 END) AS completeCount
FROM Services s LEFT JOIN
Applicants a
ON a.serviceID = s.serviceID AND
a.clientID = #CompanyID
GROUP BY s.serviceName ;
Notes:
It looks like you want a row for every service, so that should be the first table in the LEFT JOIN.
Hence the filtering on the company goes into the ON clause.
Table aliases make the query easier to write and to read.
No ISNULL() is needed (and I prefer COALESCE() over ISNULL()).

SQL ANY as a function instead of an operator

I need to count users that match certain conditions. To do that I need to join some tables and check if any of the grouping combination match the condition.
The way I implemented that now is by having a nested select that counts original matches and then counting the rows that have at least one result.
SELECT
COUNT(case when NestedCount1 > 0 then 1 else null end) as Count1,
COUNT(case when NestedCount2 > 0 then 1 else null end) as Count2,
COUNT(case when NestedCount3 > 0 then 1 else null end) as Count3
FROM
(SELECT
COUNT(case when Type = 1 then 1 else null end) as NestedCount1,
COUNT(case when Type = 2 then 1 else null end) as NestedCount2,
COUNT(case when Type = 2 AND Condition = 1 then 1 else null end) as NestedCount3
FROM [User]
LEFT JOIN [UserGroup] ON [User].Id = [UserGroup].UserId
LEFT JOIN [Group] ON [UserGroup].GroupId = [Group].Id
GROUP BY [User].Id) nested
What irks me is that the counts from the nested select are only used to check existence. However since ANY in SQL is only an operator I cannot think of a cleaner way on how to rewrite this.
The query returns correct results as is.
I'm wondering if there is any way to rewrite this that would avoid having intermediate results that are only used to check existence condition?
Sample imput User.csv Group.csv UserGroup.csv
Expected results: 483, 272, 121

It might be possible to simplify that query.
I think that the group on the UserId can be avoided.
By using distinct conditional counts on the user id.
Then there's no need for a sub-query.
SELECT
COUNT(DISTINCT case when [User].[Type] = 1 then [User].Id end) as Count1,
COUNT(DISTINCT case when [User].[Type] = 2 then [User].Id end) as Count2,
COUNT(DISTINCT case when [User].[Type] = 2 AND Condition = 1 then [User].Id end) as Count3
FROM [User]
LEFT JOIN [UserGroup] ON [UserGroup].UserId = [User].Id
LEFT JOIN [Group] ON [Group].Id = [UserGroup].GroupId;

SELECT
SUM(case when NestedCount1 > 0 then 1 else 0 end) as Count1,
SUM(case when NestedCount2 > 0 then 1 else 0 end) as Count2,
SUM(case when NestedCount3 > 0 then 1 else 0 end) as Count3
FROM
(
SELECT
[User].Id,
COUNT(case when Type = 1 then 1 else 0 end) as NestedCount1,
COUNT(case when Type = 2 then 1 else 0 end) as NestedCount2,
COUNT(case when Type = 2 AND Condition = 1 then 1 else 0 end) as NestedCount3
FROM [User]
LEFT JOIN [UserGroup] ON [UserGroup].UserId = [User].Id
LEFT JOIN [Group] ON [Group].Id = [UserGroup].GroupId
GROUP BY [User].Id
) nested

Why does this query have two selects?

I have this query :
SELECT WorkId, RegisterDate, sum(RoomType1) As RoomType1, sum(RoomType2) As RoomType2, sum(RoomType3) As RoomType3, sum(RoomType4) As RoomType4, sum(RoomType5) As RoomType5, sum(RoomType6) As RoomType6, sum(RoomType7) As RoomType7, sum(RoomType8) As RoomType8
FROM (
SELECT dbo.[Work].WorkId, dbo.[Work].RegisterDate,
case dbo.Floor.RoomType when 1 then 1 else 0 end as RoomType1,
case dbo.Kat.RoomType when 2 then 1 else 0 end as RoomType2,
FROM dbo.Belediye INNER JOIN
dbo.[Is] ON dbo.Municipality.MunicipalityId= dbo.[Is].MunicipalityWorkId INNER JOIN
dbo.Look ON dbo.[Work].LookWorkId = dbo.Look.LookId ,
WHERE (dbo.Look.LocationIS NOT NULL)
) E
GROUP BY WorkId,
This query works as expected, but I can't understand why it has two selects, why does it need them? Please explain it to me. Thanks.

As you suspected this query dont need two selects and could be rewritten without sub-query:
SELECT i.IsId,
i.KayitTarihi,
SUM(case k.OdaTipi when 1 then 1 else 0 end) as RoomType1,
SUM(case k.OdaTipi when 2 then 1 else 0 end) as RoomType2,
SUM(case k.OdaTipi when 3 then 1 else 0 end) as RoomType3,
SUM(case k.OdaTipi when 4 then 1 else 0 end) as RoomType4,
SUM(case k.OdaTipi when 5 then 1 else 0 end) as RoomType5,
SUM(case k.OdaTipi when 6 then 1 else 0 end) as RoomType6,
SUM(case k.OdaTipi when 7 then 1 else 0 end) as RoomType7,
SUM(case k.OdaTipi when 8 then 1 else 0 end) as RoomType8
FROM dbo.Belediye b
INNER JOIN dbo.[Is] i
ON b.BelediyeId = i.BelediyeIsId
INNER JOIN dbo.YerGorme yg
ON i.YerGormeIsId = yg.YerGormeId
INNER JOIN dbo.Kat k
ON yg.YerGormeId = k.YerGorme_YerGormeId
WHERE yg.Lokasyon IS NOT NULL
GROUP BY i.IsId, i.KayitTarihi
Note: use table aliases

How to do addition and division of aliased columns in a query?

I am using SQL Server 2008.
I am trying to do some basic math in some basic queries. I need to add up wins, losses, total, and percentages. I usually ask for the raw numbers and then do the calculations once I return my query to page. I would like to give SQL Server the opportunity to work a little harder.
What I want to do is something like this:
SELECT SUM(case when vote = 1 then 1 else 0 end) as TotalWins,
SUM(case when vote = 0 then 1 else 0 end) as TotalLosses,
TotalWins + TotalLosses as TotalPlays,
TotalPlays / TotalWins as PctWins
Here's what I am doing now:
SELECT SUM(case when vote = 1 then 1 else 0 end) as TotalWins,
SUM(case when vote = 0 then 1 else 0 end) as TotalLosses,
SUM(case when vote = 1 then 1 else 0 end) + SUM(case when vote = 0 then 1 else 0 end) as Votes
What is the easiest, cleanest way to do simple math calculations like this in a query?
*EDIT: *
While I got some great answers, I didn't get what I was looking for.
The scores that I will be calculating are for a specific team, so, my results need to be like this:
TeamID Team Wins Losses Totals
1 A's 5 3 8
2 Bee's 7 9 16
3 Seas 1 3 4
SELECT T.TeamID,
T.Team,
V.TotalWins,
V.TotalLosses,
V.PctWins
FROM Teams T
JOIN
SELECT V.TeamID,
SUM(case when vote = 1 then 1 else 0 end) as V.TotWin,
SUM(case when vote = 0 then 1 else 0 end) as V.TotLoss
FROM Votes V
GROUP BY V.TeamID
I tried a bunch of things, but don't quite know what wrong. I am sure the JOIN part is where the problem is though. How do I bring these two resultsets together?

One way is to wrap your query in an external one:
SELECT TotalWins,
TotalLosses,
TotalWins + TotalLosses as TotalPlays,
TotalPlays / TotalWins as PctWins
FROM
( SELECT SUM(case when vote = 1 then 1 else 0 end) as TotalWins,
SUM(case when vote = 0 then 1 else 0 end) as TotalLosses
FROM ...
)
Another way (suggested by #Mike Christensen) is to use Common Table Expressions (CTE):
; WITH Calculation AS
( SELECT SUM(case when vote = 1 then 1 else 0 end) as TotalWins,
SUM(case when vote = 0 then 1 else 0 end) as TotalLosses
FROM ...
)
SELECT TotalWins,
TotalLosses,
TotalWins + TotalLosses as TotalPlays,
TotalPlays / TotalWins as PctWins
FROM
Calculation
Sidenote: No idea if this would mean any preformance difference in SQL-Server but you can also write these sums:
SUM(case when vote = 1 then 1 else 0 end)
as counts:
COUNT(case when vote = 1 then 1 end) --- the ELSE NULL is implied

try
select a, b, a+b as total
from (
select
case ... end as a,
case ... end as b
from realtable
) t

To answer your second question, this is the code you put forward with corrections to the syntax:
SELECT
T.TeamID,
T.Team,
V.TotalWins,
V.TotalLosses,
PctWins = V.TotalWins * 100 / CAST(V.TotalWins + V.TotalLosses AS float)
FROM Teams T
JOIN (
SELECT
TeamID,
SUM(case when vote = 1 then 1 else 0 end) as TotalWins,
SUM(case when vote = 0 then 1 else 0 end) as TotalLosses
FROM Votes
GROUP BY TeamID
) as V on T.TeamID = V.TeamID
Note the brackets around the inner select.

It might help you if you're doing this sort of thing more than once to create a view...
CREATE VIEW [Totals]
SELECT
SUM(case when T.vote = 1 then 1 else 0 end) as TotalWins,
SUM(case when T.vote = 0 then 1 else 0 end) as TotalLosses,
T.SomeGroupColumn
FROM SomeTable T
GROUP BY T.SomeGroupColumn

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

SQL query rewrite for prettification and or performance improvement - sql

Related

Adding a dummy identifier to data that varies by position and value

SQL Group By with multiple counts

SQL ANY as a function instead of an operator

Why does this query have two selects?

How to do addition and division of aliased columns in a query?

Categories

Resources