GROUP BY with nested case expression - is there a better way?

SQL server 2012
I am taking fees received and multiplying them by different factors based on how long the Client has been a Client. The group by clause is fairly straightforward. However, my select gets awkward when I want to use this criterion in different ways:
select mp.professionals
,case when sl.stmndate < dateadd(year, 3, m.qClientOpenDate) then 'New' else 'Old' end age -- straight forward
,case (case when sl.stmndate < dateadd(year, 3, m.qClientOpenDate) then 'New' else 'Old' end) -- nested case
when 'New' then sum(fees) * 0.5
when 'Old' then sum(fees) * 0.25
else 0
end Credit
,case (case when sl.stmndate < dateadd(year, 3, m.qClientOpenDate) then 'New' else 'Old' end) -- nested case
when 'New' then 'Welcome!'
when 'Old' then 'Thank you for being a long-time Client!'
end Greeting
from mattersprofessionals mp
inner join matters m on m.matters = mp.matters
inner join stmnledger sl on sl.matters = mp.matters
group by mp.professionals, case when sl.stmndate < dateadd(year, 3, m.qClientOpenDate) then 'New' else 'Old' end
I suppose I should mention this is a simplified version of my actual case statements.
I was hoping there was a way to group by sl.stmndate < dateadd(year, 3, m.qClientOpenDate) as a boolean or something so I don't have to do the nested case expressions.
I know I could do a sub-query on the basic case and then do more case expressions in the outer query. But that's just rearranging the same nested case concept.

When the values are calculated directly from the row, I tend to use cross apply for this. It is more concise than adding a derived table or CTE whose only purpose is to define a column alias but which still has to project the remaining columns and contain a FROM clause.
This is not an option for expressions that reference window functions or aggregate functions, but it works fine here.
select mp.professionals
,ca.age -- computed once in the cross apply
,case ca.age -- simple case on the alias, no nesting needed
when 'New' then sum(fees) * 0.5
when 'Old' then sum(fees) * 0.25
else 0
end Credit
,case ca.age -- simple case on the alias, no nesting needed
when 'New' then 'Welcome!'
when 'Old' then 'Thank you for being a long-time Client!'
end Greeting
from mattersprofessionals mp
inner join matters m on m.matters = mp.matters
inner join stmnledger sl on sl.matters = mp.matters
cross apply (select case when sl.stmndate < dateadd(year, 3, m.qClientOpenDate) then 'New' else 'Old' end) ca(age)
group by mp.professionals, ca.age
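As a rough sketch of the same idea, the cross apply can also expose the condition as a 0/1 flag, which is about as close as SQL Server gets to grouping by a boolean. The ledger CTE, its column names, and the sample rows are made up for illustration; only the 3-year rule and the 0.5/0.25 factors come from the question.

```sql
-- Hypothetical sample data standing in for the joined tables in the question.
with ledger (professional, open_date, stmn_date, fees) as (
    select *
    from (values
        ('Ann', cast('2015-01-01' as date), cast('2016-06-01' as date), 100.00),
        ('Ann', cast('2015-01-01' as date), cast('2019-06-01' as date), 200.00),
        ('Bob', cast('2018-03-01' as date), cast('2018-09-01' as date),  80.00)
    ) v (professional, open_date, stmn_date, fees)
)
select l.professional
,ca.is_new -- 1 = within 3 years of the open date, 0 = older
,sum(l.fees) * case ca.is_new when 1 then 0.5 else 0.25 end as credit
from ledger l
-- The flag is defined once here and reused in select and group by.
cross apply (values (case when l.stmn_date < dateadd(year, 3, l.open_date) then 1 else 0 end)) ca(is_new)
group by l.professional, ca.is_new
```

The flag still has to come from a case expression, because SQL Server has no boolean column type, but the expression is written only once.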

You can simplify the query a little by pre-computing the case expression in a subquery:
select
professionals,
age,
sum(fees) * case age
when 'New' then 0.5
when 'Old' then 0.25
else 0
end credit,
case age
when 'New' then 'Welcome!'
when 'Old' then 'Thank you for being a long-time Client!'
end greeting
from (
select
mp.professionals,
fees, -- must be projected here so the outer query can sum it
case when sl.stmndate < dateadd(year, 3, m.qClientOpenDate)
then 'New'
else 'Old'
end age
from mattersprofessionals mp
inner join matters m on m.matters = mp.matters
inner join stmnledger sl on sl.matters = mp.matters
) t
group by professionals, age
Note: the else branch in the fees calculation is probably never reached (since age always takes either value 'New' or 'Old').

Technically I think the answer to my question is "No". There isn't a way to group by the boolean portion of a case statement so it can be used in various ways within the select.
I upvoted GMB's and Martin Smith's answers as they are helpful and informative. I am playing with the cross apply, a learning experience for sure.

Related

SQL Server CASE Statement Evaluate Expression Once

I may be missing something obvious! Thanks in advance for any help.
I am trying to use a CASE statement in an inline SQL Statement. I only want to evaluate the expression once, so I am looking to put the expression in the CASE section, and then evaluate the result in each WHEN. Here is the example:
SELECT
MyTable.ColumnA,
CASE DateDiff(d, MyTable.MyDate, getDate())
WHEN <= 0 THEN 'bad'
WHEN BETWEEN 1 AND 15 THEN 'reasonable'
ELSE 'good'
END as MyCalculatedColumn,
MyTable.SomeOtherColumn
I know I can do this:
CASE
WHEN DateDiff(d, MyTable.MyDate, getDate()) <= 0 THEN 'bad'
WHEN DateDiff(d, MyTable.MyDate, getDate()) BETWEEN 1 AND 15 THEN 'reasonable'
ELSE 'good'
END
But in my first example, SQL does not seem to like this statement:
WHEN <= 0 THEN 'bad'
Note that the statement is inline with other SQL, so I can't do something like:
DECLARE @DaysDiff bigint
SET @DaysDiff = DateDiff(d, MyTable.MyDate, getDate())
CASE @DaysDiff
WHEN <= 0 THEN 'bad'
WHEN BETWEEN 1 AND 15 THEN 'reasonable'
ELSE 'good'
END
My actual DateDiff expression is much more complex and I only want to maintain its logic, and have it evaluated, only once.
Thanks again...
You can use apply for this purpose:
SELECT MyTable.ColumnA,
(CASE WHEN day_diff <= 0 THEN 'bad'
WHEN day_diff BETWEEN 1 AND 15 THEN 'reasonable'
ELSE 'good'
END) as MyCalculatedColumn,
MyTable.SomeOtherColumn
FROM MyTable CROSS APPLY
(VALUES (DateDiff(day, MyTable.MyDate, getDate()))) v(day_diff)
APPLY is a very handy way to add calculated values into a statement. Because they are defined in the FROM clause, they can be used in SELECT, WHERE, and GROUP BY clauses where column aliases would not be recognized.
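A minimal sketch of that last point, reusing the applied value in WHERE and GROUP BY as well as SELECT. The inline MyTable CTE and its three rows are invented so the statement runs on its own; the real table would take its place.

```sql
-- Hypothetical rows; MyDate is set relative to today so the example stays stable.
WITH MyTable (ColumnA, MyDate) AS (
    SELECT *
    FROM (VALUES
        ('a', DATEADD(day, -3, CAST(GETDATE() AS date))),
        ('b', DATEADD(day, -3, CAST(GETDATE() AS date))),
        ('c', DATEADD(day,  2, CAST(GETDATE() AS date)))
    ) t (ColumnA, MyDate)
)
SELECT v.day_diff,
       COUNT(*) AS row_count
FROM MyTable CROSS APPLY
     (VALUES (DateDiff(day, MyTable.MyDate, getDate()))) v(day_diff)
WHERE v.day_diff > 0   -- the applied column is visible in WHERE
GROUP BY v.day_diff    -- and in GROUP BY
```

A plain SELECT-list alias could not be referenced in either of those two clauses.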
I see your problem. CASE expression WHEN value can only do an equality check.
You could try using a CTE (Common Table Expression): do everything except the CASE expression in the CTE, and then put the CASE in the final SELECT at the end. I'm not sure whether it will prevent the expression being evaluated twice. That's kind of the optimiser's problem, not yours (that's how I like to think about it).
WITH cteMyComplexThing AS(
SELECT MyTable.ColumnA,
DateDiff(d, MyTable.MyDate, getDate()) as ComplexThing,
MyTable.SomeOtherColumn
FROM MyTable
)
SELECT
ColumnA,
CASE
WHEN ComplexThing <= 0 THEN 'bad'
WHEN ComplexThing BETWEEN 1 AND 15 THEN 'reasonable'
ELSE 'good'
END as MyCalculatedColumn,
SomeOtherColumn
FROM cteMyComplexThing
The WHEN clause in a CASE statement needs both sides of the condition. <=0 cannot stand by itself.
CASE @DaysDiff
WHEN ? <= 0 THEN 'bad'
WHEN BETWEEN 1 AND 15 THEN 'reasonable'
ELSE 'good'
END
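To make the distinction concrete, here is a small sketch contrasting the two forms of CASE. The inline values and the column name days_diff are placeholders, not part of the original question.

```sql
SELECT v.days_diff,
       -- Simple CASE: the input is compared to each WHEN value for equality only.
       CASE v.days_diff
            WHEN 0 THEN 'today'
            ELSE 'some other day'
       END AS simple_case,
       -- Searched CASE: each WHEN is a full predicate, so ranges are allowed.
       CASE
            WHEN v.days_diff <= 0 THEN 'bad'
            WHEN v.days_diff BETWEEN 1 AND 15 THEN 'reasonable'
            ELSE 'good'
       END AS searched_case
FROM (VALUES (-2), (0), (7), (30)) v (days_diff)
```

The simple form takes the expression once but only tests equality; the searched form allows any comparison but has to name the value in every WHEN.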

Combine Three Grouped Select Queries Into One

--ON TIME PMWO's
SELECT LOCATION, COUNT(WONUM) AS OnTimePMWOs
FROM
WORKORDER
WHERE worktype = 'pm' and actfinish<=targcompdate
GROUP BY LOCATION
--PAST DUE PMWO'S
SELECT LOCATION, COUNT(WONUM) AS PastDuePMWOs
FROM
WORKORDER
WHERE worktype = 'pm' and actfinish>=targcompdate
GROUP BY LOCATION
--30 DayForecast-
SELECT W.location, COUNT(W.wonum) AS Forecast30days
from
workorder AS W
INNER JOIN PMFORECAST AS P
ON W.CHANGEDATE=P.CHANGEDATE
WHERE WORKTYPE='PM' AND P.forecastdate>= GETDATE()+30
GROUP BY LOCATION
This is an answer (albeit just a guess based on the limited question). I'm not a fan of placing CASE expressions inside aggregates, but depending on environment, indexing, and data in the included tables, it might perform okay. Please be more involved when posting things: show us what you've tried, explain the problem, give samples of data, and include the desired output. The better the question, the better the chance of a good answer. Okay, done on the high horse...
DECLARE @Forecast30days DATE
SET @Forecast30days = CURRENT_TIMESTAMP + 30
SELECT
wo.LOCATION
,COUNT(CASE WHEN wo.actfinish <= wo.targcompdate THEN wo.WONUM ELSE NULL END) AS OnTimePMWOs
,COUNT(CASE WHEN wo.actfinish >= wo.targcompdate THEN wo.WONUM ELSE NULL END) AS PastDuePMWOs
,COUNT(CASE WHEN pm.changedate IS NOT NULL THEN wo.WONUM ELSE NULL END) AS Forecast30days
FROM WORKORDER AS wo
LEFT JOIN PMFORECAST AS pm
ON wo.changedate = pm.changedate
AND pm.forecastdate >= @Forecast30days
WHERE wo.worktype = 'pm'
AND ((wo.actfinish IS NOT NULL AND wo.targcompdate IS NOT NULL)
OR (pm.changedate IS NOT NULL))
GROUP BY
wo.LOCATION

Alternative approach to multiple left joins

I have the following queries to count sales by route
SELECT DISTINCT q.sales_route,
y.yesterday,
t.today
FROM tblquotesnew q
left join (SELECT tblquotesnew.sales_route,
Count(tblquotesnew.sales_route) AS Yesterday
FROM tblquotesnew
WHERE tblquotesnew.date_sent_to_registrations =
Trunc(SYSDATE - 1)
AND sales_route IS NOT NULL
GROUP BY tblquotesnew.sales_route) y
ON q.sales_route = y.sales_route
left join (SELECT tblquotesnew.sales_route,
Count(tblquotesnew.sales_route) AS Today
FROM tblquotesnew
WHERE tblquotesnew.date_sent_to_registrations =
Trunc(SYSDATE)
AND sales_route IS NOT NULL
GROUP BY tblquotesnew.sales_route) t
ON q.sales_route = t.sales_route
I then have 6 other left joins to count current and previous week, month, and year.
This approach does work, but I was wondering if there is a more efficient (in terms of lines of code) way of pulling together this data?
I think you just need conditional aggregation:
select q.sales_route,
sum(case when q.date_sent_to_registrations = trunc(SYSDATE - 1)
then 1 else 0
end) as yesterday,
sum(case when q.date_sent_to_registrations = trunc(SYSDATE)
then 1 else 0
end) as today
from tblquotesnew q
group by sales_route
You may use conditional aggregation
SELECT sales_route,
sum(CASE WHEN date_sent_to_registrations = Trunc(SYSDATE)
AND sales_route IS NOT NULL
THEN 1 ELSE 0 END) today,
sum(CASE WHEN date_sent_to_registrations = Trunc(SYSDATE - 1)
AND sales_route IS NOT NULL
THEN 1 ELSE 0 END) yesterday
FROM tblquotesnew
GROUP BY sales_route
Conditional aggregation leads to one sequential scan of your table, which may be OK in many cases. An alternative solution is to use scalar subqueries in the SELECT list, which may sometimes be more efficient, for example if each subquery touches only a small subset of the data and you can create indexes to support it.
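A sketch of what the scalar-subquery variant might look like for the same table (Oracle syntax, matching the question). The aliases r and q are illustrative, and whether it beats conditional aggregation depends on the data and on the indexes available.

```sql
SELECT r.sales_route,
       -- Each scalar subquery counts one period for the current route.
       (SELECT COUNT(*)
          FROM tblquotesnew q
         WHERE q.sales_route = r.sales_route
           AND q.date_sent_to_registrations = TRUNC(SYSDATE - 1)) AS yesterday,
       (SELECT COUNT(*)
          FROM tblquotesnew q
         WHERE q.sales_route = r.sales_route
           AND q.date_sent_to_registrations = TRUNC(SYSDATE)) AS today
  FROM (SELECT DISTINCT sales_route
          FROM tblquotesnew
         WHERE sales_route IS NOT NULL) r
```

An index on (sales_route, date_sent_to_registrations) is the kind of support this form would rely on.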

SQL Return Value of Zero so filter remains

The query below works great except for one thing.
I am using the "Workstream" column as a filter in my end user report. Since it is a filter, I need it to be there regardless of whether or not the query returns anything.
Of course I have to use the where clause as a filter within the query so that only certain tasks are returned.
The trouble is that when I have a workstream without any task_Significance items the workstream is removed and the report filter (Workstream) does not work.
What I need is the same table, but when a workstream has no matching tasks it should still appear, with 0s all the way across. In that case the query would return only the filter name, so the final report would still have the filter and just show an empty table below it.
This is my code:
SELECT
p.[Workstream]
,T.[Task_Significance]
,COUNT(1) AS Total
,SUM(case when T.[TaskPercentCompleted] >= 100 then 1 else 0 end) AS Actual
,SUM(case when T.[TaskFinishDate] <= DATEADD(DAY, 8-DATEPART(DW, GETDATE()), Convert(date,getdate())) then 1 else 0 end) AS Planned
FROM [psmado].[dbo].[MSP_EpmProject_UserView] as P
join [PSMADO].[dbo].[MSP_EpmTask_UserView] as T
On T.[projectUID] = P.[projectUID]
WHERE
[Task_Significance] IN('Application', 'Data', 'Interface', 'End User Compute', 'Network', 'Compute Package', 'Data Center', 'CREWS Sites', 'App Design Review', 'Infra Design Review')
GROUP BY p.[EnterpriseProjectTypeUID], p.[Workstream],T.[Task_Significance]
Is there any way to do this?
Use a left join and move your where clause to the on clause. This will return all p.[Workstream] and those t.Task_Significance that are in your in() list.
You can use coalesce() or isnull() to substitute a value for null when there are no matching t.Task_Significance.
select
p.[Workstream]
, coalesce(T.[Task_Significance], 'none') as Task_Significance
, count(T.[projectuid]) as Total -- counts matched task rows only, so an empty workstream shows 0
, sum(case when T.[TaskPercentCompleted] >= 100 then 1 else 0 end) as Actual
, sum(case when T.[TaskFinishDate]
<= dateadd(day, 8 - datepart(dw, getdate()), Convert(date, getdate()))
then 1 else 0 end) as Planned
from [psmado].[dbo].[msp_EpmProject_UserView] as P
left join [psmado].[dbo].[msp_EpmTask_UserView] as T
on T.[projectuid] = P.[projectuid]
and T.[Task_Significance] in ('Application', 'Data', 'Interface'
, 'End User Compute', 'Network', 'Compute Package', 'Data Center'
, 'CREWS Sites', 'App Design Review', 'Infra Design Review')
group by
p.[EnterpriseProjectTypeuid]
, p.[Workstream]
, T.[Task_Significance]
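The effect of putting the filter in the on clause can be seen in a tiny self-contained sketch. The two CTEs and their rows are invented stand-ins for the project and task views.

```sql
-- Hypothetical data: project 2 ('Beta') has no task in the wanted list.
with p (projectuid, workstream) as (
    select * from (values (1, 'Alpha'), (2, 'Beta')) v (projectuid, workstream)
),
t (projectuid, task_significance) as (
    select * from (values (1, 'Data'), (1, 'Network'), (2, 'Other')) v (projectuid, task_significance)
)
select
    p.workstream
  , count(t.projectuid) as Total  -- counts only matched task rows
from p
left join t
  on t.projectuid = p.projectuid
 and t.task_significance in ('Data', 'Network')  -- filter stays in the on clause
group by p.workstream
-- returns: Alpha 2, Beta 0
```

If the in() test were moved to a where clause instead, the null-extended Beta row would fail it and the workstream would disappear from the result again.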

Transact-SQL Sub Query

I'm struggling to find the logic of how to accomplish a sub query, or at least that's what I think is required! I'll show what I have:
SELECT CH.SFA,
convert(datetime, RE.START_DATE, 103) AS 'START DATE',
Count(Distinct CH.CHNO) AS 'TOTAL CH',
Count(CH.STATUS) AS 'COMPLETED CH',
count(distinct CH.CHNO + CH.STATUS) As 'COMPLETED CH2'
FROM CUSTOMER.dbo.CH CH, CUSTOMER.dbo.RE RE
WHERE
RE.SFA = CH.SFA
GROUP BY
CH.SFA, RE.START_DATE
What I am trying to do is restrict COMPLETED CH2 to rows whose status is C. The Status column is either blank or 'C', and a distinct count of CHNO combined with the C would give me the result I need, but I cannot for the life of me find out how to write it!
I am using Microsoft Query to take the data from its source straight into the Excel spreadsheet.
Many thanks for taking a look.
Been ages since I've used MS Query so I'm fuzzy on syntax, but this is the general idea of how to write a subquery containing a WHERE clause and an aggregation to get you started:
SELECT
CH.SFA,
convert(datetime, RE.START_DATE, 103) AS 'START DATE',
Count(DISTINCT CH.CHNO) AS 'TOTAL CH',
Count(CH.STATUS) AS 'COMPLETED CH',
CCH.COMPLETED_CH2 AS 'COMPLETED CH2'
FROM CUSTOMER.dbo.CH CH
INNER JOIN CUSTOMER.dbo.RE RE
ON RE.SFA = CH.SFA
LEFT JOIN (
SELECT SFA, COUNT(DISTINCT CH.CHNO) AS COMPLETED_CH2
FROM CUSTOMER.dbo.CH
WHERE STATUS = 'C'
GROUP BY SFA
) AS CCH
ON RE.SFA = CCH.SFA
GROUP BY CH.SFA, RE.START_DATE, CCH.COMPLETED_CH2
If you just want to know the count of records where CH.STATUS = 'C', then add another COUNT with CASE logic.
COUNT(CASE WHEN CH.STATUS = 'C' then 1 else null end) as 'COMPLETED CH2'
When combining COUNT and CASE, remember to return NULL in the ELSE branch (or omit ELSE entirely); otherwise all rows will be counted, because COUNT skips only NULLs.
As an alternative, you can do it with a SUM:
SUM(CASE WHEN CH.STATUS = 'C' then 1 else 0 end) as 'COMPLETED CH2'
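The difference between these variants shows up on a few made-up rows. The inline values below are purely illustrative, and the last column is one possible way to get the distinct count of CHNO that the question asks for.

```sql
SELECT
    COUNT(CASE WHEN s.status = 'C' THEN 1 END)                AS completed_rows,     -- 3
    COUNT(CASE WHEN s.status = 'C' THEN 1 ELSE 0 END)         AS all_rows,           -- 4: the 0 is not NULL, so it is counted
    SUM(CASE WHEN s.status = 'C' THEN 1 ELSE 0 END)           AS completed_sum,      -- 3
    COUNT(DISTINCT CASE WHEN s.status = 'C' THEN s.chno END)  AS completed_distinct  -- 2: CHNO 1 and 3
FROM (VALUES (1, 'C'), (1, 'C'), (2, ''), (3, 'C')) s (chno, status)
```

SQL Server may emit a warning that NULL values were eliminated by an aggregate; that is expected here and does not affect the results.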