I am calculating the age of a user based on their date of birth.
select UserId, (Convert(int,Convert(char(8),GETDATE(),112))-Convert(char(8),[DateOfBirth],112))/10000 AS [Age] FROM dbo.[User]
This gives me the UserId and his age.
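The arithmetic behind that expression may not be obvious: style 112 formats a date as YYYYMMDD, so subtracting the two 8-digit integers and integer-dividing by 10000 leaves the number of completed years. A minimal Python sketch of the same calculation (the function name and sample dates are illustrative, and it assumes the date of birth is not in the future):

```python
from datetime import date


def age_from_yyyymmdd(date_of_birth: date, today: date) -> int:
    """Mirror (YYYYMMDD(today) - YYYYMMDD(dob)) / 10000 with integer division."""
    def as_int(d: date) -> int:
        # Same digits that CONVERT(char(8), d, 112) would produce, read as an integer.
        return d.year * 10000 + d.month * 100 + d.day

    return (as_int(today) - as_int(date_of_birth)) // 10000


print(age_from_yyyymmdd(date(1990, 6, 15), date(2024, 6, 14)))  # day before the birthday
print(age_from_yyyymmdd(date(1990, 6, 15), date(2024, 6, 15)))  # on the birthday
```

The two calls print 33 and 34: the age only increments once the month-and-day part of today's date reaches that of the birth date.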
Now I want to group this result.
How many users are in their 30s, how many in their 40s and how many in their 50s? I need the count of users per age group.
If the user's age is > 0 and < 30, he should be grouped into the 20s.
If the user's age is >= 30 and < 40, he should be added to the 30s list; the same applies to the 40s and 50s.
Can this be achieved without creating a temp table?
I believe this will get you what you want.
Anything < 30 will be placed in the '20' group.
Anything >= 50 will be placed in the '50' group.
If they are 30-39 or 40-49, they will be placed in their appropriate decade group.
SELECT y.AgeDecade, COUNT(*)
FROM dbo.[User] u
CROSS APPLY (SELECT Age = (CONVERT(int, CONVERT(char(8), GETDATE(), 112)) - CONVERT(int, CONVERT(char(8), u.DateOfBirth, 112))) / 10000) x
CROSS APPLY (SELECT AgeDecade = CASE
WHEN x.Age <= 29 THEN 20
WHEN x.Age BETWEEN 30 AND 39 THEN 30
WHEN x.Age BETWEEN 40 AND 49 THEN 40
WHEN x.Age >= 50 THEN 50
ELSE NULL
END
) y
GROUP BY y.AgeDecade
Placing the logic into CROSS APPLY makes it easier to reuse within the same query: you can reference it in SELECT, GROUP BY, ORDER BY, WHERE, etc., without duplicating it. This could also be done with a CTE, but in this scenario CROSS APPLY is my preference.
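For readers who want to check the bucketing rules outside the database, here is a small Python sketch of the same CASE logic followed by a count per group. The function names and sample ages are made up for illustration; only the thresholds come from the query above.

```python
from collections import Counter


def age_decade(age: int) -> int:
    """Same thresholds as the CASE expression: <= 29, 30-39, 40-49, >= 50."""
    if age <= 29:
        return 20
    if age <= 39:
        return 30
    if age <= 49:
        return 40
    return 50


def count_by_decade(ages):
    """Equivalent of GROUP BY AgeDecade with COUNT(*)."""
    return Counter(age_decade(a) for a in ages)


print(count_by_decade([18, 25, 31, 44, 47, 63]))
```

As with the GROUP BY, a decade that has no users simply does not appear in the result.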
Update:
You asked in the comments how to show a count of 0 when no people exist for an age group. In most cases the answer is simple: LEFT JOIN. As with everything, there's always more than one way to bake a cake.
Here are a couple of ways you can do it:
The simple left join: take the query from my original answer and combine it with a lookup table of age groups using a left join. You could do this in a few ways (CTE, temp table, table variable, sub-query, etc.). The takeaway is that you need to isolate your User table somehow.
The simple sub-query method, nothing fancy: wrap the whole query in a sub-query, then left join it to the new lookup table.
DECLARE @AgeGroup TABLE (AgeGroupID tinyint NOT NULL);
INSERT INTO @AgeGroup (AgeGroupID) VALUES (20),(30),(40),(50);
SELECT ag.AgeGroupID, TotalCount = COUNT(a.AgeDecade)
FROM @AgeGroup ag
LEFT JOIN (
SELECT y.AgeDecade
FROM #User u
CROSS APPLY (SELECT Age = (CONVERT(int, CONVERT(char(8), GETDATE(), 112)) - CONVERT(int, CONVERT(char(8), u.DateOfBirth, 112))) / 10000) x
CROSS APPLY (SELECT AgeDecade = CASE
WHEN x.Age <= 29 THEN 20
WHEN x.Age BETWEEN 30 AND 39 THEN 30
WHEN x.Age BETWEEN 40 AND 49 THEN 40
WHEN x.Age >= 50 THEN 50
ELSE NULL
END
) y
) a ON a.AgeDecade = ag.AgeGroupID
GROUP BY ag.AgeGroupID;
This would be the exact same thing as using a cte:
DECLARE @AgeGroup TABLE (AgeGroupID tinyint NOT NULL);
INSERT INTO @AgeGroup (AgeGroupID) VALUES (20),(30),(40),(50);
WITH cte_Users AS (
SELECT y.AgeDecade
FROM #User u
CROSS APPLY (SELECT Age = (CONVERT(int, CONVERT(char(8), GETDATE(), 112)) - CONVERT(int, CONVERT(char(8), u.DateOfBirth, 112))) / 10000) x
CROSS APPLY (SELECT AgeDecade = CASE
WHEN x.Age <= 29 THEN 20
WHEN x.Age BETWEEN 30 AND 39 THEN 30
WHEN x.Age BETWEEN 40 AND 49 THEN 40
WHEN x.Age >= 50 THEN 50
ELSE NULL
END
) y
)
SELECT ag.AgeGroupID, TotalCount = COUNT(a.AgeDecade)
FROM @AgeGroup ag
LEFT JOIN cte_Users a ON a.AgeDecade = ag.AgeGroupID
GROUP BY ag.AgeGroupID;
Choosing between the two is purely preference. There's no performance gain to using a CTE here.
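The reason the LEFT JOIN produces zero rows can be sketched in a few lines of Python: the result is driven by the list of groups rather than by the users, so every group is reported even when nothing matches it. The function name and sample data below are illustrative only.

```python
from collections import Counter


def counts_with_zeros(age_groups, user_decades):
    """Report every group, like a lookup table LEFT JOINed to the per-user decades."""
    counts = Counter(user_decades)
    # A Counter returns 0 for missing keys, which plays the role of
    # COUNT(a.AgeDecade) over the NULLs produced by unmatched rows.
    return {group: counts[group] for group in age_groups}


print(counts_with_zeros([20, 30, 40, 50], [20, 20, 50]))
```

Here the 30 and 40 groups come back with a count of 0 instead of disappearing.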
Bonus:
If you wanted to table-drive your groups and also have 0 counts, you could do something like this. I will warn you, though, to be careful with APPLY operators, because their performance can sometimes be tricky.
IF OBJECT_ID('tempdb..#User','U') IS NOT NULL DROP TABLE #User; --SELECT * FROM #User
WITH c1 AS (SELECT x.x FROM (VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) x(x)) -- 10
, c2(x) AS (SELECT 1 FROM c1 x CROSS JOIN c1 y) -- 10 * 10
SELECT UserID = IDENTITY(int,1,1)
, DateOfBirth = CONVERT(date, GETDATE()-(RAND(CHECKSUM(NEWID()))*18500))
INTO #User
FROM c2 u;
IF OBJECT_ID('tempdb..#AgeRange','U') IS NOT NULL DROP TABLE #AgeRange; --SELECT * FROM #AgeRange
CREATE TABLE #AgeRange (
AgeRangeID tinyint NOT NULL IDENTITY(1,1),
RangeStart tinyint NOT NULL,
RangeEnd tinyint NOT NULL,
RangeLabel varchar(100) NOT NULL
);
INSERT INTO #AgeRange (RangeStart, RangeEnd, RangeLabel)
VALUES ( 0, 29, '< 30')
, (30, 39, '30 - 39')
, (40, 49, '40 - 49')
, (50, 255, '50+');
-- Using an OUTER APPLY
SELECT ar.RangeLabel, COUNT(y.UserID)
FROM #AgeRange ar
OUTER APPLY (
SELECT u.UserID
FROM #User u
CROSS APPLY (SELECT Age = (CONVERT(int, CONVERT(char(8), GETDATE(), 112)) - CONVERT(int, CONVERT(char(8), u.DateOfBirth, 112))) / 10000) x
WHERE x.Age BETWEEN ar.RangeStart AND ar.RangeEnd
) y
GROUP BY ar.RangeLabel, ar.RangeStart
ORDER BY ar.RangeStart;
-- Using a CTE
WITH cte_users AS (
SELECT u.UserID
, Age = (CONVERT(int, CONVERT(char(8), GETDATE(), 112)) - CONVERT(int, CONVERT(char(8), u.DateOfBirth, 112))) / 10000
FROM #User u
)
SELECT ar.RangeLabel, COUNT(u.UserID)
FROM #AgeRange ar
LEFT JOIN cte_users u ON u.Age BETWEEN ar.RangeStart AND ar.RangeEnd
GROUP BY ar.RangeStart, ar.RangeLabel
ORDER BY ar.RangeStart;
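The table-driven idea can be sketched in Python as well: the ranges are plain data, and each range counts the ages that fall between its inclusive bounds, just like BETWEEN. The labels, names and sample ages below are illustrative assumptions, not part of the queries above.

```python
# (range_start, range_end, label), both bounds inclusive like BETWEEN
AGE_RANGES = [
    (0, 29, "under 30"),
    (30, 39, "30 - 39"),
    (40, 49, "40 - 49"),
    (50, 255, "50 and over"),
]


def count_by_range(ages, ranges=AGE_RANGES):
    """One (label, count) pair per range, including ranges with no matching age."""
    return [
        (label, sum(1 for age in ages if start <= age <= end))
        for start, end, label in ranges
    ]


for label, total in count_by_range([22, 29, 30, 58, 71]):
    print(label, total)
```

Changing the groups then means editing the data, not the counting logic, which is the point of driving the query from #AgeRange.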
I would start by putting the age computation in a lateral join (CROSS APPLY in SQL Server), so it can easily be referred to. Then, if you want the age groups as rows, you can join a derived table that describes the intervals:
select v.age_group, count(*) as cnt_users
from dbo.[User] u
cross apply (values
((convert(int, convert(char(8), getdate(),112)) - convert(char(8), u.[DateOfBirth], 112))/10000)
) a(age)
inner join (values
( 0, 30, '0-30'),
(30, 40, '30-40'),
(40, 50, '40-50'),
(50, null, '50+')
) v(min_age, max_age, age_group)
on a.age >= v.min_age
and (a.age < v.max_age or v.max_age is null)
group by v.age_group
On the other hand, if you want the counts in columns, use conditional aggregation:
select
sum(case when a.age < 30 then 1 else 0 end) as age_0_30,
sum(case when a.age >= 30 and a.age < 40 then 1 else 0 end) as age_30_40,
sum(case when a.age >= 40 and a.age < 50 then 1 else 0 end) as age_40_50,
sum(case when a.age >= 50 then 1 else 0 end) as age_50
from dbo.[User] u
cross apply (values
((convert(int, convert(char(8), getdate(),112)) - convert(char(8), [DateOfBirth], 112))/10000)
) a(age)
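Conditional aggregation amounts to one pass over the rows with a separate conditional sum per output column. A rough Python equivalent, using the same column names as the query and made-up sample ages:

```python
def age_columns(ages):
    """One conditional count per output column, like SUM(CASE WHEN ... THEN 1 ELSE 0 END)."""
    return {
        "age_0_30": sum(1 for a in ages if a < 30),
        "age_30_40": sum(1 for a in ages if 30 <= a < 40),
        "age_40_50": sum(1 for a in ages if 40 <= a < 50),
        "age_50": sum(1 for a in ages if a >= 50),
    }


print(age_columns([18, 25, 31, 44, 47, 63]))
```

The result is a single row of counts, and a group with no users shows 0 rather than being absent.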
Yes, you can.
This query should work for you:
-- ROUND(..., -1, 1) truncates to the decade; LTRIM removes the padding that STR adds
SELECT LTRIM(STR(ROUND(DATEDIFF(year, DateOfBirth, GETDATE()), -1, 1))) + 's' AS [Age Group], COUNT(UserId) AS [Count]
FROM dbo.[User]
GROUP BY LTRIM(STR(ROUND(DATEDIFF(year, DateOfBirth, GETDATE()), -1, 1))) + 's'
For your updated question:
-- ROUND(..., -1, 1) truncates to the decade; LTRIM removes the padding that STR adds
SELECT CASE
WHEN ROUND(DATEDIFF(year, DateOfBirth, GETDATE()), -1, 1) < 30 THEN '20s'
WHEN ROUND(DATEDIFF(year, DateOfBirth, GETDATE()), -1, 1) >= 50 THEN '50s'
ELSE LTRIM(STR(ROUND(DATEDIFF(year, DateOfBirth, GETDATE()), -1, 1))) + 's'
END AS [Age Group], COUNT(UserId) AS [Count]
FROM dbo.[User]
GROUP BY CASE
WHEN ROUND(DATEDIFF(year, DateOfBirth, GETDATE()), -1, 1) < 30 THEN '20s'
WHEN ROUND(DATEDIFF(year, DateOfBirth, GETDATE()), -1, 1) >= 50 THEN '50s'
ELSE LTRIM(STR(ROUND(DATEDIFF(year, DateOfBirth, GETDATE()), -1, 1))) + 's'
END
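One thing worth keeping in mind with DATEDIFF(year, ...): it counts the calendar-year boundaries crossed between the two dates rather than completed years, so before the birthday has passed in the current year it is one higher than the person's age. A small Python sketch of the difference (the function names and dates are illustrative):

```python
from datetime import date


def year_boundaries_crossed(date_of_birth: date, today: date) -> int:
    """What DATEDIFF(year, dob, today) measures: the difference of the year parts."""
    return today.year - date_of_birth.year


def completed_years(date_of_birth: date, today: date) -> int:
    """Age in completed years: subtract one if the birthday has not happened yet this year."""
    before_birthday = (today.month, today.day) < (date_of_birth.month, date_of_birth.day)
    return today.year - date_of_birth.year - (1 if before_birthday else 0)


dob, today = date(1990, 12, 31), date(2024, 1, 1)
print(year_boundaries_crossed(dob, today), completed_years(dob, today))
```

For these dates the two values are 34 and 33, so a user near a decade boundary can land in the next group earlier than expected.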
You could use ROUND with a length argument of -1 and a non-zero function argument to truncate the value to "tens", and group by it:
-- UserId is left out of the SELECT list because it is not part of the GROUP BY
SELECT Round((Convert(int,Convert(char(8),GETDATE(),112))-Convert(char(8),[DateOfBirth],112))/10000, -1, 1) AS [Rounded Age],
Count(*) AS [Users]
FROM dbo.[User]
GROUP BY Round((Convert(int,Convert(char(8),GETDATE(),112))-Convert(char(8),[DateOfBirth],112))/10000, -1, 1)
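The effect of the third argument is easy to see on a few values. The Python sketch below imitates ROUND(n, -1) and ROUND(n, -1, 1) for non-negative integers only; the helper names are invented for this illustration.

```python
def round_to_tens(n: int) -> int:
    """Like ROUND(n, -1) for non-negative integers: nearest ten, halves round up."""
    return (n + 5) // 10 * 10


def truncate_to_tens(n: int) -> int:
    """Like ROUND(n, -1, 1) for non-negative integers: drop the last digit."""
    return n // 10 * 10


for age in (30, 34, 35, 39):
    print(age, round_to_tens(age), truncate_to_tens(age))
```

Plain rounding sends 35 and 39 to 40, whereas truncation keeps every age from 30 to 39 in the 30 group, which is what decade grouping needs.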
I'm trying to apply a condition to LAG in a SQL query. Does anyone know how to do this?
This is the query:
SELECT CONCAT([FirstName],' ',[LastName]) AS employee,
CAST([ArrivalTime] AS DATE) AS date,
CAST(DATEADD(hour,2,FORMAT([ArrivalTime],'HH:mm')) AS TIME) as time,
CASE [EventType]
WHEN 20001 THEN 'ENTRY'
ELSE 'EXIT'
END AS Action,
OutTime =
CASE [EventType]
WHEN '20001'
THEN DATEDIFF(minute,Lag([ArrivalTime],1) OVER(ORDER BY [CardHolderID], [ArrivalTime]), [ArrivalTime])
ELSE
NULL
END
FROM [CCFTEvent].[dbo].[ReportEvent]
LEFT JOIN [CCFTCentral].[dbo].[Cardholder] ON [CCFTEvent].[dbo].[ReportEvent].[CardholderID] = [CCFTCentral].[dbo].[Cardholder].[FTItemID]
WHERE EventClass = 41
AND [FirstName] IS NOT NULL
AND [FirstName] LIKE 'Leeann%'
The problem is that the subtraction also happens when the previous row belongs to a different date. In that case OutTime must be NULL as well.
The 910 is incorrect.
I'd add another condition to your CASE expression, i.e.:
...
CASE
-- previous row is on an earlier day: return NULL instead of a difference
WHEN [EventType] = '20001' AND DATEDIFF(DAY, LAG([ArrivalTime]) OVER (ORDER BY [CardHolderID], [ArrivalTime]), [ArrivalTime]) > 0
THEN NULL
WHEN [EventType] = '20001'
THEN DATEDIFF(minute,Lag([ArrivalTime],1) OVER(ORDER BY [CardHolderID], [ArrivalTime]), [ArrivalTime])
ELSE NULL
END
It seems to me that the LAG just needs to be partitioned by the date (& some other fields for good measure).
If the previous date is in another partition,
then the LAG will return NULL,
then the datediff will return NULL.
SELECT
CONCAT(holder.FirstName+' ', holder.LastName) AS employee,
CAST(repev.ArrivalTime AS DATE) AS [date],
CAST(SWITCHOFFSET(repev.ArrivalTime,'+02:00') AS TIME) as [time],
IIF(repev.EventType = 20001, 'ENTRY', 'EXIT') AS Action,
(CASE WHEN repev.EventType = 20001
THEN DATEDIFF(minute, LAG(repev.ArrivalTime)
OVER (PARTITION BY repev.EventClass, repev.CardholderID, CAST(repev.ArrivalTime AS DATE)
ORDER BY repev.ArrivalTime), repev.ArrivalTime)
END) AS OutTime
FROM [CCFTEvent].[dbo].[ReportEvent] AS repev
LEFT JOIN [CCFTCentral].[dbo].[Cardholder] AS holder ON holder.FTItemID = repev.CardholderID
WHERE repev.EventClass = 41
AND holder.FirstName LIKE 'Leeann%'
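The partitioning idea can be sketched outside SQL too. The Python below groups events per cardholder and calendar day and only looks back within that group, so the first event of each day has no predecessor. It is a simplified illustration: the tuple layout and names are assumptions, the EventType filter is left out, and it reports elapsed whole minutes, which can differ by one from DATEDIFF(minute, ...) because DATEDIFF counts minute boundaries crossed.

```python
from datetime import datetime
from itertools import groupby


def minutes_since_previous(events):
    """events: list of (cardholder_id, arrival_time) tuples.

    Returns (cardholder_id, arrival_time, minutes) tuples, where minutes is None
    for the first event of each cardholder on each day.
    """
    ordered = sorted(events)  # by cardholder, then arrival time
    result = []
    # One group per (cardholder, calendar day), like PARTITION BY in the query.
    for _, group in groupby(ordered, key=lambda e: (e[0], e[1].date())):
        previous = None
        for holder, arrival in group:
            minutes = None
            if previous is not None:
                minutes = int((arrival - previous).total_seconds() // 60)
            result.append((holder, arrival, minutes))
            previous = arrival
    return result


sample = [
    (1, datetime(2024, 1, 1, 8, 0)),
    (1, datetime(2024, 1, 1, 17, 10)),
    (1, datetime(2024, 1, 2, 8, 0)),
]
for row in minutes_since_previous(sample):
    print(row)
```

The second event reports 550 minutes, while the first event of each day reports None instead of a difference that spans midnight.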
I have a SQL Server view that shows an overview of account statements. First we calculate the latest closing balance date of each user account, so we know when the account last had a balance. This is the LATEST_CB_DATES part.
Then we calculate the next business days, meaning the next 2 days on which we expect to receive a balance in the database. This happens in NEXT_B_DAYS.
Finally we calculate whether the account is expecting a closing balance, has received one, or has received one too late. Note that we use the end of the reception window for this.
IF EXISTS (SELECT TABLE_NAME FROM INFORMATION_SCHEMA.VIEWS
WHERE TABLE_NAME = 'VIEW_AS_AS_ACCT_STAT')
DROP VIEW VIEW_AS_AS_ACCT_STAT
GO
CREATE VIEW VIEW_AS_AS_ACCT_STAT AS
WITH LATEST_CB_DATES AS (
SELECT * FROM (
SELECT row_number() over (partition by SD_ACCT.ID order by (AS_ACCT_STAT.CBAL_BAL_DATE) DESC) RN,SD_ACCT.ID, SD_ACCT.ACCT_NBR, AS_ACCT_STAT.CBAL_BAL_DATE AS BAL_DATE, SD_ACCT.CODE, SD_ACCT.CCY, SD_ACCT_GRP.ID AS GRP_ID, SD_ACCT_GRP.CODE AS ACCT_GRP_CODE, SD_ACCT.DATA_OWNER_ID, AS_ACCT_STAT.STATIC_DATA_BNK AS BANK_CODE, AS_ACCT_STAT.STATIC_DATA_HLD AS HOLDER_CODE
FROM SD_ACCT
LEFT JOIN AS_ACCT on SD_ACCT.ID = AS_ACCT.STATIC_DATA_ACCT_ID
LEFT JOIN AS_ACCT_STAT on AS_ACCT.ID = AS_ACCT_STAT.ACCT_ID
JOIN SD_ACCT_GRP_MEMBER ON SD_ACCT.ID = SD_ACCT_GRP_MEMBER.ACCT_ID
JOIN SD_ACCT_GRP on SD_ACCT_GRP_MEMBER.GRP_ID = SD_ACCT_GRP.ID
JOIN SD_ACCT_GRP_ROLE on SD_ACCT_GRP_ROLE.ID = SD_ACCT_GRP.ROLE_ID
WHERE SD_ACCT_GRP_ROLE.CODE = 'AccountStatementsToReceive' AND (AS_ACCT_STAT.VALID = 1 OR AS_ACCT_STAT.VALID IS NULL)
) LST_STMT
WHERE RN = 1
),
NEXT_B_DAYS AS (
SELECT VIEW_BUSINESS_DATES.CAL_ID, VIEW_BUSINESS_DATES.BUSINESS_DATE,
LEAD(VIEW_BUSINESS_DATES.BUSINESS_DATE, 1) OVER (PARTITION BY VIEW_BUSINESS_DATES.CAL_CODE ORDER BY VIEW_BUSINESS_DATES.BUSINESS_DATE) AS NEXT_BUSINESS_DATE,
LEAD(VIEW_BUSINESS_DATES.BUSINESS_DATE, 2) OVER (PARTITION BY VIEW_BUSINESS_DATES.CAL_CODE ORDER BY VIEW_BUSINESS_DATES.BUSINESS_DATE) AS SECOND_BUSINESS_DATE
FROM VIEW_BUSINESS_DATES
)
SELECT LATEST_CB_DATES.ID AS ACCT_ID,
LATEST_CB_DATES.CODE AS ACCT_CODE,
LATEST_CB_DATES.ACCT_NBR,
LATEST_CB_DATES.CCY AS ACCT_CCY,
LATEST_CB_DATES.BAL_DATE AS LATEST_CLOSING_BAL_DATE,
LATEST_CB_DATES.DATA_OWNER_ID,
LATEST_CB_DATES.BANK_CODE,
LATEST_CB_DATES.HOLDER_CODE,
LATEST_CB_DATES.ACCT_GRP_CODE,
CASE
WHEN LATEST_CB_DATES.BAL_DATE IS NULL THEN 'Expecting'
WHEN NEXT_B_DAYS.NEXT_BUSINESS_DATE IS NULL OR NEXT_B_DAYS.SECOND_BUSINESS_DATE IS NULL THEN 'Late'
WHEN AS_AS_RECEPTION_CONF.RECEPTION_WINDOW_END IS NOT NULL AND GETDATE() >= TODATETIMEOFFSET(CAST(NEXT_B_DAYS.SECOND_BUSINESS_DATE AS DATETIME) + CAST(CAST(AS_AS_RECEPTION_CONF.RECEPTION_WINDOW_END AS TIME) AS DATETIME), SEC_TIMEZONE.UTC_TIME_TOTAL_OFFSET) THEN 'Late'
WHEN AS_AS_RECEPTION_CONF.RECEPTION_WINDOW_END IS NULL AND GETDATE() >= TODATETIMEOFFSET(CAST(NEXT_B_DAYS.SECOND_BUSINESS_DATE AS DATETIME) + CAST(CAST(AS_AS_RECEPTION_CONF.RECEPTION_WINDOW_START AS TIME) AS DATETIME), SEC_TIMEZONE.UTC_TIME_TOTAL_OFFSET) AND CAST(AS_AS_RECEPTION_CONF.RECEPTION_WINDOW_END AS TIME) >= CAST(AS_AS_RECEPTION_CONF.RECEPTION_WINDOW_START AS TIME) THEN 'Expecting'
WHEN AS_AS_RECEPTION_CONF.RECEPTION_WINDOW_END IS NULL AND GETDATE() >= TODATETIMEOFFSET(CAST(NEXT_B_DAYS.NEXT_BUSINESS_DATE AS DATETIME) + CAST(CAST(AS_AS_RECEPTION_CONF.RECEPTION_WINDOW_START AS TIME) AS DATETIME), SEC_TIMEZONE.UTC_TIME_TOTAL_OFFSET) AND CAST(AS_AS_RECEPTION_CONF.RECEPTION_WINDOW_END AS TIME) < CAST(AS_AS_RECEPTION_CONF.RECEPTION_WINDOW_START AS TIME) THEN 'Expecting' -- overnight
WHEN AS_AS_RECEPTION_CONF.RECEPTION_WINDOW_END IS NULL AND CAST (GETDATE() AS DATE) > NEXT_B_DAYS.SECOND_BUSINESS_DATE THEN 'Expecting'
ELSE 'Received'
END AS STAT,
CASE
WHEN LATEST_CB_DATES.BAL_DATE IS NULL THEN NULL
WHEN NEXT_B_DAYS.NEXT_BUSINESS_DATE IS NULL OR NEXT_B_DAYS.SECOND_BUSINESS_DATE IS NULL THEN NULL
WHEN AS_AS_RECEPTION_CONF.RECEPTION_WINDOW_END IS NOT NULL THEN CAST(NEXT_B_DAYS.SECOND_BUSINESS_DATE AS DATETIME) + CAST(CAST(AS_AS_RECEPTION_CONF.RECEPTION_WINDOW_END AS TIME) AS DATETIME)
ELSE NULL
END AS DEADLINE,
SEC_TIMEZONE.UTC_TIME_TOTAL_OFFSET AS TIME_ZONE
FROM AS_AS_RECEPTION_CONF
JOIN LATEST_CB_DATES ON AS_AS_RECEPTION_CONF.ACCT_GRP_ID = LATEST_CB_DATES.GRP_ID
JOIN SEC_TIMEZONE ON SEC_TIMEZONE.ID = AS_AS_RECEPTION_CONF.TIME_ZONE_ID
LEFT JOIN NEXT_B_DAYS ON AS_AS_RECEPTION_CONF.CALENDAR_ID = NEXT_B_DAYS.CAL_ID AND LATEST_CB_DATES.BAL_DATE = NEXT_B_DAYS.BUSINESS_DATE
GO
SELECT * FROM VIEW_AS_AS_ACCT_STAT
What is the issue? Nothing, this works fine, but it's slow. We created a graphical report to display the data for our customers, but the query takes 1 minute 30 seconds to load when you have 5000 accounts, which is too slow.
I guess the reason is the last join, but I didn't manage to refactor it well:
LEFT JOIN NEXT_B_DAYS ON AS_AS_RECEPTION_CONF.CALENDAR_ID =
NEXT_B_DAYS.CAL_ID AND LATEST_CB_DATES.BAL_DATE =
NEXT_B_DAYS.BUSINESS_DATE
The execution plan of my SQL can be found here
How can I refactor this so that my view still works but performs much better?
I am trying to run through a table and get the sum of the mins column; however, it always tells me that mins is not a valid column name.
SELECT
acs.cid,
DATEDIFF (n, acs.StartTime, acs.EndTime ) as prepromomins,
CASE
WHEN (p.multiplier is null) THEN DATEDIFF (n , acs.StartTime, acs.EndTime)
ELSE (DATEDIFF ( n , acs.StartTime , acs.EndTime ) * p.multiplier)
END AS mins
FROM
activecards as acs
LEFT JOIN Promotions as p
ON acs.StartTime > p.StartTime and acs.EndTime < p.EndTime
Try this one:
SELECT SUM(DATEDIFF(n, acs.StartTime, acs.EndTime) * ISNULL(p.multiplier,1)) AS TotalMins
FROM activecards as acs
LEFT JOIN Promotions as p
ON acs.StartTime > p.StartTime and acs.EndTime < p.EndTime
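The query works because it aggregates the underlying expression instead of the mins alias, and ISNULL(p.multiplier, 1) replaces the CASE: a missing multiplier counts as 1. The same calculation in a short Python sketch, with an assumed row layout of (minutes, multiplier) and made-up sample values:

```python
def total_minutes(rows):
    """rows: iterable of (minutes, multiplier), where multiplier may be None.

    Equivalent of SUM(minutes * ISNULL(multiplier, 1)).
    """
    return sum(minutes * (1 if multiplier is None else multiplier)
               for minutes, multiplier in rows)


print(total_minutes([(30, None), (20, 2), (10, 1.5)]))
```

The row without a promotion contributes its plain 30 minutes, and the other two are scaled by their multipliers.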