Why are the resulting date records from this DATEADD() expansion (union) query out of order after inserting into a table?

The below query should (I believe) create a list of dates + hours in ascending order between the specified start and end datetimes, and then I want to insert them into a temporary table. I expect the table to contain all the new records in ascending order, as the select would return by itself. In practice, the table records begin at the start date and increment as expected up to March 8th, 2023 hour 19 and then the next record is August 14th, 2023 hour 12. The missing dates appear at the end of the table. Even if I try to subsequently sort the dates with another query, they are still not in the correct order. Why is that?
DECLARE @StartDateTime DATETIME = '2023-02-17 00:00:00';
DECLARE @EndDateTime DATETIME = '2023-12-31 00:00:00';
CREATE TABLE #NewDates (ExpandedDateTime DATETIME);
WITH ExpDates AS
(
    SELECT @StartDateTime AS ExpandedDateTime
    UNION ALL
    SELECT DATEADD(HOUR, 1, ExpandedDateTime)
    FROM ExpDates
    WHERE DATEADD(HOUR, 1, ExpandedDateTime) <= @EndDateTime
)
INSERT INTO #NewDates
SELECT ExpandedDateTime
FROM ExpDates
OPTION (MAXRECURSION 0);
SELECT * FROM #NewDates;
Results from temp table query
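For what it's worth, SQL Server does not guarantee any row order for a SELECT that has no ORDER BY, and a table has no inherent order regardless of the order in which rows were inserted. A minimal sketch, assuming the #NewDates table from the question has already been populated, is to sort in the query that reads the table:

```sql
-- Assumes #NewDates was created and filled by the INSERT in the question.
-- The ORDER BY on the outermost SELECT is what guarantees the order of the output.
SELECT ExpandedDateTime
FROM #NewDates
ORDER BY ExpandedDateTime;
```

The order in which rows come back from SELECT * may look sorted for small inputs, but that is a side effect of the execution plan rather than something to rely on.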

Related

How to calculate average in date column

I don't know how to calculate the average age of a column of type date in SQL Server.
You can use datediff() and aggregation. Assuming that your date column is called dt in table mytable, and that you want the average age in years over the whole table, then you would do:
select avg(datediff(year, dt, getdate())) avg_age
from mytable
You can change the first argument to datediff() (which is called the date part), to any other supported value depending on what you actually mean by age; for example datediff(day, dt, getdate()) gives you the difference in days.
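One thing to keep in mind with this approach: in SQL Server, AVG() over an integer expression returns an integer, so the average above is truncated to whole years. A rough sketch that keeps the fraction, using a couple of made-up sample dates in place of mytable and treating a year as 365.25 days (an approximation, not an exact age), might look like this:

```sql
-- The VALUES list is hypothetical sample data standing in for mytable(dt).
-- Dividing the day count by 365.25 yields a decimal, so AVG keeps the fraction.
SELECT AVG(DATEDIFF(day, v.dt, GETDATE()) / 365.25) AS avg_age_years
FROM (VALUES (CAST('20000115' AS date)),
             (CAST('20100630' AS date))) AS v(dt);
```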
First, let's calculate the age in years correctly. See the comments in the code, with the understanding that DATEDIFF does NOT calculate age. It only counts the number of date part boundaries crossed between the two values.
--===== Local obviously named variables defined and assigned
DECLARE @StartDT DATETIME = '2019-12-31 23:59:59.997'
       ,@EndDT   DATETIME = '2020-01-01 00:00:00.000'
;
--===== Show the difference in milliseconds between the two date/times
     -- Because of the rounding that DATETIME does on 3.3ms resolution, this will return 4ms,
     -- which certainly does NOT depict an age of 1 year.
 SELECT DATEDIFF(ms,@StartDT,@EndDT)
;
--===== This solution will mistakenly return an age of 1 year for the dates given,
     -- which are only about 4ms apart according to the SELECT above.
 SELECT IncorrectAgeInYears = DATEDIFF(YEAR, @StartDT, @EndDT)
;
--===== This calculates the age in years correctly in T-SQL.
     -- If the anniversary date has not yet occurred, 1 year is subtracted.
 SELECT CorrectAgeInYears = DATEDIFF(yy, @StartDT, @EndDT)
        - IIF(DATEADD(yy, DATEDIFF(yy, @StartDT, @EndDT), @StartDT) > @EndDT, 1, 0)
;
Now, let's turn that correct calculation into an inline Table Valued Function that returns a single value, which gives us a really high speed "Inline Scalar Function".
CREATE FUNCTION [dbo].[AgeInYears]
(
        @StartDT DATETIME, --Date of birth or date of manufacture or start date.
        @EndDT   DATETIME  --Usually, GETDATE() or CURRENT_TIMESTAMP but
                           --can be any date source like a column that has an end date.
)
RETURNS TABLE WITH SCHEMABINDING AS
 RETURN
 SELECT AgeInYears = DATEDIFF(yy, @StartDT, @EndDT)
        - IIF(DATEADD(yy, DATEDIFF(yy, @StartDT, @EndDT), @StartDT) > @EndDT, 1, 0)
;
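As a quick sanity check, the function can be called directly with literal values. This is only a usage sketch and assumes dbo.AgeInYears has been created exactly as defined above; the literals use the ISO 8601 format so that they do not depend on the session's date format settings.

```sql
-- Assumes dbo.AgeInYears from the definition above already exists.
-- The same two values as the earlier test: the first anniversary has not been reached.
SELECT age.AgeInYears      -- returns 0
FROM dbo.AgeInYears('2019-12-31T23:59:59.997', '2020-01-01T00:00:00.000') AS age;

-- Exactly one year apart: the anniversary has been reached.
SELECT age.AgeInYears      -- returns 1
FROM dbo.AgeInYears('2019-01-01T00:00:00.000', '2020-01-01T00:00:00.000') AS age;
```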
Then, to Dale's point, let's create a test table and populate it. This one is a little overkill for this problem but it's also useful for a lot of different examples. Don't let the million rows scare you... this runs in just over 2 seconds on my laptop including the Clustered Index creation.
--===== Create and populate a large test table on-the-fly.
-- "SomeInt" has a range of 1 to 50,000 numbers
-- "SomeLetters2" has a range of "AA" to "ZZ"
-- "SomeDecimal" has a range of 10.00 to 100.00 numbers
-- "SomeDate" has a range of >=01/01/2000 & <01/01/2020 whole dates
-- "SomeDateTime" has a range of >=01/01/2000 & <01/01/2020 Date/Times
-- "SomeRand" contains the value of RAND just to show it can be done without a loop.
-- "SomeHex9" contains 9 hex digits from NEWID()
-- "SomeFluff" is a fixed width CHAR column just to give the table a little bulk.
SELECT TOP 1000000
SomeInt = ABS(CHECKSUM(NEWID())%50000) + 1
,SomeLetters2 = CHAR(ABS(CHECKSUM(NEWID())%26) + 65)
+ CHAR(ABS(CHECKSUM(NEWID())%26) + 65)
,SomeDecimal = CAST(RAND(CHECKSUM(NEWID())) * 90 + 10 AS DECIMAL(9,2))
,SomeDate = DATEADD(dd, ABS(CHECKSUM(NEWID())%DATEDIFF(dd,'2000','2020')), '2000')
,SomeDateTime = DATEADD(dd, DATEDIFF(dd,0,'2000'), RAND(CHECKSUM(NEWID())) * DATEDIFF(dd,'2000','2020'))
,SomeRand = RAND(CHECKSUM(NEWID())) --CHECKSUM produces an INT and is MUCH faster than conversion to VARBINARY.
,SomeHex9 = RIGHT(NEWID(),9)
,SomeFluff = CONVERT(CHAR(170),'170 CHARACTERS RESERVED') --Just to add a little bulk to the table.
INTO dbo.JBMTest
FROM sys.all_columns ac1 --Cross Join forms up to 16 million rows
CROSS JOIN sys.all_columns ac2 --Pseudo Cursor
;
GO
--===== Add a non-unique Clustered Index to SomeDateTime for this demo.
CREATE CLUSTERED INDEX IXC_Test ON dbo.JBMTest (SomeDateTime ASC)
;
Now, let's find the average age of those million rows, as represented by the SomeDateTime column.
SELECT AvgAgeInYears = AVG(age.AgeInYears )
,RowsCounted = COUNT(*)
FROM dbo.JBMTest tst
CROSS APPLY dbo.AgeInYears(SomeDateTime,GETDATE()) age
;
Results:

How to single out values in a database based on date being month-end?

I have a problem where I need to query a database which includes multiple lines of trade activity for the past 90 days. Currently the query is built to determine the average amount over the 90 day period - so each day has a single exposure value and the query helps us determine the average exposure over 90 days by just summing the daily values and then dividing by 90. And it does this as the date rolls forward, so the value is updated each day the query is run.
The above is simple enough to execute, but now I need to determine the average month-end amounts for the past 3 months. I've figured out how to pull just the month-end dates, but I'm not sure how to join that with the current query. Additionally, it needs to be able to update itself as the date rolls forward.
/* Test query below */
DECLARE @Date DATETIME = Getdate()
DECLARE @daycount INT = 90
DECLARE @startDate DATETIME = Dateadd(dd, @daycount*-1, @Date)
SELECT sub.Instrument,
       ( Sum(sub.GrossExposure) / @daycount ) AS AvgGrossExposure
FROM   (SELECT DateField,
               Instrument,
               GrossExposure
        FROM   table
        WHERE  DateField <= @Date
               AND Datefield >= @startDate
       ) sub
GROUP  BY Instrument
To calculate month-ends in the past 90 days, I've fiddled around with this, but it also includes today's date and I do not need that value in this case.
/* Test query for month-end dates, past 90 days */
DECLARE @Date DATETIME = GetDate()
DECLARE @daycount INT = 90
DECLARE @startDate DATETIME = Dateadd(dd, @daycount*-1, @Date)
SELECT max(datefield) AS month_ends
FROM table
WHERE datefield <= @Date
  AND datefield >= @startDate
GROUP BY month(datefield),
         year(datefield)
ORDER BY month_ends
Give this a try - you can use a common table expression to append the month end date of each DateField value using EOMONTH(DateField), and then use that in your GROUP BY, with the Average of all GrossExposure values that have that same EOMONTH value for each instrument.
WITH CTE AS (
SELECT EOMONTH(DateField) AS EndOfMonthDate
,DateField
,Instrument
,GrossExposure
FROM TABLE
WHERE DateField BETWEEN GETDATE()-90 AND GETDATE()
)
SELECT CTE.Instrument,
CTE.EndOfMonthDate,
AVG(CTE.GrossExposure) AS AvgGrossExposure
FROM CTE
GROUP BY CTE.Instrument, CTE.EndOfMonthDate
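Note that the query above averages every daily value within each month. If what is wanted is the average of only the month-end values, one possible sketch follows. It rests on several assumptions that are not in the question: dbo.Exposure is a hypothetical stand-in for the real table, there is one row per instrument per day, "month-end" means the last date with data in a month, and "past 3 months" means the three most recent completed months (which also leaves out today's date).

```sql
-- dbo.Exposure is a hypothetical name standing in for the question's table.
DECLARE @Date date = GETDATE();
-- First day of the current month; everything from here on is excluded.
DECLARE @FirstOfThisMonth date = DATEADD(DAY, 1, EOMONTH(@Date, -1));

WITH MonthEnds AS (
    -- Last date that actually has data in each of the three completed months.
    SELECT MAX(DateField) AS MonthEndDate
    FROM dbo.Exposure
    WHERE DateField >= DATEADD(MONTH, -3, @FirstOfThisMonth)
      AND DateField <  @FirstOfThisMonth
    GROUP BY YEAR(DateField), MONTH(DateField)
)
SELECT e.Instrument,
       AVG(e.GrossExposure) AS AvgMonthEndExposure
FROM dbo.Exposure AS e
JOIN MonthEnds AS m
  ON e.DateField = m.MonthEndDate
GROUP BY e.Instrument;
```

Because the boundaries are derived from GETDATE(), the window rolls forward on its own each time the query runs.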

How to create a temp table with values from another table aggregated weekly?

It is a bit difficult to explain but basically I need to create a temporary table (date datetime, #customers int) where #customers is the number of weekly customers pulled from another table. Here's my code.
declare @date datetime
declare @temptable table (date datetime not null,#customers int)
set @date='2018-02-13'
while @date<getdate()
begin
    insert into @temptable values
    (@date,
    (select count(*) from in_ft_conversion
     where u4='cfa' and sales_date between @date and @date-7))
    set @date=@date+7
end
The result is a table with all the correct date entries but 0 in the customer column... Does anybody know what I'm doing wrong? Thanks!
Your date range is wrong. Swap the date values in the BETWEEN so that you have BETWEEN <earlier date> AND <later date>:
where u4='cfa' and sales_date between @date-7 and @date))
Why would you use a while loop for this? I think you want something like this:
insert into @temptable (date, num_customers)
    select dateadd(day, v.weekno * 7, '2018-02-08'),
           count(*)
    from in_ft_conversion cross apply
         (values (datediff(day, '2018-02-08', sales_date) / 7)
         ) v(weekno)
    where u4 = 'cfa' and sales_date >= '2018-02-08'
    group by v.weekno;
No loop is necessary.
Your problem is specifically the between comparison:
sales_date between @date and @date-7
The dates are backwards -- the lower bound needs to go first.
But, I also doubt that you want to count weeks with 8 days and have one day overlap on each week. I think the above logic does what you want, but you can adjust the date arithmetic to get the exact dates you want.
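To make the BETWEEN point concrete, here is a small self-contained check. The two variable values are made up for illustration. BETWEEN low AND high is shorthand for (x >= low AND x <= high), so reversed bounds can never match, and a half-open range avoids the shared boundary day mentioned above.

```sql
-- Hypothetical sample values: a week ending on 2018-02-20 and a sale inside it.
DECLARE @date datetime = '20180220';
DECLARE @sale datetime = '20180215';

SELECT
    -- Bounds reversed, as in the question: never true.
    CASE WHEN @sale BETWEEN @date AND @date - 7
         THEN 'match' ELSE 'no match' END AS reversed_bounds,   -- 'no match'
    -- Lower bound first: true, but both end dates are included (8 days).
    CASE WHEN @sale BETWEEN @date - 7 AND @date
         THEN 'match' ELSE 'no match' END AS ordered_bounds,    -- 'match'
    -- Half-open range: exactly 7 days, the end date belongs to the next week.
    CASE WHEN @sale >= DATEADD(DAY, -7, @date) AND @sale < @date
         THEN 'match' ELSE 'no match' END AS half_open;         -- 'match'
```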

Moving T-SQL datetime column for X days overlapping UNIQUE constraint

The problem is how to use the DATEADD function on a column with a unique constraint, given that the new values will overlap existing values and so appear to violate the constraint, because we cannot have two rows with the same date.
For example, I have a table with a column [SomeDate] of type DateTime that is constrained to be unique. I have dates from 2017-01-01 to 2018-01-01 and want to update the records by adding 7 days to each of them.
If you update all rows, there should be no problem with a unique constraint.
Here is a quick example:
CREATE TABLE T
(
SomeDate date NOT NULL,
CONSTRAINT uc UNIQUE (SomeDate)
)
;WITH CTE AS
(
SELECT CAST(GETDATE() As Date) As TheDate
UNION ALL
SELECT CAST(DateADD(DAY, 1, TheDate) As Date)
FROM CTE
WHERE TheDate < DATEADD(DAY, 10, GETDATE())
)
INSERT INTO T(SomeDate)
SELECT TheDate
FROM CTE
UPDATE T
SET SomeDate = DATEADD(DAY, 3, SomeDate)
You can see it in action on rextester
One possible way is to first move the dates ahead, out of the current min-max range, and then bring them back, taking into account how many days we want to add. Here is a ready, working solution:
--Number of days we want to add to existing dates
DECLARE @daysToMoveAhead int = 7;
DECLARE @minDate datetime = (SELECT MIN([SomeDate]) from dbo.MyTable)
DECLARE @maxDate datetime = (SELECT MAX([SomeDate]) from dbo.MyTable)
DECLARE @diff int = DATEDIFF(DAY,@minDate,@maxDate)
--temporarily move the dates out of the existing min-max range
update dbo.MyTable set [SomeDate] = DATEADD(DAY, @diff,[SomeDate]);
--bring dates back and add as many days as we wanted
update dbo.MyTable set [SomeDate] = DATEADD(DAY, @daysToMoveAhead - @diff,[SomeDate]);

Grouping by contiguous dates, ignoring weekends in SQL

I'm attempting to group contiguous date ranges to show the minimum and maximum date for each range. So far I've used a solution similar to this one: http://www.sqlservercentral.com/articles/T-SQL/71550/ however I'm on SQL 2000 so I had to make some changes. This is my procedure so far:
create table #tmp
(
date smalldatetime,
rownum int identity
)
insert into #tmp
select distinct date from testDates order by date
select
min(date) as dateRangeStart,
max(date) as dateRangeEnd,
count(*) as dates,
dateadd(dd,-1*rownum, date) as GroupID
from #tmp
group by dateadd(dd,-1*rownum, date)
drop table #tmp
It works exactly how I want except for one issue: weekends. My data sets have no records for weekend dates, which means any group found is at most 5 days. For instance, in the results below, I would like the last 3 groups to show up as a single record, with a dateRangeStart of 10/6 and a dateRangeEnd of 10/20:
Is there some way I can set this up to ignore a break in the date range if that break is just a weekend?
Thanks for the help.
EDITED
I didn't like my previous idea very much. Here's a better one, I think:
1. Based on the first and the last dates from the set of those to be grouped, prepare the list of all the intermediate weekend dates.
2. Insert the working dates together with the weekend dates, ordered, so that they are all assigned rownum values according to their normal order.
3. Use your method of finding contiguous ranges with the following modifications:
   a) when calculating dateRangeStart, if it's a weekend date, pick the nearest following weekday;
   b) accordingly for dateRangeEnd, if it's a weekend date, pick the nearest preceding weekday;
   c) when counting dates for the group, count only weekdays.
4. Select from the resulting set only those rows where dates > 0, thus eliminating the groups formed only of weekends.
And here's an implementation of the method, assuming that the week starts on Sunday (DATEPART(dw, ...) returns 1 for Sunday) and that the weekend days are Saturday and Sunday:
DECLARE @tmp TABLE (date smalldatetime, rownum int IDENTITY);
DECLARE @weekends TABLE (date smalldatetime);
DECLARE @minDate smalldatetime, @maxDate smalldatetime, @date smalldatetime;
/* #1 */
SELECT @minDate = MIN(date), @maxDate = MAX(date)
FROM testDates;
SET @date = @minDate - DATEPART(dw, @minDate) + 7;
WHILE @date < @maxDate BEGIN
    INSERT INTO @weekends
    SELECT @date UNION ALL
    SELECT @date + 1;
    SET @date = @date + 7;
END;
/* #2 */
INSERT INTO @tmp
SELECT date FROM testDates
UNION
SELECT date FROM @weekends
ORDER BY date;
/* #3 & #4 */
SELECT *
FROM (
    SELECT
        MIN(date + CASE DATEPART(dw, date) WHEN 1 THEN 1 WHEN 7 THEN 2 ELSE 0 END)
            AS dateRangeStart,
        MAX(date - CASE DATEPART(dw, date) WHEN 1 THEN 2 WHEN 7 THEN 1 ELSE 0 END)
            AS dateRangeEnd,
        COUNT(CASE WHEN DATEPART(dw, date) NOT IN (1, 7) THEN date END) AS dates,
        DATEADD(d, -rownum, date) AS GroupID
    FROM @tmp
GROUP BY DATEADD(d, -rownum, date)
) s
WHERE dates > 0;
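The implementation above depends on the session treating Sunday as day 1. If that cannot be assumed, a commonly used workaround is to add @@DATEFIRST before taking the remainder, which gives the same index for a given weekday under any DATEFIRST setting. The sketch below is not part of the original answer; the three sample dates are chosen only for illustration, and the syntax is kept simple enough that it should also run on older versions.

```sql
-- (DATEPART(dw, d) + @@DATEFIRST) % 7 does not depend on SET DATEFIRST:
-- 0 = Saturday, 1 = Sunday, 2 = Monday, ... 6 = Friday.
SELECT d.dt,
       (DATEPART(dw, d.dt) + @@DATEFIRST) % 7 AS day_index,
       CASE WHEN (DATEPART(dw, d.dt) + @@DATEFIRST) % 7 IN (0, 1)
            THEN 'weekend' ELSE 'weekday' END AS day_type
FROM (SELECT CAST('20231007' AS smalldatetime) AS dt        -- a Saturday
      UNION ALL SELECT CAST('20231008' AS smalldatetime)    -- a Sunday
      UNION ALL SELECT CAST('20231009' AS smalldatetime)    -- a Monday
     ) AS d;
```

With this expression, the CASE branches in the main query would test for 1 (Sunday) and 0 (Saturday) instead of 1 and 7.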