Subtracting date using DATEDIFF function and getting NULL as results - sql

So I'm tasked with finding the delay in weeks for each order. I've used the DATEDIFF function and I'd like to believe I'm on the right track, but when I use it I get NULL as the result. Both columns are of type date.
SELECT DISTINCT Sales.Orders.custid, Sales.Customers.companyname,
CASE
WHEN DATEDIFF(DAY, Sales.Orders.shippeddate, Sales.Orders.orderdate) = 7 AND DATEDIFF(DAY, Sales.Orders.shippeddate, Sales.Orders.orderdate) < 14 THEN '1 Week'
WHEN DATEDIFF(DAY, Sales.Orders.shippeddate, Sales.Orders.orderdate) = 14 AND DATEDIFF(DAY, Sales.Orders.shippeddate, Sales.Orders.orderdate) < 21 THEN '2 Weeks'
WHEN DATEDIFF(DAY, Sales.Orders.shippeddate, Sales.Orders.orderdate) = 21 AND DATEDIFF(DAY, Sales.Orders.shippeddate, Sales.Orders.orderdate) < 28 THEN '3 Weeks'
WHEN DATEDIFF(DAY, Sales.Orders.shippeddate, Sales.Orders.orderdate) = 28 AND DATEDIFF(DAY, Sales.Orders.shippeddate, Sales.Orders.orderdate) < 35 THEN '4 Weeks'
WHEN DATEDIFF(DAY, Sales.Orders.shippeddate, Sales.Orders.orderdate) = 35 AND DATEDIFF(DAY, Sales.Orders.shippeddate, Sales.Orders.orderdate) < 42 THEN '5 Weeks'
WHEN DATEDIFF(DAY, Sales.Orders.shippeddate, Sales.Orders.orderdate) = 42 AND DATEDIFF(DAY, Sales.Orders.shippeddate, Sales.Orders.orderdate) < 49 THEN '6 Weeks'
WHEN DATEDIFF(DAY, Sales.Orders.shippeddate, Sales.Orders.orderdate) > 49 THEN '7+ Weeks'
ELSE 'Unknown'
END AS Order_Delay
FROM Sales.Orders, Sales.Customers
ORDER BY
Order_Delay ASC;
I'm using MS SQL Server Management Studio 2016.

Try rewriting your query like this:
SELECT DISTINCT Sales.Orders.custid, Sales.Customers.companyname,
CASE
WHEN DATEDIFF(DAY, Sales.Orders.shippeddate, Sales.Orders.orderdate) >= 7 AND DATEDIFF(DAY, Sales.Orders.shippeddate, Sales.Orders.orderdate) < 14 THEN '1 Week'
WHEN DATEDIFF(DAY, Sales.Orders.shippeddate, Sales.Orders.orderdate) < 21 THEN '2 Weeks'
WHEN DATEDIFF(DAY, Sales.Orders.shippeddate, Sales.Orders.orderdate) < 28 THEN '3 Weeks'
WHEN DATEDIFF(DAY, Sales.Orders.shippeddate, Sales.Orders.orderdate) < 35 THEN '4 Weeks'
WHEN DATEDIFF(DAY, Sales.Orders.shippeddate, Sales.Orders.orderdate) < 42 THEN '5 Weeks'
WHEN DATEDIFF(DAY, Sales.Orders.shippeddate, Sales.Orders.orderdate) < 49 THEN '6 Weeks'
WHEN DATEDIFF(DAY, Sales.Orders.shippeddate, Sales.Orders.orderdate) >= 49 THEN '7+ Weeks'
ELSE 'Unknown'
END AS Order_Delay
FROM Sales.Orders, Sales.Customers
ORDER BY
Order_Delay ASC;
I think you want to check whether the difference falls in a particular range (from 7 up to 14, etc.).
So I corrected the first condition:
WHEN DATEDIFF(DAY, Sales.Orders.shippeddate, Sales.Orders.orderdate) >= 7 AND DATEDIFF(DAY, Sales.Orders.shippeddate, Sales.Orders.orderdate) < 14 THEN '1 Week'
You cannot use BETWEEN here, because its range includes both endpoints.
For the other cases you don't need to repeat the lower bound, because a WHEN is only evaluated when every earlier WHEN has failed. Note that this only works cleanly if differences below 7 are handled first: as written, a difference below 7 (including a negative one) fails the first condition and is then caught by < 21 and labelled '2 Weeks'.
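To illustrate the ordered evaluation outside SQL, here is a minimal Python sketch. The function name delay_label is made up for this example, and treating anything below 7 days as 'Unknown' is an assumption about what the question wants. Each branch only needs an upper bound because the earlier branches have already ruled out the smaller values.

```python
def delay_label(days):
    """Map a delay in days to a week bucket; branches are checked in order."""
    if days is None:      # no shipped date, so there is nothing to classify
        return "Unknown"
    if days < 7:          # handle the low end (and negatives) before anything else
        return "Unknown"
    if days < 14:         # reached only when days >= 7
        return "1 Week"
    if days < 21:         # reached only when days >= 14
        return "2 Weeks"
    if days < 28:
        return "3 Weeks"
    if days < 35:
        return "4 Weeks"
    if days < 42:
        return "5 Weeks"
    if days < 49:
        return "6 Weeks"
    return "7+ Weeks"     # everything from 49 days upward


for sample in (3, 7, 13, 14, 49):
    print(sample, delay_label(sample))
```

The same ordering applies to a CASE expression: put the catch-all for small or missing values first, then list the upper bounds in ascending order.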

FROM Sales.Orders, Sales.Customers is an old-fashioned way of forming a cross join. This is probably accidental, but the impact can be awful, both in terms of performance and because the results can be plain wrong, which they are in your example. For this reason I always recommend using explicit join syntax such as inner join and no longer using commas in the from clause.
You simply have to properly join the 2 tables, otherwise every order is applied to every customer and the results would be quite wrong. I have guessed that join, but it should look something like the one seen below:
SELECT /* DISTINCT ?? */
o.custid
, c.companyname
, CASE
WHEN ca.Order_Delay >= 7 THEN '7+ Weeks'
WHEN ca.Order_Delay >= 1 AND ca.Order_Delay < 7 THEN CAST(ca.Order_Delay AS varchar) + ' Weeks'
ELSE 'Unknown'
END AS order_delay
FROM Sales.Orders AS o
INNER JOIN Sales.Customers AS c ON o.custid = c.id
CROSS APPLY (
SELECT
FLOOR(DATEDIFF(DAY, o.shippeddate, o.orderdate) / 7)
) ca (order_delay)
ORDER BY
order_delay ASC
;
In SQL Server it is possible to use cross apply to perform a calculation and give that calculation an alias that you can then use in the select clause. This can make your code somewhat easier to read, but it is optional.
Above I have suggested a way to use floor() which you should read about here:
https://learn.microsoft.com/en-us/sql/t-sql/functions/floor-transact-sql?view=sql-server-2017
nb: If you want to show data for unshipped orders then you may need to change to an outer apply. If an order is unshipped, the datediff() function returns NULL and your case expression needs to cater for NULLs explicitly:
SELECT /* DISTINCT ?? */
o.custid
, c.companyname
, CASE
WHEN ca.Order_Delay >= 7 THEN '7+ Weeks'
WHEN ca.Order_Delay >= 1 AND ca.Order_Delay < 7 THEN CAST(ca.Order_Delay AS varchar) + ' Weeks'
WHEN ca.Order_Delay IS NULL THEN 'Unshipped'
ELSE 'Unknown'
END AS order_delay
FROM Sales.Orders AS o
INNER JOIN Sales.Customers AS c ON o.custid = c.id
OUTER APPLY (
SELECT
FLOOR(DATEDIFF(DAY, o.shippeddate, o.orderdate) / 7)
) ca (order_delay)
ORDER BY
order_delay ASC
;
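To see how much the comma syntax inflates the result compared with a proper join, here is a small self-contained sketch using Python's built-in sqlite3 module. The table layout and sample rows are invented for the demonstration, and the join column c.id is the same guess as in the query above.

```python
import sqlite3

# In-memory database with made-up sample data: 3 customers, 4 orders.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (id INTEGER PRIMARY KEY, companyname TEXT);
    CREATE TABLE Orders (orderid INTEGER PRIMARY KEY, custid INTEGER);
    INSERT INTO Customers VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
    INSERT INTO Orders VALUES (10, 1), (11, 1), (12, 2), (13, 3);
""")

# Comma syntax without a join condition: every order is paired with every customer.
comma_rows = conn.execute(
    "SELECT COUNT(*) FROM Orders, Customers"
).fetchone()[0]

# Explicit inner join: each order is paired only with its own customer.
joined_rows = conn.execute(
    "SELECT COUNT(*) FROM Orders AS o INNER JOIN Customers AS c ON o.custid = c.id"
).fetchone()[0]

print("comma join rows:", comma_rows)
print("inner join rows:", joined_rows)
conn.close()
```

With 4 orders and 3 customers the comma form yields 4 x 3 rows, while the inner join yields one row per order. DISTINCT can hide the duplicates but not the wrong customer/order pairings.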

https://learn.microsoft.com/en-us/sql/t-sql/functions/datediff-transact-sql?view=sql-server-2017
DATEDIFF ( datepart , startdate , enddate )
My guess is that in your system shippeddate is usually later than or equal to orderdate, so instead of
FLOOR(DATEDIFF(DAY, o.shippeddate, o.orderdate) / 7)
you might want
FLOOR(DATEDIFF(DAY, o.orderdate, o.shippeddate) / 7)
To check this assumption, you might want to add ca.Order_Delay (as per the code suggested by @Used_by_already) to the list of selected columns and see what values are there. My bet is they are all negative.
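The effect of the argument order, and of a missing shipped date, can be sketched in a few lines of Python. This only mimics the DATEDIFF ( datepart , startdate , enddate ) convention for the DAY part on plain dates; the helper name datediff_days and the sample dates are made up for illustration.

```python
from datetime import date


def datediff_days(startdate, enddate):
    """Days from startdate to enddate, or None if either date is missing (like NULL)."""
    if startdate is None or enddate is None:
        return None
    return (enddate - startdate).days


order_date = date(2016, 5, 1)
shipped_date = date(2016, 5, 17)

forward = datediff_days(order_date, shipped_date)    # start = order, end = shipped
backward = datediff_days(shipped_date, order_date)   # arguments swapped
unshipped = datediff_days(order_date, None)          # order not shipped yet

print(forward, backward, unshipped)
print("whole weeks:", forward // 7)
```

With the arguments swapped the difference is negative, so none of the ">= 7" style conditions can match, and a missing shipped date propagates as a missing result.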

You are looking for exactly 7, 14, 21 etc. You need >= 7 instead (and repeat for the rest...).

Related

Boolean expression in CASE in WHERE clause does not work

I am having problems with Firebird SQL statement in version Firebird 2.5.
Based on today's date, I have to select either this month's data or the previous month's data.
SELECT * FROM FA_DOBAVNICA
WHERE
1=1
AND CASE WHEN
extract(day from cast('Now' as date)) < 9
THEN
DATUM_NAROCILA BETWEEN 'start of previous month' AND 'end of previous month'
ELSE
DATUM_NAROCILA BETWEEN 'start of this month' AND 'end of this month'
END
I am getting error 104, Token unknown, for BETWEEN. I have no idea what I am doing wrong.
If I understood your problem correctly you can rephrase it with an or clause like this and it should do the job:
SELECT * FROM FA_DOBAVNICA
WHERE
1=1
AND
((extract(day from cast('Now' as date)) < 9 AND DATUM_NAROCILA BETWEEN 'start of previous month' AND 'end of previous month')
or
(extract(day from cast('Now' as date)) > 8 AND DATUM_NAROCILA BETWEEN 'start of this month' AND 'end of this month'))
Personally, I would avoid branching between condition variants in the WHERE clause and would make it two queries instead. I suspect a conditional WHERE might hinder Firebird's query optimizer, and two distinct queries combined together might end up with a better query plan using indexes.
That matters especially if you have more conditions than shown here, which the 1=1 placeholder implies. You have to check and compare the actual plans using your real data and the real added conditions.
SELECT * FROM FA_DOBAVNICA
WHERE extract(day from cast('Now' as date)) < 9
AND DATUM_NAROCILA BETWEEN 'start of previous month' AND 'end of previous month'
UNION ALL
SELECT * FROM FA_DOBAVNICA
WHERE extract(day from cast('Now' as date)) >= 9
AND DATUM_NAROCILA BETWEEN 'start of this month' AND 'end of this month'
However, in your specific case, why use CASE at all? Why not compute the target date span instead?
WITH
TargetDay as
(SELECT DATEADD( -8 DAY TO CURRENT_DATE) AS TPoint FROM RDB$DATABASE)
,TargetStart as
(SELECT DATEADD( 1 - EXTRACT(DAY FROM TPoint) DAY TO TPoint) AS TStart FROM TargetDay)
,TargetEnd as
(SELECT DATEADD( -1 DAY TO DATEADD( 1 MONTH TO TStart)) AS TEnd FROM TargetStart)
select TStart, TEnd, TPoint from TargetStart, TargetEnd, TargetDay
TSTART TEND TPOINT
2021-08-01 2021-08-31 2021-08-03
See? You do not need any if-then-else at all!
An added bonus is that you can now easily turn your 9 into an SQL parameter, rather than having it as a literal constant injected into the SQL text via always-fragile string splicing. Because 9 is now used only once in the query, you no longer have to keep two different parameters/constants in sync.
WITH
TargetDay as
(SELECT DATEADD( 1 - (cast( ? as integer )) DAY TO CURRENT_DATE) AS TPoint FROM RDB$DATABASE)
,TargetStart as....
Or in languages like Delphi which simulate named parameters for Firebird, it can be like
WITH
TargetDay as
(SELECT DATEADD( 1 - (cast( :ThresholdDay as integer )) DAY TO CURRENT_DATE) AS TPoint FROM RDB$DATABASE)
,TargetStart as....
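The date arithmetic behind those CTEs can be checked with a short Python sketch. The function name target_month_span and its parameter names are made up here; the steps follow the TPoint, TStart and TEnd definitions above, with the threshold day as a single parameter.

```python
from datetime import date, timedelta


def target_month_span(today, threshold_day=9):
    """Return (t_start, t_end, t_point) for the month selected by the threshold rule."""
    # Shift back so that days 1..threshold_day-1 land in the previous month.
    t_point = today - timedelta(days=threshold_day - 1)
    # First day of the month that t_point falls in.
    t_start = t_point.replace(day=1)
    # First day of the following month, then step back one day.
    if t_start.month == 12:
        next_month = t_start.replace(year=t_start.year + 1, month=1)
    else:
        next_month = t_start.replace(month=t_start.month + 1)
    t_end = next_month - timedelta(days=1)
    return t_start, t_end, t_point


print(target_month_span(date(2021, 8, 11)))  # on or after the 9th: current month
print(target_month_span(date(2021, 8, 5)))   # before the 9th: previous month
```

For 2021-08-11 this gives the same three dates as the query output shown above, and the threshold appears only once, as a parameter.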

Move code to nested query to resolve copypaste

I have this SQL query:
SELECT tt.creationdate AS CreatedDate,
DATEDIFF(DAY, CAST(tt.creationdate AS DATE), GETDATE()) AS DaysOpen,
CASE WHEN DATEDIFF(DAY, CAST(tt.creationdate AS DATE), GETDATE()) >= 180 THEN '180+ Days'
WHEN DATEDIFF(DAY, CAST(tt.creationdate AS DATE), GETDATE()) >= 150 THEN '150 - 180 Days'
WHEN DATEDIFF(DAY, CAST(tt.creationdate AS DATE), GETDATE()) >= 120 THEN '120 - 150 Days'
WHEN DATEDIFF(DAY, CAST(tt.creationdate AS DATE), GETDATE()) >= 90 THEN '90 - 120 Days'
WHEN DATEDIFF(DAY, CAST(tt.creationdate AS DATE), GETDATE()) >= 60 THEN '60 - 90 Days'
WHEN DATEDIFF(DAY, CAST(tt.creationdate AS DATE), GETDATE()) >= 30 THEN '30 - 60 Days'
WHEN DATEDIFF(DAY, CAST(tt.creationdate AS DATE), GETDATE()) >= 0 THEN '0 - 30 Days'
ELSE NULL END AS TaskAging,
tms.SupportType,
tms.SupportModule,
tt.*
FROM public.tasks tt
LEFT JOIN
public.tasks_meta_support tms
ON tms.taskid = tt.Id
WHERE tt.issupportticket = 1
AND tt.supportorganizationid = 65277
AND tt.completeddate IS NULL
AND tt.isdeleted = 0
I need to move DaysOpen into a nested query so I can reuse it in the CASE.
How can I do this correctly?
Just use a subquery:
SELECT tt.DaysOpen,
(CASE WHEN tt.DaysOpen >= 180 THEN '180+ Days'
WHEN tt.DaysOpen >= 150 THEN '150 - 180 Days'
WHEN tt.DaysOpen >= 120 THEN '120 - 150 Days'
WHEN tt.DaysOpen >= 90 THEN '90 - 120 Days'
WHEN tt.DaysOpen >= 60 THEN '60 - 90 Days'
WHEN tt.DaysOpen >= 30 THEN '30 - 60 Days'
WHEN tt.DaysOpen >= 0 THEN '0 - 30 Days'
END) AS TaskAging,
tms.SupportType,
tms.SupportModule,
tt.*
FROM (SELECT tt.*, DATEDIFF(DAY, CAST(tt.creationdate AS DATE), GETDATE()) AS DaysOpen
FROM public.tasks tt
) tt LEFT JOIN
public.tasks_meta_support tms
ON tms.taskid = tt.Id
WHERE tt.issupportticket = 1
AND tt.supportorganizationid = 65277
AND tt.completeddate IS NULL
AND tt.isdeleted = 0;
Note that the ELSE is redundant, so I removed it.
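The idea of computing the value once and then bucketing it can also be sketched outside SQL. Below is a minimal Python version, assuming the same thresholds and labels as the query; the names AGING_BUCKETS and task_aging are invented for the example.

```python
from datetime import date, timedelta

# Lower bounds in descending order, mirroring the order of the WHEN branches.
AGING_BUCKETS = [
    (180, "180+ Days"),
    (150, "150 - 180 Days"),
    (120, "120 - 150 Days"),
    (90, "90 - 120 Days"),
    (60, "60 - 90 Days"),
    (30, "30 - 60 Days"),
    (0, "0 - 30 Days"),
]


def task_aging(creation_date, today):
    """Compute DaysOpen once and reuse it to pick the aging label."""
    days_open = (today - creation_date).days
    for lower_bound, label in AGING_BUCKETS:
        if days_open >= lower_bound:
            return days_open, label
    return days_open, None  # negative values: no bucket, like the implicit ELSE NULL


created = date(2024, 1, 1)
for offset in (0, 30, 179, 200):
    print(task_aging(created, created + timedelta(days=offset)))
```

As in the subquery version, the day difference is calculated in exactly one place, so changing how DaysOpen is defined does not require touching the bucket logic.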
First I would move the GETDATE() to @Now and set it just before the query. You might get a difference in behaviour if this runs at 23:59:59.999, but in general you want the values from when you asked, not from when it's running.
Also try not to use wildcards when selecting columns.
Also, do you want all rows from public.tasks and only the matching ones from public.tasks_meta_support, or all rows from public.tasks_meta_support? The order in which you write things is really important here (https://learn.microsoft.com/en-us/previous-versions/sql/sql-server-2005/ms177634(v%3dsql.90)#e-using-the-sql-92-left-outer-join-syntax).
That said, you could use a CTE or a subquery.
I would go with a CTE and write out all the columns.
Not knowing which columns there are in public.tasks, maybe this subquery?
Note that I moved everything related to public.tasks to ptsq.
DECLARE @Now datetime;
SET @Now = GETDATE();
SELECT ptsq.CreatedDate,
ptsq.DaysOpen,
ptsq.TaskAging,
tms.SupportType,
tms.SupportModule,
ptsq.*
FROM (SELECT Id, completeddate, -- needed by the outer join and filter
creationdate AS CreatedDate,
DATEDIFF(DAY, CAST(creationdate AS DATE), @Now) AS DaysOpen,
CASE WHEN DATEDIFF(DAY, CAST(creationdate AS DATE), @Now) >= 180 THEN '180+ Days'
WHEN DATEDIFF(DAY, CAST(creationdate AS DATE), @Now) >= 150 THEN '150 - 180 Days'
WHEN DATEDIFF(DAY, CAST(creationdate AS DATE), @Now) >= 120 THEN '120 - 150 Days'
WHEN DATEDIFF(DAY, CAST(creationdate AS DATE), @Now) >= 90 THEN '90 - 120 Days'
WHEN DATEDIFF(DAY, CAST(creationdate AS DATE), @Now) >= 60 THEN '60 - 90 Days'
WHEN DATEDIFF(DAY, CAST(creationdate AS DATE), @Now) >= 30 THEN '30 - 60 Days'
WHEN DATEDIFF(DAY, CAST(creationdate AS DATE), @Now) >= 0 THEN '0 - 30 Days'
END AS TaskAging
FROM public.tasks
WHERE issupportticket = 1
AND supportorganizationid = 65277
AND isdeleted = 0) ptsq
LEFT OUTER JOIN public.tasks_meta_support tms ON tms.taskid = ptsq.Id -- Assuming you want all rows from public.tasks.
WHERE ptsq.completeddate IS NULL -- Could probably be moved to ptsq

Redshift: Running query using GETDATE() at specified list of times

So, I have a query that uses GETDATE() in WHERE and HAVING clauses:
SELECT GETDATE(), COUNT(*) FROM (
SELECT 1 FROM events
WHERE (event_time > (GETDATE() - interval '25 hours'))
GROUP BY id
HAVING MAX(event_time) BETWEEN (GETDATE() - interval '25 hours') AND (GETDATE() - interval '24 hours')
)
I'm basically trying to find the number of unique ids that have their latest event_time between 25 and 24 hours ago with respect to the current time.
The problem: I have another table query_dts which contains one column of timestamps. Instead of running the above query on the current time, using GETDATE(), I need to run it on the timestamp of every entry of the query_dts table. Any ideas?
Note: I'm not really storing query_dts anywhere. I've created it like this:
WITH query_dts AS (
SELECT (
DATEADD(hour,-(row_number() over (order by true)), getdate())
) as n
FROM events LIMIT 48
),
which I got from here
How about avoiding the generator altogether and instead just splitting the intervals:
SELECT
dateadd(hour, -distance, getdate()),
count(0) AS event_count
FROM (
SELECT
id,
datediff(hour, max(event_time), getdate()) AS distance
FROM events
WHERE event_time > getdate() - INTERVAL '2 days'
GROUP BY id) AS events_with_distance
GROUP BY distance;
You can use a JOIN to combine the two queries. Then you just need to substitute the values for your date expression. I think this is the logic:
WITH query_dts AS (
SELECT DATEADD(hour, -(row_number() over (order by true)), getdate()) as n
FROM events
LIMIT 48
)
SELECT i.n, COUNT(*)
FROM (SELECT d.n, e.id
FROM events e JOIN
query_dts d
ON e.event_time > d.n - interval '25 hours' AND e.event_time <= d.n
GROUP BY d.n, e.id
HAVING MAX(e.event_time) BETWEEN d.n - interval '25 hours' AND d.n - interval '24 hours'
) i
GROUP BY i.n;
Here's what I ended up doing:
WITH max_time_table AS
(
SELECT id, max(event_time) AS max_time
FROM events
WHERE (event_time > GETDATE() - interval '74 hours')
GROUP BY id
),
query_dts AS
(
SELECT (DATEADD(hour,-(row_number() over (ORDER BY TRUE) - 1), getdate()) ) AS n
FROM events LIMIT 48
)
SELECT query_dts.n, COUNT(*)
FROM max_time_table JOIN query_dts
ON max_time_table.max_time BETWEEN (query_dts.n - interval '25 hours') AND (query_dts.n - interval '24 hours')
GROUP BY query_dts.n
ORDER BY query_dts.n DESC
Here, I selected 74 hours because I wanted 48 hours ago + 25 hours ago = 73 hours ago.
The problem is that this isn't a general-purpose way of doing this. It's a highly specific solution for this particular problem. Can someone think of a more general way of running a query dependent on GETDATE() using a column of dates in another table?
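One way to think about the general case is to treat the reference time as a parameter and evaluate the same logic once per timestamp. The Python sketch below shows that idea on in-memory data. The function name counts_as_of and the sample events are made up, and it assumes "as of n" semantics, meaning events later than n are ignored when finding an id's latest event. That is an assumption about the intent and differs from taking a single global maximum per id.

```python
from datetime import datetime, timedelta


def counts_as_of(events, query_times,
                 oldest=timedelta(hours=25), newest=timedelta(hours=24)):
    """For each n, count ids whose latest event at or before n lies in [n - oldest, n - newest].

    events is an iterable of (id, event_time) pairs.
    """
    results = {}
    for n in query_times:
        latest = {}
        for event_id, event_time in events:
            if event_time > n:
                continue  # not yet visible when "running" the query at time n
            if event_id not in latest or event_time > latest[event_id]:
                latest[event_id] = event_time
        results[n] = sum(1 for t in latest.values() if n - oldest <= t <= n - newest)
    return results


base = datetime(2024, 1, 3, 12, 0)
sample_events = [
    ("a", base - timedelta(hours=24, minutes=30)),
    ("b", base - timedelta(hours=24, minutes=30)),
    ("b", base - timedelta(hours=1)),
    ("c", base - timedelta(hours=26)),
]
print(counts_as_of(sample_events, [base, base - timedelta(hours=2)]))
```

In SQL the same shape is a join between the events and the list of timestamps, grouped by timestamp and id, so the timestamp column plays the role that GETDATE() played in the single-time query.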

SQL query not returning expected date time range

I am working on a query that I hope to be able to use to query against a database for a specific range of time on a specific date. If I query for a full day of data I get the correct data returned. One row per hour of data available (0 - 23).
WHERE Documents.CreationTime BETWEEN '2014-10-01 00:00:00.000' AND '2014-10-01 23:59:59.999'
If I attempt to query for a portion of the day, the results are unusual.
WHERE Documents.CreationTime BETWEEN '2014-10-01 00:00:00.000' AND '2014-10-01 06:00:00.000'
Part day query returns: (Note the hours jump from 0 to 19)
Hours Faxes Good Page Count
0 3 4
19 15 58
20 4 9
21 8 42
22 2 4
23 4 12
Here is my reduced query I created to try and resolve the issue.
SELECT DATEPART(hour, DATEADD(HH, - DATEDIFF(Hour, GETDATE(), GETUTCDATE()), Documents.CreationTime)) AS Hours
,COUNT(*) AS Faxes
,SUM(goodpagecount) AS [Good Page Count]
FROM Documents
JOIN Users
ON Documents.OwnerID = Users.handle
JOIN Groups
ON Users.GroupID = Groups.handle
JOIN History
ON History.OWNER = Documents.handle
JOIN HistoryTRX
ON History.handle = HistoryTRX.handle
WHERE Documents.CreationTime BETWEEN '2014-10-01 00:00:00.000'
AND '2014-10-01 06:00:00.000'
GROUP BY DATEPART(hour, DATEADD(HH, - DATEDIFF(Hour, GETDATE(), GETUTCDATE()), Documents.CreationTime))
ORDER BY DATEPART(hour, DATEADD(HH, - DATEDIFF(Hour, GETDATE(), GETUTCDATE()), Documents.CreationTime))
Any suggestions as to what I am missing or improvements?
EDIT- More details
The "Documents.CreationTime" is in UTC. I am looking to have the "Hours" column correspond to local time. In this case UTC -5 as of this entry.
How about using the DATEADD function in your where clause:
WHERE Documents.CreationTime >= '20141001' AND Documents.CreationTime <= DATEADD(HOUR,6,'20141001')
Interesting Blog on the comment made by Lamak written by Aaron Bertrand :
What do BETWEEN and the devil have in common?
Based on suggestions provided in response to my question, I came up with the following new query:
SELECT DATEPART(hour, DATEADD(HH,-DATEDIFF(Hour,GETDATE(),GETUTCDATE()),Documents.CreationTime)) AS Hours ,COUNT(*) AS Faxes,SUM(goodpagecount) AS [Good Page Count]
FROM Documents
JOIN Users ON Documents.OwnerID=Users.handle
JOIN Groups ON Users.GroupID=Groups.handle
JOIN History ON History.Owner=Documents.handle
JOIN HistoryTRX ON History.handle=HistoryTRX.handle
WHERE DATEADD(HH,-DATEDIFF(Hour,GETDATE(),GETUTCDATE()),Documents.CreationTime) >= '2014-10-01 00:00:00.000' and DATEADD(HH,-DATEDIFF(Hour,GETDATE(),GETUTCDATE()),Documents.CreationTime) <= '2014-10-03 08:00:00.000'
GROUP BY DATEPART(hour, DATEADD(HH,-DATEDIFF(Hour,GETDATE(),GETUTCDATE()),Documents.CreationTime))
ORDER BY DATEPART(hour, DATEADD(HH,-DATEDIFF(Hour,GETDATE(),GETUTCDATE()),Documents.CreationTime))
My changes are to the "WHERE" statement by adding my UTC compensation. The "WHERE" now matches the "SELECT".
Before:
WHERE Documents.CreationTime >= '2014-10-01 00:00:00.000' and Documents.CreationTime <= '2014-10-03 08:00:00.000'
After:
WHERE DATEADD(HH,-DATEDIFF(Hour,GETDATE(),GETUTCDATE()),Documents.CreationTime) >= '2014-10-01 00:00:00.000' and DATEADD(HH,-DATEDIFF(Hour,GETDATE(),GETUTCDATE()),Documents.CreationTime) <= '2014-10-03 08:00:00.000'
Also removed the BETWEEN keyword as it may not be as precise for results as I would like.
The results now look like:
Hours Faxes Good Page Count
0 3 4
1 5 9
3 9 50
4 8 16
5 14 40
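The jump from hour 0 to hour 19 in the first attempt follows from the offset: a UTC range of 00:00 to 06:00 corresponds to 19:00 through 01:00 local time at UTC -5. An alternative to shifting the column in the WHERE clause is to shift the range bounds instead and compare against the stored UTC values. Here is a minimal Python sketch of that idea, assuming a fixed UTC -5 offset with no daylight-saving handling and a half-open range (a choice made for this example, not taken from the query); the function name and sample timestamps are made up.

```python
from collections import Counter
from datetime import datetime, timedelta

UTC_OFFSET = timedelta(hours=-5)  # assumption: fixed UTC -5, no DST handling


def faxes_per_local_hour(creation_times_utc, local_start, local_end):
    """Count documents per local hour for local_start <= local time < local_end."""
    # Convert the local bounds to UTC once, instead of converting every row.
    utc_start = local_start - UTC_OFFSET
    utc_end = local_end - UTC_OFFSET
    counts = Counter()
    for created_utc in creation_times_utc:
        if utc_start <= created_utc < utc_end:
            counts[(created_utc + UTC_OFFSET).hour] += 1
    return dict(sorted(counts.items()))


sample = [
    datetime(2014, 10, 1, 0, 30),   # 19:30 local on the previous day
    datetime(2014, 10, 1, 5, 10),   # 00:10 local
    datetime(2014, 10, 1, 7, 0),    # 02:00 local
    datetime(2014, 10, 1, 12, 0),   # 07:00 local
]
print(faxes_per_local_hour(sample, datetime(2014, 10, 1, 0, 0), datetime(2014, 10, 1, 6, 0)))
```

Comparing the untouched column against converted bounds also leaves the column free of function calls in the filter, which generally makes it easier for an index on CreationTime to be used.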

SQL Server Group By Last 24 Hours, Last 7 Days and Last 14 Days

I need to write an SQL query to produce the following result set. What's the best way to achieve this?
Time Range Qty Amount
===============================================
Last 24 Hours 56 $2000
Last 7 Days 359 $3900
Last 14 Days 2321 $22,888
select 'Last 24 hours'
, sum(Qty) as Qty
, sum(Amount) as Amount
from YourTable
where TradeDt > dateadd(hour, -24, getdate())
union all
select 'Last 7 days'
, sum(Qty)
, sum(Amount)
from YourTable
where TradeDt > dateadd(day, -7, getdate())
union all
select 'Last 14 days'
, sum(Qty)
, sum(Amount)
from YourTable
where TradeDt > dateadd(day, -14, getdate())
My first guess would be to use UNION, if you absolutely need a table as a result (otherwise, you could just fetch the data row by row).
I don't think that there is any nicer way to do this in SQL.
SELECT 'Last 24 hours', SUM(qty), SUM(amount)
FROM [table]
WHERE DATEDIFF(hour, [date], GETDATE()) < 24
UNION
SELECT 'Last 7 days', SUM(qty), SUM(amount)
FROM [table]
WHERE DATEDIFF(day, [date], GETDATE()) < 7
UNION
SELECT 'Last 14 days', SUM(qty), SUM(amount)
FROM [table]
WHERE DATEDIFF(day, [date], GETDATE()) < 14
You can use a WHERE clause with DATEADD, like:
select * from [table]
where datefield > DateAdd(d,-1,getdate())
select * from [table]
where datefield > DateAdd(d,-7,getdate())
select * from [table]
where datefield > DateAdd(d,-14,getdate())
and so on for each range.
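For comparison, the three overlapping windows can be sketched in Python over in-memory rows. The function name summarize and the sample trades are made up; each window uses the same "newer than the cutoff" test as the DATEADD conditions above, so a recent trade is counted in all three rows.

```python
from datetime import datetime, timedelta


def summarize(trades, now):
    """Return (label, total_qty, total_amount) for each time window.

    trades is an iterable of (trade_dt, qty, amount) tuples.
    """
    windows = [
        ("Last 24 Hours", timedelta(hours=24)),
        ("Last 7 Days", timedelta(days=7)),
        ("Last 14 Days", timedelta(days=14)),
    ]
    rows = []
    for label, span in windows:
        cutoff = now - span
        selected = [(qty, amount) for trade_dt, qty, amount in trades if trade_dt > cutoff]
        rows.append((label,
                     sum(qty for qty, _ in selected),
                     sum(amount for _, amount in selected)))
    return rows


now = datetime(2024, 1, 15, 12, 0)
sample_trades = [
    (now - timedelta(hours=1), 2, 10.0),
    (now - timedelta(days=3), 5, 20.0),
    (now - timedelta(days=10), 1, 5.0),
    (now - timedelta(days=20), 9, 99.0),
]
for row in summarize(sample_trades, now):
    print(row)
```

Taking now as a single argument keeps all three windows anchored to the same instant, which the separate getdate() calls in the SQL versions do not strictly guarantee.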