dateadd in Redshift not compensating for leap year

I'm working on a YoY self join to see this year's and last year's sales numbers side by side in the same table.
The query looks something like this:
Select a.date_column, a.sales_column as ty_sales, b.sales_column as ly_sales
from sales_table a
left join sales_table b
on (dateadd(year, -1, a.date_column)) = b.date_column
In theory this should be fine. The problem is that 2016-02-29 records are joining to 2015-03-01 records, which throws the numbers off for 02-2016.
Is this a known issue with Redshift/Postgres?
Let me know if I can provide any additional clarity.

I wouldn't say that it's a "known issue", it's how they decided to handle leap years. When you compare 2/29 for YoY, which day would you like to compare it with? If you compare it with 2/28 then are you also going to compare 2/28 of this year with 2/28 of last year? You're now comparing two days to the same day. Then you have to account for potentially double-counting those sales from last year when you total things up.
The short of it is that you need to come up with very specific business rules on how you want to handle leap years when it comes to reporting and then implement those rules, being careful to test them thoroughly given that date functions are often a bit arbitrary (for any database/language) when it comes to leap years.
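To make the "specific business rules" point concrete, here is a minimal Python sketch of the date mapping only (not Redshift SQL). The function name `same_day_last_year` and the rule names `feb28`, `mar01` and `skip` are made up for illustration; the point is that the leap-day choice is written down explicitly rather than left to whatever the date function happens to do.

```python
from datetime import date
from typing import Optional

def same_day_last_year(d: date, leap_rule: str = "feb28") -> Optional[date]:
    """Return the comparison date one year earlier under an explicit leap-day rule.

    leap_rule is a hypothetical setting for this sketch:
      "feb28" -> compare Feb 29 with Feb 28 of the previous year
      "mar01" -> compare Feb 29 with Mar 1 of the previous year
      "skip"  -> Feb 29 has no comparison date (returns None)
    """
    if (d.month, d.day) != (2, 29):
        # Every other calendar day exists in the previous year as well.
        return d.replace(year=d.year - 1)
    if leap_rule == "feb28":
        return date(d.year - 1, 2, 28)
    if leap_rule == "mar01":
        return date(d.year - 1, 3, 1)
    if leap_rule == "skip":
        return None
    raise ValueError(f"unknown leap_rule: {leap_rule!r}")
```

Note that under the `feb28` rule both 2016-02-28 and 2016-02-29 map to 2015-02-28, which is exactly the double-counting risk described above.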

Related

sql Teradata difference in months

I need to get the difference between dates, but I just need to get the whole months that have passed. So for example between "1990-05-24" and "1990-05-27" it should say 0. It would also be 0 for "1990-05-02" and "1990-05-29" because the month has not finished.
I already got the difference in months using MONTHS_BETWEEN(), but I get months with decimals, and ROUNDing is not an option since sometimes it should be up and sometimes down.
I thought about setting all dates to day 01 in both columns, Closing_date and Opening_date, but I can't figure out how to do it.

I think you want to count boundaries between months. If so, you can use months_between() after truncating to the first of the month:
months_between(trunc(date1, 'MON'), trunc(date2, 'MON'))
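As a rough check of what "counting boundaries" means, here is a small Python sketch (not Teradata SQL). The helper name `month_boundaries_between` is hypothetical; it should match what you get after truncating both dates to the first of the month, since only the year and month parts then matter.

```python
from datetime import date

def month_boundaries_between(later: date, earlier: date) -> int:
    """Count month boundaries crossed, ignoring the day of month entirely."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)
```

With the dates from the question, both 1990-05-24 to 1990-05-27 and 1990-05-02 to 1990-05-29 give 0. Be aware that 1990-05-31 to 1990-06-01 gives 1 under this rule, so it only fits if boundary counting is really what you want.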

WHERE statement to choose record previous day but choose Friday record when current day is Monday Microsoft SQL

I need a WHERE statement where the date of the record is the previous day. I have the below code which will do this
WHERE DOC_DATE = dateadd(day,datediff(day,1,GETDATE()),0)
However, I need this statement to get Friday's record when the current day is Monday. I have the code below, but it does not work for me: SQL returns no errors, but no records come back either.
WHERE DOC_DATE = DATEADD(day, CASE WHEN datepart(dw,(GETDATE())) IN (2) then -3 ELSE -1 END ,0)
It is important to add that this needs to be in a WHERE clause. This is for a Docuware administrative view I am creating. I have no control over how the SELECT statement is written; I can only edit the WHERE clause.

Here's a slightly "magical" way to compute the value that doesn't depend on any particular server settings such as datefirst. It's probably not immediately obvious how it works:
WHERE DOC_DATE = dateadd(day,datediff(day,'20150316',getdate()),
CASE WHEN DATEPART(weekday,getdate()) = DATEPART(weekday,'20150330')
THEN '20150313'
ELSE '20150315' END)
In the first line, we're computing the number of days which have elapsed since some arbitrary date. I picked a day in March 2015 to use as my base date.
The second line asks what today's day of the week is and if it's the same as some arbitrary "Known good" Monday. Just taking one value and comparing it to 2 depends on what your DATEFIRST setting is so I prefer not to write that.
In the third line, we decide what to do if it's a Monday - I give a date that is 3 days before my arbitrary date above. If it wasn't a Monday, we pick the day before.
Adding it all together, when we add the days difference from the arbitrary date back to one of these two dates from lines 3 and 4, it has the effect of shifting the date backwards 1 or 3 days.
It can be an odd structure to see if you're not familiar with it - but combining dateadd/datediff and exploiting relationships between an arbitrary date and other dates computed from it can be useful for performing all kinds of calculations. A similar structure can be used for computing e.g. the last day of the month 15 months ago using just dateadd/datediff, an arbitrary date and another date with the right offset from the first:
SELECT DATEADD(month,DATEDIFF(month,'20010101',GETDATE()),'19991031')
As I said in a comment though, usually doing this sort of thing is only a short step away from needing to properly model your organisation's business days, at which point you'd typically want to introduce a calendar table. At one row per day, 20 years' worth of pre-calculated calendar (adjusted as necessary as the business changes) is still less than 10000 rows.
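If the anchor-date arithmetic in the WHERE clause above still looks like magic, it can be replayed outside the database. This is a Python sketch of the same idea, not T-SQL; the function name `previous_report_day` is made up, while the four dates are the ones used in the answer.

```python
from datetime import date, timedelta

ANCHOR = date(2015, 3, 16)        # arbitrary base date from the answer
KNOWN_MONDAY = date(2015, 3, 30)  # any date known to be a Monday
THREE_BEFORE = date(2015, 3, 13)  # anchor minus 3 days
ONE_BEFORE = date(2015, 3, 15)    # anchor minus 1 day

def previous_report_day(today: date) -> date:
    """Mirror dateadd(day, datediff(day, anchor, today), shifted_anchor)."""
    days_elapsed = (today - ANCHOR).days
    # Compare weekdays against a known Monday instead of a hard-coded number.
    is_monday = today.weekday() == KNOWN_MONDAY.weekday()
    base = THREE_BEFORE if is_monday else ONE_BEFORE
    return base + timedelta(days=days_elapsed)
```

Adding the elapsed days to a base that sits 1 or 3 days before the anchor is what produces "yesterday" or "last Friday"; no weekday number is ever hard-coded, which is why the SQL version does not depend on DATEFIRST.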

You can try this.
WHERE DOC_DATE = DATEADD(DAY, CASE WHEN datepart(dw, GETDATE()) = 2 THEN -3 ELSE -1 END, CAST(GETDATE() AS DATE))

When a start date and end date span more than one week, I need to split a row up in 2 or more new rows

I would like to begin by saying I'm quite new to sql.
That said, here is my question/problem:
I have a view that has two date columns, a variable column and a text column (for comments).
I need to be able to split up all rows where the two dates are not in the same week. I also need to split the variable value so that it is distributed evenly, based on how many days fall in each week. The comment must be copied as well, so that it is shown in each row.
My dataset looks like this:
DateIn      DateOut     Amount  Comment
2014-11-01  2014-11-08  600     Good
And what I want is this:
DateIn      DateOut     Amount  Comment
2014-11-01  2014-11-07  525     Good
2014-11-08  2014-11-08  75      Good
And if the time period spreads over more weeks, I would need it to split up to equivalent number of rows.
I would be very grateful if somebody could take the time to tell me how to achieve my goal using an SQL query.
As this is my first post on the forum, I apologize for any format errors in my post.

First, you need a weeks table. I mean a physical table or view with one row for every possible week. (We have a dates table here, +/- 30 years from now, which makes it easy to create a weeks view and similar.)
Then you need to link your data to the weeks table with a left join; the join condition should check whether the date range overlaps the week's date range (you will probably want both week start and week end fields in your weeks table, which makes comparisons easier).
Then you need to divide amounts between weeks. Because you know the date range length, the week length and the length of the overlapping date range, this should be trivial :)
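To show the arithmetic of the last step, here is a Python sketch that walks the range week by week instead of joining to a weeks table (so it is a stand-in for the logic, not the SQL itself). The function name `split_by_week` and the `week_start` parameter are hypothetical. The default of Saturday is an assumption read off the example in the question, where 2014-11-01 (a Saturday) to 2014-11-07 forms one week; dates are treated as inclusive.

```python
from datetime import date, timedelta

def split_by_week(date_in: date, date_out: date, amount: float, week_start: int = 5):
    """Split an inclusive date range into per-week rows with a pro-rated amount.

    week_start uses date.weekday() numbering (Monday=0 ... Sunday=6);
    5 (Saturday) is assumed from the example data, not a general rule.
    """
    total_days = (date_out - date_in).days + 1
    rows = []
    chunk_start = date_in
    while chunk_start <= date_out:
        # How far chunk_start is into its week, then the last day of that week.
        days_into_week = (chunk_start.weekday() - week_start) % 7
        week_end = chunk_start + timedelta(days=6 - days_into_week)
        chunk_end = min(week_end, date_out)
        chunk_days = (chunk_end - chunk_start).days + 1
        rows.append((chunk_start, chunk_end, amount * chunk_days / total_days))
        chunk_start = chunk_end + timedelta(days=1)
    return rows
```

For the row in the question, `split_by_week(date(2014, 11, 1), date(2014, 11, 8), 600)` gives one row of 7 days with 525.0 and one row of 1 day with 75.0. In SQL the same overlap-days over total-days ratio would come from the join against the weeks table.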

Battling Datediff in SQL

I am writing a little query in SQL and am butting heads with an issue that it seems like someone must have run into before. I am trying to find the number of months between two dates. I am using an expression like ...
DATEDIFF(m,{firstdate},{seconddate})
However, I notice that this function tallies the number of times the date crosses a month boundary. For example...
DATEDIFF(m,'3/31/2011','4/1/2011') will yield 1
DATEDIFF(m,'4/1/2011','4/30/2011') will yield 0
DATEDIFF(m,'3/1/2011','4/30/2011') will yield 1
Does anyone know how to find the months between two dates based on time passed rather than on the number of month boundaries crossed?

If you want to find some notional number of months, why not find the difference in days, then divide by 30 (cast to FLOAT as required)? Or 30.5-ish perhaps - it depends on how you want to handle the variable month length throughout the year. But perhaps that's not a factor in your particular case.

The following statements have the same startdate and the same enddate. Those dates are adjacent and differ in time by .0000001 second. The difference between the startdate and enddate in each statement crosses one calendar or time boundary of its datepart. Each statement returns 1. ...
SELECT DATEDIFF(month, '2005-12-31 23:59:59.9999999'
, '2006-01-01 00:00:00.0000000'); ....
(From the DATEDIFF documentation, section "datepart Boundaries".) If that behavior doesn't suit you, you probably need to use days as the unit, as proposed by martin clayton.

DATEDIFF(m,{firstdate},ISNULL({seconddate},GETDATE())) - CASE
WHEN DATEPART(d,{firstdate}) >= DATEPART(d,ISNULL({seconddate},GETDATE()))
THEN 1
ELSE 0
END

DATEDIFF is like this by design. When evaluating a particular time measurement (like months, or days, etc.), it considers only that measurement and higher values -- ignoring smaller ones. You'll run into this behavior with any time measurement. For example, if you used DATEDIFF to calculate days, and had one date a few seconds before midnight, and another date a few seconds after midnight, you'd get a "1" day difference, even though the two dates were only a few seconds apart.
DATEDIFF is meant to give a rough answer to questions, like this:
Question: how many years old are you?
Answer: some integer. You don't say "I'm 59 years, 4 months, 17 days, 5 hours, 35 minutes and 27 seconds old". You just say "I'm 59 years old". That's DATEDIFF's approach too.
If you want an answer that's tailored to some contextual meaning (like your son who says "I'm not 8! I'm 8 and 3-quarters!" or "I'm almost 9!"), then you should look at the next-smallest measurement and approximate with it. So if it's months you're after, then do a DATEDIFF on days or hours instead, and try to approximate months however it seems most relevant to your situation (maybe you want answers like 1-1/2 months, or 1.2 months, etc.) using CASE / IF-THEN kinds of logic.
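To compare the two notions side by side, here is a Python sketch (not T-SQL). `month_boundaries` mimics what DATEDIFF(m, ...) counts, and `whole_months` subtracts one when the end day-of-month has not yet reached the start day-of-month. Both names are made up for illustration, and the "same day of the month counts as a full month" rule is an assumption you may want to change.

```python
from datetime import date

def month_boundaries(start: date, end: date) -> int:
    """Number of month boundaries crossed, like DATEDIFF(m, start, end)."""
    return (end.year - start.year) * 12 + (end.month - start.month)

def whole_months(start: date, end: date) -> int:
    """Completed months between start and end (assumes start <= end)."""
    months = month_boundaries(start, end)
    if end.day < start.day:
        # The final month has not been completed yet.
        months -= 1
    return months
```

With the dates from the question, the boundary count gives 1, 0 and 1, while `whole_months` gives 0, 0 and 1. Month-end dates remain a judgment call: under this rule 2011-01-31 to 2011-02-28 counts as 0 whole months.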

T-SQL absence by month from start date end date

I have an interesting query to do and am trying to find the best way to do it. Basically I have an absence table in our personnel database that records the staff id and then a start date and end date for the absence. The end date is null if not yet entered (not returned). I cannot change the design.
They would like a report by month on number of absences (12 month trend). With staff being off over the month change it obviously may be difficult to calculate.
e.g. Staff off 25/11/08 to 05/12/08 (dd/MM/yy) I would want the days in November to go into the November count and the ones in December in the December count.
I am currently thinking that in order to count the number of days I need to separate the start and end date into a record for each day of the absence, assigning it to the month it is in, then group the data for reporting. As for the ones without an end date, I would treat null as the current date, as they are presently still absent.
What would be the best way to do this?
Any better ways?
Edit: This is currently SQL Server 2000. Hoping for an upgrade soon.

I have had a similar issue where there has been a table of start/end dates designed for data storage but not for reporting.
I sought out the "fastest executing" solution and found that it was to create a 2nd table with the monthly values in there. I populated it with the months from Jan 2000 to Jan 2070. I'm expecting it will suffice or that I get a large pay cheque in 2070 to come and update it...
CREATE TABLE months (start DATETIME)
-- Populate with all month start dates that may ever be needed
-- And I would recommend indexing / primary keying by start
SELECT
months.start,
data.id,
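SUM(DATEDIFF(DAY,
-- count from the later of the absence start and the month start
CASE WHEN data.start < months.start THEN months.start ELSE data.start END,
-- up to the earlier of the absence end and the start of the next month
CASE WHEN data.end > DATEADD(month, 1, months.start)
THEN DATEADD(month, 1, months.start) ELSE data.end END
)) AS days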
FROM
data
INNER JOIN
months
ON data.start < DATEADD(month, 1, months.start)
AND data.end > months.start
GROUP BY
months.start,
data.id
That join can be quite slow for various reasons. I'll search out another answer to another question to show why and how to optimise the join.
EDIT:
Here is another answer relating to overlapping date ranges and how to speed up the joins...
Query max number of simultaneous events
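For anyone who wants to sanity-check the per-month day counts before writing the SQL, here is a Python sketch of the overlap idea: clamp each absence to the month and count the days in the overlap. The function names are hypothetical. Two assumptions are baked in: the end date is treated as inclusive (the last day off counts), and a null end date is replaced by a `today` value that the caller passes in.

```python
from datetime import date, timedelta
from typing import Dict, Optional, Tuple

def first_of_next_month(d: date) -> date:
    """First day of the month following d."""
    if d.month == 12:
        return date(d.year + 1, 1, 1)
    return date(d.year, d.month + 1, 1)

def absence_days_by_month(start: date, end: Optional[date],
                          today: date) -> Dict[Tuple[int, int], int]:
    """Map (year, month) to days absent; a missing end means still absent today."""
    # Work with an exclusive upper bound so that the end date itself is counted.
    end_exclusive = (end if end is not None else today) + timedelta(days=1)
    result = {}
    month_start = start.replace(day=1)
    while month_start < end_exclusive:
        next_month = first_of_next_month(month_start)
        overlap_start = max(start, month_start)
        overlap_end = min(end_exclusive, next_month)
        result[(month_start.year, month_start.month)] = (overlap_end - overlap_start).days
        month_start = next_month
    return result
```

For the example in the question (25/11/08 to 05/12/08) this gives 6 days in November and 5 in December. The SQL equivalent would wrap the end date in ISNULL(..., GETDATE()) and apply the same clamping inside DATEDIFF.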
