Modulo arithmetic on dates in SQL - sql

I have a system which defines repeating patterns of days. Each pattern has a base date (often a few years in the past, when the pattern was created) and a day count (which loops), so for example it might define a pattern for a seven day period:
Table: Pattern
ID | BaseDate | DayCount
-----------------------------
1 | 01/02/2005 | 7
Table: PatternDetail
PID | Offset | Detail
----------------------
1 | 0 | A
1 | 1 | B
1 | 2 | B
1 | 3 | C
etc.
(The detail column is domain specific and not relevant.)
What I want to do is, given a date (say today) work out the correct Offset in the PatternDetail table for a given working pattern. In pseudocode I would do:
offset = ((today.InDays) - (Pattern.BaseDate.InDays)) % (Pattern.DayCount)
How can I do this in SQL (needs to work in MSSQL Server and Oracle)? In other words how can I calculate the number of days between two dates and take the modulus of this difference?

I don't know what is available in PL/SQL, but T-SQL has a DATEDIFF function which appears to be what you're looking for:
@Offset = DATEDIFF(day, @BaseDate, GETDATE()) % @DayCount

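Use DATEDIFF to get the difference in days. This gives an integer.
Then take the modulus: % in SQL Server, or the MOD() function in Oracle (where subtracting one DATE from another already yields the difference in days).
Is it that simple?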
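As a rough, runnable illustration of the same arithmetic, here is a minimal sketch that uses Python's built-in sqlite3 module rather than SQL Server or Oracle. SQLite has no DATEDIFF, so the day difference is taken with julianday(); the helper name pattern_offset is hypothetical, and the base date 01/02/2005 is assumed to mean 1 February 2005 and is stored in ISO format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Pattern (ID INTEGER PRIMARY KEY, BaseDate TEXT, DayCount INTEGER);
-- Assumption: 01/02/2005 is read as 1 February 2005 and stored as ISO text.
INSERT INTO Pattern VALUES (1, '2005-02-01', 7);
""")

def pattern_offset(pattern_id, on_date):
    """Return (days between BaseDate and on_date) modulo DayCount."""
    row = conn.execute(
        """
        SELECT CAST(julianday(?) - julianday(BaseDate) AS INTEGER) % DayCount
        FROM Pattern
        WHERE ID = ?
        """,
        (on_date, pattern_id),
    ).fetchone()
    return row[0]

print(pattern_offset(1, "2005-02-11"))  # 10 days after the base date
```

The resulting value can then be matched against PatternDetail.Offset. Note that for dates earlier than BaseDate the remainder can come out negative, so such dates may need separate handling.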

Related

Rolling sum based on date (when dates are missing)

You may be aware of rolling the results of an aggregate over a specific number of preceding rows. I.e.: how many hot dogs did I eat over the last 7 days
SELECT HotDogCount,
DateKey,
SUM(HotDogCount) OVER (ORDER BY DateKey ROWS 6 PRECEDING) AS HotDogsLast7Days
FROM dbo.HotDogConsumption
Results:
+-------------+------------+------------------+
| HotDogCount | DateKey | HotDogsLast7Days |
+-------------+------------+------------------+
| 3 | 09/21/2020 | 3 |
| 2 | 09/22/2020 | 5 |
| 1 | 09/23/2020 | 6 |
| 1 | 09/24/2020 | 7 |
| 1 | 09/25/2020 | 8 |
| 4 | 09/26/2020 | 12 |
| 1 | 09/27/2020 | 13 |
| 3 | 09/28/2020 | 13 |
| 2 | 09/29/2020 | 13 |
| 1 | 09/30/2020 | 13 |
+-------------+------------+------------------+
Now, the problem I am having is when there are gaps in the dates. So, basically, one day my intestines and circulatory system are screaming at me: "What the heck are you doing, you're going to kill us all!!!" So, I decide to give my body a break for a day and now there is no record for that day. When I use the "ROWS 6 PRECEDING" method, I will now be reaching back 8 days, rather than 7, because one day was missed.
So, the question is, do any of you know how I could use the OVER clause to truly use a date value (something like "DATEADD(day,-7,DateKey)") to determine how many previous rows should be summed up for a true 7 day rolling sum, regardless of whether I only ate hot dogs on one day or on all 7 days?
Side note, to have a record of 0 for the days I didn't eat any hotdogs is not an option. I understand that I could use an array of dates and left join to it and do a
CASE WHEN Datekey IS NULL THEN 0 END
type of deal, but I would like to find out if there is a different way where the rows preceding value can somehow be determined dynamically based on the date.
Window functions are the right approach in theory. But to look back at the 7 preceding days (not rows), we need a range frame specification with a value offset, which, unfortunately, SQL Server does not support.
I am going to recommend a subquery, or a lateral join:
select hdc.*, hdc1.*
from dbo.HotDogConsumption hdc
cross apply (
select coalesce(sum(HotDogCount), 0) HotDogsLast7Days
from dbo.HotDogConsumption hdc1
where hdc1.datekey >= dateadd(day, -7, hdc.datekey)
and hdc1.datekey < hdc.datekey
) hdc1
You might want to adjust the conditions in the where clause of the subquery to the precise frame that you want. The above code computes over the last 7 days, not including today. Something equivalent to your current attempt would be like:
where hdc1.datekey >= dateadd(day, -6, hdc.datekey)
and hdc1.datekey <= hdc.datekey
I'm kind of old school, but this is how I'd go about it:
SELECT
HDC1.HotDogCount
,HDC1.DateKey
,( SELECT SUM( HDC2.HotDogCount )
FROM HotDogConsumption HDC2
WHERE HDC2.DateKey BETWEEN DATEADD( DD, -6, HDC1.DateKey ) -- BETWEEN is inclusive, so -6 gives a 7-day window
AND HDC1.DateKey ) AS 'HotDogsLast7Days'
FROM
HotDogConsumption HDC1
;
Someone younger might use an OUTER APPLY or something.
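To see the date-based window behave correctly when a day is missing, here is a small sketch using Python's built-in sqlite3 module (an assumption made for runnability; the thread itself targets SQL Server, where DATEADD would replace SQLite's date() modifier). The sample rows are made up and deliberately skip 2020-09-24.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE HotDogConsumption (DateKey TEXT PRIMARY KEY, HotDogCount INTEGER);
-- Illustrative data; note there is no row for 2020-09-24.
INSERT INTO HotDogConsumption VALUES
  ('2020-09-21', 3), ('2020-09-22', 2), ('2020-09-23', 1),
  ('2020-09-25', 1), ('2020-09-26', 4), ('2020-09-27', 1),
  ('2020-09-28', 3), ('2020-09-29', 2);
""")

# For each row, sum every row whose date falls in the 7-day window
# ending on that row's date (current day plus the 6 days before it).
rolling = conn.execute("""
SELECT h.DateKey,
       h.HotDogCount,
       (SELECT SUM(p.HotDogCount)
          FROM HotDogConsumption p
         WHERE p.DateKey BETWEEN date(h.DateKey, '-6 days') AND h.DateKey
       ) AS HotDogsLast7Days
  FROM HotDogConsumption h
 ORDER BY h.DateKey
""").fetchall()

for row in rolling:
    print(row)
```

On 2020-09-28 the window covers 09-22 through 09-28, so the missing day simply contributes nothing, whereas a ROWS 6 PRECEDING frame would have reached back to 09-21.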

Best Practice in Scenario

I am currently trying to accomplish the following:
get the last weekstamp of each of the last 6 months. The following illustrates how the end result might look:
Month | Weekstamp |
2013-12| 2013-52 |
2014-01| 2014-05 |
.... and so on
I have an auxiliary table which has all weeks in it and lets me join to a calendar table, which in turn has all months, meaning I am able to get all weekstamps per month,
but how do I get the last week number of each of the last 6 months?
My idea was a temporary table of some sort (I have never used one, and am a beginner when it comes to SQL)
which calculates the weekstamps to be filtered per month, and then returns only the values which I could then use to filter a query containing all the data I need.
Anybody have a better idea?
As I said, I am just a beginner, so I can't really say what the best way would be.
Thanks a lot in advance!
My guess is that your challenge is determining what the last six months are. To do this you can use a tally table (spt_values) and DateDiff to determine when the last six months are.
Depending on which DB and version you have, you can also easily do this without a calendar or weeks table.
This
WITH rnge
AS (SELECT number
FROM master..spt_values
WHERE type = 'P'
AND number > 0
AND number < 7),
dates
AS (SELECT EOMONTH(Dateadd(m, number * -1, Getdate())) dt
FROM rnge)
SELECT Year(dt) year,
Month(dt) month,
Datepart(wk, dt) week
FROM dates
Produces this output
| YEAR | MONTH | WEEK |
|------|-------|------|
| 2014 | 1 | 5 |
| 2013 | 12 | 53 |
| 2013 | 11 | 48 |
| 2013 | 10 | 44 |
| 2013 | 9 | 40 |
| 2013 | 8 | 35 |
I'll leave it to you to format the values
This assumes SQL Server 2012, since it uses EOMONTH. For previous versions of SQL Server, see "Get the last day of the month in SQL".
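For readers without SQL Server at hand, the same idea (generate the numbers 1 to 6, step back that many months, take the month-end date, then read off its week number) can be sketched with Python's built-in sqlite3 module. This is an assumption-laden translation, not the answer's code: the helper name last_weeks is hypothetical, a recursive CTE stands in for spt_values, and SQLite's strftime('%W') numbers weeks from the first Monday of the year, so its week numbers will not necessarily match DATEPART(wk, ...).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def last_weeks(reference_date, months=6):
    """Return (month, last day of month, week number) for the previous months."""
    sql = """
    WITH RECURSIVE rnge(n) AS (
        SELECT 1
        UNION ALL
        SELECT n + 1 FROM rnge WHERE n < :months
    ),
    dates AS (
        -- First day of the month n months back, plus one month, minus one day
        -- gives the last day of that month.
        SELECT date(:ref, 'start of month', '-' || n || ' months',
                    '+1 month', '-1 day') AS dt
        FROM rnge
    )
    SELECT strftime('%Y-%m', dt) AS month,
           dt                    AS last_day,
           strftime('%W', dt)    AS week
    FROM dates
    ORDER BY dt DESC
    """
    return conn.execute(sql, {"ref": reference_date, "months": months}).fetchall()

for row in last_weeks("2014-02-15"):
    print(row)
```

Passing a fixed reference date instead of the current date keeps the result reproducible; in production the current date would take its place.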

Transforming a 2 column SQL table into 3 columns, column 3 lagged on 2

Here's my problem: I want to write a query (that goes into a larger query) that takes a table like this;
ID | DATE
A | 1
A | 2
A | 3
B | 1
B | 2
and so on, and transforms it into;
ID | DATE1 | DATE2
A | 1 | 2
A | 2 | 3
A | 3 | NOW
B | 1 | 2
B | 2 | NOW
Where the numbers are dates, and NOW() is always appended to the most recent date. Given free rein I would do this in Python, but unfortunately this goes into a larger query. We're using SyBase's SQL Anywhere 12, I think? I interact with the database using SQuirreL SQL.
I'm very stumped. I thought (SQL query to transform a list of numbers into 2 columns) would help, but I'm afraid I don't know enough to make it work. I was thinking of JOINing the table to itself, but I don't know how to SELECT for only the A-1-2 rows instead of the A-1-3 rows as well, for instance, or how to insert the NOW() value into it. Does anyone have any ideas?
I made an SQL Fiddle to outline a solution for your example. You mentioned dates but used integers, so I chose to do an integer example; it can be modified. I wrote it in PostgreSQL, so the coalesce() function can be substituted with nvl() or similar. Also, the parameter '0' can be substituted with any value, including now(), but then you must change the data type of the "i" column in the table to a date as well. Please let me know if you need further help with this.
select a.id, a.i, coalesce(min(b.i),'0') from
test a
left join test b on b.id=a.id and a.i<b.i
group by a.id,a.i
order by a.id, a.i
http://sqlfiddle.com/#!15/f1fba/6
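The self-join above pairs each row with the smallest later value for the same id, and the most recent row (which has no later value) falls through to the default. A minimal runnable sketch of that idea, using Python's built-in sqlite3 module as a stand-in for SQL Anywhere or PostgreSQL, and the literal string 'NOW' as an assumed placeholder for the current timestamp:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE test (id TEXT, d INTEGER);
INSERT INTO test VALUES ('A', 1), ('A', 2), ('A', 3), ('B', 1), ('B', 2);
""")

# MIN(b.d) picks only the next value after a.d, which is what excludes
# pairs such as A-1-3; rows with no later value get the placeholder.
pairs = conn.execute("""
SELECT a.id,
       a.d                        AS date1,
       COALESCE(MIN(b.d), 'NOW')  AS date2
  FROM test a
  LEFT JOIN test b
    ON b.id = a.id AND b.d > a.d
 GROUP BY a.id, a.d
 ORDER BY a.id, a.d
""").fetchall()

for row in pairs:
    print(row)
```

With a real date column, the placeholder would be replaced by the database's current-timestamp expression so that both columns keep the same type.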

How can I see if a date is on a weekend?

I have a table:
ID | Name | TDate
1 | John | 1 May 2013, 8:67AM
2 | Jack | 2 May 2013, 6:43AM
3 | Adam | 3 May 2013, 9:53AM
4 | Max | 4 May 2013, 2:13AM
5 | Leny | 5 May 2013, 5:33AM
I need a query that will return all the items where TDate is a weekend. How would I write such a
query?
WHAT I HAVE SO FAR
select
table.*,
EXTRACT (DAY FROM table.tdate )
from table
I did a select using EXTRACT to just see if I can get the right values. However, EXTRACT with the parameter DAY returns the day of the month. If I instead use WEEKDAY, as per the documentation here, then I get error:
ERROR: timestamp units "weekday" not recognized
SQL state: 22023
limit 1250
EDIT
TDate has a data type of datetime (timestamp). I just wrote it like that for easy reading. But regardless of the type, I could easily cast between types if need be.
I know that 4 May and 5 May are weekend days (they fall on a Saturday and a Sunday). Does Firebird offer a way to write a query that returns only the dates that fall on weekends?
try this:
SELECT ID, Name, TDate
FROM your_table
WHERE EXTRACT(WEEKDAY FROM TDate) IN (6,0)
UPDATE
The condition must be (0,6), not (0,1): in Firebird, EXTRACT(WEEKDAY FROM ...) returns 0 for Sunday and 6 for Saturday.
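The same filter can be tried out quickly with Python's built-in sqlite3 module. This is only an analogue under stated assumptions: SQLite has no EXTRACT(WEEKDAY ...), but strftime('%w', ...) uses the same numbering (0 for Sunday through 6 for Saturday) and returns it as text. The table name visits and the times of day are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE visits (ID INTEGER PRIMARY KEY, Name TEXT, TDate TEXT);
-- Dates follow the question; the times of day are made up.
INSERT INTO visits VALUES
  (1, 'John', '2013-05-01 08:07:00'),
  (2, 'Jack', '2013-05-02 06:43:00'),
  (3, 'Adam', '2013-05-03 09:53:00'),
  (4, 'Max',  '2013-05-04 02:13:00'),
  (5, 'Leny', '2013-05-05 05:33:00');
""")

# strftime('%w') yields '0' (Sunday) .. '6' (Saturday) as text.
weekend_rows = conn.execute("""
SELECT ID, Name, TDate
  FROM visits
 WHERE strftime('%w', TDate) IN ('0', '6')
 ORDER BY ID
""").fetchall()

for row in weekend_rows:
    print(row)
```

Only the rows for 4 May and 5 May 2013 (Saturday and Sunday) come back.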

yet another date gap-fill SQL puzzle

I'm using Vertica, which precludes me from using CROSS APPLY, unfortunately. And apparently there's no such thing as CTEs in Vertica.
Here's what I've got:
t:
day | id | metric | d_metric
-----------+----+--------+----------
2011-12-01 | 1 | 10 | 10
2011-12-03 | 1 | 12 | 2
2011-12-04 | 1 | 15 | 3
Note that on the first day, the delta is equal to the metric value.
I'd like to fill in the gaps, like this:
t_fill:
day | id | metric | d_metric
-----------+----+--------+----------
2011-12-01 | 1 | 10 | 10
2011-12-02 | 1 | 10 | 0 -- a delta of 0
2011-12-03 | 1 | 12 | 2
2011-12-04 | 1 | 15 | 3
I've thought of a way to do this day by day, but what I'd really like is a solution that works in one go.
I think I could get something working with LAST_VALUE, but I can't come up with the right JOIN statements that will let me properly partition and order on each id's day-by-day history.
edit:
assume I have a table like this:
calendar:
day
------------
2011-01-01
2011-01-02
...
that can be involved with joins. My intent would be to maintain the date range in calendar to match the date range in t.
edit:
A few more notes on what I'm looking for, just to be specific:
In generating t_fill, I'd like to exactly cover the date range in t, as well as any dates that are missing in between. So a correct t_fill will start on the same date and end on the same date as t.
t_fill has two properties:
1) once an id appears on some date, it will always have a row for each later date. This is the gap-filling implied in the original question.
2) Should no row for an id ever appear again after some date, the t_fill solution should merrily generate rows with the same metric value (and 0 delta) from the date of that last data point up to the end date of t.
A solution might backfill earlier dates up to the start of the date range in t. That is, for any id that appears after the first date in t, rows between the first date in t and the first date for the id will be filled with metric=0 and d_metric=0. I don't prefer this kind of solution, since it has a higher growth factor for each id that enters the system. But I could easily deal with it by selecting into a new table only rows where metric!=0 and d_metric!=0.
This is about what Jonathan Leffler proposed, but in old-fashioned low-level SQL (without fancy CTEs, window functions, or aggregating subqueries):
SET search_path='tmp';
DROP TABLE ttable CASCADE;
CREATE TABLE ttable
( zday date NOT NULL
, id INTEGER NOT NULL
, metric INTEGER NOT NULL
, d_metric INTEGER NOT NULL
, PRIMARY KEY (id,zday)
);
INSERT INTO ttable(zday,id,metric,d_metric) VALUES
('2011-12-01',1,10,10)
,('2011-12-03',1,12,2)
,('2011-12-04',1,15,3)
;
DROP TABLE ctable CASCADE;
CREATE TABLE ctable
( zday date NOT NULL
, PRIMARY KEY (zday)
);
INSERT INTO ctable(zday) VALUES
('2011-12-01')
,('2011-12-02')
,('2011-12-03')
,('2011-12-04')
;
CREATE VIEW v_cte AS (
SELECT t.zday,t.id,t.metric,t.d_metric
FROM ttable t
JOIN ctable c ON c.zday = t.zday
UNION
SELECT c.zday,t.id,t.metric, 0
FROM ctable c, ttable t
WHERE t.zday < c.zday
AND NOT EXISTS ( SELECT *
FROM ttable nx
WHERE nx.id = t.id
AND nx.zday = c.zday
)
AND NOT EXISTS ( SELECT *
FROM ttable nx
WHERE nx.id = t.id
AND nx.zday < c.zday
AND nx.zday > t.zday
)
)
;
SELECT * FROM v_cte;
The results:
zday | id | metric | d_metric
------------+----+--------+----------
2011-12-01 | 1 | 10 | 10
2011-12-02 | 1 | 10 | 0
2011-12-03 | 1 | 12 | 2
2011-12-04 | 1 | 15 | 3
(4 rows)
I am not a Vertica user, but if you do not want to use its native support for gap filling, here you can find a more generic SQL-only solution.
If you want to use something like a CTE, how about using a temporary table? Essentially, a CTE is a view for a particular query.
Depending on your needs you can make the temporary table transaction or session-scoped.
I'm still curious to know why gap-filling with constant-interpolation wouldn't work here.
Given the complete calendar table, it is doable, though not exactly trivial. Without the calendar table, it would be a lot harder.
Your query needs to be stated moderately precisely, which is usually half the battle in any issue with 'how to write the query'. I think you are looking for:
For each date in Calendar between the minimum and maximum dates represented in T (or other stipulated range),
For each distinct ID represented in T,
Find the metric for the given ID for the most recent record in T on or before the date.
This gives you a complete list of dates with metrics.
You then need to self-join two copies of that list with dates one day apart to form the deltas.
Note that if some ID values don't appear at the start of the date range, they won't show up.
With that as guidance, you should be able to get going, I believe.
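The steps above can be sketched as follows, using Python's built-in sqlite3 module as a convenient stand-in (an assumption; Vertica syntax for date arithmetic and views will differ). The view name filled is hypothetical. It builds the complete date-by-id list with the most recent metric on or before each date, then self-joins that list one day apart to form the deltas.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (day TEXT, id INTEGER, metric INTEGER, d_metric INTEGER);
INSERT INTO t VALUES
  ('2011-12-01', 1, 10, 10),
  ('2011-12-03', 1, 12, 2),
  ('2011-12-04', 1, 15, 3);

CREATE TABLE calendar (day TEXT PRIMARY KEY);
INSERT INTO calendar VALUES
  ('2011-12-01'), ('2011-12-02'), ('2011-12-03'), ('2011-12-04');

-- Every calendar date in t's range, for every id, with the metric of the
-- most recent record on or before that date (NULL before an id first appears).
CREATE VIEW filled AS
SELECT c.day,
       i.id,
       (SELECT x.metric
          FROM t x
         WHERE x.id = i.id AND x.day <= c.day
         ORDER BY x.day DESC
         LIMIT 1) AS metric
  FROM calendar c
 CROSS JOIN (SELECT DISTINCT id FROM t) i
 WHERE c.day BETWEEN (SELECT MIN(day) FROM t) AND (SELECT MAX(day) FROM t);
""")

# Self-join the filled list with itself one day earlier to compute deltas.
t_fill = conn.execute("""
SELECT cur.day,
       cur.id,
       cur.metric,
       cur.metric - COALESCE(prev.metric, 0) AS d_metric
  FROM filled cur
  LEFT JOIN filled prev
    ON prev.id = cur.id AND prev.day = date(cur.day, '-1 day')
 WHERE cur.metric IS NOT NULL
 ORDER BY cur.id, cur.day
""").fetchall()

for row in t_fill:
    print(row)
```

Filtering on cur.metric IS NOT NULL drops dates before an id first appears, which matches the no-backfill behaviour the question prefers.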