Can I use SQL to plot actual dates based on schedule information? - sql

If I have a table containing schedule information that implies particular dates, is there a SQL statement that can be written to convert that information into actual rows, using some sort of CROSS JOIN, perhaps?
Consider a payment schedule table with these columns:
StartDate - the date the schedule begins (1st payment is due on this date)
Term - the length in months of the schedule
Frequency - the number of months between recurrences
PaymentAmt - the payment amount :-)
SchedID StartDate Term Frequency PaymentAmt
-------------------------------------------------
1 05-Jan-2003 48 12 1000.00
2 20-Dec-2008 42 6 25.00
Is there a single SQL statement to allow me to go from the above to the following?
Running
SchedID Payment Due Expected
Num Date Total
--------------------------------------
1 1 05-Jan-2003 1000.00
1 2 05-Jan-2004 2000.00
1 3 05-Jan-2005 3000.00
1 4 05-Jan-2006 4000.00
1 5 05-Jan-2007 5000.00
2 1 20-Dec-2008 25.00
2 2 20-Jun-2009 50.00
2 3 20-Dec-2009 75.00
2 4 20-Jun-2010 100.00
2 5 20-Dec-2010 125.00
2 6 20-Jun-2011 150.00
2 7 20-Dec-2011 175.00
I'm using MS SQL Server 2005 (no hope for an upgrade soon) and I can already do this using a table variable and while loop, but it seemed like some sort of CROSS JOIN would apply but I don't know how that might work.
Your thoughts are appreciated.
EDIT: I'm actually using SQL Server 2005 though I initially said 2000. We aren't quite as backwards as I thought. Sorry.

I cannot test the code right now, so take it with a pinch of salt, but I think that something looking more or less like the following should answer the question:
with q(SchedId, PaymentNum, DueDate, RunningExpectedTotal) as
(select SchedId,
1 as PaymentNum,
StartDate as DueDate,
PaymentAmt as RunningExpectedTotal
from PaymentScheduleTable
union all
select q.SchedId,
1 + q.PaymentNum as PaymentNum,
DATEADD(month, s.Frequency, q.DueDate) as DueDate,
q.RunningExpectedTotal + s.PaymentAmt as RunningExpectedTotal
from q
inner join PaymentScheduleTable s
on s.SchedId = q.SchedId
where q.PaymentNum <= s.Term / s.Frequency)
select *
from q
order by SchedId, PaymentNum

Try using a table of integers (or better this: http://www.sql-server-helper.com/functions/integer-table.aspx) and a little date math, e..g. start + int * freq

I've used table-valued functions to achieve a similar result. Basically the same as using a table variable I know, but I remember being really pleased with the design.
The usage ends up reading very well, in my opinion:
/* assumes #startdate and #enddate schedule limits */
SELECT
p.paymentid,
ps.paymentnum,
ps.duedate,
ps.ret
FROM
payment p,
dbo.FUNC_get_payment_schedule(p.paymentid, #startdate, #enddate) ps
ORDER BY p.paymentid, ps.paymentnum

A typical solution is to use a Calendar table. You can expand it to fit your own needs, but it would look something like:
CREATE TABLE Calendar
(
calendar_date DATETIME NOT NULL,
is_holiday BIT NOT NULL DEFAULT(0),
CONSTRAINT PK_Calendar PRIMARY KEY CLUSTERED calendar_date
)
In addition to the is_holiday you can add other columns that are relevant for you. You can write a script to populate the table up through the next 10 or 100 or 1000 years and you should be all set. It makes queries like that one that you're trying to do much simpler and can give you additional functionality.

Related

Create a DB2 Calendar table for 20 years with columns dependant on the original date

I'm trying to create a calendar table for 20 years ranging from 2000 - 2020. The aim is to have one row per day along with some other columns that will use logic based on the calendar date generated. An example of this would be having One column as the calendar date (2000-01-01) and the year column reading the year from the values within the calendar date column (2000).
The code for the table is below:
CREATE TABLE TEST.CALENDAR(
CALENDAR_DATE DATE NOT NULL,
CALENDAR_YEAR INTEGER NOT NULL,
CALENDAR_MONTH_NUMBER INTEGER NOT NULL,
CALENDAR_MONTH_NAME VARCHAR(100),
CALENDAR_DAY_OF_MONTH INTEGER NOT NULL,
CALENDAR_DAY_OF_WEEK INTEGER NOT NULL,
CALENDAR_DAY_NAME VARCHAR(100),
CALENDAR_YEAR_MONTH INTEGER NOT NULL);
At the moment, I have a bunch of insert statements that manually insert rows for this table over 20 years. I'm looking to make an insert statement with variables instead and this insert statement would insert data in daily increments until the start date variable is not less than the end date variable.
Currently, I cannot get this to work at all let alone include any logic for any other columns.
Code for the variable insert statement:
declare startdate DATE, enddate DATEset startdate = '2000-01-01'
set enddate = DATEADD(yy,20,startdate)
while startdate < enddate
begin insert into TEST.CALENDAR (CALENDAR_DATE) select startdate
set startdate = DATEADD(dd,1,startdate) end
Would anyone have any ideas of how I can get this to work?
You can do this with a DB2 recursive query and date functions:
Consider:
with cte (
calendar_date,
calendar_year,
calendar_month_number,
calendar_month_name,
calendar_day_of_month,
calendar_day_of_week,
calendar_day_name
) as (
select
calendar_date,
year(calendar_date),
month(calendar_date),
monthname(calendar_date),
dayofmonth(calendar_date),
dayofweek(calendar_date),
dayname(calendar_date)
from (values(date('2000-01-01'))) as t(calendar_date)
union all
select
calendar_date + 1,
year(calendar_date + 1),
month(calendar_date + 1),
monthname(calendar_date + 1),
dayofmonth(calendar_date + 1),
dayofweek(calendar_date + 1),
dayname(calendar_date + 1)
from cte where calendar_date < date('2021-01-01')
)
select * from cte
Note: it is unclear to me what column CALENDAR_YEAR_MONTH means, so I left it apart.
Demo on DB Fiddle for the first 10 days:
CALENDAR_DATE | CALENDAR_YEAR | CALENDAR_MONTH_NUMBER | CALENDAR_MONTH_NAME | CALENDAR_DAY_OF_MONTH | CALENDAR_DAY_OF_WEEK | CALENDAR_DAY_NAME
------------: | ------------: | --------------------: | ------------------: | --------------------: | -------------------: | ----------------:
2000-01-01 | 2000 | 1 | January | 1 | 7 | Saturday
2000-01-02 | 2000 | 1 | January | 2 | 1 | Sunday
2000-01-03 | 2000 | 1 | January | 3 | 2 | Monday
2000-01-04 | 2000 | 1 | January | 4 | 3 | Tuesday
2000-01-05 | 2000 | 1 | January | 5 | 4 | Wednesday
2000-01-06 | 2000 | 1 | January | 6 | 5 | Thursday
2000-01-07 | 2000 | 1 | January | 7 | 6 | Friday
2000-01-08 | 2000 | 1 | January | 8 | 7 | Saturday
2000-01-09 | 2000 | 1 | January | 9 | 1 | Sunday
2000-01-10 | 2000 | 1 | January | 10 | 2 | Monday
Problem • Relational Knowledge
Currently, I cannot get this to work at all let alone include any logic for any other columns.
Well, there is a reason for that:
Since the Relational Model is founded on First Order Predicate Calculus (aka First Order Logic)
there is nothing in the universe that cannot be defined in terms of the RM, and stored in a Relational database (ie. one that complies with the RM), such that it can be retrieved easily and without limitation (including complex queries and DW).
Since SQL is the data sub-language for the RM, and it is Set-oriented
there is nothing, no code requirement, that cannot be implemented in Set-oriented SQL.
DB2 is a Relational database engine, with genuine Set-oriented processing, using SQL.
It appears that you are not aware of that. Instead, you are attempting to:
define low-level data structures
that you think you need for your programming,
rather than ones that are required within the RM and SQL,
that define the data, as data, and nothing but data.
write code that you need to manipulate those data structures, which is:
(a) procedural (one row at a time; WHILE loops; CURSORS; similar abominations), instead of Set-oriented, and
(b) thus the code is consequently complex, if not impossible.
Not to mention, slow as maple syrup in winter
Rather than using the available blindingly fast, Set-oriented code, which will be simple and straight-forward.
The problem may be a bit tricky, but the tables and the code required are not.
Problem • Platform Knowledge
An example of this would be having One column as the calendar date (2000-01-01) and the year column reading the year from the values within the calendar date column (2000)
That breaks two principles, and results in massive duplication within each row:
Use the correct datatype for the datum, as you have with CALENDAR_DATE. Only.
It is a DATE datatype
Using the built-in DATE datatype and DATE_PART(), DATEADD() functions means that DB2 controls the year; month; day; etc values
and all DATE errors such as 29-Feb-2019 and 31-Sep-2019 are prevented
as opposed to your code, which may have one or more errors.
a column that contains any part of a date must be stored as a DATE datatype (any part of a time as TIME; date and time as DATETIME; etc).
It breaks Codd's First Normal Form (as distinct from the ever-changing insanity purveyed as "1NF" by the pretenders)
.
Each domain [column, datum] must be Atomic wrt the platform
.
DB2 handles DATE and DATE_PART() perfectly, so there is no excuse.
All the following columns are redundant, duplicates of CALENDAR_DATE *in the same row`:
CALENDAR_YEAR
CALENDAR_MONTH_NUMBER
CALENDAR_MONTH_NAME
CALENDAR_DAY_OF_MONTH
CALENDAR_DAY_OF_WEEK
CALENDAR_DAY_NAME
CALENDAR_YEAR_MONTH
Kind of like buying a car (CALENDAR_DATE), putting it drive, and then walking beside it (7 duplicated DATE parts). You need to understand the platform; SQL, and trust it a little.
Not only that, but you will have a lot of fun and no hair left, trying to populate those duplicate columns out without making mistakes.
It needs to be said, you need to know all the date Functions in DB2 reasonably well, in order to use it proficiently.
They are duplicate columns because the values can be derived easily via DATE_PART(), something like:
SELECT DATE,
Year = DATE_PART( 'YEAR', DATE ),
MonthNo = DATE_PART( 'MONTH', DATE ),
MonthName = SUBSTR( ‘January February March April May June July August SeptemberOctober November December ‘,
( DATE_PART( 'MONTH', DATE ) - 1 ) * 9 + 1, 9 ),
DayOfMonth = DATE_PART( 'DAY', DATE ),
DayOfWeek = DATE_PART( 'DOW', DATE ),
DayName = SUBSTR( ‘Sunday Monday Tuesday WednesdayThursday Friday Saturday',
( DATE_PART( 'DOW', DATE ) - 1 ) * 9 + 1, 9 ),
YearMonth = DATE_PART( 'YEAR', DATE ) * 100 + DATE_PART( 'MONTH', DATE )
FROM TEST.CALENDAR
In Sybase, I do not have to use SUBSTR() because I have the MonthName and DayName values in tables, the query is simpler still. Or else use CASE.
Do not prefix the columns in each table with the table name. In SQL, to reference a column in a particular table, in order to resolve ambiguity, use:
.
TableName.ColumnName
.
Same as prefixing the table name with an user name TEST.CALENDAR.
The full specification is as follows, with DB2 supplying the relevant defaults based on the context of the command:
.
[SERVER.][Database.][Owner.][Table.]Column
The reason for this rule is this. Columns in one table may well be related to the same column in another table, and should be named the same. That is the nature of Relational. If you break this rule, it will retard your progressive understanding of the Relational Model, and of SQL, its data sub-language.
Problem • Data Knowledge
The aim is to have one row per day along with some other columns that will use logic based on the calendar date generated.
Why on earth would you do that ?
We store Facts about the universe in a Relational database. Only.
We do not need to store non-facts, such as:
Kyle's name isNOT"Fred"
CustomerCode "IBX" doesNOTexist.
A non-fact is simply the absence of a stored Fact.
If Fred does not exist in the Person table, and you
SELECT ... FROM Person WHERE Name = "Fred"
you will obtain an empty result set.
As it should be.
You are storing the grid that you imagine, consisting of
20 years
* 365 days
* whatever Key is relevant [eg. CustomerCode, etc),
in the form of rows.
That will only keep the database chock-full of empty rows, storing non-facts such as [eg.] CustomerCode XYZ has no event on each date for the next 20 years.
What you imagine, is the result set, or the view, which you may construct in the GUI app. It is not the storage.
Store only Facts, [eg.] actual Events per Customer.
Solution
Now for the solution.
Let me assure you that I have implemented this structure, upon which fairly complex logic has been built, in quite a few databases.
- The problem is, educating the developers in order to get them to write correct SQL code.
- Your situation is that of a developer, trying to not only write non-simple code, but to define the structures upon which it depends.
- Two distinct and different sciences.
Data Model
First, a visual data model, in order to understand the data properly.
All my data models are rendered in IDEF1X, the Standard for modelling Relational databases since 1993
My IDEF1X Introduction is essential reading for beginners.
SQL DDL
Only because you appear to work at that level:
CREATE TABLE TEST.Customer (
CustomerCode CHAR(6) NOT NULL,
Name CHAR(30) NOT NULL,
CONSTRAINT PK
PRIMARY KEY ( CustomerCode )
CONSTRAINT AK1
UNIQUE ( Name )
...
);
CREATE TABLE TEST.Event (
CustomerCode CHAR(6) NOT NULL,
Date DATE NOT NULL,
Event CHAR(30) NOT NULL,
CONSTRAINT pk
PRIMARY KEY ( CustomerCode, Date )
CONSTRAINT Customer_Schedules_Event
FOREIGN KEY ( CustomerCode )
REFERENCES Customer ( CustomerCode )
...
);
INSERT a row only when a Customer reserves a Date
Non-facts are not stored.
SELECT ... WHERE CustomerCode = ... AND Date = ...
will obtain a row if the Customer has booked an Event on that Date
will obtain nothing (empty result set) if the Customer has NOT booked an Event on that Date
If you need to store recurring Events, such as a birthday for the next 20 years, use a Projection in SQL to generate the 20 INSERT commands, which is the Set-oriented method.
If you cannot code a Projection, and only then, write a WHILE loop, which is procedural, one row per execution.
Please feel free to ask questions, the more specific the better.
As you can see, this Question is really about How to set up a Calendar for Events, but I won't change the title until I am sure this answer is what you are seeking. And about modelling data for a Relational database. I will add the tag.

Time between date. (More advanced than just Datediff)

I have a table that contains Guest_ID and Trip_Date. I have been tasked with trying to find out for each Guest_ID how many times they have had over 365 days between trips. I know that for the time between the dates I can use datediff formula but I am unsure of how to get the dates plugged in properly. I think if I can get help with this part I can do the rest.
For each time this happened I need to report back Guest_ID, Prior_Last_Trip, New_Trip, days between. This data goes back for over a decade so it is possible for a Guest to have multiple periods of over a year between visits.
I was thinking of just loading a table with that data that can be queried later. That way once I figure out how to make this work the first time I can setup a stored procedure or trigger to check for new occurrences of this and populate the table.
I was not sure were to begin on this code. I was thinking recursion might be the answer but I do not know recursion just that it exist.
This table is quite large. Around 1.5 million unique Guest_ID's with over 30 million trips.
I am using SQL Server 2012. If there is anything else I can add to help this let me know. I will edit and update this as I have ideas on how to make this work myself.
Edit 1: Sample Data and Desired Results
Guest_ID Trip_Date
1 1/1/2013
1 2/5/2013
1 12/5/2013
1 1/1/2015
1 6/5/2015
1 8/1/2017
1 10/2/2017
1 1/6/2018
1 6/7/2018
1 7/1/2018
1 7/5/2018
2 1/1/2018
2 2/6/2018
2 4/2/2018
2 7/3/2018
3 1/1/2014
3 6/5/2014
3 9/4/2014
Guest_ID Prior_Last_Trip New_Trip DaysBetween
1 12/5/2013 1/1/2015 392
1 6/5/2015 8/1/2017 788
So you can see that Guest 1 had 2 different times where they did not have a trip for over a year and that those two instances are recorded in the results. Guest 2 never had a gap of over a year and therefor has no records in the results. Guest 3 has not had a trip in over a year but without have the return trip currently does not qualify for the result set. Should Guest 3 ever make another trip they would then be added to the result set.
Edit 2: Working Query
Thanks to #Code4ml I got this working. Here is the complete query.
Select
Guest_ID, CurrentTrip, DaysBetween, Lasttrip
From (
Select
Guest_ID
,Lag(Trip_Date,1) Over(Partition by Guest_ID Order by Trip_Date) as LastTrip
,Trip_Date as CurrentTrip
,DATEDIFF(d,Lag(Trip_Date,1) Over(Partition by Guest_ID Order by Trip_Date),Trip_Date) as DaysBetween
From UCS
) as A
Where DaysBetween > 365
You may try SQL LAG function to access previous trip date like below.
SELECT guest_id, trip_date,
LAG (trip_date,1) OVER (PARTITION BY guest_id ORDER BY trip_date desc) AS prev_trip_date
FROM tripsTable
Now you can use this as a subquery to calculate number of days between trips and filter the data as required.

How can I return rows which match on some columns and fulfil a DateTime comparison between two other columns using SQL?

I have a table which contains rows for jobs, example below, where 01/01/1980 is used rather than null in the ClosedDate column for jobs which are not finished:
JobNumber JobCategory CustomerID CreatedDate ClosedDate
1 Small 1 01/01/2016 03/01/2016
2 Small 2 03/01/2016 07/01/2016
3 Large 2 06/01/2016 07/01/2016
4 Medium 1 08/01/2016 10/01/2016
5 Small 3 10/01/2016 01/01/1980
6 Medium 3 15/01/2016 01/01/1980
7 Large 2 16/01/2016 17/01/2016
8 Large 2 19/01/2016 20/01/2016
9 Small 1 19/01/2016 01/01/1980
10 Medium 2 19/01/2016 01/01/1980
I need to return a list of any jobs where the same customer has had a job of the same category created within 3 days of the previous job being closed.
So, I would want to return:
7 Large 2 16/01/2016 17/01/2016
8 Large 2 19/01/2016 20/01/2016
because Customer 2 had a Large job closed on 17/01/2016 and another Large job opened on 19/01/2016, which is within 3 days.
In order to do this, I assume I need to compare each record in the table with each subsequent record, looking for a match on JobCategory and comparing CreatedDate with ClosedDate between rows.
Can anyone advise my best option for this using SQL? I'm using SQL Server 2012.
The first thing that you should do is get rid of "magic dates" in your system. If the job hasn't been closed yet then the ClosedDate is not known. SQL has a value for exactly that - NULL. That prevents anyone in the future from having to know the magic date of 1/1/1980 or from that having to be hard-coded throughout your system.
Next, you don't have to compare each row with each one after it. Define what you're looking for and find matches that meet those qualifications. You didn't specify which type of SQL Server you're using (you should tag your question with Oracle or MySQL or SQL Server), so the below query is written for SQL Server. Your version might have different date functions.
SELECT
J1.JobNumber,
J1.JobCategory,
J1.CustomerID,
J1.CreatedDate,
J1.ClosedDate,
J2.JobNumber,
J2.CreatedDate,
J2.ClosedDate
FROM
Jobs J1
INNER JOIN Jobs J2 ON
J2.CustomerID = J1.CustomerID AND
J2.JobCategory = J1.JobCategory AND
DATEDIFF(DAY, J1.ClosedDate, J2.CreatedDate) BETWEEN 0 AND 3 AND
J2.JobNumber <> J1.JobNumber
This will return the jobs in a single row instead of two rows. If that's a problem then the query could be altered slightly to do so. This can also be done a little more easily with windowed functions, but again, since you didn't specify your SQL vendor I didn't want to use those.
Since you're using SQL Server, you should be able to use windowed functions like so:
;WITH CTE_JobsWithDates AS -- Probably a poor name for the CTE
(
SELECT
JobNumber,
JobCategory,
CustomerID,
CreatedDate,
ClosedDate,
LEAD(CreatedDate, 1) OVER (PARTITION BY JobCategory, CustomerID ORDER BY CreatedDate) AS NextCreatedDate,
LAG(ClosedDate, 1) OVER (PARTITION BY JobCategory, CustomerID ORDER BY CreatedDate) AS PreviousClosedDate
FROM
Jobs
)
SELECT
JobNumber,
JobCategory,
CustomerID,
CreatedDate,
ClosedDate
FROM
CTE_JobsWithDates
WHERE
DATEDIFF(DAY, ClosedDate, NextCreatedDate) BETWEEN 0 AND 3 OR
DATEDIFF(DAY, LastClosedDate, CreatedDate) BETWEEN 0 AND 3
That was off the cuff, so please test and let me know if anything isn't quite right.
Try:
SELECT a.*
FROM
Job AS a
JOIN
Job AS b ON
a.CustomerID = b.CustomerID AND a.JobCategory = b.JobCategory
WHERE
a.JobNumber != b.JobNumber
AND (
b.CreatedDate - a.ClosedDate BETWEEN 0 AND 3
OR
a.CreatedDate - b.ClosedDate BETWEEN 0 AND 3)

Join to Calendar Table - 5 Business Days

So this is somewhat of a common question on here but I haven't found an answer that really suits my specific needs. I have 2 tables. One has a list of ProjectClosedDates. The other table is a calendar table that goes through like 2025 which has columns for if the row date is a weekend day and also another column for is the date a holiday.
My end goal is to find out based on the ProjectClosedDate, what date is 5 business days post that date. My idea was that I was going to use the Calendar table and join it to itself so I could then insert a column into the calendar table that was 5 Business days away from the row-date. Then I was going to join the Project table to that table based on ProjectClosedDate = RowDate.
If I was just going to check the actual business-date table for one record, I could use this:
SELECT actual_date from
(
SELECT actual_date, ROW_NUMBER() OVER(ORDER BY actual_date) AS Row
FROM DateTable
WHERE is_holiday= 0 and actual_date > '2013-12-01'
ORDER BY actual_date
) X
WHERE row = 65
from here:
sql working days holidays
However, this is just one date and I need a column of dates based off of each row. Any thoughts of what the best way to do this would be? I'm using SQL-Server Management Studio.
Completely untested and not thought through:
If the concept of "business days" is common and important in your system, you could add a column "Business Day Sequence" to your table. The column would be a simple unique sequence, incremented by one for every business day and null for every day not counting as a business day.
The data would look something like this:
Date BDAY_SEQ
========== ========
2014-03-03 1
2014-03-04 2
2014-03-05 3
2014-03-06 4
2014-03-07 5
2014-03-08
2014-03-09
2014-03-10 6
Now it's a simple task to find the N:th business day from any date.
You simply do a self join with the calendar table, adding the offset in the join condition.
select a.actual_date
,b.actual_date as nth_bussines_day
from DateTable a
join DateTable b on(
b.bday_seq = a.bday_seq + 5
);

SQL SELECT statement, DATEDIFF only if dates in between are/are not in a workdays table

I'm new to SQL, right off the bat but slowly getting things figured out. I searched but couldn't find anything that makes sense to me. Like I said, I'm new, I'm sorry. Using SQL Server 2008 R2 if that's pertinent.
I have one table that's structured like this
AccountNumber StartDate EndDate
123456 2012-05-07 12:55:55.000 2012-05-09 09:53:55.000
789012 2012-10-17 13:55:55.000 2012-11-02 10:53:55.000
345678 2012-11-13 14:55:55.000 2012-12-14 11:53:55.000
I also have a workdays table that's structured like this
Date IsWorkday DayOfWeek
2012-01-01 N 1
2012-01-02 N 2
2012-01-03 Y 3
2012-01-04 Y 4
2012-01-05 Y 5
2012-01-06 Y 6
Etc. through 2020-12-31
I didn't set up the tables and can't get them changed in any way. I tried, I won't be allowed. I can't even email the db admin directly. Neat.
I need select certain columns, which I can handle, but I need one column at the end that will be a count of the number of days between the StartDate and EndDate values, but only if the days in between are in the workdays table with a value of Y for IsWorkday. I hope that makes sense. If it's simply not possible to do or a fantastically enormous hassle to get it done with the current structure, that'd be good to know too (so I can push for a better workdays/non-workdays table or something). Any help will be much appreciated and may possibly be exchanged for my firstborn.
Try this
SELECT X.*, X2.DayCnt
FROM AccountTable X
INNER JOIN
(
SELECT a.accountnumber,a.StartDate, b.EndDate, Count(b.Date) DayCnt
FROM AccountTable a
LEFT OUTER JOIN workdays w
ON a.StartDate >= w.date and
a.EndDate <= w.date and
IsWorkday = 'Y'
GROUP BY a.accountnumber,a.StartDate, b.EndDate
) X2
ON X.accountnumber = X2.accountnumber and
X.StartDate = X2.StartDate and X.EndDate = X2.EndDate