Weighting a length of time to get a different Date each time - sql

I have an arrival date of 01/01/2010 that occurs 50 times, and I want to generate 50 random departure dates using the length-of-stay weighting guide below. As you can see, the most likely outcome is a departure 2 days later, but I cannot figure out how to write the code. Can you help?
LengthofStay LengthofStayWeighting
------------ ---------------------
1 1
2 5
3 4
4 3
5 3
6 3
7 3
8 1
9 1
10 1
I have started but have got stuck already
SELECT ArrivalDate,RAND(checksum(NEWID())) * LengthOfStay.LengthofStayWeighting AS Expr1,
ArrivalDate + Expr1 as DepartureDate
FROM Bookings, LengthOfStay
ORDER BY ArrivalDate

You may need to use DATEADD
SELECT ArrivalDate, DATEADD(day, RAND(checksum(NEWID())) * LengthOfStay.LengthofStayWeighting, ArrivalDate) AS DepartureDate
FROM Bookings, LengthOfStay
ORDER BY ArrivalDate
update: Based on your comment, I think I misunderstood the question. Is this what you need?:
SELECT ArrivalDate,
DATEADD(day, (select TOP 1 LengthofStayWeighting FROM LengthOfStay group by LengthofStayWeighting ORDER BY LengthofStayWeighting DESC), ArrivalDate) AS DepartureDate
FROM Bookings
ORDER BY ArrivalDate
Basically, you need to obtain the length that is repeated the most, in your case "1". If so, I think you need to include a foreign key:
SELECT ArrivalDate,
DATEADD(day, (select TOP 1 LengthofStayWeighting FROM LengthOfStay l WHERE b.Id = l.BookingId GROUP BY LengthofStayWeighting ORDER BY LengthofStayWeighting DESC), ArrivalDate) AS DepartureDate
FROM Bookings b
ORDER BY ArrivalDate

You are trying to pull numbers from a cumulative distribution. This requires generating a random number and then pulling from the distribution.
The following code gives an example:
with LengthOfStay as (select 1 as LengthOfStay, 1 as LengthOfStayWeighting union all
select 2 as LengthOfStay, 5 union all
select 3, 4 union all
select 4, 4
),
Bookings as (select cast('2013-01-01' as DATETIME) as ArrivalDate),
CumeLengthOfStay as
(select los.*,
(select SUM(LengthOfStayWeighting) from LengthOfStay los2 where los2.LengthOfStay <= los.LengthOfStay
) as cumeweighting
from LengthOfStay los
) -- select * from CumeLengthOfStay
SELECT ArrivalDate, clos.LengthOfStay, randnum % sumweighting, sumweighting,
ArrivalDate + clos.LengthOfStay as DepartureDate
FROM (select b.*, ABS(CAST(NEWID() AS binary(6))+0) as randnum
from Bookings b
) b cross join
(select SUM(LengthOfStayWeighting) as sumweighting from LengthOfStay) const left outer join
CumeLengthOfStay clos
on (b.randnum % const.sumweighting) between clos.cumeweighting - clos.LengthOfStayWeighting and clos.cumeweighting - 1
ORDER BY ArrivalDate
Basically, you add up the weights, generate a random number less than the total weight (using the % operator), and then look up this value in the cumulative sum of the weights.
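To sanity-check the lookup logic outside the database, here is a minimal Python sketch of the same idea, using the weights from the question. The names pick_length and random_departures are made up for illustration, and the random source is Python's random module rather than NEWID(), so treat it as a model of the technique and not a translation of the query.

```python
import bisect
import itertools
import random
from datetime import date, timedelta

# Length-of-stay weights from the question: length (days) -> weight.
WEIGHTS = {1: 1, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3, 7: 3, 8: 1, 9: 1, 10: 1}


def pick_length(r, weights=WEIGHTS):
    """Map an integer r in [0, total weight) to a length of stay."""
    lengths = sorted(weights)
    cumulative = list(itertools.accumulate(weights[n] for n in lengths))
    # First length whose cumulative weight is strictly greater than r.
    return lengths[bisect.bisect_right(cumulative, r)]


def random_departures(arrival, count, rng):
    """Draw `count` departure dates for one arrival date."""
    total = sum(WEIGHTS.values())
    return [arrival + timedelta(days=pick_length(rng.randrange(total)))
            for _ in range(count)]


departures = random_departures(date(2010, 1, 1), 50, random.Random(0))
```

Each value of r from 0 to 24 maps to exactly one length, and a length with weight w owns w of those values, which is the same condition as the BETWEEN in the join above.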

Related

MS-SQL how to add missing month in a table values

I have a table with the following entries,
ID  date          Frequency
1   '2012-04-30'  5
1   '2012-06-30'  4
1   '2012-07-31'  25
2   '2012-04-30'  7
2   '2012-05-31'  4
2   '2012-06-30'  1
2   '2012-07-31'  6
I need to add the missing months. Each added row should use the last date of that month and have a frequency value of 0.
The expected output is
ID  date          Frequency
1   '2012-04-30'  5
1   '2012-05-31'  0
1   '2012-06-30'  4
1   '2012-07-31'  25
2   '2012-04-30'  7
2   '2012-05-31'  4
2   '2012-06-30'  1
2   '2012-07-31'  6
I would suggest recursive CTEs:
with cte as (
select id, date, frequency,
lead(date) over (partition by id order by date) as next_date
from t
union all
select id, eomonth(date, 1), 0, next_date
from cte
where eomonth(date, 1) < dateadd(day, -1, next_date)
)
select id, date, frequency
from cte
order by id, date;
The anchor part of the CTE calculates the next existing date for a given row (next_date). The recursive part then keeps adding months to fill in the missing rows (and adds none if there are none). eomonth(date, 1) is just a handy way of getting the last day of the next month.
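If it helps to see the fill logic step by step, the following Python sketch mirrors it under the assumption that every existing date is already a month end. The function names are hypothetical; next_month_end plays the role of eomonth(date, 1) and the look-ahead plays the role of lead().

```python
import calendar
from datetime import date


def next_month_end(d):
    """Last day of the month after d (the role of EOMONTH(d, 1))."""
    year, month = (d.year + 1, 1) if d.month == 12 else (d.year, d.month + 1)
    return date(year, month, calendar.monthrange(year, month)[1])


def fill_missing_months(rows):
    """rows: list of (id, month_end_date, frequency) tuples."""
    out = []
    rows = sorted(rows)
    for i, (rid, d, freq) in enumerate(rows):
        out.append((rid, d, freq))
        # Next existing date for the same id, if any (the role of LEAD()).
        has_next = i + 1 < len(rows) and rows[i + 1][0] == rid
        nxt = rows[i + 1][1] if has_next else None
        filler = next_month_end(d)
        # Keep adding zero-frequency month ends until the next real row.
        while nxt is not None and filler < nxt:
            out.append((rid, filler, 0))
            filler = next_month_end(filler)
    return out
```

Run against the sample rows from the question, this adds a single row, (1, 2012-05-31, 0), and leaves ID 2 unchanged.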
Here is a db<>fiddle.
If you have all dates in the table, you can also use cross join to generate the rows and then left join to bring in the existing data:
select i.id, d.date, coalesce(t.frequency, 0) as frequency
from (select distinct id from t) i cross join
(select distinct date from t) d left join
t
on i.id = t.id and d.date = t.date
order by i.id, d.date;
If you have a large amount of data, you can compare performance. This may be a case where a recursive CTE is faster than alternative methods.

How to repeat a pattern in SQL skipping weekends?

I have a table with a column of patterns, something like '1,2,3,4', and a column named frequency that represents how many times each pattern value should repeat.
I have generated a pattern but am not able to skip weekends. Here is my current code:
;WITH TestCteNew (EmployeeId, ShiftId, StartDate, Enddate)AS (
SELECT
employeeid.n.query('.[1]').value('.', 'INT') EmployeeId,
shiftid.n.query('.[1]').value('.', 'INT') ShiftId
,StartDate, Enddate
FROM
TestCte
CROSS APPLY employeeid.nodes('x') AS employeeid(n)
CROSS APPLY shiftid.nodes('x') AS shiftid(n)
CROSS APPLY (SELECT TOP(2) ROW_NUMBER() OVER(ORDER BY (SELECT NULL))-1 r_num FROM SYS.ALL_OBJECTS A , SYS.ALL_OBJECTS B) X)
,TestCteFinal(EmployeeId, ShiftId, SDate,r_num) AS (
SELECT EmployeeId, ShiftId, StartDate + ROW_NUMBER() OVER (PARTITION BY EmployeeId ORDER BY r_num)-1 AS SD, x.r_num
FROM TestCteNew
CROSS APPLY (SELECT TOP(2) ROW_NUMBER() OVER(ORDER BY (SELECT NULL))-1 r_num FROM SYS.ALL_OBJECTS A , SYS.ALL_OBJECTS B) X)
With the above code I am able to generate something like below
Account DayOfWeek Shifts Shifts
1 20201007 100 1
2 20201107 100 1 (Saturday)
3 20201207 100 2 (Sunday)
4 20201307 100 2
5 20201407 100 3
6 20201507 100 3
7 20201607 100 4
8 20201707 100 4
...Same set of records above once again
Here the issue is my pattern is not skipping weekends, I want something like below.
DECLARE @Pattern VARCHAR(10)= '1,2,3,4', @Frequency INT=2
Account DayOfWeek Shifts Shifts
1 20201007 100 1
2 20201107 100 0 (Saturday)
3 20201207 100 0 (Sunday)
4 20201307 100 1
5 20201407 100 2
6 20201507 100 2
7 20201607 100 3
8 20201707 100 3
9 20201807 100 0 (Saturday)
10 20201907 100 0 (Sunday)
12 20202007 100 4
13 20202107 100 4
14 20202207 100 1
15 20202307 100 1
I want to repeat the pattern in the above defined format.
This is pseudo-code because the source of the data isn't clear. You refer to a TestCte that isn't defined. The query never uses the @Frequency and @Pattern variables that head the desired output. The output has an Account column that isn't mentioned anywhere else... But perhaps this approach will work better.
declare @Frequency int = 2;
declare @StartDt date = '20200710';
with num(n) as (
select top (256) row_number() over (order by (select null)) - 1
from sys.all_objects
), dates(n, dt) as (
select row_number() over (order by n), dateadd(day, n, @StartDt)
from num
-- filter weekend dates (assumes SET DATEFIRST 1, so that Monday = 1 and Friday = 5)
where datepart(weekday, dateadd(day, n, @StartDt)) between 1 and 5
)
select p.n, r.n, d.dt
from
patterns as p -- this comes from xml? I'm going to assume these are numbered somehow
inner join num as r /* repetitions */
on r.n <= @Frequency -- I think something like "multiples" might be a better name
inner join dates as d
on d.n = @Frequency * p.n + r.n
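To pin down what the desired output looks like, here is a small Python sketch that reproduces the expected Shifts column from the question. It is only a model of the rule (repeat each pattern value Frequency times on weekdays, emit 0 on weekends, then cycle), and shift_schedule is a hypothetical name, not something from the original code.

```python
from datetime import date, timedelta


def shift_schedule(pattern, frequency, start, days):
    """Return (date, shift) for `days` consecutive dates; weekends get 0."""
    rows = []
    worked = 0  # number of weekdays assigned so far
    for offset in range(days):
        day = start + timedelta(days=offset)
        if day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
            rows.append((day, 0))
        else:
            # Each pattern value covers `frequency` weekdays, then we cycle.
            rows.append((day, pattern[(worked // frequency) % len(pattern)]))
            worked += 1
    return rows
```

Starting from Friday 2020-07-10 with pattern 1,2,3,4 and frequency 2, the first 14 shifts come out as 1, 0, 0, 1, 2, 2, 3, 3, 0, 0, 4, 4, 1, 1, which matches the table in the question. The point to carry over to SQL is that the pattern index is driven by a counter of weekdays only, not by the calendar day number.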

How to find contiguous dates in numerous rows in SQL Server

We have a table with service provisions for people. For example:
id people_id dateStart dateEnd
1 1 28.07.14 19.07.16
2 2 14.04.15 16.02.16
3 2 16.02.16 18.04.16
4 2 18.04.16 27.06.16
5 2 27.06.16 19.07.16
6 2 19.07.16 NULL
7 3 24.02.12 17.06.12
8 3 23.07.12 19.09.12
9 3 18.08.14 NULL
10 4 28.06.15 NULL
11 5 19.01.16 NULL
I need to extract the distinct people_id values (clients) with the real start date of an unfinished, uninterrupted service that has lasted more than a year, and then count the days that have passed. The end date of one row and the start date of the next must be the same for the rows to count as contiguous. One client can only have one unfinished service.
So the perfect result for the table above would be:
people_id dateStart lasts(days)
2 14.04.15 472
3 18.08.14 711
4 28.06.15 397
I didn't have a problem with a single service:
SELECT
--some other columns from PEOPLE,
p.PEOPLE_ID,
s.DATESTART,
DATEDIFF(DAY, s.DATESTART, GETDATE()) as lasts
FROM
PEOPLE p
INNER JOIN service s on s.ID =
(
SELECT TOP 1 s2.ID
FROM service s2
WHERE s2.PEOPLE_ID = p.PEOPLE_ID
AND s2.DATESTART IS NOT NULL
AND s2.DATEEND IS NULL
ORDER BY s2.DATESTART DESC
)
WHERE
DATEDIFF(DAY, s.DATESTART , GETDATE()) >= 365
But I can't figure out how to determine contiguous services.
You can find where periods of "continuous" service begin by using lag(). Then a cumulative sum of this flag provides a group, which can be used for aggregation:
select people_id, min(datestart) as datestart,
(case when count(dateend) = count(*) then max(dateend) end) as dateend
from (select t.*,
sum(case when prev_dateend = datestart then 0 else 1 end) over
(partition by people_id order by datestart) as grp
from (select t.*,
lag(dateend) over (partition by people_id order by datestart) as prev_dateend
from t
) t
) t
group by people_id, grp
having count(*) > count(dateend);
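The grouping trick can be hard to picture, so here is a rough Python sketch of the same "new run starts when the previous end date differs from this start date" rule. It assumes rows arrive as (people_id, start, end) tuples with None for an open end date; open_contiguous_services is a made-up name, and the day count is taken against a date you pass in rather than GETDATE().

```python
from datetime import date


def open_contiguous_services(rows, today):
    """rows: (people_id, date_start, date_end or None).

    Returns {people_id: (start of the unfinished contiguous run, days so far)}.
    """
    result = {}
    prev_person, prev_end, run_start = None, None, None
    for person, start, end in sorted(rows, key=lambda r: (r[0], r[1])):
        # A new run starts when the person changes or the previous end
        # date does not equal this start date (the lag() comparison).
        if person != prev_person or prev_end != start:
            run_start = start
        if end is None:  # an unfinished service closes the run
            result[person] = (run_start, (today - run_start).days)
        prev_person, prev_end = person, end
    return result
```

With the sample data and 2016-07-29 as the reference date, keeping only runs of at least 365 days leaves clients 2, 3 and 4 with 472, 711 and 397 days, the same as the expected result in the question.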
Try this query:
select PeopleId, min(dateStart) as dateStart, sum(diff) as [lasts(days)] from
(
select P.*, datediff(day,datestart, DateEnd) as diff from
(select peopleId, dateStart,
isnull(dateend, cast(getdate() as date)) as DateEnd
from People
) P
where Dateend in
(select DateStart from People
where PeopleId = P.PeopleId)
or DateEnd = cast(getdate() as date ) -- check for continuous dates
) P1 group by PeopleId having sum(diff)> 365 --check for > one year
The comments in the query should explain things

SQL Server 2012, Generate random date in specific way

I am trying to create a random appointment in my db.
How can I refactor this code so that StartDate can only fall on whole, half or quarter hours and EndDate is StartDate plus 1 hour?
I am using SQL Server 2012
SELECT
(SELECT TOP 1 Id from [dbo].[am_Customer] order by newid()) AS CustomerId
-- TODO: StartDate can only be given whole, half or quarter hours
,(SELECT DATEADD(DAY, ABS(CHECKSUM(NEWID()) % 3650), getdate())) AS StartDate
-- TODO: Need to add 1 hour to StartDate
,(SELECT DATEADD(DAY, ABS(CHECKSUM(NEWID()) % 3650), getdate())) AS EndDate
,(SELECT TOP 1 ServiceName from [dbo].[am_Appointments]
WHERE DATALENGTH(ServiceName) > 0 order by newid()) AS ServiceName
,(SELECT TOP 1 Id from [dbo].[Employees] order by newid()) AS EmployeeId
EDIT:
Here is the solution I ended up with:
;WITH s AS (
SELECT
DATEADD(minute, ABS(CHECKSUM(NEWID()) % 350400)*15,
DATEADD(day,DATEDIFF(day,0,getdate()),0)) AS StartDate
)
SELECT
(SELECT TOP 1 Id FROM [dbo].[am_Customer] ORDER BY newid()) AS CustomerId
,(SELECT s.StartDate) AS StartDate
,(SELECT DATEADD(hour,1,s.StartDate)) AS EndDate
,(SELECT TOP 1 ServiceName from [dbo].[am_Appointments] WHERE DATALENGTH(ServiceName) > 0 ORDER BY newid()) AS ServiceName
,(SELECT TOP 1 Id FROM [dbo].[Employees] ORDER BY newid()) AS EmployeeId
FROM s
This will generate a StartDate value some time in the next 10 years that falls on a 15 minute interval, and also an EndDate an hour later:
;With s as (
SELECT
DATEADD(minute, ABS(CHECKSUM(NEWID()) % 350400)*15,
DATEADD(day,DATEDIFF(day,0,getdate()),0)) as StartDate
)
select s.StartDate,DATEADD(hour,1,s.StartDate) as EndDate
from s
This has a (small) probability of generating a StartDate that falls today and before now. If you want to avoid that, the simple fix is to change the second 0 in DATEADD(day,DATEDIFF(day,0,getdate()),0) to a 1, and then it won't generate any values on today's date.
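The constant 350400 is just 3650 days times 96 quarter-hour slots per day. As a quick check of that arithmetic and of the "start from tomorrow" variant, here is a Python sketch. The function name and the first_day_offset parameter are invented for illustration and only model the date maths, not the NEWID() randomness.

```python
import random
from datetime import datetime, timedelta

SLOTS_PER_DAY = 24 * 4               # 96 quarter-hour slots per day
TOTAL_SLOTS = 3650 * SLOTS_PER_DAY   # 350400, the modulus used in the query


def random_appointment(now, rng, first_day_offset=0):
    """Return (start, end): start on a 15-minute mark, end one hour later."""
    # Midnight today, like DATEADD(day, DATEDIFF(day, 0, getdate()), 0).
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    # first_day_offset=1 models changing the second 0 to a 1.
    base = midnight + timedelta(days=first_day_offset)
    start = base + timedelta(minutes=15 * rng.randrange(TOTAL_SLOTS))
    return start, start + timedelta(hours=1)
```

With first_day_offset=1 the earliest possible start is midnight tomorrow, so nothing can land earlier than the current time.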
Use CROSS APPLY to calculate the start date, then that calculation can be referenced by its alias to calculate the end date, like so:
| STARTDATE | COLUMN_1 |
|---------------------------------|---------------------------------|
| February, 25 2020 10:30:00+0000 | February, 25 2020 11:30:00+0000 |
| July, 08 2018 18:15:00+0000 | July, 08 2018 19:15:00+0000 |
produced by
SELECT
ca1.StartDate
, dateadd(hour,1,ca1.StartDate)
from supportContacts --<< any table
CROSS APPLY (select
dateadd(minute,ABS(CHECKSUM(NEWID()) % 3650) * 15 ,DATEADD(DAY, ABS(CHECKSUM(NEWID()) % 3650), dateadd(dd, datediff(dd,0, getDate()), 0)))
) as ca1 (StartDate)
see: http://sqlfiddle.com/#!3/1fa93/14607

SQL spread month value into weeks

I have a table with values by month, and I want to spread these values by week. Weeks that span two months need to take part of the value from each month, weighted by the number of days that fall in each month.
For example I have the table with a different price of steel by month
Product Month Price
------------------------------------
Steel 1/Jan/2014 100
Steel 1/Feb/2014 200
Steel 1/Mar/2014 300
I need to convert it into weeks as follows
Product Week Price
-------------------------------------------
Steel 06-Jan-14 100
Steel 13-Jan-14 100
Steel 20-Jan-14 100
Steel 27-Jan-14 128.57
Steel 03-Feb-14 200
Steel 10-Feb-14 200
Steel 17-Feb-14 200
As you see above, the week that overlaps between Jan and Feb needs to be calculated as follows
(100*5/7)+(200*2/7)
This takes into account that the week of the 27th has 5 days that fall in Jan and 2 in Feb.
Is there any possible way to create a query in SQL that would achieve this?
I tried the following
First attempt:
select
WD.week,
PM.PRICE,
DATEADD(m,1,PM.Month),
SUM(PM.PRICE/7) * COUNT(*)
from
( select '2014-1-1' as Month, 100 as PRICE
union
select '2014-2-1' as Month, 200 as PRICE
)PM
join
( select '2014-1-20' as week
union
select '2014-1-27' as week
union
select '2014-2-3' as week
)WD
ON WD.week>=PM.Month
AND WD.week < DATEADD(m,1,PM.Month)
group by
WD.week,PM.PRICE, DATEADD(m,1,PM.Month)
This gives me the following
week PRICE
2014-1-20 100 2014-02-01 00:00:00.000 14
2014-1-27 100 2014-02-01 00:00:00.000 14
2014-2-3 200 2014-03-01 00:00:00.000 28
I tried also the following
;with x as (
select price,
datepart(week,dateadd(day, n.n-2, t1.month)) wk,
dateadd(day, n.n-1, t1.month) dt
from
(select '2014-1-1' as Month, 100 as PRICE
union
select '2014-2-1' as Month, 200 as PRICE) t1
cross apply (
select datediff(day, t.month, dateadd(month, 1, t.month)) nd
from
(select '2014-1-1' as Month, 100 as PRICE
union
select '2014-2-1' as Month, 200 as PRICE)
t
where t1.month = t.month) ndm
inner join
(SELECT (a.Number * 256) + b.Number AS N FROM
(SELECT number FROM master..spt_values WHERE type = 'P' AND number <= 255) a (Number),
(SELECT number FROM master..spt_values WHERE type = 'P' AND number <= 255) b (Number)) n --numbers
on n.n <= ndm.nd
)
select min(dt) as week, cast(sum(price)/count(*) as decimal(9,2)) as price
from x
group by wk
having count(*) = 7
order by wk
This gives me the following
week price
2014-01-07 00:00:00.000 100.00
2014-01-14 00:00:00.000 100.00
2014-01-21 00:00:00.000 100.00
2014-02-04 00:00:00.000 200.00
2014-02-11 00:00:00.000 200.00
2014-02-18 00:00:00.000 200.00
Thanks
If you have a calendar table it's a simple join:
SELECT
product,
calendar_date - (day_of_week-1) AS week,
SUM(price/7) * COUNT(*)
FROM prices AS p
JOIN calendar AS c
ON c.calendar_date >= month
AND c.calendar_date < DATEADD(m,1,month)
GROUP BY product,
calendar_date - (day_of_week-1)
This could be further simplified to join only to Mondays and then do some more date arithmetic in a CASE to get 7 or fewer days.
Edit:
Your last query returned Jan 31st twice; you need to remove the = from on n.n <= ndm.nd so that it reads on n.n < ndm.nd. And as you seem to work with ISO weeks, you had better change the DATEPART to isowk to avoid problems with different DATEFIRST settings.
Based on your last query I created a fiddle.
;with x as (
select price,
datepart(isowk,dateadd(day, n.n, t1.month)) wk,
dateadd(day, n.n-1, t1.month) dt
from
(select '2014-1-1' as Month, 100.00 as PRICE
union
select '2014-2-1' as Month, 200.00 as PRICE) t1
cross apply (
select datediff(day, t.month, dateadd(month, 1, t.month)) nd
from
(select '2014-1-1' as Month, 100.00 as PRICE
union
select '2014-2-1' as Month, 200.00 as PRICE)
t
where t1.month = t.month) ndm
inner join
(SELECT (a.Number * 256) + b.Number AS N FROM
(SELECT number FROM master..spt_values WHERE type = 'P' AND number <= 255) a (Number),
(SELECT number FROM master..spt_values WHERE type = 'P' AND number <= 255) b (Number)) n --numbers
on n.n < ndm.nd
) select min(dt) as week, cast(sum(price)/count(*) as decimal(9,2)) as price
from x
group by wk
having count(*) = 7
order by wk
Of course the dates might be from multiple years, so you need to GROUP BY the year, too.
Actually, you need to spread it over days, and then get the averages by week. To get the days we'll use the Numbers table.
;with x as (
select product, price,
datepart(week,dateadd(day, n.n-2, t1.month)) wk,
dateadd(day, n.n-1, t1.month) dt
from #t t1
cross apply (
select datediff(day, t.month, dateadd(month, 1, t.month)) nd
from #t t
where t1.month = t.month and t1.product = t.product) ndm
inner join numbers n on n.n <= ndm.nd
)
select product, min(dt) as week, cast(sum(price)/count(*) as decimal(9,2)) as price
from x
group by product, wk
having count(*) = 7
order by product, wk
The result of datepart(week,dateadd(day, n.n-2, t1.month)) expression depends on SET DATEFIRST so you might need to adjust accordingly.
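The "spread over days, then average per week" idea can be checked with a few lines of Python before worrying about DATEFIRST. This sketch assumes weeks start on the Monday dates shown in the question and that every day in a month carries that month's price; weekly_price and the prices dictionary are illustrative names only.

```python
from datetime import date, timedelta


def weekly_price(monthly_prices, week_start):
    """Average the daily price over the 7 days starting at week_start.

    monthly_prices maps (year, month) -> price for that month.
    """
    days = [week_start + timedelta(days=i) for i in range(7)]
    return round(sum(monthly_prices[(d.year, d.month)] for d in days) / 7, 2)


# Monthly steel prices from the question.
prices = {(2014, 1): 100, (2014, 2): 200, (2014, 3): 300}
```

For the week starting 27-Jan-14 this averages five days at 100 and two days at 200, giving 128.57, which is the (100*5/7)+(200*2/7) figure from the question. Weeks that sit entirely inside one month simply return that month's price.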