I'm looking to calculate the number of days that overlap between two (DateTime) spans of time.
The logic behind the question: a prisoner serves a sentence from ORIG_BED_START (the beginning of his sentence) to ORIG_BED_END (the end of his sentence). During the sentence he takes a leave of absence for whatever reason. The idea is to calculate the number of leave days that fall within that specific sentence.
That means making sure the leave start and end dates fall between the bed start and end dates, calculating the date difference for that part, and ignoring the rest.
Given this existing data:
ORIG_BED_START           ORIG_BED_END             LEAVE_START_DAT          LEAVE_END_DATE           LEAVE_DAYS
---------------------------------------------------------------------------------------------------------------
2022-10-19 09:21:00.000  2022-11-02 14:49:00.000  2022-10-28 00:00:00.000  2022-11-02 00:00:00.000  ??
2022-11-02 14:50:00.000  2022-11-16 13:19:00.000  2022-10-28 00:00:00.000  2022-11-02 00:00:00.000  ??
2022-12-19 10:17:00.000  2022-12-27 10:59:00.000  2022-12-19 00:00:00.000  2022-12-30 00:00:00.000  ??
2022-12-27 11:00:00.000  NULL                     2022-12-19 00:00:00.000  2022-12-30 00:00:00.000  ??
2022-12-22 20:29:00.000  2022-12-29 17:48:00.000  2022-12-26 00:00:00.000  2022-12-30 00:00:00.000  ??
2022-12-29 17:49:00.000  2022-12-30 14:59:00.000  2022-12-26 00:00:00.000  2022-12-30 00:00:00.000  ??
I am expecting the result set to be:
ORIG_BED_START           ORIG_BED_END             LEAVE_START_DAT          LEAVE_END_DATE           LEAVE_DAYS
---------------------------------------------------------------------------------------------------------------
2022-10-19 09:21:00.000  2022-11-02 14:49:00.000  2022-10-28 00:00:00.000  2022-11-02 00:00:00.000  5
2022-11-02 14:50:00.000  2022-11-16 13:19:00.000  2022-10-28 00:00:00.000  2022-11-02 00:00:00.000  0
2022-12-19 10:17:00.000  2022-12-27 10:59:00.000  2022-12-19 00:00:00.000  2022-12-30 00:00:00.000  8
2022-12-27 11:00:00.000  NULL                     2022-12-19 00:00:00.000  2022-12-30 00:00:00.000  3
2022-12-22 20:29:00.000  2022-12-29 17:48:00.000  2022-12-26 00:00:00.000  2022-12-30 00:00:00.000  4
2022-12-29 17:49:00.000  2022-12-30 14:59:00.000  2022-12-26 00:00:00.000  2022-12-30 00:00:00.000  0
This is the closest I have come:
CASE
WHEN (CONVERT(DATE, LEAVE_START_DATE) >= CONVERT(DATE, ORIG_BED_START) AND
CONVERT(DATE, LEAVE_START_DATE) <= CONVERT(DATE, ORIG_BED_END))
OR (CONVERT(DATE, LEAVE_END_DATE) >= CONVERT(DATE, ORIG_BED_END) AND CONVERT(DATE, LEAVE_END_DATE) <= CONVERT(DATE, ORIG_BED_END))
THEN DATEDIFF(DAY, LEAVE_START_DATE, LEAVE_END_DATE)
ELSE ''
END AS LEAVE_DAYS
The math to evaluate is MAX(0, MIN(orig_bed_end, leave_end_date) - MAX(orig_bed_start, leave_start_dat)). In SQL that should give:
greatest(0, trunc( convert(date,least(coalesce(orig_bed_end,leave_end_date),leave_end_date)) - convert(date,greatest(orig_bed_start,leave_start_dat))))
Depending on where you put the trunc (before or after calculating the difference), you may get slightly different results (±1).
(Quickly converted from Oracle syntax, so it may still need fixes to work in SQL Server.)
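To make the clamp formula concrete, here is a minimal Python sketch of the same idea, not a SQL Server solution. The function name `overlap_days` is hypothetical, a NULL bed end is modelled as `None`, and the values are truncated to dates before subtracting (one of the two placements of the trunc mentioned above).

```python
from datetime import datetime
from typing import Optional


def overlap_days(
    bed_start: datetime,
    bed_end: Optional[datetime],
    leave_start: datetime,
    leave_end: datetime,
) -> int:
    """Calendar days shared by the bed stay and the leave, never negative."""
    # Assumption: an open-ended stay (NULL end) lasts at least until the leave ends,
    # mirroring coalesce(orig_bed_end, leave_end_date).
    effective_end = leave_end if bed_end is None else min(bed_end, leave_end)
    effective_start = max(bed_start, leave_start)
    # Truncate to dates first, then subtract, then clamp at zero.
    days = (effective_end.date() - effective_start.date()).days
    return max(0, days)


first_row = overlap_days(
    datetime(2022, 10, 19, 9, 21),
    datetime(2022, 11, 2, 14, 49),
    datetime(2022, 10, 28),
    datetime(2022, 11, 2),
)
open_ended_row = overlap_days(
    datetime(2022, 12, 27, 11, 0),
    None,
    datetime(2022, 12, 19),
    datetime(2022, 12, 30),
)
print(first_row, open_ended_row)
```

With this placement of the truncation, the two rows where one leave is split across consecutive stays come out one day away from the expected table (3 and 1 rather than 4 and 0), which is the ±1 effect described above.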
How can I apply weights from one table to another table, [Port], where the weight table has sparse dates?
[Port] table
utcDT UsdPnl
-----------------------------------------------
2012-03-09 00:00:00.000 -0.00581815226439161
2012-03-11 00:00:00.000 -0.000535272460588547
2012-03-12 00:00:00.000 -0.00353079778650661
2012-03-13 00:00:00.000 0.00232882689252497
2012-03-14 00:00:00.000 -0.0102592811199384
2012-03-15 00:00:00.000 0.00254451559598693
2012-03-16 00:00:00.000 0.0146718613139845
2012-03-18 00:00:00.000 0.000425144543842752
2012-03-19 00:00:00.000 -0.00388548271428044
2012-03-20 00:00:00.000 -0.00662423680184768
2012-03-21 00:00:00.000 0.00405506208635343
2012-03-22 00:00:00.000 -0.000814822806982203
2012-03-23 00:00:00.000 -0.00289523953346103
2012-03-25 00:00:00.000 0.00204150859774465
2012-03-26 00:00:00.000 -0.00641635182718787
2012-03-27 00:00:00.000 -0.00107168420738448
2012-03-28 00:00:00.000 0.00131000520696153
2012-03-29 00:00:00.000 0.0008223678402638
2012-03-30 00:00:00.000 -0.00255345945390133
2012-04-01 00:00:00.000 -0.00337792814650089
[Weights] table
utcDT Weight
--------------------------------
2012-03-09 00:00:00.000 1
2012-03-20 00:00:00.000 3
2012-03-29 00:00:00.000 7
So, I want to use the weights as if I had a full table like the one below, i.e. change to the new weight on the first day it appears in the [Weights] table:
utcDT UsedWeight
----------------------------------
2012-03-09 00:00:00.000 1
2012-03-11 00:00:00.000 1
2012-03-12 00:00:00.000 1
2012-03-13 00:00:00.000 1
2012-03-14 00:00:00.000 1
2012-03-15 00:00:00.000 1
2012-03-16 00:00:00.000 1
2012-03-18 00:00:00.000 1
2012-03-19 00:00:00.000 1
2012-03-20 00:00:00.000 3
2012-03-21 00:00:00.000 3
2012-03-22 00:00:00.000 3
2012-03-23 00:00:00.000 3
2012-03-25 00:00:00.000 3
2012-03-26 00:00:00.000 3
2012-03-27 00:00:00.000 3
2012-03-28 00:00:00.000 3
2012-03-29 00:00:00.000 7
2012-03-30 00:00:00.000 7
2012-04-01 00:00:00.000 7
You can use apply:
select p.*, w.*
from port p outer apply
(select top (1) w.*
from weights w
where w.utcDT <= p.utcDT
order by w.utcDT desc
) w;
outer apply is usually pretty efficient if you have the right indexes. In this case, the right index is on weights(utcDT desc).
You can use lead() in a subquery to associate the next date a weight changes to each weights record, and then join with port using an inequality condition on the dates:
select p.utcDt, w.weight
from port p
inner join (
select utcDt, weight, lead(utcDt) over(order by utcDt) lead_utcDt from weights
) w
on p.utcDt >= w.utcDt
and (w.lead_utcDt is null or p.utcDt < w.lead_utcDt)
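Both queries perform an "as of" lookup: for each [Port] date, take the latest weight whose date is not after it. As an illustration only, here is a small Python sketch of that lookup using a binary search over the sorted weight dates. The helper name `used_weight` is hypothetical, and the weights list is copied from the [Weights] table above.

```python
from bisect import bisect_right
from datetime import date
from typing import List, Optional, Tuple

# Rows of the [Weights] table, sorted by date.
WEIGHTS: List[Tuple[date, int]] = [
    (date(2012, 3, 9), 1),
    (date(2012, 3, 20), 3),
    (date(2012, 3, 29), 7),
]


def used_weight(day: date, weights: List[Tuple[date, int]]) -> Optional[int]:
    """Return the most recent weight whose date is <= day, or None if there is none."""
    change_dates = [d for d, _ in weights]
    # bisect_right gives the count of change dates that are <= day.
    position = bisect_right(change_dates, day)
    if position == 0:
        return None  # day precedes the first weight, like a NULL from outer apply
    return weights[position - 1][1]


for day in (date(2012, 3, 19), date(2012, 3, 20), date(2012, 4, 1)):
    print(day, used_weight(day, WEIGHTS))
```

A date earlier than the first weight returns `None` here, which corresponds to the NULL row that outer apply would produce; the inner join version would drop such a row instead.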
I have a table with start_date and end_date columns, and I want to remove records where both start_date and end_date fall inside an existing date range.
Source data:
start_date end_date
2019-03-18 00:00:00.000 2019-04-08 00:00:00.000
2019-04-01 00:00:00.000 2019-05-31 00:00:00.000
2019-04-03 00:00:00.000 2019-04-24 00:00:00.000
2019-04-24 00:00:00.000 2019-05-05 00:00:00.000
2019-05-06 00:00:00.000 2019-05-16 00:00:00.000
2019-05-06 00:00:00.000 2019-05-20 00:00:00.000
2019-05-06 00:00:00.000 2019-06-17 00:00:00.000
2019-05-10 00:00:00.000 2019-05-14 00:00:00.000
Expected result:
start_date end_date
2019-03-18 00:00:00.000 2019-04-08 00:00:00.000
2019-04-01 00:00:00.000 2019-05-31 00:00:00.000
2019-05-06 00:00:00.000 2019-06-17 00:00:00.000
Well, it's really not that hard: you just check for literally the thing you want to check for. Simply verify that there aren't any other records that contain your start date and end date between their own start and end dates.
Something like this will work:
select *
from so_58088216 wrapping
where not exists (select *
from so_58088216 wrapped
where wrapping.start_date between wrapped.start_date and wrapped.end_date
and wrapping.end_date between wrapped.start_date and wrapped.end_date
-- don't check against yourself: skip only the row with the same start AND end date
-- (this would be easier with an ID column)
and not (wrapping.start_date = wrapped.start_date
and wrapping.end_date = wrapped.end_date))
Here's a working example
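Separately from the linked example, the same containment check can be sketched in plain Python. This is only an illustration: the function name `drop_contained` is hypothetical, and it assumes a range is removed when some other, non-identical range fully covers it.

```python
from datetime import date
from typing import List, Tuple

DateRange = Tuple[date, date]

SOURCE: List[DateRange] = [
    (date(2019, 3, 18), date(2019, 4, 8)),
    (date(2019, 4, 1), date(2019, 5, 31)),
    (date(2019, 4, 3), date(2019, 4, 24)),
    (date(2019, 4, 24), date(2019, 5, 5)),
    (date(2019, 5, 6), date(2019, 5, 16)),
    (date(2019, 5, 6), date(2019, 5, 20)),
    (date(2019, 5, 6), date(2019, 6, 17)),
    (date(2019, 5, 10), date(2019, 5, 14)),
]


def drop_contained(ranges: List[DateRange]) -> List[DateRange]:
    """Keep only the ranges that are not fully inside another, different range."""
    kept = []
    for current in ranges:
        covered = any(
            other != current  # do not compare a range with itself
            and other[0] <= current[0]
            and current[1] <= other[1]
            for other in ranges
        )
        if not covered:
            kept.append(current)
    return kept


result = drop_contained(SOURCE)
for start, end in result:
    print(start, end)
```

On the source data above this keeps the three ranges listed in the expected result. It compares every pair of rows, which is fine for a sketch but quadratic in the number of rows.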
How can I convert in pandas a date format that looks something like this:
2018-08-27 00:00:00.000
2018-08-26 00:00:00.000
2018-08-24 00:00:00.000
2018-08-24 00:00:00.000
2018-08-24 00:00:00.000
2018-08-24 00:00:00.000
2018-08-23 00:00:00.000
2018-08-23 00:00:00.000
2018-08-20 00:00:00.000
2018-08-20 00:00:00.000
to an integer format counting the days since first of January 2010?
Subtract date from column by Series.sub and convert timedeltas to days by Series.dt.days:
df['days'] = pd.to_datetime(df['date']).sub(pd.Timestamp('2010-01-01')).dt.days
print (df)
date days
0 2018-08-27 00:00:00.000 3160
1 2018-08-26 00:00:00.000 3159
2 2018-08-24 00:00:00.000 3157
3 2018-08-24 00:00:00.000 3157
4 2018-08-24 00:00:00.000 3157
5 2018-08-24 00:00:00.000 3157
6 2018-08-23 00:00:00.000 3156
7 2018-08-23 00:00:00.000 3156
8 2018-08-20 00:00:00.000 3153
9 2018-08-20 00:00:00.000 3153
As jezrael mentions in his answer, you can simply apply sub to the pandas Timestamp column, which is very direct.
If you want to do the same thing row by row, you can use map:
base_date = pd.Timestamp('2010-01-01 00:00:00')
df['days'] = df['date'].map(lambda date : (pd.Timestamp(date) - base_date).days )
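As a cross-check on the day counts above, the same arithmetic can be done with only the standard library. This is a sketch, not a replacement for the pandas approach; the helper name `days_since_base` is hypothetical, and it assumes every string has exactly the format shown in the question.

```python
from datetime import datetime

BASE_DATE = datetime(2010, 1, 1)


def days_since_base(text: str) -> int:
    """Parse 'YYYY-MM-DD HH:MM:SS.fff' and return whole days since 2010-01-01."""
    parsed = datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")
    # Subtracting two datetimes gives a timedelta; .days is its whole-day part.
    return (parsed - BASE_DATE).days


print(days_since_base("2018-08-27 00:00:00.000"))
```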
I want to get the maximum time of a given date from a query that returns From and To dates.
A function named [fn_COM_PeriodSplit] returns output like this:
Output:
PrSeq FromDate ToDate
1 2015-05-01 00:00:00.000 2015-05-25 23:29:29.000
PrSeq FromDate ToDate
1 2015-05-01 00:00:00.000 2015-05-07 00:00:00.000
2 2015-05-08 00:00:00.000 2015-05-14 00:00:00.000
3 2015-05-15 00:00:00.000 2015-05-21 00:00:00.000
4 2015-05-22 00:00:00.000 2015-05-25 23:59:59.000
But I want the ToDate to look like this:
PrSeq FromDate ToDate
1 2015-05-01 00:00:00.000 2015-05-25 23:59:59.000
How can I get ToDate as the maximum end time?
I have daily values in one table and monthly values in another table. I need to use the values of the monthly table and calculate them on a daily basis.
basically, monthly factor * daily factor -- for each day
thanks!
I have a table like this:
2010-12-31 00:00:00.000 28.3
2010-09-30 00:00:00.000 64.1
2010-06-30 00:00:00.000 66.15
2010-03-31 00:00:00.000 12.54
and a table like this :
2010-12-31 00:00:00.000 98.1
2010-12-30 00:00:00.000 97.61
2010-12-29 00:00:00.000 99.03
2010-12-28 00:00:00.000 97.7
2010-12-27 00:00:00.000 96.87
2010-12-23 00:00:00.000 97.44
2010-12-22 00:00:00.000 97.76
2010-12-21 00:00:00.000 96.63
2010-12-20 00:00:00.000 95.47
2010-12-17 00:00:00.000 95.2
2010-12-16 00:00:00.000 94.84
2010-12-15 00:00:00.000 94.8
2010-12-14 00:00:00.000 94.1
2010-12-13 00:00:00.000 93.88
2010-12-10 00:00:00.000 93.04
2010-12-09 00:00:00.000 91.07
2010-12-08 00:00:00.000 90.89
2010-12-07 00:00:00.000 92.72
2010-12-06 00:00:00.000 93.05
2010-12-03 00:00:00.000 91.74
2010-12-02 00:00:00.000 90.74
2010-12-01 00:00:00.000 90.25
I need to take the value for the quarter and multiply it by the daily value for every day in that quarter.
You could try:
SELECT dt.day, dt.factor*mt.factor AS daily_factor
FROM daily_table dt INNER JOIN month_table mt
ON YEAR(dt.day) = YEAR(mt.day)
AND FLOOR((MONTH(dt.day)-1)/3) = FLOOR((MONTH(mt.day)-1)/3)
ORDER BY dt.day
or (as suggested by @Andriy):
SELECT dt.day, dt.factor*mt.factor AS daily_factor
FROM daily_table dt INNER JOIN month_table mt
ON YEAR(dt.day) = YEAR(mt.day)
AND DATEPART(QUARTER, dt.day) = DATEPART(QUARTER, mt.day)
ORDER BY dt.day
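The join condition in both queries pairs each day with the quarterly row from the same year and quarter. Here is a rough Python sketch of that matching, using the same (month - 1) / 3 integer division as the first query. The names `quarter_key` and `daily_factor` are hypothetical, and the quarterly values are copied from the first table in the question.

```python
from datetime import date
from typing import Dict, Tuple


def quarter_key(day: date) -> Tuple[int, int]:
    """Return (year, zero-based quarter); months 1-3 map to 0, 10-12 map to 3."""
    return (day.year, (day.month - 1) // 3)


# Quarterly values keyed by (year, quarter).
QUARTERLY: Dict[Tuple[int, int], float] = {
    quarter_key(date(2010, 12, 31)): 28.3,
    quarter_key(date(2010, 9, 30)): 64.1,
    quarter_key(date(2010, 6, 30)): 66.15,
    quarter_key(date(2010, 3, 31)): 12.54,
}


def daily_factor(day: date, daily_value: float) -> float:
    """Multiply a daily value by the value of the quarter that contains the day."""
    return daily_value * QUARTERLY[quarter_key(day)]


print(daily_factor(date(2010, 12, 1), 90.25))
```

Every December 2010 day in the daily table resolves to the 28.3 quarterly value this way, just as the SQL join would match it to the 2010-12-31 row.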