SQL Partition Data by Date Range ignoring date gaps and weekends


Thank you in advance for your patience, and help!
I am trying to partition my data in a way that displays date ranges.
IMAGE: Data Set - Current Results - Desired Results
In the image you can see what my data set looks like, the results I'm currently getting, and the results I would like to get.
Here is the code that produces my current results. I'm struggling to understand PARTITION.
Edit: I can bring Saturday and Sunday back in if the data needs to have all 365 consecutive days. I'm simply removing them from the data source in the WHERE clause.
--DELETE TEMP TABLE USED TO STORE CONSECUTIVE ABSENCES
DROP TABLE IF EXISTS #StuConsecAtt;
--CREATE TEMP TABLE THAT STORES CONSECUTIVE ABSENCE DATE RANGES
CREATE TABLE #StuConsecAtt (
    SIS_NUMBER INT,
    ABS_FROM DATE,
    ABS_TO DATE
);
--INSERT CONSECUTIVE ABSENCE DATA INTO NEW TABLE
WITH stuAtt AS (
    SELECT *,
           DATEADD(DAY, -ROW_NUMBER() OVER (PARTITION BY SIS_NUMBER ORDER BY ABS_DATE), ABS_DATE) AS grp
    FROM #stuCalAtt
)
INSERT INTO #StuConsecAtt (ABS_FROM, ABS_TO, SIS_NUMBER)
SELECT MIN(ABS_DATE) AS [From],
       MAX(ABS_DATE) AS [To],
       -- [ABS_REASON],
       SIS_NUMBER
FROM stuAtt
GROUP BY SIS_NUMBER, grp
ORDER BY [From];
SELECT * FROM #StuConsecAtt
WHERE ABS_TO > ABS_FROM;
EDIT BELOW
DATA
Looking at the data, I'm trying to put consecutive days with ABSENT_DAY = Y in a single group. Below, 10/4 through 10/11 are consecutive (the weekend days would be ABSENT_DAY = N, so I removed the weekends). Now, because 10/4 through 10/11 sit together in the data set (consecutive rows), all with ABSENT_DAY = Y, I would like to group them so I get the range 10/4 - 10/11, just as the following range would be 10/18 - 10/19. The weekend gap is what causes the issue.
SIS_NUMBER CALENDAR_DATE WEEK_DAY ABS_DATE SCHOOL_DAY ABSENT_DAY
641861 2017-10-03 Tuesday NULL Y N
641861 2017-10-04 Wednesday 2017-10-04 Y Y
641861 2017-10-05 Thursday 2017-10-05 Y Y
641861 2017-10-06 Friday 2017-10-06 Y Y
641861 2017-10-09 Monday 2017-10-09 Y Y
641861 2017-10-10 Tuesday 2017-10-10 Y Y
641861 2017-10-11 Wednesday 2017-10-11 Y Y
641861 2017-10-12 Thursday NULL N N
641861 2017-10-13 Friday NULL N N
641861 2017-10-16 Monday NULL Y N
641861 2017-10-17 Tuesday NULL Y N
641861 2017-10-18 Wednesday 2017-10-18 Y Y
641861 2017-10-19 Thursday 2017-10-19 Y Y
CURRENT RESULTS
SIS_NUMBER FROM_DATE TO_DATE
641861 2017-10-04 2017-10-06
641861 2017-10-09 2017-10-11
641861 2017-10-18 2017-10-19
DESIRED RESULTS
SIS_NUMBER FROM_DATE TO_DATE
641861 2017-10-04 2017-10-11
641861 2017-10-18 2017-10-19
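One possible way to get the desired ranges, sketched here and not taken from the original post: number every row of the weekend-free calendar, number the absent rows separately, and group on the difference between the two. That difference stays constant for as long as the absences sit on consecutive rows of the data set, so a weekend that has been filtered out no longer splits a streak. The sketch runs the query through Python's sqlite3 module on the sample data. The table name cal and the CTE names are hypothetical, and it assumes SQLite 3.25 or later for window functions. The ROW_NUMBER() OVER (PARTITION BY ...) syntax itself is the same in SQL Server.

```python
import sqlite3

# Sample rows from the question: (SIS_NUMBER, CALENDAR_DATE, ABSENT_DAY),
# with weekends already removed.
ROWS = [
    (641861, "2017-10-03", "N"),
    (641861, "2017-10-04", "Y"),
    (641861, "2017-10-05", "Y"),
    (641861, "2017-10-06", "Y"),
    (641861, "2017-10-09", "Y"),
    (641861, "2017-10-10", "Y"),
    (641861, "2017-10-11", "Y"),
    (641861, "2017-10-12", "N"),
    (641861, "2017-10-13", "N"),
    (641861, "2017-10-16", "N"),
    (641861, "2017-10-17", "N"),
    (641861, "2017-10-18", "Y"),
    (641861, "2017-10-19", "Y"),
]

QUERY = """
WITH numbered AS (
    -- day_seq numbers every calendar row that is present, absent or not
    SELECT SIS_NUMBER, CALENDAR_DATE, ABSENT_DAY,
           ROW_NUMBER() OVER (PARTITION BY SIS_NUMBER
                              ORDER BY CALENDAR_DATE) AS day_seq
    FROM cal
),
absent AS (
    -- the WHERE runs before this ROW_NUMBER, so it counts absent rows only;
    -- day_seq minus that count is constant within one streak
    SELECT SIS_NUMBER, CALENDAR_DATE,
           day_seq - ROW_NUMBER() OVER (PARTITION BY SIS_NUMBER
                                        ORDER BY CALENDAR_DATE) AS grp
    FROM numbered
    WHERE ABSENT_DAY = 'Y'
)
SELECT SIS_NUMBER,
       MIN(CALENDAR_DATE) AS ABS_FROM,
       MAX(CALENDAR_DATE) AS ABS_TO
FROM absent
GROUP BY SIS_NUMBER, grp
ORDER BY SIS_NUMBER, ABS_FROM
"""


def absence_ranges(rows):
    """Load the rows into an in-memory table and return the absence ranges."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cal (SIS_NUMBER INTEGER, CALENDAR_DATE TEXT, ABSENT_DAY TEXT)"
    )
    conn.executemany("INSERT INTO cal VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in absence_ranges(ROWS):
        print(row)
```

In this sketch any row that stays in the data set with ABSENT_DAY = N ends a streak, including the two SCHOOL_DAY = N rows on 10/12 and 10/13. If non-school days should be bridged the same way weekends are, they would have to be filtered out before the first numbering step.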

Related

How to break datetime in 12 hour chunks and use it for aggregation in Presto SQL?

I have been trying to break the datetime into 12-hour chunks in Presto SQL, without success.
Raw data table:
datetime Login
2022-05-08 07:10:00.000 1234
2022-05-09 23:20:00.000 5678
2022-05-09 06:20:00.000 5674
2022-05-08 09:20:00.000 8971
The output table should look like the one below. I need the count of logins in 12-hour chunks: the first chunk runs from 00:00:00.000 to 11:59:00.000 and the next from 12:00:00.000 to 23:59:00.000.
Output:
datetime count
2022-05-08 00:00:00.000 2
2022-05-08 12:00:00.000 0
2022-05-09 00:00:00.000 1
2022-05-09 12:20:00.000 1

This should work:
Extract the hour from the timestamp, then integer divide it by 12. That will make it 0 till 11:59, and 1 till 23:59. Then, multiply that back by 12.
Use that resulting integer to DATE_ADD() it with unit 'HOUR' to the timestamp of the row truncated to the day.
SELECT
DATE_ADD('HOUR',(HOUR(ts) / 12) * 12, TRUNC(ts,'DAY')) AS halfday
, SUM(login) AS count_login
FROM indata
GROUP BY
halfday
;
-- out halfday | count_login
-- out ---------------------+-------------
-- out 2022-05-08 00:00:00 | 15879
-- out 2022-05-08 12:00:00 | 5678
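The arithmetic described above (integer-divide the hour by 12, multiply back by 12, add the result to the day start) can be checked outside the database. The following is a minimal Python sketch, not part of the original answer; the function names are hypothetical. It counts logins per bucket, since a count is what the question asks for, and uses the question's four rows.

```python
from collections import Counter
from datetime import datetime


def half_day_start(ts):
    """Floor a timestamp to 00:00 or 12:00 of the same day."""
    # hour // 12 is 0 before noon and 1 from noon onwards
    return ts.replace(hour=(ts.hour // 12) * 12, minute=0, second=0, microsecond=0)


# Rows from the question: (datetime, Login)
LOGINS = [
    (datetime(2022, 5, 8, 7, 10), 1234),
    (datetime(2022, 5, 9, 23, 20), 5678),
    (datetime(2022, 5, 9, 6, 20), 5674),
    (datetime(2022, 5, 8, 9, 20), 8971),
]


def count_by_half_day(rows):
    """Return {bucket start: number of logins}, ordered by bucket start."""
    counts = Counter(half_day_start(ts) for ts, _login in rows)
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    for bucket, n in count_by_half_day(LOGINS).items():
        print(bucket, n)
```

A bucket with no rows, such as the afternoon of 2022-05-08, simply does not appear in a grouped result like this. Showing it with a count of 0 would need a separate list of all buckets to join against.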

This query worked for me.
SELECT
DATE_ADD('HOUR',(HOUR(ts) / 12) * 12, date_trunc('DAY',ts)) AS halfday
, SUM(login) AS count_login
FROM indata
GROUP BY
halfday
;

aligning tables with different dates

I have two tables, called tblDaily and tblWeekly.
tblDaily contains daily data and tblWeekly contains data that is stored every Friday.
So obviously it is easy to join the daily table to the weekly table when the date in the daily data is a Friday.
My question is what the best way to join is when the date is not a Friday. For example, given the date 2018-05-09 (Wednesday), I would like to join it to the previous Friday (2018-05-04). What is the optimal way of doing this?
I read about a calendar table, would that be the correct way to go? Although I'm not sure how that would work in this case?
tblDaily
date val
2018-04-30 2 'mon
2018-05-01 3 'tues
2018-05-02 3 'wed
2018-05-03 3 'thurs
2018-05-04 3 'fri
2018-05-07 2 'mon
2018-05-08 3 'tues
2018-05-09 3 'wed
2018-05-10 3 'thurs
2018-05-11 3 'fri
2018-05-14 3 'mon
tblWeekly
date val
2018-05-04 2 'fri
2018-05-11 3 'fri

This might work:
SELECT
[dailydate] = D.[date],
[dailyval] = D.[val],
[weeklydate] = W.[date],
[weeklyval] = W.[val]
FROM
[tblDaily] AS D
OUTER APPLY (SELECT TOP (1) _W.*
FROM [tblWeekly] AS _W
WHERE _W.[date] <= D.[date]
ORDER BY _W.[date] DESC) AS W;
This query produces the following results:
dailydate dailyval weeklydate weeklyval
2018-04-30 2 NULL NULL
2018-05-01 3 NULL NULL
2018-05-02 3 NULL NULL
2018-05-03 3 NULL NULL
2018-05-04 3 2018-05-04 2
2018-05-07 2 2018-05-04 2
2018-05-08 3 2018-05-04 2
2018-05-09 3 2018-05-04 2
2018-05-10 3 2018-05-04 2
2018-05-11 3 2018-05-11 3
2018-05-14 3 2018-05-11 3
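The OUTER APPLY with TOP (1) ... ORDER BY ... DESC picks, for each daily row, the most recent weekly row on or before that date. The same lookup can be sketched in Python with a binary search over the sorted weekly dates. This is only an illustration of the logic, with a hypothetical function name, and not a replacement for the query.

```python
from bisect import bisect_right
from datetime import date

# tblWeekly from the question, sorted by date: (date, val)
WEEKLY = [
    (date(2018, 5, 4), 2),
    (date(2018, 5, 11), 3),
]


def latest_weekly_on_or_before(day, weekly=WEEKLY):
    """Return the weekly row with the greatest date <= day, or None."""
    keys = [d for d, _val in weekly]
    # bisect_right gives the number of weekly dates that are <= day
    i = bisect_right(keys, day)
    return weekly[i - 1] if i else None


if __name__ == "__main__":
    for d in (date(2018, 4, 30), date(2018, 5, 9), date(2018, 5, 11)):
        print(d, latest_weekly_on_or_before(d))
```

As in the query results, dates before the first weekly row get no match, which is why the first four daily rows show NULL.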

Try something like this:
select * from tblDaily a join tblWeekly b on a.date1= dateadd(day,-5,b.date2)

Try this simple join:
select *
from tblDaily [d]
--one condition covers both cases: Fridays match themselves, other days match the previous Friday
--(assumes the default SET DATEFIRST 7, where Sunday = 1 and Friday = 6)
left join tblWeekly [w] on
[w].[date] = dateadd(day, -((datepart(weekday, [d].[date]) + 1) % 7), [d].[date])

Compare values for consecutive dates of same month

I have a table
ID Value Date
1 10 2017-10-02 02:50:04.480
2 20 2017-10-01 07:28:53.593
3 30 2017-09-30 23:59:59.000
4 40 2017-09-30 23:59:59.000
5 50 2017-09-30 02:36:07.520
I compare each Value with the value for the previous date, but I don't need the comparison between the first day of the current month and the last day of the previous month. For this table, I don't need the comparison between 2017-10-01 07:28:53.593 and 2017-09-30 23:59:59.000. How can this be done?
Result table for this example:
ID Value Date Diff
1 10 2017-10-02 02:50:04.480 10
2 20 2017-10-01 07:28:53.593 NULL
3 30 2017-09-30 23:59:59.000 10
4 40 2017-09-29 23:59:59.000 10
5 50 2017-09-28 02:36:07.520 NULL

You can use this.
SELECT * ,
LEAD(Value) OVER( PARTITION BY DATEPART(YEAR,[Date]), DATEPART(MONTH,[Date]) ORDER BY ID ) - Value AS Diff
FROM MyTable
ORDER BY ID
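To see why partitioning by year and month gives the NULLs in the right places, here is a small Python sketch of the same logic. It is an illustration only, with a hypothetical function name. It uses the dates from the result table (which differ slightly from the first listing), with the fractional seconds dropped.

```python
from datetime import datetime
from itertools import groupby

# (ID, Value, Date), taken from the result table in the question
ROWS = [
    (1, 10, datetime(2017, 10, 2, 2, 50, 4)),
    (2, 20, datetime(2017, 10, 1, 7, 28, 53)),
    (3, 30, datetime(2017, 9, 30, 23, 59, 59)),
    (4, 40, datetime(2017, 9, 29, 23, 59, 59)),
    (5, 50, datetime(2017, 9, 28, 2, 36, 7)),
]


def month_key(row):
    """Partition key: (year, month) of the row's date."""
    return (row[2].year, row[2].month)


def diff_within_month(rows):
    """Return [(ID, Diff)], where Diff = next Value in the same month - Value."""
    # sort by partition first, then by ID, so groupby sees each month as one run
    ordered = sorted(rows, key=lambda r: (month_key(r), r[0]))
    out = []
    for _month, grp in groupby(ordered, key=month_key):
        grp = list(grp)
        # pair each row with its successor; the last row of a month has none
        for cur, nxt in zip(grp, grp[1:] + [None]):
            out.append((cur[0], None if nxt is None else nxt[1] - cur[1]))
    return sorted(out, key=lambda pair: pair[0])


if __name__ == "__main__":
    for row_id, diff in diff_within_month(ROWS):
        print(row_id, diff)
```

The last row of each month has no successor inside its own partition, so its Diff stays empty, which matches the NULLs for IDs 2 and 5 in the expected result.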

you can use a query like below
select *,
diff=LEAD(Value) OVER( PARTITION BY Month(Date),Year(Date) ORDER BY Date desc)-Value
from t
order by id asc

How to select periods of time with empty data?

I want to find out all periods with empty data, given the following table my_table:
id day
29 2017-06-05
26 2017-06-05
30 2017-06-06
30 2017-06-06
21 2017-06-06
21 2017-07-01
29 2017-07-01
30 2017-07-20
The answer would be:
Empty_start Empty_end
2017-06-07 2017-06-30
2017-07-02 2017-07-19
It's important that the number of months is considered. For example, in the first row the answer 2017-06-31 would be incorrect.
How can I write this query in Hive?

You can use lag() or lead():
select date_add(day, 1) as empty_start, date_add(next_day, -1) as empty_end
from (select day,
lead(day) over (order by day) as next_day
from t
group by day
) t
where next_day <> date_add(day, 1);
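The idea behind the lead() query is to pair each distinct day with the next distinct day and keep only the pairs that are more than one day apart. A minimal Python sketch of that logic, using the question's data and a hypothetical function name:

```python
from datetime import date, timedelta

# The day column of my_table from the question (duplicates included)
DAYS = [
    date(2017, 6, 5), date(2017, 6, 5),
    date(2017, 6, 6), date(2017, 6, 6), date(2017, 6, 6),
    date(2017, 7, 1), date(2017, 7, 1),
    date(2017, 7, 20),
]


def empty_periods(days):
    """Return [(empty_start, empty_end)] for every gap between distinct days."""
    one_day = timedelta(days=1)
    distinct = sorted(set(days))  # plays the role of GROUP BY day
    return [
        (cur + one_day, nxt - one_day)
        for cur, nxt in zip(distinct, distinct[1:])  # (day, next_day) pairs
        if nxt - cur > one_day
    ]


if __name__ == "__main__":
    for start, end in empty_periods(DAYS):
        print(start, end)
```

Because the arithmetic is done on real dates, month lengths are handled automatically: the first gap ends on 2017-06-30, not on a non-existent 2017-06-31.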

Computation of period Start date

I have a table that holds the start date and the end date of a financial period.
CHARGE_PERIOD_ID START_DATE END_DATE
13 2013-03-31 00:00:00.000 2013-04-27 00:00:00.000
14 2013-04-28 00:00:00.000 2013-05-25 00:00:00.000
15 2013-05-26 00:00:00.000 2013-06-29 00:00:00.000
16 2013-06-30 00:00:00.000 2013-07-27 00:00:00.000
17 2013-07-28 00:00:00.000 2013-08-24 00:00:00.000
18 2013-08-25 00:00:00.000 2013-09-28 00:00:00.000
19 2013-09-29 00:00:00.000 2013-10-26 00:00:00.000
20 2013-10-27 00:00:00.000 2013-11-23 00:00:00.000
21 2013-11-24 00:00:00.000 2013-12-28 00:00:00.000
22 2013-12-29 00:00:00.000 2014-01-25 00:00:00.000
23 2014-01-26 00:00:00.000 2014-02-22 00:00:00.000
24 2014-02-23 00:00:00.000 2014-03-29 00:00:00.000
The user of a report wants the current financial year split into 12 periods and wants to feed two parameters into the report, a year and a period number, which will go into my SQL. So something like @year=2014 @period=1 will be received. I have to write some SQL that goes to this table and sets a period start date of 31/03/2014 and a period end date of 27/04/2014.
So in pseudo code:
Look up period 1 for 2014 and return period start date of 31/03/2014 and period end date of 27/04/2014.
@PERIOD_START_DATE = select the first period that starts in March for the given year. Every financial year starts in March.
@PERIOD_END_DATE = select the corresponding END_DATE from the table.
The question is how to begin coding this, or what my design approach should be. Should I create a function that calculates this, or should I write a CTE and add a column that holds the period number in the way they want?
Thinking about it more, I think I need a mapping table. So the real question is: can I do this without a mapping table?

DECLARE @Year INT
DECLARE @Period INT
SET @Year = 2013
SET @Period = 1
;WITH CTE AS
(
    SELECT *, ROW_NUMBER() OVER (
        -- financial year: January and February belong to the previous calendar year
        PARTITION BY
            CASE WHEN MONTH([START_DATE]) < 3 THEN YEAR([START_DATE]) - 1 ELSE YEAR([START_DATE]) END
        ORDER BY
            CASE WHEN MONTH([START_DATE]) < 3 THEN YEAR([START_DATE]) - 1 ELSE YEAR([START_DATE]) END
            ,CASE WHEN MONTH([START_DATE]) < 3 THEN MONTH([START_DATE]) + 12 ELSE MONTH([START_DATE]) END
        ) AS RN
    FROM Periods
)
SELECT * FROM CTE
WHERE RN = @Period
  AND CASE WHEN MONTH([START_DATE]) < 3 THEN YEAR([START_DATE]) - 1 ELSE YEAR([START_DATE]) END = @Year
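The query assigns each period to a financial year (January and February count towards the previous calendar year) and then numbers the periods inside that year. The same rule can be sketched in Python to check it against the table from the question. The function names are hypothetical and this is only an illustration of the logic, under the same assumption that a financial year's first period starts in March.

```python
from datetime import date

# (CHARGE_PERIOD_ID, START_DATE, END_DATE) from the question
PERIODS = [
    (13, date(2013, 3, 31), date(2013, 4, 27)),
    (14, date(2013, 4, 28), date(2013, 5, 25)),
    (15, date(2013, 5, 26), date(2013, 6, 29)),
    (16, date(2013, 6, 30), date(2013, 7, 27)),
    (17, date(2013, 7, 28), date(2013, 8, 24)),
    (18, date(2013, 8, 25), date(2013, 9, 28)),
    (19, date(2013, 9, 29), date(2013, 10, 26)),
    (20, date(2013, 10, 27), date(2013, 11, 23)),
    (21, date(2013, 11, 24), date(2013, 12, 28)),
    (22, date(2013, 12, 29), date(2014, 1, 25)),
    (23, date(2014, 1, 26), date(2014, 2, 22)),
    (24, date(2014, 2, 23), date(2014, 3, 29)),
]


def fiscal_year(start):
    """Periods starting in January or February belong to the previous year."""
    return start.year - 1 if start.month < 3 else start.year


def find_period(periods, year, period_no):
    """Return the period_no-th period (1-based) of the financial year, or None."""
    in_year = sorted(
        (p for p in periods if fiscal_year(p[1]) == year),
        key=lambda p: p[1],  # order by START_DATE
    )
    if 1 <= period_no <= len(in_year):
        return in_year[period_no - 1]
    return None


if __name__ == "__main__":
    print(find_period(PERIODS, 2013, 1))
    print(find_period(PERIODS, 2013, 12))
```

With the sample table, all twelve rows fall into financial year 2013, so a lookup for 2014 finds nothing until periods for that year are added.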
