I'd like to analyze some daily data by hydrologic year, which runs from 1 September to 31 August. I've created a synthetic data set with:
import pandas as pd
# daily index spanning several hydrologic years
t = pd.date_range(start='2015-01-01', freq='D', end='2021-09-03')
df = pd.DataFrame(index=t)
# September onwards belongs to the *next* hydrologic year
df['hydro_year'] = df.index.year
df.loc[df.index.month >= 9, 'hydro_year'] += 1
df['id'] = df['hydro_year'] - df.index.year[0]
df['count'] = 1
Note that in reality I do not have a hydro_year column, so I do not use groupby. I would expect the following to resample by hydrologic year:
print(df['2015-09-01':].resample('12M').agg({'hydro_year':'mean','id':'mean','count':'sum'}))
But the output does not align:
| | hydro_year | id | count |
|---------------------+------------+---------+-------|
| 2015-09-30 00:00:00 | 2016 | 1 | 30 |
| 2016-09-30 00:00:00 | 2016.08 | 1.08197 | 366 |
| 2017-09-30 00:00:00 | 2017.08 | 2.08219 | 365 |
| 2018-09-30 00:00:00 | 2018.08 | 3.08219 | 365 |
| 2019-09-30 00:00:00 | 2019.08 | 4.08219 | 365 |
| 2020-09-30 00:00:00 | 2020.08 | 5.08197 | 366 |
| 2021-09-30 00:00:00 | 2021.01 | 6.00888 | 338 |
However, if I start one day earlier, things do align, except that the first day is 'early' and left dangling on its own:
| | hydro_year | id | count |
|---------------------+------------+----+-------|
| 2015-08-31 00:00:00 | 2015 | 0 | 1 |
| 2016-08-31 00:00:00 | 2016 | 1 | 366 |
| 2017-08-31 00:00:00 | 2017 | 2 | 365 |
| 2018-08-31 00:00:00 | 2018 | 3 | 365 |
| 2019-08-31 00:00:00 | 2019 | 4 | 365 |
| 2020-08-31 00:00:00 | 2020 | 5 | 366 |
| 2021-08-31 00:00:00 | 2021 | 6 | 365 |
| 2022-08-31 00:00:00 | 2022 | 7 | 3 |
IIUC, you can use 12MS (month Start) instead of 12M. With 12M, bins are anchored to month ends, so the first bin closes at the first month end on or after your start date (2015-09-30), splitting each hydrologic year across two bins; 12MS anchors the bins to month starts instead:
>>> df['2015-09-01':].resample('12MS') \
.agg({'hydro_year':'mean','id':'mean','count':'sum'})
hydro_year id count
2015-09-01 2016.0 1.0 366
2016-09-01 2017.0 2.0 365
2017-09-01 2018.0 3.0 365
2018-09-01 2019.0 4.0 365
2019-09-01 2020.0 5.0 366
2020-09-01 2021.0 6.0 365
2021-09-01 2022.0 7.0 3
We can also try an anchored offset: AS-SEP resamples annually with years starting in September. (On pandas 2.2 and later the equivalent alias is YS-SEP, since the A-prefixed aliases are deprecated.)
resampled_df = df['2015-09-01':].resample('AS-SEP').agg({
'hydro_year': 'mean', 'id': 'mean', 'count': 'sum'
})
hydro_year id count
2015-09-01 2016.0 1.0 366
2016-09-01 2017.0 2.0 365
2017-09-01 2018.0 3.0 365
2018-09-01 2019.0 4.0 365
2019-09-01 2020.0 5.0 366
2020-09-01 2021.0 6.0 365
2021-09-01 2022.0 7.0 3
I would appreciate help with how I can get the result below.
My original table:
Date | Indicator | Value
2020-01-01 | 1 | 3000.00
2020-01-02 | 1 | 2500.00
2020-01-03 | 1 | 1000.00
2020-01-01 | 2 | 12.50
2020-01-02 | 2 | 13.23
2020-01-03 | 2 | 14.24
2020-01-01 | 3 | 150.00
2020-01-02 | 3 | 300.00
2020-01-03 | 3 | 200.00
I need to propagate the value of indicator 1 to the rest of the indicators, matching on the date:
Date | Indicator | Value | Result
2020-01-01 | 1 | 3000.00 | 3000.00
2020-01-02 | 1 | 2500.00 | 2500.00
2020-01-03 | 1 | 1000.00 | 1000.00
2020-01-01 | 2 | 12.50 | 3000.00
2020-01-02 | 2 | 13.23 | 2500.00
2020-01-03 | 2 | 14.24 | 1000.00
2020-01-01 | 3 | 150.00 | 3000.00
2020-01-02 | 3 | 300.00 | 2500.00
2020-01-03 | 3 | 200.00 | 1000.00
One method uses a window function:
select t.*,
first_value(value) over (partition by date order by indicator) as result
from t;
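Within each date partition, first_value() ordered by indicator returns the value from the row with the lowest indicator, i.e. indicator 1 in your data (note it would fall back to the next-lowest indicator on a date where indicator 1 is missing). If window functions aren't available, a self-join against the indicator-1 rows gives the same result; a sketch, assuming one row per (date, indicator) pair:
select t.*, t1.value as result
from t
join t t1
  on t1.date = t.date
 and t1.indicator = 1;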
HDP 3.1.4: I have a table containing parquet timestamps with hourly data, and the Hive server is pushing the hour data into the next date, as shown below. Please compare before and after 29 Mar 2020, when the clocks switched to BST (daylight saving):
| 2020-03-22 | 2020-03-22 00:00:59.0 | 2020-03-22 23:59:59.0 |
| 2020-03-23 | 2020-03-23 00:00:59.0 | 2020-03-23 23:59:59.0 |
| 2020-03-24 | 2020-03-24 00:00:59.0 | 2020-03-24 23:59:59.0 |
| 2020-03-25 | 2020-03-25 00:00:59.0 | 2020-03-25 23:59:59.0 |
| 2020-03-26 | 2020-03-26 00:00:59.0 | 2020-03-26 23:59:59.0 |
| 2020-03-27 | 2020-03-27 00:00:59.0 | 2020-03-27 23:59:59.0 |
| 2020-03-28 | 2020-03-28 00:00:59.0 | 2020-03-28 23:59:59.0 |
| 2020-03-29 | 2020-03-29 00:00:59.0 | 2020-03-30 00:59:59.0 |
| 2020-03-30 | 2020-03-30 01:00:59.0 | 2020-03-31 00:59:59.0 |
| 2020-03-31 | 2020-03-31 01:00:59.0 | 2020-04-01 00:59:59.0 |
| 2020-04-01 | 2020-04-01 01:00:59.0 | 2020-04-02 00:59:59.0 |
| 2020-04-02 | 2020-04-02 01:00:59.0 | 2020-04-03 00:59:59.0 |
When writing to a parquet table in Hive, make sure the timestamp values are in UTC, and set the time zone in Hive to match the local timezone:
set time zone LOCAL;
or
set time zone '+1:00';
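Note that a fixed offset such as '+1:00' will not follow the daylight-saving switch itself. Since the data appears to be on UK time, a region-based zone may be the safer choice; a sketch, assuming your Hive version accepts IANA zone names here:
-- assumption: a region-based zone tracks the GMT/BST change automatically,
-- unlike the fixed '+1:00' offset
set time zone 'Europe/London';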
I want to get a breakdown by Name, Year/Month and Total. How can I do that with what I've got so far?
My data looks like this:
| name | ArtifactID | Name | DateCollected | FileSizeInBytes | WorkspaceArtifactId | TimestampOfLatestRecord |
+---------+------------+---------------------------+-------------------------+-----------------+---------------------+-------------------------+
| Pony | 1265555 | LiteDataPublishedToReview | 2018-12-21 00:00:00.000 | 5474.00 | 2534710 | 2018-12-21 09:26:49.000 |
| Wheels | 1265566 | LiteDataPublishedToReview | 2019-02-26 00:00:00.000 | 50668.00 | 2634282 | 2019-02-26 17:38:39.000 |
| Wheels | 1265567 | LiteDataPublishedToReview | 2019-01-11 00:00:00.000 | 10921638320.00 | 2634282 | 2019-01-11 16:44:04.000 |
| Wheels | 1265568 | LiteDataPublishedToReview | 2019-01-15 00:00:00.000 | 110261521.00 | 2634282 | 2019-01-15 17:43:57.000 |
| Wheels | 1265569 | LiteDataProcessed | 2018-12-13 00:00:00.000 | 123187605031.00 | 2634282 | 2018-12-13 21:50:34.000 |
| Wheels | 1265570 | FullDataProcessed | 2018-12-13 00:00:00.000 | 6810556609.00 | 2634282 | 2018-12-13 21:50:34.000 |
| Wheels | 1265571 | LiteDataProcessed | 2018-12-15 00:00:00.000 | 0.00 | 2634282 | 2018-12-15 14:52:20.000 |
| Wheels | 1265572 | FullDataProcessed | 2018-12-15 00:00:00.000 | 13362690.00 | 2634282 | 2018-12-15 14:52:20.000 |
| Wheels | 1265573 | LiteDataProcessed | 2019-01-09 00:00:00.000 | 1480303616.00 | 2634282 | 2019-01-09 13:52:23.000 |
| Wheels | 1265574 | FullDataProcessed | 2019-01-09 00:00:00.000 | 0.00 | 2634282 | 2019-01-09 13:52:23.000 |
| Wheels | 1265575 | LiteDataProcessed | 2019-02-25 00:00:00.000 | 0.00 | 2634282 | 2019-02-25 10:49:41.000 |
| Wheels | 1265576 | FullDataProcessed | 2019-02-25 00:00:00.000 | 7633201.00 | 2634282 | 2019-02-25 10:49:41.000 |
| Levack | 1265577 | LiteDataProcessed | 2018-12-16 00:00:00.000 | 0.00 | 2636230 | 2018-12-16 10:13:36.000 |
| Levack | 1265578 | FullDataProcessed | 2018-12-16 00:00:00.000 | 59202559.00 | 2636230 | 2018-12-16 10:13:36.000 |
| Van | 1265579 | LiteDataPublishedToReview | 2019-01-11 00:00:00.000 | 2646602711.00 | 2636845 | 2019-01-11 09:50:49.000 |
| Van | 1265580 | LiteDataPublishedToReview | 2019-01-10 00:00:00.000 | 10081222022.00 | 2636845 | 2019-01-10 18:32:03.000 |
| Van | 1265581 | LiteDataPublishedToReview | 2019-01-15 00:00:00.000 | 3009227476.00 | 2636845 | 2019-01-15 10:49:38.000 |
| Van | 1265582 | LiteDataPublishedToReview | 2019-03-26 00:00:00.000 | 87220831.00 | 2636845 | 2019-03-26 10:34:10.000 |
| Van | 1265583 | LiteDataPublishedToReview | 2019-03-28 00:00:00.000 | 688708119.00 | 2636845 | 2019-03-28 14:11:38.000 |
| Van | 1265584 | LiteDataProcessed | 2018-12-18 00:00:00.000 | 5408886887.00 | 2636845 | 2018-12-18 11:29:03.000 |
| Van | 1265585 | FullDataProcessed | 2018-12-18 00:00:00.000 | 0.00 | 2636845 | 2018-12-18 11:29:03.000 |
| Van | 1265586 | LiteDataProcessed | 2018-12-19 00:00:00.000 | 12535359488.00 | 2636845 | 2018-12-19 17:25:10.000 |
| Van | 1265587 | FullDataProcessed | 2018-12-19 00:00:00.000 | 0.00 | 2636845 | 2018-12-19 17:25:10.000 |
| Van | 1265588 | LiteDataProcessed | 2018-12-21 00:00:00.000 | 52599693312.00 | 2636845 | 2018-12-21 09:09:18.000 |
| Van | 1265589 | FullDataProcessed | 2018-12-21 00:00:00.000 | 0.00 | 2636845 | 2018-12-21 09:09:18.000 |
| Van | 1265590 | LiteDataProcessed | 2019-03-25 00:00:00.000 | 3588613120.00 | 2636845 | 2019-03-25 16:41:17.000 |
| Van | 1265591 | FullDataProcessed | 2019-03-25 00:00:00.000 | 0.00 | 2636845 | 2019-03-25 16:41:17.000 |
| Holiday | 1265592 | LiteDataProcessed | 2018-12-28 00:00:00.000 | 0.00 | 2638126 | 2018-12-28 09:15:21.000 |
| Holiday | 1265593 | FullDataProcessed | 2018-12-28 00:00:00.000 | 9219122847.00 | 2638126 | 2018-12-28 09:15:21.000 |
| Holiday | 1265594 | LiteDataProcessed | 2019-01-31 00:00:00.000 | 0.00 | 2638126 | 2019-01-31 14:45:07.000 |
| Holiday | 1265595 | FullDataProcessed | 2019-01-31 00:00:00.000 | 61727744.00 | 2638126 | 2019-01-31 14:45:07.000 |
| Holiday | 1265596 | LiteDataProcessed | 2019-02-05 00:00:00.000 | 0.00 | 2638126 | 2019-02-05 15:23:27.000 |
| Holiday | 1265597 | FullDataProcessed | 2019-02-05 00:00:00.000 | 199454805.00 | 2638126 | 2019-02-05 15:23:27.000 |
| Holiday | 1265598 | LiteDataProcessed | 2019-02-07 00:00:00.000 | 0.00 | 2638126 | 2019-02-07 11:55:55.000 |
| Holiday | 1265599 | FullDataProcessed | 2019-02-07 00:00:00.000 | 17944713.00 | 2638126 | 2019-02-07 11:55:55.000 |
| Holiday | 1265600 | LiteDataProcessed | 2019-02-13 00:00:00.000 | 0.00 | 2638126 | 2019-02-13 15:48:56.000 |
| Holiday | 1265601 | FullDataProcessed | 2019-02-13 00:00:00.000 | 60421568.00 | 2638126 | 2019-02-13 15:48:56.000 |
| Crosbie | 1265604 | LiteDataProcessed | 2019-01-21 00:00:00.000 | 0.00 | 2644032 | 2019-01-21 15:43:43.000 |
| Crosbie | 1265605 | FullDataProcessed | 2019-01-21 00:00:00.000 | 131445.00 | 2644032 | 2019-01-21 15:43:43.000 |
| Stone | 1265606 | LiteDataPublishedToReview | 2019-02-12 00:00:00.000 | 1626943444.00 | 2647518 | 2019-02-12 17:45:25.000 |
| Stone | 1265607 | LiteDataPublishedToReview | 2019-03-05 00:00:00.000 | 2134872671.00 | 2647518 | 2019-03-05 13:00:31.000 |
| Stone | 1265608 | LiteDataProcessed | 2019-02-05 00:00:00.000 | 38828043264.00 | 2647518 | 2019-02-05 09:40:55.000 |
| Stone | 1265609 | FullDataProcessed | 2019-02-05 00:00:00.000 | 0.00 | 2647518 | 2019-02-05 09:40:55.000 |
| Frost | 1265610 | LiteDataPublishedToReview | 2019-03-18 00:00:00.000 | 776025640.00 | 2658542 | 2019-03-18 12:34:10.000 |
| Frost | 1265611 | LiteDataPublishedToReview | 2019-03-05 00:00:00.000 | 3325335118.00 | 2658542 | 2019-03-05 15:02:39.000 |
| Frost | 1265612 | LiteDataPublishedToReview | 2019-03-20 00:00:00.000 | 211927893.00 | 2658542 | 2019-03-20 17:25:30.000 |
| Frost | 1265613 | LiteDataPublishedToReview | 2019-03-06 00:00:00.000 | 466536488.00 | 2658542 | 2019-03-06 11:00:59.000 |
| Frost | 1265614 | LiteDataPublishedToReview | 2019-03-21 00:00:00.000 | 3863850553.00 | 2658542 | 2019-03-21 17:14:27.000 |
| Frost | 1265615 | LiteDataProcessed | 2019-02-28 00:00:00.000 | 94249740012.00 | 2658542 | 2019-02-28 14:13:23.000 |
| Frost | 1265616 | FullDataProcessed | 2019-02-28 00:00:00.000 | 0.00 | 2658542 | 2019-02-28 14:13:23.000 |
| Yellow | 1265617 | LiteDataPublishedToReview | 2019-03-27 00:00:00.000 | 4550540631.00 | 2659077 | 2019-03-27 16:09:41.000 |
| Yellow | 1265618 | LiteDataProcessed | 2019-03-07 00:00:00.000 | 0.00 | 2659077 | 2019-03-07 16:53:16.000 |
| Yellow | 1265619 | FullDataProcessed | 2019-03-07 00:00:00.000 | 96139872.00 | 2659077 | 2019-03-07 16:53:16.000 |
| Yellow | 1265620 | LiteDataProcessed | 2019-03-08 00:00:00.000 | 105357273318.00 | 2659077 | 2019-03-08 16:43:24.000 |
| Yellow | 1265621 | FullDataProcessed | 2019-03-08 00:00:00.000 | 0.00 | 2659077 | 2019-03-08 16:43:24.000 |
+---------+------------+---------------------------+-------------------------+-----------------+---------------------+-------------------------+
This is my attempt:
SELECT
CAST(YEAR(ps.DateCollected) AS VARCHAR(4)) + '-' + right('00' + CAST(MONTH(ps.DateCollected) AS VARCHAR(2)), 2),
ps.[Name],
c.name,
ceiling(SUM(ps.FileSizeInBytes)/1024/1024/1024.0) [Processed]
FROM EDDSDBO.RPCCProcessingStatistics ps
inner join edds.eddsdbo.[case] c on c.artifactid = ps.workspaceartifactid
where ps.DateCollected >= '2018-12-01'
GROUP BY ps.name, c.name, CAST(YEAR(ps.DateCollected) AS VARCHAR(4)) + '-' + right('00' + CAST(MONTH(ps.DateCollected) AS VARCHAR(2)), 2)
The logic should be this:
(1) Get all values after 2018-12-01 in bytes
(2) Total them
(3) Convert to GB
(4) Ceiling the result
When I run my code and add the results together for FullDataProcessed, I get 22. However, when I manually add up the values for FullDataProcessed, I get 15.40 GB, which when ceiling'd is 16.
I would expect the FullDataProcessed from the results of my code to equal 16, not 22.
I would guess that one or more of your records has its workspaceartifactid specified more than once in the edds.eddsdbo.[case] table. Is the primary key on the case table more than just artifactid?
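A quick way to check; this should return no rows if artifactid alone is unique in that table:
select artifactid, count(*) as duplicates
from edds.eddsdbo.[case]
group by artifactid
having count(*) > 1;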
I am trying to create a table which contains fiscal day, month, and year. However, I want to add an actual date column to the result as well.
My query:
WITH RECURSIVE TMP_FISCAL_DAY
(FISCAL_DAY, BEGIN_DATE, END_DATE, FISCAL_MONTH, FISCAL_QUARTER, FISCAL_YEAR) AS
(SELECT CAST(1 AS INT), BEGIN_DATE, END_DATE, FISCAL_MONTH, FISCAL_QUARTER, FISCAL_YEAR FROM DB_NAME.ORIGINAL_FISCAL_TABLE
UNION ALL
SEL FISCAL_DAY+1, BEGIN_DATE, END_DATE, FISCAL_MONTH, FISCAL_QUARTER, FISCAL_YEAR
FROM TMP_FISCAL_DAY WHERE BEGIN_DATE<END_DATE AND FISCAL_DAY<END_DATE-BEGIN_DATE)
SEL * FROM TMP_FISCAL_DAY
Output
+------------+------------+------------+--------------+----------------+-------------+
| FISCAL_DAY | BEGIN_DATE | END_DATE | FISCAL_MONTH | FISCAL_QUARTER | FISCAL_YEAR |
+------------+------------+------------+--------------+----------------+-------------+
| 1 | 12/30/2017 | 02/02/2018 | 12 | 4 | 2,018 |
| 2 | 12/30/2017 | 02/02/2018 | 12 | 4 | 2,018 |
| 3 | 12/30/2017 | 02/02/2018 | 12 | 4 | 2,018 |
| 4 | 12/30/2017 | 02/02/2018 | 12 | 4 | 2,018 |
| 5 | 12/30/2017 | 02/02/2018 | 12 | 4 | 2,018 |
| 6 | 12/30/2017 | 02/02/2018 | 12 | 4 | 2,018 |
| 7 | 12/30/2017 | 02/02/2018 | 12 | 4 | 2,018 |
| 8 | 12/30/2017 | 02/02/2018 | 12 | 4 | 2,018 |
| 9 | 12/30/2017 | 02/02/2018 | 12 | 4 | 2,018 |
| 10 | 12/30/2017 | 02/02/2018 | 12 | 4 | 2,018 |
| 11 | 12/30/2017 | 02/02/2018 | 12 | 4 | 2,018 |
| 12 | 12/30/2017 | 02/02/2018 | 12 | 4 | 2,018 |
| 13 | 12/30/2017 | 02/02/2018 | 12 | 4 | 2,018 |
| 14 | 12/30/2017 | 02/02/2018 | 12 | 4 | 2,018 |
| 15 | 12/30/2017 | 02/02/2018 | 12 | 4 | 2,018 |
| 16 | 12/30/2017 | 02/02/2018 | 12 | 4 | 2,018 |
| 17 | 12/30/2017 | 02/02/2018 | 12 | 4 | 2,018 |
| 18 | 12/30/2017 | 02/02/2018 | 12 | 4 | 2,018 |
| 19 | 12/30/2017 | 02/02/2018 | 12 | 4 | 2,018 |
| 20 | 12/30/2017 | 02/02/2018 | 12 | 4 | 2,018 |
| 21 | 12/30/2017 | 02/02/2018 | 12 | 4 | 2,018 |
| 22 | 12/30/2017 | 02/02/2018 | 12 | 4 | 2,018 |
| 23 | 12/30/2017 | 02/02/2018 | 12 | 4 | 2,018 |
| 24 | 12/30/2017 | 02/02/2018 | 12 | 4 | 2,018 |
| 25 | 12/30/2017 | 02/02/2018 | 12 | 4 | 2,018 |
| 26 | 12/30/2017 | 02/02/2018 | 12 | 4 | 2,018 |
| 27 | 12/30/2017 | 02/02/2018 | 12 | 4 | 2,018 |
| 28 | 12/30/2017 | 02/02/2018 | 12 | 4 | 2,018 |
| 29 | 12/30/2017 | 02/02/2018 | 12 | 4 | 2,018 |
| 30 | 12/30/2017 | 02/02/2018 | 12 | 4 | 2,018 |
| 31 | 12/30/2017 | 02/02/2018 | 12 | 4 | 2,018 |
| 32 | 12/30/2017 | 02/02/2018 | 12 | 4 | 2,018 |
| 33 | 12/30/2017 | 02/02/2018 | 12 | 4 | 2,018 |
| 34 | 12/30/2017 | 02/02/2018 | 12 | 4 | 2,018 |
+------------+------------+------------+--------------+----------------+-------------+
Expected output
+------------+-------------+------------+----------+--------------+----------------+-------------+
| FISCAL_DAY | Actual Date | BEGIN_DATE | END_DATE | FISCAL_MONTH | FISCAL_QUARTER | FISCAL_YEAR |
+------------+-------------+------------+----------+--------------+----------------+-------------+
| 1 | 12/30/2017 | 12/30/2017 | 2/2/2018 | 12 | 4 | 2,018 |
| 2 | 12/31/2017 | 12/30/2017 | 2/2/2018 | 12 | 4 | 2,018 |
| 3 | 1/1/2018 | 12/30/2017 | 2/2/2018 | 12 | 4 | 2,018 |
| 4 | 1/2/2018 | 12/30/2017 | 2/2/2018 | 12 | 4 | 2,018 |
| 5 | 1/3/2018 | 12/30/2017 | 2/2/2018 | 12 | 4 | 2,018 |
| 6 | 1/4/2018 | 12/30/2017 | 2/2/2018 | 12 | 4 | 2,018 |
| 7 | 1/5/2018 | 12/30/2017 | 2/2/2018 | 12 | 4 | 2,018 |
| 8 | 1/6/2018 | 12/30/2017 | 2/2/2018 | 12 | 4 | 2,018 |
| 9 | 1/7/2018 | 12/30/2017 | 2/2/2018 | 12 | 4 | 2,018 |
| 10 | 1/8/2018 | 12/30/2017 | 2/2/2018 | 12 | 4 | 2,018 |
| 11 | 1/9/2018 | 12/30/2017 | 2/2/2018 | 12 | 4 | 2,018 |
| 12 | 1/10/2018 | 12/30/2017 | 2/2/2018 | 12 | 4 | 2,018 |
| 13 | 1/11/2018 | 12/30/2017 | 2/2/2018 | 12 | 4 | 2,018 |
| 14 | 1/12/2018 | 12/30/2017 | 2/2/2018 | 12 | 4 | 2,018 |
| 15 | 1/13/2018 | 12/30/2017 | 2/2/2018 | 12 | 4 | 2,018 |
| 16 | 1/14/2018 | 12/30/2017 | 2/2/2018 | 12 | 4 | 2,018 |
| 17 | 1/15/2018 | 12/30/2017 | 2/2/2018 | 12 | 4 | 2,018 |
| 18 | 1/16/2018 | 12/30/2017 | 2/2/2018 | 12 | 4 | 2,018 |
| 19 | 1/17/2018 | 12/30/2017 | 2/2/2018 | 12 | 4 | 2,018 |
| 20 | 1/18/2018 | 12/30/2017 | 2/2/2018 | 12 | 4 | 2,018 |
| 21 | 1/19/2018 | 12/30/2017 | 2/2/2018 | 12 | 4 | 2,018 |
| 22 | 1/20/2018 | 12/30/2017 | 2/2/2018 | 12 | 4 | 2,018 |
| 23 | 1/21/2018 | 12/30/2017 | 2/2/2018 | 12 | 4 | 2,018 |
| 24 | 1/22/2018 | 12/30/2017 | 2/2/2018 | 12 | 4 | 2,018 |
| 25 | 1/23/2018 | 12/30/2017 | 2/2/2018 | 12 | 4 | 2,018 |
| 26 | 1/24/2018 | 12/30/2017 | 2/2/2018 | 12 | 4 | 2,018 |
| 27 | 1/25/2018 | 12/30/2017 | 2/2/2018 | 12 | 4 | 2,018 |
| 28 | 1/26/2018 | 12/30/2017 | 2/2/2018 | 12 | 4 | 2,018 |
| 29 | 1/27/2018 | 12/30/2017 | 2/2/2018 | 12 | 4 | 2,018 |
| 30 | 1/28/2018 | 12/30/2017 | 2/2/2018 | 12 | 4 | 2,018 |
| 31 | 1/29/2018 | 12/30/2017 | 2/2/2018 | 12 | 4 | 2,018 |
| 32 | 1/30/2018 | 12/30/2017 | 2/2/2018 | 12 | 4 | 2,018 |
| 33 | 1/31/2018 | 12/30/2017 | 2/2/2018 | 12 | 4 | 2,018 |
| 34 | 2/1/2018 | 12/30/2017 | 2/2/2018 | 12 | 4 | 2,018 |
+------------+-------------+------------+----------+--------------+----------------+-------------+
How do I put the date into the recursion so that the actual dates show up?
My attempt (incorrect results):
WITH RECURSIVE TMP_FISCAL_DAY
(FISCAL_DAY, ACTUAL_DATE, BEGIN_DATE ,END_DATE ,FISCAL_MONTH,FISCAL_QUARTER,FISCAL_YEAR ) AS
(SELECT CAST(1 AS INT) ,cast(current_date as date), begin_date,end_DATE,FISCAL_MONTH,FISCAL_QUARTER,FISCAL_YEAR FROM DB_NAME.ORIGINAL_FISCAL_TABLE
UNION ALL
SEL Fiscal_Day+1,ACTUAL_DATE+FISCAL_DAY,begin_date,end_DATE,FISCAL_MONTH,FISCAL_QUARTER,FISCAL_YEAR
FROM TMP_FISCAL_DAY WHERE BEGIN_DATE<END_DATE AND FISCAL_DAY<END_DATE-BEGIN_DATE)
SEL * FROM TMP_FISCAL_DAY where CURRENT_DATE BETWEEN BEGIN_DATE AND END_DATE
Assuming there's one row per fiscal month in your ORIGINAL_FISCAL_TABLE, you should filter for the current month before the recursion and seed ACTUAL_DATE with BEGIN_DATE instead of CURRENT_DATE, advancing it by one day per recursion step:
WITH RECURSIVE TMP_FISCAL_DAY ( FISCAL_DAY, ACTUAL_DATE, BEGIN_DATE ,END_DATE ,FISCAL_MONTH,FISCAL_QUARTER,FISCAL_YEAR )
AS
(
SELECT
Cast(1 AS INT)
,BEGIN_DATE
,begin_date
,end_DATE
,FISCAL_MONTH
,FISCAL_QUARTER
,FISCAL_YEAR
FROM DB_NAME.ORIGINAL_FISCAL_TABLE
WHERE Current_Date BETWEEN BEGIN_DATE AND END_DATE
UNION ALL
SELECT
Fiscal_Day+1
,ACTUAL_DATE+1
,begin_date
,end_DATE
,FISCAL_MONTH
,FISCAL_QUARTER
,FISCAL_YEAR
FROM TMP_FISCAL_DAY
WHERE ACTUAL_DATE+1 < END_DATE
)
SELECT *
FROM TMP_FISCAL_DAY
As @RonBallard wrote, there's no need for recursion: you can use Teradata's EXPAND ON instead, which expands each row into one row per day of the given period:
SELECT
ACTUAL_DATE - BEGIN_DATE + 1 AS Fiscal_Day, dt.*
FROM
(
SELECT Begin(pd) AS ACTUAL_DATE, t.*
FROM ORIGINAL_FISCAL_TABLE AS t
WHERE Current_Date BETWEEN BEGIN_DATE AND END_DATE
EXPAND ON PERIOD(BEGIN_DATE, END_DATE) AS pd
) AS dt
But ultimately there should be no need for any calculation at all; every company should have a calendar table with pre-calculated data:
SELECT ...
FROM myCalendar
WHERE Current_Date BETWEEN FISCAL_MONTH_BEGIN_DATE AND FISCAL_MONTH_END_DATE
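For this question, such a lookup might be as simple as the following; a sketch, assuming hypothetical pre-computed columns calendar_date and fiscal_day_of_month in that calendar table:
SELECT calendar_date,        -- the actual date (hypothetical column)
       fiscal_day_of_month   -- pre-computed fiscal day (hypothetical column)
FROM myCalendar
WHERE Current_Date BETWEEN FISCAL_MONTH_BEGIN_DATE AND FISCAL_MONTH_END_DATE;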