PostgreSQL - How to get the value from the last record of each month - sql
I have a view like this:
Year | Month | Week | Category | Value
2017 | 1 | 1 | A | 1
2017 | 1 | 1 | B | 2
2017 | 1 | 1 | C | 3
2017 | 1 | 2 | A | 4
2017 | 1 | 2 | B | 5
2017 | 1 | 2 | C | 6
2017 | 1 | 3 | A | 7
2017 | 1 | 3 | B | 8
2017 | 1 | 3 | C | 9
2017 | 1 | 4 | A | 10
2017 | 1 | 4 | B | 11
2017 | 1 | 4 | C | 12
2017 | 2 | 5 | A | 1
2017 | 2 | 5 | B | 2
2017 | 2 | 5 | C | 3
2017 | 2 | 6 | A | 4
2017 | 2 | 6 | B | 5
2017 | 2 | 6 | C | 6
2017 | 2 | 7 | A | 7
2017 | 2 | 7 | B | 8
2017 | 2 | 7 | C | 9
2017 | 2 | 8 | A | 10
2017 | 2 | 8 | B | 11
2017 | 2 | 8 | C | 12
And I need to make a new view which shows the average of the value column (let's call it avg_val) and the value from the max week of the month (max_val_of_month). Ex: the max week of January is 4, so the value for category A is 10. Or, to be clear, something like this:
Year | Month | Category | avg_val | max_val_of_month
2017 | 1 | A | 5.5 | 10
2017 | 1 | B | 6.5 | 11
2017 | 1 | C | 7.5 | 12
2017 | 2 | A | 5.5 | 10
2017 | 2 | B | 6.5 | 11
2017 | 2 | C | 7.5 | 12
I have used a window function, OVER (PARTITION BY year, month, category), to get the avg value. But how can I get the value from the max week of each month?
Assuming that you need a monthly average and the value for the max week, not the max value per month:
SELECT year, month, category, avg_val, value max_week_val
FROM (
SELECT *,
AVG(value) OVER (PARTITION BY year, month, category) avg_val,
ROW_NUMBER() OVER (PARTITION BY year, month, category ORDER BY week DESC) rn
FROM view1
) q
WHERE rn = 1
ORDER BY year, month, category
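As a quick sanity check (not part of the original answer), the ROW_NUMBER() query can be reproduced through Python's sqlite3 module, assuming SQLite >= 3.25 for window function support; view1 is rebuilt in memory from the January sample data:

```python
import sqlite3

# Rebuild the January portion of the sample view in memory.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE view1 (year INT, month INT, week INT, category TEXT, value REAL)")
rows = [(2017, 1, w, c, v)
        for w, base in ((1, 1), (2, 4), (3, 7), (4, 10))
        for c, v in zip("ABC", (base, base + 1, base + 2))]
conn.executemany("INSERT INTO view1 VALUES (?, ?, ?, ?, ?)", rows)

# The ROW_NUMBER() approach: rn = 1 marks the row of the latest week
# within each (year, month, category) partition.
result = conn.execute("""
    SELECT year, month, category, avg_val, value AS max_week_val
    FROM (SELECT *,
                 AVG(value) OVER (PARTITION BY year, month, category) AS avg_val,
                 ROW_NUMBER() OVER (PARTITION BY year, month, category
                                    ORDER BY week DESC) AS rn
          FROM view1) q
    WHERE rn = 1
    ORDER BY year, month, category
""").fetchall()
# January: A -> (5.5, 10), B -> (6.5, 11), C -> (7.5, 12)
```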
Or a more verbose version without window functions:
SELECT q.year, q.month, q.category, q.avg_val, v.value max_week_val
FROM (
SELECT year, month, category, avg(value) avg_val, MAX(week) max_week
FROM view1
GROUP BY year, month, category
) q JOIN view1 v
ON q.year = v.year
AND q.month = v.month
AND q.category = v.category
AND q.max_week = v.week
ORDER BY year, month, category
Here is a dbfiddle demo for both queries
And here is my NEW version.
Thanks to @peterm for pointing out the previously incorrect value of val_from_max_week_of_month. I corrected it:
SELECT
a.Year,
a.Month,
a.Category,
max(a.Week) AS max_week,
AVG(a.Value) AS avg_val,
(
SELECT b.Value
FROM decades AS b
WHERE
b.Year = a.Year AND
b.Month = a.Month AND
b.Week = max(a.Week) AND
b.Category = a.Category
) AS val_from_max_week_of_month
FROM decades AS a
GROUP BY
a.Year,
a.Month,
a.Category
;
The new results:
First, you might need to check how you handle the first week of January. If the 1st of January is not a Monday, there are several interpretations, and not every one of them fits the solutions here. You'll need to use either:
the ISO week concept, i.e. the week column should hold the ISO week and the year column should hold the ISO year (week-year, rather). Note: in this concept, the 1st of January sometimes actually belongs to the previous year; or
your own concept, where the first week of the year is "split" into two if the 1st of January is not a Monday.
Note: the solutions below will not work if (in your table) the first week of January can be 52 or 53.
avg_val is just a simple aggregation, while max_val_of_month can be calculated with typical greatest-n-per-group queries. That problem has a lot of possible solutions in PostgreSQL, with varying performance. Fortunately, your query naturally has an easily determined selectivity: you'll always need (approximately) a quarter of your data.
Usual winners (in performance) are:
(This is no surprise, though, as these two tend to perform better and better as you need a larger portion of the original data.)
array_agg() with order by variant:
select year, month, category, avg(value) avg_val,
(array_agg(value order by week desc))[1] max_val_of_month
from table_name
group by year, month, category;
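What the (array_agg(value order by week desc))[1] trick computes can also be written out in plain Python (a sketch I added for illustration, not from the answer): per (year, month, category) group, the average of the values plus the value belonging to the highest week.

```python
from collections import defaultdict

# January sample rows: (year, month, week, category, value).
rows = [(2017, 1, w, c, v)
        for w, base in ((1, 1), (2, 4), (3, 7), (4, 10))
        for c, v in zip("ABC", (base, base + 1, base + 2))]

groups = defaultdict(list)
for year, month, week, cat, val in rows:
    groups[(year, month, cat)].append((week, val))

# For each group: (average value, value at the maximum week).
summary = {key: (sum(v for _, v in wv) / len(wv), max(wv)[1])
           for key, wv in groups.items()}
# summary[(2017, 1, 'A')] == (5.5, 10)
```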
distinct on variant:
select distinct on (year, month, category) year, month, category,
avg(value) over (partition by year, month, category) avg_val,
value max_val_of_month
from table_name
order by year, month, category, week desc;
The pure window function variant is not that bad either:
row_number() variant:
select year, month, category, avg_val, max_val_of_month
from (select year, month, category, value max_val_of_month,
avg(value) over (partition by year, month, category) avg_val,
row_number() over (partition by year, month, category order by week desc) rn
from table_name) w
where rn = 1;
But the LATERAL variant is only viable with an index:
LATERAL variant:
create index idx_table_name_year_month_category_week_desc
on table_name(year, month, category, week desc);
select year, month, category,
avg(value) avg_val,
max_val_of_month
from table_name t
cross join lateral (select value max_val_of_month
from table_name
where (year, month, category) = (t.year, t.month, t.category)
order by week desc
limit 1) m
group by year, month, category, max_val_of_month;
But most of the solutions above can actually utilize this index, not just this last one.
Without the index: http://rextester.com/WNEL86809
With the index: http://rextester.com/TYUA52054
with data (yr, mnth, wk, cat, val) as
(
-- begin test data
select 2017 , 1 , 1 , 'A' , 1 from dual union all
select 2017 , 1 , 1 , 'B' , 2 from dual union all
select 2017 , 1 , 1 , 'C' , 3 from dual union all
select 2017 , 1 , 2 , 'A' , 4 from dual union all
select 2017 , 1 , 2 , 'B' , 5 from dual union all
select 2017 , 1 , 2 , 'C' , 6 from dual union all
select 2017 , 1 , 3 , 'A' , 7 from dual union all
select 2017 , 1 , 3 , 'B' , 8 from dual union all
select 2017 , 1 , 3 , 'C' , 9 from dual union all
select 2017 , 1 , 4 , 'A' , 10 from dual union all
select 2017 , 1 , 4 , 'B' , 11 from dual union all
select 2017 , 1 , 4 , 'C' , 12 from dual union all
select 2017 , 2 , 5 , 'A' , 1 from dual union all
select 2017 , 2 , 5 , 'B' , 2 from dual union all
select 2017 , 2 , 5 , 'C' , 3 from dual union all
select 2017 , 2 , 6 , 'A' , 4 from dual union all
select 2017 , 2 , 6 , 'B' , 5 from dual union all
select 2017 , 2 , 6 , 'C' , 6 from dual union all
select 2017 , 2 , 7 , 'A' , 7 from dual union all
select 2017 , 2 , 8 , 'A' , 10 from dual union all
select 2017 , 2 , 8 , 'B' , 11 from dual union all
select 2017 , 2 , 7 , 'B' , 8 from dual union all
select 2017 , 2 , 7 , 'C' , 9 from dual union all
select 2018 , 2 , 7 , 'C' , 9 from dual union all
select 2017 , 2 , 8 , 'C' , 12 from dual
-- end test data
)
select * from
(
select
-- data.*: all columns of the data table
data.*,
-- avrg: partition by a combination of year,month and category to work out -
-- the avg for each category in a month of a year
avg(val) over (partition by yr, mnth, cat) avrg,
-- mwk: partition by year and month to work out -
-- the max week of a month in a year
max(wk) over (partition by yr, mnth) mwk
from
data
)
-- as OP's interest is in the max week of each month of a year, -
-- "wk" column value is matched against
-- the derived column "mwk"
where wk = mwk
order by yr,mnth,cat;
Related
BigQuery Running Count of Unique ID per Year
I found a bunch of similar questions but none addressing this one specifically (correct me if I'm wrong). I am trying, on BigQuery, to index each row of a table with the running count of users per year using an analytic function. So with:

with dataset as (
  select 'A' as user, '2020' as year, RAND() as some_value union all
  select 'A' as user, '2020' as year, RAND() as some_value union all
  select 'B' as user, '2020' as year, RAND() as some_value union all
  select 'B' as user, '2020' as year, RAND() as some_value union all
  select 'B' as user, '2020' as year, RAND() as some_value union all
  select 'C' as user, '2020' as year, RAND() as some_value union all
  select 'C' as user, '2020' as year, RAND() as some_value union all
  select 'A' as user, '2021' as year, RAND() as some_value union all
  select 'A' as user, '2021' as year, RAND() as some_value union all
  select 'B' as user, '2021' as year, RAND() as some_value union all
  select 'C' as user, '2021' as year, RAND() as some_value union all
  select 'C' as user, '2021' as year, RAND() as some_value union all
  select 'C' as user, '2021' as year, RAND() as some_value union all
  select 'C' as user, '2021' as year, RAND() as some_value union all
  select 'C' as user, '2021' as year, RAND() as some_value
)

I would like to get:

rcount | user | year | some_value
1 | A | 2020 | 0.2365421124968884
1 | A | 2020 | 0.21087749308191206
2 | B | 2020 | 0.6096882013526258
2 | B | 2020 | 0.8544447727632739
2 | B | 2020 | 0.6113604025541309
3 | C | 2020 | 0.5803237472480643
3 | C | 2020 | 0.165305669127888
1 | A | 2021 | 0.1200575362708826
1 | A | 2021 | 0.015721175944171915
2 | B | 2021 | 0.21890252010457295
3 | C | 2021 | 0.5087613385277634
3 | C | 2021 | 0.9949262690813603
3 | C | 2021 | 0.50824183164116
3 | C | 2021 | 0.8262428736484341
3 | C | 2021 | 0.6866964737106948

I tried:

count(user) over (partition by year, user)

I also tried using ranges like order by year range between unbounded preceding and current row, and row_number(). I have no idea where to tap for a solution now.
A simpler solution would be to use DENSE_RANK:

SELECT DENSE_RANK() OVER (PARTITION BY year ORDER BY user) as rcount,
       user, year, some_value
FROM dataset

Information about DENSE_RANK can be found here.
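The DENSE_RANK logic isn't BigQuery-specific; as an illustration (my addition, using SQLite >= 3.25 through Python's sqlite3), it produces the expected rcount on a cut-down version of the dataset:

```python
import sqlite3

# "user" is quoted because it clashes with a keyword in some dialects.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE dataset ("user" TEXT, year TEXT)')
conn.executemany("INSERT INTO dataset VALUES (?, ?)",
                 [("A", "2020"), ("A", "2020"), ("B", "2020"),
                  ("A", "2021"), ("C", "2021"), ("C", "2021")])

# DENSE_RANK restarts at 1 for every year and increments per distinct user.
rcounts = conn.execute("""
    SELECT DENSE_RANK() OVER (PARTITION BY year ORDER BY "user") AS rcount,
           "user", year
    FROM dataset
    ORDER BY year, "user"
""").fetchall()
```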
Try the following:

select user
     , year
     , some_value
     , sum(count) over (partition by year order by year, user ROWS UNBOUNDED PRECEDING) as rcount
from (
  select user
       , year
       , some_value
       , IF(lag(user,1) OVER (order by year,user)=user,0,1) count
  from dataset
)

The sub-select defines the logic of whether to count the record or not, based on what the previous row was; then we simply perform a sum in the outer select.
Check if a month is skipped then add values dynamically?
I have a set of data from a table that is only populated if a user has data for a certain month, like this:

Month | MonthName | Value
3 | March | 136.00
4 | April | 306.00
7 | July | 476.00
12 | December | 510.48

But what I need is to check whether a month is skipped and, if so, add the value from the month before, so the end result would be like this:

Month | MonthName | Value
3 | March | 136.00
4 | April | 306.00
5 | May | 306.00 -- added data
6 | June | 306.00 -- added data
7 | July | 476.00
8 | August | 476.00 -- added data
9 | September | 476.00 -- added data
10 | October | 476.00 -- added data
11 | November | 476.00 -- added data
12 | December | 510.48

How can I do this dynamically in SQL Server?
One method is a recursive CTE:

with cte as (
      select month, value, lead(month) over (order by month) as next_month
      from t
      union all
      select month + 1, value, next_month
      from cte
      where month + 1 < next_month
     )
select month,
       datename(month, datefromparts(2020, month, 1)) as monthname,
       value
from cte
order by month;

Here is a db<>fiddle.
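The carry-forward logic of the recursive CTE can be sketched in plain Python (my addition, not part of the answer): walk from the first month present to the last, and reuse the previous value whenever a month is missing.

```python
import calendar

# Months actually present in the table, with their values.
data = {3: 136.00, 4: 306.00, 7: 476.00, 12: 510.48}

filled, last_value = [], None
for month in range(min(data), max(data) + 1):
    # A missing month inherits the most recent earlier value.
    last_value = data.get(month, last_value)
    filled.append((month, calendar.month_name[month], last_value))
# May and June get 306.00; August through November get 476.00.
```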
You can use spt_values to get the continuous numbers 1-12, and then left join your table by max(month):

select t1.month
      ,datename(month, datefromparts(2020, t1.month, 1)) monthname
      ,t2.value
from (
  select top 12 number + 1 as month
  from master..spt_values
  where type = 'p'
) t1
left join t t2
  on t2.month = (select max(month) from t tmp where tmp.month <= t1.month)
where t2.month is not null

Test data:

CREATE TABLE T ([Month] int, [MonthName] varchar(8), [Value] numeric);
INSERT INTO T ([Month], [MonthName], [Value]) VALUES
(3, 'March', 136.00),
(4, 'April', 306.00),
(7, 'July', 476.00),
(12, 'December', 510.48);

Demo Link: SQL Server 2012 | db<>fiddle

Note: if you have a year column, then you need to adjust the script.
Oracle first and last observation over multiple windows
I have a problem with a query in Oracle. My table contains all of the loan applications from last year. Some of the customers have more than one application. I want to aggregate those applications as follows: for each customer, I want to find his first application in the last year (let's call it A), and then find the last application in the 30-day interval counting from that first application (say B is that last one). Next, I need to find the application following B and again find the last one in its 30-day interval, as in the previous step. What I want as the result is a table with the earliest and latest applications in each customer's interval. It is also possible that the first one is the same as the last one. How could I do this in Oracle without PL/SQL? Is this possible? Should I use cumulative sums of time intervals for it? (But then the starting point for each sum depends on the counted sum...)

Let's say the table has the following form:

application_id (unique) | customer_id (not unique) | create_date
1 | 1 | 2017-01-02 <- first
2 | 1 | 2017-01-10 <- middle
3 | 1 | 2017-01-30 <- last
4 | 1 | 2017-05-02 <- first and last
5 | 1 | 2017-06-02 <- first
6 | 1 | 2017-06-30 <- middle
7 | 1 | 2017-06-30 <- middle
8 | 1 | 2017-07-01 <- last

What I expect is:

application_id (unique) | customer_id (not unique) | create_date
1 | 1 | 2017-01-02 <- first
3 | 1 | 2017-01-30 <- last
4 | 1 | 2017-05-02 <- first and last
5 | 1 | 2017-06-02 <- first
8 | 1 | 2017-07-01 <- last

Thanks in advance for help.
SQL Fiddle

Oracle 11g R2 Schema Setup:

CREATE TABLE table_name ( application_id, customer_id, create_date ) AS
SELECT 1, 1, DATE '2017-01-02' FROM DUAL UNION ALL -- <- first
SELECT 2, 1, DATE '2017-01-10' FROM DUAL UNION ALL -- <- middle
SELECT 3, 1, DATE '2017-01-30' FROM DUAL UNION ALL -- <- last
SELECT 4, 1, DATE '2017-05-02' FROM DUAL UNION ALL -- <- first and last
SELECT 5, 1, DATE '2017-06-02' FROM DUAL UNION ALL -- <- first
SELECT 6, 1, DATE '2017-06-30' FROM DUAL UNION ALL -- <- middle
SELECT 7, 1, DATE '2017-06-30' FROM DUAL UNION ALL -- <- middle
SELECT 8, 1, DATE '2017-07-01' FROM DUAL           -- <- last

Query 1:

WITH data ( application_id, customer_id, create_date, first_date, grp ) AS (
  SELECT t.application_id, t.customer_id, t.create_date, t.create_date, 1
  FROM   table_name t
  WHERE  application_id = 1
  UNION ALL
  SELECT t.application_id, t.customer_id, t.create_date,
         CASE WHEN t.create_date <= d.first_date + INTERVAL '30' DAY
              THEN d.first_date ELSE t.create_date END,
         CASE WHEN t.create_date <= d.first_date + INTERVAL '30' DAY
              THEN grp ELSE grp + 1 END
  FROM   data d
         INNER JOIN table_name t
         ON ( d.customer_id = t.customer_id
              AND d.application_id + 1 = t.application_id )
)
SELECT application_id, customer_id, create_date, grp
FROM   (
  SELECT d.*,
         ROW_NUMBER() OVER ( PARTITION BY customer_id, grp
                             ORDER BY create_date ASC ) AS rn_a,
         ROW_NUMBER() OVER ( PARTITION BY customer_id, grp
                             ORDER BY create_date DESC ) AS rn_d
  FROM   data d
)
WHERE  rn_a = 1 OR rn_d = 1

Results:

| APPLICATION_ID | CUSTOMER_ID | CREATE_DATE          | GRP |
|----------------|-------------|----------------------|-----|
| 1              | 1           | 2017-01-02T00:00:00Z | 1   |
| 3              | 1           | 2017-01-30T00:00:00Z | 1   |
| 4              | 1           | 2017-05-02T00:00:00Z | 2   |
| 5              | 1           | 2017-06-02T00:00:00Z | 3   |
| 8              | 1           | 2017-07-01T00:00:00Z | 3   |
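The grouping rule that the recursive CTE implements can be restated procedurally; this is a plain-Python sketch of my own (assuming, as the CTE does, that application_id follows create_date order per customer): a new group starts whenever an application falls more than 30 days after the group's first application, and each group's first and last applications are kept.

```python
from datetime import date, timedelta

# (application_id, customer_id, create_date), sorted per customer.
apps = [(1, 1, date(2017, 1, 2)), (2, 1, date(2017, 1, 10)),
        (3, 1, date(2017, 1, 30)), (4, 1, date(2017, 5, 2)),
        (5, 1, date(2017, 6, 2)), (6, 1, date(2017, 6, 30)),
        (7, 1, date(2017, 6, 30)), (8, 1, date(2017, 7, 1))]

groups, group_start = [], None
for app_id, cust, created in apps:
    # Open a new group when this application is outside the 30-day window
    # counted from the current group's first application.
    if group_start is None or created > group_start + timedelta(days=30):
        group_start = created
        groups.append([])
    groups[-1].append(app_id)

# Keep each group's first and last application (possibly the same one).
first_and_last = sorted({g[0] for g in groups} | {g[-1] for g in groups})
# -> [1, 3, 4, 5, 8], matching the expected output
```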
Distribute rows evenly by days
I have a table where I put, let's call them, manual values that are used later in my code. The table looks like this:

subId | MonthNo | PackagesNumber | Country | EntryMethod | PaidAmount | Version
1 | 201701 | 223 | NO | BCD | 44803 | 2
2 | 201701 | 61 | NO | GHI | 11934 | 2
3 | 201701 | 929 | NO | ABC | 88714 | 2
4 | 201701 | 470 | NO | DEF | 98404 | 2
5 | 201702 | 223 | NO | BCD | 28225 | 2

All I have to do is divide those values into single rows, at the level of a single package. For example, there are 223 packages in January 2017 in country NO with EntryMethod BCD, so I want 223 separate rows. PaidAmount should also be divided by PackagesNumber. The problem is I have to associate a date with every record, and records should be distributed evenly through the whole month. I have a Date dimension that I can intersect with my table by pulling month and year separately from MonthNo. For example, in January 2017 EntryMethod BCD has 223 packages, so that's ~7 packages per day. That's what I want:

subId | Date | Country | Packages | EntryMethod | PaidAmount | Version
1 | 01.01.2017 | NO | 1 | BCD | 200.910313901345 | 2
2 | 01.01.2017 | NO | 1 | BCD | 200.910313901345 | 2
3 | 01.01.2017 | NO | 1 | BCD | 200.910313901345 | 2
4 | 01.01.2017 | NO | 1 | BCD | 200.910313901345 | 2
5 | 01.01.2017 | NO | 1 | BCD | 200.910313901345 | 2
6 | 01.01.2017 | NO | 1 | BCD | 200.910313901345 | 2
7 | 01.01.2017 | NO | 1 | BCD | 200.910313901345 | 2
8 | 02.01.2017 | NO | 1 | BCD | 200.910313901345 | 2

Bonus: I wrote code that divides Packages into single records, but it puts the first day of each month as the date.
SELECT Date = ( SELECT TOP 1 date
                FROM dim_Date dim
                WHERE dim.Month = a.Month AND dim.Year = a.Year )
     , Country
     , EntryMethod
     , Deliveries = 1
     , PaidAmount = NULLIF(PaidAmount, 0) / PackagesNumber
     , SubscriptionId = 90000000 + ROW_NUMBER() OVER(ORDER BY n.number)
     , Version
FROM (
  SELECT [Year] = LEFT(MonthNo, 4)
       , [Month] = RIGHT(MonthNo, 2)
       , Country
       , EntryMethod
       , PackagesNumber
       , PaidAmount
       , Version
  FROM tgm.rep_PredictionsReport_ManualValues tgm
  /*WHERE MonthNo = 201701*/
) a
JOIN master..spt_values n
  ON n.type = 'P'
 AND n.number < CAST(PackagesNumber AS INT);

EDIT: I made some progress. I used the NTILE function to divide rows into groups. The only thing that changed is Date in the top-level select. It looks like this now:

Date = concat([Year], '-', [Month], '-',
       case when ntile(31) over(order by n.number) < 10
            then '0' + cast(ntile(31) over(order by n.number) as varchar(2))
            else cast(ntile(31) over(order by n.number) as varchar(2))
       end)

Explanation: I am creating the Date field using the Year and Month fields, and NTILE over the number of days in the month (a static number for now, to be changed later). The results aren't as good as I'd expect: it's creating groups twice as big as they should be (14 instead of 7 rows per date).
You can accomplish this using the modulo operator, which allows you to divide items into a set number of categories. Here is a full test: http://rextester.com/TOROA96856

Here is the relevant query:

--recursive query to expand each row.
with expand_rows (subid, monthno, month, packagesnumber, paidamount) as (
  select subid, monthno, month, packagesnumber,
         (paidamount+0.0000)/packagesnumber
  from initial_table
  union all
  select subid, monthno, month, packagesnumber-1, paidamount
  from expand_rows
  where packagesnumber > 1
)
select expand_rows.*, (packagesnumber % numdays)+1 day, paidamount
from expand_rows
join dayspermonth d on d.month = expand_rows.month
order by subid, day
option (maxrecursion 0)

(packagesnumber % numdays)+1 is the modulo operation that assigns items to a day. Note that I precomputed a table of the number of days in each month for use in the query. I also simplified the problem slightly for purposes of the answer (I added a pure month column because I didn't want to mess around with replicating your date dimension).

You may need to tweak the modulo query if you care where the extra items end up when things don't divide evenly (e.g. if you have 32 items in January, which day gets the extra item?). In this example the second day of the month tends to get the most (because of adding 1 to account for the fact that the last day of the month ends up 0). If you want the extra items to fall at the beginning of the month, you could use a case statement that converts 0 to the number of days in the month instead.
To distribute 223 numbers evenly over the days of January we do this:

There are 31 days in January.
The remainder of 223/31 is 6.
223/31 is 7 (integer division).

So that's 7 records per day, plus 1 extra record for January 1-6.

I've used a tally table to make dates and some more, but the distribution of rows per day can be determined like this:

with tally as (
  select row_number() over (order by n)-1 n
  from (values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) n(n)
  cross join (values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) m(m)
  cross join (values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) l(m)
  cross join (values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) k(m)
)
,t1 as (
  select * from (values
    (1, 201701, 223, 'NO', 'BCD', 44803, 2)
   ,(2, 201701,  61, 'NO', 'GHI', 11934, 2)
   ,(3, 201701, 929, 'NO', 'ABC', 88714, 2)
   ,(4, 201701, 470, 'NO', 'DEF', 98404, 2)
   ,(5, 201702, 223, 'NO', 'BCD', 28225, 2)
  ) t(subId, MonthNo, PackagesNumber, Country, EntryMethod, PaidAmount, Version)
)
,dates as (
  select dateadd(day,n,'20170101') as dt
        ,convert(varchar(10),dateadd(day,n,'20170101'),112)/100 mnthkey
        ,day(dateadd(day,-1,dateadd(month,1,cast(((convert(varchar(10),dateadd(day,n,'20170101'),112)/100)*100 + 1) as varchar(10))))) DaysInMonth
  from tally
)
select subId
      ,MonthNo
      ,dt
      ,PackagesNumber
      ,case when day(dt) <= PackagesNumber % DaysInMonth then 1 else 0 end remainder
      ,PackagesNumber / DaysInMonth evenlyspread
      ,Country
      ,EntryMethod
      ,PaidAmount
      ,Version
from t1 a
inner join dates b on a.MonthNo = b.mnthkey

I join the data table on the month, and for each day in the month I assign the evenly distributed packages (7 in our example); for the first days (6 in our example) I add 1 as remainder.

Now we have the info from your base table, multiplied by every day in the relevant months; we just need to make multiple rows per day. Here we use the tally table again:

with tally as (
  select row_number() over (order by n)-1 n
  from (values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) n(n)
  cross join (values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) m(m)
  cross join (values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) l(m)
  cross join (values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) k(m)
)
,t1 as (
  select * from (values
    (1, 201701, 223, 'NO', 'BCD', 44803, 2)
   ,(2, 201701,  61, 'NO', 'GHI', 11934, 2)
   ,(3, 201701, 929, 'NO', 'ABC', 88714, 2)
   ,(4, 201701, 470, 'NO', 'DEF', 98404, 2)
   ,(5, 201702, 223, 'NO', 'BCD', 28225, 2)
  ) t(subId, MonthNo, PackagesNumber, Country, EntryMethod, PaidAmount, Version)
)
,dates as (
  select dateadd(day,n,'20170101') as dt
        ,convert(varchar(10),dateadd(day,n,'20170101'),112)/100 mnthkey
        ,day(dateadd(day,-1,dateadd(month,1,cast(((convert(varchar(10),dateadd(day,n,'20170101'),112)/100)*100 + 1) as varchar(10))))) DaysInMonth
  from tally
)
,forshow as (
  select subId
        ,MonthNo
        ,dt
        ,PackagesNumber
        ,case when day(dt) <= PackagesNumber % DaysInMonth then 1 else 0 end remainder
        ,PackagesNumber / DaysInMonth evenlyspread
        ,Country
        ,EntryMethod
        ,(PaidAmount+0.0000)/(PackagesNumber*1.0000) PaidAmount
        ,Version
        ,PaidAmount TotalPaidAmount
  from t1 a
  inner join dates b on a.MonthNo = b.mnthkey
)
select subId
      ,dt [Date]
      ,Country
      ,1 Packages
      ,EntryMethod
      ,PaidAmount
      ,Version
      -- the following columns are just for control
      ,remainder+evenlyspread totalday
      ,count(*) over (partition by subId, MonthNo, dt) calctotalday
      ,PackagesNumber
      ,count(*) over (partition by subId) calcPackagesNumber
      ,sum(PaidAmount) over (partition by subId) calcPaidAmount
      ,TotalPaidAmount
from forshow
inner join tally on n < (remainder + evenlyspread)
order by subId, MonthNo, dt

I join with the number of rows per day (evenlyspread + remainder) and get one row per package. I've added some check columns to make sure I get 8 rows for each of the first 6 days, and 223 rows in total for our example.
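Both answers boil down to the same arithmetic. As a plain-Python sketch (my addition): distributing N packages over the days of a month gives each day N // days packages, with the first N % days days receiving one extra.

```python
def distribute(packages, days_in_month):
    """Return a per-day package count: base share plus one extra
    for each of the first `remainder` days."""
    base, extra = divmod(packages, days_in_month)
    return [base + (1 if day <= extra else 0)
            for day in range(1, days_in_month + 1)]

per_day = distribute(223, 31)
# 223 = 7 * 31 + 6: days 1-6 get 8 packages, days 7-31 get 7.
```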
List of years between two dates
I have a table with columns for a start and end date. My goal is to get a list of each year in that timespan for each row, so

+------------+------------+
| startdate  | enddate    |
+------------+------------+
| 2004-08-01 | 2007-01-08 |
| 2005-06-02 | 2007-05-08 |
+------------+------------+

should output this:

+-------+
| years |
+-------+
|  2004 |
|  2005 |
|  2006 |
|  2007 |
|  2005 |
|  2006 |
|  2007 |
+-------+

I now have problems generating the years in between the two dates. My first approach was to use a UNION (the order of dates is irrelevant), but the years in between are missing in this case...

Select Extract(Year From startdate) From table1
Union
Select Extract(Year From enddate) From table1

Thanks for any advice!
Row generator technique:

WITH DATA1 AS (
  SELECT TO_DATE('2004-08-01','YYYY-MM-DD') STARTDATE,
         TO_DATE('2007-01-08','YYYY-MM-DD') ENDDATE FROM DUAL
  UNION ALL
  SELECT TO_DATE('2005-06-02','YYYY-MM-DD') STARTDATE,
         TO_DATE('2007-05-08','YYYY-MM-DD') ENDDATE FROM DUAL
),
DATA2 AS (
  SELECT EXTRACT(YEAR FROM STARTDATE) ST, EXTRACT(YEAR FROM ENDDATE) ED
  FROM DATA1
),
data3 AS (
  SELECT level-1 line
  FROM DUAL
  CONNECT BY level <= (SELECT MAX(ed-st) FROM data2)
)
SELECT ST+LINE
FROM DATA2, DATA3
WHERE LINE <= ED-ST
ORDER BY 1
/

ST+LINE
----------
2004
2005
2005
2006
2006
2007

6 rows selected.
Try this query:

;with CTE as (
  select datepart(year, '2005-12-25') as yr
  union all
  select yr + 1
  from CTE
  where yr < datepart(year, '2013-11-14')
)
select yr from CTE
Try this: create a table with years as follows:

CREATE TABLE tblyears(y int)
INSERT INTO tblyears VALUES (1900);
INSERT INTO tblyears VALUES (1901);
INSERT INTO tblyears VALUES (1902);
and so on until
INSERT INTO tblyears VALUES (2100)

So, you'll write this query:

SELECT y.y
FROM tblyears y
JOIN table1 t
  ON y.y >= EXTRACT(year from startdate)
 AND y.y <= EXTRACT(year from enddate)
ORDER BY y.y

Show SqlFiddle
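All three answers generate rows for the years between the two dates. In plain Python (a sketch of my own), the same expansion is a nested loop over the rows and the inclusive year range:

```python
from datetime import date

# The two sample rows: (startdate, enddate).
spans = [(date(2004, 8, 1), date(2007, 1, 8)),
         (date(2005, 6, 2), date(2007, 5, 8))]

# One output row per year in each [start.year, end.year] span.
years = [y for start, end in spans
         for y in range(start.year, end.year + 1)]
# -> [2004, 2005, 2006, 2007, 2005, 2006, 2007]
```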