PostgreSQL - How to get the value from the last record of each month - sql
I have a view like this:
Year | Month | Week | Category | Value
2017 | 1 | 1 | A | 1
2017 | 1 | 1 | B | 2
2017 | 1 | 1 | C | 3
2017 | 1 | 2 | A | 4
2017 | 1 | 2 | B | 5
2017 | 1 | 2 | C | 6
2017 | 1 | 3 | A | 7
2017 | 1 | 3 | B | 8
2017 | 1 | 3 | C | 9
2017 | 1 | 4 | A | 10
2017 | 1 | 4 | B | 11
2017 | 1 | 4 | C | 12
2017 | 2 | 5 | A | 1
2017 | 2 | 5 | B | 2
2017 | 2 | 5 | C | 3
2017 | 2 | 6 | A | 4
2017 | 2 | 6 | B | 5
2017 | 2 | 6 | C | 6
2017 | 2 | 7 | A | 7
2017 | 2 | 7 | B | 8
2017 | 2 | 7 | C | 9
2017 | 2 | 8 | A | 10
2017 | 2 | 8 | B | 11
2017 | 2 | 8 | C | 12
And I need to make a new view which shows the average of the value column (let's call it avg_val) and the value from the max week of the month (max_val_of_month). Ex: the max week of January is 4, so the value for category A is 10. Or, to be clear, something like this:
Year | Month | Category | avg_val | max_val_of_month
2017 | 1 | A | 5.5 | 10
2017 | 1 | B | 6.5 | 11
2017 | 1 | C | 7.5 | 12
2017 | 2 | A | 5.5 | 10
2017 | 2 | B | 6.5 | 11
2017 | 2 | C | 7.5 | 12
I have used a window function, OVER (PARTITION BY year, month, category), to get the avg value. But how can I get the value from the max week of each month?
Assuming that you need a monthly average and the value for the max week, not the max value per month:
SELECT year, month, category, avg_val, value max_week_val
FROM (
SELECT *,
AVG(value) OVER (PARTITION BY year, month, category) avg_val,
ROW_NUMBER() OVER (PARTITION BY year, month, category ORDER BY week DESC) rn
FROM view1
) q
WHERE rn = 1
ORDER BY year, month, category
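As a quick sanity check (not part of the original answer), the ROW_NUMBER() query can be reproduced through Python's sqlite3 module, assuming SQLite >= 3.25 for window function support; view1 is rebuilt in memory from the January sample data:

```python
import sqlite3

# Rebuild the January portion of the sample view in memory.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE view1 (year INT, month INT, week INT, category TEXT, value REAL)")
rows = [(2017, 1, w, c, v)
        for w, base in ((1, 1), (2, 4), (3, 7), (4, 10))
        for c, v in zip("ABC", (base, base + 1, base + 2))]
conn.executemany("INSERT INTO view1 VALUES (?, ?, ?, ?, ?)", rows)

# The ROW_NUMBER() approach: rn = 1 marks the row of the latest week
# within each (year, month, category) partition.
result = conn.execute("""
    SELECT year, month, category, avg_val, value AS max_week_val
    FROM (SELECT *,
                 AVG(value) OVER (PARTITION BY year, month, category) AS avg_val,
                 ROW_NUMBER() OVER (PARTITION BY year, month, category
                                    ORDER BY week DESC) AS rn
          FROM view1) q
    WHERE rn = 1
    ORDER BY year, month, category
""").fetchall()
# January: A -> (5.5, 10), B -> (6.5, 11), C -> (7.5, 12)
```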
Or a more verbose version without window functions:
SELECT q.year, q.month, q.category, q.avg_val, v.value max_week_val
FROM (
SELECT year, month, category, avg(value) avg_val, MAX(week) max_week
FROM view1
GROUP BY year, month, category
) q JOIN view1 v
ON q.year = v.year
AND q.month = v.month
AND q.category = v.category
AND q.max_week = v.week
ORDER BY year, month, category
Here is a dbfiddle demo for both queries
And here is my NEW version.
Thanks to @peterm for pointing out the previously incorrect value of val_from_max_week_of_month. I corrected it:
SELECT
a.Year,
a.Month,
a.Category,
max(a.Week) AS max_week,
AVG(a.Value) AS avg_val,
(
SELECT b.Value
FROM decades AS b
WHERE
b.Year = a.Year AND
b.Month = a.Month AND
b.Week = max(a.Week) AND
b.Category = a.Category
) AS val_from_max_week_of_month
FROM decades AS a
GROUP BY
a.Year,
a.Month,
a.Category
;
The new results:
First, you might need to check how you handle the first week of January. If the 1st of January is not a Monday, there are several interpretations, and not every one of them fits the solutions here. You'll need to use either:
the ISO week concept, i.e. the week column should hold the ISO week and the year column should hold the ISO year (week-year, rather). Note: in this concept, the 1st of January sometimes actually belongs to the previous year; or
your own concept, where the first week of the year is "split" into two if the 1st of January is not a Monday.
Note: the solutions below will not work if (in your table) the first week of January can be 52 or 53.
avg_val is just a simple aggregation, while max_val_of_month can be calculated with typical greatest-n-per-group queries. That problem has a lot of possible solutions in PostgreSQL, with varying performance. Fortunately, your query naturally has an easily determined selectivity: you'll always need (approximately) a quarter of your data.
Usual winners (in performance) are:
(This is no surprise, though, as these two tend to perform better and better as you need a larger portion of the original data.)
array_agg() with order by variant:
select year, month, category, avg(value) avg_val,
(array_agg(value order by week desc))[1] max_val_of_month
from table_name
group by year, month, category;
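What the (array_agg(value order by week desc))[1] trick computes can also be written out in plain Python (a sketch I added for illustration, not from the answer): per (year, month, category) group, the average of the values plus the value belonging to the highest week.

```python
from collections import defaultdict

# January sample rows: (year, month, week, category, value).
rows = [(2017, 1, w, c, v)
        for w, base in ((1, 1), (2, 4), (3, 7), (4, 10))
        for c, v in zip("ABC", (base, base + 1, base + 2))]

groups = defaultdict(list)
for year, month, week, cat, val in rows:
    groups[(year, month, cat)].append((week, val))

# For each group: (average value, value at the maximum week).
summary = {key: (sum(v for _, v in wv) / len(wv), max(wv)[1])
           for key, wv in groups.items()}
# summary[(2017, 1, 'A')] == (5.5, 10)
```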
distinct on variant:
select distinct on (year, month, category) year, month, category,
avg(value) over (partition by year, month, category) avg_val,
value max_val_of_month
from table_name
order by year, month, category, week desc;
The pure window function variant is not that bad either:
row_number() variant:
select year, month, category, avg_val, max_val_of_month
from (select year, month, category, value max_val_of_month,
avg(value) over (partition by year, month, category) avg_val,
row_number() over (partition by year, month, category order by week desc) rn
from table_name) w
where rn = 1;
But the LATERAL variant is only viable with an index:
LATERAL variant:
create index idx_table_name_year_month_category_week_desc
on table_name(year, month, category, week desc);
select year, month, category,
avg(value) avg_val,
max_val_of_month
from table_name t
cross join lateral (select value max_val_of_month
from table_name
where (year, month, category) = (t.year, t.month, t.category)
order by week desc
limit 1) m
group by year, month, category, max_val_of_month;
But most of the solutions above can actually utilize this index, not just this last one.
Without the index: http://rextester.com/WNEL86809
With the index: http://rextester.com/TYUA52054
with data (yr, mnth, wk, cat, val) as
(
-- begin test data
select 2017 , 1 , 1 , 'A' , 1 from dual union all
select 2017 , 1 , 1 , 'B' , 2 from dual union all
select 2017 , 1 , 1 , 'C' , 3 from dual union all
select 2017 , 1 , 2 , 'A' , 4 from dual union all
select 2017 , 1 , 2 , 'B' , 5 from dual union all
select 2017 , 1 , 2 , 'C' , 6 from dual union all
select 2017 , 1 , 3 , 'A' , 7 from dual union all
select 2017 , 1 , 3 , 'B' , 8 from dual union all
select 2017 , 1 , 3 , 'C' , 9 from dual union all
select 2017 , 1 , 4 , 'A' , 10 from dual union all
select 2017 , 1 , 4 , 'B' , 11 from dual union all
select 2017 , 1 , 4 , 'C' , 12 from dual union all
select 2017 , 2 , 5 , 'A' , 1 from dual union all
select 2017 , 2 , 5 , 'B' , 2 from dual union all
select 2017 , 2 , 5 , 'C' , 3 from dual union all
select 2017 , 2 , 6 , 'A' , 4 from dual union all
select 2017 , 2 , 6 , 'B' , 5 from dual union all
select 2017 , 2 , 6 , 'C' , 6 from dual union all
select 2017 , 2 , 7 , 'A' , 7 from dual union all
select 2017 , 2 , 8 , 'A' , 10 from dual union all
select 2017 , 2 , 8 , 'B' , 11 from dual union all
select 2017 , 2 , 7 , 'B' , 8 from dual union all
select 2017 , 2 , 7 , 'C' , 9 from dual union all
select 2018 , 2 , 7 , 'C' , 9 from dual union all
select 2017 , 2 , 8 , 'C' , 12 from dual
-- end test data
)
select * from
(
select
-- data.*: all columns of the data table
data.*,
-- avrg: partition by a combination of year,month and category to work out -
-- the avg for each category in a month of a year
avg(val) over (partition by yr, mnth, cat) avrg,
-- mwk: partition by year and month to work out -
-- the max week of a month in a year
max(wk) over (partition by yr, mnth) mwk
from
data
)
-- as OP's interest is in the max week of each month of a year, -
-- "wk" column value is matched against
-- the derived column "mwk"
where wk = mwk
order by yr,mnth,cat;
Related
BigQuery Running Count of Unique ID per Year
I found a bunch of similar questions but none addressing this one specifically (correct me if I'm wrong). I am trying, on BigQuery, to index each row of a table with the running count of users per year using an analytic function. So with:

with dataset as (
  select 'A' as user, '2020' as year, RAND() as some_value union all
  select 'A' as user, '2020' as year, RAND() as some_value union all
  select 'B' as user, '2020' as year, RAND() as some_value union all
  select 'B' as user, '2020' as year, RAND() as some_value union all
  select 'B' as user, '2020' as year, RAND() as some_value union all
  select 'C' as user, '2020' as year, RAND() as some_value union all
  select 'C' as user, '2020' as year, RAND() as some_value union all
  select 'A' as user, '2021' as year, RAND() as some_value union all
  select 'A' as user, '2021' as year, RAND() as some_value union all
  select 'B' as user, '2021' as year, RAND() as some_value union all
  select 'C' as user, '2021' as year, RAND() as some_value union all
  select 'C' as user, '2021' as year, RAND() as some_value union all
  select 'C' as user, '2021' as year, RAND() as some_value union all
  select 'C' as user, '2021' as year, RAND() as some_value union all
  select 'C' as user, '2021' as year, RAND() as some_value
)

I would like to get:

rcount | user | year | some_value
1 | A | 2020 | 0.2365421124968884
1 | A | 2020 | 0.21087749308191206
2 | B | 2020 | 0.6096882013526258
2 | B | 2020 | 0.8544447727632739
2 | B | 2020 | 0.6113604025541309
3 | C | 2020 | 0.5803237472480643
3 | C | 2020 | 0.165305669127888
1 | A | 2021 | 0.1200575362708826
1 | A | 2021 | 0.015721175944171915
2 | B | 2021 | 0.21890252010457295
3 | C | 2021 | 0.5087613385277634
3 | C | 2021 | 0.9949262690813603
3 | C | 2021 | 0.50824183164116
3 | C | 2021 | 0.8262428736484341
3 | C | 2021 | 0.6866964737106948

I tried:

count(user) over (partition by year, user)

I also tried using ranges like order by year range between unbounded preceding and current row, and row_number(). I have no idea where to tap for a solution now.
A simpler solution would be to use DENSE_RANK:

SELECT DENSE_RANK() OVER (PARTITION BY year ORDER BY user) as rcount,
       user, year, some_value
FROM dataset

Information about DENSE_RANK can be found here.
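The DENSE_RANK logic isn't BigQuery-specific; as an illustration (my addition, using SQLite >= 3.25 through Python's sqlite3), it produces the expected rcount on a cut-down version of the dataset:

```python
import sqlite3

# "user" is quoted because it clashes with a keyword in some dialects.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE dataset ("user" TEXT, year TEXT)')
conn.executemany("INSERT INTO dataset VALUES (?, ?)",
                 [("A", "2020"), ("A", "2020"), ("B", "2020"),
                  ("A", "2021"), ("C", "2021"), ("C", "2021")])

# DENSE_RANK restarts at 1 for every year and increments per distinct user.
rcounts = conn.execute("""
    SELECT DENSE_RANK() OVER (PARTITION BY year ORDER BY "user") AS rcount,
           "user", year
    FROM dataset
    ORDER BY year, "user"
""").fetchall()
```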
Try the following:

select user
     , year
     , some_value
     , sum(count) over (partition by year order by year, user ROWS UNBOUNDED PRECEDING) as rcount
from (
  select user
       , year
       , some_value
       , IF(lag(user,1) OVER (order by year,user)=user,0,1) count
  from dataset
)

The sub-select defines the logic of whether to count the record or not, based on what the previous row was; then we simply perform a sum in the outer select.
Check if a month is skipped then add values dynamically?
I have a set of data from a table that is only populated if a user has data for a certain month, like this:

Month | MonthName | Value
3 | March | 136.00
4 | April | 306.00
7 | July | 476.00
12 | December | 510.48

But what I need is to check whether a month is skipped and, if so, add the value from the month before, so the end result would be like this:

Month | MonthName | Value
3 | March | 136.00
4 | April | 306.00
5 | May | 306.00 -- added data
6 | June | 306.00 -- added data
7 | July | 476.00
8 | August | 476.00 -- added data
9 | September | 476.00 -- added data
10 | October | 476.00 -- added data
11 | November | 476.00 -- added data
12 | December | 510.48

How can I do this dynamically in SQL Server?
One method is a recursive CTE:

with cte as (
      select month, value, lead(month) over (order by month) as next_month
      from t
      union all
      select month + 1, value, next_month
      from cte
      where month + 1 < next_month
     )
select month,
       datename(month, datefromparts(2020, month, 1)) as monthname,
       value
from cte
order by month;

Here is a db<>fiddle.
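The carry-forward logic of the recursive CTE can be sketched in plain Python (my addition, not part of the answer): walk from the first month present to the last, and reuse the previous value whenever a month is missing.

```python
import calendar

# Months actually present in the table, with their values.
data = {3: 136.00, 4: 306.00, 7: 476.00, 12: 510.48}

filled, last_value = [], None
for month in range(min(data), max(data) + 1):
    # A missing month inherits the most recent earlier value.
    last_value = data.get(month, last_value)
    filled.append((month, calendar.month_name[month], last_value))
# May and June get 306.00; August through November get 476.00.
```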
You can use spt_values to get the continuous numbers 1-12, and then left join your table by max(month):

select t1.month
      ,datename(month, datefromparts(2020, t1.month, 1)) monthname
      ,t2.value
from (
  select top 12 number + 1 as month
  from master..spt_values
  where type = 'p'
) t1
left join t t2
  on t2.month = (select max(month) from t tmp where tmp.month <= t1.month)
where t2.month is not null

Test data:

CREATE TABLE T ([Month] int, [MonthName] varchar(8), [Value] numeric);
INSERT INTO T ([Month], [MonthName], [Value]) VALUES
(3, 'March', 136.00),
(4, 'April', 306.00),
(7, 'July', 476.00),
(12, 'December', 510.48);

Demo Link: SQL Server 2012 | db<>fiddle

Note: if you have a year column, then you need to adjust the script.
Oracle first and last observation over multiple windows
I have a problem with a query in Oracle. My table contains all of the loan applications from last year. Some of the customers have more than one application. I want to aggregate those applications as follows: for each customer, I want to find his first application in the last year (let's call it A), and then find the last application in the 30-day interval counting from that first application (say B is that last one). Next, I need to find the application following B and again find the last one in its 30-day interval, as in the previous step. What I want as the result is a table with the earliest and latest applications in each customer's interval. It is also possible that the first one is the same as the last one. How could I do this in Oracle without PL/SQL? Is this possible? Should I use cumulative sums of time intervals for it? (But then the starting point for each sum depends on the counted sum...)

Let's say the table has the following form:

application_id (unique) | customer_id (not unique) | create_date
1 | 1 | 2017-01-02 <- first
2 | 1 | 2017-01-10 <- middle
3 | 1 | 2017-01-30 <- last
4 | 1 | 2017-05-02 <- first and last
5 | 1 | 2017-06-02 <- first
6 | 1 | 2017-06-30 <- middle
7 | 1 | 2017-06-30 <- middle
8 | 1 | 2017-07-01 <- last

What I expect is:

application_id (unique) | customer_id (not unique) | create_date
1 | 1 | 2017-01-02 <- first
3 | 1 | 2017-01-30 <- last
4 | 1 | 2017-05-02 <- first and last
5 | 1 | 2017-06-02 <- first
8 | 1 | 2017-07-01 <- last

Thanks in advance for help.
SQL Fiddle

Oracle 11g R2 Schema Setup:

CREATE TABLE table_name ( application_id, customer_id, create_date ) AS
SELECT 1, 1, DATE '2017-01-02' FROM DUAL UNION ALL -- <- first
SELECT 2, 1, DATE '2017-01-10' FROM DUAL UNION ALL -- <- middle
SELECT 3, 1, DATE '2017-01-30' FROM DUAL UNION ALL -- <- last
SELECT 4, 1, DATE '2017-05-02' FROM DUAL UNION ALL -- <- first and last
SELECT 5, 1, DATE '2017-06-02' FROM DUAL UNION ALL -- <- first
SELECT 6, 1, DATE '2017-06-30' FROM DUAL UNION ALL -- <- middle
SELECT 7, 1, DATE '2017-06-30' FROM DUAL UNION ALL -- <- middle
SELECT 8, 1, DATE '2017-07-01' FROM DUAL           -- <- last

Query 1:

WITH data ( application_id, customer_id, create_date, first_date, grp ) AS (
  SELECT t.application_id, t.customer_id, t.create_date, t.create_date, 1
  FROM   table_name t
  WHERE  application_id = 1
  UNION ALL
  SELECT t.application_id, t.customer_id, t.create_date,
         CASE WHEN t.create_date <= d.first_date + INTERVAL '30' DAY
              THEN d.first_date ELSE t.create_date END,
         CASE WHEN t.create_date <= d.first_date + INTERVAL '30' DAY
              THEN grp ELSE grp + 1 END
  FROM   data d
         INNER JOIN table_name t
         ON ( d.customer_id = t.customer_id
              AND d.application_id + 1 = t.application_id )
)
SELECT application_id, customer_id, create_date, grp
FROM   (
  SELECT d.*,
         ROW_NUMBER() OVER ( PARTITION BY customer_id, grp
                             ORDER BY create_date ASC ) AS rn_a,
         ROW_NUMBER() OVER ( PARTITION BY customer_id, grp
                             ORDER BY create_date DESC ) AS rn_d
  FROM   data d
)
WHERE  rn_a = 1 OR rn_d = 1

Results:

| APPLICATION_ID | CUSTOMER_ID | CREATE_DATE          | GRP |
|----------------|-------------|----------------------|-----|
| 1              | 1           | 2017-01-02T00:00:00Z | 1   |
| 3              | 1           | 2017-01-30T00:00:00Z | 1   |
| 4              | 1           | 2017-05-02T00:00:00Z | 2   |
| 5              | 1           | 2017-06-02T00:00:00Z | 3   |
| 8              | 1           | 2017-07-01T00:00:00Z | 3   |
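The grouping rule that the recursive CTE implements can be restated procedurally; this is a plain-Python sketch of my own (assuming, as the CTE does, that application_id follows create_date order per customer): a new group starts whenever an application falls more than 30 days after the group's first application, and each group's first and last applications are kept.

```python
from datetime import date, timedelta

# (application_id, customer_id, create_date), sorted per customer.
apps = [(1, 1, date(2017, 1, 2)), (2, 1, date(2017, 1, 10)),
        (3, 1, date(2017, 1, 30)), (4, 1, date(2017, 5, 2)),
        (5, 1, date(2017, 6, 2)), (6, 1, date(2017, 6, 30)),
        (7, 1, date(2017, 6, 30)), (8, 1, date(2017, 7, 1))]

groups, group_start = [], None
for app_id, cust, created in apps:
    # Open a new group when this application is outside the 30-day window
    # counted from the current group's first application.
    if group_start is None or created > group_start + timedelta(days=30):
        group_start = created
        groups.append([])
    groups[-1].append(app_id)

# Keep each group's first and last application (possibly the same one).
first_and_last = sorted({g[0] for g in groups} | {g[-1] for g in groups})
# -> [1, 3, 4, 5, 8], matching the expected output
```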
Distribute rows evenly by days
I have a table where I put, let's call them, manual values that are used later in my code. The table looks like this:

subId | MonthNo | PackagesNumber | Country | EntryMethod | PaidAmount | Version
1 | 201701 | 223 | NO | BCD | 44803 | 2
2 | 201701 | 61 | NO | GHI | 11934 | 2
3 | 201701 | 929 | NO | ABC | 88714 | 2
4 | 201701 | 470 | NO | DEF | 98404 | 2
5 | 201702 | 223 | NO | BCD | 28225 | 2

All I have to do is divide those values into single rows, at the level of a single package. For example, there are 223 packages in January 2017 in country NO with EntryMethod BCD, so I want 223 separate rows. PaidAmount should also be divided by PackagesNumber. The problem is I have to associate a date with every record, and records should be distributed evenly through the whole month. I have a Date dimension that I can intersect with my table by pulling month and year separately from MonthNo. For example, in January 2017 EntryMethod BCD has 223 packages, so that's ~7 packages per day. That's what I want:

subId | Date | Country | Packages | EntryMethod | PaidAmount | Version
1 | 01.01.2017 | NO | 1 | BCD | 200.910313901345 | 2
2 | 01.01.2017 | NO | 1 | BCD | 200.910313901345 | 2
3 | 01.01.2017 | NO | 1 | BCD | 200.910313901345 | 2
4 | 01.01.2017 | NO | 1 | BCD | 200.910313901345 | 2
5 | 01.01.2017 | NO | 1 | BCD | 200.910313901345 | 2
6 | 01.01.2017 | NO | 1 | BCD | 200.910313901345 | 2
7 | 01.01.2017 | NO | 1 | BCD | 200.910313901345 | 2
8 | 02.01.2017 | NO | 1 | BCD | 200.910313901345 | 2

Bonus: I wrote code that divides Packages into single records, but it puts the first day of each month as the date.
SELECT Date = ( SELECT TOP 1 date
                FROM dim_Date dim
                WHERE dim.Month = a.Month AND dim.Year = a.Year )
     , Country
     , EntryMethod
     , Deliveries = 1
     , PaidAmount = NULLIF(PaidAmount, 0) / PackagesNumber
     , SubscriptionId = 90000000 + ROW_NUMBER() OVER(ORDER BY n.number)
     , Version
FROM (
  SELECT [Year] = LEFT(MonthNo, 4)
       , [Month] = RIGHT(MonthNo, 2)
       , Country
       , EntryMethod
       , PackagesNumber
       , PaidAmount
       , Version
  FROM tgm.rep_PredictionsReport_ManualValues tgm
  /*WHERE MonthNo = 201701*/
) a
JOIN master..spt_values n
  ON n.type = 'P'
 AND n.number < CAST(PackagesNumber AS INT);

EDIT: I made some progress. I used the NTILE function to divide rows into groups. The only thing that changed is Date in the top-level select. It looks like this now:

Date = concat([Year], '-', [Month], '-',
       case when ntile(31) over(order by n.number) < 10
            then '0' + cast(ntile(31) over(order by n.number) as varchar(2))
            else cast(ntile(31) over(order by n.number) as varchar(2))
       end)

Explanation: I am creating the Date field using the Year and Month fields, and NTILE over the number of days in the month (a static number for now, to be changed later). The results aren't as good as I'd expect: it's creating groups twice as big as they should be (14 instead of 7 rows per date).
You can accomplish this using the modulo operator, which allows you to divide items into a set number of categories. Here is a full test: http://rextester.com/TOROA96856

Here is the relevant query:

--recursive query to expand each row.
with expand_rows (subid, monthno, month, packagesnumber, paidamount) as (
  select subid, monthno, month, packagesnumber,
         (paidamount+0.0000)/packagesnumber
  from initial_table
  union all
  select subid, monthno, month, packagesnumber-1, paidamount
  from expand_rows
  where packagesnumber > 1
)
select expand_rows.*, (packagesnumber % numdays)+1 day, paidamount
from expand_rows
join dayspermonth d on d.month = expand_rows.month
order by subid, day
option (maxrecursion 0)

(packagesnumber % numdays)+1 is the modulo operation that assigns items to a day. Note that I precomputed a table of the number of days in each month for use in the query. I also simplified the problem slightly for purposes of the answer (I added a pure month column because I didn't want to mess around with replicating your date dimension).

You may need to tweak the modulo query if you care where the extra items end up when things don't divide evenly (e.g. if you have 32 items in January, which day gets the extra item?). In this example the second day of the month tends to get the most (because of adding 1 to account for the fact that the last day of the month ends up 0). If you want the extra items to fall at the beginning of the month, you could use a case statement that converts 0 to the number of days in the month instead.
To distribute 223 numbers evenly over the days of January we do this:

There are 31 days in January.
The remainder of 223/31 is 6.
223/31 is 7 (integer division).

So that's 7 records per day, plus 1 extra record for January 1-6.

I've used a tally table to make dates and some more, but the distribution of rows per day can be determined like this:

with tally as (
  select row_number() over (order by n)-1 n
  from (values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) n(n)
  cross join (values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) m(m)
  cross join (values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) l(m)
  cross join (values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) k(m)
)
,t1 as (
  select * from (values
    (1, 201701, 223, 'NO', 'BCD', 44803, 2)
   ,(2, 201701,  61, 'NO', 'GHI', 11934, 2)
   ,(3, 201701, 929, 'NO', 'ABC', 88714, 2)
   ,(4, 201701, 470, 'NO', 'DEF', 98404, 2)
   ,(5, 201702, 223, 'NO', 'BCD', 28225, 2)
  ) t(subId, MonthNo, PackagesNumber, Country, EntryMethod, PaidAmount, Version)
)
,dates as (
  select dateadd(day,n,'20170101') as dt
        ,convert(varchar(10),dateadd(day,n,'20170101'),112)/100 mnthkey
        ,day(dateadd(day,-1,dateadd(month,1,cast(((convert(varchar(10),dateadd(day,n,'20170101'),112)/100)*100 + 1) as varchar(10))))) DaysInMonth
  from tally
)
select subId
      ,MonthNo
      ,dt
      ,PackagesNumber
      ,case when day(dt) <= PackagesNumber % DaysInMonth then 1 else 0 end remainder
      ,PackagesNumber / DaysInMonth evenlyspread
      ,Country
      ,EntryMethod
      ,PaidAmount
      ,Version
from t1 a
inner join dates b on a.MonthNo = b.mnthkey

I join the data table on the month, and for each day in the month I assign the evenly distributed packages (7 in our example); for the first days (6 in our example) I add 1 as remainder.

Now we have the info from your base table, multiplied by every day in the relevant months; we just need to make multiple rows per day. Here we use the tally table again:

with tally as (
  select row_number() over (order by n)-1 n
  from (values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) n(n)
  cross join (values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) m(m)
  cross join (values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) l(m)
  cross join (values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) k(m)
)
,t1 as (
  select * from (values
    (1, 201701, 223, 'NO', 'BCD', 44803, 2)
   ,(2, 201701,  61, 'NO', 'GHI', 11934, 2)
   ,(3, 201701, 929, 'NO', 'ABC', 88714, 2)
   ,(4, 201701, 470, 'NO', 'DEF', 98404, 2)
   ,(5, 201702, 223, 'NO', 'BCD', 28225, 2)
  ) t(subId, MonthNo, PackagesNumber, Country, EntryMethod, PaidAmount, Version)
)
,dates as (
  select dateadd(day,n,'20170101') as dt
        ,convert(varchar(10),dateadd(day,n,'20170101'),112)/100 mnthkey
        ,day(dateadd(day,-1,dateadd(month,1,cast(((convert(varchar(10),dateadd(day,n,'20170101'),112)/100)*100 + 1) as varchar(10))))) DaysInMonth
  from tally
)
,forshow as (
  select subId
        ,MonthNo
        ,dt
        ,PackagesNumber
        ,case when day(dt) <= PackagesNumber % DaysInMonth then 1 else 0 end remainder
        ,PackagesNumber / DaysInMonth evenlyspread
        ,Country
        ,EntryMethod
        ,(PaidAmount+0.0000)/(PackagesNumber*1.0000) PaidAmount
        ,Version
        ,PaidAmount TotalPaidAmount
  from t1 a
  inner join dates b on a.MonthNo = b.mnthkey
)
select subId
      ,dt [Date]
      ,Country
      ,1 Packages
      ,EntryMethod
      ,PaidAmount
      ,Version
      -- the following columns are just for control
      ,remainder+evenlyspread totalday
      ,count(*) over (partition by subId, MonthNo, dt) calctotalday
      ,PackagesNumber
      ,count(*) over (partition by subId) calcPackagesNumber
      ,sum(PaidAmount) over (partition by subId) calcPaidAmount
      ,TotalPaidAmount
from forshow
inner join tally on n < (remainder + evenlyspread)
order by subId, MonthNo, dt

I join with the number of rows per day (evenlyspread + remainder) and get one row per package. I've added some check columns to make sure I get 8 rows for each of the first 6 days, and 223 rows in total for our example.
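Both answers boil down to the same arithmetic. As a plain-Python sketch (my addition): distributing N packages over the days of a month gives each day N // days packages, with the first N % days days receiving one extra.

```python
def distribute(packages, days_in_month):
    """Return a per-day package count: base share plus one extra
    for each of the first `remainder` days."""
    base, extra = divmod(packages, days_in_month)
    return [base + (1 if day <= extra else 0)
            for day in range(1, days_in_month + 1)]

per_day = distribute(223, 31)
# 223 = 7 * 31 + 6: days 1-6 get 8 packages, days 7-31 get 7.
```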
List of years between two dates
I have a table with columns for a start and end date. My goal is to get a list of each year in that timespan for each row, so

+------------+------------+
| startdate  | enddate    |
+------------+------------+
| 2004-08-01 | 2007-01-08 |
| 2005-06-02 | 2007-05-08 |
+------------+------------+

should output this:

+-------+
| years |
+-------+
|  2004 |
|  2005 |
|  2006 |
|  2007 |
|  2005 |
|  2006 |
|  2007 |
+-------+

I now have problems generating the years in between the two dates. My first approach was to use a UNION (the order of dates is irrelevant), but the years in between are missing in this case...

Select Extract(Year From startdate) From table1
Union
Select Extract(Year From enddate) From table1

Thanks for any advice!
Row generator technique:

WITH DATA1 AS (
  SELECT TO_DATE('2004-08-01','YYYY-MM-DD') STARTDATE,
         TO_DATE('2007-01-08','YYYY-MM-DD') ENDDATE FROM DUAL
  UNION ALL
  SELECT TO_DATE('2005-06-02','YYYY-MM-DD') STARTDATE,
         TO_DATE('2007-05-08','YYYY-MM-DD') ENDDATE FROM DUAL
),
DATA2 AS (
  SELECT EXTRACT(YEAR FROM STARTDATE) ST, EXTRACT(YEAR FROM ENDDATE) ED
  FROM DATA1
),
data3 AS (
  SELECT level-1 line
  FROM DUAL
  CONNECT BY level <= (SELECT MAX(ed-st) FROM data2)
)
SELECT ST+LINE
FROM DATA2, DATA3
WHERE LINE <= ED-ST
ORDER BY 1
/

ST+LINE
----------
2004
2005
2005
2006
2006
2007

6 rows selected.
Try this query:

;with CTE as (
  select datepart(year, '2005-12-25') as yr
  union all
  select yr + 1
  from CTE
  where yr < datepart(year, '2013-11-14')
)
select yr from CTE
Try this: create a table with years as follows:

CREATE TABLE tblyears(y int)
INSERT INTO tblyears VALUES (1900);
INSERT INTO tblyears VALUES (1901);
INSERT INTO tblyears VALUES (1902);
and so on until
INSERT INTO tblyears VALUES (2100)

So, you'll write this query:

SELECT y.y
FROM tblyears y
JOIN table1 t
  ON y.y >= EXTRACT(year from startdate)
 AND y.y <= EXTRACT(year from enddate)
ORDER BY y.y

Show SqlFiddle
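All three answers generate rows for the years between the two dates. In plain Python (a sketch of my own), the same expansion is a nested loop over the rows and the inclusive year range:

```python
from datetime import date

# The two sample rows: (startdate, enddate).
spans = [(date(2004, 8, 1), date(2007, 1, 8)),
         (date(2005, 6, 2), date(2007, 5, 8))]

# One output row per year in each [start.year, end.year] span.
years = [y for start, end in spans
         for y in range(start.year, end.year + 1)]
# -> [2004, 2005, 2006, 2007, 2005, 2006, 2007]
```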