Getting a period index from a date in PostgreSQL - sql

Here is some Postgres code I wrote; it works. Is there a more efficient way to write it? My goal is to get how many periods a given date is from 2014-03-01. One period is a half-year starting in March or September.
I updated this code below on 2022-05-18 at 10:19 UTC+2
select date,
dense_rank() over (order by half_year_mar_sep) as period_index
from
(
select date as date,
case when extract(month from date) = 12 then (extract(year from date) || '-09-01')
when extract(month from date) in (1, 2) then (extract(year from date) - 1 || '-09-01')
when extract(month from date) in (3, 4, 5) then (extract(year from date) || '-03-01')
when extract(month from date) in (6, 7, 8) then (extract(year from date) || '-03-01')
else extract(year from date) || '-09-01'
end::date as half_year_mar_sep
from
(
select generate_series(date '2014-03-01', CURRENT_DATE, interval '1 day')::date as date
) s1
) s2
If I wrap the code above in select min(date), period_index from (<code above>) s3 group by 2 order by 1, this is the result I need:

WITH cte AS (
SELECT
date1::date,
rank() OVER (ORDER BY date1)
FROM generate_series(date '2014-03-01', CURRENT_DATE + interval '1' month, interval '6 month') g (date1)
),
cteall AS (
SELECT
all_date::date
FROM
generate_series(date '2014-03-01', CURRENT_DATE + interval '1' month, interval ' 1 day') s (all_date)
),
cte3 AS (
SELECT
*
FROM
cteall c1
LEFT JOIN cte c2 ON date1 = all_date
),
cte4 AS (
SELECT
*,
count(rank) OVER w AS ct_str
FROM
cte3
WINDOW w AS (ORDER BY all_date))
SELECT
*,
rank() OVER (PARTITION BY ct_str ORDER BY all_date) AS rank1,
dense_rank() OVER (ORDER BY all_date) AS dense_rank1
FROM
cte4;
Hope it's not intimidating. Personally I find CTEs a good tool, since they make the logic clearer.
demo
useful link: How to do forward fill as a PL/PGSQL function
If you don't need some of the columns, you can simply replace * with the columns you want.

Based on @Mark's answer I wrote the code below, but it is not simpler than the original code.
select s.date,
m.period_index
from
(
select date::date as half_year_start,
rank() over (order by date) as period_index,
coalesce(lead(date::date, 1) over (order by date), CURRENT_DATE + 1) as following_half_year_start
from generate_series(date '2014-03-01', CURRENT_DATE + interval '1' month, interval '6 month') as date
) m
left join
(
select generate_series(date '2014-03-01', CURRENT_DATE, interval '1 day')::date as date
) s
-- half-open range, so a period start date belongs to exactly one period
on s.date >= m.half_year_start and s.date < m.following_half_year_start
;
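For comparison, the period index can also be computed with plain month arithmetic, without a CASE expression, a join, or a window function. This is only a sketch under two assumptions: every date is on or after 2014-03-01, and period 1 starts on that date. The aliases g and d are made up for the example.

```sql
-- Sketch: count whole half-years since March 2014 using month numbers.
-- Assumes no date earlier than 2014-03-01 (integer division would then
-- truncate toward zero and give misleading results).
select d::date as date,
       ( (extract(year from d)::int * 12 + extract(month from d)::int)
         - (2014 * 12 + 3)          -- month number of the first period start
       ) / 6 + 1 as period_index    -- integer division: 6 months per period
from generate_series(date '2014-03-01', current_date, interval '1 day') as g(d);
```

With this formula 2014-08-31 gets period 1, 2014-09-01 and 2015-02-28 get period 2, and 2015-03-01 gets period 3, which matches the dense_rank() numbering as long as the date series has no gaps.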

Related

ORACLE SQL QUERY SYSDATE

I need help with my query
select distinct count(item_number), creation_date
from EGP_SYSTEM_ITEMS_B ,
All I need is to count the item numbers every month,
for example
3-9-2020 count:29700
4-9-2020 count:29600
5-9-2020 Count:30000
and get all the dates for the current month and the previous month, from creation_id or sysdate, either of them.
thanks
To count the number per month, you would aggregate by the month:
select trunc(creation_date, 'MON') as yyyymm, count(*)
from EGP_SYSTEM_ITEMS_B
group by trunc(creation_date, 'MON');
Your question is not entirely clear; perhaps you're trying to get
select TRUNC(CREATION_DATE), count(item_number)
from EGP_SYSTEM_ITEMS_B
WHERE TRUNC(CREATION_DATE)
IN (ADD_MONTHS(TRUNC(SYSDATE), -1),
TRUNC(SYSDATE),
ADD_MONTHS(TRUNC(SYSDATE), 1))
GROUP BY TRUNC(CREATION_DATE)
EDIT
Apparently OP wants a running monthly count, so something like:
WITH cteLimits (START_DATE, END_DATE)
AS (SELECT ADD_MONTHS(TRUNC(SYSDATE), -2), ADD_MONTHS(TRUNC(SYSDATE), -1) - INTERVAL '1' DAY FROM DUAL UNION ALL
SELECT ADD_MONTHS(TRUNC(SYSDATE), -1), TRUNC(SYSDATE) - INTERVAL '1' DAY FROM DUAL UNION ALL
SELECT TRUNC(SYSDATE), ADD_MONTHS(TRUNC(SYSDATE), 1) - INTERVAL '1' DAY FROM DUAL),
cteDay_totals
AS (SELECT TRUNC(CREATION_DATE) AS CREATION_DATE,
COUNT(*) AS DAY_TOTAL
FROM EGP_SYSTEM_ITEMS_B
GROUP BY TRUNC(CREATION_DATE))
SELECT l.START_DATE,
l.END_DATE,
SUM(d.DAY_TOTAL) AS MONTH_TOTAL
FROM cteLimits l
INNER JOIN cteDay_totals d
ON d.CREATION_DATE BETWEEN l.START_DATE AND l.END_DATE
GROUP BY l.START_DATE,
l.END_DATE
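If the goal is instead a count per day for the current and the previous calendar month, which is one possible reading of the example in the question, a range filter on the month boundaries may be enough. This is a sketch under that assumption; the column aliases are invented for the example.

```sql
-- Sketch: daily item counts for the previous and the current calendar month.
-- TRUNC(SYSDATE, 'MM') is the first day of the current month.
SELECT TRUNC(creation_date) AS creation_day,
       COUNT(item_number)   AS item_count
FROM   EGP_SYSTEM_ITEMS_B
WHERE  creation_date >= ADD_MONTHS(TRUNC(SYSDATE, 'MM'), -1)  -- start of previous month
AND    creation_date <  ADD_MONTHS(TRUNC(SYSDATE, 'MM'), 1)   -- start of next month
GROUP  BY TRUNC(creation_date)
ORDER  BY creation_day;
```

Filtering on creation_date directly, rather than on TRUNC(creation_date), leaves the optimizer free to use an index on that column if one exists.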

Window function is not allowed in where clause redshift

I have a dates CTE in the query below that uses a LIMIT clause, which I want to avoid. I am trying to understand how to rewrite the dates CTE so that it no longer needs LIMIT 8.
WITH dates AS (
SELECT (date_trunc('week', getdate() + INTERVAL '1 day')::date - 7 * (row_number() over (order by true) - 1) - INTERVAL '1 day')::date AS week_column
FROM dimensions.customer LIMIT 8
)
SELECT
dates.week_column,
'W' || ceiling(date_part('week', dates.week_column + INTERVAL '1 day')) AS week_number,
COUNT(DISTINCT features.client_id) AS total
FROM dimensions.program features
JOIN dates ON features.last_update <= dates.week_column
WHERE features.type = 'capacity'
AND features.status = 'CURRENT'
GROUP BY dates.week_column
ORDER by dates.week_column DESC
Below is the output I get from my inner dates CTE query:
SELECT (date_trunc('week', getdate() + INTERVAL '1 day')::date - 7 * (row_number() over (order by true) - 1) - INTERVAL '1 day')::date AS week_column
FROM dimensions.customer LIMIT 8
Output from dates CTE :
2021-01-10
2021-01-03
2020-12-27
2020-12-20
2020-12-13
2020-12-06
2020-11-29
2020-11-22
Is there any way to avoid LIMIT 8 in my CTE and still get the same output? Our platform doesn't allow queries that contain a LIMIT clause, so I am trying to see whether I can rewrite it differently in Redshift SQL.
If I modify my dates CTE like this, I get the error that a window function is not allowed in the WHERE clause.
WITH dates AS (
SELECT (date_trunc('week', getdate() + INTERVAL '1 day')::date - 7 * (row_number() over (order by true) - 1) - INTERVAL '1 day')::date AS week_column,
ROW_NUMBER() OVER () as seqnum
FROM dimensions.customer
WHERE seqnum <= 8;
)
....
Update
Something like this you mean?
WITH dates AS (
SELECT (date_trunc('week', getdate() + INTERVAL '1 day')::date - 7 * (row_number() over (order by true) - 1) - INTERVAL '1 day')::date AS week_column,
ROW_NUMBER() OVER () as seqnum
FROM dimensions.customer
)
SELECT
dates.week_column,
'W' || ceiling(date_part('week', dates.week_column + INTERVAL '1 day')) AS week_number,
COUNT(DISTINCT features.client_id) AS total
FROM dimensions.program features
JOIN dates ON features.last_update <= dates.week_column
WHERE dates.seqnum <= 8
AND features.type = 'capacity'
AND features.status = 'CURRENT'
GROUP BY dates.week_column
ORDER by dates.week_column DESC
Just move your WHERE clause to the outer SELECT. Seqnum doesn't exist until the CTE runs, but it does exist when the result of the CTE is consumed.
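The same idea can be kept entirely inside the WITH clause by numbering the rows in one CTE and filtering in the next, so the filter always runs on an already computed column. This is a sketch only; the CTE name numbered is made up, and it still borrows rows from dimensions.customer, so it assumes that table has at least 8 rows.

```sql
-- Sketch: compute the window function first, filter on its result afterwards.
WITH numbered AS (
    SELECT ROW_NUMBER() OVER (ORDER BY TRUE) AS seqnum
    FROM dimensions.customer
),
dates AS (
    SELECT (date_trunc('week', getdate() + INTERVAL '1 day')::date
            - (7 * (seqnum - 1))::int
            - INTERVAL '1 day')::date AS week_column
    FROM numbered
    WHERE seqnum <= 8          -- allowed here: seqnum is a plain column now
)
SELECT week_column
FROM dates
ORDER BY week_column DESC;
```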
UPDATE ...
After moving the where clause AndyP got a correlated subquery error coming from a WHERE clause not included in the posted query. As shown in this somewhat modified query:
WITH dates AS
(
SELECT (DATE_TRUNC('week',getdate () +INTERVAL '1 day')::DATE- 7*(ROW_NUMBER() OVER (ORDER BY TRUE) - 1) -INTERVAL '1 day')::DATE AS week_of
FROM (SELECT 1 AS X UNION ALL SELECT 1 AS X UNION ALL SELECT 1 AS X UNION ALL SELECT 1 AS X UNION ALL SELECT 1 AS X UNION ALL SELECT 1 AS X UNION ALL SELECT 1 AS X UNION ALL SELECT 1 AS X)
)
SELECT dates.week_of,
'W' || CEILING(DATE_PART('week',dates.week_of +INTERVAL '1 day')) AS week_number,
COUNT(DISTINCT features.id) AS total
FROM dimensions.program features
JOIN dates ON features.last_update <= dates.week_of
WHERE features.version = (SELECT MAX(version)
FROM headers f2
WHERE features.id = f2.id
AND features.type = f2.type
AND f2.last_update <= dates.week_of)
AND features.type = 'type'
AND features.status = 'live'
GROUP BY dates.week_of
ORDER BY dates.week_of DESC;
This was an interesting replacement of a correlated query with a join due to the inequality in the correlated sub query. We thought others might be helped by posting the final solution. This works:
WITH dates AS
(
SELECT (DATE_TRUNC('week',getdate () +INTERVAL '1 day')::DATE- 7*(ROW_NUMBER() OVER (ORDER BY TRUE) - 1) -INTERVAL '1 day')::DATE AS week_of
FROM (SELECT 1 AS X UNION ALL SELECT 1 AS X UNION ALL SELECT 1 AS X UNION ALL SELECT 1 AS X UNION ALL SELECT 1 AS X UNION ALL SELECT 1 AS X UNION ALL SELECT 1 AS X UNION ALL SELECT 1 AS X)
)
SELECT dates.week_of,
'W' || CEILING(DATE_PART('week',dates.week_of +INTERVAL '1 day')) AS week_number,
COUNT(DISTINCT features.carrier_id) AS total
FROM dimensions.program features
JOIN dates ON features.last_update <= dates.week_of
JOIN (SELECT MAX(MAX(version)) OVER(Partition by id, type Order by dates.week_of rows unbounded preceding) AS feature_version,
f2.id,
f2.type,
dates.week_of
FROM dimensions.headers f2
JOIN dates ON f2.last_update <= dates.week_of
GROUP BY f2.id,
f2.type,
dates.week_of) f2
ON features.id = f2.id
AND features.type = f2.type
AND f2.week_of = dates.week_of
AND features.version = f2.feature_version
WHERE features.type = 'type'
AND features.status = 'live'
GROUP BY dates.week_of
ORDER BY dates.week_of DESC;
The key was building a data set that holds Max(version) for every possible week_of value. Hopefully having both of these queries posted will help others fix correlated subquery errors.

create table with dates - sql

I have a query that can create a table with dates like below:
with digit as (
select 0 as d union all
select 1 union all select 2 union all select 3 union all
select 4 union all select 5 union all select 6 union all
select 7 union all select 8 union all select 9
),
seq as (
select a.d + (10 * b.d) + (100 * c.d) + (1000 * d.d) as num
from digit a
cross join
digit b
cross join
digit c
cross join
digit d
order by 1
)
select (last_day(sysdate)::date - seq.num)::date as "Date"
from seq;
How could this be changed to generate only dates
Thanks
demo:db<>fiddle
WITH dates AS (
SELECT
date_trunc('month', CURRENT_DATE) AS first_day_of_month,
date_trunc('month', CURRENT_DATE) + interval '1 month -1 day' AS last_day_of_month
)
SELECT
generate_series(first_day_of_month, last_day_of_month, interval '1 day')::date
FROM dates
date_trunc() truncates a date (or timestamp) to the given precision. date_trunc('month', ...) keeps the year and month and resets every smaller part to its lowest possible value, so the day becomes 1. That is how you get the first day of the month.
Adding a month returns the first day of the next month; subtracting a day from that gives the last day of the current month.
Finally, you can generate a date series between the start and end dates using the generate_series() function.
Edit: Redshift does not support generate_series() with date or timestamp arguments, only with integers. So we need to generate an integer series instead and add each value to the first day of the month:
db<>fiddle
WITH dates AS (
SELECT
date_trunc('month', CURRENT_DATE) AS first_day_of_month,
date_trunc('month', CURRENT_DATE) + interval '1 month -1 day' AS last_day_of_month
)
SELECT
first_day_of_month::date + gs
FROM
dates,
generate_series(
date_part('day', first_day_of_month)::int - 1,
date_part('day', last_day_of_month)::int - 1
) as gs
This answers the original version of the question.
You would use generate_series():
select gs.dte
from generate_series(date_trunc('month', now()::date),
date_trunc('month', now()::date) + interval '1 month' - interval '1 day',
interval '1 day'
) gs(dte);
Here is a db<>fiddle.
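If generate_series() is not an option at all, the digit cross join from the question can serve as the number source. The sketch below assumes the intended result is the dates of the current month, as the answers above do, and it uses the Redshift functions last_day() and date_part() that the question's own query style suggests.

```sql
-- Sketch for Redshift: build 0..99 from a digit cross join and keep only
-- the offsets that fall inside the current month.
with digit as (
    select 0 as d union all
    select 1 union all select 2 union all select 3 union all
    select 4 union all select 5 union all select 6 union all
    select 7 union all select 8 union all select 9
),
seq as (
    -- two digits are enough: no month has more than 31 days
    select a.d + (10 * b.d) as num
    from digit a
    cross join digit b
)
select (date_trunc('month', current_date)::date + seq.num)::date as "Date"
from seq
where seq.num < date_part('day', last_day(current_date))  -- number of days in the month
order by 1;
```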

postgreSQL: How Select the nearest date that is not null

I have a date and want to find all past records that share its month and day.
The problem occurs when that date does not exist in a given year, for example 29 February.
My goal is to get the nearest earlier date when the date itself does not exist.
This is my current query, using the date 2012-02-29:
SELECT date, amount
FROM table_name
WHERE
EXTRACT(MONTH FROM date) = EXTRACT(MONTH FROM DATE('2012-02-29') )
AND EXTRACT(DAY FROM date) = EXTRACT(DAY FROM DATE('2012-02-29') )
AND date < '2012-02-29'
ORDER BY date DESC LIMIT 10;
If I understand correctly, you want one date per year with the property that that day is nearest to the given date.
I would suggest using distinct on:
select distinct on (date_trunc('year', date)) t.*
from table_name t
order by date_trunc('year', date),
abs(date_part('day', (date -
(date '2012-02-29' -
(extract(year from date '2012-02-29') - extract(year from date)) * interval '1 year'
)
)
)
);
EDIT:
An example of working code:
select distinct on (date_trunc('year', date)) t.*
from table_name t
order by date_trunc('year', date),
abs(date_part('day', date - (date '2012-02-29' -
((extract(year from date '2012-02-29') - extract(year from date)) * interval '1 year')
)
))
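A different way to approach the same problem is to let PostgreSQL compute the anniversaries: subtracting whole years from 2012-02-29 clamps the day to the end of February, so non-leap years yield 28 February. The sketch below assumes the date column is of type date, that looking back 10 years is enough, and that a row exists on the clamped day; it does not search further back if that day is missing.

```sql
-- Sketch: match rows on the yearly anniversaries of 2012-02-29.
-- date - interval yields a timestamp, hence the cast back to date.
select t.date, t.amount
from generate_series(1, 10) as n          -- look back 1 to 10 years
join table_name t
  on t.date = (date '2012-02-29' - n * interval '1 year')::date
order by t.date desc;
```

For n = 1 the join key is 2011-02-28, and for n = 4 it is 2008-02-29.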

update table with dates with month

There's a table dates_calendar:
id | date
-------------------------
13 | 2016-10-23 00:00:00
14 | 2016-10-24 00:00:00
I need to update this table and insert dates until the next month counting from the last date in the table. E.g. last date is 2016-10-24 00:00:00 - I need to insert dates till 2016-10-31. After that (the last date now is 2016-10-31) next statement call should insert dates till 2016-11-30 and so on.
Example of my SQL code, but it inserts 30 days all the time.
INSERT INTO dates_calendar (date)
VALUES (
generate_series(
(SELECT date FROM dates_calendar ORDER BY date DESC LIMIT 1) + interval '1 day',
(SELECT date FROM dates_calendar ORDER BY date DESC LIMIT 1) + interval '1 month',
'1 day'
)
);
I'm using PostgreSQL. It would also be nice to get rid of the duplicated SELECT statement for the last date.
insert into dates_calendar (date)
select dates::date
from (
select max(date)::date+ 1 next_day, '1day'::interval one_day, '1month'::interval one_month
from dates_calendar
) s,
generate_series(
next_day,
date_trunc('month', next_day)+ one_month- one_day,
one_day) dates;
To calculate the first and last date you need to insert you can use this query:
select max(date) + interval '1' day as first_day,
date_trunc('month', max(date) + interval '1' day) + interval '1' month - interval '1' day as last_day
from dates_calendar
The expression date_trunc('month', max(date) + interval '1' day) + interval '1' month calculates the start of the month that follows the first new date. Subtracting one day from that gives the last day of the month to fill. Truncating the first new date, rather than max(date) itself, keeps this correct when the last stored date is already a month end: for 2016-10-31 the range is 2016-11-01 to 2016-11-30.
This can then be used to generate the list of dates:
with from_to (first_day, last_day) as (
select max(date) + interval '1' day,
date_trunc('month', max(date) + interval '1' day) + interval '1' month - interval '1' day
from dates_calendar
)
select dt
from generate_series( (select first_day from from_to), (select last_day from from_to), interval '1' day) as t(dt);
And finally this can be used to insert the generated rows into the table:
with from_to (first_day, last_day) as (
select max(date) + interval '1' day,
date_trunc('month', max(date) + interval '1' day) + interval '1' month - interval '1' day
from dates_calendar
)
insert into dates_calendar (date)
select dt
from generate_series( (select first_day from from_to), (select last_day from from_to), interval '1' day) as t(dt);
with max_date (d) as (select max(date)::date from dates_calendar)
insert into dates_calendar (date)
select d
from generate_series (
(select d from max_date) + 1,
(select (date_trunc('month', d + 1) + interval '1 month')::date - 1 from max_date),
'1 day'
) g(d)