Netezza range of dates - sql

I want to reduce the manual work in a specific query. I hope I can phrase this clearly.
In Netezza, I want to generate a list of date values and run the query once for each value.
What I want to do is replace all the unions with one query.
SELECT DISTINCT COUNT(DISTINCT a.x) AS NO_OF_X, a.column1, 'JAN 2019'
FROM my_table a
WHERE 1=1
AND current_date BETWEEN a.date_from and a.date_to
GROUP BY 2,3
UNION ALL
SELECT DISTINCT COUNT(DISTINCT a.x) AS NO_OF_X, a.column1, 'DEC 2018'
FROM my_table a
WHERE 1=1
AND '2018-12-31' BETWEEN a.date_from and a.date_to
GROUP BY 2,3
UNION ALL
SELECT DISTINCT COUNT(DISTINCT a.x) AS NO_OF_X, a.column1, 'NOV 2018'
FROM my_table a
WHERE 1=1
AND '2018-11-30' BETWEEN a.date_from and a.date_to
GROUP BY 2,3
What I want to do is something like this:
SELECT DISTINCT COUNT(DISTINCT a.x) AS NO_OF_X, a.column1, last_day( date ) as "MONTH"
FROM my_table a
WHERE 1=1
AND /*run the query for all last_days in a range */
GROUP BY 2,3
Is this possible? I tried to write a CTE, but it is really important to get results for each specific last day of the month, because our data warehouse is designed to store different time slices for each transaction. I only want transactions whose time slice covers a specific last_day().
Cheers.

You need to generate a list of dates. Here is one method:
select to_char(m.dte, 'MON YYYY'), t.column1,
count(distinct t.x) AS NO_OF_X
from (SELECT current_date as dte UNION ALL
-- subtract whole months first, then one day, so every value is a true month end
SELECT date_trunc('month', current_date) - interval '1 day' as dte UNION ALL
SELECT date_trunc('month', current_date) - interval '1 month' - interval '1 day' as dte UNION ALL
SELECT date_trunc('month', current_date) - interval '2 month' - interval '1 day' as dte
) m left join
my_table t
on m.dte between t.date_from and t.date_to
group by to_char(m.dte, 'MON YYYY'), t.column1
order by min(m.dte), t.column1;
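If the range is longer than a few months, listing each date by hand gets tedious. Here is a minimal sketch of one alternative, assuming your Netezza version provides add_months() alongside the last_day() function used in the question. The derived table n and its column i are made-up names for illustration.

```sql
-- n lists how many months to go back (0 = current month); extend as needed
select last_day(add_months(current_date, -n.i)) as dte
from (select 0 as i union all
      select 1 union all
      select 2 union all
      select 3) n;
```

Unlike the query above, this returns the last day of the current month for i = 0 rather than current_date, so adjust that row if you need today's slice.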

Related

Getting a period index from a date in PostgreSQL

Here is some Postgres code I wrote, and it works. Is there a more efficient way to write it? My goal is to find how many periods a given date is from 2014-03-01. One period is a half-year starting in March or September.
I updated this code below on 2022-05-18 at 10:19 UTC+2
select date,
dense_rank() over (order by half_year_mar_sep) as period_index
from
(
select date as date,
case when extract(month from date) = 12 then (extract(year from date) || '-09-01')
when extract(month from date) in (1, 2) then (extract(year from date) - 1 || '-09-01')
when extract(month from date) in (3, 4, 5) then (extract(year from date) || '-03-01')
when extract(month from date) in (6, 7, 8) then (extract(year from date) || '-03-01')
else extract(year from date) || '-09-01'
end::date as half_year_mar_sep
from
(
select generate_series(date '2014-03-01', CURRENT_DATE, interval '1 day')::date as date
) s1
) s2
If I wrap the code above in select min(date), period_index from (<code above>) s3 group by 2 order by 1, I get the result I need.
WITH cte AS (
SELECT
date1::date,
rank() OVER (ORDER BY date1)
FROM generate_series(date '2014-03-01', CURRENT_DATE + interval '1' month, interval '6 month') g (date1)
),
cteall AS (
SELECT
all_date::date
FROM
generate_series(date '2014-03-01', CURRENT_DATE + interval '1' month, interval ' 1 day') s (all_date)
),
cte3 AS (
SELECT
*
FROM
cteall c1
LEFT JOIN cte c2 ON date1 = all_date
),
cte4 AS (
SELECT
*,
count(rank) OVER w AS ct_str
FROM
cte3
WINDOW w AS (ORDER BY all_date))
SELECT
*,
rank() OVER (PARTITION BY ct_str ORDER BY all_date) AS rank1,
dense_rank() OVER (ORDER BY all_date) AS dense_rank1
FROM
cte4;
I hope it's not intimidating. Personally, I find CTEs a good tool, since they make the logic clearer.
useful link: How to do forward fill as a PL/PGSQL function
If you don't need some of the columns, you can simply replace * with the columns you want.
Based on @Mark's answer I wrote the code below, but it's not simpler than the original code.
select s.date,
m.period_index
from
(
select date::date as half_year_start,
rank() over (order by date) as period_index,
-- order the window explicitly; the still-open last period ends after today
coalesce(lead(date::date, 1) over (order by date), CURRENT_DATE + 1) as following_half_year_start
from generate_series(date '2014-03-01', CURRENT_DATE + interval '1' month, interval '6 month') as date
) m
left join
(
select generate_series(date '2014-03-01', CURRENT_DATE, interval '1 day')::date as date
) s
-- half-open range, so a boundary day belongs to exactly one period
on s.date >= m.half_year_start and s.date < m.following_half_year_start
;
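Since every period is exactly six calendar months long, the index can also be computed arithmetically, without a window function or a join. This is a minimal sketch and not taken from any of the answers above. It assumes every date is on or after 2014-03-01, so the integer division never sees a negative number. The aliases g and d are made up for illustration.

```sql
select d::date as date,
       -- whole months elapsed since March 2014, in blocks of six, 1-based
       ((extract(year from d)::int - 2014) * 12
         + extract(month from d)::int - 3) / 6 + 1 as period_index
from generate_series(date '2014-03-01', current_date, interval '1 day') as g(d);
```

For example, any day in 2014-09 gives (0 * 12 + 9 - 3) / 6 + 1 = 2, and any day in 2015-02 gives (12 + 2 - 3) / 6 + 1 = 2, which matches the half-year that starts in September 2014.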

ORACLE SQL QUERY SYSDATE

I need help with my query
select distinct count(item_number), creation_date
from EGP_SYSTEM_ITEMS_B ,
All I need is to count the item numbers every month, for example:
3-9-2020 count: 29700
4-9-2020 count: 29600
5-9-2020 count: 30000
and to get all the dates for the current month and the previous month, based on creation_id or sysdate, either of them.
Thanks
To count the number per month, you would aggregate by the month:
select trunc(creation_date, 'MON') as yyyymm, count(*)
from EGP_SYSTEM_ITEMS_B
group by trunc(creation_date, 'MON');
Your question is not entirely clear; perhaps you're trying to get
select TRUNC(CREATION_DATE), count(item_number)
from EGP_SYSTEM_ITEMS_B
WHERE TRUNC(CREATION_DATE)
IN (ADD_MONTHS(TRUNC(SYSDATE), -1),
TRUNC(SYSDATE),
ADD_MONTHS(TRUNC(SYSDATE), 1))
GROUP BY TRUNC(CREATION_DATE)
EDIT
Apparently OP wants a running monthly count, so something like:
WITH cteLimits (START_DATE, END_DATE)
AS (SELECT ADD_MONTHS(TRUNC(SYSDATE), -2), ADD_MONTHS(TRUNC(SYSDATE), -1) - INTERVAL '1' DAY FROM DUAL UNION ALL
SELECT ADD_MONTHS(TRUNC(SYSDATE), -1), TRUNC(SYSDATE) - INTERVAL '1' DAY FROM DUAL UNION ALL
SELECT TRUNC(SYSDATE), ADD_MONTHS(TRUNC(SYSDATE), 1) - INTERVAL '1' DAY FROM DUAL),
cteDay_totals
AS (SELECT TRUNC(CREATION_DATE) AS CREATION_DATE,
COUNT(*) AS DAY_TOTAL
FROM EGP_SYSTEM_ITEMS_B
GROUP BY TRUNC(CREATION_DATE))
SELECT l.START_DATE,
l.END_DATE,
SUM(d.DAY_TOTAL) AS MONTH_TOTAL
FROM cteLimits l
INNER JOIN cteDay_totals d
ON d.CREATION_DATE BETWEEN l.START_DATE AND l.END_DATE
GROUP BY l.START_DATE,
l.END_DATE
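The example in the question lists one count per day, so another possible reading is a daily count restricted to the current and previous calendar month. This is a minimal sketch under that assumption. It also assumes creation_date is a DATE column; the aliases creation_day and item_count are made up for illustration.

```sql
select trunc(creation_date) as creation_day,
       count(item_number)   as item_count
from EGP_SYSTEM_ITEMS_B
-- from the first day of the previous month up to, but not including, the first day of next month
where creation_date >= add_months(trunc(sysdate, 'MM'), -1)
  and creation_date <  add_months(trunc(sysdate, 'MM'), 1)
group by trunc(creation_date)
order by creation_day;
```

Comparing the bare creation_date column against the two bounds, instead of wrapping it in TRUNC() inside the WHERE clause, leaves room for an ordinary index on that column to be used.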

create table with dates - sql

I have a query that can create a table with dates like below:
with digit as (
select 0 as d union all
select 1 union all select 2 union all select 3 union all
select 4 union all select 5 union all select 6 union all
select 7 union all select 8 union all select 9
),
seq as (
select a.d + (10 * b.d) + (100 * c.d) + (1000 * d.d) as num
from digit a
cross join
digit b
cross join
digit c
cross join
digit d
order by 1
)
select (last_day(sysdate)::date - seq.num)::date as "Date"
from seq;
How could this be changed to generate only dates
Thanks
WITH dates AS (
SELECT
date_trunc('month', CURRENT_DATE) AS first_day_of_month,
date_trunc('month', CURRENT_DATE) + interval '1 month -1 day' AS last_day_of_month
)
SELECT
generate_series(first_day_of_month, last_day_of_month, interval '1 day')::date
FROM dates
date_trunc() truncates a date (or timestamp) to the given precision. date_trunc('month', ...) keeps the year and month and resets every smaller part to its lowest value, so the day becomes 1. That is why it returns the first day of the month.
Adding a month gives the first day of the next month; subtracting a day from that gives the last day of the current month.
Finally, generate_series() produces the series of dates between the start and end date.
Edit: Redshift does not support generate_series() with date or timestamp arguments, only with integers. So we need to generate an integer series instead and add each value to the first day of the month:
WITH dates AS (
SELECT
date_trunc('month', CURRENT_DATE) AS first_day_of_month,
date_trunc('month', CURRENT_DATE) + interval '1 month -1 day' AS last_day_of_month
)
SELECT
first_day_of_month::date + gs
FROM
dates,
generate_series(
date_part('day', first_day_of_month)::int - 1,
date_part('day', last_day_of_month)::int - 1
) as gs
This answers the original version of the question.
You would use generate_series():
select gs.dte
from generate_series(date_trunc('month', now()::date),
date_trunc('month', now()::date) + interval '1 month' - interval '1 day',
interval '1 day'
) gs(dte);

Speed up query where results with count(*) = 0 are included

I have a table squitters with, amongst others, a column parsed_time. I want to know the number of records per hour for the last two days and used this query:
SELECT date_trunc('hour', parsed_time) AS hour , count(*)
FROM squitters
WHERE parsed_time > date_trunc('hour', now()) - interval '2 day'
GROUP BY hour
ORDER BY hour DESC;
This works, but hours with zero records do not appear in the result. I want hours with zero records to appear as well, with a count of zero, so I wrote this query using the generate_series function:
SELECT bins.hour, count(squitters.parsed_time)
FROM generate_series(date_trunc('hour', now() - interval '2 day'), now(), '1 hour') bins(hour)
LEFT OUTER JOIN squitters ON bins.hour = date_trunc('hours', squitters.parsed_time)
GROUP BY bins.hour
ORDER BY bins.hour DESC;
This works and the result includes hour bins with a count of zero, but it is considerably slower.
How can I have the speed of the first query with the count=zero results of the second query?
(btw. there is an index on parsed_time)
You could try and change the join condition so no date function is applied on column parsed_time:
SELECT b.hour, COUNT(s.parsed_time) cnt
FROM generate_series(date_trunc('hour', now() - interval '2 day'), now(), '1 hour') b(hour)
LEFT OUTER JOIN squitters s
ON s.parsed_time >= b.hour
AND s.parsed_time < b.hour + interval '1 hour'
GROUP BY b.hour
ORDER BY b.hour DESC;
Alternatively, you could also try using a correlated subquery (or a lateral join) instead of a left join - this avoids the need for outer aggregation:
SELECT
b.hour,
(
SELECT COUNT(*)
FROM squitters s
WHERE s.parsed_time >= b.hour AND s.parsed_time < b.hour + interval '1 hour'
) cnt
FROM generate_series(date_trunc('hour', now() - interval '2 day'), now(), '1 hour') b(hour)
ORDER BY b.hour desc
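The lateral join mentioned above is not spelled out, so here is a minimal sketch of what it could look like in PostgreSQL. The alias c and the column name cnt are chosen for illustration. Whether it is faster than the correlated subquery depends on your data and indexes.

```sql
SELECT b.hour, c.cnt
FROM generate_series(date_trunc('hour', now() - interval '2 day'), now(), '1 hour') b(hour)
CROSS JOIN LATERAL (
    -- an aggregate without GROUP BY always returns one row, so empty hours get 0
    SELECT count(*) AS cnt
    FROM squitters s
    WHERE s.parsed_time >= b.hour
      AND s.parsed_time < b.hour + interval '1 hour'
) c
ORDER BY b.hour DESC;
```

Because the range condition compares the bare parsed_time column, the index on parsed_time can be used for each hour bin.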
You could take advantage of Common Table Expressions to divide your problem into small chunks:
WITH cte AS (
--First query your table
SELECT date_trunc('hour', parsed_time) AS sq_hour , count(*)
FROM squitters
WHERE parsed_time > date_trunc('hour', now()) - interval '2 day'
GROUP BY sq_hour
ORDER BY sq_hour DESC
), series AS (
--Create the series without the data returned from 1st query
SELECT
bins.series_hour,
0
FROM
generate_series(date_trunc('hour', now() - interval '2 day'), now(), '1 hour') bins(series_hour)
WHERE
series_hour not in (SELECT sq_hour FROM cte)
)
--Union the result
SELECT * FROM cte
UNION
SELECT * FROM series
ORDER BY 1

Redshift FULL OUTER JOIN doesn't output NULL

We have a 'numbers' table that holds the values 0-10000 in its single column 'n'.
We have tableX, which has a calculated_at datetime and a term.
We are trying to fill the holes where tableX has no matches on the given dates. However, this doesn't seem to yield NULL or 0 for the non-matching dates...
select term
, avg(total::float)
, date_trunc('day', series.date) as date1
, date_trunc('day', calculated_at) as date2
from (select
(current_timestamp - interval '1 day' * numbers.n)::date as date
from numbers) as series
full outer join terms
on series.date = date_trunc('day', calculated_at)
where series.date BETWEEN '2017-07-01' AND '2017-07-30'
AND (term in ('term111') or term is null)
group by term
, date_trunc('day', series.date)
, date_trunc('day', calculated_at)
order by date_trunc('day', series.date) asc
The full outer join is fine. The problem is the filters. These are really tricky with a full outer join. I would recommend:
select t.term, avg(total::float),
date_trunc('day', series.date) as date1,
date_trunc('day', calculated_at) as date2
from (select (current_timestamp - interval '1 day' * numbers.n)::date as date
from numbers
where (current_timestamp - interval '1 day' * numbers.n)::date BETWEEN '2017-07-01' AND '2017-07-30'
) series full outer join
(select terms.*
from terms
where term = 'term111'
) t
on series.date = date_trunc('day', t.calculated_at)
group by t.term, date_trunc('day', series.date), date_trunc('day', calculated_at)
order by date_trunc('day', series.date) asc;
My guess though is that a left join would do what you want. I doubt a full outer join is what you really intend. If you have doubts, ask another question and provide sample data and desired results.
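A minimal sketch of that left join variant, assuming the goal is one row per day in the range with a NULL average on days that have no matching term. The alias avg_total is made up for illustration; the table and column names are taken from the question.

```sql
select series.date as date1,
       t.term,
       avg(t.total::float) as avg_total
from (select (current_timestamp - interval '1 day' * numbers.n)::date as date
      from numbers
      where (current_timestamp - interval '1 day' * numbers.n)::date
            between '2017-07-01' and '2017-07-30'
     ) series
left join terms t
  on series.date = date_trunc('day', t.calculated_at)
 -- filter on the joined table in ON, not WHERE, so unmatched days are kept
 and t.term = 'term111'
group by series.date, t.term
order by series.date asc;
```

Putting the term filter in the ON clause is the key design choice: a WHERE condition on t.term would discard the NULL rows that the outer join produces for the missing days.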