Oracle 9i Group by offset date time - sql

I have a table of orders which have a create_date_time (e.g. 02/12/2015 14:00:44).
What I would like to do is group two months' worth of orders by this create_date_time, but instead of using TRUNC to get a proper calendar day I'd like each day to run from 6am to 6am. I've tried the query below, but it doesn't work that way; it truncates first and then shifts the create_date_time.
select "Date", sum(CFS), sum(MCR) from
(select trunc(phi.create_date_Time)+6/24 as "Date",
case when pkt_sfx = 'CFS' then sum(total_nbr_of_units)
End as CFS,
case when pkt_sfx <> 'CFS' then sum(total_nbr_of_units)
end as MCR
from pkt_hdr ph
inner join pkt_hdr_intrnl phi
on phi.pkt_ctrl_nbr = ph.pkt_ctrl_nbr
where sale_grp = 'I'
group by trunc(phi.create_date_time)+6/24, pkt_sfx
union
select trunc(phi.create_date_Time)+6/24 as "Date",
case when pkt_sfx = 'CFS' then sum(total_nbr_of_units)
End as CFS,
case when pkt_sfx <> 'CFS' then sum(total_nbr_of_units)
end as MCR
from wm_archive.pkt_hdr ph
inner join wm_archive.pkt_hdr_intrnl PHI
on phi.pkt_Ctrl_nbr = ph.pkt_ctrl_nbr
where sale_grp = 'I'
and trunc(phi.create_date_time) >= trunc(sysdate)-60
group by trunc(phi.create_date_time)+6/24, pkt_sfx
)
group by "Date"
Please note the UNION isn't necessarily the focus here, but it is required because half the results are archived, and the current archive day causes a date overlap that the outer query has to remove.
Thanks

If I understood correctly, you need to subtract six hours from the date and then truncate it:
select trunc(dt-6/24) dt, sum(units) u
from ( select dt, units from t1 union all
select dt, units from t2 )
group by trunc(dt-6/24)
Test:
create table t1 (dt date, units number(5));
insert into t1 values (timestamp '2015-12-01 12:47:00', 7);
insert into t1 values (timestamp '2015-12-01 23:47:00', 7);
create table t2 (dt date, units number(5));
insert into t2 values (timestamp '2015-12-02 05:47:00', 7);
insert into t2 values (timestamp '2015-12-02 14:47:00', 7);
Output:
DT           U
---------- ---
2015-12-01  21
2015-12-02   7
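Applied to the query from the question, the same idea might look like the sketch below (untested; table and column names taken from the question, order_day is just a placeholder name, and conditional aggregation is used so the extra pkt_sfx grouping is no longer needed). The archive branch of the UNION would need the same change; if both branches are kept, the outer SUM over the union still applies.
select trunc(phi.create_date_time - 6/24) as order_day,
       sum(case when pkt_sfx = 'CFS' then total_nbr_of_units end) as cfs,
       sum(case when pkt_sfx <> 'CFS' then total_nbr_of_units end) as mcr
from pkt_hdr ph
inner join pkt_hdr_intrnl phi
  on phi.pkt_ctrl_nbr = ph.pkt_ctrl_nbr
where sale_grp = 'I'
group by trunc(phi.create_date_time - 6/24)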

Just swap the order of the TRUNC and the 6-hour offset - instead of
select trunc(phi.create_date_Time)+6/24 as "Date"
use
select trunc(phi.create_date_Time + 6/24) as "Date"
(you also need to change the other occurrences of trunc())
BTW: I'd use another name for the "Date" column - DATE is a SQL data type, so having a column named "Date" is somewhat confusing.

Related

SQL: Insert rows with sum of DLY rows less than the WKLY rows

Requirement: insert rows only for those weeks where the SUM of the DLY rows is less than the WKLY value, and the DLY dates fall within the date range of the WKLY row.
DDL:
create or replace table table_a
(
ID number,
qty number,
date_from date,
date_to date,
grain String
);
insert into table_a values (1,102,'2020-07-04','2020-07-04','DLY');
insert into table_a values (1,1028,'2020-07-05','2020-07-05','DLY');
insert into table_a values (1,2828,'2020-07-06','2020-07-06','DLY');
insert into table_a values (1,3870,'2020-07-05','2020-07-11','WKLY');
I need to insert a new row containing the difference between the SUM of the DLY quantities and the WKLY quantity.
Tried:
select ID , sum(impression) over(partition by id , time_grain),date_from,date_to,time_grain
from tempdw.test_impress;
I don't have access to Snowflake but here's an example worked out (and tested with your sample data) using PostgreSQL. Hopefully you can tweak it for your own flavour of SQL.
INSERT INTO table_a
SELECT id,
missing_qty,
missing_date,
missing_date,
'DLY' AS grain
FROM ( /* NOTE: Use Average of b.qty because the value repeats on each row selected */
SELECT a.id,
Cast(Avg(b.qty) - Sum(a.qty) AS INTEGER) AS missing_qty,
( /* NOTE: Find an unused date in the week */
SELECT date(date_from + interval '1 day')
FROM table_a
WHERE grain = 'DLY'
AND date_from + interval '1 day' NOT IN
(
SELECT date_from
FROM table_a
WHERE id = a.id
AND grain <> 'WKLY') ) AS missing_date
FROM table_a a
JOIN table_a b
ON a.id = b.id
AND a.date_from BETWEEN b.date_from AND b.date_to
AND a.grain = 'DLY'
AND b.grain = 'WKLY'
GROUP BY a.id ) x
WHERE missing_qty > 0
This seems to work based on the data you've provided:
alter session set week_start = 7; -- Sets start of week to Sunday
insert into table_a (ID, qty, date_from, date_to, grain)
with t1 as (
select *
, concat(year(date_from),'-',week(date_from)) as year_week -- Week used to group records
, max(date_to) over (partition by grain, year_week) as max_dly_date -- Max date already used within week
,dateadd(day,1,max_dly_date) as new_dly_date -- Next date after the max date
,sum(qty) over (partition by grain, year_week) as sum_dly_qty -- Total qty by week and grain
from table_a
)
select dly.ID, (wkly.qty - dly.sum_dly_qty), dly.new_dly_date, dly.new_dly_date, 'DLY'
from t1 dly
inner join t1 wkly on dly.year_week = wkly.year_week and wkly.grain = 'WKLY'
where dly.grain = 'DLY' and dly.date_to = dly.max_dly_date; -- We only need one DLY record in each week
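A quick sanity check after the insert (a sketch only, untested, assuming the sample table above): the DLY total per week should now match the WKLY value.
alter session set week_start = 7; -- same week definition as above
select id,
       concat(year(date_from), '-', week(date_from)) as year_week,
       sum(case when grain = 'DLY'  then qty end) as dly_total,
       sum(case when grain = 'WKLY' then qty end) as wkly_total
from table_a
group by 1, 2;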

Postgresql left join date_trunc with default values

I have 3 tables which I'm querying based on different conditions. I have from and to params, and these are what I use to build the time range in which I look for data in those tables.
For instance, if from equals '2020-07-01' and to equals '2020-08-01', I expect to receive the rows of the tables grouped by week. If some weeks don't have records I want to return 0, and if several tables have records for the same week, I'd like to sum them.
Currently I have this:
SELECT d.day, COALESCE(t.total, 0)
FROM (
SELECT day::date
FROM generate_series(timestamp '2020-07-01',
timestamp '2020-08-01',
interval '1 week') day
) d
LEFT JOIN (
SELECT date AS day,
SUM(total)
FROM table1
WHERE id = '1'
AND date BETWEEN '2020-07-01' AND '2020-08-01'
GROUP BY day
) t USING (day)
ORDER BY d.day;
I'm generating a series of dates grouped by week, and on top of that I'm adding a LEFT JOIN. For some reason it only works if the dates match exactly; otherwise COALESCE(t.total, 0) returns 0 even when that week's SUM(total) is not 0.
I'm applying LEFT JOINs to other tables in the same query in the same way, so I run into the same problem with those as well.
Please see if this works for you. Whenever you find yourself aggregating more than once, ask yourself whether it is necessary.
Rather than try to match on discrete days, use time ranges.
with limits as (
select '2020-07-01'::timestamp as dt_start,
'2020-08-01'::timestamp as dt_end
), weeks as (
SELECT x.day::date as day, least(x.day::date + 7, dt_end::date) as day_end
FROM limits l
CROSS JOIN LATERAL
generate_series(l.dt_start, l.dt_end, interval '1 week') as x(day)
WHERE x.day::date != least(x.day::date + 7, dt_end::date)
), t1 as (
select w.day,
sum(coalesce(t.total, 0)) as t1total
from weeks w
left join table1 t
on t.id = 1
and t.date >= w.day
and t.date < w.day_end
group by w.day
), t2 as (
select w.day,
sum(coalesce(t.sum_measure, 0)) as t2total
from weeks w
left join table2 t
on t.something = 'whatever'
and t.date >= w.day
and t.date < w.day_end
group by w.day
)
select t1.day,
t1.t1total,
t2.t2total
from t1
join t2 on t2.day = t1.day;
You can keep adding tables like that with CTEs.
My earlier example with multiple left joins was bad because it blows out the rows due to a lack of join conditions between the left-joined tables.
There is an interesting corner case, e.g. 2019-02-01 to 2019-03-01, which returns an empty interval as the last week. I have updated the query to filter that out.
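As noted above, more tables can be folded in with additional CTEs. A minimal sketch, assuming a hypothetical table3 with date and total columns, appended to the WITH list above:
, t3 as (
    select w.day,
           sum(coalesce(t.total, 0)) as t3total
    from weeks w
    left join table3 t
           on t.date >= w.day
          and t.date < w.day_end
    group by w.day
)
select t1.day, t1.t1total, t2.t2total, t3.t3total
from t1
join t2 on t2.day = t1.day
join t3 on t3.day = t1.day;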

Last day of month joined to second table

I need to get the last day of the previous month and then join this to another table to return the year/month column that the date relates to, but I'm struggling to achieve what I want.
I have tried:
SELECT b.yrmonth, LAST_DAY(ADD_MONTHS(SYSDATE,-1)) DT
FROM dual a
INNER JOIN D_DAY b on DT = b.DT
The year/month column just returns everything in the table rather than just one row, so any help would be much appreciated!
Your query is effectively:
SELECT b.yrmonth,
'some constant masking b.DT' DT
FROM dual a
INNER JOIN
D_DAY b
on ( b.DT = b.DT ) -- Always true
You do not need to join the DUAL table; you need to filter your table in the WHERE clause instead.
If the DT date column has varying time components:
SELECT yrmonth, dt
FROM D_DAY
WHERE DT >= TRUNC(LAST_DAY(ADD_MONTHS(SYSDATE,-1)))
AND DT < TRUNC(SYSDATE,'MM');
(Which will allow the database to use indexes on the DT column)
or, if your DT column always has dates with the time component at midnight:
SELECT yrmonth, dt
FROM D_DAY
WHERE DT = TRUNC(LAST_DAY(ADD_MONTHS(SYSDATE,-1)));
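A quick way to check what those two boundary expressions evaluate to is to select them from DUAL first:
-- e.g. run on 02-DEC-2015 this returns 30-NOV-2015 and 01-DEC-2015
SELECT TRUNC(LAST_DAY(ADD_MONTHS(SYSDATE,-1))) AS last_day_prev_month,
       TRUNC(SYSDATE,'MM')                     AS first_day_curr_month
FROM dual;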
You don't need to join with the DUAL table. You can simply add your condition in the WHERE clause:
select *
from D_DAY b
where TRUNC(b.DT) = TRUNC(LAST_DAY(ADD_MONTHS(SYSDATE,-1)))
and TRUNC the value, because the LAST_DAY function returns a date with a time component; if you want to compare only the date part, you need to strip the time first.

Identify the actual TAT(Turn Around Time) by comparing two date columns

I have a table with the following structure, which I am using to find the TAT (Turn Around Time) between two dates. But due to the overlapping date ranges, I am unable to find the actual TAT.
Appln No   Start Date   End Date
1001009    01-10-15     06-10-15
1001009    02-10-15     04-10-15
1001009    03-10-15     04-10-15
1001009    03-10-15     05-10-15
1001009    04-10-15     07-10-15
1001009    09-10-15     10-10-15
1001009    12-10-15     16-10-15
1001009    14-10-15     17-10-15
After removing the overlapping dates from the above sample data, the output will be in the following format:
Appln No   Start Date   End Date
1001009    01-10-15     07-10-15
1001009    09-10-15     10-10-15
1001009    12-10-15     17-10-15
Since I am a beginner in SQL and am using Oracle SQL Developer, I am finding it difficult to turn the above logic into code. Any suggestion on the issue is welcome :)
Try this:
select t1.* from myTable t1
inner join myTable t2
on t2.StartDate > t1.StartDate and t2.StartDate < t1.EndDate
Rather a tricky task, as you can't trust any ordering of the intervals.
I attack it by removing the subintervals (intervals completely covered by another interval).
After this I can follow the order defined by START_DATE to see whether the preceding interval overlaps with the next one, and apply the standard grouping mechanism.
with subs as (
/* first remove all intervals that are subsets of other intervals */
select * from tst t1
where NOT exists (select null from tst t2 where t2.start_date < t1.start_date and t1.end_date < t2.end_date)
),overlap as (
select APPLN_NO, START_DATE, END_DATE,
case when (nvl(lag(END_DATE) over (partition by APPLN_NO order by START_DATE),START_DATE-1) < START_DATE) then
row_number() over (partition by APPLN_NO order by START_DATE) end grp
from subs),
overlap2 as (
select
APPLN_NO, START_DATE, END_DATE, GRP,
last_value(grp ignore nulls) over (partition by APPLN_NO order by START_DATE) as grp2
from overlap)
select
APPLN_NO, min(START_DATE) START_DATE, max(END_DATE) END_DATE
from overlap2
group by APPLN_NO, grp2
order by 1,2
;
To check the query, here is my setup:
drop table tst ;
create table tst
(appln_no number,
start_date date,
end_date date);
insert into tst values (1001009, to_date('01-10-15','dd-mm-rr'),to_date('06-10-15','dd-mm-rr'));
insert into tst values (1001009, to_date('02-10-15','dd-mm-rr'),to_date('04-10-15','dd-mm-rr'));
insert into tst values (1001009, to_date('03-10-15','dd-mm-rr'),to_date('04-10-15','dd-mm-rr'));
insert into tst values (1001009, to_date('03-10-15','dd-mm-rr'),to_date('05-10-15','dd-mm-rr'));
insert into tst values (1001009, to_date('04-10-15','dd-mm-rr'),to_date('07-10-15','dd-mm-rr'));
insert into tst values (1001009, to_date('09-10-15','dd-mm-rr'),to_date('10-10-15','dd-mm-rr'));
insert into tst values (1001009, to_date('12-10-15','dd-mm-rr'),to_date('16-10-15','dd-mm-rr'));
insert into tst values (1001009, to_date('13-10-15','dd-mm-rr'),to_date('14-10-15','dd-mm-rr')); /* this is added to make it more interesting */
insert into tst values (1001009, to_date('15-10-15','dd-mm-rr'),to_date('17-10-15','dd-mm-rr'));
gives:
  APPLN_NO START_DATE          END_DATE
---------- ------------------- -------------------
   1001009 01.10.2015 00:00:00 07.10.2015 00:00:00
   1001009 09.10.2015 00:00:00 10.10.2015 00:00:00
   1001009 12.10.2015 00:00:00 17.10.2015 00:00:00
as expected.
This is a tricky query. You need to identify groups of rows that overlap by assigning a grouping id. One way to do this is to find where the overlapping groups start, then accumulate the number of starts up to each record.
The following assumes that your table has a primary key (called id for lack of a better name).
This gives the opportunity to aggregate to get what you want:
select ApplnNo, min(start), max(end)
from (select t.*,
sum(IsGroupStart) over (partition by ApplnNo order by start) as grp
from (select t.*,
(case when exists (select 1
from t t2
where t2.end >= t.start and t2.start <= t.end and
t2.id <> t.id
)
then 0 else 1
end) as IsGroupStart
from t
) t
) t
group by ApplnNo, grp;
There are some nuances. The exact innermost subquery for exists depends on how you define overlaps. This includes even one day of overlap at the beginning or end.
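For example, if intervals that merely touch at a boundary should not be merged, the EXISTS test above could use strict inequalities instead (a sketch only, keeping the same hypothetical table t and id column as above):
(case when exists (select 1
                   from t t2
                   where t2.end > t.start and t2.start < t.end and
                         t2.id <> t.id
                  )
      then 0 else 1
 end) as IsGroupStart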

Join a count query on generate_series() and retrieve Null values as '0'

I want to count IDs per month using generate_series(). This query works in PostgreSQL 9.1:
SELECT (to_char(serie,'yyyy-mm')) AS year, sum(amount)::int AS eintraege FROM (
SELECT
COUNT(mytable.id) as amount,
generate_series::date as serie
FROM mytable
RIGHT JOIN generate_series(
(SELECT min(date_from) FROM mytable)::date,
(SELECT max(date_from) FROM mytable)::date,
interval '1 day') ON generate_series = date(date_from)
WHERE version = 1
GROUP BY generate_series
) AS foo
GROUP BY Year
ORDER BY Year ASC;
This is my output:
"2006-12" | 4
"2007-02" | 1
"2007-03" | 1
But what I want to get is this output ('0' value in January):
"2006-12" | 4
"2007-01" | 0
"2007-02" | 1
"2007-03" | 1
Months without id should be listed nevertheless.
Any ideas how to solve this?
Sample data:
drop table if exists mytable;
create table mytable(id bigint, version smallint, date_from timestamp);
insert into mytable(id, version, date_from) values
(4084036, 1, '2006-12-22 22:46:35'),
(4084938, 1, '2006-12-23 16:19:13'),
(4084938, 2, '2006-12-23 16:20:23'),
(4084939, 1, '2006-12-23 16:29:14'),
(4084954, 1, '2006-12-23 16:28:28'),
(4250653, 1, '2007-02-12 21:58:53'),
(4250657, 1, '2007-03-12 21:58:53')
;
Untangled, simplified and fixed, it might look like this:
SELECT to_char(s.tag,'yyyy-mm') AS monat
, count(t.id) AS eintraege
FROM (
SELECT generate_series(min(date_from)::date
, max(date_from)::date
, interval '1 day'
)::date AS tag
FROM mytable t
) s
LEFT JOIN mytable t ON t.date_from::date = s.tag AND t.version = 1
GROUP BY 1
ORDER BY 1;
db<>fiddle here
Among all the noise, misleading identifiers and unconventional format, the actual problem was hidden here:
WHERE version = 1
You made correct use of RIGHT [OUTER] JOIN. But adding a WHERE clause that requires an existing row from mytable effectively converts the RIGHT [OUTER] JOIN into an [INNER] JOIN.
Move that filter into the JOIN condition to make it work.
I simplified some other things while I was at it.
Better yet:
SELECT to_char(mon, 'yyyy-mm') AS monat
, COALESCE(t.ct, 0) AS eintraege
FROM (
SELECT date_trunc('month', date_from)::date AS mon
, count(*) AS ct
FROM mytable
WHERE version = 1
GROUP BY 1
) t
RIGHT JOIN (
SELECT generate_series(date_trunc('month', min(date_from))
, max(date_from)
, interval '1 mon')::date
FROM mytable
) m(mon) USING (mon)
ORDER BY mon;
db<>fiddle here
It's much cheaper to aggregate first and join later - joining one row per month instead of one row per day.
It's cheaper to base GROUP BY and ORDER BY on the date value instead of the rendered text.
count(*) is a bit faster than count(id), while equivalent in this query.
generate_series() is a bit faster and safer when based on timestamp instead of date. See:
Generating time series between two dates in PostgreSQL
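For illustration, a minimal example of the timestamp-based form (the date range is made up for the example):
-- one row per month; passing date literals instead would resolve to the
-- timestamptz variant of generate_series() and depend on the session time zone
SELECT to_char(mon, 'yyyy-mm') AS monat
FROM generate_series(timestamp '2020-01-01',
                     timestamp '2020-04-01',
                     interval '1 mon') AS g(mon);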