OR clause with subquery taking too much time - sql

A date range query is taking too much time.
If I remove one condition, it runs fine and takes 2 seconds. With that condition
included, it takes 30 seconds.
SELECT UserName,COUNT ('t') TOTAL
FROM TRANSACTIONS E1
WHERE E1.START_DATE BETWEEN TO_DATE ('20130101', 'YYYYMMDD') AND TO_DATE ('20140101', 'YYYYMMDD')
AND
(
TO_CHAR (E1.START_DATE, 'D') in ( 7)
OR Exists(SELECT 1 FROM HOLIDAYS TT
WHERE E1.START_DATE BETWEEN TT.DATE_FROM AND TT.DATE_TO )
)
AND EXISTS (SELECT 't' FROM TRANSACTIONS_ORG E2 WHERE E1.TRANTYPE = E2.tran_type)
GROUP BY UserName;
HOLIDAYS table
Id FromDate ToDate Description
1 1-Feb-11 3-Feb-11 Maintance
3 2-Sep-09 5-Sep-09 Eid Holiday Fine Block
4 3-Dec-09 4-Dec-09 Due to System Problem
5 4-Dec-07 04-Dec-07 National Day
EDITED
I figured out that the issue is not the date range but the OR clause in the following condition:
TO_CHAR (E1.START_DATE, 'D') in ( 5,6)
OR
Exists(SELECT 1 FROM HOLIDAYS TT
WHERE E1.START_DATE BETWEEN TT.DATE_FROM AND TT.DATE_TO )
If I replace OR with AND, it runs fine. If I reorder the conditions around OR, the issue remains.

The problem is likely with the OR <subquery> construct.
If there can only be one holiday for a particular date, then you could use the following:
select username
,count(*)
from transactions e1
left join holidays tt on(e1.start_date between tt.date_from and tt.date_to)
where e1.start_date between date '2017-02-01' and date '2018-02-01'
and ( to_char(e1.start_date, 'D') in(5, 6)
or tt.date_from is not null
)
and exists(
select *
from transactions_org e2
where e1.trantype = e2.tran_type
)
group
by username;
This entire category of problems can be solved by implementing a Calendar table. If you had such a table with one record per date, you could easily add columns indicating day of week and holiday flags and such. If your calendar table looked something like this:
DAY DAYNAME IS_WEEKEND IS_WEEKDAY HOLINAME
---------- --------- ---------- ---------- ------------
2017-02-01 WEDNESDAY 0 1
2017-02-02 THURSDAY 0 1
2017-02-03 FRIDAY 0 1 Some holiday
2017-02-04 SATURDAY 1 0
2017-02-05 SUNDAY 1 0
2017-02-06 MONDAY 0 1
2017-02-07 TUESDAY 0 1
2017-02-08 WEDNESDAY 0 1
Your query could be rewritten as:
select username
,count(*)
from transactions e1
join calendar c on(c.day = trunc(e1.start_date, 'DD')) -- Remove hours, minutes
where e1.start_date between date '2017-02-01' and date '2018-02-01'
and ( c.dayname in('THURSDAY', 'FRIDAY') -- Either specific weekdays
or c.holiname is not null -- or there is a holiday
)
and exists(
select *
from transactions_org e2
where e1.trantype = e2.tran_type
)
group
by username;
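To make the calendar-table idea concrete, here is a minimal Python sketch of how the rows of such a table could be generated before loading them into the database. The function name `calendar_rows`, the `holidays` mapping and the Saturday/Sunday weekend rule are assumptions for illustration only; they mirror the sample table above, not any existing schema.

```python
from datetime import date, timedelta

DAYNAMES = ("MONDAY", "TUESDAY", "WEDNESDAY", "THURSDAY",
            "FRIDAY", "SATURDAY", "SUNDAY")

def calendar_rows(start, end, holidays):
    """Build (day, dayname, is_weekend, is_weekday, holiname) for start..end inclusive."""
    rows = []
    day = start
    while day <= end:
        # Assumption: the weekend is Saturday/Sunday, as in the sample table.
        is_weekend = 1 if day.weekday() >= 5 else 0
        rows.append((day.isoformat(), DAYNAMES[day.weekday()],
                     is_weekend, 1 - is_weekend, holidays.get(day)))
        day += timedelta(days=1)
    return rows

# Hypothetical holiday list: one entry per holiday date.
holidays = {date(2017, 2, 3): "Some holiday"}
rows = calendar_rows(date(2017, 2, 1), date(2017, 2, 8), holidays)
for row in rows:
    print(row)
```

With one row per date, the weekday test and the holiday test both become plain column comparisons on the joined row, so no correlated subquery is needed.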

Related

Split current date into hourly intervals and get count of production

How can I split the current date into hourly intervals, like 00:00 - 01:00, for all 24 hours? Based on those intervals, I need to get the count of production, which is another column.
This is the query for the date column and the count column that I want to group by hourly interval.
select count(*),order_start_time_T
from UDA_Order UDA INNER JOIN WORK_Order WO ON WO.order_key = UDA.object_key
where order_state = 'BOOKED' OR order_state = 'CLOSED'
GROUP BY order_start_time_T
this returns me
Count order_start_time_T
2 2019-07-02 10:54:27.000
7 2019-07-02 10:55:27.000
1 2019-07-02 11:51:58.000
1 2019-07-02 11:58:41.000
1 2019-07-02 12:19:13.000
The result I expect is
Count Hour interval till 24 hours for current day
2 00:00 - 01:00
7 01:00 - 02:00
1 02:00 - 03:00
1 03:00 - 04:00
1 04:00 - 05:00
1 05:00 - 06:00
and so on till 24 hours for current day.
You need to use the DATEPART function, which returns the part of the date that you need. In your case that is the hour.
select count(*), CAST(order_start_time_T AS DATE) StartDate, DATEPART(HOUR, order_start_time_T) StartHr
from UDA_Order UDA INNER JOIN WORK_Order WO ON WO.order_key = UDA.object_key
where order_state = 'BOOKED' OR order_state = 'CLOSED'
GROUP BY CAST(order_start_time_T AS DATE), DATEPART(HOUR, order_start_time_T)
But this will not return the results in the format you want.
It will return them like this (for example):
Count StartDate StartHr
2 2019-07-02 10
7 2019-07-02 11
1 2019-07-02 12
I would try a helper table that holds a start hour (column h1) and an end hour (column h2). I used a temporary table, but a regular table or a table variable works too. The display column is for display purposes only.
First, I populate the table with start and end hours, starting from 0.
Second, I use DATEPART to get the hour of an order (order_start_time_T) and check which period that hour belongs to.
h1 h2 display
--- --- ---
0 1 00:00 - 01:00
1 2 01:00 - 02:00
....
23 24 23:00 - 24:00
Query:
-- Populate time table
if object_id('tempdb..#t') is not null drop table #t
create table #t (
h1 tinyint,
h2 tinyint,
display varchar(30)
);
declare @i tinyint = 0
while @i<24 begin
insert into #t (h1, h2, display) values(@i, @i+1
, case when @i<10 then '0' else '' end+cast(@i as varchar)
+':00 - ' + case when @i<9 then '0' else '' end+ cast(@i+1 as varchar)+':00')
set @i = @i + 1
end
-- Group per period
select count(*) [Count], t.display
from UDA_Order UDA INNER JOIN WORK_Order WO ON WO.order_key = UDA.object_key
JOIN #t t ON datepart(hour, order_start_time_T) >= t.h1
AND datepart(hour, order_start_time_T) < t.h2 -- half-open range, so each hour matches exactly one period
where order_state = 'BOOKED' OR order_state = 'CLOSED'
GROUP BY t.display
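As a quick sanity check of the bucketing logic, here is a small Python sketch that assigns the question's sample rows to hourly periods. It matches on the start hour only, because an inclusive range over both h1 and h2 would place hour 10 in both 09:00 - 10:00 and 10:00 - 11:00. The `label` helper and the `samples` list are illustrative names, with the data copied from the sample output in the question.

```python
from datetime import datetime

# (count, order_start_time_T) pairs taken from the question's sample output
samples = [
    (2, "2019-07-02 10:54:27"),
    (7, "2019-07-02 10:55:27"),
    (1, "2019-07-02 11:51:58"),
    (1, "2019-07-02 11:58:41"),
    (1, "2019-07-02 12:19:13"),
]

def label(hour):
    """Display text for the period starting at `hour`, e.g. '09:00 - 10:00'."""
    return f"{hour:02d}:00 - {hour + 1:02d}:00"

# Start with all 24 periods so that empty hours still appear with a count of 0.
counts = {label(h): 0 for h in range(24)}
for n, ts in samples:
    hour = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").hour
    counts[label(hour)] += n

for text, n in counts.items():
    if n:
        print(n, text)
```

For this data only three periods are non-empty; every other hour stays at 0, which is what an outer join from the helper table would give.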

Count days between two segments

I have the two tables below. I want to count the number of days, Monday to Friday only, between Hire_dt and the end of the calendar month that the hire date falls in.
TableA
Hire_DT Id
09/26/2018 1
TableCalendar:
Date WorkDay(M-F) EOM WorkDay
09/26/2018 Wednesday 09/30/2018 1
09/27/2018 Thursday 09/30/2018 1
09/28/2018 Friday 09/30/2018 1
09/29/2018 Saturday 09/30/2018 0
09/30/2018 Sunday 09/30/2018 0
Expected Results
Hire_dt WorkDaysEndMonth WorkDaysEndMonth --counting hire_dt
09/26/2018 2 3
Here is one way to do the calculation - WITHOUT using a calendar table. The only input data is what comes from your first table (ID and HIRE_DATE), which I included in a WITH clause (not part of the query that answers your question!). Everything else is calculated. I show how to compute the number of days INCLUDING the hire date; if you don't need that, subtract 1 at the end.
TRUNC(<date>, 'iw') is the Monday of the week of <date>. The query computes how many days are in the EOM week, between Monday and EOM, but no more than 5 (in case EOM may be a Saturday or Sunday). It does a similar calculation for HIRE_DATE, but it counts the days from Monday to HIRE_DATE excluding HIRE_DATE. The last part is adding 5 days for each full week between the Monday of HIRE_DATE and the Monday of EOM.
with
sample_data(id, hire_date) as (
select 1, to_date('09/26/2018', 'mm/dd/yyyy') from dual union all
select 2, to_date('07/10/2018', 'mm/dd/yyyy') from dual
)
select id, to_char(hire_date, 'Dy mm/dd/yyyy') as hire_date,
to_char(eom, 'Dy mm/dd/yyyy') as eom,
least(5, eom - eom_mon + 1) - least(5, hire_date - hire_mon)
+ (eom_mon - hire_mon) * 5 / 7 as workdays
from (
select id, hire_date, last_day(hire_date) as eom,
trunc(hire_date, 'iw') as hire_mon,
trunc(last_day(hire_date), 'iw') as eom_mon
from sample_data
)
;
ID HIRE_DATE EOM WORKDAYS
---------- ----------------------- ----------------------- ----------
1 Wed 09/26/2018 Sun 09/30/2018 3
2 Tue 07/10/2018 Tue 07/31/2018 16
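If you want to convince yourself that the closed-form expression is right, the following Python sketch re-implements the same arithmetic and compares it with a day-by-day count. The function names are made up for this check, and `weekday()` (Monday = 0) stands in for TRUNC(<date>, 'iw').

```python
import calendar
from datetime import date, timedelta

def last_day(d):
    """Last day of d's month (plays the role of Oracle's LAST_DAY)."""
    return date(d.year, d.month, calendar.monthrange(d.year, d.month)[1])

def workdays_formula(hire):
    """Mon-Fri days from hire to end of month, hire included, using the closed form."""
    eom = last_day(hire)
    hire_mon = hire - timedelta(days=hire.weekday())  # Monday of hire's week
    eom_mon = eom - timedelta(days=eom.weekday())     # Monday of EOM's week
    return (min(5, (eom - eom_mon).days + 1)          # workdays in the EOM week
            - min(5, (hire - hire_mon).days)          # workdays before hire in its week
            + (eom_mon - hire_mon).days * 5 // 7)     # 5 per full week in between

def workdays_brute(hire):
    """Same count, obtained by walking every day up to the end of the month."""
    eom, d, n = last_day(hire), hire, 0
    while d <= eom:
        n += d.weekday() < 5
        d += timedelta(days=1)
    return n

for hire in (date(2018, 9, 26), date(2018, 7, 10)):
    print(hire, workdays_formula(hire), workdays_brute(hire))
```

Both functions agree for the two sample hire dates (3 and 16), matching the query output above.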
You may use the following routine, in which the last_day function does most of the work:
SQL> alter session set NLS_TERRITORY="AMERICA";
SQL> create table TableA( ID int, Hire_DT date );
SQL> insert into TableA values(1,date'2018-09-26');
SQL> select sum(case when mod(to_char(myDate,'D'),7) <= 1 then 0 else 1 end )
as "WorkDaysEndMonth"
from
(
select Hire_DT + level - 1 myDate
from TableA
where ID = 1
connect by level <= last_day(Hire_DT) - Hire_DT + 1
);
WorkDaysEndMonth
----------------
3
P.S. The integer value returned by to_char(<date>,'D') depends on the NLS_TERRITORY setting. Here I used AMERICA, for which Saturday is the 7th and Sunday is the 1st day of the week, while for the UNITED KINGDOM (or my country, TURKEY) setting they are the 6th and 7th respectively.

Total Number of Records per Week

I have a Postgres 9.1 database. I am trying to generate the number of records per week (for a given date range) and compare it to the previous year.
I have the following code used to generate the series:
select generate_series('2013-01-01', '2013-01-31', '7 day'::interval) as series
However, I am not sure how to join the counted records to the dates generated.
So, using the following records as an example:
Pt_ID exam_date
====== =========
1 2012-01-02
2 2012-01-02
3 2012-01-08
4 2012-01-08
1 2013-01-02
2 2013-01-02
3 2013-01-03
4 2013-01-04
1 2013-01-08
2 2013-01-10
3 2013-01-15
4 2013-01-24
I wanted to have the records return as:
series thisyr lastyr
=========== ===== =====
2013-01-01 4 2
2013-01-08 3 2
2013-01-15 1 0
2013-01-22 1 0
2013-01-29 0 0
I am not sure how to reference the date range in the subquery. Thanks for any assistance.
The simple approach would be to solve this with a CROSS JOIN, as demonstrated by @jpw. However, there are some hidden problems:
The performance of an unconditional CROSS JOIN deteriorates quickly with growing number of rows. The total number of rows is multiplied by the number of weeks you are testing for, before this huge derived table can be processed in the aggregation. Indexes can't help.
Starting weeks with January 1st leads to inconsistencies. ISO weeks might be an alternative. See below.
All of the following queries make heavy use of an index on exam_date. Be sure to have one.
Only join to relevant rows
Should be much faster:
SELECT d.day, d.thisyr
, count(t.exam_date) AS lastyr
FROM (
SELECT d.day::date, (d.day - '1 year'::interval)::date AS day0 -- for 2nd join
, count(t.exam_date) AS thisyr
FROM generate_series('2013-01-01'::date
, '2013-01-31'::date -- last week overlaps with Feb.
, '7 days'::interval) d(day) -- returns timestamp
LEFT JOIN tbl t ON t.exam_date >= d.day::date
AND t.exam_date < d.day::date + 7
GROUP BY d.day
) d
LEFT JOIN tbl t ON t.exam_date >= d.day0 -- repeat with last year
AND t.exam_date < d.day0 + 7
GROUP BY d.day, d.thisyr
ORDER BY d.day;
This is with weeks starting from Jan. 1st like in your original. As commented, this produces a couple of inconsistencies: Weeks start on a different day each year and since we cut off at the end of the year, the last week of the year consists of just 1 or 2 days (leap year).
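To see what the half-open weekly windows (`>= day` and `< day + 7`) produce for the sample rows in the question, here is a small Python sketch of the same counting logic. It is only a check of the arithmetic, not a replacement for the query; the names `exam_dates` and `count_in_week` are made up, and `replace(year=...)` is a simplification for "minus one year" that would fail for a series day of Feb 29.

```python
from datetime import date, timedelta

# exam_date values from the question's sample data
exam_dates = [
    date(2012, 1, 2), date(2012, 1, 2), date(2012, 1, 8), date(2012, 1, 8),
    date(2013, 1, 2), date(2013, 1, 2), date(2013, 1, 3), date(2013, 1, 4),
    date(2013, 1, 8), date(2013, 1, 10), date(2013, 1, 15), date(2013, 1, 24),
]

def count_in_week(start):
    """Number of exams in the half-open window [start, start + 7 days)."""
    end = start + timedelta(days=7)
    return sum(1 for d in exam_dates if start <= d < end)

rows = []
day = date(2013, 1, 1)
while day <= date(2013, 1, 31):
    day0 = day.replace(year=day.year - 1)  # same calendar day, previous year
    rows.append((day.isoformat(), count_in_week(day), count_in_week(day0)))
    day += timedelta(days=7)

for row in rows:
    print(row)
```

For the week starting 2013-01-08 this gives 2 exams this year, since only 2013-01-08 and 2013-01-10 fall in that window.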
The same with ISO weeks
Depending on requirements, consider ISO weeks instead, which start on Mondays and always span 7 days. But they cross the border between years. Per documentation on EXTRACT():
week
The number of the week of the year that the day is in. By definition (ISO 8601), weeks start on Mondays and the first week of a year contains January 4 of that year. In other words, the first Thursday of a year is in week 1 of that year.
In the ISO definition, it is possible for early-January dates to be part of the 52nd or 53rd week of the previous year, and for late-December dates to be part of the first week of the next year. For example, 2005-01-01 is part of the 53rd week of year 2004, and 2006-01-01 is part of the 52nd week of year 2005, while 2012-12-31 is part of the first week of 2013. It's recommended to use the isoyear field together with week to get consistent results.
Above query rewritten with ISO weeks:
SELECT w AS isoweek
, day::text AS thisyr_monday, thisyr_ct
, day0::text AS lastyr_monday, count(t.exam_date) AS lastyr_ct
FROM (
SELECT w, day
, date_trunc('week', '2012-01-04'::date)::date + 7 * w AS day0
, count(t.exam_date) AS thisyr_ct
FROM (
SELECT w
, date_trunc('week', '2013-01-04'::date)::date + 7 * w AS day
FROM generate_series(0, 4) w
) d
LEFT JOIN tbl t ON t.exam_date >= d.day
AND t.exam_date < d.day + 7
GROUP BY d.w, d.day
) d
LEFT JOIN tbl t ON t.exam_date >= d.day0 -- repeat with last year
AND t.exam_date < d.day0 + 7
GROUP BY d.w, d.day, d.day0, d.thisyr_ct
ORDER BY d.w, d.day;
January 4th is always in the first ISO week of the year. So this expression gets the date of Monday of the first ISO week of the given year:
date_trunc('week', '2012-01-04'::date)::date
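The same "truncate January 4th to its Monday" trick can be checked outside the database. A minimal Python sketch, where `iso_week1_monday` is just an illustrative name and `weekday()` (Monday = 0) plays the role of date_trunc('week', ...):

```python
from datetime import date, timedelta

def iso_week1_monday(year):
    """Monday of ISO week 1: step back from January 4th to the start of its week."""
    jan4 = date(year, 1, 4)
    return jan4 - timedelta(days=jan4.weekday())

for year in (2012, 2013):
    monday = iso_week1_monday(year)
    iso = monday.isocalendar()
    print(year, monday, iso[0], iso[1])
```

For 2013 the result is 2012-12-31, which agrees with the example in the documentation quoted above: that date already belongs to the first ISO week of 2013.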
Simplify with EXTRACT()
Since ISO weeks coincide with the week numbers returned by EXTRACT(), we can simplify the query. First, a short and simple form:
SELECT w AS isoweek
, COALESCE(thisyr_ct, 0) AS thisyr_ct
, COALESCE(lastyr_ct, 0) AS lastyr_ct
FROM generate_series(1, 5) w
LEFT JOIN (
SELECT EXTRACT(week FROM exam_date)::int AS w, count(*) AS thisyr_ct
FROM tbl
WHERE EXTRACT(isoyear FROM exam_date)::int = 2013
GROUP BY 1
) t13 USING (w)
LEFT JOIN (
SELECT EXTRACT(week FROM exam_date)::int AS w, count(*) AS lastyr_ct
FROM tbl
WHERE EXTRACT(isoyear FROM exam_date)::int = 2012
GROUP BY 1
) t12 USING (w);
Optimized query
The same with more details and optimized for performance:
WITH params AS ( -- enter parameters here, once
SELECT date_trunc('week', '2012-01-04'::date)::date AS last_start
, date_trunc('week', '2013-01-04'::date)::date AS this_start
, date_trunc('week', '2014-01-04'::date)::date AS next_start
, 1 AS week_1
, 5 AS week_n -- show weeks 1 - 5
)
SELECT w.w AS isoweek
, p.this_start + 7 * (w - 1) AS thisyr_monday
, COALESCE(t13.ct, 0) AS thisyr_ct
, p.last_start + 7 * (w - 1) AS lastyr_monday
, COALESCE(t12.ct, 0) AS lastyr_ct
FROM params p
, generate_series(p.week_1, p.week_n) w(w)
LEFT JOIN (
SELECT EXTRACT(week FROM t.exam_date)::int AS w, count(*) AS ct
FROM tbl t, params p
WHERE t.exam_date >= p.this_start -- only relevant dates
AND t.exam_date < p.this_start + 7 * (p.week_n - p.week_1 + 1)::int
-- AND t.exam_date < p.next_start -- don't cross over into next year
GROUP BY 1
) t13 USING (w)
LEFT JOIN ( -- same for last year
SELECT EXTRACT(week FROM t.exam_date)::int AS w, count(*) AS ct
FROM tbl t, params p
WHERE t.exam_date >= p.last_start
AND t.exam_date < p.last_start + 7 * (p.week_n - p.week_1 + 1)::int
-- AND t.exam_date < p.this_start
GROUP BY 1
) t12 USING (w);
This should be very fast with index support and can easily be adapted to intervals of choice.
The implicit JOIN LATERAL for generate_series() in the last query requires Postgres 9.3 or later.
Using a cross join should work. I'm just going to paste the markdown output from SQL Fiddle below. It would seem that your sample output is incorrect for series 2013-01-08: thisyr should be 2, not 3. This might not be the best way to do this, though; my PostgreSQL knowledge leaves a lot to be desired.
SQL Fiddle
PostgreSQL 9.2.4 Schema Setup:
CREATE TABLE Table1
("Pt_ID" varchar(6), "exam_date" date);
INSERT INTO Table1
("Pt_ID", "exam_date")
VALUES
('1', '2012-01-02'),('2', '2012-01-02'),
('3', '2012-01-08'),('4', '2012-01-08'),
('1', '2013-01-02'),('2', '2013-01-02'),
('3', '2013-01-03'),('4', '2013-01-04'),
('1', '2013-01-08'),('2', '2013-01-10'),
('3', '2013-01-15'),('4', '2013-01-24');
Query 1:
select
series,
sum (
case
when exam_date
between series and series + '6 day'::interval
then 1
else 0
end
) as thisyr,
sum (
case
when exam_date + '1 year'::interval
between series and series + '6 day'::interval
then 1 else 0
end
) as lastyr
from table1
cross join generate_series('2013-01-01', '2013-01-31', '7 day'::interval) as series
group by series
order by series
Results:
| SERIES | THISYR | LASTYR |
|--------------------------------|--------|--------|
| January, 01 2013 00:00:00+0000 | 4 | 2 |
| January, 08 2013 00:00:00+0000 | 2 | 2 |
| January, 15 2013 00:00:00+0000 | 1 | 0 |
| January, 22 2013 00:00:00+0000 | 1 | 0 |
| January, 29 2013 00:00:00+0000 | 0 | 0 |

SQL Count by Active Date

If I have a table of records with active/inactive dates, is there a simple way to count active records by month? For example:
tbl_a
id dt_active dt_inactive
a 2013-01-01 2013-08-24
b 2013-01-01 2013-07-05
c 2012-02-01 2012-01-01
If I have to generate an output of active records by month like this:
active: dt_active < first_day_of_month <= dt_inactive
month count
2013-01 2
2013-02 2
2013-03 2
2013-04 2
2013-05 2
2013-06 2
2013-07 2
2013-08 1
2013-09 0
Is there any clever way to do this besides uploading a temp table of dates and using subqueries?
Here is one method that gives the count of actives on the beginning of the month. It creates a list of all the months and then joins this information to tbl_a.
with dates as (
select cast('2013-01-01' as date) as month
union all
select dateadd(month, 1, dates.month)
from dates
where month < cast('2013-09-01' as date)
)
select convert(varchar(7), month, 121), count(a.id)
from dates m left outer join
tbl_a a
on m.month between a.dt_active and a.dt_inactive
group by convert(varchar(7), month, 121)
order by 1;
Note: if dt_inactive is the first date of inactivity, then the on clause should be:
on m.month >= a.dt_active and m.month < a.dt_inactive
Here is a SQL Fiddle with the working query.
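To double-check the month-by-month logic against the sample data, here is a short Python sketch of the same idea: build the list of month start dates, then count the records whose active period covers each one. It uses the half-open variant of the condition (dt_active <= month < dt_inactive); the names `records` and `month_starts` are made up for this illustration.

```python
from datetime import date

# (id, dt_active, dt_inactive) rows from the question's tbl_a
records = [
    ("a", date(2013, 1, 1), date(2013, 8, 24)),
    ("b", date(2013, 1, 1), date(2013, 7, 5)),
    ("c", date(2012, 2, 1), date(2012, 1, 1)),
]

def month_starts(first, last):
    """First day of every month from first's month through last's month."""
    starts = []
    y, m = first.year, first.month
    while (y, m) <= (last.year, last.month):
        starts.append(date(y, m, 1))
        y, m = (y + 1, 1) if m == 12 else (y, m + 1)
    return starts

counts = {
    ms.strftime("%Y-%m"): sum(1 for _, active, inactive in records
                              if active <= ms < inactive)
    for ms in month_starts(date(2013, 1, 1), date(2013, 9, 1))
}

for month, n in counts.items():
    print(month, n)
```

This reproduces the expected output in the question: 2 active records from 2013-01 through 2013-07, 1 in 2013-08 and 0 in 2013-09.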

Efficient join with a "correlated" subquery

Given three tables Dates(date aDate, doUse boolean), Days(rangeId int, day int, qty int) and Range(rangeId int, startDate date) in Oracle
I want to join these so that each Range is joined with Dates starting from aDate = startDate, and each day in Days is matched to the dates where doUse = 1.
Given a single range it might be done something like this
SELECT rangeId, aDate, CASE WHEN doUse = 1 THEN qty ELSE 0 END AS qty
FROM (
SELECT aDate, doUse, SUM(doUse) OVER (ORDER BY aDate) day
FROM Dates
WHERE aDate >= :startDate
) INNER JOIN (
SELECT rangeId, day,qty
FROM Days
WHERE rangeId = :rangeId
) USING (day)
ORDER BY day ASC
What I want to do is write a query for all ranges in Range, not just one.
The problem is that the join value "day" depends on the range's startDate, which gives me some trouble in formulating the query.
Keep in mind that the Dates table is pretty huge, so I would like to avoid calculating the day value from the first date in the table, while the Days of each Range shouldn't span more than 100 days or so.
Edit: Sample data
Dates Days
aDate doUse rangeId day qty
2008-01-01 1 1 1 1
2008-01-02 1 1 2 10
2008-01-03 0 1 3 8
2008-01-04 1 2 1 2
2008-01-05 1 2 2 5
Ranges
rangeId startDate
1 2008-01-02
2 2008-01-03
Result
rangeId aDate qty
1 2008-01-02 1
1 2008-01-03 0
1 2008-01-04 10
1 2008-01-05 8
2 2008-01-03 0
2 2008-01-04 2
2 2008-01-05 5
Try this:
SELECT rt.rangeId, aDate, CASE WHEN doUse = 1 THEN qty ELSE 0 END AS qty
FROM (
SELECT *
FROM (
SELECT r.*, t.*, SUM(doUse) OVER (PARTITION BY rangeId ORDER BY aDate) AS span
FROM (
SELECT r.rangeId, startDate, MAX(day) AS dm
FROM Range r, Days d
WHERE d.rangeid = r.rangeid
GROUP BY
r.rangeId, startDate
) r, Dates t
WHERE t.adate >= startDate
ORDER BY
rangeId, t.adate
)
WHERE
span <= dm
) rt, Days d
WHERE d.rangeId = rt.rangeID
AND d.day = GREATEST(rt.span, 1)
P. S. It seems to me that the only point to keep all these Dates in the database is to get a continuous calendar with holidays marked.
You may generate a calendar of arbitrary length in Oracle using the following construct:
SELECT :startDate + LEVEL
FROM dual
CONNECT BY
LEVEL < :length
and keep only holidays in Dates. A simple join will show you which Dates are holidays and which are not.
Ok, so maybe I've found a way. Something like this:
SELECT irangeId, aDate + sum(case when doUse = 1 then 0 else 1 end) over (partition by rangeId order by aDate) as aDate, qty
FROM Dates INNER JOIN (
select irangeId, startDate + day - 1 as aDate, qty
from Range inner join Days using (irangeid)
) USING (aDate)
Now I just need a way to fill in the missing dates...
Edit: Nah, this way means that I'll miss the doUse value of the last dates...