How to simplify multiple CTEs - sql

I have several similar CTEs, nine in fact. They differ only in the value compared against the column for in the subquery's WHERE clause.
WITH my_cte_1 AS (
SELECT id,
"time",
LEAD("time",1) OVER (
PARTITION BY id
ORDER BY id,"time"
) next_time
FROM history
where id IN (SELECT id FROM req WHERE type = 'sup' AND for = 1)
),
my_cte_2 AS (
SELECT id,
"time",
LEAD("time",1) OVER (
PARTITION BY id
ORDER BY id,"time"
) next_time
FROM history
where id IN (SELECT id FROM req WHERE type = 'sup' AND for = 2)
),
my_cte_3 AS (
SELECT id,
"time",
LEAD("time",1) OVER (
PARTITION BY id
ORDER BY id,"time"
) next_time
FROM history
where id IN (SELECT id FROM req WHERE type = 'sup' AND for = 3)
)
SELECT
'History' AS "Indic",
(SELECT count(DISTINCT(id)) FROM my_cte_1 ) AS "cte1",
(SELECT count(DISTINCT(id)) FROM my_cte_2 ) AS "cte2",
(SELECT count(DISTINCT(id)) FROM my_cte_3 ) AS "cte3"
My database is read-only, so I can't create functions.
Each CTE processes a large amount of data.
Is there a way to set up a parameter for the column for, or is there a workaround?

I'm assuming a little bit here, but I would think something like this would work:
with cte as (
SELECT
h.id, h."time",
LEAD(h."time",1) OVER (PARTITION BY h.id ORDER BY h.id, h."time") next_time,
r.for
FROM
history h
join req r on
r.type = 'sup' and
h.id = r.id and
r.for between 1 and 3
)
select
'History' AS "Indic",
count (distinct id) filter (where "for" = 1) as cte1,
count (distinct id) filter (where "for" = 2) as cte2,
count (distinct id) filter (where "for" = 3) as cte3
from cte
This would avoid multiple passes on the various tables and should run much quicker unless these are highly selective values.
Another note... the "lead" analytic function doesn't appear to be used. If this is really all there is to your query, you can omit that and make it run a lot faster. I left it in assuming it had some other purpose.
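To make the single-pass idea concrete, here is a minimal sketch that runs against an in-memory SQLite database from Python. The sample rows are made up for illustration, and COUNT(DISTINCT CASE ...) is swapped in for the FILTER clause so that it runs on older SQLite versions too; it is an illustration of the technique, not the exact query above.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE history (id INTEGER, "time" INTEGER);
    CREATE TABLE req (id INTEGER, type TEXT, "for" INTEGER);
    INSERT INTO history VALUES (1, 10), (1, 20), (2, 10), (3, 10), (3, 30), (4, 10);
    INSERT INTO req VALUES (1, 'sup', 1), (2, 'sup', 2), (3, 'sup', 2), (4, 'other', 3);
""")

# One join and one pass over history; each CASE plays the role of one former CTE.
row = conn.execute("""
    SELECT
        COUNT(DISTINCT CASE WHEN r."for" = 1 THEN h.id END) AS cte1,
        COUNT(DISTINCT CASE WHEN r."for" = 2 THEN h.id END) AS cte2,
        COUNT(DISTINCT CASE WHEN r."for" = 3 THEN h.id END) AS cte3
    FROM history h
    JOIN req r
      ON r.id = h.id
     AND r.type = 'sup'
     AND r."for" BETWEEN 1 AND 3
""").fetchone()

counts = tuple(row)
print(counts)
```

With this sample data, id 4 is excluded because its req row has type 'other', so the third count stays at zero.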

Related

SQL - delete record where sum = 0

I have a table with the values below:
If the values for the same ID sum to 0, I want to delete those rows from the table. So the result should look like this:
The code I have:
DELETE FROM tmp_table
WHERE ID in
(SELECT ID
FROM tmp_table WITH(NOLOCK)
GROUP BY ID
HAVING SUM(value) = 0)
It only deletes the rows with ID = 2.
UPD: Including an additional example:
The rows in yellow need to be deleted.
Your query is working correctly: the only group that totals zero is id 2. The other groups contain sub-groups that total zero (such as the first two rows with id 1), but the total over all their records is not zero (it is -3).
What you want is a much more complex algorithm that does "bin packing" in order to remove the sub-groups which sum to zero.
If the rows to remove are pairs of opposite values within the same id, you can do this with window functions, by enumerating the values for each id. Taking your approach using a subquery:
with t as (
select t.*,
row_number() over (partition by id, value order by id) as seqnum
from tmp_table t
)
delete from t
where exists (select 1
from t t2
where t2.id = t.id and t2.value = - t.value and t2.seqnum = t.seqnum
);
You can also do this with a second layer of window functions:
with t as (
select t.*,
row_number() over (partition by id, value order by id) as seqnum
from tmp_table t
),
tt as (
select t.*, count(*) over (partition by id, abs(value), seqnum) as cnt
from t
)
delete from tt
where cnt = 2;
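The pairing logic can be tried out with a small sketch on SQLite from Python. The rows below are invented (the original table was shown only as an image), and because SQLite cannot delete through a CTE the way SQL Server does, the sketch deletes by rowid instead; the seqnum matching is the same idea as in the first query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tmp_table (id INTEGER, value INTEGER)")
# Hypothetical data: id 1 has two 5s but only one -5, so one 5 must survive.
conn.executemany(
    "INSERT INTO tmp_table VALUES (?, ?)",
    [(1, 5), (1, 5), (1, -5), (1, -3), (2, 4), (2, -4), (3, 2)],
)

# Number equal values within each id, then delete every row that has a
# counterpart with the opposite value and the same sequence number.
conn.execute("""
    WITH t AS (
        SELECT rowid AS rid, id, value,
               ROW_NUMBER() OVER (PARTITION BY id, value ORDER BY rowid) AS seqnum
        FROM tmp_table
    )
    DELETE FROM tmp_table
    WHERE rowid IN (
        SELECT t.rid
        FROM t
        WHERE EXISTS (
            SELECT 1
            FROM t AS t2
            WHERE t2.id = t.id
              AND t2.value = -t.value
              AND t2.seqnum = t.seqnum
        )
    )
""")

remaining = conn.execute(
    "SELECT id, value FROM tmp_table ORDER BY id, value"
).fetchall()
print(remaining)
```

Note that this only cancels pairs of opposite values. A sub-group such as 2, 3, -5 also sums to zero but would not be removed by this approach.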

Performance tuning of Oracle SQL query

Can someone help me tune this query? It takes 1 minute to return the data in SQL Developer.
SELECT
masterid, notification_id, notification_list, typeid,
subject, created_at, created_by, approver, sequence_no,
productid, statusid, updated_by, updated_at, product_list,
notification_status, template, notification_type, classification
FROM
(
SELECT
masterid, notification_id, notification_list, typeid, subject,
approver, created_at, created_by, sequence_no, productid,
statusid, updated_by, updated_at, product_list, notification_status,
template, notification_type, classification,
ROW_NUMBER() OVER(ORDER BY masterid DESC)AS r
FROM
(
SELECT DISTINCT
a.masterid AS masterid,
a.maxid AS notification_id,
notification_list,
typeid,
noti.subject AS subject,
noti.approver AS approver,
noti.created_at AS created_at,
noti.created_by AS created_by,
noti.sequence_no AS sequence_no,
a.productid AS productid,
a.statusid AS statusid,
noti.updated_by AS updated_by,
noti.updated_at AS updated_at,
(
SELECT LISTAGG(p.name,',') WITHIN GROUP(ORDER BY p.id) AS list_noti
FROM product p
INNER JOIN notification_product np ON np.product_id = p.id
WHERE notification_id = a.maxid
) AS product_list,
(
SELECT description
FROM notification_status
WHERE id = a.statusid
) AS notification_status,
(
SELECT name
FROM template
WHERE id = a.templateid
) AS template,
(
SELECT description
FROM notification_type
WHERE id = a.typeid
) AS notification_type,
(
SELECT tc.description
FROM template_classification tc
INNER JOIN notification nt ON tc.id = nt.classification_id
WHERE nt.id = a.maxid
) AS classification
FROM
(
SELECT
nm.id AS masterid,
nm.product_id AS productid,
nm.notification_status_id AS statusid,
nm.template_id AS templateid,
nm.notification_type_id AS typeid,
(
SELECT MAX(id)
FROM notification
WHERE notification_master_id = nm.id
) AS maxid,
(
SELECT LISTAGG(n.id,',') WITHIN GROUP(ORDER BY nf.id) AS list_noti
FROM notification n
WHERE notification_master_id = nm.id
) AS notification_list
FROM notification_master nm
INNER JOIN notification nf ON nm.id = nf.notification_master_id
WHERE nm.disable = 'N'
ORDER BY nm.id DESC
) a
INNER JOIN notification noti
ON a.maxid = noti.id
AND
(
(
(
TO_DATE('01-jan-1970','dd-MM-YYYY') +
numtodsinterval(created_at / 1000,'SECOND')
) <
(current_date + INTERVAL '-21' DAY)
)
OR (typeid exists(2,4) AND statusid = 4)
)
)
)
WHERE r BETWEEN 11 AND 20
DISTINCT is very often an indicator for a badly written query. A normalized database doesn't contain duplicate data, so where do the duplicates suddenly come from that you must remove with DISTINCT? Very often it is your own query producing these. Avoid producing duplicates in the first place, so you don't need DISTINCT later.
In your case you are joining with the table notification in your subquery a, but you are not using its rows in that subquery; you only select columns from notification_master. That join is what multiplies the rows.
After all, you want to get the notification masters and, for each, its latest related notification (by getting its ID first and then selecting the row). You don't need that many subqueries to achieve this.
Some side notes:
To get the description from template_classification you are joining again with the notification table, which is not necessary.
ORDER BY in a subquery (ORDER BY nm.id DESC) is superfluous, because subquery results are per standard SQL unsorted. (Oracle violates this standard sometimes in order to apply ROWNUM on the result, but you are not using ROWNUM in your query.)
It's a pity that you store created_at not as a DATE or TIMESTAMP, but as a number. This forces you to calculate. I don't think this has a great impact on your query, though, because you are using it in an OR condition.
CURRENT_DATE returns the date in the session (client) time zone. This is rarely wanted, as you select data from the database, which should relate not to some client's date but to the database server's own date, SYSDATE.
If I am not mistaken, your query can be shortened to:
SELECT
nm.id AS masterid,
nf.id AS notification_id,
nfagg.notification_list AS notification_list,
nm.notification_type_id AS typeid,
nf.subject AS subject,
nf.approver AS approver,
nf.created_at AS created_at,
nf.created_by AS created_by,
nf.sequence_no AS sequence_no,
nm.product_id AS productid,
nm.notification_status_id AS statusid,
nf.updated_by AS updated_by,
nf.updated_at AS updated_at,
(
SELECT LISTAGG(p.name, ',') WITHIN GROUP (ORDER BY p.id)
FROM product p
INNER JOIN notification_product np ON np.product_id = p.id
WHERE np.notification_id = nf.id
) AS product_list,
(
SELECT description
FROM notification_status
WHERE id = nm.notification_status_id
) AS notification_status,
(
SELECT name
FROM template
WHERE id = nm.template_id
) AS template,
(
SELECT description
FROM notification_type
WHERE id = nm.notification_type_id
) AS notification_type,
(
SELECT description
FROM template_classification
WHERE id = nf.classification_id
) AS classification
FROM notification_master nm
INNER JOIN
(
SELECT
notification_master_id,
MAX(id) AS maxid,
LISTAGG(id,',') WITHIN GROUP (ORDER BY id) AS notification_list
FROM notification
GROUP BY notification_master_id
) nfagg ON nfagg.notification_master_id = nm.id
INNER JOIN notification nf
ON nf.id = nfagg.maxid
AND
(
(
DATE '1970-01-01' + NUMTODSINTERVAL(nf.created_at / 1000, 'SECOND')
< CURRENT_DATE + INTERVAL '-21' DAY
)
OR (nm.notification_type_id IN (2,4) AND nm.notification_status_id = 4)
)
WHERE nm.disable = 'N'
ORDER BY nm.id DESC
OFFSET 10 ROWS
FETCH NEXT 10 ROWS ONLY;
As mentioned, you may want to replace CURRENT_DATE with SYSDATE.
I recommend the following indexes for the query:
CREATE INDEX idx1 ON notification_master (disable, id, notification_status_id, notification_type_id);
CREATE INDEX idx2 ON notification (notification_master_id, id, created_at);
A last remark on paging: In order to skip n rows to get the next n, the whole query must be executed for all data and all result rows sorted, only to pick n of them in the end. It is usually better to remember the last fetched ID and then only select rows beyond it in the next execution (rows with a lower ID here, since the result is sorted by ID descending).
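The paging remark can be sketched roughly as follows, using SQLite from Python rather than Oracle. The helper fetch_page and the 25 sample rows are hypothetical; because the query sorts by ID descending, the next page holds the IDs below the last one seen.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notification_master (id INTEGER PRIMARY KEY, disable TEXT)")
# Hypothetical data: 25 enabled masters with ids 1..25.
conn.executemany(
    "INSERT INTO notification_master VALUES (?, 'N')",
    [(i,) for i in range(1, 26)],
)

def fetch_page(conn, last_seen_id=None, page_size=10):
    """Return the next page of ids in descending order.

    Instead of OFFSET, the caller passes the last id of the previous page,
    so the database can seek directly to that position in the index.
    """
    sql = "SELECT id FROM notification_master WHERE disable = 'N'"
    params = []
    if last_seen_id is not None:
        sql += " AND id < ?"
        params.append(last_seen_id)
    sql += " ORDER BY id DESC LIMIT ?"
    params.append(page_size)
    return [r[0] for r in conn.execute(sql, params)]

page1 = fetch_page(conn)
page2 = fetch_page(conn, last_seen_id=page1[-1])
print(page1)
print(page2)
```

The trade-off is that this style of paging only supports "next page" navigation; jumping straight to page 7 still needs an offset.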

Could this query be optimized?

My goal is to select a record by two criteria that depend on each other, and to group the result by another criterion.
I found a solution that selects a record by a single criterion and groups it:
SELECT *
FROM "records"
NATURAL JOIN (
SELECT "group", min("priority1") AS "priority1"
FROM "records"
GROUP BY "group") AS "grouped"
I think I understand the concept behind this search (select the properties you care about and match them against the original table), but when I use it with two priorities I get this monster:
SELECT *
FROM "records"
NATURAL JOIN (
SELECT *
FROM (
SELECT "group", "priority1", min("priority2") AS "priority2"
FROM "records"
GROUP BY "group", "priority1") AS "grouped2"
NATURAL JOIN (
SELECT "group", min("priority1") AS "priority1"
FROM "records"
NATURAL JOIN (
SELECT "group", "priority1", min("priority2") AS "priority2"
FROM "records"
GROUP BY "group", "priority1") AS "grouped2'"
GROUP BY "group") AS "GroupNested") AS "grouped1"
All I am asking is: couldn't it be written better (optimized and better looking)?
JSFIDDLE
---- Update ----
The goal is to select a single id for each group, chosen by priority1 and priority2 (priority1 should be considered first, and then priority2).
Example:
When I have table records with id, group, priority1 and priority2
with data:
id , group , priority1 , priority2
56 , 1 , 1 , 2
34 , 1 , 1 , 3
78 , 1 , 3 , 1
the result should be 56,1,1,2. For each group, search first for the min of priority1, then search for the min of priority2.
I tried combining max and min together (in one query), but it did not find anything (I do not have this query anymore).
EXISTS() to the rescue! (I did some renaming to avoid reserved words)
SELECT *
FROM zrecords r
WHERE NOT EXISTS (
SELECT *
FROM zrecords nx
WHERE nx.zgroup = r.zgroup
AND ( nx.priority1 < r.priority1
OR nx.priority1 = r.priority1 AND nx.priority2 < r.priority2
)
);
Or, to avoid the AND / OR logic, compare the two-tuples directly:
SELECT *
FROM zrecords r
WHERE NOT EXISTS (
SELECT *
FROM zrecords nx
WHERE nx.zgroup = r.zgroup
AND (nx.priority1, nx.priority2) < (r.priority1 , r.priority2)
);
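Maybe this is what you expect (note that adding two independent rankings does not always pick the row with the lowest priority1 and then the lowest priority2, so check it against your data):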
with dat as (
SELECT "group" grp
, priority1, priority2, id
, row_number() over (partition by "group" order by priority1) +
row_number() over (partition by "group" order by priority2) as lp
FROM "records")
select dt.grp, priority1, priority2, dt.id
from dat dt
join (select min(lp) lpmin, grp from dat group by grp) dt1 on (dt1.lpmin = dt.lp and dt1.grp =dt.grp)
Simply use row_number() . . . once:
select r.*
from (select r.*,
row_number() over (partition by "group" order by priority1, priority2) as seqnum
from records r
) r
where seqnum = 1;
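As a quick check, the row_number() query and the NOT EXISTS query with tuple comparison can both be run on the question's sample data. This is a sketch on SQLite from Python; the second group (ids 90 and 91) is an invented addition, and the original table and column names are kept with "group" quoted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE records (id INTEGER, "group" INTEGER, priority1 INTEGER, priority2 INTEGER);
    INSERT INTO records VALUES
        (56, 1, 1, 2), (34, 1, 1, 3), (78, 1, 3, 1),  -- rows from the question
        (90, 2, 2, 5), (91, 2, 2, 4);                 -- hypothetical second group
""")

# Approach 1: number the rows per group in (priority1, priority2) order, keep the first.
by_row_number = conn.execute("""
    SELECT id, "group", priority1, priority2
    FROM (
        SELECT r.*,
               ROW_NUMBER() OVER (PARTITION BY "group"
                                  ORDER BY priority1, priority2) AS seqnum
        FROM records r
    ) AS ranked
    WHERE seqnum = 1
    ORDER BY "group"
""").fetchall()

# Approach 2: keep rows for which no row in the same group has a smaller tuple.
by_not_exists = conn.execute("""
    SELECT id, "group", priority1, priority2
    FROM records r
    WHERE NOT EXISTS (
        SELECT 1
        FROM records nx
        WHERE nx."group" = r."group"
          AND (nx.priority1, nx.priority2) < (r.priority1, r.priority2)
    )
    ORDER BY "group"
""").fetchall()

print(by_row_number)
print(by_not_exists)
```

The two approaches differ when several rows tie on both priorities: NOT EXISTS returns all tied rows, while row_number() keeps exactly one of them, chosen arbitrarily.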
Note: I would advise you to avoid natural join. You can use using instead (if you don't want to explicitly include equality comparisons).
Queries with natural join are very hard to debug, because the join keys are not listed. Worse, "natural" joins do not use properly declared foreign key relationships. They depend simply on columns that have the same name.
In tables that I design, they would never be useful anyway, because almost all tables have createdAt and createdBy columns.

Oracle - optimising SQL query

I have two tables - countries (id, name) and users (id, name, country_id). Each user belongs to one country. I want to select 10 random users from the same random country. However, there are countries that have fewer than 10 users, so I can't use them. I need to select only from those countries that have at least 10 users.
I can write something like this:
SELECT * FROM(
SELECT *
FROM users u
{MANY_OTHER_JOINS_AND_CONDITIONS}
WHERE u.country_id =
(
SELECT *
FROM
(
SELECT c.id
FROM countries c
JOIN
(
SELECT users.country_id, COUNT(*) as cnt
FROM users
{MANY_OTHER_JOINS_AND_CONDITIONS}
GROUP BY users.country_id
) X ON X.country_id = c.id
WHERE X.cnt >= 10
ORDER BY DBMS_RANDOM.RANDOM
) Y
WHERE ROWNUM = 1
)
ORDER BY DBMS_RANDOM.RANDOM
) Z WHERE ROWNUM <= 10
However, in my real scenario, I have more conditions and joins to other tables for determining which users are applicable. With this query, I must have these conditions in two places: in the query that actually selects the data and in the count subquery.
Is there any way to write a query like this without having those other conditions in two places (which is probably not good performance-wise)?
You can use a CTE for the user criteria to avoid repeating the logic and to allow the DB to cache that set once (though in my experience the DB isn't as good at that as it should be, so check your execution plan).
I'm more of a SQL Server guy than Oracle, and the syntax is subtly different, so this may need some tweaks yet, but try this:
WITH SafeUsers (ID, Name, country_id) As
(
--criteria for users only have to be specified here
SELECT ID, Name, country_id
FROM users
WHERE ...
),
RandomCountry (ID) As
(
SELECT ID
FROM (
SELECT u.country_id AS ID
FROM SafeUsers u -- but we reference it HERE
GROUP BY u.country_id
HAVING COUNT(u.Id) >= 10
ORDER BY DBMS_RANDOM.RANDOM
) c
WHERE ROWNUM = 1
)
SELECT u.*
FROM (
SELECT s.*
FROM SafeUsers s -- and HERE
INNER JOIN RandomCountry r ON s.country_id = r.ID
ORDER BY DBMS_RANDOM.RANDOM
) u
WHERE ROWNUM <= 10
And by removing nesting and introducing names for each intermediate step, this query is suddenly much easier to read and maintain.
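The shape of that query can be tried out with a minimal sketch on SQLite from Python. This is SQLite syntax (random() and LIMIT instead of DBMS_RANDOM and ROWNUM), and the active column is a made-up stand-in for {MANY_OTHER_JOINS_AND_CONDITIONS}; the point is only that the user criteria appear once, in the first CTE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, country_id INTEGER, active INTEGER)"
)
# Hypothetical data: country 1 has 15 eligible users, country 2 has 12 users
# of whom only 5 are eligible, and country 3 has 3 users.
users = [(f"a{i}", 1, 1) for i in range(15)]
users += [(f"b{i}", 2, 1 if i < 5 else 0) for i in range(12)]
users += [(f"c{i}", 3, 1) for i in range(3)]
conn.executemany(
    "INSERT INTO users (name, country_id, active) VALUES (?, ?, ?)", users
)

picked = conn.execute("""
    WITH safe_users AS (
        -- the eligibility criteria live here, and only here
        SELECT id, name, country_id
        FROM users
        WHERE active = 1
    ),
    random_country AS (
        SELECT country_id
        FROM safe_users
        GROUP BY country_id
        HAVING COUNT(*) >= 10
        ORDER BY random()
        LIMIT 1
    )
    SELECT s.id, s.name, s.country_id
    FROM safe_users s
    JOIN random_country r ON r.country_id = s.country_id
    ORDER BY random()
    LIMIT 10
""").fetchall()

print(len(picked), {row[2] for row in picked})
```

In this data set only country 1 has at least 10 eligible users, so every run picks 10 random users from that country.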
You could create a view for the shared joins and conditions:
create view user_with_many_cond as
SELECT *
FROM users u
{MANY_OTHER_JOINS_AND_CONDITIONS}
Then, looking at your query:
You could use HAVING instead of a WHERE outside the subquery.
The ORDER BY could be placed inside the inner query, so that the outer filter keeps just one row:
SELECT * FROM(
SELECT *
FROM user_with_many_cond u
WHERE u.country_id =
(
SELECT c.id
FROM countries c
JOIN
(
SELECT country_id, COUNT(*) as cnt
FROM user_with_many_cond
GROUP BY country_id
HAVING COUNT(*) >= 10
ORDER BY DBMS_RANDOM.RANDOM
) X ON X.country_id = c.id
WHERE ROWNUM = 1
)
ORDER BY DBMS_RANDOM.RANDOM
) Z WHERE ROWNUM <= 10
To get countries with at least 10 users:
SELECT users.country_id
, row_number() over (order by dbms_random.value()) as rn
FROM users
GROUP BY users.country_id having count(*) >= 10
Use this as a sub-query to choose a country and grab some users:
with ctry as (
SELECT users.country_id
, row_number() over (order by dbms_random.value()) as ctry_rn
FROM users
GROUP BY users.country_id having count(*) >= 10
)
, usr as (
select user_id
, row_number() over (order by dbms_random.value()) as usr_rn
from ctry
join users
on users.country_id = ctry.country_id
where ctry.ctry_rn = 1
)
select users.*
from usr
join users
on users.user_id = usr.user_id
where usr.usr_rn <= 10
/
This example ignores your {MANY_OTHER_JOINS_AND_CONDITIONS}: please inject them back where you need them.

How to reference a window function inside the current table?

I have this part in a larger query, which consumes a lot of RAM:
TopPerPost as
(
select Id,
CloseReasonTypeId,
Name,
ReasonsPerPost.TotalByCloseReason,
row_number() over(partition by Id order by TotalByCloseReason desc) seq -- Get the most common Id (The most common close Reason)
from ReasonsPerPost
where Name is NOT NULL and TopPerPost.seq=1 -- Remove useless results here, instead of doing it later
)
but I got the error: The multi-part identifier "TopPerPost.seq" could not be bound.
Last detail... I only use the Name column in a later INNER JOIN of that table.
You can't reference a window function in the where of the same query. Just create a second cte.
with TopPerPost as
(
select Id,
CloseReasonTypeId,
Name,
ReasonsPerPost.TotalByCloseReason,
row_number() over(partition by Id order by TotalByCloseReason desc) seq -- Get the most common Id
from ReasonsPerPost
where Name is NOT NULL
)
, OnlyTheTop as
(
select *
from TopPerPost
where seq = 1
)
Or you can do it like this.
select * from
(
select Id,
CloseReasonTypeId,
Name,
ReasonsPerPost.TotalByCloseReason,
row_number() over(partition by Id order by TotalByCloseReason desc) seq -- Get the most common Id
from ReasonsPerPost
where Name is NOT NULL
) s
where seq = 1
Here is another option that should eliminate the need for so many rows being returned.
select Id,
CloseReasonTypeId,
Name,
s.TotalByCloseReason
from ReasonsPerPost rpp
cross apply
(
select top 1 TotalByCloseReason
from ReasonsPerPost rpp2
where rpp2.Id = rpp.Id
order by TotalByCloseReason desc
) s
where Name is NOT NULL
Attempt #4...this would be a LOT easier with a sql fiddle to work with.
select Id,
CloseReasonTypeId,
Name,
s.TotalByCloseReason
from ReasonsPerPost rpp
inner join
(
select top 1 TotalByCloseReason
from ReasonsPerPost rpp2
where rpp2.Id = rpp.Id
and Name is NOT NULL
order by TotalByCloseReason desc
) s on s.Id = rpp.Id
where Name is NOT NULL
The query below might work for your needs.
But without looking at the data, it is hard to tell whether it will.
;with t as
(
Select Id, max(totalbyclosereason) TC from reasonsperpost where name is not null group by id
)
Select T.id,t.tc,c.closereasontypeid,c.name
From t join reasonsperpost c on t.id = c.id and t.tc = c.totalbyclosereason