How to enhance the performance of nested query? - sql

I have the following query, but it runs slowly in my SQL editor.
How can I rewrite it so that it runs faster?
SELECT year,main_code,name,father_code,main_code || '__' || year AS main_id,
(SELECT COUNT(*) FROM GK_main
WHERE father_code=sc.main_code AND
year= (SELECT MAX(year)FROM SS_job)) childcount
FROM GK_main sc WHERE year=(SELECT MAX(year)FROM SS_job)

The efficiency depends more on available indexes, rather than the way it is written. You could try this version (without inline subqueries):
SELECT
sc.year,
sc.main_code,
sc.name,
sc.father_code,
sc.main_code || '__' || sc.year AS main_id,
NVL(g.childcount, 0) AS childcount
FROM
GK_main sc
LEFT JOIN
( SELECT father_code ,
COUNT(*) AS childcount
FROM GK_main
WHERE year = (SELECT MAX(year) FROM SS_job)
GROUP BY father_code
) g
ON g.father_code = sc.main_code
WHERE
sc.year = (SELECT MAX(year) FROM SS_job) ;
What would really improve the efficiency, though, is having suitable indexes:
Is there an index on SS_job (year)?
Is there an index on GK_main (year, father_code) or on GK_main (father_code, year)?
Is there an index on GK_main (year, main_code) or on GK_main (main_code)?
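
As a rough way to try this out, the sketch below builds a tiny copy of the schema in an in-memory SQLite database (through Python's sqlite3 module), creates indexes of the kind asked about above, and checks that the derived-table version returns the same counts as the original correlated subquery. The sample rows and index names are made up for illustration, and COALESCE stands in for NVL because SQLite has no NVL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SS_job (year INTEGER);
    CREATE TABLE GK_main (year INTEGER, main_code TEXT, name TEXT, father_code TEXT);
    -- Hypothetical index names; the column lists follow the questions above.
    CREATE INDEX ix_job_year ON SS_job (year);
    CREATE INDEX ix_main_year_father ON GK_main (year, father_code);
    CREATE INDEX ix_main_year_code ON GK_main (year, main_code);
    INSERT INTO SS_job VALUES (2011), (2012);
    INSERT INTO GK_main VALUES
        (2012, 'A', 'root', NULL),
        (2012, 'B', 'child 1', 'A'),
        (2012, 'C', 'child 2', 'A'),
        (2011, 'D', 'old child', 'A');
""")

# Original shape: one correlated subquery per outer row.
correlated = """
    SELECT sc.main_code,
           (SELECT COUNT(*) FROM GK_main
             WHERE father_code = sc.main_code
               AND year = (SELECT MAX(year) FROM SS_job)) AS childcount
      FROM GK_main sc
     WHERE sc.year = (SELECT MAX(year) FROM SS_job)
     ORDER BY sc.main_code
"""

# Rewritten shape: aggregate once, then LEFT JOIN the counts back.
joined = """
    SELECT sc.main_code, COALESCE(g.childcount, 0) AS childcount
      FROM GK_main sc
      LEFT JOIN (SELECT father_code, COUNT(*) AS childcount
                   FROM GK_main
                  WHERE year = (SELECT MAX(year) FROM SS_job)
                  GROUP BY father_code) g
        ON g.father_code = sc.main_code
     WHERE sc.year = (SELECT MAX(year) FROM SS_job)
     ORDER BY sc.main_code
"""

rows_correlated = conn.execute(correlated).fetchall()
rows_joined = conn.execute(joined).fetchall()
print(rows_correlated)
print(rows_joined)
```

Whether the indexes are actually used depends on the database and its optimizer, so check the execution plan on the real system before drawing conclusions from a toy data set.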

Use a join instead of a subquery.
SELECT sc.year,sc.main_code,sc.name,sc.father_code,sc.main_code || '__' || sc.year AS main_id,
COUNT(F.father_code) AS childcount
FROM GK_main sc LEFT JOIN GK_main F ON F.father_code = sc.main_code AND F.year = sc.year
WHERE sc.year=(SELECT MAX(year) FROM SS_job)
GROUP BY sc.year,sc.main_code,sc.name,sc.father_code
Not tested and written quickly, so it might contain a mistake. But this should at least save you from evaluating SELECT MAX(year) FROM SS_job twice.
I would never use COUNT(*); I always choose the column(s) I wish to count.
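
In the LEFT JOIN query above, the choice of what to count affects the result and is more than a matter of style. A row with no children still produces one joined row (with NULLs on the F side), so COUNT(*) reports 1 where COUNT(F.father_code) reports 0. A minimal sketch with made-up rows, using SQLite through Python's sqlite3 module:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE GK_main (main_code TEXT, father_code TEXT);
    -- Illustrative data: A has two children, B and C have none.
    INSERT INTO GK_main VALUES ('A', NULL), ('B', 'A'), ('C', 'A');
""")

rows = conn.execute("""
    SELECT sc.main_code,
           COUNT(*)             AS count_star,  -- counts joined rows, NULL-padded ones included
           COUNT(F.father_code) AS count_col    -- counts only rows where a child matched
      FROM GK_main sc
      LEFT JOIN GK_main F ON F.father_code = sc.main_code
     GROUP BY sc.main_code
     ORDER BY sc.main_code
""").fetchall()

for row in rows:
    print(row)
```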

Try this query
SELECT
sc.year,
sc.main_code,
sc.name,
sc.father_code,
sc.main_code || '__' || sc.year AS main_id,
childcount.cnt AS child_count
FROM
GK_main sc
LEFT JOIN
(SELECT
father_code,
COUNT(*) AS cnt
FROM
GK_main
WHERE
year= (SELECT MAX(year)FROM SS_job)
GROUP BY
father_code) childcount
ON childcount.father_code = sc.main_code
WHERE
sc.year = (SELECT MAX(year) FROM SS_job)
Creating an index on (father_code, year) should help.

Related

Take the last in the query

I have a query like this:
Select *
From Table1 as ad
inner join Table2 as u
on u.employee_ident=ad.employee_ident
inner join Table3 as t
On u.employee_ident=t.employee_ident and u.hire_date=t.hire_date
where DATEDIFF(day,t.term_date,GETDATE() )>=60 AND u.status in ('nohire','1') and u.company_group_abbr_name='ABC'
order by
t.term_date asc
Table3 can hold more than one term_date for the same user. When the condition DATEDIFF(day,t.term_date,GETDATE() )>=60 is evaluated, I want t.term_date to be the most recent one. At the moment the comparison uses the first one it finds.
So from the dates 2018, 2020, and 2022 it compares with 2018, and I want it to compare with 2022, which is the most recent one. How can I do this?
Try something like this:
WITH T3Latest (
employee_ident,
hire_date,
term_date,
term_rank
)
AS (
SELECT employee_ident,
hire_date,
term_date,
RANK() OVER (
PARTITION BY employee_ident ORDER BY term_date DESC
) term_rank
FROM Table3
)
SELECT *
FROM Table1 AS ad
INNER JOIN Table2 AS u ON u.employee_ident = ad.employee_ident
INNER JOIN T3Latest AS t ON u.employee_ident = t.employee_ident
AND u.hire_date = t.hire_date
WHERE t.term_rank = 1
AND DATEDIFF(day, t.term_date, GETDATE()) >= 60
AND u.STATUS IN (
'nohire',
'1'
)
AND u.company_group_abbr_name = 'ABC'
ORDER BY t.term_date ASC;
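
To see what the ranking step does on its own, here is a small sketch that runs only the CTE part against made-up rows in an in-memory SQLite database (Python's sqlite3 module). It assumes SQLite 3.25 or newer, which is when window functions were added. The DATEDIFF filter is left out because it is SQL Server specific.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table3 (employee_ident INTEGER, hire_date TEXT, term_date TEXT);
    -- Illustrative data: employee 1 has three term dates, employee 2 has one.
    INSERT INTO Table3 VALUES
        (1, '2017-01-01', '2018-03-01'),
        (1, '2017-01-01', '2020-03-01'),
        (1, '2017-01-01', '2022-03-01'),
        (2, '2019-06-01', '2021-09-15');
""")

latest = conn.execute("""
    WITH T3Latest AS (
        SELECT employee_ident, term_date,
               RANK() OVER (PARTITION BY employee_ident
                            ORDER BY term_date DESC) AS term_rank
          FROM Table3
    )
    SELECT employee_ident, term_date
      FROM T3Latest
     WHERE term_rank = 1          -- keep only the most recent term_date per employee
     ORDER BY employee_ident
""").fetchall()

print(latest)
```

Note that RANK() gives tied rows the same rank, so an employee with two identical latest term dates would yield two rows; ROW_NUMBER() is the usual alternative when exactly one row per employee is required.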

Convert aliases and distinct subquery from SQLite to SQLAlchemy

I am trying to convert a SQLite statement into Python SQLAlchemy to be used with FastAPI. I am not sure how to convert a query this complex, with the aliases s and p for the single prices table.
Here is the SQLite query:
SELECT s.security_id, p.price, MAX(p.price_datetime) price_datetime
FROM (SELECT DISTINCT security_id FROM prices) s
LEFT JOIN prices p ON p.security_id = s.security_id AND p.price_datetime <= '2022-08-10 19:00:00.000000'
GROUP BY s.security_id;
Here is my attempt so far:
# starting attempt so far
select(models.Price.security_id, models.Price.price, func.max(models.Price.price_datetime), models.Price.price_datetime)
My first question is why the query is so complicated. Selecting distinct security_id values, joining them back to the same table, and then grouping by security_id makes no sense to me.
I have come up with this much simpler version, which gives the same results in my tests.
SELECT security_id, price, MAX(price_datetime) price_datetime
FROM prices
WHERE price_datetime <= '2022-02-01'
GROUP BY security_id;
Which then is fairly easy to translate to SQLAlchemy.
stmt = (
select(
Price.security_id,
Price.price,
func.max(Price.price_datetime).label("price_datetime"),
)
.filter(Price.price_datetime <= '2022-02-01')
.group_by(Price.security_id)
)
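
One caveat about the simplified query: price is neither aggregated nor listed in GROUP BY. SQLite documents that, when MAX() or MIN() is the aggregate, such bare columns take their values from the row holding the maximum or minimum, which is why the query returns the price belonging to the latest price_datetime. Many other databases reject the statement or pick an arbitrary row. A small sketch with made-up prices, run through Python's sqlite3 module:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE prices (security_id INTEGER, price REAL, price_datetime TEXT);
    -- Illustrative data only.
    INSERT INTO prices VALUES
        (1, 10.0, '2022-01-10'),
        (1, 12.5, '2022-01-20'),
        (1, 99.0, '2022-03-01'),
        (2,  7.0, '2022-01-05');
""")

rows = conn.execute("""
    SELECT security_id, price, MAX(price_datetime) AS price_datetime
      FROM prices
     WHERE price_datetime <= '2022-02-01'
     GROUP BY security_id
     ORDER BY security_id
""").fetchall()

# In SQLite, price comes from the same row as the MAX(price_datetime) value.
print(rows)
```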
After OP's comment:
SELECT s.id, p.price, MAX(p.price_datetime) AS price_datetime
FROM security AS s
LEFT JOIN prices as p
ON s.id = p.security_id AND p.price_datetime <= '2021-02-01'
GROUP BY s.id;
which should translate to
stmt = (
select(
Security.id,
Price.price,
func.max(Price.price_datetime).label("price_datetime"),
)
.join(
Price,
and_(
Security.id == Price.security_id,
Price.price_datetime <= "2021-02-01",
),
isouter=True,
)
.group_by(Security.id)
)

Sql query tuning/optimization

For large amounts of data, it is taking a lot of time to execute.
Please help tune this query.
select *
from
(select cs.sch, cs.cls, cs.std, d.date, d.count
from
(select c.sch, c.cls, s.std
from
(select distinct sch, cls from Data) c --List of school/classes
cross join
(select distinct std from Data) s --list of std
) cs --every possible combination of school/classes and std
left outer join
Data D on D.sch = cs.sch and D.cls = cs.cls and D.std = cs.std --try and join to the original data
group by
c.sch, c.cls, s.std, d.date, d.count)
order by
cs.sch, cs.cls,
case
when (cs.std= 'Ax')
then 1
when (cs.std= 'Bo')
then 2
when (cs.std= 'Ct')
then 3
else null
end
Thanks in advance
Magickk
First, the query is generating a lot of rows (presumably) and so it is going to take time.
From what I can tell, the outer aggregation is not necessary. At the very least, you have no aggregation functions which is suspicious.
select c.sch, c.cls, s.std, d.date, d.count
from (Select distinct sch, cls from Data
) c cross join -- list of school/classes
(select distinct std from Data
) s left join -- list of std
Data d
on d.sch = c.sch and d.cls = c.cls and d.std = s.std
order by c.sch, c.cls,
(case s.std when 'Ax' then 1 when 'Bo' then 2 when 'Ct' then 3 end)
There is nothing you can do about the outer order by. For the select distinct subqueries, you can create indexes on data(sch, cls, std) (the third column is for the join) and data(std).
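
For reference, a runnable sketch of the simplified query against a few made-up rows, using an in-memory SQLite database through Python's sqlite3 module. The index names are hypothetical; the column lists are the ones suggested above. The last output row shows a school/class and std combination that has no matching data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Data (sch TEXT, cls TEXT, std TEXT, date TEXT, count INTEGER);
    -- Hypothetical index names.
    CREATE INDEX ix_data_sch_cls_std ON Data (sch, cls, std);
    CREATE INDEX ix_data_std ON Data (std);
    -- Illustrative data only.
    INSERT INTO Data VALUES
        ('S1', 'C1', 'Ax', '2020-01-01', 5),
        ('S1', 'C1', 'Bo', '2020-01-01', 3),
        ('S1', 'C2', 'Ax', '2020-01-02', 4);
""")

rows = conn.execute("""
    SELECT c.sch, c.cls, s.std, d.date, d.count
      FROM (SELECT DISTINCT sch, cls FROM Data) c      -- list of school/classes
     CROSS JOIN (SELECT DISTINCT std FROM Data) s      -- list of std
      LEFT JOIN Data d
        ON d.sch = c.sch AND d.cls = c.cls AND d.std = s.std
     ORDER BY c.sch, c.cls,
              CASE s.std WHEN 'Ax' THEN 1 WHEN 'Bo' THEN 2 WHEN 'Ct' THEN 3 END
""").fetchall()

for row in rows:
    print(row)
```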
DISTINCT can slow things down on big tables. GROUP BY can be used as a replacement for DISTINCT (which is faster in some scenarios):
select *
from
(select cs.sch, cs.cls, cs.std, d.date, d.count
from
(select c.sch, c.cls, s.std
from
(select sch, cls from Data
group by sch, cls) c
cross join
(select std from Data
group by std) s) cs --every possible combination of school/classes and std
left outer join
Data D on D.sch = cs.sch and D.cls = cs.cls and D.std = cs.std --try and join to the original data
group by
c.sch, c.cls, s.std, d.date, d.count)
order by
cs.sch, cs.cls,
case
when (cs.std= 'Ax')
then 1
when (cs.std= 'Bo')
then 2
when (cs.std= 'Ct')
then 3
else null
end

Using Subqueries (Common Table Expression) As Filters

tl;dr: My filter conditions are significantly different based on delivery date of today vs yesterday. How can I utilize subqueries to simplify my code?
I've been teaching myself SQL and am pretty solid on most concepts, with the exception of subqueries. Generally I gather that I can use WITH ___ AS before my query to return results that meet the conditions specified in the subquery. I thought I could make it work with more than one subquery but I'm having trouble. It works with one subquery but not with multiple. I know this can be done within a WHERE statement but it would be quite complex. Here is an example of what I'd like to do:
WITH todays_results AS(
SELECT
order_id,
status,
message
FROM delivery_statuses
WHERE delivery_date = STRLEFT(CAST(now() AS string), 10)
AND (status = 'delivered'
OR (status = 'out for delivery' AND message = 'On vehicle for delivery')
OR (status = 'in transit' AND message <>'At sort center'))
),
yesterdays_results AS (
SELECT
order_id,
status,
message
FROM delivery_statuses
WHERE delivery_date = STRLEFT(CAST(now() - INTERVAL 1 days AS string), 10)
AND (status = 'delivered'
OR (status = 'out for delivery' AND message = 'Shipment will be delivered within 1 hour')
OR (status = 'pre transit' AND message <> 'Order processing'))
)
SELECT
*
FROM customer_details cd
INNER JOIN
(SELECT * FROM todays_results) tr
ON cd.order_id = tr.order_id
INNER JOIN
(SELECT * FROM yesterdays_results) yr
ON cd.order_id = yr.order_id
How can I make this return the results that match the first subquery and, below them, the results that match the second subquery? I would also like to add a third subquery.
Give each WITH clause a number to identify it.
Then you can UNION ALL the joins and order the results by that common ID column.
-- this code has not been tested.
WITH
A AS
(
SELECT 1 ID, OTHER_STUFF FROM SOME_WHERE
),
B AS
(
SELECT 2 ID, OTHER_STUFF FROM SOME_WHERE_ELSE
)
SELECT *
FROM
(
SELECT A.ID, C.*
FROM TABLEX C
JOIN A
ON A.KEY = C.KEY
UNION ALL
SELECT B.ID, D.*
FROM TABLEX D
JOIN B
ON B.KEY = D.KEY
)
ORDER BY ID
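
A runnable sketch of the same idea, shaped like the tables in the question and run against an in-memory SQLite database with Python's sqlite3 module. The rows, the delivery_date column name, and the fixed dates passed as parameters are made up for illustration, and the filters are reduced to a single status check to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE delivery_statuses (order_id INTEGER, delivery_date TEXT, status TEXT);
    CREATE TABLE customer_details (order_id INTEGER, customer TEXT);
    -- Illustrative data only.
    INSERT INTO delivery_statuses VALUES
        (1, '2024-05-02', 'delivered'),
        (2, '2024-05-01', 'delivered'),
        (3, '2024-05-01', 'in transit');
    INSERT INTO customer_details VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
""")

rows = conn.execute("""
    WITH todays_results AS (
        SELECT 1 AS id, order_id, status      -- 1 tags rows from the first CTE
          FROM delivery_statuses
         WHERE delivery_date = :today AND status = 'delivered'
    ),
    yesterdays_results AS (
        SELECT 2 AS id, order_id, status      -- 2 tags rows from the second CTE
          FROM delivery_statuses
         WHERE delivery_date = :yesterday AND status = 'delivered'
    )
    SELECT r.id, cd.customer, r.status
      FROM (SELECT * FROM todays_results
            UNION ALL
            SELECT * FROM yesterdays_results) r
      JOIN customer_details cd ON cd.order_id = r.order_id
     ORDER BY r.id, cd.order_id
""", {"today": "2024-05-02", "yesterday": "2024-05-01"}).fetchall()

print(rows)
```

A third result set would follow the same pattern: another CTE tagged with 3 and one more UNION ALL branch.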

Regarding one query error

Here is my requirement: when I use the query below I get the correct response, but I want to select only the distinct records. How can I use DISTINCT in the query below?
SELECT LISTAGG(PAC.DESCRIPTION || ' = '|| ORL.ITEM_PACKAGE_COUNT , ',') WITHIN GROUP (ORDER BY PAC.DESCRIPTION || ' = '|| ORL.ITEM_PACKAGE_COUNT)
FROM ORDER_RELEASE_LINE ORL , PACKAGED_ITEM PAC , SHIPMENT SH , ORDER_MOVEMENT OM
WHERE ORL.PACKAGED_ITEM_GID = PAC.PACKAGED_ITEM_GID
AND OM.ORDER_RELEASE_GID = ORL.ORDER_RELEASE_GID
AND OM.SHIPMENT_GID = SH.SHIPMENT_GID
AND SH.SHIPMENT_GID = 'ULA/SAO.5000072118'
Your subquery returns SELECT DISTINCT PAC.DESCRIPTION, but the outer query uses aliases and values from the inner query in LISTAGG(PAC.DESCRIPTION || ' = '|| ORL.ITEM_PACKAGE_COUNT , ','), and ORL.ITEM_PACKAGE_COUNT is not returned by the subquery. Try:
SELECT LISTAGG(SUBQ.DESCRIPTION || ' = '|| SUBQ.ITEM_PACKAGE_COUNT , ',')
WITHIN GROUP (ORDER BY SUBQ.DESCRIPTION || ' = '|| SUBQ.ITEM_PACKAGE_COUNT)
FROM (SELECT DISTINCT PAC.DESCRIPTION, ORL.ITEM_PACKAGE_COUNT
FROM ORDER_RELEASE_LINE ORL , PACKAGED_ITEM PAC , SHIPMENT SH , ORDER_MOVEMENT OM
WHERE ORL.PACKAGED_ITEM_GID = PAC.PACKAGED_ITEM_GID
AND OM.ORDER_RELEASE_GID = ORL.ORDER_RELEASE_GID
AND OM.SHIPMENT_GID = SH.SHIPMENT_GID
AND SH.SHIPMENT_GID = 'ULA/SAO.5000072118') SUBQ
In general, it is bad practice to use the same alias PAC in the inner query for a table and in the outer query for the result of the joined data. Another bad practice is using implicit joins instead of explicit INNER JOIN ... ON clauses.
If you need to get distinct values from a query, and then build the LISTAGG of these distinct values, you can simply use DISTINCT in your query and wrap it with an external one where you use LISTAGG.
For example:
with dupValTab(s) as
(
select 'something' from dual union all
select 'something else' from dual union all
select 'something' from dual
)
select listagg(s, ', ') within group (order by s)
from (
select distinct s
from dupValTab
)
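
The same deduplicate-then-aggregate pattern can be tried without an Oracle instance. The sketch below uses SQLite through Python's sqlite3 module, with group_concat standing in for LISTAGG (an assumption made only so the example is runnable). SQLite does not guarantee the order of concatenated values, so the result is sorted in Python before printing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dupValTab (s TEXT);
    INSERT INTO dupValTab VALUES ('something'), ('something else'), ('something');
""")

# The inner query removes duplicates; the outer query concatenates what is left.
(joined,) = conn.execute("""
    SELECT group_concat(s, ', ')
      FROM (SELECT DISTINCT s FROM dupValTab)
""").fetchone()

values = sorted(joined.split(", "))
print(values)
```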