Trying to figure out why this SQL request takes 47 min to execute

I'm running a SQL query that takes forever to finish. The query is executed from Excel 2003 with VBA.
Table sizes:
TABLE1 = 12,600 rows
TABLE2 = 361K rows
Here's the query:
SELECT DISTINCT
y.code AS CODE,
y.name AS LIBELLE,
#[...]
#[...]
#[...]
#[...]
y.IS_BILAN,
y.INACTIVE,
(SELECT COUNT(1)
FROM TABLE1 d, TABLE2 a
WHERE a.record_date_time >= '2018/01/01'
AND a.record_date_time < '2019/01/01'
AND global_status <> 'C'
AND a.id = d.id
AND d.type_id = y.code) AS TOTAL_2018
FROM
anal_exam y
ORDER BY
code
The whole query runs instantly when I remove the last part, the "SELECT COUNT(1)" subquery.
The execution plan I see in Oracle SQL Developer:
How could I speed up this query? It takes 47 minutes to finish.

Try defining your JOIN like this:
SELECT DISTINCT
y.code AS CODE,
y.name AS LIBELLE,
y.IS_BILAN,
y.INACTIVE,
COUNT(*) AS TOTAL_2018
FROM anal_exam y
JOIN TABLE1 d
ON d.type_id = y.code
JOIN TABLE2 a
ON d.ID = a.ID
WHERE a.record_date_time >= '2018/01/01' AND a.record_date_time < '2019/01/01'
AND global_status <> 'C'
order by code

I added GROUP BY y.code, y.name, y.IS_BILAN, y.inactive at the end and it works.
The runtime is 47 seconds.
That's quite fast, but I'm wondering if there's a way to get the rows with count = 0, because 3k rows are omitted by this query.

With the code from T McKeown I'm getting this result:
CODE1|LIBELLE1|T|T|1530
CODE3|LIBELLE2|T|T|20
CODE5|LIBELLE3|T|T|143
The result I'm after includes the rows with COUNT() = 0:
CODE1|LIBELLE1|T|T|1530
CODE2|LIBELLE2|T|F|0
CODE3|LIBELLE2|T|F|20
CODE4|LIBELLE4|T|T|0
CODE5|LIBELLE3|F|T|143
How can I achieve this?
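
One way to keep the rows with a zero count is to aggregate TABLE1/TABLE2 once per type_id in a derived table and LEFT JOIN it to anal_exam, so every exam row survives even without a match. This is only a sketch under assumptions: it assumes global_status is a column of TABLE1 or TABLE2 (the original query does not qualify it), and the alias t and the column name total are made up for illustration.

```sql
SELECT DISTINCT
       y.code AS CODE,
       y.name AS LIBELLE,
       y.IS_BILAN,
       y.INACTIVE,
       -- exams with no matching rows get NULL from the LEFT JOIN; show them as 0
       COALESCE(t.total, 0) AS TOTAL_2018
FROM anal_exam y
LEFT JOIN (
    -- count the 2018 records once per type_id instead of once per exam row
    SELECT d.type_id, COUNT(*) AS total
    FROM TABLE1 d
    JOIN TABLE2 a ON a.id = d.id
    WHERE a.record_date_time >= '2018/01/01'
      AND a.record_date_time < '2019/01/01'
      AND global_status <> 'C'   -- assumed to belong to TABLE1 or TABLE2
    GROUP BY d.type_id
) t ON t.type_id = y.code
ORDER BY y.code
```

Because the date and status filters live inside the derived table rather than in the outer WHERE clause, they cannot turn the LEFT JOIN back into an inner join.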

Related

How to add a conditional statement after calculating two fields in SQL

I need to filter my output based on a condition so that it only contains usable data. I also need help understanding and optimizing my SQL query and removing its redundancies.
I tried putting the condition in the WHERE clause, but that gives me an error. I also tried adding a HAVING clause, which did not work either.
select
o2.car_move_id as Carrier_Code,
o1.early_shpdte,
o1.prtnum,
shpsts,
(o1.host_ordqty / o3.untqty) as Order_pallets,
(
select
count(i3.untqty)
from
INVENTORY_PCKWRK_VIEW i3
inner join prtftp_dtl i4 on i3.prtnum = i4.prtnum
where
i3.invsts like 'U'
and i3.wrkref is null
and i3.prtnum = o1.prtnum
and i3.untqty = i4.untqty
and i4.uomcod like 'PL'
and i4.wh_id like 'RX'
) as full_pallets,
(
select
count(i5.untqty)
from
INVENTORY_PCKWRK_VIEW i5
inner join prtftp_dtl i6 on i5.prtnum = i6.prtnum
where
i5.invsts like 'U'
and i5.wrkref is null
and i5.prtnum = o1.prtnum
and i5.untqty < i6.untqty
and i5.prtnum = i6.prtnum
and i6.uomcod like 'PL'
and i6.wh_id like 'RX'
) as Partial_pallets
from
ord_line o1
inner join SHIP_STRUCT_VIEW o2 on o1.ordnum = o2.ship_id
inner join prtftp_dtl o3 on o1.prtnum = o3.prtnum
where
o2.ship_id like '0%'
and shpsts in ('R', 'I')
and o1.non_alc_flg = 0
and o3.wh_id like 'RX'
and o3.uomcod like 'PL'
order by
full_pallets asc,
o1.early_shpdte asc
I only want to output the rows where order_pallets > Full_Pallets. I'm not sure where I can add this condition in my query.

The items in the SELECT list of an SQL query are logically processed after the WHERE clause (as explained in this answer), which is why you cannot reference column aliases in the WHERE clause. You will need a subselect to accomplish what you want:
select * from (
select
o2.car_move_id as Carrier_Code,
o1.early_shpdte,
o1.prtnum,
shpsts,
-- the rest of your current query
) t
where t.order_pallets > t.Full_Pallets

You can enclose your entire query in
with x as ()
Then select from it:
select * from x
where x.order_pallets > x.full_pallets
This will save you from having to maintain multiple subqueries for the same information.

Select row from a MAX() GROUP BY in SQL Server

I have a table called Eventos. I have to select the OutTime of the alarm that has the greatest InTime.
It also has to be fast: I have about 1 million entries in the table.
This is my code:
SELECT
CadGrupoEventos.Severidade AS Nível,
CadGrupoEquipamentos.Nome AS Grupo,
CadEquipamentos.TAG AS Equipamento,
CadEventos.MensagemPT AS 'Mensagem de alarme',
MAX(Eventos.InTime) AS 'Hora do evento',
Eventos.OutTime AS 'Hora de saída'
FROM
CadGrupoEventos,
CadEquipamentos,
CadEventos,
Eventos,
CadUsuarios,
CadGrupoEquipamentos
WHERE
Eventos.Acked = 0
AND CadGrupoEventos.Codigo = CadEventos.Grupo
AND CadEquipamentos.Codigo = Eventos.TAG
AND CadEventos.Codigo = Eventos.CodMensagem
AND CadGrupoEquipamentos.Codigo = CadEquipamentos.Grupo
GROUP BY
CadGrupoEventos.Severidade,
CadEquipamentos.TAG,
CadEventos.MensagemPT,
CadGrupoEquipamentos.Nome,
Eventos.OutTime
This code, as it is, returns every single entry from the table.
I need to take Eventos.OutTime out of the GROUP BY and still get its value.

This is just an educated guess based on your description. Notice I used ANSI-92 style joins which are much more explicit. I also used aliases to make this a lot more legible. Your query might look something like this.
select x.Severidade AS Nível,
x.Nome AS Grupo,
x.TAG AS Equipamento,
x.MensagemPT AS [Mensagem de alarme],
x.[Hora do evento],
x.OutTime AS [Hora de saída]
from
(
SELECT cge.Severidade,
cgequip.Nome,
ce.TAG,
cevt.MensagemPT,
MAX(e.InTime) AS [Hora do evento],
e.OutTime
, RowNum = ROW_NUMBER() over(partition by cge.Severidade, ce.TAG, cevt.MensagemPT, cgequip.Nome order by e.OutTime /*maybe desc???*/)
FROM CadGrupoEventos cge
join CadEventos cevt on cge.Codigo = cevt.Grupo
join Eventos e on cevt.Codigo = e.CodMensagem
join CadEquipamentos ce on ce.Codigo = e.TAG
join CadGrupoEquipamentos cgequip on cgequip.Codigo = ce.Grupo
cross join CadUsuarios cu --not sure if this is really what you want but your original code did not have any logic for this table
WHERE e.Acked = 0
GROUP BY cge.Severidade,
ce.TAG,
cevt.MensagemPT,
cgequip.Nome,
e.OutTime
) x
where x.RowNum = 1
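
If the goal is the OutTime of the row with the latest InTime per alarm, a variant without GROUP BY may be closer to what the question asks. The following is a sketch only: it numbers the rows of each alarm by InTime descending and keeps the first one. It assumes the partition columns below are what identifies "one alarm", and it leaves out CadUsuarios because the original query has no join condition for that table.

```sql
SELECT x.Severidade AS Nível,
       x.Nome       AS Grupo,
       x.TAG        AS Equipamento,
       x.MensagemPT AS [Mensagem de alarme],
       x.InTime     AS [Hora do evento],
       x.OutTime    AS [Hora de saída]
FROM (
    SELECT cge.Severidade,
           cgequip.Nome,
           ce.TAG,
           cevt.MensagemPT,
           e.InTime,
           e.OutTime,
           -- RowNum = 1 marks the row with the greatest InTime in each alarm group
           ROW_NUMBER() OVER (
               PARTITION BY cge.Severidade, ce.TAG, cevt.MensagemPT, cgequip.Nome
               ORDER BY e.InTime DESC
           ) AS RowNum
    FROM Eventos e
    JOIN CadEventos cevt ON cevt.Codigo = e.CodMensagem
    JOIN CadGrupoEventos cge ON cge.Codigo = cevt.Grupo
    JOIN CadEquipamentos ce ON ce.Codigo = e.TAG
    JOIN CadGrupoEquipamentos cgequip ON cgequip.Codigo = ce.Grupo
    WHERE e.Acked = 0
) x
WHERE x.RowNum = 1
```

Since InTime and OutTime come from the same Eventos row, OutTime no longer needs to appear in any GROUP BY.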

Teradata using Top 10 and Distinct

I am getting an error saying that I cannot use TOP 10 with DISTINCT. Is there any way I can get my query to work this way? This is what the error says:
[Teradata Database] [6916] TOP N Syntax error: Top N option is not supported with DISTINCT option.
Query below:
Thank you.
Select Distinct TOP 10 t1.Adjustment_ID, t1.OfficeNum, t1.InvoiceNum, t1.PatientNum,
t1.CurrentStatus, t1.AdjustmentTotal, t1.SubmittedOn, t1.UserSubmitted,
t1.Invoice_Type, t1.Pat_First_Name, t1.Pat_Last_Name,t2.Reason_Code FROM App_UnityAdj_AdjInfo_Tbl t1
Left Join RCM_WORK_PRD.App_UnityAdj_AdjRecord_Tbl t2
On t1.Adjustment_ID = t2.Adjustment_ID
Where t1.UserSubmitted = 'Name' AND (t1.CurrentStatus = 'Pending' OR t1.CurrentStatus = 'Deny')

What are you trying to do? You have columns in the SELECT that are not in the GROUP BY. You also have TOP without ORDER BY, which is suspicious.
One simple method is to move all the SELECT columns to the GROUP BY:
select TOP 10 t1.Adjustment_ID, t1.OfficeNum, t1.InvoiceNum, t1.PatientNum,
t1.CurrentStatus, t1.AdjustmentTotal, t1.SubmittedOn, t1.UserSubmitted,
t1.Invoice_Type, t1.Pat_First_Name, t1.Pat_Last_Name, t2.Reason_Code
from App_UnityAdj_AdjInfo_Tbl t1 Left Join
RCM_WORK_PRD.App_UnityAdj_AdjRecord_Tbl t2
On t1.Adjustment_ID = t2.Adjustment_ID
where t1.UserSubmitted = 'Name' AND
t1.CurrentStatus in ('Pending', 'Deny')
group by t1.Adjustment_ID, t1.OfficeNum, t1.InvoiceNum, t1.PatientNum,
t1.CurrentStatus, t1.AdjustmentTotal, t1.SubmittedOn, t1.UserSubmitted,
t1.Invoice_Type, t1.Pat_First_Name, t1.Pat_Last_Name,t2.Reason_Code;

I think I figured it out. In case anyone wants to know, it is sample 10:
Select Distinct t1.Adjustment_ID, t1.OfficeNum, t1.InvoiceNum, t1.PatientNum,
t1.CurrentStatus, t1.AdjustmentTotal, t1.SubmittedOn, t1.UserSubmitted,
t1.Invoice_Type, t1.Pat_First_Name, t1.Pat_Last_Name,t2.Reason_Code FROM App_UnityAdj_AdjInfo_Tbl t1
Left Join RCM_WORK_PRD.App_UnityAdj_AdjRecord_Tbl t2
On t1.Adjustment_ID = t2.Adjustment_ID
Where t1.AssignedTo IS null AND (t1.CurrentStatus = 'Pending')
sample 10
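
Note that SAMPLE 10 returns 10 randomly chosen rows, not the first 10 in any particular order. If an ordered TOP 10 over the distinct rows is what is needed, one option is to move DISTINCT into a derived table and apply TOP outside it. This is a sketch: the filter is copied from the original question, while the alias d and the ORDER BY column are assumptions to adapt.

```sql
SELECT TOP 10 d.*
FROM (
    -- DISTINCT is applied here, so the outer SELECT is free to use TOP
    SELECT DISTINCT t1.Adjustment_ID, t1.OfficeNum, t1.InvoiceNum, t1.PatientNum,
           t1.CurrentStatus, t1.AdjustmentTotal, t1.SubmittedOn, t1.UserSubmitted,
           t1.Invoice_Type, t1.Pat_First_Name, t1.Pat_Last_Name, t2.Reason_Code
    FROM App_UnityAdj_AdjInfo_Tbl t1
    LEFT JOIN RCM_WORK_PRD.App_UnityAdj_AdjRecord_Tbl t2
      ON t1.Adjustment_ID = t2.Adjustment_ID
    WHERE t1.UserSubmitted = 'Name'
      AND t1.CurrentStatus IN ('Pending', 'Deny')
) d
ORDER BY d.Adjustment_ID   -- assumed ordering; pick the column that defines "top"
```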

Find duplicates in SQL Server database where one of the columns must differ

I'm trying to write a SQL query to find duplicates. What I can't manage is to make the query select only duplicates where one column's value differs. In other words, I want to find all rows where every other column is the same but one value differs.
What I've got at the moment:
SELECT
a.1, underlag.1, f.1, f.2, f.3, f.4, f.5, f.6, f.7, f.8,
COUNT(*) TotalCount
FROM
f
JOIN
a ON a.Id = f.Id
JOIN
underlag ON underlag.Id = f.Id
GROUP BY
a.1, underlag.1, f.1, f.2, f.3, f.4, f.5, f.6, f.7, f.8
HAVING
COUNT(*) > 1
ORDER BY
underlag.1
The column that I want to differ is f.9, but I have no clue how to do this. Any help or pointers in the right direction would be great!

SELECT *
FROM (
SELECT
a1 = a.[1]
, underlag1 = underlag.[1]
, f.[1], f.[2], f.[3], f.[4], f.[5], f.[6], f.[7], f.[8], f.[9]
, val = SUM(1) OVER (PARTITION BY CHECKSUM(f.[1], f.[2], f.[3], f.[4], f.[5], f.[6], f.[7], f.[8]))
FROM f
JOIN a on a.Id = f.Id
JOIN underlag on underlag.Id = f.Id
) t
WHERE t.val > 1
ORDER BY underlag1
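
To require that f.9 actually differs within a group of duplicates, another option is to keep the GROUP BY from the question and add a COUNT(DISTINCT ...) condition. This is a sketch that assumes the placeholder column names from the question; the DistinctF9 alias is made up for illustration.

```sql
SELECT a.[1] AS a1,
       underlag.[1] AS underlag1,
       f.[1], f.[2], f.[3], f.[4], f.[5], f.[6], f.[7], f.[8],
       COUNT(*) AS TotalCount,
       COUNT(DISTINCT f.[9]) AS DistinctF9
FROM f
JOIN a ON a.Id = f.Id
JOIN underlag ON underlag.Id = f.Id
GROUP BY a.[1], underlag.[1], f.[1], f.[2], f.[3], f.[4], f.[5], f.[6], f.[7], f.[8]
-- more than one distinct f.[9] value means the group has duplicates that differ in f.[9]
HAVING COUNT(DISTINCT f.[9]) > 1
ORDER BY underlag.[1]
```

COUNT(DISTINCT ...) ignores NULLs, so a group whose f.9 values are one value plus NULL would not be reported by this condition.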

Why grouping in a subquery causes problems

When I include the two commented-out lines in the following subquery, my Sybase 12.5 ASE server seems to take forever to return any results. Without these two lines the query runs fine. What is wrong with that grouping?
select days_played.day_played, count(distinct days_played.user_id) as OLD_users
from days_played inner join days_received
on days_played.day_played = days_received.day_received
and days_played.user_id = days_received.user_id
where days_received.min_bulk_MT > days_played.min_MO
and days_played.user_id in
(select sgia.user_id
from days_played as sgia
where sgia.day_played < days_played.day_played
--group by sgia.user_id
--having sum(sgia.B_first_msg) = 0
)
group by days_played.day_played

Find out what the query does by using showplan to display the query plan.
In this case, can't you eliminate the subquery by making it part of the main query?

Could you try rewriting the query as follows?
select days_played.day_played,
count(distinct days_played.user_id) as OLD_users
from days_played
inner join days_received on days_played.day_played = days_received.day_received
and days_played.user_id = days_received.user_id
where days_received.min_bulk_MT > days_played.min_MO
and 0 = (select sum(sgia.B_first_msg)
from days_played as sgia
where sgia.user_id = days_played.user_id
and sgia.day_played < days_played.day_played
)
group by days_played.day_played
I guess this should give you better performance...

OK, I found out what the problem was.
I had to include the user id in the subquery: "where days_played.user_id = sgia.user_id
and sgia.day_played < days_played.day_played"
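
Put together, the original query with that extra correlation and the two grouping lines restored would presumably look like the sketch below. It is reconstructed from the snippets in this thread, not taken from the poster's final code.

```sql
select days_played.day_played, count(distinct days_played.user_id) as OLD_users
from days_played inner join days_received
on days_played.day_played = days_received.day_received
and days_played.user_id = days_received.user_id
where days_received.min_bulk_MT > days_played.min_MO
and days_played.user_id in
(select sgia.user_id
from days_played as sgia
-- correlate on user_id so the subquery only looks at the current user's earlier days
where sgia.user_id = days_played.user_id
and sgia.day_played < days_played.day_played
group by sgia.user_id
having sum(sgia.B_first_msg) = 0
)
group by days_played.day_played
```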
