Why is the "OR" operator slower than UNION in Oracle? - sql

Does anyone know why an "OR" operator is slower than UNION in Oracle?
I have query like this:
Select
O.Order_number,
DA.ID,
DA.Country,
Sum(amount) Amount
from
Order O
left join Delivery_Address DA on
O.ID = DA.order_Id
left join TBL_A on
TBL_A.DA_ID = DA.ID
< ... Left joining another 10 tables>
Left join Transaction Tr on
Tr.Order_id = O.ID
where
DA.Country = 'USA'
OR
Tr.transaction_Date between to_date('20200701','yyyymmdd') and sysdate
This takes 200 secs for the first 50 records.
Select
O.Order_number,
DA.ID,
DA.Country,
Sum(amount) Amount
from
Order O
left join Delivery_Address DA on
O.ID = DA.order_Id
left join TBL_A on
TBL_A.DA_ID = DA.ID
< ... Left joining another 10 tables>
Left join Transaction Tr on
Tr.Order_id = O.ID
where
DA.Country = 'USA'
union
Select
O.Order_number,
DA.ID,
DA.Country,
Sum(amount) Amount
from
Order O
left join Delivery_Address DA on
O.ID = DA.order_Id
left join TBL_A on
TBL_A.DA_ID = DA.ID
< ... Left joining another 10 tables>
Left join Transaction Tr on
Tr.Order_id = O.ID
where
Tr.transaction_Date between to_date('20200701','yyyymmdd') and sysdate
This second query takes 13 secs for the first 50 records.
The transaction_date from the Transaction table is indexed, but the Country column is not indexed.
Does anyone have any idea?

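The UNION allows each subquery to be evaluated independently, so the optimizer can choose a separate access path for each filter. With the OR, the two conditions refer to different tables, so the filter generally cannot be applied until after the joins.
You would have to look at the execution plans to see what is really happening. However, the first subquery can filter on da.country on its own (using an index if one exists), and the second is probably using the index on tr.transaction_date.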
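To make the comparison concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than Oracle, so it says nothing about Oracle's timings. The three tables and their rows are made up for illustration and are much simpler than the query in the question. It only shows that the OR form and the UNION form return the same distinct rows, which is what makes the rewrite possible.

```python
import sqlite3

# Hypothetical, simplified stand-ins for the tables in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, order_number TEXT);
CREATE TABLE delivery_address (id INTEGER PRIMARY KEY, order_id INTEGER, country TEXT);
CREATE TABLE transactions (id INTEGER PRIMARY KEY, order_id INTEGER, transaction_date TEXT);
INSERT INTO orders VALUES (1, 'A'), (2, 'B'), (3, 'C');
INSERT INTO delivery_address VALUES (10, 1, 'USA'), (20, 2, 'Canada'), (30, 3, 'USA');
INSERT INTO transactions VALUES
    (100, 1, '2020-08-01'), (200, 2, '2020-07-15'), (300, 3, '2020-01-01');
""")

# Shared SELECT/FROM part; each variant only appends its own WHERE clause.
base = """
SELECT o.order_number, da.country
FROM orders o
LEFT JOIN delivery_address da ON da.order_id = o.id
LEFT JOIN transactions tr ON tr.order_id = o.id
"""

# One query with OR across two different tables.
or_rows = conn.execute(
    base + "WHERE da.country = 'USA' OR tr.transaction_date >= '2020-07-01'"
).fetchall()

# Two single-filter queries combined with UNION.
union_rows = conn.execute(
    base + "WHERE da.country = 'USA' UNION "
    + base + "WHERE tr.transaction_date >= '2020-07-01'"
).fetchall()

print(sorted(or_rows))
print(sorted(union_rows))
```

Note that UNION removes duplicate rows, so the two forms can differ when the OR query legitimately returns duplicates. In Oracle itself, EXPLAIN PLAN FOR followed by SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY) shows the plan for each form.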

Related

Access Subquery On multiple conditions

This SQL query needs to be done in ACCESS.
I am trying to do a subquery on the total sales, but I want to link the sale to the province AND to the product. The query below will work with one or the other of the conditions (po.product_name = allp.all_products) AND (p.province = allp.all_province), but it will not take both.
I will be including every month in this query once I can figure out the subquery with two criteria.
Select
p.province as [Province],
po.product_name as [Product],
all_price
FROM
(purchase_order po
INNER JOIN person p
on p.person_id = po.person_id)
left join
(
select
po1.product_name AS [all_products],
sum(pp1.price) AS [all_price],
p1.province AS [all_province]
from (purchase_order po1
INNER JOIN product pp1
on po1.product_name = pp1.product_name)
INNER JOIN person p1
on po1.person_id = p1.person_id
group by po1.product_name, pp1.price, p1.province
)
as allp
on (po.product_name = allp.all_products) AND (p.province = allp.all_province);
Turn the first SELECT into a derived table by giving it an alias, then join table 1 to table 2. I don't have your table structure or data to test it, but I think this will lead you down the right path:
select table1.*, table2.*
from
(Select
p.province as [Province],
po.product_name as [Product]
--removed this ,all_price
FROM
(purchase_order po
INNER JOIN person p
on p.person_id = po.person_id)) as table1
left join
(
select
po1.product_name AS [all_products],
sum(pp1.price) AS [all_price],
p1.province AS [all_province]
from (purchase_order po1
INNER JOIN product pp1
on po1.product_name = pp1.product_name)
INNER JOIN person p1
on po1.person_id = p1.person_id
group by po1.product_name, pp1.price, p1.province --check your group by, I dont think you want pp1.price here if you want to aggregate
) as table2 --changed from allp
on (table1.product = table2.all_products) AND (table1.province = table2.all_province);

Count with subselect really slow in postgres

I have this query:
SELECT c.name, COUNT(t.id)
FROM Cinema c
JOIN CinemaMovie cm ON cm.cinema_id = c.id
JOIN Ticket t ON cm.id = cinema_movie_id
WHERE cm.id IN (
SELECT cm1.id
FROM CinemaMovie cm1
JOIN Movie m1 ON m1.id = cm1.movie_id
JOIN Ticket t1 ON t1.cinema_movie_id = cm1.id
WHERE m1.name = 'Hellboy'
AND t1.time >= timestamp '2019-04-18 00:00:00'
AND t1.time <= timestamp '2019-04-18 23:59:59' )
GROUP BY c.id;
The problem is that this query runs really slowly (more than 1 minute) when the table has about 20 million rows. From what I understand, the problem seems to be the inner query, as it takes a long time. Also, I have indexes on all foreign keys. What am I missing?
Also note that when I select only by name (I omit the date) everything takes like 10 seconds.
EDIT
What I am trying to do, is count number of tickets for each cinema name, based on movie name and the timestamp on ticket.
I don't understand why you are using a subquery. Does this do what you want?
SELECT c.name, COUNT(t.id)
FROM Cinema c JOIN
CinemaMovie cm
ON cm.cinema_id = c.id JOIN
Ticket t
ON cm.id = t.cinema_movie_id JOIN
Movie m
ON m.id = cm.movie_id
WHERE m.name = 'Hellboy' AND
t.time >= '2019-04-18'::timestamp and
t.time < '2019-04-19'::timestamp
GROUP BY c.id, c.name;
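The join-only version can be tried on a toy data set. The sketch below uses Python's sqlite3 module instead of Postgres, with made-up cinemas and tickets, and stores timestamps as ISO-8601 text (an assumption that makes string comparison match chronological order). It also contrasts the half-open range used above with the closed `<= 23:59:59` range from the question, which misses a ticket sold in the last second of the day.

```python
import sqlite3

# Toy data only; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Cinema (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Movie (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE CinemaMovie (id INTEGER PRIMARY KEY, cinema_id INTEGER, movie_id INTEGER);
CREATE TABLE Ticket (id INTEGER PRIMARY KEY, cinema_movie_id INTEGER, time TEXT);
INSERT INTO Cinema VALUES (1, 'Odeon'), (2, 'Rex');
INSERT INTO Movie VALUES (1, 'Hellboy'), (2, 'Other');
INSERT INTO CinemaMovie VALUES (1, 1, 1), (2, 2, 1), (3, 1, 2);
INSERT INTO Ticket VALUES
    (1, 1, '2019-04-18 10:00:00'),
    (2, 1, '2019-04-18 23:59:59.500'),
    (3, 1, '2019-04-19 00:00:00'),
    (4, 2, '2019-04-18 12:00:00'),
    (5, 3, '2019-04-18 12:00:00');
""")

# {op} is filled in with "<" (half-open) or "<=" (closed) below.
query = """
SELECT c.name, COUNT(t.id)
FROM Cinema c
JOIN CinemaMovie cm ON cm.cinema_id = c.id
JOIN Ticket t ON t.cinema_movie_id = cm.id
JOIN Movie m ON m.id = cm.movie_id
WHERE m.name = 'Hellboy' AND t.time >= ? AND t.time {op} ?
GROUP BY c.id, c.name
ORDER BY c.name
"""

half_open = conn.execute(
    query.format(op="<"), ("2019-04-18", "2019-04-19")
).fetchall()
closed = conn.execute(
    query.format(op="<="), ("2019-04-18 00:00:00", "2019-04-18 23:59:59")
).fetchall()

print(half_open)  # counts ticket 2, sold at 23:59:59.500
print(closed)     # misses ticket 2
```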

sql subquery join group by

I am trying to get a list of our users from our database along with the number of people from the same cohort as them - which in this case is defined as being from the same medical school at the same time.
medical_school_id is stored in the doctor_record table
graduation_dt is stored in the doctor_record table as well.
I have managed to write this query out using a subquery which does a select statement counting the number of others for each row but this takes forever. My logic is telling me that I ought to run a simple GROUP BY query once first and then somehow JOIN the medical_school_id on to that.
The group by query is as follows
select count(ca.id) , cdr.medical_school_id, cdr.graduation_dt
from account ca
LEFT JOIN doctor cd on ca.id = cd.account_id
LEFT JOIN doctor_record cdr on cd.gmc_number = cdr.gmc_number
GROUP BY cdr.medical_school_id, cdr.graduation_dt
The long select query is
select a.id, a.email , dr.medical_school_id,
(select count(ba.id) from account ba
LEFT JOIN doctor bd on ba.id = bd.account_id
LEFT JOIN doctor_record bdr on bd.gmc_number = bdr.gmc_number
WHERE bdr.medical_school_id = dr.medical_school_id AND bdr.graduation_dt = dr.graduation_dt) AS med_count
from account a
LEFT JOIN doctor d on a.id = d.account_id
LEFT JOIN doctor_record dr on d.gmc_number = dr.gmc_number
If you could push me in the right direction that would be amazing
I think you just want window functions:
select a.id, a.email, dr.medical_school_id, dr.graduation_dt,
count(*) over (partition by dr.medical_school_id, dr.graduation_dt) as cohort_size
from account a left join
doctor d
on a.id = d.account_id left join
doctor_record dr
on d.gmc_number = dr.gmc_number;
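Here is a small runnable sketch of the window-function approach, using Python's sqlite3 module (this assumes the bundled SQLite is 3.25 or newer, which is when window functions were added). The rows are invented for illustration; only the table and column names come from the question.

```python
import sqlite3

# Invented sample data; account 4 has no doctor row at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (id INTEGER PRIMARY KEY, email TEXT);
CREATE TABLE doctor (account_id INTEGER, gmc_number TEXT);
CREATE TABLE doctor_record (gmc_number TEXT, medical_school_id INTEGER, graduation_dt TEXT);
INSERT INTO account VALUES
    (1, 'a@example.com'), (2, 'b@example.com'),
    (3, 'c@example.com'), (4, 'd@example.com');
INSERT INTO doctor VALUES (1, 'G1'), (2, 'G2'), (3, 'G3');
INSERT INTO doctor_record VALUES
    ('G1', 5, '2010-07-01'), ('G2', 5, '2010-07-01'), ('G3', 7, '2011-07-01');
""")

# COUNT(*) OVER (...) attaches the cohort size to every row in one pass,
# instead of running a correlated subquery per account.
rows = conn.execute("""
SELECT a.id, a.email, dr.medical_school_id, dr.graduation_dt,
       COUNT(*) OVER (PARTITION BY dr.medical_school_id, dr.graduation_dt) AS cohort_size
FROM account a
LEFT JOIN doctor d ON a.id = d.account_id
LEFT JOIN doctor_record dr ON d.gmc_number = dr.gmc_number
ORDER BY a.id
""").fetchall()

cohort_sizes = [row[4] for row in rows]
for row in rows:
    print(row)
```

One thing to watch: accounts without a doctor record all land in the same NULL partition, so their cohort_size is the number of such accounts rather than 0.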
Using your same code for the group by, joined back to the list of users:
SELECT label.id
, label.email
, label.medical_school_id
, cohort.cohort_size
FROM
(
SELECT acc.id
, acc.email
, doc_rec.medical_school_id
, doc_rec.graduation_dt
FROM
account acc
LEFT JOIN
doctor doc
ON
acc.id = doc.account_id
LEFT JOIN
doctor_record doc_rec
ON
doc.gmc_number = doc_rec.gmc_number
) label
LEFT JOIN
(
SELECT count(acco.id) AS cohort_size
, doc_reco.medical_school_id
, doc_reco.graduation_dt
FROM
account acco
LEFT JOIN
doctor doct
ON
acco.id = doct.account_id
LEFT JOIN
doctor_record doc_reco
ON
doct.gmc_number = doc_reco.gmc_number
GROUP BY
doc_reco.medical_school_id,
doc_reco.graduation_dt
) cohort
ON
cohort.medical_school_id = label.medical_school_id
AND
cohort.graduation_dt = label.graduation_dt
How about something like this?
select a.doctor_id
, count(*) - 1
from doctor_record a
left join doctor_record b on a.medical_school_id = b.medical_school_id
and a.graduation_dt = b.graduation_dt
group by a.doctor_id
Subtract 1 from the count so that you're not counting the doctor in the "other folks in same cohort" number
I'm defining "same cohort" as "same medical school & graduation date".
I'm unclear on what GMC number is and how it is related. Is it something to do with cohort?

SQL Multiple Joins not working as expected

The following query does not work when I try to join all 4 tables (it runs for over an hour and I eventually have to kill it without any data being returned).
It works when Tables 1, 2 and 3 are joined, and also when Tables 1, 2 and 4 are joined, but not when I attempt to join all 4 tables as below.
Select * From
(Select
R.ID, R.MId, R.RId, R.F_Name, R.F_Value, FE.FullEval, M.Name, RC.CC
FROM Table1 as R
Inner Join Table2 FE
ON R.ID = FE.RClId and R.MId = FE.MId and R.RId = FE.RId
Inner Join Table3 as M
ON R.MId = M.MId and FE.MId = M.MId
Inner Join Table4 as RC
ON R.RId = RC.RId and FE.RId = RC.RId and FE.Date = RC.Date
) AS a
NOTE:
1) RId is not available in table3.
2) MId is not available in table4.
Thanks for help.
Since you mentioned that you don't have permission to view the query plan, try breaking the query down into one join per step. That lets you check which join is slow to return records, and from there you can investigate the data to see why. It may be related to the keys that are missing from Table3 and Table4 (RId and MId respectively).
WITH Tab1_2 AS
(SELECT r.ID, r.MId, r.RId, r.F_Name, r.F_Value, fe.FullEval, fe.date
FROM Table1 as r
INNER JOIN Table2 fe
ON r.ID = fe.RClId
AND r.MId = fe.MId
AND r.RId = fe.RId
WHERE ... -- place your conditions if any
),
Tab12_3 AS
(SELECT t12.*, m.Name
FROM Tab1_2 t12
INNER JOIN Table3 as m
ON t12.MId = m.MId
WHERE ... -- place your conditions if any
),
Tab123_4 AS
(SELECT t123.ID, t123.MId, t123.RId, t123.F_Name, t123.F_Value, t123.FullEval, rc.CC
FROM Tab12_3 t123
INNER JOIN Table4 as rc
ON t123.RId = rc.RId
AND t123.Date = rc.Date
WHERE ... -- place your conditions if any
)
SELECT *
FROM Tab123_4 t1234
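One way to act on the step-by-step idea is to count rows after each join is added. The sketch below does this with Python's sqlite3 module on made-up data in which Table3 and Table4 deliberately contain several rows per join key; the real tables may behave differently. A row count that jumps at one stage points at the join that multiplies rows.

```python
import sqlite3

# Made-up rows: Table3 has 3 rows per MId and Table4 has 2 rows per (RId, Date).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (ID INTEGER, MId INTEGER, RId INTEGER);
CREATE TABLE Table2 (RClId INTEGER, MId INTEGER, RId INTEGER, Date TEXT);
CREATE TABLE Table3 (MId INTEGER, Name TEXT);
CREATE TABLE Table4 (RId INTEGER, Date TEXT, CC TEXT);
INSERT INTO Table1 VALUES (1, 10, 100), (2, 10, 100);
INSERT INTO Table2 VALUES (1, 10, 100, '2020-01-01'), (2, 10, 100, '2020-01-01');
INSERT INTO Table3 VALUES (10, 'm-a'), (10, 'm-b'), (10, 'm-c');
INSERT INTO Table4 VALUES (100, '2020-01-01', 'x'), (100, '2020-01-01', 'y');
""")

# Each stage appends one more join to the FROM clause built so far.
stages = [
    ("Table1 + Table2",
     "FROM Table1 r JOIN Table2 fe "
     "ON r.ID = fe.RClId AND r.MId = fe.MId AND r.RId = fe.RId"),
    ("+ Table3", "JOIN Table3 m ON r.MId = m.MId"),
    ("+ Table4", "JOIN Table4 rc ON r.RId = rc.RId AND fe.Date = rc.Date"),
]

counts = {}
from_clause = ""
for label, clause in stages:
    from_clause += " " + clause
    counts[label] = conn.execute("SELECT COUNT(*) " + from_clause).fetchone()[0]

for label, n in counts.items():
    print(label, n)
```

In this toy data the two partial-key joins multiply each other, which is one possible reason why joining three tables is fast while joining all four is not.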

SELECT records with condition that filters the last chronological multiple and specific value of a column

I have a joined table that looks like this:
My goal is to return all records that were created after the last 'active' value in the LineStatusName column (the rows marked in yellow in the attached image).
Here is what I have done so far. It almost works as desired, but the date returned by the nested SELECT statement is not the latest 'active' date, and if I add ORDER BY Changes.ChangeDateTime at the end of the nested SELECT I get this error:
Conversion failed when converting the nvarchar value '30-9000241' to data type int.
I will be grateful if someone can suggest a better solution to achieve that task or to improve my query.
SELECT Orders.OrderID,LineStatuses.LineStatusName,OrderTypes.OrderTypeName,
Changes.ChangeDateTime,Orders.ProjectNumber,Changes.Comments,Changes.ChangeTypeID
FROM Orders
INNER JOIN Changes ON Changes.ItemID = Orders.OrderID
INNER JOIN LineStatusSettings ON LineStatusSettings.LineStatusSettingID = Changes.NewValue
INNER JOIN LineStatuses ON LineStatuses.LineStatusID= LineStatusSettings.LineStatusID
INNER JOIN OrderTypes ON OrderTypes.OrderTypeID = LineStatusSettings.OrderTypeID
WHERE Orders.OrderID = 194 AND Orders.Deleted=0
AND
Changes.ChangeDateTime > (
SELECT TOP 1 Changes.ChangeDateTime
FROM Orders
INNER JOIN Changes ON Changes.ItemID = Orders.OrderID
INNER JOIN LineStatusSettings ON LineStatusSettings.LineStatusSettingID = Changes.NewValue
INNER JOIN LineStatuses ON LineStatuses.LineStatusID= LineStatusSettings.LineStatusID
INNER JOIN OrderTypes ON OrderTypes.OrderTypeID = LineStatusSettings.OrderTypeID
WHERE LineStatuses.LineStatusName = 'active'
) AND OrderTypes.OrderTypeName NOT IN ('disconnected line')
ORDER BY Changes.ChangeDateTime
Here is one method:
with jt as (
<your query here>
)
select jt.*
from jt
where jt.ChangeDateTime > (select max(jt2.ChangeDateTime)
from jt jt2
where jt2.OrderID = jt.OrderID and jt2.LineStatusName = 'active'
);
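As a quick check of the idea, here is a minimal sketch in Python's sqlite3 module rather than SQL Server. The single changes table with its snake_case columns is a hypothetical stand-in for the joined result in the question, and the rows are invented. The correlated MAX() finds the latest 'active' change per order, and the outer query keeps only the rows after it.

```python
import sqlite3

# Hypothetical flattened version of the joined table from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE changes (order_id INTEGER, line_status_name TEXT, change_dt TEXT);
INSERT INTO changes VALUES
    (194, 'pending',   '2021-01-01'),
    (194, 'active',    '2021-01-02'),
    (194, 'suspended', '2021-01-03'),
    (194, 'active',    '2021-01-04'),
    (194, 'suspended', '2021-01-05'),
    (194, 'closed',    '2021-01-06'),
    (195, 'pending',   '2021-01-20'),
    (195, 'active',    '2021-02-01');
""")

rows = conn.execute("""
WITH jt AS (
    SELECT order_id, line_status_name, change_dt FROM changes
)
SELECT jt.order_id, jt.line_status_name, jt.change_dt
FROM jt
WHERE jt.change_dt > (SELECT MAX(jt2.change_dt)
                      FROM jt jt2
                      WHERE jt2.order_id = jt.order_id
                        AND jt2.line_status_name = 'active')
ORDER BY jt.order_id, jt.change_dt
""").fetchall()

for row in rows:
    print(row)
```

Orders that have never been 'active' return no rows here, because the subquery yields NULL and the comparison is then not true. String comparison is case-sensitive in SQLite by default, so the literal has to match the stored value exactly.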