How to improve SELECT statements with multiple sub-queries - sql

How can I improve the following SQL statements? I have tried using EXCEPT, but it does not work. Any suggestions or advice are welcome!
SELECT COUNT(*)
FROM ( SELECT L_ORDERKEY, COUNT(*)
FROM LINEITEM
GROUP BY L_ORDERKEY
HAVING COUNT(*) > (SELECT DISTINCT TSIZE
FROM LINEITEM) );
SELECT LINEITEM.L_ORDERKEY, LINEITEM.L_LINENUMBER
FROM LINEITEM JOIN PART
ON LINEITEM.L_PARTKEY = PART.P_PARTKEY
WHERE PART.P_PARTKEY IN (46557,20193,19110,45690,45123)
MINUS
(SELECT LINEITEM.L_ORDERKEY, LINEITEM.L_LINENUMBER
FROM LINEITEM JOIN PART
ON LINEITEM.L_PARTKEY = PART.P_PARTKEY
WHERE PART.P_PARTKEY IN (46557,20193,19110,45690,45123)
MINUS
SELECT LINEITEM.L_ORDERKEY, LINEITEM.L_LINENUMBER
FROM LINEITEM JOIN SUPPLIER
ON LINEITEM.L_SUPPKEY = SUPPLIER.S_SUPPKEY
WHERE SUPPLIER.S_SUPPKEY IN (4567,2323,1987,2194,1111)
);

The second query computes A MINUS (A MINUS B), which is the intersection of the two sets, so you can use EXISTS for it:
SELECT L.L_ORDERKEY, L.L_LINENUMBER
FROM LINEITEM L
JOIN PART P
ON L.L_PARTKEY = P.P_PARTKEY
WHERE P.P_PARTKEY IN (46557, 20193, 19110, 45690, 45123)
AND EXISTS
(SELECT 0
FROM SUPPLIER S
WHERE S.S_SUPPKEY IN (4567, 2323, 1987, 2194, 1111)
AND L.L_SUPPKEY = S.S_SUPPKEY );
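As a quick check of what the nested MINUS in the question actually returns, here is a minimal sketch using Python's standard-library sqlite3 module. It assumes a cut-down LINEITEM table with made-up rows, filters directly on L_PARTKEY and L_SUPPKEY instead of joining PART and SUPPLIER, shortens the key lists, and uses EXCEPT because SQLite has no MINUS keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE LINEITEM (L_ORDERKEY INT, L_LINENUMBER INT, L_PARTKEY INT, L_SUPPKEY INT);
INSERT INTO LINEITEM VALUES
  (1, 1, 46557, 4567),  -- part listed, supplier listed
  (1, 2, 46557, 9999),  -- part listed, supplier not listed
  (2, 1, 11111, 4567),  -- part not listed, supplier listed
  (2, 2, 11111, 9999);  -- neither listed
""")

# A: line items for the listed parts; B: line items for the listed suppliers.
A = "SELECT L_ORDERKEY, L_LINENUMBER FROM LINEITEM WHERE L_PARTKEY IN (46557, 20193)"
B = "SELECT L_ORDERKEY, L_LINENUMBER FROM LINEITEM WHERE L_SUPPKEY IN (4567, 2323)"

# SQLite does not accept a parenthesised compound SELECT after EXCEPT,
# so the inner difference is wrapped in a derived table.
nested = conn.execute(f"{A} EXCEPT SELECT * FROM ({A} EXCEPT {B})").fetchall()

# The same rows in a single pass: both filters must hold (A intersect B).
single_pass = conn.execute("""
SELECT L_ORDERKEY, L_LINENUMBER
FROM LINEITEM
WHERE L_PARTKEY IN (46557, 20193)
  AND L_SUPPKEY IN (4567, 2323)
""").fetchall()

# For comparison: the plain difference A EXCEPT B, which is what NOT EXISTS would give.
difference = conn.execute(f"{A} EXCEPT {B}").fetchall()

print(nested, single_pass, difference)
```

On this toy data the nested form and the single-pass filter agree, while the plain difference returns a different row. Note that MINUS/EXCEPT also removes duplicates, which only matters if (L_ORDERKEY, L_LINENUMBER) is not unique.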

Related

How to convert a query into a nested query

How do I change this query into a nested query?
The query and tables are listed below.
SELECT
Nation.N_NAME as "nation",
ROUND(
SUM(
Lineitem.L_QUANTITY * (Lineitem.L_EXTENDEDPRICE - Lineitem.L_DISCOUNT)
), 2
) AS "order size"
FROM Nation
JOIN Supplier ON Nation.N_NATIONKEY = Supplier.S_NATIONKEY
JOIN Customer ON Supplier.S_NATIONKEY = Customer.C_NATIONKEY
JOIN Orders ON Customer.C_CUSTKEY = Orders.O_CUSTKEY
JOIN Lineitem ON Orders.O_ORDERKEY = Lineitem.L_ORDERKEY
WHERE Lineitem.L_SUPPKEY = Supplier.S_SUPPKEY
GROUP BY Nation.N_NAME
;
The tables are as follows:
Nation : N_NATIONKEY, N_NAME
Supplier : S_SUPPKEY, S_NAME, S_NATIONKEY
Customer : C_CUSTKEY, C_NAME, C_NATIONKEY
Orders: O_ORDERKEY, O_CUSTKEY
Lineitem: L_ORDERKEY, L_SUPPKEY, L_QUANTITY, L_EXTENDEDPRICE, L_DISCOUNT
I'm not sure exactly what kind of nested join you're looking for, but here is one option:
SELECT
src.N_NAME as "nation",
ROUND(
SUM(
lineitem.L_QUANTITY * (lineitem.L_EXTENDEDPRICE - lineitem.L_DISCOUNT)
), 2
) AS "order size"
FROM lineitem -- Get line items
INNER JOIN (
SELECT nation.N_NAME, supplier.S_SUPPKEY, Orders.O_ORDERKEY
FROM nation
JOIN Supplier ON Nation.N_NATIONKEY = Supplier.S_NATIONKEY
JOIN Customer ON Supplier.S_NATIONKEY = Customer.C_NATIONKEY
JOIN Orders ON Customer.C_CUSTKEY = Orders.O_CUSTKEY
) src ON lineitem.L_SUPPKEY = src.S_SUPPKEY AND lineitem.L_ORDERKEY = src.O_ORDERKEY
GROUP BY src.N_NAME
I haven't tested it but give it a try and see if it works. If it doesn't give you what you want, please post some sample data.
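Since the answer above is untested, here is a small sanity check, sketched with Python's sqlite3 module. The table layouts follow the question; the sample rows and the short aliases (n, s, c, o, l) are made up for illustration, and the result is only evidence for this toy data set, not a proof.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Nation   (N_NATIONKEY INT, N_NAME TEXT);
CREATE TABLE Supplier (S_SUPPKEY INT, S_NAME TEXT, S_NATIONKEY INT);
CREATE TABLE Customer (C_CUSTKEY INT, C_NAME TEXT, C_NATIONKEY INT);
CREATE TABLE Orders   (O_ORDERKEY INT, O_CUSTKEY INT);
CREATE TABLE Lineitem (L_ORDERKEY INT, L_SUPPKEY INT, L_QUANTITY REAL,
                       L_EXTENDEDPRICE REAL, L_DISCOUNT REAL);
INSERT INTO Nation   VALUES (1, 'FRANCE'), (2, 'JAPAN');
INSERT INTO Supplier VALUES (10, 's1', 1), (20, 's2', 2);
INSERT INTO Customer VALUES (100, 'c1', 1), (200, 'c2', 2);
INSERT INTO Orders   VALUES (1000, 100), (2000, 200);
INSERT INTO Lineitem VALUES (1000, 10, 2, 50.0, 5.0),
                            (2000, 20, 1, 30.0, 0.0),
                            (1000, 20, 3, 10.0, 1.0);  -- supplier from another nation
""")

SIZE = "ROUND(SUM(l.L_QUANTITY * (l.L_EXTENDEDPRICE - l.L_DISCOUNT)), 2)"

# Original flat join from the question.
flat = conn.execute(f"""
SELECT n.N_NAME, {SIZE}
FROM Nation n
JOIN Supplier s ON n.N_NATIONKEY = s.S_NATIONKEY
JOIN Customer c ON s.S_NATIONKEY = c.C_NATIONKEY
JOIN Orders o ON c.C_CUSTKEY = o.O_CUSTKEY
JOIN Lineitem l ON o.O_ORDERKEY = l.L_ORDERKEY
WHERE l.L_SUPPKEY = s.S_SUPPKEY
GROUP BY n.N_NAME
""").fetchall()

# Nested version: Lineitem joined to a derived table (src).
nested = conn.execute(f"""
SELECT src.N_NAME, {SIZE}
FROM Lineitem l
JOIN (
    SELECT n.N_NAME, s.S_SUPPKEY, o.O_ORDERKEY
    FROM Nation n
    JOIN Supplier s ON n.N_NATIONKEY = s.S_NATIONKEY
    JOIN Customer c ON s.S_NATIONKEY = c.C_NATIONKEY
    JOIN Orders o ON c.C_CUSTKEY = o.O_CUSTKEY
) src ON l.L_SUPPKEY = src.S_SUPPKEY AND l.L_ORDERKEY = src.O_ORDERKEY
GROUP BY src.N_NAME
""").fetchall()

print(sorted(flat), sorted(nested))
```

Both forms apply the same join conditions, so they should agree on any data; the derived table only changes where the conditions are written.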

Aggregate function results in select statement

Hopefully the code below should demonstrate what I'm trying to achieve.
The issue is that none of the input selects are resolved by the time I try to calculate VatableCash so I get "Invalid Column" when trying to select it.
Sorry if there's something plainly obvious I can do here. SQL isn't one of my strong suits.
select
OrderHeader.ID,
sum(OrderLine.NetPrice) as OrderLineNetPrice,
sum(OrderLine.GrossPrice) as OrderLineGrossPrice,
sum(
case when PaymentOption_ID = 8
then Payment.Amount
else 0
end
) as TotalCashAmount,
((OrderLineGrossPrice - OrderLineNetPrice) / OrderLineGrossPrice) * TotalCashAmount as VatableCash
from OrderHeader
inner join Payment on Payment.OrderHeader_ID = OrderHeader.ID
inner join OrderLine on OrderLine.OrderHeader_ID = OrderHeader.ID
group by OrderHeader.ID
You need a subquery or a CTE, because a column alias cannot be referenced in the same SELECT list that defines it.
You can try this:
;WITH CTE AS
(
select
OrderHeader.ID,
sum(OrderLine.NetPrice) as OrderLineNetPrice,
sum(OrderLine.GrossPrice) as OrderLineGrossPrice,
sum(
case when PaymentOption_ID = 8
then Payment.Amount
else 0
end
) as TotalCashAmount
from OrderHeader
inner join Payment on Payment.OrderHeader_ID = OrderHeader.ID
inner join OrderLine on OrderLine.OrderHeader_ID = OrderHeader.ID
group by OrderHeader.ID
)
SELECT *,
((OrderLineGrossPrice - OrderLineNetPrice) / OrderLineGrossPrice) * TotalCashAmount as VatableCash
FROM CTE
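To see the two-step shape in action, here is a minimal sketch using Python's sqlite3 module (SQLite also supports WITH). The sample rows and the CTE name Totals are invented for illustration, and each order has exactly one payment and one order line, so the join does not multiply any sums.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE OrderHeader (ID INT);
CREATE TABLE OrderLine (OrderHeader_ID INT, NetPrice REAL, GrossPrice REAL);
CREATE TABLE Payment (OrderHeader_ID INT, PaymentOption_ID INT, Amount REAL);
INSERT INTO OrderHeader VALUES (1), (2);
INSERT INTO OrderLine VALUES (1, 75.0, 100.0), (2, 50.0, 100.0);
INSERT INTO Payment VALUES (1, 8, 40.0), (2, 3, 100.0);
""")

rows = conn.execute("""
WITH Totals AS (
    -- Step 1: compute and name the aggregates.
    SELECT OrderHeader.ID AS ID,
           SUM(OrderLine.NetPrice)   AS OrderLineNetPrice,
           SUM(OrderLine.GrossPrice) AS OrderLineGrossPrice,
           SUM(CASE WHEN Payment.PaymentOption_ID = 8
                    THEN Payment.Amount ELSE 0 END) AS TotalCashAmount
    FROM OrderHeader
    INNER JOIN Payment   ON Payment.OrderHeader_ID = OrderHeader.ID
    INNER JOIN OrderLine ON OrderLine.OrderHeader_ID = OrderHeader.ID
    GROUP BY OrderHeader.ID
)
-- Step 2: the aliases are ordinary columns here and can be combined freely.
SELECT ID, OrderLineNetPrice, OrderLineGrossPrice, TotalCashAmount,
       ((OrderLineGrossPrice - OrderLineNetPrice) / OrderLineGrossPrice)
           * TotalCashAmount AS VatableCash
FROM Totals
ORDER BY ID
""").fetchall()

for row in rows:
    print(row)
```

If an order can have several payments and several lines, joining both tables before aggregating repeats each payment once per line, so it may be safer to aggregate Payment and OrderLine separately before combining them.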
Love the cross apply! Use it whenever you want some handy extra per-row columns. Aggregates over those columns still have to be repeated wherever they are needed, because an alias cannot be reused within the same select list.
select
OrderHeader.ID,
sum(OrderLine.NetPrice) as OrderLineNetPrice,
sum(OrderLine.GrossPrice) as OrderLineGrossPrice,
sum(cash.Amount) as TotalCashAmount,
((sum(OrderLine.GrossPrice) - sum(OrderLine.NetPrice)) / sum(OrderLine.GrossPrice)) * sum(cash.Amount) as VatableCash
from OrderHeader
inner join Payment on Payment.OrderHeader_ID = OrderHeader.ID
inner join OrderLine on OrderLine.OrderHeader_ID = OrderHeader.ID
cross apply ( select
case when Payment.PaymentOption_ID = 8
then Payment.Amount
else 0
end
) as cash(Amount)
group by OrderHeader.ID

missing expression in Select SQL

I'm new to SQL and subqueries. When I run the subquery by itself I get the correct output, but when I run the full query I get this error message:
SELECT * Error at line 3: ORA-00936: missing expression
Here's my code:
SELECT
MAX(
SELECT
SUM(
ALLOCATION.HourlyRate
*
ACTION.HrsWorked
)
FROM ALLOCATION
INNER JOIN ACTION
ON ((ACTION.ActId = ALLOCATION.ActId) AND (ACTION.EmpId = ALLOCATION.EmpId))
GROUP BY (ALLOCATION.ActId)
)
FROM ALLOCATION
GROUP BY (ALLOCATION.ActId)
SOLUTION:
SELECT MAX(sum_total_pay)
FROM
(
SELECT SUM(ALLOCATION.HourlyRate * ACTION.HrsWorked) AS sum_total_pay
FROM ALLOCATION
INNER JOIN ACTION
ON ((ACTION.ActId = ALLOCATION.ActId) AND (ACTION.EmpId = ALLOCATION.EmpId))
GROUP BY (ALLOCATION.ActId)
);
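The derived-table shape of this solution can be tried out with a minimal sketch in Python's sqlite3 module. The sample rows are made up, the constant name PER_ACTIVITY is only for this sketch, and ACTION is quoted because it is a keyword in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ALLOCATION (ActId INT, EmpId INT, HourlyRate REAL);
CREATE TABLE "ACTION"   (ActId INT, EmpId INT, HrsWorked REAL);
INSERT INTO ALLOCATION VALUES (1, 1, 10.0), (1, 2, 20.0), (2, 1, 15.0);
INSERT INTO "ACTION"   VALUES (1, 1, 5.0),  (1, 2, 2.0),  (2, 1, 4.0);
""")

# Inner query: one row per activity with its total pay.
PER_ACTIVITY = """
SELECT ALLOCATION.ActId AS ActId,
       SUM(ALLOCATION.HourlyRate * "ACTION".HrsWorked) AS sum_total_pay
FROM ALLOCATION
INNER JOIN "ACTION"
   ON "ACTION".ActId = ALLOCATION.ActId AND "ACTION".EmpId = ALLOCATION.EmpId
GROUP BY ALLOCATION.ActId
"""

# Outer query: aggregate over the per-activity rows.
max_pay = conn.execute(
    f"SELECT MAX(sum_total_pay) FROM ({PER_ACTIVITY})"
).fetchone()[0]

# Variant that also reports which activity has the highest total.
top = conn.execute(
    f"SELECT ActId, sum_total_pay FROM ({PER_ACTIVITY}) "
    "ORDER BY sum_total_pay DESC LIMIT 1"
).fetchone()

print(max_pay, top)
```

The point is that MAX is applied to the rows produced by the inner query, rather than taking a SELECT as its argument.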
You need two more parentheses :)
SELECT MAX ( ( SELECT SUM (ALLOCATION.HourlyRate * ACTION.HrsWorked)
FROM ALLOCATION INNER JOIN ACTION ON (ACTION.ActId = ALLOCATION.ActId) AND (ACTION.EmpId = ALLOCATION.EmpId)
GROUP BY (ALLOCATION.ActId)))
FROM ALLOCATION
GROUP BY ALLOCATION.ActId
The SELECT inside MAX has to be in parentheses.
The logic is not clear. This gives you the total pay per activity:
SELECT ALLOCATION.ActId, SUM (ALLOCATION.HourlyRate * ACTION.HrsWorked) total_pay
FROM ALLOCATION INNER JOIN ACTION ON ACTION.ActId = ALLOCATION.ActId AND ACTION.EmpId = ALLOCATION.EmpId
GROUP BY ALLOCATION.ActId
If you want to see which activity has the biggest payment, order by total_pay descending:
SELECT *
FROM ( SELECT ALLOCATION.ActId, SUM (ALLOCATION.HourlyRate * ACTION.HrsWorked) total_pay
FROM ALLOCATION INNER JOIN ACTION ON ACTION.ActId = ALLOCATION.ActId AND ACTION.EmpId = ALLOCATION.EmpId
GROUP BY ALLOCATION.ActId)
ORDER BY total_pay DESC

Need to speed up the results of this SQL statement. Any advice?

I've got the following SQL Statement that needs some major speed up. The problem is I need to search on two fields, where each of them is calling several sub-selects. Is there a way to join the two fields together so I call the sub-selects only once?
SELECT billyr, billno, propacct, vinid, taxpaid, duedate, datepif, propdesc
FROM trcdba.billspaid
WHERE date(datepif) > '01/06/2009'
AND date(datepif) <= '01/06/2010'
AND custno in
(select custno from cwdba.txpytaxid where taxpayerno in
(select taxpayerno from cwdba.txpyaccts where accountno in
(select accountno from rtadba.reasacct where controlno = 1234567)))
OR custno2 in
(select custno from cwdba.txpytaxid where taxpayerno in
(select taxpayerno from cwdba.txpyaccts where accountno in
(select accountno from rtadba.reasacct where controlno = 1234567)))
I would use joins instead of the embedded sub-queries.
When you apply a function to the column:
date(datepif) > '01/06/2009'
AND date(datepif) <= '01/06/2010'
an ordinary index on datepif will usually not be used. Try something like this instead:
datepif > someconversionhere('01/06/2009')
AND datepif <= someconversionhere('01/06/2010')
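The effect of wrapping the column in a function can be observed with a small sketch in Python's sqlite3 module. SQLite is only a stand-in here: the question's database is a different product, and the table layout, index name, and ISO date strings below are assumptions made for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE billspaid (billno INTEGER PRIMARY KEY, custno INT, datepif TEXT);
CREATE INDEX idx_billspaid_datepif ON billspaid (datepif);
""")

def plan(where_clause):
    """Return SQLite's query plan text for a SELECT with the given WHERE clause."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT billno, custno FROM billspaid WHERE " + where_clause
    ).fetchall()
    # The human-readable plan step is the fourth column of each row.
    return " | ".join(row[3] for row in rows)

# Function applied to the column: the planner cannot use the index for the range.
wrapped_plan = plan("date(datepif) > '2009-06-01' AND date(datepif) <= '2010-06-01'")

# Bare column compared with constants: the index can drive a range search.
sargable_plan = plan("datepif > '2009-06-01' AND datepif <= '2010-06-01'")

print("wrapped: ", wrapped_plan)
print("sargable:", sargable_plan)
```

The exact plan wording varies between SQLite versions, but the wrapped form should show a full scan and the bare-column form a search using idx_billspaid_datepif. Other engines report this differently, so check the plan on the actual database.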
Use inner joins too. The question gives no information about table sizes or indexes, so this is a guess. It should work best if far more rows in billspaid fall in the date range than match the joined tables for r.controlno = 1234567, which I suspect is the case:
SELECT
COALESCE(b1.billyr,b2.billyr) AS billyr
,COALESCE(b1.billno,b2.billno) AS billno
,COALESCE(b1.propacct,b2.propacct) AS propacct
,COALESCE(b1.vinid,b2.vinid) AS vinid
,COALESCE(b1.taxpaid,b2.taxpaid) AS taxpaid
,COALESCE(b1.duedate,b2.duedate) AS duedate
,COALESCE(b1.datepif,b2.datepif) AS datepif
,COALESCE(b1.propdesc,b2.propdesc) AS propdesc
FROM rtadba.reasacct r
INNER JOIN cwdba.txpyaccts a ON r.accountno=a.accountno
INNER JOIN cwdba.txpytaxid t ON a.taxpayerno=t.taxpayerno
LEFT OUTER JOIN trcdba.billspaid b1 ON t.custno=b1.custno AND b1.datepif > someconversionhere('01/06/2009') AND b1.datepif <= someconversionhere('01/06/2010')
LEFT OUTER JOIN trcdba.billspaid b2 ON t.custno=b2.custno2 AND b2.datepif > someconversionhere('01/06/2009') AND b2.datepif <= someconversionhere('01/06/2010')
WHERE r.controlno = 1234567
AND COALESCE(b1.custno,b2.custno) IS NOT NULL
Create an index for each of these:
rtadba.reasacct.controlno and cover on accountno
cwdba.txpyaccts.accountno and cover on taxpayerno
cwdba.txpytaxid.taxpayerno and cover on custno
trcdba.billspaid.custno +datepif
trcdba.billspaid.custno2 +datepif
Here's the same thing using JOIN instead of sub queries.
SELECT billyr, billno, propacct, vinid, taxpaid, duedate, datepif, propdesc
FROM billspaid
INNER JOIN txpytaxid
ON txpytaxid.custno = billspaid.custno OR txpytaxid.custno = billspaid.custno2
INNER JOIN txpyaccts
ON txpyaccts.taxpayerno = txpytaxid.taxpayerno
INNER JOIN reasacct
ON reasacct.accountno = txpyaccts.accountno AND reasacct.controlno = 1234567
WHERE date(datepif) > '01/06/2009'
AND date(datepif) <= '01/06/2010'
However, if the OR in the JOIN is giving you performance problems, you can always try using a union:
(SELECT billyr, billno, propacct, vinid, taxpaid, duedate, datepif, propdesc
FROM billspaid
INNER JOIN txpytaxid
ON txpytaxid.custno = billspaid.custno
INNER JOIN txpyaccts
ON txpyaccts.taxpayerno = txpytaxid.taxpayerno
INNER JOIN reasacct
ON reasacct.accountno = txpyaccts.accountno AND reasacct.controlno = 1234567
WHERE date(datepif) > '01/06/2009'
AND date(datepif) <= '01/06/2010')
UNION
(SELECT billyr, billno, propacct, vinid, taxpaid, duedate, datepif, propdesc
FROM billspaid
INNER JOIN txpytaxid
ON txpytaxid.custno = billspaid.custno2
INNER JOIN txpyaccts
ON txpyaccts.taxpayerno = txpytaxid.taxpayerno
INNER JOIN reasacct
ON reasacct.accountno = txpyaccts.accountno AND reasacct.controlno = 1234567
WHERE date(datepif) > '01/06/2009'
AND date(datepif) <= '01/06/2010')
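Use EXISTS instead of IN (unless the result set of the IN subquery is very small).
If you use UNION instead of OR, consider UNION ALL, which skips the duplicate-removal step. It is only equivalent to the OR version if no row can match both branches; otherwise that row is returned twice.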
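Whether UNION ALL can stand in for OR depends on whether a row can satisfy both branches. Here is a minimal sketch in Python's sqlite3 module, with a made-up three-column billspaid table and a single matching customer number (1) in place of the nested lookups:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE billspaid (billno INT, custno INT, custno2 INT);
INSERT INTO billspaid VALUES
  (100, 1, 1),  -- matches on both custno and custno2
  (101, 1, 7),  -- matches on custno only
  (102, 8, 1),  -- matches on custno2 only
  (103, 8, 7);  -- matches neither
""")

first = "SELECT billno FROM billspaid WHERE custno = 1"
second = "SELECT billno FROM billspaid WHERE custno2 = 1"

with_or = sorted(r[0] for r in conn.execute(
    "SELECT billno FROM billspaid WHERE custno = 1 OR custno2 = 1"))
with_union = sorted(r[0] for r in conn.execute(f"{first} UNION {second}"))
with_union_all = sorted(r[0] for r in conn.execute(f"{first} UNION ALL {second}"))

print(with_or)         # each matching bill once
print(with_union)      # duplicates removed
print(with_union_all)  # bill 100 appears twice
```

Bill 100 matches both branches, so UNION ALL returns it twice while OR and UNION return it once. Keep in mind that UNION also collapses rows that were genuine duplicates in the OR result, so it is not always an exact replacement either.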

Using JOINS in MySQL

I have this query which works perfectly:
SELECT *
FROM Customer
WHERE SacCode IN
(
SELECT SacCode
FROM SacCode
WHERE ResellerCorporateID = 392
ORDER BY SacCode
)
AND CustomerID IN
(
SELECT CxID
FROM CustAppointments
WHERE AppRoomID IN
(
SELECT AppRoomID
FROM ClinicRooms
WHERE ClinID IN
(
SELECT ClinID
FROM AppClinics
WHERE ClinDate >='20090101'
AND ClinDate <='20091119'
)
)
)
However, I need to see the value of ClinDate (inside the last nested query) so I've been told I need to rework the query using JOINS.
I have no idea how, can someone help please?
Thanks.
Here's a start:
SELECT *
FROM Customer c
INNER JOIN CustAppointments ca ON ca.CxId = c.CustomerID
INNER JOIN ClinicRooms cr ON cr.AppRoomID = ca.AppRoomID
INNER JOIN AppClinics ac ON ac.ClinID = cr.ClinID
WHERE ac.ClinDate BETWEEN '20090101' AND '20091119'
AND c.SacCode IN (SELECT sc.SacCode
FROM SacCode sc
WHERE sc.ResellerCorporateID = 392)
This will allow you to select columns from AppClinics.
Read this http://www.w3schools.com/Sql/sql_join.asp
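To illustrate how the join form exposes ClinDate, here is a minimal sketch in Python's sqlite3 module rather than MySQL. The table and column names come from the question; the column types, the sample rows, and the choice to select only CustomerID and ClinDate are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (CustomerID INT, SacCode TEXT);
CREATE TABLE SacCode (SacCode TEXT, ResellerCorporateID INT);
CREATE TABLE CustAppointments (CxID INT, AppRoomID INT);
CREATE TABLE ClinicRooms (AppRoomID INT, ClinID INT);
CREATE TABLE AppClinics (ClinID INT, ClinDate TEXT);
INSERT INTO Customer VALUES (1, 'A'), (2, 'B'), (3, 'A');
INSERT INTO SacCode VALUES ('A', 392), ('B', 1);
INSERT INTO CustAppointments VALUES (1, 10), (2, 10), (3, 11);
INSERT INTO ClinicRooms VALUES (10, 5), (11, 6);
INSERT INTO AppClinics VALUES (5, '20090315'), (6, '20100101');
""")

rows = conn.execute("""
SELECT c.CustomerID, ac.ClinDate
FROM Customer c
INNER JOIN CustAppointments ca ON ca.CxID = c.CustomerID
INNER JOIN ClinicRooms cr ON cr.AppRoomID = ca.AppRoomID
INNER JOIN AppClinics ac ON ac.ClinID = cr.ClinID
WHERE ac.ClinDate BETWEEN '20090101' AND '20091119'
  AND c.SacCode IN (SELECT sc.SacCode
                    FROM SacCode sc
                    WHERE sc.ResellerCorporateID = 392)
ORDER BY c.CustomerID
""").fetchall()

print(rows)
```

Customer 2 is dropped by the SacCode filter and customer 3 by the date range. Unlike the IN version, the join returns one row per matching appointment, so a customer with several appointments in the range appears several times, once per ClinDate.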