How could I optimize this dynamic SQL query? - sql

I've been having difficulties with the following query, which I'd like to optimize and make more readable. Say I have three tables, orders_returned, orders_completed and orders_delivered, with matching columns order_id and customer_id. Depending on the selected options, I might need to retrieve orders that were delivered, then returned, and finally completed (the same order_id occurs in all three tables) and that have the same customer_id. I might also need only the delivered and returned orders, in which case I would omit AND order_id IN (SELECT order_id FROM ORDERS_COMPLETED) from the WHERE clause. For example: get delivered and returned orders by customers John and Tim.
As of now my query looks like this:
SELECT order_id
FROM
(
SELECT order_id, customer_id
FROM ORDERS_RETURNED
UNION
SELECT order_id, customer_id
FROM ORDERS_COMPLETED
UNION
SELECT order_id, customer_id
FROM ORDERS_DELIVERED
)
WHERE
customer_id IN ('customer1', 'customer2', ...)
AND order_id IN (SELECT order_id FROM ORDERS_RETURNED)
AND order_id IN (SELECT order_id FROM ORDERS_COMPLETED)
AND order_id IN (SELECT order_id FROM ORDERS_DELIVERED)
I'm still learning SQL and would like to see if there are better options.
EDIT: I am using Oracle database. There is also Orders table which has distinct order_ids and some other irrelevant columns. It does not store customer_ids.
Also, the order might occur in one table or in two of them only, so joins, I think, are of no use here.

Since you have an Order table, I presume you are also storing the CustomerId in that table as well. Assuming so, try this:
SELECT DISTINCT O.OrderId
FROM Orders O
LEFT JOIN Orders_Completed OC ON O.OrderId = OC.OrderId
LEFT JOIN Orders_Delivered OD ON O.OrderId = OD.OrderId
LEFT JOIN Orders_Returned ORE ON O.OrderId = ORE.OrderId
WHERE O.CustomerId IN (...)
AND OD.OrderId IS NOT NULL AND ORE.OrderId IS NOT NULL AND OC.OrderId IS NULL
This particular query will return you all distinct orders where customer in (...) where the order has been delivered and returned, but not completed. Toggle the use of the IS NULL and IS NOT NULL to get your desired output.
Good luck.

You should use joins instead of nested subqueries.
Try the query below instead:
SELECT A.order_id, A.customer_id FROM ORDERS_RETURNED A
INNER JOIN ORDERS_COMPLETED B ON A.order_id = B.order_id AND A.customer_id = B.customer_id
INNER JOIN ORDERS_DELIVERED C ON A.order_id = C.order_id AND A.customer_id = C.customer_id
Where A.customer_id IN ('customer1', 'customer2', ...)

You could do this with something like:
with data
as (select customer_id,
order_id,
nvl(max(case status when 'RETURNED' then 'Y' end), 'N') returned,
nvl(max(case status when 'COMPLETED' then 'Y' end), 'N') completed,
nvl(max(case status when 'DELIVERED' then 'Y' end), 'N') delivered
from (select 'RETURNED' status, order_id, customer_id
from orders_returned
union all
select 'COMPLETED' status, order_id, customer_id
from orders_completed
union all
select 'DELIVERED' status, order_id, customer_id
from orders_delivered)
group by customer_id, order_id)
select *
from data
where returned = 'Y'
and delivered = 'Y'
and customer_id in ('xx', 'xxx') ;
or
with data
as (select customer_id,
order_id,
max(returned) returned,
max(completed) completed,
max(delivered) delivered
from (select 'Y' returned, null completed, null delivered, order_id, customer_id
from orders_returned
union all
select null, 'Y', null, order_id, customer_id
from orders_completed
union all
select null, null, 'Y', order_id, customer_id
from orders_delivered)
group by customer_id, order_id)
select *
from data
where returned = 'Y'
and delivered = 'Y'
and customer_id in ('xx', 'xxx');
eg: http://sqlfiddle.com/#!4/3e2fb/2
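The status-flag idea above can also be driven by a list of required statuses, which is the "dynamic" part of the question. The following is only a minimal sketch, not the answerer's code: it uses Python's sqlite3 module with an in-memory SQLite database in place of Oracle, and the helper name orders_in_all, the TABLES whitelist, and the sample rows are all invented for illustration.

```python
import sqlite3

# Hypothetical whitelist of status -> table name. Table names cannot be bound
# as query parameters, so only names taken from this mapping are interpolated.
TABLES = {
    "RETURNED": "orders_returned",
    "COMPLETED": "orders_completed",
    "DELIVERED": "orders_delivered",
}

def orders_in_all(conn, statuses, customers):
    """Return (customer_id, order_id) pairs found in every requested status table."""
    statuses = list(dict.fromkeys(statuses))  # de-duplicate, keep order
    union = " UNION ALL ".join(
        f"SELECT '{s}' AS status, order_id, customer_id FROM {TABLES[s]}"
        for s in statuses  # an unknown status raises KeyError before any SQL runs
    )
    marks = ", ".join("?" for _ in customers)
    sql = f"""
        SELECT customer_id, order_id
        FROM ({union})
        WHERE customer_id IN ({marks})
        GROUP BY customer_id, order_id
        HAVING COUNT(DISTINCT status) = ?
        ORDER BY customer_id, order_id
    """
    return conn.execute(sql, [*customers, len(statuses)]).fetchall()

# Invented sample data.
conn = sqlite3.connect(":memory:")
for table in TABLES.values():
    conn.execute(f"CREATE TABLE {table} (order_id INTEGER, customer_id TEXT)")
conn.executemany("INSERT INTO orders_delivered VALUES (?, ?)",
                 [(1, "John"), (2, "John"), (3, "Tim"), (4, "Ann")])
conn.executemany("INSERT INTO orders_returned VALUES (?, ?)",
                 [(1, "John"), (3, "Tim"), (4, "Ann")])
conn.executemany("INSERT INTO orders_completed VALUES (?, ?)",
                 [(1, "John")])

delivered_and_returned = orders_in_all(conn, ["DELIVERED", "RETURNED"], ["John", "Tim"])
all_three = orders_in_all(conn, ["DELIVERED", "RETURNED", "COMPLETED"], ["John", "Tim"])
```

Only the tables that matter for the current selection are read, and the HAVING clause requires that an order appears in each of them. As in the answer above, an order that also appears in a table that was not requested is still returned.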

Related

Multiple SQL filter on same column

How do I display the customer_id of customers who bought products A and B, but didn’t buy product C, ordered by ascending customer ID.
I tried the below code, but does not give me any result.
select customer_id, product_name from orders where customer_id = 'A' and 'product_name '= 'B'
select customer_id from orders where product_name = 'A'
intersect
select customer_id from orders where product_name = 'B'
except
select customer_id from orders where product_name = 'C'
You can use analytic functions as follows:
Select * from
(select customer_id, product_name ,
Count(distinct case when product_name in ('A','B') then product_name end)
Over (partition by customer_id) as cntab ,
Count(case when product_name = 'C' then product_name end)
Over (partition by customer_id) as cntc
from orders t) t
Where cntab = 2 and cntc = 0;
One method uses EXISTS and NOT EXISTS:
select o.*
from orders o
where exists (select 1
from orders o2
where o2.customer_id = o.customer_id and o2.product_name = 'A'
) and
exists (select 1
from orders o2
where o2.customer_id = o.customer_id and o2.product_name = 'B'
) and
not exists (select 1
from orders o2
where o2.customer_id = o.customer_id and o2.product_name = 'C'
)
order by customer_id;
To me, this query reads well ...
SELECT customer_id
FROM (
SELECT TableA.customer_id
FROM (
SELECT customer_id
FROM orders
WHERE product_name = 'A'
) AS TableA
INNER JOIN (
SELECT customer_id
FROM orders
WHERE product_name = 'B'
) AS TableB
ON TableA.customer_id = TableB.customer_id
) AS TableX
WHERE customer_id NOT IN (SELECT customer_id FROM orders WHERE product_name = 'C')
Explanation. Get a list of customer ids that bought products A and B with self inner join. Then, pare that list down by removing any rows where those customers bought product C.
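A small data set makes it easy to check that the formulations agree. The sketch below is an illustration under assumptions: it runs on SQLite through Python's sqlite3 rather than the asker's (unstated) database, the sample rows are invented, and the GROUP BY / HAVING query is a further formulation that none of the answers above uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, product_name TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        (1, "A"), (1, "B"),            # bought A and B only -> qualifies
        (2, "A"), (2, "B"), (2, "C"),  # also bought C -> excluded
        (3, "A"),                      # never bought B -> excluded
        (4, "B"), (4, "A"), (4, "A"),  # repeat purchases still qualify
    ],
)

# One pass over orders: per customer, require some A, some B, and no C.
by_having = [row[0] for row in conn.execute("""
    SELECT customer_id
    FROM orders
    GROUP BY customer_id
    HAVING SUM(CASE WHEN product_name = 'A' THEN 1 ELSE 0 END) > 0
       AND SUM(CASE WHEN product_name = 'B' THEN 1 ELSE 0 END) > 0
       AND SUM(CASE WHEN product_name = 'C' THEN 1 ELSE 0 END) = 0
    ORDER BY customer_id
""")]

# The set-operator form from the question: (A INTERSECT B) EXCEPT C.
by_set_ops = [row[0] for row in conn.execute("""
    SELECT customer_id FROM orders WHERE product_name = 'A'
    INTERSECT
    SELECT customer_id FROM orders WHERE product_name = 'B'
    EXCEPT
    SELECT customer_id FROM orders WHERE product_name = 'C'
    ORDER BY customer_id
""")]
```

Both lists contain one row per customer, whereas the EXISTS and analytic-function answers return one row per matching order line, so those need DISTINCT if only customer IDs are wanted.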

Sum of all values except the first

I have the following three tables:
Customers:
Cust_ID,
Cust_Name
Products:
Prod_ID,
Prod_Price
Orders:
Order_ID,
Cust_ID,
Prod_ID,
Quantity,
Order_Date
How do I display each customer and how much they spent, excluding their very first purchase?
[A] - I can get the total by multiplying Products.Prod_Price and Orders.Quantity, then GROUP by Cust_ID
[B] - I also can get the first purchase by using TOP 1 on Order_Date for each customer.
But I couldn't figure out how to produce [A]-[B] in one query.
Any help will be greatly appreciated.
For SQL-Server 2005, 2008 and 2008R2:
; WITH cte AS
( SELECT
c.Cust_ID, c.Cust_Name,
Amount = o.Quantity * p.Prod_Price,
Rn = ROW_NUMBER() OVER (PARTITION BY c.Cust_ID
ORDER BY o.Order_Date)
FROM
Customers AS c
JOIN
Orders AS o ON o.Cust_ID = c.Cust_ID
JOIN
Products AS p ON p.Prod_ID = o.Prod_ID
)
SELECT
Cust_ID, Cust_Name,
AmountSpent = SUM(Amount)
FROM
cte
WHERE
Rn >= 2
GROUP BY
Cust_ID, Cust_Name ;
For SQL-Server 2012, using the FIRST_VALUE() analytic function:
SELECT DISTINCT
c.Cust_ID, c.Cust_Name,
AmountSpent = SUM(o.Quantity * p.Prod_Price)
OVER (PARTITION BY c.Cust_ID)
- FIRST_VALUE(o.Quantity * p.Prod_Price)
OVER (PARTITION BY c.Cust_ID
ORDER BY o.Order_Date)
FROM
Customers AS c
JOIN
Orders AS o ON o.Cust_ID = c.Cust_ID
JOIN
Products AS p ON p.Prod_ID = o.Prod_ID ;
Another way (that works in 2012 only) using OFFSET FETCH and CROSS APPLY:
SELECT
c.Cust_ID, c.Cust_Name,
AmountSpent = SUM(x.Quantity * x.Prod_Price)
FROM
Customers AS c
CROSS APPLY
( SELECT
o.Quantity, p.Prod_Price
FROM
Orders AS o
JOIN
Products AS p ON p.Prod_ID = o.Prod_ID
WHERE
o.Cust_ID = c.Cust_ID
ORDER BY
o.Order_Date
OFFSET
1 ROW
-- FETCH NEXT -- not needed,
-- 20000000000 ROWS ONLY -- can be removed
) AS x
GROUP BY
c.Cust_ID, c.Cust_Name ;
Tested at SQL-Fiddle
Note that the second solution returns also the customers with only one order (with the Amount as 0) while the other two solutions do not return those customers.
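The difference for single-order customers can be reproduced on a tiny data set. This is a hedged sketch, not the answerer's test: it runs on SQLite (3.25 or later, for window functions) through Python's sqlite3 instead of SQL Server, the rows are invented, and Order_ID is added to the ORDER BY as an assumed tie-breaker for orders placed on the same date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (Cust_ID INTEGER, Cust_Name TEXT);
    CREATE TABLE Products  (Prod_ID INTEGER, Prod_Price REAL);
    CREATE TABLE Orders    (Order_ID INTEGER, Cust_ID INTEGER, Prod_ID INTEGER,
                            Quantity INTEGER, Order_Date TEXT);
    INSERT INTO Customers VALUES (1, 'Able'), (2, 'Bob');
    INSERT INTO Products  VALUES (1, 10.0), (2, 2.5);
    INSERT INTO Orders VALUES
        (1, 1, 1, 1, '2014-01-01'),   -- Able's first order: 10.0 (excluded)
        (2, 1, 2, 4, '2014-01-05'),   -- 4 * 2.5  = 10.0
        (3, 1, 1, 2, '2014-02-01'),   -- 2 * 10.0 = 20.0
        (4, 2, 1, 3, '2014-01-03');   -- Bob's only order: 30.0 (excluded)
""")

# Every order line with its amount and its position within the customer's history.
NUMBERED = """
    SELECT c.Cust_ID, c.Cust_Name,
           o.Quantity * p.Prod_Price AS Amount,
           ROW_NUMBER() OVER (PARTITION BY c.Cust_ID
                              ORDER BY o.Order_Date, o.Order_ID) AS Rn
    FROM Customers c
    JOIN Orders   o ON o.Cust_ID = c.Cust_ID
    JOIN Products p ON p.Prod_ID = o.Prod_ID
"""

# Filtering on Rn >= 2 removes customers whose only row has Rn = 1.
filtered = conn.execute(f"""
    SELECT Cust_ID, Cust_Name, SUM(Amount)
    FROM ({NUMBERED})
    WHERE Rn >= 2
    GROUP BY Cust_ID, Cust_Name
    ORDER BY Cust_ID
""").fetchall()

# Moving the test into the aggregate keeps those customers, with a total of 0.
kept = conn.execute(f"""
    SELECT Cust_ID, Cust_Name, SUM(CASE WHEN Rn >= 2 THEN Amount ELSE 0 END)
    FROM ({NUMBERED})
    GROUP BY Cust_ID, Cust_Name
    ORDER BY Cust_ID
""").fetchall()
```

Customers with no orders at all are missing from both results because of the inner joins; a LEFT JOIN from Customers would be needed to list them.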
Which version of SQL? If 2012 you might be able to do something interesting with OFFSET 1, but I'd have to ponder much more how that works with grouping.
EDIT: Adding a 2012-specific solution inspired by @ypercube
I wanted to be able to use OFFSET 1 within the window to do it all in one step, but the syntax I want isn't valid:
SUM(o.Quantity * p.Prod_Price) OVER (PARTITION BY c.Cust_ID
ORDER BY o.Order_Date
OFFSET 1)
Instead I can specify the row frame, but then have to filter the result set down to the correct rows. The query plan is different from @ypercube's, but both show 50% when run together. Each runs about twice as fast as my original answer below.
WITH cte AS (
SELECT c.Cust_ID
,c.Cust_Name
,SUM(o.Quantity * p.Prod_Price) OVER(PARTITION BY c.Cust_ID
ORDER BY o.Order_ID
ROWS BETWEEN 1 FOLLOWING
AND UNBOUNDED FOLLOWING) AmountSpent
,rn = ROW_NUMBER() OVER(PARTITION BY c.Cust_ID ORDER BY o.Order_ID)
FROM Customers AS c
INNER JOIN
Orders AS o ON o.Cust_ID = c.Cust_ID
INNER JOIN
Products AS p ON p.Prod_ID = o.Prod_ID
)
SELECT Cust_ID
,Cust_Name
,ISNULL(AmountSpent ,0) AmountSpent
FROM cte WHERE rn=1
My more general solution is similar to peter.petrov's, but his didn't work "out of the box" on my sample data. That might be an issue with my sample data or not. Differences include use of CTE and a NOT EXISTS with a correlated subquery.
CREATE TABLE Customers (Cust_ID INT, Cust_Name VARCHAR(10))
CREATE TABLE Products (Prod_ID INT, Prod_Price MONEY)
CREATE TABLE Orders (Order_ID INT, Cust_ID INT, Prod_ID INT, Quantity INT, Order_Date DATE)
INSERT INTO Customers SELECT 1 ,'Able'
UNION SELECT 2, 'Bob'
UNION SELECT 3, 'Charlie'
INSERT INTO Products SELECT 1, 10.0
INSERT INTO Orders SELECT 1, 1, 1, 1, GetDate()
UNION SELECT 2, 1, 1, 1, GetDate()
UNION SELECT 3, 1, 1, 1, GetDate()
UNION SELECT 4, 2, 1, 1, GetDate()
UNION SELECT 5, 2, 1, 1, GetDate()
UNION SELECT 6, 3, 1, 1, GetDate()
;WITH CustomersFirstOrder AS (
SELECT Cust_ID
,MIN(Order_ID) Order_ID
FROM Orders
GROUP BY Cust_ID
)
SELECT c.Cust_ID
,c.Cust_Name
,ISNULL(SUM(Quantity * Prod_Price),0) CustomerOrderTotalAfterInitialPurchase
FROM Customers c
LEFT JOIN (
SELECT Cust_ID
,Quantity
,Prod_Price
FROM Orders o
INNER JOIN
Products p ON o.Prod_ID = p.Prod_ID
WHERE NOT EXISTS (SELECT 1 FROM CustomersFirstOrder a WHERE a.Order_ID=o.Order_ID)
) b ON c.Cust_ID = b.Cust_ID
GROUP BY c.Cust_ID
,c.Cust_Name
DROP TABLE Customers
DROP TABLE Products
DROP TABLE Orders
Try this. It should do it.
SELECT c1.cust_name ,
c1.cust_id ,
SUM(o1.Quantity * p1.Prod_Price)
FROM orders o1
JOIN products p1 ON o1.prod_id = p1.prod_id
JOIN customers c1 ON o1.cust_id = c1.cust_id
LEFT JOIN ( SELECT o2.cust_id ,
MIN(o2.Order_Date) AS Order_Date
FROM orders o2
GROUP BY o2.cust_id
) t ON o1.cust_id = t.cust_id
AND o1.Order_Date = t.Order_Date
WHERE t.Order_Date IS NULL
GROUP BY c1.cust_name ,
c1.cust_id
You have to number orders by Customer and then you can have the amount for the first order and next orders with a CTE and ROW_NUMBER() like this:
; WITH NumberedOrders
AS ( SELECT Customers.Cust_Id ,
Customers.Cust_Name ,
ROW_NUMBER() OVER ( PARTITION BY Customers.Cust_Id ORDER BY Orders.Order_Date ) AS Order_Number ,
Orders.Order_Date ,
Products.Prod_price * Orders.Quantity AS Amount
FROM Orders
INNER JOIN Customers ON Orders.Cust_Id = Customers.Cust_Id
INNER JOIN Products ON Orders.Prod_Id = Products.Prod_Id
)
SELECT Cust_Id ,
SUM(CASE WHEN Order_Number = 1 THEN Amount
ELSE 0
END) AS A_First_Order ,
SUM(CASE WHEN Order_Number = 1 THEN 0
ELSE Amount
END) AS B_Other_orders ,
SUM(Amount) AS C_All_orders
FROM NumberedOrders
GROUP BY Cust_Id
ORDER BY Cust_Id

Last order item in Oracle SQL

I need to list columns from customer table, the date from first order and all data from last one, in a 1:N relationship between customer and order tables. I'm using Oracle 10g.
What is the best way to do that?
TABLE CUSTOMER
---------------
id NUMBER
name VARCHAR2(200)
subscribe_date DATE
TABLE ORDER
---------------
id NUMBER
id_order NUMBER
purchase_date DATE
purchase_value NUMBER
Here is one way of doing it, using the row_number function, one join, and one aggregation:
select c.*,
min(o.purchase_date) as FirstPurchaseDate,
min(case when seqnum = 1 then o.id_order end) as Last_IdOrder,
min(case when seqnum = 1 then o.purchase_date end) as Last_PurchaseDate,
min(case when seqnum = 1 then o.purchase_value end) as Last_PurchaseValue
from Customer c join
(select o.*,
row_number() over (partition by o.id order by purchase_date desc) as seqnum
from orders o
) o
on c.id = o.id
group by c.id, c.name, c.subscribe_date
It's not obvious how to join the customer table to the orders table (order is a reserved word in Oracle, so your table can't be named order). If we assume that id_order in orders joins to id in customer, then:
SELECT c.id customer_id,
c.name name,
c.subscribe_date,
o.first_purchase_date,
o.id last_order_id,
o.purchase_date last_order_purchase_date,
o.purchase_value last_order_purchase_value
FROM customer c
JOIN (SELECT o.*,
min(o.purchase_date) over (partition by id_order) first_purchase_date,
rank() over (partition by id_order order by purchase_date desc) rnk
FROM orders o) o ON (c.id = o.id_order)
WHERE rnk = 1
I'm confused by your field names, but I'm going to assume that ORDER.id is the id in the CUSTOMER table.
The earliest order date is easy.
select CUSTOMER.*, min(ORDER.purchase_date)
from CUSTOMER
inner join ORDER on CUSTOMER.id = ORDER.id
group by CUSTOMER.*
To get the last order data, join this to the ORDER table again.
select CUSTOMER.*, min(ORD_FIRST.purchase_date), ORD_LAST.*
from CUSTOMER
inner join ORDER ORD_FIRST on CUSTOMER.id = ORD_FIRST.id
inner join ORDER ORD_LAST on CUSTOMER.id = ORD_LAST.id
group by CUSTOMER.*, ORD_LAST.*
having ORD_LAST.purchase_date = max(ORD_FIRST.purchase_date)
Maybe something like this assuming the ID field in the Order table is actually the Customer ID:
SELECT C.*, O1.*, O2.purchase_Date as FirstPurchaseDate
FROM Customer C
LEFT JOIN
(
SELECT Max(purchase_date) as pdate, id
FROM Orders
GROUP BY id
) MaxPurchaseOrder
ON C.Id = MaxPurchaseOrder.Id
LEFT JOIN Orders O1
ON MaxPurchaseOrder.pdate = O1.purchase_date
AND MaxPurchaseOrder.id = O1.id
LEFT JOIN
(
SELECT Min(purchase_date) as pdate, id
FROM Orders
GROUP BY id
) MinPurchaseOrder
ON C.Id = MinPurchaseOrder.Id
LEFT JOIN Orders O2
ON MinPurchaseOrder.pdate = O2.purchase_date
AND MinPurchaseOrder.id = O2.id
And the sql fiddle.

Caching inner query in SQL

How can I optimize the SQL below (a simplified view of my complex query)? Ideally, I should be able to cache the first SQL result (order ids) and do some kind of projection on the OrderLine table in the second query.
Any pointers will be helpful.
Restrictions - I cannot create temporary tables, cursors or procedures / functions. I am connecting to Oracle 10g.
SELECT 'Object_id', id, mod_id FROM
(
(Select 'Order_id', order_id, mod_id FROM Orders)
UNION
(select 'Order_line_id', order_line_id, mod_id FROM OrderLine
WHERE order_id IN (Select order_id FROM Orders)
)
)
The optimizer is responsible for optimizing your query; relax and let it do its stuff.
Any explicit caching you attempt will involve things like temporary tables (which you've ruled out), so you can only make the common query more explicit, perhaps, by using a CTE (common table expression, aka WITH clause) to name the common sub-query. But the optimizer might well process things the same way regardless.
You can replace the IN clause with a JOIN; that will likely be faster. Again, the optimizer might do that anyway. However, that's not a caching operation; it is a standard query rewrite.
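As a quick illustration of that rewrite, the sketch below checks that the IN form and the JOIN form select the same order lines. It is an assumption-laden example: it uses SQLite through Python's sqlite3 rather than Oracle 10g, the rows are invented, and it says nothing about which form a given optimizer will run faster.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders    (order_id INTEGER PRIMARY KEY, mod_id INTEGER);
    CREATE TABLE OrderLine (order_line_id INTEGER PRIMARY KEY,
                            order_id INTEGER, mod_id INTEGER);
    INSERT INTO Orders    VALUES (1, 100), (2, 200);
    INSERT INTO OrderLine VALUES (10, 1, 101), (11, 1, 102), (12, 2, 201),
                                 (13, 99, 999);  -- orphan line: no such order
""")

# Original form: filter order lines with an IN subquery.
with_in = conn.execute("""
    SELECT order_line_id, mod_id
    FROM OrderLine
    WHERE order_id IN (SELECT order_id FROM Orders)
    ORDER BY order_line_id
""").fetchall()

# Rewritten form: inner join to Orders on the same key.
with_join = conn.execute("""
    SELECT ol.order_line_id, ol.mod_id
    FROM OrderLine ol
    JOIN Orders o ON o.order_id = ol.order_id
    ORDER BY ol.order_line_id
""").fetchall()
```

The two forms match here because Orders.order_id is unique. If the joined side could contain duplicate keys, the JOIN would repeat order lines while IN would not.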
You can design your queries based on the following example:
WITH CTE AS
(
Select 'Order_id' Descr, order_id, mod_id FROM Orders
)
, CTE2 AS
(
SELECT * FROM CTE
UNION
SELECT 'Order_line_id' Descr, order_line_id, mod_id FROM OrderLine ol
WHERE EXISTS (Select order_id FROM CTE WHERE order_id = ol.order_id)
)
SELECT * FROM CTE2;
Select order_id, mod_id
from orders o
inner join orderline ol
on o.order_id = ol.order_line_id
You might be able to remove this line:
WHERE order_id IN (Select order_id FROM Orders)
If your database has integrity, there are no orphans allowed and this line doesn't filter anything.
with myOrders as (
select order_id,
mod_id
from Orders
)
select 'order_id' obj_type,
order_id,
mod_id
from myOrders
union all
select 'order_line_id' obj_type,
order_line_id,
mod_id
from OrderLine
join myOrders
on OrderLine.order_id = myOrders.order_id;
You could try something like the following...
SELECT DISTINCT 'Object_id', id, mod_id
FROM
(
SELECT
CASE
WHEN CROSSNUM = 1 THEN 'Order_ID'
ELSE 'Order_Line_ID'
END AS Object_ID,
CASE
WHEN CROSSNUM = 1 THEN order_id
ELSE order_line_id
END AS id,
CASE
WHEN CROSSNUM = 1 THEN Orders.mod_id
ELSE OrderLine.mod_id
END AS mod_id
FROM
(Select 'Order_id', order_id, mod_id FROM Orders) Orders
LEFT JOIN (select 'Order_line_id', order_line_id, mod_id FROM OrderLine) OrderLine
ON Orders.Order_id = OrderLine.Order_Id
CROSS JOIN
(
SELECT 1 AS CROSSNUM FROM DUAL
UNION ALL
SELECT 2 AS CROSSNUM FROM DUAL
) X
WHERE
NOT (CROSSNUM = 2 AND order_line_id IS NULL)
)

Optimize SQL query for canceled orders

Here is a subset of my tables:
orders:
- order_id
- customer_id
order_products:
- order_id
- order_product_id (unique key)
- canceled
I want to select all orders (order_id) for a given customer (customer_id) where ALL of the products in the order are canceled, not just some of them. Is there a more elegant or efficient way of doing it than this:
select order_id from orders
where order_id in (
select order_id from orders
inner join order_products on orders.order_id = order_products.order_id
where order_products.customer_id = 1234 and order_products.canceled = 1
)
and order_id not in (
select order_id from orders
inner join order_products on orders.order_id = order_products.order_id
where order_products.customer_id = 1234 and order_products.canceled = 0
)
If all orders have at least one row in order_products, try this:
Select order_id from orders o
Where Not Exists
(Select * From order_products
Where order_id = o.order_id
And canceled = 0)
If the above assumption is not true, then you also need:
Select order_id from orders o
Where Exists
(Select * From order_products
Where order_id = o.order_id)
And Not Exists
(Select * From order_products
Where order_id = o.order_id
And canceled = 0)
The fastest way will be this:
SELECT order_id
FROM orders o
WHERE customer_id = 1234
AND
(
SELECT canceled
FROM order_products op
WHERE op.order_id = o.order_id
ORDER BY
canceled
LIMIT 1
) = 1
The subquery returns 1 if and only if the order has at least one product and all of its products are canceled.
If the order has no products at all, the subquery returns NULL; if at least one product is not canceled, it returns 0.
Make sure you have an index on order_products (order_id, canceled)
Something like this? This assumes that every order has at least one product, otherwise this query will return also orders without any products.
select order_id
from orders o
where not exists (select 1 from order_products op
where canceled = 0
and op.order_id = o.order_id
)
and o.customer_id = 1234
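The caveat about orders without products can be shown on a few rows. The sketch below is illustrative only: it uses SQLite through Python's sqlite3 (the thread does not name a database), invented data, and it assumes canceled is always 0 or 1 and never NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER);
    CREATE TABLE order_products (order_product_id INTEGER PRIMARY KEY,
                                 order_id INTEGER, canceled INTEGER);
    INSERT INTO orders VALUES (1, 1234), (2, 1234), (3, 1234), (4, 5678);
    INSERT INTO order_products VALUES
        (10, 1, 1), (11, 1, 1),   -- order 1: everything canceled
        (12, 2, 1), (13, 2, 0),   -- order 2: only partly canceled
        (14, 4, 1);               -- order 4: another customer; order 3 has no products
""")

# NOT EXISTS alone also returns order 3, which has no products at all.
not_exists_only = [r[0] for r in conn.execute("""
    SELECT o.order_id
    FROM orders o
    WHERE o.customer_id = ?
      AND NOT EXISTS (SELECT 1 FROM order_products op
                      WHERE op.order_id = o.order_id AND op.canceled = 0)
    ORDER BY o.order_id
""", (1234,))]

# The inner join drops product-less orders; MIN(canceled) = 1 means no row is 0.
all_canceled = [r[0] for r in conn.execute("""
    SELECT o.order_id
    FROM orders o
    JOIN order_products op ON op.order_id = o.order_id
    WHERE o.customer_id = ?
    GROUP BY o.order_id
    HAVING MIN(op.canceled) = 1
    ORDER BY o.order_id
""", (1234,))]
```

Which result is wanted depends on whether an order with no products should count as "fully canceled"; the question does not say.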
SELECT orders.customer_id, orders.order_id, count(*) AS product_count, sum(order_products.canceled) AS canceled_count
FROM orders JOIN order_products
ON orders.order_id = order_products.order_id
WHERE orders.customer_id = <<VALUE>>
GROUP BY orders.customer_id, orders.order_id
HAVING count(*) = sum(order_products.canceled)
You can try something like this
select orders.order_id
from #orders orders inner join
#order_products order_products on orders.order_id = order_products.order_id
where orders.customer_id = 1234
GROUP BY orders.order_id
HAVING SUM(order_products.canceled) = COUNT(order_products.canceled)
Since we don't know the database platform, here's an ANSI standard approach. Note that this assumes nothing about the schema (i.e. data type of the cancelled field, how the cancelled flag is set (i.e. 'YES',1,etc.)) and uses nothing specific to a given database platform (which would likely be a more efficient approach if you could give us the platform and version you are using):
select op1.order_id
from (
select op.order_id, cast( case when op.cancelled is not null then 1 else 0 end as tinyint) as is_cancelled
from #order_products op
) op1
group by op1.order_id
having count(*) = sum(op1.is_cancelled);