How to find the date of the last item sold in SQL Server

Hi guys, I am having a bit of trouble writing the most efficient query for this question:
Find the order ID and date of the last discontinued item sold.
I have my code below as well as the metadata for the tables. I am not sure if my code will produce the correct output because I have no way of testing it and I am not sure if my code will be the best way to complete this query. Any advice would help.
Select
orders.orderid,
Max(orders.orderdate)
from orders
inner join order_details on orders.orderid = order_details.orderid
inner join products on order_details.productid = products.productid
where discontinued = 1
group by orders.orderid

Using row_number is the easiest way:
select orderID,OrderDate from (
select o.orderID,o.OrderDate,rn = row_number() over (order by orderdate desc)
from products p
join order_details od on od.productID=p.productID
join orders o on o.orderID=od.orderID
where p.discontinued = 1) sub
where sub.rn = 1

Your query is pretty much already what you want. The fastest way is simply to order by the required column and select the top 1 row:
select top (1) o.orderid, o.orderdate
from orders o
join order_details od on od.orderid = o.orderid
join products p on p.productid = od.productid
where p.discontinued = 1
order by o.orderdate desc
This will be more performant than using a window function to number all the rows before selecting row one.
I have a very similar arrangement of tables (the ubiquitous orders/orderitems/products layout, including a similar deleted flag for products), so it's easy to test both side by side. On tables of 13.5m orders and 6m products, both queries ran in under a second, but this one was slightly faster than the row_number equivalent.
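To compare the original GROUP BY attempt with the "order and take one row" approach, here is a minimal sketch. It runs against an in-memory SQLite database from Python rather than SQL Server, so LIMIT 1 stands in for TOP (1); the sample rows are made up for illustration and the table names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (productid INTEGER PRIMARY KEY, discontinued INTEGER);
    CREATE TABLE orders (orderid INTEGER PRIMARY KEY, orderdate TEXT);
    CREATE TABLE order_details (orderid INTEGER, productid INTEGER);
    -- Hypothetical sample data: products 1 and 3 are discontinued.
    INSERT INTO products VALUES (1, 1), (2, 0), (3, 1);
    INSERT INTO orders VALUES
        (100, '2023-01-05'), (101, '2023-02-10'), (102, '2023-03-15');
    INSERT INTO order_details VALUES (100, 1), (101, 3), (101, 2), (102, 2);
""")

# Shared join and filter used by both variants.
join_sql = """
    FROM orders o
    JOIN order_details od ON od.orderid = o.orderid
    JOIN products p ON p.productid = od.productid
    WHERE p.discontinued = 1
"""

# Grouping by orderid yields one row per order containing a discontinued item.
grouped = conn.execute(
    "SELECT o.orderid, MAX(o.orderdate)" + join_sql
    + "GROUP BY o.orderid ORDER BY o.orderid"
).fetchall()

# Sorting newest first and keeping one row yields only the last such order.
latest = conn.execute(
    "SELECT o.orderid, o.orderdate" + join_sql
    + "ORDER BY o.orderdate DESC LIMIT 1"
).fetchall()

print(grouped)
print(latest)
```

With this data the grouped query returns two orders, while the sorted query returns only order 101, which is the behaviour the question asks for.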

Related

How to write this SQL query more elegantly ( joining + max query )

OK, I am using the following example from W3Schools:
https://www.w3schools.com/sql/trysql.asp?filename=trysql_select_all
and I want to get the date on which the largest quantity was ordered:
SELECT OrderDate
FROM Orders
WHERE OrderID = (SELECT OrderID
FROM OrderDetails
WHERE Quantity = (SELECT MAX(Quantity)
FROM OrderDetails));
This works, but my gut tells me I should be using a join or HAVING?
You want the date of the order that has the maximum quantity.
It does not look like you need two levels of subqueries. You could use a row-limiting subquery instead:
select orderdate
from orders
where orderid = (select orderid from orderdetails order by quantity desc limit 1)
This is shorter, and does not fail if there is more than one order with the same, maximum quantity (while your original code does, because the subquery returns more than one row).
Another approach uses window functions:
select o.orderdate
from orders o
inner join (
select od.*, rank() over(order by quantity desc) rn
from orderdetails od
) od on od.orderid = o.orderid
where od.rn = 1
This will properly handle top ties, in the sense that it will return them all (while the first query returns only one of them).
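The difference in tie handling can be checked with a small sketch. It uses an in-memory SQLite database from Python with made-up rows in which two orders share the maximum quantity. The second query is a window-free stand-in for the rank() version (it matches on MAX(quantity) directly), and the extra orderid sort key in the first query is my addition to make the single returned row deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (orderid INTEGER PRIMARY KEY, orderdate TEXT);
    CREATE TABLE orderdetails (orderid INTEGER, quantity INTEGER);
    -- Hypothetical data: orders 2 and 3 tie on the maximum quantity of 120.
    INSERT INTO orders VALUES
        (1, '1996-07-04'), (2, '1996-07-05'), (3, '1996-07-08');
    INSERT INTO orderdetails VALUES (1, 10), (2, 120), (3, 120), (1, 40);
""")

# Row-limiting subquery: exactly one order comes back despite the tie.
one_row = conn.execute("""
    SELECT orderdate
    FROM orders
    WHERE orderid = (SELECT orderid
                     FROM orderdetails
                     ORDER BY quantity DESC, orderid
                     LIMIT 1)
""").fetchall()

# Matching on the maximum itself keeps every tied order.
all_ties = conn.execute("""
    SELECT o.orderdate
    FROM orders o
    JOIN orderdetails od ON od.orderid = o.orderid
    WHERE od.quantity = (SELECT MAX(quantity) FROM orderdetails)
    ORDER BY o.orderdate
""").fetchall()

print(one_row)
print(all_ties)
```

Which behaviour is right depends on whether the report should show one date or every date that reached the maximum.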
I think this is a much cleaner solution. Best regards.
select max(od.quantity) as MaxOrder,orderdate
from orderdetails as od inner join orders as o on od.orderid=o.orderid
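One caveat about this last query: it selects orderdate next to MAX() without a GROUP BY. SQLite documents that such a "bare" column takes its value from the row holding the maximum, but stricter engines, SQL Server among them, reject the query because orderdate is neither aggregated nor grouped. The sketch below (SQLite in memory, made-up rows) shows the SQLite behaviour alongside a form that does not rely on it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (orderid INTEGER PRIMARY KEY, orderdate TEXT);
    CREATE TABLE orderdetails (orderid INTEGER, quantity INTEGER);
    -- Hypothetical data with a single, unambiguous maximum quantity.
    INSERT INTO orders VALUES
        (1, '1996-07-04'), (2, '1996-07-05'), (3, '1996-07-08');
    INSERT INTO orderdetails VALUES (1, 10), (2, 120), (3, 40);
""")

# SQLite-specific: the bare orderdate comes from the row with the maximum.
bare = conn.execute("""
    SELECT MAX(od.quantity) AS MaxOrder, o.orderdate
    FROM orderdetails AS od
    INNER JOIN orders AS o ON od.orderid = o.orderid
""").fetchone()

# Same answer without relying on bare-column behaviour.
explicit = conn.execute("""
    SELECT od.quantity, o.orderdate
    FROM orderdetails AS od
    INNER JOIN orders AS o ON od.orderid = o.orderid
    ORDER BY od.quantity DESC
    LIMIT 1
""").fetchone()

print(bare, explicit)
```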

SQL - INNER JOIN with AND vs using sub-query

I'm practicing questions for the book "SQL Practice Problems: 57 beginning, intermediate, and advanced challenges for you to solve using a “learn-by-doing” approach ". Question 31 is -
Customers with no orders for EmployeeID 4
One employee (Margaret Peacock, EmployeeID 4) has placed the most orders. However, there are some customers who've never placed an order with her. Show only those customers who have never placed an order with her.
My solution defines a CTE named "cte" and joins it with an existing table. It is not pretty or readable, actually:
with cte as
(Select Customers.CustomerID
from customers
where CustomerID not in
(select Orders.CustomerID from orders where orders.EmployeeID = '4'))
select *
from cte left join
(select CustomerID from Orders where Orders.EmployeeID = '4') O
on cte.CustomerID = O.CustomerID
I found the following solution online -
SELECT c.CustomerID, o.CustomerID
FROM Customers AS c
LEFT JOIN Orders AS o ON o.CustomerID = c.CustomerID AND o.EmployeeID = 4
WHERE o.CustomerID IS NULL;
Which is nicer.
My question - when can I use OR, AND clauses in a JOIN? What are the advantages? Is it the fact that a JOIN is executed before the where clause?
Thanks,
Asaf
A JOIN condition can contain any boolean comparison, even subqueries using EXISTS and correlated subqueries. There is no limitation on what can be expressed.
Just a note, however. = and AND are good for performance. Inequalities tend to be performance killers.
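On the question of why the extra condition goes in the JOIN rather than the WHERE clause: with an outer join the two placements are not equivalent. A condition in ON only decides which rows match, so unmatched customers survive with NULLs; the same condition in WHERE filters the joined result and discards those NULL rows. A minimal sketch, using SQLite in memory and made-up customers:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerID TEXT PRIMARY KEY);
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY,
                         CustomerID TEXT, EmployeeID INTEGER);
    -- Hypothetical data: ALFKI ordered via employee 4, BONAP via employee 1,
    -- and CHOPS has no orders at all.
    INSERT INTO Customers VALUES ('ALFKI'), ('BONAP'), ('CHOPS');
    INSERT INTO Orders VALUES (1, 'ALFKI', 4), (2, 'BONAP', 1);
""")

# Condition in ON: every customer is kept; non-matching ones get NULL.
in_on = conn.execute("""
    SELECT c.CustomerID, o.EmployeeID
    FROM Customers AS c
    LEFT JOIN Orders AS o
           ON o.CustomerID = c.CustomerID AND o.EmployeeID = 4
    ORDER BY c.CustomerID
""").fetchall()

# Condition in WHERE: the NULL rows are filtered out after the join.
in_where = conn.execute("""
    SELECT c.CustomerID, o.EmployeeID
    FROM Customers AS c
    LEFT JOIN Orders AS o ON o.CustomerID = c.CustomerID
    WHERE o.EmployeeID = 4
    ORDER BY c.CustomerID
""").fetchall()

print(in_on)
print(in_where)
```

Only the ON version leaves NULL rows behind for a later `WHERE o.CustomerID IS NULL` test to pick up.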
As for your particular problem, I think the following is a more direct interpretation of the question:
SELECT c.CustomerID
FROM Customers c
WHERE NOT EXISTS (SELECT 1
FROM Orders o
WHERE o.CustomerID = c.CustomerID AND
o.EmployeeID = 4
);
That is, get all customers for whom no order exists with employee 4.
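One practical reason to prefer NOT EXISTS over the NOT IN form used in the question is NULL handling: if the subquery can return a NULL CustomerID, NOT IN matches nothing. The sketch below shows this with SQLite in memory; the order with a NULL customer is a made-up row added purely to trigger the effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerID TEXT PRIMARY KEY);
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY,
                         CustomerID TEXT, EmployeeID INTEGER);
    INSERT INTO Customers VALUES ('ALFKI'), ('BONAP'), ('CHOPS');
    -- Hypothetical data: order 3 has no customer recorded (NULL).
    INSERT INTO Orders VALUES (1, 'ALFKI', 4), (2, 'BONAP', 1), (3, NULL, 4);
""")

not_exists = conn.execute("""
    SELECT c.CustomerID
    FROM Customers c
    WHERE NOT EXISTS (SELECT 1
                      FROM Orders o
                      WHERE o.CustomerID = c.CustomerID
                        AND o.EmployeeID = 4)
    ORDER BY c.CustomerID
""").fetchall()

# The subquery returns ('ALFKI', NULL); comparing against NULL is never
# true, so NOT IN rejects every customer.
not_in = conn.execute("""
    SELECT c.CustomerID
    FROM Customers c
    WHERE c.CustomerID NOT IN (SELECT o.CustomerID
                               FROM Orders o
                               WHERE o.EmployeeID = 4)
    ORDER BY c.CustomerID
""").fetchall()

print(not_exists)
print(not_in)
```

If Orders.CustomerID is declared NOT NULL, the two forms return the same customers.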
Generally I'd recommend that you always choose the most readable version of the query unless you can actually measure a performance difference with realistic data. The cost based optimiser should pick a good way of executing the query to return the results you want in this case.
For me the JOIN is a lot more readable than the CTE.
Here is another solution:
SELECT * FROM(
(SELECT Customers.CustomerID AS Customers_ID
FROM Customers) AS P
LEFT JOIN
(Select Orders.CustomerID from Orders
where Orders.EmployeeID=4) as R
on R.CustomerID = P.Customers_ID
)
WHERE R.CustomerID IS NULL
ORDER BY R.CustomerID DESC

Finding the average dollar amount of an order

I am trying to find the average dollar amount of an order. I have calculated the average order Total but I need an average that takes into account the fact that not all Orders have a corresponding OrderItems.
This is a homework question and it is as follows:
What is the average $$ value of an order? To get the answer, you need
to add up all the order values and divide this by the
number of orders. There are two possible averages on this question,
because not all of the order numbers in the ORDERS table are in the
ORDERITEMS table... You will calculate and display both averages.
I have written the one ignoring orders with no OrderItem, but I am not sure how to go about the second case.
SELECT SUM(OrderItems.qty*INVENTORY.price) / COUNT(*) AS dollarValue
FROM Orders, OrderItems, Inventory
WHERE ORDERS.orderid = OrderItems.orderid AND OrderItems.partid = Inventory.partid
The Avg function will not replace NULL with zero; it will exclude NULL from its calculation. If you have Order rows which have no OrderItem, you need to use Left Joins. A trick you can use in SQL Server is to nest the joins like so (note the parentheses):
Select Avg(OI.Qty * I.Price)
From Orders As O
Left Join (OrderItems As OI
Join Inventory As I
On I.PartId = OI.PartId)
On OI.OrderId = O.OrderId
This will join the Inventory table to the OrderItems table before it Left Joins that result to the Orders table. In this way, OI.Qty and I.Price will both return NULL for Orders that have no OrderItems and be excluded from the calculation. An equivalent approach to the above would be to use two Left Joins:
Select Avg(OI.Qty * I.Price)
From Orders As O
Left Join OrderItems As OI
On OI.OrderId = O.OrderId
Left Join Inventory As I
On I.PartId = OI.PartId
If you wanted to count Orders with no OrderItems as zero, then you need to convert those nulls to zero using Coalesce:
Select Avg(OI.Qty * I.Price) As Avg_ExcludingNull
, Avg( Coalesce(OI.Qty * I.Price,0) ) As Avg_NullAsZero
From Orders As O
Left Join (OrderItems As OI
Join Inventory As I
On I.PartId = OI.PartId)
On OI.OrderId = O.OrderId
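Note that the queries above average individual line values (Qty * Price per OrderItems row). If "order value" means the total of all items on an order, which is my reading of the homework text rather than something the thread confirms, the lines need to be summed per order first and then averaged. A minimal sketch with SQLite in memory and made-up data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (orderid INTEGER PRIMARY KEY);
    CREATE TABLE Inventory (partid TEXT PRIMARY KEY, price INTEGER);
    CREATE TABLE OrderItems (orderid INTEGER, partid TEXT, qty INTEGER);
    -- Hypothetical data: order 1 totals 50, order 2 totals 10,
    -- order 3 has no items.
    INSERT INTO Orders VALUES (1), (2), (3);
    INSERT INTO Inventory VALUES ('A', 10), ('B', 30);
    INSERT INTO OrderItems VALUES (1, 'A', 2), (1, 'B', 1), (2, 'A', 1);
""")

# Step 1 (inner query): one total per order, NULL when there are no items.
# Step 2 (outer query): average the totals, with and without empty orders.
per_order = conn.execute("""
    SELECT AVG(order_total)              AS avg_excluding_empty,
           AVG(COALESCE(order_total, 0)) AS avg_empty_as_zero
    FROM (
        SELECT o.orderid, SUM(oi.qty * i.price) AS order_total
        FROM Orders o
        LEFT JOIN OrderItems oi ON oi.orderid = o.orderid
        LEFT JOIN Inventory i ON i.partid = oi.partid
        GROUP BY o.orderid
    ) AS totals
""").fetchone()

# For comparison: averaging line values directly, as in the queries above.
per_line = conn.execute("""
    SELECT AVG(oi.qty * i.price),
           AVG(COALESCE(oi.qty * i.price, 0))
    FROM Orders o
    LEFT JOIN OrderItems oi ON oi.orderid = o.orderid
    LEFT JOIN Inventory i ON i.partid = oi.partid
""").fetchone()

print(per_order)
print(per_line)
```

With this data the per-order averages are 30 and 20, while the per-line averages are 20 and 15, so the two readings give different answers as soon as an order has more than one item.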
SQL has an aggregate function for calculating the average: AVG()
SELECT AVG(OrderItems.qty*INVENTORY.price) AS dollarValue
FROM Orders, OrderItems, Inventory
WHERE ORDERS.orderid = OrderItems.orderid AND OrderItems.partid = Inventory.partid
While we're here, may I suggest you use the more modern JOIN syntax:
SELECT AVG(OrderItems.qty*INVENTORY.price) AS dollarValue
FROM Orders
JOIN OrderItems ON ORDERS.orderid = OrderItems.orderid
JOIN Inventory ON OrderItems.partid = Inventory.partid

How do I use SQL to select rows that have > 50 related rows in another table?

I've been trying to find out how to write this query in sql.
What I need is to find the productnames (in the products table) that have 50 or more orders (which are in the order table).
Only one orderid is matched up to a productname at a time, so when I try to count the orderids it counts all of them.
I can get distinct productnames, but once I add in the orderids it goes back to having multiple productnames.
I also need to count the number of customers (in the order table) that have ordered those products.
I need some serious help ASAP! if anyone could help me figure out how to figure this out that would be awesome!
Table: Products
`productname` in the form of a text like 'GrannySmith'
Table: Orders
`orderid` in the form of '10222'..etc
`custid` in the form of something like 'SMITH'
Assuming the orders table has a field named ProductId that relates back to the products table, the SQL would translate to:
SELECT p.ProductName, Count(*)
FROM Orders o
JOIN Products p
on o.ProductId = p.ProductId
GROUP BY p.ProductName HAVING COUNT(*) >= 50
The key is the HAVING clause that follows GROUP BY, which filters on the aggregated count. I hope this helps.
You might be missing an "Order Details" table. Typically, an order has several order details, and each of the order details then maps to a product, as in the Northwind sample database.
In that case, your SQL query would be something like this: join the [Order Details] table to both the [Orders] and [Products] tables, group by the product ID and name, and count the OrderID's:
select
p.ProductID, p.ProductName, count(o.OrderID)
from
[order details] od
inner join
orders o on od.OrderID = o.OrderID
inner join
products p ON od.productID = p.ProductID
group by
p.ProductID, p.ProductName
having
count(o.OrderID) >= 50
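The question also asks for the number of customers who ordered those products, which neither query above returns. One way to get it is COUNT(DISTINCT ...) in the same grouped query. The sketch below uses SQLite in memory; the order_details table name, the sample rows, and the small threshold (3 instead of 50, so the toy data can satisfy it) are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (productid INTEGER PRIMARY KEY, productname TEXT);
    CREATE TABLE orders (orderid INTEGER PRIMARY KEY, custid TEXT);
    CREATE TABLE order_details (orderid INTEGER, productid INTEGER);
    -- Hypothetical data: GrannySmith appears on 3 orders from 2 customers.
    INSERT INTO products VALUES (1, 'GrannySmith'), (2, 'Fuji');
    INSERT INTO orders VALUES
        (10, 'SMITH'), (11, 'SMITH'), (12, 'JONES'), (13, 'LEE');
    INSERT INTO order_details VALUES (10, 1), (11, 1), (12, 1), (13, 2);
""")

min_orders = 3  # stand-in for the threshold of 50 in the question

popular = conn.execute("""
    SELECT p.productname,
           COUNT(DISTINCT o.orderid) AS order_count,
           COUNT(DISTINCT o.custid)  AS customer_count
    FROM order_details od
    JOIN orders o ON o.orderid = od.orderid
    JOIN products p ON p.productid = od.productid
    GROUP BY p.productid, p.productname
    HAVING COUNT(DISTINCT o.orderid) >= ?
    ORDER BY p.productname
""", (min_orders,)).fetchall()

print(popular)
```

DISTINCT matters for the customer count because the same customer can place several orders for one product.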

SQL Query to find the maximum of a set of averages

This is a query based on the Northwind Database in MS SQL Server 2005.
First I have to get the average of the UnitPrice from OrderDetails table, and group it by ProductID for that particular column alone and alias it as AveragePrice.
Then I need to find the maximum of AveragePrice, which is simply the max of the previous column. How can I do it? This is quite tricky for me and it is taking me ages to work out.
select
O.CustomerID,
E.EmployeeID,
E.FirstName+space(1)+E.LastName FullName,
OD.OrderID,
OD.ProductID,
(select avg(DO.UnitPrice) from OrderDetails
DO where OD.ProductID = DO.ProductID
group by DO.ProductID) AveragePrice ,
from OrderDetails OD
join Orders O
on OD.OrderID = O.OrderID
join Customers C
on C.CustomerID = O.CustomerID
join Employees E
on E.EmployeeID = O.EmployeeID
This is not a homework question; I am learning SQL, but I am really stuck at this point. Please help me.
It's 2 steps: "the ungrouped maximum of the grouped averages"
The derived table below shows how to apply an aggregate on top of an aggregate; you can expand it as needed:
SELECT
MAX(AveragePrice) AS MaxAveragePrice
FROM
(
select
avg(UnitPrice) AS AveragePrice, ProductID
from
OrderDetails
group by
ProductID
) foo
Or with a CTE:
;WITH AvgStuff AS
(
select
avg(UnitPrice) AS AveragePrice
from
OrderDetails
group by
ProductID
)
SELECT
MAX(AveragePrice) AS MaxAveragePrice
FROM
AvgStuff
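Both forms return only the maximum value. If you also want to know which product it belongs to, sorting the grouped averages and keeping the first row is one option. The sketch below runs on SQLite in memory with made-up prices; in SQL Server, TOP (1) would take the place of LIMIT 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE OrderDetails (OrderID INTEGER, ProductID INTEGER,
                               UnitPrice REAL);
    -- Hypothetical data: averages are 15 (product 1), 18 (product 2),
    -- and 6 (product 3).
    INSERT INTO OrderDetails VALUES
        (1, 1, 10), (2, 1, 20), (3, 2, 18), (4, 3, 5), (5, 3, 7);
""")

# Aggregate on top of an aggregate: MAX over the per-product AVG.
max_avg = conn.execute("""
    SELECT MAX(AveragePrice)
    FROM (SELECT AVG(UnitPrice) AS AveragePrice
          FROM OrderDetails
          GROUP BY ProductID) AS foo
""").fetchone()[0]

# Same maximum, but keeping the ProductID it belongs to.
top_product = conn.execute("""
    SELECT ProductID, AVG(UnitPrice) AS AveragePrice
    FROM OrderDetails
    GROUP BY ProductID
    ORDER BY AveragePrice DESC
    LIMIT 1
""").fetchone()

print(max_avg, top_product)
```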