SQL Statement Help - Select latest Order for each Customer - sql

Say I have 2 tables: Customers and Orders. A Customer can have many Orders.
Now, I need to show any Customers with his latest Order. This means if a Customer has more than one Orders, show only the Order with the latest Entry Time.
This is how far I managed on my own:
SELECT a.*, b.Id
FROM Customer a INNER JOIN Order b ON b.CustomerID = a.Id
ORDER BY b.EntryTime DESC
This of course returns all Customers with one or more Orders, showing the latest Order first for each Customer, which is not what I wanted. My mind was stuck in a rut at this point, so I hope someone can point me in the right direction.
For some reason, I think I need to use the MAX syntax somewhere, but it just escapes me right now.
UPDATE: After going through a few answers here (there's a lot!), I realized I made a mistake: I meant any Customer with his latest record. That means if he does not have an Order, then I do not need to list him.
UPDATE2: Fixed my own SQL statement, which probably caused no end of confusion to others.

I don't think you do want to use MAX() as you don't want to group the OrderID. What you need is an ordered sub query with a SELECT TOP 1.
select *
from Customers
inner join Orders
on Customers.CustomerID = Orders.CustomerID
and OrderID = (
SELECT TOP 1 subOrders.OrderID
FROM Orders subOrders
WHERE subOrders.CustomerID = Orders.CustomerID
ORDER BY subOrders.OrderDate DESC
)

Something like this should do it:
SELECT X.*, Y.LatestOrderId
FROM Customer X
LEFT JOIN (
SELECT A.Customer, MAX(A.OrderID) LatestOrderId
FROM Order A
JOIN (
SELECT Customer, MAX(EntryTime) MaxEntryTime FROM Order GROUP BY Customer
) B ON A.Customer = B.Customer AND A.EntryTime = B.MaxEntryTime
GROUP BY Customer
) Y ON X.Customer = Y.Customer
This assumes that two orders for the same customer may have the same EntryTime, which is why MAX(OrderID) is used in subquery Y to ensure that it only occurs once per customer. The LEFT JOIN is used because you stated you wanted to show all customers - if they haven't got any orders, then the LatestOrderId will be NULL.
Hope this helps!
--
UPDATE :-) This shows only customers with orders:
SELECT A.Customer, MAX(A.OrderID) LatestOrderId
FROM Order A
JOIN (
SELECT Customer, MAX(EntryTime) MaxEntryTime FROM Order GROUP BY Customer
) B ON A.Customer = B.Customer AND A.EntryTime = B.MaxEntryTime
GROUP BY Customer

While I see that you've already accepted an answer, I think this one is a bit more intuitive:
select a.*
,b.Id
from customer a
inner join Order b
on b.CustomerID = a.Id
where b.EntryTime = ( select max(EntryTime)
from Order
where a.Id = b.CustomerId
);
a.Id = b.CustomerId because you want the max EntryTime of all orders (in b) for the customer (a.Id).
I would have to run something like this through an execution plan to see the difference in execution, but where the TOP function is done after-the-fact and that using order by can be expensive, I believe that using max(EntryTime) would be the best way to run this.

You can use a window function.
SELECT *
FROM (SELECT a.*, b.*,
ROW_NUMBER () OVER (PARTITION BY a.ID ORDER BY b.orderdate DESC,
b.ID DESC) rn
FROM customer a, ORDER b
WHERE a.ID = b.custid)
WHERE rn = 1
For each customer (a.id) it sorts all orders and discards everything but the latest.
ORDER BY clause includes both order date and entry id, in case there are multiple orders on the same date.
Generally, window functions are much faster than any look-ups using MAX() on large number of records.

This query is much faster than the accepted answer :
SELECT c.id as customer_id,
(SELECT co.id FROM customer_order co WHERE
co.customer_id=c.id
ORDER BY some_date_column DESC limit 1) as last_order_id
FROM customer c

SELECT Cust.*, Ord.*
FROM Customers cust INNER JOIN Orders ord ON cust.ID = ord.CustID
WHERE ord.OrderID =
(SELECT MAX(OrderID) FROM Orders WHERE Orders.CustID = cust.ID)

Something like:
SELECT
a.*
FROM
Customer a
INNER JOIN Order b
ON a.OrderID = b.Id
INNER JOIN (SELECT Id, max(EntryTime) as EntryTime FROM Order b GROUP BY Id) met
ON
b.EntryTime = met.EntryTime and b.Id = met.Id

One approach that I haven't seen above yet:
SELECT
C.*,
O1.ID
FROM
dbo.Customers C
INNER JOIN dbo.Orders O1 ON
O1.CustomerID = C.ID
LEFT OUTER JOIN dbo.Orders O2 ON
O2.CustomerID = C.ID AND
O2.EntryTime > O1.EntryTime
WHERE
O2.ID IS NULL
This (as well as the other solutions I believe) assumes that no two orders for the same customer can have the exact same entry time. If that's a concern then you would have to make a choice as to what determines which one is the "latest". If that's a concern post a comment and I can expand the query if needed to account for that.
The general approach of the query is to find the order for a customer where there is not another order for the same customer with a later date. It is then the latest order by definition. This approach often gives better performance then the use of derived tables or subqueries.

A simple max and "group by" is sufficient.
select c.customer_id, max(o.order_date)
from customers c
inner join orders o on o.customer_id = c.customer_id
group by c.customer_id;
No subselect needed, which slows things down.

Related

SQL strategy to fetch maximum

Suppose I have these three tables:
I want to get, for all products, it's product_id and the client that bougth it most times (the biggest client of the product).
I solved it like this:
SELECT
product_id AS product,
(SELECT TOP 1 client_id FROM Bill_Item, Bill
WHERE Bill_Item.product_id = p.product_id
and Bill_Item.bill_id = Bill.bill_id
GROUP BY
client_id
ORDER BY
COUNT(*) DESC
) AS client
FROM Product p
Do you know a better way?
the inner query will give you the ranking. The outer query will give you the client that puchase the most for a product
SELECT *
(
SELECT i.product_id, b.client_id,
r = row_number() over (partition by i.product_id
order by count(*) desc)
FROM Bill b
INNER JOIN Bill_Item i ON b.bill_id = i.bill_id
GROUP BY i.product_id, b.client_id
) d
WHERE r = 1
I was going to submit pretty much the same thing as #Squirrell only with a Common Table Expression [CTE] rather than a derived table. So I wont duplicate that but there are some learning points concerning your query. First is IMPLICIT JOINS such as FROM Bill_Item, Bill are really easy to have uintended consequences (one of many questions: Queries that implicit SQL joins can't do?) Next for the Calculated column you can actually do this in a OUTER APPLY or CROSS APPLY which is a very useful technique.
So you could re-write your method as follows:
SELECT *
FROM
Product p
OUTER APPLY (SELECT TOP 1 b.client_id
FROM
Bill_Item bi
INNER JOIN Bill b
ON bi.bill_id = b.bill_id
WHERE
bi.product_id = p.product_id
GROUP BY
b.client_id
ORDER BY
COUNT(*) DESC) c
And to show you how squirell's answer can still include products that have never been sold all you need to do is join Products and LEFT JOIN to other tables:
;WITH cte AS (
SELECT
p.product_id
,b.client_id
,ROW_NUMBER() OVER (PARTITION BY p.product_id ORDER BY COUNT(*) DESC) as RowNumber
FROM
Product p
LEFT JOIN Bill_Item bi
ON p.product_id = bi.product_id
LEFT JOIN Bill b
ON bi.bill_id = b.bill_id
GROUP BY
p.product_id
,b.client_id
)
SELECT *
FROM
cte
WHERE
RowNumber = 1
Techniques used in some of these that are useful.
CTE
APPLY (Outer & Cross)
Window Functions
Squirrel's answer doesn't return products that have never been sold. If you want to include those, then your approach is ok, although I would write the query as:
SELECT product_id as product,
(SELECT TOP 1 b.client_id
FROM Bill_Item bi JOIN
Bill b
ON bi.bill_id = b.bill_id
WHERE Bill_Item.product_id = p.product_id
GROUP BY client_id
ORDER BY COUNT(*) DESC
) as client
FROM Product p;
You can also express this using APPLY, but a correlated subquery is also fine.
Note the correct use of the explicit JOIN syntax.

Sql query to display records that appear more than once in a table

I have two tables, Customer with columns CustomerID, FirstName, Address and Purchases with columns PurchaseID, Qty, CustomersID.
I want to create a query that will display FirstName(s) that have bought more than two products, product quantity is represented by Qty.
I can't seem to figure this out - I've just started with T-SQL
You could sum the purchases and use a having clause to filter those you're interested in. You can then use the in operator to query only the customer names that fit these IDs:
SELECT FirstName
FROM Customer
WHERE CustomerID IN (SELECT CustomerID
FROM Purchases
GROUP BY CustomerID
HAVING SUM(Qty) > 2)
Please try this, it should work for you, according to your question.
Select MIN(C.FirstName) FirstName from Customer C INNER JOIN Purchases P ON C.CustomerID=P.CustomersID Group by P.CustomersID Having SUM(P.Qty) >2
Please try this:
select c.FirstName,p.Qty
from Customer as c
join Purchase as p
on c.CustomerID = p.CustomerID
where CustomerID in (select CustomerID from Purchases group by CustomerID having count(CustomerID)>2);
SELECT
c.FirstName
FROM
Customer c
INNER JOIN Purchases p
ON c.CustomerId = p.CustomerId
GROUP BY
c.FirstName
HAVING
SUM(p.Qty) > 2
While the IN suggestions would work they are kind of overkill and more than likely less performant than a straight up join with aggregation. The trick is the HAVING Clause by using it you can limit your result to the names you want. Here is a link to learn more about IN vs. Exists vs JOIN (NOT IN vs NOT EXISTS)
There are dozens of ways of doing this and to introduce you to Window Functions and common table expressions which are way over kill for this simplified example but are invaluable in your toolset as your queries continue to get more complex:
;WITH cte AS (
SELECT DISTINCT
c.FirstName
,SUM(p.Qty) OVER (PARTITION BY c.CustomerId) as SumOfQty
FROM
Customer c
INNER JOIN Purchases p
ON c.CustomerId = p.CustomerId
)
SELECT *
FROM
cte
WHERE
SumOfQty > 2

Need hints on seemingly simple SQL query

I'm trying to do something like:
SELECT c.id, c.name, COUNT(orders.id)
FROM customers c
JOIN orders o ON o.customerId = c.id
However, SQL will not allow the COUNT function. The error given at execution is that c.Id is not valid in the select list because it isn't in the group by clause or isn't aggregated.
I think I know the problem, COUNT just counts all the rows in the orders table. How can I make a count for each customer?
EDIT
Full query, but it's in dutch... This is what I tried:
select k.ID,
Naam,
Voornaam,
Adres,
Postcode,
Gemeente,
Land,
Emailadres,
Telefoonnummer,
count(*) over (partition by k.id) as 'Aantal bestellingen',
Kredietbedrag,
Gebruikersnaam,
k.LeverAdres,
k.LeverPostnummer,
k.LeverGemeente,
k.LeverLand
from klanten k
join bestellingen on bestellingen.klantId = k.id
No errors but no results either..
When using an aggregate function like that, you need to group by any columns that aren't aggregates:
SELECT c.id, c.name, COUNT(orders.id)
FROM customers c
JOIN orders o ON o.customerId = c.id
GROUP BY c.id, c.name
If you really want to be able to select all of the columns in Customers without specifying the names (please read this blog post in full for reasons to avoid this, and easy workarounds), then you can do this lazy shorthand instead:
;WITH o AS
(
SELECT CustomerID, CustomerCount = COUNT(*)
FROM dbo.Orders GROUP BY CustomerID
)
SELECT c.*, o.OrderCount
FROM dbo.Customers AS c
INNER JOIN dbo.Orders AS o
ON c.id = o.CustomerID;
EDIT for your real query
SELECT
k.ID,
k.Naam,
k.Voornaam,
k.Adres,
k.Postcode,
k.Gemeente,
k.Land,
k.Emailadres,
k.Telefoonnummer,
[Aantal bestellingen] = o.klantCount,
k.Kredietbedrag,
k.Gebruikersnaam,
k.LeverAdres,
k.LeverPostnummer,
k.LeverGemeente,
k.LeverLand
FROM klanten AS k
INNER JOIN
(
SELECT klantId, klanCount = COUNT(*)
FROM dbo.bestellingen
GROUP BY klantId
) AS o
ON k.id = o.klantId;
I think this solution is much cleaner than grouping by all of the columns. Grouping on the orders table first and then joining once to each customer row is likely to be much more efficient than joining first and then grouping.
The following will count the orders per customer without the need to group the overall query by customer.id. But this also means that for customers with more than one order, that count will repeated for each order.
SELECT c.id, c.name, COUNT(orders.id) over (partition by c.id)
FROM customers c
JOIN orders ON o.customerId = c.id

SQL Find next Date for each Customer, SQL For each?

I have some problems with an SQL statement. I need to find the next DeliveryDate for each Customer in the following setup.
Tables
Customer (id)
DeliveryOrder (id, deliveryDate)
DeliveryOrderCustomer (customerId, deliveryOrderId)
Each Customer may have several DeliveryOrders on the same deliveryDate. I just can't figure out how to only get one deliveryDate for each customer. The date should be the next upcoming DeliveryDate after today. I feel like I would need some sort of "for each" here but I don't know how to solve it in SQL.
Another simpler version
select c.id, min(o.date)
from customer c
inner join deliveryordercustomer co o on co.customerId = c.id
inner join deliveryorder o on co.deliveryOrderId = o.id and o.date>getdate()
group by c.id
This would give the expected results using a subselect. Take into account that current_date may be rdbms specific, it works for Oracle.
select c.id, o.date
from customer c
inner join deliveryordercustomer co o on co.customerId = c.id
inner join deliveryorder o on co.deliveryOrderId = o.id
where o.date =
(select min(o2.date)
from deliveryorder o2
where o2.id = co.deliveryOrderId and o2.date > current_date)
You need to use a group by. There's a lot of ways to do this, here's my solution that takes into account multiple orders on same day for customer, and allows you to query different delivery slots, first, second etc. This assumes Sql Server 2005 and above.
;with CustomerDeliveries as
(
Select c.id, do.deliveryDate, Rank()
over (Partition BY c.id order by do.deliveryDate) as DeliverySlot
From Customer c
inner join DeliveryOrderCustomer doc on c.id = doc.customerId
inner join DeliveryOrder do on do.id = doc.deliveryOrderId
Where do.deliveryDate>GETDATE()
Group By c.id, do.deliveryDate
)
Select id, deliveryDate
From CustomerDeliveries
Where DeliverySlot = 1

What is the most efficient way to write query to get latest item from collection?

Sorry about the title. That was the best I could come up with.
Anyways, here is my question - I have 2 tables, a Customer table and Order table as shown below:
Customer {
Long id;
String name;
String address;
Timestamp createdOn;
}
Order {
Long id;
String productName;
Long customerId;
Timestamp createdOn;
}
There is a one to many relation from Customer to Order, i.e., a Customer can have many Orders as identified by 'customerId'.
Now here is my question. I am looking to retrieve a list of all the Customers along with their latest Orders. Based on the above table this is what I have been able to come up with so far:
SELECT *
FROM CUSTOMER AS CUSTOMERS
INNER JOIN (
SELECT *
FROM ORDER AS BASE_ORDERS
INNER JOIN (
SELECT MAX_ORDERS.CUSTOMERID,
MAX(MAX_ORDERS.CREATEDON)
FROM ORDER AS MAX_ORDERS
GROUP BY CUSTOMERID
) AS GROUPED_ORDERS
ON BASE_ORDERS.CUSTOMERID = GROUPED_ORDERS.CUSTOMERID
AND BASE_ORDERS.CREATEDON = GROUPED_ORDERS.CREATEDON
) AS LATEST_ORDERS
ON CUSTOMERS.ID = LATEST_ORDERS.CUSTOMERID
Is there a better way to do this? The only other way that I can think of is to add a new field called 'isLatest' and set that true each time a new order is placed for a customer.
Let me know what you think.
I find this variation a little cleaner:
select c.*, o.*
from (
select customerid, max(createdOn) as MaxCreatedOn
from Order
group by customerid
) mo
inner join Order o on mo.customerid = o.customerid and mo.MaxCreatedOn = o.createdOn
inner join Customer c on o.customerid = c.id
What about something like this:
SELECT
c.*,
o.*
FROM
Customers c
INNER JOIN
Orders o ON o.customerId = c.id
GROUP BY
c.id
ORDER BY
o.createdOn DESC
Not sure how efficient this will be with the Group and Order By statements, but I believe that it will get you the data that you are looking for.