Distinct across similar records in SQL Server 2008 database - sql

I have a SQL Server 2008 database with three tables: Product, Order, and OrderProduct. They look like the following:
Product
-------
ID
Name
Description
Order
-----
ID
OrderDate
Status
OrderProduct
------------
OrderID
ProductID
Quantity
I am trying to identify the last three unique products a person ordered. However, I also need to include the last date on which the product was ordered. My problem is I keep getting a result set like this:
Can of Beans (10/10/2011)
Soda (10/09/2011)
Can of Beans (10/08/2011)
The second "Can of Beans" should not be there because I already showed "Can of Beans". My query looks like this:
SELECT DISTINCT TOP 3
p.[Name],
o.[OrderDate]
FROM
[Product] p,
[Order] o,
[OrderProduct] l
WHERE
l.[ProductID]=p.[ID] and
l.[OrderID]=o.[ID]
ORDER BY
o.[OrderDate] DESC
I understand that DISTINCT doesn't help here because the order dates differ, so each (Name, OrderDate) pair is already unique. However, I'm not sure how to remedy this. Can somebody tell me how to fix this?

WITH cteProducts AS (
    -- Number each product's orders from newest (1) to oldest
    SELECT p.Name, o.OrderDate,
           ROW_NUMBER() OVER(PARTITION BY p.Name ORDER BY o.OrderDate DESC) AS RowNum
    FROM Product p
    INNER JOIN OrderProduct op
        ON p.ID = op.ProductID
    INNER JOIN [Order] o  -- Order is a reserved word, so it must be bracketed
        ON op.OrderID = o.ID
)
SELECT TOP 3 Name, OrderDate
FROM cteProducts
WHERE RowNum = 1
ORDER BY OrderDate DESC;

Have you tried GROUP BY?
SELECT TOP 3
p.[Name],
max(o.[OrderDate])
FROM
[Product] p,
[Order] o,
[OrderProduct] l
WHERE
l.[ProductID]=p.[ID] and
l.[OrderID]=o.[ID]
GROUP BY p.[Name]
ORDER BY
max(o.[OrderDate]) DESC

Try grouping like :
SELECT TOP 3
p.[Name],
MAX(o.[OrderDate])
FROM
[Product] p,
[Order] o,
[OrderProduct] l
WHERE
l.[ProductID]=p.[ID] and
l.[OrderID]=o.[ID]
GROUP BY p.[Name]
ORDER BY
MAX(o.[OrderDate]) DESC

Use GROUP BY... it's been a while since I've used SQL Server, but the query will look something like this:
SELECT TOP 3
p.[Name],
max(o.[OrderDate]) AS MostRecentOrderDate
FROM
[Product] p,
[Order] o,
[OrderProduct] l
WHERE
l.[ProductID]=p.[ID] and
l.[OrderID]=o.[ID]
GROUP BY p.[Name]
ORDER BY
MostRecentOrderDate DESC
Or, to show the first time they ordered that product, use min() instead of max().
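
One way to check that the GROUP BY approach drops the repeated "Can of Beans" is to run it against data shaped like the question's sample. The sketch below is only an illustration, not the asker's real schema or data: it uses Python's built-in sqlite3 module as a stand-in for SQL Server (so LIMIT 3 replaces TOP 3 and "Order" is double-quoted instead of bracketed), and the IDs, the extra products, and the ISO-formatted dates are invented.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE "Order" (ID INTEGER PRIMARY KEY, OrderDate TEXT);
    CREATE TABLE OrderProduct (OrderID INTEGER, ProductID INTEGER, Quantity INTEGER);

    -- Invented sample rows; 'Can of Beans' is ordered twice.
    INSERT INTO Product VALUES (1, 'Can of Beans'), (2, 'Soda'), (3, 'Bread'), (4, 'Milk');
    INSERT INTO "Order" VALUES
        (1, '2011-10-10'), (2, '2011-10-09'), (3, '2011-10-08'),
        (4, '2011-10-07'), (5, '2011-10-06');
    INSERT INTO OrderProduct VALUES
        (1, 1, 2), (2, 2, 1), (3, 1, 1), (4, 3, 1), (5, 4, 1);
""")

# One row per product name, carrying that product's latest order date.
query = """
    SELECT p.Name, MAX(o.OrderDate) AS MostRecentOrderDate
    FROM Product p
    INNER JOIN OrderProduct op ON op.ProductID = p.ID
    INNER JOIN "Order" o ON o.ID = op.OrderID
    GROUP BY p.Name
    ORDER BY MostRecentOrderDate DESC
    LIMIT 3
"""
last_three = conn.execute(query).fetchall()
for name, order_date in last_three:
    print(name, order_date)
```

With these rows, "Can of Beans" appears once with its latest date, and the older 2011-10-08 order no longer takes up one of the three slots.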

Related

SQL query to get three most recent records by customer

I have a table of orders and am looking to get the three most recent orders for each customer ID:
customer orderID orderDate
1 234 2018-01-01
1 236 2017-02-01
3 256 20157-03-01
I was able to use ROW_NUMBER() to number every row in the table, but is there a way to get the three most recent orders for each customer ID? Some customers have fewer than 3 orders while others have more than 10, so I can't pick rows by a fixed overall row number.
Does anyone have recommendations for a different option?
Here is an interesting approach using apply (and assuming you have a customers table):
select o.*
from customers c cross apply
(select top 3 o.*
from orders o
where o.customerid = c.customerid
order by orderdate desc
) o;
You could use partition by:
select customerid, orderid, orderdate
from (
select t.customerid, t.orderid, t.orderdate,
row_number() over (partition by t.customerid order by t.orderDate desc) as mostRecently
from samplecustomers t
) Records
where mostRecently < 4
Use this query:
SELECT result.customer
, result.orderID
, result.orderDate
FROM
(
SELECT Temp.customer
, Temp.orderID
, Temp.orderDate
, ROW_NUMBER() OVER(PARTITION BY Temp.customer
ORDER BY Temp.orderDate DESC) AS MR
FROM YourTable AS Temp
) AS result
WHERE result.MR <= 3;
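If you only need a single customer and your database supports LIMIT (MySQL, PostgreSQL, SQLite; SQL Server uses TOP instead), try this:
SELECT *
FROM orders
WHERE customer = 1
ORDER BY orderDate DESC LIMIT 3
Sorting by orderDate DESC puts the most recent orders first.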
How about this:
select a.Customer, a.orderID, a.orderDate
from orders a
where a.orderID in
(
select top 3 b.orderID
from orders b
where b.Customer = a.Customer
order by b.orderDate desc
)
order by a.Customer, a.orderID, a.orderDate
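
To see how the ROW_NUMBER() answers behave when some customers have fewer than three orders, here is a small runnable sketch. It is an illustration under assumptions: Python's built-in sqlite3 module stands in for the real database (window functions need SQLite 3.25 or later), and the table contents are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer INTEGER, orderID INTEGER, orderDate TEXT);
    -- Invented rows: customer 1 has four orders, customer 3 has only one.
    INSERT INTO orders VALUES
        (1, 101, '2018-01-01'), (1, 102, '2017-02-01'),
        (1, 103, '2016-05-01'), (1, 104, '2015-03-01'),
        (3, 201, '2017-03-01');
""")

# Number each customer's orders from newest to oldest, then keep ranks 1..3.
rows = conn.execute("""
    SELECT customer, orderID, orderDate
    FROM (
        SELECT customer, orderID, orderDate,
               ROW_NUMBER() OVER (PARTITION BY customer
                                  ORDER BY orderDate DESC) AS rn
        FROM orders
    ) AS ranked
    WHERE rn <= 3
    ORDER BY customer, orderDate DESC
""").fetchall()

for row in rows:
    print(row)
```

Because the numbering restarts for every customer, a customer with a single order simply contributes one row, and a customer with many orders contributes at most three.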

Select Top1 from multiple-column query for each product ID

I have a query in SQL Server that returns the last entry in our stock for a given product, along with many other columns. Something like:
SELECT
TOP(1) EntryDate,
EntryPrice,
TaxID,
TransportCost,
...
FROM
StockEntries
WHERE
ProductID = @ID
ORDER BY
EntryDate DESC
I cannot use MAX to get the last entry because it sometimes returns duplicate rows (when there are two entries on the same day).
I would like to execute this query for every product we have. I could do this if the query returned only 1 row, such as:
SELECT
ProductID p,
(
SELECT
TOP(1) s.EntryDate
FROM
StockEntries s
WHERE
s.ProductID = p.ProductID
ORDER BY
s.EntryDate DESC
)
FROM
Products p
But because my query returns multiple columns, I cannot see a straightforward way to do this.
Any ideas?
As you have phrased the question, cross apply seems very appropriate:
SELECT p.*, s.*
FROM products p CROSS APPLY
(SELECT TOP(1) s.*
FROM StockEntries s
WHERE s.ProductID = p.ProductID
ORDER BY s.EntryDate DESC
) s;
APPLY also allows you to select other columns from StockEntries.
You can use ROW_NUMBER() to rank each row and then keep only the row with the latest entry date per product.
SELECT *
FROM (SELECT p.productid,
s.EntryDate,
s.EntryPrice,
s.TaxID,
s.TransportCost,
ROW_NUMBER() OVER (PARTITION BY p.productid ORDER BY s.entrydate DESC) rownum
FROM products p
JOIN StockEntries s ON s.ProductID = p.ProductID
) t
WHERE rownum = 1
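
The question mentions two entries on the same day. With ROW_NUMBER() exactly one of them gets rank 1, but which one is arbitrary unless the ORDER BY has a tie-breaker. The sketch below illustrates that idea under assumptions: it runs on Python's built-in sqlite3 module rather than SQL Server (SQLite 3.25 or later for window functions), the rows are invented, and EntryID is a hypothetical unique column that the original table may not have.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE StockEntries (
        EntryID INTEGER PRIMARY KEY,  -- hypothetical tie-breaker column
        ProductID INTEGER,
        EntryDate TEXT,
        EntryPrice REAL
    );
    -- Invented rows: product 10 has two entries on 2020-01-09.
    INSERT INTO StockEntries VALUES
        (1, 10, '2020-01-05', 4.00),
        (2, 10, '2020-01-09', 4.50),
        (3, 10, '2020-01-09', 4.75),
        (4, 20, '2020-01-02', 9.00);
""")

# Rank by date, then by EntryID, so same-day ties resolve deterministically.
latest = conn.execute("""
    SELECT ProductID, EntryDate, EntryPrice
    FROM (
        SELECT s.*,
               ROW_NUMBER() OVER (PARTITION BY ProductID
                                  ORDER BY EntryDate DESC, EntryID DESC) AS rn
        FROM StockEntries s
    ) AS t
    WHERE rn = 1
    ORDER BY ProductID
""").fetchall()

for row in latest:
    print(row)
```

Each product yields a single row even though product 10 has two entries on its latest date; the higher EntryID wins here only because that is the tie-breaker chosen for this sketch.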

SQL Query to return the Top 2 Values

I am trying to return the top 2 most ordered items in our customer database. Below is what I have for the most ordered item but I am having trouble figuring out how to create another column for the 2nd most ordered item.
What is the best way to create the 2nd column?
SELECT FirstName, EmailAddress, Id, PreferredLocationId,
(
SELECT TOP 1 [Description] FROM [Order] o
INNER JOIN [OrderItem] oi ON oi.OrderId = o.OrderId
WHERE o.CustomerId = Customer.Id
GROUP BY [Description]
ORDER BY COUNT(*) DESC
) AS MostOrderedItem
FROM Customer
GROUP BY FirstName, EmailAddress, Id, PreferredLocationId
There are lots of different ways to handle this if you're using SQL Server 2012. I'm going to use a CTE with ROW_NUMBER() to get the first two rows:
WITH cte AS (
SELECT CustomerId, [Description]
, ROW_NUMBER() OVER (PARTITION BY CustomerId ORDER BY COUNT(*) DESC) [RowID]
FROM [Order] o
INNER JOIN [OrderItem] oi ON oi.OrderId = o.OrderId
GROUP BY CustomerId, [Description]
)
SELECT FirstName, EmailAddress, Id, PreferredLocationId,
cte1.[Description] AS MostOrderedItem, cte2.[Description] AS SecondMostOrderedItem
FROM Customer
LEFT JOIN cte cte1 ON cte1.CustomerId = Customer.Id AND cte1.RowID = 1
LEFT JOIN cte cte2 ON cte2.CustomerId = Customer.Id AND cte2.RowID = 2
The Common Table Expression creates the list of all customers, descriptions, and their row numbers. Note that if you have ties, you're not guaranteed which description will come first. You can add [Description] to the window function's ORDER BY so that, if there is a tie, whatever comes first in the alphabet will be the tie breaker.
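
The tie-breaker idea can be sketched as follows. This is an illustration under assumptions rather than the answerer's exact query: it runs on Python's built-in sqlite3 module instead of SQL Server (SQLite 3.25 or later for window functions), it folds the Order/OrderItem join into a single invented OrderItem table that carries CustomerId directly, it counts in a separate CTE before ranking, and it pivots with MAX(CASE ...) instead of two self-joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, invented table: one row per ordered item.
    CREATE TABLE OrderItem (CustomerId INTEGER, Description TEXT);
    -- Customer 1 ordered Latte and Bagel twice each (a tie) and Tea once.
    INSERT INTO OrderItem VALUES
        (1, 'Latte'), (1, 'Bagel'), (1, 'Latte'), (1, 'Bagel'), (1, 'Tea'),
        (2, 'Tea');
""")

rows = conn.execute("""
    WITH counts AS (
        SELECT CustomerId, Description, COUNT(*) AS cnt
        FROM OrderItem
        GROUP BY CustomerId, Description
    ),
    ranked AS (
        -- Description is the tie-breaker when two items have the same count.
        SELECT CustomerId, Description,
               ROW_NUMBER() OVER (PARTITION BY CustomerId
                                  ORDER BY cnt DESC, Description) AS RowID
        FROM counts
    )
    SELECT CustomerId,
           MAX(CASE WHEN RowID = 1 THEN Description END) AS MostOrderedItem,
           MAX(CASE WHEN RowID = 2 THEN Description END) AS SecondMostOrderedItem
    FROM ranked
    GROUP BY CustomerId
    ORDER BY CustomerId
""").fetchall()

for row in rows:
    print(row)
```

For customer 1 the two-way tie is resolved alphabetically, so Bagel is reported first and Latte second on every run. Customer 2 has only one item, so the second column is NULL (None in Python).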

Query That Returns First Results from Subquery

I'm trying to create a query that returns the results from a subquery in the result set.
Here are the tables I'm using:
Orders OrderDetails
------- -----------
orderId orderDetailId
(other data) orderId
productName
I'd like to get the first two order details for each order (Most orders have only one or two details). Here's an example of the desired result set:
orderId (other order data) productName1 productName2
------- ------------------ ------------ ------------
1 (other order data) apple grape
2 (other order data) orange banana
3 (other order data) apple orange
This is what I tried so far:
SELECT o.orderid,
Max(CASE WHEN detail = 1 THEN oi.productname END) AS ProductName1,
Max(CASE WHEN detail = 2 THEN oi.productname END) AS ProductName2
FROM orders AS o
OUTER apply (SELECT TOP 2 oi.*,
Row_number() OVER (ORDER BY orderdetailid) AS detail
FROM orderdetails AS oi
WHERE oi.orderid = o.orderid) AS oi
GROUP BY o.orderid
I'm doing this in the custom reporting module of a hosted ecommerce solution and getting the following unhelpful syntax error: SQL Error: Incorrect syntax near '('.
Unfortunately I don't know what version of SQL Server I'm using. Customer support knows nothing and select @@VERSION doesn't work.
Note, it appears the row_number() function is not properly supported even though error messages reference the function by name.
Thanks for the help!
Here is an alternative that does not use cross apply. Your ranking was correct but I added a partition by the order.
SELECT
*
FROM
(
SELECT
o.orderid,
ProductName=oi.productcode,
RowNumber=ROW_NUMBER() OVER (PARTITION BY o.orderid ORDER BY oi.orderdetailid)
FROM
orders as o
INNER JOIN orderdetails oi ON oi.orderid=o.orderid
)AS X
WHERE
RowNumber=1
Without using row_number:
SELECT
Q1.*,
oi.*
FROM
(
SELECT
o.*,
FirstOrderDetailID=(SELECT MIN(od.orderdetailid) FROM orderdetails od WHERE od.orderid=o.orderid)
FROM
orders o
)AS Q1
LEFT OUTER JOIN orderdetails oi ON oi.orderdetailid=Q1.FirstOrderDetailID
If you are just selecting orderid and the product, you don't need the join at all:
select orderid, productcode
from (SELECT oi.orderid, oi.productcode,
row_number() over (partition by oi.orderid order by oi.orderdetailid) as seqnum
FROM orderdetails oi
) oi
where seqnum = 1;
This may not fix the problem if row_number() is not working, but it simplifies the query. You can do this with the min() method as well:
select orderid, productcode
from orderdetails oi
where oi.orderdetailid in (select min(orderdetailid) from orderdetails group by orderid);

Select top and order from two tables

I am building an e-commerce marketplace. There are many sellers selling in this marketplace. For each seller, I would like to display a Best Sellers list.
Database is in SQL Server. There are 2 main tables in this case:
Table 1: Stores each order's ordered products. Fields include SellerID, OrderID, ProductID, and Quantity.
Table 2: The products master table. Fields include ProductID, ...
How can I do a query to get the top 10 products with the most orders? My SQL below doesn't seem to work...
SELECT TOP (10) SUM(d.Quantity) AS total, d.ProductID, p.Title
From OrderDetails d, Products p
WHERE d.SellerID = 'xxx' AND
d.ProductID = p.ProductID
GROUP by d.ProductID
ORDER BY total DESC
Any help is much appreciated. Thank you!
select *, d.s
from products p
inner join
(
select top 10 productid, sum(quantity) as s
From OrderDetails
group by productid
order by sum(quantity) desc
)
d on d.productid = p.productid
See this SQLFiddle example
This is just a guess. If you want the "most orders" then I would rather count the orders instead of summing the quantity.
SELECT TOP 10
COUNT(d.OrderID) AS total, d.ProductID, p.Title
FROM OrderDetails d
INNER JOIN Products p ON d.ProductID = p.ProductID
WHERE d.SellerID = 'xxx'
GROUP by d.ProductID, p.Title
ORDER BY COUNT(d.OrderID) DESC
What else I fixed:
GROUP BY was missing a column. Every column in the SELECT clause that is not inside an aggregate function has to be listed in GROUP BY.
In the ORDER BY clause I repeated the aggregate expression instead of using the alias. SQL Server does accept a SELECT alias in ORDER BY (though not in WHERE, GROUP BY, or HAVING), so this is a habit rather than a requirement.
Used the INNER JOIN syntax, which makes it harder to forget the join condition than listing the tables with commas and joining them in the WHERE clause.
This error is not caused by the type of database you are using but by how aggregate functions work. There are lots of questions and answers about this problem on Stack Overflow. Please search for them.