SQL Query for top 10 items from two relations - sql

I'm struggling to write a SQL command to get the top 10 names (using standard SQL, so I can't use TOP) from the following two relations:
Orders (customer_email, item_id, date)
Items(id, name, store, price)
Any advice on how to do this? I think I would need to group them, but then what do I do to get the top 10 groupings based on count?

select *
from (select x.*, row_number() over(order by num_orders desc) as rn
from (select i.name, count(*) as num_orders
from orders o
join items i
on o.item_id = i.id
group by i.name) x) x
where rn <= 10

SELECT
COUNT(*) count_per_item
, i.id
, i.name
FROM
Orders o
JOIN
Items i
ON (o.item_id = i.id)
GROUP BY
i.id
, i.name
ORDER BY
count_per_item DESC
LIMIT 10;
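For anyone who wants to try this out, here is a minimal sketch that runs the ROW_NUMBER() variant against an in-memory SQLite database through Python's sqlite3 module. Only the table and column names come from the question; the sample rows and the helper name top_items are made up for illustration, and SQLite 3.25 or later is assumed for window functions. It groups by id as well as name, as in the second query, and uses id as a tie-breaker so the numbering is repeatable.

```python
import sqlite3

# In-memory database with the two relations from the question (toy data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, store TEXT, price REAL);
    CREATE TABLE orders (customer_email TEXT, item_id INTEGER, date TEXT);
    INSERT INTO items VALUES (1, 'pen', 'A', 1.5), (2, 'ink', 'A', 4.0), (3, 'pad', 'B', 2.0);
    INSERT INTO orders VALUES
        ('a@example.com', 1, '2020-01-01'),
        ('b@example.com', 1, '2020-01-02'),
        ('c@example.com', 1, '2020-01-03'),
        ('a@example.com', 2, '2020-01-04'),
        ('b@example.com', 2, '2020-01-05'),
        ('a@example.com', 3, '2020-01-06');
""")

def top_items(n):
    """Return (name, num_orders) for the n most ordered items (hypothetical helper)."""
    sql = """
        SELECT name, num_orders
        FROM (SELECT x.*,
                     ROW_NUMBER() OVER (ORDER BY num_orders DESC, id) AS rn
              FROM (SELECT i.id, i.name, COUNT(*) AS num_orders
                    FROM orders o
                    JOIN items i ON o.item_id = i.id
                    GROUP BY i.id, i.name) x) ranked
        WHERE rn <= ?
        ORDER BY rn
    """
    return conn.execute(sql, (n,)).fetchall()

print(top_items(2))  # [('pen', 3), ('ink', 2)]
```

With fewer than n distinct items, the query simply returns all of them.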

Related

Not getting the result that I need by using ROW_NUMBER()

I'm using AdventureWorks2017 and I'm trying to get the top 2 selling products by year. What I have so far is below, but it only shows me the top 2 rows overall, which is not what I need: I need the top 2 products in each year.
SELECT TOP (2) ROW_NUMBER() OVER (ORDER BY sum(linetotal) DESC) ,
ProductID,
year(h.OrderDate) 'OrderYear'
from Sales.SalesOrderDetail so
left outer join Sales.SalesOrderHeader h
on so.SalesOrderID = h.SalesOrderID
group by year(h.OrderDate), ProductID
Try adding ROW_NUMBER() in a subquery, then filter on rnk <= 2 in the outer query to select the top 2:
select
ProductID,
OrderYear
from
(
SELECT
ProductID,
year(h.OrderDate) 'OrderYear',
ROW_NUMBER() OVER (PARTITION BY year(h.OrderDate) ORDER BY sum(linetotal) DESC) as rnk
from Sales.SalesOrderDetail so
left outer join Sales.SalesOrderHeader h
on so.SalesOrderID = h.SalesOrderID
group by year(h.OrderDate), ProductID
) val
where rnk <= 2
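To get the top 2 in each year rather than the top 2 overall, the numbering has to restart per year, which is what PARTITION BY does. Below is a small sketch of that idea on toy data, run through Python's sqlite3 module (SQLite 3.25+ assumed). The sales table, its columns and the rows are invented for illustration and are not the AdventureWorks schema; the aggregation is done in an inner query and the ranking in an outer one to keep each step easy to read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for the order header/detail join.
    CREATE TABLE sales (order_date TEXT, product_id INTEGER, line_total REAL);
    INSERT INTO sales VALUES
        ('2011-03-01', 1, 100), ('2011-06-01', 2, 300), ('2011-09-01', 3, 200),
        ('2011-10-01', 1,  50), ('2012-01-15', 1, 500), ('2012-02-15', 2, 100),
        ('2012-03-15', 3, 400);
""")

TOP_PER_YEAR = """
    SELECT order_year, product_id, total
    FROM (SELECT t.*,
                 -- numbering restarts at 1 for every order_year
                 ROW_NUMBER() OVER (PARTITION BY order_year
                                    ORDER BY total DESC, product_id) AS rnk
          FROM (SELECT strftime('%Y', order_date) AS order_year,
                       product_id,
                       SUM(line_total) AS total
                FROM sales
                GROUP BY strftime('%Y', order_date), product_id) t) ranked
    WHERE rnk <= 2
    ORDER BY order_year, rnk
"""
rows = conn.execute(TOP_PER_YEAR).fetchall()
for row in rows:
    print(row)
```

The extra product_id in the window's ORDER BY is only there to make ties deterministic.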
When you order ROW_NUMBER() by sum(linetotal), rows with equal sum(linetotal) values are numbered in an arbitrary order, so the result is not deterministic when there are ties.
I prefer to do it this way:
Declare a table with one more column than your query returns.
Make the first column an identity(1,1) column, then insert the query results into the remaining columns.

Get the second last record in a date column within an inner join

I need to pull the second last record in a date column called OrderDate. However, I need to bring back only one date (I am searching a table with all the purchase orders, dates and costs, from which I have to bring back only the second last order and its cost). The query as it is written today (and working) pulls the newest date.
select distinct
a.PurchaseNum, a.ItemID, a.SupplierNum, a.Location, a.OrderDate, a.Cost
from
PurchaseOrder a
inner join
(select
l.SupplierNum, l.ItemID, l.Location, maxdate = max(l.OrderDate)
from
PurchaseOrder l
where
l.Cost <> 0
group by
l.SupplierNum, l.itemid, l.Location) l on a.SupplierNum = l.SupplierNum and a.itemid = l.itemid
and l.Location = a.Location
and a.OrderDate = l.maxdate
I have tried lag() and offset, but both have limitations inside a join: they force me to add an order by and include the OrderDate column, which is not what I want because we need only one date.
A bit of context: I have a report in which I need to show the last and second last cost of a purchase order for each supplier. Getting the last cost of an order is easy; the problem is going back to the second last, and that is where I am stuck right now.
Any thoughts?
If I'm understanding you correctly, here's one option using row_number to return the 2 highest orderdate records:
select *
from (
select *,
row_number() over (partition by SupplierNum, ItemID, Location
order by OrderDate desc) rn
from PurchaseOrder
where cost <> 0
) t
where rn <= 2
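If only the second last row per group is wanted, the same numbering can be filtered with rn = 2 instead of rn <= 2. Here is a rough sketch of that variation on invented rows, run through Python's sqlite3 module (SQLite 3.25+ assumed). The column types are guesses, and using PurchaseNum as a tie-breaker for equal dates is my own assumption, not something stated in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PurchaseOrder (PurchaseNum INTEGER, ItemID INTEGER, SupplierNum INTEGER,
                                Location TEXT, OrderDate TEXT, Cost REAL);
    INSERT INTO PurchaseOrder VALUES
        (1, 10, 7, 'X', '2020-01-01', 5.0),
        (2, 10, 7, 'X', '2020-02-01', 6.0),
        (3, 10, 7, 'X', '2020-03-01', 7.0),
        (4, 10, 7, 'X', '2020-04-01', 0.0),  -- zero cost, filtered out
        (5, 20, 8, 'Y', '2020-01-10', 9.0);  -- only one order in this group
""")

SECOND_LAST = """
    SELECT SupplierNum, ItemID, Location, OrderDate, Cost
    FROM (SELECT p.*,
                 ROW_NUMBER() OVER (PARTITION BY SupplierNum, ItemID, Location
                                    ORDER BY OrderDate DESC, PurchaseNum DESC) AS rn
          FROM PurchaseOrder p
          WHERE Cost <> 0) t
    WHERE rn = 2  -- exactly the second most recent non-zero-cost order
"""
second_last = conn.execute(SECOND_LAST).fetchall()
print(second_last)  # [(7, 10, 'X', '2020-02-01', 6.0)]
```

Groups with a single qualifying order, like supplier 8 here, produce no row at all, so a report that needs every supplier would have to outer join back to the list of groups.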
Inner query does order by desc and outside query does order by asc.
select distinct top 1 a.*
from PurchaseOrder a
inner join
(
select top 2 l.*
from PurchaseOrder l
where
l.Cost <> 0
order by l.OrderDate desc) l
on a.SupplierNum = l.SupplierNum and a.itemid = l.itemid and l.Location = a.Location and a.OrderDate = l.OrderDate
order by a.orderdate
or
SELECT TOP 1 * FROM (SELECT * FROM PurchaseOrder a
EXCEPT SELECT TOP (SELECT (COUNT(*)-2) FROM PurchaseOrder l where
l.Cost <> 0
group by l.SupplierNum, l.itemid, l.Location) * FROM PurchaseOrder) A
or
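SELECT *
FROM PurchaseOrder a
WHERE a.Cost <> 0
AND a.OrderDate = ( SELECT MAX(b.OrderDate)
FROM PurchaseOrder b
WHERE b.Cost <> 0
AND b.SupplierNum = a.SupplierNum
AND b.ItemID = a.ItemID
AND b.Location = a.Location
AND b.OrderDate < ( SELECT MAX(l.OrderDate)
FROM PurchaseOrder l
WHERE l.Cost <> 0
AND l.SupplierNum = a.SupplierNum
AND l.ItemID = a.ItemID
AND l.Location = a.Location
)
) ;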
or
SELECT TOP (1) *
FROM PurchaseOrder
WHERE OrderDate < ( SELECT MAX(OrderDate)
FROM PurchaseOrder where ....
)
ORDER BY OrderDate DESC ;

Select Top1 from multiple-column query for each product ID

I have a query in SQL Server that returns the last entry in our stock of a given product, as well as many other columns. Something like:
SELECT
TOP(1) EntryDate,
EntryPrice,
TaxID,
TransportCost,
...
FROM
StockEntries
WHERE
ProductID = #ID
ORDER BY
EntryDate DESC
I cannot use MAX to get the last entry because sometimes it returns duplicate rows (when there are two entries on the same day).
I would like to execute this query for every product we have. I could do this if the query returned only one column, such as:
SELECT
p.ProductID,
(
SELECT
TOP(1) s.EntryDate
FROM
StockEntries s
WHERE
s.ProductID = p.ProductID
ORDER BY
s.EntryDate DESC
)
FROM
Products p
But as the real query returns multiple columns, I cannot see a straightforward way to do this.
Any ideas?
As you have phrased the question, cross apply seems very appropriate:
SELECT p.*, s.*
FROM products p CROSS APPLY
(SELECT TOP(1) s.*
FROM StockEntries s
WHERE s.ProductID = p.ProductID
ORDER BY s.EntryDate DESC
) s;
APPLY also allows you to select other columns from StockEntries.
You can use ROW_NUMBER() to number the rows and then keep only the row with the latest entry date per product:
SELECT *
FROM (SELECT p.productid,
s.EntryDate,
s.EntryPrice,
s.TaxID,
s.TransportCost,
ROW_NUMBER() OVER (PARTITION BY p.productid ORDER BY s.entrydate DESC) rownum
FROM products p
JOIN StockEntries s ON s.ProductID = p.ProductID
) t
WHERE rownum = 1
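To illustrate the duplicate-row problem with MAX and how ROW_NUMBER() avoids it, here is a small sketch on made-up rows, run through Python's sqlite3 module (SQLite 3.25+ assumed; SQLite has no CROSS APPLY, so only the ROW_NUMBER() route is shown). The EntryID column used as a tie-breaker is hypothetical; any column that distinguishes two entries on the same day would do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- EntryID is an assumed surrogate key, not part of the original question.
    CREATE TABLE StockEntries (EntryID INTEGER PRIMARY KEY, ProductID INTEGER,
                               EntryDate TEXT, EntryPrice REAL);
    INSERT INTO StockEntries VALUES
        (1, 100, '2021-05-01', 10.0),
        (2, 100, '2021-05-02', 11.0),
        (3, 100, '2021-05-02', 12.0),  -- second entry on the same day
        (4, 200, '2021-04-30', 20.0);
""")

# Joining back on MAX(EntryDate) returns both same-day entries for product 100.
MAX_JOIN = """
    SELECT s.ProductID, s.EntryDate, s.EntryPrice
    FROM StockEntries s
    JOIN (SELECT ProductID, MAX(EntryDate) AS MaxDate
          FROM StockEntries
          GROUP BY ProductID) m
      ON m.ProductID = s.ProductID AND m.MaxDate = s.EntryDate
    ORDER BY s.ProductID, s.EntryID
"""

# ROW_NUMBER() with a tie-breaker keeps exactly one row per product.
ROW_NUMBER_PICK = """
    SELECT ProductID, EntryDate, EntryPrice
    FROM (SELECT s.*,
                 ROW_NUMBER() OVER (PARTITION BY ProductID
                                    ORDER BY EntryDate DESC, EntryID DESC) AS rownum
          FROM StockEntries s) t
    WHERE rownum = 1
    ORDER BY ProductID
"""
max_rows = conn.execute(MAX_JOIN).fetchall()
picked_rows = conn.execute(ROW_NUMBER_PICK).fetchall()
print(len(max_rows), len(picked_rows))  # 3 2
```

Without the tie-breaker the window still returns one row per product, but which of the same-day entries it picks is not guaranteed.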

Highest Count with a group

I'm having an absolute brain fade
SELECT p.ProductCategory, f.ProductSubCategory, COUNT(*) AS Cnt
FROM Sales f
JOIN Products p ON f.ProductSubCategory = p.ProductSubCategory
GROUP BY p.ProductCategory, f.ProductSubCategory
ORDER BY 1,3 DESC
This shows me the count for each ProductSubCategory, but I would like to see only the ProductSubCategory with the highest count per ProductCategory.
I wish to see (I don't care about the Count value)
There are a couple of different ways to do this. One involves joining the results back to themselves and using the max aggregate. But since you are using SQL Server, you can use ROW_NUMBER to achieve the same result:
with cte as (
select p.productcategory, p.ProductSubCategory, COUNT(*) cnt,
ROW_NUMBER() over (partition by p.productcategory order by count(*) desc) rn
from products p
join sales s on p.ProductSubCategory = s.ProductSubCategory
group by p.productcategory, p.ProductSubCategory
)
select *
from cte
where rn = 1
You already have an answer, but please see the following code too. It may help you.
SELECT p.ProductCategory,
f.ProductSubCategory,
COUNT(*) AS Cnt
FROM Sales f
JOIN Products p ON f.ProductSubCategory = p.ProductSubCategory
JOIN (
SELECT p.ProductCategory,
f.ProductSubCategory,
ROW_NUMBER() OVER ( PARTITION BY p.ProductCategory
ORDER BY COUNT(*) DESC) [Row]
FROM Sales f
JOIN Products p ON f.ProductSubCategory = p.ProductSubCategory
GROUP BY p.ProductCategory,
f.ProductSubCategory) Lu
ON P.ProductCategory = Lu.ProductCategory
AND f.ProductSubCategory = Lu.ProductSubCategory
WHERE Lu.Row = 1
GROUP By p.ProductCategory,
f.ProductSubCategory

SQL: improving join efficiency

If I turn this sub-query which selects sales persons and their highest price paid for any item they sell:
select *,
(select top 1 highestProductPrice
from orders o
where o.salespersonid = s.id
order by highestProductPrice desc ) as highestProductPrice
from salespersons s
into this join in order to improve efficiency:
select *, highestProductPrice
from salespersons s join (
select salespersonid, highestProductPrice, row_number() over (
partition by salespersonid
order by salespersonid, highestProductPrice) as rank
from orders ) o on s.id = o.salespersonid
It still touches every order record (it enumerates the entire table before filtering by salespersonid it seems.) However you cannot do this:
select *, highestProductPrice
from salespersons s join (
select salespersonid, highestProductPrice, row_number() over (
partition by salespersonid
order by salespersonid, highestProductPrice) as rank
from orders
where orders.salespersonid = s.id) o on s.id = o.salespersonid
The where clause in the join causes a `multi-part identifier "s.id" could not be bound` error.
Is there any way to join the top 1 out of each order group with a join but without touching each record in orders?
Try
SELECT
S.*,
T.HighestProductPrice
FROM
SalesPersons S
CROSS APPLY
(
SELECT TOP 1 O.HighestProductPrice
FROM Orders O
WHERE O.SalesPersonid = S.Id
ORDER BY O.SalesPersonid, O.HighestProductPrice DESC
) T
would
select s.*, max(highestProductPrice)
from salespersons s
join orders o on o.salespersonid = s.id
group by s.*
or
select s.*, highestProductPrice
from salespersons s join (select salespersonid,
max(highestProductPrice) as highestProductPrice
from orders o
group by salespersonid) as o on o.salespersonid = s.id
work?
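One way to check the derived-table idea is to run it on toy data. The sketch below does that through Python's sqlite3 module; the rows and the name column on salespersons are invented for illustration. Note that the derived table needs GROUP BY salespersonid for MAX to return one row per sales person. This says nothing about SQL Server's execution plan, only about whether the query returns the expected rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The name column is hypothetical; the question only mentions id.
    CREATE TABLE salespersons (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (salespersonid INTEGER, highestProductPrice REAL);
    INSERT INTO salespersons VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders VALUES (1, 10.0), (1, 25.0), (2, 7.5);
""")

MAX_PER_PERSON = """
    SELECT s.id, s.name, o.highestProductPrice
    FROM salespersons s
    JOIN (SELECT salespersonid, MAX(highestProductPrice) AS highestProductPrice
          FROM orders
          GROUP BY salespersonid) o
      ON o.salespersonid = s.id
    ORDER BY s.id
"""
max_rows = conn.execute(MAX_PER_PERSON).fetchall()
print(max_rows)  # [(1, 'Ann', 25.0), (2, 'Bob', 7.5)]
```

The inner join drops sales persons with no orders (Cy above), whereas the original scalar subquery keeps them with a NULL price; a LEFT JOIN would preserve that behaviour.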