Not getting the result that I need by using ROW_NUMBER() - sql

I'm using AdventureWorks2017 and trying to get the top 2 selling products by year. What I have so far is below, but it only shows the top 2 rows overall, which is not what I need: I need the top 2 products in each year.
SELECT TOP (2) ROW_NUMBER() OVER (ORDER BY sum(linetotal) DESC) ,
ProductID,
year(h.OrderDate) 'OrderYear'
from Sales.SalesOrderDetail so
left outer join Sales.SalesOrderHeader h
on so.SalesOrderID = h.SalesOrderID
group by year(h.OrderDate), ProductID

Compute ROW_NUMBER() in a subquery, partitioned by year so the numbering restarts for each year, and then filter on rnk <= 2 in the outer query to get the top 2 per year:
select
ProductID,
OrderYear
from
(
SELECT
ProductID,
year(h.OrderDate) 'OrderYear',
ROW_NUMBER() OVER (PARTITION BY year(h.OrderDate) ORDER BY sum(linetotal) DESC) as rnk
from Sales.SalesOrderDetail so
left outer join Sales.SalesOrderHeader h
on so.SalesOrderID = h.SalesOrderID
group by year(h.OrderDate), ProductID
) val
where rnk <= 2
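To check the partitioned ranking on a small data set, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. The `sales` table, its columns and the sample rows are made up for illustration, SQLite (3.25 or later, for window functions) stands in for SQL Server, and the year is stored directly because SQLite has no YEAR() function. The totals are aggregated in a CTE first and ranked in a second step, and `product_id` is added as a tie-breaker so the output is deterministic.

```python
import sqlite3

# Hypothetical sample data: (order_year, product_id, line_total)
SAMPLE_ROWS = [
    (2011, 1, 100.0), (2011, 2, 50.0), (2011, 3, 75.0), (2011, 1, 20.0),
    (2012, 2, 10.0), (2012, 3, 40.0), (2012, 4, 30.0),
]

TOP_N_PER_YEAR = """
WITH totals AS (
    -- one row per (year, product) with its summed sales
    SELECT order_year, product_id, SUM(line_total) AS total
    FROM sales
    GROUP BY order_year, product_id
),
ranked AS (
    -- numbering restarts at 1 for every year because of PARTITION BY
    SELECT order_year, product_id, total,
           ROW_NUMBER() OVER (
               PARTITION BY order_year
               ORDER BY total DESC, product_id
           ) AS rnk
    FROM totals
)
SELECT order_year, product_id, total
FROM ranked
WHERE rnk <= :n
ORDER BY order_year, rnk
"""


def top_products_per_year(rows, n=2):
    """Return the n best-selling products for each year in rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE sales (order_year INTEGER, product_id INTEGER, line_total REAL)"
        )
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
        return conn.execute(TOP_N_PER_YEAR, {"n": n}).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in top_products_per_year(SAMPLE_ROWS):
        print(row)
```

Without the PARTITION BY clause the numbering runs over all years together, so the filter keeps only the two best rows overall, which is the behaviour described in the question.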

When you ORDER your ROW_NUMBER by sum(linetotal), the numbering among tied rows is arbitrary: if several products have the same sum(linetotal), which of them gets the lower row number is not guaranteed.
I prefer to do it this way:
Declare a table variable with one more column than your query returns.
Make the first column an identity(1,1) column, then insert the query results into the remaining columns.
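As a rough illustration of the tie problem, the sketch below compares ROW_NUMBER() with an explicit tie-breaker against RANK() and DENSE_RANK() on a tiny, made-up `totals` table. It uses Python's sqlite3 module with an in-memory SQLite database (3.25 or later) as a stand-in for SQL Server; the table and column names are assumptions for the example only.

```python
import sqlite3


def rank_with_ties(totals):
    """Return (product_id, row_num, rnk, dense_rnk) for each product."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE totals (product_id INTEGER, total REAL)")
        conn.executemany("INSERT INTO totals VALUES (?, ?)", totals)
        return conn.execute(
            """
            SELECT product_id,
                   -- the second sort key makes the numbering deterministic
                   ROW_NUMBER() OVER (ORDER BY total DESC, product_id) AS row_num,
                   -- tied rows share a rank; RANK leaves a gap after them
                   RANK()       OVER (ORDER BY total DESC) AS rnk,
                   -- DENSE_RANK shares a rank without leaving a gap
                   DENSE_RANK() OVER (ORDER BY total DESC) AS dense_rnk
            FROM totals
            ORDER BY product_id
            """
        ).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # products 1 and 2 are tied on their total
    for row in rank_with_ties([(1, 100.0), (2, 100.0), (3, 50.0)]):
        print(row)
```

Adding a unique column as the last ORDER BY key is usually enough to make ROW_NUMBER() repeatable; RANK() or DENSE_RANK() are the alternatives when tied rows should all be kept.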

Related

Average of top 2

I would like to get the average of the top2 limit1 per policyid. I need my resulting table to also have objectid.
Limit1 and objectid come from the table p_coverage.
Policyid comes from the table p_risk.
The table p_item is a linking table between p_risk and p_coverage.
The way I thought I should build my query is: create a ranking of limit1 within each policyid. Then take the avg top2.
However, the ranking doesn't work and gives wrong results. My query works if I take columns from ONE table, but as soon as I add joins between the tables it gives a false ranking.
SELECT policyid, limit1, /*pcob,*/ RANK() OVER(PARTITION BY policyid ORDER BY limit1 DESC) AS rn
FROM (SELECT policyid, limit1/*, pc.objectid ASpcob*/
FROM p_risk pr
LEFT JOIN p_item
ON pr.objectid=p_item.riskobjectid
LEFT JOIN p_coverage pc
ON p_item.objectid=pc.insuranceitemid) AS s
) AS SubQueryAlias
GROUP BY
policyid, limit1/*, pcob*/, rn
ORDER BY rn,policyid,limit1 DESC
The table at the end of the picture is what I'd like to have. The first table is the result of Gordon Linoff's query.
If I understand correctly, you want the ranking function (DENSE_RANK() here) in the subquery, and then to filter and aggregate in the outer query:
SELECT policyid, AVG(limit1) as avg_top2_limit1
FROM (SELECT policyid, limit1,
DENSE_RANK() OVER (PARTITION BY policyid ORDER BY limit1 DESC) as seqnum
FROM p_risk pr LEFT JOIN
p_item i
ON pr.objectid = i.riskobjectid LEFT JOIN
p_coverage pc
ON i.objectid = pc.insuranceitemid
) p
WHERE seqnum <= 2
GROUP BY policyid
Thanks to the previous answer, I managed to do what I wanted. Here is the query:
select b.policyid, avg(b.limit1) as avg_top2_limit1 from(
SELECT distinct(policyid) policyid, limit1
FROM (SELECT policyid, limit1,
Dense_rank() OVER (PARTITION BY policyid ORDER BY limit1 DESC) as seqnum
FROM p_risk pr LEFT JOIN
p_item i
ON pr.objectid = i.riskobjectid LEFT JOIN
p_coverage pc
ON i.objectid = pc.insuranceitemid) AS s
WHERE seqnum <= 2 ) as b
GROUP BY policyid
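The choice of ranking function and the DISTINCT both change the average when a policy has duplicate limits. The sketch below is one way to see the difference; it assumes the three joined tables have already been flattened into a single hypothetical `limits` table, and it uses Python's sqlite3 module with in-memory SQLite (3.25 or later) rather than the original database.

```python
import sqlite3

# Hypothetical flattened result of the joins: (policyid, limit1)
SAMPLE_LIMITS = [(1, 100.0), (1, 100.0), (1, 80.0), (1, 50.0), (2, 30.0)]


def avg_top2(rank_fn, distinct=False, rows=SAMPLE_LIMITS):
    """Average the top-2 limit1 values per policy with the given ranking function."""
    if rank_fn not in ("ROW_NUMBER", "DENSE_RANK"):
        raise ValueError("unsupported ranking function")
    select_kw = "SELECT DISTINCT" if distinct else "SELECT"
    query = f"""
        SELECT policyid, AVG(limit1) AS avg_top2_limit1
        FROM (
            {select_kw} policyid, limit1
            FROM (
                SELECT policyid, limit1,
                       {rank_fn}() OVER (
                           PARTITION BY policyid ORDER BY limit1 DESC
                       ) AS seqnum
                FROM limits
            ) AS s
            WHERE seqnum <= 2
        ) AS b
        GROUP BY policyid
        ORDER BY policyid
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE limits (policyid INTEGER, limit1 REAL)")
        conn.executemany("INSERT INTO limits VALUES (?, ?)", rows)
        return dict(conn.execute(query).fetchall())
    finally:
        conn.close()


if __name__ == "__main__":
    print(avg_top2("ROW_NUMBER"))                 # two highest rows, duplicates included
    print(avg_top2("DENSE_RANK"))                 # all rows holding the two highest values
    print(avg_top2("DENSE_RANK", distinct=True))  # the two highest distinct values
```

For policy 1 the three variants average (100, 100), (100, 100, 80) and (100, 80) respectively, so which one is right depends on whether duplicate limits should count once or several times.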

Get the second last record in a date column within a inner join

I need to pull the second-to-last record in a date column called OrderDate. However, I need to return only one date (I am searching a table with all the purchase orders, dates and costs, from which I have to return only the second-to-last date and its cost). The query as written today (and working) returns the newest date.
select distinct
a.PurchaseNum, a.ItemID, a.SupplierNum, a.Location, a.OrderDate, a.Cost
from
PurchaseOrder a
inner join
(select
l.SupplierNum, l.ItemID, l.Location, maxdate = max(l.OrderDate)
from
PurchaseOrder l
where
l.Cost <> 0
group by
l.SupplierNum, l.itemid, l.Location) l on a.SupplierNum = l.SupplierNum and a.itemid = l.itemid
and l.Location = a.Location
and a.OrderDate = l.maxdate
I have tried lag() and offset, but they have limitations inside a join: they force me to use ORDER BY and include the OrderDate column, which is not what I want because we need only one date.
A bit of context: I have a report in which I need to show the last and second-to-last cost of a purchase order for each supplier. Getting the last cost of an order is easy; the problem is going back to the second-to-last one, and that is where I am stuck right now.
Any thoughts?
If I'm understanding you correctly, here's one option using row_number to return the 2 most recent OrderDate records per supplier, item and location (filter on rn = 2 instead to keep only the second-to-last one):
select *
from (
select *,
row_number() over (partition by SupplierNum, ItemID, Location
order by OrderDate desc) rn
from PurchaseOrder
where cost <> 0
) t
where rn <= 2
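Since the report needs the last and the second-to-last cost side by side, the two numbered rows can be folded into one row per group with conditional aggregation. The following is only a sketch under assumptions: it runs on in-memory SQLite (3.25 or later) through Python's sqlite3 module instead of SQL Server, the sample rows are invented, and the dates are stored as ISO strings so that text ordering matches date ordering.

```python
import sqlite3

# Hypothetical rows: (SupplierNum, ItemID, Location, OrderDate, Cost)
SAMPLE_ORDERS = [
    ("S1", "I1", "L1", "2023-01-10", 5.0),
    ("S1", "I1", "L1", "2023-02-15", 6.0),
    ("S1", "I1", "L1", "2023-03-20", 7.0),
    ("S1", "I1", "L1", "2023-04-01", 0.0),  # zero cost, ignored by the filter
    ("S2", "I1", "L1", "2023-01-05", 9.0),  # only one order: no previous cost
]

LAST_TWO_COSTS = """
SELECT SupplierNum, ItemID, Location,
       MAX(CASE WHEN rn = 1 THEN Cost END) AS last_cost,
       MAX(CASE WHEN rn = 2 THEN Cost END) AS prev_cost
FROM (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY SupplierNum, ItemID, Location
               ORDER BY OrderDate DESC
           ) AS rn
    FROM PurchaseOrder
    WHERE Cost <> 0
) AS t
WHERE rn <= 2
GROUP BY SupplierNum, ItemID, Location
ORDER BY SupplierNum, ItemID, Location
"""


def last_two_costs(rows):
    """Return one row per group with the latest and the previous non-zero cost."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE PurchaseOrder ("
            "SupplierNum TEXT, ItemID TEXT, Location TEXT, OrderDate TEXT, Cost REAL)"
        )
        conn.executemany("INSERT INTO PurchaseOrder VALUES (?, ?, ?, ?, ?)", rows)
        return conn.execute(LAST_TWO_COSTS).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in last_two_costs(SAMPLE_ORDERS):
        print(row)
```

A group with a single qualifying order gets NULL (None in Python) for the previous cost rather than being dropped, which may or may not be what the report wants.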
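The inner query orders by date descending and keeps the 2 newest rows; the outer query orders ascending and keeps the first of those, which is the second-to-last row. This works across the whole table, so filter the inner query on one SupplierNum, ItemID and Location if you need it for a single group.
select distinct top 1 a.*
from PurchaseOrder a
inner join
(
select Top 2 l.*
from PurchaseOrder l
where
l.Cost <> 0
order by l.OrderDate desc) l
on a.SupplierNum = l.SupplierNum and a.itemid = l.itemid and l.Location = a.Location and a.OrderDate = l.OrderDate
order by a.OrderDate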
or
SELECT TOP 1 * FROM (SELECT * FROM PurchaseOrder a
EXCEPT SELECT TOP (SELECT (COUNT(*)-2) FROM PurchaseOrder a where
l.Cost <> 0
group by l.SupplierNum, l.itemid, l.Location) * FROM PurchaseOrder) A
or
SELECT *
FROM PurchaseOrder a
WHERE OrderDate = ( SELECT MAX(OrderDate)
FROM PurchaseOrder
WHERE Orderdate < ( SELECT MAX(OrderDate)
FROM PurchaseOrder l where
l.Cost <> 0
)
) ;
or
SELECT TOP (1) *
FROM PurchaseOrder
WHERE OrderDate < ( SELECT MAX(OrderDate)
FROM PurchaseOrder where ....
)
ORDER BY OrderDate DESC ;

Select Top1 from multiple-column query for each product ID

I have a query in SQL Server that returns the latest entry in our stock for a given product, as well as many other columns. Something like:
SELECT
TOP(1) EntryDate,
EntryPrice,
TaxID,
TransportCost,
...
FROM
StockEntries
WHERE
ProductID = #ID
ORDER BY
EntryDate DESC
I cannot use MAX to get the last entry because it sometimes returns duplicate rows (when there are two entries on the same day).
I would like to execute this query for every product we have. I could do this if the query returned only 1 column, such as:
SELECT
ProductID p,
(
SELECT
TOP(1) s.EntryDate
FROM
StockEntries s
WHERE
s.ProductID = p.ProductID
ORDER BY
s.EntryDate DESC
)
FROM
Products p
But as it returns multiple columns, I cannot see a straightforward way to do this.
Any ideas?
As you have phrased the question, cross apply seems very appropriate:
SELECT p.*, s.*
FROM products p CROSS APPLY
(SELECT TOP(1) s.*
FROM StockEntries s
WHERE s.ProductID = p.ProductID
ORDER BY s.EntryDate DESC
) s;
APPLY also allows you to select other columns from StockEntries.
You can use ROW_NUMBER() to number the rows of each product by entry date and then keep only the row with the latest entry date per product:
SELECT *
FROM (SELECT p.productid,
s.EntryDate,
s.EntryPrice,
s.TaxID,
s.TransportCost,
ROW_NUMBER() OVER (PARTITION BY p.productid ORDER BY s.entrydate DESC) rownum
FROM products p
JOIN StockEntries s ON s.ProductID = p.ProductID
) t
WHERE rownum = 1

SQL Query for top 10 items from two relations

I'm struggling to write a SQL command to get the top 10 names (using standard SQL, can't use TOP) for the following 2 relations:
Orders (customer_email, item_id, date)
Items(id, name, store, price)
Any advice on how to do this? I think I would need to group them, but then what do I do to get the top 10 groupings based on count?
select *
from (select x.*, row_number() over(order by num_orders desc) as rn
from (select i.name, count(*) as num_orders
from orders o
join items i
on o.item_id = i.id
group by i.name) x) x
where rn <= 10
SELECT
COUNT(*) count_per_item
, i.id
, i.name
FROM
Orders o
JOIN
Items i
ON (o.item_id = i.id)
GROUP BY
i.id
, i.name
ORDER BY
count_per_item DESC
LIMIT 10;

Highest Count with a group

I'm having an absolute brain fade
SELECT p.ProductCategory, f.ProductSubCategory, COUNT(*) AS Cnt
FROM Sales f
JOIN Products p ON f.ProductSubCategory = p.ProductSubCategory
GROUP BY p.ProductCategory, f.ProductSubCategory
ORDER BY 1,3 DESC
This shows me the count for each ProductSubCategory, I would like to see only the highest ProductSubCategory per ProductCategory.
I wish to see (I don't care about the Count value)
There are a couple of different ways to do this. One involves joining the results back to themselves and using the max aggregate. But since you are using SQL Server, you can use ROW_NUMBER to achieve the same result:
with cte as (
select p.productcategory, p.ProductSubCategory, COUNT(*) cnt,
ROW_NUMBER() over (partition by p.productcategory order by count(*) desc) rn
from products p
join sales s on p.ProductSubCategory = s.ProductSubCategory
group by p.productcategory, p.ProductSubCategory
)
select *
from cte
where rn = 1
You already have an answer, but the following code may help too.
SELECT p.ProductCategory,
f.ProductSubCategory,
COUNT(*) AS Cnt
FROM Sales f
JOIN Products p ON f.ProductSubCategory = p.ProductSubCategory
JOIN (
SELECT p.ProductCategory,
f.ProductSubCategory,
ROW_NUMBER() OVER ( PARTITION BY p.ProductCategory
ORDER BY COUNT(*) DESC) [Row]
FROM Sales f
JOIN Products p ON f.ProductSubCategory = p.ProductSubCategory
GROUP BY p.ProductCategory,
f.ProductSubCategory) Lu
ON P.ProductCategory = Lu.ProductCategory
AND f.ProductSubCategory = Lu.ProductSubCategory
WHERE Lu.Row = 1
GROUP By p.ProductCategory,
f.ProductSubCategory