Limiting SQL Result - sql

I have a query where I am trying to get results from a table:
SELECT P.P_CODE, P.P_JEWELRYTYPE,P.p_catalog, P_AVAILABLE, P_RESERVED , p.p_catalog
FROM products P
WHERE
P.p_catalog IN (8796093383256,8796093252184,8796093317720,8796093121112,8796093186648);
I want a query where I can limit the number of results to 500 of each catalog type. How should I modify my query to achieve this?

Analytical function row_number() will work which will assign row numbers to each p_catalog so that you could filter on that condition in WHERE clause in an outer query:
SELECT *
FROM (
SELECT
p.P_CODE, p.P_JEWELRYTYPE, p.p_catalog, p.P_AVAILABLE, p.P_RESERVED, p.p_catalog,
row_number() over (partition by p.p_catalog) as rn
FROM products p
WHERE p.p_catalog IN (8796093383256,8796093252184,8796093317720,8796093121112,8796093186648)
) p
WHERE rn <= 500;

Use row_number():
select p.*
from (select p.*, row_number() over (partition by catalog order by null) as seqnum
from products p
) p
where p.seqnum <= 500;
You can put the where filtering in either the inner or outer query.

Related

How optimize select with max subquery on the same table?

We have many old selects like this:
SELECT
tm."ID",tm."R_PERSONES",tm."R_DATASOURCE", ,tm."MATCHCODE",
d.NAME AS DATASOURCE,
p.PDID
FROM TABLE_MAPPINGS tm,
PERSONES p,
DATASOURCES d,
(select ID
from TABLE_MAPPINGS
where (R_PERSONES, MATCHCODE)
in (select
R_PERSONES, MATCHCODE
from TABLE_MAPPINGS
where
id in (select max(id)
from TABLE_MAPPINGS
group by MATCHCODE)
)
) tm2
WHERE tm.R_PERSONES = p.ID
AND tm.R_DATASOURCE=d.ID
and tm2.id = tm.id;
These are large tables, and queries take a long time.
How to rebuild them?
Thank you
You can query the table only once using something like (untested as you have not provided a minimal example of your create table statements or sample data):
SELECT *
FROM (
SELECT m.*,
COUNT(CASE WHEN rnk = 1 THEN 1 END)
OVER (PARTITION BY r_persones, matchcode) AS has_max_id
FROM (
SELECT tm.ID,
tm.R_PERSONES,
tm.R_DATASOURCE,
tm.MATCHCODE,
d.NAME AS DATASOURCE,
p.PDID,
RANK() OVER (PARTITION BY tm.matchcode ORDER BY tm.id DESC) As rnk
FROM TABLE_MAPPINGS tm
INNER JOIN PERSONES p ON tm.R_PERSONES = p.ID
INNER JOIN DATASOURCES d ON tm.R_DATASOURCE = d.ID
) m
)
WHERE has_max_id > 0;
First finding the maximum ID using the RANK analytic function and then finding all the relevant r_persones, matchcode pairs using conditional aggregation in a COUNT analytic function.
Note: you want to use the RANK or DENSE_RANK analytic functions to match the maximums as it can match multiple rows per partition; whereas ROW_NUMBER will only ever put a single row per partition first.
You're querying table_mappings 3 times; how about doing it only once?
WITH
tab_map
AS
(SELECT a.id,
a.r_persones,
a.matchcode,
a.datasource,
ROW_NUMBER ()
OVER (PARTITION BY a.matchcode ORDER BY a.id DESC) rn
FROM table_mappings a)
SELECT tm.id,
tm.r_persones,
tm.matchcode,
d.name AS datasource,
p.pdid
FROM tab_map tm
JOIN persones p ON p.id = tm.r_persones
JOIN datasources d ON d.id = tm.r_datasource
WHERE tm.rn = 1

Not getting the result that I need by using ROW_NUMBER()

I'm using advantureworks2017 and what I'm trying to get is the top 2 selling products by year,, what I have so far is this but it's showing me only the top 2 rows which is not what I need, I need the top 2 products in each year.
SELECT TOP (2) ROW_NUMBER() OVER (ORDER BY sum(linetotal) DESC) ,
ProductID,
year(h.OrderDate) 'OrderYear'
from Sales.SalesOrderDetail so
left outer join Sales.SalesOrderHeader h
on so.SalesOrderID = h.SalesOrderID
group by year(h.OrderDate), ProductID
Try to add row_number in the subquery and then use that rank <= 2 in the outer query to select top 2
select
ProductID,
OrderYear
from
(
SELECT
ProductID,
year(h.OrderDate) 'OrderYear',
ROW_NUMBER() OVER (ORDER BY sum(linetotal) DESC) as rnk
from Sales.SalesOrderDetail so
left outer join Sales.SalesOrderHeader h
on so.SalesOrderID = h.SalesOrderID
group by year(h.OrderDate), ProductID
) val
where rnk <= 2
When you ORDER your ROW_NUMBER by sum(linetotal) it's goning to fail if you have multiple sum(linetotal) which are equal.
I prefer to do it that way:
Declare table(number of columns = number of your query results columns + 1)
fill first column in declared table with identity(1,1) and next insert query results into the rest columns.

Average of top 2

I would like to get the average of the top2 limit1 per policyid. I need my resulting table to also have objectid.
Limit1 and objectid come from the table p_coverage.
Policyid comes from the table p_risk.
The table p_item is a linking table between p_risk and p_coverage.
The way I thought I should build my query is: create a ranking of limit1 within each policyid. Then take the avg top2.
However the ranking doesn't work and give wrong result. My query works if I take columns from ONE table, but as soon as I add joins between them it gives false ranking.
SELECT policyid, limit1, /*pcob,*/ RANK() OVER(PARTITION BY policyid ORDER BY limit1 DESC) AS rn
FROM (SELECT policyid, limit1/*, pc.objectid ASpcob*/
FROM p_risk pr
LEFT JOIN p_item
ON pr.objectid=p_item.riskobjectid
LEFT JOIN p_coverage pc
ON p_item.objectid=pc.insuranceitemid) AS s
) AS SubQueryAlias
GROUP BY
policyid, limit1/*, pcob*/, rn
ORDER BY rn,policyid,limit1 DESC
The table at the end of the picture is what I'd like to have. The first table is the result of the query of Golden Linoff
If I understand correctly, you want the ROW_NUMBER() in the subquery and then to aggregate and filter in the outer query:
SELECT policyid, AVG(limit1) as avg_top2_limit1
FROM (SELECT policyid, limit1,
DENSE_RANK() OVER (PARTITION BY policyid ORDER BY limit1 DESC) as seqnum
FROM p_risk pr LEFT JOIN
p_item i
ON pr.objectid = i.riskobjectid LEFT JOIN
p_coverage pc
ON i.objectid = pc.insuranceitemid) AS s
) p
WHERE seqnum <= 2
GROUP BY policyid
thanks to previous comment! I succeed to do what I wanted. There is the query
select b.policyid, avg(b.limit1) as avg_top2_limit1 from(
SELECT distinct(policyid) policyid, limit1
FROM (SELECT policyid, limit1,
Dense_rank() OVER (PARTITION BY policyid ORDER BY limit1 DESC) as
seqnum
FROM p_risk pr LEFT JOIN
p_item i
ON pr.objectid = i.riskobjectid LEFT JOIN
p_coverage pc
ON i.objectid = pc.insuranceitemid) AS s
WHERE seqnum <= 2 ) as b
GROUP BY policyid`

Select Top1 from multiple-column query for each product ID

I have a query in SQLServer that returns the last entry in our stock of a given product, as well as many other columns. Something like:
SELECT
TOP(1) EntryDate,
EntryPrice,
TaxID,
TransportCost,
...
FROM
StockEntries
WHERE
ProductID = #ID
ORDER BY
EntryDate DESC
I cannot use MAX to get the last entry because sometimes it returns duplicate rows (when there are two entries at the same day).
I would like to execute this query for every product we have. I could do this if the query returned only 1 row, such as:
SELECT
ProductID p,
(
SELECT
TOP(1) s.EntryDate
FROM
StockEntries s
WHERE
s.ProductID = p.ProductID
ORDER BY
s.EntryDate DESC
)
FROM
Products p
But as it returns multiple rows, I cannot see a straight way to do this.
Any ideas?
As you have phrased the question, cross apply seems very appropriate:
SELECT p.*, s.*
FROM products p CROSS APPLY
(SELECT TOP(1) s.*
FROM StockEntries s
WHERE s.ProductID = p.ProductID
ORDER BY s.EntryDate DESC
) s;
APPLY also allows you to select other columns from StockEntries.
you can use ROW_NUMBER() to rank each row and then just get the rows with the highest entry date per product.
SELECT *
FROM (SELECT p.productid,
s.EntryDate,
s.EntryPrice,
s.TaxID,
s.TransportCost,
ROW_NUMBER() OVER (PARTITION BY p.productid ORDER BY s.entrydate DESC) rownum
FROM products p
JOIN StockEntries s ON s.ProductID = p.ProductID
) t
WHERE rownum = 1

Highest Count with a group

I'm having an absolute brain fade
SELECT p.ProductCategory, f.ProductSubCategory, COUNT(*) AS Cnt
FROM Sales f
JOIN Products p ON f.ProductSubCategory = p.ProductSubCategory
GROUP BY p.ProductCategory, f.ProductSubCategory
ORDER BY 1,3 DESC
This shows me the count for each ProductSubCategory, I would like to see only the highest ProductSubCategory per ProductCategory.
I wish to see (I don't care about the Count value)
There are a couple of different ways to do this. One involves joining the results back to themselves and using the max aggregate. But since you are using SQL Server, you can use ROW_NUMBER to achieve the same result:
with cte as (
select p.productcategory, p.ProductSubCategory, COUNT(*) cnt,
ROW_NUMBER() over (partition by p.productcategory order by count(*) desc) rn
from products p
join sales s on p.ProductSubCategory = s.ProductSubCategory
group by p.productcategory, p.ProductSubCategory
)
select *
from cte
where rn = 1
You already got the answer, Please see the following code to. It may help you.
SELECT p.ProductCategory,
f.ProductSubCategory,
COUNT(*) AS Cnt
FROM Sales f
JOIN Products p ON f.ProductSubCategory = p.ProductSubCategory
JOIN (
SELECT p.ProductCategory,
f.ProductSubCategory,
ROW_NUMBER() OVER ( PARTITION BY p.ProductCategory,
f.ProductSubCategory
ORDER BY COUNT(*) DESC) [Row]
FROM Sales f
JOIN Products p ON f.ProductSubCategory = p.ProductSubCategory) Lu
ON P.ProductCategory = Lu.ProductCategory
AND f.ProductSubCategory = Lu.ProductSubCategory
WHERE Lu.Row = 1
GROUP By p.ProductCategory,
f.ProductSubCategory