SQL select x number of rows from table based on column value - sql

I'm looking for a way to select the top 3 rows for each of 4 vendors from a table of products, following these criteria:
Must select 4 vendors.
Must select top 3 products for each vendor ordered by product rating.
I tried doing something like:
select top 12 * product, vendor
from products
order by productrating
but obviously that doesn't give me 3 products for each vendor.
The product table has:
productid (int), productname (nvarchar(500)), productrating (float),
vendor (id), price (float).
These are the relevant columns.

You can use the ANSI standard row_number() function to get 3 products for each vendor:
select p.*
from (select p.*,
row_number() over (partition by vendor order by productrating desc) as seqnum
from products p
) p
where p.seqnum <= 3
If you want 4 vendors:
select top 12 p.*
from (select p.*,
row_number() over (partition by vendor order by productrating desc) as seqnum
from products p
) p
where p.seqnum <= 3
order by vendor;
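A quick way to check the row_number() approach is to run it against an in-memory SQLite database from Python (SQLite 3.25 or later is needed for window functions). This is only a sketch: the sample rows are made up, and the column names are taken from the question's products table.

```python
import sqlite3

# Hypothetical sample data; column names follow the question's products table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products ("
    "productid INTEGER, productname TEXT, productrating REAL, vendor INTEGER)"
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?, ?)",
    [
        (1, "a1", 4.5, 1), (2, "a2", 3.0, 1), (3, "a3", 5.0, 1), (4, "a4", 1.0, 1),
        (5, "b1", 2.0, 2), (6, "b2", 4.0, 2),
        (7, "c1", 3.5, 3), (8, "c2", 3.6, 3), (9, "c3", 3.7, 3), (10, "c4", 3.8, 3),
    ],
)

# Number the products of each vendor by rating (best first), keep the first 3.
query = """
SELECT vendor, productname
FROM (SELECT p.*,
             ROW_NUMBER() OVER (PARTITION BY vendor
                                ORDER BY productrating DESC) AS seqnum
      FROM products p) AS ranked
WHERE seqnum <= 3
ORDER BY vendor, seqnum
"""
top3 = conn.execute(query).fetchall()
```

Note that vendor 2 has only two products in this sample, so it contributes two rows. For the same reason, a fixed row limit of 12 only corresponds to exactly 4 vendors when each of them has at least 3 products.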

This will give you the top 3 products per vendor, assuming a Vendors table with VendorID and VendorName columns. You didn't specify how you're selecting the 4 vendors. That logic could easily be included using the WHERE clause or using a different ORDER BY, depending on how you select the 4 vendors.
SELECT TOP 12 vnd.VendorName, apl.ProductName
FROM Vendors vnd
CROSS APPLY (
SELECT TOP 3 prd.ProductName
FROM Products prd
WHERE prd.Vendor = vnd.VendorID
ORDER BY prd.ProductRating DESC
) apl
ORDER BY vnd.VendorName

If you have a fixed list of vendors you would like to query, you can use the following approach:
SELECT TOP 3
p.ProductID
FROM Products p
WHERE p.ProductID IN ( SELECT v.ProductID
FROM Vendors v
WHERE v.VendorID IN (Vendor1ID, Vendor2ID, Vendor3ID, Vendor4ID) )
ORDER BY p.ProductRating DESC
If you only have vendor names, you will need a workaround: filter the vendors by name, but keep the join condition on their IDs.

Related

How to get row numbers without duplicating

I want to get row numbers for a list. I tried ROW_NUMBER() and DENSE_RANK() in different variations, but just got duplicated rows.
My code (below) returns a list of all orders that include at least one of the products of order 20 (the three products '1013', '1024', '1025').
The problem is that when I try to add row numbers to that list, some rows are duplicated, because an order can include more than one of those products.
With my code the result looks like this:
Order_number
20
22
27
With ROW_NUMBER() it looks like this, and that is the problem:
Row_number | Order_number
1 20
2 20
3 20
4 22
5 27
6 27
I want it to look like this:
Row_number | Order_number
1 20
2 22
3 27
SELECT DISTINCT ORDER_ID AS 'ORDERS, WHICH HAVE AT LEAST ONE PRODUCT OF ORDER 20'
FROM ORDERS
INNER JOIN STORAGE ON ORDERS.PRODUCT_ID = STORAGE.PRODUCT_ID
WHERE STORAGE.PRODUCT_ID IN ('1013', '1024', '1025');
I would suggest using exists, so you don't have to deal with duplicate elimination:
SELECT o.ORDER_ID
FROM ORDERS o
WHERE EXISTS (SELECT 1
FROM STORAGE S
WHERE o.PRODUCT_ID = s.PRODUCT_ID AND
s.PRODUCT_ID IN (1013, 1024, 1025)
);
With an index on STORAGE(ORDER_ID, PRODUCT_ID) this should have very good performance.
You can also do this directly using aggregation on STORAGE:
SELECT s.ORDER_ID
FROM STORAGE S
WHERE s.PRODUCT_ID IN (1013, 1024, 1025)
GROUP BY s.ORDER_ID;
1. Get your distinct ORDER_ID first, then number:
SELECT ROW_NUMBER() OVER (ORDER BY ORDER_ID), ORDER_ID
FROM
(
SELECT ORDER_ID
FROM ORDERS
INNER JOIN STORAGE ON ORDERS.PRODUCT_ID = STORAGE.PRODUCT_ID
WHERE STORAGE.PRODUCT_ID IN ('1013', '1024', '1025')
GROUP BY ORDER_ID
) dt
2. Don't join to the STORAGE table; use a correlated subquery instead, and group by ORDER_ID so that each order is numbered once:
SELECT ROW_NUMBER() OVER (ORDER BY ORDER_ID), ORDER_ID
FROM ORDERS o
WHERE EXISTS
(
SELECT 1
FROM STORAGE s
WHERE s.PRODUCT_ID IN ('1013', '1024', '1025')
AND s.PRODUCT_ID = o.PRODUCT_ID
)
GROUP BY ORDER_ID
3. Use DENSE_RANK() (haven't tested since you don't say what RDBMS you are using, but it may work)
SELECT DISTINCT DENSE_RANK() OVER (ORDER BY ORDER_ID), ORDER_ID
FROM ORDERS
INNER JOIN STORAGE ON ORDERS.PRODUCT_ID = STORAGE.PRODUCT_ID
WHERE STORAGE.PRODUCT_ID IN ('1013', '1024', '1025')
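To see why the order IDs have to be made distinct before numbering, here is a small sketch run against an in-memory SQLite database from Python. It is simplified on purpose: the STORAGE join is left out and the ORDERS rows are invented to reproduce the counts shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, product_id INTEGER)")
# Hypothetical rows: one row per product in an order.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        (20, 1013), (20, 1024), (20, 1025),
        (21, 2000),
        (22, 1013),
        (27, 1024), (27, 1025),
    ],
)

# Numbering the raw rows: every matching product produces its own row.
naive = conn.execute("""
SELECT ROW_NUMBER() OVER (ORDER BY order_id) AS rn, order_id
FROM orders
WHERE product_id IN (1013, 1024, 1025)
ORDER BY rn
""").fetchall()

# Deduplicate first in a derived table, then number the distinct order IDs.
numbered = conn.execute("""
SELECT ROW_NUMBER() OVER (ORDER BY order_id) AS rn, order_id
FROM (SELECT DISTINCT order_id
      FROM orders
      WHERE product_id IN (1013, 1024, 1025)) AS dt
ORDER BY rn
""").fetchall()
```

Here `naive` has six rows (order 20 three times, order 27 twice), while `numbered` has one row per order.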

SQL Select Group By Min() - but select other

I want to select the ID of the Table Products with the lowest Price Grouped By Product.
ID Product Price
1 123 10
2 123 11
3 234 20
4 234 21
Which by logic would look like this:
SELECT
ID,
Min(Price)
FROM
Products
GROUP BY
Product
But I don't want to select the Price itself, just the ID.
Resulting in
1
3
EDIT: The DBMSes used are Firebird and Filemaker
You didn't specify your DBMS, so this is ANSI standard SQL:
select id
from (
select id,
row_number() over (partition by product order by price) as rn
from products
) t
where rn = 1
order by id;
If your DBMS doesn't support window functions, you can do that by joining against a derived table:
select p.id
from products p
join (
select product,
min(price) as min_price
from products
group by product
) t on t.product = p.product and t.min_price = p.price;
Note that this will return a slightly different result than the first solution: if the minimum price for a product occurs more than once, all those IDs will be returned. The first solution will only return one of them. If you don't want that, you need to group again in the outer query:
select min(p.id)
from products p
join (
select product,
min(price) as min_price
from products
group by product
) t on t.product = p.product and t.min_price = p.price
group by p.product;
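The difference between the two join variants only shows up when the minimum price is tied. A minimal sketch, run with SQLite from Python, using the question's four rows plus one invented row (ID 5) that ties with ID 3:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER, product INTEGER, price INTEGER)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [
        (1, 123, 10), (2, 123, 11), (3, 234, 20), (4, 234, 21),
        (5, 234, 20),  # hypothetical extra row: same minimum price as id 3
    ],
)

# Join against the per-product minimum: returns every ID at the minimum price.
all_ties = [row[0] for row in conn.execute("""
SELECT p.id
FROM products p
JOIN (SELECT product, MIN(price) AS min_price
      FROM products
      GROUP BY product) t
  ON t.product = p.product AND t.min_price = p.price
ORDER BY p.id
""")]

# Group again in the outer query: one ID per product (the smallest one).
one_each = [row[0] for row in conn.execute("""
SELECT MIN(p.id) AS min_id
FROM products p
JOIN (SELECT product, MIN(price) AS min_price
      FROM products
      GROUP BY product) t
  ON t.product = p.product AND t.min_price = p.price
GROUP BY p.product
ORDER BY min_id
""")]
```

With the tie in place, `all_ties` contains 1, 3 and 5, while `one_each` contains only 1 and 3.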
SELECT ID
FROM Products as A
where price = ( select Min(Price)
from Products as B
where B.Product = A.Product )
GROUP BY id
This will return the IDs, which in this case are 1 and 3.

SQL Server : select only last record per customer from a join query

Assume I have these 3 tables :
The first 2 tables define customers of different types, i.e. the second table has other columns that are not included in table 1; I left them the same to keep things simple.
The third table defines orders for both types of customers. Each customer has more than one order.
I want to select the last order for every customer, i.e. the order with order_id 4 for customer 1, which was created on 23/12/2016, and the order with order_id 5 for customer 2, which was created on 26/12/2016.
I tried something like this:
select *
from customertype1
left join order on order.customer_id = customertype1.customer_id
order by order_id desc;
But this gives me multiple records for every customer. As stated above, I want only the last order for every customertype1.
If you want the last order for each customer, then you only need the orders table:
select o.*
from (select o.*,
row_number() over (partition by customer_id order by datecreated desc) as seqnum
from orders o
) o
where seqnum = 1;
If you want to include all customers, then you need to combine the two tables. Assuming they are mutually exclusive:
with c as (
select customer_id from customers1 union all
select customer_id from customers2
)
select o.*
from c left join
(select o.*,
row_number() over (partition by customer_id order by datecreated desc) as seqnum
from orders o
) o
on c.customer_id = o.customer_id and seqnum = 1;
A note about your data structure: You should have one table for all customers. You can then define a foreign key constraint between orders and customers. For the additional columns, you can have additional tables for the different types of customers.
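As a rough check of the "all customers" variant, the sketch below runs it with SQLite from Python. The table layouts are assumptions (the original tables were not shown as text), customer 3 and the earlier order dates are invented, and dates are stored as ISO strings so that they sort correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers1 (customer_id INTEGER);
CREATE TABLE customers2 (customer_id INTEGER);
CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, datecreated TEXT);
INSERT INTO customers1 VALUES (1);
INSERT INTO customers2 VALUES (2), (3);   -- customer 3 has no orders
INSERT INTO orders VALUES
    (1, 1, '2016-12-20'), (4, 1, '2016-12-23'),
    (2, 2, '2016-12-21'), (5, 2, '2016-12-26');
""")

# Rank each customer's orders newest first, then keep rank 1 via the join
# condition so that customers without orders survive the LEFT JOIN.
last_orders = conn.execute("""
WITH c AS (
    SELECT customer_id FROM customers1
    UNION ALL
    SELECT customer_id FROM customers2
)
SELECT c.customer_id, latest.order_id, latest.datecreated
FROM c
LEFT JOIN (SELECT o.*,
                  ROW_NUMBER() OVER (PARTITION BY o.customer_id
                                     ORDER BY o.datecreated DESC) AS seqnum
           FROM orders o) AS latest
       ON c.customer_id = latest.customer_id AND latest.seqnum = 1
ORDER BY c.customer_id
""").fetchall()
```

Putting `seqnum = 1` in the ON clause rather than in a WHERE clause is what keeps customer 3 in the result, with NULLs for the order columns.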
Use ROW_NUMBER() and PARTITION BY.
ROW_NUMBER(): assigns a sequence number to each row.
PARTITION BY: groups your data by the given column.
When you use ROW_NUMBER() together with PARTITION BY, the rows are first partitioned into groups and then numbered within each group, so the sequence restarts at 1 for every group.
This is the general idea. You can work out the details.
with customers as
(select customer_id, customer_name
from table1
union
select customer_id, customer_name
from table2)
, lastOrder as
(select customer_id, max(order_id) maxOrderId
from orders
group by customer_id)
select *
from lastOrder join customers on lastOrder.Customer_id = customers.customer_id
join orders on order_id = maxOrderId

Order By in subselect using concat and decode

Say I have two tables:
Product
product_id (other fields are of no concern)
Sku
product_id sku_id color_id color_name (other fields such as size but unimportant)
001 11 5 green
001 12 1 black
001 13 3 red
002 21 1 black
002 22 2 yellow
002 23 8 magenta
002 24 9 turquoise
I need to rewrite a query that gets a list of product ids with comma delimited lists for all colors/color ids associated with that product. The color ids/names must have the same order in both lists.
Desired output:
product_id colorIds colorNames
001 1,3,5 black,red,green
002 1,2,8,9 black,yellow,magenta,turquoise
Note that the concat list of color ids' order maps to the color names order.
Current output:
product_id colorIds colorNames
001 1,3,5 green,black,red -- out of order sometimes
002 1,2,8,9 black,yellow,magenta,turquoise
Currently used query:
select distinct(p.product_id) as product_id,
(select decode(dbms_lob.SubStr(wm_concat(DISTINCT color_name)),'NO COLOR','','No Color','','no color','',null,'',dbms_lob.SubStr(wm_concat(DISTINCT color_name))) as color_name from sku where product_id = p.product_id) as colorName,
(select decode(dbms_lob.SubStr(wm_concat(DISTINCT color_code)),'000','',dbms_lob.SubStr(wm_concat(DISTINCT color_code))) from sku where product_id = p.product_id) as colorCode
from product p;
I was thinking of just adding order by clauses in those sub selects, but the query just errors out, saying missing right parenthesis - oddly there seemed to be no mismatched parens. Any suggestions are welcome.
Edit *
The above query is highly simplified. In reality it joins with over a dozen other tables to get other data columns related to the product, most of which are non-aggregate pieces of data. The solution should have no group by clause in the main query or suggest a reasonable way to accommodate this requirement.
This might work for you:
SELECT p.product_id
, LISTAGG(s.color_id, ',') WITHIN GROUP ( ORDER BY s.color_id ) AS colorIds
, LISTAGG(s.color_name, ',') WITHIN GROUP ( ORDER BY s.color_id ) AS colorNames
FROM product p LEFT JOIN ( SELECT DISTINCT product_id, color_id, color_name FROM sku ) s
ON p.product_id = s.product_id
GROUP BY p.product_id
ORDER BY product_id
LISTAGG() can be sorted, while WM_CONCAT() can't (and WM_CONCAT() is undocumented, etc.).
UPDATE per OP's comment about non-aggregate data:
WITH product_colors AS (
SELECT p.product_id
, LISTAGG(s.color_id, ',') WITHIN GROUP ( ORDER BY s.color_id ) AS colorIds
, LISTAGG(s.color_name, ',') WITHIN GROUP ( ORDER BY s.color_id ) AS colorNames
FROM product p LEFT JOIN ( SELECT DISTINCT product_id, color_id, color_name FROM sku ) s
ON p.product_id = s.product_id
GROUP BY p.product_id
)
SELECT t1.other_column, t2.other_column, etc.
FROM table1 t1 JOIN table2 t2 ON ...
JOIN product_colors pc ON ...
This will achieve the distinct effect (before Oracle 19c you cannot use DISTINCT inside LISTAGG):
select product_id,
listagg(color_id, ',') within group(order by color_id) as colorids,
listagg(color_name, ',') within group(order by color_id) as colornames
from (select distinct product_id, color_id, color_name from sku)
group by product_id
If you want to show columns from the product table and/or you want to show products that are in the product table but not in the sku table, you can use:
select p.product_id,
listagg(s.color_id, ',') within group(order by s.color_id) as colorids,
listagg(s.color_name, ',') within group(order by s.color_id) as colornames
from product p
left join (select distinct product_id, color_id, color_name from sku) s
on p.product_id = s.product_id
group by p.product_id
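The reason these queries keep the two lists aligned is that both LISTAGG calls sort by the same key (color_id) over the same deduplicated rows. The same logic, sketched in plain Python with the question's sample data plus one invented duplicate row (`color_lists` is a hypothetical helper, not part of any library):

```python
from collections import defaultdict

# (product_id, color_id, color_name) rows from the question's sku table.
sku = [
    ("001", 5, "green"), ("001", 1, "black"), ("001", 3, "red"),
    ("001", 1, "black"),  # hypothetical duplicate, e.g. same color in another size
    ("002", 1, "black"), ("002", 2, "yellow"),
    ("002", 8, "magenta"), ("002", 9, "turquoise"),
]


def color_lists(rows):
    """Return {product_id: (color_ids, color_names)} with both lists in color_id order."""
    pairs_by_product = defaultdict(set)
    for product_id, color_id, color_name in rows:
        # The set plays the role of SELECT DISTINCT product_id, color_id, color_name.
        pairs_by_product[product_id].add((color_id, color_name))

    result = {}
    for product_id, pairs in pairs_by_product.items():
        ordered = sorted(pairs)  # tuples sort by color_id first
        ids = ",".join(str(color_id) for color_id, _ in ordered)
        names = ",".join(name for _, name in ordered)
        result[product_id] = (ids, names)
    return result


lists = color_lists(sku)
```

Because both strings are built from the same sorted sequence of pairs, the n-th ID always belongs to the n-th name.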
Hi, this might work as well. Both lists are ordered by color name, so they stay aligned with each other:
select product_id,
listagg(color_id, ',') within group (order by color_name) as color_ids,
listagg(color_name, ',') within group (order by color_name) as color_names
from sku
group by product_id;

Oracle Complex Sort - Multiple Children

I have a table as follows:
BRAND_ID PRODUCT_ID PRODUCT_DESC PRODUCT_TYPE
100 1000 Tools A
100 1500 Tools A
200 2000 Burgers B
300 3000 Clothing C
300 4000 Makeup D
300 5000 Clothing C
So a brand can have multiple products, all of the same type or of mixed types. If a brand's products are all of the same type, I need them first in the result, sorted by product type, followed by the brands that have different product types. I can do this programmatically, but I wanted to see if there is a way to do it in the query.
I don't have access to Oracle, but I believe something along these lines should work...
WITH
ranked_data
AS
(
SELECT
COUNT(DISTINCT product_type) OVER (PARTITION BY brand_id) AS brand_rank,
MIN(product_type) OVER (PARTITION BY brand_id) AS first_product_type,
yourTable.*
FROM
yourTable
)
SELECT
*
FROM
ranked_data
ORDER BY
brand_rank,
first_product_type,
brand_id,
product_type,
product_desc
An alternative is to JOIN on to a sub-query to calculate the two sorting fields.
SELECT
yourTable.*
FROM
yourTable
INNER JOIN
(
SELECT
brand_id,
COUNT(DISTINCT product_type) AS brand_rank,
MIN(product_type) AS first_product_type
FROM
yourTable
GROUP BY
brand_id
)
brand_summary
ON yourTable.brand_id = brand_summary.brand_id
ORDER BY
brand_summary.brand_rank,
brand_summary.first_product_type,
yourTable.brand_id,
yourTable.product_type,
yourTable.product_desc
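The join variant can be tried out with SQLite from Python, using the rows from the question. This is a sketch rather than the Oracle query itself: it selects only product_id, and it adds product_id as a final sort key (my addition) so that the result is deterministic when two rows tie on every other key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourTable ("
    "brand_id INTEGER, product_id INTEGER, product_desc TEXT, product_type TEXT)"
)
# Rows from the question, inserted out of order on purpose.
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?, ?)",
    [
        (300, 3000, "Clothing", "C"), (300, 4000, "Makeup", "D"),
        (300, 5000, "Clothing", "C"),
        (200, 2000, "Burgers", "B"),
        (100, 1000, "Tools", "A"), (100, 1500, "Tools", "A"),
    ],
)

# brand_rank = number of distinct product types per brand, so single-type
# brands (rank 1) sort before mixed-type brands.
ordered_ids = [row[0] for row in conn.execute("""
SELECT t.product_id
FROM yourTable t
JOIN (SELECT brand_id,
             COUNT(DISTINCT product_type) AS brand_rank,
             MIN(product_type) AS first_product_type
      FROM yourTable
      GROUP BY brand_id) brand_summary
  ON t.brand_id = brand_summary.brand_id
ORDER BY brand_summary.brand_rank,
         brand_summary.first_product_type,
         t.brand_id,
         t.product_type,
         t.product_id
""")]
```

Brands 100 and 200 each have a single product type and come first, ordered by that type; brand 300 mixes types C and D and comes last.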
How about selecting from a sub-select that counts the distinct product types per brand, and then sorting by that count?
select t.BRAND_ID,
t.PRODUCT_ID,
t.PRODUCT_DESC,
t.PRODUCT_TYPE
from (select t2.BRAND_ID,
count(distinct t2.PRODUCT_TYPE) cnt
from YOURTABLE t2
group by t2.BRAND_ID) data
join YOURTABLE t on t.BRAND_ID = data.BRAND_ID
order by data.cnt, t.BRAND_ID, t.PRODUCT_ID, t.PRODUCT_TYPE