greatest N per group with padding - sql

I've been trying to solve this problem over the weekend, without luck so far. I have two tables:
TopOffers:
OfferId RetailerId Order
1 38 0
2 8 3
3 17 2
4 22 1
And Offers:
Id RetailerId Name Description etc...
1 3 Strawberry Red and smelly
2 38 Cookie Crunchy
3 17 Onion Of the nice kind
4 22 Apple Cheap
5 8 Toothbrush Lasts extra long!
My goal is to get the top 10 Offers for each Retailer ID. The order in which they should be listed is specified by the Order field in the TopOffer table (Sort order is Ascending). On top of that, the result should be padded to 10 offers when there are less than 10 TopOffer records for a retailer. The TopOffer table always contains 10 or less records per retailer.
So far I've managed to get this going, which works (I realize it doesn't get the top 10, but rather everything that's in the TopOffer table, which is alright, since the TopOffer table is always equal to or smaller than the top 10 for any retailer):
SELECT b.*
FROM
(
SELECT o.Id, to.`Order` FROM Offer AS o
LEFT JOIN TopOffer AS to
ON o.Id = to.OfferId
) AS a,
(
SELECT o.*, to.`Order` FROM Offer AS o
LEFT JOIN TopOffer AS to
ON o.Id = to.OfferId
) AS b
WHERE a.`Order` >= b.`Order` AND a.Id = b.Id
GROUP BY b.RetailerId, b.Id
HAVING Count(1) BETWEEN 1 AND 10
ORDER BY RetailerId, `Order` ASC
Unfortunately I can't seem to find any way of padding the result of this query with offers that don't have an entry in the TopOffer table if there aren't 10 TopOffer records for that retailer.
My sincerest thanks in advance for any help!

If you create a virtual table with numbers 1-10 you can left join to your results to get 10 of each
select number, results.*
from
(select 1 as number union select 2 union select 3 ... union select 10) numbers
left join
(your query here) results
on numbers.number = results.rank

Related

How to avoid aggregate functions in recursive query's recursive term

I am having trouble getting around a sum() in the recursive term. Basically my problem is this.
Lets say 3 different finish products. 'ABC1', 'ABC2', 'ABC3' every one of them is made from 'ABC'. Every 'ABC' is made from 'AB'. Every 'AB' is made from 'A'. I went out and sold 10 of each 'ABC1', 'ABC2', 'ABC3'
I am trying to make a query give me a list of each item and how much I need of that item based on how much I have sold.
This is an example of the return that I am looking for
Item
Level
Sold
On Hand
Required
A
0
0
0
15
AB
1
0
10
25
ABC
2
10
0
25
ABC1
3
10
5
10
ABC2
3
10
5
10
ABC3
3
10
5
10
For a general table structure you would have
Item
item_id
item_onhand
AND
BOM
bom_product_id
bom_material_id
AND
Sales
sale_id
sale_item_id
sale_qty
I cant start at the top and go down in my case. because the dataset takes too long to process. So I have to start with all the sales and work up the tree from there.
My idea was to create a result for each level.
And then recursively go up the material tree. Something along the lines of
WITH RECURSIVE sales_req AS(
SELECT item_id,
SUM(sale_qty) AS sales_req_sold,
item_onhand AS sales_req_qoh
FROM sales JOIN item ON sales_item_id = item_id
GROUP BY item_id
UNION
SELECT
item_id,
SUM(sales_req_sold - sales_req_qoh),
item_onhand
FROM
bom
JOIN sales_req ON bom_product_id = sales_req.item_id
JOIN item mat ON bom_material_id = mat.item_id
WHERE sales_req_sold > sales_req_qoh
The first Query Returning Something Like this
Item
Required
ABC
10
ABC1
10
ABC2
10
ABC3
10
And The recursive portion returning something like this
Item
Required
Notes
ABC
15
( The sum of sales for "ABC1,ABC2,ABC3" minus the inventory for each one)
AB
25
( The sum of ABC requirements from 1,2 and 3 Plus the requirement for the sale of ABC)
A
15
( AB Minus the inventory on hand for AB)
I need some sort of alternate solution to sum function. However there are a few constraints. I have to start with the sales table. I cannot put a limit on the levels. In this example I have 4 levels and only one level has multiple parts on it. But there could be 7 levels and each level could have 3 parts on it. I can assume the top level to be 1 single item.
try this :
WITH RECURSIVE req AS(
SELECT item_id, item_onhand, SUM(sale_qty) AS item_sales
FROM sales INNER JOIN item ON sale_item_id = item_id
GROUP BY item_id, item_onhand
), accum (item_id, item_onhand, item_sales, item_req, level) AS (
SELECT item_id, item_onhand, item_sales, item_sales, 0
FROM req
UNION ALL
SELECT b.bom_product_id, a.item_onhand, a.item_sales, a.item_sales - a.item_onhand, a.level - 1
FROM accum AS a
INNER JOIN bom AS b ON b.bom_material_id = a.item_id
)
SELECT r.item_id, min(a.level) AS level, r.item_onhand AS on_hand, r.item_sales AS sold, sum(item_req) AS required
FROM accum AS a
INNER JOIN req AS r ON r.item_id = a.item_id
GROUP BY r.item_id, r.item_onhand, r.item_sales
ORDER BY level
see test result in https://dbfiddle.uk/J7PMY1fZ[enter link description here]1

ID appears twice in query when multiplied with different Prices

I am trying to get the top 5 stations by Sales,
but I ran into the problem that one station appears twice if multiplied by a different price.
This is my query:
SELECT distinct b_id, count(t_start_id) * v_preis AS START_PRICE
FROM bahnhof
INNER JOIN tickets
ON t_start_id = b_id
INNER JOIN connections
ON t_connection_id = v_id
GROUP BY b_id, v_preis
ORDER BY START_PRICE DESC LIMIT 5;
Which gives me the following result:
b_id
START_PRICE
7
75
6
50
4
30
1
16
1
15
What i need though is:
b_id
START_PRICE
7
75
6
50
1
31
4
30
I tried to group by ID only, but it didn't work since v_preis had to be in there too.
The price for 1 is 8 twice and 15 once, so I guess I have a problem with using different rows for one result.
I'm pretty new to SQL, so I'm sorry if this is a dumb question,
thank you in advance!
Did you try using SUM() aggregation along with only grouping by id?
SELECT DISTINCT b_id, SUM(v_preis) AS start_price
FROM bahnhof
JOIN tickets
ON t_start_id = b_id
JOIN connections
ON t_connection_id = v_id
GROUP BY b_id
ORDER BY START_PRICE DESC
LIMIT 5;

SQL counting query

Sorry if this is a basic question.
Basically, I have a table that is as follows, below is a basic sample
store-ProdCode-result
13p I10x 5
13p I20x 7
13p I30x 8
14a K38z 23
17a K38z 23
my data set has nearly 100,000 records.
What I'm trying to do is, for every store find the top 10 prodCode.
I am unsure of how to do this but what I tried was:
select s_code as store, prod_code,count (prod_code)
from top10_secondary
where prod_code is not null
group by store,prod_code
order by count(prod_code) desc limit 10
this is giving me something completely different and i'm unsure on how I go about achieving my final result.
All help is appreciated.
Thanks
The expected output should be: for every store(s_code) display the top 10 prodcode
so:
store--prodcode--result
1a abc 5
1a abd 4
2a dgf 1
2a ldk 6
.(10 times until next store code)
You can use the table twice in the FROM clause, once for the data, and once to get a count of how many records have fewer results for that store.
SELECT a.s_code, a.prod_code, count(*)
FROM top10_secondary a
LEFT OUTER JOIN top10_secondary b
ON a.s_code = b.s_code
AND b.result < a.result
GROUP BY a.s_code, a.prod_code
HAVING count(*) < 10
With this technique though, you may get more than 10 records per store if the 10th result value exists multiple times. Because the limit rule is simply "include record as long as there are less than 10 records with result values than mine"
It looks like in your case, "result" is a ranking, so they would not be duplicated per store.
This is a good case for Window functions.
SELECT
s_code,
prod_code,
prod_count
FROM
(
SELECT
s_code,
prod_code,
prod_count,
RANK() OVER (PARTITION BY s_code ORDER BY prod_Count DESC) as prod_rank
FROM
(SELECT s_code as store, prod_code, count(prod_Code) prod_count FROM table GROUP BY s_code, prod_code) t1
) t2
WHERE prod_rank <= 10
The inner most query gets the count of each product at the store. The second inner more query determines the rank for those products for each store based on that count. Then the outer most query limits the results based on that rank.
o

SQL. how to compare values and from two table, and report per-row results

I have two Tables.
table A
id name Size
===================
1 Apple 7
2 Orange 15
3 Banana 22
4 Kiwi 2
5 Melon 28
6 Peach 9
And Table B
id size
==============
1 14
2 5
3 31
4 9
5 1
6 16
7 7
8 25
My desired result will be (add one column to Table A, which is the number of rows in Table B that have size smaller than Size in Table A)
id name Size Num.smaller.in.B
==============================
1 Apple 7 2
2 Orange 15 5
3 Banana 22 6
4 Kiwi 2 1
5 Melon 28 7
6 Peach 9 3
Both Table A and B are pretty huge. Is there a clever way of doing this. Thanks
Use this query it's helpful
SELECT id,
name,
Size,
(Select count(*) From TableB Where TableB.size<Size)
FROM TableA
The standard way to get your result involves a non-equi-join, which will be a product join in Explain. First duplicating 20,000 rows, followed by 7,000,000 * 20,000 comparisons and a huge intermediate spool before the count.
There's a solution based on OLAP-functions which is usually quite efficient:
SELECT dt.*,
-- Do a cumulative count of the rows of table #2
-- sorted by size, i.e. count number of rows with a size #2 less size #1
Sum(CASE WHEN NAME = '' THEN 1 ELSE 0 end)
Over (ORDER BY SIZE, NAME DESC ROWS Unbounded Preceding)
FROM
( -- mix the rows of both tables, an empty name indicates rows from table #2
SELECT id, name, size
FROM a
UNION ALL
SELECT id, '', size
FROM b
) AS dt
-- only return the rows of table #1
QUALIFY name <> ''
If there are multiple rows with the same size in table #2 you better count before the Union to reduce the size:
SELECT dt.*,
-- Do a cumulative sum of the counts of table #2
-- sorted by size, i.e. count number of rows with a size #2 less size #1
Sum(CASE WHEN NAME ='' THEN id ELSE 0 end)
Over (ORDER BY SIZE, NAME DESC ROWS Unbounded Preceding)
FROM
( -- mix the rows of both tables, an empty name indicates rows from table #2
SELECT id, name, size
FROM a
UNION ALL
SELECT Count(*), '', SIZE
FROM b
GROUP BY SIZE
) AS dt
-- only return the rows of table #1
QUALIFY NAME <> ''
There is no clever way of doing that, you just need to join the tables like this:
select a.*, b.size
from TableA a join TableB b on a.id = b.id
To improve performance you'll need to have indexes on the id columns.
maybe
select
id,
name,
a.Size,
sum(cnt) as sum_cnt
from
a inner join
(select size, count(*) as cnt from b group by size) s on
s.size < a.size
group by id,name,a.size
if you're working with large tables. Indexing table b's size field could help. I'm also assuming the values in table B converge, that there's many duplicates you don't care about, other than you want to count them.
sqlfiddle
#Ritesh solution is perfectly correct, another similar solution is using CROSS JOIN as shown below
use tempdb
create table dbo.A (id int identity, name varchar(30), size int );
create table dbo.B (id int identity, size int);
go
insert into dbo.A (name, size)
values ('Apple', 7)
,('Orange', 15)
,('Banana', 22)
,('Kiwi', 2 )
,('Melon', 28)
,('Peach', 6 )
insert into dbo.B (size)
values (14), (5),(31),(9),(1),(16), (7),(25)
go
-- using cross join
select a.*, t.cnt
from dbo.A
cross apply (select cnt=count(*) from dbo.B where B.size < A.size) T(cnt)
try this query
SELECT
A.id,A.name,A.size,Count(B.size)
from A,B
where A.size>B.size
group by A.size
order by A.id;

sql server : select rows who's sum matches a value [duplicate]

This question already has answers here:
How to get rows having sum equal to given value
(4 answers)
Closed 9 years ago.
The community reviewed whether to reopen this question 1 year ago and left it closed:
Original close reason(s) were not resolved
here is table T :-
id num
-------
1 50
2 20
3 90
4 40
5 10
6 60
7 30
8 100
9 70
10 80
and the following is a fictional sql
select *
from T
where sum(num) = '150'
the expected result is :-
(A)
id num
-------
1 50
8 100
(B)
id num
-------
2 20
7 30
8 100
(C)
id num
-------
4 40
5 10
8 100
the 'A' case is most preferred !
i know this case is related to combinations.
in real world - client gets items from a shop, and because of an agreement between him and the shop, he pay every Friday. the payment amount is not the exact total of items
for example: he gets 5 books of 50 € ( = 250 € ), and on Friday he bring 150 €, so the first 3 books are perfect match - 3 * 50 = 150. i need to find the id's of those 3 books !
any help would be appreciated!
You can use recursive query in MSSQL to solve this.
SQLFiddle demo
The first recursive query build a tree of items with cumulative sum <= 150. Second recursive query takes leafs with cumulative sum = 150 and output all such paths to its roots. Also in the final results ordered by ItemsCount so you will get preferred groups (with minimal items count) first.
WITH CTE as
( SELECT id,num,
id as Grp,
0 as parent,
num as CSum,
1 as cnt,
CAST(id as Varchar(MAX)) as path
from T where num<=150
UNION all
SELECT t.id,t.num,
CTE.Grp as Grp,
CTE.id as parent,
T.num+CTE.CSum as CSum,
CTE.cnt+1 as cnt,
CTE.path+','+CAST(t.id as Varchar(MAX)) as path
from T
JOIN CTE on T.num+CTE.CSum<=150
and CTE.id<T.id
),
BACK_CTE as
(select CTE.id,CTE.num,CTE.grp,
CTE.path ,CTE.cnt as cnt,
CTE.parent,CSum
from CTE where CTE.CSum=150
union all
select CTE.id,CTE.num,CTE.grp,
BACK_CTE.path,BACK_CTE.cnt,
CTE.parent,CTE.CSum
from CTE
JOIN BACK_CTE on CTE.id=BACK_CTE.parent
and CTE.Grp=BACK_CTE.Grp
and BACK_CTE.CSum-BACK_CTE.num=CTE.CSum
)
select id,NUM,path, cnt as ItemsCount from BACK_CTE order by cnt,path,Id
If you restrict your problem to "which two numbers add up to a value", the solution is as follows:
SELECT t1.id, t1.num, t2.id,t2.num
FROM T t1
INNER JOIN T t2
ON t1.id < t2.id
WHERE t1.num + t2.num = 150
If you also want the result for three and more numbers you can achieve that by using the above query as a base for recursive SQL. Don't forget to specify a maximum recursion depth!
To find the id's of the books that the client is paying, you would need to have a table with your clients, and another one to store the orders of the client, and what products he bought.
Otherwise it would be impossible to know what product the payment refers to.