SQL query for special record

First of all, please excuse my poor English.
I have two tables.
Master table:
| product id | pr_name | remain_Qty |
| 1 | x | 13 |
| 2 | y | 18 |
| 3 | z | 21 |
Detail table (this table contains the details of purchased products):
| date | pr_id | Qty |price |
| 2010-01-01 | 1 | 3 | 1000 |
| 2010-01-02 | 1 | 5 | 1200 |
| 2010-01-01 | 2 | 11 | 1100 |
| 2010-01-03 | 1 | 4 | 1400 |
| 2010-01-04 | 3 | 3 | 1300 |
| 2010-01-01 | 2 | 6 | 1600 |
| 2010-01-03 | 1 | 7 | 1700 |
| 2010-01-02 | 3 | 3 | 1300 |
| 2010-01-01 | 3 | 5 | 1500 |
| 2010-01-04 | 3 | 7 | 1700 |
| 2010-01-06 | 2 | 8 | 1800 |
| 2010-01-07 | 2 | 4 | 1400 |
| 2010-01-03 | 1 | 3 | 1300 |
| 2010-01-04 | 3 | 6 | 1600 |
| 2010-01-08 | 1 | 1 | 1100 |
sum Qty of product 1 = 23
sum Qty of product 2 = 29
sum Qty of product 3 = 24
As a result I want a list of rows from the Detail table, sorted by pr_id, date and price, where the sum(Qty) per pr_id does not exceed the remain_Qty of the matching product id in the Master table.
For example:
| date | pr_id | Qty |price |
| 2010-01-01 | 1 | 3 | 1000 |
| 2010-01-02 | 1 | 5 | 1200 |
| 2010-01-03 | 1 | 4 | 1400 |
| 2010-01-03 | 1 | 1 | 1700 |
| 2010-01-01 | 2 | 11 | 1100 |
| 2010-01-01 | 2 | 6 | 1600 |
| 2010-01-01 | 3 | 5 | 1500 |
| 2010-01-02 | 3 | 3 | 1300 |
| 2010-01-04 | 3 | 3 | 1300 |
| 2010-01-04 | 3 | 7 | 1700 |

More of a clarification than a direct SQL answer. What it LOOKS like they may be wanting is an inventory being depleted to fill orders from the known available quantity, but even that falls short, as they may be missing a second qty of 3 on 2010-01-03 for product 1. Looking at just ID=1 from the sample data, that would show...
| date | pr_id | Qty |price | Qty Available to fill order
| 2010-01-01 | 1 | 3 | 1000 | 13 - 3 = 10 avail next order
| 2010-01-02 | 1 | 5 | 1200 | 10 - 5 = 5 avail next order
| 2010-01-03 | 1 | 3 | 1300 | 5 - 3 = 2 avail next order
| 2010-01-03 | 1 | 4 | 1400 | only 2 to PARTIALLY fill this order
| 2010-01-03 | 1 | 7 | 1700 | none available
| 2010-01-08 | 1 | 1 | 1100 | none available
With that extra sample record removed, the result would be...
| date | pr_id | Qty |price | Qty Available to fill order
| 2010-01-01 | 1 | 3 | 1000 | 13 - 3 = 10 avail next order
| 2010-01-02 | 1 | 5 | 1200 | 10 - 5 = 5 avail next order
| 2010-01-03 | 1 | 4 | 1400 | 5 - 4 = 1 avail for next order
| 2010-01-03 | 1 | 7 | 1700 | only 1 of the 7 available
| 2010-01-08 | 1 | 1 | 1100 | no more available...
So Aliasghar, does this better represent what you are trying to do? Fill the orders in the sequence they were entered into the system, fill as many as possible based on inventory, and stop there?
Please confirm by adding a comment to this answer and maybe we can help resolve this. Also, confirm WHICH database you are using: SQL Server, Oracle, MySQL, etc.
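If that reading is right, the depletion logic in the tables above can be sketched with a running total. This is only a sketch under assumptions: it uses SQLite through Python (SQLite 3.25+ for window functions), the table and column names (master_table, detail_table, pr_date, remain_qty) are borrowed from the other answers here, and only product 1 is loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE master_table (product_id INTEGER, pr_name TEXT, remain_qty INTEGER);
CREATE TABLE detail_table (pr_date TEXT, pr_id INTEGER, qty INTEGER, price INTEGER);
INSERT INTO master_table VALUES (1, 'x', 13);
INSERT INTO detail_table VALUES
  ('2010-01-01', 1, 3, 1000), ('2010-01-02', 1, 5, 1200),
  ('2010-01-03', 1, 4, 1400), ('2010-01-03', 1, 7, 1700),
  ('2010-01-03', 1, 3, 1300), ('2010-01-08', 1, 1, 1100);
""")

query = """
WITH running AS (
  SELECT d.pr_date, d.pr_id, d.qty, d.price, m.remain_qty,
         -- quantity already consumed by earlier orders of the same product
         SUM(d.qty) OVER (PARTITION BY d.pr_id
                          ORDER BY d.pr_date, d.price
                          ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) - d.qty
           AS used_before
  FROM detail_table d
  JOIN master_table m ON m.product_id = d.pr_id
)
-- keep only orders that can still be (partially) filled;
-- two-argument MIN() is SQLite's scalar minimum
SELECT pr_date, pr_id, MIN(qty, remain_qty - used_before) AS filled_qty, price
FROM running
WHERE used_before < remain_qty
ORDER BY pr_id, pr_date, price
"""
filled = conn.execute(query).fetchall()
for row in filled:
    print(row)
```

With the full sample for product 1 this fills 3, 5 and 3 units and then only 2 of the 4 units on the 1400 order, which matches the first table above.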

Here is a working query for pr_id = 1 (I used MySQL):
select final.pr_date, final.pr_id, count(t_qty) as qty, final.price from
(select * FROM (select q.pr_date, q.pr_id, 1 as t_qty, q.price, @t := @t + t_qty total
FROM (SELECT d.pr_date, d.pr_id, 1 as t_qty, d.price
FROM detail_table d
JOIN generator_4k i
ON i.n between 1 and d.qty
WHERE d.pr_id = 1
Order by d.id, d.pr_date) q
CROSS JOIN (SELECT @t := 0) i) c
WHERE c.total <= (select remain_qty from master_table WHERE product_id = 1)) final
group by final.pr_date, final.pr_id, final.price;
You have to adapt your detail_table to add a technical id as the primary key and create some views. I renamed the date column to pr_date. You'll find the schema on the SQL Fiddle.
Here is another query, using SQL Server:
select final.pr_date, final.pr_id, count(t_qty) as qty, final.price from
(SELECT top(select remain_qty from master_table WHERE product_id = 1) d.pr_date, d.pr_id, 1 as t_qty, d.price
FROM detail_table d
JOIN generator_4k i
ON i.n between 1 and d.qty
WHERE d.pr_id= 1
Order by d.id, d.pr_date) final
group by final.pr_date , final.pr_id , final.price ;

Day-wise product purchase info
Below is a suggested statement.
select t2.date, t2.pr_id, t1.pr_name, sum(qty) as qty_buy, sum(price) as amount from master_table as t1
inner join detail_table as t2 on t1.product_id = t2.pr_id
group by t2.date, t2.pr_id, t1.pr_name
order by t2.date, t2.pr_id

I had a hard time understanding what you really wanted.
If I understood correctly, you want the detail rows for a product, but without going over its remaining quantity.
I haven't yet found a way to handle the row that goes over the limit by taking only the remaining part of it.
So for now my query just stops once it reaches the remaining quantity allowed.
To be able to do what you want, you first need to create a view that generates one row per unit of quantity.
Something like
| date | pr_id | Qty |price |
| 2010-01-01 | 1 | 3 | 1000 |
turns into something like
| date | pr_id | Qty |price |
| 2010-01-01 | 1 | 1 | 1000 |
| 2010-01-01 | 1 | 1 | 1000 |
| 2010-01-01 | 1 | 1 | 1000 |
Then you count rows until you reach the remaining quantity allowed.
After that, you regroup the rows by price, pr_id and date.
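The expand, count and regroup steps above could look roughly like this. It is a sketch, not the fiddle's schema: it runs on SQLite through Python, replaces the generator view with a recursive CTE called numbers (capped at 100, an arbitrary assumption), hard-codes remain_qty = 13, and loads only four rows of product 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE detail_table (
  id INTEGER PRIMARY KEY, pr_date TEXT, pr_id INTEGER, qty INTEGER, price INTEGER);
INSERT INTO detail_table (pr_date, pr_id, qty, price) VALUES
  ('2010-01-01', 1, 3, 1000), ('2010-01-02', 1, 5, 1200),
  ('2010-01-03', 1, 4, 1400), ('2010-01-03', 1, 7, 1700);
""")
remain_qty = 13  # remain_Qty of product 1 in the master table

query = """
WITH RECURSIVE numbers(n) AS (
  -- hypothetical stand-in for a generator view: the integers 1..100
  SELECT 1 UNION ALL SELECT n + 1 FROM numbers WHERE n < 100
),
units AS (
  -- step 1: one row per unit of quantity; step 2: keep the first remain_qty units
  SELECT d.id, d.pr_date, d.pr_id, d.price
  FROM detail_table d
  JOIN numbers ON numbers.n <= d.qty
  ORDER BY d.pr_date, d.id
  LIMIT ?
)
-- step 3: regroup the surviving units into order rows
SELECT pr_date, pr_id, COUNT(*) AS qty, price
FROM units
GROUP BY id, pr_date, pr_id, price
ORDER BY pr_date, id
"""
rows = conn.execute(query, (remain_qty,)).fetchall()
for row in rows:
    print(row)
```

Because the limit cuts through the expanded rows, the last order is filled partially (1 of its 7 units) instead of being dropped.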


Query Different Condition With 1 Column

I have a table like this:
| cd_hs | cd_cnt | name_cnt | dates | value |
| 1 | 1 | aaa | 2018-06-01 | 50 |
| 1 | 2 | bbb | 2018-07-01 | 150 |
| 1 | 3 | ccc | 2018-08-01 | 20 |
| 1 | 1 | aaa | 2018-06-02 | 40 |
| 1 | 2 | bbb | 2018-07-02 | 70 |
| 1 | 3 | ccc | 2018-08-02 | 80 |
I actually have more data; this is just a sample.
I want to group by cd_hs, name_cnt and the year of the dates column and compute sum(value), but with two conditions. First, rows where cd_cnt is 1 or 2 should be shown under their own name_cnt. Second, rows where cd_cnt is anything other than 1 or 2 (there are many such values) should be summed together and aliased as "other" in the same column.
Expected Result :
| cd_hs | year | name_cnt | total_value |
| 1 | 2018 | aaa | 90 |
| 1 | 2018 | bbb | 220 |
| 1 | 2018 | other | 100 |
How can I do that? I am new to writing queries and don't know what to do.
Your question is a bit confusing considering your spec doesn't seem to exactly line up with what you requested.
If the sample result you've provided is actually what you're looking for, a simple SUM and GROUP BY should do the trick here:
SELECT cd_hs, EXTRACT(YEAR from dates) as year, name_cnt, SUM(value_)
FROM foo
GROUP BY cd_hs, EXTRACT(YEAR from dates), name_cnt
| cd_hs | year | name_cnt | sum |
| 1 | 2018 | aaa | 90 |
| 1 | 2018 | bbb | 220 |
| 1 | 2018 | ccc | 100 |
Since you mentioned you wanted two different totals with two separate conditions, you could use JOIN in conjunction with some well-crafted subqueries:
SELECT a.cd_hs, EXTRACT(YEAR FROM a.dates), a.name_cnt, COALESCE(b.total_a, 0) as "Total A", COALESCE(c.total_b, 0) as "Total B"
FROM foo a
LEFT JOIN (SELECT b.cd_hs, b.name_cnt, EXTRACT(YEAR FROM b.dates), SUM(value_) as total_a
FROM foo b
WHERE b.cd_cnt NOT IN (1, 2)
GROUP BY b.cd_hs, b.name_cnt, EXTRACT(YEAR from b.dates)
) b ON a.cd_hs = b.cd_hs AND a.name_cnt = b.name_cnt
LEFT JOIN (SELECT c.cd_hs, c.name_cnt, EXTRACT(YEAR FROM c.dates), SUM(value_) as total_b
FROM foo c
WHERE c.cd_cnt IN (1, 2)
GROUP BY c.cd_hs, c.name_cnt, EXTRACT(YEAR from c.dates)
) c ON a.cd_hs = c.cd_hs AND a.name_cnt = c.name_cnt
This particular solution is readable and will get you to the correct end result but will most likely not be scalable in its current form.
| cd_hs | date_part | name_cnt | Total A | Total B |
| 1 | 2018 | aaa | 0 | 90 |
| 1 | 2018 | bbb | 0 | 220 |
| 1 | 2018 | ccc | 100 | 0 |
| 1 | 2018 | aaa | 0 | 90 |
| 1 | 2018 | bbb | 0 | 220 |
| 1 | 2018 | ccc | 100 | 0 |
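If the goal really is the single "other" row from the expected result, one option not shown above is to group on a CASE expression instead of on name_cnt directly. A minimal sketch, assuming SQLite through Python (so strftime stands in for EXTRACT), the foo / value_ names used in this answer, and a made-up alias name_cnt_group:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE foo (cd_hs INTEGER, cd_cnt INTEGER, name_cnt TEXT, dates TEXT, value_ INTEGER);
INSERT INTO foo VALUES
  (1, 1, 'aaa', '2018-06-01', 50), (1, 2, 'bbb', '2018-07-01', 150),
  (1, 3, 'ccc', '2018-08-01', 20), (1, 1, 'aaa', '2018-06-02', 40),
  (1, 2, 'bbb', '2018-07-02', 70), (1, 3, 'ccc', '2018-08-02', 80);
""")

query = """
SELECT cd_hs,
       CAST(strftime('%Y', dates) AS INTEGER) AS year,
       -- cd_cnt 1 and 2 keep their own name; everything else is folded into 'other'
       CASE WHEN cd_cnt IN (1, 2) THEN name_cnt ELSE 'other' END AS name_cnt_group,
       SUM(value_) AS total_value
FROM foo
GROUP BY cd_hs, year, name_cnt_group
ORDER BY cd_hs, year, name_cnt_group
"""
totals = conn.execute(query).fetchall()
for row in totals:
    print(row)
```

On the sample rows this yields aaa = 90, bbb = 220 and other = 100 for 2018. In a database that does not allow aliases in GROUP BY, the CASE and year expressions would have to be repeated there.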

windowing function avg in Hive with - over (order by colName)

I'm trying to understand how the windowing function avg works, and somehow it does not seem to be working as I expect.
Here is the dataset:
select * from winsales;
| winsales.salesid | winsales.dateid | winsales.sellerid | winsales.buyerid | winsales.qty | winsales.qty_shipped |
| 30001 | NULL | 3 | b | 10 | 10 |
| 10001 | NULL | 1 | c | 10 | 10 |
| 10005 | NULL | 1 | a | 30 | NULL |
| 40001 | NULL | 4 | a | 40 | NULL |
| 20001 | NULL | 2 | b | 20 | 20 |
| 40005 | NULL | 4 | a | 10 | 10 |
| 20002 | NULL | 2 | c | 20 | 20 |
| 30003 | NULL | 3 | b | 15 | NULL |
| 30004 | NULL | 3 | b | 20 | NULL |
| 30007 | NULL | 3 | c | 30 | NULL |
| 30001 | NULL | 3 | b | 10 | 10 |
When i fire the following query ->
select salesid, sellerid, qty, avg(qty) over (order by sellerid) as avg_qty from winsales order by sellerid,salesid;
I get the following ->
| salesid | sellerid | qty | avg_qty |
| 10001 | 1 | 10 | 20.0 |
| 10005 | 1 | 30 | 20.0 |
| 20001 | 2 | 20 | 20.0 |
| 20002 | 2 | 20 | 20.0 |
| 30001 | 3 | 10 | 18.333333333333332 |
| 30001 | 3 | 10 | 18.333333333333332 |
| 30003 | 3 | 15 | 18.333333333333332 |
| 30004 | 3 | 20 | 18.333333333333332 |
| 30007 | 3 | 30 | 18.333333333333332 |
| 40001 | 4 | 40 | 19.545454545454547 |
| 40005 | 4 | 10 | 19.545454545454547 |
The question is: how is avg(qty) being calculated?
Since I'm not using partition by, I would expect avg(qty) to be the same for all rows.
Any ideas?
If you want the same avg(qty) for all rows, remove order by sellerid from the over clause; you will then get 19.545454545454547 for every row.
Query to get the same avg(qty) for all rows:
hive> select salesid, sellerid, qty, avg(qty) over () as avg_qty from winsales order by sellerid,salesid;
If you include order by sellerid in the over clause, a cumulative avg is calculated up to and including each sellerid.
i.e. for
sellerid 1 there are 2 records (2 records in total so far) with qty 10,30, so the avg would be
(10+30)/2 = 20.0
sellerid 2 there are 2 records (4 records in total so far) with qty 20,20, so the avg would be
(10+30+20+20)/4 = 20.0
sellerid 3 there are 5 records (9 records in total so far) with qty 10,10,15,20,30, so the avg would be
(10+30+20+20+10+10+15+20+30)/9 = 18.333
sellerid 4 there are 2 records (11 records in total), so the avg is 215/11 = 19.545454545454547
This is the expected behavior in Hive when the over clause includes order by.
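The reason is the default window frame: when an over clause has order by but no explicit frame, standard SQL (and, as far as I know, Hive too) uses RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, so each row averages everything up to and including its own sellerid. The sketch below reproduces the numbers with SQLite through Python rather than Hive, which is an assumption about the environment; the data is the winsales sample reduced to three columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE winsales (salesid INTEGER, sellerid INTEGER, qty INTEGER)")
conn.executemany("INSERT INTO winsales VALUES (?, ?, ?)", [
    (30001, 3, 10), (10001, 1, 10), (10005, 1, 30), (40001, 4, 40),
    (20001, 2, 20), (40005, 4, 10), (20002, 2, 20), (30003, 3, 15),
    (30004, 3, 20), (30007, 3, 30), (30001, 3, 10),
])

query = """
SELECT DISTINCT sellerid,
       -- frame left implicit, as in the question
       AVG(qty) OVER (ORDER BY sellerid) AS implicit_frame,
       -- the same frame written out explicitly
       AVG(qty) OVER (ORDER BY sellerid
                      RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS explicit_frame,
       -- no ORDER BY: the frame is the whole table
       AVG(qty) OVER () AS whole_table
FROM winsales
ORDER BY sellerid
"""
averages = conn.execute(query).fetchall()
for row in averages:
    print(row)
```

The implicit and explicit columns come out identical, while the last column is the single table-wide average the question expected.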

Return the row with the value of the previous row within the same group (Oracle Sql)

I have a table that looks like this:
| Head | ID | Amount | Rank |
| 1 | 10 | 1000 | 1 |
| 1 | 11 | 1200 | 2 |
| 1 | 12 | 1500 | 3 |
| 2 | 20 | 3400 | 1 |
| 2 | 21 | 3600 | 2 |
| 2 | 22 | 4200 | 3 |
| 2 | 23 | 1700 | 4 |
I want a new column (New_column) that does the following:
| Head | ID | Amount | Rank | New_column |
| 1 | 10 | 1000 | 1 | 1000 |
| 1 | 11 | 1200 | 2 | 1000 |
| 1 | 12 | 1500 | 3 | 1200 |
| 2 | 20 | 3400 | 1 | 3400 |
| 2 | 21 | 3600 | 2 | 3400 |
| 2 | 22 | 4200 | 3 | 3600 |
| 2 | 23 | 1700 | 4 | 4200 |
Within each Head number, if Rank is not 1, the row takes the Amount of the row with the preceding Rank within the same Head (Rank 2 takes the Amount of Rank 1 within the same Head, Rank 3 takes the Amount of Rank 2, and so on). If Rank is 1, the row keeps its own Amount.
I know how to do this with a for loop in other programming languages, but I don't know how to do it in SQL.
I think you basically want lag():
select t.*,
lag(amount, 1, amount) over (partition by head order by rank) as new_column
from t;
The three-argument form of lag() allows you to provide a default value.
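To see the default value in action, here is a small check of the same lag() call against the sample data. It is a sketch that runs on SQLite through Python instead of Oracle (an assumption made so it is easy to run); the table name t comes from the query above, and the rank column is quoted only as a precaution.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (head INTEGER, id INTEGER, amount INTEGER, "rank" INTEGER);
INSERT INTO t VALUES
  (1, 10, 1000, 1), (1, 11, 1200, 2), (1, 12, 1500, 3),
  (2, 20, 3400, 1), (2, 21, 3600, 2), (2, 22, 4200, 3), (2, 23, 1700, 4);
""")

query = """
SELECT head, id, amount, "rank",
       -- previous amount within the same head; the third argument is the
       -- fallback used when there is no previous row (rank 1)
       LAG(amount, 1, amount) OVER (PARTITION BY head ORDER BY "rank") AS new_column
FROM t
ORDER BY head, "rank"
"""
result = conn.execute(query).fetchall()
new_column = [row[4] for row in result]
print(new_column)
```

Without the third argument the rank 1 rows would get NULL instead of their own amount.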
You can also left join the table to itself (as a subquery) on the previous rank, i.e. rank - 1.
select t1.*, case when t1.rank = 1 then t1.amount else t2.amount end as new_amount
from your_table t1 left join (select Head, ID, Amount, Rank from your_table) t2
on t1.head = t2.head and t2.rank = t1.rank - 1
You can use this update:
UPDATE your_table b
SET New_column = CASE WHEN rank = 1 then Amount
ELSE (select a.Amount FROM your_table a where a.Head = b.Head and a.rank = b.rank - 1) END

Rolling total with no sub-select and no vendor specific extensions

What I'm trying to achieve: rolling total for quantity and amount for a given day, grouped by hour.
It's easy in most cases, but if you have some additional columns (dir and product in my case) and you don't want to group/filter on them, that's a problem.
I know there are extensions in Oracle and MSSQL specifically for that, and there's SELECT OVER PARTITION in Postgres.
At the moment I'm working on an app prototype, and it's backed by MySQL, and I have no idea what it will be using in production, so I'm trying to avoid vendor lock-in.
The entire table:
> SELECT id, dir, product, date, hour, quantity, amount FROM sales
ORDER BY date, hour;
| id | dir | product | date | hour | quantity | amount |
| 2230 | 65 | ABCDEDF | 2014-09-11 | 1 | 1 | 10 |
| 2231 | 64 | ABCDEDF | 2014-09-11 | 3 | 4 | 40 |
| 2232 | 64 | ABCDEDF | 2014-09-11 | 5 | 5 | 50 |
| 2235 | 64 | ZZ | 2014-09-11 | 7 | 6 | 60 |
| 2233 | 64 | ABCDEDF | 2014-09-11 | 7 | 6 | 60 |
| 2237 | 66 | ABCDEDF | 2014-09-11 | 7 | 6 | 60 |
| 2234 | 64 | ZZ | 2014-09-18 | 3 | 1 | 11 |
| 2236 | 66 | ABCDEDF | 2014-09-18 | 3 | 1 | 100 |
| 2227 | 64 | ABCDEDF | 2014-09-18 | 3 | 1 | 100 |
| 2228 | 64 | ABCDEDF | 2014-09-18 | 5 | 2 | 200 |
| 2229 | 64 | ABCDEDF | 2014-09-18 | 7 | 3 | 300 |
For a given date:
> SELECT id, dir, product, date, hour, quantity, amount FROM sales
WHERE date = '2014-09-18'
ORDER BY hour;
| id | dir | product | date | hour | quantity | amount |
| 2227 | 64 | ABCDEDF | 2014-09-18 | 3 | 1 | 100 |
| 2236 | 66 | ABCDEDF | 2014-09-18 | 3 | 1 | 100 |
| 2234 | 64 | ZZ | 2014-09-18 | 3 | 1 | 11 |
| 2228 | 64 | ABCDEDF | 2014-09-18 | 5 | 2 | 200 |
| 2229 | 64 | ABCDEDF | 2014-09-18 | 7 | 3 | 300 |
The results that I need, using sub-select:
> SELECT date, hour, SUM(quantity),
( SELECT SUM(quantity) FROM sales s2
WHERE s2.hour <= s1.hour AND s2.date = s1.date
) AS total
FROM sales s1
WHERE s1.date = '2014-09-18'
GROUP by date, hour;
| date | hour | sum(quantity) | total |
| 2014-09-18 | 3 | 3 | 3 |
| 2014-09-18 | 5 | 2 | 5 |
| 2014-09-18 | 7 | 3 | 8 |
My concerns for using sub-select:
once there are around a million records in the table, the query may become too slow; I'm not sure whether it can be optimized, even though it has no HAVING clause.
if I had to filter on a product or dir, I would have to put those conditions in both the main SELECT and the sub-SELECT (WHERE product = / WHERE dir =).
a sub-select can only return a single sum, while I need two of them (sum(quantity) and sum(amount)) (ERROR 1241 (21000): Operand should contain 1 column(s)).
The closest result I was able to get using JOIN:
> SELECT DISTINCT(s1.hour) AS ih, s2.date, s2.hour, s2.quantity, s2.amount, s2.id
FROM sales s1
JOIN sales s2 ON s2.date = s1.date AND s2.hour <= s1.hour
WHERE s1.date = '2014-09-18'
ORDER by ih;
| ih | date | hour | quantity | amount | id |
| 3 | 2014-09-18 | 3 | 1 | 100 | 2236 |
| 3 | 2014-09-18 | 3 | 1 | 100 | 2227 |
| 3 | 2014-09-18 | 3 | 1 | 11 | 2234 |
| 5 | 2014-09-18 | 3 | 1 | 100 | 2236 |
| 5 | 2014-09-18 | 3 | 1 | 100 | 2227 |
| 5 | 2014-09-18 | 5 | 2 | 200 | 2228 |
| 5 | 2014-09-18 | 3 | 1 | 11 | 2234 |
| 7 | 2014-09-18 | 3 | 1 | 100 | 2236 |
| 7 | 2014-09-18 | 3 | 1 | 100 | 2227 |
| 7 | 2014-09-18 | 5 | 2 | 200 | 2228 |
| 7 | 2014-09-18 | 7 | 3 | 300 | 2229 |
| 7 | 2014-09-18 | 3 | 1 | 11 | 2234 |
I could stop here and just use those results to group by ih (hour), calculate the sums for quantity and amount, and be happy. But something keeps telling me that this is wrong.
If I remove DISTINCT, most rows get duplicated. Replacing JOIN with its variants doesn't help.
Once I remove s2.id from the statement, I get a complete mess, with meaningful rows disappearing or collapsing (e.g. ids 2236/2227 got collapsed):
> SELECT DISTINCT(s1.hour) AS ih, s2.date, s2.hour, s2.quantity, s2.amount
FROM sales s1
JOIN sales s2 ON s2.date = s1.date AND s2.hour <= s1.hour
WHERE s1.date = '2014-09-18'
ORDER by ih;
| ih | date | hour | quantity | amount |
| 3 | 2014-09-18 | 3 | 1 | 100 |
| 3 | 2014-09-18 | 3 | 1 | 11 |
| 5 | 2014-09-18 | 3 | 1 | 100 |
| 5 | 2014-09-18 | 5 | 2 | 200 |
| 5 | 2014-09-18 | 3 | 1 | 11 |
| 7 | 2014-09-18 | 3 | 1 | 100 |
| 7 | 2014-09-18 | 5 | 2 | 200 |
| 7 | 2014-09-18 | 7 | 3 | 300 |
| 7 | 2014-09-18 | 3 | 1 | 11 |
Summing doesn't help; it only adds to the mess.
The first row (hour = 3) should have SUM(s2.quantity) equal to 3, but it has 9. What SUM(s1.quantity) shows is a complete mystery to me.
> SELECT DISTINCT(s1.hour) AS hour, sum(s1.quantity), s2.date, SUM(s2.quantity)
FROM sales s1 JOIN sales s2 ON s2.date = s1.date AND s2.hour <= s1.hour
WHERE s1.date = '2014-09-18'
GROUP BY hour;
| hour | sum(s1.quantity) | date | sum(s2.quantity) |
| 3 | 9 | 2014-09-18 | 9 |
| 5 | 8 | 2014-09-18 | 5 |
| 7 | 15 | 2014-09-18 | 8 |
Bonus points/boss level:
I also need a column that will show total_reference, the same rolling total for the same periods for a different date (e.g. 2014-09-11).
If you want a cumulative sum in MySQL, the most efficient way is to use variables:
SELECT date, hour,
(@q := q + @q) as cumeq, (@a := a + @a) as cumea
FROM (SELECT date, hour, SUM(quantity) as q, SUM(amount) as a
FROM sales s
WHERE s.date = '2014-09-18'
GROUP by date, hour
) dh cross join
(select @q := 0, @a := 0) vars
ORDER BY date, hour;
If you are planning on working with databases such as Oracle, SQL Server, and Postgres, then you should develop against a database that is more similar in functionality and that supports the ANSI-standard window functions. The right way to do this is with window functions, but MySQL doesn't support those (they were only added in MySQL 8.0). Postgres, SQL Server, and Oracle all have free versions that you can use for development purposes.
Also, with proper indexing, you shouldn't have a problem with the subquery approach, even on large tables.
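For comparison, the window-function version mentioned above might look like the following. This is a sketch under assumptions: it runs on SQLite 3.25+ through Python, which is not one of the databases named here, and the CTE name hourly and the q / a column names are illustrative. It returns both running totals in one pass.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE sales (
  id INTEGER, dir INTEGER, product TEXT, date TEXT, hour INTEGER,
  quantity INTEGER, amount INTEGER)""")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?, ?, ?)", [
    (2230, 65, "ABCDEDF", "2014-09-11", 1, 1, 10),
    (2231, 64, "ABCDEDF", "2014-09-11", 3, 4, 40),
    (2234, 64, "ZZ", "2014-09-18", 3, 1, 11),
    (2236, 66, "ABCDEDF", "2014-09-18", 3, 1, 100),
    (2227, 64, "ABCDEDF", "2014-09-18", 3, 1, 100),
    (2228, 64, "ABCDEDF", "2014-09-18", 5, 2, 200),
    (2229, 64, "ABCDEDF", "2014-09-18", 7, 3, 300),
])

query = """
WITH hourly AS (
  -- one row per hour; product/dir filters would be added here, in one place only
  SELECT date, hour, SUM(quantity) AS q, SUM(amount) AS a
  FROM sales
  WHERE date = '2014-09-18'
  GROUP BY date, hour
)
SELECT date, hour, q,
       SUM(q) OVER (PARTITION BY date ORDER BY hour) AS total_quantity,
       SUM(a) OVER (PARTITION BY date ORDER BY hour) AS total_amount
FROM hourly
ORDER BY hour
"""
rolling = conn.execute(query).fetchall()
for row in rolling:
    print(row)
```

The quantity totals match the sub-select result from the question (3, 5, 8), and the amount total comes along without a second correlated subquery.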

Query to find the first date after a specific grouped sum value

I have an article table that holds the current stock for each article. I need to know the last date when new stock has arrived, after running out of stock for that specific article.
The table looks like this.
| ArticleID | StockDate | Stock |
| 1 | 1/1/2012 10:15 | 100 |
| 1 | 2/1/2012 13:39 | -50 |
| 1 | 2/1/2012 15:17 | -50 |
| 1 | 4/1/2012 08:05 | 100 |
| 2 | 5/1/2012 09:48 | 50 |
| 1 | 6/1/2012 14:21 | -25 |
| 1 | 7/1/2012 16:01 | 10 |
| 2 | 8/1/2012 13:42 | -10 |
| 1 | 9/1/2012 09:56 | -85 |
| 1 | 13/1/2012 08:12 | 100 |
| 1 | 13/1/2012 10:50 | -15 |
The output should look like this.
| ArticleID | StockDate |
| 2 | 5/1/2012 09:48 |
| 1 | 13/1/2012 08:12 |
How did I get this output? ArticleID 1 had 100 in stock but reached 0 for the first time on 2/1/2012 15:17. Then new stock arrived, and it hit 0 again on 9/1/2012 09:56. So the result should show the first date after the last time the stock was empty, grouped by ArticleID. ArticleID 2 never reached 0, so its first stock date is shown.
I need a result set that can be joined with other queries. So a Stored Procedure does not work for me.
select ArticleID, stockdate from
(
select t.ArticleID, t.stockdate, ROW_NUMBER() Over (partition by t.articleid order by v.articleid desc, stockdate) rn
from yourtable t
left join
(
select ArticleID, MAX(stockdate) as msd from yourtable t1
cross apply (select sum(stock) as stockrt from yourtable where stockdate<=t1.stockdate and ArticleID=t1.ArticleID) rt
where stockrt = 0
group by articleid
) v
on t.ArticleID = v.ArticleID
and t.stockdate>v.msd
) v
where rn=1
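The same idea (running balance, last time it hit 0, first row after that) can also be written with a windowed SUM instead of cross apply. The sketch below is an alternative under assumptions, not the query above: it runs on SQLite through Python, uses made-up names (article_stock, article_id, stock_date), and stores the dates as ISO strings, reading the sample as day/month/year, so that they sort correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE article_stock (article_id INTEGER, stock_date TEXT, stock INTEGER)")
conn.executemany("INSERT INTO article_stock VALUES (?, ?, ?)", [
    (1, "2012-01-01 10:15", 100), (1, "2012-01-02 13:39", -50),
    (1, "2012-01-02 15:17", -50), (1, "2012-01-04 08:05", 100),
    (2, "2012-01-05 09:48", 50), (1, "2012-01-06 14:21", -25),
    (1, "2012-01-07 16:01", 10), (2, "2012-01-08 13:42", -10),
    (1, "2012-01-09 09:56", -85), (1, "2012-01-13 08:12", 100),
    (1, "2012-01-13 10:50", -15),
])

query = """
WITH running AS (
  -- stock balance after each movement, per article
  SELECT article_id, stock_date,
         SUM(stock) OVER (PARTITION BY article_id ORDER BY stock_date) AS balance
  FROM article_stock
),
last_empty AS (
  -- most recent moment each article was at zero (no row if it never was)
  SELECT article_id, MAX(stock_date) AS emptied_at
  FROM running
  WHERE balance = 0
  GROUP BY article_id
)
SELECT r.article_id, MIN(r.stock_date) AS restocked_at
FROM running r
LEFT JOIN last_empty e ON e.article_id = r.article_id
WHERE e.emptied_at IS NULL OR r.stock_date > e.emptied_at
GROUP BY r.article_id
ORDER BY r.article_id
"""
restocked = conn.execute(query).fetchall()
for row in restocked:
    print(row)
```

Since it is a plain SELECT it can be joined with other queries. One caveat: an article whose latest movement leaves the balance at 0 has no later row and drops out of the result.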