Cumulative Total SQL Server - sql

I'm trying to get a cumulative running total using the LAG function and SUM. The column I want to sum adds rows 1 and 2 together, but it doesn't continue by adding rows 1, 2, 3, 4, etc. Once a "reset" amount is hit, the running total needs to go back to the reset amount times the Coinin amount on the same row.
Ultimately, I want to know, at any given point in history, how much a slot machine's progression is for, say, levels 1 and 2 before and after a jackpot payout. (This query only looks at level 1.)
Select Distinct A.AID
,A.BID
,B.Level
,A.Date
,B.Reset
,B.Cap
,Description
,B.RateofProg
,A.Coinin
,LAG(B.RateofProg/100.00 * A.Coinin/100.00) OVER (order by AID, BID, Level) + SUM(B.RateofProg/100 * A.Coinin/100) as RunningTotal
,CASE When C.Eventcode = 10004500 THEN ProgressivePdAmt/100.00 Else 0 end as ProgressivePdAmt
From Payout A
Join Slot_Progression B
on A.Mnum = B.Mnum
Join Events C
on A.Date = C.Date
Where A.Mnum = '102026'
and level = '1'
and A.Coinin > '0'
Group by A.AID, A.BID, B.Level, A.Date, B.Reset, B.Cap, Description, C.ProgressivePdAmt, B.RateofProg, A.Coinin, C.Eventcode
Order by AID, BID, Level

The cumulative sum is calculated using sum() not lag(). Presumably you want something like this:
sum(B.RateofProg/100.00 * A.Coinin/100.00) OVER (order by AID, BID, Level) as RunningTotal
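To see why LAG alone cannot produce a running total, here is a minimal Python sketch. The per-row contributions are made-up values standing in for RateofProg/100 * Coinin/100, and the reset-on-jackpot rule from the question is not modelled.

```python
from itertools import accumulate

# Hypothetical per-row contributions (stand-ins for RateofProg/100 * Coinin/100).
contributions = [1.0, 2.0, 3.0, 4.0]

# LAG(x) + x: each row only sees the previous row and itself.
lag_plus_current = [
    None if i == 0 else contributions[i - 1] + contributions[i]
    for i in range(len(contributions))
]

# SUM(x) OVER (ORDER BY ...): each row sees every row up to and including itself.
running_total = list(accumulate(contributions))

print(lag_plus_current)  # [None, 3.0, 5.0, 7.0]
print(running_total)     # [1.0, 3.0, 6.0, 10.0]
```

The first list matches the symptom described in the question (only two rows are ever added together), while the second is the cumulative behaviour the windowed sum() gives.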


PostgreSQL and matching row on multiple

I'm making a car statistic solution where I need to charge per kilometer driven.
I have the following table:
table: cars
columns: car_id, km_driven
table: pricing
columns: from, to, price
Content in my cars table can be:
car_id, km_driven
2, 430
3, 112
4, 90
Content on my pricing table can be:
from, to, price
0, 100, 2
101, 200, 1
201, null, 0.5
Meaning that the first 100 km cost 2USD per km, the next 100 km cost 1USD per km and everything above costs 0.5USD per km.
Is there a logical and simple way to calculate the cost for my cars in PostgreSQL?
So if a car has driven e.g. 201 km, the price would be 100x2 + 100x1 + 0.5, not simply 201x0.5.
I would write the query as:
select c.car_id, c.km_driven,
sum(( least(p.to_km, c.km_driven) - p.from_km + 1) * p.price) as dist_price
from cars c join
pricing p
on c.km_driven >= p.from_km
group by c.car_id, c.km_driven;
Here is a db<>fiddle.
Modified from @sean-johnston's answer:
select
car_id, km_driven,
sum(case
when km_driven>=start then (least(finish,km_driven)-start+1)*price
else 0
end) as dist_price
from cars,pricing
group by car_id,km_driven
The original ranges are kept.
where km_driven >= start is omitted (it's optional, but might improve performance).
After fiddling a bit more: the case can be omitted when the where clause is in place.
select
car_id, km_driven,
sum((least(finish,km_driven)-start+1)*price) as dist_price
from cars,pricing
where km_driven >= start
group by car_id,km_driven
dbfiddle
Judicious use of case/sum combinations will do it. However, you first need to make your ranges consistent. I'll choose to change the first range to 1,100. Given that, the following should give you what you're after. (I've also used start/finish, as from/to are reserved words.)
select
car_id, km_driven,
sum (case
when finish is null and km_driven >= start
then (km_driven-start+1) * price
when km_driven >= start
then (case
when km_driven > finish
then (finish - start + 1)
else (km_driven - start + 1)
end) * price
else 0
end) as dist_price
from cars, pricing
where km_driven >= start
group by 1, 2;
Explanation:
We join against any range where the journey is at least as far as the start of the range.
The open ended range is handled in the first case clause and is fairly simple.
We need an inner case clause for the closed ranges as we only want the part of the journey in that range.
Then sum the results of that for the total journey price.
If you don't want to (or can't) make your ranges consistent then you'd need to add a third outer case for the start range.
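The range logic explained above can be checked outside the database with a small Python sketch. It assumes the consistent inclusive ranges (1-100, 101-200, 201 and up); the names TIERS and distance_price are made up for illustration.

```python
# (start, finish, price per km); finish=None marks the open-ended range.
# Assumption: the first range starts at 1 so that all ranges are inclusive.
TIERS = [
    (1, 100, 2.0),
    (101, 200, 1.0),
    (201, None, 0.5),
]


def distance_price(km_driven):
    """Sum the price of the part of the journey that falls in each range."""
    total = 0.0
    for start, finish, price in TIERS:
        if km_driven < start:
            continue  # the journey never reaches this range
        # Last kilometre of the journey that still belongs to this range.
        last_km = km_driven if finish is None else min(finish, km_driven)
        total += (last_km - start + 1) * price
    return total


for km in (430, 112, 90, 201):
    print(km, distance_price(km))
```

For the sample cars this gives 415.0 for 430 km, 212.0 for 112 km and 180.0 for 90 km, and 300.5 for the 201 km example from the question.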
I would definitely do this using a procedure, as it can be implemented in a very straightforward manner using loops. However, you should be able to do something similar to this:
select car_id, sum(segment_price)
from (
select
car_id,
km_driven,
f,
t,
price,
driven_in_segment,
segment_price
from (
select
car_id,
km_driven,
f,
t,
price,
(coalesce(least(t, km_driven), km_driven) - f) driven_in_segment,
price * (coalesce(least(t, km_driven), km_driven) - f) segment_price
from
-- NOTE: cartesian product here
cars,
pricing
where f < km_driven
) segments
) data
group by car_id
order by car_id
I find that quite a bit less readable, though.
UPDATE:
That query is a bit more complex than necessary; I was trying out some things with window functions that were not needed in the end. Here is a simplified version that should be equivalent:
select car_id, sum(segment_price)
from (
select
car_id,
km_driven,
f,
t,
price,
(coalesce(least(t, km_driven), km_driven) - f) driven_in_segment,
price * (coalesce(least(t, km_driven), km_driven) - f) segment_price
from
-- NOTE: cartesian product here
cars,
pricing
where f < km_driven
) data
group by car_id
order by car_id
You can use a join and calculate the cost with case when:
select c.car_id, case when p.price=.5
then 100*2+100*1+(c.km_driven-200)*0.5
when p.price=1 then 100*2+(c.km_driven-100)*1
else c.km_driven*p.price end as cost
from cars c join pricing p
on c.km_driven>=p."from" and (p."to" is null or c.km_driven<=p."to")

Recursive calculation in SQL (Oracle)

I'm having a tough time finding a way to ETL some data into my resulting table. I don't think I can accomplish this in pure SQL and suspect I need PL/SQL because of the looping. Could the SQL gurus point me in the right direction or give me some pointers for solving this problem?
Here's the scenario:
Tables: TABLEA and TABLEB.
Steps:
1. Group the records in TABLEA by A_CD and SUM the A_AMT field. (Let's assume A_FLAG is always the same for any A_CD.) Let's call the grouped result set TABLEA_GRP (this is not a table, it is a grouped query).
2. Pick a row from TABLEB. If B_FLAG is 'N', pick all rows in TABLEA_GRP where A_FLAG is 'N'. If B_FLAG is 'Y', pick all rows in TABLEA_GRP.
3. Starting with the first of the rows picked in step 2, calculate the ratio of its TOTAL_AMT to the SUM of all TOTAL_AMT for the selected rows. Multiply the ratio by B_AMT, add the resulting amount to the row's TOTAL_AMT and store it in RESULTING_AMT. Repeat this calculation for all rows picked in step 2.
4. Repeat steps 2 and 3, now using as the starting TOTAL_AMT the RESULTING_AMT value from the previous calculation for the same A_CD.
The RESULTING_RATIO field does not need to be saved; it is just given for demonstration purposes. How would you do this?
Basically I want to get data in RESULTING_TABLE from TABLEA and TABLEB
Could anyone help? Thanks a lot in advance for any guidance.
EDIT: I added A_DATE and B_DATE to support a join between the two tables. For simplicity you can just use A.A_DATE = B.B_DATE, for example with this basic join:
SELECT
A.A_CD,
SUM(A.A_AMT) AS TOTAL_AMT,
A.A_FLAG,
A.A_DATE,
B.B_ID,
B.B_AMT,
B.B_FLAG
FROM
TABLEA A
JOIN TABLEB B
ON A.A_DATE = B.B_DATE
GROUP BY
A.A_CD,
A.A_FLAG,
A.A_DATE,
B.B_ID,
B.B_AMT,
B.B_FLAG
;
Okay, I think I've got the solution. The numbers are a bit different from yours, but I'm fairly sure mine is doing what you want. We can do everything in steps 1 and 2 using a single query (main_sql). Steps 3 and 4 have to be done using a recursive statement (recur_sql).
with main_sql as (
select a.*,
b.*,
sum(a_amt) over (partition by b_id) as cd_amt,
rank() over (partition by a_cd order by b_id) as rnk
from (select a_cd, a_flag, sum(a_amt) as a_amt
from tablea
group by a_cd, a_flag) a,
tableb b
where a.a_flag = case when b.b_flag = 'Y' then a.a_flag else b.b_flag end
order by b_id, a_cd
),
recur_sql (a_cd, b_id, total_amt, cd_amt, resulting_ratio, resulting_amt, rnk) as (
select m.a_cd,
m.b_id,
m.a_amt as total_amt,
m.cd_amt, m.a_amt / m.cd_amt as resulting_ratio,
m.a_amt + (m.a_amt / m.cd_amt * m.b_amt) as resulting_amt,
rnk
from main_sql m
where rnk = 1
union all
select m.a_cd,
m.b_id,
r.resulting_amt as total_amt,
m.cd_amt,
r.resulting_amt / m.cd_amt as resulting_ratio,
r.resulting_amt + (r.resulting_amt / m.cd_amt * m.b_amt) as resulting_amt,
m.rnk
from recur_sql r,
main_sql m
where m.rnk > 1
and r.a_cd = m.a_cd
and m.rnk - 1 = r.rnk
)
select a_cd, b_id, total_amt, resulting_ratio, resulting_amt
from recur_sql
order by 2, 1
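
For comparison, the four steps from the question can be sketched procedurally in Python. This is only one reading of those steps: it divides by the sum of the current totals of the picked rows (the query above uses cd_amt computed from the original amounts instead), and the function name, data layout and sample values are all invented for illustration.

```python
def allocate(groups, b_rows):
    """groups: {a_cd: (total_amt, a_flag)}, i.e. TABLEA already grouped (step 1).
    b_rows: [(b_id, b_amt, b_flag)] in the order they should be processed.
    Returns [(b_id, a_cd, total_amt, resulting_amt)]."""
    totals = {cd: amt for cd, (amt, _flag) in groups.items()}
    flags = {cd: flag for cd, (_amt, flag) in groups.items()}
    history = []
    for b_id, b_amt, b_flag in b_rows:
        # Step 2: 'Y' picks every group, 'N' picks only groups flagged 'N'.
        picked = [cd for cd in totals if b_flag == "Y" or flags[cd] == "N"]
        base = sum(totals[cd] for cd in picked)
        if not picked or base == 0:
            continue  # nothing to distribute against
        # Step 3: compute every ratio from the totals *before* updating any of them.
        updated = {cd: totals[cd] + totals[cd] / base * b_amt for cd in picked}
        for cd in picked:
            history.append((b_id, cd, totals[cd], updated[cd]))
        # Step 4: the resulting amounts become the starting totals for the next B row.
        totals.update(updated)
    return history


# Hypothetical sample data, not taken from the question.
sample_groups = {"X": (100, "N"), "Y": (300, "Y")}
sample_b_rows = [(1, 40, "Y"), (2, 10, "N")]
for row in allocate(sample_groups, sample_b_rows):
    print(row)
```

With this sample data, B row 1 is split 10/30 between X and Y, and B row 2 goes entirely to X, the only group flagged 'N'.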

FIFO Implementation in Inventory using SQL

This is basically an inventory project which tracks the "Stock In" and "Stock Out" of items through Purchase and sales respectively.
The inventory system follows FIFO Method (the items which are first purchased are always sold first). For example:
If we purchased Item A in January, February and March,
when a customer comes we give away the items purchased in January;
only when the January items are gone do we start giving away the February items, and so on.
So I have to show here the total stock in my hand and the split up so that I can see the total cost incurred.
Actual table data:
The result set I need to obtain:
My client insists that I should not use Cursor, so is there any other way of doing so?
As a comment already said, a CTE can solve this:
with cte as (
select item, wh, stock_in, stock_out, price, value
, row_number() over (partition by item, wh order by item, wh) as rank
from myTable)
select a.item, a.wh
, a.stock_in - coalesce(b.stock_out, 0) stock
, a.price
, a.value - coalesce(b.value, 0) value
from cte a
left join cte b on a.item = b.item and a.wh = b.wh and a.rank = b.rank - 1
where a.stock_in - coalesce(b.stock_out, 0) > 0
Note that the second "Item B" row appears to have the wrong price (the IN price is 25, the OUT price is 35).
SQL 2008 fiddle
Just for fun: with SQL Server 2012 and the introduction of the LEAD and LAG functions, the same thing is possible in a somewhat easier way:
with cte as (
select item, wh, stock_in
, coalesce(LEAD(stock_out)
OVER (partition by item, wh order by item, wh), 0) stock_out
, price, value
, coalesce(LEAD(value)
OVER (partition by item, wh order by item, wh), 0) value_out
from myTable)
select item
, wh
, (stock_in - stock_out) stock
, price
, (value - value_out) value
from cte
where (stock_in - stock_out) > 0
SQL2012 fiddle
Update
ATTENTION: to use the two queries above, the data needs to be in the correct order.
To handle more than one row per day you need something reliable to order the rows that share a date, such as a date column with a time component or an auto-incrementing ID. The queries already written cannot be used in that case, because they are based on the position of the data.
A better idea is to split the data into IN and OUT rows, order them by item, wh and date, and apply a rank to both sets, like this:
SELECT d_in.item
, d_in.wh
, d_in.stock_in - coalesce(d_out.stock_out, 0) stock
, d_in.price
, d_in.value - coalesce(d_out.value, 0) value
FROM (SELECT item, wh, stock_in, price, value
, rank = row_number() OVER
(PARTITION BY item, wh ORDER BY item, wh, date)
FROM myTable
WHERE stock_out = 0) d_in
LEFT JOIN
(SELECT item, wh, stock_out, price, value
, rank = row_number() OVER
(PARTITION BY item, wh ORDER BY item, wh, date)
FROM myTable
WHERE stock_in = 0) d_out
ON d_in.item = d_out.item AND d_in.wh = d_out.wh
AND d_in.rank = d_out.rank
WHERE d_in.stock_in - coalesce(d_out.stock_out, 0) > 0
SQLFiddle
But this query is NOT completely reliable: the order of rows within the same ordering group is not stable.
I haven't changed the query to recalculate the price if the IN price is different from the OUT price.
If cursors aren't an option, a SQLCLR stored procedure might be. This way you could load the raw data into .NET objects, manipulate and sort it using C# or VB.NET, and set the resulting data as the procedure's output. Not only will this give you what you want, it may even turn out to be much easier than trying to do the same in pure T-SQL, depending on your programming background.
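As a rough illustration of how small the procedural FIFO logic is, here is a Python sketch (Python rather than C# or VB.NET, purely for brevity). The lot layout, the function name fifo_remaining and the sample figures are assumptions, not taken from the question's tables.

```python
from collections import deque


def fifo_remaining(purchases, sold_qty):
    """purchases: [(label, qty, unit_price)], oldest first.
    Returns the lots still in stock after selling sold_qty units FIFO."""
    lots = deque(purchases)
    to_sell = sold_qty
    while to_sell > 0 and lots:
        label, qty, price = lots[0]
        if qty <= to_sell:
            # The oldest lot is used up completely.
            lots.popleft()
            to_sell -= qty
        else:
            # The oldest lot is only partly consumed.
            lots[0] = (label, qty - to_sell, price)
            to_sell = 0
    if to_sell > 0:
        raise ValueError("sold more than was purchased")
    return list(lots)


# Hypothetical purchases of one item in one warehouse.
stock = fifo_remaining([("Jan", 10, 5.0), ("Feb", 20, 6.0), ("Mar", 5, 7.0)], 15)
stock_value = sum(qty * price for _label, qty, price in stock)
print(stock)        # [('Feb', 15, 6.0), ('Mar', 5, 7.0)]
print(stock_value)  # 125.0
```

Unlike the rank-matching queries above, this consumes the oldest lots across row boundaries, so one sale can use up several purchases.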

Replace or updates null values with the sum of amount spend for customers

To start with, I have a table that looks like this:
cust_id/Bill_amt/Brand/BrandA/BrandB/Total_value
100/350/A/NULL/NULL/NULL
100/250/A/NULL/NULL/NULL
100/100/B/NULL/NULL/NULL
300/200/B/NULL/NULL/NULL
I would like to replace the NULL values with the amount spent by the same customer. As you can see from the table above, cust_id 100 is repeated, because this customer purchased both brand A and brand B on different dates. I need your help to sum up everything for that customer on each row. After that, there will be 3 rows carrying the same totals (duplication), as shown below:
cust_id/Bill_amt/Brand/BrandA/BrandB/Total_value
100/350/A/600/100/700
100/250/A/600/100/700
100/100/B/600/100/700
300/200/B/0/200/200
For example, cust_id 100 spent $600 (350+250) on brand A and only $100 (see the 3rd row for cust_id 100) on brand B, so the total value is $700 (600+100).
I hope this explanation is clear enough.
After the table has been updated as shown above, we will remove the duplicates ourselves.
Please provide an SQL query that replaces or updates the NULL values with the sum of bill_amt, as we have more than 200,000 records to process.
Thank you very much for taking the time to reply.
Why do you need the BrandA, BrandB and Total_value columns? They are not required.
;WITH cte
AS (SELECT cust_id,
brand,
Sum(bill_amt) bill_amt
FROM #t
GROUP BY rollup( cust_id, brand ))
UPDATE #t
SET branda = COALESCE(a.bill_amt, 0),
brandb = COALESCE(c.bill_amt, 0),
total_value = COALESCE(d.bill_amt, 0)
FROM #t b
LEFT JOIN cte a
ON a.cust_id = b.cust_id
AND a.brand = 'A'
LEFT JOIN cte c
ON b.cust_id = c.cust_id
AND c.brand = 'B'
LEFT JOIN cte d
ON b.cust_id = d.cust_id
AND d.brand IS NULL
SELECT *
FROM #t
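
The expected figures can be reproduced with plain conditional aggregation. The sketch below runs it against an in-memory SQLite database from Python, using the four sample rows from the question; the table name bills is made up, and SQLite stands in for SQL Server only to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bills (cust_id INTEGER, bill_amt INTEGER, brand TEXT)")
conn.executemany(
    "INSERT INTO bills VALUES (?, ?, ?)",
    [(100, 350, "A"), (100, 250, "A"), (100, 100, "B"), (300, 200, "B")],
)

# One row per customer: spend on brand A, spend on brand B, and the overall total.
rows = conn.execute(
    """
    SELECT cust_id,
           SUM(CASE WHEN brand = 'A' THEN bill_amt ELSE 0 END) AS brand_a,
           SUM(CASE WHEN brand = 'B' THEN bill_amt ELSE 0 END) AS brand_b,
           SUM(bill_amt) AS total_value
    FROM bills
    GROUP BY cust_id
    ORDER BY cust_id
    """
).fetchall()

print(rows)  # [(100, 600, 100, 700), (300, 0, 200, 200)]
```

This yields one row per customer directly, which also avoids the later de-duplication step.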

SQL Server: Group similar sales together

I'm trying to do some reporting in SQL Server.
Here's the basic table setup:
Order (ID, DateCreated, Status)
Product(ID, Name, Price)
Order_Product_Mapping(OrderID, ProductID, Quantity, Price, DateOrdered)
I want to create a report that groups products with a similar amount of sales over a time period, like this:
Sales over 1 month:
Coca, Pepsi, Tiger: $20000 average (coca: $21000, pepsi: $19000, tiger: $20000)
Bread, Meat: $10000 average (bread: $11000, meat: $9000)
(Note that the text in parentheses is just to clarify; it is not needed in the report.)
The user defines how much sales may vary and still be considered similar. For example, sales that vary by less than 5% are considered similar and should be grouped together. The time period is also user-defined.
I can calculate total sales over a period, but I have no idea how to group them by sales variation. I'm using SQL Server 2012.
Any help is appreciated.
Sorry, my English is not very good :)
UPDATE: *I figured out what I actually need ;)*
For a known array of numbers like: 1,2,3,50,52,100,102,105
I need to group them into groups that have at least 3 numbers, where the difference between any two items in a group is smaller than 10.
For the above array, the output should be:
[1,2,3]
[100,102,105]
=> the algorithm takes 3 parameters: the array, the minimum number of items to form a group, and the maximum difference between 2 items.
How can I implement this in C#?
By the way, if you just want C#:
using System.Collections.Generic;
using System.Linq;
var maxDifference = 10;
var minItems = 3;
// I just assume your list is not ordered, so order it first
var array = (new List<int> {3, 2, 50, 1, 51, 100, 105, 102}).OrderBy(a => a);
var result = new List<List<int>>();
var group = new List<int>();
var lastNum = array.First();
var totalDiff = 0;
foreach (var n in array)
{
totalDiff += n - lastNum;
// if distance of current number and first number in current group
// is less than the threshold, add into current group
if (totalDiff <= maxDifference)
{
group.Add(n);
lastNum = n;
continue;
}
// if current group has 3 items or more, add to final result
if (group.Count >= minItems)
result.Add(group);
// start new group
group = new List<int>() { n };
lastNum = n;
totalDiff = 0;
}
// don't forget the last group
if (group.Count >= minItems)
result.Add(group);
The key here is that the array needs to be ordered, so that you do not need to jump around or store values to calculate distances.
I can't believe I did it~~~
-- this threshold is the key in this query
-- it means that
-- if the difference between two values is less than the threshold
-- the two values belong to one group
-- in your case, I think it is 200
DECLARE @th int
SET @th = 200
-- very simple, calculate total price for a time range
;WITH totals AS (
SELECT p.name AS col, sum(op.price * op.quantity) AS val
FROM order_product_mapping op
JOIN [order] o ON o.id = op.orderid
JOIN product p ON p.id = op.productid
WHERE dateordered > '2013-03-01' AND dateordered < '2013-04-01'
GROUP BY p.name
),
-- give a row number for each row
cte_rn AS ( --
SELECT col, val, row_number()over(ORDER BY val DESC) rn
FROM totals
),
-- the show starts now:
-- first, we make each row know the row before it
cte_last_rn AS (
SELECT col, val, CASE WHEN rn = 1 THEN 1 ELSE rn - 1 END lrn
FROM cte_rn
),
-- then we join the current row to the row before it, and calculate
-- the difference between the total price of the current row and that of the previous row
-- if the difference is more than the threshold we make it '1', otherwise '0'
cte_range AS (
SELECT
c1.col, c1.val,
CASE
WHEN c2.val - c1.val <= @th THEN 0
ELSE 1
END AS range,
rn
FROM cte_last_rn c1
JOIN cte_rn c2 ON lrn = rn
),
-- even trickier here:
-- now we join the last cte to itself, and for each row
-- sum all the values (the 0s and 1s calculated previously) of the rows up to the current row
cte_rank AS (
SELECT c1.col, c1.val, sum(c2.range) rank
FROM cte_range c1
JOIN cte_range c2 ON c1.rn >= c2.rn
GROUP BY c1.col, c1.val
)
-- now we have properly grouped these total prices, and we can group on the rank
SELECT
avg(c1.val) AVG,
(
SELECT c2.col + ', ' AS 'data()'
FROM cte_rank c2
WHERE c2.rank = c1.rank
ORDER BY c2.val desc
FOR xml path('')
) product,
(
SELECT cast(c2.val AS nvarchar(MAX)) + ', ' AS 'data()'
FROM cte_rank c2
WHERE c2.rank = c1.rank
ORDER BY c2.val desc
FOR xml path('')
) price
FROM cte_rank c1
GROUP BY c1.rank
HAVING count(1) > 2
The result will look like:
AVG PRODUCT PRICE
28 A, B, C 30, 29, 27
12 D, E, F 15, 12, 10
3 G, H, I 4, 3, 2
for understanding how I did concatenate, please read this:
Concatenate many rows into a single text string?
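The grouping idea behind the CTE chain (sort descending, start a new group whenever the gap to the previous value exceeds the threshold, keep groups of at least 3) can be sketched in a few lines of Python. The function name and the threshold of 5 are assumptions; the values are those from the sample result above. Note that, like the query, this compares neighbouring values only, not every pair within a group.

```python
def group_by_gap(values, threshold, min_items=3):
    """Group descending-sorted values; a gap larger than threshold starts a new group."""
    ordered = sorted(values, reverse=True)
    groups = []
    current = []
    for v in ordered:
        # Compare with the previous (larger) value, as the self-join on rn does.
        if current and current[-1] - v > threshold:
            groups.append(current)
            current = []
        current.append(v)
    if current:
        groups.append(current)
    # Equivalent of HAVING count(1) > 2 when min_items is 3.
    return [g for g in groups if len(g) >= min_items]


print(group_by_gap([30, 29, 27, 15, 12, 10, 4, 3, 2], threshold=5))
# [[30, 29, 27], [15, 12, 10], [4, 3, 2]]
print(group_by_gap([1, 2, 3, 50, 52, 100, 102, 105], threshold=10))
# [[105, 102, 100], [3, 2, 1]]
```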
This query should produce what you expect. It displays product sales for every month in which you have orders:
SELECT CONVERT(CHAR(4), OP.DateOrdered, 100) + CONVERT(CHAR(4), OP.DateOrdered, 120) As Month ,
Product.Name ,
AVG( OP.Quantity * OP.Price ) As Turnover
FROM Order_Product_Mapping OP
INNER JOIN Product ON Product.ID = OP.ProductID
GROUP BY CONVERT(CHAR(4), OP.DateOrdered, 100) + CONVERT(CHAR(4), OP.DateOrdered, 120) ,
Product.Name
Not tested, but if you provide sample data I could work on it
Looks like I made things more complicated than they should be.
Here is what should solve the problem:
- Run a query to get the sales for each product.
- Run k-means or a similar clustering algorithm.