I have a table with few fields like below
ID | OpeningBal | A | B | C | D | ClosingBal
Here opening balance of current day is the closing balance of previous day and closing balance is calculated with this formula
OpeningBal + A - B - (C + D) - C
but current data is wrong in this table because of wrong formula applied previously for Closing Balance. I have tried like looping through all the records of this table and update the closing balance to the actual value. I want to update the opening balance of next row with the closing balance of current row in this cursor but I don't have the ID of next row. Any thoughts?
Actual Result:
ID OPBal A B C D CLBal
1 0 80 4 6 0 90
2 90 8 6 0 0 104
5 104 5 4 0 9 122
7 122 10 3 5 0 140
expected result:
ID OPBal A B C D CLBal
1 0 80 4 6 0 64
2 64 46 6 0 0 104
5 104 5 4 0 9 96
7 96 10 3 5 0 93
update tablename set openingbalance=a.clcurrentrow from
(select LAG(closingbalance) over (order by id) clcurrentrow from tablename ) a
Thanks all. I got the solution using LAG and I will modify this based on my requirements.
SELECT
ID,
LAG(ClosingBalance,1,0.00) OVER (ORDER BY ID) PrevClosingBalance,
ClosingBalance
FROM Table1
UPDATE Table1
SET OpeningBalnace = #PrevClosingBalance
ClosingBalance = #PrevClosingBalance + A - B -(C+D)-C
WHERE ID = #ID
Related
I'm trying to create a new column, whose values are dependent to the previous index row values, like in the example below:
Index
ID
AGE
SEX
HireMonth
Tag
1
101
23
M
9
101
2
102
32
M
12
102
3
103
25
F
11
1
4
104
29
M
10
104
5
105
45
F
1
1
6
106
21
M
7
106
7
107
56
F
6
107
8
108
12
M
4
108
Here's how I'm creating the Tag column :
CASE WHEN AGE > 25 AND HireMonth => 9
THEN (next row value = 1 AND same row ID )
ELSE ID
END AS Tag
Have you tried:
select [Index], ID, AGE, SEX, HireMonth,
CASE WHEN LAG(AGE, 1) OVER (ORDER BY [Index]) > 25 AND LAG(HireMonth, 1) OVER (ORDER BY [Index]) >= 9
THEN 1
ELSE ID
END AS Tag
FROM SOME_TABLE
I have a data set :
data have;
input group $ value;
datalines;
A 4
A 3
A 2
A 1
B 1
C 1
D 2
D 1
E 1
F 1
G 2
G 1
H 1
;
run;
The first variable is a group identifier, the second a value.
For each group, I want a new variable "sum" with the sum of all values in the column, exept for the group the observation is in.
My issue is having to do that on nearly 30 millions of observations, so efficiency matters.
I found that using data step was more efficient than using procs.
The final database should looks like :
data want;
input group $ value $ sum;
datalines;
A 4 11
A 3 11
A 2 11
A 1 11
B 1 20
C 1 20
D 2 18
D 1 18
E 1 20
F 1 20
G 2 18
G 1 20
H 1 20
;
run;
Any idea how to perform this please?
Edit: I don't know if this matter but the example I gave is a simplified version of my issue. In the real case, I have 2 other group variable, thus taking the sum of the whole column and substract the sum in the group is not a viable solution.
The requirement
sum of all values in the column, except for the group the observation is in
indicates two passes of the data must occur:
Compute the all_sum and each group's group_sumA hash can store each group's sum -- computed via a specified suminc: variable and .ref() method invocation. A variable can accumulate allsum.
Compute allsum - group_sum for each row of a group.The group_sum is retrieved from hash and subtracted from allsum.
Example:
data want;
if 0 then set have; * prep pdv;
declare hash sums (suminc:'value');
sums.defineKey('group');
sums.defineDone();
do while (not hash_loaded);
set have end=hash_loaded;
sums.ref(); * adds value to internal sum of hash data record;
allsum + value;
end;
do while (not last_have);
set have end=last_have;
sums.sum(sum:sum); * retrieve groups sum. Do you hear the Dragnet theme too?;
sum = allsum - sum; * subtract from allsum;
output;
end;
stop;
run;
What is wrong with a straight forward approach? You need to make two passes no matter what you do.
Like this. I included extra variables so you can see how the values are derived.
proc sql ;
create table want as
select a.*,b.grand,sum(value) as total, b.grand - sum(value) as sum
from have a
, (select sum(value) as grand from have) b
group by a.group
;
quit;
Results:
Obs group value grand total sum
1 A 3 21 10 11
2 A 1 21 10 11
3 A 2 21 10 11
4 A 4 21 10 11
5 B 1 21 1 20
6 C 1 21 1 20
7 D 2 21 3 18
8 D 1 21 3 18
9 E 1 21 1 20
10 F 1 21 1 20
11 G 1 21 3 18
12 G 2 21 3 18
13 H 1 21 1 20
Note it does not matter what you have as your GROUP BY clause.
Do you really need to output all of the original observations? Why not just output the summary table?
proc sql ;
create table want as
select a.group, b.grand - sum(value) as sum
from have a
, (select sum(value) as grand from have) b
group by a.group
;
quit;
Results
Obs group total sum
1 A 10 11
2 B 1 20
3 C 1 20
4 D 3 18
5 E 1 20
6 F 1 20
7 G 3 18
8 H 1 20
I would break this out into two different segments:
1.) You could start by using PROC SQL to get the sums by the group
2.) Then use some IF/THEN statements to reassign the values by group
I work for a small company and we're trying to get away from Excel workbooks for Inventory control. I thought I had it figured out with help from (Nasser) but its beyond me. This is what I can get into a table, from there I need too get it to look like the table below.
My data
ID|GrpID|InOut| LoadFt | LoadCostft| LoadCost | RunFt | RunCost| AvgRunCostFt
1 1 1 4549.00 0.99 4503.51 4549.00 0 0
2 1 1 1523.22 1.29 1964.9538 6072.22 0 0
3 1 2 -2491.73 0 0 3580.49 0 0
4 1 2 -96.00 0 0 3484.49 0 0
5 1 1 8471.68 1.41 11945.0688 11956.17 0 0
6 1 2 -369.00 0 0 11468.0568 0 0
7 2 1 1030.89 5.07 5223.56 1030.89 0 0
8 2 1 314.17 5.75 1806.4775 1345.06 0 0
9 2 1 239.56 6.3 1508.24 1509.228 0 0
10 2 2 -554.46 0 0 954.768 0 0
11 2 1 826.24 5.884 4861.5961 1781.008 0 0
Expected output
ID|GrpID|InOut| LoadFt | LoadCostft| LoadCost | RunFt | RunCost| AvgRunCostFt
1 1 1 4549.00 0.99 4503.51 4549.00 4503.51 0.99
2 1 1 1523.22 1.29 1964.9538 6072.22 6468.4638 1.0653
3 1 2 -2491.73 1.0653 -2490.6647 3580.49 3977.7991 1.111
4 1 2 -96.00 1.111 -106.656 3484.49 3871.1431 1.111
5 1 1 8471.68 1.41 11945.0688 11956.17 15816.2119 1.3228
6 1 2 -369.00 1.3228 -488.1132 11468.0568 15328.0987 1.3366
7 2 1 1030.89 5.07 5223.56 1030.89 5223.56 5.067
8 2 1 314.17 5.75 1806.4775 1345.06 7030.0375 5.2266
9 2 1 239.56 6.3 1508.24 1509.228 8539.2655 5.658
10 2 2 -554.46 5.658 -3137.1346 954.768 5402.1309 5.658
11 2 1 826.24 5.884 4861.5961 1781.008 10263.727 5.7629
The first record of a group would be considered the opening balance. Inventory going into the yard have the ID of 1 and out of the yard are 2's. Load footage going into the yard always has a load cost per foot and I can calculate the the running total of footage. The first record of a group is easy to calculate the run cost and run cost per foot. The next record becomes a little more difficult to calculate. I need to move the average of run cost per foot forward to the load cost per foot when something is going out of the yard and then calculate the run cost and average run cost per foot again. Hopefully this makes sense to somebody and we can automate some of these calculations. Thanks for any help.
Here's an Oracle example I found;
SQL> select order_id
2 , volume
3 , price
4 , total_vol
5 , total_costs
6 , unit_costs
7 from ( select order_id
8 , volume
9 , price
10 , volume total_vol
11 , 0.0 total_costs
12 , 0.0 unit_costs
13 , row_number() over (order by order_id) rn
14 from costs
15 order by order_id
16 )
17 model
18 dimension by (order_id)
19 measures (volume, price, total_vol, total_costs, unit_costs)
20 rules iterate (4)
21 ( total_vol[any] = volume[cv()] + nvl(total_vol[cv()-1],0.0)
22 , total_costs[any]
23 = case SIGN(volume[cv()])
24 when -1 then total_vol[cv()] * nvl(unit_costs[cv()-1],0.0)
25 else volume[cv()] * price[cv()] + nvl(total_costs[cv()-1],0.0)
26 end
27 , unit_costs[any] = total_costs[cv()] / total_vol[cv()]
28 )
29 order by order_id
30 /
ORDER_ID VOLUME PRICE TOTAL_VOL TOTAL_COSTS UNIT_COSTS
---------- ---------- ---------- ---------- ----------- ----------
1 1000 100 1000 100000 100
2 -500 110 500 50000 100
3 1500 80 2000 170000 85
4 -100 150 1900 161500 85
5 -600 110 1300 110500 85
6 700 105 2000 184000 92
6 rows selected.
Let me say first off three things:
This is certainly not the best way to do it. There is a rule saying that if you need a while-loop, then you are most probably doing something wrong.
I suspect there is some calculation errors in your original "Expected output", please check the calculations since my calculated values are different according to your formulas.
This question could also be seen as a gimme teh codez type of question, but since you asked a decently formed question with some follow-up research, my answer is below. (So no upvoting since this is help for a specific case)
Now onto the solution:
I attempted to use my initial hint of the LAG statement in a nicely formed single update statement, but since you can only use a windowed function (aka LAG) inside a select or order by clause, that will not work.
What the code below does in short:
It calculates the various calculated fields for each record when they can be calculated and with the appropriate functions, updates the table and then moves onto the next record.
Please see comments in the code for additional information.
TempTable is a demo table (visible in the linked SQLFiddle).
Please read this answer for information about decimal(19, 4)
-- Our state and running variables
DECLARE #curId INT = 0,
#curGrpId INT,
#prevId INT = 0,
#prevGrpId INT = 0,
#LoadCostFt DECIMAL(19, 4),
#RunFt DECIMAL(19, 4),
#RunCost DECIMAL(19, 4)
WHILE EXISTS (SELECT 1
FROM TempTable
WHERE DoneFlag = 0) -- DoneFlag is a bit column I added to the table for calculation purposes, could also be called "IsCalced"
BEGIN
SELECT top 1 -- top 1 here to get the next row based on the ID column
#prevId = #curId,
#curId = tmp.ID,
#curGrpId = Grpid
FROM TempTable tmp
WHERE tmp.DoneFlag = 0
ORDER BY tmp.GrpID, tmp.ID -- order by to ensure that we get everything from one GrpID first
-- Calculate the LoadCostFt.
-- It is either predetermined (if InOut = 1) or derived from the previous record's AvgRunCostFt (if InOut = 2)
SELECT #LoadCostFt = CASE
WHEN tmp.INOUT = 2
THEN (lag(tmp.AvgRunCostFt, 1, 0.0) OVER (partition BY GrpId ORDER BY ID))
ELSE tmp.LoadCostFt
END
FROM TempTable tmp
WHERE tmp.ID IN (#curId, #prevId)
AND tmp.GrpID = #curGrpId
-- Calculate the LoadCost
UPDATE TempTable
SET LoadCost = LoadFt * #LoadCostFt
WHERE Id = #curId
-- Calculate the current RunFt and RunCost based on the current LoadFt and LoadCost plus the previous row's RunFt and RunCost
SELECT #RunFt = (LoadFt + (lag(RunFt, 1, 0) OVER (partition BY GrpId ORDER BY ID))),
#RunCost = (LoadCost + (lag(RunCost, 1, 0) OVER (partition BY GrpId ORDER BY ID)))
FROM TempTable tmp
WHERE tmp.ID IN (#curId, #prevId)
AND tmp.GrpID = #curGrpId
-- Set all our values, including the AvgRunCostFt calc
UPDATE TempTable
SET RunFt = #RunFt,
RunCost = #RunCost,
LoadCostFt = #LoadCostFt,
AvgRunCostFt = #RunCost / #RunFt,
doneflag = 1
WHERE ID = #curId
END
SELECT ID, GrpID, InOut, LoadFt, RunFt, LoadCost,
RunCost, LoadCostFt, AvgRunCostFt
FROM TempTable
ORDER BY GrpID, Id
The output with your sample data and a SQLFiddle demonstrating how it all works:
ID GrpID InOut LoadFt RunFt LoadCost RunCost LoadCostFt AvgRunCostFt
1 1 1 4549 4549 4503.51 4503.51 0.99 0.99
2 1 1 1523.22 6072.22 1964.9538 6468.4638 1.29 1.0653
3 1 2 -2491.73 3580.49 -2654.44 3814.0238 1.0653 1.0652
4 1 2 -96 3484.49 -102.2592 3711.7646 1.0652 1.0652
5 1 1 8471.68 11956.17 11945.0688 15656.8334 1.41 1.3095
6 1 2 -369 11587.17 -483.2055 15173.6279 1.3095 1.3095
7 2 1 1030.89 1030.89 5226.6123 5226.6123 5.07 5.07
8 2 1 314.17 1345.06 1806.4775 7033.0898 5.75 5.2288
9 2 1 239.56 1584.62 1509.228 8542.3178 6.3 5.3908
10 2 2 -554.46 1030.16 -2988.983 5553.3348 5.3908 5.3907
11 2 1 826.24 1856.4 4861.5962 10414.931 5.884 5.6103
If you are unclear about parts of the code, I can update with additional explanations.
CREATE TABLE TEMP(RESOURCE_VALUE VARCHAR2(63 BYTE),TOT_COUNT NUMBER)
I want an query which can extract the range from which to which I want to have breakup of the sum records to XYZ value. I will say 50,000 is the break up need. Then it has to display all the ranges from which RESOURCE_VALUE to which RESOURCE_VALUE I can get sum <=50,000. One RESOURCE_VALUE value can be included in only one range.
Example: sample data
The Below Is The input
resource_value | tot_count
---------------+----------
1 100
2 50
3 20
4 30
5 300
6 250
7 200
8 30
9 60
10 200
11 110
12 120
Then the output has to be something like this :
sample output 1: when sum(tot_count)<=300
start resource_value endresource_value sum
---------------------+---------------------+-------
1 4 300
5 5 300
6 6 250
7 9 290
10 10 200
11 12 230
sample output 2: when sum(tot_count)<=500
start resource_value end resource_value sum
---------------------+---------------------+------
1 4 300
5 5 300
6 8 480
9 12 490
I just guess that you use ORACLE, because of your table structure, and in oracle you can use this query to get your aim:
with vw1(val,flg,sumval) as
(select 1 val,0 flg,TOT_COUNT sumval
from TEMP where RESOURCE_VALUE = '1'
union all
select vw1.val + 1 val,
case when vw1.sumval + t1.TOT_COUNT > 300 then vw1.flg + 1 else vw1.flg end flg,
case when vw1.sumval + t1.TOT_COUNT > 300 then t1.TOT_COUNT else vw1.sumval + t1.TOT_COUNT end sumval
From TEMP t1,vw1 WHERE t1.RESOURCE_VALUE = TO_CHAR(vw1.val + 1))
select min(val) START_RESOURCE_VALUE,max(val) END_RESOURCE_VALUE,
max(sumval) "SUM" from vw1 group by flg order by min(val);
SQL Fiddle
Need your help with a SQL query in Oracle db. I have data that I want to partition into groups when event = "Start". E.g. Row 1-6 is a group, row 7-9 is a group. I want to ignore rows with event = "Ignore". Finally I want to calculate max(Value)-min(Value) for these groups. I dont have any way to group the data.
Can this be achieved? Is it possible to use partition by Event = start. Same data is below:
Row Event Value Required Result is max-min of value
1 Start 10
2 A 11
3 B 12
4 C 13
5 D 14
6 E 15 5
--------------------------------------------
7 Start 16
8 A 18
9 B 20 4
--------------------------------------------
10 Start 27
11 A 30
12 B 33
13 C 34 7
--------------------------------------------
14 Ignore 35
--------------------------------------------
15 Ignore 36
--------------------------------------------
16 Start 33
17 A 34
18 B 35
19 C 36
20 D 37
21 E 38 5
--------------------------------------------
Yes, you can do this in SQL.
The following query first finds the group that a row is in, by finding the largest start before the row id. This version uses a correlated subquery for this calculation.
It then does the grouping on the id and does the calculation.
select groupid, max(value) - min(value)
from (select t.*,
(select max(row) from t t2 where t2.row < t.row and t2.event = start
) as groupid
from t
) t
where event <> 'IGNORE'