Referencing the value of the previous calculated value in Oracle - sql

How can one reference a calculated value from the previous row in a SQL query? In my case each row is an event that somehow manipulates the same value from the previous row.
The raw data looks like this:
Eventno Eventtype Totalcharge
3 ACQ 32
2 OUT NULL
1 OUT NULL
Let's say each Eventtype=OUT should halve the previous row's totalcharge in a column called Remaincharge:
Eventno Eventtype Totalcharge Remaincharge
3 ACQ 32 32
2 OUT NULL 16
1 OUT NULL 8
I've already tried the LAG analytic function, but it does not let me read a calculated value from the previous row. I tried something like this:
LAG(remaincharge, 1, totalcharge) OVER (PARTITION BY ...) as remaincharge
But this didn't work because remaincharge could not be found.
Any ideas how to achieve this? I would need an analytic function that works like a cumulative sum, but applies a given function with access to the previous value instead.
Thank you in advance!
Updated problem description
I'm afraid my example problem was too general, so here is a better problem description:
What remains of totalcharge is decided by the ratio of outqty/(previous remainqty).
Eventno Eventtype Totalcharge Remainqty Outqty
4 ACQ 32 100 0
3 OTHER NULL 100 0
2 OUT NULL 60 40
1 OUT NULL 0 60
Eventno Eventtype Totalcharge Remainqty Outqty Remaincharge
4 ACQ 32 100 0 32
3 OTHER NULL 100 0 32 - (0/100 * 32) = 32
2 OUT NULL 60 40 32 - (40/100 * 32) = 12.8
1 OUT NULL 0 60 12.8 - (60/60 * 12.8) = 0

In your case you could work out the first value using the FIRST_VALUE() analytic function and the power of 2 that you have to divide by with RANK() in a sub-query and then use that. It's very specific to your example but should give you the general idea:
select eventno, eventtype, totalcharge
, case when eventtype <> 'OUT' then firstcharge
else firstcharge / power(2, "rank" - 1)
end as remaincharge
from ( select a.*
, first_value(totalcharge) over
( partition by 1 order by eventno desc ) as firstcharge
, rank() over ( partition by 1 order by eventno desc ) as "rank"
from the_table a
)
Here's a SQL Fiddle to demonstrate. I haven't partitioned by anything because you've got nothing in your raw data to partition by...
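To see why the closed form (first value divided by a power of 2) matches a row-by-row halving for this specific example, here is a minimal Python sketch. The function names and the in-memory row list are made up for illustration; it only checks the arithmetic and is not a replacement for the query.

```python
# Rows from the question, highest Eventno first: (eventno, eventtype, totalcharge).
rows = [(3, "ACQ", 32), (2, "OUT", None), (1, "OUT", None)]

def remaincharge_iterative(rows):
    """Walk the rows in order and halve the carried value on each OUT event."""
    result = []
    current = None
    for eventno, eventtype, totalcharge in rows:
        if totalcharge is not None:
            current = totalcharge      # ACQ row sets the starting charge
        elif eventtype == "OUT":
            current = current / 2      # OUT row halves the previous value
        result.append((eventno, current))
    return result

def remaincharge_closed_form(rows):
    """First charge divided by 2^(rank - 1), with rank starting at 1."""
    first = rows[0][2]
    return [(eventno, first / 2 ** (rank - 1))
            for rank, (eventno, _, _) in enumerate(rows, start=1)]

print(remaincharge_iterative(rows))
print(remaincharge_closed_form(rows))
```

Both functions agree here only because every row after the first is an OUT event; with other event types in between, the rank no longer counts the halvings.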

A variation on Ben's answer to use a windowing clause, which seems to take care of your updated requirements:
select eventno, eventtype, totalcharge, remainingqty, outqty,
initial_charge - case when running_outqty = 0 then 0
else (running_outqty / 100) * initial_charge end as remainingcharge
from (
select eventno, eventtype, totalcharge, remainingqty, outqty,
first_value(totalcharge) over (partition by null
order by eventno desc) as initial_charge,
sum(outqty) over (partition by null
order by eventno desc
rows between unbounded preceding and current row)
as running_outqty
from t42
);
Except it gives 19.2 instead of 12.8 for the third row, but that's what your formula suggests it should be:
   EVENTNO EVENT TOTALCHARGE REMAININGQTY     OUTQTY REMAININGCHARGE
---------- ----- ----------- ------------ ---------- ---------------
         4 ACQ            32          100          0              32
         3 OTHER                      100          0              32
         2 OUT                         60         40            19.2
         1 OUT                          0         60               0
If I add another split so it goes from 60 to zero in two steps, with another non-OUT record in the mix too:
   EVENTNO EVENT TOTALCHARGE REMAININGQTY     OUTQTY REMAININGCHARGE
---------- ----- ----------- ------------ ---------- ---------------
         6 ACQ            32          100          0              32
         5 OTHER                      100          0              32
         4 OUT                         60         40            19.2
         3 OUT                         30         30             9.6
         2 OTHER                       30          0             9.6
         1 OUT                          0         30               0
There's an assumption that the remaining quantity is consistent and you can effectively track a running total of what has gone before, but from the data you've shown that looks plausible. The inner query calculates that running total for each row, and the outer query does the calculation; that could be condensed but is hopefully clearer like this...
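As a sanity check on that assumption, the following Python sketch compares the row-by-row formula from the question (previous remaining charge minus outqty / previous remaining quantity times previous remaining charge) with the running-total form used in the query. The helper names are hypothetical, and the initial quantity is read from the first row rather than hard-coded as 100; exact fractions are used to avoid floating-point noise.

```python
from fractions import Fraction

# (eventno, totalcharge, remainingqty, outqty), highest eventno first.
rows = [
    (6, 32, 100, 0),
    (5, None, 100, 0),
    (4, None, 60, 40),
    (3, None, 30, 30),
    (2, None, 30, 0),
    (1, None, 0, 30),
]

def recursive_charges(rows):
    """Apply the question's formula one row at a time."""
    charges = []
    charge = Fraction(rows[0][1])
    prev_qty = rows[0][2]
    for i, (_, _, remainingqty, outqty) in enumerate(rows):
        if i > 0:
            charge -= Fraction(outqty, prev_qty) * charge
        charges.append(charge)
        prev_qty = remainingqty
    return charges

def running_total_charges(rows):
    """Same result from a running sum of outqty, as in the windowed query."""
    initial_charge = Fraction(rows[0][1])
    initial_qty = rows[0][2]
    charges = []
    running_outqty = 0
    for _, _, _, outqty in rows:
        running_outqty += outqty
        charges.append(initial_charge - Fraction(running_outqty, initial_qty) * initial_charge)
    return charges

print([float(c) for c in recursive_charges(rows)])
print([float(c) for c in running_total_charges(rows)])
```

The two agree as long as each row's remaining quantity equals the initial quantity minus everything taken out so far, which is the consistency assumption mentioned above.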

Ben's answer is the better one (will probably perform better) but you can also do it like this:
select t.*, (connect_by_root Totalcharge) / power (2,level-1) Remaincharge
from the_table t
start with EVENTTYPE = 'ACQ'
connect by prior eventno = eventno + 1;
I think it's easier to read
Here is a demo

Related

How to sum different columns and use the sum result set into another column sum using SQL Server dynamically

I am trying to achieve the result shown below:
Row_Num ID Total Time Timeout
----------------------------------------
1 33 120 1
2 34 120 121
3 35 121 241
4 36 145 362
Using SQL queries, I would like to compute the Timeout column from the previous rows' Total Time. For row number 1, the timeout should always be 1.
E.g. 2nd row: 1+120=121; 3rd row: 121+120=242; and so on.
Please help me with this. Any help would be appreciated.
You can subtract out the first value and replace it with 1:
select t.*,
1 + sum(total_time) over (order by row_num) - first_value(total_time) over (order by row_num) as timeout
from t;
This is simply a cumulative SUM, but you replace the first value with 1:
SELECT SUM(CASE RowNum WHEN 1 THEN 1 ELSE TotalTime END) OVER (ORDER BY RowNum) AS TimeOut
For example:
SELECT RowNum,
TotalTime,
SUM(CASE RowNum WHEN 1 THEN 1 ELSE TotalTime END) OVER (ORDER BY RowNum) AS TimeOut
FROM (VALUES(1,120),
(2,120),
(3,121),
(4,145))V(RowNum,TotalTime);
Returns:
RowNum TotalTime TimeOut
----------- ----------- -----------
1 120 1
2 120 121
3 121 242
4 145 387
It appears the OP has changed their requirements, and the expected results are different now (/sigh). The query would then be:
SELECT RowNum,
TotalTime,
1 + ISNULL(SUM(TotalTime) OVER (ORDER BY RowNum ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING),0) AS TimeOut
FROM (VALUES(1,120),
(2,120),
(3,121),
(4,145))V(RowNum,TotalTime);
This, however, assumes that the OP's latest expected results are wrong, as 241+121=362 not 352.
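The difference between the two variants is only whether the current row is part of the running sum. A small Python sketch of the same arithmetic, using the sample values from the question (the variable names are made up for illustration):

```python
from itertools import accumulate

total_times = [120, 120, 121, 145]

# Variant 1: running sum where the first row contributes 1 instead of its own value.
inclusive = list(accumulate([1] + total_times[1:]))

# Variant 2: 1 plus the sum of all preceding rows (the current row is excluded),
# which mirrors ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING.
exclusive = [1 + s for s in accumulate([0] + total_times[:-1])]

print(inclusive)
print(exclusive)
```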

Working out the percentage of outcomes in a column within a table

I am using SQL developer and have a table called table1 which looks like this (but with loads more data):
item_id seller_id warranty postage_class
------- --------- -------- -------------
14 2 1 2
17 6 1 1
14 2 1 1
14 2 1 2
14 2 1 1
14 2 1 2
I want to identify the percentage of items sent by first class.
If anyone could help me out that would be amazing!
You can use conditional aggregation. The simplest method is probably:
select avg(case when postage_class = 1 then 1.0 else 0 end)
from t;
Note this calculates a ratio between 0 and 1. If you want a "percentage" between 0 and 100, then use 100.0 instead of 1.0.
Some databases make it possible to shorten this even further. For instance, in Postgres, you can do:
select avg( (postage_class = 1)::int )
from t;
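The reason averaging works: the mean of a column of 1s and 0s is the fraction of rows that are 1. A quick Python sketch with the six sample rows from the question (variable names are illustrative only):

```python
# postage_class values of the six sample rows.
postage_classes = [2, 1, 1, 2, 1, 2]

# 1.0 for first class, 0.0 otherwise, like the CASE expression.
flags = [1.0 if p == 1 else 0.0 for p in postage_classes]

ratio = sum(flags) / len(flags)   # value between 0 and 1
percentage = 100.0 * ratio        # value between 0 and 100

print(ratio, percentage)
```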

Conditional SUM in SQL Server 2014

I am using SQL Server 2014. When I was testing my code I noticed a problem.
Assume that max personal hour is 80 hours.
SELECT
lsm.EmployeeName,
pd.absenceDate,
pd.amountInDays * 8 AS [HoursReported],
pd.status,
(SUM(CASE WHEN pd.[status]='App' THEN (pd.amountInDays * 8)
ELSE 0 END) OVER (partition by lsm.[EmployeeName] order by pd.absenceDate)) AS [TotalUsedHours],
( @maxPSHours ) - (sum(
CASE WHEN pd.[status]='App' THEN (pd.amountInDays * 8)
ELSE 0 END)
over (
partition by lsm.[EmployeeName] order by pd.absenceDate)) AS [TotalRemainingHours]
FROM
[LocationStaffMembers] lsm
INNER JOIN
[PersonalDays] pd ON lsm.staffMemberId = pd.staffMemberId
This query returns these results:
EmployeeName AbsenceDate HoursReported Status TotalUsdHrs TotalRemingHrs
X 11/11/2015 4 approved 4 76
X 11/15/2015 8 approved 12 68
X 11/20/2015 2 decline 14 66
X 11/20/2015 2 approved 14 66
So the query works fine for different statuses. The first 2 rows are fine. But when an employee has more than one action on the same day (decline, approved, etc.), my query shows only the day's total used and total remaining hours on every row for that day.
Here is the expected result.
EmployeeName AbsenceDate HoursReported Status TotalUsdHrs TotalRemingHrs
X 11/11/2015 4 approved 4 76
X 11/15/2015 8 approved 12 68
X 11/20/2015 2 decline 12 68
X 11/20/2015 2 approved 14 66
You are doing a cumulative sum that returns results based on the order of AbsenceDate (sum(...) over (partition by ... order by pd.absenceDate)). But your last 2 records have the exact same date (11/20/2015), at least according to what you are showing us. This creates an ambiguity.
So it is entirely conceivable, and legal, that SQL Server is processing the 2 approved hours row before the 2 declined hours row when calculating the cumulative sum, which would explain your current results, even though the rows themselves are returned to you in a different order. (BTW, consider adding an order by clause to the query; otherwise, the order of the rows themselves is not guaranteed.)
If the 2 rows do in fact share the exact same date, you'll have to find a 2nd column to remove the ambiguity and add that to the order by clause in the cumulative sum window function. Maybe you could add a timestamp field that you can order by.
Or maybe you always want the declined status to be considered ahead of the approved status when the AbsenceDate is the same. Here is an example of a query that would do exactly that (notice the changes in the order by clauses):
SELECT
lsm.EmployeeName,
pd.absenceDate,
pd.amountInDays * 8 AS [HoursReported],
pd.status,
(SUM(CASE WHEN pd.[status]='App' THEN (pd.amountInDays * 8)
ELSE 0 END) OVER (partition by lsm.[EmployeeName] order by pd.absenceDate,
case when pd.[status] = 'App' then 1 else 0 end)) AS [TotalUsedHours],
( @maxPSHours ) - (sum(
CASE WHEN pd.[status]='App' THEN (pd.amountInDays * 8)
ELSE 0 END)
over (
partition by lsm.[EmployeeName] order by pd.absenceDate,
case when pd.[status] = 'App' then 1 else 0 end)) AS [TotalRemainingHours]
FROM
[LocationStaffMembers] lsm
INNER JOIN
[PersonalDays] pd ON lsm.staffMemberId = pd.staffMemberId
ORDER BY lsm.[EmployeeName],
pd.absenceDate,
case when pd.[status] = 'App' then 1 else 0 end
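One detail that may help explain the identical totals: when a window has an ORDER BY but no explicit frame, the default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, and RANGE treats rows with equal sort keys as peers, so both 11/20 rows get the same sum. The sketch below reproduces this with Python's sqlite3 module rather than SQL Server (it assumes an SQLite build of 3.25 or later for window functions); the table and column names are simplified stand-ins, not the schema from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE personal_days (absence_date TEXT, hours INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO personal_days VALUES (?, ?, ?)",
    [
        ("2015-11-11", 4, "approved"),
        ("2015-11-15", 8, "approved"),
        ("2015-11-20", 2, "decline"),
        ("2015-11-20", 2, "approved"),
    ],
)

query = """
SELECT absence_date, status,
       -- ordered by date only: the two 2015-11-20 rows are peers
       SUM(CASE WHEN status = 'approved' THEN hours ELSE 0 END)
           OVER (ORDER BY absence_date) AS used_default,
       -- extra sort key breaks the tie, declined rows first
       SUM(CASE WHEN status = 'approved' THEN hours ELSE 0 END)
           OVER (ORDER BY absence_date,
                          CASE WHEN status = 'approved' THEN 1 ELSE 0 END) AS used_tiebreak
FROM personal_days
ORDER BY absence_date, CASE WHEN status = 'approved' THEN 1 ELSE 0 END
"""
results = conn.execute(query).fetchall()
for row in results:
    print(row)
```

With the tie-breaker in the window's ORDER BY, the declined row keeps the earlier total and only the approved row adds its hours.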

How to find the SQL medians for a grouping

I am working with SQL Server 2008
If I have a Table as such:
Code Value
-----------------------
4 240
4 299
4 210
2 NULL
2 3
6 30
6 80
6 10
4 240
2 30
How can I find the median AND group by the Code column please?
To get a resultset like this:
Code Median
-----------------------
4 240
2 16.5
6 30
I really like this solution for median, but unfortunately it doesn't include Group By:
https://stackoverflow.com/a/2026609/106227
The solution using rank works nicely when you have an odd number of members in each group, i.e. the median exists within the sample. Where you have an even number of members, the rank method will fall down, e.g.
1
2
3
4
The median here is 2.5 (i.e. half the group is smaller, and half the group is larger) but the rank method will return 3. To get around this you essentially need to take the top value from the bottom half of the group, and the bottom value of the top half of the group, and take an average of the two values.
WITH CTE AS
( SELECT Code,
Value,
[half1] = NTILE(2) OVER(PARTITION BY Code ORDER BY Value),
[half2] = NTILE(2) OVER(PARTITION BY Code ORDER BY Value DESC)
FROM T
WHERE Value IS NOT NULL
)
SELECT Code,
(MAX(CASE WHEN Half1 = 1 THEN Value END) +
MIN(CASE WHEN Half2 = 1 THEN Value END)) / 2.0
FROM CTE
GROUP BY Code;
Example on SQL Fiddle
In SQL Server 2012 you can use PERCENTILE_CONT
SELECT DISTINCT
Code,
Median = PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY Value) OVER(PARTITION BY Code)
FROM T;
Example on SQL Fiddle
SQL Server has no dedicated MEDIAN function, but you could use the ROW_NUMBER function like this:
WITH RankedTable AS (
SELECT Code, Value,
ROW_NUMBER() OVER (PARTITION BY Code ORDER BY VALUE) AS Rnk,
COUNT(*) OVER (PARTITION BY Code) AS Cnt
FROM MyTable
)
SELECT Code, Value
FROM RankedTable
WHERE Rnk = Cnt / 2 + 1
To elaborate a bit on this solution, consider the output of the RankedTable CTE:
Code Value Rnk Cnt
---------------------------
4 240 2 3 -- Median
4 299 3 3
4 210 1 3
2 NULL 1 2
2 3 2 2 -- Median
6 30 2 3 -- Median
6 80 3 3
6 10 1 3
Now from this result set, if you only return those rows where Rnk equals Cnt / 2 + 1 (integer division), you get only the rows with the median value for each group.
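To compare the two ideas on the question's data, here is a small Python sketch (the names are illustrative). It computes the true median per group with NULLs excluded, and the value at position Cnt / 2 + 1, which is the upper-middle row when a group has an even number of members.

```python
from collections import defaultdict
from statistics import median

# (Code, Value) rows from the question; None stands in for NULL.
rows = [(4, 240), (4, 299), (4, 210), (2, None), (2, 3),
        (6, 30), (6, 80), (6, 10), (4, 240), (2, 30)]

groups = defaultdict(list)
for code, value in rows:
    if value is not None:          # mirrors WHERE Value IS NOT NULL
        groups[code].append(value)

# Average of the two middle values for even-sized groups.
true_median = {code: median(values) for code, values in groups.items()}

# Zero-based index Cnt // 2 is the row with Rnk = Cnt / 2 + 1.
upper_middle = {code: sorted(values)[len(values) // 2]
                for code, values in groups.items()}

print(true_median)
print(upper_middle)
```

For Code 2 the two approaches differ (16.5 versus 30), so the ROW_NUMBER filter matches the expected output only for groups with an odd number of non-NULL values or with equal middle values.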

Filter rows based on condition sql server 2008

The below is the sample data.
Op_ID manual TT
------------------
1 0 32
1 1 38.4
2 0 4.56
2 1 7.5
55 1 50
55 1 30
Case 1: I need to check the Op_ID and manual columns. If a row for an Op_ID has manual = 0, I need to take that row's TT value (32 for Op_ID = 1) and ignore the other record. The same applies to the other records, e.g. for Op_ID = 2 with manual = 0 I need TT = 4.56.
Case 2: if both records have manual = 1, I need the max of TT, i.e. TT = 50 for Op_ID = 55.
So I need output like the below.
Op_ID manual TT
------------------
1 0 32
2 0 4.56
55 1 50
select Op_ID, manual, TT
from (
select *, row_number() over (partition by Op_ID order by manual, TT desc) rn
from yourtable ) v
where rn = 1
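The ordering does the work here: within each Op_ID, manual = 0 sorts before manual = 1, and among equal manual values the largest TT comes first, so row number 1 is the wanted row. A rough Python equivalent on the sample data (names are illustrative only):

```python
# (Op_ID, manual, TT) rows from the question.
rows = [(1, 0, 32), (1, 1, 38.4), (2, 0, 4.56), (2, 1, 7.5), (55, 1, 50), (55, 1, 30)]

best = {}
# Sort by Op_ID, then manual ascending, then TT descending, like the window ORDER BY.
for op_id, manual, tt in sorted(rows, key=lambda r: (r[0], r[1], -r[2])):
    best.setdefault(op_id, (op_id, manual, tt))   # keep only the first row per Op_ID

result = list(best.values())
print(result)
```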