Oracle SQL: Calculating weighted probability

I'm struggling to retrieve a "weighted probability" from a database table in a single SQL statement.
What I need to do:
I have tabular data of probable financial values, like this:
Table my_table
ID   P [%]   Value [$]
1    50      200
2    50      200
3    60      100
I need to calculate the weighted probability that the reasonable worst-case financial value occurs.
The formula is:
P_weighted = 1 - (1 - P_1 * Value_1 / Max(Value_1..n)) * (1 - P_2 * Value_2 / Max(Value_1..n)) * ...
i.e.
P_weighted = 1 - Product(1 - P_i * Value_i / Max(Value_1..n))
For the table above:
P_weighted = 1 - (1 - 50% * 200 / 200) * (1 - 50% * 200 / 200) * (1 - 60% * 100 / 200) = 82.5%
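As a sanity check on the formula, independent of any SQL, the arithmetic can be sketched in a few lines of Python. The function name `weighted_probability` is only illustrative, and probabilities are assumed to be given as fractions rather than percentages.

```python
from math import prod

# Sample rows from the question: (probability as a fraction, value in $).
rows = [(0.50, 200), (0.50, 200), (0.60, 100)]

def weighted_probability(rows):
    """Return 1 - Product(1 - P_i * Value_i / Max(Value))."""
    max_value = max(value for _, value in rows)
    return 1 - prod(1 - p * value / max_value for p, value in rows)

print(round(weighted_probability(rows), 3))  # 0.825
```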
I know there is no product function in (Oracle) SQL, and that it can be substituted by EXP(SUM(LN(x))), provided x is always positive.
Hence, if I only had to calculate the combined probability (regardless of the value), I could do something like:
SELECT EXP(SUM(LN(1 - t.P))) FROM my_table t WHERE condition
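The EXP(SUM(LN(x))) substitution relies on the identity exp(ln a + ln b) = a * b. A minimal Python sketch of the same trick follows; `product_via_logs` is a hypothetical helper name, and the factors are the (1 - P_i * Value_i / Max) terms of the sample data.

```python
import math

def product_via_logs(xs):
    """Multiply positive numbers the way EXP(SUM(LN(x))) does in SQL."""
    if any(x <= 0 for x in xs):
        # LN is undefined for x <= 0, so the trick needs strictly positive inputs.
        raise ValueError("all factors must be > 0")
    return math.exp(sum(math.log(x) for x in xs))

factors = [0.5, 0.5, 0.7]  # 1 - P_i * Value_i / Max(Value) for the three sample rows
print(round(product_via_logs(factors), 3))  # 0.175
```

One consequence worth keeping in mind: a row with P = 100% and the maximum value produces a factor of exactly 0, which LN cannot handle, so such rows would need separate treatment.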
When I need to include the Max(t.Value) I've got the following problem:
A SELECT list cannot include both a group function, such as AVG, COUNT, MAX, MIN, SUM, STDDEV, or VARIANCE, and an individual column expression, unless the individual column expression is included in a GROUP BY clause.
So I tried the following:
SELECT ROUND(1-EXP(SUM(LN(1 - t.P*t.Value/max(t.Value)))),1) FROM my_table t WHERE condition GROUP BY t.P, t.Value
But this obviously groups the output by probability rather than multiplying the terms, and just returns 0.5 (50%) instead of the expected result, which should be 0.825 (82.5%).
How do I get the weighted probability from my table above using (Oracle) SQL?

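Does this do it? Note that it returns the product term (0.175); subtract it from 1 to get the weighted probability of 0.825.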
with da as (select .50 as p, 200 as v from dual union all select .50 , 200 from dual union all select .60,100 from dual),
mx as (select max(v) mx from da)
select exp(sum(ln(1-da.p*da.v/mx))) from da, mx;
EXP(SUM(LN(1-DA.P*DA.V/MX)))
----------------------------
.175

with
test1 as(
select max(value) v_max from my_table
),
test2 as(
select 1-(my.p/100* value/t1.v_max) rez
from my_table my, test1 t1
)
select to_char(round((1-(EXP (SUM (LN (rez)))))*100,2))||'%' "Weighted probability"
from test2
RESULT:
Weighted probability
--------------------
82,5%

If you want the calculation per-row then you can use an analytic SUM:
SELECT id,
ROUND(1 - EXP(SUM(LN(1 - wp)) OVER (ORDER BY id)), 3) AS cwp
FROM (
SELECT id,
p * value / MAX(value) OVER () AS wp
FROM table_name
)
Which, for the sample data:
CREATE TABLE table_name (ID, P, Value) AS
SELECT 1, .50, 200 FROM DUAL UNION ALL
SELECT 2, .50, 200 FROM DUAL UNION ALL
SELECT 3, .60, 100 FROM DUAL;
Outputs the cumulative weighted probabilities:
ID   CWP
1    .5
2    .75
3    .825
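The cumulative figures can be cross-checked with a running product. This is only a sketch of the arithmetic behind the analytic SUM, not a replacement for the query; the tuple layout (id, p, value) is an assumption for illustration.

```python
from itertools import accumulate
from operator import mul

rows = [(1, 0.50, 200), (2, 0.50, 200), (3, 0.60, 100)]  # (id, p, value)

max_value = max(v for _, _, v in rows)
# One factor (1 - wp) per row, in id order.
factors = [1 - p * v / max_value for _, p, v in rows]
# Running product of the factors, then the complement for each prefix.
cwp = [round(1 - running, 3) for running in accumulate(factors, mul)]
print(cwp)  # [0.5, 0.75, 0.825]
```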
If you just want the total weighted probability then:
SELECT ROUND(1 - EXP(SUM(LN(1 - wp))), 3) AS twp
FROM (
SELECT id,
p * value / MAX(value) OVER () AS wp
FROM table_name
)
Which, for the sample data, outputs:
TWP
.825

Related

How to find missing numbers by 100s in Oracle

I need to find the missing numbers in a table column in Oracle, where the missing numbers must be taken by 100s: if at least one number is found between 2000 and 2099, all missing numbers between 2000 and 2099 must be returned, and so on.
Here is an example that clarifies what I need:
create table test1 ( a number(9,0));
insert into test1 values (2001);
insert into test1 values (2002);
insert into test1 values (2004);
insert into test1 values (2105);
insert into test1 values (3006);
insert into test1 values (9410);
commit;
The result must be 2000, 2003, 2005 to 2099, 2100 to 2104, 2106 to 2199, 3000 to 3005, 3007 to 3099, 9400 to 9409, 9411 to 9499.
I started with this query, but it's obviously not returning what I need:
SELECT Level+(2000-1) FROM dual CONNECT BY LEVEL <= 9999
MINUS SELECT a FROM test1;
You can use a hierarchical query as follows:
SELECT A FROM (
  SELECT A + COLUMN_VALUE - 1 AS A
  FROM ( SELECT DISTINCT TRUNC(A, -2) A
         FROM TEST1 ) T
  CROSS JOIN TABLE ( CAST(MULTISET(
         SELECT LEVEL FROM DUAL CONNECT BY LEVEL <= 100
       ) AS SYS.ODCINUMBERLIST) ) LEVELS
)
MINUS
SELECT A FROM TEST1;
A
----------
2000
2003
2005
2006
2007
2008
2009
.....
.....
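The block-by-block logic (take each distinct hundred that contains at least one value, generate its 100 numbers, subtract the existing ones) can be cross-checked outside the database. A small Python sketch using the question's sample data; `missing_by_hundreds` is a hypothetical helper name.

```python
present = {2001, 2002, 2004, 2105, 3006, 9410}

def missing_by_hundreds(present):
    """List every absent number in each 100-block that holds at least one value."""
    blocks = sorted({n // 100 * 100 for n in present})  # 2000, 2100, 3000, 9400
    return [n for start in blocks
              for n in range(start, start + 100)
              if n not in present]

missing = missing_by_hundreds(present)
print(missing[:5])   # [2000, 2003, 2005, 2006, 2007]
print(len(missing))  # 394
```

Four blocks of 100 numbers minus the 6 existing values gives the 394 rows the query should return.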
I like to use standard recursive queries for this.
with nums (a, max_a) as (
select min(a), max(a) from test1
union all
select a + 1, max_a from nums where a < max_a
)
select n.a
from nums n
where not exists (select 1 from test1 t where t.a = n.a)
order by n.a
The with clause takes the minimum and maximum value of a in the table, and generates all numbers in between. Then, the outer query filters on those that do not exist in the table.
If you want to generate ranges of missing numbers instead of a comprehensive list, you can use window functions instead:
select a + 1 start_a, lead_a - 1 end_a
from (
select a, lead(a) over(order by a) lead_a
from test1
) t
where lead_a <> a + 1
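To see what the LEAD-based query yields, the same pairing of each value with its successor can be sketched in Python. Note that this variant reports gaps between consecutive existing values and does not respect the 100-block boundaries from the question.

```python
values = sorted([2001, 2002, 2004, 2105, 3006, 9410])

# Pair each value with the next one (the role of LEAD) and keep real gaps only.
gaps = [(a + 1, b - 1) for a, b in zip(values, values[1:]) if b != a + 1]
print(gaps)  # [(2003, 2003), (2005, 2104), (2106, 3005), (3007, 9409)]
```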
EDIT:
If you want the missing values within ranges of hundreds, then we can slightly adapt the recursive solution:
with nums (a, max_a) as (
select distinct floor(a / 100) * 100 a, floor(a / 100) * 100 + 99 from test1
union all
select a + 1, max_a from nums where a < max_a
)
select n.a
from nums n
where not exists (select 1 from test1 t where t.a = n.a)
order by n.a
Assuming you define fixed lower and upper bounds for the range, you just need to eliminate the values that already exist in the table by use of NOT EXISTS, such as
SQL> exec :min_val:=2000
SQL> exec :max_val:=2499
SQL> SELECT *
FROM
(
SELECT level + :min_val - 1 AS nr
FROM dual
CONNECT BY level <= :max_val - :min_val + 1
)
WHERE NOT EXISTS ( SELECT * FROM test1 WHERE a = nr )
ORDER BY nr;
/

Random data sampling with oracle sql, data generation

I need to generate some sample data from a population. I want to do this with an SQL query on an Oracle 11g database.
Here is a simple working example with population size 4 and sample size 2:
with population as (
select 1 as val from dual union all
select 2 from dual union all
select 3 from dual union all
select 4 from dual)
select val from (
select val, dbms_random.value(0,10) AS RANDORDER
from population
order by randorder)
where rownum <= 2
(the Oracle sample() function didn't work in connection with the WITH clause for me)
But now I want to "upscale" or multiply my sample data, so that I can get something like 150% sample data of the population data (e.g. population size 4 and sample size 6).
Is there a good way to achieve this with an SQL query?
You could use CONNECT BY:
with population(val, RANDOMORDER) as (
select level, dbms_random.value(0,10) AS RANDORDER
from dual
connect by level <= 6
ORDER BY RANDORDER
)
select val
FROM population
WHERE rownum <= 4;
The solution depends on what you want. If you want all rows from the first full copies of the set and random additional rows from the last copy, then use:
with params(size_, sample_) as (select 4, 6 from dual)
select val
from (
select mod(level - 1, size_) + 1 val, sample_,
case when level <= size_ * floor(sample_ / size_) then 0
else dbms_random.value()
end rand
from params
connect by level <= size_ * ceil(sample_ / size_)
order by rand)
where rownum <= sample_
But if you allow a result like (1, 1, 2, 2, 3, 3), where some values (here 4) may not appear in the output at all, then use this:
with params(size_, sample_) as (select 4, 6 from dual)
select val
from (
select mod(level - 1, size_) + 1 val, sample_, dbms_random.value() rand
from params
connect by level <= size_ * ceil(sample_ / size_)
order by rand)
where rownum <= sample_
How does it work? We build the set (1, 2, 3, 4) as many times as the division sample / size requires, rounded up. Then we assign random values. In the first case I assign 0 to the first set(s), so they are sure to be in the output, and random values to the last set. In the second case random values are assigned to all rows.
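The two variants can be sketched in Python to make the mechanics concrete. This is an illustration under assumptions, not a translation the answer provides: `oversample` and `keep_full_copies` are hypothetical names, and `random.random()` stands in for dbms_random.value().

```python
import math
import random

def oversample(population, sample_size, keep_full_copies=True, rng=random):
    """Repeat the population enough times, key each row, keep the smallest keys."""
    size = len(population)
    copies = math.ceil(sample_size / size)   # ceil(sample_ / size_)
    full = sample_size // size               # floor(sample_ / size_)
    keyed = []
    for i in range(size * copies):
        level = i + 1                        # mirrors LEVEL in CONNECT BY
        val = population[i % size]           # mod(level - 1, size_) + 1
        if keep_full_copies and level <= size * full:
            key = 0.0                        # guaranteed rows sort first
        else:
            key = rng.random()               # random order for the rest
        keyed.append((key, val))
    keyed.sort(key=lambda kv: kv[0])         # stable sort, like ORDER BY rand
    return [val for _, val in keyed[:sample_size]]

sample_a = oversample([1, 2, 3, 4], 6)                          # every value at least once
sample_b = oversample([1, 2, 3, 4], 6, keep_full_copies=False)  # values may be missing
print(sorted(sample_a), sorted(sample_b))
```

In the first call every population value appears at least once and two of them appear twice; in the second call any value may be absent, but none can appear more than twice because only two copies are built.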

Obtaining the percentage in SQLite

I made a query with the following statement:
select mood, count(*) * 100 / (select count(*) from entry) from entry group by mood having data>data-30 order by mood asc
mood is an integer from 0 to 2.
The output is:
mood count
0 96,55
1 3,44
Is there a way to add a row with mood 2 and count 0?
SELECT MOOD, SUM (COUNTER) TOTAL
FROM ( SELECT 0 MOOD, 0 COUNTER FROM DUAL
       UNION ALL
       SELECT 1 MOOD, 0 COUNTER FROM DUAL
       UNION ALL
       SELECT 2 MOOD, 0 COUNTER FROM DUAL
       UNION ALL
       SELECT MOOD, COUNT ( * )
                    * 100.0
                    / (SELECT COUNT ( * )
                       FROM ENTRY
                       WHERE DATA > DATE ('now', '-30 days'))
       FROM (SELECT *
             FROM ENTRY
             WHERE DATA > DATE ('now', '-30 days'))
       GROUP BY MOOD, DATA)
GROUP BY MOOD
ORDER BY MOOD ASC;
You have to enumerate all the possible moods (0, 1, 2, ...), associating a counter = 0 with each.
Then, you sum the counters, grouping by mood.
Please note that your condition having data>data-30 makes no sense: every value is greater than itself minus 30, so nothing is filtered.
You have to select from ENTRY all the records satisfying a condition such as data > date('now', '-30 days'), for example (assuming data holds ISO-8601 date text; in SQLite, date('now') - 30 would subtract 30 from the year number rather than 30 days).
SQLite: A VIEW named "dual" that works the same as the Oracle "dual" table can be created as follows: "CREATE VIEW dual AS SELECT 'x' AS dummy;"
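The zero-row technique can be tried directly in SQLite through Python's sqlite3 module. The sketch below makes some simplifying assumptions: the entry table has only a mood column, the date filter is left out so the result does not depend on the current date, and the zero rows are selected without FROM DUAL, which SQLite allows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (mood INTEGER)")
# Hypothetical sample data: three entries with mood 0, one with mood 1, none with mood 2.
conn.executemany("INSERT INTO entry (mood) VALUES (?)", [(0,), (0,), (0,), (1,)])

query = """
SELECT mood, SUM(counter) AS total
FROM (
    SELECT 0 AS mood, 0 AS counter
    UNION ALL SELECT 1, 0
    UNION ALL SELECT 2, 0
    UNION ALL
    SELECT mood, COUNT(*) * 100.0 / (SELECT COUNT(*) FROM entry)
    FROM entry
    GROUP BY mood
)
GROUP BY mood
ORDER BY mood
"""
rows = conn.execute(query).fetchall()
for mood, total in rows:
    print(mood, f"{total:.2f}")
# 0 75.00
# 1 25.00
# 2 0.00
```

The seeded zero rows guarantee that every mood shows up in the outer GROUP BY, and adding 0 to a real percentage leaves it unchanged.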

SQL query to divide a value over different records

Consider the following recordset:
1 1000 -1
2 500 2
3 1000 -1
4 500 3
5 500 2
6 1000 -1
7 500 1
So there are 3 records with number 1000 and value -1, total -3.
And there are 4 records with number 500 and different values.
Now I need a query which divides the sum of code 1000 over the 4 records with code 500 and removes the code 1000 records.
So the end result would look like:
1 500 1.25
2 500 2.25
3 500 1.25
4 500 0.25
The sum of code 1000 = -3
There's 4 times code 500 in the table over which -3 has to be divided.
-3/4 = -0.75
so the record "2 500 2" becomes "2 500 (2+ -0.75)" = 1.25
etc
As an SQL newbie I have no clue how to get this done, can anyone help?
You can use CTEs to do it "step-wise" and build your solution. Like this:
with sumup as
(
select sum(colb) as s
from table
where cola = 1000
), countup as
(
select count(*) as c
from table
where cola = 500
), change as
(
select s / c as v
from sumup, countup
)
select cola, colb + v
from table, change
where cola = 500
Two things to note:
This might not be the fastest solution, but it is often close.
You can test this code easily: just change the final select statement to select from one of the CTEs by name and see what it contains. For example, this would be a good test if you are getting a bad result:
with sumup as
(
select sum(colb) as s
from table
where cola = 1000
), countup as
(
select count(*) as c
from table
where cola = 500
), change as
(
select s / c as v
from sumup, countup
)
select * from change
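The step-wise CTE approach can be run end to end in SQLite via Python's sqlite3 module. In this sketch the table name `t` and the column names `id`, `cola`, `colb` are assumptions. It also multiplies by 1.0 before dividing, because with integer operands SQLite (like several other engines) performs integer division, which would turn -3 / 4 into 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, cola INTEGER, colb INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 1000, -1), (2, 500, 2), (3, 1000, -1), (4, 500, 3),
     (5, 500, 2), (6, 1000, -1), (7, 500, 1)],
)

query = """
WITH sumup AS (SELECT SUM(colb) AS s FROM t WHERE cola = 1000),
     countup AS (SELECT COUNT(*) AS c FROM t WHERE cola = 500),
     change AS (SELECT 1.0 * s / c AS v FROM sumup, countup)  -- force real division
SELECT id, cola, colb + v
FROM t, change
WHERE cola = 500
ORDER BY id
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(2, 500, 1.25), (4, 500, 2.25), (5, 500, 1.25), (7, 500, 0.25)]
```

The share -3 / 4 = -0.75 is added to each code 500 value, which reproduces the 1.25, 2.25, 1.25, 0.25 expected in the question.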
Select col1,(
(Select sum(col2 )
from tab
where col1 =1000)
/
(Select count(*)
from tab
where col1 =500))+Col2 as new_value
From tab
Where col1=500
Here tab, col1 and col2 are the table name, the column holding the codes (1000, 500), and the column holding the values (1, 2, 3, ...).
This will give the results you are after:
DECLARE @T TABLE (ID INT, Number INT, Value INT)
INSERT @T (ID, Number, Value)
VALUES
(1, 1000, -1),
(2, 500, 2),
(3, 1000, -1),
(4, 500, 3),
(5, 500, 2),
(6, 1000,-1),
(7, 500, 1);
SELECT Number, Value, NewValue = Value + (x.Total / COUNT(*) OVER())
FROM @T T
CROSS JOIN
( SELECT Total = CAST(SUM(Value) AS FLOAT)
FROM @T
WHERE Number = 1000
) x
WHERE T.Number = 500;
Inside the cross join we simply get the sum where the number is 1000. This could just as easily be done as a subselect:
SELECT Number, Value, NewValue = Value + ((SELECT CAST(SUM(Value) AS FLOAT) FROM @T WHERE Number = 1000) / COUNT(*) OVER())
FROM @T T
WHERE T.Number = 500;
Or with a variable:
DECLARE @Total FLOAT = (SELECT SUM(Value) FROM @T WHERE Number = 1000);
SELECT Number, Value, NewValue = Value + (@Total / COUNT(*) OVER())
FROM @T T
WHERE T.Number = 500;
Then using the analytic function COUNT(*) OVER() you can count the total number of results that are 500.
And here is another solution:
select number1, value1,
value1
+ (select sum(value1) from table1 where number1=1000)/
(select count(*) from table1 where number1=500) calc_value
from table1 where number1=500
http://sqlfiddle.com/#!6/c68a0/1
I hope I got your question right. If so, this is IMHO the easiest to read.

How to recursively compute ratio of remaining amounts based on rounded values from preceding rows?

I need to split one amount into two fields. I know the total sums of the resulting fields, which give the ratio for splitting the first row, but I need to round the resulting sums and only then compute the ratio for the next row (so that the total sum of the rounded values will be correct).
How can I write this algorithm in Oracle 10g PL/SQL? I need to test some migrated data. Here is what I came up with (so far):
with temp as (
select 1 id, 200 amount, 642 total_a from dual union all
select 2, 200, 642 from dual union all
select 3, 200, 642 from dual union all
select 4, 200, 642 from dual union all
select 5, 200, 642 from dual
)
select
temp2.*,
remaining_a / remaining_amount ratio,
round(amount * remaining_a / remaining_amount, 0) rounded_a,
round(amount - amount * remaining_a / remaining_amount, 0) rounded_b
from (
select
temp.id,
temp.amount,
sum(amount) over (
order by id
range between current row and unbounded following
) remaining_amount,
case when id=1 then total_a /* else ??? */ end remaining_a
from temp
) temp2
Update: If you can't see the image above, expected rounded_A values are:
1 128
2 129
3 128
4 129
5 128
Here is my suggestion. It does not give exactly what you want: by my calculation, the 129 doesn't come until the 3rd row.
The idea is to add more columns. For each row, calculate the estimated split. Then, keep track of the cumulative fraction. When the cumulative remainder passes an integer, bump up the A amount by 1. Once you have the A amount, you can calculate the rest:
WITH temp AS (
SELECT 1 id, 200 amount, 642 total_a FROM dual UNION ALL
SELECT 2, 200, 642 FROM dual UNION ALL
SELECT 3, 200, 642 FROM dual UNION ALL
SELECT 4, 200, 642 FROM dual UNION ALL
SELECT 5, 200, 642 FROM dual
)
select temp3.*,
sum(estArem) over (order by id) as cumrem,
trunc(estA) + (case when trunc(sum(estArem) over (order by id)) > trunc(- estArem + sum(estArem) over (order by id))
then 1 else 0 end)
from (SELECT temp2.*,
trunc(Aratio*amount) as estA,
Aratio*amount - trunc(ARatio*amount) as estArem
FROM (SELECT temp.id, temp.amount,
sum(amount) over (ORDER BY id range BETWEEN CURRENT ROW AND unbounded following
) remaining_amount,
sum(amount) over (partition by null) as total_amount,
max(total_a) over (partition by null)as maxA,
(max(total_a) over (partition by null) /
sum(amount) over (partition by null)
) as ARatio
FROM temp
) temp2
) temp3
This isn't exactly a partitioning problem. This is an integer approximation problem.
If you are rounding the values rather than truncating them, then you need a slight tweak to the logic:
trunc(estA) + (case when trunc(0.5 + sum(estArem) over (order by id)) > trunc(0.5 - estArem + sum(estArem) over (order by id))
then 1 else 0 end)
The original statement just looks for the cumulative remainder passing over an integer threshold. This version bumps the amount when the cumulative remainder passes a half-integer instead, which amounts to rounding rather than truncation.
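The difference between the two bump rules can be checked with exact fractions in Python. This is a sketch of the arithmetic only, using the sample data from the question; `split_a` and its `rounding` flag are hypothetical names, and exact `Fraction` values are used to avoid floating-point drift in the cumulative remainder.

```python
from fractions import Fraction
from math import floor

amounts = [200, 200, 200, 200, 200]
total_a = 642

def split_a(amounts, total_a, rounding=False):
    """Integer A amounts per row, bumping by 1 when the cumulative remainder crosses a threshold."""
    ratio = Fraction(total_a, sum(amounts))        # ARatio
    offset = Fraction(1, 2) if rounding else Fraction(0)
    cum = Fraction(0)                              # running sum of estArem
    result = []
    for amount in amounts:
        exact = ratio * amount
        est = floor(exact)                         # trunc(ARatio*amount) for non-negative values
        rem = exact - est                          # estArem
        cum += rem
        # Bump when the (offset) cumulative remainder passes an integer on this row.
        bump = 1 if floor(offset + cum) > floor(offset + cum - rem) else 0
        result.append(est + bump)
    return result

print(split_a(amounts, total_a))                 # [128, 128, 129, 128, 129]
print(split_a(amounts, total_a, rounding=True))  # [128, 129, 128, 129, 128]
```

Both variants sum to 642. The truncating rule places the first 129 on the 3rd row, while the rounding rule reproduces the sequence the question expects.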