AS400 DB2 query math expression in Select - sql

I have not done DB2 queries for a while so I am having issues with a math expression in my Select statement. It does not throw an error but I get the wrong result. Can someone tell me how DB2 evaluates the expression?
Part of my Select is below.
The values are:
t1.Points = 100
t2.Involvepoints = 1
(current date - t1.fromdt) in days is 1268 (that is, current date 7/19/2013 - 01/28/2010 in days)
It should read like (100 * 1) * (1 - (.000274 * 1268)) = 65.2568
SELECT Value1,
value2,
(CASE
WHEN (T1.POINTS * T2.INVOLVEPOINTS) * (1 - .000274 * DAYS(CURRENT DATE) - DAYS(T1.FROMDT)) >= 0 THEN (T1.POINTS * T2.INVOLVEPOINTS) * (1 - .000274 * DAYS(CURRENT DATE) - DAYS(T1.FROMDT))
ELSE 0
END) AS POINTSTOTAL
FROM TABLE1;

The parentheses are not enforcing the intended order of operations, and the join declaration is missing. In addition, you can use the MAX scalar function instead of the repetitive CASE expression.
Here is a proof using common table expressions to simulate the source data:
with
t1 (value1, points, fromdt)
as (select 1, 100, '2010-01-28' from sysibm.sysdummy1),
t2 (value2, involvepoints)
as (select 2, 1 from sysibm.sysdummy1)
select value1, value2,
max(0, t1.points * t2.involvepoints *
(1 - .000274 * (DAYS('2013-07-19') - DAYS(t1.fromdt)))) as pointstotal
from t1, t2;
The result is:
VALUE1 VALUE2 POINTSTOTAL
------ ------ -----------
1 2 65.256800

Did you mean this?
...
(T1.POINTS * T2.INVOLVEPOINTS) * (1 - .000274 * ( DAYS(CURRENT DATE) - DAYS(T1.FROMDT) ) )
...
Note the extra pair of parentheses around the subtraction of dates. Multiplication takes precedence over subtraction, so your original query multiplies DAYS(CURRENT DATE) by 0.000274, subtracts that from 1, and then subtracts DAYS(T1.FROMDT) from the result.
Curiously, you have those parentheses in your explanation, but not in the actual formula.
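To make the precedence issue concrete, here is a small Python sketch of the same arithmetic. It is only an illustration under assumptions: the `days()` helper is a stand-in for DB2's DAYS() (only the difference between the two day numbers matters here), and the fixed `today` value stands in for CURRENT DATE.

```python
from datetime import date

POINTS = 100
INVOLVE_POINTS = 1
RATE = 0.000274


def days(d: date) -> int:
    # Stand-in for DB2's DAYS(): a day number counted from 0001-01-01.
    return d.toordinal()


today = date(2013, 7, 19)    # stands in for CURRENT DATE
from_dt = date(2010, 1, 28)

# Expression as written in the question: the multiplication happens first,
# then both terms are subtracted from 1.
wrong = (POINTS * INVOLVE_POINTS) * (1 - RATE * days(today) - days(from_dt))

# Intended expression: subtract the day numbers first, then apply the rate.
elapsed = days(today) - days(from_dt)
right = (POINTS * INVOLVE_POINTS) * (1 - RATE * elapsed)

# Same clamping as MAX(0, ...) or the CASE ... ELSE 0 in the query.
points_total = max(0, right)
```

With these inputs `elapsed` is 1268 and `right` comes out at about 65.2568, while `wrong` is a large negative number that the CASE expression would turn into 0.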

Related

Oracle SQL : Calculating weighted probability

I'm struggling to retrieve a "weighted probability" from a database table in my SQL statement.
What do I need to do:
I have tabular information of probable financial values like:
Table my_table:
|ID|P [%]|Value [$]|
--------------------
|1 |50   |200      |
|2 |50   |200      |
|3 |60   |100      |
I need to calculate the weighted probability of reasonable worst case financial value to occur.
The formula is:
P_weighted = 1 - (1 - P_1 * Value_1 / Max(Value_1..n)) * (1 - P_2 * Value_2 / Max(Value_1..n)) * ...
i.e.
P_weighted = 1 - Product(1 - P_i * Value_i / Max(Value_1..n))
P_weighted = 1 - (1 - 50% * 200 / 200) * (1 - 50% * 200 / 200) * (1 - 60% * 100 / 200) = 82.5%
I know there is no product function in (Oracle) SQL, and that it can be substituted by EXP(SUM(LN(x))), provided x is always positive.
Hence, if I only had to calculate the combined probability (regardless of the value), I could do something like:
SELECT EXP(SUM(LN(1 - t.P))) FROM my_table t WHERE condition
When I need to include the Max(t.Value) I've got the following problem:
A SELECT list cannot include both a group function, such as AVG, COUNT, MAX, MIN, SUM, STDDEV, or VARIANCE, and an individual column expression, unless the individual column expression is included in a GROUP BY clause.
So I tried the following:
SELECT ROUND(1-EXP(SUM(LN(1 - t.P*t.Value/max(t.Value)))),1) FROM my_table t WHERE condition GROUP BY t.P, t.Value
But this obviously groups the output by probability rather than multiplying it, and just returns 0.5 or 50% instead of the product, which should be 0.825 or 82.5%.
How do I get the weighted probability from my table above using (Oracle) SQL?
Does this do it:
with da as (select .50 as p, 200 as v from dual union all select .50 , 200 from dual union all select .60,100 from dual),
mx as (select max(v) mx from da)
select exp(sum(ln(1-da.p*da.v/mx))) from da, mx;
EXP(SUM(LN(1-DA.P*DA.V/MX)))
----------------------------
.175
with
test1 as(
select max(value) v_max from my_table
),
test2 as(
select 1-(my.p/100* value/t1.v_max) rez
from my_table my, test1 t1
)
select to_char(round((1-(EXP (SUM (LN (rez)))))*100,2))||'%' "Weighted probability"
from test2
RESULT:
Weighted probability
--------------------
82,5%
If you want the calculation per-row then you can use an analytic SUM:
SELECT id,
ROUND(1 - EXP(SUM(LN(1 - wp)) OVER (ORDER BY id)), 3) AS cwp
FROM (
SELECT id,
p * value / MAX(value) OVER () AS wp
FROM table_name
)
Which, for the sample data:
CREATE TABLE table_name (ID, P, Value) AS
SELECT 1, .50, 200 FROM DUAL UNION ALL
SELECT 2, .50, 200 FROM DUAL UNION ALL
SELECT 3, .60, 100 FROM DUAL;
Outputs the cumulative weighted probabilities:
|ID|CWP |
---------
|1 |.5  |
|2 |.75 |
|3 |.825|
If you just want the total weighted probability then:
SELECT ROUND(1 - EXP(SUM(LN(1 - wp))), 3) AS twp
FROM (
SELECT id,
p * value / MAX(value) OVER () AS wp
FROM table_name
)
Which, for the sample data, outputs:
TWP
.825
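As a cross-check of the numbers above, the same calculation can be sketched in Python. This is an illustration only; the `rows` list simply restates the sample data, and the variable names are made up for the example.

```python
import math

# (id, p, value) rows from the sample data, with p as a fraction.
rows = [(1, 0.50, 200), (2, 0.50, 200), (3, 0.60, 100)]

max_value = max(v for _, _, v in rows)
weighted = [p * v / max_value for _, p, v in rows]  # 0.5, 0.5, 0.3

# Direct product of the complements.
total_direct = 1 - math.prod(1 - w for w in weighted)

# The EXP(SUM(LN(x))) substitution; valid only while every 1 - w is positive.
total_via_logs = 1 - math.exp(sum(math.log(1 - w) for w in weighted))

# Running version, analogous to SUM(...) OVER (ORDER BY id).
cumulative = []
log_sum = 0.0
for w in weighted:
    log_sum += math.log(1 - w)
    cumulative.append(round(1 - math.exp(log_sum), 3))
```

Both totals come out at 0.825 (up to floating-point rounding), and the running values are 0.5, 0.75 and 0.825. If any row had a weighted probability of exactly 1, the logarithm would be undefined, which is the caveat behind "ensuring x is always positive".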

Out of range integer: infinity

I'm trying to work through a problem that's a bit hard to explain, and I can't expose any of the data I'm working with. What I'm trying to get my head around is the error below, which I get when running the query below. I've renamed some of the tables and columns for sensitivity reasons, but the structure should be the same.
"Error from Query Engine - Out of range for integer: Infinity"
WITH accounts AS (
SELECT t.user_id
FROM table_a t
WHERE t.type like '%Something%'
),
CTE AS (
SELECT
st.x_user_id,
ad.name as client_name,
sum(case when st.score_type = 'Agility' then st.score_value else 0 end) as score,
st.obs_date,
ROW_NUMBER() OVER (PARTITION BY st.x_user_id,ad.name ORDER BY st.obs_date) AS rn
FROM client_scores st
LEFT JOIN account_details ad on ad.client_id = st.x_user_id
INNER JOIN accounts on st.x_user_id = accounts.user_id
--WHERE st.x_user_id IN (101011115,101012219)
WHERE st.obs_date >= '2020-05-18'
group by 1,2,4
)
SELECT
c1.x_user_id,
c1.client_name,
c1.score,
c1.obs_date,
CAST(COALESCE (((c1.score - c2.score) * 1.0 / c2.score) * 100, 0) AS INT) AS score_diff
FROM CTE c1
LEFT JOIN CTE c2 on c1.x_user_id = c2.x_user_id and c1.client_name = c2.client_name and c1.rn = c2.rn +2
I know the query works, because when I get rid of the first CTE and hard-code 2 IDs into the WHERE clause I commented out, it returns the data I want. But I also need it to run based on the first CTE, which has ~5k unique IDs.
Here is a sample output if I try with 2 IDs:
Based on the number of rows returned per ID above, I would expect it to return 5000 * 3 rows = 15000.
What could be causing the out of range for integer error?
This line is likely your problem:
CAST(COALESCE (((c1.score - c2.score) * 1.0 / c2.score) * 100, 0) AS INT) AS score_diff
When the value of c2.score is 0, the division by c2.score yields infinity, which will not fit into the integer type that you're trying to cast it into.
The reason it works for the two users in your example is that they don't have a 0 value for c2.score.
You might be able to fix this by changing it to:
CAST(COALESCE (((c1.score - c2.score) * 1.0 / NULLIF(c2.score, 0)) * 100, 0) AS INT) AS score_diff
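The way NULLIF and COALESCE combine here can be sketched in Python, with None playing the role of NULL. This is only a rough model: the function names are made up for the illustration, and `int()` truncates, which may differ from how your engine casts to INT.

```python
from typing import Optional


def nullif(value: float, sentinel: float) -> Optional[float]:
    # Mimics NULLIF(value, sentinel): NULL (None) when the two are equal.
    return None if value == sentinel else value


def score_diff(current: float, previous: Optional[float]) -> int:
    # previous is None when the LEFT JOIN finds no earlier row.
    if previous is None:
        return 0  # COALESCE(NULL, 0)
    divisor = nullif(previous, 0)
    if divisor is None:
        return 0  # division by NULL gives NULL, which COALESCE turns into 0
    return int((current - previous) * 1.0 / divisor * 100)
```

For example, `score_diff(150, 100)` gives 50, while `score_diff(10, 0)` and `score_diff(10, None)` both give 0 instead of overflowing.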

Return five rows of random DNA instead of just one

This is the code I have to create a string of DNA:
prepare dna_length(int) as
with t1 as (
select chr(65) as s
union select chr(67)
union select chr(71)
union select chr(84) )
, t2 as ( select s, row_number() over() as rn from t1)
, t3 as ( select generate_series(1,$1) as i, round(random() * 4 + 0.5) as rn )
, t4 as ( select t2.s from t2 join t3 on (t2.rn=t3.rn))
select array_to_string(array(select s from t4),'') as dna;
execute dna_length(20);
I am trying to figure out how to re-write this to give a table of 5 rows of strings of DNA of length 20 each, instead of just one row. This is for PostgreSQL.
I tried:
CREATE TABLE dna_table(g int, dna text);
INSERT INTO dna_table (1, execute dna_length(20));
But this does not seem to work. I am an absolute beginner. How to do this properly?
PREPARE creates a prepared statement that can only be used "as is". If your prepared statement returns one string, then you can only get one string. You can't use it inside other operations, such as an INSERT.
In your case you may create a function:
create or replace function dna_length(int) returns text as
$$
with t1 as (
select chr(65) as s
union
select chr(67)
union
select chr(71)
union
select chr(84))
, t2 as (select s,
row_number() over () as rn
from t1)
, t3 as (select generate_series(1, $1) as i,
round(random() * 4 + 0.5) as rn)
, t4 as (select t2.s
from t2
join t3 on (t2.rn = t3.rn))
select array_to_string(array(select s from t4), '') as dna
$$ language sql;
And use it in a way like this:
insert into dna_table(g, dna) select generate_series(1,5), dna_length(20)
From the official doc:
PREPARE creates a prepared statement. A prepared statement is a server-side object that can be used to optimize performance. When the PREPARE statement is executed, the specified statement is parsed, analyzed, and rewritten. When an EXECUTE command is subsequently issued, the prepared statement is planned and executed. This division of labor avoids repetitive parse analysis work, while allowing the execution plan to depend on the specific parameter values supplied.
About functions.
This can be much simpler and faster:
SELECT string_agg(CASE ceil(random() * 4)
WHEN 1 THEN 'A'
WHEN 2 THEN 'C'
WHEN 3 THEN 'T'
WHEN 4 THEN 'G'
END, '') AS dna
FROM generate_series(1,100) g -- 100 = 5 rows * 20 nucleotides
GROUP BY g%5;
random() produces a random value in the range 0.0 <= x < 1.0. Multiply by 4 and take the mathematical ceiling with ceil() (cheaper than round()), and you get a random distribution of the numbers 1-4. (A 0 is possible only if random() returns exactly 0.0; the CASE then yields NULL, which string_agg() skips.) Convert to ACTG, and aggregate with GROUP BY g%5, % being the modulo operator.
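The grouping idea (generate rows * length letters, then split them by position modulo the number of rows) can be sketched in Python. This is an illustration under assumptions: `dna_rows` is a made-up helper name, and it uses a floor-based index into the letters instead of ceil(), which sidesteps the case where the random value is exactly 0.

```python
import random
from typing import Dict, List

NUCLEOTIDES = "ACTG"


def dna_rows(n_rows: int, length: int, rng: random.Random) -> List[str]:
    # One bucket per remainder, like GROUP BY g % n_rows.
    groups: Dict[int, List[str]] = {k: [] for k in range(n_rows)}
    # g runs over 1 .. n_rows * length, like generate_series(1, $1 * $2).
    for g in range(1, n_rows * length + 1):
        # rng.random() is in [0.0, 1.0), so the index is always 0..3.
        letter = NUCLEOTIDES[int(rng.random() * 4)]
        groups[g % n_rows].append(letter)
    return ["".join(groups[k]) for k in range(n_rows)]


rows = dna_rows(5, 20, random.Random(42))
```

Because every remainder occurs equally often in 1 .. n_rows * length, each of the 5 strings ends up with exactly 20 letters.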
About string_agg():
Concatenate multiple result rows of one column into one, group by another column
As prepared statement, taking
$1 ... the number of rows
$2 ... the number of nucleotides per row
PREPARE dna_length(int, int) AS
SELECT string_agg(CASE ceil(random() * 4)
WHEN 1 THEN 'A'
WHEN 2 THEN 'C'
WHEN 3 THEN 'T'
WHEN 4 THEN 'G'
END, '') AS dna
FROM generate_series(1, $1 * $2) g
GROUP BY g%$1;
Call:
EXECUTE dna_length(5,20);
Result:
| dna |
| :------------------- |
| ATCTTCGACACGTCGGTACC |
| GTGGCTGCAGATGAACAGAG |
| ACAGCTTAAAACACTAAGCA |
| TCCGGACCTCTCGACCTTGA |
| CGTGCGGAGTACCCTAATTA |
If you need it a lot, consider a function instead. See:
What is the difference between a prepared statement and a SQL or PL/pgSQL function, in terms of their purposes?

SQL Server Percent Difference is Greater than Value

I have the following table structures:
table c_alert:
|dynamic|symbol|price_usd|
--------------------------
|5 |BTC |13000 |
table c_current:
|symbol|price_usd|
------------------
|BTC |13600 |
I have this query:
SELECT dbo.c_alert.symbol, dbo.c_alert.price_usd AS alert_price, dbo.c_current.price_usd AS current_price, (dbo.c_current.price_usd - dbo.c_alert.price_usd) * 100.0 / dbo.c_alert.price_usd AS pct_diff, dbo.c_alert.dynamic AS pct
FROM dbo.c_alert INNER JOIN
dbo.c_current
ON dbo.c_alert.symbol = dbo.c_current.symbol AND
dbo.c_alert.dynamic > (dbo.c_current.price_usd - dbo.c_alert.price_usd) * 100.0 / dbo.c_alert.price_usd
Which returns this:
|symbol|alert_price|current_price|pct_diff|dynamic|
-----------------------------------------------
|BTC |13000 |13613.3000 |4.7 |5 |
I'm not very strong with financial queries. Basically, I would like to know, as a boolean, when the percent difference between alert_price and current_price is equal to or greater than the value in the dynamic column. So where the difference is equal to or greater than 5%, show True, else False. That dynamic value (an integer) could be different for each row in the c_alert table. I hope someone can provide a solution to the query.
Because the same percent difference term is required in multiple places in the query, I might go with using a CTE first, which calculates this term. Then, do a straightforward query on the CTE to get the output you want.
WITH cte AS (
SELECT
t2.symbol,
t2.dynamic,
t2.price_usd AS alert_price,
t1.price_usd AS current_price,
100.0*(t1.price_usd - COALESCE(t2.price_usd, 0.0)) / t2.price_usd AS pct_diff
FROM dbo.c_current t1
LEFT JOIN dbo.c_alert t2
ON t1.symbol = t2.symbol
)
SELECT
symbol,
alert_price,
current_price,
pct_diff,
dynamic,
CASE WHEN pct_diff >= dynamic THEN 'TRUE' ELSE 'FALSE' END AS result
FROM cte;
Edit:
The logic seems to be working in the demo below. If you still have issues, then edit the demo and paste the link somewhere as a comment.
Demo
Use table aliases so your query is easier to write and to read. Then just use a case:
SELECT a.symbol, a.price_usd AS alert_price,
c.price_usd AS current_price,
(c.price_usd - a.price_usd) * 100.0 / a.price_usd AS pct_diff,
a.dynamic AS pct,
(case when (c.price_usd - a.price_usd) * 100.0 / a.price_usd >= a.dynamic
then 'true' else 'false'
end) as flag
FROM dbo.c_alert a INNER JOIN
dbo.c_current c
ON a.symbol = c.symbol;
SQL Server doesn't have a boolean type, so this uses a string. You can use 0 and 1 instead.
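The rule the question asks for (flag a row when the percent difference reaches the row's dynamic threshold) can be sketched in Python to check the expected outcomes. The function names are made up for this illustration.

```python
def pct_diff(alert_price: float, current_price: float) -> float:
    # Percent change of the current price relative to the alert price.
    return (current_price - alert_price) * 100.0 / alert_price


def alert_flag(alert_price: float, current_price: float, dynamic: float) -> bool:
    # True when the percent difference is equal to or greater than the threshold.
    return pct_diff(alert_price, current_price) >= dynamic
```

With the sample row (alert 13000, current 13613.30, threshold 5) the difference is about 4.7%, so the flag is False; at a current price of 13650 the difference is exactly 5% and the flag becomes True.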

Lots of WHEN conditions in CASE statement (binning)

How can I do binning in SQL Server 2008 if I need about 100 bins? I need to group records depending if a binning variable belongs to one of 100 equal intervals.
For example, if there is a continuous variable age, I could write:
CASE
WHEN AGE >= 0 AND AGE < 1 THEN '1'
WHEN AGE >= 1 AND AGE < 2 THEN '2'
...
WHEN AGE >= 99 AND AGE < 100 THEN '100'
END [age_group]
But this process would be time-consuming. Are there other ways to do that?
Try this:
SELECT CASE
WHEN AGE = 0 THEN 1
ELSE Ceiling([age])
END [age_group]
FROM #T
Here the CEILING function returns the smallest integer greater than or equal to the specified numeric expression, e.g. SELECT CEILING(0.1) returns 1.
But for your required output, FLOOR(age) + 1 is enough:
SELECT Floor([age]) + 1 [age_group]
FROM #T
Here the FLOOR function returns the largest integer less than or equal to the specified numeric expression.
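The difference between the two variants shows up only at whole-number ages. A small Python sketch, with made-up function names, compares both against the CASE ladder from the question:

```python
import math
from typing import Optional


def bin_case(age: float) -> Optional[int]:
    # The CASE ladder: [0, 1) -> 1, [1, 2) -> 2, ..., [99, 100) -> 100.
    for upper in range(1, 101):
        if upper - 1 <= age < upper:
            return upper
    return None  # outside every interval, like a CASE without ELSE


def bin_floor(age: float) -> int:
    return math.floor(age) + 1


def bin_ceiling(age: float) -> int:
    # The CASE WHEN AGE = 0 THEN 1 ELSE CEILING(age) variant.
    return 1 if age == 0 else math.ceil(age)
```

For fractional ages such as 1.5 both give bin 2, but for an age of exactly 1 the ceiling variant returns 1 while the CASE ladder and `bin_floor` return 2.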
Try this based upon your comment about the segments being 1200:
;With Number
AS
(
SELECT *
FROM (Values(1),(2), (3), (4), (5), (6), (7), (8), (9), (10))N(x)
),
Segments
As
(
SELECT (ROW_NUMBER() OVER(ORDER BY Num1.x) -1) * 1200 As StartNum,
ROW_NUMBER() OVER(ORDER BY Num1.x) * 1200 As EndNum
FROM Number Num1
CROSS APPLY Number Num2
)
SELECT *
FROM Segments
INNER JOIN MyTable
ON MyTable.Price >= StartNum AND MyTable.Price < EndNum
Mathematics, I guess. In this case,
Ceiling(Age) AS [age_group]
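Cast as necessary into the character type of your choice. CEILING is the 'round up to an integer' function in SQL Server. Note that for whole-number ages this gives a bin one lower than the CASE expression in the question (an age of exactly 1 maps to 1, not 2).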
You can use arithmetic for this purpose. Something like this:
select floor(bins * (age - minage) / (range + 1)), count(*)
from t cross join
(select min(age) as minage, max(age) as maxage,
1.0*(max(age) - min(age)) as range, 100 as bins
from t
) m
group by floor(bins * (age - minage) / (range + 1))
However, this is overkill for your example, which doesn't need a case at all.
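The arithmetic in that query can be sketched in Python to see which bin index each value gets. The sample `ages` list and the `bin_index` name are made up for the illustration.

```python
import math
from collections import Counter


def bin_index(age: float, min_age: float, max_age: float, bins: int) -> int:
    # Same formula as the query: floor(bins * (age - minage) / (range + 1)).
    value_range = 1.0 * (max_age - min_age)
    return math.floor(bins * (age - min_age) / (value_range + 1))


ages = [0, 0.5, 1, 37.2, 37.9, 99]
counts = Counter(bin_index(a, min(ages), max(ages), 100) for a in ages)
```

Dividing by range + 1 rather than range keeps the maximum value inside the last bin: here the age 99 lands in index 99, so the indexes run from 0 to bins - 1.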
If the interval for the groups is fixed, for example 1200, you can just do an integer division to get the index of the group.
For example:
SELECT 1000 / 1200 -- equals 0
SELECT 2200 / 1200 -- equals 1
Remember that integer division requires int on both sides of the operator, so cast to int first if you're using a decimal datatype.
Then add 1 to get the group number.
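The same fixed-width idea as a minimal Python sketch, where `//` is integer (floor) division; the `segment` name and the default width of 1200 are just for illustration.

```python
def segment(price: int, width: int = 1200) -> int:
    # Integer division gives the zero-based index; add 1 for the group number.
    return price // width + 1


first = segment(1000)   # 1000 // 1200 is 0, so group 1
second = segment(2200)  # 2200 // 1200 is 1, so group 2
```

A price equal to the width (1200) falls into group 2, matching the half-open intervals Price >= StartNum AND Price < EndNum used earlier in this thread.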