How can wrap-on-overflow math be done in Postgres? - sql

Given the following example function:
CREATE OR REPLACE FUNCTION add_max_value(_x BIGINT)
RETURNS BIGINT
LANGUAGE sql
AS $$
SELECT 9223372036854775807 + _x;
$$;
If this function is called with any positive value, the following error is returned:
SELECT add_max_value(1); -- Expecting -9223372036854775808 if math wrapped
-- SQL Error [22003]: ERROR: bigint out of range
How can I do wrap-on-overflow integer math in Postgres?
Please note:
I want to do this in the database, not in the application
I don't want it to promote to an arbitrary precision integer (NUMERIC)
Although the example only does addition, in practice I'm interested in other operations as well

As a SQL function there isn't a way. SQL functions cannot process exceptions. But a plpgsql function can:
CREATE OR REPLACE FUNCTION add_max_value(_x BIGINT)
RETURNS BIGINT
LANGUAGE plpgsql
AS $$
declare
bigx bigint;
begin
bigx = 9223372036854775807 + _x;
return bigx;
exception
when sqlstate '22003' then
return (9223372036854775807::numeric + _x - 2^64)::bigint;
end;
$$;

This is easily in my top five of the most wasteful SQL I have ever written:
create or replace function add_max_value(_x bigint)
returns bigint
language sql
as $$
with recursive inputs as (
select s.rn, r.a::int, s.b::int, (r.a::int + s.b::int) % 2 as sumbit,
(r.a::bit & s.b::bit)::int as carry
from regexp_split_to_table((9223372036854775807::bit(64))::text, '') with ordinality as r(a, rn)
join regexp_split_to_table((_x::bit(64))::text, '') with ordinality as s(b, rn)
on s.rn = r.rn
), addition as (
select rn, sumbit, sumbit as s2, carry, carry as upcarry
from inputs
where rn = 64
union
select i.rn, i.sumbit, (i.sumbit + a.upcarry) % 2, i.carry,
(i.carry::bit | a.upcarry::bit)::int
from addition a
join inputs i on i.rn = a.rn - 1
)
select (string_agg(s2::text, '' order by rn)::bit(64))::bigint
from addition
$$;

Related

Select odd or even values from text array

How to select odd or even values from a text array in Postgres?
You can select by index (starts at 1)
select
text_array[1] as first
from (
select '{a,1,b,2,c,3}'::text[] as text_array
) as x
There is not a native function for this: https://www.postgresql.org/docs/13/functions-array.html. I see Postgres supports modulo math (https://www.postgresql.org/docs/13/functions-math.html) but I'm not sure how to apply that here as the below is invalid:
select
text_array[%2] as odd
from (
select '{a,1,b,2,c,3}'::text[] as text_array
) as x
The goal is to get {a,1,b,2,c,3} -> {a,b,c}. Likewise for even, {a,1,b,2,c,3} -> {1,2,3}.
Any guidance would be greatly appreciated!
Generate a list of subscripts (generate_series expression for the odd ones) then extract the array values and aggregate back into arrays. Null values by even subscripts need to be filtered if the array length is odd. Here is an illustration. t CTE is a "table" of sample data.
with t(arr) as
(
values
('{a,1,b,2,c,3}'::text[]),
('{11,12,13,14,15,16,17,18,19,20}'), -- even number of elements
('{21,22,23,24,25,26,27,28,29}') -- odd number of elements
)
select arr,
array_agg(arr[odd]) arr_odd,
array_agg(arr[odd + 1]) filter (where arr[odd + 1] is not null) arr_even
from t
cross join lateral generate_series(1, array_length(arr, 1), 2) odd
group by arr;
arr
arr_odd
arr_even
{21,22,23,24,25,26,27,28,29}
{21,23,25,27,29}
{22,24,26,28}
{a,1,b,2,c,3}
{a,b,c}
{1,2,3}
{11,12,13,14,15,16,17,18,19,20}
{11,13,15,17,19}
{12,14,16,18,20}
Or use these functions:
create function textarray_odd(arr text[]) returns text[] language sql as
$$
select array_agg(arr[i]) from generate_series(1, array_length(arr,1), 2) i;
$$;
create function textarray_even(arr text[]) returns text[] language sql as
$$
select array_agg(arr[i]) from generate_series(2, array_length(arr,1), 2) i;
$$;
select textarray_odd('{a,1,b,2,c,3}'); -- {a,b,c}
select textarray_even('{a,1,b,2,c,3}'); -- {1,2,3}
A more generic alternative:
create function array_odd(arr anyarray) returns anyarray language sql as
$$
select array_agg(v order by i)
from unnest(arr) with ordinality t(v, i)
where i % 2 = 1;
$$;

Wrong Calculation by User-Defined SQL-Function

I have a function defined by
CREATE OR REPLACE FUNCTION public.div(dividend INTEGER, divisor INTEGER)
RETURNS INTEGER
LANGUAGE 'sql'
IMMUTABLE
LEAKPROOF
STRICT
SECURITY DEFINER
PARALLEL SAFE
AS $BODY$
SELECT ($1 + $2/2) / $2;
$BODY$;
It should calculate a commercial rounded result. Most of the times, it does the job. I don't know why, but select div(5, 3) gives me the correct answer while it doesn't when one parameter is calculated by an aggregate, e.g. select div(sum(val), 3) from (select 1 as val UNION SELECT 4) list is sufficient to trigger that.
How can I fix div? I don't want to cast every input.
BTW, using SELECT (cast($1 as integer) + cast($2 as integer)/2) / cast($2 as integer); as the definition of div didn't help.
Allow floats as parameters, then explicitly cast at the calculation, otherwise you have an implied conversion whilst passing the parameter.
CREATE OR REPLACE FUNCTION my_div(dividend FLOAT, divisor FLOAT)
RETURNS INTEGER
LANGUAGE 'sql'
IMMUTABLE
-- LEAKPROOF -- not allowed at dbfiddle.uk
STRICT
SECURITY DEFINER
PARALLEL SAFE
AS $BODY$
SELECT --($1 + $2/2) / $2;
(cast($1 as integer) + cast($2 as integer)/2) / cast($2 as integer)
$BODY$;
✓
select my_div(sum(val), 3)
from (select 1 as val UNION SELECT 4) x
| my_div |
| -----: |
| 2 |
dbfiddle here
Change the name of the function.
The function div(numeric, numeric) is a builtin Postgres function and there is an ambiguity which function you want to call:
select div(5, 3) -- calls your function public.div(integer, integer)
select div(5::bigint, 3) -- calls pg_catalog.div(numeric, numeric)
In the second case the arguments have to be resolved and the system function is chosen as first.
Note that the function sum(integer) gives bigint as a result.

How to replace all subsets of characters based on values of other tables in pl/pgsql?

I've been doing some research on how to replace a subset of string of characters of a single row base on the values of the columns of other rows, but was not able to do so since the update are only for the first row values of the other table. So I'm planning to insert this in a loop in a plpsql function.
Here are the snippet of my tables. Main table:
Table "public.tbl_main"
Column | Type | Modifiers
-----------------------+--------+-----------
maptarget | text |
expression | text |
maptarget | expression
-----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
43194-0 | 363787002:70434600=(386053000:704347000=(414237002:704320005=259470008,704318007=118539007,704319004=50863008),704327008=122592007,246501002=703690001,370132008=30766002)
Look-up table:
Table "public.tbl_values"
Column | Type | Modifiers
-----------------------+--------+-----------
conceptid | bigint |
term | text |
conceptid | term
-----------+------------------------------------------
386053000 | Patient evaluation procedure (procedure)
363787002 | Observable entity (observable entity)
704347000 | Observes (attribute)
704320005 | Towards (attribute)
704318007 | Property type (attribute)
I want to create a function that will replace all numeric values in the tbl_main.expression columns with their corresponding tbl_values.term using the tbl_values.conceptid as the link to each numeric values in the expression string.
I'm stuck currently in the looping part since I'm a newbie in LOOP of plpgsql. Here is the rough draft of my function.
--create first a test table
drop table if exists tbl_test;
create table tbl_test as select * from tbl_main limit 1;
--
create or replace function test ()
RETURNS SETOF tbl_main
LANGUAGE plpgsql
AS $function$
declare
resultItem tbl_main;
v_mapTarget text;
v_expression text;
ctr int;
begin
v_mapTarget:='';
v_expression:='';
ctr:=1;
for resultItem in (select * from tbl_test) loop
v_mapTarget:=resultItem.mapTarget;
select into v_expression expression from ee;
raise notice 'parameter used: %',v_mapTarget;
raise notice 'current expression: %',v_expression;
update ee set expression=replace(v_expression, new_exp::text, term) from (select new_exp::text, term from tbl_values offset ctr limit 1) b ;
ctr:=ctr+1;
raise notice 'counter: %', ctr;
v_expression:= (select expression from ee);
resultItem.expression:= v_expression;
raise notice 'current expression: %',v_expression;
return next resultItem;
end loop;
return;
end;
$function$;
Any further information will be much appreciated.
My Postgres version:
PostgreSQL 9.3.6 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu
4.8.2-19ubuntu1) 4.8.2, 64-bit
PL/pgSQL function with dynamic SQL
Looping is always a measure of last resort. Even in this case it is substantially cheaper to concatenate a query string using a query, and execute it once:
CREATE OR REPLACE FUNCTION f_make_expression(_expr text, OUT result text) AS
$func$
BEGIN
EXECUTE (
SELECT 'SELECT ' || string_agg('replace(', '') || '$1,'
|| string_agg(format('%L,%L)', conceptid::text, v.term), ','
ORDER BY conceptid DESC)
FROM (
SELECT conceptid::bigint
FROM regexp_split_to_table($1, '\D+') conceptid
WHERE conceptid <> ''
) m
JOIN tbl_values v USING (conceptid)
)
USING _expr
INTO result;
END
$func$ LANGUAGE plpgsql;
Call:
SELECT *, f_make_expression(expression) FROM tbl_main;
However, if not all conceptid have the same number of digits, the operation could be ambiguous. Replace conceptid with more digits first to avoid that - ORDER BY conceptid DESC does that - and make sure that replacement strings do not introduce ambiguity (numbers that might be replaced in the the next step). Related answer with more on these pitfalls:
Replace a string with another string from a list depending on the value
The token $1 is used two different ways here, don't be misled:
regexp_split_to_table($1, '\D+')
This one references the first function parameter _expr. You could as well use the parameter name.
|| '$1,'
This concatenates into the SQL string a references to the first expression passed via USING clause to EXECUTE. Parameters of the outer function are not visible inside EXECUTE, you have to pass them explicitly.
It's pure coincidence that $1 (_expr) of the outer function is passed as $1 to EXECUTE. Might as well hand over $7 as third expression in the USING clause ($3) ...
I added a debug function to the fiddle. With a minor modification you can output the generated SQL string to inspect it:
SQL function
Here is a pure SQL alternative. Probably also faster:
CREATE OR REPLACE FUNCTION f_make_expression_sql(_expr text)
RETURNS text AS
$func$
SELECT string_agg(CASE WHEN $1 ~ '^\d'
THEN txt || COALESCE(v.term, t.conceptid)
ELSE COALESCE(v.term, t.conceptid) || txt END
, '' ORDER BY rn) AS result
FROM (
SELECT *, row_number() OVER () AS rn
FROM (
SELECT regexp_split_to_table($1, '\D+') conceptid
, regexp_split_to_table($1, '\d+') txt
) sub
) t
LEFT JOIN tbl_values v ON v.conceptid = NULLIF(t.conceptid, '')::int
$func$ LANGUAGE sql STABLE;
In Postgres 9.4 this can be much more elegant with two new features:
ROWS FROM to replacing the old (weird) technique to sync set-returning functions
WITH ORDINALITY to get row numbers on the fly reliably:
PostgreSQL unnest() with element number
CREATE OR REPLACE FUNCTION f_make_expression_sql(_expr text)
RETURNS text AS
$func$
SELECT string_agg(CASE WHEN $1 ~ '^\d'
THEN txt || COALESCE(v.term, t.conceptid)
ELSE COALESCE(v.term, t.conceptid) || txt END
, '' ORDER BY rn) AS result
FROM ROWS FROM (
regexp_split_to_table($1, '\D+')
, regexp_split_to_table($1, '\d+')
) WITH ORDINALITY AS t(conceptid, txt, rn)
LEFT JOIN tbl_values v ON v.conceptid = NULLIF(t.conceptid, '')::int
$func$ LANGUAGE sql STABLE;
SQL Fiddle demonstrating all for Postgres 9.3.
There's also another way, without creating functions... using "WITH RECURSIVE". Used it with lookup talbe of thousands of rows.
You'll need to change following table names and columns to your names:
tbl_main, strsourcetext, strreplacedtext;
lookuptable, strreplacefrom, strreplaceto.
WITH RECURSIVE replaced AS (
(SELECT
strsourcetext,
strreplacedtext,
array_agg(strreplacefrom ORDER BY length(strreplacefrom) DESC, strreplacefrom, strreplaceto) AS arrreplacefrom,
array_agg(strreplaceto ORDER BY length(strreplacefrom) DESC, strreplacefrom, strreplaceto) AS arrreplaceto,
count(1) AS intcount,
1 AS intindex
FROM tbl_main, lookuptable WHERE tbl_main.strsourcetext LIKE '%' || strreplacefrom || '%'
GROUP BY strsourcetext)
UNION ALL
SELECT
strsourcetext,
replace(strreplacedtext, arrreplacefrom[intindex], arrreplaceto[intindex]) AS strreplacedtext,
arrreplacefrom,
arrreplaceto,
intcount,
intindex+1 AS intindex
FROM replaced WHERE intindex<=intcount
)
SELECT strsourcetext,
(array_agg(strreplacedtext ORDER BY intindex DESC))[1] AS strreplacedtext
FROM replaced
GROUP BY strsourcetext

Calculate function's fixed point in sql?

I want to write sql query that will calculate iterated function sequence of some function until it reaches fixed point, and then return that fixed point:
f(f(f(f ..(x)))) = x0 = f(x0)
E.g. let f(x) = (256/x + x)/2:
create function f(x float) returns float as $$
select (256/x + x) / 2
$$ language sql;
Here is my attempt to write the query:
create function f_sequence(x float) returns table(x0 float) as $$
with recursive
t(a,b) as
(select x, f(x)
union all
select b, f(b) from t where a <> b)
select a from t;
$$ language sql;
Now it is possible to get iterated sequence that converges to some fixed point:
=# select f_sequence(333);
f_sequence
------------------
333
166.884384384384
84.2091902577822
43.6246192451207
24.7464326525125
17.5456790321891
16.0680829640781
16.0001442390486
16.0000000006501
16
(10 rows)
(Actually it converges to √256 because this is Babylonian method of calculating square roots.)
Now I need one additional query to get only the last row from the sequence:
=# with
res as (select array_agg(f_sequence) as res from f_sequence(333))
select res[array_length(res,1)] from res;
res
-----
16
(1 row)
The question is: how to write this more concise?
In particular, I don't like that separate query to get the last value (and that I need to accumulate all intermediate values in array).
Leave the definition of f as it is.
In your sequence generation, retain the value of f(x) in the output.
create function f_sequence(x float) returns table(x0 float, fx float) as $$
with recursive
t(a,b) as
(select x, f(x)
union all
select b, f(b) from t where a <> b)
select a, b from t;
$$ language sql;
Then limit your result to just the fixed value.
select x0 from f_sequence(256) where x0 = fx;
Edit: Add a procedural version.
create function iterf(x float) returns float as $$
declare fx float := f(x);
begin
while fx != x loop
x := fx;
fx := f(x);
end loop;
return fx;
end;
$$ language plpgsql;

How to calculate an exponential moving average on postgres?

I'm trying to implement an exponential moving average (EMA) on postgres, but as I check documentation and think about it the more I try the more confused I am.
The formula for EMA(x) is:
EMA(x1) = x1
EMA(xn) = α * xn + (1 - α) * EMA(xn-1)
It seems to be perfect for an aggregator, keeping the result of the last calculated element is exactly what has to be done here. However an aggregator produces one single result (as reduce, or fold) and here we need a list (a column) of results (as map). I have been checking how procedures and functions work, but AFAIK they produce one single output, not a column. I have seen plenty of procedures and functions, but I can't really figure out how does this interact with relational algebra, especially when doing something like this, an EMA.
I did not have luck searching the Internets so far. But the definition for an EMA is quite simple, I hope it is possible to translate this definition into something that works in postgres and is simple and efficient, because moving to NoSQL is going to be excessive in my context.
Thank you.
PD: here you can see an example:
https://docs.google.com/spreadsheet/ccc?key=0AvfclSzBscS6dDJCNWlrT3NYdDJxbkh3cGJ2S2V0cVE
You can define your own aggregate function and then use it with a window specification to get the aggregate output at each stage rather than a single value.
So an aggregate is a piece of state, and a transform function to modify that state for each row, and optionally a finalising function to convert the state to an output value. For a simple case like this, just a transform function should be sufficient.
create function ema_func(numeric, numeric) returns numeric
language plpgsql as $$
declare
alpha numeric := 0.5;
begin
-- uncomment the following line to see what the parameters mean
-- raise info 'ema_func: % %', $1, $2;
return case
when $1 is null then $2
else alpha * $2 + (1 - alpha) * $1
end;
end
$$;
create aggregate ema(basetype = numeric, sfunc = ema_func, stype = numeric);
which gives me:
steve#steve#[local] =# select x, ema(x, 0.1) over(w), ema(x, 0.2) over(w) from data window w as (order by n asc) limit 5;
x | ema | ema
-----------+---------------+---------------
44.988564 | 44.988564 | 44.988564
39.5634 | 44.4460476 | 43.9035312
38.605724 | 43.86201524 | 42.84396976
38.209646 | 43.296778316 | 41.917105008
44.541264 | 43.4212268844 | 42.4419368064
These numbers seem to match up to the spreadsheet you added to the question.
Also, you can define the function to pass alpha as a parameter from the statement:
create or replace function ema_func(state numeric, inval numeric, alpha numeric)
returns numeric
language plpgsql as $$
begin
return case
when state is null then inval
else alpha * inval + (1-alpha) * state
end;
end
$$;
create aggregate ema(numeric, numeric) (sfunc = ema_func, stype = numeric);
select x, ema(x, 0.5 /* alpha */) over (order by n asc) from data
Also, this function is actually so simple that it doesn't need to be in plpgsql at all, but can be just a sql function, although you can't refer to parameters by name in one of those:
create or replace function ema_func(state numeric, inval numeric, alpha numeric)
returns numeric
language sql as $$
select case
when $1 is null then $2
else $3 * $2 + (1-$3) * $1
end
$$;
This type of query can be solved with a recursive CTE - try:
with recursive cte as (
select n, x ema from my_table where n = 1
union all
select m.n, alpha * m.x + (1 - alpha) * cte.ema
from cte
join my_table m on cte.n = m.n - 1
cross join (select ? alpha) a)
select * from cte;
--$1 Stock code
--$2 exponential;
create or replace function fn_ema(text,numeric)
returns numeric as
$body$
declare
alpha numeric := 0.5;
var_r record;
result numeric:=0;
n int;
p1 numeric;
begin
alpha=2/(1+$2);
n=0;
for var_r in(select *
from stock_old_invest
where code=$1 order by stock_time desc)
loop
if n>0 then
result=result+(1-alpha)^n*var_r.price_now;
else
p1=var_r.price_now;
end if;
n=n+1;
end loop;
result=alpha*(result+p1);
return result;
end
$body$
language plpgsql volatile
cost 100;
alter function fn_ema(text,numeric)
owner to postgres;