exception handling in postgres - sql

I have a query in which I need to compute the exp of a number. In some cases the result is too large and exp raises an overflow error, or the argument itself is too large for the function. In those cases I need to return some arbitrary value.
Example (which does not work)
select LEAST(exp(1000), 1::double precision);
Here, I attempt to compute exp(1000). If the result is too large, I want to return 1.
How can I do this in Postgres?

Something like this:
create or replace function safe_exp(val double precision)
  returns double precision
  language plpgsql
as
$body$
declare
  result double precision;
begin
  begin
    result := exp(val);
  exception
    -- any error raised by exp(), e.g. an overflow, falls back to 1.0
    when others then
      result := 1.0;
  end;
  return result;
end;
$body$;
But because a block with an exception clause is noticeably more expensive to enter and exit than a plain block, this will be slower than a "regular" exp() call.
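If the cost of the exception block matters, one alternative is to check the argument before calling exp(). This is a minimal sketch, not part of the answer above: the name safe_exp_guarded and the cutoffs of -700 and 700 are assumptions, chosen conservatively so that the result stays within the range of a double precision value.

```sql
-- Hypothetical alternative to safe_exp: guard the argument instead of
-- catching the error. The cutoffs -700 and 700 are assumed, conservative limits.
create or replace function safe_exp_guarded(val double precision)
returns double precision
language sql
immutable
as $$
    select case
               when val is null then null              -- keep exp(NULL) = NULL
               when val between -700 and 700 then exp(val)
               else 1.0::double precision              -- arbitrary fallback value
           end;
$$;

select safe_exp_guarded(1000), safe_exp_guarded(1);
```

This only covers out-of-range arguments; any other error would still propagate to the caller.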

Related

Random Double Function - POSTGRESQL

I'm trying to write a function that takes a minimum and a maximum input and returns a double.
Inputs:
high (integer)
low (integer)
Output:
val (double)
My SQL code is:
CREATE OR REPLACE FUNCTION random_between(low INT ,high INT)
RETURNS DOUBLE AS
BEGIN
RETURN floor(random()* (high-low + 1) + low);
END;
The error:
ERROR: syntax error at or near "BEGIN"
You could write this as a pure SQL function, like so:
create or replace function random_between(low int ,high int)
returns double precision as $$
select floor(random()* (high-low + 1) + low);
$$ language sql;
Problems with your code:
the body of the function needs to be surrounded with single quotes (or an equivalent, such as dollar quoting with $$)
there is no double datatype in Postgres; you probably meant double precision. Note, however, that this is an inexact (floating-point) datatype, which may or may not be what you want, so make sure you understand the implications
you need to specify the language of the function
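For comparison, the original plpgsql attempt with these three fixes applied might look like the sketch below. The name random_between_plpgsql is used here only to avoid clashing with the SQL version above.

```sql
-- Sketch: the question's plpgsql function with a quoted body,
-- a valid return type, and an explicit language clause.
create or replace function random_between_plpgsql(low int, high int)
returns double precision
language plpgsql
as $$
begin
    -- floor() of a double precision value is still double precision
    return floor(random() * (high - low + 1) + low);
end;
$$;

select random_between_plpgsql(1, 6);
```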

How to speed up custom window function with arrays and loops in PostgreSQL?

I'm currently learning UDFs and wrote the PostgreSQL UDF below to calculate the mean absolute deviation (MAD). It is the average absolute difference between the mean and each value over any window.
In python pandas/numpy, to find the MAD, we could write something like this:
series_mad = abs(series - series.mean()).mean()
Where series is a set of numbers and series_mad is a single numeric value representing the MAD of the series.
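The same non-windowed calculation can be sketched in plain SQL over an array. The function name array_mad is hypothetical and the use of double precision is an assumption; it is only meant to pin down the formula, not to replace a windowed aggregate.

```sql
-- Hypothetical helper: mean absolute deviation of all elements of an array.
create or replace function array_mad(vals double precision[])
returns double precision
language sql
immutable
as $$
    -- avg(|v - mean|), where mean is the average of all elements
    select avg(abs(v - m.mean_val))
    from unnest(vals) as t(v)
    cross join (select avg(u) as mean_val from unnest(vals) as s(u)) as m;
$$;

select array_mad(array[1, 2, 3, 4]::double precision[]);
```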
I'm trying to write this in PostgreSQL using window functions and a UDF. So far, this is what I have:
CREATE TYPE misc_tuple AS (
arr_store numeric[],
ma_period integer
);
CREATE OR REPLACE FUNCTION mad_func(prev misc_tuple, curr numeric, ma_period integer)
RETURNS misc_tuple AS $$
BEGIN
IF curr is null THEN
RETURN (null::numeric[], -1);
ELSEIF prev.arr_store is null THEN
RETURN (ARRAY[curr]::numeric[], ma_period);
ELSE
-- accumulate new values in array
prev.arr_store := array_append(prev.arr_store, curr);
RETURN prev;
END IF;
END;
$$ LANGUAGE plpgsql;
CREATE OR REPLACE FUNCTION mad_final(prev misc_tuple)
RETURNS numeric AS $$
DECLARE
total_len integer;
count numeric;
mad_val numeric;
mean_val numeric;
BEGIN
count := 0;
mad_val := 0;
mean_val := 0;
total_len := array_length(prev.arr_store, 1);
-- first loop to find the mean of the series
FOR i IN greatest(1,total_len-prev.ma_period+1)..total_len
LOOP
mean_val := mean_val + prev.arr_store[i];
count := count + 1;
END LOOP;
mean_val := mean_val/NULLIF(count,0);
-- second loop to subtract mean from each value
FOR i IN greatest(1,total_len-prev.ma_period+1)..total_len
LOOP
mad_val := mad_val + abs(prev.arr_store[i]-mean_val);
END LOOP;
RETURN mad_val/NULLIF(count, 0);
END;
$$ LANGUAGE plpgsql;
CREATE OR REPLACE AGGREGATE mad(numeric, integer) (
SFUNC = mad_func,
STYPE = misc_tuple,
FINALFUNC = mad_final
);
This is how I'm testing the performance:
-- find rolling 12-period MAD
SELECT x,
mad(x, 12) OVER (ROWS 12-1 PRECEDING)
FROM generate_series(0,1000000) as g(x);
Currently, it takes ~45-50 secs on my desktop (i5 4670, 3.4 GHz, 16 GB RAM). I'm still learning UDFs, so I'm not sure what else I could do to my function to make it faster. I have a few other similar UDFs - but ones which don't use arrays and they take <15 secs on the same 1m rows. My guess is maybe I'm not efficiently looping the arrays or something could be done about the 2 loops in the UDF.
Are there any changes I can make to this UDF to make it faster?
Your example code does not work, you have an extra comma in the type definition, and you use an undefined variable cnt in one of the functions.
Why are you specifying 12 as both an argument to the aggregate itself, and in the ROWS PRECEDING? That seems redundant.
Your comparison to numpy doesn't seem very apt, as that is not a sliding window function.
"I have a few other similar UDFs - but ones which don't use arrays and they take <15 secs on the same 1m rows"
Are they also used as sliding window functions? Are they also written in plpgsql? Could you show one and its usage?
pl/pgsql is not generally a very efficient language, especially not when manipulating large arrays. Although in your usage, the arrays never get very large, so I would not expect that to be particularly a problem.
One way to make it more efficient would be to write the code in C rather than pl/pgsql, using INTERNAL datatype rather than an SQL composite type.
Another way to improve this particular usage (a large number of windows each of which is small) might be to implement the MINVFUNC function and friends for this aggregate, so that it doesn't have to keep restarting the aggregation from scratch for every row.
Here is an example inverse function, which doesn't change the output at all, but does cut the run time in about half:
CREATE OR REPLACE FUNCTION mad_invfunc(prev misc_tuple, curr numeric, ma_period integer)
RETURNS misc_tuple AS $$
BEGIN
-- remove prev value
prev.arr_store := prev.arr_store[2:];
RETURN prev;
END;
$$ LANGUAGE plpgsql;
CREATE OR REPLACE AGGREGATE mad(numeric, integer) (
SFUNC = mad_func,
STYPE = misc_tuple,
FINALFUNC = mad_final,
MSFUNC = mad_func,
MSTYPE = misc_tuple,
MFINALFUNC = mad_final,
MINVFUNC = mad_invfunc
);
If I change the type from numeric to double precision everywhere, that cuts the run time in half again. So while the loops over the array might not be terribly efficient, when using only 12-member windows they are not the main bottleneck.

Check Input Parameter And Raise Exception

Lately I've been working with some functions that apparently are working.
I'd like to add some features, for example: "if the function input parameter is a string, it raises an exception, saying something". How can I do that?
/*
PLpgSQL function which behaves to aggregate the MIN(col)
*/
CREATE OR REPLACE FUNCTION searchMinimumValue (real,real) RETURNS real AS $$
DECLARE
BEGIN
IF $1 IS NULL OR $1 >= $2 THEN
RETURN $2;
ELSE
RETURN $1;
END IF;
END;
$$ LANGUAGE plpgsql;
/*
Function which given the minimum value returned from the previous function,
adds the Laplacian noise.
Our upper bound is computed by doubling the epsilon value and then adding our minimum value found by the previous function.
The returned value from the function below will be the Laplace distribution value added to the output from the previous function.
*/
CREATE OR REPLACE FUNCTION addLaplacianNoiseMinimum(real) RETURNS real AS $$
DECLARE
epsilon real := 1.2;
sensivity real := (epsilon * 2) + $1;
laplaceDistribution real;
BEGIN
laplaceDistribution := sensivity / (epsilon);
RETURN $1 + laplaceDistribution;
END;
$$ LANGUAGE plpgsql;
CREATE AGGREGATE minimumLaplaceValue (real)
(
sfunc = searchMinimumValue,
stype = real,
finalfunc = addLaplacianNoiseMinimum
);
As I said before, I'd like to type something like this:
IF $1 IS NOT A NUMBER THEN RAISE EXCEPTION 'WRONG TYPE INPUT PARAMETER'
I don't think you can do this in Postgres, at least not without some unwanted side effects.
Postgres has a strict type system, so all type checking should be left to Postgres itself.
But you can overload functions for some set of types of parameters:
CREATE OR REPLACE FUNCTION public.f1(numeric)
RETURNS numeric
LANGUAGE plpgsql
AS $function$
begin
  return $1;
end;
$function$;
CREATE OR REPLACE FUNCTION public.f1(text)
RETURNS text
LANGUAGE plpgsql
AS $function$
begin
  raise exception 'only numeric type is supported';
end;
$function$;
postgres=# select f1(10);
+----+
| f1 |
+----+
| 10 |
+----+
(1 row)
postgres=# select f1('ahoj');
ERROR: only numeric type is supported
CONTEXT: PL/pgSQL function f1(text) line 3 at RAISE
But I strongly recommend against this pattern. Overloading is a double-edged sword: it can be a good or a bad friend, and it should be used only when it is really needed and does some real work, not just to raise an exception. Checking types is a job for Postgres' type system, which does it better (although with a different, and at first sight perhaps strange, error message).
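Checks on values, as opposed to types, are a different matter and can be raised from inside the function. A minimal sketch, assuming you wanted to reject NaN inputs in the state function; the name checked_minimum and the choice of error code are illustrative and not part of the original aggregate:

```sql
-- Hypothetical variant of searchMinimumValue with a value check added.
create or replace function checked_minimum(state real, val real)
returns real
language plpgsql
as $$
begin
    -- The signature already enforces the type; NaN is a legal real value
    -- that this sketch chooses to reject (Postgres treats NaN = NaN as true).
    if val = 'NaN'::real then
        raise exception 'input value must not be NaN'
            using errcode = 'invalid_parameter_value';
    end if;

    if state is null or state >= val then
        return val;
    end if;
    return state;
end;
$$;

select checked_minimum(3.0::real, 2.0::real);
```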

Package Errors. What Am I doing wrong?

I'm trying to create a package that works out the number of days between two dates. I'm aware I have probably got this miles wrong, and I am really struggling to troubleshoot it. My knowledge of Oracle is low, as I'm still quite new to this. The package I've written is below, but I am getting the error shown at the bottom.
How do I resolve this?
CREATE OR REPLACE PACKAGE PACKNAME
AS
FUNCTION TIME_SCALE RETURN NUMBER;
END;
/
CREATE OR REPLACE PACKAGE BODY PACKNAME
AS closed_date := '28-APR-14'
FUNCTION TIME_SCALE RETURN NUMBER;
IS BEGIN
TRUNC(mrc.closed_date - mrc.open_date) AS days_difference FROM TASKS mrc;
END;
Error at line 2: PLS-00103: Encountered the symbol "=" when expecting one of the following: constant exception <an identifier> <a double-quoted delimited-identifier> table long double ref char time timestamp interval date binary national character nchar
I have substituted some of the names to make them clearer for you to read.
The function is to output the number of days it has taken for a task to be completed. The columns basically include the date_opened and date_closed in simple terms as well as a few others and a unique ID which I believe is a sequence.
Try this:
CREATE OR REPLACE PACKAGE PACKNAME
AS
FUNCTION TIME_SCALE
RETURN NUMBER;
END;
/
CREATE OR REPLACE PACKAGE BODY PACKNAME
AS
closed_date VARCHAR2(50):= '28-APR-14';
days_difference NUMBER;
FUNCTION TIME_SCALE
RETURN NUMBER
IS
BEGIN
SELECT TRUNC(mrc.closed_date - mrc.open_date) INTO days_difference
FROM TASKS mrc;
RETURN days_difference;
END;
END;
What was wrong:
1) The declaration of closed_date was missing its type
2) There was a ';' after RETURN NUMBER in the function definition
3) The SELECT ... INTO clause was missing inside the function
4) The END for the package body was missing

How can I debug a "division by zero" error in postgresql

I am using many different views and plpgsql functions/aggregates in a single SELECT. When I run this SELECT on certain data sets, I get a division by zero error. Unfortunately, I don't get any details where exactly the division by zero occurs.
Is there a good way to pinpoint the exact place where the problem occurs?
Running the same code in psql will yield more helpful information, like:
ERROR: division by zero
CONTEXT: PL/pgSQL function "mean_estimator_sfunc" line 10 during statement block local variable initialization
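If the error is raised inside one of your own plpgsql functions, a temporary exception handler can also report where it came from. The sketch below assumes PostgreSQL 9.2 or later (for PG_EXCEPTION_CONTEXT); ratio_with_context is a made-up function standing in for whichever function you suspect.

```sql
-- Hypothetical function: divides a by b and, on division by zero,
-- reports the error message and call context before returning NULL.
create or replace function ratio_with_context(a numeric, b numeric)
returns numeric
language plpgsql
as $$
declare
    err_msg text;
    err_ctx text;
begin
    return a / b;
exception
    when division_by_zero then
        get stacked diagnostics
            err_msg = message_text,
            err_ctx = pg_exception_context;
        raise notice 'caught "%" at: %', err_msg, err_ctx;
        return null;
end;
$$;

select ratio_with_context(1, 0);
```

Once the offending function is found, the handler can be removed again, since exception blocks add overhead.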
Post your function. Here is a good example from the Oracle documentation on how to avoid the exception; the same can be done in plain SQL:
http://docs.oracle.com/cd/B28359_01/appdev.111/b28370/errors.htm
DECLARE
stock_price NUMBER := 9.73;
net_earnings NUMBER := 0;
pe_ratio NUMBER;
BEGIN
pe_ratio :=
CASE net_earnings
WHEN 0 THEN NULL
ELSE stock_price / net_earnings
end;
END;
/
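In PostgreSQL the same guard is usually written with NULLIF, which turns a zero divisor into NULL so that the division yields NULL instead of raising an error. A small sketch with made-up inline values:

```sql
-- The rows in VALUES are illustrative only; the first has a zero divisor.
select stock_price,
       net_earnings,
       stock_price / nullif(net_earnings, 0) as pe_ratio
from (values (9.73, 0), (9.73, 2)) as t(stock_price, net_earnings);
```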