Postgres: define a default value for CAST failures? - sql

Is it possible to define a default value that will be returned in case a CAST operation fails?
For example, so that:
SELECT CAST('foo' AS INTEGER)
Will return a default value instead of throwing an error?

There is no default value for a CAST:
A type cast specifies a conversion from one data type to another. PostgreSQL accepts two equivalent syntaxes for type casts:
CAST ( expression AS type )
expression::type
There is no room in the syntax for anything other than the expression to be cast and the desired target type.
However, you can do it by hand with a simple function:
create or replace function cast_to_int(text, integer) returns integer as $$
begin
return cast($1 as integer);
exception
when invalid_text_representation then
return $2;
end;
$$ language plpgsql immutable;
Then you can say things like cast_to_int('pancakes', 0) and get 0.
PostgreSQL also lets you create your own casts so you could do things like this:
create or replace function cast_to_int(text) returns integer as $$
begin
-- Note the double casting to avoid infinite recursion.
return cast($1::varchar as integer);
exception
when invalid_text_representation then
return 0;
end;
$$ language plpgsql immutable;
create cast (text as integer) with function cast_to_int(text);
Then you could say
select cast('pancakes'::text as integer)
and get 0 or you could say
select cast(some_text_column as integer) from t
and get 0 for the some_text_column values that aren't valid integers. If you wanted to cast varchars using this auto-defaulting cast then you'd have to double cast:
select cast(some_varchar::text as integer) from t
Just because you can do this doesn't make it a good idea. I don't think replacing the standard text to integer cast is the best idea ever. The above approach also requires you to leave the standard varchar to integer cast alone; you could get around that if you wanted to do the whole conversion yourself rather than lazily punting to the built-in casting.
NULL handling is left as an (easy) exercise for the reader.
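As a rough sketch of how the two-argument helper above could be extended: the variant below (the name try_cast_int and its parameter names are hypothetical, not from the answer) also traps numeric_value_out_of_range, which is what PostgreSQL raises when the text is a well-formed number that does not fit in an integer. A NULL input is not an error, so it simply comes back as NULL.

```sql
-- Hypothetical helper: returns p_default when the text cannot be cast to integer.
create or replace function try_cast_int(p_value text, p_default integer default null)
returns integer as $$
begin
    return cast(p_value as integer);
exception
    -- Malformed input ('pancakes') or a number too large for integer.
    when invalid_text_representation or numeric_value_out_of_range then
        return p_default;
end;
$$ language plpgsql immutable;

select try_cast_int('42', 0);           -- 42
select try_cast_int('pancakes', 0);     -- 0
select try_cast_int('99999999999', 0);  -- 0 (out of range for integer)
select try_cast_int(null, 0);           -- NULL, because casting NULL raises no error
```

If NULL inputs should also map to the default, wrapping the call in coalesce(try_cast_int(x, 0), 0) is one option.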

Trap the error as described in the documentation and then specify an action to perform instead.
The relevant snippet from the PostgreSQL documentation on error trapping is included below.
35.7.5. Trapping Errors
By default, any error occurring in a PL/pgSQL function aborts execution of the function, and indeed of the surrounding transaction as well. You can trap errors and recover from them by using a BEGIN block with an EXCEPTION clause. The syntax is an extension of the normal syntax for a BEGIN block:
[ <<label>> ]
[ DECLARE
declarations ]
BEGIN
statements
EXCEPTION
WHEN condition [ OR condition ... ] THEN
handler_statements
[ WHEN condition [ OR condition ... ] THEN
handler_statements
... ]
END;
If no error occurs, this form of block simply executes all the statements, and then control passes to the next statement after END. But if an error occurs within the statements, further processing of the statements is abandoned, and control passes to the EXCEPTION list. The list is searched for the first condition matching the error that occurred. If a match is found, the corresponding handler_statements are executed, and then control passes to the next statement after END. If no match is found, the error propagates out as though the EXCEPTION clause were not there at all: the error can be caught by an enclosing block with EXCEPTION, or if there is none it aborts processing of the function.
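A minimal sketch of the block form described above, written as an anonymous DO block (available from PostgreSQL 9.0). The variable names are made up for illustration. The inner BEGIN ... EXCEPTION ... END traps the error, and execution continues with the statement after the inner END.

```sql
do $$
declare
    v_denominator integer := 0;
    v_result      integer;
begin
    begin
        -- Raises division_by_zero at run time.
        v_result := 10 / v_denominator;
    exception
        when division_by_zero then
            -- Handler: substitute a fallback value and carry on.
            v_result := -1;
    end;
    raise notice 'result = %', v_result;
end;
$$;
```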

Related

Do Snowflake variables in a UDF need to be in uppercase?

I created a UDF to test this point:
CREATE OR REPLACE function EDW_WEATHER.CHK_READING(p_reading VARCHAR2, P_SENSOR VARCHAR2 )
RETURNS VARCHAR2
LANGUAGE JAVASCRIPT
AS
$$
if (P_SENSOR == 'A') return p_reading;
$$
;
It runs correctly:
select EDW_WEATHER.CHK_READING('A', 'B');
By simply lowercasing the variable P_SENSOR as:
CREATE OR REPLACE function EDW_WEATHER.CHK_READING(p_reading VARCHAR2, p_sensor VARCHAR2 )
RETURNS VARCHAR2
LANGUAGE JAVASCRIPT
AS
$$
if (p_sensor == 'A') return p_reading;
$$
;
I get this when I run the UDF:
100132 (P0000): JavaScript execution error: Uncaught ReferenceError:
p_sensor is not defined in CHK_READING at 'if (p_sensor == 'A')
return p_reading;' position 0
My question is whether Snowflake really requires variables (used in "if" or "case" statements) to be in uppercase, or am I doing something wrong.
So there are two things. One is your session's handling of database object names. By default, unquoted object names are treated as upper case (in other words, case does not matter), and in that context double quotes mean "the case I typed is the case I intend, and white space and other characters are OK in here". This behavior can be changed per session, but that leads to all sorts of trouble with views that use quotes, because the case is respected in some sessions and not in others.
Then there is the place where that naming behavior intersects with UDF code, where case does matter. Snowflake has gone with "the object's true name is what it must be called in the JavaScript".
So if you are accessing a passed-in parameter and you did not double quote its name when you declared it, you will always have to refer to it SHOUTING style. But if you use double quotes on the variable names when declaring the function (in a session where quotes are respected), then you can have lower case variables in your JavaScript, which I show in the second example.
Thus, normally, with this function
CREATE OR REPLACE function EDW_WEATHER.CHK_READING(p_reading VARCHAR2, p_sensor VARCHAR2 )
the inputs are visible only as P_READING and P_SENSOR,
whereas in this case
CREATE OR REPLACE function EDW_WEATHER.CHK_READING("p_reading" VARCHAR2, "p_SeNsOr" VARCHAR2 )
the inputs are visible only as p_reading and p_SeNsOr.
So you could change your function like this:
CREATE OR REPLACE function CHK_READING("p_reading" VARCHAR2, "p_sensor" VARCHAR2 )
RETURNS VARCHAR2
LANGUAGE JAVASCRIPT
AS
$$
if (p_sensor == 'A') return p_reading;
$$
;
and then happiness is true!
select CHK_READING('A', 'B');
Note that the function itself was created as the object CHK_READING, but you can also call it as chk_reading, because in SQL case does not matter by default.
So, the last part, your question:
My question is whether Snowflake really requires variables (used in "if" or "case" statements) to be in uppercase, or am I doing something wrong.
It is not a matter of IFs or CASEs; it applies whenever you use an input variable.
As mentioned by the doc here:
https://docs.snowflake.com/en/developer-guide/udf/javascript/udf-javascript-introduction.html#javascript-arguments-and-returned-values
Note that an unquoted identifier must be referenced with the capitalized variable name.
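For completeness, here is a sketch of the other way to satisfy that rule: keep the arguments unquoted in the declaration and refer to them in upper case inside the JavaScript body. The schema prefix is dropped and VARCHAR is used instead of VARCHAR2 purely for illustration.

```sql
CREATE OR REPLACE FUNCTION CHK_READING(p_reading VARCHAR, p_sensor VARCHAR)
RETURNS VARCHAR
LANGUAGE JAVASCRIPT
AS
$$
  // Unquoted arguments are exposed to JavaScript under their uppercase names.
  if (P_SENSOR == 'A') return P_READING;
  return null;
$$
;

SELECT CHK_READING('reading-1', 'A');
```

Under these assumptions the call should return the first argument when the sensor is 'A' and NULL otherwise.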

Can't use LOOP PostgreSQL

I've been facing an issue since yesterday and I can't understand why my SQL is not working.
This may be a simple error since I'm a beginner in SQL, but I can't find where it is.
Here is what I try to do:
CREATE FUNCTION test() RETURN integer AS $$
BEGIN
FOR i IN 1..5 LOOP
SELECT * from result WHERE id=i;
end loop;
RETURN 1;
END;
$$ LANGUAGE plpgsql;
This is just a simple loop as I can find in the documentation but I have this error:
Error report -
ERROR: syntax error at or near "RETURN" (this is the first RETURN statement in the function)
The database is in PostgreSQL and the version is 9.4.5
Why is it not working?
There are several problems, apart from the fact that the function isn't doing anything useful:
It must be RETURNS integer, not RETURN integer.
That's what causes the error.
The SELECT has no destination. Either add an INTO clause or discard the result with
PERFORM * from result WHERE id=i;
You should indent the code correctly, so that you can read and understand it.
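Putting those points together, a corrected version might look like the sketch below. It assumes a table result with an id column, as in the question. The counter v_found is added here only so the function returns something meaningful; PERFORM sets the special variable FOUND when the discarded query produced at least one row.

```sql
CREATE OR REPLACE FUNCTION test() RETURNS integer AS $$
DECLARE
    v_found integer := 0;  -- hypothetical counter, not in the original function
BEGIN
    FOR i IN 1..5 LOOP
        -- PERFORM runs the query and discards its result.
        PERFORM 1 FROM result WHERE id = i;
        IF FOUND THEN
            v_found := v_found + 1;
        END IF;
    END LOOP;
    RETURN v_found;
END;
$$ LANGUAGE plpgsql;
```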

AnalysisException: Syntax error in line 1: error when taking modulus of a value using abs() in Impala

I want to take the modulus of a value when using Impala and I am aware of the abs() function. When I use this however like such
select abs(value) from table
It returns a value that is rounded to the nearest integer. The documentation found here states that I need to define the numeric_type. I have tried this
select abs(float value) from table
but this gives me the following error
AnalysisException: Syntax error in line 1: ... abs(float value) from table ^ Encountered: FLOAT Expected: ALL, CASE, CAST, DEFAULT, DISTINCT, EXISTS, FALSE, IF, INTERVAL, NOT, NULL, TRUNCATE, TRUE, IDENTIFIER CAUSED BY: Exception: Syntax error
Any ideas how I set abs() to return a float?
This should work: SELECT cast(Abs(-243.5) as float) AS AbsNum
I think you are misunderstanding the syntax. You call the function as abs(val). The return type is the same as the input type. It should work on integers, decimals, and floats.
If you want a particular type being returned, then you need to pass in that type, perhaps casting to the specific type.
The documentation is:
abs(numeric_type a)
Purpose: Returns the absolute value of the argument.
Return type: Same as the input value
Admittedly, this does look like the type should be part of the function call. But it is really using a programming language-style declaration to show the types that are expected.
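A short sketch of what that means in practice: the type goes on the argument, via a cast, not inside the call as a declaration. The table readings and column reading are hypothetical names used only for illustration.

```sql
-- abs() returns the same type it is given, so cast the input first.
SELECT abs(CAST(-243.5 AS DOUBLE)) AS abs_double;

-- Same idea applied to a column (hypothetical table and column names).
SELECT abs(CAST(reading AS DOUBLE)) AS abs_reading
FROM readings;
```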

postgresql send variables to a function, casting?

In one place I have
CREATE FUNCTION updateGeo2(text, float4, float4) RETURNS float AS $$
followed later by
SELECT updateGeo2('area', 40.88, -90.56);
and I get
error : ERROR: function updategeo2(unknown, numeric, numeric) does not exist
so it doesn't know that I tried to pass in a text variable, followed by a float variable and another float variable, it sees these as "unknown, numeric and numeric", lame. How do I let it know the types I am passing in?
Try it this way:
SELECT updateGeo2('area', (40.88)::float4, (-90.56)::float4);
Clarify misunderstanding
First of all, this should work as is, without type cast. I tested with PostgreSQL 9.1, 9.2 and also with 8.4.15. You must be running an earlier point-release or there is some other misunderstanding (like wrong search_path). Your information is misleading.
Except for ad-hoc calls, you should always add explicit type casts anyway to disambiguate. PostgreSQL allows function overloading. If another function should be created with the signature:
CREATE FUNCTION updateGeo2(text, numeric, numeric) RETURNS text AS $$ ..
... then it would take precedence over the other one due to the default type numeric for numeric literals. Existing code might break.
If, on the other hand, you add a function:
CREATE FUNCTION updateGeo2(char(5), numeric, numeric) RETURNS text AS $$ ..
Then Postgres does not know what to do any more and throws an exception:
ERROR: function updategeo2(unknown, numeric, numeric) is not unique
Proper syntax
SELECT updateGeo2('area', '40.88'::float4, '-90.56'::float4);
Or, more verbose in standard SQL:
SELECT updateGeo2('area', cast('40.88' AS float4), cast('-90.56' AS float4));
Or, if you really wanted to avoid single quotes (and colons):
SELECT updateGeo2('area', float4 '40.88', float4 '-90.56');
This way you cast a numeric literal to data type float4 (= real) directly.
More about type casting in the manual.
(40.88)::float4 works, too, but is subtly less efficient. First, 40.88 is taken to be of type numeric (the default type for this numeric literal containing a dot). Then the value is cast to float4. That makes two type casts.
More about numeric constants in the manual.
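One way to see how PostgreSQL initially types each of these literals is pg_typeof (available since 8.4). This is only an illustrative check, not part of the original answers; the column aliases are made up.

```sql
SELECT pg_typeof(40.88)           AS bare_literal,    -- numeric
       pg_typeof('40.88'::float4) AS cast_string,     -- real
       pg_typeof(float4 '40.88')  AS typed_literal,   -- real
       pg_typeof('area')          AS quoted_literal;  -- unknown
```

This matches the error message in the question: the untyped call is resolved as (unknown, numeric, numeric) before PostgreSQL looks for a matching function.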

IF function in PostgreSQL as in MySQL

I am trying to replicate the IF function from MySQL into PostgreSQL.
The syntax of IF function is IF(condition, return_if_true, return_if_false)
I created the following function:
CREATE OR REPLACE FUNCTION if(boolean, anyelement, anyelement)
RETURNS anyelement AS $$
BEGIN
CASE WHEN ($1) THEN
RETURN ($2);
ELSE
RETURN ($3);
END CASE;
EXCEPTION WHEN division_by_zero THEN
RETURN ($3);
END;
$$ LANGUAGE plpgsql;
It works well with most of the things like if(2>1, 2, 1) but it raises an error for:
if( 5/0 > 0, 5, 0)
fatal error division by zero.
In my program I can't check the denominator as the condition is provided by user.
Is there any way around? Maybe if we can replace first parameter from boolean to something else, as in that case the function will work as it will raise and return the exception.
PostgreSQL is following the standard
This behaviour appears to be specified by the SQL standard. This is the first time I've seen a case where it's a real problem, though; you usually just use a CASE expression or a PL/PgSQL BEGIN ... EXCEPTION block to handle it.
MySQL's default behaviour is dangerous and wrong. It only works that way to support older code that relies on this behaviour. It has been fixed in newer versions when strict mode is active (which it absolutely always should be) but unfortunately has not yet been made the default. When using MySQL, always enable STRICT_TRANS_TABLES or STRICT_ALL_TABLES.
ANSI-standard zero division is a pain sometimes, but it'll also protect against mistakes causing data loss.
SQL injection warning, consider re-design
If you're executing expressions from the user then you quite likely have SQL injection problems. Depending on your security requirements you might be able to live with that, but it's pretty bad if you don't totally trust all your users. Remember, your users could be tricked into entering the malicious code from elsewhere.
Consider re-designing to expose an expression builder to the user and use a query builder to create the SQL from the user expressions. This would be much more complicated, but secure.
If you can't do that, see if you can parse the expressions the user enters into an abstract syntax, validate it before execution, and then produce new SQL expressions based on the parsed expression. That way you can at least limit what they can write, so they don't slip any nasties into the expression. You can also rewrite the expression to add things like checks for zero division. Finding (or writing) parsers for algebraic expressions isn't likely to be hard, but it'll depend on what kinds of expressions you need to let users write.
At minimum, the app needs to be using a role ("user") that has only SELECT privileges on the tables, is not a superuser, and does not own the tables. That'll minimise the harm any SQL injection will cause.
CASE won't solve this problem as written
In any case, because you currently don't validate and can't inspect the expression from the user, you can't use the SQL-standard CASE statement to solve this. For if( a/b > 0, a, b) you'd usually write something like:
CASE
WHEN b = 0 THEN b
ELSE CASE
WHEN a/b > 0 THEN a
ELSE b
END
END
This explicitly handles the zero denominator case, but is only possible when you can break the expression up.
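When the expression can be broken up like this, a more compact alternative is the NULLIF idiom (a different technique from the nested CASE above, shown only as a sketch): NULLIF(b, 0) turns a zero denominator into NULL, the division then yields NULL, and a NULL condition falls through to the ELSE branch. The inline VALUES list is sample data made up for illustration.

```sql
SELECT a, b,
       CASE WHEN a / NULLIF(b, 0) > 0 THEN a ELSE b END AS result
FROM (VALUES (6, 3), (5, 0), (-4, 2)) AS t(a, b);
-- (6, 3)  -> 6   because 6/3 > 0
-- (5, 0)  -> 0   because 5/NULL is NULL, so ELSE b
-- (-4, 2) -> 2   because -4/2 is not > 0, so ELSE b
```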
Ugly workaround #1
An alternative solution would be to get Pg to return a placeholder instead of raising an exception for division by zero by defining a replacement division operator or function. This will only solve the divide-by-zero case, not others.
I wanted to return 'NaN' as that's the logical result. Unfortunately, 'NaN' is greater than numbers, not less than, and you want a less-than or false-like result.
regress=# SELECT NUMERIC 'NaN' > 0;
?column?
----------
t
(1 row)
This means we have to use the icky hack of returning NULL instead:
CREATE OR REPLACE FUNCTION div_null_on_zero(numeric,numeric) returns numeric AS $$
VALUES (CASE WHEN $2 = 0 THEN NULL ELSE $1/$2 END)
$$ LANGUAGE sql IMMUTABLE;
CREATE OPERATOR #/# (
PROCEDURE = div_null_on_zero(numeric,numeric),
LEFTARG = numeric,
RIGHTARG = numeric
);
with usage:
regress=# SELECT 5 #/# 0, 5 #/# 0>0, CASE WHEN 5 #/# 0 > 0 THEN 5 ELSE 0 END;
?column? | ?column? | case
----------+----------+------
| | 0
(1 row)
Your app can rewrite '/' in incoming expressions into #/# or whatever operator name you choose pretty easily.
There's one pretty critical problem with this approach, and that's that #/# will have different precedence to / so expressions without explicit parentheses may not be evaluated as you expect. You might be able to get around this by creating a new schema, defining an operator named / in that schema that does your null-on-error trick, and then adding that schema to your search_path before executing user expressions. It's probably a bad idea, though.
Ugly workaround #2
Since you can't inspect the denominator, all I can think of is to wrap the whole thing in a DO block (Pg 9.0+) or PL/PgSQL function and catch any exceptions from the evaluation of the expression.
Erwin's answer provides a better example of this than I did, so I've removed this. In any case, this is an awful and dangerous thing to do, do not do it. Your app needs to be fixed.
With a boolean argument, a division by zero will always throw an exception (and that's a good thing), before your function is even called. There is nothing you can do about it. It's already happened.
CREATE OR REPLACE FUNCTION if(boolean, anyelement, anyelement)
RETURNS anyelement LANGUAGE SQL AS
$func$
SELECT CASE WHEN $1 THEN $2 ELSE $3 END
$func$;
I would strongly advise against a function named if to begin with. IF is a keyword in PL/pgSQL. If you use user defined functions written in PL/pgSQL this will be very confusing.
Just use the standard SQL expression CASE directly.
The alternative would be to take a text argument and evaluate it with dynamic SQL.
Proof of concept
What you ask for would work like this:
CREATE OR REPLACE FUNCTION f_if(_expr text
, _true anyelement
, _else anyelement
, OUT result anyelement)
RETURNS anyelement LANGUAGE plpgsql AS
$func$
BEGIN
EXECUTE '
SELECT CASE WHEN (' || _expr || ') THEN $1 ELSE $2 END' -- !! dangerous !!
USING _true, _else
INTO result;
EXCEPTION WHEN division_by_zero THEN
result := _else;
-- possibly catch more types of exceptions ...
END
$func$;
Test:
SELECT f_if('TRUE' , 1, 2) --> 1
,f_if('FALSE' , 1, 2) --> 2
,f_if('NULL' , 1, 2) --> 2
,f_if('1/0 > 0', 1, 2); --> 2
This is a big security hazard in the hands of untrusted users. Read @Craig's answer about making this more secure.
However, I fail to see how it can be made bulletproof and would never use it.