Return two values from a scalar SQL function

I have a scalar SQL function that returns a decimal value, and this function is used in many stored procedures across my database. Now, in some procedures, I need to set a value based on some criteria inside the function. To make it clearer: depending on some variables used in calculating the result of the function, I want to set another variable inside the stored procedure and return it to the client.
I don't want to change how the result is returned or the return type of the function. I am thinking of doing it by inserting the new value I want into a SQL table and then reading it from the procedure, but is there another or better way to do it?
Thanks

No, you cannot. Functions are severely limited in SQL Server and do not allow any side effects.
What you can do, however, is convert your scalar function into a table function. In it, you can return a table with as many columns as you need, so returning more than one value is not a problem.

You have a couple of options:
1) Change it from a function to a stored procedure, and add an output parameter.
2) Change it from a scalar function to a table-valued function returning a single row, with the additional value as an additional column.
If you need to preserve the existing function signature, then just create a new table-valued function that does the work (as per option 2 above), and modify your existing function to select from the new table-valued function.
Here is some example code demonstrating this:
-- the original scalar function
CREATE FUNCTION dbo.t1(@param1 INT)
RETURNS INT AS
BEGIN
RETURN @param1 + 1
END
GO
-- a new table valued function, that returns 2 values in a single row
CREATE FUNCTION dbo.t2(@param1 INT)
RETURNS TABLE AS
RETURN (SELECT @param1 + 1 AS [r1], @param1 + 2 AS [r2])
GO
-- the modified original function, now selecting from the new table valued function
CREATE FUNCTION dbo.t3(@param1 INT)
RETURNS INT AS
BEGIN
RETURN (SELECT r1 FROM dbo.t2(@param1))
END
GO
-- example usage
SELECT dbo.t1(1)
SELECT * FROM dbo.t2(1)
SELECT dbo.t3(1)
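Option 1 (a stored procedure with output parameters) is not shown in the code above, so here is a minimal T-SQL sketch of it. The procedure name dbo.p1 and the variable names are hypothetical; the arithmetic simply mirrors dbo.t2.

```sql
-- Hypothetical procedure dbo.p1: returns two values through OUTPUT parameters.
CREATE PROCEDURE dbo.p1
    @param1 INT,
    @r1 INT OUTPUT,
    @r2 INT OUTPUT
AS
BEGIN
    SET NOCOUNT ON;
    SET @r1 = @param1 + 1;  -- the value the scalar function used to return
    SET @r2 = @param1 + 2;  -- the additional value
END
GO
-- example usage: the caller must also mark the arguments as OUTPUT
DECLARE @a INT, @b INT;
EXEC dbo.p1 @param1 = 1, @r1 = @a OUTPUT, @r2 = @b OUTPUT;
SELECT @a AS r1, @b AS r2;
```

Unlike a function, a procedure cannot be called inline from a SELECT, so this only fits callers that can run an EXEC statement.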

Table-valued functions that return a single row are my favorite technique when a single answer from a scalar function just isn't adequate (or slows the query too much). A table can have from zero to many rows. Once I realized a 'table' valued function can be limited to returning only one row, it became obvious that multiple questions that would otherwise require separate scalar functions can be answered by a single table-valued function. It's like a scalar function on steroids. I like to read all the needed data just once into an internal table variable, then manipulate that data, assigning it to additional variables as needed, and finally assemble the answers into the output 'table' of one record. My database environment is read-only, not transaction-based. This is incredibly useful for large (multi-TB) historical databases, such as medical information. I frequently use it to concatenate fields into an end-user-friendly 'sentence' to deal with data that can have zero to many values, like patient diagnoses. OUTER APPLY the table-valued function on filtered data and it is extremely efficient.
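As a rough T-SQL illustration of the single-row pattern, the sketch below reuses dbo.t2 from the earlier answer. The table dbo.Orders and its OrderId column are hypothetical stand-ins for the filtered data.

```sql
-- 1) Inside a stored procedure: capture both values with a single call.
DECLARE @r1 INT, @r2 INT;
SELECT @r1 = r1, @r2 = r2
FROM dbo.t2(1);

-- 2) Per row: OUTER APPLY invokes the function for each row of the outer table
--    and keeps the outer row even if the function returns no row.
--    dbo.Orders(OrderId INT) is an assumed table, not part of the original post.
SELECT o.OrderId, f.r1, f.r2
FROM dbo.Orders AS o
OUTER APPLY dbo.t2(o.OrderId) AS f;
```

The first form is the one that maps most directly onto the original question: the procedure gets the existing result and the extra value without any side effects.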

Related

Best way to write a udf in vertica, where I need to refer to data from one of the rate table and write a formula on top

I am planning to write a UDF that returns a new value based on the rate set up for a specific date in a table, which means I need to write a query inside the UDF.
1. Is this recommended? There are not many examples of a UDF that refers to a table.
2. What are the other ways to solve this, given that a Vertica procedural function does not allow querying within the function the way PL/SQL does?
The requirement is not clear, but a UDF can be used. Below is the pseudocode:
CREATE FUNCTION updateRate(x DATE) RETURN INT
AS BEGIN
RETURN (<your logic on updating rate> );
END;
And then call the function in an update query:
=>Update Mytable set rate=updateRate(colDate);

Declaring and assigning values to variables in PostgreSQL

First of all, I'm a total beginner in SQL. I have a table with 50+ columns, and now I'm doing calculations (on a temp table I created), but some formulas have parameters, for example A = 3:
(A*(Column5 + Column7))/2
So, what is the best way to assign a value to a parameter?
This is what I was thinking about
DECLARE A DOUBLE PRECISION:=3;
But I don't know how to implement it.
The WITH option essentially creates a temp table that you can reference in a SQL statement within the same transaction.
Your best bet is to create a function and then pass it the value of the parameter at run time, e.g.:
CREATE FUNCTION addColumns(
A integer,
firstColumn integer,
secondColumn integer
)
RETURNS integer
AS $$
-- parameters are referenced by name (PostgreSQL 9.2+); use $1, $2, $3 on older versions
SELECT (A*(firstColumn + secondColumn))/2
$$
LANGUAGE SQL
IMMUTABLE;
Then use this in your query like:
select addColumns(3, column5, column7)
from [table];
As I understand it, you want to store values using variables.
This is already answered here: How to declare a variable in a PostgreSQL query
There are many solutions there, but I particularly like using a WITH clause, as pointed out in one of the answers, when using plain SQL. For fancier things, you should write proper stored procedures.
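The WITH approach mentioned above might look like the following PostgreSQL sketch. The table name my_table is hypothetical, and column5 and column7 are taken from the formula in the question; the parameter lives in a one-row CTE that is cross joined to the data.

```sql
-- One-row CTE acting as a set of named parameters (a = 3, as in the question).
WITH params AS (
    SELECT 3::double precision AS a
)
SELECT (p.a * (t.column5 + t.column7)) / 2 AS result
FROM my_table AS t          -- my_table is an assumed name
CROSS JOIN params AS p;     -- params has exactly one row, so no rows are duplicated
```

Adding another parameter is then a matter of adding another column to the params CTE.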

How to pass a set of rows from one function into another?

Overview
I'm using PostgreSQL 9.1.14, and I'm trying to pass the results of a function into another function. The general idea (specifics, with a minimal example, follow) is that we can write:
select * from (select * from foo ...)
and we can abstract the sub-select away in a function and select from it:
create function foos()
returns setof foo
language sql as $$
select * from foo ...
$$;
select * from foos()
Is there some way to abstract one level farther, so as to be able to do something like this (I know functions cannot actually have arguments with setof types):
create function more_foos( some_foos setof foo )
language sql as $$
select * from some_foos ... -- or unnest(some_foos), or ???
$$;
select * from more_foos(foos())
Minimal Example and Attempted Workarounds
I'm using PostgreSQL 9.1.14. Here's a minimal example:
-- 1. create a table x with three rows
drop table if exists x cascade;
create table if not exists x (id int, name text);
insert into x values (1,'a'), (2,'b'), (3,'c');
-- 2. xs() is a function with type `setof x`
create or replace function xs()
returns setof x
language sql as $$
select * from x
$$;
-- 3. xxs() should return the context of x, too
-- Ideally the argument would be a `setof x`,
-- but that's not allowed (see below).
create or replace function xxs(x[])
returns setof x
language sql as $$
select x.* from x
join unnest($1) y
on x.id = y.id
$$;
When I load up this code, I get the expected output for the table definitions, and I can call and select from xs() as I'd expect. But when I try to pass the result of xs() to xxs(), I get an error that "function xxs(x) does not exist":
db=> \i test.sql
DROP TABLE
CREATE TABLE
INSERT 0 3
CREATE FUNCTION
CREATE FUNCTION
db=> select * from xs();
1 | a
2 | b
3 | c
db=> select * from xxs(xs());
ERROR: function xxs(x) does not exist
LINE 1: select * from xxs(xs());
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
I'm a bit confused about "function xxs(x) does not exist"; since the return type of xs() was setof x, I'd expected that the argument type would be setof x (or maybe x[]), not x. Following the complaints about the type, I can get to either of the following, but while with either definition I can select xxs(xs());, I can't select * from xxs(xs());.
create or replace function xxs( x )
returns setof x
language sql as $$
select x.* from x
join unnest(array[$1]) y -- unnest(array[...]) seems pretty bad
on x.id = y.id
$$;
create or replace function xxs( x )
returns setof x
language sql as $$
select * from x
where x.id in ($1.id)
$$;
db=> select xxs(xs());
(1,a)
(2,b)
(3,c)
db=> select * from xxs(xs());
ERROR: set-valued function called in context that cannot accept a set
Summary
What's the right way to pass the results of a set-returning function into another function?
(I have noted that create function … xxs( setof x ) … results in the error: ERROR: functions cannot accept set arguments, so the answer won't literally be passing a set of rows from one function to another.)
Table functions
I perform very high speed, complex database migrations for a living, using SQL as both the client and server language (no other language is used), all running server side, where the code rarely surfaces from the database engine. Table functions play a HUGE role in my work. I don't use "cursors" since they are too slow to meet my performance requirements, and everything I do is result set oriented. Table functions have been an immense help to me in completely eliminating use of cursors, achieving very high speed, and have contributed dramatically towards reducing code volume and improving simplicity.
In short, you use a query that references two (or more) table functions to pass the data from one table function to the next. The select query result set that calls the table functions serves as the conduit to pass the data from one table function to the next. On the DB2 platform / version I work on, and it appears based on a quick look at the 9.1 Postgres manual that the same is true there, you can only pass a single row of column values as input to any of the table function calls, as you've discovered. However, because the table function call happens in the middle of a query's result set processing, you achieve the same effect of passing a whole result set to each table function call, albeit, in the database engine plumbing, the data is passed only one row at a time to each table function.
Table functions accept one row of input columns, and return a single result set back into the calling query (i.e. select) that called the function. The result set columns passed back from a table function become part of the calling query's result set, and are therefore available as input to the next table function, referenced later in the same query, typically as a subsequent join. The first table function's result columns are fed as input (one row at a time) to the second table function, which returns its result set columns into the calling query's result set. Both the first and second table function result set columns are now part of the calling query's result set, and are now available as input (one row at a time) to a third table function. Each table function call widens the calling query's result set via the columns it returns. This can go on and on until you start hitting limits on the width of a result set, which likely varies from one database engine to the next.
Consider this example (which may not match Postgres' syntax requirements or capabilities as I work on DB2). This is one of many design patterns in which I use table functions, is one of the simpler ones, that I think is very illustrative, and one that I anticipate would have broad appeal if table functions were in heavy mainstream use (to my knowledge they are not, but I think they deserve more attention than they are getting).
In this example, the table functions in use are: VALIDATE_TODAYS_ORDER_BATCH, POST_TODAYS_ORDER_BATCH, and DATA_WAREHOUSE_TODAYS_ORDER_BATCH. On the DB2 version I work on, you wrap the table function inside "TABLE( place table function call and parameters here )", but based on quick look at a Postgres manual it appears you omit the "TABLE( )" wrapper.
create table TODAYS_ORDER_PROCESSING_EXCEPTIONS as (
select TODAYS_ORDER_BATCH.*
,VALIDATION_RESULT.ROW_VALID
,POST_RESULT.ROW_POSTED
,WAREHOUSE_RESULT.ROW_WAREHOUSED
from TODAYS_ORDER_BATCH
cross join VALIDATE_TODAYS_ORDER_BATCH ( ORDER_NUMBER, [either pass the remainder of the order columns or fetch them in the function] )
as VALIDATION_RESULT ( ROW_VALID ) --example: 1/0 true/false Boolean returned
left join POST_TODAYS_ORDER_BATCH ( ORDER_NUMBER, [either pass the remainder of the order columns or fetch them in the function] )
as POST_RESULT ( ROW_POSTED ) --example: 1/0 true/false Boolean returned
on ROW_VALID = '1'
left join DATA_WAREHOUSE_TODAYS_ORDER_BATCH ( ORDER_NUMBER, [either pass the remainder of the order columns or fetch them in the function] )
as WAREHOUSE_RESULT ( ROW_WAREHOUSED ) --example: 1/0 true/false Boolean returned
on ROW_POSTED = '1'
where coalesce( ROW_VALID, '0' ) = '0' --Capture only exceptions and unprocessed work.
or coalesce( ROW_POSTED, '0' ) = '0' --Or, you can flip the logic to capture only successful rows.
or coalesce( ROW_WAREHOUSED, '0' ) = '0'
) with data
If table TODAYS_ORDER_BATCH contains 1,000,000 rows, then VALIDATE_TODAYS_ORDER_BATCH will be called 1,000,000 times, once for each row.
If 900,000 rows pass validation inside VALIDATE_TODAYS_ORDER_BATCH, then POST_TODAYS_ORDER_BATCH will be called 900,000 times.
If only 850,000 rows successfully post, then VALIDATE_TODAYS_ORDER_BATCH needs some loopholes closed LOL, and DATA_WAREHOUSE_TODAYS_ORDER_BATCH will be called 850,000 times.
If 850,000 rows successfully made it into the Data Warehouse (i.e. no additional exceptions were generated), then table TODAYS_ORDER_PROCESSING_EXCEPTIONS will be populated with 1,000,000 - 850,000 = 150,000 exception rows.
The table function calls in this example are only returning a single column, but they could be returning many columns. For example, the table function validating an order row could return the reason why an order failed validation.
In this design, virtually all the chatter between a HLL and the database is eliminated, since the HLL requestor is asking the database to process the whole batch in ONE request. This results in a reduction of millions of SQL requests to the database, in a HUGE removal of millions of HLL procedure or method calls, and as a result provides a HUGE runtime improvement. In contrast, legacy code which often processes a single row at a time, would typically send 1,000,000 fetch SQL requests, 1 for each row in TODAYS_ORDER_BATCH, plus at least 1,000,000 HLL and/or SQL requests for validation purposes, plus at least 1,000,000 HLL and/or SQL requests for posting purposes, plus 1,000,000 HLL and/or SQL requests for sending the order to the data warehouse. Granted, using this table function design, inside the table functions SQL requests are being sent to the database, but when the database makes requests to itself (i.e., from inside a table function), the SQL requests are serviced much faster (especially in comparison to a legacy scenario where the HLL requestor is doing single row processing from a remote system, with the worst case over a WAN - OMG please don't do that).
You can easily run into performance problems if you use a table function to "fetch a result set" and then join that result set to other tables. In that case, the SQL optimizer can't predict what set of rows will be returned from the table function, and therefore it can't optimize the join to subsequent tables. For that reason, I rarely use them for fetching a result set, unless I know that result set will be a very small number of rows, hence not causing a performance problem, or I don't need to join to subsequent tables.
In my opinion, one reason why table functions are underutilized is that they are often perceived as only a tool to fetch a result set, which often performs poorly, so they get written off as a "poor" tool to use.
Table functions are immensely useful for pushing more functionality over to the server, for eliminating most of the chatter between the database server and programs on remote systems, and even for eliminating chatter between the database server and external programs on the same server. Even chatter between programs on the same server carries more overhead than many people realize, and much of it is unnecessary. The heart of the power of table functions lies in using them to perform actions inside result set processing.
There are more advanced design patterns for using table functions that build on the above pattern, where you can maximize result set processing even further, but this post is a lot for most to absorb already.
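For the PostgreSQL functions in the question, a different technique from the row-at-a-time pattern above is to collapse the set into an array before the call, so that it matches the xxs(x[]) signature the question started with. This is only a sketch: it assumes the original xxs(x[]) definition is the one in place, and it has not been checked against 9.1.14 specifically.

```sql
-- Aggregate the rows returned by xs() into a single x[] value,
-- then pass that array to xxs(x[]), which unnests it again internally.
select *
from xxs( (select array_agg(t) from xs() as t) );
```

The extra parentheses make the inner select a scalar subquery, so xxs receives one array argument rather than a set. The cost is that the whole set is materialized as an array, which may matter for large results.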

using scalar function output in an update

Suppose I have a 2 column table with columns (arg, value) and a user-defined function foo.
Is there a way to have an update query that goes through the table and calls foo with argument arg and sticks the results in column value for every row in the table?
Assuming SQL Server, the syntax is:
Update YourTable
SET value = dbo.foo(arg)
However, it is often more efficient not to use scalar UDFs for row-by-row processing. What is the scalar UDF doing?
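If the logic in foo can be written as a single expression, one common alternative in SQL Server is an inline table-valued function applied with CROSS APPLY. The sketch below is hypothetical: dbo.foo_tvf, its result column, and the placeholder arithmetic are assumptions standing in for whatever foo actually does.

```sql
-- Hypothetical inline table-valued replacement for dbo.foo.
CREATE FUNCTION dbo.foo_tvf(@arg INT)
RETURNS TABLE AS
RETURN (SELECT @arg * 2 AS result);  -- placeholder logic, not the real foo
GO
-- Set-based update: the function body is expanded into the query plan
-- instead of being invoked once per row like a scalar UDF.
UPDATE t
SET t.value = f.result
FROM YourTable AS t
CROSS APPLY dbo.foo_tvf(t.arg) AS f;
```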

integer Max value constants in SQL Server T-SQL?

Are there any constants in T-SQL like there are in some other languages that provide the max and min values ranges of data types such as int?
I have a code table where each row has an upper and lower range column, and I need an entry that represents a range where the upper range is the maximum value an int can hold (sort of like a hackish infinity). I would prefer not to hard-code it and instead use something like SET UpperRange = int.Max
There are two options:
user-defined scalar function
properties table
In Oracle, you can do it within Packages - the closest SQL Server has is Assemblies...
I don't think there are any defined constants but you could define them yourself by storing the values in a table or by using a scalar valued function.
Table
Set up a table that has three columns: TypeName, Max and Min. That way you only have to populate them once.
Scalar Valued Function
Alternatively, you could use scalar-valued functions, GetMaxInt() for example (see this StackOverflow answer for a real example).
You can find all the max/min values here: http://msdn.microsoft.com/en-us/library/ms187752.aspx
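A GetMaxInt() function along these lines could be as small as the T-SQL sketch below. The body is an assumption rather than the code from the linked answer; the literal is the documented upper bound of the int type, and dbo.CodeRange and its Code column are hypothetical.

```sql
-- Returns the largest value an INT can hold (2^31 - 1).
CREATE FUNCTION dbo.GetMaxInt()
RETURNS INT AS
BEGIN
    RETURN 2147483647;
END
GO
-- example usage against a hypothetical code table
UPDATE dbo.CodeRange
SET UpperRange = dbo.GetMaxInt()
WHERE Code = 'OPEN_ENDED';
```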
Avoid Scalar-Functions like the plague:
Scalar UDF Performance Problem
That being said, I wouldn't use the 3-Column table another person suggested.
This would cause implicit conversions just about everywhere you'd use it.
You'd also have to join to the table multiple times if you needed to use it for more than one type.
Instead, have a column for each Min and Max of each data type (defined using its own data type) and call those directly to compare to.
Example:
SELECT *
FROM SomeTable as ST
CROSS JOIN TypeRange as TR
WHERE ST.MyNumber BETWEEN TR.IntMin AND TR.IntMax
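The TypeRange table used in that query is not defined in the answer; a minimal T-SQL sketch of what it might look like follows. Only IntMin and IntMax appear in the original query, so the other column names are assumptions. The values are the documented bounds of each SQL Server integer type.

```sql
-- One row, one typed column per bound, so comparisons need no implicit conversion.
CREATE TABLE dbo.TypeRange (
    TinyIntMin  TINYINT  NOT NULL,
    TinyIntMax  TINYINT  NOT NULL,
    SmallIntMin SMALLINT NOT NULL,
    SmallIntMax SMALLINT NOT NULL,
    IntMin      INT      NOT NULL,
    IntMax      INT      NOT NULL,
    BigIntMin   BIGINT   NOT NULL,
    BigIntMax   BIGINT   NOT NULL
);

INSERT INTO dbo.TypeRange
    (TinyIntMin, TinyIntMax, SmallIntMin, SmallIntMax,
     IntMin, IntMax, BigIntMin, BigIntMax)
VALUES
    (0, 255, -32768, 32767,
     -2147483648, 2147483647,
     -9223372036854775808, 9223372036854775807);
```

Because the table holds exactly one row, the CROSS JOIN in the query above does not multiply the rows of SomeTable.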