I have written a couple of aggregate functions in CLR:
abj.median
abj.percentile
I've hit a bit of an interesting issue. The two functions are very similar in structure; they differ only slightly in how the result is calculated, and PERCENTILE requires two parameters while MEDIAN requires only one.
The parameter common to both functions is the field name. The percentile function also takes a value that determines which percentile to compute (10, 75, 90, etc.).
This command works fine:
;
WITH p1 AS (
SELECT WAITTIMES_DAY / 7.0 AS waitWeeks,
abj.fyq(surg_sx_date) as fiscalYear,
SURG_SITE_ZONE
FROM dbo.Surgery
)
SELECT *
FROM p1 p
PIVOT (abj.median(waitweeks)
FOR fiscalYear IN ( [2013/14-Q1], [2013/14-Q2], [2013/14-Q3], [2013/14-Q4] )) b
This command fails with the error: Incorrect syntax near '90'. Expecting '.', ID, or QUOTED_ID.
;
WITH p1 AS (
SELECT WAITTIMES_DAY / 7.0 AS waitWeeks,
abj.fyq(surg_sx_date) as fiscalYear,
SURG_SITE_ZONE
FROM dbo.Surgery
)
SELECT *
FROM p1 p
PIVOT (abj.percentile(waitweeks,90)
FOR fiscalYear IN ( [2013/14-Q1], [2013/14-Q2], [2013/14-Q3], [2013/14-Q4] )) b
Has anybody run into this weirdness before, and how did you fix it (other than breaking down and writing the PERCENTILE function with only one parameter, with the percentile value defaulting to 90)?
Thanks
Sven
When you use CLR functions, you use them as they were written. If the function abj.median was written to take only one parameter, it works that way and only that way. If you want median to accept two parameters, you need to ask the developer of the function to rewrite it for you.
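The error message itself suggests that, inside PIVOT, the aggregate's arguments are parsed as column names only, so a literal such as 90 is rejected. One possible workaround, sketched below, is to skip PIVOT and use conditional aggregation with GROUP BY. This is only a sketch: it assumes that abj.percentile ignores NULL inputs, which the thread does not confirm, so check the CLR code before relying on it.

```sql
-- Sketch only: assumes abj.percentile skips NULL values (unconfirmed).
WITH p1 AS (
    SELECT WAITTIMES_DAY / 7.0   AS waitWeeks,
           abj.fyq(surg_sx_date) AS fiscalYear,
           SURG_SITE_ZONE
    FROM dbo.Surgery
)
SELECT SURG_SITE_ZONE,
       -- Each CASE yields waitWeeks for one quarter and NULL for all other rows.
       abj.percentile(CASE WHEN fiscalYear = '2013/14-Q1' THEN waitWeeks END, 90) AS [2013/14-Q1],
       abj.percentile(CASE WHEN fiscalYear = '2013/14-Q2' THEN waitWeeks END, 90) AS [2013/14-Q2],
       abj.percentile(CASE WHEN fiscalYear = '2013/14-Q3' THEN waitWeeks END, 90) AS [2013/14-Q3],
       abj.percentile(CASE WHEN fiscalYear = '2013/14-Q4' THEN waitWeeks END, 90) AS [2013/14-Q4]
FROM p1
GROUP BY SURG_SITE_ZONE;
```

Grouping by SURG_SITE_ZONE mirrors what PIVOT does implicitly in the original query, where SURG_SITE_ZONE is the only column left over after the value and pivot columns.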
As an example below, I am trying to figure out how I can use INTO (I have out parameters defined in a procedure I have made) when I am doing a SELECT statement that involves AS:
SELECT name, COUNT(addresses) AS TotalAddresses INTO outVar1, SUM(salary + tax) AS test INTO outVar2
FROM ...
Unfortunately, the compiler does not like this. I have tried searching online, but with no luck.
Use one into clause:
SELECT COUNT(addresses) AS TotalAddresses, SUM(salary + tax) AS test
INTO outVar1, outVar2
FROM ...
The results are going into variables. There is no need to select NAME if it is not going to a variable.
I am trying to call/convert a numeric variable into string inside a user-defined function. I was thinking about using to_char, but it didn't pass.
My function is like this:
create or replace function ntile_loop(x numeric)
returns setof numeric as
$$
select
max("billed") as _____(to_char($1,'99')||"%"???) from
(select "billed", "id","cm",ntile(100)
over (partition by "id","cm" order by "billed")
as "percentile" from "table_all") where "percentile"=$1
group by "id","cm","percentile";
$$
language sql;
My purpose is to define a new variable "x%" as its name, with x varying as the function input. In context, x is numeric and will be called again later in the function as a numeric (this part of code wasn't included in the sample above).
What I want to return:
I simply want to return a block of code so that every time I change the percentile number, I don't have to run this block of code again and again. I'd like to calculate 5, 10, 20, 30, ....90th percentile and display all of them in the same table for each id+cm group.
That's why I was thinking about macro or function, but didn't find any solutions I like.
Thank you for your answers. Yes, I will definitely read up on the basics while I am learning. Today is only my second day using SQL, but I have to generate some results immediately.
Converting numeric to text is the least of your problems.
My purpose is to define a new variable "x%" as its name, with x
varying as the function input.
First of all: there are no variables in an SQL function. SQL functions are just wrappers for valid SQL statements. Input and output parameters can be named, but names are static, not dynamic.
You may be thinking of a PL/pgSQL function, where you have procedural elements including variables. Parameter names are still static, though. There are no dynamic variable names in plpgsql. You can execute dynamic SQL with EXECUTE but that's something different entirely.
While it is possible to declare a static variable with a name like "123%" it is really exceptionally uncommon to do so. Maybe for deliberately obfuscating code? Other than that: Don't. Use proper, simple, legal, lower case variable names without the need to double-quote and without the potential to do something unexpected after a typo.
Since the window function ntile() returns integer and you run an equality check on the result, the input parameter should be integer, not numeric.
To assign a variable in plpgsql you can use the assignment operator := for a single variable or SELECT INTO for any number of variables. Either way, you want the query to return a single row or you have to loop.
If you want the maximum billed from the chosen percentile, you don't GROUP BY x, y. That might return multiple rows and does not do what you seem to want. Use plain max(billed) without GROUP BY to get a single row.
You don't need to double quote perfectly legal column names.
A valid function might look like this. It's not exactly what you were trying to do, which cannot be done. But it may get you closer to what you actually need.
CREATE OR REPLACE FUNCTION ntile_loop(x integer)
  RETURNS SETOF numeric AS
$func$
DECLARE
   myvar text;
BEGIN
   SELECT INTO myvar max(billed)
   FROM  (
      SELECT billed, id, cm
            ,ntile(100) OVER (PARTITION BY id, cm ORDER BY billed) AS tile
      FROM   table_all
      ) sub
   WHERE  sub.tile = $1;

   -- do something with myvar, depending on the value of $1 ...
END
$func$ LANGUAGE plpgsql;
Long story short, you need to study the basics before you try to create sophisticated functions.
Plain SQL
After Q update:
I'd like to calculate 5, 10, 20, 30, ....90th percentile and display
all of them in the same table for each id+cm group.
This simple query should do it all:
SELECT id, cm, tile, max(billed) AS max_billed
FROM (
SELECT billed, id, cm
,ntile(100) OVER (PARTITION BY id, cm ORDER BY billed) AS tile
FROM table_all
) sub
WHERE (tile%10 = 0 OR tile = 5)
AND tile <= 90
GROUP BY 1,2,3
ORDER BY 1,2,3;
% is the modulo operator.
GROUP BY 1,2,3 uses positional references to the first three columns of the SELECT list.
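If one row per id+cm group with a separate column per percentile is preferred, the same subquery can be combined with conditional aggregation. The following is a minimal sketch; the output column names p05, p10, p20 and p90 are made up for illustration, and the remaining deciles would follow the same pattern.

```sql
-- Sketch: one row per (id, cm), one column per chosen percentile.
SELECT id, cm,
       max(CASE WHEN tile =  5 THEN billed END) AS p05,
       max(CASE WHEN tile = 10 THEN billed END) AS p10,
       max(CASE WHEN tile = 20 THEN billed END) AS p20,
       -- ... add tiles 30 through 80 in the same way ...
       max(CASE WHEN tile = 90 THEN billed END) AS p90
FROM (
   SELECT billed, id, cm,
          ntile(100) OVER (PARTITION BY id, cm ORDER BY billed) AS tile
   FROM   table_all
   ) sub
GROUP BY id, cm
ORDER BY id, cm;
```

A group with fewer rows than a requested tile number has no row in that tile, so the corresponding column is NULL for that group.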
It looks like you're looking for return query execute, returning the result from a dynamic SQL statement:
http://www.postgresql.org/docs/current/static/plpgsql-control-structures.html
http://www.postgresql.org/docs/current/static/plpgsql-statements.html
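As a rough illustration of that suggestion, a PL/pgSQL function using RETURN QUERY EXECUTE could look like the sketch below. The function name max_billed_for_tile and the column types (integer for id and cm, numeric for billed) are assumptions, not taken from the question; the declared types must match the real table exactly.

```sql
-- Sketch only: function name and column types are hypothetical.
CREATE OR REPLACE FUNCTION max_billed_for_tile(_tile integer)
  RETURNS TABLE (id integer, cm integer, max_billed numeric) AS
$func$
BEGIN
   -- The query text is executed dynamically; $1 is bound via USING.
   RETURN QUERY EXECUTE
      'SELECT sub.id, sub.cm, max(sub.billed)
       FROM  (
          SELECT billed, id, cm,
                 ntile(100) OVER (PARTITION BY id, cm ORDER BY billed) AS tile
          FROM   table_all
          ) sub
       WHERE  sub.tile = $1
       GROUP  BY sub.id, sub.cm'
   USING _tile;
END
$func$ LANGUAGE plpgsql;

-- Possible call:
-- SELECT * FROM max_billed_for_tile(90);
```

Nothing in this particular query actually needs to be dynamic, so plain RETURN QUERY without EXECUTE would do the same job; EXECUTE only becomes necessary when identifiers such as table or column names vary.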
I have a piece of dynamic SQL, part of which retrieves a function (formula) that depends on other results from the query, and then needs to evaluate that function using those same results. I know eval() does not exist in SQL, so what do I use?
A very simplified version
select reading, functiontype, @result = eval(f.functionformula)
from readingstables r
join functiontable f on (r.functiontype = f.functiontype)
So basically (note these are only example formulae) I want to use the functionformula that is related to a set of readings via the functiontype:
if f.functiontype == 'A' then f.functionformula = reading * reading
if f.functiontype == 'B' then f.functionformula = reading * constant / anothervalue
//etc etc
The real version is a huge piece of dynamic SQL in a stored procedure that drives a cursor. I would prefer to do it in one query but suspect I might have to compromise and have a second dynamic query driven from the first.
Why not simply use the POWER function:
Case functionType
When 'A' Then Power( reading, 2 )
When 'B' Then Power( reading, 3 )
...
End
You could even get super fancy like so:
Power( reading, Ascii( functionType ) - Ascii('A') + 2 )
Edit
Given the change to your original post: beyond dynamic SQL, there is no way to dynamically execute a function call. You could create a UDF that takes the function type as a parameter and executes the correct expression; however, the UDF itself would need to be one large Case expression.
Create Function FunctionTypeExpression( @FunctionType char(1) )
Returns float
As
Begin
    Return Case @FunctionType
        When 'A' Then ..expression 1
        When 'B' Then ..expression 2
        ...
    End -- closes the Case expression
End -- closes the function body
One note on this: you will need to make the return type of the function compatible with every possible return type of the expressions. Hopefully, they are all numeric. If they are not all numeric (or all text), then a more detailed explanation of why that is the case would be needed.
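Applied directly to the simplified query from the question, without a UDF, the same Case idea might look like the sketch below. The columns constant_value and another_value are hypothetical stand-ins for the "constant" and "anothervalue" in the example formulae, and they are assumed to live in readingstables.

```sql
-- Sketch only: constant_value and another_value are hypothetical columns.
SELECT r.reading,
       f.functiontype,
       CASE f.functiontype
            WHEN 'A' THEN r.reading * r.reading
            WHEN 'B' THEN r.reading * r.constant_value / r.another_value
       END AS result
FROM readingstables r
JOIN functiontable f ON r.functiontype = f.functiontype;
```

The trade-off is that every formula has to be written into the query by hand; formulae stored as text in functionformula still cannot be evaluated this way.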
When I want to test the behavior of some PostgreSQL function FOO() I'd find it useful to execute a query like SELECT FOO(bar), bar being some data I use as a direct input without having to SELECT from a real table.
I read we can omit the FROM clause in a statement like SELECT 1 but I don't know the correct syntax for multiple inputs. I tried SELECT AVG(1, 2) for instance and it does not work.
How can I do that?
With PostgreSQL you can use a VALUES expression to generate an inlined table:
VALUES computes a row value or set of row values specified by value expressions. It is most commonly used to generate a "constant table" within a larger command, but it can be used on its own.
Emphasis mine. Then you can apply your aggregate function to that "constant table":
select avg(x)
from (
values (1.0), (2.0)
) as t(x)
Or just select expr if expr is not an aggregate function:
select sin(1);
You could also define your own avg function that operates on an array and hide your FROM inside the function:
create function avg(double precision[]) returns double precision as $$
select avg(x) from unnest($1) as t(x);
$$ language 'sql';
And then:
=> select avg(array[1.0, 2.0, 3.0, 4.0]);
avg
-----
2.5
But that's just getting silly unless you're doing this quite often.
Also, if you're using 8.4+, you can write variadic functions and do away with the array. The internals are the same as the array version, you just add variadic to the argument list:
create function avg(variadic double precision[]) returns double precision as $$
select avg(x) from unnest($1) as t(x);
$$ language 'sql';
And then call it without the array stuff:
=> select avg(1.0, 1.2, 2.18, 11, 3.1415927);
avg
------------
3.70431854
(1 row)
Thanks to depesz for the round-about-through-google pointer to variadic function support in PostgreSQL.
To express a set in most varieties of SQL, you need to actually express a table:
SELECT
AVG(inlineTable.val)
FROM
(
SELECT 1 AS Val
UNION ALL
SELECT 2 AS Val
)
AS inLineTable
I found that MySQL, and apparently other database engines, have a "greatest" function that can be used like greatest(1, 2, 3, 4), which would return 4. I need this, but I am using IBM's DB2. Does anybody know of an equivalent function, even if it only accepts 2 parameters?
I found somewhere that MAX should do it, but it doesn't work... it only works on selecting the MAX of a column.
If there is no such function, does anybody have an idea what a stored procedure to do this might look like? (I have no stored procedure experience, so I have no clue what DB2 would be capable of).
Why does MAX not work for you?
select max(1,2,8,3,1,7) from sysibm.sysdummy1
gives me
1
---------------
8
1 record(s) selected.
As Dave points out, MAX should work as it's overloaded as both a scalar and a column function (the scalar takes 2 or more arguments). This is the case in DB2 for LUW, DB2 for z/OS and DB2 for i5/OS. What exact version and platform of DB2 are you using, and what is the exact statement you are using? One of the requirements of the scalar version of MAX is that all the arguments are "compatible" - I suspect there may be a subtle type difference in one or more of the arguments you're passing to the function.
On Linux with DB2 V9.1, "select max(1,2,3) ..." gives:
SQL0440N No authorized routine named "MAX" of type "FUNCTION" having
compatible arguments was found. SQLSTATE=42884
There, it is a scalar function requiring either a single value or a single column name. On z/OS, it behaves differently.
However, it does work as expected on Linux with DB2 9.5.
Two options:
1. What about sorting the column in descending order and grabbing the top 1 row?
2. According to my "SQL Pocket Guide", MAX(x) returns the greatest value in a set.
UPDATE: Apparently #1 won't work if you are looking at columns.
It sounds crazy, but no such function exists in DB2, at least not in version 9.1. If you want to select the greater of two columns, it would be best to use a case expression.
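A case expression for the greater of two columns could be sketched as follows; some_table, col_a and col_b are placeholder names, not taken from the question.

```sql
-- Sketch with hypothetical names: some_table, col_a, col_b.
SELECT CASE
           WHEN col_a >= col_b THEN col_a
           ELSE col_b
       END AS greater_value
FROM some_table;
```

Note that if either column is NULL the comparison is not true, so the ELSE branch returns col_b; add explicit IS NULL checks or COALESCE if NULLs need different handling.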
You can also define your own max function. For example:
create function importgenius.max2(x double, y double)
returns double
language sql
contains sql
deterministic
no external action
begin atomic
if y is null or x >= y then return x;
else return y;
end if;
end
Defining the inputs and outputs as doubles lets you take advantage of type promotion, so this function will also work for integers. The "deterministic" and "no external action" statements help the database engine optimize use of the function.
If you want another max function to work for character inputs, you'll have to give it another name.
Please check with the following query:
select * from table1 a,
(select appno as sub_appno,max(sno) as sub_maxsno from table1 group by appno) as tab2
where a.appno =tab2.sub_appno and a.sno=tab2.sub_maxsno