Pass extra parameter to PostgreSQL aggregate final function - sql

Is the only way to pass an extra parameter to the final function of a PostgreSQL aggregate to create a special TYPE for the state value?
e.g.:
CREATE TYPE geomvaltext AS (
geom public.geometry,
val double precision,
txt text
);
And then to use this type as the state variable so that the third parameter (text) finally reaches the final function?
Why aggregates can't pass extra parameters to the final function themselves? Any implementation reason?
So we could easily construct, for example, aggregates taking a method:
SELECT ST_MyAgg(accum_number, 'COMPUTE_METHOD') FROM blablabla
Thanks

You can define an aggregate with more than one parameter.
I don't know if that solves your problem, but you could use it like this:
CREATE OR REPLACE FUNCTION myaggsfunc(integer, integer, text) RETURNS integer
IMMUTABLE STRICT LANGUAGE sql AS
$f$
SELECT CASE $3
WHEN '+' THEN $1 + $2
WHEN '*' THEN $1 * $2
ELSE NULL
END
$f$;
CREATE AGGREGATE myagg(integer, text) (
SFUNC = myaggsfunc(integer, integer, text),
STYPE = integer
);
It could be used like this:
CREATE TABLE mytab
AS SELECT * FROM generate_series(1, 10) i;
SELECT myagg(i, '+') FROM mytab;
myagg
-------
55
(1 row)
SELECT myagg(i, '*') FROM mytab;
myagg
---------
3628800
(1 row)

I solved a similar issue by making a custom aggregate function that did all the operations at once and stored their states in an array.
CREATE AGGREGATE myagg(integer)
(
INITCOND = '{ 0, 1 }',
STYPE = integer[],
SFUNC = myaggsfunc
);
and:
CREATE OR REPLACE FUNCTION myaggsfunc(agg_state integer[], agg_next integer)
RETURNS integer[] IMMUTABLE STRICT LANGUAGE 'plpgsql' AS $$
BEGIN
agg_state[1] := agg_state[1] + agg_next;
agg_state[2] := agg_state[2] * agg_next;
RETURN agg_state;
END;
$$;
Then made another function that selected one of the results based on the second argument:
CREATE OR REPLACE FUNCTION myagg_pick(agg_state integer[], agg_fn character varying)
RETURNS integer IMMUTABLE STRICT LANGUAGE 'plpgsql' AS $$
BEGIN
CASE agg_fn
WHEN '+' THEN RETURN agg_state[1];
WHEN '*' THEN RETURN agg_state[2];
ELSE RETURN 0;
END CASE;
END;
$$;
Usage:
SELECT myagg_pick(myagg("accum_number"), 'COMPUTE_METHOD') FROM "mytable" GROUP BY ...
Obvious downside of this is the overhead of performing all the functions instead of just one. However when dealing with simple operations such as adding, multiplying etc. it should be acceptable in most cases.

You would have to rewrite the final function itself, and in that case you might as well write a set of new aggregate functions, one for each possible COMPUTE_METHOD. If the COMPUTE_METHOD is a data value or implied by a data value, then a CASE statement can be used to select the appropriate aggregate method. Alternatively, you may want to create a custom composite type with fields for accum_number and COMPUTE_METHOD, and write a single new aggregate function that uses this new data type.

Related

Compute an aggregated tsrange from a set of entries?

I am trying to compute a aggregated tsrange from a set of row that I extract from an SQL query. Problem is that I keep getting errors that the input parameter is not being passed in.
CREATE OR REPLACE AGGREGATE range_merge(anyrange)
(
sfunc = range_merge,
stype = anyrange
);
DROP FUNCTION IF EXISTS aggregate_validity(entity_name regclass, entry bigint);
CREATE OR REPLACE FUNCTION aggregate_validity(entity_name regclass, entry bigint) returns tsrange AS
$$
DECLARE
result tsrange;
BEGIN
EXECUTE format('select range_merge(valid) from %s where entity_id = %U', entity_name, entry) into result;
return result;
END
$$ LANGUAGE plpgsql;
When I do:
select * from aggregate_validity(country, 1);
I get an error stating that the entity name and entry do not exist. It does not seem to parameterize the input into the statement properly.
Function:
EXECUTE format('select range_merge(valid) from %s where entity_id=%U',entity_name, entry)
into result;
=>
EXECUTE format('select range_merge(valid) from %I where entity_id=%s',entity_name, entry)
into result;
--%I for identifier, %s for value
Call:
select * from aggregate_validity(country, 1)
=>
select * from aggregate_validity('country', 1);
db<>fiddle demo
CREATE OR REPLACE AGGREGATE range_merge(anyrange) (
SFUNC = range_merge
, STYPE = anyrange
);
-- DROP FUNCTION IF EXISTS aggregate_validity(entity_name regclass, entry bigint);
CREATE OR REPLACE FUNCTION aggregate_validity(entity_name regclass, entry bigint, OUT result tsrange)
LANGUAGE plpgsql AS
$func$
BEGIN
EXECUTE 'SELECT range_merge(valid) FROM ' || entity_name || ' WHERE entity_id = $1'
INTO result
USING entry;
END
$func$;
Call:
SELECT aggregate_validity('country', 1);
db<>fiddle here
The call does not need SELECT * FROM, as the function returns a single value per definition.
I used an OUT parameter to simplify (OUT result tsrange). See:
Returning from a function with OUT parameter
Don't concatenate the entry value into the SQL string. Pass it as value with the USING clause. Cleaner, faster.
Since entity_name is passed as regclass, it's safe to simply concatenate (which is a bit cheaper). See:
Table name as a PostgreSQL function parameter
Plus, missing quotes and incorrect format specifiers, as Lukasz already provided.
Your custom aggregate function range_merge() has some caveats:
I wouldn't name it "range_merge", that being the name of the plain function range_merge(), too. While that's legal, it still invites confusing errors.
You are aware that the function range_merge() includes gaps between input ranges in the output range?
range_merge() returns NULL for any NULL input. So if your table has any NULL values in the column valid, the result is always NULL. I strongly suggest that any involved columns shall be defined as NOT NULL.
If you are at liberty to install additional modules, consider range_agg by Paul Jungwirth who is also here on Stackovflow. It provides the superior function range_agg() addressing some of the mentioned issues.
If you don't want to include gaps, consider the Postgres Wiki page on range aggregation.
I would probably not use aggregate_validity() at all. It obscures the nested functionality from the Postgres query planner and may lead so suboptimal query plans. Typically, you can replace it with a correlated or a LATERAL subquery, which can be planned and optimized by Postgres in context of the outer query. I appended a demo to the fiddle:
db<>fiddle here
Related:
What is the difference between LATERAL and a subquery in PostgreSQL?

How to write function for optional parameters in postgresql?

My requirement is write optional parameters to a function.Parameters are optional sometimes i will add or i will not pass parameters to function.Can anyone help me how to write function.
I am writing like
select *
from test
where field3 in ('value1','value2')
and ($1 is null or field1 = $1)
and ($2 is null or field2 = $2)
and ($3 is null or field3 = $3);
i am passing parameters to Query,But my output is not expected.when i pass all three parameters my output is correct,otherwise it is not expected output.
You can define optional parameters by supplying a default value.
create function foo(p_one integer default null,
p_two integer default 42,
p_three varchar default 'foo')
returns text
as
$$
begin
return format('p_one=%s, p_two=%s, p_three=%s', p_one, p_two, p_three);
end;
$$
language plpgsql;
You can "leave out" parameters from the end, so foo(), foo(1) or foo(1,2) are valid. If you want to only supply a parameter that is not the first you have to use the syntax that specifies the parameter names.
select foo();
returns: p_one=, p_two=42, p_three=foo
select foo(1);
returns: p_one=1, p_two=42, p_three=foo
select foo(p_three => 'bar')
returns: p_one=, p_two=42, p_three=bar
Apart of the VARIADIC option pointed by #a_horse_with_no_name, which is only a syntax sugar for passing an array with any number of elements of the same type, you can't define a function with optional parameters because, in postgres, functions are identified not only by its name but also by its arguments and the types of them.
That is: create function foo (int) [...] and create function foo (varchar) [...] will create different functions.
Which is called when you execute, for example, select foo(bar) depends on bar data type itself. That is: if it is an integer, you will call the first one and if it is varchar, then second one will be called.
More than that: if you execute, for example, select foo(now()), then a function not exists exception will be triggered.
So, as I said, you can't implement functions with variable arguments, but you can implement multiple functions with the same name and distinct argument (an/or type) sets returning the same data type.
If you (obviously) doesn't want to implement the function twice, the only thing you need to do is to implement a "master" function with all possible parameters and the others (which have fewer parameters) only calling the "master" one with default values for the non received parameters.
As an option, I got a function i tested with Navicat App:
CREATE OR REPLACE FUNCTION "public"."for_loop_through_query"(sponsor_name varchar default 'Save the Children')
It generates me this. (Note: Please look at the parameter difference)
CREATE OR REPLACE FUNCTION "public"."for_loop_through_query"("sponsor_name" varchar='Save the Children'::character varying)
CREATE OR REPLACE FUNCTION "public"."for_loop_through_query"("sponsor_name" varchar='Save the Children'::character varying)
RETURNS "pg_catalog"."void" AS $BODY$
DECLARE
rec RECORD;
BEGIN
FOR rec IN SELECT
companies."name" AS org_name,
"sponsors"."name" AS sponsor_name
FROM
"donor_companies"
JOIN "sponsors"
ON "donor_companies"."donor_id" = "sponsors"."id"
JOIN companies
ON "donor_companies"."organization_id" = companies."id"
WHERE
"public"."sponsors"."name" = sponsor_name
LOOP
RAISE NOTICE '%', rec.org_name;
END LOOP;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;

Intersection of multiple text arrays: ERROR: array value must start with "{"

I'm attemping to get the functions in this question to work: Intersection of multiple arrays in PostgreSQL
Unlike that question, I want to intersect text arrays instead of integer arrays. I've modified both functions accordingly. Base array intersect function:
CREATE FUNCTION array_intersect(a1 text[], a2 text[]) RETURNS text[] AS $$
DECLARE
ret text[];
BEGIN
IF a1 is null THEN
return a2;
ELSEIF a2 is null THEN
RETURN a1;
END IF;
SELECT array_agg(e) INTO ret
FROM (
SELECT unnest(a1)
INTERSECT
SELECT unnest(a2)
) AS dt(e);
RETURN ret;
END;
$$ language plpgsql;
Aggregate function definition:
CREATE AGGREGATE utility.array_intersect_agg(
sfunc = array_intersect,
basetype = text[],
stype = text[],
initcond = NULL
);
I get the error "ERROR: array value must start with "{" or dimension information
SQL state: 22P02" when I try to run the following code:
SELECT array_intersect_agg(test)
FROM(
SELECT ARRAY['A','B','C'] test
UNION ALL
SELECT ARRAY['A','C'] test
) a
What needs to change in order for these functions to work?
For the documentation:
initial_condition
The initial setting for the state value. This must be a string
constant in the form accepted for the data type state_data_type. If
not specified, the state value starts out null.
So the aggregate declaration should look like this:
CREATE AGGREGATE array_intersect_agg(
sfunc = array_intersect,
basetype = text[],
stype = text[]
);

Passing a ResultSet into a Postgresql Function

Is it possible to pass the results of a postgres query as an input into another function?
As a very contrived example, say I have one query like
SELECT id, name
FROM users
LIMIT 50
and I want to create a function my_function that takes the resultset of the first query and returns the minimum id. Is this possible in pl/pgsql?
SELECT my_function(SELECT id, name FROM Users LIMIT 50); --returns 50
You could use a cursor, but that very impractical for computing a minimum.
I would use a temporary table for that purpose, and pass the table name for use in dynamic SQL:
CREATE OR REPLACE FUNCTION f_min_id(_tbl regclass, OUT min_id int) AS
$func$
BEGIN
EXECUTE 'SELECT min(id) FROM ' || _tbl
INTO min_id;
END
$func$ LANGUAGE plpgsql;
Call:
CREATE TEMP TABLE foo ON COMMIT DROP AS
SELECT id, name
FROM users
LIMIT 50;
SELECT f_min_id('foo');
Major points
The first parameter is of type regclass to prevent SQL injection. More info in this related answer on dba.SE.
I made the temp table ON COMMIT DROP to limit its lifetime to the current transaction. May or may not be what you want.
You can extend this example to take more parameters. Search for code examples for dynamic SQL with EXECUTE.
-> SQLfiddle demo
I would take the problem on the other side, calling an aggregate function for each record of the result set. It's not as flexible but can gives you an hint to work on.
As an exemple to follow your sample problem:
CREATE OR REPLACE FUNCTION myMin ( int,int ) RETURNS int AS $$
SELECT CASE WHEN $1 < $2 THEN $1 ELSE $2 END;
$$ LANGUAGE SQL STRICT IMMUTABLE;
CREATE AGGREGATE my_function ( int ) (
SFUNC = myMin, STYPE = int, INITCOND = 2147483647 --maxint
);
SELECT my_function(id) from (SELECT * FROM Users LIMIT 50) x;
It is not possible to pass an array of generic type RECORD to a plpgsql function which is essentially what you are trying to do.
What you can do is pass in an array of a specific user defined TYPE or of a particular table row type. In the example below you could also swap out the argument data type for the table name users[] (though this would obviously mean getting all data in the users table row).
CREATE TYPE trivial {
"ID" integer,
"NAME" text
}
CREATE OR REPLACE FUNCTION trivial_func(data trivial[])
RETURNS integer AS
$BODY$
DECLARE
BEGIN
--Implementation here using data
return 1;
END$BODY$
LANGUAGE 'plpgsql' VOLATILE;
I think there's no way to pass recordset or table into function (but I'd be glad if i'm wrong). Best I could suggest is to pass array:
create or replace function my_function(data int[])
returns int
as
$$
select min(x) from unnest(data) as x
$$
language SQL;
sql fiddle demo

Custom aggregate function

I am trying to understand aggregate functions and I need help.
So for instance the following sample:
CREATE OR REPLACE FUNCTION array_median(timestamp[])
RETURNS timestamp AS
$$
SELECT CASE WHEN array_upper($1,1) = 0 THEN null ELSE asorted[ceiling(array_upper(asorted,1)/2.0)] END
FROM (SELECT ARRAY(SELECT ($1)[n] FROM
generate_series(1, array_upper($1, 1)) AS n
WHERE ($1)[n] IS NOT NULL
ORDER BY ($1)[n]
) As asorted) As foo ;
$$
LANGUAGE 'sql' IMMUTABLE;
CREATE AGGREGATE median(timestamp) (
SFUNC=array_append,
STYPE=timestamp[],
FINALFUNC=array_median
)
I am not understanding the structure/logic that needs to go into the select statement in the aggregate function itself. Can someone explain what the flow/logic is?
I am writing an aggregate, a strange one, that the return is always the first string it ever sees.
You're showing a median calculation, but want the first text value you see?
Below is how to do that. Assuming you want the first non-null value, that is. If not, you'll need to keep track of if you've got a value already or not.
The accumulator function is written as plpgsql and sql - the plpgsql one lets you use variable names and debug it too. It simply uses COALESCE against the previous accumulated value and the new value and returns the first non-null. So - as soon as you have a non-null in the accumulator everything else gets ignored.
You may also want to consider the "first_value" window function for this sort of thing if you're on a modern (8.4+) version of PostgreSQL.
http://www.postgresql.org/docs/9.1/static/functions-window.html
HTH
BEGIN;
CREATE FUNCTION remember_first(acc text, newval text) RETURNS text AS $$
BEGIN
RAISE NOTICE '% vs % = %', acc, newval, COALESCE(acc, newval);
RETURN COALESCE(acc, newval);
END;
$$ LANGUAGE plpgsql IMMUTABLE;
CREATE FUNCTION remember_first_sql(text,text) RETURNS text AS $$
SELECT COALESCE($1, $2);
$$ LANGUAGE SQL IMMUTABLE;
-- No "initcond" means we start out with null
--
CREATE AGGREGATE first(text) (
sfunc = remember_first,
stype = text
);
CREATE TEMP TABLE tt (t text);
INSERT INTO tt VALUES ('abc'),('def'),('ghi');
SELECT first(t) FROM tt;
ROLLBACK;