PostgreSQL: Why calling function (NUMERIC -> NUMERIC) in SELECT is too expensive? - sql

I have a query with a long select part with many cases when statement and I tried to put them to function for clarity, but the query start running longer than before.
So, for testing, I create one simple table
CREATE TABLE t1 AS
select random()::numeric from generate_series(1, 5000000, 1)
and one very simple function
create funtion f1 (i NUMERIC)
returns NUMERIC
LANGUAGE plpgsql
AS
$body$
begin
return i+1
end
$body$
and running two queries:
1: SELECT random, random+1 from T1
2: SELECT random, f1(random) from T1
and I don't know why the first is running two times faster than the second, so please help.
Sorry for my English :(

That is because the function f1 has to be called for every row, and calling a PL/pgSQL function is quite expensive.
If you define it as an SQL function, it could be inlined, which would be much cheaper:
FREATE FUNCTION f1(i NUMERIC) RETURNS numeric
LANGUAGE sql AS
'SELECT i + 1';

Related

I'm trying to cross database with insert query in plpgsql, please help me

CREATE OR REPLACE FUNCTION "public"."cross_insert"("p_name" varchar, "p_detail" varchar)
RETURNS SETOF "pg_catalog"."varchar" AS $BODY$
BEGIN
SELECT * FROM public.dblink(
'
host=10.10.10.53
port=5432
user=sassuperuser
password=password10
dbname=blog2
',
'
SELECT * FROM public.funct_insert2(
'''||p_name||''',
'''||p_detail||'''
);
'
);
RETURN query
SELECT ('SUKSES')::character varying;
END$BODY$
LANGUAGE plpgsql VOLATILE
COST 100
ROWS 1000
Your code is little bit messy. Try start with reading documentation to PLpgSQL. There are three issues:
PL/pgSQL doesn't allow implicit throwing result of any query. You should to iterate over result (FOR IN SELECT) or you have to save result to variable (SELECT INTO). If you want to do it, then you should to use PERFORM command. So you query should to look like:
PERFORM public.dblink(' .....');
There is not any sense why your function is declared like RETURNS SETOF varchar. Then you have to use RETURN QUERY SELECT 'success'. It is absolutely useless. There is relatively high overhead of this functionality. This should be classic scalar function that returns text or better, that returns void type. The exception is raised on error. So you don't need to returns anything.
Second, don't do this - PostgreSQL has FDW interface. You can use foreign table instead. It will be faster and safer.

Postgresql: Filter function output vs passing filter parameter to the function

Is there a difference in performance between the following queries?
1. foo() returns an unfiltered set which is then filtered:
create function foo() returns setof table1 as
$$
select * from table1
$$ language SQL;
select * from foo() where name = 'bar'
2. foo() accepts a parameter and returns a filtered set:
create function foo(varchar) returns setof table1 as
$$
select * from table1 where name = $1
$$ language SQL;
select * from foo('bar')
I assume the DB is smart enough to "inline" the function before planning the final query, so that it make no difference in execution. But i'm not sure.
the first runs function without any parameters (possibly getting more data) and then filter data on field. So probably will cost more.
the second runs function with parameter (possibly reducing data on function run)
without body of the function, it is pure speculation
I have found a wiki page that answers this.
There are multiple conditions that have to be met for the function to be inlined and thus get opmimized as part of the whole query.
A function in question will be inlined if it's declared Immutable or Stable:
create function foo() returns setof table1 as
$$
select * from table1
$$ language SQL stable;
Stable is more appropriate because the function does a table lookup.
Put several million (for example 5) rows into table, try to use at least 10-20 different values for column name (including 'bar'). Then add index create index iixx on table1(name)
Then run select count(*) from foo() where name = 'bar'
then try second version select count(*) from foo1('bar')
and you will see difference

Input table for PL/pgSQL function

I would like to use a plpgsql function with a table and several columns as input parameter. The idea is to split the table in chunks and do something with each part.
I tried the following function:
CREATE OR REPLACE FUNCTION my_func(Integer)
RETURNS SETOF my_part
AS $$
DECLARE
out my_part;
BEGIN
FOR i IN 0..$1 LOOP
FOR out IN
SELECT * FROM my_func2(SELECT * FROM table1 WHERE id = i)
LOOP
RETURN NEXT out;
END LOOP;
END LOOP;
RETURN;
END;
$$
LANGUAGE plpgsql;
my_func2() is the function that does some work on each smaller part.
CREATE or REPLACE FUNCTION my_func2(table1)
RETURNS SETOF my_part2 AS
$$
BEGIN
RETURN QUERY
SELECT * FROM table1;
END
$$
LANGUAGE plpgsql;
If I run:
SELECT * FROM my_func(99);
I guess I should receive the first 99 IDs processed for each id.
But it says there is an error for the following line:
SELECT * FROM my_func2(select * from table1 where id = i)
The error is:
The subquery is only allowed to return one column
Why does this happen? Is there an easy way to fix this?
There are multiple misconceptions here. Study the basics before you try advanced magic.
Postgres does not have "table variables". You can only pass 1 column or row at a time to a function. Use a temporary table or a refcursor (like commented by #Daniel) to pass a whole table. The syntax is invalid in multiple places, so it's unclear whether that's what you are actually trying.
Even if it is: it would probably be better to process one row at a time or rethink your approach and use a set-based operation (plain SQL) instead of passing cursors.
The data types my_part and my_part2 are undefined in your question. May be a shortcoming of the question or a problem in the test case.
You seem to expect that the table name table1 in the function body of my_func2() refers to the function parameter of the same (type!) name, but this is fundamentally wrong in at least two ways:
You can only pass values. A table name is an identifier, not a value. You would need to build a query string dynamically and execute it with EXECUTE in a plpgsql function. Try a search, many related answers her on SO. Then again, that may also not be what you wanted.
table1 in CREATE or REPLACE FUNCTION my_func2(table1) is a type name, not a parameter name. It means your function expects a value of the type table1. Obviously, you have a table of the same name, so it's supposed to be the associated row type.
The RETURN type of my_func2() must match what you actually return. Since you are returning SELECT * FROM table1, make that RETURNS SETOF table1.
It can just be a simple SQL function.
All of that put together:
CREATE or REPLACE FUNCTION my_func2(_row table1)
RETURNS SETOF table1 AS
'SELECT ($1).*' LANGUAGE sql;
Note the parentheses, which are essential for decomposing a row type. Per documentation:
The parentheses are required here to show that compositecol is a column name not a table name
But there is more ...
Don't use out as variable name, it's a keyword of the CREATE FUNCTION statement.
The syntax of your main query my_func() is more like psudo-code. Too much doesn't add up.
Proof of concept
Demo table:
CREATE TABLE table1(table1_id serial PRIMARY KEY, txt text);
INSERT INTO table1(txt) VALUES ('a'),('b'),('c'),('d'),('e'),('f'),('g');
Helper function:
CREATE or REPLACE FUNCTION my_func2(_row table1)
RETURNS SETOF table1 AS
'SELECT ($1).*' LANGUAGE sql;
Main function:
CREATE OR REPLACE FUNCTION my_func(int)
RETURNS SETOF table1 AS
$func$
DECLARE
rec table1;
BEGIN
FOR i IN 0..$1 LOOP
FOR rec IN
SELECT * FROM table1 WHERE table1_id = i
LOOP
RETURN QUERY
SELECT * FROM my_func2(rec);
END LOOP;
END LOOP;
END
$func$ LANGUAGE plpgsql;
Call:
SELECT * FROM my_func(99);
SQL Fiddle.
But it's really just a a proof of concept. Nothing useful, yet.
As the error log is telling you.. you can return only one column in a subquery, so you have to change it to
SELECT my_func2(SELECT Specific_column_you_need FROM hasval WHERE wid = i)
a possible solution can be that you pass to funct2 the primary key of the table your funct2 needs and then you can obtain the whole table by making the SELECT * inside the function

PLPGSQL Function to Calculate Bearing

Basically I can't get my head around the syntax of plpgsql and would appreciate some help with the following efforts.
I have a table containing 1000's of wgs84 points. The following SQL will retrieve a set of points within a bounding box on this table:
SELECT id, ST_X(wgs_geom), ST_Y(wgs_geom), ST_Z(wgs_geom)
FROM points_table
INNER JOIN
(SELECT ST_Transform(ST_GeomFromText('POLYGON((-1.73576102027 1.5059743629,
-1.73591122397 51.5061067655,-1.73548743495 51.5062838333,-1.73533186682
1.5061514313,-1.73576102027 51.5059743629))', 4326, 27700)
) AS bgeom
) AS t2
ON ST_Within(local_geom, t2.bgeom)
What I need to do is add a bearing/azimuth column to the results that describes the bearing at each point in the returned data set.
So the approach I'm trying to implement is to build a plpgsql function that can select the data as per above and calculate the bearing between each set of points in a loop.
However my efforts at understanding basic data access and handling within a plpgsql function are failing miserably.
An example of the current version of the function I'm trying to create is as follows:
CREATE TYPE bearing_type AS (x numeric, y numeric, z numeric, bearing numeric);
--DROP FUNCTION IF EXISTS get_bearings_from_points();
CREATE OR REPLACE FUNCTION get_bearings_from_points()
RETURNS SETOF bearing_type AS
$BODY$
DECLARE
rowdata points_table%rowtype;
returndata bearing_type;
BEGIN
FOR rowdata IN
SELECT nav_id, wgs_geom
FROM points_table INNER JOIN
(SELECT ST_Transform(ST_GeomFromText('POLYGON((-1.73576102027
3.5059743629,-1.73591122397 53.5061067655,-1.73548743495
53.5062838333,-1.73533186682 53.5061514313,-1.73576102027
53.5059743629))', 4326), 27700)
AS bgeom)
AS t2 ON ST_Within(local_geom, t2.bgeom)
LOOP
returndata.x := ST_X(rowdata.wgs_geom);
returndata.y := ST_Y(rowdata.wgs_geom);
returndata.z := ST_Z(rowdata.wgs_geom);
returndata.bearing := ST_Azimuth(<current_point> , <next_point>)
RETURN NEXT returndata;
END LOOP;
RETURN;
END
$BODY$
LANGUAGE plpgsql;
I would like to just call this function as follows:
SELECT get_bearings_from_points();
and get the desired result.
Basically the problems are understanding how to access the rowdata properly such that I can read the current and next points.
In the above example I've had various problems from how to call the ST_X etc SQL functions and have tried EXECUTE select statements with errors re geometry data types.
Any insights/help would be much appreciated.
In PL/pgSQL it's most effective to do as much as is elegantly possible in basic SQL queries at once. You can largely simplify.
I didn't get a definition of the sort order out of your question and left ??? to fill in for you:
CREATE OR REPLACE FUNCTION get_bearings_from_points(_bgeom geometry)
RETURNS TABLE (x numeric, y numeric, z numeric, bearing numeric) AS
$func$
BEGIN
FOR x, y, z, bearing IN
SELECT ST_X(t.wgs_geom), ST_Y(t.wgs_geom), ST_Z(t.wgs_geom)
, ST_Azimuth(t.wgs_geom, lead(t.wgs_geom) OVER (ORDER BY ???))
FROM points_table t
WHERE ST_Within(t.local_geom, _bgeom)
ORDER BY ???
LOOP
RETURN NEXT;
END LOOP;
END
$func$ LANGUAGE plpgsql;
The window function lead() references a column from the next row according to sort order.
This can be simplified further to a single SQL query - possibly wrapped into an SQL function:
CREATE OR REPLACE FUNCTION get_bearings_from_points(_bgeom geometry)
RETURNS TABLE (x numeric, y numeric, z numeric, bearing numeric) AS
$func$
SELECT ST_X(t.wgs_geom), ST_Y(t.wgs_geom), ST_Z(t.wgs_geom)
, ST_Azimuth(t.wgs_geom, lead(t.wgs_geom) OVER (ORDER BY ???))
FROM points_table t
WHERE ST_Within(t.local_geom, $1) -- use numbers in pg 9.1 or older
ORDER BY ???
$func$ LANGUAGE sql;
Parameter names can be referenced in pg 9.2 or later. Per release notes of pg 9.2:
Allow SQL-language functions to reference parameters by name (Matthew
Draper)

PL/pgSQL Return SETOF Records Error

I am relatively new to postgresql and battling my way to get familiarized with it. I had run in to an error while writing a new pl/sql function. ERROR: type "ordered_parts" does not exist
CREATE OR REPLACE FUNCTION get_ordered_parts(var_bill_to integer)
RETURNS SETOF ordered_parts AS
$BODY$
declare
var_ordered_id record;
var_part ordered_parts;
begin
for var_ordered in select order_id from view_orders where bill_to = var_bill_to
loop
for var_part select os.po_num,os.received,os.customer_note,orders.part_num,orders.description,orders.order_id,orders.remaining_quantity from (select vli.part_num,vli.description,vli.order_id,vli.quantity - vli.quantity_shipped as remaining_quantity from view_line_items as vli where vli.order_id in (select order_id from view_orders where bill_to = var_bill_to and order_id = var_ordered.order_id) and vli.quantity - vli.quantity_shipped > 0)as orders left join order_sales as os on orders.order_id = os.order_id
then
-- Then we've found a leaf part
return next var_part;
end if;
end loop;
end;
$BODY$
LANGUAGE 'plpgsql' VOLATILE
COST 100
ROWS 1000;
ALTER FUNCTION get_ordered_parts(integer) OWNER TO postgres;
just note - your code is perfect example how don't write stored procedure ever. For some longer results it can be extremely slow. Minimally two cycles can be joined to one, or better, you can use just only one RETURN QUERY statement. Next issue is zero formatting of embedded SQL - good length of line is between 70 and 100 chars - writing long SQL statement to one line going to zero readability and maintainability code.
Relation database is not array, and any query has some cost, so don't use nested FOR if you really don't need it. I am sorry for offtopic.
The error message is telling you that you have declared the return type of your function to be SETOF ordered_parts, but it doesn't know what kind of thing ordered_parts is. Within your Declare block you also have a variable declared as this same type (var_part ordered_parts).
If you had a table or view called ordered_parts, then its "row type" would be automatically created as a type, but this is not the case here. if you just want to use an arbitrary row from a result set, you can just use the generic type record.
So in this case your function should say RETURNS SETOF record, and your Declare block var_part record.
Bonus tip: rather than looping over the result of your query and running RETURN NEXT on each row, you can use RETURN QUERY to throw the whole result set into the returned set in one go. See this Postgres manual page.