Is there any escaping syntax for psql variables inside PostgreSQL functions? - sql

I am writing a psql script that uses variables (set with the psql --variable key=value command-line syntax).
This works perfectly at the top-level scope, as in select * from :key, but the script also creates functions, and I need the variable's value inside them.
So syntax like
create function foo() returns void as
$$
declare
begin
grant select on my_table to group :user;
end;
$$
language plpgsql;
fails at :user.
As far as I understand, psql variables are a plain macro substitution feature, but the substitution doesn't process function bodies.
Is there any escaping syntax for such cases? Surrounding :user with $$ works as far as substitution goes, but psql then fails at $$.
Is there any other way to do this besides standalone macro processing (sed, awk, etc.)?

PSQL SET variables aren't interpolated inside dollar-quoted strings. I don't know this for certain, but I think there's no escape or other trickery to turn on SET variable interpolation in there.
One might think you could wedge an unquoted :user between two dollar-quoted stretches of PL/pgSQL to get the desired effect. But this doesn't seem to work... I think the syntax requires a single string and not an expression concatenating strings. Might be mistaken on that.
Anyway, that doesn't matter. There's another approach (as Pasco noted): write the stored procedure to accept a PL/pgSQL argument. Here's what that would look like.
CREATE OR REPLACE FUNCTION foo("user" TEXT) RETURNS void AS
$$
BEGIN
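-- Keep the double quotes here too: a bare user is the SQL keyword for the current role, not the argument.
EXECUTE 'GRANT SELECT ON my_table TO GROUP ' || quote_ident("user");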
END;
$$ LANGUAGE plpgsql;
Notes on this function:
EXECUTE generates an appropriate GRANT on each invocation using our procedure argument. The PG manual section called "Executing Dynamic Commands" explains EXECUTE in detail.
The declaration of procedure argument user must be double quoted. Double quotes force it to be interpreted as an identifier rather than as the reserved word user.
Once you define the function like this, you can call it using interpolated PSQL variables. Here's an outline.
Run psql --variable user="'whoever'" --file=myscript.sql. Single quotes are required around the username!
In myscript.sql, define function like above.
In myscript.sql, put select foo(:user);. This is where we rely on those single quotes we put in the value of user.
Although this seems to work, it strikes me as rather squirrely. I thought SET variables were intended for runtime configuration. Carrying data around in SET seems odd.
Edit: here's a concrete reason to not use SET variables. From the manpage: "These assignments are done during a very early stage of startup, so variables reserved for internal purposes might get overwritten later." If Postgres decided to use a variable named user (or whatever you pick), it could overwrite your script argument with something you never intended. In fact, psql already takes USER for itself -- this only works because SET is case sensitive. This very nearly broke things from the start!
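The single-quote juggling in the outline above can be avoided with psql's own quoting forms, :'name' (SQL literal) and :"name" (SQL identifier), available in reasonably recent psql versions. A minimal sketch, assuming the script is run as psql --variable user=whoever --file=myscript.sql and that foo(text) has been created as above:

```sql
-- myscript.sql (sketch; the variable value is passed WITHOUT embedded quotes)

-- :'user' makes psql substitute the value as a properly quoted SQL literal,
-- so the server receives: SELECT foo('whoever');
SELECT foo(:'user');

-- :"user" substitutes the value as a quoted identifier, which is enough
-- for a top-level statement that needs no function at all:
GRANT SELECT ON my_table TO :"user";
```

Neither form is interpolated inside a dollar-quoted function body, so this only helps at the call site.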

You could use the method Dan LaRocque describes to make it kind of hack'ishly work (not a knock at Dan) - but psql is not really your friend for doing this kind of work (assuming this kind of work is scripting and not one-off things). If you've got a function, rewrite it to take a parameter like so:
create function foo(v_user text) returns void as
$$
begin
execute 'grant select on my_table to group '||quote_ident(v_user);
end;
$$
language plpgsql;
(quote_ident() means you don't have to add the double quotes yourself; it handles all that.)
But then use a real scripting language like Perl, Python, Ruby, etc that has a Postgres driver to invoke your function with a parameter.
Psql has its place, I'm just not sure that this is it.

Actually, there's a way in more modern versions of psql.
Since the function body has to be a string constant and psql does not perform variable interpolation within string constants we have to resort to assembling the function body in a separate first step:
select format($gexec$
create function foo() returns void as
$$
begin
grant select on my_table to group %I;
end;
$$
language plpgsql;
$gexec$, :'user') \gexec
The trailing \gexec command
executes the "outer" query using the format function to produce the desired query string and then
executes the produced query string as a query.
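To see what the first step produces before anything is executed, the format call can be run on its own. A small sketch with a made-up role name; %I quotes its argument as an identifier only when that is necessary, much like quote_ident():

```sql
-- Plain lower-case name: no quoting needed.
SELECT format('GRANT SELECT ON my_table TO %I', 'whoever');
-- result: GRANT SELECT ON my_table TO whoever

-- Name with a space and upper-case letters: %I adds double quotes.
SELECT format('GRANT SELECT ON my_table TO %I', 'Some User');
-- result: GRANT SELECT ON my_table TO "Some User"
```

Replacing the terminating semicolon with \gexec would send each resulting string back to the server as a statement instead of just displaying it.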

Related

Struggling to create a "stored procedure" beyond INSERT

Whenever I try to call a stored procedure in PostgreSQL that goes beyond inserting data, it takes forever to run, and it isn't that the query is complicated or the dataset is huge. The dataset is small. I cannot return a table from a stored procedure and I cannot even return 1 row or 1 data point from a stored procedure. It says it is executing the query for a very long time until I finally stop the query from running. It does not give me a reason. I can't let it run for hours. Any ideas on what might be happening? I have included stored procedures that I have tried to call.
Non-working example #1:
CREATE PROCEDURE max_duration(OUT maxD INTERVAL)
LANGUAGE plpgsql AS $$
DECLARE maxD INTERVAL;
BEGIN
SELECT max(public.bikeshare3.duration)
INTO maxD
FROM public.bikeshare3;
END;
$$ ;
CALL max_duration(NULL);
Non-working example #2:
CREATE PROCEDURE getDataByRideId2(rideId varchar(16))
LANGUAGE SQL
AS $$
SELECT rideable_type FROM bikeshare3
WHERE ride_id = rideId
$$;
CALL getDataByRideId2('x78900');
Working example
The only one that worked when called is an insert procedure:
CREATE OR REPLACE PROCEDURE genre_insert_data(GenreId integer, Name_b character varying)
LANGUAGE SQL
AS $$
INSERT INTO public.bikeshare3 VALUES (GenreId, Name_b)
$$;
CALL genre_insert_data(1, 'testName');
FUNCTION or PROCEDURE?
The term "stored procedure" has been a widespread misnomer for the longest time. That got more confusing since Postgres 11 added CREATE PROCEDURE.
You can create a FUNCTION or a PROCEDURE in Postgres. Typically, you want a FUNCTION. A PROCEDURE mostly only makes sense when you need to COMMIT in the body. See:
How to return a value from a stored procedure (not function)?
Nothing in your question indicates the need for a PROCEDURE. You probably want a FUNCTION.
Question asked
Adrian already pointed out most of what's wrong in his comment.
Your first example could work like this:
CREATE OR REPLACE PROCEDURE max_duration(INOUT _max_d interval = NULL)
LANGUAGE plpgsql AS
$proc$
BEGIN
SELECT max(b.duration) INTO _max_d
FROM public.bikeshare3 b;
END
$proc$;
CALL max_duration();
Most importantly, your OUT parameter is visible inside the procedure body. Declaring another instance as variable hides the parameter. You could access the parameter by qualifying with the function name, max_duration.maxD in your original. But that's a measure of last resort. Rather don't introduce duplicate variable names to begin with.
I also did away with error-prone mixed-case identifiers in my answer. See:
Are PostgreSQL column names case-sensitive?
I made the parameter INOUT _max_d interval = NULL. By adding a default value, we don't have to pass a value in the call (that's not used anyway). But it must be INOUT instead of OUT for this.
Also, OUT parameters only work for a PROCEDURE since Postgres 14. The release notes:
Stored procedures can now return data via OUT parameters.
While using an OUT parameter, this advice from the manual applies:
Arguments must be supplied for all procedure parameters that lack
defaults, including OUT parameters. However, arguments matching OUT
parameters are not evaluated, so it's customary to just write NULL
for them. (Writing something else for an OUT parameter might cause
compatibility problems with future PostgreSQL versions.)
Your second example could work like this:
CREATE OR REPLACE PROCEDURE get_data_by_ride_id2(IN _ride_id text
, INOUT _rideable_type text = NULL) -- return type?
LANGUAGE sql AS
$proc$
SELECT b.rideable_type
FROM public.bikeshare3 b
WHERE b.ride_id = _ride_id;
$proc$;
CALL get_data_by_ride_id2('x78900');
If the query returns multiple rows, only the first one is returned and the rest is discarded. Don't go there. This only makes sense while ride_id is UNIQUE. Even then, a FUNCTION still seems more suitable ...
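For comparison, here is a sketch of both examples written as functions. The function name get_rideable_type and the ::text cast are assumptions for illustration (the real column types are not shown in the question); duration is assumed to be of type interval, as in the procedure above.

```sql
-- First example as a function: returns the single aggregate value.
CREATE OR REPLACE FUNCTION max_duration()
  RETURNS interval
  LANGUAGE sql STABLE AS
$func$
SELECT max(b.duration)
FROM   public.bikeshare3 b;
$func$;

SELECT max_duration();

-- Second example as a function (hypothetical name).
-- The cast guards against rideable_type not being declared as text.
CREATE OR REPLACE FUNCTION get_rideable_type(_ride_id text)
  RETURNS text
  LANGUAGE sql STABLE AS
$func$
SELECT b.rideable_type::text
FROM   public.bikeshare3 b
WHERE  b.ride_id = _ride_id;
$func$;

SELECT get_rideable_type('x78900');
```

A function is called with SELECT rather than CALL, and its result can be used inside larger queries.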

How to write script instead of function in pl/pgsql?

I know how to define functions in pl/pgsql ... but (for testing purposes) I would now like to write pl/pgsql as a script. (That is, the code should not be enclosed in a function.) Somehow this does not seem possible. I get syntax errors for things I know are correct (inside a pl/pgsql-function), for example:
declare v_test character varying;
Even this simple one-line script fails.
How can I write a pl/pgsql script?
The PostgreSQL SQL parser doesn't support PL/pgSQL. PL/pgSQL can be parsed (and executed) only inside functions (or procedures) or inside an anonymous block, the DO statement:
DO $$
DECLARE x int DEFAULT 10;
BEGIN
RAISE NOTICE '%', x;
END;
$$;
There is no other possibility.

EXECUTE statement syntax error in PostgreSQL

I am working on an extension of a PostgreSQL library that takes a string representation of a query as input. Basically I need to instantiate the resulting table that this string-based query produces, modify it, and then pass it to another function.
Right now I am just trying to get the query instantiated as a temporary table, so I am using this sample query:
CREATE TEMPORARY TABLE pgr_table (seq INTEGER, path_seq INTEGER, node INTEGER, edge BIGINT, cost DOUBLE PRECISION, agg_cost DOUBLE PRECISION);
EXECUTE 'SELECT gid AS id, source, target, cost, reverse_cost FROM ways;' INTO pgr_table;
But this results in a syntax error, just after the EXECUTE command. Am I not using it correctly?
By the way, I am aware of the dangers of SQL injection and using EXECUTE willy-nilly. The queries that I am making are not designed for front-end use, and I am following the design patterns already set forth by the library which I am modifying.
You are confusing SQL EXECUTE with PL/pgSQL EXECUTE. The first executes a prepared statement and runs in plain SQL (which is what you tried). The second is part of PL/pgSQL function code.
https://www.postgresql.org/docs/current/static/sql-execute.html
EXECUTE — execute a prepared statement
https://www.postgresql.org/docs/current/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-EXECUTING-DYN
Oftentimes you will want to generate dynamic commands inside your
PL/pgSQL functions, that is, commands that will involve different
tables or different data types each time they are executed. PL/pgSQL's
normal attempts to cache plans for commands (as discussed in Section
42.10.2) will not work in such scenarios. To handle this sort of problem, the EXECUTE statement is provided:
examples:
t=# prepare s as select now();
PREPARE
t=# execute s;
now
-------------------------------
2017-12-14 12:47:28.844485+00
(1 row)
and plpgsql:
t=# do
$$
declare
t text;
begin
execute 'select now()' into t;
raise info '%',t;
end;
$$
;
INFO: 2017-12-14 12:48:45.902768+00
DO
Update:
To avoid injection when building dynamic code, use the format function:
https://www.postgresql.org/docs/current/static/functions-string.html
Format arguments according to a format string. This function is
similar to the C function sprintf.
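Applied to the question, one possible shape is a DO block that wraps the query string in CREATE TEMPORARY TABLE ... AS. This is only a sketch: it replaces the separate CREATE TEMPORARY TABLE step (so no table named pgr_table may exist yet), the variable names are made up, and %s pastes the query text in unchanged, so it offers no injection protection for that part.

```sql
DO
$$
DECLARE
   -- the string-based query handed in by the caller (no trailing semicolon)
   _sql text := 'SELECT gid AS id, source, target, cost, reverse_cost FROM ways';
   _n   bigint;
BEGIN
   -- PL/pgSQL EXECUTE runs the assembled command text;
   -- the temp table takes its columns from the query result
   EXECUTE format('CREATE TEMPORARY TABLE pgr_table AS %s', _sql);

   -- EXECUTE ... INTO stores a result in a PL/pgSQL variable, not in a table;
   -- %I quotes the table name as an identifier
   EXECUTE format('SELECT count(*) FROM %I', 'pgr_table') INTO _n;
   RAISE NOTICE 'pgr_table holds % rows', _n;
END
$$;
```

The temporary table lives until the end of the session, so later statements or functions can read and modify it.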

Difference between language sql and language plpgsql in PostgreSQL functions

I am very new to database development, so I have some doubts regarding the following example:
Function f1() - language sql
create or replace function f1(istr varchar)
returns text as $$
select 'hello! '::varchar || istr;
$$ language sql;
Function f2() - language plpgsql
create or replace function f2(istr varchar)
returns text as $$
begin select 'hello! '::varchar || istr; end;
$$ language plpgsql;
Both functions can be called like select f1('world') or select f2('world').
If I call select f1('world') the output will be:
`hello! world`
And output for select f2('world'):
ERROR: query has no destination for result data
HINT: If you want to discard the results of a SELECT, use PERFORM instead.
CONTEXT: PL/pgSQL function f11(character varying) line 2 at SQL statement
********** Error **********
I wish to know the difference and in which situations I should use language sql or language plpgsql.
Any useful links or answers regarding functions will be much appreciated.
SQL functions
... are the better choice:
For simple scalar queries. Not much to plan, better save any overhead.
For single (or very few) calls per session. Nothing to gain from plan caching via prepared statements that PL/pgSQL has to offer. See below.
If they are typically called in the context of bigger queries and are simple enough to be inlined.
For lack of experience with any procedural language like PL/pgSQL. Many know SQL well and that's about all you need for SQL functions. Few can say the same about PL/pgSQL. (Though it's rather simple.)
A bit shorter code. No block overhead.
PL/pgSQL functions
... are the better choice:
When you need any procedural elements or variables that are not available in SQL functions, obviously.
For any kind of dynamic SQL, where you build and EXECUTE statements dynamically. Special care is needed to avoid SQL injection. More details:
Postgres functions vs prepared queries
When you have computations that can be reused in several places and a CTE can't be stretched for the purpose. In an SQL function you don't have variables and would be forced to compute repeatedly or write to a table. This related answer on dba.SE has side-by-side code examples for solving the same problem using an SQL function / a plpgsql function / a query with CTEs:
How to pass a parameter into a function
Assignments are somewhat more expensive than in other procedural languages. Adopt a programming style that doesn't use more assignments than necessary.
When a function cannot be inlined and is called repeatedly. Unlike with SQL functions, query plans can be cached for all SQL statements inside a PL/pgSQL function; they are treated like prepared statements, and the plan is cached for repeated calls within the same session (if Postgres expects the cached (generic) plan to perform better than re-planning every time). That's the reason why PL/pgSQL functions are typically faster after the first couple of calls in such cases.
Here is a thread on pgsql-performance discussing some of these items:
Re: pl/pgsql functions outperforming sql ones?
When you need to trap errors.
For trigger functions.
When including DDL statements changing objects or altering system catalogs in any way relevant to subsequent commands - because all statements in SQL functions are parsed at once while PL/pgSQL functions plan and execute each statement sequentially (like a prepared statement). See:
Why can PL/pgSQL functions have side effect, while SQL functions can't?
Also consider:
PostgreSQL Stored Procedure Performance
To actually return from a PL/pgSQL function, you could write:
CREATE FUNCTION f2(istr varchar)
RETURNS text AS
$func$
BEGIN
RETURN 'hello! '; -- defaults to type text anyway
END
$func$ LANGUAGE plpgsql;
There are other ways:
Can I make a plpgsql function return an integer without using a variable?
The manual on "Returning From a Function"
PL/PgSQL is a PostgreSQL-specific procedural language based on SQL. It has loops, variables, error/exception handling, etc. Not all SQL is valid PL/PgSQL - as you discovered, for example, you can't use SELECT without INTO or RETURN QUERY. PL/PgSQL may also be used in DO blocks for one-shot procedures.
sql functions can only use pure SQL, but they're often more efficient and they're simpler to write because you don't need a BEGIN ... END; block, etc. SQL functions may be inlined, which is not true for PL/PgSQL.
People often use PL/PgSQL where plain SQL would be sufficient, because they're used to thinking procedurally. In most cases when you think you need PL/PgSQL you probably actually don't. Recursive CTEs, lateral queries, etc generally meet most needs.
For more info ... see the manual.
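As an illustration of the "SELECT needs a destination" rule, here is a sketch of f2 that keeps the SELECT but gives it a target variable (the variable name _result is made up):

```sql
CREATE OR REPLACE FUNCTION f2(istr varchar)
  RETURNS text
  LANGUAGE plpgsql AS
$func$
DECLARE
   _result text;
BEGIN
   -- INTO tells PL/pgSQL where the query result goes
   SELECT 'hello! ' || istr INTO _result;
   RETURN _result;
END
$func$;

SELECT f2('world');
```

For an expression this simple, a plain RETURN 'hello! ' || istr; without any SELECT would do the same job.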
Just use the select query you wrote inside the function as the returned value:
create or replace function f2(istr varchar)
returns text as $$
begin return(select 'hello! '::varchar || istr); end;
$$ language plpgsql;

How to convert a big set of SQL queries into a single stored procedure that uses a variable?

I am trying to convert a big list of SQL statements into a PostgreSQL stored procedure that uses a variable, one that should be populated from the result of one SELECT.
If you want to see what has to be run, you can check it here
As far as I know, PostgreSQL does not allow us to use variables inside stored procedures written in the SQL language, so I'm looking for solutions that would require a minimal number of changes.
It's much easier after you find the right syntax:
Here is the procedure definition for plpgsql language:
DECLARE myvar integer;
BEGIN
SELECT INTO myvar FROM ...;
-- use myvar
END;
The code seems to be pretty repetitive. Will EXECUTE be of any help? (manual about execute) (example and more information) It allows you to run predefined queries and create new ones on the fly.
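A self-contained sketch of the pattern, written as a DO block so it can be tried without creating anything. The query against pg_catalog.pg_class is only a stand-in for whatever SELECT should populate the variable:

```sql
DO
$$
DECLARE
   myvar integer;
BEGIN
   -- populate the variable from the result of one SELECT
   SELECT count(*) INTO myvar
   FROM   pg_catalog.pg_class;

   -- use myvar in the statements that follow
   RAISE NOTICE 'myvar = %', myvar;
END
$$;
```

The same DECLARE / BEGIN / END body can be placed inside CREATE FUNCTION ... LANGUAGE plpgsql once the statements need to be reusable.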