How do I use a date as a variable in a PostgreSQL function?

I'm trying to write a function in PostgreSQL 9.5 that takes a date as a parameter. My table has a column called inception_date, and I want the function to return all rows from the table where inception_date is later than the date passed in. Below is my function:
CREATE OR REPLACE FUNCTION somefunc(date) RETURNS setof table AS
$BODY$
DECLARE variable ALIAS FOR $1;
BEGIN
RETURN QUERY SELECT * FROM table WHERE inception_date > variable;
END;
$BODY$
LANGUAGE 'plpgsql';
SELECT somefunc('2014-07-02');
I haven't been able to find any info saying dates are handled differently in PostgreSQL functions than other data types, but this function doesn't display any output, while the query
SELECT * FROM table WHERE inception_date > '2014-07-02';
returns 15 rows. Does anyone know what might be going wrong here?

What is going on here is that you interpret your code differently from the PostgreSQL parser. To you, '2014-07-02' is a date. To the parser, it is a string. The parser tries to be helpful, though: when it sees the string used to filter a date column, it tries to interpret it as a date.
But a date written as a string can be ambiguous: did you mean the seventh of February or the second of July? PostgreSQL reads the ISO form '2014-07-02' as year-month-day, but for other formats the result depends on the DateStyle setting. So it is better not to rely on the implicit conversion.
What I would propose is to create the function with a named parameter of type date and use it as a date. Then, when invoking the function with a string, convert the string to a date explicitly, telling PostgreSQL what you mean.
Example and references to the manual below.
CREATE OR REPLACE FUNCTION somefunc(p_date date)
RETURNS setof table
AS
$BODY$
BEGIN
RETURN QUERY SELECT * FROM table WHERE inception_date > p_date;
END;
$BODY$
LANGUAGE 'plpgsql';
SELECT somefunc(to_date('2014-07-02','YYYY-MM-DD'));
The PostgreSQL manual has more information on date formatting and on creating functions.
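As a small sketch of the difference between an untyped string and a typed date (not part of the original answer), the queries below use the built-in pg_typeof() function. The results in the comments assume a stock PostgreSQL installation.

```sql
-- A bare quoted literal has no type until the parser resolves it from context.
SELECT pg_typeof('2014-07-02');                         -- unknown

-- Three ways to hand over an explicitly typed date instead.
SELECT pg_typeof(DATE '2014-07-02');                    -- date
SELECT pg_typeof('2014-07-02'::date);                   -- date
SELECT pg_typeof(to_date('2014-07-02', 'YYYY-MM-DD'));  -- date
```

Any of the typed forms can be passed to a function that declares a date parameter.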

I have this 'pattern' for using pseudo variables containing dates. It does NOT rely on a function.
I like using a WITH clause (a CTE) for this purpose:
WITH myvar (enddate) AS (VALUES(to_date('2020-06-26', 'YYYY-MM-DD')))
SELECT 'Total downloaded records', sum(d.total_records) FROM download_table d
JOIN myvar m ON date(d.created) BETWEEN '2020-01-01' AND m.enddate
Note that the CTE converts the string to the date type with to_date(). This is needed so that the comparison is made between date values rather than between a date and text.

Related

Compute an aggregated tsrange from a set of entries?

I am trying to compute an aggregated tsrange from a set of rows that I extract with an SQL query. The problem is that I keep getting errors saying that the input parameter is not being passed in.
CREATE OR REPLACE AGGREGATE range_merge(anyrange)
(
sfunc = range_merge,
stype = anyrange
);
DROP FUNCTION IF EXISTS aggregate_validity(entity_name regclass, entry bigint);
CREATE OR REPLACE FUNCTION aggregate_validity(entity_name regclass, entry bigint) returns tsrange AS
$$
DECLARE
result tsrange;
BEGIN
EXECUTE format('select range_merge(valid) from %s where entity_id = %U', entity_name, entry) into result;
return result;
END
$$ LANGUAGE plpgsql;
When I do:
select * from aggregate_validity(country, 1);
I get an error stating that the entity name and entry do not exist. It does not seem to parameterize the input into the statement properly.
Function:
EXECUTE format('select range_merge(valid) from %s where entity_id=%U',entity_name, entry)
into result;
=>
EXECUTE format('select range_merge(valid) from %I where entity_id=%s',entity_name, entry)
into result;
--%I for identifier, %s for value
Call:
select * from aggregate_validity(country, 1)
=>
select * from aggregate_validity('country', 1);
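To illustrate the specifiers, here is a minimal sketch with made-up table names and values. format() only accepts the type specifiers %s, %I and %L, which is why %U in the question is rejected.

```sql
-- %s inserts the argument as plain text.
SELECT format('SELECT * FROM %s WHERE entity_id = %s', 'country', 1);
-- SELECT * FROM country WHERE entity_id = 1

-- %I quotes the argument as an identifier, but only where quoting is needed.
SELECT format('SELECT * FROM %I', 'country');    -- SELECT * FROM country
SELECT format('SELECT * FROM %I', 'My Table');   -- SELECT * FROM "My Table"

-- %L quotes the argument as an SQL literal.
SELECT format('WHERE entity_id = %L', 1);        -- WHERE entity_id = '1'
```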
CREATE OR REPLACE AGGREGATE range_merge(anyrange) (
SFUNC = range_merge
, STYPE = anyrange
);
-- DROP FUNCTION IF EXISTS aggregate_validity(entity_name regclass, entry bigint);
CREATE OR REPLACE FUNCTION aggregate_validity(entity_name regclass, entry bigint, OUT result tsrange)
LANGUAGE plpgsql AS
$func$
BEGIN
EXECUTE 'SELECT range_merge(valid) FROM ' || entity_name || ' WHERE entity_id = $1'
INTO result
USING entry;
END
$func$;
Call:
SELECT aggregate_validity('country', 1);
The call does not need SELECT * FROM, as the function returns a single value per definition.
I used an OUT parameter to simplify (OUT result tsrange). See:
Returning from a function with OUT parameter
Don't concatenate the entry value into the SQL string. Pass it as value with the USING clause. Cleaner, faster.
Since entity_name is passed as regclass, it's safe to simply concatenate (which is a bit cheaper). See:
Table name as a PostgreSQL function parameter
Plus, missing quotes and incorrect format specifiers, as Lukasz already provided.
Your custom aggregate function range_merge() has some caveats:
I wouldn't name it "range_merge", that being the name of the plain function range_merge(), too. While that's legal, it still invites confusing errors.
You are aware that the function range_merge() includes gaps between input ranges in the output range?
range_merge() returns NULL for any NULL input. So if your table has any NULL values in the column valid, the result is always NULL. I strongly suggest that all involved columns be defined as NOT NULL.
If you are at liberty to install additional modules, consider range_agg by Paul Jungwirth, who is also here on Stack Overflow. It provides the superior function range_agg(), addressing some of the mentioned issues.
If you don't want to include gaps, consider the Postgres Wiki page on range aggregation.
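The two caveats about gaps and NULL can be seen with the plain two-argument function. This is a small sketch using int4range values chosen for illustration; tsrange behaves the same way.

```sql
-- The result spans the gap between the two input ranges.
SELECT range_merge('[1,3)'::int4range, '[5,8)'::int4range);   -- [1,8)

-- A NULL input makes the whole result NULL.
SELECT range_merge('[1,3)'::int4range, NULL) IS NULL;         -- true
```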
I would probably not use aggregate_validity() at all. It obscures the nested functionality from the Postgres query planner and may lead to suboptimal query plans. Typically, you can replace it with a correlated or a LATERAL subquery, which can be planned and optimized by Postgres in the context of the outer query. I appended a demo to the db<>fiddle.
Related:
What is the difference between LATERAL and a subquery in PostgreSQL?
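A LATERAL replacement for the function call might look roughly like the sketch below. The tables entity and country, their columns, and the aggregate name range_merge_agg are assumptions made for illustration; they are not taken from the fiddle.

```sql
-- Hypothetical schema, for illustration only.
CREATE TABLE entity  (entity_id bigint PRIMARY KEY);
CREATE TABLE country (entity_id bigint NOT NULL, valid tsrange NOT NULL);

-- Custom aggregate as in the question, under a distinct (assumed) name.
CREATE AGGREGATE range_merge_agg(anyrange) (
   SFUNC = range_merge
 , STYPE = anyrange
);

-- The per-entity aggregate is computed in a LATERAL subquery,
-- so the planner sees the whole query at once.
SELECT e.entity_id, v.validity
FROM   entity e
LEFT   JOIN LATERAL (
   SELECT range_merge_agg(c.valid) AS validity
   FROM   country c
   WHERE  c.entity_id = e.entity_id
   ) v ON true;
```

The LEFT JOIN ... ON true keeps entities that have no rows in country; their validity is NULL.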

How to convert Postgres plpgsql user-defined function to LANGUAGE SQL user-defined function?

My understanding is that within a Postgres database we can write user-defined functions in SQL style and in PL/pgSQL style, and that it should be possible to translate from one to the other. First off, am I conceptually wrong?
Here is an example:
I was trying to convert such code below:
CREATE OR REPLACE FUNCTION getNthHighestSalary(N integer) RETURNS integer
AS $$
BEGIN
return (
select distinct salary
from employee
order by salary
limit 1 offset $1-1);
END;$$ LANGUAGE plpgsql;
into something like:
CREATE OR REPLACE FUNCTION getNthHighestSalary(N integer) RETURNS integer
AS
BEGIN
return (
select distinct salary
from employee
order by salary
limit 1 offset $1-1);
END; LANGUAGE SQL;
No matter what I tried, the converted code won't work in Postgres and always throws a weird syntax error.
So how do I convert the code above into a viable standard SQL function that runs in Postgres? In particular, please explain where the problem is and what the major differences are between standard SQL and PL/pgSQL syntax in Postgres. Thanks a lot.
BTW, here's the code for creating test table and inserting test data:
create table Employee
(
id varchar(255) PRIMARY KEY,
Salary numeric
);
insert into Employee values('1',100),('2',200),('3',300);
If you want to use LANGUAGE SQL, then there are a couple of changes you have to make.
First is to get rid of the BEGIN and END.
Second is to simply state the SELECT query without the RETURN keyword.
There were some other problems: you should order by salary desc, the return type is numeric rather than integer, and the function body has to be quoted, so enclose it in $$ as you did for the plpgsql function.
CREATE OR REPLACE FUNCTION getNthHighestSalary(N integer) RETURNS numeric
AS $$
select distinct salary
from employee
order by salary desc
limit 1 offset $1-1;
$$ LANGUAGE SQL;
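A short usage sketch, assuming the Employee table and the three test rows from the question have been created along with the function above:

```sql
SELECT getNthHighestSalary(1);  -- 300
SELECT getNthHighestSalary(2);  -- 200
SELECT getNthHighestSalary(4);  -- NULL: the query returns no row
```

A scalar LANGUAGE SQL function returns NULL when its final query produces no row, so asking for a rank beyond the number of distinct salaries does not raise an error.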

Postgresql trying to use execute format in a function but getting column not found error when giving string format in coalesce

I'm trying to create a function and specify a date format, but the date format is being taken as a column name because somehow it is not treated as a string inside format(). I have tried %s, quote_ident and everything else, but nothing works. Below are my code and the error I'm getting:
drop function if exists foo(_t text);
create or replace function foo(_t text)
returns TABLE(Stage_ID bigint,Date varchar) as
$func$
begin
return query
execute format('Select Stage_ID,Date
from table
where to_date(Date, "YYYY-MM-DD")==%I',_t);
end
$func$ language plpgsql;
select * from foo('2010-01-01');
ERROR
ERROR: column "YYYY-MM-DD" does not exist
LINE: where TO_DATE(Date, "YYYY-MM-DD") = p...
This might do what you are looking for:
CREATE OR REPLACE FUNCTION foo(_t text)
RETURNS TABLE (Stage_ID bigint, Date varchar) AS
$func$
SELECT t.Stage_ID, t.Date
FROM tbl t
WHERE t.Date = _t::date;
$func$ LANGUAGE sql;
The expression where to_date(Date, "YYYY-MM-DD")==%I',_t); is backwards in multiple ways.
Single quotes for values: 'YYYY-MM-DD'.
The operator is =, not ==.
Seems like you really want t.Date = to_date(_t, 'YYYY-MM-DD')
And while _t is in standard ISO form 'YYYY-MM-DD', rather just cast instead: t.Date = _t::date.
Output column names are visible inside the function body. Table-qualify column of the same name. Better yet, avoid naming conflicts like that to begin with! See:
How to return result of a SELECT inside a function in PostgreSQL?
No need for dynamic SQL with EXECUTE. Passing a data value works just fine with plain SQL.
No need for plpgsql. The simple query does not require any procedural functionality. LANGUAGE sql does the job, if you need a function at all.
Aside: don't use basic type names like "date" as identifier. Stick to legal, lower case identifiers. Related:
Are PostgreSQL column names case-sensitive?
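The quoting rule behind the error message can be checked in isolation. This is a minimal sketch with literal values only, so it does not depend on the table from the question.

```sql
-- Single quotes delimit string values.
SELECT to_date('2010-01-01', 'YYYY-MM-DD');      -- 2010-01-01

-- Double quotes delimit identifiers, so the format is looked up as a column:
-- SELECT to_date('2010-01-01', "YYYY-MM-DD");   -- ERROR: column "YYYY-MM-DD" does not exist

-- For ISO-formatted input, a plain cast gives the same result as to_date().
SELECT '2010-01-01'::date = to_date('2010-01-01', 'YYYY-MM-DD');  -- true
```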

Input table for PL/pgSQL function

I would like to use a plpgsql function with a table and several columns as input parameter. The idea is to split the table in chunks and do something with each part.
I tried the following function:
CREATE OR REPLACE FUNCTION my_func(Integer)
RETURNS SETOF my_part
AS $$
DECLARE
out my_part;
BEGIN
FOR i IN 0..$1 LOOP
FOR out IN
SELECT * FROM my_func2(SELECT * FROM table1 WHERE id = i)
LOOP
RETURN NEXT out;
END LOOP;
END LOOP;
RETURN;
END;
$$
LANGUAGE plpgsql;
my_func2() is the function that does some work on each smaller part.
CREATE or REPLACE FUNCTION my_func2(table1)
RETURNS SETOF my_part2 AS
$$
BEGIN
RETURN QUERY
SELECT * FROM table1;
END
$$
LANGUAGE plpgsql;
If I run:
SELECT * FROM my_func(99);
I guess I should receive the first 99 IDs processed for each id.
But it says there is an error for the following line:
SELECT * FROM my_func2(select * from table1 where id = i)
The error is:
The subquery is only allowed to return one column
Why does this happen? Is there an easy way to fix this?
There are multiple misconceptions here. Study the basics before you try advanced magic.
Postgres does not have "table variables". You can only pass 1 column or row at a time to a function. Use a temporary table or a refcursor (as commented by @Daniel) to pass a whole table. The syntax is invalid in multiple places, so it's unclear whether that's what you are actually trying.
Even if it is: it would probably be better to process one row at a time or rethink your approach and use a set-based operation (plain SQL) instead of passing cursors.
The data types my_part and my_part2 are undefined in your question. May be a shortcoming of the question or a problem in the test case.
You seem to expect that the table name table1 in the function body of my_func2() refers to the function parameter of the same (type!) name, but this is fundamentally wrong in at least two ways:
You can only pass values. A table name is an identifier, not a value. You would need to build a query string dynamically and execute it with EXECUTE in a plpgsql function. Try a search, there are many related answers here on SO. Then again, that may also not be what you wanted.
table1 in CREATE or REPLACE FUNCTION my_func2(table1) is a type name, not a parameter name. It means your function expects a value of the type table1. Obviously, you have a table of the same name, so it's supposed to be the associated row type.
The RETURN type of my_func2() must match what you actually return. Since you are returning SELECT * FROM table1, make that RETURNS SETOF table1.
It can just be a simple SQL function.
All of that put together:
CREATE or REPLACE FUNCTION my_func2(_row table1)
RETURNS SETOF table1 AS
'SELECT ($1).*' LANGUAGE sql;
Note the parentheses, which are essential for decomposing a row type. Per documentation:
The parentheses are required here to show that compositecol is a column name not a table name
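The effect of the parentheses can be tried on a throwaway table. The table demo_item and its columns are hypothetical names used only for this sketch.

```sql
-- Hypothetical demo table.
CREATE TABLE demo_item (item_id int, txt text);
INSERT INTO demo_item VALUES (1, 'a');

-- r is a column of the composite type demo_item, so it must be parenthesized.
SELECT (r).txt FROM (SELECT d AS r FROM demo_item d) sub;   -- a
SELECT (r).*   FROM (SELECT d AS r FROM demo_item d) sub;   -- 1 | a

-- Without parentheses, r is parsed as a table name and the query fails:
-- SELECT r.txt FROM (SELECT d AS r FROM demo_item d) sub;
```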
But there is more ...
Don't use out as variable name, it's a keyword of the CREATE FUNCTION statement.
The syntax of your main function my_func() is more like pseudo-code. Too much doesn't add up.
Proof of concept
Demo table:
CREATE TABLE table1(table1_id serial PRIMARY KEY, txt text);
INSERT INTO table1(txt) VALUES ('a'),('b'),('c'),('d'),('e'),('f'),('g');
Helper function:
CREATE or REPLACE FUNCTION my_func2(_row table1)
RETURNS SETOF table1 AS
'SELECT ($1).*' LANGUAGE sql;
Main function:
CREATE OR REPLACE FUNCTION my_func(int)
RETURNS SETOF table1 AS
$func$
DECLARE
rec table1;
BEGIN
FOR i IN 0..$1 LOOP
FOR rec IN
SELECT * FROM table1 WHERE table1_id = i
LOOP
RETURN QUERY
SELECT * FROM my_func2(rec);
END LOOP;
END LOOP;
END
$func$ LANGUAGE plpgsql;
Call:
SELECT * FROM my_func(99);
SQL Fiddle.
But it's really just a proof of concept. Nothing useful, yet.
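For comparison, the set-based alternative mentioned earlier could be sketched as a single query. It assumes table1 and my_func2() from the demo above and replaces both loops with a range filter.

```sql
-- One query instead of nested loops: each qualifying row is passed to my_func2().
SELECT f.*
FROM   table1 t
CROSS  JOIN LATERAL my_func2(t) f
WHERE  t.table1_id BETWEEN 0 AND 99
ORDER  BY t.table1_id;
```

Passing the table alias t hands the whole row to the function, which matches the row-type parameter of my_func2().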
As the error message tells you, a subquery used as an expression can return only one column, so you would have to change it to
SELECT my_func2((SELECT Specific_column_you_need FROM hasval WHERE wid = i))
A possible solution is to pass my_func2 the primary key of the table it needs; the function can then fetch the whole row itself with a SELECT * inside the function.

How to return result of a SELECT inside a function in PostgreSQL?

I have this function in PostgreSQL, but I don't know how to return the result of the query:
CREATE OR REPLACE FUNCTION wordFrequency(maxTokens INTEGER)
RETURNS SETOF RECORD AS
$$
BEGIN
SELECT text, count(*), 100 / maxTokens * count(*)
FROM (
SELECT text
FROM token
WHERE chartype = 'ALPHABETIC'
LIMIT maxTokens
) as tokens
GROUP BY text
ORDER BY count DESC
END
$$
LANGUAGE plpgsql;
But I don't know how to return the result of the query inside the PostgreSQL function.
I found that the return type should be SETOF RECORD, right? But the return command is not right.
What is the right way to do this?
Use RETURN QUERY:
CREATE OR REPLACE FUNCTION word_frequency(_max_tokens int)
RETURNS TABLE (txt text -- also visible as OUT param in function body
, cnt bigint
, ratio bigint)
LANGUAGE plpgsql AS
$func$
BEGIN
RETURN QUERY
SELECT t.txt
, count(*) AS cnt -- column alias only visible in this query
, (count(*) * 100) / _max_tokens -- I added parentheses
FROM (
SELECT t.txt
FROM token t
WHERE t.chartype = 'ALPHABETIC'
LIMIT _max_tokens
) t
GROUP BY t.txt
ORDER BY cnt DESC; -- potential ambiguity
END
$func$;
Call:
SELECT * FROM word_frequency(123);
Defining the return type explicitly is much more practical than returning a generic record. This way you don't have to provide a column definition list with every function call. RETURNS TABLE is one way to do that. There are others. Data types of OUT parameters have to match exactly what is returned by the query.
Choose names for OUT parameters carefully. They are visible in the function body almost anywhere. Table-qualify columns of the same name to avoid conflicts or unexpected results. I did that for all columns in my example.
But note the potential naming conflict between the OUT parameter cnt and the column alias of the same name. In this particular case (RETURN QUERY SELECT ...) Postgres uses the column alias over the OUT parameter either way. This can be ambiguous in other contexts, though. There are various ways to avoid any confusion:
Use the ordinal position of the item in the SELECT list: ORDER BY 2 DESC. Example:
Select first row in each GROUP BY group?
Repeat the expression ORDER BY count(*).
(Not required here.) Set the configuration parameter plpgsql.variable_conflict or use the special command #variable_conflict error | use_variable | use_column in the function. See:
Naming conflict between function parameter and result of JOIN with USING clause
Don't use "text" or "count" as column names. Both are legal to use in Postgres, but "count" is a reserved word in standard SQL and a basic function name and "text" is a basic data type. Can lead to confusing errors. I use txt and cnt in my examples, you may want more explicit names.
Added a missing ; and corrected a syntax error in the header. (_max_tokens int), not (int maxTokens) - data type after name.
While working with integer division, it's better to multiply first and divide later, to minimize the rounding error. Or work with numeric or a floating point type. See below.
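The effect of the evaluation order is easy to see with plain integer literals. The numbers below are arbitrary and chosen only for illustration.

```sql
SELECT 100 / 300 * 5;               -- 0    (100 / 300 is truncated to 0 first)
SELECT (5 * 100) / 300;             -- 1    (only the final result is truncated)
SELECT round(5 * 100.0 / 300, 2);   -- 1.67 (numeric arithmetic, no truncation)
```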
Alternative
This is what I think your query should actually look like (calculating a relative share per token):
CREATE OR REPLACE FUNCTION word_frequency(_max_tokens int)
RETURNS TABLE (txt text
, abs_cnt bigint
, relative_share numeric)
LANGUAGE plpgsql AS
$func$
BEGIN
RETURN QUERY
SELECT t.txt, t.cnt
, round((t.cnt * 100) / (sum(t.cnt) OVER ()), 2) -- AS relative_share
FROM (
SELECT t.txt, count(*) AS cnt
FROM token t
WHERE t.chartype = 'ALPHABETIC'
GROUP BY t.txt
ORDER BY cnt DESC
LIMIT _max_tokens
) t
ORDER BY t.cnt DESC;
END
$func$;
The expression sum(t.cnt) OVER () is a window function. You could use a CTE instead of the subquery. Pretty, but a subquery is typically cheaper in simple cases like this one (mostly before Postgres 12).
A final explicit RETURN statement is not required (but allowed) when working with OUT parameters or RETURNS TABLE (which makes implicit use of OUT parameters).
round() with two parameters only works for numeric types. count() in the subquery produces a bigint result and a sum() over this bigint produces a numeric result, thus we deal with a numeric number automatically and everything just falls into place.
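The type chain described here can be verified with pg_typeof(). This is a small sketch on generated values rather than on the token table.

```sql
-- count() returns bigint.
SELECT pg_typeof(count(*)) FROM generate_series(1, 3);              -- bigint

-- sum() over bigint returns numeric.
SELECT pg_typeof(sum(c)) FROM (VALUES (1::bigint), (2)) v(c);       -- numeric

-- round() with a second argument accepts numeric ...
SELECT round(2::numeric / 3, 2);                                    -- 0.67

-- ... but there is no two-argument round() for double precision:
-- SELECT round(2::float8 / 3, 2);
```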
Please see the following link for documentation:
https://www.postgresql.org/docs/current/xfunc-sql.html
Example:
CREATE FUNCTION sum_n_product_with_tab (x int)
RETURNS TABLE(sum int, product int) AS $$
SELECT $1 + tab.y, $1 * tab.y FROM tab;
$$ LANGUAGE SQL;