How to put part of a SELECT statement into a Postgres function - sql

I have part of a SELECT statement that is a pretty lengthy set of conditional statements. I want to put it into a function so I can call it much more efficiently on any table I need to use it on.
So instead of:
SELECT
itemnumber,
itemname,
base,
CASE
WHEN labor < 100 AND overhead < .20 THEN
WHEN .....
WHEN .....
WHEN .....
.....
END AS add_cost,
gpm
FROM items1;
I can just do:
SELECT
itemnumber,
itemname,
base,
calc_add_cost(),
gpm
FROM items1;
Is it possible to add part of a SELECT to a function so that can be injected just by calling the function?
I am sorting through documentation and Google, and it seems like it might be possible if creating the function in the plpgsql language as opposed to sql. However, what I am reading isn't very clear.

You cannot just wrap any part of a SELECT statement into a function. But an expression like your CASE can easily be wrapped:
CREATE OR REPLACE FUNCTION pg_temp.calc_add_cost(_labor integer, _overhead numeric)
RETURNS numeric AS
$func$
SELECT
CASE
WHEN _labor < 100 AND _overhead < .20 THEN numeric '1' -- example value
-- WHEN .....
-- WHEN .....
-- WHEN .....
ELSE numeric '0' -- example value
END;
$func$
LANGUAGE sql IMMUTABLE;
While you could also use PL/pgSQL for this, the demo is a simple SQL function. See:
Difference between language sql and language plpgsql in PostgreSQL functions
Adapt input and output data types to your need. Just guessing integer and numeric for lack of information.
Call:
SELECT calc_add_cost(1, 0.1);
In your statement:
SELECT
itemnumber,
itemname,
base,
calc_add_cost(labor, overhead) AS add_cost, -- pass column values as input
gpm
FROM items1;
You should understand some basics about Postgres functions to make proper use. Might start with the manual page on CREATE FUNCTION.
There are also many related questions & answers here on SO.
Also see the related, more sophisticated case passing whole rows here:
Pass the table name used in FROM to function automatically in PostgreSQL 9.6.3

Related

Pass table name used in FROM to function automatically?

Working with PostgreSQL 9.6.3. I am new to functions in databases.
Let's say there are multiple tables of item numbers. Each one has the item number, the item cost and several other columns which are factored into the "additional cost". I would like to put the calculation into a function so I can call it for any of these tables.
So instead of:
SELECT
itemnumber,
itemname,
base,
CASE
WHEN labor < 100 AND overhead < .20 THEN
WHEN .....
WHEN .....
WHEN .....
.....
END AS add_cost,
gpm
FROM items1;
I can just do:
SELECT
itemnumber,
itemname,
base,
calc_add_cost(),
gpm
FROM items1;
If I want to be able to use it on any of the item tables, I guess I would need to set a table_name parameter that the function takes since adding the table name into the function would be undesirable to say the least.
calc_add_cost(items1)
However, is there a simpler way such that when I call calc_add_cost() it will just use the table name from the FROM clause?
SELECT ....., calc_add_cost(item1) FROM item1
Just seems redundant.
I did come across a few topics with titles that sounded like they addressed what I was hoping to accomplish, but upon reviewing them it looked like they were a different issue.
You can even emulate a "computed field" or "generated column" like you had in mind. Basics here:
Store common query as column?
Simple demo for one table:
CREATE OR REPLACE FUNCTION add_cost(items1) -- function name = default col name
RETURNS numeric AS
$func$
SELECT
CASE
WHEN $1.labor < 100 AND $1.overhead < .20 THEN numeric '1'
-- WHEN .....
-- WHEN .....
-- WHEN .....
ELSE numeric '0' -- ?
END;
$func$
LANGUAGE sql IMMUTABLE;
Call:
SELECT *, t.add_cost FROM items1 t;
Note the table-qualification in t.add_cost. I only demonstrate this syntax variant since you have been asking for it. My advise is to use the less confusing standard syntax:
SELECT *, add_cost(t) AS add_cost FROM items1 t; -- column alias is also optional
However, SQL is a strictly typed language. If you define a particular row type as input parameter, it is bound to this particular row type. Passing various whole table types is more sophisticated, but still possible with polymorphic input type.
CREATE OR REPLACE FUNCTION add_cost(ANYELEMENT) -- function name = default col name
RETURNS numeric AS
$func$
SELECT
CASE
WHEN $1.labor < 100 AND $1.overhead < .20 THEN numeric '1'
-- WHEN .....
-- WHEN .....
-- WHEN .....
ELSE numeric '0' -- ?
END;
$func$
LANGUAGE sql IMMUTABLE;
Same call for any table that has the columns labor and overhead with matching data type.
dbfiddle here
Also see the related simple case passing simple values here:
How to put part of a SELECT statement into a Postgres function
For even more complex requirements - like also returning various row types - see:
Refactor a PL/pgSQL function to return the output of various SELECT queries

Advantage of using a SQL function rather than a SELECT

In a Postgresql 9.0 database, I can create a url for a webpage with a SELECT statement using an integer (profile_id) in the WHERE clause.
In the past I've simply done this SELECT whenever it is convenient, for instance using a subquery as a column/field in a view. But I recently realized I could create a SQL function to do the same thing. (This is a SQL function, NOT plpgsql).
I want to know if there is an advantage, mostly in terms of resources that are being spent, in using a function rather than a SELECT in a case like this? See below and thanks in advance. I could not find anything on this topic elsewhere on this site. (Long-time reader, first-time caller).
The function is below.
CREATE OR REPLACE FUNCTION msurl(integer)
RETURNS text AS
$BODY$
SELECT (('https://www.thenameofmywebsite/'::text ||
CASE
WHEN prof.type = 1 THEN 'm'::text
ELSE 'f'::text
END) || '/handler/'::text) || prof.profile_id AS profile_url
FROM profile prof
WHERE prof.profile_id = $1;
$BODY$
LANGUAGE sql
To get my url I can either use
SELECT prof.name,
SELECT (('https://www.thenameofmywebsite/'::text ||
CASE
WHEN prof.type = 1 THEN 'm'::text
ELSE 'f'::text
END) || '/handler/'::text) || prof.profile_id AS profile_url, prof.start_date
FROM profile prof,
WHERE prof.profile_id = id_number;
OR the tidier version:
SELECT prof.name, msurl(id_number) as profile_url, prof.start_date FROM profile prof;
The way you are using the function will not have any advantage - the opposite is the case: it will slow down your select drastically. Because for each row that is retrieved from the main select statement (the one calling the function) you are running another select on the same table.
Functions do have an advantage when you want to encapsulate the logic of building the url. But you need to write the function differently to be more efficient by passing it the row you want to work with:
CREATE OR REPLACE FUNCTION msurl(profile)
RETURNS text AS
$BODY$
SELECT (('https://www.thenameofmywebsite/' ||
CASE
WHEN $1.type = 1 THEN 'm'
ELSE 'f'
END) || '/handler/' || $1.profile_id:: AS profile_url;
$BODY$
LANGUAGE sql;
Another option would have been to pass all columns that you need separately, but by passing the row (type) the function signature (and thus calls to it) will not need to be changed if the logic changes and you need more or less columns from the table.
You can then call this with the following syntax:
SELECT prof.name,
msurl( (prof) ) as profile_url,
prof.start_date
FROM profile prof;
Note that the alias must be enclosed in parentheses (prof) when passing it to the function. The additional parentheses are not optional here.
That way the function still gets called for each row, but it doesn't run another select on the profile table.
Due to the object oriented way Postgres treats such a function you can even call it as it it was a column of the table:
SELECT prof.name,
prof.msurl as profile_url,
prof.start_date
FROM profile prof;
Sense of functions (sql functions) is encapsulation. With functions some fragments of your SQL statements has name, semantics - you can reuse it, you can build a libraries. There are no any other benefits like performance - it has impact just only on your code readability.

Pass parameters to view (set returning function?)

I basically have a pretty complicated view that currently returns what I want from last-week aggregations.
SELECT *
FROM ...
WHERE t1.event_date >= ('now'::text::date - 7)
...
Now I'd like to be able to make the same calculations for any given 2 dates, so I want to be able to pass parameters to my view. Pretty much I want to replace the WHERE clause with something like:
WHERE t1.event_date BETWEEN %first date% AND %second_date%
I think I should be able to do this with a set returning function but can't figure exactly how to do it. Any ideas?
Create a function (sometimes called table-function, when returning a set of rows). There are many (procedural) languages, in particular PL/pgSQL. In your case LANGUAGE sql does the job:
CREATE OR REPLACE FUNCTION get_user_by_username(d1 date, d2 date)
RETURNS TABLE ( ... ) AS
$body$
SELECT ...
WHERE t1.event_date >= $1
AND t1.event_date < $2 -- note how I exclude upper border
$body$ LANGUAGE sql;
One of many examples for LANGUAGE sql:
Return rows from a PL/pgSQL function
One of many examples for LANGUAGE plpgsql:
PostgreSQL: ERROR: 42601: a column definition list is required for functions returning "record"

Selecting chained records from a table [duplicate]

I have a table similar to this:
CREATE TABLE example (
id integer primary key,
name char(200),
parentid integer,
value integer);
I can use the parentid field to arrange data into a tree structure.
Now here's the bit I can't work out. Given a parentid, is it possible to write an SQL statement to add up all the value fields under that parentid and recurse down the branch of the tree ?
UPDATE: I'm using posgreSQL so the fancy MS-SQL features are not available to me. In any case, I'd like this to be treated as a generic SQL question.
BTW, I'm very impressed to have 6 answers within 15 minutes of asking the question! Go stack overflow!
Here is an example script using common table expression:
with recursive sumthis(id, val) as (
select id, value
from example
where id = :selectedid
union all
select C.id, C.value
from sumthis P
inner join example C on P.id = C.parentid
)
select sum(val) from sumthis
The script above creates a 'virtual' table called sumthis that has columns id and val. It is defined as the result of two selects merged with union all.
First select gets the root (where id = :selectedid).
Second select follows the children of the previous results iteratively until there is nothing to return.
The end result can then be processed like a normal table. In this case the val column is summed.
Since version 8.4, PostgreSQL has recursive query support for common table expressions using the SQL standard WITH syntax.
If you want a portable solution that will work on any ANSI SQL-92 RDBMS, you will need to add a new column to your table.
Joe Celko is the original author of the Nested Sets approach to storing hierarchies in SQL. You can Google "nested sets" hierarchy to understand more about the background.
Or you can just rename parentid to leftid and add a rightid.
Here is my attempt to summarize Nested Sets, which will fall woefully short because I'm no Joe Celko: SQL is a set-based language, and the adjacency model (storing parent ID) is NOT a set-based representation of a hierarchy. Therefore there is no pure set-based method to query an adjacency schema.
However, most of the major platforms have introduced extensions in recent years to deal with this precise problem. So if someone replies with a Postgres-specific solution, use that by all means.
There are a few ways to do what you need in PostgreSQL.
If you can install modules, look at the tablefunc contrib. It has a connectby() function that handles traversing trees. http://www.postgresql.org/docs/8.3/interactive/tablefunc.html
Also check out the ltree contrib, which you could adapt your table to use: http://www.postgresql.org/docs/8.3/interactive/ltree.html
Or you can traverse the tree yourself with a PL/PGSQL function.
Something like this:
create or replace function example_subtree (integer)
returns setof example as
'declare results record;
child record;
begin
select into results * from example where parent_id = $1;
if found then
return next results;
for child in select id from example
where parent_id = $1
loop
for temp in select * from example_subtree(child.id)
loop
return next temp;
end loop;
end loop;
end if;
return null;
end;' language 'plpgsql';
select sum(value) as value_sum
from example_subtree(1234);
A standard way to make a recursive query in SQL are recursive CTE. PostgreSQL supports them since 8.4.
In earlier versions, you can write a recursive set-returning function:
CREATE FUNCTION fn_hierarchy (parent INT)
RETURNS SETOF example
AS
$$
SELECT example
FROM example
WHERE id = $1
UNION ALL
SELECT fn_hierarchy(id)
FROM example
WHERE parentid = $1
$$
LANGUAGE 'sql';
SELECT *
FROM fn_hierarchy(1)
See this article:
Hierarchical queries in PostgreSQL
If your using SQL Server 2005, there is a really cool way to do this using Common Table Expressions.
It takes all of the gruntwork out of creating a temporary table, and basicly allows you to do it all with just a WITH and a UNION.
Here is a good tutorial:
http://searchwindevelopment.techtarget.com/tip/0,289483,sid8_gci1278207,00.html
use a common table expression.
May want to indicate this is SQL Server 2005 or above only. Dale Ragan
here's an article on recursion by SqlTeam without common table expressions.
The following code compiles and it's tested OK.
create or replace function subtree (bigint)
returns setof example as $$
declare
results record;
entry record;
recs record;
begin
select into results * from example where parent = $1;
if found then
for entry in select child from example where parent = $1 and child parent loop
for recs in select * from subtree(entry.child) loop
return next recs;
end loop;
end loop;
end if;
return next results;
end;
$$ language 'plpgsql';
The condition "child <> parent" is needed in my case because nodes point to themselves.
Have fun :)
Oracle has "START WITH" and "CONNECT BY"
select
lpad(' ',2*(level-1)) || to_char(child) s
from
test_connect_by
start with parent is null
connect by prior child = parent;
http://www.adp-gmbh.ch/ora/sql/connect_by.html
Just as a brief aside although the question has been answered very well, it should be noted that if we treat this as a:
generic SQL question
then the SQL implementation is fairly straight-forward, as SQL'99 allows linear recursion in the specification (although I believe no RDBMSs implement the standard fully) through the WITH RECURSIVE statement. So from a theoretical perspective we can do this right now.
None of the examples worked OK for me so I've fixed it like this:
declare
results record;
entry record;
recs record;
begin
for results in select * from project where pid = $1 loop
return next results;
for recs in select * from project_subtree(results.id) loop
return next recs;
end loop;
end loop;
return;
end;
is this SQL Server? Couldn't you write a TSQL stored procedure that loops through and unions the results together?
I am also interested if there is a SQL-only way of doing this though. From the bits I remember from my geographic databases class, there should be.
I think it is easier in SQL 2008 with HierarchyID
If you need to store arbitrary graphs, not just hierarchies, you could push Postgres to the side and try a graph database such as AllegroGraph:
Everything in the graph database is stored as a triple (source node, edge, target node) and it gives you first class support for manipulating the graph structure and querying it using a SQL like language.
It doesn't integrate well with something like Hibernate or Django ORM but if you are serious about graph structures (not just hierarchies like the Nested Set model gives you) check it out.
I also believe Oracle has finally added a support for real Graphs in their latest products, but I'm amazed it's taken so long, lots of problems could benefit from this model.

Is it possible to make a recursive SQL query?

I have a table similar to this:
CREATE TABLE example (
id integer primary key,
name char(200),
parentid integer,
value integer);
I can use the parentid field to arrange data into a tree structure.
Now here's the bit I can't work out. Given a parentid, is it possible to write an SQL statement to add up all the value fields under that parentid and recurse down the branch of the tree ?
UPDATE: I'm using posgreSQL so the fancy MS-SQL features are not available to me. In any case, I'd like this to be treated as a generic SQL question.
BTW, I'm very impressed to have 6 answers within 15 minutes of asking the question! Go stack overflow!
Here is an example script using common table expression:
with recursive sumthis(id, val) as (
select id, value
from example
where id = :selectedid
union all
select C.id, C.value
from sumthis P
inner join example C on P.id = C.parentid
)
select sum(val) from sumthis
The script above creates a 'virtual' table called sumthis that has columns id and val. It is defined as the result of two selects merged with union all.
First select gets the root (where id = :selectedid).
Second select follows the children of the previous results iteratively until there is nothing to return.
The end result can then be processed like a normal table. In this case the val column is summed.
Since version 8.4, PostgreSQL has recursive query support for common table expressions using the SQL standard WITH syntax.
If you want a portable solution that will work on any ANSI SQL-92 RDBMS, you will need to add a new column to your table.
Joe Celko is the original author of the Nested Sets approach to storing hierarchies in SQL. You can Google "nested sets" hierarchy to understand more about the background.
Or you can just rename parentid to leftid and add a rightid.
Here is my attempt to summarize Nested Sets, which will fall woefully short because I'm no Joe Celko: SQL is a set-based language, and the adjacency model (storing parent ID) is NOT a set-based representation of a hierarchy. Therefore there is no pure set-based method to query an adjacency schema.
However, most of the major platforms have introduced extensions in recent years to deal with this precise problem. So if someone replies with a Postgres-specific solution, use that by all means.
There are a few ways to do what you need in PostgreSQL.
If you can install modules, look at the tablefunc contrib. It has a connectby() function that handles traversing trees. http://www.postgresql.org/docs/8.3/interactive/tablefunc.html
Also check out the ltree contrib, which you could adapt your table to use: http://www.postgresql.org/docs/8.3/interactive/ltree.html
Or you can traverse the tree yourself with a PL/PGSQL function.
Something like this:
create or replace function example_subtree (integer)
returns setof example as
'declare results record;
child record;
begin
select into results * from example where parent_id = $1;
if found then
return next results;
for child in select id from example
where parent_id = $1
loop
for temp in select * from example_subtree(child.id)
loop
return next temp;
end loop;
end loop;
end if;
return null;
end;' language 'plpgsql';
select sum(value) as value_sum
from example_subtree(1234);
A standard way to make a recursive query in SQL are recursive CTE. PostgreSQL supports them since 8.4.
In earlier versions, you can write a recursive set-returning function:
CREATE FUNCTION fn_hierarchy (parent INT)
RETURNS SETOF example
AS
$$
SELECT example
FROM example
WHERE id = $1
UNION ALL
SELECT fn_hierarchy(id)
FROM example
WHERE parentid = $1
$$
LANGUAGE 'sql';
SELECT *
FROM fn_hierarchy(1)
See this article:
Hierarchical queries in PostgreSQL
If your using SQL Server 2005, there is a really cool way to do this using Common Table Expressions.
It takes all of the gruntwork out of creating a temporary table, and basicly allows you to do it all with just a WITH and a UNION.
Here is a good tutorial:
http://searchwindevelopment.techtarget.com/tip/0,289483,sid8_gci1278207,00.html
use a common table expression.
May want to indicate this is SQL Server 2005 or above only. Dale Ragan
here's an article on recursion by SqlTeam without common table expressions.
The following code compiles and it's tested OK.
create or replace function subtree (bigint)
returns setof example as $$
declare
results record;
entry record;
recs record;
begin
select into results * from example where parent = $1;
if found then
for entry in select child from example where parent = $1 and child parent loop
for recs in select * from subtree(entry.child) loop
return next recs;
end loop;
end loop;
end if;
return next results;
end;
$$ language 'plpgsql';
The condition "child <> parent" is needed in my case because nodes point to themselves.
Have fun :)
Oracle has "START WITH" and "CONNECT BY"
select
lpad(' ',2*(level-1)) || to_char(child) s
from
test_connect_by
start with parent is null
connect by prior child = parent;
http://www.adp-gmbh.ch/ora/sql/connect_by.html
Just as a brief aside although the question has been answered very well, it should be noted that if we treat this as a:
generic SQL question
then the SQL implementation is fairly straight-forward, as SQL'99 allows linear recursion in the specification (although I believe no RDBMSs implement the standard fully) through the WITH RECURSIVE statement. So from a theoretical perspective we can do this right now.
None of the examples worked OK for me so I've fixed it like this:
declare
results record;
entry record;
recs record;
begin
for results in select * from project where pid = $1 loop
return next results;
for recs in select * from project_subtree(results.id) loop
return next recs;
end loop;
end loop;
return;
end;
is this SQL Server? Couldn't you write a TSQL stored procedure that loops through and unions the results together?
I am also interested if there is a SQL-only way of doing this though. From the bits I remember from my geographic databases class, there should be.
I think it is easier in SQL 2008 with HierarchyID
If you need to store arbitrary graphs, not just hierarchies, you could push Postgres to the side and try a graph database such as AllegroGraph:
Everything in the graph database is stored as a triple (source node, edge, target node) and it gives you first class support for manipulating the graph structure and querying it using a SQL like language.
It doesn't integrate well with something like Hibernate or Django ORM but if you are serious about graph structures (not just hierarchies like the Nested Set model gives you) check it out.
I also believe Oracle has finally added a support for real Graphs in their latest products, but I'm amazed it's taken so long, lots of problems could benefit from this model.