Postgres analogue to CROSS APPLY in SQL Server - sql

I need to migrate SQL queries written for MS SQL Server 2005 to Postgres 9.1.
What is the best way to substitute for CROSS APPLY in this query?
SELECT *
FROM V_CitizenVersions
CROSS APPLY
dbo.GetCitizenRecModified(Citizen, LastName, FirstName, MiddleName,
BirthYear, BirthMonth, BirthDay, ..... ) -- lots of params
GetCitizenRecModified() function is a table valued function. I can't place code of this function because it's really enormous, it makes some difficult computations and I can't abandon it.

In Postgres 9.3 or later use a LATERAL join:
SELECT v.col_a, v.col_b, f.* -- no parentheses, f is a table alias
FROM v_citizenversions v
LEFT JOIN LATERAL f_citizen_rec_modified(v.col1, v.col2) f ON true
WHERE f.col_c = _col_c;
Why LEFT JOIN LATERAL ... ON true?
Record returned from function has columns concatenated
For older versions, there is a very simple way to accomplish what I think you are trying to with a set-returning function (RETURNS TABLE or RETURNS SETOF record OR RETURNS record):
SELECT *, (f_citizen_rec_modified(col1, col2)).*
FROM v_citizenversions v
The function computes values once for every row of the outer query. If the function returns multiple rows, resulting rows are multiplied accordingly. All parentheses are syntactically required to decompose a row type. The table function could look something like this:
CREATE OR REPLACE FUNCTION f_citizen_rec_modified(_col1 int, _col2 text)
RETURNS TABLE(col_c integer, col_d text)
LANGUAGE sql AS
$func$
SELECT s.col_c, s.col_d
FROM some_tbl s
WHERE s.col_a = $1
AND s.col_b = $2
$func$;
You need to wrap this in a subquery or CTE if you want to apply a WHERE clause because the columns are not visible on the same level. (And it's better for performance anyway, because you prevent repeated evaluation for every output column of the function):
SELECT col_a, col_b, (f_row).*
FROM (
SELECT col_a, col_b, f_citizen_rec_modified(col1, col2) AS f_row
FROM v_citizenversions v
) x
WHERE (f_row).col_c = _col_c;
There are several other ways to do this or something similar. It all depends on what you want exactly.

Necromancing:
New in PostgreSQL 9.3:
The LATERAL keyword
left | right | inner JOIN LATERAL
INNER JOIN LATERAL is the same as CROSS APPLY
and LEFT JOIN LATERAL is the same as OUTER APPLY
Example usage:
SELECT * FROM T_Contacts
--LEFT JOIN T_MAP_Contacts_Ref_OrganisationalUnit ON MAP_CTCOU_CT_UID = T_Contacts.CT_UID AND MAP_CTCOU_SoftDeleteStatus = 1
--WHERE T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_UID IS NULL -- 989
LEFT JOIN LATERAL
(
SELECT
--MAP_CTCOU_UID
MAP_CTCOU_CT_UID
,MAP_CTCOU_COU_UID
,MAP_CTCOU_DateFrom
,MAP_CTCOU_DateTo
FROM T_MAP_Contacts_Ref_OrganisationalUnit
WHERE MAP_CTCOU_SoftDeleteStatus = 1
AND MAP_CTCOU_CT_UID = T_Contacts.CT_UID
/*
AND
(
(__in_DateFrom <= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateTo)
AND
(__in_DateTo >= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateFrom)
)
*/
ORDER BY MAP_CTCOU_DateFrom
LIMIT 1
) AS FirstOE

I like Erwin Brandstetter's answer however, I've discovered a performance problem:
when running
SELECT *, (f_citizen_rec_modified(col1, col2)).*
FROM v_citizenversions v
The f_citizen_rec_modified function will be ran 1 time for every column it returns (multiplied by every row in v_citizenversions). I did not find documentation for this effect, but was able to deduce it by debugging. Now the question becomes, how can we get this effect (prior to 9.3 where lateral joins are available) without this performance robbing side effect?
Update: I seem to have found an answer. Rewrite the query as follows:
select x.col1, x.col2, x.col3, (x.func).*
FROM (select SELECT v.col1, v.col2, v.col3, f_citizen_rec_modified(col1, col2) func
FROM v_citizenversions v) x
The key difference being getting the raw function results first (inner subquery) then wrapping that in another select that busts those results out into the columns. This was tested on PG 9.2

This link appears to show how to do it in Postgres 9.0+:
PostgreSQL: parameterizing a recursive CTE
It's further down the page in the section titled "Emulating CROSS APPLY with set-returning functions". Please be sure to note the list of limitations after the example.

Related

Select rows according to another table with a comma-separated list of items

Have a table test.
select b from test
b is a text column and contains Apartment,Residential
The other table is a parcel table with a classification column. I'd like to use test.b to select the right classifications in the parcels table.
select * from classi where classification in(select b from test)
this returns no rows
select * from classi where classification =any(select '{'||b||'}' from test)
same story with this one
I may make a function to loop through the b column but I'm trying to find an easier solution
Test case:
create table classi as
select 'Residential'::text as classification
union
select 'Apartment'::text as classification
union
select 'Commercial'::text as classification;
create table test as
select 'Apartment,Residential'::text as b;
You don't actually need to unnest the array:
SELECT c.*
FROM classi c
JOIN test t ON c.classification = ANY (string_to_array(t.b, ','));
db<>fiddle here
The problem is that = ANY takes a set or an array, and IN takes a set or a list, and your ambiguous attempts resulted in Postgres picking the wrong variant. My formulation makes Postgres expect an array as it should.
For a detailed explanation see:
How to match elements in an array of composite type?
IN vs ANY operator in PostgreSQL
Note that my query also works for multiple rows in table test. Your demo only shows a single row, which is a corner case for a table ...
But also note that multiple rows in test may produce (additional) duplicates. You'd have to fold duplicates or switch to a different query style to get de-duplicate. Like:
SELECT c.*
FROM classi c
WHERE EXISTS (
SELECT FROM test t
WHERE c.classification = ANY (string_to_array(t.b, ','))
);
This prevents duplication from elements within a single test.b, as well as from across multiple test.b. EXISTS returns a single row from classi per definition.
The most efficient query style depends on the complete picture.
You need to first split b into an array and then get the rows. A couple of alternatives:
select * from nj.parcels p where classification = any(select unnest(string_to_array(b, ',')) from test)
select p.* from nj.parcels p
INNER JOIN (select unnest(string_to_array(b, ',')) from test) t(classification) ON t.classification = p.classification;
Essential to both is the unnest surrounding string_to_array.

Pass dynamic parameters into a Table Valued Function [duplicate]

I have a table-value function that takes an ID number of a person and returns a few rows and columns. In another query, I am creating a SELECT that retrieves a lot of information about many people. How can I pass an id number from my main query to my function to sum a column and join it to my main query? I wish I didn't have a table value function since that would work easily, however, this function is used elsewhere and I'd like to reuse it. Perhaps this isn't even possible with a table-value function and I need to create a scalar one.
My main Query looks like this:
select id_num, name, balance
from listOfPeople
And the table-value function looks like this:
calculatePersonalDiscount(id_number)
I would like to do something like:
select id_num, name, balance
from listOfPeople
left join
(
SELECT id_num, SUM(discount)
FROM calculatePersonalDiscount(listOfPeople.id_num)
) x ON x.id_num = listOfPeople.id_num
But you can't pass listOfPeople.id_num into the function since it's not really the same scope.
In SQL Server 2005 you can use the CROSS APPLY syntax:
select id_num, name, balance, SUM(x.discount)
from listOfPeople
cross apply dbo.calculatePersonalDiscount(listOfPeople.id_num) x
Likewise there's an OUTER APPLY syntax for the equivalent of a LEFT OUTER join.
I needed this badly and you gave me the start, and here I was able to JOIN on another View. (another table would work too), and I included another function in the WHERE clause.
SELECT CL_ID, CL_LastName, x.USE_Inits
FROM [dbo].[tblClient]
JOIN dbo.vw_InsuranceAppAndProviders ON dbo.vw_InsuranceAppAndProviders.IAS_CL_ID = tblClient.CL_ID
CROSS APPLY dbo.ufx_HELPER_Client_CaseWorker(tblClient.CL_ID) x
WHERE dbo.vw_InsuranceAppAndProviders.IAS_CL_ID = tblClient.CL_ID
AND dbo.ufx_Client_IsActive_By_CLID(tblClient.CL_ID) = 1
AND (dbo.vw_InsuranceAppAndProviders.IAS_InsuranceType = 'Medicare')

Why does this Snowflake query work without requiring the LATERAL keyword?

I have this view in Snowflake:
create or replace view foo as
select $1 as id_foo, $2 as sales
from (values (1,100),(2,200),(3,300));
And this user-defined table function:
CREATE OR REPLACE FUNCTION build_aux_table ( restriction number )
RETURNS TABLE ( aux_id number )
AS 'select $1 as aux_id from (VALUES (1),(2),(3)) where aux_id = restriction';
The following query works, and returns 3 rows:
select id_foo, baz.aux_id
from foo
cross join table(build_aux_table(foo.id_foo)) baz;
However, I didn't expect the query to compile, because the UDTF-generated table with which we are joining depends on a column from the first table. My understanding was that this sort of intra-table dependency required a LATERAL join, like the following (which also works):
select id_foo, baz.aux_id
from foo
cross join lateral build_aux_table(foo.id_foo) baz;
Why does the query without LATERAL work?
The code:
select id_foo, baz.aux_id
from foo
cross join table(build_aux_table(foo.id_foo)) baz;
is basically the same as:
select id_foo, baz.aux_id
from foo
,table(build_aux_table(foo.id_foo)) baz;
It is described in docs:
Calling JavaScript UDTFs in Queries
This simple example shows how to call a UDTF. This example passes literal values.
SELECT * FROM TABLE(js_udtf(10.0::FLOAT, 20.0::FLOAT));
This example calls a UDTF and passes it values from another table. In this example, the UDTF named js_udtf is called once for each row in the table named tab1. Each time that the function is called, it is passed values from columns c1 and c2 of the current row.
SELECT * FROM tab1, TABLE(js_udtf(tab1.c1, tab1.c2)) ;
Sample UDTFs - Examples with joins
Use the UDTF in a join with another table; note that the join column from the table is passed as an argument to the function.
select *
from favorite_years y join table(favorite_colors(y.year)) c;
It's not mentioned explicitly in the Snowflake manual (at least where you linked to), but it's standard behavior that the LATERAL keyword is optional for functions. Quoting the PostgreSQL manual:
Table functions appearing in FROM can also be preceded by the key
word LATERAL, but for functions the key word is optional; the
function's arguments can contain references to columns provided by
preceding FROM items in any case.
Bold emphasis mine.
Makes sense too, IMHO. Where would a function get its arguments, if not from other tables? Would be restricted to manual input without lateral references.

SQL IN() operator with condition inside

I've got table with few numbers inside (or even empty): #states table (value int)
And I need to make SELECT from another table with WHERE clause by definite column.
This column's values must match one of #states numbers or if #states is empty then accept all values (like there is no WHERE condition for this column).
So I tried something like this:
select *
from dbo.tbl_docs docs
where
docs.doc_state in(iif(exists(select 1 from #states), (select value from #states), docs.doc_state))
Unfortunately iif() can't return subquery resulting dataset. I tried different variations with iif() and CASE but it wasn't successful. How to make this condition?
select *
from dbo.tbl_docs docs
where
(
(select count(*) from #states) > 0
AND
docs.doc_state in(select value from #states)
)
OR
(
(select count(*) from #states)=0
AND 1=1
)
Wouldn't a left join do?
declare #statesCount int;
select #statesCount = count(1) from #states;
select
docs.*
from dbo.tbl_docs docs
left join #states s on docs.doc_state = s.value
where s.value is not null or #statesCount = 0;
In general, whenever your query contains sub-queries, you should stop for five minutes, and think hard about whether you really need a sub-query at all.
And if you've got a server capable of doing that, in many cases it might be better to preprocess the input parameters first, or perhaps use constructs such as MS SQL's with.
select *
from dbo.tbl_docs docs
where exists (select 1 from #states where value = doc_state)
or not exists (select 1 from #state)

PostgreSQL: Serially apply table-valued function to a set of values and UNION ALL results

I have a table-valued PL/pgsql function that takes as 1 input an integer, an ID. The table that is returned has fixed columns (say 5) but varying number of rows.
There is a large table of these unique IDs. I'd like to apply this function to each ID and UNION ALL the results.
Looking online I keep seeing CROSS APPLY as the solution, but it does not appear to be available in PostgreSQL. How can I do this "apply" operation?
One trivial solution is to re-write the table-valued function with an additional outer loop. But is there a way to do this directly in SQL?
I think it's impossible to do in current version of PostgreSQL (9.2). In 9.3 there would be LATERAL join which does exactly what you want.
You can, however, apply function returning set of simple values:
select id, func(id) as f from tbl1
sql fiddle demo
SQL Fiddle
create table t (id int);
insert into t (id) select generate_series(1, 10);
create or replace function f (i integer)
returns table(id_2 integer, id_3 integer) as $$
select id * 2 as id_2, id * 3 as id_3
from t
where id between i - 1 and i + 1
$$ language sql;
select id, (f(id)).*
from t;