Use format pattern from regexp_replace in to_char - sql

I have SQL statements stored in a column, with the date format given in square brackets:
sql_column
------------------------------------------------------------------------------
select col1 from table1 where col2 = to_date('[DD-MON-YYYY]', 'DD-MON-YYYY');
select col1 from table2 where col2 between [YYYYMMDD]00000 and [YYYYMMDD]99999 and col3 = to_date('[DD-MON-YYYY]', 'DD-MON-YYYY');
....
I don't know beforehand which format will be there, but it is always a date format.
Is there a way to use regexp_replace (or regexp_substr, or any of the regexp_* functions) to find each pattern and replace it with the result of to_char applied to my given date, using the pattern taken from the db column?
Perhaps something like this (which doesn't work obviously):
select sql_column,
regexp_replace(sql_column, '\[(.+?)\]', to_char(some_date, '\1'))
from my_table;
Could you please help?

There is no way to parse ALL the possible date formats. For example, if you see '01/11/2017' in a varchar field you cannot say if it refers to November 1st or January 11th.
That said, you can use an inline view to choose the best pattern and then use it to convert the string value to a date. For example:
select * from
(select
case
when REGEXP_LIKE(col2, '^[0-9]{2}/[0-9]{2}/[0-9]{4}$') then 'DD/MM/YYYY'
when REGEXP_LIKE(col2, '^[0-9]{14}$') then 'YYYYMMDDHH24MISS'
end pattern, table1.*
from table1) x
where to_date(x.col2, x.pattern) = to_date('01/11/2017','DD/MM/YYYY')
This approach would probably lead to a full table scan, so it is far from efficient. If the original table is large, I advise creating a materialized view with the converted date to improve performance.

You'll need a bit of dynamic SQL to solve this problem.
I do not present a complete solution, but this should give you a hint how to approach it.
Let's approach it bottom up.
What you actually need is a REPLACE statement that transforms your SQL text into the required form with the parameter date.
For the first example, this could be the following statement (parameter date is 30.11.2017):
select replace(sql_column, '[DD-MON-YYYY]','30-NOV-2017') from my_table;
If you run this statement on your first row you get the expected result:
select col1 from table1 where col2 = to_date('30-NOV-2017', 'DD-MON-YYYY');
So how do you get those REPLACE statements? One possibility is to write a PL/SQL function.
The function has two parameters: the original SQL string and the parameter date.
Using a regexp you extract the date format mask.
With dynamic SQL (EXECUTE IMMEDIATE) you format your parameter DATE as a string with the proper format.
Finally the REPLACE statement is returned.
create or replace function format_date(i_txt IN VARCHAR2, i_date DATE) return VARCHAR2 is
v_date_format VARCHAR2(4000);
v_form_date VARCHAR2(4000);
v_param VARCHAR2(4000);
v_form VARCHAR2(4000);
v_sql VARCHAR2(4000);
BEGIN
v_param := regexp_substr(i_txt, '\[(.+?)\]');
v_date_format := replace(replace(v_param,'[',null),']',null);
v_sql := 'select to_char(:d,'''||v_date_format||''') as my_dt from dual';
execute immediate v_sql into v_form_date using i_date;
v_form := 'select replace(sql_column, '''||v_param||''','''||v_form_date||''') from my_table';
return (v_form);
END;
/
NOTE that I handle only the first date mask in the string; you'll need to loop over all occurrences to get the second example right!
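The loop mentioned in the note can be sketched as follows. This is only a sketch under assumptions: the function name format_all_dates is made up for illustration, it returns the rewritten SQL text directly rather than a REPLACE statement, and it passes the mask to to_char as a run-time value instead of going through EXECUTE IMMEDIATE. An invalid mask in the stored text would still raise an error.

```sql
-- Hypothetical helper; name and signature are assumptions, not part of the answer above.
create or replace function format_all_dates(i_txt in varchar2, i_date in date)
  return varchar2
is
  v_result varchar2(4000) := i_txt;
  v_param  varchar2(4000);
begin
  loop
    -- next bracketed mask that is still present, e.g. '[DD-MON-YYYY]'
    v_param := regexp_substr(v_result, '\[.+?\]');
    exit when v_param is null;
    -- strip the brackets to get the mask, format the date with it,
    -- and replace every occurrence of that bracketed mask
    v_result := replace(v_result,
                        v_param,
                        to_char(i_date, substr(v_param, 2, length(v_param) - 2)));
  end loop;
  return v_result;
end;
/
```

It could then be called as select format_all_dates(sql_column, date '2017-11-30') from my_table; the month abbreviation produced for MON depends on the session's NLS_DATE_LANGUAGE.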

Related

Oracle Query Logic Declare and With

I want to print out a dynamically built query, but I am stuck at the variable declaration: error at line 2.
I need the maximum size for these VARCHAR2 variables.
Do I have a good overall structure ?
I use the result of the WITH inside the dynamic query.
DECLARE l_sql_query VARCHAR2(2000);
l_sql_queryFinal VARCHAR2(2000);
with cntp as (select distinct
cnt.code code_container,
*STUFF*
FROM container cnt
WHERE
cnt.status !='DESTROYED'
order by cnt.code)
BEGIN
FOR l_counter IN 2022..2032
LOOP
l_sql_query := l_sql_query || 'SELECT cntp.code_container *STUFF*
FROM cntp
GROUP BY cntp.code_container ,cntp.label_container, cntp.Plan_Classement, Years
HAVING
cntp.Years=' || l_counter ||'
AND
/*stuff*/ TO_DATE(''31/12/' || l_counter ||''',''DD/MM/YYYY'')
AND SUM(cntp.IsA)=0
AND SUM(cntp.IsB)=0
UNION
';
END LOOP;
END;
l_sql_queryFinal := SUBSTR(l_sql_query, 0, LENGTH (l_sql_query) – 5);
l_sql_queryFinal := l_sql_queryFinal||';'
dbms_output.put_line(l_sql_queryFinal);
The code you posted has quite a few issues, among them:
you've got the with (CTE) as a standalone fragment in the declare section, which isn't valid. If you want it to be part of the dynamic string then put it in the string;
your END; is in the wrong place;
you have – instead of -;
you remove the last 5 characters, but you end with a new line, so you need to remove 6 to include the U of the last UNION;
the line that appends a semicolon is itself missing one (though for dynamic SQL you usually don't want a semicolon, so the whole line can probably be removed);
2000 characters is too small for your example, but it's OK with the actual maximum of 32767.
DECLARE
l_sql_query VARCHAR2(32767);
l_sql_queryFinal VARCHAR2(32767);
BEGIN
-- initial SQL which just declares the CTE
l_sql_query := q'^
with cntp as (select distinct
cnt.code code_container,
*STUFF*
FROM container cnt
WHERE
cnt.status !='DESTROYED'
order by cnt.code)
^';
-- loop around each year...
FOR l_counter IN 2022..2032
LOOP
l_sql_query := l_sql_query || 'SELECT cntp.code_container *STUFF*
FROM cntp
GROUP BY cntp.code_container ,cntp.label_container, cntp.Plan_Classement, Years
HAVING
cntp.Years=' || l_counter ||'
AND
MAX(TO_DATE(cntp.DISPOSITION_DATE,''DD/MM/YYYY'')) BETWEEN TO_DATE(''01/01/'|| l_counter ||''',''DD/MM/YYYY'') AND TO_DATE(''31/12/' || l_counter ||''',''DD/MM/YYYY'')
AND SUM(cntp.IsA)=0
AND SUM(cntp.IsB)=0
UNION
';
END LOOP;
l_sql_queryFinal := SUBSTR(l_sql_query, 0, LENGTH (l_sql_query) - 6);
l_sql_queryFinal := l_sql_queryFinal||';';
dbms_output.put_line(l_sql_queryFinal);
END;
/
db<>fiddle
The q'^...^' in the first assignment is the alternative quoting mechanism, which means you don't have to escape (by doubling up) the quotes within that string, around 'DESTROYED'. Notice the ^ delimiters do not appear in the final generated query.
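As a small illustration of that quoting mechanism (a standalone sketch, not part of the generated query), the three literals below all hold the same text; only the first needs the inner quotes doubled:

```sql
begin
  -- classic literal: inner quotes are doubled
  dbms_output.put_line('cnt.status != ''DESTROYED''');
  -- alternative quoting with ^ as the delimiter
  dbms_output.put_line(q'^cnt.status != 'DESTROYED'^');
  -- alternative quoting with a bracket pair as the delimiter
  dbms_output.put_line(q'[cnt.status != 'DESTROYED']');
end;
/
```

Each call prints cnt.status != 'DESTROYED' (with server output enabled).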
Whether the generated query actually does what you want is another matter... The cntp.Years= part should probably be in a where clause, not having; and you might be able to simplify this to a single query instead of lots of unions, as you're already aggregating. All of that is a bit beyond the scope of your question though.
Is there a way to set the maximum size "automatically", like VARCHAR2(MAX_STRING_SIZE)? Does it work?
No. And no.
The maximum size of varchar2 in PL/SQL is 32767. If you want to hedge against that changing at some point in the future you can declare a user-defined subtype in a shared package ...
create or replace package my_subtypes as
subtype max_string_size is varchar2(32767);
end my_subtypes;
/
... and reference that in your program...
DECLARE
l_sql_query my_subtypes.max_string_size;
l_sql_queryFinal my_subtypes.max_string_size;
...
So if Oracle subsequently raises the maximum permitted size of a VARCHAR2 in PL/SQL you need only change the definition of my_subtypes.max_string_size for the bounds to be raised wherever you used that subtype.
Alternatively, just use a CLOB. Oracle is pretty clever about treating a CLOB as a VARCHAR2 when its size is <= 32k.
To solve your other problem you need to treat the WITH clause as a string and assign it to your query variable.
l_sql_query my_subtypes.max_string_size := q'[
with cntp as (select distinct
cnt.code code_container,
*STUFF*
FROM container cnt
WHERE cnt.status !='DESTROYED'
order by cnt.code) ]';
Note the use of the special quote syntax q'[ ... ]' to avoid the need to escape the quotation marks in your query snippet.
Does a dynamic query string not have access to a temp table?
Dynamic SQL is a string containing a DML or DDL statement which we execute with EXECUTE IMMEDIATE or DBMS_SQL commands. Otherwise it is exactly the same as static SQL, it doesn't behave any differently. In fact the best way to write dynamic SQL is to start by writing the static statement in a worksheet, make it correct and then figure out which bits need to be dynamic (variables, placeholders) and which bits remain static (boilerplate). In your case the WITH clause is a static part of the statement.
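A minimal sketch of that workflow, using the container table from the question. The count(*) query is only an assumed stand-in for the real statement, and :st is an arbitrary placeholder name:

```sql
declare
  l_count number;
begin
  -- static statement, written and tested in a worksheet first:
  --   select count(*) from container where status != 'DESTROYED';
  -- dynamic version: the same text as a string, with the literal
  -- turned into a bind placeholder and no trailing semicolon
  execute immediate
    'select count(*) from container where status != :st'
    into l_count
    using 'DESTROYED';
  dbms_output.put_line(l_count);
end;
/
```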
As has already been pointed out, you seem to have misunderstood what a with clause is: it's a clause of a SQL statement, not a procedural declaration. By definition, it must be followed by select.
But also, as a general rule, I would recommend avoiding dynamic SQL when possible. In this case, if you can simulate a table with the range of years you want, you can join instead of having to run the same query multiple times.
The easy trick for doing that is Oracle's connect by syntax, which lets a hierarchical query on dual produce the expected number of rows.
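On its own, the row generator looks like this. The offset of 2021 is an assumption chosen to match the 2022..2032 loop in the question:

```sql
-- one row per year: LEVEL runs from 1 to 11, so yr runs from 2022 to 2032
select level + 2021 as yr
from dual
connect by level <= 11;
```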
Once you've done that, adding this table as a join is pretty trivial:
WITH cntp AS
(
SELECT DISTINCT code code_container,
[additional columns]
FROM container
WHERE status != 'DESTROYED'),
year_table AS
(
-- LEVEL+2021 yields 2022..2032, the range looped over in the question
SELECT to_date('01/01/'
|| (LEVEL+2021), 'dd/mm/yyyy') AS start_date,
to_date('31/12/'
|| (LEVEL+2021), 'dd/mm/yyyy') AS end_date,
(LEVEL+2021) AS year
FROM dual
CONNECT BY LEVEL <= 11)
SELECT cntp.code_container,
[additional columns]
FROM cntp
JOIN year_table
ON cntp.years = year_table.year
GROUP BY [additional columns],
years,
year_table.start_date,
year_table.end_date
HAVING max(to_date(cntp.disposition_date, 'dd/mm/yyyy')) BETWEEN year_table.start_date AND year_table.end_date
AND SUM(cntp.isa)=0
AND SUM(cntp.isb)=0
(This query is totally untested and may not actually fulfill your needs; I am providing my best approximation based on the information available.)

PLS-00215: String length constraints must be in range (1..32767)

I am new to PL/SQL. I want to create a procedure that takes three parameters called 'startMonth', 'endMonth' and 'thirdMonth'. In the procedure, I execute a SQL query stored in the 'run_sql' column of table_query; this query needs the values of 'startMonth', 'endMonth' and 'thirdMonth'. My plan is to put all the SQL queries in a separate table and execute them in a for loop in the procedure. The query creates a table called table1, and in the next month I want to drop it and create the table again. This is how I have written the procedure.
CREATE OR REPLACE procedure schema.sixMonthAverage (startMonth varchar,endMonth varchar ,thirdMonth varchar )
IS
start_date varchar := startMonth;
end_date varchar := endMonth;
begin
for c_rec in(select run_sql from table_query)
loop
dbms_output.put_line(startmonth);
dbms_output.put_line(endmonth);
execute immediate c_rec.run_sql using start_date, end_date;
Execute IMMEDIATE 'commit';
END LOOP;
EXCEPTION
WHEN OTHERS THEN
dbms_output.put_line('Exception');
END;
This is the query in the run_sql column in table_query.
create table table1
as
select account_num,bill_seq,bill_version,
to_char(start_of_bill_dtm,''YYYYMM-DD'') st_bill_dtm,
to_char(bill_dtm - 1,''YYYYMM-DD'') en_bill_dtm,
to_char(actual_bill_dtm,''YYYYMM-DD'') act_bill_dtm,
round((invoice_net_mny + invoice_tax_mny)/1000,0) mon_bill,
bill_type_id,bill_status
from billsummary
where to_char(bill_dtm - 1,''YYYYMM'') between'||chr(32)||
startMonth ||chr(32)||'and'|| chr(32)||endMonth ||chr(32)||
'and cancellation_dtm is null;
But when I try to compile the procedure it gives me the error 'PLS-00215: String length constraints must be in range (1..32767)'. Although I searched for the error I could not find the exact reason. It seems to be a problem with variable assignment, but I could not resolve it.
--Update
As it is given in the answer I converted the strings to dates.
CREATE OR REPLACE procedure REPO.sixMonthAverage (startMonth varchar2,endMonth varchar2 ,thirdMonth varchar2 )
IS
start_date date := TO_DATE(startMonth, 'yyyymm');
end_date date := TO_DATE(endMonth, 'yyyymm');
But when executing the query it gives the error ORA-00904: "END_DATE": invalid identifier. It does not show any error for start_date, though. What would be the reason for this error?
The error is pointing you to where the problem is. String declarations (char, varchar, varchar2 - but you should only be using varchar2, not varchar) need a length; so for example:
CREATE OR REPLACE procedure sixMonthAverage (startMonth varchar2,endMonth varchar2 ,thirdMonth varchar2 )
IS
start_date varchar2(10) := startMonth;
end_date varchar2(10) := endMonth;
...
Notice the procedure arguments do not specify a length; only the local variable declarations.
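The same rule can be seen in a small anonymous block (a sketch; the names v_ok and show are made up for illustration):

```sql
declare
  -- v_bad varchar2;                 -- would raise PLS-00215: a local string variable needs a length
  v_ok varchar2(10) := '201701';     -- local variable: length is mandatory

  procedure show(p_txt varchar2) is  -- parameter: no length is given
  begin
    dbms_output.put_line(p_txt);
  end;
begin
  show(v_ok);
end;
/
```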
If those represent dates then they, and the passed-in arguments, should probably be dates, not strings. It depends what your dynamic SQL is expecting though: if that is converting the strings to dates and specifying the format mask then I guess it's OK; otherwise you should be passing dates, or converting the strings to dates. The example you showed doesn't seem to have any bind variables to populate, though.
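For the USING clause to have anything to populate, the statement text has to contain placeholders. A minimal sketch, assuming a plain count over billsummary in place of the stored statement (the placeholder names and the month values are made up); note that DDL such as create table does not accept bind variables at all, so this only applies to queries and DML:

```sql
declare
  -- :start_month and :end_month are placeholders filled in by USING, in order
  l_sql   varchar2(200) :=
    q'[select count(*) from billsummary
       where to_char(bill_dtm - 1, 'YYYYMM') between :start_month and :end_month]';
  l_count number;
begin
  execute immediate l_sql into l_count using '201701', '201706';
  dbms_output.put_line(l_count);
end;
/
```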
Dropping and recreating tables is generally not something you want to be doing though. You could delete/truncate and repopulate a table; or use partitioning if you want to keep more than one month; or use a view (or materialized view).

Having a hard time with procedure not formatting the query entry correctly

Create a dynamic procedure that will change the contents of any column for any row in the AA_EMPLOYEE table using the employee id. i.e.
BEGIN
dyn_aa_employee('emp_dob', '01-jan-18', 110);
END;
Will change the date of birth for employee ID 110
CREATE OR REPLACE PROCEDURE dyn_aa_employee
(p_col VARCHAR2,
p_dob IN aa_employee.emp_dob%TYPE,
p_id NUMBER)
IS
BEGIN
EXECUTE IMMEDIATE 'UPDATE aa_employee
SET '|| p_col ||' = :ph_dob
WHERE EMP_NUM = :ph_id'
USING p_dob, p_id;
BEGIN
dyn_aa_employee('emp_dob', '01-jan-18', 110);
END;
The top code has to work for the bottom code. The issue is it's changing the emp dob to 01-jan-0018, however I want it to change to exactly 01-jan-18.
My professor gave me a 0 for this assignment I'm just trying to figure out what I did wrong.
Assuming that aa_employee.emp_dob is of type date AND assuming that by '01-jan-18' you mean January 1st, 2018, either you do this:
BEGIN
dyn_aa_employee('emp_dob', date '2018-01-01', 110);
END;
or you could change your procedure to:
CREATE OR REPLACE PROCEDURE dyn_aa_employee (
p_col VARCHAR2,
p_dob IN VARCHAR2, -- a string here, so that TO_DATE below performs the only conversion
p_id NUMBER)
IS
BEGIN
EXECUTE IMMEDIATE
'UPDATE aa_employee SET ' || p_col || ' = :ph_dob WHERE EMP_NUM = :ph_id'
USING TO_DATE (p_dob, 'DD-MON-YY'), p_id;
END;
It would be interesting to see the actual assignment, though. It's a rather confused scenario overall and I don't see how it teaches you much except to identify several things you probably shouldn't do.
Well, first off, you didn't specify the format of the date you were passing, so the conversion used whatever default format the professor had set up. This is a bad plan: to be safe, always specify your date format and don't depend on the default, because defaults can be changed, and too often are. Try the following to see the difference:
with ds as
(select '01-jan-18' dt_stg from dual)
select to_date(dt_stg), to_date(dt_stg, 'dd-mon-rr') from ds;
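To see where a year like 0018 comes from without depending on the session default, both masks can be spelled out. This is a sketch that assumes English month abbreviations (NLS_DATE_LANGUAGE) and a current date in the first half of this century:

```sql
select to_char(to_date('01-jan-18', 'DD-MON-YYYY'), 'YYYY-MM-DD') as with_yyyy, -- year taken literally: 0018-01-01
       to_char(to_date('01-jan-18', 'DD-MON-RR'),   'YYYY-MM-DD') as with_rr    -- two-digit year mapped to 2018-01-01
from dual;
```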
Secondly, it seems the assignment was to create a procedure that could update any column. This procedure can update only a column having the same type as aa_employee.emp_dob%TYPE, presumably a date; it cannot update any other column type.
Finally, if you're trying to figure out what your professor thinks you did wrong, then try something strange: ask your professor!

Trying to create dynamic query strings with PL/PgSQL to make DRY functions in PostgreSQL 9.6

I have tables that contain the same type of data for every year, but the data gathered varies slightly in that they may not have the same fields.
d_abc_2016
d_def_2016
d_ghi_2016
d_jkl_2016
There are certain constants for each table: company_id, employee_id, salary.
However, each one might or might not have these fields that are used to calculate total incentives: bonus, commission, cash_incentives. There are a lot more, but I'm just using these as examples. All are numeric.
I should note at this point, users only have the ability to run SELECT statements.
What I would like to be able to do is this:
Give the user the ability to call in SELECT and specify their own fields in addition to the call
Pass the table name being used into the function to use in conditional logic to determine how the query string should be constructed for the eventual total_incentives calculation in addition to passing the whole table so a ton of arguments don't have to be passed into the function
Basically this:
SELECT employee_id, salary, total_incentives(t, 'd_abc_2016')
FROM d_abc_2016 t;
So the function being called will calculate total_incentives which is numeric for that employee_id and also show their salary. But the user might choose to add other fields to look at.
For the function, because the fields used in the total_incentives function will vary from table to table, I need to create logic to construct the query string dynamically.
CREATE OR REPLACE FUNCTION total_incentives(ANYELEMENT, t text)
RETURNS numeric AS
$$
DECLARE
-- table name lower case in case user typed wrong
tbl varchar(255) := lower($2);
-- parse out the table code to use in conditional logic
tbl_code varchar(255) := split_part(tbl, '_', 2);
-- the starting point of the query string
base_calc varchar(255) := 'salary + ';
-- query string
query_string varchar(255);
-- have to declare this to put computation INTO
total_incentives_calc numeric;
BEGIN
IF tbl_code = 'abc' THEN
query_string := base_calc || 'bonus';
ELSIF tbl_code = 'def' THEN
query_string := base_calc || 'bonus + commission';
ELSIF tbl_code = 'ghi' THEN
-- etc...
END IF;
EXECUTE format('SELECT $1 FROM %I', tbl)
INTO total_incentives_calc
USING query_string;
RETURN total_incentives_calc;
END;
$$
LANGUAGE plpgsql;
This results in an:
ERROR: invalid input syntax for type numeric: "salary + bonus"
CONTEXT: PL/pgSQL function total_incentives(anyelement,text) line 16 at EXECUTE
Since it should be returning a set of numeric values, I changed it to the following:
CREATE OR REPLACE FUNCTION total_incentives(ANYELEMENT, t text)
RETURNS SETOF numeric AS
$$
...
RETURN;
I get the same error.
I figured, well, maybe it is a table it is trying to return.
CREATE OR REPLACE FUNCTION total_incentives(ANYELEMENT, t text)
RETURNS TABLE(tot_inc numeric) AS
$$
...
I get the same error.
Really, any variation produces that result. So I'm really not sure how to get this to work.
I looked at RETURN QUERY, RETURN NEXT, and RETURN QUERY EXECUTE.
https://www.postgresql.org/docs/9.6/static/plpgsql-control-structures.html
RETURN QUERY won't work because it takes a hard coded query from what I can tell, which won't take in variables.
RETURN NEXT iterates through each record, which I don't think will be suitable for my needs and seems like it will be really slow... and it takes a hard coded query from what I can tell.
RETURN QUERY EXECUTE sounds promising.
-- EXECUTE format('SELECT $1 FROM %I', tbl)
-- INTO total_incentives_calc
-- USING query_string;
RETURN QUERY
EXECUTE format('SELECT $1 FROM %I', tbl)
USING query_string;
And get:
ERROR: structure of query does not match function result type
DETAIL: Returned type character varying does not match expected type numeric in column 1.
CONTEXT: PL/pgSQL function total_incentives(anyelement,text) line 20 at RETURN QUERY
It should be returning numeric.
Lastly, I can get this to work, but it won't be DRY. I'd rather not make a bunch of separate functions for each table with duplicative code. Most of the working examples I have seen have the whole query in the function and are called like such:
SELECT total_incentives(d_abc_2016, 'd_abc_2016');
So any additional columns would have to be specified in the function as:
EXECUTE format('SELECT employee_id...)
Given the users will only be able to run SELECT in query this really isn't an option. They need to specify any additional columns they want to see inside a query.
I've posted a similar question but was told it was unclear, so hopefully this lengthier version will more clearly explain what I am trying to do.
Column names and table names cannot be passed as query parameters through the USING clause; a parameter is always treated as a value, never as part of the statement text.
Probably the lines:
RETURN QUERY
EXECUTE format('SELECT $1 FROM %I', tbl)
USING query_string;
should be:
RETURN QUERY
EXECUTE format('SELECT %s FROM %I', query_string, tbl);
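The difference between the two forms can be seen in isolation with a small DO block (a sketch; the variable names and the '1 + 2' text are made up for illustration):

```sql
DO $$
DECLARE
  as_value text;
  as_expr  integer;
BEGIN
  -- USING binds the text as a value, so the query just returns that text
  EXECUTE 'SELECT $1' INTO as_value USING '1 + 2'::text;
  -- format() with %s splices the text into the statement, so it is evaluated
  EXECUTE format('SELECT %s', '1 + 2') INTO as_expr;
  RAISE NOTICE 'USING: %, format: %', as_value, as_expr;  -- NOTICE:  USING: 1 + 2, format: 3
END
$$;
```

Because %s inserts the text unescaped, it should only ever receive strings built inside the function, as query_string is here, never raw user input.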
This case is an example of why taking the DRY principle too far is sometimes problematic. If you write the queries directly, your code will be simpler, cleaner and probably shorter.
Dynamic SQL should be one of the last solutions, not the first. Use dynamic SQL only when your code will be significantly shorter with it than without it.

Dynamic column name as date in postgresql crosstab [duplicate]

I implemented this function in my Postgres database: http://www.cureffi.org/2013/03/19/automatically-creating-pivot-table-column-names-in-postgresql/
Here's the function:
create or replace function xtab (tablename varchar, rowc varchar, colc varchar, cellc varchar, celldatatype varchar) returns varchar language plpgsql as $$
declare
dynsql1 varchar;
dynsql2 varchar;
columnlist varchar;
begin
-- 1. retrieve list of column names.
dynsql1 = 'select string_agg(distinct '||colc||'||'' '||celldatatype||''','','' order by '||colc||'||'' '||celldatatype||''') from '||tablename||';';
execute dynsql1 into columnlist;
-- 2. set up the crosstab query
dynsql2 = 'select * from crosstab (
''select '||rowc||','||colc||','||cellc||' from '||tablename||' group by 1,2 order by 1,2'',
''select distinct '||colc||' from '||tablename||' order by 1''
)
as ct (
'||rowc||' varchar,'||columnlist||'
);';
return dynsql2;
end
$$;
So now I can call the function:
select xtab('globalpayments','month','currency','(sum(total_fees)/sum(txn_amount)*100)::decimal(48,2)','text');
Which returns (because the return type of the function is varchar):
select * from crosstab (
'select month,currency,(sum(total_fees)/sum(txn_amount)*100)::decimal(48,2)
from globalpayments
group by 1,2
order by 1,2'
, 'select distinct currency
from globalpayments
order by 1'
) as ct ( month varchar,CAD text,EUR text,GBP text,USD text );
How can I get this function to not only generate the code for the dynamic crosstab, but also execute the result? I.e., the result when I manually copy/paste/execute is this. But I want it to execute without that extra step: the function shall assemble the dynamic query and execute it:
Edit 1
This function comes close, but I need it to return more than just the first column of the first record
Taken from: Are there any way to execute a query inside the string value (like eval) in PostgreSQL?
create or replace function eval( sql text ) returns text as $$
declare
as_txt text;
begin
if sql is null then return null ; end if ;
execute sql into as_txt ;
return as_txt ;
end;
$$ language plpgsql
usage: select * from eval($$select * from analytics limit 1$$)
However it just returns the first column of the first record :
eval
----
2015
when the actual result looks like this:
Year, Month, Date, TPV_USD
---- ----- ------ --------
2016, 3, 2016-03-31, 100000
What you ask for is impossible. SQL is a strictly typed language. PostgreSQL functions need to declare a return type (RETURNS ..) at the time of creation.
A limited way around this is with polymorphic functions, if you can provide the return type at the time of the function call. But that's not evident from your question.
Refactor a PL/pgSQL function to return the output of various SELECT queries
You can return a completely dynamic result with anonymous records. But then you are required to provide a column definition list with every call. And how do you know about the returned columns? Catch 22.
There are various workarounds, depending on what you need or can work with. Since all your data columns seem to share the same data type, I suggest to return an array: text[]. Or you could return a document type like hstore or json. Related:
Dynamic alternative to pivot with CASE and GROUP BY
Dynamically convert hstore keys into columns for an unknown set of keys
But it might be simpler to just use two calls: 1: Let Postgres build the query. 2: Execute and retrieve returned rows.
Selecting multiple max() values using a single SQL statement
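If the client is psql (version 9.6 or later), the two calls can be chained with the \gexec meta-command, which sends each value returned by the query back to the server as a statement. A sketch, reusing the call from the question and assuming it is run in psql rather than from application code:

```sql
-- step 1 builds the crosstab query text; \gexec then executes that text as step 2
SELECT xtab('globalpayments', 'month', 'currency',
            '(sum(total_fees)/sum(txn_amount)*100)::decimal(48,2)', 'text') \gexec
```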
I would not use the function from Eric Minikel as presented in your question at all. It is not safe against SQL injection by way of maliciously malformed identifiers. Use format() to build query strings unless you are running an outdated version older than Postgres 9.1.
A shorter and cleaner implementation could look like this:
CREATE OR REPLACE FUNCTION xtab(_tbl regclass, _row text, _cat text
, _expr text -- still vulnerable to SQL injection!
, _type regtype)
RETURNS text
LANGUAGE plpgsql AS
$func$
DECLARE
_cat_list text;
_col_list text;
BEGIN
-- generate categories for xtab param and col definition list
EXECUTE format(
$$SELECT string_agg(quote_literal(x.cat), '), (')
, string_agg(quote_ident (x.cat), %L)
FROM (SELECT DISTINCT %I AS cat FROM %s ORDER BY 1) x$$
, ' ' || _type || ', ', _cat, _tbl)
INTO _cat_list, _col_list;
-- generate query string
RETURN format(
'SELECT * FROM crosstab(
$q$SELECT %I, %I, %s
FROM %s
GROUP BY 1, 2 -- only works if the 3rd column is an aggregate expression
ORDER BY 1, 2$q$
, $c$VALUES (%5$s)$c$
) ct(%1$I text, %6$s %7$s)'
, _row, _cat, _expr -- expr must be an aggregate expression!
, _tbl, _cat_list, _col_list, _type);
END
$func$;
Same function call as your original version. The function crosstab() is provided by the additional module tablefunc which has to be installed. Basics:
PostgreSQL Crosstab Query
This handles column and table names safely. Note the use of object identifier types regclass and regtype. Also works for schema-qualified names.
Table name as a PostgreSQL function parameter
However, it is not completely safe while you pass a string to be executed as expression (_expr - cellc in your original query). This kind of input is inherently unsafe against SQL injection and should never be exposed to the general public.
SQL injection in Postgres functions vs prepared queries
Scans the table only once for both lists of categories and should be a bit faster.
Still can't return completely dynamic row types since that's strictly not possible.
Not quite impossible: you can still execute it (from a query, execute the string and return SETOF RECORD).
Then you have to specify the return record format. The reason in this case is that the planner needs to know the return format before it can make certain decisions (materialization comes to mind).
So in this case you would EXECUTE the query, return the rows and return SETOF RECORD.
For example, we could do something like this with a wrapper function but the same logic could be folded into your function:
CREATE OR REPLACE FUNCTION crosstab_wrapper
(tablename varchar, rowc varchar, colc varchar,
cellc varchar, celldatatype varchar)
returns setof record language plpgsql as $$
DECLARE outrow record;
BEGIN
FOR outrow IN EXECUTE xtab($1, $2, $3, $4, $5)
LOOP
RETURN NEXT outrow;
END LOOP;
END;
$$;
Then, when you call the function, you have to supply the record structure (as (col1 type, col2 type, etc.)), just like you do with crosstab or connectby.
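A call could then look like the sketch below. The column list is an assumption taken from the generated query shown in the question (month plus the four currencies); it has to match the columns and types the generated crosstab query actually returns:

```sql
SELECT *
FROM crosstab_wrapper('globalpayments', 'month', 'currency',
                      '(sum(total_fees)/sum(txn_amount)*100)::decimal(48,2)', 'text')
     AS ct (month varchar, cad text, eur text, gbp text, usd text);
```

The caller still has to know the set of categories in advance, which is the limitation the previous answer describes.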