Using dynamic query + user defined datatype in Postgres

I need a function to normalize the feature values in my input table.
My features table has 9 columns, of which x1, x2, ..., x6 are the input columns I need to scale.
I'm able to do it with a static query:
create or replace function scale_function()
returns void as $$
declare tav1 features%rowtype; rang1 features%rowtype;
begin
select avg(n),avg(x0),avg(x1),avg(x2),avg(x3),avg(x4),avg(x5),avg(x6),avg(y)
into tav1 from features;
select max(n)-min(n),max(x0)-min(x0),max(x1)-min(x1),max(x2)-min(x2),max(x3)-min(x3),
max(x4)-min(x4),max(x5)-min(x5),max(x6)-min(x6),max(y)-min(y)
into rang1 from features;
update features
set x1= (x1-tav1.x1)/(rang1.x1),x2= (x2-tav1.x2)/(rang1.x2),
x3= (x3-tav1.x3)/(rang1.x3),x4= (x4-tav1.x4)/(rang1.x4),
x5= (x5-tav1.x5)/(rang1.x5),x6= (x6-tav1.x6)/(rang1.x6),
y= (y-tav1.y)/(rang1.y);
return;
end;
$$ language plpgsql;
But now I need a dynamic query that scales n columns, i.e. x1, x2, ..., xn (say I have 200+ columns), in my features table. I'm trying the code below, but it doesn't work because of an issue with the user-defined data type:
create or replace function scale_function(n int)
returns void as $$
declare
tav1 features%rowtype;
rang1 features%rowtype;
query1 text :=''; query2 text :='';
begin
for i in 0..n
loop
query1 := query1 ||',avg(x'||i||')';
query2 := query2||',max(x'||i||')-min(x'||i||')';
end loop;
query1 := 'select avg(n)'||query1||',avg(y) into tav1 from features;';
execute query1;
query2 := 'select max(n)-min(n)'||query2||',max(y)-min(y) into rang1 from features;';
execute query2;
update features
set x1= (x1-tav1.x1)/(rang1.x1), ... ,xn=(xn-tav1.xn)/(rang1.xn)
,y= (y-tav1.y)/(rang1.y);
return;
end;
$$ language plpgsql;
Here I'm trying to select the avg() values of the columns into a user-defined rowtype variable tav1 and then use tav1 in the update.
Can anyone help me update the features table values with a dynamic query for n such columns?
************ Error ************
ERROR: column "avg" specified more than once
SQL state: 42701
Context: SQL statement "select avg(n),avg(x0),avg(x1),avg(x2),avg(x3),avg(x4),avg(x5),avg(x6),avg(y) into tav1 from features;"
PL/pgSQL function scale_function(integer) line 12 at EXECUTE statement
I'm using PostgreSQL 9.3.0.

Basic UPDATE
Replace the first query with this much shorter and more efficient single UPDATE command:
UPDATE features
SET (x1,x2,x3,x4,x5,x6, y)
= ((x1 - g.avg1) / g.range1
, (x2 - g.avg2) / g.range2
-- , (x3 - ...
, (y - g.avgy) / g.rangey)
FROM (
SELECT avg(x1) AS avg1, max(x1) - min(x1) AS range1
, avg(x2) AS avg2, max(x2) - min(x2) AS range2
-- , avg(x3) ...
, avg(y) AS avgy, max(y) - min(y) AS rangey
FROM features
) g;
About the short UPDATE syntax:
SQL update fields of one table from fields of another one
Dynamic function
Building on the simpler query, here is a dynamic function for any number of columns:
CREATE OR REPLACE FUNCTION scale_function_dyn()
RETURNS void AS
$func$
DECLARE
cols text; -- list of target columns
vals text; -- list of values to insert
aggs text; -- column list for aggregate query
BEGIN
SELECT INTO cols, vals, aggs
string_agg(quote_ident(attname), ', ')
, string_agg(format('(%I - g.%I) / g.%I'
, attname, 'avg_' || attname, 'range_' || attname), ', ')
, string_agg(format('avg(%1$I) AS %2$I, max(%1$I) - min(%1$I) AS %3$I'
, attname, 'avg_' || attname, 'range_' || attname), ', ')
FROM pg_attribute
WHERE attrelid = 'features'::regclass
AND attname NOT IN ('n', 'x0') -- exclude columns from update
AND NOT attisdropped -- no dropped (dead) columns
AND attnum > 0; -- no system columns
EXECUTE format('UPDATE features
SET (%s) = (%s)
FROM (SELECT %s FROM features) g'
, cols, vals, aggs);
END
$func$ LANGUAGE plpgsql;
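To make the string-building step easier to follow, here is a minimal sketch in Python (chosen only because it runs without a database) of the statement text the function above assembles. The helper names `quote_ident` and `build_scale_update` are hypothetical, and this `quote_ident` always adds double quotes, whereas Postgres only quotes when needed.

```python
def quote_ident(name):
    """Double-quote an identifier, doubling embedded quotes (simplified stand-in)."""
    return '"' + name.replace('"', '""') + '"'


def build_scale_update(table, columns):
    """Return one UPDATE that subtracts each column's mean and divides by its range."""
    # Target column list, like string_agg(quote_ident(attname), ', ')
    cols = ", ".join(quote_ident(c) for c in columns)
    # One scaled expression per column, reading the aggregates from subquery g
    vals = ", ".join(
        "({} - g.{}) / g.{}".format(
            quote_ident(c), quote_ident("avg_" + c), quote_ident("range_" + c)
        )
        for c in columns
    )
    # Aggregates computed once, in a single scan of the table
    aggs = ", ".join(
        "avg({0}) AS {1}, max({0}) - min({0}) AS {2}".format(
            quote_ident(c), quote_ident("avg_" + c), quote_ident("range_" + c)
        )
        for c in columns
    )
    return "UPDATE {0} SET ({1}) = ({2}) FROM (SELECT {3} FROM {0}) g".format(
        quote_ident(table), cols, vals, aggs
    )


sql = build_scale_update("features", ["x1", "y"])
print(sql)
```

Printing the statement before executing it is a cheap way to check the generated SQL; in the PL/pgSQL version the same can be done with RAISE NOTICE.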
Related answer with more explanation:
Update multiple columns in a trigger function in plpgsql

Related

In Postgres, how would I retrieve the default value of a column, preferably inline in an insert statement?

Here's my example table:
CREATE TABLE IF NOT EXISTS public.cars
(
id serial PRIMARY KEY,
make varchar(32) not null,
model varchar(32),
has_automatic_transmission boolean not null default false,
created_on_date timestamptz not null DEFAULT NOW()
);
I have a function that allows my data service to insert a car into the database. It looks like this:
drop function if exists cars_insert;
create function cars_insert
(
in make_in text,
in model_in text,
in has_automatic_transmission_in boolean,
in created_on_date_in timestamptz
)
returns public.cars as
$$
declare result_set public.cars;
begin
insert into cars
(
make,
model,
has_automatic_transmission,
created_on_date
)
values
(
make_in,
model_in,
has_automatic_transmission_in,
created_on_date_in
)
returning * into result_set;
return result_set;
end;
$$
language 'plpgsql';
This works really well until the service wants to insert a car with no value for has_automatic_transmission or created_on_date. In that case they'd send null for those parameters and would expect the database to use a default value. But instead the database rejects that null for obvious reasons (NOT NULL!).
What I want to do is have the insert routine do a coalesce to DEFAULT, but that doesn't work. Here's the logic I want for the insert:
insert into cars
(
make,
model,
has_automatic_transmission,
created_on_date
)
values
(
make_in,
model_in,
COALESCE(has_automatic_transmission_in, DEFAULT),
COALESCE(created_on_date_in, DEFAULT)
)
How can I effectively achieve that? Ideally it'd be some method I can apply inline to every column so that we don't need special knowledge of which columns do or don't have defaults, but I'll take anything at this point...
Except I'd like to avoid Dynamic SQL if possible.

While you need to pass values to a function, and want to insert default values instead of NULL dynamically, you could look them up like this (but see disclaimer below!):
CREATE OR REPLACE FUNCTION cars_insert (make_in text
, model_in text
, has_automatic_transmission_in boolean
, created_on_date_in timestamptz)
RETURNS public.cars AS
$func$
INSERT INTO cars(make, model, has_automatic_transmission, created_on_date)
VALUES (make_in
, model_in
, COALESCE(has_automatic_transmission_in
, (SELECT pg_get_expr(d.adbin, d.adrelid)::bool -- default_value
FROM pg_catalog.pg_attribute a
JOIN pg_catalog.pg_attrdef d ON (d.adrelid, d.adnum) = (a.attrelid, a.attnum)
WHERE a.attrelid = 'public.cars'::regclass
AND a.attname = 'has_automatic_transmission'))
, COALESCE(created_on_date_in
, (SELECT pg_get_expr(d.adbin, d.adrelid)::timestamptz -- default_value
FROM pg_catalog.pg_attribute a
JOIN pg_catalog.pg_attrdef d ON (d.adrelid, d.adnum) = (a.attrelid, a.attnum)
WHERE a.attrelid = 'public.cars'::regclass
AND a.attname = 'created_on_date'))
)
RETURNING *;
$func$
LANGUAGE sql;
You also have to know the column type to cast the text returned from pg_get_expr().
I simplified to an SQL function, as nothing here requires PL/pgSQL.
See:
Get the default values of table columns in Postgres?
However, this only works for constants and types where a cast from text is defined. Other expressions (incl. functions) are not evaluated without dynamic SQL. now() in the example only happens to work by coincidence, as 'now' (ignoring parentheses) is a special input string for timestamptz that evaluates to the same as the function now(). Misleading coincidence. See:
Difference between now() and current_timestamp
To make it work for expressions that have to be evaluated, dynamic SQL is required, which you ruled out. But if dynamic SQL is allowed, it's much more efficient to build the target list of the INSERT dynamically and omit columns that are supposed to get default values. Or keep the target list constant and switch NULL values for the DEFAULT keyword. See:
Function to INSERT dynamic list of columns in multiple tables
Test for null in function with varying parameters
Generate DEFAULT values in a CTE UPSERT using PostgreSQL 9.3

I like Erwin's solution from the playfulness point of view, but it is quite expensive to have these subqueries in every INSERT. For practical purposes, I would recommend one of the following:
Have four INSERT statements in the function, one for each combination of default/non-default arguments, and use IF statements to pick the right one.
Don't use DEFAULT, but write a BEFORE INSERT trigger that replaces NULLs with the appropriate value.
Of course this will add overhead too. You should benchmark the different options.

Building on the suggestions made by previous commentators, I would write a function that generates, in a dynamic fashion, an insert function for each table.
The advantage of such approach is that the resulting insert function will not use dynamic SQL at all.
Function generating function:
CREATE OR REPLACE FUNCTION f_generate_insert_function(tableid regclass) RETURNS VOID LANGUAGE PLPGSQL AS
$$
DECLARE
tablename text := tableid::text;
funcname text := tablename || '_insert';
ddl text := $ddl$
CREATE OR REPLACE FUNCTION %s (%s) RETURNS %s LANGUAGE PLPGSQL AS $func$
DECLARE
result_set %s;
BEGIN
INSERT INTO %s
(
%s
)
VALUES
(
%s
)
RETURNING * INTO result_set;
RETURN result_set;
END;
$func$
$ddl$;
argument_list text := '';
column_list text := '';
value_list text := '';
r record;
BEGIN
FOR r IN
SELECT attname nam, pg_catalog.format_type(atttypid, atttypmod) typ, pg_catalog.pg_get_expr(adbin, adrelid) def
FROM pg_catalog.pg_attribute
JOIN pg_catalog.pg_type t
ON t.oid = atttypid
LEFT JOIN pg_catalog.pg_attrdef
ON adrelid = attrelid AND adnum = attnum AND atthasdef
WHERE attrelid = tableid
AND attnum > 0
AND NOT attisdropped -- skip dropped columns
LOOP
IF r.def LIKE 'nextval%' THEN
CONTINUE;
END IF;
argument_list := argument_list || r.nam || '_in ' || r.typ || ',';
column_list := column_list || r.nam || ',';
IF r.def IS NULL THEN
value_list := value_list || r.nam || '_in,';
ELSE
value_list := value_list || 'coalesce(' || r.nam || '_in,' || r.def || '),';
END IF;
END LOOP;
argument_list := rtrim(argument_list, ',');
column_list := rtrim(column_list, ',');
value_list := rtrim(value_list, ',');
EXECUTE format(ddl, funcname, argument_list, tablename, tablename, tablename, column_list, value_list);
END;
$$;
In your case, the resulting insert function will be:
CREATE OR REPLACE FUNCTION public.cars_insert(make_in character varying, model_in character varying, has_automatic_transmission_in boolean, created_on_date_in timestamp with time zone)
RETURNS cars
LANGUAGE plpgsql
AS $function$
DECLARE
result_set cars;
BEGIN
INSERT INTO cars
(
make,model,has_automatic_transmission,created_on_date
)
VALUES
(
make_in,model_in,coalesce(has_automatic_transmission_in,false),coalesce(created_on_date_in,now())
)
RETURNING * INTO result_set;
RETURN result_set;
END;
$function$

You need two INSERT statements: one that fills the nullable columns, and another that omits them, because the default is only used if you do not reference the columns in the INSERT.

Replacing Placeholder values with another table's data

I have 2 tables. The first table contains rows with placeholders and the second table contains the values for those placeholders.
I want a query which fetches data from the first table and replaces the placeholders with the actual values stored in the second table.
Ex:
Table1 Data
id value
608CB424-90BF-4B08-8CF8-241C7635434F jdbc:postgresql://{POSTGRESIP}:{POSTGRESPORT}/{TESTDB}
CDA4C3D4-72B5-4422-8071-A29D32BD14E0 https://{SERVICEIP}/svc/{TESTSERVICE}/
Table2 Data
id placeholder value
201FEBFE-DF92-4474-A945-A592D046CA02 POSTGRESIP 1.2.3.4
20D9DE14-643F-4CE3-B7BF-4B7E01963366 POSTGRESPORT 5432
45611605-F2D9-40C8-8C0C-251E300E183C TESTDB mytest
FA8E2E4E-014C-4C1C-907E-64BAE6854D72 SERVICEIP 10.90.30.40
45B76C68-8A0F-4FD3-882F-CA579EC799A6 TESTSERVICE mytest-service
Required output is
id value
608CB424-90BF-4B08-8CF8-241C7635434F jdbc:postgresql://1.2.3.4:5432/mytest
CDA4C3D4-72B5-4422-8071-A29D32BD14E0 https://10.90.30.40/svc/mytest-service/

If you want to use Python-like named placeholders, then you need a helper function written in plpythonu:
create extension plpythonu;
create or replace function formatpystring( str text, a json ) returns text immutable language plpythonu as $$
import json
d = json.loads(a)
return str.format(**d)
$$;
Then simple test:
select formatpystring('{foo}.{bar}', '{"foo": "win", "bar": "amp"}');
formatpystring
----------------
win.amp
Finally you need to compose those arguments from your tables. It is simple:
select t1.id, formatpystring(t1.value, json_object_agg(t2.placeholder, t2.value)) as value
from table1 as t1, table2 as t2
group by t1.id, t1.value;
(Query was not tested but you have the direction)
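As a quick check of what the helper does, here is a minimal sketch in plain Python (outside the database) that applies the same `str.format(**d)` call to the sample data from the question. The variable names are illustrative only.

```python
# Placeholder names and values, taken from Table2 in the question.
placeholders = {
    "POSTGRESIP": "1.2.3.4",
    "POSTGRESPORT": "5432",
    "TESTDB": "mytest",
    "SERVICEIP": "10.90.30.40",
    "TESTSERVICE": "mytest-service",
}

# Template strings, taken from Table1 in the question.
templates = [
    "jdbc:postgresql://{POSTGRESIP}:{POSTGRESPORT}/{TESTDB}",
    "https://{SERVICEIP}/svc/{TESTSERVICE}/",
]

# str.format looks up each {NAME} in the keyword arguments;
# unused keys are ignored, unknown names raise KeyError.
resolved = [t.format(**placeholders) for t in templates]

for line in resolved:
    print(line)
```

Because unused keys are ignored, passing the complete placeholder set to every row (as the cross join plus json_object_agg does) is harmless. A template that names a placeholder missing from the table would raise a KeyError, so that case may need handling.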

(Clumsy) dynamic SQL implementation, featuring an outer join, but generating a recursive function call:
This function will not be very efficient, but probably the translation table is relatively small.
CREATE TABLE xlat_table (aa text ,bb text);
INSERT INTO xlat_table (aa ,bb ) VALUES( 'BBB', '/1.2.3.4/')
,( 'ccc', 'OMG') ,( 'ddd', '/4.3.2.1/') ;
CREATE FUNCTION dothe_replacements(_arg1 text) RETURNS text
AS
$func$
DECLARE
script text;
braced text;
res text;
found record; -- (aa text, bb text, xx text);
BEGIN
script := '';
res := format('%L', _arg1);
for found IN SELECT xy.aa,xy.bb
, regexp_matches(_arg1, '{\w+}','g' ) AS xx
FROM xlat_table xy
LOOP
-- RAISE NOTICE '#xx=%', found.xx[1];
-- RAISE NOTICE 'aa=%', found.aa;
-- RAISE NOTICE 'bb=%', found.bb;
braced := '{'|| found.aa || '}';
IF (found.xx[1] = braced ) THEN
-- RAISE NOTICE 'Res=%', res;
script := format ('replace(%s, %L, %L)'
,res,braced,found.bb);
res := format('%s', script);
END IF;
END LOOP;
if(length(script) =0) THEN return res; END IF;
script :='Select '|| script;
-- RAISE NOTICE 'script=%', script;
EXECUTE script INTO res;
return res;
END;
$func$
LANGUAGE plpgsql;
SELECT dothe_replacements( 'aaa{BBB}ccc{ddd}eee' );
SELECT dothe_replacements( '{AAA}bbb{CCC}DDD}{EEE}' );
Results:
CREATE TABLE
INSERT 0 3
CREATE FUNCTION
dothe_replacements
-----------------------------
aaa/1.2.3.4/ccc/4.3.2.1/eee
(1 row)
dothe_replacements
--------------------------
'{AAA}bbb{CCC}DDD}{EEE}'
(1 row)
The above method has quadratic behaviour (with respect to the number of xlat entries), which is horrible.
But we could dynamically create a function (once) and call it multiple times
(a poor man's generator).
Selecting only the relevant entries from the xlat table should probably be added.
And you should, of course, re-create the function every time the xlat table is changed.
CREATE FUNCTION create_replacement_function(_name text) RETURNS void
AS
$func$
DECLARE
argname text;
res text;
script text;
braced text;
found record; -- (aa text, bb text, xx text);
BEGIN
script := '';
argname := '_arg1';
res :=format('%I', argname);
for found IN SELECT xy.aa,xy.bb
FROM xlat_table xy
LOOP
-- RAISE NOTICE 'aa=%', found.aa;
-- RAISE NOTICE 'bb=%', found.bb;
-- RAISE NOTICE 'Res=%', res;
braced := '{'|| found.aa || '}';
script := format ('replace(%s, %L, %L)'
,res,braced,found.bb);
res := format('%s', script);
END LOOP;
script :=FORMAT('CREATE FUNCTION %I (_arg1 text) RETURNS text AS
$omg$
BEGIN
RETURN %s;
END;
$omg$ LANGUAGE plpgsql;', _name, script);
RAISE NOTICE 'script=%', script;
EXECUTE script ;
return ;
END;
$func$
LANGUAGE plpgsql;
SELECT create_replacement_function( 'my_function');
SELECT my_function('aaa{BBB}ccc{ddd}eee' );
SELECT my_function( '{AAA}bbb{CCC}DDD}{EEE}' );
And the result:
CREATE FUNCTION
NOTICE: script=CREATE FUNCTION my_function (_arg1 text) RETURNS text AS
$omg$
BEGIN
RETURN replace(replace(replace(_arg1, '{BBB}', '/1.2.3.4/'), '{ccc}', 'OMG'), '{ddd}', '/4.3.2.1/');
END;
$omg$ LANGUAGE plpgsql;
create_replacement_function
-----------------------------
(1 row)
my_function
-----------------------------
aaa/1.2.3.4/ccc/4.3.2.1/eee
(1 row)
my_function
------------------------
{AAA}bbb{CCC}DDD}{EEE}
(1 row)
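The nested replace() expression that the generator emits can be sketched outside the database as well. The following Python snippet is only an illustration of the loop; `quote_literal` and `nested_replace_expr` are hypothetical helpers, and the quoting is a simplified stand-in for format('%L').

```python
def quote_literal(text):
    """Single-quote a string for SQL, doubling embedded quotes (simplified)."""
    return "'" + text.replace("'", "''") + "'"


def nested_replace_expr(arg_name, xlat):
    """Wrap arg_name in one replace() call per (placeholder, value) pair."""
    expr = arg_name
    for placeholder, value in xlat:
        braced = "{" + placeholder + "}"
        # Each pass wraps the expression built so far in another replace()
        expr = (
            "replace(" + expr + ", "
            + quote_literal(braced) + ", "
            + quote_literal(value) + ")"
        )
    return expr


# Same entries as the xlat_table sample data above.
xlat = [("BBB", "/1.2.3.4/"), ("ccc", "OMG"), ("ddd", "/4.3.2.1/")]
expr = nested_replace_expr("_arg1", xlat)
print(expr)
```

The generated expression has one replace() per translation entry, so the work per call grows linearly with the size of the translation table instead of re-reading the table on every call.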

The following offers a plpgsql solution with a single function.
You'll notice I've 'renamed' the value column. It's bad practice to use reserved/key words as object names. Also, soq is the schema I use for all SO code.
The process first takes the holder-values from table2 and generates a set of key-value pairs (in this case hstore, but jsonb would also work). It then builds an array from the value column (my column name: val_string) containing the place_holder names found in the value. Finally, it iterates over that array, replacing each holder name with the value from the key-value pairs, using the array value as the lookup key.
Performance would not be great with a larger volume in either table. If you need to process a large volume at a time, a single-row temp table may yield better performance.
create or replace function soq.replace_holders( place_holder_line_in text)
returns text
language plpgsql
as $$
declare
l_holder_values hstore;
l_holder_line text;
l_holder_array text[];
l_indx integer;
begin
-- transform columns to key-value pairs of holder-value
select string_agg(place,',')::hstore
into l_holder_values
from (
select concat( '"',place_holder,'"=>"',place_value,'"') place
from soq.table2
) p;
-- raise notice 'holder_array_in==%',l_holder_values;
-- extract the text line and build array of place_holder names
select phv, string_to_array (string_agg(v,','),',')
into l_holder_line,l_holder_array
from (
select replace(replace(place_holder_line_in,'{',''),'}','') phv
, replace(replace(replace(regexp_matches(place_holder_line_in,'({[^}]+})','g')::text ,'{',''),'}',''),'"','') v
) s
group by phv;
-- raise notice 'Array==%',l_holder_array::text;
-- replace each key from text line with the corresponding value
for l_indx in 1 .. array_length(l_holder_array,1)
loop
l_holder_line = replace(l_holder_line,l_holder_array[l_indx],l_holder_values -> l_holder_array[l_indx]);
end loop;
-- done
return l_holder_line;
end;
$$;
-- Test driver
select id, soq.replace_holders(val_string) result_value from soq.table1;

I have created a simple query for this and it is working as required.
WITH RECURSIVE cte(id, value, level) AS (
SELECT id,value, 0 as level
FROM Table1
UNION
SELECT ts.id,replace(ts.value,'{'||tp.placeholder||'}',tp.value) as value, level+1
FROM cte ts, Table2 tp WHERE ts.value LIKE CONCAT('%{', tp.placeholder, '}%')
)
SELECT id, value FROM cte c
where level =
(
select Max(level)
from cte c2 where c.id=c2.id
)
Output is
id value
CDA4C3D4-72B5-4422-8071-A29D32BD14E0 https://10.90.30.40/svc/mytest-service/
608CB424-90BF-4B08-8CF8-241C7635434F jdbc:postgresql://1.2.3.4:5432/mytest

Iterate through column names to get counts in a PL/pgSQL function

I have a table in my Postgres database that I'm trying to determine fill rates for (that is, I'm trying to understand how often data is/isn't missing). I need to make a function that, for each column (in a list of a couple dozen columns I've selected), counts the number and percentage of rows with non-null values.
The problem is, I don't really know how to iterate through a list of columns in a programmatic way, because I don't know how to reference a column from a string of its name. I've read about how you can use the EXECUTE command to run dynamically-written SQL, but I haven't been able to get it to work. Here's my current function:
CREATE OR REPLACE FUNCTION get_fill_rates() RETURNS TABLE (field_name text, fill_count integer, fill_percentage float) AS $$
DECLARE
fields text[] := array['column_a', 'column_b', 'column_c'];
total_rows integer;
BEGIN
SELECT reltuples INTO total_rows FROM pg_class WHERE relname = 'my_table';
FOR i IN array_lower(fields, 1) .. array_upper(fields, 1)
LOOP
field_name := fields[i];
EXECUTE 'SELECT COUNT(*) FROM my_table WHERE $1 IS NOT NULL' INTO fill_count USING field_name;
fill_percentage := fill_count::float / total_rows::float;
RETURN NEXT;
END LOOP;
END;
$$ LANGUAGE plpgsql;
SELECT * FROM get_fill_rates() ORDER BY fill_count DESC;
This function, as written, returns every field as having a 100% fill rate, which I know to be false. How can I make this function work?

I know you already solved it, but let me suggest that you avoid concatenating identifiers into dynamic queries. You can use format() with the %I identifier placeholder instead:
CREATE OR REPLACE FUNCTION get_fill_rates() RETURNS TABLE (field_name text, fill_count integer, fill_percentage float) AS $$
DECLARE
fields text[] := array['column_a', 'column_b', 'column_c'];
table_name name := 'my_table';
total_rows integer;
BEGIN
SELECT reltuples INTO total_rows FROM pg_class WHERE relname = table_name;
FOREACH field_name IN ARRAY fields
LOOP
EXECUTE format('SELECT COUNT(*) FROM %I WHERE %I IS NOT NULL', table_name, field_name) INTO fill_count;
fill_percentage := fill_count::float / total_rows::float;
RETURN NEXT;
END LOOP;
END;
$$ LANGUAGE plpgsql;
Doing it this way helps prevent SQL injection attacks and reduces query parse overhead a bit.

I figured out the solution after I wrote my question but before I submitted it -- since I've already done the work of writing the question, I'll just go ahead and share the answer. The problem was in my EXECUTE statement, specifically with that USING field_name bit. I think it was getting treated as a string literal when I did it that way, which meant the query was evaluating if "a string literal" IS NOT NULL which of course, is always true.
Instead of parameterizing the column name, I need to inject it directly into the query string. So, I changed my EXECUTE line to the following:
EXECUTE 'SELECT COUNT(*) FROM my_table WHERE ' || field_name || ' IS NOT NULL' INTO fill_count;
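The difference between binding a value and placing an identifier in the statement text can be reproduced in a few lines. This sketch uses Python's built-in sqlite3 module as a stand-in for Postgres (an assumption made only so the example runs without a server); the table contents are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (column_a TEXT)")
conn.executemany("INSERT INTO my_table VALUES (?)", [("x",), (None,), (None,)])

# Bound parameter: the string 'column_a' is passed as a VALUE.
# A non-null string IS NOT NULL for every row, so all rows are counted.
as_value = conn.execute(
    "SELECT COUNT(*) FROM my_table WHERE ? IS NOT NULL", ("column_a",)
).fetchone()[0]

# Identifier placed in the statement text: the column itself is tested.
# The name is double-quoted (embedded quotes doubled) before it is inserted.
column = "column_a"
quoted = '"' + column.replace('"', '""') + '"'
as_column = conn.execute(
    "SELECT COUNT(*) FROM my_table WHERE " + quoted + " IS NOT NULL"
).fetchone()[0]

print(as_value, as_column)
```

Parameters can only stand in for values, never for table or column names, which is why the column name has to become part of the query string and should be quoted when it does.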

Some problems in the code aside (see below), this can be substantially faster and simpler with a single scan over the table in a plain query:
SELECT v.*
FROM (
SELECT count(column_a) AS ct_column_a
, count(column_b) AS ct_column_b
, count(column_c) AS ct_column_c
, count(*)::numeric AS ct
FROM my_table
) sub
, LATERAL (
VALUES
(text 'column_a', ct_column_a, round(ct_column_a / ct, 3))
, (text 'column_b', ct_column_b, round(ct_column_b / ct, 3))
, (text 'column_c', ct_column_c, round(ct_column_c / ct, 3))
) v(field_name, fill_count, fill_percentage);
The crucial "trick" here is that count() only counts non-null values to begin with, no tricks required.
I rounded the percentage to 3 decimal digits, which is optional. For this I cast to numeric.
Use a VALUES expression to unpivot the results and get one row per field.
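The single-scan idea can be tried on a toy table. The sketch below again uses Python's sqlite3 module as a stand-in for Postgres (an assumption for the sake of a self-contained example), with made-up rows; the unpivot step is done in Python here rather than with a VALUES expression.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (column_a TEXT, column_b TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [("a", None), ("b", "x"), (None, None), ("c", None)],
)

# One pass over the table: count(col) skips NULLs, count(*) counts every row.
ct_a, ct_b, ct = conn.execute(
    "SELECT count(column_a), count(column_b), count(*) FROM my_table"
).fetchone()

# Unpivot: one (field_name, fill_count, fill_percentage) tuple per column.
fill_rates = [
    ("column_a", ct_a, round(ct_a / ct, 3)),
    ("column_b", ct_b, round(ct_b / ct, 3)),
]

for row in fill_rates:
    print(row)
```

All counts come from one query, so the table is read once regardless of how many columns are checked.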
For repeated use or if you have a long list of columns to process, you can generate and execute the query dynamically. But, again, don't run a separate count for each column. Just build above query dynamically:
CREATE OR REPLACE FUNCTION get_fill_rates(tbl regclass, fields text[])
RETURNS TABLE (field_name text, fill_count bigint, fill_percentage numeric) AS
$func$
BEGIN
RETURN QUERY EXECUTE (
-- RAISE NOTICE '%', ( -- to debug if needed
SELECT
'SELECT v.*
FROM (
SELECT count(*)::numeric AS ct
, ' || string_agg(format('count(%I) AS %I', fld, 'ct_' || fld), ', ') || '
FROM ' || tbl || '
) sub
, LATERAL (
VALUES
(text ' || string_agg(format('%L, %2$I, round(%2$I/ ct, 3))', fld, 'ct_' || fld), ', (') || '
) v(field_name, fill_count, fill_pct)
ORDER BY v.fill_count DESC'
FROM unnest(fields) fld
);
END
$func$ LANGUAGE plpgsql;
Call:
SELECT * FROM get_fill_rates('my_table', '{column_a, column_b, column_c}');
As you can see, this works for any given table and column list now.
And all identifiers are properly quoted automatically, using format() or by the built-in virtues of the regclass type.
Related:
Table name as a PostgreSQL function parameter
How to unpivot a table in PostgreSQL
Query for crosstab view
Convert one row into multiple rows with fewer columns
Your original query could be improved like this, but this is just lipstick on a pig. Do not use this inefficient approach.
CREATE OR REPLACE FUNCTION get_fill_rates()
RETURNS TABLE (field_name text, fill_count bigint, fill_percentage float) AS
$$
DECLARE
fields text[] := '{column_a, column_b, column_c}'; -- must be legal identifiers!
total_rows float; -- use float right away
BEGIN
SELECT reltuples INTO total_rows FROM pg_class WHERE relname = 'my_table';
FOREACH field_name IN ARRAY fields -- use FOREACH
LOOP
EXECUTE 'SELECT COUNT(*) FROM my_table WHERE ' || field_name || ' IS NOT NULL'
INTO fill_count;
fill_percentage := fill_count / total_rows; -- already type float
RETURN NEXT;
END LOOP;
END
$$ LANGUAGE plpgsql;
Plus, pg_class.reltuples is only an estimate. Since you are counting anyway, use an actual count.
Related:
Iterating over integer[] in PL/pgSQL
Fast way to discover the row count of a table in PostgreSQL

Split given string and prepare case statement

Table: table_name
create table table_name
(
given_dates timestamp,
set_name varchar
);
Insertion of records:
insert into table_name values('2001-01-01'),('2001-01-05'),('2001-01-10'),
('2001-01-15'),('2001-01-20'),('2001-01-25'),
('2001-02-01'),('2001-02-05'),('2001-02-10'),
('2001-02-15');
Now I want to update set_name for some dates.
For example:
I want to update table like this:
given_dates set_name
----------------------
2001-01-01 s1
2001-01-05 s1
2001-01-10 s2
2001-01-15 s2
2001-01-20
2001-01-25
2001-02-01
2001-02-05
2001-02-10
2001-02-15
Note: The given dates and set names are passed as parameters because they are dynamic. I may pass 2 sets
as shown above (s1, s2) or 4 sets, depending on the requirement.
So I need a dynamic CASE statement to update set_name.
Given two parameters:
declare p_dates varchar := '2001-01-01to2001-01-05,2001-01-10to2001-01-15';
declare p_sets varchar := 's1,s2';
Well I can do this by using following static script:
Static Update statement:
update table_name
SET set_name =
CASE
when given_dates between '2001-01-01' and '2001-01-05' then 's1'
when given_dates between '2001-01-10' and '2001-01-15' then 's2'
else ''
end;
The above update statement gets the job done, but statically.
In the same way, I want to prepare only the CASE statement dynamically, so that it changes as the parameters (p_dates, p_sets) change.
Questions:
How to split the given dates in p_dates? (The keyword 'to' separates the two dates.)
How to split the given sets in p_sets? (A comma ',' separates the set names.)
How to prepare the dynamic CASE statement after splitting p_dates and p_sets?
This question relates to Dynamic case statement using SQL Server 2008 R2, which is the same thing but for Microsoft SQL Server.

Clean setup:
CREATE TABLE tbl (
given_date date
, set_name varchar
);
Use a singular term as column name for a single value.
The data type is obviously date and not a timestamp.
To transform your text parameters into a useful table:
SELECT unnest(string_to_array('2001-01-01to2001-01-05,2001-01-10to2001-01-15', ',')) AS date_range
, unnest(string_to_array('s1,s2', ',')) AS set_name;
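For readers who want to see the pairing logic in isolation, here is a minimal Python sketch of the same transformation: split both parameters on commas, pair them up position by position, and split each range on the 'to' keyword. The function names `parse_ranges` and `set_for` are hypothetical.

```python
from datetime import date

p_dates = "2001-01-01to2001-01-05,2001-01-10to2001-01-15"
p_sets = "s1,s2"


def parse_ranges(dates, sets):
    """Pair each 'startTOend' range with the set name at the same position."""
    pairs = []
    for date_range, set_name in zip(dates.split(","), sets.split(",")):
        start, end = date_range.split("to")
        pairs.append((date.fromisoformat(start), date.fromisoformat(end), set_name))
    return pairs


def set_for(day, pairs):
    """Return the set name whose inclusive range contains day, else None."""
    for start, end, set_name in pairs:
        if start <= day <= end:
            return set_name
    return None


pairs = parse_ranges(p_dates, p_sets)
print(set_for(date(2001, 1, 12), pairs))
```

Note that Python's zip() silently stops at the shorter list, so the two parameters should be checked for equal length before they are trusted.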
"Parallel unnest" is handy but has its caveats. Postgres 9.4 adds a clean solution, Postgres 10 eventually sanitized the behavior of this. See below.
Dynamic execution
Prepared statement
Prepared statements are only visible to the creating session and die with it. Per documentation:
Prepared statements only last for the duration of the current database session.
PREPARE once per session:
PREPARE upd_tbl AS
UPDATE tbl t
SET set_name = s.set_name
FROM (
SELECT unnest(string_to_array($1, ',')) AS date_range
, unnest(string_to_array($2, ',')) AS set_name
) s
WHERE t.given_date BETWEEN split_part(date_range, 'to', 1)::date
AND split_part(date_range, 'to', 2)::date;
Or use tools provided by your client to prepare the statement.
Execute n times with arbitrary parameters:
EXECUTE upd_tbl('2001-01-01to2001-01-05,2001-01-10to2001-01-15', 's1,s4');
Server-side function
Functions are persisted and visible to all sessions.
CREATE FUNCTION once:
CREATE OR REPLACE FUNCTION f_upd_tbl(_date_ranges text, _names text)
RETURNS void AS
$func$
UPDATE tbl t
SET set_name = s.set_name
FROM (
SELECT unnest(string_to_array($1, ',')) AS date_range
, unnest(string_to_array($2, ',')) AS set_name
) s
WHERE t.given_date BETWEEN split_part(date_range, 'to', 1)::date
AND split_part(date_range, 'to', 2)::date
$func$ LANGUAGE sql;
Call n times:
SELECT f_upd_tbl('2001-01-01to2001-01-05,2001-01-20to2001-01-25', 's2,s5');
Superior design
Use array parameters (can still be provided as string literals), a daterange type (both pg 9.3) and the new parallel unnest() (pg 9.4).
CREATE OR REPLACE FUNCTION f_upd_tbl(_dr daterange[], _n text[])
RETURNS void AS
$func$
UPDATE tbl t
SET set_name = s.set_name
FROM unnest($1, $2) s(date_range, set_name)
WHERE t.given_date <@ s.date_range
$func$ LANGUAGE sql;
<@ being the "element is contained by" operator.
Call:
SELECT f_upd_tbl('{"[2001-01-01,2001-01-05]"
,"[2001-01-20,2001-01-25]"}', '{s2,s5}');
Details:
Unnest multiple arrays in parallel

String_to_array
declare p_dates varchar[] := string_to_array('2001-01-01,2001-01-05,
2001-01-10,2001-01-15*2001-01-01,2001-01-05,2001-01-10,2001-01-15','*');
declare p_sets varchar[] := string_to_array('s1,s2',',');
declare p_count integer := 0; -- number of '*'-separated groups in p_dates
declare p_str varchar[];
declare i integer;
select array_length(p_dates, 1) into p_count;
for i in 1..p_count loop
p_str := string_to_array(p_dates[i], ',');
execute 'update table_name
SET set_name =
CASE
when given_dates between ''' || p_str[1] || ''' and ''' || p_str[2]
|| ''' then ''' || p_sets[1] || '''
when given_dates between ''' || p_str[3] || ''' and '''
|| p_str[4] || ''' then ''' || p_sets[2] || '''
else ''''
end';
end loop;

Now (Postgres 14 or later) we can use datemultirange.
create or replace function f_upd_tbl_multirange(_dr datemultirange , _n text[])
returns void as
$func$
UPDATE tbl t
SET set_name = s.set_name
FROM unnest($1,$2) s(date_range,set_name)
WHERE t.given_date <@ s.date_range
$func$ language sql;
Run it:
SELECT f_upd_tbl_multirange(
'{[2022-01-01,2022-01-05],[2022-02-06,2022-02-25]}', '{s2,s5}');

Define a function in an SQL Navigator query

I would like to extract a function from a piece of SQL-code which is used multiple times in one query. I'm looking for a functionality which is similar to the following (invented by me) syntax:
with f(x) as (return x+1)
select f(thing1), f(thing2), f(thing3) from things
thing1, thing2, thing3 are integer columns in the table "things" in the example. Also, imagine that f is more complicated than an add-one function.
How do I define a function inside a query?

Declaration of a function in the WITH clause of a query is not possible but according to the information presented at OOW it will be in 12c version. So for now you need to create a function as a schema object whether it would be a stand-alone function or part of a package. For example:
create or replace function F(p_p in number)
return number
is
begin
return p_p + 1;
end;
And then call it in a query, ensuring that the data type of a column you are passing in to the function as a parameter is of the same data type as the parameter of the function:
select f(col1)
, f(col2)
, ...
, f(coln)
from your_table

Are you trying to build a function that works for a dynamic table and its columns? If so, I would write something like this. Note that you cannot declare a function in the WITH clause of a query.
SELECT f ( tablename,columnname1 ),
f ( tablename,columnname2 ),
........
FROM tablename;
Create or replace function f (tableName varchar2,ColumnName varchar2)
Return somethingHere
Is
varTableName varchar2(200);
varColumnName varchar2(200);
varValue integer;
t_cid INTEGER;
t_command VARCHAR2(200);
Begin
--Get tableName
varTableName := tableName ;
--Get columnName
varColumnName := ColumnName ;
t_command := 'SELECT ' || varColumnName ||' FROM ' || varTableName;
--Here execute dynamic sql statement
DBMS_SQL.PARSE
DBMS_SQL.DEFINE_COLUMN
DBMS_SQL.EXECUTE
--fetch row values into varValue
DBMS_SQL.COLUMN_VALUE (..,..,varValue);
--then do your x+1 magic here
varValue := varValue+1;
--then output your value.
End;
