Select a row and insert it with different IDs n times - sql

I am trying to come up with a script in Postgres that will select the first row in a table and insert that row x number of times back into the same table.
Here is what I have:
INSERT INTO campaign (select column_name from campaign)
SELECT x.id from generate_series(50, 500) as x(id);
The above obviously doesn't work.

Just get the syntax for the INSERT statement right:
INSERT INTO campaign (id, column_name)
SELECT g.g, t.column_name
FROM (SELECT column_name FROM campaign LIMIT 1) t -- picking arbitrary row
,generate_series(50, 500) g(g); -- 451 times
The CROSS JOIN to generate_series() multiplies each selected row.
Selecting one arbitrary row, since the question didn't define "first". There is no natural order in a table. To pick a certain row, add ORDER BY and/or WHERE.
There is no syntactical shortcut to select all columns except the one named "id". You have to use the complete row or provide a list of selected columns.
Automation with dynamic SQL
To get around this, build the query string from catalog tables (or the information schema) and use EXECUTE in a plpgsql function (or some other procedural language). Only using pg_attribute.
format() requires Postgres 9.1 or later.
CREATE OR REPLACE FUNCTION f_multiply_row(_tbl regclass
, _idname text
, _minid int
, _maxid int)
RETURNS void AS
$func$
BEGIN
EXECUTE (
SELECT format('INSERT INTO %1$s (%2$I, %3$s)
SELECT g.g, %3$s
FROM (SELECT * FROM %1$s LIMIT 1) t
,generate_series($1, $2) g(g)'
, _tbl
, _idname
, string_agg(quote_ident(attname), ', ')
)
FROM pg_attribute
WHERE attrelid = _tbl
AND attname <> _idname -- exclude id column
AND NOT attisdropped -- no dropped (dead) columns
AND attnum > 0 -- no system columns
)
USING _minid, _maxid;
END
$func$ LANGUAGE plpgsql;
Call in your case:
SELECT f_multiply_row('campaign', 'id', 50, 500);
SQL Fiddle.
Major points
Properly escape identifiers to avoid SQL injection. Using format() and regclass for the table name. Details:
Table name as a PostgreSQL function parameter
_idname is the column name to exclude ('id' in your case). Case sensitive!
Pass values in the USING clause. $1 and $2 in generate_series($1, $2) reference those parameters (not the function parameters).
More explanation in related answers. Try a search:
https://stackoverflow.com/search?q=[plpgsql]+[dynamic-sql]+format+pg_attribute

Related

If the first field doesn't exists in a table then look at a different field in the same table

Is there a way to select a field from a table and if that field doesn't exist then select a different field from the same table? example:
SELECT MY_FIELD from MY_TABLE
error: "MY_FIELD": invalid identifier
is there any way to check if it exists and if it does then use that field for the query, if it doesn't exist then use example:
SELECT my_field2 from client.
My problem is
I am writing a report that will be used on two databases, but the field names on occasion can be named slightly different depending on the database.
What you really need to do is talk to your management / development leads about why the different databases are not harmonized. But, since this is a programming site, here is a programming answer using dynamic SQL.
As has been pointed out, you could create views in the different databases to provide yourself with a harmonized layer to query from. If you are unable to create views, you can do something like this:
create table test ( present_column NUMBER );
insert into test select rownum * 10 from dual connect by rownum <= 5;
declare
l_rc SYS_REFCURSOR;
begin
BEGIN
OPEN l_rc FOR 'SELECT missing_column FROM test';
EXCEPTION
WHEN others THEN
OPEN l_rc FOR 'SELECT present_column FROM test';
END;
-- This next only works in 12c and later
-- In earlier versions, you've got to process l_rc on your own.
DBMS_SQL.RETURN_RESULT(l_rc);
end;
This is inferior to the other solutions (either harmonizing the databases or creating views). For one thing, you get no compile time checking of your queries this way.
That won't compile, so - I'd say not. You might try with dynamic SQL which reads contents of the USER_TAB_COLUMNS and create SELECT statement on-the-fly.
Depending on reporting tool you use, that might (or might not) be possible. For example, Apex offers (as reports's source) a function that returns query, so you might use it there.
I'd suggest a simpler option - create views on both databases which have unified column names, so that your report always selects from the view and works all the time. For example:
-- database 1:
create view v_client as
select client_id id,
client_name name
from your_table;
-- database 2:
create view v_client as
select clid id,
clnam name
from your_table;
-- reporting tool:
select id, name
from v_client;
This can be done in a single SQL statement using DBMS_XMLGEN.GETXML, but it gets messy. It would probably be cleaner to use dynamic SQL or a view, but there are times when it's difficult to create supporting objects.
Sample table:
--Create either table.
create table my_table(my_field1 number);
insert into my_table values(1);
insert into my_table values(2);
create table my_table(my_field2 number);
insert into my_table values(1);
insert into my_table values(2);
Query:
--Get the results by converting XML into rows.
select my_field
from
(
--Convert to an XMLType.
select xmltype(clob_results) xml_results
from
(
--Conditionally select either MY_FIELD1 or MY_FIELD2, depending on which exists.
select dbms_xmlgen.GetXML('select my_field1 my_field from my_table') clob_results
from user_tab_columns
where table_name = 'MY_TABLE'
and column_name = 'MY_FIELD1'
--Stop transformations from running the XMLType conversion on nulls.
and rownum >= 1
union all
select dbms_xmlgen.GetXML('select my_field2 my_field from my_table') clob_results
from user_tab_columns
where table_name = 'MY_TABLE'
and column_name = 'MY_FIELD2'
--Stop transformations from running the XMLType conversion on nulls.
and rownum >= 1
)
--Only convert non-null values.
where clob_results is not null
)
cross join
xmltable
(
'/ROWSET/ROW'
passing xml_results
columns
my_field number path 'MY_FIELD'
);
Results:
MY_FIELD
--------
1
2
Here's a SQL Fiddle if you want to see it running.

How to get postgresql to interpret column names from sub-select?

I have the following query in PostgreSQL 9.4.1 on x86_64-apple-darwin14.1.0, compiled by Apple LLVM version 6.0 (clang-600.0.54) (based on LLVM 3.5svn), 64-bit.
SELECT (
SELECT string_agg(trim(cols::text, '()'), ', ')
FROM (
SELECT 'dog.' || column_name
FROM information_schema.columns
WHERE table_name='dog')
AS cols
)
FROM dog;
This produces the following output once for every row in the dogs table:
dog.id, dog.created_date, dog.updated_date, dog.author_id, dog.mother_id, dog.start_date, dog.end_date, dog.session_type_id, dog.note, dog.cancelled, dog.life_phase
But the problem is that the outer select query has not dereferenced the column names and interpreted them as column names but rather labeled the column string_agg and spit out the column names once for every row in the dog table.
How do I get postgresql to ensure that the inner select generating the column names is interpreted by the outer select properly?
You are trying to execute a dynamic SQL. You need a plpgsql function to do that. Example:
create or replace function select_dogs()
returns setof dog language plpgsql as $$
declare
list_of_columns text;
begin
SELECT string_agg(trim(cols::text, '()'), ', ')
INTO list_of_columns
FROM (
SELECT 'dog.' || column_name
FROM information_schema.columns
WHERE table_name='dog'
) AS cols;
return query execute format (
'select %s from dog', list_of_columns);
end $$;
select * from select_dogs();
Read more in the documentation: Executing Dynamic Commands.

nzsql - Converting a subquery into columns for another select

Goal: Use a given subquery's results (a single column with many rows of names) to act as the outer select's selection field.
Currently, my subquery is the following:
SELECT column_name
FROM information_schema.columns
WHERE table_name = 'test_table' AND column_name not in ('colRemove');
What I am doing in this subquery is grabbing all the column names from a table (i.e. test_table) and outputting all except for the column name specified (i.e. colRemove). As stated in the "goal", I want to use this subquery as such:
SELECT (*enter subquery from above here*)
FROM actual_table
WHERE (*enter specific conditions*)
I am working on a Netezza SQL server that is version 7.0.4.4. Ideally, I would like to make the entire query executable in one line, but for now, a working solution would be much appreciated. Thanks!
Note: I do not believe that the SQL extensions has been installed (i.e. arrays), but I will need to double check this.
A year too late, here's the best I can come up with but, as you already noticed, it requires a stored procedure to do the dynamic SQL. The stored proc creates a view with the all the columns from the source table minus the one you want to exclude.
-- Create test data.
CREATE TABLE test (firstcol INTEGER, secondcol INTEGER, thirdcol INTEGER);
INSERT INTO test (firstcol, secondcol, thirdcol) VALUES (1, 2, 3);
INSERT INTO test (firstcol, secondcol, thirdcol) VALUES (4, 5, 6);
-- Install stored procedure.
CREATE OR REPLACE PROCEDURE CreateLimitedView (varchar(ANY), varchar(ANY)) RETURNS BOOLEAN
LANGUAGE NZPLSQL AS
BEGIN_PROC
DECLARE
tableName ALIAS FOR $1;
columnToExclude ALIAS FOR $2;
colRec RECORD;
cols VARCHAR(2000); -- Adjust as needed.
isfirstcol BOOLEAN;
BEGIN
isfirstcol := true;
FOR colRec IN EXECUTE
'SELECT ATTNAME AS NAME FROM _V_RELATION_COLUMN
WHERE
NAME=UPPER('||quote_literal(tableName)||')
AND ATTNAME <> UPPER('||quote_literal(columnToExclude)||')
ORDER BY ATTNUM'
LOOP
IF isfirstcol THEN
cols := colRec.NAME;
ELSE
cols := cols || ', ' || colRec.NAME;
END IF;
isfirstcol := false;
END LOOP;
-- Should really check if 'LimitedView' already exists as a view, table or synonym.
EXECUTE IMMEDIATE 'CREATE OR REPLACE VIEW LimitedView AS SELECT ' || cols || ' FROM ' || quote_ident(tableName);
RETURN true;
END;
END_PROC
;
-- Run the stored proc to create the view.
CALL CreateLimitedView('test', 'secondcol');
-- Select results from the view.
SELECT * FROM limitedView WHERE firstcol = 4;
FIRSTCOL | THIRDCOL
----------+----------
4 | 6
You could have the stored proc return a resultset directly but then you wouldn't be able to filter results with a WHERE clause.

Update multiple columns that start with a specific string

I am trying to update a bunch of columns in a DB for testing purposes of a feature. I have a table that is built with hibernate so all of the columns that are created for an embedded entity begin with the same name. I.e. contact_info_address_street1, contact_info_address_street2, etc.
I am trying to figure out if there is a way to do something to the affect of:
UPDATE table SET contact_info_address_* = null;
If not, I know I can do it the long way, just looking for a way to help myself out in the future if I need to do this all over again for a different set of columns.
You need dynamic SQL for this. So you must defend against possible SQL injection.
Basic query
The basic query to generate the DML command needed can look like this:
SELECT format('UPDATE tbl SET (%s) = (%s)'
,string_agg (quote_ident(attname), ', ')
,string_agg ('NULL', ', ')
)
FROM pg_attribute
WHERE attrelid = 'tbl'::regclass
AND NOT attisdropped
AND attnum > 0
AND attname ~~ 'foo_%';
Returns:
UPDATE tbl SET (foo_a, foo_b, foo_c) = (NULL, NULL, NULL);
Make use of the "column-list syntax" of UPDATE to shorten the code and simplify the task.
I query the system catalogs instead of information schema because the latter, while being standardized and guaranteed to be portable across major versions, is also notoriously slow and sometimes unwieldy. There are pros and cons, see:
Get column names and data types of a query, table or view
quote_ident() for the column names prevents SQL-injection - also necessary for identifiers.
string_agg() requires 9.0+.
Full automation with PL/pgSQL function
CREATE OR REPLACE FUNCTION f_update_cols(_tbl regclass, _col_pattern text
, OUT row_ct int, OUT col_ct bigint)
LANGUAGE plpgsql AS
$func$
DECLARE
_sql text;
BEGIN
SELECT INTO _sql, col_ct
format('UPDATE tbl SET (%s) = (%s)'
, string_agg (quote_ident(attname), ', ')
, string_agg ('NULL', ', ')
)
, count(*)
FROM pg_attribute
WHERE attrelid = _tbl
AND NOT attisdropped -- no dropped columns
AND attnum > 0 -- no system columns
AND attname LIKE _col_pattern; -- only columns matching pattern
-- RAISE NOTICE '%', _sql; -- output SQL for debugging
EXECUTE _sql;
GET DIAGNOSTICS row_ct = ROW_COUNT;
END
$func$;
COMMENT ON FUNCTION f_update_cols(regclass, text)
IS 'Updates all columns of table _tbl ($1)
that match _col_pattern ($2) in a LIKE expression.
Returns the count of columns (col_ct) and rows (row_ct) affected.';
Call:
SELECT * FROM f_update_cols('myschema.tbl', 'foo%');
To make the function more practical, it returns information as described in the comment. More about obtaining the result status in plpgsql in the manual.
I use the variable _sql to hold the query string, so I can collect the number of columns found (col_ct) in the same query.
The object identifier type regclass is the most efficient way to automatically avoid SQL injection (and sanitize non-standard names) for the table name, too. You can use schema-qualified table names to avoid ambiguities. I would advise to do so if you (can) have multiple schemas in your db! See:
Table name as a PostgreSQL function parameter
db<>fiddle here
Old sqlfiddle
There's no handy shortcut sorry. If you have to do this kind of thing a lot, you could create a function to dynamically execute sql and achieve your goal.
CREATE OR REPLACE FUNCTION reset_cols() RETURNS boolean AS $$ BEGIN
EXECUTE (select 'UPDATE table SET '
|| array_to_string(array(
select column_name::text
from information_schema.columns
where table_name = 'table'
and column_name::text like 'contact_info_address_%'
),' = NULL,')
|| ' = NULL');
RETURN true;
END; $$ LANGUAGE plpgsql;
-- run the function
SELECT reset_cols();
It's not very nice though. A better function would be one that accepts the tablename and column prefix as args. Which I'll leave as an exercise for the readers :)

Update multiple columns in a trigger function in plpgsql

Given the following schema:
create table account_type_a (
id SERIAL UNIQUE PRIMARY KEY,
some_column VARCHAR
);
create table account_type_b (
id SERIAL UNIQUE PRIMARY KEY,
some_other_column VARCHAR
);
create view account_type_a view AS select * from account_type_a;
create view account_type_b view AS select * from account_type_b;
I try to create a generic trigger function in plpgsql, which enables updating the view:
create trigger trUpdate instead of UPDATE on account_view_type_a
for each row execute procedure updateAccount();
create trigger trUpdate instead of UPDATE on account_view_type_a
for each row execute procedure updateAccount();
An unsuccessful effort of mine was:
create function updateAccount() returns trigger as $$
declare
target_table varchar := substring(TG_TABLE_NAME from '(.+)_view');
cols varchar;
begin
execute 'select string_agg(column_name,$1) from information_schema.columns
where table_name = $2' using ',', target_table into cols;
execute 'update ' || target_table || ' set (' || cols || ') = select ($1).*
where id = ($1).id' using NEW;
return NULL;
end;
$$ language plpgsql;
The problem is the update statement. I am unable to come up with a syntax that would work here. I have successfully implemented this in PL/Perl, but would be interested in a plpgsql-only solution.
Any ideas?
Update
As #Erwin Brandstetter suggested, here is the code for my PL/Perl solution. I incoporated some of his suggestions.
create function f_tr_up() returns trigger as $$
use strict;
use warnings;
my $target_table = quote_ident($_TD->{'table_name'}) =~ s/^([\w]+)_view$/$1/r;
my $NEW = $_TD->{'new'};
my $cols = join(',', map { quote_ident($_) } keys $NEW);
my $vals = join(',', map { quote_literal($_) } values $NEW);
my $query = sprintf(
"update %s set (%s) = (%s) where id = %d",
$target_table,
$cols,
$vals,
$NEW->{'id'});
spi_exec_query($query);
return;
$$ language plperl;
While #Gary's answer is technically correct, it fails to mention that PostgreSQL does support this form:
UPDATE tbl
SET (col1, col2, ...) = (expression1, expression2, ..)
Read the manual on UPDATE.
It's still tricky to get this done with dynamic SQL. I'll assume a simple case where views consist of the same columns as their underlying tables.
CREATE VIEW tbl_view AS SELECT * FROM tbl;
Problems
The special record NEW is not visible inside EXECUTE. I pass NEW as a single parameter with the USING clause of EXECUTE.
As discussed, UPDATE with list-form needs individual values. I use a subselect to split the record into individual columns:
UPDATE ...
FROM (SELECT ($1).*) x
(Parenthesis around $1 are not optional.) This allows me to simply use two column lists built with string_agg() from the catalog table: one with and one without table qualification.
It's not possible to assign a row value as a whole to individual columns. The manual:
According to the standard, the source value for a parenthesized
sub-list of target column names can be any row-valued expression
yielding the correct number of columns. PostgreSQL only allows the
source value to be a row constructor or a sub-SELECT.
INSERT is implemented simpler. If the structure of view and table are identical we can omit the column definition list. (Can be improved, see below.)
Solution
I made a couple of updates to your approach to make it shine.
Trigger function for UPDATE:
CREATE OR REPLACE FUNCTION f_trg_up()
RETURNS TRIGGER
LANGUAGE plpgsql AS
$func$
DECLARE
_tbl regclass := quote_ident(TG_TABLE_SCHEMA) || '.'
|| quote_ident(substring(TG_TABLE_NAME from '(.+)_view$'));
_cols text;
_vals text;
BEGIN
SELECT INTO _cols, _vals
string_agg(quote_ident(attname), ', ')
, string_agg('x.' || quote_ident(attname), ', ')
FROM pg_attribute
WHERE attrelid = _tbl
AND NOT attisdropped -- no dropped (dead) columns
AND attnum > 0; -- no system columns
EXECUTE format('
UPDATE %s
SET (%s) = (%s)
FROM (SELECT ($1).*) x', _tbl, _cols, _vals)
USING NEW;
RETURN NEW; -- Don't return NULL unless you knwo what you're doing
END
$func$;
Trigger function for INSERT:
CREATE OR REPLACE FUNCTION f_trg_ins()
RETURNS TRIGGER
LANGUAGE plpgsql AS
$func$
DECLARE
_tbl regclass := quote_ident(TG_TABLE_SCHEMA) || '.'
|| quote_ident(substring(TG_TABLE_NAME FROM '(.+)_view$'));
BEGIN
EXECUTE format('INSERT INTO %s SELECT ($1).*', _tbl)
USING NEW;
RETURN NEW; -- Don't return NULL unless you know what you're doing
END
$func$;
Triggers:
CREATE TRIGGER trg_instead_up
INSTEAD OF UPDATE ON a_view
FOR EACH ROW EXECUTE FUNCTION f_trg_up();
CREATE TRIGGER trg_instead_ins
INSTEAD OF INSERT ON a_view
FOR EACH ROW EXECUTE FUNCTION f_trg_ins();
Before Postgres 11 the syntax (oddly) was EXECUTE PROCEDURE instead of EXECUTE FUNCTION - which also still works.
db<>fiddle here - demonstrating INSERT and UPDATE
Old sqlfiddle
Major points
Include the schema name to make the table reference unambiguous. There can be multiple table of the same name in one database with multiple schemas!
Query pg_catalog.pg_attribute instead of information_schema.columns. Less portable, but much faster and allows to use the table-OID.
How to check if a table exists in a given schema
Table names are NOT safe against SQLi when concatenated as strings for dynamic SQL. Escape with quote_ident() or format() or with an object-identifer type. This includes the special trigger function variables TG_TABLE_SCHEMA and TG_TABLE_NAME!
Cast to the object identifier type regclass to assert the table name is valid and get the OID for the catalog look-up.
Optionally use format() to build the dynamic query string safely.
No need for dynamic SQL for the first query on the catalog tables. Faster, simpler.
Use RETURN NEW instead of RETURN NULL in these trigger functions unless you know what you are doing. (NULL would cancel the INSERT for the current row.)
This simple version assumes that every table (and view) has a unique column named id. A more sophisticated version might use the primary key dynamically.
The function for UPDATE allows the columns of view and table to be in any order, as long as the set is the same.
The function for INSERT expects the columns of view and table to be in identical order. If you want to allow arbitrary order, add a column definition list to the INSERT command, just like with UPDATE.
Updated version also covers changes to the id column by using OLD additionally.
Postgresql doesn't support updating multiple columns using the set (col1,col2) = select val1,val2 syntax.
To achieve the same in postgresql you'd use
update target_table
set col1 = d.val1,
col2 = d.val2
from source_table d
where d.id = target_table.id
This is going to make the dynamic query a bit more complex to build as you'll need to iterate the column name list you're using into individual fields. I'd suggest you use array_agg instead of string_agg as an array is easier to process than splitting the string again.
Postgresql UPDATE syntax
documentation on array_agg function