How do I add a new column to a table after the 2nd or 3rd column, using Postgres?
My code looks as follows:
ALTER TABLE n_domains ADD COLUMN contract_nr int after owner_id
No, there's no direct way to do that. And there's a reason for it - every query should list all the fields it needs in whatever order (and format etc) it needs them, thus making the order of the columns in one table insignificant.
If you really need to do that, I can think of one workaround:
dump and save the description of the table in question (using pg_dump --schema-only --table=<schema.table> ...)
add the column you want where you want it in the saved definition
rename the table in the saved definition so as not to clash with the name of the old table when you attempt to create it
create the new table using this definition
populate the new table with the data from the old table using INSERT INTO <new_table> SELECT field1, field2, <default_for_new_field>, field3, ... FROM <old_table>
rename the old table
rename the new table to the original name
eventually drop the old, renamed table after you make sure everything's alright
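A minimal sketch of that workaround, using the asker's table (domain_id and name are hypothetical column names; 0 stands in for the new field's default):
pg_dump --schema-only --table=public.n_domains mydb > n_domains.sql
-- edit n_domains.sql: add contract_nr after owner_id and rename the table to n_domains_new,
-- then create the new table from the edited file: psql mydb -f n_domains.sql
INSERT INTO n_domains_new (domain_id, owner_id, contract_nr, name)
SELECT domain_id, owner_id, 0, name FROM n_domains;
ALTER TABLE n_domains RENAME TO n_domains_old;
ALTER TABLE n_domains_new RENAME TO n_domains;
-- once you're sure everything is alright:
DROP TABLE n_domains_old;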
The order of columns is not irrelevant: putting fixed-width columns at the front of the table can optimize the storage layout of your data, and it can also make working with your data easier outside of your application code.
PostgreSQL does not support altering the column ordering (see Alter column position on the PostgreSQL wiki); if the table is relatively isolated, your best bet is to recreate the table:
CREATE TABLE foobar_new ( ... );
INSERT INTO foobar_new SELECT ... FROM foobar;
DROP TABLE foobar CASCADE;
ALTER TABLE foobar_new RENAME TO foobar;
If you have a lot of views or constraints defined against the table, you can re-add all the columns after the new column and drop the original columns (see the PostgreSQL wiki for an example).
The real problem here is that the feature simply isn't finished yet. Currently PostgreSQL's logical ordering is the same as the physical ordering. That's problematic because you can't get a different logical ordering, but it's even worse because the table isn't physically packed automatically, so by moving columns you can get different performance characteristics.
Arguing that it's that way by design is pointless. It's somewhat likely to change at some point when an acceptable patch is submitted.
All of that said, is it a good idea to rely on the ordinal positioning of columns, logical or physical? Hell no. In production code you should never be using an implicit ordering or *. Why make the code more brittle than it needs to be? Correctness should always be a higher priority than saving a few keystrokes.
As a workaround, you can in fact modify the column ordering by recreating the table, or through the "add and reorder" game.
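A hedged sketch of the "add and reorder" game, on a hypothetical table t(col1, col2, col3), squeezing new_col in after col2:
ALTER TABLE t ADD COLUMN new_col integer;
ALTER TABLE t ADD COLUMN col3_moved text;  -- re-add the trailing column under a temporary name
UPDATE t SET col3_moved = col3;
ALTER TABLE t DROP COLUMN col3;
ALTER TABLE t RENAME COLUMN col3_moved TO col3;  -- logical order is now col1, col2, new_col, col3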
See also:
Column tetris reordering to make things more space-efficient
The column order is relevant to me, so I created this function. See if it helps. It handles indexes, primary keys, and triggers; views, foreign keys, and some other features are still missing.
Example:
SELECT xaddcolumn('table', 'col3 int NOT NULL DEFAULT 0', 'col2');
Source code:
CREATE OR REPLACE FUNCTION xaddcolumn(ptable text, pcol text, pafter text) RETURNS void AS $BODY$
DECLARE
rcol RECORD;
rkey RECORD;
ridx RECORD;
rtgr RECORD;
vsql text;
vkey text;
vidx text;
cidx text;
vtgr text;
ctgr text;
etgr text;
vseq text;
vtype text;
vcols text;
BEGIN
EXECUTE 'CREATE TABLE zzz_' || ptable || ' AS SELECT * FROM ' || ptable;
--columns
vseq = '';
vcols = '';
vsql = 'CREATE TABLE ' || ptable || '(';
FOR rcol IN SELECT column_name as col, udt_name as coltype, column_default as coldef,
is_nullable as is_null, character_maximum_length as len,
numeric_precision as num_prec, numeric_scale as num_scale
FROM information_schema.columns
WHERE table_name = ptable
ORDER BY ordinal_position
LOOP
vtype = rcol.coltype;
IF (substr(rcol.coldef,1,7) = 'nextval') THEN
vtype = 'serial';
vseq = vseq || 'SELECT setval(''' || ptable || '_' || rcol.col || '_seq'''
|| ', max(' || rcol.col || ')) FROM ' || ptable || ';';
ELSIF (vtype = 'bpchar') THEN
vtype = 'char';
END IF;
vsql = vsql || E'\n' || rcol.col || ' ' || vtype;
IF (vtype in ('varchar', 'char')) THEN
vsql = vsql || '(' || rcol.len || ')';
ELSIF (vtype = 'numeric') THEN
vsql = vsql || '(' || rcol.num_prec || ',' || rcol.num_scale || ')';
END IF;
IF (rcol.is_null = 'NO') THEN
vsql = vsql || ' NOT NULL';
END IF;
IF (rcol.coldef <> '' AND vtype <> 'serial') THEN
vsql = vsql || ' DEFAULT ' || rcol.coldef;
END IF;
vsql = vsql || E',';
vcols = vcols || rcol.col || ',';
--
IF (rcol.col = pafter) THEN
vsql = vsql || E'\n' || pcol || ',';
END IF;
END LOOP;
vcols = substr(vcols,1,length(vcols)-1);
--keys
vkey = '';
FOR rkey IN SELECT constraint_name as name, column_name as col
FROM information_schema.key_column_usage
WHERE table_name = ptable
LOOP
IF (vkey = '') THEN
vkey = E'\nCONSTRAINT ' || rkey.name || ' PRIMARY KEY (';
END IF;
vkey = vkey || rkey.col || ',';
END LOOP;
IF (vkey <> '') THEN
vsql = vsql || substr(vkey,1,length(vkey)-1) || ') ';
END IF;
vsql = substr(vsql,1,length(vsql)-1) || ') WITHOUT OIDS';
--index
vidx = '';
cidx = '';
FOR ridx IN SELECT s.indexrelname as nome, a.attname as col
FROM pg_index i LEFT JOIN pg_class c ON c.oid = i.indrelid
LEFT JOIN pg_attribute a ON a.attrelid = c.oid AND a.attnum = ANY(i.indkey)
LEFT JOIN pg_stat_user_indexes s USING (indexrelid)
WHERE c.relname = ptable AND i.indisunique != 't' AND i.indisprimary != 't'
ORDER BY s.indexrelname
LOOP
IF (ridx.nome <> cidx) THEN
IF (vidx <> '') THEN
vidx = substr(vidx,1,length(vidx)-1) || ');';
END IF;
cidx = ridx.nome;
vidx = vidx || E'\nCREATE INDEX ' || cidx || ' ON ' || ptable || ' (';
END IF;
vidx = vidx || ridx.col || ',';
END LOOP;
IF (vidx <> '') THEN
vidx = substr(vidx,1,length(vidx)-1) || ')';
END IF;
--trigger
vtgr = '';
ctgr = '';
etgr = '';
FOR rtgr IN SELECT trigger_name as nome, event_manipulation as eve,
action_statement as act, condition_timing as cond
FROM information_schema.triggers
WHERE event_object_table = ptable
LOOP
IF (rtgr.nome <> ctgr) THEN
IF (vtgr <> '') THEN
vtgr = replace(vtgr, '_#eve_', substr(etgr,1,length(etgr)-3));
END IF;
etgr = '';
ctgr = rtgr.nome;
vtgr = vtgr || 'CREATE TRIGGER ' || ctgr || ' ' || rtgr.cond || ' _#eve_ '
|| 'ON ' || ptable || ' FOR EACH ROW ' || rtgr.act || ';';
END IF;
etgr = etgr || rtgr.eve || ' OR ';
END LOOP;
IF (vtgr <> '') THEN
vtgr = replace(vtgr, '_#eve_', substr(etgr,1,length(etgr)-3));
END IF;
--drop the old table and create the new one
EXECUTE 'DROP TABLE ' || ptable;
IF (EXISTS (SELECT sequence_name FROM information_schema.sequences
WHERE sequence_name = ptable||'_id_seq'))
THEN
EXECUTE 'DROP SEQUENCE '||ptable||'_id_seq';
END IF;
EXECUTE vsql;
--copy the data into the new table
EXECUTE 'INSERT INTO ' || ptable || '(' || vcols || ')' ||
E'\nSELECT ' || vcols || ' FROM zzz_' || ptable;
EXECUTE vseq;
EXECUTE vidx;
EXECUTE vtgr;
EXECUTE 'DROP TABLE zzz_' || ptable;
END;
$BODY$ LANGUAGE plpgsql VOLATILE COST 100;
@Jeremy Gustie's solution above almost works, but will do the wrong thing if the ordinals are off (or fail altogether if the reordered ordinals line up incompatible types). Give it a try:
CREATE TABLE test1 (one varchar, two varchar, three varchar);
CREATE TABLE test2 (three varchar, two varchar, one varchar);
INSERT INTO test1 (one, two, three) VALUES ('one', 'two', 'three');
INSERT INTO test2 SELECT * FROM test1;
SELECT * FROM test2;
The results show the problem:
testdb=> select * from test2;
three | two | one
-------+-----+-------
one | two | three
(1 row)
You can remedy this by specifying the column names in the insert:
INSERT INTO test2 (one, two, three) SELECT * FROM test1;
That gives you what you really want:
testdb=> select * from test2;
three | two | one
-------+-----+-----
three | two | one
(1 row)
The problem comes when you have legacy code that doesn't do this, as I indicated above in my comment on peufeu's reply.
Update: It occurred to me that you can achieve the same thing as the column names in the INSERT clause by specifying the column names in the SELECT clause instead. You just have to reorder them to match the ordinals in the target table:
INSERT INTO test2 SELECT three, two, one FROM test1;
And you can of course do both to be very explicit:
INSERT INTO test2 (one, two, three) SELECT one, two, three FROM test1;
That gives you the same results as above, with the column values properly matched.
The order of the columns is totally irrelevant in relational databases
Yes.
For instance, if you use Python, you would do:
cursor.execute("SELECT id, name FROM users")
for id, name in cursor:
    print(id, name)
Or, with a dict-style cursor (e.g. psycopg2.extras.RealDictCursor), you would do:
cursor.execute("SELECT * FROM users")
for row in cursor:
    print(row['id'], row['name'])
But no sane person would ever use positional results like this:
cursor.execute("SELECT * FROM users")
for id, name in cursor:
    print(id, name)
Well, it's a visual goody for DBAs and could be implemented in the engine with minor performance loss: add a column-order table to pg_catalog (or wherever it's suited best), keep it in memory, and use it for certain queries. Why overthink such a small piece of eye candy?
@Milen A. Radev
The supposedly irrelevant need for a set order of columns is not always defined by the query that pulls them. The values returned by pg_fetch_row do not include the associated column names, and therefore require the columns to be defined by the SQL statement.
A simple SELECT * FROM would then require prior knowledge of the table structure, and would sometimes cause issues if the order of the columns were to change.
Using pg_fetch_assoc is a more reliable method as you can reference the column names, and therefore use a simple select * from.
Related
I wanted to make a table that sanity-checks record integrity, looking for any duplications across my db.
I have a table currently with object names (tables) and their primary keys:
I want to create a procedure that loops through those objects with their keys, and inserts into a separate table the count of duplicates:
Below is my code, but I've never done anything like this before and am new to Postgres. What I have is from hours of googling/researching, but every time I get closer I hit a new error, and am quite stuck :( Any insights would be greatly appreciated.
My newest error is, I believe, from my quote_ident(object_names). I don't want to query the column as Postgres is reading it; I want that to be a raw string.
Code:
do $$
declare
    object_names varchar;
    keys varchar;
    rec record;
begin
    for rec in select object_name, key from mfr_incentives.public.t_jh_dup_check
    loop
        object_names = rec.object_name;
        keys = rec.key;
        execute 'insert into mfr_incentives.public.t_jh_dup_check_final_output
                 select * from
                 (select ' || quote_ident(object_names) || ', ' || quote_ident(keys) || ', count(*), current_date from
                     (select ' || keys || ', count(*)
                      from ' || object_names ||
                     ' group by ' || keys || ' having count(*) > 1
                     ) a
                 ) a';
    end loop;
end;
$$;
Found out my problem!
Being unfamiliar with the topic I finally found that I wanted quote_literal() instead of quote_ident().
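For anyone else tripping over the same distinction, a quick illustration (quote_ident quotes identifiers, quote_literal quotes string values):
SELECT quote_ident('my_table');    -- my_table   (double-quoted only when necessary)
SELECT quote_ident('My Table');    -- "My Table"
SELECT quote_literal('my_table');  -- 'my_table' (always a single-quoted string literal)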
The below works:
create or replace procedure public.proc_jh_dup_check()
language plpgsql
--IT WORKS NOW
as $$
declare
    rec record;
begin
    for rec in select object_name, key from mfr_incentives.public.t_jh_dup_check
    loop
        execute 'insert into mfr_incentives.public.t_jh_dup_check_final_output
                 select * from
                 (select ' || quote_literal(rec.object_name) || ', ' || quote_literal(rec.key) || ', count(*), current_date from
                     (select ' || rec.key || ', count(*)
                      from ' || rec.object_name ||
                     ' group by ' || rec.key || ' having count(*) > 1
                     ) a
                 ) a';
    end loop;
end;
$$;
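Assuming the procedure above is in place, it can then be invoked with (CALL requires PostgreSQL 11 or later):
CALL public.proc_jh_dup_check();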
I am writing a plpgsql function that should update a table based on a provided JSON object. The JSON contains a representation of the table, with all the same columns as the table itself.
The function currently looks as follows:
CREATE OR REPLACE FUNCTION update (updated json)
BEGIN
/* transform json to table */
WITH updated_vals AS (
SELECT
*
FROM
json_populate_recordset(NULL::my_table, updated)
),
/* Retrieve all columns from my_table, also with a reference to the updated_vals table */
cols AS (
SELECT
string_agg(quote_ident(columns), ',') AS table_cols,
string_agg('updated_vals.' || quote_ident($1), ',') AS updated_cols
FROM
information_schema
WHERE
table_name = 'my_table' -- table name, case sensitive
AND table_schema = 'public' -- schema name, case sensitive
AND column_name <> 'id' -- all columns except id and user_id
AND column_name <> 'user_id'
),
/* Define the table columns separately */
table_cols AS (
SELECT
table_cols
FROM
cols
),
/* Define the updated columns separately */
updated_cols AS (
SELECT
updated_cols
FROM
cols)
/* Execute the update statement */
EXECUTE 'UPDATE my_table'
|| ' SET (' || table_cols::text || ') = (' || updated_cols::text || ') '
|| ' FROM updated_vals '
|| ' WHERE my_table.id = updated_vals.id '
|| ' AND my_table.user_id = updated_vals.user_id';
COMMIT;
END;
I noticed that combining the WITH clause with EXECUTE always triggers the error "syntax error at or near EXECUTE", even if both are very simple and straightforward. Is this indeed the case, and if so, what would be an alternative approach to provide the required variables (updated_vals, table_cols and updated_cols) to EXECUTE?
If you have any other improvements to this code I'd be happy to see them, for I am very new to sql/plpgsql.
Since you wrote the table name (my_table) into your function, it will always update only that one specified table from the JSON data. Because of this, you can write the table and column names in your function manually instead of querying information_schema. This is the simple and easy way. (As for the error itself: EXECUTE is a PL/pgSQL statement, not SQL, so it cannot serve as the final query of a SQL WITH clause; any CTEs have to go inside the executed string.)
For example:
CREATE OR REPLACE FUNCTION rbac.update_users_json(updated json)
RETURNS boolean
LANGUAGE plpgsql
AS $function$
begin
update rbac.users usr
set
username = jsn.username,
first_name = jsn.first_name,
last_name = jsn.last_name
from (
select * from json_populate_recordset(NULL::rbac.users, updated)
) jsn
where jsn.id = usr.id;
return true;
END;
$function$
;
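A hypothetical invocation, assuming rbac.users has id, username, first_name and last_name columns:
SELECT rbac.update_users_json(
    '[{"id": 1, "username": "jdoe", "first_name": "John", "last_name": "Doe"}]'
);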
For dynamic tables:
CREATE OR REPLACE FUNCTION rbac.update_users_json_dynamic(updated json)
RETURNS boolean
LANGUAGE plpgsql
AS $function$
declare
f record;
exec_sql text;
sep text;
begin
exec_sql = 'update rbac.users usr set ' || E'\n';
sep = '';
for f in
select clm.column_name
from
information_schema."tables" tbl
inner join
information_schema."columns" clm on
clm.table_name = tbl.table_name
and clm.table_schema = tbl.table_schema
where
tbl.table_schema = 'rbac' -- must match the schema of the table being updated
and tbl.table_name = 'users'
and clm.column_name <> 'id'
loop
exec_sql = exec_sql || sep || f.column_name || ' = ' || 'jsn.' || f.column_name;
sep = ', ' || E'\n';
end loop;
exec_sql = exec_sql || E'\n' || 'from (select * from json_populate_recordset(NULL::rbac.users, ''' ||
updated::text || ''')) jsn ' || E'\n' || 'where jsn.id = usr.id';
execute exec_sql;
return true;
END;
$function$
;
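One caveat with the dynamic version: splicing updated::text into the string will break (or worse, allow injection) if the JSON contains single quotes. A safer sketch passes the value as a parameter with EXECUTE ... USING; the last statements of the function body would become something like:
exec_sql = exec_sql || E'\n' ||
    'from (select * from json_populate_recordset(NULL::rbac.users, $1)) jsn ' ||
    E'\n' || 'where jsn.id = usr.id';
execute exec_sql using updated;  -- $1 is bound to the json argument, no manual quoting needed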
I have a read-only Oracle db that exposes views for me to consume, and I want a local db where I can insert the above data. To do that I need to generate CREATE TABLE DDL based on the views, and I've found no way to do this.
To clarify, I've used:
SELECT dbms_metadata.get_ddl('VIEW','table','schema') FROM dual
The result of that statement is
CREATE OR REPLACE VIEW "SCHEMA"."VIEW_NAME" ("ID","NAME") AS
SELECT * FROM SQUARE S
JOIN SHAPE SH ON (S.ID = SH.ID)
where what I want to generate is
CREATE TABLE table_name (
ID NUMBER,
NAME VARCHAR2(100),
Also I can't just run
CREATE TABLE new_table
AS (SELECT * FROM view WHERE 1=2);
as I can't create tables in the db I can only read from.
Is there any tool that allows running this with 2 db connections? Would that work?
EDIT: For those who can't figure out how to do the database link, here is a garbage throwaway script that worked for me:
DECLARE
starting boolean := TRUE;
r_owner varchar2(30) := '$1';
r_table_name varchar2(30) := '';
BEGIN
FOR v IN ( --views from a owner
SELECT
VIEW_NAME
FROM
all_views
WHERE owner = r_owner)
LOOP
r_table_name:= v.view_name;
dbms_output.put_line('create table ' || r_owner || '.' || r_table_name || '(');
starting := TRUE;
FOR r IN ( -- columns from table
SELECT
column_name,
data_type,
data_length,
data_precision,
nullable
FROM
all_tab_columns
WHERE
table_name = upper(r_table_name)
AND owner = upper(r_owner)
ORDER BY column_id)
LOOP
IF starting THEN
starting := FALSE;
ELSE
dbms_output.put_line(',');
END IF;
IF r.data_type = 'NUMBER' THEN
IF r.data_length = 22 THEN
dbms_output.put(' '|| r.column_name || ' NUMBER');
ELSE
dbms_output.put(' '|| r.column_name || ' NUMBER(' || r.data_length || ')');
END IF;
ELSIF r.data_type = 'FLOAT' THEN
dbms_output.put(' '|| r.column_name || ' FLOAT(' || r.data_precision || ')');
ELSIF instr(r.data_type, 'CHAR') >0 then
dbms_output.put(' '|| r.column_name||' '||r.data_type||'('||r.data_length||')');
ELSE
dbms_output.put(' '|| r.column_name || ' ' || r.data_type);
END IF;
IF r.nullable = 'N' THEN
dbms_output.put(' NOT NULL');
END IF;
END LOOP;
dbms_output.put_line('');
dbms_output.put_line(' ); ');
dbms_output.put_line('');
END LOOP;
END;
In your local db, create a link to the 'read only' db:
CREATE DATABASE LINK READONLY_DB
CONNECT TO scott IDENTIFIED BY tiger
USING 'readonlydb';
Note, the USING 'readonlydb' is referencing a tnsnames.ora entry for your readonly database. Substitute 'readonlydb' with the appropriate, correct value.
Then, with the db link created:
create table my_table as select * from readonly_table@readonly_db;
Where:
'readonly_table' represents the name of the table at the readonly database
'readonly_db' is the name of the database link you created in the first step.
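And if you only want the table structure without copying the data (for instance to load it separately later), the usual trick is a false predicate:
create table my_table as select * from readonly_table@readonly_db where 1 = 0;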
I am trying to write a query that will report all non-UTF-8 encoded characters in a table, without being specific to a column name. I am doing so by comparing the length of a column to its byte length. %1 is the table name I want to check, entered as a parameter. I am joining to user_tab_columns to get the COLUMN_NAME. I then want to take the COLUMN_NAME results and filter down to only the rows that have bad UTF-8 data (where the length of a column is not equal to its byte length). Below is what I have come up with, but it's not functioning. Can somebody help me tweak this query to get the desired results?
SELECT
user_tab_columns.TABLE_NAME,
user_tab_columns.COLUMN_NAME AS ColumnName,
a.*
FROM %1 a
JOIN user_tab_columns
ON UPPER(user_tab_columns.TABLE_NAME) = UPPER('%1')
WHERE (SELECT * FROM %1 WHERE LENGTH(a.ColumnName) != LENGTHB(a.ColumnName))
In your query LENGTH(a.ColumnName) would represent the length of the column name, not the contents of that column. You can't use a value from one table as the column name in another table in static SQL.
Here's a simple demonstration of using dynamic SQL in an anonymous block to report which columns contain any multibyte characters, which is what comparing length with lengthb will tell you (discussed in the comments, so not rehashing that here):
set serveroutput on size unlimited
declare
sql_str varchar2(256);
flag pls_integer;
begin
for rec in (
select utc.table_name, utc.column_name
from user_tab_columns utc
where utc.table_name = <your table name or argument>
and utc.data_type in ('VARCHAR2', 'NVARCHAR2', 'CLOB', 'NCLOB')
order by utc.column_id
) loop
sql_str := 'select nvl(max(1), 0) from "' || rec.table_name || '" '
|| 'where length("' || rec.column_name || '") '
|| '!= lengthb("' || rec.column_name || '") and rownum = 1';
-- just for debugging, to see the generated query
dbms_output.put_line(sql_str);
execute immediate sql_str into flag;
-- also for debugging
dbms_output.put_line (rec.table_name || '.' || rec.column_name
|| ' flag: ' || flag);
if flag = 1 then
dbms_output.put_line(rec.table_name || '.' || rec.column_name
|| ' contains multibyte characters');
end if;
end loop;
end;
/
This uses a cursor loop to get the column names - I've included the table name too in case you want to wild-card or remove the filter - and inside that loop constructs a dynamic SQL statement, executes it into a variable, and then checks that variable. I've left some debugging output in to see what's happening. With a dummy table created as:
create table t42 (x varchar2(20), y varchar2(20));
insert into t42 values ('single byte test', 'single byte');
insert into t42 values ('single byte test', 'single byte');
insert into t42 values ('single byte test', 'single byte');
insert into t42 values ('single byte test', 'single byte');
insert into t42 values ('single byte test', 'multibyte ' || unistr('\00FF'));
running that block gets the output:
anonymous block completed
select nvl(max(1), 0) from "T42" where length("X") != lengthb("X") and rownum = 1
T42.X flag: 0
select nvl(max(1), 0) from "T42" where length("Y") != lengthb("Y") and rownum = 1
T42.Y flag: 1
T42.Y contains multibyte characters
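As an aside (assuming an AL32UTF8 database character set), length counts characters while lengthb counts bytes, so you can inspect the flagged row from the t42 example directly:
select y, length(y) as char_count, lengthb(y) as byte_count
from t42
where length(y) != lengthb(y);

-- Y            CHAR_COUNT  BYTE_COUNT
-- -----------  ----------  ----------
-- multibyte ÿ          11          12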
To display the actual multibyte-containing values you could use a dynamic loop over the selected values:
set serveroutput on size unlimited
declare
sql_str varchar2(256);
curs sys_refcursor;
val_str varchar(4000);
begin
for rec in (
select utc.table_name, utc.column_name
from user_tab_columns utc
where utc.table_name = 'T42'
and utc.data_type in ('VARCHAR2', 'NVARCHAR2', 'CLOB', 'NCLOB')
order by utc.column_id
) loop
sql_str := 'select "' || rec.column_name || '" '
|| 'from "' || rec.table_name || '" '
|| 'where length("' || rec.column_name || '") '
|| '!= lengthb("' || rec.column_name || '")';
-- just for debugging, to see the generated query
dbms_output.put_line(sql_str);
open curs for sql_str;
loop
fetch curs into val_str;
exit when curs%notfound;
dbms_output.put_line (rec.table_name || '.' || rec.column_name
|| ': ' || val_str);
end loop;
end loop;
end;
/
Which with the same table gets:
anonymous block completed
select "X" from "T42" where length("X") != lengthb("X")
select "Y" from "T42" where length("Y") != lengthb("Y")
T42.Y: multibyte ÿ
That's a starting point, anyway; it would need some tweaking if you have CLOB, NVARCHAR2 or NCLOB values - for example, as sketched below, you could have one local variable of each type, include the data type in the outer cursor query, and fetch into the appropriate local variable.
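A hedged sketch of that tweak, fetching CLOB/NCLOB columns into a separate LOB variable (note that the length/lengthb predicate itself may need a different test for LOB columns, so treat this as a starting point only):
set serveroutput on size unlimited
declare
    sql_str varchar2(256);
    curs sys_refcursor;
    val_str varchar2(4000);
    val_clob clob;
begin
    for rec in (
        select utc.table_name, utc.column_name, utc.data_type
        from user_tab_columns utc
        where utc.table_name = 'T42'
        and utc.data_type in ('VARCHAR2', 'NVARCHAR2', 'CLOB', 'NCLOB')
        order by utc.column_id
    ) loop
        sql_str := 'select "' || rec.column_name || '" from "' || rec.table_name || '" '
            || 'where length("' || rec.column_name || '") '
            || '!= lengthb("' || rec.column_name || '")';
        open curs for sql_str;
        loop
            if rec.data_type in ('CLOB', 'NCLOB') then
                -- fetch LOBs into a LOB locator and print only a prefix
                fetch curs into val_clob;
                exit when curs%notfound;
                dbms_output.put_line(rec.table_name || '.' || rec.column_name
                    || ': ' || dbms_lob.substr(val_clob, 200, 1));
            else
                fetch curs into val_str;
                exit when curs%notfound;
                dbms_output.put_line(rec.table_name || '.' || rec.column_name
                    || ': ' || val_str);
            end if;
        end loop;
        close curs;
    end loop;
end;
/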
I need to update a column matching a specific pattern in all tables in an oracle database.
For example, in all tables I have this column *_CID, which is a foreign key to a master table which has the primary key CID.
Thanks
You can use the naming convention and query all_tab_columns:
declare
    cursor c is
        -- all_tab_columns exposes OWNER (not TABLE_OWNER); escape the underscore
        -- so it matches literally instead of acting as a single-character wildcard
        select owner, column_name, table_name
        from all_tab_columns
        where column_name like '%\_CID' escape '\';
begin
    for x in c loop
        execute immediate 'update ' || x.owner || '.' || x.table_name || ' set ' || x.column_name || ' = 0';
    end loop;
end;
If you have valid FKs, you can also use all_tab_constraints to fetch the enabled FKs for your main table, and then fetch the column names from the r_constraint_name.
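A sketch of that constraint-based variant, assuming the master table is named MASTER (everything else comes straight from the data dictionary):
select cc.owner, cc.table_name, cc.column_name
from all_constraints fk
join all_cons_columns cc
    on cc.owner = fk.owner
    and cc.constraint_name = fk.constraint_name
join all_constraints pk
    on pk.owner = fk.r_owner
    and pk.constraint_name = fk.r_constraint_name
where fk.constraint_type = 'R'     -- referential (foreign key) constraints
    and fk.status = 'ENABLED'
    and pk.table_name = 'MASTER';  -- the table owning the CID primary key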
I found a solution to my question:
BEGIN
    FOR x IN (SELECT owner, table_name, column_name
              FROM all_tab_columns
              -- without this filter, every column in every table would be updated
              WHERE column_name LIKE '%\_CID' ESCAPE '\') LOOP
        EXECUTE IMMEDIATE 'update ' || x.owner || '.' || x.table_name || ' set ' || x.column_name || ' = 0 where ' || x.column_name || ' = 1';
    END LOOP;
END;
thanks