I know that PostgreSQL tables that use a SERIAL primary key end up with an implicit index, sequence and constraint being created by PostgreSQL. The question is how to rename these implicit objects when the table is renamed. Below is my attempt at figuring this out with specific questions at the end.
Given a table such as:
CREATE TABLE foo (
pkey SERIAL PRIMARY KEY,
value INTEGER
);
Postgres outputs:
NOTICE: CREATE TABLE will create implicit sequence "foo_pkey_seq" for serial column "foo.pkey"
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "foo_pkey" for table "foo"
Query returned successfully with no result in 52 ms.
pgAdmin III SQL pane shows the following DDL script for the table (decluttered):
CREATE TABLE foo (
pkey serial NOT NULL,
value integer,
CONSTRAINT foo_pkey PRIMARY KEY (pkey )
);
ALTER TABLE foo OWNER TO postgres;
Now rename the table:
ALTER TABLE foo RENAME TO bar;
Query returned successfully with no result in 17 ms.
pgAdmin III:
CREATE TABLE bar (
pkey integer NOT NULL DEFAULT nextval('foo_pkey_seq'::regclass),
value integer,
CONSTRAINT foo_pkey PRIMARY KEY (pkey )
);
ALTER TABLE bar OWNER TO postgres;
Note the extra DEFAULT nextval('foo_pkey_seq'::regclass). This means that renaming the table does not rename the primary key's sequence, and now we have this explicit nextval().
Now rename the sequence:
I want to keep the database naming consistent so I tried:
ALTER SEQUENCE foo_pkey_seq RENAME TO bar_pkey_seq;
Query returned successfully with no result in 17 ms.
pgAdmin III:
CREATE TABLE bar (
pkey serial NOT NULL,
value integer,
CONSTRAINT foo_pkey PRIMARY KEY (pkey )
);
ALTER TABLE bar OWNER TO postgres;
The DEFAULT nextval('foo_pkey_seq'::regclass) is gone.
QUESTIONS
Why did the DEFAULT nextval('foo_pkey_seq'::regclass) statement appear and disappear?
Is there a way to rename the table and have the primary key sequence renamed at the same time?
Is it safe to rename the table and then the sequence while clients are connected to the database? Are there any concurrency issues?
How does Postgres know which sequence to use? Is there a database trigger being used internally? Is there anything else to rename other than the table and the sequence?
What about the implicit index created by a primary key? Should that be renamed? If so, how can that be done?
What about the constraint name above? It is still foo_pkey. How is a constraint renamed?
serial is not an actual data type. The manual states:
The data types smallserial, serial and bigserial are not true types,
but merely a notational convenience for creating unique identifier columns
The pseudo data type is resolved by doing all of this (a SQL sketch of the equivalent follows the list):
create a sequence named tablename_colname_seq
create the column with type integer (or int2 / int8 respectively for smallserial / bigserial)
make the column NOT NULL DEFAULT nextval('tablename_colname_seq')
make the column own the sequence, so that the sequence is dropped automatically when the column is dropped
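For the example table foo, the hand-written equivalent would look roughly like this (just a sketch; the PRIMARY KEY constraint comes from the separate PRIMARY KEY clause, not from serial):
CREATE SEQUENCE foo_pkey_seq;
CREATE TABLE foo (
pkey integer NOT NULL DEFAULT nextval('foo_pkey_seq'),
value integer
);
ALTER SEQUENCE foo_pkey_seq OWNED BY foo.pkey;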
The system does not know whether you did all this by hand or by way of the pseudo data type serial. pgAdmin checks for the listed features, and if all are met, the reverse-engineered DDL script is simplified to the matching serial type. If one of the features is not met, this simplification does not take place. That is something pgAdmin does; for the underlying catalog tables it is all the same. There is no serial type as such.
There is no way to automatically rename owned sequences. You can run:
ALTER SEQUENCE ... RENAME TO ...
like you did. The system itself doesn't care about the name. The column DEFAULT stores an OID ('foo_pkey_seq'::regclass); you can change the name of the sequence without breaking that, because the OID stays the same. The same goes for foreign keys and similar references inside the database.
The implicit index for the primary key is bound to the name of the PK constraint, which will not change if you change the name of the table. In Postgres 9.2 or later you can use
ALTER TABLE ... RENAME CONSTRAINT ... TO ...
to rectify that, too.
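Applied to the example above, that would be something like this (Postgres 9.2 or later):
ALTER TABLE bar RENAME CONSTRAINT foo_pkey TO bar_pkey;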
There can also be indexes named in reference to the table name. Similar procedure:
ALTER INDEX ... RENAME TO ...
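For instance, if there were a hypothetical index foo_value_idx on the value column, it could be renamed to match:
ALTER INDEX foo_value_idx RENAME TO bar_value_idx;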
You can have all kinds of informal references to the table name. The system cannot forcibly rename objects that can be named anything you like. And it doesn't care.
Of course you don't want to invalidate SQL code that references those names. Obviously, you don't want to change names while application logic references them. Normally this wouldn't be a problem for names of indexes, sequences or constraints, since those are not normally referenced by name.
Postgres also acquires a lock on objects before renaming them. So if there are concurrent transactions open that hold any kind of lock on the objects in question, your RENAME operation is stalled until those transactions commit or roll back.
System catalogs and OIDs
The database schema is stored in tables of the system catalog in the system schema pg_catalog. All details are in the manual. If you don't know exactly what you are doing, you shouldn't be messing with those tables at all. One false move and you can break your database. Use the DDL commands Postgres provides.
For some of the most important tables, Postgres provides object identifier types and type casts to quickly get the name for an OID and vice versa. Like:
SELECT 'foo_pkey_seq'::regclass
If the schema name is in the search_path and the table name is unique, that gives you the same as:
SELECT oid FROM pg_class WHERE relname = 'foo_pkey_seq';
The primary key of most catalog tables is oid and internally, most references use OIDs.
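For illustration, the cast works in both directions (this assumes the example sequence exists):
SELECT c.oid, c.oid::regclass AS seq_name
FROM pg_class c
WHERE c.relname = 'foo_pkey_seq';
-- c.oid::regclass turns the numeric OID back into the object name (schema-qualified if it is not in the search_path)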
Related
I am trying to insert into a table in a Postgres database from two other Postgres databases using Foreign Data Wrappers. The objective is to have an auto-generated primary key, independent of the source, as there will eventually be more than two sources.
I first defined the tables like so:
Target database:
create table dummy (
dummy_pk bigserial primary key
-- other fields
);
Sources databases:
create foreign table dummy (
dummy_pk bigserial
-- other fields
) server ... ;
This solution worked fine as long as I inserted from only one source. When I tried to insert from the other one, without specifying dummy_pk, I got this message:
Duplicate key (dummy_pk)=(1)
Because Postgres tries to insert an id of 1, I believe each source's foreign table uses its own sequence. I changed the source tables a bit in an attempt to let the target table's sequence do the job for the id:
create foreign table dummy (
dummy_pk bigint
-- other fields
) server ... ;
This time I got a different error:
NULL value violates NOT NULL constraint on column « dummy_pk »
Therefore I believe the source server sends a query to the target where the dummy_pk is null, and the target does not replace it with the default value.
So, is there a way I can force the use of the target's sequence in a query executed on the source? Maybe I have to share that sequence, can I create a foreign sequence? I cannot remove the column on the foreign tables as I need a read access to them.
Thanks!
Remove dummy_pk from the foreign tables so that the destination table receives neither NULL nor a value and therefore falls back to its DEFAULT (or NULL if no DEFAULT is specified). If you attempt to pass DEFAULT to a foreign table, it will try to use the DEFAULT value of the foreign table instead.
create foreign table dummy (
/*dummy_pk bigserial,*/
column1 text,
column2 int2
-- other fields
) server ... ;
Another way would be to grab sequence values from the destination server using dblink, but I think the approach above is better (if you can afford to have this column removed from the foreign tables).
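For completeness, a dblink-based version might look something like this sketch; the connection string and the sequence name dummy_dummy_pk_seq (the default name for a bigserial column dummy_pk in table dummy) are assumptions:
-- on the source database, with the dblink extension installed
SELECT next_id
FROM dblink('dbname=targetdb host=target-host',
            $$SELECT nextval('dummy_dummy_pk_seq')$$)
AS t(next_id bigint);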
I am trying to create my very first table in postgres, but when I execute this SQL:
create table public.automated_group_msg (
automated_group_msg_idx integer NOT NULL DEFAULT nextval ('automated_group_msg_idx'::regclass),
group_idx integer NOT NULL,
template_idx integer NOT NULL,
CONSTRAINT automated_group_msg_pkey PRIMARY KEY (automated_group_msg_idx),
CONSTRAINT automated_group_msg_group_idx_fkey FOREIGN KEY (group_idx)
REFERENCES public.groups (group_idx) MATCH SIMPLE
ON UPDATE CASCADE ON DELETE CASCADE,
CONSTRAINT automated_msg_template_idx_fkey FOREIGN KEY (template_idx)
REFERENCES public.template (template_idx) MATCH SIMPLE
ON UPDATE CASCADE ON DELETE CASCADE
)
WITH (
OIDS = FALSE
);
I get the following error:
ERROR: relation "automated_group_msg_idx" does not exist
Your error is (likely) because the sequence you're trying to use doesn't exist yet.
But you can create a sequence on the fly using this syntax:
create table public.automated_group_msg (
id serial primary key,
... -- other columns
)
Not directly related to your question, but naming columns with the table name in the column name is, generally speaking, an anti-pattern, especially for primary keys, for which id is the industry standard. It also allows for app-code refactoring using abstract classes whose id column is always id. It's crystal clear what automated_group_msg.id means, and also crystal clear that automated_group_msg.automated_group_msg_id is a train wreck that contains redundant information. Attribute column names like customer.birth_date should likewise not be over-decorated as customer.customer_birth_date, for the same reasons.
You just need to create the sequence before creating the table:
CREATE SEQUENCE automated_group_msg_idx;
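If you go this route, you may also want to tie the sequence to the column so it is dropped together with it (optional; the names follow the question):
ALTER SEQUENCE automated_group_msg_idx OWNED BY public.automated_group_msg.automated_group_msg_idx;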
When I try to change the data type of a column in a table with an ALTER command...
alter table temp alter column id type bigserial;
I get
ERROR: type "bigserial" does not exist
How can I change the datatype from bigint to bigserial?
As explained in the documentation, SERIAL is not a datatype, but a shortcut for a collection of other commands.
So while you can't change it simply by altering the type, you can achieve the same effect by running these other commands yourself:
CREATE SEQUENCE temp_id_seq;
ALTER TABLE temp ALTER COLUMN id SET NOT NULL;
ALTER TABLE temp ALTER COLUMN id SET DEFAULT nextval('temp_id_seq');
ALTER SEQUENCE temp_id_seq OWNED BY temp.id;
Making the column own the sequence (the OWNED BY step) ensures that the sequence is removed if the table or column is dropped. It also gives you the expected behaviour from the pg_get_serial_sequence() function.
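For instance, once the sequence is owned by the column, it can be looked up like this (assuming the table lives in the public schema):
SELECT pg_get_serial_sequence('temp', 'id'); -- e.g. public.temp_id_seq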
Sticking to the tablename_columnname_seq naming convention is necessary to convince some tools like pgAdmin to report this column type as BIGSERIAL. Note that psql and pg_dump will always show the underlying definition, even if the column was initially declared as a SERIAL type.
As of Postgres 10, you also have the option of using an SQL standard identity column, which handles all of this invisibly, and which you can easily add to an existing table:
ALTER TABLE temp ALTER COLUMN id
ADD GENERATED BY DEFAULT AS IDENTITY;
ALTERing a column from BIGINT to BIGSERIAL in order to make it auto-increment won't work. BIGSERIAL is not a true type; it is shorthand that automates the creation of a sequence and wires it up as the column default.
Instead you can create a sequence yourself, then assign it as the default for a column:
CREATE SEQUENCE "YOURSCHEMA"."SEQNAME";
ALTER TABLE "YOURSCHEMA"."TABLENAME"
ALTER COLUMN "COLUMNNAME" SET DEFAULT nextval('"YOURSCHEMA"."SEQNAME"'::regclass);
ALTER TABLE "YOURSCHEMA"."TABLENAME" ADD CONSTRAINT pk PRIMARY KEY ("COLUMNNAME");
This is a simple workaround (note that it discards the existing values in the column):
ALTER TABLE table_name drop column column_name, add column column_name bigserial;
Sounds like a lot of professionals have weighed in on this subject... if the original table did indeed have data, then the real answer to this dilemma is to have designed the DB correctly in the first place. That being the case, changing the column's rule (type) would require integrity verification of that column for the new paradigm. And don't forget: anywhere that column is manipulated (added/updated) would also need to be looked into.
If it's a new table, then okay, simples: delete the column and re-add the new column (that takes care of the sequence for you). Again: design, design, design.
I think we've all slipped up on this at some point.
I create the following table in H2:
CREATE TABLE TEST
(ID BIGINT NOT NULL PRIMARY KEY)
Then I look into INFORMATION_SCHEMA.TABLES table:
SELECT SQL
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME = 'TEST'
Result:
CREATE CACHED TABLE TEST(
ID BIGINT NOT NULL
)
Then I look into INFORMATION_SCHEMA.CONSTRAINTS table:
SELECT SQL
FROM INFORMATION_SCHEMA.CONSTRAINTS
WHERE TABLE_NAME = 'TEST'
Result:
ALTER TABLE TEST
ADD CONSTRAINT CONSTRAINT_4C
PRIMARY KEY(ID)
INDEX PRIMARY_KEY_4C
These are not the statements I executed, so the question is:
Does the information in TABLES and CONSTRAINTS reflect the actual SQL that was executed in the database?
In the original CREATE TABLE statement there was no CACHED keyword (not a problem).
I never executed an ALTER TABLE ... ADD CONSTRAINT statement.
The actual reason I am asking is that I am not sure which statement I should execute in order to guarantee that the primary key is used in a clustered index.
If you look at my previous question H2 database: clustered index support then you may find in the answer of Thomas Mueller the following statement:
If a primary key is created after the table has been created then the primary key is stored in a new index b-tree.
Therefore, if the statements are executed exactly as they are shown in INFORMATION_SCHEMA, then the primary key is created after the table is created, and hence ID is not used as a clustered index (basically, as the key of the data b-tree).
Is there a way to guarantee that the primary key is used as a clustered index in H2?
Does the information in TABLES and CONSTRAINTS reflect the actual SQL that was executed in the database?
Yes. Basically, those are the statements that are run when opening the database.
If you look at my previous question
The answer "If a primary key is created after the table has been created..." was incorrect, I fixed it now to "If a primary key is created after data has been inserted...".
Is there a way to guarantee that the primary key is used as a clustered index in H2?
This is now better described in the H2 documentation at "How Data is Stored Internally": "If a single column primary key of type BIGINT, INT, SMALLINT, TINYINT is specified when creating the table (or just after creating the table, but before inserting any rows), then this column is used as the key of the data b-tree."
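In other words, either declare the primary key in CREATE TABLE, or add it right after creating the table but before inserting any rows; a minimal sketch:
CREATE TABLE TEST (ID BIGINT NOT NULL);
-- adding the PK before any rows are inserted still lets H2 use ID as the key of the data b-tree
ALTER TABLE TEST ADD CONSTRAINT TEST_PK PRIMARY KEY (ID);
INSERT INTO TEST (ID) VALUES (1);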
I have an Oracle repository up and running with, say, 10 million records. One of the tables is, say:
CREATE TABLE TABLE_A (
NAME VARCHAR2(128),
VER VARCHAR2(128),
TYPE VARCHAR2(32),
DESCRIPTION VARCHAR2(256),
CONSTRAINT TABLE_A_PK PRIMARY KEY ("NAME","VER")
);
This table has been in use for a long time, and now I have a requirement to change the primary key constraint: add another column, say ID, and make the primary key a combination of NAME, VER, TYPE and LANG.
In the upgrade script I can have something like:
EXECUTE IMMEDIATE 'ALTER TABLE TABLE_A ADD LANG VARCHAR2(32)';
EXECUTE IMMEDIATE 'UPDATE TABLE_A SET LANG = ''en_US''';
EXECUTE IMMEDIATE 'UPDATE TABLE_A SET TYPE = ''n/a'' WHERE TYPE IS NULL';
Before the upgrade, TYPE can hold values or be NULL. Since after the upgrade it is part of the primary key, it cannot be NULL, so NULL values are set to 'n/a'.
But doing the above for 10 million records requires at least 5 hours of upgrade downtime. Is there any other way to make an existing column part of the primary key that won't require as much downtime?
Kindly also tell me if my approach is wrong. Thanks in advance.
First of all, I don't understand why you are using EXECUTE IMMEDIATE.
Then, what about creating the PK as an enabled, novalidated constraint? It will apply to newly inserted rows but not to the old ones. That way you can run a batch to bring the existing data into line with the new PK; a sketch follows below.
Find out more :
http://download.oracle.com/docs/cd/B19306_01/server.102/b14211/data_acc.htm#i6516
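A rough sketch of that approach, under the assumption that the old PK can simply be dropped first (constraint and index names are illustrative; a non-unique index is used so that unchecked old rows don't block it):
ALTER TABLE TABLE_A DROP CONSTRAINT TABLE_A_PK;
CREATE INDEX TABLE_A_PK4_IX ON TABLE_A (NAME, VER, TYPE, LANG);
ALTER TABLE TABLE_A ADD CONSTRAINT TABLE_A_PK4
PRIMARY KEY (NAME, VER, TYPE, LANG)
USING INDEX TABLE_A_PK4_IX
ENABLE NOVALIDATE;
-- after the batch clean-up of the existing rows:
ALTER TABLE TABLE_A MODIFY CONSTRAINT TABLE_A_PK4 VALIDATE;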
For the LANG column, you could also give a default value:
ALTER TABLE TABLE_A ADD LANG VARCHAR2(32) default 'en_US';
then
ALTER TABLE TABLE_A MODIFY LANG VARCHAR2(32) default null;
Nicolas.
The current primary key would have a supporting index, which is probably a unique index on NAME/VER.
Once the columns have been added, you can create a unique index on those four columns. Then replace the primary key constraint, drop the old index (if it isn't dropped automatically when you drop the PK constraint) and have the new constraint use the newly created index.
It won't cut the total time, but it may allow you to break the whole operation into, say, five one-hour steps rather than a single five-hour step.
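A sketch of that step-by-step variant (the index name is illustrative; ONLINE keeps the table available while the index is built):
CREATE UNIQUE INDEX TABLE_A_PK_NEW_IX ON TABLE_A (NAME, VER, TYPE, LANG) ONLINE;
ALTER TABLE TABLE_A DROP CONSTRAINT TABLE_A_PK DROP INDEX;
ALTER TABLE TABLE_A ADD CONSTRAINT TABLE_A_PK
PRIMARY KEY (NAME, VER, TYPE, LANG)
USING INDEX TABLE_A_PK_NEW_IX;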