I have a table I need to purge without disrupting the service. About 99.99% of the data should be deleted, so I'm trying to recreate the table and move the 0.01% of useful data into the new table as follows (I will truncate the old table later):
BEGIN ISOLATION LEVEL SERIALIZABLE;
LOCK TABLE table1 IN ACCESS EXCLUSIVE MODE;
/* I rename the old table */
ALTER TABLE table1 RENAME TO table1_to_be_deleted;
/* And I recreate the table */
CREATE TABLE table1 (
...
);
/* Restore usefull data from old table to new one */
INSERT INTO table1 SELECT * FROM table1_to_be_deleted WHERE toBeKept = 1;
COMMIT;
But when I run my transaction, some clients get errors because rows are not found in the new table even though they are present in the old one. These rows are tagged as to be kept, so they should have been copied from the old table to the new one inside the transaction and found by the client's request...
When other requests are waiting for a lock acquired on a table, do they keep a pointer to the targeted object instead of re-resolving the table name? It's the only explanation I have for the old table still being updated after I commit my transaction...
PS: I'm using Postgres 9.1
To do that I'd rather:
create an auxiliary table
create rules that redirect DML from the original table to the auxiliary one
create a rule (or view) that serves SELECTs from the union of both tables (see the sketch after this list)
move the good data from ONLY the original table into the auxiliary one
truncate the original
either move the data back (so you will not need to rebuild references) or rename
drop the obsolete rules and objects
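A rough sketch of those steps, assuming the question's table1 has columns (id, toBeKept, payload). All names are illustrative, and since a true ON SELECT rule is impractical on a populated table, a union view stands in for the SELECT redirection:
CREATE TABLE table1_aux (LIKE table1 INCLUDING ALL);
-- redirect DML from the original table to the auxiliary one
CREATE RULE table1_ins AS ON INSERT TO table1
    DO INSTEAD INSERT INTO table1_aux (id, toBeKept, payload)
    VALUES (NEW.id, NEW.toBeKept, NEW.payload);
CREATE RULE table1_del AS ON DELETE TO table1
    DO INSTEAD DELETE FROM table1_aux WHERE id = OLD.id;
CREATE RULE table1_upd AS ON UPDATE TO table1
    DO INSTEAD UPDATE table1_aux
        SET toBeKept = NEW.toBeKept, payload = NEW.payload
        WHERE id = OLD.id;
-- readers can see both tables through a view while the migration runs
CREATE VIEW table1_union AS
SELECT id, toBeKept, payload FROM table1
UNION ALL
SELECT id, toBeKept, payload FROM table1_aux;
-- move the good data, then empty the original
INSERT INTO table1_aux SELECT id, toBeKept, payload FROM table1 WHERE toBeKept = 1;
TRUNCATE table1;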
But really, I'd just DELETE the 99% with a simple WHERE clause instead of reinventing the wheel.
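That plain route, again assuming the question's toBeKept flag:
DELETE FROM table1 WHERE toBeKept IS DISTINCT FROM 1; -- also catches NULLs
VACUUM ANALYZE table1; -- make the dead rows reusable afterwards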
We are trying to create a table from another table with this method:
create table tab1 as select * from tab2;
But the process failed with the error:
ORA-01652: unable to extend temp segment by 8192 in tablespace
However, the table tab1 was created with partial data only; there is a count mismatch between tab1 and tab2. Neither of the two tables was being populated or updated by any other transaction. This happened with a couple of tables.
As far as I know, a CREATE TABLE should either create the table completely or not at all; there should be no possibility of a table being created partially.
Any insight from experts is appreciated.
Putting the cause of the error aside (addressed by @Leo in his answer):
I have not found anything specific on transactions for CREATE TABLE ... AS SELECT. Any CREATE TABLE statement is a DDL operation, and DDL operations are generally non-transactional.
This is just speculation, but I'd say that the table creation did succeed. The instruction you gave is basically two in one: the first part is the actual table creation, which does work (and as it is not transactional, it cannot be rolled back by the second part); the second is a variant of a bulk insert-from-select (with implicit commits for batches), which breaks at some point.
This is probably not answering your question, but as the operation is apparently two-phase anyway, if you need a more transactional approach, you would benefit from splitting it into two separate operations:
first:
CREATE TABLE tab1 AS SELECT * FROM tab2 WHERE 1 = 2;
second:
INSERT INTO tab1 SELECT * FROM tab2;
This way if the second part fails, you will not end up with a partial insert. You will still have the table in place though.
Execute the following as a sysadmin to determine the filename of the datafile for the existing tablespace:
SELECT * FROM DBA_DATA_FILES;
Then extend the size of the datafile as follows (replace the filename with the one from the previous query):
ALTER DATABASE DATAFILE 'C:\ORACLEXE\ORADATA\XE\SYSTEM.DBF' RESIZE 4096M;
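Alternatively (an option added here for illustration, not part of the original answer), let the datafile grow on demand:
ALTER DATABASE DATAFILE 'C:\ORACLEXE\ORADATA\XE\SYSTEM.DBF'
    AUTOEXTEND ON NEXT 512M MAXSIZE UNLIMITED;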
You can first try the command below, or ask your DBA to grant the privilege:
grant unlimited tablespace to <schema_name>;
I am trying to create a stored procedure to recreate a table from scratch, with a possible change of schema (including possible additions/removals of columns), by using a DROP TABLE followed by a SELECT INTO, like this:
BEGIN TRAN
DROP TABLE [MyTable]
SELECT (...) INTO [MyTable] FROM (...)
COMMIT
My concern is that errors could be generated if someone tries to access the table after it has been dropped but before the SELECT INTO has completed. Is there a way to lock [MyTable] in a way that will persist through the DROP?
Instead of DROP/SELECT INTO, I could TRUNCATE/INSERT INTO, but this would not allow the schema to be changed. SELECT INTO is convenient in my situation because it allows the new schema to be automatically determined. Is there a way to make this work safely?
Also, I would like to be sure that the source tables in "FROM (...)" are not locked during this process.
If you try to make a significant change to the table (like adding a column in the middle of existing columns, not at the end) using SSMS and see what script it generates, you'll see that SSMS uses sp_rename.
The general structure of the script SSMS generates:
create a new table with temporary name
populate the new table with data
drop the old table
rename the new table to the correct name.
All this in a transaction.
This should keep the time when tables are locked to a minimum.
BEGIN TRANSACTION
SELECT (...) INTO dbo.Temp_MyTable FROM (...)
DROP TABLE dbo.MyTable
EXECUTE sp_rename N'dbo.Temp_MyTable', N'MyTable', 'OBJECT' -- the new name must not be schema-qualified
COMMIT
DROP TABLE MyTable acquires a schema modification (Sch-M) lock on it until the end of transaction, so all other queries using MyTable would wait. Even if other queries use the READ UNCOMMITTED isolation level (or the infamous WITH (NOLOCK) hint).
See also MSDN Lock Modes:
Schema Locks
The Database Engine uses schema modification (Sch-M) locks during a table data definition language (DDL) operation, such as adding a column or dropping a table. During the time that it is held, the Sch-M lock prevents concurrent access to the table. This means the Sch-M lock blocks all outside operations until the lock is released.
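A quick hypothetical demo of that blocking (table name borrowed from the question above):
-- session 1: hold the Sch-M lock by leaving the transaction open
BEGIN TRANSACTION
DROP TABLE dbo.MyTable
-- session 2: even a dirty read now blocks until session 1 finishes:
-- SELECT * FROM dbo.MyTable WITH (NOLOCK)
ROLLBACK -- releases the Sch-M lock and undoes the drop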
Basically I want to do this:
begin;
lock table a;
alter table a rename to b;
alter table a1 rename to a;
drop table b;
commit;
i.e. gain control and replace my old table while no one has access to it.
Simpler:
BEGIN;
DROP TABLE a;
ALTER TABLE a1 RENAME TO a;
COMMIT;
DROP TABLE acquires an ACCESS EXCLUSIVE lock on the table anyway. An explicit LOCK command is no better. And renaming a dead guy is just a waste of time.
You may want to write-lock the old table while preparing the new, to prevent writes in between. Then you'd issue a lock like this earlier in the process:
LOCK TABLE a IN SHARE MODE;
What happens to concurrent transactions trying to access the table? It's not that simple, read this:
Best way to populate a new column in a large table?
Explains why you may have seen error messages like this:
ERROR: could not open relation with OID 123456
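Putting the pieces together, a minimal sketch; the table a(id int, keep boolean) and the keep predicate are assumptions for illustration:
BEGIN;
LOCK TABLE a IN SHARE MODE;             -- block concurrent writes, reads proceed
CREATE TABLE a1 (LIKE a INCLUDING ALL); -- copies columns, defaults and indexes
INSERT INTO a1 SELECT * FROM a WHERE keep;
DROP TABLE a;                           -- takes ACCESS EXCLUSIVE; fails if views or FKs still reference a
ALTER TABLE a1 RENAME TO a;
COMMIT;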
Create a SQL backup, make the changes you need directly in the backup.sql file, and restore the database. I used this trick when I added INHERIT for a group of tables (Postgres DBMS) to remove the inherited fields from a subtable.
I would use answer#13, but I agree it will not inherit the constraints, and the DROP TABLE might fail. So:
line up the relevant constraints first (e.g. from pg_dump --schema-only)
drop the constraints
do the swap per answer#13
apply the constraints (SQL snippets from the schema dump)
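For example, a hypothetical foreign key round-trip (names are illustrative; the re-add statement would come from your schema dump):
ALTER TABLE child DROP CONSTRAINT child_parent_fk;
-- ... do the swap per answer#13 here ...
ALTER TABLE child ADD CONSTRAINT child_parent_fk
    FOREIGN KEY (parent_id) REFERENCES parent (id);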
I would like to populate a table from a (potentially large) view on a scheduled basis.
My process would be:
Disable indexes on table
Truncate table
Copy data from view to table
Enable indexes on table
In SQL Server, I can wrap the process in a transaction such that when I truncate the table a schema modification lock will be held until I commit. This effectively means that no other process can insert/update/whatever until the entire process is complete.
However I am aware that in Oracle the truncate table statement is considered DDL and will thus issue an implicit commit.
So my question is: how can I mimic the behaviour of SQL Server here? I don't want any other process trying to insert/update/whatever whilst I am truncating and (re)populating the table. I would also prefer the other processes to be unaware of any locks.
Thanks in advance.
Make your table a partitioned table with a single partition and local indexes only. Then whenever you need to refresh:
Copy data from view into a new temporary table
CREATE TABLE tmp AS SELECT ... FROM some_view;
Exchange the partition with the temporary table:
ALTER TABLE some_table
EXCHANGE PARTITION part WITH TABLE tmp
WITHOUT VALIDATION;
The table is only locked for the duration of the partition exchange, which, without validation and global index update, should be instant.
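A sketch of the one-partition setup this relies on (table, column and partition names are illustrative):
CREATE TABLE some_table (
    id  NUMBER,
    val VARCHAR2(100)
)
PARTITION BY RANGE (id) (
    PARTITION part VALUES LESS THAN (MAXVALUE)
);
CREATE INDEX some_table_idx ON some_table (id) LOCAL;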
My project has a table with 23 million records, and around 6 fields of that table are indexed.
Earlier I tried to add a delta column for Thinking Sphinx search, but it ended up holding a lock on the whole database for an hour. Afterwards, once the column was added and I tried to rebuild the indexes, this is the query that held the database lock for around 4 hours:
"update user_messages set delta = false where delta = true"
To bring the server back up, I created a new database from a dump and promoted it as the database so the server could go live again.
Now what I am looking for is this: is it possible to add the delta column to my table without a table lock? And once the delta column is added, why is the above query executed when I run the index rebuild command, and why does it block the server for so long?
PS: I am on Heroku, using Postgres with the Ika DB plan.
Postgres 11 or later
Since Postgres 11, only volatile default values still require a table rewrite. The manual:
Adding a column with a volatile DEFAULT or changing the type of an existing column will require the entire table and its indexes to be rewritten.
Bold emphasis mine. false is immutable. So just add the column with DEFAULT false. Super fast, job done:
ALTER TABLE tbl ADD COLUMN delta boolean DEFAULT false;
Postgres 10 or older, or for volatile DEFAULT
Adding a new column without DEFAULT or DEFAULT NULL will not normally force a table rewrite and is very cheap. Only writing actual values to it creates new rows. But, quoting the manual:
Adding a column with a DEFAULT clause or changing the type of an existing column will require the entire table and its indexes to be rewritten.
UPDATE in PostgreSQL writes a new version of the row. Your question does not provide all the information, but that probably means writing millions of new rows.
While doing the UPDATE in place, if a major portion of the table is affected and you are free to lock the table exclusively, remove all indexes before doing the mass UPDATE and recreate them afterwards. It's faster this way. Related advice in the manual.
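Sketched against the question's table (the index name is an assumption):
DROP INDEX user_messages_delta_idx;
UPDATE user_messages SET delta = false WHERE delta = true;
CREATE INDEX user_messages_delta_idx ON user_messages (delta);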
If your data model and available disk space allow for it, CREATE a new table in the background and then, in one transaction: DROP the old table, and RENAME the new one. Related:
Best way to populate a new column in a large table?
While creating the new table in the background: Apply all changes to the same row at once. Repeated updates create new row versions and leave dead tuples behind.
If you cannot remove the original table because of constraints, another fast way is to build a temporary table, TRUNCATE the original one and mass INSERT the new rows - sorted, if that helps performance. All in one transaction. Something like this:
BEGIN;
SET temp_buffers = '1000MB'; -- or whatever you can spare temporarily
-- write-lock table here to prevent concurrent writes - if needed
LOCK TABLE tbl IN SHARE MODE;
CREATE TEMP TABLE tmp AS
SELECT *, false AS delta
FROM tbl; -- copy existing rows plus new value
-- ORDER BY ??? -- opportune moment to cluster rows
-- DROP all indexes here
TRUNCATE tbl; -- empty table - truncate is super fast
ALTER TABLE tbl ADD COLUMN delta boolean DEFAULT FALSE; -- NOT NULL?
INSERT INTO tbl
TABLE tmp; -- insert back surviving rows.
-- recreate all indexes here
COMMIT;
You could add another table with just that one column; then there won't be any such long locks. Of course there should also be a second column, a foreign key referencing the first table.
For the indexes, you could use CREATE INDEX CONCURRENTLY; it doesn't take such heavy locks on the table: http://www.postgresql.org/docs/9.1/static/sql-createindex.html
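A sketch of that side table, assuming user_messages has an integer primary key id (all names are illustrative):
CREATE TABLE user_message_deltas (
    user_message_id integer PRIMARY KEY REFERENCES user_messages (id),
    delta           boolean NOT NULL DEFAULT false
);
-- builds without blocking concurrent reads and writes on the table:
CREATE INDEX CONCURRENTLY user_message_deltas_delta_idx
    ON user_message_deltas (delta);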