During the ETL we do the following operations:
begin transaction;
drop table if exists target_tmp;
create table target_tmp like target;
insert into target_tmp select * from source_a inner join source_b on ...;
analyze table target_tmp;
drop table target;
alter table target_tmp rename to target;
commit;
The SQL is executed by AWS Data Pipeline, in case that matters.
However, the pipelines sometimes fail with the following error:
ERROR: table 111566 dropped by concurrent transaction
Redshift supports serializable isolation. Does one of the commands break isolation?
Yes, that works, but if generating the temp table takes a while, you can expect other queries to hit that error while it runs. You could try generating the temp table in a separate transaction (a transaction may not be needed unless you are worried about updates to the source tables). Then do a quick rotation of the table names so there is much less time for contention:
-- generate target_tmp first then
begin;
alter table target rename to target_old;
alter table target_tmp rename to target;
commit;
drop table target_old;
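The rotation above can be tried out locally. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for Redshift (an assumption made purely so the example runs anywhere; the column names and sample rows are hypothetical, and SQLite does not reproduce Redshift's serializable isolation behaviour). It only illustrates the ordering: build the replacement first, then keep the transaction down to the two renames.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode,
# so BEGIN/COMMIT are issued explicitly, as in the SQL above.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE target (id INTEGER, val TEXT)")
conn.execute("INSERT INTO target VALUES (1, 'old')")

# Step 1: build the replacement table outside the swap transaction (the slow part).
conn.execute("CREATE TABLE target_tmp (id INTEGER, val TEXT)")
conn.execute("INSERT INTO target_tmp VALUES (1, 'new'), (2, 'new')")

# Step 2: a short transaction that contains only the two renames.
conn.execute("BEGIN")
conn.execute("ALTER TABLE target RENAME TO target_old")
conn.execute("ALTER TABLE target_tmp RENAME TO target")
conn.execute("COMMIT")

# Step 3: drop the old copy once the swap has been committed.
conn.execute("DROP TABLE target_old")

rows = conn.execute("SELECT id, val FROM target ORDER BY id").fetchall()
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(rows, tables)
```

Readers never see a moment where `target` is missing after the commit, and the expensive insert no longer sits inside the transaction that touches the live table.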
I'm trying to create a temporary table to save some codes, but when I try to insert a code it throws me the following error as if the table did not exist:
can't format message 13:796 -- message file C:\Windows\firebird.msg
not found. Dynamic SQL Error. SQL error code = -204. Table unknown.
TEMPCODES. At line 1, column 13.
These are the lines that I try to run:
create global temporary table TEMPCODES
(
codigo varchar(13)
)
on commit delete rows;
insert into TEMPCODES values('20-04422898-0');
Why can't it find the table if I create it first?
In Firebird, you cannot use a database object in the same transaction that created it. You need to commit before you can use the table.
In other words, you should use:
create global temporary table TEMPCODES
(
codigo varchar(13)
)
on commit delete rows;
commit;
insert into TEMPCODES values('20-04422898-0');
Also, it is important to realise that global temporary tables (GTT) are intended as permanent objects. The idea is to create a GTT once, and then use it whenever you need it. The content of a GTT is only visible to the current transaction (on commit delete rows) or to the current connection (on commit preserve rows). Creating a GTT on the fly is not the normal usage pattern for GTTs.
I'm planning to truncate a Hive external table that has one partition, so I used the following command:
hive> truncate table abc;
But it throws an error: Cannot truncate non-managed table abc.
Can anyone suggest how to do this?
Make your table MANAGED first:
ALTER TABLE abc SET TBLPROPERTIES('EXTERNAL'='FALSE');
Then truncate:
truncate table abc;
And finally you can make it external again:
ALTER TABLE abc SET TBLPROPERTIES('EXTERNAL'='TRUE');
By default, TRUNCATE TABLE is supported only on managed tables. Attempting to truncate an external table results in the following error:
Error: org.apache.spark.sql.AnalysisException: Operation not allowed: TRUNCATE TABLE on external tables
Action Required
Change applications. Do not attempt to run TRUNCATE TABLE on an external table.
Alternatively, change applications to alter a table property to set external.table.purge to true to allow truncation of an external table:
ALTER TABLE mytable SET TBLPROPERTIES ('external.table.purge'='true');
There is an even better solution to this, which is basically a one liner.
insert overwrite table table_xyz select * from table_xyz where 1=2;
This code will delete all the files and create a blank file in the external folder location with absolutely zero records.
Look at https://issues.apache.org/jira/browse/HIVE-4367 : use
truncate table my_ext_table force;
I have to create a table through ETL in Redshift. First, I load the data into a temporary table and then move it into the main table inside a transaction. The approach below works well when I want to update data for a particular period. But if I have just created main_table and it doesn't contain any data yet, how do the TRANSACTION and DELETE statements behave for the initial load of data into main_table? I am new to ETL processes.
BEGIN TRANSACTION;
DELETE FROM main_table
USING temporary_table
WHERE main_table.id = temporary_table.id
AND main_table.time_stamp = temporary_table.time_stamp;
INSERT INTO main_table
SELECT * FROM temporary_table ;
END TRANSACTION;
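To see how the delete-then-insert pattern behaves on an empty main_table, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for Redshift. Several things are assumptions for illustration only: the `value` column, the sample rows, the `merge` helper, and the use of `WHERE EXISTS` in place of `DELETE ... USING`, which SQLite does not support. On the first run the DELETE simply matches zero rows and the INSERT loads everything; on later runs the DELETE removes the rows that are about to be replaced.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit BEGIN/COMMIT
conn.execute("CREATE TABLE main_table (id INTEGER, time_stamp TEXT, value REAL)")
conn.execute("CREATE TABLE temporary_table (id INTEGER, time_stamp TEXT, value REAL)")

def merge(conn):
    """Delete rows that are about to be replaced, then insert the staged rows."""
    conn.execute("BEGIN")
    deleted = conn.execute(
        """
        DELETE FROM main_table
        WHERE EXISTS (
            SELECT 1 FROM temporary_table t
            WHERE t.id = main_table.id
              AND t.time_stamp = main_table.time_stamp
        )
        """
    ).rowcount
    conn.execute("INSERT INTO main_table SELECT * FROM temporary_table")
    conn.execute("COMMIT")
    return deleted

# Initial load: main_table is empty, so the DELETE matches nothing.
conn.execute("INSERT INTO temporary_table VALUES "
             "(1, '2020-01-01', 10.0), (2, '2020-01-01', 20.0)")
first_deleted = merge(conn)

# Later load: one row already exists (id 2) and one is new (id 3).
conn.execute("DELETE FROM temporary_table")
conn.execute("INSERT INTO temporary_table VALUES "
             "(2, '2020-01-01', 25.0), (3, '2020-01-01', 30.0)")
second_deleted = merge(conn)

final_rows = conn.execute("SELECT id, value FROM main_table ORDER BY id").fetchall()
print(first_deleted, second_deleted, final_rows)
```

The same statements therefore cover both the initial load and subsequent updates, with no special case needed for an empty table.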
I have a table in a SQL database that I want to remove the data from, but I want to keep the columns.
e.g. my table has 3 columns: Name, Age, Date. I don't want to remove these, I just want to remove the data.
Should I use Truncate, Delete or Drop?
Don't drop - it will delete the data and the definition.
If you delete - the data is gone and auto-increment values go on from the last value.
If you truncate - then it is as if you had just created the table. No data, and all counters are reset.
Truncate is very fast, like a quick format of the table. It does not require any extra space when deleting. In some databases (MySQL and Oracle, for example) you cannot roll back this operation. You cannot specify conditions. This is the best choice for deleting all data from a table.
Delete is much slower and you need extra space for this operation, because you must be able to rollback the data. If you need to delete all data, use truncate. If you need to specify conditions, use delete.
Drop table - you can delete data from table by dropping and creating it again like you would truncate it, but it is slower and you can have some other problems, like foreign key dependencies. I definitely don't recommend this operation.
delete from TableName should do the trick
DROPing the table will remove the columns too.
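A quick way to convince yourself of the difference is to run both statements against a throwaway database. This is a minimal sketch using Python's built-in sqlite3 module; the table name `people` and the sample row are hypothetical, and SQLite is used only because it needs no server (it has no TRUNCATE statement, so only DELETE and DROP are shown).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (Name TEXT, Age INTEGER, Date TEXT)")
conn.execute("INSERT INTO people VALUES ('Ann', 30, '2020-01-01')")
conn.commit()

# DELETE removes the rows but leaves the table definition in place.
conn.execute("DELETE FROM people")
conn.commit()
row_count = conn.execute("SELECT COUNT(*) FROM people").fetchone()[0]
columns_after_delete = [c[1] for c in conn.execute("PRAGMA table_info(people)")]

# DROP removes the rows and the definition, so the table can no longer be queried.
conn.execute("DROP TABLE people")
try:
    conn.execute("SELECT * FROM people")
    table_exists = True
except sqlite3.OperationalError:
    table_exists = False

print(row_count, columns_after_delete, table_exists)
```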
Delete: The DELETE Statement is used to delete rows from a table.
Truncate: The SQL TRUNCATE command is used to delete all the rows from the table and free the space containing the table.
Drop: The SQL DROP command is used to remove an object from the database. If you drop a table, all the rows in the table are deleted and the table structure is removed from the database.
Truncate the table. That would be a good option in your case.
In SQL Server, for example, Delete, Truncate and Drop can all be rolled back.
But you must run Begin Transaction before executing the Delete, Truncate or Drop query.
Here is an example:
Create Database Ankit
Create Table Tbl_Ankit(Name varchar(11))
insert into tbl_ankit(name) values('ankit');
insert into tbl_ankit(name) values('ankur');
insert into tbl_ankit(name) values('arti');
Select * From Tbl_Ankit
/*======================For Delete==================*/
Begin Transaction
Delete From Tbl_Ankit where Name='ankit'
Rollback
Select * From Tbl_Ankit
/*======================For Truncate==================*/
Begin Transaction
Truncate Table Tbl_Ankit
Rollback
Select * From Tbl_Ankit
/*======================For Drop==================*/
Begin Transaction
Drop Table Tbl_Ankit
Rollback
Select * From Tbl_Ankit
I have TABLE1, before making changes I made a backup of the table:
SELECT * INTO TABLE1BACKUP FROM TABLE1
I have made changes to the data in the backup of the table, so now I want to copy the backup table data into the main table. How can I get back my original data? I need to truncate my main table and copy all the data from backup table.
You can only SELECT INTO a new table. In your case, you need:
TRUNCATE TABLE dbo.Table1;
INSERT dbo.Table1 SELECT * FROM dbo.Table1Backup;
Or other options (e.g. the above won't work if there are foreign keys):
DELETE dbo.Table1;
INSERT dbo.Table1 SELECT * FROM dbo.Table1Backup;
If there are foreign keys and child rows that point to this table, you'll need to drop or disable those constraints first too.
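The DELETE-then-INSERT restore can be sketched end to end as follows. This uses Python's built-in sqlite3 module as a stand-in for SQL Server, which is an assumption made so the example is runnable: `CREATE TABLE ... AS SELECT` plays the role of `SELECT ... INTO`, and the `id` and `name` columns and the sample rows are hypothetical. Wrapping both statements in one transaction means other sessions never see the table empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit BEGIN/COMMIT
conn.execute("CREATE TABLE Table1 (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO Table1 VALUES (1, 'a'), (2, 'b')")

# Take the backup (SQLite's equivalent of SELECT * INTO Table1Backup FROM Table1).
conn.execute("CREATE TABLE Table1Backup AS SELECT * FROM Table1")

# Simulate changes to the main table that we later want to undo.
conn.execute("UPDATE Table1 SET name = 'changed'")
conn.execute("INSERT INTO Table1 VALUES (3, 'extra')")

# Restore: empty the main table and reload it from the backup in one transaction.
conn.execute("BEGIN")
conn.execute("DELETE FROM Table1")
conn.execute("INSERT INTO Table1 SELECT * FROM Table1Backup")
conn.execute("COMMIT")

restored = conn.execute("SELECT id, name FROM Table1 ORDER BY id").fetchall()
print(restored)
```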
If there are no constraints etc. that you need to worry about, an even less intrusive way to do this is:
BEGIN TRANSACTION;
EXEC sp_rename 'dbo.Table1', N'Table1Old', OBJECT;
EXEC sp_rename 'dbo.Table1Backup', N'Table1', OBJECT;
COMMIT TRANSACTION;
DROP TABLE dbo.Table1Old;