Redshift: TRUNCATE TABLE IF EXISTS

It is recommended to use TRUNCATE TABLE instead of DELETE. However, TRUNCATE TABLE does not support the IF EXISTS clause. The alternative is to DROP TABLE and recreate it, but that requires DDL. Is there a way to run TRUNCATE TABLE only if the table exists?

You have two options:
SQL procedure/script
Use an IF condition to check whether the table exists, and truncate it only if it does.
Plain SQL statements
Use CREATE TABLE IF NOT EXISTS in combination with TRUNCATE. This ensures the table always exists, so your subsequent SQL statements don't error out and stop.
CREATE TABLE IF NOT EXISTS #tobetruncated (id INT); -- placeholder column list; use the real table definition
TRUNCATE TABLE #tobetruncated;
NOTE: This is not specific to Redshift; it applies to most databases unless they offer a dedicated option (the one I know of is the TABLE_EXISTS_ACTION parameter of Oracle Data Pump import). TRUNCATE is an all-or-nothing operation, and that's what makes it perform much better than DELETE.
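The first option could be sketched as a Redshift stored procedure. This is a minimal sketch under stated assumptions, not a definitive implementation: the procedure name truncate_if_exists and its parameters are hypothetical, and it assumes the table is visible through the pg_tables catalog view. Keep in mind that TRUNCATE commits the current transaction in Redshift, so the call is best made on its own rather than in the middle of other work.

```sql
-- Hypothetical helper: truncates schema.table only when it exists.
CREATE OR REPLACE PROCEDURE truncate_if_exists(p_schema VARCHAR, p_table VARCHAR)
AS $$
DECLARE
    table_count INTEGER;
BEGIN
    -- pg_tables has one row per table; pass the names in lower case
    SELECT INTO table_count COUNT(*)
    FROM pg_tables
    WHERE schemaname = p_schema
      AND tablename = p_table;

    IF table_count > 0 THEN
        -- Dynamic SQL, because the table name is only known at run time
        EXECUTE 'TRUNCATE TABLE ' || quote_ident(p_schema) || '.' || quote_ident(p_table);
    END IF;
END;
$$ LANGUAGE plpgsql;

-- Does nothing if public.tobetruncated is missing
CALL truncate_if_exists('public', 'tobetruncated');
```

Unlike the CREATE TABLE IF NOT EXISTS approach, this never creates a table as a side effect, at the cost of needing permission to create procedures.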

Related

Clean/truncate table before selecting into it

Just had a quick question to know if this is the right way to do something.
I have a query whose results I want to store in a table. I thought about updating the table with only the changed rows, but since my query only takes about a minute, I think it is easier to just drop the whole table and rerun the query every time I do my hourly update.
Will this be the way to do it?
Truncate Table
Select *
Into Table
From TableTwo
Where X
And then just take that query, turn it into a stored procedure, and turn the procedure into a job that runs once an hour.
Also, I want this table to have indexes. Will they be preserved even if I truncate every time?
You can do this. I would probably advise dropping the table instead. This will better handle changes in table structure.
If you do use truncate, you want INSERT ... SELECT rather than SELECT ... INTO:
insert into Table -- column list is recommended
Select *
From TableTwo
Where X;
Dropping the table might take an iota more time, and it doesn't preserve triggers, constraints, foreign key references, and storage definitions. (I'm guessing those are not important.) However, it does allow the query to change over time, which might be useful to future-proof the code.
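The drop-and-recreate approach might look like the following in SQL Server syntax. This is only a sketch: the names dbo.MyTable, Col1 and IX_MyTable_Col1 are hypothetical, and the WHERE condition stands in for the question's "Where X". Because SELECT ... INTO creates a brand-new table, any indexes have to be created again on every run.

```sql
-- Drop the old copy only if it is there (OBJECT_ID returns NULL otherwise)
IF OBJECT_ID('dbo.MyTable', 'U') IS NOT NULL
    DROP TABLE dbo.MyTable;

-- SELECT ... INTO creates dbo.MyTable with the query's current column list
SELECT *
INTO dbo.MyTable
FROM TableTwo
WHERE Col1 IS NOT NULL;   -- placeholder for the real filter

-- Indexes are lost with the DROP, so recreate them each time
CREATE INDEX IX_MyTable_Col1 ON dbo.MyTable (Col1);
```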
Truncate doesn't drop the Table from the database, it just cleans up the table.
So you won't be able to run SELECT INTO, because the table's ObjectID already exists. However, TRUNCATE preserves all the indexes, keys, and table integrity.
If you drop the table, you get rid of its ObjectID and can then run SELECT INTO. That's only a good idea if the table you're inserting from is going to have a lot of changes all the time (which is a bad thing on its own). This method doesn't preserve any indexes or keys, so you'd have to create them in the stored procedure every time you run it.
Which is again a bad thing on its own.
My suggestion is to turn it into a stored procedure that runs TRUNCATE followed by INSERT INTO. If your company decides to change the table, you have to change both MyTable and the stored procedure, which is more of a headache, but companies usually don't change their database structure very often unless the database is still in development or testing and not live. In that case SELECT INTO is only a temporary solution.
CREATE PROC MyProc
as
TRUNCATE TABLE MyTable;
INSERT INTO MyTable (Col1, Col2, Col3)
SELECT Col1, Col2, Col3
FROM TableTwo

Drop table in Oracle 10g, but use where condition

How can I drop a table in Oracle using a WHERE condition in the query? For example:
drop table employees where employee_id='100';
Is this possible or not?
You're not looking to DROP the table; dropping the table removes the entire table and all data therein.
It appears as though you want to remove some rows from the table. If this is the case you should use the DELETE statement:
delete from employees where employee_id = 100;
It might be worth investigating some basic functionality or taking a course before continuing.
Incidentally, 100 is a number. Using ' implies that it's a string. If the datatype of the column EMPLOYEE_ID is a number you don't need to quote it.
If you need to drop a table, there is no need to check any conditions.
If you need a WHERE clause, you are probably looking to delete some rows. Better use DELETE.
Using a condition when dropping a table is new to me; I am going to try it now.

Creating temporary tables in SQL

I am trying to create a temporary table that selects only the data for a certain register_type. I wrote this query but it does not work:
$ CREATE TABLE temp1
(Select
egauge.dataid,
egauge.register_type,
egauge.timestamp_localtime,
egauge.read_value_avg
from rawdata.egauge
where register_type like '%gen%'
order by dataid, timestamp_localtime ) $
I am using PostgreSQL.
Could you please tell me what is wrong with the query?
You probably want CREATE TABLE AS - also works for TEMPORARY (TEMP) tables:
CREATE TEMP TABLE temp1 AS
SELECT dataid
, register_type
, timestamp_localtime
, read_value_avg
FROM rawdata.egauge
WHERE register_type LIKE '%gen%'
ORDER BY dataid, timestamp_localtime;
This creates a temporary table and copies data into it. A static snapshot of the data, mind you. It's just like a regular table, but resides in RAM if temp_buffers is set high enough. It is only visible within the current session and dies at the end of it. When created with ON COMMIT DROP it dies at the end of the transaction.
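As a small illustration of the transaction-scoped variant, the sketch below reuses the table and column names from the question. The count query is just a hypothetical stand-in for whatever work needs the snapshot.

```sql
BEGIN;

-- The temp table exists only until the transaction ends
CREATE TEMP TABLE temp1 ON COMMIT DROP AS
SELECT dataid, register_type, timestamp_localtime, read_value_avg
FROM   rawdata.egauge
WHERE  register_type LIKE '%gen%';

-- Use the snapshot while the transaction is still open
SELECT count(*) FROM temp1;

COMMIT;  -- temp1 is dropped here
```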
Temp tables come first in the default schema search path, hiding other visible tables of the same name unless schema-qualified:
How does the search_path influence identifier resolution and the "current schema"
If you want dynamic, you would be looking for CREATE VIEW - a completely different story.
The SQL standard also defines, and Postgres also supports: SELECT INTO. But its use is discouraged:
It is best to use CREATE TABLE AS for this purpose in new code.
There is really no need for a second syntax variant, and SELECT INTO is used for assignment in plpgsql, where the SQL syntax is consequently not possible.
Related:
Combine two tables into a new one so that select rows from the other one are ignored
ERROR: input parameters after one with a default value must also have defaults in Postgres
CREATE TABLE ... (LIKE ...) only copies the structure from another table and no data:
The LIKE clause specifies a table from which the new table
automatically copies all column names, their data types, and their
not-null constraints.
If you need a "temporary" table just for the purpose of a single query (and then discard it) a "derived table" in a CTE or a subquery comes with considerably less overhead:
Change the execution plan of query in postgresql manually?
Combine two SELECT queries in PostgreSQL
Reuse computed select value
Multiple CTE in single query
Update with results of another sql
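The derived-table alternative mentioned above could be sketched with a CTE as follows. The filter and column names come from the question; the per-dataid average is a hypothetical example of the single query that would otherwise have read from the temp table.

```sql
WITH gen AS (
    -- Same filter the temp table would have used
    SELECT dataid, register_type, timestamp_localtime, read_value_avg
    FROM   rawdata.egauge
    WHERE  register_type LIKE '%gen%'
)
SELECT dataid, avg(read_value_avg) AS avg_read
FROM   gen
GROUP  BY dataid
ORDER  BY dataid;
```

Nothing is created or dropped here; the intermediate result lives only for the duration of the statement.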
http://www.postgresql.org/docs/9.2/static/sql-createtable.html
CREATE TEMP TABLE temp1 (LIKE ...)

Improving performance of Sql Delete

We have a query to remove some rows from the table based on an id field (primary key). It is a pretty straightforward query:
delete all from OUR_TABLE where ID in (123, 345, ...)
The problem is that the number of IDs can be huge (e.g. 70k), so the query takes a long time. Is there any way to optimize this?
(We are using Sybase, if that matters.)
There are two ways to make statements like this one perform:
Create a new table and copy all but the rows to delete. Swap the tables afterwards (alter table name ...). I suggest giving it a try even if it sounds stupid. Some databases are much faster at copying than at deleting.
Partition your tables. Create N tables and use a view to join them into one. Sort the rows into different tables grouped by the delete criterion. The idea is to drop a whole table instead of deleting individual rows.
Consider running this in batches. A loop running 1000 records at a time may be much faster than one query that does everything and in addition will not keep the table locked out to other users for as long at a stretch.
If you have cascade deletes (and lots of foreign key tables affected) or triggers involved, you may need to run in even smaller batches. You'll have to experiment to see which number is best for your situation. I've had tables where I had to delete in batches of 100 and others where 50000 worked (fortunate in that case, as I was deleting a million records).
But in any event I would put the key values that I intend to delete into a temp table and delete from there.
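Combining the two suggestions above, a batched delete driven by a temp table could be sketched like this in Sybase/T-SQL syntax. The temp table name #IDsToDelete and the batch size of 1000 are assumptions for illustration; the right batch size has to be found by experiment.

```sql
-- Keys to delete go into a temp table first (hypothetical name)
create table #IDsToDelete (ID integer not null)
go
-- ... insert the IDs to delete into #IDsToDelete here ...

-- Limit each delete statement to 1000 rows
set rowcount 1000
go
declare @deleted int
select @deleted = 1
while @deleted > 0
begin
    delete OUR_TABLE
    from #IDsToDelete
    where OUR_TABLE.ID = #IDsToDelete.ID
    -- capture the row count immediately; zero means nothing is left to delete
    select @deleted = @@rowcount
end
go
-- Restore the default (no row limit)
set rowcount 0
go
drop table #IDsToDelete
go
```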
I'm wondering if parsing an IN clause with 70K items in it is a problem. Have you tried a temp table with a join instead?
Can Sybase handle 70K arguments in an IN clause? All databases I have worked with have some limit on the number of arguments in an IN clause. For example, Oracle has a limit of around 1000.
Can you create subselect instead of IN clause? That will shorten sql. Maybe that could help for such a big number of values in IN clause. Something like this:
DELETE FROM OUR_TABLE WHERE ID IN
(SELECT ID FROM somewhere WHERE some_condition)
Deleting a large number of records can be sped up with some interventions in the database, if the database model permits. Here are some strategies:
1. You can speed things up by dropping indexes, deleting the records, and recreating the indexes. This eliminates rebalancing of the index trees while records are deleted.
   drop all indexes on table
   delete records
   recreate indexes
2. If you have lots of relations to this table, try disabling constraints, but only if you are absolutely sure that the delete command will not break any integrity constraint. The delete will go much faster because the database won't be checking integrity. Enable the constraints after the delete.
   disable integrity constraints, disable check constraints
   delete records
   enable constraints
3. Disable triggers on the table, if you have any and if your business rules allow that. Delete the records, then enable the triggers.
4. Last, do as others suggested: make a copy of the table that contains the rows that are not to be deleted, then drop the original, rename the copy, and recreate the integrity constraints, if there are any.
I would try a combination of 1, 2 and 3. If that does not work, then 4. If everything is slow, I would look for a bigger box: more memory, faster disks.
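The last strategy (copy the rows to keep, then swap the tables) could be sketched as follows in Sybase/T-SQL syntax. The names OUR_TABLE_NEW, PK_OUR_TABLE and the temp table #IDsToDelete holding the keys to remove are hypothetical. On Sybase, select into for a permanent table may also require the corresponding database option to be enabled, so treat this as an outline rather than a drop-in script.

```sql
-- Copy only the rows that should survive
select *
into OUR_TABLE_NEW
from OUR_TABLE
where not exists (select 1 from #IDsToDelete d where d.ID = OUR_TABLE.ID)
go
-- Swap the tables
drop table OUR_TABLE
go
exec sp_rename 'OUR_TABLE_NEW', 'OUR_TABLE'
go
-- select into does not copy indexes or constraints: recreate them here
alter table OUR_TABLE add constraint PK_OUR_TABLE primary key (ID)
go
```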
Find out what is using up the performance!
In many cases you might use one of the solutions provided. But there might be others (based on Oracle knowledge, so things will be different on other databases. Edit: just saw that you mentioned Sybase):
Do you have foreign keys on that table? Make sure the referring IDs are indexed.
Do you have indexes on that table? Dropping them before the delete and recreating them afterwards might be faster.
Check the execution plan. Is it using an index where a full table scan might be faster? Or the other way round? Hints might help.
Instead of a select into new_table as suggested above, a create table as select might be even faster.
But remember: Find out what is using up the performance first.
When you are using DDL statements make sure you understand and accept the consequences it might have on transactions and backups.
Try sorting the IDs you are passing into "in" in the same order as the table or index is stored in. You may then get more hits on the disk cache.
Putting the IDs to be deleted into a temp table that has the IDs sorted in the same order as the main table may let the database do a simple scan over the main table.
You could try using more than one connection and splitting the work over the connections so as to use all the CPUs on the database server; however, think first about what locks will be taken out.
I also think that the temp table is likely the best solution.
If you were to do a "delete from .. where ID in (select id from ...)" it can still be slow with large queries, though. I thus suggest that you delete using a join - many people don't know about that functionality.
So, given this example table:
-- set up tables for this example
if exists (select id from sysobjects where name = 'OurTable' and type = 'U')
drop table OurTable
go
create table OurTable (ID integer primary key not null)
go
insert into OurTable (ID) values (1)
insert into OurTable (ID) values (2)
insert into OurTable (ID) values (3)
insert into OurTable (ID) values (4)
go
We can then write our delete code as follows:
create table #IDsToDelete (ID integer not null)
go
insert into #IDsToDelete (ID) values (2)
insert into #IDsToDelete (ID) values (3)
go
-- ... etc ...
-- Now do the delete - notice that we aren't using 'from'
-- in the usual place for this delete
delete OurTable from #IDsToDelete
where OurTable.ID = #IDsToDelete.ID
go
drop table #IDsToDelete
go
-- This returns only items 1 and 4
select * from OurTable order by ID
go
Does our_table have a reference on delete cascade?

Multi Rows Deletion from table in SQL Server

How can I delete 1.5 million rows from SQL Server 2000, and how much time will it take to complete this task?
I don't want to delete all records from the table. I just want to delete all records that fulfil the WHERE condition.
EDITED from a comment to an answer below.
"I ran the same query, i.e. delete from table_name with a WHERE clause. Is it possible to disable indexing for the running query? The query has been running for the past 20 hours. Also, please help me out with how I can disable indexing."
If (and only if) you want to delete all of the records in a table, you can use DROP TABLE or TRUNCATE TABLE.
DELETE removes one record at a time and records an entry in the transaction log for each deleted row.
TRUNCATE TABLE is much faster because it logs only the deallocation of the table's data pages rather than each deleted row. It removes all rows from a table, but the table structure and its columns, constraints, indexes and so on remain. DROP TABLE would remove those.
Use caution if you decide to TRUNCATE. Once committed, it's irreversible (unless you have a backup).
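One nuance for SQL Server: a TRUNCATE TABLE issued inside an explicit transaction can still be rolled back until that transaction commits. A minimal sketch, with YourTable as a placeholder name:

```sql
BEGIN TRANSACTION;

TRUNCATE TABLE YourTable;
SELECT COUNT(*) AS RowsDuringTran FROM YourTable;     -- 0 while the transaction is open

ROLLBACK TRANSACTION;
SELECT COUNT(*) AS RowsAfterRollback FROM YourTable;  -- the original rows are back
```

This only helps before the commit; afterwards a backup is the only way to get the rows back.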
Create a second table, inserting all rows from the first that you don't want to delete.
Drop the first table.
Rename the second table to be the first.
(or a variation on the above)
This can often be quicker than doing a delete of selected records from a big table.
You may want to try deleting in batches too. I just tested this on a table I have and the delete operation went from 13 seconds to 3 seconds.
While Exists(Select * From YourTable Where YourCondition = True)
Delete Top (100000)
From YourTable
Where YourCondition = True
I don't think you can use the TOP predicate if you are running SQL2000, but it works with SQL2005 and up. If you are using SQL2000, then you can use this syntax instead:
Set RowCount 100000
While Exists(Select * From YourTable Where YourCondition = True)
Delete
From YourTable
Where YourCondition = True
DELETE FROM table WHERE a=b;
When deleting that many rows you may want to disable the indexes so they don't get updated on every delete. Rewriting the indexes on every deletion will significantly slow down the whole process.
You'll want to disable these indexes before beginning your deletion or else there may be table locks already in place.
--Disable Index (the ON clause names the table, not a column)
ALTER INDEX [IX_MyIndex] ON MyTable DISABLE
--Enable Index
ALTER INDEX [IX_MyIndex] ON MyTable REBUILD
If you wish to remove all entries in a table you can use TRUNCATE.
Does the table you are deleting from have multiple foreign keys, or cascaded deletes or triggers? All of these will impact performance.
Depending on what you want to do and the transactional integrity you need, can you delete things in small batches? For example, if the 1.5 million records you are trying to delete are one year's worth of data, can you do it one week at a time?
Delete from table where condition for those 1.5 million rows
The time depends.
On Oracle it is also possible to use
truncate table <table>
Not sure if that is standard SQL or available in SQL Server. It will, however, clear the whole table, but it is quicker than "delete from " (it also performs an implicit commit).
TRUNCATE also does not fire DELETE triggers, and most databases refuse to truncate a table that is referenced by an enabled foreign key. DELETE FROM ... WHERE checks referential integrity and fires triggers. The time will depend on the indexing of your condition columns, your hardware, and any additional system load.
The delete SQL is exactly the same as a normal SQL delete
delete from table where [your condition ]
However, if you're worried about time then I'll assume your question is a little deeper than this. If your table has a significant number of non-clustered indexes, then in some circumstances it may be faster to drop all these indexes first and rebuild them after the delete. This is unusual, but in cases where your straightforward delete is vulnerable to timeout issues it may be helpful.
CREATE TABLE new_table as select <data you want to keep> from old_table;
index new_table
grant on new table
add constraints on new_table
etc on new_table
drop table old_table
rename new_table to old_table;