Truncating and inserting on the same table at the same time - SQL

We use a DB2 database. Some data warehouse tables are TRUNCATEd and reloaded every day. We run into deadlock issues when another process is running an INSERT statement against that same table.
Scenario
TRUNCATE is executed on a table.
At the same time, another process INSERTs some data into the same table. (The process is based on a trigger and can start at any time.)
Is there a workaround?
What we have thought of so far is to prioritize the TRUNCATE and then go through with the INSERT. Is there any way to implement this? Any help would be appreciated.

You should request a table lock before you execute the truncate.
If you do this you can't get a deadlock -- the table lock won't be granted before the insert finishes and once you have the lock another insert can't occur.
Update from comment:
You can use the LOCK TABLE command. The details depend on your situation, but you should be able to get away with SHARE mode, which allows reads but not inserts (the issue you are having, I believe).
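A minimal sketch of that approach, assuming a hypothetical warehouse table dw_sales reloaded from a hypothetical staging table dw_sales_stage. DB2 requires TRUNCATE to be the first statement in its transaction, so this sketch uses a logged DELETE inside the locked unit of work instead:

LOCK TABLE dw_sales IN SHARE MODE;    -- readers proceed, INSERTs wait
DELETE FROM dw_sales;                 -- logged stand-in for TRUNCATE
INSERT INTO dw_sales SELECT * FROM dw_sales_stage;
COMMIT;                               -- the table lock is released here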
It is possible this won't fix your problem. That probably means your insert statement is too complicated -- maybe it is reading from a bunch of other tables or from a federated table. If this is the case, re-architect your solution to include a staging table (first insert into the staging table... slowly... then insert into the target table from the staging table).

Related

Mitigate Redshift Locks?

Hi, I am running ETL via Python.
I have a simple SQL file that I run from Python, like:
truncate table foo_stg;
insert into foo_stg
(
select blah,blah .... from tables
);
truncate table foo;
insert into foo
(
select * from foo_stg
);
This query sometimes takes a lock on the table which it does not release, so other processes get queued behind it.
Right now I check which table has the lock and kill the process that caused it.
What changes can I make in my code to mitigate such issues?
Thanks in advance!
The TRUNCATE is probably breaking your transaction logic. Recommend doing all truncates upfront. I'd also recommend adding some processing logic to ensure that each instance of the ETL process either: A) has exclusive access to the staging tables or B) uses a separate set of staging tables.
TRUNCATE in Redshift (and many other DBs) does an implicit COMMIT.
…be aware that TRUNCATE commits the transaction in which it is run.
Redshift tries to make this clear by returning the following INFO message to confirm success: TRUNCATE TABLE and COMMIT TRANSACTION. However, this INFO message may not be displayed by the SQL client tool. Run the SQL in psql to see it.
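A sketch of that restructuring, reusing the SQL from the question (the column list is elided as in the original). Note that foo is now empty for the whole load rather than only briefly, so make sure readers can tolerate that window:

truncate table foo_stg;  -- implicit COMMIT in Redshift
truncate table foo;      -- implicit COMMIT in Redshift
begin;
insert into foo_stg
(
select blah,blah .... from tables
);
insert into foo
(
select * from foo_stg
);
commit;  -- both loads become visible together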
In my case, I created the table for the first time and tried to load it from the stage table using insert into a table from select c1,c2,c3 from stage;. I am running this from a Python script.
The table locks and the data does not load. Another interesting scenario: when I run the same insert SQL from the editor, it loads, and after that my Python script loads the same table without any locks. It is only the first time that the table lock happens. Not sure what the issue is.

Performance tuning in SQL Server table

How do you do performance tuning on a SQL Server table to speed up inserts?
For example, in an Employee table I have 150,000 records. When I try to insert a few more records (around 20k), it takes 10-15 minutes.
Performance tuning using wait stats is a good approach in your case. Below are a few steps I would take:
Step 1:
Run the insert query.
Step 2:
Open another session and run the following:
select * from sys.dm_exec_requests
The status and wait_type columns should give you enough info on what your next steps are.
For example:
If the status is blocked (normally inserts won't be blocked), check the blocking query and see why it is blocked.
The above is just an example; there is more info online for any wait type you might encounter.
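For instance, a slightly more targeted version of that check (the session_id value is hypothetical; use the spid of your insert session):

select session_id, status, command, wait_type, wait_time,
       blocking_session_id, last_wait_type
from sys.dm_exec_requests
where session_id = 55;  -- replace 55 with your insert's spid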
My suggestion for speeding inserts is to do a bulk insert into a temporary table and then a single insert into the final table.
This assumes that the source of the new records is an external file.
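A hedged sketch of that pattern; the file path and the Employee column names are assumptions, not details from the question:

-- Staging table mirrors only the columns being loaded.
CREATE TABLE #EmployeeStage (Name varchar(100), Dept varchar(50), Salary money);

BULK INSERT #EmployeeStage
FROM 'C:\data\new_employees.csv'   -- hypothetical source file
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2);

-- One set-based insert into the final table.
INSERT INTO Employee (Name, Dept, Salary)
SELECT Name, Dept, Salary FROM #EmployeeStage;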
In any case, your question leaves lots of important information unexplained:
How are you doing the inserts now? It should not be taking 10-15 minutes to insert 20k records.
How many indexes are on the table?
How large is each record?
What triggers are on the table?
What other operations are taking place on the server?
What is the source of the records being inserted?
Do you have indexed views that use the table?
There are many reasons why inserts could be slow.
Some ideas in addition to checking for locks:
Disable any ON INSERT triggers if they exist and incorporate their logic into your insert. Also disable any indexes on the table and re-enable them after the bulk insert.
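A sketch with hypothetical trigger and index names; note that you can only disable nonclustered indexes this way, since disabling the clustered index makes the table inaccessible:

DISABLE TRIGGER trg_Employee_Insert ON Employee;   -- hypothetical trigger
ALTER INDEX IX_Employee_Name ON Employee DISABLE;  -- nonclustered only

-- ... run the bulk insert here ...

ALTER INDEX IX_Employee_Name ON Employee REBUILD;  -- rebuilding re-enables it
ENABLE TRIGGER trg_Employee_Insert ON Employee;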

Can we insert into a table if a tablock is applied by some other query on the table already in SQL Server?

I am trying to insert some records into a table using an INSERT INTO ... SELECT statement (with TABLOCK), but I am not able to.
The thing is that some other query has already applied a TABLOCK on the table. Is this the reason?
When I try to insert without TABLOCK, the rows are inserted.
TRUNCATE and DROP statements are also not working.
WITH (TABLOCK) changes the behaviour of SQL Server's locking - instead of locking just the affected rows, it locks the entire table.
So if you have an INSERT operation going, it will place an exclusive, table-wide lock - no other operations can now access that table in any way, until that first transaction holding the lock is finished.
Without the (TABLOCK) hint, SQL Server only places exclusive locks on those rows being inserted - any other rows are still accessible (and can even be updated or deleted).
WITH (TABLOCK) is a big heavy sledgehammer - use it WITH CAUTION!
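For illustration, the hint under discussion looks like this (table and column names are placeholders):

INSERT INTO TargetTable WITH (TABLOCK) (Col1, Col2)
SELECT Col1, Col2
FROM SourceTable;
-- While this runs, other sessions' INSERT, TRUNCATE, and DROP against
-- TargetTable wait, which matches the behaviour described above.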

What will happen if a Hive (0.13) SELECT and INSERT OVERWRITE are running at the same time

I would like to know what will happen if a Hive SELECT and an INSERT OVERWRITE are running at the same time. Please help me understand what the Hive query will return in the scenarios below.
Run the query first, while the query is running, INSERT OVERWRITE the same table.
Run the INSERT OVERWRITE first, while overwriting, pull the data from the same table with SELECT.
Are we going to get the old data, new data, mixed data, nothing, or unpredictable data?
I am using MapR 4.0.1, Hive 0.13.
Best regards,
Ryan
Read Hive Locking:
For a non-partitioned table, the lock modes are pretty intuitive. When the table is being read, a S lock is acquired, whereas an X lock is acquired for all other operations (insert into the table, alter table of any kind etc.)
So SELECT and INSERT acquire incompatible locks, and they can never run in parallel: one will acquire its lock first and the other will wait.
For partitioned tables things are a bit more complex, as the locks acquired are hierarchical (S on the table, S/X on the partition). Read the link.
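You can watch this happen with Hive's SHOW LOCKS command (table name hypothetical):

SHOW LOCKS my_table;           -- lists the current locks on the table
SHOW LOCKS my_table EXTENDED;  -- adds the query id, user, and lock time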

CREATE TRIGGER is taking more than 30 minutes on SQL Server 2005

On our live/production database I'm trying to add a trigger to a table, but have been unsuccessful. I have tried a few times, but it has taken more than 30 minutes for the create trigger statement to complete and I've cancelled it.
The table is one that gets read/written to often by a couple different processes. I have disabled the scheduled jobs that update the table and attempted at times when there is less activity on the table, but I'm not able to stop everything that accesses the table.
I do not believe there is a problem with the create trigger statement itself. The create trigger statement was successful and quick in a test environment, and the trigger works correctly when rows are inserted/updated in the table. However, when I created the trigger on the test database there was no load on the table, and it had considerably fewer rows than the live/production database (100 vs. 13,000,000+).
Here is the create trigger statement that I'm trying to run
CREATE TRIGGER [OnItem_Updated]
ON [Item]
AFTER UPDATE
AS
BEGIN
SET NOCOUNT ON;
IF update(State)
BEGIN
/* do some stuff including for each row updated call a stored
procedure that increments a value in table based on the
UserId of the updated row */
END
END
Can there be issues with creating a trigger on a table while rows are being updated or if it has many rows?
In SQL Server, triggers are created enabled by default. Is it possible to create the trigger disabled by default?
Any other ideas?
The problem may not be in the table itself, but in the system tables that have to be updated in order to create the trigger. If you're doing any other kind of DDL as part of your normal processes they could be holding it up.
Use sp_who to find out where the block is coming from then investigate from there.
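For example:

EXEC sp_who;   -- a non-zero value in the blk column is the blocking spid
-- sp_who2 is undocumented but widely used; it adds a BlkBy column:
EXEC sp_who2;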
I believe the CREATE TRIGGER will attempt to put a lock on the entire table.
If you have lots of activity on that table, it might have to wait a long time, and you could be creating a deadlock.
For any schema changes you should really get everyone off the database.
That said, it is tempting to put in "small" changes with active connections. You should take a look at the locks/connections to see where the lock contention is.
That's odd. An AFTER UPDATE trigger shouldn't need to check existing rows in the table. I suppose it's possible that you aren't able to obtain a lock on the table to add the trigger.
You might try creating a trigger that basically does nothing. If you can't create that, then it's a locking issue. If you can, then you could disable that trigger, add your intended code to the body, and enable it. (I do not believe you can disable a trigger during creation.)
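A sketch of that test, reusing the trigger from the question:

-- 1. Try to create a stub that does nothing. If even this hangs, the
--    problem is acquiring the schema lock, not the trigger body.
CREATE TRIGGER [OnItem_Updated]
ON [Item]
AFTER UPDATE
AS
    SET NOCOUNT ON;  -- stub body
GO

-- 2. If it succeeded quickly, disable it, add the real logic with
--    ALTER TRIGGER, then re-enable it.
DISABLE TRIGGER [OnItem_Updated] ON [Item];
GO
-- ALTER TRIGGER [OnItem_Updated] ON [Item] AFTER UPDATE AS ... (real body)
ENABLE TRIGGER [OnItem_Updated] ON [Item];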
Part of the problem may also be the trigger itself. Could your trigger accidentally be updating all rows of the table? There is a big difference between 100 rows in a test database and 13,000,000. It is a very bad idea to develop code against such a small set when you have such a large dataset, as you have no way to predict performance. SQL that works fine for 100 records can completely lock up a system with millions of records for hours. You really want to find that out in dev, not when you promote to prod.
Calling a stored proc in a trigger is usually a very bad choice. It also means that you have to loop through records, which is an even worse choice in a trigger. Triggers must always account for multiple-record inserts, updates, or deletes. If someone inserts 100,000 rows (not unlikely if you have 13,000,000 records), then looping through a record-based stored proc could take hours, lock the entire table, and cause all users to want to hunt down the developer and kill (or at least maim) him because they cannot get their work done.
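As a hedged sketch of the set-based alternative (the UserCounts table and its columns are hypothetical stand-ins for whatever the stored proc increments):

ALTER TRIGGER [OnItem_Updated]
ON [Item]
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    IF UPDATE(State)
    BEGIN
        -- One set-based UPDATE covers every affected row: no cursor,
        -- no per-row stored procedure call.
        UPDATE uc
        SET uc.StateChangeCount = uc.StateChangeCount + i.cnt
        FROM UserCounts uc
        JOIN (SELECT UserId, COUNT(*) AS cnt
              FROM inserted
              GROUP BY UserId) AS i
            ON i.UserId = uc.UserId;
    END
END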
I would not even consider putting this trigger on prod until you test against a record set similar in size to prod.
My friend Dennis wrote this article, which illustrates why testing with a small volume of information when you have a large volume of information can create difficulties on prod that you didn't notice on dev:
http://blogs.lessthandot.com/index.php/DataMgmt/?blog=3&title=your-testbed-has-to-have-the-same-volume&disp=single&more=1&c=1&tb=1&pb=1#c1210
Run DISABLE TRIGGER triggername ON tablename before altering the trigger, then re-enable it with ENABLE TRIGGER triggername ON tablename.