Is it ok to KILL this DELETE query? - sql

I ran a query to delete around 4 million rows from my database. It ran for about 12 hours before my laptop lost the network connection. At that point, I decided to take a look at the status of the query in the database. I found that it was in the suspended state. Specifically:
Start Time:       2018/08/15 11:28:39.490
SPID:             115
Database:         RingClone
Executing SQL:    *see below
Status:           suspended
Command:          DELETE
wait_type:        PAGEIOLATCH_EX
wait_time:        41
wait_resource:    5:1:1116111
last_wait_type:   PAGEIOLATCH_EX
*Here is the SQL query in question:
DELETE FROM T_INDEXRAWDATA WHERE INDEXRAWDATAID IN (SELECT INDEXRAWDATAID FROM T_INDEX WHERE OWNERID='1486836020')
After reading this:
https://dba.stackexchange.com/questions/87066/sql-query-in-suspended-state-causing-high-cpu-usage
I realize I probably should have broken this up into smaller pieces to delete them (or even delete them one-by-one). But now I just want to know if it is "safe" for me to KILL this query, as the answer in that post suggests. One thing the selected answer states is that "you may run into data consistency problems" if you KILL a query while it's executing. If it causes some issues with the data I am trying to delete, I'm not that concerned. However, I'm more concerned about this causing some issues with other data, or with the table structure itself.
Is it safe to KILL this query?

If you ran the delete from your laptop over the network and the connection to the server was lost, you can either kill the SPID or wait for it to disappear on its own. Depending on the @@version of your SQL Server instance, in particular how well it's patched, the latter might require an instance restart.
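If you do decide to kill it, a minimal sketch (115 is the SPID from the status output above; substitute whatever SPID you actually see):
KILL 115;                  -- terminates the session; SQL Server then rolls back the DELETE
KILL 115 WITH STATUSONLY;  -- run afterwards to see the estimated rollback progress
Keep in mind the rollback itself can take a long time for a delete of this size, and it cannot be skipped.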
Regarding the consistency issues, I think you're misunderstanding them. They are possible only if you had multiple statements running in a single batch without being wrapped in a transaction. As I understand it, you had a single statement; if that's the case, don't worry about consistency. SQL Server wouldn't have become what it is today if it were that easy to corrupt its data.
I would have rewritten the query, however: if the T_INDEX.INDEXRAWDATAID column contains NULLs, you can run into issues. It's better to rewrite it as a join, and to add batch splitting while you're at it:
while 1=1
begin
    delete top (10000) t
    from T_INDEXRAWDATA t
    inner join T_INDEX i on t.INDEXRAWDATAID = i.INDEXRAWDATAID
    where i.OWNERID = '1486836020';

    if @@rowcount = 0
        break;

    checkpoint;   -- under the simple recovery model this lets log space be reused between batches
end;
It definitely will not be any slower, but it can boost performance, depending on your schema, your data, and the state of any indexes the tables have.

Related

Understanding locks and query status in Snowflake (multiple updates to a single table)

While using the Python connector for Snowflake with queries of the form
UPDATE X.TABLEY SET STATUS = %(status)s, STATUS_DETAILS = %(status_details)s WHERE ID = %(entry_id)s
I sometimes get the following message:
(snowflake.connector.errors.ProgrammingError) 000625 (57014): Statement 'X' has locked table 'XX' in transaction 1588294931722 and this lock has not yet been released.
and soon after that
Your statement 'X' was aborted because the number of waiters for this lock exceeds the 20 statements limit
This usually happens when multiple queries are trying to update a single table. What I don't understand is that the query history in Snowflake says the query finished successfully (Succeeded status), but in reality the UPDATE never happened, because the table did not change.
So according to https://community.snowflake.com/s/article/how-to-resolve-blocked-queries I used
SELECT SYSTEM$ABORT_TRANSACTION(<transaction_id>);
to release the lock, but still nothing happened: even with the Succeeded status, the query seems not to have executed at all. So my question is, how does this really work, and how can a lock be released without losing the execution of the query? (Also, what happens to the other 20+ queries that are queued because of the lock? Sometimes it seems that when the lock is released, the next one takes the lock and has to be aborted as well.)
I would appreciate it if you could help me. Thanks!
I'm not sure if Sergio got an answer to this. The problem in this case is not with the table. Based on my experience with Snowflake, below is my understanding.
In Snowflake, every table operation also involves a change to a metadata table that keeps track of micro-partitions and their min/max values. This metadata table supports only 20 concurrent DML statements by default, so if a table is continuously being updated and the same partition keeps getting hit, there is a chance this limit will be exceeded. In that case, we should look at redesigning the table's update/insert logic. In one of our use cases, we increased the limit to 50 after speaking to the Snowflake support team.
UPDATE, DELETE, and MERGE cannot run concurrently on a single table; they will be serialized, as only one can take a lock on a table at a time. The others will queue up in the "blocked" state until it is their turn to take the lock. There is a limit on the number of queries that can be waiting on a single lock.
If you see an update finish successfully but don't see the updated data in the table, then you are most likely not COMMITting your transactions. Make sure you run COMMIT after an update so that the new data is committed to the table and the lock is released.
Alternatively, you can make sure AUTOCOMMIT is enabled so that DML will commit automatically after completion. You can enable it with ALTER SESSION SET AUTOCOMMIT=TRUE; in any sessions that are going to run an UPDATE.
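A minimal sketch of the explicit-commit pattern, using placeholder values in place of the ones from the question:
BEGIN;                                       -- or BEGIN TRANSACTION
UPDATE X.TABLEY
SET STATUS = 'DONE', STATUS_DETAILS = 'ok'   -- hypothetical values
WHERE ID = 42;
COMMIT;                                      -- releases the lock and makes the change visible
With AUTOCOMMIT=TRUE the COMMIT happens implicitly after each successful statement.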

Any non-SELECT queries don't run in Oracle

So, I can successfully run any SELECT statement, but any UPDATE statement just hangs until it eventually times out. This occurs when trying to execute any stored procedure as well. Other users that connect to the database can run anything without running into this problem.
Is there a cache per user that I can dump or something along those lines? I usually get sick of waiting and cancel the operation, so I don't know if that has contributed to the problem or not.
Just for reference, it's things as simple as these:
UPDATE SOME_TABLE
SET SOME_COLUMN = 'TEST';
EXECUTE SOME_PROCEDURE(1234);
But this works:
SELECT * FROM SOME_TABLE; -- various WHERE clauses don't cause any problems.
UPDATE:
Probably a little disappointing for anyone who came here looking for an answer to a similar problem, but the issue ended up being twofold. First, the DBA didn't think it was important to give me many details, but there were limitations on the Oracle server that were intentionally set for procedures in general (temp space issues, and things of that ilk). Second, there was an update to the procedure that I wasn't aware of, which ran a sub-query for every record pulled in the query (thousands of records). That was removed and now it's running as expected.
In my experience this happens most often because there is another uncommitted operation on the table. For example: user 1 successfully issues an update but does not commit it or roll it back; user 2 (or even another session of user 1) issues another update, which just hangs until the other pending update is committed or rolled back. You say that "other users" don't have the same problem, which makes me wonder if they are committing their changes, and if so, whether they are updating the same table or a different one.
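A quick way to check this theory, assuming you have privileges to query V$SESSION (exact output varies by Oracle version):
-- sessions that are currently blocked, and who is blocking them
SELECT sid, serial#, username, blocking_session, seconds_in_wait, event
FROM v$session
WHERE blocking_session IS NOT NULL;
If your hanging UPDATE shows up here with a BLOCKING_SESSION, that session is holding an uncommitted transaction against the table.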

Master database DB STARTUP problem

I have a SQL Server 2008 database and I have a problem with this database that I don't understand.
The steps that caused the problems are:
I ran a SQL query to update a table called authors from another table called authorAff
The authors table has 123,385,300 records and the authorsAff table has 139,036,077
The query ran for about 7 days but didn't finish
I decided to cancel the query to do it another way.
The connection I was running the query on dropped suddenly, so the database went into recovery while the query was being cancelled
The server was shut down many times afterwards because of some electricity problems
The database recovered after about two days.
Now when I run this query
SELECT TOP 1000 *
FROM AUTHORS WITH(READUNCOMMITTED)
It executes and returns the results, but when I remove the WITH(READUNCOMMITTED) hint it gets blocked by a process running on the master database that appears in the Activity Monitor only with the command [DB STARTUP], and no results show up.
So what is the DB STARTUP command, and if it's a problem, how can I solve it?
Thank you in advance.
I suspect that your user database is still trying to roll back the transaction that you canceled. A general rule of thumb is that it will take about the same amount of time, or more, for an aborted transaction to roll back as it took to run.
The rollback can't be avoided, even with the SQL Server stops and starts you had.
The reason you can run a query WITH(READUNCOMMITTED) is that it ignores the locks associated with the transaction that is rolling back. Your query results are considered unreliable, but ironically the results are probably what you want to see, since the blocking process is a rollback.
The best solution is to wait it out, if you can afford to do so. You may be able to find ways to kill the blocking process, but then you should be concerned with database integrity.
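If you just want to see how far along the rollback/recovery is, a sketch (the command value may appear as DB STARTUP or KILLED/ROLLBACK depending on how the rollback is happening):
SELECT session_id, command, percent_complete,
       estimated_completion_time / 60000.0 AS est_minutes_left
FROM sys.dm_exec_requests
WHERE command IN ('DB STARTUP', 'KILLED/ROLLBACK');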

MS SQL Server 2005 - Stored Procedure "Spontaneously Breaks"

A client has reported repeated instances of very strange behaviour when executing a stored procedure.
They have code which runs off a cached transposition of a volatile dataset. A stored proc was written to reprocess the dataset on demand if:
1. The dataset had changed since the last reprocessing
2. The dataset has been unchanged for 5 minutes
(The second condition stops massive repeated recalculation during times of change.)
This worked fine for a couple of weeks, the SP was taking 1-2 seconds to complete the re-processing, and it only did it when required. Then...
The SP suddenly "stopped working" (it just kept running and never returned)
We changed the SP in a subtle way and it worked again
A few days later it stopped working again
Someone then said "we've seen this before, just recompile the SP"
With no change to the code we recompiled the SP, and it worked
A few days later it stopped working again
This has now repeated many, many times. The SP suddenly "stops working", never returning and the client times out. (We tried running it through management studio and cancelled the query after 15 minutes.)
Yet every time we recompile the SP, it suddenly works again.
I haven't yet tried WITH RECOMPILE on the appropriate EXEC statements, but I don't particularly want to do that anyway. It gets called hundreds of times an hour and normally does nothing (it only reprocesses the data a few times a day). If possible I want to avoid the overhead of recompiling what is a relatively complicated SP just to avoid something which "shouldn't" happen...
Has anyone experienced this before?
Does anyone have any suggestions on how to overcome it?
Cheers,
Dems.
EDIT:
The pseudo-code would be as follows:
read "a" from table_x
read "b" from table_x
If (a < b) return
BEGIN TRANSACTION
DELETE table_y
INSERT INTO table_y <3 selects unioned together>
UPDATE table_x
COMMIT TRANSACTION
The selects are "not pretty", but when executed in-line they run in no time, including when the SP refuses to complete. And the profiler shows it is the INSERT at which the SP "stalls".
There are no parameters to the SP, and sp_lock shows nothing blocking the process.
This is the footprint of parameter sniffing. Yes, the first step is to try RECOMPILE, though it doesn't always work the way that you want it to on 2005.
Update:
I would try a statement-level RECOMPILE on the INSERT anyway, as this might be a statistics problem (oh yeah, check that automatic statistics updating is on).
If this does not seem to fit parameter sniffing, then compare the actual query plan from when it works correctly with the one from when it runs forever (use the estimated plan if you cannot get the actual, though actual is better). You are looking to see whether the plan changes or not.
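A minimal sketch of the statement-level recompile idea, with hypothetical tables standing in for table_y and its sources:
INSERT INTO dbo.table_y (id, val)
SELECT id, val FROM dbo.source_a
UNION ALL
SELECT id, val FROM dbo.source_b
OPTION (RECOMPILE);   -- recompiles just this statement on each run instead of the whole SP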
I totally agree with the parameter-sniffing diagnosis. If you have input parameters to the SP which vary (or even if they don't), be sure to mask them with local variables and use the local variables in the SP.
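A sketch of that masking pattern, with a hypothetical procedure and parameter:
CREATE PROCEDURE dbo.usp_Reprocess
    @OwnerId INT                     -- hypothetical input parameter
AS
BEGIN
    DECLARE @LocalOwnerId INT;
    SET @LocalOwnerId = @OwnerId;    -- the optimizer can no longer sniff the caller's value

    SELECT *
    FROM dbo.SomeTable
    WHERE OwnerId = @LocalOwnerId;
END;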
You can also use the WITH RECOMPILE if the set is changing but the query plan is no longer any good.
In SQL Server 2008, you can use the OPTIMIZE FOR UNKNOWN feature.
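For example, with the same hypothetical names as above, the hint version looks like this:
SELECT *
FROM dbo.SomeTable
WHERE OwnerId = @OwnerId
OPTION (OPTIMIZE FOR (@OwnerId UNKNOWN));   -- plan is built from average statistics rather than the sniffed value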
Also, if your process involves populating a table and then using that table in another operation, I recommend breaking the process up into separate SPs and calling them individually WITH RECOMPILE. I think the plans generated at the outset of the process can sometimes be very poor (so poor as not to complete) when you populate a table and then use its contents to carry out an operation, because at the time of the initial plan the table looked a lot different than it does after the initial insert.
As others have said, something about the way the data or the source table statistics are changing is causing the cached query plan to go stale.
WITH RECOMPILE will probably be the quickest fix - use SET STATISTICS TIME ON to find out what the recompilation cost actually is before dismissing it out of hand.
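A quick way to measure that, using a hypothetical procedure name:
SET STATISTICS TIME ON;
EXEC dbo.usp_Reprocess WITH RECOMPILE;   -- compile cost shows up as "SQL Server parse and compile time"
SET STATISTICS TIME OFF;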
If that's still not an acceptable solution, the best option is probably to try to refactor the insert statement.
You don't say whether you're using UNION or UNION ALL in your insert statement. I've seen INSERT INTO with UNION produce some bizarre query plans, particularly on pre-SP2 versions of SQL 2005.
Raj's suggestion of dropping and recreating the target table with SELECT INTO is one way to go.
You could also try selecting each of the three source queries into its own temporary table, then UNION those temp tables together in the insert.
Alternatively, you could try a combination of these suggestions - put the results of the union into a temporary table with SELECT INTO, then insert from that into the target table.
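A sketch of the temp-table variant, with hypothetical source tables standing in for the three unioned selects:
SELECT id, val
INTO #staging                        -- SELECT INTO creates the temp table
FROM dbo.source_a
UNION ALL
SELECT id, val FROM dbo.source_b
UNION ALL
SELECT id, val FROM dbo.source_c;

INSERT INTO dbo.table_y (id, val)
SELECT id, val FROM #staging;

DROP TABLE #staging;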
I've seen all of these approaches resolve performance problems in similar scenarios; testing will reveal which gives the best results with the data you have.
Obviously changing the stored procedure (by recompiling) changes the circumstances that led to the lock.
Try to log the progress of your SP as described here or here.
I would agree with the answer given above in a comment: this sounds like an unclosed transaction, particularly if you are still able to run the select statement from query analyser.
Sounds very much like there is an open transaction with a pending delete for table_y and the insert can't happen at this point.
When your SP locks up, can you perform an insert into table_y?
Do you have an index maintenance job?
Are your statistics up to date? One way to tell is to examine the estimated and actual query plans for large variations.
As others have said, this sounds very likely to be an uncommitted transaction.
My best guess:
You'll want to make sure that table_y can be deleted completely and quickly.
If there are other stored procedures or external pieces of code that ever hold transactions on this table, you may be waiting forever (they may error out and never close the transaction).
Another note: try using TRUNCATE if possible. It uses fewer resources than a DELETE with no WHERE clause:
truncate table table_y
Also, once an error happens within your OWN transaction, it will cause all following calls (every 5 minutes apparently) to "hang", unless you handle your error:
begin tran
begin try
    -- do normal stuff
    commit            -- commit inside the try block
end try
begin catch
    if @@trancount > 0
        rollback      -- only roll back if the transaction is still open
    -- log or re-raise the error here if needed
end catch
The very first error is what will give you information about the actual error. Seeing it hang in your own subsequent tests is just a secondary effect.
If you are doing these steps:
DELETE table_y
INSERT INTO table_y <3 selects unioned together>
You might want to try this instead
DROP TABLE table_y
SELECT INTO table_y <3 selects unioned together>
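Note that with SELECT INTO the INTO clause sits on the first SELECT of the union; a sketch with hypothetical sources:
SELECT id, val
INTO dbo.table_y                     -- recreates table_y from the combined result set
FROM dbo.source_a
UNION ALL
SELECT id, val FROM dbo.source_b;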

Mass Updates and Commit frequency in SQL Server

My database background is mainly Oracle, but I've recently been helping with some SQL Server work. My group has inherited some SQL server DTS packages that do daily loads and updates of large amounts of data. Currently it is running in SQL Server 2000, but will soon be upgraded to SQL Server 2005 or 2008. The mass updates are running too slowly.
One thing I noticed about the code is that some large updates are done in procedural code in loops, so that each statement only updates a small portion of the table in a single transaction. Is this a sound method to do updates in SQL server? Locking of concurrent sessions should not be an issue because user access to tables is disabled while the bulk loading takes place. I've googled around some, and found some articles suggesting that doing it this way conserves resources, and that resources are released each time an update commits, leading to greater efficiency. In Oracle this is generally a bad approach, and I've used single transactions for very large updates with success in Oracle. Frequent commits slow the process down and use more resources in Oracle.
My question is, for mass updates in SQL Server, is it generally a good practice to use procedural code, and commit many SQL statements, or to use one big statement to do the whole update?
Sorry Guys,
None of the above answers the question; they are just examples of how you can do things. The answer is: more resources get used with frequent commits; however, the transaction log cannot be truncated until a commit point. Thus, if your single spanning transaction is very big, it will cause the transaction log to grow and possibly fragment, which, if undetected, will cause problems later. Also, in a rollback situation, the duration is generally twice as long as the original transaction. So if your transaction fails after half an hour, it will take an hour to roll back and you can't stop it :-)
I have worked with SQL Server 2000/2005, DB2, and ADABAS, and the above is true for all of them. I don't really see how Oracle can work differently.
You could possibly replace the T-SQL with a bcp command and there you can set the batch size without having to code it.
Issuing frequent commits in a single table scan is preferable to running multiple scans with small processing numbers, because generally if a table scan is required the whole table will be scanned, even if you are only returning a small subset.
Stay away from snapshots. A snapshot will only increase the number of IOs and compete for IO and CPU.
In general, I find it better to update in batches - typically in the range of 100 to 1000. It all depends on how your tables are structured: foreign keys? Triggers? Or just updating raw data? You need to experiment to see which scenario works best for you.
If I am in pure SQL, I will do something like this to help manage server resources:
SET ROWCOUNT 1000
WHILE 1=1 BEGIN
    DELETE FROM MyTable WHERE ...
    IF @@ROWCOUNT = 0
        BREAK
END
SET ROWCOUNT 0
In this example, I am purging data. This would only work for an UPDATE if you could restrict or otherwise selectively update rows (or only insert xxxx number of rows into an auxiliary table that you can JOIN against).
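For example, a batched UPDATE in the same style, assuming a hypothetical Processed flag that lets each batch skip the rows already handled:
SET ROWCOUNT 1000
WHILE 1=1 BEGIN
    UPDATE MyTable
    SET Processed = 1
    WHERE Processed = 0      -- anything that excludes already-updated rows works here
    IF @@ROWCOUNT = 0
        BREAK
END
SET ROWCOUNT 0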
But yes, try not to update xx million rows at one time. It takes forever and if an error occurs, all those rows will be rolled back (which takes an additional forever.)
Well everything depends.
But ... assuming your db is in single-user mode or you have table locks (TABLOCKX) against all the tables involved, batches will probably perform worse, especially if the batches force table scans.
The one caveat is that very complex queries will quite often consume resources in tempdb; if tempdb runs out of space (because the execution plan required a nasty complicated hash join) you are in deep trouble.
Working in batches is a general practice that is quite often used in SQL Server (when it's not in snapshot isolation mode) to increase concurrency and avoid huge transaction rollbacks because of deadlocks (you tend to get deadlocks galore when updating a 10-million-row table that is active).
When you move to SQL Server 2005 or 2008, you will need to redo all those DTS packages in SSIS. I think you will be pleasantly surprised to see how much faster SSIS can be.
In general, in SQL Server 2000 you want to run things in batches of records if the whole set ties up the table for too long. If you are running the packages at night when there is no use of the system, you may be able to get away with a set-based insert of the entire dataset. Row-by-row is always the slowest method, so avoid that if possible as well (especially if all the row-by-row inserts are in one giant transaction!). If you have 24-hour access with no downtime, you will almost certainly need to run in batches.