Nervous Running Queries (SQL Server): What do you think about this?

I'm a frequent SQL Server Management Studio user. Sometimes I'm in situations where I have an update or delete query to run, but I'm afraid some typo or logic error on my part is going to cause me to make undesired, massive changes to a table (like change 1000 rows when I meant to change 2).
In the past, I would just clench my fists and hold my breath, but then I wondered if I could do something like this before running a possibly catastrophic query:
1) Run below
begin transaction
(my update/insert/delete statement I want to run)
2) If I'm satisfied, call:
commit transaction
3) Or, if I've fouled something up, just call:
rollback transaction
Is my idea sound, or am I missing something fundamental? I know I could always restore my database, but that seems like overkill compared to above.
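For concreteness, a minimal sketch of the pattern I mean (the table, column, and predicate below are made up):
BEGIN TRANSACTION

UPDATE dbo.Customers            -- hypothetical table
SET Status = 'Inactive'         -- hypothetical change
WHERE CustomerId = 42           -- hypothetical predicate

SELECT @@ROWCOUNT AS RowsAffected   -- should match what I expected
SELECT * FROM dbo.Customers WHERE CustomerId = 42   -- eyeball the result

-- then run exactly one of these by hand:
-- COMMIT TRANSACTION
-- ROLLBACK TRANSACTION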
EDITS:
1) I agree with testing on a test site before doing anything, but there's still a chance of a problem happening on the production server. Maybe some condition is true on the test server that's not true on production.
2) I'm also used to writing my WHERE clause first, or running a SELECT with that WHERE first to ensure I'm isolating the correct rows, but again, something can always go wrong.

Run your WHERE statement as SELECT before you run it as UPDATE or DELETE
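For example, a quick sketch (table and column names are made up):
-- first, check what the predicate matches
select * from dbo.Orders where OrderDate < '20090101'

-- then reuse the identical WHERE for the real statement
update dbo.Orders set Archived = 1 where OrderDate < '20090101'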

Yes, you absolutely can do this. Be aware that you are putting a lock on the table(s) in question, which might interfere with other database activity.

This particular statement has saved my butt at least twice.
SELECT * INTO Table2_Backup FROM Table1
I also agree wholeheartedly with Manu. SELECT before UPDATE or DELETE
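If something does go wrong, that backup copy lets you put values back with a join. A rough sketch, assuming Table1 has a key column named Id and a column SomeColumn to restore:
update t
set t.SomeColumn = b.SomeColumn
from Table1 t
join Table2_Backup b on b.Id = t.Id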

Sounds pretty good to me - I basically use this default try/catch query for most of my heavy-lifting; works just as you sketched out, plus it gives you error info if something does go wrong:
BEGIN TRANSACTION
BEGIN TRY
    -- do your work here
    COMMIT TRANSACTION
END TRY
BEGIN CATCH
    SELECT
        ERROR_NUMBER() AS ErrorNumber,
        ERROR_SEVERITY() AS ErrorSeverity,
        ERROR_STATE() AS ErrorState,
        ERROR_PROCEDURE() AS ErrorProcedure,
        ERROR_LINE() AS ErrorLine,
        ERROR_MESSAGE() AS ErrorMessage
    ROLLBACK TRANSACTION
END CATCH
Marc

The most frequent cause of this fear is being forced to work on production databases by hand. If that's the case, it might be better to get some dev boxes. If not, I think you're fine...

One other thing you can do, though it will take practice: always write your WHERE clause first, so you never have to worry about running an UPDATE or DELETE on all rows.

Here is what I do when writing a script to be run from a query window, which I have done to fix bad data sent to us by clients in an import (I always do this on dev first):
begin tran
delete mt
--select *
from mytable mt where <some condition>
--commit tran
--rollback tran
begin tran
update mt
set myfield = replace(myfield, 'some random text', 'some other random text')
--select myid, myfield, replace(myfield, 'some random text', 'some other random text')
from mytable mt where <some condition>
--commit tran
--rollback tran
Note that this way, I can run the select part of the query first to see the records that will be affected and how. Note that the where clause is on the same line as the table (or the last join if I had multiple joins); this prevents the "oops, I forgot to highlight the whole thing" problem. Note that the delete uses an alias, so if you run just the first line by accident, you don't delete the whole table.
I've found it best to write scripts so they can be run without highlighting (except when I highlight just the select part to see what records I'm affecting). Save the script in source control. If you have tested the script by running it on dev and QA, you should be fine to run it on prod without the selects. If you are doing a large update or delete on a table, I almost always copy those records to a work table first so that I can go back immediately if there is a problem.

If you want to be (or have to be) really paranoid about this, you should have some kind of log table for old/new vals (table/column/old val /new val; maybe user and a timestamp) and fill that with a trigger on insert/update/delete. It's really a hassle to restore some old values from this, but it may be helpful if all else goes horribly wrong. There is a pretty big performance impact, though.
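A minimal sketch of such an update trigger (the table, key, column, and log-table names here are invented; a real version would also cover inserts and deletes and more columns):
create trigger trg_MyTable_Audit on dbo.MyTable
after update
as
begin
    -- record old and new values for the column we care about
    insert into dbo.ChangeLog (TableName, ColumnName, OldVal, NewVal, ChangedBy, ChangedAt)
    select 'MyTable', 'MyField', d.MyField, i.MyField, suser_sname(), getdate()
    from inserted i
    join deleted d on d.MyId = i.MyId
    where isnull(d.MyField, '') <> isnull(i.MyField, '')
end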
SAP uses this approach (called change docs in SAP parlance) for changes made through its GUI, and gives programmers a way to hook into this for changes done through "programs" (although you have to explicitly call it).


What is wrong with the below T-SQL code

BEGIN TRAN
SELECT * FROM AnySchema.AnyTable
WHERE AnyColumn = SomeCondition
COMMIT
I know the transaction is not required here because it is just a select, but I just want to know how bad a practice it is and whether it is going to be an overhead on the DB engine.
You may use a transaction around SELECT statements (with an appropriate isolation level such as REPEATABLE READ or SERIALIZABLE) to ensure nobody else can update or delete records of the table while your batch of select queries is executing.
Using WITH(NOLOCK):
Anyways, you may also use WITH (NOLOCK) in T-SQL:
SELECT * FROM AnySchema.AnyTable WITH(NOLOCK)
WHERE AnyColumn = SomeCondition
WITH (NOLOCK) is the equivalent of using READ UNCOMMITTED as a transaction isolation level. Here you run the risk of reading an uncommitted row that is subsequently rolled back, i.e. data that never made it into the database.
So, while it can prevent reads being deadlocked by other operations, it comes with a risk.
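For reference, the session-level equivalent of sprinkling the hint on every table is the isolation level itself:
set transaction isolation level read uncommitted

select * from AnySchema.AnyTable
where AnyColumn = SomeCondition

-- back to the default afterwards
set transaction isolation level read committed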
Using a TRANSACTION block:
Using a TRANSACTION block will not cause much extra load on the DB, but if you keep this practice up and at some point forget (you / your developers may forget, right?) to close a transaction, then other processes can't work on the same table.
Anyway, it depends on what type of application you have. If updates and selects are very frequent, it is advisable not to use such transaction blocks. If updates and selects are moderate and occur next to each other, you may use transaction blocks for selects (but make sure to close the transaction).
That's actually a good question. To understand what's going on, you need to know about SET IMPLICIT_TRANSACTIONS.
When ON, the system is in implicit transaction mode. This means that if @@TRANCOUNT = 0, any of a certain set of Transact-SQL statements (listed in the linked documentation) begins a new transaction. It is equivalent to an unseen BEGIN TRANSACTION being executed first.
When OFF, each of those statements is bounded by an unseen BEGIN TRANSACTION and an unseen COMMIT TRANSACTION statement. When OFF, we say the transaction mode is autocommit. [This is the default.]
If your T-SQL code visibly issues a BEGIN TRANSACTION, we say the transaction mode is explicit.
https://msdn.microsoft.com/en-us/library/ms187807.aspx
Since SQL Server would have created a transaction for you anyway, doing it manually doesn't actually change anything. The exact same thing would have happened either way.
Summary: What you are doing isn't 'wrong', because it has no effect, but it is unnecessary and confusing to the reader.
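A quick way to see the two modes side by side in a scratch query window (just a sketch; any table will do):
set implicit_transactions on
select top 1 * from AnySchema.AnyTable   -- implicitly opens a transaction
select @@trancount                       -- returns 1
commit                                   -- you have to end it yourself

set implicit_transactions off            -- back to autocommit, the default
select top 1 * from AnySchema.AnyTable
select @@trancount                       -- returns 0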
I think you would be taking a holdlock and most likely a tablock
table hints
That is not always a good thing, as you would block any updates or deletes (maybe even inserts).
It would be better to let SQL Server decide what level of locks to take. Most likely page locks. I would stay away from NOLOCK as bad stuff can happen.
On a select on a single table, just let the optimizer do its thing.

How to test your query first before running it (SQL Server)

I made a silly mistake at work once on one of our in-house test databases. I was updating a record I had just added because I made a typo, but it resulted in many records being updated, because in the WHERE clause I used the foreign key instead of the unique id of the record I had just added.
One of our senior developers told me to do a select to test what rows it will affect before actually editing them. Besides this, is there a way I can execute my query and see the results, but not have it commit to the db until I tell it to do so? Next time I might not be so lucky. It's a good job only senior developers can do live updates!
It seems to me that you just need to get into the habit of opening a transaction:
BEGIN TRANSACTION;
UPDATE [TABLENAME]
SET [Col1] = 'something', [Col2] = '..'
OUTPUT DELETED.*, INSERTED.* -- So you can see what your update did
WHERE ....;
ROLLBACK;
Then you just run it again after seeing the results, changing ROLLBACK to COMMIT, and you are done!
If you are using Microsoft SQL Server Management Studio, you can go to Tools > Options... > Query Execution > ANSI > SET IMPLICIT_TRANSACTIONS, and SSMS will open the transaction automatically for you. Just don't forget to commit when you must, and remember that you may be blocking other connections until you commit / rollback or close the connection.
First, assume you will make a mistake when updating a db, so never do it unless you know how to recover; if you don't, don't run the code until you do.
The most important idea is that it is a dev database; expect it to get messed up, so make sure you have a quick way to reload it.
Doing a select first is always a good idea to see which rows are affected.
However, for a quicker way back to a good state of the database (which I would do anyway):
For a simple update etc., use transactions.
Do a begin transaction, then do all the updates etc., then select to check the data.
The database will not be affected, as far as others can see, until you do a final commit, which you only do when you are sure all is correct; or do a rollback to get back to the state at the beginning.
If you must test in a production database and you have the requisite permissions, then write your queries to create and use temporary tables that are similar in name to the production tables and whose schema, other than index names, is identical. Index names are unique across a database, at least on Informix.
Then run your queries and look at the data.
Other than that, IMHO you need a development database, and perhaps even a development server with a development instance. That's paranoid advice, but you'd have to be very careful, even if you were allowed -- MS SQLSERVER lingo here -- a second instance on the same server.
I can reload our test database at will, and that's why we have a test system. Our production system contains citizens' tax payments and other information that cannot be harmed, "or else".
For our production data changes, we always ensure that we use a BEGIN TRAN and a ROLLBACK TRAN, and that all statements have an OUTPUT clause. This way we can run the script first (usually in a copy of the PRODUCTION db) and see what is affected before changing the ROLLBACK TRAN to COMMIT TRAN.
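A sketch of that pattern, capturing the OUTPUT rows into a table variable so the before and after values can be reviewed before deciding (all names below are made up):
begin tran

declare @changes table (Id int, OldStatus varchar(20), NewStatus varchar(20))

update dbo.Orders
set Status = 'Cancelled'
output deleted.Id, deleted.Status, inserted.Status into @changes
where OrderDate < '20090101'

select * from @changes   -- review the before/after values

rollback tran             -- change to COMMIT TRAN once you are satisfied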
Have you considered EXPLAIN?
If there is a mistake in the command, it will report it just as a normal command would.
But if there are no mistakes, it will not run the command; it will just explain it.
Example of a "passed" test:
testdb=# explain select * from sometable ;
QUERY PLAN
------------------------------------------------------------
Seq Scan on sometable (cost=0.00..12.60 rows=260 width=278)
(1 row)
Example of a "failed" test:
testdb=# explain select * from sometaaable ;
ERROR: relation "sometaaable" does not exist
LINE 1: explain select * from sometaaable ;
It also works with insert, update and delete (i.e. the "dangerous" ones).
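The example above is PostgreSQL syntax; in SQL Server the closest equivalent is the estimated execution plan, which compiles the statement (so missing tables or columns are reported) without executing it. A sketch, with a made-up table:
set showplan_xml on
go

update dbo.SomeTable set SomeCol = 'x' where Id = 1   -- compiled and planned, but not executed
go

set showplan_xml off
go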

Using a rolled-back transaction for testing

I just discovered the idea of testing a stored proc by calling it from within a BEGIN TRAN t1 ROLLBACK TRAN t1 pair.
I am a bit afraid of this. Is that a common practice? Is it reliable?
My goal here is to quickly test a stored proc that reads and updates 2 databases (same server). The SP does not do any truncate but uses a table variable combined with an INSERT .. OUTPUT statement.
The volume will be low (less than 1000 rows affected).
Thanks
There are a few things that can go wrong:
The proc could do its own transaction management
It could execute non-transactable statements like CREATE DATABASE
It could have an error, causing the transaction to automatically rollback. If the proc then continues to run in some way, it might write stuff outside of a transaction
XACT_ABORT might be used inconsistently, causing the previously mentioned effect
In general, this is a good technique, though.
Truncate is transacted, btw.
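A typical harness for this kind of test looks roughly like this (the proc name is hypothetical; the @@TRANCOUNT check guards against the proc having committed or rolled back on its own):
begin tran

exec dbo.MyProcUnderTest        -- hypothetical proc name

select @@trancount as OpenTrans -- should still be 1; 0 means the proc committed or rolled back itself

-- inspect the affected tables here, then throw everything away
if @@trancount > 0
    rollback tran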

How to rollback without transaction

I made a huge mistake, I executed this query:
update Contact set ContaPassword = '7FD736A3070CB9766'
I forgot the WHERE clause, so this way it updated the password for all users. :(
Is there a way I can recover the data before this query?
You cannot undo the change if you ran it outside of a BEGIN TRANSACTION / ROLLBACK. This is why I begin any sort of production data update with:
BEGIN TRANSACTION
-- report the bad or undesired data condition before-hand
SELECT ...
-- change the data
INSERT/UPDATE/DELETE ...
-- ensure we changed a reasonable number of records; may not be accurate if table has triggers
SELECT @@ROWCOUNT
-- get the data condition afterwards and be sure it looks good.
SELECT ...
-- always start with this enabled first
ROLLBACK
-- don't do this until you are very sure the change looks good
-- COMMIT
Martin Smith pointed out this excellent post by Brent Ozar on dba.stackexchange.com on this topic. In full recovery mode, it is possible to examine the log files to see what changed.
Also, as Oded pointed out, if you have backups, it is not hard to get back to the original data. You can restore the backup somewhere and copy the original data.
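A rough sketch of that recovery path, assuming a full backup exists and that Contact has a key column named ContactId (the database name, file paths, and logical file names below are placeholders; get the real logical names from RESTORE FILELISTONLY):
restore database ContactDb_Recovered
from disk = 'D:\Backups\ContactDb.bak'
with move 'ContactDb_Data' to 'D:\Data\ContactDb_Recovered.mdf',
     move 'ContactDb_Log' to 'D:\Data\ContactDb_Recovered.ldf'

-- copy the original column back into the live table
update c
set c.ContaPassword = r.ContaPassword
from Contact c
join ContactDb_Recovered.dbo.Contact r on r.ContactId = c.ContactId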

MS SQL Server 2005 - Stored Procedure "Spontaneously Breaks"

A client has reported repeated instances of very strange behaviour when executing a stored procedure.
They have code which runs off a cached transposition of a volatile dataset. A stored proc was written to reprocess the dataset on demand if:
1. The dataset had changed since the last reprocessing
2. The dataset has been unchanged for 5 minutes
(The second condition stops massive repeated recalculation during times of change.)
This worked fine for a couple of weeks, the SP was taking 1-2 seconds to complete the re-processing, and it only did it when required. Then...
The SP suddenly "stopped working" (it just kept running and never returned)
We changed the SP in a subtle way and it worked again
A few days later it stopped working again
Someone then said "we've seen this before, just recompile the SP"
With no change to the code we recompiled the SP, and it worked
A few days later it stopped working again
This has now repeated many, many times. The SP suddenly "stops working", never returning and the client times out. (We tried running it through management studio and cancelled the query after 15 minutes.)
Yet every time we recompile the SP, it suddenly works again.
I haven't yet tried WITH RECOMPILE on the appropriate EXEC statements, but I don't particularly want to do that anyway. It gets called hundreds of times an hour and normally does nothing (it only reprocesses the data a few times a day). If possible I want to avoid the overhead of recompiling what is a relatively complicated SP just to avoid something which "shouldn't" happen...
Has anyone experienced this before?
Does anyone have any suggestions on how to overcome it?
Cheers,
Dems.
EDIT:
The pseudo-code would be as follows:
read "a" from table_x
read "b" from table_x
If (a < b) return
BEGIN TRANSACTION
DELETE table_y
INSERT INTO table_y <3 selects unioned together>
UPDATE table_x
COMMIT TRANSACTION
The selects are "not pretty", but when executed in-line they execute in no time, including when the SP refuses to complete. And the profiler shows it is the INSERT at which the SP "stalls".
There are no parameters to the SP, and sp_lock shows nothing blocking the process.
This is the footprint of parameter sniffing. Yes, the first step is to try RECOMPILE, though it doesn't always work the way that you want it to on 2005.
Update:
I would try a statement-level RECOMPILE on the INSERT anyway, as this might be a statistics problem (oh yeah, check that automatic statistics updating is on).
If this does not seem to fit parameter sniffing, then compare the actual query plan from when it works correctly and from when it is running forever (use the estimated plan if you cannot get the actual, though actual is better). You are looking to see if the plan changes or not.
I totally agree with the parameter sniffing diagnosis. If you have input parameters to the SP which are varying (or even if they aren't varying) - be sure to mask them with a local variable and use the local variable in the SP.
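The masking trick looks roughly like this (the proc, parameter, table, and column names are invented for illustration):
create procedure dbo.ReprocessData
    @CutoffDate datetime
as
begin
    -- copy the parameter into a local variable so the optimizer
    -- cannot sniff the value the proc was first compiled with
    declare @LocalCutoff datetime
    set @LocalCutoff = @CutoffDate

    select *
    from dbo.SomeTable
    where LastChanged >= @LocalCutoff
end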
You can also use the WITH RECOMPILE if the set is changing but the query plan is no longer any good.
In SQL Server 2008, you can use the OPTIMIZE FOR UNKNOWN feature.
Also, if your process involves populating a table and then using that table in another operation, I recommend breaking the process up into separate SPs and calling them individually WITH RECOMPILE. I think the plans generated at the outset of the process can sometimes be very poor (so poor as not to complete) when you populate a table and then use the results of that table to carry out an operation. Because at the time of the initial plan, the table was a lot different than after the initial insert.
As others have said, something about the way the data or the source table statistics are changing is causing the cached query plan to go stale.
WITH RECOMPILE will probably be the quickest fix - use SET STATISTICS TIME ON to find out what the recompilation cost actually is before dismissing it out of hand.
If that's still not an acceptable solution, the best option is probably to try to refactor the insert statement.
You don't say whether you're using UNION or UNION ALL in your insert statement. I've seen INSERT INTO with UNION produce some bizarre query plans, particularly on pre-SP2 versions of SQL 2005.
Raj's suggestion of dropping and recreating the target table with SELECT INTO is one way to go.
You could also try selecting each of the three source queries into its own temporary table, then UNION those temp tables together in the insert.
Alternatively, you could try a combination of these suggestions: put the results of the union into a temporary table with SELECT INTO, then insert from that into the target table.
I've seen all of these approaches resolve performance problems in similar scenarios; testing will reveal which gives the best results with the data you have.
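For instance, a sketch of the temp-table variant, with the three source selects shown as placeholders:
-- stage the union result in a temp table first (source names are placeholders)
select Col1, Col2
into #staging
from (
    select Col1, Col2 from dbo.SourceA
    union all
    select Col1, Col2 from dbo.SourceB
    union all
    select Col1, Col2 from dbo.SourceC
) as src

begin transaction
    delete table_y
    insert into table_y (Col1, Col2)
    select Col1, Col2 from #staging
commit transaction

drop table #staging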
Obviously changing the stored procedure (by recompiling) changes the circumstances that led to the lock.
Try to log the progress of your SP as described here or here.
I would agree with the answer given above in a comment: this sounds like an unclosed transaction, particularly if you are still able to run the select statement from Query Analyzer.
Sounds very much like there is an open transaction with a pending delete for table_y and the insert can't happen at this point.
When your SP locks up, can you perform an insert into table_y?
Do you have an index maintenance job?
Are your statistics up to date? One way to tell is to examine the estimated and actual query plans for large variations.
As others have said, this sounds very likely to be an uncommitted transaction.
My best guess:
You'll want to make sure that table_y can be deleted completely and quickly.
If there are other stored procedures or external pieces of code that ever hold transactions on this table, you may be waiting forever. (They may error out and never close the transaction)
Another note: try using TRUNCATE if possible. It uses fewer resources than a DELETE with no WHERE clause:
truncate table table_y
Also, once an error happens within your OWN transaction, it will cause all following calls (every 5 minutes apparently) to "hang", unless you handle your error:
begin tran
begin try
    -- do normal stuff
    commit
end try
begin catch
    -- only roll back if the transaction is still open
    if @@trancount > 0
        rollback
end catch
The very first error is what will give you information about the actual error. Seeing it hang in your own subsequent tests is just a secondary effect.
If you are doing these steps:
DELETE table_y
INSERT INTO table_y <3 selects unioned together>
You might want to try this instead:
DROP TABLE table_y
SELECT * INTO table_y FROM (<3 selects unioned together>) AS src