Prevent update to non-existent rows - sql

At work we have a table to hold settings which essentially contains the following columns:
PARAMNAME
VALUE
Most of the time new settings are added but on rare occasions, settings are removed. Unfortunately this means that any scripts which might have previously updated this value will continue to do so despite the fact that the update results in "0 rows updated" and leads to unexpected behaviour.
This situation was picked up recently by a regression test failure but only after much investigation into why the data in the system was different.
So my question is: Is there a way to generate an error condition when an update results in zero rows updated?
Here are some options I have thought of, but none of them are really all that desirable:
PL/SQL wrapper which notices the failed update and throws an exception.
Not ideal as it doesn't stop anyone/a script from manually doing an update.
A trigger on the table which throws an exception.
Goes against our current policy of phasing out triggers.
Requires updating trigger every time a setting is removed and maintaining a list of obsolete settings (if doing exclusion).
Might have problems with mutating table (if doing inclusion by querying what settings currently exist).

A PL/SQL wrapper seems like the best option to me. Triggers are a great thing to phase out, with the exception of generating sequences and inserting history records.
If you're concerned about someone manually updating rather than using the PL/SQL wrapper, just restrict the user role so that it does not have UPDATE privileges on the table but has EXECUTE privileges on the procedure.

Not really a solution but a method to organize things a bit:
Create a separate table with the parameter definitions and link to that table from the parameter value table. Make the reference to the parameter definition required (nulls not allowed).
Definition table PARAMS (ID, NAME)
Actual settings table PARAM_VALUES (PARAM_ID, VALUE)
(changing your table structure is also a very effective way to evoke errors in scripts that have not been updated...)

May be you can use MERGE statement
here is a link for it
http://www.oracle-developer.net/display.php?id=203
The merge statement allows you to combine insert and update in the same query, so in case the desired row does not exist you may insert a record in a buffer table to indicate the the row does not exist or else you can update the required record
Hope it helps

Related

SQL continue executing queries after duplicate key violation

I have a situation where I want to insert a row if it doesn't exist, and to not insert it if it already does. I tried creating sql queries that prevented this from happening (see here), but I was told a solution is to create constraints and catch the exception when they're violated.
I have constraints in place already. My question is - how can I catch the exception and continue executing more queries? If my code looks like this:
cur = transaction.cursor()
#execute some queries that succeed
try:
cur.execute(fooquery, bardata) #this query might fail, but that's OK
except psycopg2.IntegrityError:
pass
cur.execute(fooquery2, bardata2)
Then I get an error on the second execute:
psycopg2.InternalError: current transaction is aborted, commands ignored until end of transaction block
How can I tell the computer that I want it to keep executing queries? I don't want to transaction.commit(), because I might want to roll back the entire transaction (the queries that succeeded before).
I think what you could do is use a SAVEPOINT before trying to execute the statement which could cause the violation. If the violation happens, then you could rollback to the SAVEPOINT, but keep your original transaction.
Here's another thread which may be helpful:
Continuing a transaction after primary key violation error
I gave an up-vote to the SAVEPOINT answer--especially since it links to a question where my answer was accepted. ;)
However, given your statement in the comments section that you expect errors "more often than not," may I suggest another alternative?
This solution actually harkens back to your other question. The difference here is how to load the data very quickly into the right place and format in order to move data around a single SELECT -and- is generic for any table you want to populate (so the same code could be used for multiple different tables). Here's a rough layout of how I would do it in pure PostgreSQL, assuming I had a CSV file in the same format of the table to be inserted into:
CREATE TEMP TABLE input_file (LIKE target_table);
COPY input_file FROM '/path/to/file.csv' WITH CSV;
INSERT INTO target_table
SELECT * FROM input_file
WHERE (<unique key field list>) NOT IN (
SELECT <unique key field list>
FROM target_table
);
Okay, this is a idealized example and I'm also glossing over several things (like reporting back the duplicates, pushing the data into the table via Python in-memory data, COPY from STDIN rather than via a file, etc.), but hopefully the basic idea is there and it's going to avoid much of the overhead if you expect more records to be rejected than accepted.

creating a table if it doesn't exist

Given the following:
if object_id('MyTable') is null create table MyTable( myColumn int )
Is it not possible that two separate callers could both evaluate object_id('MyTable') as null and so both attempt to create the table.
Obviously one of the two callers in that scenario would fail, but ideally no caller should fail, rather one should block and the other should create the table, then the blocked caller will see object_id('MyTable') as not null and proceed.
On what can I apply exclusive lock, such that I'm not locking more than is absolutely required?
After your initial check, use a try catch when creating the table, and if the error is that the table already exists, proceed, if not, you have a bigger problem
Usually CREATE TABLE is run from setup and installation scripts and is unreasonable to expect installation scripts to allow for concurrent installation from separate connections.
I recommend you use a session scopped app lock acquired at the beginning of your install/upgrade procedure, see sp_getapplock.
I don't think, you should be worrying about this.
DDL statements don't run under a transaction. Also, the 2nd caller will fail, if the table already was created by a call from the 1st caller.
I don't allow users to create tables. In general that's a bad practice. If they need to insert data, the table is already there. If you are worried about two people creating the same table are you also worried about whether their data is crossing? I don't know what your proc does but if it deos something like delte the records if the table exists and then insert, then you could have strange reults if two users were on at a the same time. In general though, if you are needing to creat ea table at run time , it is usually a sign that your design needs work.

MSSQL: Disable triggers for one INSERT

This question is very similar to SQL Server 2005: T-SQL to temporarily disable a trigger
However I do not want to disable all triggers and not even for a batch of commands, but just for one single INSERT.
I have to deal with a shop system where the original author put some application logic into a trigger (bad idea!). That application logic works fine as long as you don't try to insert data in another way than the original "administration frontend". My job is to write an "import from staging system" tool, so I have all data ready. When I try to insert it, the trigger overwrites the existing Product Code (not the IDENTITY numeric ID!) with a generated one. To generate the Code it uses the autogenerated ID of an insert to another table, so that I can't even work with the ##IDENTITY to find my just inserted column and UPDATE the inserted row with the actual Product Code.
Any way that I can go to avoid extremly awkward code (INSERT some random characters into the product name and then try to find the row with the random characters to update it).
So: Is there a way to disable triggers (even just one) for just one INSERT?
You may find this helpful:
Disabling a Trigger for a Specific SQL Statement or Session
But there is another problem that you may face as well.
If I understand the situation you are in correctly, your system by default inserts product code automatically(by generating the value).
Now you need to insert a product that was created by some staging system, and for that product its product code was created by the staging system and you want to insert it to the live system manually.
If you really have to do it you need to make sure that the codes generated by you live application in the future are not going to conflict with the code that you inserted manually - I assume they musty be unique.
Other approach is to allow the system to generate the new code and overwrite any corresponding data if needed.
You can disable triggers on a table using:
ALTER TABLE MyTable DISABLE TRIGGER ALL
But that would do it for all sessions, not just your current connection.. which is obviously a very bad thing to do :-)
The best way would be to alter the trigger itself so it makes the decision if it needs to run, whether that be with an "insert type" flag on the table or some other means if you are already storing a type of some sort.
Rather than disabling triggers can you not change the behaviour of the trigger. Add a new nullable column to the table in question called "insertedFromImport".
In the trigger change the code so that the offending bit of the trigger only runs on rows where "insertedFromImport" is null. When you insert your records set "insertedFromImport" to something non-null.
Disable the trigger, insert, commit.
SET IDENTITY_INSERT Test ON
GO
BEGIN TRAN
DISABLE TRIGGER trg_Test ON Test
INSERT INTO Test (MyId, MyField)
VALUES (999, 'foo')
ENABLE TRIGGER trg_Test ON Test
COMMIT TRAN
SET IDENTITY_INSERT Test OFF
GO
Can you check for SUSER_SNAME() and only run when in context of the administration frontend?
I see many things that could create a problem. First change the trigger to consider multiple record imports. That may probably fix your problem. DO not turn off the trigger as it is turned off for everyone not just you. If you must then put the database into single user user mode before you do it and do your task during off hours.
Next, do not under any circumstances ever use ##identity to get the value just inserted! USe scope_identity instead. ##identity will return the wrong value if there are triggers onthe table that also do inserts to other tables with identity fields. If you are using ##identity right now through your system (since we know your system has triggers), your abosolute first priority must be to immediately find and change all instances of ##identity in your code. You can have serious data integrity issues if you do not. This is a "stop all work until this is fixed" kind of problem.
As far as getting the information you just inserted back, consider creating a batchid as part of you insert and then adding a column called batchid (which is nullable so it won't affect other inserts)to the table. Then you can call back what you inserted by batchid.
If you insert using BULK INSERT, you can disable triggers just for the insert.
I'm pretty sure bulk insert will require a data file on the file system to import so you can't just use T-SQL.
To use BULK INSERT you need INSERT and ADMINISTRATOR BULK OPERATION permissions.
If you disable triggers or constraints, you'll also need ALTER TABLE permission.
If you are using windows authentication, your windows user will need read access from the file. if using Mixed Mode authentication, the SQl Server Service account needs read access from the file.
When importing using BULK IMPORT, triggers are disabled by default.
More information: http://msdn.microsoft.com/en-us/library/ms188365.aspx

validating data in database: sql vs code

My database schema has a 'locked' setting meaning that the entry can not be changed once it is set.
Before the locked flag is set we can update other attributes. So:
Would you check the locked flag in code and then update the entry
or
would it be better to combine that into a SQL query, if so, any examples?
EDIT: how would you combine the update & check into one SQL statement?
You should do both. The database should use an update trigger to decide if a row can be updated - this would prevent any one updating the row from the back tables accidentally. And the application should check to see if it should be able to update the rows and act accordingly.
"how would you combine the update & check into one SQL statement?"
update table
set ...
where key = ...
and locked ='N';
That would not raise an error, but would update 0 rows - something you should be able to test for after the update.
As for which is better, my view is that if this locked flag is important then:
you must check/enforce it in the database to ensure it is never violated by any access method
you may also check/enforce it in the application, if that is more user-friendly
I would check the locked flag in code and then (assuming the record isn't locked) run the update query, setting the locked flag in the query as well. That way it can be wrapped in a transaction and committed/rolled back all at once.
I would use a trigger if your DBMS offers this function to enforce the flag. If you set the flag, all updates fail.
Then you can create a special query which will update the flag. Your trigger can check what called the update, and allow the flag to flick back if necessary. That way whether the TSQL is nice or malicious, no one can update your row once the flag is set.
As a rule of thumb I would always prefer to check everything in code and consider writing the DB constraints after that. Having the DB perform consistency checks for you is cool and admittedly faster than doing it in your code but then you are putting some logic in the DB and have to pay some price. These constraints can become vendor-dependent and, worse, can you perform automated tests on these restrictions in your development environment?
So I'd consider putting this kind of checks in the DB as an added safety net but wouldn't rely on them alone.

What is the best way to maintain a LastUpdatedDate column in SQL?

Suppose I have a database table that has a timedate column of the last time it was updated or inserted. Which would be preferable:
Have a trigger update the field.
Have the program that's doing the insertion/update set the field.
The first option seems to be the easiest since I don't even have to recompile to do it, but that's not really a huge deal. Other than that, I'm having trouble thinking of any reasons to do one over the other. Any suggestions?
The first option can be more robust because the database will be maintaining the field. This comes with the possible overhead of using triggers.
If you could have other apps writing to this table in the future, via their own interfaces, I'd go with a trigger so you're not repeating that logic anywhere else.
If your app is pretty much it, or any other apps would access the database through the same datalayer, then I'd avoid that nightmare that triggers can induce and put the logic directly in your datalayer (SQL, ORM, stored procs, etc.).
Of course you'd have to make sure your time-source (your app, your users' pcs, your SQL server) is accurate in either case.
Regarding why I don't like triggers:
Perhaps I was rash by calling them a nightmare. Like everything else, they are appropriate in moderation. If you use them for very simple things like this, I could get on board.
It's when the trigger code gets complex (and expensive) that triggers start to cause lots of problems. They are a hidden tax on every insert/update/delete query you execute (depending on the type of trigger). If that tax is acceptable then they can be the right tool for the job.
You didn't mention 3. Use a stored procedure to update the table. The procedure can set timestamps as desired.
Perhaps that's not feasible for you, but I didn't see it mentioned.
As long as I'm using a DBMS in whose triggers I trust, I'd always go with the trigger option. It allows the DBMS to take care of as many things as possible, which is usually a good thing.
It work make sure under any circumstances that the timestamp column has the correct value. The overhead would be negligible.
The only thing that would be against triggers is portability. If that's not an issue, I don't think there is a question which direction to go.
I would say trigger just in case that someone uses something besides your app to update the table, you probably also want to have a LastUpdatedBy and use SUSER_SNAME() for that, this way you can see who did the update
I'm a proponent of stored procedures for everything. Your update proc could contain a GETDATE() for the column.
And I don't like triggers for this kind of update. Lack of visibility of triggers tends to cause confusion.
This sounds like business logic to me ... I would be more disposed to putting this in the code. Let the database manage the storage of data ... No more and no less.
Triggers are a blessing and a curse.
Blessing: You can use them to enable all kinds of custom constraint checking and data management without backend systems knowledge or changes.
Curse: You don't know whats happening behind your back. Concurrency issues/deadlocks by additional objects brought into transactions that were not origionally expected. Phantom behavior including session environment changes, unreliable rowcounts. Excessive triggering of conditions..additional hotspot/performance penalties.
The answer to this question (Update dates implicitly(trigger) or explicitly (code)) ususally weights heavily on context. For example if you are using last change date as an informational field you might want to only change it when a 'user' actually makes salient changes to a row vs an automated process that simply updates some sort of internal marker users don't care about.
If you are using the trigger for change synchronization or you have no control over code that is executing a trigger makes a lot more sense.
My advise on trigger use it to be careful. Most systems allow you to filter execution based on the operation and fields changed. Proper use of 'before' vs 'after' triggers can have a significant performance impacts.
Finally a few systems are capable of executing a single trigger on multiple changes (multiple rows effected within a transaction) your code should be prepared to apply itself as a bulk update to multiple rows.
Normally I'd say do it database side, but it depends on your application. If you're using LINQ-to-SQL you can just set the field as Timestamp and have your DAL use the Timestamp field for concurrency. It handles it for you automatically, so having to repeat code is a non event.
If you're writing your DAL yourself though, then I'd be more likely to handle this on the database side as it makes writing user interfaces far more flexible - although, I'd likely do this in a stored procedure that has "public" access and the tables locked down - you don't want just any clown coming along and bypassing your stored procedure by writing to the tables directly... unless you plan on making your DAL a standalone component that any future application must use to access the database, in which case, you could code it directly into the DAL - of course, you should only do this if you can guarantee that everyone accessing the database is doing so through your DAL component.
If you're going to allow "public" access to the database to insert into tables, then you'll have to go with the trigger because otherwise anyone can insert/update a single field in the table and the updated field could never get updated.
I would have the date maintained at the database, i.e., a trigger, stored procedure, etc. In most of your database-driven applications the user app is not going to be the only means by which the business users get at data. There are reporting tools, extracts, user SQL, etc. There's also updates and corrections that are done by the DBA that the application won't be providing the date for as well.
But honestly the #1 reason I wouldn't do it from the application is you have no control over the date/time on the client machine. They might be rolling it back to get more days out of a trial license on something or may just want to do bad things to your program.
You can do this without the trigger if your database supports default values on the fields. For example, in SQL Server 2005 I have a table with a field created like this:
create table dbo.Repository
(
...
last_updated datetime default getdate(),
...
)
then the insert code just leaves that field out of the insert field list.
I forgot that only worked for the first insert - I do have an update trigger as well, to update the date fields and put a copy of the updated record in my history table - which I would post ... but the editor keeps erroring out on my code ...
Finally:
create trigger dbo.Repository_Upd on dbo.Repository instead of update
as
--**************************************************************************
-- Trigger: Repository_Upd
-- Author: Ron Savage
-- Date: 09/28/2008
--
-- Description:
-- This trigger sets the last_updated and updated_by fields before the update
-- and puts a copy of the updated row into the Repository_History table.
--
-- Modification History:
-- Date Init Comment
-- 10/22/2008 RS Blocked .prm files from updating the history as they
-- get updated every time the cfg file is run.
-- 10/21/2008 RS Updated the insert into the history table to use the
-- d.last_updated field from the Repository table rather
-- than getdate() to avoid micro second differences.
-- 09/28/2008 RS Created.
--**************************************************************************
begin
--***********************************************************************
-- Update the record but fill in the updated_by, updated_system and
-- last_updated date with current information.
--***********************************************************************
update cr set
cr.filename = i.filename,
cr.created_by = i.created_by,
cr.created_system = i.created_system,
cr.create_date = i.create_date,
cr.updated_by = user,
cr.updated_system = host_name(),
cr.last_updated = getdate(),
cr.content = i.content
from
Repository cr
JOIN Inserted i
on (i.config_id = cr.config_id);
--***********************************************************************
-- Put a copy in the history table
--***********************************************************************
declare #extention varchar(3);
select #extention = lower(right(filename,3)) from Inserted;
if (#extention <> 'prm')
begin
Insert into Repository_History
select
i.config_id,
i.filename,
i.created_by,
i.created_system,
i.create_date,
user as updated_by,
host_name() as updated_system,
d.last_updated,
d.content
from
Inserted i
JOIN Repository d
on (d.config_id = i.config_id);
end
end
Ron