Validating data in database: SQL vs code


My database schema has a 'locked' setting, meaning that the entry cannot be changed once it is set.
Before the locked flag is set we can update other attributes. So:
Would you check the locked flag in code and then update the entry
or
would it be better to combine that into a SQL query, if so, any examples?
EDIT: how would you combine the update & check into one SQL statement?

You should do both. The database should use an update trigger to decide whether a row can be updated; this would prevent anyone from accidentally updating the row directly in the back-end tables. And the application should check whether it should be able to update the rows and act accordingly.

"how would you combine the update & check into one SQL statement?"
update table
set ...
where key = ...
and locked ='N';
That would not raise an error, but would update 0 rows - something you should be able to test for after the update.
As for which is better, my view is that if this locked flag is important then:
you must check/enforce it in the database to ensure it is never violated by any access method
you may also check/enforce it in the application, if that is more user-friendly
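The combined update-and-check can be sketched roughly as follows. This is only an illustration, using Python's built-in sqlite3 module as a stand-in for whatever DBMS is actually in use; the table and column names (entries, value, locked) and the helper update_if_unlocked are made up for the example.

```python
import sqlite3

def update_if_unlocked(conn, key, new_value):
    """Update the row only if it is not locked; return True if a row changed."""
    cur = conn.execute(
        "UPDATE entries SET value = ? WHERE id = ? AND locked = 'N'",
        (new_value, key),
    )
    # rowcount is 0 when the row is locked (or does not exist at all).
    return cur.rowcount == 1

# Hypothetical schema: one unlocked row and one locked row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entries ("
    " id INTEGER PRIMARY KEY,"
    " value TEXT,"
    " locked TEXT NOT NULL DEFAULT 'N')"
)
conn.execute("INSERT INTO entries (id, value, locked) VALUES (1, 'draft', 'N')")
conn.execute("INSERT INTO entries (id, value, locked) VALUES (2, 'final', 'Y')")
```

Calling update_if_unlocked(conn, 2, "edited") leaves the locked row untouched and reports that nothing was updated, which the application can then turn into a user-facing message.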

I would check the locked flag in code and then (assuming the record isn't locked) run the update query, setting the locked flag in the query as well. That way it can be wrapped in a transaction and committed/rolled back all at once.

I would use a trigger if your DBMS offers this function to enforce the flag. If you set the flag, all updates fail.
Then you can create a special query which will update the flag. Your trigger can check what called the update, and allow the flag to flick back if necessary. That way whether the TSQL is nice or malicious, no one can update your row once the flag is set.
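A trigger along those lines might look like the sketch below. It uses SQLite (through Python's sqlite3 module) purely as a stand-in, so the trigger syntax differs from T-SQL, and the names entries, entries_block_locked and try_update are invented for the example. It only covers the "all updates fail once the flag is set" part, not the special query that flips the flag back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entries (
    id     INTEGER PRIMARY KEY,
    value  TEXT,
    locked TEXT NOT NULL DEFAULT 'N'
);

-- Reject any update to a row whose locked flag is already set.
CREATE TRIGGER entries_block_locked
BEFORE UPDATE ON entries
FOR EACH ROW
WHEN OLD.locked = 'Y'
BEGIN
    SELECT RAISE(ABORT, 'entry is locked');
END;

INSERT INTO entries (id, value) VALUES (1, 'draft');
""")

def try_update(conn, key, new_value):
    """Return True if the update went through, False if the trigger refused it."""
    try:
        conn.execute("UPDATE entries SET value = ? WHERE id = ?", (new_value, key))
        return True
    except sqlite3.DatabaseError:
        return False
```

Because the check lives in the database, it applies to every client, including ad hoc SQL typed into a console.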

As a rule of thumb I would always prefer to check everything in code and consider writing the DB constraints after that. Having the DB perform consistency checks for you is cool and admittedly faster than doing it in your code but then you are putting some logic in the DB and have to pay some price. These constraints can become vendor-dependent and, worse, can you perform automated tests on these restrictions in your development environment?
So I'd consider putting these kinds of checks in the DB as an added safety net but wouldn't rely on them alone.

Related

SQL Audit options for update commands where an existing column does not change

I need to audit changes in a situation where triggers do not perform well enough to use. In the audit I need to know exactly who made the change, based on a column named LastModifiedBy (gathered at login and used in inserts and updates). We use a single SQL account to access the database, so I can't use that to tie a change to a user.
Scenario: Now we are researching the SQL transaction Log to determine what has changed. Table has a LastUpdatedBy column we used with trigger solution. With previous solution I had before and after transaction data so I could tell if the user making the change was the same user or a new user.
Problem: While looking at tools like DBForge Transaction Log and ApexSQL Audit, I can't seem to find a solution that works. I can see the UPDATE command, but I can't tell whether all the fields actually changed (just because the SQL says to update a field does not mean its value actually changed). ApexSQL Audit does have a before-and-after capability, but if the LastUpdatedBy field does not change then I don't know what the original value is.
Trigger problem: Large data updates and inserts are crushing performance because of the triggers. I am gathering before and after data in the triggers so I can tell exactly what changed. But this volume of data is taking a 2 second update of 1000 rows and making it last longer than 3 minutes.

Prevent update to non-existent rows

At work we have a table to hold settings which essentially contains the following columns:
PARAMNAME
VALUE
Most of the time new settings are added but on rare occasions, settings are removed. Unfortunately this means that any scripts which might have previously updated this value will continue to do so despite the fact that the update results in "0 rows updated" and leads to unexpected behaviour.
This situation was picked up recently by a regression test failure but only after much investigation into why the data in the system was different.
So my question is: Is there a way to generate an error condition when an update results in zero rows updated?
Here are some options I have thought of, but none of them are really all that desirable:
PL/SQL wrapper which notices the failed update and throws an exception.
Not ideal as it doesn't stop anyone/a script from manually doing an update.
A trigger on the table which throws an exception.
Goes against our current policy of phasing out triggers.
Requires updating trigger every time a setting is removed and maintaining a list of obsolete settings (if doing exclusion).
Might have problems with mutating table (if doing inclusion by querying what settings currently exist).

A PL/SQL wrapper seems like the best option to me. Triggers are a great thing to phase out, with the exception of generating sequences and inserting history records.
If you're concerned about someone manually updating rather than using the PL/SQL wrapper, just restrict the user role so that it does not have UPDATE privileges on the table but has EXECUTE privileges on the procedure.
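The wrapper idea, an update that raises when it touches no rows, can be sketched as below. This is Python over SQLite rather than PL/SQL, so treat it as an outline of the logic only; the settings table layout follows the question, while update_setting and SettingNotFound are names made up for the example.

```python
import sqlite3

class SettingNotFound(Exception):
    """Raised when an update targets a setting that no longer exists."""

def update_setting(conn, name, value):
    """Update one setting and fail loudly if no row matched."""
    cur = conn.execute(
        "UPDATE settings SET value = ? WHERE paramname = ?",
        (value, name),
    )
    if cur.rowcount == 0:
        raise SettingNotFound(name)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE settings (paramname TEXT PRIMARY KEY, value TEXT)")
conn.execute("INSERT INTO settings (paramname, value) VALUES ('timeout', '30')")
```

In PL/SQL the equivalent check would presumably test SQL%ROWCOUNT after the UPDATE and call RAISE_APPLICATION_ERROR, with the privileges arranged as described above so that scripts have to go through the procedure.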

Not really a solution but a method to organize things a bit:
Create a separate table with the parameter definitions and link to that table from the parameter value table. Make the reference to the parameter definition required (nulls not allowed).
Definition table PARAMS (ID, NAME)
Actual settings table PARAM_VALUES (PARAM_ID, VALUE)
(changing your table structure is also a very effective way to evoke errors in scripts that have not been updated...)
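As a rough sketch of that two-table layout, again using SQLite via Python as a stand-in: the table names follow the answer, the sample rows and the set_value helper are assumptions. Note that this only catches writes that reference a missing definition; a plain UPDATE that matches no rows still succeeds silently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE params (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE param_values (
    param_id INTEGER NOT NULL REFERENCES params(id),
    value    TEXT
);
INSERT INTO params (id, name) VALUES (1, 'timeout');
""")

def set_value(conn, param_id, value):
    """Store a value; return False if the parameter definition does not exist."""
    try:
        conn.execute(
            "INSERT INTO param_values (param_id, value) VALUES (?, ?)",
            (param_id, value),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```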

Maybe you can use a MERGE statement.
Here is a link for it:
http://www.oracle-developer.net/display.php?id=203
The MERGE statement allows you to combine insert and update in the same query, so if the desired row does not exist you can insert a record into a buffer table to indicate that the row does not exist; otherwise you can update the required record.
Hope it helps

MySQL trigger loop

I am going through the painstaking process of sorting out someone else's code.
So I have decided to create a new database to sit alongside the old one, then use triggers to transfer data between both tables.
Now I have an issue with it looping, i.e.:
There is a trigger on each table to update the other. Once one updates, it should update the other, but as both tables have triggers it will just loop, which will cause an issue.
Is there a way to stop this from happening?
Hope this makes sense and hope you can advise.

You should be making entries in one db and using the trigger to copy that data to the second db. Having said that, you can check in the trigger whether the data already exists and exit if it does. Basically: if the record exists, do nothing. This site has a good tutorial:
http://www.databasedesign-resource.com/mysql-triggers.html
You may also want to read up on triggers in the MySQL manual:
http://dev.mysql.com/doc/refman/5.0/en/triggers.html
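The "if the record already matches, do nothing" guard can be sketched like this. The example runs on SQLite through Python, not MySQL, so both the trigger syntax and MySQL's own restrictions on triggers differ; the tables old_items and new_items are hypothetical and assume non-null names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Allow a trigger to re-fire indirectly, so only the guard stops the loop.
conn.execute("PRAGMA recursive_triggers = ON")
conn.executescript("""
CREATE TABLE old_items (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE new_items (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- Each trigger copies a change across only if the other table is out of date.
CREATE TRIGGER old_to_new AFTER UPDATE ON old_items
WHEN NOT EXISTS (SELECT 1 FROM new_items WHERE id = NEW.id AND name = NEW.name)
BEGIN
    UPDATE new_items SET name = NEW.name WHERE id = NEW.id;
END;

CREATE TRIGGER new_to_old AFTER UPDATE ON new_items
WHEN NOT EXISTS (SELECT 1 FROM old_items WHERE id = NEW.id AND name = NEW.name)
BEGIN
    UPDATE old_items SET name = NEW.name WHERE id = NEW.id;
END;

INSERT INTO old_items (id, name) VALUES (1, 'widget');
INSERT INTO new_items (id, name) VALUES (1, 'widget');
""")

def name_in(conn, table, item_id):
    """Read the name stored for item_id in one of the two known tables."""
    assert table in ("old_items", "new_items")
    return conn.execute(f"SELECT name FROM {table} WHERE id = ?", (item_id,)).fetchone()[0]
```

An update on either side is copied across once; when the second trigger is considered, the other table already holds the new value, so the chain stops there.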

MSSQL: Disable triggers for one INSERT

This question is very similar to SQL Server 2005: T-SQL to temporarily disable a trigger
However I do not want to disable all triggers and not even for a batch of commands, but just for one single INSERT.
I have to deal with a shop system where the original author put some application logic into a trigger (bad idea!). That application logic works fine as long as you don't try to insert data in any way other than through the original "administration frontend". My job is to write an "import from staging system" tool, so I have all the data ready. When I try to insert it, the trigger overwrites the existing Product Code (not the IDENTITY numeric ID!) with a generated one. To generate the Code it uses the autogenerated ID of an insert into another table, so I can't even use @@IDENTITY to find the row I just inserted and UPDATE it with the actual Product Code.
Is there any way to avoid extremely awkward code (INSERT some random characters into the product name and then try to find the row with the random characters to update it)?
So: Is there a way to disable triggers (even just one) for just one INSERT?

You may find this helpful:
Disabling a Trigger for a Specific SQL Statement or Session
But there is another problem that you may face as well.
If I understand your situation correctly, your system by default inserts the product code automatically (by generating the value).
Now you need to insert a product that was created by some staging system; its product code was created by the staging system, and you want to insert it into the live system manually.
If you really have to do this, you need to make sure that the codes generated by your live application in the future are not going to conflict with the codes that you inserted manually - I assume they must be unique.
The other approach is to allow the system to generate the new code and overwrite any corresponding data if needed.

You can disable triggers on a table using:
ALTER TABLE MyTable DISABLE TRIGGER ALL
But that would do it for all sessions, not just your current connection.. which is obviously a very bad thing to do :-)
The best way would be to alter the trigger itself so it makes the decision if it needs to run, whether that be with an "insert type" flag on the table or some other means if you are already storing a type of some sort.

Rather than disabling triggers, can you not change the behaviour of the trigger? Add a new nullable column to the table in question called "insertedFromImport".
In the trigger change the code so that the offending bit of the trigger only runs on rows where "insertedFromImport" is null. When you insert your records set "insertedFromImport" to something non-null.
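A small sketch of that flag-column approach, written against SQLite via Python rather than SQL Server: the products table, the GEN- code format and the trigger body are stand-ins for the shop system's real logic, which the question does not show.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (
    id                   INTEGER PRIMARY KEY,
    name                 TEXT,
    product_code         TEXT,
    inserted_from_import INTEGER   -- NULL for rows created by the normal frontend
);

-- Stand-in for the existing trigger: it overwrites the product code,
-- but skips rows that the import tool has flagged.
CREATE TRIGGER products_generate_code AFTER INSERT ON products
WHEN NEW.inserted_from_import IS NULL
BEGIN
    UPDATE products SET product_code = 'GEN-' || NEW.id WHERE id = NEW.id;
END;
""")

def insert_product(conn, name, code, from_import=False):
    """Insert a product and return the product code that ended up stored."""
    flag = 1 if from_import else None
    cur = conn.execute(
        "INSERT INTO products (name, product_code, inserted_from_import)"
        " VALUES (?, ?, ?)",
        (name, code, flag),
    )
    row = conn.execute(
        "SELECT product_code FROM products WHERE id = ?", (cur.lastrowid,)
    ).fetchone()
    return row[0]
```

Rows inserted with from_import=True keep the code supplied by the staging system, while ordinary inserts still get a generated one.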

Disable the trigger, insert, commit.
SET IDENTITY_INSERT Test ON
GO
BEGIN TRAN;
DISABLE TRIGGER trg_Test ON Test;
INSERT INTO Test (MyId, MyField)
VALUES (999, 'foo');
ENABLE TRIGGER trg_Test ON Test;
COMMIT TRAN;
SET IDENTITY_INSERT Test OFF
GO

Can you check for SUSER_SNAME() and only run when in context of the administration frontend?

I see many things that could create a problem. First, change the trigger so that it handles multi-row inserts. That may well fix your problem. Do not turn off the trigger, as it is turned off for everyone, not just you. If you must, then put the database into single-user mode before you do it, and do your task during off hours.
Next, do not under any circumstances ever use @@IDENTITY to get the value just inserted! Use SCOPE_IDENTITY() instead. @@IDENTITY will return the wrong value if there are triggers on the table that also insert into other tables with identity fields. If you are using @@IDENTITY right now throughout your system (since we know your system has triggers), your absolute first priority must be to immediately find and change all instances of @@IDENTITY in your code. You can have serious data integrity issues if you do not. This is a "stop all work until this is fixed" kind of problem.
As for getting back the information you just inserted, consider adding a column called batchid (nullable, so it won't affect other inserts) to the table and supplying a batch id as part of your insert. Then you can retrieve what you inserted by batchid.
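The batch id idea could look something like the sketch below. It is Python over SQLite for illustration only; the products table, the batch_id column name and the use of a UUID as the batch identifier are all assumptions, not part of the original system.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT,"
    " batch_id TEXT)"  # nullable, so existing inserts are unaffected
)

def import_products(conn, names):
    """Insert the rows under one batch id and read them back by that id."""
    batch_id = str(uuid.uuid4())
    conn.executemany(
        "INSERT INTO products (name, batch_id) VALUES (?, ?)",
        [(name, batch_id) for name in names],
    )
    return conn.execute(
        "SELECT id, name FROM products WHERE batch_id = ? ORDER BY id",
        (batch_id,),
    ).fetchall()
```

This avoids relying on any "last identity" function at all: the import tool finds its own rows by a value it chose itself.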

If you insert using BULK INSERT, you can disable triggers just for the insert.
I'm pretty sure bulk insert will require a data file on the file system to import so you can't just use T-SQL.
To use BULK INSERT you need INSERT and ADMINISTER BULK OPERATIONS permissions.
If you disable triggers or constraints, you'll also need ALTER TABLE permission.
If you are using Windows authentication, your Windows user will need read access to the file. If using Mixed Mode authentication, the SQL Server service account needs read access to the file.
When importing using BULK INSERT, triggers are disabled by default.
More information: http://msdn.microsoft.com/en-us/library/ms188365.aspx

What is the best way to maintain a LastUpdatedDate column in SQL?

Suppose I have a database table that has a datetime column recording the last time a row was updated or inserted. Which would be preferable:
Have a trigger update the field.
Have the program that's doing the insertion/update set the field.
The first option seems to be the easiest since I don't even have to recompile to do it, but that's not really a huge deal. Other than that, I'm having trouble thinking of any reasons to do one over the other. Any suggestions?

The first option can be more robust because the database will be maintaining the field. This comes with the possible overhead of using triggers.
If you could have other apps writing to this table in the future, via their own interfaces, I'd go with a trigger so you're not repeating that logic anywhere else.
If your app is pretty much it, or any other apps would access the database through the same datalayer, then I'd avoid that nightmare that triggers can induce and put the logic directly in your datalayer (SQL, ORM, stored procs, etc.).
Of course you'd have to make sure your time-source (your app, your users' pcs, your SQL server) is accurate in either case.
Regarding why I don't like triggers:
Perhaps I was rash by calling them a nightmare. Like everything else, they are appropriate in moderation. If you use them for very simple things like this, I could get on board.
It's when the trigger code gets complex (and expensive) that triggers start to cause lots of problems. They are a hidden tax on every insert/update/delete query you execute (depending on the type of trigger). If that tax is acceptable then they can be the right tool for the job.
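For the simple case, a trigger that maintains the column might look like the following sketch. It uses SQLite through Python as a stand-in for the real DBMS, and the notes table, the notes_touch trigger and the seeded old timestamp are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Keep the trigger's own UPDATE from firing the trigger again.
conn.execute("PRAGMA recursive_triggers = OFF")
conn.executescript("""
CREATE TABLE notes (
    id           INTEGER PRIMARY KEY,
    body         TEXT,
    last_updated TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);

-- Stamp the row on every update, whichever application issued it.
CREATE TRIGGER notes_touch AFTER UPDATE ON notes
BEGIN
    UPDATE notes SET last_updated = CURRENT_TIMESTAMP WHERE id = NEW.id;
END;

-- Seed a row with an old timestamp so the change is visible.
INSERT INTO notes (id, body, last_updated)
VALUES (1, 'first draft', '2000-01-01 00:00:00');
""")

def last_updated(conn, note_id):
    """Return the stored last_updated value for one note."""
    return conn.execute(
        "SELECT last_updated FROM notes WHERE id = ?", (note_id,)
    ).fetchone()[0]
```

The caller's UPDATE never mentions last_updated; the extra write the trigger performs on each update is the "hidden tax" described above.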

You didn't mention 3. Use a stored procedure to update the table. The procedure can set timestamps as desired.
Perhaps that's not feasible for you, but I didn't see it mentioned.

As long as I'm using a DBMS in whose triggers I trust, I'd always go with the trigger option. It allows the DBMS to take care of as many things as possible, which is usually a good thing.
It would make sure that the timestamp column has the correct value under all circumstances. The overhead would be negligible.
The only thing that would be against triggers is portability. If that's not an issue, I don't think there is a question which direction to go.

I would say trigger, just in case someone uses something besides your app to update the table. You probably also want to have a LastUpdatedBy column and use SUSER_SNAME() for that; this way you can see who did the update.

I'm a proponent of stored procedures for everything. Your update proc could contain a GETDATE() for the column.
And I don't like triggers for this kind of update. Lack of visibility of triggers tends to cause confusion.

This sounds like business logic to me ... I would be more disposed to putting this in the code. Let the database manage the storage of data ... No more and no less.

Triggers are a blessing and a curse.
Blessing: You can use them to enable all kinds of custom constraint checking and data management without backend systems knowledge or changes.
Curse: You don't know what's happening behind your back. Concurrency issues and deadlocks caused by additional objects being brought into transactions where they were not originally expected. Phantom behavior, including session environment changes and unreliable rowcounts. Excessive triggering of conditions, with additional hotspot and performance penalties.
The answer to this question (update dates implicitly (trigger) or explicitly (code)) usually depends heavily on context. For example, if you are using the last change date as an informational field, you might want to change it only when a 'user' actually makes salient changes to a row, and not when an automated process simply updates some sort of internal marker users don't care about.
If you are using the trigger for change synchronization, or you have no control over the code that is executing, a trigger makes a lot more sense.
My advice on trigger use is to be careful. Most systems allow you to filter execution based on the operation and the fields changed. Proper use of 'before' vs 'after' triggers can have a significant performance impact.
Finally, a few systems execute a single trigger for multiple changes (multiple rows affected within a transaction), so your code should be prepared to apply itself as a bulk update to multiple rows.
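Filtering a trigger by the fields that changed can be sketched as follows, here in SQLite via Python, where the filter is written as UPDATE OF plus a WHEN clause; other systems spell this differently. The articles table and its columns are hypothetical, with view_count playing the role of the internal marker users don't care about.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA recursive_triggers = OFF")
conn.executescript("""
CREATE TABLE articles (
    id           INTEGER PRIMARY KEY,
    title        TEXT,
    view_count   INTEGER NOT NULL DEFAULT 0,  -- internal marker, not a salient change
    last_changed TEXT
);

-- Fire only for updates that name the title column, and only if it really changed.
CREATE TRIGGER articles_title_changed AFTER UPDATE OF title ON articles
WHEN OLD.title IS NOT NEW.title
BEGIN
    UPDATE articles SET last_changed = CURRENT_TIMESTAMP WHERE id = NEW.id;
END;

INSERT INTO articles (id, title) VALUES (1, 'Hello');
""")

def last_changed(conn, article_id):
    """Return the stored last_changed value, or None if it was never set."""
    return conn.execute(
        "SELECT last_changed FROM articles WHERE id = ?", (article_id,)
    ).fetchone()[0]
```

Bumping view_count, or rewriting the title with the same value, leaves last_changed alone; only a real change to the title stamps the row.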

Normally I'd say do it database side, but it depends on your application. If you're using LINQ-to-SQL you can just set the field as Timestamp and have your DAL use the Timestamp field for concurrency. It handles it for you automatically, so having to repeat code is a non event.
If you're writing your DAL yourself, though, then I'd be more likely to handle this on the database side, as it makes writing user interfaces far more flexible. I'd likely do this in a stored procedure that has "public" access, with the tables locked down: you don't want just any clown coming along and bypassing your stored procedure by writing to the tables directly. The exception is if you plan on making your DAL a standalone component that any future application must use to access the database, in which case you could code it directly into the DAL. Of course, you should only do this if you can guarantee that everyone accessing the database is doing so through your DAL component.
If you're going to allow "public" access to the database to insert into tables, then you'll have to go with the trigger, because otherwise anyone can insert or update a single field in the table and the last-updated field might never get updated.

I would have the date maintained at the database, i.e., a trigger, stored procedure, etc. In most of your database-driven applications the user app is not going to be the only means by which the business users get at data. There are reporting tools, extracts, user SQL, etc. There's also updates and corrections that are done by the DBA that the application won't be providing the date for as well.
But honestly the #1 reason I wouldn't do it from the application is you have no control over the date/time on the client machine. They might be rolling it back to get more days out of a trial license on something or may just want to do bad things to your program.

You can do this without the trigger if your database supports default values on the fields. For example, in SQL Server 2005 I have a table with a field created like this:
create table dbo.Repository
(
...
last_updated datetime default getdate(),
...
)
then the insert code just leaves that field out of the insert field list.
I forgot that only worked for the first insert - I do have an update trigger as well, to update the date fields and put a copy of the updated record in my history table - which I would post ... but the editor keeps erroring out on my code ...
Finally:
create trigger dbo.Repository_Upd on dbo.Repository instead of update
as
--**************************************************************************
-- Trigger: Repository_Upd
-- Author: Ron Savage
-- Date: 09/28/2008
--
-- Description:
-- This trigger sets the last_updated and updated_by fields before the update
-- and puts a copy of the updated row into the Repository_History table.
--
-- Modification History:
-- Date Init Comment
-- 10/22/2008 RS Blocked .prm files from updating the history as they
-- get updated every time the cfg file is run.
-- 10/21/2008 RS Updated the insert into the history table to use the
-- d.last_updated field from the Repository table rather
-- than getdate() to avoid micro second differences.
-- 09/28/2008 RS Created.
--**************************************************************************
begin
--***********************************************************************
-- Update the record but fill in the updated_by, updated_system and
-- last_updated date with current information.
--***********************************************************************
update cr set
cr.filename = i.filename,
cr.created_by = i.created_by,
cr.created_system = i.created_system,
cr.create_date = i.create_date,
cr.updated_by = user,
cr.updated_system = host_name(),
cr.last_updated = getdate(),
cr.content = i.content
from
Repository cr
JOIN Inserted i
on (i.config_id = cr.config_id);
--***********************************************************************
-- Put a copy in the history table
--***********************************************************************
declare @extention varchar(3);
select @extention = lower(right(filename,3)) from Inserted;
if (@extention <> 'prm')
begin
Insert into Repository_History
select
i.config_id,
i.filename,
i.created_by,
i.created_system,
i.create_date,
user as updated_by,
host_name() as updated_system,
d.last_updated,
d.content
from
Inserted i
JOIN Repository d
on (d.config_id = i.config_id);
end
end
Ron
