Is it appropriate to raise exceptions in stored procedures that wrap around CRUD operations, when the number of rows affected != 1? - sql

This is a pretty specific question, albeit possibly subjective, but I've been using this pattern very frequently while not seeing others use it very often. Am I missing out on something or being too paranoid?
I wrap all my UPDATE, DELETE, and INSERT operations in stored procedures, and grant my application only EXECUTE on my package and SELECT on my tables. For the UPDATE and DELETE procedures I have an IF statement at the end in which I do the following:
IF SQL%ROWCOUNT <> 1 THEN
    RAISE_APPLICATION_ERROR(-20001, 'Invalid number of rows affected: ' || SQL%ROWCOUNT);
END IF;
One could also do this check in the application code, as the number of rows affected is usually available after a SQL statement is executed.
So am I missing something or is this not the safest way to ensure you're updating or deleting exactly what you want to, nothing more, nothing less?

I think this is a fine way to go. If the PL/SQL proc is expected to always update/delete/insert a row and it's considered an error otherwise, then what better place to put this check than in the PL/SQL proc itself? That way, no matter what client-side code (C#, Java, PHP, Rails, etc.) uses this proc, you have this error check centralized in one place.
However, I'm not sure you need the check for an insert. If the insert fails, you should get some sort of DB exception, so no need to check for it explicitly unless you are wrapping the error in some other error message and re-raising it.
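For illustration, a minimal sketch of that centralized check inside a package body might look like the following (the employees table, its columns, and the update_salary procedure are made up for this sketch, not taken from the question):

-- sketch only; table, column and procedure names are hypothetical
PROCEDURE update_salary (p_emp_id IN NUMBER, p_salary IN NUMBER) IS
BEGIN
    UPDATE employees
       SET salary = p_salary
     WHERE employee_id = p_emp_id;

    -- exactly one row should have been touched; anything else is an error
    IF SQL%ROWCOUNT <> 1 THEN
        RAISE_APPLICATION_ERROR(-20001,
            'Invalid number of rows affected: ' || SQL%ROWCOUNT);
    END IF;
END update_salary;

Because the check lives in the procedure, every caller gets the same behavior without duplicating the rowcount test in each client language.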

In most cases I'd use an ORM like Hibernate, which does something similar to handle optimistic locking. It will also use the PK in the WHERE clause.
So I would consider this kind of stored procedure a waste of time:
- A lot of effort for minimal benefit
- Makes it harder to use tools like ORMs, which solve more and more of the important problems.

Related

Trigger calls Stored Procedure and if we do a select will the return values be the new or old?

Using MS SQL Server, a Trigger calls a Stored Procedure which internally makes a select, will the return values be the new or old ones?
I know that inside the trigger I can access them via FROM INSERTED i INNER JOIN DELETED, but in this case I want to reuse an existing Stored Procedure (which I cannot change) that internally does a select on the triggered table and processes some logic with the results. I just want to know whether I can be sure that the existing logic will work, i.e. see the NEW values.
I could simply try to simulate it with one update... but maybe there are other cases (for example, involving transactions or something else) that I may not be aware of and would never test, which could behave differently.
I decided to ask someone else that might know better. Thank you.
AFTER triggers (the default) fire after the DML action. When the proc is called within the trigger, the tables will reflect the changes made by the statement that fired the trigger, as well as changes made within the trigger before calling the proc.
Note that the changes are uncommitted until the trigger completes (or until an explicit transaction is later committed).
Since the procedure is running in the same transaction as the (presumably, "after") trigger, it will see the uncommitted data.
I hope you see the implications of that: the trigger executes as part of the transaction started by the DML statement that caused it to fire, so the stored procedure is part of the same transaction. A "complicated" stored procedure therefore means that transaction stays open longer, holds locks longer, makes responses back to users slower, etc.
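A small illustration of that behavior (all object names below are invented for the sketch): the procedure, called from the AFTER trigger, already sees the row inserted by the firing statement, even though nothing has been committed yet.

-- hypothetical objects, for illustration only
CREATE TABLE dbo.Orders (OrderId INT PRIMARY KEY, Amount MONEY);
CREATE TABLE dbo.OrderAudit (AuditTime DATETIME2 NOT NULL DEFAULT SYSUTCDATETIME(), OrderCount INT);
GO
CREATE PROCEDURE dbo.usp_AuditOrderCount
AS
    -- reads the triggered table and writes elsewhere; inside the trigger's
    -- transaction this SELECT already sees the uncommitted new row(s)
    INSERT INTO dbo.OrderAudit (OrderCount)
    SELECT COUNT(*) FROM dbo.Orders;
GO
CREATE TRIGGER trg_Orders_AI ON dbo.Orders
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    EXEC dbo.usp_AuditOrderCount;   -- runs in the same transaction as the firing INSERT
END;
GO
INSERT INTO dbo.Orders (OrderId, Amount) VALUES (1, 10.00);
-- dbo.OrderAudit now records a count that already includes OrderId = 1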
Also, you said
internally makes a select on the triggered table and processes some logic with them.
if you just mean that the procedure is selecting the data in order to do some complex processing and then write it to somewhere else inside the database, ok, that's not great (for reasons given above), but it will "work".
But just in case you mean you are doing some work on the data in the procedure and then returning that back to the client application: don't do that.
The ability to return results from triggers will be removed in a future version of SQL Server. Triggers that return result sets may cause unexpected behavior in applications that aren't designed to work with them. Avoid returning result sets from triggers in new development work, and plan to modify applications that currently do. To prevent triggers from returning result sets, set the disallow results from triggers option to 1.

Where do you put SQL RAISERROR code?

I'm trying to set up a custom error message to pass to MS Access (from SQL Server) when a user enters a duplicate key (instead of system message 2627). I've read up on sp_addmessage and RAISERROR and TRY/CATCH blocks which all make perfect sense. But nowhere I've looked does it seem to say where you put the RAISERROR code (and TRY/CATCH block) so it will actually pass back to the application. So, where does the code go?
Don't think about it in terms of users entering duplicate keys. Instead, think in terms of users just entering keys, some of which turn out to be duplicates when you try to insert them. It's a subtle difference, but it helps you here, because it means you think in terms of having your code available for all new table inserts instead of just a specific type of insert.
When a user enters a key, an INSERT sql statement runs. If the key is a duplicate and you have the constraints defined on the table to the prevent that, the INSERT statement fails. If your Access application has you writing custom SQL, you can wrap this in a TRY/CATCH, and put the RAISERROR in the CATCH block. If your Access application is such that you never see any SQL, you may be stuck, and have to put up with the built-in behavior.
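As a rough sketch of that wrapping, assuming a hypothetical dbo.Customers table and a custom message already registered as 50001 via sp_addmessage (with a %s placeholder for the key value):

-- sketch only; table, columns and message number 50001 are assumptions
CREATE PROCEDURE dbo.usp_InsertCustomer
    @CustomerCode VARCHAR(20),
    @CustomerName VARCHAR(100)
AS
BEGIN
    SET NOCOUNT ON;
    BEGIN TRY
        INSERT INTO dbo.Customers (CustomerCode, CustomerName)
        VALUES (@CustomerCode, @CustomerName);
    END TRY
    BEGIN CATCH
        IF ERROR_NUMBER() = 2627   -- unique/primary key violation
            -- 50001 is assumed to be added with sp_addmessage and to contain a %s placeholder
            RAISERROR (50001, 16, 1, @CustomerCode);
        ELSE
            THROW;   -- re-raise anything unexpected (SQL Server 2012+)
    END CATCH;
END;

The CATCH block is where the RAISERROR lives; Access then receives your custom message instead of the raw 2627 error.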

SQL Server: verbose error messages?

Is there some configuration option for MS SQL Server which enables more verbose error messages?
Specific example: I would like to see the actual field values of the inserted record which violates a constraint during an insert, to help track down a bug in stored procedures which I haven't been able to reproduce.
I don't believe there is any such option. There are trace flags that give more information about deadlocks, but I've never heard of one that gives more information on a constraint violation.
If you control the application that is causing the crash, then extend its error handling (as Jenn suggested) to include parameter values etc. Once you have the parameter values, you can get a copy of the live setup onto a non-live server and start debugging the issue.
For more options, can any of the users affected reliably reproduce the issue? If they can then you might be able to run a profiler trace to capture the actual statements / parameter values being sent to the database. Of course, if you can figure out the steps to reproduce the issue then you can probably use more traditional debugging methods...
You don't say what the constraint is; I'm assuming it is a fairly complex one. If so, could it be broken down into several constraints so you get more of a hint about the problem with the data?
You could also re-write the constraint as a trigger which could then include more information in the error that it raises. Although this would obviously need testing before being deployed to a production server!
Personally, I would go with changing the error handling of the application. It is probably the less risky change.
PS The application that I helped write, and now spend my time supporting, logs quite a lot of data when an unhandled exception occurs. If it is during a save then our data access layer attaches the complete list of all commands that were being run as part of the save transaction including parameter values. This has proved to be invaluable on many occasions, including some when tracking down constraint violations.
In a stored proc, what I do to get better information about errors in a complex SP is take advantage of the fact that table variables are not affected by a rollback. So I put the information I want for troubleshooting into table variables as the proc runs, and then, if I hit the CATCH block and roll back, I insert the data from the table variables into an exception table after the rollback, along with some metadata like the datetime.
With some thought you can design an exception table that will capture what you need from just about any proc. For instance, you could concatenate all the input variables into one field, you could put in the step number that failed (of course you then have to assign step numbers to a variable), or you could log every step along the way so that the last one logged is the one it failed on. Believe me, when you are troubleshooting a 100-line SP, this can come in handy. If I have dynamic SQL in the proc, I can log the SQL variable that contains the dynamic code that was run.
The beauty of this is that now you don't have to try to reproduce the error: you know what the input parameters were and any other information you find useful. Yes, it can take a bit of time to set up once, but once you do, it is relatively easy to get into the habit of putting it into any complex proc that you want to log errors on.
You might also want to set up a non-generalized one if you want to return specific data values of a select used for an insert, or the result set of a select that would tell you what would have been updated or deleted. Then you would have that only if the proc failed. This is a little more work than the general exception table, but may be needed in some complex cases.
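A condensed sketch of that pattern (dbo.ExceptionLog, dbo.SomeTable and the parameter values are purely illustrative): the table variable keeps its rows through the ROLLBACK, so they can be copied into a permanent exception table afterwards.

-- sketch; the tables and values are hypothetical
DECLARE @Id INT = 42, @NewValue NVARCHAR(50) = N'example';
DECLARE @log TABLE (StepNumber INT, Detail NVARCHAR(4000), LoggedAt DATETIME2 DEFAULT SYSUTCDATETIME());

BEGIN TRY
    BEGIN TRANSACTION;

    INSERT INTO @log (StepNumber, Detail)
    VALUES (1, 'Starting update for Id = ' + CAST(@Id AS NVARCHAR(20)));

    UPDATE dbo.SomeTable SET SomeColumn = @NewValue WHERE Id = @Id;

    INSERT INTO @log (StepNumber, Detail) VALUES (2, 'Update complete');
    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;              -- undoes the table changes...

    INSERT INTO dbo.ExceptionLog (StepNumber, Detail, LoggedAt, ErrorMessage)
    SELECT StepNumber, Detail, LoggedAt, ERROR_MESSAGE()
    FROM @log;                             -- ...but the table variable still has its rows
END CATCH;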

Best practices of structuring stored procedures

As a developer mainly writing C# I have adopted some good practices when writing C# code. When I sometimes write stored procedures, I have trouble applying those practices to the stored procedure code.
On several occasions I have inherited nightmare stored procedure code: first, three or four layers of stored procedures setting up some temp tables and mostly calling each other, with no real work done and just a few lines of code each. Then at last there is a call to "the final" stored procedure, a big monster of 3000-5000 lines of SQL code. That code usually has a lot of code smells, like code duplication, intricate control flows (a.k.a. spaghetti) and a method that does too many things, stacked one after another with no clear separation of where one chunk of work starts and where it ends (not even a comment as a divider).
I have also noticed the use of commented-out select statements that select from intermediate temp tables. The selects can be turned back on for debugging purposes, but they need to be removed before release, because calling code may expect a specific order of the returned result sets.
Apparently my fellow team mates also share my lack of good SQL writing practices.
So... ( and here comes the real question) ... what are good practices for writing modular maintainable stored procedures?
Both home made practices and references to books/blogs are welcome. Methods as well as tools that help with certain tasks.
Let's summarize some areas where I have not found good practices:
Modularization and encapsulation (is stored procedure communication via temp tables really the way to go?)
In c# I use assemblies, classes and methods decorated with access modifiers to accomplish this.
Debugging/testing (better than modifying the target of debugging?)
Debug tools?
Debug traces?
Test fixtures?
Emphasizing code/logic/data/control flow using the structure of the code
In C# I refactor and break out smaller methods that do just one logical task each.
Code duplication
Mostly I encounter SQL Server as the DBMS, but DBMS-agnostic answers, or answers pointing out features of other DBMSes that help in the above cases, are also welcome.
To give some background: Most large stored procedures I have encountered are in reporting scenarios where the base is to just create some summary values from a large table. But along the way you need to exclude some of the values that happen to be in some exception table, add some of the values in some not yet completed stuff table, compare with last year (can you imagine the ugly code that handles products changing department between years?), etc.
I write a lot of complex stored procs. Some things I would consider best practices:
Don't use dynamic SQL in a stored proc unless you are doing a search proc with lots of parameters which may or may not be needed (then it is currently one of the best solutions). If you must use dynamic SQL in a proc, always have a debug input parameter, and if the debug parameter is set, print the SQL statement that was created rather than executing it. This will save hours of debugging time!
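A minimal sketch of that debug switch, assuming a hypothetical search proc (all object and parameter names are illustrative):

-- sketch only; dbo.Customers and the parameters are hypothetical
CREATE PROCEDURE dbo.usp_SearchCustomers
    @Name  NVARCHAR(100) = NULL,
    @City  NVARCHAR(100) = NULL,
    @Debug BIT = 0                 -- default 0 so existing calls are unaffected
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @sql NVARCHAR(MAX) =
        N'SELECT CustomerId, CustomerName, City FROM dbo.Customers WHERE 1 = 1';

    IF @Name IS NOT NULL SET @sql += N' AND CustomerName = @Name';
    IF @City IS NOT NULL SET @sql += N' AND City = @City';

    IF @Debug = 1
        PRINT @sql;                -- show the statement instead of running it
    ELSE
        EXEC sp_executesql @sql,
             N'@Name NVARCHAR(100), @City NVARCHAR(100)',
             @Name = @Name, @City = @City;
END;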
If you are performing more than one action query in a proc (insert/update/delete), use TRY/CATCH blocks and transaction processing. Add a test parameter to the input parameters and, when it is set to 1, always roll back the entire transaction. Before rolling back in test mode, I usually have a section that returns the values in the tables I'm affecting, to ensure that what I think I am doing to the database is in fact what I did do. Or you could have checks as you go, as shown below. That is as simple as putting the following code around your currently commented-out selects (and uncommenting them) once you have the @test parameter.
IF @test = 1
BEGIN
    SELECT * FROM table1 WHERE field1 = @myfirstparameter
END
Now you don't have to go through and comment and uncomment them each time you test.
@test or @debug should always be given a default value of 0 and placed last in the parameter list. That way adding them won't break existing calls of the proc.
Consider having logging and/or error logging tables for procs doing inserts/updates/deletes. If you record the steps and/or errors in table variables as you go, they are still available after a rollback to be inserted into the logging table. Knowing what part of a complex proc failed and what the error was can be invaluable later on.
Where possible do not nest stored procs. If you need to run multiple records in a loop, replace the stored proc with one that has a table-valued parameter and set up the proc to run in a set-based and not individual record fashion. This will work if the table-valued parameter has one record or many records.
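A hedged sketch of that table-valued-parameter approach (the dbo.OrderIdList type, dbo.SalesOrders table and proc name are invented): the same proc works identically whether the parameter carries one row or many.

-- hypothetical type, table and proc, for illustration
CREATE TYPE dbo.OrderIdList AS TABLE (OrderId INT PRIMARY KEY);
GO
CREATE PROCEDURE dbo.usp_CloseOrders
    @Orders dbo.OrderIdList READONLY
AS
BEGIN
    SET NOCOUNT ON;
    -- one set-based statement instead of calling a single-row proc in a loop
    UPDATE o
       SET o.Status = 'Closed'
      FROM dbo.SalesOrders AS o
      JOIN @Orders         AS t ON t.OrderId = o.OrderId;
END;
GO
-- caller: the same call shape for 1 row or 10,000
DECLARE @ids dbo.OrderIdList;
INSERT INTO @ids (OrderId) VALUES (1), (2), (3);
EXEC dbo.usp_CloseOrders @Orders = @ids;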
If you have a complex select with a lot of subqueries or derived tables, consider using CTEs instead. Refactor any correlated subqueries or cursors to better performing set-based code. Always think in terms of sets of data not one record.
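For example, a query built from stacked derived tables can usually be rewritten with named CTEs, which keeps each logical step visible (the tables and columns below are made up):

-- the same query expressed as named CTEs instead of nested derived tables
WITH RecentSales AS (
    SELECT CustomerId, SUM(Amount) AS TotalAmount
    FROM dbo.Sales
    WHERE SaleDate >= DATEADD(YEAR, -1, GETDATE())
    GROUP BY CustomerId
),
ActiveCustomers AS (
    SELECT CustomerId, CustomerName
    FROM dbo.Customers
    WHERE IsActive = 1
)
SELECT ac.CustomerName, rs.TotalAmount
FROM ActiveCustomers AS ac
JOIN RecentSales     AS rs ON rs.CustomerId = ac.CustomerId;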
Do not, under any conceivable circumstance, nest views. The performance hit is much worse than any small amount of saved development time. And trust me, nested views do not save maintenance time when the change needs to be to the view the furthest into the chain of views.
All stored procs (and other database code) should be in source control.
Table variables are good for smaller data sets, but temp tables (real ones that start with # or ## not staging tables) can be better for performance in large data sets. If using temp tables drop them when you don't need them anymore. Try to avoid the use of global temp tables.
Learn to write performant SQL. It is usually just as easy to write SQL that performs well as SQL that does not, once you know the techniques. If you write complex stored procs, there is no excuse for not knowing which techniques work better than others. Learn how to make sure your query is sargable. Avoid cursors, correlated subqueries, scalar functions and other things which run row-by-agonizing-row.
Communication via temp tables is sometimes a huge code smell. Such procedures often cannot be run by a user without interfering with each other (if you re-use a temp table name for different procedures' ins and outs and they aren't re-created or if you use the same name with two different table schemas). They can be hard to troubleshoot - like any feature, use them when necessary and better alternatives don't exist. Using real tables temporarily can also be problematic.
Stored procs which pass data to each other in SQL Server at all (beyond parameters) can be problematic. There are table-valued parameters now, and many things which previously would have been done with procs can now be done with inline table-valued functions or multi-statement table-valued functions (with inline ones usually preferred).
In SQL Server, avoid heavy use of scalar functions and multi-statement table-valued function on large rowsets - they do not perform very well, so modular techniques which may seem obvious in C# don't really apply here.
I would recommend you look at Ken Henderson's Guru's Guide to SQL Server Stored Procedures - published in 2002, it still has a wealth of useful information on database application design.
This is such a good question. As a C# dev myself who has to dabble in SQL, it seems SQL by its very nature gets in the way of the best practices I'm used to with C#.
Common Table Expressions are great for isolating queries in a stored procedure, but you can only use them once! That leads you to define views, but then you've lost your encapsulation.
A result set from one stored procedure is very difficult to use in another, so you might be tempted to write table-valued functions. That adds to your permissions-maintenance burden and forces you to write functions 'twice': once as a function and again as a procedure that calls the function. Otherwise, you have different interfaces to your DAL depending on whether it's a procedure or not.
All of this has caused me, over time, to stick to simple CRUD stored procedures (that do not call each other) in the database and a few isolated queries when the relationships are complex. More BI-type stuff. Everything else is in the BLL.
Physically, the SQL is isolated in separate files by function or by the table they revolve around, and managed in source control.
Avoid SELECT * and favor specifying columns. That saves you from run-time problems when you change a table and don't touch all the procs. Yes, there is a recompile for procs but it WILL miss some, especially if views are involved. Plus, SELECT * almost always returns more columns than you really need and that's a waste of bandwidth.
The comments above are great advice on do's and don'ts when it comes to writing SQL code. If I understand your question correctly, you are asking whether it is normal for a SQL developer to write hundreds or even thousands of lines of code in a single stored procedure. In C# this is a big no-no: you are to encapsulate logic into small chunks using methods, assemblies, and classes. SQL developers tend to write the entire logic in one stored procedure to accomplish a related task; as HLGEM mentioned above, "If possible, do not nest stored procedures". Do not nest views.
For example: A simple Get and Insert design in C# looks like this:
Call GetData Method
Call Get Data Method
Call Transform Data Method
Call CheckAlphaNumeric Method
Call Data Enrichment Method
Call Load Transformed Data Method
A SQL developer will design it like this:
In a single stored proc:
Get Data and Transform using either a temp table or a table variable, then Load it into the final table.
If you were to change the way the SQL is written to match the structure a C# developer would use, you would then do this:
Execute Main Stored Procedure (which calls the sprocs below)
Execute GetData Stored Procedure and load into a Stage Table
Execute Transform Stored Procedure, which reads the Stage Table and transforms the data
Execute Load Data Stored Procedure to load the Staged/Transformed data into the final table

Handle error in SQL Trigger without failing transaction?

I've a feeling this might not be possible, but here goes...
I've got a table that has an insert trigger on it. When data is inserted into this table the trigger fires and parses a long varbinary column. This trigger performs some operations on the binary data and writes several entries into a second table.
What I have recently discovered is that sometimes the binary data is not "correct" (i.e. it does not conform to the spec it is supposed to - I have NO control over this whatsoever) and this can cause casting errors etc.
My initial reaction was to wrap things in TRY/CATCH blocks, but it appears this is not a solution either, as the execution of the CATCH means the transaction is doomed and I get a "Transaction doomed in trigger" error.
What is imperative is that the data still gets written to the initial table. I don't care if the data gets written to the second table or not.
I'm not sure if I can accomplish this or not, and would gratefully receive any advice.
What you could do is commit the transaction inside the trigger and then perform those casts.
I don't know if that's a possible solution to your problem, though.
Another option would be to create a function IsYourBinaryValueOK which would check the column value. However, the check would have to be done with LIKE so as not to cause an error.
It doesn't sound like this code should run in an insert trigger since it is conceptually two different transactions. You would probably be better off with asynchronous processing such as service broker, a background nanny task that looks for 'not done' work, etc. You could also handle it by using a sproc to do the insert in one transaction and then having it call the do-other-work code afterwards.
If you absolutely have to do it in the trigger, then you basically need an autonomous transaction. For some ideas see this link (the techniques apply to SQL Server 2005 as well).
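A rough sketch of the asynchronous route mentioned above, where the trigger only queues the raw data and a scheduled job parses it later (all object names are invented; dbo.SourceTable stands in for the real triggered table):

-- hypothetical queue table; the trigger does nothing that can fail on bad data
CREATE TABLE dbo.ParseQueue (
    QueueId   INT IDENTITY PRIMARY KEY,
    SourceId  INT NOT NULL,
    RawData   VARBINARY(MAX) NOT NULL,
    Processed BIT NOT NULL DEFAULT 0
);
GO
CREATE TRIGGER trg_Source_AI ON dbo.SourceTable
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    -- plain copy, no casting, so the original insert can never be doomed here
    INSERT INTO dbo.ParseQueue (SourceId, RawData)
    SELECT i.SourceId, i.BinaryData FROM INSERTED AS i;
END;
GO
-- called repeatedly by a SQL Agent job / "nanny" task; failures stay out of the original transaction
CREATE PROCEDURE dbo.usp_ProcessParseQueue
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @QueueId INT, @RawData VARBINARY(MAX);

    SELECT TOP (1) @QueueId = QueueId, @RawData = RawData
    FROM dbo.ParseQueue
    WHERE Processed = 0
    ORDER BY QueueId;

    IF @QueueId IS NULL RETURN;

    BEGIN TRY
        -- cast/parse @RawData and write the results to the second table here
        UPDATE dbo.ParseQueue SET Processed = 1 WHERE QueueId = @QueueId;
    END TRY
    BEGIN CATCH
        -- a malformed payload only affects this queue row, never the initial insert
        UPDATE dbo.ParseQueue SET Processed = 1 WHERE QueueId = @QueueId;
    END CATCH;
END;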