Avoiding duplicating SQL code? - sql

I know many ways to avoid duplicating application code (PHP, in my case). However, I am developing a rather big application that does some calculations in the database on the data it finds, and I have noticed that I need the same code (fragments of SQL) in several places.
I don't like the idea of copying and pasting the same thing over and over again. What is a good way to do this? Should I use stored procedures? I could almost do some of the calculations in PHP, except that most of the time the queries calculate values based also on data that the query does not return, and it seems stupid to return extra data to PHP just so that it can do its calculations. Sometimes that may be okay, but here it does not feel right.
What should I do?
For example, in many SQL queries I calculate something similar to this:
...
(SELECT SUM(amount) FROM IT INNER JOIN Invoice I WHERE IT.invoiceId=I.id) AS total
...
FROM InvoiceTransaction IT
...
Note that I'm at home now so I'm writing this off the top of my head.

I think you have 2 solutions:
1. If the SQL returns a small amount of data, I would simply wrap the SQL invocation in a method and call that (parameterising as necessary).
2. If the SQL handles a lot of data, I would keep that data in the database and use a stored procedure. You can then call that stored procedure without duplicating the code (but wrap the stored proc call in a function and call it, i.e. as in option 1).
I wouldn't necessarily shy away from stored procedures. But I would advise keeping business logic out of them (keep it in the application itself) and make sure you have sufficient unit testing around it.

I don't favour stored procedures, especially not just for the sake of refactoring. Consider writing a function that returns the records you need, and put your SQL query in that function so you can call it instead of repeating the SQL everywhere.
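A minimal sketch of that suggestion, written in Python with the standard-library sqlite3 module rather than the asker's PHP. The schema, the sample rows and the names make_demo_db and invoice_total are assumptions made up for illustration; only the Invoice and InvoiceTransaction table names come from the question.

```python
import sqlite3


def make_demo_db():
    """Build a small in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Invoice (id INTEGER PRIMARY KEY);
        CREATE TABLE InvoiceTransaction (
            id INTEGER PRIMARY KEY,
            invoiceId INTEGER NOT NULL REFERENCES Invoice(id),
            amount NUMERIC NOT NULL
        );
        INSERT INTO Invoice (id) VALUES (1), (2);
        INSERT INTO InvoiceTransaction (invoiceId, amount)
        VALUES (1, 100), (1, 50), (2, 75);
    """)
    return conn


def invoice_total(conn, invoice_id):
    """Return the summed transaction amount for one invoice (0 if it has none).

    The SQL lives only here; callers pass the invoice id as a bound parameter.
    """
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) "
        "FROM InvoiceTransaction WHERE invoiceId = ?",
        (invoice_id,),
    ).fetchone()
    return row[0]


conn = make_demo_db()
print(invoice_total(conn, 1))
```

Every caller that needs the total goes through invoice_total, so a change to the calculation is made in one place.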

I think we would need an example of a query. Stored procs might be a good option. An alternative might be to use views. One advantage of having your queries in views or stored procs is that you can often use the database to see where your tables are used. The disadvantage is that you are locking yourself into one database; however, you are probably doing this anyway.
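As a rough illustration of the view option, here is a sketch in Python with sqlite3. The view name InvoiceTotal, the customer column and the sample rows are assumptions, and SQLite stands in for whatever database the application really uses.

```python
import sqlite3


def make_demo_db():
    """In-memory database with a view that holds the shared calculation."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Invoice (id INTEGER PRIMARY KEY, customer TEXT NOT NULL);
        CREATE TABLE InvoiceTransaction (
            invoiceId INTEGER NOT NULL,
            amount NUMERIC NOT NULL
        );
        -- The SUM that was copied into many queries is defined once here.
        CREATE VIEW InvoiceTotal AS
            SELECT I.id AS invoiceId,
                   I.customer,
                   COALESCE(SUM(IT.amount), 0) AS total
            FROM Invoice I
            LEFT JOIN InvoiceTransaction IT ON IT.invoiceId = I.id
            GROUP BY I.id, I.customer;
        INSERT INTO Invoice VALUES (1, 'acme'), (2, 'globex'), (3, 'initech');
        INSERT INTO InvoiceTransaction VALUES (1, 100), (1, 50), (2, 75);
    """)
    return conn


conn = make_demo_db()

# Two unrelated queries reuse the view instead of repeating the subquery.
all_totals = conn.execute(
    "SELECT invoiceId, total FROM InvoiceTotal ORDER BY invoiceId"
).fetchall()
large_customers = conn.execute(
    "SELECT customer FROM InvoiceTotal WHERE total >= 100 ORDER BY customer"
).fetchall()
print(all_totals, large_customers)
```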

Related

What is the best way to call stored proc for each row?

I am trying to copy a set of tables to another set with the same schema as the source.
I wrote a stored proc, in SQL, that receives an ID from TableA and copies all of tables B-G for that ID.
Now I want to call that stored proc for each row of TableA. I can use a CURSOR or a WHILE loop for this, but I have read that CURSOR is not recommended and that WHILE is slower than CURSOR.
Is there another way, or is CURSOR/WHILE the solution in this case?
Thank you
CURSOR / WHILE is fine in this instance - there isn't a better way to call a sproc per row. If the performance of this is likely to have an impact on the system, though, be careful when you run it.
There is a better alternative if you can code it up - and that's to perform all the "copying" for the records in TableA and below in a bunch of SQL statements, avoiding cursors altogether. To summarise this suggestion - set-based rather than row-based.
Unless there's a significant advantage in having this copying performed in the stored proc, you'd be better off re-writing the copying inline in your current script.
If you're wondering how this might be achieved, if you're having to maintain foreign keys, work with identity columns, etc, you might check out my answer to How Can I avoid using a cursor....
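To make the set-based idea concrete, here is a small sketch in Python with sqlite3. The tables TableA_copy and TableB_copy, their columns and the function copy_set_based are hypothetical, and the sketch assumes the original IDs can be kept as-is, so it does not deal with the identity-column remapping mentioned above.

```python
import sqlite3


def make_demo_db():
    """Source tables plus empty copies with the same (assumed) schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE TableA (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE TableB (id INTEGER PRIMARY KEY, a_id INTEGER, value TEXT);
        CREATE TABLE TableA_copy (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE TableB_copy (id INTEGER PRIMARY KEY, a_id INTEGER, value TEXT);
        INSERT INTO TableA VALUES (1, 'first'), (2, 'second');
        INSERT INTO TableB VALUES (10, 1, 'x'), (11, 1, 'y'), (12, 2, 'z');
    """)
    return conn


def copy_set_based(conn):
    """Copy every TableA row and its TableB children, one statement per table.

    No cursor or loop over TableA: each INSERT ... SELECT handles all rows.
    """
    with conn:  # single transaction: both copies commit together or not at all
        conn.execute(
            "INSERT INTO TableA_copy (id, name) SELECT id, name FROM TableA"
        )
        conn.execute(
            "INSERT INTO TableB_copy (id, a_id, value) "
            "SELECT B.id, B.a_id, B.value "
            "FROM TableB B JOIN TableA A ON A.id = B.a_id"
        )


conn = make_demo_db()
copy_set_based(conn)
print(conn.execute("SELECT COUNT(*) FROM TableB_copy").fetchone()[0])
```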

Best Practice: One Stored Proc that always returns all fields or different stored procedure for each field set needed?

If I have a table with Field 1, Field 2, Field 3, Field 4 and for one instance need just Field 1 and Field 2, but another I need Field 3 and Field 4 and yet another need all of them...
Is it better to have a SP for each combination I need or one SP that always returns them all?
Very important question:
Writing many stored procs that run the same query will make you spend a lot of time documenting and apologising to future maintainers.
For every time anyone wants to introduce a change, they have to consider whether it should apply to all stored procs, or to some or to one only...
I would do only one stored proc.
I would just have one Stored Procedure as it will be easier to maintain.
Does it need to be a Stored Procedure? You could rewrite it as a View then simply select the columns that you need.
If network bandwidth and memory usage are more important than hours of work and project simplicity, then make a separate SP for each task. Otherwise there's no point. (The gains aren't that great, and are noticeable only when the rowset is extremely large or there are a lot of simultaneous requests.)
As a general rule it is good practice to select only the columns we need to serve a particular purpose. This is particularly true for tables which have:
lots of columns
LOB columns
sensitive or restricted data
However, if we have a complicated system with lots of tables it is obviously impractical to build a separate stored procedure for each distinct query. In fact it is probably undesirable to do so. The resultant API would be overwhelming to use and a lot of effort to maintain.
The solutions are several and various, and really depend on the nature of the applications. Views can help, although they share some of the same maintenance issues. Dynamic SQL is another approach. We can write complicated procedures which return many different result sets depending on the input parameters. Heck, sometimes we can even write SQL statements in the actual application.
Oh, and there is the simple procedure which basically wraps a SELECT * FROM some_table but that comes with its own suite of problems.
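One way to picture the dynamic SQL approach is a single routine that builds the column list from a fixed whitelist. This is only a sketch in Python with sqlite3: the column names field1 to field4, the id column and the function select_fields are assumptions, with some_table borrowed from the text above.

```python
import sqlite3

# Only these column names may ever be spliced into the SQL text.
ALLOWED_COLUMNS = ("field1", "field2", "field3", "field4")


def make_demo_db():
    """In-memory table with four fields (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE some_table (
            id INTEGER PRIMARY KEY,
            field1 INTEGER, field2 TEXT, field3 INTEGER, field4 TEXT
        );
        INSERT INTO some_table VALUES (1, 10, 'a', 100, 'x'), (2, 20, 'b', 200, 'y');
    """)
    return conn


def select_fields(conn, columns):
    """Return only the requested columns, ordered by id.

    Column names cannot be bound as parameters, so they are checked
    against the whitelist before being placed in the statement.
    """
    if not columns:
        raise ValueError("at least one column is required")
    unknown = [c for c in columns if c not in ALLOWED_COLUMNS]
    if unknown:
        raise ValueError(f"unknown columns: {unknown}")
    sql = "SELECT " + ", ".join(columns) + " FROM some_table ORDER BY id"
    return conn.execute(sql).fetchall()


conn = make_demo_db()
print(select_fields(conn, ["field1", "field2"]))
print(select_fields(conn, ["field3", "field4"]))
```

The trade-off is the one described above: a single routine to maintain, at the cost of building SQL text at run time.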

T-SQL optimizing performance of various stored procedures question

I have written several stored procedures that act on individual rows of data by taking in an ID number. I would like to keep several stored procedures that can call this stored procedure at different levels of my database schema. For instance, when a row is inserted I call this stored procedure. When something else is modified I would like to call this stored procedure for each line. This is so I can have one set of base code that can be called everywhere else but that acts on different amounts of data. I have been able to produce this result with cursors, but I am told these are very inefficient. Is there any other way to produce this kind of functionality without sacrificing performance? Thanks.
Yes. Use standard joins to operate on sets rather than RBAR (Row By Agonising Row). i.e. Rather than call a function for each row, design a join that performs the required operation on every applicable row as a set operation.
I often see devs use the 'function operates on each row' approach, and although this seems to be the obvious way to encapsulate logic, it doesn't perform well on SQL Server or most DB engines.
In some circumstances, a table-valued function can be used effectively (MS SQL Server).
(BTW, you are correct in saying cursors are inefficient).
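The difference between the two styles can be sketched as follows, in Python with sqlite3. The Orders and OrderLine tables and all function names are hypothetical; the row-by-row version plays the role of the per-ID stored procedure called in a loop, and the set-based version does the same work in one statement.

```python
import sqlite3


def make_demo_db():
    """Orders with a stored total that must be recalculated from the lines."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Orders (id INTEGER PRIMARY KEY, total NUMERIC NOT NULL DEFAULT 0);
        CREATE TABLE OrderLine (
            orderId INTEGER NOT NULL,
            qty INTEGER NOT NULL,
            price NUMERIC NOT NULL
        );
        INSERT INTO Orders (id) VALUES (1), (2), (3);
        INSERT INTO OrderLine VALUES (1, 2, 10), (1, 1, 5), (2, 3, 4);
    """)
    return conn


def recalc_one(conn, order_id):
    """Per-row routine: recalculate the total of a single order."""
    conn.execute(
        "UPDATE Orders SET total = ("
        "  SELECT COALESCE(SUM(qty * price), 0) FROM OrderLine WHERE orderId = ?"
        ") WHERE id = ?",
        (order_id, order_id),
    )


def recalc_all_row_by_row(conn):
    """RBAR: fetch every id, then issue one UPDATE per order."""
    ids = [row[0] for row in conn.execute("SELECT id FROM Orders")]
    for order_id in ids:
        recalc_one(conn, order_id)


def recalc_all_set_based(conn):
    """Set-based: one UPDATE recalculates every order."""
    conn.execute(
        "UPDATE Orders SET total = ("
        "  SELECT COALESCE(SUM(L.qty * L.price), 0)"
        "  FROM OrderLine L WHERE L.orderId = Orders.id"
        ")"
    )


def totals(conn):
    return conn.execute("SELECT id, total FROM Orders ORDER BY id").fetchall()


a, b = make_demo_db(), make_demo_db()
recalc_all_row_by_row(a)
recalc_all_set_based(b)
print(totals(a) == totals(b))
```

Both versions leave the table in the same state; the set-based one sends a single statement regardless of how many orders exist.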

Best to use SQL + JOINS or Stored Proc to combine results from multiple queries?

I am creating a table to summarize data that is gathered from about 8 or so queries that have very light logic/WHERE clauses and all select against different tables.
I was wondering what the best option would be to fetch the summarized data:
One query with multiple JOINS to gather all relevant information
A stored proc that encapsulates the logic and maybe executes the 8 queries and does the "joining" in some other way? This seems more modular and maintainable to me...but I'm not sure.
I am using SQL Server 2008 for this. Any suggestions?
If you can, use ordinary SQL methods. Databases are optimized to run them. This "joining in some other way" would probably require a cursor, which slows everything down. Just let the db do its job. If you need more performance, examine the execution plan and do what has to be done there (e.g. adding indexes).
Databases are pretty good at figuring out the optimal way of executing SQL. It is what they are designed to do. Using stored procedures to load the data in chunks and combining it yourself will be more complex to write, and likely to be less efficient than letting the database just do it for you.
If you are concerned about reusing a complex query in multiple places, consider creating a view of it instead.
Depending on the size of the tables, joining 8 of them could be pretty hairy. I would try it that way first - as others have said, the db is pretty good at figuring this stuff out. If the performance is not as good as you would like, I would try a stored proc which creates a table variable (or a temp table) and inserts the data from each of the 8 tables separately. Then you can return the contents of the table variable to your app.
This method also makes it a little easier to add the 9th, 10th, etc tables in the future. And it gives you an easy way to do any processing you may need on the summarized data before returning it to your app.
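A sketch of the temp-table variant, in Python with sqlite3. SQLite has no stored procedures or table variables, so a function and a TEMP table stand in for them here; the three source tables, the Summary layout and the name build_summary are assumptions, and a real version would cover all eight sources.

```python
import sqlite3


def make_demo_db():
    """Three small source tables (hypothetical) to summarise."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Customers (id INTEGER PRIMARY KEY);
        CREATE TABLE Orders (id INTEGER PRIMARY KEY);
        CREATE TABLE Products (id INTEGER PRIMARY KEY);
        INSERT INTO Customers VALUES (1), (2);
        INSERT INTO Orders VALUES (1), (2), (3);
        INSERT INTO Products VALUES (1);
    """)
    return conn


def build_summary(conn):
    """Fill a temp table one source at a time, then return its contents."""
    conn.executescript("""
        DROP TABLE IF EXISTS temp.Summary;
        CREATE TEMP TABLE Summary (source TEXT NOT NULL, row_count INTEGER NOT NULL);
        -- One INSERT per source; a 9th or 10th source is one more line.
        INSERT INTO Summary SELECT 'Customers', COUNT(*) FROM Customers;
        INSERT INTO Summary SELECT 'Orders', COUNT(*) FROM Orders;
        INSERT INTO Summary SELECT 'Products', COUNT(*) FROM Products;
    """)
    return conn.execute(
        "SELECT source, row_count FROM Summary ORDER BY source"
    ).fetchall()


conn = make_demo_db()
print(build_summary(conn))
```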

SQL-Server Performance: What is faster, a stored procedure or a view?

What is faster in SQL Server 2005/2008, a Stored Procedure or a View?
EDIT:
As many of you pointed out, I am being too vague. Let me attempt to be a little more specific.
I wanted to know the performance difference for a particular query in a View, versus the exact same query inside a stored procedure.
(I still appreciate all of the answers that point out their different capabilities)
Stored Procedures (SPs) and SQL Views are different "beasts" as stated several times in this post.
If we exclude some [typically minor, except for fringe cases] performance considerations associated with the caching of the query plan, the time associated with binding to a Stored Procedure and such, the two approaches are on the whole equivalent, performance-wise. However...
A view is limited to whatever can be expressed in a single SELECT statement (well, possibly with CTEs and a few other tricks), but in general, a view is tied to declarative forms of queries. A stored procedure, on the other hand, can use various procedural constructs (as well as declarative ones), and as a result, using SPs, one can hand-craft a way of solving a given query which may be more efficient than what SQL Server's query optimizer would have done (on the basis of a single declarative query). In these cases, an SP may be much faster (but beware: the optimizer is quite smart, and it doesn't take much to make an SP much slower than the equivalent view).
Aside from these performance considerations, the SPs are more versatile and allow a broader range of inquiries and actions than the views.
Unfortunately, they're not the same type of beast.
A stored procedure is a set of T-SQL statements, and CAN return data. It can perform all kinds of logic, and doesn't necessarily return data in a resultset.
A view is a representation of data. It's mostly used as an abstraction of one or more tables with underlying joins. It's always a resultset of zero, one or many rows.
I suspect your question is more along the lines of:
Which is faster: SELECTing from a view, or the equivalent SELECT statement in a stored procedure, given the same base tables performing the joins with the same where clauses?
This isn't really an answerable question, in that no single answer will hold true in all cases. However, as a general answer for an SQL Server specific implementation...
In general, a Stored Procedure stands a good chance of being faster than a direct SQL statement because the server does all sorts of optimizations when a stored procedure is saved and executed the first time.
A view is essentially a saved SQL statement.
Therefore, I would say that in general, a stored procedure will be likely to be faster than a view IF the SQL statement for each is the same, and IF the SQL statement can benefit from optimizations. Otherwise, in general, they would be similar in performance.
See these links for documentation supporting my answer.
http://www.sql-server-performance.com/tips/stored_procedures_p1.aspx
http://msdn.microsoft.com/en-us/library/ms998577.aspx
Also, if you're looking for all the ways to optimize performance on SQL Server, the second link above is a good place to start.
In short, based on my experience with some complex queries, a stored procedure gives better performance than a function.
But you cannot use results of stored procedure in select or join queries.
If you don't want to use the result set in another query, better to use SP.
And rest of the details and differences are mentioned by people in this forum and elsewhere.
I prefer stored procedures because they allow greater control over data. If you want to build a good, secure, modular system then use stored procedures: they can run multiple SQL commands, have control-of-flow statements and accept parameters. Everything you can do in a view you can do in a stored procedure, but a stored procedure gives you much more flexibility.
I believe that another way of thinking would be to use stored procedures to select the views. This will make your architecture a loosely coupled system. If you decide to change the schema in the future, you won't have to worry 'so' much that it will break the front end.
I guess what I'm saying is instead of sp vs views, think sp and views :)
Stored procedures and views are different and have different purposes. I look at views as canned queries. I look at stored procedures as code modules.
For example let's say you have a table called tblEmployees with these two columns (among others): DateOfBirth and MaleFemale.
A view called viewEmployeesMale which filters out only male employees can be very useful. A view called viewEmployeesFemale is also very useful. Both of these views are self describing and very intuitive.
Now, let's say you need to produce a list of all male employees between the ages of 25 and 30. I would tend to create a stored procedure to produce this result. While it most certainly could be built as a view, in my opinion a stored procedure is better suited for dealing with this. Date manipulation, especially where nulls are a factor, can become very tricky.
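The split between a canned view and a parameterised routine might look like the sketch below, in Python with sqlite3. A Python function stands in for the stored procedure, and the Name column, the 'M'/'F' encoding of MaleFemale, ISO-formatted date strings and both function names are assumptions added for illustration.

```python
import sqlite3
from datetime import date


def make_demo_db():
    """tblEmployees plus the viewEmployeesMale view (assumed columns and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tblEmployees (Name TEXT, DateOfBirth TEXT, MaleFemale TEXT);
        CREATE VIEW viewEmployeesMale AS
            SELECT Name, DateOfBirth FROM tblEmployees WHERE MaleFemale = 'M';
        INSERT INTO tblEmployees VALUES
            ('Al',  '1999-06-15', 'M'),
            ('Bo',  '1999-06-16', 'M'),
            ('Cy',  '1993-06-15', 'M'),
            ('Di',  '1993-06-16', 'M'),
            ('Ed',  NULL,         'M'),
            ('Fay', '1996-01-01', 'F');
    """)
    return conn


def years_before(day, years):
    """Same calendar day `years` earlier; 29 Feb falls back to 28 Feb."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:
        return day.replace(year=day.year - years, day=28)


def male_employees_between_ages(conn, min_age, max_age, today):
    """Names of male employees aged min_age..max_age (inclusive) on `today`.

    Rows with a NULL DateOfBirth are excluded explicitly.
    """
    latest_dob = years_before(today, min_age).isoformat()
    earliest_dob_exclusive = years_before(today, max_age + 1).isoformat()
    return conn.execute(
        "SELECT Name FROM viewEmployeesMale "
        "WHERE DateOfBirth IS NOT NULL "
        "  AND DateOfBirth <= ? AND DateOfBirth > ? "
        "ORDER BY Name",
        (latest_dob, earliest_dob_exclusive),
    ).fetchall()


conn = make_demo_db()
print(male_employees_between_ages(conn, 25, 30, date(2024, 6, 15)))
```

The view answers the fixed question ("which employees are male"), while the parameters and the date arithmetic stay in the routine that calls it.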
I know I'm not supposed to turn this into a "discussion", but I'm very interested in this and just thought I'd share my empirical observations of a specific situation, with particular reference to all the comments above which state that an equivalent SELECT statement executed from within a Stored Procedure and a View should have broadly the same performance.
I have a view in database "A" which joins 5 tables in a separate database (db "B"). If I attach to db "A" in SSMS and SELECT * from the view, it takes >3 minutes to return 250000 rows. If I take the select statement from the design page of the view and execute it directly in SSMS, it takes < 25 seconds. Putting the same select statement into a stored procedure gives the same performance when I execute that procedure.
Without making any observations on the absolute performance (db "B" is an AX database which we are not allowed to touch!), I am still absolutely convinced that in this case using an SP is an order of magnitude faster than using a View to retrieve the same data, and this applies to many other similar views in this particular case.
I don't think it's anything to do with creating a connection to the other db, unless by using a view it somehow can never cache the connection whereas the select does, because I can switch between the 2 selects in the same SSMS window repeatedly and the performance of each query remains consistent. Also, if I connect directly to db "B" and run the select without the dbname.dbo.... refs, it takes the same time.
Any thoughts anyone?
Views:
We can create an index on a view (not possible for a stored proc).
It's easy to give other DBAs/users an abstracted view of table data (access to only a limited set of columns across multiple tables).
Stored Procedure:
We can pass parameters to an SP (not possible with views).
We can execute multiple statements inside a procedure (such as insert, update and delete operations).
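A couple of other considerations: while performance between an SP and a view is essentially the same (given they are performing the exact same select), the SP gives you more flexibility for that same query.
The SP will support ordering the result set, i.e. including an ORDER BY clause. You cannot rely on that in a view: SQL Server does not allow ORDER BY in a view definition unless TOP, OFFSET or FOR XML is also specified, and even then the order of rows returned when selecting from the view is not guaranteed.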
The SP is fully compiled and requires only an exec to invoke it. The view still requires a SELECT * FROM view to invoke it; i.e., a select on the compiled select in the view.
Found a detailed performance analysis: https://www.scarydba.com/2016/11/01/stored-procedures-not-faster-views/
Compile Time Comparison:
There is a difference in the compile time between the view by itself and the stored procedures (they were almost identical). Let’s look at performance over a few thousand executions:
View AVG: 210.431431431431
Stored Proc w/ View AVG: 190.641641641642
Stored Proc AVG: 200.171171171171
This is measured in microseconds, so the variation we’re seeing is likely just some disparity in I/O, CPU or something else, since the differences are trivial at about 10 microseconds, or 5%.
What about execution time including compile time, since there is a difference:
Query duration
View AVG: 10089.3226452906
Stored Proc AVG: 9314.38877755511
Stored Proc w/ View AVG: 9938.05410821643
Conclusion:
With the exception of the differences in compile time, we see that views actually perform exactly the same as stored procedures, if the query in question is the same.
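For anyone who wants to repeat this kind of measurement on their own data, the averaging can be sketched as below in Python with sqlite3. This is not the harness used in the linked article: SQLite has no stored procedures, so it only compares selecting from a view with running the same query inline, and the schema, the generated rows and the function names are all assumptions. Timings from it say nothing about SQL Server.

```python
import sqlite3
import statistics
import time

VIEW_SQL = "SELECT invoiceId, total FROM InvoiceTotal ORDER BY invoiceId"
INLINE_SQL = (
    "SELECT invoiceId, SUM(amount) AS total "
    "FROM InvoiceTransaction GROUP BY invoiceId ORDER BY invoiceId"
)


def make_demo_db():
    """1000 generated transactions spread over 50 invoices, plus a view."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE InvoiceTransaction (invoiceId INTEGER NOT NULL, amount INTEGER NOT NULL);
        CREATE VIEW InvoiceTotal AS
            SELECT invoiceId, SUM(amount) AS total
            FROM InvoiceTransaction GROUP BY invoiceId;
    """)
    conn.executemany(
        "INSERT INTO InvoiceTransaction VALUES (?, ?)",
        [(i % 50, i) for i in range(1000)],
    )
    conn.commit()
    return conn


def mean_duration_us(conn, sql, runs=1000):
    """Average wall-clock time of one execution, in microseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        conn.execute(sql).fetchall()
        samples.append((time.perf_counter() - start) * 1_000_000)
    return statistics.mean(samples)


conn = make_demo_db()
print("view  :", mean_duration_us(conn, VIEW_SQL))
print("inline:", mean_duration_us(conn, INLINE_SQL))
```

Averaging over many runs, as the article does, matters because a single execution is easily distorted by caching and other activity on the machine.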