I have never undertood the basic difference (if there is any) between these two terms "process" and "procedure", could you help me out? it can be answered in programming-terms or in any other terms you like.
A process involves procedures, because the process is the whole, while the procedure is the part. In some languages (like vb, sql) procedure is a method which does not return values, in counterpart to the function that return values. Also in computing a process means a program that is being executed or at least is loaded in memory.
Process is business oriented (it can be represented by a workflow diagram), normally includes a set of business rules, while the procedure is algorithm oriented (it can be represented by a flow diagram).
See:
http://en.wikipedia.org/wiki/Procedure_(term)
http://en.wikipedia.org/wiki/Process_(computing)
Here are the definitions for both terms provided by the Information Technology Infrastructure Library (ITIL):
Procedure: A Document containing steps that specify how to achieve an
Activity. Procedures are defined as
part of Processes. See Work
Instruction.
Process: A structured set of activities designed to accomplish a
specific Objective. A Process takes
one or more defined inputs and turns
them into defined outputs. A Process
may include any of the Roles,
responsibilities, tools and management
Controls required to reliably deliver
the outputs. A Process may define
Policies, Standards, Guidelines,
Activities, and Work Instructions if
they are needed.
I found this link which I think sums it up Process versus Procedures
I think the first two comparisons are crucial and give a good idea of what the rest elaborate on:
Procedures are driven by completion of the task
Processes are driven by achievement of a desired outcome
Procedures are implemented
Processes are operated
In the sicp book, there is a section: 1.2 Procedures and the Processes They Generate
And the description of procedure may help understand:
A procedure is a pattern for the local evolution of a computational process. It specifies how each stage of the process is built upon the previous stage. We would like to be able to make statements about the overall, or global, behavior of a process whose local evolution has been specified by a procedure. This is very difficult to do in general, but we can at least try to describe some typical patterns of process evolution.
Per my understanding, a procedure is about how to program to resolve your problems with the program language while a process is what the computer need to do according to your defined procedure.
Policy is a rule or regulation for a task.
Process is a high level view on how to achieve task, simply it is a way.
Procedure is an instruction to perform an activity within a process.
Related
I'm wondering if there is a language that supports programming in the following style that I think of as "data driven":
Imagine a language like C or Python except that it's possible to define a function whose input parameters are bound to particular variables. Then, whenever one of those input variables changes, the function is re-run.
Alternatively, the function might only ever be run when its output is needed, regardless of when its inputs are changed.
The first option is useful when things need to be kept up to date whenever something changes. The second option is useful when the computation is expensive, and you want to run it only when needed.
In my mind, this type of programming paradigm would require many stacks, perhaps one for each function that was defined in the above manner. This would allow those functions to be run in any order, and it would allow their execution to be occassionally blocked. Execution on one stack would be blocked anytime it needs the output of a function whose inputs are not yet ready.
This would be a single-threaded application. The runtime system would take care of switching from one stack to another in an appropriate manner. Deadlocks would be possible.
I have a medium-sized app written in Ruby, which makes pretty heavy use of a RDBMS. As our code grows, I found the ugly SQL statements are spreading to all modules and methods in my app and embedded in many application logic. I am not sure if this is bad, however, my gut tells me this is quite ugly...
So generally in any languages, how do you manage your SQL statements? Or do you think it is harmful for maintainibility to let many SQL statements embedded in the application logic? Why or why not?
Thanks.
SQL is a language for accessing databases. Often, it gets confused as being the API into the data store for a larger application. In fact, you should design a real API between the data store and the app.
The means several things.
For accessing data stored in tables, you want to go through views in the database, rather than directly access the tables.
For data modification steps, you want to wrap insert/update/delete in stored procedures. This has secondary benefits, where you can handle constraints and triggers in the stored procedure and better log what is happening.
For security, you want to include database security as part of your security architecture. Giving all users full access may not be the best approach.
Unfortunately, it is easy to write a simple app that uses a database directly, whether in java or ruby or VBA or whatever. This grows into a bigger app, and then the maintenance problems arise.
I would suggest an incremental approach to fixing this. Go through the code and create views where you have nasty select statements. You'll probably find you need many fewer views than selects (the views can be re-used -- a good thing).
Find places where code is being modified, and change these to stored procedures. I always return status from the stored procedure for error checking and put log information into a table called someting like splog or _spcalls.
If you want to limit permissions for different users of your app, then you might be interested in this.
Leaving the raw SQL statements in the code is a problem. Just wait until you want to rename a column and you have to find all the places where this breaks the code.
Yes, this is not optimal - maintenance becomes a nightmare; it's hard to forecast and determine which code must change when underlying DB changes occur. This is why it is good practice to create a data access layer (DAL) to encapsulate CRUD operations from the application logic. There is often an business logic layer (BLL) between the application logic and DAL to enforce business rules/logic.
Google "data access layer" "business logic layer" and even "n-tier architecture" to learn more.
If you are concerned about the SQL statements littered around your application logic, maybe consider implementing them as Stored Procedures?
That way you will only be including the procedure name and any parameters that need to be passed to it in your code.
It has other benefits too, a common one being easier to re-use in multiple files.
There is much debate about speed and security of Stored Procedure and you will never get a definitive answer about that so I won't even open that can of worms.
Here is how you do this with Java: Create a class that encapsulates all access to the database. Add a method to the class for each query you need to run.
The answer for ruby will be similar to this.
It depends on the architecture of your application but a simple solution is to keep each sql in a file, qry.sql. For each Ruby module (or whatever is used in Ruby to aggregate related code) you can keep a folder SQL with these files. So, the collection of SQL folder/files form the data access layer of your application. The Ruby code provides the business layer. If your data model changes (field names, etc), you can do greps to identify the sql files that need changes. Anyway, definitely separate SQL from your logic code.
Often I have seen stored procs used for writing business logic in an application. Sometimes these procs will contain 1000+ lines of code. If I write a method/function in application code that contained 1000 lines it would be rightly criticised. Should long stored procs be broken down into separate procs, like methods in a class would be? What isn't this done more as it would certainly make code more usable.
It sounds to me like you're to the point where you need to start thinking about a service layer for your database. This will allow you to move the business logic into a more appropriate language for lots of procedural code, while still enforcing access to the database through your approved api.
First, I agree with Nat's answer: the tools (such as debuggers) for T-SQL debugging provide nowhere near the functionality that one finds for other languages.
Second, there are a number of potential challenges when passing values between stored procedures. Passing simple data types is straight-forward. Passing involved data types becomes more complex. Using temporary tables, XML, delimited strings, record sets, etc. require additional coding, create additional overhead, and have performance implications.
My "rule" is that if the input and output parameters can be handled with the standard methods (i.e. standard data types), then breaking up the stored procedure is warranted. If passing the input and output requires a lot of coding effort, then the stored procedure remains large.
I think "lines of code" is a poor measure of how reusable the code is. I think you need to take a much more qualitative look at these "long" procedures. I've had several long procedures in the past, and whether any of the code can be shortened and modularized really depends - is any of the logic really reused by other applications or is this more of a textbook desire? I am sure there are plenty of modules out there in enterprise applications that are more than 1000 lines of code and don't need to be criticized or broken down into smaller parts...
Does that mean the procedures you've seen that are 1000+ lines of code are justified? Of course not. I just wanted to stress that number of lines of code is not the only factor you should be looking at.
Absolutely, long stored procs should be broken down into short blocks of code that are re-usable and robust. Unfortunately, the language and tools used in database development don't support doing this in a practical manner.
When your SP is so big, you can definitelly say that this procedure 'does many things'. If so, you should do any of these things separatelly.
As I can see in my expirience, if you need to support this functionality, easy to refactor such SP one time than work with 1000 lines of spaghetti-code.
I have read many strong views (both for and against) SPs or DS.
I am writing a query engine in C++ (mySQL backend for now, though I may decide to go with a C++ ORM). I cant decide whether to write a SP, or to dynamically creat the SQL and send the query to the db engine.#
Any tips on how to decide?
Here's the simple answer:
If your programmers do both database and coding work, keep the SQL with the app. It's easier to maintain that way. Otherwise, let the DB guys handle it in SPs.
You have more control over the mechanisms outside the database. The biggest win for taking care of this outside the database is simply maintenance (in my mind). It'd be slightly hard to version control the SP vs the code you generate outside the database. One more thing to keep track of.
While we're on the topic, it's similar to handling data/schema migrations. It's annoyingly complex to version/handle schema migrations, if you don't already have a mechanism for this, you will have yet another thing you'll need to manage. It comes down to simply being easier to manage/version these things outside the database.
Consider the scenario where you have a bug in your SP. Now it needs to be changed, but then you hop over to another developers database/sandbox. What version is the sandbox and the SP? Now you have to track multiple versions.
One of the main differentiators is whether you are writing the "one true front end" or whether the database is the central piece of your application.
If you are going to have multiple front ends stored procedures make a lot of sense because you reduce your maintenance overhead. If you are writing only one interface, stored procedures are a pain, because you lose a lot of flexibility in changing your data set as your front end needs change, plus you now have to do code maintenance, version control, etc. in two places. Databases are a real pain to keep in sync with code repositories.
Finally, if you are coding for multiple databases (Oracle and SQL compatible code, for example), I'd avoid stored procedures completely.
You may in certain rare circumstances, after profiling, determine that some limited stored procedures are useful to you. This situation comes up way less than people think it does.
The main scenarios when you MUST have the SP is:
1) When you have very complex set of queries with heavy compile overhead and data drift low enough that recompiling is not needed on a regular basis.
2) When the "Only True" logic for accessing the specific data set is VERY complicated, needs to be accessed from several different codebases on different platforms (so writing multiple APIs in code is much more expensive).
Any other scenario, it's debatable, and can be decided one way or another.
I must also say that the other posters' arguments about versioning are not really such a big deal in my experience - having your SPs in version control is as easy as creating a "sql/db_name" directory structure and having easy basic "database release" script which releases the SP code from the version control location to the database. Every company I worked for had some kind of setup like this, central one run by DBAs or departmental one run by developers.
The one thing you want to avoid is to have your business logic spread across multiple tiers of your application. Database DDL and DML are difficult enough to keep in sync with an application code base as it is.
My recommendation is to create a good relational schema, but all your constraints and triggers so that the data retains integrity even if somebody goes to the database and tries to do something through some command line SQL.
Put all your business logic in an application or service that calls (static/dynamic) SQL then wraps the business functionality you are are trying to expose.
Stored-procedures have two purposes that I can think of.
An aid to simplifying data access.
The Stored Procedure does not have
any business logic in it, it just
knows about the structure of the
data and exposes an interface to
isolate accessing three tables and a
view just to get a single piece of
information.
Mapping the Domain Model to the Data
Model, Stored Procedures can assist
in making the Data Model look like a
given Domain Model.
After the program has been completed and has been profiled there are often performance issues with the pre 1.0 release. Stored procedures do offer batching of SQL without traffic needing to go back and forth between the DBMS and the Application. That being said in rare and extreme cases due to performance a few business rules might need to be migrated to the Stored-Procedure side. Make sure to document any exceptions to the architectural philosophy in multiple prominent places.
Stored Procedures are ideal for:
Creating reusable abstractions over complex queries;
Enforcing specific types of insertions/updates to tables (if you also deny permissions to the table);
Performing privileged operations that the logged-in user wouldn't normally be allowed to do;
Guaranteeing a consistent execution plan;
Extending the capabilities of an ORM (batch updates, hierarchy queries, etc.)
Dynamic SQL is ideal for:
Variable search arguments or output columns:
Optional search conditions
Pivot tables
IN clauses with user-specified values
ORM implementations (most can use SPs, but can't be built entirely on them);
DDL and administrative scripts.
They solve different problems, really. Use whichever one is more appropriate to the task at hand, and don't restrict yourself to just one or the other. After you work on database code for a while you'll start to get a more intuitive feel for these things; you'll find yourself banging together some rat's nest of strings for a query and think, "this should really go in a stored procedure."
Final note: Because this question implies a certain level of inexperience with SQL, I feel obliged to say, don't forget that you still need to parameterize your queries when you write dynamic SQL. Parameters aren't just for stored procedures.
DS is more flexible. SP approach makes your system more manageable.
Im building a web application which is a process management app. Several different employee types will be shown a list of Tasks to do and as each one completes a task then it is moved on to the next employee to work on.
The Task hierarchy is Batch > Load > Assembly > Part > Task. There are currently 8 rules for determining which Task should be worked on first for each Employee type. These rules apply to size of part, and also how that parts completion will affect the Hierarchy e.g if part A is completed then it completes an entire Batch where as Part B will not as there are still other parts remaining in its Batch to be completed.
Anyway that's the elevator pitch of how the system works. What i am trying to figure out is an efficient, fast and maintainable way of doing this bearing in mind the rules might change and more rules may be added.
Initially I was intending to let the DB (sql 2005) do all the heavy lifting but i have a worry that the more complicated rules will be difficult to implement with a DB. So an alternative is to pull out a list of Tasks into a middle tier and create a Collection of objects and the apply each of the rules to the Collection. I have no doubt that each rule could be translated to T-SQL in isolation but ordering by up to 8 criteria depending on the task type feels like a lot of hassle.
One benefit i can see with the the middle tier approach is that i can create a more loosely restricted system where Task flow can be changed, that would be more difficult in the DB i think.
So what would you guys recommend? Is there a third alternative i haven't thought of ?
EDIT[1] Just to qualify this a bit more, The DB is not expected to change from what i initially develop it in.
It is difficult to determine from the details in the question. However, putting your logic in a business logic (middle) layer will mean that your business rules can continue to use the same code no matter what the back-end database may be. At the moment you specify T-SQL but is it possible that in the future you will move to a non-SQL Server environment?
What platform? .NET 3.5 introduces LinqToSQL that may make this a moot point. You could use a strategy pattern to select from/build an appropriate query based on the task type, then let LINQ do the translation to SQL for you. That way you can build the query in code, but still have it actually executed on the DB.