SSIS job: passing a value between job steps - SQL

I have an SSIS job with multiple steps; each step runs a different package or project.
When I execute the job, I write a log entry to the database with a GUID, but each GUID is created inside its own package or project.
I need the same value for all the steps, to link all the steps together with one value/ID.
Any suggestions on how to do this?

SQL Agent jobs do not allow values to be passed into job steps.
But, you can rethink how your current invocation of SSIS packages works to meet your goals.
What if you added a precursor step to your SQL Agent job? Make it a T-SQL step and use it to generate the GUID you'd like your packages to share. Store it in either a one-row table or a key/value-style historical table.
CREATE TABLE dbo.CorrelateAgentToSSIS
(
jobid uniqueidentifier NOT NULL
, runid uniqueidentifier NOT NULL
, insert_date datetime NOT NULL CONSTRAINT DF__CorrelateAgentToSSIS__insert_date DEFAULT (GETDATE())
);
Three columns there. The first is the GUID that an instance of SQL Server Agent generates for the job. The second column is your tracking GUID.
Step 0 would look something like
declare @jobid uniqueidentifier = CONVERT(uniqueidentifier, $(ESCAPE_NONE(JOBID)))
-- Populate this however it needs to be done
, @myguid uniqueidentifier = newid()
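Step 0 should then persist that pair so the later steps can look it up. A minimal sketch, assuming you keep one current row per job in the table above:
-- keep a single current row per job (assumption; use the key/value style if you want history)
DELETE FROM dbo.CorrelateAgentToSSIS WHERE jobid = @jobid;
INSERT INTO dbo.CorrelateAgentToSSIS (jobid, runid) VALUES (@jobid, @myguid);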
Your job steps for SSIS will change a bit. Instead of using the native SSIS jobstep type, you're going to use the TSQL type and do something like this.
DECLARE @execution_id bigint
, @jobid uniqueidentifier = CONVERT(uniqueidentifier, $(ESCAPE_NONE(JOBID)));
DECLARE @runid uniqueidentifier = (SELECT TOP 1 runid FROM dbo.CorrelateAgentToSSIS AS CATS WHERE CATS.jobid = @jobid);
EXEC SSISDB.catalog.create_execution
@package_name = N'SomePackage.dtsx'
, @execution_id = @execution_id OUTPUT
, @folder_name = N'MyFolder'
, @project_name = N'MyProject'
, @use32bitruntime = False
, @reference_id = NULL;
-- ddl left as exercise to the reader
INSERT INTO dbo.RunToSSIS
SELECT
@runid
, @execution_id;
DECLARE @var0 smallint = 1;
EXEC SSISDB.catalog.set_execution_parameter_value
@execution_id
, @object_type = 50 -- 50 = system parameter (LOGGING_LEVEL)
, @parameter_name = N'LOGGING_LEVEL'
, @parameter_value = @var0;
-- This assumes you have a parameter defined in the SSIS packages to receive
-- the runid guid
EXEC SSISDB.catalog.set_execution_parameter_value
@execution_id
, @object_type = 30 -- 30 = package parameter; use 20 for a project parameter
, @parameter_name = N'RUN_ID'
, @parameter_value = @runid;
EXEC SSISDB.catalog.start_execution
@execution_id;
GO
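The dbo.RunToSSIS DDL that the comment above leaves as an exercise might look something like this sketch (column names are assumptions mirroring the INSERT):
CREATE TABLE dbo.RunToSSIS
(
runid uniqueidentifier NOT NULL
, execution_id bigint NOT NULL
);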
Finally, while you're collecting metrics, you might also want to think about linking a job run to the data the packages collect in the SSISDB. You can bridge that gap by recording the jobid/runid against the bigint execution_id. If you're running packages from the SSISDB, you can plumb in the system variable ServerExecutionID. I do this in the first step of every package with an Execute SQL Task. In packages run from Visual Studio, the value is 0; otherwise, it's the value you see in SSISDB.catalog.operations. Knowing those three things will allow you to see how the Agent job did, correlate it to your custom GUID and whatever metrics you collect, and pull apart performance data from the SSIS catalog.
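For that first-step Execute SQL Task, the statement can be a simple parameterized insert. A sketch, assuming an OLE DB connection and a hypothetical dbo.PackageRunLog table, with the ? placeholders mapped to System::ServerExecutionID and System::PackageName:
-- dbo.PackageRunLog is a made-up logging table
INSERT INTO dbo.PackageRunLog (execution_id, package_name, run_start)
VALUES (?, ?, GETDATE());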
https://dba.stackexchange.com/questions/13347/get-job-id-or-job-name-from-within-executing-job
https://dba.stackexchange.com/questions/38808/relating-executioninstanceguid-to-the-ssisdb

Related

SSIS is hanging during an UPDATE with 3 million rows

I'm implementing a new method for a warehouse. The new method consists of performing incremental loads between source and destination tables (insert, update or delete).
All the tables are working really well, except for one table whose source has more than 3 million rows; as you will see in the image below, it just starts running but never finishes.
Probably I'm not doing the update the correct way, or there is another way to do it.
Here are some pictures of my SSIS package:
The highlighted object is where it hangs.
This is the stored procedure I call to update the table:
ALTER PROCEDURE [dbo].[UpdateDim_A]
@ID INT
, @FileDataID INT
, @CategoryID SMALLINT
, @FirstName VARCHAR(50)
, @LastName VARCHAR(50)
, @Company VARCHAR(100)
, @Email VARCHAR(250)
AS
BEGIN
SET NOCOUNT ON;
BEGIN TRAN
UPDATE DIM_A
SET
[FileDataID] = @FileDataID,
[CategoryID] = @CategoryID,
[FirstName] = @FirstName,
[LastName] = @LastName,
[Company] = @Company,
[Email] = @Email
WHERE PartyID = @ID
COMMIT TRAN;
END
Note:
I already tried dropping the constraints and indexes and changing the recovery model of the database to simple.
Any help will be appreciated.
After applying the solution provided by @Prabhat G, this is how my package looks, running in 39 seconds (avg)!!!
Inside Dim_A DataFlow
Follow these 2 performance enhancers and you'll avoid your bottleneck.
Remove the Sort transformation. In your source, fetch the data with an ORDER BY in the SQL instead. The reason: Sort takes all the records into memory before sorting. You don't want that, be it an incremental or a full load.
In the last step of the update, introduce another staging table instead of the OLE DB Command update; make it a replica of the dimension table. Once all the matching records are inserted into this new staging table, exit the Data Flow Task and create an Execute SQL Task which simply UPDATEs the dimension table based on a join on ID/conditions.
The reason for this is that the OLE DB Command hits row by row. Always prefer updating via an Execute SQL Task, as it's a batch process.
Edit:
As per the comments, to update only changed rows in the Execute SQL Task, add the conditions to the where clause, e.g.:
UPDATE x
SET
x.attribute_A = y.attribute_A
,x.attribute_B = y.attribute_B
FROM
DimA x
inner join stg_DimA y
ON x.Id = y.Id
WHERE
(x.Attribute_A <> y.Attribute_A
OR x.Attribute_B <> y.Attribute_B)
So your problem is actually very simple: the method you are using executes that stored procedure for every row returned. If you have 9,961 rows to update (as in your picture), it will run that statement 9,961 separate times. Chances are, if you look at the active queries running on the SQL Server, you'll see that procedure executing over and over.
What you should do to speed this up is dump that data into a staging table, then use an Execute SQL Task further on in your package to run a standard SQL update. This will run much faster.
The problem is that you are trying to execute a stored procedure within the data flow. The correct SqlCommand is an explicit UPDATE query, with the columns from SSIS then mapped to the columns of the table you are updating.
UPDATE DIM_A
SET FileDataID = ?
,CategoryID = ?
,FirstName = ?
,LastName = ?
,Company = ?
,Email = ?
WHERE PartyID = ?
Note: the @ID value needs to be included as a column in your data flow.
One final thing you should consider, as Zane correctly pointed out: you should only update rows that have changed. So, in your data flow you should add a Conditional Split transformation that checks whether any of the columns in the new source row differ from the existing table row. Only rows that are different should be sent to the OLE DB Command; the rest can be disregarded.
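For reference, the Conditional Split condition might look something like this in SSIS expression syntax, assuming a Lookup brought the existing row in as lkp_-prefixed columns (hypothetical names):
(FirstName != lkp_FirstName) || (LastName != lkp_LastName) || (Email != lkp_Email)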

Get job that ran SQL query on UPDATE trigger

I am trying to create an audit trail for actions that are performed within a web application, by SQL Server Agent jobs, and by manually run queries against the database. I am trying to use triggers to catch updates, inserts and deletes on certain tables.
On the whole this process is working. For example, a user performs an update in the web application and the trigger writes the updated data to an audit trail table I have defined, including the username of the person who performed the action. This works fine from the web application or a manual query, but we also have dozens of SQL Server Agent jobs, and I would like to capture which one ran a specific query. Each of the agent jobs runs under the same username. That also works fine and inserts the username correctly into the table, but I can't find which job called the query.
My current "solution" was to find which jobs are currently running at the time of the trigger, as one of them must be the correct one. Using:
CREATE TABLE #xp_results
(
job_id UNIQUEIDENTIFIER NOT NULL,
last_run_date INT NOT NULL,
last_run_time INT NOT NULL,
next_run_date INT NOT NULL,
next_run_time INT NOT NULL,
next_run_schedule_id INT NOT NULL,
requested_to_run INT NOT NULL, -- BOOL
request_source INT NOT NULL,
request_source_id sysname COLLATE database_default NULL,
running INT NOT NULL, -- BOOL
current_step INT NOT NULL,
current_retry_attempt INT NOT NULL,
job_state INT NOT NULL
)
INSERT INTO #xp_results
EXECUTE master.dbo.xp_sqlagent_enum_jobs 1, 'sa'
DECLARE @runningJobs NVARCHAR(MAX); -- holds the comma-separated job names
SELECT @runningJobs = STUFF((SELECT ',' + j.name
FROM #xp_results r
INNER JOIN msdb..sysjobs j ON r.job_id = j.job_id
WHERE running = 1
FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 1, '')
DROP TABLE #xp_results
I ran a specific job to test, and it seems to work in that any OTHER job which is running will be listed in @runningJobs, but it doesn't record the job that runs it. I assume that by the time the trigger runs, the job has finished.
Is there a way I can find out which job called the query that kicked off the trigger?
EDIT: I tried changing the SELECT query above to get any job that ran within the past 2 minutes or is currently running. The SQL query is now:
SELECT @runningJobs = STUFF((SELECT ',' + j.name
FROM #xp_results r
INNER JOIN msdb..sysjobs j ON r.job_id = j.job_id
WHERE (last_run_date = CAST(REPLACE(LEFT(CONVERT(VARCHAR, getdate(), 120), 10), '-', '') AS INT)
AND last_run_time > CAST(REPLACE(LEFT(CONVERT(VARCHAR,getdate(),108), 8), ':', '') AS INT) - 200)
OR running = 1
FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 1, '')
When I run a job and then run the above query while the job is running, the correct jobs are returned. But when the SSIS package is run, either via the SQL Server Agent job or manually in SSIS, @runningJobs is not populated and just returns NULL.
So I am now thinking it is a problem with the permissions of SSIS and master.dbo.xp_sqlagent_enum_jobs. Any other ideas?
EDIT #2: Actually, I don't think it is a permissions error. There is an INSERT statement below this code; if it were a permissions error, the INSERT statement would not run and the audit line would not be added to the database. But a line IS added to the database, just without the runningJobs field populated. Strange times.
EDIT #3: I just want to clarify: I am searching for a solution which DOES NOT require me to go into each job and change anything. There are too many jobs for that to be feasible.
WORKING CODE IS IN FIRST EDIT - (anothershrubery)
Use the app_name() function (http://msdn.microsoft.com/en-us/library/ms189770.aspx) in your audit trigger to get the name of the app running the query.
For SQL Agent jobs, app_name() includes the job step id in the app name (if it is a T-SQL step). We do this in our audit triggers and it works great. An example of the app_name() result when read from within an audit trigger:
SQLAgent - TSQL JobStep (Job 0x96EB56A24786964889AB504D9A920D30 : Step 1)
This job can be looked up via the job_id column in msdb.dbo.sysjobs_view.
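If you want to resolve that hex job id back to a name inside the trigger, a sketch along these lines should work (the style-1 binary convert requires SQL Server 2008 or later):
DECLARE @app nvarchar(128) = APP_NAME();
DECLARE @jobid uniqueidentifier;
IF @app LIKE N'SQLAgent - TSQL JobStep (Job 0x%'
SET @jobid = CONVERT(uniqueidentifier,
CONVERT(varbinary(16),
-- 34 characters = '0x' plus 32 hex digits
SUBSTRING(@app, CHARINDEX(N'(Job ', @app) + 5, 34), 1));
SELECT name FROM msdb.dbo.sysjobs_view WHERE job_id = @jobid;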
Since SSIS packages initiate the SQL connection outside of the SQL Agent job engine, those connections will have their own application name, and you need to set the application name within the connection strings of the SSIS packages. In SSIS packages, Web apps, WinForms, or any client that connects to SQL Server, you can set the value that is returned by the app_name function by using this in your connection string :
"Application Name=MyAppNameGoesHere;"
http://www.connectionstrings.com/use-application-name-sql-server/
If the "Application Name" is not set within a .NET connection string, then the default value when using the System.Data.SqlClient.SqlConnection is ".Net SqlClient Data Provider".
Some other fields that are commonly used for auditing:
HOST_NAME(): http://technet.microsoft.com/en-us/library/ms178598.aspx Returns the name of the client computer that is connecting. This is helpful if you have an intranet app.
CONNECTIONPROPERTY('local_net_address'): For getting the client IP address.
CONTEXT_INFO(): http://technet.microsoft.com/en-us/library/ms187768.aspx You can use this to store information for the duration of the connection/session. Context_Info is a binary 128 byte field, so you might need to do conversions to/from strings when using it.
Here are SQL helper methods for setting/getting context info:
CREATE PROC dbo.usp_ContextInfo_SET
@val varchar(128)
AS
BEGIN
SET NOCOUNT ON;
DECLARE @c varbinary(128);
SET @c = CAST(@val AS varbinary(128));
SET CONTEXT_INFO @c;
END
GO
CREATE FUNCTION [dbo].[ufn_ContextInfo_Get] ()
RETURNS varchar(128)
AS
BEGIN
-- context_info is a binary data type, so it will pad values with CHAR(0)
-- to the end of the 128 bytes; these need replacing with the empty string.
RETURN REPLACE(CAST(CONTEXT_INFO() AS varchar(128)), CHAR(0), '')
END
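A quick usage sketch (the label is made up):
EXEC dbo.usp_ContextInfo_SET 'AuditDemo';
SELECT dbo.ufn_ContextInfo_Get(); -- returns 'AuditDemo' for this session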
EDIT:
app_name() is the preferred way to get the application involved in the query; however, since you do not want to update any of the SSIS packages, here is an updated query to get currently executing jobs using the following documented SQL Agent tables. You may have to adjust the SELECT GRANTs on these tables in the msdb database for the query to succeed, or create a view using this query and adjust the grants on that view.
msdb.dbo.sysjobactivity http://msdn.microsoft.com/en-us/library/ms190484.aspx
msdb.dbo.syssessions http://msdn.microsoft.com/en-us/library/ms175016.aspx
msdb.dbo.sysjobs http://msdn.microsoft.com/en-us/library/ms189817.aspx
msdb.dbo.sysjobhistory http://msdn.microsoft.com/en-us/library/ms174997.aspx
Query:
;with cteSessions as
(
--each time that SQL Agent is started, a new record is added to this table.
--The most recent session is the current session, and prior sessions can be used
--to identify the job state at the time that SQL Agent is restarted or stopped unexpectedly
select top 1 s.session_id
from msdb.dbo.syssessions s
order by s.agent_start_date desc
)
SELECT runningJobs =
STUFF(
( SELECT N', [' + j.name + N']'
FROM msdb.dbo.sysjobactivity a
inner join cteSessions s on s.session_id = a.session_id
inner join msdb.dbo.sysjobs j on a.job_id = j.job_id
left join msdb.dbo.sysjobhistory h2 on h2.instance_id = a.job_history_id
WHERE
--currently executing jobs:
h2.instance_id is null
AND a.start_execution_date is not null
AND a.stop_execution_date is null
ORDER BY j.name
FOR XML PATH(''), ROOT('root'), TYPE
).query('root').value('.', 'nvarchar(max)') --convert the xml to nvarchar(max)
, 1, 2, '') -- replace the leading comma and space with empty string.
;
EDIT #2:
Also, if you are on SQL 2012 or higher, check out the SSISDB.catalog.executions view (http://msdn.microsoft.com/en-us/library/ff878089(v=sql.110).aspx) to get the list of currently running SSIS packages, regardless of whether they were started from within a scheduled job. I have not seen an equivalent view in SQL Server versions prior to 2012.
I would add an extra column to your table, e.g. Update_Source, and get all the source apps (including SSIS) to set it when they update the table.
You could use USER as a DEFAULT for that column to minimize the changes needed.
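A sketch of that default (the table and column names are assumptions):
ALTER TABLE dbo.MyAuditedTable
ADD Update_Source sysname NOT NULL
CONSTRAINT DF_MyAuditedTable_Update_Source DEFAULT (USER);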
You could try using CONTEXT_INFO
Try adding a T-SQL step with SET CONTEXT_INFO 'A Job' to your job
Then try reading that in your trigger using sys.dm_exec_sessions
I'm curious to see if it works - please post your findings.
http://msdn.microsoft.com/en-us/library/ms187768(v=sql.105).aspx
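A minimal sketch of both halves (the 'NightlyLoadJob' label is made up; note the setting only lasts for the session of that job step, so it tags DML run directly by the step):
-- In a leading T-SQL job step:
DECLARE @ctx varbinary(128) = CAST('NightlyLoadJob' AS varbinary(128));
SET CONTEXT_INFO @ctx;
-- In the trigger, read the tag for the current session:
SELECT REPLACE(CAST(context_info AS varchar(128)), CHAR(0), '')
FROM sys.dm_exec_sessions
WHERE session_id = @@SPID;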

Clear data in Excel using SSIS

I have created a stored procedure in SQL which imports data from a flat file, updates the data and inserts the updated data into a table.
After some research, I found that the only way to export this (64-bit) table would be to create an SSIS package and use a SQL job to execute the package.
I have done all of this and managed to get the table data exported, but the problem is that it does not clear the data before the import. I have then created the following:
When dropping the Excel table, I have the following SQL statement: DROP TABLE [Sheet1$]
When creating the table, I have the following SQL statement:
CREATE TABLE 'Sheet1$'
(
BRANCH NVARCHAR(10) ,
SRCBRANCH NVARCHAR(10) ,
DEPARTMENT NVARCHAR(10) ,
GLCODE NVARCHAR(10) ,
DOCDATE NVARCHAR(10) ,
VALUE NVARCHAR(50) ,
ITEMREFERENCE NVARCHAR(100) ,
MISCREFERENCE NVARCHAR(100) ,
SUFFIX NVARCHAR(10) ,
NARRATIVE [NVARCHAR(100)
)
GO
After the table has been dropped, it clears all the data together with the header, and then fails on the second SQL task (Create Excel Table) with the following error message:
[Execute SQL Task] Error: Executing the query "CREATE TABLE 'Sheet1$' (
BRANCH NVARCHAR(10) ,
S..." failed with the following error: "Syntax error in CREATE TABLE statement.". Possible failure reasons: Problems with the query, "ResultSet" property not set correctly, parameters not set correctly, or connection not established correctly.
After the failure, I can't execute the package again due to a validation error. This is because the headers in my Excel sheet have been deleted.
Could someone please point me in the right direction? I have exhausted all options.
Regards
CREATE TABLE 'Sheet1$'
(
BRANCH NVARCHAR(10) ,
SRCBRANCH NVARCHAR(10) ,
DEPARTMENT NVARCHAR(10) ,
GLCODE NVARCHAR(10) ,
DOCDATE NVARCHAR(10) ,
VALUE NVARCHAR(50) ,
ITEMREFERENCE NVARCHAR(100) ,
MISCREFERENCE NVARCHAR(100) ,
SUFFIX NVARCHAR(10) ,
NARRATIVE [NVARCHAR(100)
)
GO
This is your syntax error: the stray "[" before NVARCHAR in the NARRATIVE line. This was asked quite a while ago, so I would think you have rectified it by now, but I wanted to post so that others know what the issue was.

Replication Custom resolver changes empty strings to NULLs

We have a C# application which posts to a database that is replicated to another database (using merge replication) and has one custom resolver, which is a stored procedure.
This was working fine under SQL Server 2000, but when testing under SQL Server 2005 the custom resolver attempts to change any empty varchar columns to NULLs (and fails, because this particular column does not allow nulls).
Note that these varchar fields are not the ones causing the conflict: they are currently empty on both databases, they are not being changed, and the stored procedure does not change them (all it does is attempt to set the value of another money column).
Has anyone come across this problem, or does anyone have an example of a stored procedure which will leave empty strings as they are?
The actual stored procedure is fairly simple and recalculates the customer balance in the event of a conflict.
ALTER procedure [dbo].[ReCalculateCustomerBalance]
@tableowner sysname,
@tablename sysname,
@rowguid varchar(36),
@subscriber sysname,
@subscriber_db sysname,
@log_conflict INT OUTPUT,
@conflict_message nvarchar(512) OUTPUT
AS
set nocount on
DECLARE
@CustomerID bigint,
@SysBalance money,
@CurBalance money,
@SQL_TEXT nvarchar(2000)
Select @CustomerID = customer.id from customer where rowguid = @rowguid
Select @SysBalance = Sum(SystemTotal), @CurBalance = Sum(CurrencyTotal) From CustomerTransaction Where CustomerTransaction.CustomerID = @CustomerID
Update Customer Set SystemBalance = IsNull(@SysBalance, 0), CurrencyBalance = IsNull(@CurBalance, 0) Where id = @CustomerID
Select * From Customer Where rowguid = @rowguid
Select @log_conflict = 0
Select @conflict_message = 'successful'
Return(0)
You have a few options here, each a bit of a workaround for what my research suggests is an issue with SQL Server.
1- Alter this statement: Select * From Customer Where rowguid = @rowguid to explicitly mention each of the columns, and use ISNULL for the offending fields.
2- Alter the column in the table to add a default constraint of ''. What this does is fill the column with the empty string instead of NULL when no value is supplied (see the sketch after this list).
3- Add a 'before insert' trigger which alters the data before the insert so it no longer contains a NULL.
PS: Are you positive that the replication system has that column marked as "required"? I think if it is not required, it will insert NULL if no data exists.
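For option 2, a minimal sketch (the table and column names are assumptions):
ALTER TABLE dbo.Customer
ADD CONSTRAINT DF_Customer_SomeText DEFAULT ('') FOR SomeTextColumn;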

Accessing data with stored procedures

One of the "best practice" is accessing data via stored procedures. I understand why is this scenario good.
My motivation is split database and application logic ( the tables can me changed, if the behaviour of stored procedures are same ), defence for SQL injection ( users can not execute "select * from some_tables", they can only call stored procedures ), and security ( in stored procedure can be "anything" which secure, that user can not select/insert/update/delete data, which is not for them ).
What I don't know is how to access data with dynamic filters.
I'm using MSSQL 2005.
If I have table:
CREATE TABLE tblProduct (
ProductID uniqueidentifier -- PK
, IDProductType uniqueidentifier -- FK to another table
, ProductName nvarchar(255) -- name of product
, ProductCode nvarchar(50) -- code of product for quick search
, Weight decimal(18,4)
, Volume decimal(18,4)
)
then I should create 4 stored procedures ( create / read / update / delete ).
The stored procedure for "create" is easy.
CREATE PROC Insert_Product ( @ProductID uniqueidentifier, @IDProductType uniqueidentifier, ... etc ... ) AS BEGIN
INSERT INTO tblProduct ( ProductID, IDProductType, ... etc ... ) VALUES ( @ProductID, @IDProductType, ... etc ... )
END
The stored procedure for "delete" is easy too.
CREATE PROC Delete_Product ( @ProductID uniqueidentifier, @IDProductType uniqueidentifier, ... etc ... ) AS BEGIN
DELETE tblProduct WHERE ProductID = @ProductID AND IDProductType = @IDProductType AND ... etc ...
END
The stored procedure for "update" is similar as for "delete", but I'm not sure this is the right way, how to do it. I think that updating all columns is not efficient.
CREATE PROC Update_Product( #ProductID uniqueidentifier, #Original_ProductID uniqueidentifier, #IDProductType uniqueidentifier, #Original_IDProductType uniqueidentifier, ... etc ... ) AS BEGIN
UPDATE tblProduct SET ProductID = #ProductID, IDProductType = #IDProductType, ... etc ...
WHERE ProductID = #Original_ProductID AND IDProductType = #Original_IDProductType AND ... etc ...
END
And the last one - the stored procedure for "read" - is a bit of a mystery to me. How do I pass filter values for complex conditions? I have a few suggestions:
Using an XML parameter for passing the where condition:
CREATE PROC Read_Product ( @WhereCondition XML ) AS BEGIN
DECLARE @SELECT nvarchar(4000)
SET @SELECT = 'SELECT ProductID, IDProductType, ProductName, ProductCode, Weight, Volume FROM tblProduct'
DECLARE @WHERE nvarchar(4000)
SET @WHERE = dbo.CreateSqlWherecondition( @WhereCondition ) -- dbo.CreateSqlWherecondition is some function which returns the text of a WHERE condition built from the passed XML
DECLARE @LEN_SELECT int
SET @LEN_SELECT = LEN( @SELECT )
DECLARE @LEN_WHERE int
SET @LEN_WHERE = LEN( @WHERE )
DECLARE @LEN_TOTAL int
SET @LEN_TOTAL = @LEN_SELECT + @LEN_WHERE
IF @LEN_TOTAL > 4000 BEGIN
-- RAISE SOME CONCRETE ERROR, BECAUSE DYNAMIC SQL ACCEPTS MAX 4000 CHARS
END
DECLARE @SQL nvarchar(4000)
SET @SQL = @SELECT + @WHERE
EXEC sp_executesql @SQL
END
But I think the limitation of 4000 characters for one query is ugly.
The next suggestion is using filter tables for every column: insert filter values into the filter table and then call the stored procedure with the IDs of the filters:
CREATE TABLE tblFilter (
PKID uniqueidentifier -- PK
, IDFilter uniqueidentifier -- identification of filter
, FilterType tinyint -- 0 = ignore, 1 = equals, 2 = not equals, 3 = greater than, etc ...
, BitValue bit , TinyIntValue tinyint , SmallIntValue smallint, IntValue int
, BigIntValue bigint, DecimalValue decimal(19,4), NVarCharValue nvarchar(4000)
, GuidValue uniqueidentifier, etc ... )
CREATE PROC Read_Product ( @Filter_ProductID uniqueidentifier, @Filter_IDProductType uniqueidentifier, @Filter_ProductName uniqueidentifier, ... etc ... ) AS BEGIN
SELECT ProductID, IDProductType, ProductName, ProductCode, Weight, Volume
FROM tblProduct
WHERE ( @Filter_ProductID IS NULL
OR ( ProductID IN ( SELECT GuidValue FROM tblFilter WHERE IDFilter = @Filter_ProductID AND FilterType = 1 )
AND NOT ( ProductID IN ( SELECT GuidValue FROM tblFilter WHERE IDFilter = @Filter_ProductID AND FilterType = 2 ) ) ) )
AND ( @Filter_IDProductType IS NULL
OR ( IDProductType IN ( SELECT GuidValue FROM tblFilter WHERE IDFilter = @Filter_IDProductType AND FilterType = 1 )
AND NOT ( IDProductType IN ( SELECT GuidValue FROM tblFilter WHERE IDFilter = @Filter_IDProductType AND FilterType = 2 ) ) ) )
AND ( @Filter_ProductName IS NULL OR ( ... etc ... ) )
END
But this suggestion is a little complicated, I think.
Is there some "best practice" for this type of stored procedure?
For reading data, you do not need a stored procedure for security or to separate out logic, you can use views.
Just grant only select on the view.
You can limit the records shown, change field names, join many tables into one logical "table", etc.
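A minimal sketch against the question's tblProduct (the role name is hypothetical):
CREATE VIEW dbo.vwProduct
AS
SELECT ProductID, IDProductType, ProductName, ProductCode, Weight, Volume
FROM tblProduct;
GO
GRANT SELECT ON dbo.vwProduct TO SomeAppRole;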
First: for your delete routine, your where clause should only include the primary key.
Second: for your update routine, do not try to optimize before you have working code. In fact, do not try to optimize until you can profile your application and see where the bottlenecks are. I can tell you for sure that updating one column of one row and updating all columns of one row are nearly identical in speed. What takes time in a DBMS is (1) finding the disk block where you will write the data and (2) locking out other writers so that your write will be consistent. Finally, writing the code necessary to update only the columns that need to change will generally be harder to do and harder to maintain. If you really wanted to get picky, you'd have to compare the speed of figuring out which columns changed compared with just updating every column. If you update them all, you don't have to read any of them.
Third: I tend to write one stored procedure for each retrieval path. In your example, I'd make one for the primary key, one for each foreign key, and then I'd add one for each new access path as I needed it in the application. Be agile; don't write code you don't need. I also agree with using views instead of stored procedures; however, you can use a stored procedure to return multiple result sets (in some versions of MSSQL) or to change rows into columns, which can be useful.
If you need to get, for example, 7 rows by primary key, you have some options. You can call the stored procedure that gets one row by primary key seven times. This may be fast enough if you keep the connection open between all the calls. If you know you never need more than a certain number (say 10) of IDs at a time, you can write a stored procedure that includes a where clause like "and ID in (arg1, arg2, arg3...)" and make sure that unused arguments are set to NULL. If you decide you need to generate dynamic SQL, I wouldn't bother with a stored procedure, because T-SQL is just as easy to make a mistake in as any other language. Also, you gain no benefit from using the database to do string manipulation -- it's almost always your bottleneck, so there is no point in giving the DB any more work than necessary.
I disagree that creating Insert/Update/Select stored procedures is a "best practice". Unless your entire application is written in SPs, use a database layer in your application to handle these CRUD activities. Better yet, use an ORM technology to handle them for you.
My suggestion is that you don't try to create a stored procedure that does everything you might need now or ever. If you need to retrieve a row based on the table's primary key, then write a stored procedure to do that. If you need to search for all rows meeting a set of criteria, then find out what those criteria might be and write a stored procedure to do that.
If you try to write software that solves every possible problem rather than a specific set of problems you will usually fail at providing anything useful.
Your select stored procedure can be written as follows to require only one stored proc but allow any number of different items in the where clause. Pass in any one or a combination of the parameters and you will get ALL items which match - so you only need one stored proc.
CREATE PROC sp_ProductSelect
(
@ProductID int = null,
@IDProductType int = null,
@ProductName varchar(50) = null,
@ProductCode varchar(10) = null,
...
@Volume int = null
)
AS
SELECT ProductID, IDProductType, ProductName, ProductCode, Weight, Volume FROM tblProduct
WHERE
((@ProductID is null) or (ProductID = @ProductID)) AND
((@ProductName is null) or (ProductName = @ProductName)) AND
...
((@Volume is null) or (Volume = @Volume))
SQL Server 2005 supports nvarchar(max), which has a 2 GB limit but accepts virtually all the normal nvarchar string operations. You may want to test whether this removes the length problem in the first approach.
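A sketch of the first approach rewritten with nvarchar(max), dropping the 4000-character guard (dbo.CreateSqlWherecondition is the question's own hypothetical helper):
CREATE PROC Read_Product ( @WhereCondition XML ) AS BEGIN
DECLARE @SQL nvarchar(max)
SET @SQL = 'SELECT ProductID, IDProductType, ProductName, ProductCode, Weight, Volume FROM tblProduct'
+ dbo.CreateSqlWherecondition( @WhereCondition )
EXEC sp_executesql @SQL
END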