I am trying to learn databases on my own; all of your comments are appreciated.
I have the following table.
CREATE TABLE AccountTable
(
AccountId INT IDENTITY(100,1) PRIMARY KEY,
FirstName NVARCHAR(50) NULL,
LastName NVARCHAR(50) NULL,
Street NVARCHAR(50) NULL,
StateId INT REFERENCES STATETABLE(StateId) NOT NULL
)
I would like to write a Stored procedure that updates the row. I imagine that the stored procedure would look something like this:
CREATE PROCEDURE AccountTable_Update
@Id INT,
@FirstName NVARCHAR(20),
@LastName NVARCHAR(20),
@StreetName NVARCHAR(20),
@StateId INT
AS
BEGIN
UPDATE AccountTable
Set FirstName = @FirstName
Set LastName = @LastName
Set Street = @StreetName
Set StateId = @StateId
WHERE AccountId = @Id
END
The caller provides the new values that the row should have. I know that some of the fields are not entirely accurate or precise; I am doing this mostly for learning.
I am having a syntax error with the SET commands in the UPDATE portion, and I don't know how to fix it.
Is the stored procedure I am writing a procedure that you would write in real life? Is this an antipattern?
Are there any grave errors I have made that just makes you cringe when you read the above TSQL?
Not really "grave," but I noticed your table's string columns are declared as NVARCHAR(50), yet your stored procedure parameters are NVARCHAR(20). This may be cause for concern: a value longer than 20 characters is silently truncated when it is passed to the parameter. Usually your stored procedure parameters should match the corresponding column's data type and length.
#1: You need commas between your columns:
UPDATE AccountTable SET
FirstName = @FirstName,
LastName = @LastName,
Street = @StreetName,
StateId = @StateId
WHERE
AccountId = @Id
SET appears only once, at the very start of the UPDATE list. Every column after that goes in a comma-separated list. Check out the MSDN docs on it.
#2: This isn't an antipattern, per se, especially given user input. You want parameterized queries to avoid SQL injection. If you were to build the query as a string from user input, you would be very susceptible to SQL injection. With parameters, the values are passed to the engine as data, separate from the statement text, so they are never parsed as SQL. There are a lot of opponents of stored procedures, but you're using one as a way to beat SQL injection, so it's not an antipattern.
#3: The only grave error I saw was repeating SET instead of using commas. Also, as ckittel pointed out, the lengths of your NVARCHAR parameters are inconsistent with the lengths of the columns.
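Putting the points above together, a corrected version of the procedure might look like the sketch below. It assumes the parameter lengths should simply match the NVARCHAR(50) columns in AccountTable; SET NOCOUNT ON and the statement terminators are additions that were not in the original.

```sql
-- Sketch only: parameter lengths are assumed to mirror the table's columns.
CREATE PROCEDURE AccountTable_Update
    @Id         INT,
    @FirstName  NVARCHAR(50),
    @LastName   NVARCHAR(50),
    @StreetName NVARCHAR(50),
    @StateId    INT
AS
BEGIN
    -- Suppress the "n rows affected" message (an addition, not in the question).
    SET NOCOUNT ON;

    -- One SET keyword, then a comma-separated list of assignments.
    UPDATE AccountTable
    SET FirstName = @FirstName,
        LastName  = @LastName,
        Street    = @StreetName,
        StateId   = @StateId
    WHERE AccountId = @Id;
END
```

Note that this version overwrites all four columns on every call, so the caller has to supply the current value for anything that should stay unchanged.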
Related
I have a stored procedure up_InsertEmployees. I have a functionality where I am uploading a batch of employee details into the web application. This functionality inserts the employees details into the DB using the above mentioned stored procedure.
The stored procedure goes something like this
create procedure 'sp_InsertEmployees'
(@EmployeeUN,
@FirstName,
@LastName,
@City, @Ref)
BEGIN
declare @BatchRef varchar(20)
set @BatchRef = @Ref+GetUTCDate()
Insert into Employee(EmployeeUN, FirstName, LastName, City, BatchRef)
Values(@EmployeeUN, @FirstName, @LastName, @City, @BatchRef)
END
Here the parameter @Ref holds the reference of the batch upload that I have performed. The value of BatchRef has to be the same for all the employees of that particular batch. Since I am using GetUTCDate(), the value of BatchRef might change with every employee that is inserted. Can somebody please let me know how I can calculate the value of BatchRef when the first employee is inserted and then keep it constant from then on? I want this to be done in SQL only and not in the application code. I also want the datetime value to be part of BatchRef so that each batch's value is unique.
The best way to keep a consistent BatchRef value across multiple rows being inserted is to insert all of the rows from that batch at the same time ;-). And doing so will also have the benefit of being quite a bit more efficient :-).
Assuming you are using SQL Server 2008 or newer, this can be accomplished via Table-Valued Parameters (TVPs). I have detailed the approach in my answer to the following question:
Pass Dictionary<string,int> to Stored Procedure T-SQL
For this particular use-case, the stored procedure would look something like the following:
CREATE PROCEDURE InsertEmployees
(
@EmployeeBatch dbo.EmployeeData READONLY,
@Ref VARCHAR(50)
)
AS
BEGIN
DECLARE @BatchRef VARCHAR(50);
SET @BatchRef = @Ref + CONVERT(VARCHAR(30), GetUTCDate(), 121);
INSERT INTO Employee (EmployeeUN, FirstName, LastName, City, BatchRef)
SELECT eb.EmployeeUN, eb.FirstName, eb.LastName, eb.City, @BatchRef
FROM @EmployeeBatch eb;
END
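The procedure above relies on a table type named dbo.EmployeeData, which is not defined in this answer. A minimal sketch of what that type and a test call could look like follows; the column types, the sample rows, and the 'Upload42_' reference value are assumptions for illustration only.

```sql
-- Hypothetical table type: column names follow the INSERT above, types are assumed.
CREATE TYPE dbo.EmployeeData AS TABLE
(
    EmployeeUN VARCHAR(50)  NOT NULL,
    FirstName  NVARCHAR(50) NULL,
    LastName   NVARCHAR(50) NULL,
    City       NVARCHAR(50) NULL
);
GO  -- batch separator understood by SSMS and sqlcmd

-- Fill a variable of that type with the whole batch...
DECLARE @Batch dbo.EmployeeData;

INSERT INTO @Batch (EmployeeUN, FirstName, LastName, City)
VALUES ('jdoe', N'Jane',    N'Doe', N'Austin'),
       ('rroe', N'Richard', N'Roe', N'Boston');

-- ...and pass it in one call, so every row gets the same BatchRef.
EXEC InsertEmployees @EmployeeBatch = @Batch, @Ref = 'Upload42_';
```

Because GetUTCDate() is evaluated once per call rather than once per employee, both sample rows end up with an identical BatchRef. From .NET, the same parameter is typically supplied as a DataTable with SqlDbType.Structured.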
Anyone stuck on SQL Server 2005 can accomplish the same basic concept using XML. It won't be as efficient as the entire XML document needs to be created in the app code and then parsed into a temp table in the stored procedure, but it is still a lot faster than doing row-by-row operations.
Given:
@Ref is unique per each batch and never reused
@Ref needs to stay unique per each batch, not unique per each Employee that is uploaded
There is no option to alter the upload process from one call per Employee to doing the entire batch as a single set
You want the time that the batch started (i.e. when the first Employee in each batch is uploaded)
Then:
Don't combine @Ref with GETUTCDATE(), but instead track the start time in another table.
CREATE TABLE dbo.EmployeeBatchStartTime
(
BatchRef VARCHAR(50) NOT NULL
CONSTRAINT [PK_EmployeeBatchStartTime] PRIMARY KEY,
StartTime DATETIME NOT NULL
CONSTRAINT [DF_EmployeeBatchStartTime_StartTime] DEFAULT (GETUTCDATE())
);
Then you can check to see if a row for the passed-in value of @Ref exists in that table and if not, insert it, else skip it.
CREATE PROCEDURE InsertEmployee
(
@EmployeeUN DataType,
@FirstName DataType,
@LastName DataType,
@City DataType,
@Ref VARCHAR(50)
)
AS
SET NOCOUNT ON;
-- Record the start time only for the first Employee of each batch.
IF (NOT EXISTS(
SELECT *
FROM dbo.EmployeeBatchStartTime
WHERE BatchRef = @Ref
)
)
BEGIN
INSERT INTO dbo.EmployeeBatchStartTime (BatchRef)
VALUES (@Ref);
END;
-- BatchRef is stored as the unmodified @Ref value.
INSERT INTO Employee (EmployeeUN, FirstName, LastName, City, BatchRef)
VALUES (@EmployeeUN, @FirstName, @LastName, @City, @Ref);
This not only lets you keep the value of BatchRef clean and usable (instead of combining a DATETIME value into it) but also gives you a DATETIME value that is usable without any error-prone text parsing or conversion from a string into a DATETIME datatype. This means that you can even add an index, if need be, to the StartTime field in the EmployeeBatchStartTime which will allow you to JOIN to that table on BatchRef and then use StartTime in an ORDER BY and it will be rather efficient :). AND, this requires no change at all to the existing app code :) :).
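As a rough illustration of that last point, the index and a query that lists employees in batch start order could look like the sketch below. The index name and the choice of output columns are assumptions; the table and column names come from the definitions above.

```sql
-- Hypothetical index name; supports ordering and filtering by start time.
CREATE INDEX IX_EmployeeBatchStartTime_StartTime
    ON dbo.EmployeeBatchStartTime (StartTime);

-- Employees with the UTC time at which their batch started, oldest batch first.
SELECT e.EmployeeUN,
       e.FirstName,
       e.LastName,
       e.City,
       e.BatchRef,
       bst.StartTime
FROM Employee e
INNER JOIN dbo.EmployeeBatchStartTime bst
        ON bst.BatchRef = e.BatchRef
ORDER BY bst.StartTime, e.EmployeeUN;
```

StartTime is a real DATETIME here, so range filters such as "batches started in the last day" need no string parsing.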
Assume the following please:
I have a table that has ~50 columns. I receive an import file that has information that needs to be updated in a variable number of these columns. Is there a method via a stored procedure where I can only update the columns that need to be updated and the rest retain their current value unchanged. Keep in mind I am not saying that the unaffected columns return to some default value but actually maintain the current value stored in them.
Thank you very much for any response on this.
Is COALESCE something that could be applied here as I have read in this thread: Updating a Table through a stored procedure with variable parameters
Or am I looking at a method more similar to what is being explained here:
SQL Server stored procedure with optional parameters updates wrong columns
For the record, my SQL skills are quite weak; I am just beginning to dive into this end of the pool.
JD
Yes, you can use COALESCE to do this. Basically, if a parameter is passed in as NULL, the column keeps its original value. Note that this also means you cannot use the pattern to set a column to NULL. Below is a general pattern that you can adapt.
DECLARE @LastName NVARCHAR(50) = 'John';
DECLARE @FirstName NVARCHAR(50) = NULL;
DECLARE @ID INT = 1;
UPDATE dbo.UpdateExample
SET LastName = COALESCE(@LastName, LastName), FirstName = COALESCE(@FirstName, FirstName)
WHERE ID = @ID;
Also, have a read of this article, titled: The Impact of Non-Updating Updates
http://web.archive.org/web/20180406220621/http://sqlblog.com:80/blogs/paul_white/archive/2010/08/11/the_2D00_impact_2D00_of_2D00_update_2D00_statements_2D00_that_2D00_don_2D00_t_2D00_change_2D00_data.aspx
Basically,
"SQL Server contains a number of optimisations to avoid unnecessary logging or page flushing when processing an UPDATE operation that will not result in any change to the persistent database."
OK, so my subject line isn't very descriptive, but here's the scenario:
An end-user has a legal obligation to submit transaction data to a government agency. The transactions contain the name and address of various individuals and organizations. HOWEVER, end users frequently misspell the names of the reported individuals and organizations, or they badly mangle the address, etc.
The information submitted by the end user is a legal 'document', so it cannot be altered by the agency that received it. Also, the transactions can be viewed and searched by the public. When the government agency notices an obvious misspelling or bad address, they would like to 'hide' or 'mask' that bad value with a known good value. For example, if an end user entered 'Arnie Schwarzeger', the agency could replace that name with 'Arnold Schwarzenegger'. The public that viewed the data would see (and search for) the correct spelling, but could view the original data as entered by the end user after they found the data record in question.
Hopefully that explains the business case well enough...on to the SQL part! So to address this problem, we have tables that look like this:
CREATE TABLE [dbo].[SomeUserEnteredData](
[Id] [uniqueidentifier] NOT NULL,
[LastOrOrganizationName] [nvarchar](350) NOT NULL, -- data as entered by end-user
[FirstName] [nvarchar](50) NULL, -- data as entered by end-user
[FullName] AS ([dbo].[FullNameValue]([FirstName],[LastOrOrganizationName])) PERSISTED, -- data as entered by end-user
[MappedName] AS ([dbo].[MappedNameValue]([FirstName],[LastOrOrganizationName]))) -- this is the 'override' data from the agency
CREATE TABLE [dbo].[CorrectionsByAgency](
[Id] [uniqueidentifier] NOT NULL,
[ReplaceName] [nvarchar](400) NOT NULL,
[KeepName] [nvarchar](400) NOT NULL)
CREATE FUNCTION [dbo].[FullNameValue]
(
@FirstName as NVARCHAR(40),
@LastOrOrganizationName as NVARCHAR(350)
)
RETURNS NVARCHAR(400)
WITH SCHEMABINDING
AS
BEGIN
DECLARE @result NVARCHAR(400)
IF @FirstName = '' OR @FirstName is NULL
SET @result = @LastOrOrganizationName
ELSE
SET @result = @LastOrOrganizationName + ', ' + @FirstName
RETURN @result
END
CREATE FUNCTION [dbo].[MappedNameValue]
(
@FirstName as NVARCHAR(50),
@LastOrOrganizationName as NVARCHAR(350)
)
RETURNS NVARCHAR(400)
AS
BEGIN
DECLARE @result NVARCHAR(400)
DECLARE @FullName NVARCHAR(400)
SET @FullName = dbo.FullNameValue(@FirstName, @LastOrOrganizationName)
SELECT top 1 @result = KeepName from CorrectionsByAgency where ReplaceName = @FullName
if @result is null
SET @result = @FullName
RETURN @result
END
Hopefully, if my sample isn't TOO convoluted, you can see that if the agency enters a name correction, it will replace all occurrences of the misspelled name. From a business logic perspective, this works exactly right: the agency staff only enters a few corrections and the corrections can override everywhere there are misspelled names.
From a server performance standpoint, this solution STINKS. The calculated SomeUserEnteredData.MappedName column can't be indexed, and no view that reads from that column can be indexed either! There's no way this can work for our needs if we can't index the MappedName values.
The only alternative I've been able to see as a possibility is to create an additional linking table between the end-user created data and the agency created data -- when the agency enters a correction record, a record is created in the linking table for every occurrence of the bad column value. The down side to this seems to be the very real likelihood of creating/destroying many (hundreds of thousands) of those linking records for every correction entered by an agency user...
Do any of you SQL geniuses out there have great ideas about how to address this problem?
I'm not sure if this is answering your question directly, but I would try to simplify the whole thing: stop using functions, persist "calculated" values and use application logic (possibly in a stored procedure) to manage the data.
Assuming that one agency correction can be applied to many user-entered names, then you could have something like this:
create table dbo.CorrectedNames (
CorrectedNameId uniqueidentifier not null primary key,
CorrectedName nvarchar(1000) not null
)
create table dbo.UserEnteredData (
DocumentId uniqueidentifier not null primary key,
UserEnteredName nvarchar(1000) not null,
CorrectedNameId uniqueidentifier null,
constraint FK_CorrectedNames foreign key (CorrectedNameId)
references dbo.CorrectedNames (CorrectedNameId)
)
Now, you need to make sure your application logic can do something like this:
External user enters dirty data
Agency user reviews the dirty data and identifies both the incorrect name and the corrected name
Application checks if the corrected name already exists
If no, create a new row in dbo.CorrectedNames
Create a new row in dbo.UserEnteredData, with the CorrectedNameId
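One possible reading of these steps in T-SQL is sketched below, using the two tables defined above. The variable names and sample values are hypothetical, and in practice this logic would probably live in a stored procedure or in application code rather than in an ad hoc script.

```sql
-- Hypothetical inputs: what the user typed and what the agency decided it should be.
DECLARE @DocumentId      UNIQUEIDENTIFIER = NEWID();
DECLARE @UserEnteredName NVARCHAR(1000)   = N'Schwarzeger, Arnie';
DECLARE @CorrectedName   NVARCHAR(1000)   = N'Schwarzenegger, Arnold';
DECLARE @CorrectedNameId UNIQUEIDENTIFIER;

-- Check whether the corrected name already exists.
SELECT @CorrectedNameId = CorrectedNameId
FROM dbo.CorrectedNames
WHERE CorrectedName = @CorrectedName;

-- If not, create it.
IF @CorrectedNameId IS NULL
BEGIN
    SET @CorrectedNameId = NEWID();

    INSERT INTO dbo.CorrectedNames (CorrectedNameId, CorrectedName)
    VALUES (@CorrectedNameId, @CorrectedName);
END;

-- Store the user-entered value untouched, pointing at the correction.
INSERT INTO dbo.UserEnteredData (DocumentId, UserEnteredName, CorrectedNameId)
VALUES (@DocumentId, @UserEnteredName, @CorrectedNameId);

-- Public search by the corrected spelling; the original text stays visible.
SELECT d.DocumentId,
       COALESCE(c.CorrectedName, d.UserEnteredName) AS DisplayName,
       d.UserEnteredName
FROM dbo.UserEnteredData AS d
LEFT JOIN dbo.CorrectedNames AS c
       ON c.CorrectedNameId = d.CorrectedNameId
WHERE c.CorrectedName = @CorrectedName;
```

The corrected value is stored in an ordinary column instead of being computed by a function.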
I'm assuming that things are rather more complicated in reality and corrections are made based on addresses and other data as well as just names, but the basic relationship you describe seems simple enough. As you said, the functions add a lot of overhead and it's not clear (to me) what benefit they provide over just storing the data you need directly.
Finally, I don't understand your comment about creating/destroying linking records; it's up to your application logic to handle data changes correctly.
I have a stored procedure that inserts a user into a table, and I want an output value equal to the newly inserted UserID, but I don't know how to do it. Can you help me?
I have this
ALTER PROCEDURE dbo.st_Insert_User
(
@Nombre varchar(200),
@Usuario varchar(100),
@Password varchar(100),
@Administrador bit,
@Resultado int output
)
AS
INSERT INTO tbl_Users(Nombre, Usuario, Password, Administrador)
VALUES(@Nombre, @Usuario, @Password, @Administrador)
SELECT @resultado = UserID
I also tried
SELECT @resultado = UserID FROM tbl_Users WHERE Usuario = @Usuario
SELECT SCOPE_IDENTITY()
will give you the identity of the row
For SQL Server, you want to use the OUTPUT clause. See information and examples in Books Online here. It does cover your case-- as it mentions "The OUTPUT clause may be useful to retrieve the value of identity or computed columns after an INSERT or UPDATE operation."
(If this is for real world purposes, you do of course have security concerns in storing passwords that you should address.)
Add at the end
select @@Identity
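For reference, here is a sketch of how the procedure from the question could assign the new key to its output parameter. It assumes UserID is an IDENTITY column on tbl_Users, which the question implies but never states; the sample values in the call are made up.

```sql
ALTER PROCEDURE dbo.st_Insert_User
(
    @Nombre        varchar(200),
    @Usuario       varchar(100),
    @Password      varchar(100),
    @Administrador bit,
    @Resultado     int OUTPUT
)
AS
BEGIN
    SET NOCOUNT ON;

    INSERT INTO tbl_Users (Nombre, Usuario, Password, Administrador)
    VALUES (@Nombre, @Usuario, @Password, @Administrador);

    -- Identity value generated by the INSERT above, in this scope only.
    SET @Resultado = SCOPE_IDENTITY();
END
GO  -- batch separator understood by SSMS and sqlcmd

-- Calling it: the OUTPUT keyword is needed on the caller's side as well.
DECLARE @NewId int;

EXEC dbo.st_Insert_User
     @Nombre        = 'Ana Perez',
     @Usuario       = 'aperez',
     @Password      = 'placeholder-value',
     @Administrador = 0,
     @Resultado     = @NewId OUTPUT;

SELECT @NewId AS NewUserID;
```

SCOPE_IDENTITY() is generally the safer choice over @@IDENTITY, because @@IDENTITY can return a value generated by a trigger that inserts into some other table with an identity column.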
I would like to have a stored procedure that will update values in a table row depending on whether or not the parameters are provided. For example, I have a situation where I want to update all the values, but also a situation where I'm only required to update two values. I was hoping to be able to do this with only one procedure, rather than writing two, which doesn't particularly appeal to me. The best I have managed to come up with myself is something like the following:
CREATE PROCEDURE dbo.UpdatePerson
@PersonId INT,
@Firstname VARCHAR(50) = NULL,
@Lastname VARCHAR(50) = NULL,
@Email VARCHAR(50) = NULL
AS
BEGIN
SET NOCOUNT ON
UPDATE Person
Set
Firstname = COALESCE(@Firstname, Firstname),
Lastname = COALESCE(@LastName, Lastname),
Email = COALESCE(@Email, Email)
WHERE PersonId = @PersonId
END
I realize that the values will be updated each time anyway, which isn't ideal. Is this an effective way of achieving this, or could it be done a better way?
I think your code is fine. The only thing I would add is a check for the case when all three params are NULL, in which case no update should be done.
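A sketch of that suggestion, using the procedure from the question with an early exit added, might look like this. The example call and its email address are made up.

```sql
CREATE PROCEDURE dbo.UpdatePerson
    @PersonId  INT,
    @Firstname VARCHAR(50) = NULL,
    @Lastname  VARCHAR(50) = NULL,
    @Email     VARCHAR(50) = NULL
AS
BEGIN
    SET NOCOUNT ON;

    -- Nothing was supplied, so skip the UPDATE entirely.
    IF @Firstname IS NULL AND @Lastname IS NULL AND @Email IS NULL
        RETURN;

    -- A NULL parameter leaves the corresponding column at its current value.
    UPDATE Person
    SET Firstname = COALESCE(@Firstname, Firstname),
        Lastname  = COALESCE(@Lastname,  Lastname),
        Email     = COALESCE(@Email,     Email)
    WHERE PersonId = @PersonId;
END
GO  -- batch separator understood by SSMS and sqlcmd

-- Only the email changes; Firstname and Lastname keep their current values.
EXEC dbo.UpdatePerson @PersonId = 1, @Email = 'new.address@example.com';
```

One limitation of this design: since NULL means "leave unchanged", the procedure cannot be used to clear a column back to NULL.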
SQL Server does actually have some logic to deal with non updating updates.
More details than you probably wanted to know!