SQL merge functionality without specifying column names - sql

I'm doing a SqlBulkCopy from my web app and inserting the records into a staging table. This is my first time working with staging tables. The live table that will accept the data has about 200 fields and may change in the future. When that happens, I don't want to have to rewrite the merge statement.
I came up with this SQL that mimics the merge functionality but doesn't require me to spell out the table columns. I am not an SQL expert and would like someone who is to take a look and let me know of any problems that could arise from using it, because I haven't seen any examples of this approach even though many people seem to be searching for one.
Note that records in the staging table that have a null id field are to be inserted.
-- set the table names, primary key field & vars to hold query parts
DECLARE @LiveTable varchar(20) = 'Test'
DECLARE @StagingTable varchar(20) = 'TestStaging'
DECLARE @PKField varchar(20) = 'TestPK'
DECLARE @SQLSet nvarchar(MAX) = ''
DECLARE @SQLInsertFields nvarchar(MAX) = ''
-- get comma delimited field names
DECLARE @Fields nvarchar(MAX) = (SELECT dbo.fn_GetCommaDelimitedFieldNames(@LiveTable))
-- loop through fields generating set clause of query to execute
WHILE LEN(@Fields) > 0
BEGIN
    DECLARE @Field varchar(50) = left(@Fields, CHARINDEX(',', @Fields+',')-1)
    IF @Field <> @PKField -- the primary key field cannot be updated
    BEGIN
        SET @SQLSet += ', ' + @LiveTable + '.' + @Field + ' = ' + @StagingTable + '.' + @Field
        SET @SQLInsertFields += ', ' + @Field
    END
    SET @Fields = STUFF(@Fields, 1, CHARINDEX(',', @Fields+','), '')
END
-- remove the leading comma
SET @SQLSet = SUBSTRING(@SQLSet,3,LEN(@SQLSet))
SET @SQLInsertFields = SUBSTRING(@SQLInsertFields,3,LEN(@SQLInsertFields))
-- update records from staging table where primary key is provided
DECLARE @SQL nvarchar(MAX) = N'UPDATE ' + @LiveTable +
    ' SET ' + @SQLSet +
    ' FROM ' + @LiveTable +
    ' INNER JOIN ' + @StagingTable +
    ' ON ' + @LiveTable + '.' + @PKField + ' = ' + @StagingTable + '.' + @PKField
-- insert records from staging table where primary key is null
SET @SQL += '; INSERT INTO ' + @LiveTable + ' (' + @SQLInsertFields + ') SELECT ' + @SQLInsertFields + ' FROM ' + @StagingTable + ' WHERE ' + @PKField + ' IS NULL'
-- delete the records from the staging table
SET @SQL += '; DELETE FROM ' + @StagingTable
-- execute the sql statement to update existing records and insert new records
exec sp_executesql @SQL;
If anyone sees any issues with performance or anything else, I'd appreciate the insight.

Don't do this. Really. You're working very hard to avoid a rare problem that you probably won't handle correctly when the time comes.
If the target table changes, how do you know it will change in such a way that your fancy dynamic SQL will work correctly? How can you be sure it won't seem to work -- i.e. will work, syntactically -- but actually do the wrong thing? If and when the target table changes, won't you have to change your application, and the staging table too? With all that in the air, what's adding one more SET clause?
In the meanwhile, how can anyone be expected to read that gobbledygook (not your fault, really, that's SQL's syntax)? A bog-standard insert statement would be very clear and dependable.
And fast. SQL Server can't compile or check your dynamic query until it runs. You used bulk copy for efficiency, and now you're undermining it with well-meaning future-proofing.
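For comparison, here is a minimal sketch of what the explicit version might look like. The table names Test and TestStaging and the key TestPK come from the question; the columns Name and Amount are hypothetical stand-ins for the real column list, and the sketch assumes TestPK is generated by the live table on insert.

```sql
-- Hypothetical columns Name and Amount stand in for the real column list.
BEGIN TRANSACTION;

-- Update live rows that have a matching key in staging.
UPDATE t
SET    t.Name   = s.Name,
       t.Amount = s.Amount
FROM   dbo.Test AS t
INNER JOIN dbo.TestStaging AS s
        ON s.TestPK = t.TestPK;

-- Insert staging rows that arrived without a key
-- (assumes dbo.Test generates TestPK itself, e.g. an IDENTITY column).
INSERT INTO dbo.Test (Name, Amount)
SELECT s.Name, s.Amount
FROM   dbo.TestStaging AS s
WHERE  s.TestPK IS NULL;

-- Clear the staging table for the next load.
DELETE FROM dbo.TestStaging;

COMMIT TRANSACTION;
```

When a column is added to the live table, the change is one more line in the SET list and one more name in the INSERT, and a wrong column name fails loudly instead of being silently copied.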

Related

Data purging Delete statement formation

I was just writing a stored procedure and I am stuck badly at one point.
Basically my stored procedure looks like this:
CREATE PROCEDURE [dbo].[DELETEGUIDTESTNEW1]
(@IpApplicationNumber NVARCHAR(50) = NULL)
AS
BEGIN
DECLARE @ApplicationNumber NVARCHAR(20)
SET @ApplicationNumber = (SELECT APPLICATIONNUMBER
FROM CUSTOMERROLE
WHERE CUSTOMERNUMBER = @IpCustomerNumber
AND CUSTOMERVERSIONNUMBER = @IpCustomerVersion)
-- In between I am doing business operation--------------------
INSERT INTO dbo.DeleteTest1(Statements)
VALUES ('Delete from '+@GuidTableName+' where ApplicationNumber =' + @ApplicationNumber)
The parameter arrives at runtime, but the actual issue is that I want to form my delete statements before I have the application number. When I do get the application number, I will fetch all the delete statements from the table, replace @ApplicationNumber with the actual application number, and delete the records from the database.
So basically I want to form the delete statements as templates, with the application number as a placeholder, and delete the records at runtime.
Please help!
Well, I have found a solution to the DB purging. To handle the runtime GUID situation, I publish a stored procedure from within a stored procedure.
SELECT @GuidPrimaryTableSpace = @GuidPrimaryTableSpace + DeleteGuidStatement + ';'+ CHAR(13) + CHAR(10)
FROM ##Purge_GuidForeignKeyTablePurgeStatements
SELECT @GuidForeignKeyTableSpace = @GuidForeignKeyTableSpace +'Insert Into @AddressRecordsToPurge (GuidValue, GuidColumn) ( '+ SelectGuidStatement +');'+ CHAR(13) + CHAR(10)
FROM ##Purge_GuidPrimaryTablePurgeStatements
SELECT @DeleteGuidStatementSpace = @DeleteGuidStatementSpace + DeleteGuidValueStatement + ';'+ CHAR(13) + CHAR(10)
FROM ##Purge_GuidTableNames
SET @createProcedureCmd = 'CREATE PROCEDURE MyProc' + ' (@ApplicationNum nvarchar(50),@CustomerNumber nvarchar(50),@CustomerVersionNumber nvarchar(50)) AS ' + ' BEGIN '+
' DECLARE @ApplicationNumber nvarchar(50); SET @ApplicationNumber = @ApplicationNum;
DECLARE @AddressRecordsToPurge TABLE
(
RowID INT NOT NULL PRIMARY KEY IDENTITY(1,1),
GUIDValue Nvarchar(max),
GUIDColumn Nvarchar(max)
)'+ CHAR(13) + CHAR(10) + @GuidForeignKeyTableSpace + @GuidPrimaryTableSpace + @DeleteGuidStatementSpace +' END'
EXEC(@createProcedureCmd)
So here I am forming all the select statements first and publishing the stored procedure. At runtime, when my published stored procedure is called, it first collects all the GUIDs in a table variable and then starts the deletes.
Thank you anyway for all the input.
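The "template" idea from the question can also be sketched without generating a procedure, by storing each statement with a parameter placeholder and binding the value at runtime through sp_executesql. This is only a sketch under assumptions: dbo.DeleteTest1(Statements) is the table from the question, each stored row is assumed to contain a complete statement that references @ApplicationNumber as a parameter, and the sample value is hypothetical.

```sql
-- Hypothetical value; in the procedure this would come from the lookup.
DECLARE @ApplicationNumber NVARCHAR(20) = N'APP-001';
DECLARE @stmt NVARCHAR(MAX);

DECLARE stmt_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT Statements FROM dbo.DeleteTest1;

OPEN stmt_cursor;
FETCH NEXT FROM stmt_cursor INTO @stmt;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Each stored statement is assumed to look like:
    --   DELETE FROM dbo.SomeTable WHERE ApplicationNumber = @ApplicationNumber
    EXEC sp_executesql @stmt,
         N'@ApplicationNumber NVARCHAR(20)',
         @ApplicationNumber = @ApplicationNumber;

    FETCH NEXT FROM stmt_cursor INTO @stmt;
END;

CLOSE stmt_cursor;
DEALLOCATE stmt_cursor;
```

Because the value is bound as a parameter rather than pasted into the text, no string replacement or quote escaping is needed when the application number arrives.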

How to Create DELETE Statement Stored Procedure Using TableName, ColumnName, and ColumnValue as Passing Parameters

Here is what I'm trying to do. I'm trying to create a stored procedure where I can just enter the name of the table, the column, and the column value, and it will delete any records associated with that value in that table. Is there a simple way to do this? I don't know much about SQL and am still learning.
Here is what I have so far.
ALTER PROCEDURE [dbo].[name of stored procedure]
@TABLE_NAME varchar(50),
@COLUMN_NAME varchar(50),
@VALUE varchar(5)
AS
BEGIN
SET NOCOUNT ON;
DECLARE @RowsDeleted int;
DECLARE @sql VARCHAR(500);
SET @sql = 'DELETE FROM (name of table).' + @TABLE_NAME + ' WHERE ' + @COLUMN_NAME + '=' + '@VALUE'
EXEC(@sql)
SET @RowsDeleted=@@ROWCOUNT
END
GO
A couple of issues.
First, you don't need (name of table)
SET @sql = 'DELETE FROM ' + @TABLE_NAME + etc.
In general you should try to include the appropriate schema prefix
SET @sql = 'DELETE FROM dbo.' + @TABLE_NAME + etc.
And in case your table name has special characters, it should be enclosed in brackets
SET @sql = 'DELETE FROM dbo.[' + @TABLE_NAME + ']' + etc.
Since @VALUE is a string, you must surround it with single quotes when computing the value for @SQL. To insert a single quote into a string you have to escape it by using two single quotes, like this:
SET @SQL = 'DELETE FROM dbo.[' + @TABLE_NAME + '] WHERE [' + @COLUMN_NAME + '] = ''' + @VALUE + ''''
If @VALUE itself contains a single quote, this whole thing will break, so you need to escape that as well
SET @SQL = 'DELETE FROM dbo.[' + @TABLE_NAME + '] WHERE [' + @COLUMN_NAME + '] = ''' + REPLACE(@VALUE,'''','''''') + ''''
Also, @@ROWCOUNT will not populate from EXEC. If you want to be able to read @@ROWCOUNT, use sp_ExecuteSQL instead
EXEC sp_ExecuteSql @SQL
And finally, let me editorialize for a minute--
This sort of stored procedure is not a great idea. I know it seems pretty cool because it is flexible, and that kind of thinking is usually smart in other languages, but in the database world this approach causes problems. There are security issues (e.g. injection, and the fact that you need elevated privileges to call sp_executeSql). There are issues with precompilation and performance (because the SQL isn't known ahead of time, SQL Server will need to generate a new query plan each and every time you call this). And since the caller can supply any value for the table and column name, you have no idea whether the delete statement will be efficient and use indexes or cause a huge performance issue because the table is large and the column is not indexed.
The proper approach is to have a series of appropriate stored procedures with strongly-typed inputs that are specific to each data use case where you need to delete based on criteria. Database engineers should not be trying to make things flexible; you should be forcing people to think through what exactly they are going to need, and implement that and only that. That is the only way to ensure people are following the rules, keeping referential integrity intact, using indexes efficiently, etc.
Yes, this may seem like repetitive and redundant work, but c'est la vie. There are tools available to generate the code for CRUD operations if you don't like the extra typing.
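As an illustration of the "specific, strongly-typed" style described above, a single-purpose delete procedure might look like the sketch below. The table dbo.Orders, its CustomerId column, and the procedure name are all hypothetical.

```sql
-- Hypothetical table dbo.Orders with an indexed CustomerId column.
CREATE PROCEDURE dbo.DeleteOrdersByCustomer
    @CustomerId  INT,
    @RowsDeleted INT OUTPUT
AS
BEGIN
    SET NOCOUNT ON;

    DELETE FROM dbo.Orders
    WHERE  CustomerId = @CustomerId;

    -- Read @@ROWCOUNT immediately after the DELETE, before any other statement.
    SET @RowsDeleted = @@ROWCOUNT;
END
```

There is no dynamic SQL here, so the table, column, and data type are fixed when the procedure is created, and callers only need EXECUTE permission on the procedure.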
In addition to some of the information John Wu provided, you have to worry about data types, and @@ROWCOUNT may not be accurate if there are triggers on your tables. You can get around both issues by casting to NVARCHAR(MAX) and using an OUTPUT clause with a temp table to do the COUNT().
So just for fun here is a way you can do it:
CREATE PROCEDURE dbo.[ProcName]
@TableName SYSNAME
,@ColumnName SYSNAME
,@Value NVARCHAR(MAX)
,@RecordCount INT OUTPUT
AS
BEGIN
-- NVARCHAR(MAX) so a long @Value cannot truncate the statement
DECLARE @SQL NVARCHAR(MAX)
SET @SQL = N'IF OBJECT_ID(''tempdb..#DeletedOutput'') IS NOT NULL
BEGIN
DROP TABLE #DeletedOutput
END
CREATE TABLE #DeletedOutput (
ID INT IDENTITY(1,1),
ColumnValue NVARCHAR(MAX)
)
DELETE FROM dbo.' + QUOTENAME(@TableName) + '
OUTPUT deleted.' + QUOTENAME(@ColumnName) + ' INTO #DeletedOutput (ColumnValue)
WHERE CAST(' + QUOTENAME(@ColumnName) + ' AS NVARCHAR(MAX)) = ' + CHAR(39) + REPLACE(@Value, CHAR(39), CHAR(39) + CHAR(39)) + CHAR(39) + '
SELECT @RecordCountOUT = COUNT(ID) FROM #DeletedOutput
IF OBJECT_ID(''tempdb..#DeletedOutput'') IS NOT NULL
BEGIN
DROP TABLE #DeletedOutput
END'
DECLARE @ParmDefinition NVARCHAR(200) = N'@RecordCountOUT INT OUTPUT'
EXECUTE sp_executesql @SQL, @ParmDefinition, @RecordCountOUT = @RecordCount OUTPUT
END
So the use of QUOTENAME will help against injection attacks but is not perfect. I use CHAR(39) instead of the escape sequence for a single quote around the value because I find it easier when building strings. By using an OUTPUT parameter with sp_executesql you can still return your count.
Keep in mind just because you can do something in SQL doesn't always mean you should.
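If you do go the dynamic route, one way to tighten it is to check the names against the catalog and pass the value as a real parameter instead of concatenating it. The following is a sketch, not a drop-in replacement: the sample table and column names are hypothetical, and it assumes the table lives in the dbo schema.

```sql
-- Hypothetical inputs; in a procedure these would be parameters.
DECLARE @TableName   SYSNAME       = N'Customers';
DECLARE @ColumnName  SYSNAME       = N'Region';
DECLARE @Value       NVARCHAR(MAX) = N'O''Brien';
DECLARE @RowsDeleted INT;

-- Refuse names that do not match an existing dbo user table and column.
IF NOT EXISTS (
    SELECT 1
    FROM   sys.columns AS c
    WHERE  c.object_id = OBJECT_ID(N'dbo.' + QUOTENAME(@TableName), N'U')
      AND  c.name = @ColumnName
)
BEGIN
    RAISERROR(N'Unknown table or column.', 16, 1);
    RETURN;
END;

-- Only the identifiers are concatenated; the value travels as @v.
DECLARE @sql NVARCHAR(MAX) =
      N'DELETE FROM dbo.' + QUOTENAME(@TableName)
    + N' WHERE CAST(' + QUOTENAME(@ColumnName) + N' AS NVARCHAR(MAX)) = @v;'
    + N' SET @n = @@ROWCOUNT;';

EXEC sp_executesql
     @sql,
     N'@v NVARCHAR(MAX), @n INT OUTPUT',
     @v = @Value,
     @n = @RowsDeleted OUTPUT;
```

With the value bound as @v, a quote inside it cannot change the shape of the statement, so no manual escaping is required.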

Update data in a SQL Server temp table where the column names are unknown

In a stored procedure I dynamically create a temp table by selecting the name of applications from a regular table. Then I add a date column and add the last 12 months. The result looks like this:
So far so good. Now I want to update the data in columns by querying another regular table. Normally it would be something like:
UPDATE ##TempTable
SET [columName] = (SELECT SUM(columName)
FROM RegularTable
WHERE FORMAT(RegularTable.Date,'MM/yyyy') = FORMAT(##TempMonths.x,'MM/yyyy'))
However, since I don't know what the name of the columns are at any given time, I need to do this dynamically.
So my question is, how can I get the column names of a temp table dynamically while doing an update?
Thanks!
I think you can use something like the following.
select name as 'ColumnName'
from tempdb.sys.columns
where object_id = object_id('tempdb..##TempTable');
And then generate dynamic sql using something like the following.
DECLARE @tableName nvarchar(50)
SET @tableName = 'RegularTable'
DECLARE @sql NVARCHAR(MAX)
SET @sql = ''
SELECT @sql = @sql + ' UPDATE ##TempTable ' + CHAR(13) +
' SET [' + c.name + '] = (SELECT SUM([' + c.name + ']) ' + CHAR(13) +
' FROM RegularTable' + CHAR(13) +
' WHERE FORMAT(RegularTable.Date,''MM/yyyy'') = FORMAT(##TempMonths.x,''MM/yyyy''));' + CHAR(13)
from tempdb.sys.columns c
where object_id = object_id('tempdb..##TempTable');
print @sql
-- exec sp_executesql @sql;
The print statement in the above snippet shows that the @sql variable has the following text.
UPDATE ##TempTable
SET [Test Application One] = (SELECT SUM([Test Application One])
FROM RegularTable
WHERE FORMAT(RegularTable.Date,'MM/yyyy') = FORMAT(##TempMonths.x,'MM/yyyy'));
UPDATE ##TempTable
SET [Test Application Two] = (SELECT SUM([Test Application Two])
FROM RegularTable
WHERE FORMAT(RegularTable.Date,'MM/yyyy') = FORMAT(##TempMonths.x,'MM/yyyy'));
So now you can use sp_executesql to execute the updates as follows (un-comment it in the above snippet).
exec sp_executesql @sql;
If it's a 1 time UPDATE you can PRINT the dynamic SQL statement (as shown above) and then execute it in the SSMS Query Windows.
I recommend you use the print statement first to make sure the UPDATE statements generated are what you want, and then do the sp_executesql or run the printed UPDATE statement in the query window.

SQL Server 2012 Using Declared Variables in a Join

I'm quite new to SQL Server so hopefully this makes sense :)
I'm trying to declare variables to be used in an INNER JOIN.
If you take a look at my code, you'll see what I'm trying to do, without me needing to go into too much detail. Let me know if you need more info. Is that syntax possible?
EDIT: See new attempt below
--State - If suburb/postcode, could use postcode lookup
Declare @Missing as nvarchar(255),
@MissingUpdate as nvarchar(255),
@MatchA as nvarchar(255),
@MatchB as nvarchar(255),
@Reason as nvarchar(255);
Set @Missing = '[StateEXPORT]'; -- field to update
Set @MissingUpdate = '[State]'; -- field in postcode lookup to pull in
Set @MatchA = '[PostcodeEXPORT]'; -- field in master field to match with
Set @MatchB = '[Pcode]'; -- field in postcode lookup to match with
Set @Reason = 'Contactable - Needs verification - @MissingUpdate taken from Lookup'; -- reason here
update [BT].[dbo].[test]
set @Missing = b.@MissingUpdate,
FinalPot = @Reason
FROM [BT].[dbo].[test] a
INNER JOIN [BT].[dbo].[Postcode Lookup] b
ON a.@MatchA = b.@MatchB
where (@Missing is null or @Missing = '0') and [AddressSource] != ('Uncontactable')
GO
EDIT: SECOND ATTEMPT:
set @sql = 'update [BT].[dbo].[test] set ' + quotename(@Missing) + '= b.' + quotename(@MissingUpdate) + ', FinalPot = ' + @Reason + 'FROM [BT].[dbo].[test] a INNER JOIN [BT].[dbo].[Postcode Lookup] b ON a.' + quotename(@MatchA) + ' = b.' + quotename(@MatchB) + 'where (' + quotename(@Missing) + 'is null or' + quotename(@Missing) + ' = 0 and [AddressSource] != "(Uncontactable)"'
exec (@sql)
Thanks for your help,
Lucas
No, this syntax is not possible, at least not directly: you need to specify the column name, not a string variable that has the name.
If you wish to decide the names of columns dynamically, you could make a SQL string that represents the statement that you wish to execute, and pass that string to EXECUTE command. You have to take extra care not to put any of the user-entered data into the generated SQL string, though, to avoid SQL injection attacks.
EDIT: The reason your second attempt may be failing is that you are passing names in square brackets to quotename. You should remove brackets from your variable declarations, like this:
Set @Missing = 'StateEXPORT'; -- field to update
Set @MissingUpdate = 'State'; -- field in postcode lookup to pull in
Set @MatchA = 'PostcodeEXPORT'; -- field in master field to match with
Set @MatchB = 'Pcode'; -- field in postcode lookup to match with
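Putting those pieces together, a corrected version of the second attempt might look like the sketch below. It assumes the column names are trusted, that the reason text was meant to include the lookup column's name, and that the updated column lives in [test] (aliased a); the spacing and quoting problems in the concatenated string are fixed, and the reason is passed as a parameter rather than pasted into the statement.

```sql
DECLARE @Missing       SYSNAME = N'StateEXPORT';    -- field to update
DECLARE @MissingUpdate SYSNAME = N'State';          -- lookup field to pull in
DECLARE @MatchA        SYSNAME = N'PostcodeEXPORT'; -- master field to match on
DECLARE @MatchB        SYSNAME = N'Pcode';          -- lookup field to match on
DECLARE @Reason NVARCHAR(255) =
    N'Contactable - Needs verification - ' + @MissingUpdate + N' taken from Lookup';

-- Identifiers go through QUOTENAME; note the leading spaces in each fragment.
DECLARE @sql NVARCHAR(MAX) =
      N'UPDATE a SET a.' + QUOTENAME(@Missing) + N' = b.' + QUOTENAME(@MissingUpdate)
    + N', FinalPot = @Reason'
    + N' FROM [BT].[dbo].[test] AS a'
    + N' INNER JOIN [BT].[dbo].[Postcode Lookup] AS b'
    + N' ON a.' + QUOTENAME(@MatchA) + N' = b.' + QUOTENAME(@MatchB)
    + N' WHERE (a.' + QUOTENAME(@Missing) + N' IS NULL OR a.' + QUOTENAME(@Missing) + N' = ''0'')'
    + N' AND [AddressSource] <> ''Uncontactable'';';

PRINT @sql; -- inspect the generated statement before running it
EXEC sp_executesql @sql, N'@Reason NVARCHAR(255)', @Reason = @Reason;
```

Printing the statement first makes missing spaces (such as "bwhere") or mismatched parentheses easy to spot.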
You can't use variable names as column names without dynamic SQL.
An example of a dynamic SQL query:
declare @ColumnName varchar(100) = 'col1'
declare @sql varchar(max)
set @sql = 'select ' + quotename(@ColumnName) + ' from dbo.YourTable'
exec (@sql)

Checking whether conditions are met by all rows with dynamic SQL

I have a table in SQL Server 2008 which contains custom validation criteria in the form of expressions stored as text, e.g.
StagingTableID CustomValidation
----------------------------------
1 LEN([mobile])<=30
3 [Internal/External] IN ('Internal','External')
3 ([Internal/External] <> 'Internal') OR (LEN([Contact Name])<=100)
...
I am interested in determining whether all rows in a table pass the conditional statement. For this purpose I am writing a validation stored procedure which checks whether all values in a given field in a given table meet the given condition(s). SQL is not my forte, so after reading related questions, this is my first stab at the problem:
EXEC sp_executesql N'SELECT @passed = 0 WHERE EXISTS (' +
N'SELECT * FROM (' +
N'SELECT CASE WHEN ' + @CustomValidationExpr + N' THEN 1 ' +
N'ELSE 0 END AS ConditionalTest ' +
N'FROM ' + @StagingTableName +
N')t ' +
N'WHERE t.ConditionalTest = 0)'
,N'@passed BIT OUTPUT'
,@passed = @PassedCustomValidation OUTPUT
However, I'm not sure if the nested queries can be re-written as one, or if there is an entirely better way for testing for validity of all rows in this scenario?
Thanks in advance!
You should be able to reduce by at least one subquery like this:
-- build the statement first: procedure arguments cannot be expressions
DECLARE @sql nvarchar(max) =
N'SELECT @passed = 0 WHERE EXISTS (' +
N'SELECT 1 FROM ' + @StagingTableName +
N' WHERE NOT (' + @CustomValidationExpr + N'))'
EXEC sp_executesql @sql
,N'@passed BIT OUTPUT'
,@passed = @PassedCustomValidation OUTPUT
Before we answer the original question, have you looked into implementing constraints? This will prevent bad data from entering your database in the first place. Or is the point that these must be dynamically set in the application?
ALTER TABLE StagingTable
WITH CHECK ADD CONSTRAINT [StagingTable$MobileValidLength]
CHECK (LEN([mobile])<=30)
GO
ALTER TABLE StagingTable
WITH CHECK ADD CONSTRAINT [StagingTable$InternalExternalValid]
CHECK ([Internal/External] IN ('Internal','External'))
GO
--etc...
You need to concatenate the expressions together. I agree with @PinnyM that a where clause is easier for full table validation. However, the next question will be how to identify which rows fail which tests. I'll wait for you to ask that question before answering it (ask it as a separate question and not as an edit to this one).
To create the where clause, something like this:
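declare @WhereClause nvarchar(max);
-- parenthesize each expression so an OR inside one cannot change the precedence
select @WhereClause = (select '(' + CustomValidation + ') and '
from Validations v
for xml path ('')
) + '1=1'
-- for xml path escapes < and > as entities; turn them back into operators
select @WhereClause = replace(replace(@WhereClause, '&lt;', '<'), '&gt;', '>')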
This strange construct, with the for xml path('') and the double select, is the most convenient way to concatenate values in SQL Server.
Also, put together your query before doing the sp_executesql call. It gives you more flexibility:
declare @sql nvarchar(max);
select @sql = '
select @passed = count(*)
from '+@StagingTableName+'
where '+@WhereClause
That is the number that pass all validation tests. The where clause for the fails is:
declare @WhereClause nvarchar(max);
select @WhereClause = (select 'not (' + CustomValidation + ') or '
from Validations v
for xml path ('')
) + '1=0'
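To tie the pieces together, here is a sketch of an end-to-end run that counts failing rows and turns the count into a pass/fail flag. It assumes the Validations table from the snippets above, a trusted table name (the value shown is hypothetical), and uses the TYPE/.value() variant of the for xml path idiom so that characters such as < and > come back unescaped.

```sql
-- Hypothetical table name; the Validations table is from the snippets above.
DECLARE @StagingTableName SYSNAME = N'StagingTable';
DECLARE @FailClause NVARCHAR(MAX);
DECLARE @FailedRows INT;

-- Build "not (expr1) or not (expr2) or ... 1=0".
-- TYPE + .value() returns the text without XML entity escaping.
SELECT @FailClause = ISNULL(
        (SELECT N'not (' + CustomValidation + N') or '
         FROM   Validations v
         FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), N'') + N'1=0';

DECLARE @sql NVARCHAR(MAX) =
      N'select @failed = count(*) from ' + QUOTENAME(@StagingTableName)
    + N' where ' + @FailClause;

EXEC sp_executesql @sql, N'@failed INT OUTPUT', @failed = @FailedRows OUTPUT;

-- 1 when no row failed any expression, otherwise 0.
SELECT CASE WHEN @FailedRows = 0 THEN 1 ELSE 0 END AS PassedCustomValidation;
```

One thing to keep in mind with this shape: a row for which an expression evaluates to UNKNOWN (for example because a column is NULL) is not counted as a failure, which mirrors how CHECK constraints treat NULLs.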