Checking whether conditions are met by all rows with dynamic SQL

I have a table in SQL Server 2008 which contains custom validation criteria in the form of expressions stored as text, e.g.
StagingTableID  CustomValidation
------------------------------------------------------------------------------------
1               LEN([mobile])<=30
3               [Internal/External] IN ('Internal','External')
3               ([Internal/External] <> 'Internal') OR (LEN([Contact Name])<=100)
...
I am interested in determining whether all rows in a table pass the conditional statement. For this purpose I am writing a validation stored procedure which checks whether all values in a given field in a given table meet the given condition(s). SQL is not my forte, so after reading this question, this is my first stab at the problem:
EXEC sp_executesql N'SELECT @passed = 0 WHERE EXISTS (' +
    N'SELECT * FROM (' +
    N'SELECT CASE WHEN ' + @CustomValidationExpr + N' THEN 1 ' +
    N'ELSE 0 END AS ConditionalTest ' +
    N'FROM ' + @StagingTableName +
    N')t ' +
    N'WHERE t.ConditionalTest = 0)'
    ,N'@passed BIT OUTPUT'
    ,@passed = @PassedCustomValidation OUTPUT
However, I'm not sure whether the nested queries can be rewritten as one, or whether there is an entirely better way to test the validity of all rows in this scenario?
Thanks in advance!

You should be able to reduce by at least one subquery like this:
DECLARE @sql nvarchar(max) =
    N'SELECT @passed = 0 WHERE EXISTS (' +
    N'SELECT 1 FROM ' + @StagingTableName +
    N' WHERE NOT (' + @CustomValidationExpr + N'))';

EXEC sp_executesql @sql
    ,N'@passed BIT OUTPUT'
    ,@passed = @PassedCustomValidation OUTPUT
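One caveat: the dynamic statement only assigns @passed when a failing row exists, and OUTPUT parameters carry their value in as well as out, so initialize the variable to 1 before the call. A minimal sketch of the surrounding procedure logic:
-- Sketch: assume 'passed' until the dynamic check finds a failure.
SET @PassedCustomValidation = 1;

-- ... EXEC sp_executesql as above ...

IF @PassedCustomValidation = 1
    PRINT 'All rows passed this validation';
ELSE
    PRINT 'At least one row failed';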

Before we answer the original question, have you looked into implementing constraints? This will prevent bad data from entering your database in the first place. Or is the point that these must be dynamically set in the application?
ALTER TABLE StagingTable
WITH CHECK ADD CONSTRAINT [StagingTable$MobileValidLength]
CHECK (LEN([mobile])<=30)
GO
ALTER TABLE StagingTable
WITH CHECK ADD CONSTRAINT [StagingTable$InternalExternalValid]
CHECK ([Internal/External] IN ('Internal','External'))
GO
--etc...

You need to concatenate the expressions together. I agree with @PinnyM that a where clause is easier for full table validation. However, the next question will be how to identify which rows fail which tests. I'll wait for you to ask that question before answering it (ask it as a separate question, not as an edit to this one).
To create the where clause, something like this:
declare @WhereClause nvarchar(max);

select @WhereClause = (select '(' + CustomValidation + ') and '
                       from Validations v
                       for xml path ('')
                      ) + '1=1';

-- for xml path('') entitizes the comparison operators; turn them back
select @WhereClause = replace(replace(@WhereClause, '&lt;', '<'), '&gt;', '>');
This strange construct, with the for xml path('') and the double select, is the most convenient way to concatenate values in SQL Server.
Also, put together your query before doing the sp_executesql call. It gives you more flexibility:
declare @sql nvarchar(max);

select @sql = '
select @passed = count(*)
from ' + @StagingTableName + '
where ' + @WhereClause;
That gives the number of rows that pass all the validation tests. The where clause for the failures is:
declare @WhereClause nvarchar(max);

select @WhereClause = (select 'not (' + CustomValidation + ') or '
                       from Validations v
                       for xml path ('')
                      ) + '1=0';
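Plugging that clause into the same pattern gives the failing-row count (a sketch, assuming the same @StagingTableName as above):
-- Sketch: count the rows that fail at least one validation,
-- capturing the result via an output parameter.
declare @sql nvarchar(max);
declare @failed int;

select @sql = '
select @failed = count(*)
from ' + @StagingTableName + '
where ' + @WhereClause;

exec sp_executesql @sql, N'@failed int output', @failed = @failed output;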

Related

While using SP_ExecuteSQL I'm generating a boolean error

I'm very new to SQL, so let me first apologize if my questions come off as trivial or it seems as though I haven't done my work. I am trying to learn how to grasp a lot of these concepts.
Anyway, I'm writing a complex query that will eventually take many parameters. For these parameters I'll be using comma-delimited strings to allow multiple values. (I've solved this issue previously, but not when attempting to execute via sp_executesql.)
With that being said, here is the bare bones of the query.
DECLARE @system_status varchar(30)
SELECT @system_status = '12,14'

DECLARE @sql nvarchar(4000)
SELECT @sql = 'SELECT [system_status]
FROM VW_Document_Main
WHERE 1=1 '

IF @system_status = '-1'
Begin
    SELECT @sql = @sql + 'and system_status <> 20'
End
ELSE IF @system_status IS NOT NULL AND @system_status NOT IN ('-1','0')
Begin
    SELECT @sql = @sql + 'and ' + @system_status + ' LIKE ''%,'' + system_Status + '',%'''
End
I'm able to build a usable query when not wrapping it in sp_executesql; however, since I'll be building onto this query, it's necessary to take these steps... any thoughts as to why I'm getting the non-Boolean error?
EDIT: Not sure if it's a step in the right direction, but now after reworking the final SELECT statement to read:
SELECT @sql = @sql + 'and '',''' + @system_Status + '',''' LIKE ''%,' + 'system_Status' + ',%'''
It's giving me back a different error: A SELECT statement that assigns a value to a variable must not be combined with data-retrieval operations.
It's worth noting that the original error reads: An expression of non-boolean type specified in a context where a condition is expected, near ','.
You're plugging your literal @system_status value '12,14' into the final SQL, so it looks like this:
SELECT [system_status] FROM VW_Document_Main WHERE 1=1 and 12,14 LIKE '%,' + system_Status + ',%'
Which will fail because you don't have 12,14 in quotes. So you'll have to modify the line
SELECT @system_status = '12,14'
to be
SELECT @system_status = '''12,14'''
Alternatively, since you said you're running this through sp_executesql you should modify your last select to
SELECT @sql = @sql + 'and @system_status LIKE ''%,'' + system_Status + '',%'''
And change your execute SQL call to
sp_executesql @sql, N'@system_status varchar(30)', @system_status
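Put together, the whole thing would look roughly like this (a sketch; VW_Document_Main and system_status come from the question, and the doubled-comma wrapping follows the intent of the EDIT above):
-- Sketch: build the query, then bind @system_status as a real parameter.
DECLARE @system_status varchar(30) = '12,14';
DECLARE @sql nvarchar(4000) = N'SELECT [system_status]
FROM VW_Document_Main
WHERE 1=1 ';

IF @system_status IS NOT NULL AND @system_status NOT IN ('-1','0')
    -- assumes system_status is a character column; CAST it if numeric
    SET @sql = @sql + N'and '','' + @system_status + '','' LIKE ''%,'' + system_status + '',%''';

-- the parameter is bound by sp_executesql, so no quote-escaping is needed
EXEC sp_executesql @sql, N'@system_status varchar(30)', @system_status;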

SQL merge functionality without specifying column names

I'm doing a SqlBulkCopy from my web app and inserting the records into a staging table. This is my first time working with staging tables. The live table that will be accepting the data has about 200 fields and can change in the future. When that change occurs I don't want to have to rewrite the merge statement.
I came up with this SQL that mimics the merge functionality but doesn't require me to spell out the table columns. I am not an SQL expert, and I'd like someone who is to take a look and let me know if you see any problems that could arise from using this SQL, because I haven't found any examples of this approach despite much searching.
Note that records in the staging table that have a null id field are to be inserted.
-- set the table names, primary key field & vars to hold query parts
DECLARE @LiveTable varchar(20) = 'Test'
DECLARE @StagingTable varchar(20) = 'TestStaging'
DECLARE @PKField varchar(20) = 'TestPK'
DECLARE @SQLSet nvarchar(MAX) = ''
DECLARE @SQLInsertFields nvarchar(MAX) = ''

-- get comma delimited field names
DECLARE @Fields nvarchar(MAX) = (SELECT dbo.fn_GetCommaDelimitedFieldNames(@LiveTable))

-- loop through fields generating set clause of query to execute
WHILE LEN(@Fields) > 0
BEGIN
    DECLARE @Field varchar(50) = left(@Fields, CHARINDEX(',', @Fields+',')-1)
    IF @Field <> @PKField -- the primary key field cannot be updated
    BEGIN
        SET @SQLSet += ', ' + @LiveTable + '.' + @Field + ' = ' + @StagingTable + '.' + @Field
        SET @SQLInsertFields += ', ' + @Field
    END
    SET @Fields = STUFF(@Fields, 1, CHARINDEX(',', @Fields+','), '')
END

-- remove the leading comma
SET @SQLSet = SUBSTRING(@SQLSet, 3, LEN(@SQLSet))
SET @SQLInsertFields = SUBSTRING(@SQLInsertFields, 3, LEN(@SQLInsertFields))

-- update records from staging table where primary key is provided
DECLARE @SQL nvarchar(MAX) = N'UPDATE ' + @LiveTable +
    ' SET ' + @SQLSet +
    ' FROM ' + @LiveTable +
    ' INNER JOIN ' + @StagingTable +
    ' ON ' + @LiveTable + '.' + @PKField + ' = ' + @StagingTable + '.' + @PKField

-- insert records from staging table where primary key is null
SET @SQL += '; INSERT INTO ' + @LiveTable + ' (' + @SQLInsertFields + ') SELECT ' + @SQLInsertFields + ' FROM ' + @StagingTable + ' WHERE ' + @PKField + ' IS NULL'

-- delete the records from the staging table
SET @SQL += '; DELETE FROM ' + @StagingTable

-- execute the sql statement to update existing records and insert new records
exec sp_executesql @SQL;
If anyone sees any issues with performance or anything else, I'd appreciate the insight.
Don't do this. Really. You're working very hard to avoid a rare problem that you probably won't handle correctly when the time comes.
If the target table changes, how do you know it will change in such a way that your fancy dynamic SQL will work correctly? How can you be sure it won't seem to work -- i.e. will work, syntactically -- but actually do the wrong thing? If and when the target table changes, won't you have to change your application, and the staging table too? With all that in the air, what's adding one more SET clause?
In the meanwhile, how can anyone be expected to read that gobbledygook (not your fault, really, that's SQL's syntax)? A bog-standard insert statement would be very clear and dependable.
And fast. SQL Server can't optimize your dynamic query. You used bulk copy for efficiency, and now you're defeating it with well-meaning future-proofing.
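For contrast, the bog-standard version of the same merge, spelled out (a sketch using the question's Test/TestStaging/TestPK names and two hypothetical columns, col1 and col2; the real table has ~200):
-- Sketch: explicit update/insert/clear; verbose, but transparent to
-- both the reader and the optimizer. col1/col2 are hypothetical.
UPDATE t
SET    col1 = s.col1,
       col2 = s.col2
FROM   Test t
INNER JOIN TestStaging s ON s.TestPK = t.TestPK;

INSERT INTO Test (col1, col2)
SELECT col1, col2
FROM   TestStaging
WHERE  TestPK IS NULL;

DELETE FROM TestStaging;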

Is there a way to replace a character or string in all fields without writing it for each field?

I will warn you up front, this question borders on silly, but I'm asking anyway.
The impetus for my question is creating a CSV from a query result where some of the fields already contain commas. Obviously, the CSV doesn't know any better and just merrily jacks up my good mood by letting those stragglers shift values into the wrong columns.
I know I can write
Replace(FieldName, OldChar, NewChar)
for each field, but I'm more curious than anything if there's a shortcut to replace them all in the query output.
Basically what I'm looking for (logically) is:
Replace(AllFields, OldChar, NewChar)
I don't know all of the SQL tricks (or many of them), so I thought maybe the SO community may be able to enlighten me...or call me nuts.
There is no SQL syntax to do what you describe, but as you've seen there are many ways to do this with dynamic SQL. Here's the way I prefer (this assumes you want to replace commas with pipe, change this as you see fit):
DECLARE @table NVARCHAR(511),
        @newchar NCHAR(1),
        @sql NVARCHAR(MAX);

SELECT @table = N'dbo.table_name',
       @newchar = N'|', -- tailor accordingly
       @sql = N'';

SELECT @sql = @sql + ',
    ' + QUOTENAME(name)
    + ' = REPLACE(CONVERT(NVARCHAR(MAX), ' + QUOTENAME(name) + '),'','','''
    + @newchar + ''')'
FROM sys.columns
WHERE [object_id] = OBJECT_ID(@table)
ORDER BY column_id;

SELECT @sql = N'SELECT ' + STUFF(@sql, 1, 1, '') + '
FROM ' + @table;

PRINT @sql;
-- EXEC sp_executesql @sql;
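For a hypothetical table dbo.table_name with columns a and b, the printed statement comes out roughly as:
-- Roughly what PRINT @sql produces (hypothetical columns a and b):
SELECT
    [a] = REPLACE(CONVERT(NVARCHAR(MAX), [a]), ',', '|'),
    [b] = REPLACE(CONVERT(NVARCHAR(MAX), [b]), ',', '|')
FROM dbo.table_name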
I feel your pain. I often have one-time type cleansing steps in ETL routines. I find a script like this helps when you need to remove some oddity from an import (rogue page breaks, whitespace, etc.):
declare @tableName nvarchar(100) = 'dbo.YourTable';
declare @col nvarchar(max);

-- remove quotes and trim every column, kill page breaks, etc.
;with c_Col (colName)
as ( select c.name
     from sys.tables t
     join sys.columns c on
         c.object_id = t.object_id
     where t.object_id = object_id(@tableName)
   )
select @col = stuff(a.n, 1, 1, '')
from ( select top 100 percent
              ',' + c.colName + '= nullif(replace(replace(replace(rtrim(ltrim(' + c.colName + ')), ''"'', ''''), char(13), ''''), char(10), ''''), '''') '
       from c_Col c
       for xml path('')
     ) as a(n);

declare @cmd nvarchar(max);
set @cmd = 'update ' + @tableName + ' set ' + @col;

print @cmd;
--exec(@cmd);
If you are just looking to save yourself some typing for a one time query statement affecting all fields in a table then this is a trick I've used in the past.
First query the schema to produce a result set that returns all the field names in any table you specify. You can modify what I've provided here as a template but I've given the basic structure of an update statement around the field names.
select column_name + ' = Replace(' + column_name + ',OldChar,NewChar),'
from information_schema.columns
where table_name = 'YourTableName'
The result set comes back in query analyzer as a series of rows that you can highlight (by clicking on column name) and then copying and pasting right back into your query analyzer window. From there add your update statement to the beginning and where clause to the end. You'll also need to get rid of the one extra comma.
You can then run the assembled statement to produce the desired outcome.
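Assembled, the pasted rows end up looking something like this (a sketch with hypothetical columns city and notes, replacing commas with pipes; substitute your own characters and condition):
-- Hypothetical assembled statement; note the final trailing comma removed.
UPDATE YourTableName
SET city  = Replace(city, ',', '|'),
    notes = Replace(notes, ',', '|')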

MSSQL: given a table's object_id, determine whether it is empty

For a bit of database-sanity checking code, I'd like to determine whether a particular object_id corresponds to an empty table.
Is there some way to (for instance) select count(*) from magic_operator(my_object_id) or similar?
I'd strongly prefer a pure-SQL solution that can run on MS SQL Server 2008 R2.
You can get a rough idea from
SELECT SUM(rows)
FROM sys.partitions p
WHERE index_id < 2 and p.object_id = @my_object_id
If you want guaranteed accuracy you will need to construct and execute a dynamic SQL string containing the two-part object name. An example is below, though depending on how you are using this you may prefer to use sp_executesql and return the result as an output parameter instead (see the sketch after the example).
DECLARE @DynSQL nvarchar(max) =
    N'SELECT CASE WHEN EXISTS(SELECT * FROM ' +
    QUOTENAME(OBJECT_SCHEMA_NAME(@my_object_id)) + '.' +
    QUOTENAME(OBJECT_NAME(@my_object_id)) +
    N') THEN 0 ELSE 1 END AS IsEmpty'

EXECUTE (@DynSQL)
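The sp_executesql variant with an output parameter, as mentioned above (a sketch; @my_object_id is assumed to hold the object id being checked):
-- Sketch: same emptiness check, returning the result via an OUTPUT parameter.
DECLARE @IsEmpty bit;
DECLARE @DynSQL nvarchar(max) =
    N'SELECT @IsEmpty = CASE WHEN EXISTS(SELECT * FROM ' +
    QUOTENAME(OBJECT_SCHEMA_NAME(@my_object_id)) + '.' +
    QUOTENAME(OBJECT_NAME(@my_object_id)) +
    N') THEN 0 ELSE 1 END';

EXEC sp_executesql @DynSQL, N'@IsEmpty bit OUTPUT', @IsEmpty = @IsEmpty OUTPUT;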
Well, it depends on what you consider "pure SQL".
I've come up with the following solution. It is written purely in T-SQL, but uses a dynamically built query.
-- Using variables just for better readability.
DECLARE @Name NVARCHAR(4000)
DECLARE @Schema NVARCHAR(4000)
DECLARE @Query NVARCHAR(4000)

-- Get the relevant data
SET @Schema = QUOTENAME(OBJECT_SCHEMA_NAME(613577224))
SET @Name = QUOTENAME(OBJECT_NAME(613577224))

-- Build query taking into consideration the schema and possible poor object naming
SET @Query = 'SELECT COUNT(*) FROM ' + @Schema + '.' + @Name

-- execute it.
EXEC(@Query)
EDIT
The changes consider the possible faulty cases described in the comments.
I've outlined the variables, because this is a convenient approach for me. Cheers.

Accessing 400 tables in a single query

I want to delete rows with a condition from multiple tables.
DELETE
FROM table_1
WHERE lst_mod_ymdt = '2011-01-01'
The problem is that there are 400 such tables, from table_1 to table_400.
Can I apply the query to all the tables in a single query?
If you're using SQL Server 2005 or later you can try something like this (other versions and RDBMSs have similar ways to do this):
DECLARE @sql VARCHAR(MAX)
SET @sql = (SELECT 'DELETE FROM [' + REPLACE(Name, '''','''''') + '] WHERE lst_mod_ymdt = ''' + @lst_mod_ymdt + ''';'
            FROM sys.tables
            WHERE Name LIKE 'table_%'
            FOR XML PATH(''))

--PRINT @sql;
EXEC ( @sql );
And as always with dynamic sql, remember to escape the ' character.
This will likely fall over if you have, say, a table_341 that doesn't have a lst_mod_ymdt column.
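If that's a concern, one workaround (a sketch; it assumes @lst_mod_ymdt is declared elsewhere, as in the answer above) is to join sys.columns so that DELETEs are only generated for tables that actually have the column:
-- Sketch: generate DELETEs only for tables that have lst_mod_ymdt.
DECLARE @sql VARCHAR(MAX)
SET @sql = (SELECT 'DELETE FROM ' + QUOTENAME(t.name) + ' WHERE lst_mod_ymdt = ''' + @lst_mod_ymdt + ''';'
            FROM sys.tables t
            INNER JOIN sys.columns c
                ON c.[object_id] = t.[object_id]
               AND c.name = 'lst_mod_ymdt'
            WHERE t.name LIKE 'table[_]%'   -- [_] escapes the underscore wildcard
            FOR XML PATH(''))
EXEC ( @sql );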