Dynamic SQL with Loop Over All Columns in a Table

I was recently pulled off an ASP.NET conversion project at my new job to help with a slow, mundane, but desperate task another department is handling. Basically, they are running a simple SQL script by hand on every column of every table in every database (it's horrible) to generate counts of the distinct values in each column. My SQL experience is limited and my dynamic SQL experience is more or less zero, but since I have not yet been given permission to even access this particular database, I went to work on a more automated query to perform this task, testing against a database I do have access to.
In short, I ran into some issues, and I was hoping someone might be able to help me fill in the blanks. It will save this department an estimated month or more of work if something more automated can be used.
These are the two scripts I was given and told to run on each column. The first is for any column that is neither bit/boolean nor datetime; the second is for datetime columns.
-- for non-bit, non-datetime columns
select columnName, count(*) qty
from tableName
group by columnName
order by qty desc

-- for datetime columns
select year(a.columnName), count(*) qty
from tableName a
group by year(a.columnName)
order by qty desc
Doing this thousands of times doesn't seem like a lot of fun to me, so here is, more or less, some pseudo-code that I came up with that I think could solve the issue; I will point out the areas I am fuzzy on.
declare @sql nvarchar(2500)
set @sql = 'the first part(s) of statement'

[pseudo-pseudo] Get "List" of All Column Names in Table (I do not believe there is a collection datatype in T-SQL, but you get the idea)
[pseudo-pseudo] Loop Through "List" of Column Names
(I know this dot notation wouldn't work, but I would like to perform something similar to this)

IF ColumnName.DataType LIKE 'date%'
    set @sql = @sql + ' something'
ELSE IF ColumnName.DataType = 'bit'
    set @sql = @sql + ' something else' --actually it'd be preferable to skip bit/boolean datatypes entirely, as these aren't necessary for the reports being created from these queries
ELSE
    set @sql = @sql + ' something other than something else'

set @sql = @sql + ' ending part of statement'
EXEC(@sql)
So to summarize, for simplicity's sake I'd like to let the user plug the table's name into a variable at the start of the query:
declare @tableName nvarchar(50)
set @tableName = 'TABLENAME' --Enter Query's Table Name Here
Based on this, the code will loop through every column of that table, checking its datatype. If the datatype is datetime (or another date-like datatype), the "year" logic would be added to the dynamic SQL. Anything else (except bit/boolean) would get the default logic.
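For reference, the column "list" and datatypes can come straight from the catalog views; a minimal sketch, assuming SQL Server:

declare @tableName nvarchar(50)
set @tableName = 'TABLENAME'

-- one row per column: the column's name plus its datatype name
select c.name as column_name,
       type_name(c.user_type_id) as data_type
from sys.columns c
where c.object_id = object_id(@tableName)
order by c.column_id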
Again, for simplicity's sake (even if it is bad practice), I figure the end result will be a dynamic SQL statement with multiple selects, one per column in the table. The user would then simply copy the output to Excel (which they are doing right now anyway). I know this isn't the perfect solution, so I am open to suggestions; but since time is of the essence and my experience with dynamic SQL is close to null, I thought a somewhat quick-and-dirty approach would be tolerable in this case.
I do apologize for my very haphazard preparation of this question, but I do hope someone out there might be able to steer me in the right direction.
Thanks so much for your time, I certainly appreciate it.

Here's an example working through all the suggestions in the comments.
declare @sql nvarchar(max);

declare stat_cursor cursor local fast_forward for
select
    case when x.name not in ('date', 'datetime2', 'smalldatetime', 'datetime') then
        N'select
    ' + quotename(s.name, '''') + ' as schema_name,
    ' + quotename(t.name, '''') + ' as table_name,
    ' + quotename(c.name) + ' as column_name,
    count(*) qty
from ' + quotename(s.name) + '.' + quotename(t.name) + '
group by ' + quotename(c.name) + '
order by qty desc;'
    else
        N'select
    ' + quotename(s.name, '''') + ' as schema_name,
    ' + quotename(t.name, '''') + ' as table_name,
    year(' + quotename(c.name) + ') as column_name,
    count(*) qty
from ' + quotename(s.name) + '.' + quotename(t.name) + '
group by year(' + quotename(c.name) + ')
order by qty desc;'
    end
from sys.schemas s
inner join sys.tables t on s.schema_id = t.schema_id
inner join sys.columns c on c.object_id = t.object_id
inner join sys.types x on c.system_type_id = x.user_type_id
where x.name not in (
    'geometry', 'geography', 'hierarchyid', 'xml', 'timestamp',
    'bit', 'image', 'text', 'ntext'
);

open stat_cursor;
fetch next from stat_cursor into @sql;

while @@fetch_status = 0
begin
    exec sp_executesql @sql;
    fetch next from stat_cursor into @sql;
end;

close stat_cursor;
deallocate stat_cursor;
Example SQLFiddle (note this only shows the first iteration through the cursor. Not sure if this is a limitation of SQLFiddle or a bug).
I'd probably stash the results into a separate database if I were doing this. Also, I'd probably put the SQL-building bits into user-defined functions for maintainability (the slow part will be running the queries; there's no point optimizing their generation).
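For example, the non-date branch could be factored into something like this (dbo.BuildCountQuery is a hypothetical name, shown only as a sketch of the idea):

-- hypothetical helper: builds the group-by/count statement for one column
create function dbo.BuildCountQuery
(
    @schema sysname,
    @table  sysname,
    @column sysname
)
returns nvarchar(max)
as
begin
    return N'select '
         + quotename(@schema, '''') + ' as schema_name, '
         + quotename(@table, '''') + ' as table_name, '
         + quotename(@column) + ' as column_name, count(*) qty from '
         + quotename(@schema) + '.' + quotename(@table)
         + N' group by ' + quotename(@column)
         + N' order by qty desc;';
end;

The cursor's select then just calls the function, and the date/non-date difference lives in one place.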

Related

SQL Server 2008: create trigger across all tables in db

Using SQL Server 2008, I've created a database where every table has a datetime column called "CreatedDt". What I'd like to do is create a trigger for each table so that when a value is inserted, the CreatedDt column is populated with the current date and time.
If you'll pardon my pseudocode, what I'm after is the T-SQL equivalent of:
foreach (Table in MyDatabase)
{
    create trigger CreatedDtTrigger
    {
        on insert createddt = datetime.now;
    }
}
If anyone would care to help out, I'd greatly appreciate it. Thanks!
As @EricZ says, the best thing to do is bind a default for the column. Sure enough, you can add one to every table using a cursor and dynamic SQL:
declare @table sysname, @cmd nvarchar(max)

declare c cursor for
    select name from sys.tables where is_ms_shipped = 0 order by name

open c; fetch next from c into @table
while @@fetch_status = 0
begin
    set @cmd = 'ALTER TABLE ' + quotename(@table)
             + ' ADD CONSTRAINT ' + quotename('DF_' + @table + '_CreatedDt')
             + ' DEFAULT GETDATE() FOR CreatedDt'
    exec sp_executesql @cmd
    fetch next from c into @table
end
close c; deallocate c
No need to go for cursors. Just copy the result of the query below and execute it:
select distinct 'ALTER TABLE ' + t.name +
       ' ADD CONSTRAINT DF_' + t.name + '_crdt DEFAULT getdate() FOR ' + c.name
from sys.tables t
inner join sys.columns c on t.object_id = c.object_id
where c.name like '%your column name%' -- e.g. '%CreatedDt%'
Here's another method:
DECLARE @SQL nvarchar(max);

SELECT @SQL = Coalesce(@SQL + '
', '')
    + 'ALTER TABLE ' + QuoteName(T.TABLE_SCHEMA) + '.' + QuoteName(T.TABLE_NAME)
    + ' ADD CONSTRAINT ' + QuoteName('DF_'
        + CASE WHEN T.TABLE_SCHEMA <> 'dbo' THEN T.TABLE_SCHEMA + '_' ELSE '' END
        + T.TABLE_NAME + '_' + C.COLUMN_NAME) -- include the table name so constraint names stay unique
    + ' DEFAULT (GetDate()) FOR ' + QuoteName(C.COLUMN_NAME)
    + ';'
FROM
    INFORMATION_SCHEMA.TABLES T
    INNER JOIN INFORMATION_SCHEMA.COLUMNS C
        ON T.TABLE_SCHEMA = C.TABLE_SCHEMA
        AND T.TABLE_NAME = C.TABLE_NAME
WHERE
    T.TABLE_TYPE = 'BASE TABLE' -- INFORMATION_SCHEMA.TABLES also lists views
    AND C.COLUMN_NAME = 'CreatedDt';

EXEC (@SQL);

This yields, and runs, a series of statements similar to the following:

ALTER TABLE [schema].[TableName] -- (line break added)
ADD CONSTRAINT [DF_schema_TableName_CreatedDt] DEFAULT (GetDate()) FOR [CreatedDt];
Some notes:
This uses the INFORMATION_SCHEMA views. It is best practice to use these where possible instead of the system tables because they are guaranteed to not change between versions of SQL Server (and moreover are supported on many DBMSes, so all things being equal it's best to use standards-compliant/portable code).
In a database with a case-sensitive default collation, one MUST use upper case for the INFORMATION_SCHEMA view names and column names.
When creating script it's important to pay attention to schema names and proper escaping (using QuoteName). Not doing so will break in someone's system some day.
I think it is best practice to put the DEFAULT expression inside parentheses. No error is raised without them in this case, but with them, if GetDate() is ever replaced with a more complex expression, nothing will break.
If you decide that column defaults are not going to work for you, then the triggers you imagined are still possible. But it will take some serious work: you must manage whether each trigger already exists and alter or create it appropriately, join to the inserted pseudo-table inside the trigger, and do so based on the full list of primary key columns for the table (if a table has no primary key, you're out of luck). It is quite possible, but extremely difficult; you could end up with nested, nested, nested dynamic SQL. I have one such automated object-creating script that contains 13 quote marks in a row...
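For a single table with a known key, the trigger such a script would have to emit looks roughly like this (a sketch; dbo.MyTable and MyTableID are hypothetical stand-ins):

-- after-insert trigger stamping CreatedDt, joined to inserted on the primary key
create trigger TR_MyTable_CreatedDt on dbo.MyTable
after insert
as
begin
    update t
    set CreatedDt = getdate()
    from dbo.MyTable t
    inner join inserted i on i.MyTableID = t.MyTableID;
end

The dynamic-SQL wrapper would have to generate that text, with the correct join columns, for every table.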

Checking whether conditions are met by all rows with dynamic SQL

I have a table in SQL Server 2008 which contains custom validation criteria in the form of expressions stored as text, e.g.
StagingTableID  CustomValidation
--------------------------------------------------------------------
1               LEN([mobile])<=30
3               [Internal/External] IN ('Internal','External')
3               ([Internal/External] <> 'Internal') OR (LEN([Contact Name])<=100)
...
I am interested in determining whether all rows in a table pass the conditional statement. For this purpose I am writing a validation stored procedure which checks whether all values in a given field in a given table meet the given condition(s). SQL is not my forte, so after reading this question, this is my first stab at the problem:
EXEC sp_executesql N'SELECT @passed = 0 WHERE EXISTS (' +
        N'SELECT * FROM (' +
            N'SELECT CASE WHEN ' + @CustomValidationExpr + N' THEN 1 ' +
            N'ELSE 0 END AS ConditionalTest ' +
            N'FROM ' + @StagingTableName +
        N')t ' +
        N'WHERE t.ConditionalTest = 0)'
    ,N'@passed BIT OUTPUT'
    ,@passed = @PassedCustomValidation OUTPUT
However, I'm not sure if the nested queries can be rewritten as one, or if there is an entirely better way to test the validity of all rows in this scenario?
Thanks in advance!
You should be able to reduce by at least one subquery like this:
-- sp_executesql won't accept a concatenated expression as its first argument,
-- so build the statement into a variable first
DECLARE @sql nvarchar(max) = N'SELECT @passed = 0 WHERE EXISTS (' +
    N'SELECT 1 FROM ' + @StagingTableName +
    N' WHERE NOT (' + @CustomValidationExpr + N'))';

EXEC sp_executesql @sql
    ,N'@passed BIT OUTPUT'
    ,@passed = @PassedCustomValidation OUTPUT
Before we answer the original question, have you looked into implementing constraints? This will prevent bad data from entering your database in the first place. Or is the point that these must be dynamically set in the application?
ALTER TABLE StagingTable
WITH CHECK ADD CONSTRAINT [StagingTable$MobileValidLength]
CHECK (LEN([mobile])<=30)
GO
ALTER TABLE StagingTable
WITH CHECK ADD CONSTRAINT [StagingTable$InternalExternalValid]
CHECK ([Internal/External] IN ('Internal','External'))
GO
--etc...
You need to concatenate the expressions together. I agree with @PinnyM that a where clause is easier for full table validation. However, the next question will be how to identify which rows fail which tests. I'll wait for you to ask that question before answering it (ask it as a separate question and not as an edit to this one).
To create the where clause, something like this:
declare @WhereClause nvarchar(max);

select @WhereClause = (select CustomValidation + ' and '
                       from Validations v
                       for xml path ('')
                      ) + '1=1'

-- for xml path escapes < and > as &lt; and &gt;, so undo that:
select @WhereClause = replace(replace(@WhereClause, '&lt;', '<'), '&gt;', '>')
This strange construct, with the for xml path('') and the double select, is the most convenient way to concatenate values in SQL Server.
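If the construct is unfamiliar, here is a tiny self-contained illustration of the idiom:

-- concatenates the rows into one string: 'a and b and c and '
select (select name + ' and '
        from (values ('a'), ('b'), ('c')) as t(name)
        for xml path(''));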
Also, put together your query before doing the sp_executesql call. It gives you more flexibility:

declare @sql nvarchar(max);

select @sql = '
select @passed = count(*)
from ' + @StagingTableName + '
where ' + @WhereClause;
That returns the number of rows that pass all the validation tests. The where clause for the failures is:

declare @WhereClause nvarchar(max);

select @WhereClause = (select 'not (' + CustomValidation + ') or '
                       from Validations v
                       for xml path ('')
                      ) + '1=0'
-- (the parentheses matter: NOT binds tighter than OR, so compound validations
-- would otherwise be negated incorrectly; apply the same &lt;/&gt; replace as above)
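Plugged into the same pattern, a sketch of pulling the failing rows themselves:

declare @sql nvarchar(max);

select @sql = '
select *
from ' + @StagingTableName + '
where ' + @WhereClause;

exec sp_executesql @sql;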

SQL - Concatenate all columns from any table

I'm using triggers to audit table changes. Right now I capture the individual column changes in the following:
DECLARE @statement VARCHAR(MAX)

SELECT @statement = 'Col1: ' + CAST(ISNULL(Col1, '') AS VARCHAR)
                  + ', Col2: ' + CAST(ISNULL(Col2, '') AS VARCHAR)
                  + ', Col3: ' + CAST(ISNULL(Col3, '') AS VARCHAR)
FROM INSERTED;
The problem is, I need to tweak the column names for every table/trigger that I want to audit. Is there a way I can build @statement independent of the table, using a more generic approach?
cheers
David
What you need to do is build a memory table (or just a result set) of the column names using the following query, then loop through it to produce the SQL statement you want:
select column_name from information_schema.columns
where table_name like 'tName'
order by ordinal_position
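Rather than an explicit loop, here is a sketch using the SELECT @var = @var + ... concatenation idiom to generate the expression (dbo.YourTable is a hypothetical stand-in; inside a trigger you would select FROM INSERTED instead of from the table):

declare @tName nvarchar(128) = N'dbo.YourTable';
declare @expr nvarchar(max) = N'';

-- builds: 'Col1: ' + CAST(ISNULL([Col1], '') AS VARCHAR(MAX)) + ', ' + 'Col2: ' + ...
select @expr = @expr
    + case when @expr = N'' then N'' else N' + '', '' + ' end
    + N'''' + c.name + N': '' + CAST(ISNULL(' + quotename(c.name) + N', '''') AS VARCHAR(MAX))'
from sys.columns c
where c.object_id = object_id(@tName)
order by c.column_id;

-- note: like the original, this assumes every column's type accepts ISNULL(col, '')
declare @sql nvarchar(max) = N'SELECT ' + @expr + N' FROM ' + @tName + N';';
print @sql;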
However, I am not sure this is the right thing to do for an audit. How are you going to pull the data back later? Say one of your releases happens to drop a column; what will happen then? How will you know which column held which data?

Is there a way to replace a character or string in all fields without writing it for each field?

I will warn you up front, this question borders on silly, but I'm asking anyway.
The impetus for my question is creating a csv from a query result where some of the fields already contain commas. Obviously, the csv doesn't know any better and just merrily jacks up my good mood by leaving some stragglers in non-field columns.
I know I can write
Replace(FieldName, OldChar, NewChar)
for each field, but I'm more curious than anything if there's a shortcut to replace them all in the query output.
Basically what I'm looking for (logically) is:
Replace(AllFields, OldChar, NewChar)
I don't know all of the SQL tricks (or many of them), so I thought maybe the SO community may be able to enlighten me...or call me nuts.
There is no SQL syntax to do what you describe, but as you've seen there are many ways to do this with dynamic SQL. Here's the way I prefer (this assumes you want to replace commas with a pipe; change that as you see fit):
DECLARE @table NVARCHAR(511),
        @newchar NCHAR(1),
        @sql NVARCHAR(MAX);

SELECT @table = N'dbo.table_name',
       @newchar = N'|', -- tailor accordingly
       @sql = N'';

SELECT @sql = @sql + ',
    ' + QUOTENAME(name)
    + ' = REPLACE(CONVERT(NVARCHAR(MAX), ' + QUOTENAME(name) + '),'','','''
    + @newchar + ''')'
FROM sys.columns
WHERE [object_id] = OBJECT_ID(@table)
ORDER BY column_id;

SELECT @sql = N'SELECT ' + STUFF(@sql, 1, 1, '') + '
FROM ' + @table;

PRINT @sql;
-- EXEC sp_executesql @sql;
I feel your pain. I often have one-off cleansing steps in ETL routines. I find a script like this helps when you need to remove some oddity from an import (rogue page breaks, whitespace, etc.):
declare @tableName nvarchar(100) = 'dbo.YourTable';
declare @col nvarchar(max);

-- remove quotes and trim every column, kill page breaks, etc.
;with c_Col (colName) as
(
    select c.name
    from sys.tables t
    join sys.columns c on c.object_id = t.object_id
    where t.object_id = object_id(@tableName)
)
select @col = stuff(a.n, 1, 1, '')
from (select ',' + c.colName + ' = nullif(replace(replace(replace(rtrim(ltrim(' + c.colName + ')), ''"'', ''''), char(13), ''''), char(10), ''''), '''') '
      from c_Col c
      for xml path('')
     ) as a(n)

declare @cmd nvarchar(max)
set @cmd = 'update ' + @tableName + ' set ' + @col

print @cmd;
--exec(@cmd);
If you are just looking to save yourself some typing for a one-time query statement affecting all fields in a table, then this is a trick I've used in the past.
First, query the schema to produce a result set that returns all the field names in the table you specify. You can modify what I've provided here as a template; I've given the basic structure of an update statement around the field names.
select column_name + ' = Replace(' + column_name + ',OldChar,NewChar),'
from information_schema.columns
where table_name = 'YourTableName'
The result set comes back in query analyzer as a series of rows that you can highlight (by clicking the column header), copy, and paste right back into your query window. From there, add your UPDATE statement to the beginning and your WHERE clause to the end. You'll also need to remove the one extra trailing comma.
You can then run the query to produce the desired outcome.
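For instance, with a hypothetical three-column table, the pasted rows plus the added pieces end up looking like:

UPDATE YourTableName
SET Col1 = Replace(Col1, OldChar, NewChar),
    Col2 = Replace(Col2, OldChar, NewChar),
    Col3 = Replace(Col3, OldChar, NewChar) -- trailing comma removed here
WHERE <your criteria>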

Accessing 400 tables in a single query

I want to delete rows with a condition from multiple tables.
DELETE
FROM table_1
WHERE lst_mod_ymdt = '2011-01-01'
The problem is that the number of tables is 400, from table_1 to table_400.
Can I apply the query to all the tables in a single query?
If you're using SQL Server 2005 or later, you can try something like this (other versions and RDBMSs have similar ways to do this):
DECLARE @sql VARCHAR(MAX);

SET @sql = (SELECT 'DELETE FROM ' + QUOTENAME(Name)
                 + ' WHERE lst_mod_ymdt = ''' + @lst_mod_ymdt + ''';'
            FROM sys.tables
            WHERE Name LIKE 'table_%'
            FOR XML PATH(''));

--PRINT @sql;
EXEC ( @sql );
And as always with dynamic SQL, remember to escape the ' character.
This will likely fall over if you have, say, a table_341 which doesn't have an lst_mod_ymdt column.
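One way to guard against that, as a sketch: only generate DELETE statements for tables that actually have the column (the [_] in the LIKE escapes the underscore wildcard):

DECLARE @sql VARCHAR(MAX);

SET @sql = (SELECT 'DELETE FROM ' + QUOTENAME(t.name)
                 + ' WHERE lst_mod_ymdt = ''' + @lst_mod_ymdt + ''';'
            FROM sys.tables t
            WHERE t.name LIKE 'table[_]%' -- [_] = a literal underscore
              AND EXISTS (SELECT 1
                          FROM sys.columns c
                          WHERE c.object_id = t.object_id
                            AND c.name = 'lst_mod_ymdt')
            FOR XML PATH(''));

EXEC ( @sql );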