Dynamic SQL Server Query loop through schema find primary key duplicate - sql

EDIT: There seems to be some confusion around the CREATE TABLE statements. They are there solely to demonstrate what tables might come into our Synapse instance, not as actual code that will run. The important part of the question is in the latter half.
I am trying to create a stored procedure that loops through every table in a supplied schema and outputs the count of duplicate primary key rows for each table. Assume that the data is being supplied from elsewhere and the primary keys are not being enforced. For example I may have three tables in the stack schema:
CREATE TABLE stack.table1(
id int,
name NVARCHAR(MAX),
color NVARCHAR(20),
PRIMARY KEY (id))
INSERT INTO stack.table1 VALUES(1,'item1','yellow'),
(2,'item2','blue'),
(2,'item2','blue')
CREATE TABLE stack.table2(
id int,
name NVARCHAR(MAX),
size NVARCHAR(1),
PRIMARY KEY (id,size))
INSERT INTO stack.table2 VALUES(1,'item1','L'),
(2,'item2','M'),
(3,'item2','S')
CREATE TABLE stack.table3(
id int,
name NVARCHAR(MAX),
weight NVARCHAR(20),
PRIMARY KEY (id))
INSERT INTO stack.table3 VALUES(1,'item1','200lb'),
(2,'item2','150lb'),
(3,'item2','125lb')
I want to supply a variable to a stored procedure to indicate the schema (in this case 'stack') and have that procedure spit out a table with the names of the tables in the schema and the counts of duplicate primary key rows. So in this example a stored procedure called 'loopcheck' would look like this:
Query:
EXEC loopcheck @schema = 'stack'
Output:
table | duplicate_count
table1 | 1
table2 | 0
table3 | 0
I am using an Azure Synapse instance so there are several functions that are not available (such as FOR XML PATH and others.) Since each table may have a single column primary key or a composite primary key I need to join to the system provided tables to get primary key info. My general idea was like so:
CREATE procedure loopcheck @schema= NVARCHAR(MAX)
AS
BEGIN
create table #primarykey(
SCHEMA_NAME nvarchar(400),
TABLE_NAME nvarchar(500),
COLUMN_NAME nvarchar(500)
)
insert into #primarykey
select l.TABLE_SCHEMA,
l.TABLE_NAME,
l.COLUMN_NAME
from INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE l
inner join INFORMATION_SCHEMA.TABLE_CONSTRAINTS t on l.constraint_Name = t.CONSTRAINT_NAME
where
l.table_schema = @schema
CREATE TABLE #groupBy2(
TABLE_NAME nvarchar(50),
groupby nvarchar(200)
)
INSERT INTO #groupBy2
SELECT TABLE_NAME, STRING_AGG(CONVERT(NVARCHAR(max), COLUMN_NAME), ',') as groupby
FROM #primarykey
GROUP BY TABLE_NAME
DECLARE @currentTable NVARCHAR(MAX)=''
DECLARE @currentGroup NVARCHAR(MAX)=''
create table #work4(
TABLE_NAME nvarchar(400),
COUNT int)
DECLARE @final NVARCHAR(MAX)=
'INSERT INTO #work4
SELECT '+@currentTable+', COUNT(*) FROM '+@currentTable+'GROUP BY'+@currentGroup
WHILE (SELECT COUNT(*) FROM #groupby2)>0
BEGIN
SET @currentTable =(SELECT TOP 1 TABLE_NAME FROM #groupby2 ORDER BY TABLE_NAME)
SET @currentGroup =(SELECT TOP 1 groupby FROM #groupby2 ORDER BY TABLE_NAME)
exec @final
DELETE #groupby2 where TABLE_NAME =@currentTable
END
END
This code gives me an error:
Incorrect syntax near 'SELECT'
but doesn't give me the actual line it has the error on.

Your primary issue was syntax errors: parameter declaration should not have = between name and type name, and missing spaces in the dynamic SQL.
Also:
A schema name (like any object name) can be at most 128 characters, so you can use the alias sysname, which is nvarchar(128).
You don't need any loops or temp tables; you can build one big dynamic statement and execute it.
CREATE procedure loopcheck
@schema sysname
AS
DECLARE @sql nvarchar(max) = (
SELECT STRING_AGG(CAST('
SELECT
TableName = ' + QUOTENAME(t.name, '''') + ',
IndexName = ' + QUOTENAME(i.name, '''') + ',
Duplicates = COUNT(*)
FROM (
SELECT 1 n
FROM ' + QUOTENAME(s.name) + '.' + QUOTENAME(t.name) + ' t
GROUP BY ' + cols.agg + '
HAVING COUNT(*) > 1
) t
'
AS nvarchar(max)), 'UNION ALL')
FROM sys.tables t
JOIN sys.schemas s ON s.schema_id = t.schema_id
AND s.name = @schema
JOIN sys.indexes i ON i.object_id = t.object_id
AND i.is_unique = 1
CROSS APPLY (
SELECT STRING_AGG('t.' + QUOTENAME(c.name), ', ')
FROM sys.index_columns ic
JOIN sys.columns c ON c.column_id = ic.column_id AND c.object_id = t.object_id
WHERE ic.index_id = i.index_id
AND ic.object_id = i.object_id
AND ic.is_included_column = 0
) cols(agg)
);
PRINT @sql; -- for testing
EXEC sp_executesql @sql;
I feel like there might be a slightly more efficient method using GROUPING SETS in cases where there are multiple unique indexes/constraints on a single table, but I'll leave that to you.
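If you want to try the "one big statement" idea without a Synapse instance, here is a minimal sketch in Python against an in-memory SQLite database. It is not the T-SQL above: SQLite has no sys.indexes, so the key columns are supplied by hand in a hypothetical key_columns mapping, and build_duplicate_sql, quote_ident and quote_literal are names made up for this illustration. The generated SQL has the same shape as the answer (GROUP BY the key columns, HAVING COUNT(*) > 1, glued together with UNION ALL).

```python
import sqlite3


def quote_ident(name):
    """Quote an identifier for SQLite (the role QUOTENAME plays in T-SQL)."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(text):
    """Quote a string literal by doubling embedded single quotes."""
    return "'" + text.replace("'", "''") + "'"


def build_duplicate_sql(key_columns):
    """Build one UNION ALL statement with a duplicate-key count per table."""
    parts = []
    for table, cols in key_columns.items():
        group_by = ", ".join(quote_ident(c) for c in cols)
        parts.append(
            f"SELECT {quote_literal(table)} AS table_name, "
            f"COUNT(*) AS duplicate_count "
            f"FROM (SELECT 1 FROM {quote_ident(table)} "
            f"GROUP BY {group_by} HAVING COUNT(*) > 1)"
        )
    return "\nUNION ALL\n".join(parts)


conn = sqlite3.connect(":memory:")
# Same sample data as the question, with no enforced keys.
conn.executescript("""
CREATE TABLE table1 (id INTEGER, name TEXT, color TEXT);
INSERT INTO table1 VALUES (1,'item1','yellow'), (2,'item2','blue'), (2,'item2','blue');
CREATE TABLE table2 (id INTEGER, name TEXT, size TEXT);
INSERT INTO table2 VALUES (1,'item1','L'), (2,'item2','M'), (3,'item2','S');
CREATE TABLE table3 (id INTEGER, name TEXT, weight TEXT);
INSERT INTO table3 VALUES (1,'item1','200lb'), (2,'item2','150lb'), (3,'item2','125lb');
""")

# Assumption: in SQL Server this mapping would come from the catalog views.
key_columns = {"table1": ["id"], "table2": ["id", "size"], "table3": ["id"]}

sql = build_duplicate_sql(key_columns)
rows = sorted(conn.execute(sql).fetchall())
for table_name, duplicate_count in rows:
    print(table_name, duplicate_count)
```

Note that the count is the number of key values that occur more than once, not the number of surplus rows, which is why table1 reports 1 here.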

Related

Find all unique values of a column name in a SQL database

We are building a large database using SQL. Every table in the database has many columns but one of the columns in the tables tells who added the row of data. That value "Name of person" is tied to a variable in SSIS. Again, the variable tells who added the row. How can I create a query to pull back all the names in that column, no matter where it is used in the database. The value of the column is different depending on the day.
RE: Every table has the same <Column_Name> ... a query to pull back all the values of that column, no matter where it is used in the database.
IF you have a large database and every table has the same column <Column_Name>, then you will be pulling a value from every row in every table ... in a very large database. Not sure that is what you want to do, but it can be easily done. The following will work even if <column_name> is only in a few tables.
Grab a list of every schema.table that contains <column_name>, then loop over it to get the value of <column_name>. The following should work.
DECLARE @colname sysname = '<Column_name>' -- just in case it is not in every table
-- capture name of every table here
CREATE TABLE #tablename ( schema_name sysname, table_name sysname, Id INT IDENTITY(1,1) PRIMARY KEY CLUSTERED)
INSERT INTO #tablename (schema_name, table_name)
SELECT s.name schemaname, t.name tablename FROM sys.columns c
INNER JOIN sys.tables t ON t.object_id = c.object_id
INNER JOIN sys.schemas s ON s.schema_id = t.schema_id
WHERE c.name = @colname
-- capture result of query here
CREATE TABLE #result ( schema_name sysname, table_name sysname, column_value VARCHAR(100) )
DECLARE @i INT = 1, @imax INT
SELECT @imax = MAX(Id) FROM #tablename
-- loop over tablename
DECLARE @query NVARCHAR(MAX)
WHILE @i <= @imax
BEGIN
-- emit the schema and table names as string literals, followed by the column value,
-- so the SELECT returns the three columns that #result expects
SELECT @query = N'SELECT ' + QUOTENAME(schema_name, '''') + N', ' + QUOTENAME(table_name, '''') + N', ' + QUOTENAME(@colname) + N' FROM ' + QUOTENAME(schema_name) + N'.' + QUOTENAME(table_name)
FROM #tablename WHERE Id = @i
INSERT INTO #result ( schema_name, table_name, column_value)
EXEC sp_executesql @query
SET @i += 1
END

How do I create a select statement to return distinct values, column name and table name?

I would like to create a SQL Statement that will return the distinct values of the Code fields in my database, along with the name of the column for the codes and the name of the table on which the column occurs.
I had something like this:
select c.name as 'Col Name', t.name as 'Table Name'
from sys.columns c, sys.tables t
where c.object_id = t.object_id
and c.name like 'CD_%'
It generates the list of columns and tables I want, but obviously doesn't return any of the values for each of the codes in the list.
There are over 100 tables in my database. I could use the above result set and write the query for each one like this:
Select distinct CD_RACE from PERSON
and it will return the values, but it won't return the column and table name, plus I have to do each one individually. Is there any way I can get the value, column name and table name for EACH code in my database?
Any ideas? Thanks...
Just generate your selects and bring in the column and table names as static values. Here's an Oracle version:
select 'select distinct '''||c.column_name||''' as "Col Name", '''||t.table_name||''' as "Table Name", '||c.column_name||' from '||t.table_name||';'
from all_tab_columns c, all_tables t
where c.table_name = t.table_name;
This will give you a set of separate statements. If you really want one big query that returns all your code values at once, modify the generator to put a UNION between the selects.
Here's an approach for SQL Server, since someone else covered Oracle (and no specific DBMS was mentioned). The following steps are completed:
Setup table to receive the schema, table, column name, and column value (in example below only table variable is used)
Build the list of SQL commands to execute (accounting for various schemas and names with spaces and such)
Run each command dynamically inserting values into the setup table from #1 above
Output results from table
Here is the example:
-- Store the values and source of the values
DECLARE @Values TABLE (
SchemaName VARCHAR(500),
TableName VARCHAR(500),
ColumnName VARCHAR(500),
ColumnValue VARCHAR(MAX)
)
-- Build list of SQL Commands to run
DECLARE @Commands TABLE (
Id INT PRIMARY KEY NOT NULL IDENTITY(1,1),
SchemaName VARCHAR(500),
TableName VARCHAR(500),
ColumnName VARCHAR(500),
SqlCommand VARCHAR(1000)
)
INSERT @Commands
SELECT
[TABLE_SCHEMA],
[TABLE_NAME],
[COLUMN_NAME],
'SELECT DISTINCT '
+ '''' + [TABLE_SCHEMA] + ''', '
+ '''' + [TABLE_NAME] + ''', '
+ '''' + [COLUMN_NAME] + ''', '
+ '[' + [COLUMN_NAME] + '] '
+ 'FROM [' + [TABLE_SCHEMA] + '].[' + [TABLE_NAME] + ']'
FROM INFORMATION_SCHEMA.COLUMNS
WHERE COLUMN_NAME LIKE 'CD_%'
-- Loop through commands
DECLARE
@Sql VARCHAR(1000),
@Id INT,
@SchemaName VARCHAR(500),
@TableName VARCHAR(500),
@ColumnName VARCHAR(500)
WHILE EXISTS (SELECT * FROM @Commands) BEGIN
-- Get next set of records
SELECT TOP 1
@Id = Id,
@Sql = SqlCommand,
@SchemaName = SchemaName,
@TableName = TableName,
@ColumnName = ColumnName
FROM @Commands
-- Add values for that command
INSERT @Values
EXEC (@Sql)
-- Remove command record
DELETE @Commands WHERE Id = @Id
END
-- Return the values and sources
SELECT * FROM @Values

Search SQL DB's For A Specific Word

I am completely new to SQL and have no experience whatsoever in it, so please bear with me on this question.
I need to know if it is possible to search a SQL database for a specific word and if so how?
We are currently going through a rebranding project and I need to look in our CMS (Content Management System) database for all reference to an email address. All I need to search for is:
.co.uk
Below is a screenshot of the database in question with all the containing tables. I just can't get my head around SQL, and I have had no joy on Google trying to find the answer.
I need to search everything in this database but I don't know what tables, views, column names etc the content sits in as it's all spread across them all.
There are other tables I need to search but hopefully an answer will be provided which I can modify to search these.
DB's aren't really meant for such vague search descriptions, you should have some definition or model or requirement specs to describe where values like that could exist.
But of course, you could opt for an insanely slow method of doing it by using dynamic SQL.
I made this right fast and just tested it fast, but it should work:
SET NOCOUNT ON
IF OBJECT_ID('tempdb..#SEARCHTABLE') IS NOT NULL
DROP TABLE #SEARCHTABLE
IF OBJECT_ID('tempdb..#RESULTS') IS NOT NULL
DROP TABLE #RESULTS
CREATE TABLE #SEARCHTABLE (ROWNUM INT IDENTITY(1,1), SEARCHCLAUSE VARCHAR(2000) COLLATE DATABASE_DEFAULT)
INSERT INTO #SEARCHTABLE (SEARCHCLAUSE)
SELECT 'SELECT TOP 1 '''+TAB.name+''', '''+C.name+'''
FROM ['+S.name+'].['+TAB.name+']
WHERE '
+CASE WHEN T.name <> 'xml'
THEN '['+C.name+'] LIKE ''%.co.uk%'' AND ['+C.name+'] LIKE ''%@%'''
ELSE 'CAST(['+C.name+'] AS VARCHAR(MAX)) LIKE ''%.co.uk%'' AND CAST(['+C.name+'] AS VARCHAR(MAX)) LIKE ''%@%'''
END AS SEARCHCLAUSE
FROM sys.tables TAB
JOIN sys.schemas S on S.schema_id = TAB.schema_id
JOIN sys.columns C on C.object_id = TAB.object_id
JOIN sys.types T on T.user_type_id = C.user_type_id
WHERE TAB.type_desc = 'USER_TABLE'
AND (T.name LIKE '%char%' OR
T.name LIKE '%xml%')
AND CASE WHEN C.max_length = -1 THEN 10 ELSE C.max_length END >= 6 -- To only search through sufficiently long column
CREATE TABLE #RESULTS (ROWNUM INT IDENTITY(1,1), TABLENAME VARCHAR(256) COLLATE DATABASE_DEFAULT, COLNAME VARCHAR(256) COLLATE DATABASE_DEFAULT)
DECLARE @ROWNUM_NOW INT, @ROWNUM_MAX INT, @SQLCMD VARCHAR(2000), @STATUSSTRING VARCHAR(256)
SELECT @ROWNUM_NOW = MIN(ROWNUM), @ROWNUM_MAX = MAX(ROWNUM) FROM #SEARCHTABLE
WHILE @ROWNUM_NOW <= @ROWNUM_MAX
BEGIN
SELECT @SQLCMD = SEARCHCLAUSE FROM #SEARCHTABLE WHERE ROWNUM = @ROWNUM_NOW
INSERT INTO #RESULTS
EXEC(@SQLCMD)
SET @STATUSSTRING = CAST(@ROWNUM_NOW AS VARCHAR(25))+'/'+CAST(@ROWNUM_MAX AS VARCHAR(25))+', time: '+CONVERT(VARCHAR, GETDATE(), 120)
RAISERROR(@STATUSSTRING, 10, 1) WITH NOWAIT
SELECT @ROWNUM_NOW = @ROWNUM_NOW + 1
END
SET NOCOUNT ON
SELECT 'This table and column contains strings ".co.uk" and a "@"' INFORMATION, TABLENAME, COLNAME FROM #RESULTS
-- Uncomment to drop the created temp tables
--IF OBJECT_ID('tempdb..#SEARCHTABLE') IS NOT NULL
-- DROP TABLE #SEARCHTABLE
--IF OBJECT_ID('tempdb..#RESULTS') IS NOT NULL
-- DROP TABLE #RESULTS
What it does: it searches the DB for all user-created tables, with their schemas, that have (n)char/(n)varchar/xml columns of sufficient length, and searches each column one by one until at least one match is found, then moves on to the next one on the list. A match is any string, or XML cast as a string, that contains the text ".co.uk" and an "@" sign somewhere in it.
It shows the progress of the script on the Messages tab (how many searchable TABLE.COLUMN combinations have been found, which one on that list is currently running, and the current timestamp down to the second). When finished, it shows all the table and column names that contained at least one match.
So from that list, you'll have to search through the tables and columns manually to find exactly how many and what kinds of matches there are, and what it is you actually want to do.
Edit: Again I disregarded using sysnames for sysobjects, but I'll modify later if needed.
I threw together a quick query that seems to work for me:
--Search for a word in the current database
SET NOCOUNT ON;
--First make a hit list of possible tables/ columns
DECLARE @HitList TABLE (
Id INT IDENTITY(1,1) PRIMARY KEY,
TableName VARCHAR(255),
SchemaName VARCHAR(255),
ColumnName VARCHAR(255));
INSERT INTO
@HitList (
TableName,
SchemaName,
ColumnName)
SELECT
t.name,
s.name,
c.name
FROM
sys.tables t
INNER JOIN sys.columns c ON c.object_id = t.object_id
INNER JOIN sys.schemas s ON s.schema_id = t.schema_id
WHERE
c.system_type_id = 167;
--Construct Dynamic SQL
DECLARE @Id INT = 1;
DECLARE @Count INT;
SELECT @Count = COUNT(*) FROM @HitList;
DECLARE @DynamicSQL VARCHAR(1024);
WHILE @Id <= @Count
BEGIN
DECLARE @TableName VARCHAR(255);
DECLARE @SchemaName VARCHAR(255);
DECLARE @ColumnName VARCHAR(255);
SELECT @TableName = TableName FROM @HitList WHERE Id = @Id;
SELECT @SchemaName = SchemaName FROM @HitList WHERE Id = @Id;
SELECT @ColumnName = ColumnName FROM @HitList WHERE Id = @Id;
SELECT @DynamicSQL = 'SELECT * FROM [' + @SchemaName + '].[' + @TableName + '] WHERE [' + @ColumnName + '] LIKE ''%co.uk%''';
--PRINT @DynamicSQL;
EXECUTE (@DynamicSQL);
IF @@ROWCOUNT != 0
BEGIN
PRINT 'We have a hit in ' + @TableName + '.' + @ColumnName + '!!';
END;
SELECT @Id = @Id + 1;
END;
Basically it makes a list of all VARCHAR columns (you might need to change this to include NVARCHARs if you have Unicode text columns: just change the test for system type id from 167 to 231), then performs a search on each one. When you run this from Management Studio, switch to the Messages pane to see the hits and just ignore the results.
It will be slow if your database is any sort of size... but then that is to be expected?
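For anyone who wants to see the "list the text columns, then probe each one" pattern in a form that runs anywhere, below is a rough sketch in Python against an in-memory SQLite database. It is an illustration under assumptions, not a port of either script: the catalog lookups use SQLite's sqlite_master and PRAGMA table_info instead of sys.tables and sys.columns, the function name find_text_hits and the sample tables are invented, and the search term is bound as a parameter rather than concatenated into the statement.

```python
import sqlite3


def quote_ident(name):
    """Quote an identifier for SQLite by doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def find_text_hits(conn, needle):
    """Return sorted (table, column) pairs whose text column contains needle."""
    hits = []
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'").fetchall()]
    for table in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = conn.execute(
            f"PRAGMA table_info({quote_ident(table)})").fetchall()
        for _cid, column, decl_type, *_rest in columns:
            declared = decl_type.upper()
            if "CHAR" not in declared and "TEXT" not in declared:
                continue  # skip non-text columns, like the type filter above
            sql = (f"SELECT EXISTS (SELECT 1 FROM {quote_ident(table)} "
                   f"WHERE {quote_ident(column)} LIKE ?)")
            # EXISTS stops at the first matching row, similar to TOP 1.
            if conn.execute(sql, (f"%{needle}%",)).fetchone()[0]:
                hits.append((table, column))
    return sorted(hits)


conn = sqlite3.connect(":memory:")
# Hypothetical CMS-like tables, made up for the demonstration.
conn.executescript("""
CREATE TABLE pages (id INTEGER, body TEXT);
INSERT INTO pages VALUES (1, 'Contact sales@example.co.uk for details');
CREATE TABLE staff (id INTEGER, email VARCHAR(100));
INSERT INTO staff VALUES (1, 'bob@old-brand.co.uk');
CREATE TABLE users (id INTEGER, email VARCHAR(100), age INTEGER);
INSERT INTO users VALUES (1, 'a@example.com', 30);
""")

hits = find_text_hits(conn, ".co.uk")
for table, column in hits:
    print(f"hit in {table}.{column}")
```

Table and column names still have to be concatenated into the statement, because identifiers cannot be bound as parameters, so they are quoted first. Also keep in mind that % and _ inside the search term act as LIKE wildcards.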

SQL how do you query for tables that refer to a specific foreign key value?

I have table A with a primary key on column ID and tables B,C,D... that have 1 or more columns with foreign key relationships to A.ID.
How do I write a query that shows me all tables that contain a specific value (eg 17) of the primary key?
I would like to have generic sql code that can take a table name and primary key value and display all tables that reference that specific value via a foreign key.
The result should be a list of table names.
I am using MS SQL 2012.
You want to look at sys.foreign_keys. I would start from http://blog.sqlauthority.com/2009/02/26/sql-server-2008-find-relationship-of-foreign-key-and-primary-key-using-t-sql-find-tables-with-foreign-key-constraint-in-database/
to give something like
declare @value nvarchar(20) = '1'
SELECT
'select * from '
+ QUOTENAME( SCHEMA_NAME(f.SCHEMA_ID))
+ '.'
+ quotename( OBJECT_NAME(f.parent_object_id) )
+ ' where '
+ COL_NAME(fc.parent_object_id,fc.parent_column_id)
+ ' = '
+ @value
FROM sys.foreign_keys AS f
INNER JOIN sys.foreign_key_columns AS fc ON f.OBJECT_ID = fc.constraint_object_id
INNER JOIN sys.objects AS o ON o.OBJECT_ID = fc.referenced_object_id
Not an ideal one, but should return what is needed (list of tables):
declare @tableName sysname, @value sql_variant
set @tableName = 'A'
set @value = 17
declare @sql nvarchar(max)
create table #Value (Value sql_variant)
insert into #Value values (@value)
create table #Tables (Name sysname, [Column] sysname)
create index IX_Tables_Name on #Tables (Name)
set @sql = 'declare @value sql_variant
select @value = Value from #Value
'
set @sql = @sql + replace((
select
'insert into #Tables (Name, [Column])
select ''' + quotename(S.name) + '.' + quotename(T.name) + ''', ''' + quotename(FC.name) + '''
where exists (select 1 from ' + quotename(S.name) + '.' + quotename(T.name) + ' where ' + quotename(FC.name) + ' = @value)
'
from
sys.columns C
join sys.foreign_key_columns FKC on FKC.referenced_column_id = C.column_id and FKC.referenced_object_id = C.object_id
join sys.columns FC on FC.object_id = FKC.parent_object_id and FC.column_id = FKC.parent_column_id
join sys.tables T on T.object_id = FKC.parent_object_id
join sys.schemas S on S.schema_id = T.schema_id
where
C.object_id = object_id(@tableName)
and C.name = 'ID'
order by S.name, T.name
for xml path('')), '&#x0D;', CHAR(13))
--print @sql
exec(@sql)
select distinct Name
from #Tables
order by Name
drop table #Value
drop table #Tables
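The same idea (walk the foreign-key metadata, then probe each referencing column for the value) can be sketched in a self-contained way. The snippet below is Python against an in-memory SQLite database, offered as an illustration only: it reads the metadata from SQLite's PRAGMA foreign_key_list rather than sys.foreign_key_columns, the function name tables_referencing and the sample tables are invented, and the value is bound as a parameter instead of going through a temp table.

```python
import sqlite3


def quote_ident(name):
    """Quote an identifier for SQLite by doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def tables_referencing(conn, parent_table, value):
    """Return sorted table names holding `value` in a foreign-key column
    that points at `parent_table`."""
    found = set()
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()]
    for child in tables:
        # PRAGMA foreign_key_list rows:
        # (id, seq, table, from, to, on_update, on_delete, match)
        fks = conn.execute(
            f"PRAGMA foreign_key_list({quote_ident(child)})").fetchall()
        for fk in fks:
            referenced_table, child_column = fk[2], fk[3]
            if referenced_table.lower() != parent_table.lower():
                continue
            sql = (f"SELECT EXISTS (SELECT 1 FROM {quote_ident(child)} "
                   f"WHERE {quote_ident(child_column)} = ?)")
            if conn.execute(sql, (value,)).fetchone()[0]:
                found.add(child)
    return sorted(found)


conn = sqlite3.connect(":memory:")
# Table A with primary key ID, and children B, C, D as in the question.
conn.executescript("""
CREATE TABLE A (ID INTEGER PRIMARY KEY);
CREATE TABLE B (id INTEGER, a_id INTEGER REFERENCES A(ID));
CREATE TABLE C (id INTEGER, a_id INTEGER REFERENCES A(ID),
                other_a_id INTEGER REFERENCES A(ID));
CREATE TABLE D (id INTEGER, a_id INTEGER REFERENCES A(ID));
INSERT INTO A VALUES (17), (18);
INSERT INTO B VALUES (1, 17);
INSERT INTO C VALUES (1, 18, 17);
INSERT INTO D VALUES (1, 18);
""")

referencing = tables_referencing(conn, "A", 17)
print(referencing)
```

Collecting the names in a set plays the role of the select distinct above: table C has two foreign-key columns to A but is listed once.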
You could achieve that by writing some SQL. I am posting an example, but it is just a mockup showing the way you could do it.
CREATE TABLE tempTable
(
TABLE_NAME varchar(255)
);
CREATE UNIQUE CLUSTERED INDEX Idx_tempTable ON tempTable(TABLE_NAME);
DECLARE @var2 nvarchar(max)
INSERT INTO tempTable
SELECT DISTINCT
TABLE_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE COLUMN_NAME LIKE '%COLUMN_NAME%'
/*FOREACH result of the tempTable you could find if the COLUMN_NAME of the result(table) has the value you want*/
SET @var2 = 'SELECT TABLE_NAME FROM ' + tempTableResult + ' WHERE COLUMN_NAME=VALUE'
exec(@var2)
DROP TABLE tempTable
The query returns a list of table names, each followed by the date used in the lookup when the table has a "dt" column, or by "(no date)" otherwise.
Also, apologies up front for the use of a cursor. I tend to use them only for special cases such as this one (i.e. finding the few odd records that may exist across 100's of tables).
In my case, a table references just under 400 tables (all of which are generated automatically as part of a "learning" system), and depending on the type of entry saved, data may or may not written into these tables. A further twist is some of these data are also by-date, so the query must also check for the existence of a date column in each table with the foreign key (fortunately, in these instances the column will always be named "dt").
Of the nearly 400 tables listed as referencing the "asset" table, only a dozen actually held data for the particular entry I was investigating. All of those tables held the data as daily instances/detail.
The referenced table's name is "asset" and the Dynamic SQL includes a sub query (convert a human readable name to a primary key, used as a FK value).
The cursor query is from Gishu at How can I list all foreign keys referencing a given table in SQL Server?
DECLARE @TableName varchar(255)
DECLARE @FKeyColumn varchar(255)
DECLARE @rowcount int
DECLARE @sqlCMD NVARCHAR(500)
DECLARE @dt NVARCHAR(10) = '2008-08-25'
DECLARE @SymbolName NVARCHAR(9) = 'thingImLookingFor'
DECLARE @byDate varchar(255)
DECLARE TableCursor
CURSOR FOR select
t.name as TableWithForeignKey,
c.name as ForeignKeyColumn
from sys.foreign_key_columns as fk
inner join sys.tables as t on fk.parent_object_id = t.object_id
inner join sys.columns as c on fk.parent_object_id = c.object_id and fk.parent_column_id = c.column_id
where
fk.referenced_object_id = (select object_id from sys.tables where name = 'asset')
OPEN TableCursor
FETCH NEXT FROM TableCursor INTO @TableName, @FKeyColumn
WHILE @@FETCH_STATUS = 0
BEGIN
SET @sqlCMD = 'SELECT @rowcount=count(*) FROM ' + @TableName + ' WHERE ' + @FKeyColumn + '=(SELECT asset_id FROM asset WHERE primary_symbol=''' + @SymbolName + ''')'
SET @byDate = ' (no date)'
IF EXISTS(SELECT 1 FROM sys.columns
WHERE sys.columns.name = N'dt'
AND sys.columns.object_id = Object_ID(@TableName))
BEGIN
SET @sqlCMD = @sqlCMD + ' AND dt=''' + @dt + ''''
SET @byDate = ' (' + @dt + ')'
END
EXEC sp_executesql @sqlCMD, N'@rowcount int output', @rowcount output
IF(@rowcount=1) PRINT(@TableName + @byDate)
FETCH NEXT FROM TableCursor INTO @TableName, @FKeyColumn
END
CLOSE TableCursor;
DEALLOCATE TableCursor;

SQL: Query many tables with same column name but different structure for specific value

I'm working on cleaning up an ERP and I need to get rid of references to unused users and user groups. There are many foreign key constraints, and therefore I want to be sure to really get rid of all traces!
I found this tidy tidbit of code to find all tables in my db with a certain column name, in this case let's look at the user groups:
select table_name from information_schema.columns
where column_name = 'GROUP_ID'
With the results I can search through the 40+ tables for my unused ID... but this is tedious. So I'd like to automate this and create a query that loops through all these tables and deletes the rows where it finds Unused_Group in the GROUP_ID column.
Before deleting anything I'd like to visualize the existing data, so I started to build something like this using string concatenation:
declare @group varchar(50) = 'Unused_Group'
declare @table1 varchar(50) = 'TABLE1'
declare @table2 varchar(50) = 'TABLE2'
declare @tableX varchar(50) = 'TABLEX'
select @query1 = 'SELECT ''' + rtrim(@table1) + ''' as ''Table'', '''
+ rtrim(@group) + ''' = CASE WHEN EXISTS (SELECT GROUP_ID FROM ' + rtrim(@table1)
+ ' WHERE GROUP_ID = ''' + rtrim(@group) + ''') then ''MATCH'' else ''-'' end FROM '
+ rtrim(@table1)
select @query2 = [REPEAT FOR @table2 to @tableX]...
EXEC(@query1 + ' UNION ' + @query2 + ' UNION ' + @queryX)
This gives me the results:
TABLE1 | Match
TABLE2 | -
TABLEX | Match
This works for my purposes and I can run it for any user group without changing any other code, and is of course easily adaptable to DELETE from these same tables, but is unmanageable for the 75 or so tables that I have to deal with between users and groups.
I ran into this link on dynamic SQL which was intense and dense enough to scare me away for the moment... but I think the solution might be in there somewhere.
I'm very familiar with FOR() loops in JS and other languages, where this would be a piece of cake with a well structured array, but apparently it's not so simple in SQL (I'm still learning, but found a lot of negative talk about the FOR and GOTO solutions available...). Ideally I'd have a script that queries to find tables with a certain column name, queries each table as above, and spits out a list of matches, and then execute a second similar script to delete the rows.
Can anyone help point me in the right direction?
Ok, try this, there are three variables; column, colValue and preview. Column should be the column you're checking equality on (Group_ID), colValue the value you're looking for (Unused_Group) and preview should be 1 to view what you'll delete and 0 to delete it.
Declare @column Nvarchar(256),
@colValue Nvarchar(256),
@preview Bit
Set @column = 'Group_ID'
Set @colValue = 'Unused_Group'
Set @preview = 1 -- 1 = preview; 0 = delete
If Object_ID('tempdb..#tables') Is Not Null Drop Table #tables
Create Table #tables (tID Int, SchemaName Nvarchar(256), TableName Nvarchar(256))
-- Get all the tables with a column named [GROUP_ID]
Insert #tables
Select Row_Number() Over (Order By s.name, so.name), s.name, so.name
From sysobjects so
Join sys.schemas s
On so.uid = s.schema_id
Join syscolumns sc
On so.id = sc.id
Where so.xtype = 'u'
And sc.name = @column
Select *
From #tables
Declare @SQL Nvarchar(Max),
@schema Nvarchar(256),
@table Nvarchar(256),
@iter Int = 1
-- As long as there are tables to look at keep looping
While Exists (Select 1
From #tables)
Begin
-- Get the next table record to look at
Select @schema = SchemaName,
@table = TableName
From #tables
Where tID = @iter
-- If the table we're going to look at has dependencies on tables we have not
-- yet looked at move it to the end of the line and look at it after we look
-- at its dependent tables (Handle foreign keys)
If Exists (Select 1
From sysobjects o
Join sys.schemas s1
On o.uid = s1.schema_id
Join sysforeignkeys fk
On o.id = fk.rkeyid
Join sysobjects o2
On fk.fkeyid = o2.id
Join sys.schemas s2
On o2.uid = s2.schema_id
Join #tables t
On o2.name = t.TableName Collate Database_Default
And s2.name = t.SchemaName Collate Database_Default
Where o.name = @table
And s1.name = @schema)
Begin
-- Move the table to the end of the list to retry later
Update t
Set tID = (Select Max(tID) From #tables) + 1
From #tables t
Where tableName = @table
And schemaName = @schema
-- Move on to the next table to look at
Set @iter = @iter + 1
End
Else
Begin
-- Delete the records we don't want anymore
Set @Sql = Case
When @preview = 1
Then 'Select * ' -- If preview is 1 select from table
Else 'Delete t ' -- If preview is not 1 then delete from table
End +
'From [' + @schema + '].[' + @table + '] t
Where ' + @column + ' = ''' + @colValue + ''''
Exec sp_executeSQL @SQL;
-- After we've done the work remove the table from our list
Delete t
From #tables t
Where tableName = @table
And schemaName = @schema
-- Move on to the next table to look at
Set @iter = @iter + 1
End
End
Turning this into a stored procedure would simply involve changing the variables declaration at the top to a sproc creation so you would get rid of...
Declare @column Nvarchar(256),
@colValue Nvarchar(256),
@preview Bit
Set @column = 'Group_ID'
Set @colValue = 'Unused_Group'
Set @preview = 1 -- 1 = preview; 0 = delete
...
And replace it with...
Create Proc DeleteStuffFromManyTables (@column Nvarchar(256), @colValue Nvarchar(256), @preview Bit = 1)
As
...
And you'd call it with...
Exec DeleteStuffFromManyTables 'Group_ID', 'Unused_Group', 1
I commented the hell out of the code to help you understand what it's doing; good luck!
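To show just the preview-then-delete switch in isolation, here is a small sketch in Python against an in-memory SQLite database. It rests on several assumptions: it leaves out the foreign-key ordering that the script above handles, the function name purge and the sample tables are invented for the illustration, and the value is bound as a parameter instead of being concatenated into the statement.

```python
import sqlite3


def quote_ident(name):
    """Quote an identifier for SQLite by doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def purge(conn, column, value, preview=True):
    """Count (preview) or delete rows where column = value, for every table
    that has the column. Returns {table: number_of_rows}."""
    affected = {}
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()]
    for table in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        names = [row[1] for row in conn.execute(
            f"PRAGMA table_info({quote_ident(table)})").fetchall()]
        if column not in names:
            continue
        where = f"FROM {quote_ident(table)} WHERE {quote_ident(column)} = ?"
        if preview:
            count = conn.execute(f"SELECT COUNT(*) {where}", (value,)).fetchone()[0]
        else:
            count = conn.execute(f"DELETE {where}", (value,)).rowcount
        affected[table] = count
    return affected


conn = sqlite3.connect(":memory:")
# Hypothetical ERP-like tables, made up for the demonstration.
conn.executescript("""
CREATE TABLE members (user_id TEXT, GROUP_ID TEXT);
INSERT INTO members VALUES ('u1','Unused_Group'), ('u2','Admins'), ('u3','Unused_Group');
CREATE TABLE permissions (perm TEXT, GROUP_ID TEXT);
INSERT INTO permissions VALUES ('read','Admins');
CREATE TABLE audit (msg TEXT);
""")

previewed = purge(conn, "GROUP_ID", "Unused_Group", preview=True)
rows_after_preview = conn.execute("SELECT COUNT(*) FROM members").fetchone()[0]
deleted = purge(conn, "GROUP_ID", "Unused_Group", preview=False)
rows_after_delete = conn.execute("SELECT COUNT(*) FROM members").fetchone()[0]
print(previewed, deleted, rows_after_delete)
```

Defaulting preview to True mirrors the default of 1 in the procedure signature above, so a careless call only reports what would be removed.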
You're on the right track with INFORMATION_SCHEMA objects. Execute the below in a query editor, it produces SELECT and DELETE statements for tables that contain GROUP_ID column with 'Unused_Group' value.
-- build select DML to manually review data that will be deleted
SELECT 'SELECT * FROM [' + TABLE_SCHEMA + '].[' + TABLE_NAME + '] WHERE [GROUP_ID] = ''Unused_Group'';'
FROM INFORMATION_SCHEMA.COLUMNS
WHERE COLUMN_NAME = 'GROUP_ID';
-- build delete DML to remove data
SELECT 'DELETE FROM [' + TABLE_SCHEMA + '].[' + TABLE_NAME + '] WHERE [GROUP_ID] = ''Unused_Group'';'
FROM INFORMATION_SCHEMA.COLUMNS
WHERE COLUMN_NAME = 'GROUP_ID';
Since this seems to be a one-time cleanup effort, and especially since you need to review data before it is deleted, I don't see the value in making this more complicated.
Consider adding referential integrity and enforcing cascade delete, if you can. It won't help with visualizing the data before you delete it, but will help controlling orphaned rows.