Looping through column names with dynamic SQL - sql

I just came up with an idea for a piece of code to show all the distinct values for each column, and count how many records for each. I want the code to loop through all columns.
Here's what I have so far... I'm new to SQL so bear with the noobness :)
Hard code:
select [Sales Manager], count(*)
from [BT].[dbo].[test]
group by [Sales Manager]
order by 2 desc
Attempt at dynamic SQL:
Declare @sql varchar(max),
@column as varchar(255)
set @column = '[Sales Manager]'
set @sql = 'select ' + @column + ', count(*) from [BT].[dbo].[test] group by ' + @column + ' order by 2 desc'
exec (@sql)
Both of these work fine. How can I make it loop through all columns? I don't mind if I have to hard-code the column names and have it work its way through, substituting each one for @column.
Does this make sense?
Thanks all!

You can use dynamic SQL and get all the column names for a table. Then build up the script:
Declare @sql varchar(max) = ''
declare @tablename as varchar(255) = 'test'
select @sql = @sql + 'select [' + c.name + '],count(*) as ''' + c.name + ''' from [' + t.name + '] group by [' + c.name + '] order by 2 desc; '
from sys.columns c
inner join sys.tables t on c.object_id = t.object_id
where t.name = @tablename
EXEC (@sql)
Change @tablename to the name of your table (without the database or schema name).

This is a bit of an XY answer, but if you don't mind hardcoding the column names, I suggest you do just that, and avoid dynamic SQL - and the loop - entirely. Dynamic SQL is generally considered the last resort, opens you up to security issues (SQL injection attacks) if not careful, and can often be slower if queries and execution plans cannot be cached.
If you have a ton of column names you can write a quick piece of code or mail merge in Word to do the substitution for you.
However, as far as how to get column names, assuming this is SQL Server, you can use the following query:
SELECT c.name
FROM sys.columns c
WHERE c.object_id = OBJECT_ID('dbo.test')
Therefore, you can build your dynamic SQL from this query:
SELECT 'select '
+ QUOTENAME(c.name)
+ ',count(*) from [BT].[dbo].[test] group by '
+ QUOTENAME(c.name)
+ ' order by 2 desc'
FROM sys.columns c
WHERE c.object_id = OBJECT_ID('dbo.test')
and loop using a cursor.
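The cursor approach mentioned above could look roughly like the sketch below. It assumes the table is `dbo.test` in the current database and that every column has a type that can be grouped (GROUP BY fails on types such as text, ntext, image and xml). The cursor name `column_cursor` and the alias `record_count` are made up for illustration.

```sql
-- Sketch: run one GROUP BY query per column of dbo.test.
DECLARE @column SYSNAME,
        @sql    NVARCHAR(MAX);

-- Read the column names once, in table order.
DECLARE column_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT c.name
    FROM sys.columns c
    WHERE c.object_id = OBJECT_ID('dbo.test')
    ORDER BY c.column_id;

OPEN column_cursor;
FETCH NEXT FROM column_cursor INTO @column;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- QUOTENAME brackets the identifier so odd column names stay valid.
    SET @sql = N'SELECT ' + QUOTENAME(@column) + N', COUNT(*) AS record_count'
             + N' FROM dbo.test GROUP BY ' + QUOTENAME(@column)
             + N' ORDER BY 2 DESC;';
    EXEC (@sql);

    FETCH NEXT FROM column_cursor INTO @column;
END;

CLOSE column_cursor;
DEALLOCATE column_cursor;
```

Each iteration returns its own result set, so a table with 20 columns produces 20 result grids.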
Or compile the whole thing together into one batch and execute. Here we use the FOR XML PATH('') trick:
DECLARE @sql VARCHAR(MAX) = (
SELECT ' select ' --note the extra space at the beginning
+ QUOTENAME(c.name)
+ ',count(*) from [BT].[dbo].[test] group by '
+ QUOTENAME(c.name)
+ ' order by 2 desc;'
FROM sys.columns c
WHERE c.object_id = OBJECT_ID('dbo.test')
FOR XML PATH('')
)
EXEC(@sql)
Note I am using the built-in QUOTENAME function to escape column names that need escaping.
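As a small illustration of what QUOTENAME does with awkward names (the second name is invented to show how an embedded closing bracket is handled):

```sql
-- QUOTENAME wraps the name in square brackets and doubles any closing bracket inside it.
SELECT QUOTENAME('Sales Manager') AS plain_name,       -- [Sales Manager]
       QUOTENAME('odd]name')      AS embedded_bracket; -- [odd]]name]
```

This is why it is safer than concatenating '[' and ']' by hand when the column names are not under your control.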

Do you want to know the distinct column values in all the columns of the table? Just replace the table name Employee with your table name in the following code:
declare @SQL nvarchar(max)
set @SQL = ''
;with cols as (
select Table_Schema, Table_Name, Column_Name, Row_Number() over(partition by Table_Schema, Table_Name
order by ORDINAL_POSITION) as RowNum
from INFORMATION_SCHEMA.COLUMNS
)
select @SQL = @SQL + case when RowNum = 1 then '' else ' union all ' end
+ ' select ''' + Column_Name + ''' as Column_Name, count(distinct ' + quotename (Column_Name) + ' ) As DistinctCountValue,
count( '+ quotename (Column_Name) + ') as CountValue FROM ' + quotename (Table_Schema) + '.' + quotename (Table_Name)
from cols
where Table_Name = 'Employee' --print @SQL
execute (@SQL)

Related

Counting the number of rows for all columns in views within SQL database

I am hoping someone could help or provide insight. I am working on automating the way we audit our database views. The goal is to quickly see which columns are empty in which views. I started to work on a script but ran into different issues. All of the examples of this script run, but produce results that I did not intend. Right now some column headers have titles with a space in between.
Since I am fairly new to this, I am unsure what I am doing wrong or even how to obtain the results I am looking for.
Thank you in advance for your help!
Example 1 - Runs but produces NULLs.
Example 2 - Runs but produces 0s.
Example 3 - Runs and produces counts for the entire view rather than for the column itself (i.e. it gave the max amount of rows in the view).
------------------------------------
-- EXAMPLE 1:
-- Did not produce the results expected.
-- Issue is it returns all as NULL
---------------------------------------\
DROP TABLE IF EXISTS #tempColumnCount;
CREATE TABLE #tempColumnCount
(
Name VARCHAR(100),
Row_Count INT
);
Declare @SQL VARCHAR(MAX)
SET @SQL = ''
SELECT @SQL = @SQL + 'INSERT INTO #tempColumnCount SELECT ''' + c.name + ''' as Name, SUM (CASE WHEN ''' + c.name +
''' IS NULL THEN 1 END) FROM ' + schema_name(v.schema_id) + '.' + OBJECT_NAME(c.object_id) +
CHAR(13)
FROM sys.columns c
JOIN sys.views v
ON v.object_id = c.object_id AND SCHEMA_NAME(v.schema_id)='analytics'
exec (@SQL)
SELECT Name, Row_Count
FROM #tempColumnCount
---------------------------------------\
-- Example 2: Issue is it returns all as 0
---------------------------------------\
DROP TABLE IF EXISTS #tempColumnCount;
CREATE TABLE #tempColumnCount
(
Name VARCHAR(100),
Row_Count INT
);
Declare @SQL VARCHAR(MAX)
SET @SQL = ''
SELECT @SQL = @SQL + 'INSERT INTO #tempColumnCount SELECT ''' + c.name + ''' as Name, COUNT(1) - COUNT(''' + c.name +
''' ) FROM ' + schema_name(v.schema_id) + '.' + OBJECT_NAME(c.object_id) +
CHAR(13)
FROM sys.columns c
JOIN sys.views v
ON v.object_id = c.object_id AND SCHEMA_NAME(v.schema_id)='analytics'
exec (@SQL)
SELECT Name, Row_Count
FROM #tempColumnCount
---------------------------------------\
-- Example 3: Issue is it returns the max amount of rows for each column, which just means it returns the row count for the whole view and not for the column itself.
---------------------------------------\
DROP TABLE IF EXISTS #tempColumnCount;
CREATE TABLE #tempColumnCount
(
Name VARCHAR(100),
Row_Count INT
);
Declare @SQL VARCHAR(MAX)
SET @SQL = ''
SELECT @SQL = @SQL + 'INSERT INTO #tempColumnCount SELECT ''' + c.name + ''' as Name, COUNT(''' + c.name +
''' ) FROM ' + schema_name(v.schema_id) + '.' + OBJECT_NAME(c.object_id) +
CHAR(13)
FROM sys.columns c
JOIN sys.views v
ON v.object_id = c.object_id AND SCHEMA_NAME(v.schema_id)='analytics'
exec (@SQL)
SELECT Name, Row_Count
FROM #tempColumnCount
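One likely reason for all three results: each script wraps the column name in single quotes, so the generated statement tests a string literal rather than the column. A literal is never NULL, so the SUM in Example 1 has nothing to add up and returns NULL, COUNT(1) - COUNT('literal') in Example 2 is always 0, and COUNT('literal') in Example 3 is the row count of the view. A minimal sketch of a corrected version follows, assuming the same `analytics` schema; the temp table layout (`ViewName`, `ColumnName`, `NullCount`) is changed from the question for illustration, and QUOTENAME brackets names that contain spaces.

```sql
DROP TABLE IF EXISTS #tempColumnCount;
CREATE TABLE #tempColumnCount
(
    ViewName   SYSNAME,
    ColumnName SYSNAME,
    NullCount  INT
);

DECLARE @SQL NVARCHAR(MAX) = N'';

-- Build one INSERT per view column. The names are emitted twice:
-- as quoted string literals for the labels, and as bracketed
-- identifiers inside COUNT() so the column itself is counted.
SELECT @SQL = @SQL
    + N'INSERT INTO #tempColumnCount SELECT '
    + QUOTENAME(v.name, '''') + N', '
    + QUOTENAME(c.name, '''') + N', '
    + N'COUNT(*) - COUNT(' + QUOTENAME(c.name) + N') FROM '
    + QUOTENAME(SCHEMA_NAME(v.schema_id)) + N'.' + QUOTENAME(v.name)
    + N';' + CHAR(13)
FROM sys.columns c
JOIN sys.views v ON v.object_id = c.object_id
WHERE SCHEMA_NAME(v.schema_id) = 'analytics';

EXEC (@SQL);

SELECT ViewName, ColumnName, NullCount
FROM #tempColumnCount
ORDER BY ViewName, ColumnName;
```

A column is completely empty when its NullCount equals the row count of the view. COUNT does not accept text, ntext or image columns, so views containing those types would need them filtered out first.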

How to dynamically calculate the sums of many columns in a GROUP?

In the table below, I have a variable number of columns, and that number is in the 1000s. I need to sum all the values of each of the 1000 columns grouped by the person's name. So, smith's total test_score_1, total test_score_2,...total test_score_1000. And then Jackson's total test_score_1, total test_score_2,...total test_score_1000.
I don't know the number of 'test_score_n' columns beforehand and they are always changing.
So given this table:
name      test_score_1   test_score_2   ...   test_score_1000
smith     2              1                    0
jackson   0              3                    1
jackson   1              1                    2
jackson   3              0                    3
smith     4              5                    1
How can I produce the table below?
name      test_score_1   test_score_2   ...   test_score_1000
smith     6              6                    1
jackson   4              4                    6
SQL to generate the SQL
DECLARE @generatedSQL nvarchar(max);
SET @generatedSQL = (
SELECT
'SELECT name' + -- keep the grouping column in the output
X.foo + -- every item starts with ', ', so no trimming is needed (SUBSTRING(..., 2, 2000) would cut off long column lists)
' FROM ' +
QUOTENAME(SCHEMA_NAME(t.schema_id)) + '.' + QUOTENAME(t.name) +
' GROUP BY name'
FROM
sys.tables t
CROSS APPLY
(
SELECT
', SUM(' + QUOTENAME(c.name) + ') AS ' + QUOTENAME(c.name)
FROM
sys.columns c
WHERE
c.object_id = t.object_id
AND
c.name <> 'Name'
FOR XML PATH('')
) X (foo)
WHERE
t.name = 'MyTable'
);
EXEC (@generatedSQL);
Demo: http://rextester.com/MAFCP19297
SQL
DECLARE @cols varchar(max), @sql varchar(max);
SELECT @cols =
COALESCE(@cols + ', ', '') + 'SUM(' + COLUMN_NAME + ') AS ' + COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE table_name = '<tbl name>'
AND COLUMN_NAME <> 'name'
-- The AND below may be optional - see "Additional Notes #1"
AND TABLE_CATALOG = '<database schema name>';
SET @sql = 'SELECT name, ' + @cols + ' FROM tbl GROUP BY name;';
EXEC (@sql);
Explanation
The DECLARE creates two variables - one for storing the column summing part of the SQL and the other for storing the whole dynamically created SQL statement to run.
The SELECT queries the INFORMATION_SCHEMA.COLUMNS system table to get the names of all the columns in tbl apart from the name column. (Alternatively the sys tables could be used - answers to this question discuss the relative merits of each). These row values are then converted into a single comma separated value using this method (which is arguably a little simpler than the alternative FOR XML PATH ('') method). The comma-separated values are a bit more than just the column names - they SUM over each column name and then assign the result with an alias of the same name.
The SET then builds a simple SQL statement that selects the name and all the summed values - e.g: SELECT name, SUM(test_score_1) AS test_score_1, SUM(test_score_2) AS test_score_2, SUM(test_score_1000) AS test_score_1000 FROM tbl GROUP BY name;.
The EXEC then runs the above query.
Additional Notes
If there is a possibility that the table name may not be unique across all databases then the following clause is needed in the select: AND TABLE_CATALOG = '<database schema name>'
My initial answer to this question was mistakenly using MySQL rather than SQL Server - this has now been corrected but the previous version is still in the edit history and might be helpful to someone...
Try this dynamic column generation Sql script
DECLARE @Sql nvarchar(max)
SET @Sql=( SELECT DISTINCT 'SELECT'+
STUFF((SELECT ', '+ ' SUM( '+ COLUMN_NAME +' ) AS '+ QUOTENAME( COLUMN_NAME )
FROM INFORMATION_SCHEMA.COLUMNS Where TABLE_NAME ='Tab1000'
FOR XML PATH (''),type).value('.','varchar(max)'),1,2,'')
+' From Tab1000'From INFORMATION_SCHEMA.COLUMNS Where TABLE_NAME ='Tab1000')
EXEC (@Sql)
Try the below script
(set @tableName to your table name and @nameColumn to the name of the field you want to group by)
Declare @tableName varchar(50)='totalscores'
Declare @nameColumn nvarchar(50)='name'
Declare @query as nvarchar(MAX) ;
select @query = 'select ' + nameColumn + cast(sumColumns as nvarchar(max)) + ' from ' + @tableName +' group by ' + nameColumn from (
select @nameColumn nameColumn, (SELECT
', SUM(' + QUOTENAME(c.name) + ') ' + QUOTENAME(c.name)
FROM
sys.columns c
WHERE
c.object_id=t.object_id and c.name != @nameColumn
order by c.name
FOR
XML path(''), type
) sumColumns
from sys.tables t where t.name= @tableName
)t
EXECUTE(@query)
Change tablename with your tablename.
Declare @query as nvarchar(MAX) = (SELECT
'SELECT name,' + STUFF(tbl.col, 1, 1, '') + ' FROM ' + QUOTENAME(SCHEMA_NAME(t.schema_id)) + '.' + QUOTENAME(t.name) + ' Group By name'
FROM
sys.tables t
CROSS APPLY
(
SELECT
', SUM(' + QUOTENAME(columns.name) + ') as ' + QUOTENAME(columns.name)
FROM
sys.columns columns
WHERE
columns.object_id = t.object_id and columns.name != 'name'
FOR XML PATH('')
) tbl (col)
WHERE
t.name = 'tablename')
select @query -- show the generated statement
EXECUTE(@query)
GBN's dynamic SQL would be my first choice (+1), and would be more performant. However, if you are interested in breaking this horrible cycle of 1,000+ columns, consider the following:
Example
Declare @YourTable Table ([col 1] int,[col 2] int,[col 1000] varchar(50))
Insert Into @YourTable Values
(2,1,0)
,(4,5,1)
Select Item = replace(C.Item,'_x0020_', ' ')
,Value = sum(C.Value)
From @YourTable A
Cross Apply (Select XMLData= cast((Select A.* for XML RAW) as xml)) B
Cross Apply (
Select Item = a.value('local-name(.)','varchar(100)')
,Value = a.value('.','int')
From B.XMLData.nodes('/row') as C1(n)
Cross Apply C1.n.nodes('./@*') as C2(a)
Where a.value('local-name(.)','varchar(100)') not in ('Fields','ToExclude')
) C
Group By C.Item
Returns
Item Value
col 1 6
col 2 6
col 1000 1

List of all tables that contains specific data in column

With this query, I get all tables that contain a column named "Status_ID":
SELECT
c.name AS 'ColumnName'
,t.name AS 'TableName'
FROM sys.columns c
JOIN sys.tables t ON c.object_id = t.object_id
WHERE c.name LIKE 'Status_ID'
Data in Status_ID may only have values from 1 to 6.
What I want is to get a list of all tables, where Status_ID = 2 at least once.
(Exclude all tables from the code above, that do not contain data with Status_ID = 2)
Solution 1:
Run the query below, which generates one SELECT statement for each table that contains a Status_ID column. Then copy and execute the generated statements to find the data.
SELECT 'select * from ' + TABLE_NAME + ' where Status_ID = ''2'''
FROM INFORMATION_SCHEMA.COLUMNS
WHERE COLUMN_NAME = 'Status_ID'
Solution 2:
You may need to use the following solution to find the data in all tables
Find a string by searching all tables in SQL Server Management Studio 2008
This should do the trick:
DECLARE @sql NVARCHAR(MAX) = 'DECLARE @tables NVARCHAR(MAX) = '''';';
-- For each table with a STATUS_ID column, append its name to @tables if any row has STATUS_ID = 2
SELECT @sql += 'IF EXISTS (SELECT ''X'' FROM ' + QUOTENAME(t.name)
+ ' WHERE STATUS_ID = 2) SET @tables += '',' + QUOTENAME(t.name) + ''';'
FROM sys.columns c
JOIN sys.tables t ON c.object_id = t.object_id
WHERE c.name LIKE 'STATUS_ID'
-- Strip the leading comma and return the list
SET @sql += 'SELECT SUBSTRING(@tables, 2, LEN(@tables));'
EXEC(@sql);

A query to get the MAX LEN of every field in a SQL table

I've been trying to come up with a query that will list the name of every column in a table with its max length.
This is in SQL Server 2012.
Any help is greatly appreciated.
You can use the query below to get the actual (defined) and used length of the columns in a table:
DECLARE @TSQL VARCHAR(MAX) = ''
DECLARE @TableName sysname = 't1'
SELECT @TSQL = @TSQL + 'SELECT ' + QUOTENAME(sc.name, '''') + ' AS ColumnName, ' + QUOTENAME(t.name, '''') + ' AS DataType, ' +
QUOTENAME(sc.max_length, '''') + ' AS ActualLength, MAX(DATALENGTH(' + QUOTENAME(sc.name) + ')) AS UsedLength FROM '+@TableName+ char(10) +' UNION '
FROM sys.columns sc
JOIN sys.types t on t.system_type_id = sc.system_type_id and t.name != 'sysname'
WHERE sc.OBJECT_ID = OBJECT_ID(@TableName)
SET @TSQL = LEFT(@TSQL, LEN(@TSQL)-6)
EXEC(@TSQL)
If you want to know your table's details, use information_schema.columns:
select *
from information_schema.columns
where table_name = 'YourTableName'
If you want the length of a string field, use LEN:
select len(field1)
from YourTableName
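The query above returns one length per row. To get a single maximum for a column, as the question asks, the LEN call can be wrapped in MAX. This is a minimal sketch for one column, reusing the placeholder names `field1` and `YourTableName` from the answer; the output aliases are made up.

```sql
-- LEN counts characters and ignores trailing spaces;
-- DATALENGTH counts bytes, so it is doubled for nvarchar data.
SELECT MAX(LEN(field1))        AS max_characters,
       MAX(DATALENGTH(field1)) AS max_bytes
FROM YourTableName;
```

Covering every column this way means repeating the expression per column or generating the statement dynamically from sys.columns.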

SQL sum across columns by name

I have a table with lots of columns, some of them with names beginning with "EQ". For an individual row, I'd like to sum all the values in the columns that start with "EQ", but not the other values. I know I can do it like this:
select EQ_DOMESTIC + EQ_INTL + EQ_OTHER from myTable where id=1
However, I have lots of columns, and I was wondering if I can do it systematically, without typing in the name of each column. Would I have to get the column names from the system tables in another query?
Follow up question: Some of the values are nulls, which makes the sum NULL. Is there any way to avoid writing out ISNULL(column,0) for the sum?
You can do this pretty easily with dynamic SQL:
DECLARE @sql NVARCHAR(MAX) = N'';
SELECT @sql += N'
+ COALESCE(' + QUOTENAME(name) + ', 0)'
FROM sys.columns
WHERE [object_id] = OBJECT_ID('dbo.MyTable')
AND name LIKE 'EQ[_]%';
SELECT @sql += N',' + QUOTENAME(name)
FROM sys.columns
WHERE [object_id] = OBJECT_ID('dbo.MyTable')
AND name NOT LIKE 'EQ[_]%';
SELECT @sql = 'SELECT [EQ_SUM] = 0' + @sql
+ ' FROM dbo.MyTable WHERE id = 1;';
PRINT @sql;
-- EXEC sp_executesql @sql;