How to dynamically calculate the sums of many columns in a GROUP? - sql

In the table below, I have a variable number of columns, and that number is in the 1000s. I need to sum all the values of each of the 1000 columns grouped by the person's name. So, smith's total test_score_1, total test_score_2,...total test_score_1000. And then Jackson's total test_score_1, total test_score_2,...total test_score_1000.
I don't know the number of 'test_score_n' columns beforehand and they are always changing.
So given this table:
name     test_score_1  test_score_2  ...  test_score_1000
smith    2             1             ...  0
jackson  0             3             ...  1
jackson  1             1             ...  2
jackson  3             0             ...  3
smith    4             5             ...  1
How can I produce the table below?
name     test_score_1  test_score_2  ...  test_score_1000
smith    6             6             ...  1
jackson  4             4             ...  6

SQL to generate the SQL
DECLARE @generatedSQL nvarchar(max);
SET @generatedSQL = (
SELECT
-- X.foo starts with ', ', so it can be appended directly after the name column
'SELECT name' +
X.foo +
' FROM ' +
QUOTENAME(SCHEMA_NAME(t.schema_id)) + '.' + QUOTENAME(t.name) +
' GROUP BY name'
FROM
sys.tables t
CROSS APPLY
(
-- one ', SUM([col]) AS [col]' fragment per column, concatenated by FOR XML PATH
SELECT
', SUM(' + QUOTENAME(c.name) + ') AS ' + QUOTENAME(c.name)
FROM
sys.columns c
WHERE
c.object_id = t.object_id
AND
c.name <> 'name'
FOR XML PATH('')
) X (foo)
WHERE
t.name = 'MyTable'
);
EXEC (@generatedSQL);

Demo: http://rextester.com/MAFCP19297
SQL
DECLARE @cols varchar(max), @sql varchar(max);
SELECT @cols =
COALESCE(@cols + ', ', '') + 'SUM(' + COLUMN_NAME + ') AS ' + COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE table_name = '<tbl name>'
AND COLUMN_NAME <> 'name'
-- The AND below may be optional - see "Additional Notes #1"
AND TABLE_CATALOG = '<database schema name>';
SET @sql = 'SELECT name, ' + @cols + ' FROM tbl GROUP BY name;';
EXEC (@sql);
Explanation
The DECLARE creates two variables - one for storing the column summing part of the SQL and the other for storing the whole dynamically created SQL statement to run.
The SELECT queries the INFORMATION_SCHEMA.COLUMNS system view to get the names of all the columns in tbl apart from the name column. (Alternatively, the sys catalog views could be used; each approach has its relative merits.) These row values are then converted into a single comma-separated value by repeatedly appending to the variable with COALESCE (which is arguably a little simpler than the alternative FOR XML PATH ('') method). The comma-separated values are a bit more than just the column names: each one applies SUM to a column and gives the result an alias of the same name.
The SET then builds a simple SQL statement that selects the name and all the summed values - e.g: SELECT name, SUM(test_score_1) AS test_score_1, SUM(test_score_2) AS test_score_2, SUM(test_score_1000) AS test_score_1000 FROM tbl GROUP BY name;.
The EXEC then runs the above query.
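As an alternative to the variable-concatenation step described above, the column list can also be built with STRING_AGG. This is only a sketch: it assumes SQL Server 2017 or later, a table called dbo.tbl, and a grouping column called name, none of which are confirmed by the question.

```sql
-- Sketch only: dbo.tbl and the column name "name" are assumptions.
DECLARE @cols nvarchar(max), @sql nvarchar(max);

-- Build 'SUM([col]) AS [col], SUM([col2]) AS [col2], ...' in column order.
-- The CAST to nvarchar(max) keeps STRING_AGG from hitting its 8000-byte limit.
SELECT @cols = STRING_AGG(
           CAST(N'SUM(' + QUOTENAME(c.name) + N') AS ' + QUOTENAME(c.name) AS nvarchar(max)),
           N', ') WITHIN GROUP (ORDER BY c.column_id)
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID(N'dbo.tbl')
  AND c.name <> N'name';

SET @sql = N'SELECT name, ' + @cols + N' FROM dbo.tbl GROUP BY name;';

PRINT @sql;                      -- inspect the statement first (PRINT truncates long strings)
EXEC sys.sp_executesql @sql;
```

QUOTENAME is used here so that column names containing spaces or reserved words still produce a valid statement.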
Additional Notes
If there is a possibility that the table name may not be unique across all databases then the following clause is needed in the select: AND TABLE_CATALOG = '<database schema name>'
My initial answer to this question was mistakenly using MySQL rather than SQL Server - this has now been corrected but the previous version is still in the edit history and might be helpful to someone...

Try this dynamic column generation Sql script
DECLARE @Sql nvarchar(max)
-- Build one ', SUM([col]) AS [col]' fragment per column, skipping the grouping column
SET @Sql = 'SELECT name' +
(SELECT ', SUM(' + QUOTENAME(COLUMN_NAME) + ') AS ' + QUOTENAME(COLUMN_NAME)
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'Tab1000' AND COLUMN_NAME <> 'name'
FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)')
+ ' FROM Tab1000 GROUP BY name'
EXEC (@Sql)

Try the below script
(set @tableName to your table name and @nameColumn to the name of the column you want to group by)
Declare @tableName varchar(50)='totalscores'
Declare @nameColumn nvarchar(50)='name'
Declare @query as nvarchar(MAX) ;
select @query = 'select ' + nameColumn + cast(sumColumns as nvarchar(max)) + ' from ' + @tableName +' group by ' + nameColumn from (
select @nameColumn nameColumn, (SELECT
', SUM(' + QUOTENAME(c.name) + ') ' + QUOTENAME(c.name)
FROM
sys.columns c
WHERE
c.object_id=t.object_id and c.name != @nameColumn
order by c.name
FOR
XML path(''), type
) sumColumns
from sys.tables t where t.name= @tableName
)t
EXECUTE(@query)

Change tablename with your tablename.
Declare @query as nvarchar(MAX) = (SELECT
-- STUFF strips the leading comma without capping the length of the column list
'SELECT name,' + STUFF(tbl.col, 1, 1, '') + ' FROM ' + QUOTENAME(SCHEMA_NAME(t.schema_id)) + '.' + QUOTENAME(t.name) + ' GROUP BY name'
FROM
sys.tables t
CROSS APPLY
(
SELECT
', SUM(' + QUOTENAME(c.name) + ') AS ' + QUOTENAME(c.name)
FROM
sys.columns c
WHERE
c.object_id = t.object_id and c.name != 'name'
FOR XML PATH('')
) tbl (col)
WHERE
t.name = 'tablename')
select @query -- show the generated statement
EXECUTE(@query)

GBN's dynamic SQL would be my first choice (+1), and would be more performant. However, if you are interested in breaking this horrible cycle of 1,000+ columns, consider the following:
Example
Declare @YourTable Table ([col 1] int,[col 2] int,[col 1000] varchar(50))
Insert Into @YourTable Values
(2,1,0)
,(4,5,1)
Select Item = replace(C.Item,'_x0020_', ' ')
,Value = sum(C.Value)
From @YourTable A
Cross Apply (Select XMLData= cast((Select A.* for XML RAW) as xml)) B
Cross Apply (
Select Item = a.value('local-name(.)','varchar(100)')
,Value = a.value('.','int')
From B.XMLData.nodes('/row') as C1(n)
Cross Apply C1.n.nodes('./@*') as C2(a)
Where a.value('local-name(.)','varchar(100)') not in ('Fields','ToExclude')
) C
Group By C.Item
Returns
Item      Value
col 1     6
col 2     6
col 1000  1

Related

SQL apply join dynamically based on column names mentioned in other table

I have table for example temp
Id Col1 Col2 Col3
1 1 2 3
and I have another table joininfo
Id SourceKey Table TargetKey
1 Col1 A ColA
2 Col2 B ColB
3 Col3 C ColC
I want to generate a query which will add inner join clause dynamically and will look like this
SELECT * FROM temp
INNER JOIN A ON Col1=ColA
INNER JOIN B ON Col2=ColB
INNER JOIN C ON Col3=ColC
Any help?
Can't do it.
SQL needs to know this stuff at query compile time, before looking at any data, so it can validate security and check for possible indexes. The only query element that comes close to treating data as if it were a column name after query compile time is the PIVOT keyword.
Otherwise, you're down to a CASE expression listing every possible set of column compares, or writing dynamic SQL over multiple steps where you first execute a query to find what columns/joins you need, use those results to build a new query string, and then execute the string you just made.
As per the comment from Charlieface, here is the answer updated to use string_agg:
declare
@dsql nvarchar(max)
-- Table is a reserved word, so the column is bracketed
select @dsql = 'select * from temp ' + string_agg(' join '+ [Table] + ' on ' + c ,' ')
from (
select [Table] , string_agg ( SourceKey + ' = ' + TargetKey,' and ') c
from joininfo
group by [Table] ) t
select @dsql
EXECUTE sp_executesql @dsql
You can build a dynamic SQL query quite neatly by using string aggregation:
Try to keep clear about which bits are static and which are dynamic. And test the generated code by using PRINT @sql
DECLARE @sql nvarchar(max) = N'
SELECT *
FROM temp s
' +
(
SELECT STRING_AGG(CAST(
N'JOIN ' + QUOTENAME(SCHEMA_NAME(t.schema_id)) + N'.' + QUOTENAME(t.name) + N' AS T' + CAST(t.object_id AS nvarchar(10)) + N'
ON s.' + QUOTENAME(j.SourceKey) + N' = T' + CAST(t.object_id AS nvarchar(10)) + N'.' + QUOTENAME(c.name)
AS nvarchar(max)), N'
') WITHIN GROUP (ORDER BY j.Id)
FROM sys.tables t
JOIN sys.columns c ON c.object_id = t.object_id
JOIN joininfo j ON OBJECT_ID(j.[Table]) = t.object_id
AND j.TargetKey = c.name
);
PRINT @sql; -- for testing
EXECUTE sp_executesql @sql;
If your version of SQL Server does not support STRING_AGG you can use FOR XML PATH('') instead.
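For older versions, the same join list could be assembled roughly as follows. This is a sketch under the same assumptions as the STRING_AGG version above (the temp and joininfo tables from the question); it swaps in FOR XML PATH('') with TYPE and .value() so that special characters are not left XML-escaped.

```sql
-- Sketch only: the same logic as the STRING_AGG version, for versions without STRING_AGG.
DECLARE @sql nvarchar(max) = N'SELECT * FROM temp AS s' +
(
    SELECT N' JOIN ' + QUOTENAME(SCHEMA_NAME(t.schema_id)) + N'.' + QUOTENAME(t.name)
         + N' AS T' + CAST(t.object_id AS nvarchar(10))
         + N' ON s.' + QUOTENAME(j.SourceKey)
         + N' = T' + CAST(t.object_id AS nvarchar(10)) + N'.' + QUOTENAME(c.name)
    FROM sys.tables AS t
    JOIN sys.columns AS c ON c.object_id = t.object_id
    JOIN joininfo AS j ON OBJECT_ID(j.[Table]) = t.object_id
                      AND j.TargetKey = c.name
    ORDER BY j.Id                 -- keep the joins in joininfo order
    FOR XML PATH(''), TYPE
).value('.', 'nvarchar(max)');

PRINT @sql; -- for testing
EXECUTE sp_executesql @sql;
```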

Build an SQL query dynamically based on columns' datatype

In my app, I have a feature where the user can enter multiple search fields, and these will be used to query the db. For example, the user enters:
smith 123456 london 12/01/2020
These fields will be passed to a stored procedure as a table-valued parameter (consisting of one column as varchar). The sp uses a view as its datasource. For example for the above custom search, there will be a view with the following columns:
number, int
firstname, varchar
lastname, varchar
dob, datetime
address, varchar
The sp needs to build the sql query dynamically and this query should look like
select * from customersview
where 'smith' in (firstname, lastname, address)
and 123456 in (number)
and 'london' in (firstname, lastname, address)
and '12/01/2020' in (dob)
So basically, what the sp does is:
Take the search filters and determine what datatype they are
Map the filters' datatype with columns' datatype, so that, for example, an int filter is mapped to all int columns, etc.
So I started off with the following:
select COLUMN_NAME, DATA_TYPE
from INFORMATION_SCHEMA.VIEWS v
join INFORMATION_SCHEMA.COLUMNS c on c.TABLE_SCHEMA = v.TABLE_SCHEMA
and c.TABLE_NAME = v.TABLE_NAME
where c.TABLE_NAME = 'customersview'
which will give me the view's columns and their datatype.
But how can I match the data types (because the filters come in a TVP) so that I can build the various conditions?
Alternatively, I can change the TableType so that it has 3 unique columns (int, varchar, datetime) and the app determines the data type and adds the value in the correct column.
I tried building the query using a WHILE loop that checks the datatype of each input, as follows.
I have added comments in the query itself for easy understanding.
TODO:
1- You need to add other datatypes to the below query.
2- You need to parameterize the query and use sp_executesql instead of EXECUTE to avoid SQL injection attacks.
--Table to Store search inputs, which will be your table type parameter.
DECLARE @v TABLE (searchString VARCHAR(100))
--Sample Inputs
INSERT INTO @v
SELECT *
FROM (
VALUES ('smith')
,('1234')
,('london')
,('12/01/2020')
) t(v)
IF OBJECT_ID('tempdb..#Temp') IS NOT NULL
DROP TABLE #Temp
--Create a temporary table to loop the search inputs
SELECT *
,0 AS IsProcessed
INTO #Temp
FROM @v
DECLARE @query NVARCHAR(max) = 'SELECT * FROM customersview WHERE 1 = 1 '
DECLARE @searchString VARCHAR(100)
--Loop through each search input
WHILE (
SELECT Count(*)
FROM #Temp
) > 0
BEGIN
SELECT TOP 1 @searchString = searchString
FROM #Temp
SELECT @searchString
--Check if input is int/bigint type
IF (ISNUMERIC(@searchString) = 1)
BEGIN
SET @query = @query + 'AND ' + @searchString + ' IN (' + Stuff((
SELECT DISTINCT ', ' + Quotename(COLUMN_NAME)
FROM (
SELECT COLUMN_NAME
,DATA_TYPE
FROM INFORMATION_SCHEMA.VIEWS v
JOIN INFORMATION_SCHEMA.COLUMNS c ON c.TABLE_SCHEMA = v.TABLE_SCHEMA
AND c.TABLE_NAME = v.TABLE_NAME
WHERE c.TABLE_NAME = 'customersview'
AND DATA_TYPE IN ('int', 'bigint')
) t
FOR XML path('')
,type
).value('.', 'NVARCHAR(MAX)'), 1, 1, '') + ')'
END
--Check if input is date type
ELSE IF (ISDATE(@searchString) = 1)
BEGIN
SET @query = @query + ' AND ''' + @searchString + ''' IN (' + Stuff((
SELECT DISTINCT ', ' + Quotename(COLUMN_NAME)
FROM (
SELECT COLUMN_NAME
,DATA_TYPE
FROM INFORMATION_SCHEMA.VIEWS v
JOIN INFORMATION_SCHEMA.COLUMNS c ON c.TABLE_SCHEMA = v.TABLE_SCHEMA
AND c.TABLE_NAME = v.TABLE_NAME
WHERE c.TABLE_NAME = 'customersview'
AND DATA_TYPE IN ('date', 'datetime')
) t
FOR XML path('')
,type
).value('.', 'NVARCHAR(MAX)'), 1, 1, '') + ')'
END
ELSE
BEGIN
--Check if input is VARCHAR/NVARCHAR type
SET @query = @query + ' AND ''' + @searchString + ''' IN (' + Stuff((
SELECT DISTINCT ', ' + Quotename(COLUMN_NAME)
FROM (
SELECT COLUMN_NAME
,DATA_TYPE
FROM INFORMATION_SCHEMA.VIEWS v
JOIN INFORMATION_SCHEMA.COLUMNS c ON c.TABLE_SCHEMA = v.TABLE_SCHEMA
AND c.TABLE_NAME = v.TABLE_NAME
WHERE c.TABLE_NAME = 'customersview'
AND DATA_TYPE IN ('VARCHAR', 'NVARCHAR')
) t
FOR XML path('')
,type
).value('.', 'NVARCHAR(MAX)'), 1, 1, '') + ')'
END
DELETE #Temp
WHERE searchString = @searchString
END
SELECT @query
--Execute the query
--EXEC(@Query)
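The second TODO item (parameterizing the query) might look roughly like this for a single string filter. This is a sketch, not the full loop: the parameter name @value is made up for illustration, and only the column list is concatenated into the statement while the search value is passed through sp_executesql.

```sql
-- Sketch only: handles one varchar filter; @value is a hypothetical parameter name.
DECLARE @searchString varchar(100) = 'smith';

-- Column names come from metadata and are quoted, so they are safe to concatenate.
DECLARE @columns nvarchar(max) = STUFF((
    SELECT N', ' + QUOTENAME(c.COLUMN_NAME)
    FROM INFORMATION_SCHEMA.COLUMNS AS c
    WHERE c.TABLE_NAME = 'customersview'
      AND c.DATA_TYPE IN ('varchar', 'nvarchar')
    FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), 1, 2, N'');

-- The user-supplied value never becomes part of the SQL text.
DECLARE @query nvarchar(max) =
    N'SELECT * FROM customersview WHERE @value IN (' + @columns + N')';

EXEC sys.sp_executesql @query, N'@value varchar(100)', @value = @searchString;
```

If the view has no matching columns, @columns is NULL and so is @query, so it is worth checking for that before executing.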

How to get the length of one column in all tables in one database?

I am having trouble getting the max length of the values in one column across all tables.
I would like to display only the max length for each table for that specific column.
Below is what I have tried. I already found a way to return the column I need, but now I need to get the max length. I know this is not the right way.
select max(len(site)) as site from
(
SELECT t.name AS TableName
FROM sys.columns c
JOIN sys.tables t ON c.object_id = t.object_id
WHERE c.name LIKE 'site%')A
The expected result will display the column name, the table name and also the max length of the records for that column.
Thanks in advance
I don't fully understand what you are trying to do, but I think you want something like this:
CREATE TABLE T1(
Site1 VARCHAR(45)
);
CREATE TABLE T2(
Site2 VARCHAR(45)
);
INSERT INTO T1 VALUES ('A'), ('AA');
INSERT INTO T2 VALUES ('BBB'), ('BBBBB');
DECLARE @SQL NVARCHAR(MAX) = 'SELECT ';
SELECT @SQL = @SQL +
N'(SELECT MAX(LEN(' + --You can also add ISNULL([Col],0) to get 0
QUOTENAME(c.name) + ')) FROM '+
QUOTENAME(t.name) + ') AS ' +
QUOTENAME(t.name + '.'+c.name) + ', '
FROM sys.columns c
JOIN sys.tables t ON c.object_id = t.object_id
WHERE c.name LIKE 'site%';
SET @SQL = LEFT(@SQL, LEN(@SQL)-1);
EXEC sp_executesql @SQL;
Which returns:
+----------+----------+
| T1.Site1 | T2.Site2 |
+----------+----------+
| 2 | 5 |
+----------+----------+
Try this. You will get individual scripts to execute, one per table:
select 'select Max(len('+COLUMN_NAME+')),'''+COLUMN_NAME+ ''' as ColumnName ,'''+TABLE_NAME+''' as TableName from ' +table_name
from INFORMATION_SCHEMA.COLUMNS where COLUMN_NAME = 'YourColumnName'
If I understand your question, you may try to generate a dynamic SQL statement and execute this statement:
-- Declarations
DECLARE @stm nvarchar(max)
SET @stm = N''
-- Dynamic SQL
SELECT @stm = (
SELECT CONCAT(
N'UNION ALL ',
N'SELECT ''',
t.name,
N''' AS TableName, ''',
c.name,
N''' AS ColumnName, ',
N'ValueLength = (SELECT MAX(LEN(',
QUOTENAME(c.name),
')) FROM ',
QUOTENAME(t.name),
N')'
)
FROM sys.columns c
JOIN sys.tables t ON c.object_id = t.object_id
WHERE c.name LIKE 'site%'
ORDER BY t.name, c.name
FOR XML PATH('')
)
SET @stm = STUFF(@stm, 1, 10, N'')
-- Execution
PRINT @stm
EXEC sp_executesql @stm

Transpose/Pivot Table without knowing the number or names of attributes

I want to transpose an SQL table from a row into a column of results. The statement will only return one record however at the time of running the query I will not know the names of attributes in the table. All the query will know is the table and the ID column to return the relevant record.
i.e. I would like to return this as a column of results:
SELECT * FROM ExampleTable WHERE (PKCol = 'XYZ');
That is the only information I will know at the time of running the query in SQL Server 2012.
Thanks
You should retrieve the column names from the sys.columns system view, concatenate them into a variable, and use UNPIVOT.
Something like this:
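-- Table name, key column, and key value are taken from the question's example; adjust as needed
DECLARE @table_name SYSNAME = N'ExampleTable', @id NVARCHAR(255) = N'XYZ'
DECLARE @columns AS NVARCHAR(MAX) = N'', @columns_char AS NVARCHAR(MAX) = N'', @query AS NVARCHAR(MAX)
-- Cast every column to the same type, because UNPIVOT requires matching types
SELECT @columns += N',' + QUOTENAME(c.name), @columns_char += N',CAST(' + QUOTENAME(c.name) + N' AS VARCHAR(255)) AS ' + QUOTENAME(c.name) FROM sys.columns AS c WHERE c.object_id = OBJECT_ID(@table_name)
SELECT @columns = STUFF(@columns, 1, 1, ''), @columns_char = STUFF(@columns_char, 1, 1, '')
SELECT @columns, @columns_char
SET @query =
N'SELECT
column_name,
col
FROM
(
SELECT ' + @columns_char + N' FROM ' + QUOTENAME(@table_name) + N' AS t
WHERE t.PKCol = @id
) AS sel
UNPIVOT
(
col FOR column_name IN(' + @columns + N')
) AS upt';
-- The key value is passed as a parameter rather than concatenated into the string
EXEC sp_executesql @query, N'@id NVARCHAR(255)', @id = @id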

Looping through column names with dynamic SQL

I just came up with an idea for a piece of code to show all the distinct values for each column, and count how many records for each. I want the code to loop through all columns.
Here's what I have so far... I'm new to SQL so bear with the noobness :)
Hard code:
select [Sales Manager], count(*)
from [BT].[dbo].[test]
group by [Sales Manager]
order by 2 desc
Attempt at dynamic SQL:
Declare @sql varchar(max),
@column as varchar(255)
set @column = '[Sales Manager]'
set @sql = 'select ' + @column + ',count(*) from [BT].[dbo].[test] group by ' + @column + ' order by 2 desc'
exec (@sql)
Both of these work fine. How can I make it loop through all columns? I don't mind if I have to hard code the column names and it works its way through subbing in each one for @column.
Does this make sense?
Thanks all!
You can use dynamic SQL and get all the column names for a table. Then build up the script:
Declare @sql varchar(max) = ''
declare @tablename as varchar(255) = 'test'
select @sql = @sql + 'select [' + c.name + '],count(*) as ''' + c.name + ''' from [' + t.name + '] group by [' + c.name + '] order by 2 desc; '
from sys.columns c
inner join sys.tables t on c.object_id = t.object_id
where t.name = @tablename
EXEC (@sql)
Change @tablename to the name of your table (without the database or schema name).
This is a bit of an XY answer, but if you don't mind hardcoding the column names, I suggest you do just that, and avoid dynamic SQL - and the loop - entirely. Dynamic SQL is generally considered the last resort, opens you up to security issues (SQL injection attacks) if not careful, and can often be slower if queries and execution plans cannot be cached.
If you have a ton of column names you can write a quick piece of code or mail merge in Word to do the substitution for you.
However, as far as how to get column names, assuming this is SQL Server, you can use the following query:
SELECT c.name
FROM sys.columns c
WHERE c.object_id = OBJECT_ID('dbo.test')
Therefore, you can build your dynamic SQL from this query:
SELECT 'select '
+ QUOTENAME(c.name)
+ ',count(*) from [BT].[dbo].[test] group by '
+ QUOTENAME(c.name)
+ ' order by 2 desc'
FROM sys.columns c
WHERE c.object_id = OBJECT_ID('dbo.test')
and loop using a cursor.
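A cursor loop over those column names might look like the following. This is a minimal sketch that assumes the current database is BT (so that OBJECT_ID('dbo.test') resolves to the same table); the cursor name and the record_count alias are made up for illustration.

```sql
-- Sketch only: runs one GROUP BY query per column of dbo.test.
DECLARE @column sysname, @sql nvarchar(max);

DECLARE column_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT c.name
    FROM sys.columns AS c
    WHERE c.object_id = OBJECT_ID(N'dbo.test')
    ORDER BY c.column_id;

OPEN column_cursor;
FETCH NEXT FROM column_cursor INTO @column;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- QUOTENAME escapes names such as [Sales Manager]
    SET @sql = N'SELECT ' + QUOTENAME(@column) + N', COUNT(*) AS record_count'
             + N' FROM [BT].[dbo].[test]'
             + N' GROUP BY ' + QUOTENAME(@column)
             + N' ORDER BY 2 DESC;';
    EXEC sys.sp_executesql @sql;

    FETCH NEXT FROM column_cursor INTO @column;
END

CLOSE column_cursor;
DEALLOCATE column_cursor;
```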
Or compile the whole thing together into one batch and execute. Here we use the FOR XML PATH('') trick:
DECLARE @sql VARCHAR(MAX) = (
SELECT ' select ' --note the extra space at the beginning
+ QUOTENAME(c.name)
+ ',count(*) from [BT].[dbo].[test] group by '
+ QUOTENAME(c.name)
+ ' order by 2 desc'
FROM sys.columns c
WHERE c.object_id = OBJECT_ID('dbo.test')
FOR XML PATH('')
)
EXEC(@sql)
Note I am using the built-in QUOTENAME function to escape column names that need escaping.
Do you want to know the distinct column values in all the columns of the table? Just replace the table name Employee with your table name in the following code:
declare @SQL nvarchar(max)
set @SQL = ''
;with cols as (
select Table_Schema, Table_Name, Column_Name, Row_Number() over(partition by Table_Schema, Table_Name
order by ORDINAL_POSITION) as RowNum
from INFORMATION_SCHEMA.COLUMNS
)
select @SQL = @SQL + case when RowNum = 1 then '' else ' union all ' end
+ ' select ''' + Column_Name + ''' as Column_Name, count(distinct ' + quotename (Column_Name) + ' ) As DistinctCountValue,
count( '+ quotename (Column_Name) + ') as CountValue FROM ' + quotename (Table_Schema) + '.' + quotename (Table_Name)
from cols
where Table_Name = 'Employee' --print @SQL
execute (@SQL)