Select non-empty columns using SQL Server - sql

I am using SQL Server 2012. I have a table with 90 columns. I am trying to select only the columns that contain data. After searching, I used the following procedure:
1- Getting the non-NULL count of every column using one select query
2- Unpivoting the result table into a temp table
3- Creating the select query from the columns whose count is greater than zero
4- Executing this query
Here is the query I used:
DECLARE @strTablename varchar(100) = 'dbo.MyTable'
DECLARE @strQuery varchar(max) = ''
DECLARE @strSecondQuery varchar(max) = 'SELECT '
DECLARE @strUnPivot as varchar(max) = ' UNPIVOT ([Count] for [Column] IN ('
CREATE TABLE ##tblTemp([Column] varchar(50), [Count] Int)
SELECT @strQuery = ISNULL(@strQuery,'') + 'Count([' + name + ']) as [' + name + '] ,' from sys.columns where object_id = object_id(@strTablename) and is_nullable = 1
SELECT @strUnPivot = ISNULL(@strUnPivot,'') + '[' + name + '] ,' from sys.columns where object_id = object_id(@strTablename) and is_nullable = 1
SET @strQuery = 'SELECT [Column],[Count] FROM ( SELECT ' + SUBSTRING(@strQuery,1,LEN(@strQuery) - 1) + ' FROM ' + @strTablename + ') AS p ' + SUBSTRING(@strUnPivot,1,LEN(@strUnPivot) - 1) + ')) AS unpvt '
INSERT INTO ##tblTemp EXEC (@strQuery)
SELECT @strSecondQuery = @strSecondQuery + '[' + [Column] + '],' from ##tblTemp WHERE [Count] > 0
DROP TABLE ##tblTemp
SET @strSecondQuery = SUBSTRING(@strSecondQuery,1,LEN(@strSecondQuery) - 1) + ' FROM ' + @strTablename
EXEC (@strSecondQuery)
The problem is that this query is too slow. Is there a better way to achieve this?
Notes:
The table has only one clustered index, on the primary key column ID, and no other indexes.
The table is not editable.
The table contains a very large amount of data.
The query takes about 1 minute to execute.
Thanks in advance.

I do not know if this is faster, but you might use one trick: FOR XML AUTO will omit columns whose value is NULL:
DECLARE @tbl TABLE(col1 INT,col2 INT,col3 INT);
INSERT INTO @tbl VALUES (1,2,NULL),(1,NULL,NULL),(NULL,NULL,NULL);
SELECT *
FROM @tbl AS tbl
FOR XML AUTO
This is the result: col3 is missing...
<tbl col1="1" col2="2" />
<tbl col1="1" />
<tbl />
Knowing this, you could find the list of columns that are non-NULL in at least one row, like this:
DECLARE @ColList VARCHAR(MAX)=
STUFF
(
(
SELECT DISTINCT ',' + Attr.value('local-name(.)','nvarchar(max)')
FROM
(
SELECT
(
SELECT *
FROM @tbl AS tbl
FOR XML AUTO,TYPE
) AS TheXML
) AS t
CROSS APPLY t.TheXML.nodes('/tbl/@*') AS A(Attr)
FOR XML PATH('')
),1,1,''
);
SELECT @ColList
The content of @ColList is now col1,col2. You can place this string in a dynamically created SELECT.
UPDATE: Hints
It would be very clever to replace the SELECT * with a column list created from INFORMATION_SCHEMA.COLUMNS, excluding all non-nullable columns and, if needed and possible, types which contain very large data (BLOBs).
UPDATE2: Performance
Don't know what your very large data means actually... Just tried this on a table with about 500.000 rows (with SELECT *) and it returned correctly after less than one minute. Hope, this is fast enough...
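The counting approach from the question (one scan that returns a non-NULL count per column, then a SELECT built from the columns whose count is positive) can be sketched outside SQL Server as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in database, `PRAGMA table_info` plays the role of sys.columns, and the table `demo` and the helpers `quote_ident` and `non_empty_columns` are hypothetical names, not part of the original code.

```python
import sqlite3

def quote_ident(name):
    """Quote an identifier for SQLite (embedded double quotes are doubled)."""
    return '"' + name.replace('"', '""') + '"'

def non_empty_columns(conn, table):
    """Return the columns of `table` that hold at least one non-NULL value."""
    # PRAGMA table_info stands in for sys.columns; field 1 is the column name.
    cols = [row[1] for row in conn.execute(f"PRAGMA table_info({quote_ident(table)})")]
    # COUNT(col) ignores NULLs, so a single scan yields one count per column.
    counts_sql = ("SELECT " + ", ".join(f"COUNT({quote_ident(c)})" for c in cols)
                  + f" FROM {quote_ident(table)}")
    counts = conn.execute(counts_sql).fetchone()
    return [c for c, n in zip(cols, counts) if n > 0]

# Hypothetical sample data mirroring the three-row example above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (col1 INT, col2 INT, col3 INT)")
conn.executemany("INSERT INTO demo VALUES (?, ?, ?)",
                 [(1, 2, None), (1, None, None), (None, None, None)])

kept = non_empty_columns(conn, "demo")
select_sql = "SELECT " + ", ".join(quote_ident(c) for c in kept) + " FROM demo"
rows = conn.execute(select_sql).fetchall()
```

Reading the counts directly from the single result row avoids the unpivot step and the global temp table; whether that helps on SQL Server depends mostly on the cost of the one full scan, which this sketch does not remove.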

Try using this condition:
where @columnname IS NOT NULL AND @columnname <> ' '

Related

Return SELECT query result as a CSV string

I have the following SQL Server 2016 SELECT statement that returns only 1 row:
SELECT TOP 1 * FROM tempdb.dbo.IMTD
How can I concatenate the values as a comma-delimited string? NOTE: the column names of this temporary table are unknown, as they can vary.
Thank you.
Something like this perhaps:
-- Sample data
DECLARE @someTable TABLE (SomeID int identity, SomeTxt varchar(100));
INSERT @someTable VALUES ('row1'),('row2'),('row3');
-- Solution
SELECT ConcatinatedString =
STUFF
((
SELECT ','+SomeTxt
FROM @someTable
FOR XML PATH(''), TYPE
).value('.','varchar(100)'),1,1,'');
You can use Dynamic query as below:
DECLARE @COLS VARCHAR(MAX) = ''
SELECT @COLS = @COLS + ',' + COLUMN_NAME
FROM tempdb.INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME LIKE '#table[_]%' -- Dynamic Table (here, Temporary table)
DECLARE @COLNAMES VARCHAR(MAX) = REPLACE(STUFF(@COLS, 1, 1, ''), ',', '+ '','' +')
Declare @cmd varchar(max) = 'Select ' + @COLNAMES + ' as CSVCol from #table'
-- will generate
-- Select Column1+ ',' +Column2+ ',' +Column3 as CSVCol from #table
EXEC (@cmd)
Another solution you can try is this.
SELECT LTRIM(RTRIM(<ColumnName1>)) + ',',
LTRIM(RTRIM(<ColumnName2>)) + ',',
...
LTRIM(RTRIM(<ColumnNamen>)) + ','
FROM tempdb.dbo.IMTD
If you only want one row, keep the TOP 1 in there, like this:
SELECT TOP 1
LTRIM(RTRIM(<ColumnName1>)) + ',',
LTRIM(RTRIM(<ColumnName2>)) + ',',
...
LTRIM(RTRIM(<ColumnNamen>)) + ','
FROM tempdb.dbo.IMTD
The LTRIM and RTRIM will remove any white space and this should allow you to copy and paste the result set anywhere you may need it. You will need to do this for each columnname.
You can use the query below to get the column names from your temp table.
DECLARE @ColumnNames NVARCHAR(MAX)
SELECT
@ColumnNames= COALESCE(@ColumnNames +',','')+COLUMN_NAME
FROM
TempDB.INFORMATION_SCHEMA.COLUMNS
WHERE
TABLE_NAME LIKE '#TempTableName[_]%' -- local temp table names are stored in tempdb with a generated suffix
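The dynamic answers above concatenate the columns as they are. In SQL Server, a NULL operand makes the whole `+` concatenation NULL under the default settings, and mixing a varchar with an int column triggers an implicit conversion to int, so each column usually needs a cast and a NULL replacement. A minimal sketch of that idea follows, assuming Python's sqlite3 as a stand-in database (`||` instead of `+`, `LIMIT 1` instead of `TOP 1`); the helper `first_row_as_csv` and the sample columns are hypothetical.

```python
import sqlite3

def q(name):
    """Quote an identifier for SQLite."""
    return '"' + name.replace('"', '""') + '"'

def first_row_as_csv(conn, table):
    """Return the first row of `table` as one comma-delimited string."""
    # Column names are read from metadata because they are not known in advance.
    cols = [r[1] for r in conn.execute(f"PRAGMA table_info({q(table)})")]
    # CAST makes every column text; COALESCE stops a NULL from turning
    # the whole concatenation into NULL.
    parts = [f"COALESCE(CAST({q(c)} AS TEXT), '')" for c in cols]
    sql = "SELECT " + " || ',' || ".join(parts) + f" FROM {q(table)} LIMIT 1"
    return conn.execute(sql).fetchone()[0]

# Hypothetical one-row table standing in for tempdb.dbo.IMTD.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE IMTD (id INT, name TEXT, note TEXT)")
conn.execute("INSERT INTO IMTD VALUES (1, 'abc', NULL)")

csv_line = first_row_as_csv(conn, "IMTD")
```

On SQL Server the equivalent per-column expression would be along the lines of ISNULL(CAST(col AS varchar(max)), ''); values that themselves contain commas would still need quoting, which this sketch does not attempt.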

sum columns dynamically sql

I have multiple columns with some amount in a table and I want to show the total of all those amounts in the last Total column. I have a table in sql which looks somewhat like this,
A_Amt B_Amt C_Amt D_Amt E_Amt F_Amt ...
------------------------------------------------
15 20 25 30 35 40
I have written the following query:
declare @xmlResult xml=
(
select *
from Foo
for xml PATH
);
SELECT Nodes.node.value('sum(*[contains(local-name(.), "_Amt")])', 'decimal(15,2)') AS Total
FROM
@xmlResult.nodes('//row') as Nodes(node);
but the result has only one column, Total; I want all the other columns (A_Amt and so on) in the resulting table as well.
This should be what you need, BUT ATTENTION! You should NOT do this. Aggregate rows should NEVER be fetched together with the "raw" data. This is - in most cases - something your UI should do (or a report...)
declare @table TABLE(ID INT IDENTITY, a INT,b INT,c INT);
insert into @table VALUES(1,1,1),(2,3,4),(5,6,7);
SELECT a,b,c
FROM
(
SELECT ROW_NUMBER() OVER(ORDER BY t.ID) AS inx
,a,b,c
FROM @table AS t
UNION SELECT 999999,SUM(a),SUM(b),SUM(c)
FROM @table
) AS tbl
ORDER BY tbl.inx
I think this is what you are looking for, try this (replace spt_values with your table) :
USE MASTER
GO
declare @lsql nvarchar(max)
declare @lsql2 nvarchar(max)
declare @yourTable nvarchar(255) = 'spt_values'
Select @lsql = isnull(@lsql+'+','') + 'Case When ISNUMERIC('+name+') = 1 Then '+name+' else 0 end' from sys.Columns where Object_id = Object_id(@yourTable)
Print @lsql
SET @lsql2 = 'Select *, '+@lsql+' as Total_allcolumns From '+@yourTable+''
Exec(@lsql2)
Using Microsoft's system table is one way to achieve dynamic SQL and thus your goal. The code below is what you want or will at least get you started.
I wasn't sure what output you expected, so I included two outputs. Just use the one you want and discard the other one. Given your question, it is probably result1. (Result1 or Result2)
!!You have to write the table name in the script at the place indicated prior to executing it!!
--DISCLAIMER
--It assumes you use SQL Server 2012. (It will probably work on 2005+ with little adjustment.)
--It assumes the data is in a table (not a view, for example).
--Changing the SQL Server version may break the code, as Microsoft could change the system views.
--I don't remember well, but EXEC may be limited to 4000 characters in a dynamic query. (There is a workaround; look around if you need it.)
--So use at your own risk
DECLARE @objectIDTable INT,
@AllColumnAdditionStatement NVARCHAR(MAX) = '',
@TableName NVARCHAR(250) = 'WriteYourTableNameHere',--!!!OVERWRITE THE TABLE NAME HERE
@Query NVARCHAR(MAX),
@AllSumStatement NVARCHAR(MAX) = ''
SELECT TOP 1 @objectIDTable = [object_id],
@AllColumnAdditionStatement = ''
FROM sys.objects
WHERE type_desc = 'USER_TABLE'
AND name = @TableName
SELECT @AllColumnAdditionStatement = @AllColumnAdditionStatement + 'CONVERT(DECIMAL(18, 4), (CASE WHEN ISNUMERIC(' + name + ') = 1 THEN ISNULL(' + name + ', ''0'') ELSE 0 END))' + ' + ',
@AllSumStatement = @AllSumStatement + name + 'Total = SUM(CONVERT(DECIMAL(18, 4), (CASE WHEN ISNUMERIC(' + name + ') = 1 THEN ISNULL(' + name + ', ''0'') ELSE 0 END))), ' + CHAR(10)
FROM sys.columns
WHERE object_id = @objectIDTable
AND name LIKE '%_Amt' --!!!Here is a column filter/selector to sum only columns ending with _Amt
SELECT @AllColumnAdditionStatement = @AllColumnAdditionStatement + '0', --just too lazy to chop off last three char
@AllSumStatement = @AllSumStatement + 'Total_ = SUM(' + @AllColumnAdditionStatement + ')' + CHAR(10),
@Query = 'SELECT *,
Total_ = ' + @AllColumnAdditionStatement +'
FROM ' + @TableName
PRINT (@Query)
/********************************************************************************************/
EXEC (@Query) --or use sp_executesql if you prefer
--Result1 : addition of all selected columns into a total column, with all columns returned as well
/********************************************************************************************/
SELECT @Query = 'SELECT ' + @AllSumStatement + '
FROM ' + @TableName
EXEC (@Query) --or use sp_executesql if you prefer
--Result2 : summation of each column individually and summation of all of them into a total column
/********************************************************************************************/
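The core of the dynamic answers above is small: read the column names from metadata, keep the ones ending in _Amt, and generate one addition expression that is appended to SELECT *. A minimal sketch of that idea, assuming Python's sqlite3 as a stand-in for SQL Server and a hypothetical four-column version of the table Foo:

```python
import sqlite3

def q(name):
    """Quote an identifier for SQLite."""
    return '"' + name.replace('"', '""') + '"'

# Hypothetical sample table: three amount columns plus one unrelated column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Foo (A_Amt INT, B_Amt INT, C_Amt INT, Note TEXT)")
conn.executemany("INSERT INTO Foo VALUES (?, ?, ?, ?)",
                 [(15, 20, 25, "x"), (1, None, 3, "y")])

# Pick the columns to add up from the table metadata.
all_cols = [r[1] for r in conn.execute("PRAGMA table_info(Foo)")]
amt_cols = [c for c in all_cols if c.endswith("_Amt")]

# COALESCE(col, 0) keeps one NULL amount from making the whole total NULL.
total_expr = " + ".join(f"COALESCE({q(c)}, 0)" for c in amt_cols)
total_sql = f"SELECT *, {total_expr} AS Total FROM Foo"
rows = conn.execute(total_sql).fetchall()
```

Every original column is returned next to the computed Total, which is what the question asks for; newly added _Amt columns are picked up the next time the statement is generated.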

Get top three most common values from every column in a table

I'm trying to write a query that will produce a very small sample of data from each column of a table, in which the sample is made up of the top 3 most common values. This particular problem is part of a bigger task, which is to write scripts that can characterize a database and its tables, its data integrity, and also quickly survey common values in the table on a per-column basis. Think of this as an automated "analysis" of a table.
On a single column basis, I do this already by simply calculating the frequency of values and then sorting by frequency. If I had a column called "color" and all colors were in it, and it just so happened that the color "blue" was in most rows, then the top 1 most frequently occurring value would be "blue". In SQL that is easy to calculate.
However, I'm not sure how I would do this over multiple columns.
Currently, when I do a calculation over all columns of a table, I perform the following type of query:
USE database;
DECLARE @t nvarchar(max)
SET @t = N'SELECT '
SELECT @t = @t + 'count(DISTINCT CAST(' + c.name + ' as varchar(max))) "' + c.name + '",'
FROM sys.columns c
WHERE c.object_id = object_id('table');
SET @t = SUBSTRING(@t, 1, LEN(@t) - 1) + ' FROM table;'
EXEC sp_executesql @t
However, it's not entirely clear to me how I would do that here.
(Side note: columns of type text, ntext, and image would cause errors while counting distinct values, but I'm less concerned about solving that.)
The problem of getting the top three most frequent values per column has me absolutely stumped.
Ideally, I'd like to end up with something like this:
Col1 Col2 Col3 Col4 Col5
---------------------------------------------------------------------
1,2,3 red,blue,green 29,17,0 c,d,j nevada,california,utah
I hacked this together, but it seems to work:
I can't help but think I should be using RANK().
USE <DB>;
DECLARE @query nvarchar(max)
DECLARE @column nvarchar(max)
DECLARE @table nvarchar(max)
DECLARE @i INT = 1
DECLARE @maxi INT = 10
DECLARE @target NVARCHAR(MAX) = <table>
declare @stage TABLE (i int IDENTITY(1,1), col nvarchar(max), tbl nvarchar(max))
declare @results table (ColumnName nvarchar(max), ColumnValue nvarchar(max), ColumnCount int, TableName NVARCHAR(MAX))
insert into @stage
select c.name, o.name
from sys.columns c
join sys.objects o on o.object_id=c.object_id and o.type = 'u'
and c.system_type_id IN (select system_type_id from sys.types where [name] not in ('text','ntext','image'))
and o.name like @target
SET @maxi = (select max(i) from @stage)
while @i <= @maxi
BEGIN
set @column = (select col from @stage where i = @i)
set @table = (select tbl from @stage where i = @i)
SET @query = N'SELECT ' +''''+@column+''''+' , '+ @column
SELECT @query = @query + ', COUNT( ' + @column + ' ) as count' + @column + ' , ''' + @table + ''' as tablename'
select @query = @query + ' from ' + @table + ' group by ' + @column
--Select @query
insert into @results
EXEC sp_executesql @query
SET @i = @i + 1
END
select * from @results
; with cte as (
select *, ROW_NUMBER() over (partition by Columnname order by ColumnCount desc) as rn from @results
)
select * from cte where rn <=3
Start with this SQL Statement builder, and modify it to suit your liking:
EDIT Added Order by Desc
With ColumnSet As
(
Select TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME
From INFORMATION_SCHEMA.COLUMNS
Where 1=1
And TABLE_NAME IN ('Table1')
And COLUMN_NAME IN ('Column1', 'Column2')
)
Select 'Select Top 3 ' + COLUMN_NAME + ', Count (*) NumInstances From ' + TABLE_SCHEMA + '.'+ TABLE_NAME + ' Group By ' + COLUMN_NAME + ' Order by Count (*) Desc'
From ColumnSet
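Both answers come down to running one "group by, order by count descending, keep three" query per column and collecting the results. A minimal sketch of that loop, assuming Python's sqlite3 as a stand-in for SQL Server (`LIMIT` instead of `TOP`); the table `survey`, its data, and the helper `top_values` are hypothetical. Ties are broken by the value itself, which is an added assumption so that the output is deterministic.

```python
import sqlite3

def q(name):
    """Quote an identifier for SQLite."""
    return '"' + name.replace('"', '""') + '"'

def top_values(conn, table, n=3):
    """Map each column of `table` to its n most frequent values."""
    cols = [r[1] for r in conn.execute(f"PRAGMA table_info({q(table)})")]
    result = {}
    for c in cols:
        # One frequency query per column; ties are ordered by the value.
        sql = (f"SELECT {q(c)}, COUNT(*) AS freq FROM {q(table)} "
               f"GROUP BY {q(c)} ORDER BY freq DESC, {q(c)} LIMIT ?")
        result[c] = [value for value, _ in conn.execute(sql, (n,))]
    return result

# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE survey (color TEXT, state TEXT)")
conn.executemany("INSERT INTO survey VALUES (?, ?)", [
    ("blue", "nevada"), ("blue", "nevada"), ("blue", "nevada"),
    ("green", "utah"), ("green", "utah"), ("red", "ohio"), ("yellow", "utah"),
])

summary = top_values(conn, "survey")
```

Joining each list with commas gives the one-row layout sketched in the question. The cost is one scan per column, which matches what the WHILE loop above does.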

Dynamic SQL Result INTO Temporary Table

I need to store a dynamic SQL result in a temporary table, #Temp.
The dynamic SQL result comes from a pivot, so the number of columns varies (it is not fixed).
SET @Sql = N'SELECT ' + @Cols + ' FROM
(
SELECT ResourceKey, ResourceValue
FROM LocaleStringResources where StateId ='
+ LTRIM(RTRIM(@StateID)) + ' AND FormId =' + LTRIM(RTRIM(@FormID))
+ ' AND CultureCode =''' + LTRIM(RTRIM(@CultureCode)) + '''
) x
pivot
(
max(ResourceValue)
for ResourceKey IN (' + @Cols + ')
) p ;'
--@Cols => column names, which vary in number
Now I have to insert dynamic sql result to #Temp Table and use this #Temp Table with another existing table to perform joins or something else.
(#Temp table should exist there to perform operations with other existing tables)
How can I Insert dynamic SQL query result To a Temporary table?
Thanks
Can you please try the below query.
SET @Sql = N'SELECT ' + @Cols + '
into ##TempTable
FROM
(
SELECT ResourceKey, ResourceValue
FROM LocaleStringResources where StateId ='
+ LTRIM(RTRIM(@StateID)) + ' AND FormId =' + LTRIM(RTRIM(@FormID))
+ ' AND CultureCode =''' + LTRIM(RTRIM(@CultureCode)) + '''
) x
pivot
(
max(ResourceValue)
for ResourceKey IN (' + @Cols + ')
) p ;'
You can then use the ##TempTable for further operations.
However, do not forget to drop ##TempTable at the end of your query: it is a global temporary table, so running the query again while it still exists raises an error.
As was answered in (https://social.msdn.microsoft.com/Forums/sqlserver/en-US/144f0812-b3a2-4197-91bc-f1515e7de4b9/not-able-to-create-hash-table-inside-stored-proc-through-execute-spexecutesql-strquery?forum=sqldatabaseengine),
you need to create a #Temp table in advance:
CREATE TABLE #Temp(columns definition);
It seems that the task is impossible, if you know nothing about the dynamic list of columns in advance. But, most likely you do know something.
You do know the types of dynamic columns, because they come from PIVOT. Most likely, you know the maximum possible number of dynamic columns. Even if you don't, SQL Server has a limit of 1024 columns per (nonwide) table and there is a limit of 8060 bytes per row (http://msdn.microsoft.com/en-us/library/ms143432.aspx). So, you can create a #Temp table in advance with maximum possible number of columns and use only some of them (make all your columns NULLable).
So, CREATE TABLE will look like this (instead of int use your type):
CREATE TABLE #Temp(c1 int NULL, c2 int NULL, c3 int NULL, ..., c1024 int NULL);
Yes, column names in #Temp will not be the same as in @Cols. It should be OK for your processing.
You have a list of columns in your @Cols variable. You somehow make this list of columns in some external code, so when @Cols is generated you know how many columns there are. At this moment you should be able to generate a second list of columns that matches the definition of #Temp. Something like:
SET @TempCols = N'c1, c2, c3, c4, c5';
The number of columns in @TempCols should be the same as the number of columns in @Cols. Then your dynamic SQL would look like this (I have added INSERT INTO #Temp (@TempCols) in front of your code):
SET @Sql = N'INSERT INTO #Temp (' + @TempCols + N') SELECT ' + @Cols + N' FROM
(
SELECT ResourceKey, ResourceValue
FROM LocaleStringResources where StateId ='
+ LTRIM(RTRIM(@StateID)) + ' AND FormId =' + LTRIM(RTRIM(@FormID))
+ ' AND CultureCode =''' + LTRIM(RTRIM(@CultureCode)) + '''
) x
pivot
(
max(ResourceValue)
for ResourceKey IN (' + @Cols + ')
) p ;'
Then you execute your dynamic SQL:
EXEC (@Sql) OR sp_executesql @Sql
And then do other processing using the #Temp table and temp column names c1, c2, c3, ...
MSDN says:
A local temporary table created in a stored procedure is dropped
automatically when the stored procedure is finished.
You can also DROP the #Temp table explicitly, like this:
IF OBJECT_ID('tempdb..#Temp') IS NOT NULL
DROP TABLE #Temp
All this T-SQL code (CREATE TABLE, EXEC, ...your custom processing..., DROP TABLE) would naturally be inside the stored procedure.
An alternative to creating a temporary table is to use a subquery:
select t1.name, t1.lastname from (select * from table) t1
where "select * from table" is your dynamic query. Its result can then be used like a temp table under the alias t1, as in the example.
IF OBJECT_ID('tempdb..##TmepTable') IS NOT NULL DROP TABLE ##TmepTable
CREATE TABLE ##TmepTable (TmpCol CHAR(1))
DECLARE @SQL NVARCHAR(max) =' IF OBJECT_ID(''tempdb..##TmepTable'') IS NOT
NULL DROP TABLE ##TmepTable
SELECT * INTO ##TmepTable from [MyTableName]'
EXEC sp_executesql @SQL
SELECT Alias.* FROM ##TmepTable as Alias
IF OBJECT_ID('tempdb..##TmepTable') IS NOT NULL DROP TABLE ##TmepTable
Here is step by step solution for your problem.
Check for your temporary tables if they exist, and delete them.
IF OBJECT_ID('tempdb..#temp') IS NOT NULL
DROP TABLE #temp
IF OBJECT_ID('tempdb..##abc') IS NOT NULL
DROP TABLE ##abc
Store your main query result in first temp table (this step is for simplicity and more readability).
SELECT ResourceKey, ResourceValue
INTO #temp
FROM LocaleStringResources
WHERE StateId = LTRIM(RTRIM(@StateID))
AND FormId = LTRIM(RTRIM(@FormID))
AND CultureCode = LTRIM(RTRIM(@CultureCode))
Write below query to create your pivot and store result in another temp table.
DECLARE @str NVARCHAR(1000)
DECLARE @sql NVARCHAR(1000)
SELECT @str = COALESCE(@str+',', '') + ResourceKey FROM #temp
SET @sql = N'select * into ##abc from (select ' + @str + ' from (SELECT ResourceKey, ResourceValue FROM #temp) as A
Pivot
(
max(ResourceValue)
for ResourceKey in (' + @str + ')
)as pvt) as B'
Execute below query to get the pivot result in your next temp table ##abc.
EXECUTE sp_executesql @sql
And now you can use ##abc as table where-ever you want like
select * from ##abc
Hope this will help you.
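The pattern shared by these answers is: derive the column list from the data, generate the pivot statement, and let that statement create the result table so its shape does not have to be declared in advance. A minimal sketch follows, assuming Python's sqlite3 as a stand-in for SQL Server. SQLite has no PIVOT operator, so conditional aggregation (MAX over CASE) is swapped in for it, and CREATE TEMP TABLE ... AS SELECT takes the place of SELECT ... INTO; the sample rows and the helper names are hypothetical.

```python
import sqlite3

def q(name):
    """Quote an identifier for SQLite."""
    return '"' + name.replace('"', '""') + '"'

def lit(text):
    """Quote a string literal for SQLite (embedded single quotes are doubled)."""
    return "'" + text.replace("'", "''") + "'"

# Hypothetical key/value rows standing in for LocaleStringResources.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE LocaleStringResources (ResourceKey TEXT, ResourceValue TEXT)")
conn.executemany("INSERT INTO LocaleStringResources VALUES (?, ?)",
                 [("Title", "Hello"), ("Save", "Save"), ("Cancel", "Cancel it")])

# The pivot columns are not known in advance, so read them from the data.
keys = [r[0] for r in conn.execute(
    "SELECT DISTINCT ResourceKey FROM LocaleStringResources ORDER BY ResourceKey")]

# One MAX(CASE ...) per key plays the role of PIVOT ... FOR ResourceKey IN (...).
select_list = ", ".join(
    f"MAX(CASE WHEN ResourceKey = {lit(k)} THEN ResourceValue END) AS {q(k)}"
    for k in keys)
conn.execute(f"CREATE TEMP TABLE pivoted AS SELECT {select_list} FROM LocaleStringResources")

# The temp table now exists with data-driven columns and can be joined as usual.
cur = conn.execute("SELECT * FROM pivoted")
pivot_columns = [d[0] for d in cur.description]
pivot_rows = cur.fetchall()
```

Quoting both the generated column names and the key literals matters in any variant of this: resource keys come from data, so unquoted concatenation breaks on keys containing spaces or quotes.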

unpivot a table using column_name function

I have a table that was designed with columns named after the values I need to reference it with. I know I need to unpivot the table, and I have it working if I manually type all the column names. However, this table keeps getting new columns added to it, so I want to capture all the columns in the unpivot rather than script them manually.
I can get the column names from INFORMATION_SCHEMA.COLUMNS (the Column_name column) and was wondering if this can be used in the unpivot at all. I've been playing around with it and it's not looking possible to me at the moment, so I thought I'd check whether there were any other suggestions.
Sadly, I can't redesign the table that keeps getting columns added to it, although that would be the ideal solution.
select Day, Rota, RotaTemplate
from table1 t1
unpivot
(
Rota
for RotaTemplate in (select Column_name
from INFORMATION_SCHEMA.COLUMNS
where TABLE_NAME = 'table1')
) unpiv;
You are not allowed to do this. You need to build dynamic SQL statement like follows:
DECLARE @DynamicSQL NVARCHAR(MAX)
SET @DynamicSQL = N'select Day, Rota, RotaTemplate' + CHAR(10) +
'from table1 t1'+ CHAR(10) +
'unpivot' + CHAR(10) +
'(' + CHAR(10) + CHAR(9) +
'Rota for RotaTemplate in ('
+
STUFF
(
(
SELECT ',[' + [COLUMN_NAME] + '] '
FROM [INFORMATION_SCHEMA].[COLUMNS]
WHERE [TABLE_NAME] = 'table1'
FOR XML PATH('')
)
,1
,1
,''
)
+')' + CHAR(10) +
') unpiv;'
EXEC sp_executesql @DynamicSQL
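One detail to watch when generating the column list is that the key column (Day here) should be left out of it, since it is selected alongside the unpivoted values rather than unpivoted itself. A minimal sketch of the same generate-then-execute idea, assuming Python's sqlite3 as a stand-in for SQL Server. SQLite has no UNPIVOT, so one SELECT per column joined with UNION ALL is swapped in, with an IS NOT NULL filter to mimic UNPIVOT dropping NULL values; the sample columns T1 and T2 and the helper `build_unpivot_sql` are hypothetical.

```python
import sqlite3

def q(name):
    """Quote an identifier for SQLite."""
    return '"' + name.replace('"', '""') + '"'

def lit(text):
    """Quote a string literal for SQLite."""
    return "'" + text.replace("'", "''") + "'"

def build_unpivot_sql(conn, table, key_col):
    """Generate one SELECT per non-key column and glue them with UNION ALL."""
    cols = [r[1] for r in conn.execute(f"PRAGMA table_info({q(table)})")]
    value_cols = [c for c in cols if c != key_col]  # the key is not unpivoted
    branches = [
        f"SELECT {q(key_col)} AS Day, {q(c)} AS Rota, {lit(c)} AS RotaTemplate "
        f"FROM {q(table)} WHERE {q(c)} IS NOT NULL"
        for c in value_cols
    ]
    return " UNION ALL ".join(branches)

# Hypothetical sample data: one key column and two template columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (Day TEXT, T1 TEXT, T2 TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)",
                 [("Mon", "early", None), ("Tue", "late", "early")])

unpivot_sql = build_unpivot_sql(conn, "table1", "Day")
unpivoted = sorted(conn.execute(unpivot_sql).fetchall())
```

Because the statement is regenerated from metadata each time, columns added to the table later are included without editing the query.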