SQL Server count number of distinct values in each column of a table

I have a table with ~150 columns. I'd like to find the count(distinct(colName)) for each column but am wondering if there's a way to do so without actually typing out each column name.
Ideally I would use count(distinct(*)) but this doesn't work.
Any other suggestions?
EDIT:
if this is my table:
id col1 col2 col3 ...
01 10001 west north
02 10001 west south
03 10002 east south
04 10002 west north
05 10001 east south
06 10003 west north
I'm looking for this output
count(distinct(id)) count(distinct(col1)) count(distinct(col2)) count(distinct(col3))
6 3 2 2

You can do this:
DECLARE @query varchar(max)
SELECT @query =
'SELECT ' + SUBSTRING((SELECT ',' + 'COUNT(DISTINCT(' + column_name + ')) As ' + column_name + ' '
FROM information_schema.columns
WHERE
table_name = 'table_name'
for xml path('')),2,200000) + 'FROM table_name'
PRINT(@query)
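The same idea, reading the column names from the catalog and generating one COUNT(DISTINCT ...) per column, can be sketched outside the database. This is only an illustration: it uses Python with an in-memory SQLite database as a stand-in for SQL Server, and the table name t and the helper names are assumptions, not part of the answer above.

```python
import sqlite3


def quote_ident(name):
    """Quote an identifier for SQLite (double quotes, doubled if embedded)."""
    return '"' + name.replace('"', '""') + '"'


def build_distinct_count_query(table, columns):
    """Return a single SELECT with one COUNT(DISTINCT ...) per column."""
    parts = [
        "COUNT(DISTINCT {0}) AS {0}".format(quote_ident(c)) for c in columns
    ]
    return "SELECT " + ", ".join(parts) + " FROM " + quote_ident(table)


# Sample data taken from the question; the table name "t" is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id TEXT, col1 TEXT, col2 TEXT, col3 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        ("01", "10001", "west", "north"),
        ("02", "10001", "west", "south"),
        ("03", "10002", "east", "south"),
        ("04", "10002", "west", "north"),
        ("05", "10001", "east", "south"),
        ("06", "10003", "west", "north"),
    ],
)

# Read the column names from the catalog instead of typing them out.
columns = [row[1] for row in conn.execute("PRAGMA table_info(t)")]
query = build_distinct_count_query("t", columns)
counts = conn.execute(query).fetchone()

print(query)
print(counts)  # (6, 3, 2, 2)
```

Quoting every identifier plays the role that QUOTENAME plays in T-SQL, so column names containing spaces or reserved words do not break the generated statement.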

Use the script below to build a T-SQL query that returns a distinct count for each column in a table. Replace the @Table value with your table name.
DECLARE @Table SYSNAME = 'TableName';
-- REVERSE and STUFF used to remove trailing UNION in string
SELECT REVERSE(STUFF(REVERSE((SELECT 'SELECT ''' + name
+ ''' AS [Column], COUNT(DISTINCT('
+ QUOTENAME(name) + ')) AS [Count] FROM '
+ QUOTENAME(@Table) + ' UNION '
-- get column name from sys.columns
FROM sys.columns
WHERE object_id = Object_id(@Table)
-- concatenate result strings with FOR XML PATH
FOR XML PATH (''))), 1, 7, ';'));

Expanded answer from Bryan. His great answer lists the fields alphabetically.
This is no problem if you have a dozen fields or so. If you have 150 fields, like OP stated, this keeps the fields in their table's order.
I modified his query in order to examine a 213 column (vendor's) table and wanted to post for future reference.
DECLARE @Table SYSNAME = 'Your table name; without schema; no square brackets';
-- REVERSE and STUFF used to remove trailing UNION in string
SELECT REVERSE(STUFF(REVERSE((SELECT 'SELECT '
+ CAST(column_id AS VarChar(4)) + ' AS [column_id],' -- extra column
+ '''' + name
+ ''' AS [Column], COUNT(DISTINCT('
+ QUOTENAME(name) + ')) AS [Count] FROM '
+ QUOTENAME(@Table) + ' UNION '
-- get column name from sys.columns
FROM sys.columns
WHERE system_type_id NOT IN (34,240) AND object_id = Object_id(@Table)
ORDER BY column_id -- keeps columns in table order
-- concatenate result strings with FOR XML PATH
FOR XML PATH (''))), 1, 7, ';'));
I decided not to edit Bryan's answer because people often won't need the extra column.
(The ORDER BY has no effect if you don't add the column_id column. I believe this is because only the outermost ORDER BY is guaranteed to order the final output; I'd love to have a Microsoft reference that confirms this.)
EDIT: Using the COUNT function on columns of type Image or Geography throws an error.
Added "system_type_id NOT IN (34,240)".
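To make the "one row per column, in table order" shape concrete, here is a small sketch. It is an illustration only: it uses Python with an in-memory SQLite database rather than SQL Server, the table zeta_first and the function name are made up, and SQLite's cid (which starts at 0) stands in for column_id.

```python
import sqlite3


def build_per_column_query(conn, table):
    """Build a UNION ALL query returning (column_id, column_name, distinct_count).

    Assumes table and column names contain no quote characters.
    """
    cols = conn.execute('PRAGMA table_info("{0}")'.format(table)).fetchall()
    selects = [
        "SELECT {cid} AS column_id, '{name}' AS column_name, "
        'COUNT(DISTINCT "{name}") AS distinct_count FROM "{table}"'.format(
            cid=col[0], name=col[1], table=table
        )
        for col in cols
    ]
    # Joining with a separator avoids having to trim a trailing UNION.
    return " UNION ALL ".join(selects) + " ORDER BY column_id"


conn = sqlite3.connect(":memory:")
# Column names are deliberately not in alphabetical order.
conn.execute("CREATE TABLE zeta_first (zeta INTEGER, alpha TEXT)")
conn.executemany(
    "INSERT INTO zeta_first VALUES (?, ?)", [(1, "a"), (2, "a"), (3, "b")]
)

rows = conn.execute(build_per_column_query(conn, "zeta_first")).fetchall()
print(rows)  # [(0, 'zeta', 3), (1, 'alpha', 2)]
```

Because the ordinal is part of the output, the final ORDER BY applies to the whole result, which keeps the rows in the table's column order.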

I don't believe this is possible with just MySQL.
I would think you're going to have to use a server-side language to get the results you want.
Use "DESC TABLE" as your first query and then for each "field" row, compile your query.
Ignore this, wrong system tag :)

This should do:
select count(*) from (select distinct * from myTable) as t
Here is SQL Fiddle test.
create table Data
(
Id int,
Data varchar(50)
)
insert into Data
select 1, 'ABC'
union all
select 1, 'ABC'
select count(*)
from (select distinct * from Data) as t

Related

Find out the MAX value of a column that is present in multiple table and in multiple schema using SQL [duplicate]

I want to find the max value of a column which is present in multiple tables. Eg.
SELECT MAX(UpdatedDatetime) FROM schema_1.table_1
UNION ALL
SELECT MAX(UpdatedDatetime) FROM schema_1.table_2
UNION ALL
SELECT MAX(UpdatedDatetime) FROM schema_2.table_1
UNION ALL
SELECT MAX(UpdatedDatetime) FROM schema_2.table_2
Clearly the above method is quite cumbersome when there are say more than 100 tables.
Is there any way to dynamically generate the table name as above and form the query string?
I mean using INFORMATION_SCHEMA.COLUMNS?

The script below should do the job. You could also use the STRING_AGG function, but it is not supported on my SQL Server version, so this uses FOR XML PATH instead.
Declare @SQL varchar(max)
SELECT @SQL=STUFF((SELECT ' ' + CAST(Name AS VARCHAR(max)) [text()]
FROM (
SELECT 'select Max(UpdatedDatetime) from ' + QUOTENAME(ss.name) + '.' + QUOTENAME(st.name) + ' union all' as Name
FROM sys.tables st
INNER JOIN sys.schemas ss on st.[schema_id] = ss.[schema_id]
WHERE st.is_ms_shipped = 0
AND EXISTS (
SELECT 1
FROM sys.columns sc
WHERE sc.[object_id] = st.[object_id]
AND sc.name = 'UpdatedDatetime'
)
) ap
FOR XML PATH(''), TYPE)
.value('.','NVARCHAR(MAX)'),1,2,' ')
set @SQL = left(@SQL, len(@SQL) - 10) -- remove last union all
select @SQL
EXEC (@SQL)
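The generation step can also be sketched in a few lines of client code. The following is only an illustration under stated assumptions: it uses Python with an in-memory SQLite database instead of SQL Server, so sqlite_master and PRAGMA table_info stand in for sys.tables and sys.columns, there are no schemas, and the table names and helper functions are hypothetical.

```python
import sqlite3


def tables_with_column(conn, column):
    """Return the names of all tables that contain the given column."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    return [
        t
        for t in names
        if any(
            col[1] == column
            for col in conn.execute('PRAGMA table_info("{0}")'.format(t))
        )
    ]


def build_max_query(tables, column):
    """Build one UNION ALL query with a MAX(column) row per table."""
    selects = [
        "SELECT '{t}' AS table_name, MAX(\"{c}\") AS max_value FROM \"{t}\"".format(
            t=t, c=column
        )
        for t in tables
    ]
    return " UNION ALL ".join(selects) + " ORDER BY table_name"


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table_1 (id INTEGER, UpdatedDatetime TEXT);
    CREATE TABLE table_2 (id INTEGER, UpdatedDatetime TEXT);
    CREATE TABLE no_timestamp (id INTEGER);
    INSERT INTO table_1 VALUES (1, '2021-01-05 10:00:00');
    INSERT INTO table_1 VALUES (2, '2021-03-01 09:30:00');
    INSERT INTO table_2 VALUES (1, '2020-12-31 23:59:59');
    """
)

tables = tables_with_column(conn, "UpdatedDatetime")
result = conn.execute(build_max_query(tables, "UpdatedDatetime")).fetchall()
print(tables)
print(result)
```

Joining the per-table SELECTs with " UNION ALL " as a separator means there is no trailing "union all" to cut off afterwards.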

How to get column-level dependencies in a view

I've made some research on the matter but don't have solution yet. What I want to get is column-level dependencies in a view. So, let's say we have a table like this
create table TEST(
first_name varchar(10),
last_name varchar(10),
street varchar(10),
number int
)
and a view like this:
create view vTEST
as
select
first_name + ' ' + last_name as [name],
street + ' ' + cast(number as varchar(max)) as [address]
from dbo.TEST
What I'd like is to get result like this:
column_name depends_on_column_name depends_on_table_name
----------- --------------------- --------------------
name first_name dbo.TEST
name last_name dbo.TEST
address street dbo.TEST
address number dbo.TEST
I've tried sys.dm_sql_referenced_entities function, but referencing_minor_id is always 0 there for views.
select
referencing_minor_id,
referenced_schema_name + '.' + referenced_entity_name as depends_on_table_name,
referenced_minor_name as depends_on_column_name
from sys.dm_sql_referenced_entities('dbo.vTEST', 'OBJECT')
referencing_minor_id depends_on_table_name depends_on_column_name
-------------------- --------------------- ----------------------
0 dbo.TEST NULL
0 dbo.TEST first_name
0 dbo.TEST last_name
0 dbo.TEST street
0 dbo.TEST number
The same is true for sys.sql_expression_dependencies and for obsolete sys.sql_dependencies.
So do I miss something or is it impossible to do?
There're some related questions (Find the real column name of an alias used in a view?), but as I said - I haven't found a working solution yet.
EDIT 1: I've tried using the DAC to check whether this information is stored somewhere in the system base tables, but haven't found it.

This solution could answer your question only partially. It won't work for columns that are expressions.
You could use sys.dm_exec_describe_first_result_set to get column information:
@include_browse_information
If set to 1, each query is analyzed as if it has a FOR BROWSE option on the query. Additional key columns and source table information are returned.
CREATE TABLE txu(id INT, first_name VARCHAR(10), last_name VARCHAR(10));
CREATE TABLE txd(id INT, id_fk INT, address VARCHAR(100));
CREATE VIEW v_txu
AS
SELECT t.id AS PK_id,
t.first_name AS name,
d.address,
t.first_name + t.last_name AS name_full
FROM txu t
JOIN txd d
ON t.id = d.id_fk
Main query:
SELECT name, source_database, source_schema,
source_table, source_column
FROM sys.dm_exec_describe_first_result_set(N'SELECT * FROM v_txu', null, 1) ;
Output:
+-----------+--------------------+---------------+--------------+---------------+
| name | source_database | source_schema | source_table | source_column |
+-----------+--------------------+---------------+--------------+---------------+
| PK_id | fiddle_0f9d47226c4 | dbo | txu | id |
| name | fiddle_0f9d47226c4 | dbo | txu | first_name |
| address | fiddle_0f9d47226c4 | dbo | txd | address |
| name_full | null | null | null | null |
+-----------+--------------------+---------------+--------------+---------------+
DBFiddleDemo

This is a solution based on the query plan. It has some advantages:
almost any select query can be processed
no SchemaBinding
and disadvantages:
it has not been tested properly
it can break suddenly if Microsoft changes the XML query plan format.
The core idea is that every column expression inside the XML query plan is defined in a "DefinedValue" node. The first subnode of "DefinedValue" is a reference to the output column and the second one is an expression. The expression is computed from input columns and constant values.
As mentioned above, it's based only on empirical observation and needs to be tested properly.
Here is an invocation example:
exec dbo.GetColumnDependencies 'select * from dbo.vTEST'
target_column_name | source_column_name | const_value
---------------------------------------------------
address | Expr1007 | NULL
name | Expr1006 | NULL
Expr1006 | NULL | ' '
Expr1006 | [testdb].[dbo].first_name | NULL
Expr1006 | [testdb].[dbo].last_name | NULL
Expr1007 | NULL | ' '
Expr1007 | [testdb].[dbo].number | NULL
Expr1007 | [testdb].[dbo].street | NULL
Here is the code.
First of all, get the XML query plan.
declare @select_query as varchar(4000) = 'select * from dbo.vTEST' -- IT'S YOUR QUERY HERE.
declare @select_into_query as varchar(4000) = 'select top (1) * into #foo from (' + @select_query + ') as src'
, @xml_plan as xml = null
, @xml_generation_tries as tinyint = 10
;
while (@xml_plan is null and @xml_generation_tries > 0) -- There is no guarantee that the plan will be cached.
begin
execute (@select_into_query);
select @xml_plan = pln.query_plan
from sys.dm_exec_query_stats as qry
cross apply sys.dm_exec_sql_text(qry.sql_handle) as txt
cross apply sys.dm_exec_query_plan(qry.plan_handle) as pln
where txt.text = @select_into_query
;
set @xml_generation_tries = @xml_generation_tries - 1; -- Count the attempt so the loop cannot run forever.
end
if (@xml_plan is null
) begin
raiserror(N'Can''t extract XML query plan from cache.' ,15 ,0);
return;
end
;
Next is the main query. Its biggest part is a recursive common table expression for column extraction.
with xmlnamespaces(default 'http://schemas.microsoft.com/sqlserver/2004/07/showplan'
,'http://schemas.microsoft.com/sqlserver/2004/07/showplan' as shp -- Used in .query() for predictable namespace use.
)
, cte_column_dependencies as
(
The seed of the recursion is a query that extracts the columns of the #foo table, which stores 1 row of the select query of interest.
select
(select foo_col.info.query('./ColumnReference') for xml raw('shp:root') ,type) -- Because .value() can't extract an attribute from the root node.
as target_column_info
, (select foo_col.info.query('./ScalarOperator/Identifier/ColumnReference') for xml raw('shp:root') ,type)
as source_column_info
, cast(null as xml) as const_info
, 1 as iteration_no
from @xml_plan.nodes('//Update/SetPredicate/ScalarOperator/ScalarExpressionList/ScalarOperator/MultipleAssign/Assign')
as foo_col(info)
where foo_col.info.exist('./ColumnReference[@Table="[#foo]"]') = 1
The recursive part searches for the "DefinedValue" node of each dependent column and extracts all "ColumnReference" and "Const" subnodes that are used in the column expression. It's overcomplicated by the XML to SQL conversions.
union all
select
(select internal_col.info.query('.') for xml raw('shp:root') ,type)
, source_info.column_info
, source_info.const_info
, prev_dependencies.iteration_no + 1
from @xml_plan.nodes('//DefinedValue/ColumnReference') as internal_col(info)
inner join cte_column_dependencies as prev_dependencies -- Filters by dependent columns.
on prev_dependencies.source_column_info.value('(//ColumnReference/@Column)[1]' ,'nvarchar(4000)') = internal_col.info.value('(./@Column)[1]' ,'nvarchar(4000)')
and exists (select prev_dependencies.source_column_info.value('(.//@Schema)[1]' ,'nvarchar(4000)') intersect select internal_col.info.value('(./@Schema)[1]' ,'nvarchar(4000)'))
and exists (select prev_dependencies.source_column_info.value('(.//@Database)[1]' ,'nvarchar(4000)') intersect select internal_col.info.value('(./@Database)[1]' ,'nvarchar(4000)'))
and exists (select prev_dependencies.source_column_info.value('(.//@Server)[1]' ,'nvarchar(4000)') intersect select internal_col.info.value('(./@Server)[1]' ,'nvarchar(4000)'))
cross apply ( -- Because a result row can hold either a column or a constant, not both.
select (select source_col.info.query('.') for xml raw('shp:root') ,type) as column_info
, null as const_info
from internal_col.info.nodes('..//ColumnReference') as source_col(info)
union all
select null as column_info
, (select const.info.query('.') for xml raw('shp:root') ,type) as const_info
from internal_col.info.nodes('..//Const') as const(info)
) as source_info
where source_info.column_info is null
or (
-- Exclude the node itself, which '..//ColumnReference' also selects among its sources. Sorry, I'm not good enough with XQuery to check this more simply.
source_info.column_info.value('(//@Column)[1]' ,'nvarchar(4000)') <> internal_col.info.value('(./@Column)[1]' ,'nvarchar(4000)')
and (select source_info.column_info.value('(//@Schema)[1]' ,'nvarchar(4000)') intersect select internal_col.info.value('(./@Schema)[1]' ,'nvarchar(4000)')) is null
and (select source_info.column_info.value('(//@Database)[1]' ,'nvarchar(4000)') intersect select internal_col.info.value('(./@Database)[1]' ,'nvarchar(4000)')) is null
and (select source_info.column_info.value('(//@Server)[1]' ,'nvarchar(4000)') intersect select internal_col.info.value('(./@Server)[1]' ,'nvarchar(4000)')) is null
)
)
Finally, a select statement converts the XML to human-readable text.
select
-- col_dep.target_column_info
--, col_dep.source_column_info
--, col_dep.const_info
coalesce(col_dep.target_column_info.value('(.//shp:ColumnReference/@Server)[1]' ,'nvarchar(4000)') + '.' ,'')
+ coalesce(col_dep.target_column_info.value('(.//shp:ColumnReference/@Database)[1]' ,'nvarchar(4000)') + '.' ,'')
+ coalesce(col_dep.target_column_info.value('(.//shp:ColumnReference/@Schema)[1]' ,'nvarchar(4000)') + '.' ,'')
+ col_dep.target_column_info.value('(.//shp:ColumnReference/@Column)[1]' ,'nvarchar(4000)')
as target_column_name
, coalesce(col_dep.source_column_info.value('(.//shp:ColumnReference/@Server)[1]' ,'nvarchar(4000)') + '.' ,'')
+ coalesce(col_dep.source_column_info.value('(.//shp:ColumnReference/@Database)[1]' ,'nvarchar(4000)') + '.' ,'')
+ coalesce(col_dep.source_column_info.value('(.//shp:ColumnReference/@Schema)[1]' ,'nvarchar(4000)') + '.' ,'')
+ col_dep.source_column_info.value('(.//shp:ColumnReference/@Column)[1]' ,'nvarchar(4000)')
as source_column_name
, col_dep.const_info.value('(/shp:root/shp:Const/@ConstValue)[1]' ,'nvarchar(4000)')
as const_value
from cte_column_dependencies as col_dep
order by col_dep.iteration_no ,target_column_name ,source_column_name
option (maxrecursion 512) -- Guards against an infinite loop.

Everything you need is in the view definition,
so we can extract this information with the following steps:-
Assign the view definition to a string variable.
Split it on the comma (,).
Split each alias expression on the plus operator (+) using CROSS APPLY with XML.
Use the system tables to get accurate information such as the original table.
Demo:-
Create PROC psp_GetLevelDependsView (@sViewName varchar(200))
AS
BEGIN
Declare @stringToSplit nvarchar(1000),
@name NVARCHAR(255),
@dependsTableName NVARCHAR(50),
@pos INT
Declare @returnList TABLE ([Name] [nvarchar] (500))
-- Restrict the lookup to the requested view so TOP 1 does not pick a table from another view.
SELECT TOP 1 @dependsTableName= table_schema + '.'+ TABLE_NAME
FROM INFORMATION_SCHEMA.VIEW_COLUMN_USAGE
WHERE VIEW_NAME= @sViewName
select @stringToSplit = definition
from sys.objects o
join sys.sql_modules m on m.object_id = o.object_id
where o.object_id = object_id( @sViewName)
and o.type = 'V'
-- Split the view definition on commas, one fragment per row.
WHILE CHARINDEX(',', @stringToSplit) > 0
BEGIN
SELECT @pos = CHARINDEX(',', @stringToSplit)
SELECT @name = SUBSTRING(@stringToSplit, 1, @pos-1)
INSERT INTO @returnList
SELECT @name
SELECT @stringToSplit = SUBSTRING(@stringToSplit, @pos+1, LEN(@stringToSplit)-@pos)
END
INSERT INTO @returnList
SELECT @stringToSplit
select COLUMN_NAME , b.Name as Expression
Into #Temp
FROM INFORMATION_SCHEMA.COLUMNS a , @returnList b
WHERE TABLE_NAME= @sViewName
And (b.Name) like '%' + ( COLUMN_NAME) + '%'
SELECT A.COLUMN_NAME as column_name,
Split.a.value('.', 'VARCHAR(100)') AS depends_on_column_name , @dependsTableName as depends_on_table_name
Into #temp2
FROM
(
SELECT COLUMN_NAME,
CAST ('<M>' + REPLACE(Expression, '+', '</M><M>') + '</M>' AS XML) AS Data
FROM #Temp
) AS A CROSS APPLY Data.nodes ('/M') AS Split(a);
SELECT b.column_name , a.COLUMN_NAME as depends_on_column_name , b.depends_on_table_name
FROM INFORMATION_SCHEMA.VIEW_COLUMN_USAGE a , #temp2 b
WHERE VIEW_NAME= @sViewName
and b.depends_on_column_name like '%' + a.COLUMN_NAME + '%'
drop table #Temp
drop table #Temp2
END
Test:-
exec psp_GetLevelDependsView 'vTest'
Result:-
column_name depends_on_column_name depends_on_table_name
----------- --------------------- --------------------
name first_name dbo.TEST
name last_name dbo.TEST
address street dbo.TEST
address number dbo.TEST

I was playing around with this but didn't have time to go any further. Maybe this will help:
-- Returns all table columns called in the view and the objects they pull from
SELECT
v.[name] AS ViewName
,d.[referencing_id] AS ViewObjectID
,c.[name] AS ColumnNames
,OBJECT_NAME(d.referenced_id) AS ReferencedTableName
,d.referenced_id AS TableObjectIDsReferenced
FROM
sys.views v
INNER JOIN sys.sql_expression_dependencies d ON d.referencing_id = v.[object_id]
INNER JOIN sys.objects o ON d.referencing_id = o.[object_id]
INNER JOIN sys.columns c ON d.referenced_id = c.[object_id]
WHERE v.[name] = 'vTEST'
-- Returns all output columns in the view
SELECT
OBJECT_NAME([object_id]) AS ViewName
,[object_id] AS ViewObjectID
,[name] AS OutputColumnName
FROM sys.columns
WHERE OBJECT_ID('vTEST') = [object_id]
-- Get the view definition
SELECT
VIEW_DEFINITION
FROM INFORMATION_SCHEMA.VIEWS
WHERE TABLE_NAME = 'vTEST'

Unfortunately, SQL Server does not explicitly store mapping between source table columns and view columns. I suspect the main reason is simply due to the potential complexity of views (expression columns, functions called on those columns, nested queries etc.).
The only way that I can think of to determine the mapping between view columns and source columns would be to either parse the query associated to the view or parse the execution plan of the view.
The approach I have outlined here focuses on the second option and relies on the fact that SQL Server will avoid generating output lists for columns not required by a query.
The first step is to get the list of dependent tables and their associated columns required for the view. This can be achieved via the standard system tables in SQL Server.
Next, we enumerate all of the view’s columns via a cursor.
For each view column, we create a temporary wrapper stored procedure that selects only the single column in question from the view. Because only a single column is requested, SQL Server will only retrieve the information needed to output that single view column.
The newly created procedure will run the query in format-only mode and will therefore not cause any actual I/O operations on the database, but it will generate an estimated execution plan when executed. After the query plan is generated, we query the output lists from the execution plan. Since we know which view column was selected, we can now associate the output list with the view column in question. We can further refine the association by only associating columns that form part of our original dependency list; this eliminates expression outputs from the result set.
Note that with this method, if the view needs to join different tables together to generate the output, then all columns required to generate the output will be returned, even those not directly used in the column expression, since they are still indirectly required.
The following stored procedure demonstrates the above implementation method:
CREATE PROCEDURE ViewGetColumnDependencies
(
@viewName NVARCHAR(50)
)
AS
BEGIN
CREATE TABLE #_suppress_output
(
result NVARCHAR(500) NULL
);
DECLARE @viewTableColumnMapping TABLE
(
[ViewName] NVARCHAR(50),
[SourceObject] NVARCHAR(50),
[SourceObjectColumnName] NVARCHAR(50),
[ViewAliasColumn] NVARCHAR(50)
)
-- Get list of dependent tables and their associated columns required for the view.
INSERT INTO @viewTableColumnMapping
(
[ViewName]
,[SourceObject]
,[SourceObjectColumnName]
)
SELECT v.[name] AS [ViewName]
,'[' + OBJECT_NAME(d.referenced_major_id) + ']' AS [SourceObject]
,c.[name] AS [SourceObjectColumnName]
FROM sys.views v
LEFT OUTER JOIN sys.sql_dependencies d ON d.object_id = v.object_id
LEFT OUTER JOIN sys.columns c ON c.object_id = d.referenced_major_id AND c.column_id = d.referenced_minor_id
WHERE v.[name] = @viewName;
DECLARE @aliasColumn NVARCHAR(50);
-- Next, we enumerate all of the view's columns via a cursor.
DECLARE ViewColumnNameCursor CURSOR FOR
SELECT aliases.name AS [AliasName]
FROM sys.views v
LEFT OUTER JOIN sys.columns AS aliases on v.object_id = aliases.object_id -- c.column_id=aliases.column_id AND aliases.object_id = object_id('vTEST')
WHERE v.name = @viewName;
OPEN ViewColumnNameCursor
FETCH NEXT FROM ViewColumnNameCursor
INTO @aliasColumn
DECLARE @tql_create_proc NVARCHAR(MAX);
DECLARE @queryPlan XML;
WHILE @@FETCH_STATUS = 0
BEGIN
/*
For each view column, we create a temporary wrapper stored procedure that
only selects the single column in question from view. The stored procedure
will run the query in format only mode and will therefore not cause any
actual I/O operations on the database, but it will generate an estimated
execution plan when executed.
*/
SET @tql_create_proc = 'CREATE PROCEDURE ___WrapView
AS
SET FMTONLY ON;
SELECT CONVERT(NVARCHAR(MAX), [' + @aliasColumn + ']) FROM [' + @viewName + '];
SET FMTONLY OFF;';
EXEC (@tql_create_proc);
-- Execute the procedure to generate a query plan. The insert into the temp table is only done to
-- suppress the empty result set from being displayed as part of the output.
INSERT INTO #_suppress_output
EXEC ___WrapView;
-- Get the query plan for the wrapper procedure that was just executed.
SELECT @queryPlan = [qp].[query_plan]
FROM [sys].[dm_exec_procedure_stats] AS [ps]
JOIN [sys].[dm_exec_query_stats] AS [qs] ON [ps].[plan_handle] = [qs].[plan_handle]
CROSS APPLY [sys].[dm_exec_query_plan]([qs].[plan_handle]) AS [qp]
WHERE [ps].[database_id] = DB_ID() AND OBJECT_NAME([ps].[object_id], [ps].[database_id]) = '___WrapView'
-- Drop the wrapper procedure
DROP PROCEDURE ___WrapView
/*
After the query plan is generated, we query the output lists from the execution plan.
Since we know which view column was selected we can now associate the output list to
view column in question. We can further refine the association by only associating
columns that form part of our original dependency list, this will eliminate expression
outputs from the result set.
*/
;WITH QueryPlanOutputList AS
(
SELECT T.X.value('local-name(.)', 'NVARCHAR(max)') as Structure,
T.X.value('./@Table[1]', 'NVARCHAR(50)') as [SourceTable],
T.X.value('./@Column[1]', 'NVARCHAR(50)') as [SourceColumnName],
T.X.query('*') as SubNodes
FROM @queryPlan.nodes('*') as T(X)
UNION ALL
SELECT QueryPlanOutputList.structure + N'/' + T.X.value('local-name(.)', 'nvarchar(max)'),
T.X.value('./@Table[1]', 'NVARCHAR(50)') as [SourceTable],
T.X.value('./@Column[1]', 'NVARCHAR(50)') as [SourceColumnName],
T.X.query('*')
FROM QueryPlanOutputList
CROSS APPLY QueryPlanOutputList.SubNodes.nodes('*') as T(X)
)
UPDATE @viewTableColumnMapping
SET ViewAliasColumn = @aliasColumn
FROM @viewTableColumnMapping CM
INNER JOIN
(
SELECT DISTINCT QueryPlanOutputList.Structure
,QueryPlanOutputList.[SourceTable]
,QueryPlanOutputList.[SourceColumnName]
FROM QueryPlanOutputList
WHERE QueryPlanOutputList.Structure like '%/OutputList/ColumnReference'
) SourceColumns ON CM.[SourceObject] = SourceColumns.[SourceTable] AND CM.SourceObjectColumnName = SourceColumns.SourceColumnName
FETCH NEXT FROM ViewColumnNameCursor
INTO @aliasColumn
END
CLOSE ViewColumnNameCursor;
DEALLOCATE ViewColumnNameCursor;
DROP TABLE #_suppress_output
SELECT *
FROM @viewTableColumnMapping
ORDER BY [ViewAliasColumn]
END
The stored procedure can now be executed as follows:
EXEC dbo.ViewGetColumnDependencies @viewName = 'vTEST'

Trying to format SQL query results

Found this query here on Stack Overflow which I found very helpful to pull all table names and corresponding columns from a Microsoft SQL Server Enterprise Edition (64-bit) 10.50.4286 SP2 database.
SELECT o.Name, c.Name
FROM sys.columns c
JOIN sys.objects o ON o.object_id = c.object_id
WHERE o.type = 'U'
ORDER BY o.Name, c.Name
It produces a table with two columns like this; each row has the table name in column 01 and the corresponding column name in column 02:
What I really want however is something like this, one column for each table name and the tables columns listed below it like this:
I've already started doing this manually in Excel, but with over 5000 rows returned it would be really nice if there was a way to format the results in the query itself to look like this. Thanks in advance!

As everyone is telling you, this is an un-SQL-y thing to do. Your resultset will have an arbitrary number of columns (equal to the number of user tables in your database, which could be huge). Since the resultset must be rectangular, it will have as many rows as the maximum number of columns in any of your tables, so many of the values will be NULL.
That said, a straightforward dynamic PIVOT gets you what you want:
DECLARE @columns nvarchar(max);
DECLARE @sql nvarchar(max);
SET @columns = STUFF ( (
SELECT '],[' + t.name
FROM sys.tables t
WHERE t.type = 'U'
FOR XML PATH('') ), 1, 2, '')
+ ']';
SET @sql = '
SELECT ' + @columns + '
FROM
(
SELECT t.Name tName
, c.Name cName
, ROW_NUMBER() OVER (PARTITION BY t.Name ORDER BY c.Name) rn
FROM sys.columns c
JOIN sys.tables t ON t.object_id = c.object_id
WHERE t.type = ''U''
) raw
PIVOT (MAX(cName) FOR tName IN ( ' + @columns + ' ))
AS pvt;
';
EXECUTE(@sql);
This is what it produces on my master database:
spt_fallback_db     spt_fallback_dev    spt_fallback_usg    spt_monitor      MSreplication_options
------------------- ------------------- ------------------- ---------------- ----------------------
dbid                high                dbid                connections      install_failures
name                low                 lstart              cpu_busy         major_version
status              name                segmap              idle             minor_version
version             phyname             sizepg              io_busy          optname
xdttm_ins           status              vstart              lastrun          revision
xdttm_last_ins_upd  xdttm_ins           xdttm_ins           pack_errors      value
xfallback_dbid      xdttm_last_ins_upd  xdttm_last_ins_upd  pack_received    NULL
xserver_name        xfallback_drive     xfallback_vstart    pack_sent        NULL
NULL                xfallback_low       xserver_name        total_errors     NULL
NULL                xserver_name        NULL                total_read       NULL
NULL                NULL                NULL                total_write      NULL
(11 row(s) affected)
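The rectangular shape of that result (as many rows as the widest table, padded with NULL) is easy to see in a small client-side sketch. This is an illustration only, written in Python; the catalog dictionary and the function name are made up, and in practice the (table, column) pairs would come from the two-column query in the question.

```python
from itertools import zip_longest


def columns_side_by_side(table_columns):
    """Turn {table: [columns]} into (headers, rows) with one column per table.

    Shorter column lists are padded with None, mirroring the NULLs that the
    pivoted result set contains.
    """
    headers = sorted(table_columns)
    rows = list(zip_longest(*(sorted(table_columns[t]) for t in headers)))
    return headers, rows


# Hypothetical catalog contents.
catalog = {
    "orders": ["id", "total", "customer_id"],
    "customers": ["id", "name"],
}

headers, rows = columns_side_by_side(catalog)
print(headers)  # ['customers', 'orders']
for row in rows:
    print(row)
```

Sorting each list plays the role of ROW_NUMBER() OVER (PARTITION BY t.Name ORDER BY c.Name): it decides which column name lands in which row.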

It might be easiest to do for example something like this:
Build a comma separated list using for XML path, for example like this.
Then copy that result to excel and use data to columns to create separate columns from the items
Use copy + paste special -> transpose to turn rows into columns

Inject array of columns to COUNT in SELECT MSSQL - dynamically

Currently I have prepared a string of columns which can be added to (hence the need for the dynamic query).
I have @cols which can be printed with an output like "Color","Size","Width"
I then have a SELECT/COUNT statement which needs to look like as follows...
SELECT
Product_code,
count(distinct [Color]),
count(distinct [Size]),
count(distinct [Width])
FROM.....
I need each of the columns in my string of columns to be counted with DISTINCT.
It would be even better if I could add an AS with the name of each of these too!
Any help is much appreciated - my SQL is OK, but the dynamic bit turns me blue!
Cheers.

Convert your comma-separated list to a table first. See this
Assuming that table is named ListOfColumns:
DECLARE @Query VARCHAR(MAX) = '';
-- Append one COUNT(DISTINCT ...) per listed column; QUOTENAME guards unusual column names.
SELECT @Query += ', COUNT(DISTINCT ' + QUOTENAME(COLUMN_NAME) + ') AS ' + QUOTENAME(COLUMN_NAME)
FROM INFORMATION_SCHEMA.COLUMNS c
INNER JOIN ListOfColumns d ON c.COLUMN_NAME = d.ColName
WHERE TABLE_NAME = 'MyTable';
-- STUFF removes the leading ', ' (two characters).
SET @Query = 'SELECT ' + STUFF( @Query,1,2,'' ) + ' FROM MyTable';
EXEC ( @Query );
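For comparison, here is a sketch of the same string building done on the client side, including the Product_code grouping and the AS aliases from the question. It is only an illustration under assumptions: Python with an in-memory SQLite database stands in for SQL Server, the Products table and its rows are invented, and checking the requested names against the real column list is an added safety step, not something the answer above does.

```python
import sqlite3


def parse_cols(cols_text):
    """Parse a string like '"Color","Size","Width"' into a list of names."""
    return [part.strip().strip('"') for part in cols_text.split(",")]


def build_grouped_distinct_counts(table, group_col, count_cols, known_cols):
    """Build SELECT group_col, COUNT(DISTINCT c) AS c, ... GROUP BY group_col."""
    unknown = [c for c in count_cols if c not in known_cols]
    if unknown:
        # Refuse names that are not real columns instead of injecting them.
        raise ValueError("unknown columns: " + ", ".join(unknown))
    counts = ", ".join(
        'COUNT(DISTINCT "{0}") AS "{0}"'.format(c) for c in count_cols
    )
    return 'SELECT "{g}", {counts} FROM "{t}" GROUP BY "{g}" ORDER BY "{g}"'.format(
        g=group_col, counts=counts, t=table
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Products (Product_code TEXT, Color TEXT, Size TEXT, Width INTEGER)"
)
conn.executemany(
    "INSERT INTO Products VALUES (?, ?, ?, ?)",
    [
        ("A", "red", "S", 10),
        ("A", "blue", "S", 10),
        ("A", "red", "M", 12),
        ("B", "red", "L", 10),
    ],
)

known = [row[1] for row in conn.execute("PRAGMA table_info(Products)")]
cols = parse_cols('"Color","Size","Width"')
sql = build_grouped_distinct_counts("Products", "Product_code", cols, known)
result = conn.execute(sql).fetchall()
print(sql)
print(result)  # [('A', 2, 2, 2), ('B', 1, 1, 1)]
```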

Pivoting help needed

Please help me with this. I am totally stuck. I have coders block or something.
I have the following table
ID Name Cost Included
---- ---------- ------- ----------
1 Package1 10.00 Yes
2 Package2 20.00 No
3 Package3 20.00 Yes
I would like to crosstab this information so that it displays like the following example; there will be more columns in the table.
Type Package1 Package2 Package3
----- ------------ ----------- ----------
Name Package1 Package2 Package3
Cost 10.00 20.00 30.00
Included Yes No Yes

It seems to me that you are trying to build a product comparison list. If so, you might unpivot the table first and then join individual records together.
The 'transponded' part unpivots the columns. All columns must be of compatible types or be converted to one; I chose varchar(100). transponded returns a table with three columns: ID from ProductInfo, Type as the column name, and Value as the value of the corresponding column.
The select part joins together info on as many products as required by adding another left join transponded tn on t1.Type = tn.Type and tn.ID = @parametern. This part looks like a hassle, but when I tried to do it with pivot I failed to get the columns in the proper order, because pivot sorted the names in Type. Pivot would also require dynamic SQL generation. This solution is fixed, provided that you add enough joins for the maximum number of products you wish to compare at once. I believe that would not be more than 5.
The literals = 1, = 2 and = 3 should be replaced by parameters, and the query should be hosted in a stored procedure.
; with transponded as
(
select ID, Type, Value
from
(
select ID,
Name,
cast (Cost as varchar(100)) Cost,
cast (case when Included = 1 then 'Yes' else 'No' end as varchar(100)) Included
from ProductInfo
) p
unpivot (Value for Type in (Name, Cost, Included) ) a
)
select t1.Type,
t1.Value Product1,
t2.Value Product2,
t3.Value Product3
from transponded t1
left join transponded t2
on t1.Type = t2.Type
and t2.id = 2
left join transponded t3
on t1.Type = t3.Type
and t3.id = 3
where t1.id = 1
In short, transpose one record at a time and join it to another transposed record by the Type column.
Oh, and here is a Sql Fiddle playground.
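The unpivot-then-join idea can be sketched in a few lines of client code. This is only an illustration in Python, not the T-SQL above: the function names are made up, and the rows are the three packages from the question's input table (which lists 20.00 as the cost of Package3).

```python
def unpivot(rows, key, fields):
    """Turn each row into (key value, field name, field value as text) triples."""
    return [(row[key], field, str(row[field])) for row in rows for field in fields]


def crosstab(rows, key, fields, ids):
    """Return one output row per field, with one value column per requested id."""
    lookup = {(i, field): value for i, field, value in unpivot(rows, key, fields)}
    return [(field, *[lookup.get((i, field)) for i in ids]) for field in fields]


packages = [
    {"ID": 1, "Name": "Package1", "Cost": "10.00", "Included": "Yes"},
    {"ID": 2, "Name": "Package2", "Cost": "20.00", "Included": "No"},
    {"ID": 3, "Name": "Package3", "Cost": "20.00", "Included": "Yes"},
]

result = crosstab(packages, "ID", ["Name", "Cost", "Included"], [1, 2, 3])
for row in result:
    print(row)
# ('Name', 'Package1', 'Package2', 'Package3')
# ('Cost', '10.00', '20.00', '20.00')
# ('Included', 'Yes', 'No', 'Yes')
```

The fields list fixes the row order and the ids list fixes the column order, which is the ordering control that the answer says was hard to get from PIVOT. Converting every value to text mirrors the cast to varchar(100).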

There is no easy way to do this, as the pivot will need to be aggregated by column. Given that adding columns to the input table would cause a maintenance issue where these values will not be presented to the output until the code is changed wherever it is used, I'd say you're probably best doing it once with a stored procedure, which will dynamically generate the output you're looking for based on the schema of the input table.
I have demonstrated how this can be done, using the data you have supplied. This data is stored in a temp table (not #temp, because the stored proc won't work with temporary tables), populated thus:
CREATE TABLE temp (
_key int,
package_name varchar(50),
cost float,
included bit
)
INSERT INTO temp VALUES(1,'Package1', 10.00, 1)
INSERT INTO temp VALUES(2,'Package2', 20.00, 0)
INSERT INTO temp VALUES(3,'Package3', 20.00, 1)
The stored procedure retrieves a list of values based on the #pivot_field parameter, and uses these values as a column list to be inserted after the "Type" field. It then unions the pivot field and all other fields together to generate the rows, pivoting one column at a time. The procedure is as follows:
CREATE PROCEDURE usp_get_pivot (@table_name nvarchar(255), @pivot_field nvarchar(255)) AS
BEGIN
-- Collect the distinct pivot values; they become the output column names
CREATE TABLE #temp (val nvarchar(max))
DECLARE @sql NVARCHAR(MAX), @cols NVARCHAR(MAX), @col NVARCHAR(255)
SET @sql = 'SELECT DISTINCT ' + @pivot_field + ' FROM ' + @table_name
INSERT INTO #temp EXEC sp_executesql @sql;
-- Build a bracketed, comma-separated column list and trim the trailing comma
SET @cols = (SELECT '[' + val + '],' FROM #temp FOR XML PATH(''))
SET @cols = SUBSTRING(@cols, 1, LEN(@cols)-1)
-- First output row: the pivot field itself
SET @sql = N'SELECT ''' + @pivot_field + ''' as [type], *
FROM (SELECT ' + @pivot_field + ', ' + @pivot_field + ' as ' + @pivot_field + '1 FROM ' + @table_name + ') AS source_table
PIVOT (max(' + @pivot_field + '1) FOR ' + @pivot_field + ' IN (' + @cols + ')) AS pivot_table'
-- Walk the remaining columns in table order, appending one pivoted row per column
DECLARE csr CURSOR FOR
SELECT c.name FROM sys.columns c, sys.objects o
WHERE c.object_id = o.object_id AND o.name = @table_name
AND c.name <> @pivot_field
ORDER BY column_id
OPEN csr
FETCH NEXT FROM csr INTO @col
WHILE @@FETCH_STATUS = 0
BEGIN
-- Values are cast to VARCHAR so that every UNION ALL branch has a compatible type
SET @sql = @sql + ' UNION ALL
SELECT ''' + @col + ''' as [type], *
FROM (SELECT ' + @pivot_field + ', CAST(' + @col + ' AS VARCHAR) AS ' + @col + ' FROM ' + @table_name + ') AS source_table
PIVOT (max(' + @col + ') FOR ' + @pivot_field + ' IN (' + @cols + ')) AS pivot_table'
FETCH NEXT FROM csr INTO @col
END
CLOSE csr
DEALLOCATE csr
DROP TABLE #temp
-- Run the assembled statement
EXEC sp_executesql @sql
END
You should be able to copy and paste the procedure into Management Studio, create the data as shown above, and execute the procedure with:
EXEC usp_get_pivot 'temp', 'package_name'
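Because the procedure concatenates its parameters straight into dynamic SQL, it may be worth checking them first. The following is only a minimal sketch of such a check, not part of the procedure above; it assumes an unqualified table name (no schema prefix), and the error message text is made up for illustration.

```sql
-- Hypothetical pre-check for the two parameters (names match the procedure above).
DECLARE @table_name  NVARCHAR(255) = N'temp',
        @pivot_field NVARCHAR(255) = N'package_name';

-- OBJECT_ID(..., 'U') is NULL when no such user table exists;
-- COL_LENGTH is NULL when the column does not exist on that table.
IF OBJECT_ID(@table_name, 'U') IS NULL
   OR COL_LENGTH(@table_name, @pivot_field) IS NULL
BEGIN
    RAISERROR('Unknown table or column.', 16, 1);
    RETURN;
END

-- QUOTENAME brackets the identifiers so odd characters cannot break the statement.
-- Assumption: @table_name carries no schema prefix, since QUOTENAME would
-- turn 'dbo.temp' into the single identifier [dbo.temp].
DECLARE @sql NVARCHAR(MAX) =
    N'SELECT DISTINCT ' + QUOTENAME(@pivot_field) +
    N' FROM ' + QUOTENAME(@table_name);

EXEC sp_executesql @sql;
```

The same QUOTENAME calls could be applied wherever the procedure splices @pivot_field, @col, or @table_name into the generated statement.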

If the number of packages is not fixed, I don't think a plain PIVOT will work for you: the PIVOT clause can only produce a fixed, predefined set of columns.
You could rewrite the data table-to-table using multiple statements, but you would still be limited to a fixed number of columns.
You could, however, fix that number at, say, 10 and display up to 10 packages, leaving NULLs in the remaining columns when there are fewer.
You could also use dynamic SQL to get a dynamic number of columns, but it will be a headache.
If you're going to export this data to Excel, don't pivot it in SQL; transpose it in Excel instead (it's under "Paste Special").
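For the dynamic SQL route, a rough sketch might look like the following. It assumes the packages live in a table such as Administration.Package with Name and Price columns (an assumption made for illustration), and it pivots only the price row; treat it as a starting point rather than a finished solution.

```sql
DECLARE @cols NVARCHAR(MAX), @sql NVARCHAR(MAX);

-- Build a comma-separated list of bracket-quoted package names, e.g. [A],[B],[C].
-- STUFF removes the leading comma; .value() undoes any XML entity escaping.
SELECT @cols = STUFF((
    SELECT ',' + QUOTENAME(Name)
    FROM (SELECT DISTINCT Name FROM Administration.Package) AS n
    ORDER BY Name
    FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 1, '');

-- The column list is only known at run time, so the PIVOT must be assembled as a string.
SET @sql = N'SELECT ''Price'' AS [Type], ' + @cols + N'
FROM (SELECT Name, Price FROM Administration.Package) AS src
PIVOT (MAX(Price) FOR Name IN (' + @cols + N')) AS pvt;';

EXEC sp_executesql @sql;
```

If the table is empty, @cols and therefore @sql will be NULL, so a real implementation would want to guard against that before executing.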

Basically, what I have at this stage is the following.
SELECT [Type],
MAX(Beginner) AS [Beginner],
MAX(Intermediate) AS [Intermediate],
MAX(Advanced) AS [Advanced]
FROM
(
SELECT
'Name' AS TYPE,
CASE WHEN Name='Beginner' THEN Name END AS [Beginner],
CASE WHEN Name='Intermediate' THEN Name END AS [Intermediate],
CASE WHEN Name='Advanced' THEN Name END AS [Advanced]
FROM Administration.Package
UNION ALL
SELECT
'Price' AS TYPE,
CASE WHEN Name='Beginner' THEN CAST(Price AS VARCHAR) END AS [Beginner],
CASE WHEN Name='Intermediate' THEN CAST(Price AS VARCHAR) END AS [Intermediate],
CASE WHEN Name='Advanced' THEN CAST(Price AS VARCHAR) END AS [Advanced]
FROM Administration.Package
)A
GROUP BY [Type]
But it does not feel right to have the union for each and every column.
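One possible way to avoid a separate UNION ALL branch per column is to unpivot each row once with CROSS APPLY (VALUES ...) and then aggregate. This is a sketch of a different technique than the query above, not a drop-in replacement; the VARCHAR(50) length is an assumption.

```sql
-- Each Package row is expanded into one (Type, Val) pair per column of interest,
-- then conditional aggregation spreads the values across the package columns.
SELECT v.[Type],
       MAX(CASE WHEN p.Name = 'Beginner'     THEN v.Val END) AS [Beginner],
       MAX(CASE WHEN p.Name = 'Intermediate' THEN v.Val END) AS [Intermediate],
       MAX(CASE WHEN p.Name = 'Advanced'     THEN v.Val END) AS [Advanced]
FROM Administration.Package AS p
CROSS APPLY (VALUES
    ('Name',  CAST(p.Name  AS VARCHAR(50))),
    ('Price', CAST(p.Price AS VARCHAR(50)))
) AS v([Type], Val)
GROUP BY v.[Type];
```

Each additional column still needs its own line, but it becomes one more row in the VALUES list rather than another full SELECT, and the table is read only once.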
