Select tables where any data exists

I have access to a huge MSSQL DB. It has many tables, but most of them are empty. How do I query the DB schema to select the names of the tables that contain any rows? (I'd like to create an ERD only from the tables that have some data.) I did not find any related questions.

A quick but approximate query is the following; just check the RowCount column:
SELECT
TableName = t.NAME,
SchemaName = s.Name,
[RowCount] = p.rows,
TotalSpaceMB = CONVERT(DECIMAL(18,2), SUM(a.total_pages) * 8 / 1024.0),
UsedSpaceMB = CONVERT(DECIMAL(18,2), SUM(a.used_pages) * 8 / 1024.0),
UnusedSpaceMB = CONVERT(DECIMAL(18,2), (SUM(a.total_pages) - SUM(a.used_pages)) * 8 / 1024.0)
FROM
sys.tables t
INNER JOIN sys.indexes i ON t.OBJECT_ID = i.object_id
INNER JOIN sys.partitions p ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id
INNER JOIN sys.allocation_units a ON p.partition_id = a.container_id
LEFT OUTER JOIN sys.schemas s ON t.schema_id = s.schema_id
WHERE
t.NAME NOT LIKE 'dt%'
AND t.is_ms_shipped = 0
AND i.OBJECT_ID > 255
GROUP BY
t.Name,
s.Name,
p.Rows
ORDER BY
[RowCount] DESC
If you want the real count, you will have to issue a SELECT that generates a script of multiple SELECT COUNT(*) statements, probably combined with UNION ALL. It might take a long time to finish if the tables are being accessed concurrently or if they are very big.
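If all you need is the list of non-empty tables, the same metadata can be filtered directly. This is a minimal sketch, assuming the approximate counts in sys.partitions are good enough for building an ERD; the ApproxRows alias is only illustrative:

```sql
-- List user tables whose approximate row count is greater than zero.
SELECT s.[name]      AS SchemaName,
       t.[name]      AS TableName,
       SUM(p.[rows]) AS ApproxRows
FROM sys.tables t
     JOIN sys.schemas s ON t.schema_id = s.schema_id
     JOIN sys.partitions p ON t.object_id = p.object_id
WHERE p.index_id IN (0, 1) -- heap (0) or clustered index (1) only, so rows are not counted once per index
  AND t.is_ms_shipped = 0
GROUP BY s.[name],
         t.[name]
HAVING SUM(p.[rows]) > 0   -- partitioned tables have one row per partition, hence the SUM
ORDER BY ApproxRows DESC;
```

Restricting to index_id 0 or 1 avoids the duplicate rows that appear when every nonclustered index is joined in.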

If you do a real count then you can use a dynamic script to do this. Note that, as @Ezlo mentions, this will be (a lot) slower than the estimated counts using the sys.partitions object:
DECLARE @SQL nvarchar(MAX),
@CRLF nchar(2) = NCHAR(13) + NCHAR(10);
SET @SQL = STUFF((SELECT @CRLF +
N'UNION ALL' + @CRLF +
N'SELECT N' + QUOTENAME(s.[name],'''') + N' AS SchemaName,' + @CRLF +
N' N' + QUOTENAME(t.[name],'''') + N' AS TableName,' + @CRLF +
N' COUNT(*) AS TotalRows' + @CRLF +
N'FROM ' + QUOTENAME(s.[name]) + N'.' + QUOTENAME(t.[name])
FROM sys.schemas s
JOIN sys.tables t ON s.schema_id = t.schema_id
FOR XML PATH(''),TYPE).value('.','nvarchar(MAX)'),1,11,N'') + N';'
--SELECT @SQL; --To see the SQL if you want
EXEC sp_executesql @SQL;
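Since the question only asks whether any rows exist, an exact count may be more work than necessary. The following is a sketch of a variation that tests each table with EXISTS instead, so the engine can stop at the first row. It assumes SQL Server 2017 or later, because it uses STRING_AGG; the @Sep variable is a name introduced here for illustration:

```sql
DECLARE @SQL nvarchar(MAX),
        @Sep nvarchar(20) = NCHAR(13) + NCHAR(10) + N'UNION ALL' + NCHAR(13) + NCHAR(10);

-- Build one "SELECT ... WHERE EXISTS (...)" per table; each returns a row only if the table has data.
SELECT @SQL = STRING_AGG(
                  CONVERT(nvarchar(MAX),
                          N'SELECT N' + QUOTENAME(s.[name], '''') + N' AS SchemaName, N'
                        + QUOTENAME(t.[name], '''') + N' AS TableName WHERE EXISTS (SELECT 1 FROM '
                        + QUOTENAME(s.[name]) + N'.' + QUOTENAME(t.[name]) + N')'),
                  @Sep) -- separator is passed as a variable
FROM sys.schemas s
     JOIN sys.tables t ON s.schema_id = t.schema_id;

--PRINT @SQL; --To inspect the generated statement
IF @SQL IS NOT NULL -- NULL when the database has no user tables
    EXEC sys.sp_executesql @SQL;
```

The result contains one row per non-empty table and nothing for empty ones, which is the list the ERD needs.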

Related

Search for a value in all column and all tables of a database

I want to find the column that contains the value "Commerciale", but I do not know the column name or the table, so I need to search the whole database. How can I do that with a query?
I'm using SQL Server.

If you are looking for columns where the name is Commerciale then you can simply use the sys objects:
SELECT s.[name] AS SchemaName,
t.[name] AS TableName,
c.[name] AS ColumnName
FROM sys.schemas s
JOIN sys.tables t ON s.schema_id = t.schema_id
JOIN sys.columns c ON t.object_id = c.object_id
WHERE c.[name] = N'Commerciale';
If, however, you need to search the contents of the values in the rows, you'll need to use dynamic SQL. This will return a dataset for every table in your database which has at least 1 string type column, and will return any rows where one of those columns has the value 'Commerciale'. If the column only needs to contain the value, change the WHERE to use a LIKE in its clauses instead (note the query will be horrifically slow with that):
DECLARE @SQL nvarchar(MAX),
@CRLF nchar(2) = NCHAR(13) + NCHAR(10);
SET @SQL = STUFF((SELECT @CRLF +
N'SELECT N' + QUOTENAME(s.[name],'''') + N' AS SchemaName,' + @CRLF +
N' N' + QUOTENAME(t.[name],'''') + N' AS TableName,' + @CRLF +
N' *' + @CRLF +
N'FROM ' + QUOTENAME(s.[name]) + N'.' + QUOTENAME(t.[name]) + @CRLF +
N'WHERE ' +
--Conditions are joined with OR so a match in any one column returns the row.
--The leading @CRLF (2 characters) plus N'   OR ' (6 characters) is what STUFF(...,1,8,N'') strips.
STUFF((SELECT @CRLF +
N'   OR ' + QUOTENAME(c.[name]) + N' = ''Commerciale'''
FROM sys.columns c
JOIN sys.types ct ON c.system_type_id = ct.system_type_id
WHERE c.object_id = t.object_id
AND ct.[name] IN (N'char',N'varchar',N'nchar',N'nvarchar')
FOR XML PATH(''),TYPE).value('(./text())[1]','nvarchar(MAX)'),1,8,N'') + N';'
FROM sys.schemas s
JOIN sys.tables t ON s.schema_id = t.schema_id
FOR XML PATH(''),TYPE).value('(./text())[1]','nvarchar(MAX)'),1,2,N'');
--PRINT @SQL; --Your best friend
EXEC sp_executesql @SQL;
This won't tell you what column has the value, you'll need to use your own eyes to do that, but I wasn't entertaining writing a dynamic table dynamic pivot.

You can use the system tables to search the column names:
SELECT
c.name ColumnName
, t.name TableName
FROM sys.columns AS c
JOIN sys.tables AS t
ON c.object_id = t.object_id
WHERE c.name like '%Commerciale%'

How to use sp_MSforeachdb

I have code that returns all indexes with a fragmentation percentage greater than 30.
I want this code to run through all my databases and add the result set to a table I have called IndexesToRebuild.
Is there a way I can use sp_MSforeachdb to run this query throughout all databases and insert the result set into the IndexesToRebuild table?
Here is the code I have so far:
if(not exists(select 1 from Utility..dtlIndexesToRebuild))
begin
insert into utility..dtlIndexesToRebuild
select
DB_NAME(),
dbschemas.[name],
dbtables.[name],
dbindexes.[name],
indexstats.avg_fragmentation_in_percent
from
sys.dm_db_index_physical_stats (DB_ID(), null, null, null, null) as indexstats
inner join sys.tables dbtables on dbtables.[object_id] = indexstats.[object_id]
inner join sys.schemas dbschemas on dbtables.[schema_id] = dbschemas.[schema_id]
inner join sys.indexes as dbindexes on dbindexes.[object_id] = indexstats.[object_id]
and indexstats.index_id = dbindexes.index_id
where
indexstats.database_id = DB_ID()
and avg_fragmentation_in_percent > 30
end

sp_msforeachdb has some "features". For something as simple as this, it'll likely be easier to simply leverage some dynamic SQL:
USE master;
GO
DECLARE @SQL nvarchar(MAX),
@CRLF nchar(2) = NCHAR(13) + NCHAR(10);
SET @SQL = STUFF((SELECT @CRLF + @CRLF +
N'USE ' + QUOTENAME([name]) + N';' + @CRLF +
N'INSERT INTO utility.dbo.dtlIndexesToRebuild (DatabaseName, SchemaName, TableName, IndexName, Fragmentation)' + @CRLF + --Guessed names of your columns
N'SELECT DB_NAME(),' + @CRLF +
N' dbschemas.[name],' + @CRLF +
N' dbtables.[name],' + @CRLF +
N' dbindexes.[name],' + @CRLF +
N' indexstats.avg_fragmentation_in_percent' + @CRLF +
N'FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, NULL) indexstats' + @CRLF +
N' INNER JOIN sys.tables dbtables ON dbtables.[object_id] = indexstats.[object_id]' + @CRLF +
N' INNER JOIN sys.schemas dbschemas ON dbtables.[schema_id] = dbschemas.[schema_id]' + @CRLF +
N' INNER JOIN sys.indexes dbindexes ON dbindexes.[object_id] = indexstats.[object_id]' + @CRLF +
N' AND indexstats.index_id = dbindexes.index_id' + @CRLF +
N'WHERE indexstats.database_id = DB_ID()' + @CRLF +
N' AND avg_fragmentation_in_percent > 30;'
FROM sys.databases d
WHERE d.database_id > 4
--AND d.[name] != N'utility' --Don't know if you want to skip this
FOR XML PATH(N''),TYPE).value('.','nvarchar(MAX)'),1,4,N'');
--PRINT @SQL; --Your best friend. Use SELECT for over 4,000 characters.
EXEC sys.sp_executesql @SQL;
Your best friend will help you debug any errors, but I've assumed the statement you supplied is valid.

Select column names and top 1 records along, dynamically

I am trying to select the top 1 value of each column to get a glimpse of the data, based on the output of the query below (i.e., the equivalent of):
SELECT c.name FROM st.Name
This query retrieves the column names and their data types, along with the tables they are in. I am looking for an additional column that shows the top 1 record from each column.
SELECT
st.name 'Table Name',
c.name 'Column Name',
t.name 'Data Type'
FROM sys.columns c
INNER JOIN sys.types t ON c.user_type_id = t.user_type_id
LEFT OUTER JOIN sys.index_columns ic ON ic.object_id = c.object_id AND ic.column_id = c.column_id
LEFT OUTER JOIN sys.indexes i ON ic.object_id = i.object_id AND ic.index_id = i.index_id
LEFT OUTER JOIN sys.tables st ON st.object_id = i.object_id
I have been trying to use dynamic SQL, but because it puts the table name in single quotes as a string, it doesn't work; when I try to avoid that, it just displays the declared variable.
Any ideas are much appreciated. Thanks.

The way this works is that it basically creates a bunch of SELECTs like:
SELECT 'dbo' AS [Schema Name]
, 'Table1' AS [Table Name]
, 'Id' AS [Column Name]
, 'bigint' AS [Data Type]
, (SELECT TOP 1 CONVERT(NVARCHAR(MAX), Id) FROM [dbo].[Table1]) AS [Top 1 Value]
UNION ALL
-- Another table
Values are converted to NVARCHAR(MAX) because the column types in a UNION have to match, and I guess that's the best bet.
Here goes:
DECLARE @query NVARCHAR(MAX) = ''
SELECT @query +=
'SELECT ' + '''' + sch.name + '''' + ' AS [Schema Name],' + CHAR(13)+CHAR(10)
+ '''' + st.name + '''' + ' AS [Table Name],' + CHAR(13)+CHAR(10)
+ '''' + c.name + '''' + ' AS [Column Name],' + CHAR(13)+CHAR(10)
+ '''' + t.name + '''' + ' AS [Data Type],' + CHAR(13)+CHAR(10)
-- QUOTENAME the column too, so names with spaces or reserved words still work
+ '(SELECT TOP 1 CONVERT(NVARCHAR(MAX), ' + QUOTENAME(c.name) + ') FROM ' + QUOTENAME(sch.name) + '.' + QUOTENAME(st.name) + ') AS [Top 1 Value] ' + CHAR(13)+CHAR(10)
+ 'UNION ALL'+CHAR(13)+CHAR(10)
FROM sys.columns c
JOIN sys.types t ON c.user_type_id = t.user_type_id
JOIN sys.index_columns ic ON ic.object_id = c.object_id AND ic.column_id = c.column_id
JOIN sys.indexes i ON ic.object_id = i.object_id AND ic.index_id = i.index_id
JOIN sys.tables st ON st.object_id = i.object_id
JOIN sys.schemas sch ON sch.schema_id = st.schema_id
-- Get rid of trailing UNION ALL (the 'xx' stands in for the trailing CR + LF)
SET @query = LEFT(@query, LEN(@query) - LEN('UNION ALLxx'))
PRINT @query
EXEC sp_executesql @query
Consider running with TOP 10 or some such first to make sure it's producing the right results.
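To illustrate why every value is converted to NVARCHAR(MAX) before the UNION ALL, here is a small standalone sketch; the Val alias and the literal values are made up for the example:

```sql
-- Without a common type, SQL Server applies data type precedence:
-- int outranks varchar, so it tries to convert 'abc' to int and fails.
-- SELECT 1 AS Val UNION ALL SELECT 'abc';

-- Converting every branch to the same string type makes the UNION ALL valid.
SELECT CONVERT(nvarchar(MAX), 1) AS Val
UNION ALL
SELECT CONVERT(nvarchar(MAX), GETDATE())
UNION ALL
SELECT N'abc';
```

The cost is that numbers and dates come back as text, which is acceptable when the goal is only a glimpse of the data.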

You could use a WHILE loop with dynamic SQL:
--drop table #temp
SELECT
CONCAT(s.name,'.',st.name) 'Table Name',
c.name 'Column Name',
t.name 'Data Type',
CAST(null AS datetime) as IND,
cast('' AS varchar(max)) data
INTO #temp
FROM sys.columns c
INNER JOIN sys.types t ON c.user_type_id = t.user_type_id
LEFT OUTER JOIN sys.index_columns ic ON ic.object_id = c.object_id AND ic.column_id = c.column_id
LEFT OUTER JOIN sys.indexes i ON ic.object_id = i.object_id AND ic.index_id = i.index_id
INNER JOIN sys.tables st ON st.object_id = i.object_id
INNER JOIN sys.schemas s ON s.schema_id = st.schema_id
declare
@TableName varchar(255),
@ColumnName Varchar(255),
@sql varchar(max)
WHILE (SELECT count(*) FROM #temp where IND is null) > 0
begin
SELECT TOP 1
@TableName = [Table Name]
,@ColumnName = [Column Name]
FROM #temp
WHERE IND IS NULL
SET @sql =
'update #temp
set data = (SELECT top 1 [' + @ColumnName + '] from ' + @TableName + '),
IND = getdate()
where [Table Name] = ''' + @TableName + ''' and [Column Name] = ''' + @ColumnName + ''''
exec(@sql)
end
SELECT *
FROM #temp

SQL query optimization

I am trying to write a script that reports the data density of my DBs. I have already figured out the query and what I need, but the problem is that the query takes forever. It works fine for small DBs, but those are rare.
So I am looking for some kind of optimization, or any ideas that would help.
The script:
DECLARE Cur CURSOR
FOR
SELECT DB_Name() AS DatabaseName
,s.[name] AS SchemaName
,t.[name] AS TableName
,c.[name] AS ColumnName
,'[' + DB_Name() + ']' + '.[' + s.NAME + '].' + '[' + T.NAME + ']' AS FullQualifiedTableName
,d.[name] AS DataType
FROM sys.schemas s
INNER JOIN sys.tables t ON s.schema_id = t.schema_id
INNER JOIN sys.columns c ON t.object_id = c.object_id
INNER JOIN sys.types d ON c.user_type_id = d.user_type_id
WHERE d.NAME LIKE '%int%'
OR d.NAME LIKE '%float%'
OR d.NAME LIKE '%decimal%'
OR d.NAME LIKE '%numeric%'
OR d.NAME LIKE '%real%'
OR d.NAME LIKE '%money%'
OR d.NAME LIKE '%date%'
OR d.NAME LIKE '%datetime%'
AND is_identity = 0
OPEN Cur
FETCH NEXT
FROM Cur
INTO @DatabaseName
,@SchemaName
,@TableName
,@ColumnName
,@FullyQualifiedTableName
,@DataType
WHILE @@FETCH_STATUS = 0 -- The FETCH statement was successful.
BEGIN
DECLARE @SQL VARCHAR(MAX) = NULL
SET @SQL = ' Select ''' + @DatabaseName + ''' AS DatabaseName, ''' +
@SchemaName + ''' AS TableName,
''' + @TableName + ''' AS SchemaName,
''' + @ColumnName + ''' AS ColumnName,
''' + @DataType + ''' AS ColumnName,
(Select MAX(' + @ColumnName + ') from ' + @FullyQualifiedTableName + ' with (nolock))
AS MaxValue,
(Select MIN(' + @ColumnName + ') from ' + @FullyQualifiedTableName + ' with (nolock))
AS MinValue,
(Select COUNT(*) from ' + @FullyQualifiedTableName + ' with (nolock))
AS CountValue,
(Select COUNT(*) from ' + @FullyQualifiedTableName + ' Where ' + @ColumnName + ' IS NOT NULL )
AS NotNULLCount,
(Select 0 from ' + @FullyQualifiedTableName + ')
AS DataDensity'
PRINT @SQL
This script gives me the MAX, MIN, COUNT, NotNULLCount and the data density for each column of the types declared above. But you can imagine a DB with 70 tables where each table has 30-50 columns...
Running this script takes forever.

Try to avoid cursors where you can. The query below generates one SELECT statement per column, which you can copy, paste and run to get the data you require. I have also removed the subselects, as they are not required: each generated statement reads its table once instead of once per aggregate. The type filter is wrapped in parentheses so that is_identity = 0 applies to every type (AND binds tighter than OR), and NULLIF avoids a divide-by-zero error when a column contains only NULLs:
SELECT 'Select ''' + DB_Name() + ''' AS DatabaseName, ''' + s.Name + ''' AS SchemaName, ''' + t.Name + ''' AS TableName, ''' + c.Name + ''' AS ColumnName, ''' + d.Name + ''' AS DataType,' +
'MAX([' + c.Name + ']) AS MaxValue,' +
'MIN([' + c.Name + ']) AS MinValue,' +
'COUNT(*) AS CountValue,' +
'COUNT([' + c.Name + ']) AS NotNullCount,' +
'CAST(COUNT(DISTINCT [' + c.name + ']) AS float) / NULLIF(COUNT([' + C.Name + ']), 0) AS DataDensity ' +
'from [' + DB_Name() + '].[' + s.Name + '].[' + t.name + '] with (nolock)'
FROM sys.schemas s
INNER JOIN sys.tables t ON s.schema_id = t.schema_id
INNER JOIN sys.columns c ON t.object_id = c.object_id
INNER JOIN sys.types d ON c.user_type_id = d.user_type_id
WHERE (d.NAME LIKE '%int%'
OR d.NAME LIKE '%float%'
OR d.NAME LIKE '%decimal%'
OR d.NAME LIKE '%numeric%'
OR d.NAME LIKE '%real%'
OR d.NAME LIKE '%money%'
OR d.NAME LIKE '%date%'
OR d.NAME LIKE '%datetime%')
AND c.is_identity = 0
This will give you a list of select statements in the following form:
Select 'MyDB' AS DatabaseName, 'dbo' AS SchemaName, 'MyTable' AS TableName, 'ID' AS ColumnName, 'int' AS DataType,MAX([ID]) AS MaxValue,MIN([ID]) AS MinValue,COUNT(*) AS CountValue,COUNT([ID]) AS NotNullCount,CAST(COUNT(DISTINCT [ID]) AS float) / NULLIF(COUNT([ID]), 0) AS DataDensity from [MyDB].[dbo].[MyTable] with (nolock)
Of course, SQL Server stores these sorts of statistics for useful columns. You can find which ones it has by using:
EXEC SP_HELPSTATS 'MyTable', 'ALL'
Then using the list of statistics returned such as:
_WA_Sys_00000014_004FB3FB ID
to get the actual stats using:
DBCC SHOW_STATISTICS('MyTable','_WA_Sys_00000002_004FB3FB')
This will return data like:
Name | Updated | Rows | Rows Sampled | Steps | Density | Average key length | String Index | Filter Expression | Unfiltered Rows
_WA_Sys_00000002_004FB3FB | Jan 8 2017 8:01PM | 16535 | 16535 | 200 | 0.2493151 | 4.459389 | NO | NULL | 16535
and
All density | Average Length | Columns
0.0006038647 | 4.459389 | EffectiveDate
and another rowset showing a histogram of values.
You can automatically generate these DBCC commands using:
SELECT 'DBCC SHOW_STATISTICS([' + OBJECT_NAME(s.object_Id) + '],''' + s.Name + ''')'
FROM sys.stats s
INNER JOIN sys.stats_columns sc
ON s.object_id = sc.object_id AND s.stats_id = sc.stats_id
INNER JOIN sys.columns c
ON sc.object_id = c.object_id AND sc.column_id = c.column_id
WHERE s.Name LIKE '_WA%'
ORDER BY s.stats_id, sc.column_id;

dropping multiple tables ending with "1617"

I need to drop multiple tables ending with the string "1617".
I have come across massive procedures to do this, but is there an easy way?
My tables look like mytable1617 and I have loads of them.
DECLARE @sql NVARCHAR(MAX) = N'';
SELECT @sql += '
DROP TABLE '
+ QUOTENAME(s.name)
+ '.' + QUOTENAME(t.name) + ';'
FROM sys.tables AS t
INNER JOIN sys.schemas AS s
ON t.[schema_id] = s.[schema_id]
WHERE t.name LIKE '1617%';
PRINT @sql;
-- EXEC sp_executesql @sql;

This:
WHERE t.name LIKE '1617%';
is looking for tables starting with 1617. You wanted:
WHERE t.name LIKE '%1617';
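The difference between the two patterns can be checked without touching any real tables. This is a small sketch using made-up table names in a VALUES list:

```sql
-- '%' matches any run of characters, so its position decides what is anchored.
SELECT v.TableName,
       CASE WHEN v.TableName LIKE '1617%' THEN 1 ELSE 0 END AS StartsWith1617,
       CASE WHEN v.TableName LIKE '%1617' THEN 1 ELSE 0 END AS EndsWith1617
FROM (VALUES (N'mytable1617'),
             (N'1617mytable'),
             (N'my1617table')) AS v (TableName);
```

Only mytable1617 is flagged by the '%1617' pattern, which is the one the DROP TABLE script needs.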

Just change the search pattern:
DECLARE @sql NVARCHAR(MAX) = N'';
SELECT @sql += '
DROP TABLE '
+ QUOTENAME(s.name)
+ '.' + QUOTENAME(t.name) + ';'
FROM sys.tables AS t
INNER JOIN sys.schemas AS s
ON t.[schema_id] = s.[schema_id]
WHERE t.name LIKE '%1617'; --tables ending with 1617
PRINT @sql;
