MS SQL Server: how to check whether a table has an "id" column and count rows if "id" exists - sql

There are too many tables in my SQL Server db. Most of them have an 'id' column, but some do not. I want to know which tables don't have the 'id' column and, where an 'id' column exists, to count the rows where id is NULL. The query results may look like this:
TABLE_NAME | HAS_ID | ID_NULL_COUNT | ID_NOT_NULL_COUNT
table1 | false | 0 | 0
table2 | true | 10 | 100
How do I write this query?

Building query:
WITH cte AS (
SELECT t.*, has_id = CASE WHEN s.COLUMN_NAME IS NOT NULL THEN 'true' ELSE 'false' END
FROM INFORMATION_SCHEMA.TABLES t
OUTER APPLY (SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS c
WHERE t.TABLE_NAME = c.TABLE_NAME
AND t.[TABLE_SCHEMA] = c.[TABLE_SCHEMA]
AND c.COLUMN_NAME = 'id') s
WHERE t.TABLE_SCHEMA IN (...)
)
SELECT
query_to_run = REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
'SELECT tab_name = ''<tab_name>'',
has_id = ''<has_id>'',
id_null_count = <id_null_count>,
id_not_null_count = <id_not_null_count>
FROM <schema_name>.<tab_name>'
,'<tab_name>', TABLE_NAME)
,'<schema_name>', TABLE_SCHEMA)
,'<has_id>', has_id)
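-- COUNT(*) * 0 keeps the generated query an aggregate, so a table without id returns exactly one row
,'<id_null_count>', CASE WHEN has_id = 'false' THEN 'COUNT(*) * 0' ELSE 'COUNT(*) - COUNT(id)' END)
,'<id_not_null_count>', CASE WHEN has_id = 'false' THEN 'COUNT(*) * 0' ELSE 'COUNT(id)' END)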
FROM cte;
Copy the output and execute it in a separate window. UNION ALL can be added between the generated statements to get a single result set.
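The copy-and-paste step can also be automated. What follows is only a sketch under stated assumptions, not part of the original answer: it assumes SQL Server 2017 or later (for STRING_AGG), reads sys.tables, sys.schemas and sys.columns instead of INFORMATION_SCHEMA, and matches the column name 'id' using the database's collation. The variable name @sql is illustrative.

```sql
-- Sketch: build one UNION ALL statement covering every user table, then run it.
-- Assumes SQL Server 2017+ (STRING_AGG).
DECLARE @sql nvarchar(max);

SELECT @sql = STRING_AGG(
    CAST(
        N'SELECT table_name = N''' + REPLACE(s.name + N'.' + t.name, N'''', N'''''') + N''', '
      + CASE WHEN c.column_id IS NULL
             -- no id column: constants only and no FROM, so exactly one row comes back
             THEN N'has_id = ''false'', id_null_count = 0, id_not_null_count = 0'
             ELSE N'has_id = ''true'', '
                + N'id_null_count = COUNT(*) - COUNT(' + QUOTENAME(c.name) + N'), '
                + N'id_not_null_count = COUNT(' + QUOTENAME(c.name) + N') '
                + N'FROM ' + QUOTENAME(s.name) + N'.' + QUOTENAME(t.name)
        END
        AS nvarchar(max)),
    N' UNION ALL ')
FROM sys.tables AS t
JOIN sys.schemas AS s ON s.schema_id = t.schema_id
LEFT JOIN sys.columns AS c ON c.object_id = t.object_id AND c.name = N'id';

PRINT @sql;                    -- review the generated statement first (PRINT may truncate long text)
EXEC sys.sp_executesql @sql;
```

On a database with many large tables this scans every table that has an id column, so it may be worth restricting the sys.tables query to the schemas of interest first.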

This might be useful for you... lists out the row count for all tables that have an "id" column. It filters out tables that start with "sys" because those are mostly internal tables. If you have a table that starts with "sys", you'll probably want to delete that part of the WHERE clause.
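-- index_id 0 = heap, 1 = clustered index, so tables without a clustered index are counted too;
-- SUM adds up the partitions of a partitioned table
SELECT OBJECT_NAME(r.[object_id]) AS [TableName], SUM(r.[row_count]) AS [RowCount]
FROM sys.dm_db_partition_stats r
WHERE r.index_id IN (0, 1)
AND EXISTS (SELECT 1 FROM sys.columns c WHERE c.[object_id] = r.[object_id] AND c.[name] = N'id')
AND OBJECT_NAME(r.[object_id]) NOT LIKE 'sys%'
GROUP BY r.[object_id]
ORDER BY [TableName]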
Note you can change the "c.[name] = N'id'" to be any column name, or even change the "=" to "<>" to find only tables without an id column

pmbAustin's answer shows how to list all tables without an "ID" column.
To find out how many rows each table has, SQL Server has a built-in report.
Right-click the database in SSMS, then choose "Reports", "Standard Reports", "Disk Usage by Table".
You now know how many rows are in each table, and from pmbAustin's answer you know which tables do and do not have "ID" columns. With a simple VLOOKUP in Excel you can combine these two datasets to arrive at any answer you wish.
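If exporting to Excel is inconvenient, the two datasets can also be combined in a single query. This is a rough sketch rather than part of either answer: it assumes permission to read sys.dm_db_partition_stats (VIEW DATABASE STATE), and the row counts come from metadata, so they are approximate. The column aliases are illustrative.

```sql
-- Sketch: one row per user table with a has-id flag and an approximate row count.
SELECT s.name AS SchemaName,
       t.name AS TableName,
       CASE WHEN COUNT(c.column_id) > 0 THEN 'true' ELSE 'false' END AS HasId,
       SUM(p.row_count) AS ApproxRowCount
FROM sys.tables AS t
JOIN sys.schemas AS s
  ON s.schema_id = t.schema_id
JOIN sys.dm_db_partition_stats AS p
  ON p.object_id = t.object_id
 AND p.index_id IN (0, 1)          -- 0 = heap, 1 = clustered index
LEFT JOIN sys.columns AS c
  ON c.object_id = t.object_id
 AND c.name = N'id'                -- at most one match per table
GROUP BY s.name, t.name
ORDER BY s.name, t.name;
```

This does not split the count into NULL and non-NULL ids; that still requires querying each table.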

This will tell you which tables have, or do not have, a column named "ID":
SELECT TABLE_NAME
, CASE WHEN MAX(CASE WHEN COLUMN_NAME = 'ID' THEN 1 ELSE 0 END) = 1 THEN 'true'
ELSE 'false'
END AS HAS_ID
FROM INFORMATION_SCHEMA.COLUMNS
GROUP BY TABLE_NAME;
And here is one way to list all the tables that have a column named ID, and to count how many of those ID values are NULL or not NULL:
CREATE TABLE #AllIDSNullable (TABLE_NAME NVARCHAR(256) NOT NULL
, HAS_ID VARCHAR(10)
, ID_NULL_COUNT INT DEFAULT 0
, ID_NOT_NULL_COUNT INT DEFAULT 0);
-- Start with one row per table; HAS_ID is 'true' only when a column named ID exists.
INSERT #AllIDSNullable (TABLE_NAME, HAS_ID)
SELECT TABLE_NAME, CASE WHEN MAX(CASE WHEN COLUMN_NAME = 'ID' THEN 1 ELSE 0 END) = 1 THEN 'true' ELSE 'false' END
FROM INFORMATION_SCHEMA.COLUMNS
GROUP BY TABLE_NAME;
-- Loop only over the tables that have an ID column and fill in their counts.
DECLARE CT CURSOR FOR
SELECT TABLE_NAME
FROM #AllIDSNullable
WHERE HAS_ID = 'true';
DECLARE @name NVARCHAR(256), @SQL NVARCHAR(MAX);
OPEN CT; FETCH NEXT FROM CT INTO @name;
WHILE @@FETCH_STATUS = 0 BEGIN
SET @SQL = 'UPDATE #AllIDSNullable SET ID_NULL_COUNT = (SELECT COUNT(*) FROM ' + QUOTENAME(@name) + ' WHERE ID IS NULL), ID_NOT_NULL_COUNT = (SELECT COUNT(*) FROM ' + QUOTENAME(@name) + ' WHERE ID IS NOT NULL) WHERE TABLE_NAME = ''' + REPLACE(@name, '''', '''''') + ''';';
EXEC (@SQL);
FETCH NEXT FROM CT INTO @name;
END;
CLOSE CT;
DEALLOCATE CT;
SELECT *
FROM #AllIDSNullable;

Related

Insert column names into stored procedure SELECT statement from separate table

I am trying to figure out the best method to insert column names held in Table1 into a SELECT statement running against Table2. This query is running in a stored procedure. That doesn't do a very good job of explaining, so let's say I had these values in Table1:
What I am trying to do is use these column names in the SELECT statement against Table2:
Select -- Column Names
from Table2
where UserId = 3;
I'm not sure if an input parameter could be used in that way or how to pass the values into it. For example:
Select @ColumnNames
from Table2
where UserId = 3;
Or maybe a join to table 2?
Thanks!
You will have to use Dynamic SQL
declare @columns varchar(1000)
declare @sql varchar(8000)
select @columns='', @sql=''
select @columns=@columns+value+',' from table1
set @columns=left(@columns,len(@columns)-1)
set @sql='select '+@columns+' from table2'
exec(@sql)
But beware of SQL Injection and read www.sommarskog.se/dynamic_sql.html
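One way to reduce the injection risk is sketched below. It is an illustration under assumptions, not the answerer's code: it assumes Table1 stores the names in a column called value (as in the snippet above), that Table2 lives in the dbo schema, and that the filter is the UserId = 3 from the question. Only names that are real columns of Table2 are kept, each is wrapped with QUOTENAME, and the UserId value is passed as a parameter through sp_executesql.

```sql
-- Sketch: build the column list only from names that exist on dbo.Table2.
DECLARE @columns nvarchar(max), @sql nvarchar(max);

SELECT @columns = COALESCE(@columns + N', ', N'') + QUOTENAME(c.name)
FROM Table1 AS t1
JOIN sys.columns AS c
  ON c.object_id = OBJECT_ID(N'dbo.Table2')
 AND c.name COLLATE DATABASE_DEFAULT = t1.value COLLATE DATABASE_DEFAULT;

IF @columns IS NOT NULL
BEGIN
    -- The value is a parameter; only validated, quoted identifiers are concatenated.
    SET @sql = N'SELECT ' + @columns + N' FROM dbo.Table2 WHERE UserId = @UserId;';
    EXEC sys.sp_executesql @sql, N'@UserId int', @UserId = 3;
END;
```

Any value in Table1 that is not a column of Table2 is silently dropped by the join, which is usually what you want for untrusted input.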
You could query the system tables to get the column(s) i.e. (take out WHERE clause to see all the tables and columns)
SELECT tab.name AS TableName,
col.name AS ColName,
tp.name AS SType,
col.max_length,
col.[precision],
(CASE col.is_nullable
WHEN 1 THEN 'true'
WHEN 0 THEN 'false'
ELSE 'unknown'
END) AS Is_Nullable
FROM sys.tables as tab
LEFT OUTER JOIN sys.columns AS col
ON tab.object_id = col.object_id
LEFT OUTER JOIN sys.types AS tp
ON col.user_type_id = tp.user_type_id
WHERE tab.name = 'Table1'
ORDER BY tab.name,col.name

Looping through different tables of different dates

We have a legacy application which created multiple tables with the following naming convention: table_20140618, table_20140623, etc., where the date is when the program ran. I am trying to clean up the database now, and drop some of these tables.
In each table there are two fields: DateStarted and DateFinished. I want to select the tables (and then drop them) where DateStarted has value and DateFinished is NOT null.
At the moment I am using the following query to select all the tables that start with 'table_':
Select (TABLE_NAME) FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_TYPE = 'BASE TABLE'
AND TABLE_NAME LIKE 'table_%';
I am not sure how to get all the tables together by searching within their fields. I could do it through the code, but that should mean multiple hits on the database. Any ideas?
Made this after my first comment above, but you should be able to alter the code to fit your specs. Basically, this will use dynamic SQL to generate the commands based on your filters and conditions. So you can use whatever conditions you want in the SELECT @SQL = ... part, to check for the dates, and then add the table name when the conditions are met.
The script returns a list with tablenames and the drop command, so you can check what you're doing before you do it. But from there you can just copy the drop command list and execute it if you want.
IF OBJECT_ID('tempdb..#TABLES') IS NOT NULL
DROP TABLE #TABLES
CREATE TABLE #TABLES (ROWNMBER INT IDENTITY(1,1), TABLENAME VARCHAR(256) COLLATE DATABASE_DEFAULT)
/*
-- Old code to fetch ALL tables with specified name
INSERT INTO #TABLES
SELECT name
FROM sys.tables
WHERE name LIKE 'table[_]%'
*/
-- Updated code to fetch only those tables which contain the DateStarted and DateFinished columns
INSERT INTO #TABLES
SELECT TAB.name
FROM sys.tables TAB
LEFT JOIN sys.columns C1 on C1.object_id = TAB.object_id
AND C1.name = 'DateStarted'
LEFT JOIN sys.columns C2 on C2.object_id = TAB.object_id
AND C2.name = 'DateFinished'
WHERE TAB.name LIKE 'table[_]%'
AND C1.name IS NOT NULL AND C2.name IS NOT NULL
IF OBJECT_ID('tempdb..#DROPPABLE_TABLES') IS NOT NULL
DROP TABLE #DROPPABLE_TABLES
CREATE TABLE #DROPPABLE_TABLES (TABLENAME VARCHAR(256) COLLATE DATABASE_DEFAULT)
DECLARE @ROW_NOW INT, @ROW_MAX INT, @SQL VARCHAR(MAX), @TABLENAME VARCHAR(256)
SELECT @ROW_NOW = MIN(ROWNMBER), @ROW_MAX = MAX(ROWNMBER) FROM #TABLES
WHILE @ROW_NOW <= @ROW_MAX
BEGIN
SELECT @TABLENAME = TABLENAME FROM #TABLES WHERE ROWNMBER = @ROW_NOW
SELECT @SQL =
'IF (SELECT COUNT(*) FROM '+@TABLENAME+' WHERE DateStarted IS NOT NULL) > 0
AND (SELECT COUNT(*) FROM '+@TABLENAME+' WHERE DateFinished IS NOT NULL) > 0
SELECT '''+@TABLENAME+''''
INSERT INTO #DROPPABLE_TABLES
EXEC(@SQL)
SET @ROW_NOW = @ROW_NOW+1
END
SELECT *, 'DROP TABLE '+TABLENAME DROPCOMMAND FROM #DROPPABLE_TABLES
EDIT:
As per your comment, it seems not all such tables have those columns. You can use the following script to identify said tables and which column is missing, so you can check into them further. And you can use the same idea to filter the results of the first query to only count in tables which have those columns.
SELECT TAB.name TABLENAME
, CASE WHEN C1.name IS NULL THEN 'Missing' ELSE '' END DateStarted_COL
, CASE WHEN C2.name IS NULL THEN 'Missing' ELSE '' END DateFinished_COL
FROM sys.tables TAB
LEFT JOIN sys.columns C1 on C1.object_id = TAB.object_id
AND C1.name = 'DateStarted'
LEFT JOIN sys.columns C2 on C2.object_id = TAB.object_id
AND C2.name = 'DateFinished'
WHERE TAB.name LIKE 'table[_]%'
AND (C1.name IS NULL
OR C2.name IS NULL)

How to access the column name of table using while loop

I have the table below (#Temp):
RowNo Item
1 A
2 B
My requirement is: if Item equals B, do an action.
declare @count int = 1
WHILE(@count <= (select count(*) from #Temp))
Begin
-- Here I have to access my column (Item), so that I can compare its value to B
set @count = @count + 1
End
Please suggest
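You can use DESCRIBE (MySQL syntax; SQL Server has no DESCRIBE statement):
DESCRIBE my_table;
Or in newer versions you can use INFORMATION_SCHEMA (also available in SQL Server, where TABLE_SCHEMA is the schema name, such as dbo, rather than the database name):
SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_SCHEMA = 'my_database' AND TABLE_NAME = 'my_table';
Or you can use SHOW COLUMNS (again MySQL only):
SHOW COLUMNS FROM my_table;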
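For the loop in the question itself, reading the Item value of the current row inside the WHILE loop might look like the sketch below. It recreates the two-row #Temp table from the question so that it runs on its own; the variable names @max and @item are illustrative, and it assumes RowNo identifies each row.

```sql
-- Sketch: walk #Temp by RowNo and act when Item is 'B'.
CREATE TABLE #Temp (RowNo int PRIMARY KEY, Item varchar(10));
INSERT INTO #Temp (RowNo, Item) VALUES (1, 'A'), (2, 'B');

DECLARE @count int = 1,
        @max   int,
        @item  varchar(10);

SELECT @max = MAX(RowNo) FROM #Temp;

WHILE @count <= @max
BEGIN
    SET @item = NULL;  -- reset so a gap in RowNo does not reuse the previous value
    SELECT @item = Item FROM #Temp WHERE RowNo = @count;

    IF @item = 'B'
        PRINT 'Row ' + CAST(@count AS varchar(11)) + ' is B: do the action here';

    SET @count = @count + 1;
END;

DROP TABLE #Temp;
```

With the two sample rows, the message is printed once, for row 2. If the action can be expressed as a single statement, a set-based query with WHERE Item = 'B' avoids the loop entirely.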

Count number of rows across multiple tables in one query

I have a SQL Server 2005 database that stores data for multiple users. Each table that contains user-owned data has a column called OwnerID that identifies the owner; most but not all tables have this column.
I want to be able to count number of rows 'owned' by a user in each table. In other words, I want a query that returns the names of each table that contains an OwnerID column, and counts the number of rows in each table that match a given OwnerID value.
I can return just the names of the matching tables using this query:
SELECT OBJECT_NAME(object_id) [Table] FROM sys.columns
WHERE name = 'OwnerID' ORDER BY OBJECT_NAME(object_id);
That query returns a list of table names like this:
+---------+
| Table |
+---------+
| Alpha |
| Beta |
| Gamma |
| ... |
+---------+
But is it possible to write a query that can also count the number of rows in each table that match a given OwnerID? ie:
+---------+------------+
| Table | RowCount |
+---------+------------+
| Alpha | 2042 |
| Beta | 49 |
| Gamma | 740 |
| ... | ... |
+---------+------------+
Note: The list of table names needs to be returned dynamically, it is not suitable to hard-code table names into this query.
Edit: the answer...
(I can't edit your answers yet but I can edit my own question so I'm putting it here...)
Damien_The_Unbeliever had essentially the correct answer, but SQL Server doesn't allow string concatenation in an exec statement so I had to set the query prior to the exec statement. The final query is as follows:
DECLARE @OwnerID int;
SET @OwnerID = 1;
DECLARE @ForEachSQL varchar(100);
SET @ForEachSQL = 'INSERT INTO #t(TableName,RowsOwned) SELECT ''?'', COUNT(*) FROM ? WHERE OwnerID = ' + CONVERT(varchar(11), @OwnerID);
CREATE TABLE #t(TableName sysname, RowsOwned int);
EXEC sp_MSforeachtable @ForEachSQL,
@whereAnd = 'AND o.id IN (SELECT id FROM syscolumns where name=''OwnerID'')';
SELECT * FROM #t ORDER BY TableName;
DROP TABLE #t;
You can use sp_MSForeachtable, and the @whereand parameter, to specify a filter so you're only working against tables with an OwnerID column. Create a temp table, and populate that for each matching table. Something like:
create table #t(tablename sysname,Cnt int)
exec sp_MSforeachtable 'insert into #t(tablename,Cnt) select ''?'',COUNT(*) from ?',@whereAnd='and o.id in (select id from syscolumns where name=''OwnerID'')'
select * from #t
Two major caveats to mention - first is that sp_MSforeachtable is "undocumented", so you use it at your own risk - it could be suddenly removed from SQL Server by any kind of servicing, or in the next release.
The second is that, having a dynamic schema is usually a sign that something else has gone wrong in modelling - possibly attribute splitting (where sales for January and February are given different tables, even though they're logically the same thing and should appear in the same table, with possibly an additional column to distinguish them)
And, of course, you wanted to filter on a particular OwnerID, so the query would be more like:
'insert into #t(tablename,Cnt) select ''?'',COUNT(*) from ? where OwnerID=' + CONVERT(varchar(11), @OwnerID)
(Assuming @OwnerID is the owner sought, and is an int)
This would get the info from sysindexes. It can be slightly out of date but will give you a rough count
SELECT
[TableName] = so.name,
[RowCount] = MAX(si.rows)
FROM
sysobjects so,
sysindexes si
WHERE
so.xtype = 'U'
AND
si.id = OBJECT_ID(so.name)
GROUP BY
so.name
ORDER BY
2 DESC
If you needed it to be 100% right then you could use the undocumented feature sp_MSForEachTable
DECLARE @SQL VARCHAR(255)
SET @SQL = 'DBCC UPDATEUSAGE (' + DB_NAME() + ')'
EXEC(@SQL)
CREATE TABLE #foo
(
tablename VARCHAR(255),
rc INT
)
INSERT #foo
EXEC sp_msForEachTable
'SELECT PARSENAME(''?'', 1),
COUNT(*) FROM ?'
SELECT tablename, rc
FROM #foo
ORDER BY rc DESC
DROP TABLE #foo
You can use this:
DECLARE @nSQL NVARCHAR(MAX)
SELECT @nSQL = COALESCE(@nSQL + 'UNION ALL ' + CHAR(10), '')
+ 'SELECT ''' + TABLE_NAME + ''' AS TableName, COUNT(*) FROM ' + QUOTENAME(TABLE_NAME) + CHAR(10)
FROM INFORMATION_SCHEMA.COLUMNS
WHERE COLUMN_NAME = 'OwnerID'
-- This will PRINT out the dynamically generated SQL statement. Just replace this with EXECUTE(@nSQL) when you are happy to run it.
PRINT @nSQL
Update: To search for a specific OwnerId:
DECLARE @nSQL NVARCHAR(MAX)
DECLARE @OwnerId INTEGER
SET @OwnerId = 1
SELECT @nSQL = COALESCE(@nSQL + 'UNION ALL ' + CHAR(10), '')
+ 'SELECT ''' + TABLE_NAME + ''' AS TableName, COUNT(*) FROM ' + QUOTENAME(TABLE_NAME) + ' WHERE OwnerId = @OwnerId' + CHAR(10)
FROM INFORMATION_SCHEMA.COLUMNS
WHERE COLUMN_NAME = 'OwnerID'
EXECUTE sp_executesql @nSQL, N'@OwnerId INTEGER', @OwnerId
SELECT
O.ID,
O.NAME,
I.ROWCNT
FROM SYSOBJECTS O
INNER JOIN SYSINDEXES I
ON O.ID = I.ID
WHERE O.UID = 5
AND O.XTYPE = 'U'
AND I.STATUS = 0
Try this query; it returns the table id, the table name, and the number of rows for each table.
UID = 5 restricts the check to the schema whose id is 5. You can look up a schema id using SELECT SCHEMA_ID('<schema name>');
XTYPE = 'U' means user-defined tables only.

SQL Query to check if 40 columns in table is null

How do I select the columns in a table that contain only NULL values for all the rows?
Suppose the table has 100 columns, and 60 of them contain only NULL values.
How can I write a WHERE condition to check whether those 60 columns are NULL?
maybe with a COALESCE
SELECT * FROM table WHERE coalesce(col1, col2, col3, ..., colN) IS NULL
where c1 is null and c2 is null ... and c60 is null
shortcut using string concatenation (Oracle syntax):
where c1||c2||c3 ... c59||c60 is null
First of all, if you have a table that has so many nulls and you use SQL Server 2008 - you might want to define the table using sparse columns (http://msdn.microsoft.com/en-us/library/cc280604.aspx).
Secondly, I am not sure that coalesce solves what the question asks - it seems like Ammu might actually want to find the list of columns that are null for all rows, but I might have misunderstood. Nevertheless - it is an interesting question, so I wrote a procedure to list null columns for any given table:
IF (OBJECT_ID(N'PrintNullColumns') IS NOT NULL)
DROP PROC dbo.PrintNullColumns;
go
CREATE PROC dbo.PrintNullColumns(@tablename sysname)
AS
BEGIN
SET NOCOUNT ON;
DECLARE @query nvarchar(max);
DECLARE @column sysname;
DECLARE columns_cursor CURSOR FOR
SELECT c.name
FROM sys.tables t JOIN sys.columns c ON t.object_id = c.object_id
WHERE t.name = @tablename AND c.is_nullable = 1;
OPEN columns_cursor;
FETCH NEXT FROM columns_cursor INTO @column;
WHILE (@@FETCH_STATUS = 0)
BEGIN
SET @query = N'
DECLARE @c int
SELECT @c = COUNT(*) FROM ' + @tablename + ' WHERE ' + @column + N' IS NOT NULL
IF (@c = 0)
PRINT (''' + @column + N''');'
EXEC (@query);
FETCH NEXT FROM columns_cursor INTO @column;
END
CLOSE columns_cursor;
DEALLOCATE columns_cursor;
SET NOCOUNT OFF;
RETURN;
END;
go
If you don't want to write out the column names, you can do something like this.
This will show you all the rows where every column is null, ignoring the columns you specify (IgnoreThisColumn1 & IgnoreThisColumn2).
DECLARE @query NVARCHAR(MAX);
SELECT @query = ISNULL(@query+', ','') + [name]
FROM sys.columns
WHERE object_id = OBJECT_ID('yourTableName')
AND [name] != 'IgnoreThisColumn1'
AND [name] != 'IgnoreThisColumn2';
SET @query = N'SELECT * FROM TmpTable WHERE COALESCE('+ @query +') IS NULL';
EXECUTE(@query)
If you don't want rows when all the columns are null except for the columns you specified, you can simply use IS NOT NULL instead of IS NULL
SET @query = N'SELECT * FROM TmpTable WHERE COALESCE('+ @query +') IS NOT NULL';
Are you trying to find out if a specific set of 60 columns are null, or do you just want to find out if any 60 out of the 100 columns are null (not necessarily the same 60 for each row?)
If it is the latter, one way to do it in Oracle would be to use the nvl2 function, like so:
select ... where (nvl2(col1,0,1)+nvl2(col2,0,1)+...+nvl2(col100,0,1) > 59)
A quick test of this idea:
select 'dummy' from dual where nvl2('somevalue',0,1) + nvl2(null,0,1) > 1
Returns 0 rows while:
select 'dummy' from dual where nvl2(null,0,1) + nvl2(null,0,1) > 1
Returns 1 row as expected since more than one of the columns are null.
It would help to know which db you are using and perhaps which language or db framework if using one.
This should work though on any database.
Something like this would probably be a good stored procedure, since there are no input parameters for it.
select count(*) from table where col1 is null or col2 is null ...
Here is another method that seems logical to me as well (NVL2 is available in Netezza and Oracle; T-SQL has no NVL2, so use a CASE expression there):
SELECT KeyColumn, MAX(NVL2(TEST_COLUMN,1,0)) AS TEST_COLUMN
FROM TABLE1
GROUP BY KeyColumn
So every TEST_COLUMN that has MAX value of 0 is a column that contains all nulls for the record set. The function NVL2 is saying if the column data is not null return a 1, but if it is null then return a 0.
Taking the MAX of that column will reveal if any of the rows are not null. A value of 1 means that there is at least 1 row that has data. Zero (0) means that each row is null.
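Since this page is mostly about SQL Server, here is how the same MAX idea could be written in T-SQL with a CASE expression in place of NVL2. It is a small self-contained sketch; the #Demo table and its sample rows are made up for illustration.

```sql
-- Sketch: 1 if a group has at least one non-NULL value, 0 if every row is NULL.
CREATE TABLE #Demo (KeyColumn int, TEST_COLUMN varchar(10) NULL);
INSERT INTO #Demo (KeyColumn, TEST_COLUMN)
VALUES (1, NULL), (1, NULL), (2, 'x'), (2, NULL);

SELECT KeyColumn,
       MAX(CASE WHEN TEST_COLUMN IS NOT NULL THEN 1 ELSE 0 END) AS TEST_COLUMN
FROM #Demo
GROUP BY KeyColumn
ORDER BY KeyColumn;
-- KeyColumn 1 -> 0 (every row is NULL); KeyColumn 2 -> 1 (at least one value)

DROP TABLE #Demo;
```

Dropping the GROUP BY and KeyColumn gives a single 0 or 1 for the whole table, which answers the "is this column entirely NULL" form of the question.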
I use the query below when I have to check multiple columns for NULLs. I hope this is helpful. If the SUM comes to a value other than zero, then you have NULLs in that column.
select SUM (CASE WHEN col1 is null then 1 else 0 end) as null_col1,
SUM (CASE WHEN col2 is null then 1 else 0 end) as null_col2,
SUM (CASE WHEN col3 is null then 1 else 0 end) as null_col3, ....
.
.
.
from tablename
In Oracle, you can use:
select NUM_NULLS , COLUMN_NAME from all_tab_cols where table_name = 'ABC' and COLUMN_NAME in ('PQR','XYZ');