How can I find potential not null columns? - sql

I'm working with a SQL Server database which is very light on constraints and want to apply some not null constraints. Is there any way to scan all nullable columns in the database and select which ones do not contain any nulls or even better count the number of null values?

Perhaps with a little dynamic SQL
Example
Declare #SQL varchar(max) = '>>>'
Select #SQL = #SQL
+ 'Union All Select TableName='''+quotename(Table_Schema)+'.'+quotename(Table_Name)+''''
+',ColumnName='''+quotename(Column_Name)+''''
+',NullValues=count(*)'
+' From '+quotename(Table_Schema)+'.'+quotename(Table_Name)
+' Where '+quotename(Column_Name)+' is null '
From INFORMATION_SCHEMA.COLUMNS
Where Is_Nullable='YES'
Select #SQL='Select * from (' + replace(#SQL,'>>>Union All ','') + ') A Where NullValues>0'
Exec(#SQL)
Returns (for example)
TableName ColumnName NullValues
[dbo].[OD-Map] [Map-Val2] 185
[dbo].[OD-Map] [Map-Val3] 225
[dbo].[OD-Map] [Map-Val4] 225
For all table/columns with counts >= 0
...
Select #SQL=replace(#SQL,'>>>Union All ','')
Exec(#SQL)

Check this query. This was originally written by Linda Lawton
Original Article: https://www.daimto.com/sql-server-finding-columns-with-null-values
Finding columns with null values in your Database - Find Nulls Script
set nocount on
declare #columnName nvarchar(500)
declare #tableName nvarchar(500)
declare #select nvarchar(500)
declare #sql nvarchar(500)
-- check if the Temp table already exists
if OBJECT_ID('tempdb..#LocalTempTable') is null
Begin
CREATE TABLE #LocalTempTable(
TableName varchar(150),
ColumnName varchar(150))
end
else
begin
truncate table #LocalTempTable;
end
-- Build a select for each of the columns in the database. That checks for nulls
DECLARE check_cursor CURSOR FOR
select column_name, table_name, concat(' Select ''',column_name,''',''',table_name,''' from ',table_name,' where [',COLUMN_NAME,'] is null')
from INFORMATION_SCHEMA.COLUMNS
OPEN check_cursor
FETCH NEXT FROM check_cursor
INTO #columnName, #tableName,#select
WHILE ##FETCH_STATUS = 0
BEGIN
-- Insert it if there if it exists.
set #sql = 'insert into #LocalTempTable (ColumnName, TableName)' + #select
print #sql
-- Run the statment
exec( #sql)
FETCH NEXT FROM check_cursor
INTO #columnName, #tableName,#select
end
CLOSE check_cursor;
DEALLOCATE check_cursor;
SELECT TableName, ColumnName, COUNT(TableName) 'Count'
FROM #LocalTempTable
GROUP BY TableName, ColumnName
ORDER BY TableName
The query result would be something like this.

This will tell you which columns in your database are currently NULLABLE.
USE <Your_DB_Name>
GO
SELECT o.name AS Table_Name
, c.name AS Column_Name
FROM sys.objects o
INNER JOIN sys.columns c ON o.object_id = c.object_id
AND c.is_nullable = 1 /* 1 = NULL, 0 = NOT NULL */
WHERE o.type_desc = 'USER_TABLE'
AND o.type NOT IN ('PK','F','D') /* NOT Primary, Foreign of Default Key */

Yes, it is fairly straight forward. Note: if the table contains a lot of records, I suggest using SELECT TOP 1000 *, instead of SELECT *.
-- Identify records where a specific column is NOT NULL
SELECT *
FROM TableName
WHERE ColumNName IS NOT NULL
-- Identify the count of records where a specific column contains NULL
SELECT COUNT(1)
FROM TableName
WHERE ColumNName IS NULL
-- Identify all NULLable columns in a database
SELECT *
FROM INFORMATION_SCHEMA.COLUMNS
WHERE IS_NULLABLE = 'YES'
For more information on the INFORMATION_SCHEMA views, see this: https://learn.microsoft.com/en-us/sql/relational-databases/system-information-schema-views/system-information-schema-views-transact-sql
If you want to scan all tables and columns in a given database for NULLs, then it is a two step process.
1.) Get the list of tables and columns that are NULLABLE.
-- Identify all NULLable columns in a database
SELECT TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE IS_NULLABLE = 'YES'
2.) Use Excel to create a SELECT statement to get the NULL counts for each table/column. To do this, copy and paste the query results from step 1 into EXCEL. Assuming you have copied the header row, then your data starts on row 2. In cell E2, enter the following formula.
="SELECT COUNT(1) FROM "&A2&"."&B2&"."&C2&" WHERE "&D2&" IS NULL"
Copy and paste that down the entire sheet. This will generate the SQL SELECT statement that you require. Copy the results in column E and paste into SQL Server and run it. This may take a while depending on the number of tables/columns to scan.

Related

Find all unique values of a column name in a SQL database

We are building a large database using SQL. Every table in the database has many columns but one of the columns in the tables tells who added the row of data. That value "Name of person" is tied to a variable in SSIS. Again, the variable tells who added the row. How can I create a query to pull back all the names in that column, no matter where it is used in the database. The value of the column is different depending on the day.
RE: Every table has the same <Column_Name> ... a query to pull back all the values of that column, no matter where it is used in the database.
IF you have a large database and every table has the same column <Column_Name>, then you will be pulling a value from every row in every table ... in a very large database. Not sure that is what you want to do, but it can be easily done. The following will work even if <column_name> is only in a few tables.
Grab a list of every schema.table that contains <column_name>, then loop over it to get the the value for <column_name>. The following should work.
DECLARE #colname sysname = '<Column_name>' -- just in case it is not in every table
-- capture name of every table here
CREATE TABLE #tablename ( schema_name sysname, table_name sysname, Id INT IDENTITY(1,1) PRIMARY KEY CLUSTERED)
INSERT INTO #tablename (schema_name, table_name)
SELECT s.name schemaname, t.name tablename FROM sys.columns c
INNER JOIN sys.tables t ON t.object_id = c.object_id
INNER JOIN sys.schemas s ON s.schema_id = t.schema_id
WHERE c.name = #colname
-- capture result of query here
CREATE TABLE #result ( schema_name sysname, table_name sysname, column_value VARCHAR(100) )
DECLARE #i INT = 1, #imax INT
SELECT #imax = MAX(Id) FROM #tablename
-- loop over tablename
DECLARE #query NVARCHAR(255)
WHILE #i <= #imax
BEGIN
SELECT #query = N'SELECT ' + schema_name + '.' + table_name + ',' + ' <Column_Name> FROM ' + schema_name + '.' + table_name
FROM #tablename WHERE Id = #i
INSERT INTO #result ( schema_name, table_name, column_value)
EXEC sp_executesql #query
SET #i += 1
END

Selecting column names that have specified value

We are receiving rather large files, of which we have no control over the format of, that are being bulk-loaded into a SQL Server table via SSIS to be later imported into our internal structure. These files can contain over 800 columns, and often the column names are not immediately recognizable.
As a result, we have a large table that represents the contents of the file with over 800 Varchar columns.
The problem is: I know what specific values I'm looking for in this data, but I do not know what column contains it. And eyeballing the data to find said column is neither efficient nor ideal.
My question is: is it at all possible to search a table by some value N and return the column names that have that value? I'd post some code that I've tried, but I really don't know where to start on this one... or if it's even possible.
For example:
A B C D E F G H I J K L M N ...
------------------------------------------------------------
'a' 'a' 'a' 'a' 'a' 'b' 'a' 'a' 'a' 'b' 'b' 'a' 'a' 'c' ...
If I were to search this table for the value 'b', I would want to get back the following results:
Columns
---------
F
J
K
Is something like this possible to do?
This script will search all tables and all string columns for a specific string. You might be able to adapt this for your needs:
DECLARE #tableName sysname
DECLARE #columnName sysname
DECLARE #value varchar(100)
DECLARE #sql varchar(2000)
DECLARE #sqlPreamble varchar(100)
SET #value = 'EDUQ4' -- *** Set this to the value you're searching for *** --
SET #sqlPreamble = 'IF EXISTS (SELECT 1 FROM '
DECLARE theTableCursor CURSOR FAST_FORWARD FOR
SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = 'dbo' AND TABLE_TYPE = 'BASE TABLE'
AND TABLE_NAME NOT LIKE '%temp%' AND TABLE_NAME != 'dtproperties' AND TABLE_NAME != 'sysdiagrams'
ORDER BY TABLE_NAME
OPEN theTableCursor
FETCH NEXT FROM theTableCursor INTO #tableName
WHILE ##FETCH_STATUS = 0 -- spin through Table entries
BEGIN
DECLARE theColumnCursor CURSOR FAST_FORWARD FOR
SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = #tableName AND (DATA_TYPE = 'nvarchar' OR DATA_TYPE = 'varchar')
ORDER BY ORDINAL_POSITION
OPEN theColumnCursor
FETCH NEXT FROM theColumnCursor INTO #columnName
WHILE ##FETCH_STATUS = 0 -- spin through Column entries
BEGIN
SET #sql = #tableName + ' WHERE ' + #columnName + ' LIKE ''' + #value +
''') PRINT ''Value found in Table: ' + #tableName + ', Column: ' + #columnName + ''''
EXEC (#sqlPreamble + #sql)
FETCH NEXT FROM theColumnCursor INTO #columnName
END
CLOSE theColumnCursor
DEALLOCATE theColumnCursor
FETCH NEXT FROM theTableCursor INTO #tableName
END
CLOSE theTableCursor
DEALLOCATE theTableCursor
One option you have is to be a little creative using XML in SQL Server.
Turn a row at a time into XML using cross apply and query for the nodes that has a certain value in a second cross apply.
Finally you output the distinct list of node names.
declare #Value nvarchar(max)
set #Value= 'b'
select distinct T3.X.value('local-name(.)', 'nvarchar(128)') as ColName
from YourTable as T1
cross apply (select T1.* for xml path(''), type) as T2(X)
cross apply T2.X.nodes('*[text() = sql:variable("#Value")]') as T3(X)
SQL Fiddle
If you have access to the files are RegEx will be must faster than performing a generic search in SQL.
If you are forced to use SQL #pmbAustin's answer is the way to go. Be warned, it won't run quickly.

How to get a list of tables which contains a common column value in SQL Server

I have a number of tables in which some of them has common Asset_ID, what will be the query to get a list of table names with common Asset_ID.
try select table_name from information_schema.columns where column_name='Asset_ID'
If you want to get the table names depending on the value of data in the column you cannot get it by simple query try the following
declare #val_to_search varchar(50), #column_name varchar(50)
Select #val_to_search = 'ddh224', #column_name='Asset_ID'
declare tbl cursor for
select table_name from information_schema.columns where column_name=#column_name
declare #tablename varchar(200),#qstr varchar(max)
declare #datapen table(table_name varchar(200))
open tbl
fetch tbl into #tablename
while ##fetch_status=0
begin
select #qstr='select top 1 '''+#tablename+''' from '+#tablename+' where '+ #column_name + ' =''' + #val_to_search + ''''
insert into #datapen
exec(#qstr)
fetch tbl into #tablename
end
close tbl
deallocate tbl
select * from #datapen pen
Below Query might help u out!
This query will list down all tables along with Columns Names!
SELECT * FROM INFORMATION_SCHEMA.COLUMNS where COLUMN_NAME LIKE
'%columnName%'
This query will list down all tables!
SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_TYPE = 'BASE
TABLE' AND TABLE_NAME LIKE '%tableName%'
i always use this to find all match keyword, it will list down all match sp, tables, function...
select distinct name
from syscomments c
join sysobjects o on c.id = o.id
where TEXT like '%Asset_ID%'

How to alter all columns without specifying column_name?

Is there a way to apply an alter statement to ALL the columns in a table without having to specify the column names? This would only be used in a temp table I need to clean up the duplicate data in a report.
Here's a sample of what I'm wondering if its possible:
select T1.Column1, T1.Column2, T1.Column3, T2.Column1, T2.Column2
into #templateTable
from Table1 T1
join Table2 T2 on T1.Column1 = T2.Column2
alter table #templateTable
alter column * null
All I really need is my #tempTable to allow null values in the columns that it originally gets data from which don't previously allow null values.
Lastly, there are 2 reasons I don't want to go through and edit each column:
There are many columns (~50) being pulled from at least 10 tables.
I don't know which ones allow null and researching which ones do would take me the next two days.
Any help will be appreciated.
Ugly way, but it works in a SqlServer Management Studio, at least (can probably be used as "strings")
select 'alter table ' + table_name + ' alter column '+ column_name +' ' + data_type +
(case
when character_maximum_length is not null and CHARACTER_MAXIMUM_LENGTH <> -1 then ' (' + CAST(CHARACTER_MAXIMUM_LENGTH as varchar) +')'
when CHARACTER_MAXIMUM_LENGTH = -1 then '(MAX)'
else '' end) + ' null;' from tempdb.information_schema.COLUMNS
where TABLE_NAME = '<YOUR TABLE NAME>'
and IS_NULLABLE='NO';
copy result, paste and execute...
I'm not entirely gathering what you are trying to accomplish with your temp tables, but as far as your first question you can query the sysobjects tables for your table's name and then alter each column in that table by looping through the syscolumns table
So lets say I want to loop through each column in the table master:
declare #i int
declare #colcount int
declare #column varchar(max)
declare #sql nvarchar(max)
set #i = 1
set #colcount = (select MAX(colID) from syscolumns where ID = (select id from sysobjects
where name = 'master'))
WHILE #i < #colcount
BEGIN
set #column = (select name from syscolumns where ID = (select id from sysobjects where name = 'master') and colid = #i)
set #sql = 'select top 1000 [' + #column + '] from master'
EXECUTE(#sql)
set #i = #i + 1
End
you can change #sql to whatever you need it to be and that should get it
Check out this question Create a nullable column using SQL Server SELECT INTO?
I'm guessing this is a one time run...
select 'alter table xxx alter column ' + columnname + ...
from information_schema.columns
where ...
Copy and paste the result in a separate window and run.

SQL to return first two columns of a table

Is there any SQL lingo to return JUST the first two columns of a table WITHOUT knowing the field names?
Something like
SELECT Column(1), Column(2) FROM Table_Name
Or do I have to go the long way around and find out the column names first? How would I do that?
You have to get the column names first. Most platforms support this:
select column_name,ordinal_position
from information_schema.columns
where table_schema = ...
and table_name = ...
and ordinal_position <= 2
There it´s
declare #select varchar(max)
set #select = 'select '
select #select=#select+COLUMN_NAME+','
from information_schema.columns
where table_name = 'TABLE' and ordinal_position <= 2
set #select=LEFT(#select,LEN(#select)-1)+' from TABLE'
exec(#select)
A dynamic query using for xml path will also do the job:
declare #sql varchar(max)
set #sql = (SELECT top 2 COLUMN_NAME + ',' from information_schema.columns where table_name = 'YOUR_TABLE_NAME_HERE' order by ordinal_position for xml path(''))
set #sql = (SELECT replace(#sql +' ',', ',''))
exec('SELECT ' + #sql + ' from YOUR_TABLE_NAME_HERE')
I wrote a stored procedure a while back to do this exact job. Even though in relational theory there is no technical column order SSMS is not completely relational. The system stores the order in which the columns were inserted and assigns an ID to them. This order is followed using the typical SELECT * statement which is why your SELECT statements appear to return the same order each time. In practice its never a good idea to SELECT * with anything as it doesn't lock the result order in terms of columns or rows. That said I think people get so stuck on 'you shouldn't do this' that they don't write scripts that actually can do it. Fact is there is predictable system behavior so why not use it if the task isn't super important.
This SPROC of course has caveats and is written in T-SQL but if your looking to just return all of the values with the same behavior of SELECT * then this should do the job pretty easy for you. Put in your table name, the amount of columns, and hit F5. It returns them in order from left to right the same as you'd be expecting. I limited it to only 5 columns but you can edit the logic if you need any more. Takes both temp and permanent tables.
EXEC OnlySomeColumns 'MyTable', 3
/*------------------------------------------------------------------------------------------------------------------
Document Title: The Unknown SELECT SPROC.sql
Created By: CR
Date: 4.28.2013
Purpose: Returns all results from temp or permanent table when not knowing the column names
SPROC Input Example: EXEC OnlySomeColumns 'MyTable', 3
--------------------------------------------------------------------------------------------------------------------*/
IF OBJECT_ID ('OnlySomeColumns', 'P') IS NOT NULL
DROP PROCEDURE OnlySomeColumns;
GO
CREATE PROCEDURE OnlySomeColumns
#TableName VARCHAR (1000),
#TotalColumns INT
AS
DECLARE #Column1 VARCHAR (1000),
#Column2 VARCHAR (1000),
#Column3 VARCHAR (1000),
#Column4 VARCHAR (1000),
#Column5 VARCHAR (1000),
#SQL VARCHAR (1000),
#TempTable VARCHAR (1000),
#PermanentTable VARCHAR (1000),
#ColumnNamesAll VARCHAR (1000)
--First determine if this is a temp table or permanent table
IF #TableName LIKE '%#%' BEGIN SET #TempTable = #TableName END --If a temporary table
IF #TableName NOT LIKE '%#%' BEGIN SET #PermanentTable = #TableName END --If a permanent column name
SET NOCOUNT ON
--Start with a few simple error checks
IF ( #TempTable = 'NULL' AND #PermanentTable = 'NULL' )
BEGIN
RAISERROR ( 'ERROR: Please select a TempTable or Permanent Table.',16,1 )
END
IF ( #TempTable <> 'NULL' AND #PermanentTable <> 'NULL' )
BEGIN
RAISERROR ( 'ERROR: Only one table can be selected at a time. Please adjust your table selection.',16,1 )
END
IF ( #TotalColumns IS NULL )
BEGIN
RAISERROR ( 'ERROR: Please select a value for #TotalColumns.',16,1 )
END
--Temp table to gather the names of the columns
IF Object_id('tempdb..#TempName') IS NOT NULL DROP TABLE #TempName
CREATE TABLE #TempName ( ID INT, Name VARCHAR (1000) )
--Select the column order from a temp table
IF #TempTable <> 'NULL'
BEGIN
--Verify the temp table exists
IF NOT EXISTS ( SELECT 1
FROM tempdb.sys.columns
WHERE object_id = object_id ('tempdb..' + #TempTable +'') )
BEGIN
RAISERROR ( 'ERROR: Your TempTable does not exist - Please select a valid TempTable.',16,1 )
RETURN
END
SET #SQL = 'INSERT INTO #TempName
SELECT column_id AS ID, Name
FROM tempdb.sys.columns
WHERE object_id = object_id (''tempdb..' + #TempTable +''')
ORDER BY column_id'
EXEC (#SQL)
END
--From a permanent table
IF #PermanentTable <> 'NULL'
BEGIN
--Verify the temp table exists
IF NOT EXISTS ( SELECT 1
FROM syscolumns
WHERE id = ( SELECT id
FROM sysobjects
WHERE Name = '' + #PermanentTable + '' ) )
BEGIN
RAISERROR ( 'ERROR: Your Table does not exist - Please select a valid Table.',16,1 )
RETURN
END
SET #SQL = 'INSERT INTO #TempName
SELECT colorder AS ID, Name
FROM syscolumns
WHERE id = ( SELECT id
FROM sysobjects
WHERE Name = ''' + #PermanentTable + ''' )
ORDER BY colorder'
EXEC (#SQL)
END
--Set the names of the columns
IF #TotalColumns >= 1 BEGIN SET #Column1 = (SELECT Name FROM #TempName WHERE ID = 1) END
IF #TotalColumns >= 2 BEGIN SET #Column2 = (SELECT Name FROM #TempName WHERE ID = 2) END
IF #TotalColumns >= 3 BEGIN SET #Column3 = (SELECT Name FROM #TempName WHERE ID = 3) END
IF #TotalColumns >= 4 BEGIN SET #Column4 = (SELECT Name FROM #TempName WHERE ID = 4) END
IF #TotalColumns >= 5 BEGIN SET #Column5 = (SELECT Name FROM #TempName WHERE ID = 5) END
--Create a select list of only the column names you want
IF Object_id('tempdb..#FinalNames') IS NOT NULL DROP TABLE #FinalNames
CREATE TABLE #FinalNames ( ID INT, Name VARCHAR (1000) )
INSERT #FinalNames
SELECT '1' AS ID, #Column1 AS Name UNION ALL
SELECT '2' AS ID, #Column2 AS Name UNION ALL
SELECT '3' AS ID, #Column3 AS Name UNION ALL
SELECT '4' AS ID, #Column4 AS Name UNION ALL
SELECT '5' AS ID, #Column5 AS Name
--Comma Delimite the names to insert into a select statement. Bracket the names in case there are spaces
SELECT #ColumnNamesAll = COALESCE(#ColumnNamesAll + '], [' ,'[') + Name
FROM #FinalNames
WHERE Name IS NOT NULL
ORDER BY ID
--Add an extra bracket at the end to complete the string
SELECT #ColumnNamesAll = #ColumnNamesAll + ']'
--Tell the user if they selected to many columns
IF ( #TotalColumns > 5 AND EXISTS (SELECT 1 FROM #FinalNames WHERE Name IS NOT NULL) )
BEGIN
SELECT 'This script has been designed for up to 5 columns' AS ERROR
UNION ALL
SELECT 'Only the first 5 columns have been selected' AS ERROR
END
IF Object_id('tempdb..#FinalNames') IS NOT NULL DROP TABLE ##OutputTable
--Select results using only the Columns you wanted
IF #TempTable <> 'NULL'
BEGIN
SET #SQL = 'SELECT ' + #ColumnNamesAll + '
INTO ##OutputTable
FROM ' + #TempTable + '
ORDER BY 1'
EXEC (#SQL)
END
IF #PermanentTable <> 'NULL'
BEGIN
SET #SQL = 'SELECT ' + #ColumnNamesAll + '
INTO ##OutputTable
FROM ' + #PermanentTable + '
ORDER BY 1'
EXEC (#SQL)
END
SELECT *
FROM ##OutputTable
SET NOCOUNT OFF
SQL doesn't understand the order of columns. You need to know the column names to get them.
You can look into querying the information_schema to get the column names. For example:
SELECT column_name
FROM INFORMATION_SCHEMA.COLUMNS
WHERE table_name = 'tbl_name'
ORDER BY ordinal_position
LIMIT 2;
You can query the sysobject of the table to find out the first two column then dynamically generate the SQL statement you need.
If you want a permant object that you can query over and over again make a view for each table that only returns the first 2 columns. You can name the columns Column1 and Column2 or use the existing names.
If you want to return the first two columns from any table without any preprocessing steps create a stored procedure that queries the system information and executes a dynamic query that return the first two columns from the table.
Or do I have to go the long way around and find out the column names first? How would I do that?
It's pretty easy to do manually.
Just run this first
select * from tbl where 1=0
This statement works on all major DBMS without needing any system catalogs.
That gives you all the column names, then all you need to do is type the first two
select colname1, colnum2 from tbl