How to use SQL column name only if the column exists? - sql

I've been trying to get a textual result from CASE(def.OPTION_CATEGORY_ID) in a select, but I'm having a hard time implementing the terms.
What I'm trying to do is check whether OPTION_CATEGORY_ID is an existing column in sys.columns. If it is, I want to translate the int to text using the WHEN ... THEN branches at the bottom, but the SQL is not aware of the column name, so CASE(def.OPTION_CATEGORY_ID) fails because the column name is invalid.
Is there any way to refer to def.OPTION_CATEGORY_ID through an alias so that the SQL won't fail beforehand?
Thanks
SELECT DISTINCT *
FROM
(SELECT def.OPTION_DEF_ID AS 'Def ID',
ass.ASSET_NAME AS 'Asset Name',
CASE(def.OPTION_TYPE_ID)
WHEN 1 THEN 'A'
WHEN 2 THEN 'B'
END AS 'Option Type',
case when exists (SELECT name
FROM sys.columns
WHERE Name = 'OPTION_CATEGORY_ID'
AND Object_ID = Object_ID(N'TFC_OPTION_DEFINITION'))
then
-- Column Exists
CASE(def.OPTION_CATEGORY_ID)
WHEN 1 THEN 'C'
WHEN 2 THEN 'D'
WHEN 3 THEN 'E'
WHEN 4 THEN 'F'
end
End AS 'test' ,
-- the rest of the select

If you want to select a column and you are not sure whether it exists, you cannot write it explicitly in your code and expect it to compile.
You should build the statement dynamically, based on run-time information about whether the column exists:
DECLARE @statement NVARCHAR(MAX) -- sp_executesql requires a Unicode (NVARCHAR) statement
IF((SELECT COUNT(*)
FROM sys.columns
WHERE Name = 'OPTION_CATEGORY_ID'
AND Object_ID = Object_ID(N'TFC_OPTION_DEFINITION')) > 0)
BEGIN
SET @statement = --assign a query which uses OPTION_CATEGORY_ID
END
ELSE
BEGIN
SET @statement = --assign a query which does not use OPTION_CATEGORY_ID
END
EXEC sp_executesql @statement
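The same pattern can be tried outside SQL Server. What follows is a minimal sketch in Python with the standard sqlite3 module, not T-SQL: PRAGMA table_info stands in for the sys.columns lookup, and the helper names column_exists, build_select and demo are hypothetical. The point it illustrates is that the optional column name only enters the statement text after the existence check has passed.

```python
import sqlite3

def column_exists(conn, table, column):
    # PRAGMA table_info returns one row per column; index 1 holds the column name.
    rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    return any(row[1].upper() == column.upper() for row in rows)

def build_select(conn):
    # The optional column is mentioned in the SQL text only when it exists,
    # so the statement always compiles.
    if column_exists(conn, "TFC_OPTION_DEFINITION", "OPTION_CATEGORY_ID"):
        category = ("CASE OPTION_CATEGORY_ID WHEN 1 THEN 'C' WHEN 2 THEN 'D' "
                    "WHEN 3 THEN 'E' WHEN 4 THEN 'F' END")
    else:
        category = "NULL"
    return (f"SELECT OPTION_DEF_ID, {category} AS test "
            "FROM TFC_OPTION_DEFINITION ORDER BY OPTION_DEF_ID")

def demo(with_category):
    # Build a throwaway in-memory table with or without the optional column.
    conn = sqlite3.connect(":memory:")
    if with_category:
        conn.execute("CREATE TABLE TFC_OPTION_DEFINITION "
                     "(OPTION_DEF_ID INTEGER, OPTION_CATEGORY_ID INTEGER)")
        conn.execute("INSERT INTO TFC_OPTION_DEFINITION VALUES (1, 2)")
    else:
        conn.execute("CREATE TABLE TFC_OPTION_DEFINITION (OPTION_DEF_ID INTEGER)")
        conn.execute("INSERT INTO TFC_OPTION_DEFINITION VALUES (1)")
    return conn.execute(build_select(conn)).fetchall()
```

Calling demo(True) exercises the branch that translates the category, while demo(False) runs the same code against a table that lacks the column and yields NULL for it instead of failing.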

Related

ms sql server how to check table has “id” column and count rows if "id" exist

There are too many tables in my SQL Server db. Most of them have an 'id' column, but some do not. I want to know which table(s) doesn't have the 'id' column and to count the rows where id=null if an 'id' column exists. The query results may look like this:
TABLE_NAME | HAS_ID | ID_NULL_COUNT | ID_NOT_NULL_COUNT
table1 | false | 0 | 0
table2 | true | 10 | 100
How do I write this query?
Building query:
WITH cte AS (
SELECT t.*, has_id = CASE WHEN COLUMN_NAME = 'ID' THEN 'true' ELSE 'false' END
FROM INFORMATION_SCHEMA.TABLES t
OUTER APPLY (SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS c
WHERE t.TABLE_NAME = c.TABLE_NAME
AND t.[TABLE_SCHEMA] = c.[TABLE_SCHEMA]
AND c.COLUMN_NAME = 'id') s
WHERE t.TABLE_SCHEMA IN (...)
)
SELECT
query_to_run = REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
'SELECT tab_name = ''<tab_name>'',
has_id = ''<has_id>'',
id_null_count = <id_null_count>,
id_not_null_count = <id_not_null_count>
FROM <schema_name>.<tab_name>'
,'<tab_name>', TABLE_NAME)
,'<schema_name>', TABLE_SCHEMA)
,'<has_id>', has_id)
,'<id_null_count>', CASE WHEN has_id = 'false' THEN '0' ELSE 'SUM(CASE WHEN id IS NULL THEN 1 END)' END)
,'<id_not_null_count>', CASE WHEN has_id = 'false' THEN '0' ELSE 'COUNT(id)' END)
FROM cte;
Copy the output and execute in separate window. UNION ALL could be added to get single resultset.
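The generate-then-execute idea above can also be driven from client code. This is a rough sketch in Python with the standard sqlite3 module rather than SQL Server; sqlite_master and PRAGMA table_info replace INFORMATION_SCHEMA, and the function name id_report and the sample tables are made up for illustration. The id column is referenced only for tables that are known to have it.

```python
import sqlite3

def id_report(conn):
    """Return (table, has_id, id_null_count, id_not_null_count) per table."""
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    report = []
    for table in tables:
        cols = [r[1].lower() for r in conn.execute(f'PRAGMA table_info("{table}")')]
        if "id" in cols:
            # COUNT(id) counts non-NULL ids; COUNT(*) - COUNT(id) gives the NULL ones.
            nulls, not_nulls = conn.execute(
                f'SELECT COUNT(*) - COUNT(id), COUNT(id) FROM "{table}"').fetchone()
            report.append((table, True, nulls, not_nulls))
        else:
            report.append((table, False, 0, 0))
    return report

# Hypothetical sample data: table1 has no id column, table2 has one NULL id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (name TEXT)")
conn.execute("CREATE TABLE table2 (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO table2 VALUES (?, ?)",
                 [(1, "a"), (None, "b"), (3, "c")])
result = id_report(conn)
```

Each tuple in result corresponds to one row of the table layout shown in the question.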
This might be useful for you... lists out the row count for all tables that have an "id" column. It filters out tables that start with "sys" because those are mostly internal tables. If you have a table that starts with "sys", you'll probably want to delete that part of the WHERE clause.
SELECT DISTINCT OBJECT_NAME(r.[object_id]) AS [TableName], [row_count] AS [RowCount]
FROM sys.dm_db_partition_stats r
WHERE index_id IN (0, 1) -- 0 = heap, 1 = clustered index, so tables without a clustered index are included too
AND EXISTS (SELECT 1 FROM sys.columns c WHERE c.[object_id] = r.[object_id] AND c.[name] = N'id')
AND OBJECT_NAME(r.[object_id]) NOT LIKE 'sys%'
ORDER BY [TableName]
Note you can change the "c.[name] = N'id'" to be any column name, or even change the "=" to "<>" to find only tables without an id column
pmbAustin answers how to list all tables without "ID" column.
To know how many rows in each table, SQL Server has a built-in report for you.
Right click the database in SSMS, click "Reports", "Standard Reports" then "Disk Usage by Table"
You now know how many rows are in each table, and from pmbAustin's answer you know which tables do and do not have "ID" columns. With a simple VLOOKUP in Excel you can combine these two datasets to arrive at any answer you wish.
This will tell you which tables do or do not have a column named "ID":
SELECT Table_Name
, case when MAX(case when column_name = 'ID' then 1 else 0 end) = 1 then 'true'
else 'false'
end as HAS_ID
FROM INFORMATION_SCHEMA.COLUMNS
GROUP BY Table_Name;
And here is one way to list all the tables that have a column named ID, with counts of the rows where that column is null or not null:
CREATE TABLE #AllIDSNullable (TABLE_NAME NVARCHAR(256) NOT NULL
, HAS_ID VARCHAR(10)
, ID_NULL_COUNT INT DEFAULT 0
, ID_NOT_NULL_COUNT INT DEFAULT 0);
-- One row per table, inserted once before the loop; HAS_ID is 'true' when the table has a column named ID
INSERT #AllIDSNullable (TABLE_NAME , HAS_ID)
SELECT Table_Name, case when MAX(case when column_name = 'ID' then 1 else 0 end) = 1 then 'true' else 'false' end
FROM INFORMATION_SCHEMA.COLUMNS
GROUP BY Table_Name;
DECLARE CT CURSOR FOR
SELECT Table_Name
FROM INFORMATION_SCHEMA.COLUMNS
WHERE column_name = 'ID';
DECLARE @name NVARCHAR(MAX), @SQL NVARCHAR(MAX);
OPEN CT; FETCH NEXT FROM CT INTO @name;
WHILE @@FETCH_STATUS=0 BEGIN
-- Fill in the counts for each table that has an ID column
SET @SQL = 'UPDATE #AllIDSNullable SET ID_NULL_COUNT = (SELECT COUNT(*) FROM ['+@name+'] WHERE ID IS NULL), ID_NOT_NULL_COUNT = (SELECT COUNT(*) FROM ['+@name+'] WHERE ID IS NOT NULL) WHERE TABLE_NAME='''+@name+''';';
EXEC (@SQL);
FETCH NEXT FROM CT INTO @name;
END;
CLOSE CT;
DEALLOCATE CT;
SELECT *
FROM #AllIDSNullable;

How to UPDATE all columns of a record without having to list every column

I'm trying to figure out a way to update a record without having to list every column name that needs to be updated.
For instance, it would be nice if I could use something similar to the following:
// the parts inside braces are what I am trying to figure out
UPDATE Employee
SET {all columns, without listing each of them}
WITH {this record with id of '111' from other table}
WHERE employee_id = '100'
If this can be done, what would be the most straightforward/efficient way of writing such a query?
It's not possible.
What you're trying to do is not part of the SQL specification and is not supported by any database vendor. See the specifications of the SQL UPDATE statement for MySQL, PostgreSQL, MSSQL, Oracle, Firebird, Teradata. Every one of those supports only the syntax below:
UPDATE table_reference
SET column1 = {expression} [, column2 = {expression}] ...
[WHERE ...]
This is not possible, but you can do this instead:
begin tran
delete from table where CONDITION
insert into table select * from EqualDesingTabletoTable where CONDITION
commit tran
Be careful with identity fields.
Here's a hardcore way to do it with SQL SERVER. Carefully consider security and integrity before you try it, though.
This uses schema to get the names of all the columns and then puts together a big update statement to update all columns except ID column, which it uses to join the tables.
This only works for a single column key, not composites.
usage: EXEC UPDATE_ALL 'source_table','destination_table','id_column'
CREATE PROCEDURE UPDATE_ALL
@SOURCE VARCHAR(100),
@DEST VARCHAR(100),
@ID VARCHAR(100)
AS
DECLARE @SQL VARCHAR(MAX) =
'UPDATE D SET ' +
-- Google 'for xml path stuff' This gets the rows from query results and
-- turns into comma separated list.
STUFF((SELECT ', D.'+ COLUMN_NAME + ' = S.' + COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = @DEST
AND COLUMN_NAME <> @ID
FOR XML PATH('')),1,1,'')
+ ' FROM ' + @SOURCE + ' S JOIN ' + @DEST + ' D ON S.' + @ID + ' = D.' + @ID
--SELECT @SQL
EXEC (@SQL)
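For readers who want to experiment with the idea without a SQL Server instance, here is a rough equivalent in Python with the standard sqlite3 module. It is a sketch under assumptions, not the procedure above: SQLite's PRAGMA table_info supplies the column names, correlated subqueries replace the UPDATE ... FROM join, and the function name update_all and the employee/staging tables are invented for the example. Like the procedure, it handles a single-column key only and trusts the table names it is given.

```python
import sqlite3

def update_all(conn, source, dest, key):
    # Column list of the destination table, minus the key column.
    cols = [r[1] for r in conn.execute(f'PRAGMA table_info("{dest}")')
            if r[1] != key]
    # One "col = (correlated subquery)" assignment per column.
    assignments = ", ".join(
        f'"{c}" = (SELECT s."{c}" FROM "{source}" AS s '
        f'WHERE s."{key}" = "{dest}"."{key}")'
        for c in cols)
    # Only touch rows that have a matching key in the source table.
    sql = (f'UPDATE "{dest}" SET {assignments} '
           f'WHERE "{key}" IN (SELECT "{key}" FROM "{source}")')
    conn.execute(sql)
    return sql

# Hypothetical sample tables with identical layouts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.execute("CREATE TABLE staging (employee_id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.execute("INSERT INTO employee VALUES (100, 'old', 'x'), (101, 'keep', 'y')")
conn.execute("INSERT INTO staging VALUES (100, 'new', 'z')")
update_all(conn, "staging", "employee", "employee_id")
rows = conn.execute("SELECT * FROM employee ORDER BY employee_id").fetchall()
```

Row 100 takes every non-key value from staging, and row 101 is left alone because it has no match in the source table.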
In Oracle PL/SQL, you can use the following syntax:
DECLARE
r my_table%ROWTYPE;
BEGIN
r.a := 1;
r.b := 2;
...
UPDATE my_table
SET ROW = r
WHERE id = r.id;
END;
Of course that just moves the burden from the UPDATE statement to the record construction, but you might already have fetched the record from somewhere.
How about using Merge?
https://technet.microsoft.com/en-us/library/bb522522(v=sql.105).aspx
It gives you the ability to run Insert, Update, and Delete. One other piece of advice is if you're going to be updating a large data set with indexes, and the source subset is smaller than your target but both tables are very large, move the changes to a temporary table first. I tried to merge two tables that were nearly two million rows each and 20 records took 22 minutes. Once I moved the deltas over to a temp table, it took seconds.
If you are using Oracle, you can use rowtype
declare
var_x TABLE_A%ROWTYPE;
Begin
select * into var_x
from TABLE_B where rownum = 1;
update TABLE_A set row = var_x
where ID = var_x.ID;
end;
/
given that TABLE_A and TABLE_B are of same schema
It is possible. Like npe said it's not a standard practice. But if you really have to:
1. First a scalar function
CREATE FUNCTION [dte].[getCleanUpdateQuery] (@pTableName varchar(40), @pQueryFirstPart VARCHAR(200) = '', @pQueryLastPart VARCHAR(200) = '', @pIncludeCurVal BIT = 1)
RETURNS VARCHAR(8000) AS
BEGIN
DECLARE @pQuery VARCHAR(8000);
WITH cte_Temp
AS
(
SELECT
C.name
FROM SYS.COLUMNS AS C
INNER JOIN SYS.TABLES AS T ON T.object_id = C.object_id
WHERE T.name = @pTableName
)
SELECT @pQuery = (
CASE @pIncludeCurVal
WHEN 0 THEN
(
STUFF(
(SELECT ', ' + name + ' = ' + @pQueryFirstPart + @pQueryLastPart FROM cte_Temp FOR XML PATH('')), 1, 2, ''
)
)
ELSE
(
STUFF(
(SELECT ', ' + name + ' = ' + @pQueryFirstPart + name + @pQueryLastPart FROM cte_Temp FOR XML PATH('')), 1, 2, ''
)
) END)
RETURN 'UPDATE ' + @pTableName + ' SET ' + @pQuery
END
2. Use it like this
DECLARE @pQuery VARCHAR(8000) = dte.getCleanUpdateQuery(<your table name>, <query part before current value>, <query part after current value>, <1 if current value is used. 0 if updating everything to a static value>);
EXEC (@pQuery)
Example 1: set every column of employee to 'Unknown' (make sure each column's type matches the intended value):
DECLARE @pQuery VARCHAR(8000) = dte.getCleanUpdateQuery('employee', '', '''Unknown''', 0);
EXEC (@pQuery)
Example 2: Remove an undesired text qualifier (e.g. #)
DECLARE @pQuery VARCHAR(8000) = dte.getCleanUpdateQuery('employee', 'REPLACE(', ', ''#'', '''')', 1);
EXEC (@pQuery)
This query can be improved. This is just the one I saved and sometimes use. You get the idea.
Similar to an upsert, you could check if the item exists on the table, if so, delete it and insert it with the new values (technically updating it) but you would lose your rowid if that's something sensitive to keep in your case.
Behold, the updelsert
IF NOT EXISTS (SELECT * FROM Employee WHERE ID = @SomeID)
INSERT INTO Employee VALUES(@SomeID, @Your, @Vals, @Here)
ELSE
BEGIN
-- BEGIN/END keeps the delete and the re-insert together in the ELSE branch
DELETE FROM Employee WHERE ID = @SomeID
INSERT INTO Employee VALUES(@SomeID, @Your, @Vals, @Here)
END
You could do it by dropping the column from the table and adding it back with a default value of whatever you need it to be. Saving this change will require the table to be rebuilt.

SQL Server stored procedure/cursor

I have a list of 20 spatial tables (Zoom1-Zoom20) and from time to time invalid geometry pops up in these tables. When the invalid geometry occurs I run the following statement to find where the invalid geometry is:
SELECT ID FROM Zoom10 WhERE Location.STIsValid() = 0
Typically I have to run the above statement for every Zoom table (the error that leads to the invalid geometry does not indicate which zoom table has invalid geometry) and when a result is returned from the select statement I run the following statement to correct the geometry:
UPDATE MGeoZoom10 set Location = Location.MakeValid() where Location.STIsValid() = 0
My question is can this process be automated with a stored procedure that gets the list of zoom tables
select name from sys.tables where name like '%zoom'
and then loops through the zoom tables with
SELECT ID FROM Zoom10 WhERE Location.STIsValid() = 0
and if a result is returned it runs the update statement on the zoom table?
Try this:
sp_msforeachtable '
if ''?'' Like ''%Zoom%''
Begin
If Exists(SELECT ID FROM ? WhERE Location.STIsValid() = 0)
UPDATE ? set Location = Location.MakeValid() where Location.STIsValid() = 0
End
'
Do you have 2 UDFs called STIsValid and MakeValid? If so, you could do something like this...
SELECT id INTO #Processed FROM Sysobjects WHERE name = '(no such table)'
DECLARE @TableId int, @TableName varchar(255), @CorrectionSQL varchar(255)
SELECT @TableId = MIN(id) FROM Sysobjects WHERE type = 'U' AND name LIKE '%zoom'
AND id NOT IN (SELECT id FROM #Processed)
SET @TableId = ISNULL(@TableId, -1)
WHILE @TableId > -1 BEGIN
PRINT @TableId
SELECT @TableName = name FROM Sysobjects WHERE type = 'U' AND id = @TableId
SET @CorrectionSQL = 'UPDATE ' + @TableName + ' SET Location = dbo.MakeValid(Location) where dbo.STIsValid(Location) = 0'
PRINT @CorrectionSQL
EXEC(@CorrectionSQL)
INSERT INTO #Processed (id) VALUES(@TableId)
-- Pick the next unprocessed table using the same filter as the first lookup
SELECT @TableId = MIN(id) FROM Sysobjects WHERE type = 'U' AND name LIKE '%zoom'
AND id NOT IN (SELECT id FROM #Processed)
END

SQL omit columns with all same value

Is there a way to write a SQL query to omit columns that have all of the same values? For example,
row A B
1 9 0
2 7 0
3 5 0
4 2 0
I'd like to return just
row A
1 9
2 7
3 5
4 2
Although it is possible to use SQL to find if all rows in a column have identical values, there is no way to make a fixed SQL statement not return a column based on the content of the query.
Here is how to find out if all items in a column have identical values:
SELECT COUNT(DISTINCT B) = 1 FROM my_table
You can run a preliminary query to see if a column needs to be displayed, and then form the query dynamically, including the column only when you need it.
The only way to change what columns are returned is by executing separate queries. So you'd have to do something like:
IF EXISTS(SELECT null FROM myTable WHERE B <> 0)
BEGIN
SELECT row, A, B FROM myTable
END
ELSE
BEGIN
SELECT row, A FROM myTable
END
However it's generally bad practice to return different columns based on the data - otherwise you make the client determine if a particular column is in the result set first before trying to access the data.
This sort of requirement is more commonly done when displaying the data, e.g. in a web page, in a report, etc.
There is no way to write a query statement that will only return columns that have disparate values. You can, however, use conditional statements to execute different queries based on your needs.
You could also insert your query result into a temporary table, loop over the columns, build a new select statement that includes only the columns that have different values and finally execute the statement.
Note: You should probably just include bit columns to indicate whether columns are all duplicates or not. The application could then just discard any column that has been indicated as all duplicates.
Anyway, here's an example solution for SQL SERVER
-- insert results into a temp table
SELECT *
INTO #data
FROM (
SELECT 1 AS col1, 1 AS col2, 2 AS col3
UNION ALL
SELECT 2, 1, 3
) d;
DECLARE
@column sysname,
@sql nvarchar(max) = '',
@finalSql nvarchar(500) = 'SELECT ',
@allDuplicates bit;
DECLARE colsCur CURSOR
FOR
SELECT name
FROM tempdb.sys.columns
WHERE object_id = OBJECT_ID('tempdb..#data');
OPEN colsCur;
FETCH NEXT FROM colsCur INTO @column;
WHILE @@FETCH_STATUS = 0
BEGIN
SET @sql = N'SELECT @allDuplicates = CASE COUNT(DISTINCT ' + @column + ') WHEN 1 THEN 1 ELSE 0 END FROM #data';
EXEC sp_executesql @sql, N'@allDuplicates int OUT', @allDuplicates = @allDuplicates OUT;
IF @allDuplicates = 0 SET @finalSql = @finalSql + @column + ',';
FETCH NEXT FROM colsCur INTO @column;
END
CLOSE colsCur;
DEALLOCATE colsCur;
SET @finalSql = LEFT(@finalSql, LEN(@finalSql) - 1) + ' FROM #data';
EXEC sp_executesql @finalSql;
DROP TABLE #data;
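The two-step approach (inspect the columns first, then build the final SELECT) might look like this when the looping is done in client code. It is a sketch in Python with the standard sqlite3 module, not SQL Server; the function names varying_columns and select_varying are hypothetical, and the sample table reuses the data from the question. One deliberate difference from the cursor version: a column is kept only when it has more than one distinct value, so a column that is NULL in every row is dropped as well.

```python
import sqlite3

def varying_columns(conn, table):
    cols = [r[1] for r in conn.execute(f'PRAGMA table_info("{table}")')]
    keep = []
    for c in cols:
        # COUNT(DISTINCT c) ignores NULLs; more than one distinct value
        # means the column actually varies.
        (n,) = conn.execute(
            f'SELECT COUNT(DISTINCT "{c}") FROM "{table}"').fetchone()
        if n > 1:
            keep.append(c)
    return keep

def select_varying(conn, table):
    keep = varying_columns(conn, table)
    if not keep:
        # Nothing varies, so there is no column list to select.
        return keep, []
    col_list = ", ".join(f'"{c}"' for c in keep)
    return keep, conn.execute(f'SELECT {col_list} FROM "{table}"').fetchall()

# Sample data from the question: column B is 0 in every row.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE my_table ("row" INTEGER, A INTEGER, B INTEGER)')
conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)",
                 [(1, 9, 0), (2, 7, 0), (3, 5, 0), (4, 2, 0)])
columns, rows = select_varying(conn, "my_table")
```

As noted above, the caller still has to cope with a result set whose shape depends on the data, which is why columns is returned alongside the rows.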

SQL Query to check if 40 columns in table is null

How do I select the columns in a table that contain only NULL values for all the rows?
Suppose a table has 100 columns, and among these 100 columns, 60 columns have null values.
How can I write a WHERE condition to check whether those 60 columns are null?
maybe with a COALESCE
SELECT * FROM table WHERE coalesce(col1, col2, col3, ..., colN) IS NULL
where c1 is null and c2 is null ... and c60 is null
shortcut using string concatenation (Oracle syntax):
where c1||c2||c3 ... c59||c60 is null
First of all, if you have a table that has so many nulls and you use SQL Server 2008 - you might want to define the table using sparse columns (http://msdn.microsoft.com/en-us/library/cc280604.aspx).
Secondly, I am not sure coalesce answers the question being asked - it seems like Ammu might actually want to find the list of columns that are null for all rows, but I might have misunderstood. Nevertheless - it is an interesting question, so I wrote a procedure to list null columns for any given table:
IF (OBJECT_ID(N'PrintNullColumns') IS NOT NULL)
DROP PROC dbo.PrintNullColumns;
go
CREATE PROC dbo.PrintNullColumns(@tablename sysname)
AS
BEGIN
SET NOCOUNT ON;
DECLARE @query nvarchar(max);
DECLARE @column sysname;
DECLARE columns_cursor CURSOR FOR
SELECT c.name
FROM sys.tables t JOIN sys.columns c ON t.object_id = c.object_id
WHERE t.name = @tablename AND c.is_nullable = 1;
OPEN columns_cursor;
FETCH NEXT FROM columns_cursor INTO @column;
WHILE (@@FETCH_STATUS = 0)
BEGIN
SET @query = N'
DECLARE @c int
SELECT @c = COUNT(*) FROM ' + @tablename + ' WHERE ' + @column + N' IS NOT NULL
IF (@c = 0)
PRINT (''' + @column + N''');'
EXEC (@query);
FETCH NEXT FROM columns_cursor INTO @column;
END
END
CLOSE columns_cursor;
DEALLOCATE columns_cursor;
SET NOCOUNT OFF;
RETURN;
END;
go
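A variation on the same idea, sketched in Python with the standard sqlite3 module rather than T-SQL: instead of one query per column, it asks for COUNT(col) of every column in a single pass and reports the columns whose count is zero. The function name all_null_columns and the sample table t are assumptions made for the example. Note that on an empty table every column is reported, since no row holds a non-NULL value.

```python
import sqlite3

def all_null_columns(conn, table):
    cols = [r[1] for r in conn.execute(f'PRAGMA table_info("{table}")')]
    # COUNT(col) counts only non-NULL values, so 0 means the column
    # is NULL in every row.
    counts = ", ".join(f'COUNT("{c}")' for c in cols)
    row = conn.execute(f'SELECT {counts} FROM "{table}"').fetchone()
    return [c for c, n in zip(cols, row) if n == 0]

# Hypothetical sample: b is NULL everywhere, c is NULL in only one row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b TEXT, c TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)",
                 [(1, None, None), (2, None, "x")])
null_cols = all_null_columns(conn, "t")
```

Scanning the table once is usually cheaper than running a separate COUNT per column, at the cost of a longer generated statement for wide tables.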
If you don't want to write out the column names, you can do something like this.
This will show you all the rows when all of the columns values are null except for the columns you specified (IgnoreThisColumn1 & IgnoreThisColumn2).
DECLARE @query NVARCHAR(MAX);
SELECT @query = ISNULL(@query+', ','') + [name]
FROM sys.columns
WHERE object_id = OBJECT_ID('yourTableName')
AND [name] != 'IgnoreThisColumn1'
AND [name] != 'IgnoreThisColumn2';
SET @query = N'SELECT * FROM TmpTable WHERE COALESCE('+ @query +') IS NULL';
EXECUTE(@query)
If you don't want rows when all the columns are null except for the columns you specified, you can simply use IS NOT NULL instead of IS NULL
SET @query = N'SELECT * FROM TmpTable WHERE COALESCE('+ @query +') IS NOT NULL';
Are you trying to find out if a specific set of 60 columns are null, or do you just want to find out if any 60 out of the 100 columns are null (not necessarily the same 60 for each row?)
If it is the latter, one way to do it in oracle would be to use the nvl2 function, like so:
select ... where (nvl2(col1,0,1)+nvl2(col2,0,1)+...+nvl2(col100,0,1) > 59)
A quick test of this idea:
select 'dummy' from dual where nvl2('somevalue',0,1) + nvl2(null,0,1) > 1
Returns 0 rows while:
select 'dummy' from dual where nvl2(null,0,1) + nvl2(null,0,1) > 1
Returns 1 row as expected since more than one of the columns are null.
It would help to know which db you are using and perhaps which language or db framework if using one.
This should work though on any database.
Something like this would probably be a good stored procedure, since there are no input parameters for it.
select count(*) from table where col1 is null or col2 is null ...
Here is another method that seems to me to be logical as well (use Netezza or TSQL)
SELECT KeyColumn, MAX(NVL2(TEST_COLUMN,1,0)) AS TEST_COLUMN
FROM TABLE1
GROUP BY KeyColumn
So every TEST_COLUMN that has MAX value of 0 is a column that contains all nulls for the record set. The function NVL2 is saying if the column data is not null return a 1, but if it is null then return a 0.
Taking the MAX of that column will reveal if any of the rows are not null. A value of 1 means that there is at least 1 row that has data. Zero (0) means that each row is null.
I use the query below when I have to check multiple columns for NULL. I hope this is helpful. If the SUM comes to a value other than zero, then you have NULLs in that column.
select SUM (CASE WHEN col1 is null then 1 else 0 end) as null_col1,
SUM (CASE WHEN col2 is null then 1 else 0 end) as null_col2,
SUM (CASE WHEN col3 is null then 1 else 0 end) as null_col3, ....
.
.
.
from tablename
you can use
select NUM_NULLS , COLUMN_NAME from all_tab_cols where table_name = 'ABC' and COLUMN_NAME in ('PQR','XYZ');