Merge Statement for more than hundreds of columns - sql

I have a source and a target table that each contain more than 150 columns. The problem is that I need to compare and update all 150 columns in my MERGE statement. Is there any other way to do this?
MERGE targettable AS [Target]
USING (
    ---Source Query
) AS [Source] ON [Target].Key = [Source].Key
WHEN MATCHED -- matching records that changed: update
AND ([Target].[StartDt] <> [Source].[StartDt]
OR [Target].[ADStatusDesc] <> [Source].[ADStatusDesc]
..... --more than 150 columns
OR [Target].[StatusInd] <> [Source].[StatusInd])
THEN
UPDATE
SET [Target].[StartDt] = [Source].[StartDt]
.... --more than 150 columns
,[Target].[StatusInd] = [Source].[StatusInd]

Yes, you need to spell them out explicitly. But you can generate that code:
SELECT
col.name
, N' OR NOT(a.' + QUOTENAME(col.name) + N' = b.' + QUOTENAME(col.name) + N' OR (a.' + QUOTENAME(col.name) + N' IS NULL AND b.' + QUOTENAME(col.name) + N' IS NULL))'
FROM sys.columns col
JOIN sys.objects obj ON col.object_id = obj.object_id
JOIN sys.types tp ON col.user_type_id = tp.user_type_id
WHERE obj.name = 'TableNameHere' AND col.is_computed = 0
ORDER BY col.column_id
This deals properly with NULL values as well. For string columns, you should probably add collation clauses to force a binary collation, so the comparison is case- and accent-sensitive.
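As an aside, a NULL-safe row comparison can also be written with EXCEPT, which treats two NULLs as equal, so the per-column NULL handling disappears. A sketch using the column names from the question (the table names are placeholders, and the column lists still have to be spelled out or generated):

```sql
MERGE dbo.TargetTable AS t
USING dbo.SourceTable AS s
    ON t.[Key] = s.[Key]
WHEN MATCHED AND EXISTS (
        -- EXCEPT considers NULL equal to NULL, so no ISNULL wrappers are needed
        SELECT s.[StartDt], s.[ADStatusDesc], s.[StatusInd]
        EXCEPT
        SELECT t.[StartDt], t.[ADStatusDesc], t.[StatusInd]
    )
THEN UPDATE SET
    t.[StartDt]      = s.[StartDt],
    t.[ADStatusDesc] = s.[ADStatusDesc],
    t.[StatusInd]    = s.[StatusInd];
```

The EXISTS subquery returns a row only when at least one listed column differs, which is exactly the change-detection predicate from the question without the long OR chain.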

Are you sure you need to compare all columns in the merge? Can't you simply UPDATE all columns of [Target] with [Source] data whenever [Target].Key = [Source].Key, since [Source] contains the good/new values?
Clever use of text-editor features, particularly what's usually called column-mode editing, can help you build your SQL statement around a list of column names, one per line (which can usually be extracted from existing SQL statements with even more clever text editing).


Update all rows where "NULL" as string needs to be updated to a DB NULL

Is there a way to change all occurrences of a certain value within SQL regardless of column?
I have a table with ~200 columns that was imported from a text file. The NULL values came through as the string value 'NULL' and occur in most columns of the table. Is there a way to convert those values to true NULL values? I would like to avoid writing an UPDATE for each individual column if possible.
A single update may not be too painful:
update t
set col1 = nullif(col1, 'NULL'),
col2 = nullif(col2, 'NULL'),
. . .;
You can generate the code in SQL or a spreadsheet by querying INFORMATION_SCHEMA.COLUMNS (or similar) for string columns.
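For example, a generator query along those lines, restricted to character columns (the table name is a placeholder), might look like:

```sql
-- one SET line per string column; paste the output into the UPDATE
SELECT '    ' + QUOTENAME(COLUMN_NAME)
     + ' = NULLIF(' + QUOTENAME(COLUMN_NAME) + ', ''NULL''),'
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'MyTable'
  AND DATA_TYPE IN ('char', 'varchar', 'nchar', 'nvarchar')
ORDER BY ORDINAL_POSITION;
```

Copy the result set into the UPDATE statement and delete the trailing comma on the last line.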
You can use dynamic SQL to build out the update script...
DECLARE @update_sql NVARCHAR(MAX) = N''
SELECT
@update_sql = CONCAT(@update_sql, N',
mt.', c.name, N' = NULLIF(mt.', c.name, N', ''NULL'')')
FROM
sys.columns c
WHERE
c.object_id = OBJECT_ID(N'dbo.MyTable')
AND c.collation_name IS NOT NULL; -- easy way to make sure you're only looking at columns that can hold text data
SET @update_sql = CONCAT(N'
UPDATE mt SET',
STUFF(@update_sql, 1, 1, ''), N'
FROM
dbo.MyTable mt;')
PRINT(@update_sql);
You'll end up with output formatted like the following...
UPDATE mt SET
mt.column_9 = NULLIF(mt.column_9, 'NULL'),
mt.column_10 = NULLIF(mt.column_10, 'NULL'),
mt.column_11 = NULLIF(mt.column_11, 'NULL'),
mt.column_14 = NULLIF(mt.column_14, 'NULL'),
...
mt.column_165 = NULLIF(mt.column_165, 'NULL'),
mt.column_166 = NULLIF(mt.column_166, 'NULL'),
mt.column_167 = NULLIF(mt.column_167, 'NULL'),
mt.column_168 = NULLIF(mt.column_168, 'NULL')
FROM
dbo.MyTable mt;
Note: the PRINT command is limited to 8000 characters of ASCII or 4000 characters of Unicode. So, if you notice that the output script is being truncated, post back; I have a "long print" procedure that gets around that limitation.
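A minimal sketch of such a "long print" (this is an illustration, not the author's actual procedure) just emits the string in PRINT-sized chunks. Note that naive fixed-size chunks can split a line mid-token, so this is only suitable for eyeballing the output:

```sql
DECLARE @sql nvarchar(max) = N'...';  -- the generated script goes here
DECLARE @pos int = 1;
WHILE @pos <= LEN(@sql)
BEGIN
    PRINT SUBSTRING(@sql, @pos, 4000);  -- 4000 = Unicode PRINT limit
    SET @pos += 4000;
END
```

A more careful version would search backwards from each 4000-character boundary for a line break and cut there instead.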
Use the MERGE statement and set the value to NULL for all matching rows; that is a fast and efficient way to do it.
There is no way to do this without updating each individual column.
There are shortcuts to writing such an update, like right-click > Script Table As... or dynamic SQL, but so far that's not what you've asked.

How can I exclude GUIDs from a select distinct without listing all other columns in a table?

So let's say 'table' has two columns that act as a GUID: the ID column and msrepl_tran_version. Our original programmer did not know that our replication created this column and included it in a comparison, which has resulted in almost 20,000 records being put into this table, of which only 1,588 are actually unique, and it's causing long load times.
I'm looking for a way to exclude the ID and replication columns from a SELECT DISTINCT without having to list every single other column in the table. Since I'm going to have to select from the record set multiple times to fix this (there are other tables affected, and the query is going to be ridiculous), I don't want my code to be messy if I can help it.
Is there a way to accomplish this without listing all of the other columns?
Select distinct {* except ID, msrepl_tran_version} from table
Other than the obvious (where COL_1 is the ID and COL_N is the replication GUID):
Select distinct COL_2, ..., COL_N-1, COL_N+1, ... from table
After more searching, I found the answer:
SELECT * INTO #temp FROM table
ALTER TABLE #temp DROP COLUMN id
ALTER TABLE #temp DROP COLUMN msrepl_tran_version
SELECT DISTINCT * FROM #temp
This works for what I need. Thanks for the answers guys!
Absolutely, 100% not possible; there is no "subtract columns" instruction.
It can't be done in the spirit of the OP's initial question. However, it can be done with dynamic SQL:
--Dynamically build list of column names.
DECLARE @ColNames NVARCHAR(MAX) = ''
SELECT @ColNames = @ColNames + '[' + c.COLUMN_NAME + '],'
FROM INFORMATION_SCHEMA.COLUMNS c
WHERE c.TABLE_SCHEMA = 'dbo'
AND c.TABLE_NAME = 'YourTable'
--Exclude these.
AND c.COLUMN_NAME NOT IN ('ID', 'msrepl_tran_version')
--Keep original column order for appearance, convenience.
ORDER BY c.ORDINAL_POSITION
--Remove trailing comma.
SET @ColNames = LEFT(@ColNames, LEN(@ColNames) - 1)
--Verify query.
PRINT ('SELECT DISTINCT ' + @ColNames + ' FROM [dbo].[YourTable]')
--Uncomment when ready to proceed.
--EXEC ('SELECT DISTINCT ' + @ColNames + ' FROM [dbo].[YourTable]')
One additional note: since you need to select from the record set multiple times and potentially join to other tables, you can use the above to create a view on the table. This should make your code fairly clean.
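For instance, the view creation could be sketched like this (names are placeholders, and the column list would be built as in the script above):

```sql
-- persist the trimmed projection as a view for reuse
DECLARE @ColNames NVARCHAR(MAX) = N'[Col1],[Col2],[Col3]';  -- built as above
EXEC (N'CREATE VIEW dbo.YourTable_Distinct AS
SELECT DISTINCT ' + @ColNames + N' FROM [dbo].[YourTable];');
```

Every later query and join can then reference dbo.YourTable_Distinct instead of repeating the long column list.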

Is there a better way to apply isnull to all columns than what I'm doing?

A number of times over the last month I've had to replace NULL fields with 0 in every column returned from a query.
To save a lot of time (some of these queries return a high number of columns), I've been using the following and then pasting the results for the relevant columns into a new query:
select ', isnull(' + COLUMN_NAME + ', 0)' + ' as ' + COLUMN_NAME
from INFORMATION_SCHEMA.COLUMNS
where TABLE_NAME = 'summary_by_scca_sales_category'
and TABLE_SCHEMA = 'property'
Essentially I'm wondering if there's a better way to do this. Ideally a method where I could automatically apply isnull to all columns returned by a query (without using two queries).
For example:
I want to take a query like:
select *
from tablename
And for every column returned by *, replace NULL results with 0 without having to write an isnull() expression for each column.
edit:
Will accomplish this with a view (doh, should have thought of that). For interest's/education's sake, is there a way to do something like this with code as well?
You could create a VIEW against the tables in question where the ISNULL logic you want is set up. Then queries against the views would return the data you want.
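For a concrete (purely illustrative) example, such a view might look like:

```sql
-- table and column names are placeholders
CREATE VIEW dbo.Summary_NoNulls AS
SELECT ISNULL(col1, 0) AS col1,
       ISNULL(col2, 0) AS col2,
       ISNULL(col3, 0) AS col3
FROM dbo.summary_table;
GO
```

Queries can then use SELECT * FROM dbo.Summary_NoNulls without any per-column ISNULL noise.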
EDIT:
As requested, some sample code to accomplish creating the VIEWs automatically. This is pretty gross, but for something that only has to be run once it will work. Beware of type issues (you stated everything should transmute to 0, so I assume all your columns are of a suitable numeric type):
DECLARE @table_def varchar(max)
SET @table_def = 'CREATE VIEW <tname>_NoNull AS SELECT '
SELECT @table_def = REPLACE(@table_def, '<tname>', t.name) +
'ISNULL(' + c.name + ', 0) AS ' + c.name + ', '
FROM sys.tables t
INNER JOIN sys.columns c ON t.object_id = c.object_id
WHERE t.name = '<table name here>'
--Strip the trailing comma and append the FROM clause.
SET @table_def = LEFT(@table_def, LEN(@table_def) - 1) + ' FROM <table name here>'
SELECT @table_def

How to select some particular columns from a table if the table has more than 100 columns

I need to select 90 columns out of 107 columns from my table.
Is it possible to write select * except (column1, column2, ...) from table, or is there any other way to get specific columns only, or do I need to write out all 90 columns in the select statement?
You could generate the column list:
select name + ', '
from sys.columns
where object_id = object_id('YourTable')
and name not in ('column1', 'column2')
It's possible to do this on the fly with dynamic SQL:
declare @columns varchar(max)
select @columns = case when @columns is null then '' else @columns + ', ' end +
quotename(name)
from sys.columns
where object_id = object_id('YourTable')
and name not in ('column1', 'column2')
declare @query varchar(max)
set @query = 'select ' + @columns + ' from YourTable'
exec (@query)
No, there's no way of doing * EXCEPT some columns. SELECT * itself should rarely, if ever, be used outside of EXISTS tests.
If you're using SSMS, you can drag the "columns" folder (under a table) from the Object Explorer into a query window, and it will insert all of the column names (so you can then go through them and remove the 17 you don't want)
There is no way in SQL to do select everything EXCEPT col1, col2 etc.
The only way to do this is to have your application handle this, and generate the sql query dynamically.
You could potentially do some dynamic SQL for this, but it seems like overkill. It's also generally considered poor practice to use SELECT *, much less SELECT * minus col3, col4, col5, since you won't get consistent results if the table changes.
Just use SSMS to script out a select statement and delete the columns you don't need. It should be simple.
No - you need to write out all the columns you need. You might create a view for that, so your actual statement could use select * (but then you have to list all the columns in the view).
Since you should never be using select *, why is this a problem? Just drag the columns over from the Object Explorer and delete the ones you don't want.

MERGE Command in SQL Server

I have been using the statement
insert into target
select * from source
where [set of conditions]
for a while.
Recently found this MERGE command that will be more effective to use for my purpose so that I can change the above statement to
MERGE target
USING source ON [my condtion]
WHEN NOT MATCHED BY TARGET
THEN INSERT VALUES (source.col1, source.col2, source.col3)
But the problem for me is that if I have 20+ columns in my source table, I have to list all of them. I need a way to specify insert source.*. Is there a way? I'm new to SQL. Appreciate your help.
Thanks in advance :)
Me too; I hate typing column names.
I normally build the Merge statement in dynamic SQL.
I have a function that takes a table name as a parameter, and returns a string containing all column names formatted properly with Table Name prefix, [] brackets and comma, as in S.Col1, S.Col2, S.Col3
I could also tell you that I build a temp table with the required columns and pass the temp table to my function, because sometimes you don't want a list of all columns. But that would probably be a confusing wobble, obscuring the important bits:
Use dynamic sql
Use a function to create csv list of columns.
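A sketch of such a helper, assuming SQL Server 2017+ for STRING_AGG (the function name and signature here are invented for illustration, not the answerer's actual code):

```sql
CREATE FUNCTION dbo.ColumnCsv (@table sysname, @prefix sysname)
RETURNS nvarchar(max)
AS
BEGIN
    DECLARE @csv nvarchar(max);
    -- builds e.g. 'S.[Col1], S.[Col2], S.[Col3]'
    -- CAST to nvarchar(max) so STRING_AGG doesn't truncate at 4000 chars
    SELECT @csv = STRING_AGG(
                      CAST(@prefix + N'.' + QUOTENAME(name) AS nvarchar(max)),
                      N', ')
                  WITHIN GROUP (ORDER BY column_id)
    FROM sys.columns
    WHERE object_id = OBJECT_ID(@table);
    RETURN @csv;
END
```

Calling SELECT dbo.ColumnCsv(N'dbo.source', N'S') would then return something like S.[col1], S.[col2], S.[col3] for splicing into the dynamic MERGE text.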
Everything that I have read regarding the MERGE statement says that you need to specify the columns for your INSERT. If you are looking for a quick way to get the INSERT statement, you can right-click the table in SSMS and select Script Table As -> INSERT To -> Clipboard. You can then paste this into your query and alter just the VALUES part.
There's simply no advantage to using MERGE in this situation. Why overcomplicate? Stick to the KISS principle, for chrissake.
Anyways, here's the script:
declare
@targetTableName varchar(100) = 'target'
,@targetSchemaName varchar(20) = 'dbo'
,@sourceTableName varchar(100) = 'source'
,@sourceSchemaName varchar(20) = 'dbo2'
,@matchCondition varchar(50) = 't.id = s.id'
,@columns varchar(max)
set @columns = (select ','+quotename(c.name)
from sys.tables t
join sys.columns as c on t.object_id = c.object_id
join sys.schemas s on s.schema_id = t.schema_id
where t.name = @targetTableName and s.name = isnull(@targetSchemaName, s.name)
for xml path(''))
--each column name starts with a leading comma
declare @sql varchar(max) = '
merge #target t
using #source s on #matchCondition
when not matched then
insert (#columns)
values (#sourceColumns)'
set @sql =
replace(replace(replace(replace(replace(@sql
, '#matchCondition', @matchCondition)
--replace #columns with the column list, leading comma removed
, '#columns', stuff(@columns, 1, 1, ''))
--replace #sourceColumns with the column list prefixed with 's.', leading comma removed
, '#sourceColumns', stuff(replace(@columns, ',', ',s.'), 1, 1, ''))
, '#target', quotename(@targetSchemaName)+'.'+quotename(@targetTableName))
, '#source', quotename(@sourceSchemaName)+'.'+quotename(@sourceTableName))
print @sql
--exec(@sql)
And we'll get something like this:
merge [dbo].[target] t
using [dbo2].[source] s on t.id = s.id
when not matched then
insert ([column1], [column2], [column3], [column4])
values (s.[column1], s.[column2], s.[column3], s.[column4])