Synapse: Copy Activity, Upsert option: The data type xml cannot be used as an operand to the UNION, INTERSECT or EXCEPT operators because it is not comparable - azure-synapse

I checked the database, and the sink table has two xml columns, which is why I am getting this error. How can I overcome it? I don't want to change the data types in the table.

I used a Script activity in Synapse to cast the xml columns to nvarchar(max) in each table and then union the tables into a staging table.
I then copied the staged data to the target table, which has xml columns, using a Copy activity. The steps are below.
The source and target tables both have xml columns.
Use the following script in the Script activity to dynamically cast every xml column to nvarchar(max):
DECLARE @column_list varchar(max)
-- Build the select list, casting xml columns (system_type_id 241) to nvarchar(max)
SET @column_list = (
    SELECT
        STUFF((SELECT ',' + CASE WHEN system_type_id = 241 THEN 'CAST(' + name + ' AS nvarchar(max)) AS ' + name ELSE name END
               FROM sys.columns SCI
               WHERE SCI.object_id = SCO.object_id
               ORDER BY column_id
               FOR XML PATH('')), 1, 1, '') AS 'a'
    FROM sys.columns SCO
    WHERE object_id = OBJECT_ID('table1')
    GROUP BY object_id)
DECLARE @sql nvarchar(max) = ''
-- Union the three tables into a staging table using the cast column list
SET @sql = 'DROP TABLE IF EXISTS union_stg_table; SELECT * INTO union_stg_table FROM (
SELECT ' + @column_list + ' FROM table1
UNION
SELECT ' + @column_list + ' FROM table2
UNION
SELECT ' + @column_list + ' FROM table3
) A'
EXEC sp_executesql @sql
Copy the staging table produced by the Script activity into the target table using a Copy activity. The target table keeps its xml data type.
As wbob-MSFT suggested, you can also use a stored procedure for the transformation. The same approach works for INTERSECT and EXCEPT between two tables.
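As a rough illustration of the EXCEPT case, here is a minimal sketch that assumes two hypothetical tables with an xml column named payload (the table variables, column names and sample rows are made up for the example). The xml values are compared as text and cast back to xml at the end:

```sql
-- Hypothetical sample tables; payload is an xml column.
DECLARE @t1 TABLE (id int, payload xml);
DECLARE @t2 TABLE (id int, payload xml);

INSERT INTO @t1 VALUES (1, N'<a>1</a>'), (2, N'<a>2</a>');
INSERT INTO @t2 VALUES (2, N'<a>2</a>'), (3, N'<a>3</a>');

-- Rows of @t1 that do not appear in @t2.
-- xml is not comparable, so both sides are cast to nvarchar(max) first,
-- and the surviving rows are cast back to xml in the outer query.
SELECT d.id, CAST(d.payload AS xml) AS payload
FROM (
    SELECT id, CAST(payload AS nvarchar(max)) AS payload FROM @t1
    EXCEPT
    SELECT id, CAST(payload AS nvarchar(max)) AS payload FROM @t2
) AS d;
```

Swapping EXCEPT for INTERSECT gives the rows common to both tables. Keep in mind that the comparison is done on the serialized text of the xml values.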

The xml data type is not supported in Synapse:
https://learn.microsoft.com/en-us/azure/synapse-analytics/sql/develop-tables-data-types#unsupported-data-types

Related

Dynamic SQL causing SQL statement failed error

I am trying to select all the contents of all the columns stored in a variable from a table in SQL Server using dynamic SQL.
Here is my code:
IF #UserChoice# = 'MANGO' BEGIN
DECLARE @columns NVARCHAR(MAX)
SELECT @columns = COALESCE(@columns + ', ', '') + Cols
FROM (
    SELECT DISTINCT x.value('local-name(.)', 'SYSNAME') AS Cols
    FROM DBO.Fruits AS t
    CROSS APPLY (SELECT t.* FOR XML PATH(''), TYPE, ROOT('root')) AS t1(c)
    CROSS APPLY c.nodes('/root/*') AS t2(x)
) temp
DECLARE @sql NVARCHAR(MAX) = 'SELECT ' + @columns + ' FROM DBO.Foods;'
EXEC sp_executesql @sql;
END
This code involves two tables: dbo.Fruits has the columns of interest, and dbo.Foods has every column that exists in dbo.Fruits plus additional columns and rows. However, dbo.Fruits has some null columns, that is, columns with no data in any row. With the help of XML, the null columns are removed and only the non-null columns are stored in the variable @columns.
A dynamic SQL statement then selects the columns listed in @columns from the table dbo.Foods.
The filtering of null columns works; however, when I try to run the dynamic SQL, I get an error saying SQL statement failed.
FYI: the table holds a large amount of data. I have also tried the timeout feature, but it did not help.
Any help is appreciated.
Thanks in advance.

How to convert XML type return text into select columns

I'm trying to get the column names of a table using the XML data type and information_schema.columns. When I use the result in another SELECT statement, I get the column names repeated on every row instead of the result set. I have even tried casting it to varchar, but it still fails. What have I done wrong?
DECLARE @TSQL1 varchar(1000);
SELECT @TSQL1 = CAST((SELECT SUBSTRING((SELECT ', ' + QUOTENAME(COLUMN_NAME)
FROM [ProdLS].[ information_schema.columns]
WHERE table_name = 'roles'
ORDER BY ORDINAL_POSITION
FOR XML PATH('')), 3, 200000)) AS varchar(max));
SELECT @TSQL1
FROM [aubeakon_scrm4].[acl_roles]
My query to get the results from roles table using the column name retrieved from.
You cannot execute dynamic SQL like that. You need to use sp_executesql, and the dynamic SQL must be declared as nvarchar(max).
You should also use .value to unescape the XML.
DECLARE @TSQL1 nvarchar(max) = N'
SELECT
' + STUFF((
    SELECT ', ' + QUOTENAME(COLUMN_NAME)
    FROM [ProdLS].[information_schema].columns
    WHERE table_name = 'roles'
    ORDER BY ORDINAL_POSITION
    FOR XML PATH(''), TYPE
).value('text()[1]', 'nvarchar(max)'), 1, LEN(', '), '') + '
FROM [aubeakon_scrm4].[acl_roles];
';
EXEC sp_executesql @TSQL1;
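To see why the .value step matters, here is a small sketch using a hypothetical table variable of names that contain XML special characters (the names are invented for the example). Without TYPE and .value, FOR XML PATH('') returns entity-escaped text, so an ampersand comes back as &amp;amp; and a less-than sign as &amp;lt;:

```sql
-- Hypothetical names containing characters that XML escapes.
DECLARE @names TABLE (name sysname);
INSERT INTO @names VALUES (N'R&D'), (N'a<b');

-- Plain FOR XML PATH(''): the result is text with XML entities left in place.
SELECT STUFF((
    SELECT ', ' + QUOTENAME(name)
    FROM @names
    ORDER BY name
    FOR XML PATH('')
), 1, 2, '') AS escaped_list;

-- TYPE + .value(): the xml result is converted back to plain text,
-- so the special characters come out unescaped.
SELECT STUFF((
    SELECT ', ' + QUOTENAME(name)
    FROM @names
    ORDER BY name
    FOR XML PATH(''), TYPE
).value('.', 'nvarchar(max)'), 1, 2, '') AS unescaped_list;
```

Only the second form is safe to splice into dynamic SQL when column names may contain such characters.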

Function to return a comma separated list of column names

I would like to create a function that returns a comma separated list of field names for any given table. The function should accept the database, schema and table name as input and return the comma separated list.
I can do this in a stored procedure, but I want to do it in a function so I can join it into datasets. However, dynamic SQL is not allowed in a function, so how can I do this?
Here is the proc that I want to duplicate in a function:
alter proc dbo.usp_generate_column_name_string
@database varchar(100), @schema varchar(100), @table varchar(100)
as
declare @str varchar(max) = '
select stuff((select '','' + name as [text()] from
(
select c.name from ' + @database + '.sys.tables a
inner join ' + @database + '.sys.schemas b on b.schema_id = a.schema_id
inner join ' + @database + '.sys.columns c on c.object_id = a.object_id
where b.name = ''' + @schema + ''' and a.name = ''' + @table + ''') x
for xml path ('''')),1,1,'''')
'
exec (@str)
go
exec dbo.usp_generate_column_name_string 'test', 'dbo', 'jl1_tmp'
There are many ways to do it. One easy way is to insert the proc result into a temp table and use that in a join:
create table #coltemp(colList varchar(max))
insert into #coltemp
exec dbo.usp_generate_column_name_string 'test' , 'dbo','jl1_tmp'
select * from #coltemp
See the question "Insert results of a stored procedure into a temporary table" for other ways to insert proc results into a temp table.
Here is the basic idea:
create function usp_generate_column_name_string (
    @schema varchar(100),
    @table varchar(100)
)
returns varchar(max) as
begin
    return (select stuff( (select ',' + column_name
                           from information_schema.columns
                           where table_name = @table and table_schema = @schema
                           for xml path ('')
                          ), 1, 1, ''
                        )
           );
end;
Notes:
This doesn't handle special characters in the column names. I'm not sure how you want to escape those, but the logic is easily adjusted.
Database is left out. That is much harder in SQL Server, because the system tables are organized by database. If that is a requirement, you basically cannot do this (easily).
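One possible way to adjust the logic for special characters is sketched below. It is an assumption about what "escaping" should mean here: each name is wrapped with QUOTENAME, and TYPE plus .value is used so that XML entities are not left in the result. The variable values are placeholders taken from the question's example call.

```sql
-- Placeholder inputs; in the function these would be the parameters.
DECLARE @schema sysname = N'dbo',
        @table  sysname = N'jl1_tmp';

-- Comma separated, bracket-quoted column list in ordinal order.
SELECT STUFF((
    SELECT ',' + QUOTENAME(column_name)
    FROM information_schema.columns
    WHERE table_schema = @schema
      AND table_name = @table
    ORDER BY ordinal_position
    FOR XML PATH(''), TYPE
).value('.', 'nvarchar(max)'), 1, 1, '') AS column_list;
```

The same SELECT could replace the body of the return statement in the function, with the return type widened to nvarchar(max).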

isnull for dynamically Generated column

I receive a temp table with dynamically generated columns, say A, B, C, D and so on, from another source.
So I now have the temp table with the generated columns, and I have to write a stored procedure that uses it.
My stored procedure looks like this:
create proc someproc()
as
begin
Insert into #searchtable
select isnull(#temp.*,0.00)
End
#searchtable is a table I created to store the temp table's columns. The problem arises when I want to apply isnull to the #temp columns: the source may deliver 3 columns one time and 4 the next. It changes.
Since the columns are dynamically generated, I cannot name each column and write something like this:
isnull(column1,0.00)
isnull(column2,0.00)
I have to go through all the generated columns and use 0.00 wherever a value is empty.
I tried the following, but it does not work:
isnull(##temp.*,0.00),
Try dynamic code that fetches the column names for your dynamic table from [database].INFORMATION_SCHEMA.COLUMNS
--Get the column names for your dynamic table and add the ISNULL check:
DECLARE @COLS VARCHAR(MAX) = ''
SELECT @COLS = @COLS + ', ISNULL(' + COLUMN_NAME + ', 0.00) AS ' + COLUMN_NAME
FROM tempdb.INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME LIKE '#temp[_]%' -- Dynamic Table (here, Temporary table)
DECLARE @COLNAMES VARCHAR(MAX) = STUFF(@COLS, 1, 1, '')
--Build your Insert Command:
DECLARE @cmd VARCHAR(MAX) = '
INSERT INTO #temp1
SELECT ' + @COLNAMES + ' FROM #temp'
--Execute:
EXEC (@cmd)
I hope I understood your comment correctly:
CREATE PROCEDURE someproc
AS
-- Temp tables live in tempdb, so the lookups are qualified with tempdb
IF OBJECT_ID(N'tempdb..#searchtable') IS NOT NULL DROP TABLE #searchtable
IF OBJECT_ID(N'tempdb..#temp') IS NOT NULL
BEGIN
    DECLARE @sql nvarchar(max),
            @cols nvarchar(max)
    SELECT @cols = (
        SELECT ',COALESCE('+QUOTENAME([name])+',0.00) as '+QUOTENAME([name])
        FROM tempdb.sys.columns
        WHERE [object_id] = OBJECT_ID(N'tempdb..#temp')
        FOR XML PATH('')
    )
    SELECT @sql = N'SELECT '+STUFF(@cols,1,1,'')+' INTO #searchtable FROM #temp'
    EXEC sp_executesql @sql
END
This SP checks whether the #temp table exists. If it does, it takes all column names from sys.columns and builds a string like ,COALESCE([Column1],0.00) as [Column1], etc. It then builds a dynamic SQL query like:
SELECT COALESCE([Column1],0.00) as [Column1] INTO #searchtable FROM #temp
And execute it. This query result will be stored in #searchtable.
Notes: Use COALESCE instead of ISNULL, and sp_executesql instead of direct exec. It is a good practice.

MS SQL Store Procedure to Merge Multiple Rows into Single Row based on Variable Table and Column Names

I'm working with MS SQL Server 2008. I'm trying to create a stored procedure to merge (perhaps) several rows of data (answers) into a single row on the target table(s). This uses a 'table_name' field and a 'column_name' field from the answers table. The data looks something like this:
answers table
--------------
id int
table_name varchar
column_name varchar
answer_value varchar
So, the target table (insert/update) would come from 'table_name'. Each row from answers would fill one column on the target table.
table_name_1 table
--------------
id int
column_name_1 varchar
column_name_2 varchar
column_name_3 varchar
etc...
Note, there can be many target tables (variable from answers table: table_name_1, table_name_2, table_name_3, etc.) that insert into many columns (column_name_1...2...3) on each target table.
I thought about using a WHILE statement to loop through the answers table. This could build a variable holding the insert/update statement(s) for the target tables, which would then be executed somehow. I also noticed that MERGE looks like it might help with this problem (select/update/insert), but I have very little experience with MS SQL stored procedures. Could someone suggest a strategy or solution to this problem?
Note 6/23/2014: I'm considering using a single Merge statement, but I'm not sure it is possible.
I'm probably missing something, but the basic idea to solve the problem is to use meta-programming, like a dynamic pivot.
In this particular case there is another layer that makes the solution more difficult: the results need to go into separate executions, one per target table, instead of being grouped.
The backbone for a possible solution is
DECLARE @cols AS NVARCHAR(MAX)
DECLARE @query AS NVARCHAR(MAX)
--using a cursor on SELECT DISTINCT table_name FROM answers iterate:
--*Cursor Begin Here*
--mock variable for the first value of the cursor
DECLARE @table AS NVARCHAR(MAX) = 't1'
-- Column list
SELECT @cols = STUFF((SELECT distinct
                          ',' + QUOTENAME(column_name)
                      FROM answers with (nolock)
                      WHERE table_name = @table
                      FOR XML PATH(''), TYPE
                     ).value('.', 'NVARCHAR(MAX)')
                    , 1, 1, '')
--Query definition
SET @query = '
SELECT ' + @cols + '
INTO ' + @table + '
FROM (SELECT column_name, answer_value
      FROM answers
      WHERE table_name = ''' + @table + ''') b
PIVOT (MAX(answer_value) FOR column_name IN (' + @cols + ' )) p '
--select @query
EXEC sp_executesql @query
--select to verify the execution
--SELECT * FROM t1
--*Cursor End Here*
SQLFiddle Demo
The cursor definition is omitted, because I'm not sure it will work on SQLFiddle.
Compared with the template for a dynamic pivot, the column list is filtered by the new table name, and the query definition uses SELECT ... INTO instead of a plain SELECT.
This script does not account for tables that already exist in the database. If that is a possibility, the query can be split in two:
SET @query = '
SELECT TOP 0 ' + @cols + '
INTO ' + @table + '
FROM (SELECT column_name, answer_value
      FROM answers
      WHERE table_name = ''' + @table + ''') b
PIVOT (MAX(answer_value) FOR column_name IN (' + @cols + ' )) p '
to create the table without data, if needed, and
SET @query = '
INSERT INTO ' + @table + ' (' + @cols + ')
SELECT ' + @cols + '
FROM (SELECT column_name, answer_value
      FROM answers
      WHERE table_name = ''' + @table + ''') b
PIVOT (MAX(answer_value) FOR column_name IN (' + @cols + ' )) p '
or a MERGE to insert/update the values in the table.
Another possibility will be to DROP and recreate every table.
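The cursor that the backbone leaves out might look roughly like the sketch below. It is only an illustration under assumptions: #answers is a hypothetical temp table standing in for the answers table, the sample rows are invented, and the dynamic query only selects the pivoted row per table rather than inserting it anywhere.

```sql
-- Hypothetical sample data standing in for the answers table.
CREATE TABLE #answers (id int, table_name varchar(128),
                       column_name varchar(128), answer_value varchar(100));
INSERT INTO #answers VALUES
    (1, 't1', 'column_name_1', 'a'),
    (2, 't1', 'column_name_2', 'b'),
    (3, 't2', 'column_name_1', 'c');

DECLARE @table sysname, @cols nvarchar(max), @query nvarchar(max);

-- One iteration per distinct target table.
DECLARE table_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT DISTINCT table_name FROM #answers;

OPEN table_cursor;
FETCH NEXT FROM table_cursor INTO @table;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Column list for the current target table.
    SELECT @cols = STUFF((SELECT DISTINCT ',' + QUOTENAME(column_name)
                          FROM #answers
                          WHERE table_name = @table
                          FOR XML PATH(''), TYPE
                         ).value('.', 'nvarchar(max)'), 1, 1, '');

    -- Pivot the answers of the current table into a single row.
    -- The table name is passed as a parameter instead of being concatenated.
    SET @query = N'SELECT ' + @cols + N'
        FROM (SELECT column_name, answer_value
              FROM #answers
              WHERE table_name = @t) b
        PIVOT (MAX(answer_value) FOR column_name IN (' + @cols + N')) p;';

    EXEC sp_executesql @query, N'@t sysname', @t = @table;

    FETCH NEXT FROM table_cursor INTO @table;
END

CLOSE table_cursor;
DEALLOCATE table_cursor;
DROP TABLE #answers;
```

Replacing the SELECT inside @query with the INSERT or MERGE variant would write each pivoted row into its target table instead of returning it.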
Approach I took to this complex problem:
Create several temporary tables to work with your data
Select and populate the temporary tables with the data
Use dynamic pivoting to pivot the rows into one row
Use a CURSOR with WHILE loop for multiple table entries
SET @query with the dynamically built MERGE statement
EXECUTE(@query)
Drop temporary tables