Replacing unwanted portions of a string value in SQL column - sql

I have a table in the following format where COL1 contains a unique identifier and COL2 contains a collection of phone numbers, each followed by a tag (<abc> or <def>) and delimited by a pipe (|). The number of phone records in each row is unknown: it may contain just one phone number followed by a tag, or up to 10.
Table
----------
COL1 : COL2
----------
ID1 : 1234567890<abc>|4312314124<abc>|1232345133<def>|4131234131<abc>|41234134132<def>
I need to copy this data into a new table with the result in the following format, i.e. remove every portion of the string that carries the tag <def>.
Table
----------
COL1 : COL2
----------
ID1 : 1234567890<abc>,4312314124<abc>,4131234131<abc>
What are the best ways to do this to get optimum performance? I need the program to transform data in a table that contains about a million records.

If performance is important then I would suggest DelimitedSplit8K_LEAD. You can use the pipe as a delimiter to split the string, then exclude items (tokens) that don't end with <abc>.
DECLARE @table TABLE (COL1 VARCHAR(10), COL2 VARCHAR(1000));
INSERT @table
VALUES
('ID1','1234567890<abc>|4312314124<abc>|1232345133<def>|4131234131<abc>|41234134132<def>'),
('ID2','2662314129<abc>|7868845133<abc>|6831234131<abc>|41234139999<xxx>|1234567999<abc>');
SELECT t.COL1, ds.item
FROM @table t
CROSS APPLY dbo.DelimitedSplit8K_LEAD(t.COL2,'|') ds
WHERE ds.Item LIKE '%<abc>';
Returns
COL1 item
---------- -----------------
ID1 1234567890<abc>
ID1 4312314124<abc>
ID1 4131234131<abc>
ID2 2662314129<abc>
ID2 7868845133<abc>
ID2 6831234131<abc>
ID2 1234567999<abc>
Then you use XML PATH for concatenation like this:
DECLARE @table TABLE (COL1 VARCHAR(10), COL2 VARCHAR(1000));
INSERT @table
VALUES
('ID1','1234567890<abc>|4312314124<abc>|1232345133<def>|4131234131<abc>|41234134132<def>'),
('ID2','2662314129<abc>|7868845133<abc>|6831234131<abc>|41234139999<xxx>|1234567999<abc>');
SELECT t.COL1, stripBadNumbers.newString
FROM @table t
CROSS APPLY
(VALUES((
SELECT ds.item
FROM dbo.DelimitedSplit8K_LEAD(t.COL2,'|') ds
WHERE ds.Item LIKE '%<abc>'
FOR XML PATH(''), TYPE
).value('.', 'varchar(1000)'))) stripBadNumbers(newString);
Returns:
COL1 newString
---------- -------------------------------------------------------------------
ID1 1234567890<abc>4312314124<abc>4131234131<abc>
ID2 2662314129<abc>7868845133<abc>6831234131<abc>1234567999<abc>
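Note that the result above has no separator between the numbers, while the question asks for commas. Here is a minimal sketch of one way to add them. It assumes dbo.DelimitedSplit8K_LEAD is installed and returns ItemNumber and Item columns, as the commonly circulated community version does. Each kept token gets a leading comma, and STUFF removes the first one.

```sql
DECLARE @table TABLE (COL1 VARCHAR(10), COL2 VARCHAR(1000));
INSERT @table
VALUES
('ID1','1234567890<abc>|4312314124<abc>|1232345133<def>|4131234131<abc>|41234134132<def>');

SELECT t.COL1,
       newString = STUFF((
           SELECT ',' + ds.Item                 -- leading comma on every kept token
           FROM dbo.DelimitedSplit8K_LEAD(t.COL2, '|') ds
           WHERE ds.Item LIKE '%<abc>'          -- keep only tokens tagged <abc>
           ORDER BY ds.ItemNumber               -- assumed column: original position
           FOR XML PATH(''), TYPE
       ).value('.', 'varchar(1000)'), 1, 1, '') -- strip the first comma
FROM @table t;
```

The ORDER BY is there to keep the tokens in their original order; drop it if ItemNumber is not available in your copy of the splitter.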

That string of yours can easily be transformed into some XML basically using replace(). The phone numbers with the right tag can then be selected using XQuery. As a bonus this might work with an arbitrary number of phone numbers.
(I don't get your schema, so I use my own. Translate it into yours yourself.)
CREATE TABLE elbat
(nmuloc nvarchar(MAX));
INSERT INTO elbat
(nmuloc)
VALUES ('1234567890<abc>|4312314124<abc>|1232345133<def>|4131234131<abc>|41234134132<def>');
WITH
cte AS
(
SELECT convert(xml,
concat('<phonenumbers><phonenumber number="',
replace(replace(substring(nmuloc,
1,
len(nmuloc) - 1),
'<',
'" tag="'),
'>|',
'"/><phonenumber number="'),
'"/></phonenumbers>')) phonenumbers
FROM elbat
)
SELECT stuff((SELECT ',' + nodes.node.value('concat(./@number, "<", ./@tag, ">")',
'nvarchar(max)')
FROM cte
CROSS APPLY phonenumbers.nodes('/phonenumbers/phonenumber[@tag="abc"]') nodes(node)
FOR XML PATH(''),
TYPE).value('(.)[1]',
'nvarchar(max)'),
1,
1,
'');
But while you're at it, you should really consider normalizing your schema: stop using delimiter-separated lists, and stop storing the non-atomic number-and-tag combination in a single string.

I didn't understand your question at first, but you can use the following code if your SQL Server is 2016 or later. I think it performs well.
INSERT INTO table2 (COL1, COL2)
SELECT
t.COL1,
-- Prepend a comma to each kept token, then strip the first comma with STUFF.
STUFF((SELECT N',' + s.[value] FROM STRING_SPLIT(t.COL2, '|') AS s WHERE s.[value] LIKE '%<abc>' FOR XML PATH(''), TYPE)
.value('.', 'nvarchar(max)'), 1, 1, N'') AS COL2
FROM
table1 AS t;
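On SQL Server 2017 or later, the FOR XML step can be replaced by STRING_AGG. This is a sketch only, using a table variable named @table1 as a stand-in for the real table. Keep in mind that STRING_SPLIT does not promise to return the tokens in their original order.

```sql
-- @table1 is a hypothetical stand-in for the source table.
DECLARE @table1 TABLE (COL1 VARCHAR(10), COL2 VARCHAR(1000));
INSERT @table1
VALUES
('ID1','1234567890<abc>|4312314124<abc>|1232345133<def>|4131234131<abc>|41234134132<def>');

SELECT t.COL1,
       COL2 = (SELECT STRING_AGG(s.[value], ',')        -- join kept tokens with commas
               FROM STRING_SPLIT(t.COL2, '|') AS s      -- split on the pipe
               WHERE s.[value] LIKE '%<abc>')           -- drop everything not tagged <abc>
FROM @table1 AS t;
```

A row with no <abc> token at all gets NULL in COL2 here, so filter or wrap it in ISNULL if the target column does not allow NULLs.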

Related

Split a column with comma delimiter

I have a table with 3 columns with the data given below.
ID | Col1 | Col2 | Status
1 8007590006 8002240001,8002170828 I
2 8002170828 8002000004 I
3 8002000001 8002240001 I
4 8769879809 8002000001 I
5 8769879809 8002000001 I
Col2 can contain multiple comma delimited values. I need to update status to C if there is a value in col2 that is also present in col1.
For example, for ID = 1, col2 contains 8002170828 which is present in Col1, ID = 2. So, status = 'C'
From what I tried, I know it won't work where there are multiple values as I need to split that data and get individual values and then apply update.
UPDATE Table1
SET STATUS = 'C'
WHERE Col1 IN (SELECT Col2 FROM Table1)
If you are using SQL Server 2016 or later, then STRING_SPLIT comes in handy:
WITH cte AS (
SELECT ID, Col1, value AS Col2
FROM Table1
CROSS APPLY STRING_SPLIT(Col2, ',')
)
UPDATE t1
SET Status = 'C'
FROM Table1 t1
INNER JOIN cte t2
ON t1.Col1 = t2.Col2;
This answer is intended as a supplement to Tim's answer
As you don't have the native string split that came in 2016 we can make one:
CREATE FUNCTION dbo.STRING_SPLIT
(
@List NVARCHAR(MAX),
@Delimiter NVARCHAR(255)
)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
(
SELECT y.i.value('(./text())[1]', 'nvarchar(4000)') as value
FROM
(
SELECT x = CONVERT(XML, '<i>'
+ REPLACE(@List, @Delimiter, '</i><i>')
+ '</i>').query('.')
) AS a CROSS APPLY x.nodes('i') AS y(i)
);
GO
--credits to sqlperformance.com for the majority of this code - https://sqlperformance.com/2012/07/t-sql-queries/split-strings
Now Tim's answer should work out for you, so I won't need to repeat it here
I chose an XML-based approach because it performs well, and your data seems sane and won't have any XML characters in it. If it ever does contain XML characters such as > that would break the parsing, they should be escaped before the split and unescaped afterwards.
If you aren't allowed to create functions, you can extract everything between the RETURN and the GO, insert it into Tim's query, tweak the variable names to be column names, and it will still work.
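To illustrate the escaping remark, here is a rough sketch of the same XML split with the three problematic characters escaped first. The sample list is made up. No manual unescape is needed after the split, because .value() hands back the decoded text. It assumes the delimiter itself is not one of &, < or >.

```sql
DECLARE @List NVARCHAR(MAX) = N'a&b,c<d,e';   -- made-up sample containing XML characters
DECLARE @Delimiter NVARCHAR(255) = N',';

SELECT y.i.value('(./text())[1]', 'nvarchar(4000)') AS value
FROM
(
    SELECT x = CONVERT(XML, '<i>'
        + REPLACE(
              -- escape & first, so the entities added for < and > are not escaped again
              REPLACE(REPLACE(REPLACE(@List, '&', '&amp;'), '<', '&lt;'), '>', '&gt;'),
              @Delimiter, '</i><i>')
        + '</i>')
) AS a
CROSS APPLY a.x.nodes('i') AS y(i);
-- returns the three values a&b, c<d and e
```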

Dynamic comma separated string into different columns

Could someone please help me with this strange scenario? I have data as given below.
DECLARE @TABLE TABLE
(
ID INT,
PHONE001 VARCHAR(500)
)
INSERT @TABLE
SELECT 1,'01323840261,01323844711' UNION ALL
SELECT 2,'' UNION ALL
SELECT 3,',01476862000' UNION ALL
SELECT 4,'01233625418,1223822583,125985' UNION ALL
SELECT 5,'2089840022,9.99021E+13'
I am trying to put each comma-separated value in a separate column. The maximum number of columns depends on the longest comma-separated string.
Expected Output
1|01323840261|01323844711|''
2|''|''|''
3|01476862000|''|''|
4|01233625418|1223822583|125985|
5|2089840022|9.99021E+13|''|
try
select id,T.c.value('t[1]','varchar(50)') as col1,
T.c.value('t[2]','varchar(50)') as col2 ,
T.c.value('t[3]','varchar(50)') as col3 from
(select id,cast ('<t>'+ replace(PHONE001,',','</t><t>') +'</t>'
as xml) x
from @TABLE) a cross apply x.nodes('.') t(c)

SQL Server: Convert single row to comma delimited (separated) format

As the title states, I need help in converting a single row of data E.g,
col1 col2 col3 <-- These are column names
value1 value2 value3
To something like
dataResult <-- this is the column name from running the procedure or call
value1,value2,value3
The requirements are that this call ( or rather procedure) needs to be able to accept the results of sql queries of any column length and is able to convert that row to a comma delimited string format. Been stuck at this for weeks any help would be greatly appreciated...
EDIT*
Assume the unique key is the first column. Also assume that only 1 row will be returned with each query. Multiple rows will never occur.
The idea is to convert that row to a comma separated string without having to select the column names manually (in a sense automatically convert the query results)
You might try it like this:
A declared table variable to mock-up as test table. Be aware of the NULL value in col2!
DECLARE @tbl TABLE(col1 VARCHAR(100),col2 VARCHAR(100),col3 VARCHAR(100));
INSERT INTO @tbl VALUES('test1',NULL,'test3');
--This is the query:
SELECT
STUFF(
(
SELECT ',' + elmt.value('.','nvarchar(max)')
FROM
(
SELECT
(
/*YOUR QUERY HERE*/
SELECT TOP 1 *
FROM @tbl
/*--------------------*/
FOR XML AUTO ,ELEMENTS XSINIL,TYPE
)
) AS A(t)
CROSS APPLY t.nodes('/*/*') AS B(elmt)
FOR XML PATH('')
),1,1,'')
FOR XML AUTO returns each row as XML with all the values as attributes, and NULL values are omitted, so the returned string would not include the full count of values. Specifying ELEMENTS XSINIL switches to element-centric XML and forces the engine to include NULL values. CROSS APPLY t.nodes('/*/*') then returns all the elements as a derived table, and the rest is re-concatenation.
See the double comma in the middle! This is the NULL value of col2
test1,,test3
ATTENTION: Be aware that the whole approach will break if a comma is part of a (string) column...
Hint
A solution with XML or JSON would be better. Comma separated values are outdated...
Apply the following approach:
Use FOR XML to build the comma-separated list.
Get the column names from INFORMATION_SCHEMA.COLUMNS.
Select TOP (1) to get the first row, as you need.
Demo:-
Create database MyTestDB
go
Use MyTestDB
go
Create table Table1 ( col1 varchar(10), col2 varchar(10),col3 varchar(10))
go
insert into Table1 values ('Value1','Value2','Value3')
insert into Table1 values ('Value11','Value12','Value13')
insert into Table1 values ('Value21','Value22','Value23')
go
Declare @Values nVarchar(400),
@TableName nvarchar (100),
@Query nvarchar(max)
Set @TableName = 'Table1'
Select @Values = Stuff(
(
Select '+'','' + ' + C.COLUMN_NAME
From INFORMATION_SCHEMA.COLUMNS As C
Where C.TABLE_SCHEMA = T.TABLE_SCHEMA
And C.TABLE_NAME = T.TABLE_NAME
Order By C.ORDINAL_POSITION
For Xml Path('')
), 1, 2, '')
From INFORMATION_SCHEMA.TABLES As T
where TABLE_NAME = @TableName
select @Values = right(@Values,len(@Values)-4)
select @Query = 'select top(1)' + @Values + ' from ' + @TableName
exec sp_executeSQL @Query
Result:-

SQL for concatenating strings/rows into one string/row? (How to use FOR XML PATH with INSERT?)

I am concatenating several rows/strings in an table (on Microsoft SQL Server 2010) into a string by using a method as suggested here:
SELECT ',' + col FROM t1 FOR XML PATH('')
However, if I try to insert the resulting string as (single) row into another table like so:
INSERT INTO t2
SELECT ', ' + col FROM t1 FOR XML PATH('')
I receive this error message:
The FOR XML clause is not allowed in a INSERT statement.
t2 currently has a single column of type NVARCHAR(80). How can I overcome this problem, i.e. how can I collapse a table t1 with many rows into a table t2 with row that concatenates all the strings from t1 (with commas)?
Rather than xml path why not do it like this?
DECLARE @Cols VARCHAR(8000)
SELECT @Cols = COALESCE(@Cols + ', ', '') +
ISNULL(col, 'N/A')
FROM t1
Insert into t2 values(@Cols);
You need to cast it back to an nvarchar() before inserting. I use this method; it deletes the first separator as well, and because of the , type part it handles entities correctly.
insert into t2
select stuff((
select ', ' + col from t1
for xml path(''), type
).value('.', 'nvarchar(80)'), 1, 2, '')
So you concatenate all col with prepending comma+space as an xml-object. Then you take the .value() of child with xquery-path . which means "take the child we are at, don't traverse anywhere". You cast it as an nvarchar(80) and replace a substring starting at position 1 and length 2 with an empty string ''. So the 2 should be replaced with however long your separator is.
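To see what "handles entities correctly" means in practice, here is a small sketch that compares the plain FOR XML PATH('') result with the , type plus .value() version. The table variable @t1 and its two sample rows are made up for illustration.

```sql
-- @t1 and its rows are hypothetical sample data.
DECLARE @t1 TABLE (col NVARCHAR(20));
INSERT @t1 VALUES (N'a&b'), (N'c<d');

SELECT
    -- without TYPE the result is a string that still carries XML entities (&amp;, &lt;)
    plain = STUFF((SELECT ', ' + col FROM @t1
                   FOR XML PATH('')), 1, 2, ''),
    -- with TYPE and .value() the entities are decoded back to & and <
    typed = STUFF((SELECT ', ' + col FROM @t1
                   FOR XML PATH(''), TYPE).value('.', 'nvarchar(80)'), 1, 2, '');
```

Without an ORDER BY in the subqueries the order of the concatenated values is not guaranteed, so add one if the order matters.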

How can I combine multiple rows into one during a Select?

I'm joining a bunch of tables and then inserting that data into a table variable. I then SELECT those records from the table. The data looks like this:
As you can see from the example, the unique data is only in column 7 and 8. In this example, there's only two rows. But it can be an infinite number. So instead of sending a bunch of rows to the client and then sorting out the data, I want to do it in SQL and only send back one row.
Since all of the data is the same, except for two columns, I wanted to concatenate the data and separate them by commas. This will make the client side operations much easier.
So in the end, I'll have one row and Col7 and Col8 will look like this:
Any ideas on how to accomplish this task?
You could try using FOR XML:
SELECT STUFF((SELECT ',' + Col7 FROM @test
FOR XML PATH('')), 1, 1, '' ) as col7
Assuming you want to collapse the entire table into a single row, and discard the data in columns like ID:
DECLARE @x TABLE(Col7 varchar(255), Col8 varchar(255));
INSERT @x SELECT 'foo','bar'
UNION ALL SELECT 'blat','splunge';
SELECT Col7 = STUFF((SELECT ',' + Col7 FROM @x FOR XML PATH(''),
TYPE).value(N'./text()[1]', N'varchar(max)'), 1, 1, ''),
Col8 = STUFF((SELECT ',' + Col8 FROM @x FOR XML PATH(''),
TYPE).value(N'./text()[1]', N'varchar(max)'), 1, 1, '');
Result:
Col7 Col8
-------- -----------
foo,blat bar,splunge
On SQL Server 2017+, it is much simpler:
SELECT Col7 = STRING_AGG(Col7, ','),
Col8 = STRING_AGG(Col8, ',')
FROM @x;