Negative indexing with charindex() and substring function - sql

I have a column that holds the data in the following format:
Field Name
123_456_ABC_DEF
12_34_456_XYZ_PQR
LMN_OPQ_123_456
In each case I need the last two blocks of data, i.e.
ABC_DEF
XYZ_PQR
123_456
Is there a way to use charindex() in a manner where it counts '_' from the right side of the string?

Here's an unreadable & slightly mad way of doing it :-)
-- DDL and sample data population
DECLARE @tbl TABLE (tokens VARCHAR(256));
INSERT @tbl VALUES
('123_456_ABC_DEF'),
('12_34_456_XYZ_PQR'),
('LMN_OPQ_123_456');
-- Reverse the string, keep everything before the second '_', then reverse the result back
SELECT REVERSE(LEFT(REVERSE(tokens),CHARINDEX('_',REVERSE(tokens),CHARINDEX('_',REVERSE(tokens))+1)-1))
FROM @tbl
Basically: reverse the text, search forwards, and reverse the result back at the end (SQL Server T-SQL).
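The same reverse-and-search idea can be laid out one step at a time, which may be easier to read. This is only a sketch, not the answer's own code: the aliases r, p1 and p2 are made up for illustration, and the CASE guard (return the whole string when it has fewer than two underscores) is just one possible way to handle short inputs.

```sql
DECLARE @tbl TABLE (tokens VARCHAR(256));
INSERT @tbl VALUES
('123_456_ABC_DEF'),
('12_34_456_XYZ_PQR'),
('LMN_OPQ_123_456');

SELECT t.tokens,
       -- p2.pos is the position of the second '_' counted from the right,
       -- so the last two blocks are the rightmost (p2.pos - 1) characters.
       CASE WHEN p2.pos > 0
            THEN RIGHT(t.tokens, p2.pos - 1)
            ELSE t.tokens            -- assumption: keep the whole string as-is
       END AS last_two
FROM @tbl AS t
CROSS APPLY (SELECT REVERSE(t.tokens) AS rev) AS r
CROSS APPLY (SELECT CHARINDEX('_', r.rev) AS pos) AS p1              -- first '_' from the right
CROSS APPLY (SELECT CHARINDEX('_', r.rev, p1.pos + 1) AS pos) AS p2; -- second '_' from the right
```

For the three sample rows this should give ABC_DEF, XYZ_PQR and 123_456. Without the guard, a value with fewer than two underscores would pass a negative length to RIGHT and raise an error.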

Please try the following solution.
It is based on tokenization via XML and XQuery.
Notable points:
CROSS APPLY tokenizes the input as XML.
The XPath predicate [position() ge (last()-1)] gives us the last two tokens.
SQL
-- DDL and sample data population, start
DECLARE @tbl TABLE (tokens VARCHAR(256));
INSERT @tbl VALUES
('123_456_ABC_DEF'),
('12_34_456_XYZ_PQR'),
('LMN_OPQ_123_456');
-- DDL and sample data population, end
DECLARE @separator CHAR(1) = '_';
SELECT t.*
, REPLACE(c.query('data(/root/r[position() ge (last()-1)])').value('.', 'VARCHAR(256)')
, SPACE(1), @separator) AS Result
FROM @tbl AS t
CROSS APPLY (SELECT TRY_CAST('<root><r><![CDATA[' +
REPLACE(tokens, @separator, ']]></r><r><![CDATA[') +
']]></r></root>' AS XML)) AS t1(c);
Output
tokens            Result
----------------- -------
123_456_ABC_DEF   ABC_DEF
12_34_456_XYZ_PQR XYZ_PQR
LMN_OPQ_123_456   123_456
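To see what the XPath predicate does on its own, it may help to run it against a single, already tokenized value. The snippet below is an illustrative sketch: the variable @x and the column aliases are hypothetical, and the CDATA sections are left out for readability.

```sql
-- One sample value, tokenized the same way the CROSS APPLY does it
DECLARE @x XML = N'<root><r>12</r><r>34</r><r>456</r><r>XYZ</r><r>PQR</r></root>';

SELECT
    -- Only the last two <r> elements pass the predicate
    @x.query('/root/r[position() ge (last()-1)]') AS last_two_nodes,
    -- data() atomizes them; the values are serialized with a single space in between
    @x.query('data(/root/r[position() ge (last()-1)])').value('.', 'VARCHAR(256)') AS space_joined;
```

Because the space-joined value is then passed through REPLACE(..., SPACE(1), separator), any token that itself contains a space would also have that space replaced by the separator.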

Just another option using JSON and a CROSS APPLY
Example
Select NewValue = reverse(JSON_VALUE(JS,'$[0]')+'_'+JSON_VALUE(JS,'$[1]'))
From YourTable A
Cross Apply (values ('["'+replace(string_escape(reverse([Field Name]),'json'),'_','","')+'"]') ) B(JS)
Results
NewValue
ABC_DEF
XYZ_PQR
123_456
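To make the JSON trick easier to follow, here is a small sketch that shows the intermediate array for a single value. The variable @s and the column aliases are made up for illustration; JSON_VALUE and STRING_ESCAPE need SQL Server 2016 or later.

```sql
DECLARE @s VARCHAR(256) = '12_34_456_XYZ_PQR';

SELECT
    B.JS                              AS reversed_json,  -- ["RQP","ZYX","654","43","21"]
    JSON_VALUE(B.JS, '$[0]')          AS elem0,          -- RQP (last block, reversed)
    JSON_VALUE(B.JS, '$[1]')          AS elem1,          -- ZYX (second-to-last block, reversed)
    REVERSE(JSON_VALUE(B.JS, '$[0]') + '_' + JSON_VALUE(B.JS, '$[1]')) AS NewValue  -- XYZ_PQR
FROM (VALUES ('["' + REPLACE(STRING_ESCAPE(REVERSE(@s), 'json'), '_', '","') + '"]')) AS B(JS);
```

Reversing the input first means the last two blocks become array elements 0 and 1, so no element count is needed.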

Related

How to add '' and , for multiple ID in SQL Server

I am writing a SELECT query that filters on multiple IDs, and I have to add the quotes and commas manually (e.g. '12L','22C').
I have around 2000 IDs in an Excel sheet.
Is there a quicker way to add the quotes and commas to all the IDs?
SELECT id, name
FROM table
WHERE id IN ('12L', '22C', 33j, 7k, 44J, 234C)
DECLARE @Ids VARCHAR(MAX) = '12L,22C,33j,7k,44J,234C'
-- Answer to your question: wrap each ID in single quotes.
DECLARE @Splitted VARCHAR(MAX) = STUFF((
SELECT CONCAT(',''', value, '''')
FROM string_split(@Ids, ',')
FOR XML PATH('')), 1, 1, '')
SELECT @Splitted
--'12L','22C','33j','7k','44J','234C'
Or, simplified:
SELECT id, name from table where id in (SELECT value FROM string_split(@Ids, ','))
For more information, see the documentation for string_split (available from SQL Server 2016, compatibility level 130) and concat.
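A self-contained version of the simplified query might look like the sketch below. The table variable @items and its rows are invented purely for illustration, and the LTRIM/RTRIM calls are an added assumption to tolerate stray spaces in a list pasted from Excel.

```sql
DECLARE @Ids VARCHAR(MAX) = '12L, 22C,33j';

-- Hypothetical stand-in for the real table
DECLARE @items TABLE (id VARCHAR(10) PRIMARY KEY, name VARCHAR(50));
INSERT INTO @items (id, name) VALUES
('12L', 'first'),
('22C', 'second'),
('99Z', 'third');

SELECT i.id, i.name
FROM @items AS i
WHERE i.id IN (SELECT LTRIM(RTRIM(s.value))      -- trim spaces around each ID
               FROM STRING_SPLIT(@Ids, ',') AS s);
```

With this sample data only the rows for 12L and 22C should come back, since 33j has no match and 99Z is not in the list.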
Here is a conceptual example for you. It will work in SQL Server 2012 onwards.
It is a three-step process:
Convert input string into XML.
Convert XML into a relational resultset inside the CTE.
Join with a DB table.
SQL
-- DDL and sample data population, start
DECLARE @tbl TABLE (ID INT IDENTITY PRIMARY KEY, Code VARCHAR(10), City VARCHAR(50));
INSERT INTO @tbl (Code, City) VALUES
('10T', 'Miami'),
('45L', 'Orlando'),
('50Z', 'Dallas'),
('70W', 'Houston');
-- DDL and sample data population, end
DECLARE @Str VARCHAR(100) = '22C,45L,50Z,105M'
, @separator CHAR(1) = ',';
DECLARE @parameter XML = TRY_CAST('<root><r><![CDATA[' +
REPLACE(@Str, @separator, ']]></r><r><![CDATA[') +
']]></r></root>' AS XML);
;WITH rs AS
(
SELECT c.value('.', 'VARCHAR(10)') AS Code
FROM @parameter.nodes('/root/r/text()') AS t(c)
)
SELECT t.*
FROM @tbl AS t INNER JOIN
rs ON t.Code = rs.Code;
Two alternatives if you're ok doing the transformation outside of SQL.
As one of the comments on your question suggests, you could do this in Excel using this as a formula:
="'" & A1 & "',"
Replace the "A1" with whatever cell your first ID is in. After you enter the formula, click the cell it's in, and there will be a small square on the bottom right. Double click that and it will apply the formula to every cell in the column, automatically shifting the cell reference to match the current row. You can then copy the values from that column and erase the comma at the end.
You could also use an editor that supports regular expressions, such as SSMS, Azure Data Studio, or Notepad++, and do a Find+Replace:
Paste your IDs in
Hit the replace hotkey (Ctrl+H in all 3 of the ones I listed). There will be an option to enable Regular Expression (SSMS/ADS have a little .* icon, Notepad++ has a labeled radio button). Click it.
Find this:
(\w+)
Replace it with this
'$1',
Copy and paste the formatted IDs into your query. Same as above, you'll have to erase the final comma
This will work as long as your IDs are alphanumeric with no spaces, punctuation, etc. If the formatting is more complex, the regex (the (\w+) you search for) will need to be more complex as well. Using this strategy, you could also get rid of the linebreaks by using the regex (\w+)\r\n.
Hi, you can use the CONCATENATE function in Excel before you copy those IDs into SQL.

Replace string without fixed length

I have some data that I'm looking at that has text formatting stored within an NTEXT field.
I'm happy enough using SQL REPLACE to remove data of a known length and format; however, there are some fields with what looks like colour formatting, and I'm trying to find a way to remove these.
An example of the data is below. If possible, I would like to remove whatever numbers follow the colours in the data, but I can't see how to introduce a wildcard into the REPLACE statement.
Something like '\red***\green\***\blue***' as per Excel, but this doesn't work in SQL Server.
declare @str varchar(1500) = '\red3\green73\blue125;Jimmy Jazz\red31\green73\blue125;'
select @str,
replace(@str,'\red31\green73\blue125;','')
Any pointers would be gratefully received, thanks in advance.
Based on your sample data, it appears that you only need to remove the numbers from your string. You can use patReplace8K or patExtract8K for that. Note the sample data and examples below:
-- Sample data
DECLARE @strings TABLE(stringId INT IDENTITY, string VARCHAR(100));
INSERT @strings VALUES('DeepPurple1978\yellow2\red009;pink\black3322'),
('red202\yellow5\red009;hotpink2'),('purple999\gray65\violet;blue\yellow381');
--==== Solution #1 Patreplace8k
SELECT
s.stringId,
pr.newString
FROM @strings AS s
CROSS APPLY samd.patReplace8K(s.string,'[0-9]','') AS pr;
--==== Solution #2 PatExtract8k + STRING_AGG (SQL 2017+)
SELECT
s.stringId,
NewString = STRING_AGG(pe.Item,'') WITHIN GROUP (ORDER BY pe.ItemNumber)
FROM @strings AS s
CROSS APPLY samd.patExtract8K(s.string,'[0-9]') AS pe
GROUP BY s.stringId;
--==== Solution #3 PatExtract8k + XML Concatenation (Pre SQL 2017)
SELECT
s.stringId,
NewString =
(
SELECT pe.item+''
FROM @strings AS s2
CROSS APPLY samd.patExtract8K(s2.string,'[0-9]') AS pe
WHERE s.stringId = s2.stringid
ORDER BY pe.itemNumber
FOR XML PATH('')
)
FROM @strings AS s
GROUP BY s.stringId;
Each of these solutions returns:
stringId NewString
----------- -------------------------------------
1 DeepPurple\yellow\red;pink\black
2 red\yellow\red;hotpink
3 purple\gray\violet;blue\yellow
The second and third leverage concatenation: the second is compatible with SQL Server 2017+, while the third works with earlier versions (you did not say which version you are on).
To strip only the numbers that follow one or more predefined colors, you could use PatternSplitCM. Note the use of a table variable holding the colors you are looking for; in the real world I'd use a real table.
-- Colors
DECLARE @colors TABLE(color VARCHAR(20) PRIMARY KEY);
INSERT @colors VALUES('red'),('green'),('blue'),('yellow'),('purple'),('grey');
-- Sample data
DECLARE @strings TABLE(stringId INT IDENTITY, string VARCHAR(100));
INSERT @strings VALUES('Burger1978\yellow2\red009;pink\86thisfool'),
('red202\yellow5\red009;Freddy99'),('green999\grey65\violet;blue\yellow381');
SELECT
s.stringId, s.string, NewString =
(
SELECT
(
SELECT SUBSTRING(f.Item, IIF(f.M=0 AND EXISTS (SELECT c.Color FROM @colors AS c
WHERE c.Color = f.L),NULLIF(PATINDEX('%[^0-9]',f.item),0),1),8000)
FROM
(
SELECT ps.ItemNumber, ps.Item, ps.[Matched],
LAG(ps.Item,1,ps.Item) OVER (ORDER BY ps.ItemNumber)
FROM dbo.PatternSplitCM(s.string,'[^0-9\ ;]') AS ps
) AS f(ItemNumber,Item,M,L)
ORDER BY f.ItemNumber
FOR XML PATH(''), TYPE
).value('(text())[1]','varchar(8000)')
)
FROM @strings AS s;
Returns:
stringId string NewString
----------- --------------------------------------------- ----------------------------------------
1 Burger1978\yellow2\red009;pink\86thisfool Burger1978\yellow\red;pink\86thisfool
2 red202\yellow5\red009;Freddy99 red\yellow\red;Freddy99
3 green999\grey65\violet;blue\yellow381 green\grey\violet;blue\yellow
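If installing helper functions is not an option, the whole colour block from the question can also be removed with built-in functions only. The loop below is a rough sketch under a strong assumption taken from the question's sample: every colour block starts with \red followed by a digit and ends at the next semicolon. The variable names @start and @end are made up for illustration.

```sql
DECLARE @str VARCHAR(1500) = '\red3\green73\blue125;Jimmy Jazz\red31\green73\blue125;';
DECLARE @start INT, @end INT;

-- The backslash has no special meaning in a PATINDEX pattern, so it matches literally
SET @start = PATINDEX('%\red[0-9]%', @str);
WHILE @start > 0
BEGIN
    SET @end = CHARINDEX(';', @str, @start);    -- assumed end of the colour block
    IF @end = 0 BREAK;                          -- no terminator found: stop rather than guess
    SET @str = STUFF(@str, @start, @end - @start + 1, '');
    SET @start = PATINDEX('%\red[0-9]%', @str);
END;

SELECT @str AS cleaned;   -- Jimmy Jazz
```

This works on a single variable; for an NTEXT column the value would likely need to be cast to VARCHAR(MAX) or NVARCHAR(MAX) first, and a row-by-row loop like this will be slow on large tables.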

Concatenate/aggregate strings with JSON in SQL Server

This might be a simple question for those who are experienced in working with JSON in SQL Server. I found this interesting way of aggregating strings using FOR XML:
create table #t (id int, name varchar(20))
insert into #t
values (1, 'Matt'), (1, 'Rocks'), (2, 'Stylus')
select id
,Names = stuff((select ', ' + name as [text()]
from #t xt
where xt.id = t.id
for xml path('')), 1, 2, '')
from #t t
group by id
How can I do the same using JSON instead of XML?
You cannot replace the XML approach with JSON. This string concatenation works due to some XML inner peculiarities, which are not the same in JSON.
From SQL Server 2017 onwards you can use STRING_AGG(); with earlier versions, the XML approach is the way to go.
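For reference, the STRING_AGG() equivalent of the query from the question could look like the sketch below. It uses a table variable @t instead of the question's temp table so it can run on its own; that substitution is only for illustration.

```sql
-- Requires SQL Server 2017 or later
DECLARE @t TABLE (id INT, name VARCHAR(20));
INSERT INTO @t VALUES (1, 'Matt'), (1, 'Rocks'), (2, 'Stylus');

SELECT t.id,
       STRING_AGG(t.name, ', ') AS Names   -- one comma-separated list per id
FROM @t AS t
GROUP BY t.id;
```

The order of the names inside each list is not guaranteed unless a WITHIN GROUP (ORDER BY ...) clause is added.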
Some background and a hint
First the hint: The code you showed is not safe for the XML special characters. Check my example below.
First I declare a simple XML
DECLARE @xml XML=
N'<a>
<b>1</b>
<b>2</b>
<b>3</b>
<c>
<d>x</d>
<d>y</d>
<d>z</d>
</c>
</a>';
--The XPath . tells the XML engine to use the current node (and all within)
--Therefore this will return any content within the XML
SELECT @xml.value('.','varchar(100)')
--You can specify the path to get 123 or xyz
SELECT @xml.query('/a/b').value('.','varchar(100)')
SELECT @xml.query('//d').value('.','varchar(100)')
Now your issue to concatenate tabular data:
DECLARE @tbl TABLE(SomeString VARCHAR(100));
INSERT INTO @tbl VALUES('This'),('will'),('concatenate'),('magically'),('Forbidden Characters & > <');
--The simple FOR XML query will tag the column with <SomeString> and each row with <row>:
SELECT SomeString FROM @tbl FOR XML PATH('row');
--But we can create the same without any tags:
--Attention: Look closely, that the result - even without tags - is XML typed and looks like a hyper link in SSMS.
SELECT SomeString AS [*] FROM @tbl FOR XML PATH('');
--Now we can use as a sub-select within a surrounding query.
--The result is returned as string, not XML typed anymore... Look at the forbidden chars!
SELECT
(SELECT SomeString FROM @tbl FOR XML PATH('row'))
,(SELECT SomeString AS [*] FROM @tbl FOR XML PATH(''))
--We can use ,TYPE to enforce the sub-select to be treated as XML typed itself
--This allows to use .query() and/or .value()
SELECT
(SELECT SomeString FROM @tbl FOR XML PATH('row'),TYPE).query('data(//SomeString)').value('.','nvarchar(max)')
,(SELECT SomeString AS [*] FROM @tbl FOR XML PATH(''),TYPE).value('.','nvarchar(max)')
XQuery's data() can be used to concatenate named elements with blanks in between.
The .value() method must be used to un-escape the forbidden characters (e.g. to turn &amp; back into &).
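Applied to the query from the question, the hint about special characters might translate into something like the sketch below. The table variable @t and the sample names containing & and < > are invented here to show the difference; only the ,TYPE and .value() parts come from the explanation above.

```sql
DECLARE @t TABLE (id INT, name VARCHAR(20));
INSERT INTO @t VALUES (1, 'Matt'), (1, 'Rock & Roll'), (2, '<Stylus>');

SELECT t.id,
       Names = STUFF((SELECT ', ' + xt.name
                      FROM @t AS xt
                      WHERE xt.id = t.id
                      FOR XML PATH(''), TYPE      -- keep the sub-select XML typed
                     ).value('.', 'VARCHAR(MAX)') -- read it back as plain, un-escaped text
                     , 1, 2, '')
FROM @t AS t
GROUP BY t.id;
```

Without ,TYPE and .value(), the same query would return the escaped forms, for example Rock &amp; Roll instead of Rock & Roll.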

SQL server 2012 parsing XML with namespaces

I have a table containing rows of xml in the following format:
<msit:message xmlns:wsa="http://URL1" xmlns:msit="http://URL2" xmlns:env="http://URL3">
<env:Body>
<ns0:parent xmlns:ns0="http://URL4">
<ns0:child>123456789</ns0:child>
...
</ns0:parent>
</env:Body>
</msit:message>
in a table named mytable, in a column named data.
I have written the following query:
;with xmlnamespaces('http://URL2' as msit,
'http://URL3' as env,
'http://URL1' as wsa,
'http://URL4' as ns0)
select
t2.field.value('child[1]','varchar(20)') as ban
from mytable
cross apply data.nodes('/message/Body/parent') t2(field)
It returns an empty set, when I need it to return 123456789.
What am I doing wrong?
Thank you
You may need to include the prefixes in the XPath expressions:
declare @mytable table (data xml)
insert into @mytable values
('<msit:message xmlns:wsa="http://URL1" xmlns:msit="http://URL2" xmlns:env="http://URL3">
<env:Body>
<ns0:parent xmlns:ns0="http://URL4">
<ns0:child>123456789</ns0:child>
</ns0:parent>
</env:Body>
</msit:message>')
;with xmlnamespaces('http://URL2' as msit,
'http://URL3' as env,
'http://URL1' as wsa,
'http://URL4' as ns0)
select
t2.field.value('ns0:child[1]','varchar(20)') as ban
from @mytable
cross apply data.nodes('/msit:message/env:Body/ns0:parent') t2(field)
The whole point of namespaces is to differentiate between elements that were brought together from multiple documents.
It is similar to the way we qualify columns with tables' names or aliases, e.g. t1.x Vs. t2.x.
So when you refer to an element you should qualify it with the right namespace.
You might also want to use outer apply instead of cross apply in case there's a missing element.
create table mytable (x xml);
insert into mytable (x) values
(
'
<msit:message xmlns:wsa="http://URL1" xmlns:msit="http://URL2" xmlns:env="http://URL3">
<env:Body>
<ns0:parent xmlns:ns0="http://URL4">
<ns0:child>123456789</ns0:child>
</ns0:parent>
</env:Body>
</msit:message>
'
)
;
;
with xmlnamespaces
(
'http://URL2' as msit
,'http://URL3' as env
,'http://URL1' as wsa
,'http://URL4' as ns0
)
select t2.field.value('ns0:child[1]','varchar(20)') as ban
from mytable
outer apply x.nodes('/msit:message/env:Body/ns0:parent') t2(field)
;
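If declaring every prefix is inconvenient, SQL Server's XQuery also accepts a namespace wildcard of the form *:name. The sketch below is an alternative to the answers above, not a replacement for them; it reuses the sample document and a table variable @mytable for illustration.

```sql
DECLARE @mytable TABLE (data XML);
INSERT INTO @mytable VALUES
(N'<msit:message xmlns:wsa="http://URL1" xmlns:msit="http://URL2" xmlns:env="http://URL3">
<env:Body>
<ns0:parent xmlns:ns0="http://URL4">
<ns0:child>123456789</ns0:child>
</ns0:parent>
</env:Body>
</msit:message>');

-- *:name matches the local name in any namespace, so no WITH XMLNAMESPACES is needed
SELECT t2.field.value('(*:child/text())[1]', 'varchar(20)') AS ban
FROM @mytable
CROSS APPLY data.nodes('/*:message/*:Body/*:parent') AS t2(field);
```

The trade-off is that the wildcard ignores namespaces entirely, so it can also match same-named elements from a namespace you did not intend to read.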

SQL Server 2005: How to perform a split on a string

I have the following string that I need to split from a field called symbols
234|23|HC
This is my current SQL statement
declare @t xml;
Set @t = (
Select symbols from tc for xml auto, elements)
Select @t;
which produces <symbols>234|23|HC</symbols>
but I need to split the string into child nodes so the result is like this:
<symbols>
<symbol>234</symbol>
<symbol>23</symbol>
<symbol>HC</symbol>
</symbols>
A replace version that takes care of the problem characters.
declare @T table(symbol varchar(50))
insert into @T values ('234|23|HC|Some problem chars <> &')
select cast('<symbols><symbol>'+
replace(cast(cast('' as xml).query('sql:column("symbol")') as varchar(max)),
'|',
'</symbol><symbol>')+
'</symbol></symbols> ' as xml)
from @T
Result:
<symbols>
<symbol>234</symbol>
<symbol>23</symbol>
<symbol>HC</symbol>
<symbol>Some problem chars &lt;&gt; &amp;</symbol>
</symbols>
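Once the string has been turned into XML, the same child nodes can also be read back as one row per symbol. The following is a minimal sketch that should run on SQL Server 2005; the aliases c, x and n are hypothetical, and it assumes the input contains no XML special characters (otherwise the escaping step from the answer above would be needed first).

```sql
DECLARE @T TABLE (symbol VARCHAR(50));
INSERT INTO @T VALUES ('234|23|HC');

SELECT n.s.value('.', 'VARCHAR(50)') AS symbol   -- one row per <symbol> element
FROM @T AS t
CROSS APPLY (SELECT CAST('<symbols><symbol>' +
                         REPLACE(t.symbol, '|', '</symbol><symbol>') +
                         '</symbol></symbols>' AS XML) AS x) AS c
CROSS APPLY c.x.nodes('/symbols/symbol') AS n(s);
```

For the sample value this should return three rows: 234, 23 and HC.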