Find the end point of a pattern in SQL Server

There is a comma separated string in a column which looks like
test=1,value=2.2,system=321
I want to extract the value from the string. I can use select PatIndex('%value=%',columnName) and then LEFT, but this only finds where the pattern begins.
How can I identify the end of the pattern value=%, so that the value can be extracted?

Chain a few SUBSTRING calls with CHARINDEX and your PATINDEX.
DECLARE @text VARCHAR(100) = 'test=1,value=2.21954,system=321'
SELECT
    Original = @text,
    Parsed = SUBSTRING( -- Get a portion of the original value
        @text,
        PATINDEX('%value=%',@text) + 6, -- ... starting from the 'value=' (without the 'value=')
        -1 + CHARINDEX( -- ... and get as many characters until the first comma
            ',',
            SUBSTRING( -- ... (find the comma starting from the 'value=' onwards)
                @text,
                PATINDEX('%value=%',@text) + 6,
                100)))
Result:
Original                          Parsed
test=1,value=2.21954,system=321   2.21954
Note that this will fail if there is no comma after your value=: CHARINDEX then returns 0, the computed length becomes -1, and SUBSTRING raises an error. You can filter out such rows with a WHERE.
I strongly suggest storing your values already split in a proper table, so you won't have to deal with string nightmares like this.
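One possible way around the missing-comma case, offered as a sketch rather than as the answer's own method, is to search a copy of the string that has a comma appended, so an end position always exists. The variable name @start is introduced here only for illustration, and NULLIF is used so that a string without value= yields NULL instead of a wrong match.

```sql
-- Sketch: 'value=' is the last entry, so there is no trailing comma.
DECLARE @text VARCHAR(100) = 'test=1,system=321,value=2.2';

-- Position of the first character after 'value=' (NULL when the pattern is absent).
DECLARE @start INT = NULLIF(PATINDEX('%value=%', @text), 0) + 6;

-- Searching @text + ',' guarantees that a comma is found at or after @start.
SELECT Parsed = SUBSTRING(@text, @start, CHARINDEX(',', @text + ',', @start) - @start);
```

Note that the pattern '%value=%' also matches keys that merely end in "value" (for example myvalue=), so this sketch assumes key names are distinct enough for that not to matter.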

You can use CHARINDEX with starting position to find the first comma after the pattern. CROSS APPLY is used to keep the query easier to read:
WITH tests(str) AS (
SELECT 'test=1,value=2.2,system=321'
)
SELECT str, substring(str, pos1, pos2 - pos1) AS match
FROM tests
CROSS APPLY (SELECT PATINDEX('%value=%', str) + 6) AS ca1(pos1)
CROSS APPLY (SELECT CHARINDEX(',', str, pos1 + 1)) AS ca2(pos2)
-- 2.2

First of all, don't store denormalized data in this way, if you want to query them. SQL, the language, isn't good at string manipulation. Parsing and splitting strings can't take advantage of indexes either, which means any query that tried to find eg all records that refer to system 321 would have to scan and parse all rows.
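As a rough illustration of what a normalized layout could look like, the sketch below stores one row per key/value pair. The table, column and index names (dbo.ItemSetting and so on) are hypothetical and not taken from the question.

```sql
-- Hypothetical normalized layout: one row per key/value pair instead of one delimited string.
CREATE TABLE dbo.ItemSetting (
    ItemId       INT          NOT NULL,
    SettingName  VARCHAR(50)  NOT NULL,
    SettingValue VARCHAR(100) NOT NULL,
    CONSTRAINT PK_ItemSetting PRIMARY KEY (ItemId, SettingName)
);

-- An index on (name, value) lets lookups by setting avoid parsing every row.
CREATE INDEX IX_ItemSetting_Name_Value ON dbo.ItemSetting (SettingName, SettingValue);

-- 'test=1,value=2.2,system=321' for item 1 becomes three rows.
INSERT INTO dbo.ItemSetting (ItemId, SettingName, SettingValue)
VALUES (1, 'test', '1'), (1, 'value', '2.2'), (1, 'system', '321');

-- "All records that refer to system 321" is now a plain, indexable predicate.
SELECT ItemId
FROM dbo.ItemSetting
WHERE SettingName = 'system' AND SettingValue = '321';
```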
SQL Server 2016 and JSON
SQL Server 2016 added support for JSON and the STRING_SPLIT function. Earlier versions already provided the XML type. It's better to store complex values as JSON or XML instead of trying to parse the string.
One option is to convert the string into a JSON object and retrieve the value contents, eg :
DECLARE @text VARCHAR(100) = 'test=1,value=2.2,system=321'
select json_value('{"' + replace(replace(@text,',','","'),'=','":"') + '"}','$.value')
This returns 2.2.
The replacements converted the original string into
{"test":"1","value":"2.2","system":"321"}
JSON_VALUE(..., '$.value') then returns the value property of that object.
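The same conversion can also be used to list every key/value pair at once. The following is a sketch that assumes SQL Server 2016 or later with database compatibility level 130 (required by OPENJSON); the STRING_ESCAPE call is an addition of this sketch, meant to keep quotes or backslashes in the data from breaking the generated JSON.

```sql
DECLARE @text VARCHAR(100) = 'test=1,value=2.2,system=321';

-- Escape JSON-special characters first, then turn the delimiters into JSON syntax.
DECLARE @json NVARCHAR(MAX) =
    '{"' + REPLACE(REPLACE(STRING_ESCAPE(@text, 'json'), ',', '","'), '=', '":"') + '"}';

-- OPENJSON returns one row per property of the object.
SELECT [key], [value]
FROM OPENJSON(@json);
```

This still assumes that neither keys nor values contain a comma or an equals sign of their own.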
Earlier SQL Server versions
In earlier SQL Server versions, you can convert that string into an XML element the same way and use XQuery:
DECLARE @text VARCHAR(100) = 'test=1,value=2.2,system=321';
declare @xml varchar(100)='<r ' + replace(replace(@text,',','" '),'=',' ="') + '" />';
select @xml
select cast(@xml as xml).value('(/r[1]/@value)[1]','varchar(20)')
In this case @xml contains :
<r test ="1" value ="2.2" system ="321" />
The query result is 2.2

You can try the following.
DECLARE @xml AS XML
SELECT @xml = Cast(( '<X>' + Replace(txt, ',', '</X><X>') + '</X>' ) AS XML)
FROM (VALUES ('test=1,value=2.2,system=321')) v(txt)
SELECT LEFT(value, Charindex('=', value) - 1) AS LeftPart,
       RIGHT(value, Charindex('=', Reverse(value)) - 1) AS RightPart
FROM (SELECT n.value('.', 'varchar(100)') AS value
      FROM @xml.nodes('X') AS T(n))T
Output
+----------+-----------+
| LeftPart | RightPart |
+----------+-----------+
| test | 1 |
+----------+-----------+
| value | 2.2 |
+----------+-----------+
| system | 321 |
+----------+-----------+

You can try the below query if you are using SQL Server (2016 or above)
SELECT RIGHT(Value,CHARINDEX('=',REVERSE(Value))-1) FROM YourTableName
CROSS APPLY STRING_SPLIT ( ColumnName , ',' )
WHERE Value Like 'Value=%'
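A self-contained variant of this idea is sketched below, assuming SQL Server 2016 or later. The table variable @YourTable and its column are stand-ins for the real table, and SUBSTRING after the first = is used instead of RIGHT with REVERSE, which is a stylistic choice of this sketch.

```sql
-- Stand-in for the real table.
DECLARE @YourTable TABLE (ColumnName VARCHAR(100));
INSERT INTO @YourTable VALUES ('test=1,value=2.2,system=321');

SELECT t.ColumnName,
       -- Everything after the first '=' of the matching fragment.
       ExtractedValue = SUBSTRING(s.value, CHARINDEX('=', s.value) + 1, LEN(s.value))
FROM @YourTable AS t
CROSS APPLY STRING_SPLIT(t.ColumnName, ',') AS s
WHERE s.value LIKE 'value=%';
```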

Related

select and concatenate everything before and after a certain character

I've got a string like AAAA.BBB.CCCC.DDDD.01.A and I'm looking to manipulate this and end up with AAAA-BBB
I've achieved this by writing this debatable piece of code
declare @string varchar(100) = 'AAAA.BBB.CCCC.DDDD.01.A'
select replace(substring(@string,0,charindex('.',@string)) + substring(@string,charindex('.',@string,CHARINDEX('.',@string)),charindex('.',@string,CHARINDEX('.',@string)+1)-charindex('.',@string)),'.','-')
Is there any other way to achieve this which is more elegant and readable ?
I was looking at some string_split operations, but can't wrap my head around it.
If you are open to some JSON transformations, the following approach is an option. You need to transform the text into a valid JSON array (AAAA.BBB.CCCC.DDDD.01.A is transformed into ["AAAA","BBB","CCCC","DDDD","01","A"]) and get the required items from this array using JSON_VALUE():
Statement:
DECLARE @string varchar(100) = 'AAAA.BBB.CCCC.DDDD.01.A'
SET @string = CONCAT('["', REPLACE(@string, '.', '","'), '"]')
SELECT CONCAT(JSON_VALUE(@string, '$[0]'), '-', JSON_VALUE(@string, '$[1]'))
Result:
AAAA-BBB
Notes: With this approach you can easily access all parts from the input string by index (0-based).
I think this is a little cleaner:
declare @string varchar(100) = 'AAAA.BBB.CCCC.DDDD.01.A'
select
replace( -- replace '.' with '-' (A)
    substring(@string, 1 -- in the substring of @string starting at 1
        ,charindex('.', @string -- and going through 1 before the index of '.'(B)
            ,charindex('.',@string)+1) -- that is after the first index of the first '.'
        -1) -- (B)
    ,'.','-') -- (A)
Depending on what is in your string you might be able to abuse PARSENAME into doing it. It is intended for breaking up names like adventureworks.dbo.mytable.mycolumn, so it handles at most four dot-separated parts and returns NULL for longer strings. It works like this:
DECLARE @x as VARCHAR(100) = 'aaaa.bbb.cccc.ddddd'
SELECT CONCAT( PARSENAME(@x,4), '-', PARSENAME(@x,3) )
You could also look at a mix of STUFF to delete the first '.' and replace with '-' then LEFT the result by the index of the next '.' but it's unlikely to be neater than this or Kevin's proposal
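For reference, the STUFF plus LEFT idea might look roughly like the sketch below. The helper variables @firstDot and @stuffed are introduced only for this illustration, and the sketch assumes the string contains at least two dots.

```sql
DECLARE @string VARCHAR(100) = 'AAAA.BBB.CCCC.DDDD.01.A';

-- Replace the first '.' with '-'.
DECLARE @firstDot INT = CHARINDEX('.', @string);
DECLARE @stuffed VARCHAR(100) = STUFF(@string, @firstDot, 1, '-');

-- Keep everything before the next '.' (which is now the first one left).
SELECT LEFT(@stuffed, CHARINDEX('.', @stuffed) - 1);  -- AAAA-BBB
```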
Using string split would likely be as unwieldy:
SELECT CONCAT(MAX(CASE WHEN rn = 1 THEN v END), '-', MAX(CASE WHEN rn = 2 THEN v END))
FROM (
SELECT row_number () over (order by (select 0)) rn, value as v
FROM string_split(@x,'.')
) y WHERE rn IN (1,2)
Because the string is split to rows which then need to be numbered in order to filter and pull the parts you want. This also relies on the strings coming out of string split in the order they were in the original string, which MS do not guarantee will be the case
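On SQL Server 2022 and Azure SQL Database, STRING_SPLIT accepts a third enable_ordinal argument that returns each fragment's position, which removes the reliance on output order. The sketch below assumes one of those platforms; it will not run on SQL Server 2016 to 2019.

```sql
DECLARE @x VARCHAR(100) = 'AAAA.BBB.CCCC.DDDD.01.A';

-- With enable_ordinal = 1, each row carries its 1-based position in the input.
SELECT CONCAT(MAX(CASE WHEN ordinal = 1 THEN value END),
              '-',
              MAX(CASE WHEN ordinal = 2 THEN value END)) AS result
FROM STRING_SPLIT(@x, '.', 1)
WHERE ordinal IN (1, 2);  -- AAAA-BBB
```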

String aggregation using JSON in SQL Server 2016

I would like to format a json string '[{"_":7},{"_":13},{"_":17}]' as '[7,13,17]'
I tried the REPLACE function in T-SQL, but I have to call REPLACE three times to get the desired result.
SELECT REPLACE(REPLACE(REPLACE('[{"_":7},{"_":13},{"_":17}]','},{"_":',', '),'{"_":',''),'}','')
is there a better way to do that?
I am using SQL Server 2016.
After some comments on this post, this is my actual issue.
I have some customer data. Customer Table
CustomerId | Name
1 ABC
2 XYZ
3 EFG
each customer has some area of interest. Customer Area of Interest
CustomerAreaInterestId | FK_CustomerId | FK_AreaOfInterestId
1 1 2
2 1 3
3 1 5
4 2 1
5 2 2
6 3 3
7 3 4
Area of interest table
AreaOfInterestId | Description
1 Interest1
2 Interest2
3 Interest3
4 Interest4
5 Interest5
In the final result set, I have to include area of interest id's as an array of value
[
{
"CustomerName": "ABC",
"AreaofInterest": "[2,3,5]"
},
{
"CustomerName": "XYZ",
"AreaofInterest": "[1,2]"
},
{
"CustomerName": "EFG",
"AreaofInterest": "[3,4]"
}
]
The result contains some other data as well, which I have omitted for brevity.
Short Version
Cast the numeric field to text before trying to aggregate it
From the comments, it looks like the real question is how to use JSON to aggregate strings in SQL Server 2016, as shown in this answer.
SELECT
JSON_VALUE(
REPLACE(
(SELECT _ = someField FROM someTable FOR JSON PATH)
,'"},{"_":"',', '),'$[0]._'
)
or, rewritten for clarity :
SELECT
JSON_VALUE( REPLACE(
(SELECT _ = someField
FROM someTable
FOR JSON PATH)
,'"},{"_":"',', ')
,'$[0]._')
That query works only with string fields. One needs to understand what it does before it can be adapted to other types.
The inner query generates a JSON string from a field's values, eg '[{"_":"value1"},{"_":"value2"}]'.
REPLACE replaces the quotes and separators between objects, changing that array of objects to '[{"_":"value1, value2"}]'. That's a single object in an array, whose single attribute is a comma-separated string.
JSON_VALUE(...,'$[0]._') extracts the _ attribute of that single array item.
That trick can't be used with numeric values because they don't have quotes. The solution is to cast them to text first:
SELECT
JSON_VALUE( REPLACE(
(SELECT _ = CAST(someNumber as nvarchar(20))
FROM someTable
FOR JSON PATH)
,'"},{"_":"',', ')
,'$[0]._')
Eg :
declare @t table (id int)
insert into @t
values
(7),
(13),
(17)
SELECT
    JSON_VALUE( REPLACE(
        (SELECT _ = cast(ID as nvarchar(20))
         FROM @t
         FOR JSON PATH)
    ,'"},{"_":"',', '),'$[0]._')
The only change from the original query is the cast clause.
This produces :
7, 13, 17
This conversion is localized so care must be taken with decimals and dates, to avoid producing unexpected results, eg 38,5, 40,1 instead of 38.5, 40.1.
PS: That's no different than the XML technique, except STUFF is used there to cut off the leading separator. That technique also needs casting numbers to text, eg :
SELECT STUFF(
    ( SELECT N', ' + cast(ID as nvarchar(20))
      FROM @t FOR XML PATH(''),TYPE)
    .value('text()[1]','nvarchar(max)'),
    1,2,N'')
If you want to use only JSON functions (not string-based approach), the next example may help:
DECLARE @json nvarchar(max) = N'[{"_":7},{"_":13},{"_":17}]'
DECLARE @output nvarchar(max) = N'[]'
SELECT @output = JSON_MODIFY(@output, 'append $', j.item)
FROM OPENJSON(@json) WITH (item int '$."_"') j
SELECT @output AS [Result]
Result:
Result
[7,13,17]
Of course, the approach based on string aggregation is also a possible solution:
DECLARE @json nvarchar(max) = N'[{"_":7},{"_":13},{"_":17}]'
SELECT CONCAT(
    N'[',
    STUFF(
        (
        SELECT CONCAT(N',', j.item)
        FROM OPENJSON(@json) WITH (item int '$."_"') j
        FOR XML PATH('')
        ), 1, 1, N''
    ),
    N']'
)
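Applied to the customer tables from the question, the string-aggregation approach could be sketched as follows. The table variables below only mirror the question's sample data (the surrogate key column is omitted), and the ORDER BY clauses are assumptions about the desired ordering.

```sql
DECLARE @Customer TABLE (CustomerId INT, Name VARCHAR(50));
DECLARE @CustomerAreaInterest TABLE (FK_CustomerId INT, FK_AreaOfInterestId INT);

INSERT INTO @Customer VALUES (1, 'ABC'), (2, 'XYZ'), (3, 'EFG');
INSERT INTO @CustomerAreaInterest VALUES (1, 2), (1, 3), (1, 5), (2, 1), (2, 2), (3, 3), (3, 4);

SELECT c.Name AS CustomerName,
       -- Build '[2,3,5]' per customer: concatenate ',id' fragments, then strip the leading comma.
       AreaofInterest = CONCAT('[', STUFF((
           SELECT ',' + CAST(i.FK_AreaOfInterestId AS VARCHAR(20))
           FROM @CustomerAreaInterest AS i
           WHERE i.FK_CustomerId = c.CustomerId
           ORDER BY i.FK_AreaOfInterestId
           FOR XML PATH('')), 1, 1, ''), ']')
FROM @Customer AS c
ORDER BY c.CustomerId
FOR JSON PATH;
```

This emits AreaofInterest as a JSON string such as "[2,3,5]", matching the format shown in the question. Wrapping the expression in JSON_QUERY(...) should emit a real JSON array instead.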
Yes, you could do it with only two REPLACE calls:
SELECT REPLACE(REPLACE('[{"_":7},{"_":13},{"_":17}]','{"_":',''),'}','')
That is, unless you really need a space after each comma, which, to be honest, is not what you asked for.

Trying to extract number between 2 characters '|' MS SQL

I have a column and need to extract the number between two pipes |. Example data in the column is AAA|12345678|#RRR, and I need to get the number 12345678.
my code is:
SELECT SUBSTRING(column_name,CHARINDEX('|',column_name) + 1, CHARINDEX('|',column_name) - CHARINDEX('|',column_name) - 1)
FROM [name].[name].[table_name]
Using your own code:
SELECT SUBSTRING(column_name,CHARINDEX('|',column_name) + 1,
CHARINDEX('|',column_name) - CHARINDEX('|',column_name) - 1)
FROM [name].[name].[table_name]
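The length argument of SUBSTRING is not correct: both CHARINDEX calls find the same first pipe, so the computed length is always -1. It should be the distance between the two pipes:
SELECT SUBSTRING(column_name,CHARINDEX('|',column_name) + 1,
    CHARINDEX('|',column_name, CHARINDEX('|',column_name) + 1) - CHARINDEX('|',column_name) - 1)
FROM [name].[name].[table_name]
The nested CHARINDEX finds the position of the second pipe by starting the search one character after the first pipe. SUBSTRING then starts right after the first pipe and stops just before the second.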
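A self-contained sketch of the two-pipe idea, with both positions named through CROSS APPLY for readability. The aliases p1 and p2 are introduced here for illustration; NULLIF is added so that rows with fewer than two pipes yield NULL rather than an invalid-length error.

```sql
SELECT v.column_name,
       -- Start right after the first pipe, take the characters up to the second pipe.
       Extracted = SUBSTRING(v.column_name, p1.pos + 1, p2.pos - p1.pos - 1)
FROM (VALUES ('AAA|12345678|#RRR')) AS v(column_name)
CROSS APPLY (SELECT NULLIF(CHARINDEX('|', v.column_name), 0)) AS p1(pos)
CROSS APPLY (SELECT NULLIF(CHARINDEX('|', v.column_name, p1.pos + 1), 0)) AS p2(pos);
```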
Assuming the 2nd position, you can use a little XML or ParseName()
XML Example
Declare @YourTable table (ID int,column_name varchar(max))
Insert Into @YourTable values
(1,'AAA|12345678|#RRR')
Select ID
      ,SomeValue = Cast('<x>' + replace(column_name,'|','</x><x>')+'</x>' as xml).value('/x[2]','varchar(max)')
From @YourTable
ParseName() Example
Select ID
      ,SomeValue = parsename(replace(column_name,'|','.'),2)
From @YourTable
Both would Return
ID SomeValue
1 12345678
String extraction is generally tricky in SQL Server. But if you only have one numeric value and are looking for it, then the code isn't that bad:
select patindex('%[0-9]|%', str),
       -- start one character after the opening pipe so the pipe itself is not included
       substring(str, patindex('%|[0-9]%', str) + 1, patindex('%[0-9]|%', str) - patindex('%|[0-9]%', str))
from (values ('AAA|12345678|#RRR')) v(str)
I would use PARSENAME() :
select parsename(replace(str, '|', '.'), 2)
from ( values ('AAA|12345678|#RRR')
) v(str);

Removing leading zeros in a string in sqlserver

I want to remove leading zeros for a varchar column. Actually we are storing version information in a column. Find below example versions.
Input : 2.00.001
The output would be : 2.0.1
Input : 2.00.00.001
The output would be: 2.0.0.1
Input : 2.00
The output would be : 2.0
The number of dots in the version column is not constant; there may be two, three or four.
I found some solutions on Google, but they are not working. Below are the queries I tried.
SELECT SUBSTRING('2.00.001', PATINDEX('%[^0 ]%', '2.00.001' + ' '), LEN('2.00.001'))
SELECT REPLACE(LTRIM(REPLACE('2.00.001', '0', ' ')),' ', '0')
Please suggest me the best approach in sqlserver.
One way is to use a string splitting function with cross apply, for xml path, and stuff.
For an explanation on how stuff and for xml works together to concatenate a string from selected rows, read this SO post.
Using a string splitting function will enable you to convert each number part of the string to int, which removes the leading zeroes. Executing a select statement on the result of the string splitting function will enable you to get your int values back into a varchar value, separated by dots.
The stuff function will remove the first dot.
Create the string splitting function:
CREATE FUNCTION SplitStrings_XML
(
    @List NVARCHAR(MAX),
    @Delimiter NVARCHAR(255)
)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
(
    SELECT Item = y.i.value('(./text())[1]', 'nvarchar(4000)')
    FROM
    (
        SELECT x = CONVERT(XML, '<i>'
            + REPLACE(@List, @Delimiter, '</i><i>')
            + '</i>').query('.')
    ) AS a CROSS APPLY x.nodes('i') AS y(i)
);
GO
I've chosen to use an XML-based function because it's fairly simple. If you are using SQL Server 2016 you can use the built-in string_split function. For earlier versions, I would strongly suggest reading Aaron Bertrand's Split strings the right way – or the next best way.
Create and populate sample table (Please save us this step in your future questions)
DECLARE @T AS TABLE
(
    col varchar(20)
)
INSERT INTO @T VALUES
('2.00.001'),
('2.00.00.001'),
('2.00')
The query:
SELECT col, result
FROM @T
CROSS APPLY
(
    SELECT STUFF(
    (
        SELECT '.' + CAST(CAST(Item as int) as varchar(20))
        FROM SplitStrings_XML(col, '.')
        FOR XML PATH('')
    )
    , 1, 1, '') As result
) x
Results:
col result
2.00.001 2.0.1
2.00.00.001 2.0.0.1
2.00 2.0
No need for Split/Parse Function, and easy to expand if there could be more than 5 groups
Declare @YourTable table (YourCol varchar(25))
Insert Into @YourTable Values
('2.00.001'),
('2.00.00.001'),
('2.00')
Update @YourTable
Set YourCol = concat(Pos1,'.'+Pos2,'.'+Pos3,'.'+Pos4,'.'+Pos5)
From @YourTable A
Cross Apply (
    Select Pos1 = ltrim(rtrim(xDim.value('/x[1]','int')))
          ,Pos2 = ltrim(rtrim(xDim.value('/x[2]','int')))
          ,Pos3 = ltrim(rtrim(xDim.value('/x[3]','int')))
          ,Pos4 = ltrim(rtrim(xDim.value('/x[4]','int')))
          ,Pos5 = ltrim(rtrim(xDim.value('/x[5]','int')))
    From (Select Cast('<x>' + replace((Select replace(A.YourCol,'.','§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml) as xDim) as A
) B
Select * from @YourTable
Returns
YourCol
2.0.1
2.0.0.1
2.0
Easy, fast, compatible and readable way – without tables or XML tricks.
Correctly handles all cases including empty string, NULL, or numbers like 00100.
Supports unlimited number of groups. Runs on all SQL Server versions.
Step 1: Remove leading zeros from all groups.
Step 2: Place single zero to groups where no digits remained.
The function:
CREATE FUNCTION dbo.fncGetNormalizedVersionNumber(@Version nvarchar(200))
RETURNS nvarchar(200) AS
BEGIN
    -- Preprocessing: Surround version string by dots so all groups have the same format.
    SET @Version = '.' + @Version + '.';
    -- Step 1: Remove any leading zeros from groups as long as string length decreases.
    DECLARE @PreviousLength int = 0;
    WHILE @PreviousLength <> LEN(@Version)
    BEGIN
        SET @PreviousLength = LEN(@Version);
        SET @Version = REPLACE(@Version, '.0', '.');
    END;
    -- Step 2: Insert 0 to any empty group as long as string length increases.
    SET @PreviousLength = 0;
    WHILE @PreviousLength <> LEN(@Version)
    BEGIN
        SET @PreviousLength = LEN(@Version);
        SET @Version = REPLACE(@Version, '..', '.0.');
    END;
    -- Strip leading and trailing dot added by preprocessing.
    RETURN SUBSTRING(@Version, 2, LEN(@Version) - 2);
END;
Usage:
SELECT dbo.fncGetNormalizedVersionNumber('020.00.00.000100');
20.0.0.100
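To check the function against the inputs from the question, a small test query along these lines may help. It assumes dbo.fncGetNormalizedVersionNumber has already been created as shown above; the derived table v is just illustrative sample data.

```sql
-- Run the function over the question's sample versions plus a plain number.
SELECT v.Version,
       Normalized = dbo.fncGetNormalizedVersionNumber(v.Version)
FROM (VALUES ('2.00.001'),      -- expected 2.0.1
             ('2.00.00.001'),   -- expected 2.0.0.1
             ('2.00'),          -- expected 2.0
             ('00100')          -- expected 100
     ) AS v(Version);
```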
Performance per 100,000 calculations:
solution using helper function + helper tables + XML: 54519 ms
this solution (used on table column): 2574 ms (→ 21 times faster) (UPDATED after comment.)
For SQL Server 2016:
SELECT
STUFF
((SELECT
'.' + CAST(CAST(value AS INT) AS VARCHAR)
FROM STRING_SPLIT('2.00.001', '.')
FOR XML PATH (''))
, 1, 1, '')
According to this: https://sqlperformance.com/2016/03/sql-server-2016/string-split
It's the fastest way :)
Aaron Bertrand knows his stuff.
For an interesting and deep read about splitting strings on SQL Server, please read this gem of knowledge: http://www.sqlservercentral.com/articles/Tally+Table/72993/
It has some clever strategies
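One caveat with STRING_SPLIT combined with FOR XML PATH is that the order of the split rows is not guaranteed. On SQL Server 2022 or Azure SQL Database the ordinal column and STRING_AGG (available since SQL Server 2017) can make the order explicit. The sketch below assumes such a version, and the table variable @T is sample data only.

```sql
DECLARE @T TABLE (col VARCHAR(20));
INSERT INTO @T VALUES ('2.00.001'), ('2.00.00.001'), ('2.00');

SELECT t.col,
       -- Cast each group to INT to drop leading zeros, then re-join in original order.
       result = (SELECT STRING_AGG(CAST(CAST(s.value AS INT) AS VARCHAR(20)), '.')
                        WITHIN GROUP (ORDER BY s.ordinal)
                 FROM STRING_SPLIT(t.col, '.', 1) AS s)
FROM @T AS t;
```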
I am not sure this is what you are looking for but you can give a go, it should handle up to 4 zeros.
DECLARE @VERSION NVARCHAR(20) = '2.00.00.001'
SELECT REPLACE(REPLACE(REPLACE(@VERSION, '0000','0'),'000','0'),'00','0')
-- returns 2.0.0.01
SET @VERSION = '2.00.00.01'
SELECT REPLACE(REPLACE(REPLACE(@VERSION, '0000','0'),'000','0'),'00','0')
-- returns 2.0.0.01
SET @VERSION = '2.000.0000.0001'
SELECT REPLACE(REPLACE(REPLACE(@VERSION, '0000','0'),'000','0'),'00','0')
-- returns 2.0.0.01
Try this one
SUBSTRING(str_col, PATINDEX('%[^0]%', str_col+'.'), LEN(str_col))
Here is another sample:
CREATE TABLE #tt(s VARCHAR(15))
INSERT INTO #tt VALUES
('2.00.001'),
('2.00.00.001'),
('2.00')
SELECT t.s,STUFF(c.s,1,1,'') AS news FROM #tt AS t
OUTER APPLY(
SELECT '.'+LTRIM(z.n) FROM (VALUES(CONVERT(XML,'<n>'+REPLACE(t.s,'.','</n><n>')+'</n>'))) x(xs)
CROSS APPLY(SELECT n.value('.','int') FROM x.xs.nodes('n') AS y(n)) z(n)
FOR XML PATH('')
) c(s)
s news
--------------- -----------
2.00.001 2.0.1
2.00.00.001 2.0.0.1
2.00 2.0

How to sum numbers in a delimited string using SQL Server

I have a string containing numbers delimited by a pipe like so 23|12|12|32|43.
Using SQL I want to extract each number, add 10 and then sum to get a total.
Here is another alternative:
declare @str nvarchar(max) = '23|12|12|32|43';
-- turn each pipe into '+10+' and append a final '+10' so every number gets 10 added
set @str = 'select ' + replace(@str, '|', '+10+') + '+10';
exec(@str);
The answer using a recursive common table expression:
WITH cte AS (
SELECT
'23|12|12|32|43' + '|' AS string
,0 AS total
UNION ALL
SELECT
RIGHT(string, LEN(string) - PATINDEX('%|%', string))
,CAST(LEFT(string, PATINDEX('%|%', string) - 1) AS INT) + 10
FROM cte
WHERE PATINDEX('%|%', string) > 0
)
SELECT SUM(total) AS total FROM cte
As the recursion terminator I check whether any more pipes exist in the string. On its own this would miss the last element, which I worked around by concatenating an extra pipe onto the end of the original string. There is probably a better way to express the WHERE clause.
Here is another way of doing it:
DECLARE @s VARCHAR(1000) = '23|12|12|32|43'
SELECT CAST('<root><e>' + REPLACE(@s, '|', '</e><e>') + '</e></root>' AS XML)
    .value('sum(/root/e) + count(/root/e) * 10', 'INT')
This uses casting to XML data type and functions provided by it.
I posted this just as an example, your approach has a much better performance.
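On SQL Server 2016 or later, the same calculation can also be sketched with STRING_SPLIT. Row order does not matter for a sum, so the lack of ordering guarantees is not a concern here. The variable name @str is illustrative.

```sql
DECLARE @str VARCHAR(100) = '23|12|12|32|43';

-- Split on the pipe, add 10 to each number, and sum the results.
SELECT SUM(CAST(value AS INT) + 10) AS total
FROM STRING_SPLIT(@str, '|');  -- 172
```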