Extract a number from String in SQL - sql

I have the following string:
"FLEETWOOD DESIGNS 535353110XXXXX" (The X's are actually numbers I just wanted to hide them here)
Does anyone know how I can search through strings in SQL and extract numbers that are longer than, let's say, 10 characters?

This is quite an old post, but it might help someone else. I was searching for a user-defined function in SQL Server to extract only the digits from a given string and, surprisingly, could not find exactly what I was looking for.
Here is the code of a function to "extract a number from a string in SQL" (valid for SQL Server). It is taken from Pinal Dave's fantastic blog; I've only modified it to return NULL if a NULL value is passed to the function.
CREATE FUNCTION [dbo].[ExtractInteger](@String VARCHAR(2000))
RETURNS VARCHAR(1000)
AS
BEGIN
    DECLARE @Count INT
    DECLARE @IntNumbers VARCHAR(1000)
    SET @Count = 1 -- SUBSTRING positions are 1-based
    SET @IntNumbers = ''
    IF @String IS NULL
        RETURN NULL;
    WHILE @Count <= LEN(@String)
    BEGIN
        -- keep the current character only if it is a digit
        IF SUBSTRING(@String,@Count,1) >= '0' AND SUBSTRING(@String,@Count,1) <= '9'
        BEGIN
            SET @IntNumbers = @IntNumbers + SUBSTRING(@String,@Count,1)
        END
        SET @Count = @Count + 1
    END
    RETURN @IntNumbers
END
Tests
select '"' + dbo.ExtractInteger('1a2b3c4d5e6f7g8h9i') + '"'
GO
select '"' + dbo.ExtractInteger('abcdefghi') + '"'
GO
select '"' + dbo.ExtractInteger(NULL) + '"'
GO
select '"' + dbo.ExtractInteger('') + '"'
GO
Results
"123456789"
""
NULL
""

You don't mention the DB engine, so we don't know what features are available...
If regular expressions are available, then a pattern like \d{10,} would match numbers with 10 or more digits.
In MySQL, REGEXP can only return true or false (1 or 0), so you'd have to use some ugly hack like
SELECT
LEAST(
INSTR(field,'0'),
INSTR(field,'1'),
INSTR(field,'2'),
INSTR(field,'3'),
INSTR(field,'4'),
INSTR(field,'5'),
INSTR(field,'6'),
INSTR(field,'7'),
INSTR(field,'8'),
INSTR(field,'9')
) AS startPos,
REVERSE(field) AS backward,
LEAST(
INSTR(backward,'0'),
INSTR(backward,'1'),
INSTR(backward,'2'),
INSTR(backward,'3'),
INSTR(backward,'4'),
INSTR(backward,'5'),
INSTR(backward,'6'),
INSTR(backward,'7'),
INSTR(backward,'8'),
INSTR(backward,'9')
) AS endPos,
SUBSTRING(field, startPos, endPos - startPos + 1)
FROM tab
WHERE(field REGEXP '[0-9]{10,}')
but this isn't perfect - it would extract the wrong substring for a string like "ABC 9 A 1234567891", not to mention that it is probably so slow that it would be faster to go through the data by hand.
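If the server happens to be MySQL 8.0 or later, REGEXP_SUBSTR may make the workaround unnecessary. A minimal sketch, assuming the same table and column names (tab, field) as in the query above:

```sql
-- Assumes MySQL 8.0+, where REGEXP_SUBSTR is available.
-- Returns the first run of 10 or more digits in each matching row.
SELECT field,
       REGEXP_SUBSTR(field, '[0-9]{10,}') AS long_number
FROM tab
WHERE field REGEXP '[0-9]{10,}';
```

For "ABC 9 A 1234567891" this should pick the 10-digit run and skip the lone 9, which is the case the INSTR approach gets wrong.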

SUBSTRING('FLEETWOOD DESIGNS 535353110XXXXX', 19, 14)
You could also use LEN() to get the length of the string itself. If you know the serial number length, you can just subtract that from the end index to get your start index of the substring.
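A small sketch of that LEN() arithmetic in T-SQL. The digits standing in for the hidden X's and the serial length of 14 are made up for illustration:

```sql
-- Hypothetical sample value: the real digits are hidden in the question.
DECLARE @s varchar(100) = 'FLEETWOOD DESIGNS 53535311012345';
DECLARE @serialLen int = 14;  -- assumed known length of the serial number

SELECT SUBSTRING(@s, LEN(@s) - @serialLen + 1, @serialLen) AS serial_by_substring,
       RIGHT(@s, @serialLen)                               AS serial_by_right;
```

Both columns return the last 14 characters; RIGHT is simply the shorter spelling when the number always sits at the end of the string.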

It could be done like this
DECLARE @X varchar(100)
SELECT @X = 'Here is where15234Numbers'
--// drop everything before the first digit
SELECT @X = SUBSTRING(@X, PATINDEX('%[0-9]%', @X), LEN(@X))
--// keep everything before the first non-digit
SELECT @X = SUBSTRING(@X, 0, PATINDEX('%[^0-9]%', @X))
--// show result
SELECT @X
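The same PATINDEX idea can be narrowed to runs of at least 10 digits, which is what the question asks for. This is only a sketch for SQL Server; the sample string and variable names are my own:

```sql
DECLARE @s varchar(200) = 'ABC 9 A 1234567891 tail';

-- Position of the first run of 10 consecutive digits (0 when there is none).
DECLARE @start int = PATINDEX('%' + REPLICATE('[0-9]', 10) + '%', @s);

SELECT CASE
         WHEN @start = 0 THEN NULL
         ELSE SUBSTRING(
                @s,
                @start,
                -- length of the run: distance to the next non-digit;
                -- the appended 'x' guarantees that a non-digit is found
                PATINDEX('%[^0-9]%', SUBSTRING(@s, @start, LEN(@s)) + 'x') - 1)
       END AS long_number;   -- 1234567891
```

Shorter digit runs such as the lone 9 are skipped because the pattern itself demands 10 digits in a row.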

Related

How to create a function to split date and time from a string in SQL?

How can I remove value before '_' and show date and time in one row in TSQL Function?
Below is sample:
Declare @inputstring as varchar(50) = 'Studio9_20230126_203052' ;
select value from STRING_SPLIT( @inputstring ,'_')
Output Required: 2023-01-26 20:30:52.000
If we can safely assume that the value is always in the format {Some String}_{yyyyMMdd}_{hhmmss} then you can use STUFF a few times, firstly to remove the leading string up to the first underscore (_) character (using CHARINDEX to find that character), and then to inject 2 colon (:) characters. Finally you can REPLACE the remaining underscore with a space ( ), and then use TRY_CONVERT to attempt to convert the value to a datetime2(0).
DECLARE @inputstring varchar(50) = 'Studio9_20230126_203052';
SELECT TRY_CONVERT(datetime2(0),REPLACE(STUFF(STUFF(STUFF(@inputstring,1,CHARINDEX('_',@inputstring),''),14,0,':'),12,0,':'),'_',' '));
Note that this doesn't give the value you state you want in your question (2023-01-26 20:05:52.000) , but I assume this is a typographical error, and that the 05 for minutes should be 30.
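To see what each STUFF/REPLACE step contributes, the nested expression can be unrolled into intermediate columns. A sketch only; the aliases s1, s2 and s3 are names I made up:

```sql
DECLARE @inputstring varchar(50) = 'Studio9_20230126_203052';

SELECT s1.v AS after_prefix_removed,   -- 20230126_203052
       s2.v AS after_colons_injected,  -- 20230126_20:30:52
       s3.v AS after_underscore_swap,  -- 20230126 20:30:52
       TRY_CONVERT(datetime2(0), s3.v) AS result
FROM (VALUES (STUFF(@inputstring, 1, CHARINDEX('_', @inputstring), ''))) AS s1(v)
CROSS APPLY (VALUES (STUFF(STUFF(s1.v, 14, 0, ':'), 12, 0, ':'))) AS s2(v)
CROSS APPLY (VALUES (REPLACE(s2.v, '_', ' '))) AS s3(v);
```

The colon at position 14 is injected first so that the earlier position 12 is not shifted by it.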
Creating function
CREATE FUNCTION [dbo].[convert_to_date] (@inputstring NVARCHAR(MAX))
RETURNS DATETIME AS
BEGIN
DECLARE @finalString varchar(50), @out varchar(100)
SET @finalString = REPLACE ( (SUBSTRING (@inputstring, CHARINDEX('_', @inputstring)+1 , LEN(@inputstring))), '_', ' ')
--SELECT @finalString
SET @out = LEFT (@finalString, 4) + '-'
+ SUBSTRING(@finalString, 5, 2) + '-'
+ SUBSTRING(@finalString, 7, 2) + ' '
+ SUBSTRING(@finalString, 10, 2) + ':'
+ SUBSTRING(@finalString, 12, 2) + ':'
+ SUBSTRING(@finalString, 14, 2) + '.000'
RETURN @out
END
Select Query
SELECT dbo.[convert_to_date] ('Studio54541659_20230126_203052')
Output
2023-01-26 20:30:52.000
This will tolerate "somestring" in the format of "somestring_YYYYMMDD_HHMISS" being variable in length.
Declare @inputstring as varchar(50) = 'Studio9_20230126_203052' ;
SELECT DateAndTime = CONVERT(DATETIME,STUFF(STUFF(STUFF(v2.DT,14,0,':'),12,0,':'),9,1,' '))
,Identifier = LEFT(@inputstring,v1.Pos1-1) --Included this because I know how people are :D --Comment out if not wanted.
,Original = @inputstring --Original string just for checking. Comment out when happy.
FROM (VALUES(CHARINDEX('_',@inputstring)))v1(Pos1) --Position of first Underscore
CROSS APPLY (VALUES(SUBSTRING(@inputstring,v1.Pos1+1,50)))v2(DT) --String after first Underscore
;
Output looks like this and you end up with a DATETIME datatype. Comment out what you don't want for columns in the return.
I'll let you have some of the fun by converting it into an iTVF (inline Table Valued Function). Remember that any function that contains a "BEGIN" is ultimately going to be a part of a performance issue so make sure it's an iTVF :D
EDIT: Crud... I've gotta remember to scroll down. @Lamu already posted the same thing, but it's probably better and faster if you just want the time and not the identifier I included.
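For anyone who would rather skip the fun, one possible shape for that iTVF is sketched below. The function name dbo.SplitIdentifierDateTime is hypothetical, and TRY_CONVERT plus NULLIF are my additions so that malformed input yields NULLs instead of an error:

```sql
CREATE FUNCTION dbo.SplitIdentifierDateTime (@inputstring varchar(50))
RETURNS TABLE
AS
RETURN
    SELECT DateAndTime = TRY_CONVERT(datetime,
                             STUFF(STUFF(STUFF(v2.DT, 14, 0, ':'), 12, 0, ':'), 9, 1, ' ')),
           -- NULLIF avoids LEFT(..., -1) when there is no underscore at all
           Identifier  = LEFT(@inputstring, NULLIF(v1.Pos1, 0) - 1)
    FROM (VALUES (CHARINDEX('_', @inputstring))) AS v1(Pos1)                  -- first underscore
    CROSS APPLY (VALUES (SUBSTRING(@inputstring, v1.Pos1 + 1, 50))) AS v2(DT); -- text after it
GO

-- Usage against a single value; use CROSS APPLY to run it against a column.
SELECT f.DateAndTime, f.Identifier
FROM dbo.SplitIdentifierDateTime('Studio9_20230126_203052') AS f;
```

Because the body is a single RETURN SELECT with no BEGIN, the optimizer can inline it into the calling query.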

how to extract string after any character

I have the data shown below. I'd like to extract the last part after the last underscore.
The data before the underscore can be any length but is written with the same syntax, i.e. letters_letters_letters.
So I wrote this code to extract the part after the last underscore, and it works perfectly, but I noticed that there are 2 names written differently, like letters_letters-letters,
with - instead of _ at the end.
NAME= (SELECT SUBSTRING('''+@NAME+''', CHARINDEX(''_'','''+@NAME+''',CHARINDEX(''_'','''+@NAME+''')+1)+1, CHARINDEX(''_'','''+@NAME+''') + CHARINDEX(''_'','''+@NAME+''',CHARINDEX(''_'','''+@NAME+''')+1)) FROM TABLE)
My question is: is there a way to check for, or extract, the string after any character (without specifying whether it's an underscore or something else)?
Can anyone help please.
The column name is like:
BOB_LOU_K
The two other columns are like:
BOB_LOU-K
Thanks
Create a function that finds the position of the last character that is not a letter and returns the substring from that position to the end of the string. Call that function in your query, something like this: SELECT Col1, [dbo].[fn_GetLastString](Col1) AS lastString FROM mytable. Your function fn_GetLastString would look something like this:
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE FUNCTION fn_GetLastString(@cSearchedExpression VARCHAR(MAX))
RETURNS VARCHAR(MAX)
AS
BEGIN
DECLARE @pos INT = LEN(@cSearchedExpression)
DECLARE @currentChar VARCHAR(1)
DECLARE @lastString VARCHAR(MAX)
WHILE @pos>0
BEGIN
SET @currentChar = SUBSTRING(@cSearchedExpression, @pos, 1)
IF NOT (@currentChar LIKE '[a-Z ]' )
BEGIN
SET @lastString = SUBSTRING(@cSearchedExpression, @pos, LEN(@cSearchedExpression) - @pos + 1)
BREAK;
END
SET @pos = @pos - 1
END
RETURN @lastString
END
GO
p.s. notice the space in '[a-Z ]' that will include spaces.
select txt
,right(txt, patindex('%[^a-zA-Z]%', reverse(txt) + ' ') - 1) as last_token
from (values ('BOB_LOU_K'), ('BOB_LOU-K'), ('Yet-aonther-badly-written-question'), ('*followed*by*mediocre*answers'), ('at~best'), ('Quite depressing'), ('isn''t it?'), ('Yep')) t(txt)
txt                                  last_token
BOB_LOU_K                            K
BOB_LOU-K                            K
Yet-aonther-badly-written-question   question
*followed*by*mediocre*answers        answers
at~best                              best
Quite depressing                     depressing
isn't it?
Yep                                  Yep
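The one-liner is easier to follow when its pieces are shown separately. A sketch for a single value (the variable name is mine):

```sql
DECLARE @txt varchar(50) = 'BOB_LOU-K';

SELECT REVERSE(@txt)                                              AS reversed,           -- K-UOL_BOB
       PATINDEX('%[^a-zA-Z]%', REVERSE(@txt) + ' ')               AS first_non_letter,   -- 2
       RIGHT(@txt, PATINDEX('%[^a-zA-Z]%', REVERSE(@txt) + ' ') - 1) AS last_token;      -- K
```

Reversing the string turns "last separator" into "first non-letter", and the appended space guarantees a match even for values like 'Yep' that contain no separator at all.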
A minimal reproducible example is not provided, so I am shooting from the hip.
If there are some additional characters to take care of, just add them to the #junkChars variable.
SQL
-- DDL and sample data population, start
DECLARE @tbl TABLE (ID INT IDENTITY PRIMARY KEY, tokens VARCHAR(100));
INSERT INTO @tbl (tokens) VALUES
('BOB_LOU_K'),
('BOB_LOU-K'),
('Yet-aonther-badly-written-question'),
('*followed*by*mediocre*answers'),
('at/the\best'),
('Quite depressing'),
('isn''t it?'),
('Yep');
-- DDL and sample data population, end
DECLARE @separator CHAR(1) = '_'
, @junkChars VARCHAR(10) = '-*\ ';
SELECT t.*
, result = c.value('(/root/r[last()]/text())[1]', 'VARCHAR(30)')
FROM @tbl AS t
CROSS APPLY (SELECT TRY_CAST('<root><r><![CDATA[' +
REPLACE(TRANSLATE(tokens, @junkChars, REPLICATE(@separator, DATALENGTH(@junkChars))), @separator, ']]></r><r><![CDATA[') +
']]></r></root>' AS XML)) AS t1(c);
Output
ID   tokens                               result
1    BOB_LOU_K                            K
2    BOB_LOU-K                            K
3    Yet-aonther-badly-written-question   question
4    *followed*by*mediocre*answers        answers
5    at/the\best                          best
6    Quite depressing                     depressing
7    isn't it?                            it?
8    Yep                                  Yep

How to identify and redact all instances of a matching pattern in T-SQL

I have a requirement to run a function over certain fields to identify and redact any numbers which are 5 digits or longer, ensuring all but the last 4 digits are replaced with *
For example: "Some text with 12345 and 1234 and 12345678" would become "Some text with *2345 and 1234 and ****5678"
I've used PATINDEX to identify the starting character of the pattern:
PATINDEX('%[0-9][0-9][0-9][0-9][0-9]%', TEST_TEXT)
I can recursively call that to get the starting character of all the occurrences, but I'm struggling with the actual redaction.
Does anyone have any pointers on how this can be done? I know to use REPLACE to insert the *s where they need to be, it's just the identification of what I should actually be replacing I'm struggling with.
Could do it on a program, but I need it to be T-SQL (can be a function if needed).
Any tips greatly appreciated!
You can do this using the built-in functions of SQL Server. All of the ones used in this example are present in SQL Server 2008 and higher.
DECLARE @String VARCHAR(500) = 'Example Input: 1234567890, 1234, 12345, 123456, 1234567, 123asd456'
DECLARE @StartPos INT = 1, @EndPos INT = 1;
DECLARE @Input VARCHAR(500) = ISNULL(@String, '') + ' '; --Sets the input and adds a control character at the end to make the loop easier.
DECLARE @OutputString VARCHAR(500) = ''; --Initialize an empty string to avoid NULL concatenation
WHILE (@StartPos <> 0)
BEGIN
SET @StartPos = PATINDEX('%[0-9][0-9][0-9][0-9][0-9]%', @Input);
IF @StartPos <> 0
BEGIN
SET @OutputString += SUBSTRING(@Input, 1, @StartPos - 1); --Separate all contents before the first occurrence of our filter
SET @Input = SUBSTRING(@Input, @StartPos, 500); --Cut the entire string to the end. Last value must be at least the original string length to simply cut it all.
SET @EndPos = (PATINDEX('%[0-9][0-9][0-9][0-9][^0-9]%', @Input)); --First occurrence of 4 digits with a non-digit behind them.
SET @Input = STUFF(@Input, 1, (@EndPos - 1), REPLICATE('*', (@EndPos - 1))); --@EndPos - 1 gives us the number of chars we want to replace.
END
END
SET @OutputString += @Input; --Append the last element
SET @OutputString = LEFT(@OutputString, LEN(@OutputString)) --LEN ignores trailing spaces, so this drops the control character
SELECT @OutputString;
Which outputs the following:
Example Input: ******7890, 1234, *2345, **3456, ***4567, 123asd456
This entire code could also be made as a function since it only requires an input text.
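A rough sketch of what such a function might look like. The name dbo.RedactLongNumbers and the 500-character limit are assumptions of mine, not something fixed by the code above:

```sql
CREATE FUNCTION dbo.RedactLongNumbers (@String varchar(500))
RETURNS varchar(500)
AS
BEGIN
    -- Trailing space acts as a sentinel so every digit run ends in a non-digit.
    DECLARE @Input  varchar(501)  = ISNULL(@String, '') + ' ';
    DECLARE @Output varchar(1000) = '';
    DECLARE @EndPos int;
    DECLARE @StartPos int = PATINDEX('%[0-9][0-9][0-9][0-9][0-9]%', @Input);

    WHILE @StartPos > 0
    BEGIN
        SET @Output += LEFT(@Input, @StartPos - 1);        -- text before the digit run
        SET @Input   = SUBSTRING(@Input, @StartPos, 501);  -- run now starts at position 1
        -- Start of the last 4 digits of the run (4 digits followed by a non-digit).
        SET @EndPos  = PATINDEX('%[0-9][0-9][0-9][0-9][^0-9]%', @Input);
        SET @Input   = STUFF(@Input, 1, @EndPos - 1, REPLICATE('*', @EndPos - 1));
        SET @StartPos = PATINDEX('%[0-9][0-9][0-9][0-9][0-9]%', @Input);
    END

    -- Cut the sentinel off again; returns NULL for NULL input.
    -- Note: LEN ignores trailing spaces, so those are trimmed from the result.
    RETURN LEFT(@Output + @Input, LEN(@String));
END
GO

SELECT dbo.RedactLongNumbers('Some text with 12345 and 1234 and 12345678');
-- Some text with *2345 and 1234 and ****5678
```

Being a scalar function with a loop, this is convenient rather than fast; for large tables a set-based or CLR approach is probably the better fit.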
A dirty solution with recursive CTE
DECLARE
    @tags nvarchar(max) = N'Some text with 12345 and 1234 and 12345678',
    @c nchar(1) = N' ';
;
WITH Process (s, i)
AS
(
    SELECT @tags, PATINDEX('%[0-9][0-9][0-9][0-9][0-9]%', @tags)
    UNION ALL
    SELECT value, PATINDEX('%[0-9][0-9][0-9][0-9][0-9]%', value)
    FROM
        (SELECT SUBSTRING(s,0,i)+'*'+SUBSTRING(s,i+1,len(s)) value
         FROM Process
         WHERE i >0) calc
    -- each pass masks the first digit of the leftmost run of 5+ digits,
    -- so a run keeps shrinking until only its last 4 digits are left
)
SELECT * FROM Process
WHERE i=0
I think a better solution is to add a CLR function to MS SQL Server to handle regular expressions.
sql-clr/RegEx
Here is an option using DelimitedSplit8K_LEAD, which can be found here: https://www.sqlservercentral.com/articles/reaping-the-benefits-of-the-window-functions-in-t-sql-2 This is an extension of Jeff Moden's splitter that is even a little bit faster than the original. The big advantage this splitter has over most of the others is that it returns the ordinal position of each element. One caveat is that I am splitting on a space, based on your sample data. If you have numbers crammed in the middle of other characters, this will ignore them. That may be good or bad depending on your specific requirements.
declare @Something varchar(100) = 'Some text with 12345 and 1234 and 12345678';
with MyCTE as
(
select x.ItemNumber
, Result = isnull(case when TRY_CONVERT(bigint, x.Item) is not null then isnull(replicate('*', len(convert(varchar(20), TRY_CONVERT(bigint, x.Item))) - 4), '') + right(convert(varchar(20), TRY_CONVERT(bigint, x.Item)), 4) end, x.Item)
from dbo.DelimitedSplit8K_LEAD(@Something, ' ') x
)
select Output = stuff((select ' ' + Result
from MyCTE
order by ItemNumber
FOR XML PATH('')), 1, 1, '')
This produces: Some text with *2345 and 1234 and ****5678

SSMS replace all commas outside of quotation marks in string

I've written the following function in SSMS to replace any commas that are outside of quotation marks with ||||:
CREATE FUNCTION dbo.fixqualifier (@string nvarchar(max))
returns nvarchar(max)
as begin
DECLARE @STRINGTOPAD NVARCHAR(MAX)
DECLARE @position int = 1,@newstring nvarchar(max) ='',@QUOTATIONMODE INT = 0
WHILE(LEN(@string)>0)
BEGIN
SET @STRINGTOPAD = SUBSTRING(@string,0,IIF(@STRING LIKE '%"%',CHARINDEX('"',@string),LEN(@STRING)))
SET @newstring = @newstring + IIF(@QUOTATIONMODE = 1, REPLACE(@STRINGTOPAD,',','||||'),@STRINGTOPAD)
SET @QUOTATIONMODE = IIF(@QUOTATIONMODE = 1,0,1)
set @string = SUBSTRING(@string,1+IIF(@STRING LIKE '%"%',CHARINDEX('"',@string),LEN(@STRING)),LEN(@string))
END
return @newstring
end
The idea is for the function to find the first ", replace all ',' before that then switch to quotation mode 1 so it knows to not replace the , until it changes back to quotation mode 0 when it hits the 2nd " and so on.
so for example the string:
qwer,tyu,io,asd,"edffs,asdfgh","jjkzx",kl
would become:
qwer||||tyu||||io||||asd||||"edffs,asdfgh"||||"jjkzx"||||kl
It works as expected but it's really inefficient when it comes to doing this for several thousand rows.
Is there a better way of doing this, or at least a way of speeding the function up?
Do a simple trick by Modulus
DECLARE @VAR VARCHAR(100) = 'qwer,tyu,io,asd,"edffs,asdfgh","jjkzx",kl'
,@OUTPUT VARCHAR(100) = '';
SELECT @OUTPUT = @OUTPUT + CASE WHEN (LEN(@OUTPUT) - LEN(REPLACE(@OUTPUT, '"', ''))) % 2 = 0
THEN REPLACE(VAL, ',', '||||') ELSE VAL END
FROM (
SELECT SUBSTRING(@VAR, NUMBER, 1) VAL
FROM master.dbo.spt_values
WHERE type = 'P'
AND NUMBER BETWEEN 1 AND LEN(@VAR)
) A
PRINT @OUTPUT
Result:
qwer||||tyu||||io||||asd||||"edffs,asdfgh"||||"jjkzx"||||kl
The expression LEN(@OUTPUT) - LEN(REPLACE(@OUTPUT, '"', '')) gives you the number of " characters seen so far. Take that count modulo 2: if the result is zero, the count is even, so you are outside the quotes and can replace commas; otherwise you keep them.
This uses DelimitedSplit8k and completely avoids any RBAR methods (such as a WHILE or @Variable = @Variable +... (which is a hidden form of RBAR)).
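The counting trick on its own, as a small sketch with a made-up prefix of the sample string:

```sql
DECLARE @prefix varchar(100) = 'qwer,tyu,"edffs';

SELECT LEN(@prefix) - LEN(REPLACE(@prefix, '"', ''))       AS quote_count,    -- 1
       (LEN(@prefix) - LEN(REPLACE(@prefix, '"', ''))) % 2 AS inside_quotes;  -- 1 = inside, 0 = outside
```

One thing to keep in mind is that LEN ignores trailing spaces, so the count is only reliable when the string does not end in a mix of quotes and spaces.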
It firstly splits on the quotation, and then on the commas, where the string isn't quoted. Finally it then puts the strings back together again, using the "old" STUFF and FOR XML PATH method:
USE Sandbox;
DECLARE @String varchar(8000) = 'qwer,tyu,io,asd,"edffs,asdfgh","jjkzx",kl';
WITH Splits AS(
SELECT QS.ItemNumber AS QuoteNumber, CS.ItemNumber AS CommaNumber, ISNULL(CS.Item, '"' + QS.Item + '"') AS DelimitedItem
FROM dbo.DelimitedSplit8K(@String,'"') QS
OUTER APPLY (SELECT *
FROM dbo.DelimitedSplit8K(QS.Item,',')
WHERE QS.ItemNumber % 2 = 1) CS
WHERE QS.Item <> ',')
SELECT STUFF((SELECT '||||' + S.DelimitedItem
FROM Splits S
ORDER BY S.QuoteNumber, S.CommaNumber
FOR XML PATH('')),1,4,'') AS DelimitedList; --remove the leading 4-character delimiter
(Note, DelimitedSplit8K does not accept more than 8,000 characters. If you have more than that, SQL Server is really not the right tool. STRING_SPLIT does not provide the ordinal position, so you would be unable to guarantee the rebuild order with it.)
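On SQL Server 2022 or Azure SQL Database, STRING_SPLIT accepts an optional third argument (enable_ordinal) that does return the position, so the same idea can be sketched without a custom splitter. This assumes that version is available and, like the approaches above, that the quotes are balanced:

```sql
DECLARE @String varchar(8000) = 'qwer,tyu,io,asd,"edffs,asdfgh","jjkzx",kl';

-- Splitting on the quote gives alternating pieces:
-- odd ordinals are outside quotes, even ordinals are inside.
SELECT STRING_AGG(
           CAST(CASE WHEN q.ordinal % 2 = 1
                     THEN REPLACE(q.value, ',', '||||')  -- outside quotes: swap the commas
                     ELSE '"' + q.value + '"'            -- inside quotes: keep as is, restore quotes
                END AS varchar(max)),
           '') WITHIN GROUP (ORDER BY q.ordinal) AS DelimitedList
FROM STRING_SPLIT(@String, '"', 1) AS q;
-- qwer||||tyu||||io||||asd||||"edffs,asdfgh"||||"jjkzx"||||kl
```

The cast to varchar(max) is there only to keep STRING_AGG from hitting the 8,000-byte limit on longer inputs.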

A SQL Query to select a string between two known strings

I need a SQL query to get the value between two known strings (the returned value should start and end with these two strings).
An example.
"All I knew was that the dog had been very bad and required harsh punishment immediately regardless of what anyone else thought."
In this case the known strings are "the dog" and "immediately". So my query should return "the dog had been very bad and required harsh punishment immediately"
I've come up with this so far but to no avail:
SELECT SUBSTRING(@Text, CHARINDEX('the dog', @Text), CHARINDEX('immediately', @Text))
@Text being the variable containing the main string.
Can someone please help me with where I'm going wrong?
The problem is that the second part of your substring argument is including the first index.
You need to subtract the first index from your second index to make this work.
SELECT SUBSTRING(@Text, CHARINDEX('the dog', @Text)
, CHARINDEX('immediately',@Text) - CHARINDEX('the dog', @Text) + Len('immediately'))
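Spelling out the arithmetic for the sample sentence may help; this sketch just exposes the intermediate numbers:

```sql
DECLARE @Text varchar(200) = 'All I knew was that the dog had been very bad and required harsh punishment immediately regardless of what anyone else thought.';

SELECT CHARINDEX('the dog', @Text)      AS start_pos,        -- 21
       CHARINDEX('immediately', @Text)  AS end_marker_pos,   -- 77
       CHARINDEX('immediately', @Text)
         - CHARINDEX('the dog', @Text)
         + LEN('immediately')           AS length_to_take;   -- 77 - 21 + 11 = 67
```

SUBSTRING takes a length, not an end position, which is why passing 77 directly returns far more text than intended.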
I think what Evan meant was this:
SELECT SUBSTRING(@Text, CHARINDEX(@First, @Text) + LEN(@First),
CHARINDEX(@Second, @Text) - CHARINDEX(@First, @Text) - LEN(@First))
An example is this: You have a string and the character $
String :
aaaaa$bbbbb$ccccc
Code:
SELECT SUBSTRING('aaaaa$bbbbb$ccccc',CHARINDEX('$','aaaaa$bbbbb$ccccc')+1, CHARINDEX('$','aaaaa$bbbbb$ccccc',CHARINDEX('$','aaaaa$bbbbb$ccccc')+1) -CHARINDEX('$','aaaaa$bbbbb$ccccc')-1) as My_String
Output:
bbbbb
You need to adjust for the LENGTH in the SUBSTRING. You were pointing it to the END of the 'ending string'.
Try something like this:
declare @TEXT varchar(200)
declare @ST varchar(200)
declare @EN varchar(200)
set @ST = 'the dog'
set @EN = 'immediately'
set @TEXT = 'All I knew was that the dog had been very bad and required harsh punishment immediately regardless of what anyone else thought.'
SELECT SUBSTRING(@Text, CHARINDEX(@ST, @Text), (CHARINDEX(@EN, @Text)+LEN(@EN))-CHARINDEX(@ST, @Text))
Of course, you may need to adjust it a bit.
I had a similar need to parse out a set of parameters stored within an IIS logs' csUriQuery field, which looked like this: id=3598308&user=AD\user&parameter=1&listing=No needed in this format.
I ended up creating a User-defined function to accomplish a string between, with the following assumptions:
If the starting occurrence is not found, a NULL is returned, and
If the ending occurrence is not found, the rest of the string is returned
Here's the code:
CREATE FUNCTION dbo.str_between(@col varchar(max), @start varchar(50), @end varchar(50))
RETURNS varchar(max)
WITH EXECUTE AS CALLER
AS
BEGIN
RETURN substring(@col, charindex(@start, @col) + len(@start),
isnull(nullif(charindex(@end, stuff(@col, 1, charindex(@start, @col)-1, '')),0),
len(stuff(@col, 1, charindex(@start, @col)-1, ''))+1) - len(@start)-1);
END;
GO
For the above question, the usage is as follows:
DECLARE @a VARCHAR(MAX) = 'All I knew was that the dog had been very bad and required harsh punishment immediately regardless of what anyone else thought.'
SELECT dbo.str_between(@a, 'the dog', 'immediately')
-- Yields ' had been very bad and required harsh punishment '
Try this and replace '[' & ']' with your string
SELECT SUBSTRING(@TEXT,CHARINDEX('[',@TEXT)+1,(CHARINDEX(']',@TEXT)-CHARINDEX('[',@TEXT))-1)
I have a feeling you might need SQL Server's PATINDEX() function. Check this out:
Usage on Patindex() function
So maybe:
SELECT SUBSTRING(@TEXT, PATINDEX('%the dog%', @TEXT), PATINDEX('%immediately%',@TEXT))
SELECT
SUBSTRING( '123@yahoo.com', charindex('@','123@yahoo.com',1) + 1, charindex('.','123@yahoo.com',1) - charindex('@','123@yahoo.com',1) - 1 )
DECLARE @Text VARCHAR(MAX), @First VARCHAR(MAX), @Second VARCHAR(MAX)
SET @Text = 'All I knew was that the dog had been very bad and required harsh punishment immediately regardless of what anyone else thought.'
SET @First = 'the dog'
SET @Second = 'immediately'
SELECT SUBSTRING(@Text, CHARINDEX(@First, @Text),
CHARINDEX(@Second, @Text) - CHARINDEX(@First, @Text) + LEN(@Second))
You're getting the starting position of 'punishment immediately', but passing that in as the length parameter for your substring.
You would need to subtract the starting position of 'the dog' from the CHARINDEX of 'punishment immediately', and then add the length of the 'punishment immediately' string to get your third parameter. This would then give you the correct text.
Here's some rough, hacky code to illustrate the process:
DECLARE @text VARCHAR(MAX)
SET @text = 'All I knew was that the dog had been very bad and required harsh punishment immediately regardless of what anyone else thought.'
DECLARE @start INT
SELECT @start = CHARINDEX('the dog',@text)
DECLARE @endLen INT
SELECT @endLen = LEN('immediately')
DECLARE @end INT
SELECT @end = CHARINDEX('immediately',@text)
SET @end = @end - @start + @endLen
SELECT @end
SELECT SUBSTRING(@text,@start,@end)
Result: the dog had been very bad and required harsh punishment immediately
Among the many options is to create a simple function.
Can keep your code cleaner.
Gives the ability to handle errors if the start or end marker/string is not present.
This function also allows for trimming leading or trailing whitespace as an option.
SELECT dbo.GetStringBetweenMarkers('123456789', '234', '78', 0, 1)
Yields:
56
--Code to create the function
USE [xxxx_YourDB_xxxx]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE FUNCTION [dbo].[GetStringBetweenMarkers] (@FullString varchar(max), @StartMarker varchar(500), @EndMarker varchar(500), @TrimLRWhiteSpace bit, @ReportErrorInResult bit)
RETURNS varchar(max)
AS
BEGIN
--Purpose is to simply return the string between 2 string markers. ew 2022-11-06
--Will perform a LTRIM and RTRIM if @TrimLRWhiteSpace = 1
--Will report errors of either marker not being found in the RETURNed string if @ReportErrorInResult = 1.
-- When @ReportErrorInResult = 0, if the start marker isn't found, will return everything from the start of the @FullString to the left of the end marker.
-- When @ReportErrorInResult = 0, if the end marker isn't found, SQL will return an error of "Invalid length parameter passed to the LEFT or SUBSTRING function."
DECLARE @ReturnString VARCHAR(max) = ''
DECLARE @StartOfStartMarker INT = CHARINDEX(@StartMarker, @FullString)
DECLARE @StartOfTarget INT = CHARINDEX(@StartMarker, @FullString) + LEN(@StartMarker)
DECLARE @EndOfTarget INT = CHARINDEX(@EndMarker, @FullString, @StartOfTarget)
--If a marker wasn't found, put that into the
IF @ReportErrorInResult = 1
BEGIN
IF @EndOfTarget = 0 SET @ReturnString = '[ERROR: EndMarker not found.]'
IF @StartOfStartMarker = 0 SET @ReturnString = '[ERROR: StartMarker not found.]'
IF @StartOfStartMarker = 0 AND @EndOfTarget = 0 SET @ReturnString = '[ERROR: Both StartMarker and EndMarker not found.]'
END
--If not reporting errors, and start marker not found (i.e. CHARINDEX = 0) we would start our string at the LEN(@StartMarker).
-- This would give an odd result. Best to just provide from 0, i.e. the start of the @FullString.
IF @ReportErrorInResult = 0 AND @StartOfStartMarker = 0 SET @StartOfTarget = 0
--Main action
IF @ReturnString = '' SET @ReturnString = SUBSTRING(@FullString, @StartOfTarget, @EndOfTarget - @StartOfTarget)
IF @TrimLRWhiteSpace = 1 SET @ReturnString = LTRIM(RTRIM(@ReturnString))
RETURN @ReturnString
--Examples
-- SELECT '>' + dbo.GetStringBetweenMarkers('123456789','234','78',0,1) + '<' AS 'Result-Returns what is in between markers w/ white space'
-- SELECT '>' + dbo.GetStringBetweenMarkers('1234 56 789','234','78',0,1) + '<' AS 'Result-Without trimming white space'
-- SELECT '>' + dbo.GetStringBetweenMarkers('1234 56 789','234','78',1,1) + '<' AS 'Result-Will trim white space with a @TrimLRWhiteSpace = 1'
-- SELECT '>' + dbo.GetStringBetweenMarkers('abcdefgh','ABC','FG',0,1) + '<' AS 'Result-Not Case Sensitive'
-- SELECT '>' + dbo.GetStringBetweenMarkers('abc_de_fgh','_','_',0,1) + '<' AS 'Result-Using the same marker for start and end'
--Errors are returned if start or end marker are not found
-- SELECT '>' + dbo.GetStringBetweenMarkers('1234 56789','zz','78',0,1) + '<' AS 'Result-Start not found'
-- SELECT '>' + dbo.GetStringBetweenMarkers('1234 56789','234','zz',0,1) + '<' AS 'Result-End not found'
-- SELECT '>' + dbo.GetStringBetweenMarkers('1234 56789','zz','zz',0,1) + '<' AS 'Result-Neither found'
--If @ReportErrorInResult = 0
-- SELECT '>' + dbo.GetStringBetweenMarkers('123456789','zz','78',0,0) + '<' AS 'Result-Start not found-Returns from the start of the @FullString'
-- SELECT '>' + dbo.GetStringBetweenMarkers('123456789','34','zz',0,0) + '<' AS 'Result-End not found-should get "Invalid length parameter passed to the LEFT or SUBSTRING function."'
END
GO
SELECT SUBSTRING('aaaaa$bbbbb$ccccc',instr('aaaaa$bbbbb$ccccc','$',1,1)+1, instr('aaaaa$bbbbb$ccccc','$',1,2)-1 -instr('aaaaa$bbbbb$ccccc','$',1,1)) as My_String
Hope this helps :
I declared a variable so that, if any changes need to be made, they only have to be made once.
declare @line varchar(100)
set @line ='Email_i-Julie@mail.com'
select SUBSTRING(@line ,(charindex('-',@line)+1), CHARINDEX('@',@line)-charindex('-',@line)-1)
I needed to get (099) 0000111-> (099) | 0000111 like two different columns.
SELECT
SUBSTRING(Phone, CHARINDEX('(', Phone) + 0, (2 + ((LEN(Phone)) - CHARINDEX(')', REVERSE(Phone))) - CHARINDEX('(', Phone))) AS CodePhone,
LTRIM(SUBSTRING(Phone, CHARINDEX(')', Phone) + 1, LEN(Phone))) AS NumberPhone
FROM
Suppliers
WHERE
Phone LIKE '%(%)%'
DECLARE @text VARCHAR(MAX)
SET @text = 'All I knew was that the dog had been very bad and required harsh punishment immediately regardless of what anyone else thought.'
DECLARE @pretext AS nvarchar(100) = 'the dog'
DECLARE @posttext AS nvarchar(100) = 'immediately'
SELECT
CASE
WHEN CHARINDEX(@posttext, @Text) - (CHARINDEX(@pretext, @Text) + len(@pretext)) < 0
THEN ''
ELSE SUBSTRING(@Text,
CHARINDEX(@pretext, @Text) + LEN(@pretext),
CHARINDEX(@posttext, @Text) - (CHARINDEX(@pretext, @Text) + LEN(@pretext)))
END AS betweentext
I'm a few years behind, but here's what I did to get a string between two characters that are not the same, and, in the event that the ending character is not found, to still return the rest of the substring.
BEGIN
DECLARE @TEXT AS VARCHAR(20)
SET @TEXT='E101465445454-1'
SELECT SUBSTRING(@TEXT, CHARINDEX('E', @TEXT)+1, CHARINDEX('-',@TEXT)) as 'STR',
CAST(CHARINDEX('E', @TEXT)+1 AS INT) as 'val1', CAST(CHARINDEX('-', @TEXT) AS INT) as 'val2',
(CAST(CHARINDEX('-',@TEXT) AS INT) - CAST(CHARINDEX('E',@TEXT)+1 AS INT)) as 'SUBTR', LEN(@TEXT) as 'LEN'
SELECT CASE WHEN (CHARINDEX('-', @TEXT) > 0) THEN
SUBSTRING(@TEXT, CHARINDEX('E', @TEXT)+1, (CAST(CHARINDEX('-',@TEXT) AS INT) - CAST(CHARINDEX('E',@TEXT)+1 AS INT)))
ELSE
SUBSTRING(@TEXT, CHARINDEX('E', @TEXT)+1,LEN(@TEXT)- CHARINDEX('E', @TEXT))
END
END
Try it and comment for any improvements or if it does the job
select substring(@string,charindex(@first,@string)+1,charindex(@second,@string)-(charindex(@first,@string)+1))
Let us consider we have a string DUMMY_DATA_CODE_FILE and we want to find out the substring between 2nd and 3rd underscore(_). Then we use query something like this.
select SUBSTRING('DUMMY_DATA_CODE_FILE',charindex('_', 'DUMMY_DATA_CODE_FILE', (charindex('_','DUMMY_DATA_CODE_FILE', 1))+1)+1, (charindex('_', 'DUMMY_DATA_CODE_FILE', (charindex('_','DUMMY_DATA_CODE_FILE', (charindex('_','DUMMY_DATA_CODE_FILE', 1))+1))+1)- charindex('_', 'DUMMY_DATA_CODE_FILE', (charindex('_','DUMMY_DATA_CODE_FILE', 1))+1)-1)) as Code
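The nested CHARINDEX calls get hard to read quickly. One way to keep the same logic but name each underscore position is a chain of CROSS APPLY steps; this is a sketch, and the aliases p1, p2 and p3 are mine:

```sql
DECLARE @s varchar(100) = 'DUMMY_DATA_CODE_FILE';

SELECT SUBSTRING(@s, p2.pos + 1, p3.pos - p2.pos - 1) AS Code   -- CODE
FROM (VALUES (CHARINDEX('_', @s))) AS p1(pos)                         -- 1st underscore (6)
CROSS APPLY (VALUES (CHARINDEX('_', @s, p1.pos + 1))) AS p2(pos)      -- 2nd underscore (11)
CROSS APPLY (VALUES (CHARINDEX('_', @s, p2.pos + 1))) AS p3(pos);     -- 3rd underscore (16)
```

Each position is computed once and reused, and adding a fourth delimiter only means adding one more CROSS APPLY line.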