I want to know a flexible way to extract the string between two '-'. The issue is that '-' may or may not exist in the source string. I have the following code which works fine when '-' exists twice in the source string, marking the start and end of the extracted string. But it throws an error "Invalid length parameter passed to the LEFT or SUBSTRING function" when there is only one '-' or none at all or if the string is blank. Can someone please help? Thanks
declare #string varchar(100) = 'BLAH90-ExtractThis-WOW'
SELECT SUBSTRING(#string,CHARINDEX('-',#string)+1, CHARINDEX('-',#string,CHARINDEX('-',#string)+1) -CHARINDEX('-',#string)-1) as My_String
Desired Output: ExtractThis
If there is one dash only e.g. 'BLAH90-ExtractThisWOW' then the output should be everything after the first dash i.e. ExtractThisWOW. If there are no dashes then the string will have a blank space instead e.g. 'BLAH90 ExtractThisWOW' and should return everything after the blank space i.e. ExtractThisWOW.
You can try something like this.
When there is no dash, it starts at the space if there is one or take the whole string if not.
Then I look if there is only one dash or 2
declare #string varchar(100) = 'BLAH90-ExtractThis-WOW'
declare #dash_pos integer = CHARINDEX('-',#string)
SELECT CASE
WHEN #dash_pos = 0 THEN
RIGHT(#string,LEN(#string)-CHARINDEX(' ',#string))
ELSE (
CASE
WHEN #dash_pos = LEN(#string)-CHARINDEX('-',REVERSE(#string))+1
THEN RIGHT(#string,LEN(#string)-#dash_pos)
ELSE SUBSTRING(#string,#dash_pos+1, CHARINDEX('-',#string,#dash_pos+1) -
#dash_pos -1)
END
)
END as My_String
Try this. If there are two dashes, it'll take what is inside. If there is only one or none, it'll keep the original string.
declare #string varchar(100) = 'BLAH-90ExtractThisWOW'
declare #dash_index1 int = case when #string like '%-%' then CHARINDEX('-', #string) else -1 end
declare #dash_index2 int = case when #string like '%-%'then len(#string) - CHARINDEX('-', reverse(#string)) + 1 else -1 end
SELECT case
when #dash_index1 <> #dash_index2 then SUBSTRING(#string,CHARINDEX('-',#string)+1, CHARINDEX('-',#string,CHARINDEX('-',#string)+1) -CHARINDEX('-',#string)-1)
else #string end
as My_String
Take your existing code:
declare #string varchar(100) = 'BLAH90-ExtractThis-WOW'
SELECT SUBSTRING(#string,CHARINDEX('-',#string)+1, CHARINDEX('-',#string,CHARINDEX('-',#string)+1) -CHARINDEX('-',#string)-1) as My_String
insert one line, like so:
declare #string varchar(100) = 'BLAH90-ExtractThis-WOW'
SET #string = #string + '--'
SELECT SUBSTRING(#string,CHARINDEX('-',#string)+1, CHARINDEX('-',#string,CHARINDEX('-',#string)+1) -CHARINDEX('-',#string)-1) as My_String
and you're done. (If NULL, you will get NULL returned. Also, this will return all data based on the FIRST dash found in the string, regardless of however many dashes are in the string.)
Related
I had a previous question and it got me started but now I'm needing help completing this. Previous question = How to search a string and return only numeric value?
Basically I have a table with one of the columns containing a very long XML string. There's a number I want to extract near the end. A sample of the number would be this...
<SendDocument DocumentID="1234567">true</SendDocument>
So I want to use substrings to find the first part = true so that Im only left with the number.
What Ive tried so far is this:
SELECT SUBSTRING(xml_column, CHARINDEX('>true</SendDocument>', xml_column) - CHARINDEX('<SendDocument',xml_column) +10087,9)
The above gives me the results but its far from being correct. My concern is that, what if the number grows from 7 digits to 8 digits, or 9 or 10?
In the previous question I was helped with this:
SELECT SUBSTRING(cip_msg, CHARINDEX('<SendDocument',cip_msg)+26,7)
and thats how I got started but I wanted to alter so that I could subtract the last portion and just be left with the numbers.
So again, first part of the string that contains the digits, find the two substrings around the digits and remove them and retrieve just the digits no matter the length.
Thank you all
You should be able to setup your SUBSTRING() so that both the starting and ending positions are variable. That way the length of the number itself doesn't matter.
From the sound of it, the starting position you want is right After the "true"
The starting position would be:
CHARINDEX('<SendDocument DocumentID=', xml_column) + 25
((adding 25 because I think CHARINDEX gives you the position at the beginning of the string you are searching for))
Length would be:
CHARINDEX('>true</SendDocument>',xml_column) - CHARINDEX('<SendDocument DocumentID=', xml_column)+25
((Position of the ending text minus the position of the start text))
So, how about something along the lines of:
SELECT SUBSTRING(xml_column, CHARINDEX('<SendDocument DocumentID=', xml_column)+25,(CHARINDEX('>true</SendDocument>',xml_column) - CHARINDEX('<SendDocument DocumentID=', xml_column)+25))
Have you tried working directly with the xml type? Like below:
DECLARE #TempXmlTable TABLE
(XmlElement xml )
INSERT INTO #TempXmlTable
select Convert(xml,'<SendDocument DocumentID="1234567">true</SendDocument>')
SELECT
element.value('./#DocumentID', 'varchar(50)') as DocumentID
FROM
#TempXmlTable CROSS APPLY
XmlElement.nodes('//.') AS DocumentID(element)
WHERE element.value('./#DocumentID', 'varchar(50)') is not null
If you just want to work with this as a string you can do the following:
DECLARE #SearchString varchar(max) = '<SendDocument DocumentID="1234567">true</SendDocument>'
DECLARE #Start int = (select CHARINDEX('DocumentID="',#SearchString)) + 12 -- 12 Character search pattern
DECLARE #End int = (select CHARINDEX('">', #SearchString)) - #Start --Find End Characters and subtract start position
SELECT SUBSTRING(#SearchString,#Start,#End)
Below is the extended version of parsing an XML document string. In the example below, I create a copy of a PLSQL function called INSTR, the MS SQL database does not have this by default. The function will allow me to search strings at a designated starting position. In addition, I'm parsing a sample XML string into a variable temp table into lines and only looking at lines that match my search criteria. This is because there may be many elements with the words DocumentID and I'll want to find all of them. See below:
IF EXISTS (select * from sys.objects where name = 'INSTR' and type = 'FN')
DROP FUNCTION [dbo].[INSTR]
GO
CREATE FUNCTION [dbo].[INSTR] (#String VARCHAR(8000), #SearchStr VARCHAR(255), #Start INT, #Occurrence INT)
RETURNS INT
AS
BEGIN
DECLARE #Found INT = #Occurrence,
#Position INT = #Start;
WHILE 1=1
BEGIN
-- Find the next occurrence
SET #Position = CHARINDEX(#SearchStr, #String, #Position);
-- Nothing found
IF #Position IS NULL OR #Position = 0
RETURN #Position;
-- The required occurrence found
IF #Found = 1
BREAK;
-- Prepare to find another one occurrence
SET #Found = #Found - 1;
SET #Position = #Position + 1;
END
RETURN #Position;
END
GO
--Assuming well formated xml
DECLARE #XmlStringDocument varchar(max) = '<SomeTag Attrib1="5">
<SendDocument DocumentID="1234567">true</SendDocument>
<SendDocument DocumentID="1234568">true</SendDocument>
</SomeTag>'
--Split Lines on this element tag
DECLARE #SplitOn nvarchar(25) = '</SendDocument>'
--Let's hold all lines in Temp variable table
DECLARE #XmlStringLines TABLE
(
Value nvarchar(100)
)
While (Charindex(#SplitOn,#XmlStringDocument)>0)
Begin
Insert Into #XmlStringLines (value)
Select
Value = ltrim(rtrim(Substring(#XmlStringDocument,1,Charindex(#SplitOn,#XmlStringDocument)-1)))
Set #XmlStringDocument = Substring(#XmlStringDocument,Charindex(#SplitOn,#XmlStringDocument)+len(#SplitOn),len(#XmlStringDocument))
End
Insert Into #XmlStringLines (Value)
Select Value = ltrim(rtrim(#XmlStringDocument))
--Now we have a table with multple lines find all Document IDs
SELECT
StartPosition = CHARINDEX('DocumentID="',Value) + 12,
--Now lets use the INSTR function to find the first instance of '">' after our search string
EndPosition = dbo.INSTR(Value,'">',( CHARINDEX('DocumentID="',Value)) + 12,1),
--Now that we know the start and end lets use substring
Value = SUBSTRING(value,(
-- Start Position
CHARINDEX('DocumentID="',Value)) + 12,
--End Position Minus Start Position
dbo.INSTR(Value,'">',( CHARINDEX('DocumentID="',Value)) + 12,1) - (CHARINDEX('DocumentID="',Value) + 12))
FROM
#XmlStringLines
WHERE Value like '%DocumentID%' --Only care about lines with a document id
String = "Vishal Hariharan" .. Replace first "H" with "A" SQL
I have one scenario where I want replace only first "H" character among others "H" positions and remaining keep as it is.
There are many simplistic answers for this on Stack Overflow that don't take into account the case where the character being searched for doesn't exist in the string. In these cases, the SQL will return NULL if the replacement character does not exist in the string. Try this instead:
declare #name varchar(max), #ToReplace varchar(max), #ReplaceWith varchar(max)
set #name = 'Vishal Hariharan'
set #ToReplace = 'H'
set #ReplaceWith = 'A'
select #name
select stuff(#name, charindex(#ToReplace,#name),1, CASE WHEN charindex(#ToReplace,#name) <> 0 THEN #ReplaceWith ELSE #ToReplace END)
The CASE statement simply checks if the character exists in the string, and if it doesn't, it replaces the matching character with itself.
DECLARE #findChar varchar(max)='H'
DECLARE #RepalceCharacter varchar(max)='A'
DECLARE #OriginText varchar(max)='Vishal Hariharan'
IF (CharIndex(#findChar, #OriginText)<>0)
SELECT Stuff(#OriginText, CharIndex(#findChar, #OriginText), Len(#findChar), #RepalceCharacter)
ELSE
SELECT #OriginText
I have a column '<InstructorID>' which may contain data like "79,instr1,inst2,13" and so on.
The following code gives me result like this "791213"
declare #InstructorID varchar(100)
set #InstructorID= (select InstructorID from CourseSession where CourseSessionNum=262)
WHILE PATINDEX('%[^0-9]%', #InstructorID) > 0
BEGIN
SET #InstructorID = STUFF(#InstructorID, PATINDEX('%[^0-9]%', #InstructorID), 1, '')
END
select #InstructorID
I need the output ti be like this "79,13"
i.e those numbers attached to characters shoud not appear in output.
P.S: I need to achieve this using sql only. Unfortunately i'm unable to use Regex which would have made this task much easier.
I agree with others that your problem would seem to be indicative of a mistake in your data design.
However, accepting that you cannot change the design, the following would allow you to achieve what you are looking for:
DECLARE #InstructorID VARCHAR(100)
DECLARE #Part VARCHAR(100)
DECLARE #Pos INT
DECLARE #Return VARCHAR(100)
SET #InstructorID = '79,instr1,inst2,13'
SET #Return = ''
-- Continue until InstructorID is empty
WHILE (LEN(#InstructorID) > 0)
BEGIN
-- Get the position of the next comma, and set to the end of InstructorID if there are no more
SET #Pos = CHARINDEX(',', #InstructorID)
IF (#Pos = 0)
SET #Pos = LEN(#InstructorID)
-- Get the next part of the text and shorted InstructorID
SET #Part = SUBSTRING(#InstructorID, 1, #Pos)
SET #InstructorID = RIGHT(#InstructorID, LEN(#InstructorID) - #Pos)
-- Check that the part is numeric
IF (ISNUMERIC(#Part) = 1)
SET #Return = #Return + #Part
END
-- Trim trailing comma (if any)
IF (RIGHT(#Return, 1) = ',')
SET #Return = LEFT(#Return, LEN(#Return) - 1)
PRINT #Return
Essentially, this loops through the #InstructorID, extracting parts of text between commas.
If the part is numeric then it adds it to the output text. I am PRINTing the text but you could SELECT it or use it however you wish.
Obviously, where I have SET #InstructorID = xyz, you should change this to your SELECT statement.
This code can be placed into a UDF if preferred, although as I say, your data format seems less than ideal.
I am using SQL Server 2008
I have sql string in column with ; separated values. How i can trim the below value
Current string:
;145615;1676288;178829;
Output:
145615;1676288;178829;
Please help with sql query to trim the first ; from string
Note : The first char may be or may not be ; but if it is ; then only it should trim.
Edit: What i had tried before, although it doesn't make sense after so many good responses.
DECLARE
#VAL VARCHAR(1000)
BEGIN
SET #VAL =';13342762;1334273;'
IF(CHARINDEX(';',#VAL,1)=1)
BEGIN
SELECT SUBSTRING(#VAL,2,LEN(#VAL))
END
ELSE
BEGIN
SELECT #VAL
END
END
SELECT CASE WHEN col LIKE ';%'
THEN STUFF(col,1,1,'') ELSE col END
FROM dbo.table;
Just check the first character, and if it matches, start from the second character:
SELECT CASE WHEN SUBSTRING(col,1,1) = ';'
THEN SUBSTRING(col,2,LEN(col))
ELSE col
END AS col
Here's an example:
DECLARE #v varchar(10)
SET #v = ';1234'
SELECT
CASE
WHEN LEFT(#v,1) = ';' THEN RIGHT(#v, LEN(#v) - 1)
ELSE #v
END
A further development on #Aaron Bertrand's answer:
SELECT
STUFF(col, 1, PATINDEX(';%', col), '')
FROM ...
PATINDEX is similar to LIKE in that it uses a pattern search, but being a function it also returns the position of the first match. In this case, since we a looking for a ; specifically at the beginning of a string, the position returned is going to be either 1 (if found) or 0 (if not found). If it is 1, the STUFF function will delete 1 character at the beginning of the string, and if the position is 0, STUFF will delete 0 characters.
I have this following data:
0297144600-4799 0297485500-5599
The 0297485500-5599 based on observation always on position 31 char from the left which this is an easy approach.
But I would like to do is to anticipate just in case the data is like this below which means the position is no longer valid:
0297144600-4799 0297485500-5599 0297485600-5699
As you can see, I guess the first approach will the split by 1 blank space (" ") but due to number of space is unknown (varies) how do I take this approach then? Is there any method to find the space in between and shrink into 1 blank space (" ").
BTW ... it needs to be done in TSQL (Ms SQL 2005) unfortunately cause it's for SSIS :(
I am open with your idea/suggestion.
Thanks
I have updated my answer a bit, now that I know the number pattern will not always match. This code assumes the sequences will begin and end with a number and be separated by any number of spaces.
DECLARE #input nvarchar -- max in parens
DECLARE #pattern nvarchar -- max in parens
DECLARE #answer nvarchar -- max in parens
DECLARE #pos int
SET #input = ' 0297144623423400-4799 5615618131201561561 0297485600-5699 '
-- Make sure our search string has whitespace at the end for our pattern to match
SET #input = #input + ' '
-- Find anything that starts and ends with a number
WHILE PATINDEX('%[0-9]%[0-9] %', #input) > 0
BEGIN
-- Trim off the leading whitespace
SET #input = LTRIM(#input)
-- Find the end of the sequence by finding a space
SET #pos = PATINDEX('% %', #input)
-- Get the result out now that we know where it is
SET #answer = SUBSTRING(#input, 0, #pos)
SELECT [Result] = #answer
-- Remove the result off the front of the string so we can continue parsing
SET #input = SUBSTRING(#input, LEN(#answer) + 1, 8096)
END
Assuming you're processing one line at a time, you can also try this:
DECLARE #InputString nvarchar(max)
SET #InputString = '0297144600-4799 0297485500-5599 0297485600-5699'
BEGIN
WHILE CHARINDEX(' ',#InputString) > 0 -- Checking for double spaces
SET #InputString =
REPLACE(#InputString,' ',' ') -- Replace 2 spaces with 1 space
END
PRINT #InputString
(taken directly from SQLUSA, fnRemoveMultipleSpaces1)