How can I use SQL Substring to extract characters from this filename? - sql

I'm attempting to use the SUBSTRING function to extract a name out of a filename.
An example filename would be: "73186_RHIMagnesita_PHI_StopLoss_TruncSSN_NonRedact_Inc_to_Apr2022_Paid_to_Apr2022_EDIT"
I'm attempting to extract the "RHIMagnesita" from this filename.
The substring I used was:
SUBSTRING(DFH.FileName, CHARINDEX('_', DFH.FileName) + 1, CHARINDEX('_PHI', DFH.FileName) - 1)
The results it gave were: "RHIMagnesita_PHI_S"
How do I extract only "RHIMagnesita" using the Substring function?

The third parameter of SUBSTRING is a length, not an end position, so you need to subtract the position of the first underscore, and one more so that the trailing underscore is not included.
SUBSTRING(DFH.FileName, CHARINDEX('_', DFH.FileName) + 1, CHARINDEX('_PHI', DFH.FileName) - CHARINDEX('_', DFH.FileName) - 1)
Without the final - 1 the result would be "RHIMagnesita_".
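To see where the extra characters come from, it can help to select the intermediate positions next to the final result. This is only a sketch: @FileName is a hypothetical local variable standing in for the DFH.FileName column, and the values in the comments are what the sample filename gives.

```sql
-- @FileName is a hypothetical stand-in for the DFH.FileName column.
DECLARE @FileName varchar(200) =
    '73186_RHIMagnesita_PHI_StopLoss_TruncSSN_NonRedact_Inc_to_Apr2022_Paid_to_Apr2022_EDIT';

SELECT
    CHARINDEX('_', @FileName)    AS FirstUnderscore,   -- 6
    CHARINDEX('_PHI', @FileName) AS PhiMarker,         -- 19
    -- number of characters strictly between the two markers
    CHARINDEX('_PHI', @FileName) - CHARINDEX('_', @FileName) - 1 AS NameLength,  -- 12
    SUBSTRING(@FileName,
              CHARINDEX('_', @FileName) + 1,
              CHARINDEX('_PHI', @FileName) - CHARINDEX('_', @FileName) - 1) AS FilePart;  -- RHIMagnesita
```

The original attempt passed 19 - 1 = 18 as the length, which is why it returned the 12-character name plus the next six characters.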

You were close. You need to use CHARINDEX to also find the position of the second underscore.
SELECT SUBSTRING(FileName,
CHARINDEX('_', FileName) + 1,
CHARINDEX('_', FileName, CHARINDEX('_', FileName) + 1) -
CHARINDEX('_', FileName) - 1) AS FilePart
FROM yourTable;

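Here's a way using STRING_SPLIT and OFFSET/FETCH rather than SUBSTRING. We split the string and return only the second row. Be aware that without the ordinal column STRING_SPLIT does not guarantee the order of its output rows, so this version is not fully reliable.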
SELECT
value
FROM STRING_SPLIT('73186_RHIMagnesita_PHI_StopLoss_TruncSSN_NonRedact_Inc_to_Apr2022_Paid_to_Apr2022_EDIT','_')
ORDER BY (SELECT NULL)
OFFSET 1 ROWS
FETCH NEXT 1 ROWS ONLY;
Note: On Azure SQL Database (and SQL Server 2022 or later), STRING_SPLIT has an optional enable_ordinal parameter, so you could write this:
SELECT
value
FROM
STRING_SPLIT('73186_RHIMagnesita_PHI_StopLoss_TruncSSN_NonRedact_Inc_to_Apr2022_Paid_to_Apr2022_EDIT','_', 1)
WHERE ordinal = 2

Related

is there a way to use a mid-style function in sql like you can for excel?

I have the following string of characters:
594074_Ally_Financial_TokioMarine_MD_SLDET_20210101_20211130_20211208
I am attempting to extract everything after the first '_' but before the '_TokioMarine', so the final string will look like:
Ally_Financial
Is this possible to do with SQL? I attempted it, but it was pulling the incorrect characters. I can't get the ones in between the values specified.
SELECT
@CurPolicyHolder = Right( DFH.FileName, CHARINDEX('_', DFH.FileName) - 1)
To extract everything between the first _ character and the _TokioMarine string, you can use:
SELECT
@CurPolicyHolder = SUBSTRING(DFH.FileName, CHARINDEX('_', DFH.FileName) + 1,
CHARINDEX('_TokioMarine', DFH.FileName) - CHARINDEX('_', DFH.FileName) - 1)
SUBSTRING (Transact-SQL)
CHARINDEX (Transact-SQL)
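One thing the expression above does not cover is a filename that lacks the _TokioMarine marker: CHARINDEX then returns 0, the computed length goes negative, and SUBSTRING raises an error. A minimal sketch of one way to guard against that, assuming it is acceptable to get NULL for such rows (the second sample filename is made up for illustration):

```sql
-- The second row is a made-up filename that deliberately lacks the marker.
SELECT v.FileName,
       SUBSTRING(v.FileName,
                 CHARINDEX('_', v.FileName) + 1,
                 -- NULLIF turns "marker not found" (0) into NULL,
                 -- so the length, and therefore the result, becomes NULL
                 NULLIF(CHARINDEX('_TokioMarine', v.FileName), 0)
                   - CHARINDEX('_', v.FileName) - 1) AS CurPolicyHolder
FROM (VALUES ('594074_Ally_Financial_TokioMarine_MD_SLDET_20210101_20211130_20211208'),
             ('594074_Ally_Financial_OtherCarrier_MD')) AS v(FileName);
-- Row 1: Ally_Financial
-- Row 2: NULL
```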

SQL Substring code to deal with extra characters

I have a string in the format 'Filename_30062021_095700.txt' and I wrote some SQL to get the date bit (in the format DDMMYYYY after the first underscore) then convert to an INT:
declare @filename varchar(50), @filename2 varchar(50)
Set @filename = 'Filename_240621_122110.txt'
Set @filename2 = 'Filename_240621_122110_1.txt'
Select CAST(SUBSTRING(@filename, CHARINDEX('_', @filename) + 1, CHARINDEX('.', @filename) - CHARINDEX('_', @filename) - 1) AS Int) As IntDateFilename1
The problem I have is where the filename randomly has an _1 character at the end before the file extension.
I can't see what to do to my query that would cope with the extra '_1'. I've written something to check the number of _ underscore characters and if there are three then I could do something differently, I just can't see what to do for the best.
I thought of more or less the same query, except for a Left(@filename, 15) + '.txt' instead of @filename, but is there a better solution in case the DD or MM ever appear as one digit?
I would use the 2 surrounding underscores as the markers here:
SELECT
SUBSTRING(filename, CHARINDEX('_', filename) + 1,
CHARINDEX('_', filename, CHARINDEX('_', filename) + 1) -
CHARINDEX('_', filename) - 1)
FROM yourTable;
A little trick you can use to deal with your _1 issue in particular is to work with the string reversed; CHARINDEX then gives you the correct character positions and copes equally well with _1234 etc. Note that this assumes the extra suffix is present; without it, the expression strips the time part instead.
Set @filename = 'Filename_240621_122110_1234.txt'
select Reverse(Stuff(Reverse(@filename),(1+CharIndex('.',Reverse(@filename))),( CharIndex('_',Reverse(@filename))-CharIndex('.',Reverse(@filename)) ),''))
I would just look for '_' followed by six digits, then remove the initial part of the string and take 6 digits. The underscore is written as [_] because a bare _ is a single-character wildcard in PATINDEX patterns:
SELECT LEFT(STUFF(filename, 1, PATINDEX('%[_][0-9][0-9][0-9][0-9][0-9][0-9]%', filename), ''), 6)
FROM t ;
Or if you prefer using SUBSTRING():
SUBSTRING(filename, PATINDEX('%[_][0-9][0-9][0-9][0-9][0-9][0-9]%', filename) + 1, 6)
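Because PATINDEX uses the same wildcards as LIKE, an underscore in the pattern matches any single character unless it is wrapped in brackets. The sketch below compares the two spellings on a made-up filename whose name part happens to contain a run of digits:

```sql
-- 'Run1234567_240621.txt' is a made-up filename for illustration.
SELECT
    -- bare _ is a wildcard: matches 'n' followed by 123456
    PATINDEX('%_[0-9][0-9][0-9][0-9][0-9][0-9]%', 'Run1234567_240621.txt')   AS BareUnderscore,      -- 3
    -- [_] matches only a literal underscore followed by six digits
    PATINDEX('%[_][0-9][0-9][0-9][0-9][0-9][0-9]%', 'Run1234567_240621.txt') AS BracketedUnderscore; -- 11
```

For the filenames in the question both patterns happen to give the same position, so the difference only matters if the leading part of the name can contain six or more consecutive digits.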

Get text after second occurrence of a specific character

I have a column with data like this:
firstNameLetter_surname_numericCode
For example:
m_johnson_1234
I need to extract the numeric code. I've tried with SUBSTRING function but I just get:
surname_numericCode
This is the code I've used:
SET number = substring(t2.code, charindex('_', t2.code, 2) + 1, len(t2.code))
How can I get just the numeric code?
Here's a one-liner for you:
select right('m_johnson_1234', charindex('_', reverse('m_johnson_1234') + '_') - 1)
Call the CHARINDEX function twice:
SELECT SUBSTRING(
code,
NULLIF(CHARINDEX('_', code, NULLIF(CHARINDEX('_', code), 0) + 1), 0) + 1,
LEN(code)
)
FROM (VALUES
('a_b_c'),
('a_b')
) x(code)
One method is to look for the first _ in the reversed string:
select col,
stuff(col, 1, len(col) - charindex('_', reverse(col)) + 1, '') as numericCode
from (values ('firstNameLetter_surname_numericCode')) v(col);
If the numeric code is really a number -- and no other numbers start the preceding values -- then you can use patindex():
select col,
stuff(col, 1, patindex('%[_][0-9]%', col), '') as numericCode
from (values ('firstNameLetter_surname_0000')) v(col);
In Oracle (these functions are not available in SQL Server), SUBSTR and INSTR can be combined to get the numeric code.
SELECT SUBSTR('m_johnson_1234', INSTR('m_johnson_1234', '_', 1, 2) + 1) FROM dual;
For the start position argument, INSTR searches from the beginning of the string and returns the index of the second occurrence of the '_' character. SUBSTR then starts one position after that and reads to the end of the string.
If you need the result to be numeric instead of a string, wrap the SUBSTR() in a TO_NUMBER() function.
Late answer, and just because I didn't see PARSENAME() mentioned.
Example
Select parsename(replace(t2.code,'_','.'),1)
From YourTable t2
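PARSENAME is meant for object names with at most four dot-separated parts, so this trick only works while the value has no more than four segments (and contains no dots of its own). A small sketch of that limit, using literal values rather than the t2.code column:

```sql
SELECT
    PARSENAME(REPLACE('m_johnson_1234', '_', '.'), 1) AS ThreeParts,  -- 1234
    -- five segments is not a valid object name, so PARSENAME returns NULL
    PARSENAME(REPLACE('a_b_c_d_e', '_', '.'), 1)      AS FiveParts;   -- NULL
```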

TSQL extract part of string with regex

I would like to write a script that iterates over the records of a table with a cursor
and extracts, from a column value formatted like "yyy://xx/bb/147011",
only the final number 147011, and puts this value in a variable.
Is it possible to do something like that?
Many thanks.
You don't need a cursor for this. You can just use a query. The following gets everything after the last /:
select right(str, charindex('/', reverse(str)) - 1 )
from (values ('yyy://xx/bb/147011')) v(str)
It does not specifically check if it is a number, but that can be added as well.
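One way to add that numeric check is TRY_CONVERT, which is available from SQL Server 2012 and returns NULL instead of raising an error when the text is not a valid int. This is a sketch under that assumption; the second sample value is made up, and the '/' prepended inside REVERSE is there so that a value without any slash does not produce a negative length:

```sql
-- The second row is a made-up value with a non-numeric last segment.
SELECT v.str,
       TRY_CONVERT(int,
                   RIGHT(v.str, CHARINDEX('/', REVERSE('/' + v.str)) - 1)) AS FinalNumber
FROM (VALUES ('yyy://xx/bb/147011'),
             ('yyy://xx/bb/abc')) AS v(str);
-- Row 1: 147011
-- Row 2: NULL
```

TRY_CONVERT is more permissive than a strict digits-only test (it accepts a leading sign, for instance), so a LIKE check against '%[^0-9]%' may be the better fit if only plain digits should qualify.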
You can also use the below query.
SELECT RIGHT(RTRIM('yyy://xx/bb/147011'),
CHARINDEX('/', REVERSE('/' + RTRIM('yyy://xx/bb/147011'))) - 1) AS LastWord
If the trailing number contains the only digits in the value, as in the sample data, you can take everything from the first digit onward:
SELECT t.*, SUBSTRING(t.col, PATINDEX('%[0-9]%', t.col), LEN(t.col))
FROM yourTable t;

sql server all characters to right of first hyphen

In SQL Server 2014, how can I extract all characters to the right of the first hyphen in a field, where the first hyphen can be followed by many different combinations?
example 1:
Aegean-1GB-7days-COMP
desired result:
1GB-7days-COMP
example 2:
Aegean-SchooliesSpecial-7GB
desired result:
SchooliesSpecial-7GB
example 3:
AkCityOaks-1Day-3GB
desired result:
1Day-3GB
Using CHARINDEX AND SUBSTRING would work:
DECLARE @HTXT as nvarchar(max)
SET @HTXT='lkjhgf-wtrfghvbn-jk87fry--jk'
SELECT SUBSTRING(@HTXT, CHARINDEX('-', @HTXT) + 1, LEN(@HTXT))
Result:
wtrfghvbn-jk87fry--jk
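When a value contains no hyphen at all, CHARINDEX returns 0 and the expression above returns the whole string unchanged. If NULL is preferable for such rows, one possible variation is to wrap the position in NULLIF. The sketch below runs it against the three examples from the question plus one made-up value without a hyphen:

```sql
-- 'NoHyphenHere' is a made-up value to show the no-match case.
SELECT v.code,
       SUBSTRING(v.code,
                 NULLIF(CHARINDEX('-', v.code), 0) + 1,  -- NULL when there is no hyphen
                 LEN(v.code)) AS AfterFirstHyphen
FROM (VALUES ('Aegean-1GB-7days-COMP'),
             ('Aegean-SchooliesSpecial-7GB'),
             ('AkCityOaks-1Day-3GB'),
             ('NoHyphenHere')) AS v(code);
-- 1GB-7days-COMP
-- SchooliesSpecial-7GB
-- 1Day-3GB
-- NULL
```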
You can use a combination of CharIndex and SubString to get the desired result.
The following returns the location of the first hyphen, searching from the first character. Note that the string to find comes first:
CharIndex ('-', 'Aegean-1GB-7days-COMP', 1)
Then cutting the string is easy:
Select
SubString (
'Aegean-1GB-7days-COMP',
CharIndex ('-', 'Aegean-1GB-7days-COMP', 1) + 1,
Len('Aegean-1GB-7days-COMP') - CharIndex ('-', 'Aegean-1GB-7days-COMP', 1)
)
Since your data is most likely in a column, I would change this to
Select
SubString (
YourColumnName,
CharIndex ('-', YourColumnName, 1) + 1,
Len(YourColumnName) - CharIndex ('-', YourColumnName, 1)
)
From YourTableName
If you want to match -- instead of -, then look at PatIndex.
You can use PATINDEX and SUBSTRING like this (RETURN implies this runs inside a scalar function, with @Text holding the input value):
DECLARE @Text NVARCHAR(4000)
DECLARE @StartPos int
SET @StartPos = PATINDEX('%-%', @Text) + 1
RETURN SUBSTRING(@Text, @StartPos, LEN(@Text) - @StartPos + 1)
Or in one:
SUBSTRING([Text], PATINDEX('%-%', [Text]) + 1, LEN([Text]) - PATINDEX('%-%', [Text]))