Text between string provided examples below - ssms

I have strings like these in a column:
Abc_def_ghi_contact.pdf
Asdd_dk_hk_can.pdf
I need to extract whatever is
before the . and after the last _ in each value.
The result for the above should be:
contact
can
I need this as SSMS code.

If the file name always has a .pdf extension, the query below works.
declare @str varchar(100) = 'Abc_def_ghi_contact.pdf'
select
SUBSTRING(
right(@str, charindex('_', reverse(@str) + '_') - 1)
,1
,CASE WHEN CHARINDEX('.',right(@str, charindex('_', reverse(@str) + '_') - 1)) >1
THEN CHARINDEX('.',right(@str, charindex('_', reverse(@str) + '_') - 1))-1
ELSE LEN(right(@str, charindex('_', reverse(@str) + '_') - 1))
END
)
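The same idea can be written so that the tail after the last underscore is computed only once. This is a minimal sketch, not the answerer's query: the inline VALUES list and the names f, name, t, and tail are illustrative assumptions, and it cuts at the first dot found in the tail.

```sql
-- Assumed sample rows; in practice this would be your own table and column.
SELECT f.name,
       -- Keep everything before the first dot; the appended '.' covers names without one.
       LEFT(t.tail, CHARINDEX('.', t.tail + '.') - 1) AS result
FROM (VALUES ('Abc_def_ghi_contact.pdf'),
             ('Asdd_dk_hk_can.pdf')) AS f(name)
CROSS APPLY (
    -- Text after the last underscore; the appended '_' covers names without one.
    SELECT RIGHT(f.name, CHARINDEX('_', REVERSE(f.name) + '_') - 1) AS tail
) AS t;
-- result: contact, can
```

CROSS APPLY is used here only to give the intermediate value a name, which avoids repeating the RIGHT/REVERSE expression three times.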

Related

Substring in middle of names

My table contain column [File] with names of files
I have files like :
U_1456789_23456789_File1_automaticrepair
U_3456789_3456789_File2_jumpjump
B_1134_445673_File3_plane
I_111345_333345_File4_chupapimonienio
P_1156_3556_File5 idk what
etc...
I want to create a column that shows only the bolded values (the part between the second and third underscores, e.g. 23456789 in the first row). How can I do that?
If your RDBMS supports it, a regular expression is a much cleaner solution. If it doesn't, (and SQL Server doesn't by default) you can use a combination of SUBSTRING and CHARINDEX to get the text in the column between the second and third underscores as explained in this question.
Assuming a table created as follows:
CREATE TABLE [Files] ([File] NVARCHAR(200));
INSERT INTO [Files] VALUES
('U_1456789_23456789_File1_automaticrepair'),
('U_3456789_3456789_File2_jumpjump'),
('B_1134_445673_File3_plane'),
('I_111345_333345_File4_chupapimonienio'),
('P_1156_3556_File5 idk what');
You can use the query:
SELECT [File],
SUBSTRING([File],
-- Start taking after the second underscore
-- in the original field value
CHARINDEX('_', [File], CHARINDEX('_', [File]) + 1) + 1,
-- Continue taking for the length between the
-- index of the second and third underscores
CHARINDEX('_', [File], CHARINDEX('_', [File], CHARINDEX('_', [File]) + 1) + 1) - (CHARINDEX('_', [File], CHARINDEX('_', [File]) + 1) + 1)) AS Part
FROM [Files];
To get the results:
File | Part
U_1456789_23456789_File1_automaticrepair | 23456789
U_3456789_3456789_File2_jumpjump | 3456789
B_1134_445673_File3_plane | 445673
I_111345_333345_File4_chupapimonienio | 333345
P_1156_3556_File5 idk what | 3556
See the SQL Fiddle
Edit: to brute force support for inputs with only two underscores:
CREATE TABLE [Files] ([File] NVARCHAR(200));
INSERT INTO [Files] VALUES
('U_1456789_23456789_File1_automaticrepair'),
('U_3456789_3456789_File2_jumpjump'),
('B_1134_445673_File3_plane'),
('I_111345_333345_File4_chupapimonienio'),
('P_1156_3556_File5 idk what'),
('K_25444_filenamecar');
Add a case for when a third underscore could not be found and adjust the start position/length passed to SUBSTRING.
SELECT [File],
CASE WHEN CHARINDEX('_', [File], CHARINDEX('_', [File], CHARINDEX('_', [File]) + 1) + 1) = 0
THEN
SUBSTRING([File],
CHARINDEX('_', [File]) + 1,
CHARINDEX('_', [File], CHARINDEX('_', [File]) + 1) - (CHARINDEX('_', [File]) + 1))
ELSE
SUBSTRING([File],
CHARINDEX('_', [File], CHARINDEX('_', [File]) + 1) + 1,
CHARINDEX('_', [File], CHARINDEX('_', [File], CHARINDEX('_', [File]) + 1) + 1) - (CHARINDEX('_', [File], CHARINDEX('_', [File]) + 1) + 1))
END AS Part
FROM [Files];
File | Part
U_1456789_23456789_File1_automaticrepair | 23456789
U_3456789_3456789_File2_jumpjump | 3456789
B_1134_445673_File3_plane | 445673
I_111345_333345_File4_chupapimonienio | 333345
P_1156_3556_File5 idk what | 3556
K_25444_filenamecar | 25444
See the SQL Fiddle
Note that this approach is even more brittle, and you're definitely in the realm of problems that are likely better handled in application code than by the SQL engine.
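The nested CHARINDEX calls above can be made easier to read by naming each underscore position once. The following is only a sketch of that idea: the aliases f, p1, p2, and p3 are illustrative, the rows are inlined with VALUES instead of reading from [Files], and it assumes every value has at least two underscores.

```sql
SELECT f.[File],
       CASE WHEN p3.pos > 0
            -- Three or more underscores: take the text between the 2nd and 3rd.
            THEN SUBSTRING(f.[File], p2.pos + 1, p3.pos - p2.pos - 1)
            -- Only two underscores: fall back to the text between the 1st and 2nd.
            ELSE SUBSTRING(f.[File], p1.pos + 1, p2.pos - p1.pos - 1)
       END AS Part
FROM (VALUES ('U_1456789_23456789_File1_automaticrepair'),
             ('K_25444_filenamecar')) AS f([File])
CROSS APPLY (SELECT CHARINDEX('_', f.[File]) AS pos) AS p1              -- 1st underscore
CROSS APPLY (SELECT CHARINDEX('_', f.[File], p1.pos + 1) AS pos) AS p2  -- 2nd underscore
CROSS APPLY (SELECT CHARINDEX('_', f.[File], p2.pos + 1) AS pos) AS p3; -- 3rd, or 0 if absent
-- Part: 23456789, 25444
```

Each CROSS APPLY evaluates one position per row, so the final SUBSTRING reads as plain arithmetic on named values.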

Get substring between second and fourth slash

I have a string that looks like this:
Y:\Data\apples\oranges\Scott\notes
I need a column that looks like this:
apples\oranges
This is what I have so far and it does not work:
SELECT SUBSTRING(
[Group],
CHARINDEX('\', [Group]) + 1,
LEN([Group]) - CHARINDEX('\', [Group]) - CHARINDEX('\', REVERSE([Group]))
) from datamap.finaltest
The strings will not always have a fixed number of slashes. For example you could have:
Y:\Data\Apples\bananas
Y:\Apples\Pears\oranges\peanuts
The data will always have:
drive letter + '\' + '1st level folder' + '\' + 'Second level folder'
It may have more than two levels though.
I have searched the forum but can't find anything specific.
Thanks
A blunt approach: convert your input into XML, read the values by node position, and concatenate the nodes you want in the output.
;WITH MyTempData
AS
(
SELECT Convert(xml,'<n>'+Replace('Y:\Data\Apples','\','</n><n>')+'</n>') XMLString
)
SELECT COALESCE(XMLString.value('(/n[3])', 'varchar(20)'),'') + '\' +
COALESCE(XMLString.value('(/n[4])', 'varchar(20)'),'') MyFinalOutput
FROM MyTempData
Probably not the best way, but this will get you there.
DECLARE @string varchar(255) = 'Y:\data\apples\oranges\Scott\notes'
SELECT LEFT(RIGHT(@string,LEN(@string)-CHARINDEX('\', @string, CHARINDEX('\', @string,1) + 1)),CHARINDEX('\', RIGHT(@string,LEN(@string)-CHARINDEX('\', @string, CHARINDEX('\', @string,1) + 1)), CHARINDEX('\',RIGHT(@string,LEN(@string)-CHARINDEX('\', @string, CHARINDEX('\', @string,1) + 1)),1)+1)-1)
Here is a way using nested CHARINDEX calls
declare @var varchar(4000) = 'Y:\Data\apples\oranges\Scott\notes'
-- position of the second slash
declare @secondSlash int = (select CHARINDEX('\',@var,CHARINDEX('\',@var) + 1))
-- position of the fourth slash
declare @fourthSlash int = (select CHARINDEX('\',@var,CHARINDEX('\',@var,CHARINDEX('\',@var,CHARINDEX('\',@var) + 1)+1)+1))
select SUBSTRING(@var,@secondSlash + 1,@fourthSlash - @secondSlash - 1)
Or, for your data table...
select SUBSTRING([Group],CHARINDEX('\',[Group],CHARINDEX('\',[Group]) + 1) + 1,CHARINDEX('\',[Group],CHARINDEX('\',[Group],CHARINDEX('\',[Group],CHARINDEX('\',[Group]) + 1)+1)+1) - CHARINDEX('\',[Group],CHARINDEX('\',[Group]) + 1) - 1)
from datamap.finaltest
If this is something you need to do often, or is prone to changing, it may be beneficial to implement a function which will make your code more readable/maintainable:
SELECT SUBSTRING(@t, dbo.CHARINDEX2('\', @t, 2) + 1, dbo.CHARINDEX2('\', @t, 4) - dbo.CHARINDEX2('\', @t, 2) - 1);
Using this 'find nth occurrence' function:
http://www.sqlservercentral.com/scripts/Miscellaneous/30497/
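For readers who cannot reach the linked script, here is a minimal sketch of what such a helper could look like. The name dbo.CharIndexNth, its parameter order, and its return convention (0 when there are fewer than n occurrences) are assumptions made for this example and are not taken from the linked function. It also assumes the input has at least two slashes, as the question guarantees.

```sql
-- Hypothetical helper: position of the @n-th occurrence of @find in @text, or 0 if there are fewer.
CREATE FUNCTION dbo.CharIndexNth (@find varchar(10), @text varchar(4000), @n int)
RETURNS int
AS
BEGIN
    DECLARE @pos int = 0, @i int = 0;
    WHILE @i < @n
    BEGIN
        -- Search again, starting just past the previous hit.
        SET @pos = CHARINDEX(@find, @text, @pos + 1);
        IF @pos = 0 RETURN 0;
        SET @i = @i + 1;
    END;
    RETURN @pos;
END;
GO  -- batch separator understood by SSMS and sqlcmd

DECLARE @t varchar(4000) = 'Y:\Data\apples\oranges\Scott\notes';
DECLARE @second int = dbo.CharIndexNth('\', @t, 2);
-- Fall back to the end of the string when there is no fourth slash.
DECLARE @fourth int = ISNULL(NULLIF(dbo.CharIndexNth('\', @t, 4), 0), LEN(@t) + 1);
SELECT SUBSTRING(@t, @second + 1, @fourth - @second - 1) AS Folders;  -- apples\oranges
```

With an input such as 'Y:\Data\Apples\bananas' the fallback makes the query return everything after the second slash instead of failing on a negative length.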

Pad Zero before first hyphen and remove spaces and add BA and IN

I have data as below
98-45.3A-22
104-44.0A-23
00983-29.1-22
01757-42.5A-22
04968-37.3A2-23
I am looking for output as below in SQL Server:
00098-BA45.3A-IN-22
00104-BA44.0A-IN-23
00983-BA29.1-IN-22
01757-BA42.5A-IN-22
04968-BA37.3A2-IN-23
I split the value into parts to cope with tricky data templates. This should work even when the tail is not a dash followed by two digits:
WITH Src AS
(
SELECT * FROM (VALUES
('98-45.3A-22'),
('104-44.0A-23'),
('00983-29.1-22'),
('01757-42.5A-22'),
('04968-37.3A2-23')
) T(X)
), Parts AS
(
SELECT *,
RIGHT('00000'+SUBSTRING(X, 1, CHARINDEX('-',X, 1)-1),5) Front,
'BA'+SUBSTRING(X, CHARINDEX('-',X, 1)+1, 2) BA,
SUBSTRING(X, PATINDEX('%.%',X), LEN(X)-CHARINDEX('-', REVERSE(X), 1)-PATINDEX('%.%',X)+1) P,
SUBSTRING(X, LEN(X)-CHARINDEX('-', REVERSE(X), 1)+1, LEN(X)) En
FROM Src
)
SELECT Front+'-'+BA+P+'-IN'+En
FROM Parts
It returns:
00098-BA45.3A-IN-22
00104-BA44.0A-IN-23
00983-BA29.1-IN-22
01757-BA42.5A-IN-22
04968-BA37.3A2-IN-23
Try this,
DECLARE @String VARCHAR(100) = '98-45.3A-22'
SELECT ISNULL(REPLICATE('0',6 - CHARINDEX('-',@String)),'') -- Add leading Zeros
+ STUFF(
STUFF(@String,CHARINDEX('-',@String),1,'-BA'), -- Add 'BA'
CHARINDEX('-',@String,CHARINDEX('-',@String)+1)+2, -- 2 additional for the character 'BA'
1,'-IN-') -- Replace the second hyphen with '-IN-'
What if I have more than 5 digits before the first hyphen and want to remove the leading zeros to bring it down to 5 digits?
DECLARE @String VARCHAR(100) = '0000098-45.3A-22'
SELECT CASE WHEN CHARINDEX('-',@String) <= 6
THEN ISNULL(REPLICATE('0',6 - CHARINDEX('-',@String)),'') -- Add leading Zeros
+ STUFF(
STUFF( @String,CHARINDEX('-',@String),1,'-BA'), -- Add 'BA'
CHARINDEX('-',@String,CHARINDEX('-',@String)+1)+2, -- 2 additional for the character 'BA'
1,'-IN-') -- Replace the second hyphen with '-IN-'
ELSE STUFF(
STUFF(
STUFF(@String,CHARINDEX('-',@String),1,'-BA'), -- Add 'BA'
CHARINDEX('-',@String,CHARINDEX('-',@String)+1)+2, -- 2 additional for the character 'BA'
1,'-IN-'), -- Replace the second hyphen with '-IN-'
1, CHARINDEX('-',@String) - 6, '' -- remove extra leading Zeros
)
END
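Both the short and the over-length prefix can also be handled by one expression if the value is cut at its first and last hyphen. This is a sketch under the assumption that every value contains at least two hyphens; the aliases v, s, p, h1, and h2 are illustrative names, not part of the answers here.

```sql
SELECT v.s,
       RIGHT('00000' + LEFT(v.s, p.h1 - 1), 5)          -- pad or trim the prefix to 5 characters
       + '-BA' + SUBSTRING(v.s, p.h1 + 1, p.h2 - p.h1 - 1) -- text between first and last hyphen
       + '-IN-' + SUBSTRING(v.s, p.h2 + 1, LEN(v.s))    -- text after the last hyphen
       AS Formatted
FROM (VALUES ('98-45.3A-22'),
             ('0000098-45.3A-22')) AS v(s)
CROSS APPLY (
    SELECT CHARINDEX('-', v.s) AS h1,                         -- first hyphen
           LEN(v.s) - CHARINDEX('-', REVERSE(v.s)) + 1 AS h2  -- last hyphen
) AS p;
-- Formatted: 00098-BA45.3A-IN-22 for both rows
```

RIGHT(..., 5) does double duty: it keeps the padding zeros for short prefixes and drops the extra leading characters of long ones, so no CASE is needed.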
Making assumptions that the format is consistent (e.g. always ends with "-" + 2 characters....)
DECLARE @Data TABLE (Col1 VARCHAR(100))
INSERT @Data ( Col1 )
SELECT Col1
FROM (
VALUES ('98-45.3A-22'), ('104-44.0A-23'),
('00983-29.1-22'), ('01757-42.5A-22'),
('04968-37.3A2-23')
) x (Col1)
SELECT RIGHT('0000' + LEFT(Col1, CHARINDEX('-', Col1) - 1), 5)
+ '-BA' + SUBSTRING(Col1, CHARINDEX('-', Col1) + 1, CHARINDEX('.', Col1) - CHARINDEX('-', Col1))
+ SUBSTRING(Col1, CHARINDEX('.', Col1) + 1, LEN(Col1) - CHARINDEX('.', Col1) - 3)
+ '-IN-' + RIGHT(Col1, 2)
FROM @Data
It's not ideal IMO to do this string manipulation all the time in SQL. You could shift it out to your presentation layer, or store the pre-formatted value in the db to save the cost of this every time.
Use REPLICATE and CHARINDEX:
REPLICATE: repeats the given string the number of times specified in the second argument
CHARINDEX: returns the position of the first occurrence of a substring
Declare @Data AS VARCHAR(50)='98-45.3A-22'
SELECT REPLICATE('0',6-CHARINDEX('-',@Data)) + @Data
SELECT
SUBSTRING
(
(REPLICATE('0',6-CHARINDEX('-',@Data)) +@Data)
,0
,6
)
+'-'+'BA'+ CAST('<x>' + REPLACE(@Data,'-','</x><x>') + '</x>' AS XML).value('/x[2]','varchar(max)')
+'-'+ 'IN'+ '-' + CAST('<x>' + REPLACE(@Data,'-','</x><x>') + '</x>' AS XML).value('/x[3]','varchar(max)')
Alternatively, you can use PARSENAME() with this query:
WITH t AS (
SELECT
PARSENAME(REPLACE(REPLACE(s, '.', '###'), '-', '.'), 3) AS p1,
REPLACE(PARSENAME(REPLACE(REPLACE(s, '.', '###'), '-', '.'), 2), '###', '.') AS p2,
PARSENAME(REPLACE(REPLACE(s, '.', '###'), '-', '.'), 1) AS p3
FROM yourTable)
SELECT RIGHT('00000' + p1, 5) + '-BA' + p2 + '-IN-' + p3
FROM t;

Extract string between after second / and before -

I have a field that holds an account code. I've managed to extract the first 2 parts OK but I'm struggling with the last 2.
The field data is as follows:
812330/50110/0-0
812330/50110/BDG001-0
812330/50110/0-X001
I need to get the string between the second "/" and the "-", and the string after the "-". Both fields have variable lengths, so I would be looking to output 0 and 0 on the first record, BDG001 and 0 on the second record, and 0 and X001 on the third record.
Any help much appreciated, thanks.
You can use CHARINDEX and LEFT/RIGHT:
CREATE TABLE #tab(col VARCHAR(1000));
INSERT INTO #tab VALUES ('812330/50110/0-0'),('812330/50110/BDG001-0'),
('812330/50110/0-X001');
WITH cte AS
(
SELECT
col,
r = RIGHT(col, CHARINDEX('/', REVERSE(col))-1)
FROM #tab
)
SELECT col,
r,
sub1 = LEFT(r, CHARINDEX('-', r)-1),
sub2 = RIGHT(r, LEN(r) - CHARINDEX('-', r))
FROM cte;
LiveDemo
EDIT:
or even simpler:
SELECT
col
,sub1 = SUBSTRING(col,
LEN(col) - CHARINDEX('/', REVERSE(col)) + 2,
CHARINDEX('/', REVERSE(col)) -CHARINDEX('-', REVERSE(col))-1)
,sub2 = RIGHT(col, CHARINDEX('-', REVERSE(col))-1)
FROM #tab;
LiveDemo2
EDIT 2:
Using PARSENAME (if your data does not contain . and splits into at most four parts, which is the most PARSENAME can handle):
SELECT
col,
sub1 = PARSENAME(REPLACE(REPLACE(col, '/', '.'), '-', '.'), 2),
sub2 = PARSENAME(REPLACE(REPLACE(col, '/', '.'), '-', '.'), 1)
FROM #tab;
LiveDemo3
...Or you can do this, so you only go from left side to right, so you don't need to count from the end in case you have more '/' or '-' signs:
SELECT
SUBSTRING(columnName, CHARINDEX('/' , columnName, CHARINDEX('/' , columnName) + 1) + 1,
CHARINDEX('-', columnName) - CHARINDEX('/' , columnName, CHARINDEX('/' , columnName) + 1) - 1) AS FirstPart,
SUBSTRING(columnName, CHARINDEX('-' , columnName) + 1, LEN(columnName)) AS LastPart
FROM table_name
One method is to download a split() function off the web and use it. However, the values end up in separate rows, not separate columns. An alternative is a series of nested subqueries, CTEs, or outer applies:
select t.*, p1.part1, p12.part2, p12.part3
from table_name t outer apply
(select -- appending '/' keeps the length non-negative when no slash is present
left(t.field, charindex('/', t.field + '/') - 1) as part1,
substring(t.field, charindex('/', t.field) + 1, len(t.field)) as rest1
) p1 outer apply
(select left(p1.rest1, charindex('/', p1.rest1 + '/') - 1) as part2,
substring(p1.rest1, charindex('/', p1.rest1) + 1, len(p1.rest1)) as part3
) p12
where t.field like '%/%/%';
The where clause guarantees that the field value is in the right format. Otherwise, you need to start sprinkling the code with CASE expressions to handle misformatted data.

simplifying a LEFT / REPLACE query

I have a query that is in dire need of being simplified. Here is part of the query:
SELECT
LEFT(MLIS.REQUESTOR_FIRST_NAME, CharIndex( ' ', MLIS.REQUESTOR_FIRST_NAME + ' ' ) - 1)
, CharIndex( ' ', LEFT(MLIS.REQUESTOR_FIRST_NAME, CharIndex( ' ', MLIS.REQUESTOR_FIRST_NAME + ' ' ) - 1) + ' ' ) - 1)
+REPLICATE(' ',25),25)+
LEFT(' '+REPLICATE(' ',20),20)+
LEFT(
LEFT(
LEFT(MLIS.REQUESTOR_LAST_NAME, CharIndex( ',', MLIS.REQUESTOR_LAST_NAME + ',' ) - 1)
, CharIndex( ',', LEFT(MLIS.REQUESTOR_LAST_NAME, CharIndex( ',', MLIS.REQUESTOR_LAST_NAME + ',' ) - 1) + ',' ) - 1)
The reason I am using REPLICATE is that I am building a fixed-length string; each column needs to be a fixed length.
In addition to the above query, for every occurrence of MLIS.REQUESTOR_FIRST_NAME and MLIS.REQUESTOR_LAST_NAME I need to do:
REPLACE(REPLACE(MLIS.REQUESTOR_FIRST_NAME,', MD',''),',MD','')
and
REPLACE(REPLACE(MLIS.REQUESTOR_LAST_NAME,', MD',''),',MD','')
How do I include these REPLACES in the query and simplify the entire thing?
thanks so much for your guidance and kind help.
Select the common bits in a subquery (you'll have a few more than shown here):
SELECT
LEFT(REQUESTOR_FIRST_NAME, fname_idx - 1)
, CharIndex( ' ', LEFT(MLIS.REQUESTOR_FIRST_NAME, fname_idx - 1) + ' ' ) - 1)
..
FROM ( select CharIndex( ' ', MLIS.REQUESTOR_FIRST_NAME + ' ' ) fname_idx, REQUESTOR_FIRST_NAME from...
Using a subquery will help with the syntax. In addition, you can cast to a CHAR() to pad and truncate strings to a given length.
I think the following does what you want:
SELECT cast(fname as char(25)) + ' ' + cast(lname as char(25))
from (select replace(replace(LEFT(MLIS.REQUESTOR_FIRST_NAME,
CharIndex(' ', MLIS.REQUESTOR_FIRST_NAME + ' ' ) - 1
),
',MD', ''),
', MD', '') as fname,
-- the second nested LEFT from the question is dropped: after cutting
-- at the first comma there is no comma left for it to find
replace(replace(left(MLIS.REQUESTOR_LAST_NAME,
CharIndex(',', MLIS.REQUESTOR_LAST_NAME + ',' ) - 1
),
',MD', ''),
', MD', '') as lname
from ... MLIS -- the source table is not shown in the question
) t
However, it is hard to follow the original query, and there might be a syntax error. This query is meant to give you some guidance on solving the problem. I would also put a cast after the concatenate to be sure the final string is the right length.
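The padding and truncating behavior of CAST(... AS CHAR(n)) mentioned above can be checked in isolation. This is a small illustration with made-up literals; the trailing '|' is only there to make the padding visible.

```sql
SELECT CAST('Ann' AS char(5)) + '|'       AS padded,     -- 'Ann  |'
       CAST('Alexander' AS char(5)) + '|' AS truncated;  -- 'Alexa|'
```

Because the cast always yields exactly n characters, it can stand in for the LEFT(value + REPLICATE(' ', n), n) pattern used in the question.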