Substring in middle of names - sql

My table contains a column [File] with file names.
I have files like:
U_1456789_23456789_File1_automaticrepair
U_3456789_3456789_File2_jumpjump
B_1134_445673_File3_plane
I_111345_333345_File4_chupapimonienio
P_1156_3556_File5 idk what
etc.
I want to create a column that shows only the bolded values (the part between the second and third underscores). How can I do that?

If your RDBMS supports it, a regular expression is a much cleaner solution. If it doesn't (and SQL Server doesn't by default), you can use a combination of SUBSTRING and CHARINDEX to get the text between the second and third underscores, as explained in this question.
Assuming a table created as follows:
CREATE TABLE [Files] ([File] NVARCHAR(200));
INSERT INTO [Files] VALUES
('U_1456789_23456789_File1_automaticrepair'),
('U_3456789_3456789_File2_jumpjump'),
('B_1134_445673_File3_plane'),
('I_111345_333345_File4_chupapimonienio'),
('P_1156_3556_File5 idk what');
You can use the query:
SELECT [File],
SUBSTRING([File],
-- Start taking after the second underscore
-- in the original field value
CHARINDEX('_', [File], CHARINDEX('_', [File]) + 1) + 1,
-- Continue taking for the length between the
-- index of the second and third underscores
CHARINDEX('_', [File], CHARINDEX('_', [File], CHARINDEX('_', [File]) + 1) + 1) - (CHARINDEX('_', [File], CHARINDEX('_', [File]) + 1) + 1)) AS Part
FROM [Files];
To get the results:
File                                     | Part
U_1456789_23456789_File1_automaticrepair | 23456789
U_3456789_3456789_File2_jumpjump         | 3456789
B_1134_445673_File3_plane                | 445673
I_111345_333345_File4_chupapimonienio    | 333345
P_1156_3556_File5 idk what               | 3556
See the SQL Fiddle
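The nested CHARINDEX calls above can be hard to read. As a minimal sketch of the same idea, each underscore position can be named once with CROSS APPLY (VALUES ...). The aliases p1, p2 and p3 and the inline sample rows are illustrative assumptions, not part of the original answer, and like the original query this assumes every value contains at least three underscores.

```sql
-- Sketch: compute each underscore position once, then reuse it.
-- Assumes every value has at least three underscores.
SELECT f.[File],
       SUBSTRING(f.[File], p2.pos + 1, p3.pos - p2.pos - 1) AS Part
FROM (VALUES ('U_1456789_23456789_File1_automaticrepair'),
             ('B_1134_445673_File3_plane')) AS f([File])
CROSS APPLY (VALUES (CHARINDEX('_', f.[File])))             AS p1(pos)  -- first underscore
CROSS APPLY (VALUES (CHARINDEX('_', f.[File], p1.pos + 1))) AS p2(pos)  -- second underscore
CROSS APPLY (VALUES (CHARINDEX('_', f.[File], p2.pos + 1))) AS p3(pos); -- third underscore
-- Part: '23456789' and '445673'
```

For the first row the positions are 2, 10 and 19, so SUBSTRING starts at 11 and takes 8 characters.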
Edit: to brute force support for inputs with only two underscores:
CREATE TABLE [Files] ([File] NVARCHAR(200));
INSERT INTO [Files] VALUES
('U_1456789_23456789_File1_automaticrepair'),
('U_3456789_3456789_File2_jumpjump'),
('B_1134_445673_File3_plane'),
('I_111345_333345_File4_chupapimonienio'),
('P_1156_3556_File5 idk what'),
('K_25444_filenamecar');
Add a case for when a third underscore could not be found and adjust the start position/length passed to SUBSTRING.
SELECT [File],
CASE WHEN CHARINDEX('_', [File], CHARINDEX('_', [File], CHARINDEX('_', [File]) + 1) + 1) = 0
THEN
SUBSTRING([File],
CHARINDEX('_', [File]) + 1,
CHARINDEX('_', [File], CHARINDEX('_', [File]) + 1) - (CHARINDEX('_', [File]) + 1))
ELSE
SUBSTRING([File],
CHARINDEX('_', [File], CHARINDEX('_', [File]) + 1) + 1,
CHARINDEX('_', [File], CHARINDEX('_', [File], CHARINDEX('_', [File]) + 1) + 1) - (CHARINDEX('_', [File], CHARINDEX('_', [File]) + 1) + 1))
END AS Part
FROM [Files];
File                                     | Part
U_1456789_23456789_File1_automaticrepair | 23456789
U_3456789_3456789_File2_jumpjump         | 3456789
B_1134_445673_File3_plane                | 445673
I_111345_333345_File4_chupapimonienio    | 333345
P_1156_3556_File5 idk what               | 3556
K_25444_filenamecar                      | 25444
See the SQL Fiddle
Note that this approach is even more brittle, and you're definitely in the realm of problems that are likely better handled in application code than by the SQL engine.

Related

How do I combine a substring and trim right in SQL

I am trying to extract the data between two underscore characters. In some situations, the 2nd underscore may not exist.
MyFld
P_36840
U_216137
C_203134_H
C_203134_W
I tried this:
substring(i.[MyFld],
CHARINDEX ('_',i.[MyFld])+1,len(i.[MyFld])
-CHARINDEX ('_',i.[MyFld])
) [DerivedPrimaryKey]
And I get this:
DerivedPrimaryKey
36840
216137
203134_H
203134_W
https://dbfiddle.uk/uPKC6oX4
I want to remove the second underscore and data that follows it. I'm trying to combine it with a trim right, but I'm unsure where to start.
How can I do this?
We can start by simplifying what you have so far. I will also add enough to make this a complete query, so we can see it in context for later steps:
SELECT
right(i.MyFld, len(i.MyFld) - charindex('_', i.MyFld)) [DerivedPrimaryKey]
FROM I
With this much done, we can now use it as the source for removing the trailing portion of the field:
SELECT
reverse(substring(reverse(step1)
, charindex('_', reverse(step1))+1
, len(step1)
)) [DerivedPrimaryKey]
FROM (
SELECT right(i.MyFld, len(i.MyFld) - charindex('_', i.MyFld)) [step1]
FROM I
) T
Notice the layer of nesting. You can, of course, remove the nesting, but it means replicating the entire inner expression every time you see step1 (good thing I took the time to simplify it):
SELECT
reverse(substring(reverse(right(i.MyFld, len(i.MyFld) - charindex('_', i.MyFld)))
, charindex('_', reverse(right(i.MyFld, len(i.MyFld) - charindex('_', i.MyFld))))+1
, len(right(i.MyFld, len(i.MyFld) - charindex('_', i.MyFld)))
))
FROM I
And now back to just the expression:
reverse(substring(reverse(right(i.MyFld, len(i.MyFld) - charindex('_', i.MyFld)))
, charindex('_', reverse(right(i.MyFld, len(i.MyFld) - charindex('_', i.MyFld))))+1
, len(right(i.MyFld, len(i.MyFld) - charindex('_', i.MyFld)))
))
See it work here:
https://dbfiddle.uk/nFO4Vwhm
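To see why the reverse/substring/reverse combination drops the trailing part, it may help to trace the intermediate values for one sample. This is only an illustrative sketch; the variable @step1 stands in for the inner expression named step1 above.

```sql
-- Trace of the reverse / substring / reverse steps for one sample value
DECLARE @step1 varchar(50) = '203134_H';

SELECT
    REVERSE(@step1)                           AS reversed,        -- 'H_431302'
    CHARINDEX('_', REVERSE(@step1))           AS underscore_pos,  -- 2
    SUBSTRING(REVERSE(@step1),
              CHARINDEX('_', REVERSE(@step1)) + 1,
              LEN(@step1))                    AS trimmed,         -- '431302'
    REVERSE(SUBSTRING(REVERSE(@step1),
              CHARINDEX('_', REVERSE(@step1)) + 1,
              LEN(@step1)))                   AS result;          -- '203134'
```

When the value has no underscore (for example '36840'), CHARINDEX returns 0, the substring starts at position 1, and the whole value is kept.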
There is also this alternate expression that saves one function call:
left( right(i.MyFld,len(i.MyFld)-charindex('_',i.MyFld)),
coalesce(
nullif(
charindex('_',
right(i.MyFld,len(i.MyFld)-charindex('_',i.MyFld))
) - 1, -1
),
len( right(i.MyFld,len(i.MyFld)-charindex('_',i.MyFld)) )
)
)
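The COALESCE/NULLIF pair is what copes with a missing second underscore: CHARINDEX returns 0 when nothing is found, so subtracting 1 gives -1, NULLIF turns that into NULL, and COALESCE falls back to the full length. A small sketch of just that piece, using two hypothetical inputs that already have the prefix removed:

```sql
-- Sketch: how many characters to keep from the left
SELECT v.s,
       COALESCE(NULLIF(CHARINDEX('_', v.s) - 1, -1), LEN(v.s)) AS keep_len
FROM (VALUES ('36840'), ('203134_H')) AS v(s);
-- '36840'    -> 5 (no underscore, keep everything)
-- '203134_H' -> 6 (keep up to the character before the underscore)
```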
Just two more options: one using parsename(), provided your data does not have more than 4 segments, and a second using a JSON array.
Example
Declare @YourTable Table ([MyFld] varchar(50)) Insert Into @YourTable Values
('P_36840')
,('U_216137')
,('C_203134_H')
,('C_203134_W')
Select *
,UsingParseName = reverse(parsename(reverse(replace(MyFld,'_','.')),2))
,UsingJSONValue = json_value('["'+replace(MyFld,'_','","')+'"]','$[1]')
From @YourTable
Results
MyFld UsingParseName UsingJSONValue
P_36840 36840 36840
U_216137 216137 216137
C_203134_H 203134 203134
C_203134_W 203134 203134
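The four-segment limit on parsename() is worth checking against your own data. The sketch below uses a hypothetical five-segment value: the parsename() version is expected to come back NULL, while the JSON version still returns the second segment. Wrapping the value in string_escape(..., 'json') is an added precaution (not in the original answer) for data that may contain quotes or backslashes; both it and json_value need SQL Server 2016 or later.

```sql
-- Sketch: a value with more than four segments (hypothetical sample)
DECLARE @s varchar(50) = 'A_1_B_2_C';

SELECT UsingParseName = reverse(parsename(reverse(replace(@s, '_', '.')), 2)),
       UsingJSONValue = json_value('["' + replace(string_escape(@s, 'json'), '_', '","') + '"]', '$[1]');
```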
We can do this:
Declare @testData Table ([MyFld] varchar(50));
Insert Into @testData (MyFld)
Values ('P_36840')
, ('U_216137')
, ('C_203134_H')
, ('C_203134_W');
Select *
, second_element = substring(v.MyFld, p1.pos, p2.pos - p1.pos - 1)
From @testData As td
Cross Apply (Values (concat(td.MyFld, '__'))) As v(MyFld) -- Make sure we have at least 2 delimiters
Cross Apply (Values (charindex('_', v.MyFld, 1) + 1)) As p1(pos) -- First Position
Cross Apply (Values (charindex('_', v.MyFld, p1.pos) + 1)) As p2(pos) -- Second Position
If you actually have a fixed number of characters in the first element, then it could be simplified to:
Select *
, second_element = substring(v.MyFld, 3, charindex('_', v.MyFld, 4) - 3)
From @testData td
Cross Apply (Values (concat(td.MyFld, '_'))) As v(MyFld)
Often I try to fake out SQL if an expected character isn't always present and I don't need the resulting value:
SELECT SUBSTRING(field_Calculated, 1, CHARINDEX('_', field_Calculated) - 1)
FROM (SELECT SUBSTRING(MyFld, CHARINDEX('_', MyFld) + 1, LEN(MyFld)) + '_' As field_Calculated
FROM MyTable) T
I think this is clear, but I really like the ParseName solution @JohnCappalletti suggests.
If it's only ever one numeric value you can use string_split:
SELECT * FROM MyTable
CROSS APPLY string_split(MyFld, '_')
WHERE ISNUMERIC(value) = 1
Either way you have to be careful of the data before deciding the best approach.
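If the numeric test is too loose for your data, one alternative is to pick the segment by position instead. This is a sketch that assumes SQL Server 2022 or Azure SQL Database, where STRING_SPLIT accepts a third enable_ordinal argument and returns an ordinal column; on older versions that argument is not available. The inline sample rows are illustrative.

```sql
-- Sketch: take the second segment by position (needs enable_ordinal support)
SELECT t.MyFld, s.value AS second_element
FROM (VALUES ('P_36840'), ('C_203134_H')) AS t(MyFld)
CROSS APPLY STRING_SPLIT(t.MyFld, '_', 1) AS s
WHERE s.ordinal = 2;
-- second_element: '36840' and '203134'
```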
Your data:
Declare @Table Table ([MyFld] varchar(100))
Insert Into @Table
([MyFld] ) Values
('P_36840')
,('U_216137')
,('C_203134_H')
,('C_203134_W')
Use SUBSTRING, LEFT and PATINDEX:
select
Left(
SubString(
[MyFld],
PatIndex('%[0-9.-]%', [MyFld]),
8000
),
PatIndex(
'%[^0-9.-]%',
SubString(
[MyFld],
PatIndex('%[0-9.-]%', [MyFld]),
8000
) + 'X'
)-1
) as DerivedPrimaryKey
from
@Table

Text between string provided examples below

I have strings like these in a column:
Abc_def_ghi_contact.pdf
Asdd_dk_hk_can.pdf
I need to extract whatever comes after the last _ and before the . in each value.
The results for the values above should be:
contact
can
I need this as T-SQL that I can run in SSMS.
If the file name always ends with an extension such as .pdf, the query below works.
declare @str varchar(100) = 'Abc_def_ghi_contact.pdf'
select
SUBSTRING(
right(@str, charindex('_', reverse(@str) + '_') - 1)
,1
,CASE WHEN CHARINDEX('.',right(@str, charindex('_', reverse(@str) + '_') - 1)) >1
THEN CHARINDEX('.',right(@str, charindex('_', reverse(@str) + '_') - 1))-1
ELSE LEN(right(@str, charindex('_', reverse(@str) + '_') - 1))
END
)
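The same logic can be applied to a whole column rather than a single variable. The following is a minimal sketch under that assumption: the last segment is computed once in a CROSS APPLY (the alias seg.last_part is illustrative), and a '.' is appended before searching so that values without an extension are returned whole.

```sql
-- Sketch: last '_'-separated segment, with everything from the first '.' removed
SELECT v.s,
       LEFT(seg.last_part, CHARINDEX('.', seg.last_part + '.') - 1) AS name_only
FROM (VALUES ('Abc_def_ghi_contact.pdf'), ('Asdd_dk_hk_can.pdf')) AS v(s)
CROSS APPLY (VALUES (RIGHT(v.s, CHARINDEX('_', REVERSE(v.s) + '_') - 1))) AS seg(last_part);
-- name_only: 'contact' and 'can'
```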

Extract string between after second / and before -

I have a field that holds an account code. I've managed to extract the first 2 parts OK but I'm struggling with the last 2.
The field data is as follows:
812330/50110/0-0
812330/50110/BDG001-0
812330/50110/0-X001
I need to get the string between the second "/" and the "-", and the string after the "-". Both fields have variable lengths, so I would be looking to output 0 and 0 for the first record, BDG001 and 0 for the second record, and 0 and X001 for the third record.
Any help much appreciated, thanks.
You can use CHARINDEX and LEFT/RIGHT:
CREATE TABLE #tab(col VARCHAR(1000));
INSERT INTO #tab VALUES ('812330/50110/0-0'),('812330/50110/BDG001-0'),
('812330/50110/0-X001');
WITH cte AS
(
SELECT
col,
r = RIGHT(col, CHARINDEX('/', REVERSE(col))-1)
FROM #tab
)
SELECT col,
r,
sub1 = LEFT(r, CHARINDEX('-', r)-1),
sub2 = RIGHT(r, LEN(r) - CHARINDEX('-', r))
FROM cte;
LiveDemo
EDIT:
or even simpler:
SELECT
col
,sub1 = SUBSTRING(col,
LEN(col) - CHARINDEX('/', REVERSE(col)) + 2,
CHARINDEX('/', REVERSE(col)) -CHARINDEX('-', REVERSE(col))-1)
,sub2 = RIGHT(col, CHARINDEX('-', REVERSE(col))-1)
FROM #tab;
LiveDemo2
EDIT 2:
Using PARSENAME SQL SERVER 2012+ (if your data does not contain .):
SELECT
col,
sub1 = PARSENAME(REPLACE(REPLACE(col, '/', '.'), '-', '.'), 2),
sub2 = PARSENAME(REPLACE(REPLACE(col, '/', '.'), '-', '.'), 1)
FROM #tab;
LiveDemo3
...Or you can do this, so you only go from left side to right, so you don't need to count from the end in case you have more '/' or '-' signs:
SELECT
SUBSTRING(columnName, CHARINDEX('/' , columnName, CHARINDEX('/' , columnName) + 1) + 1,
CHARINDEX('-', columnName) - CHARINDEX('/' , columnName, CHARINDEX('/' , columnName) + 1) - 1) AS FirstPart,
SUBSTRING(columnName, CHARINDEX('-' , columnName) + 1, LEN(columnName)) AS LastPart
FROM table_name
One method is to download a split() function off the web and use it. However, the values end up in separate rows, not separate columns. An alternative is a series of nested subqueries, CTEs, or outer applies:
select t.*, p1.part1, p12.part2, p12.part3
from table_name t outer apply
(select left(t.field, charindex('/', t.field) - 1) as part1,
substring(t.field, charindex('/', t.field) + 1, len(t.field)) as rest1
) p1 outer apply
(select left(p1.rest1, charindex('/', p1.rest1) - 1) as part2,
substring(p1.rest1, charindex('/', p1.rest1) + 1, len(p1.rest1)) as part3
) p12
where t.field like '%/%/%';
The where clause guarantees that the field value is in the right format. Otherwise, you need to start sprinkling the code with case expressions to handle malformed data.
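One way to tolerate malformed values without a CASE expression is to wrap CHARINDEX in NULLIF, so that a missing delimiter produces NULL rather than a zero position. This is a sketch of that idea only (it is not part of the answer above); the second sample row is a made-up value with no slash.

```sql
-- Sketch: a missing '/' yields NULL instead of an invalid length
SELECT v.field,
       LEFT(v.field, NULLIF(CHARINDEX('/', v.field), 0) - 1) AS part1
FROM (VALUES ('812330/50110/0-0'), ('no-slash-here')) AS v(field);
-- part1: '812330' for the first row, NULL for the second
```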

How to get expression from string

Here is the string: '(a+b)+(x/y)*1000'
From that string I want to get '(x/y)', meaning the part that contains the division, so that I can later check that the denominator <> 0 and avoid division by zero.
The formula can vary, but divisions are always between parentheses.
How can I achieve that in SQL?
Bits that it appears you already have (based on a comment you made)...
Pos of the '/' = CHARINDEX('/', yourString)
Pos of the ')' = CHARINDEX(')', yourString, CHARINDEX('/', yourString) + 1)
The position of the ( is a little different, as you need to search backwards. So you need to reverse the string. And so you also need to change the starting position.
CHARINDEX('(', REVERSE(yourString), LEN(yourString) - CHARINDEX('/', yourString) + 2)
Which gives the position from the right-hand side. LEN(yourString) - position + 1 gives the position from the left-hand side.
Add that all together and you get a very long formula. The length is the position of the ')' minus the position of the '(' plus 1, so that both parentheses are included:
SUBSTRING(
yourString,
LEN(yourString)
- CHARINDEX('(', REVERSE(yourString), LEN(yourString) - CHARINDEX('/', yourString) + 2)
+ 1,
CHARINDEX(')', yourString, CHARINDEX('/', yourString) + 1)
- LEN(yourString)
+ CHARINDEX('(', REVERSE(yourString), LEN(yourString) - CHARINDEX('/', yourString) + 2)
)
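The individual positions are easier to check when computed separately. As a worked sketch for the sample string from the question (the variable @s and the column aliases are illustrative):

```sql
-- Sketch: the three positions for '(a+b)+(x/y)*1000'
DECLARE @s varchar(50) = '(a+b)+(x/y)*1000';

SELECT CHARINDEX('/', @s)                                AS slash_pos,  -- 9
       CHARINDEX(')', @s, CHARINDEX('/', @s) + 1)        AS close_pos,  -- 11
       LEN(@s)
         - CHARINDEX('(', REVERSE(@s), LEN(@s) - CHARINDEX('/', @s) + 2)
         + 1                                             AS open_pos;   -- 7
```

Taking close_pos - open_pos + 1 = 5 characters starting at open_pos returns '(x/y)', parentheses included.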
Remove everything up to the second ( using stuff and get the characters to the next ) using left.
declare @S varchar(20)
set @S = '(1+2)+(3/4)*1000'
select left(S2.S, charindex(')', S2.S)-1)
from (select stuff(@S, 1, charindex('(', @S), '')) as S1(S)
cross apply (select stuff(S1.S, 1, charindex('(', S1.S), '')) as S2(S)

SQL - Selecting portion of a string

If I have a simple table where the data is such that the rows contains strings like:
/abc/123/gyh/tgf/345/6yh/5er
In SQL, how can I select out the data between the 5th and 6th slash? Every row I have is simply data inside front-slashes, and I will only want to select all of the characters between slash 5 and 6.
CLR functions are more efficient in handling strings than T-SQL. Here is some info to get you started on writing a CLR user defined function.
http://msdn.microsoft.com/en-us/library/ms189876.aspx
http://www.mssqltips.com/tip.asp?tip=1344
I think you should create the function that has 3 parameters:
the value you are searching
the delimiter (in your case: /)
The instance you are looking for (in your case: 5)
Then you split on the delimiter (into an array). Then return the 5th item in the array (index 4)
Here is a t-sql solution, but I really believe that a CLR solution would be better.
DECLARE @RRR varchar(500)
SELECT @RRR = '/abc/123/gyh/tgf/345/6yh/5er'
DECLARE
@index INT,
@INSTANCES INT
SELECT
@index = 1,
@INSTANCES = 5
WHILE (@INSTANCES > 1) BEGIN
SELECT @index = CHARINDEX('/', @RRR, @index + 1)
SET @INSTANCES = @INSTANCES - 1
END
SELECT SUBSTRING(@RRR, @index + 1, CHARINDEX('/', @RRR, @index + 1) - @index - 1)
SELECT SUBSTRING(myfield,
/* 5-th slash */
CHARINDEX('/', myfield,
CHARINDEX('/', myfield,
CHARINDEX('/', myfield,
CHARINDEX('/', myfield,
CHARINDEX('/', myfield) + 1) + 1) + 1) + 1)
+ 1,
/* 6-th slash */
CHARINDEX('/', myfield,
CHARINDEX('/', myfield,
CHARINDEX('/', myfield,
CHARINDEX('/', myfield,
CHARINDEX('/', myfield,
CHARINDEX('/', myfield) + 1) + 1) + 1) + 1) + 1)
-
/* 5-th slash again */
CHARINDEX('/', myfield,
CHARINDEX('/', myfield,
CHARINDEX('/', myfield,
CHARINDEX('/', myfield,
CHARINDEX('/', myfield) + 1) + 1) + 1) + 1)
- 1)
FROM myTable
WHERE ...
This will work, but it's far from elegant. If possible, select the complete field and filter out the required value on the client side (using a more powerful programming language than T-SQL). As you can see, T-SQL was not designed to do this kind of stuff.
(Edit: I know the following does not apply to your situation, but I'll keep it as a word of advice for others who read this:)
In fact, relational databases are not designed to work with string-separated lists of values at all, so an even better solution would be to split that field into separate fields in your table (or into a subtable, if the number of entries varies).
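If the value does have to be picked apart in T-SQL, one shorter option is to turn the string into a JSON array and read an element by index. This is a sketch that assumes SQL Server 2016 or later (for JSON_VALUE) and data containing no double quotes or backslashes; the variable @path is illustrative. Because the string starts with a slash, element 0 is empty and the text between the 5th and 6th slash is element 5.

```sql
-- Sketch: read the segment between the 5th and 6th slash via a JSON array
-- Assumes the value contains no double quotes or backslashes.
DECLARE @path varchar(500) = '/abc/123/gyh/tgf/345/6yh/5er';

-- Builds '["","abc","123","gyh","tgf","345","6yh","5er"]'
SELECT JSON_VALUE('["' + REPLACE(@path, '/', '","') + '"]', '$[5]') AS segment;  -- '345'
```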
Maybe... SELECT * FROM `table` WHERE `field` LIKE '%/345/%'