T-SQL substring between two slashes to extract data - sql

I am trying to extract part of a string in T-SQL for a project I am working on.
Examples:
/Clients/AAA/Something/Something
/Clients/BBBB/Something/Something
I am specifically trying to extract the AAA or the BBBB, which do not have a consistent number of characters.

Try the following using CHARINDEX and SUBSTRING.
if object_id('tempdb..#a') is not null drop table #a
create table #a (d varchar(100))
insert into #a (d)
values ('/Clients/AAA/Something/Something/')
,('/Clients/bbbbb/Something/Something/')
select d as [OriginalData]
,charindex('/', d, charindex('/', d, 0)+1) as [SecondSlash]
,charindex('/', d, charindex('/', d, charindex('/', d, 0)+1)+1) as [ThirdSlash]
,SUBSTRING(d -- Value
, charindex('/', d, charindex('/', d, 0)+1)+1 -- Start: SecondSlash + 1
, charindex('/', d, charindex('/', d, charindex('/', d, 0)+1)+1) - charindex('/', d, charindex('/', d, 0)+1)-1 -- Length: ThirdSlash - SecondSlash - 1
) as [Extract]
from #a
It's a bit messy and will only return the text between the second and third slash, but it should be fairly quick.

I find that apply is convenient for expressing this type of logic:
with t as (
select *
from (values ('/Clients/AAA/Something/Something/'), ('/Clients/bbbbb/Something/Something/')) t(str)
)
select *, left(str2, charindex('/', str2) - 1)
from t cross apply
(values (stuff(str, 1, patindex('%_/%', str) + 1, ''))) v(str2);
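Note that this looks for the pattern _/ (any single character followed by a slash) to find the second slash in the string; the leading slash has no character before it, so it is skipped.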
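To see what that pattern matches, the intermediate values can be selected on their own. This is only a minimal sketch: it assumes, as in the examples, that the string starts with a single slash and contains no doubled slashes, and the aliases match_pos and remainder are made up for illustration.

```sql
-- match_pos is the position of the character just before the second slash;
-- STUFF then removes everything up to and including that slash.
select s.str,
       patindex('%_/%', s.str) as match_pos,
       stuff(s.str, 1, patindex('%_/%', s.str) + 1, '') as remainder
from (values ('/Clients/AAA/Something/Something/'),
             ('/Clients/bbbbb/Something/Something/')) s(str);
```

For both sample rows match_pos points at the final "s" of "Clients", so the remainder begins with the value being extracted.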

select Data
-- Find second slash
, charindex('/', Data, 2)
-- Find third slash
, charindex('/', Data, charindex('/', Data, 2)+1)
-- Find string between the second and third slash
, substring(data, charindex('/', Data, 2) + 1, charindex('/', Data, charindex('/', Data, 2)+1) - charindex('/', Data, 2) - 1)
from (
select '/Clients/AAA/Something/Something' Data
union all select '/Clients/BBBB/Something/Something'
) x

Related

Substring a URL to the 5th slash

Could you help me sub-string a list of different URLs?
I only understand how to take a substring of part of a URL, and could not get the requested result: I need each URL truncated at the 5th slash, but some of the URLs don't have a 5th slash.
Example URLs:
'http://db-hit-internet/bags/personnel/default.axxxx'
'http://db-hit-internet/store/books/preview/default.axxxx'
'http://db-git-internet/friends/default.aspx?lang=LTT'
Expected output:
'http://db-hit-internet/bags/personnel'
'http://db-hit-internet/store/books/preview'
'http://db-git-internet/friends/default.aspx?lang=LTT'
I have query:
SELECT ('CS' + cast([id] as char (4))) AS name, [SysName], [Link], COUNT(*) AS Viewed
FROM main AS A
INNER JOIN
(
SELECT [LogDate], [LogPage] COLLATE Latin1_General_CI_AS PageName
FROM web
UNION
SELECT [LogDate], [LogPage] PageName
FROM web2
) AS B
ON A.Link= PageName
WHERE A.[Link] is not null
GROUP BY A.id, A.[SysName], A.[Link]
And I need the web and web2 union to have the URLs sub-stringed to the 5th slash. The problem is that there should be a CASE statement to check if the 5th slash exists and then according to that SUBSTRING and CHARINDEX should be included somewhere.
I tried:
LEFT([LogPage], CHARINDEX('/', [LogPage], CHARINDEX('/', [LogPage], CHARINDEX('/', [LogPage], CHARINDEX('//', [LogPage])+2)+1)+1))
But it only works with the URLs that have the 5th slash.
You can also use CROSS APPLY.
SELECT URL,SUBSTRING(URL,1,CASE WHEN (TS.LOC!=0 AND FRS.LOC!=0 AND FVS.LOC!=0) THEN FVS.LOC
ELSE LEN(URL) END) SUBURL
FROM TEST
CROSS APPLY (VALUES(CHARINDEX('/',URL))) FS(LOC)
CROSS APPLY (VALUES(CHARINDEX('/',URL,FS.LOC+1))) SS(LOC)
CROSS APPLY (VALUES(CHARINDEX('/',URL,SS.LOC+1))) TS(LOC)
CROSS APPLY (VALUES(CHARINDEX('/',URL,TS.LOC+1))) FRS(LOC)
CROSS APPLY (VALUES(CHARINDEX('/',URL,FRS.LOC+1))) FVS(LOC)
Note: Please change the CASE conditions as needed (assumption: the first two slashes are always present). This returns the substring up to and including the 5th slash.
One option is a JSON-based approach, which transforms the data into a valid JSON and parses this JSON with OPENJSON():
Table:
CREATE TABLE Data (url varchar(100))
INSERT INTO Data (url)
VALUES
('http://db-hit-internet/bags/personnel/default.axxxx'),
('http://db-hit-internet/store/books/preview/default.axxxx'),
('http://db-git-internet/friends/default.aspx?lang=LTT'),
('http://db-git-internet.net')
Statement:
SELECT CONCAT(j.part1, j.part2, j.part3, j.part4, j.part5) AS url
FROM Data d
CROSS APPLY OPENJSON(CONCAT('[["', REPLACE(STRING_ESCAPE(d.url, 'json'), '/', '/","'), '"]]')) WITH (
part1 varchar(100) '$[0]',
part2 varchar(100) '$[1]',
part3 varchar(100) '$[2]',
part4 varchar(100) '$[3]',
part5 varchar(100) '$[4]'
) j
Result:
url                                                       url
http://db-hit-internet/bags/personnel/default.axxxx       http://db-hit-internet/bags/personnel/
http://db-hit-internet/store/books/preview/default.axxxx  http://db-hit-internet/store/books/
http://db-git-internet/friends/default.aspx?lang=LTT      http://db-git-internet/friends/default.aspx?lang=LTT
http://db-git-internet.net                                http://db-git-internet.net
If you want a string-based approach, the following statement is a possible solution:
SELECT d.url, LEFT(d.url, v5.pos)
FROM Data d
CROSS APPLY (SELECT CASE WHEN CHARINDEX('/', d.url, 1) = 0 THEN LEN(d.url) ELSE CHARINDEX('/', d.url, 1) END) v1 (pos)
CROSS APPLY (SELECT CASE WHEN CHARINDEX('/', d.url, v1.pos + 1) = 0 THEN LEN(d.url) ELSE CHARINDEX('/', d.url, v1.pos + 1) END) v2 (pos)
CROSS APPLY (SELECT CASE WHEN CHARINDEX('/', d.url, v2.pos + 1) = 0 THEN LEN(d.url) ELSE CHARINDEX('/', d.url, v2.pos + 1) END) v3 (pos)
CROSS APPLY (SELECT CASE WHEN CHARINDEX('/', d.url, v3.pos + 1) = 0 THEN LEN(d.url) ELSE CHARINDEX('/', d.url, v3.pos + 1) END) v4 (pos)
CROSS APPLY (SELECT CASE WHEN CHARINDEX('/', d.url, v4.pos + 1) = 0 THEN LEN(d.url) ELSE CHARINDEX('/', d.url, v4.pos + 1) END) v5 (pos)
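A possible variation on the string-based approach avoids the CASE expressions by padding the URL with five extra slashes, so that a 5th slash is always found. This is a sketch rather than a tested drop-in: the aliases padded, s1 to s5 and url_to_5th_slash are invented here, and it assumes the URLs are non-NULL varchar values without trailing spaces.

```sql
-- Pad with '/////' so CHARINDEX never returns 0; when the real URL has fewer
-- than five slashes, the 5th one falls in the padding and LEFT returns the
-- whole URL.
SELECT d.url,
       LEFT(d.url, s5.pos - 1) AS url_to_5th_slash
FROM (VALUES ('http://db-hit-internet/bags/personnel/default.axxxx'),
             ('http://db-git-internet/friends/default.aspx?lang=LTT'),
             ('http://db-git-internet.net')) d(url)
CROSS APPLY (VALUES (d.url + '/////')) p(padded)
CROSS APPLY (VALUES (CHARINDEX('/', p.padded))) s1(pos)
CROSS APPLY (VALUES (CHARINDEX('/', p.padded, s1.pos + 1))) s2(pos)
CROSS APPLY (VALUES (CHARINDEX('/', p.padded, s2.pos + 1))) s3(pos)
CROSS APPLY (VALUES (CHARINDEX('/', p.padded, s3.pos + 1))) s4(pos)
CROSS APPLY (VALUES (CHARINDEX('/', p.padded, s4.pos + 1))) s5(pos);
```

Using s5.pos - 1 drops the 5th slash itself, which matches the expected output in the question; use s5.pos instead to keep it.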

How to sub-string in SQL with 4 same consecutive characters

I want to truncate strings such as 11.1.2.3.4.5 or 10.1.2.4.5 at the 4th dot, so that they become 11.1.2.3 and 10.1.2.4 respectively.
Can someone help to achieve this in SQL?
You could use a recursive CTE as follows:
CREATE TABLE Strings( S VARCHAR(25) );
INSERT Strings VALUES
('1.2.3.4.5.6'),
('11.2.12.5.66'),
('y.888.p.666.2.00');
WITH CTE AS
(
SELECT 1 N, CHARINDEX('.', S) Pos, S
FROM Strings
UNION ALL
SELECT N + 1, CHARINDEX('.', S, Pos + 1), S
FROM CTE
WHERE Pos > 0
)
SELECT S, SUBSTRING(S, 1, Pos - 1) --or use LEFT()
FROM CTE
WHERE N = 4;
Or use nested CHARINDEX() calls:
SELECT S, LEFT(S, CI-1)
FROM Strings
CROSS APPLY
(
VALUES
(CHARINDEX('.', S, CHARINDEX('.', S, CHARINDEX('.', S, CHARINDEX('.', S)+1)+1)+1))
) T(CI)

Extract string between after second / and before -

I have a field that holds an account code. I've managed to extract the first 2 parts OK but I'm struggling with the last 2.
The field data is as follows:
812330/50110/0-0
812330/50110/BDG001-0
812330/50110/0-X001
I need to get the string between the second "/" and the "-", and the string after the "-". Both fields have variable lengths, so I would be looking to output 0 and 0 for the first record, BDG001 and 0 for the second record, and 0 and X001 for the third record.
Any help much appreciated, thanks.
You can use CHARINDEX and LEFT/RIGHT:
CREATE TABLE #tab(col VARCHAR(1000));
INSERT INTO #tab VALUES ('812330/50110/0-0'),('812330/50110/BDG001-0'),
('812330/50110/0-X001');
WITH cte AS
(
SELECT
col,
r = RIGHT(col, CHARINDEX('/', REVERSE(col))-1)
FROM #tab
)
SELECT col,
r,
sub1 = LEFT(r, CHARINDEX('-', r)-1),
sub2 = RIGHT(r, LEN(r) - CHARINDEX('-', r))
FROM cte;
EDIT:
or even simpler:
SELECT
col
,sub1 = SUBSTRING(col,
LEN(col) - CHARINDEX('/', REVERSE(col)) + 2,
CHARINDEX('/', REVERSE(col)) -CHARINDEX('-', REVERSE(col))-1)
,sub2 = RIGHT(col, CHARINDEX('-', REVERSE(col))-1)
FROM #tab;
EDIT 2:
Using PARSENAME (only if your data does not contain '.'; PARSENAME handles at most four parts):
SELECT
col,
sub1 = PARSENAME(REPLACE(REPLACE(col, '/', '.'), '-', '.'), 2),
sub2 = PARSENAME(REPLACE(REPLACE(col, '/', '.'), '-', '.'), 1)
FROM #tab;
Or you can work strictly from left to right, so you don't need to count from the end in case there are more '/' or '-' characters:
SELECT
SUBSTRING(columnName, CHARINDEX('/' , columnName, CHARINDEX('/' , columnName) + 1) + 1,
CHARINDEX('-', columnName) - CHARINDEX('/' , columnName, CHARINDEX('/' , columnName) + 1) - 1) AS FirstPart,
SUBSTRING(columnName, CHARINDEX('-' , columnName) + 1, LEN(columnName)) AS LastPart
FROM table_name
One method is to download a split() function off the web and use it. However, the values end up in separate rows, not separate columns. An alternative is a series of nested subqueries, CTEs, or outer applies:
select t.*, p1.part1, p12.part2, p12.part3
from table_name t outer apply
-- part1 and part2 keep their trailing '/'; subtract 1 from the length to drop it
(select left(t.field, charindex('/', t.field)) as part1,
substring(t.field, charindex('/', t.field) + 1, len(t.field)) as rest1
) p1 outer apply
(select left(p1.rest1, charindex('/', p1.rest1)) as part2,
substring(p1.rest1, charindex('/', p1.rest1) + 1, len(p1.rest1)) as part3
) p12
where t.field like '%/%/%';
The where clause guarantees that the field value is in the right format. Otherwise, you need to start sprinkling the code with case statements to handle malformed data.
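One way to handle malformed values without writing a CASE expression around every call is to wrap each CHARINDEX in NULLIF(..., 0). The following is a minimal sketch using the account codes from the question; the aliases s1, s2 and dash and the two malformed sample rows are made up for illustration.

```sql
-- NULLIF turns "not found" (0) into NULL. A NULL position makes the
-- SUBSTRING arguments NULL, so a malformed row returns NULL instead of
-- raising "Invalid length parameter passed to the LEFT or SUBSTRING function".
SELECT v.col,
       SUBSTRING(v.col, s2.pos + 1, dash.pos - s2.pos - 1) AS sub1,
       SUBSTRING(v.col, dash.pos + 1, LEN(v.col))          AS sub2
FROM (VALUES ('812330/50110/BDG001-0'),
             ('812330/50110'),
             ('no delimiters')) v(col)
CROSS APPLY (VALUES (NULLIF(CHARINDEX('/', v.col), 0))) s1(pos)
CROSS APPLY (VALUES (NULLIF(CHARINDEX('/', v.col, s1.pos + 1), 0))) s2(pos)
CROSS APPLY (VALUES (NULLIF(CHARINDEX('-', v.col, s2.pos + 1), 0))) dash(pos);
```

The well-formed row yields BDG001 and 0, while the two malformed rows yield NULL in both columns, so they can be filtered or reported afterwards rather than aborting the query.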

Ternary operator in SQL? "invalid length parameter passed to the LEFT or SUBSTRING function"

Sorry for the misleading subject; I didn't know how to word it better.
Because I'm mainly a software developer, the ternary operator comes to mind for the following problem.
I need to find the most robust way to link two tables via a nullable foreign key (modModel and tabSparePart). The only similarity between the two is the model's name and the spare part's description (tabSparePart is an external table from the customer that is imported automatically, so it's not my responsibility and I cannot change the data).
Consider the following sparepart-names:
W200I_E/Swap
EXCHANGEUNIT P1i / SILVERBLACK/ CYRILLIC
The modelnames that i want to find are P1i and W200I_E.
So there is only one strong rule that i can ensure in the where-clause:
there must be a separator / and the relevant part is the first one.
Here is the sample data:
Create table #temp(Partname varchar(100))
INSERT INTO #temp
SELECT 'EXCHANGEUNIT P1i / SILVERBLACK/ CYRILLIC' UNION ALL SELECT 'W200I_E/Swap unit/Black'
I would have been finished with the following query:
SELECT RTRIM(LEFT(Partname, CHARINDEX('/', Partname) - 1)) AS UNIT
FROM #temp
WHERE CHARINDEX('/', Partname) > 0
... which returns:
EXCHANGEUNIT P1i
W200I_E
But I need P1i, so I also need to handle the case where the first part contains whitespace-separated words. In that case I need to select the last word, but only if the part is separated at all.
I'm getting an "invalid length parameter passed to the LEFT or SUBSTRING function" error with the following query:
SELECT REVERSE( LEFT( REVERSE(RTRIM(LEFT(Partname, CHARINDEX('/', Partname) - 1)))
, CHARINDEX(' ', REVERSE(RTRIM(LEFT(Partname, CHARINDEX('/', Partname) - 1))))-1 ))
AS Unit
FROM #temp
WHERE CHARINDEX('/', Partname) > 0
This would work without the second record, which has no whitespace. If I also required the first part to contain whitespace, I would discard valid records.
To cut a long story short, I need to find a way to combine both ways according to the existence of separators.
PS: This has arisen from: Get the last word of a part of a varchar (LEFT/RIGHT)
If anybody is interested, this is the complete (working) stored procedure. I'm sure I've never used such a strange JOIN:
CREATE PROC [dbo].[UpdateModelSparePart](@updateCount int output)
with execute as Owner
AS
BEGIN
BEGIN TRANSACTION
UPDATE modModel SET fiSparePart=ModelPart.idSparePart
FROM modModel INNER JOIN
(
SELECT m.idModel
,m.ModelName
,sp.idSparePart
,sp.Price
,Row_Number()Over(Partition By idModel ORDER BY Price DESC)as ModelPrice
FROM modModel AS m INNER JOIN tabSparePart AS sp
ON m.ModelName = CASE
WHEN CHARINDEX(' ', REVERSE(RTRIM(LEFT(sp.SparePartDescription, CHARINDEX('/', sp.SparePartDescription) - 1)))) > 0 THEN
REVERSE( LEFT( REVERSE(RTRIM(LEFT(sp.SparePartDescription, CHARINDEX('/', sp.SparePartDescription) - 1)))
,CHARINDEX(' ', REVERSE(RTRIM(LEFT(sp.SparePartDescription, CHARINDEX('/', sp.SparePartDescription) - 1))))-1 ))
ELSE
RTRIM(LEFT(sp.SparePartDescription, CHARINDEX('/', sp.SparePartDescription) - 1))
END
WHERE (CHARINDEX('/', sp.SparePartDescription) > 0)
GROUP BY idModel,ModelName,idSparePart,Price
)As ModelPart
ON ModelPart.idModel=modModel.idModel
Where ModelPrice=1
SET @updateCount = @@ROWCOUNT;
COMMIT TRANSACTION
END
A more concise version.
SELECT REVERSE(SUBSTRING(Rev, 0, CHARINDEX(' ', Rev))) AS Unit
FROM #temp
CROSS APPLY (
SELECT REVERSE(RTRIM(LEFT(Partname, CHARINDEX('/', Partname) - 1))) + ' '
) T(Rev)
WHERE CHARINDEX('/', Partname) > 0
I was able to solve the problem:
SELECT 'Unit' =
CASE
WHEN CHARINDEX(' ', REVERSE(RTRIM(LEFT(Partname, CHARINDEX('/', Partname) - 1)))) > 0 THEN
REVERSE( LEFT( REVERSE(RTRIM(LEFT(Partname, CHARINDEX('/', Partname) - 1)))
,CHARINDEX(' ', REVERSE(RTRIM(LEFT(Partname, CHARINDEX('/', Partname) - 1))))-1 ))
ELSE
RTRIM(LEFT(Partname, CHARINDEX('/', Partname) - 1))
END
FROM #temp
WHERE CHARINDEX('/', Partname) > 0
Ugly but working fine.

SQL - Selecting portion of a string

If I have a simple table where the data is such that the rows contains strings like:
/abc/123/gyh/tgf/345/6yh/5er
In SQL, how can I select out the data between the 5th and 6th slash? Every row I have is simply data inside front-slashes, and I will only want to select all of the characters between slash 5 and 6.
CLR functions are more efficient in handling strings than T-SQL. Here is some info to get you started on writing a CLR user defined function.
http://msdn.microsoft.com/en-us/library/ms189876.aspx
http://www.mssqltips.com/tip.asp?tip=1344
I think you should create the function that has 3 parameters:
the value you are searching
the delimiter (in your case: /)
The instance you are looking for (in your case: 5)
Then you split on the delimiter (into an array). Then return the 5th item in the array (index 4)
Here is a t-sql solution, but I really believe that a CLR solution would be better.
DECLARE @RRR varchar(500)
SELECT @RRR = '/abc/123/gyh/tgf/345/6yh/5er'
DECLARE
@index INT,
@INSTANCES INT
SELECT
@index = 1,
@INSTANCES = 5
WHILE (@INSTANCES > 1) BEGIN
SELECT @index = CHARINDEX('/', @RRR, @index + 1)
SET @INSTANCES = @INSTANCES - 1
END
SELECT SUBSTRING(@RRR, @index + 1, CHARINDEX('/', @RRR, @index + 1) - @index - 1)
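The same loop could be wrapped in a scalar function taking the three parameters described above (value, delimiter, instance). This is a hedged T-SQL sketch, not the CLR function the answer recommends; the name dbo.GetDelimitedPart and its parameter names are hypothetical, and it assumes SQL Server 2008 or later for the inline DECLARE initialisers and the += operator.

```sql
CREATE FUNCTION dbo.GetDelimitedPart
(
    @value     varchar(8000),  -- the string to search
    @delimiter char(1),        -- the delimiter, e.g. '/'
    @instance  int             -- return the text after this many delimiters
)
RETURNS varchar(8000)
AS
BEGIN
    DECLARE @start int = 0, @i int = 0, @next int;

    -- Walk forward to the @instance-th delimiter.
    WHILE @i < @instance
    BEGIN
        SET @start = CHARINDEX(@delimiter, @value, @start + 1);
        IF @start = 0 RETURN NULL;  -- fewer delimiters than requested
        SET @i += 1;
    END;

    -- The part ends at the next delimiter, or at the end of the string.
    SET @next = CHARINDEX(@delimiter, @value, @start + 1);
    IF @next = 0 SET @next = LEN(@value) + 1;

    RETURN SUBSTRING(@value, @start + 1, @next - @start - 1);
END;
```

Once created in its own batch, it would be called as SELECT dbo.GetDelimitedPart('/abc/123/gyh/tgf/345/6yh/5er', '/', 5), which returns the text between the 5th and 6th slash. Scalar functions like this can be slow when applied to many rows, which is part of why the answer prefers CLR.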
SELECT SUBSTRING(myfield,
/* 5-th slash */
CHARINDEX('/', myfield,
CHARINDEX('/', myfield,
CHARINDEX('/', myfield,
CHARINDEX('/', myfield,
CHARINDEX('/', myfield) + 1) + 1) + 1) + 1)
+ 1,
/* 6-th slash */
CHARINDEX('/', myfield,
CHARINDEX('/', myfield,
CHARINDEX('/', myfield,
CHARINDEX('/', myfield,
CHARINDEX('/', myfield,
CHARINDEX('/', myfield) + 1) + 1) + 1) + 1) + 1)
-
/* 5-th slash again */
CHARINDEX('/', myfield,
CHARINDEX('/', myfield,
CHARINDEX('/', myfield,
CHARINDEX('/', myfield,
CHARINDEX('/', myfield) + 1) + 1) + 1) + 1)
- 1)
FROM myTable
WHERE ...
This will work, but it's far from elegant. If possible, select the complete field and filter out the required value on the client side (using a more powerful programming language than T-SQL). As you can see, T-SQL was not designed to do this kind of stuff.
(Edit: I know the following does not apply to your situation but I'll keep it as a word of advice for others who read this:)
In fact, relational databases are not designed to work with string-separated lists of values at all, so an even better solution would be to split that field into separate fields in your table (or into a subtable, if the number of entries varies).
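As a rough illustration of the subtable idea, the segments could be stored one per row. The table and column names below (dbo.StoredPath, dbo.StoredPathSegment, SegmentNo) are hypothetical and are not taken from the question.

```sql
-- Hypothetical schema: one row per path, one row per segment of that path.
CREATE TABLE dbo.StoredPath
(
    PathId int NOT NULL PRIMARY KEY
);

CREATE TABLE dbo.StoredPathSegment
(
    PathId    int          NOT NULL REFERENCES dbo.StoredPath (PathId),
    SegmentNo int          NOT NULL,  -- 1 = first segment after the leading slash
    Segment   varchar(100) NOT NULL,
    PRIMARY KEY (PathId, SegmentNo)
);

-- '/abc/123/gyh/tgf/345/6yh/5er' stored as seven rows
INSERT INTO dbo.StoredPath (PathId) VALUES (1);
INSERT INTO dbo.StoredPathSegment (PathId, SegmentNo, Segment)
VALUES (1, 1, 'abc'), (1, 2, '123'), (1, 3, 'gyh'), (1, 4, 'tgf'),
       (1, 5, '345'), (1, 6, '6yh'), (1, 7, '5er');

-- The value between the 5th and 6th slash becomes an ordinary lookup.
SELECT Segment
FROM dbo.StoredPathSegment
WHERE SegmentNo = 5;
```

With this layout no string functions are needed at query time, and SegmentNo can be indexed like any other column.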
Maybe... SELECT * FROM `table` WHERE `field` LIKE '%/345/%'