Substring a URL to the 5th slash - sql

Could you help me sub-string a list of different URLs?
I can only work out how to extract a part of each URL, not what was requested: I need the full URL up to the 5th slash, but some of the URLs don't have a 5th slash.
Example URLs:
'http://db-hit-internet/bags/personnel/default.axxxx'
'http://db-hit-internet/store/books/preview/default.axxxx'
'http://db-git-internet/friends/default.aspx?lang=LTT'
Expected output:
'http://db-hit-internet/bags/personnel'
'http://db-hit-internet/store/books/preview'
'http://db-git-internet/friends/default.aspx?lang=LTT'
I have this query:
SELECT ('CS' + cast([id] as char (4))) AS name, [SysName], [Link], COUNT(*) AS Viewed
FROM main AS A
INNER JOIN
(
SELECT [LogDate], [LogPage] COLLATE Latin1_General_CI_AS PageName
FROM web
UNION
SELECT [LogDate], [LogPage] PageName
FROM web2
) AS W
ON A.Link = W.PageName
WHERE A.[Link] is not null
GROUP BY A.id, A.[SysName], A.[Link]
And I need the web and web2 union to have the URLs sub-stringed to the 5th slash. The problem is that there should be a CASE statement to check if the 5th slash exists and then according to that SUBSTRING and CHARINDEX should be included somewhere.
I tried:
LEFT([LogPage], CHARINDEX('/', [LogPage], CHARINDEX('/', [LogPage], CHARINDEX('/', [LogPage], CHARINDEX('//', [LogPage])+2)+1)+1))
But it only works with the URLs that have the 5th slash.

You can also use CROSS APPLY.
SELECT URL,SUBSTRING(URL,1,CASE WHEN (TS.LOC!=0 AND FRS.LOC!=0 AND FVS.LOC!=0) THEN FVS.LOC
ELSE LEN(URL) END) SUBURL
FROM TEST
CROSS APPLY (VALUES(CHARINDEX('/',URL))) FS(LOC)
CROSS APPLY (VALUES(CHARINDEX('/',URL,FS.LOC+1))) SS(LOC)
CROSS APPLY (VALUES(CHARINDEX('/',URL,SS.LOC+1))) TS(LOC)
CROSS APPLY (VALUES(CHARINDEX('/',URL,TS.LOC+1))) FRS(LOC)
CROSS APPLY (VALUES(CHARINDEX('/',URL,FRS.LOC+1))) FVS(LOC)
Note: Adjust the CASE conditions as needed (this assumes the first two slashes are always present). The query returns the substring up to and including the 5th slash.
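A variation on the CROSS APPLY idea above, offered as a sketch rather than a drop-in fix: searching a copy of the URL padded with five sentinel slashes means CHARINDEX never returns 0, so no CASE is needed. The aliases (u, s, p1 to p5) and the inline VALUES list are made up for illustration; the sample URLs are the ones from the question. Unlike the query above, this cuts just before the 5th slash, which is the form the expected output uses.

```sql
-- Sketch only: aliases are hypothetical, sample URLs come from the question.
SELECT u.url,
       LEFT(u.url, p5.pos - 1) AS url_to_5th_slash   -- LEFT returns the whole string if the length exceeds it
FROM (VALUES
        ('http://db-hit-internet/bags/personnel/default.axxxx'),
        ('http://db-hit-internet/store/books/preview/default.axxxx'),
        ('http://db-git-internet/friends/default.aspx?lang=LTT')
     ) AS u(url)
-- Pad with five slashes so each of the five searches below always finds one.
CROSS APPLY (VALUES (u.url + '/////'))                       AS s(padded)
CROSS APPLY (VALUES (CHARINDEX('/', s.padded)))              AS p1(pos)
CROSS APPLY (VALUES (CHARINDEX('/', s.padded, p1.pos + 1))) AS p2(pos)
CROSS APPLY (VALUES (CHARINDEX('/', s.padded, p2.pos + 1))) AS p3(pos)
CROSS APPLY (VALUES (CHARINDEX('/', s.padded, p3.pos + 1))) AS p4(pos)
CROSS APPLY (VALUES (CHARINDEX('/', s.padded, p4.pos + 1))) AS p5(pos);
```

For URLs with fewer than five slashes, p5.pos falls past the end of the original string, so LEFT returns the whole URL. A strict 5th-slash cut gives http://db-hit-internet/store/books for the second sample, which is one segment shorter than the expected output listed in the question.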

One option is a JSON-based approach, which transforms the data into a valid JSON and parses this JSON with OPENJSON():
Table:
CREATE TABLE Data (url varchar(100))
INSERT INTO Data (url)
VALUES
('http://db-hit-internet/bags/personnel/default.axxxx'),
('http://db-hit-internet/store/books/preview/default.axxxx'),
('http://db-git-internet/friends/default.aspx?lang=LTT'),
('http://db-git-internet.net')
Statement:
SELECT d.url, CONCAT(j.part1, j.part2, j.part3, j.part4, j.part5) AS url
FROM Data d
CROSS APPLY OPENJSON(CONCAT('[["', REPLACE(STRING_ESCAPE(d.url, 'json'), '/', '/","'), '"]]')) WITH (
part1 varchar(100) '$[0]',
part2 varchar(100) '$[1]',
part3 varchar(100) '$[2]',
part4 varchar(100) '$[3]',
part5 varchar(100) '$[4]'
) j
Result:
url                                                        url
http://db-hit-internet/bags/personnel/default.axxxx        http://db-hit-internet/bags/personnel/
http://db-hit-internet/store/books/preview/default.axxxx   http://db-hit-internet/store/books/
http://db-git-internet/friends/default.aspx?lang=LTT       http://db-git-internet/friends/default.aspx?lang=LTT
http://db-git-internet.net                                 http://db-git-internet.net
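To see what the JSON-based statement above is doing, it can help to look at the intermediate text it builds. The following is a minimal sketch, assuming SQL Server 2016 or later (STRING_ESCAPE and OPENJSON); the variable @url and the column aliases are made up for illustration.

```sql
-- Sketch only: @url, json_text and j are hypothetical names.
DECLARE @url varchar(100) = 'http://db-hit-internet/bags/personnel/default.axxxx';

-- The text handed to OPENJSON: an outer array holding one inner array of pieces.
SELECT CONCAT('[["', REPLACE(STRING_ESCAPE(@url, 'json'), '/', '/","'), '"]]') AS json_text;

-- With a single-level array and the default schema, OPENJSON returns one row
-- per piece, with the zero-based position in [key].
SELECT j.[key], j.[value]
FROM OPENJSON(CONCAT('["', REPLACE(STRING_ESCAPE(@url, 'json'), '/', '/","'), '"]')) AS j;
```

Each piece keeps its trailing slash, which is why concatenating the first five pieces ($[0] to $[4]) yields the URL up to and including the 5th slash.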
If you want a string-based approach, the following statement is a possible solution:
SELECT d.url, LEFT(d.url, v5.pos)
FROM Data d
CROSS APPLY (SELECT CASE WHEN CHARINDEX('/', d.url, 1) = 0 THEN LEN(d.url) ELSE CHARINDEX('/', d.url, 1) END) v1 (pos)
CROSS APPLY (SELECT CASE WHEN CHARINDEX('/', d.url, v1.pos + 1) = 0 THEN LEN(d.url) ELSE CHARINDEX('/', d.url, v1.pos + 1) END) v2 (pos)
CROSS APPLY (SELECT CASE WHEN CHARINDEX('/', d.url, v2.pos + 1) = 0 THEN LEN(d.url) ELSE CHARINDEX('/', d.url, v2.pos + 1) END) v3 (pos)
CROSS APPLY (SELECT CASE WHEN CHARINDEX('/', d.url, v3.pos + 1) = 0 THEN LEN(d.url) ELSE CHARINDEX('/', d.url, v3.pos + 1) END) v4 (pos)
CROSS APPLY (SELECT CASE WHEN CHARINDEX('/', d.url, v4.pos + 1) = 0 THEN LEN(d.url) ELSE CHARINDEX('/', d.url, v4.pos + 1) END) v5 (pos)

Related

Pull the particular string from nvarchar type column in SQL Server

I have a string like &hprop=anprop_p&asofmonth=01/2017&OutputType=PDF&IsGrid=-&ReportCode=AllCol1&Attach=NO&IsRequestQue=true and want to pull the values partitioned by & from string.
As we see above each string is separated with & and both the values have a name i.e. Outputtype= and ReportCode=
The SQL query should return only the values, in separate columns: AllCol1 and PDF.
I have tried the query below, but it returns the string ReportCode=AllCol1:
declare @Str varchar(500)
select SUBSTRING(SUBSTRING(@Str, CHARINDEX('&ReportCode=', @Str) + 1, LEN(@Str)), 0, CHARINDEX('&', SUBSTRING(@Str, CHARINDEX('&', @Str) +1, LEN(@Str))))
As you are using SQL Server 2016, you can take advantage of STRING_SPLIT() to split your url into the component query parameters, e.g.
SELECT *
FROM STRING_SPLIT(N'&hprop=anprop_p&asofmonth=01/2017&OutputType=PDF&IsGrid=-&ReportCode=AllCol1&Attach=NO&IsRequestQue=true', '&');
Will return:
value
-----------------
hprop=anprop_p
asofmonth=01/2017
OutputType=PDF
IsGrid=-
ReportCode=AllCol1
Attach=NO
IsRequestQue=true
You would then need to split each result on = to separate it into the parameter name and the argument. e.g.
SELECT s.value,
Parameter = CASE WHEN CHARINDEX('=', s.value) = 0 THEN s.value ELSE LEFT(s.value, CHARINDEX('=', s.value) - 1) END,
Value = CASE WHEN CHARINDEX('=', s.value) = 0 THEN NULL ELSE SUBSTRING(s.value, CHARINDEX('=', s.value) + 1, LEN(s.value)) END
FROM STRING_SPLIT(N'&hprop=anprop_p&asofmonth=01/2017&OutputType=PDF&IsGrid=-&ReportCode=AllCol1&Attach=NO&IsRequestQue=true', '&') s;
Returns:
value                Parameter      Value
-------------------------------------------------
                                    NULL
hprop=anprop_p       hprop          anprop_p
asofmonth=01/2017    asofmonth      01/2017
OutputType=PDF       OutputType     PDF
IsGrid=-             IsGrid         -
ReportCode=AllCol1   ReportCode     AllCol1
Attach=NO            Attach         NO
IsRequestQue=true    IsRequestQue   true
Finally, you would just need to extract the terms you are actually interested in, and PIVOT them to bring back one row. Bringing it all together, you get:
DECLARE @T TABLE (ID INT IDENTITY, Col NVARCHAR(MAX));
INSERT @T (Col)
VALUES
(N'&hprop=anprop_p&asofmonth=01/2017&OutputType=PDF&IsGrid=-&ReportCode=AllCol1&Attach=NO&IsRequestQue=true'),
(N'&hprop=anprop_p&asofmonth=01/2017&OutputType=XLS&IsGrid=-&ReportCode=AllCol3&Attach=NO&IsRequestQue=false');
SELECT pvt.ID, pvt.OutputType, pvt.ReportCode
FROM ( SELECT T.ID,
t.Col,
Parameter = CASE WHEN CHARINDEX('=', s.value) = 0 THEN s.value ELSE LEFT(s.value, CHARINDEX('=', s.value) - 1) END,
Value = CASE WHEN CHARINDEX('=', s.value) = 0 THEN NULL ELSE SUBSTRING(s.value, CHARINDEX('=', s.value) + 1, LEN(s.value)) END
FROM @T AS t
CROSS APPLY STRING_SPLIT(T.Col, '&') AS s
WHERE s.value <> ''
) AS t
PIVOT (MAX(Value) FOR Parameter IN ([ReportCode], [OutputType])) AS pvt;
Which returns:
ID   OutputType   ReportCode
----------------------------------
1    PDF          AllCol1
2    XLS          AllCol3
Use string_split():
select max(case when s.value like 'Outputtype=%'
then stuff(s.value, 1, 11, '')
end) as Outputtype,
max(case when s.value like 'ReportCode=%'
then stuff(s.value, 1, 11, '')
end) as ReportCode
from string_split(@str, '&') s;
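If STRING_SPLIT is not available, the original SUBSTRING attempt can be repaired instead. This is a minimal sketch, not the only way to do it: it starts just after the parameter name and stops at the next &, with a sentinel & appended so the search also succeeds when the parameter is the last one. The variable @Key and the alias st are made up for illustration.

```sql
-- Sketch only: @Key and st are hypothetical names; @Str holds the string from the question.
DECLARE @Str varchar(500) =
    '&hprop=anprop_p&asofmonth=01/2017&OutputType=PDF&IsGrid=-&ReportCode=AllCol1&Attach=NO&IsRequestQue=true';
DECLARE @Key varchar(50) = '&ReportCode=';

SELECT SUBSTRING(@Str,
                 st.pos,                                        -- first character after the key
                 CHARINDEX('&', @Str + '&', st.pos) - st.pos    -- length up to the next '&'
       ) AS ReportCode
FROM (VALUES (CHARINDEX(@Key, @Str) + LEN(@Key))) AS st(pos)
WHERE CHARINDEX(@Key, @Str) > 0;                                -- skip strings without the key
```

For the sample string this yields AllCol1; changing @Key to '&OutputType=' yields PDF.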

T-SQL substring between two slashes to extract data

I am trying to extract part of a string in T-SQL for a project I am working on.
Examples:
/Clients/AAA/Something/Something
/Clients/BBBB/Something/Something
I am specifically trying to extract the AAA or the BBBB part, which does not have a consistent number of characters.
Try the following using CHARINDEX and SUBSTRING.
drop table #a
create table #a (d varchar(100))
insert into #a (d)
values ('/Clients/AAA/Something/Something/')
,('/Clients/bbbbb/Something/Something/')
select d as [OriginalData]
,charindex('/', d, charindex('/', d, 0)+1) as [SecondSlash]
,charindex('/', d, charindex('/', d, charindex('/', d, 0)+1)+1) as [ThirdSlash]
,SUBSTRING(d -- Value
, charindex('/', d, charindex('/', d, 0)+1)+1 -- Startpoint (SecondSlash) + 1
, charindex('/', d, charindex('/', d, charindex('/', d, 0)+1)+1) - charindex('/', d, charindex('/', d, 0)+1)-1) as [Extract]
-- Endpoint (ThirdSlash - SecondSlash - 1)
from #a
It's a bit messy and will only return the text between the second and third slash, but it should be fairly quick.
I find that apply is convenient for expressing this type of logic:
with t as (
select *
from (values ('/Clients/AAA/Something/Something/'), ('/Clients/bbbbb/Something/Something/')) t(str)
)
select *, left(str2, charindex('/', str2) - 1)
from t cross apply
(values (stuff(str, 1, patindex('%_/%', str) + 1, ''))) v(str2);
Note that this looks for the pattern _/ to find the second slash in the string.
select Data
-- Find second slash
, charindex('/', Data, 2)
-- Find third slash
, charindex('/', Data, charindex('/', Data, 2)+1)
-- Find string between the second and third slash
, substring(data, charindex('/', Data, 2) + 1, charindex('/', Data, charindex('/', Data, 2)+1) - charindex('/', Data, 2) - 1)
from (
select '/Clients/AAA/Something/Something' Data
union all select '/Clients/BBBB/Something/Something'
) x
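The queries above assume every row has at least three slashes; with fewer, the length passed to SUBSTRING can go negative and raise an error. A minimal sketch of a guarded version follows, assuming (as above) that the string starts with a slash. The aliases x, s2, s3 and Segment, and the extra '/Clients' row, are made up for illustration.

```sql
-- Sketch only: returns NULL instead of failing when the 2nd or 3rd slash is missing.
SELECT x.Data,
       CASE WHEN s3.pos > 0
            THEN SUBSTRING(x.Data, s2.pos + 1, s3.pos - s2.pos - 1)
       END AS Segment
FROM (VALUES
        ('/Clients/AAA/Something/Something'),
        ('/Clients/BBBB/Something/Something'),
        ('/Clients')                               -- hypothetical row with too few slashes
     ) AS x(Data)
-- Second slash: search from position 2 to skip the leading one.
CROSS APPLY (VALUES (CHARINDEX('/', x.Data, 2))) AS s2(pos)
-- Third slash: only searched for when the second one exists.
CROSS APPLY (VALUES (CASE WHEN s2.pos > 0
                          THEN CHARINDEX('/', x.Data, s2.pos + 1)
                          ELSE 0 END)) AS s3(pos);
```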

Extract substring from a string in SQL Server

I need to extract part of each of the strings below:
YY_12.Yellow
ABC_WSA.Thisone_A
SS_4MON.DHHE_A_A
The expected output is:
Yellow
Thisone
DHHE
You could use something like this:
declare @tbl table (col nvarchar(100));
insert @tbl values ('YY_12.Yellow'), ('ABC_WSA.Thisone_A'), ('SS_4MON.DHHE_A_A')
select *
, charindex('_', col2, 0)
, left(col2,
case
when charindex('_', col2, 0) - 1 > 0
then charindex('_', col2, 0) - 1
else len(col2)
end) [result]
from (
select col
, substring(col, charindex('.', col, 0) + 1, len(col)) [col2]
from @tbl ) rs
I'm leaving the full code in so that you can see what I did.
First, identify and remove everything up to the dot "." (the [col2] column in the nested SELECT).
Then I nest that SELECT so that it is easier to apply further logic to its result column, from which I keep only everything up to the underscore "_".
The final result is stored in the [result] column.
Try this:
CREATE TABLE app (info varchar(20))
INSERT INTO app VALUES
('YY_12.Yellow'),
('ABC_WSA.Thisone_A'),
('SS_4MON.DHHE_A_A'),
('testnopoint')
SELECT
CASE
WHEN CHARINDEX('.', info) > 0 THEN
CASE
WHEN CHARINDEX('_', info, CHARINDEX('.', info) + 1) > 0 THEN
SUBSTRING(info, CHARINDEX('.', info) + 1, CHARINDEX('_', info, CHARINDEX('.', info) + 1) - CHARINDEX('.', info) - 1)
ELSE
SUBSTRING(info, CHARINDEX('.', info) + 1, LEN(info))
END
END
FROM app
If . is not present, my query returns NULL; if you want it to return the whole string instead, remove the outer CASE statement.
You could also try the parsename() function:
select Name, left(parsename(Name,1),
case when charindex('_', parsename(Name,1)) > 0
then charindex('_', parsename(Name,1))-1
else len(parsename(Name,1))
end) [ExtractedName] from table_name
This assumes there is always a . in your string, so that the name can be read after it.
Result:
Name                ExtractedName
YY_12.Yellow        Yellow
ABC_WSA.Thisone_A   Thisone
SS_4MON.DHHE_A_A    DHHE
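As a small illustration of how parsename() behaves on this kind of data: it numbers the dot-separated parts from the right, and it is designed for object names with at most four parts. The sketch below uses made-up aliases (v, last_part, previous_part) and two extra hypothetical rows to show the edge cases.

```sql
-- Sketch only: the 'nodot' and 'a.b.c.d.e' rows are hypothetical edge cases.
SELECT v.Name,
       PARSENAME(v.Name, 1) AS last_part,       -- text after the last dot (whole string if there is no dot)
       PARSENAME(v.Name, 2) AS previous_part    -- part before it, NULL if there is no dot
FROM (VALUES
        ('YY_12.Yellow'),
        ('ABC_WSA.Thisone_A'),
        ('nodot'),
        ('a.b.c.d.e')                           -- five parts: PARSENAME returns NULL
     ) AS v(Name);
```

So the approach is a reasonable fit only when the values contain no more than three dots.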
Try this, which uses STUFF:
SELECT LEFT(STUFF(col,1,CHARINDEX('.',col),''),
CHARINDEX('_',STUFF(col,1,CHARINDEX('.',col),'')+'_')-1
)
FROM #table
Output:
Yellow
Thisone
DHHE

Extract string after the second / and before -

I have a field that holds an account code. I've managed to extract the first 2 parts OK but I'm struggling with the last 2.
The field data is as follows:
812330/50110/0-0
812330/50110/BDG001-0
812330/50110/0-X001
I need to get the string between the second "/" and the "-", and the string after the "-". Both fields have variable lengths, so I would be looking to output 0 and 0 for the first record, BDG001 and 0 for the second record, and 0 and X001 for the third record.
Any help much appreciated, thanks.
You can use CHARINDEX and LEFT/RIGHT:
CREATE TABLE #tab(col VARCHAR(1000));
INSERT INTO #tab VALUES ('812330/50110/0-0'),('812330/50110/BDG001-0'),
('812330/50110/0-X001');
WITH cte AS
(
SELECT
col,
r = RIGHT(col, CHARINDEX('/', REVERSE(col))-1)
FROM #tab
)
SELECT col,
r,
sub1 = LEFT(r, CHARINDEX('-', r)-1),
sub2 = RIGHT(r, LEN(r) - CHARINDEX('-', r))
FROM cte;
EDIT:
or even simpler:
SELECT
col
,sub1 = SUBSTRING(col,
LEN(col) - CHARINDEX('/', REVERSE(col)) + 2,
CHARINDEX('/', REVERSE(col)) -CHARINDEX('-', REVERSE(col))-1)
,sub2 = RIGHT(col, CHARINDEX('-', REVERSE(col))-1)
FROM #tab;
EDIT 2:
Using PARSENAME (if your data does not contain .):
SELECT
col,
sub1 = PARSENAME(REPLACE(REPLACE(col, '/', '.'), '-', '.'), 2),
sub2 = PARSENAME(REPLACE(REPLACE(col, '/', '.'), '-', '.'), 1)
FROM #tab;
...Or you can do this, so you only go from left side to right, so you don't need to count from the end in case you have more '/' or '-' signs:
SELECT
SUBSTRING(columnName, CHARINDEX('/' , columnName, CHARINDEX('/' , columnName) + 1) + 1,
CHARINDEX('-', columnName) - CHARINDEX('/' , columnName, CHARINDEX('/' , columnName) + 1) - 1) AS FirstPart,
SUBSTRING(columnName, CHARINDEX('-' , columnName) + 1, LEN(columnName)) AS LastPart
FROM table_name
One method is to download a split() function off the web and use it. However, the values end up in separate rows, not separate columns. An alternative is a series of nested subqueries, CTEs, or outer applies:
select t.*, p1.part1, p12.part2, p12.part3
from table_name t outer apply
(select left(t.field, charindex('/', t.field)) as part1,
substring(t.field, charindex('/', t.field) + 1, len(t.field)) as rest1
) p1 outer apply
(select left(p1.rest1, charindex('/', p1.rest1)) as part2,
substring(p1.rest1, charindex('/', p1.rest1) + 1, len(p1.rest1)) as part3
) p12
where t.field like '%/%/%';
The where clause guarantees that the field value is in the right format. Otherwise, you need to start sprinkling the code with case statements to handle malformed data.

Ternary operator in SQL? "invalid length parameter passed to the LEFT or SUBSTRING function"

Sorry for the misleading subject; I didn't know how to word it better.
Because I'm mainly a software developer, the ternary operator comes to mind for the following problem.
I need to find the most robust way to link two tables via a nullable foreign key (modModel and tabSparePart). The only similarity between the two is the model's name and the spare part's description (tabSparePart is an external table from the customer that is imported automatically, so it's not my responsibility and I cannot change the data).
Consider the following sparepart-names:
W200I_E/Swap
EXCHANGEUNIT P1i / SILVERBLACK/ CYRILLIC
The modelnames that i want to find are P1i and W200I_E.
So there is only one strong rule that I can enforce in the where clause:
there must be a separator / and the relevant part is the first one.
Here is the sample data:
Create table #temp(Partname varchar(100))
INSERT INTO #temp
SELECT 'EXCHANGEUNIT P1i / SILVERBLACK/ CYRILLIC' UNION ALL SELECT 'W200I_E/Swap unit/Black'
I would have been finished with the following query:
SELECT RTRIM(LEFT(Partname, CHARINDEX('/', Partname) - 1)) AS UNIT
FROM #temp
WHERE CHARINDEX('/', Partname) > 0
... which returns:
EXCHANGEUNIT P1i
W200I_E
But I need P1i. So I also need to handle the case where the first part contains whitespace. In that case I need to select the last word, but only if there is any whitespace at all.
I'm getting an "invalid length parameter passed to the LEFT or SUBSTRING function" error with the following query:
SELECT REVERSE( LEFT( REVERSE(RTRIM(LEFT(Partname, CHARINDEX('/', Partname) - 1)))
, CHARINDEX(' ', REVERSE(RTRIM(LEFT(Partname, CHARINDEX('/', Partname) - 1))))-1 ))
AS Unit
FROM #temp
WHERE CHARINDEX('/', Partname) > 0
This would work without the second record, which has no whitespace. If I also required the first part to contain whitespace, I would discard valid records.
To cut a long story short, I need to find a way to combine both ways according to the existence of separators.
PS: This has arisen from: Get the last word of a part of a varchar (LEFT/RIGHT)
If anybody is interested, this is the complete (working) stored procedure. I'm sure I've never used such a strange JOIN:
CREATE PROC [dbo].[UpdateModelSparePart](@updateCount int output)
with execute as Owner
AS
BEGIN
BEGIN TRANSACTION
UPDATE modModel SET fiSparePart=ModelPart.idSparePart
FROM modModel INNER JOIN
(
SELECT m.idModel
,m.ModelName
,sp.idSparePart
,sp.Price
,Row_Number()Over(Partition By idModel ORDER BY Price DESC)as ModelPrice
FROM modModel AS m INNER JOIN tabSparePart AS sp
ON m.ModelName = CASE
WHEN CHARINDEX(' ', REVERSE(RTRIM(LEFT(sp.SparePartDescription, CHARINDEX('/', sp.SparePartDescription) - 1)))) > 0 THEN
REVERSE( LEFT( REVERSE(RTRIM(LEFT(sp.SparePartDescription, CHARINDEX('/', sp.SparePartDescription) - 1)))
,CHARINDEX(' ', REVERSE(RTRIM(LEFT(sp.SparePartDescription, CHARINDEX('/', sp.SparePartDescription) - 1))))-1 ))
ELSE
RTRIM(LEFT(sp.SparePartDescription, CHARINDEX('/', sp.SparePartDescription) - 1))
END
WHERE (CHARINDEX('/', sp.SparePartDescription) > 0)
GROUP BY idModel,ModelName,idSparePart,Price
)As ModelPart
ON ModelPart.idModel=modModel.idModel
Where ModelPrice=1
SET @updateCount = @@ROWCOUNT;
COMMIT TRANSACTION
END
A more concise version.
SELECT REVERSE(SUBSTRING(Rev, 0, CHARINDEX(' ', Rev))) AS Unit
FROM #temp
CROSS APPLY (
SELECT REVERSE(RTRIM(LEFT(Partname, CHARINDEX('/', Partname) - 1))) + ' '
) T(Rev)
WHERE CHARINDEX('/', Partname) > 0
I was able to solve the problem:
SELECT 'Unit' =
CASE
WHEN CHARINDEX(' ', REVERSE(RTRIM(LEFT(Partname, CHARINDEX('/', Partname) - 1)))) > 0 THEN
REVERSE( LEFT( REVERSE(RTRIM(LEFT(Partname, CHARINDEX('/', Partname) - 1)))
,CHARINDEX(' ', REVERSE(RTRIM(LEFT(Partname, CHARINDEX('/', Partname) - 1))))-1 ))
ELSE
RTRIM(LEFT(Partname, CHARINDEX('/', Partname) - 1))
END
FROM #temp
WHERE CHARINDEX('/', Partname) > 0
Ugly but working fine.
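Since the question asks about a ternary operator: SQL Server 2012 and later provide IIF(condition, true_value, false_value), which is shorthand for a two-branch CASE. The following is a sketch of the same logic written with IIF, with the repeated expression factored out through CROSS APPLY. The aliases t, f and part are made up for illustration; the sample rows are the ones from the question.

```sql
-- Sketch only: same logic as the CASE version, using IIF and hypothetical aliases.
SELECT IIF(CHARINDEX(' ', f.part) > 0,
           RIGHT(f.part, CHARINDEX(' ', REVERSE(f.part)) - 1),   -- last word of the first part
           f.part                                                -- no whitespace: keep the whole part
       ) AS Unit
FROM (VALUES
        ('EXCHANGEUNIT P1i / SILVERBLACK/ CYRILLIC'),
        ('W200I_E/Swap unit/Black')
     ) AS t(Partname)
-- Text before the first '/', with trailing spaces removed.
CROSS APPLY (VALUES (RTRIM(LEFT(t.Partname, CHARINDEX('/', t.Partname) - 1)))) AS f(part)
WHERE CHARINDEX('/', t.Partname) > 0;
```

For the two sample rows this gives P1i and W200I_E.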