How to obtain certain value from URL? [duplicate] - sql

Given this URL:
www.google.com/hsisn/-#++#/valuetoretrive/+#(#(/.html
The value to retrieve is between the 4th and 5th slash.
How to retrieve that particular value using SQL Server 2008?

SQL Server has no built-in function that returns the nth occurrence of a value. The only option is CHARINDEX, which returns the first occurrence at or after the specified starting position. So the only way to use it is to cascade each value found, i.e.:
Find the position of 1st "/"
Find the position of the next "/" after the first one
Find the position of the next "/" after the second one
Each calculation requires the result of the previous one, so getting to the 5th occurrence becomes fairly messy, but it is manageable if you use CROSS APPLY to reuse your results. Once you have the positions of the 4th and 5th occurrences you can use SUBSTRING to extract the text:
SELECT t.url,
Value = SUBSTRING(t.url, p4.Position, p5.Position - p4.Position - 1)
FROM (SELECT url = 'URL:/www.google.com/hsisn/-#++#/valuetoretrive/+#(#(/.html') AS t
CROSS APPLY (SELECT 1 + CHARINDEX('/', url)) AS p1 (Position)
CROSS APPLY (SELECT 1 + CHARINDEX('/', url, p1.Position)) AS p2 (Position)
CROSS APPLY (SELECT 1 + CHARINDEX('/', url, p2.Position)) AS p3 (Position)
CROSS APPLY (SELECT 1 + CHARINDEX('/', url, p3.Position)) AS p4 (Position)
CROSS APPLY (SELECT 1 + CHARINDEX('/', url, p4.Position)) AS p5 (Position);
ADDENDUM
The other option you have, if you want more flexibility, i.e. get the text between the 50th and 51st occurrence, is to utilise a split function. The most efficient way to split strings is with a CLR function, but the next best T-SQL only method for this purpose is to use a numbers table to split your string, and in the absence of this create your own using stacked CTEs.
I will assume that you don't have a numbers table and use a stacked CTE in the interest of a complete working example.
CREATE FUNCTION dbo.Split (@StringToSplit VARCHAR(1000), @Delimiter CHAR(1))
RETURNS TABLE
AS
RETURN
( WITH N1 (N) AS (SELECT 1 FROM (VALUES (1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) AS t (n)),
N2 (N) AS (SELECT 1 FROM N1 AS N1 CROSS JOIN N1 AS N2),
Numbers (N) AS (SELECT ROW_NUMBER() OVER(ORDER BY n1.N) FROM N1 CROSS JOIN N2 AS N2)
SELECT Position = ROW_NUMBER() OVER(ORDER BY n.N),
-- Each value runs from n.N up to the next delimiter, or to the end of the string for the last value
Value = SUBSTRING(@StringToSplit, n.N, ISNULL(NULLIF(CHARINDEX(@Delimiter, @StringToSplit, n.N), 0), LEN(@StringToSplit) + 1) - n.N)
FROM Numbers AS n
WHERE SUBSTRING(@Delimiter + @StringToSplit, n.N, 1) = @Delimiter
);
Which you can call fairly simply:
DECLARE @Table TABLE (URL VARCHAR(255) NOT NULL);
INSERT @Table VALUES ('URL:/www.google.com/hsisn/-#++#/valuetoretrive/+#(#(/.html');
SELECT s.*
FROM @Table AS t
CROSS APPLY dbo.Split(t.URL, '/') AS s;
Which gives you:
Position Value
---------------------
1 URL:
2 www.google.com
3 hsisn
4 -#++#
5 valuetoretrive
6 +#(#(
7 .html
So you can simply select the 5th value from this by adding a WHERE clause:
DECLARE @Table TABLE (URL VARCHAR(255) NOT NULL);
INSERT @Table
VALUES
('URL:/www.google.com/hsisn/-#++#/valuetoretrive/+#(#(/.html'),
('URL:/www.google.com/hsisn/-#++#/valuetoretrive2/+#(#(/.html');
SELECT t.URL, s.Value
FROM @Table AS t
CROSS APPLY dbo.Split(t.URL, '/') AS s
WHERE s.Position = 5;

If you don't know beforehand the length of the URL, the value to retrieve, or the slash positions, you can use this snippet:
declare @uri varchar(max) = 'URL:/www.google.com/hsisn/-#++#/valuetoretrive/+#(#(/.html'
,@startAt int = 0
,@slashCount int = 0
while @slashCount < 5
begin
set @startAt = CHARINDEX('/',@uri);
set @slashCount = @slashCount + 1;
if (@slashCount = 5)
set @uri = SUBSTRING(@uri, 0, @startAt)
else
set @uri = SUBSTRING(@uri, @startAt + 1, LEN(@uri))
-- debug info
select @startAt, @slashCount, @uri
end
It will decompose the string, finding slash positions until it reaches the 4th and 5th slashes, and keep whatever lies between them.
OUTPUT
5 1 www.google.com/hsisn/-#++#/valuetoretrive/+#(#(/.html
15 2 hsisn/-#++#/valuetoretrive/+#(#(/.html
6 3 -#++#/valuetoretrive/+#(#(/.html
6 4 valuetoretrive/+#(#(/.html
15 5 valuetoretrive
You can also get it using CROSS APPLY instead of a WHILE loop, but this way your code will not need to get big and messy to reach the n-th slash when n > 10.
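As a rough sketch of that idea, the same loop could be wrapped in a scalar function that takes n as a parameter. The name dbo.fn_TextBeforeNthSlash and both of its parameters are assumptions made for illustration; they are not from the question.

```sql
-- Hypothetical helper: returns the text between slash (@n - 1) and slash @n.
-- The function name and parameters are assumptions for illustration only.
CREATE FUNCTION dbo.fn_TextBeforeNthSlash (@uri VARCHAR(MAX), @n INT)
RETURNS VARCHAR(MAX)
AS
BEGIN
    DECLARE @startAt INT = 0,
            @slashCount INT = 0;

    WHILE @slashCount < @n
    BEGIN
        SET @startAt = CHARINDEX('/', @uri);

        -- Fewer than @n slashes in the string: nothing to return
        IF @startAt = 0 RETURN NULL;

        SET @slashCount = @slashCount + 1;

        IF @slashCount = @n
            -- Keep only the text before the n-th slash
            SET @uri = SUBSTRING(@uri, 1, @startAt - 1);
        ELSE
            -- Drop everything up to and including the current slash
            SET @uri = SUBSTRING(@uri, @startAt + 1, LEN(@uri));
    END;

    RETURN @uri;
END;
```

Called as SELECT dbo.fn_TextBeforeNthSlash('URL:/www.google.com/hsisn/-#++#/valuetoretrive/+#(#(/.html', 5); it should walk through the same steps as the debug output above and end on valuetoretrive.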

Related

Scalar function to calculate items count?

I have a nvarchar column which contains values like following:
item1+item2
item1+2item2
4item1+item2+2item3
I want a scalar function to calculate the item count.
From the examples above, note that:
Items are separated by "+".
An item may start with a digit, which is the count for that item.
The required count for above examples should be as following:
item1+item2 2
item1+2item2 3
4item1+item2+2item3 7
Here is an option using try_convert() and string_split()
This one is assuming single digit leads.
Example
Declare @YourTable Table ([SomeCol] varchar(50)) Insert Into @YourTable Values
('item1+item2')
,('item1+2item2')
,('4item1+item2+2item3')
Select *
From @YourTable A
Cross Apply (
Select value = sum(isnull(try_convert(int,left(value,1)),1))
From string_split(SomeCol,'+')
) B
Returns
SomeCol value
item1+item2 2
item1+2item2 3
4item1+item2+2item3 7
You can split the values. Then extract the leading number:
select t.*, v.the_sum
from t cross apply
(select sum(coalesce(nullif(v.num, 0), 1)) as the_sum
from string_split(col, '+') s cross apply
(values (try_convert(int, left(s.value, patindex('%[^0-9]%', ltrim(s.value) + ' ') - 1
)
)
)
) v(num)
) v;
Here is a db<>fiddle.
Note: this assumes that the prefix is never explicitly 0. That can be handled, but adds a bit of complication that doesn't seem necessary.
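Since the question asked for a scalar function, here is a rough sketch that wraps the same split-and-sum idea in one. The name dbo.fn_ItemCount and its parameter are made up for illustration. It needs SQL Server 2016 or later (compatibility level 130) for STRING_SPLIT, and unlike the query above it allows multi-digit prefixes and counts an explicit 0 prefix as 0.

```sql
-- Hypothetical scalar function; the name and parameter are assumptions.
CREATE FUNCTION dbo.fn_ItemCount (@Items NVARCHAR(4000))
RETURNS INT
AS
BEGIN
    RETURN
    (
        SELECT SUM(CASE WHEN p.DigitLen = 0 THEN 1   -- no numeric prefix: counts once
                        ELSE TRY_CONVERT(INT, LEFT(s.value, p.DigitLen))
                   END)
        FROM STRING_SPLIT(@Items, '+') AS s
        -- Length of the leading run of digits; the appended space guarantees
        -- that PATINDEX finds a non-digit even when the item is all digits
        CROSS APPLY (SELECT PATINDEX('%[^0-9]%', s.value + ' ') - 1) AS p (DigitLen)
    );
END;
```

For example, SELECT dbo.fn_ItemCount(N'4item1+item2+2item3'); adds 4 + 1 + 2 and returns 7.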

SQL Server Recursive CTE not returning expected rows

I'm building a Markov chain name generator. I'm trying to replace a while loop with a recursive CTE. Limitations in using top and order by in the recursive part of the CTE have led me down the following path.
The point of all of this is to generate names, based on a model, which is just another word that I've chunked out into three character segments, stored in three columns in the Markov_Model table. The next character in the sequence will be a character from the Markov_Model, such that the 1st and 2nd characters in the model match the penultimate and ultimate character in the word being generated. Rather than generate a probability matrix for that third character, I'm using a scalar function that finds all the characters that fit the criteria, and gets one of them randomly: order by newid().
The problem is that this formulation of the CTE gets the desired number of rows in the anchor segment, but the union that recursively calls the CTE only unions one row from the anchor. I've attached a sample of the desired output at the bottom.
The query:
;with names as
(
select top 5
cast('+' as nvarchar(50)) as char1,
cast('+' as nvarchar(50)) as char2,
cast(char3 as nvarchar(50)) as char3,
cast('++' + char3 as nvarchar(100)) as name_in_progress,
1 as iteration
from markov_Model
where char1 is null
and char2 is null
order by newid() -- Get some random starting characters
union all
select
n.char2 as char1,
n.char3 as char2,
cast(fnc.addition as nvarchar(50)) as char3,
cast(n.name_in_progress + fnc.addition as nvarchar(100)),
1 + n.iteration
from names n
cross apply (
-- This function takes the preceding two characters,
-- and gets a random character that follows the pattern
select isnull(dbo.[fn_markov_getNext] (n.char2, n.char3), ',') as addition
) fnc
)
select *
from names
option (maxrecursion 3) -- For debug
The trouble is the union only unions one row.
Example output:
char1 char2 char3 name_in_progress iteration
+ + F ++F 1
+ + N ++N 1
+ + K ++K 1
+ + S ++S 1
+ + B ++B 1
+ B a ++Ba 2
B a c ++Bac 3
a c h ++Bach 4
Note I'm using + and , as null replacers/delimiters.
What I want to see is the entirety of the previous recursion, with the addition of the new characters to the name_in_progress; each pass should modify the entirety of the previous pass.
My desired output would be:
Top 10 of the Markov_Model table:
Text of the function that gets the next character from the Markov_Model:
CREATE FUNCTION [dbo].[fn_markov_getNext]
(
@char2 nvarchar(1),
@char3 nvarchar(1)
)
RETURNS nvarchar(1)
AS
BEGIN
DECLARE @newChar nvarchar(1)
set @newChar = (
select top 1
isnull(char3, ',')
from markov_Model mm
where isnull(mm.char1, '+') = isnull(@char2, '+')
and isnull(mm.char2, '+') = isnull(@char3, ',')
order by (select new_id from vw_getGuid) -- A view that calls newid()
)
return @newChar
END

SQL Server: Select rows with multiple occurrences of regex match in a column

I’m fairly used to using MySQL, but not particularly familiar with SQL Server. Tough luck, the database I’m dealing with here is on SQL Server 2014.
I have a table with a column whose values are all integers with leading, separating, and trailing semicolons, like these three fictitious rows:
;905;1493;384;13387;29;933;467;28732;
;905;138;3084;1387;290;9353;4767;2732;
;9085;14493;3864;130387;289;933;4767;28732;
What I am trying to do now is to select all rows where more than one number taken from a list of numbers appears in this column. So for example, given the three rows above, if I have the group 905,467,4767, the statement I’m trying to figure out how to construct should return the first two rows: the first row contains 905 and 467; the second row contains 905 and 4767. The third row contains only 4767, so that row should not be returned.
As far as I can tell, SQL Server does not actually support regex directly (and I don’t even know what managed code is), which doesn’t help. Even with regex, I wouldn’t know where to begin. Oracle seems to have a function that would be very useful, but that’s Oracle.
Most similar questions on here deal with finding multiple instances of the same character (usually singular) and solve the problem by replacing the string to match with nothing and counting the difference in length. I suppose that would technically work here, too, but given a ‘filter’ group of 15 numbers, the SELECT statement would become ridiculously long and convoluted and utterly unreadable. Additionally, I only want to match entire numbers (so if one of the numbers to match is 29, the value 29 would match in the first row, but the value 290 in the second row should not match), which means I’d have to include the semicolons in the REPLACE clause and then discount them when calculating the length. A complete mess.
What I would ideally like to do is something like this:
SELECT * FROM table WHERE REGEXP_COUNT(column, ';(905|467|4767);') > 1
– but that will obviously not work, for all kinds of reasons (the most obvious one being the nonexistence of REGEXP_COUNT outside Oracle).
Is there some sane, manageable way of doing this?
You can do
SELECT *
FROM Mess
CROSS APPLY (SELECT COUNT(*)
FROM (VALUES (905),
(467),
(4767)) V(Num)
WHERE Col LIKE CONCAT('%;', Num, ';%')) ca(count)
WHERE count > 1
SQL Fiddle
Or alternatively
WITH Nums
AS (SELECT Num
FROM (VALUES (905),
(467),
(4767)) V(Num))
SELECT Mess.*
FROM Mess
CROSS APPLY (VALUES(CAST(CONCAT('<x>', REPLACE(Col, ';', '</x><x>'), '</x>') AS XML))) x(x)
CROSS APPLY (SELECT COUNT(*)
FROM (SELECT n.value('.', 'int')
FROM x.x.nodes('/x') n(n)
WHERE n.value('.', 'varchar') <> ''
INTERSECT
SELECT Num
FROM Nums) T(count)
HAVING COUNT(*) > 1) ca2(count)
Could you put your arguments into a table (perhaps using a table-valued function accepting a string (of comma-separated integers) as a parameter) and use something like this?
DECLARE @T table (String varchar(255))
INSERT INTO @T
VALUES
(';905;1493;384;13387;29;933;467;28732;')
, (';905;138;3084;1387;290;9353;4767;2732;')
, (';9085;14493;3864;130387;289;933;4767;28732;')
DECLARE @Arguments table (Arg int)
INSERT INTO @Arguments
VALUES
(905)
, (467)
, (4767)
SELECT String
FROM
@T
CROSS JOIN @Arguments
GROUP BY String
HAVING SUM(CASE WHEN PATINDEX('%;' + CAST(Arg AS varchar) + ';%', String) > 0 THEN 1 ELSE 0 END) > 1
An example of using this with a function to generate the arguments:
CREATE FUNCTION GenerateArguments (@Integers varchar(255))
RETURNS @Arguments table (Arg int)
AS
BEGIN
WITH cte
AS
(
SELECT
PATINDEX('%,%', @Integers) p
, LEFT(@Integers, PATINDEX('%,%', @Integers) - 1) n
UNION ALL
SELECT
CASE WHEN PATINDEX('%,%', SUBSTRING(@Integers, p + 1, LEN(@Integers))) + p = p THEN 0 ELSE PATINDEX('%,%', SUBSTRING(@Integers, p + 1, LEN(@Integers))) + p END
, CASE WHEN PATINDEX('%,%', SUBSTRING(@Integers, p + 1, LEN(@Integers))) = 0 THEN RIGHT(@Integers, PATINDEX('%,%', REVERSE(@Integers)) - 1) ELSE LEFT(SUBSTRING(@Integers, p + 1, LEN(@Integers)), PATINDEX('%,%', SUBSTRING(@Integers, p + 1, LEN(@Integers))) - 1) END
FROM cte
WHERE p <> 0
)
INSERT INTO @Arguments (Arg)
SELECT n
FROM cte
RETURN
END
GO
DECLARE @T table (String varchar(255))
INSERT INTO @T
VALUES
(';905;1493;384;13387;29;933;467;28732;')
, (';905;138;3084;1387;290;9353;4767;2732;')
, (';9085;14493;3864;130387;289;933;4767;28732;')
;
SELECT String
FROM
@T
CROSS JOIN GenerateArguments('905,467,4767')
GROUP BY String
HAVING SUM(CASE WHEN PATINDEX('%;' + CAST(Arg AS varchar) + ';%', String) > 0 THEN 1 ELSE 0 END) > 1
You can achieve this using LIKE for the pattern match and ROW_NUMBER to count the matches.
Here we declare the column values for testing:
DECLARE @tbl TABLE (
string NVARCHAR(MAX)
)
INSERT @tbl VALUES
(';905;1493;384;13387;29;933;467;28732;'),
(';905;138;3084;1387;290;9353;4767;2732;'),
(';9085;14493;3864;130387;289;933;4767;28732;')
Then we pass your search parameters into a table variable to be joined on:
DECLARE @search_tbl TABLE (
search_value INT
)
INSERT @search_tbl VALUES
(905),
(467),
(4767)
Finally we join the table containing the column to search onto the search table. We apply ROW_NUMBER to number the matches for each string, and select from this subquery where the row number is 2, meaning that the string joined at least twice.
SELECT
string
FROM (
SELECT
tbl.string,
ROW_NUMBER() OVER (PARTITION BY tbl.string ORDER BY tbl.string) AS rn
FROM @tbl tbl
JOIN @search_tbl search_tbl ON
tbl.string LIKE '%;' + CAST(search_tbl.search_value AS NVARCHAR(MAX)) + ';%'
) tbl
WHERE rn = 2
You could build a where clause like this :
WHERE
case when column like '%;905;%' then 1 else 0 end +
case when column like '%;467;%' then 1 else 0 end +
case when column like '%;4767;%' then 1 else 0 end >= 2
The advantage is that you do not need a helper table. I don't know how you build the query, but the following also works, and is useful if the numbers are in a T-SQL variable (a character type; cast it to varchar first if it is numeric).
case when column like ('%;' + @n + ';%') then 1 else 0 end

SQL Server Function OR Procedure to cut records form table by parameters from other table

I have two tables that I need to compare. The first table has a single column holding the full URL. The second table has two columns: the first is the URL category, and the second is the number of "/" characters before which the URL from the first table should be cut.
The first table is:
URL
http://10.6.2.26/ERP/HRServices/WorkflowService.asmx
http://195.170.180.170/SADAD/PaymentNotificationService.asmx
http://10.6.2.26/ERP/HRServices/WorkflowService.asmx
http://10.6.2.26/ERP/HRServices/WorkflowService.asmx
http://10.6.2.26/ERP/HRServices/WorkflowService.asmx
http://217.146.8.6/din.aspx?s=11575802&client=DynGate&p=10002926
http://195.170.180.170/SADAD/PaymentNotificationService.asmx
http://10.6.2.26/ERP/HRServices/WorkflowService.asmx
http://195.170.180.170/SADAD/PaymentNotificationService.asmx
http://www.google.com/
The second table, which it should be compared with, is:
URL CUT_BEFORE
http://10.6.2.26 3
http://217.146.8.6 1
http://195.170.180.170 2
Comparing the first table against the second should produce this result:
URL
http://10.6.2.26/ERP/HRServices
http://195.170.180.170/SADAD
http://10.6.2.26/ERP/HRServices
http://10.6.2.26/ERP/HRServices
http://10.6.2.26/ERP/HRServices
http://217.146.8.6
http://195.170.180.170/SADAD
http://10.6.2.26/ERP/HRServices
http://195.170.180.170/SADAD
http://www.google.com/
What function would do something like this in SQL Server?
Or can it be done in a stored procedure with a WHILE loop? When I tried to execute the last function below, I used this query:
declare @table table
( main_url NVARCHAR(MAX),URL NVARCHAR(MAX), count int)
insert @TABLE
select
Main_URL,T2.Url,T2.[Count]
from
(select
URL as Main_URL,LEFT(URL1, CHARINDEX('/', URL1) - 1) as URL1
from
(select URL,replace(stuff(URL1, 1,patindex('%://%', URL1 + '0'), ''),'//','') as URL1
from (select URL, convert(nvarchar(max),[Url]) Url1 from [dbo].[InternetUsage_nn] )T1)T)T1
left outer join [dbo].[InternetUsage_URL_List] T2
on T1.URL1=convert(nvarchar(max),T2.URL) where T2.URL is not null
select dbo.FindAbsolutePath('/',Main_url,count) from @Table
waiting for your answers
Thanks
The following does what you require, with the aid of a Split function:
CREATE FUNCTION dbo.Split(@StringToSplit NVARCHAR(MAX), @Delimiter NCHAR(1))
RETURNS TABLE
AS
RETURN
(
SELECT ID = ROW_NUMBER() OVER(ORDER BY n.Number),
Position = Number,
Value = SUBSTRING(@StringToSplit, Number, CHARINDEX(@Delimiter, @StringToSplit + @Delimiter, Number) - Number)
FROM ( SELECT TOP (LEN(@StringToSplit) + 1) Number = ROW_NUMBER() OVER(ORDER BY a.object_id)
FROM sys.all_objects a
) n
WHERE SUBSTRING(@Delimiter + @StringToSplit + @Delimiter, n.Number, 1) = @Delimiter
);
Once you have this function your code becomes relatively concise:
DECLARE @T TABLE (URL VARCHAR(1000));
DECLARE @T2 TABLE (URL VARCHAR(1000), Cut_Before INT);
-- POPULATE TABLES HERE (NOT INCLUDED TO SAVE SPACE)
WITH CTE AS
( SELECT FullURL = t.URL,
BaseURLLength = LEN(ISNULL(t2.URL, t.URL)),
Remainder = ISNULL(REPLACE(t.URL, t2.URL, ''), ''),
Cut_Before = ISNULL(t2.Cut_Before, 1)
FROM @T AS t
LEFT JOIN @T2 AS t2
ON t.URL LIKE t2.URL + '/%'
)
SELECT t.FullURL,
Cut = SUBSTRING(t.FullURL, 1, BaseURLLength + LEN(s.Value) + s.Position - 1)
FROM CTE t
OUTER APPLY dbo.Split(t.Remainder, '/') AS s
WHERE s.ID = t.Cut_Before;
Example on SQL Fiddle
The premise is, the first part inside the CTE identifies the part URL for each full url by joining using LIKE. Using http://10.6.2.26/ERP/HRServices/WorkflowService.asmx as an example this will show the following:
FullURL: http://10.6.2.26/ERP/HRServices/WorkflowService.asmx
BaseURLLength: 16
Remainder: /ERP/HRServices/WorkflowService.asmx
Cut_Before: 3
Here Remainder is what is left of the full URL after you have removed the part URL. The split function will then split the remainder into a new row for each of its component parts:
SELECT *
FROM dbo.Split('/ERP/HRServices/WorkflowService.asmx', '/');
Will return:
ID Position Value
1 1
2 2 ERP
3 6 HRServices
4 17 WorkflowService.asmx
This is then limited to only the row that matches the Cut_Before value. This row can then be used to establish the position to "Cut" the full URL (the starting position + the length of the value at that position).
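To make the arithmetic concrete, here is a small sketch that hard-codes the values from the example above (BaseURLLength 16, and the ID 3 row with Position 6 and Value HRServices). The variable name @FullURL is only for illustration.

```sql
DECLARE @FullURL VARCHAR(1000) = 'http://10.6.2.26/ERP/HRServices/WorkflowService.asmx';

-- BaseURLLength (16) + LEN(s.Value) (10) + s.Position (6) - 1 = 31 characters to keep
SELECT Cut = SUBSTRING(@FullURL, 1, 16 + LEN('HRServices') + 6 - 1);
-- returns http://10.6.2.26/ERP/HRServices
```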
I modified my code. This code block will resolve your problem.
CREATE Function FindAbsolutePath(
@TargetStr varchar(8000),
@SearchedStr varchar(8000),
@Occurrence int
)
RETURNS varchar(8000)
AS
BEGIN
DECLARE @Result varchar(8000);
if CHARINDEX('http://',@SearchedStr)>0 --fix http://
BEGIN
set @Occurrence=@Occurrence+2;
END
;WITH Occurrences AS (
SELECT
Number,
ROW_NUMBER() OVER(ORDER BY Number) AS Occurrence
FROM master.dbo.spt_values
WHERE
Number BETWEEN 1
AND LEN(@SearchedStr)
AND type='P'
AND SUBSTRING(@SearchedStr,Number,LEN(@TargetStr))=@TargetStr
)
SELECT @Result= SUBSTRING(@SearchedStr,0,Number)
FROM Occurrences
WHERE Occurrence=@Occurrence
return @Result
END
--select dbo.FindAbsolutePath('/','http://10.6.2.26/ERP/HRServices/WorkflowService.asmx',3)

Print bullet before each sentence + new line after each sentence SQL

I have a text like: Sentence one. Sentence two. Sentence three.
I want it to be:
Sentence one.
Sentence two.
Sentence three.
I assume I can replace '.' with '.' + char(10) + char(13), but how can I go about bullets? '•' character works fine if printed manually I just do not know how to bullet every sentence including the first.
-- Initial string
declare @text varchar(100)
set @text = 'Sentence one. Sentence two. Sentence three.'
-- Setting up replacement text - new lines (assuming this works) and bullets ( char(149) )
declare @replacement varchar(100)
set @replacement = '.' + char(10) + char(13) + char(149)
-- Adding a bullet at the beginning and doing the replacement, but this will also add a trailing bullet
declare @processedText varchar(100)
set @processedText = char(149) + ' ' + replace(@text, '.', @replacement)
-- Figure out length of substring to select in the next step
declare @substringLength int
set @substringLength = LEN(@processedText) - CHARINDEX(char(149), REVERSE(@processedText))
-- Removes trailing bullet
select substring(@processedText, 0, @substringLength)
I've tested here - https://data.stackexchange.com/stackoverflow/qt/119364/
I should point out that doing this in T-SQL doesn't seem correct. T-SQL is meant to process data; any presentation-specific work should be done in the code that calls this T-SQL (C# or whatever you're using).
Here's my over-the-top approach, but I feel it's a fairly solid one. It combines the classic SQL problem-solving techniques of a numbers table for string splitting and FOR XML for concatenating the split lines back together. The code is long, but the only place you'd actually need to edit is the SOURCE_DATA section.
No knock on @Jeremy Wiggins's approach, but I prefer mine as it lends itself well to a set-based approach in addition to being fairly efficient code.
-- This code will rip lines apart based on @delimiter
-- and put them back together based on @rebind
DECLARE
@delimiter char(1)
, @rebind varchar(10);
SELECT
@delimiter = '.'
, @rebind = char(10) + char(149) + ' ';
;
-- L0 to L5 simulate a numbers table
-- http://billfellows.blogspot.com/2009/11/fast-number-generator.html
WITH L0 AS
(
SELECT
0 AS C
UNION ALL
SELECT
0
)
, L1 AS
(
SELECT
0 AS c
FROM
L0 AS A
CROSS JOIN L0 AS B
)
, L2 AS
(
SELECT
0 AS c
FROM
L1 AS A
CROSS JOIN L1 AS B
)
, L3 AS
(
SELECT
0 AS c
FROM
L2 AS A
CROSS JOIN L2 AS B
)
, L4 AS
(
SELECT
0 AS c
FROM
L3 AS A
CROSS JOIN L3 AS B
)
, L5 AS
(
SELECT
0 AS c
FROM
L4 AS A
CROSS JOIN L4 AS B
)
, NUMS AS
(
SELECT
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS number
FROM
L5
)
, SOURCE_DATA (ID, content) AS
(
-- This query simulates your input data
SELECT 1, 'Sentence one. Sentence two. Sentence three.'
UNION ALL SELECT 7, 'In seed time learn, in harvest teach, in winter enjoy.Drive your cart and your plow over the bones of the dead.The road of excess leads to the palace of wisdom.Prudence is a rich, ugly old maid courted by Incapacity.He who desires but acts not, breeds pestilence.'
)
, MAX_LENGTH AS
(
-- this query is rather important. The current NUMS query generates a
-- very large set of numbers but we only need 1 to maximum length of our
-- source data. We can take advantage of a 2008 feature of letting
-- TOP take a dynamic value
SELECT TOP (SELECT MAX(LEN(SD.content)) AS max_length FROM SOURCE_DATA SD)
N.number
FROM
NUMS N
)
, MULTI_LINES AS
(
-- This query will make many lines out a single line based on the supplied delimiter
-- Need to retain the ID (or some unique value from original data to regroup it
-- http://www.sommarskog.se/arrays-in-sql-2005.html#tblnum
SELECT
SD.ID
, LTRIM(substring(SD.content, Number, charindex(@delimiter, SD.content + @delimiter, Number) - Number)) + @delimiter AS lines
FROM
MAX_LENGTH
CROSS APPLY
SOURCE_DATA SD
WHERE
Number <= len(SD.content)
AND substring(@delimiter + SD.content, Number, 1) = @delimiter
)
, RECONSITITUE (content, ID) AS
(
-- use classic concatenation to put it all back together
-- using CR/LF * (space) as delimiter
-- as a correlated sub query and joined back to our original table to preserve IDs
-- https://stackoverflow.com/questions/5196371/sql-query-concatenating-results-into-one-string
SELECT DISTINCT
STUFF
(
(
SELECT @rebind + M.lines
FROM MULTI_LINES M
WHERE M.ID = ML.ID
FOR XML PATH('')
)
, 1
, 1
, '')
, ML.ID
FROM
MULTI_LINES ML
)
SELECT
R.content
, R.ID
FROM
RECONSITITUE R
Results
content ID
----------------------------------------------------------- ---
• In seed time learn, in harvest teach, in winter enjoy.
• Drive your cart and your plow over the bones of the dead.
• The road of excess leads to the palace of wisdom.
• Prudence is a rich, ugly old maid courted by Incapacity.
• He who desires but acts not, breeds pestilence. 7
• Sentence one.
• Sentence two.
• Sentence three. 1
(2 row(s) affected)
References
Number table
Splitting strings via number table
SQL Query - Concatenating Results into One String
select '• On '+ cast(getdate() as varchar)+' I discovered how to do this '
Sample