How to get the nth string in any generic word or sentence with a space delimiter - sql

How do I get the nth word in a sentence or a set of strings with space delimiter?
Sorry for the change in the requirement. Thank you.

By using instr.
select substr(help, 1, instr(help,' ') - 1)
from ( select 'hello my name is...' as help
from dual )
instr(help,' ') returns the position of the first occurrence of the second argument within the first, i.e. the index of the first ' ' in the string 'hello my name is...'. The returned index is that of the space itself.
substr(help, 1, instr(help,' ') - 1) then takes the input string from the first character up to the index returned by instr(...). I subtract one so that the space isn't included.
For the nth occurrence, change this slightly:
instr(help,' ',1,n) is the position of the nth occurrence of ' ' counting from the first character. You also need the position of the previous space, instr(help,' ',1,n - 1): the nth word starts one character after it, and the difference between the two positions tells you how far to go in your substr(...). When n is 1 there is no previous space, so this breaks down and you have to deal with it, like so:
select substr( help
             , decode( n
                     , 1, 1
                     , instr(help, ' ', 1, n - 1) + 1
                     )
             , decode( n
                     , 1, instr(help, ' ', 1, n) - 1
                     , instr(help, ' ', 1, n) - instr(help, ' ', 1, n - 1) - 1
                     )
             )
  from ( select 'hello my name is...' as help
           from dual )
This will also break down for the last word, because there is no nth space and instr returns 0. As you can see this is getting ridiculous, so you might want to consider using regular expressions:
select regexp_substr(help, '[^[:space:]]+', 1, n )
from ( select 'hello my name is...' as help
from dual )
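The position arithmetic in this answer can be checked outside the database. The following is a minimal Python sketch, not Oracle code: `instr` here is a hypothetical helper that only imitates the 1-based, 0-when-missing behaviour of Oracle's INSTR for this purpose, and `nth_word` adds the two special cases discussed above (n = 1 and the last word).

```python
def instr(text: str, sub: str, start: int = 1, occurrence: int = 1) -> int:
    """1-based position of the given occurrence of sub, or 0 if there is none."""
    pos = start - 1  # switch to 0-based indexing for str.find
    for _ in range(occurrence):
        pos = text.find(sub, pos)
        if pos == -1:
            return 0
        pos += 1  # 1-based position of this match; also where the next search starts
    return pos


def nth_word(text: str, n: int) -> str:
    """Return the nth space-delimited word (1-based), or '' if there is none."""
    if n == 1:
        begin = 1
    else:
        previous_space = instr(text, " ", 1, n - 1)
        if previous_space == 0:
            return ""  # fewer than n words
        begin = previous_space + 1
    end = instr(text, " ", 1, n)
    if end == 0:
        end = len(text) + 1  # last word: no nth space, so run to the end
    return text[begin - 1:end - 1]


print(nth_word("hello my name is...", 2))
```

The explicit handling of a missing nth space is the part the decode version lacks, which is why that query returns nothing for the last word.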

Try this. An example of getting the 4th word:
select names from (
select
regexp_substr('I want my two dollars','[^ ]+', 1, level) as names,
rownum as nth
from dual
connect by regexp_substr('I want my two dollars', '[^ ]+', 1, level) is not null
)
where nth = 4;
The inner query is converting the space-delimited string into a set of rows. The outer query is grabbing the nth item from the set.

Try something like
WITH q AS (SELECT 'ABCD EFGH IJKL' AS A_STRING FROM DUAL)
SELECT SUBSTR(A_STRING, 1, INSTR(A_STRING, ' ')-1)
FROM q
Share and enjoy.
And here's the solution for the revised question:
WITH q AS (SELECT 'ABCD EFGH IJKL' AS A_STRING, 3 AS OCCURRENCE FROM DUAL)
SELECT SUBSTR(A_STRING,
CASE
WHEN OCCURRENCE=1 THEN 1
ELSE INSTR(A_STRING, ' ', 1, OCCURRENCE-1)+1
END,
CASE
WHEN INSTR(A_STRING, ' ', 1, OCCURRENCE) = 0 THEN LENGTH(A_STRING)
ELSE INSTR(A_STRING, ' ', 1, OCCURRENCE) - CASE
WHEN OCCURRENCE=1 THEN 0
ELSE INSTR(A_STRING, ' ', 1, OCCURRENCE-1)
END - 1
END)
FROM q;
Share and enjoy.

CREATE PROC spGetCharactersInAStrings
(
    @S VARCHAR(100) = '^1402 WSN NI^AMLAB^tev^e^^rtS htimS 0055518',
    @Char VARCHAR(100) = '8'
)
AS
-- exec spGetCharactersInAStrings '^1402 WSN NI^AMLAB^tev^e^^rtS htimS 0055518', '5'
BEGIN
    DECLARE @i INT = 1,
            @c INT,
            @pos INT = 0,
            @NewStr VARCHAR(100),
            @sql NVARCHAR(100),
            @ParmDefinition nvarchar(500) = N'@retvalOUT int OUTPUT'
    DECLARE @D TABLE
    (
        ID INT IDENTITY(1, 1),
        String VARCHAR(100),
        Position INT
    )
    SELECT @c = LEN(@S), @NewStr = @S
    WHILE @i <= @c
    BEGIN
        SET @sql = ''
        SET @sql = ' SELECT @retvalOUT = CHARINDEX(''' + @Char + ''',''' + @NewStr + ''')'
        EXEC sp_executesql @sql, @ParmDefinition, @retvalOUT=@i OUTPUT;
        IF @i > 0
        BEGIN
            set @pos = @pos + @i
            SELECT @NewStr = SUBSTRING(@NewStr, @i + 1, LEN(@S))
            --SELECT @NewStr '@NewStr', @Char '@Char', @pos '@pos', @sql '@sql'
            --SELECT @NewStr '@NewStr', @pos '@pos'
            INSERT INTO @D
            SELECT @NewStr, @pos
            SET @i = @i + 1
        END
        ELSE
            BREAK
    END
    SELECT * FROM @D
END

If you're using MySQL and cannot use the four-parameter instr function or regexp_substr, you can do it this way:
select substring_index(substring_index(help, ' ', 2), ' ', -1)
from (select 'hello my name is...' as help) h
Result: "my".
Replace "2" in the code above with the number of the word you want.
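To see why the nested call works, here is a small Python sketch. `substring_index` below is a hypothetical stand-in written only to imitate how MySQL's SUBSTRING_INDEX treats positive and negative counts; it is not a MySQL API. The inner call keeps everything up to the nth delimiter, and the outer call keeps only the last piece of that.

```python
def substring_index(text: str, delim: str, count: int) -> str:
    """Imitate SUBSTRING_INDEX: keep `count` pieces from the left (or right if negative)."""
    parts = text.split(delim)
    if count > 0:
        return delim.join(parts[:count])
    if count < 0:
        return delim.join(parts[count:])
    return ""


def nth_word(text: str, n: int) -> str:
    """Nested form of the query above: first n words, then the last of those."""
    return substring_index(substring_index(text, " ", n), " ", -1)


print(nth_word("hello my name is...", 2))
```

One consequence visible in this model: when n is larger than the number of words, the inner call returns the whole string, so the result is the last word rather than an empty value.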

If you are using SQL Server 2016 or later, you can take advantage of the STRING_SPLIT function. It returns one row per substring, and to pick the nth value you can number those rows with the ROW_NUMBER() window function.
There is a little trick here: ROW_NUMBER() requires an ORDER BY, but you don't really want to order by anything, so you "cheat" with ORDER BY (SELECT 1) and let the rows keep whatever order STRING_SPLIT() emits them in.
Below is a code snippet that finds the third word of the string:
DECLARE @_intPart INT = 3; -- change the nth word here; numbering starts at 1, not 0
SELECT value FROM (
    SELECT value,
           ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS rowno
    FROM STRING_SPLIT('hello world this is amazing', ' ')
) AS o1 WHERE o1.rowno = @_intPart;
You can also wrap this in a scalar function to retrieve values.

Related

Find the count of words in string

SQL: How to find the count of words in following example?
declare @s varchar(55) = 'How to find the count of words in this string ?'
Subquestions:
How to count spaces?
How to count double/triple/... spaces as one? answer by Gordon Linoff here
How to avoid counting of special characters? Example: 'Please , don't count this comma'
Is it possible without string_split function (because it's available only since SQL SERVER 2016)?
Summary with the best solutions HERE
Thanks to Gordon Linoff's answer here
SELECT replace(replace(replace(replace(@s,' ','<>'),'><',''),'<>',' '),' ',',')
OutPut
-------
How,to,find,the,count,of,words,in,this,string?
SELECT replace(replace(replace(replace(replace(@s,' ','<>'),'><',''),'<>',' '),' ',','),',','')
OutPut
------
Howtofindthecountofwordsinthisstring?
Now take the difference between the lengths of the two outputs and add 1 for the last word, like below.
declare @s varchar(55) = 'How to find the count of words in this string?'
SELECT len(replace(replace(replace(replace(@s,' ','<>'),'><',''),'<>',' '),' ',','))
     - len(replace(replace(replace(replace(replace(@s,' ','<>'),'><',''),'<>',' '),' ',','),',',''))
     + 1 AS WORD_COUNT
WORD_COUNT
----------
10
http://sqlfiddle.com/#!18/06c1d/5
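The chain of replaces is easier to follow step by step. The sketch below mirrors it in Python purely as an illustration; the function names are made up for this example. It shows how the '<>' trick collapses runs of spaces and how the length difference yields the count.

```python
def collapse_spaces(text: str) -> str:
    """Collapse runs of spaces with the same three replaces as the SQL above."""
    marked = text.replace(" ", "<>")      # every space becomes the marker <>
    squeezed = marked.replace("><", "")   # adjacent markers merge into one
    return squeezed.replace("<>", " ")    # the surviving marker becomes one space


def word_count(text: str) -> int:
    """Count words as (number of separators) + 1, as in the final query."""
    with_commas = collapse_spaces(text).replace(" ", ",")
    without_commas = with_commas.replace(",", "")
    return len(with_commas) - len(without_commas) + 1


print(word_count("How to find the count of words in this string?"))
```

Two assumptions are baked into this approach: the text contains no '<', '>' or ',' of its own, and it has no leading or trailing spaces. Either would skew the count.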
One method uses a recursive CTE:
declare @s varchar(55) = 'How to find the count of words in this string ?';
with cte as (
      select convert(varchar(max), '') as word,
             convert(varchar(max), ltrim(@s)) as rest
      union all
      select left(rest, patindex('%[ ]%', rest + ' ') - 1),
             ltrim(stuff(rest, 1, patindex('%[ ]%', rest + ' '), ''))
      from cte
      where rest <> ''
     )
select count(*)
from cte
where word not in ('', '?', ',')
--OPTION (MAXRECURSION 1000); -- use if number of words >99
;
Here is a db<>fiddle.
The first thing you need to do is collapse double, triple, or longer runs of spaces into a single space.
declare @str varchar(500) = 'dvdv sdd dfxdfd dfd'
select Replace(Replace(Replace( @str,' ',']['), '[]', ''), '][', ' ')
This removes all the unnecessary spaces between the words and leaves you with the cleaned-up text.
After that you can use string_split (SQL Server 2016 and above) to count the words in your text; that count minus 1 is your total number of spaces.
select count(value) - 1 from string_split( @str, ' ')
The final query looks like this:
declare @str varchar(500) = 'dvdv sdd dfxdfd dfd'
select count(value) - 1 from string_split( Replace(Replace(Replace( @str,' ',']['), '[]', ''), '][', ' '), ' ')
If you only need the word count and your SQL Server version supports STRING_SPLIT, you can use this simple script:
DECLARE @s VARCHAR(55) = 'How to find the count of words in this string ?'
SELECT
    COUNT(
        IIF(
            LTRIM(value)='',
            NULL,
            1
        )
    )
FROM STRING_SPLIT(@s, ' ')
WHERE value LIKE '%[0-9,A-z]%'
Using string_split (available only since SQL SERVER 2016):
declare @string varchar(55) = 'How to find the count of words in this string ?';
select count(*) WordCount from string_split(@string,' ') where value like '%[0-9A-Za-z]%'
The same idea is used in following answers:
https://stackoverflow.com/a/57783421/6165594
https://stackoverflow.com/a/57783743/6165594
Without using string_split:
declare @string varchar(55) = 'How to find the count of words in this string ?';
;with space as
( -- returns space positions in a string
    select cast( 0 as int) idx union all
    select cast(charindex(' ', @string, idx+1) as int) from space
    where charindex(' ', @string, idx+1)>0
)
select count(*) WordCount from space
where substring(@string,idx+1,charindex(' ',@string+' ',idx+1)-idx-1) like '%[0-9A-Za-z]%'
OPTION (MAXRECURSION 0);
The same idea is used in following answers:
https://stackoverflow.com/a/57787850/6165594
As Inline Function:
ALTER FUNCTION dbo.WordCount
(
    @string NVARCHAR(MAX)
  , @WordPattern NVARCHAR(MAX) = '%[0-9A-Za-z]%'
)
/*
Call Example:
    1) Word count for single string:
    select * from WordCount(N'How to find the count of words in this string ? ', default)
    2) Word count for set of strings:
    select *
    from (
        select 'How to find the count of words in this string ? ' as string union all
        select 'How many words in 2nd example?'
    ) x
    cross apply WordCount(x.string, default)
Limitations:
    If string contains >100 spaces function fails with error:
    Msg 530, Level 16, State 1, Line 45
    The statement terminated. The maximum recursion 100 has been exhausted before statement completion.
    NB! OPTION (MAXRECURSION 0); -- doesn't work within inline function
*/
RETURNS TABLE AS RETURN
(
    with space as
    ( -- returns space positions in a string
        select cast( 0 as int) idx union all
        select cast(charindex(' ', @string, idx+1) as int) from space
        where charindex(' ', @string, idx+1)>0
    )
    select count(*) WordCount from space
    where substring(@string,idx+1,charindex(' ',@string+' ',idx+1)-idx-1) like @WordPattern
    -- OPTION (MAXRECURSION 0); -- doesn't work within inline function
);
go

Split the query string with repetitive special characters using SQL

This is my string:
Declare @qstr as varchar(max)='hireteammember.aspx?empemail=kuldeep@asselsolutions.com&empid=376&empname=kuldeep&adminname=TMA1&term=5&teamid=161&contactid=614¥1&WP=100¥5¥Months&Amt=500&DueDay=5&StrDt=12/31/2013&MemCatg=Employees&StrTm=21:05&PlnHrs=5&WrkDays=true¥true¥true¥true¥true¥false¥false'
I want to extract the values of empid, empname, adminname, term, teamid, contactid, WP, Months, DueDay, StrDt, MemCatg, StrTm, PlnHrs and WrkDays and assign them to new variables.
I have used
select ( SUBSTRING(@qstr,CHARINDEX('=',@qstr)+1,CHARINDEX('&',@qstr)-CHARINDEX('=',@qstr)-1))
but I only get the value of 'empemail'. For the next occurrences of the special character '&' I am not able to get the values of the further terms, even if I use '&' instead of '='.
Please help me split the whole string.
How about using XML to split the values into rows, and then splitting them into columns?
Something like
Declare @qstr as varchar(max)='hireteammember.aspx?empemail=kuldeep@asselsolutions.com&empid=376&empname=kuldeep&adminname=TMA1&term=5&teamid=161&contactid=614¥1&WP=100¥5¥Months&Amt=500&DueDay=5&StrDt=12/31/2013&MemCatg=Employees&StrTm=21:05&PlnHrs=5&WrkDays=true¥true¥true¥true¥true¥false¥false'
DECLARE @str VARCHAR(MAX) = SUBSTRING(@qstr,CHARINDEX('?',@qstr,0) + 1, LEN(@qstr)-CHARINDEX('?',@qstr,0))
DECLARE @xml XML
SELECT @xml = CAST('<d>' + REPLACE(@str, '&', '</d><d>') + '</d>' AS XML)
;WITH Vals AS (
    SELECT T.split.value('.', 'nvarchar(max)') AS data
    FROM @xml.nodes('/d') T(split)
)
SELECT LEFT(data,CHARINDEX('=',data,0) - 1),
       RIGHT(data,LEN(data) - CHARINDEX('=',data,0))
FROM Vals
SQL Fiddle DEMO
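The two-stage idea in this answer (cut off everything up to '?', split on '&' into rows, then split each row at its first '=') can be sketched in a few lines of Python. This is only an illustration of the logic, with a made-up function name and a shortened example string, not a replacement for the T-SQL.

```python
def split_query_string(url: str) -> list[tuple[str, str]]:
    """Split 'page?k1=v1&k2=v2' into [(k1, v1), (k2, v2)]."""
    query = url.split("?", 1)[-1]  # text after the first '?', or everything if none
    pairs = []
    for chunk in query.split("&"):               # stage 1: one row per key=value chunk
        key, _, value = chunk.partition("=")     # stage 2: split at the first '='
        pairs.append((key, value))
    return pairs


for key, value in split_query_string("hireteammember.aspx?empid=376&empname=kuldeep&WP=100¥5¥Months"):
    print(key, value)
```

Splitting at the first '=' only means a value may itself contain '=' without being cut. In application code, Python's standard library offers `urllib.parse.parse_qsl` for the same job, which also decodes percent-encoding.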
CREATE FUNCTION dbo.SplitQueryString (@s varchar(8000))
RETURNS table
AS
RETURN (
    WITH splitter_cte AS (
        SELECT CHARINDEX('&', @s) as pos, 0 as lastPos
        UNION ALL
        SELECT CHARINDEX('&', @s, pos + 1), pos
        FROM splitter_cte
        WHERE pos > 0
    ),
    pair_cte AS (
        SELECT chunk,
               CHARINDEX('=', chunk) as pos
        FROM (
            SELECT SUBSTRING(@s, lastPos + 1,
                             case when pos = 0 then 80000
                                  else pos - lastPos -1 end) as chunk
            FROM splitter_cte) as t1
    )
    SELECT substring(chunk, 0, pos) as keyName,
           substring(chunk, pos+1, 8000) as keyValue
    FROM pair_cte
)
GO
declare @queryString varchar(2048)
set @queryString = 'foo=bar&temp=baz&key=value';
SELECT *
FROM dbo.SplitQueryString(@queryString)
OPTION(MAXRECURSION 0);
when run produces the following output.
keyName keyValue
------- --------
foo bar
temp baz
key value
(3 row(s) affected)
I believe that this will do exactly what you are asking.
SQL FIDDLE DEMO
If the order of the values in the query string stays the same, I would suggest searching for the whole parameter name, like
select ( SUBSTRING(@qstr,CHARINDEX('empemail=',@qstr)+9,CHARINDEX('&empid=',@qstr)-CHARINDEX('empemail=',@qstr)-9))
If you are still looking for the nth occurrence then refer to this link
Declare @qstr as varchar(max)='hireteammember.aspx?empemail=kuldeep@asselsolutions.com&empid=376&empname=kuldeep&adminname=TMA1&term=5&teamid=161&contactid=614¥1&WP=100¥5¥Months&Amt=500&DueDay=5&StrDt=12/31/2013&MemCatg=Employees&StrTm=21:05&PlnHrs=5&WrkDays=true¥true¥true¥true¥true¥false¥false'
(select ( SUBSTRING(@qstr,CHARINDEX('&empname=',@qstr)+1,CHARINDEX('&adminname=',@qstr)-CHARINDEX('&empname=',@qstr)-1)))
(select ( SUBSTRING(@qstr,CHARINDEX('?empemail=',@qstr)+1,CHARINDEX('&empid=',@qstr)-CHARINDEX('?empemail=',@qstr)-1)))
This is how I split and updated the whole string. Thank you all for your answers; they helped me solve this.

Using PATINDEX to find varying length patterns in T-SQL

I'm looking to pull floats out of some varchars, using PATINDEX() to spot them. I know in each varchar string, I'm only interested in the first float that exists, but they might have different lengths.
e.g.
'some text 456.09 other text'
'even more text 98273.453 la la la'
I would normally match these with a regex
"[0-9]+[.][0-9]+"
However, I can't find an equivalent for the + operator, which PATINDEX accepts. So they would need to be matched (respectively) with:
'[0-9][0-9][0-9].[0-9][0-9]' and '[0-9][0-9][0-9][0-9][0-9].[0-9][0-9][0-9]'
Is there any way to match both of these example varchars with one single valid PATINDEX pattern?
I blogged about this a while ago.
Extracting numbers with SQL server
Declare @Temp Table(Data VarChar(100))
Insert Into @Temp Values('some text 456.09 other text')
Insert Into @Temp Values('even more text 98273.453 la la la')
Insert Into @Temp Values('There are no numbers in this one')
Select Left(
           SubString(Data, PatIndex('%[0-9.-]%', Data), 8000),
           PatIndex('%[^0-9.-]%', SubString(Data, PatIndex('%[0-9.-]%', Data), 8000) + 'X')-1)
From @Temp
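The query above uses two searches instead of one variable-length pattern: the first finds where a run of "number characters" starts, the second finds where it stops. Below is a minimal Python sketch of that same two-step logic, with hypothetical names, to make the behaviour (and its limits) easy to test.

```python
NUMBER_CHARS = set("0123456789.-")  # same character class as [0-9.-] in the query


def first_number_run(text: str) -> str:
    """Return the first contiguous run of digits, dots and minus signs, or ''."""
    # Step 1: position of the first character that belongs to the class.
    start = next((i for i, ch in enumerate(text) if ch in NUMBER_CHARS), None)
    if start is None:
        return ""
    tail = text[start:]
    # Step 2: position of the first character in the tail that does not belong to it.
    end = next((i for i, ch in enumerate(tail) if ch not in NUMBER_CHARS), len(tail))
    return tail[:end]


print(first_number_run("even more text 98273.453 la la la"))
```

Because the class also contains '.' and '-', ordinary punctuation counts as the start of a "number": a string such as 'see fig. 3' yields '.' rather than '3'. The SQL version shares this behaviour, so the result may need validating before it is cast.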
Wildcards.
SELECT PATINDEX('%[0-9]%[0-9].[0-9]%[0-9]%','some text 456.09 other text')
SELECT PATINDEX('%[0-9]%[0-9].[0-9]%[0-9]%','even more text 98273.453 la la la')
Yes you need to link to the clr to get regex support. But if PATINDEX does not do what you need then regex was designed exactly for that.
http://msdn.microsoft.com/en-us/magazine/cc163473.aspx
This should be checked for robustness (what if you only have an int, for example), but it is just to put you on track:
if exists (select routine_name from information_schema.routines where routine_name = 'GetFirstFloat')
    drop function GetFirstFloat
go
create function GetFirstFloat (@string varchar(max))
returns float
as
begin
    declare @float varchar(max)
    declare @pos int
    select @pos = patindex('%[0-9]%', @string)
    select @float = ''
    while isnumeric(substring(@string, @pos, 1)) = 1
    begin
        select @float = @float + substring(@string, @pos, 1)
        select @pos = @pos + 1
    end
    return cast(@float as float)
end
go
select dbo.GetFirstFloat('this is a string containing pi 3.14159216 and another non float 3 followed by a new float 5.41 and that''s it')
select dbo.GetFirstFloat('this is a string with no float')
select dbo.GetFirstFloat('this is another string with an int 3')
Given that the pattern is going to vary in length, you're going to have a rough time getting this to work with PATINDEX. There is another post that I wrote, which I've modified to accomplish what you're trying to do here. Will this work for you?
CREATE TABLE #nums (n INT)
DECLARE @i INT
SET @i = 1
WHILE @i < 8000
BEGIN
    INSERT #nums VALUES(@i)
    SET @i = @i + 1
END
CREATE TABLE #tmp (
id INT IDENTITY(1,1) not null,
words VARCHAR(MAX) null
)
INSERT INTO #tmp
VALUES('I''m looking for a number, regardless of length, even 23.258 long'),('Maybe even pi which roughly 3.14159265358,'),('or possibly something else that isn''t a number')
UPDATE #tmp SET words = REPLACE(words, ',',' ')
;WITH CTE AS (SELECT ROW_NUMBER() OVER (ORDER BY ID) AS rownum, ID, NULLIF(SUBSTRING(' ' + words + ' ' , n , CHARINDEX(' ' , ' ' + words + ' ' , n) - n) , '') AS word
FROM #nums, #tmp
WHERE ID <= LEN(' ' + words + ' ') AND SUBSTRING(' ' + words + ' ' , n - 1, 1) = ' '
AND CHARINDEX(' ' , ' ' + words + ' ' , n) - n > 0),
ids AS (SELECT ID, MIN(rownum) AS rownum FROM CTE WHERE ISNUMERIC(word) = 1 GROUP BY id)
SELECT CTE.rownum, cte.id, cte.word
FROM CTE, ids WHERE cte.id = ids.id AND cte.rownum = ids.rownum
The explanation and origin of the code are covered in more detail in the original post.
PATINDEX is not powerful enough to do that. You should use regular expressions.
SQL Server has supported regular expressions through CLR integration since SQL Server 2005.

SQL: problem word count with len()

I am trying to count the words of a text stored in a table column. Therefore I am using the following query.
SELECT LEN(ExtractedText) -
LEN(REPLACE(ExtractedText, ' ', '')) + 1 from EDDSDBO.Document where ID='100'
I receive a wrong result that is much too high.
On the other hand, if I copy the text directly into the statement then it works, i.e.
SELECT LEN('blablabla text') - LEN(REPLACE('blablabla text', ' ', '')) + 1
Now the datatype is nvarchar(max) since the text is very long. I have already tried converting the column to text or ntext and applying datalength() instead of len(). Nevertheless I get the same result: it works for a string literal but not for the table column.
You're counting spaces not words. That will typically yield an approximate answer.
e.g.
' this string will give an incorrect result '
Try this approach: http://www.sql-server-helper.com/functions/count-words.aspx
CREATE FUNCTION [dbo].[WordCount] ( @InputString VARCHAR(4000) )
RETURNS INT
AS
BEGIN
    DECLARE @Index INT
    DECLARE @Char CHAR(1)
    DECLARE @PrevChar CHAR(1)
    DECLARE @WordCount INT
    SET @Index = 1
    SET @WordCount = 0
    WHILE @Index <= LEN(@InputString)
    BEGIN
        SET @Char = SUBSTRING(@InputString, @Index, 1)
        SET @PrevChar = CASE WHEN @Index = 1 THEN ' '
                             ELSE SUBSTRING(@InputString, @Index - 1, 1)
                        END
        IF @PrevChar = ' ' AND @Char != ' '
            SET @WordCount = @WordCount + 1
        SET @Index = @Index + 1
    END
    RETURN @WordCount
END
GO
Usage:
DECLARE @String VARCHAR(4000)
SET @String = 'Health Insurance is an insurance against expenses incurred through illness of the insured.'
SELECT [dbo].[WordCount] ( @String )
Leading spaces, trailing spaces, two or more spaces between the neighbouring words – these are the likely causes of the wrong results you are getting.
The functions LTRIM() and RTRIM() can help you eliminate the first two issues. As for the third one, you can use REPLACE(ExtractedText, '  ', ' ') to replace double spaces with single ones, but I'm not sure whether you also have triple ones (in which case you'd need to repeat the replacement).
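As a rough illustration of "trim, then repeat the replacement until no double spaces remain", here is a small Python sketch. The function name is invented for this example, and the loop stands in for however many nested REPLACE calls the longest run of spaces would require.

```python
def count_words(text: str) -> int:
    """Count space-separated words after trimming and collapsing repeated spaces."""
    cleaned = text.strip(" ")                  # LTRIM + RTRIM
    while "  " in cleaned:                     # repeat until no double space is left
        cleaned = cleaned.replace("  ", " ")
    if not cleaned:
        return 0                               # empty input has no words
    # Same formula as the question: spaces + 1, now safe to apply.
    return len(cleaned) - len(cleaned.replace(" ", "")) + 1


print(count_words("  this   string will give  an incorrect result  "))
```

Each pass roughly halves the longest run of spaces, so a handful of passes is enough even for long runs.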
UPDATE
Here's a UDF that uses CTEs and ranking to eliminate extra spaces and then counts the remaining ones to return the quantity as the number of words:
CREATE FUNCTION fnCountWords (@Str varchar(max))
RETURNS int
AS BEGIN
    DECLARE @xml xml, @res int;
    SET @Str = RTRIM(LTRIM(@Str));
    WITH split AS (
        SELECT
            idx = number,
            chr = SUBSTRING(@Str, number, 1)
        FROM master..spt_values
        WHERE type = 'P'
          AND number BETWEEN 1 AND LEN(@Str)
    ),
    ranked AS (
        SELECT
            idx,
            chr,
            rnk = idx - ROW_NUMBER() OVER (PARTITION BY chr ORDER BY idx)
        FROM split
    )
    SELECT @res = COUNT(DISTINCT rnk) + 1
    FROM ranked
    WHERE chr = ' ';
    RETURN @res;
END
With this function your query becomes simply this (a scalar UDF has to be called with its schema name):
SELECT dbo.fnCountWords(ExtractedText)
FROM EDDSDBO.Document
WHERE ID='100'
UPDATE 2
The function uses one of the system tables, master..spt_values, as a tally table. The particular subset used contains only values from 0 to 2047. This means the function will not work correctly for inputs longer than 2047 characters (after trimming both leading and trailing spaces), as @t-clausen.dk has correctly noted in his comment. Therefore, a custom tally table should be used if longer input strings are possible.
Replace each space with a space followed by something that never occurs in your text, like ' $!' (or pick another value).
Then replace every '$! ' and every '$!' with nothing; this way you never have more than one space between words. Then use your current script. I have defined a word as a space followed by a non-space.
This is an example
DECLARE @T TABLE(COL1 NVARCHAR(2000), ID INT)
INSERT @T VALUES('A B C D', 100)
SELECT LEN(C) - LEN(REPLACE(C,' ', '')) COUNT FROM (
    SELECT REPLACE(REPLACE(REPLACE(' ' + COL1, ' ', ' $!'), '$! ',''), '$!', '') C
    FROM @T ) A
Here is a recursive solution
DECLARE @T TABLE(COL1 NVARCHAR(2000), ID INT)
INSERT @T VALUES('A B C D', 100)
INSERT @T VALUES('have a nice day with 7 words', 100)
;WITH CTE AS
(
    SELECT 1 words, col1 c, col1 FROM @T WHERE id = 100
    UNION ALL
    SELECT words +1, right(c, len(c) - patindex('% [^ ]%', c)), col1 FROM cte
    WHERE patindex('% [^ ]%', c) > 0
)
SELECT words, col1 FROM cte WHERE patindex('% [^ ]%', c) = 0
You should declare the column using the varchar data type, like:
create table emp(ename varchar(22));
insert into emp values('amit');
select ename,len(ename) from emp;
output : 4

How do I split a delimited string so I can access individual items?

Using SQL Server, how do I split a string so I can access item x?
Take a string "Hello John Smith". How can I split the string by space and access the item at index 1 which should return "John"?
I don't believe SQL Server has a built-in split function, so other than a UDF, the only other answer I know is to hijack the PARSENAME function:
SELECT PARSENAME(REPLACE('Hello John Smith', ' ', '.'), 2)
PARSENAME takes a string and splits it on the period character. It takes a number as its second argument, and that number specifies which segment of the string to return (working from back to front).
SELECT PARSENAME(REPLACE('Hello John Smith', ' ', '.'), 3) --return Hello
Obvious problem is when the string already contains a period. I still think using a UDF is the best way...any other suggestions?
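The back-to-front numbering is the part that tends to surprise people, so here is a tiny Python model of it. `parsename` below is a hypothetical imitation written for this example; it assumes, as the four-part object-name design of PARSENAME implies, that inputs with more than four parts yield no result.

```python
from typing import Optional


def parsename(object_name: str, piece: int) -> Optional[str]:
    """Imitate PARSENAME: pieces are numbered from the right, at most four parts."""
    parts = object_name.split(".")
    if len(parts) > 4 or not 1 <= piece <= 4 or piece > len(parts):
        return None  # stands in for SQL NULL
    return parts[-piece]


def nth_word(text: str, n: int) -> Optional[str]:
    """1-based word from the left, converted to PARSENAME's right-based index."""
    dotted = text.replace(" ", ".")
    total = dotted.count(".") + 1
    return parsename(dotted, total - n + 1)


print(nth_word("Hello John Smith", 2))
```

Under that assumption the trick only works for strings of up to four words, in addition to the problem with existing periods noted above.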
You may find the solution in SQL User Defined Function to Parse a Delimited String helpful (from The Code Project).
You can use this simple logic:
Declare @products varchar(200) = '1|20|3|343|44|6|8765'
Declare @individual varchar(20) = null
WHILE LEN(@products) > 0
BEGIN
    IF PATINDEX('%|%', @products) > 0
    BEGIN
        SET @individual = SUBSTRING(@products,
                                    0,
                                    PATINDEX('%|%', @products))
        SELECT @individual
        SET @products = SUBSTRING(@products,
                                  LEN(@individual + '|') + 1,
                                  LEN(@products))
    END
    ELSE
    BEGIN
        SET @individual = @products
        SET @products = NULL
        SELECT @individual
    END
END
First, create a function (using CTE, common table expression does away with the need for a temp table)
create function dbo.SplitString
(
    @str nvarchar(4000),
    @separator char(1)
)
returns table
AS
return (
    with tokens(p, a, b) AS (
        select
            1,
            1,
            charindex(@separator, @str)
        union all
        select
            p + 1,
            b + 1,
            charindex(@separator, @str, b + 1)
        from tokens
        where b > 0
    )
    select
        p-1 zeroBasedOccurance,
        substring(
            @str,
            a,
            case when b > 0 then b-a ELSE 4000 end)
        AS s
    from tokens
)
GO
Then, use it as any table (or modify it to fit within your existing stored proc) like this.
select s
from dbo.SplitString('Hello John Smith', ' ')
where zeroBasedOccurance=1
Update
Previous version would fail for input string longer than 4000 chars. This version takes care of the limitation:
create function dbo.SplitString
(
    @str nvarchar(max),
    @separator char(1)
)
returns table
AS
return (
    with tokens(p, a, b) AS (
        select
            cast(1 as bigint),
            cast(1 as bigint),
            charindex(@separator, @str)
        union all
        select
            p + 1,
            b + 1,
            charindex(@separator, @str, b + 1)
        from tokens
        where b > 0
    )
    select
        p-1 ItemIndex,
        substring(
            @str,
            a,
            case when b > 0 then b-a ELSE LEN(@str) end)
        AS s
    from tokens
);
GO
Usage remains the same.
Most of the solutions here use while loops or recursive CTEs. A set-based approach will be superior, I promise, if you can use a delimiter other than a space:
CREATE FUNCTION [dbo].[SplitString]
(
    @List NVARCHAR(MAX),
    @Delim VARCHAR(255)
)
RETURNS TABLE
AS
    RETURN ( SELECT [Value], idx = RANK() OVER (ORDER BY n) FROM
      (
        SELECT n = Number,
          [Value] = LTRIM(RTRIM(SUBSTRING(@List, [Number],
          CHARINDEX(@Delim, @List + @Delim, [Number]) - [Number])))
        FROM (SELECT Number = ROW_NUMBER() OVER (ORDER BY name)
          FROM sys.all_objects) AS x
          WHERE Number <= LEN(@List)
          AND SUBSTRING(@Delim + @List, [Number], LEN(@Delim)) = @Delim
      ) AS y
    );
Sample usage:
SELECT Value FROM dbo.SplitString('foo,bar,blat,foo,splunge',',')
WHERE idx = 3;
Results:
----
blat
You could also add the idx you want as an argument to the function, but I'll leave that as an exercise to the reader.
You can't do this with just the native STRING_SPLIT function added in SQL Server 2016, because there is no guarantee that the output will be rendered in the order of the original list. In other words, if you pass in 3,6,1 the result will likely be in that order, but it could be 1,3,6. I have asked for the community's help in improving the built-in function here:
Please help with STRING_SPLIT improvements
With enough qualitative feedback, they may actually consider making some of these enhancements:
STRING_SPLIT is not feature complete
More on split functions, why (and proof that) while loops and recursive CTEs don't scale, and better alternatives, if splitting strings coming from the application layer:
Split strings the right way – or the next best way
Splitting Strings : A Follow-Up
Splitting Strings : Now with less T-SQL
Comparing string splitting / concatenation methods
Processing a list of integers : my approach
Splitting a list of integers : another roundup
More on splitting lists : custom delimiters, preventing duplicates, and maintaining order
Removing Duplicates from Strings in SQL Server
On SQL Server 2016 or above, though, you should look at STRING_SPLIT() and STRING_AGG():
Performance Surprises and Assumptions : STRING_SPLIT()
STRING_SPLIT() in SQL Server 2016 : Follow-Up #1
STRING_SPLIT() in SQL Server 2016 : Follow-Up #2
SQL Server v.Next : STRING_AGG() performance
Solve old problems with SQL Server’s new STRING_AGG and STRING_SPLIT functions
You can leverage a Number table to do the string parsing.
Create a physical numbers table:
create table dbo.Numbers (N int primary key);
insert into dbo.Numbers
select top 1000 row_number() over(order by number) from master..spt_values
go
Create test table with 1000000 rows
create table #yak (i int identity(1,1) primary key, array varchar(50))
insert into #yak(array)
select 'a,b,c' from dbo.Numbers n cross join dbo.Numbers nn
go
Create the function
create function [dbo].[ufn_ParseArray]
    ( @Input nvarchar(4000),
      @Delimiter char(1) = ',',
      @BaseIdent int
    )
returns table as
return
    ( select row_number() over (order by n asc) + (@BaseIdent - 1) [i],
             substring(@Input, n, charindex(@Delimiter, @Input + @Delimiter, n) - n) s
      from dbo.Numbers
      where n <= convert(int, len(@Input)) and
            substring(@Delimiter + @Input, n, 1) = @Delimiter
    )
go
Usage (outputs 3mil rows in 40s on my laptop)
select *
from #yak
cross apply dbo.ufn_ParseArray(array, ',', 1)
cleanup
drop table dbo.Numbers;
drop function [dbo].[ufn_ParseArray]
Performance here is not amazing, but calling a function over a million row table is not the best idea. If performing a string split over many rows I would avoid the function.
This question is not about a string split approach, but about how to get the nth element.
All answers here are doing some kind of string splitting using recursion, CTEs, multiple CHARINDEX, REVERSE and PATINDEX, inventing functions, call for CLR methods, number tables, CROSS APPLYs ... Most answers cover many lines of code.
But - if you really want nothing more than an approach to get the nth element - this can be done as real one-liner, no UDF, not even a sub-select... And as an extra benefit: type safe
Get part 2 delimited by a space:
DECLARE @input NVARCHAR(100)=N'part1 part2 part3';
SELECT CAST(N'<x>' + REPLACE(@input,N' ',N'</x><x>') + N'</x>' AS XML).value('/x[2]','nvarchar(max)')
Of course you can use variables for delimiter and position (use sql:column to retrieve the position directly from a query's value):
DECLARE @dlmt NVARCHAR(10)=N' ';
DECLARE @pos INT = 2;
SELECT CAST(N'<x>' + REPLACE(@input,@dlmt,N'</x><x>') + N'</x>' AS XML).value('/x[sql:variable("@pos")][1]','nvarchar(max)')
If your string might include forbidden characters (especially one among &><), you still can do it this way. Just use FOR XML PATH on your string first to replace all forbidden characters with the fitting escape sequence implicitly.
It's a very special case if - additionally - your delimiter is the semicolon. In this case I replace the delimiter first to '#DLMT#', and replace this to the XML tags finally:
SET @input=N'Some <, > and &;Other äöü@€;One more';
SET @dlmt=N';';
SELECT CAST(N'<x>' + REPLACE((SELECT REPLACE(@input,@dlmt,'#DLMT#') AS [*] FOR XML PATH('')),N'#DLMT#',N'</x><x>') + N'</x>' AS XML).value('/x[sql:variable("@pos")][1]','nvarchar(max)');
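The order of operations in that last statement matters: swap the delimiter for a placeholder, escape the forbidden characters, and only then turn the placeholder into tags. A minimal Python sketch of the same sequence follows, using the standard library's XML tools. The function name is made up, and the extra `<r>` wrapper is an assumption of this sketch only, needed because ElementTree requires a single root element.

```python
import xml.etree.ElementTree as ET
from typing import Optional
from xml.sax.saxutils import escape


def nth_part_via_xml(text: str, delimiter: str, position: int) -> Optional[str]:
    """Return the 1-based nth part of text by building and querying a small XML document."""
    placeholder = "#DLMT#"  # assumed never to occur in the input itself
    # 1) Protect the delimiter, 2) escape &, < and >, 3) turn the placeholder into tags.
    escaped = escape(text.replace(delimiter, placeholder))
    xml_text = "<r><x>" + escaped.replace(placeholder, "</x><x>") + "</x></r>"
    items = ET.fromstring(xml_text).findall("x")
    if 1 <= position <= len(items):
        return items[position - 1].text or ""
    return None


print(nth_part_via_xml("Some <, > and &;Other äöü@€;One more", ";", 2))
```

If the delimiter were replaced with tags after escaping but without the placeholder step, the semicolons inside `&lt;`, `&gt;` and `&amp;` would be treated as delimiters, which is the special case the answer describes.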
UPDATE for SQL-Server 2016+
Regretfully the developers forgot to return the part's index with STRING_SPLIT. But, using SQL-Server 2016+, there is JSON_VALUE and OPENJSON.
With JSON_VALUE we can pass in the position as the array index.
For OPENJSON the documentation states clearly:
When OPENJSON parses a JSON array, the function returns the indexes of the elements in the JSON text as keys.
A string like 1,2,3 needs nothing more than brackets: [1,2,3].
A string of words like this is an example needs to be ["this","is","an","example"].
These are very easy string operations. Just try it out:
DECLARE @str VARCHAR(100)='Hello John Smith';
DECLARE @position INT = 2;
--We can build the json-path '$[1]' using CONCAT
SELECT JSON_VALUE('["' + REPLACE(@str,' ','","') + '"]',CONCAT('$[',@position-1,']'));
--See this for a position safe string-splitter (zero-based):
SELECT JsonArray.[key] AS [Position]
      ,JsonArray.[value] AS [Part]
FROM OPENJSON('["' + REPLACE(@str,' ','","') + '"]') JsonArray
In this post I tested various approaches and found, that OPENJSON is really fast. Even much faster than the famous "delimitedSplit8k()" method...
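The string manipulation behind the JSON trick is simple enough to try outside SQL Server. The Python sketch below (hypothetical function name, standard `json` module) builds the same array text with one replace and then indexes into it, converting the 1-based position to the zero-based index that JSON arrays use.

```python
import json
from typing import Optional


def nth_part_via_json(text: str, delimiter: str, position: int) -> Optional[str]:
    """Return the 1-based nth part by turning 'a b c' into the JSON text ["a","b","c"]."""
    json_array = '["' + text.replace(delimiter, '","') + '"]'
    parts = json.loads(json_array)
    if 1 <= position <= len(parts):
        return parts[position - 1]  # JSON arrays are zero-based
    return None


print(nth_part_via_json("Hello John Smith", " ", 2))
```

This sketch assumes the input contains no double quotes or backslashes; either would produce invalid JSON. In T-SQL, STRING_ESCAPE(@str, 'json') can be applied before the replace to cover that case.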
UPDATE 2 - Get the values type-safe
We can use an array within an array simply by using doubled [[]]. This allows for a typed WITH-clause:
DECLARE @SomeDelimitedString VARCHAR(100)='part1|1|20190920';
DECLARE @JsonArray NVARCHAR(MAX)=CONCAT('[["',REPLACE(@SomeDelimitedString,'|','","'),'"]]');
SELECT @SomeDelimitedString AS TheOriginal
      ,@JsonArray AS TransformedToJSON
      ,ValuesFromTheArray.*
FROM OPENJSON(@JsonArray)
WITH(TheFirstFragment VARCHAR(100) '$[0]'
    ,TheSecondFragment INT '$[1]'
    ,TheThirdFragment DATE '$[2]') ValuesFromTheArray
Here is a UDF which will do it. It will return a table of the delimited values, haven't tried all scenarios on it but your example works fine.
CREATE FUNCTION SplitString
(
    -- Add the parameters for the function here
    @myString varchar(500),
    @deliminator varchar(10)
)
RETURNS
@ReturnTable TABLE
(
    -- Add the column definitions for the TABLE variable here
    [id] [int] IDENTITY(1,1) NOT NULL,
    [part] [varchar](50) NULL
)
AS
BEGIN
    Declare @iSpaces int
    Declare @part varchar(50)
    --initialize spaces
    Select @iSpaces = charindex(@deliminator,@myString,0)
    While @iSpaces > 0
    Begin
        Select @part = substring(@myString,0,charindex(@deliminator,@myString,0))
        Insert Into @ReturnTable(part)
        Select @part
        Select @myString = substring(@mystring,charindex(@deliminator,@myString,0)+ len(@deliminator),len(@myString) - charindex(' ',@myString,0))
        Select @iSpaces = charindex(@deliminator,@myString,0)
    end
    If len(@myString) > 0
        Insert Into @ReturnTable
        Select @myString
    RETURN
END
GO
You would call it like this:
Select * From SplitString('Hello John Smith',' ')
Edit: Updated the solution to handle delimiters with a length > 1, as in:
select * From SplitString('Hello**John**Smith','**')
Here I post a simple way of solution
CREATE FUNCTION [dbo].[split](
#delimited NVARCHAR(MAX),
#delimiter NVARCHAR(100)
) RETURNS #t TABLE (id INT IDENTITY(1,1), val NVARCHAR(MAX))
AS
BEGIN
DECLARE #xml XML
SET #xml = N'<t>' + REPLACE(#delimited,#delimiter,'</t><t>') + '</t>'
INSERT INTO #t(val)
SELECT r.value('.','nvarchar(MAX)') as item
FROM #xml.nodes('/t') as records(r)
RETURN
END
Execute the function like this
select * from dbo.split('Hello John Smith',' ')
In my opinion you guys are making it way too complicated. Just create a CLR UDF and be done with it.
using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
using System.Collections.Generic;
public partial class UserDefinedFunctions {
[SqlFunction]
public static SqlString SearchString(string Search) {
List<string> SearchWords = new List<string>();
foreach (string s in Search.Split(new char[] { ' ' })) {
if (!s.ToLower().Equals("or") && !s.ToLower().Equals("and")) {
SearchWords.Add(s);
}
}
return new SqlString(string.Join(" OR ", SearchWords.ToArray()));
}
};
What about using a dynamic SQL string and a VALUES() clause?
DECLARE #str varchar(max)
SET #str = 'Hello John Smith'
DECLARE #separator varchar(max)
SET #separator = ' '
DECLARE #Splited TABLE(id int IDENTITY(1,1), item varchar(max))
SET #str = REPLACE(#str, #separator, '''),(''')
SET #str = 'SELECT * FROM (VALUES(''' + #str + ''')) AS V(A)'
INSERT INTO #Splited
EXEC(#str)
SELECT * FROM #Splited
Result-set achieved.
id item
1 Hello
2 John
3 Smith
I used frederic's answer, but it did not work in SQL Server 2005.
I modified it to use SELECT with UNION ALL, and it works:
DECLARE #str varchar(max)
SET #str = 'Hello John Smith how are you'
DECLARE #separator varchar(max)
SET #separator = ' '
DECLARE #Splited table(id int IDENTITY(1,1), item varchar(max))
SET #str = REPLACE(#str, #separator, ''' UNION ALL SELECT ''')
SET #str = ' SELECT ''' + #str + ''' '
INSERT INTO #Splited
EXEC(#str)
SELECT * FROM #Splited
And the result-set is:
id item
1 Hello
2 John
3 Smith
4 how
5 are
6 you
This pattern works fine and can be generalized:
Convert(xml,'<n>'+Replace(FIELD,'.','</n><n>')+'</n>').value('(/n[INDEX])','TYPE')
Replace FIELD, INDEX and TYPE with your own column, position and data type.
Suppose a table holds identifiers like
sys.message.1234.warning.A45
sys.message.1235.error.O98
....
Then, you can write
SELECT Source = q.value('(/n[1])', 'varchar(10)'),
RecordType = q.value('(/n[2])', 'varchar(20)'),
RecordNumber = q.value('(/n[3])', 'int'),
Status = q.value('(/n[4])', 'varchar(5)')
FROM (
SELECT q = Convert(xml,'<n>'+Replace(fieldName,'.','</n><n>')+'</n>')
FROM some_TABLE
) Q
splitting and casting all parts.
Yet another function to get the n'th part of a string by delimiter:
create function GetStringPartByDelimeter (
#value as nvarchar(max),
#delimeter as nvarchar(max),
#position as int
) returns NVARCHAR(MAX)
AS BEGIN
declare #startPos as int
declare #endPos as int
set #endPos = -1
while (#position > 0 and #endPos != 0) begin
set #startPos = #endPos + 1
set #endPos = charindex(#delimeter, #value, #startPos)
if(#position = 1) begin
if(#endPos = 0)
set #endPos = len(#value) + 1
return substring(#value, #startPos, #endPos - #startPos)
end
set #position = #position - 1
end
return null
end
and the usage:
select dbo.GetStringPartByDelimeter ('a;b;c;d;e', ';', 3)
which returns:
c
If your database has compatibility level of 130 or higher then you can use the STRING_SPLIT function along with OFFSET FETCH clauses to get the specific item by index.
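To get the item at index N (zero based), you can use the following code. Be aware that STRING_SPLIT is not documented to return rows in input order, so ORDER BY (SELECT NULL) relies on behavior that is not guaranteed: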
SELECT value
FROM STRING_SPLIT('Hello John Smith',' ')
ORDER BY (SELECT NULL)
OFFSET N ROWS
FETCH NEXT 1 ROWS ONLY
To check the compatibility level of your database, execute this code:
SELECT compatibility_level
FROM sys.databases WHERE name = 'YourDBName';
Try this:
CREATE function [SplitWordList]
(
#list varchar(8000)
)
returns #t table
(
Word varchar(50) not null,
Position int identity(1,1) not null
)
as begin
declare
#pos int,
#lpos int,
#item varchar(100),
#ignore varchar(100),
#dl int,
#a1 int,
#a2 int,
#z1 int,
#z2 int,
#n1 int,
#n2 int,
#c varchar(1),
#a smallint
select
#a1 = ascii('a'),
#a2 = ascii('A'),
#z1 = ascii('z'),
#z2 = ascii('Z'),
#n1 = ascii('0'),
#n2 = ascii('9')
set #ignore = '''"'
set #pos = 1
set #dl = datalength(#list)
set #lpos = 1
set #item = ''
while (#pos <= #dl) begin
set #c = substring(#list, #pos, 1)
if (#ignore not like '%' + #c + '%') begin
set #a = ascii(#c)
if ((#a >= #a1) and (#a <= #z1))
or ((#a >= #a2) and (#a <= #z2))
or ((#a >= #n1) and (#a <= #n2))
begin
set #item = #item + #c
end else if (#item > '') begin
insert into #t values (#item)
set #item = ''
end
end
set #pos = #pos + 1
end
if (#item > '') begin
insert into #t values (#item)
end
return
end
Test it like this:
select * from SplitWordList('Hello John Smith')
I was looking for a solution on the net, and the function below works for me.
You call the function like this:
SELECT * FROM dbo.split('ram shyam hari gopal',' ')
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE FUNCTION [dbo].[Split](#String VARCHAR(8000), #Delimiter CHAR(1))
RETURNS #temptable TABLE (items VARCHAR(8000))
AS
BEGIN
DECLARE #idx INT
DECLARE #slice VARCHAR(8000)
SELECT #idx = 1
IF len(#String)<1 OR #String IS NULL RETURN
WHILE #idx!= 0
BEGIN
SET #idx = charindex(#Delimiter,#String)
IF #idx!=0
SET #slice = LEFT(#String,#idx - 1)
ELSE
SET #slice = #String
IF(len(#slice)>0)
INSERT INTO #temptable(Items) VALUES(#slice)
SET #String = RIGHT(#String,len(#String) - #idx)
IF len(#String) = 0 break
END
RETURN
END
The following example uses a recursive CTE
Update 18.09.2013
CREATE FUNCTION dbo.SplitStrings_CTE(#List nvarchar(max), #Delimiter nvarchar(1))
RETURNS #returns TABLE (val nvarchar(max), [level] int, PRIMARY KEY CLUSTERED([level]))
AS
BEGIN
;WITH cte AS
(
SELECT SUBSTRING(#List, 0, CHARINDEX(#Delimiter, #List + #Delimiter)) AS val,
CAST(STUFF(#List + #Delimiter, 1, CHARINDEX(#Delimiter, #List + #Delimiter), '') AS nvarchar(max)) AS stval,
1 AS [level]
UNION ALL
SELECT SUBSTRING(stval, 0, CHARINDEX(#Delimiter, stval)),
CAST(STUFF(stval, 1, CHARINDEX(#Delimiter, stval), '') AS nvarchar(max)),
[level] + 1
FROM cte
WHERE stval != ''
)
INSERT #returns
SELECT REPLACE(val, ' ','' ) AS val, [level]
FROM cte
WHERE val > ''
RETURN
END
Alter Function dbo.fn_Split
(
#Expression nvarchar(max),
#Delimiter nvarchar(20) = ',',
#Qualifier char(1) = Null
)
RETURNS #Results TABLE (id int IDENTITY(1,1), value nvarchar(max))
AS
BEGIN
/* USAGE
Select * From dbo.fn_Split('apple pear grape banana orange honeydew cantalope 3 2 1 4', ' ', Null)
Select * From dbo.fn_Split('1,abc,"Doe, John",4', ',', '"')
Select * From dbo.fn_Split('Hello 0,"&""&&&&', ',', '"')
*/
-- Declare Variables
DECLARE
#X xml,
#Temp nvarchar(max),
#Temp2 nvarchar(max),
#Start int,
#End int
-- XML-encode #Expression
Select #Expression = (Select #Expression For XML Path(''))
-- Find all occurrences of #Delimiter within #Qualifier and replace with |||***|||
While PATINDEX('%' + #Qualifier + '%', #Expression) > 0 AND Len(IsNull(#Qualifier, '')) > 0
BEGIN
Select
-- Starting character position of #Qualifier
#Start = PATINDEX('%' + #Qualifier + '%', #Expression),
-- #Expression starting at the #Start position
#Temp = SubString(#Expression, #Start + 1, LEN(#Expression)-#Start+1),
-- Next position of #Qualifier within #Expression
#End = PATINDEX('%' + #Qualifier + '%', #Temp) - 1,
-- The part of Expression found between the #Qualifiers
#Temp2 = Case When #End < 0 Then #Temp Else Left(#Temp, #End) End,
-- New #Expression
#Expression = REPLACE(#Expression,
#Qualifier + #Temp2 + Case When #End < 0 Then '' Else #Qualifier End,
Replace(#Temp2, #Delimiter, '|||***|||')
)
END
-- Replace all occurrences of #Delimiter within #Expression with '</fn_Split><fn_Split>'
-- And convert it to XML so we can select from it
SET
#X = Cast('<fn_Split>' +
Replace(#Expression, #Delimiter, '</fn_Split><fn_Split>') +
'</fn_Split>' as xml)
-- Insert into our returnable table replacing '|||***|||' back to #Delimiter
INSERT #Results
SELECT
"Value" = LTRIM(RTrim(Replace(C.value('.', 'nvarchar(max)'), '|||***|||', #Delimiter)))
FROM
#X.nodes('fn_Split') as X(C)
-- Return our temp table
RETURN
END
You can split a string in SQL without needing a function:
DECLARE #bla varchar(MAX)
SET #bla = 'BED40DFC-F468-46DD-8017-00EF2FA3E4A4,64B59FC5-3F4D-4B0E-9A48-01F3D4F220B0,A611A108-97CA-42F3-A2E1-057165339719,E72D95EA-578F-45FC-88E5-075F66FD726C'
-- http://stackoverflow.com/questions/14712864/how-to-query-values-from-xml-nodes
SELECT
x.XmlCol.value('.', 'varchar(36)') AS val
FROM
(
SELECT
CAST('<e>' + REPLACE(#bla, ',', '</e><e>') + '</e>' AS xml) AS RawXml
) AS b
CROSS APPLY b.RawXml.nodes('e') x(XmlCol);
If you need to support arbitrary strings (with XML special characters):
DECLARE #bla NVARCHAR(MAX)
SET #bla = '<html>unsafe & safe Utf8CharsDon''tGetEncoded ÄöÜ - "Conex"<html>,Barnes & Noble,abc,def,ghi'
-- http://stackoverflow.com/questions/14712864/how-to-query-values-from-xml-nodes
SELECT
x.XmlCol.value('.', 'nvarchar(MAX)') AS val
FROM
(
SELECT
CAST('<e>' + REPLACE((SELECT #bla FOR XML PATH('')), ',', '</e><e>') + '</e>' AS xml) AS RawXml
) AS b
CROSS APPLY b.RawXml.nodes('e') x(XmlCol);
In Azure SQL Database (based on Microsoft SQL Server but not exactly the same thing) the signature of STRING_SPLIT function looks like:
STRING_SPLIT ( string , separator [ , enable_ordinal ] )
When enable_ordinal flag is set to 1 the result will include a column named ordinal that consists of the 1‑based position of the substring within the input string:
SELECT *
FROM STRING_SPLIT('hello john smith', ' ', 1)
| value | ordinal |
|-------|---------|
| hello | 1 |
| john | 2 |
| smith | 3 |
This allows us to do this:
SELECT value
FROM STRING_SPLIT('hello john smith', ' ', 1)
WHERE ordinal = 2
| value |
|-------|
| john |
If enable_ordinal is not available then there is a trick which assumes that the substrings within the input string are unique. In this scenario, CHARINDEX could be used to find the position of the substring within the input string:
SELECT value, ROW_NUMBER() OVER (ORDER BY CHARINDEX(value, input_str)) AS ord_pos
FROM (VALUES
('hello john smith')
) AS x(input_str)
CROSS APPLY STRING_SPLIT(input_str, ' ')
| value | ord_pos |
|-------+---------|
| hello | 1 |
| john | 2 |
| smith | 3 |
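The uniqueness assumption matters. Here is a small sketch of how the CHARINDEX ordering goes wrong when a value repeats (the input 'a b a' is just an illustration). Both copies of 'a' resolve to position 1, so they sort ahead of 'b', and ord_pos no longer matches the real position:

```sql
SELECT value, ROW_NUMBER() OVER (ORDER BY CHARINDEX(value, input_str)) AS ord_pos
FROM (VALUES
    ('a b a')
) AS x(input_str)
CROSS APPLY STRING_SPLIT(input_str, ' ');
-- CHARINDEX('a', 'a b a') is 1 for both 'a' rows and CHARINDEX('b', 'a b a') is 3,
-- so 'b' is numbered 3 although it is the second substring.
```

The same kind of tie occurs when one value is contained in an earlier one, for example 'john' inside 'johnson'.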
I know it's an old question, but I think someone can benefit from my solution.
select
SUBSTRING(column_name,1,CHARINDEX(' ',column_name,1)-1)
,SUBSTRING(SUBSTRING(column_name,CHARINDEX(' ',column_name,1)+1,LEN(column_name))
,1
,CHARINDEX(' ',SUBSTRING(column_name,CHARINDEX(' ',column_name,1)+1,LEN(column_name)),1)-1)
,SUBSTRING(SUBSTRING(column_name,CHARINDEX(' ',column_name,1)+1,LEN(column_name))
,CHARINDEX(' ',SUBSTRING(column_name,CHARINDEX(' ',column_name,1)+1,LEN(column_name)),1)+1
,LEN(column_name))
from table_name
Advantages:
It separates all three sub-strings delimited by ' '.
It avoids a WHILE loop, which would hurt performance.
There is no need to pivot, as all the resulting sub-strings are displayed in one row.
Limitations:
One must know the total number of spaces (sub-strings) in advance.
Note: the solution can be extended to return sub-strings up to N.
To overcome the limitation we can use the following ref.
However, that solution can't be used on a table (actually, I wasn't able to use it that way).
Again, I hope this solution can help someone.
Update: for more than 50,000 records it is not advisable to use loops, as they will degrade performance.
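The nested SUBSTRING/CHARINDEX expressions above repeat the same position lookups several times. One way to make the idea easier to read, sketched here under the assumption that the value contains exactly two spaces, is to compute each space position once with CROSS APPLY. The inline VALUES row and the aliases t, s, p1 and p2 are placeholders for illustration, not part of the original answer:

```sql
SELECT
    SUBSTRING(t.s, 1, p1.pos - 1)                   AS part1,  -- text before the first space
    SUBSTRING(t.s, p1.pos + 1, p2.pos - p1.pos - 1) AS part2,  -- text between the two spaces
    SUBSTRING(t.s, p2.pos + 1, LEN(t.s))            AS part3   -- text after the second space
FROM (VALUES ('Hello John Smith')) AS t(s)                      -- stand-in for table_name.column_name
CROSS APPLY (SELECT CHARINDEX(' ', t.s) AS pos) AS p1              -- position of the first space
CROSS APPLY (SELECT CHARINDEX(' ', t.s, p1.pos + 1) AS pos) AS p2; -- position of the second space
```

As with the original query, a value with fewer than two spaces makes CHARINDEX return 0 and SUBSTRING receive a negative length, so such rows would need to be filtered out or handled separately.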
Pure set-based solution using TVF with recursive CTE. You can JOIN and APPLY this function to any dataset.
create function [dbo].[SplitStringToResultSet] (#value varchar(max), #separator char(1))
returns table
as return
with r as (
select value, cast(null as varchar(max)) [x], -1 [no] from (select rtrim(cast(#value as varchar(max))) [value]) as j
union all
select right(value, len(value)-case charindex(#separator, value) when 0 then len(value) else charindex(#separator, value) end) [value]
, left(r.[value], case charindex(#separator, r.value) when 0 then len(r.value) else abs(charindex(#separator, r.[value])-1) end ) [x]
, [no] + 1 [no]
from r where value > '')
select ltrim(x) [value], [no] [index] from r where x is not null;
go
Usage:
select *
from [dbo].[SplitStringToResultSet]('Hello John Smith', ' ')
where [index] = 1;
Result:
value index
-------------
John 1
Almost all the other answers are replacing the string being split which wastes CPU cycles and performs unnecessary memory allocations.
I cover a much better way to do a string split here: http://www.digitalruby.com/split-string-sql-server/
Here is the code:
SET NOCOUNT ON
-- You will want to change nvarchar(MAX) to nvarchar(50), varchar(50) or whatever matches exactly with the string column you will be searching against
DECLARE #SplitStringTable TABLE (Value nvarchar(MAX) NOT NULL)
DECLARE #StringToSplit nvarchar(MAX) = 'your|string|to|split|here'
DECLARE #SplitEndPos int
DECLARE #SplitValue nvarchar(MAX)
DECLARE #SplitDelim nvarchar(1) = '|'
DECLARE #SplitStartPos int = 1
SET #SplitEndPos = CHARINDEX(#SplitDelim, #StringToSplit, #SplitStartPos)
WHILE #SplitEndPos > 0
BEGIN
SET #SplitValue = SUBSTRING(#StringToSplit, #SplitStartPos, (#SplitEndPos - #SplitStartPos))
INSERT #SplitStringTable (Value) VALUES (#SplitValue)
SET #SplitStartPos = #SplitEndPos + 1
SET #SplitEndPos = CHARINDEX(#SplitDelim, #StringToSplit, #SplitStartPos)
END
SET #SplitValue = SUBSTRING(#StringToSplit, #SplitStartPos, 2147483647)
INSERT #SplitStringTable (Value) VALUES(#SplitValue)
SET NOCOUNT OFF
-- You can select or join with the values in #SplitStringTable at this point.
A recursive CTE solution. It is painful for the server, so test it first.
MS SQL Server 2008 Schema Setup:
create table Course( Courses varchar(100) );
insert into Course values ('Hello John Smith');
Query 1:
with cte as
( select
left( Courses, charindex( ' ' , Courses) ) as a_l,
cast( substring( Courses,
charindex( ' ' , Courses) + 1 ,
len(Courses ) ) + ' '
as varchar(100) ) as a_r,
Courses as a,
0 as n
from Course t
union all
select
left(a_r, charindex( ' ' , a_r) ) as a_l,
substring( a_r, charindex( ' ' , a_r) + 1 , len(a_R ) ) as a_r,
cte.a,
cte.n + 1 as n
from Course t inner join cte
on t.Courses = cte.a and len( a_r ) > 0
)
select a_l, n from cte
--where N = 1
Results:
| A_L | N |
|--------|---|
| Hello | 0 |
| John | 1 |
| Smith | 2 |
While similar to the XML-based answer by josejuan, I found that processing the XML path only once and then pivoting was moderately more efficient:
select ID,
[3] as PathProvidingID,
[4] as PathProvider,
[5] as ComponentProvidingID,
[6] as ComponentProviding,
[7] as InputRecievingID,
[8] as InputRecieving,
[9] as RowsPassed,
[10] as InputRecieving2
from
(
select id,message,d.* from sysssislog cross apply (
SELECT Item = y.i.value('(./text())[1]', 'varchar(200)'),
row_number() over(order by y.i) as rn
FROM
(
SELECT x = CONVERT(XML, '<i>' + REPLACE(Message, ':', '</i><i>') + '</i>').query('.')
) AS a CROSS APPLY x.nodes('i') AS y(i)
) d
WHERE event
=
'OnPipelineRowsSent'
) as tokens
pivot
( max(item) for [rn] in ([3],[4],[5],[6],[7],[8],[9],[10])
) as data
ran in 8:30
select id,
tokens.value('(/n[3])', 'varchar(100)')as PathProvidingID,
tokens.value('(/n[4])', 'varchar(100)') as PathProvider,
tokens.value('(/n[5])', 'varchar(100)') as ComponentProvidingID,
tokens.value('(/n[6])', 'varchar(100)') as ComponentProviding,
tokens.value('(/n[7])', 'varchar(100)') as InputRecievingID,
tokens.value('(/n[8])', 'varchar(100)') as InputRecieving,
tokens.value('(/n[9])', 'varchar(100)') as RowsPassed
from
(
select id, Convert(xml,'<n>'+Replace(message,'.','</n><n>')+'</n>') tokens
from sysssislog
WHERE event
=
'OnPipelineRowsSent'
) as data
ran in 9:20
CREATE FUNCTION [dbo].[fnSplitString]
(
#string NVARCHAR(MAX),
#delimiter CHAR(1)
)
RETURNS #output TABLE(splitdata NVARCHAR(MAX)
)
BEGIN
DECLARE #start INT, #end INT
SELECT #start = 1, #end = CHARINDEX(#delimiter, #string)
WHILE #start < LEN(#string) + 1 BEGIN
IF #end = 0
SET #end = LEN(#string) + 1
INSERT INTO #output (splitdata)
VALUES(SUBSTRING(#string, #start, #end - #start))
SET #start = #end + 1
SET #end = CHARINDEX(#delimiter, #string, #start)
END
RETURN
END
And use it like this:
select * from dbo.fnSplitString('Querying SQL Server',' ')
If anyone wants to get only one part of the separated text, they can use this:
select * from dbo.SplitStringSep('Word1 wordr2 word3',' ')
CREATE function [dbo].[SplitStringSep]
(
#str nvarchar(4000),
#separator char(1)
)
returns table
AS
return (
with tokens(p, a, b) AS (
select
1,
1,
charindex(#separator, #str)
union all
select
p + 1,
b + 1,
charindex(#separator, #str, b + 1)
from tokens
where b > 0
)
select
p-1 zeroBasedOccurance,
substring(
#str,
a,
case when b > 0 then b-a ELSE 4000 end)
AS s
from tokens
)
I developed this:
declare #x nvarchar(Max) = 'ali.veli.deli.';
declare #item nvarchar(Max);
declare #splitter char='.';
while CHARINDEX(#splitter,#x) != 0
begin
set #item = LEFT(#x,CHARINDEX(#splitter,#x))
set #x = RIGHT(#x,len(#x)-len(#item) )
select #item as item, #x as x;
end
The only thing to watch out for is that the trailing dot '.' must always be present at the end of #x.
Building on @NothingsImpossible's solution, or rather on the comment on the most voted answer (just below the accepted one), I found that the following quick-and-dirty solution fulfills my own needs. It has the benefit of staying solely within the SQL domain.
Given a string "first;second;third;fourth;fifth", say, I want to get the third token. This works only if we know how many tokens the string is going to have; in this case it's 5. So my approach is to chop the last two tokens away (inner query), and then to chop the first two tokens away (outer query).
I know that this is ugly and covers only the specific conditions I was in, but I am posting it in case somebody finds it useful. Cheers.
select
REVERSE(
SUBSTRING(
reverse_substring,
0,
CHARINDEX(';', reverse_substring)
)
)
from
(
select
msg,
SUBSTRING(
REVERSE(msg),
CHARINDEX(
';',
REVERSE(msg),
CHARINDEX(
';',
REVERSE(msg)
)+1
)+1,
1000
) reverse_substring
from
(
select 'first;second;third;fourth;fifth' msg
) a
) b
declare #strng varchar(max)='hello john smith'
select (
substring(
#strng,
charindex(' ', #strng) + 1,
(
(charindex(' ', #strng, charindex(' ', #strng) + 1))
- charindex(' ',#strng)
)
))