I'm trying to write a query that prints each character of a string separately. I have tried the following:
select substring('sas',1,1)
union all
select substring('sas',2,1)
union all
select substring('sas',3,1)
But this way I would need a UNION ALL for each character. Is there a better approach?
Sandbox: http://sqlfiddle.com/#!9/9eecb/123683
DECLARE @data VARCHAR(100) = 'October 11, 2017'
;WITH CTE AS
(
SELECT STUFF(@data,1,1,'') TXT, LEFT(@data,1) Col1
UNION ALL
SELECT STUFF(TXT,1,1,'') TXT, LEFT(TXT,1) Col1 FROM CTE
WHERE LEN(TXT) > 0
)
select Col1, ISNUMERIC(Col1) from CTE
You can try this as well
DECLARE @data VARCHAR(100) = 'TEST'
Declare @cnt int = len(@data)
Declare @i int =1
While (@i <= @cnt)
BEGIN
PRINT SUBSTRING(@data,@i,1)
set @i=@i+1
END
I really don't like using an rCTE for tasks like this: rCTEs are iterative and slow (far slower than a Tally, especially with more than a few rows). You could use a Tally and do this far faster. As a TVF, this would look like this:
CREATE FUNCTION dbo.GetChars (@String varchar(8000))
RETURNS table
AS RETURN
WITH N AS(
SELECT N
FROM(VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N(N)),
Tally AS(
SELECT TOP (LEN(@String)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS I
FROM N N1, N N2, N N3, N N4)
SELECT SUBSTRING(@String, T.I, 1) AS C, T.I
FROM Tally T;
GO
Note that this will not work on SQL Server 2005, because the VALUES table constructor requires SQL Server 2008 or later.
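As a usage sketch, assuming the dbo.GetChars function above has been created, the TVF can be applied to a whole column with CROSS APPLY. The @Words table variable and its sample values are made up for illustration:

```sql
-- Hypothetical sample data; only dbo.GetChars comes from the answer above
DECLARE @Words table (Word varchar(20));
INSERT INTO @Words (Word) VALUES ('sas'), ('TEST');

-- One output row per character, with I giving its position in the word
SELECT W.Word, GC.I, GC.C
FROM @Words W
CROSS APPLY dbo.GetChars(W.Word) GC
ORDER BY W.Word, GC.I;
```

The ORDER BY on I is what guarantees the characters come back in their original order.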
I have a string as follows:
Declare @string varchar(10) = 'welcome'
and I want output like this:
w
e
l
c
o
m
e
You can use a tally table for this, usually faster than looping, but you would have to test for yourself:
declare @s varchar(200)
set @s = 'Welcome'
;with t as (
select v
from (values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t(v)
),
n as (
-- 10 x 10 = 100 numbers, enough for strings up to 100 characters; add another cross join for longer ones
select row_number() over(order by (select null)) rn
from t t1 cross join t t2
)
select substring(@s, rn, 1)
from n
where rn <= len(@s)
DECLARE @string VARCHAR(256) = 'welcome'
DECLARE @cnt INT = 0;
WHILE @cnt < len(@string)
BEGIN
SET @cnt = @cnt + 1;
PRINT SUBSTRING ( @string ,@cnt , 1 )
END;
In essence, you loop over the length of the string and print the character at the current loop index.
You can use a recursive CTE.
Declare @string varchar(10) ='welcome'
;with cte as
(
select 1 as i,convert(varchar(1),substring(@string,1,1)) as single_char
union all
-- take exactly one character per step and stop at the end of the string
select i+1 as i,convert(varchar(1),substring(@string,i+1,1)) as single_char from cte where i < len(@string)
)
select single_char From cte
I am looking to take a string of directory path and parse information out of it into existing columns on another table. This is for the purpose of creating a staging table for reporting. It will be parsing many directory paths if the ProjectName is applicable to the change in structure.
Data Example:
Table1_Column1
ProjectName\123456_ProjectShortName\Release_1\Iteration\etc
Expected Output:
Table2_Column1, Table2_Column2
123456 ProjectShortName
I've figured out how to parse some strings by character, but it seems a bit clunky and inefficient. Is there a better structure to go about this? To add some more to it, this is just one column I need to manipulate before shifting it over; three other columns are being shifted directly to the staging table based on the ProjectName.
Is it better to just create a UDF to split then call it within the job that will move the data or is there another way?
Here's a method without a UDF.
It uses charindex and substring to get the parts from that path string.
An example using a table variable:
declare @T table (Table1_Column1 varchar(100));
insert into @T values
('ProjectName\123456_ProjectShortName\Release_1\Iteration\etc'),
('OtherProjectName\789012_OtherProjectShortName\Release_2\Iteration\xxx');
select
case
when FirstBackslashPos > 0 and FirstUnderscorePos > 0
then substring(Col1,FirstBackslashPos+1,FirstUnderscorePos-FirstBackslashPos-1)
end as Table2_Column1,
case
when FirstUnderscorePos > 0 and SecondBackslashPos > 0
then substring(Col1,FirstUnderscorePos+1,SecondBackslashPos-FirstUnderscorePos-1)
end as Table2_Column2
from (
select
Table1_Column1 as Col1,
charindex('\',Table1_Column1) as FirstBackslashPos,
-- look for the underscore only after the first backslash, so one in ProjectName is ignored
charindex('_',Table1_Column1,charindex('\',Table1_Column1)+1) as FirstUnderscorePos,
charindex('\',Table1_Column1,charindex('\',Table1_Column1)+1) as SecondBackslashPos
from @T
) q;
If you want to calculate only one into a variable
declare @ProjectPath varchar(100);
set @ProjectPath = 'ProjectName\123456_ProjectShortName\Release_1\Iteration\etc';
declare @FirstBackslashPos int = charindex('\',@ProjectPath);
declare @FirstUnderscorePos int = charindex('_',@ProjectPath,@FirstBackslashPos);
declare @SecondBackslashPos int = charindex('\',@ProjectPath,@FirstBackslashPos+1);
declare @ProjectNumber varchar(30) = case when @FirstBackslashPos > 0 and @FirstUnderscorePos > 0 then substring(@ProjectPath,@FirstBackslashPos+1,@FirstUnderscorePos-@FirstBackslashPos-1)end;
declare @ProjectShortName varchar(30) = case when @FirstUnderscorePos > 0 and @SecondBackslashPos > 0 then substring(@ProjectPath,@FirstUnderscorePos+1,@SecondBackslashPos-@FirstUnderscorePos-1) end;
select @ProjectNumber as ProjectNumber, @ProjectShortName as ProjectShortName;
But, in my humble opinion, it might be worth the effort to add a CLR function that brings true regex matching to SQL Server, since CHARINDEX and PATINDEX are not as flexible as regex.
The following is a SUPER fast Parser but it is limited to 8K bytes. Notice the Returned Sequence Number... Perhaps you can key off of that because I am still not clear on the logic for why column1 is 123456 and not ProjectName
Declare @String varchar(max) = 'ProjectName\123456_ProjectShortName\Release_1\Iteration\etc'
Select * from [dbo].[udf-Str-Parse-8K](@String,'\')
Returns
RetSeq RetVal
1 ProjectName
2 123456_ProjectShortName
3 Release_1
4 Iteration
5 etc
The UDF if needed
CREATE FUNCTION [dbo].[udf-Str-Parse-8K] (@String varchar(max),@Delimiter varchar(10))
Returns Table
As
Return (
with cte1(N) As (Select 1 From (Values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) N(N)),
cte2(N) As (Select Top (IsNull(DataLength(@String),0)) Row_Number() over (Order By (Select NULL)) From (Select N=1 From cte1 a,cte1 b,cte1 c,cte1 d) A ),
cte3(N) As (Select 1 Union All Select t.N+DataLength(@Delimiter) From cte2 t Where Substring(@String,t.N,DataLength(@Delimiter)) = @Delimiter),
cte4(N,L) As (Select S.N,IsNull(NullIf(CharIndex(@Delimiter,@String,s.N),0)-S.N,8000) From cte3 S)
Select RetSeq = Row_Number() over (Order By A.N)
,RetVal = Substring(@String, A.N, A.L)
From cte4 A
);
--Much faster than str-Parse, but limited to 8K
--Select * from [dbo].[udf-Str-Parse-8K]('Dog,Cat,House,Car',',')
--Select * from [dbo].[udf-Str-Parse-8K]('John||Cappelletti||was||here','||')
I need some help with a SQL query. I have a column with values stored as comma-separated values.
I need to write a query which finds the 3rd delimited item within each value in the column.
Is it possible to do this in a SELECT statement?
ex: ColumnValue: josh,Reg01,False,a0-t0,22/09/2010
So I will need to get the 3rd value (i.e.) False from the above string.
Yes.
Where @s is your string...
select
SUBSTRING (@s,
CHARINDEX(',',@s,CHARINDEX(',',@s)+1)+1,
CHARINDEX(',',@s,CHARINDEX(',',@s,CHARINDEX(',',@s)+1)+1)
-CHARINDEX(',',@s,CHARINDEX(',',@s)+1)-1)
Or more generically...
;with cte as
(
-- nullif turns "no comma found" (0) into NULL, so a string without commas is handled too
select 1 as Item, 1 as Start, nullif(CHARINDEX(',',@s, 1),0) as Split
union all
select cte.Item+1, cte.Split+1, nullif(CHARINDEX(',',@s, cte.Split+1),0) as Split
from cte
where cte.Split<>0
)
select SUBSTRING(@s, start,isnull(split,len(@s)+1)-start)
from cte
where Item = 3
Now store your data properly :)
Try this (assuming SQL Server 2005+)
DECLARE @t TABLE(ColumnValue VARCHAR(50))
INSERT INTO @t(ColumnValue) SELECT 'josh,Reg01,False,a0-t0,22/09/2010'
INSERT INTO @t(ColumnValue) SELECT 'mango,apple,bannana,grapes'
INSERT INTO @t(ColumnValue) SELECT 'stackoverflow'
SELECT ThirdValue = splitdata
FROM(
SELECT
Rn = ROW_NUMBER() OVER(PARTITION BY ColumnValue ORDER BY (SELECT 1))
,X.ColumnValue
,Y.splitdata
FROM
(
SELECT *,
CAST('<X>'+REPLACE(F.ColumnValue,',','</X><X>')+'</X>' AS XML) AS xmlfilter FROM @t F
)X
CROSS APPLY
(
SELECT fdata.D.value('.','varchar(50)') AS splitdata
FROM X.xmlfilter.nodes('X') as fdata(D)
) Y
)X WHERE X.Rn = 3
//Result
ThirdValue
False
bannana
Also it is not very clear from your question as what version of SQL Server you are using. In case you are using SQL SERVER 2000, you can go ahead with the below approach.
Step 1: Create a number table
CREATE TABLE dbo.Numbers
(
N INT NOT NULL PRIMARY KEY
);
GO
DECLARE @rows AS INT;
SET @rows = 1;
INSERT INTO dbo.Numbers VALUES(1);
WHILE(@rows <= 10000)
BEGIN
INSERT INTO dbo.Numbers SELECT N + @rows FROM dbo.Numbers;
SET @rows = @rows * 2;
END
Step 2: Apply the query below
DECLARE @t TABLE(ColumnValue VARCHAR(50))
INSERT INTO @t(ColumnValue) SELECT 'josh,Reg01,False,a0-t0,22/09/2010'
INSERT INTO @t(ColumnValue) SELECT 'mango,apple,bannana,grapes'
INSERT INTO @t(ColumnValue) SELECT 'stackoverflow'
--Declare a table variable with an identity column to store the intermediate results
DECLARE @tempT TABLE(Id INT IDENTITY,ColumnValue VARCHAR(50),SplitData VARCHAR(50))
-- Insert the records into the table variable
INSERT INTO @tempT
SELECT
ColumnValue
,SUBSTRING(ColumnValue, Numbers.N,CHARINDEX(',', ColumnValue + ',', Numbers.N) - Numbers.N) AS splitdata
FROM @t
JOIN Numbers ON Numbers.N <= DATALENGTH(ColumnValue) + 1
AND SUBSTRING(',' + ColumnValue, Numbers.N, 1) = ','
--Order by position so the identity values follow the order of the items in each string
ORDER BY ColumnValue, Numbers.N
--Project the filtered records
SELECT ThirdValue = X.splitdata
FROM
--The correlated subquery does the job of ROW_NUMBER() OVER(PARTITION BY ColumnValue)
(SELECT
Rn = (SELECT COUNT(*)
FROM @tempT t2
WHERE t2.ColumnValue=t1.ColumnValue
AND t2.Id<=t1.Id)
,t1.ColumnValue
,t1.splitdata
FROM @tempT t1)X
WHERE X.Rn =3
-- Result
ThirdValue
False
bannana
Also you can use Master..spt_Values for your number table
DECLARE @t TABLE(ColumnValue VARCHAR(50))
INSERT INTO @t(ColumnValue) SELECT 'josh,Reg01,False,a0-t0,22/09/2010'
INSERT INTO @t(ColumnValue) SELECT 'mango,apple,bannana,grapes'
INSERT INTO @t(ColumnValue) SELECT 'stackoverflow'
--Declare a table variable with an identity column to store the intermediate results
DECLARE @tempT TABLE(Id INT IDENTITY,ColumnValue VARCHAR(50),SplitData VARCHAR(50))
-- Insert the records into the table variable
INSERT INTO @tempT
SELECT
ColumnValue
,SUBSTRING(ColumnValue, Number ,CHARINDEX(',', ColumnValue + ',', Number ) - Number) AS splitdata
FROM @t
JOIN master..spt_values ON Number <= DATALENGTH(ColumnValue) + 1 AND type='P'
AND SUBSTRING(',' + ColumnValue, Number , 1) = ','
--Order by position so the identity values follow the order of the items in each string
ORDER BY ColumnValue, Number
--Project the filtered records
SELECT ThirdValue = X.splitdata
FROM
--The correlated subquery does the job of ROW_NUMBER() OVER(PARTITION BY ColumnValue)
(SELECT
Rn = (SELECT COUNT(*)
FROM @tempT t2
WHERE t2.ColumnValue=t1.ColumnValue
AND t2.Id<=t1.Id)
,t1.ColumnValue
,t1.splitdata
FROM @tempT t1)X
WHERE X.Rn =3
You can read about this from
1) What is the purpose of system table table master..spt_values and what are the meanings of its values?
2) Why (and how) to split column using master..spt_values?
You really need something like String.Split(',')(2), which unfortunately does not exist in SQL, but this may be helpful to you.
You can test this solution against the others, but I believe that using XML in such situations almost always gives you the best performance and requires less code:
DECLARE @InPutCSV NVARCHAR(2000)= 'josh,Reg01,False,a0-t0,22/09/2010'
DECLARE @ValueIndexToGet INT=3
DECLARE @XML XML = CAST ('<d>' + REPLACE(@InPutCSV, ',', '</d><d>') + '</d>' AS XML);
WITH CTE(RecordNumber,Value) AS
(
-- Number the nodes in the order nodes() returns them (document order in practice, though not formally guaranteed)
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS RecordNumber
,T.v.value('.', 'NVARCHAR(100)') AS Value
FROM @XML.nodes('/d') AS T(v)
)
SELECT Value
FROM CTE WHERE RecordNumber=@ValueIndexToGet
I can confirm that it takes 1 second to get a value from a CSV string with 100,000 values.
I need to represent the following records
DATA
000200AA
00000200AA
000020BCD
00000020BCD
000020ABC
AS
DATA CNT
200AA 1
20BCD 2
20ABC 2
ANY IDEAS?
USE patindex
select count(test) as cnt,
substring(test, patindex('%[^0]%',test),len(test)) from (
select ('000200AA') as test
union
select '00000200AA' as test
union
select ('000020BCD') as test
union
select ('00000020BCD') as test
union
select ('000020ABC') as test
)ty
group by substring(test, patindex('%[^0]%',test),len(test))
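To see what the PATINDEX call is doing, here is a small sketch (the sample values come from the question; the column aliases are my own, and the VALUES constructor assumes SQL Server 2008 or later). The pattern '%[^0]%' matches the first character that is not a zero, and SUBSTRING then takes everything from that position onward:

```sql
SELECT v.test,
       PATINDEX('%[^0]%', v.test) AS first_non_zero_pos,                       -- 4 for '000200AA', 5 for '000020BCD'
       SUBSTRING(v.test, PATINDEX('%[^0]%', v.test), LEN(v.test)) AS stripped  -- '200AA' and '20BCD'
FROM (VALUES ('000200AA'), ('000020BCD')) AS v(test);
```

A value made up entirely of zeros makes PATINDEX return 0, so that case may need separate handling if it can occur in your data.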
How about a nice recursive user-defined function?
CREATE FUNCTION dbo.StripLeadingZeros (
@input varchar(MAX)
) RETURNS varchar(MAX)
BEGIN
IF LEN(@input) = 0
RETURN @input
IF SUBSTRING(@input, 1, 1) = '0'
-- Recurse on the rest of the string; SQL Server allows at most 32 nesting levels
RETURN dbo.StripLeadingZeros(SUBSTRING(@input, 2, LEN(@input) - 1))
RETURN @input
END
GO
GO
Then:
SELECT dbo.StripLeadingZeros(DATA) DATA, COUNT(DATA) CNT
FROM YourTable GROUP BY dbo.StripLeadingZeros(DATA)
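DECLARE @String VARCHAR(32) = '000200AA'
-- Only works when the first non-zero character is '2'; PATINDEX('%[^0]%', @String) covers the general case
SELECT SUBSTRING ( @String ,CHARINDEX('2', @String),LEN(@String))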
Depending on how you need to derive the values, this code may differ.
Assuming a simple rightmost 5 characters, as Barry suggested, you can use RIGHT(data, 5) with GROUP BY and COUNT to get your results:
http://sqlfiddle.com/#!3/19ecd/2
Take a look at the STUFF function.
It deletes a range of characters from a string and inserts another string at that position.
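For this problem, STUFF can delete the leading zeros once PATINDEX has located the first non-zero character. A minimal sketch, assuming the data sits in a hypothetical table variable @YourTable with a DATA column (the rows are the ones from the question):

```sql
-- Hypothetical table variable holding the question's sample rows
DECLARE @YourTable table (DATA varchar(20));
INSERT INTO @YourTable (DATA)
VALUES ('000200AA'), ('00000200AA'), ('000020BCD'), ('00000020BCD'), ('000020ABC');

-- STUFF(string, start, length, replacement): replace the run of leading zeros with ''
SELECT s.Stripped AS DATA, COUNT(*) AS CNT
FROM @YourTable t
CROSS APPLY (SELECT STUFF(t.DATA, 1, PATINDEX('%[^0]%', t.DATA) - 1, '') AS Stripped) s
GROUP BY s.Stripped;
```

If a value contains only zeros, PATINDEX returns 0, the length argument becomes negative, and STUFF returns NULL, so such rows would be grouped under NULL.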
You can do this query:
SELECT RIGHT([DATA],LEN([DATA])-PATINDEX('%[1-9]%',[DATA])+1) [DATA], COUNT(*) CNT
FROM YourTable
GROUP BY RIGHT([DATA],LEN([DATA])-PATINDEX('%[1-9]%',[DATA])+1)
I have 2 input strings, for example '1,5,6' and '2,89,9', with the same number of elements (3 or more).
I want to do an "ordinate join" on those 2 strings, like this:
1 2
5 89
6 9
I thought of assigning a row number and joining the 2 result sets like this:
SELECT a.item, b.item FROM
(
SELECT
ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS rownumber,
* FROM dbo.Split('1,5,6',',')
) AS a
INNER JOIN
(
SELECT
ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS rownumber,
* FROM dbo.Split('2,89,9',',')
) AS b ON a.rownumber = b.rownumber
Is that a good practice?
When dbo.Split() returns the data-set, nothing you do can assign the row_number you want (based on their order in the string) with absolute certainty. SQL never guarantees an ordering without an ORDER BY that actually relates to the data.
With your trick of using (SELECT 0) to order by, you may often get the right values. Probably very often. But this is never guaranteed. Once in a while you will get the wrong order.
Your best option is to recode dbo.Split() to assign a row_number as the string is parsed. Only then can you know with 100% certainty that the row_number really does correspond to the item's position in the list.
Then you join them as you suggest, and get the results you want.
Other than that, the idea does seem fine to me. Though you may wish to consider a FULL OUTER JOIN if one list can be longer than the other.
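On SQL Server 2022 or Azure SQL Database (an assumption about your version; the question does not say), the built-in STRING_SPLIT accepts a third argument, enable_ordinal, which returns each item's position in an ordinal column. A sketch of the positional join using it, with no custom split function:

```sql
-- Requires SQL Server 2022+ or Azure SQL Database for the enable_ordinal argument
SELECT a.value AS a_item, b.value AS b_item
FROM STRING_SPLIT('1,5,6', ',', 1) AS a
JOIN STRING_SPLIT('2,89,9', ',', 1) AS b
  ON a.ordinal = b.ordinal      -- ordinal is the 1-based position in the input string
ORDER BY a.ordinal;
```

On older versions the ordinal column is not available, so the position has to be produced by the split function itself, as described above.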
You can do it like this as well
Consider your split function like this:
CREATE FUNCTION dbo.Split
(
@delimited nvarchar(max),
@delimiter nvarchar(100)
) RETURNS @t TABLE
(
id int identity(1,1),
val nvarchar(max)
)
AS
BEGIN
declare @xml xml
set @xml = N'<root><r>' + replace(@delimited,@delimiter,'</r><r>') + '</r></root>'
insert into @t(val)
select
-- read the full item; a short type such as varchar(5) would truncate longer values
r.value('.','nvarchar(max)') as item
from @xml.nodes('//root/r') as records(r)
RETURN
END
GO
Then it will be a simple task to JOIN them together. Like this:
SELECT
*
FROM
dbo.Split('1,5,6',',') AS a
JOIN dbo.Split('2,89,9',',') AS b
ON a.id=b.id
The upside of this is that you do not need any ROW_NUMBER() OVER(ORDER BY (SELECT 0)).
Edit
As noted in the comments, performance is better with a recursive split function. So maybe something like this:
CREATE FUNCTION dbo.Split (@s varchar(512),@sep char(1))
RETURNS table
AS
RETURN (
WITH Pieces(pn, start, stop) AS (
SELECT 1, 1, CHARINDEX(@sep, @s)
UNION ALL
SELECT pn + 1, stop + 1, CHARINDEX(@sep, @s, stop + 1)
FROM Pieces
WHERE stop > 0
)
SELECT pn,
SUBSTRING(@s, start, CASE WHEN stop > 0 THEN stop-start ELSE 512 END) AS s
FROM Pieces
)
GO
And then the select is like this:
SELECT
*
FROM
dbo.Split('1,5,6',',') AS a
JOIN dbo.Split('2,89,9',',') AS b
ON a.pn=b.pn
Thanks to Arion for the suggestion; it's very useful to me. I modified the function a little to support a varchar(max) input string and a delimiter of up to 1000 characters. I also added a parameter that indicates whether empty strings should be included in the result.
Regarding MatBailie's question: because this is an inline function, you can include the pn column in your outer query that calls this function.
CREATE FUNCTION dbo.Split (@s nvarchar(max),@sep nvarchar(1000), @IncludeEmpty bit)
RETURNS table
AS
RETURN (
WITH Pieces(pn, start, stop) AS (
SELECT convert(bigint, 1) , convert(bigint, 1), convert(bigint,CHARINDEX(@sep, @s))
UNION ALL
SELECT pn + 1, stop + LEN(@sep), CHARINDEX(@sep, @s, stop + LEN(@sep))
FROM Pieces
WHERE stop > 0
)
SELECT pn,
SUBSTRING(@s, start, CASE WHEN stop > 0 THEN stop-start ELSE LEN(@s) END) AS s
FROM Pieces
where start< CASE WHEN stop > 0 THEN stop ELSE LEN(@s) END + @IncludeEmpty
)
But I ran into an issue with this function when the list to return had more than 100 items, because a recursive CTE stops at 100 recursion levels by default. So I created another function that relies purely on string parsing functions:
Create function [dbo].[udf_split] (
@ListString nvarchar(max),
@Delimiter nvarchar(1000),
@IncludeEmpty bit)
Returns @ListTable TABLE (ID int, ListValue nvarchar(max))
AS
BEGIN
Declare @CurrentPosition int, @NextPosition int, @Item nvarchar(max), @ID int
-- Wrap the list in delimiters so every item, including the last, ends with one
Select @ID = 1,
@ListString = @Delimiter+ @ListString + @Delimiter,
@CurrentPosition = 1+LEN(@Delimiter)
Select @NextPosition = Charindex(@Delimiter, @ListString, @CurrentPosition)
While @NextPosition > 0 Begin
Select @Item = Substring(@ListString, @CurrentPosition, @NextPosition-@CurrentPosition)
If @IncludeEmpty=1 or Len(LTrim(RTrim(@Item)))>0 Begin
Insert Into @ListTable (ID, ListValue) Values (@ID, LTrim(RTrim(@Item)))
Set @ID = @ID+1
End
Select @CurrentPosition = @NextPosition+LEN(@Delimiter),
@NextPosition = Charindex(@Delimiter, @ListString, @CurrentPosition)
End
RETURN
END
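If you would rather keep the inline recursive version, the limit of 100 can also be lifted from the calling query, since the hint cannot be placed inside an inline function. A sketch, assuming the three-parameter dbo.Split shown earlier has been created; the @list variable and its contents are made up for illustration:

```sql
-- Build a hypothetical list with more than 100 items
DECLARE @list nvarchar(max) = REPLICATE(CAST(N'x,' AS nvarchar(max)), 150) + N'x';

SELECT pn, s
FROM dbo.Split(@list, N',', 1)
OPTION (MAXRECURSION 0);  -- 0 removes the default limit of 100 recursion levels
```

Every query that calls the function on long lists needs the hint, which is one reason the loop-based function can be the more convenient choice.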
Hope this could help.