SQL splitting a column into 3 different columns with multiple separators - sql

I have the column below which contains data as shown
|DeliveryComment |
|-------------------------|
|[1 * B018] |
|GARAGE |
|BACK GARDEN. [124 * B002]|
|[1 * B018] |
| |
|[124 * B002] |
|[1 * B018] |
| |
|[124 * B002] |
I'd like to split this data into three columns displayed as below.
|ColA |ColB|ColC|
|-----------|----|----|
| |1 |B018|
|GARAGE | | |
|BACK GARDEN|124 |B002|
| |1 |B018|
| | | |
| |124 |B002|
| |1 |B018|
| | | |
| |124 |B002|
The data that should end up in column A can be variable up to 11 characters.
The data that should end up in column B can be a variable numeric value up to 3 characters.
The data that should end up in column C can be variable up to 4 characters.
There will always be [] around the numbers and there will always be a * in between them.

Create and populate sample table (Please save us this step in your future questions)
DECLARE @t AS TABLE
(
    col varchar(50)
)
INSERT INTO @t VALUES
('[1 * B018]'),
('GARAGE'),
('BACK GARDEN. [124 * B002]'),
('[1 * B018]'),
(''),
('[124 * B002]'),
('[1 * B018]'),
(''),
('[124 * B002]')
The query:
SELECT CASE WHEN charindex('[', col) > 0 THEN
           LEFT(col, charindex('[', col) - 1)
       ELSE
           col
       END AS ColA,
       CASE WHEN charindex('[', col) = 0 THEN
           ''
       ELSE
           SUBSTRING(col, charindex('[', col) + 1, charindex('*', col) - charindex('[', col) - 1)
       END AS ColB,
       CASE WHEN charindex('[', col) = 0 THEN
           ''
       ELSE
           SUBSTRING(col, charindex('*', col) + 1, charindex(']', col) - charindex('*', col) - 1)
       END AS ColC
FROM @t
Results:
ColA ColB ColC
1 B018
GARAGE
BACK GARDEN. 124 B002
1 B018
124 B002
1 B018
124 B002

This solution uses CROSS APPLY to make the CHARINDEX easier to manage.
SELECT
    LEFT(SUBSTRING(col, 1, CASE WHEN a = 0 THEN LEN(col) ELSE a - 1 END), 11) AS [ColA]
    ,REPLACE(SUBSTRING(col, a + 1, b - a), '*', '') AS [ColB]
    ,REPLACE(SUBSTRING(col, b + 1, c - b), ']', '') AS [ColC]
FROM
    @t
CROSS APPLY ( SELECT CHARINDEX('[', col, 0) AS a
                    ,CHARINDEX('*', col, 0) AS b
                    ,CHARINDEX(']', col, 0) AS c
            ) AS z
Note: the examples you have so far, including this one, return values with leading and trailing spaces that will need trimming with RTRIM and LTRIM. I have left the trimming out for now because it would clutter the code.
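The trimming mentioned above could be folded in roughly as follows. This is only a sketch: the table variable @t and the aliases a, b, c and z are illustrative names, and it assumes every bracketed value contains '[', '*' and ']' in that order, as the question states. It does not remove the trailing period after BACK GARDEN.

```sql
-- Illustrative sample data; @t is a stand-in for the real table.
DECLARE @t TABLE (col varchar(50));
INSERT INTO @t VALUES
('[1 * B018]'),
('GARAGE'),
('BACK GARDEN. [124 * B002]'),
('');

SELECT
    -- Everything before '[' (or the whole value when there is no '['), trimmed.
    LTRIM(RTRIM(CASE WHEN a = 0 THEN col ELSE LEFT(col, a - 1) END)) AS ColA,
    -- Text strictly between '[' and '*', trimmed.
    CASE WHEN a = 0 THEN ''
         ELSE LTRIM(RTRIM(SUBSTRING(col, a + 1, b - a - 1))) END AS ColB,
    -- Text strictly between '*' and ']', trimmed.
    CASE WHEN a = 0 THEN ''
         ELSE LTRIM(RTRIM(SUBSTRING(col, b + 1, c - b - 1))) END AS ColC
FROM @t
CROSS APPLY (SELECT CHARINDEX('[', col) AS a,
                    CHARINDEX('*', col) AS b,
                    CHARINDEX(']', col) AS c) AS z;
```

Because the SUBSTRING lengths exclude the separator characters themselves, no REPLACE is needed here; the CASE guards avoid a negative length when a row has no brackets.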

Related

Need to split one string value into multiple columns based on different special characters or without characters

I have one string column. I need to split into multiple columns.
The data is like below
tags
status
test
Open
_test
Open
test_
Open
1200>test>
IP
1200>test>234598
completed
I need to split the above into multiple columns as like below
Deptno,empname,Empno
My desired output like below
---------------- ------------ ------- -------- ------
tags | status | EmpNO | EmpName |DepNo|
--------------- |-----------|------ |---------- ------
|test | Open | NULL |test | NULL |
|_test | Open | NULL |test | NULL |
|test_ | Open | NULL |test | NULL |
|1200>test> | IP | NULL |test | 1200 |
|1200>test>234598 | completed| 234598|test | 1200 |
I have written a query that splits the data into multiple columns, but it does not handle all scenarios.
Query:
SELECT [tags],substring([tags], 1 , CHARINDEX('>', [tags])-1) as 'Division',
SUBSTRING([tags],CHARINDEX('>', [tags]) + 1,LEN([tags]) - CHARINDEX('>', [tags]) - CHARINDEX('-', REVERSE([tags])) ),
REVERSE(SUBSTRING(REVERSE([tags]), 1 , CHARINDEX('>', REVERSE([tags]))-1)) AS 'EmpNo'
FROM ods.[testtable]
Could someone help me with the query?
According to the desired result a query would be like:
SELECT RIGHT(tags, CASE
WHEN p2.pos = 0
THEN 0
ELSE LEN(tags) - p2.pos
END) AS EmpNO
,SUBSTRING(tags, CASE
WHEN p1.pos = 0
THEN p1.pos
ELSE p1.pos + 1
END, CASE
WHEN p2.pos = 0
THEN LEN(tags) + 1
ELSE P2.Pos - P1.Pos - 1
END) AS EmpName
,LEFT(tags, CASE
WHEN p1.pos = 0
THEN 0
ELSE p1.pos - 1
END) AS DepNo
,*
FROM ods.[testtable]
CROSS APPLY (
SELECT (charindex('>', tags))
) AS P1(Pos)
CROSS APPLY (
SELECT (charindex('>', tags, P1.Pos + 1))
) AS P2(Pos)
I used this post to create the query.
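The query above returns empty strings where the desired output shows NULL, and it leaves underscores in EmpName. A possible variation, sketched under the assumption that underscores are never part of a real name, is shown below. The table variable @testtable is an illustrative stand-in for ods.[testtable].

```sql
-- Illustrative sample data taken from the question.
DECLARE @testtable TABLE (tags varchar(50), [status] varchar(20));
INSERT INTO @testtable VALUES
('test', 'Open'),
('_test', 'Open'),
('test_', 'Open'),
('1200>test>', 'IP'),
('1200>test>234598', 'completed');

SELECT t.tags,
       t.[status],
       -- Text after the second '>', or NULL when there is none.
       NULLIF(CASE WHEN p2.pos = 0 THEN ''
                   ELSE RIGHT(t.tags, LEN(t.tags) - p2.pos) END, '') AS EmpNo,
       -- Text between the two '>' characters, or the whole value when there is no '>'.
       REPLACE(CASE WHEN p1.pos = 0 THEN t.tags
                    WHEN p2.pos = 0 THEN SUBSTRING(t.tags, p1.pos + 1, LEN(t.tags))
                    ELSE SUBSTRING(t.tags, p1.pos + 1, p2.pos - p1.pos - 1)
               END, '_', '') AS EmpName,
       -- Text before the first '>', or NULL when there is none.
       NULLIF(LEFT(t.tags, CASE WHEN p1.pos = 0 THEN 0 ELSE p1.pos - 1 END), '') AS DepNo
FROM @testtable AS t
CROSS APPLY (SELECT CHARINDEX('>', t.tags)) AS p1(pos)
CROSS APPLY (SELECT CHARINDEX('>', t.tags, p1.pos + 1)) AS p2(pos);
```

NULLIF(expr, '') turns an empty result into NULL, which matches the NULL cells in the desired output.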
Using your sample data, the following would appear to give your desired results:
select tags, status,
Empno.v EmpNo,
Replace(Replace(Replace(Replace(tags,'_',''),IsNull(Depno.v,''),''),isnull(Empno.v ,''),''),'>','') EmpName,
Depno.v Depno
from t
cross apply (values(CharIndex('>', tags)-1))v1(t1)
cross apply (values(CharIndex('>', Reverse(tags))-1))v2(t2)
cross apply (values(Left(tags,NullIf(t1,-1))))Depno(v)
cross apply (values(right(tags,Iif(t2>0,t2,null))))Empno(v)

How to convert result of query into count aggregate of each row

Here is the result of the query:
LNAME | LISTAGG
--------------+---------------
| ALEX
BAIRSTOW |
BROAD | STUART
BUTLER |
COOK | ALAISTER,ALEX
HALES | ALEX
JENNINGS |
--------------+---------------
(7 rows)
I would like to turn each value into a count: 0, 1, or the number of comma-separated entries in the row (for example, ALAISTER,ALEX counts as 2), with "empty" shown for blank values.
So the output should be like:
LNAME | LFNAME | LNAME_COUNT | LFNAME_COUNT
----------+---------------+--------------+-------------
BROAD | STUART | 1 | 1
BAIRSTOW | | 1 | 0
COOK | ALAISTER,ALEX | 1 | 2
| ALEX | empty | 1
JENNINGS | | 1 | empty
HALES | ALEX | 1 | 1
BUTLER | | 1 | 0
----------+---------------+--------------+-------------
(7 rows)
I've used a case expression, but couldn't break out the (ALAISTER,ALEX) part.
In SQL Server, you can do this using LEN().
SELECT LNAME, LISTAGG AS LFNAME,
       CASE WHEN LNAME IS NULL THEN CAST(0 AS VARCHAR (5))
            -- Number of commas + 1 = number of entries; any non-empty value has at least one.
            WHEN LEN(LNAME) > 0 THEN CAST((LEN(LNAME) - LEN(REPLACE(LNAME, ',', ''))) + 1 AS VARCHAR (5))
            ELSE 'empty' END AS LNAME_COUNT,
       CASE WHEN LISTAGG IS NULL THEN CAST(0 AS VARCHAR (5))
            WHEN LEN(LISTAGG) > 0 THEN CAST((LEN(LISTAGG) - LEN(REPLACE(LISTAGG, ',', ''))) + 1 AS VARCHAR (5))
            ELSE 'empty' END AS LFNAME_COUNT
FROM TableName
Demo on db<>fiddle for SQL Server
If your DBMS is MySQL, change LEN() to LENGTH() (or CHAR_LENGTH() for multi-byte text) and cast to CHAR instead of VARCHAR.
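The counting trick in this answer is that removing every comma shortens the string by exactly the number of commas, so the difference in length plus one is the number of entries. A minimal worked example on a single value (the variable @s is illustrative):

```sql
DECLARE @s varchar(50) = 'ALAISTER,ALEX';

SELECT LEN(@s)                                  AS total_len,           -- 13
       LEN(REPLACE(@s, ',', ''))                AS len_without_commas,  -- 12
       LEN(@s) - LEN(REPLACE(@s, ',', '')) + 1  AS entry_count;         -- 2
```

Note that LEN() ignores trailing spaces in SQL Server, so the same expression would miscount if the delimiter were a space.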
The following query will do what you are looking for:
Declare @table table ( lname varchar (20), listagg varchar (20));
insert into @table ( lname , listagg ) values
('' , 'alex'),
('bairstow', null),
('broad' , 'stuart'),
('butler' , ''),
('cook' , 'alaister,alex'),
('hales' , 'alex'),
('jennings', null);
select
    lname,
    listagg as LFName,
    case when lname <>'' then len(lname)-len(replace(lname,',',''))+1 else 0 end as LName_count,
    case when listagg <>'' then len(listagg)-len(replace(listagg,',',''))+1 else 0 end as LFName_count
from @table

Using string_split to create rows from multiple columns

I have data that looks something like this example (on an unfortunately much larger scale):
+----+-------+--------------------+-----------------------------------------------+
| ID | Data | Cost | Comments |
+----+-------+--------------------+-----------------------------------------------+
| 1 | 1|2|3 | $0.00|$3.17|$42.42 | test test||previous thing has a blank comment |
+----+-------+--------------------+-----------------------------------------------+
| 2 | 1 | $420.69 | test |
+----+-------+--------------------+-----------------------------------------------+
| 3 | 1|2 | $3.50|$4.20 | |test |
+----+-------+--------------------+-----------------------------------------------+
Some of the columns in my table are pipe-delimited, and they are consistent within each row: each delimited value corresponds to the value at the same index in the other columns of that row.
So I can do something like this which is what I want for a single column:
SELECT ID, s.value AS datavalue
FROM MyTable t CROSS APPLY STRING_SPLIT(t.Data, '|') s
and that would give me this:
+----+-----------+
| ID | datavalue |
+----+-----------+
| 1 | 1 |
+----+-----------+
| 1 | 2 |
+----+-----------+
| 1 | 3 |
+----+-----------+
| 2 | 1 |
+----+-----------+
| 3 | 1 |
+----+-----------+
| 3 | 2 |
+----+-----------+
but I also want to get the other columns as well (cost and comments in this example) so that the corresponding items are all in the same row like this:
+----+-----------+-----------+------------------------------------+
| ID | datavalue | costvalue | commentvalue |
+----+-----------+-----------+------------------------------------+
| 1 | 1 | $0.00 | test test |
+----+-----------+-----------+------------------------------------+
| 1 | 2 | $3.17 | |
+----+-----------+-----------+------------------------------------+
| 1 | 3 | $42.42 | previous thing has a blank comment |
+----+-----------+-----------+------------------------------------+
| 2 | 1 | $420.69 | test |
+----+-----------+-----------+------------------------------------+
| 3 | 1 | $3.50 | |
+----+-----------+-----------+------------------------------------+
| 3 | 2 | $4.20 | test |
+----+-----------+-----------+------------------------------------+
I'm not sure what the best or simplest way to achieve this would be.
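This isn't achievable with STRING_SPLIT in versions before SQL Server 2022, because the function does not return the ordinal position of each item and its output order is not guaranteed (SQL Server 2022 and Azure SQL Database add an optional enable_ordinal argument). On earlier versions, you'll need a different function that does return the position. Personally, I recommend Jeff Moden's DelimitedSplit8k.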
Then, you can do this:
CREATE TABLE #Sample (ID int,
[Data] varchar(200),
Cost varchar(200),
Comments varchar(8000));
GO
INSERT INTO #Sample
VALUES (1,'1|2|3','$0.00|$3.17|$42.42','test test||previous thing has a blank comment'),
(2,'1','$420.69','test'),
(3,'1|2','$3.50|$4.20','|test');
GO
SELECT S.ID,
DSd.Item AS DataValue,
DSc.Item AS CostValue,
DSct.Item AS CommentValue
FROM #Sample S
CROSS APPLY dbo.DelimitedSplit8K(S.[Data],'|') DSd
CROSS APPLY (SELECT *
FROM DelimitedSplit8K(S.Cost,'|') SS
WHERE SS.ItemNumber = DSd.ItemNumber) DSc
CROSS APPLY (SELECT *
FROM DelimitedSplit8K(S.Comments,'|') SS
WHERE SS.ItemNumber = DSd.ItemNumber) DSct;
GO
DROP TABLE #Sample;
GO
There is, however, only one true answer to this question: Don't store delimited values in SQL Server. Store them in a normalised manner, and you won't have this problem.
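For readers on SQL Server 2022 (16.x) or Azure SQL Database, STRING_SPLIT accepts an optional third argument, enable_ordinal, and then returns an ordinal column alongside value. Assuming that version is available, the same join-by-position idea could be sketched without a user-defined function. The table variable @Sample is an illustrative stand-in for the temp table above.

```sql
-- Requires SQL Server 2022+ or Azure SQL Database (enable_ordinal support).
DECLARE @Sample TABLE (ID int, [Data] varchar(200), Cost varchar(200), Comments varchar(8000));
INSERT INTO @Sample VALUES
(1, '1|2|3', '$0.00|$3.17|$42.42', 'test test||previous thing has a blank comment'),
(2, '1', '$420.69', 'test'),
(3, '1|2', '$3.50|$4.20', '|test');

SELECT S.ID,
       d.value  AS DataValue,
       c.value  AS CostValue,
       ct.value AS CommentValue
FROM @Sample AS S
-- The third argument (1) asks STRING_SPLIT to return the ordinal column.
CROSS APPLY STRING_SPLIT(S.[Data], '|', 1) AS d
CROSS APPLY (SELECT value
             FROM STRING_SPLIT(S.Cost, '|', 1)
             WHERE ordinal = d.ordinal) AS c
CROSS APPLY (SELECT value
             FROM STRING_SPLIT(S.Comments, '|', 1)
             WHERE ordinal = d.ordinal) AS ct
ORDER BY S.ID, d.ordinal;
```

As with the DelimitedSplit8K version, a row whose columns hold different numbers of items would silently lose the unmatched items, so this relies on the columns being consistent.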
Here is an approach that uses a recursive CTE instead of a user-defined function (UDF), which is useful for those without permission to create functions.
CREATE TABLE mytable(
ID INTEGER NOT NULL PRIMARY KEY
,Data VARCHAR(7) NOT NULL
,Cost VARCHAR(20) NOT NULL
,Comments VARCHAR(47) NOT NULL
);
INSERT INTO mytable(ID,Data,Cost,Comments) VALUES (1,'1|2|3','$0.00|$3.17|$42.42','test test||previous thing has a blank comment');
INSERT INTO mytable(ID,Data,Cost,Comments) VALUES (2,'1','$420.69','test');
INSERT INTO mytable(ID,Data,Cost,Comments) VALUES (3,'1|2','$3.50|$4.20','|test');
This query lets you choose the delimiter through a variable. Using a common table expression, it then parses each delimited string to produce a row for each portion of those strings, and retains the ordinal position of each.
declare @delimiter as varchar(1)
set @delimiter = '|'
;with cte as (
    select id
    , convert(varchar(max), null) as datavalue
    , convert(varchar(max), null) as costvalue
    , convert(varchar(max), null) as commentvalue
    , convert(varchar(max), data + @delimiter) as data
    , convert(varchar(max), cost + @delimiter) as cost
    , convert(varchar(max), comments + @delimiter) as comments
    from mytable as t
    union all
    select id
    , convert(varchar(max), left(data, charindex(@delimiter, data) - 1))
    , convert(varchar(max), left(cost, charindex(@delimiter, cost) - 1))
    , convert(varchar(max), left(comments, charindex(@delimiter, comments) - 1))
    , convert(varchar(max), stuff(data, 1, charindex(@delimiter, data), ''))
    , convert(varchar(max), stuff(cost, 1, charindex(@delimiter, cost), ''))
    , convert(varchar(max), stuff(comments, 1, charindex(@delimiter, comments), ''))
    from cte
    where (data like ('%' + @delimiter + '%') and cost like ('%' + @delimiter + '%')) or comments like ('%' + @delimiter + '%')
)
select id, datavalue, costvalue, commentvalue
from cte
where datavalue IS NOT NULL
order by id, datavalue
As the recursion adds new rows, it places the first portion of each delimited string into the wanted output columns using left(), then uses stuff() to remove that portion and its delimiter from the source strings, so that the next row starts at the next item. Note that to start the extraction, the delimiter is appended to the end of each source string, which ensures the where clause does not exclude the last of the wanted values.
the result:
id datavalue costvalue commentvalue
---- ----------- ----------- ------------------------------------
1 1 $0.00 test test
1 2 $3.17
1 3 $42.42 previous thing has a blank comment
2 1 $420.69 test
3 1 $3.50
3 2 $4.20 test
demonstrated here at dbfiddle.uk
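To make the left()/stuff() mechanics easier to follow, here is the same recursion reduced to a single string. The variable @s and the column names item, rest and pos are illustrative; pos is an explicit position counter added for this sketch, which the query above does not carry (it orders by datavalue instead).

```sql
DECLARE @s varchar(max) = '1|2|3';

;WITH cte AS (
    -- Anchor: nothing extracted yet; append a delimiter so the last item is found too.
    SELECT CONVERT(varchar(max), NULL)     AS item,
           CONVERT(varchar(max), @s + '|') AS rest,
           0                               AS pos
    UNION ALL
    -- Each step takes the text before the first delimiter,
    -- then removes that text and the delimiter from the remainder.
    SELECT CONVERT(varchar(max), LEFT(rest, CHARINDEX('|', rest) - 1)),
           CONVERT(varchar(max), STUFF(rest, 1, CHARINDEX('|', rest), '')),
           pos + 1
    FROM cte
    WHERE rest LIKE '%|%'
)
SELECT pos, item
FROM cte
WHERE pos > 0
ORDER BY pos;
```

The remainder goes from '1|2|3|' to '2|3|' to '3|' to '', at which point the LIKE test stops the recursion. With the default MAXRECURSION setting of 100, a string with more than 100 items would need an OPTION (MAXRECURSION n) hint.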

How to get certain word from a column value

Column A
/Site/Test1/mysite/Do?id=90
/Site/Test2/mysite/Done?id=10
/NewSite/Site/Test3/mysite/Do?id=90
/Site/Test3/mysite/Done?id=1901
What I am trying to do is get the Test# from each row as well as the # after the =.
I tried the following:
Select
SUBSTRING([Column A], CHARINDEX('/', [Column A], 1) + 7, LEN([Column A])),
SUBSTRING([Column A], CHARINDEX('=', [Column A], 1) + 1, LEN([Column A])),
[Column A]
from
Table1
I am able to get the # after the =, but how can I get the Test# from each row?
UPDATE: Test# is an example, it can be anything in there. What is for certain is Site and NewSite.
UPDATE #2:
Updated Table:
Column A
/Site/My%20Web%20Site/mysite/Do?id=90
/Site/Test%20It%20Out/mysite/Do?id=101
/Site/Test1/dummy/Done?id=1000
/NewSite/Site/No%20Way/thesite/Do?id=909
Result:
Col1 Col2
My%20Web%20Site 90
Test%20It%20Out 101
Test1 1000
No%20Way 909
select
Col1 = substring(a
, charindex('/Site/', a)+6
, charindex('/', a,(charindex('/Site/', a)+6))-(charindex('/Site/', a)+6)
)
, Col2 = substring(a
, charindex('=', a, 1) + 1
, len(a))
from t
rextester demo: http://rextester.com/DEBB37305
returns:
+-----------------+------+
| Col1 | Col2 |
+-----------------+------+
| My%20Web%20Site | 90 |
| Test%20It%20Out | 101 |
| Test1 | 1000 |
| No%20Way | 909 |
+-----------------+------+
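The nested charindex() calls in this answer are easier to read when the two positions are named. The sketch below works through one value from the question; the variable @a and the aliases start_pos and end_pos are illustrative.

```sql
DECLARE @a varchar(100) = '/NewSite/Site/No%20Way/thesite/Do?id=909';

SELECT s.start_pos,                                            -- 15: first character after '/Site/'
       e.end_pos,                                              -- 23: next '/' after that point
       SUBSTRING(@a, s.start_pos, e.end_pos - s.start_pos) AS Col1  -- No%20Way
FROM (SELECT CHARINDEX('/Site/', @a) + 6) AS s(start_pos)
CROSS APPLY (SELECT CHARINDEX('/', @a, s.start_pos)) AS e(end_pos);
```

Searching for '/Site/' with both slashes is what skips the '/NewSite/' prefix, since that segment has no slash directly before 'Site'.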
This should work when the value always starts with 'Test' followed by a single character:
select SUBSTRING(col,CHARINDEX('Test',col),5)
To test it with one example:
select SUBSTRING('/Site/Test1/mysite/Do?id=90',CHARINDEX('Test','/Site/Test1/mysite/Do?id=90'),5)

SQL SELECT: concatenated column with line breaks and heading per group

I have the following SQL result from a SELECT query:
ID | category| value | desc
1 | A | 10 | text1
2 | A | 11 | text11
3 | B | 20 | text20
4 | B | 21 | text21
5 | C | 30 | text30
This result is stored in a temporary table named #temptab. This temporary table is then used in another SELECT to build up a new column via string concatenation (don't ask me about the detailed rationale behind this; it is code I took over from a colleague). Via FOR XML PATH(), the output of this column is a list of the results, which is then used to send mails to customers.
The second SELECT looks as follows:
SELECT t1.column,
t2.column,
(SELECT t.category + ' | ' + t.value + ' | ' + t.desc + CHAR(9) + CHAR(13) + CHAR(10)
FROM #temptab t
WHERE t.ID = ttab.ID
FOR XML PATH(''),TYPE).value('.','NVARCHAR(MAX)') AS colname
FROM table1 t1
...
INNER JOIN #temptab ttab on ttab.ID = someOtherTable.ID
...
Without wanting to go into too much detail, the column colname is populated with several entries (due to multiple matches), so a longer string is stored in this column (CHAR(9) + CHAR(13) + CHAR(10) is a tab followed by a carriage return and line feed, i.e. essentially a line break). The content of colname looks like this:
A | 10 | text1
A | 11 | text11
B | 20 | text20
B | 21 | text21
C | 30 | text30
Now I would like to know, if there is a way to more nicely format this output string. The best case would be to group the same categories together and add a heading and empty line between different categories:
*A*
A | 10 | text1
A | 11 | text11
*B*
B | 20 | text20
B | 21 | text21
*C*
C | 30 | text30
My question is: How do I have to modify the above query (especially the string-concatenation-part) to achieve above formatting? I was thinking about using a GROUP BY statement, but this obviously does not yield the desired result.
Edit: I use Microsoft SQL Server 2008 R2 (SP2) - 10.50.4270.0 (X64)
Declare @YourTable table (ID int,category varchar(50),value int, [desc] varchar(50))
Insert Into @YourTable values
(1,'A',10,'text1'),
(2,'A',11,'text11'),
(3,'B',20,'text20'),
(4,'B',21,'text21'),
(5,'C',30,'text30')
Declare @String varchar(max) = ''
Select @String = @String + Case when RowNr=1 Then Replicate(char(13)+char(10),2) +'*'+Category+'*' Else '' end
                 + char(13)+char(10) + category + ' | ' + cast(value as varchar(25)) + ' | ' + [desc]
From (
    Select *
    ,RowNr=Row_Number() over (Partition By Category Order By Value)
    From @YourTable
) A Order By Category, Value
Select Substring(@String,5,Len(@String))
Returns
*A*
A | 10 | text1
A | 11 | text11
*B*
B | 20 | text20
B | 21 | text21
*C*
C | 30 | text30
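Since the original query already builds its string with FOR XML PATH(''), the same heading logic could probably be placed inside that subquery instead of a variable. The following is a sketch only, using the same illustrative sample table; the variable @crlf and the column alias Formatted are names introduced here. Concatenating into a variable across rows with ORDER BY is widely reported as not guaranteed to honour the order, which is the main reason to consider this form.

```sql
DECLARE @YourTable TABLE (ID int, category varchar(50), value int, [desc] varchar(50));
INSERT INTO @YourTable VALUES
(1, 'A', 10, 'text1'),
(2, 'A', 11, 'text11'),
(3, 'B', 20, 'text20'),
(4, 'B', 21, 'text21'),
(5, 'C', 30, 'text30');

DECLARE @crlf char(2) = CHAR(13) + CHAR(10);

SELECT STUFF((
    SELECT -- First row of each category: blank line plus a *heading*.
           CASE WHEN ROW_NUMBER() OVER (PARTITION BY category ORDER BY value) = 1
                THEN @crlf + @crlf + '*' + category + '*'
                ELSE '' END
           + @crlf + category + ' | ' + CAST(value AS varchar(25)) + ' | ' + [desc]
    FROM @YourTable
    ORDER BY category, value
    FOR XML PATH(''), TYPE).value('.', 'varchar(max)'),
    1, 4, '') AS Formatted;  -- STUFF drops the two leading line breaks (4 characters)
```

In the original query this subquery would also need the correlation predicate from its WHERE clause, which is omitted here because the surrounding joins were elided in the question.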
This should return what you want
Declare @YourTable table (ID int,category varchar(50),value int, [desc] varchar(50))
Insert Into @YourTable values
(1,'A',10,'text1'),
(2,'A',11,'text11'),
(3,'B',20,'text20'),
(4,'B',21,'text21'),
(5,'C',30,'text30');
WITH Categories AS
(
    SELECT category
    ,'**' + category + '**' AS CatCaption
    ,ROW_NUMBER() OVER(ORDER BY category) AS CatRank
    FROM @YourTable
    GROUP BY category
)
,Grouped AS
(
    SELECT c.CatRank
    ,0 AS ValRank
    ,c.CatCaption AS category
    ,-1 AS ID
    ,'' AS Value
    ,'' AS [desc]
    FROM Categories AS c
    UNION ALL
    SELECT c.CatRank
    ,ROW_NUMBER() OVER(PARTITION BY t.category ORDER BY t.Value)
    ,t.category
    ,t.ID
    ,CAST(t.value AS VARCHAR(100))
    ,t.[desc]
    FROM @YourTable AS t
    INNER JOIN Categories AS c ON t.category=c.category
)
SELECT category,Value,[desc]
FROM Grouped
ORDER BY CatRank,ValRank
The result
category Value desc
**A**
A 10 text1
A 11 text11
**B**
B 20 text20
B 21 text21
**C**
C 30 text30