Replacing multiple commas with single comma in MS SQL - sql

How do I replace consecutive commas in a column with single comma in MS SQL?
For example, I have data like
a,,,,b,,,c,,,,,,
d,e,,,f,,,,,,g,,
I want this to be processed to following format:
a,b,c,
d,e,f,g,
The suggested duplicate, Use SQL to Replace Multiple Commas in a String with a Single Comma, is for Oracle. This is a question about SQL Server.

You could use simple REPLACE:
SELECT c, REPLACE(REPLACE(REPLACE(c, ',', '~,'), ',~', ''), '~,', ',')
FROM tab;
DBFiddle Demo
Output:
┌──────────────────┬──────────┐
│ c │ result │
├──────────────────┼──────────┤
│ a,,,,b,,,c,,,,,, │ a,b,c, │
│ d,e,,,f,,,,,,g,, │ d,e,f,g, │
└──────────────────┴──────────┘
Please note that this approach does not depend on SQL dialect and should work with MySQL/Oracle/PostgreSQL/...

It's quite easy to do it with CTE:
declare #s varchar(20) = 'a,,,,b,,,c,,,,,, d,e,,,f,,,,,,g,,'
;with cte as (
select replace(#s, ',,', ',') [s], 1 [rn]
union all
select replace(s, ',,', ',') [s], [rn] + 1
from cte
where LEN(s) - LEN(replace(s, ',,', '')) > 0
)
select top 1 #s = s from cte
order by rn desc
select #s

Although there are very good answers (personally I'd tend to lad2025) I'd like to post another approach (especially as extension to the answer by DruvJoshi)
DECLARE #tbl TABLE(s VARCHAR(100))
INSERT INTO #tbl VALUES('d,e,,,f,,,,,,g,,')
,('a,,,,b,,,c,,,,,,');
SELECT CAST('<x>'+REPLACE(s,',','</x><x>')+'</x>' AS XML)
.query('for $x in /x[text()]
return
<x>
{
concat($x, ",")
}
</x>
')
.value('.','nvarchar(max)') AS result
FROM #tbl;
Short explanation:
The solution uses the well-known XML trick to split a string. The rest is XQuery. The predicate /x[text()] will reduce the nodes to the ones with content. They will be re-created with a comma appended. The .value() with an XPath of . will return one single string of all content within the XML.

please use
SELECT
REGEXP_REPLACE(
REGEXP_REPLACE('Hi,,How are you? Fine, thanks ,, , ,,,, , James,Arden.', ', | ,', ','), --Replace ', ' and ' ,' with ','
',{1,}', ', ') single_comma_text --Replace one or more comma with comma followed by space
FROM DUAL;

Try this Script which eliminates space,multiple commas and give single comma separated as result
DECLARE #tbl AS TABLE (data nvarchar(max))
INSERT INTO #tbl
SELECT 'a,,,,b,,,c,,,,,, ,,,, ,,, ,, ,,,,,d,,,,,,,, ,, d,e,,,f,,,,,,g,,'
;WITH CTE
AS
(
SELECT data
,CAST(LEFT(data,1) AS VARCHAR(10)) AS Letter
,RIGHT(Data,LEN(Data)-1) AS Remainder
FROM #tbl
WHERE LEN(data)>1
UNION ALL
SELECT data
,CAST(LEFT(Remainder,1) AS VARCHAR(10)) AS Letter
,RIGHT(Remainder,LEN(Remainder)-1) AS Remainder
FROM CTE
WHERE LEN(Remainder)>0
)
SELECT STUFF((SELECT ', '+ Letter
FROM
(
SELECT Letter
FROM CTE
WHERE Letter <>',' AND Letter <>''
)dt FOR XML PATH ('')),1,1,'') AS RequiredOutPut
Result
RequiredOutPut
------------------
a, b, c, d, d, e, f, g
Demo : http://rextester.com/VEZS31565

If you can have unknown number of commas I'd recommend splitter using XML path like below.
Assuming you have a table T with column c
also see working demo
Explanation is inline as comments
/* first we split out each letter from each column using XML Path after replacing commas with empty nodes */
; with cte as (
select
id,s
from
(
select
id,
xmldata=cast('<x>'+replace(c,',','</x><x>')+'</x>' as xml) -- conversion to XML from varchar
from t
)A
cross apply
(
select
s = data.D.value('.','varchar(100)')
FROM
A.xmldata.nodes('x') AS data(D)
)c
where s <>''-- filter out empty nodes i.e. commas
)
/* Now we join back results from CTE by adding single comma between letters*/
select distinct id, stuff
((select ','+ s
from cte c1
where c1.id =c2.id
for xml path ('')),1,1,'')
from cte c2

This is what i did.
select replace(replace(replace('a,,,b,,,c,d,e,,,,f',',','<>'),'><',''),'<>',',')

Here is the simple one to suffice all the below cases:
Removing multiple commas in starting of the string
Removing multiple commas at the end of the string
Removing multiple commas to single comma in the middle of the string
select REGEXP_REPLACE(REGEXP_REPLACE(',,LE,,EN,,A,,,','^,*|,*$',''),',{1,}', ', ');
Output :
LE, EN, A

Related

SQL: select the last values before a space in a string

I have a set of strings like this:
CAP BCP0018 36
MFP ACZZ1BD 265
LZP FEI-12 3
I need to extract only the last values from the right and before the space, like:
36
265
3
how will the select statement look like? I tried using the below statement, but it did not work.
select CHARINDEX(myField, ' ', -1)
FROM myTable;
Perhaps the simplest method in SQL Server is:
select t.*, v.value
from t cross apply
(select top (1) value
from string_split(t.col, ' ')
where t.col like concat('% ', val)
) v;
This is perhaps not the most performant method. You probably would use:
select right(t.col, charindex(' ', reverse(t.col)) - 1)
Note: If there are no spaces, then to prevent an error:
select right(t.col, charindex(' ', reverse(t.col) + ' ') - 1)
Since you have mentioned CHARINDEX() in question, I am assuming you are using SQL Server.
Try below
declare #table table(col varchar(100))
insert into #table values('CAP BCP0018 36')
insert into #table values('MFP ACZZ1BD 265')
insert into #table values('LZP FE-12 3')
SELECT REVERSE(LEFT(REVERSE(col),CHARINDEX(' ',REVERSE(col)) - 1)) FROM #table
Functions used
CHARINDEX ( expressionToFind , expressionToSearch ) : returns position of FIRST occurence of an expression inside another expression.
LEFT ( character_expression , integer_expression ) : Returns the left part of a character string with the specified number of characters.
REVERSE ( string_expression ) : Returns the reverse order of a string value

Order Concatenated field

I have a field which is a concatenation of single letters. I am trying to order these strings within a view. These values can't be hard coded as there are too many. Is someone able to provide some guidance on the function to use to achieve the desired output below? I am using MSSQL.
Current output
CustID | Code
123 | BCA
Desired output
CustID | Code
123 | ABC
I have tried using a UDF
CREATE FUNCTION [dbo].[Alphaorder] (#str VARCHAR(50))
returns VARCHAR(50)
BEGIN
DECLARE #len INT,
#cnt INT =1,
#str1 VARCHAR(50)='',
#output VARCHAR(50)=''
SELECT #len = Len(#str)
WHILE #cnt <= #len
BEGIN
SELECT #str1 += Substring(#str, #cnt, 1) + ','
SET #cnt+=1
END
SELECT #str1 = LEFT(#str1, Len(#str1) - 1)
SELECT #output += Sp_data
FROM (SELECT Split.a.value('.', 'VARCHAR(100)') Sp_data
FROM (SELECT Cast ('<M>' + Replace(#str1, ',', '</M><M>') + '</M>' AS XML) AS Data) AS A
CROSS APPLY Data.nodes ('/M') AS Split(a)) A
ORDER BY Sp_data
RETURN #output
END
This works when calling one field
ie.
Select CustID, dbo.alphaorder(Code)
from dbo.source
where custid = 123
however when i try to apply this to top(10) i receive the error
"Invalid length parameter passed to the LEFT or SUBSTRING function."
Keeping in mind my source has ~4million records, is this still the best solution?
Unfortunately i am not able to normalize the data into a separate table with records for each Code.
This doesn't rely on a id column to join with itself, performance is almost as fast
as the answer by #Shnugo:
SELECT
CustID,
(
SELECT
chr
FROM
(SELECT TOP(LEN(Code))
SUBSTRING(Code,ROW_NUMBER() OVER(ORDER BY (SELECT NULL)),1)
FROM sys.messages) A(Chr)
ORDER by chr
FOR XML PATH(''), type).value('.', 'varchar(max)'
) As CODE
FROM
source t
First of all: Avoid loops...
You can try this:
DECLARE #tbl TABLE(ID INT IDENTITY, YourString VARCHAR(100));
INSERT INTO #tbl VALUES ('ABC')
,('JSKEzXO')
,('QKEvYUJMKRC');
--the cte will create a list of all your strings separated in single characters.
--You can check the output with a simple SELECT * FROM SeparatedCharacters instead of the actual SELECT
WITH SeparatedCharacters AS
(
SELECT *
FROM #tbl
CROSS APPLY
(SELECT TOP(LEN(YourString)) ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) FROM master..spt_values) A(Nmbr)
CROSS APPLY
(SELECT SUBSTRING(YourString,Nmbr,1))B(Chr)
)
SELECT ID,YourString
,(
SELECT Chr As [*]
FROM SeparatedCharacters sc1
WHERE sc1.ID=t.ID
ORDER BY sc1.Chr
FOR XML PATH(''),TYPE
).value('.','nvarchar(max)') AS Sorted
FROM #tbl t;
The result
ID YourString Sorted
1 ABC ABC
2 JSKEzXO EJKOSXz
3 QKEvYUJMKRC CEJKKMQRUvY
The idea in short
The trick is the first CROSS APPLY. This will create a tally on-the-fly. You will get a resultset with numbers from 1 to n where n is the length of the current string.
The second apply uses this number to get each character one-by-one using SUBSTRING().
The outer SELECT calls from the orginal table, which means one-row-per-ID and use a correalted sub-query to fetch all related characters. They will be sorted and re-concatenated using FOR XML. You might add DISTINCT in order to avoid repeating characters.
That's it :-)
Hint: SQL-Server 2017+
With version v2017 there's the new function STRING_AGG(). This would make the re-concatenation very easy:
WITH SeparatedCharacters AS
(
SELECT *
FROM #tbl
CROSS APPLY
(SELECT TOP(LEN(YourString)) ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) FROM master..spt_values) A(Nmbr)
CROSS APPLY
(SELECT SUBSTRING(YourString,Nmbr,1))B(Chr)
)
SELECT ID,YourString
,STRING_AGG(sc.Chr,'') WITHIN GROUP(ORDER BY sc.Chr) AS Sorted
FROM SeparatedCharacters sc
GROUP BY ID,YourString;
Considering your table having good amount of rows (~4 Million), I would suggest you to create a persisted calculated field in the table, to store these values. As calculating these values at run time in a view, will lead to performance problems.
If you are not able to normalize, add this as a denormalized column to the existing table.
I think the error you are getting could be due to empty codes.
If LEN(#str) = 0
BEGIN
SET #output = ''
END
ELSE
BEGIN
... EXISTING CODE BLOCK ...
END
I can suggest to split string into its characters using referred SQL function.
Then you can concatenate string back, this time ordered alphabetically.
Are you using SQL Server 2017? Because with SQL Server 2017, you can use SQL String_Agg string aggregation function to concatenate characters splitted in an ordered way as follows
select
t.CustId, string_agg(strval, '') within GROUP (order by strval)
from CharacterTable t
cross apply dbo.SPLIT(t.code) s
where strval is not null
group by CustId
order by CustId
If you are not working on SQL2017, then you can follow below structure using SQL XML PATH for concatenation in SQL
select
CustId,
STUFF(
(
SELECT
'' + strval
from CharacterTable ct
cross apply dbo.SPLIT(t.code) s
where strval is not null
and t.CustId = ct.CustId
order by strval
FOR XML PATH('')
), 1, 0, ''
) As concatenated_string
from CharacterTable t
order by CustId

Remove Characters in a String in SQL

I have a column u_manualdoc which contains the values are like this CGY DR# 7405. I want to remove the CGY DR#.
Here's the code:
select u_manualdoc, cardcode, cardname from ODLN
I want only the 7405 number. Thanks!
Try this:
--sample data you provided in comments
declare #tbl table(codes varchar(20))
insert into #tbl values
('CGY PST - 58277') , ('CGY RMC PST # 58083'), ('CGY DR # 7443'), ('CSI # 1304'), ('PO# 0568 , 0570'), ('CGY DR# 7446')
--actual query that you can apply to your table
select SUBSTRING(codes, PATINDEX('%[0-9]%', codes), len(codes)) from #tbl
The key point here is to use patindex, which searches for a pattern and returns index where such pattern occur. I specified %[0-9]% which means that we search for any digit - it will return first occurrence of a digit. Now- since this would be our starting point to substring, we pass it to such function. Third parameter of substring is length. Since we want the rest of a string, len function makes sure that we get that :)
Applying to your naming:
select SUBSTRING(u_manualdoc, PATINDEX('%[0-9]%', u_manualdoc), len(u_manualdoc)),
cardcode,
cardname
from ODLN
You should use string functions charindex,len and substring to get it.
See the code below.
select SUBSTRING(u_manualdoc,CHARINDEX('#',u_manualdoc)+1,LEN(u_manualdoc)- CHARINDEX('#',u_manualdoc))
EDIT
In addition to the other answers, you can use this simple method:
select
substring(
u_manualdoc,
len(u_manualdoc) - patindex('%[^0-9]%', reverse(u_manualdoc)) + 2,
len(u_manualdoc)
),
cardcode, cardname
from ODLN
In this example, patindex finds the first non-digit (as specified by ^[0-9]) from the right side of the string, and then uses that as the starting point of the substring.
This will work on all of your sample strings (including 'PO# 0568 , 0570 CGY DR# 7446').
Or use SQL Server Regex, which lets you use more powerful regular expressions within your queries.
TRY THIS
DECLARE #table TABLE(DirtyCol VARCHAR(100));
INSERT INTO #table
VALUES('AB ABCDE # 123'), ('ABCDE# 123'), ('AB: ABC# 123 AB: ABC# 123'), ('AB#'), ('AB # 1 000 000'), ('AB # 1`234`567'), ('AB # (9)(876)(543)');
WITH tally
AS (
SELECT TOP (100) N = ROW_NUMBER() OVER(ORDER BY ##spid)
FROM sys.all_columns),
data
AS (
SELECT DirtyCol,
Col
FROM #table
CROSS APPLY
(
SELECT
(
SELECT C+''
FROM
(
SELECT N,
SUBSTRING(DirtyCol, N, 1) C
FROM tally
WHERE N <= DATALENGTH(DirtyCol)
) [1]
WHERE C BETWEEN '0' AND '9'
ORDER BY N FOR XML PATH('')
)
) p(Col)
WHERE p.Col IS NOT NULL)
SELECT DirtyCol,
CAST(Col AS INT) IntCol
FROM data;

how to replace a comma with single quote comma in SQLIN clause

Hi all I have a an where clause like below
select * from table1
where colum1 IN (?parameter)
when I pass the values to the parameter they show up like below
('1,2,3') but to execute the query I need to change the values as ('1','2','3')
is there a way to replace the commas with single quotes comma in IN clause directly?
There is one hack to do what you want, using like:
select *
from table1
where ',' || column1 || ',' like '%,' || (?parameter) || ',%';
This functions, but it will not make use of an index on column1. You should think about other solutions, such as:
Parsing the string into a table variable.
Using in with a fixed number of parameters.
Storing the values in a table.
There may be other Oracle-specific solutions as well.
Use MS SQL, you can convert it into table value and using join for your condition, I am not familiar oracle but you can find same way to do it.
DECLARE #IDs varchar(max) ='1,2,3';
;WITH Cte AS
(
SELECT
CAST('<ID>' + REPLACE( #IDs, ',' , '</ID><ID>') + '</ID>' AS XML) AS IDs
)
SELECT '''' + Split.a.value('.', 'VARCHAR(100)') +'''' AS ID FROM Cte
CROSS APPLY Cte.IDs.nodes('/ID') Split(a)
You can achieve it using with clause. Logic here is to convert each comma separated value into different row.
with temp_tab as (
select replace(regexp_substr(parameter, '[^,]+',1, level),'''','') as str
from dual
connect by level<= length(regexp_replace(parameter, '[^,]+'))+1 )
select * from table1 where column1 in (select str from temp_tab);

case statement to delete extra spaces only when there is some

I replace all blanks with # using this
SELECT *, REPLACE(NAME,' ','#') AS NAME2
which results miss#test#blogs############## (different number of #s dependent on length of name!
I then delete all # signs after the name using this
select *, substring(Name2,0,charindex('##',Name2)) as name3
which then gives my desired results of, for example MISS#test#blogs
However some wheren't giving this result, they are null. This is because annoyingly some rows in the sheet I have read in dont have the spaces after the name.
is there a case statement i can use so it only deletes # signs after the name if they are there in the first place?
Thanks
The function rtrim can be used to remove trailing spaces. For example:
select replace(rtrim('miss test blogs '),' ','#')
-->
'miss#test#blogs'
Example at SQL Fiddle.
try this:
Declare #t table (name varchar(100),title varchar(100),forename varchar(100))
insert into #t
values('a b c','dasdh dsalkdk asdhl','asd dfg sd')
SELECT REPLACE(REPLACE(REPLACE(LTRIM(RTRIM(name)),' ',' '+CHAR(7)),CHAR(7)+' ','') ,CHAR(7),'') AS Name,
REPLACE(REPLACE(REPLACE(LTRIM(RTRIM(title)),' ',' '+CHAR(7)),CHAR(7)+' ','') ,CHAR(7),'') AS title,
REPLACE(REPLACE(REPLACE(LTRIM(RTRIM(forename)),' ',' '+CHAR(7)),CHAR(7)+' ','') ,CHAR(7),'') AS forename
FROM #t WHERE
(CHARINDEX(' ',NAME) > 0 or CHARINDEX(' ',title) > 0 or CHARINDEX(' ',forename) > 0)
SQL Fiddle Demo
select name2, left(name2,len(name2)+1-patindex('%[^#]%',reverse(name2)+'.'))
from (
SELECT *, REPLACE(NAME,' ','#') AS NAME2
from t
) x;
Check this SQL Fiddle
For posterity, sample table:
create table t (name varchar(100));
insert t select 'name#name#ne###'
union all select '#name#name'
union all select 'name name hi '
union all select 'joe public'
union all select ''
union all select 'joe'
union all select 'joe '
union all select null
union all select ' leading spaces'
union all select ' leading trailing ';
Don't quite understand the question, but if the problem is there is not spaces after some names, can't you do this first:
SELECT *, REPLACE(NAME+' ',' ','#') AS NAME2
i.e., add a space to all names right off the bat?
I had this same problem some days ago.
Well actually, there's a quickly way to subtract the spaces from both the begin and end inside strings. In SQL Server, you can use the RTRIM and LTRIM for this. The first one supresses spaces from right side and the second supresses from left. But, if in your scenario also may exists more than one space in the middle of the string I sugest you take a look on this post on SQL Server Central: http://www.sqlservercentral.com/articles/T-SQL/68378/
There the script's author explain, in details, a good solution for this situation.