Function to replace all non alpha-numeric and multiple whitespace characters with a single space


I am trying to write an efficient function to use in a calculated field which has the following characteristics
Replace all non alpha numeric characters with space
Replace multiple white spaces with a space
Trim and lower the results
Example input
A B##%$$C &^%D
Example output
a b c d
A normal regex pattern would match like so
[\W_]+
The following works, but I am not sure whether there is a more efficient approach than two loops (at least O(n²) complexity), one with PATINDEX and STUFF and one with CHARINDEX and REPLACE:
Create Function [dbo].[Clean](@Temp nvarchar(1000))
Returns nvarchar(1000)
AS
Begin
    Declare @Pattern as varchar(50) = '%[^a-z0-9 ]%'
    -- Replace each non-alphanumeric character with a space, one at a time
    While PatIndex(@Pattern, @Temp) > 0
        Set @Temp = Stuff(@Temp, PatIndex(@Pattern, @Temp), 1, ' ')
    -- Collapse runs of spaces: search for two consecutive spaces
    While charindex('  ', @Temp) > 0
        Set @Temp = replace(@Temp, '  ', ' ')
    Return LOWER(TRIM(@Temp))
End
Usage
Select dbo.Clean(' A B##%$$C &^%D ')
Result
a b c d
Is there potentially a single pass approach I can use, or a sneaky method I am not aware of?

I'm not able to test the performance, but the following approach (no loops, only string manipulation) is an additional option.
Note that you'll need at least SQL Server 2017 (for the TRANSLATE() call).
-- Input text and patterns
DECLARE @text varchar(1000) = ' A B##%$$C &^%D'
DECLARE @alphanumericpattern varchar(36) = 'abcdefghijklmnopqrstuvwxyz0123456789'
DECLARE @notalphanumericpattern varchar(1000)
-- Trim and lower the input text
SELECT @text = RTRIM(LTRIM(LOWER(@text)))
-- Get not alpha-numeric characters
SELECT @notalphanumericpattern =
    REPLACE(
        TRANSLATE(@text, @alphanumericpattern, REPLICATE('a', LEN(@alphanumericpattern))),
        'a',
        ''
    )
-- Replace all not alpha-numeric characters with a space
SELECT @text =
    REPLACE(
        TRANSLATE(@text, @notalphanumericpattern, REPLICATE('$', LEN(@notalphanumericpattern))),
        '$',
        ' '
    )
-- Replace multiple spaces with a single space
SELECT @text =
    REPLACE(
        REPLACE(
            REPLACE(
                @text,
                ' ',
                '<>'
            ),
            '><',
            ''
        ),
        '<>',
        ' '
    )
Result:
a b c d
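The final step of that answer relies on a three-stage REPLACE trick. As a minimal sketch (the variable @s and its sample value are made up for illustration, and the text is assumed not to contain < or > already), each stage can be inspected on its own:

```sql
-- Hypothetical sample value; assumes the input contains no '<' or '>' characters.
DECLARE @s varchar(50) = 'a   b  c';

SELECT
    -- Stage 1: every space becomes the pair '<>'
    REPLACE(@s, ' ', '<>') AS Stage1,
    -- Stage 2: adjacent pairs meet as '><' and are removed, leaving one '<>' per run
    REPLACE(REPLACE(@s, ' ', '<>'), '><', '') AS Stage2,
    -- Stage 3: each remaining '<>' is turned back into a single space
    REPLACE(REPLACE(REPLACE(@s, ' ', '<>'), '><', ''), '<>', ' ') AS Stage3;
```

For this sample the stages are 'a<><><>b<><>c', 'a<>b<>c' and 'a b c'. If the data can contain < or >, a different marker pair that never occurs in the data would be needed.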

Related

Replace function in SQL Server

I have a string of data
'["Dog",,,1,"Person","2020-03-17",,4,"Todd]'
I am trying to use the replace function to replace double commas with NULL values
Solution
'["Dog",NULL,NULL,1,"Person","2020-03-17",NULL,4,"Todd]'
But I keep ending up with
'"Dog",NULL,,1,"Person","2020-03-17",NULL,4,"Todd'
(The ,,, needs to become ,NULL,NULL, but only becomes ,NULL,,)
Here is my sample code I'm using
REPLACE(FileData, ',,' , ',NULL,')
WHERE FileData LIKE '%,,%'

If you do the same replacement twice, any number of sequential commas will get handled.
REPLACE(REPLACE(FileData, ',,' , ',NULL,'), ',,' , ',NULL,')
The first REPLACE deals with all the odd positions...
',,,,,,,,' => ',NULL,,NULL,,NULL,,NULL,'
Doing it again will deal with all of the remaining positions.
=> ',NULL,NULL,NULL,NULL,NULL,NULL,NULL,'
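As a quick check of the two-pass idea, here is a small sketch that can be run on its own. The variable @FileData and its value are assumptions modelled on the question's sample string, not the asker's real data:

```sql
-- Hypothetical sample modelled on the question's string.
DECLARE @FileData varchar(100) = '["Dog",,,1,"Person","2020-03-17",,4,"Todd"]';

SELECT
    -- One pass leaves a ',,' behind wherever three commas were adjacent
    REPLACE(@FileData, ',,', ',NULL,') AS OnePass,
    -- A second identical pass fills the gaps left by the first
    REPLACE(REPLACE(@FileData, ',,', ',NULL,'), ',,', ',NULL,') AS TwoPasses;
```

OnePass still contains ',NULL,,1', while TwoPasses yields '["Dog",NULL,NULL,1,"Person","2020-03-17",NULL,4,"Todd"]'.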
Note that by specifically handling the special case of three consecutive commas (as in another answer here) you won't handle four or five or six, etc. The above solution generalises to any length of consecutive commas.
To be fully robust, you may also need to consider when there is a missing NULL at the first or last place in the string.
[,ThatOneToMyLeft,and,ThatOneToMyRight,]
A laborious but robust approach could be to replace [, and ,] with [,, and ,,] respectively, then do the double-replacement, then undo the first steps...
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
FileData,
'[,',
'[,,'
),
',]',
',,]'
),
',,',
',NULL,'
),
',,',
',NULL,'
),
',]',
']'
),
'[,',
'['
)
There are ways to make even that less verbose, but I have to run right now :)

You can try the following:
REPLACE(REPLACE(FileData, ',,,' , ',NULL,,'), ',,' , ',NULL,')
Where FileData LIKE '%,,%'

You can create a function that splits the string on the separator and substitutes the null value for every empty element.
Check this:
update table1
set column1 = dbo.ReplaceEx(column1, ',', 'NULL')
where column1 like '%,,%'
create function dbo.ReplaceEx(@string varchar(2000), @separator varchar(4), @nullValue varchar(10))
returns varchar(4000)
with execute as caller
as
begin
    -- @result starts as NULL so that concat_ws skips it for the first element
    -- (an empty string would produce a leading separator)
    declare @result varchar(4000);
    select @result = concat_ws(@separator, @result,
        case when rtrim(value) = '' then @nullValue
        else case when ltrim(rtrim(value)) = '[' then '[' + @nullValue
        else case when ltrim(rtrim(value)) = ']' then @nullValue + ']'
        else value end end end
        )
    from string_split(@string, @separator);
    return (@result);
end;

How to identify and redact all instances of a matching pattern in T-SQL

I have a requirement to run a function over certain fields to identify and redact any numbers which are 5 digits or longer, ensuring all but the last 4 digits are replaced with *
For example: "Some text with 12345 and 1234 and 12345678" would become "Some text with *2345 and 1234 and ****5678"
I've used PATINDEX to identify the starting character of the pattern:
PATINDEX('%[0-9][0-9][0-9][0-9][0-9]%', TEST_TEXT)
I can recursively call that to get the starting character of all the occurrences, but I'm struggling with the actual redaction.
Does anyone have any pointers on how this can be done? I know to use REPLACE to insert the *s where they need to be, it's just the identification of what I should actually be replacing I'm struggling with.
I could do it in a program, but I need it to be T-SQL (it can be a function if needed).
Any tips greatly appreciated!

You can do this using the built-in functions of SQL Server. All of the functions used in this example are available in SQL Server 2008 and higher.
DECLARE @String VARCHAR(500) = 'Example Input: 1234567890, 1234, 12345, 123456, 1234567, 123asd456'
DECLARE @StartPos INT = 1, @EndPos INT = 1;
DECLARE @Input VARCHAR(500) = ISNULL(@String, '') + ' '; --Sets the input and adds a control character at the end to make the loop easier.
DECLARE @OutputString VARCHAR(500) = ''; --Initialize an empty string to avoid NULL concatenation problems
WHILE (@StartPos <> 0)
BEGIN
    SET @StartPos = PATINDEX('%[0-9][0-9][0-9][0-9][0-9]%', @Input);
    IF @StartPos <> 0
    BEGIN
        SET @OutputString += SUBSTRING(@Input, 1, @StartPos - 1); --Separate all contents before the first occurrence of our filter
        SET @Input = SUBSTRING(@Input, @StartPos, 500); --Cut the entire string to the end. The last value must be at least the original string length to simply cut it all.
        SET @EndPos = (PATINDEX('%[0-9][0-9][0-9][0-9][^0-9]%', @Input)); --First occurrence of 4 digits with a non-digit behind them.
        SET @Input = STUFF(@Input, 1, (@EndPos - 1), REPLICATE('*', (@EndPos - 1))); --@EndPos - 1 gives us the number of chars we want to replace.
    END
END
SET @OutputString += @Input; --Append the last element
SET @OutputString = LEFT(@OutputString, LEN(@OutputString)) --LEN ignores trailing spaces, so this drops the control character
SELECT @OutputString;
Which outputs the following:
Example Input: ******7890, 1234, *2345, **3456, ***4567, 123asd456
This entire code could also be made as a function since it only requires an input text.
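A minimal sketch of that function form follows. The name dbo.RedactLongNumbers and the 500-character limit are assumptions made for illustration; the body is the same loop as above:

```sql
-- Hypothetical wrapper around the loop above; name and sizes are illustrative only.
CREATE FUNCTION dbo.RedactLongNumbers (@String varchar(500))
RETURNS varchar(500)
AS
BEGIN
    DECLARE @StartPos int = 1, @EndPos int;
    -- One extra character of room for the trailing control space
    DECLARE @Input varchar(501) = ISNULL(@String, '') + ' ';
    DECLARE @Output varchar(501) = '';

    WHILE @StartPos <> 0
    BEGIN
        -- Start of the next run of at least five digits
        SET @StartPos = PATINDEX('%[0-9][0-9][0-9][0-9][0-9]%', @Input);
        IF @StartPos <> 0
        BEGIN
            -- Move everything before the run to the output
            SET @Output += SUBSTRING(@Input, 1, @StartPos - 1);
            SET @Input = SUBSTRING(@Input, @StartPos, 501);
            -- Position of the last four digits of the run
            SET @EndPos = PATINDEX('%[0-9][0-9][0-9][0-9][^0-9]%', @Input);
            -- Mask every digit before those last four
            SET @Input = STUFF(@Input, 1, @EndPos - 1, REPLICATE('*', @EndPos - 1));
        END
    END

    SET @Output += @Input;
    -- LEN ignores trailing spaces, so this drops the control space
    RETURN LEFT(@Output, LEN(@Output));
END
GO

SELECT dbo.RedactLongNumbers('Some text with 12345 and 1234 and 12345678') AS Redacted;
```

With the question's sample the call returns 'Some text with *2345 and 1234 and ****5678'. Note that any trailing spaces in the original value are dropped as well.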

A dirty solution with recursive CTE
DECLARE
    @tags nvarchar(max) = N'Some text with 12345 and 1234 and 12345678',
    @c nchar(1) = N' ';
WITH Process (s, i)
AS
(
    SELECT @tags, PATINDEX('%[0-9][0-9][0-9][0-9][0-9]%', @tags)
    UNION ALL
    SELECT value, PATINDEX('%[0-9][0-9][0-9][0-9][0-9]%', value)
    FROM
        -- each step replaces the first digit of the first run of five or more digits with *
        (SELECT SUBSTRING(s,0,i)+'*'+SUBSTRING(s,i+1,len(s)) value
         FROM Process
         WHERE i > 0) calc
)
SELECT * FROM Process
WHERE i = 0
I think a better solution is to add a CLR function to MS SQL Server to handle regular expressions:
sql-clr/RegEx

Here is an option using DelimitedSplit8K_LEAD, which can be found here: https://www.sqlservercentral.com/articles/reaping-the-benefits-of-the-window-functions-in-t-sql-2 This is an extension of Jeff Moden's splitter that is even a little bit faster than the original. The big advantage this splitter has over most of the others is that it returns the ordinal position of each element. One caveat is that I am splitting on a space, based on your sample data. If you have numbers crammed in the middle of other characters, this will ignore them. That may be good or bad depending on your specific requirements.
declare @Something varchar(100) = 'Some text with 12345 and 1234 and 12345678';
with MyCTE as
(
    select x.ItemNumber
        , Result = isnull(case when TRY_CONVERT(bigint, x.Item) is not null then isnull(replicate('*', len(convert(varchar(20), TRY_CONVERT(bigint, x.Item))) - 4), '') + right(convert(varchar(20), TRY_CONVERT(bigint, x.Item)), 4) end, x.Item)
    from dbo.DelimitedSplit8K_LEAD(@Something, ' ') x
)
select Output = stuff((select ' ' + Result
    from MyCTE
    order by ItemNumber
    FOR XML PATH('')), 1, 1, '')
This produces: Some text with *2345 and 1234 and ****5678

SSMS replace all commas outside of quotation marks in string

I've written the following function in SSMS to replace any commas that are outside of quotation marks with ||||:
CREATE FUNCTION dbo.fixqualifier (@string nvarchar(max))
returns nvarchar(max)
as begin
    DECLARE @STRINGTOPAD NVARCHAR(MAX)
    DECLARE @position int = 1,@newstring nvarchar(max) ='',@QUOTATIONMODE INT = 0
    WHILE(LEN(@string)>0)
    BEGIN
        SET @STRINGTOPAD = SUBSTRING(@string,0,IIF(@STRING LIKE '%"%',CHARINDEX('"',@string),LEN(@STRING)))
        SET @newstring = @newstring + IIF(@QUOTATIONMODE = 1, REPLACE(@STRINGTOPAD,',','||||'),@STRINGTOPAD)
        SET @QUOTATIONMODE = IIF(@QUOTATIONMODE = 1,0,1)
        set @string = SUBSTRING(@string,1+IIF(@STRING LIKE '%"%',CHARINDEX('"',@string),LEN(@STRING)),LEN(@string))
    END
    return @newstring
end
The idea is for the function to find the first ", replace all ',' before that then switch to quotation mode 1 so it knows to not replace the , until it changes back to quotation mode 0 when it hits the 2nd " and so on.
so for example the string:
qwer,tyu,io,asd,"edffs,asdfgh","jjkzx",kl
would become:
qwer||||tyu||||io||||asd||||"edffs,asdfgh"||||"jjkzx"||||kl
It works as expected but it's really inefficient when it comes to doing this for several thousand rows.
Is there a better way of doing this, or at least a way of speeding the function up?

You can use a simple trick based on the modulus operator:
DECLARE @VAR VARCHAR(100) = 'qwer,tyu,io,asd,"edffs,asdfgh","jjkzx",kl'
    ,@OUTPUT VARCHAR(100) = '';
SELECT @OUTPUT = @OUTPUT + CASE WHEN (LEN(@OUTPUT) - LEN(REPLACE(@OUTPUT, '"', ''))) % 2 = 0
    THEN REPLACE(VAL, ',', '||||') ELSE VAL END
FROM (
    SELECT SUBSTRING(@VAR, NUMBER, 1) VAL
    FROM master.dbo.spt_values
    WHERE type = 'P'
    AND NUMBER BETWEEN 1 AND LEN(@VAR)
) A
PRINT @OUTPUT
Result:
qwer||||tyu||||io||||asd||||"edffs,asdfgh"||||"jjkzx"||||kl
The expression LEN(@OUTPUT) - LEN(REPLACE(@OUTPUT, '"', '')) gives the number of " characters seen so far. Taking that count modulo 2 tells you where you are: if the result is zero the count is even, you are outside quotes and the comma is replaced; otherwise you are inside quotes and the comma is kept.
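The counting expression can be tried in isolation. In this sketch the variable @prefix and its value are made up to represent the output built so far at the moment a comma inside quotes is reached:

```sql
-- Hypothetical prefix: the text processed so far, ending inside an open quote.
DECLARE @prefix varchar(50) = 'asd,"edffs';

SELECT
    -- Removing the quotes shortens the string by one character per quote
    LEN(@prefix) - LEN(REPLACE(@prefix, '"', '')) AS QuoteCount,
    -- Odd count (1) means inside quotes, even count (0) means outside
    (LEN(@prefix) - LEN(REPLACE(@prefix, '"', ''))) % 2 AS InsideQuotes;
```

Here QuoteCount is 1 and InsideQuotes is 1, so the next comma would be left untouched.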

This uses DelimitedSplit8k and completely avoids any RBAR methods (such as a WHILE or @Variable = @Variable +... (which is a hidden form of RBAR)).
It firstly splits on the quotation, and then on the commas, where the string isn't quoted. Finally it then puts the strings back together again, using the "old" STUFF and FOR XML PATH method:
USE Sandbox;
DECLARE @String varchar(8000) = 'qwer,tyu,io,asd,"edffs,asdfgh","jjkzx",kl';
WITH Splits AS(
    SELECT QS.ItemNumber AS QuoteNumber, CS.ItemNumber AS CommaNumber, ISNULL(CS.Item, '"' + QS.Item + '"') AS DelimitedItem
    FROM dbo.DelimitedSplit8K(@String,'"') QS
    OUTER APPLY (SELECT *
                 FROM dbo.DelimitedSplit8K(QS.Item,',')
                 WHERE QS.ItemNumber % 2 = 1) CS
    WHERE QS.Item <> ',')
SELECT STUFF((SELECT '||||' + S.DelimitedItem
              FROM Splits S
              ORDER BY S.QuoteNumber, S.CommaNumber
              FOR XML PATH('')),1,4,'') AS DelimitedList; -- strip the leading 4-character delimiter
(Note, DelimitedSplit8K does not accept more than 8,000 characters. If you have more than that, SQL Server is really not the right tool. STRING_SPLIT does not provide the ordinal position, so you would be unable to guarantee the rebuild order with it.)

Remove only leading or trailing carriage returns

I'm dumbfounded that this question has not been asked meaningfully already. How does one go about creating a function in SQL, equivalent to LTRIM or RTRIM, that removes carriage returns and line feeds ONLY at the start or end of a string?
Obviously REPLACE(REPLACE(@MyString,char(10),''),char(13),'') removes ALL carriage returns and line feeds, which is NOT what I'm looking for. I just want to remove leading or trailing ones.

Find the first character that is not CHAR(13) or CHAR(10) and subtract its position from the string's length.
LTRIM()
SELECT RIGHT(@MyString,LEN(@MyString)-PATINDEX('%[^'+CHAR(13)+CHAR(10)+']%',@MyString)+1)
RTRIM()
SELECT LEFT(@MyString,LEN(@MyString)-PATINDEX('%[^'+CHAR(13)+CHAR(10)+']%',REVERSE(@MyString))+1)
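The two expressions can be combined into one SUBSTRING call. The following is only a sketch: the sample value and the helper variables @NotCrLf, @First and @FromEnd are made up for illustration, and it assumes the string does not end in spaces (LEN ignores trailing spaces):

```sql
-- Hypothetical sample: CR/LF at both ends and one CR/LF in the middle.
DECLARE @MyString varchar(100) =
    CHAR(13) + CHAR(10) + 'line one' + CHAR(13) + CHAR(10) + 'line two' + CHAR(13) + CHAR(10);

-- Pattern matching the first character that is neither CR nor LF
DECLARE @NotCrLf varchar(10) = '%[^' + CHAR(13) + CHAR(10) + ']%';
-- Position of the first kept character, counted from the start
DECLARE @First int = PATINDEX(@NotCrLf, @MyString);
-- Position of the last kept character, counted from the end
DECLARE @FromEnd int = PATINDEX(@NotCrLf, REVERSE(@MyString));

SELECT CASE
           -- Only CR/LF characters (or empty): nothing remains after trimming
           WHEN @First = 0 THEN ''
           -- Kept length = total - (leading removed) - (trailing removed)
           ELSE SUBSTRING(@MyString, @First, LEN(@MyString) - @First - @FromEnd + 2)
       END AS Trimmed;
```

The CR/LF between 'line one' and 'line two' is preserved; only the leading and trailing pairs are removed.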

The following functions are enhanced trim functions you can use, copied from sqlauthority.com.
These functions remove trailing spaces, leading spaces, white space, tabs, carriage returns, line feeds etc.
Trim Left
CREATE FUNCTION dbo.LTrimX(@str VARCHAR(MAX)) RETURNS VARCHAR(MAX)
AS
BEGIN
    DECLARE @trimchars VARCHAR(10)
    SET @trimchars = CHAR(9)+CHAR(10)+CHAR(13)+CHAR(32)
    IF @str LIKE '[' + @trimchars + ']%' SET @str = SUBSTRING(@str, PATINDEX('%[^' + @trimchars + ']%', @str), LEN(@str))
    RETURN @str
END
Trim Right
CREATE FUNCTION dbo.RTrimX(@str VARCHAR(MAX)) RETURNS VARCHAR(MAX)
AS
BEGIN
    DECLARE @trimchars VARCHAR(10)
    SET @trimchars = CHAR(9)+CHAR(10)+CHAR(13)+CHAR(32)
    IF @str LIKE '%[' + @trimchars + ']'
        SET @str = REVERSE(dbo.LTrimX(REVERSE(@str)))
    RETURN @str
END
Trim both Left and Right
CREATE FUNCTION dbo.TrimX(@str VARCHAR(MAX)) RETURNS VARCHAR(MAX)
AS
BEGIN
    RETURN dbo.LTrimX(dbo.RTrimX(@str))
END
Using function
SELECT dbo.TRIMX(@MyString)
If you do use these functions you might also consider changing from varchar to nvarchar to support more encodings.

In SQL Server 2017 you can use the TRIM function to remove specific characters from beginning and end, in one go:
WITH testdata(str) AS (
SELECT CHAR(13) + CHAR(10) + ' test ' + CHAR(13) + CHAR(10)
)
SELECT
str,
TRIM(CHAR(13) + CHAR(10) + CHAR(9) + ' ' FROM str) AS [trim cr/lf/tab/space],
TRIM(CHAR(13) + CHAR(10) FROM str) AS [trim cr/lf],
TRIM(' ' FROM str) AS [trim space]
FROM testdata
Note that the last example (trim space) does nothing, as expected, since the spaces are in the middle of the string.

Here's an example you may run:
I decided to cast the results as an Xml value, so when you click on it, you will be able to view the Carriage Returns.
DECLARE @CRLF Char(2) = (CHAR(0x0D) + CHAR(0x0A))
DECLARE @String VarChar(MAX) = @CRLF + @CRLF + ' Hello' + @CRLF + 'World ' + @CRLF + @CRLF
--Unmodified String:
SELECT CAST(@String as Xml)[Unmodified]
--Each CR/LF pair is replaced by two spaces below so that character positions are preserved.
--Remove Trailing Whitespace (including Spaces).
SELECT CAST(LEFT(@String, LEN(REPLACE(@String, @CRLF, '  '))) as Xml)[RemoveTrailingWhitespace]
--Remove Leading Whitespace (including Spaces).
SELECT CAST(RIGHT(@String, LEN(REVERSE(REPLACE(@String, @CRLF, '  ')))) as Xml)[RemoveLeadingWhitespace]
--Remove Leading & Trailing Whitespace (including Spaces).
SELECT CAST(SUBSTRING(@String, LEN(REPLACE(@String, ' ', '_')) - LEN(REVERSE(REPLACE(@String, @CRLF, '  '))) + 1, LEN(LTRIM(RTRIM(REPLACE(@String, @CRLF, '  '))))) as Xml)[RemoveAllWhitespace]
--Remove Only Leading and Trailing CR/LF's (while still preserving all other Whitespace - including Spaces). - 04/06/2016 - MCR.
SELECT CAST(SUBSTRING(@String, PATINDEX('%[^'+CHAR(13)+CHAR(10)+']%',@String), LEN(REPLACE(@String, ' ', '_')) - PATINDEX('%[^'+CHAR(13)+CHAR(10)+']%',@String) + 1 - PATINDEX('%[^'+CHAR(13)+CHAR(10)+']%', REVERSE(@String)) + 1) as Xml)[RemoveLeadingAndTrailingCRLFsOnly]
Remember to remove the Cast-to-Xml, as this was done just as a Proof-of-Concept to show it works.
How is this better than the currently Accepted Answer?
At first glance this may appear to use more Functions than the Accepted Answer.
However, this is not the case.
If you combine both approaches listed in the Accepted Answer (to remove both Trailing and Leading whitespace), you will either have to make two passes updating the Record, or copy all of one Logic into the other (everywhere @String is listed), which would cause way more function calls and become even more difficult to read.

I was stuck using Microsoft SQL Server 2008 R2, so, basing my functions on @sqluser's answer, I came up with the below. This will return an empty string if the string only contains the characters to be trimmed.
The bit that threw me was the pattern for PATINDEX must be included between % characters, which for a while I was thinking of as the same wildcard in a LIKE statement but which I now believe is just the syntax to denote a pattern, though I may be wrong!
CREATE FUNCTION [dbo].[ExtendedLTRIM](@string_to_trim VARCHAR(MAX))
RETURNS VARCHAR(MAX)
AS
BEGIN
    DECLARE @tab CHAR(1) = CHAR(9);
    DECLARE @line_feed CHAR(1) = CHAR(10);
    DECLARE @carriage_return CHAR(1) = CHAR(13);
    DECLARE @space CHAR(1) = CHAR(32);
    DECLARE @characters_to_trim VARCHAR(10)
    SET @characters_to_trim = @tab + @line_feed + @carriage_return + @space
    IF @string_to_trim LIKE '[' + @characters_to_trim + ']%'
    BEGIN
        DECLARE @first_non_trim_character INT = PATINDEX('%[^' + @characters_to_trim + ']%', @string_to_trim);
        IF @first_non_trim_character = 0 RETURN '';
        RETURN SUBSTRING(@string_to_trim, @first_non_trim_character, 8000)
    END
    RETURN @string_to_trim
END
GO

To trim characters from a pre-defined list you'll want to create the following UDF (should work in 2008R2 and above).
Handles both sides in a single pass and doesn't care if it's a CRLF, LFCR (yep, seen that abomination more than once), bare LF or a bunch of spaces.
It is easy to extend, e.g. by adding parameters to do LTRIM/RTRIM only, or a full purge (that last bit is simpler to do in 2017 by incorporating STRING_AGG, but perfectly doable in 2008R2); as a matter of fact, this is a simplified version of something I use to do all those things. If anybody is interested then let me know and I'll update:
CREATE FUNCTION fnTrimHarder
(
    @String VARCHAR(MAX)
)
RETURNS VARCHAR(MAX)
AS
BEGIN
    DECLARE
        @Start INT,
        @Len INT,
        @Chars CHAR(5) = CONCAT(
            CHAR(9), -- TAB
            CHAR(10), -- LF
            CHAR(13), -- CR
            ' '
        ), -- List of invalid characters
        @Return VARCHAR(MAX) = '';
    IF @String NOT LIKE '%[^' + @Chars + ']%' -- If string contains only invalid characters
        OR COALESCE(@String, '') = '' -- Optional addition for NULL handling
        RETURN @Return
    ELSE
    BEGIN -- Create a "table" of characters with ordinals, calculate the start of string and its length, then return the substring
        WITH CTE AS (
            SELECT 1 AS n
            UNION ALL
            SELECT n + 1
            FROM CTE
            WHERE n < LEN(@String)
        )
        SELECT
            @Start = MIN(n),
            @Len = 1 + MAX(n) - MIN(n)
        FROM CTE
        WHERE SUBSTRING(@String, n, 1) NOT LIKE '[' + @Chars + ']';
        SET @Return = SUBSTRING(@String, @Start, @Len)
    END
    RETURN @Return
END
GO

SQL Server TRIM character

I have the following string: 'BOB*'. How do I trim the * so it shows up as 'BOB'?
I tried RTRIM('BOB*','*'), but it does not work; the error says RTRIM takes only 1 parameter.

Another pretty good way to implement Oracle's TRIM char FROM string in MS SQL Server is the following:
First, you need to identify a char that will never be used in your string, for example ~
You replace all spaces with that character
You replace the character * you want to trim with a space
You LTrim + RTrim the obtained string
You replace back all spaces with the trimmed character *
You replace back all never-used characters with a space
For example:
REPLACE(REPLACE(LTrim(RTrim(REPLACE(REPLACE(string,' ','~'),'*',' '))),' ','*'),'~',' ')
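To see the steps above on a value with both inner spaces and inner asterisks, here is a small sketch. The variable @s and its value are made up, and ~ is assumed never to occur in the data:

```sql
-- Hypothetical sample; assumes '~' never appears in the real data.
DECLARE @s varchar(50) = '**B O*B***';

SELECT REPLACE(                              -- 5. placeholder '~' back to a space
           REPLACE(                          -- 4. remaining spaces back to '*'
               LTRIM(RTRIM(                  -- 3. trim what used to be outer '*'
                   REPLACE(                  -- 2. '*' becomes a space
                       REPLACE(@s, ' ', '~') -- 1. protect real spaces with '~'
                   , '*', ' ')
               ))
           , ' ', '*')
       , '~', ' ') AS Trimmed;
```

The result is 'B O*B': the outer asterisks are gone while the inner space and inner asterisk survive.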

CREATE FUNCTION dbo.TrimCharacter
(
    @Value NVARCHAR(4000),
    @CharacterToTrim NVARCHAR(1)
)
RETURNS NVARCHAR(4000)
AS
BEGIN
    SET @Value = LTRIM(RTRIM(@Value))
    SET @Value = REVERSE(SUBSTRING(@Value, PATINDEX('%[^'+@CharacterToTrim+']%', @Value), LEN(@Value)))
    SET @Value = REVERSE(SUBSTRING(@Value, PATINDEX('%[^'+@CharacterToTrim+']%', @Value), LEN(@Value)))
    RETURN @Value
END
GO
--- Example
----- SELECT dbo.TrimCharacter('***BOB*********', '*')
----- returns 'BOB'

If you want to remove all asterisks then it's obvious:
SELECT REPLACE('Hello*', '*', '')
However, if you have more than one asterisk at the end and multiple throughout, but are only interested in trimming the trailing ones, then I'd use this:
DECLARE @String VarChar(50) = '**H*i****'
SELECT LEFT(@String, LEN(REPLACE(@String, '*', ' '))) --Returns: **H*i
I updated this answer to show how to remove leading characters:
SELECT RIGHT(@String, LEN(REPLACE(REVERSE(@String), '*', ' '))) --Returns: H*i****
LEN() has a "feature" (that looks a lot like a bug) where it does not count trailing spaces.
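That LEN() behaviour is easy to confirm by comparing it with DATALENGTH(), which counts bytes and does include trailing spaces. The literal below is just an illustrative value:

```sql
-- 'Hi' followed by three trailing spaces (varchar, one byte per character)
SELECT LEN('Hi   ')        AS LenValue,        -- 2: trailing spaces are not counted
       DATALENGTH('Hi   ') AS DataLengthValue; -- 5: every byte is counted
```

This is why replacing the trailing asterisks with spaces and then calling LEN() yields the length of the string without them.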

LEFT('BOB*', LEN('BOB*')-1)
should do it.

If you wanted behavior similar to how RTRIM handles spaces i.e. that "B*O*B**" would turn into "B*O*B" without losing the embedded ones then something like -
REVERSE(SUBSTRING(REVERSE('B*O*B**'), PATINDEX('%[^*]%',REVERSE('B*O*B**')), LEN('B*O*B**') - PATINDEX('%[^*]%', REVERSE('B*O*B**')) + 1))
Should do it.

If you only want to remove a single '*' character from the value when the value ends with a '*', a simple CASE expression will do that for you:
SELECT CASE WHEN RIGHT(foo,1) = '*' THEN LEFT(foo,LEN(foo)-1) ELSE foo END AS foo
FROM (SELECT 'BOB*' AS foo) AS t
To remove all trailing '*' characters, then you'd need a more complex expression, making use of the REVERSE, PATINDEX, LEN and LEFT functions.
NOTE: Be careful with the REPLACE function, as that will replace all occurrences of the specified character within the string, not just the trailing ones.
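One possible shape for that more complex expression is sketched below. The variable @foo and its value are made up, and the sketch assumes the value has no trailing spaces (LEN would not count them):

```sql
-- Hypothetical sample with embedded and trailing asterisks.
DECLARE @foo varchar(50) = 'B*O*B***';

SELECT CASE
           -- No character other than '*' at all: everything is trimmed away
           WHEN PATINDEX('%[^*]%', REVERSE(@foo)) = 0 THEN ''
           -- In the reversed string, the first non-'*' marks the last character to keep
           ELSE LEFT(@foo, LEN(@foo) - PATINDEX('%[^*]%', REVERSE(@foo)) + 1)
       END AS Trimmed;
```

For this sample the result is 'B*O*B': the embedded asterisks are kept and only the trailing ones are removed.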

How about.. (in this case to trim off trailing comma or period)
For a variable:
-- Trim commas and full stops from end of City
WHILE RIGHT(@CITY, 1) IN (',', '.')
    SET @CITY = LEFT(@CITY, LEN(@CITY)-1)
For table values:
-- Trim commas and full stops from end of City
WHILE EXISTS (SELECT 1 FROM [sap_out_address] WHERE RIGHT([CITY], 1) IN (',', '.'))
UPDATE [sap_out_address]
SET [CITY] = LEFT([CITY], LEN([CITY])-1)
WHERE RIGHT([CITY], 1) IN (',', '.')

Another approach, ONLY if you want to remove leading and trailing characters, is to use the TRIM function.
By default it removes white space, but it can remove other characters if you specify them.
SELECT TRIM('=' FROM '=SPECIALS=') AS Result;
Result
--------
SPECIALS
Unfortunately, LTRIM and RTRIM do not work the same way: they only remove white space and cannot remove specified characters the way TRIM does.
Reference and more examples:
https://database.guide/how-to-remove-leading-and-trailing-characters-in-sql-server/

RTRIM() and LTRIM() only remove spaces. Try http://msdn.microsoft.com/en-us/library/ms186862.aspx
Basically, just replace the * with an empty string:
REPLACE('TextWithCharacterToReplace','CharacterToReplace','CharacterToReplaceWith')
So you want
REPLACE ('BOB*','*','')

I really like Teejay's answer, and almost stopped there. It's clever, but I got the "almost too clever" feeling, as, somehow, your string at some point will actually have a ~ (or whatever) in it on purpose. So that's not defensive enough for me to put into production.
I like Chris' too, but the PATINDEX call seems like overkill.
Though it's probably a micro-optimization, here's one without PATINDEX:
CREATE FUNCTION dbo.TRIMMIT(@stringToTrim NVARCHAR(MAX), @charToTrim NCHAR(1))
RETURNS NVARCHAR(MAX)
AS
BEGIN
    DECLARE @retVal NVARCHAR(MAX)
    SET @retVal = @stringToTrim
    WHILE 1 = charindex(@charToTrim, reverse(@retVal))
        SET @retVal = SUBSTRING(@retVal,0,LEN(@retVal))
    WHILE 1 = charindex(@charToTrim, @retVal)
        SET @retVal = SUBSTRING(@retVal,2,LEN(@retVal))
    RETURN @retVal
END
--select dbo.TRIMMIT('\\trim\asdfds\\\', '\')
--trim\asdfds
Returning a MAX nvarchar bugs me a little, but that's the most flexible way to do this..

I've used a similar approach to some of the above answers of using pattern matching and reversing the string to find the first non-trimmable character, then cutting that off. The difference is this version does less work than those above, so should be a little more efficient.
This creates RTRIM functionality for any specified character.
It includes an additional step, set @charToFind = case..., to escape the chosen character.
There is currently an issue if @charToFind is a right square bracket (]), as there appears to be no way to escape this.
declare @stringToSearch nvarchar(max) = '****this is****a ** demo*****'
, @charToFind nvarchar(5) = '*'
--escape @charToFind so it doesn't break our pattern matching
set @charToFind = case @charToFind
    when ']' then '[]]' --*this does not work / can't find any info on escaping right crotchet*
    when '^' then '\^'
    --when '%' then '%' --doesn't require escaping in this context
    --when '[' then '[' --doesn't require escaping in this context
    --when '_' then '_' --doesn't require escaping in this context
    else @charToFind
end
select @stringToSearch
, left
(
    @stringToSearch
    ,1
    + len(@stringToSearch)
    - patindex('%[^' + @charToFind + ']%',reverse(@stringToSearch))
)

SQL Server 2017 has a new way to do it: https://learn.microsoft.com/en-us/sql/t-sql/functions/trim-transact-sql?view=sql-server-2017
SELECT TRIM('0' FROM '00001900'); -- 19
SELECT TRIM( '.,! ' FROM '# test .'); -- # test
SELECT TRIM('*' FROM 'BOB*'); -- BOB
Unfortunately, RTRIM does not support trimming a specific character.

SELECT REPLACE('BOB*', '*', '')
SELECT REPLACE('B*OB*', '*', '')
-------------------------------------
Result : BOB
-------------------------------------
This removes every asterisk (*) from the text.

Trim with many cases
--id = 100 101 102 103 104 105 106 107 108 109 110 111
select right(id,2)+1 from ordertbl -- 1 2 3 4 5 6 7 8 9 10 11 12 -- last two positions are taken
select LEFT('BOB', LEN('BOB')-1) -- BO
select LEFT('BOB*',1) --B
select LEFT('BOB*',2) --BO

Try this:
Original
select replace('BOB*','*','')
Fixed to be an exact replacement
select replace('BOB*','BOB*','BOB')

Solution for one char parameter:
rtrim('0000100','0') ->
select left('0000100',len(rtrim(replace('0000100','0',' '))))
ltrim('0000100','0') ->
select right('0000100',len(replace(ltrim(replace('0000100','0',' ')),' ','.')))

@teejay's solution is great, but the code below may be easier to follow:
declare @X nvarchar(max)='BOB *'
set @X=replace(@X,' ','^')
set @X=replace(@X,'*',' ')
set @X= ltrim(rtrim(@X))
set @X=replace(@X,' ','*') -- restore any asterisks that were not at the ends
set @X=replace(@X,'^',' ')

Here's a function I used in the past. Note that while you can make it more general purpose by having extra parameters like the character(s) you wish to remove and what you will be replacing the space character(s) with, this greatly increases execution time. Here, I used a pipe to replace spaces AFTER pre-trimming the input. Change varchar to nvarchar if required.
CREATE FUNCTION [dbo].[TrimColons]
(
    @strToTrim varchar(500)
)
RETURNS varchar(500)
AS
BEGIN
    RETURN REPLACE(REPLACE(LTRIM(RTRIM(REPLACE(REPLACE(LTRIM(RTRIM(@strToTrim)),' ','|'),':',' '))),' ',':'),'|',' ')
    /*
    Here's a breakdown of this fancy, schmancy, trimmer
    LTRIM(RTRIM(@strToTrim)) trims leading & trailing spaces first
    REPLACE(LTRIM(RTRIM(@strToTrim)),' ','|') replaces inside spaces with pipe char
    REPLACE(REPLACE(LTRIM(RTRIM(@strToTrim)),' ','|'),':',' ') replaces demarc character, the colon, with spaces
    LTRIM(RTRIM(REPLACE(REPLACE(LTRIM(RTRIM(@strToTrim)),' ','|'),':',' '))) trims the leading & trailing converted-to-space demarc char (colon)
    REPLACE(LTRIM(RTRIM(REPLACE(REPLACE(LTRIM(RTRIM(@strToTrim)),' ','|'),':',' '))),' ',':') replaces the inner space characters back to demarc char (colon)
    REPLACE(REPLACE(LTRIM(RTRIM(REPLACE(REPLACE(LTRIM(RTRIM(@strToTrim)),' ','|'),':',' '))),' ',':'),'|',' ') replaces the pipe characters back to original space characters
    */
END

DECLARE @String VarChar(50) = '**H*i****', @String2 VarChar(50)
--Assign to new variable @String2
;WITH X AS (
    SELECT LEFT(@String, LEN(REPLACE(@String, '*', ' '))) [V1]
)
SELECT TOP 1 @String2 = RIGHT(V1, LEN(REPLACE(REVERSE(V1), '*', ' '))) FROM X
SELECT @String [@String], @String2 [@String2]
--See the intermediate values: V0 is the original, V1 has the end trimmed, and V2 has V1's leading characters trimmed
;WITH X AS (
    SELECT @String V0, LEFT(@String, LEN(REPLACE(@String, '*', ' '))) [V1]
)
SELECT [V0], [V1], RIGHT([V1], LEN(REPLACE(REVERSE([V1]), '*', ' '))) [v2] FROM X
