SQL Replace multiple different characters in string - sql

I need to replace multiple characters in a string. The result can't contain any '&' or any commas.
I currently have:
REPLACE(T2.[ShipToCode],'&','and')
Which converts & to and, but how do you put multiple values in one line?

You just need to daisy-chain them:
REPLACE(REPLACE(T2.[ShipToCode], '&', 'and'), ',', '')
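To try the chained form without a SQL Server instance, here is a minimal sketch that runs the same nested expression through Python's built-in sqlite3 module. SQLite's replace() takes the same three arguments as the call above; the helper name clean_ship_to_code and the sample value are made up for illustration.

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in for SQL Server.
conn = sqlite3.connect(":memory:")


def clean_ship_to_code(code):
    """Run the nested REPLACE expression against a single value."""
    # Inner call turns '&' into 'and'; outer call removes commas.
    row = conn.execute(
        "SELECT replace(replace(?, '&', 'and'), ',', '')", (code,)
    ).fetchone()
    return row[0]


print(clean_ship_to_code("Smith & Sons, Ltd"))  # Smith and Sons Ltd
```

Each extra character to handle adds one more replace() layer around the expression.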

One comment mentions "dozens of replace calls"... if removing dozens of single characters, you could also use Translate and a single Replace.
REPLACE(TRANSLATE(T2.[ShipToCode], '[];'',$#', '#######'), '#', '')
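The two steps in that expression can be modelled outside the database. This is a rough Python sketch of the idea, not T-SQL: remove_chars is a hypothetical helper, and the sentinel plays the role of the '#' in the expression above.

```python
def remove_chars(text, unwanted, sentinel="#"):
    """Model of REPLACE(TRANSLATE(text, unwanted, sentinels), sentinel, '')."""
    # Step 1 (TRANSLATE): map every unwanted character to the sentinel.
    # Both arguments must have the same length, hence the repetition.
    table = str.maketrans(unwanted, sentinel * len(unwanted))
    translated = text.translate(table)
    # Step 2 (REPLACE): strip the sentinel out again.
    return translated.replace(sentinel, "")


print(remove_chars("A[1];B'2,$C#", "[];',$#"))  # A1B2C
```

Note that any sentinel character already present in the data is removed as well, so pick one that you want gone anyway.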

We used a function to do something similar that looped through the string, though this was mostly to remove characters that were not in the "@ValidCharacters" string. That was useful for removing anything we didn't want: usually non-alphanumeric characters, though I think we also had space, quote, single quote and a handful of others in that string. It was really used to remove the non-printing characters that tended to sneak in at times, so it may not be perfect for your case, but it may give you some ideas.
CREATE FUNCTION [dbo].[ufn_RemoveInvalidCharacters]
(@str VARCHAR(8000), @ValidCharacters VARCHAR(8000))
RETURNS VARCHAR(8000)
AS
BEGIN
    -- While any character outside @ValidCharacters remains, find the first one
    -- and remove every occurrence of it from the string.
    WHILE PATINDEX('%[^' + @ValidCharacters + ']%', @str) > 0
        SET @str = REPLACE(@str, SUBSTRING(@str, PATINDEX('%[^' + @ValidCharacters + ']%', @str), 1), '')
    RETURN @str
END
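For anyone who wants to see the loop's logic in isolation, here is a small Python sketch of the same whitelist idea. It is only an approximation: the function and argument names are my own, and it compares characters exactly, whereas PATINDEX follows the column's collation.

```python
def remove_invalid_characters(text, valid_characters):
    """Drop every character of text that is not listed in valid_characters."""
    allowed = set(valid_characters)
    while True:
        # Find the first character that is not on the whitelist.
        bad = next((ch for ch in text if ch not in allowed), None)
        if bad is None:
            return text
        # Remove all occurrences of it, then look again.
        text = text.replace(bad, "")


VALID = "abcdefghijklmnopqrstuvwxyz0123456789 "
print(remove_invalid_characters("abc\x00 12\t3!", VALID))  # abc 123
```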

If you need fine control, it helps to indent the nested REPLACE() calls for readability.
SELECT Title,
    REPLACE(
    REPLACE(
    REPLACE(
    REPLACE(
    REPLACE(
    REPLACE(
    REPLACE(
    REPLACE(RTRIM(Title),
        ' & ', ''),
        '++', ''),
        '/', '-'),
        '(', ''),
        ')', ''),
        '.', ''),
        ',', ''),
        ' ', '-')
    AS Title_SEO
FROM TitleTable

If you use SQL Server 2017 or later, you can use the TRANSLATE function.
TRANSLATE(ShipToCode, '|+,-', '____')
In this example the pipe, plus, comma and minus are all replaced by an underscore.
You can also map each character to its own replacement.
So in the next example the plus and minus are replaced by a hash.
TRANSLATE(ShipToCode, '|+,-', '_#_#')
Just make sure the number of characters is the same in both groups.
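The character-by-character mapping and the equal-length rule can be illustrated with Python's str.translate, which behaves similarly for this purpose. This is a sketch only; translate_chars is a hypothetical helper, not part of SQL Server.

```python
def translate_chars(text, source, target):
    """Replace the i-th character of source with the i-th character of target."""
    # Same rule as above: both groups need the same number of characters.
    if len(source) != len(target):
        raise ValueError("source and target must have the same length")
    return text.translate(str.maketrans(source, target))


print(translate_chars("A|B+C,D-E", "|+,-", "____"))  # A_B_C_D_E
print(translate_chars("A|B+C,D-E", "|+,-", "_#_#"))  # A_B#C_D#E
```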

Hope this helps someone.
If you want to replace multiple words or characters in a string with an empty string (i.e. remove them), use regexp_replace() instead of multiple replace() calls.
SELECT REGEXP_REPLACE("Hello world!123SQL$##$", "[^\w+ ]", "")
The above query will return Hello world123SQL
The same process will be applied if you want to remove multiple words from the string.
If you want to remove Hello and World from the string Hello World SQL, then you can use this query.
SELECT REGEXP_REPLACE("Hello World SQL", "(Hello|World)", "")
This will return SQL, with the spaces that surrounded the removed words left in place, so trim the result if they matter.
This way the query stays compact and you don't have to manage multiple replace() calls.
Conclusion
If you want to replace the words with an empty string, go with REGEXP_REPLACE().
If you want to replace the words with other words, for example replacing & with and, use replace(). If there are multiple words to be replaced, nest multiple replace() calls.
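The two patterns above can be checked quickly with Python's re module, which treats this character class and alternation the same way. This is a sketch under that assumption; the helper names are invented, and the whitespace clean-up in remove_words is my addition rather than something REGEXP_REPLACE does for you.

```python
import re


def remove_unwanted_characters(text):
    # Keep word characters, "+" and spaces; drop everything else.
    return re.sub(r"[^\w+ ]", "", text)


def remove_words(text, words):
    # Build an alternation such as (Hello|World); re.escape guards against
    # words that contain regex metacharacters.
    pattern = "(" + "|".join(re.escape(w) for w in words) + ")"
    stripped = re.sub(pattern, "", text)
    # Removing words leaves their surrounding spaces behind, so collapse them.
    return " ".join(stripped.split())


print(remove_unwanted_characters("Hello world!123SQL$##$"))  # Hello world123SQL
print(remove_words("Hello World SQL", ["Hello", "World"]))   # SQL
```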

Related

split text if contains certain string postgres

exit_reason
sr_inefficient_management
tech_too_complex
company_member_resignation
sr_product_engagement
sr_contractual_reasons
sr_contractual_reasons-expectation_issues
sr_churn-takeover_business
I would like to split the column if the value contains the string "sr_" and keep the rest as it is. If the column contains "-" such as "sr_contractual_reasons-expectation_issues", I only want to keep it as "contractual reasons".
So far, my idea is to use
case when exit_reason like '%inefficient_management%' then 'inefficient management'
but if there are many different values, I am in trouble.
Expected output
exit_reason column
tech too complex
company member resignation
product engagement
contractual reasons
contractual reasons
churn
You can just replace 'sr_'
replace(exit_reason, 'sr_', '')
It is unlikely that 'sr_' would appear anywhere other than at the start of a reason, but you can anchor the match with regexp_replace() to be sure:
regexp_replace(exit_reason, '^sr_', '')
You can try something like this:
REPLACE(
    split_part(
        CASE
            WHEN exit_reason LIKE 'sr\_%'
                THEN substr(exit_reason, 4)
            ELSE exit_reason
        END,
        '-', 1),
    '_', ' '
)
This code first strips a leading 'sr_' from 'exit_reason' (the backslash stops LIKE from treating the underscore as a wildcard), then keeps only the part before the first hyphen, and finally replaces all underscores with blanks.
To also remove the suffix, you could use:
SELECT replace(
regexp_replace(
'sr_contractual_reasons-expectation_issues',
'^(sr_)?([^-]*).*$',
'\2'
),
'_',
' '
);
replace
═════════════════════
contractual reasons
(1 row)
The regular expression matches an optional leading sr_, then all characters until the first -, then anything that follows that, and keeps only the middle part. replace then replaces underscores with spaces.
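To check the pattern against all the sample values from the question, here is a small Python sketch. Python's re module handles this particular pattern the same way, but the function name and the test harness are my own, not part of the answer above.

```python
import re

# Optional leading "sr_", then everything up to the first "-", then the rest.
PATTERN = re.compile(r"^(sr_)?([^-]*).*$")


def clean_exit_reason(value):
    # Keep only the middle group, then turn underscores into spaces.
    return PATTERN.sub(r"\2", value).replace("_", " ")


samples = [
    "sr_inefficient_management",
    "tech_too_complex",
    "company_member_resignation",
    "sr_product_engagement",
    "sr_contractual_reasons",
    "sr_contractual_reasons-expectation_issues",
    "sr_churn-takeover_business",
]
for value in samples:
    print(value, "->", clean_exit_reason(value))
```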

TRIM doesn't remove inner whitespaces

I am trying to remove all whitespace inside a string. For this, I use the TRIM() function. Unfortunately it doesn't work as expected: the inner whitespace (between 35 and 'A') remains untouched:
select TRIM('Hopkins 35 A Street') as Street
Column type is nvarchar. The funny thing is that this function works fine (using example from above) when executed on W3Schools (TRIM function example): https://www.w3schools.com/sql/func_sqlserver_trim.asp.
I can use replace on this string and replace ' ' into '' without a problem. I work on SQL Server 18.7.1 (2020)
If you use TRIM like this, you only remove leading and trailing spaces from the string. Naming the character explicitly does not change that; it still only trims the ends:
select TRIM(' ' FROM 'Hopkins 35 A Street') as Street
UPDATE: if you meant to remove all spaces you should use
SELECT REPLACE('Hopkins 35 A Street', ' ', '')
TRIM never touches spaces inside the string, and it does not collapse a double space into a single one either.
"Trimming" means the removal of whitespace from the start and/or the end of a string value. It never means (and has never meant) the removal of whitespace within a string value (enclosed by non-whitespace characters).
You can indeed use the TRIM function with a FROM in its argument to specify other characters than whitespace to trim. In that case, the TRIM function will remove the specified characters from the start and the end of the string, but not within the string (enclosed by other characters).
In other words: the specified characters will be treated as if they were whitespace as well, but specifying them so will not affect the trimming behavior/algorithm itself.
Check out the sample on Microsoft Docs:
SELECT TRIM( '.,! ' FROM ' # test .') AS Result;
produces this result: # test
The TRIM function only removes leading and trailing spaces. It cannot remove spaces inside the data: given 'Hello World', TRIM cannot remove the space between Hello and World to produce 'HelloWorld'. If you want to remove all the spaces, use the REPLACE function. With REPLACE you can replace the space with any character, number or symbol, or simply remove it by replacing it with '', like this:
SELECT REPLACE('Hopkins 35 A Street', ' ', '')
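The difference is easy to see side by side. The sketch below runs the comparison through Python's sqlite3 module as a stand-in for SQL Server; SQLite spells the character-list form as trim(string, characters) instead of TRIM(characters FROM string), and the sample values are only illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
street = "  Hopkins 35 A Street  "

trimmed, no_spaces, custom = conn.execute(
    "SELECT trim(?), replace(?, ' ', ''), trim(' # test .', '.,! ')",
    (street, street),
).fetchone()

print(repr(trimmed))    # 'Hopkins 35 A Street'  (only the ends change)
print(repr(no_spaces))  # 'Hopkins35AStreet'     (every space removed)
print(repr(custom))     # '# test'               (listed characters trimmed from the ends)
```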

Postgresql regex_replace comma, single and double quotes in a single

I have a string which consists of double quotes, single quotes and commas. I would like to replace all the occurrences of them using regex_replace.
Tried
REGEXP_REPLACE(translate (links, '"',''), '['''''',]' , '')
It replaces only the first occurrence of the comma, not the second one.
'https://google.com/khjdbgksdngksd#/","https://google.com/khjdbgksdngksd#/","'
Why are you mixing TRANSLATE and REGEXP_REPLACE? Just pick one and use it, as either one can do all that you want.
If you want REGEXP_REPLACE to replace all instances, you have to give it a fourth argument (the flag argument) of 'g' for 'global', otherwise it stops after the first match and substitution.
Also, to preserve sanity I would use dollar-quoting when the thing being quoted has single quote marks (which yours has in considerable excess).
Using TRANSLATE is probably a better tool for the job, but your title was specifically about REGEXP_REPLACE, so:
REGEXP_REPLACE(links, $$[',"]$$, '', 'g');
Why not just use replace()?
select replace(replace(replace(links, '"', ''), '''', ''), ',', '')
Or more simply, use translate():
select translate(links, '"'',', '')
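To see why the missing 'g' flag matters, here is a Python sketch of the same three behaviours. It models the semantics only: count=1 stands in for REGEXP_REPLACE without 'g', and the sample URLs are placeholders, not the ones from the question.

```python
import re

links = 'https://a.example/x","https://b.example/y","'
PATTERN = "[',\"]"   # single quote, comma or double quote
QUOTE_CHARS = "\"',"

# Without the global flag only the first match is replaced.
first_only = re.sub(PATTERN, "", links, count=1)
# With it (Python's default), every match is replaced.
all_regex = re.sub(PATTERN, "", links)
# translate() with an empty target set deletes the listed characters.
all_translate = links.translate(str.maketrans("", "", QUOTE_CHARS))

print(first_only)
print(all_regex)
print(all_translate)
```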

Using RTRIM or REGEXP_REPLACE to replace a comma with a comma space and single quote

I'm attempting to learn Oracle regexp_replace well enough to take a value stored in a table as a comma-separated string and change the comma character with a single quote followed by a comma followed by a space, followed by a single quote.
For instance, the field (CourseListT) contains course codes that look like this:
PEOE100,H003,H102,L001,L100,L110,M005,M020,M130
I want it to look like this:
'PEOE100', 'H003', 'H102', 'L001', 'L100', 'L110', 'M005', 'M020', 'M130'
I started with baby steps and found article #25997057 here that showed me how to insert spaces. So I have this working:
SELECT
regexp_replace(gr.CourseListT,'([a-zA-Z0-9_]+)(,?)',' \1\2')
FROM gradreq gr
WHERE gr.gradreqsetid = 326
AND gr.SubjectArea = 'Electives'
But nothing I do will allow me to insert those silly single quote marks.
Would it be better to learn RTRIM replace? Could somebody please help me learn how to accomplish this?
Thank you
Schelly
You can simply do it with replace. Use double single-quotes to escape a single-quote.
select '''' || replace(CourseListT, ',', ''', ''') || ''''
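The doubled quotes are hard to read, so here is a way to check the expression without an Oracle instance. This sketch runs it through Python's sqlite3 module, which shares the || operator, replace() and the doubled-quote escaping; quote_course_list is a made-up helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def quote_course_list(course_list):
    """Wrap each comma-separated code in single quotes, separated by ', '."""
    # ''''      is a string holding one single quote
    # ''', '''  is the three-character string  ', '  plus its delimiters
    sql = "SELECT '''' || replace(?, ',', ''', ''') || ''''"
    return conn.execute(sql, (course_list,)).fetchone()[0]


print(quote_course_list("PEOE100,H003,H102"))  # 'PEOE100', 'H003', 'H102'
```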
from gradreq

Regular expressions in SQL

I'm curious if and how you can use regular expressions to find whitespace in SQL statements.
I have a string that can have an unlimited amount of white space after the actual string.
For example:
"STRING "
"STRING "
would match, but
"STRING A"
"STRINGB"
would not.
Right now I have:
like 'STRING%'
which doesn't quite return the results I would like.
I am using Sql Server 2008.
A simple like can find any string with spaces at the end:
where col1 like '% '
To also allow tabs, carriage returns or line feeds:
where col1 like '%[ ' + char(9) + char(10) + char(13) + ']'
Per your comment, to find "string" followed by any number of whitespace:
where rtrim(col1) = 'string'
You could try
where len(col1) <> len(rtrim(col1))
Andomar's answer will find the strings for you, but my spidey sense tells me maybe the scope of the problem is bigger than simply finding the whitespace.
If, as I suspect, you are finding the whitespace so that you can then clean it up, a simple
UPDATE Table1
SET col1 = RTRIM(col1)
will remove any trailing whitespace from the column.
Or RTRIM(LTRIM(col1)) to remove both leading and trailing whitespace.
Or REPLACE(col1, ' ', '') to remove all spaces, including those within the string.
Note that RTRIM and LTRIM only work on spaces, so to remove tabs/CRs/LFs you would have to use REPLACE. To remove those only from the leading/trailing portion of the string is feasible but not entirely simple. Bug your database vendor to implement the ANSI SQL 99 standard TRIM function that would make this much easier.
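Because LEN() ignores trailing spaces in SQL Server, append a sentinel character before comparing lengths:
where len(col1 + 'x') <> len(rtrim(col1)) + 1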
BOL provides workarounds for LEN() with trailing spaces : http://msdn.microsoft.com/en-us/library/ms190329.aspx
LEN(Column + '_') - 1
or using DATALENGTH
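The matching rule from the question ("STRING" followed by any number of spaces, and nothing else) can be written down as a quick Python sketch to test candidate values. This models the rtrim comparison only; the function names are invented, and Python's len() counts trailing spaces, unlike SQL Server's LEN().

```python
def is_padded_match(value, target):
    """True if value equals target once trailing spaces are removed."""
    return value.rstrip(" ") == target


def has_trailing_whitespace(value):
    """True if value ends in a space, tab, carriage return or line feed."""
    return value != value.rstrip(" \t\r\n")


for candidate in ["STRING ", "STRING    ", "STRING A", "STRINGB"]:
    print(repr(candidate), is_padded_match(candidate, "STRING"))
```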