Oracle regexp_replace: Don't replace the first character, replace others

I am writing a SQL query that uses the regexp_replace function.
My aim is to replace characters like '|' and '\' with '-'.
The problem I am facing is that it also replaces the '+' at the beginning.
For example, the phone number is: +49 |0| 941 78878544
I have to replace the '|' with '-'.
My code is this:
SELECT regexp_replace(phone,'\D','-') FROM PHONE_TBL WHERE EMPLID = employee;
I get the output: -49--0--941-78878544
This code replaces the spaces as well as the '+' at the beginning.
I want the '+' to remain if it is at the beginning, and any spaces within the phone number should also remain.
For the '+', I have figured out that I should match the beginning of the string, then check for a non-digit character, and then escape it, but I am not able to code this.
For the spaces in between, I would take a similar approach.
Any help with this is appreciated, thanks.

If you want to replace all instances of '\' and '|' with '-', use the following:
SELECT REGEXP_REPLACE(phone,'[\|\]','-')
FROM phone_tbl
WHERE emplid = employee;
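The square brackets define a set of characters to match. Inside the brackets, Oracle treats both the backslash and the pipe as literal characters, so the set matches a backslash or a pipe and nothing else; the second backslash is redundant but harmless. Because only these two characters are replaced, the leading '+' and the spaces are left alone. To replace more characters than backslash and pipe, add them between the square brackets.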
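To check the character-set idea outside the database, here is a minimal sketch using Python's re module as a stand-in for Oracle's REGEXP_REPLACE (an assumption for illustration; the two regex dialects differ in details, for example Python needs the backslash escaped inside the class). The variable names are made up for this example.

```python
import re

phone = "+49 |0| 941 78878544"

# Character class holding a literal backslash and a literal pipe.
# Python's re requires the backslash to be escaped inside the class.
pattern = re.compile(r"[\\|]")

# Only '|' and '\' are replaced; '+' and spaces are untouched.
result = pattern.sub("-", phone)
print(result)  # +49 -0- 941 78878544
```

Compared with '\D', which matches every non-digit, listing the unwanted characters explicitly is what keeps the '+' and the spaces intact.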

Semicolon in LIKE is not working for SQL server 2017

I have a query like this which does not retrieve the value from the DB table even though the required value exists there.
Here's the query, which returns zero rows:
Select * from SitePanel_FieldValue WHere SiteFieldIdfk =111
And SiteFieldvalue like '%!##$%&*()-_=+{}|:"<>?[]\;'',./%'
Following is the value in the table:
'!##$%&*()-_=+{}|:"<>?[]\;'',./'
When I run the query without the ";" it returns the value.
Can anyone help me figure this out?
Thanks
Ritu
You are using several characters that have a special meaning in a LIKE pattern, i.e. %, _ and [].
Use the ESCAPE clause (here I have used a backtick as the escape character so that the special characters are treated as regular ones), such as
Select * from SitePanel_FieldValue WHere SiteFieldIdfk =111
And SiteFieldvalue like '%!##$`%&*()-`_=+{}|:"<>?`[`]\;'',./%' escape '`'
The value in your table is:
!##$%&*()-_=+{};; :"<>?[]\;'',./
And the one in the like is:
(!##$%&*()-_=+{};;
Because it starts with (, it will never match. You should also escape the percent sign (%) in the middle of the string, like this:
Select *
FROM SitePanel_FieldValue
WHERE SiteFieldIdfk =111
AND SiteFieldvalue like '%!##$\%&*()-_=+{};;%' ESCAPE '\'
The problem is your brackets ([]); it has nothing to do with semicolons. If we remove the brackets, the comparison works:
SELECT CASE WHEN '!##$%&*()-_=+{}|:"<>?\;'',./' LIKE '%!##$%&*()-_=+{}|:"<>?\;'',./%' THEN 1 END AS WithoutBrackets,
CASE WHEN '!##$%&*()-_=+{}|:"<>?[]\;'',./' LIKE '%!##$%&*()-_=+{}|:"<>?[]\;'',./%' THEN 1 END AS WithBrackets
Notice that WithoutBrackets returns 1, whereas WithBrackets returns NULL.
Brackets in a LIKE denote a pattern. For example, SomeExpress LIKE '[ABC]' would match any one of the characters A, B, or C. If you are going to include special characters, you need to ESCAPE them. You have both brackets, a percent sign (%) and an underscore (_) that you need to escape. You don't need to escape the hyphen (-), as it doesn't appear inside a bracket pattern (for example [A-Z]). I chose a backtick as the ESCAPE character, as it doesn't appear in your string, and demonstrate with a CASE expression again:
SELECT CASE WHEN '!##$%&*()-_=+{}|:"<>?[]\;'',./' LIKE '%!##$`%&*()-`_=+{}|:"<>?`[`]\;'',./%' ESCAPE '`' THEN 1 END;
If you wanted to use a backslash (\), which many do, you would also need to escape the backslash in your pattern:
SELECT CASE WHEN '!##$%&*()-_=+{}|:"<>?[]\;'',./' LIKE '%!##$\%&*()-\_=+{}|:"<>?\[\]\\;'',./%' ESCAPE '\' THEN 1 END;
I think the issue is actually with the backslash. This is an escape character and so if you want it to be included, you have to put it in twice.
Select * from SitePanel_FieldValue WHere SiteFieldIdfk =111
And SiteFieldvalue like '%!##$%&*()-_=+{}|:"<>?[]\\;'',./%'

How to replace characters which are not alphanumeric in pyspark sql?

This is my code:
%spark.pyspark
jdbc_write(spark, spark.sql("""
SELECT
Global_Order_Number__c
, Infozeile__c
FROM STAG.SF_CASE_TRANS
"""), JDBC_URLS['xyz_tera_utf8'], "DEV_STAG.SF_CASE", "abc", "1234")
I want to exclude every character in the Infozeile__c field that is not a-z, A-Z or 0-9.
Is there any function which is able to do this?
Apply regexp_replace() to the column in your query:
regexp_replace(Infozeile__c, '[^a-zA-Z0-9]', '') as Infozeile__c
The regex [^a-zA-Z0-9] is a negated character class, meaning any character not in the ranges given. The replacement is an empty string, which effectively deletes the matched character.
If you're expecting lots of characters to be replaced like this, it would be a bit more efficient to add a +, which means "one or more", so whole blocks of undesirable characters are removed at a time.
regexp_replace(Infozeile__c, '[^a-zA-Z0-9]+', '')
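The pattern can be tried on a plain string before it goes into the query. This is a minimal sketch with Python's re module standing in for Spark's regexp_replace (which uses Java regular expressions; for this simple class the behaviour is assumed to be the same). The function name and sample text are made up for the example.

```python
import re


def keep_alphanumeric(text: str) -> str:
    # The '+' lets each run of unwanted characters be removed in one match.
    return re.sub(r"[^a-zA-Z0-9]+", "", text)


sample = "Order #12, ref: A-7!"
print(keep_alphanumeric(sample))  # Order12refA7

# Without the '+', the result is the same; only the number of matches differs.
assert re.sub(r"[^a-zA-Z0-9]", "", sample) == keep_alphanumeric(sample)
```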

TRIM doesn't remove inner whitespaces

I am trying to remove all whitespace inside a string. For this, I use the TRIM() function. Unfortunately it doesn't work as expected: the inner whitespace (between 35 and 'A') remains untouched:
select TRIM('Hopkins 35 A Street') as Street
Column type is nvarchar. The funny thing is that this function works fine (using example from above) when executed on W3Schools (TRIM function example): https://www.w3schools.com/sql/func_sqlserver_trim.asp.
I can use replace on this string and replace ' ' into '' without a problem. I work on SQL Server 18.7.1 (2020)
If you use TRIM like this, you are only removing leading and trailing spaces from a string. Naming the character explicitly does not change that; it still trims only at both ends:
select TRIM(' ' FROM 'Hopkins 35 A Street') as Street
UPDATE: if you meant to remove all spaces, including the inner ones, you should use
SELECT REPLACE('Hopkins 35 A Street', ' ', '')
TRIM never touches characters inside the string.
"Trimming" means the removal of whitespace from the start and/or the end of a string value. It never means (and has never meant to mean) the removal of whitespace within a string value (enclosed by non-whitespace characters).
You can indeed use the TRIM function with a FROM in its argument to specify other characters than whitespace to trim. In that case, the TRIM function will remove the specified characters from the start and the end of the string, but not within the string (enclosed by other characters).
In other words: the specified characters will be treated as if they were whitespace as well, but specifying them so will not affect the trimming behavior/algorithm itself.
Check out the sample on Microsoft Docs:
SELECT TRIM( '.,! ' FROM ' # test .') AS Result;
produces this result: # test
The TRIM function removes only the leading and trailing spaces in the data. It cannot remove spaces inside the data: for 'Hello World', TRIM cannot remove the space between Hello and World to produce 'HelloWorld'. If you want to remove all the spaces, use the REPLACE function. With REPLACE you can replace the space with any character, number or symbol, or simply remove it by replacing it with '', like
SELECT REPLACE('Hopkins 35 A Street', ' ', '')
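The difference between the two functions can be reproduced without a SQL Server instance. The sketch below runs the single-argument TRIM and REPLACE in an in-memory SQLite database through Python's sqlite3 module; SQLite is used here only as a convenient stand-in, on the assumption that these two functions behave the same way for plain spaces.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# TRIM strips spaces at both ends only; REPLACE removes every space.
trimmed, replaced = conn.execute(
    "SELECT TRIM('  Hopkins 35 A Street  '), "
    "REPLACE('Hopkins 35 A Street', ' ', '')"
).fetchone()
conn.close()

print(repr(trimmed))   # 'Hopkins 35 A Street'
print(repr(replaced))  # 'Hopkins35AStreet'
```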

how not to replace "]" when using regex_replace for removing special characters

I'm trying to remove a few special characters from a comment column in my table. I used the statement below, but it seems to remove the ']' even though it is in the negated ([^...]) list.
UPDATE TEST
set comments=REGEXP_REPLACE(
comments,
'[^[a-z,A-Z,0-9,[:space:],''&'','':'',''/'',''.'',''?'',''!'','']'']]*',
' '
);
The table data contains the following:
[SYSTEM]:Do you have it in stock? 😊
My requirement is to have:
[SYSTEM]:Do you have it in stock?
Your regex has two mistakes:
Do not put the characters in quotes, and do not separate them with commas.
Remove the inner square brackets.
Also place the closing square bracket first in the list, just after the initial circumflex, and list the opening square bracket as well so that it is kept. Use + rather than * so that the pattern cannot match the empty string. Fixed regex:
UPDATE TEST set comments=REGEXP_REPLACE(comments,'[^][a-zA-Z0-9[:space:]&:/.?!]+',' ');
My attempt: I just removed the commas and put the "accepted" characters directly after the initial "not" (no inner brackets or quotes).
The brackets are a special case: https://dba.stackexchange.com/a/109294/6228
select REGEXP_REPLACE(
'[ION] are varză murată.',
'[^][a-zA-Z0-9[:space:]&:/,.?!]+',
' ')
from dual;
Result:
[ION] are varz murat .
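The bracket placement rule can be checked quickly with Python's re module, which follows the same convention that a ']' placed first in a class is literal. This is only a sketch under that assumption: Python does not support Oracle's [:space:], so \s is substituted, and the variable names are illustrative.

```python
import re

# ']' directly after '^' is a literal; '[' later in the class is literal too.
# \s stands in for Oracle's [:space:], which Python's re does not support.
disallowed = re.compile(r"[^][a-zA-Z0-9\s&:/,.?!]+")

comment = "[SYSTEM]:Do you have it in stock? \U0001F60A"

# The emoji is replaced by a space; rstrip() drops the trailing blanks.
cleaned = disallowed.sub(" ", comment).rstrip()
print(cleaned)  # [SYSTEM]:Do you have it in stock?
```

Both square brackets survive because both are members of the allowed set; only the emoji falls outside it.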

Oracle query to identify columns having special characters

I'm trying to write a SQL query to return rows which have anything other than
letters, numbers, spaces and the following characters: '.', '{', '[', '}', ']'
The column contains characters like Ÿ, ¿
E.g. there's a table TEST with 2 columns, EmpNo and SampleText
EmpNo is simple sequence and SampleText has values like
('12345abcde','abcdefghij','1234567890','ab c d 1 3','abcd$%1234','%^*&^%$#$%','% % $ # %','abcd 12}34{','MINNEAŸPOLIS','THAN ¿VV ¿A')
I want to write a query which eliminates all rows that have even a single special character other than .{[}]. In the above example, it should return EmpNo 1, 2, 3, 4 and 8.
I tried REGEXP_LIKE but I'm not getting exactly what I need.
Query I used:
SELECT * FROM test
WHERE REGEXP_LIKE(sampleText, '[^A-Z^a-z^0-9^[^.^{^}]' ,'x');
This is not ignoring blanks, and I also need to ignore the closing bracket ']'.
You can use regular expressions for this, so I think this is what you want:
select t.*
from test t
where not regexp_like(sampletext, '.*[^]a-zA-Z0-9 .{}[].*')
I figured out the answer to the above problem.
The query below returns rows which have even a single occurrence of a character other than
letters, numbers, square brackets, curly brackets, space and dot.
Please note that the position of the closing bracket ']' in the matching pattern is important.
Right ']' has the special meaning of ending a character set definition. It wouldn't make any sense to end the set before you specified any members, so the way to indicate a literal right ']' inside square brackets is to put it immediately after the left '[' that starts the set definition
SELECT * FROM test WHERE REGEXP_LIKE(sampletext, '[^]^A-Z^a-z^0-9^[^.^{^}^ ]' );
The key is that the backslash escape character will not work with the right square bracket inside a character class (there it is interpreted as a literal backslash). Instead, put the right square bracket first in the negated class, where it is taken literally, like this:
select EmpNo, SampleText
from test
where NOT regexp_like(SampleText, '[^] A-Za-z0-9.{}[]');
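Compare the character length and the byte length using the LENGTH and LENGTHB functions in Oracle. In a multibyte character set such as AL32UTF8, the two differ for rows that contain non-ASCII characters: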
SELECT * FROM test WHERE length(sampletext) <> lengthb(sampletext)
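To see how the two approaches differ on the sample data, here is a small sketch in Python. It uses the re module as a stand-in for REGEXP_LIKE and UTF-8 encoding as a stand-in for LENGTHB; both are assumptions for illustration, and Oracle's results for ranges such as A-Z can depend on the database's NLS settings.

```python
import re

# Sample rows from the question, keyed by EmpNo.
rows = {
    1: "12345abcde", 2: "abcdefghij", 3: "1234567890", 4: "ab c d 1 3",
    5: "abcd$%1234", 6: "%^*&^%$#$%", 7: "% % $ # %", 8: "abcd 12}34{",
    9: "MINNEA\u0178POLIS", 10: "THAN \u00BFVV \u00BFA",
}

# Negated class: anything other than letters, digits, space, '.', '{', '}',
# '[' and ']'. The literal ']' comes first so it does not close the class.
special = re.compile(r"[^]A-Za-z0-9 .{}[]")

clean = [emp for emp, text in rows.items() if not special.search(text)]
print(clean)  # [1, 2, 3, 4, 8]

# Character count versus UTF-8 byte count, mimicking LENGTH <> LENGTHB.
multibyte = [
    emp for emp, text in rows.items()
    if len(text) != len(text.encode("utf-8"))
]
print(multibyte)  # [9, 10]
```

The length comparison flags only the rows with multibyte characters (9 and 10); rows 5 to 7 contain ASCII symbols such as $ and %, so they are caught by the regular expression but not by the length check.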