Replacing multiple special characters using Oracle SQL functions - sql

#All - Thanks for your help
ID Email
1 karthik.sanu#gmail.com. , example#gmail.com#
2 karthik?sanu#gmail.com
In the above example, if you see the 1st row, the email address is invalid because of dot at the end of 1st email address and # at the end of 2nd email address.
In 2nd row, there is a ? in between email address.
Just wanted to know is there any way to handle there characters and remove those from email address using SQL function and update the same in table.
Thanks in advance.

you can also check a translate function
translate('my ,string#with .special chars','#,?. ', ' ')

You could nest multiple invokations of replace(), but this quickly becomes convoluted.
On the other hand, regexp_replace() comes handy for this:
regexp_replace(column_name, '#|,|\?|\.', ' ')
The pipe character (|) stands for or. The dot (.) and the question mark (?) are meaningful characters in regexes so they need to escaped with a backslash (\).

Something like this will "remove" everything but digits, letters and spaces (if that's what you wanted).
SQL> with test (col) as
2 (select 'This) is a se#nten$ce with. everything "but/ only 123 numbers, and ABC lett%ers' from dual)
3 select regexp_replace(col, '[^a-zA-Z0-9 ]') result
4 from test;
RESULT
-----------------------------------------------------------------------
This is a sentence with everything but only 123 numbers and ABC letters
SQL>

Related

How to remove any special characters from a string even with dot and comma and spaces

INPUT STRING:'HI every one. I want to (2-21-2022) remove the comma-dot and other any special character from string(123)'.
OUTPUT STRING:'HI every one I want to 2-21-2022 remove the #comma dot and other any special #character from string 123'
Thanks IN Advance.
If what you said in title:
remove any special characters from a string even with dot and comma and spaces
means that you'd want to keep only digits and letters, then such a regular expression might do:
SQL> with test (col) as
2 (select 'HI every one. I want to (2-21-2022) remove the comma-dot and other any special character from string(123)' from dual)
3 select regexp_replace(col, '[^[:alnum:]]') result
4 from test;
RESULT
---------------------------------------------------------------------------------
HIeveryoneIwantto2212022removethecommadotandotheranyspecialcharacterfromstring123
SQL>
On the other hand, that's not what example you posted represents (as already commented).

Get the FIrst character from the Firstname if more than one firstname is present in oracle

I have a requirement where the persons can have more than one first name and i need to convert the name into first characters from the 2 or more names with capital letters.
Examples:
Srinivas Kalyan ,Sai Kishore
if the above is one having 4 first names then i need to display as below
S K S K
The comma is also be replaced and take all the first characters
2. Srinivas-Kalyan Sai Kishore
For the above name the value should be as
S S K
since Srinivas-Kalyan is considered as one name
Also the name can be in small letters
srinivas kalyan sai kishore
For this
S K S K
This has to be in oracle
Tried the below regex_replace which is working fine in sql developer but in the application it is changing into space
replace(trim(regexp_replace(to_char(regexp_replace(initcap(regexp_replace(regexp_replace
(FIRST_NAME, '[0-9]', ''), '( *[[:punct:]])', '')), '([[:lower:]]| )')), '(.)', '\1 ')),',',null)
The missing letter in your output is caused by this regex for character removal:
( *[[:punct:]])
This will turn Kalyan ,Sai into KalyanSai, which will be treated as one word by the rest of the process, and so you will not have the S of Sai in your output.
I would suggest this shorter expression:
trim(upper(regexp_replace(
regexp_replace(first_name, '([[:alpha:]])(-|[[:alpha:]])+', '\1'),
'.*?([[:alpha:]]|$)', ' \1'
)))
Explanation
([[:alpha:]])(-|[[:alpha:]])+ -> \1
This replaces any sequence of letters (or hyphen) with the first of those. So it reduces words to their initial letter.
.*?([[:alpha:]]|$) -> (space)\1
This replaces anything that precedes the next letter, with a space. As no letter will be skipped (because .*? is non-greedy) this effectively replaces all non-letter sequences with a space. To also replace the non-letters at the very end (which do not precede a letter), the special case $ (end-of-string) is added as alternative.
After these two steps there just remains to:
Upper case everything
Remove blanks at the start and end of the result with trim
I find the advantage of this method is that it does not use any other class than [[:alpha]]. Digits, punctuation, lowercase -- or whatever else -- does not need to be identified explicitly, as it is just the negation of [[:alpha]]. The only exception that has to be made, is for the hyphen.
As some names might include some other non-letters, like a quote, you might want to add such characters as well in the first regular expression.

a sql code or function to remove all the special characters from a particular column in a table

a sql code or function to remove all the special characters from a particular column in a table.
:a oracle code to remove all the special character from a column .for example ABC D.E.F so it should be ABC DEF,space should be maintained between 2 words.
If you want to remove the hyphens and the dots, you can go with translate like so:
select
translate(column_name, 'Q._"?!##$%^&*æ', 'Q')
from
your_table;
The easiest way should be a Regular Expression, this removes anything but spaces and letters a to z:
REGEXP_REPLACE(col, '[^a-z ]','' , 1, 0, 'i')

What is this Oracle regexp matching in this production code?

Here's the code that is in production:
dynamic_sql := q'[ with cte as
select user_id,
user_name
from user_table
where regexp_like (bizz_buzz,'^[^Z][^Y6]]' || q'[') AND
user_code not in ('A','E','I')
order by 1]';
Start at the beginning and search bizz_buzz
Match any one character that is NOT Z
Match any two characters that are not Y6
What's the ']' after the 6?
Then what?
I think that StackOverflow's formatting is causing some of the confusion in the answers. Oracle has a syntax for a string literal, q'[...]', which means that the ... portion is to be interpreted exactly as-is; so for instance it can include single quotes without having to escape each one individually.
But the code formatting here doesn't understand that syntax, so it is treating each single-quote as a string delimiter, which makes the result look different that how Oracle really sees it.
The expression is concatenating two such string literals together. (I'm not sure why - it looks like it would be possible to write this as a single string literal with no issues.) As pointed out in another answer/comment, the resulting SQL string is actually:
with cte as
select user_id,
user_name
from user_table
where regexp_like (bizz_buzz,'^[^Z][^Y6]') AND
user_code not in ('A','E','I')
order by 1
And also as pointed out in another answer, the [^Y6] portion of the regex matches a single character, not two. So this expression should simply match any string whose first character is not 'Z' and whose second character is neither 'Y' nor '6'.
When not in couples ] means... Well... Itself:
^[^Z][^Y6]]/
^ assert position at start of the string
[^Z] match a single character not present in the list below
Z the literal character Z (case sensitive)
[^Y6] match a single character not present in the list below
Y6 a single character in the list Y6 literally (case sensitive)
] matches the character ] literally
Start at the beginning and search bizz_buzz
Match any one character that is NOT Z
Match any two one characters that is not Y or 6
What's the ']' after the 6? it's a ]
I'm afraid I have to post this here as the comment section is inappropriate for the formatting required. After your edit above that shows the entire statement, I ran this to see what the string ends up being:
select q'[ with cte as
select user_id,
user_name
from user_table
where regexp_like (bizz_buzz,'^[^Z][^Y6]]' || q'[') AND
user_code not in ('A','E','I')
order by 1]' txt
from dual;
It ended up yielding this:
with cte as
select user_id,
user_name
from user_table
where regexp_like (bizz_buzz,'^[^Z][^Y6]') AND
user_code not in ('A','E','I')
order by 1
It is apparent now that the closing bracket and quote at the end of the regex belong to the first alternate quote string and not to the regex. This is concatenating 2 alternate quoted strings which is a tad confusing as it sure looked like part of the regex. If anything you are learning the importance of comments for the poor person behind you! Please comment this accordingly when you are done figuring this out. Even include a link to this post.

Oracle RegExp For Counting Words Occuring After a Character

I want to identify the number of words occurring after a comma, in a full name field in Oracle database table.
The name field contains format of "LAST, FIRST MIDDLE"
Some names may have up to 4 to 5 names, such as "DOE, JOHN A B"
For example, if the Name field = 'WILLIAMS JR, HANK' it would output 1 (for 1 word occurring after the comma.
If the Name field = 'DOE, JOHN A B' i want it to output 3.
I would like to use a regexp_count function to determine this count.
I am using the following code to identify how many words exist in the field and would like to modify it to include this functionality:
REGEXP_COUNT(REPLACE(fieldname, ',',', '), '[^]+')
It would likely have to remove the replace function in order to find the comma, but this was the best I could do so far.
Help is much appreciated!
How about the following:
REGEXP_COUNT( fieldname, "\\w", INSTR(fieldname, ",")+1)
I have updated the code as follows, which appears to be working as desired:
REGEXP_COUNT(fieldname, '[^ ]+', (INSTR(fieldname, ',')+1))