Oracle Identifying multiple spaces using REGEXP_LIKE - sql

I'm trying to identify fields that have more than one space in a comment, e.g. 'this has three spaces'
Using this I can get anything with two spaces, but would like to be able to get 2 or more:
select * from labtec.spaces
where REGEXP_LIKE(SPACES, '[[:space:]]{2}');
Any suggestions?

I believe that you can:
select * from labtec.spaces
where REGEXP_LIKE(SPACES, '[[:space:]]{2,}');
Note the comma.
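For "between three and five" you would use {3,5}, for "two or more" {2,}, and for "eight or fewer" {0,8} (Oracle documents the {m}, {m,} and {m,n} forms, so spell out the lower bound rather than writing {,8}).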
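To see the interval quantifiers side by side, here is a minimal sketch that runs against made-up rows rather than labtec.spaces. The CTE name samples and the column txt are hypothetical, and RPAD(' ', n) is only used so the number of spaces is explicit.

```sql
-- Hypothetical sample data; "samples" and "txt" are illustrative names only.
WITH samples AS (
  SELECT 'one' || RPAD(' ', 1) || 'space'   AS txt FROM dual UNION ALL
  SELECT 'two' || RPAD(' ', 2) || 'spaces'  AS txt FROM dual UNION ALL
  SELECT 'four' || RPAD(' ', 4) || 'spaces' AS txt FROM dual
)
SELECT txt,
       -- Y when a run of two or more whitespace characters exists
       CASE WHEN REGEXP_LIKE(txt, '[[:space:]]{2,}')  THEN 'Y' ELSE 'N' END AS two_or_more,
       -- Y when a run of at least three whitespace characters exists
       CASE WHEN REGEXP_LIKE(txt, '[[:space:]]{3,5}') THEN 'Y' ELSE 'N' END AS three_to_five
FROM samples;
```

Because REGEXP_LIKE is not anchored, {3,5} also matches inside a longer run, so it behaves like "at least three" unless the pattern is bounded by non-space characters.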

where REGEXP_LIKE(SPACES, '[[:space:]][[:space:]]+');

You do not need to check for two-or-more characters - checking for two is sufficient to filter the rows since if there are three characters then matching only two of them will work just as well as matching two-or-more.
This will find strings which have two or more (consecutive or non-consecutive) space CHR(32) characters (without using regular expressions):
SELECT *
FROM labtec.spaces
WHERE INSTR( spaces, ' ', 1, 2 ) > 0
This will find where there are two or more consecutive space CHR(32) characters:
SELECT *
FROM labtec.spaces
WHERE INSTR( spaces, '  ' ) > 0 -- the literal holds two space characters
If you want any two (or more) consecutive white-space characters then you only need to check for two matching characters:
SELECT *
FROM labtec.spaces
WHERE REGEXP_LIKE( spaces, '\s\s' ) -- Or, using POSIX syntax '[[:space:]]{2}'
Update - Leading and trailing spaces
SELECT *
FROM labtec.spaces
WHERE SUBSTR( spaces, 1, 2 ) = '  ' -- at least two leading spaces (two-space literal)
OR SUBSTR( spaces, -2 ) = '  ' -- at least two trailing spaces (two-space literal)
or, using (perl-like) regular expressions:
SELECT *
FROM labtec.spaces
WHERE REGEXP_LIKE( spaces, '^\s\s|\s\s$' )

Related

Oracle remove special characters

I have a column in a table ident_nums that contains different types of ids. I need to remove special characters (e.g. [.,/#&$-]) from that column and replace them with a space; however, if a special character is found at the beginning of the string, I need to remove it without inserting a space. I tried to do it in steps: first, I replaced the special characters with spaces (using REGEXP_REPLACE), then I found the records that contain spaces at the beginning of the string and tried to use the TRIM function to remove the white space, but for some reason that is not working.
Here is what I have done
Select regexp_replace(id_num, '[:(),./#*&-]', ' ') from ident_nums
This part works for me, I remove all the unwanted characters from the column, however, if the string in the column starts with a character I don't want to have space in there, I would like to remove just the character, so I tried to use the built-in function TRIM.
update ident_nums
set id_num = TRIM(id_num)
I'm getting an error ORA-01407: can't update ident_nums.id_num to NULL
Any ideas what I am doing wrong here?
It does work if I add a where clause,
update ident_nums
set id_num = TRIM(id_num) where id = 123;
but I need to update all the rows with the white space at the beginning of the string.
Any suggestions are welcome.
Or if it can be done better.
The table has millions of records.
Thank you
Regexp can be slow sometimes so if you can do it by using built-in functions - consider it.
As @Abra suggested, TRIM and TRANSLATE are a good choice, but you may prefer LTRIM, which removes only leading spaces from the string (TRIM removes both leading and trailing characters). If you want to remove spaces you can omit the trim character parameter, since space is the default.
select
ltrim(translate('#kdjdj:', '[:(),./#*&-]', ' '))
from dual;
select
ltrim(translate(original_string, 'special_characters_to_remove', ' '))
from dual;
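One detail worth checking when using TRANSLATE this way: it maps characters by position, so any character in the second argument that has no counterpart in the third argument is removed rather than replaced. The sketch below contrasts the two behaviours on a made-up value; the literal '#12-34/56' and the column aliases are illustrative only.

```sql
-- The second argument has 12 characters; '[' and ']' are ordinary characters here.
SELECT
  -- One-character third argument: only '[' maps to a space, the rest are deleted.
  LTRIM(TRANSLATE('#12-34/56', '[:(),./#*&-]', ' '))           AS removed,
  -- Twelve spaces: every listed character maps to a space, then LTRIM drops the leading one.
  LTRIM(TRANSLATE('#12-34/56', '[:(),./#*&-]', RPAD(' ', 12))) AS spaced
FROM dual;
```

Here removed comes back as '123456' and spaced as '12 34 56', so the padded form is the one that matches "replace with a space, but not at the start".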
Combination of Oracle built-in functions TRANSLATE and TRIM worked for me.
select trim(' ' from translate('#$one,$2-zero...', '#$,-.',' ')) as RESULT
from DUAL
Refer to this dbfiddle
I think trim() is the key, but if you want to keep only alpha numerics, digits, and spaces, then:
select trim(' ' from regexp_replace(col, '[^a-zA-Z0-9 ]', ' ', 1, 0))
regexp_replace() makes it possible to specify only the characters you want to keep, which could be convenient.
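A likely cause of the ORA-01407 in the question, stated as an assumption about the data: Oracle treats an empty string as NULL, so TRIM on a value made up only of spaces returns NULL, which a NOT NULL column rejects. A minimal sketch, reusing the ident_nums and id_num names from the question:

```sql
-- TRIM of an all-space value yields NULL in Oracle.
SELECT CASE WHEN TRIM(' ') IS NULL THEN 'NULL' ELSE 'NOT NULL' END AS trimmed_blank
FROM dual;

-- Only touch rows that start with a space and would still hold a value afterwards.
UPDATE ident_nums
SET    id_num = LTRIM(id_num)
WHERE  id_num LIKE ' %'
AND    LTRIM(id_num) IS NOT NULL;
```

Rows that consist only of spaces are skipped by the second predicate and would need a separate decision, for example a placeholder value or deletion.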
Thanks, everyone. This query worked for me:
update ident_nums
set id_num = LTRIM(REGEXP_REPLACE(id_num, '[[:space:]]+', ' '))
where REGEXP_LIKE(id_num, '^[ ?]')
this should work for you.
SELECT id_num, length(id_num) length_old, NEW_ID_NUM, length(NEW_ID_NUM) len_NEW_ID_NUM, ltrim(NEW_ID_NUM), length(ltrim(NEW_ID_NUM)) length_after_ltrim
FROM (
SELECT id_num, regexp_replace(id_num, '[:(),./#*&-#]', ' ') NEW_ID_NUM FROM
(
SELECT '1234$%45' as id_num from dual UNION
SELECT '#SHARMA' as id_num from dual UNION
SELECT 'JACK TEST' as id_num from dual UNION
SELECT 'XYZ#$' as id_num from dual UNION
SELECT '#ABCDE()' as id_num from dual -- The 1st character is space
)
)

Replace function combined with ltrim and rtrim

Can someone please help me to understand the following code:
REPLACE(LTRIM(RTRIM(dbo.UFN_SEPARATES_COLUMNS(CompletionDetails, 3, ','))), '.', '') AS BuildRequestID,
Does it say remove all trailing and leading spaces, then replace 3 with comma. Next, if there is ., replace it with ' '?
It does not at any point replace 3 with ,.
We can make all this easier to follow by formatting the full expression to cover multiple lines:
REPLACE(
LTRIM(RTRIM(
dbo.UFN_SEPARATES_COLUMNS(CompletionDetails, 3, ',')
))
,'.', ''
) AS BuildRequestID,
Expressions like this have to be read from the inside out. So we start with this inner-most part:
dbo.UFN_SEPARATES_COLUMNS(CompletionDetails, 3, ',')
This UFN_SEPARATES_COLUMNS() function is not part of SQL Server, but was added by someone at your organization or as part of the vendor software package for the database you're looking at. But I'm confident, based on inferences and the link (found via Google), that it will treat CompletionDetails as delimited text, where the delimiter is a comma (based on the 3rd ',' argument), and return the 3rd field (based on the 2nd 3 argument, where counting starts at 1 rather than 0). As CSV parsers go, this one is particularly naive, so be very careful what you expect from it.
Then we use LTRIM() and RTRIM() to remove both leading and trailing blanks from the field. Not all whitespace is removed; only space characters. Tabs, line feeds, etc. are not trimmed. SQL Server 2017 has a new TRIM() function that can handle wider character sets and do both sides of the string with one call.
The code then uses the REPLACE() function to remove all . characters from the result (replaces them with an empty string).
The code is trimming the leading and trailing spaces via the LTRIM() and RTRIM() functions of whatever is returned from the function dbo.x_COLUMNS... i.e. dbo.x_COLUMNS(CompletionDetails, 3, ','). LTRIM is left, RTRIM is right.
It then is replacing all periods (.) with nothing via the REPLACE() function.
So in summary, it's removing all periods from the string and the leading and trailing spaces.
The LTRIM removes leading spaces. RTRIM removes trailing spaces. REPLACE removes the period.
Declare @Val Char(20) = ' Frid.ay '
Select REPLACE(
LTRIM(
RTRIM(
@Val --dbo.x_COLUMNS(CompletionDetails, 3, ',')
)
), '.', ''
) AS BuildRequestID
Result
BuildRequestID
--------------
Friday
remove all trailing and leading spaces, then replace 3 with comma.
Next, if there is ., replace it with ' '
No it does not say that.
But this does:
REPLACE(REPLACE(LTRIM(RTRIM(CompletionDetails)), '3', ','), '.', ' ')
It's not clear whether you want . replaced by ' ' or ''.
I used the first case; you can change it as you like.
It's easier to understand like this:
remove all trailing and leading spaces: LTRIM(RTRIM(CompletionDetails))
replace 3 with comma: REPLACE( ?, '3', ',')
replace it with ' ': REPLACE(? , '.', ' ') or REPLACE(? , '.', '')

ORACLE - Find rows which contain words with X number of characters

How can I search for rows in Oracle that contain words with a certain number of characters, let's say 15?
So if the row is "This word has 15 characters - Fifteencharache", it should be returned as a result.
Thanks.
You can use regular expressions to search for (at least) 15 consecutive word characters:
SELECT *
FROM your_table
WHERE REGEXP_LIKE( your_column, '\w{15}' )
If you want exactly 15 characters then:
SELECT *
FROM your_table
WHERE REGEXP_LIKE( your_column, '(^|\W)\w{15}(\W|$)' )
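The difference between the two patterns shows up on a word that is longer than 15 characters. A minimal sketch with made-up rows follows; the CTE name samples and the column txt are hypothetical, and 'Sixteencharacter' is simply a 16-letter test word.

```sql
-- Hypothetical sample data for comparing the two patterns.
WITH samples AS (
  SELECT 'short words only'            AS txt FROM dual UNION ALL
  SELECT 'has Fifteencharache inside'  AS txt FROM dual UNION ALL
  SELECT 'has Sixteencharacter inside' AS txt FROM dual
)
SELECT txt,
       -- Y when any run of 15 word characters exists, even inside a longer word
       CASE WHEN REGEXP_LIKE(txt, '\w{15}') THEN 'Y' ELSE 'N' END              AS at_least_15,
       -- Y only when the run is bounded by a non-word character or the string edge
       CASE WHEN REGEXP_LIKE(txt, '(^|\W)\w{15}(\W|$)') THEN 'Y' ELSE 'N' END AS exactly_15
FROM samples;
```

The 16-letter row should be flagged by at_least_15 but not by exactly_15, while the 15-letter row is flagged by both.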
You need to use regexp_like(). For 15 or more characters:
where regexp_like(col, '[a-zA-Z]{15}')
Here is a simple way for space-delimited words:
where regexp_like(' ' || col || ' ', ' [a-zA-Z]{15} ')
You can expand this to more delimiters if you need.

Oracle - Can't delete/replace whitespaces

Given the following sample statement, I simply can't get rid of the leading and trailing white spaces no matter if using a combination of RTRIM() and LTRIM() or REGEXP_REPLACE():
select
test_column
,length(test_column) len
,regexp_replace(test_column, '(^[[:blank:]]+)|([[:blank:]]+$)','') rxp
,length(regexp_replace(test_column, '(^[[:blank:]]+)|([[:blank:]]+$)','')) len_rxp --22 characters expected, but is 26
,rtrim(ltrim(test_column)) rltrim
,length(rtrim(ltrim(test_column))) len_rltrim --22 characters expected, but is 26
from(
select ' ABCDEF Hijklmnopqr S32 ' test_column --22 characters without and 29 including whitespaces
from dual);
What's the matter?
You can use the following:
select regexp_replace(test_column, '^(\s*)(.*\S)(\s*)$', '\2') -- \s also covers tabs; ending group 2 on \S leaves the trailing whitespace to group 3
from (
select ' ABCDEF Hijklmnopqr S32 ' test_column
from dual
);
This should divide your string into 3 parts (leading, meaningful text, ending) and return only the second one, thus cutting away the leading and trailing sequences of spaces and tabs.
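When splitting a string this way, it matters where the middle group stops: a plain greedy (.*) runs to the end of the string and takes the trailing whitespace with it. The sketch below compares three patterns by result length on a made-up value (3 leading spaces, 'ABC DEF', 2 trailing spaces); the CTE name samples and column txt are hypothetical.

```sql
-- Hypothetical sample: 3 leading spaces + 7 characters + 2 trailing spaces = 12.
WITH samples AS (
  SELECT RPAD(' ', 3) || 'ABC DEF' || RPAD(' ', 2) AS txt FROM dual
)
SELECT LENGTH(txt)                                           AS len_original,
       -- Greedy (.*) keeps the trailing spaces inside the captured group.
       LENGTH(REGEXP_REPLACE(txt, '^\s*(.*)\s*$', '\1'))     AS len_greedy,
       -- Forcing the group to end on a non-whitespace character drops them.
       LENGTH(REGEXP_REPLACE(txt, '^\s*(.*\S)\s*$', '\1'))   AS len_end_on_nonspace,
       -- Alternative: delete leading and trailing runs directly.
       LENGTH(REGEXP_REPLACE(txt, '^\s+|\s+$', ''))          AS len_alternation
FROM samples;
```

The expected lengths are 12, 9, 7 and 7. If the lengths still do not shrink on real data, the padding is probably not ordinary whitespace (for example a non-breaking space), which DUMP(test_column) can reveal.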

Delete certain character based on the preceding or succeeding character - ORACLE

I have used the REPLACE function to delete email addresses from hundreds of records. However, the semicolon is the separator between one email address and another, and a lot of stray semicolons are now left behind.
For example, the field:
123@hotmail.com;456@yahoo.com;789@gmail.com;xyz@msn.com
Let's say that after I deleted two email addresses, the field content became:
;456@yahoo.com;789@gmail.com;
I need to clean these extra semicolons from the fields so they look like:
456@yahoo.com;789@gmail.com
For double semicolons I have used REPLACE as well, replacing each ;; with ;
Is there any way to delete any semicolon that is not preceded or followed by a character?
If you only need to replace semicolons at the start or end of the string, using a regular expression with the anchor '^' (beginning of string) / '$' (end of string) should achieve what you want:
with v_data as (
select '123@hotmail.com;456@yahoo.com;789@gmail.com;xyz@msn.com' value
from dual union all
select ';456@yahoo.com;789@gmail.com;' value from dual
)
select
value,
regexp_replace(regexp_replace(value, '^;', ''), ';$', '') as normalized_value
from v_data
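If you also need to remove stray semicolons from the middle of the string, note that Oracle's regular expression functions do not support lookahead/lookbehind; collapsing runs of semicolons with a pattern such as ';+' is one alternative.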
You remove leading and trailing characters with TRIM:
select trim(both ';' from ';456@yahoo.com;;;789@gmail.com;') from dual;
To replace multiple characters with only one occurrence use REGEXP_REPLACE:
select regexp_replace(';456@yahoo.com;;;789@gmail.com;', ';+', ';') from dual;
Both methods combined:
select regexp_replace( trim(both ';' from ';456@yahoo.com;;;789@gmail.com;'), ';+', ';' ) from dual;
A regular expression replace can help:
select regexp_replace('123@hotmail.com;456@yahoo.com;;456@yahoo.com;;789@gmail.com',
'456@yahoo.com(;)+') as result from dual;
Output:
| RESULT |
|-------------------------------|
| 123@hotmail.com;789@gmail.com |