ORACLE - Find rows which contain words with X number of characters - sql

How can I search in Oracle for rows that contain a word with a certain number of characters, let's say 15?
So if the row is "This word has 15 characters - Fifteencharache", it should be returned as a result.
Thanks.

You can use regular expressions to search for (at least) 15 consecutive word characters:
SELECT *
FROM your_table
WHERE REGEXP_LIKE( your_column, '\w{15}' )
If you want exactly 15 characters then:
SELECT *
FROM your_table
WHERE REGEXP_LIKE( your_column, '(^|\W)\w{15}(\W|$)' )

You need to use regexp_like(). For 15 or more characters:
where regexp_like(col, '[a-zA-Z]{15}')
Here is a simple way for space-delimited words of exactly 15 characters:
where regexp_like(' ' || col || ' ', ' [a-zA-Z]{15} ')
You can expand this to more delimiters if you need.
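To see the difference between "at least 15" and "exactly 15" outside the database, here is a minimal Python sketch of the two patterns from the first answer. The function names are made up for illustration, and Python's regex engine is only an approximation of Oracle's (for example, what counts as a word character can differ by locale).

```python
import re

def has_word_at_least(text, n=15):
    # Mirrors '\w{15}': n or more consecutive word characters anywhere.
    return re.search(r"\w{%d}" % n, text) is not None

def has_word_of_length(text, n=15):
    # Mirrors '(^|\W)\w{15}(\W|$)': a run of exactly n word characters,
    # bounded by a non-word character or by the start/end of the string.
    return re.search(r"(^|\W)\w{%d}(\W|$)" % n, text) is not None

row = "This word has 15 characters - Fifteencharache"
print(has_word_at_least(row), has_word_of_length(row))
```

A 16-character word satisfies the first check but not the second, which is the case the bounded pattern is meant to exclude.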

Related

Oracle remove special characters

I have a column in a table ident_nums that contains different types of ids. I need to remove special characters (e.g. [.,/#&$-]) from that column and replace them with a space; however, if a special character is found at the beginning of the string, I need to remove it without inserting a space. I tried to do it in steps: first, I replaced the special characters with spaces (using REGEXP_REPLACE), then I looked for the records that have spaces at the beginning of the string and tried to use the TRIM function to remove the white space, but for some reason that is not working.
Here is what I have done
Select regexp_replace(id_num, '[:(),./#*&-]', ' ') from ident_nums
This part works for me: it replaces all the unwanted characters in the column. However, if the string starts with a special character, I don't want a space there; I would like to remove just the character, so I tried to use the built-in function TRIM.
update ident_nums
set id_num = TRIM(id_num)
I'm getting an error ORA-01407: can't update ident_nums.id_num to NULL
Any ideas what I am doing wrong here?
It does work if I add a where clause,
update ident_nums
set id_num = TRIM(id_num) where id = 123;
but I need to update all the rows with the white space at the beginning of the string.
Any suggestions are welcome.
Or if it can be done better.
The table has millions of records.
Thank you
Regular expressions can be slow, so if you can do the job with plain built-in functions, consider that.
As @Abra suggested, TRIM and TRANSLATE are a good choice, but you may prefer LTRIM, which removes only leading characters (TRIM removes both leading and trailing characters). If you want to remove spaces you can omit the trim-character parameter; space is the default.
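-- TRANSLATE maps characters by position, so the replacement string is padded
-- to the same length as the list of special characters; characters without a
-- counterpart would be removed instead of being replaced with a space.
select
ltrim(translate('#kdjdj:', ':(),./#*&-', rpad(' ', 10)))
from dual;
select
ltrim(translate(original_string, 'special_characters_to_remove', rpad(' ', length('special_characters_to_remove'))))
from dual;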
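One detail worth checking with TRANSLATE: characters are mapped by position, and any character in the second argument that has no counterpart in the third is removed rather than replaced. The following Python model of that behaviour is only a sketch (the helper name oracle_translate and the SPECIALS constant are made up for illustration); it shows why the length of the replacement string matters.

```python
def oracle_translate(expr, from_chars, to_chars):
    """Rough model of Oracle TRANSLATE: positional mapping, where
    characters of from_chars beyond the length of to_chars are deleted."""
    mapping = {}
    for i, ch in enumerate(from_chars):
        if ch in mapping:
            continue  # the first occurrence in from_chars wins
        mapping[ch] = to_chars[i] if i < len(to_chars) else ""
    return "".join(mapping.get(ch, ch) for ch in expr)

SPECIALS = ":(),./#*&-"

# One-character replacement string: only ':' maps to a space, the rest vanish.
print(repr(oracle_translate("#ab-cd", SPECIALS, " ")))

# Replacement padded to the same length: every special becomes a space,
# and stripping the left side drops the leading one.
print(repr(oracle_translate("#ab-cd", SPECIALS, " " * len(SPECIALS)).lstrip(" ")))
```

The first call yields the string with the specials deleted, the second yields the specials turned into spaces with the leading space stripped, which is what the question asks for.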
Combination of Oracle built-in functions TRANSLATE and TRIM worked for me.
select trim(' ' from translate('#$one,$2-zero...', '#$,-.', rpad(' ', 5))) as RESULT -- padded to 5 spaces so every listed character becomes a space
from DUAL
I think trim() is the key, but if you want to keep only alphanumeric characters and spaces, then:
select trim(' ' from regexp_replace(col, '[^a-zA-Z0-9 ]', ' ', 1, 0))
regexp_replace() makes it possible to specify only the characters you want to keep, which could be convenient.
Thanks, everyone. This query worked for me:
update ident_nums
set id_num = LTRIM(REGEXP_REPLACE(id_num, '[[:space:]]+', ' '))
where REGEXP_LIKE(id_num, '^[ ?]')
This should work for you.
SELECT id_num, length(id_num) length_old, NEW_ID_NUM, length(NEW_ID_NUM) len_NEW_ID_NUM, ltrim(NEW_ID_NUM), length(ltrim(NEW_ID_NUM)) length_after_ltrim
FROM (
SELECT id_num, regexp_replace(id_num, '[:(),./#*&-]', ' ') NEW_ID_NUM FROM
(
SELECT '1234$%45' as id_num from dual UNION
SELECT '#SHARMA' as id_num from dual UNION
SELECT 'JACK TEST' as id_num from dual UNION
SELECT 'XYZ#$' as id_num from dual UNION
SELECT '#ABCDE()' as id_num from dual -- the 1st character becomes a space after the replacement
)
)

How to add blanks between digits (currency) - Oracle format

I need to give format to the following number:
1234567.89
as
1 234 567.89
I already tried:
select regexp_replace( '1234567.89', '(...)', '\1 ' ) from dual;
But it counts from left to right, and it ignores the decimal point.
Thanks in advance.
Best regards.
-- D is the decimal character and G the group separator; here they are set to '.' and ' '.
SELECT TO_CHAR(10000, '99G999D99MI',
'NLS_NUMERIC_CHARACTERS = ''. ''') "Amount"
FROM DUAL;
You can do it like this:
select replace(to_char(1234567.89, '9,999,999,999,999,999.99'), ',', ' ') x from dual
The only catch is that you need to know how large your biggest number can be. If, say, your biggest number is in the millions, write the format model for billions, so that it covers at least as many digits as you will need.
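The "format with commas, then swap the commas for spaces" idea from the last answer is easy to try out in other languages too. A minimal Python sketch, where the function name format_with_spaces is hypothetical:

```python
def format_with_spaces(value, decimals=2):
    # Group thousands with commas first, then swap the commas for spaces,
    # the same two-step idea as REPLACE(TO_CHAR(...), ',', ' ').
    grouped = f"{value:,.{decimals}f}"
    return grouped.replace(",", " ")

print(format_with_spaces(1234567.89))
```

Unlike a fixed TO_CHAR format model, this version does not need to know the largest number in advance.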

Oracle Identifying multiple spaces using REGEXP_LIKE

I'm trying to identify fields that have more than one space in a row in a comment, e.g. 'this   has three spaces'
Using this I can get anything with two spaces, but would like to be able to get 2 or more:
select * from labtec.spaces
where REGEXP_LIKE(SPACES, '[[:space:]]{2}');
Any suggestions?
I believe that you can:
select * from labtec.spaces
where REGEXP_LIKE(SPACES, '[[:space:]]{2,}');
Note the comma.
For "between three and five" you would use {3,5}, for "two or more" {2,}, and for "eight or fewer" {0,8}.
where REGEXP_LIKE(SPACES, '[[:space:]][[:space:]]+');
You do not need to check for two-or-more characters - checking for two is sufficient to filter the rows since if there are three characters then matching only two of them will work just as well as matching two-or-more.
This will find strings which have two or more (consecutive or non-consecutive) space CHR(32) characters (without using regular expressions):
SELECT *
FROM labtec.spaces
WHERE INSTR( spaces, ' ', 1, 2 ) > 0
This will find where there are two or more consecutive space CHR(32) characters:
SELECT *
FROM labtec.spaces
WHERE INSTR( spaces, '  ' ) > 0
If you want any two (or more) consecutive white-space characters then you only need to check for two matching characters:
SELECT *
FROM labtec.spaces
WHERE REGEXP_LIKE( spaces, '\s\s' ) -- Or, using POSIX syntax '[[:space:]]{2}'
Update - Leading and trailing spaces
SELECT *
FROM labtec.spaces
WHERE SUBSTR( spaces, 1, 2 ) = '  ' -- at least two leading spaces
OR SUBSTR( spaces, -2 ) = '  ' -- at least two trailing spaces
or, using (perl-like) regular expressions:
SELECT *
FROM labtec.spaces
WHERE REGEXP_LIKE( spaces, '^\s\s|\s\s$' )
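The different checks in this answer are easy to mix up, so here is a small Python sketch that mirrors each one. The function names are invented for illustration, and Python's \s is only an approximation of Oracle's white-space class.

```python
import re

def has_two_spaces_anywhere(text):
    # Like INSTR(spaces, ' ', 1, 2) > 0: a second space exists somewhere.
    return text.count(" ") >= 2

def has_consecutive_whitespace(text):
    # Like REGEXP_LIKE(spaces, '\s\s'): two adjacent white-space characters.
    return re.search(r"\s\s", text) is not None

def has_leading_or_trailing_pair(text):
    # Like REGEXP_LIKE(spaces, '^\s\s|\s\s$').
    return re.search(r"^\s\s|\s\s$", text) is not None

for sample in ["this   has three spaces", "single spaces only", "  lead", "trail  "]:
    print(repr(sample),
          has_two_spaces_anywhere(sample),
          has_consecutive_whitespace(sample),
          has_leading_or_trailing_pair(sample))
```

For a pure yes/no filter, searching for '\s\s' and searching for '\s{2,}' select the same rows, which is the point made earlier in the answer.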

How to remove specific value from comma separated string in oracle

I want to remove a specific value from a comma-separated string using Oracle.
Sample Input -
col
1,2,3,4,5
Suppose I want to remove 3 from the string.
Sample Output -
col
1,2,4,5
Please suggest how I can do this using an Oracle query.
Thanks.
Here is a solution that uses only standard string functions (rather than regular expressions), which should result in faster execution in most cases. It removes 3 only when it is a whole list element: at the start of the string followed by a comma, at the end preceded by a comma, or in the middle with a comma on both sides. Exactly one adjacent comma is removed along with it, so the remaining elements stay correctly delimited.
It is able to remove two 3's in a row (which some of the other solutions offered are not able to do), it leaves consecutive commas in place (which presumably stand in for NULL), and it does not disturb numbers like 38 or 123.
The strategy is to first double up every comma (replace , with ,,) and append and prepend a comma (to the beginning and the end of the string). Then remove every occurrence of ,3,. From what is left, replace every ,, back with a single , and finally remove the leading and trailing ,.
with
test_data ( str ) as (
select '1,2,3,4,5' from dual union all
select '1,2,3,3,4,4,5' from dual union all
select '12,34,5' from dual union all
select '1,,,3,3,3,4' from dual
)
select str,
trim(both ',' from
replace( replace(',' || replace(str, ',', ',,') || ',', ',3,'), ',,', ',')
) as new_str
from test_data
;
STR           NEW_STR
------------- ----------
1,2,3,4,5     1,2,4,5
1,2,3,3,4,4,5 1,2,4,4,5
12,34,5       12,34,5
1,,,3,3,3,4   1,,,4

4 rows selected.
Note: As pointed out by MT0 (see the comments below), this will trim too much if the original string begins or ends with commas. To cover that case, instead of wrapping everything within trim(both ',' from ...), wrap the rest in a subquery and use something like substr(new_str, 2, length(new_str) - 2) in the outer query.
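The comma-doubling strategy can be traced step by step outside the database. This Python sketch follows the same four steps and uses the slice-off-one-character variant rather than a trim, so leading and trailing empty elements survive; the function name remove_value is made up for illustration.

```python
def remove_value(csv, value):
    # 1. Double every comma and wrap the string in commas, so each element
    #    is surrounded by its own pair of delimiters.
    wrapped = "," + csv.replace(",", ",,") + ","
    # 2. Remove every whole element equal to value, with both its delimiters.
    removed = wrapped.replace("," + value + ",", "")
    # 3. Collapse the doubled commas back to single ones.
    collapsed = removed.replace(",,", ",")
    # 4. Drop exactly one wrapping comma from each end (not a trim).
    return collapsed[1:-1]

for s in ["1,2,3,4,5", "1,2,3,3,4,4,5", "12,34,5", "1,,,3,3,3,4", ",3,5"]:
    print(s, "->", remove_value(s, "3"))
```

Doubling the commas is what lets adjacent matches be removed in one pass: after step 1 no two elements share a delimiter, so removing one match never consumes the comma the next match needs.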
Here is one method:
select trim(both ',' from replace(',' || '1,2,3,4,5' || ',', ',' || '3' || ',', ',')) from dual
That said, storing comma-delimited strings is a really, really bad idea. There is almost no reason to do such a thing. Oracle supports JSON, XML, and nested tables -- all of which are better alternatives.
The need to remove an element suggests a poor data design.
You can convert the list to rows using XMLTABLE, filter out the unwanted rows, and then re-aggregate them:
SELECT LISTAGG( x.value.getStringVal(), ',' ) WITHIN GROUP ( ORDER BY idx )
FROM XMLTABLE(
( '1,2,3,4,5' )
COLUMNS value XMLTYPE PATH '.',
idx FOR ORDINALITY
) x
WHERE x.value.getStringVal() != 3;
For a simple filter this is probably not worth it and you should use something like (based on @mathguy's solution):
SELECT SUBSTR( new_list, 2, LENGTH( new_list ) - 2 ) AS new_list
FROM (
SELECT REPLACE(
REPLACE(
',' || REPLACE( :list, ',', ',,' ) || ',',
',' || :value_to_replace || ','
),
',,',
','
) AS new_list
FROM DUAL
)
However, if the filtering is more complicated then it might be worth converting the list to rows, filtering and re-aggregating.
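The "convert to rows, filter, re-aggregate" shape is the same in any language. As a rough Python sketch (the function name filter_list and the predicate argument keep are hypothetical), the filter can be arbitrarily complicated without changing the surrounding code:

```python
def filter_list(csv, keep):
    # Split into elements (the "rows"), keep those that pass the predicate,
    # then join them again in their original order.
    items = csv.split(",")
    return ",".join(item for item in items if keep(item))

# Simple filter: drop the element "3".
print(filter_list("1,2,3,4,5", lambda v: v != "3"))

# A more involved filter: keep only odd numbers.
print(filter_list("1,2,3,4,5", lambda v: int(v) % 2 == 1))
```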
I do not know how to do this in Oracle, but with SQL Server I'd use a trick:
convert the list to XML by replacing the commas with tags
use XQuery to filter the data
reconcatenate
This is SQL Server syntax, but it might point you in the right direction:
declare @s varchar(100)='1,2,2,3,3,4';
declare @exclude int=3;
WITH Casted AS
(
SELECT CAST('<x>' + REPLACE(@s,',','</x><x>') + '</x>' AS XML) AS TheXml
)
SELECT x.value('.','int')
FROM Casted
CROSS APPLY TheXml.nodes('/x[text()!=sql:variable("@exclude")]') AS A(x)
UPDATE
I just found this answer which seems to show pretty well how to start...
I agree with Gordon regarding the fact that storing comma delimited data in a column is a really bad idea.
I just prepend a ',' to the CSV, then use the replace function followed by a left trim to clean up the leading ','.
VAR b_number varchar2(5);
EXEC :b_number := '3';
PL/SQL procedure successfully completed.
WITH srce AS (
SELECT
',' || '3,1,2,3,3,4,5,3' col
FROM
dual
) SELECT
ltrim(replace(col,',' ||:b_number),',') col
FROM
srce;
COL
1,2,4,5

Counting word lengths in a string

I am using an Oracle regular expression to extract the first letter of each word in a string. The results are returned in a single cell, with spaces representing hard breaks. Here is an example...
input:
'I hope that some kind person
browsing stack overflow
can help me'
output:
ihtskp bso chm
What I am trying to do next is count the length of each "word" in my output, like this:
6 3 3
Alternatively, a count of the words in each line of the original string would be acceptable, as it would yield the same result.
Thanks!
Count the number of spaces and add one:
select (length(your_col) - length(replace(your_col, ' '))+1) from your_table;
It will give you the number of words per line. From there you can get all counts on one line by using listagg function:
select LISTAGG(cnt,' ') within group (order by null) from (
select (length(a)-length(replace(a,' '))+1) cnt from (
select 'apa bpa bv' a from dual
union all
select 'n bb gg' a from dual
union all
select 'ff ff rr gg' a from dual))
group by null;
Perhaps you also need to split the strings if they contain newlines or are they split already?
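If the lines are still held in one string, the same "spaces plus one" count can be taken per line after splitting on newlines. A minimal Python sketch, assuming words are separated by single spaces and lines have no leading or trailing spaces (the function name words_per_line is made up for illustration):

```python
def words_per_line(text):
    # Number of words on a line = number of spaces + 1,
    # the same arithmetic as length(col) - length(replace(col, ' ')) + 1.
    return [line.count(" ") + 1 for line in text.splitlines()]

text = "I hope that some kind person\nbrowsing stack overflow\ncan help me"
print(" ".join(str(n) for n in words_per_line(text)))
```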
I tried to edit my original post, but the edit hasn't appeared. In the meantime I figured out a way to solve my issue: I break the words into rows, since I know how to count the characters in a row, and then reassemble the character counts into a single cell using listagg:
with my_string as (
select lvl, regexp_substr (words,'[0-9]+|[a-z]+|[A-Z]+',1,lvl) parsed
from (
select words, level lvl
from letters connect by level <= length(words) - length(replace(words,' ')) + 1)
)
select listagg(length(parsed),' ') within group (order by lvl) word_count -- order by position so the counts keep the original word order
from my_string