Issues with SUBSTR function Oracle_SQL - sql

I used the SUBSTR function for a similar purpose, but I encountered the following issue:
I am extracting 6 characters from the right, but the data in the column is inconsistent: some rows have fewer than 6 characters (5 or 4). For such rows, the function returns blanks. How can I fix this?
Example Scenario 1:
SUBSTR('0000123456',-6,6)
Output: 123456
Scenario 2 (how do I fix this? I need it to return '23456'):
SUBSTR('23456',-6,6)
Output: ""

You can use a case expression: if the string length is strictly greater than 6 then return just the last 6 characters; otherwise return the string itself. This way you don't need to call substr unless it is really needed.
Alternatively, if speed is not the biggest issue and you are allowed to use regular expressions, you can write this more compactly - select between 0 and 6 characters - as many as possible - at the end of the string.
Finally, if you don't mind using undocumented functions, you can use reverse and standard substr (starting from character 1 and extracting the first 6 characters; that will work as expected even if the string has length less than 6). So: reverse the string, extract first (up to) 6 characters, and then reverse again to restore the order. WARNING: This is shown only for fun; DO NOT USE THIS METHOD!
with
test_data (str) as (
select '0123449389' from dual union all
select '00000000' from dual union all
select null from dual union all
select 'abcd' from dual
)
select str,
case when length(str) > 6 then substr(str, -6) else str end as case_substr,
regexp_substr(str, '.{0,6}$') as regexp_substr,
reverse(substr(reverse(str), 1, 6)) as rev_substr
from test_data
;
STR        CASE_SUBSTR   REGEXP_SUBSTR REV_SUBSTR
---------- ------------- ------------- --------------
0123449389 449389        449389        449389
00000000   000000        000000        000000
abcd       abcd          abcd          abcd

One method uses coalesce():
select coalesce(substr('23456', -6, 6), '23456') from dual;
Another clamps the start position to the length of the string:
select substr('23456', greatest(- length('23456'), -6), 6) from dual;
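As a rough sketch of why the original call comes back empty, and of how the GREATEST variant behaves on a column rather than a literal: Oracle's SUBSTR returns NULL when a negative start position reaches back past the beginning of the string. The test_data CTE and the column aliases below are made up for illustration.

```sql
-- Hypothetical sample rows; replace test_data with your own table.
with test_data (str) as (
  select '0000123456' from dual union all
  select '23456' from dual
)
select str,
       -- NULL for '23456': -6 overshoots a 5-character string
       substr(str, -6, 6) as plain_substr,
       -- start position clamped so it never reaches past the first character
       substr(str, greatest(-length(str), -6)) as last_six
from test_data;
-- '0000123456' -> plain_substr '123456', last_six '123456'
-- '23456'      -> plain_substr NULL,     last_six '23456'
```

The third argument is omitted in last_six because SUBSTR reads to the end of the string by default.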

Related

Retrieve the characters before a matching pattern

135 ;1111776698 ;AB555678765
I have the above string and what I am looking for is to retrieve all the digits before the first occurrence of ;.
But the number of characters before the first occurrence of ; varies i.e. it may be a 4 digit number or 3 digit number.
I have played with regexp_instr and instr, but I am unable to figure this out.
The query should return all the digits before the first occurrence of ;
This answer assumes that you are using an Oracle database. I don't know of a way to do this using REGEXP_INSTR alone, but we can do it with REGEXP_REPLACE using capture groups:
SELECT REGEXP_REPLACE('135 ;1111776698 ;AB555678765', '^\s*(\d{3,4})\s*;.*', '\1')
FROM dual;
Here is the regex pattern being used:
^\s*(\d{3,4})\s*;.*
This allows, from the start of the string, any amount of leading whitespace, followed by a 3 or 4 digit number, followed again by any amount of whitespace, then a semicolon. The .* at the end of the pattern just consumes whatever remains in your string. Note (\d{3,4}), which captures the 3-4 digit number, which is then available in the replacement as \1.
Using INSTR, SUBSTR and TRIM should work (based on your comment that there are "just white spaces and digits"):
select TRIM(SUBSTR(s,1, INSTR(s,';')-1)) FROM t;
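One edge case worth checking with this approach is a row that contains no semicolon at all: INSTR then returns 0, the requested length becomes -1, and SUBSTR returns NULL. A minimal sketch of one way around it is to append a sentinel semicolon before searching. The table t, its second row and the column aliases are assumptions for illustration.

```sql
-- Hypothetical sample rows standing in for table t.
with t (s) as (
  select '135 ;1111776698 ;AB555678765' from dual union all
  select '42' from dual   -- made-up row without any semicolon
)
select s,
       trim(substr(s, 1, instr(s, ';') - 1))        as original_form,
       -- appending ';' guarantees INSTR finds a delimiter
       trim(substr(s, 1, instr(s || ';', ';') - 1)) as with_sentinel
from t;
-- first row : original_form '135', with_sentinel '135'
-- second row: original_form NULL,  with_sentinel '42'
```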
The following, using regexp_substr(), should work:
SELECT s, REGEXP_SUBSTR(s, '^[^;]*') FROM t;
Make sure you try all possible values in that first position, even those you don't expect and make sure they are handled as you want them to be. Always expect the unexpected! This regex matches the first subgroup of zero or more optional digits (allows a NULL to be returned) when followed by an optional space then a semi-colon, or the end of the line. You may need to tighten (or loosen) up the matching rules for your situation, just make sure to test even for incorrect values, especially if the input comes from user-entered data.
with tbl(id, str) as (
select 1, '135 ;1111776698 ;AB555678765' from dual union all
select 2, ' 135 ;1111776698 ;AB555678765' from dual union all
select 3, '135;1111776698 ;AB555678765' from dual union all
select 4, ';1111776698 ;AB555678765' from dual union all
select 5, ';135 ;1111776698 ;AB555678765' from dual union all
select 6, ';;1111776698 ;AB555678765' from dual union all
select 7, 'xx135 ;1111776698 ;AB555678765' from dual union all
select 8, '135;1111776698 ;AB555678765' from dual union all
select 9, '135xx;1111776698 ;AB555678765' from dual
)
select id, regexp_substr(str, '(\d*?)( ?;|$)', 1, 1, NULL, 1) element_1
from tbl
order by id;
ID ELEMENT_1
---------- ------------------------------
1 135
2 135
3 135
4
5
6
7 135
8 135
9
9 rows selected.
To get the desired result, you can use REGEXP_SUBSTR, which extracts the matching part of the string you give it. Here is an example query.
Solution to your example data:
SELECT REGEXP_SUBSTR('135 ;1111776698 ;AB555678765','[^;]+',1,1) FROM DUAL;
The pattern [^;]+ matches a run of characters that are not semicolons, so in effect it splits the string on the ; separator. You need the first occurrence, so the arguments are 1,1 (start at position 1, return the first match).
If you need the second string, 1111776698, as your output, pass 1,2 instead.
The syntax for REGEXP_SUBSTR is as follows:
REGEXP_SUBSTR( string, pattern [, start_position [, nth_appearance [, match_parameter [, sub_expression ] ] ] ] )
Here is the link for more examples:
https://www.techonthenet.com/oracle/functions/regexp_substr.php
Let me know if this works for you. Best luck.
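A short sketch of two things to watch for with the '[^;]+' pattern: the match keeps any spaces next to the separator, so a TRIM is usually wanted, and empty elements are skipped because the pattern needs at least one character. The column aliases are made up; the second input is row 6 from the test data in the earlier answer.

```sql
select
  -- second element, with the trailing space trimmed off
  trim(regexp_substr('135 ;1111776698 ;AB555678765', '[^;]+', 1, 2)) as second_item,
  -- the leading empty elements are skipped, so this is not NULL
  regexp_substr(';;1111776698 ;AB555678765', '[^;]+', 1, 1) as first_item
from dual;
-- second_item: '1111776698'
-- first_item : '1111776698 ' (with a trailing space)
```

If empty elements must be reported as NULL, the '(\d*?)( ?;|$)' style of pattern shown earlier is the safer choice.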

Number check in oracle sql

How can I check whether a 10-digit number contains 999 or 000 in the 4th to 6th positions?
I have an idea of using INSTR, but I don't know how to apply it.
This is strange. If the "number" is really a string, then you can use like or substr():
where col like '___999%' or col like '___000%'
or:
where substr(col, 4, 3) in ('999', '000')
or even regular expressions.
Given the nature of your question, you can turn a number into a string and use these methods. However, if you are looking at particular digits, then the "number" should be stored as a string.
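For a column that is stored as a NUMBER, a minimal sketch of the "turn it into a string" route could look like this. The CTE t and the column aliases are assumptions for illustration; TO_CHAR is called without a format model so no padding is added.

```sql
-- Hypothetical sample rows.
with t (n) as (
  select 1239997890 from dual union all
  select 1234000890 from dual
)
select n,
       substr(to_char(n), 4, 3) as digits_4_to_6,
       case when substr(to_char(n), 4, 3) in ('999', '000')
            then 'Yes' else 'No'
       end as matches
from t;
-- 1239997890 -> digits_4_to_6 '999', matches 'Yes'
-- 1234000890 -> digits_4_to_6 '400', matches 'No'
```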
If they are actually numbers rather than strings then you could use numeric manipulation:
with t (n) as (
select 1234567890 from dual
union all select 1239997890 from dual
union all select 1230007890 from dual
union all select 1299967890 from dual
union all select 1234000890 from dual
)
select n,
mod(n, 10000000) as stage1,
mod(n, 10000000)/10000 as stage2,
trunc(mod(n, 10000000)/10000) as stage3,
case when trunc(mod(n, 10000000)/10000) in (0, 999) then 'Yes' else 'No' end as matches
from t;
         N     STAGE1     STAGE2     STAGE3 MATCHES
---------- ---------- ---------- ---------- -------
1234567890    4567890    456.789        456 No
1239997890    9997890    999.789        999 Yes
1230007890       7890       .789          0 Yes
1299967890    9967890    996.789        996 No
1234000890    4000890    400.089        400 No
Stage 1 effectively strips off the first three digits. Stage two almost strips off the last four digits, but leaves fractions, so stage 3 adds trunc() (you could also use floor()) to ignore those fractional parts.
The result of that is the numeric value of the 4-6th digits, and you can then test if that is 0, 999 or something else.
This is really looking at the 4th to 6th most significant digits, which is the same if the number is always 10 digits; if it might actually have different numbers of digits then you'd need to clarify what you want to see.
select 1 from dual
where instr(to_char(98800054542), '000', 4) = 4
or instr(to_char(98800054542), '999', 4) = 4;
Let me know if this helped.

PL SQL replace conditionally suggestion

I need to replace the entire word with 0 if the word has any non-digit character. For example, if digital_word='22B4' then replace with 0, else if digital_word='224' then do not replace.
SELECT replace_function(digital_word,'has non numeric character pattern',0,digital_word)
FROM dual;
I tried decode, regexp_instr, regexp_replace but could not come up with the right solution.
Please advise.
Thank you.
The idea is simple: you need to check whether the value is numeric or not.
Script:
with nums as
(
select '123' as num from dual union all
select '456' as num from dual union all
select '7A9' as num from dual union all
select '098' as num from dual
)
select n.*
,nvl2(LENGTH(TRIM(TRANSLATE(num, ' +-.0123456789', ' '))),'0',num)
from nums n
result
1 123 123
2 456 456
3 7A9 0
4 098 098
See the articles below to decide which approach suits you better:
How can I determine if a string is numeric in SQL?
https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:15321803936685
How to tell if a value is not numeric in Oracle?
You might try the following:
SELECT CASE WHEN REGEXP_LIKE(digital_word, '\D') THEN '0' ELSE digital_word END
FROM dual;
The regular expression class \D matches any non-digit character. You could also use [^0-9] to the same effect:
SELECT CASE WHEN REGEXP_LIKE(digital_word, '[^0-9]') THEN '0' ELSE digital_word END
FROM dual;
Alternately you could see if the value of digital_word is made up of nothing but digits:
SELECT CASE WHEN REGEXP_LIKE(digital_word, '^\d+$') THEN digital_word ELSE '0' END
FROM dual;
Hope this helps.
The fastest way is to replace all digits with null (to simply delete them) and see if anything is left. You don't need regular expressions (slow!) for this, you just need the standard string function TRANSLATE().
Unfortunately, Oracle has to work around their own inconsistent treatment of NULL - sometimes as empty string, sometimes not. In the case of the TRANSLATE() function, you can't simply translate every digit to nothing; you must also translate a non-digit character to itself, so that the third argument is not an empty string (which is treated as a real NULL, as in relational theory). See the Oracle documentation for the TRANSLATE() function. https://docs.oracle.com/cd/E11882_01/server.112/e41084/functions216.htm#SQLRF06145
Then, the result can be obtained with a CASE expression (or various forms of NULL handling functions; I prefer CASE, which is SQL Standard):
with
nums ( num ) as (
select '123' from dual union all
select '-56' from dual union all
select '7A9' from dual union all
select '0.9' from dual
)
-- End of simulated inputs (for testing only, not part of the solution).
-- SQL query begins BELOW THIS LINE. Use your own table and column names.
select num,
case when translate(num, 'z0123456789', 'z') is null
then num
else '0'
end as result
from nums
;
NUM RESULT
--- ------
123 123
-56 0
7A9 0
0.9 0
Note: everything here is in varchar2 data type (or some other kind of string data type). If the results should be converted to number, wrap the entire case expression within TO_NUMBER(). Note also that the strings '-56' and '0.9' are not all-digits (they contain non-digits), so the result is '0' for both. If this is not what you needed, you must correct the problem statement in the original post.
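As a small sketch of the TO_NUMBER() wrapping mentioned in the note, assuming the same all-digits rule: the nums CTE and the result_num alias are made up for illustration.

```sql
-- Hypothetical sample rows.
with nums (num) as (
  select '123' from dual union all
  select '7A9' from dual union all
  select '098' from dual
)
select num,
       to_number(
         case when translate(num, 'z0123456789', 'z') is null
              then num      -- nothing left after deleting digits: all digits
              else '0'      -- something else was present
         end
       ) as result_num
from nums;
-- '123' -> 123
-- '7A9' -> 0
-- '098' -> 98 (the leading zero is lost once the value is a number)
```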
Something like the following update query will help you:
update [table] set [col] = '0'
where REGEXP_LIKE([col], '.*\D.*', 'i')

Query to remove all non-digit but only keep last period/dot

I am struggling to design a regular expression that converts a varchar2 field value to a number by removing all non-digit characters except the last period in the string, so that
"about 1,000.00" returns 1000.00 or 1000
"3,000,000.000" returns 3000000.000 or 3000000
"3.000.000.000" returns 3000000.000 or 3000000
"a^*3^%*(C4.5d*9" returns 34.59
Any method is fine, as long as it turns the string into one that to_number() can convert accurately.
I use
SELECT REGEXP_REPLACE(field_value, '[^0-9\.]+', '') from dual;
but can't resolve the 3rd case....
Because regular expressions in Oracle are somewhat limited, I don't think it's possible using only regexp_replace. You could use a workaround like this:
SELECT
CASE
WHEN last_dot < 2 THEN digits_and_dots
ELSE REPLACE(SUBSTR(digits_and_dots, 1, last_dot - 1), '.') ||
SUBSTR(digits_and_dots, last_dot)
END
FROM (
SELECT
INSTR(digits_and_dots, '.', -1) last_dot,
digits_and_dots
FROM (
SELECT
REGEXP_REPLACE(field_value, '[^0-9\.]+', '') digits_and_dots
FROM DUAL
) t
) o
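To try the workaround against the four sample strings from the question, a self-contained sketch might look like the following. The samples and cleaned CTEs and the converted alias are assumptions for illustration, and the character class is written as '[^0-9.]' since the dot needs no escaping inside brackets.

```sql
-- Hypothetical sample rows taken from the question.
with samples (field_value) as (
  select 'about 1,000.00'  from dual union all
  select '3,000,000.000'   from dual union all
  select '3.000.000.000'   from dual union all
  select 'a^*3^%*(C4.5d*9' from dual
),
cleaned as (
  -- step 1: keep only digits and dots
  select field_value,
         regexp_replace(field_value, '[^0-9.]', '') as digits_and_dots
  from samples
)
select field_value,
       case
         -- no dot, or only a leading dot: nothing to collapse
         when instr(digits_and_dots, '.', -1) < 2 then digits_and_dots
         -- step 2: drop every dot before the last one, keep the tail as is
         else replace(substr(digits_and_dots, 1, instr(digits_and_dots, '.', -1) - 1), '.')
              || substr(digits_and_dots, instr(digits_and_dots, '.', -1))
       end as converted
from cleaned;
-- 'about 1,000.00'  -> '1000.00'
-- '3,000,000.000'   -> '3000000.000'
-- '3.000.000.000'   -> '3000000.000'
-- 'a^*3^%*(C4.5d*9' -> '34.59'
```

Whether to_number() then accepts the result depends on the session's decimal character, so that step is left out here.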
Here's a way to do it, assuming there is one decimal character. The value you are working with is a string so I think of the decimal that we want to keep as a separator of the string and split it into 2 parts based on that. The first part is all characters leading up to but not including the last decimal, the second part is the last decimal and all characters after it. Then apply the replace, getting rid of everything that is not a number from the first part, and everything that is not a number or a decimal from the second part, then concatenate them together. Needs more testing with varied inputs but you get the idea. All these regular expressions are kind of expensive though so I doubt this will be the fastest solution.
with tbl(str) as (
select 'about 1,000.00' from dual union
select '3,000,000.000' from dual union
select '3.000.000.000' from dual union
select 'a^*3^%*(C4.5d*9' from dual
)
select str original,
regexp_replace(regexp_substr(str, '^(.*)\.', 1, 1, NULL, 1), '[^0-9]+', '') ||
regexp_replace(regexp_substr(str, '.*(\..*)$', 1, 1, NULL, 1), '[^0-9\.]+', '') Converted
from tbl;
ORIGINAL        CONVERTED
--------------- ---------------
3,000,000.000   3000000.000
3.000.000.000   3000000.000
a^*3^%*(C4.5d*9 34.59
about 1,000.00  1000.00
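A shorter option, if the string contains a single decimal number with no other characters inside it:
select regexp_substr('a^*3^%*(C4.5d*9s','\d+\.\d+') from dual;
Note that for this particular input it returns only 4.5, because other characters separate the remaining digits. To strip everything except digits and dots instead:
select regexp_replace('a^*3^%*(C4.5d*9s', '[^0-9.]', '') from dual;
This returns 34.59, but it keeps every dot, so it does not resolve the 3.000.000.000 case.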

Retrieve segment from value

I have this value in my field which have 5 segment for example 100-200-300-400-500.
How do I query to only retrieve the first 3 segment? Which mean the query result will display as 100-200-300.
The old SUBSTR and INSTR will be faster and less CPU intensive as compared to REGEXP.
WITH DATA AS (
SELECT '100-200-300-400-500' str FROM dual
)
SELECT substr(str, 1, instr(str, '-', 1, 3)-1) str
FROM DATA;
STR
-----------
100-200-300
The above SUBSTR and INSTR query uses the logic to find the 3rd occurrence of the hyphen "-" and then take the substring from position 1 till the third occurrence of '-'.
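If some values have three segments or fewer, INSTR finds no third hyphen and returns 0, so the SUBSTR above yields NULL. A minimal sketch of guarding against that with a CASE expression follows; the extra rows and the first_three alias are assumptions for illustration.

```sql
-- Hypothetical sample rows.
with data (str) as (
  select '100-200-300-400-500' from dual union all
  select '100-200-300'         from dual union all
  select '100-200'             from dual
)
select str,
       case
         -- fewer than three hyphens: return the value unchanged
         when instr(str, '-', 1, 3) = 0 then str
         else substr(str, 1, instr(str, '-', 1, 3) - 1)
       end as first_three
from data;
-- '100-200-300-400-500' -> '100-200-300'
-- '100-200-300'         -> '100-200-300'
-- '100-200'             -> '100-200'
```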
If the position of this sequence within the string is arbitrary, you might go for regular expressions, using the pattern ((\d)+-(\d)+-(\d)+):
select regexp_substr(
'Test-Me 100-200-300-400-500 AGain-Home',
'((\d)+-(\d)+-(\d)+)'
) As Result
from dual
RESULT
-----------
100-200-300
Otherwise, a simple SUBSTR will do.
You have two ways. The first is SUBSTR.
The second is fast: use a REGEXP like this.
SELECT REGEXP_SUBSTR('100-200-300-400-500','[[:digit:]]{3}-[[:digit:]]{3}-[[:digit:]]{3}') "REGEXPR_SUBSTR" FROM DUAL;