Remove 2 characters in Oracle SQL

I have a column that contains 12-digit values, but the user wants only 10 digits.
I tried the trim and ltrim functions, but nothing worked. Below are the queries I tried.
ltrim('10', 'column_name')
ltrim('10', column_name)
ltrim(10, column_name)
For example, I have a column that contains 12-digit numbers:
100000000123
100000000456
100000000789
and the expected result I want is
0000000123
0000000456
0000000789

To extract the last 10 characters of an input string, regardless of how long the string is (so this works whether an input has 10, 12, or 15 characters), you can use a negative starting position in substr:
substr(column_name, -10)
For example:
with
my_table(column_name) as (
select '0123401234' from dual union all
select '0001112223334' from dual union all
select '12345' from dual union all
select '012345012345' from dual
)
select column_name, substr(column_name, -10) as substr
from my_table;
COLUMN_NAME SUBSTR
------------- ----------
0123401234 0123401234
0001112223334 1112223334
12345
012345012345 2345012345
Note in particular the third example. The input has only 5 digits, so obviously you can't get a 10 digit number from it. The result is NULL (undefined).
Note also that if you use something like substr(column_name, 3) you will get just '345' in that case; most likely not the desired result.
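The behaviour described above can be modelled outside the database. The following is a minimal Python sketch, not Oracle code: oracle_substr is a hypothetical helper that imitates how the two-argument SUBSTR treats positive and negative positions, under the assumption that an empty result is reported as NULL (None here).

```python
def oracle_substr(value, position):
    """Hypothetical model of Oracle's two-argument SUBSTR(value, position)."""
    if value is None or value == "":
        return None                      # Oracle treats '' as NULL
    if position == 0:
        position = 1                     # position 0 is treated as 1
    if position > 0:
        start = position - 1             # count from the left, 1-based
    else:
        start = len(value) + position    # count back from the end
        if start < 0:
            return None                  # string is shorter than requested
    return value[start:] or None         # empty result becomes NULL


print(oracle_substr("100000000123", -10))   # 0000000123
print(oracle_substr("12345", -10))          # None
print(oracle_substr("12345", 3))            # 345
```

The negative position is the safer choice when the input length varies, because a too-short input yields NULL rather than a silently truncated value.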

Try SUBSTR(column_name, 3), which starts at the third character and so skips the first two.

Related

Oracle REGEXP_REPLACE for both space and "%" at the same time

I have following Oracle SQL code:
SELECT TO_NUMBER(TRIM(REGEXP_REPLACE(per_growth, '(%)(\s)')),
'FM99999999999999999990D099999999999999999',
'NLS_NUMERIC_CHARACTERS = '', ''') AS per_growth
FROM sometable;
This code is supposed to look for a percent sign followed by a space and remove them from the result. However, it raises the error
ORA-01722: invalid number
I am still learning SQL and do not know the exact cause. Is something wrong with (%)(\s)? The value in the table is 50%
You can use TRANSLATE to get rid of all instances of unwanted characters:
SELECT TO_NUMBER(
TRANSLATE(per_growth, '0% ', '0'),
'FM99999999999999999990D099999999999999999',
'NLS_NUMERIC_CHARACTERS = '', '''
) as per_growth
FROM sometable;
Note: TRANSLATE(expr, from_string, to_string) works by swapping all instances of the characters in from_string with the corresponding characters in to_string and if there are more characters in from_string than to_string then the remaining characters are removed. It is faster than using regular expressions and on a par with using REPLACE but it can handle multiple replacements at once, which REPLACE cannot.
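The character-by-character mapping described in this note can be sketched in Python. This is only an illustrative model, not Oracle's implementation: oracle_translate is a hypothetical name, and NULL handling is left out. In Oracle an empty to_string counts as NULL and makes the whole result NULL, which is why the query above maps '0' to itself as a placeholder.

```python
def oracle_translate(expr, from_string, to_string):
    """Hypothetical model of TRANSLATE(expr, from_string, to_string)."""
    mapping = {}
    for index, char in enumerate(from_string):
        if char in mapping:
            continue                      # the first occurrence wins
        # Characters beyond the end of to_string are marked for removal.
        mapping[char] = to_string[index] if index < len(to_string) else None

    result = []
    for char in expr:
        if char not in mapping:
            result.append(char)           # untouched character
        elif mapping[char] is not None:
            result.append(mapping[char])  # swapped character
        # else: the character is dropped
    return "".join(result)


print(oracle_translate("50%", "0% ", "0"))    # 50
print(oracle_translate("3 %", "0% ", "0"))    # 3
```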
If you did want to use the slower REGEXP_REPLACE then you can replace all whitespace characters and all percent characters, whether together or not, using:
SELECT TO_NUMBER(
REGEXP_REPLACE(per_growth, '[%[:space:]]'),
'FM99999999999999999990D099999999999999999',
'NLS_NUMERIC_CHARACTERS = '', '''
) as per_growth
FROM sometable;
Which, for the sample data:
CREATE TABLE sometable (per_growth) AS
SELECT '1%' FROM DUAL UNION ALL
SELECT '%2' FROM DUAL UNION ALL
SELECT '3 %' FROM DUAL UNION ALL
SELECT '4% ' FROM DUAL UNION ALL
SELECT '5,0%' FROM DUAL UNION ALL
SELECT '%%% 123 456 789,0123456 %' FROM DUAL;
Both output:
PER_GROWTH
1
2
3
4
5
123456789.0123456
Did you try good, old REPLACE?
select replace(replace(per_growth, '%', ''), ' ', '') as result
from your_table
You can use
REGEXP_REPLACE(per_growth, '( )(%)')
to remove a % sign together with the single space immediately before it (a value such as 50%, with no space, is left unchanged),
or
TRIM(REPLACE(per_growth,'%'))
to remove every % sign first and then the leading and trailing spaces,
before the numeric conversion.
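To see why the pattern from the question leaves 50% untouched, here is a small sketch that uses Python's re module as a stand-in. Oracle's regular expression dialect differs in details (for example, Python has no [:space:] class, so \s is used instead), but for these simple patterns the matching logic is assumed to be the same.

```python
import re

value = "50%"

# '(%)(\s)' needs a percent sign FOLLOWED by whitespace: no match in '50%'.
sequence_result = re.sub(r"(%)(\s)", "", value)

# A character class removes percent signs and whitespace independently.
class_result = re.sub(r"[%\s]", "", value)

print(repr(sequence_result))   # '50%'
print(repr(class_result))      # '50'
```

Because the first result still contains the % sign, a numeric conversion of it fails, which matches the ORA-01722 error reported in the question.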

Retrieve the characters before a matching pattern

135 ;1111776698 ;AB555678765
I have the above string and what I am looking for is to retrieve all the digits before the first occurrence of ;.
But the number of characters before the first occurrence of ; varies i.e. it may be a 4 digit number or 3 digit number.
I have played with regexp_instr and instr, but I am unable to figure this out.
The query should return all the digits before the first occurrence of ;
This answer assumes that you are using an Oracle database. I don't know of a way to do this using REGEXP_INSTR alone, but we can do it with REGEXP_REPLACE using capture groups:
SELECT REGEXP_REPLACE('135 ;1111776698 ;AB555678765', '^\s*(\d{3,4})\s*;.*', '\1')
FROM dual;
Here is the regex pattern being used:
^\s*(\d{3,4})\s*;.*
This allows, from the start of the string, any amount of leading whitespace, followed by a 3 or 4 digit number, followed again by any amount of whitespace, then a semicolon. The .* at the end of the pattern just consumes whatever remains in your string. Note (\d{3,4}), which captures the 3-4 digit number, which is then available in the replacement as \1.
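The capture-group replacement explained above can be tried out with Python's re module. This is a sketch under the assumption that the pattern behaves the same way in both dialects for this input; leading_number is a hypothetical helper name.

```python
import re

# Same pattern as in the answer: optional whitespace, a 3-4 digit number,
# optional whitespace, a semicolon, then the rest of the string.
PATTERN = r"^\s*(\d{3,4})\s*;.*"


def leading_number(text):
    """Replace the whole match with capture group 1 (the digits)."""
    return re.sub(PATTERN, r"\1", text)


print(leading_number("135 ;1111776698 ;AB555678765"))     # 135
print(leading_number("xx135 ;1111776698 ;AB555678765"))   # unchanged input
```

Note that a string that does not match the pattern is returned unchanged, so inputs outside the expected shape need a separate check.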
Using INSTR, SUBSTR and TRIM should work (based on your comment that there are "just white spaces and digits"):
select TRIM(SUBSTR(s,1, INSTR(s,';')-1)) FROM t;
The following using regexp_substr() should work:
SELECT s, REGEXP_SUBSTR(s, '^[^;]*') FROM t;
Make sure you try all possible values in that first position, even those you don't expect and make sure they are handled as you want them to be. Always expect the unexpected! This regex matches the first subgroup of zero or more optional digits (allows a NULL to be returned) when followed by an optional space then a semi-colon, or the end of the line. You may need to tighten (or loosen) up the matching rules for your situation, just make sure to test even for incorrect values, especially if the input comes from user-entered data.
with tbl(id, str) as (
select 1, '135 ;1111776698 ;AB555678765' from dual union all
select 2, ' 135 ;1111776698 ;AB555678765' from dual union all
select 3, '135;1111776698 ;AB555678765' from dual union all
select 4, ';1111776698 ;AB555678765' from dual union all
select 5, ';135 ;1111776698 ;AB555678765' from dual union all
select 6, ';;1111776698 ;AB555678765' from dual union all
select 7, 'xx135 ;1111776698 ;AB555678765' from dual union all
select 8, '135;1111776698 ;AB555678765' from dual union all
select 9, '135xx;1111776698 ;AB555678765' from dual
)
select id, regexp_substr(str, '(\d*?)( ?;|$)', 1, 1, NULL, 1) element_1
from tbl
order by id;
ID ELEMENT_1
---------- ------------------------------
1 135
2 135
3 135
4
5
6
7 135
8 135
9
9 rows selected.
To get the desired result, you should use REGEXP_SUBSTR, which extracts the part of the string that matches a pattern. Here is an example query.
Solution to your example data:
SELECT REGEXP_SUBSTR('135 ;1111776698 ;AB555678765','[^;]+',1,1) FROM DUAL;
What it does: the regex splits the string on the ; separator. You needed the first occurrence, so I passed 1,1 as the arguments.
If you need the second string, 1111776698, as your output, pass 1,2 instead.
The syntax for Regexp_substr is as following:
REGEXP_SUBSTR( string, pattern [, start_position [, nth_appearance [, match_parameter [, sub_expression ] ] ] ] )
Here is the link for more examples:
https://www.techonthenet.com/oracle/functions/regexp_substr.php
Let me know if this works for you. Best luck.

Number check in oracle sql

How can I check whether a 10-digit number contains 999 or 000 in the 4th to 6th positions?
I have an idea using INSTR, but I don't know how to implement it.
This is strange. If the "number" is really a string, then you can use like or substr():
where col like '___999%' or col like '___000%'
or:
where substr(col, 4, 3) in ('999', '000')
or even regular expressions.
Given the nature of your question, you can turn a number into a string and use these methods. However, if you are looking at particular digits, then the "number" should be stored as a string.
If they are actually numbers rather than strings then you could use numeric manipulation:
with t (n) as (
select 1234567890 from dual
union all select 1239997890 from dual
union all select 1230007890 from dual
union all select 1299967890 from dual
union all select 1234000890 from dual
)
select n,
mod(n, 10000000) as stage1,
mod(n, 10000000)/10000 as stage2,
trunc(mod(n, 10000000)/10000) as stage3,
case when trunc(mod(n, 10000000)/10000) in (0, 999) then 'Yes' else 'No' end as matches
from t;
N STAGE1 STAGE2 STAGE3 MATCHES
---------- ---------- ---------- ---------- -------
1234567890 4567890 456.789 456 No
1239997890 9997890 999.789 999 Yes
1230007890 7890 .789 0 Yes
1299967890 9967890 996.789 996 No
1234000890 4000890 400.089 400 No
Stage 1 effectively strips off the first three digits. Stage two almost strips off the last four digits, but leaves fractions, so stage 3 adds trunc() (you could also use floor()) to ignore those fractional parts.
The result of that is the numeric value of the 4-6th digits, and you can then test if that is 0, 999 or something else.
This is really looking at the 4th to 6th most significant digits, which is the same if the number is always 10 digits; if it might actually have different numbers of digits then you'd need to clarify what you want to see.
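The three stages can be checked with plain integer arithmetic. The sketch below mirrors mod() and trunc() with Python's % and // operators; the function names are hypothetical, and it assumes non-negative 10-digit integers as in the sample data.

```python
def digits_4_to_6(n):
    """Return the 4th-6th digits of a 10-digit number as an integer."""
    stage1 = n % 10_000_000      # drop the first three digits
    return stage1 // 10_000      # drop the last four digits (integer division)


def has_marker(n):
    """True when the 4th-6th digits are 000 or 999."""
    return digits_4_to_6(n) in (0, 999)


for n in (1234567890, 1239997890, 1230007890, 1299967890, 1234000890):
    print(n, digits_4_to_6(n), "Yes" if has_marker(n) else "No")
```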
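select 1 from dual
where instr('98800054542', '000', 4) = 4 or instr('98800054542', '999', 4) = 4;
Here the value is passed as a string, and instr() starts searching at position 4, so a result of 4 means the match begins exactly at the 4th character.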

Query to remove all non-digit but only keep last period/dot

I am struggling to design a regular expression that converts a varchar2 field value to a number by removing all non-digit characters and keeping only the last period in the string, so that
"about 1,000.00" returns 1000.00 or 1000
"3,000,000.000" returns 3000000.000 or 3000000
"3.000.000.000" returns 3000000.000 or 3000000
"a^*3^%*(C4.5d*9" returns 34.59
Any method is fine as long as it turns the string into one that to_number() can convert accurately.
I use
SELECT REGEXP_REPLACE(field_value, '[^0-9\.]+', '') from dual;
but can't resolve the 3rd case....
Because the regex in oracle are somewhat limited I don't think it's possible only using regexp_replace. You could do a workaround like this:
SELECT
CASE
WHEN last_dot < 2 THEN digits_and_dots
ELSE REPLACE(SUBSTR(digits_and_dots, 1, last_dot - 1), '.') ||
SUBSTR(digits_and_dots, last_dot)
END
FROM (
SELECT
INSTR(digits_and_dots, '.', -1) last_dot,
digits_and_dots
FROM (
SELECT
REGEXP_REPLACE(field_value, '[^0-9\.]+', '') digits_and_dots
FROM DUAL
) t
) o
Here's a way to do it, assuming there is one decimal character. The value you are working with is a string so I think of the decimal that we want to keep as a separator of the string and split it into 2 parts based on that. The first part is all characters leading up to but not including the last decimal, the second part is the last decimal and all characters after it. Then apply the replace, getting rid of everything that is not a number from the first part, and everything that is not a number or a decimal from the second part, then concatenate them together. Needs more testing with varied inputs but you get the idea. All these regular expressions are kind of expensive though so I doubt this will be the fastest solution.
with tbl(str) as (
select 'about 1,000.00' from dual union
select '3,000,000.000' from dual union
select '3.000.000.000' from dual union
select 'a^*3^%*(C4.5d*9' from dual
)
select str original,
regexp_replace(regexp_substr(str, '^(.*)\.', 1, 1, NULL, 1), '[^0-9]+', '') ||
regexp_replace(regexp_substr(str, '.*(\..*)$', 1, 1, NULL, 1), '[^0-9\.]+', '') Converted
from tbl;
SQL> /
ORIGINAL CONVERTED
--------------- ---------------
3,000,000.000 3000000.000
3.000.000.000 3000000.000
a^*3^%*(C4.5d*9 34.59
about 1,000.00 1000.00
SQL>
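The split-at-the-last-dot idea used in both answers can be sketched in a few lines of Python. This is an illustration of the logic only, not a replacement for the SQL; keep_last_dot is a hypothetical name, and the handling of inputs without any dot (digits returned as-is) is an assumption.

```python
import re


def keep_last_dot(text):
    """Strip everything except digits, keeping only the last dot."""
    cleaned = re.sub(r"[^0-9.]", "", text)   # keep digits and dots
    last_dot = cleaned.rfind(".")
    if last_dot == -1:
        return cleaned                       # no dot at all
    # Remove the dots before the last one, keep the last dot and what follows.
    return cleaned[:last_dot].replace(".", "") + cleaned[last_dot:]


for sample in ("about 1,000.00", "3,000,000.000",
               "3.000.000.000", "a^*3^%*(C4.5d*9"):
    print(sample, "->", keep_last_dot(sample))
```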
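If the input contains at most one dot, a shorter way is to strip everything except digits and dots:
select regexp_replace('a^*3^%*(C4.5d*9s', '[^0-9.]', '') from dual;
This returns 34.59. By contrast,
select regexp_substr('a^*3^%*(C4.5d*9s','\d+\.\d+') from dual;
returns only the first digits-dot-digits run (4.5 here), so it works only when the number is already contiguous.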

Querying substrings against a list of values

I'm reading from a dataset that I unfortunately don't have the access to modify. It has concatenated strings of values, and I want to select records for which any of those substrings (as split by a given character) matches any of the values in a specific list. I'll be passing the queries in via Python, so it won't be compared against a static list.
For example, the table looks like:
CrappyColumn
-----------
1;2
4
1
2;1
1;3
2
And I might want to return anything that has 2 or 4 in it. So, my result should be:
1;2
4
2
2;1
I have played with regexp_substr and gotten something that actually works; however, it just runs indefinitely (as much as 10 minutes before I give up) when I run it on the full dataset (which only includes about three thousand records with values that are often a couple hundred characters long). I need something that works in a reasonable amount of time for repeated execution.
I realize that--even with a variable comparison list--I could just write my Python code to parse the list and construct multiple LIKE statements, but that seems inefficient, and I assume that there is a better way.
And here's what I've done that takes too long:
SELECT DISTINCT CrappyColumn
FROM
(SELECT DISTINCT CrappyColumn, regexp_substr(CrappyColumn, '[^;]+', 1, LEVEL) as UGH
FROM CrappyTable
CONNECT BY regexp_substr(CrappyColumn, '[^;]+', 1, LEVEL) IS NOT NULL)
WHERE UGH IN ('2', '4')
Is there a better, faster, cleaner way to accomplish this?
EDIT - RESOLUTION:
Thanks to vkp's help, here is what I implemented:
regexp_like(SITE_ID, '^(2|4)(:)|(:)(2|4)(:)|(:)(2|4)$|^(2|4)$')
I modified it for my final product, so that it can handle strings of more than one character--by changing [2|4] to (2|4). This works in cases of searching for numbers that aren't single-digit.
You can use like:
select t.*
from crappytable t
where ';' || crappycolumn || ';' like '%;2;%' or
';' || crappycolumn || ';' like '%;4;%';
You seem to know that storing lists of values in a single column is a bad idea, so I'll spare the harangue ;)
EDIT:
If you don't like like, you can use regexp_like() like this:
where regexp_like(';' || crappycolumn || ';', ';2;|;4;')
A simpler method would be to use regexp_like to check if the list has 2 or 4 in it.
select *
from tablename
where regexp_like(crappycolumn,'^[2|4][^0-9]|[^0-9][2|4][^0-9]|[^0-9][2|4]$|^[2|4]$')
^[2|4][^0-9] - Starts with 2 or 4 not followed by a digit.
[^0-9][2|4][^0-9] - 2 or 4 neither preceded nor followed by a digit.
[^0-9][2|4]$ - Ends with 2 or 4 not preceded by a digit.
^[2|4]$ - 2 or 4 is the only character in the string.
Another form of regexp_like(). This regex looks for 2 or 4 only when preceded by the beginning of the line or a semicolon and followed by a semicolon or the end of the line:
SQL> with crappy_tbl(crappy_col) as (
select '1;2' from dual union
select '4' from dual union
select '1' from dual union
select '2;1' from dual union
select '1;3' from dual union
select '2' from dual union
select '22;;44;' from dual
)
select crappy_col
from crappy_tbl
where regexp_like(crappy_col, '(^|;)(2|4)(;|$)');
CRAPPY_
-------
1;2
2
2;1
4
SQL>
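Since the question mentions passing the queries in from Python, the pattern above could be generated from the comparison list. The following is a minimal sketch under stated assumptions: build_pattern is a hypothetical helper, the separator is fixed to ;, and the values are restricted to plain alphanumeric tokens so that no regex escaping is needed. The filtering with Python's re module only demonstrates the matching logic locally; in practice the pattern string would be handed to regexp_like().

```python
import re


def build_pattern(values):
    """Build '(^|;)(v1|v2|...)(;|$)' from a list of alphanumeric tokens."""
    for value in values:
        if not value.isalnum():
            raise ValueError(f"unsupported token: {value!r}")
    return "(^|;)(" + "|".join(values) + ")(;|$)"


pattern = build_pattern(["2", "4"])
rows = ["1;2", "4", "1", "2;1", "1;3", "2", "22;;44;"]

# Keep the rows where some ;-delimited element equals one of the values.
matching = [row for row in rows if re.search(pattern, row)]
print(pattern)
print(matching)
```

Anchoring each alternative between a delimiter or a string boundary is what keeps 22 and 44 from matching when only 2 and 4 are requested.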