Inserting special characters inside a string without replacing any character - sql

Suppose I have a varchar variable.Can I insert different symbols after evry 2 character of the string untill end of the string .i.e untill length (string).
For example:
Input: '12345678'
And we don't know what is input and length of input and the output we want is :
Output: '12_34&56#78' (special character can be anything )
Please let me know if it possible doing dynamically using loop or something.

This can be done using a simple REGEXP_REPLACE.
The first argument will be your original string.
The second argument will be your regex pattern you want to put a a character after. In this case it is . (any character) occurring 2 times. It is surrounded by parentheses which will be needed for the next argument to specify a capture group.
The third argument is using \1 to specify the first capture group, then you can put anything you'd like to appear after the capture group. In the example below I used a !.
SELECT REGEXP_REPLACE ('12345678', '(.{2})', '\1!') as str FROM DUAL;
STR
_______________
12!34!56!78!
If you do not want the character at the end of your string, you can TRIM it from the right side or SUBSTR if the special character may appear at the end of the original string.

You can use an hierarchical query during the split and padding a special character through use of DBMS_RANDOM.VALUE in order to produce some special characters except for the last piece, and then LISTAGG() function to combine the pieces back again such as
SELECT id,LISTAGG( SUBSTR(col,1+(level-1)*2,2)||
CASE WHEN level < CEIL(LENGTH(col)/2)
THEN
CHR(TRUNC(DBMS_RANDOM.VALUE(33,48)))
END)
WITHIN GROUP (ORDER BY level)
AS result
FROM t
GROUP BY id
CONNECT BY level <= CEIL(LENGTH(col)/2)
AND PRIOR SYS_GUID() IS NOT NULL
AND PRIOR ID = ID
Demo

Related

Remove template text on regexp_replace in Oracle's SQL

I am trying to remove template text like &#x; or &#xx; or &#xxx; from long string
Note: x / xx / xxx - is number, The length of the number is unknown, The cell type is CLOB
for example:
SELECT 'H'ello wor±ld' FROM dual
A desirable result:
Hello world
I know that regexp_replace should be used, But how do you use this function to remove this text?
You can use
SELECT REGEXP_REPLACE(col,'&&#\d+;')
FROM t
where
& is put twice to provide escaping for the substitution character
\d represents digits and the following + provides the multiple occurrences of them
ending the pattern with ;
or just use a single ampersand ('&#\d+;') for the pattern as in the case of Demo , since an ampersand has a special meaning for Oracle, a usage is a bit problematic.
In case you wanted to remove the entities because you don't know how to replace them by their character values, here is a solution:
UTL_I18N.UNESCAPE_REFERENCE( xmlquery( 'the_double_quoted_original_string' RETURNING content).getStringVal() )
In other words, the original 'H'ello wor±ld' should be passed to XMLQUERY as '"H'ello wor±ld"'.
And the result will be 'H'ello wo±ld'

Regular expression - capture number between underscores within a sequence between commas

I have a field in a database table in the format:
111_2222_33333,222_444_3,aaa_bbb_ccc
This is format is uniform to the entire field. Three underscore separated numeric values, a comma, three more underscore separated numeric values, another comma and then three underscore separated text values. No spaces in between
I want to extract the middle value from the second numeric sequence, in the example above I want to get 444
In a SQL query I inherited, the regex used is ^.,(\d+)_.$ but this doesn't seem to do anything.
I've tried to identify the first comma, first number after and the following underscore ,222_ to use as a starting point and from there get the next number without the _ after it
This (,\d*_)(\d+[^_]) selects ,222_444 and is the closest I've gotten
We can try using REGEXP_REPLACE with a capture group:
SELECT
REGEXP_REPLACE(
'111_2222_33333,222_444_3,aaa_bbb_ccc',
'^[^,]+,[^_]+_(.*?)_[^_]+,.*$',
'\1') AS num
FROM yourTable;
Here is a demo showing that the above regex' first capture group contains the quantity you want.
Demo

How can I extract a substring from a character column without using SUBSTR()?

I have a questions regarding below data.
You clearly can see each EMP_IDENTIFIER has connected with EMP_ID.
So I need to pull only identifier which is 10 characters that will insert another column.
How would I do that?
I did some traditional way, using INSTR, SUBSTR.
I just want to know is there any other way to do it but not using INSTR, SUBSTR.
EMP_ID(VARCHAR2)EMP_IDENTIFIER(VARCHAR2)
62049 62049-2162400111
6394 6394-1368000222
64473 64473-1814702333
61598 61598-0876000444
57452 57452-0336503555
5842 5842-0000070666
75778 75778-0955501777
76021 76021-0546004888
76274 76274-0000454999
73910 73910-0574500122
I am using Oracle 11g.
If you want the second part of the identifier and it is always 10 characters:
select t.*, substr(emp_identifier, -10) as secondpart
from t;
Here is one way:
REGEXP_SUBSTR (EMP_IDENTIFIER, '-(.{10})',1,1,null,1)
That will give the 1st 10 character string that follows a dash ("-") in your string. Thanks to mathguy for the improvement.
Beyond that, you'll have to provide more details on the exact logic for picking out the identifier you want.
Since apparently this is for learning purposes... let's say the assignment was more complicated. Let's say you had a longer input string, and it had several groups separated by -, and the groups could include letters and digits. You know there are at least two groups that are "digits only" and you need to grab the second such "purely numeric" group. Then something like this will work (and there will not be an instr/substr solution):
select regexp_substr(input_str, '(-|^)(\d+)(-|$)', 1, 2, null, 2) from ....
This searches the input string for one or more digits ( \d means any digit, + means one or more occurrences) between a - or the beginning of the string (^ means beginning of the string; (a|b) means match a OR b) and a - or the end of the string ($ means end of the string). It starts searching at the first character (the second argument of the function is 1); it looks for the second occurrence (the argument 2); it doesn't do any special matching such as ignore case (the argument "null" to the function), and when the match is found, return the fragment of the match pattern included in the second set of parentheses (the last argument, 2, to the regexp function). The second fragment is the \d+ - the sequence of digits, without the leading and/or trailing dash -.
This solution will work in your example too, it's just overkill. It will find the right "digits-only" group in something like AS23302-ATX-20032-33900293-CWV20-3499-RA; it will return the second numeric group, 33900293.

Comparing fields when a field has data in between 2 characters that match the field being compared

I have code that looks like this:
left outer join
gme_batch_header bh
on
substr(ln.lot_number,instr(ln.lot_number,'(') + 1,
instr(ln.lot_number,')') - instr(ln.lot_number,'(') - 1)
=
bh.batch_no
It works fine, but I have come across a few lot numbers that have two sections of strings that are between parenthesis. How would I compare what is between the second set of parenthesis? Here is an example of the data in the lot number field:
E142059-307-SCRAP-(74055)
This one works with the code,
58LF-3-B-2-2-2 (SCRAP)-(61448)
This one tries comparing SCRAP with the batch no, which isn't correct. It needs to be the 61448.
The result is always the last item in parenthesis.
After more research, I actually got it to work with this code:
substr(ln.lot_number,instr(ln.lot_number,'(',-1) + 1, instr(ln.lot_number,')',-1) - instr(ln.lot_number,'(',-1) - 1)
Assuming SQL2005+, and it is always the last occurrence you want, then I would suggest finding the last instance of a ( in your query and substring to there. To get the last instance you could use something like:
REVERSE(SUBSTRING(REVERSE(lot_number),0,CHARINDEX('(',REVERSE(lot_number))))
If your version of Oracle supports regular expressions try this:
substr(regexp_substr(ln.lot_number,'[0-9]+\)$'),1,length(regexp_substr(ln.lot_number,'[0-9]+\)$'))-1)
Explanation:
regexp_substr(scrap_row,'[0-9]+\)$' ==> find me just numbers in the string that ends in ). This returns the numbers but it includes the closing parenthesis.
To remove the closing parenthsis, just send it through substring and extract first number through the length of the number stopping at 1 character from the end of the string.
Query for analysis:
with scrap
as (select '58LF-3-B-2-2-2 (SCRAP)-(61448)' as scrap_row from dual)
select scrap_row,
regexp_substr(scrap_row,'[0-9]+\)$') as regex_substring,
length(regexp_substr(scrap_row,'[0-9]+\)$')) as length_regex_substring,
substr(regexp_substr(scrap_row,'[0-9]+\)$'),1,length(regexp_substr(scrap_row,'[0-9]+\)$'))-1) as regex_sans_parenthesis
from scrap
If you have 11g, this will do it pretty simply by using the subgroup argument of regexp_substr() and constructing the regex appropriately:
SQL> with tbl(data) as
(
select 'E142059-307-SCRAP-(74055)' from dual
union
select '58LF-3-B-2-2-2 (SCRAP)-(61448)' from dual
)
select data from tbl
where regexp_substr(data, '\((\d+)\)$', 1, 1, NULL, 1)
= '61448';
DATA
------------------------------
58LF-3-B-2-2-2 (SCRAP)-(61448)
The regular expression can be read as:
\( - Search for a literal left paren
( - Start a remembered subgroup
\d+ - followed by 1 more more digits
) - End remembered subgroup
\) - followed by a literal right paren
$ - at the end of the line.
The regexp_substr function arguments are:
Source - the source string
Pattern - The regex pattern to look for
position - Position in the string to start looking for the pattern
occurrence - If the pattern occurs multiple times, which occurrence you want
match_params - See the docs, not used here
subexpression - which subexpression to use (the remembered group)
So in English, look for a series of 1 or more digits surrounded by parens, where it occurs at the end of the line and save the digit part only to use to compare. IMHO a lot easier to follow/maintain than nested instr(), substr().
For re-useability, make a function called get_last_number_in_parens() that contains this code and uses an argument of the string to search. This way that logic is encapsulated and can be re-used by folks that may not be so comfortable with regular expressions, but can benefit from the power! One place to maintain code too. Then call like this:
select data from tbl
where get_last_number_in_parens(data) = '61448';
How easy is that?!
Hello you can check with this code. It works whaever the condition may be
SELECT SUBSTR('58LF-3-B-2-2-2-(61448)',instr('58LF-3-B-2-2-2-(61448)','(',-1)+1,LENGTH('58LF-3-B-2-2-2-(61448)')-instr('58LF-3-B-2-2-2-(61448)','(',-1)-1)
FROM dual;
SELECT SUBSTR('58LF-3-B-2-2-2 (SCRAP)-(61448)',instr('58LF-3-B-2-2-2 (SCRAP)-(61448)','(',-1)+1,LENGTH('58LF-3-B-2-2-2 (SCRAP)-(61448)')-instr('58LF-3-B-2-2-2 (SCRAP)-(61448)','(',-1)-1)
FROM dual;
Output
==================================
61448
==================================

How to extract group from regular expression in Oracle?

I got this query and want to extract the value between the brackets.
select de_desc, regexp_substr(de_desc, '\[(.+)\]', 1)
from DATABASE
where col_name like '[%]';
It however gives me the value with the brackets such as "[TEST]". I just want "TEST". How do I modify the query to get it?
The third parameter of the REGEXP_SUBSTR function indicates the position in the target string (de_desc in your example) where you want to start searching. Assuming a match is found in the given portion of the string, it doesn't affect what is returned.
In Oracle 11g, there is a sixth parameter to the function, that I think is what you are trying to use, which indicates the capture group that you want returned. An example of proper use would be:
SELECT regexp_substr('abc[def]ghi', '\[(.+)\]', 1,1,NULL,1) from dual;
Where the last parameter 1 indicate the number of the capture group you want returned. Here is a link to the documentation that describes the parameter.
10g does not appear to have this option, but in your case you can achieve the same result with:
select substr( match, 2, length(match)-2 ) from (
SELECT regexp_substr('abc[def]ghi', '\[(.+)\]') match FROM dual
);
since you know that a match will have exactly one excess character at the beginning and end. (Alternatively, you could use RTRIM and LTRIM to remove brackets from both ends of the result.)
You need to do a replace and use a regex pattern that matches the whole string.
select regexp_replace(de_desc, '.*\[(.+)\].*', '\1') from DATABASE;