What is the optimum SQL for producing the below output?

The input column has comma separated integer values like
Sample input 1)
1,2,200000,2345323,1200000
Sample Output 1)
1,2,2345323
Sample input 2)
546^515,400000,657180,3
Sample Output 2)
546^515,657180,3
The output string should filter out all integers that have five trailing zeros.

Excising the five-zero numbers is straightforward with regexp_replace(). But it seems we also need to tidy up the commas left behind. This solution has another regexp_replace() to catch the ,, when the excised number is inside the string, and a simple trim() for when the number is the first or last in the string:
with cte as (
select '1,2,200000,2345323,1200000' as str from dual union all
select '546^515,400000,657180,3' as str from dual union all
select '546^515,400000,1200000,657180,3' as str from dual union all
select '54000000,400000,1200000,657180,300000000' as str from dual
)
select trim(both ',' from regexp_replace(regexp_replace(str, '([0-9]+)00000', null),',(,+)',',')) as new_str
from cte
/
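One thing to watch with the pattern above: it is not anchored to the commas, so it can also match inside a longer number (3000007, for example, would be reduced to 7). A minimal sketch of a delimiter-anchored variant follows. It assumes the elements never contain commas themselves; the third sample row is added here for illustration and is not from the question. Doubling the commas first gives every element its own leading and trailing comma, so neighbouring elements can each be matched as a whole.

```sql
with cte as (
  select '1,2,200000,2345323,1200000' as str from dual union all
  select '546^515,400000,657180,3' as str from dual union all
  select '3000007,400000,5' as str from dual   -- illustrative row, not from the question
)
select str,
       trim(both ',' from
         replace(
           regexp_replace(
             ',' || replace(str, ',', ',,') || ',',  -- each element now sits between its own commas
             ',[0-9]+0{5},',                         -- a whole element: digits ending in five zeros
             null),
           ',,', ',')                                -- collapse the doubled commas again
       ) as new_str
from cte;
```

For the three rows this should give 1,2,2345323, then 546^515,657180,3, then 3000007,5.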

Related

Oracle replace some duplicated characters (non digits )

Can anyone help me build the proper regexp_replace syntax to remove any repeated non-digit, non-letter characters from a string? If a digit or letter is repeated, it should not be changed.
For example,
source and expected result:
'ABBC000001223, ABC00000212,,, '
'ABBC000001223, ABC00000212, '
(The second occurrence of the space after the comma and the second and third commas are removed.)
Use this REGEXP_REPLACE to match any non-alphanumeric character in the first group
([^[:alnum:]])
followed by one or more of the same character (a back-reference to group 1)
([^[:alnum:]])(\1)+
and replace it with the original character (group 1).
I added some other data to demonstrate the result:
with dta as (
select 'ABBC000001223, ABC00000212,,, ' txt from dual union all
select ',.,;,;;;;,,,,,,,,,,,,#''++`´' txt from dual union all
select 'ABBC000001223ABC00000212' txt from dual)
select txt,
regexp_replace(txt,'([^[:alnum:]])(\1)+', '\1') result
from dta
TXT
-------------------------------
RESULT
--------------------------------
ABBC000001223, ABC00000212,,,
ABBC000001223, ABC00000212,
,.,;,;;;;,,,,,,,,,,,,#'++`´
,.,;,;,#'+`´
ABBC000001223ABC00000212
ABBC000001223ABC00000212

How to get first string after character Oracle SQL

I'm trying to get first string after a character.
Example is like
ABCDEF||GHJ||WERT
I need only
GHJ
I tried to use REGEXP but I couldn't do it.
Can anyone help me, please?
Thank you.
Somewhat simpler:
SQL> select regexp_substr('ABCDEF||GHJ||WERT', '\w+', 1, 2) result from dual;
                                                         ^
RES                                                      |
---                                            give me the 2nd "word"
GHJ

SQL>
which reads as: give me the 2nd word out of that string. Won't work properly if GHJ consists of several words (but that's not what your example suggests).
Another approach is to treat the string as delimited by a separator, in this case || or |. This example is for Oracle Database:
-- pattern --> [^|] matches any single character that is not a vertical bar (writing the bar twice, as in [^||], changes nothing); + means one or more such characters
-- 3rd parameter --> starting position
-- 4th parameter --> nth occurrence
WITH tbl(str) AS
(SELECT 'ABCDEF||GHJ||WERT' str FROM dual)
SELECT regexp_substr(str
,'[^||]+'
,1
,2) output
FROM tbl;
I think the most general solution is:
WITH tbl(str) AS (
SELECT 'ABCDEF||GHJ||WERT' str FROM dual UNION ALL
SELECT 'ABC|DEF||GHJ||WERT' str FROM dual UNION ALL
SELECT 'ABClDEF||GHJ||WERT' str FROM dual
)
SELECT regexp_replace(str, '^.*\|\|(.*)\|\|.*', '\1')
FROM tbl;
Note that this works even if the individual elements contain punctuation or a single vertical bar -- which the other solutions do not. Here is a comparison.
Presumably, the double vertical bar is being used for maximum flexibility.
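To see the difference between the approaches in this thread, a small side-by-side query can help. This is only a sketch: the column aliases are made up here, and the three rows are the ones from the answer above.

```sql
WITH tbl (str) AS (
  SELECT 'ABCDEF||GHJ||WERT'  FROM dual UNION ALL
  SELECT 'ABC|DEF||GHJ||WERT' FROM dual UNION ALL
  SELECT 'ABClDEF||GHJ||WERT' FROM dual
)
SELECT str,
       regexp_substr(str, '\w+', 1, 2)                AS second_word,         -- 2nd run of word characters
       regexp_substr(str, '[^|]+', 1, 2)              AS second_non_bar_run,  -- 2nd run of non-bar characters
       regexp_replace(str, '^.*\|\|(.*)\|\|.*', '\1') AS between_double_bars  -- text between two || separators
FROM tbl;
```

For the second row, the first two expressions should return DEF rather than GHJ, because a single vertical bar already ends a "word". One caveat worth checking on your own data: the leading .* in the regexp_replace pattern is greedy, so with more than two || separators it should return the element just before the last separator, not the first one after the first separator.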
You should use the regexp_substr function:
select regexp_substr('ABCDEF||GHJ||WERT ', '\|{2}([^|]+)', 1, 1, 'i', 1) str
from dual;
STR
---
GHJ
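If the element after the first || may itself contain a single vertical bar, the [^|]+ capture above stops early. A hedged variant is sketched below, assuming Oracle 11g or later (for the subexpression argument) and that || never appears inside an element. It uses a non-greedy .*? that ends at the next || or at the end of the string. The second and third rows are illustrative, not from the question.

```sql
WITH tbl (str) AS (
  SELECT 'ABCDEF||GHJ||WERT'      FROM dual UNION ALL
  SELECT 'ABC|DEF||GH|J||WERT'    FROM dual UNION ALL   -- single bars inside elements
  SELECT 'ABCDEF||GHJ||WERT||XYZ' FROM dual             -- more than two separators
)
SELECT str,
       -- capture group 1: the shortest text between the first || and the next || (or end of string)
       regexp_substr(str, '\|\|(.*?)(\|\||$)', 1, 1, NULL, 1) AS first_after_sep
FROM tbl;
```

This should return GHJ, GH|J and GHJ respectively.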

Using SQL, extract comma-separated numbers from the string 'HEADER|N1000|E1001|N1002|E1003|N1004|N1005'

'HEADER|N1000|E1001|N1002|E1003|N1004|N1005'
'HEADER|N156|E1|N7|E122|N4|E5'
'HEADER|E0|E1|E2|E3|E4|E5'
'HEADER|N0|N1|N2|N3|N4|N5'
'HEADER|N125'
How can I extract the numbers from these strings in comma-separated format?
Expected result:
1000,1001,1002,1003,1004,1005
How can I extract only the numbers that have N or E as a prefix, e.g.
N1000
Expected result:
1000,1002,1004,1005
The regex below does not return the result I need, but I want something like this:
select REGEXP_REPLACE(REGEXP_REPLACE('HEADER|N1000|E1001|N1002|E1003|N1004|N1005', '.*?(\d+)', '\1,'), ',?\.*$', '') from dual
The problem arises when I want only the numbers prefixed with E or N:
select REGEXP_REPLACE(REGEXP_REPLACE('HEADER|N1000|E1001|N1002|E1003|N1004|N1005', '.*?N(\d+)', '\1,'), ',?\.*$', '') from dual
select REGEXP_REPLACE(REGEXP_REPLACE('HEADER|N1000|E1001|N1002|E1003|N1004|N1005', '.*?E(\d+)', '\1,'), ',?\.*$', '') from dual
These give good results for this scenario, but when I input 'HEADER|N1000|E1001' they give the wrong answer. Please verify and correct them.
Update
Based on the changes to the question, the original answer is not valid. Instead, the solution is considerably more complex, using a hierarchical query to extract all the numbers from the string and then LISTAGG to put back together a list of numbers extracted from each string. To extract all numbers we use this query:
WITH cte AS (
SELECT DISTINCT data, level AS l, REGEXP_SUBSTR(data, '[NE]\d+', 1, level) AS num FROM test
CONNECT BY REGEXP_SUBSTR(data, '[NE]\d+', 1, level) IS NOT NULL
)
SELECT data, LISTAGG(SUBSTR(num, 2), ',') WITHIN GROUP (ORDER BY l) AS "All numbers"
FROM cte
GROUP BY data
Output (for the new sample data):
DATA                                         All numbers
HEADER|E0|E1|E2|E3|E4|E5                     0,1,2,3,4,5
HEADER|N0|N1|N2|N3|N4|N5                     0,1,2,3,4,5
HEADER|N1000|E1001|N1002|E1003|N1004|N1005   1000,1001,1002,1003,1004,1005
HEADER|N125                                  125
HEADER|N156|E1|N7|E122|N4|E5                 156,1,7,122,4,5
To select only numbers beginning with E, we modify the query to replace the [NE] in the REGEXP_SUBSTR expressions with just E, i.e.
SELECT DISTINCT data, level AS l, REGEXP_SUBSTR(data, 'E\d+', 1, level) AS num FROM test
CONNECT BY REGEXP_SUBSTR(data, 'E\d+', 1, level) IS NOT NULL
Output:
DATA                                         E-numbers
HEADER|E0|E1|E2|E3|E4|E5                     0,1,2,3,4,5
HEADER|N0|N1|N2|N3|N4|N5
HEADER|N1000|E1001|N1002|E1003|N1004|N1005   1001,1003
HEADER|N125
HEADER|N156|E1|N7|E122|N4|E5                 1,122,5
A similar change can be made to extract numbers commencing with N.
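The DISTINCT in the query above is doing real work. With several rows in test and no PRIOR condition, CONNECT BY joins every row at one level to every row at the next, so the intermediate result grows quickly before DISTINCT collapses it. A common way to keep each row's hierarchy separate is sketched below. It assumes the data values are unique, and the inline test CTE is a stand-in for the real table.

```sql
WITH test (data) AS (
  SELECT 'HEADER|N1000|E1001|N1002|E1003|N1004|N1005' FROM dual UNION ALL
  SELECT 'HEADER|N156|E1|N7|E122|N4|E5' FROM dual UNION ALL
  SELECT 'HEADER|N125' FROM dual
),
cte AS (
  SELECT data, level AS l, REGEXP_SUBSTR(data, '[NE]\d+', 1, level) AS num
  FROM test
  CONNECT BY REGEXP_SUBSTR(data, '[NE]\d+', 1, level) IS NOT NULL
         AND PRIOR data = data               -- stay within the same source row
         AND PRIOR SYS_GUID() IS NOT NULL    -- non-deterministic call avoids a false cycle error
)
SELECT data, LISTAGG(SUBSTR(num, 2), ',') WITHIN GROUP (ORDER BY l) AS all_numbers
FROM cte
GROUP BY data;
```

The result should match the "All numbers" output above for these rows, without needing DISTINCT.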
Original Answer
One way to achieve your desired result is to replace a string of characters leading up to a number with that number and a comma, and then replace any characters from the last ,| to the end of string from the result:
SELECT REGEXP_REPLACE(REGEXP_REPLACE('HEADER|N1000|E1001|N1002|E1003|N1004|N1005|', '.*?(\d+)', '\1,'), ',?\|.*$', '') FROM dual
Output:
1000,1001,1002,1003,1004,1005
To only output the numbers beginning with N, we add that to the prefix string before the capture group:
SELECT REGEXP_REPLACE(REGEXP_REPLACE('HEADER|N1000|E1001|N1002|E1003|N1004|N1005|', '.*?N(\d+)', '\1,'), ',?\|.*$', '') FROM dual
Output:
1000,1002,1004,1005
To only output the numbers beginning with E, we add that to the prefix string before the capture group:
SELECT REGEXP_REPLACE(REGEXP_REPLACE('HEADER|N1000|E1001|N1002|E1003|N1004|N1005|', '.*?E(\d+)', '\1,'), ',?\|.*$', '') FROM dual
Output:
1001,1003
I don't know what DBMS you are using, but here's one way to do it in Postgres:
WITH cte AS (
SELECT CAST('HEADER|N1000|E1001|N1002|E1003|N1004|N1005|' AS VARCHAR(1000)) AS myValue
)
SELECT SUBSTRING(MyVal FROM 2)
FROM (
SELECT REGEXP_SPLIT_TO_TABLE(myValue,'\|') MyVal
FROM cte
) src
WHERE SUBSTRING(MyVal FROM 1 FOR 1) = 'N'
;
As far as I understand the question, you want to extract the substrings starting with N from the string. You can try the following, which uses SQL Server's STRING_SPLIT (and then merge the output, separated by commas, if needed):
select REPLACE(value, 'N', '')
from STRING_SPLIT('HEADER|N1000|E1001|N1002|E1003|N1004|N1005|', '|')
where value like 'N%'
Output:
1000
1002
1004
1005
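For the merge step mentioned above, one option is STRING_AGG. This is a sketch that assumes SQL Server 2017 or later, and the alias n_numbers is made up here. Note that STRING_SPLIT without its ordinal option does not promise to return the pieces in their original order, so the order of the merged list is not guaranteed.

```sql
SELECT STRING_AGG(REPLACE(value, 'N', ''), ',') AS n_numbers   -- join the stripped values with commas
FROM STRING_SPLIT('HEADER|N1000|E1001|N1002|E1003|N1004|N1005|', '|')
WHERE value LIKE 'N%';
```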

Oracle SQL query to convert a string into a comma separated string with comma after every n characters

How can we convert a string of any length into a comma separated string with comma after every n characters. I am using Oracle 10g and above. I tried with REGEXP_SUBSTR but couldn't get desired result.
e.g.: for below string comma after every 5 characters.
input:
aaaaabbbbbcccccdddddeeeeefffff
output:
aaaaa,bbbbb,ccccc,ddddd,eeeee,fffff,
or
aaaaa,bbbbb,ccccc,ddddd,eeeee,fffff
Thanks in advance.
This can be done with regexp_replace, like so:
WITH sample_data AS (SELECT 'aaaaabbbbbcccccdddddeeeeefffff' str FROM dual UNION ALL
SELECT 'aaaa' str FROM dual UNION ALL
SELECT 'aaaaabb' str FROM dual)
SELECT str,
regexp_replace(str, '(.{5})', '\1,')
FROM sample_data;
STR                            REGEXP_REPLACE(STR,'(.{5})','\
------------------------------ --------------------------------------------------------------------------------
aaaaabbbbbcccccdddddeeeeefffff aaaaa,bbbbb,ccccc,ddddd,eeeee,fffff,
aaaa                           aaaa
aaaaabb                        aaaaa,bb
The regexp_replace simply looks for any 5 characters (.{5}), and then replaces them with the same 5 characters plus a comma. The brackets around the .{5} turn it into a labelled subexpression - \1, since it's the first set of brackets - which we can then use to represent our 5 characters in the replacement section.
You would then need to trim the extra comma off the resultant string, if necessary.
SELECT RTRIM ( REGEXP_REPLACE('aaaaabbbbbcccccdddddeeeeefffff', '(.{5})' ,'\1,') ,',') replaced
FROM DUAL;
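Since the question asks for "every n characters", the chunk size can be concatenated into the pattern rather than hard-coded. A minimal sketch, where the column n and the alias chunked are made up for illustration:

```sql
WITH sample_data (str, n) AS (
  SELECT 'aaaaabbbbbcccccdddddeeeeefffff', 5 FROM dual UNION ALL
  SELECT 'aaaaabb', 3 FROM dual
)
SELECT str, n,
       -- build the pattern '(.{n})' from the column value, then trim the trailing comma
       RTRIM(REGEXP_REPLACE(str, '(.{' || n || '})', '\1,'), ',') AS chunked
FROM sample_data;
```

This should return aaaaa,bbbbb,ccccc,ddddd,eeeee,fffff and aaa,aab,b. Two caveats: RTRIM would also strip commas that were already at the end of the input, and by default . does not match a newline.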
This worked for me:
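WITH strlen AS
(
  SELECT 'aaaaabbbbbcccccdddddeeeeefffffggggg' AS input,
         LENGTH('aaaaabbbbbcccccdddddeeeeefffffggggg') AS len,
         5 AS part
  FROM dual
),
pattern AS
(
  -- one row per complete 5-character chunk; pos records the chunk's position
  -- (a trailing remainder shorter than 5 characters is not returned)
  SELECT LEVEL AS pos,
         regexp_substr(strlen.input, '[[:alnum:]]{5}', 1, LEVEL) || ',' AS line
  FROM strlen
  CONNECT BY LEVEL <= strlen.len / strlen.part
)
-- order by pos (not the constant 1) so the chunks keep their original order
SELECT rtrim(listagg(line, '') WITHIN GROUP (ORDER BY pos), ',') AS big_bang$
FROM pattern;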

regexp_replace / Substr extract string before and after dashes Oracle

I have a string and I am not able to extract the single characters that are bounded by dashes. I wrote Replace(REGEXP_SUBSTR(string,'.*-[[:alnum:]]-'),'-') but it is not giving the expected output.
I need,
XTT-D-X-K-345ROCKVIEW-CA Output = > D X K
RT-5-345REDE Output = > 5
FT-5-3-345HOTELWI Output = > 5 3
But I am getting
XTT-D-X-K-
RT-5-
FT-5-3-
I need to add something that I am not able to figure out. Maybe it can be done using just a regexp instead of wrapping the regexp in replace.
Try this (the outer TRIM removes the spaces left by the first and last dash):
SELECT TRIM(Replace(REGEXP_SUBSTR(str, '\-([[:alnum:]]\-)+'), '-', ' ')) as outstr
FROM (SELECT 'XTT-D-X-K-345ROCKVIEW-CA' AS str FROM dual
UNION ALL SELECT 'RT-5-345REDE' AS str FROM dual
UNION ALL SELECT 'FT-5-3-345HOTELWI' AS str FROM dual