Count characters and their position in a string - sql

I have a string like '###ABC##DE'. I need to find the position of each run of '#' characters and how many '#' characters that run contains. For example, in this case the output should be:
position count
======= ======
1 3
7 2
Is it possible to do this in a single SQL statement, or do we need a PL/SQL block here?

Here's a way of doing it in a SQL statement:
WITH sample_data AS (SELECT '###ABC##DE' str FROM dual UNION ALL
SELECT 'A#B#C#D#E##' str FROM dual UNION ALL
SELECT 'ABCDE' str FROM dual)
SELECT str,
NVL(length(regexp_substr(str, '#+', 1, LEVEL)), 0) num_hashes,
regexp_instr(str, '#+', 1, LEVEL) hash_pos
FROM sample_data
CONNECT BY regexp_substr(str, '#+', 1, LEVEL) IS NOT NULL
AND PRIOR str = str
AND PRIOR sys_guid() IS NOT NULL;
STR NUM_HASHES HASH_POS
----------- ---------- ----------
###ABC##DE 3 1
###ABC##DE 2 7
A#B#C#D#E## 1 2
A#B#C#D#E## 1 4
A#B#C#D#E## 1 6
A#B#C#D#E## 1 8
A#B#C#D#E## 2 10
ABCDE 0 0
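This uses a hierarchical query (that's the CONNECT BY part) to walk through the string, searching for runs of one or more # characters, and outputs one row per run. The PRIOR str = str condition keeps the rows of each source string separate (so it assumes the strings are unique), and PRIOR sys_guid() IS NOT NULL stops Oracle from reporting a loop, since each row would otherwise connect back to itself.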
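If only a single string needs to be processed, the two PRIOR conditions in the query above are not needed. A minimal sketch, assuming Oracle 11g or later (for regexp_count) and using the literal from the question; the column aliases are just illustrative names:

```sql
-- One row per run of '#': LEVEL is the run number.
SELECT LEVEL                                               AS run_no,
       regexp_instr('###ABC##DE', '#+', 1, LEVEL)          AS hash_pos,
       length(regexp_substr('###ABC##DE', '#+', 1, LEVEL)) AS num_hashes
FROM   dual
CONNECT BY LEVEL <= regexp_count('###ABC##DE', '#+');
-- Expected rows for this literal: (1, 1, 3) and (2, 7, 2)
```

Note that a string with no '#' at all still returns the LEVEL = 1 row, with position 0 and a NULL length, because the root row of a hierarchical query is always produced.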

Related

How do I extract the first 3 consonants from a string field in SQL?

How can I extract only the first 3 consonants from a field in records that contain names, and, if a name does not have 3 consonants, add the first vowel of the name?
For example, if I had the following record in the People table:
Field:Name
VALUE:Richard result=> RCH
FIELD:Name
VALUE:Paul result=> PLA
Here's one option; read comments within code.
Sample data:
with test (name) as
  (select 'Richard' from dual union all
   select 'Paul' from dual
  ),
-- Query begins here:
temp as
  -- val1 - consonants; val2 - vowels
  -- (the leading '#' is a dummy character mapped to itself, so that the
  --  second TRANSLATE argument list is not empty)
  (select
     name,
     translate(upper(name), '#AEIOU', '#') val1,
     translate(upper(name), '#BCDFGHJKLMNPQRSTVWXYZ', '#') val2
   from test
  )
-- finally: if there are enough consonants (val1's length is >= 3), return the first 3
-- letters (that's WHEN).
-- Otherwise, add as many vowels as necessary (that's what ELSE does)
select name,
       case when length(val1) >= 3 then substr(val1, 1, 3)
            else val1 || substr(val2, 1, 3 - length(val1))
       end result
from temp;
NAME RESULT
------- --------------
Richard RCH
Paul PLA
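To see what the two TRANSLATE calls produce on their own, here is a small sketch run against a literal instead of the test table. The aliases are illustrative, and the consonant list used here includes V:

```sql
-- TRANSLATE deletes every character of the second argument that has no
-- counterpart in the third; '#' maps to itself, everything else is removed.
select translate('PAUL', '#AEIOU', '#')                 as consonants, -- PL
       translate('PAUL', '#BCDFGHJKLMNPQRSTVWXYZ', '#') as vowels      -- AU
from dual;
```

With only two consonants available, the ELSE branch appends substr('AU', 1, 3 - 2), which gives PLA.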
Just for fun using regexp:
select
name
,substr(
regexp_replace(
upper(name)
,'^([AEIOU]*)([^AEIOU]*)([AEIOU]*)([^AEIOU]*)([AEIOU]*)([^AEIOU]*).*'
,'\2\4\6\1\3\5'
),1,3) as result
from test;
([AEIOU]*) - is a group of vowels, 0 or more characters
([^AEIOU]*) - is a group of not-vowels (or consonants in this case), 0 or more characters
so this regexp looks for a pattern (vowels1)(consonants1)(vowels2)(consonants2)(vowels3)(consonants3) and reorders it to (consonants1)(consonants2)(consonants3)(vowels1)(vowels2)(vowels3)
then we just take first 3 characters from the reordered string
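To make the reordering step visible, this sketch shows the full reordered string before SUBSTR cuts it down. The inline sample names and the alias reordered are only for illustration:

```sql
with test (name) as
  (select 'Richard' from dual union all
   select 'Isaiah' from dual
  )
select name,
       regexp_replace(
         upper(name),
         '^([AEIOU]*)([^AEIOU]*)([AEIOU]*)([^AEIOU]*)([AEIOU]*)([^AEIOU]*).*',
         '\2\4\6\1\3\5') as reordered
from test;
-- Richard -> RCHRDIA  (consonant groups R, CH, RD, then vowels I, A)
-- Isaiah  -> SHIAIA   (consonant groups S, H, then vowels I, AIA)
```

Keep in mind that [^AEIOU] matches anything that is not an uppercase vowel, so digits, spaces, or punctuation in a name would be treated like consonants.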
Full test case:
with test (name) as
(select 'Richard' from dual union all
select 'Paul' from dual union all
select 'Annete' from dual union all
select 'Anny' from dual union all
select 'Aiua' from dual union all
select 'Isaiah' from dual union all
select 'Sue' from dual
)
select
name
,substr(
regexp_replace(
upper(name)
,'^([AEIOU]*)([^AEIOU]*)([AEIOU]*)([^AEIOU]*)([AEIOU]*)([^AEIOU]*).*'
,'\2\4\6\1\3\5'
),1,3) as result
from test;
NAME RESULT
------- ------------
Richard RCH
Paul PLA
Annete NNT
Anny NNY
Aiua AIU
Isaiah SHI
Sue SUE
7 rows selected.

Oracle SQL splitting a string field on character count and append a letter at the end

I have a string field in an Oracle database table, and I want to add one letter at the beginning and one at the end of every group of at most 5 characters.
e.g. the original string = AAAAABBBBBCCCCCDDDDD
output string = EAAAAAFEBBBBDFECCCCCFEDDDDDF
Here, after each group of 5 characters is identified, E is added at the beginning and F at the end. How can I achieve this using an Oracle SQL select query?
Thanks
Assuming that your output string has a typo and that it should be 'EAAAAAFEBBBBBFECCCCCFEDDDDDF', this could be a way:
SELECT sys_connect_by_path( substr(x, 1 + (level-1)*5, 5) || 'F', 'E')
FROM (select 'AAAAABBBBBCCCCCDDDDD' x from dual) text
where connect_by_isleaf = 1
CONNECT BY level * 5 <= length(x)
How it works:
The CONNECT BY implements recursion, and the condition level * 5 <= length(x) makes it stop once level (the iteration number) is large enough to consume the whole string.
With this string, you have 4 groups of 5 characters each, so you get 4 iterations:
select level
FROM (select 'AAAAABBBBBCCCCCDDDDD' x from dual) text
CONNECT BY level * 5 <= length(x);
LEVEL
----------
1
2
3
4
Once you have recursion, at each iteration you need to get the nth group of characters:
select level, substr(x, 1 + (level-1)*5, 5)
FROM (select 'AAAAABBBBBCCCCCDDDDD' x from dual) text
CONNECT BY level * 5 <= length(x);
LEVEL SUBSTR(X,1+(LEVEL-1)*5,5)
---------- --------------------------------
1 AAAAA
2 BBBBB
3 CCCCC
4 DDDDD
Now you need a way to concatenate all these substrings, adding 'E' and 'F'; here I use sys_connect_by_path by concatenating the 'F' at the end and using the 'E' as separator:
select level, sys_connect_by_path( substr(x, 1 + (level-1)*5, 5) || 'F', 'E') result
FROM (select 'AAAAABBBBBCCCCCDDDDD' x from dual) text
CONNECT BY level * 5 <= length(x);
LEVEL RESULT
---------- ------------------------------
1 EAAAAAF
2 EAAAAAFEBBBBBF
3 EAAAAAFEBBBBBFECCCCCF
4 EAAAAAFEBBBBBFECCCCCFEDDDDDF
Last step, you just need the "last" row, that is the leaf of the tree generated by recursion, so you add
where connect_by_isleaf = 1
To handle the case in which the last group has fewer than 5 characters, you can change the CONNECT BY clause to:
CONNECT BY (level-1) * 5 < length(x);
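As a different technique from the hierarchical query above, the same wrapping could probably be done with a single REGEXP_REPLACE call. This is only a sketch under the assumption that the groups are simply consecutive runs of up to 5 characters; the 12-character sample string is made up to show a short last group:

```sql
-- Each match of (.{1,5}) is one group of at most 5 characters;
-- the replacement puts E in front of it and F after it.
select regexp_replace('AAAAABBBBBCC', '(.{1,5})', 'E\1F') as result
from dual;
-- EAAAAAFEBBBBBFECCF
```

One reason to consider it: as far as I know, SYS_CONNECT_BY_PATH raises ORA-30004 when a value contains the separator, so the CONNECT BY version would fail for input strings that contain the letter E.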

how to split string with multiple special characters in oracle

I have a string with 2 special characters as below
String :'PAN~HLASD4564P|VOTER_ID~VDD3455355'
I want output in 2 columns as below :
ID_TYPE VALUE
------- ------
PAN HLASD4564P
VOTER_ID VDD3455355
You can use CONNECT BY and REGEXP_SUBSTR as follows:
WITH YOUR_DATA AS (
  SELECT 'PAN~HLASD4564P|VOTER_ID~VDD3455355' AS STR
  FROM DUAL
) -- Your query starts from here
SELECT
  REGEXP_SUBSTR(NEW_STR, '[^~]+', 1, 1) AS ID,
  REGEXP_SUBSTR(NEW_STR, '[^~]+', 1, 2) AS VALUE
FROM
  (
    SELECT REGEXP_SUBSTR(STR, '[^|]+', 1, LEVEL) NEW_STR
    FROM YOUR_DATA
    CONNECT BY LEVEL <= 2
  );
ID VALUE
---------- -------------
PAN HLASD4564P
VOTER_ID VDD3455355
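The query above hardcodes CONNECT BY LEVEL <= 2, so it only fits strings with exactly two pairs. A possible generalisation, assuming Oracle 11g or later for REGEXP_COUNT, counts the '|' separators instead. The third pair in the sample string and the aliases id_type and id_value are hypothetical, added only for illustration:

```sql
WITH your_data AS (
  -- 'DL~X1234567' is a made-up third pair to show that the row count adapts
  SELECT 'PAN~HLASD4564P|VOTER_ID~VDD3455355|DL~X1234567' AS str
  FROM dual
)
SELECT regexp_substr(new_str, '[^~]+', 1, 1) AS id_type,
       regexp_substr(new_str, '[^~]+', 1, 2) AS id_value
FROM (
  SELECT regexp_substr(str, '[^|]+', 1, LEVEL) AS new_str
  FROM your_data
  -- number of pairs = number of '|' separators + 1
  CONNECT BY LEVEL <= regexp_count(str, '[|]') + 1
);
```

Like the original, this sketch assumes a single source row; with several rows the CONNECT BY would need extra conditions to keep the rows from being combined with each other.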

How to get tree hierarchy for a child node in oracle?

I need help writing a hierarchical query to get the parent path for a child node. I tried to use the sys_connect_by_path function, but the resulting path of parent titles exceeds the maximum length for a column (4000 chars). So I need to hold the path in a custom collection or a CLOB, and I am finding that difficult to come up with.
example:
contentid - parentid
0 - null
1 - 0
2 - 1
3 - 2
4 - 2
5 - 6
6 - 3
7 - 6
Expected result:
contentid - Expected result set
0 - null
1 - 0
2 - 1,0
3 - 2,1,0
4 - 2,1,0
5 - 6,3,2,1,0
6 - 3,2,1,0
7 - 6,3,2,1,0
Query to get parent path for a child node into a column
SELECT CHILD_ID,
PATH
FROM (SELECT sys_connect_by_path(CHILD_TITLE, '|') PATH
, connect_by_root(PARENT_ID) ROOT_ID, CHILD_ID
FROM table
CONNECT BY PRIOR CHILD_ID = PARENT_ID
ORDER BY CHILD_ID)
WHERE ROOT_ID IS NULL;
I need it in a clob/custom collection which can hold more than 4000 characters.
As long as you're on 11gR2 or higher you could use recursive subquery factoring instead of the connect by hierarchical syntax.
If your table is called t with:
CHILD_ID PARENT_ID CHILD_T
---------- ---------- -------
0 root
1 0 first
2 1 second
3 2 third
4 2 fourth
5 6 fifth
6 3 sixth
7 6 seventh
you can do:
with r (child_id, child_title, id_path, title_path) as (
select child_id, child_title, to_clob(null), to_clob(null)
from t
where parent_id is null
union all
select t.child_id, t.child_title,
t.parent_id ||','|| r.id_path, r.child_title ||'|'|| r.title_path
from r
join t on t.parent_id = r.child_id
)
select child_id, id_path, title_path
from r
order by child_id;
CHILD_ID ID_PATH TITLE_PATH
---------- -------------------- ----------------------------------------
0
1 0, root|
2 1,0, first|root|
3 2,1,0, second|first|root|
4 2,1,0, second|first|root|
5 6,3,2,1,0, sixth|third|second|first|root|
6 3,2,1,0, third|second|first|root|
7 6,3,2,1,0, sixth|third|second|first|root|
The anchor member turns the paths into CLOBs; the recursive member prepends each parent ID or title to the CLOB, which keeps it as that data type.
You can trim off the trailing comma/bar, or modify the query a bit so they never appear:
with r (parent_id, child_id, child_title, id_path, title_path) as (
select parent_id, child_id, child_title, to_clob(null), to_clob(null)
from t
where parent_id is null
union all
select t.parent_id, t.child_id, t.child_title,
t.parent_id || case when r.parent_id is not null then ',' end || r.id_path,
r.child_title || case when r.parent_id is not null then '|' end || r.title_path
from r
join t on t.parent_id = r.child_id
)
select child_id, id_path, title_path
from r
order by child_id;
CHILD_ID ID_PATH TITLE_PATH
---------- -------------------- ----------------------------------------
0
1 0 root
2 1,0 first|root
3 2,1,0 second|first|root
4 2,1,0 second|first|root
5 6,3,2,1,0 sixth|third|second|first|root
6 3,2,1,0 third|second|first|root
7 6,3,2,1,0 sixth|third|second|first|root
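The other option mentioned above, trimming the trailing comma/bar in the final select, might look like the following sketch. The inline t data is a shortened, illustrative copy of the sample table, and the recursive part is the first version of the query unchanged:

```sql
with t (child_id, parent_id, child_title) as (
  select 0, null, 'root'   from dual union all
  select 1, 0,    'first'  from dual union all
  select 2, 1,    'second' from dual
),
r (child_id, child_title, id_path, title_path) as (
  select child_id, child_title, to_clob(null), to_clob(null)
  from t
  where parent_id is null
  union all
  select t.child_id, t.child_title,
         t.parent_id ||','|| r.id_path, r.child_title ||'|'|| r.title_path
  from r
  join t on t.parent_id = r.child_id
)
-- RTRIM accepts CLOBs, so the paths keep their data type
select child_id,
       rtrim(id_path, ',')    as id_path,
       rtrim(title_path, '|') as title_path
from r
order by child_id;
```

A caveat with this variant: RTRIM would also strip a '|' that is genuinely the last character of the root title, so the CASE-based version is safer if titles can contain the separator.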
Your sample values don't demonstrate the need for a CLOB, but adding in more data to the dummy table shows the generated values can exceed 4k:
insert into t
select level + 7, level + 6, 'title'
from dual
connect by level <= 2000;
with r (...) -- as above
select max(length(id_path)), max(length(title_path))
from r;
MAX(LENGTH(ID_PATH)) MAX(LENGTH(TITLE_PATH))
-------------------- -----------------------
8920 12031
SYS_CONNECT_BY_PATH is pretty much an application of LISTAGG as demonstrated below: first you generate the rows you need, including CONNECT_BY_ROOT and LEVEL, and then you aggregate. Doing it a little more explicitly, as I show below, gives you more control over exactly what you want in the aggregate, in what order to use the levels, etc. (NOTE: I don't think that is how Oracle does it internally, since LISTAGG was added much later than SYS_CONNECT_BY_PATH, but logically that's how it works.)
So the problem either way is the 4,000 character limit. Side by side with the LISTAGG function, I show a different aggregation, using XMLAGG - which does not have the 4,000 character limit. With large input data the LISTAGG line will not work, but the XMLAGG line will work fine and will produce a CLOB. Good luck!
Query:
with
t ( child_id, parent_id ) as (
select 0, null from dual union all
select 1, 0 from dual union all
select 2, 1 from dual union all
select 3, 2 from dual union all
select 4, 2 from dual union all
select 5, 6 from dual union all
select 6, 3 from dual union all
select 7, 6 from dual
)
select child_id,
listagg(parent_id, ',') within group (order by lvl) as gen_tree_1,
rtrim(xmlcast(xmlagg(xmlelement(e, parent_id||',') order by lvl) as clob), ',')
as gen_tree_2
from ( select connect_by_root child_id as child_id, parent_id, level as lvl
from t
connect by child_id = prior parent_id
)
group by child_id
order by child_id
;
Output:
CHILD_ID GEN_TREE_1 GEN_TREE_2
---------- -------------------- --------------------
0
1 0 0
2 1,0 1,0
3 2,1,0 2,1,0
4 2,1,0 2,1,0
5 6,3,2,1,0 6,3,2,1,0
6 3,2,1,0 3,2,1,0
7 6,3,2,1,0 6,3,2,1,0
8 rows selected.

Preserve order when converting a delimited string to a column

I want to preserve the order of the items, which are provided as a comma-delimited string. The 5th item in my delimited string is null, and I need the 5th row to be null as well.
with test as
(select 'ABC,DEF,GHI,JKL,,MNO' str from dual
)
select rownum, regexp_substr (str, '[^,]+', 1, rownum) split
from test
connect by level <= length (regexp_replace (str, '[^,]+' )) + 1
The current result I'm getting puts this in the 6th position:
1 ABC
2 DEF
3 GHI
4 JKL
5 MNO
6
Order is preserved by your expression, but your regular expression doesn't match empty items: '[^,]+' requires at least one non-comma character, so the empty 5th item is skipped and MNO becomes the 5th match. The 6th row is NULL because there are no more matches after the 5th.
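A quick way to see the mismatch is to compare how many matches the pattern finds with how many items the string actually has. A small sketch, assuming Oracle 11g or later for REGEXP_COUNT; the aliases are illustrative:

```sql
select regexp_count('ABC,DEF,GHI,JKL,,MNO', '[^,]+')  as non_empty_items, -- 5
       regexp_count('ABC,DEF,GHI,JKL,,MNO', ',') + 1  as all_items        -- 6
from dual;
```

The CONNECT BY generates 6 rows, but only 5 matches exist, which is why the last row comes back empty instead of the 5th.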
You could do this instead:
with test as
  (select 'ABC,DEF,GHI,JKL,,MNO' str from dual
  )
SELECT rownum,
       rtrim(regexp_substr(str || ',', '[^,]*,', 1, rownum), ',') split
FROM test
CONNECT BY LEVEL <= length(regexp_replace(str, '[^,]+')) + 1;
ROWNUM SPLIT
---------- ---------------------------------------------------------------
1 ABC
2 DEF
3 GHI
4 JKL
5
6 MNO
6 rows selected
Or this:
with test as
  (select 'ABC,DEF,GHI,JKL,,MNO' str from dual
  )
SELECT rownum,
       regexp_substr(str, '([^,]*)(,|$)', 1, rownum, 'i', 1) split
FROM test
CONNECT BY LEVEL <= length(regexp_replace(str, '[^,]+')) + 1;
ROWNUM SPLIT
---------- ------------------------------------------------------------
1 ABC
2 DEF
3 GHI
4 JKL
5
6 MNO
6 rows selected
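Try something like this (note that it only handles single-character items, as in the sample string):
SELECT str,
       REPLACE(SUBSTR(str,
                      -- start right after the previous '~' (or at 1 for the first item)
                      CASE LEVEL
                        WHEN 1 THEN 0
                        ELSE INSTR(str, '~', 1, LEVEL - 1)
                      END + 1,
                      1),
               '~')
FROM (SELECT 'A~~C~~E' AS str FROM DUAL)
CONNECT BY LEVEL <= LENGTH(REGEXP_REPLACE(str, '[^~]+')) + 1;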
This one works:
SELECT ROWNUM,
       CAST(REGEXP_SUBSTR(str, '(.*?)(,|$)', 1, LEVEL, NULL, 1) AS CHAR(12)) output
FROM (SELECT 'ABC,DEF,GHI,JKL,,MNO' AS str FROM DUAL)
CONNECT BY LEVEL <= REGEXP_COUNT(str, ',') + 1;