Use a Regex in SQL statement to extract several strings - Oracle

I want to extract some strings from the VAL column, according to the regex shown further below. This is an example of my source data:
Table1
-----------------
ID VAL
-----------------
1 GR-RDE
2 GR-RZA-RDE
3 GR-RZA-RDE_RZA
4 GR-RGS
5 GR-RZA-OR-ORC
6 GR-RZA-RDE-OR-ORC_RZA
Desired result :
> Output
-----------------
ID RESULT
-----------------
1 RDE
2 RZA
2 RDE
3 RZA
3 RDE
4 RGS
5 RZA
5 OR
6 RZA
6 RDE
6 OR
To do that, I wrote this regex:
(?<=-)(RDE|RZA|RGS|OR)(?![A-Z])
(?<=-) : checks that the character before is '-'
(RDE|RZA|RGS|OR) : search for 'RDE', 'RZA', 'RGS', 'OR' strings
(?![A-Z]) : ignore the string if it's followed by a letter
The regex works perfectly and ignores all the unwanted parts.
My problem is that I can't find a way to use this regex in a SQL statement (Oracle database). I tried a test like the following, which returns NULL:
select REGEXP_SUBSTR(VAL,'(?<=-)(RDE|RZA|RGS|OR)(?![A-Z])') from Table1;

SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE Table1 ( ID, VAL ) AS
SELECT 1, 'GR-RDE' FROM DUAL UNION ALL
SELECT 2, 'GR-RZA-RDE' FROM DUAL UNION ALL
SELECT 3, 'GR-RZA-RDE_RZA' FROM DUAL UNION ALL
SELECT 4, 'GR-RGS' FROM DUAL UNION ALL
SELECT 5, 'GR-RZA-OR-ORC' FROM DUAL UNION ALL
SELECT 6, 'GR-RZA-RDE-OR-ORC_RZA' FROM DUAL
Query 1:
WITH words ( id, val, lvl, str, maxlvl ) AS (
SELECT id,
val,
1,
REGEXP_SUBSTR( val, '[A-Z]+', 1, 1 ),
REGEXP_COUNT( val, '[A-Z]+' )
FROM table1
UNION ALL
SELECT id,
val,
lvl + 1,
REGEXP_SUBSTR( val, '[A-Z]+', 1, lvl + 1 ),
maxlvl
FROM words
WHERE lvl < maxlvl
)
SELECT id, str, lvl
FROM words
ORDER BY id, lvl
Results:
| ID | STR | LVL |
|----|-----|-----|
| 1 | GR | 1 |
| 1 | RDE | 2 |
| 2 | GR | 1 |
| 2 | RZA | 2 |
| 2 | RDE | 3 |
| 3 | GR | 1 |
| 3 | RZA | 2 |
| 3 | RDE | 3 |
| 3 | RZA | 4 |
| 4 | GR | 1 |
| 4 | RGS | 2 |
| 5 | GR | 1 |
| 5 | RZA | 2 |
| 5 | OR | 3 |
| 5 | ORC | 4 |
| 6 | GR | 1 |
| 6 | RZA | 2 |
| 6 | RDE | 3 |
| 6 | OR | 4 |
| 6 | ORC | 5 |
| 6 | RZA | 6 |
Query 2:
SELECT t.id, w.COLUMN_VALUE AS str
FROM Table1 t
CROSS JOIN
TABLE(
CAST(
MULTISET(
SELECT REGEXP_SUBSTR( t.val, '[A-Z]+', 1, LEVEL )
FROM DUAL
CONNECT BY LEVEL <= REGEXP_COUNT( t.val, '[A-Z]+' )
) AS SYS.ODCIVARCHAR2LIST
)
) w
Results:
| ID | STR |
|----|-----|
| 1 | GR |
| 1 | RDE |
| 2 | GR |
| 2 | RZA |
| 2 | RDE |
| 3 | GR |
| 3 | RZA |
| 3 | RDE |
| 3 | RZA |
| 4 | GR |
| 4 | RGS |
| 5 | GR |
| 5 | RZA |
| 5 | OR |
| 5 | ORC |
| 6 | GR |
| 6 | RZA |
| 6 | RDE |
| 6 | OR |
| 6 | ORC |
| 6 | RZA |
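
Neither query above filters the tokens down to the desired output. Oracle's regular expression functions do not support lookbehind or lookahead, so the original pattern cannot be used as written. The following is a minimal sketch of one way to emulate it, assuming the same Table1 as above: it extracts every run of uppercase letters that directly follows a hyphen, strips the hyphen, and keeps only the four wanted codes. The CTE name tokens and the column name tok are illustrative, not part of the original answer.

```sql
-- Sketch only: emulates (?<=-)(RDE|RZA|RGS|OR)(?![A-Z]) without lookarounds.
-- A token '-[A-Z]+' is a hyphen plus the whole run of uppercase letters after it,
-- so '-ORC' is one token and never matches 'OR', and '_RZA' is never a token.
WITH tokens ( id, val, lvl, tok, maxlvl ) AS (
  SELECT id,
         val,
         1,
         REGEXP_SUBSTR( val, '-[A-Z]+', 1, 1 ),
         REGEXP_COUNT( val, '-[A-Z]+' )
  FROM   table1
  WHERE  REGEXP_COUNT( val, '-[A-Z]+' ) > 0
UNION ALL
  SELECT id,
         val,
         lvl + 1,
         REGEXP_SUBSTR( val, '-[A-Z]+', 1, lvl + 1 ),
         maxlvl
  FROM   tokens
  WHERE  lvl < maxlvl
)
SELECT id,
       SUBSTR( tok, 2 ) AS result   -- drop the leading hyphen
FROM   tokens
WHERE  SUBSTR( tok, 2 ) IN ( 'RDE', 'RZA', 'RGS', 'OR' )
ORDER BY id, lvl;
```

On the sample data this should return the eleven rows listed as the desired result in the question. The same filter could be applied to Query 2 by changing its pattern to '-[A-Z]+' and wrapping it in an outer SELECT.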

Related

Get the count of longest streak including the break point

I am working on a problem where I need the length of each customer's longest streak, but to get the exact result I also have to count the row where the streak breaks. My table looks like this:
+-----------------+--------+-------+
| customer_number | Months | Flags |
+-----------------+--------+-------+
| 1 | 12 | 1 |
| 1 | 1 | 1 |
| 1 | 2 | 1 |
| 1 | 3 | 1 |
| 1 | 4 | 1 |
| 1 | 5 | 1 |
| 1 | 8 | 1 |
| 1 | 9 | 1 |
| 1 | 10 | 1 |
| 1 | 11 | 1 |
| 6 | 12 | 1 |
| 6 | 1 | 1 |
| 6 | 2 | 1 |
| 6 | 3 | 1 |
| 6 | 4 | 1 |
| 6 | 5 | 4 |
| 6 | 9 | 1 |
| 6 | 10 | 1 |
| 6 | 11 | 1 |
| 7 | 5 | 1 |
| 8 | 9 | 1 |
| 8 | 10 | 1 |
| 8 | 11 | 1 |
| 9 | 9 | 1 |
| 9 | 10 | 1 |
| 9 | 11 | 1 |
| 10 | 11 | 1 |
+-----------------+--------+-------+
and my desired output is
+----------+--------------------+
| Customer | Consecutive streak |
+----------+--------------------+
| 1 | 10 |
| 6 | 6 |
| 7 | 1 |
| 8 | 3 |
| 9 | 3 |
| 10 | 1 |
+----------+--------------------+
This is the code I have:
SELECT customer_number, max(streak) max_consecutive_streak FROM (
SELECT customer_number, COUNT(*) as streak
FROM
(select *,
(row_number() over (order by customer_number) -
row_number() over (order by customer_number)
) as counts
from table1
) cc
group by customer_number, counts
)
GROUP BY 1;
It works well, except that for customer_number 6 it returns 5 and I want it to return 6. In other words, it should also count the row with flag 4 as part of the longest streak, because the streak breaks at that point. Any idea how I can achieve that?

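You can use a CTE with row_number (the sum over a comparison below relies on the database treating a comparison result as 0 or 1, as MySQL does):
with cte(r, id, months, flag) as (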
select row_number() over (order by c.customer_number), c.* from customers c
),
freq(id, t, f) as (
select c2.id, c2.f, count(*) from
(select c.id, (select sum(c1.flag!=c.flag) from cte c1 where c1.id=c.id and c1.r <= c.r) f from cte c)
c2 group by c2.id, c2.f
)
select id, max(f) from freq group by id;
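
A more portable way to include the break row is a gaps-and-islands query built on window functions. The sketch below makes two assumptions that are not in the question: a streak is a run of rows with flags = 1, and the table has a column that defines row order within a customer. That column is called seq here and is hypothetical, since the sample data shows no such column and Months alone does not give the displayed order.

```sql
-- Sketch only: seq is a HYPOTHETICAL column giving each customer's row order.
SELECT customer_number,
       MAX(streak) AS max_consecutive_streak
FROM (
       SELECT customer_number, grp, COUNT(*) AS streak
       FROM (
              SELECT customer_number,
                     -- Number of break rows (flags <> 1) strictly BEFORE this row.
                     -- A break row therefore stays in the group of the streak it ends.
                     COALESCE(
                       SUM(CASE WHEN flags <> 1 THEN 1 ELSE 0 END)
                         OVER (PARTITION BY customer_number
                               ORDER BY seq
                               ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING),
                       0) AS grp
              FROM table1
            ) marked
       GROUP BY customer_number, grp
     ) streaks
GROUP BY customer_number;
```

For customer 6 the five rows with flag 1 and the following row with flag 4 all get grp = 0, so that group counts 6 rows, and the remaining three rows form a second group of 3.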

Oracle SQL - Generate aggregate rows for certain rows using select

I have a table like below.
|FILE| ID |PARENTID|SHOWCHILD|CAT1|CAT2|CAT3|TOTAL|
|F1 | A1 | P1 | N | 3 | 2 | 6 | 11 |
|F2 | A2 | P2 | N | 4 | 7 | 3 | 14 |
|F3 | A3 | P1 | N | 3 | 1 | 1 | 5 |
|F4 | LG1| | Y | 6 | 3 | 7 | 16 |
|F5 | LG2| | Y | 4 | 7 | 3 | 14 |
Is it possible to compute the aggregate of CAT1, CAT2, CAT3 and TOTAL only for the rows that have SHOWCHILD = 'Y', and add that as an extra row to the result set?
|Tot| Res | Res | N | 10 | 10 | 10 | 30 |
Expected final output:
|FILE| ID |PARENTID|SHOWCHILD|CAT1|CAT2|CAT3|TOTAL|
|F1 | A1 | P1 | N | 3 | 2 | 6 | 11 |
|F2 | A2 | P2 | N | 4 | 7 | 3 | 14 |
|F3 | A3 | P1 | N | 3 | 1 | 1 | 5 |
|F4 | LG1| | Y | 6 | 3 | 7 | 16 |
|F5 | LG2| | Y | 4 | 7 | 3 | 14 |
|Tot | Res| Res | N | 10 | 10 | 10 | 30 |
Here I have added the Tot row (the last row), computed only from the rows that have SHOWCHILD = 'Y', to the result set.
I am looking for a solution that does not use UNION.
Any help on achieving the above result is highly appreciated.
Thank you.

One approach would be to use a union:
WITH cte AS (
SELECT "FILE", ID, PARENTID, SHOWCHILD, CAT1, CAT2, CAT3, TOTAL, 1 AS position
FROM yourTable
UNION ALL
SELECT 'Tot', 'Res', 'Res', 'N', SUM(CAT1), SUM(CAT2), SUM(CAT3), SUM(TOTAL), 2
FROM yourTable
WHERE SHOWCHILD = 'Y'
)
SELECT "FILE", ID, PARENTID, SHOWCHILD, CAT1, CAT2, CAT3, TOTAL
FROM cte
ORDER BY
position,
"FILE";

You can try using UNION ALL (FILE is a reserved word in Oracle, so it has to be quoted):
select "FILE",ID,PARENTID,SHOWCHILD,CAT1,CAT2,CAT3,TOTAL from table1
union all
select 'Tot','Res','Res','N',sum(cat1), sum(cat2),sum(cat3), sum(total)
from table1 where SHOWCHILD='Y'

I see you already accepted an answer, but you did ask for a solution that did not involve UNION. One such solution would be to use GROUPING SETS.
GROUPING SETS allow you to specify different grouping levels in your query and generate aggregates at each of those levels. You can use it to generate an output row for each record plus a single "total" row, as per your requirements. The function GROUPING can be used in expressions to identify whether each output row is in one group or the other.
Example, with test data:
with input_data ("FILE", "ID", PARENTID, SHOWCHILD, CAT1, CAT2, CAT3, TOTAL ) AS (
SELECT 'F1','A1','P1','N',3,2,6,11 FROM DUAL UNION ALL
SELECT 'F2','A2','P2','N',4,7,3,14 FROM DUAL UNION ALL
SELECT 'F3','A3','P1','N',3,1,1,5 FROM DUAL UNION ALL
SELECT 'F4','LG1','','Y',6,3,7,16 FROM DUAL UNION ALL
SELECT 'F5','LG2','','Y',4,7,3,14 FROM DUAL )
SELECT decode(grouping("FILE"),1,'Tot',"FILE") "FILE",
decode(grouping("ID"),1,'Res',"ID") "ID",
decode(grouping(parentid),1, 'Res',parentid) parentid,
decode(grouping(showchild),1, 'N',showchild) showchild,
decode(grouping("FILE"),1,sum(decode(showchild,'Y',cat1,0)),sum(cat1)) cat1,
decode(grouping("FILE"),1,sum(decode(showchild,'Y',cat2,0)),sum(cat2)) cat2,
decode(grouping("FILE"),1,sum(decode(showchild,'Y',cat3,0)),sum(cat3)) cat3,
decode(grouping("FILE"),1,sum(decode(showchild,'Y',total,0)),sum(total)) total
from input_data
group by grouping sets (("FILE", "ID", parentid, showchild), ())
+------+-----+-----+----------+-----------+------+------+------+-------+
| FILE | F2 | ID | PARENTID | SHOWCHILD | CAT1 | CAT2 | CAT3 | TOTAL |
+------+-----+-----+----------+-----------+------+------+------+-------+
| F1 | F1 | A1 | P1 | N | 3 | 2 | 6 | 11 |
| F2 | F2 | A2 | P2 | N | 4 | 7 | 3 | 14 |
| F3 | F3 | A3 | P1 | N | 3 | 1 | 1 | 5 |
| F4 | F4 | LG1 | - | Y | 6 | 3 | 7 | 16 |
| F5 | F5 | LG2 | - | Y | 4 | 7 | 3 | 14 |
| Tot | Tot | Res | Res | N | 10 | 10 | 10 | 30 |
+------+-----+-----+----------+-----------+------+------+------+-------+

Recursive query in Oracle to find children and sibling

I'm struggling with a hierarchical SQL query. I want to add two more columns holding the disp_order values of each row's children and siblings.
Children - should hold the disp_order of all of the row's children, grandchildren and so on.
Sibling - should hold the disp_order of the other rows that have the same parent.
+------------+-----+-------------+--------+
| disp_order | lvl | description | parent |
+------------+-----+-------------+--------+
| 0 | 1 | A | |
| 1 | 2 | B | 0 |
| 2 | 3 | C | 1 |
| 3 | 4 | D | 2 |
| 4 | 5 | E | 3 |
| 5 | 2 | F | 0 |
| 6 | 3 | G | 5 |
| 7 | 3 | H | 5 |
| 8 | 3 | I | 5 |
| 9 | 4 | J | 8 |
| 10 | 5 | K | 9 |
+------------+-----+-------------+--------+
What the result should be:
+------------+-----+-------------+--------+------------------------+---------+
| disp_order | lvl | description | parent | children | sibling |
+------------+-----+-------------+--------+------------------------+---------+
| 0 | 1 | A | | 1,2,3,4,5,6,7,8,9,10 | |
| 1 | 2 | B | 0 | 2,3,4 | 5 |
| 2 | 3 | C | 1 | 3,4 | |
| 3 | 4 | D | 2 | 4 | |
| 4 | 5 | E | 3 | | |
| 5 | 2 | F | 0 | 6,7,8,9,10 | 1 |
| 6 | 3 | G | 5 | | 7,8 |
| 7 | 3 | H | 5 | | 6,8 |
| 8 | 3 | I | 5 | 9,10 | 6,7 |
| 9 | 4 | J | 8 | 10 | |
| 10 | 5 | K | 9 | | |
+------------+-----+-------------+--------+------------------------+---------+
Here is my current query:
SELECT t.*,
( SELECT MAX( disp_order )
FROM tbl_pattern p
WHERE p.lvl = t.lvl - 1
AND p.disp_order < t.disp_order ) AS parent
FROM tbl_pattern t

Continuing from your previous question:
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE tbl_pattern ( order_no, code, disp_order, lvl, description ) AS
SELECT 'RM001-01', 1, 0, 1, 'HK140904-1A' FROM DUAL UNION ALL
SELECT 'RM001-01', 1, 1, 2, 'HK140904-1B' FROM DUAL UNION ALL
SELECT 'RM001-01', 1, 2, 3, 'HK140904-1B' FROM DUAL UNION ALL
SELECT 'RM001-01', 1, 3, 4, 'HK140904-1C' FROM DUAL UNION ALL
SELECT 'RM001-01', 1, 4, 5, 'HK140904-1D' FROM DUAL UNION ALL
SELECT 'RM001-01', 1, 5, 2, 'HK140904-1E' FROM DUAL UNION ALL
SELECT 'RM001-01', 1, 6, 3, 'HK140904-1E' FROM DUAL UNION ALL
SELECT 'RM001-01', 1, 7, 3, 'HK140904-1X' FROM DUAL UNION ALL
SELECT 'RM001-01', 1, 8, 4, 'HK140904-1E' FROM DUAL UNION ALL
SELECT 'RM001-01', 1, 9, 5, 'HK140904-1E' FROM DUAL;
Query 1:
WITH data ( order_no, code, disp_order, lvl, description, parent ) AS (
SELECT t.*,
( SELECT MAX( disp_order )
FROM tbl_pattern p
WHERE p.order_no = t.order_no
AND p.code = t.code
AND p.lvl = t.lvl - 1
AND p.disp_order < t.disp_order ) AS parent
FROM tbl_pattern t
)
SELECT d.*,
( SELECT LISTAGG( c.disp_order, ',' ) WITHIN GROUP ( ORDER BY c.disp_order )
FROM data c
START WITH c.parent = d.disp_order
AND c.order_no = d.order_no
AND c.code = d.code
CONNECT BY PRIOR c.disp_order = c.parent
AND PRIOR c.order_no = c.order_no
AND PRIOR c.code = c.code
) AS children,
( SELECT LISTAGG( c.disp_order, ',' ) WITHIN GROUP ( ORDER BY c.disp_order )
FROM data c
WHERE c.parent = d.parent
AND c.disp_order <> d.disp_order
AND c.order_no = d.order_no
AND c.code = d.code
) AS siblings
FROM data d
Results:
| ORDER_NO | CODE | DISP_ORDER | LVL | DESCRIPTION | PARENT | CHILDREN | SIBLINGS |
|----------|------|------------|-----|-------------|--------|-------------------|----------|
| RM001-01 | 1 | 0 | 1 | HK140904-1A | (null) | 1,2,3,4,5,6,7,8,9 | (null) |
| RM001-01 | 1 | 1 | 2 | HK140904-1B | 0 | 2,3,4 | 5 |
| RM001-01 | 1 | 2 | 3 | HK140904-1B | 1 | 3,4 | (null) |
| RM001-01 | 1 | 3 | 4 | HK140904-1C | 2 | 4 | (null) |
| RM001-01 | 1 | 4 | 5 | HK140904-1D | 3 | (null) | (null) |
| RM001-01 | 1 | 5 | 2 | HK140904-1E | 0 | 6,7,8,9 | 1 |
| RM001-01 | 1 | 6 | 3 | HK140904-1E | 5 | (null) | 7 |
| RM001-01 | 1 | 7 | 3 | HK140904-1X | 5 | 8,9 | 6 |
| RM001-01 | 1 | 8 | 4 | HK140904-1E | 7 | 9 | (null) |
| RM001-01 | 1 | 9 | 5 | HK140904-1E | 8 | (null) | (null) |

How to Find Items that Do NOT Have a pre-Pipe "Base" Value

I have a database with a column (obj_id) in a table (parts) where, for any row whose obj_id has the form 12345|..., there SHOULD also be a row with the base obj_id 12345.
So:
select obj_id from parts where obj_id like '12345%';
12345
12345|A
12345|B
12345|77
Now, someone violated the guideline and put in some items with the piped-value but not the base value w/o the pipe (e.g. 12378|J, 12378|8 but not 12378).
I need to know how to write a SQL query to find these piped-values that do NOT have their matching base (non-piped) value in the table.

Without some realistic sample data to work with, it's hard to know what you really want. Below are 2 queries that may be of assistance, but perhaps they will also show how useful sample data can be:
See this working at SQL Fiddle
CREATE TABLE PARTS
(id int, OBJ_ID varchar2(200))
;
INSERT ALL
INTO PARTS (id, OBJ_ID)
VALUES (1,'12345 12345|A 12345|B 12345|77')
INTO PARTS (id, OBJ_ID)
VALUES (2,'12346|A 12346|B 12346|77')
INTO PARTS (id, OBJ_ID)
VALUES (3,'12378|J, 12378|8')
INTO PARTS (id, OBJ_ID)
VALUES (4,NULL)
INTO PARTS (id, OBJ_ID)
VALUES (5,'fred. wilma, barney, betty')
SELECT * FROM dual
;
Query 1:
select
*
from PARTS p
where instr(p.OBJ_ID,' ') > instr(p.OBJ_ID,'|')
Results:
| ID | OBJ_ID |
|----|----------------------------|
| 2 | 12346|A 12346|B 12346|77 |
| 3 | 12378|J, 12378|8 |
| 5 | fred. wilma, barney, betty |
Query 2:
select
id, rn_a, regexp_substr (OBJ_ID_SPLIT, '[^|]+', 1, rn_b) as OBJ_ID_SPLIT
from (
select
p.id, c1.rn_a, regexp_substr (p.OBJ_ID, '[^ ]+', 1, c1.rn_a) as OBJ_ID_SPLIT
from PARTS p
cross join (select rownum as rn_a
from (select max(length (regexp_replace (OBJ_ID, '[^|]+'))) + 1 as mx
from PARTS
)
connect by level <= mx) c1
where p.OBJ_ID like '%|%'
) d
cross join (select 1 rn_b from dual union all select 2 from dual) c2
order by id, rn_a
Results:
| ID | RN_A | OBJ_ID_SPLIT |
|----|------|--------------|
| 1 | 1 | 12345 |
| 1 | 1 | (null) |
| 1 | 2 | 12345 |
| 1 | 2 | A |
| 1 | 3 | 12345 |
| 1 | 3 | B |
| 1 | 4 | 12345 |
| 1 | 4 | 77 |
| 2 | 1 | 12346 |
| 2 | 1 | A |
| 2 | 2 | 12346 |
| 2 | 2 | B |
| 2 | 3 | 12346 |
| 2 | 3 | 77 |
| 2 | 4 | (null) |
| 3 | 1 | 12378 |
| 3 | 1 | J, |
| 3 | 2 | 12378 |
| 3 | 2 | 8 |
| 3 | 3 | (null) |
| 3 | 4 | (null) |
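
If each obj_id is stored in its own row, as the sample output in the question suggests, the missing base values can be found with an anti-join. The sketch below makes that assumption and uses only the table and column names from the question. The aliases p and b are illustrative.

```sql
-- Sketch only: assumes one obj_id per row, e.g. '12378|J' and '12378|8'.
SELECT p.obj_id
FROM   parts p
WHERE  p.obj_id LIKE '%|%'                 -- only piped values
AND    NOT EXISTS (
         SELECT 1
         FROM   parts b
         -- the base value is everything before the first pipe
         WHERE  b.obj_id = SUBSTR( p.obj_id, 1, INSTR( p.obj_id, '|' ) - 1 )
       );
```

With the rows 12345, 12345|A, 12378|J and 12378|8 this should return only 12378|J and 12378|8, because 12378 has no base row.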

Order comments by thread path and by number of total votes

I'm having some trouble ordering comments by their thread path and by the number of upvotes of each comment.
Right now they are ordered only by thread path. I've searched and tried a lot of things, but nothing works.
This is my query
WITH RECURSIVE first_comments AS (
(
(
SELECT id, text, level, parent_id, array[id] AS thread_path, total_votes FROM comments
WHERE comments."postId" = 1 AND comments."level" = 0
)
)
UNION
(
SELECT e.id, e.text, e.level, e.parent_id, (fle.thread_path || e.id), e.total_votes
FROM
(
SELECT id, text, level, parent_id, total_votes FROM comments
WHERE comments."postId" = 1
) e, first_comments fle
WHERE e.parent_id = fle.id
)
)
SELECT id, text, level, total_votes, thread_path from first_comments ORDER BY 5 ASC
This query results in:
--------------------------------------------------
| id | level | total_votes | thread_path |
--------------------------------------------------
| 1 | 0 | 5 | {1} |
| 3 | 1 | 9 | {1,3} |
| 7 | 2 | 5 | {1,3,7} |
| 9 | 2 | 7 | {1,3,9} |
| 11 | 3 | 0 | {1,3,9,11} |
| 12 | 4 | 0 | {1,3,9,11,12} |
| 13 | 5 | 0 | {1,3,9,11,12,13} |
| 10 | 1 | 20 | {1,10} |
| 2 | 0 | 10 | {2} |
| 6 | 1 | 1 | {2,6} |
| 4 | 0 | 8 | {4} |
| 8 | 1 | 6 | {4,8} |
| 5 | 0 | 3 | {5} |
--------------------------------------------------
And the result should be
--------------------------------------------------
| id | level | total_votes | thread_path |
--------------------------------------------------
| 2 | 0 | 10 | {2} |
| 6 | 1 | 1 | {2,6} |
| 4 | 0 | 8 | {4} |
| 8 | 1 | 6 | {4,8} |
| 1 | 0 | 5 | {1} |
| 10 | 1 | 20 | {1,10} |
| 3 | 1 | 9 | {1,3} |
| 9 | 2 | 7 | {1,3,9} |
| 11 | 3 | 0 | {1,3,9,11} |
| 12 | 4 | 0 | {1,3,9,11,12} |
| 13 | 5 | 0 | {1,3,9,11,12,13} |
| 7 | 2 | 5 | {1,3,7} |
| 5 | 0 | 3 | {5} |
--------------------------------------------------
What am I missing here?
Thanks for the help.

Just accumulate another array next to path, which will contain not just the id of each comment in its path, but also the negated total_votes before each id. Because arrays compare element by element, ordering by that column sorts siblings by votes (highest first) at every level.
WITH RECURSIVE first_comments AS (
(
(
SELECT id, text, level, parent_id, array[id] AS path, total_votes,
array[-total_votes, id] AS path_and_votes
FROM comments
WHERE comments."postId" = 1 AND comments."level" = 0
)
)
UNION
(
SELECT e.id, e.text, e.level, e.parent_id, (fle.path || e.id), e.total_votes,
(fle.path_and_votes || -e.total_votes || e.id)
FROM
(
SELECT id, text, level, parent_id, total_votes FROM comments
WHERE comments."postId" = 1
) e, first_comments fle
WHERE e.parent_id = fle.id
)
)
SELECT id, text, level, total_votes, path from first_comments ORDER BY path_and_votes ASC
SQLFiddle (only data -- without the recursive CTE)

You want to order by the total votes at the top level. I would approach this by using a window function.
Instead of:
SELECT id, text, level, total_votes, path
from first_comments
ORDER BY 5 ASC;
which explicitly orders by the path. Try this:
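select id, text, level, total_votes,
first_value(total_votes) over (partition by path[1] order by path) as toplevel_votes
from first_comments
order by toplevel_votes desc, path;
This takes the votes of the top-level comment in each thread (the row with the shortest path) and orders the threads by that value, then by path within each thread.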
