Split column into multiple rows in ORACLE based on static length of substring - sql

I have seen multiple topics here for "Split column into multiple rows" but they all are based on some delimiter.
I want to split the column based on length in oracle.
Suppose i have a table
codes | product
--------------------------+--------
C111C222C333C444C555..... | A
codes are type VARCHAR2(800) and product is VARCHAR2(1).
Here in codes field we have many codes (maximum 200) which belongs to product A. and length of each code is 4 ( so C111, C222, C333 are different codes)
I want output of my select query like this-
code | product
---------------+-------
C111 | A
C222 | A
C333 | A
C444 | A
C555 | A
...
and so on.
please help me with this. Thanks in advance.

Here's yet another variation using regexp_substr() along with CONNECT BY to "loop" through the string by 4 character substrings:
SQL> with tbl(codes, product) as (
select 'C111C222C333C444C555', 'A' from dual union all
select 'D111D222D333', 'B' from dual
)
select regexp_substr(codes, '(.{4})', 1, level, null, 1) code, product
from tbl
connect by level <= (length(codes)/4)
and prior codes = codes
and prior sys_guid() is not null;
CODE P
-------------------- -
C111 A
C222 A
C333 A
C444 A
C555 A
D111 B
D222 B
D333 B
8 rows selected.
SQL>

Here is how I would do it. Let me know if you need more input / better explanations:
select substr(tt.codes,(((t.l-1)*4)+1),4) code,tt.product from tst_tab tt
join (select level l from dual connect by level <= (select max(length(codes)/4) from tst_tab)) t
on t.l <= length(tt.codes)/4
order by tt.product,t.l;
Some explanantions:
-- this part gives the numbers from 1 ... maximum number of codes in codes column
select level l from dual connect by level <= (select max(length(codes)/4) from tst_tab);
-- here is the query without the code extraction, it is just the numbers 1... numbers of codes for the product
select t.l,tt.product from tst_tab tt
join (select level l from dual connect by level <= (select max(length(codes)/4) from tst_tab)) t
on t.l <= length(tt.codes)/4
order by tt.product,t.l;
-- and then the substr just extracts the right code:
substr(tt.codes,(((t.l-1)*4)+1),4)
Set up of my test data:
create table tst_tab (codes VARCHAR2(800),product VARCHAR2(1));
insert into tst_tab values ('C111C222C333C444C555','A');
insert into tst_tab values ('C111C222C333C444C555D666','B');
insert into tst_tab values ('C111','C');
commit;

One option might be this:
SQL> with test (codes, product) as
2 (select 'C111C222C333C444C555', 'A' from dual union all
3 select 'D555D666D777', 'B' from dual
4 )
5 select substr(codes, 4 * (column_value - 1) + 1, 4) code, product
6 from test,
7 table(cast(multiset(select level from dual
8 connect by level <= length(codes) / 4
9 ) as sys.odcinumberlist))
10 order by 1;
CODE P
---- -
C111 A
C222 A
C333 A
C444 A
C555 A
D555 B
D666 B
D777 B
8 rows selected.
SQL>

Yet another a little bit different option of using recursive SQL to do this.
(To make it more concise I didn't add an example of test data. It could be taken from #Littlefoot or #Peter answers)
select code, product
from (
select distinct
substr(codes, (level - 1) * 4 + 1, 4) as code,
level as l,
product
from YourTable
connect by substr(codes, (level - 1) * 4 + 1, 4) is not null
)
order by product, l
P.S. #Thorsten Kettner made a fair point about considering to restructure your tables. That would be the right thing to do for sake of easier maintenance of your database in future

Related

SQL Connect By Level sometimes works, sometimes doesn't can't understand why

I am trying to run the query in Oracle, and if I change the round to 0, I get a result, but anytime there are decimals I am not getting a result back when using the connect by level part. But if I run I my query from after n.n= I get the result.
Reason I am trying to use the connect by level is I have a requirement to put my entire query into the where clause as in the application there is a restriction to do the group by clause I need.
SELECT n.n
FROM
(SELECT TO_NUMBER( LEVEL) - 1 n FROM DUAL CONNECT BY LEVEL <= 1000 ) n
WHERE n.n =
(subquery)
Examples of values I have which work in HOURS seem to be like whole number, wo when these are summed they are still whole numbers
5
10
5
5
20
But where I have seen the query not work is where I have decimal values such as:
3.68
2.45
5
10
5
Table:ASSIGNMENTS_M
Table: RESULT_VALUES
Columns: Result_ID, Assignment_ID, Date_Earned, Hours
INSERT INTO RESULT_VALUES(Result_ID, Assignment_ID, Date_Earned, Hours) VALUES(50,123456,to_date('01/02/2020', 'DD/MM/YYYY'),3.68 51,230034,to_date('02/02/2020', 'DD/MM/YYYY'),5 52,123456,to_date('03/02/2020', 'DD/MM/YYYY'),10 53,123456,to_date('04/02/2020', 'DD/MM/YYYY'),5 60,123456,to_date('05/02/2020', 'DD/MM/YYYY'),5 90,123456,to_date('06/02/2020', 'DD/MM/YYYY'),5 2384,123456,to_date('07/02/2020', 'DD/MM/YYYY'),10);
Expected Result = 38.68
Here's one solution, even though it's odd you want to do this:
The adjusted fiddle:
Working test case
This increments by 0.1 to find the matching row:
SELECT n.n
FROM ( SELECT TO_NUMBER(LEVEL)/10 - 1 n FROM DUAL CONNECT BY LEVEL <= 1000 ) n
WHERE n.n = (
SELECT round((sum(P2.HOURS)),1) FTE
FROM ASSIGNMENTS_M P1, RESULT_HOURS P2
WHERE P2.date_earned BETWEEN to_date('2020/01/01','YYYY/MM/DD') AND to_date('2020/10/31','YYYY/MM/DD')
AND P1.ASSIGNMENT_ID = 123456
GROUP BY P1.ASSIGNMENT_ID
)
;
This increments by 1 to find the matching row, but adjusts the calculation to allow this:
SELECT n.n / 10
FROM ( SELECT TO_NUMBER(LEVEL) - 1 n FROM DUAL CONNECT BY LEVEL <= 1000 ) n
WHERE n.n = (
SELECT round((sum(P2.HOURS)),1) FTE
FROM ASSIGNMENTS_M P1, RESULT_HOURS P2
WHERE P2.date_earned BETWEEN to_date('2020/01/01','YYYY/MM/DD') AND to_date('2020/10/31','YYYY/MM/DD')
AND P1.ASSIGNMENT_ID = 123456
GROUP BY P1.ASSIGNMENT_ID
) * 10
;
The result:
None of your results match the number sequence generated by the n derived table:
SELECT p1.assignment_id, round((sum(P2.HOURS)),1) FTE
FROM ASSIGNMENTS_M P1, RESULT_HOURS P2
WHERE P2.date_earned BETWEEN to_date('2020/01/01','YYYY/MM/DD') AND to_date('2020/10/31','YYYY/MM/DD')
AND P1.ASSIGNMENT_ID = 123456
GROUP BY P1.ASSIGNMENT_ID
;
Result:
+---------------=+
| id | fte |
+----------------+
| 123456 | 43.7 |
+----------------+
That's the reason. Now how do you want to change this logic?
Do you want an approximate comparison or do you want your sequence to be in 0.1 increments?

Rewrite a query with "LEVEL" in snowflake

How can I write this SQL query into SNOWFLAKE?
SELECT LEVEL lv FROM DUAL CONNECT BY LEVEL <= 3;
You can find some good starting points by using CONNECT BY (https://docs.snowflake.com/en/sql-reference/constructs/connect-by.html) and here (https://docs.snowflake.com/en/user-guide/queries-hierarchical.html).
Snowflake is also supporting recursive CTEs.
It seems like you want to duplicate the rows three times. For a fixed, small multiplier such as 3, you could just enumerate the numbers:
select c.lv, a.*
from abc a
cross join (select 1 lv union all select 2 union all select 3) c
A more generic approach, in the spirit of the original query, uses a standard recursive common-table-expression to generate the numbers:
with cte as (
select 1 lv
union all select lv + 1 from cte where lv <= 3
)
select c.lv, a.*
from abc a
cross join cte c
There is level in snowflake. The differences from Oracle are:
In snowflake it's neccesary to use prior with connect by expression.
And you can't just select level - there should be any existing column in the select statement.
Example:
SELECT LEVEL, dummy FROM
(select 'X' dummy ) DUAL
CONNECT BY prior LEVEL <= 3;
LEVEL DUMMY
1 X
2 X
3 X
4 X
As per #Danil Suhomlinov post, we can further simplify using COLUMN1 for special table dual single column:
SELECT LEVEL, column1 FROM dual
CONNECT BY prior LEVEL <= 3;
LEVEL | COLUMN1
------+--------
1 |
2 |
3 |
4 |

Oracle - generate records with alphanumeric number and character ascending

I want to have a column which holds customer unique key, consisting of number+character in an ascending order.
Is it possible to somehow instruct Oracle to generate many records, from 1a to 1z, then 2a to 2z, etc. up until 300000z:
CUSTOMER_NUM
------------
10a
10b
10c
.
.
10z
11a
11b
.
Best I have reached so far is something like this:
select ROUND(DBMS_RANDOM.VALUE(9,21)), dbms_random.string('l', 1) from dual;
Any ideas anyone please? I would like to generate test table with at least 300000 records.
Thanks!
Below query should give you desired result.
WITH
TABLE1 AS (SELECT LEVEL PART_1 FROM DUAL CONNECT BY LEVEL <= 300000),
TABLE2 AS (SELECT CHR(LEVEL+96) AS PART_2 FROM DUAL CONNECT BY LEVEL <27)
SELECT PART_1||PART_2 AS KEY_ID FROM TABLE1,TABLE2
You may need something like the following:
select alpha || num
from
(select substr('qwertyuiopasdfghjklzxcvbnm', level, 1) as alpha from dual connect by level <= 26)
cross join
(select level as num from dual connect by level <= 2)
order by num, alpha
The first query uses a string containing all the characters and splits it into 26 rows containing a single character.
The second query generates a given number of numbers to join with the characters.
You can try this one:
WITH n AS
(SELECT LEVEL AS Num FROM dual CONNECT BY LEVEL < 10),
c AS
(SELECT CHR(LEVEL + 96) AS alpha FROM dual CONNECT BY LEVEL <= 26)
SELECT num||alpha
FROM n
CROSS JOIN c;

Oracle - split single row into multiple rows based on column value [duplicate]

This question already has answers here:
How to unfold the results of an Oracle query based on the value of a column
(4 answers)
Closed 6 years ago.
I have a table like this
parent_item child_item quantity
A B 2
A C 3
B E 1
B F 2
And would like to split this in multiple lines based on the quantity
parent_item child_item quantity
A B 1
A B 1
A C 1
A C 1
A C 1
B E 1
B F 1
B F 1
The column quantity (1) is not really necessary.
I was able to generate something with the help of connect by / level, but for large tables it's very very slow.I'm not really familiar with connect by / level, but this seemed to work, although I can't really explain:
select distinct parent_item, level LEVEL_TAG, child_item, level||quantity
FROM table
CONNECT BY quantity>=level
order by 1 asc;
I found similar questions, but in most cases topicstarter want's to split a delimited column value in multiple lines (Oracle - split single row into multiple rows)
What's the most performant method to solve this?
Thanks
Use a recursive sub-query factoring clause:
WITH split ( parent_item, child_item, lvl, quantity ) AS (
SELECT parent_item, child_item, 1, quantity
FROM your_table
UNION ALL
SELECT parent_item, child_item, lvl + 1, quantity
FROM split
WHERE lvl < quantity
)
SELECT parent_item, child_item, 1 As quantity
FROM split;
Or you can use a correlated hierarchical query:
SELECT t.parent_item, t.child_item, 1 AS quantity
FROM your_table t,
TABLE(
CAST(
MULTISET(
SELECT LEVEL
FROM DUAL
CONNECT BY LEVEL <= t.quantity
)
AS SYS.ODCINUMBERLIST
)
) l;
As for which is more performant - try benchmarking the different solutions as we cannot tell you what will be more performant on your system.

How to check any missing number from a series of numbers?

I am doing a project creating an admission system for a college; the technologies are Java and Oracle.
In one of the tables, pre-generated serial numbers are stored. Later, against those serial numbers, the applicant's form data will be entered. My requirement is that when the entry process is completed I will have to generate a Lot wise report. If during feeding pre-generated serial numbers any sequence numbers went missing.
For example, say in a table, the sequence numbers are 7001, 7002, 7004, 7005, 7006, 7010.
From the above series it is clear that from 7001 to 7010 the numbers missing are 7003, 7007, 7008 and 7009
Is there any DBMS function available in Oracle to find out these numbers or if any stored procedure may fulfill my purpose then please suggest an algorithm.
I can find some techniques in Java but for speed I want to find the solution in Oracle.
A solution without hardcoding the 9:
select min_a - 1 + level
from ( select min(a) min_a
, max(a) max_a
from test1
)
connect by level <= max_a - min_a + 1
minus
select a
from test1
Results:
MIN_A-1+LEVEL
-------------
7003
7007
7008
7009
4 rows selected.
Try this:
SELECT t1.SequenceNumber + 1 AS "From",
MIN(t2.SequenceNumber) - 1 AS "To"
FROM MyTable t1
JOIN MyTable t2 ON t1.SequenceNumber < t2.SequenceNumber
GROUP BY t1.SequenceNumber
HAVING t1.SequenceNumber + 1 < MIN(t2.SequenceNumber)
Here is the result for the sequence 7001, 7002, 7004, 7005, 7006, 7010:
From To
7003 7003
7007 7009
This worked but selects the first sequence (start value) since it doesn't have predecessor. Tested in SQL Server but should work in Oracle
SELECT
s.sequence FROM seqs s
WHERE
s.sequence - (SELECT sequence FROM seqs WHERE sequence = s.sequence-1) IS NULL
Here is a test result
Table
-------------
7000
7001
7004
7005
7007
7008
Result
----------
7000
7004
7007
To get unassigned sequence, just do value[i] - 1 where i is greater first row e.g. (7004 - 1 = 7003 and 7007 - 1 = 7006) which are available sequences
I think you can improve on this simple query
This works on postgres >= 8.4. With some slight modifications to the CTE-syntax it could be made to work for oracle and microsoft, too.
-- EXPLAIN ANALYZE
WITH missing AS (
WITH RECURSIVE fullhouse AS (
SELECT MIN(num)+1 as num
FROM numbers n0
UNION ALL SELECT 1+ fh0.num AS num
FROM fullhouse fh0
WHERE EXISTS (
SELECT * FROM numbers ex
WHERE ex.num > fh0.num
)
)
SELECT * FROM fullhouse fh1
EXCEPT ( SELECT num FROM numbers nx)
)
SELECT * FROM missing;
Here's a solution that:
Relies on Oracle's LAG function
Does not require knowledge of the complete sequence (but thus doesn't detect if very first or last numbers in sequence were missed)
Lists the values surrounding the missing lists of numbers
Lists the missing lists of numbers as contiguous groups (perhaps convenient for reporting)
Tragically fails for very large lists of missing numbers, due to listagg limitations
SQL:
WITH MentionedValues /*this would just be your actual table, only defined here to provide data for this example */
AS (SELECT *
FROM ( SELECT LEVEL + 7000 seqnum
FROM DUAL
CONNECT BY LEVEL <= 10000)
WHERE seqnum NOT IN (7003,7007,7008,7009)--omit those four per example
),
Ranges /*identifies all ranges between adjacent rows*/
AS (SELECT seqnum AS seqnum_curr,
LAG (seqnum, 1) OVER (ORDER BY seqnum) AS seqnum_prev,
seqnum - (LAG (seqnum, 1) OVER (ORDER BY seqnum)) AS diff
FROM MentionedValues)
SELECT Ranges.*,
( SELECT LISTAGG (Ranges.seqnum_prev + LEVEL, ',') WITHIN GROUP (ORDER BY 1)
FROM DUAL
CONNECT BY LEVEL < Ranges.diff) "MissingValues" /*count from lower seqnum+1 up to lower_seqnum+(diff-1)*/
FROM Ranges
WHERE diff != 1 /*ignore when diff=1 because that means the numers are sequential without skipping any*/
;
Output:
SEQNUM_CURR SEQNUM_PREV DIFF MissingValues
7004 7002 2 "7003"
7010 7006 4 "7007,7008,7009"
One simple way to get your answer for your scenario is this:
create table test1 ( a number(9,0));
insert into test1 values (7001);
insert into test1 values (7002);
insert into test1 values (7004);
insert into test1 values (7005);
insert into test1 values (7006);
insert into test1 values (7010);
commit;
select n.n from (select ROWNUM + 7001 as n from dual connect by level <= 9) n
left join test1 t on n.n = t.a where t.a is null;
The select will give you the answer from your example. This only makes sense, if you know in advance in which range your numbers are and the range should not too big. The first number must be the offset in the ROWNUM part and the length of the sequence is the limit to the level in the connect by part.
I would have suggested connect by level as Stefan has done, however, you can't use a sub-query in this statement, which means that it isn't really suitable for you as you need to know what the maximum and minimum values of your sequence are.
I would suggest a pipe-lined table function might be the best way to generate the numbers you need to do the join. In order for this to work you'd need an object in your database to return the values to:
create or replace type t_num_array as table of number;
Then the function:
create or replace function generate_serial_nos return t_num_array pipelined is
l_first number;
l_last number;
begin
select min(serial_no), max_serial_no)
into l_first, l_last
from my_table
;
for i in l_first .. l_last loop
pipe row(i);
end loop;
return;
end generate_serial_nos;
/
Using this function the following would return a list of serial numbers, between the minimum and maximum.
select * from table(generate_serial_nos);
Which means that your query to find out which serial numbers are missing becomes:
select serial_no
from ( select *
from table(generate_serial_nos)
) generator
left outer join my_table actual
on generator.column_value = actual.serial_no
where actual.serial_no is null
SELECT ROWNUM "Missing_Numbers" FROM dual CONNECT BY LEVEL <= (SELECT MAX(a) FROM test1)
MINUS
SELECT a FROM test1 ;
Improved query is:
SELECT ROWNUM "Missing_Numbers" FROM dual CONNECT BY LEVEL <= (SELECT MAX(a) FROM test1)
MINUS
SELECT ROWNUM "Missing_Numbers" FROM dual CONNECT BY LEVEL < (SELECT Min(a) FROM test1)
MINUS
SELECT a FROM test1;
Note: a is column in which we find missing value.
Try with a subquery:
SELECT A.EMPNO + 1 AS MissingEmpNo
FROM tblEmpMaster AS A
WHERE A.EMPNO + 1 NOT IN (SELECT EMPNO FROM tblEmpMaster)
select A.ID + 1 As ID
From [Missing] As A
Where A.ID + 1 Not IN (Select ID from [Missing])
And A.ID < n
Data: ID
1
2
5
7
Result: ID
3
4
6