How to generate a dynamic sequence in Oracle - sql

I have a table A in which each row represents a valid sequence of numbers; it looks something like this:
| id | start | end | step |
|----|-------|-------|------|
| 1 | 4000 | 4999 | 4 |
| 2 | 3 | 20000 | 1 |
A[1] thus represents the sequence [4000, 4004, 4008, ..., 4996],
and another table B of "occupied" numbers that looks like this:
| id | number | ... |
|-----|--------|-----|
| 1 | 4000 | ... |
| 2 | 4003 | ... |
| ... | ... | ... |
I want to construct a query which, using A and B, finds the first unoccupied number for a particular sequence.
What I have been trying – and failing – to do is to generate the list of valid numbers from a row in A, then left outer join table B on B.number = valid_number where B.id is null, and finally select min(...) from that result.
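For reference, that left-join idea can be sketched roughly like this for a single row of A (a sketch only; cstart, cend and cnumber stand in for the real column names, since START, END and NUMBER are reserved words in Oracle):
with seq as
 (select id, cstart, cend, step
  from a
  where id = 1),
valid_numbers as
 (select id, cstart + (level - 1) * step valid_number
  from seq
  connect by level <= (cend - cstart) / step + 1)
select min(v.valid_number) first_free
from valid_numbers v
     left outer join b on b.id = v.id
                      and b.cnumber = v.valid_number
where b.id is null;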

How about this?
I simplified your test case (END value isn't that high) in order to save space (otherwise, I'd have to use smaller font :)).
What does it do?
- CTEs A and B are your sample data
- FULL_ASEQ creates the sequence of numbers from table A; if you want to see what it returns, remove everything from line #17 onwards and run select * from full_aseq instead
- the final query returns the first available sequence number, i.e. the one that hasn't been used yet (lines #19 - 23)
Here you go:
SQL> with
2 a (id, cstart, cend, step) as
3 (select 1, 4000, 4032, 4 from dual union all
4 select 2, 3, 20, 1 from dual
5 ),
6 b (id, cnumber) as
7 (select 1, 4000 from dual union all
8 select 1, 4004 from dual union all
9 select 2, 4003 from dual
10 ),
11 full_aseq as
12 (select a.id, a.cstart + column_value * a.step seq_val
13 from a cross join table(cast(multiset(select level from dual
14 connect by level <= (a.cend - a.cstart) / a.step
15 ) as sys.odcinumberlist))
16 )
17 select f.id, min(f.seq_val) min_seq_val
18 from full_aseq f
19 where not exists (select null
20 from b
21 where b.id = f.id
22 and b.cnumber = f.seq_val
23 )
24 group by f.id;
ID MIN_SEQ_VAL
---------- -----------
1 4008
2 4
SQL>

You can use LEAD to compute the difference between ordered rows in table B. Any row having a difference (to the next row) that exceeds the step value for that sequence is a gap.
Here's that concept, implemented (below). I threw in a sequence ID "3" that has no values in table B, to illustrate that it generates the proper first value.
with
a (id, cstart, cend, step) as
(select 1, 4000, 4032, 4 from dual union all
select 2, 3, 20000, 1 from dual union all
select 3, 100, 200, 3 from dual
),
b (id, cnumber) as
(select 1, 4000 from dual union all
select 1, 4004 from dual union all
select 1, 4012 from dual union all
select 2, 4003 from dual
),
work1 as (
  select a.id,
         b.cnumber cnumber,
         lead(b.cnumber,1) over ( partition by b.id order by b.cnumber ) - b.cnumber diff,
         a.step,
         a.cstart,
         a.cend
  from a left join b on b.id = a.id )
select w1.id,
       CASE WHEN min(w1.cnumber) is null THEN w1.cstart
            WHEN min(w1.cnumber)+w1.step < w1.cend THEN min(w1.cnumber)+w1.step
            ELSE null END next_cnumber
from work1 w1
where ( diff is null or diff > w1.step )
group by w1.id, w1.step, w1.cstart, w1.cend
order by w1.id;
+----+--------------+
| ID | NEXT_CNUMBER |
+----+--------------+
| 1 | 4008 |
| 2 | 4004 |
| 3 | 100 |
+----+--------------+
You can further improve the results by excluding rows in table B that are impossible for the sequence. E.g., exclude a row for ID #1 having a value of, say, 4007.
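That filter could be folded into the join in work1, along these lines (a sketch, replacing the work1 CTE above; it treats a B row as "impossible" when it falls outside [cstart, cend] or is not aligned to the step):
work1 as (
  select a.id,
         b.cnumber cnumber,
         lead(b.cnumber,1) over ( partition by b.id order by b.cnumber ) - b.cnumber diff,
         a.step,
         a.cstart,
         a.cend
  from a left join b
         on b.id = a.id
        -- ignore occupied numbers that cannot belong to this sequence
        and b.cnumber between a.cstart and a.cend
        and mod(b.cnumber - a.cstart, a.step) = 0 )
With that in place, a stray row such as (1, 4007) simply never enters the LEAD calculation.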

I'll ask the obvious and suggest: why not use an actual sequence?
SQL> set timing on
SQL> CREATE SEQUENCE SEQ_TEST_A
START WITH 4000
INCREMENT BY 4
MINVALUE 4000
MAXVALUE 4999
NOCACHE
NOCYCLE
ORDER;
Sequence created.
Elapsed: 00:00:01.09
SQL> CREATE SEQUENCE SEQ_TEST_B
START WITH 3
INCREMENT BY 1
MINVALUE 3
MAXVALUE 20000
NOCACHE
NOCYCLE
ORDER;
Sequence created.
Elapsed: 00:00:00.07
SQL> -- get nextvals from A
SQL> select seq_test_a.nextval from dual;
NEXTVAL
----------
4000
1 row selected.
Elapsed: 00:00:00.09
SQL> select seq_test_a.nextval from dual;
NEXTVAL
----------
4004
1 row selected.
Elapsed: 00:00:00.08
SQL> select seq_test_a.nextval from dual;
NEXTVAL
----------
4008
1 row selected.
Elapsed: 00:00:00.08
SQL> -- get nextvals from B
SQL> select seq_test_b.nextval from dual;
NEXTVAL
----------
3
1 row selected.
Elapsed: 00:00:00.08
SQL> select seq_test_b.nextval from dual;
NEXTVAL
----------
4
1 row selected.
Elapsed: 00:00:00.08
SQL> select seq_test_b.nextval from dual;
NEXTVAL
----------
5
1 row selected.
Elapsed: 00:00:00.08

Related

pivot table with two rows into columns

I have the following result for my table:
our_date | number_people
------------------------
23/09/19 | 26
24/09/19 | 26
There will ALWAYS be just two rows,
and I want to pivot this result to get this:
our_date_1 | number_people_1 | our_date_2 | number_people_2
-----------------------------------------------------------------
23/09/19 | 26 | 24/09/19 | 26
so that I can get the difference between number_people_1 and number_people_2.
I tried with:
select *
from table_1
pivot(
  count(number_people)
  for our_date in (:P_TODAY, :P_YESTERDAY)
);
and this is the error I get:
ORA-56900: la variable de enlace no está soportada en la operación PIVOT|UNPIVOT
56900. 0000 - "bind variable is not supported inside pivot|unpivot operation"
*Cause: Attempted to use bind variables inside pivot|unpivot operation.
*Action: This is not supported.
What is wrong? How can I use dynamic values inside the FOR clause?
Best regards
The error says that the list of values in the FOR clause:
for our_date in (:P_TODAY, :P_YESTERDAY)
can't contain bind variables; the list has to contain constants, e.g.
for our_date in (date '2019-09-23', date '2019-09-24')
Once you fix that, the query might look like this:
SQL> with table_1 (our_date, number_people) as
2 (select date '2019-09-23', 26 from dual union all
3 select date '2019-09-24', 26 from dual
4 )
5 select *
6 from table_1
7 pivot (max(number_people)
8 for our_date in (date '2019-09-23', date '2019-09-24')
9 );
TO_DATE(' 2019-09-23 00:00:00' TO_DATE(' 2019-09-24 00:00:00'
------------------------------ ------------------------------
26 26
SQL>
But, that's not exactly what you wanted.
What if there are 3, 4 or more rows in that table? Is it possible, or will there always be only 2 rows?
If it is always only 2 rows, a self-join can do the job. For example:
SQL> with table_1 (our_date, number_people) as
2 (select date '2019-09-23', 26 from dual union all
3 select date '2019-09-24', 22 from dual
4 ),
5 temp as
6 (select our_date, number_people,
7 row_number() over (order by our_date) rn
8 from table_1
9 )
10 select
11 a.our_date our_date_1,
12 a.number_people number_people_1,
13 --
14 b.our_date our_date_2,
15 b.number_people number_people_2
16 from temp a cross join temp b
17 where a.rn = 1
18 and b.rn = 2;
OUR_DATE_1 NUMBER_PEOPLE_1 OUR_DATE_2 NUMBER_PEOPLE_2
---------- --------------- ---------- ---------------
23.09.2019 26 24.09.2019 22
SQL>

SQL to distinct part of the string - Oracle SQL

I have a table table1 with column line which is of type CLOB
Here are the values:
seq line
------------------------------
1 ISA*00*TEST
ISA*00*TEST1
GS*123GG*TEST*456:EHE
ST*ERT*RFR*EDRR*EER
GS*123GG*TEST*456:EHE
-------------------------------
2 ISA*01*TEST
GS*124GG*TEST*456:EHE
GS*125GG*TEST*456:EHE
ST*ERQ*RFR*EDRR*EER
ST*ERW*RFR*EDRR*EER
ST*ERR*RFR*EDRR*EER
I am trying to find the distinct values of the substring before the second star.
The output would be:
distinct_line_value count
ISA*00 2
GS*123GG 2
ST*ERT 1
ISA*01 1
GS*124GG 1
GS*125GG 1
ST*ERQ 1
ST*ERW 1
ST*ERR 1
Any ideas how I can do it based on distinct for the first 2 stars?
Here's one option:
Test case:
SQL> select * from test;
SEQ LINE
---------- --------------------------------------------------
1 ISA*00*TEST
ISA*00*TEST1
GS*123GG*TEST*456:EHE
ST*ERT*RFR*EDRR*EER
GS*123GG*TEST
2 ISA*01*TEST
GS*124GG*TEST*456:EHE
GS*125GG*TEST*456:EHE
ST*ERQ*RFR*EDRR*EER
ST*E
Query (see comments within the code; apart from that REGEXP_SUBSTR is crucial here, along with its 'm' match parameter which treats the input string as multiple lines):
SQL> with
2 -- split CLOB values to rows
3 inter as
4 (select seq,
5 regexp_substr(line, '^.*$', 1, column_value, 'm') res
6 from test,
7 table(cast(multiset(select level from dual
8 connect by level <= regexp_count(line, chr(10)) + 1
9 ) as sys.odcinumberlist))
10 ),
11 -- convert CLOB to VARCHAR2 (so that SUBSTR works)
12 inter2 as
13 (select to_char(res) res From inter)
14 -- the final result
15 select substr(res, 1, instr(res, '*', 1, 2)) val, count(*)
16 from inter2
17 group by substr(res, 1, instr(res, '*', 1, 2))
18 order by 1;
VAL COUNT(*)
-------------------------------------------------- ----------
GS*123GG* 2
GS*124GG* 1
GS*125GG* 1
ISA*00* 2
ISA*01* 1
ST*ERQ* 1
ST*ERR* 1
ST*ERT* 1
ST*ERW* 1
9 rows selected.
SQL>

PL/SQL - SQL dynamic row and column parsing

I took a look into the forums and couldn't really find something that I needed.
What I have is two tables: one table (Parse_Table) with
File_ID|Start_Pos|Length|Description
------------------------------------
1 | 1 | 9 | Pos1
1 | 10 | 1 | Pos2
1 | 11 | 1 | Pos3
2 | 1 | 4 | Pos1
2 | 5 | 7 | Pos2
and another table (Input_file) that needs to be parsed, like:
String
ABCDEFGHI12
ASRQWERTQ45
123456789AB
321654852PO
and I want the result to use this specific parsing spec when I specify File_ID=1,
select DESCRIPTION, Start_pos, Length from Parse_table where File_ID=1
and be able to parse the input file like this:
String | Pos1 |Pos2|Pos3
---------------------------------
ABCDEFGHI12 |ABCDEFGHI | 1 | 2
ASRQWERTQ45 |ASRQWERTQ | 4 | 5
123456789AB |123456789 | A | B
321654852PO |321654852 | P | O
and alternatively if I put file_id=2 it would parse the values differently.
I looked at using the PIVOT function, but it looks like the number of columns is static, at least to my knowledge.
Thanks in advance for your support; please let me know what I can do in SQL.
You can get "close-ish" with the standard decode tricks to pivot the table assuming a ceiling on the maximum number of fields expected.
SQL> create table t ( fid int, st int, len int, pos varchar2(10));
Table created.
SQL>
SQL> insert into t values ( 1 , 1 , 9 , 'Pos1');
1 row created.
SQL> insert into t values ( 1 , 10 , 1 , 'Pos2');
1 row created.
SQL> insert into t values ( 1 , 11 , 1 , 'Pos3');
1 row created.
SQL> insert into t values ( 2 , 1 , 4 , 'Pos1');
1 row created.
SQL> insert into t values ( 2 , 5 , 7 , 'Pos2');
1 row created.
SQL>
SQL> create table t1 ( s varchar2(20));
Table created.
SQL>
SQL> insert into t1 values ('ABCDEFGHI12');
1 row created.
SQL> insert into t1 values ('ASRQWERTQ45');
1 row created.
SQL> insert into t1 values ('123456789AB');
1 row created.
SQL> insert into t1 values ('321654852PO');
1 row created.
SQL>
SQL>
SQL> select
2 t1.s,
3 max(decode(t.seq,1,substr(t1.s,t.st,t.len))) c1,
4 max(decode(t.seq,2,substr(t1.s,t.st,t.len))) c2,
5 max(decode(t.seq,3,substr(t1.s,t.st,t.len))) c3,
6 max(decode(t.seq,4,substr(t1.s,t.st,t.len))) c4,
7 max(decode(t.seq,5,substr(t1.s,t.st,t.len))) c5,
8 max(decode(t.seq,6,substr(t1.s,t.st,t.len))) c6
9 from t1,
10 ( select t.*, row_number() over ( partition by fid order by st ) as seq
11 from t
12 where fid = 1
13 ) t
14 group by t1.s
15 order by 1;
S C1 C2 C3 C4 C5 C6
-------------------- ------------- ------------- ------------- ------------- ------------- -------------
123456789AB 123456789 A B
321654852PO 321654852 P O
ABCDEFGHI12 ABCDEFGHI 1 2
ASRQWERTQ45 ASRQWERTQ 4 5
4 rows selected.
SQL>
SQL> select
2 t1.s,
3 max(decode(t.seq,1,substr(t1.s,t.st,t.len))) c1,
4 max(decode(t.seq,2,substr(t1.s,t.st,t.len))) c2,
5 max(decode(t.seq,3,substr(t1.s,t.st,t.len))) c3,
6 max(decode(t.seq,4,substr(t1.s,t.st,t.len))) c4,
7 max(decode(t.seq,5,substr(t1.s,t.st,t.len))) c5,
8 max(decode(t.seq,6,substr(t1.s,t.st,t.len))) c6
9 from t1,
10 ( select t.*, row_number() over ( partition by fid order by st ) as seq
11 from t
12 where fid = 2
13 ) t
14 group by t1.s
15 order by 1;
S C1 C2 C3 C4 C5 C6
-------------------- ------------- ------------- ------------- ------------- ------------- -------------
123456789AB 1234 56789AB
321654852PO 3216 54852PO
ABCDEFGHI12 ABCD EFGHI12
ASRQWERTQ45 ASRQ WERTQ45
4 rows selected.
If you really wanted that result to then come back with only the desired column count and custom column names, then you're into dynamic SQL territory. How you'd tackle that depends on the tool you are providing the data to. If it can consume a REF CURSOR, then a little PL/SQL would do the trick.
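For example, a rough sketch of that ref-cursor route, reusing the t and t1 tables from above (the function name is just illustrative, and note the generated SQL simply concatenates the metadata stored in t, so that data needs to be trusted or sanitised):
create or replace function parse_file_rc (p_fid in number)
  return sys_refcursor
is
  l_sql varchar2(4000);
  l_rc  sys_refcursor;
begin
  -- build "select s, substr(s, <st>, <len>) as <pos>, ... from t1" for the requested file id
  select 'select s, '
         || listagg('substr(s, ' || st || ', ' || len || ') as ' || pos, ', ')
              within group (order by st)
         || ' from t1'
  into   l_sql
  from   t
  where  fid = p_fid;

  open l_rc for l_sql;
  return l_rc;
end;
/
From SQL*Plus you could then check it with var rc refcursor, exec :rc := parse_file_rc(1), print rc.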
An unknown number of columns can be returned from a SQL statement, but it requires code built with PL/SQL, ANY types, and Oracle Data Cartridge.
That code is tricky to write but you can start with my open source project Method4. Download, unzip, install, and then
write a SQL statement to generate a SQL statement.
Query
select * from table(method4.dynamic_query(
q'[
  --Create a SQL statement to query PARSE_FILE.
  select
    'select '||
    listagg(column_expression, ',') within group (order by start_pos) ||
    ' from parse_file'
    column_expressions
  from
  (
    --Create individual SUBSTR column expressions.
    select
      parse_table.*,
      'substr(string, '||start_pos||', '||length||') '||description column_expression
    from parse_table
    --CHANGE BELOW LINE TO USE A DIFFERENT FILE:
    where file_id = 2
    order by start_pos
  )
]'
));
Sample Schema
create table parse_table as
select 1 file_id, 1 start_pos, 9 length, 'Pos1' description from dual union all
select 1 file_id, 10 start_pos, 1 length, 'Pos2' description from dual union all
select 1 file_id, 11 start_pos, 1 length, 'Pos3' description from dual union all
select 2 file_id, 1 start_pos, 4 length, 'Pos1' description from dual union all
select 2 file_id, 5 start_pos, 7 length, 'Pos2' description from dual;
create table parse_file as
select 'ABCDEFGHI12' string from dual union all
select 'ASRQWERTQ45' string from dual union all
select '123456789AB' string from dual union all
select '321654852PO' string from dual;
Results
When FILE_ID = 1:
POS1 POS2 POS3
---- ---- ----
ABCDEFGHI 1 2
ASRQWERTQ 4 5
123456789 A B
321654852 P O
When FILE_ID = 2:
POS1 POS2
---- ----
ABCD EFGHI12
ASRQ WERTQ45
1234 56789AB
3216 54852PO

Strange random behavior in where clause

I have a table like this:
Id | GroupId | Category
------------------------
1 | 101 | A
2 | 101 | B
3 | 101 | C
4 | 103 | B
5 | 103 | D
6 | 103 | A
........................
I need to select one of the GroupIds randomly. For this I have used the following PL/SQL block:
declare
  v_group_count number;
  v_group_id    number;
begin
  select count(distinct GroupId) into v_group_count from MyTable;

  SELECT GroupId into v_group_id FROM
  (
    SELECT GroupId, ROWNUM RN FROM
      (SELECT DISTINCT GroupId FROM MyTable)
  )
  WHERE RN = Round(dbms_random.value(1, v_group_count));
end;
Because I rounded the random value, it is an integer, so the WHERE RN = Round(dbms_random.value(1, v_group_count)) condition should always return exactly one row. Generally it gives me one row as expected. But strangely it sometimes gives me no rows and sometimes it returns two rows. That's why it raises an error in this section:
SELECT GroupId into v_group_id
Does anyone know the reason for that behaviour?
round(dbms_random.value(1, v_group_count)) is being executed for every row, so every row might be selected or not.
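One simple workaround (a sketch based on your block) is to evaluate the random pick once, into a variable, before it is used in the WHERE clause (using FLOOR rather than ROUND, for the reason explained below):
declare
  v_group_count number;
  v_rn          number;
  v_group_id    number;
begin
  select count(distinct GroupId) into v_group_count from MyTable;

  -- pick the row number once, so the WHERE clause compares against a fixed value
  v_rn := floor(dbms_random.value(1, v_group_count + 1));

  select GroupId
  into   v_group_id
  from   (select GroupId, rownum rn
          from   (select distinct GroupId from MyTable))
  where  rn = v_rn;
end;
/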
P.s.
ROUND is a bad choice.
The probability of getting any of the edge values (e.g. 1 and 10) is half the probability of getting any other value (e.g. 2 to 9).
It is 0.0555... (1/18) Vs. 0.111... (1/9)
[ 1,1.5) --> 1
[1.5,2.5) --> 2
.
.
.
[8.5,9.5) --> 9
[9.5, 10) --> 10
select n,count(*)
from (select round(dbms_random.value(1, 10)) as n
from dual
connect by level <= 100000
)
group by n
order by n
;
N COUNT(*)
1 5488
2 11239
3 11236
4 10981
5 11205
6 11114
7 11211
8 11048
9 10959
10 5519
My recommendation is to use FLOOR on dbms_random.value(1,N+1)
select n,count(*)
from (select floor(dbms_random.value(1, 11)) as n
from dual
connect by level <= 100000
)
group by n
order by n
;
N COUNT(*)
1 10091
2 10020
3 10020
4 10021
5 9908
6 10036
7 10054
8 9997
9 9846
10 10007
If you want to select one randomly:
declare
  v_group_id number;
begin
  SELECT GroupId into v_group_id
  FROM (SELECT DISTINCT GroupId
        FROM MyTable
        ORDER BY dbms_random.value
       ) t
  WHERE rownum = 1;
end;

Select to build groups by (analytic) sum

Please help me build a SQL select to assign (software development) tasks to a software release. This is actually a fictitious example standing in for my real, business-specific problem.
I have a relation Tasks:
ID Effort_In_Days
3 3
1 2
6 2
2 1
4 1
5 1
I want to distribute the Tasks over releases which are at most 2 days long (a task longer than 2 days shall still be put into a single release of its own). In my real problem I have many more "days" available to distribute "tasks" to. Expected output:
Release Task_ID
1 3
2 1
3 6
4 2
4 4
5 5
I think I need to use analytic functions, something with sum(effort_in_days) over and so on, to get the result. But I haven't used analytic functions much and didn't find an example that's close enough to my specific problem. I need to start a new group (release) whenever a sum >= 2 is reached.
I would do something like:
with data as (
select 3 ID, 3 Effort_In_Days from dual union all
select 1 ID, 2 Effort_In_Days from dual union all
select 6 ID, 2 Effort_In_Days from dual union all
select 2 ID, 1 Effort_In_Days from dual union all
select 4 ID, 1 Effort_In_Days from dual union all
select 5 ID, 1 Effort_In_Days from dual
)
select id, effort_in_days, tmp, ceil(tmp/2) release
from (
select id, effort_in_days, sum(least(effort_in_days, 2)) over (order by effort_in_days desc rows unbounded preceding) tmp
from data
);
Which results in:
ID EFFORT_IN_DAYS TMP RELEASE
---------- -------------- ---------- ----------
3 3 2 1
1 2 4 2
6 2 6 3
2 1 7 4
4 1 8 4
5 1 9 5
Basically, I am using least() to convert everything over 2 down to 2. Then I am putting all rows in descending order by that value and starting to assign releases. Since they are in descending order with a max value of 2, I know I need to assign a new release every time when I get to a multiple of 2.
Note that if you had fractional values, you could end up with releases that do not have a full 2 days assigned (as opposed to having over 2 days assigned), which may or may not meet your needs.
Also note that I am only showing all columns in my output to make it easier to see what the code is actually doing.
This is an example of a bin-packing problem. There is no optimal solution in SQL that I am aware of, except in some boundary cases. For instance, if all the tasks have the same length, or if all the tasks are >= 2, then there is an easy-to-find optimal solution.
A greedy algorithm works pretty well: put a given record into the first bin where it fits, going through the list in descending size order.
If your problem is really as you state it, then the greedy algorithm will produce an optimal solution. That is, if the maximum value is 2 and the efforts are integers. There might even be a way to calculate the solution in SQL in this case.
Otherwise, you will need PL/SQL code to achieve an approximate solution.
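For example, a rough PL/SQL sketch of that greedy, first-fit-decreasing approach, assuming a table TASKS(id, effort_in_days) and a release capacity of 2 days:
set serveroutput on
declare
  type t_capacity is table of number index by pls_integer;
  l_room    t_capacity;        -- remaining capacity per release
  l_release pls_integer;
  l_last    pls_integer := 0;  -- highest release number opened so far
begin
  for r in (select id, effort_in_days
            from tasks
            order by effort_in_days desc, id) loop
    -- first fit: reuse the first release that still has room for this task
    l_release := null;
    for i in 1 .. l_last loop
      if l_room(i) >= r.effort_in_days then
        l_release := i;
        exit;
      end if;
    end loop;
    -- nothing fits: open a new release (an oversized task gets one of its own)
    if l_release is null then
      l_last := l_last + 1;
      l_release := l_last;
      l_room(l_release) := greatest(2, r.effort_in_days);
    end if;
    l_room(l_release) := l_room(l_release) - r.effort_in_days;
    dbms_output.put_line('Release ' || l_release || ': task ' || r.id);
  end loop;
end;
/
For the six sample tasks in the question this should print the same assignment as the expected output above.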
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE data AS
select 3 ID, 3 Effort_In_Days from dual union all
select 1 ID, 2 Effort_In_Days from dual union all
select 6 ID, 2 Effort_In_Days from dual union all
select 2 ID, 1 Effort_In_Days from dual union all
select 4 ID, 1 Effort_In_Days from dual union all
select 5 ID, 1 Effort_In_Days from dual union all
select 9 ID, 2 Effort_In_Days from dual union all
select 7 ID, 1 Effort_In_Days from dual union all
select 8 ID, 1 Effort_In_Days from dual;
Query 1:
1. Give the rows an index so that they can be kept in order easily;
2. Assign groups to the rows where the Effort_In_Days is 1, so that all adjacent rows with Effort_In_Days of 1 are in the same group and rows separated by higher values of Effort_In_Days are in different groups;
3. Assign a cost of 1 to each row where the Effort_In_Days is higher than 1, or where Effort_In_Days is 1 and the row has an odd row number within the group; then
4. Finally, the release is the sum of all the costs for the row and all preceding rows.
Like this:
WITH indexes AS (
SELECT ID,
Effort_In_Days,
ROWNUM AS idx
FROM Data
),
groups AS (
SELECT ID,
Effort_In_Days,
idx,
CASE Effort_In_Days
WHEN 1
THEN idx - ROW_NUMBER() OVER ( PARTITION BY Effort_In_Days ORDER BY idx )
END AS grp
FROM indexes
ORDER BY idx
),
costs AS (
SELECT ID,
Effort_In_Days,
idx,
CASE Effort_In_Days
WHEN 1
THEN MOD( ROW_NUMBER() OVER ( PARTITION BY grp ORDER BY idx ), 2 )
ELSE 1
END AS cost
FROM groups
ORDER BY idx
)
SELECT ID,
Effort_In_Days,
SUM( cost ) OVER ( ORDER BY idx ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) AS Release
FROM costs
ORDER BY idx
Results:
| ID | EFFORT_IN_DAYS | RELEASE |
|----|----------------|---------|
| 3 | 3 | 1 |
| 1 | 2 | 2 |
| 6 | 2 | 3 |
| 2 | 1 | 4 |
| 4 | 1 | 4 |
| 5 | 1 | 5 |
| 9 | 2 | 6 |
| 7 | 1 | 7 |
| 8 | 1 | 7 |