SQL Connect By Level sometimes works, sometimes doesn't, can't understand why

I am trying to run the query below in Oracle. If I change the round to 0, I get a result, but whenever there are decimals I get no result back when using the connect by level part. If I run only the part of my query after n.n =, I do get the result.
The reason I am trying to use connect by level is that I have a requirement to put my entire query into the where clause, because the application restricts the group by clause I need.
SELECT n.n
FROM
(SELECT TO_NUMBER( LEVEL) - 1 n FROM DUAL CONNECT BY LEVEL <= 1000 ) n
WHERE n.n =
(subquery)
The HOURS values that work are whole numbers, so when they are summed the total is still a whole number:
5
10
5
5
20
But where I have seen the query not work is where I have decimal values such as:
3.68
2.45
5
10
5
Table:ASSIGNMENTS_M
Table: RESULT_VALUES
Columns: Result_ID, Assignment_ID, Date_Earned, Hours
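INSERT INTO RESULT_VALUES(Result_ID, Assignment_ID, Date_Earned, Hours) VALUES(50,123456,to_date('01/02/2020', 'DD/MM/YYYY'),3.68);
INSERT INTO RESULT_VALUES(Result_ID, Assignment_ID, Date_Earned, Hours) VALUES(51,230034,to_date('02/02/2020', 'DD/MM/YYYY'),5);
INSERT INTO RESULT_VALUES(Result_ID, Assignment_ID, Date_Earned, Hours) VALUES(52,123456,to_date('03/02/2020', 'DD/MM/YYYY'),10);
INSERT INTO RESULT_VALUES(Result_ID, Assignment_ID, Date_Earned, Hours) VALUES(53,123456,to_date('04/02/2020', 'DD/MM/YYYY'),5);
INSERT INTO RESULT_VALUES(Result_ID, Assignment_ID, Date_Earned, Hours) VALUES(60,123456,to_date('05/02/2020', 'DD/MM/YYYY'),5);
INSERT INTO RESULT_VALUES(Result_ID, Assignment_ID, Date_Earned, Hours) VALUES(90,123456,to_date('06/02/2020', 'DD/MM/YYYY'),5);
INSERT INTO RESULT_VALUES(Result_ID, Assignment_ID, Date_Earned, Hours) VALUES(2384,123456,to_date('07/02/2020', 'DD/MM/YYYY'),10);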
Expected Result = 38.68

Here's one solution, even though it's odd you want to do this:
This increments by 0.1 to find the matching row:
SELECT n.n
FROM ( SELECT TO_NUMBER(LEVEL)/10 - 1 n FROM DUAL CONNECT BY LEVEL <= 1000 ) n
WHERE n.n = (
SELECT round((sum(P2.HOURS)),1) FTE
FROM ASSIGNMENTS_M P1, RESULT_HOURS P2
WHERE P2.date_earned BETWEEN to_date('2020/01/01','YYYY/MM/DD') AND to_date('2020/10/31','YYYY/MM/DD')
AND P1.ASSIGNMENT_ID = 123456
GROUP BY P1.ASSIGNMENT_ID
)
;
This increments by 1 to find the matching row, but adjusts the calculation to allow this:
SELECT n.n / 10
FROM ( SELECT TO_NUMBER(LEVEL) - 1 n FROM DUAL CONNECT BY LEVEL <= 1000 ) n
WHERE n.n = (
SELECT round((sum(P2.HOURS)),1) FTE
FROM ASSIGNMENTS_M P1, RESULT_HOURS P2
WHERE P2.date_earned BETWEEN to_date('2020/01/01','YYYY/MM/DD') AND to_date('2020/10/31','YYYY/MM/DD')
AND P1.ASSIGNMENT_ID = 123456
GROUP BY P1.ASSIGNMENT_ID
) * 10
;
None of your results match the number sequence generated by the n derived table:
SELECT p1.assignment_id, round((sum(P2.HOURS)),1) FTE
FROM ASSIGNMENTS_M P1, RESULT_HOURS P2
WHERE P2.date_earned BETWEEN to_date('2020/01/01','YYYY/MM/DD') AND to_date('2020/10/31','YYYY/MM/DD')
AND P1.ASSIGNMENT_ID = 123456
GROUP BY P1.ASSIGNMENT_ID
;
Result:
+--------+------+
| id     | fte  |
+--------+------+
| 123456 | 43.7 |
+--------+------+
That's the reason. Now how do you want to change this logic?
Do you want an approximate comparison or do you want your sequence to be in 0.1 increments?
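The mismatch can be reproduced outside the database. Here is a minimal Python sketch, assuming the seven HOURS values from the question's INSERT and, like the query above, summing all of them. The names hours, fte, whole_numbers and tenths are mine, and the tenths sequence starting at 0 is an assumption rather than a copy of the LEVEL expression above.

```python
from decimal import Decimal, ROUND_HALF_UP

# The seven HOURS values from the question's sample data.
hours = [Decimal(x) for x in ("3.68", "5", "10", "5", "5", "5", "10")]

# Equivalent of ROUND(SUM(hours), 1).
fte = sum(hours).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)

# LEVEL - 1 for LEVEL <= 1000: whole numbers only.
whole_numbers = [Decimal(level - 1) for level in range(1, 1001)]
# Same generator, scaled to steps of 0.1 (assumed to start at 0).
tenths = [Decimal(level - 1) / 10 for level in range(1, 1001)]

matches_whole = [n for n in whole_numbers if n == fte]
matches_tenths = [n for n in tenths if n == fte]

print(fte)             # 43.7
print(matches_whole)   # [] : no whole number equals 43.7
print(matches_tenths)  # [Decimal('43.7')]
```

An equality test against a generated sequence only returns a row when the sequence has the same step size as the rounded value.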


Redshift: Generate a sequential range of numbers

I'm currently migrating PostgreSQL code from our existing DWH to new Redshift DWH and few queries are not compatible.
I have a table which has id, start_week, end_week and orders_each_week in a single row. I'm trying to generate a sequential series between start_week and end_week so that I get separate rows for each week in the given timeline.
E.g.,
This is how it is present in the table:
+----+------------+----------+------------------+
| ID | start_week | end_week | orders_each_week |
+----+------------+----------+------------------+
| 1 | 3 | 5 | 10 |
+----+------------+----------+------------------+
This is how I want to have it
+----+------+--------+
| ID | week | orders |
+----+------+--------+
| 1 | 3 | 10 |
+----+------+--------+
| 1 | 4 | 10 |
+----+------+--------+
| 1 | 5 | 10 |
+----+------+--------+
The code below throws an error:
SELECT
id,
generate_series(start_week::BIGINT, end_week::BIGINT) AS demand_weeks
FROM client_demand
WHERE createddate::DATE >= '2021-01-01'
[0A000][500310] Amazon Invalid operation: Specified types or functions (one per INFO message) not supported on Redshift tables.;
[01000] Function "generate_series(bigint,bigint)" not supported.
So basically I am trying to generate a sequential series between two numbers and I couldn't find any solution. Any help here is really appreciated. Thank you.
Gordon Linoff has shown a very common method for doing this, and this approach has the advantage that the process isn't generating "rows" that don't already exist. This can make it faster than approaches that generate data on the fly. However, you need to have a table with about the right number of rows lying around, and this isn't always the case. He also shows that this number series needs to be cross joined with your data to perform the function you need.
If you need to generate a large number of numbers in a series without using an existing table, there are a number of ways to do this. Here's my go-to approach:
WITH twofivesix AS (
SELECT
p0.n
+ p1.n * 2
+ p2.n * POWER(2,2)
+ p3.n * POWER(2,3)
+ p4.n * POWER(2,4)
+ p5.n * POWER(2,5)
+ p6.n * POWER(2,6)
+ p7.n * POWER(2,7)
as n
FROM
(SELECT 0 as n UNION SELECT 1) p0,
(SELECT 0 as n UNION SELECT 1) p1,
(SELECT 0 as n UNION SELECT 1) p2,
(SELECT 0 as n UNION SELECT 1) p3,
(SELECT 0 as n UNION SELECT 1) p4,
(SELECT 0 as n UNION SELECT 1) p5,
(SELECT 0 as n UNION SELECT 1) p6,
(SELECT 0 as n UNION SELECT 1) p7
),
fourbillion AS (
SELECT (a.n * POWER(256, 3) + b.n * POWER(256, 2) + c.n * 256 + d.n) as n
FROM twofivesix a,
twofivesix b,
twofivesix c,
twofivesix d
)
SELECT ...
This example makes a whole bunch of numbers (4B), but you can extend or reduce the number in the series by changing the number of times the tables are cross joined and by adding where clauses (as Gordon Linoff did). I don't expect you need a list anywhere close to this long, but I wanted to show how this can be used to make very long series. (You can also write this in base 10 if that makes more sense to you.)
So if you have a table with more rows than the numbers you need, that can be the fastest method; but if you don't have such a table, or table lengths vary over time, you may want this pure SQL approach.
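To see why cross joining eight two-row tables yields every number from 0 to 255 exactly once, here is a small Python sketch of the same idea. itertools.product stands in for the cross join, and the name sixty_five_k is mine; this illustrates the arithmetic and is not Redshift code.

```python
from itertools import product

# Eight independent 0/1 "tables", cross joined: one row per combination of bits,
# each bit weighted by its power of two, as in the twofivesix CTE.
twofivesix = sorted(
    sum(bit * 2**position for position, bit in enumerate(bits))
    for bits in product((0, 1), repeat=8)
)

# Cross joining the result with itself treats each copy as a base-256 digit.
sixty_five_k = [a * 256 + b for a in twofivesix for b in twofivesix]

print(twofivesix[:5], len(twofivesix))      # [0, 1, 2, 3, 4] 256
print(sixty_five_k[-1], len(sixty_five_k))  # 65535 65536
```

Every combination of digits maps to a distinct number, so the series has no gaps and no duplicates. Each additional cross join multiplies its length by 256.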
Among the many Postgres features that Redshift does not support is generate_series() (except on the leader node). You can generate a series yourself.
If you have a table with enough rows in Redshift, then I find that this approach works:
with n as (
select row_number() over () - 1 as n
from client_demand cd
)
select cd.id, cd.start_week + n.n as week, cd.orders_each_week
from client_demand cd join
n
on n.n <= (end_week - start_week);
This assumes that you have a table with enough rows to generate enough numbers for the on clause. If the table is really big, then add something like limit 100 in the n CTE to limit the size.
If there are only a handful of values, you can use:
select 0 as n union all
select 1 as n union all
select 2 as n
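The join logic is easy to check outside the database. Below is a minimal Python sketch that uses the sample row from the question. The numbers range is a stand-in I chose for the n CTE, and the dictionary layout is only for illustration.

```python
# One row from the question: id 1, weeks 3 to 5, 10 orders per week.
client_demand = [
    {"id": 1, "start_week": 3, "end_week": 5, "orders_each_week": 10},
]

# Stand-in for the n CTE: 0, 1, 2, ... (any series long enough will do).
numbers = range(0, 100)

# Equivalent of: join n on n.n <= (end_week - start_week)
weekly = [
    (row["id"], row["start_week"] + n, row["orders_each_week"])
    for row in client_demand
    for n in numbers
    if n <= row["end_week"] - row["start_week"]
]

print(weekly)  # [(1, 3, 10), (1, 4, 10), (1, 5, 10)]
```

The number series only has to be at least as long as the widest start/end gap; the join condition discards the rest.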

Split column into multiple rows in ORACLE based on static length of substring

I have seen multiple topics here on "Split column into multiple rows", but they are all based on some delimiter.
I want to split the column based on length in Oracle.
Suppose I have a table:
codes | product
--------------------------+--------
C111C222C333C444C555..... | A
codes is of type VARCHAR2(800) and product is VARCHAR2(1).
The codes field holds many codes (maximum 200) which belong to product A, and the length of each code is 4 (so C111, C222, C333 are different codes).
I want the output of my select query to look like this:
code | product
---------------+-------
C111 | A
C222 | A
C333 | A
C444 | A
C555 | A
...
and so on.
Please help me with this. Thanks in advance.
Here's yet another variation using regexp_substr() along with CONNECT BY to "loop" through the string by 4 character substrings:
SQL> with tbl(codes, product) as (
select 'C111C222C333C444C555', 'A' from dual union all
select 'D111D222D333', 'B' from dual
)
select regexp_substr(codes, '(.{4})', 1, level, null, 1) code, product
from tbl
connect by level <= (length(codes)/4)
and prior codes = codes
and prior sys_guid() is not null;
CODE P
-------------------- -
C111 A
C222 A
C333 A
C444 A
C555 A
D111 B
D222 B
D333 B
8 rows selected.
SQL>
Here is how I would do it. Let me know if you need more input / better explanations:
select substr(tt.codes,(((t.l-1)*4)+1),4) code,tt.product from tst_tab tt
join (select level l from dual connect by level <= (select max(length(codes)/4) from tst_tab)) t
on t.l <= length(tt.codes)/4
order by tt.product,t.l;
Some explanations:
-- this part gives the numbers from 1 ... maximum number of codes in codes column
select level l from dual connect by level <= (select max(length(codes)/4) from tst_tab);
-- here is the query without the code extraction, it is just the numbers 1... numbers of codes for the product
select t.l,tt.product from tst_tab tt
join (select level l from dual connect by level <= (select max(length(codes)/4) from tst_tab)) t
on t.l <= length(tt.codes)/4
order by tt.product,t.l;
-- and then the substr just extracts the right code:
substr(tt.codes,(((t.l-1)*4)+1),4)
Set up of my test data:
create table tst_tab (codes VARCHAR2(800),product VARCHAR2(1));
insert into tst_tab values ('C111C222C333C444C555','A');
insert into tst_tab values ('C111C222C333C444C555D666','B');
insert into tst_tab values ('C111','C');
commit;
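The offset arithmetic in the substr call can be checked with a short Python sketch. It assumes the same test rows as above; split_fixed is a hypothetical helper name, and the only real difference is that Python slices are 0-based where substr is 1-based.

```python
def split_fixed(codes, width=4):
    """Split a string into consecutive chunks of `width` characters."""
    chunks = []
    # l runs from 1 to length/width, like the generated LEVEL values.
    for l in range(1, len(codes) // width + 1):
        start = (l - 1) * width  # substr uses (l-1)*4 + 1 because it is 1-based
        chunks.append(codes[start:start + width])
    return chunks

rows = [("C111C222C333C444C555", "A"), ("C111", "C")]
for codes, product in rows:
    for code in split_fixed(codes):
        print(code, product)
```

For the first row this prints C111 through C555, each paired with product A, and for the last row a single line C111 C.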
One option might be this:
SQL> with test (codes, product) as
  (select 'C111C222C333C444C555', 'A' from dual union all
   select 'D555D666D777', 'B' from dual
  )
select substr(codes, 4 * (column_value - 1) + 1, 4) code, product
from test,
     table(cast(multiset(select level from dual
                         connect by level <= length(codes) / 4
                        ) as sys.odcinumberlist))
order by 1;
CODE P
---- -
C111 A
C222 A
C333 A
C444 A
C555 A
D555 B
D666 B
D777 B
8 rows selected.
SQL>
Yet another, slightly different option, using recursive SQL to do this.
(To keep it concise I didn't add example test data. It could be taken from @Littlefoot's or @Peter's answers.)
select code, product
from (
select distinct
substr(codes, (level - 1) * 4 + 1, 4) as code,
level as l,
product
from YourTable
connect by substr(codes, (level - 1) * 4 + 1, 4) is not null
)
order by product, l
P.S. @Thorsten Kettner made a fair point about considering restructuring your tables. That would be the right thing to do for the sake of easier maintenance of your database in future.

SQL random number that doesn't repeat within a group

Suppose I have a table:
HH SLOT RN
--------------
1 1 null
1 2 null
1 3 null
--------------
2 1 null
2 2 null
2 3 null
I want to set RN to be a random number between 1 and 10. It's ok for the number to repeat across the entire table, but it's bad to repeat the number within any given HH. E.g.,:
HH SLOT RN_GOOD RN_BAD
--------------------------
1 1 9 3
1 2 4 8
1 3 7 3 <--!!!
--------------------------
2 1 2 1
2 2 4 6
2 3 9 4
This is on Netezza if it makes any difference. This one's being a real headscratcher for me. Thanks in advance!
To get a random number between 1 and the number of rows in the hh, you can use:
select hh, slot, row_number() over (partition by hh order by random()) as rn
from t;
The larger range of values is a bit more challenging. The following calculates a table (called randoms) with numbers and a random position in the same range. It then uses slot to index into the position and pull the random number from the randoms table:
with nums as (
select 1 as n union all select 2 union all select 3 union all select 4 union all select 5 union all
select 6 union all select 7 union all select 8 union all select 9 union all select 10
),
randoms as (
select n, row_number() over (order by random()) as pos
from nums
)
select t.hh, t.slot, hnum.n
from (select hh, randoms.n, randoms.pos
from (select distinct hh
from t
) t cross join
randoms
) hnum join
t
on t.hh = hnum.hh and
t.slot = hnum.pos;
Here is a SQLFiddle that demonstrates this in Postgres, which I assume is close enough to Netezza to have matching syntax.
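For comparison, here is the requirement expressed as a minimal Python sketch. It is not a translation of the query above: it draws a fresh sample without replacement for each HH, whereas the query reuses one shuffled list for every HH. The function name assign_random and the seed parameter are my own.

```python
import random

def assign_random(rows, low=1, high=10, seed=None):
    """Give each (hh, slot) a random number in [low, high], unique within its hh."""
    rng = random.Random(seed)
    slots_by_hh = {}
    for hh, slot in rows:
        slots_by_hh.setdefault(hh, []).append(slot)

    result = {}
    for hh, slots in slots_by_hh.items():
        # sample() draws without replacement, so no number repeats inside one hh.
        picks = rng.sample(range(low, high + 1), k=len(slots))
        for slot, rn in zip(slots, picks):
            result[(hh, slot)] = rn
    return result

rows = [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3)]
assigned = assign_random(rows, seed=42)
for key in sorted(assigned):
    print(key, assigned[key])
```

Numbers may still repeat across different HH values, which the question allows. If a household had more slots than there are numbers in the range, sample() would raise a ValueError.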
I am not an expert on SQL, but you could probably do something like this:
1. Initialize a counter CNT=1.
2. Create a table such that you sample 1 row randomly from each group, along with a count of null RN values, say C_NULL_RN.
3. With probability C_NULL_RN/(10-CNT+1) for each row, assign CNT as RN.
4. Increment CNT and go to step 2.
Well, I couldn't get a slick solution, so I did a hack:
1. Created a new integer field called rand_inst.
2. Assign a random number to each empty slot.
3. Update rand_inst to be the instance number of that random number within this household. E.g., if I get two 3's, then the second 3 will have rand_inst set to 2.
4. Update the table to assign a different random number anywhere that rand_inst>1.
5. Repeat assignment and update until we converge on a solution.
Here's what it looks like. Too lazy to anonymise it, so the names are a little different from my original post:
/* Iterative hack to fill 6 slots with a random number between 1 and 13.
A random number *must not* repeat within a household_id.
*/
update c3_lalfinal a
set a.rand_inst = b.rnum
from (
select household_id
,slot_nbr
,row_number() over (partition by household_id,rnd order by null) as rnum
from c3_lalfinal
) b
where a.household_id = b.household_id
and a.slot_nbr = b.slot_nbr
;
update c3_lalfinal
set rnd = CAST(0.5 + random() * (13-1+1) as INT)
where rand_inst>1
;
/* Repeat until this query returns 0: */
select count(*) from (
select household_id from c3_lalfinal group by 1 having count(distinct(rnd)) <> 6
) x
;

How to check any missing number from a series of numbers?

I am doing a project creating an admission system for a college; the technologies are Java and Oracle.
In one of the tables, pre-generated serial numbers are stored. Later, against those serial numbers, the applicant's form data will be entered. My requirement is that when the entry process is completed, I have to generate a lot-wise report showing whether any sequence numbers went missing while the pre-generated serial numbers were being fed in.
For example, say in a table, the sequence numbers are 7001, 7002, 7004, 7005, 7006, 7010.
From the above series it is clear that from 7001 to 7010 the numbers missing are 7003, 7007, 7008 and 7009
Is there any DBMS function available in Oracle to find out these numbers or if any stored procedure may fulfill my purpose then please suggest an algorithm.
I can find some techniques in Java but for speed I want to find the solution in Oracle.
A solution without hardcoding the 9:
select min_a - 1 + level
from ( select min(a) min_a
, max(a) max_a
from test1
)
connect by level <= max_a - min_a + 1
minus
select a
from test1
Results:
MIN_A-1+LEVEL
-------------
7003
7007
7008
7009
4 rows selected.
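The MINUS query is a set difference between the full range and the stored values. A minimal Python sketch of the same idea, using the sample numbers from the question (the variable names are mine):

```python
# The values stored in column a of test1.
a = [7001, 7002, 7004, 7005, 7006, 7010]

# Every number from min(a) to max(a), like min_a - 1 + level.
full_range = set(range(min(a), max(a) + 1))

# MINUS: whatever is in the full range but not in the table.
missing = sorted(full_range - set(a))

print(missing)  # [7003, 7007, 7008, 7009]
```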
Try this:
SELECT t1.SequenceNumber + 1 AS "From",
MIN(t2.SequenceNumber) - 1 AS "To"
FROM MyTable t1
JOIN MyTable t2 ON t1.SequenceNumber < t2.SequenceNumber
GROUP BY t1.SequenceNumber
HAVING t1.SequenceNumber + 1 < MIN(t2.SequenceNumber)
Here is the result for the sequence 7001, 7002, 7004, 7005, 7006, 7010:
From To
7003 7003
7007 7009
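The self-join pairs each number with the smallest larger number and keeps the pairs that are more than 1 apart. Assuming the values are distinct, the same logic can be sketched in Python by walking adjacent pairs of the sorted list (variable names are mine):

```python
seq = sorted([7001, 7002, 7004, 7005, 7006, 7010])

# For each value, the next value plays the role of MIN(t2.SequenceNumber).
gaps = [
    (current + 1, following - 1)  # "From" and "To" of the missing range
    for current, following in zip(seq, seq[1:])
    if following - current > 1    # the HAVING condition
]

print(gaps)  # [(7003, 7003), (7007, 7009)]
```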
This worked, but it also selects the first sequence number (the start value), since that has no predecessor. Tested in SQL Server, but it should work in Oracle:
SELECT
s.sequence FROM seqs s
WHERE
s.sequence - (SELECT sequence FROM seqs WHERE sequence = s.sequence-1) IS NULL
Here is a test result
Table
-------------
7000
7001
7004
7005
7007
7008
Result
----------
7000
7004
7007
To get an unassigned sequence number, take value[i] - 1 for every result row after the first, e.g. 7004 - 1 = 7003 and 7007 - 1 = 7006, which are available numbers.
I think you can improve on this simple query.
This works on PostgreSQL >= 8.4. With some slight modifications to the CTE syntax it could be made to work for Oracle and Microsoft SQL Server, too.
-- EXPLAIN ANALYZE
WITH missing AS (
WITH RECURSIVE fullhouse AS (
SELECT MIN(num)+1 as num
FROM numbers n0
UNION ALL SELECT 1+ fh0.num AS num
FROM fullhouse fh0
WHERE EXISTS (
SELECT * FROM numbers ex
WHERE ex.num > fh0.num
)
)
SELECT * FROM fullhouse fh1
EXCEPT ( SELECT num FROM numbers nx)
)
SELECT * FROM missing;
Here's a solution that:
Relies on Oracle's LAG function
Does not require knowledge of the complete sequence (but thus doesn't detect if very first or last numbers in sequence were missed)
Lists the values surrounding the missing lists of numbers
Lists the missing lists of numbers as contiguous groups (perhaps convenient for reporting)
Tragically fails for very large lists of missing numbers, due to listagg limitations
SQL:
WITH MentionedValues /*this would just be your actual table, only defined here to provide data for this example */
AS (SELECT *
FROM ( SELECT LEVEL + 7000 seqnum
FROM DUAL
CONNECT BY LEVEL <= 10000)
WHERE seqnum NOT IN (7003,7007,7008,7009)--omit those four per example
),
Ranges /*identifies all ranges between adjacent rows*/
AS (SELECT seqnum AS seqnum_curr,
LAG (seqnum, 1) OVER (ORDER BY seqnum) AS seqnum_prev,
seqnum - (LAG (seqnum, 1) OVER (ORDER BY seqnum)) AS diff
FROM MentionedValues)
SELECT Ranges.*,
( SELECT LISTAGG (Ranges.seqnum_prev + LEVEL, ',') WITHIN GROUP (ORDER BY 1)
FROM DUAL
CONNECT BY LEVEL < Ranges.diff) "MissingValues" /*count from lower seqnum+1 up to lower_seqnum+(diff-1)*/
FROM Ranges
WHERE diff != 1 /*ignore when diff=1 because that means the numbers are sequential without skipping any*/
;
Output:
SEQNUM_CURR SEQNUM_PREV DIFF MissingValues
7004 7002 2 "7003"
7010 7006 4 "7007,7008,7009"
One simple way to get your answer for your scenario is this:
create table test1 ( a number(9,0));
insert into test1 values (7001);
insert into test1 values (7002);
insert into test1 values (7004);
insert into test1 values (7005);
insert into test1 values (7006);
insert into test1 values (7010);
commit;
select n.n from (select ROWNUM + 7001 as n from dual connect by level <= 9) n
left join test1 t on n.n = t.a where t.a is null;
The select will give you the answer for your example. This only makes sense if you know in advance which range your numbers are in, and the range should not be too big. The first number must be the offset in the ROWNUM part, and the length of the sequence is the limit on level in the connect by part.
I would have suggested connect by level as Stefan has done; however, you can't use a sub-query in this statement, which means that it isn't really suitable for you, as you need to know the maximum and minimum values of your sequence.
I would suggest a pipe-lined table function might be the best way to generate the numbers you need to do the join. In order for this to work you'd need an object in your database to return the values to:
create or replace type t_num_array as table of number;
Then the function:
create or replace function generate_serial_nos return t_num_array pipelined is
l_first number;
l_last number;
begin
select min(serial_no), max(serial_no)
into l_first, l_last
from my_table
;
for i in l_first .. l_last loop
pipe row(i);
end loop;
return;
end generate_serial_nos;
/
Using this function the following would return a list of serial numbers, between the minimum and maximum.
select * from table(generate_serial_nos);
Which means that your query to find out which serial numbers are missing becomes:
select generator.column_value as serial_no
from ( select *
from table(generate_serial_nos)
) generator
left outer join my_table actual
on generator.column_value = actual.serial_no
where actual.serial_no is null
SELECT ROWNUM "Missing_Numbers" FROM dual CONNECT BY LEVEL <= (SELECT MAX(a) FROM test1)
MINUS
SELECT a FROM test1 ;
Improved query is:
SELECT ROWNUM "Missing_Numbers" FROM dual CONNECT BY LEVEL <= (SELECT MAX(a) FROM test1)
MINUS
SELECT ROWNUM "Missing_Numbers" FROM dual CONNECT BY LEVEL < (SELECT Min(a) FROM test1)
MINUS
SELECT a FROM test1;
Note: a is column in which we find missing value.
Try with a subquery:
SELECT A.EMPNO + 1 AS MissingEmpNo
FROM tblEmpMaster AS A
WHERE A.EMPNO + 1 NOT IN (SELECT EMPNO FROM tblEmpMaster)
select A.ID + 1 As ID
From [Missing] As A
Where A.ID + 1 Not IN (Select ID from [Missing])
And A.ID < n
Data: ID
1
2
5
7
Result: ID
3
4
6

Generating Random Number In Each Row In Oracle Query

I want to select all rows of a table, each followed by a random number between 1 and 9:
select t.*, (select dbms_random.value(1,9) num from dual) as RandomNumber
from myTable t
But the random number is the same from row to row, only different from each run of the query. How do I make the number different from row to row in the same execution?
Something like?
select t.*, round(dbms_random.value() * 8) + 1 from foo t;
Edit:
David has pointed out this gives uneven distribution for 1 and 9.
As he points out, the following gives a better distribution:
select t.*, floor(dbms_random.value(1, 10)) from foo t;
At first I thought that this would work:
select DBMS_Random.Value(1,9) output
from ...
However, this does not generate an even distribution of output values:
select output,
count(*)
from (
select round(dbms_random.value(1,9)) output
from dual
connect by level <= 1000000)
group by output
order by 1
1 62423
2 125302
3 125038
4 125207
5 124892
6 124235
7 124832
8 125514
9 62557
The reasons are pretty obvious I think.
I'd suggest using something like:
floor(dbms_random.value(1,10))
Hence:
select output,
count(*)
from (
select floor(dbms_random.value(1,10)) output
from dual
connect by level <= 1000000)
group by output
order by 1
1 111038
2 110912
3 111155
4 111125
5 111084
6 111328
7 110873
8 111532
9 110953
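The two sets of counts can be derived exactly. Here is a minimal Python sketch, assuming the random value is uniform on [low, high); interval_share is a helper name of my own. It works out which slice of the range each integer collects under round and under floor.

```python
from fractions import Fraction

def interval_share(left, right, low, high):
    """Share of a uniform draw on [low, high) that lands in [left, right)."""
    overlap = min(Fraction(right), Fraction(high)) - max(Fraction(left), Fraction(low))
    return max(overlap, Fraction(0)) / Fraction(high - low)

half = Fraction(1, 2)

# round(value(1, 9)): integer k collects the draws in [k - 0.5, k + 0.5),
# so 1 and 9 only get half an interval each.
round_dist = {k: interval_share(k - half, k + half, 1, 9) for k in range(1, 10)}

# floor(value(1, 10)): integer k collects the draws in [k, k + 1).
floor_dist = {k: interval_share(k, k + 1, 1, 10) for k in range(1, 10)}

print(round_dist[1], round_dist[5], round_dist[9])  # 1/16 1/8 1/16
print(floor_dist[1], floor_dist[5], floor_dist[9])  # 1/9 1/9 1/9
```

1/16 and 1/8 of a million draws are 62,500 and 125,000, and 1/9 is about 111,111, which is in line with the counts shown above.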
You don't need a select … from dual, just write:
SELECT t.*, dbms_random.value(1,9) RandomNumber
FROM myTable t
If you just use round, the two end numbers (1 and 9) will occur less frequently. To get a more even distribution of integers between 1 and 9, use:
SELECT MOD(Round(DBMS_RANDOM.Value(1, 99)), 9) + 1 FROM DUAL
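How even this is can be worked out exactly. The Python sketch below assumes the random value is uniform on [1, 99) and that MOD behaves like Python's % for positive integers. It gives every rounded integer the width of the interval that rounds to it, then folds the integers with MOD 9.

```python
from collections import defaultdict
from fractions import Fraction

low, high = 1, 99
span = Fraction(high - low)

dist = defaultdict(Fraction)
for m in range(low, high + 1):
    # Draws in [m - 0.5, m + 0.5) round to m; the ends 1 and 99 get half of that.
    left = max(Fraction(low), Fraction(2 * m - 1, 2))
    right = min(Fraction(high), Fraction(2 * m + 1, 2))
    dist[m % 9 + 1] += (right - left) / span

for k in sorted(dist):
    print(k, dist[k])
```

Under these assumptions the outputs 1 and 2 each come out at 3/28 (about 10.7%) and the other seven at 11/98 (about 11.2%). The half-width ends at 1 and 99 are diluted rather than removed, so the result is close to even but not exactly even.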