I am looking for ideas on how to group numbers into low and high ranges in Oracle SQL. I am looking to avoid cursors. Any ideas are welcome.
Example input
ID  LOW HIGH
--- ---- ----
A      0    2
A      2    3
A      3    5
A      9   11
A     11   13
A     13   15
B      0    1
B      1    4
B      7    9
B     11   12
B     12   17
B     17   18
Which would result in the following grouping into ranges
ID  LOW HIGH
--- ---- ----
A      0    5
A      9   15
B      0    4
B      7    9
B     11   18
This is a Gaps & Islands problem. You can use the traditional solution.
For example:
select max(id) as id, min(low) as low, max(high) as high
from (
  select x.*, sum(i) over(order by id, low) as g
  from (
    select t.*,
      case when low = lag(high) over(partition by id order by low)
            and id = lag(id) over(partition by id order by low)
           then 0 else 1 end as i
    from t
  ) x
) y
group by g
Result:
ID LOW HIGH
--- ---- ----
A 0 5
A 9 15
B 0 4
B 7 9
B 11 18
See running example at db<>fiddle.
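To cross-check the result outside the database, the same three steps (flag the start of each island, take a running sum of the flags, aggregate per group) can be sketched in plain Python. This is only an illustration of the logic under the assumption that rows are (id, low, high) tuples; the function name merge_adjacent is made up for this sketch and is not part of the SQL answer.

```python
from itertools import groupby

def merge_adjacent(rows):
    """Flag island starts, running-sum the flags, then aggregate per group."""
    group_no = 0
    prev = None
    labelled = []
    for id_, low, high in sorted(rows):          # ORDER BY id, low
        # A row continues the island only if it has the same id as the
        # previous row and starts exactly where that row ended.
        continues = prev is not None and prev[0] == id_ and prev[2] == low
        if not continues:
            group_no += 1                        # SUM(i) OVER (ORDER BY id, low)
        labelled.append((group_no, id_, low, high))
        prev = (id_, low, high)

    result = []
    for _, members in groupby(labelled, key=lambda r: r[0]):   # GROUP BY g
        members = list(members)
        result.append((members[0][1],
                       min(m[2] for m in members),
                       max(m[3] for m in members)))
    return result
```

Like the SQL above, this only joins rows whose low equals the previous high; overlapping rows would start a new group.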
From Oracle 12, you should use MATCH_RECOGNIZE for row-by-row pattern matching:
SELECT *
FROM   table_name
MATCH_RECOGNIZE(
  PARTITION BY id
  ORDER BY low, high
  MEASURES
    FIRST(low) AS low,
    MAX(high)  AS high
  PATTERN (overlapping* last_row)
  DEFINE
    overlapping AS NEXT(low) <= MAX(high)
)
Which, for the sample data:
CREATE TABLE table_name (id, low, high) AS
SELECT 'A', 0, 2 FROM DUAL UNION ALL
SELECT 'A', 2, 3 FROM DUAL UNION ALL
SELECT 'A', 3, 5 FROM DUAL UNION ALL
SELECT 'A', 9, 11 FROM DUAL UNION ALL
SELECT 'A', 11, 13 FROM DUAL UNION ALL
SELECT 'A', 13, 15 FROM DUAL UNION ALL
SELECT 'B', 0, 1 FROM DUAL UNION ALL
SELECT 'B', 1, 4 FROM DUAL UNION ALL
SELECT 'B', 7, 9 FROM DUAL UNION ALL
SELECT 'B', 11, 12 FROM DUAL UNION ALL
SELECT 'B', 12, 17 FROM DUAL UNION ALL
SELECT 'B', 17, 18 FROM DUAL UNION ALL
SELECT 'C', 0, 10 FROM DUAL UNION ALL
SELECT 'C', 1, 3 FROM DUAL UNION ALL
SELECT 'C', 5, 8 FROM DUAL UNION ALL
SELECT 'C', 9, 15 FROM DUAL UNION ALL
SELECT 'C', 10, 14 FROM DUAL UNION ALL
SELECT 'C', 11, 13 FROM DUAL;
Outputs:
ID  LOW HIGH
--- ---- ----
A      0    5
A      9   15
B      0    4
B      7    9
B     11   18
C      0   15
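The condition NEXT(low) <= MAX(high) keeps a running maximum of high, which is why the nested and overlapping C rows collapse into a single range. A rough Python equivalent of that idea, assuming (id, low, high) tuples and using a hypothetical function name, could look like this:

```python
def merge_overlapping(rows):
    """Merge ranges per id while the next low is <= the running max of high."""
    merged = []
    for id_, low, high in sorted(rows):          # PARTITION BY id ORDER BY low, high
        last = merged[-1] if merged else None
        if last is not None and last[0] == id_ and low <= last[2]:
            # Still inside (or touching) the current range: extend it.
            merged[-1] = (id_, last[1], max(last[2], high))
        else:
            # Gap or new id: start a new range.
            merged.append((id_, low, high))
    return merged
```

Unlike a comparison against only the previous row's high, the running maximum handles a long range followed by shorter ranges nested inside it.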
I need to flag a customer as y only when all of its related customers have also passed the check.
Below are the two tables:
relationship table :
customer_id related_customer
1 1
1 2
1 3
2 1
2 2
2 3
3 1
3 2
3 3
11 11
11 22
22 11
22 22
Check table
customer_id check_flag
1 y
2 y
3 n
11 y
22 y
I want output like below:
customer_id paas_fail_flag
1 n
2 n
3 n
11 y
22 y
Output justification: 1, 2 and 3 are related customers, and since one of them (3) has n in the check table, all of the related customers should also have n.
11 and 22 are related customers and both have y in the check table, so in the output both should have y.
You need to join relationship to check and use conditional aggregation:
SELECT r.customer_id,
COALESCE(MAX(CASE WHEN c.check_flag = 'n' THEN c.check_flag END), 'y') paas_fail_flag
FROM relationship r INNER JOIN "check" c
ON c.customer_id = r.related_customer
GROUP BY r.customer_id
ORDER BY r.customer_id
See the demo.
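For readers who want to verify the expected output by hand, the join plus conditional aggregation can be mimicked in a few lines of Python. This is a sketch under the assumption that the relationship table is a list of (customer_id, related_customer) pairs and the check table is a dict; pass_fail_flags is a made-up name.

```python
def pass_fail_flags(relationship, check):
    """Return 'n' for a customer if any related customer has check_flag 'n'."""
    flags = {}
    for customer_id, related in relationship:
        if related not in check:
            continue                      # INNER JOIN drops unmatched rows
        failed = check[related] == "n"
        # Once a customer has seen an 'n', it stays 'n' (the MAX(CASE ...) part);
        # otherwise it defaults to 'y' (the COALESCE part).
        if failed or flags.get(customer_id) == "n":
            flags[customer_id] = "n"
        else:
            flags[customer_id] = "y"
    return dict(sorted(flags.items()))    # ORDER BY customer_id
```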
Something like this? Sample data in lines #1 - 40; query begins at line #41:
SQL> WITH
2 -- sample data
3 rel (customer_id, related_customer)
4 AS
5 (SELECT 1, 1 FROM DUAL
6 UNION ALL
7 SELECT 1, 2 FROM DUAL
8 UNION ALL
9 SELECT 1, 3 FROM DUAL
10 UNION ALL
11 SELECT 2, 1 FROM DUAL
12 UNION ALL
13 SELECT 2, 2 FROM DUAL
14 UNION ALL
15 SELECT 2, 3 FROM DUAL
16 UNION ALL
17 SELECT 3, 1 FROM DUAL
18 UNION ALL
19 SELECT 3, 2 FROM DUAL
20 UNION ALL
21 SELECT 3, 3 FROM DUAL
22 UNION ALL
23 SELECT 11, 11 FROM DUAL
24 UNION ALL
25 SELECT 11, 22 FROM DUAL
26 UNION ALL
27 SELECT 22, 11 FROM DUAL
28 UNION ALL
29 SELECT 22, 22 FROM DUAL),
30 chk (customer_id, check_flag)
31 AS
32 (SELECT 1, 'y' FROM DUAL
33 UNION ALL
34 SELECT 2, 'y' FROM DUAL
35 UNION ALL
36 SELECT 3, 'n' FROM DUAL
37 UNION ALL
38 SELECT 11, 'y' FROM DUAL
39 UNION ALL
40 SELECT 22, 'y' FROM DUAL),
41 temp
42 AS
43 -- minimum CHECK_FLAG per customer and related customer
44 ( SELECT r.customer_id, r.related_customer, MIN (c.check_flag) mcf
45 FROM rel r JOIN chk c ON c.customer_id = r.related_customer
46 GROUP BY r.customer_id, r.related_customer)
47 SELECT customer_id, MIN (mcf) flag
48 FROM temp
49 GROUP BY customer_id
50 ORDER BY customer_id;
CUSTOMER_ID FLAG
----------- ----
1 n
2 n
3 n
11 y
22 y
SQL>
Assuming that your relationship data could be sparse, for example:
CREATE TABLE relationship ( customer_id, related_customer ) AS
SELECT 2, 3 FROM DUAL UNION ALL
SELECT 3, 1 FROM DUAL UNION ALL
SELECT 3, 2 FROM DUAL UNION ALL
SELECT 11, 22 FROM DUAL;
CREATE TABLE "CHECK" ( customer_id, check_flag ) AS
SELECT 1, 'y' FROM DUAL UNION ALL
SELECT 2, 'y' FROM DUAL UNION ALL
SELECT 3, 'n' FROM DUAL UNION ALL
SELECT 11, 'y' FROM DUAL UNION ALL
SELECT 22, 'y' FROM DUAL;
(Note: The below query will also work on your dense data, where every relationship combination is enumerated.)
Then you can use a hierarchical query:
SELECT customer_id,
MIN(check_flag) AS check_flag
FROM (
SELECT CONNECT_BY_ROOT(c.customer_id) AS customer_id,
c.check_flag AS check_flag
FROM "CHECK" c
LEFT OUTER JOIN relationship r
ON (r.customer_id = c.customer_id)
WHERE CONNECT_BY_ISLEAF = 1
CONNECT BY NOCYCLE
( PRIOR r.related_customer = c.customer_id
OR PRIOR c.customer_id = r.related_customer )
AND PRIOR c.check_flag = 'y'
)
GROUP BY
customer_id
ORDER BY
customer_id
Which outputs:
CUSTOMER_ID CHECK_FLAG
----------- ----------
          1 n
          2 n
          3 n
         11 y
         22 y
db<>fiddle here
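One way to reason about the sparse case is to treat the relationships as undirected links and give every customer the worst flag in its connected group. The Python sketch below does that with a small union-find; it is an assumption about the intended semantics (relationships are symmetric and transitive), not a translation of the hierarchical query, and the names are hypothetical.

```python
def flags_by_component(relationship, check):
    """Give each customer the minimum check_flag ('n' < 'y') of its connected group."""
    parent = {c: c for c in check}

    def find(c):
        # Follow parent links to the representative, shortening the path as we go.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for a, b in relationship:
        parent.setdefault(a, a)
        parent.setdefault(b, b)
        parent[find(a)] = find(b)         # link the two groups

    worst = {}
    for c, flag in check.items():
        root = find(c)
        worst[root] = min(worst.get(root, flag), flag)
    return {c: worst[find(c)] for c in sorted(check)}
```

With the sparse sample data this gives n for 1, 2 and 3 and y for 11 and 22, matching the output above.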
I have a table that looks like this:
Pam_A Week Value_1
A 1 10
A 2 13
B 3 15
B 4 10
B 5 11
B 6 10
I want to achieve the following:
Pam_A Week Value_1 Value_2
A 1 10
A 2 13
B 3 15 28
B 4 10 38
B 5 11 49
B 6 10 59
When Pam_A = 'B', sum the current Value_1 and the value of the preceding row, and keep increasing that total by each following Value_1.
Any ideas on how to achieve this cumulative sum?
First of all you need to mark all rows that you want to count. You can do it like this:
with t(Pam_A, Week, Value_1) as (
select 'A', 1, 10 from dual union all
select 'A', 2, 13 from dual union all
select 'B', 3, 15 from dual union all
select 'B', 4, 10 from dual union all
select 'B', 5, 11 from dual union all
select 'B', 6, 10 from dual
)
select
Pam_A, Week, Value_1
,case
when Pam_A='B' or lead(Pam_A)over(order by week) = 'B'
then 'Y'
else 'N'
end as flag
from t;
Results:
PAM_A WEEK VALUE_1 FLAG
----- ---------- ---------- ----
A 1 10 N
A 2 13 Y
B 3 15 Y
B 4 10 Y
B 5 11 Y
B 6 10 Y
6 rows selected.
Then you can aggregate only rows that have flag='Y':
with t(Pam_A, Week, Value_1) as (
select 'A', 1, 10 from dual union all
select 'A', 2, 13 from dual union all
select 'B', 3, 15 from dual union all
select 'B', 4, 10 from dual union all
select 'B', 5, 11 from dual union all
select 'B', 6, 10 from dual
)
select
v.*
,case
when flag='Y' and Pam_a='B'
then sum(Value_1)over(partition by flag order by Week)
end as sums
from (
select
Pam_A, Week, Value_1
,case
when Pam_A='B' or lead(Pam_A)over(order by week) = 'B'
then 'Y'
else 'N'
end as flag
from t
) v;
Results:
PAM_A WEEK VALUE_1 FLAG SUMS
----- ---------- ---------- ---- ----------
A 1 10 N
A 2 13 Y
B 3 15 Y 28
B 4 10 Y 38
B 5 11 Y 49
B 6 10 Y 59
6 rows selected.
Using a combination of LEAD and SUM analytic functions, you can determine which rows have the next PAM_A as a B, then only SUM if the next row is a B or the current row is a B.
Query
WITH
d (pam_a, week, value_1)
AS
(SELECT 'A', 1, 10 FROM DUAL
UNION ALL
SELECT 'A', 2, 13 FROM DUAL
UNION ALL
SELECT 'B', 3, 15 FROM DUAL
UNION ALL
SELECT 'B', 4, 10 FROM DUAL
UNION ALL
SELECT 'B', 5, 11 FROM DUAL
UNION ALL
SELECT 'B', 6, 10 FROM DUAL)
SELECT pam_a,
week,
value_1,
CASE
WHEN pam_a = 'B'
THEN
SUM (CASE WHEN next_pam_a = 'B' OR pam_a = 'B' THEN value_1 ELSE 0 END)
OVER (ORDER BY week)
ELSE
NULL
END value_2
FROM (SELECT pam_a, week, value_1, LEAD (pam_a) OVER (ORDER BY week) AS next_pam_a FROM d);
Result
PAM_A WEEK VALUE_1 VALUE_2
________ _______ __________ __________
A 1 10
A 2 13
B 3 15 28
B 4 10 38
B 5 11 49
B 6 10 59
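The interplay of LEAD and the running SUM can be easier to follow as a loop. The following Python sketch assumes rows are (pam_a, week, value_1) tuples and uses a made-up function name; it is meant only to show which rows feed the running total.

```python
def value_2(rows):
    """Running total over B rows plus the single row that precedes the first B."""
    rows = sorted(rows, key=lambda r: r[1])                 # ORDER BY week
    running = 0
    out = []
    for i, (pam_a, week, value_1) in enumerate(rows):
        # LEAD(pam_a) OVER (ORDER BY week); None on the last row
        next_pam_a = rows[i + 1][0] if i + 1 < len(rows) else None
        if pam_a == "B" or next_pam_a == "B":
            running += value_1                              # row counts toward the sum
        # The total is only displayed on B rows.
        out.append((pam_a, week, value_1, running if pam_a == "B" else None))
    return out
```

Week 1 is skipped because neither it nor the row after it is a B; week 2 is added to the total but shows no value because it is not a B row itself.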
If I understand, you want a cumulative sum but with conditionality:
select t.*,
(case when pam_A = 'B' then sum(value_1) over (order by week) end) as value_2
from t;
I am trying to figure out the root parent in a table with hierarchical data. The following example works as expected, but I need something extra: I want the query to ignore the row with a null id1, and to show the level just below the root (root parent - 1) when the root parent is null.
with table_a ( id1, child_id ) as (
select null, 1 from dual union all
select 1, 2 from dual union all
select 2, 3 from dual union all
select 3, NULL from dual union all
select 4, NULL from dual union all
select 5, 6 from dual union all
select 6, 7 from dual union all
select 7, 8 from dual union all
select 8, NULL from dual
)
select connect_by_root id1 as id, id1 as root_parent_id
from table_a
where connect_by_isleaf = 1
connect by child_id = prior id1
order by id 1
This brings up the following data
4 4
6 5
7 5
8 5
5 5
3 null
null null
2 null
1 null
what I want is
3 1
1 1
2 1
4 4
7 5
8 5
5 5
6 5
is it possible?
Thanks for the help
Using a recursive CTE you can do:
with table_a ( id1, child_id ) as (
select null, 1 from dual union all
select 1, 2 from dual union all
select 2, 3 from dual union all
select 3, NULL from dual union all
select 4, NULL from dual union all
select 5, 6 from dual union all
select 6, 7 from dual union all
select 7, 8 from dual union all
select 8, NULL from dual
),
n (s, e) as (
select id1 as s, child_id as e from table_a where id1 not in
(select child_id from table_a
where id1 is not null and child_id is not null)
union all
select n.s, a.child_id
from n
join table_a a on a.id1 = n.e
)
select
coalesce(e, s) as c, s
from n
order by s
Result:
C S
- -
3 1
1 1
2 1
4 4
5 5
7 5
8 5
6 5
As a side note, "Recursive CTEs" are more flexible than the old-school CONNECT BY.
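The expected output can also be checked with a short loop that walks each id up to the top of its chain and never steps onto a null parent. This Python sketch assumes rows are (id1, child_id) pairs with None for NULL and that the data contains no cycles; root_parents is a hypothetical name.

```python
def root_parents(rows):
    """Map every non-null id1 to the top-most non-null ancestor of its chain."""
    # child_id -> id1, keeping only links where both ends are known
    parent = {child: id1 for id1, child in rows
              if id1 is not None and child is not None}
    roots = {}
    for node in sorted({id1 for id1, _ in rows if id1 is not None}):
        current = node
        while current in parent:          # climb until there is no known parent
            current = parent[current]
        roots[node] = current
    return roots
```

Because the link (null, 1) is dropped when the parent map is built, 1 becomes the root of its own chain without hard-coding any value.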
This looks like it works, but it may be incorrect: I do not quite understand the logic behind choosing 1 here. It looks arbitrary to me, and real data is unlikely to look like this.
As Hogan has asked already, it would be helpful if you could perhaps provide an explanation or an expanded data set to test this hierarchy.
with table_a ( id1, child_id ) as (
select null, 1 from dual union all
select 1, 2 from dual union all
select 2, 3 from dual union all
select 3, NULL from dual union all
select 4, NULL from dual union all
select 5, 6 from dual union all
select 6, 7 from dual union all
select 7, 8 from dual union all
select 8, NULL from dual
)
select connect_by_root id1 as id, id1 as root_parent_id
from table_a
where connect_by_isleaf = 1 and connect_by_root id1 is not null
connect by nocycle child_id = prior nvl(id1, 1)
order by 2, 1;
Sample execution:
ID ROOT_PARENT_ID
---------- --------------
1 1
2 1
3 1
4 4
5 5
6 5
7 5
8 5
8 rows selected.
I have the following tables already in my DB
EMP
E_N E_NAM E_RATE E_DEP
--- ----- ---------- -----
1 A 400
2 B 200 1
3 C 150 2
4 D 150 3
5 E 120 1
6 F 100 1
7 G 100 2
8 H 50 2
9 I 50 3
10 J 50 3
11 K 150 3
WORKS
E_NO PR_NO HRS
--- --- ----------
2 1 10
3 2 20
5 1 20
5 2 20
5 3 20
6 1 10
6 2 10
I have to compute the amount billed to each project as AMOUNT, and that is the sum of the amount billed to the project by all employees who work on said project. The amount billed being E_RATE*HRS (product of HRS and E_RATE).
There are only 3 PR_NO: 1, 2 and 3.
I've tried this multiple times to no avail. I know that it has to be a nested query with the calculation shown AS AMOUNT, but I have no clue how to display only the 3 projects with the calculation already made.
Sounds like simple join and aggregation:
select w.pr_no,
sum(w.hrs * e.e_rate) as amount
from works w
join emp e on w.e_no = e.e_n
group by w.pr_no;
simple aggregate SUM() function after joining the tables
--test data
with EMP(e_no, e_name, e_rate, e_dep) as
(select 1, 'A', 400, null from dual union all
select 2, 'B', 200, 1 from dual union all
select 3, 'C', 150, 2 from dual union all
select 4, 'D', 150, 3 from dual union all
select 5, 'E', 120, 1 from dual union all
select 6, 'F', 100, 1 from dual union all
select 7, 'G', 100, 2 from dual union all
select 8, 'H', 50, 2 from dual union all
select 9, 'I', 50, 3 from dual union all
select 10, 'J', 50, 3 from dual union all
select 11, 'K', 150, 3 from dual),
WORKS(e_no, pr_no, hrs) as
(select 2, 1, 10 from dual union all
select 3, 2, 20 from dual union all
select 5, 1, 20 from dual union all
select 5, 2, 20 from dual union all
select 5, 3, 20 from dual union all
select 6, 1, 10 from dual union all
select 6, 2, 10 from dual)
-- actual query starts here
select w.pr_no, sum(w.hrs*e.e_rate) as amount
from works w
inner join emp e on (w.e_no = e.e_no)
group by w.pr_no;
"PR_NO"|"AMOUNT"
1|5400
2|6400
3|2400
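The totals are easy to confirm by hand. As a quick check, the join and SUM can be written as a small Python function; it assumes the employee rates are given as a dict keyed by employee number and the WORKS rows as (e_no, pr_no, hrs) tuples, and the function name is made up for this sketch.

```python
from collections import defaultdict

def amount_per_project(emp_rates, works):
    """Sum hrs * e_rate per project, as the join plus GROUP BY does."""
    amounts = defaultdict(int)
    for e_no, pr_no, hrs in works:
        if e_no in emp_rates:                       # inner join on employee number
            amounts[pr_no] += hrs * emp_rates[e_no]
    return dict(sorted(amounts.items()))
```

For project 1 this adds 200*10 + 120*20 + 100*10 = 5400, which agrees with the query output.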
We have a table with millions of rows and two columns, X and Y. There is a correlation between X and Y: when X is beyond a certain value, Y tends to be B (this is not always true; it is a trend, not a certainty).
I want to find the threshold value for X (call it X1) such that at least 99% of the values less than X1 are B.
This is easy to do in application code, but is there a SQL query that can do the computation?
For the dataset below the expected result is 6, because more than 99% of the values below 6 are 'B' and there is no bigger value of X for which that holds. If I change the precision to 90%, the result becomes 12, because more than 90% of the values with X < 12 are 'B' and there is no bigger value of X for which that holds.
So we need to find the biggest value X1 such that at least 99% of the values less than X1 are 'B'.
X Y
------
2 B
3 B
3 B
4 B
5 B
5 B
5 B
6 G
7 B
7 B
7 B
8 B
8 B
8 B
12 G
12 G
12 G
12 G
12 G
12 G
12 G
12 G
13 G
13 G
13 B
13 G
13 G
13 G
13 G
13 G
14 B
14 G
14 G
Ok, I think this accomplishes what you want to do, but it will not work for the data volume you are mentioning. I'm posting it anyway in case it can help someone else provide an answer.
This may be one of those cases where the most efficient way is to use a cursor with sorted data.
Oracle has some built-in functions for correlation analysis, but I've never worked with them, so I don't know how they work.
select max(x)
  from (select x
              ,y
              ,num_less
              ,num_b
              ,num_b / nullif(num_less, 0) as percent_b
          from (select x
                      ,y
                      ,(select count(*) from table_name b where b.x < a.x) as num_less
                      ,(select count(*) from table_name b where b.x < a.x and b.y = 'B') as num_b
                  from table_name a
               )
         where num_b / nullif(num_less, 0) >= 0.99
       );
The inner select does the following:
For every value of X:
Count the number of values < X
Count the number of those values that are 'B'
The next SELECT computes the ratio of B's and keeps only the rows where the ratio is at or above the threshold. The outer query just picks the max(x) from the remaining rows.
Edit:
The non-scalable part in the above query is the semi-cartesian self-joins.
This is mostly inspired from the previous answer, which had some flaws.
select max(next_x) from
(
select
count(case when y='B' then 1 end) over (order by x) correct,
count(case when y='G' then 1 end) over (order by x) wrong,
lead(x) over (order by x) next_x
from table_name
)
where correct/(correct + wrong) > 0.99
Sample data:
create table table_name(x number, y varchar2(1));
insert into table_name
select 2, 'B' from dual union all
select 3, 'B' from dual union all
select 3, 'B' from dual union all
select 4, 'B' from dual union all
select 5, 'B' from dual union all
select 5, 'B' from dual union all
select 5, 'B' from dual union all
select 6, 'G' from dual union all
select 7, 'B' from dual union all
select 7, 'B' from dual union all
select 7, 'B' from dual union all
select 8, 'B' from dual union all
select 8, 'B' from dual union all
select 8, 'B' from dual union all
select 12, 'G' from dual union all
select 12, 'G' from dual union all
select 12, 'G' from dual union all
select 12, 'G' from dual union all
select 12, 'G' from dual union all
select 12, 'G' from dual union all
select 12, 'G' from dual union all
select 12, 'G' from dual union all
select 13, 'G' from dual union all
select 13, 'G' from dual union all
select 13, 'B' from dual union all
select 13, 'G' from dual union all
select 13, 'G' from dual union all
select 13, 'G' from dual union all
select 13, 'G' from dual union all
select 13, 'G' from dual union all
select 14, 'B' from dual union all
select 14, 'G' from dual union all
select 14, 'G' from dual;
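Since the question mentions that this is easy to do in code, here is a single-pass Python sketch that can serve as a reference when checking any of the SQL variants against the sample data. It assumes rows are (x, y) tuples, considers only X values that actually occur in the data as candidate thresholds, and uses a hypothetical function name.

```python
from itertools import groupby

def threshold(rows, precision):
    """Largest x such that the share of 'B' among rows with a smaller x is >= precision."""
    best = None
    seen = good = 0
    for x, group in groupby(sorted(rows), key=lambda r: r[0]):
        # Here seen/good describe only rows with a strictly smaller x.
        if seen and good / seen >= precision:
            best = x
        for _, y in group:
            seen += 1
            good += (y == "B")
    return best
```

On the sample data this should return 6 for a precision of 0.99 and 12 for 0.90, in line with the expected results stated in the question.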
Give this a try and share the results.
It assumes the table is named table_name and the columns are x and y:
with TAB AS (
select (count(x) over (PARTITION BY Y order by x rows between unbounded preceding and current row))/
(COUNT(case when y='B' then 1 end) OVER (PARTITION BY Y)) * 100 CC, x, y
from table_name)
select x,y from (SELECT min(cc) over (partition by y) min_cc, x, cc, y
FROM TAB
where cc >= 99)
where min_cc = cc