I have a table like this:
--Table_Name--
A | B | C
-----------------
A1 NULL NULL
A1 NULL NULL
A2 NULL NULL
NULL B1 NULL
NULL B2 NULL
NULL B3 NULL
NULL NULL C1
I want to get a result like this:
--Table_Name--
A | B | C
-----------------
A1 B1 C1
A2 B2 NULL
NULL B3 NULL
How should I do that?
Here's one option:
sample data is from line #1 - 9
the following CTEs (lines #11 - 13) fetch ranked distinct not null values from each column
the final query (line #15 onward) returns desired result by outer joining previous CTEs on ranked value
SQL> with test (a, b, c) as
2 (select 'A1', null, null from dual union all
3 select 'A1', null, null from dual union all
4 select 'A2', null, null from dual union all
5 select null, 'B1', null from dual union all
6 select null, 'B2', null from dual union all
7 select null, 'B3', null from dual union all
8 select null, null, 'C1' from dual
9 ),
10 --
11 ta as (select distinct a, dense_rank() over (order by a) rn from test where a is not null),
12 tb as (select distinct b, dense_rank() over (order by b) rn from test where b is not null),
13 tc as (select distinct c, dense_rank() over (order by c) rn from test where c is not null)
14 --
15 select ta.a, tb.b, tc.c
16 from ta full outer join tb on ta.rn = tb.rn
17 full outer join tc on tc.rn = coalesce(ta.rn, tb.rn)
18 order by a, b, c
19 /
A B C
-- -- --
A1 B1 C1
A2 B2
B3
SQL>
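To make the rank-and-join idea concrete outside SQL, here is a minimal Python sketch (not part of the original answer). It assumes the rows are tuples with None standing in for NULL; the function name compact_columns is made up for illustration.

```python
from itertools import zip_longest

def compact_columns(rows):
    """Stack the distinct non-null values of each column from the top."""
    columns = zip(*rows)  # transpose: one tuple per column
    # Sorted distinct non-null values per column, like DENSE_RANK() per column.
    ranked = [sorted({v for v in col if v is not None}) for col in columns]
    # Pair values by rank; shorter columns are padded with None,
    # which plays the role of the full outer joins.
    return list(zip_longest(*ranked))

rows = [
    ("A1", None, None), ("A1", None, None), ("A2", None, None),
    (None, "B1", None), (None, "B2", None), (None, "B3", None),
    (None, None, "C1"),
]
result = compact_columns(rows)
for row in result:
    print(row)
```

Each column is ranked independently, so the position of a value in the output depends only on its rank within its own column.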
If each row has a value in only one column, then I think a simpler solution is to enumerate the values and aggregate:
select max(a) as a, max(b) as b, max(c) as c
from (select t.*,
dense_rank() over (partition by (case when a is null then 1 else 2 end),
(case when b is null then 1 else 2 end),
(case when c is null then 1 else 2 end)
order by a, b, c
) as seqnum
from t
) t
group by seqnum;
This only "aggregates" once and only uses one window function, so I think it should have better performance than handling each column individually.
Another approach is to use lateral joins, which are available from Oracle 12c onward, but this assumes that the column types are compatible:
select max(case when which = 'a' then val end) as a,
max(case when which = 'b' then val end) as b,
max(case when which = 'c' then val end) as c
from (select which, val,
dense_rank() over (partition by which order by val) as seqnum
from t cross join lateral
(select 'a' as which, a as val from dual union all
select 'b', b from dual union all
select 'c', c from dual
) x
where val is not null
) t
group by seqnum;
The performance may be comparable, because the subquery removes so many rows.
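The unpivot, rank, and re-aggregate steps of the lateral-join query can be sketched in Python as follows. This is an illustration under assumptions, not the answer's own code: None stands in for NULL, and the name unpivot_rank_pivot and its names parameter are hypothetical.

```python
from collections import defaultdict

def unpivot_rank_pivot(rows, names=("a", "b", "c")):
    # Unpivot: one distinct (which, val) pair per non-null cell.
    pairs = {
        (which, val)
        for row in rows
        for which, val in zip(names, row)
        if val is not None
    }
    # Dense rank of each value within its source column.
    by_rank = defaultdict(dict)
    for which in names:
        values = sorted(val for w, val in pairs if w == which)
        for seqnum, val in enumerate(values, start=1):
            by_rank[seqnum][which] = val
    # Pivot back: one output row per rank (the GROUP BY seqnum step).
    return [
        tuple(by_rank[seqnum].get(which) for which in names)
        for seqnum in sorted(by_rank)
    ]

rows = [
    ("A1", None, None), ("A1", None, None), ("A2", None, None),
    (None, "B1", None), (None, "B2", None), (None, "B3", None),
    (None, None, "C1"),
]
pivoted = unpivot_rank_pivot(rows)
print(pivoted)
```

Because all values pass through a single val slot, they need to be mutually comparable, which mirrors the type-compatibility caveat above.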
I have the below 2 tables:
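assigned
VK  | PC  | A    | B    | RANK
:-- | :-- | :--- | :--- | ---:
VK1 | PC1 | A1   | null |    1
VK2 | PC1 | A1   | A2   |    2
VK3 | PC1 | A2   | null |    3
VK4 | PC1 | A2   | null |    4
VK5 | PC1 | null | A1   |    5
VK6 | PC1 | null | A2   |    6
res
PC  | A  | MAXI
:-- | :- | ---:
PC1 | A1 |    2
PC1 | A2 |    2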
I would like to have the below desired output, based on this logic:
1. If B != A, assign B to C, provided the count of that B value in the preceding rows (ordered by rank) is less than MAXI in table res for that PC and A. If B = A or B is null, assign A to C.
2. After setting column C with the logic in point 1, if the count of any A is less than MAXI in table res, set the first null C to that A until the count of that A equals MAXI in res. Similarly, if the count of any A exceeds MAXI, set C to null for the lowest assigned ranks until the condition is met.
Desired output:
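VK  | PC  | A    | B    | RANK | C
:-- | :-- | :--- | :--- | ---: | :---
VK1 | PC1 | A1   | null |    1 | A1
VK2 | PC1 | A1   | A2   |    2 | A2
VK3 | PC1 | A2   | null |    3 | A2
VK4 | PC1 | A2   | null |    4 | A1
VK5 | PC1 | null | A1   |    5 | null
VK6 | PC1 | null | A2   |    6 | null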
NOTE: row 4 was assigned to A1 instead of A2 because row 2 had to be assigned to A2 and thus row 4 exceeded the quota for A2. Quota for A1 was still 1 (less than 2), so could be assigned to A1. For row 5, there were already 2 A1s assigned (row 1 and 4), so the quota was exceeded and C had to be null.
EDIT: this is what I've tried so far.
with assigned (vk, pc, a, b, r) as(
select 'VK1', 'PC1', 'A1', null, 1 from dual union all
select 'VK2', 'PC1', 'A1', 'A2', 2 from dual union all
select 'VK3', 'PC1', 'A2', null, 3 from dual union all
select 'VK4', 'PC1', 'A2', null, 4 from dual union all
select 'VK5', 'PC1', null, 'A1', 5 from dual union all
select 'VK6', 'PC1', null, 'A2', 6 from dual),
res(pc, a, maxi) as(
select 'PC1', 'A1', 2 from dual union all
select 'PC1', 'A2', 2 from dual
)
, aux AS (
SELECT
a.*,
coalesce(a.b, a.a) d,
COUNT(coalesce(a.b, a.a)) OVER(
PARTITION BY coalesce(a.b, a.a)
ORDER BY
r
) i,
b.maxi
FROM
assigned a
LEFT JOIN res b ON ( b.pc = a.pc
AND b.a = a.a )
)
SELECT
a.*,
case
when i<=maxi then d
else a end c
FROM
aux a
order by r;
Your logic appears to be something like this:
WITH res_rows ( pc, a, maxi, total ) AS (
SELECT pc, a, maxi, SUM( maxi ) OVER( PARTITION BY pc )
FROM res
WHERE maxi > 0
UNION ALL
SELECT pc, a, maxi - 1, total FROM res_rows WHERE maxi > 1
),
p1 ( vk, pc, a, b, r, c, rn, pc_r ) AS (
SELECT a.*,
COALESCE(b, a),
ROW_NUMBER() OVER (PARTITION BY pc, COALESCE(b, a) ORDER BY r),
ROW_NUMBER() OVER (PARTITION BY pc ORDER BY r)
FROM assigned a
),
p2 ( vk, pc, a, b, r, c, rn ) AS (
SELECT p1.vk,
p1.pc,
p1.a,
p1.b,
p1.r,
r.a,
CASE
WHEN r.a IS NULL
THEN ROW_NUMBER() OVER (
PARTITION BY p1.pc
ORDER BY CASE WHEN r.a IS NULL THEN p1.r END
)
ELSE p1.rn
END
FROM p1
LEFT OUTER JOIN res_rows r
ON ( p1.pc = r.pc AND p1.c = r.a AND p1.rn = r.maxi AND p1.pc_r <= total )
),
missing ( pc, a, rn ) AS (
SELECT pc,
a,
ROW_NUMBER() OVER ( PARTITION BY pc ORDER BY ROWNUM )
FROM (
SELECT pc, a, maxi FROM res_rows
MINUS
SELECT pc, c, rn FROM p2 WHERE c IS NOT NULL
)
)
SELECT p2.vk,
p2.pc,
p2.a,
p2.b,
p2.r,
COALESCE( m.a, p2.c ) AS c
FROM p2
LEFT OUTER JOIN missing m
ON ( p2.pc = m.pc AND p2.c IS NULL AND p2.rn = m.rn )
ORDER BY r
Which outputs:
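VK  | PC  | A    | B    | R | C
:-- | :-- | :--- | :--- | -: | :---
VK1 | PC1 | A1   | null | 1 | A1
VK2 | PC1 | A1   | A2   | 2 | A2
VK3 | PC1 | A2   | null | 3 | A2
VK4 | PC1 | A2   | null | 4 | A1
VK5 | PC1 | null | A1   | 5 | null
VK6 | PC1 | null | A2   | 6 | null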
I have the table below. Is it possible to do a cumulative distinct count? For example, if A1 has 3 distinct values, then the count for it will be 3. Next, check A1 and A2 together: if they have 5 distinct values between them, the count is 5. Repeat until A1 + A2 + ... + An, counting the distinct values each time.
A  | V
:- | :-
A1 | V1
A1 | V2
A1 | V2
A2 | V1
A2 | V2
A2 | V3
My expected output would be:
A  | C
:- | -:
A1 | 2
A2 | 3
This answers the original version of the question.
You can aggregate twice . . . once to keep the first occurrence of v and the second to aggregate again:
select a, count(*) as new_cs
from (select v, min(a) as a
from t
group by v
) v
group by a;
Note: The above only shows the values of a that have new values of v. If you want all values of a, then window functions are a better approach:
select a, sum(case when seqnum = 1 then 1 else 0 end) as c
from (select t.*, row_number() over (partition by v order by a) as seqnum
from t
) t
group by a
order by a;
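As a rough cross-check of what these two queries compute, here is a small Python sketch using the question's sample data. The function name new_value_counts is made up for illustration. Note that it counts the values of v that are new at each a, which is not yet a running total.

```python
def new_value_counts(rows):
    """Count, per a, the values of v whose smallest a is that a."""
    first_a = {}
    for a, v in rows:
        # Like MIN(a) ... GROUP BY v: remember the smallest a seen for each v.
        if v not in first_a or a < first_a[v]:
            first_a[v] = a
    # Start every a at zero so that a values with no new v still appear.
    counts = {a: 0 for a, _ in rows}
    for a in first_a.values():
        counts[a] += 1
    return dict(sorted(counts.items()))

rows = [
    ("A1", "V1"), ("A1", "V2"), ("A1", "V2"),
    ("A2", "V1"), ("A2", "V2"), ("A2", "V3"),
]
counts = new_value_counts(rows)
print(counts)  # {'A1': 2, 'A2': 1}
```

A running sum over these per-a counts (2, then 2 + 1) gives the cumulative figures 2 and 3 shown in the question's expected output.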
You can use ROW_NUMBER() window function to find the 1st occurrence of each V and then COUNT() window function to count only these 1st occurrences:
SELECT DISTINCT A,
COUNT(CASE WHEN rn = 1 THEN 1 END) OVER (ORDER BY A) C
FROM (
SELECT A, ROW_NUMBER() OVER (PARTITION BY V ORDER BY A) rn
FROM tablename
) t
ORDER BY A
You can use a partitioned outer join to ensure that all V values are counted for all A values and then use the FIRST_VALUE analytic function to find whether a value exists in the current or preceding A values for the V:
SELECT a,
COUNT( DISTINCT fv ) AS c
FROM (
SELECT t.a,
FIRST_VALUE(t.v) IGNORE NULLS OVER (
PARTITION BY v.v
ORDER BY t.a
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS fv
FROM ( SELECT DISTINCT v FROM table_name ) v
LEFT OUTER JOIN table_name t
PARTITION BY ( t.a )
ON ( t.v = v.v )
)
GROUP BY a
ORDER BY a
Which, for the sample data:
CREATE TABLE table_name ( A, V ) AS
SELECT 'A1', 'V1' FROM DUAL UNION ALL
SELECT 'A1', 'V2' FROM DUAL UNION ALL
SELECT 'A1', 'V3' FROM DUAL UNION ALL
SELECT 'A2', 'V1' FROM DUAL UNION ALL
SELECT 'A2', 'V3' FROM DUAL UNION ALL
SELECT 'A2', 'V4' FROM DUAL UNION ALL
SELECT 'A3', 'V2' FROM DUAL UNION ALL
SELECT 'A3', 'V3' FROM DUAL UNION ALL
SELECT 'A4', 'V1' FROM DUAL UNION ALL
SELECT 'A4', 'V5' FROM DUAL;
Outputs:
A  | C
:- | -:
A1 | 3
A2 | 4
A3 | 4
A4 | 5
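The same cumulative distinct count can be sketched procedurally in Python, which may help when checking a query's result. This is only an illustration: the function name cumulative_distinct is hypothetical, and the rows are the sample data above.

```python
from itertools import groupby

def cumulative_distinct(rows):
    """For each a in ascending order, count distinct v seen so far."""
    seen = set()
    result = []
    # Sort so that groupby sees each a as one contiguous run.
    for a, group in groupby(sorted(rows), key=lambda r: r[0]):
        seen.update(v for _, v in group)
        result.append((a, len(seen)))
    return result

rows = [
    ("A1", "V1"), ("A1", "V2"), ("A1", "V3"),
    ("A2", "V1"), ("A2", "V3"), ("A2", "V4"),
    ("A3", "V2"), ("A3", "V3"),
    ("A4", "V1"), ("A4", "V5"),
]
totals = cumulative_distinct(rows)
print(totals)  # [('A1', 3), ('A2', 4), ('A3', 4), ('A4', 5)]
```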
I have a table like this
A_Count | B_Count | A    | B    | C
------: | ------: | :--- | :--- | :-
      1 |       0 | A    | NULL | C1
      0 |       1 | NULL | B    | C1
      1 |       1 | A    | B    | C2
      1 |       1 | A    | B    | C2
and I want to have a result table (only need to show column A and B) like:
A_Count | B_Count | A | B | C
------: | ------: | :- | :- | :-
      1 |       1 | A | B | C1
      1 |       1 | A | B | C2
      1 |       1 | A | B | C2
So my goal is to merge two rows under the following condition:
both rows belong to the same group C, and they are merged only when one row has A null and the other has B null.
So it's like:
group by C
having sum(A_COUNT) = 1 AND sum(B_COUNT) = 1
But the problem is that I want to keep the rows that are not merged (rows 3 and 4). Can someone tell me how to do that? Many thanks!
You can use a conditional analytic function and GROUP BY as follows:
Select max(a) as a, max(b) as b, c from
(Select a, b, c,
case when nulla = 1 and nullb = 1 and (a is null or b is null)
then 0
else row_number() over (partition by c order by 1)
end as rn
from (Select a, b, c,
count(case when a is null then 1 end) over(partition by c) as nulla,
count(case when b is null then 1 end) over (partition by c) as nullb
From your_table t
)
)
Group by c, rn
DB<>Fiddle Thanks to MT0. Used the sample data from MT0's fiddle.
If you were using Oracle 12 then you could use MATCH_RECOGNIZE:
SELECT a_count, b_count, a, b, c
FROM (
SELECT t.*,
NVL2(
A,
ROW_NUMBER() OVER ( PARTITION BY C ORDER BY NVL2( B, 1, 0 ) DESC, ROWNUM ),
ROW_NUMBER() OVER ( PARTITION BY C ORDER BY NVL2( A, 1, 0 ) DESC, ROWNUM )
) AS rn
FROM table_name t
)
MATCH_RECOGNIZE (
PARTITION BY C
ORDER BY rn, A NULLS LAST
MEASURES
FIRST( a_count ) AS a_count,
LAST( b_count ) AS b_count,
FIRST( a ) AS a,
LAST( b ) AS b
PATTERN ( a b? )
DEFINE
a AS a.a IS NOT NULL,
b AS a.b IS NULL AND b.a IS NULL AND b.b IS NOT NULL
)
Before that Oracle version, you can get a similar effect using analytic functions to determine which rows to aggregate:
SELECT SUM( a_count ) AS a_count,
SUM( b_count ) AS b_count,
MAX( a ) AS a,
MAX( b ) AS b,
c
FROM (
SELECT t.*,
NVL2(
A,
ROW_NUMBER() OVER ( PARTITION BY C ORDER BY NVL2( B, 1, 0 ) DESC, ROWNUM ),
ROW_NUMBER() OVER ( PARTITION BY C ORDER BY NVL2( A, 1, 0 ) DESC, ROWNUM )
) AS rn
FROM table_name t
)
GROUP BY c, rn
Which, for the sample data (in an unordered state, with additional rows to demonstrate grouping additional pairs of rows):
CREATE TABLE table_name ( A_Count, B_Count, A, B, C ) AS
SELECT 1, 0, 'A', NULL, 'C1' FROM DUAL UNION ALL
SELECT 0, 1, NULL, 'B', 'C1' FROM DUAL UNION ALL
SELECT 1, 1, 'A', 'B', 'C2' FROM DUAL UNION ALL
SELECT 0, 1, NULL, 'B', 'C2' FROM DUAL UNION ALL -- Added row
SELECT 1, 0, 'A', NULL, 'C2' FROM DUAL UNION ALL -- Added row
SELECT 1, 0, 'A', NULL, 'C2' FROM DUAL UNION ALL -- Added row
SELECT 1, 1, 'A', 'B', 'C2' FROM DUAL UNION ALL
SELECT 0, 1, NULL, 'B', 'C2' FROM DUAL -- Added row
Both output:
A_COUNT | B_COUNT | A | B | C
------: | ------: | :- | :- | :-
1 | 1 | A | B | C1
1 | 1 | A | B | C2
1 | 1 | A | B | C2
1 | 1 | A | B | C2
1 | 1 | A | B | C2
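The pairing idea can also be sketched in Python. This is a simplified illustration rather than a line-by-line port of either query: it assumes None stands in for NULL, pairs the k-th A-only row with the k-th B-only row inside each group C, and leaves everything else untouched. The name merge_half_rows is made up.

```python
from collections import defaultdict
from itertools import zip_longest

def merge_half_rows(rows):
    """rows are (a_count, b_count, a, b, c) tuples."""
    groups = defaultdict(lambda: {"other": [], "a_only": [], "b_only": []})
    for row in rows:
        _, _, a, b, c = row
        if a is not None and b is None:
            kind = "a_only"
        elif a is None and b is not None:
            kind = "b_only"
        else:
            kind = "other"  # complete rows are never merged
        groups[c][kind].append(row)

    merged = []
    for c in sorted(groups):
        g = groups[c]
        merged.extend(g["other"])
        for left, right in zip_longest(g["a_only"], g["b_only"]):
            if left is None or right is None:
                merged.append(left or right)  # no partner: keep as is
            else:
                merged.append(
                    (left[0] + right[0], left[1] + right[1], left[2], right[3], c)
                )
    return merged

rows = [
    (1, 0, "A", None, "C1"), (0, 1, None, "B", "C1"),
    (1, 1, "A", "B", "C2"), (0, 1, None, "B", "C2"),
    (1, 0, "A", None, "C2"), (1, 0, "A", None, "C2"),
    (1, 1, "A", "B", "C2"), (0, 1, None, "B", "C2"),
]
merged = merge_half_rows(rows)
for row in merged:
    print(row)
```

On the sample data this yields one merged row for C1 and four rows for C2 (two untouched, two merged), the same five rows as the table above.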
You can do this with join:
select (t1.a_count + coalesce(t2.a_count, 0)) as a_count,
(t1.b_count + coalesce(t2.b_count, 0)) as b_count,
coalesce(t1.a, t2.a) as a,
coalesce(t1.b, t2.b) as b,
t1.c
from t t1 left join
t t2
on t1.c = t2.c and
t1.a is not null and t2.a is null and
t1.b is null and t2.b is not null
where t1.a is not null;
As you've described the problem, aggregation doesn't seem necessary.
I've got a table A with 3 columns that contain the same kind of data, for example:
TABLE A
KEY COL1 COL2 COL3
1 A B C
2 B C null
3 A null null
4 D E F
5 null C B
6 B C A
7 D E F
As a result I expect the distinct rows of this table, where the order of the values within a row doesn't matter. So keys 1 and 6 are the same, as are 2 and 5, and 4 and 7. The rest are different.
Of course, I can't use a DISTINCT in my select; that would only filter 4 and 7.
I could use a very complex case statement, or a select in a select with an order by. But this needs to be used in a conversion, so performance is an issue here.
Does anyone have a good performant way to do this?
The result I expect
COL1 COL2 COL3
A B C
B C null
A null null
D E F
If you can have many columns then you can UNPIVOT then order the values and then PIVOT and take the DISTINCT rows:
Oracle Setup:
CREATE TABLE table_name ( KEY, COL1, COL2, COL3 ) AS
SELECT 1, 'A', 'B', 'C' FROM DUAL UNION ALL
SELECT 2, 'B', 'C', null FROM DUAL UNION ALL
SELECT 3, 'A', null, null FROM DUAL UNION ALL
SELECT 4, 'D', 'E', 'F' FROM DUAL UNION ALL
SELECT 5, null, 'C', 'B' FROM DUAL UNION ALL
SELECT 6, 'B', 'C', 'A' FROM DUAL UNION ALL
SELECT 7, 'D', 'E', 'F' FROM DUAL
Query:
SELECT DISTINCT
COL1, COL2, COL3
FROM (
SELECT key,
value,
ROW_NUMBER() OVER ( PARTITION BY key ORDER BY value ) AS rn
FROM table_name
UNPIVOT ( value FOR name IN ( COL1, COL2, COL3 ) ) u
)
PIVOT ( MAX( value ) FOR rn IN (
1 AS COL1,
2 AS COL2,
3 AS COL3
) )
Output:
COL1 | COL2 | COL3
:--- | :--- | :---
A | B | C
B | C | null
D | E | F
A | null | null
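The unpivot, sort, and pivot sequence amounts to normalising each row before comparing rows. A minimal Python sketch of that idea, assuming None stands in for NULL and leaving out the KEY column (the name distinct_unordered and the width parameter are made up for illustration):

```python
def distinct_unordered(rows, width=3):
    """Return rows that differ once their non-null values are sorted."""
    seen = set()
    result = []
    for row in rows:
        # Sort the non-null values, then pad with None on the right,
        # like pivoting ROW_NUMBER() 1..3 back into COL1..COL3.
        values = sorted(v for v in row if v is not None)
        normalized = tuple(values + [None] * (width - len(values)))
        if normalized not in seen:
            seen.add(normalized)
            result.append(normalized)
    return result

rows = [
    ("A", "B", "C"), ("B", "C", None), ("A", None, None), ("D", "E", "F"),
    (None, "C", "B"), ("B", "C", "A"), ("D", "E", "F"),
]
unique_rows = distinct_unordered(rows)
for row in unique_rows:
    print(row)
```

Unlike SELECT DISTINCT, this sketch keeps the rows in first-seen order, which is why its row order differs from the query output above.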
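The complicated case expression is going to have the best performance, but the simplest method is going to be conditional aggregation. The query below returns one row per key with its values sorted; select DISTINCT col1, col2, col3 from its result to collapse the duplicates: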
select key,
max(case when seqnum = 1 then col end) as col1,
max(case when seqnum = 2 then col end) as col2,
max(case when seqnum = 3 then col end) as col3
from (select key,col,
row_number() over (partition by key order by col asc) as seqnum
from ((select key, col1 as col from t) union all
(select key, col2 as col from t) union all
(select key, col3 as col from t)
) kc
where col is not null
) kc
group by key;
I have rows that look like this:
OrderNo OrderStatus SomeOtherColumn
A 1
A 1
A 3
B 1 X
B 1 Y
C 2
C 3
D 2
I want to return all orders that have only one possible value of OrderStatus. For example, here order B has only order status 1, so the result should be:
B 1 X
B 1 Y
Notes:
Rows can be duplicated with the same order status, e.g. B here.
I am interested in orders that have one particular status (e.g. 1 here) and no other status. So if B had a status of 3 at any point in time, it would be disqualified.
You can use not exists:
select t.*
from t
where not exists (select 1
from t t2
where t.orderno = t2.orderno and t.OrderStatus <> t2.OrderStatus
);
If you just want the orders where this is true, you can use group by and having:
select orderno
from t
group by orderno
having min(OrderStatus) = max(OrderStatus);
If you only want a status of 1 then add max(OrderStatus) = 1 to the having clause.
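For comparison, the "exactly one distinct status" test can be sketched in Python. This is an illustration only: the function name orders_with_only is hypothetical, None stands in for NULL in the third column, and a missing status is not handled.

```python
from collections import defaultdict

def orders_with_only(rows, wanted_status):
    """Keep rows of orders whose only status is wanted_status."""
    statuses = defaultdict(set)
    for orderno, status, _ in rows:
        statuses[orderno].add(status)
    # A single distinct status is the same test as MIN(status) = MAX(status).
    return [row for row in rows if statuses[row[0]] == {wanted_status}]

rows = [
    ("A", 1, None), ("A", 1, None), ("A", 3, None),
    ("B", 1, "X"), ("B", 1, "Y"),
    ("C", 2, None), ("C", 3, None),
    ("D", 2, None),
]
matching = orders_with_only(rows, 1)
print(matching)  # [('B', 1, 'X'), ('B', 1, 'Y')]
```

Order D also has a single status, but it is 2, so it is excluded when the wanted status is 1.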
Here is one way to do it. It does not handle the case where the status can be NULL; if that is possible, you will need to explain how you want it handled.
SQL> create table test_data ( orderno, status, othercol ) as (
2 select 'A', 1, null from dual union all
3 select 'A', 1, null from dual union all
4 select 'A', 3, null from dual union all
5 select 'B', 1, 'X' from dual union all
6 select 'B', 1, 'Y' from dual union all
7 select 'C', 2, null from dual union all
8 select 'C', 3, null from dual union all
9 select 'D', 2, null from dual
10 );
Table created.
SQL> variable input_status number
SQL> exec :input_status := 1
PL/SQL procedure successfully completed.
SQL> column orderno format a8
SQL> column othercol format a8
SQL> select orderno, status, othercol
2 from (
3 select t.*, count(distinct status) over (partition by orderno) as cnt
4 from test_data t
5 )
6 where status = :input_status
7 and cnt = 1
8 ;
ORDERNO STATUS OTHERCOL
-------- ---------- --------
B 1 X
B 1 Y
One way to handle NULL status (if that may happen), if in that case the orderno should be rejected (not included in the output), is to define the cnt differently:
count(case when status != :input_status or status is null then 1 end)
over (partition by orderno) as cnt
and in the outer query change the WHERE clause to a single condition,
where cnt = 0
Count the distinct OrderStatus values partitioned by OrderNo and show only the rows where that count equals one:
select OrderNo, OrderStatus, SomeOtherColumn
from ( select t.*, count(distinct orderstatus) over (partition by orderno) cnt
from t )
where cnt = 1
Just wanted to add something to Gordon's answer, using a stats function:
select orderno
from t
group by orderno
having variance(orderstatus) = 0;