I am not sure how to write the right query.
To get the reference code description, the 'car_col' domain should supply it; only if 'car_col' has no matching code should 'car_cat' fill in, so that the two domains do not produce duplicate rows.
select
  a.customer_id,
  a.car_code,
  b.description as code_desc,
  a.price
from product a
left join (select * from reference_codes where domain in ('car_col', 'car_cat')) b
  on a.car_code = b.code
Output result:
customer_ID  code  description
-----------  ----  -----------
123          12    blue
123          23    black
345          45    red
345          45    red
678          67    green
678          24    yellow
908          45    red
908          70    purple
As you can see, customer 345 gets a duplicate row for code 45 (red).
The reference tables are below:
select *
from reference_codes
where domain = 'car_col'
DOMAIN   CODE  DESCRIPTION
-------  ----  -----------
car_col  12    blue
car_col  23    black
car_col  45    red
car_col  67    green
select *
from reference_codes
where domain = 'car_cat'
DOMAIN   CODE  DESCRIPTION
-------  ----  -----------
car_cat  24    yellow
car_cat  45    red
car_cat  70    purple
car_cat  90    row
I want this output result:
customer_ID  code  description
-----------  ----  -----------
123          12    blue
123          23    black
345          45    red
678          67    green
678          24    yellow
908          45    red
908          70    purple
I am using ORACLE SQL
thank you
Hard to give a concrete answer without really seeing all the data or the model, but I'm interpreting your requirement as:
"If I can't find a join on a customer to CAR_COL then fall back to using CAR_CAT"
Taking your existing code, we can add a column to "prioritise" the data, e.g.:
select *
from (
  select
    a.customer_id,
    a.car_code,
    b.description as code_desc,
    a.price,
    row_number() over (partition by customer_id, car_code
                       order by case when domain = 'car_col' then 1 else 2 end) as rating
  from product a
  left join (select * from reference_codes
             where domain in ('car_col', 'car_cat')) b
    on a.car_code = b.code
)
where rating = 1
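For anyone who wants to try the prioritising idea without the real tables, here is a minimal self-contained sketch. The sample rows in the two CTEs are assumptions loosely based on the tables in the question (the prices are invented), and the column names follow the question's query rather than any confirmed schema.

```sql
-- Sample data is assumed for illustration only.
with reference_codes (domain, code, description) as (
  select 'car_col', 12, 'blue'   from dual union all
  select 'car_col', 45, 'red'    from dual union all
  select 'car_cat', 45, 'red'    from dual union all
  select 'car_cat', 70, 'purple' from dual
),
product (customer_id, car_code, price) as (
  select 123, 12, 100 from dual union all
  select 345, 45, 120 from dual union all
  select 908, 70, 130 from dual
)
select customer_id, car_code, code_desc, price
from (
  select a.customer_id,
         a.car_code,
         b.description as code_desc,
         a.price,
         -- 'car_col' rows sort first, so they win when both domains match
         row_number() over (partition by a.customer_id, a.car_code
                            order by case when b.domain = 'car_col' then 1 else 2 end) as rating
  from product a
  left join reference_codes b
    on a.car_code = b.code
   and b.domain in ('car_col', 'car_cat')
)
where rating = 1
order by customer_id, car_code;
-- Returns one row per customer/code:
--   123  12  blue    100
--   345  45  red     120   (taken from 'car_col')
--   908  70  purple  130   (falls back to 'car_cat')
```

Note that partitioning by customer_id and car_code also collapses genuinely repeated product rows for the same customer and code into one, which may or may not be what you want.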
If your sample data is something like here:
WITH
reference_codes (DOMAIN, CAR_CODE, DESCRIPTION) AS
(
Select 'car_col', 12, 'blue' From Dual Union All
Select 'car_col', 23, 'black' From Dual Union All
Select 'car_col', 45, 'red' From Dual Union All
Select 'car_col', 67, 'green' From Dual Union All
Select 'car_cat', 24, 'yellow' From Dual Union All
Select 'car_cat', 45, 'red' From Dual Union All
Select 'car_cat', 70, 'purple' From Dual Union All
Select 'car_cat', 90, 'row' From Dual
),
product (CUSTOMER_ID, CODE, PRICE) AS
(
Select 123, 12, 100 From Dual Union All
Select 123, 23, 100 From Dual Union All
Select 345, 45, 120 From Dual Union All
Select 345, 45, 120 From Dual Union All
Select 678, 67, 110 From Dual Union All
Select 678, 24, 110 From Dual Union All
Select 908, 45, 130 From Dual Union All
Select 908, 70, 130 From Dual
)
... then your query returns the result you showed above. Your query also selects the PRICE column, but its values are not shown in your question, so the prices below are assumed.
CUSTOMER_ID CODE CODE_DESC PRICE
----------- ---------- --------- ----------
123 12 blue 100
123 23 black 100
345 45 red 120 -- one row less if 'car_col' is missing
345 45 red 120
345 45 red 120
345 45 red 120
678 24 yellow 110
678 67 green 110
908 70 purple 130
908 45 red 130
908 45 red 130
To get your expected result (if there are no different prices) you could just add DISTINCT keyword to get what you want:
Select DISTINCT
... ... ... your query
CUSTOMER_ID CODE CODE_DESC PRICE
----------- ---------- --------- ----------
123 12 blue 100
123 23 black 100
345 45 red 120
678 24 yellow 110
678 67 green 110
908 45 red 130
908 70 purple 130
... and if there are different prices, then you could use aggregation with GROUP BY, like below:
Select
a.CUSTOMER_ID ,
a.CODE ,
b.DESCRIPTION "CODE_DESC",
AVG(a.PRICE) "AVG_PRICE"
From
product a
Left Join
( Select * From reference_codes Where DOMAIN IN('car_col', 'car_cat') ) b
ON(a.CODE = b.CAR_CODE)
Group By
a.CUSTOMER_ID ,
a.CODE ,
b.DESCRIPTION
Order By
a.CUSTOMER_ID
CUSTOMER_ID CODE CODE_DESC AVG_PRICE
----------- ---------- --------- ----------
123 12 blue 100
123 23 black 100
345 45 red 120
678 24 yellow 110
678 67 green 110
908 45 red 130
908 70 purple 130
Your query adjusted (depending on PRICE) either by DISTINCT or aggregation will work ok even when 'car_col' is missing:
WITH
reference_codes (DOMAIN, CAR_CODE, DESCRIPTION) AS
(
Select 'car_col', 12, 'blue' From Dual Union All
Select 'car_col', 23, 'black' From Dual Union All
--Select 'car_col', 45, 'red' From Dual Union All
Select 'car_col', 67, 'green' From Dual Union All
Select 'car_cat', 24, 'yellow' From Dual Union All
Select 'car_cat', 45, 'red' From Dual Union All
Select 'car_cat', 70, 'purple' From Dual Union All
Select 'car_cat', 90, 'row' From Dual
To get the desired output, you can rank the joined rows with the ROW_NUMBER function, using a CASE expression as its sort key, and keep only the first row of each group:
select customer_id, car_code, code_desc, price
from (
  select
    a.customer_id,
    a.car_code,
    b.description as code_desc,
    a.price,
    row_number() over (partition by a.customer_id, a.car_code
                       order by case when b.domain = 'car_col' then 1 else 2 end) as rn
  from product a
  left join (select * from reference_codes where domain in ('car_col', 'car_cat')) b
    on a.car_code = b.code
)
where rn = 1
The ROW_NUMBER function assigns a sequential number to each row within a partition (in this case, partitioned by customer_id and car_code). The CASE expression in its ORDER BY sorts 'car_col' rows ahead of 'car_cat' rows, so row number 1 for each customer_id and car_code combination is the 'car_col' description when one exists and the 'car_cat' description otherwise. The outer WHERE rn = 1 keeps only that row, which removes the duplicate. The ranking has to be computed in a subquery because an analytic function cannot be used in a WHERE or GROUP BY clause at the same query level.
I have three worksheets: players, teams, and weights (how highly a particular attribute is weighted when determining player-team match).
Players
| Name | Age | Height | Free_Throw_Perc | ... |
|------|-----|--------|-----------------|-----|
| Bod  | 23  | 74     | 62              | ... |
Teams
| Team_Name | Age | Height | Free_Throw_Perc | ... |
|-----------|-----|--------|-----------------|-----|
|Team1|23|78|62|...|
Weights
| Team_Name | Age | Height | Free_Throw_Perc | ... |
|:---------:|:---:|:------:|:---------------:|:---:|
| Team1 | 5 | 10 | 10 | ... |
CREATE TABLE players (name, age, height, free_throw_perc) AS
SELECT 'Alice', 20, 160, 90 FROM DUAL UNION ALL
SELECT 'Betty', 21, 165, 80 FROM DUAL UNION ALL
SELECT 'Carol', 22, 170, 70 FROM DUAL UNION ALL
SELECT 'Debra', 23, 175, 60 FROM DUAL UNION ALL
SELECT 'Emily', 24, 180, 50 FROM DUAL UNION ALL
SELECT 'Fiona', 25, 185, 40 FROM DUAL UNION ALL
SELECT 'Gerri', 26, 190, 30 FROM DUAL UNION ALL
SELECT 'Heidi', 27, 195, 20 FROM DUAL UNION ALL
SELECT 'Irene', 28, 200, 10 FROM DUAL;
CREATE TABLE teams (team_name, age, height, free_throw_perc) AS
SELECT 'ALPHA', 20,175,90 FROM DUAL;
CREATE TABLE weights (team_name, age, height, free_throw_perc) AS
SELECT 'ALPHA', 5,10,10 FROM DUAL;
The teams table corresponds to the players table but contains a record for each team detailing their ideal player based on the current composition of the team. The weights table contains a record for each team with an integer value weight stating how much they care about each player attribute. I am trying to compute a total match score for each player-team combination. I was able to do this quite easily with python but am struggling to accomplish the same in SQL.
In Python this would be a simple for loop with logical operators comparing each cell of one dataframe to each cell of another, but the lack of positional referencing in SQL makes this a lot trickier to do and generalize (be able to use the same queries for other pairs of tables with different attributes).
So far I have
BEGIN
FOR c in (SELECT column_name FROM all_tab_columns WHERE table_name = 'teams')
LOOP
INSERT INTO match_table (players.Name, candidates.c)
SELECT players.Name, players.c WHERE players.c = teams.c
END LOOP;
BEGIN
FOR c IN (SELECT column_name FROM all_tab_columns WHERE table_name = 'weights')
LOOP
UPDATE match_table
SET match_table.c = (SELECT weights.c FROM weights WHERE match_table.c = weights.c)
END LOOP;
From what I can tell that will generate a table of player names with a single column corresponding to a match to a team attribute populated by the corresponding weight and all other columns full of null values. If that is the case, I can group by name to create a singular record with all matches and corresponding weights.
The script should loop through each player and team and compare the attributes of the player with those desired by the team. Where there is a match, a new row should be added to the match_table with the player name and null values except for the column that matched. That should be done for each player-team attribute match. Then those matches should be replaced by the corresponding weight from the weights table. I would then like to sum those to get a total match score. I can't use the '+' operator because the column names will vary. They will always match between the three tables, but the attributes of interest will vary.
The expected output would look something like:
| players.name | Age  | Height | Free_Throw_Perc | ... |
|--------------|------|--------|-----------------|-----|
| 'Alice'      | 5    | NULL   | NULL            | ... |
| 'Alice'      | NULL | 10     | NULL            | ... |
How would I then sum across each record to find the total match score of each candidate for a team?
If you have the sample data:
CREATE TABLE teams ( id, name ) AS
SELECT 1, 'Alpha' FROM DUAL UNION ALL
SELECT 2, 'Beta' FROM DUAL UNION ALL
SELECT 3, 'Gamma' FROM DUAL;
CREATE TABLE players (name, team, age, height, free_throw_perc) AS
SELECT 'Alice', 1, 20, 160, 90 FROM DUAL UNION ALL
SELECT 'Betty', 1, 21, 165, 80 FROM DUAL UNION ALL
SELECT 'Carol', 1, 22, 170, 70 FROM DUAL UNION ALL
SELECT 'Debra', 2, 23, 175, 60 FROM DUAL UNION ALL
SELECT 'Emily', 2, 24, 180, 50 FROM DUAL UNION ALL
SELECT 'Fiona', 2, 25, 185, 40 FROM DUAL UNION ALL
SELECT 'Gerri', 3, 26, 190, 30 FROM DUAL UNION ALL
SELECT 'Heidi', 3, 27, 195, 20 FROM DUAL UNION ALL
SELECT 'Irene', 3, 28, 200, 10 FROM DUAL;
CREATE TABLE weights(team, key, weight) AS
SELECT 1, 'AGE', 1.0 FROM DUAL UNION ALL
SELECT 1, 'HEIGHT', 0.5 FROM DUAL UNION ALL
SELECT 1, 'FREE_THROW_PERC', 0.2 FROM DUAL UNION ALL
SELECT 2, 'AGE', 0.0 FROM DUAL UNION ALL
SELECT 2, 'HEIGHT', 1.0 FROM DUAL UNION ALL
SELECT 2, 'FREE_THROW_PERC', 0.8 FROM DUAL UNION ALL
SELECT 3, 'AGE', 0.5 FROM DUAL UNION ALL
SELECT 3, 'HEIGHT', 0.5 FROM DUAL UNION ALL
SELECT 3, 'FREE_THROW_PERC', 1.0 FROM DUAL;
And you want to insert the sum of the weight column from the weights table multiplied by the respective value in the players table into the following table:
CREATE TABLE match_table(
team INT,
value NUMBER
);
Then you can use the following INSERT query:
INSERT INTO match_table (team, value)
SELECT p.team,
SUM(p.value * w.weight)
FROM ( SELECT name, team, key, value
FROM players
UNPIVOT ( value FOR key IN (age, height, free_throw_perc) )
) p
INNER JOIN weights w
ON ( p.team = w.team AND p.key = w.key )
GROUP BY p.team
Then the table will contain the weighted totals:
| TEAM | VALUE |
|------|-------|
| 2    | 660   |
| 3    | 393   |
| 1    | 358.5 |
And if your match_table is:
CREATE TABLE match_table(
player VARCHAR2(20),
team INT,
age NUMBER,
height NUMBER,
free_throw_perc NUMBER,
total NUMBER
);
Then you can use the query (and calculate the total with the + operator):
INSERT INTO match_table (player, team, age, height, free_throw_perc, total)
SELECT p.name,
p.team,
p.age * w.age_weight,
p.height * w.height_weight,
p.free_throw_perc * w.free_throw_perc_weight,
p.age * w.age_weight
+ p.height * w.height_weight
+ p.free_throw_perc * w.free_throw_perc_weight
FROM players p
INNER JOIN (
SELECT *
FROM weights
PIVOT (
MAX(weight)
FOR key IN (
'AGE' AS age_weight,
'HEIGHT' AS height_weight,
'FREE_THROW_PERC' AS free_throw_perc_weight
)
)
) w
ON (p.team = w.team)
Which gives the values:
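| PLAYER | TEAM | AGE  | HEIGHT | FREE_THROW_PERC | TOTAL |
|--------|------|------|--------|-----------------|-------|
| Alice  | 1    | 20   | 80     | 18              | 118   |
| Betty  | 1    | 21   | 82.5   | 16              | 119.5 |
| Carol  | 1    | 22   | 85     | 14              | 121   |
| Debra  | 2    | 0    | 175    | 48              | 223   |
| Emily  | 2    | 0    | 180    | 40              | 220   |
| Fiona  | 2    | 0    | 185    | 32              | 217   |
| Gerri  | 3    | 13   | 95     | 30              | 138   |
| Heidi  | 3    | 13.5 | 97.5   | 20              | 131   |
| Irene  | 3    | 14   | 100    | 10              | 124   |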
Or, if the players are uncorrelated to a team then:
INSERT INTO match_table (player, team, age, height, free_throw_perc, total)
SELECT p.name,
w.team,
p.age * w.age_weight,
p.height * w.height_weight,
p.free_throw_perc * w.free_throw_perc_weight,
p.age * w.age_weight
+ p.height * w.height_weight
+ p.free_throw_perc * w.free_throw_perc_weight
FROM players p
CROSS JOIN (
SELECT *
FROM weights
PIVOT (
MAX(weight)
FOR key IN (
'AGE' AS age_weight,
'HEIGHT' AS height_weight,
'FREE_THROW_PERC' AS free_throw_perc_weight
)
)
) w
Which, for the sample data, outputs:
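| PLAYER | TEAM | AGE  | HEIGHT | FREE_THROW_PERC | TOTAL |
|--------|------|------|--------|-----------------|-------|
| Alice  | 1    | 20   | 80     | 18              | 118   |
| Alice  | 2    | 0    | 160    | 72              | 232   |
| Alice  | 3    | 10   | 80     | 90              | 180   |
| Betty  | 1    | 21   | 82.5   | 16              | 119.5 |
| Betty  | 2    | 0    | 165    | 64              | 229   |
| Betty  | 3    | 10.5 | 82.5   | 80              | 173   |
| Carol  | 1    | 22   | 85     | 14              | 121   |
| Carol  | 2    | 0    | 170    | 56              | 226   |
| Carol  | 3    | 11   | 85     | 70              | 166   |
| Debra  | 1    | 23   | 87.5   | 12              | 122.5 |
| Debra  | 2    | 0    | 175    | 48              | 223   |
| Debra  | 3    | 11.5 | 87.5   | 60              | 159   |
| Emily  | 1    | 24   | 90     | 10              | 124   |
| Emily  | 2    | 0    | 180    | 40              | 220   |
| Emily  | 3    | 12   | 90     | 50              | 152   |
| Fiona  | 1    | 25   | 92.5   | 8               | 125.5 |
| Fiona  | 2    | 0    | 185    | 32              | 217   |
| Fiona  | 3    | 12.5 | 92.5   | 40              | 145   |
| Gerri  | 1    | 26   | 95     | 6               | 127   |
| Gerri  | 2    | 0    | 190    | 24              | 214   |
| Gerri  | 3    | 13   | 95     | 30              | 138   |
| Heidi  | 1    | 27   | 97.5   | 4               | 128.5 |
| Heidi  | 2    | 0    | 195    | 16              | 211   |
| Heidi  | 3    | 13.5 | 97.5   | 20              | 131   |
| Irene  | 1    | 28   | 100    | 2               | 130   |
| Irene  | 2    | 0    | 200    | 8               | 208   |
| Irene  | 3    | 14   | 100    | 10              | 124   |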
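If the goal is closer to what the question describes (add a team's weight only when the player's attribute equals the team's ideal value), the same UNPIVOT idea can be applied to all three of the question's tables. The following is only a sketch under that reading of the requirement: the CTE names player_attrs, team_attrs and weight_attrs, the columns attr and val, and the two-player sample are made up for illustration.

```sql
-- Tables shaped like the question's players / teams / weights (sample rows assumed).
with players (name, age, height, free_throw_perc) as (
  select 'Alice', 20, 160, 90 from dual union all
  select 'Debra', 23, 175, 60 from dual
),
teams (team_name, age, height, free_throw_perc) as (
  select 'ALPHA', 20, 175, 90 from dual
),
weights (team_name, age, height, free_throw_perc) as (
  select 'ALPHA', 5, 10, 10 from dual
),
-- Turn each wide table into (owner, attribute name, value) rows.
player_attrs as (
  select name, attr, val
  from players
  unpivot (val for attr in (age, height, free_throw_perc))
),
team_attrs as (
  select team_name, attr, val
  from teams
  unpivot (val for attr in (age, height, free_throw_perc))
),
weight_attrs as (
  select team_name, attr, weight
  from weights
  unpivot (weight for attr in (age, height, free_throw_perc))
)
select p.name,
       t.team_name,
       -- a weight counts only when the player's value equals the team's ideal
       sum(case when p.val = t.val then w.weight else 0 end) as match_score
from player_attrs p
join team_attrs t
  on t.attr = p.attr
join weight_attrs w
  on w.team_name = t.team_name
 and w.attr = t.attr
group by p.name, t.team_name
order by p.name, t.team_name;
-- Alice  ALPHA  15   (age and free_throw_perc match: 5 + 10)
-- Debra  ALPHA  10   (only height matches)
```

Once the data is in long form, the sum no longer depends on the column names, but the UNPIVOT ... IN lists still have to name the columns; making those lists fully dynamic would need dynamic SQL.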
I'm trying to count occurrences of values in a specific column.
cus_id prod_id income
100 10 90
100 10 80
100 20 110
122 20 9
122 30 10
When doing the query, I would like to receive something like this:
cus_id count(prod_id = 10) (prod_id = 20) (prod_id = 30) sum(income)
100 2 1 0 280
122 0 1 1 19
At the moment my initial approach is this:
select cus_id, prod_id, count(prod_id), sum(income) from t group by 1,2
Any insights would be highly appreciated. Thanks in advance!
Oracle SQL
with t (cus_id, prod_id, income) as (
select 100, 10, 90 from dual union all
select 100, 10, 80 from dual union all
select 100, 20, 110 from dual union all
select 122, 20, 9 from dual union all
select 122, 30, 10 from dual)
select
cus_id,
count(case when prod_id = 10 then 1 end) cnt_prod_10,
count(case when prod_id = 20 then 1 end) cnt_prod_20,
count(case when prod_id = 30 then 1 end) cnt_prod_30,
sum(income) sum_income
from t
group by cus_id;
CUS_ID CNT_PROD_10 CNT_PROD_20 CNT_PROD_30 SUM_INCOME
---------- ----------- ----------- ----------- ----------
122 0 1 1 19
100 2 1 0 280
SQL>
https://dbfiddle.uk/XZs56Hks
I don't know if the title describes my requirements. I have a table C_Bpartner (with C_BPartner_ID as a Primary Key) for employees like this:
name Hiringorderno Orderdate C_Bpartner_ID
A 30 25/02/2002 100
B 47 13/10/2005 101
D 110 22/09/2010 105
and other tables like emp_training:
C_Bpartner_ID TrainingOrderno Orderdate
100 46 14/05/2012
100 58 10/07/2013
101 76 22/10/2015
and emp_penalty:
C_Bpartner_ID PenaltyOrderno Orderdate
105 133 14/05/2012
101 153 25/03/2018
I want the resulting table to be like:
name orderno Orderdate C_Bpartner_ID
A 30 25/02/2002 100
A 46 14/05/2012 100
A 58 10/07/2013 100
B 47 13/10/2005 101
B 76 22/10/2015 101
B 153 25/03/2018 101
D 110 22/09/2010 105
D 133 14/05/2012 105
So I joined C_BPartner with itself and used COALESCE, hoping to get a second record for the same C_BPartner_ID. I took Hiringorderno from C_BPartner bp, then joined C_BPartner pp with emp_penalty pt (as an example) to get PenaltyOrderno,
and combined them with coalesce(bp.Hiringorderno, pt.PenaltyOrderno), planning to do the same for all the other tables and for Orderdate as well. But it doesn't duplicate records: it picks the first non-null COALESCE argument and discards the other, like this:
coalesce(bp.name,pp.name) coalesce(bp.Hiringorderno,pt.PenaltyOrderno) Hiringorderno PenaltyOrderno
A 30 30 null
B 47 47 153
the emp_penalty record for B is not there.
There are other ways to do this, but I think the clearest and most intuitive way is to UNION the 3 queries that you're trying to do.
select name, hiringorderno as orderno, orderdate, C_Bpartner_ID, 'HIRING' as ordertype, null as emp_penalty_ID
from C_Bpartner
union all
select bp.name, trainingorderno, t.orderdate, bp.C_Bpartner_ID, 'TRAINING', null
from emp_training t
join C_Bpartner bp
on bp.C_Bpartner_ID = t.C_Bpartner_ID
union all
select bp.name, PenaltyOrderno, p.orderdate, bp.C_Bpartner_ID, 'PENALTY', p.emp_penalty_ID
from emp_penalty p
join C_Bpartner bp
on bp.C_Bpartner_ID = p.C_Bpartner_ID
;
Edit: I added 2 columns to show 2 common ways to differentiate the union'ed records.
One way is to add a constant string or number to each select statement - that way you can use CASE WHEN ordertype = 'PENALTY' ... or WHERE ordertype = 'TRAINING' to filter your records.
Another way, like you mentioned, is to fill in a column for one of the selects, like emp_penalty_id, but set it to null for the other select statements.
All the select statements being unioned together need to have the same number of columns, with compatible types. The first select statement defines the column names and types for the rest, which is why I didn't need to add column aliases to the second and third selects.
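As a small illustration of those rules and of the constant-label technique, here is a sketch that runs against DUAL only. The three rows are made-up stand-ins for the real tables, not data from the question's schema.

```sql
-- Column names (name, orderno, ordertype) come from the first SELECT only;
-- the later SELECTs just have to supply the same number of compatible columns.
select name, orderno, ordertype
from (
  select 'A' as name, 30 as orderno, 'HIRING' as ordertype from dual
  union all
  select 'A', 46, 'TRAINING' from dual
  union all
  select 'B', 153, 'PENALTY' from dual
)
where ordertype <> 'HIRING'   -- the label makes it easy to filter by source
order by orderno;
-- A   46  TRAINING
-- B  153  PENALTY
```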
One option is to union 3 queries:
with
  c_bpartner (name, hiringorderno, orderdate, c_bpartner_id) as
    (select 'A', 30, date '2002-02-25', 100 from dual union all
     select 'B', 47, date '2005-10-13', 101 from dual union all
     select 'D', 110, date '2010-09-22', 105 from dual
    ),
  emp_training (c_bpartner_id, trainingorderno, orderdate) as
    (select 100, 46, date '2012-05-14' from dual union all
     select 100, 58, date '2013-07-10' from dual union all
     select 101, 76, date '2015-10-22' from dual
    ),
  emp_penalty (c_bpartner_id, penaltyorderno, orderdate) as
    (select 105, 133, date '2012-05-14' from dual union all
     select 101, 153, date '2018-03-25' from dual
    )
select c.name, c.hiringorderno as orderno, c.orderdate, c.c_bpartner_id
  from c_bpartner c
union all
select c.name, t.trainingorderno, t.orderdate, t.c_bpartner_id
  from c_bpartner c join emp_training t on t.c_bpartner_id = c.c_bpartner_id
union all
select c.name, p.penaltyorderno, p.orderdate, p.c_bpartner_id
  from c_bpartner c join emp_penalty p on p.c_bpartner_id = c.c_bpartner_id
order by 1, 2;
N ORDERNO ORDERDATE C_BPARTNER_ID
- ---------- ---------- -------------
A 30 25/02/2002 100
A 46 14/05/2012 100
A 58 10/07/2013 100
B 47 13/10/2005 101
B 76 22/10/2015 101
B 153 25/03/2018 101
D 110 22/09/2010 105
D 133 14/05/2012 105
8 rows selected.
SQL>
I want to list the IDs that have ordered different codes. I don't want to list IDs that ordered only one code (i.e. IDs 4 and 5).
ID product code
1 Apple 145
1 Grapes 146
2 Orange 147
2 Apple 145
2 Plum 148
3 Grapes 146
3 Orange 147
4 Grapes 146
5 Orange 147
And I want it to look like this
ID Codes
1 145 | 146
2 147 | 145 | 148
3 146 | 147
Appreciate any help!
You need to GROUP BY id, and the condition on "more than one order" goes into a HAVING clause (because it is a constraint on each group, not on each individual row in the input data). The aggregation is done with LISTAGG.
with
test_data ( id, product, code ) as (
select 1, 'Apple' , 145 from dual union all
select 1, 'Grapes', 146 from dual union all
select 2, 'Orange', 147 from dual union all
select 2, 'Apple' , 145 from dual union all
select 2, 'Plum' , 148 from dual union all
select 3, 'Grapes', 146 from dual union all
select 3, 'Orange', 147 from dual union all
select 4, 'Grapes', 146 from dual union all
select 5, 'Orange', 147 from dual
)
-- End of test data (not part of the solution). Query begins below this line.
select id, listagg(code, ' | ') within group (order by code) as codes
from test_data
group by id
having count(*) > 1
;
ID CODES
-- ---------------
1 145 | 146
2 145 | 147 | 148
3 146 | 147
However, in Oracle 10 you don't have LISTAGG(). Before Oracle 11.2, a common way to get the same result was to use hierarchical queries, something like below:
select id, ltrim(sys_connect_by_path(code, ' | '), ' | ') as codes
from (
select id, code,
row_number() over (partition by id order by code) as rn
from test_data
)
where connect_by_isleaf = 1 and level > 1
connect by rn = prior rn + 1
and prior id = id
and prior sys_guid() is not null
start with rn = 1
;
EDITED:
If repeated CODE for the same ID need to be "distincted" first, then - using the second solution - the following changes are needed, both in the innermost subquery:
change SELECT ID, CODE, ... to SELECT DISTINCT ID, CODE, ...
change ROW_NUMBER() to DENSE_RANK()
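On Oracle 19c and later, LISTAGG accepts DISTINCT directly, so repeated codes can be handled in the first (LISTAGG) solution as well. A minimal sketch, assuming a 19c-or-newer database; the test rows below are invented and deliberately contain repeats:

```sql
-- Requires LISTAGG(DISTINCT ...), available from Oracle 19c.
with
  test_data ( id, product, code ) as (
    select 1, 'Apple' , 145 from dual union all
    select 1, 'Apple' , 145 from dual union all   -- repeated code for id 1
    select 1, 'Grapes', 146 from dual union all
    select 4, 'Grapes', 146 from dual union all
    select 4, 'Grapes', 146 from dual             -- id 4 ordered one code twice
  )
select id,
       listagg(distinct code, ' | ') within group (order by code) as codes
from   test_data
group  by id
having count(distinct code) > 1   -- count different codes, not rows
;
-- Returns a single row: id 1 with codes 145 | 146 (id 4 has only one distinct code).
```

The HAVING condition uses COUNT(DISTINCT code) because COUNT(*) would let ID 4 through when the same code appears on two rows.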
Here is a minimal working example of what I'm trying to do and what I'm getting:
I have a query as follows:
/*
with tran_party as -- ALL DUMMY DATA ARE IN THESE CTE FOR YOUR REFERENCE
(select 1 tran_party_id, 11 transaction_id, 101 team_id_redirect
from dual
union all
select 2, 11, 101 from dual
union all
select 3, 11, 102 from dual
union all
select 4, 12, 103 from dual
union all
select 5, 12, 103 from dual
union all
select 6, 12, 104 from dual
union all
select 7, 13, 104 from dual
union all
select 8, 13, 105 from dual),
tran as
(select 11 transaction_id, 1001 account_id, 1034.93 amount from dual
union all
select 12, 1001, 2321.89 from dual
union all
select 13, 1002, 3201.47 from dual),
account as
(select 1001 account_id, 111 team_id from dual
union all
select 1002, 112 from dual),
team as
(select 101 team_id, 'UUU' as team_code from dual
union all
select 102, 'VV' from dual
union all
select 103, 'WWW' from dual
union all
select 104, 'XXXXX' from dual
union all
select 105, 'Z' from dual)
-- */
-- The Actual Query
select a.account_id,
t.transaction_id,
(select listagg (tm_redir.team_code, ', ')
within group (order by tm_redir.team_code)
from tran_party tp_redir
inner join team tm_redir
on tp_redir.team_id_redirect = tm_redir.team_id
inner join tran t_redir
on tp_redir.transaction_id = t_redir.transaction_id
where t_redir.account_id = a.account_id
and t_redir.transaction_id != t.transaction_id)
as teams_redirected
from tran t inner join account a on t.account_id = a.account_id;
NOTE: tran_party.team_id_redirect is a foreign key that references team.team_id.
Current output:
ACCOUNT_ID TRANSACTION_ID TEAMS_REDIRECTED
---------- -------------- ----------------
1001 11 WWW, WWW, XXXXX
1001 12 UUU, UUU, VV
1002 13
Expected output:
I want the repeated items in TEAMS_REDIRECTED column to be selected only once, like this:
ACCOUNT_ID TRANSACTION_ID TEAMS_REDIRECTED
---------- -------------- ----------------
1001 11 WWW, XXXXX
1001 12 UUU, VV
1002 13
What I tried:
Instead of selecting from tran_party directly, I wrote an inline view that selects distinct values from tran_party like this:
select a.account_id,
t.transaction_id,
(select listagg (tm_redir.team_code, ', ')
within group (order by tm_redir.team_code)
from (select distinct transaction_id, team_id_redirect -- Note this inline view
from tran_party) tp_redir
inner join team tm_redir
on tp_redir.team_id_redirect = tm_redir.team_id
inner join tran t_redir
on tp_redir.transaction_id = t_redir.transaction_id
where t_redir.account_id = a.account_id
and t_redir.transaction_id != t.transaction_id)
as teams_redirected
from tran t inner join account a on t.account_id = a.account_id;
While this does give me the expected output, when I use this solution in my actual code, it takes about 13 seconds to retrieve just one row. Thus I cannot use what I already tried.
Any help will be appreciated.
The following method avoids the inline view with DISTINCT. Instead, it applies REGEXP_REPLACE and RTRIM to the LISTAGG result to collapse adjacent repeated values in the sorted, aggregated list, so tran_party is not scanned a second time.
Add this piece to your code:
RTRIM(REGEXP_REPLACE(listagg (tm_redir.team_code, ',')
WITHIN GROUP (ORDER BY tm_redir.team_code),
'([^,]+)(,\1)+', '\1'),
',')
Modified query-
with tran_party as -- ALL DUMMY DATA ARE IN THESE CTE FOR YOUR REFERENCE
(select 1 tran_party_id, 11 transaction_id, 101 team_id_redirect
from dual
union all
select 2, 11, 101 from dual
union all
select 3, 11, 102 from dual
union all
select 4, 12, 103 from dual
union all
select 5, 12, 103 from dual
union all
select 6, 12, 104 from dual
union all
select 7, 13, 104 from dual
union all
select 8, 13, 105 from dual),
tran as
(select 11 transaction_id, 1001 account_id, 1034.93 amount from dual
union all
select 12, 1001, 2321.89 from dual
union all
select 13, 1002, 3201.47 from dual),
account as
(select 1001 account_id, 111 team_id from dual
union all
select 1002, 112 from dual),
team as
(select 101 team_id, 'UUU' as team_code from dual
union all
select 102, 'VV' from dual
union all
select 103, 'WWW' from dual
union all
select 104, 'XXXXX' from dual
union all
select 105, 'Z' from dual)
-- The Actual Query
select a.account_id,
t.transaction_id,
(SELECT RTRIM(
REGEXP_REPLACE(listagg (tm_redir.team_code, ',')
WITHIN GROUP (ORDER BY tm_redir.team_code),
'([^,]+)(,\1)+', '\1'),
',')
from tran_party tp_redir
inner join team tm_redir
on tp_redir.team_id_redirect = tm_redir.team_id
inner join tran t_redir
on tp_redir.transaction_id = t_redir.transaction_id
where t_redir.account_id = a.account_id
and t_redir.transaction_id != t.transaction_id)
AS teams_redirected
from tran t inner join account a on t.account_id = a.account_id
/
ACCOUNT_ID TRANSACTION_ID TEAMS_REDIRECTED
---------- -------------- --------------------
1001 11 WWW,XXXXX
1001 12 UUU,VV
1002 13
SQL>
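One caveat with the pattern '([^,]+)(,\1)+': it is not anchored on the delimiter, so it can also merge values where one code is a prefix or suffix of a neighbouring code (for example AB next to ABC). A commonly used variant anchors each match on a comma or the end of the string. The sketch below runs it on a made-up literal rather than on the real LISTAGG output, so treat it as an illustration of the pattern only.

```sql
-- Input is an invented, already-sorted list, as LISTAGG ... ORDER BY would produce.
-- (,|$) forces every match to end at a delimiter or at the end of the string,
-- so only whole values are compared against the backreference \1.
select regexp_replace('AB,ABC,ABC,BC,BC',
                      '([^,]+)(,\1)*(,|$)',
                      '\1\3') as deduped
from dual;
-- Expected: AB,ABC,BC
```

With this variant the outer RTRIM is no longer needed, because the trailing delimiter is either kept as part of \3 or absent at the end of the string. On Oracle 19c or later, LISTAGG(DISTINCT ...) avoids the regular expression altogether.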