I am not sure how to write the right query.
To get the reference code description, the 'car_col' domain should supply it, without duplicate rows coming from the second domain, 'car_cat'; if 'car_col' does not have the code, then 'car_cat' should fill it in.
select
    a.customer_id,
    a.car_code,
    b.description as code_desc,
    a.price
from product a
left join (select * from reference_codes where domain in ('car_col', 'car_cat')) b
    on a.car_code = b.code
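Output result:
| customer_ID | code | description |
|-------------|------|-------------|
| 123 | 12 | blue |
| 123 | 23 | black |
| 345 | 45 | red |
| 345 | 45 | red |
| 678 | 67 | green |
| 678 | 24 | yellow |
| 908 | 45 | red |
| 908 | 70 | purple |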
As you can see, customer 345 has a duplicate row for code 45 (red).
Reference tables below:
select *
from reference_codes
where domain = 'car_col'
| DOMAIN | CODE | DESCRIPTION |
|--------|------|-------------|
| car_col | 12 | blue |
| car_col | 23 | black |
| car_col | 45 | red |
| car_col | 67 | green |
select *
from reference_codes
where domain = 'car_cat'
| DOMAIN | CODE | DESCRIPTION |
|--------|------|-------------|
| car_cat | 24 | yellow |
| car_cat | 45 | red |
| car_cat | 70 | purple |
| car_cat | 90 | row |
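I want this output result:
| customer_ID | code | description |
|-------------|------|-------------|
| 123 | 12 | blue |
| 123 | 23 | black |
| 345 | 45 | red |
| 678 | 67 | green |
| 678 | 24 | yellow |
| 908 | 45 | red |
| 908 | 70 | purple |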
I am using ORACLE SQL
thank you
Hard to give a concrete answer without really seeing all the data or the model, but I'm interpreting your requirement as:
"If I can't find a join on a customer to CAR_COL then fall back to using CAR_CAT"
Taking your existing code, we can add a column to "prioritise" the data, e.g.:
select *
from (
    select
        a.customer_id,
        a.car_code,
        b.description as code_desc,
        a.price,
        row_number() over ( partition by a.customer_id, a.car_code
                            order by case when b.domain = 'car_col' then 1 else 2 end ) as rating
    from product a
    left join (select * from reference_codes
               where domain in ('car_col', 'car_cat') ) b
        on a.car_code = b.code
)
where rating = 1
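As a rough illustration of the prioritisation idea, the sketch below runs the same ROW_NUMBER pattern through Python's built-in sqlite3 module rather than Oracle. The product rows are an assumption reconstructed from the question's output (one row per customer and code; price is left out because the question does not show its values), the source_domain column is added only to show which domain won, and window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE reference_codes (domain TEXT, code INTEGER, description TEXT);
    INSERT INTO reference_codes VALUES
        ('car_col', 12, 'blue'),   ('car_col', 23, 'black'),
        ('car_col', 45, 'red'),    ('car_col', 67, 'green'),
        ('car_cat', 24, 'yellow'), ('car_cat', 45, 'red'),
        ('car_cat', 70, 'purple'), ('car_cat', 90, 'row');

    -- Assumed product rows: one per customer/code pair from the question.
    CREATE TABLE product (customer_id INTEGER, car_code INTEGER);
    INSERT INTO product VALUES
        (123, 12), (123, 23), (345, 45),
        (678, 67), (678, 24), (908, 45), (908, 70);
""")

rows = conn.execute("""
    SELECT customer_id, car_code, code_desc, source_domain
    FROM (
        SELECT a.customer_id,
               a.car_code,
               b.description AS code_desc,
               b.domain      AS source_domain,
               -- 'car_col' sorts first, so it gets rating 1 whenever it exists
               ROW_NUMBER() OVER (
                   PARTITION BY a.customer_id, a.car_code
                   ORDER BY CASE WHEN b.domain = 'car_col' THEN 1 ELSE 2 END
               ) AS rating
        FROM product a
        LEFT JOIN (SELECT * FROM reference_codes
                   WHERE domain IN ('car_col', 'car_cat')) b
               ON a.car_code = b.code
    )
    WHERE rating = 1
    ORDER BY customer_id, car_code
""").fetchall()

for row in rows:
    print(row)
```

Code 45 exists in both domains, yet each customer gets it once, taken from 'car_col'; codes 24 and 70 exist only in 'car_cat', so they fall back to that domain.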
If your sample data is something like here:
WITH
reference_codes (DOMAIN, CAR_CODE, DESCRIPTION) AS
(
Select 'car_col', 12, 'blue' From Dual Union All
Select 'car_col', 23, 'black' From Dual Union All
Select 'car_col', 45, 'red' From Dual Union All
Select 'car_col', 67, 'green' From Dual Union All
Select 'car_cat', 24, 'yellow' From Dual Union All
Select 'car_cat', 45, 'red' From Dual Union All
Select 'car_cat', 70, 'purple' From Dual Union All
Select 'car_cat', 90, 'row' From Dual
),
product (CUSTOMER_ID, CODE, PRICE) AS
(
Select 123, 12, 100 From Dual Union All
Select 123, 23, 100 From Dual Union All
Select 345, 45, 120 From Dual Union All
Select 345, 45, 120 From Dual Union All
Select 678, 67, 110 From Dual Union All
Select 678, 24, 110 From Dual Union All
Select 908, 45, 130 From Dual Union All
Select 908, 70, 130 From Dual
)
... then with your query the result is as you showed above. The PRICE column is also selected, but we can't see its values in your question.
CUSTOMER_ID CODE CODE_DESC PRICE
----------- ---------- --------- ----------
123 12 blue 100
123 23 black 100
345 45 red 120 -- one row less if 'car_col' is missing
345 45 red 120
345 45 red 120
345 45 red 120
678 24 yellow 110
678 67 green 110
908 70 purple 130
908 45 red 130
908 45 red 130
To get your expected result (provided the prices do not differ), you could just add the DISTINCT keyword:
Select DISTINCT
... ... ... your query
CUSTOMER_ID CODE CODE_DESC PRICE
----------- ---------- --------- ----------
123 12 blue 100
123 23 black 100
345 45 red 120
678 24 yellow 110
678 67 green 110
908 45 red 130
908 70 purple 130
... and if there are different prices, then you could use aggregation with GROUP BY, like below:
Select
a.CUSTOMER_ID ,
a.CODE ,
b.DESCRIPTION "CODE_DESC",
AVG(a.PRICE) "AVG_PRICE"
From
product a
Left Join
( Select * From reference_codes Where DOMAIN IN('car_col', 'car_cat') ) b
ON(a.CODE = b.CAR_CODE)
Group By
a.CUSTOMER_ID ,
a.CODE ,
b.DESCRIPTION
Order By
a.CUSTOMER_ID
CUSTOMER_ID CODE CODE_DESC AVG_PRICE
----------- ---------- --------- ----------
123 12 blue 100
123 23 black 100
345 45 red 120
678 24 yellow 110
678 67 green 110
908 45 red 130
908 70 purple 130
Your query adjusted (depending on PRICE) either by DISTINCT or aggregation will work ok even when 'car_col' is missing:
WITH
reference_codes (DOMAIN, CAR_CODE, DESCRIPTION) AS
(
Select 'car_col', 12, 'blue' From Dual Union All
Select 'car_col', 23, 'black' From Dual Union All
--Select 'car_col', 45, 'red' From Dual Union All
Select 'car_col', 67, 'green' From Dual Union All
Select 'car_cat', 24, 'yellow' From Dual Union All
Select 'car_cat', 45, 'red' From Dual Union All
Select 'car_cat', 70, 'purple' From Dual Union All
Select 'car_cat', 90, 'row' From Dual
To get the desired output, you can use the ROW_NUMBER function with a CASE expression in your SQL query, as below:
select customer_id, car_code, code_desc, price
from (
    select
        a.customer_id,
        a.car_code,
        b.description as code_desc,
        a.price,
        -- Oracle cannot order by a boolean expression, so map the domain to a sort key
        row_number() over (partition by a.customer_id, a.car_code
                           order by case when b.domain = 'car_col' then 1 else 2 end) as rn
    from product a
    left join (select * from reference_codes where domain in ('car_col', 'car_cat') ) b
        on a.car_code = b.code
)
where rn = 1
The ROW_NUMBER function assigns a sequential number to each row within a partition (here, each customer_id and car_code combination), and the CASE expression in its ORDER BY sorts rows from the 'car_col' domain first. Keeping only the rows numbered 1 therefore returns the 'car_col' description when one exists and falls back to 'car_cat' otherwise, with a single row per customer_id and car_code combination.
First of all, I'm not really sure whether this is possible or not.
Let's say I have this example dataset:
CREATE TABLE TRANSACTION(
user_id numeric,
account_id varchar,
product varchar,
colour varchar,
price numeric);
insert into transaction (user_id, account_id, product, colour, price)
values
(1, 'a1', 'biycle', 'black', 500),
(1, 'a2', 'motorbike', 'red', 1000),
(1, 'a2', 'motorbike', 'blue', 1200),
(2, 'b3', 'car', 'grey', 10000),
(2, 'b2', 'motorbike', 'black', 1250),
(3, 'c1', 'biycle', 'black', 500),
(3, 'c2', 'biycle', 'black', 525),
(3, 'c4', 'skateboard', 'white', 250),
(3, 'c5', 'scooter', 'blue', 260)
From that table we know that
the real total number of customers is 3 (1, 2, 3) and
the real total number of accounts is 8 (a1, a2, b3, b2, c1, c2, c4, c5).
Then, with this code:
SELECT
product,
colour,
sum(price)total_price,
count(DISTINCT user_id)customer_total,
count(DISTINCT account_id)account_total
from transaction
group by
product, colour
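and the result is like this:
| product | colour | total_price | customer_total | account_total |
|---------|--------|-------------|----------------|---------------|
| biycle | black | 1525 | 2 | 3 |
| car | grey | 10000 | 1 | 1 |
| motorbike | black | 1250 | 1 | 1 |
| motorbike | blue | 1200 | 1 | 1 |
| motorbike | red | 1000 | 1 | 1 |
| scooter | blue | 260 | 1 | 1 |
| skateboard | white | 250 | 1 | 1 |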
From the output above,
if we add up customer_total, we get 8, and
if we add up account_total, we get 9.
Is there an alternative way to make customer_total come out as 3 and account_total as 8?
You can calculate the customer and account totals with scalar subqueries in the select list; each one counts over the whole table instead of within a single product and colour group.
SELECT
product,
colour,
sum(price)total_price,
(select count(DISTINCT user_id) from transaction) as customer_total,
(select count(DISTINCT account_id) from transaction) as account_total
from transaction
group by
product, colour
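Result:
| product | colour | total_price | customer_total | account_total |
|---------|--------|-------------|----------------|---------------|
| biycle | black | 1525 | 3 | 8 |
| car | grey | 10000 | 3 | 8 |
| motorbike | black | 1250 | 3 | 8 |
| motorbike | blue | 1200 | 3 | 8 |
| motorbike | red | 1000 | 3 | 8 |
| scooter | blue | 260 | 3 | 8 |
| skateboard | white | 250 | 3 | 8 |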
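To see why the per-group distinct counts cannot simply be added up, here is a small sketch using Python's built-in sqlite3 module with the question's data. The table is called txn here, which is my own renaming to avoid clashing with the SQL keyword; everything else follows the queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- 'txn' is a hypothetical name standing in for the question's table.
    CREATE TABLE txn (user_id INTEGER, account_id TEXT, product TEXT,
                      colour TEXT, price INTEGER);
    INSERT INTO txn VALUES
        (1, 'a1', 'biycle', 'black', 500),
        (1, 'a2', 'motorbike', 'red', 1000),
        (1, 'a2', 'motorbike', 'blue', 1200),
        (2, 'b3', 'car', 'grey', 10000),
        (2, 'b2', 'motorbike', 'black', 1250),
        (3, 'c1', 'biycle', 'black', 500),
        (3, 'c2', 'biycle', 'black', 525),
        (3, 'c4', 'skateboard', 'white', 250),
        (3, 'c5', 'scooter', 'blue', 260);
""")

# Distinct counts taken inside each (product, colour) group.
per_group = conn.execute("""
    SELECT product, colour,
           COUNT(DISTINCT user_id), COUNT(DISTINCT account_id)
    FROM txn
    GROUP BY product, colour
""").fetchall()
summed_customers = sum(r[2] for r in per_group)
summed_accounts = sum(r[3] for r in per_group)

# Distinct counts taken over the whole table by scalar subqueries.
with_totals = conn.execute("""
    SELECT product, colour, SUM(price) AS total_price,
           (SELECT COUNT(DISTINCT user_id) FROM txn)    AS customer_total,
           (SELECT COUNT(DISTINCT account_id) FROM txn) AS account_total
    FROM txn
    GROUP BY product, colour
""").fetchall()

print(summed_customers, summed_accounts)   # 8 9
print(with_totals[0][3], with_totals[0][4])  # 3 8
```

A user or account that appears in several groups is counted once per group, which is where the 8 and 9 come from; the scalar subqueries ignore the grouping and repeat the table-wide 3 and 8 on every row.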
I have three worksheets: players, teams, and weights (how highly a particular attribute is weighted when determining player-team match).
Players
| Name | Age | Height | Free_Throw_Perc | ... |
|------|-----|--------|-----------------|-----|
| Bod | 23 | 74 | 62 | ... |
Teams
| Team_Name | Age | Height | Free_Throw_Perc | ... |
|-----------|-----|--------|-----------------|-----|
|Team1|23|78|62|...|
Weights
| Team_Name | Age | Height | Free_Throw_Perc | ... |
|:---------:|:---:|:------:|:---------------:|:---:|
| Team1 | 5 | 10 | 10 | ... |
CREATE TABLE players (name, age, height, free_throw_perc) AS
SELECT 'Alice', 20, 160, 90 FROM DUAL UNION ALL
SELECT 'Betty', 21, 165, 80 FROM DUAL UNION ALL
SELECT 'Carol', 22, 170, 70 FROM DUAL UNION ALL
SELECT 'Debra', 23, 175, 60 FROM DUAL UNION ALL
SELECT 'Emily', 24, 180, 50 FROM DUAL UNION ALL
SELECT 'Fiona', 25, 185, 40 FROM DUAL UNION ALL
SELECT 'Gerri', 26, 190, 30 FROM DUAL UNION ALL
SELECT 'Heidi', 27, 195, 20 FROM DUAL UNION ALL
SELECT 'Irene', 28, 200, 10 FROM DUAL;
CREATE TABLE teams (team_name, age, height, free_throw_perc) AS
SELECT 'ALPHA', 20,175,90 FROM DUAL;
CREATE TABLE weights (team_name, age, height, free_throw_perc) AS
SELECT 'ALPHA', 5,10,10 FROM DUAL;
The teams table corresponds to the players table but contains a record for each team detailing their ideal player based on the current composition of the team. The weights table contains a record for each team with an integer weight stating how much they care about each player attribute. I am trying to compute a total match score for each player-team combination. I was able to do this quite easily with Python but am struggling to accomplish the same in SQL.
In Python this would be a simple for loop with logical operators comparing each cell of one dataframe to each cell of another, but the lack of positional referencing in SQL makes this a lot trickier to do and generalize (be able to use the same queries for other pairs of tables with different attributes).
So far I have
BEGIN
FOR c in (SELECT column_name FROM all_tab_columns WHERE table_name = 'teams')
LOOP
INSERT INTO match_table (players.Name, candidates.c)
SELECT players.Name, players.c WHERE players.c = teams.c
END LOOP;
BEGIN
FOR c IN (SELECT column_name FROM all_tab_columns WHERE table_name = 'weights')
LOOP
UPDATE match_table
SET match_table.c = (SELECT weights.c FROM weights WHERE match_table.c = weights.c)
END LOOP;
From what I can tell that will generate a table of player names with a single column corresponding to a match to a team attribute populated by the corresponding weight and all other columns full of null values. If that is the case, I can group by name to create a singular record with all matches and corresponding weights.
The script should loop through each player and team and compare the attributes of the player with those desired by the team. Where there is a match, a new row should be added to the match_table with the player name and null values except for the column that matched. That should be done for each player-team attribute match. Then those matches should be replaced by the corresponding weight from the weights table. I would then like to sum those to get a total match score. I can't use the '+' operator because the column names will vary. They will always match between the three tables, but there will be varied attributes of interest.
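For reference, the comparison described above can be sketched in Python roughly as follows. This is only an illustration of the intended logic, not the original Python code: the function name match_scores and the dictionary layout are assumptions, and only three of the sample players are included to keep it short.

```python
# Sample rows taken from the CREATE TABLE statements above.
players = [
    {"name": "Alice", "age": 20, "height": 160, "free_throw_perc": 90},
    {"name": "Debra", "age": 23, "height": 175, "free_throw_perc": 60},
    {"name": "Irene", "age": 28, "height": 200, "free_throw_perc": 10},
]
teams = [{"team_name": "ALPHA", "age": 20, "height": 175, "free_throw_perc": 90}]
weights = {"ALPHA": {"age": 5, "height": 10, "free_throw_perc": 10}}


def match_scores(players, teams, weights):
    """Return {(player, team): total weight of the attributes that match}."""
    scores = {}
    for team in teams:
        team_weights = weights[team["team_name"]]
        for player in players:
            # The attribute names come from the weights row, so nothing is
            # hard-coded and other attribute sets work unchanged.
            matched = {
                attr: weight
                for attr, weight in team_weights.items()
                if player[attr] == team[attr]
            }
            scores[(player["name"], team["team_name"])] = sum(matched.values())
    return scores


scores = match_scores(players, teams, weights)
print(scores)
```

With this data Alice matches on age and free_throw_perc (5 + 10), Debra matches on height only (10), and Irene matches nothing (0).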
The expected output would look something like:
| players.name | Age | Height | Free_Throw_Perc | ... |
|--------------|-----|--------|-----------------|-----|
| 'Alice' | 5 | NULL | NULL | ... |
| 'Alice' | NULL | 10 | NULL | ... |
How would I then sum across each record to find the total match score of each candidate for a team?
If you have the sample data:
CREATE TABLE teams ( id, name ) AS
SELECT 1, 'Alpha' FROM DUAL UNION ALL
SELECT 2, 'Beta' FROM DUAL UNION ALL
SELECT 3, 'Gamma' FROM DUAL;
CREATE TABLE players (name, team, age, height, free_throw_perc) AS
SELECT 'Alice', 1, 20, 160, 90 FROM DUAL UNION ALL
SELECT 'Betty', 1, 21, 165, 80 FROM DUAL UNION ALL
SELECT 'Carol', 1, 22, 170, 70 FROM DUAL UNION ALL
SELECT 'Debra', 2, 23, 175, 60 FROM DUAL UNION ALL
SELECT 'Emily', 2, 24, 180, 50 FROM DUAL UNION ALL
SELECT 'Fiona', 2, 25, 185, 40 FROM DUAL UNION ALL
SELECT 'Gerri', 3, 26, 190, 30 FROM DUAL UNION ALL
SELECT 'Heidi', 3, 27, 195, 20 FROM DUAL UNION ALL
SELECT 'Irene', 3, 28, 200, 10 FROM DUAL;
CREATE TABLE weights(team, key, weight) AS
SELECT 1, 'AGE', 1.0 FROM DUAL UNION ALL
SELECT 1, 'HEIGHT', 0.5 FROM DUAL UNION ALL
SELECT 1, 'FREE_THROW_PERC', 0.2 FROM DUAL UNION ALL
SELECT 2, 'AGE', 0.0 FROM DUAL UNION ALL
SELECT 2, 'HEIGHT', 1.0 FROM DUAL UNION ALL
SELECT 2, 'FREE_THROW_PERC', 0.8 FROM DUAL UNION ALL
SELECT 3, 'AGE', 0.5 FROM DUAL UNION ALL
SELECT 3, 'HEIGHT', 0.5 FROM DUAL UNION ALL
SELECT 3, 'FREE_THROW_PERC', 1.0 FROM DUAL;
And you want to insert the sum of the weight column from the weights table multiplied by the respective value in the players table into the following table:
CREATE TABLE match_table(
team INT,
value NUMBER
);
Then you can use the following INSERT query:
INSERT INTO match_table (team, value)
SELECT p.team,
SUM(p.value * w.weight)
FROM ( SELECT name, team, key, value
FROM players
UNPIVOT ( value FOR key IN (age, height, free_throw_perc) )
) p
INNER JOIN weights w
ON ( p.team = w.team AND p.key = w.key )
GROUP BY p.team
Then the table will contain the weighted totals:
| TEAM | VALUE |
|------|-------|
| 2 | 660 |
| 3 | 393 |
| 1 | 358.5 |
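If you want to check those totals outside Oracle, the following sketch reproduces them with Python's built-in sqlite3 module. SQLite has no UNPIVOT clause, so a UNION ALL of one SELECT per attribute stands in for it here; that substitution is mine and is not part of the Oracle query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE players (name TEXT, team INTEGER, age REAL,
                          height REAL, free_throw_perc REAL);
    INSERT INTO players VALUES
        ('Alice', 1, 20, 160, 90), ('Betty', 1, 21, 165, 80),
        ('Carol', 1, 22, 170, 70), ('Debra', 2, 23, 175, 60),
        ('Emily', 2, 24, 180, 50), ('Fiona', 2, 25, 185, 40),
        ('Gerri', 3, 26, 190, 30), ('Heidi', 3, 27, 195, 20),
        ('Irene', 3, 28, 200, 10);

    CREATE TABLE weights (team INTEGER, key TEXT, weight REAL);
    INSERT INTO weights VALUES
        (1, 'AGE', 1.0), (1, 'HEIGHT', 0.5), (1, 'FREE_THROW_PERC', 0.2),
        (2, 'AGE', 0.0), (2, 'HEIGHT', 1.0), (2, 'FREE_THROW_PERC', 0.8),
        (3, 'AGE', 0.5), (3, 'HEIGHT', 0.5), (3, 'FREE_THROW_PERC', 1.0);
""")

# UNION ALL turns each attribute column into (key, value) rows,
# which is the job UNPIVOT does in the Oracle query.
totals = dict(conn.execute("""
    SELECT p.team, SUM(p.value * w.weight)
    FROM (
        SELECT name, team, 'AGE' AS key, age AS value FROM players
        UNION ALL
        SELECT name, team, 'HEIGHT', height FROM players
        UNION ALL
        SELECT name, team, 'FREE_THROW_PERC', free_throw_perc FROM players
    ) p
    INNER JOIN weights w
            ON p.team = w.team AND p.key = w.key
    GROUP BY p.team
""").fetchall())

for team, value in sorted(totals.items()):
    print(team, round(value, 6))
```

The values are floating point in SQLite, so they are rounded for display; they should agree with the table above up to rounding.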
And if your match_table is:
CREATE TABLE match_table(
player VARCHAR2(20),
team INT,
age NUMBER,
height NUMBER,
free_throw_perc NUMBER,
total NUMBER
);
Then you can use the query (and calculate the total with the + operator):
INSERT INTO match_table (player, team, age, height, free_throw_perc, total)
SELECT p.name,
p.team,
p.age * w.age_weight,
p.height * w.height_weight,
p.free_throw_perc * w.free_throw_perc_weight,
p.age * w.age_weight
+ p.height * w.height_weight
+ p.free_throw_perc * w.free_throw_perc_weight
FROM players p
INNER JOIN (
SELECT *
FROM weights
PIVOT (
MAX(weight)
FOR key IN (
'AGE' AS age_weight,
'HEIGHT' AS height_weight,
'FREE_THROW_PERC' AS free_throw_perc_weight
)
)
) w
ON (p.team = w.team)
Which gives the values:
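| PLAYER | TEAM | AGE | HEIGHT | FREE_THROW_PERC | TOTAL |
|--------|------|-----|--------|-----------------|-------|
| Alice | 1 | 20 | 80 | 18 | 118 |
| Betty | 1 | 21 | 82.5 | 16 | 119.5 |
| Carol | 1 | 22 | 85 | 14 | 121 |
| Debra | 2 | 0 | 175 | 48 | 223 |
| Emily | 2 | 0 | 180 | 40 | 220 |
| Fiona | 2 | 0 | 185 | 32 | 217 |
| Gerri | 3 | 13 | 95 | 30 | 138 |
| Heidi | 3 | 13.5 | 97.5 | 20 | 131 |
| Irene | 3 | 14 | 100 | 10 | 124 |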
Or, if the players are not associated with a team, then:
INSERT INTO match_table (player, team, age, height, free_throw_perc, total)
SELECT p.name,
w.team,
p.age * w.age_weight,
p.height * w.height_weight,
p.free_throw_perc * w.free_throw_perc_weight,
p.age * w.age_weight
+ p.height * w.height_weight
+ p.free_throw_perc * w.free_throw_perc_weight
FROM players p
CROSS JOIN (
SELECT *
FROM weights
PIVOT (
MAX(weight)
FOR key IN (
'AGE' AS age_weight,
'HEIGHT' AS height_weight,
'FREE_THROW_PERC' AS free_throw_perc_weight
)
)
) w
Which, for the sample data, outputs:
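| PLAYER | TEAM | AGE | HEIGHT | FREE_THROW_PERC | TOTAL |
|--------|------|-----|--------|-----------------|-------|
| Alice | 1 | 20 | 80 | 18 | 118 |
| Alice | 2 | 0 | 160 | 72 | 232 |
| Alice | 3 | 10 | 80 | 90 | 180 |
| Betty | 1 | 21 | 82.5 | 16 | 119.5 |
| Betty | 2 | 0 | 165 | 64 | 229 |
| Betty | 3 | 10.5 | 82.5 | 80 | 173 |
| Carol | 1 | 22 | 85 | 14 | 121 |
| Carol | 2 | 0 | 170 | 56 | 226 |
| Carol | 3 | 11 | 85 | 70 | 166 |
| Debra | 1 | 23 | 87.5 | 12 | 122.5 |
| Debra | 2 | 0 | 175 | 48 | 223 |
| Debra | 3 | 11.5 | 87.5 | 60 | 159 |
| Emily | 1 | 24 | 90 | 10 | 124 |
| Emily | 2 | 0 | 180 | 40 | 220 |
| Emily | 3 | 12 | 90 | 50 | 152 |
| Fiona | 1 | 25 | 92.5 | 8 | 125.5 |
| Fiona | 2 | 0 | 185 | 32 | 217 |
| Fiona | 3 | 12.5 | 92.5 | 40 | 145 |
| Gerri | 1 | 26 | 95 | 6 | 127 |
| Gerri | 2 | 0 | 190 | 24 | 214 |
| Gerri | 3 | 13 | 95 | 30 | 138 |
| Heidi | 1 | 27 | 97.5 | 4 | 128.5 |
| Heidi | 2 | 0 | 195 | 16 | 211 |
| Heidi | 3 | 13.5 | 97.5 | 20 | 131 |
| Irene | 1 | 28 | 100 | 2 | 130 |
| Irene | 2 | 0 | 200 | 8 | 208 |
| Irene | 3 | 14 | 100 | 10 | 124 |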
I have the table below.
I want to aggregate and roll up the data so that it is displayed as in the screenshot below.
How do I go about this? Is it possible to roll up exactly this way?
I would use grouping sets:
select fruit, type, sum(amount), sum(percent)
from t
group by grouping sets ( (fruit, type), (fruit) );
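To see what those two grouping sets produce, here is a sketch using Python's built-in sqlite3 module. SQLite does not support GROUPING SETS, so a UNION ALL of two GROUP BY queries stands in for it; the table name t comes from the query above, while the sample rows are an assumption (the question's table is only available as an image).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed sample rows; the original table was posted as a screenshot.
    CREATE TABLE t (fruit TEXT, type TEXT, amount INTEGER, percent INTEGER);
    INSERT INTO t VALUES
        ('Apple',  'Green',  10017, 17),
        ('Orange', 'Green',  10016, 16),
        ('Papaya', 'Yellow', 10014, 14),
        ('Papaya', 'Blue',   10005,  5),
        ('Papaya', 'Green',  10012, 12);
""")

rows = conn.execute("""
    -- Grouping set (fruit, type): the detail rows
    SELECT fruit, type, SUM(amount), SUM(percent)
    FROM t
    GROUP BY fruit, type
    UNION ALL
    -- Grouping set (fruit): one subtotal row per fruit, type left NULL
    SELECT fruit, NULL, SUM(amount), SUM(percent)
    FROM t
    GROUP BY fruit
""").fetchall()

# Sort in Python so each fruit's subtotal (type None) follows its detail rows.
rows.sort(key=lambda r: (r[0], r[1] is None, r[1] or ""))
for row in rows:
    print(row)
```

Each fruit gets its detail rows plus one subtotal row, which is the shape GROUPING SETS ((fruit, type), (fruit)) returns in a single pass.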
You can use ROLLUP and then add the conditions needed to omit the extra generated rows, as follows:
-- Sample data
WITH DATAA (FRUIT, TYPE, AMOUNT, PERCENT) AS
(
SELECT 'Apple', 'Green', 10017, 17 FROM DUAL UNION ALL
SELECT 'Orange', 'Green', 10016, 16 FROM DUAL UNION ALL
SELECT 'Papaya', 'Yellow', 10014, 14 FROM DUAL UNION ALL
SELECT 'Papaya', 'Blue', 10005, 5 FROM DUAL UNION ALL
SELECT 'Papaya', 'Green', 10012, 12 FROM DUAL
)
-- Your query starts from here
SELECT *
FROM (
SELECT FRUIT, TYPE, AMOUNT, SUM(PERCENT) AS PERCENT
FROM DATAA
GROUP BY ROLLUP(FRUIT, TYPE, AMOUNT)
)
WHERE ( FRUIT IS NOT NULL AND TYPE IS NOT NULL AND AMOUNT IS NOT NULL )
OR ( FRUIT IS NOT NULL AND TYPE IS NULL AND AMOUNT IS NULL )
ORDER BY FRUIT, TYPE DESC NULLS LAST;

FRUIT  TYPE       AMOUNT    PERCENT
------ ------ ---------- ----------
Apple  Green       10017         17
Apple                            17
Orange Green       10016         16
Orange                           16
Papaya Yellow      10014         14
Papaya Green       10012         12
Papaya Blue        10005          5
Papaya                           31

8 rows selected.
I want to join/union two tables that have primary key, category, and a score, in such a way, that results will show the primary key and all categories and scores present in both tables together, and, if a given category is in only one table, then with a null for the score from the second table.
The tables are as follows:
opinion_1
fruit category score
apple color 15
apple sweet 50
apple scent 35
orange color 40
orange sweet 60
opinion_2
fruit category score
apple color 28
apple sweet 12
orange color 29
orange sweet 50
orange scent 31
I've tried full outer joining and double left joining with union, getting the same results with categories incorrectly multiplied.
WITH opinion_1 AS (
SELECT 'apple' as fruit, 'color' as category, 15 as score UNION ALL
SELECT 'apple', 'sweet', 50 UNION ALL
SELECT 'apple', 'scent', 35 UNION ALL
SELECT 'orange', 'color', 40 UNION ALL
SELECT 'orange', 'sweet', 60
), opinion_2 AS (
SELECT 'apple' as fruit, 'color' as category, 28 as score UNION ALL
SELECT 'apple', 'sweet', 12 UNION ALL
SELECT 'orange', 'color', 29 UNION ALL
SELECT 'orange', 'sweet', 50 UNION ALL
SELECT 'orange', 'scent', 31
)
SELECT
opinion_1.fruit,
opinion_1.category as category,
opinion_1.score as score1,
opinion_2.score as score2
FROM opinion_1
full outer join opinion_2
on opinion_1.fruit = opinion_2.fruit
I expect the following result of the operation:
fruit category score1 score2
apple color 15 28
apple sweet 50 12
apple scent 35 null
orange color 40 29
orange sweet 60 50
orange scent null 31
but I'm getting this:
fruit category score1 score2
apple color 15 12
apple color 15 28
apple sweet 50 12
apple sweet 50 28
apple scent 35 12
apple scent 35 28
orange color 40 50
orange color 40 31
orange color 40 29
orange sweet 60 50
orange sweet 60 31
orange sweet 60 29
I think you are missing a condition on your join to get the result you expect. Moreover, selecting opinion_1.fruit and opinion_1.category will produce nulls when opinion_2 has records for a fruit and category that opinion_1 lacks. The following query will produce the expected result:
WITH opinion_1 AS (
SELECT 'apple' as fruit, 'color' as category, 15 as score UNION ALL
SELECT 'apple', 'sweet', 50 UNION ALL
SELECT 'apple', 'scent', 35 UNION ALL
SELECT 'orange', 'color', 40 UNION ALL
SELECT 'orange', 'sweet', 60
), opinion_2 AS (
SELECT 'apple' as fruit, 'color' as category, 28 as score UNION ALL
SELECT 'apple', 'sweet', 12 UNION ALL
SELECT 'orange', 'color', 29 UNION ALL
SELECT 'orange', 'sweet', 50 UNION ALL
SELECT 'orange', 'scent', 31
)
SELECT
coalesce(opinion_1.fruit, opinion_2.fruit) as fruit,
coalesce(opinion_1.category, opinion_2.category) as category,
opinion_1.score as score1,
opinion_2.score as score2
FROM opinion_1
full outer join opinion_2
on opinion_1.fruit = opinion_2.fruit and opinion_1.category = opinion_2.category
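To make the effect of the two-column join condition concrete, here is a plain Python sketch of the same full outer join. The function name full_outer_join is hypothetical, and the sketch assumes each (fruit, category) pair appears at most once per table, as in the sample data.

```python
opinion_1 = [
    ("apple", "color", 15), ("apple", "sweet", 50), ("apple", "scent", 35),
    ("orange", "color", 40), ("orange", "sweet", 60),
]
opinion_2 = [
    ("apple", "color", 28), ("apple", "sweet", 12),
    ("orange", "color", 29), ("orange", "sweet", 50), ("orange", "scent", 31),
]


def full_outer_join(left, right):
    """Join on the (fruit, category) pair; a missing side becomes None."""
    left_scores = {(fruit, cat): score for fruit, cat, score in left}
    right_scores = {(fruit, cat): score for fruit, cat, score in right}
    # dict.fromkeys keeps first-seen order and drops repeated keys, which
    # plays the role of COALESCE over the keys of both tables.
    keys = dict.fromkeys(list(left_scores) + list(right_scores))
    return [
        (fruit, cat, left_scores.get((fruit, cat)), right_scores.get((fruit, cat)))
        for fruit, cat in keys
    ]


result = full_outer_join(opinion_1, opinion_2)
for row in result:
    print(row)
```

Because the key is the pair rather than the fruit alone, every category lines up with at most one score from each table, so the multiplied rows from the original query do not appear.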
Below is for BigQuery Standard SQL
#standardSQL
WITH opinion_1 AS (
SELECT 'apple' AS fruit, 'color' AS category, 15 AS score UNION ALL
SELECT 'apple', 'sweet', 50 UNION ALL
SELECT 'apple', 'scent', 35 UNION ALL
SELECT 'orange', 'color', 40 UNION ALL
SELECT 'orange', 'sweet', 60
), opinion_2 AS (
SELECT 'apple' AS fruit, 'color' AS category, 28 AS score UNION ALL
SELECT 'apple', 'sweet', 12 UNION ALL
SELECT 'orange', 'color', 29 UNION ALL
SELECT 'orange', 'sweet', 50 UNION ALL
SELECT 'orange', 'scent', 31
)
SELECT
IFNULL(a.fruit, b.fruit) fruit,
IFNULL(a.category, b.category) AS category,
a.score AS score1,
b.score AS score2
FROM opinion_1 a
FULL OUTER JOIN opinion_2 b
USING(fruit, category)
with result
Row fruit category score1 score2
1 apple color 15 28
2 apple sweet 50 12
3 apple scent 35 null
4 orange color 40 29
5 orange sweet 60 50
6 orange scent null 31