sql: comparing two tables with different values produces duplicated results - sql

I want to join (or union) two tables that each have a key (fruit), a category, and a score, so that the result shows every fruit and category present in either table together with both scores. If a given category appears in only one table, the score from the other table should be null.
The tables are as follows:
opinion_1
fruit category score
apple color 15
apple sweet 50
apple scent 35
orange color 40
orange sweet 60
opinion_2
fruit category score
apple color 28
apple sweet 12
orange color 29
orange sweet 50
orange scent 31
I've tried full outer joining and double left joining with union, getting the same results with categories incorrectly multiplied.
WITH opinion_1 AS (
SELECT 'apple' as fruit, 'color' as category, 15 as score UNION ALL
SELECT 'apple', 'sweet', 50 UNION ALL
SELECT 'apple', 'scent', 35 UNION ALL
SELECT 'orange', 'color', 40 UNION ALL
SELECT 'orange', 'sweet', 60
), opinion_2 AS (
SELECT 'apple' as fruit, 'color' as category, 28 as score UNION ALL
SELECT 'apple', 'sweet', 12 UNION ALL
SELECT 'orange', 'color', 29 UNION ALL
SELECT 'orange', 'sweet', 50 UNION ALL
SELECT 'orange', 'scent', 31
)
SELECT
opinion_1.fruit,
opinion_1.category as category,
opinion_1.score as score1,
opinion_2.score as score2
FROM opinion_1
full outer join opinion_2
on opinion_1.fruit = opinion_2.fruit
I expect the following result of the operation:
fruit category score1 score2
apple color 15 28
apple sweet 50 12
apple scent 35 null
orange color 40 29
orange sweet 60 50
orange scent null 31
but I'm getting this:
fruit category score1 score2
apple color 15 12
apple color 15 28
apple sweet 50 12
apple sweet 50 28
apple scent 35 12
apple scent 35 28
orange color 40 50
orange color 40 31
orange color 40 29
orange sweet 60 50
orange sweet 60 31
orange sweet 60 29

I think your join is missing a condition on category, which is why every row is matched with all rows of the same fruit. Moreover, selecting opinion_1.fruit and opinion_1.category will produce nulls when a fruit and category exist in opinion_2 but not in opinion_1. The following query produces the expected result:
WITH opinion_1 AS (
SELECT 'apple' as fruit, 'color' as category, 15 as score UNION ALL
SELECT 'apple', 'sweet', 50 UNION ALL
SELECT 'apple', 'scent', 35 UNION ALL
SELECT 'orange', 'color', 40 UNION ALL
SELECT 'orange', 'sweet', 60
), opinion_2 AS (
SELECT 'apple' as fruit, 'color' as category, 28 as score UNION ALL
SELECT 'apple', 'sweet', 12 UNION ALL
SELECT 'orange', 'color', 29 UNION ALL
SELECT 'orange', 'sweet', 50 UNION ALL
SELECT 'orange', 'scent', 31
)
SELECT
coalesce(opinion_1.fruit, opinion_2.fruit) as fruit,
coalesce(opinion_1.category, opinion_2.category) as category,
opinion_1.score as score1,
opinion_2.score as score2
FROM opinion_1
full outer join opinion_2
on opinion_1.fruit = opinion_2.fruit and opinion_1.category = opinion_2.category
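To see the effect of the extra join condition outside BigQuery, here is a minimal sketch that loads the question's sample rows into an in-memory SQLite database through Python's sqlite3 module. SQLite and the variable names are assumptions made only for illustration. Because older SQLite versions have no FULL OUTER JOIN, the sketch emulates it with a LEFT JOIN plus the unmatched opinion_2 rows, which is a swapped-in technique rather than the query above.

```python
import sqlite3

# In-memory database holding the two sample tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE opinion_1 (fruit TEXT, category TEXT, score INTEGER);
CREATE TABLE opinion_2 (fruit TEXT, category TEXT, score INTEGER);
INSERT INTO opinion_1 VALUES
  ('apple','color',15),('apple','sweet',50),('apple','scent',35),
  ('orange','color',40),('orange','sweet',60);
INSERT INTO opinion_2 VALUES
  ('apple','color',28),('apple','sweet',12),
  ('orange','color',29),('orange','sweet',50),('orange','scent',31);
""")

# Joining on fruit alone pairs every row of a fruit with every row
# of the same fruit in the other table, which multiplies the rows.
fruit_only = conn.execute("""
SELECT COUNT(*)
FROM opinion_1 a
JOIN opinion_2 b ON a.fruit = b.fruit
""").fetchone()[0]

# Emulated full outer join on (fruit, category):
# all opinion_1 rows with their match, plus opinion_2 rows without a match.
rows = conn.execute("""
SELECT a.fruit, a.category, a.score AS score1, b.score AS score2
FROM opinion_1 a
LEFT JOIN opinion_2 b
  ON a.fruit = b.fruit AND a.category = b.category
UNION ALL
SELECT b.fruit, b.category, NULL, b.score
FROM opinion_2 b
LEFT JOIN opinion_1 a
  ON a.fruit = b.fruit AND a.category = b.category
WHERE a.fruit IS NULL
ORDER BY 1, 2
""").fetchall()

print(fruit_only)
for row in rows:
    print(row)
```

Joining on fruit alone yields the 12 multiplied rows from the question, while joining on both columns yields one row per fruit and category.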

Below is for BigQuery Standard SQL
#standardSQL
WITH opinion_1 AS (
SELECT 'apple' AS fruit, 'color' AS category, 15 AS score UNION ALL
SELECT 'apple', 'sweet', 50 UNION ALL
SELECT 'apple', 'scent', 35 UNION ALL
SELECT 'orange', 'color', 40 UNION ALL
SELECT 'orange', 'sweet', 60
), opinion_2 AS (
SELECT 'apple' AS fruit, 'color' AS category, 28 AS score UNION ALL
SELECT 'apple', 'sweet', 12 UNION ALL
SELECT 'orange', 'color', 29 UNION ALL
SELECT 'orange', 'sweet', 50 UNION ALL
SELECT 'orange', 'scent', 31
)
SELECT
IFNULL(a.fruit, b.fruit) fruit,
IFNULL(a.category, b.category) AS category,
a.score AS score1,
b.score AS score2
FROM opinion_1 a
FULL OUTER JOIN opinion_2 b
USING(fruit, category)
with result
Row fruit category score1 score2
1 apple color 15 28
2 apple sweet 50 12
3 apple scent 35 null
4 orange color 40 29
5 orange sweet 60 50
6 orange scent null 31

Related

Fill a reference code description from one of two domains, without duplicates

I am not sure how to write the right query.
To look up the reference code description, the 'car_col' domain should be used first, without duplicate rows coming from the second domain 'car_cat'; if 'car_col' has no match, 'car_cat' should fill in.
select
a.customer_id,
a.car_code,
b.description as code_desc,
a.price
from product a
left join (select * from reference_codes where domain in ('car_col', 'car_cat')) b
on a.car_code = b.code
Output result:
customer_ID code description
123 12 blue
123 23 black
345 45 red
345 45 red
678 67 green
678 24 yellow
908 45 red
908 70 purple
As you can see, customer 345 has a duplicate row for code 45 (red).
REFERENCE TABLES below;
select *
from reference_codes
where domain = 'car_col'
DOMAIN CODE DESCRIPTION
car_col 12 blue
car_col 23 black
car_col 45 red
car_col 67 green
select *
from reference_codes
where domain = 'car_cat'
DOMAIN CODE DESCRIPTION
car_cat 24 yellow
car_cat 45 red
car_cat 70 purple
car_cat 90 row
I want this output result:
customer_ID code description
123 12 blue
123 23 black
345 45 red
678 67 green
678 24 yellow
908 45 red
908 70 purple
I am using ORACLE SQL
thank you
Hard to give a concrete answer without really seeing all the data or the model, but I'm interpreting your requirement as:
"If I can't find a join on a customer to CAR_COL then fall back to using CAR_CAT"
Taking your existing code, we can add a column to "prioritise" the data, eg
select *
from (
select
a.customer_id ,
a.car_code ,
b.description as code_desc ,
a.price,
row_number() over ( partition by customer_id, car_code
order by case when domain = 'car_col' then 1 else 2 end ) as rating
from product a
left join (select * from reference_codes
where domain in ('car_col', 'car_cat') ) b
on a. car_code = b. code
)
where rating = 1
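As a rough check of this prioritisation idea, the sketch below runs the same ROW_NUMBER pattern against an in-memory SQLite database from Python (window functions need SQLite 3.25 or later). SQLite stands in for Oracle here, and the product rows and prices are hypothetical, since the question does not show the product table; only the reference_codes rows come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE reference_codes (domain TEXT, code INTEGER, description TEXT);
CREATE TABLE product (customer_id INTEGER, car_code INTEGER, price INTEGER);
INSERT INTO reference_codes VALUES
  ('car_col',12,'blue'),('car_col',23,'black'),
  ('car_col',45,'red'),('car_col',67,'green'),
  ('car_cat',24,'yellow'),('car_cat',45,'red'),
  ('car_cat',70,'purple'),('car_cat',90,'row');
-- Hypothetical product rows and prices (not shown in the question).
INSERT INTO product VALUES
  (123,12,100),(123,23,100),(345,45,120),
  (678,67,110),(678,24,110),(908,45,130),(908,70,130);
""")

# Number the joined rows per (customer_id, car_code) so that a
# 'car_col' match sorts first, then keep only row number 1.
rows = conn.execute("""
SELECT customer_id, car_code, code_desc, domain
FROM (
  SELECT a.customer_id AS customer_id,
         a.car_code    AS car_code,
         b.description AS code_desc,
         b.domain      AS domain,
         ROW_NUMBER() OVER (
           PARTITION BY a.customer_id, a.car_code
           ORDER BY CASE WHEN b.domain = 'car_col' THEN 1 ELSE 2 END
         ) AS rating
  FROM product a
  LEFT JOIN (SELECT * FROM reference_codes
             WHERE domain IN ('car_col', 'car_cat')) b
    ON a.car_code = b.code
)
WHERE rating = 1
ORDER BY customer_id, car_code
""").fetchall()

for row in rows:
    print(row)
```

Code 45 exists in both domains, so the plain join returns it twice per customer; after the filter it appears once, taken from 'car_col', while codes 24 and 70 fall back to 'car_cat'.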
If your sample data is something like here:
WITH
reference_codes (DOMAIN, CAR_CODE, DESCRIPTION) AS
(
Select 'car_col', 12, 'blue' From Dual Union All
Select 'car_col', 23, 'black' From Dual Union All
Select 'car_col', 45, 'red' From Dual Union All
Select 'car_col', 67, 'green' From Dual Union All
Select 'car_cat', 24, 'yellow' From Dual Union All
Select 'car_cat', 45, 'red' From Dual Union All
Select 'car_cat', 70, 'purple' From Dual Union All
Select 'car_cat', 90, 'row' From Dual
),
product (CUSTOMER_ID, CODE, PRICE) AS
(
Select 123, 12, 100 From Dual Union All
Select 123, 23, 100 From Dual Union All
Select 345, 45, 120 From Dual Union All
Select 345, 45, 120 From Dual Union All
Select 678, 67, 110 From Dual Union All
Select 678, 24, 110 From Dual Union All
Select 908, 45, 130 From Dual Union All
Select 908, 70, 130 From Dual
)
... then your query returns the result below. Your query also selects the PRICE column, but the question does not show its values, so the prices here are assumed.
CUSTOMER_ID CODE CODE_DESC PRICE
----------- ---------- --------- ----------
123 12 blue 100
123 23 black 100
345 45 red 120 -- one row less if 'car_col' is missing
345 45 red 120
345 45 red 120
345 45 red 120
678 24 yellow 110
678 67 green 110
908 70 purple 130
908 45 red 130
908 45 red 130
To get your expected result (if there are no different prices) you could just add DISTINCT keyword to get what you want:
Select DISTINCT
... ... ... your query
CUSTOMER_ID CODE CODE_DESC PRICE
----------- ---------- --------- ----------
123 12 blue 100
123 23 black 100
345 45 red 120
678 24 yellow 110
678 67 green 110
908 45 red 130
908 70 purple 130
... and if there are different prices, then you could use aggregation with GROUP BY like below
Select
a.CUSTOMER_ID ,
a.CODE ,
b.DESCRIPTION "CODE_DESC",
AVG(a.PRICE) "AVG_PRICE"
From
product a
Left Join
( Select * From reference_codes Where DOMAIN IN('car_col', 'car_cat') ) b
ON(a.CODE = b.CAR_CODE)
Group By
a.CUSTOMER_ID ,
a.CODE ,
b.DESCRIPTION
Order By
a.CUSTOMER_ID
CUSTOMER_ID CODE CODE_DESC AVG_PRICE
----------- ---------- --------- ----------
123 12 blue 100
123 23 black 100
345 45 red 120
678 24 yellow 110
678 67 green 110
908 45 red 130
908 70 purple 130
Your query adjusted (depending on PRICE) either by DISTINCT or aggregation will work ok even when 'car_col' is missing:
WITH
reference_codes (DOMAIN, CAR_CODE, DESCRIPTION) AS
(
Select 'car_col', 12, 'blue' From Dual Union All
Select 'car_col', 23, 'black' From Dual Union All
--Select 'car_col', 45, 'red' From Dual Union All
Select 'car_col', 67, 'green' From Dual Union All
Select 'car_cat', 24, 'yellow' From Dual Union All
Select 'car_cat', 45, 'red' From Dual Union All
Select 'car_cat', 70, 'purple' From Dual Union All
Select 'car_cat', 90, 'row' From Dual
To get the desired output, you can use a CASE expression inside the ORDER BY of the ROW_NUMBER function and keep only the first row of each group:
select customer_id, car_code, code_desc, price
from (
select
a.customer_id,
a.car_code,
b.description as code_desc,
a.price,
row_number() over (partition by a.customer_id, a.car_code
order by case when b.domain = 'car_col' then 1 else 2 end) as rn
from product a
left join (select * from reference_codes where domain in ('car_col', 'car_cat') ) b
on a.car_code = b.code
)
where rn = 1
The ROW_NUMBER function assigns a unique number to each row within a partition (in this case, partitioned by customer_id and car_code). The CASE expression in its ORDER BY sorts 'car_col' rows ahead of 'car_cat' rows, so the row numbered 1 for each customer_id and car_code combination comes from 'car_col' whenever one exists and from 'car_cat' otherwise. The outer WHERE rn = 1 then keeps only that row and drops the duplicates.

Selecting MAX of a Value from multiple categories from a table

I am looking to get the max weight of Apple, Orange, Mango; there could be any number of fruits. The bold items in the table are what I would like the query to return.
I know this can be done by partitioning the table for example:
SELECT fruits,max(weight) OVER(PARTITION BY fruits)
FROM fruitstat
GROUP BY fruits;
But this does not give my expected results. I need the rows that have the maximum weight for each fruit.
Fruits Color Weight
Apple red 23
Orange orange 6
Mango yellow 13
Apple red 15
Orange orange 19
Mango yellow 16
Apple red 44
Orange orange 31
Mango yellow 12
Apple red 14
Orange orange 22
Mango yellow 11
Just group the MAX(weight) by fruits:
WITH fruit AS
(
SELECT 'Apple' as fruits,'red' as color ,23 as weight FROM dual UNION ALL
SELECT 'Orange','orange',6 FROM dual UNION ALL
SELECT 'Mango','yellow',13 FROM dual UNION ALL
SELECT 'Apple','red',15 FROM dual UNION ALL
SELECT 'Orange','orange',19 FROM dual UNION ALL
SELECT 'Mango','yellow',16 FROM dual UNION ALL
SELECT 'Apple','red',44 FROM dual UNION ALL
SELECT 'Orange','orange',31 FROM dual UNION ALL
SELECT 'Mango','yellow',12 FROM dual UNION ALL
SELECT 'Apple','red',14 FROM dual UNION ALL
SELECT 'Orange','orange',22 FROM dual UNION ALL
SELECT 'Mango','yellow',11 FROM dual
)
SELECT fruits, MAX(weight)
FROM fruit
GROUP BY fruits;
P.S. MAX for the apple is 44, not 23, at least in your sample data
You don't need GROUP BY with window functions. You can do this instead.
First, rank the rows of each fruit by weight in descending order, so that the heaviest row gets rank 1.
select rank() over (partition by fruits order by weight desc) as rnk, fruits, weight
from fruitstat
After that, you can use a subquery to return only the top-ranked row of each fruit.
select fruits, weight
from (select rank() over (partition by fruits order by weight desc) as rnk, fruits, weight from fruitstat) a
where rnk = 1
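The ranking approach can be checked with a small sketch, here run against an in-memory SQLite database from Python (an assumption for illustration; window functions need SQLite 3.25 or later). It ranks by weight in descending order within each fruit, so rank 1 is the heaviest row, and it also returns the color column to show that the whole row is available, which a plain GROUP BY on fruits does not give directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fruitstat (fruits TEXT, color TEXT, weight INTEGER);
INSERT INTO fruitstat VALUES
  ('Apple','red',23),('Orange','orange',6),('Mango','yellow',13),
  ('Apple','red',15),('Orange','orange',19),('Mango','yellow',16),
  ('Apple','red',44),('Orange','orange',31),('Mango','yellow',12),
  ('Apple','red',14),('Orange','orange',22),('Mango','yellow',11);
""")

# Rank rows within each fruit from heaviest to lightest,
# then keep the top-ranked row(s) of each fruit.
heaviest = conn.execute("""
SELECT fruits, color, weight
FROM (
  SELECT fruits, color, weight,
         RANK() OVER (PARTITION BY fruits ORDER BY weight DESC) AS rnk
  FROM fruitstat
)
WHERE rnk = 1
ORDER BY fruits
""").fetchall()

for row in heaviest:
    print(row)
```

Note that RANK keeps ties: if two rows of a fruit share the maximum weight, both are returned. ROW_NUMBER would return exactly one of them.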

Using aggregate on another aggregate function - MAX() on an aggregate

I've a tournament bracket consisting of 2 Groups (Group A and Group B).
I already have a query where I retrieve some information such as the average rating, the number of teams, etc.
To the problem: I don't want to allow 2 teams of the same group having the same color. I want a column representing the MAX() of ColorInterference without having to make a subselect.
Is this possible or am I forced to make a SELECT MAX(ColorInterference) over the query result?
Group  Team               Color   Rating  TeamsCount  AverageRating  AverageRatingGroup  ColorInterference
A      Helos              Green   1452    8           1518           1544                0
A      Pelicans           Purple  1687    8           1518           1544                0
A      Epic Square Dance  Red     1498    8           1518           1544                0
A      Narnia Ninjas      Yellow  1542    8           1518           1544                0
B      O.T.               Blue    1502    8           1518           1492                0
B      Helos              Green   1452    8           1518           1492                1
B      Treasure Goggles   Green   1485    8           1518           1492                1
B      Red Off            Yellow  1530    8           1518           1492                0
DECLARE @Bracket_Groups TABLE ([Group] nvarchar(10), Team nvarchar(50), Color nvarchar(50), Rating int)
INSERT INTO @Bracket_Groups (Team, [Group], Color, Rating)
SELECT 'Narnia Ninjas', 'A', 'Yellow' , 1542
UNION SELECT 'Helos', 'A', 'Green', 1452
UNION SELECT 'Pelicans', 'A', 'Purple', 1687
UNION SELECT 'Epic Square Dance', 'A', 'Red', 1498
UNION SELECT 'O.T.', 'B', 'Blue', 1502
UNION SELECT 'Red Off', 'B', 'Yellow', 1530
UNION SELECT 'Helos', 'B', 'Green', 1452
UNION SELECT 'Treasure Goggles', 'B', 'Green', 1485
SELECT
[Group]
, Team
, Color
, Rating
, COUNT(*) OVER () as TeamsCount
, AVG(Rating) OVER () as AverageRating
, AVG(Rating) OVER (PARTITION BY [Group]) as Group_AverageRating
, SIGN(COUNT(Color) OVER (PARTITION BY [Group], Color) - 1) as ColorInterference
FROM @Bracket_Groups
ORDER BY [Group]

Rollup - Oracle db sql

I have the below table
I want to aggregate and roll up the data so that it is displayed as in the screenshot below.
How do I go about this? Is it possible to roll up exactly this way?
I would use grouping sets:
select fruit, type, sum(amount), sum(percent)
from t
group by grouping sets ( (fruit, type), (fruit) );
You can use the ROLLUP and then necessary conditions to omit the extra generated rows as follows:
-- Sample data
WITH DATAA (FRUIT, TYPE, AMOUNT, PERCENT) AS
(
SELECT 'Apple', 'Green', 10017, 17 FROM DUAL UNION ALL
SELECT 'Orange', 'Green', 10016, 16 FROM DUAL UNION ALL
SELECT 'Papaya', 'Yellow', 10014, 14 FROM DUAL UNION ALL
SELECT 'Papaya', 'Blue', 10005, 5 FROM DUAL UNION ALL
SELECT 'Papaya', 'Green', 10012, 12 FROM DUAL
)
-- Your query starts from here
SELECT *
FROM (
SELECT FRUIT, TYPE, AMOUNT, SUM(PERCENT) AS PERCENT
FROM DATAA
GROUP BY ROLLUP(FRUIT, TYPE, AMOUNT)
)
WHERE ( FRUIT IS NOT NULL AND TYPE IS NOT NULL AND AMOUNT IS NOT NULL )
OR ( FRUIT IS NOT NULL AND TYPE IS NULL AND AMOUNT IS NULL )
ORDER BY FRUIT, TYPE DESC NULLS LAST;
FRUIT TYPE AMOUNT PERCENT
------ ------ ---------- ----------
Apple Green 10017 17
Apple 17
Orange Green 10016 16
Orange 16
Papaya Yellow 10014 14
Papaya Green 10012 12
Papaya Blue 10005 5
Papaya 31
8 rows selected.
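Both answers aim for the same shape: the detail rows plus one subtotal row per fruit. As a sketch of what that grouping means, the example below writes it as a UNION ALL of two GROUP BY queries and runs it in an in-memory SQLite database from Python. This is a swapped-in technique, chosen only because SQLite has neither GROUPING SETS nor ROLLUP; the table name t follows the first answer and the rows are the sample data from the second. Like the first answer, it also sums amount on the subtotal rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (fruit TEXT, type TEXT, amount INTEGER, percent INTEGER);
INSERT INTO t VALUES
  ('Apple','Green',10017,17),
  ('Orange','Green',10016,16),
  ('Papaya','Yellow',10014,14),
  ('Papaya','Blue',10005,5),
  ('Papaya','Green',10012,12);
""")

# First branch: one row per (fruit, type).
# Second branch: one subtotal row per fruit, with type left NULL.
# The outer ORDER BY puts each subtotal after its detail rows.
rows = conn.execute("""
SELECT fruit, type, amount, percent
FROM (
  SELECT fruit, type, SUM(amount) AS amount, SUM(percent) AS percent
  FROM t
  GROUP BY fruit, type
  UNION ALL
  SELECT fruit, NULL, SUM(amount), SUM(percent)
  FROM t
  GROUP BY fruit
)
ORDER BY fruit, type IS NULL, type DESC
""").fetchall()

for row in rows:
    print(row)
```

On Oracle the GROUPING SETS or ROLLUP forms above are the more direct way to express this; the UNION ALL form scans the table twice.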

Transpose vertical input data to horizontal output in Oracle

I need to transpose rows to columns in Oracle. I've the data in this format:
Apple Orange Mango Banana
15 20 12 67
The required result is:
Fruit Number
Apple 15
Orange 20
Mango 12
Banana 67
I used UNION to get this result, but this is not a generic solution.
SELECT 'Apple' AS Fruit, Apple AS "Number" FROM fruits_tbl UNION
SELECT 'Orange', Orange FROM fruits_tbl UNION
SELECT 'Mango', Mango FROM fruits_tbl UNION
SELECT 'Banana', Banana FROM fruits_tbl;
I want a standard way to get the required output.
Update: Figured out Pivot is the correct approach!
Since Oracle 11g (tab is your table name):
select * from tab
UNPIVOT (num for fruit in (apple as 'apple', orange as 'orange', mango as 'mango', banana as 'banana'));
Oracle 10g:
with col_names as (
select 'apple' fruit from dual
union all select 'orange' from dual
union all select 'mango' from dual
union all select 'banana' from dual
)
select c.fruit,
case c.fruit
when 'apple' then t.apple
when 'orange' then t.orange
when 'mango' then t.mango
when 'banana' then t.banana
end as num
from tab t
cross join col_names c;
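The cross join variant does not depend on any Oracle-specific syntax, so it can be tried on other engines as well. Here is a minimal sketch using an in-memory SQLite database from Python (SQLite is an assumption for illustration, so FROM dual is dropped); the table name tab and the values come from the question and the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tab (apple INTEGER, orange INTEGER, mango INTEGER, banana INTEGER);
INSERT INTO tab VALUES (15, 20, 12, 67);
""")

# Cross join the single wide row with a list of column names,
# then pick the matching column for each name with CASE.
rows = conn.execute("""
WITH col_names(fruit) AS (
  SELECT 'apple' UNION ALL
  SELECT 'orange' UNION ALL
  SELECT 'mango' UNION ALL
  SELECT 'banana'
)
SELECT c.fruit,
       CASE c.fruit
         WHEN 'apple'  THEN t.apple
         WHEN 'orange' THEN t.orange
         WHEN 'mango'  THEN t.mango
         WHEN 'banana' THEN t.banana
       END AS num
FROM tab t
CROSS JOIN col_names c
""").fetchall()

# Row order is not guaranteed without ORDER BY, so compare as a dict.
result = dict(rows)
print(result)
```

Each wide row produces one output row per listed column name, so a table with several rows yields four output rows for each of them.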