Case Statement to Add a Column - sql

I have the following table:
ID Fruit
A apple
A banana
A grapes
B orange
B apple
B grapes
C grapes
C orange
C banana
I would like to add a new column called Apple to denote whether the ID is associated with apple or not:
ID Fruit Apple
A apple yes
A banana yes
A grapes yes
B orange yes
B apple yes
B grapes yes
C grapes no
C orange no
C banana no

Since this seems like a contrived example, I'll post several options. The best one will depend on what you're really doing.
First up, this is likely to perform best, but it risks duplicating rows if you could have multiple matches for the JOINed table. It's also the only solution I'm presenting to actually use a CASE expression as requested.
SELECT a.*, case when b.ID IS NOT NULL THEN 'Yes' ELSE 'No' END AS Apple
FROM MyTable a
LEFT JOIN MyTable b on b.ID = a.ID AND b.Fruit = 'Apple'
Alternatively, this will never duplicate rows, but has to re-run the nested query for each result row. If this is not a contrived example, but something more like homework, this is probably the expected result.
SELECT *, coalesce(
(
SELECT TOP 1 'Yes'
FROM MyTable b
WHERE b.ID = a.ID AND b.Fruit = 'Apple'
), 'No') As Apple
FROM MyTable a
Finally, this also re-runs the nested query for each result row, but a future enhancement could potentially improve on that, and it makes it possible to provide values for multiple columns from the same nested subquery.
SELECT a.*, COALESCE(c.Apple, 'No') Apple
FROM MyTable a
OUTER APPLY (
SELECT TOP 1 'Yes' As Apple
FROM MyTable b
WHERE b.ID = a.ID AND b.Fruit = 'Apple'
) c
See them work here:
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=e1991e8541e7421b90f601c7e8c8906b
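To make the row-duplication risk of the plain self-join concrete, here is a minimal sketch that runs it in an in-memory SQLite database from Python. The extra ('A', 'apple') row, the lowercase 'apple' literal (SQLite compares strings case-sensitively by default) and the DISTINCT subquery variant are my assumptions for illustration, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (ID TEXT, Fruit TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [("A", "apple"), ("A", "banana"), ("A", "grapes"),
     ("B", "orange"), ("B", "apple"), ("B", "grapes"),
     ("C", "grapes"), ("C", "orange"), ("C", "banana"),
     ("A", "apple")],  # hypothetical duplicate apple row for A
)

# Plain self-join: every A row matches both apple rows, so A's rows double.
plain = conn.execute("""
    SELECT a.ID, a.Fruit,
           CASE WHEN b.ID IS NOT NULL THEN 'Yes' ELSE 'No' END AS Apple
    FROM MyTable a
    LEFT JOIN MyTable b ON b.ID = a.ID AND b.Fruit = 'apple'
""").fetchall()

# Joining to one row per apple-owning ID keeps the row count stable.
deduped = conn.execute("""
    SELECT a.ID, a.Fruit,
           CASE WHEN b.ID IS NOT NULL THEN 'Yes' ELSE 'No' END AS Apple
    FROM MyTable a
    LEFT JOIN (SELECT DISTINCT ID FROM MyTable WHERE Fruit = 'apple') b
           ON b.ID = a.ID
""").fetchall()

total = conn.execute("SELECT COUNT(*) FROM MyTable").fetchone()[0]
print(total, len(plain), len(deduped))
```

If the joined side can never have more than one match per ID, the plain self-join is fine as written.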

It could be achieved without self JOIN by using windowed COUNT_IF:
SELECT *, COUNT_IF(Fruit = 'apple') OVER(PARTITION BY ID) > 0 AS Apple
FROM tab;

When this same question was asked again with more columns,
the form Lukas advocates performed poorly, as the COUNT/SUM is done per row.
At that point, a join between the raw rows and the aggregated results should perform better.
select *
FROM MyTable as a
natural join (
SELECT c.id,
iff(count_if(c.Fruit like 'apple') > 0, 'yes', 'no') as "Apple"
--, iff(count_if(c.Fruit like 'banana') > 0, 'yes', 'no') as "Banana"
--, iff(count_if(c.Fruit like 'orange') > 0, 'yes', 'no') as "Orange"
FROM MyTable as c
GROUP BY 1
) as b
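For engines without COUNT_IF or IFF, the same aggregate-once-then-join idea can be written with plain CASE and SUM. This is only a sketch, assuming an in-memory SQLite database and an explicit join on ID instead of the natural join:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (ID TEXT, Fruit TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [("A", "apple"), ("A", "banana"), ("A", "grapes"),
     ("B", "orange"), ("B", "apple"), ("B", "grapes"),
     ("C", "grapes"), ("C", "orange"), ("C", "banana")],
)

# Aggregate once per ID, then join the flag back onto the raw rows.
rows = conn.execute("""
    SELECT a.ID, a.Fruit, b.Apple
    FROM MyTable a
    JOIN (
        SELECT ID,
               CASE WHEN SUM(CASE WHEN Fruit = 'apple' THEN 1 ELSE 0 END) > 0
                    THEN 'yes' ELSE 'no' END AS Apple
        FROM MyTable
        GROUP BY ID
    ) b ON b.ID = a.ID
    ORDER BY a.ID, a.Fruit
""").fetchall()

for row in rows:
    print(row)
```

Further per-fruit flags (Banana, Orange) would be extra CASE/SUM columns in the same grouped subquery.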

For the following table,
ID Fruit
A apple
A banana
A grapes
B orange
B apple
B grapes
C grapes
C orange
C banana
Adding a new column called Apple to denote whether ID is associated with apple or not, the resultset would be
ID Fruit Apple
A apple yes
A banana no
A grapes no
B orange no
B apple yes
B grapes no
C grapes no
C orange no
C banana no
If the expected resultset is as above, the below query will help to get the desired output.
select
id,
fruit,
case
when fruit = 'apple' then 'yes'
else 'no'
end as Apple
from Fruits;

Related

subset a table to include rows associated with only one occurrence of a set of entries

I have the following table for 4 individuals with their favorite fruit:
tbl:
ID FRUIT
personA banana
personB apple
personC orange
personD grapefruit
personA avocado
personB banana
personC melon
personD pear
personA banana
I would like to extract all the entries for the IDs that are associated with only one of the following: banana, apple, orange.
This means that I am only hoping to extract the rows for person A and person C. person B has both ("apple" AND "banana"). person D has none of the three desired fruits.
I currently have:
select *
from tbl t1
where exists (select 1
from tbl t2
where t1.ID=t2.ID and
(t2.fruit='banana' or
t2.fruit='apple' or
t2.fruit='orange'))
join (select ID, count(*) over (partition by t3.fruit)
from (select distinct ID, fruit
from tbl
where fruit='banana' or
fruit='apple' or
fruit='orange')) t3
on t3.ID=t.ID and
t3.cnt=1
Is there a better/simpler way to execute this? I would like to optimize it to reduce run time given the tables are quite large.
desired table:
ID FRUIT
personA banana
personC orange
personA avocado
personC melon
personA banana
The subquery finds all IDs that have any of those three fruits and counts how many distinct fruits from that set each ID has.
You can now join that result back to the table on ID and keep only the IDs with exactly one such fruit; if you want the IDs that have all three fruits instead, change the count in the join condition.
SELECT
t1.ID, FRUIT
FROM
tbl t1
JOIN
(SELECT
ID, COUNT(DISTINCT FRUIT) countf
FROM
tbl
WHERE
FRUIT IN ('banana', 'apple', 'orange')
GROUP BY 1) t2 ON t1.ID = t2.ID AND t2.countf = 1;
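As a quick check of this count-then-join approach against the sample data, here is a sketch run in an in-memory SQLite database (my assumption; the question does not name an engine). I moved the count test into a HAVING clause, which should be equivalent to filtering on countf in the join condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (ID TEXT, FRUIT TEXT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    [("personA", "banana"), ("personB", "apple"), ("personC", "orange"),
     ("personD", "grapefruit"), ("personA", "avocado"), ("personB", "banana"),
     ("personC", "melon"), ("personD", "pear"), ("personA", "banana")],
)

# Keep every row of the IDs that have exactly one distinct fruit
# from the set (banana, apple, orange).
rows = conn.execute("""
    SELECT t1.ID, t1.FRUIT
    FROM tbl t1
    JOIN (
        SELECT ID
        FROM tbl
        WHERE FRUIT IN ('banana', 'apple', 'orange')
        GROUP BY ID
        HAVING COUNT(DISTINCT FRUIT) = 1
    ) t2 ON t1.ID = t2.ID
    ORDER BY t1.ID, t1.FRUIT
""").fetchall()

for row in rows:
    print(row)
```

personA stays in even though banana appears twice, because COUNT(DISTINCT ...) counts the fruit once.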
We can use GROUP BY and HAVING COUNT(DISTINCT) to find the persons we want.
SELECT
a.ID,
b.FRUITID
FROM tb1 a
JOIN tb1 b ON a.ID = b.ID
WHERE a.FRUITID IN ('banana', 'apple', 'orange')
GROUP BY
a.ID,
b.FRUITID
HAVING COUNT(DISTINCT a.FRUITID) = 1;
GO
ID | FRUITID
:------ | :------
personA | avocado
personA | banana
personC | melon
personC | orange
An alternative approach is to use COUNT_IF, counting how many of the three fruits occur for each ID:
SELECT *
FROM tbl t
QUALIFY IFF(COUNT_IF(t.fruit = 'banana') OVER(PARTITION BY ID) > 0, 1, 0)
+ IFF(COUNT_IF(t.fruit = 'apple') OVER(PARTITION BY ID) > 0, 1, 0)
+ IFF(COUNT_IF(t.fruit = 'orange') OVER(PARTITION BY ID) > 0, 1, 0) = 1;
Answer:
This can be done with:
select *
from data
qualify count( distinct iff(fruit in ('apple','banana','orange'), fruit, null))
over(partition by id) = 1
which gives:
ID FRUIT
personA avocado
personA banana
personA banana
personC melon
personC orange
How that works:
The intermediate state of the IFF and the COUNT DISTINCT shows what is going on:
select *
,iff(fruit in ('apple','banana','orange'), fruit, null) as v
,count( distinct v) over(partition by id) as c
from data
gives:
ID FRUIT V C
personA avocado NULL 1
personA banana banana 1
personA banana banana 1
personB apple apple 2
personB banana banana 2
personC melon NULL 1
personC orange orange 1
personD grapefruit NULL 0
personD pear NULL 0
Thus personD is eliminated for having no magic fruit, and personB is eliminated for having too many.
If you care deeply about performance (which I would test), I assume nbk's solution would perform the fastest.

Filtering based on multiple categories

I want to filter based on the multiple categories listed below, returning any records that fall in categories A-F but do not have more than one item from the same category. I will try to explain with an example.
A Bread
B Apple
C Strawberry OR Blueberry OR Raspberry
D Watermelon OR Muskmelon OR Honeydew
E Papaya
F Oranges OR Peaches OR Nectarines
T1:
1
2
3
4
5
6
7
T2:
ID Category
1 Bread
2 Apple
2 Strawberry
3 Blueberry
3 Raspberry
4 Watermelon
5 Muskmelon
5 Honeydew
4 Papaya
2 Oranges
1 Peaches
5 Nectarines
In the above scenario, my search is to return:
1 Bread,Peaches
2 Apple,Strawberry, Oranges
4 Watermelon,Papaya
3 and 5 are not to be returned as they have items from the same category -
#3: Blueberry and Raspberry
#5: Muskmelon, Honeydew and Nectarines
First of all, you need a table -- call it CATEGORY_GROUPS ( category, category_group ) -- that relates this information from your post:
A Bread
B Apple
C Strawberry OR Blueberry OR Raspberry
D Watermelon OR Muskmelon OR Honeydew
E Papaya
F Oranges OR Peaches OR Nectarines
Where, for example, 'Bread' would be the category and 'A' would be the category group.
Then, you join t1, t2, and category_groups together in a query so you have every item, its category and its category group. Then group by the id.
The key part is how to exclude the IDs that have duplicates. If there are duplicates, then the number of distinct category groups will be less than the number of categories. So, requiring the two counts to be equal in your HAVING clause gets what you want.
Something like this should work:
SELECT t1.id, listagg(t2.category,',') within group ( order by t2.category )
FROM t1 inner join t2 on t2.id = t1.id
inner join category_groups cg on cg.category = t2.category
GROUP BY t1.id
HAVING COUNT(DISTINCT cg.category_group) = COUNT(DISTINCT t2.category)
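To see the counting argument on the sample data: an ID has no two items from the same group exactly when its number of distinct category groups equals its number of distinct categories. The sketch below checks this in an in-memory SQLite database, which is my assumption; GROUP_CONCAT stands in for listagg, and the CATEGORY_GROUPS rows are filled in from the A-F list in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER);
    CREATE TABLE t2 (id INTEGER, category TEXT);
    CREATE TABLE category_groups (category TEXT, category_group TEXT);
""")
conn.executemany("INSERT INTO t1 VALUES (?)", [(i,) for i in range(1, 8)])
conn.executemany("INSERT INTO t2 VALUES (?, ?)", [
    (1, "Bread"), (2, "Apple"), (2, "Strawberry"), (3, "Blueberry"),
    (3, "Raspberry"), (4, "Watermelon"), (5, "Muskmelon"), (5, "Honeydew"),
    (4, "Papaya"), (2, "Oranges"), (1, "Peaches"), (5, "Nectarines"),
])
conn.executemany("INSERT INTO category_groups VALUES (?, ?)", [
    ("Bread", "A"), ("Apple", "B"),
    ("Strawberry", "C"), ("Blueberry", "C"), ("Raspberry", "C"),
    ("Watermelon", "D"), ("Muskmelon", "D"), ("Honeydew", "D"),
    ("Papaya", "E"),
    ("Oranges", "F"), ("Peaches", "F"), ("Nectarines", "F"),
])

# Equal counts mean every item of the ID sits in its own category group.
rows = conn.execute("""
    SELECT t1.id, GROUP_CONCAT(t2.category, ',') AS items
    FROM t1
    JOIN t2 ON t2.id = t1.id
    JOIN category_groups cg ON cg.category = t2.category
    GROUP BY t1.id
    HAVING COUNT(DISTINCT cg.category_group) = COUNT(DISTINCT t2.category)
    ORDER BY t1.id
""").fetchall()

for row in rows:
    print(row)
```

IDs 6 and 7 drop out because the inner join to t2 finds no rows for them.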

comparing multiple columns between tables hive

I have two tables
Table A:
Fruit Number
Apple 7235
Plum 1284
Pear 8932
Orange 2839
Table B:
Fruit Number
Apple 7235
Apple 3893
Plum 1284
Pear 8932
Orange 2839
Orange 4732
I want the end result of my query to be the rows that are not the same in the two tables. For example, new Table C:
Fruit Number
Apple 3893
Orange 4732
I tried to do joins, but the join only takes the first occurrence of a record. How can I achieve the desired results above?
Use a full join which gets you rows missing on either side.
select fruit,coalesce(num1,num2) as number
from (select coalesce(a.fruit,b.fruit) as fruit,a.number as num1,b.number as num2
from a
full join b on a.fruit=b.fruit and a.number=b.number
where a.number is null or b.number is null
) t
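Where a full join is not available, the same "rows missing on either side" result can be built as the union of two anti-joins. A minimal sketch, assuming an in-memory SQLite database rather than Hive:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (Fruit TEXT, Number INTEGER);
    CREATE TABLE b (Fruit TEXT, Number INTEGER);
""")
conn.executemany("INSERT INTO a VALUES (?, ?)", [
    ("Apple", 7235), ("Plum", 1284), ("Pear", 8932), ("Orange", 2839),
])
conn.executemany("INSERT INTO b VALUES (?, ?)", [
    ("Apple", 7235), ("Apple", 3893), ("Plum", 1284),
    ("Pear", 8932), ("Orange", 2839), ("Orange", 4732),
])

# Rows of b with no exact match in a, plus rows of a with no match in b.
rows = conn.execute("""
    SELECT b.Fruit, b.Number
    FROM b
    LEFT JOIN a ON a.Fruit = b.Fruit AND a.Number = b.Number
    WHERE a.Fruit IS NULL
    UNION ALL
    SELECT a.Fruit, a.Number
    FROM a
    LEFT JOIN b ON b.Fruit = a.Fruit AND b.Number = a.Number
    WHERE b.Fruit IS NULL
    ORDER BY 1, 2
""").fetchall()

print(rows)
```

Matching on both Fruit and Number is what makes the second Apple and Orange rows show up as unmatched.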

SQL query to find 'fruit' that no-one loves

Can someone help me out with how I can find the 'Fruit' that no-one loves?
Fruit LoveIt Name
Apple Y John
Apple N Mary
Apple Y Stephen
Pear N Lois
Pear N Jo
Pear N Fiona
Thanks,
Here's a variant that doesn't rely on counting but stresses thinking in sets (relational algebra style, if you will): the fruits no one loves are all fruits but those that are loved by somebody:
SELECT DISTINCT f.Fruit
FROM fruits f
EXCEPT
SELECT f.Fruit
FROM fruits f
WHERE f.LoveIt = 'Y'
EXCEPT is SQL's set difference operator.
Using aggregation:
select fruit
from fruits
group by fruit
having count(case when LoveIt = 'Y' then 1 end) = 0;
I would try this:
select fruit
from survey
group by 1
having max(loveit) = 'N';
Select distinct fruit from tab x where fruit not in (select fruit from tab where LoveIt = 'Y')
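For a side-by-side check, the EXCEPT and the aggregation variants can be run against the sample rows. This sketch assumes an in-memory SQLite database and the table name fruits:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fruits (Fruit TEXT, LoveIt TEXT, Name TEXT)")
conn.executemany("INSERT INTO fruits VALUES (?, ?, ?)", [
    ("Apple", "Y", "John"), ("Apple", "N", "Mary"), ("Apple", "Y", "Stephen"),
    ("Pear", "N", "Lois"), ("Pear", "N", "Jo"), ("Pear", "N", "Fiona"),
])

# Set difference: all fruits minus the fruits somebody loves.
except_rows = conn.execute("""
    SELECT Fruit FROM fruits
    EXCEPT
    SELECT Fruit FROM fruits WHERE LoveIt = 'Y'
""").fetchall()

# Aggregation: fruits whose count of 'Y' votes is zero.
having_rows = conn.execute("""
    SELECT Fruit
    FROM fruits
    GROUP BY Fruit
    HAVING COUNT(CASE WHEN LoveIt = 'Y' THEN 1 END) = 0
""").fetchall()

print(except_rows, having_rows)
```

EXCEPT removes duplicates on its own, so the DISTINCT in the first branch is optional.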

SQL query to join results from two tables but also include rows that do not have counterparts in the other table?

Given two tables APPLE and ORANGE,
NAME APPLES
Alice 5
Bob 10
Trudy 1
NAME ORANGES
Bob 50
Trudy 10
Dick 10
How can I write a JOIN to show the table:
NAME APPLES ORANGES
Alice 5 -
Bob 10 50
Trudy 1 10
Dick - 10
I currently have
SELECT a.NAME, APPLES, ORANGES
FROM APPLE a
JOIN
ORANGE o ON o.NAME = a.NAME
but that only returns the fields that have a value in both APPLE and ORANGE.
SELECT COALESCE(a.NAME, o.NAME) as NAME, APPLES, ORANGES
FROM APPLE a
FULL OUTER JOIN ORANGE o ON o.NAME = a.NAME
SELECT a.NAME, a.APPLES, o.ORANGES
FROM APPLE a
FULL OUTER JOIN
ORANGE o ON o.NAME = a.NAME
should be:
SELECT COALESCE(a.NAME,o.name) as Name, APPLES, ORANGES
FROM APPLE a
FULL OUTER JOIN ORANGE o ON o.NAME = a.NAME
Example: http://sqlfiddle.com/#!4/1ae9a/4
Change JOIN to FULL OUTER JOIN.
You have to use a left or right outer join, depending on which table contains the incomplete data.
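On engines that lack FULL OUTER JOIN, one common workaround is a left join plus the unmatched rows from the other side. The sketch below assumes an in-memory SQLite database and is an emulation, not the FULL OUTER JOIN syntax itself:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE APPLE (NAME TEXT, APPLES INTEGER);
    CREATE TABLE ORANGE (NAME TEXT, ORANGES INTEGER);
""")
conn.executemany("INSERT INTO APPLE VALUES (?, ?)",
                 [("Alice", 5), ("Bob", 10), ("Trudy", 1)])
conn.executemany("INSERT INTO ORANGE VALUES (?, ?)",
                 [("Bob", 50), ("Trudy", 10), ("Dick", 10)])

# Everyone in APPLE (with oranges if any), plus ORANGE-only names.
rows = conn.execute("""
    SELECT a.NAME, a.APPLES, o.ORANGES
    FROM APPLE a
    LEFT JOIN ORANGE o ON o.NAME = a.NAME
    UNION ALL
    SELECT o.NAME, NULL, o.ORANGES
    FROM ORANGE o
    LEFT JOIN APPLE a ON a.NAME = o.NAME
    WHERE a.NAME IS NULL
    ORDER BY 1
""").fetchall()

for row in rows:
    print(row)
```

The missing values come back as NULL (None in Python), which corresponds to the "-" in the desired output.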