Filtering out rows - sql

I have simplified a table as an example
tray| food
-------+-------
1 | fruit
2 | veg
2 | fruit
2 | meat
3 | meat
4 | bread
What I want to find, for each food, is the number of trays that contain ONLY that type of food. So the output should look like this:
food| count
-------+-------
fruit | 1
veg | 0
meat | 1
bread | 1
I tried writing a query:
SELECT food, COUNT(*)
FROM Inventory
WHERE NOT EXISTS (SELECT *
FROM Inventory I
WHERE I.tray = tray AND I.food<>food)
GROUP BY food;
However the table returned is incorrect and it looks like my sub query is wrong but it makes logical sense to me.
food | count
-------+-------
fruit | 2
veg | 1
bread | 1
meat | 2
It looks like tray 2 is counting once for fruit, meat and veg when it should not. But shouldn't that be ruled out by my NOT EXISTS subquery? How do I fix this?

Clever little problem. Here is one solution:
select f.food, count(t.tray)
from (select distinct food from inventory
     ) f left join
     -- a tray holds a single food exactly when min(food) = max(food)
     (select tray, min(food) as minfood, max(food) as maxfood
      from inventory
      group by tray
     ) t
     on f.food = t.minfood and f.food = t.maxfood
group by f.food;
Getting a count of zero suggests a query where left join and group by would be useful.
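To check the min/max idea against the sample data, here is a minimal sketch. It assumes SQLite through Python's built-in sqlite3 module and an in-memory table named inventory; neither is stated in the question, so treat both as stand-ins for your own database.

```python
import sqlite3

# In-memory copy of the sample table from the question (assumed name: inventory).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory (tray INTEGER, food TEXT)")
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?)",
    [(1, "fruit"), (2, "veg"), (2, "fruit"), (2, "meat"), (3, "meat"), (4, "bread")],
)

# A tray holds a single food exactly when MIN(food) = MAX(food).
# The LEFT JOIN keeps foods that match no such tray, so they are counted as 0.
query = """
SELECT f.food, COUNT(t.tray) AS cnt
FROM (SELECT DISTINCT food FROM inventory) f
LEFT JOIN (
    SELECT tray, MIN(food) AS minfood, MAX(food) AS maxfood
    FROM inventory
    GROUP BY tray
) t ON f.food = t.minfood AND f.food = t.maxfood
GROUP BY f.food
"""
counts = dict(conn.execute(query).fetchall())
print(counts)
```

On the sample rows this gives 1 for fruit, meat and bread, and 0 for veg.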

select i1.food,
count(ft.tray)
from inventory i1
left join (
select i1.tray,
count(distinct i1.food) as num_food
from inventory i1
group by i1.tray
) ft on i1.tray = ft.tray and num_food = 1
group by i1.food;
The inner query (ft) counts the number of different foods per tray. The outer (main) query counts the number of trays per food for those trays that only contain a single type of food.
Online example: http://rextester.com/PUN27611
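As for why the NOT EXISTS attempt in the question counts every row: a likely cause is that the bare column names inside the subquery bind to the inner alias I rather than to the outer table. The sketch below assumes SQLite via Python's sqlite3 and writes the query against the food column of the sample table; the alias o for the outer table is introduced here for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Inventory (tray INTEGER, food TEXT)")
conn.executemany(
    "INSERT INTO Inventory VALUES (?, ?)",
    [(1, "fruit"), (2, "veg"), (2, "fruit"), (2, "meat"), (3, "meat"), (4, "bread")],
)

# As posted: "tray" and "food" inside the subquery resolve to I, so the test
# becomes I.tray = I.tray AND I.food <> I.food, which no row can satisfy.
unqualified = """
SELECT food, COUNT(*)
FROM Inventory
WHERE NOT EXISTS (SELECT *
                  FROM Inventory I
                  WHERE I.tray = tray AND I.food <> food)
GROUP BY food
"""

# With the outer table aliased (hypothetical alias o), the subquery compares
# each inner row against the current outer row.
qualified = """
SELECT o.food, COUNT(*)
FROM Inventory o
WHERE NOT EXISTS (SELECT *
                  FROM Inventory I
                  WHERE I.tray = o.tray AND I.food <> o.food)
GROUP BY o.food
"""

as_posted = dict(conn.execute(unqualified).fetchall())
fixed = dict(conn.execute(qualified).fetchall())
print(as_posted)
print(fixed)
```

Note that the qualified version returns no row at all for veg, so a left join from the list of foods is still needed if a zero count must appear in the output.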

Related

pull all data only if there are distinct within a group in SQL

I have table with the following columns (product ID, product group code, product category)
I only want to pull the data if there are two or more unique product categories within each product group. For example, I have the following data:
Product id | product group code | product category
1 | a | Apple
2 | a | Orange
3 | a | Apple
4 | b | Toys
5 | b | Toys
I only want to see all the unique product categories for each product group code. The output I want to see is:
Product id | product group code | product category
1 | a | Apple
2 | a | Orange
3 | a | Apple
Thanks
"I only want to pull the data if there are two or more unique product categories within each product group."
This answer is based on the results you show, which are consistent with that sentence. The paragraph just before the results is unclear.
One method is exists:
select t.*
from t
where exists (select 1
from t t2
where t2.product_group = t.product_group and
t2.product_category <> t.product_category
);
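A quick way to try the EXISTS method on the sample rows is sketched below. It assumes SQLite via Python's sqlite3, a table named t, and the column names product_id, product_group and product_category used in the query above; the real table may name them differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (product_id INTEGER, product_group TEXT, product_category TEXT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, "a", "Apple"), (2, "a", "Orange"), (3, "a", "Apple"),
     (4, "b", "Toys"), (5, "b", "Toys")],
)

# Keep a row only if its group contains at least one row with a different category.
query = """
SELECT t.*
FROM t
WHERE EXISTS (SELECT 1
              FROM t t2
              WHERE t2.product_group = t.product_group
                AND t2.product_category <> t.product_category)
ORDER BY t.product_id
"""
kept_ids = [row[0] for row in conn.execute(query)]
print(kept_ids)
```

Group b has only the category Toys, so its rows are filtered out and only product ids 1, 2 and 3 remain.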
How about two nested selects? One groups by product_group_code and keeps the groups with more than one distinct product_category (COUNT DISTINCT); the other joins those "good groups" back to the original input.
WITH
-- your input as an in-line table
input(Product_id,product_group_code,product_category) AS (
SELECT 1,'a','Apple'
UNION ALL SELECT 2,'a','Orange'
UNION ALL SELECT 3,'a','Apple'
UNION ALL SELECT 4,'b','Toys'
UNION ALL SELECT 5,'b','Toys'
)
,
good_grp AS (
SELECT
product_group_code
FROM input
GROUP BY product_group_code
HAVING COUNT(DISTINCT product_category) >1
)
SELECT
i.*
FROM input i
JOIN good_grp USING(product_group_code)
ORDER BY 1
-- returning ...
Product_id | product_group_code | product_category
-----------+--------------------+------------------
1 | a | Apple
2 | a | Orange
3 | a | Apple

SQL: Select Vegetarians

I'm stuck with a small problem here and I can't seem to find the right answer.
I have a DB with multiple tables, in which I have ANIMALS and their FOOD. I have to select the ANIMALS that only eat VEGETARIAN meals. The table is set up with a BOOLEAN: 0 = MEAT, 1 = VEGETARIAN.
There are ANIMALS in the table that eat both MEAT and VEGETARIAN. I don't need them; I strictly need animals that only eat VEGETARIAN.
Thanks in advance.
You didn't say what your table schema is, so I will assume a simple one with fields AnimalId int, Meat bit (here 1 means the animal eats meat, which is the reverse of the flag in your question).
The records are stored in the table like this:
+----------+------+
| AnimalId | meat |
+----------+------+
| 1 | 0 |
| 1 | 1 |
| 2 | 0 |
| 3 | 1 |
| 4 | 0 |
+----------+------+
So animal 1 eats both meat and vegetables, animal 2 only vegetables, and animal 3 only meat. The expected result is only AnimalId 2 and 4.
You need to group the data by AnimalId, take the MAX of the Meat bit column after casting it to int, and keep only the groups where that MAX is 0.
SELECT AnimalId
FROM dbo.Animals
GROUP BY AnimalId
HAVING (MAX(CAST(Meat AS INT)) = 0)
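The grouping approach can be tried on the sample rows with the sketch below. It assumes SQLite via Python's sqlite3, so the dbo schema prefix is dropped and the bit column is stored as an integer; the table name Animals follows the answer's assumed schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Meat = 1 means the row records a meat meal, as in the answer's assumed schema.
conn.execute("CREATE TABLE Animals (AnimalId INTEGER, Meat INTEGER)")
conn.executemany(
    "INSERT INTO Animals VALUES (?, ?)",
    [(1, 0), (1, 1), (2, 0), (3, 1), (4, 0)],
)

# An animal is vegetarian-only when the largest Meat flag in its group is 0.
query = """
SELECT AnimalId
FROM Animals
GROUP BY AnimalId
HAVING MAX(CAST(Meat AS INTEGER)) = 0
ORDER BY AnimalId
"""
vegetarians = [row[0] for row in conn.execute(query)]
print(vegetarians)
```

If the flag is stored the way the question describes (1 = vegetarian), the same idea works with MIN(...) = 1 instead.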
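You can try this (it's only a mock-up, since you didn't post your schema; the table and column names are guesses):
SELECT *
FROM ANIMALS A
INNER JOIN BR_ANIMALS_FOOD B ON A.ID = B.ID_ANI
INNER JOIN FOODS C ON B.ID_FOODS = C.ID
WHERE C.FOOD_TYPE = 1
-- exclude any animal linked to a meat food; the check must be per animal, not per joined row
AND NOT EXISTS (SELECT 1
FROM BR_ANIMALS_FOOD B2
INNER JOIN FOODS D ON B2.ID_FOODS = D.ID
WHERE B2.ID_ANI = A.ID AND D.FOOD_TYPE = 0)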

How do I Get a count of a value that may be found in one of two rows

If I had a table with values
Game Id | Home | Away |
------- | -------- |------- |
0 | Team A | Team B |
1 | Team C | Team D |
2 | Team B | Team C |
3 | Team D | Team C |
In SQL, how would I get a Count of each team regardless of whether they were Home or Away.
E.g.
Team | Count
------- | -----
Team A | 1
Team B | 2
Team C | 3
Team D | 2
My hack in Python was to split the data into two tables of counts and merge them, but I think there is a much better way to do this in SQL.
In SQL, you would use union all and group by:
select team, count(*)
from ((select home as team from t) union all
(select away from t)
) t
group by team;
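Here is a small sketch of the union all approach run against the sample games. It assumes SQLite via Python's sqlite3 and a table named t with columns game_id, home and away; the parentheses around each select are dropped because SQLite does not accept them inside a compound select.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (game_id INTEGER, home TEXT, away TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(0, "Team A", "Team B"), (1, "Team C", "Team D"),
     (2, "Team B", "Team C"), (3, "Team D", "Team C")],
)

# Stack the home and away columns into one "team" column, then count per team.
query = """
SELECT team, COUNT(*) AS games
FROM (SELECT home AS team FROM t
      UNION ALL
      SELECT away FROM t) AS both_sides
GROUP BY team
ORDER BY team
"""
games_per_team = conn.execute(query).fetchall()
print(games_per_team)
```

UNION ALL matters here: plain UNION would drop duplicate team names before the count.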
You need to unpivot the data and do the count
You can use Unnest and Array to unpivot the data. Unnest converts an array to a set of rows
SELECT unnest(array["Home", "Away"]) AS team,
count(1)
FROM yourtable
GROUP BY team
Maybe like this.
Count the Home column, count the Away column, union both, and sum the counts.
-- yourtable is a placeholder for the actual table name
SELECT Team,
       sum(occurrence) AS count
FROM
  (SELECT Home AS Team,
          count(*) AS occurrence
   FROM yourtable
   GROUP BY Home
   UNION ALL
   SELECT Away AS Team,
          count(*) AS occurrence
   FROM yourtable
   GROUP BY Away) AS lookup
GROUP BY Team
Grouping Sets:
with t (game_id, home, away) as ( values
(0,'Team A','Team B'),
(1,'Team C','Team D'),
(2,'Team B','Team C'),
(3,'Team D','Team C')
)
select coalesce(home, away) as team, sum(total) as total
from (
select home, away, count(*) as total
from t
group by grouping sets (home, away)
) s
group by 1
order by 1
;
team | total
--------+-------
Team A | 1
Team B | 2
Team C | 3
Team D | 2

Find rows with two or more relationships

I have 3 tables:
Foods table stores all food items, Tags table stores all tags, FoodTagRelation stores the relation between food and tags. I want to write a query to select all Food that have exactly 2 tags with specified Ids (please read the SQL I have written at the bottom)
Foods Table
Id | FoodItem
----------------------
1 | Mango
2 | Custard
3 | Pizza
Tags Table
Id | TagName
----------------------
1 | Fruit
2 | Cold
3 | Hot
4 | Veg
FoodTagRelation
Id | FoodId | TagId
----------------------
1 | 1 | 1
2 | 1 | 4
3 | 2 | 1
4 | 2 | 2
5 | 2 | 4
Now I want to select all foods that have exactly two tags on it: e.g. select all foods which have both tags: Fruit and Cold.
I tried this query, but it returns all food with tags Fruit OR Cold.
select * from Foods
inner join FoodTagRelation
on
Foods.Id=FoodTagRelation.FoodId
where
tagid in ('1','2')
How can I re-write this query to only return foods that have BOTH tags?
For a more generic answer that allows you to change the tags for which you're searching:
DECLARE @Search_Tags TABLE (TagId INT)
INSERT INTO @Search_Tags (TagId) VALUES (1), (2)
SELECT
    F.Id,
    F.FoodItem
FROM
    Foods F
INNER JOIN FoodTagRelation FTR ON
    FTR.FoodId = F.Id
INNER JOIN @Search_Tags ST ON
    ST.TagId = FTR.TagId
GROUP BY
    F.Id,
    F.FoodItem
HAVING
    COUNT(*) = (SELECT COUNT(*) FROM @Search_Tags)
SELECT
F.id,
F.FoodItem
FROM
Foods F
INNER JOIN FoodTagRelation FTR
ON F.Id = FTR.FoodId
WHERE
FTR.tagid in('1','2')
GROUP BY
F.id,
F.FoodItem
HAVING
count(Distinct FTR.tagid) > 1
Features: uses count distinct, to prevent an issue with duplicate tagid's for a given FoodID in your FoodTagRelation table. (If you don't think that duplicates are a concern, then you can remove the 'distinct' keyword). Secondly, I kept your WHERE clause, because that allows you to look for specific tags, as opposed to just any two. Finally, I listed out your fields, because that was necessary in order to use the group by clause (which in turn, was necessary in order to use the HAVING clause.)
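The count distinct idea can be tried on the sample tables with the sketch below. It assumes SQLite via Python's sqlite3 and compares the distinct tag count with the number of requested tags instead of using > 1, so the same query would also work for three or more tags; the variable wanted is introduced here for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Foods (Id INTEGER, FoodItem TEXT);
CREATE TABLE FoodTagRelation (Id INTEGER, FoodId INTEGER, TagId INTEGER);
INSERT INTO Foods VALUES (1, 'Mango'), (2, 'Custard'), (3, 'Pizza');
INSERT INTO FoodTagRelation VALUES
    (1, 1, 1), (2, 1, 4), (3, 2, 1), (4, 2, 2), (5, 2, 4);
""")

wanted = [1, 2]  # TagIds for Fruit and Cold
placeholders = ",".join("?" * len(wanted))

# Keep only rows for the wanted tags, then require that every wanted tag is present.
query = f"""
SELECT F.Id, F.FoodItem
FROM Foods F
INNER JOIN FoodTagRelation FTR ON F.Id = FTR.FoodId
WHERE FTR.TagId IN ({placeholders})
GROUP BY F.Id, F.FoodItem
HAVING COUNT(DISTINCT FTR.TagId) = ?
"""
matches = conn.execute(query, (*wanted, len(wanted))).fetchall()
print(matches)
```

Mango has the Fruit tag but not Cold, so only Custard is returned.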
When you say "select all Food that have exactly 2 tags": if a food has 3 tags that include Fruit and Cold plus some other tag, does it count?
Anyway, here is a query to find foods that have both Fruit and Cold.
SELECT *
FROM Foods f
INNER JOIN FoodTagRelation ft1
ON f.Id=ft1.FoodId
INNER JOIN FoodTagRelation ft2
ON f.Id=ft2.FoodId
WHERE
ft1.tagid = 1 AND ft2.tagid = 2
Do a GROUP BY on the food and use HAVING COUNT(*) = 2:
select f.id, f.fooditem
from foods as f inner join foodtagrelation as ftr on f.id=ftr.foodid
where (ftr.tagid = 1 or ftr.tagid = 2)
group by f.id, f.fooditem
having count(*) = 2
SELECT * from Foods where Id in (
select FoodId from FoodTagRelation where TagId in (1,2)
group by FoodId having count(*)=2
)
NOTE: Updated my SQL because Rusi seems to only care about Foods with exactly two tags where TagId is 1 or 2.

Filter a one-to-many query by requiring all of many meet criteria

Imagine the following tables:
create table boxes( id int, name text, ...);
create table thingsinboxes( id int, box_id int, thing enum('apple','banana','orange'));
And the tables look like:
Boxes:
id | name
1 | orangesOnly
2 | orangesOnly2
3 | orangesBananas
4 | misc
thingsinboxes:
id | box_id | thing
1 | 1 | orange
2 | 1 | orange
3 | 2 | orange
4 | 3 | orange
5 | 3 | banana
6 | 4 | orange
7 | 4 | apple
8 | 4 | banana
How do I select the boxes that contain at least one orange and nothing that isn't an orange?
How does this scale, assuming I have several hundred thousand boxes and possibly a million things in boxes?
I'd like to keep this all in SQL if possible, rather than post-processing the result set with a script.
I'm using both postgres and mysql, so subqueries are probably bad, given that mysql doesn't optimize subqueries (pre version 6, anyway).
SELECT b.*
FROM boxes b JOIN thingsinboxes t ON (b.id = t.box_id)
GROUP BY b.id
HAVING COUNT(DISTINCT t.thing) = 1 AND SUM(t.thing = 'orange') > 0;
Here's another solution that does not use GROUP BY:
SELECT DISTINCT b.*
FROM boxes b
JOIN thingsinboxes t1
ON (b.id = t1.box_id AND t1.thing = 'orange')
LEFT OUTER JOIN thingsinboxes t2
ON (b.id = t2.box_id AND t2.thing != 'orange')
WHERE t2.box_id IS NULL;
As always, before you make conclusions about the scalability or performance of a query, you have to try it with a realistic data set, and measure the performance.
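One portability note on the GROUP BY query above: SUM(t.thing = 'orange') sums a boolean, which MySQL accepts but PostgreSQL rejects. A CASE expression is one way to write the same test for both. The sketch below assumes SQLite via Python's sqlite3 as a stand-in engine, with thing stored as plain text instead of an enum.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE boxes (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE thingsinboxes (id INTEGER PRIMARY KEY, box_id INTEGER, thing TEXT);
INSERT INTO boxes VALUES
    (1, 'orangesOnly'), (2, 'orangesOnly2'), (3, 'orangesBananas'), (4, 'misc');
INSERT INTO thingsinboxes VALUES
    (1, 1, 'orange'), (2, 1, 'orange'), (3, 2, 'orange'), (4, 3, 'orange'),
    (5, 3, 'banana'), (6, 4, 'orange'), (7, 4, 'apple'), (8, 4, 'banana');
""")

# One distinct thing per box, and at least one of the rows is an orange.
query = """
SELECT b.id, b.name
FROM boxes b
JOIN thingsinboxes t ON b.id = t.box_id
GROUP BY b.id, b.name
HAVING COUNT(DISTINCT t.thing) = 1
   AND SUM(CASE WHEN t.thing = 'orange' THEN 1 ELSE 0 END) > 0
ORDER BY b.id
"""
orange_only = conn.execute(query).fetchall()
print(orange_only)
```

This only checks correctness on the eight sample rows; it says nothing about performance at the scale mentioned in the question, which still has to be measured on realistic data.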
I think Bill Karwin's query is just fine. However, if a relatively small proportion of boxes contain oranges, you should be able to speed things up by using an index on the thing field:
SELECT b.*
FROM boxes b JOIN thingsinboxes t1 ON (b.id = t1.box_id)
WHERE t1.thing = 'orange'
AND NOT EXISTS (
SELECT 1
FROM thingsinboxes t2
WHERE t2.box_id = b.id
AND t2.thing <> 'orange'
)
GROUP BY t1.box_id
The WHERE NOT EXISTS subquery will only be run once per orange thing, so it's not too expensive provided there aren't many oranges.