SQL: Select Vegetarians - sql

I'm stuck on a small problem and can't seem to find the right answer.
I have a DB with multiple tables, in which I have ANIMALS and their FOOD. I have to select the ANIMALS that eat only VEGETARIAN meals. The food type is stored as a BOOLEAN: 0 = MEAT, 1 = VEGETARIAN.
Some ANIMALS in the table eat both MEAT and VEGETARIAN. I don't need them; I strictly need the animals that eat only VEGETARIAN.
Thanks in advance.

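You didn't say what your table schema is, so I will create a simple one with the fields AnimalId int, Meat bit. Note that here Meat = 1 means the animal eats meat, which is the reverse of the flag in the question.
The records are stored in the table like this: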
+----------+------+
| AnimalId | meat |
+----------+------+
| 1 | 0 |
| 1 | 1 |
| 2 | 0 |
| 3 | 1 |
| 4 | 0 |
+----------+------+
So animal 1 eats both meat and vegetables, animals 2 and 4 eat only vegetables, and animal 3 eats only meat. The expected result is animal IDs 2 and 4.
You need to group the data by AnimalId, take the MAX of the Meat bit column after casting it to int, and keep only the groups where that MAX is 0.
SELECT AnimalId
FROM dbo.Animals
GROUP BY AnimalId
HAVING (MAX(CAST(Meat AS INT)) = 0)
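A quick way to check this query against the sample rows above is to run it in SQLite from Python. This is only a sketch: SQLite has no dbo schema or bit type, so the table is plain Animals with an INTEGER Meat column (my substitution, not part of the answer), and the CAST is dropped because it is not needed there.

```python
import sqlite3

# Sample rows from the table above: (AnimalId, Meat), where Meat = 1 means meat.
SAMPLE = [(1, 0), (1, 1), (2, 0), (3, 1), (4, 0)]


def vegetarian_ids(rows):
    """Return the AnimalIds whose Meat flag is 0 on every one of their rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Animals (AnimalId INTEGER, Meat INTEGER)")
    conn.executemany("INSERT INTO Animals VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT AnimalId
        FROM Animals
        GROUP BY AnimalId
        HAVING MAX(Meat) = 0   -- no row of this animal has Meat = 1
        ORDER BY AnimalId
        """
    ).fetchall()
    conn.close()
    return [r[0] for r in result]


print(vegetarian_ids(SAMPLE))  # [2, 4]
```

With the flag from the question (1 = VEGETARIAN) the same idea becomes HAVING MIN(flag) = 1.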

You can try this (but it's only a mock-up, since you didn't describe your schema):
SELECT *
FROM ANIMALS A
INNER JOIN BR_ANIMALS_FOOD B ON A.ID = B.ID_ANI
INNER JOIN FOODS C ON B.ID_FOODS = C.ID
WHERE C.FOOD_TYPE = 1
-- exclude every animal that has at least one meat (FOOD_TYPE = 0) row
AND NOT EXISTS (SELECT 1
FROM BR_ANIMALS_FOOD B2
INNER JOIN FOODS D ON B2.ID_FOODS = D.ID
WHERE B2.ID_ANI = A.ID AND D.FOOD_TYPE = 0)
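The "this animal has no meat row at all" check can be tried out with a small SQLite sketch. The table names are the mock ones from this answer; the NAME column and the sample animals are made up for the illustration, and FOOD_TYPE follows the question (0 = MEAT, 1 = VEGETARIAN).

```python
import sqlite3

# Hypothetical sample data for the mock schema.
ANIMALS = [(1, "bear"), (2, "rabbit"), (3, "lion")]
FOODS = [(1, 0), (2, 1)]                    # (ID, FOOD_TYPE): 1 = steak, 2 = carrot
LINKS = [(1, 1), (1, 2), (2, 2), (3, 1)]    # (ID_ANI, ID_FOODS)


def strict_vegetarians(animals, foods, links):
    """Names of animals with at least one vegetarian food and no meat food."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE ANIMALS (ID INTEGER PRIMARY KEY, NAME TEXT);
        CREATE TABLE FOODS (ID INTEGER PRIMARY KEY, FOOD_TYPE INTEGER);
        CREATE TABLE BR_ANIMALS_FOOD (ID_ANI INTEGER, ID_FOODS INTEGER);
        """
    )
    conn.executemany("INSERT INTO ANIMALS VALUES (?, ?)", animals)
    conn.executemany("INSERT INTO FOODS VALUES (?, ?)", foods)
    conn.executemany("INSERT INTO BR_ANIMALS_FOOD VALUES (?, ?)", links)
    rows = conn.execute(
        """
        SELECT A.NAME
        FROM ANIMALS A
        WHERE EXISTS (SELECT 1
                      FROM BR_ANIMALS_FOOD B
                      JOIN FOODS C ON C.ID = B.ID_FOODS
                      WHERE B.ID_ANI = A.ID AND C.FOOD_TYPE = 1)
          -- the subquery is correlated on the animal, not on a single food row
          AND NOT EXISTS (SELECT 1
                          FROM BR_ANIMALS_FOOD B2
                          JOIN FOODS D ON D.ID = B2.ID_FOODS
                          WHERE B2.ID_ANI = A.ID AND D.FOOD_TYPE = 0)
        ORDER BY A.NAME
        """
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]


print(strict_vegetarians(ANIMALS, FOODS, LINKS))  # ['rabbit']
```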

Related

Find out what group id contains all relevant attributes in SQL

Let's say the groups in this case are groups of animals.
Let's say I have the following tables:
animal_id | attribute_id | animal
----------------------------------
1 | 1 | dog
1 | 4 | dog
2 | 1 | cat
2 | 3 | cat
3 | 2 | fish
3 | 5 | fish
id | attribute
------------------
1 | four legs
2 | no legs
3 | feline
4 | canine
5 | aquatic
Here the first table contains the attributes that define an animal, and the second table keeps track of what each attribute is. Now let's say that we run a query on some data and get the following result table:
attribute_id
------------
1
4
This data would describe a dog, since it is the only animal_id that has both attributes 1 and 4. I want to be able to somehow get the animal_id (which in this case would be 1) based on the third table, which is essentially a table that has already been generated that contains the attributes of an animal.
EDIT
So the third table doesn't have to contain 1 and 4. It could contain 2 and 5 (for fish), or 1 and 3 (cat). We can assume that its result will always match one animal completely, but we don't know which one.
You can use group by and having:
with a as (
select 1 as attribute_id from dual union all
select 4 as attribute_id from dual
)
select t.animal_id, t.animal
from t join
a
on t.attribute_id = a.attribute_id
group by t.animal_id, t.animal
having count(*) = (select count(*) from a);
The above will find all animals that have those attributes and any others. If you want animals that have exactly those 2 attributes:
with a as (
select 1 as attribute_id from dual union all
select 4 as attribute_id from dual
)
select t.animal_id, t.animal
from t left join
a
on t.attribute_id = a.attribute_id
group by t.animal_id, t.animal
having count(*) = (select count(*) from a) and
count(*) = count(a.attribute_id);
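To see the difference between the two queries, here is a sketch that runs both in SQLite on the sample data from the question. SQLite has no dual, so the wanted attributes go into a small table a instead of a CTE; that, and the helper name match_animals, are my own choices for the illustration.

```python
import sqlite3

# (animal_id, attribute_id, animal) rows from the question.
ANIMAL_ATTRIBUTES = [
    (1, 1, "dog"), (1, 4, "dog"),
    (2, 1, "cat"), (2, 3, "cat"),
    (3, 2, "fish"), (3, 5, "fish"),
]

# Animals that have all wanted attributes (and possibly others).
SUPERSET_SQL = """
    SELECT t.animal_id, t.animal
    FROM t JOIN a ON t.attribute_id = a.attribute_id
    GROUP BY t.animal_id, t.animal
    HAVING COUNT(*) = (SELECT COUNT(*) FROM a)
    ORDER BY t.animal_id
"""

# Animals that have exactly the wanted attributes and nothing else.
EXACT_SQL = """
    SELECT t.animal_id, t.animal
    FROM t LEFT JOIN a ON t.attribute_id = a.attribute_id
    GROUP BY t.animal_id, t.animal
    HAVING COUNT(*) = (SELECT COUNT(*) FROM a)
       AND COUNT(*) = COUNT(a.attribute_id)
    ORDER BY t.animal_id
"""


def match_animals(wanted, exact=False):
    """Run one of the two queries for the given list of attribute ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (animal_id INTEGER, attribute_id INTEGER, animal TEXT)"
    )
    conn.execute("CREATE TABLE a (attribute_id INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", ANIMAL_ATTRIBUTES)
    conn.executemany("INSERT INTO a VALUES (?)", [(x,) for x in wanted])
    rows = conn.execute(EXACT_SQL if exact else SUPERSET_SQL).fetchall()
    conn.close()
    return rows


print(match_animals([1, 4]))            # [(1, 'dog')]
print(match_animals([1]))               # [(1, 'dog'), (2, 'cat')]
print(match_animals([1], exact=True))   # []
```

Asking for attribute 1 alone shows the point: the first query returns dog and cat because both have it, while the second returns nothing because no animal has only that attribute.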

Comparing different columns in SQL for each row

After some transformation I have the result of a cross join (of tables a and b) that I want to analyze. The table looks like this:
+-----+------+------+------+------+-----+------+------+------+------+
| id | 10_1 | 10_2 | 11_1 | 11_2 | id | 10_1 | 10_2 | 11_1 | 11_2 |
+-----+------+------+------+------+-----+------+------+------+------+
| 111 | 1 | 0 | 1 | 0 | 222 | 1 | 0 | 1 | 0 |
| 111 | 1 | 0 | 1 | 0 | 333 | 0 | 0 | 0 | 0 |
| 111 | 1 | 0 | 1 | 0 | 444 | 1 | 0 | 1 | 1 |
| 112 | 0 | 1 | 1 | 0 | 222 | 1 | 0 | 1 | 0 |
+-----+------+------+------+------+-----+------+------+------+------+
The ids in the first column are different from the ids in the sixth column.
Each row always contains two different IDs that are matched with each other. The other columns always have either 0 or 1 as a value.
I am now trying to find out how many values two IDs have in common on average (meaning both have "1" in 10_1, 10_2, etc.), but I don't really know how to do so.
I was trying something like this as a start:
SELECT SUM(CASE WHEN a.10_1 = 1 AND b.10_1 = 1 then 1 end)
But this would obviously only count how often two ids have 10_1 in common. I could make something like this for example for different columns:
SELECT SUM(CASE WHEN (a.10_1 = 1 AND b.10_1 = 1)
OR (a.10_2 = 1 AND b.10_2 = 1) OR [...] then 1 end)
That would count how often two IDs have at least one thing in common, but it would not distinguish pairs that have two or more things in common. Plus, I would also like to know how often two IDs have two things, three things, etc. in common.
One "problem" in my case is also that I have ~30 columns I want to look at, so I can hardly write down every possible combination for each case.
Does anyone know how I can approach my problem in a better way?
Thanks in advance.
Edit:
A possible result could look like this:
+-----------+---------+
| in_common | count |
+-----------+---------+
| 0 | 100 |
| 1 | 500 |
| 2 | 1500 |
| 3 | 5000 |
| 4 | 3000 |
+-----------+---------+
With the codes as column names, you're going to have to write some code that explicitly references each column name. To keep that to a minimum, you could write those references in a single union statement that normalizes the data, such as:
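-- wide_table is a placeholder name for the table that holds the 10_1 ... 11_2 columns
select id, '10_1' as code from wide_table where "10_1" = 1
union
select id, '10_2' as code from wide_table where "10_2" = 1
union
select id, '11_1' as code from wide_table where "11_1" = 1
union
select id, '11_2' as code from wide_table where "11_2" = 1;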
This needs to be modified to include whatever additional columns you need to link up different IDs. For the purpose of this illustration, I assume the following data model
create table p (
id integer not null primary key,
sex character(1) not null,
age integer not null
);
create table t1 (
id integer not null,
code character varying(4) not null,
constraint pk_t1 primary key (id, code)
);
Though your data evidently does not currently resemble this structure, normalizing your data into a form like this would allow you to apply the following solution to summarize your data in the desired form.
select
in_common,
count(*) as count
from (
select
count(*) as in_common
from (
select
a.id as a_id, a.code,
b.id as b_id, b.code
from
(select p.*, t1.code
from p left join t1 on p.id=t1.id
) as a
inner join (select p.*, t1.code
from p left join t1 on p.id=t1.id
) as b on b.sex <> a.sex and b.age between a.age-10 and a.age+10
where
a.id < b.id
and a.code = b.code
) as c
group by
a_id, b_id
) as summ
group by
in_common;
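To make the normalize-then-count idea concrete, here is a sketch in Python with SQLite using the four sample pairs from the question. It deviates from the query above in two labeled ways: the pairs come from an explicit pairs table (my assumption, standing in for the sex/age join on p), and LEFT JOINs are used so that pairs with nothing in common show up with in_common = 0.

```python
import sqlite3

CODES = ("10_1", "10_2", "11_1", "11_2")

# Wide rows from the question: id -> flags in the order of CODES.
WIDE_ROWS = {
    111: (1, 0, 1, 0),
    112: (0, 1, 1, 0),
    222: (1, 0, 1, 0),
    333: (0, 0, 0, 0),
    444: (1, 0, 1, 1),
}
# The four id pairs shown in the question's cross-join sample.
PAIRS = [(111, 222), (111, 333), (111, 444), (112, 222)]


def in_common_histogram(wide_rows, pairs):
    """Return (in_common, number_of_pairs) rows, ordered by in_common."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t1 (id INTEGER, code TEXT, PRIMARY KEY (id, code))")
    conn.execute("CREATE TABLE pairs (id_1 INTEGER, id_2 INTEGER)")
    # Normalize: keep one (id, code) row for every flag that is 1.
    long_rows = [
        (row_id, code)
        for row_id, flags in wide_rows.items()
        for code, flag in zip(CODES, flags)
        if flag == 1
    ]
    conn.executemany("INSERT INTO t1 VALUES (?, ?)", long_rows)
    conn.executemany("INSERT INTO pairs VALUES (?, ?)", pairs)
    rows = conn.execute(
        """
        SELECT in_common, COUNT(*) AS pair_count
        FROM (
            SELECT p.id_1, p.id_2, COUNT(b.code) AS in_common
            FROM pairs p
            LEFT JOIN t1 a ON a.id = p.id_1
            LEFT JOIN t1 b ON b.id = p.id_2 AND b.code = a.code
            GROUP BY p.id_1, p.id_2
        ) AS per_pair
        GROUP BY in_common
        ORDER BY in_common
        """
    ).fetchall()
    conn.close()
    return rows


print(in_common_histogram(WIDE_ROWS, PAIRS))  # [(0, 1), (1, 1), (2, 2)]
```

Only shared 1s are counted here, because a 0 flag never becomes a row in t1.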
The proposed solution requires first to take one step back from the cross-join table, as the identical column names are super annoying. Instead, we take the ids from the two tables and put them in a temporary table. The following query gets the result wanted in the question. It assumes table_a and table_b from the question are the same and called tbl, but this assumption is not needed and tbl can be replaced by table_a and table_b in the two sub-SELECT queries. It looks complicated and uses the JSON trick to flatten the columns, but it works here:
WITH idtable AS (
SELECT a.id as id_1, b.id as id_2 FROM
-- put cross join of table a and table b here
)
SELECT in_common,
count(*)
FROM
(SELECT idtable.*,
sum(CASE
WHEN meltedR.value::text=meltedL.value::text THEN 1
ELSE 0
END) AS in_common
FROM idtable
JOIN
(SELECT tbl.id,
b.*
FROM tbl, -- change here to table_a
json_each(row_to_json(tbl)) b -- and here too
WHERE KEY<>'id' ) meltedL ON (idtable.id_1 = meltedL.id)
JOIN
(SELECT tbl.id,
b.*
FROM tbl, -- change here to table_b
json_each(row_to_json(tbl)) b -- and here too
WHERE KEY<>'id' ) meltedR ON (idtable.id_2 = meltedR.id
AND meltedL.key = meltedR.key)
GROUP BY idtable.id_1,
idtable.id_2) tt
GROUP BY in_common ORDER BY in_common;
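For the four sample rows in the question, the output looks like this. Note that the CASE counts every column whose two values are equal, so a shared 0 counts as well as a shared 1: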
in_common | count
-----------+-------
2 | 2
3 | 1
4 | 1
(3 rows)

Filtering out rows

I have simplified a table as an example
tray| food
-------+-------
1 | fruit
2 | veg
2 | fruit
2 | meat
3 | meat
4 | bread
What I want to find, for each food, is the number of trays that ONLY contain that type of food. So the output should look like this:
food| count
-------+-------
fruit | 1
veg | 0
meat | 1
bread | 1
I tried writing a query:
SELECT fruit, COUNT(*)
FROM Inventory
WHERE NOT EXISTS (SELECT *
FROM Inventory I
WHERE I.tray = tray AND I.fruit<>fruit)
GROUP BY fruit;
However the table returned is incorrect and it looks like my sub query is wrong but it makes logical sense to me.
food | count
-------+-------
fruit | 2
veg | 1
bread | 1
meat | 2
It looks like tray 2 is counting once for fruit, meat and veg when it should not. But shouldn't that be ruled out by my NOT EXISTS subquery? How do I fix this?
Clever little problem. Here is one solution:
select f.food, count(t.tray)
from (select distinct food from inventory
) f left join
(select i.tray, min(i.food) as minfood, max(i.food) as maxfood
from inventory i
group by i.tray
) t
on f.food = t.minfood and f.food = t.maxfood
group by f.food;
Getting a count of zero suggests a query where left join and group by would be useful.
select i1.food,
count(ft.tray)
from inventory i1
left join (
select i1.tray,
count(distinct i1.food) as num_food
from inventory i1
group by i1.tray
) ft on i1.tray = ft.tray and num_food = 1
group by i1.food;
The inner query (ft) counts the number of different foods per tray. The outer (main) query counts the number of trays per food for those trays that only contain a single type of food.
Online example: http://rextester.com/PUN27611
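If you would rather check it locally, the same query can be run against the sample table with SQLite from Python. This is just a sketch: the inner alias is renamed to i2 for readability and an ORDER BY is added so the output is stable.

```python
import sqlite3

# (tray, food) rows from the question.
INVENTORY = [
    (1, "fruit"), (2, "veg"), (2, "fruit"),
    (2, "meat"), (3, "meat"), (4, "bread"),
]


def single_food_tray_counts(rows):
    """For each food, count the trays that contain only that food."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE inventory (tray INTEGER, food TEXT)")
    conn.executemany("INSERT INTO inventory VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT i1.food, COUNT(ft.tray)
        FROM inventory i1
        LEFT JOIN (
            -- number of different foods on each tray
            SELECT i2.tray, COUNT(DISTINCT i2.food) AS num_food
            FROM inventory i2
            GROUP BY i2.tray
        ) ft ON i1.tray = ft.tray AND ft.num_food = 1
        GROUP BY i1.food
        ORDER BY i1.food
        """
    ).fetchall()
    conn.close()
    return result


print(single_food_tray_counts(INVENTORY))
# [('bread', 1), ('fruit', 1), ('meat', 1), ('veg', 0)]
```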

Counting Distinct Records Using Multiple Criteria From Another Table In MySQL

This is not homework. I have changed the names of the tables and fields, for illustrative purposes only. I admit that I am completely new to MySQL. Please consider that in your answer.
The best way to illustrate the function of the query I need is like this:
I have two tables.
One table has a 0..1 to 0..n relationship to the other table.
For simplicity's sake only, suppose that the two tables were Recipe and Ingredient.
One of the fields in the Ingredient table refers to the Recipe table, but may be null.
Just for example:
I want to know the SQL for something like: How many recipes call for "Olives" in the amount of "1" AND "Mushrooms" in the amount of "2"?
Being brand new to The Structured Query Language, I'm not even sure what to google for this information.
Am I on the right track with the following?:
SELECT COUNT(DISTINCT Recipe.ID) AS Answer FROM Recipe, Ingredient
WHERE Ingredient.RecipeID=Recipe.ID AND Ingredient.Name='Olives'
AND Ingredient.Amount=1 AND Ingredient.Name='Mushrooms'
AND Ingredient.Amount=2
I realize this is totally wrong because Name cannot be BOTH Olives And Mushrooms... but don't know what to put instead, since I need to count all recipes with both, but only all recipes with both.
How can I properly write such a query for MySQL?
You were close.
Try something like
SELECT COUNT(Recipe.ID) Answer
FROM Recipe INNER JOIN
Ingredient olives ON olives.RecipeID=Recipe.ID INNER JOIN
Ingredient mushrooms ON mushrooms.RecipeID=Recipe.ID
WHERE olives.Name='Olives'
AND mushrooms.Name='Mushrooms'
AND olives.Amount = 1
AND mushrooms.Amount = 2
You can join to the same table twice; all you need to do is give each instance its own alias.
Use:
SELECT COUNT(r.id)
FROM RECIPE r
WHERE EXISTS(SELECT NULL
FROM INGREDIENT i
WHERE i.recipeid = r.id
AND i.name = 'Olives'
AND i.amount = 1)
AND EXISTS(SELECT NULL
FROM INGREDIENT i
WHERE i.recipeid = r.id
AND i.name = 'Mushrooms'
AND i.amount = 2)
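The self-join and the EXISTS form should agree on the count. Here is a sketch that checks this in SQLite from Python; the recipes and ingredient rows are invented for the illustration, and the self-join uses COUNT(DISTINCT ...) as in the question's attempt so that duplicate ingredient rows could not inflate the count.

```python
import sqlite3

# Hypothetical sample data: only recipes 1 and 4 have Olives x1 AND Mushrooms x2.
RECIPES = [(1, "Pizza"), (2, "Salad"), (3, "Stew"), (4, "Omelette")]
INGREDIENTS = [  # (ID, Name, Amount, RecipeID)
    (1, "Olives", 1, 1), (2, "Mushrooms", 2, 1),
    (3, "Olives", 1, 2),
    (4, "Mushrooms", 2, 3), (5, "Olives", 3, 3),
    (6, "Olives", 1, 4), (7, "Mushrooms", 2, 4), (8, "Cheese", 1, 4),
]

SELF_JOIN_SQL = """
    SELECT COUNT(DISTINCT Recipe.ID)
    FROM Recipe
    INNER JOIN Ingredient olives ON olives.RecipeID = Recipe.ID
    INNER JOIN Ingredient mushrooms ON mushrooms.RecipeID = Recipe.ID
    WHERE olives.Name = 'Olives' AND olives.Amount = 1
      AND mushrooms.Name = 'Mushrooms' AND mushrooms.Amount = 2
"""

EXISTS_SQL = """
    SELECT COUNT(r.ID)
    FROM Recipe r
    WHERE EXISTS (SELECT NULL FROM Ingredient i
                  WHERE i.RecipeID = r.ID AND i.Name = 'Olives' AND i.Amount = 1)
      AND EXISTS (SELECT NULL FROM Ingredient i
                  WHERE i.RecipeID = r.ID AND i.Name = 'Mushrooms' AND i.Amount = 2)
"""


def count_recipes(recipes, ingredients):
    """Return (self-join count, EXISTS count) for the two query styles."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Recipe (ID INTEGER PRIMARY KEY, Name TEXT)")
    conn.execute(
        "CREATE TABLE Ingredient "
        "(ID INTEGER PRIMARY KEY, Name TEXT, Amount INTEGER, RecipeID INTEGER)"
    )
    conn.executemany("INSERT INTO Recipe VALUES (?, ?)", recipes)
    conn.executemany("INSERT INTO Ingredient VALUES (?, ?, ?, ?)", ingredients)
    by_join = conn.execute(SELF_JOIN_SQL).fetchone()[0]
    by_exists = conn.execute(EXISTS_SQL).fetchone()[0]
    conn.close()
    return by_join, by_exists


print(count_recipes(RECIPES, INGREDIENTS))  # (2, 2)
```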
SELECT COUNT(DISTINCT Recipe.ID) AS Answer
FROM Recipe, Ingredient AS ing1, Ingredient AS ing2
WHERE
ing1.RecipeID = Recipe.ID AND ing1.Name = 'Olives' AND ing1.Amount = 1 AND
ing2.RecipeID = Recipe.ID AND ing2.Name = 'Mushrooms' AND ing2.Amount = 2;
Hope this helps.
First of all, you use the old join syntax. INNER JOIN creates the same execution plan, but is much clearer. Then, your query is all about a plain old condition. You just have to write it correctly! (It takes some practice, I agree.)
SELECT COUNT(DISTINCT R.ID) Answer
FROM Recipe R
INNER JOIN Ingredient I
ON R.ID = I.RecipeID
AND (R.Name = 'Olives' AND I.Amount = 1)
OR (R.Name = 'Mushrooms' AND I.Amount = 2);
Here is my sample data:
mysql> SELECT * FROM Ingredient;
+------+------+--------+----------+
| ID | Name | Amount | RecipeID |
+------+------+--------+----------+
| 1 | I1 | 1 | 1 |
| 1 | I2 | 2 | 1 |
| 1 | I3 | 2 | 1 |
| 1 | I4 | 3 | 1 |
| 1 | I1 | 1 | 2 |
| 1 | I2 | 1 | 2 |
| 1 | I3 | 3 | 2 |
| 1 | I4 | 2 | 2 |
| 1 | I1 | 2 | 3 |
| 1 | I2 | 1 | 3 |
+------+------+--------+----------+
10 rows in set (0.00 sec)
mysql> SELECT * FROM Recipe;
+------+-----------+
| ID | Name |
+------+-----------+
| 1 | Mushrooms |
| 2 | Olives |
| 3 | Tomatoes |
+------+-----------+
3 rows in set (0.00 sec)
And my query outputs 2, as it should.
Edit: Actually, I realised my query selected the wrong rows. This works correctly:
SELECT COUNT(DISTINCT R.Id) Answer
FROM Recipe R
INNER JOIN Ingredient I
ON R.ID = I.RecipeID
WHERE
(R.Name = 'Mushrooms' AND I.Amount = 2)
OR (R.Name = 'Olives' AND I.Amount = 1);
Which outputs 2 as well.
Just for reference...
How many recipes call for "Olives" in the amount of "1" AND "Mushrooms" in the amount of "2"
Looking at your table structure above, you may want an associative entity to go between recipes and ingredients. This makes it a lot easier to do the kind of query you're looking for with an inner join.
In between your two tables, I would imagine something like this...
Recipe_Ingredient
-----------------
(PK, FK) RecipeID
(PK, FK) IngredientID
If you have that, then each recipe may have many ingredients, and each ingredient may be a part of many recipes. Once you have a table like that to properly associate your two tables, you can join them together and get a complete recipe. For a more complicated query like this where you have separate conditions for two recipes, I would probably sub-query it to help understanding.
-- Note: this adds up the recipes matching each condition separately,
-- so a recipe that satisfies both conditions is counted twice.
SELECT SUM(RecipeCount) as RecipeCount
FROM
(
SELECT COUNT(*) as RecipeCount
FROM Recipe r
INNER JOIN Recipe_Ingredient ri on r.ID = ri.RecipeID
INNER JOIN Ingredient i on i.ID = ri.IngredientID
WHERE i.Name = 'Olives' AND i.Amount = 1
UNION ALL
SELECT COUNT(*) as RecipeCount
FROM Recipe r
INNER JOIN Recipe_Ingredient ri on r.ID = ri.RecipeID
INNER JOIN Ingredient i on i.ID = ri.IngredientID
WHERE i.Name = 'Mushrooms' AND i.Amount = 2
) as subTable

Filter a one-to-many query by requiring all of many meet criteria

Imagine the following tables:
create table boxes( id int, name text, ...);
create table thingsinboxes( id int, box_id int, thing enum('apple','banana','orange'));
And the tables look like:
Boxes:
id | name
1 | orangesOnly
2 | orangesOnly2
3 | orangesBananas
4 | misc
thingsinboxes:
id | box_id | thing
1 | 1 | orange
2 | 1 | orange
3 | 2 | orange
4 | 3 | orange
5 | 3 | banana
6 | 4 | orange
7 | 4 | apple
8 | 4 | banana
How do I select the boxes that contain at least one orange and nothing that isn't an orange?
How does this scale, assuming I have several hundred thousand boxes and possibly a million things in boxes?
I'd like to keep this all in SQL if possible, rather than post-processing the result set with a script.
I'm using both postgres and mysql, so subqueries are probably bad, given that mysql doesn't optimize subqueries (pre version 6, anyway).
SELECT b.*
FROM boxes b JOIN thingsinboxes t ON (b.id = t.box_id)
GROUP BY b.id
HAVING COUNT(DISTINCT t.thing) = 1 AND SUM(t.thing = 'orange') > 0;
Here's another solution that does not use GROUP BY:
SELECT DISTINCT b.*
FROM boxes b
JOIN thingsinboxes t1
ON (b.id = t1.box_id AND t1.thing = 'orange')
LEFT OUTER JOIN thingsinboxes t2
ON (b.id = t2.box_id AND t2.thing != 'orange')
WHERE t2.box_id IS NULL;
As always, before you make conclusions about the scalability or performance of a query, you have to try it with a realistic data set, and measure the performance.
I think Bill Karwin's query is just fine, however if a relatively small proportion of boxes contain oranges, you should be able to speed things up by using an index on the thing field:
SELECT b.*
FROM boxes b JOIN thingsinboxes t1 ON (b.id = t1.box_id)
WHERE t1.thing = 'orange'
AND NOT EXISTS (
SELECT 1
FROM thingsinboxes t2
WHERE t2.box_id = b.id
AND t2.thing <> 'orange'
)
GROUP BY t1.box_id
The WHERE NOT EXISTS subquery will only be run once per orange thing, so it's not too expensive provided there aren't many oranges.
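For a quick correctness check (not a performance test), the anti-join and the NOT EXISTS variants can both be run on the sample tables from the question. This sketch uses SQLite from Python, so thing is plain TEXT instead of an enum, and the queries select b.id, b.name explicitly rather than b.*; both are my adjustments.

```python
import sqlite3

BOXES = [(1, "orangesOnly"), (2, "orangesOnly2"), (3, "orangesBananas"), (4, "misc")]
THINGS = [  # (id, box_id, thing)
    (1, 1, "orange"), (2, 1, "orange"), (3, 2, "orange"),
    (4, 3, "orange"), (5, 3, "banana"),
    (6, 4, "orange"), (7, 4, "apple"), (8, 4, "banana"),
]

# LEFT JOIN anti-join: keep boxes with an orange and no matching non-orange row.
ANTI_JOIN_SQL = """
    SELECT DISTINCT b.id, b.name
    FROM boxes b
    JOIN thingsinboxes t1 ON b.id = t1.box_id AND t1.thing = 'orange'
    LEFT OUTER JOIN thingsinboxes t2 ON b.id = t2.box_id AND t2.thing != 'orange'
    WHERE t2.box_id IS NULL
    ORDER BY b.id
"""

# NOT EXISTS: same condition expressed as a correlated subquery.
NOT_EXISTS_SQL = """
    SELECT b.id, b.name
    FROM boxes b
    JOIN thingsinboxes t1 ON b.id = t1.box_id
    WHERE t1.thing = 'orange'
      AND NOT EXISTS (SELECT 1 FROM thingsinboxes t2
                      WHERE t2.box_id = b.id AND t2.thing <> 'orange')
    GROUP BY b.id, b.name
    ORDER BY b.id
"""


def orange_only_boxes(sql, boxes=BOXES, things=THINGS):
    """Run one of the two queries and return the matching (id, name) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE boxes (id INTEGER, name TEXT)")
    conn.execute("CREATE TABLE thingsinboxes (id INTEGER, box_id INTEGER, thing TEXT)")
    conn.executemany("INSERT INTO boxes VALUES (?, ?)", boxes)
    conn.executemany("INSERT INTO thingsinboxes VALUES (?, ?, ?)", things)
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows


print(orange_only_boxes(ANTI_JOIN_SQL))   # [(1, 'orangesOnly'), (2, 'orangesOnly2')]
print(orange_only_boxes(NOT_EXISTS_SQL))  # [(1, 'orangesOnly'), (2, 'orangesOnly2')]
```

Which one is faster depends on the engine, the indexes and the data, so it still needs to be measured on a realistic data set.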