select [CAR] based on [FEATURE]s -- all or nothing - sql

Currently, I have 2 tables with the following data:
[CAR]
Honda
Ford
Mazda
[FEATURE]
CD player
Sunroof
Leather
In another table, I store their relationships.
[CAR_FEATURE]
Honda | CD Player
Mazda | CD Player
Mazda | Sunroof
Mazda | Leather
My problem is... I need to select the [CAR] that has all the [FEATURE] I am looking for -- all or nothing.
For example, if I am looking for a car with a CD Player, Sunroof, AND Leather, then it would output Mazda.
If I am looking for a car with a CD Player, then it would ONLY output Honda.
If I am looking for a car with a Sunroof, then it would output nothing because I am searching for all or nothing.
How would I write the query statement in SQL for this particular case?

This variation of T McKeown's query will get you a car matching all the features exactly, not more or less:
SELECT CAR
FROM CAR_FEATURE
GROUP BY CAR
HAVING SUM(CASE WHEN FEATURE IN ('CD Player', 'Sunroof', 'Leather')
THEN 1
ELSE -1
END) = 3
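To see how the -1 penalty behaves, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database and runs the same HAVING logic from Python. The helper name `cars_with_exactly` and the choice of SQLite are assumptions for illustration only, and it assumes CAR_FEATURE holds no duplicate rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CAR_FEATURE (CAR TEXT, FEATURE TEXT)")
conn.executemany(
    "INSERT INTO CAR_FEATURE VALUES (?, ?)",
    [
        ("Honda", "CD Player"),
        ("Mazda", "CD Player"),
        ("Mazda", "Sunroof"),
        ("Mazda", "Leather"),
    ],
)


def cars_with_exactly(conn, features):
    """Hypothetical helper: cars whose feature set equals `features`."""
    placeholders = ", ".join("?" for _ in features)
    sql = f"""
        SELECT CAR
        FROM CAR_FEATURE
        GROUP BY CAR
        -- wanted features count +1, any other feature counts -1
        HAVING SUM(CASE WHEN FEATURE IN ({placeholders}) THEN 1 ELSE -1 END) = ?
        ORDER BY CAR
    """
    return [row[0] for row in conn.execute(sql, [*features, len(features)])]


print(cars_with_exactly(conn, ["CD Player", "Sunroof", "Leather"]))
print(cars_with_exactly(conn, ["CD Player"]))
print(cars_with_exactly(conn, ["Sunroof"]))
```

The sum can only reach the number of wanted features when every wanted feature is present and nothing else is, which is why the comparison value has to change with the list length.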

This would work:
SELECT CAR
FROM CAR_FEATURE
WHERE FEATURE IN ('CD Player', 'Sunroof', 'Leather')
GROUP BY CAR
HAVING COUNT(*) = 3
Just adjust the HAVING clause to the number of features. You could also pass in a CSV list of features and load it into a table variable, so the query could look like this:
DECLARE @LIST_OF_FEATURES TABLE(FEATURE VARCHAR(50))
--INSERT INTO @LIST_OF_FEATURES USING FOR XML
SELECT CAR
FROM CAR_FEATURE
WHERE FEATURE IN (SELECT FEATURE FROM @LIST_OF_FEATURES)
GROUP BY CAR
HAVING COUNT(*) = (SELECT COUNT(*) FROM @LIST_OF_FEATURES)
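One thing worth checking against the question's sample data: because the WHERE clause discards non-matching features before counting, this form returns every car that has at least the requested features. The sketch below is an assumption-laden stand-in (SQLite, a temp table named LIST_OF_FEATURES in place of the table variable, and a hypothetical helper `cars_with_at_least`), not the original T-SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CAR_FEATURE (CAR TEXT, FEATURE TEXT);
    INSERT INTO CAR_FEATURE VALUES
        ('Honda', 'CD Player'),
        ('Mazda', 'CD Player'),
        ('Mazda', 'Sunroof'),
        ('Mazda', 'Leather');
    -- Stand-in for the table variable; the name is an assumption.
    CREATE TEMP TABLE LIST_OF_FEATURES (FEATURE TEXT);
""")


def cars_with_at_least(conn, features):
    """Hypothetical helper: cars that have all of `features`, possibly more."""
    conn.execute("DELETE FROM LIST_OF_FEATURES")
    conn.executemany(
        "INSERT INTO LIST_OF_FEATURES VALUES (?)", [(f,) for f in features]
    )
    sql = """
        SELECT CAR
        FROM CAR_FEATURE
        WHERE FEATURE IN (SELECT FEATURE FROM LIST_OF_FEATURES)
        GROUP BY CAR
        HAVING COUNT(*) = (SELECT COUNT(*) FROM LIST_OF_FEATURES)
        ORDER BY CAR
    """
    return [row[0] for row in conn.execute(sql)]


print(cars_with_at_least(conn, ["CD Player"]))
print(cars_with_at_least(conn, ["CD Player", "Sunroof", "Leather"]))
```

With the sample rows, a search for 'CD Player' alone lists both Honda and Mazda, so this variant fits a "has all of these" search rather than the strict all-or-nothing match.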

I don't know the exact table structure, so I will improvise with field names. I hope it's clear from context:
SELECT c.*
FROM CAR c
JOIN CAR_FEATURE cf ON (c.id = cf.car_id)
JOIN FEATURE f ON (f.id = cf.feature_id)
LEFT JOIN CAR_FEATURE cf2 ON (cf2.car_id = c.id
AND cf2.feature_id NOT IN (<list of feature ids>))
WHERE f.id IN (<same list of feature ids as above>)
AND cf2.car_id IS NULL
This would bring the cars that have the features you want, but that do not have any other features.
Concrete example. Suppose your feature ids are the names (not a great idea, but good for the example), then you would write:
SELECT c.*
FROM CAR c
JOIN CAR_FEATURE cf ON (c.id = cf.car_id)
JOIN FEATURE f ON (f.id = cf.feature_id)
LEFT JOIN CAR_FEATURE cf2 ON (cf2.car_id = c.id
AND cf2.feature_id NOT IN ('CD Player','Sunroof'))
WHERE f.id IN ('CD Player','Sunroof')
AND cf2.car_id IS NULL
Hope this helps!

Related

Bad performance when joining two sets based on a

To better illustrate my problem picture the following data set that has Rooms that contain a "range" of animals. To represent the range, each animal is assigned a sequence number in a separate table. There are different animal types and the sequence is "reset" for each of them.
Table A
RoomId | StartAnimal | EndAnimal | GroupType
1 | Monkey | Bee | A
1 | Lion | Buffalo | A
2 | Ant | Frog | B
Table B
Animal | Sequence | Type
Monkey | 1 | A
Zebra | 2 | A
Bee | 3 | A
Turtle | 4 | A
Lion | 5 | A
Buffalo | 6 | A
Ant | 1 | B
Frog | 2 | B
Desired Output
Getting all the animals for each Room based on their Start-End entries, e.g.
RoomId | Animal
1 | Monkey
1 | Zebra
1 | Bee
1 | Lion
1 | Buffalo
2 | Ant
2 | Frog
I have been able to get the desired output by first creating a view where the rooms have their start and end sequence numbers, and then Join them with the animal list comparing the ranges.
The problem is that this is performing poorly in my real data set where there are around 10k rooms and around 340k animals. Is there a different (better) way to go about this that I'm not seeing?
Example fiddle I'm working with: https://dbfiddle.uk/RnagCTf0
The query I tried is
WITH fullAnimals AS (
SELECT DISTINCT(RoomId), a.[Animal], ta.[GroupType], a.[sequence] s1, ae.[sequence] s2
FROM [TableA] ta
LEFT JOIN [TableB] a ON a.[Animal] = ta.[StartAnimal] AND a.[Type] = ta.[GroupType]
LEFT JOIN [TableB] ae ON ae.[Animal] = ta.[EndAnimal] AND ae.[Type] = a.[Type]
)
SELECT DISTINCT(r.Id), Name, b.[Animal], b.[Type]
FROM [TableB] b
LEFT JOIN fullAnimals ON (b.[Sequence] >= s1 AND b.[Sequence] <= s2)
INNER JOIN [Rooms] r ON (r.[Id] = fullAnimals.[RoomId]) --this is a third table that has more data from the rooms
WHERE b.[Type] = fullAnimals.[GroupType]
Thanks!
One option, to remove the aggregations, is to use the following joins:
between TableA and TableB, to gather the sequence number of "a.StartAnimal"
between TableA and TableB, to gather the sequence number of "a.EndAnimal"
between TableB and the previous two TableBs, to gather only the rows that have b.Sequence between the sequence numbers of "a.StartAnimal" and "a.EndAnimal", on the matching "Type"
between TableA and Rooms, to gather room infos
SELECT r.*, b.Animal, b.Type
FROM TableA a
INNER JOIN TableB b1 ON a.StartAnimal = b1.Animal
INNER JOIN TableB b2 ON a.EndAnimal = b2.Animal
INNER JOIN TableB b ON b.Sequence BETWEEN b1.Sequence AND b2.Sequence
AND a.GroupType = b.Type
INNER JOIN Rooms r ON r.Id = a.roomId
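As a rough check of the join logic, the sketch below runs it on the sample data in an in-memory SQLite database. It leaves out the Rooms table, matches the start and end animals on Type as well, and creates an index on (Type, Sequence); all three choices are assumptions for illustration, and the index is only a guess at what might help on the real data, not a measured result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (RoomId INTEGER, StartAnimal TEXT, EndAnimal TEXT, GroupType TEXT);
    CREATE TABLE TableB (Animal TEXT, Sequence INTEGER, Type TEXT);
    INSERT INTO TableA VALUES
        (1, 'Monkey', 'Bee', 'A'),
        (1, 'Lion', 'Buffalo', 'A'),
        (2, 'Ant', 'Frog', 'B');
    INSERT INTO TableB VALUES
        ('Monkey', 1, 'A'), ('Zebra', 2, 'A'), ('Bee', 3, 'A'),
        ('Turtle', 4, 'A'), ('Lion', 5, 'A'), ('Buffalo', 6, 'A'),
        ('Ant', 1, 'B'), ('Frog', 2, 'B');
    -- Assumed index to support the range lookup; not part of the original answer.
    CREATE INDEX idx_tableb_type_seq ON TableB (Type, Sequence);
""")

rows = conn.execute("""
    SELECT a.RoomId, b.Animal
    FROM TableA a
    INNER JOIN TableB b1 ON a.StartAnimal = b1.Animal AND a.GroupType = b1.Type
    INNER JOIN TableB b2 ON a.EndAnimal = b2.Animal AND a.GroupType = b2.Type
    INNER JOIN TableB b ON b.Sequence BETWEEN b1.Sequence AND b2.Sequence
                       AND a.GroupType = b.Type
    ORDER BY a.RoomId, b.Sequence
""").fetchall()

for room_id, animal in rows:
    print(room_id, animal)
```

Turtle (sequence 4) falls between the two ranges of room 1, so it does not appear in the output.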

Query to select a row from a column that matches "x but not y" where y is everything else that's not x

I'm trying to write a query that will select rows that match only what I'm looking for. If the row has other stuff, then I don't want it. The column is a varchar field and the values in the column are a comma delimited string.
So here is the dilemma:
The table has a recipe column and an ingredients column. Like this:
Muffin | "salt"
Cake | "salt,pepper"
Pie | "salt,pepper,butter"
In my query I want to find all of the recipes that contain ANY COMBINATION of salt and/or pepper but nothing else.
If I write the query like this:
select recipe
from mytable
where ingredients like "%pepper%" and/or ingredients like "%salt%"
I want the Muffin and the Cake be returned but not the Pie (because it has additional ingredients that are not specifically listed in the search criteria). How do I write the exclusion?
I'm using SQL Server 2008
You've already received comments encouraging you to consider a different design for your schema, and the rationale for this, so I'll focus only on a suggestion for your current schema here.
You could use REPLACE to strip the desired ingredients and the commas from the ingredients column; if nothing is left, the recipe has no other ingredients. LIKE is used to check that the recipe has at least one of the desired ingredients.
Approach 1
SELECT
recipe,
ingredients
FROM mytable
WHERE (
CONCAT(',',ingredients,',') LIKE '%,salt,%' OR
CONCAT(',',ingredients,',') LIKE '%,pepper,%'
) AND
REPLACE(REPLACE(REPLACE(ingredients,'salt',','),'pepper',','),',','')=''
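A quick way to try Approach 1 without a SQL Server instance is the sketch below, which runs the same idea in SQLite from Python. It is an adaptation under assumptions: SQLite's `||` operator stands in for CONCAT, and the table is populated with the three rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (recipe TEXT, ingredients TEXT);
    INSERT INTO mytable VALUES
        ('Muffin', 'salt'),
        ('Cake',   'salt,pepper'),
        ('Pie',    'salt,pepper,butter');
""")

rows = conn.execute("""
    SELECT recipe
    FROM mytable
    WHERE (
        ',' || ingredients || ',' LIKE '%,salt,%' OR
        ',' || ingredients || ',' LIKE '%,pepper,%'
    ) AND
    -- strip the wanted ingredients, then the commas; anything left is "something else"
    REPLACE(REPLACE(REPLACE(ingredients, 'salt', ','), 'pepper', ','), ',', '') = ''
    ORDER BY recipe
""").fetchall()

matching = [recipe for (recipe,) in rows]
print(matching)
```

For Pie the nested REPLACE leaves 'butter' behind, so the row is filtered out.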
Approach 2
Only the desired ingredients are placed in a subquery and matched using the left join. The HAVING clause is then used to determine whether the list of ingredients contains only these ingredients.
SELECT recipe
FROM (
SELECT
recipe,
ingredients,
desired
FROM
mytable m
LEFT JOIN (
SELECT 'salt' as desired UNION ALL
SELECT 'pepper'
) d ON CONCAT(',',ingredients,',') LIKE CONCAT('%,',d.desired,',%')
) t
GROUP BY
recipe
HAVING
LEN(
MAX(
REPLACE(ingredients,',','')
)
) <= SUM(LEN(desired))
select recipe
from T cross apply string_split(ingredients, ',')
group by recipe
having count(case when value in (<my list>) then 1 end) > 0
and count(case when value not in (<my list>) then 1 end) = 0
You really should have 3 tables for this solution. You are suffering from the XY problem.
Your solution, for example, should look something like this:
product
product_id | name
1 | Muffin
2 | Cake
3 | Pie
ingredient
ingredient_id | name
1 | Salt
2 | Pepper
3 | Butter
ingredient_to_product
product_id | ingredient_id
1 | 1
2 | 1
2 | 2
3 | 1
3 | 2
3 | 3
Then you can simply write your query based on your positive ingredient list and not worry about what they DON'T HAVE. According to your original statement: "in my query I want to find all of the recipes that contain ANY COMBINATION of salt and/or pepper but nothing else." That can be accomplished using the IN operator:
SELECT a.name FROM product a
LEFT JOIN ingredient_to_product b
ON a.product_id = b.product_id
LEFT JOIN ingredient c
ON b.ingredient_id = c.ingredient_id
WHERE c.name IN ('salt','pepper');
And conversely you can exclude with NOT IN --
WHERE c.name IN ('salt','pepper') AND c.name NOT IN ('milk', 'butter');
OR to include salt and pepper -- But exclude every other possibility .. Use a nested SELECT to exclude everything BUT salt, and pepper
WHERE c.name IN ('salt','pepper')
AND c.name NOT IN (
SELECT c.name FROM product a
LEFT JOIN ingredient_to_product b
ON a.product_id = b.product_id
LEFT JOIN ingredient c
ON b.ingredient_id = c.ingredient_id
WHERE c.name NOT IN ('salt','pepper')
GROUP BY c.name
)
GROUP BY a.name;
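On the three-table layout, one way to express "nothing else" is to group per product and check in HAVING that no ingredient falls outside the wanted list. The sketch below is a different technique from the nested NOT IN above (conditional aggregation), run in SQLite from Python as an assumption. Note that SQLite compares strings case-sensitively, so the literals match the capitalised names in the ingredient table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE ingredient (ingredient_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE ingredient_to_product (product_id INTEGER, ingredient_id INTEGER);
    INSERT INTO product VALUES (1, 'Muffin'), (2, 'Cake'), (3, 'Pie');
    INSERT INTO ingredient VALUES (1, 'Salt'), (2, 'Pepper'), (3, 'Butter');
    INSERT INTO ingredient_to_product VALUES
        (1, 1), (2, 1), (2, 2), (3, 1), (3, 2), (3, 3);
""")

rows = conn.execute("""
    SELECT p.name
    FROM product p
    JOIN ingredient_to_product ip ON ip.product_id = p.product_id
    JOIN ingredient i ON i.ingredient_id = ip.ingredient_id
    GROUP BY p.product_id, p.name
    -- count the ingredients outside the wanted list; keep products with none
    HAVING SUM(CASE WHEN i.name NOT IN ('Salt', 'Pepper') THEN 1 ELSE 0 END) = 0
    ORDER BY p.product_id
""").fetchall()

only_salt_pepper = [name for (name,) in rows]
print(only_salt_pepper)
```

The inner joins mean a product with no ingredients at all is not returned, which matches the "salt and/or pepper" requirement.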

Nesting SELECT statements with duplicate entries and COUNT

I'm working with 3 tables: actors, films, and actor_film. Actors and films only have 2 fields: id (primary key) and name. Actor_film also has 2 fields, actor and film, which are both foreign keys representing actor and film ids, respectively. So if a film had 4 actors in it, there'd be 4 actor_film entries with the same film and 4 different actors.
My problem is that, given a certain actor's id, I'd like to return the actor id, actor name, film name, and the total number of actors in that film. However, the only actors that I want to show are ones that contain certain letters in their names.
Let me clear things up with an example. Say Tom Hanks is in only 2 movies, Forrest Gump and Saving Private Ryan, and I'm looking for actors in those 2 movies that have "Gary" or "Matt" in their names. Further suppose that there are 4 actors in Forrest Gump, and 5 in Saving Private Ryan. Then, the only thing I'd want to return would be (without the column names, of course)
actor id | actor name | film name | # actors
abcdefg | Gary Sinise | Forrest Gump | 4
hijklmn | Matt Damon | Saving Private Ryan | 5
opqrstu | Paul Giamatti | Saving Private Ryan | 5
Currently, I'm 75% of the way there by using:
SELECT actor.id, actors.name, films.name,
FROM (
SELECT actor_film.film
FROM actor_film, actors
WHERE actor_film.actor = actors.id
) AS a, actor_film, actors, films
WHERE actor_film.film = a.film
AND actors.id = actor_film.actor
AND films.id = a.film;
This is returning stuff like:
arnie | Arnold Schwarzenegger | Around the World in 80 Days
arnie | Arnold Schwarzenegger | Around the World in 80 Days
for a film that has 2 actors in it. In other words, I can't pull out all the distinct actors in the movie, but get the proper count for it implicitly and not explicitly with COUNT.
Anyway, I think I'm looking for some kind of INNER JOIN or nested SELECT, but I'm new to SQLite3 and don't know how to bring these together. Any solutions would be great, and any explanations on top of that would be amazing as well.
You shouldn't use the old-style comma joins. They were already old in '95, by which time the newer standard with explicit JOIN syntax (which makes left joins much clearer) had been adopted.
I've noticed you also use plurals for your table names (e.g. "actors"). A common convention is to use the singular for the table name (e.g. "actor").
I use both these suggestions below, and I also show you each step. I suggest you run the queries for each step and look at the output to understand how everything works, since you are new to SQL.
OK, let's take your problem step by step. First of all, to see each actor and the films they are in (your first 3 columns), do this:
SELECT a.id as actor_id, a.name as actor_name, f.name as film_name
FROM actor as a
JOIN actor_film af on a.id = af.actor
JOIN film as f on af.film = f.id
Your last column can be found with the following query:
SELECT af.film as film_id, count(*) as c
FROM actor_film as af
GROUP BY af.film
Now we just join them together
SELECT a.id as actor_id, a.name as actor_name, f.name as film_name, fc.c as num_actors
FROM actor as a
JOIN actor_film af on a.id = af.actor
JOIN film as f on af.film = f.id
JOIN (
SELECT af.film as film_id, count(*) as c
FROM actor_film as af
GROUP BY af.film
) as fc on af.film = fc.film_id
If you want, you can add a
WHERE a.name LIKE '%Gary%' OR a.name LIKE '%Matt%'
depending on your platform you might want
WHERE lower(a.name) LIKE '%gary%' OR lower(a.name) LIKE '%matt%'
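Putting the steps together, here is a small sketch you can run with Python's built-in sqlite3 module. The ids and the shortened cast lists are made-up sample data (so the counts are 2 and 3, not the 4 and 5 from the question), the filter uses LIKE because the question asks for names that contain the text, and the extra IN subquery that restricts the result to one actor's films is an assumption about what "given a certain actor's id" means.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE actor (id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE film (id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE actor_film (actor TEXT, film TEXT);
    -- made-up ids and a shortened cast, for illustration only
    INSERT INTO actor VALUES
        ('a1', 'Tom Hanks'), ('a2', 'Gary Sinise'),
        ('a3', 'Matt Damon'), ('a4', 'Paul Giamatti');
    INSERT INTO film VALUES ('f1', 'Forrest Gump'), ('f2', 'Saving Private Ryan');
    INSERT INTO actor_film VALUES
        ('a1', 'f1'), ('a2', 'f1'),
        ('a1', 'f2'), ('a3', 'f2'), ('a4', 'f2');
""")

rows = conn.execute("""
    SELECT a.id AS actor_id, a.name AS actor_name, f.name AS film_name, fc.c AS num_actors
    FROM actor AS a
    JOIN actor_film af ON a.id = af.actor
    JOIN film AS f ON af.film = f.id
    JOIN (
        SELECT film AS film_id, COUNT(*) AS c
        FROM actor_film
        GROUP BY film
    ) AS fc ON af.film = fc.film_id
    WHERE (a.name LIKE '%Gary%' OR a.name LIKE '%Matt%')
      AND af.film IN (SELECT film FROM actor_film WHERE actor = ?)
    ORDER BY f.name, a.name
""", ("a1",)).fetchall()

for row in rows:
    print(row)
```

SQLite's LIKE is case-insensitive for ASCII letters, which is why 'Paul Giamatti' matches '%Matt%' here; on other platforms the lower() form may be needed for the same effect.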

How to use all records from another table as counting columns?

I have 4 tables:
location:
location_id name
------------------------
1 France
device:
device_id location_id model_id
-------------------------------------
1 1 1
2 1 2
3 1 3
model:
model_id family_id name
-------------------------------------
1 1 C-max
2 1 S-max
3 2 Vectra
and family:
family_id name
---------------------
1 Ford
2 Opel
I need to build a complicated SQL query now. As the result, I would like to receive this:
location_id name Ford Opel
------------------------------------------
1 France 2 1
Is it possible to do it in SQL at all? I see three problems there:
About using other table records as columns in the query
About nested tables
About counting the elements (count function?)
Any comments or reference materials will be helpful to me. I am not expecting the final code.
In SQL queries the columns are fixed. You get more or fewer rows depending on the data, not more or fewer columns. But that doesn't matter, because SQL is about getting data, not displaying it. The latter is a task for the GUI layer.
So get the desired data, which is mainly the number of models per location and family.
select l.location_id, l.name as location_name, f.name as family_name, count(*) as models
from location l
join device d on d.location_id = l.location_id
join model m on m.model_id = d.model_id
join family f on f.family_id = m.family_id
group by l.location_id, l.name, f.name
order by l.location_id, l.name, f.name;
This is all you need from the database. How to show the data is a task for your program, a Delphi app in your case. So use Delphi to read the data with the above query and fill your grid in a simple loop.
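To make the "simple loop" concrete, here is a rough sketch in Python standing in for the Delphi code (the language, the in-memory SQLite database, and the `grid` dictionary are all assumptions for illustration). It runs the query above and pivots the rows into one entry per location with one count per family.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE location (location_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE device (device_id INTEGER PRIMARY KEY, location_id INTEGER, model_id INTEGER);
    CREATE TABLE model (model_id INTEGER PRIMARY KEY, family_id INTEGER, name TEXT);
    CREATE TABLE family (family_id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO location VALUES (1, 'France');
    INSERT INTO device VALUES (1, 1, 1), (2, 1, 2), (3, 1, 3);
    INSERT INTO model VALUES (1, 1, 'C-max'), (2, 1, 'S-max'), (3, 2, 'Vectra');
    INSERT INTO family VALUES (1, 'Ford'), (2, 'Opel');
""")

rows = conn.execute("""
    SELECT l.location_id, l.name AS location_name, f.name AS family_name, COUNT(*) AS models
    FROM location l
    JOIN device d ON d.location_id = l.location_id
    JOIN model m ON m.model_id = d.model_id
    JOIN family f ON f.family_id = m.family_id
    GROUP BY l.location_id, l.name, f.name
    ORDER BY l.location_id, l.name, f.name
""").fetchall()

# Pivot in application code: one grid row per location, one column per family.
grid = defaultdict(dict)
for location_id, location_name, family_name, models in rows:
    grid[(location_id, location_name)][family_name] = models

families = sorted({family_name for _, _, family_name, _ in rows})
for (location_id, location_name), counts in grid.items():
    print(location_id, location_name, [counts.get(fam, 0) for fam in families])
```

Because the column list is built from the data at run time, a new family shows up as a new grid column without any change to the SQL.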
Thank you all for your helpful tips.
I solved my problem using the static method and the code published by @Matt. Because somebody else may be looking for the solution, I am pasting my working query for PostgreSQL here:
SELECT DISTINCT t.location_id, t.name, SUM(t.ford) AS ford, SUM(t.opel) as opel
FROM(
SELECT l.location_id, l.name,
(SELECT COUNT(m.family_id) WHERE m.family_id = '1') AS ford,
(SELECT COUNT(m.family_id) WHERE m.family_id = '2') AS opel
FROM location l
INNER JOIN device d ON l.location_id = d.location_id
INNER JOIN model m ON d.model_id = m.model_id
INNER JOIN family f ON m.family_id = f.family_id
GROUP BY l.location_id, l.name, m.family_id
) t
GROUP BY t.location_id, t.name;

Using data from tables in a group by

names:
id, first, last
879 Scotty Anderson
549 Melvin Anderson
554 Freddy Appleton
321 Grace Appleton
112 Milton Appleton
189 Jackson Black
99 Elizabeth Black
298 Jordan Frey
parents:
id, student_id
549 879
321 554
112 554
99 189
298 189
Expected Output
(without the 'Student:' / 'Parent:')
Student: Anderson, Scotty
Parent: Anderson, Melvin
Student: Appleton, Freddy
Parent: Appleton, Grace
Parent: Appleton, Milton
Student: Black, Jackson
Parent: Black, Elizabeth
Parent: Frey, Jordan
Using the data above, how can I achieve the expected output?
I currently use SQL similar to this to get a list of current students and names.
select b.last, b.first
from term a, names b
where a.id = b.id(+)
order by b.last
Which returns:
Anderson, Scotty
Appleton, Freddy
Black, Jackson
My question is how to take the parents table and add to this query so it has this output:
Anderson, Scotty
Anderson, Melvin
Appleton, Freddy
Appleton, Grace
Appleton, Milton
Black, Jackson
Black, Elizabeth
Frey, Jordan
The idea in a query like this is to break the data down into something that helps you solve the problem, and then put it back together as needed. In this case I'm going to make use of common table expressions, which allows me to treat queries as tables and then recombine them handily.
Looking at the desired results it looks like we want to have the students appear first, followed by their mothers (ladies first :-), and then their fathers. So, OK, let's figure out how to extract the needed data. We can get the students and their associated data pretty simply:
select distinct p.student_id as student_id,
n.first,
n.last,
0 as type
from parents p
inner join names n
on n.id = p.student_id
The type column, with its constant value of zero, is just used to identify that this is a student. (You'll see why in a minute).
Now, getting the mothers is a bit more difficult because we don't have any gender information to use. However, we'll use what we have, which is names. We know that names like Melvin, Milton, and Jordan are "guy" names. (Yes, I know Jordan can be a girl's name too. My daughter has a male coach named Jordan, and a female teammate named Jordan. Just go with it - for purposes of argument in this case Jordan is a guy's name, 'K? 'K :-). So we'll use that information to help us identify the moms:
select p.student_id, n.first, n.last, 1 as type
from parents p
inner join names n
on n.id = p.id
where first not in ('Melvin', 'Milton', 'Jordan')
Notice here that we assign the value of 1 to the type column for mothers.
Similarly, we'll find the dads:
select p.student_id, n.first, n.last, 2 as type
from parents p
inner join names n
on n.id = p.id
where first in ('Melvin', 'Milton', 'Jordan')
And here we assign a value of 2 for the type.
OK - given the above we just need to combine the data properly. We don't want to use a JOIN, however, because we want the names to get spit out one after the other from the query - and the way we do THAT in SQL is with the UNION or UNION ALL operator. (Generally, you're going to want to use UNION ALL, because UNION will check the result set to ensure there are no duplicates - which in the case of a large result set takes, oh, more or less FOREVER!). And so, the final query looks like:
with all_students as (select distinct p.student_id as student_id,
n.first,
n.last,
0 as type
from parents p
inner join names n
on n.id = p.student_id),
all_mothers as (select p.student_id, n.first, n.last, 1 as type
from parents p
inner join names n
on n.id = p.id
where first not in ('Melvin', 'Milton', 'Jordan')),
all_fathers as (select p.student_id, n.first, n.last, 2 as type
from parents p
inner join names n
on n.id = p.id
where first in ('Melvin', 'Milton', 'Jordan'))
select last || ', ' || first as name from
(select * from all_students
union all
select * from all_mothers
union all
select * from all_fathers)
order by student_id desc, type;
We just take the student data, followed by the mom data, followed by the dad data, then sort it by the student ID from highest to lowest (I just looked at the desired results to figure out that this should be a descending sort), and then by the type (which results in the student (type=0) being first, followed by their mother (type=1) and then their father (type=2)).
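If you want to check the ordering yourself, the sketch below runs the same three-CTE UNION ALL against the sample data using SQLite from Python. SQLite and the double-quoted "first"/"last" column names are assumptions here (the quoting is only a precaution, since FIRST and LAST are keywords in some engines); the logic is otherwise the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE names (id INTEGER PRIMARY KEY, "first" TEXT, "last" TEXT);
    CREATE TABLE parents (id INTEGER, student_id INTEGER);
    INSERT INTO names VALUES
        (879, 'Scotty', 'Anderson'), (549, 'Melvin', 'Anderson'),
        (554, 'Freddy', 'Appleton'), (321, 'Grace', 'Appleton'),
        (112, 'Milton', 'Appleton'), (189, 'Jackson', 'Black'),
        (99, 'Elizabeth', 'Black'), (298, 'Jordan', 'Frey');
    INSERT INTO parents VALUES
        (549, 879), (321, 554), (112, 554), (99, 189), (298, 189);
""")

query = """
    WITH all_students AS (
        SELECT DISTINCT p.student_id AS student_id, n."first", n."last", 0 AS type
        FROM parents p
        JOIN names n ON n.id = p.student_id),
    all_mothers AS (
        SELECT p.student_id, n."first", n."last", 1 AS type
        FROM parents p
        JOIN names n ON n.id = p.id
        WHERE n."first" NOT IN ('Melvin', 'Milton', 'Jordan')),
    all_fathers AS (
        SELECT p.student_id, n."first", n."last", 2 AS type
        FROM parents p
        JOIN names n ON n.id = p.id
        WHERE n."first" IN ('Melvin', 'Milton', 'Jordan'))
    SELECT "last" || ', ' || "first" AS name
    FROM (SELECT * FROM all_students
          UNION ALL
          SELECT * FROM all_mothers
          UNION ALL
          SELECT * FROM all_fathers)
    -- student_id keeps each family together; type puts the student first
    ORDER BY student_id DESC, type
"""

names_in_order = [name for (name,) in conn.execute(query)]
for name in names_in_order:
    print(name)
```

The descending student_id sort only reproduces the expected order because of how the sample ids happen to be assigned; real data would probably need an explicit sort key such as the student's last name.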
Share and enjoy.
generic SQL, mmmmm I'd like there to be A generic SQL :)
First off, you want to stop using the antique (+) join syntax that is exclusive to Oracle:
select b.last, b.first
from term a
LEFT OUTER JOIN names b ON a.id = b.id
order by b.last
That is way more generic! (nb: You can abbreviate to just LEFT JOIN)
Now to concatenate (Last Name comma space First Name) there are options, some of them not generic:
SQL Server/MySQL and others supporting CONCAT()
select CONCAT(b.last , ', ', b.first)
from term a
LEFT OUTER JOIN names b ON a.id = b.id
order by b.last
not all versions of Oracle or SQL Server support CONCAT()
Oracle's concat() only takes 2 parameters; grrrrr
ORACLE
select b.last || ', ' || b.first
from term a
LEFT OUTER JOIN names b ON a.id = b.id
order by b.last
In this form Oracle generally handles data type conversions automatically (I think, please check on date/timestamps maybe others)
TSQL (Sybase, MS SQL Server)
select b.last + ', ' + b.first
from term a
LEFT OUTER JOIN names b ON a.id = b.id
order by b.last
In this form you must explicitly cast/convert data types to n|var|char for concatenation if not already a string type
For your list of concatenated names:
You need in addition to the last name a method to retain the family group together, plus distinguish between student and parent. As you want just one column of names this indicates you need a column of id's that point to the last and first names. So making some assumptions about the table TERM my guess is you list the students from that, then append the parents that relate to that group of students, and finally to output the required list in the required order.
select
case when type = 1 then 'Student' else 'Parent' end as who
, names.last || ', ' || names.first as Name
from (
select
STUDENT_ID as name_id
, STUDENT_ID as family_id
, 1 as TYPE
from term
union all
select
PARENTS.ID as name_id
, PARENTS.STUDENT_ID as family_id
, 2 as TYPE
from PARENTS
inner join term on PARENTS.STUDENT_ID = term.STUDENT_ID
) sq
inner join NAMES ON sq.name_id = NAMES.ID
order by
names.last
, sq.family_id
, sq.type
see: http://sqlfiddle.com/#!4/01804/6
This is too long for a comment.
Your question doesn't make sense. The easy answer to the question is:
select last, first
from names;
But it seems unlikely that is what you want.
Your sample query mentions a table term. That is not mentioned elsewhere in the question. Please clarify the question or delete this one and ask another.
I think I see what you're trying to do. I think you could set up a derived table and then query it. Set up something like: CASE WHEN student_id = id THEN 1 ELSE 0 END AS match, or whatever. Then query your derived table and group by match.
I would do it like that in SQL:
Select last +', '+ first as fullname from names;