Given two related tables, how to determine the most common relationships? - sql

given a 3 tables: users, books, book_users, how would I determine what are the commons books?
users: id, first_name, last_name
books: id, name
books_users: book_id, user_id
Designer Output, something like:
book | count
radBookName | 22
SemiRad | 22
Thanks

You seems want simple JOIN with GROUP BY clause :
SELECT b.name, count(*) as user_count
FROM books b INNER JOIN
books_users bu
ON bu.book_id = b.id
GROUP BY b.name;
This would produce duplicate count if one book has same user, if you want unique count then use count(distinct bu.user_id) instead.

Related

How to make sure result pairs are unique - without using distinct?

I have three tables I want to iterate over. The tables are pretty big so I will show a small snippet of the tables. First table is Students:
id
name
address
1
John Smith
New York
2
Rebeka Jens
Miami
3
Amira Sarty
Boston
Second one is TakingCourse. This is the course the students are taking, so student_id is the id of the one in Students.
id
student_id
course_id
20
1
26
19
2
27
18
3
28
Last table is Courses. The id is the same as the course_id in the previous table. These are the courses the students are following and looks like this:
id
type
26
History
27
Maths
28
Science
I want to return a table with the location (address) and the type of courses that are taken there. So the results table should look like this:
address
type
The pairs should be unique, and that is what's going wrong. I tried this:
select S.address, C.type
from Students S, Courses C, TakingCourse TC
where TC.course_id = C.id
and S.id = TC.student_id
And this does work, but the pairs are not all unique. I tried select distinct and it's still the same.
Multiple students can (and will) reside at the same address. So don't expect unique results from this query.
Only an overview is needed, so that's why I don''t want duplicates
So fold duplicates. Simple way with DISTINCT:
SELECT DISTINCT s.address, c.type
FROM students s
JOIN takingcourse t ON s.id = t.student_id
JOIN courses c ON t.course_id = c.id;
Or to avoid DISTINCT (why would you for this task?) and, optionally, get counts, too:
SELECT c.type, s.address, count(*) AS ct
FROM students s
JOIN takingcourse t ON s.id = t.student_id
JOIN courses c ON t.course_id = c.id
GROUP BY c.type, s.address
ORDER BY c.type, s.address;
A missing UNIQUE constraint on takingcourse(student_id, course_id) could be an additional source of duplicates. See:
How to implement a many-to-many relationship in PostgreSQL?

Nesting SELECT statements with duplicate entries and COUNT

I'm working with 3 tables: actors, films, and actor_film. Actors and films only have 2 fields: id (primary key) and name. Actor_film also has 2 fields, actor and film, which are both foreign keys representing actor and film ids, respectively. So if a film had 4 actors in it, there'd be 4 actor_film entries with the same film and 4 different actors.
My problem is that, given a certain actor's id, I'd like to return the actor id, actor name, film name, and the total number of actors in that film. However, the only actors that I want to show are ones that contain certain letters in their names.
Let me clear things up with an example. Say Tom Hanks is in only 2 movies, Forrest Gump and Saving Private Ryan, and I'm looking for actors in those 2 movies that have "Gary" or "Matt" in their names. Further suppose that there are 4 actors in Forrest Gump, and 5 in Saving Private Ryan. Then, the only thing I'd want to return would be (without the column names, of course)
actor id | actor name | film name | # actors
abcdefg | Gary Sinise | Forrest Gump | 4
hijklmn | Matt Damon | Saving Private Ryan | 5
opqrstu | Paul Giamatti | Saving Private Ryan | 5
Currently, I'm 75% of the way there by using:
SELECT actor.id, actors.name, films.name,
FROM (
SELECT actor_film.film
FROM actor_film, actors
WHERE actor_film.actor = actors.id
) AS a, actor_film, actors, films
WHERE actor_film.film = a.film
AND actors.id = actor_film.actor
AND films.id = a.film;
This is returning stuff like:
arnie | Arnold Schwarzenegger | Around the World in 80 Days
arnie | Arnold Schwarzenegger | Around the World in 80 Days
for a film that has 2 actors in it. In other words, I can't pull out all the distinct actors in the movie, but get the proper count for it implicitly and not explicitly with COUNT.
Anyway, I think I'm looking for some kind of INNER JOIN or nested SELECT, but I'm new to SQLite3 and don't know how to bring these together. Any solutions would be great, and any explanations on top of that would be amazing as well.
You shouldn't use the old style joins. They were old in '95 when the newer standard that let you do left joins clearer was made a standard.
I've noticed you also use plurals for your table names (eg "actors") The standard style is to use the singular for the table table name (eg "actor")
I use both these suggestions below, I also show you each step. I suggest you run the queries for each step and look at the output to understand how everything works since you are new to SQL.
Ok, lets take you problem step by step. First of all to see each actor and the films they are in (your first 3 columns) do this:
SELECT a.id as actor_id, a.name as actor_name, f.name as film_name
FROM actor as a
JOIN actor_film af on a.id = af.actor
JOIN film as f on af.film = f.id
Your last column can be found with the following query:
SELECT af.film as film_id, count(*) as c
FROM actor_film as af
GROUP BY af.film
Now we just join them together
SELECT a.id as actor_id, a.name as actor_name, f.name as film_name, fc.c as num_actors
FROM actor as a
JOIN actor_film af on a.id = af.actor
JOIN film as f on af.film = f.id
JOIN (
SELECT af.film as film_id, count(*) as c
FROM actor_file as af
GROUP BY af.film
) as fc on af.film = fc.film_id
If you want you can add a
WHERE a.name = 'Gary' OR a.name = 'Matt'
depending on your platform you might want
WHERE lower(a.name) = 'gary' OR lower(a.name) = 'matt'

Text search of a many-to-many data relation

I know this must have been answered before here, but I simply can't find a matching question.
Using a LIKE '%keyword%', I want to search a many-to-many data relationship in a MSSQL database and reduce it to a one-to-one result set. The two tables are joined through a linking table. Here's a very simplified version of what I'm talking about:
Books:
book_ id title
1 Treasure Island
2 Poe Collected Stories
3 Invest in Treasure Islands
Categories:
category_id name
1 Children
2 Adventure
3 Horror
4 Classic
5 Money
BookCategory:
book_id category_id
1 1
1 2
1 4
2 3
2 4
3 5
What I want to do is search for a phrase in the title (e.g. '%treasure island%') and get matching Books records that contain the search string and the single highest matching Categories record that goes with each book -- I want to discard the lesser category records. In other words, I'm looking for this:
book_id title category_id name
1 Treasure Island 4 Classic
3 Invest in Treasure Islands 5 Money
Any suggestions?
Try this. Filter your lookup table, then join:
With maxCategories AS
(select book_id, max(category_id) as category_id from BookCategory group by book_id)
select Books.book_id, Books.Title, Categories.category_id, Categories.name
from Books
inner join maxCategories on (Books.book_id = maxCategories.book_id)
inner join Categories on (Categories.category_id = maxCategories.category_id)
where Books.title like '%treasure island%'
Try:
select * from
(select b.*,
c.*,
row_number() over (partition by bc.book_id
order by bc.category_id desc) rn
from Books b
join BookCategory bc on b.book_id = bc.book_id
join Categories c on bc.category_id = c.category_id
where b.name like '%treasure island%') sq
where rn=1

SQL Query - Distinct results

I have the following two tables:
People [*ID*, Name]
Pet [*PetID*, OwnerID, Species, Name]
(OwnerID is a foreign key of ID)
I would like the database to list each person and how many different species they own. For example, if Bob (ID 1473) owned a dog, cat and another dog the output should be:
ID | No. of Species
----------------------
1473 | 2
I realize that this would require correlated sub-queries or outer joins, but I'm not exactly sure how to do that. Any help would be appreciated.
You could use count(distinct ...) for that:
select People.ID
, count(distinct Species)
from People
join Pet
on Pet.OwnerID = People.ID
group by
People.ID
select people.name, count(distinct pet.species)
from people, pet
where people.id = pet.ownerid
group by people.name
try this
Select ID,[No. of Species] from People
inner join
( select Count(Species) as [No. of Species],OwnerID from Pet
group by OwnerID) d
on Id = d.OwnerID

find missing values in MySQL

I am trying to write a query to find out what users are not enrolled on some courses.
I have 2 tables; courses, and users. Both contain the fields 'id' and then 'coursename' and username' respectively.
Using these two table the existing system adds users to a third table (called enrolled_users). Users take up one row for each. For example if user id '1' is on 4 courses the enrolled_users looks like this:
-----------------------
| user_id | course_id |
-----------------------
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 1 | 4 |
-----------------------
As you can see this table data is not built on a relationship, it is generated in PHP using the two tables 'courses' and 'users'.
I am not sure how to write a query on how to find out that courses user '1' is not enrolled in, if there are say 10 courses, but we can see user '1' is only on courses 1-4.
Am I suppose to use something like a LIKE or NOT LIKE?
SELECT u.id, c.id, ec.user_id, ec.course_id
FROM users u, courses c, enrolled_courses ec
WHERE u.id = ec.user_id
AND c.id = ec.course_id
AND u.id = '1'
I have tried using != and <> but this doesn't show what i need; a list of courses the user is not enrolled on.
Sorry if I am being vague, trying to get my head round it!
Using MySQL 5.0, on an existing system I cannot modify, only query from (for the moment).
SELECT
*
FROM
courses c
WHERE
c.id NOT IN (
SELECT
ec.course_id
FROM
enrolled_courses ec
WHERE
ec.user_id = 1 AND ec.course_id IS NOT NULL
)
If you are looking for a specific single user ID, be sure to include that ID condition as part of the WHERE clause, otherwise, leave it blank, and it will give you ALL people who are not registered for ALL possible classes via a Cartesian product (since no direct join between user and courses table) .. this could be VERY large / time consuming if trying to run for everyone and your course table is 100s of courses where someone would only be involved in 2-3-4 courses.
select
u.id,
c.id CourseID
from
users u,
courses c
where
u.id = 1
AND c.id NOT IN
( select ec.Course_ID
from enrolled_courses ec
where ec.user_id = u.id )
Just use a not in and a sub select.
select * from courses c where c.id not in
(select course_id from enrolled_courses where user_id='1')