Triple joins with SQL? - sql

My database represents a library. Each book is tagged with multiple things, so that one title might be tagged 'science fiction', 'short stories', and 'Russian'.
There are three tables: books, tags, and books_tag_link. They look like this:
Books
ID | TITLE
-----------------------------
1 | Rendezvous With Rama
2 | Howl and Other Poems
3 | A Short History of Nearly Everything
Tags
ID | TAGNAME
-----------------------------
1 | science fiction
2 | fiction
3 | poetry
Books_Tag_Link
BOOK | TAG
-----------------------------------
1 | 1
1 | 2
2 | 3
Hopefully you can see how that would work. The books_tag_link table has two foreign keys, and links books to tags; each book has many tags, each tag is associated with many books. I don't know if this is the best way to do it but it's what the OSS library program Calibre does, and that's what I'm kind of using as a reference as I study.
Now what I want to do is say "select all fiction books". But I can't quite work out the proper way to express that thought in SQL. Select books.title where books.id = tags.id = books_tag_link.tag... or something. I'm not sure.
Can someone help me out with a tip or explanation of what I should be doing?
I'm using SQLite at the moment but MySQL-specific advice would be fine too.

SELECT b.title
FROM Books AS b
JOIN Books_Tag_Link AS bt ON b.id = bt.book
JOIN Tags AS t ON t.id = bt.tag
WHERE t.tagname = 'fiction'
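To try the query above against the sample data, here is a minimal sketch using Python's built-in sqlite3 module (the question mentions SQLite). The helper name titles_tagged and the in-memory database are only for illustration; the table and column names come from the question.

```python
import sqlite3

# In-memory database populated with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, tagname TEXT);
    CREATE TABLE books_tag_link (
        book INTEGER REFERENCES books(id),
        tag  INTEGER REFERENCES tags(id)
    );
    INSERT INTO books VALUES
        (1, 'Rendezvous With Rama'),
        (2, 'Howl and Other Poems'),
        (3, 'A Short History of Nearly Everything');
    INSERT INTO tags VALUES (1, 'science fiction'), (2, 'fiction'), (3, 'poetry');
    INSERT INTO books_tag_link VALUES (1, 1), (1, 2), (2, 3);
""")


def titles_tagged(conn, tagname):
    """Return the titles of all books carrying the given tag (hypothetical helper)."""
    rows = conn.execute(
        """
        SELECT b.title
        FROM books AS b
        JOIN books_tag_link AS bt ON b.id = bt.book
        JOIN tags AS t ON t.id = bt.tag
        WHERE t.tagname = ?
        ORDER BY b.title
        """,
        (tagname,),
    ).fetchall()
    return [title for (title,) in rows]


fiction_titles = titles_tagged(conn, "fiction")
print(fiction_titles)
```

With the sample rows, only book 1 is linked to tag 2, so the 'fiction' lookup should return a single title.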

Something like that?
select b.title
from Books b join Books_Tag_Link btl on btl.BOOK=b.ID
join Tags t on t.ID=btl.TAG
where t.TAGNAME='fiction';
Caveat: if the tables are large, make sure the columns used in the JOIN conditions are indexed (primary keys or explicit indexes).
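As a rough sketch of that caveat in SQLite, the link table can get a composite primary key plus a second index on the tag column. The index name idx_link_tag and the UNIQUE constraint on tagname are assumptions added for illustration, and the plan that EXPLAIN QUERY PLAN prints depends on the SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, tagname TEXT UNIQUE);
    CREATE TABLE books_tag_link (
        book INTEGER NOT NULL REFERENCES books(id),
        tag  INTEGER NOT NULL REFERENCES tags(id),
        PRIMARY KEY (book, tag)              -- indexes lookups by book
    );
    -- Hypothetical extra index so lookups by tag do not scan the link table.
    CREATE INDEX idx_link_tag ON books_tag_link (tag);
    INSERT INTO books VALUES (1, 'Rendezvous With Rama'), (2, 'Howl and Other Poems');
    INSERT INTO tags VALUES (1, 'science fiction'), (2, 'fiction'), (3, 'poetry');
    INSERT INTO books_tag_link VALUES (1, 1), (1, 2), (2, 3);
""")

QUERY = """
    SELECT b.title
    FROM books b
    JOIN books_tag_link btl ON btl.book = b.id
    JOIN tags t ON t.id = btl.tag
    WHERE t.tagname = ?
"""

# List the indexes that exist on the link table (column 1 is the index name).
index_names = [row[1] for row in conn.execute("PRAGMA index_list(books_tag_link)")]
print(index_names)

# Print the planner's description of how it would run the query.
for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY, ("fiction",)):
    print(row[-1])

result = conn.execute(QUERY, ("fiction",)).fetchall()
```

The query result is the same with or without the indexes; only the access path changes.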

You need to use two joins:
select
books.title
from
books
inner join
books_tag_link on
books_tag_link.book = books.id
inner join
tags on
tags.id = books_tag_link.tag
where
tags.tagname = 'fiction'

Related

Stuck on beginner SQL practice. Multiple table where columns use same id

I'm very sorry to bother with minor problem, but I tried to search old answers for this one and since my skills in SQL are complete 0, I didn't even understand the answers :/! Neither is my English terminology great enough for properly searching.
I have these 2 tables: Cities and Flights.
Cities
+----+-------------+
|id | name |
+----+-------------+
|1 | Oslo |
|2 | New York |
|3 | Hong Kong |
+----+-------------+
Flights
+----+--------------------+-------------------+
|id | wherefrom_id | whereto_id |
+----+--------------------+-------------------+
|1 | 3 | 2 |
|2 | 3 | 1 |
|3 | 1 | 3 |
+----+--------------------+-------------------+
Now I have to write a query that matches city IDs to wherefrom_id and whereto_id, so that the answer is a table listing each flight's origin and destination (FROM/TO).
Example:
ANSWER:
+-----------+----------------+
|HONG KONG | NEW YORK |
+-----------+----------------+
|HONG KONG | OSLO |
+-----------+----------------+
|OSLO | HONG KONG |
+-----------+----------------+
This is what I wrote:
SELECT C.name, C.name
FROM Cities C, Flights F
WHERE C.id = F.wherefrom_id AND C.id = F.whereto_id;
For some reason this doesn't seem to work and I get nothing showing in my practice program. There is no error or anything; it just doesn't show anything in the test answer. I really hope you get what I mean. English is not my first language and I truly tried my best to make it as clear as possible :S
First things first - it's a lot easier to code in standard SQL join syntax. Converting your above to that is
SELECT C.name, C.name
FROM Cities C
INNER JOIN Flights F ON C.id = F.wherefrom_id AND C.id = F.whereto_id;
The question you've been asked requires logic people don't usually use at first so it can be confusing the first time you encounter it.
I will run through the logic jump in a moment.
Imagine your Flights table has the City names in it (not IDs).
It would have columns, say, FlightID, From_City_Name, To_City_Name.
An example row would be 1, 'Oslo', 'Prague'.
Getting the data for this would be easy, e.g., SELECT FlightID, From_City_Name, To_City_Name FROM Flights.
However, this has many problems. As your question has done, you decide to pull out the cities into their own reference tables.
For this first example, however, you decided to have two extra tables as reference tables: From_City and To_City. These would both have an ID and city name. You then change your Flights to refer to these.
Your code would look like
SELECT F.ID, FC.Name AS From_City, TC.Name AS To_City
FROM Flights AS F
INNER JOIN From_City AS FC ON F.From_City_ID = FC.ID
INNER JOIN To_City AS TC ON F.To_City_ID = TC.ID
Notice how there are two joins there - one to From_City and one to To_City? That is because the From and To cities are referring to different things in the data.
So, then the final part of the issue: why have two city tables (from and to). Why not have one? Well, you can. If you create just one table, and modify the above, you get something like this:
SELECT F.ID, FC.Name AS From_City, TC.Name AS To_City
FROM Flights AS F
INNER JOIN City AS FC ON F.From_City_ID = FC.ID
INNER JOIN City AS TC ON F.To_City_ID = TC.ID
Note that all that has changed is that the From_City and To_City references have been pointed to a different table City. However, the rest is the same.
And that, actually, would be your answer. The complex part that most people don't get to straight away, is having two joins to the same table.
As an aside, your original code is technically valid.
SELECT C.name, C.name
FROM Cities C
INNER JOIN Flights F ON C.id = F.wherefrom_id AND C.id = F.whereto_id;
However, what it's effectively saying is to get the city names where the From_City is the same as the To_City - which is obviously not what you want (unless you're looking for turnbacks).
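To see the difference on the question's own data, here is a small sketch using Python's sqlite3 module. It uses the Cities and Flights tables exactly as posted (not the hypothetical City table from the walkthrough); the variable names are just for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Cities (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Flights (
        id INTEGER PRIMARY KEY,
        wherefrom_id INTEGER REFERENCES Cities(id),
        whereto_id   INTEGER REFERENCES Cities(id)
    );
    INSERT INTO Cities VALUES (1, 'Oslo'), (2, 'New York'), (3, 'Hong Kong');
    INSERT INTO Flights VALUES (1, 3, 2), (2, 3, 1), (3, 1, 3);
""")

# One alias: the same city row must be both origin and destination.
single_alias = conn.execute("""
    SELECT C.name, C.name
    FROM Cities C
    INNER JOIN Flights F ON C.id = F.wherefrom_id AND C.id = F.whereto_id
""").fetchall()

# Two aliases: one copy of Cities for the origin, one for the destination.
two_aliases = conn.execute("""
    SELECT FC.name, TC.name
    FROM Flights F
    INNER JOIN Cities FC ON F.wherefrom_id = FC.id
    INNER JOIN Cities TC ON F.whereto_id = TC.id
    ORDER BY F.id
""").fetchall()

print(single_alias)
print(two_aliases)
```

No sample flight departs from and arrives at the same city, which is why the single-alias version comes back empty.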
What you're doing is an old SQL way of expressing joins. The standard now has better ways to declare the relationships within the FROM clause, and I take it that your material has postponed that slightly.
There are people who will yell at you for using this ancient syntax, but the answer is easy enough:
SELECT C1.name, C2.name
FROM Cities C1, Cities C2, Flights F
WHERE C1.id = F.wherefrom_id AND C2.id = F.whereto_id
You can think of this as creating a "cross product" of all city-pair combinations and keeping the ones that match actual flights. The key is to reference Cities twice by using different aliases (or correlation names).
I think this is what you are looking for..
SELECT wf.name "wherefrom", wt.name "whereto"
FROM Flights f
JOIN Cities wf
ON f.wherefrom_id = wf.id
JOIN Cities wt
ON f.whereto_id = wt.id
order by f.id

SQL multiple table nested group by using Ruby on Rails Active Record

Models:
A survey has survey_answers which each have an answer and a question
A question belongs to a category.
Example:
Survey_answer has survey_id and answer_id and question_id
question has category_id
I am using Ruby on Rails with Active Record
Using SQL, and starting with a survey, how can I group as following:
{ category => questions => answers }
Example output:
{[
category_one_record,
[question_1,[answer_1, answer_2]],
[question_2,[answer_3, answer_4]],
category_two_record,
[question_3,[answer_5, answer_6]],
[question_4,[answer_7, answer_8]],
category_three_record,
...
]}
I understand how to do a group but I don't understand how to do a nested grouping with SQL
I believe that you're thinking about this in a non-SQL manner.
SQL is tabular, and when returning this kind of information it is normal to accept the repetition rather than look to generate a nested data structure (such as can be done with JSON, XML, etc).
You don't appear to want any actual aggregation (what GROUP BY is for). Instead you're referring to one way of keeping related records next to each other.
That would simply be joining the data and then ordering it.
Your original post isn't clear on the table structures, so this is a very rough example of what I mean...
SELECT
survey.name AS survey_name,
category.name AS category_name,
question.text AS question_text,
answer.text AS answer_text
FROM
survey
INNER JOIN
question
ON question.survey_id = survey.id
INNER JOIN
category
ON category.id = question.category_id
INNER JOIN
answer
ON answer.question_id = question.id
ORDER BY
survey.name,
category.name,
question.text,
answer.text
Such a query might then return a record set that looks like the following...
survey_name | category_name | question_text | answer_text
-------------+---------------+---------------+-------------
survey_1 | category_1 | question_1 | answer_1
survey_1 | category_1 | question_1 | answer_2
survey_1 | category_1 | question_2 | answer_3
survey_1 | category_1 | question_2 | answer_4
survey_1 | category_2 | question_3 | answer_5
Tabular. Not nested.
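If the nested shape is still needed, it can be built in application code from the ordered rows. The following is a minimal Python sketch of that idea (the question uses Rails, but the same approach applies in any client language); the function name nest and the hard-coded rows, which mirror the sample result set above, are assumptions for illustration.

```python
from itertools import groupby
from operator import itemgetter

# Rows as they might come back from the ordered query:
# (survey_name, category_name, question_text, answer_text)
rows = [
    ("survey_1", "category_1", "question_1", "answer_1"),
    ("survey_1", "category_1", "question_1", "answer_2"),
    ("survey_1", "category_1", "question_2", "answer_3"),
    ("survey_1", "category_1", "question_2", "answer_4"),
    ("survey_1", "category_2", "question_3", "answer_5"),
]


def nest(rows):
    """Fold ordered rows into {category: {question: [answers]}}.

    Relies on the rows already being sorted by category, then question,
    which is what the ORDER BY clause provides.
    """
    nested = {}
    for category, category_rows in groupby(rows, key=itemgetter(1)):
        questions = {}
        for question, question_rows in groupby(category_rows, key=itemgetter(2)):
            questions[question] = [row[3] for row in question_rows]
        nested[category] = questions
    return nested


nested = nest(rows)
print(nested)
```

The database does the joining and sorting; the client only folds adjacent rows together, so no GROUP BY is involved.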
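If NATURAL JOIN is available, it pairs tables up on every column name they share, so you can skip the join conditions as long as the shared names are exactly the join keys (here that assumes question has a question_id column and category has a category_id column), like this: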
select * from Survey_answer
natural join question
natural join category
where category_id = 1
If you don't have this option, you need to specify the join columns and the join type (inner, left, or right, for example). Assuming question has an id column:
select * from Survey_answer
inner join question on Survey_answer.question_id = question.id
where question.category_id = 1
You may learn more about this here: https://www.w3schools.com/sql/sql_join.asp

Aliases for 2 joins on one table in Microsoft Access

I have a table that shows relationships between items and another table with the items themselves:
articles_to_articles
-------------------------
|articleID_1|articleID_2|
-------------------------
|12345 |67890 |
|23442 |343243 |
-------------------------
articles
-----------------------------------------------------
|article_id | article_name|lots | of | other | stuff|
-----------------------------------------------------
I am attempting to generate a file that consists of the relationships from articles_to_articles but with the names in addition to the ids.
What I have so far is:
SELECT
a2a.articleID_1,
key_articles.article_name,
a2a.articleID_2,
val_articles.article_name
FROM
articles_to_articles a2a
INNER JOIN
articles key_articles
ON key_articles.articleID = articles_to_articles.articleID_1
INNER JOIN
articles val_articles
ON val_articles.articleID = articles_to_articles.articleID_2;
Access gives me a "missing operator" error but I can't seem to find the missing operator. What basic thing am I missing?
When joining more than two tables in MS Access, you must enclose each join within separate groups of parentheses, for example:
SELECT
a2a.articleID_1,
key_articles.article_name,
a2a.articleID_2,
val_articles.article_name
FROM
(
articles_to_articles a2a
INNER JOIN
articles key_articles
ON
key_articles.articleID = a2a.articleID_1
)
INNER JOIN
articles val_articles
ON
val_articles.articleID = a2a.articleID_2

SQL command that spans across tables

I have these tables (authors, quotes, tags, quotes_tags) and I'd like to make a single query for a list of random quotes with their author and (many) tags.
Here's what I have right now, but the tags are being squashed and only one is returned. How would I go about writing a query that returns this set of quotes with the tags returned as JSON or whatever convention is appropriate?
SELECT *
FROM quotes
JOIN authors ON quotes.author_id = authors.id
JOIN tags ON tags.id = quotes.id
ORDER BY RANDOM()
LIMIT 50
I am getting the following:
author | quote | tags
---------------------
john | lorem | hey
brian | lorem | test
but I'd like the following:
author | quote | tags
-------------------------------
john | lorem | hey, another (or whatever convention for a list -- json?)
brian | lorem | test, one, two
In PostgreSQL 8.4 and later, you can use array_agg:
SELECT a.author, q.quote, array_agg(t.tags)
FROM quotes q
INNER JOIN authors a
ON q.author_id = a.id
INNER JOIN tags t
ON t.id = q.id
GROUP BY a.author, q.quote
ORDER BY RANDOM()
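The question also lists a quotes_tags table, which the query above does not use. The sketch below shows the same aggregation idea routed through that link table, run in SQLite via Python's sqlite3 module with group_concat standing in for array_agg. All column names here (quote_id, tag_id, tag, author, quote) are assumptions, since the question does not show the schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema; the question does not give the column names.
    CREATE TABLE authors (id INTEGER PRIMARY KEY, author TEXT);
    CREATE TABLE quotes (id INTEGER PRIMARY KEY, quote TEXT, author_id INTEGER);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, tag TEXT);
    CREATE TABLE quotes_tags (quote_id INTEGER, tag_id INTEGER);
    INSERT INTO authors VALUES (1, 'john'), (2, 'brian');
    INSERT INTO quotes VALUES (1, 'lorem', 1), (2, 'ipsum', 2);
    INSERT INTO tags VALUES (1, 'hey'), (2, 'another'), (3, 'test'), (4, 'one'), (5, 'two');
    INSERT INTO quotes_tags VALUES (1, 1), (1, 2), (2, 3), (2, 4), (2, 5);
""")

rows = conn.execute("""
    SELECT a.author, q.quote, group_concat(t.tag, ', ') AS tags
    FROM quotes q
    INNER JOIN authors a ON q.author_id = a.id
    INNER JOIN quotes_tags qt ON qt.quote_id = q.id
    INNER JOIN tags t ON t.id = qt.tag_id
    GROUP BY q.id, a.author, q.quote
    ORDER BY RANDOM()
    LIMIT 50
""").fetchall()

# group_concat does not guarantee element order, so sort after splitting.
tags_by_quote = {
    (author, quote): sorted(tags.split(", ")) for author, quote, tags in rows
}
print(tags_by_quote)
```

Because of the inner joins, a quote without any tags would be dropped from this result; a LEFT JOIN on quotes_tags and tags would keep it.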

mysql where IN on large dataset or Looping?

I have the following scenario:
Table 1:
articles
id article_text category author_id
1 "hello world" 4 1
2 "hi" 5 2
3 "wasup" 4 3
Table 2
authors
id name friends_with
1 "Joe" "Bob"
2 "Sue" "Joe"
3 "Fred" "Bob"
I want to know the total number of authors that are friends with "Bob" for a given category.
So for example, for category 4 how many authors are there that are friends with "Bob".
The authors table is quite large, in some cases I have a million authors that are friends with "Bob"
So I have tried:
Get list of authors that are friends with bob, and then loop through them and get the count for each of them of that given category and sum all those together in my code.
The issue with this approach is it can generate a million queries, even though they are very fast, it seems there should be a better way.
I was thinking of trying to get a list of authors that are friends with Bob and then building an IN clause with that list, but I fear that would blow out the amount of memory allowed in the query set.
Seems like this is a common problem. Any ideas?
thanks
SELECT COUNT(DISTINCT auth.id)
FROM authors auth
INNER JOIN articles art ON auth.id = art.author_id
WHERE friends_with = 'bob' AND art.category = 4
COUNT(DISTINCT auth.id) is required because the join can produce multiple rows for each author (one per matching article).
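A quick way to see why DISTINCT matters is to compare both counts on a small data set. This sketch uses Python's sqlite3 module with the question's sample rows plus one extra article by Joe in category 4, added here purely for illustration. Note that SQLite compares strings case-sensitively by default, so the literal is written as 'Bob'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT, friends_with TEXT);
    CREATE TABLE articles (
        id INTEGER PRIMARY KEY, article_text TEXT, category INTEGER, author_id INTEGER
    );
    INSERT INTO authors VALUES (1, 'Joe', 'Bob'), (2, 'Sue', 'Joe'), (3, 'Fred', 'Bob');
    INSERT INTO articles VALUES
        (1, 'hello world', 4, 1),
        (2, 'hi', 5, 2),
        (3, 'wasup', 4, 3),
        (4, 'hello again', 4, 1);  -- extra row, not in the question
""")

BASE = """
    FROM authors auth
    INNER JOIN articles art ON auth.id = art.author_id
    WHERE auth.friends_with = 'Bob' AND art.category = 4
"""

# Counts joined rows: Joe appears once per article he wrote.
row_count = conn.execute("SELECT COUNT(*)" + BASE).fetchone()[0]
# Counts each author once, however many articles matched.
author_count = conn.execute("SELECT COUNT(DISTINCT auth.id)" + BASE).fetchone()[0]

print(row_count, author_count)
```

Joe's two category-4 articles produce two joined rows, so the plain row count overstates the number of authors.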
But if you have any control over the database, I would use a link table for friends_with. Your current design either has to store a comma-separated list of names, which will be disastrous for performance and require a completely different query, or it limits each author to one friend.
Friends
id friend_id
then the query would look like this
SELECT COUNT(DISTINCT auth.id)
FROM authors auth
INNER JOIN articles art ON auth.id = art.author_id
INNER JOIN friends f ON auth.id = f.id
INNER JOIN authors fauth ON fauth.id = f.friend_id
WHERE fauth.name = 'bob' AND art.category = 4
It's more complex but allows many friends. Just remember that this construct calls for 2 rows in friends for each pair, one from Joe to Bob and one from Bob to Joe.
You could build it differently but that would make the query even more complex.
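Here is a rough sketch of the link-table design in SQLite via Python's sqlite3 module. It assumes Bob is himself a row in authors (he is not in the question's sample data), stores two friends rows per friendship, and uses a hypothetical helper named count_friends_in_category.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE articles (
        id INTEGER PRIMARY KEY, article_text TEXT, category INTEGER, author_id INTEGER
    );
    -- Link table: one row per direction of each friendship.
    CREATE TABLE friends (
        id        INTEGER REFERENCES authors(id),
        friend_id INTEGER REFERENCES authors(id)
    );
    INSERT INTO authors VALUES (1, 'Joe'), (2, 'Sue'), (3, 'Fred'), (4, 'Bob');
    INSERT INTO articles VALUES (1, 'hello world', 4, 1), (2, 'hi', 5, 2), (3, 'wasup', 4, 3);
    INSERT INTO friends VALUES
        (1, 4), (4, 1),   -- Joe <-> Bob
        (3, 4), (4, 3),   -- Fred <-> Bob
        (2, 1), (1, 2);   -- Sue <-> Joe
""")


def count_friends_in_category(conn, friend_name, category):
    """Count distinct authors with an article in `category` who are friends with `friend_name`."""
    return conn.execute(
        """
        SELECT COUNT(DISTINCT auth.id)
        FROM authors auth
        INNER JOIN articles art ON auth.id = art.author_id
        INNER JOIN friends f ON auth.id = f.id
        INNER JOIN authors fauth ON fauth.id = f.friend_id
        WHERE fauth.name = ? AND art.category = ?
        """,
        (friend_name, category),
    ).fetchone()[0]


bob_friends_cat4 = count_friends_in_category(conn, "Bob", 4)
print(bob_friends_cat4)
```

Everything is resolved in a single query, so there is no per-author loop and no large IN list to build on the client.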
Maybe something like
select fr.name,
fr.id,
au.name,
ar.article_text,
ar.category,
ar.author_id
from authors fr, authors au, articles ar
where fr.id = ar.author_id
and au.friends_with = fr.name
and ar.category = 4 ;
Just the count...
select count(distinct fr.name)
from authors fr, authors au, articles ar
where fr.id = ar.author_id
and au.friends_with = fr.name
and ar.category = 4 ;
A version without using joins (hopefully will work!)
SELECT count(distinct id) from authors where friends_with = 'Bob' and id in(select author_id from articles where category = 4)
I found statements with 'IN' easier to understand when I started out with SQL.