How to design the database schema to link two tables via lots of other tables - sql

Although I'm using Rails, this question is more about database design. I have several entities in my database, with the schema a bit like this: http://fishwebby.posterous.com/40423840
If I want to get a list of people and order it by surname, that's no problem. However, if I want to get a list of people, ordered by surname, enrolled in a particular group, I have to use an SQL statement that includes several joins across four tables, something like this:
SELECT group_enrolment.*, person.*
FROM person INNER JOIN member ON person.id = member.person_id
INNER JOIN enrolment ON member.id = enrolment.member_id
INNER JOIN group_enrolment ON enrolment.id = group_enrolment.enrolment_id
WHERE group_enrolment.id = 123
ORDER BY person.surname;
Although this works, it strikes me as a bit inefficient, and potentially as my schema grows, these queries could get more and more complicated.
Another option could be to join the person table to all the other tables in the query by including person_id in the other tables, then it would just be one single join, for example
SELECT group_enrolment.*, person.*
FROM person INNER JOIN group_enrolment ON group_enrolment.person_id
WHERE group_enrolment.id = 123
ORDER BY person.surname;
But this would mean that in my schema, the person table is joined to a lot of other tables. Aside from a complicated schema diagram, does anyone see any disadvantages to this?
I'd be very grateful for any comments on this - whether what I'm doing now (the many table join) or the second solution or another one that hasn't occurred to me is the best way to go.
Many thanks in advance

Well, joins are what databases do. Having said that, you may consider propagating natural keys in your model, which would then allow you to skip over some tables in joins. Take a look at this example.
EDIT
I'm not saying that this will match your model (problem), but just for fun try similar queries on something like this:

Related

Coding Inner Join subquery as field in query

After looking at example after example of both inner joins and subqueries as fields, I'm apparently not getting some aspect, and I would appreciate help please. I am trying to write one query that must, alas, run in MS Access 2007 to talk to an Oracle database. I have to get values from several different places for various bits of data. One of those bits of data is GROUP_CODE (e.g., faculty, staff, student, alum, etc.). Getting that is non-trivial. I am trying to use two inner joins to get the specific value. The value of borrower category must be the value for my main row in the outer query. Here is what this looks like:
Patron table Patron_Barcode table Patron_Group table
Patron_id Barcode Patron_Group_iD
Barcode Patron_Group_id PATRON_Group_Code
I want to get the PATRON_GROUP.PATRON_GROUP_CODE. This is only one of 35 fields I need to get in my query. (Yes, that's terrible, but wearing my librarian hat, i can't write the Java program I'd like to write to do this in a snap.)
So as a test, I wrote this query:
select PATRON.PATRON_ID As thePatron,
(SELECT PATRON_GROUP.PATRON_GROUP_CODE As borrowwerCategory
FROM (PATRON_GROUP
INNER JOIN PATRON_BARCODE ON PATRON_GROUP.PATRON_GROUP_ID = PATRON_BARCODE.PATRON_GROUP_ID
) INNER JOIN PATRON ON PATRON_BARCODE.PATRON_ID = thePatron.PATRON_ID
));
I don't know what I'm doing wrong, but this doesn't work. I've written a fair amount of SQL in my time, but never anything quite like this. What am I doing wrong?
PATRON.BARCODE is the foreign key for the BARCODE table.
PATRON_BARCODE.PATRON_GROUP_ID is the foreign key for the PATRON_GROUP table. PATRON_GROUP_CODE in PATRON_GROUP is he column value that I need.
PATRON.BARCODE -> BARCODE.PATRON_GROUP_ID -> PATRON_GROUP.PATRON_GROUP_CODR>
The main table, PATRON, will have lots of other things, like inner and outer join to PATRON_ADDRESS, etc., and I can't just do an inner join directly to what I want in my main query. This has to happen in a subquery as a field. Thanks.
Ken

SQL Multiple Joins - How do they work exactly?

I'm pretty sure this works universally across various SQL implementations. Suppose I have many-to-many relationship between 2 tables:
Customer: id, name
has many:
Order: id, description, total_price
and this relationship is in a junction table:
Customer_Order: order_date, customer_id, order_id
Now I want to write SQL query to join all of these together, mentioning the customer's name, the order's description and total price and the order date:
SELECT name, description, total_price FROM Customer
JOIN Customer_Order ON Customer_Order.customer_id = Customer.id
JOIN Order = Order.id = Customer_Order.order_id
This is all well and good. This query will also work if we change the order so it's FROM Customer_Order JOIN Customer or put the Order table first. Why is this the case? Somewhere I've read that JOIN works like an arithmetic operator (+, * etc.) taking 2 operands and you can chain operator together so you can have: 2+3+5, for example. Following this logic, first we have to calculate 2+3 and then take that result and add 5 to it. Is it the same with JOINs?
Is it that behind the hood, the first JOIN must first be completed in order for the second JOIN to take place? So basically, the first JOIN will create a table out of the 2 operands left and right of it. Then, the second JOIN will take that resulting table as its left operand and perform the usual joining. Basically, I want to understand how multiple JOINs work behind the hood.
In many ways I think ORMs are the bane of modern programming. Unleashing a barrage of underprepared coders. Oh well diatribe out of the way, You're asking a question about set theory. THere are potentially other options that center on relational algebra but SQL is fundamentally set theory based. here are a couple of links to get you started
Using set theory to understand SQL
A visual explanation of SQL

When to use JOINs

It seems to me that there are two scenarios in which to use JOINs:
When data would otherwise be duplicated
When data from one query would otherwise be used in another query
Are these scenarios right? Are there any other scenarios in which to use JOIN?
EDIT: I think I've miscommunicated. I understand how a JOIN works, what I'm not so sure about is when to use one.
JOINS are used to JOIN tables together with related information.
Tipical situations are where you have lets say
A user table where the user has specific security settings. The join would be used such that you can determine which settings the user has.
Users
-UserID
-UserName
UserSecurityRoles
-UserID
-SecurityRoleID
SecurityRoles
-SecurityRoleID
-SecurityRole
SELECT *
FROM Users u INNER JOIN
UserSecurityRoles usr ON u.UserID = usr.UserID INNER JOIN
SecurityRoles sr ON usr.SecurityRoleID = sr.SecurityRoleID
WHERE sr.SecurityRole = 'Admin'
LEFT joins will be used in the cases where you wish to retrieve all the data from the table in the left hand side, and only data from the right that match.
JOINS are used when you start normalizing your table structure. You can crete a table with 100s on columns, where a lot of the data could possibly be NULL, or you can normalize the tables, such that you avoid having too many columns with null values, where you group the appropriate data into table structures.
The answer to this Question has a VERY good link that has graphical display of using JOINs
JOINS are used to return data that is related in a relational database. Data can be related in 3 ways
One to Many relationship (A Person can have many Transactions)
Many to Many relationship (A Doctor can have many Patients, but a Patient can have more than one Doctor)
One to One relationship (One Person can have exactly one Passport number)
JOINS come in various flavours:
AN INNER JOIN will return data from both tables where the keys in each table match
A LEFT JOIN or RIGHT JOIN will return all the rows from one table and matching data from the other table
A CROSS JOIN will return the product of each table
You use joins when you need information from more than one table :)

How do I represent my joins in Relational Algebra?

I am new to relational algebra and for my assignment I have to create two. I have written out the problem I have faced in SQL but I am unsure of how to represent such joins in relational algebra. Any help/pointers would be greatly appreciated.
SELECT ps.FirstName AS StudentFirstName, ps.LastName AS StudentLastName, pst.FirstName AS StaffFirstName , pst.LastName AS StaffLastName, pg.FirstName AS GuardianFirstName, pg.LastName AS GuadianLastName, i.DateTimeReported, i.NatureOfIllness
FROM Incident i
JOIN Student s USING (StudentID)
JOIN Person ps ON (s.StudentID = ps.PersonID)
JOIN Staff st USING (StaffID)
JOIN Person pst ON (st.StaffID = pst.PersonID)
JOIN Guardian g USING (GuardianID)
JOIN Person pg ON (g.GuardianID = pg.PersonID)
WHERE i.DecisionMade IS NULL;
Those left joins you're doing are referred to in relational algebra as theta-joins, sometimes more specifically specifically as equijoins. You'll want to use the symbol that looks like a bow tie and write "StudentID = PersonID" underneath it (for the second join in your example). I can't do the fancy symbols, but http://en.wikipedia.org/wiki/Relational_algebra#.CE.B8-join_and_equijoin has some examples.
Also, there's nothing wrong with 6-way joins and they'll indeed happen in the real world.
I think you are going about the problem the wrong way. In the real world you'd never want to create a situation where you had a 6 way join.
What it seems like you have here are incidents and people. The people have roles. There should be maybe three tables, incidents, roles, and people. The way you're joining against Person twice, is going to be a mess.
I think you should sit down and read about database normalization.
http://en.wikipedia.org/wiki/Database_normalization

Weird many to many and one to many relationship

I know I'm gonna get down votes, but I have to make sure if this is logical or not.
I have three tables A, B, C. B is a table used to make a many-many relationship between A and C. But the thing is that A and C are also related directly in a 1-many relationship
A customer added the following requirement:
Obtain the information from the Table B inner joining with A and C, and in the same query relate A and C in a one-many relationship
Something like:
alt text http://img247.imageshack.us/img247/7371/74492374sa4.png
I tried doing the query but always got 0 rows back. The customer insists that I can accomplish the requirement, but I doubt it. Any comments?
PS. I didn't have a more descriptive title, any ideas?
UPDATE:
Thanks to rcar, In some cases this can be logical, in order to have a history of all the classes a student has taken (supposing the student can only take one class at a time)
UPDATE:
There is a table for Contacts, a table with the Information of each Contact, and the Relationship table. To get the information of a Contact I have to make a 1:1 relationship with Information, and each contact can have like and an address book with; this is why the many-many relationship is implemented.
The full idea is to obtain the contact's name and his address book.
Now that I got the customer's idea... I'm having trouble with the query, basically I am trying to use the query that jdecuyper wrote, but as he warns, I get no data back
This is a doable scenario. You can join a table twice in a query, usually assigning it a different alias to keep things straight.
For example:
SELECT s.name AS "student name", c1.className AS "student class", c2.className as "class list"
FROM s
JOIN many_to_many mtm ON s.id_student = mtm.id_student
JOIN c c1 ON s.id_class = c1.id_class
JOIN c c2 ON mtm.id_class = c2.id_class
This will give you a list of all students' names and "hardcoded" classes with all their classes from the many_to_many table.
That said, this schema doesn't make logical sense. From what I can gather, you want students to be able to have multiple classes, so the many_to_many table should be where you'd want to find the classes associated with a student. If the id_class entries used in table s are distinct from those in many_to_many (e.g., if s.id_class refers to, say, homeroom class assignments that only appear in that table while many_to_many.id_class refers to classes for credit and excludes homeroom classes), you're going to be better off splitting c into two tables instead.
If that's not the case, I have a hard time understanding why you'd want one class hardwired to the s table.
EDIT: Just saw your comment that this was a made-up schema to give an example. In other cases, this could be a sensible way to do things. For example, if you wanted to keep track of company locations, you might have a Company table, a Locations table, and a Countries table. The Company table might have a 1-many link to Countries where you would keep track of a company's headquarters country, but a many-to-many link through Locations where you keep track of every place the company has a store.
If you can give real information as to what the schema really represents for your client, it might be easier for us to figure out whether it's logical in this case or not.
Perhaps it's a lack of caffeine, but I can't conceive of a legitimate reason for wanting to do this. In the example you gave, you've got students, classes and a table which relates the two. If you think about what you want the query to do, in plain English, surely it has to be driven by either the student table or the class table. i.e.
select all the classes which are attended by student 1245235
select all the students which attend class 101
Can you explain the requirement better? If not, tell your customer to suck it up. Having a relationship between Students and Classes directly (A and C), seems like pure madness, you've already got table B which does that...
Bear in mind that the one-to-many relationship can be represented through the many-to-many, most simply by adding a field there to indicate the type of relationship. Then you could have one "current" record and any number of "history" ones.
Was the customer "requirement" phrased as given, by the way? I think I'd be looking to redefine my relationship with them if so: they should be telling me "what" they want (ideally what, in business domain language, their problem is) and leaving the "how" to me. If they know exactly how the thing should be implemented, then I'd be inclined to open the source code in an editor and leave them to it!
I'm supposing that s.id_class indicates the student's current class, as opposed to classes she has taken in the past.
The solution shown by rcar works, but it repeats the c1.className on every row.
Here's an alternative that doesn't repeat information and it uses one fewer join. You can use an expression to compare s.id_class to the current c.id_class matched via the mtm table.
SELECT s.name, c.className, (s.id_class = c.id_class) AS is_current
FROM s JOIN many_to_many AS mtm ON (s.id_student = mtm.id_student)
JOIN c ON (c.id_class = mtm.id_class);
So is_current will be 1 (true) on one row, and 0 (false) on all the other rows. Or you can output something more informative using a CASE construct:
SELECT s.name, c.className,
CASE WHEN s.id_class = c.id_class THEN 'current' ELSE 'past' END AS is_current
FROM s JOIN many_to_many AS mtm ON (s.id_student = mtm.id_student)
JOIN c ON (c.id_class = mtm.id_class);
It doesn't seem to make sense. A query like:
SELECT * FROM relAC RAC
INNER JOIN tableA A ON A.id_class = RAC.id_class
INNER JOIN tableC C ON C.id_class = RAC.id_class
WHERE A.id_class = B.id_class
could generate a set of data but inconsistent. Or maybe we are missing some important part of the information about the content and the relationships of those 3 tables.
I personally never heard a requirement from a customer that would sound like:
Obtain the information from the Table
B inner joining with A and C, and in
the same query relate A and C in a
one-many relationship
It looks like that it is what you translated the requirement to.
Could you specify the requirement in plain English, as what results your customer wants to get?