I am taking a beginner's course in SQL, and have been playing around with some queries. One thing I don't really understand is how to "properly" query multiple tables, that is, how to compare values from two or more tables.
For instance,
I have a table called Student, holding the username, name, date of birth, and major of a particular person. The major is stored as a code only; for instance, CS stands for "Computer Science". I chose to make the username the primary key.
I also have another table called Major holding the major code (such as CS) as a primary key, and the entire major name. For instance, "CS" = "Computer Science", NS = "Neuroscience", etc.
Now, suppose I want to find the name of a major, given a student's username. Following is the imagined pseudocode for this query:
1) In the Student table: Provided the username, check what the major of that particular person is.
select majorcode from Student where username='aUserName';
Doing so correctly gives me the major code.
2) In the Major table: Find the title of the major provided the code.
select majorTitle from Major where majorcode='theMajorCode';
Combined, I write:
select majorTitle from Major where majorcode=(select majorcode from Student where username='aUserName');
However, now suppose I want BOTH the title of the major (from the Major table) as well as the name of the student (from the Student table).
Any advice on how to do this?
You'll need a join. Something like this - note that any rows in Student that have a majorcode not in Major, or vice-versa, will not be included. If that's not what you want, look into outer joins.
SELECT majorTitle, username
FROM Student s
JOIN Major m ON s.majorcode = m.majorcode
You can of course add a WHERE clause to that query. Reference tables using the aliases ("s" for Student, "m" for Major) to avoid ambiguity.
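As a rough illustration of the join plus a WHERE clause, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows (Ada, Ben) and the helper name major_for are made up for the example; the table and column names follow the question.

```python
import sqlite3

# Hypothetical in-memory copy of the Student and Major tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Major (majorcode TEXT PRIMARY KEY, majorTitle TEXT);
    CREATE TABLE Student (username TEXT PRIMARY KEY, name TEXT, majorcode TEXT);
    INSERT INTO Major VALUES ('CS', 'Computer Science'), ('NS', 'Neuroscience');
    INSERT INTO Student VALUES ('aUserName', 'Ada', 'CS'), ('bUserName', 'Ben', 'NS');
""")

def major_for(username):
    # Inner join: a row is returned only if the student's majorcode exists in Major.
    return conn.execute(
        """
        SELECT s.name, m.majorTitle
        FROM Student s
        JOIN Major m ON s.majorcode = m.majorcode
        WHERE s.username = ?
        """,
        (username,),
    ).fetchall()

print(major_for("aUserName"))
```

The username is passed as a bound parameter rather than pasted into the SQL string, which is the usual way to avoid quoting problems.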
Related
Scenario
I have a few tables; each table represents an entity of a unique type. For example, let's go with:
School, Subject, Class, Teacher. Listed in order as Parent -> Child
Schema
Each table has:
ID: UUID
Name: CHAR VARYING
{parent}_id: UUID <-- for example, Class would have Subject_id, and Teacher would have Class_id.
The {parent}_id is the foreign key for each table.
Problem
I want to make a query that lists all the teachers of a given school. In order to do this in this Schema, I need to first query Subject by School_id, then Class by subject_id and then finally teacher by class_id.
A recursive function makes sense to me, but all the tutorials I find do this within a single table and with ids that don't change with each recursion. In my example, I would need to search for a different id on each recursion.
Question
How do you go about doing this? I could make an array of the ids and make an index, increase index and use that to access the id in the array. This however seems like a common query so I believe there might be a more elegant solution.
Note: I am using PostgreSQL
Edit for Comment
I am using PostgreSQL DB and PGAdmin
Why would UUID not work? It has worked up to this point with no problems; even works with cascading delete using foreign keys.
I can show the actual schema. However, here is a fictitious layout. Quite straightforward, I hope.
School
ID
Name
Subject
ID
Name
School_ID
Class
ID
Name
Subject_ID
Teacher
ID
Name
Class_ID
Expected output
Teacher_ID, Teacher_Name, Class_Name, Subject_Name, School_Name
Something like?:
select
teacher.id as Teacher_ID, teacher.name as Teacher_Name, class.name as Class_Name, subject.name as Subject_Name, school.name as School_Name
from
school
join
subject
on
school.id = subject.school_id
join
class
on
class.subject_id = subject.id
join
teacher
on
teacher.class_id = class.id
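No recursion is needed for a fixed chain of tables: one join per level does it. Below is a minimal sketch of that chained join, restricted to one school. It runs on SQLite through Python's sqlite3 module instead of PostgreSQL, uses integer ids in place of UUIDs, and the sample rows and the helper name teachers_of_school are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE school  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE subject (id INTEGER PRIMARY KEY, name TEXT, school_id INTEGER REFERENCES school(id));
    CREATE TABLE class   (id INTEGER PRIMARY KEY, name TEXT, subject_id INTEGER REFERENCES subject(id));
    CREATE TABLE teacher (id INTEGER PRIMARY KEY, name TEXT, class_id INTEGER REFERENCES class(id));
    INSERT INTO school  VALUES (1, 'North High'), (2, 'South High');
    INSERT INTO subject VALUES (10, 'Math', 1), (20, 'Art', 2);
    INSERT INTO class   VALUES (100, 'Algebra I', 10), (200, 'Painting', 20);
    INSERT INTO teacher VALUES (1000, 'Ms. Lee', 100), (2000, 'Mr. Roy', 200);
""")

def teachers_of_school(school_id):
    # Walk school -> subject -> class -> teacher, one join per level.
    return conn.execute(
        """
        SELECT teacher.id, teacher.name, class.name, subject.name, school.name
        FROM school
        JOIN subject ON subject.school_id = school.id
        JOIN class   ON class.subject_id = subject.id
        JOIN teacher ON teacher.class_id = class.id
        WHERE school.id = ?
        ORDER BY teacher.id
        """,
        (school_id,),
    ).fetchall()

print(teachers_of_school(1))
```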
When learning about joins, our instructor says not to skip tables.
For example, let's do a query that selects the Last_Name, First_Name, and Numeric_Grade.
I would write
Select Last_Name, First_Name, Numeric_Grade
From Student
Join Grade
Using(Student_id)
He says to write
Select Last_Name, First_Name, Numeric_Grade
From Student
Join Enrollment
Using(Student_id)
Join Grade
Using(Student_id)
I'm confused because as long as I can link them through similar fields, I don't see the point of going through Enrollment.
He has not given me a reason for going through enrollment, other than its what the Diagram shows. Follow the diagram.
Do I have to go through Enrollment? Is it the safe way to do it, or does it not matter because Grade and Student have a Student_id primary key?
Quoting Alice Rischert in Oracle SQL By Example, lab 7.2:
The second choice is to join the STUDENT_ID from the GRADE table directly to the STUDENT_ID of the STUDENT table, thus skipping the ENROLLMENT table entirely. - - This shortcut is perfectly acceptable, even if it does not follow the primary key/foreign key relationship path. In this case, you can be sure not to build a Cartesian product because you can guarantee only one STUDENT_ID in the STUDENT table for every STUDENT_ID in the GRADE table. In addition, it also eliminates a join; thus, the query executes a little faster and requires fewer resources. The effect is probably fairly negligible with this small result set.
The only reason to go through the Enrollment table would be if you need information (fields) from that table. If both the Enrollment and Grade table have a Student_id field then you wouldn't need to go through Enrollment to get there.
In your example it looks like you are looking for First and Last Name, which should both come from the Student table, and Numeric_Grade, which should come from the Grade table. In this instance, there would be no need for the Enrollment table. If there were a WHERE clause that required something from the Enrollment table then yes, you would need to include it, but for your example I would say it is not needed.
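To make the comparison concrete, here is a small sketch using Python's sqlite3 module. The schema is a guess: it assumes Enrollment and Grade both carry a Section_id, and the rows are invented. It compares the direct join with two ways of going through Enrollment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (Student_id INTEGER PRIMARY KEY, First_Name TEXT, Last_Name TEXT);
    CREATE TABLE Enrollment (Student_id INTEGER, Section_id INTEGER,
                             PRIMARY KEY (Student_id, Section_id));
    CREATE TABLE Grade (Student_id INTEGER, Section_id INTEGER, Numeric_Grade INTEGER);
    INSERT INTO Student VALUES (1, 'Ann', 'Abel');
    INSERT INTO Enrollment VALUES (1, 101), (1, 102);
    INSERT INTO Grade VALUES (1, 101, 90), (1, 102, 80);
""")

# Shortcut: Student joined straight to Grade.
direct = conn.execute("""
    SELECT Last_Name, First_Name, Numeric_Grade
    FROM Student JOIN Grade USING (Student_id)
    ORDER BY Numeric_Grade
""").fetchall()

# Through Enrollment, but matching on Student_id only:
# every enrollment row pairs with every grade row of that student.
via_student_id_only = conn.execute("""
    SELECT Last_Name, First_Name, Numeric_Grade
    FROM Student
    JOIN Enrollment USING (Student_id)
    JOIN Grade USING (Student_id)
    ORDER BY Numeric_Grade
""").fetchall()

# Through Enrollment, matching Grade on the full (Student_id, Section_id) key.
via_full_key = conn.execute("""
    SELECT Last_Name, First_Name, Numeric_Grade
    FROM Student
    JOIN Enrollment USING (Student_id)
    JOIN Grade USING (Student_id, Section_id)
    ORDER BY Numeric_Grade
""").fetchall()

print(len(direct), len(via_student_id_only), len(via_full_key))
```

With this sample data the shortcut and the full-key path return the same rows, while joining through Enrollment on Student_id alone repeats each grade once per enrollment. So if you do go through Enrollment, it is worth checking which columns the diagram actually links on.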
If this is a question on a test or assignment and the teacher is requesting you go through the Enrollment table too I would do it just to appease him, but knowing that you don't actually need to do it to get the information that you require.
It depends on your tables. Sometimes you can skip a table, but sometimes you can't.
For example, imagine that Enrollment has a column like student_quit_course.
Then you may want only the grades of students who actually finished the course, and you need all three tables.
In this particular case you will have a GRADE row for several section_id values, but to know what each section is you need [Section] and [Course], both joined through [Enrollment].
Say I have a student table with the following fields: student id, student name, age, gender, marks, class. Assume that due to some error, there are multiple entries for each student. I need to identify the duplicate rows in the table, where the filter criteria are the student name and the class. In the query result, in addition to identifying the duplicate records, I also need to find the original student record that was duplicated. Is there any way to do this? I went through this answer: SQL: How to find duplicates based on two fields?. But it only shows how to find the duplicate rows, not how to identify the actual row that was duplicated. Kindly shed some light on a possible solution. Thanks.
First of all: if the columns you've listed are all in the same table, it looks like your database structure could use some normalization.
In terms of your question: I'm assuming your StudentID field is a database generated, primary key and so has not been duplicated. (If this is not the case, I think you have bigger problems than just duplicates).
I'm also assuming the duplicate row has a higher value for StudentID than the original row.
I think the following should work (Note: I haven't created a table to verify this so it might not be perfect straight away. If it doesn't it should be fairly close)
select dup.StudentID as DuplicateStudentID,
dup.StudentName, dup.Age, dup.Gender, dup.Marks, dup.Class,
orig.StudentID as OriginalStudentId
from StudentTable dup
inner join (
-- Find first student record for each unique combination
select Min(StudentId) as StudentID, StudentName, Age, Gender, Marks, Class
from StudentTable t
group by StudentName, Age, Gender, Marks, Class
) orig on dup.StudentName = orig.StudentName
and dup.Age = orig.Age
and dup.Gender = orig.Gender
and dup.Marks = orig.Marks
and dup.Class = orig.Class
and dup.StudentID > orig.StudentID -- Don't identify the original record as a duplicate
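Since the question names only student name and class as the duplicate criteria, here is a sketch of the same idea grouped on just those two columns, run through Python's sqlite3 module. The sample rows are invented, and it keeps the assumption above that the lowest StudentID in each group is the original.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE StudentTable (
        StudentID INTEGER PRIMARY KEY, StudentName TEXT, Age INTEGER,
        Gender TEXT, Marks INTEGER, Class TEXT
    );
    INSERT INTO StudentTable VALUES
        (1, 'Asha', 14, 'F', 88, '9A'),
        (2, 'Ravi', 15, 'M', 75, '9B'),
        (3, 'Asha', 14, 'F', 88, '9A'),
        (4, 'Asha', 14, 'F', 88, '9A');
""")

# Each result row is (duplicate StudentID, original StudentID).
duplicates = conn.execute("""
    SELECT dup.StudentID, orig.StudentID
    FROM StudentTable dup
    JOIN (
        -- Lowest id per (name, class) is treated as the original record.
        SELECT MIN(StudentID) AS StudentID, StudentName, Class
        FROM StudentTable
        GROUP BY StudentName, Class
    ) orig
      ON dup.StudentName = orig.StudentName
     AND dup.Class = orig.Class
     AND dup.StudentID > orig.StudentID
    ORDER BY dup.StudentID
""").fetchall()

print(duplicates)
```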
I have a table which stores user name, hobby and city. The hobby field contains different hobbies joined with "," (e.g. swimming, basket, cricket). I want to find the user names that match at least one hobby in my search criteria.
You should not have multiple attributes in one column. That's one of the number one rules of 3NF database design. Now you have to figure out ways to parse this data. This issue only gets worse each and every day. Separate the hobbies into multiple rows in your database.
I agree with @JonH that there shouldn't be more than one piece of information in a column. It stops the row being truly atomic.
But you are where you are, and you can use the LIKE clause to return rows that match a substring within a column.
Something like:
select hobbycolumn from hobbytable where hobbycolumn like '%swimming%'
for example
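One caveat with a bare '%...%' pattern is that it also matches inside longer words, so 'basket' would match 'basketball'. A possible workaround, sketched below with Python's sqlite3 module, is to wrap both the column and the search term in commas. This assumes the list is stored without spaces after the commas; the username column, the sample rows, and the helper name users_with_any are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hobbytable (username TEXT, hobbycolumn TEXT, city TEXT);
    INSERT INTO hobbytable VALUES
        ('amy', 'swimming,basket,cricket', 'Pune'),
        ('bob', 'chess,basketball', 'Delhi'),
        ('cat', 'reading', 'Goa');
""")

def users_with_any(hobbies):
    """Return usernames whose comma-separated list contains at least one of the hobbies."""
    if not hobbies:
        return []
    # ',' || hobbycolumn || ',' turns 'a,b' into ',a,b,' so ',a,' only matches a whole item.
    clause = " OR ".join("',' || hobbycolumn || ',' LIKE ?" for _ in hobbies)
    params = [f"%,{h},%" for h in hobbies]
    rows = conn.execute(
        f"SELECT username FROM hobbytable WHERE {clause} ORDER BY username", params
    ).fetchall()
    return [r[0] for r in rows]

print(users_with_any(["basket"]))
print(users_with_any(["chess", "reading"]))
```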
To do this properly you need to restructure your tables if possible. For what you are looking for, one possible way is to have three tables. I'm not sure who the city belongs to, so I put it with the user.
One for users with the following columns:
id
name
city
A table for hobbies:
id
name
And a user_hobbies join table that allows each user to have multiple hobbies, and each hobby to have multiple users:
id
user_id (foreign key)
hobby_id (foreign key)
Then searching for a user with a certain hobby is:
SELECT user.id, user.name FROM user
INNER JOIN user_hobbies ON user_hobbies.user_id = user.id
INNER JOIN hobbies ON hobbies.id = user_hobbies.hobby_id
WHERE hobbies.name LIKE 'query';
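Here is a rough, runnable version of that three-table layout using Python's sqlite3 module. The sample rows and the helper name users_with_any are invented. It uses IN with DISTINCT so that a user who matches several of the searched hobbies is listed once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    CREATE TABLE hobbies (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user_hobbies (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES user(id),
        hobby_id INTEGER REFERENCES hobbies(id)
    );
    INSERT INTO user VALUES (1, 'amy', 'Pune'), (2, 'bob', 'Delhi'), (3, 'cat', 'Goa');
    INSERT INTO hobbies VALUES (1, 'swimming'), (2, 'cricket'), (3, 'chess');
    INSERT INTO user_hobbies VALUES (1, 1, 1), (2, 1, 2), (3, 2, 3), (4, 3, 2);
""")

def users_with_any(hobby_names):
    """Return names of users who have at least one of the given hobbies."""
    if not hobby_names:
        return []
    placeholders = ", ".join("?" for _ in hobby_names)
    rows = conn.execute(
        f"""
        SELECT DISTINCT user.name
        FROM user
        INNER JOIN user_hobbies ON user_hobbies.user_id = user.id
        INNER JOIN hobbies ON hobbies.id = user_hobbies.hobby_id
        WHERE hobbies.name IN ({placeholders})
        ORDER BY user.name
        """,
        list(hobby_names),
    ).fetchall()
    return [r[0] for r in rows]

print(users_with_any(["swimming", "cricket"]))
```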
I know I'm gonna get down votes, but I have to make sure if this is logical or not.
I have three tables A, B, C. B is a table used to make a many-many relationship between A and C. But the thing is that A and C are also related directly in a 1-many relationship
A customer added the following requirement:
Obtain the information from the Table B inner joining with A and C, and in the same query relate A and C in a one-many relationship
Something like:
[image: http://img247.imageshack.us/img247/7371/74492374sa4.png]
I tried doing the query but always got 0 rows back. The customer insists that I can accomplish the requirement, but I doubt it. Any comments?
PS. I didn't have a more descriptive title, any ideas?
UPDATE:
Thanks to rcar: in some cases this can be logical, for example to keep a history of all the classes a student has taken (supposing the student can only take one class at a time).
UPDATE:
There is a table for Contacts, a table with the Information of each Contact, and the Relationship table. To get the information of a Contact I have to make a 1:1 relationship with Information, and each contact can also have an address book; this is why the many-many relationship is implemented.
The full idea is to obtain the contact's name and his address book.
Now that I understand the customer's idea, I'm having trouble with the query. Basically I am trying to use the query that jdecuyper wrote, but as he warns, I get no data back.
This is a doable scenario. You can join a table twice in a query, usually assigning it a different alias to keep things straight.
For example:
SELECT s.name AS "student name", c1.className AS "student class", c2.className as "class list"
FROM s
JOIN many_to_many mtm ON s.id_student = mtm.id_student
JOIN c c1 ON s.id_class = c1.id_class
JOIN c c2 ON mtm.id_class = c2.id_class
This will give you a list of all students' names and "hardcoded" classes with all their classes from the many_to_many table.
That said, this schema doesn't make logical sense. From what I can gather, you want students to be able to have multiple classes, so the many_to_many table should be where you'd want to find the classes associated with a student. If the id_class entries used in table s are distinct from those in many_to_many (e.g., if s.id_class refers to, say, homeroom class assignments that only appear in that table while many_to_many.id_class refers to classes for credit and excludes homeroom classes), you're going to be better off splitting c into two tables instead.
If that's not the case, I have a hard time understanding why you'd want one class hardwired to the s table.
EDIT: Just saw your comment that this was a made-up schema to give an example. In other cases, this could be a sensible way to do things. For example, if you wanted to keep track of company locations, you might have a Company table, a Locations table, and a Countries table. The Company table might have a 1-many link to Countries where you would keep track of a company's headquarters country, but a many-to-many link through Locations where you keep track of every place the company has a store.
If you can give real information as to what the schema really represents for your client, it might be easier for us to figure out whether it's logical in this case or not.
Perhaps it's a lack of caffeine, but I can't conceive of a legitimate reason for wanting to do this. In the example you gave, you've got students, classes and a table which relates the two. If you think about what you want the query to do, in plain English, surely it has to be driven by either the student table or the class table. i.e.
select all the classes which are attended by student 1245235
select all the students which attend class 101
Can you explain the requirement better? If not, tell your customer to suck it up. Having a relationship between Students and Classes directly (A and C), seems like pure madness, you've already got table B which does that...
Bear in mind that the one-to-many relationship can be represented through the many-to-many, most simply by adding a field there to indicate the type of relationship. Then you could have one "current" record and any number of "history" ones.
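A minimal sketch of that idea, using Python's sqlite3 module: the link table gets a rel_type column holding 'current' or 'history'. The column name, the sample rows, the helper name current_class, and the partial unique index that limits each student to one 'current' row are all assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE c (id_class INTEGER PRIMARY KEY, className TEXT);
    CREATE TABLE s (id_student INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE many_to_many (
        id_student INTEGER REFERENCES s(id_student),
        id_class   INTEGER REFERENCES c(id_class),
        rel_type   TEXT CHECK (rel_type IN ('current', 'history'))
    );
    -- Optional guard (an assumption): at most one 'current' row per student.
    CREATE UNIQUE INDEX one_current_per_student
        ON many_to_many (id_student) WHERE rel_type = 'current';
    INSERT INTO c VALUES (1, 'Algebra'), (2, 'Biology');
    INSERT INTO s VALUES (1, 'Dana');
    INSERT INTO many_to_many VALUES (1, 1, 'history'), (1, 2, 'current');
""")

def current_class(id_student):
    # The one-to-many link is just the 'current' slice of the many-to-many table.
    row = conn.execute(
        """
        SELECT c.className
        FROM many_to_many AS mtm
        JOIN c ON c.id_class = mtm.id_class
        WHERE mtm.id_student = ? AND mtm.rel_type = 'current'
        """,
        (id_student,),
    ).fetchone()
    return row[0] if row else None

print(current_class(1))
```

With this layout the direct foreign key from the student table to the class table is no longer needed.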
Was the customer "requirement" phrased as given, by the way? I think I'd be looking to redefine my relationship with them if so: they should be telling me "what" they want (ideally what, in business domain language, their problem is) and leaving the "how" to me. If they know exactly how the thing should be implemented, then I'd be inclined to open the source code in an editor and leave them to it!
I'm supposing that s.id_class indicates the student's current class, as opposed to classes she has taken in the past.
The solution shown by rcar works, but it repeats the c1.className on every row.
Here's an alternative that doesn't repeat information and it uses one fewer join. You can use an expression to compare s.id_class to the current c.id_class matched via the mtm table.
SELECT s.name, c.className, (s.id_class = c.id_class) AS is_current
FROM s JOIN many_to_many AS mtm ON (s.id_student = mtm.id_student)
JOIN c ON (c.id_class = mtm.id_class);
So is_current will be 1 (true) on one row, and 0 (false) on all the other rows. Or you can output something more informative using a CASE construct:
SELECT s.name, c.className,
CASE WHEN s.id_class = c.id_class THEN 'current' ELSE 'past' END AS is_current
FROM s JOIN many_to_many AS mtm ON (s.id_student = mtm.id_student)
JOIN c ON (c.id_class = mtm.id_class);
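For anyone who wants to try the CASE version, here is a small self-contained run through Python's sqlite3 module. The sample student and classes are invented, and the output column is named status here instead of is_current.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE c (id_class INTEGER PRIMARY KEY, className TEXT);
    CREATE TABLE s (id_student INTEGER PRIMARY KEY, name TEXT,
                    id_class INTEGER REFERENCES c(id_class));
    CREATE TABLE many_to_many (id_student INTEGER, id_class INTEGER);
    INSERT INTO c VALUES (1, 'Algebra'), (2, 'Biology');
    INSERT INTO s VALUES (1, 'Dana', 2);              -- Dana's current class is Biology
    INSERT INTO many_to_many VALUES (1, 1), (1, 2);   -- every class Dana has taken
""")

rows = conn.execute("""
    SELECT s.name, c.className,
           CASE WHEN s.id_class = c.id_class THEN 'current' ELSE 'past' END AS status
    FROM s
    JOIN many_to_many AS mtm ON s.id_student = mtm.id_student
    JOIN c ON c.id_class = mtm.id_class
    ORDER BY c.id_class
""").fetchall()

print(rows)
```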
It doesn't seem to make sense. A query like:
SELECT * FROM relAC RAC
INNER JOIN tableA A ON A.id_class = RAC.id_class
INNER JOIN tableC C ON C.id_class = RAC.id_class
WHERE A.id_class = C.id_class
could return a set of data, but an inconsistent one. Or maybe we are missing some important information about the content and the relationships of those 3 tables.
I personally never heard a requirement from a customer that would sound like:
Obtain the information from the Table B inner joining with A and C, and in the same query relate A and C in a one-many relationship
It looks like that is how you translated the requirement.
Could you specify the requirement in plain English, as what results your customer wants to get?