SQL query help needed joining within a table - sql

I have a table called babynames that looks abit like this:
firstname |sex |year |count
Bob |M| 2010| 150
Bob |M| 2009| 100
Bob |M| 2008| 122
Bob |F| 2007| 2
Bob |F| 2001| 1
What I want to do is get a list of all the baby names that are both female and male, so my query needs to pull all the firstname records that have at least two records in the table and are at least one M and one F.
It's getting late and my mind isn't working well tonight. Can anyone suggest a string that might help me achieve this task?

There are several ways to handle this. One would be to to use a COUNT(DISTINCT sex) = 2 in the HAVING clause. Be sure to GROUP BY firstname.
SELECT
firstname
FROM babynames
GROUP BY firstname
HAVING COUNT(DISTINCT sex) = 2
Here's a demo: http://sqlfiddle.com/#!2/5d221/1
Another would an INNER JOIN against 2 aliases of the same table, where one looks for M while the other looks for F. If a name isn't matched by both conditions, the join can't be made and it will get excluded from the output.
SELECT
DISTINCT
m.firstname
FROM
babynames f
INNER JOIN babynames m ON f.firstname = m.firstname
WHERE
f.sex = 'F'
AND m.sex = 'M'
http://sqlfiddle.com/#!2/5d221/3

Related

Stuck on beginner SQL practice. Multiple table where columns use same id

I'm very sorry to bother with minor problem, but I tried to search old answers for this one and since my skills in SQL are complete 0, I didn't even understand the answers :/! Neither is my English terminology great enough for properly searching.
I have these 2 tables: Cities and Flights.
Cities
+----+-------------+
|id | name |
+----+-------------+
|1 | Oslo |
|2 | New York |
|3 | Hong Kong |
+----+-------------+
Flights
+----+--------------------+-------------------+
|id | wherefrom_id | whereto_id |
+----+--------------------+-------------------+
|1 | 3 | 2 |
|2 | 3 | 1 |
|3 | 1 | 3 |
+----+--------------------+-------------------+
Now I have to write code where I need to make city ID's merge to wherefrom_id and whereto_id, in that manner that the answer shows table where you can see list of Flights (FROM/TO).
Example:
ANSWER:
+-----------+----------------+
|HONG KONG | NEW YORK |
+-----------+----------------+
|HONG KONG | OSLO |
+-----------+----------------+
|OSLO | HONG KONG |
+-----------+----------------+
This is what I wrote:
SELECT C.name, C.name
FROM Cities C, Flights F
WHERE C.id = F.wherefrom_id AND C.id = F.whereto_id;
For some reason this doesnt seem to work and I get nothing showing on my practice program. There is no error or anything it just doesnt show anything on the test answer. I really hope you get what I mean, English is not my first language and I truly tried my best to make it clear as possible :S
First things first - it's a lot easier to code in standard SQL join syntax. Converting your above to that is
SELECT C.name, C.name
FROM Cities C
INNER JOIN Flights F ON C.id = F.wherefrom_id AND C.id = F.whereto_id;
The question you've been asked requires logic people don't usually use at first so it can be confusing the first time you encounter it.
I will run through the logic jump in a moment.
Imagine your Flights table has the City names in it (not IDs).
It would have columns, say, FlightID, From_City_Name, To_City_Name.
An example row would be 1, 'Oslo', 'Prague'.
Getting the data for this would be easy e.g., SELECT Flight_ID, From_City_Name, To_City_name FROM Flights.
However, this has many problems. As your question has done, you decide to pull out the cities into their own reference tables.
For this first example, however, you decided to have two extra tables as reference tables: From_City and To_City. These would both have an ID and city name. You then change your Flights to refer to these.
Your code would look like
SELECT F.ID, FC.Name AS From_City, TC.Name AS To_City
FROM Flights
INNER JOIN From_City AS FC ON Flights.From_City_ID = FC.ID
INNER JOIN To_City AS TC ON Flights.To_City_ID = TC.ID
Notice how there are two joins there - one to From_City and one to To_City? That is because the From and To cities are referring to different things in the data.
So, then the final part of the issue: why have two city tables (from and to). Why not have one? Well, you can. If you create just one table, and modify the above, you get something like this:
SELECT F.ID, FC.Name AS From_City, TC.Name AS To_City
FROM Flights
INNER JOIN City AS FC ON Flights.From_City_ID = FC.ID
INNER JOIN City AS TC ON Flights.To_City_ID = TC.ID
Note that all that has changed is that the From_City and To_City references have been pointed to a different table City. However, the rest is the same.
And that, actually, would be your answer. The complex part that most people don't get to straight away, is having two joins to the same table.
As an aside, your original code is technically valid.
SELECT C.name, C.name
FROM Cities C
INNER JOIN Flights F ON C.id = F.wherefrom_id AND C.id = F.whereto_id;
However, what it's effectively saying is to get the city names where the From_City is the same as the To_City - which is obviously not what you want (unless you're looking for turnbacks).
What you're doing is an old SQL way of expressing joins. The standard now has better ways to declare the relationships within the from clause and I take it that your material has postponed that slightly:
There are people who will yell at you for using this ancient syntax but the answer is easy enough:
SELECT C1.name, C2.name
FROM Cities C1, Cities C2, Flights F
WHERE C1.id = F.wherefrom_id AND C2.id = F.whereto_id
You can think of this as creating a "cross product" of all city-pair combinations and matching up the ones that match actual flights. The key is to references Cities twice by using different aliases (or correlation names.)
I think this is what you are looking for..
SELECT wf.name "wherefrom", wt.name "whereto"
FROM Flights f
JOIN Cities wf
ON f.wherefrom_id = wf.id
JOIN Cities wt
ON f.whereto_id = wt.id
order by f.id

SQL selecting where A equals both B and C

name | course
Jay | LAWS0001
Mark | LAWS0002
Sam | LAWS0002
Alice | LAWS0001
Ryan | LAWS0001
Ryan | LAWS0002
Hey guys, I've got this database and I want to only select the names that take both 'LAWS0001' and 'LAWS0002'. So from this example, it should select 'Ryan' because he's the only person to take both courses.
I tried IN operator:
SELECT name
FROM student
WHERE course IN ('LAWS0001', 'LAWS0002')
but this takes everyone because everyone is taking either of the courses.
Is there an operator for my problem?
You can use your existing query, using a GROUP BY clause to COUNT the number of distinct courses each student is taking in the set ('LAWS0001', 'LAWS0002') and only selecting those students where the count is 2:
SELECT name
FROM student
WHERE course IN ('LAWS0001', 'LAWS0002')
GROUP BY name
HAVING COUNT(DISTINCT course) = 2
Demo on SQLFiddle

Selecting more after group-by while using join

At the moment I am busy with two tables, Students and Classes. These two both contain a column project_group, a way to categorize multiple students from one class into smaller groups.
In the Students table there is a column City that states in which town/city students live, from the rows that have been filled there are already several cities occurring multiple times. The code I used to check how many times a city is being showed is this:
SELECT City, count(*)
FROM Students
GROUP BY City
Now the next thing I want to do is show per class in which cities the students live and how many live there, so for example a result like:
A | - | 2
A | New York | 3
A | Los Angeles | 1
B | - | 1
B | Miami | 2
B | Seattle | 1
Students and Classes can join each other on the column project_group but what I'm mostly interested in his using both the GROUP BY mentioned earlier, using the JOIN and also showing the results per class.
Thanks in advance,
KRAD
I'm not sure what the column name is for A and B in your example. I'm assuming Classes.Class in the following:
SELECT
C.Class
, S.City
, COUNT(S.*) AS Count
FROM
Classes AS C INNER JOIN
Students AS S ON C.Project_Group = S.Project_Group
GROUP BY
C.Class
, S.City
I managed to get it working. While doing some tests to see which exact error message it was that I got, I used this and managed to get it working. I now get an overview per class that shows how many people live in which city. This is the code used.
SELECT class_id, city, count(*) AS amount
FROM students, classes
WHERE students.project_group = classes.project_group
GROUP BY class_id, city
ORDER BY class_id

SQL Count unique rows where a column contains two different values

I have taken a good look around and not been able to find any questions that match mine. Maybe I am not using the right language when searching or whatever, but here goes.
I have an SQL table called Classes that looks something like this
Student_Name | Class
--------------------
Edgar | Chemistry
Allan | Chemistry
Burt | Chemistry
Edgar | Math
Sue | Math
Hamilton | Math
Edgar | English
Sue | English
Edgar | German
Ben | German
I want to count how many students are taking both Math and German.
Assuming the following in this example:
- Student names are unique
- One student can have many classes
Logically I would use a select statement to get a result set of students who are taking Math. Then I would go through each Student_Name from the result set and check them against the table to see how many are taking German.
In this case I would expect a return of 1 as only Edgar is taking both Math and German.
Here are some of the queries I have tried so far to no avail :-(
This one was after doing some research on DISTINCT:
SELECT COUNT(DISTINCT Student_Name) FROM Classes WHERE Class = 'Math' AND Class = 'German';
And this one was after finding out more about GROUP BY:
SELECT COUNT(*) FROM (
SELECT DISTINCT Student_Name FROM Classes
WHERE Class IN ( 'Math', 'German' )
GROUP BY Student_Name
);
Neither of these came out quite right any help would be highly appreciated.
SELECT COUNT(*) totalStudent
FROM
(
SELECT student_name
FROM Classes
WHERE class IN ('Math','German')
GROUP BY student_name
HAVING COUNT(*) = 2
) subAlias
SQLFiddle Demo
OUTPUT
╔══════════════╗
║ TOTALSTUDENT ║
╠══════════════╣
║ 1 ║
╚══════════════╝
Could also do the following:
select count(distinct a.Student_name)
from Classes a inner join Classes b on
a.Class = 'German' and
b.Class = 'Math' and
a.Student_Name = b.Student_name;
This solves the problem where the table contains duplicate rows (as pointed out by a commenter to another answer)

Multiple JOIN (SQL)

My problem is Play! Framework / JPA specific. But I think it's applicable to general SQL syntax.
Here is a sample query with a simple JOIN:
return Post.find(
"select distinct p from Post p join p.tags as t where t.name = ?", tag
).fetch();
It's simple and works well.
My question is: What if I want to JOIN on more values in the same table?
Example (Doesn't work. It's a pseudo-syntax I created):
return Post.find(
"select distinct p from Post p join p.tags1 as t, p.tags2 as u, p.tags3 as v where t.name = ?, u.name = ?, v.name = ?", tag1, tag2, tag3,
).fetch();
Your programming logic seems okay, but the SQL statement needs some work. Seems you're new to SQL, and as you pointed out, you don't seem to understand what a JOIN is.
You're trying to select data from 4 tables named POST, TAG1, TAG2, and TAG3.
I don't know what's in these tables, and it's hard to give sample SQL statements without that information. So, I'm going to make something up, just for the purposes of discussion. Let's say that table POST has 6 columns, and there's 8 rows of data in it.
P Fname Lname Country Color Headgear
- ----- ----- ------- ----- --------
1 Alex Andrews 1 1 0
2 Bob Barker 2 3 0
3 Chuck Conners 1 5 0
4 Don Duck 3 6 1
5 Ed Edwards 2 4 2
6 Frank Farkle 4 2 1
7 Geoff Good 1 1 0
8 Hank Howard 1 3 0
We'll say that TAG1, TAG2, and TAG3 are lookup tables, with only 2 columns each. Table TAG1 has 4 country codes:
C Name
- -------
1 USA
2 France
3 Germany
4 Spain
Table TAG2 has 6 Color codes:
C Name
- ------
1 Red
2 Orange
3 Yellow
4 Green
5 Blue
6 Violet
Table TAG3 has 4 Headgear codes:
C Name
- -------
0 None
1 Glasses
2 Hat
3 Monacle
Now, when you select data from these 4 tables, for P=6, you're trying to get something like this:
Fname Lname Country Color Headgear
----- ------ ------- ------ -------
Frank Farkle Spain Orange None
First thing, let's look at your WHERE clause:
where t.name = ?, u.name = ?, v.name = ?
Sorry, but using commas like this is a syntax error. Normally you only want to find data where all 3 conditions are true; you do this by using AND:
where t.name=? AND u.name=? AND v.name=?
Second, why are you joining tables together? Because you need more information. Table POST says that Frank's COUNTRY value is 4; table TAG1 says that 4 means Spain. So we need to "join" these tables together.
The ancient (before 1980, I think) way to join tables is to list more than one table name in the FROM clause, separated by commas. This gives us:
SELECT P.FNAME, P.LNAME, T.NAME As Country, U.NAME As Color, V.NAME As Headgear
FROM POST P, TAG1 T, TAG2 U, TAG3 V
The trouble with this query is that you're not telling it WHICH rows you want, or how they relate to each other. So the database generates something called a "Cartesian Product". It's extremely rare that you want a Cartesian Product - normally this is a HUGE MISTAKE. Even though your database only has 22 rows in it, this SELECT statement is going to return 768 rows of data:
Alex Andrews USA Red None
Alex Andrews USA Red Glasses
Alex Andrews USA Red Hat
Alex Andrews USA Red Monacle
Alex Andrews USA Orange None
Alex Andrews USA Orange Glasses
...
Hank Howard Spain Violet Monacle
That's right, it returns every possible combination of data from the 4 tables. Imagine for a second that the POST table eventually grows to 20000 rows, and the three TAG tables have 100 rows each. The whole database would be less than a megabyte, but the Cartesian Product would have 20,000,000,000 rows of data -- probably about 120 GB of data. Any database engine would choke on that.
So if you want to use the Ancient way of specifying tables, it is VERY IMPORTANT to make sure that your WHERE clause shows the relationship between every table you're querying. This makes a lot more sense:
SELECT P.FNAME, P.LNAME, T.NAME As Country, U.NAME As Color, V.NAME As Headgear
FROM POST P, TAG1 T, TAG2 U, TAG3 V
WHERE P.Country=T.C AND P.Color=U.C AND P.Headgear=V.C
This only returns 8 rows of data.
Using the Ancient way, it's easy to accidentally create Cartesian Products, which are almost always bad. So they revised SQL to make it harder to do. That's the JOIN keyword. Now, when you specify additional tables you can specify how they relate at the same time. The New Way is:
SELECT P.FNAME, P.LNAME, T.NAME As Country, U.NAME As Color, V.NAME As Headgear
FROM POST P
INNER JOIN TAG1 T ON P.Country=T.C
INNER JOIN TAG2 U ON P.Color=U.C
INNER JOIN TAG3 V ON P.Headgear=V.C
You can still use a WHERE clause, too.
SELECT P.FNAME, P.LNAME, T.NAME As Country, U.NAME As Color, V.NAME As Headgear
FROM POST P
INNER JOIN TAG1 T ON P.Country=T.C
INNER JOIN TAG2 U ON P.Color=U.C
INNER JOIN TAG3 V ON P.Headgear=V.C
WHERE P.P=?
If you call this and pass in the value 6, you get only one row back:
Fname Lname Country Color Headgear
----- ------ ------- ------ --------
Frank Farkle Spain Orange None
As was mentioned in the comments, you are looking for an ON clause.
SELECT * FROM TEST1
INNER JOIN TEST2 ON TEST1.A = TEST2.A AND TEST1.B = TEST2.B ...
See example usage of join here:
http://en.wikibooks.org/wiki/Java_Persistence/Relationships#Join_Fetching