SQL relationships, multiple methods, same result? Select, From, Where OR Select, From, Join, Where - sql

I've got 2 tables, a questions table and an answers table with the following example data:
+-----------------------------------+
| Questions                         |
+----+------------------------------+
| id | title                        |
+----+------------------------------+
| 1  | What is your favourite game? |
| 2  | What is your favourite food? |
+----+------------------------------+
+-------------------------------------------------+
| Answers                                         |
+----+------------------------------+-------------+
| id | text                         | question_id |
+----+------------------------------+-------------+
| 1  | The Last Of Us               | 1           |
| 2  | PlayerUnknowns Battlegrounds | 1           |
| 3  | Uncharted                    | 1           |
| 4  | KFC                          | 2           |
| 5  | Pizza                        | 2           |
+----+------------------------------+-------------+
This is a one-to-many relationship: one question can have many answers. To fetch the answers to a question, I can do any of the following:
SELECT
id, text
FROM
answers
WHERE
question_id = 1
Or:
SELECT
answers.id, answers.text
FROM
answers
JOIN
questions
ON
answers.question_id = questions.id
WHERE
questions.id = 1
Or:
SELECT
answers.id, answers.text
FROM
questions
JOIN
answers
ON
questions.id = answers.question_id
WHERE
questions.id = 1
Which all return the following (expected) results:
+-----------------------------------+
| Results                           |
+----+------------------------------+
| id | text                         |
+----+------------------------------+
| 1  | The Last Of Us               |
| 2  | PlayerUnknowns Battlegrounds |
| 3  | Uncharted                    |
+----+------------------------------+
Should any of them be avoided? Is there a preferred way of doing this? Just curious about the dos and don’ts of querying relationships in general really.

If you only want the answers, don't involve the questions table.
Just select from answers.
Adding unused tables to a query makes no sense:
it makes the query harder to read, and thus harder to maintain,
and it makes the database work harder to get the same results (though modern databases might optimize the unused parts of the query away).

If you want to enforce the relationship between the "questions" and "answers" tables, make the id column of "questions" the primary key and the question_id column of "answers" a foreign key that references it.
Use a JOIN when you need data (columns) from more than one table.
In your case, if you want the title column included in the results, join the tables.
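As a rough illustration of the two points above, the sketch below declares the primary and foreign keys and then joins only because title is needed. The column types and NOT NULL constraints are assumptions; the question does not show its table definitions.

```sql
-- Sketch only: column types are assumed, not taken from the question.
CREATE TABLE questions (
    id    INT PRIMARY KEY,
    title VARCHAR(255) NOT NULL
);

CREATE TABLE answers (
    id          INT PRIMARY KEY,
    text        VARCHAR(255) NOT NULL,
    question_id INT NOT NULL,
    -- Every answer must point at an existing question.
    FOREIGN KEY (question_id) REFERENCES questions (id)
);

-- The join is justified here because title lives in questions.
SELECT questions.title, answers.id, answers.text
FROM answers
JOIN questions ON answers.question_id = questions.id
WHERE questions.id = 1;
```

With the foreign key in place, the database rejects an answer whose question_id has no matching question, so the plain single-table query on answers stays safe to use on its own.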

Related

*=* join? Is there such a thing? [duplicate]

I need to run a query that behaves like a combined left/right outer join. In other words, I need all rows from both the left and right tables, but I don't need a Cartesian product (cross join). I need to match on, in my case, email address. So I have to output all rows from the left table and join the right table on email address, but all rows from either table that do not match need to be output as well, with NULLs for the fields from the opposite table. Sort of like a *=* join, if there were such a thing, or a left-right outer join.
As for what I've tried: Google Searches. But didn't find anything. Cross apply might work, but I cannot wrap my brain around how that is any different from a join.
Example theoretical left-right join:
select users.*, contacts.*
from users
left-right join contacts on users.emailAddress = contacts.emailAddress
So if users contains:
----------------------------------
|emailAddress | firstName |
----------------------------------
|k#company.com | ken |
|b#enterprise.com | bill |
|j#establishment.com | joe |
----------------------------------
And contacts contains:
--------------------------------
|emailAddress | optedOut |
--------------------------------
|z#bigcompany.com | 0 |
|b#enterprise.com | 1 |
|h#smallcompany.com | 1 |
--------------------------------
The output should look like:
------------------------------------------------------------------
|emailAddress | firstName |emailAddress | optedOut |
------------------------------------------------------------------
|k#company.com | ken | NULL | NULL |
|b#enterprise.com | bill | b#enterprise.com | 1 |
|j#establishment.com | joe | NULL | NULL |
|NULL | NULL | z#bigcompany.com | 0 |
|NULL | NULL | h#smallcompany.com | 1 |
------------------------------------------------------------------
It's called a "full outer join". Your query should look like:
select users.*, contacts.*
from users
full outer join contacts on users.emailAddress = contacts.emailAddress
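Some databases (MySQL, for example) do not support FULL OUTER JOIN. A common workaround, sketched here against the users and contacts tables from the question, is to take the left join and append the contacts rows that have no matching user:

```sql
-- All users, with their contact row when one exists.
SELECT u.emailAddress AS userEmail, u.firstName,
       c.emailAddress AS contactEmail, c.optedOut
FROM users u
LEFT JOIN contacts c ON u.emailAddress = c.emailAddress

UNION ALL

-- Contacts that matched no user; the user columns come back as NULL.
SELECT u.emailAddress, u.firstName,
       c.emailAddress, c.optedOut
FROM contacts c
LEFT JOIN users u ON u.emailAddress = c.emailAddress
WHERE u.emailAddress IS NULL;
```

The aliases userEmail and contactEmail are introduced here only to keep the two emailAddress columns apart. UNION ALL is used instead of UNION so that genuinely duplicated rows are not collapsed.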

Sql data from row to column with reference to another column

Parent table
+====+===========+
| id | firstname |
+====+===========+
| 1 | abc |
+----+-----------+
| 2 | bcd |
+----+-----------+
| 3 | cde |
+----+-----------+
StudentRelationship table
+==========+==========+===========+
| relation | parentid | studentid |
+==========+==========+===========+
| father | 1 | s0001 |
+----------+----------+-----------+
| mother | 2 | s0001 |
+----------+----------+-----------+
| father | 3 | s0002 |
+----------+----------+-----------+
STUDENT table
+=======+===========+==========+=========+======+
| id | firstname | lastname | address | sex |
+=======+===========+==========+=========+======+
| s0001 | shdj | khb | jxx | male |
+-------+-----------+----------+---------+------+
It would be great if you could help me create a query that returns studentid, name, father name, mother name, sex, and address.
Based on what you've posted, then updated in your comments, I think this should work for you. I am sure someone with more advanced SQL skills can post a more elegant way to do this. But this is what I came up with:
SELECT DISTINCT cte.studentid
,studentFirstName
,studentLastName
,father.fatherFirstName
,mother.motherFirstName
,sex
,address
FROM cte
LEFT JOIN father ON cte.studentid = father.studentid
LEFT JOIN mother ON cte.studentid = mother.studentid
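The definitions of cte, father, and mother are not included in the post, so the query above cannot be run as shown. As a minimal sketch of the same idea without them, assuming the tables are named Student, StudentRelationship, and Parent with the columns listed in the question, each relation can be joined in separately:

```sql
SELECT s.id        AS studentid,
       s.firstname AS studentFirstName,
       s.lastname  AS studentLastName,
       f.firstname AS fatherFirstName,
       m.firstname AS motherFirstName,
       s.sex,
       s.address
FROM Student s
-- One pair of joins per relation; LEFT JOIN keeps students with a missing parent.
LEFT JOIN StudentRelationship fr
       ON fr.studentid = s.id AND fr.relation = 'father'
LEFT JOIN Parent f ON f.id = fr.parentid
LEFT JOIN StudentRelationship mr
       ON mr.studentid = s.id AND mr.relation = 'mother'
LEFT JOIN Parent m ON m.id = mr.parentid;
```

A student with two rows for the same relation (two fathers, say) comes back as two result rows, one per father.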
The following is an example where a student (Jeff Jones) has two fathers (let's say one of them is the step-father):
A few recommendations here:
Take a course on SQL syntax fundamentals (any type MySQL, T-SQL, etc..)
Read about FROM and JOIN
When posting a question here, give the example tables better test data. Values like "asdfkj", "shdsf", and "Asdjkfdjkf" are very hard to test code against because they give no context for what you are looking at. I realize you are just posting an example, and the content of the rows is partly insignificant, but meaningful data makes the question easier to answer and doesn't scare off people who would want to answer it.
Here is a DEMO you can play with that has reasonable data in the fields you've mentioned.

Randomly Populating Foreign Key In Sample Data Set

I'm generating test data for a new database, and I'm having trouble populating one of the foreign key fields. I need to create a relatively large number (1000) of entries in a table (SurveyResponses) that has a foreign key to a table with only 6 entries (Surveys)
The database already has a Schools table that has a few thousand records. For argument's sake, let's say it looks like this:
Schools
+----+-------------+
| Id | School Name |
+----+-------------+
| 1 | PS 1 |
| 2 | PS 2 |
| 3 | PS 3 |
| 4 | PS 4 |
| 5 | PS 5 |
+----+-------------+
I'm creating a new Survey table. It will only have about 3 rows.
Survey
+----+-------------+
| Id | Col2 |
+----+-------------+
| 1 | 2014 Survey |
| 2 | 2015 Survey |
| 3 | 2016 Survey |
+----+-------------+
SurveyResponses simply ties a school to a survey.
Survey Responses
+----+----------+----------+
| Id | SchoolId | SurveyId |
+----+----------+----------+
| 1 | 1 | 1 |
| 2 | 2 | 2 |
| 3 | 3 | 1 |
| 4 | 4 | 3 |
| 5 | 5 | 2 |
+----+----------+----------+
Populating the SurveyId field is what's giving me the most trouble. I can randomly select 1000 Schools, but I haven't figured out a way to generate 1000 random SurveyIds. I've been trying to avoid a while loop, but maybe that's the only option?
I've been using Red Gate SQL Data Generator to generate some of my test data, but in this case I'd really like to understand how this can be done with raw SQL.
Here is one way, using a correlated subquery to get a random survey associated with each school:
select s.schoolid,
(select top 1 surveyid
from surveys
order by newid()
) as surveyid
from schools s;
Note: This doesn't seem to work. Here is a SQL Fiddle showing the non-workingness. I am quite surprised it doesn't work, because newid() should be a
EDIT:
If you know the survey ids have no gaps and start with 1, you can do:
select 1 + abs(checksum(newid()) % 3) as surveyid
I did check that this does work.
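Putting that expression together with a random pick of schools, a minimal T-SQL sketch for filling the table might look like the following. It assumes that SurveyResponses.Id is an identity column, that the tables are named as in the question's diagrams, and that the survey ids are exactly 1 to 3.

```sql
-- Assumption: SurveyResponses.Id is generated automatically (IDENTITY).
INSERT INTO SurveyResponses (SchoolId, SurveyId)
SELECT TOP (1000)
       s.Id,
       -- checksum(newid()) % 3 is in -2..2, so abs(...) + 1 is in 1..3.
       1 + ABS(CHECKSUM(NEWID()) % 3)
FROM Schools AS s
ORDER BY NEWID();  -- shuffles the schools so TOP picks a random 1000
```

For a different number of surveys, the 3 would change to match, and the approach still relies on the ids being gap-free.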
EDIT II:
This appears to be overly aggressive optimization (in my opinion). Correlating the query appears to fix the problem. So, something like this should work:
select s.schoolid,
(select top 1 surveyid
from surveys s2
where s2.surveyid = s.schoolid or s2.surveyid <> s.schoolid -- nonsensical condition to prevent over optimization
order by newid()
) as surveyid
from schools s;
Here is a SQL Fiddle demonstrating this.

Is it possible to construct dynamic aggregate columns in an ARel query that uses a join?

Here's a bit of sample context for my question below to help clarify what I'm asking...
The Schema
Users
- id
- name
Answers
- id
- user_id
- topic_id
- was_correct
Topics
- id
- name
The Data
Users
id | name
1 | Gabe
2 | John
Topics
id | name
1 | Math
2 | English
Answers
id | user_id | topic_id | was_correct
1 | 1 | 1 | 0
2 | 1 | 1 | 1
3 | 1 | 2 | 1
4 | 2 | 1 | 0
5 | 2 | 2 | 0
What I'd like to have, in a result set, is a table with one row per user, and two columns per topic, one that shows the sum of correct answers for the topic, and one that shows the sum of the incorrect answers for that topic. For the sample data above, this result set would look like:
My desired result
users.id | users.name | topic_1_correct_sum | topic_1_incorrect_sum | topic_2_correct_sum | topic_2_incorrect_sum
1 | Gabe | 1 | 1 | 1 | 0
2 | John | 0 | 1 | 0 | 1
Obviously, if there were more topics in the Topics table, I'd like this query to include new correct_sum and incorrect_sums for each topic that exists, so I'm looking for a way to write this without hard-coding topic_ids into the sum functions of my select clause.
Is there a smart way to magic this sort of thing with ARel?
Gabe,
What you're looking for here is a crosstab query. There are many approaches to writing this, unfortunately none that will be generic enough in SQL. AFAIK each database handles crosstabs differently. Another way of looking at this is as a "cube", something typically found in OLAP-type databases (as opposed to OLTP).
It's easy to write in SQL, but the query will likely involve some functions native to the database you're using. What DB are you using?
Your answers table looks like it needs to have 1,2,3,4,5 and not 1,1,1,1,1 as ids...
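One portable way to write such a crosstab is conditional aggregation, which uses only CASE inside SUM. The sketch below hard-codes the two topic ids from the sample data, so it does not by itself solve the dynamic-column part of the question; for that, the SUM columns would have to be generated from the rows of the topics table before the query is run.

```sql
SELECT u.id,
       u.name,
       SUM(CASE WHEN a.topic_id = 1 AND a.was_correct = 1 THEN 1 ELSE 0 END) AS topic_1_correct_sum,
       SUM(CASE WHEN a.topic_id = 1 AND a.was_correct = 0 THEN 1 ELSE 0 END) AS topic_1_incorrect_sum,
       SUM(CASE WHEN a.topic_id = 2 AND a.was_correct = 1 THEN 1 ELSE 0 END) AS topic_2_correct_sum,
       SUM(CASE WHEN a.topic_id = 2 AND a.was_correct = 0 THEN 1 ELSE 0 END) AS topic_2_incorrect_sum
FROM users u
-- LEFT JOIN keeps users who have not answered anything (all sums are 0).
LEFT JOIN answers a ON a.user_id = u.id
GROUP BY u.id, u.name
ORDER BY u.id;
```

Against the sample data in the question, this yields the two rows shown under "My desired result".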

Retrieve comma delimited data from a field

I've created a form in PHP that collects basic information. I have a list box that allows multiple items selected (i.e. Housing, rent, food, water). If multiple items are selected they are stored in a field called Needs separated by a comma.
I have created a report ordered by the person's needs. The people who have only one need are sorted correctly, but the people who have several are sorted by the exact string passed to the database (i.e. housing, rent, food, water), which is not what I want.
Is there a way to separate the multiple values in this field using SQL to count each need instance/occurrence as 1 so that there are no comma delimitations shown in the results?
Your database is not in the first normal form. A non-normalized database will be very problematic to use and to query, as you are actually experiencing.
In general, you should be using at least the following structure. It can still be normalized further, but I hope this gets you going in the right direction:
CREATE TABLE users (
user_id int,
name varchar(100)
);
CREATE TABLE users_needs (
need varchar(100),
user_id int
);
Then you should store the data as follows:
-- TABLE: users
+---------+-------+
| user_id | name |
+---------+-------+
| 1 | joe |
| 2 | peter |
| 3 | steve |
| 4 | clint |
+---------+-------+
-- TABLE: users_needs
+---------+----------+
| need | user_id |
+---------+----------+
| housing | 1 |
| water | 1 |
| food | 1 |
| housing | 2 |
| rent | 2 |
| water | 2 |
| housing | 3 |
+---------+----------+
Note how the users_needs table is defining the relationship between one user and one or many needs (or none at all, as for user number 4.)
To normalise your database further, you should also use another table called needs, as follows:
-- TABLE: needs
+---------+---------+
| need_id | name |
+---------+---------+
| 1 | housing |
| 2 | water |
| 3 | food |
| 4 | rent |
+---------+---------+
Then the users_needs table should just refer to a candidate key of the needs table instead of repeating the text.
-- TABLE: users_needs (instead of the previous one)
+---------+----------+
| need_id | user_id |
+---------+----------+
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 1 | 2 |
| 4 | 2 |
| 2 | 2 |
| 1 | 3 |
+---------+----------+
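The fully normalised layout could be declared roughly as follows. The key constraints are an addition for illustration; the CREATE TABLE statements earlier in this answer declare no keys, and the column types are carried over from them.

```sql
CREATE TABLE users (
    user_id INT PRIMARY KEY,
    name    VARCHAR(100)
);

CREATE TABLE needs (
    need_id INT PRIMARY KEY,
    name    VARCHAR(100) NOT NULL UNIQUE
);

-- One row per (user, need) pair; the composite key prevents duplicates.
CREATE TABLE users_needs (
    need_id INT NOT NULL,
    user_id INT NOT NULL,
    PRIMARY KEY (user_id, need_id),
    FOREIGN KEY (user_id) REFERENCES users (user_id),
    FOREIGN KEY (need_id) REFERENCES needs (need_id)
);
```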
You may also be interested in checking out the following Wikipedia article for further reading about repeating values inside columns:
Wikipedia: First normal form - Repeating groups within columns
UPDATE:
To fully answer your question, if you follow the above guidelines, sorting, counting and aggregating the data should then become straight-forward.
To sort the result-set by needs, using the first users_needs table (the one with the need text column), you would be able to do the following:
SELECT users.name, users_needs.need
FROM users
INNER JOIN users_needs ON (users_needs.user_id = users.user_id)
ORDER BY users_needs.need;
You would also be able to count how many needs each user has selected, again using the first users_needs table, for example:
SELECT users.name, COUNT(users_needs.need) as number_of_needs
FROM users
LEFT JOIN users_needs ON (users_needs.user_id = users.user_id)
GROUP BY users.user_id, users.name
ORDER BY number_of_needs;
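For the report the question describes, where each need should count once per person, a sketch against the further-normalised tables (needs with need_id and name, users_needs with need_id and user_id) could look like this:

```sql
SELECT n.name            AS need,
       COUNT(un.user_id) AS number_of_users
FROM needs n
-- LEFT JOIN so that a need nobody selected is still listed with a count of 0.
LEFT JOIN users_needs un ON un.need_id = n.need_id
GROUP BY n.need_id, n.name
ORDER BY number_of_users DESC, n.name;
```

With the sample rows above, this lists housing with 3 users, water with 2, and food and rent with 1 each.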
I'm a little confused by the goal. Is this a UI problem or are you just having trouble determining who has multiple needs?
The number of needs is the difference:
Len([Needs]) - Len(Replace([Needs],',','')) + 1
Can you provide more information about the Sort you're trying to accomplish?
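If the comma-separated column has to stay, the expression above can be placed in a query along these lines. This is a sketch for MySQL, where the equivalent of Len is CHAR_LENGTH; the table name people and its name column are hypothetical, since the question does not give them.

```sql
-- Hypothetical table: people(name, Needs), with Needs like 'housing,rent,food'.
SELECT name,
       Needs,
       -- Number of commas plus one equals the number of listed needs.
       CHAR_LENGTH(Needs) - CHAR_LENGTH(REPLACE(Needs, ',', '')) + 1 AS number_of_needs
FROM people
WHERE Needs IS NOT NULL
  AND Needs <> ''          -- an empty string would otherwise count as one need
ORDER BY number_of_needs, Needs;
```

This only counts the entries; it cannot sort or group by an individual need, which is what the normalised tables are for.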
UPDATE:
I think these Oracle-based posts may have what you're looking for: post and post. The only difference is that you would probably be better off using the method I list above to find the number of comma-delimited pieces rather than doing the translate(...) that the author suggests. Hope this helps - it's Oracle-based, but I don't see .