I have 4 tables. I don't need the 'Questions' table, but I included it so you'd know where I got the Question ID in the other tables.
Questions
--------
ID
Question
Question_Options
--------
ID
Question_ID
Option_Label
Session
--------------
ID
GENDER
Session_Answers
-----------------
ID
Session_ID
Option_ID
Question_ID
I calculated the number of votes for each option of a given question, like so:
SELECT Q.Option_Label as Choice, COALESCE((SELECT COUNT(*) FROM Session_Answers S WHERE S.Option_ID = Q.ID),0) as Votes
FROM Question_Options Q
INNER JOIN Session_Answers S
ON Q.Question_ID = S.Question_ID
WHERE Q.Question_ID = 10114 -- the Question ID
GROUP BY Q.ID,Q.option_label
What I want to do is add a new column to the query that calculates the number of males who have chosen each option, based on the Session table.
You can do that:
SELECT QO.Question_ID, QO.Option_Label as Choice, COUNT(S.ID) as VotesMale
FROM Question_Options QO
LEFT JOIN Session_Answers SA ON QO.ID = SA.Option_ID
-- LEFT JOIN (not JOIN) so options with no male votes still appear, with a count of 0
LEFT JOIN [Session] S ON S.ID = SA.Session_ID AND S.Gender = 'M'
WHERE QO.Question_ID = 10114 -- the Question ID
GROUP BY QO.Question_ID, QO.Option_label
You can simply add an extra count that joins to the Session table and filters on Gender.
SELECT Q.Option_Label as Choice,
COALESCE((SELECT COUNT(*) FROM Session_Answers SA WHERE
SA.Option_ID = Q.ID),0) as Votes,
-- Session is keyed by ID, so match it to Session_Answers.Session_ID
COALESCE((SELECT COUNT(*) FROM Session_Answers SA
INNER JOIN Session SM ON SM.ID = SA.Session_ID
WHERE SA.Option_ID = Q.ID AND SM.Gender='M'),0) as MalesSessions
FROM Question_Options Q
WHERE Q.Question_ID = 10114 -- the Question ID
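Both counts can also be produced in one pass with conditional aggregation. The following is only a sketch, run through Python's sqlite3 module so it is self-contained; the sample rows and the option labels ('Yes', 'No', 'Maybe') are made up for illustration, and SQLite syntax may differ slightly from your RDBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Question_Options (ID INTEGER PRIMARY KEY, Question_ID INTEGER, Option_Label TEXT);
CREATE TABLE Session (ID INTEGER PRIMARY KEY, Gender TEXT);
CREATE TABLE Session_Answers (ID INTEGER PRIMARY KEY, Session_ID INTEGER,
                              Option_ID INTEGER, Question_ID INTEGER);
-- Hypothetical sample data, not from the original question
INSERT INTO Question_Options VALUES (1, 10114, 'Yes'), (2, 10114, 'No'), (3, 10114, 'Maybe');
INSERT INTO Session VALUES (1, 'M'), (2, 'F'), (3, 'M');
INSERT INTO Session_Answers VALUES (1, 1, 1, 10114), (2, 2, 1, 10114), (3, 3, 2, 10114);
""")

rows = conn.execute("""
SELECT QO.Option_Label AS Choice,
       COUNT(SA.ID) AS Votes,  -- counts only matched answers, so unvoted options get 0
       COALESCE(SUM(CASE WHEN S.Gender = 'M' THEN 1 ELSE 0 END), 0) AS VotesMale
FROM Question_Options QO
LEFT JOIN Session_Answers SA ON SA.Option_ID = QO.ID
LEFT JOIN Session S ON S.ID = SA.Session_ID
WHERE QO.Question_ID = ?
GROUP BY QO.ID, QO.Option_Label
ORDER BY QO.ID
""", (10114,)).fetchall()

for row in rows:
    print(row)
```

Because both joins are LEFT JOINs, an option nobody picked still shows up with 0 in both columns.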
Related
I have 3 tables
User Table
id | Name
1 | Mike
2 | Sam
Score Table
id | UserId | CourseId | Score
1 | 1 | 1 | 5
2 | 1 | 1 | 10
3 | 1 | 2 | 5
Course Table
id | Name
1 | Course 1
2 | Course 2
What I'm trying to return is one row per user, showing the user id and user name along with the sum of that user's maximum score per course.
For the example tables, the output I'd like to see is:
Result
User_Id | User_Name | Total_Score
1 | Mike | 15
2 | Sam | 0
The SQL I've tried so far is:
select TOP(3) u.Id as User_Id, u.UserName as User_Name, SUM(maxScores) as Total_Score
from Users as u,
(select MAX(s.Score) as maxScores
from Scores as s
inner join Courses as c
on s.CourseId = c.Id
group by s.UserId, c.Id
) x
group by u.Id, u.UserName
I want to use a having clause to link the Users to Scores after the group by in the sub query, but I get an exception saying:
The multi-part identifier "u.Id" could not be bound
It works if I hard-code a user id in the having clause I want to add, but it needs to be dynamic and I'm stuck on how to do this.
What would be the correct way to structure the query?
You were close: you just needed to return s.UserId from the sub-query and join the sub-query to your Users table on it (I've joined in the reverse order to you because, to me, it's more logical to start with the base data and then join on more details as required). Note the scope of aliases: aliases defined inside your sub-query are not available in your outer query, and the sub-query cannot refer to the outer alias u either, which is why "u.Id" could not be bound.
select u.Id as [User_Id], u.UserName as [User_Name]
  , coalesce(sum(x.maxScore), 0) as Total_Score
from (
  select s.UserId, max(s.Score) as maxScore
  from Scores as s
  inner join Courses as c on s.CourseId = c.Id
  group by s.UserId, c.Id
) as x
-- right join keeps users with no scores (e.g. Sam), who then total 0
right join Users as u on u.Id = x.UserId
group by u.Id, u.UserName;
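To check the "sum of per-course maximums" logic against the example data, here is a minimal sketch using Python's sqlite3 module. It starts from Users and uses a LEFT JOIN (an assumption made so that users without scores, such as Sam, come back with 0 as in the desired output), and it skips the Courses table because CourseId in Scores is enough for the grouping.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (Id INTEGER PRIMARY KEY, UserName TEXT);
CREATE TABLE Scores (Id INTEGER PRIMARY KEY, UserId INTEGER, CourseId INTEGER, Score INTEGER);
-- Example data from the question
INSERT INTO Users VALUES (1, 'Mike'), (2, 'Sam');
INSERT INTO Scores VALUES (1, 1, 1, 5), (2, 1, 1, 10), (3, 1, 2, 5);
""")

totals = conn.execute("""
SELECT u.Id AS User_Id,
       u.UserName AS User_Name,
       COALESCE(SUM(x.maxScore), 0) AS Total_Score
FROM Users AS u
LEFT JOIN (
    -- best score per user and course
    SELECT s.UserId, s.CourseId, MAX(s.Score) AS maxScore
    FROM Scores AS s
    GROUP BY s.UserId, s.CourseId
) AS x ON x.UserId = u.Id
GROUP BY u.Id, u.UserName
ORDER BY u.Id
""").fetchall()

print(totals)  # [(1, 'Mike', 15), (2, 'Sam', 0)]
```

Mike's best scores are 10 in course 1 and 5 in course 2, which gives 15; Sam has no rows in Scores, so COALESCE turns the NULL sum into 0.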
I have a User table that has a many to many relationship with Areas. This relationship is stored in the Rel_User_area table. I want to show the user name and the first area that appears in the list of areas.
Ex.
User
id | Name
1 | Peter
2 | Joe
Area
id | Name
1 | Area A
2 | Area B
3 | Area C
Rel_User_area
iduser | idarea
1 | 1
1 | 3
2 | 3
The result I want:
User Name | Area
Peter |Area A
Joe |Area C
Using the minimum area id to determine "First" you could use a correlated subquery (A subquery that refers to field(s) in the main query to filter results):
SELECT user.name, area.name
FROM
user
INNER JOIN Rel_User_Area RUA ON user.id = RUA.iduser
INNER JOIN Area ON RUA.idarea = area.id
WHERE area.id = (SELECT min(idarea) FROM Rel_User_Area WHERE iduser = RUA.iduser)
There are other ways of doing this that may be RDBMS-specific. In Teradata, for example, I would use a QUALIFY clause, which doesn't exist in MySQL, SQL Server, Oracle, Postgres, etc. Regardless of the RDBMS, the query above should work. The Teradata version:
SELECT user.name, area.name
FROM
user
INNER JOIN Rel_User_Area RUA ON user.id = RUA.iduser
INNER JOIN Area ON RUA.idarea = area.id
QUALIFY ROW_NUMBER() OVER (PARTITION BY user.id ORDER BY area.id ASC) = 1;
Using the ID from Rel_user_Area that you mentioned in the comments, this should be pretty platform independent.
SELECT U.name as Username, A.Name as Area
FROM (SELECT min(ID) minID, IDUser
      FROM Rel_user_Area
      GROUP BY IDUser) M
-- join back on the lowest ID so that only one area per user is returned
INNER JOIN Rel_user_Area UA
 on UA.ID = M.minID
INNER JOIN User U
 on U.ID = UA.IDuser
INNER JOIN Area A
 on A.ID = UA.IDArea
If CROSS APPLY and TOP are available (SQL Server), you can use them; on PostgreSQL or recent MySQL the counterpart would be a LATERAL join with LIMIT 1 instead of TOP.
This runs the CROSS APPLY subquery once for each record in User, so you get the lowest (first) Rel_user_Area ID per user.
SELECT U.name as Username, A.Name as Area
FROM User U
CROSS APPLY (SELECT TOP 1 IDUser, IDArea
             FROM Rel_user_Area z
             WHERE Z.IDUSER = U.ID
             ORDER BY ID ASC) UA
INNER JOIN Area A
 on A.ID = UA.IDArea
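The answers above use two different notions of "first": the lowest area id, and the lowest ID in Rel_User_Area (i.e. insertion order). The sketch below, run through Python's sqlite3 module, shows that they can disagree. The ID column on Rel_User_Area and the insertion order of the rows are assumptions for illustration; a correlated LIMIT 1 subquery stands in for TOP 1 / CROSS APPLY, which SQLite does not have.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE User (id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Area (id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Rel_User_Area (id INTEGER PRIMARY KEY, iduser INTEGER, idarea INTEGER);
INSERT INTO User VALUES (1, 'Peter'), (2, 'Joe');
INSERT INTO Area VALUES (1, 'Area A'), (2, 'Area B'), (3, 'Area C');
-- Assumed insertion order: Peter was linked to Area C before Area A
INSERT INTO Rel_User_Area VALUES (1, 1, 3), (2, 1, 1), (3, 2, 3);
""")

# "First" = lowest area id per user (correlated MIN subquery)
by_min_area = conn.execute("""
SELECT u.Name, a.Name
FROM User u
JOIN Rel_User_Area rua ON rua.iduser = u.id
JOIN Area a ON a.id = rua.idarea
WHERE rua.idarea = (SELECT MIN(r2.idarea) FROM Rel_User_Area r2 WHERE r2.iduser = u.id)
ORDER BY u.id
""").fetchall()

# "First" = lowest relation id per user (earliest inserted link)
by_rel_id = conn.execute("""
SELECT u.Name, a.Name
FROM User u
JOIN Area a ON a.id = (SELECT r.idarea FROM Rel_User_Area r
                       WHERE r.iduser = u.id
                       ORDER BY r.id LIMIT 1)
ORDER BY u.id
""").fetchall()

print(by_min_area)  # [('Peter', 'Area A'), ('Joe', 'Area C')]
print(by_rel_id)    # [('Peter', 'Area C'), ('Joe', 'Area C')]
```

Which of the two is right depends on what "first area that appears in the list" is meant to be, so it is worth deciding that before picking a query.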
There is probably a much better way to create these views. I have limited SQL experience, so this is the way I designed it; I am hoping some of you SQL gurus can point me in a more efficient direction.
I essentially have 3 tables (sometimes 4) in my view, here is the essential structure:
Table USER
USER_ID | EMAIL | PASSWORD | CREATED_DATE
(Indexes: USER_ID)
Table USER_META
ID | USER_ID | NAME | VALUE
(Indexes: ID,USER_ID,NAME)
Table USER_SCORES
ID | USER_ID | GAME_ID | SCORE | CREATED_DATE
(Indexes: ID,USER_ID)
All the tables use the first ID column as an auto-increment primary key.
The second table "USER_META" is where I keep all the contact info and other misc. Primarily it is first_name,last_name, street,city, etc. - Depending on the user this could be 4 items or 140, which is why I use this table instead of having 150 columns in my USER table.
For reports, searching and editing I need about 20 values from USER_META, so I have views that look like this:
View V_USR_META
select USER_ID,EMAIL,
(select VALUE from USER_META
where NAME = 'FIRST_NAME' and USER_ID = u.USER_ID) as first_name,
(select VALUE from USER_META
where NAME = 'LAST_NAME' and USER_ID = u.USER_ID) as last_name,
(select VALUE from USER_META
where NAME = 'CITY' and USER_ID = u.USER_ID) as city,
(select VALUE from USER_META
where NAME = 'STATE' and USER_ID = u.USER_ID) as state,
(select VALUE from USER_META
where NAME = 'ZIP' and USER_ID = u.USER_ID) as zip,
/* 10 more selects for different meta values here */
(select max(SCORE) from USER_SCORES
where USER_ID = u.USER_ID) as high_score,
(select top (1) CREATED_DATE from USER_SCORES
where USER_ID = u.USER_ID
order by id desc) as last_game
from USER u
This gets pretty slow, and there are actually many more subqueries; this is just to illustrate the query. I also have to query a few other tables to get misc. info about the user.
I use the view when searching for a user, searches use name or userid or email or score, etc. I also use it to populate the user information screen when I present all the data in one place.
So - Is there a better way to write the view?
An alternative to all of those correlated subqueries would be to use max with case:
select u.USER_ID,
u.EMAIL,
max(case when um.name = 'FIRST_NAME' then um.value end) first_name,
max(case when um.name = 'LAST_NAME' then um.value end) last_name
...
from USER u
left join USER_META um
on u.user_id = um.user_id
group by u.user_id, u.email
Then you could add the user_scores results:
select u.USER_ID,
u.EMAIL,
max(case when um.name = 'FIRST_NAME' then um.value end) first_name,
max(case when um.name = 'LAST_NAME' then um.value end) last_name
...,
max(us.score) maxscore,
max(us.created_date) maxcreateddate
from USER u
left join USER_META um
on u.user_id = um.user_id
left join USER_SCORES us
on u.user_id = us.user_id
group by u.user_id, u.email
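Another option on SQL Server is PIVOT, with the scores pre-aggregated in CTEs:
WITH Meta AS (
    SELECT USER_ID
          ,FIRST_NAME
          ,LAST_NAME
          ,CITY
          ,STATE
          ,ZIP
    FROM (
        -- select only the columns PIVOT needs; otherwise ID also becomes a grouping column
        SELECT USER_ID, NAME, VALUE
        FROM USER_META
    ) AS src
    PIVOT (
        MAX(VALUE) FOR NAME IN (FIRST_NAME, LAST_NAME, CITY, STATE, ZIP)
    ) AS p
)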
,MaxScores AS (
SELECT USER_ID
,MAX(SCORE) AS Score
FROM USER_SCORES
GROUP BY USER_ID
)
,LastGames AS (
SELECT USER_ID
,MAX(CREATED_DATE) AS GameDate
FROM USER_SCORES
GROUP BY USER_ID
)
SELECT USER.USER_ID
,USER.EMAIL
,Meta.FIRST_NAME
,Meta.LAST_NAME
,Meta.CITY
,Meta.STATE
,Meta.ZIP
,MaxScores.Score
,LastGames.GameDate
FROM USER
INNER JOIN Meta
ON USER.USER_ID = Meta.USER_ID
LEFT JOIN MaxScores
ON USER.USER_ID = MaxScores.USER_ID
LEFT JOIN LastGames
ON USER.USER_ID = LastGames.USER_ID
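The max-with-case idea can be combined with pre-aggregated subqueries, so that USER_META and USER_SCORES are each collapsed to one row per user before joining. This is only a sketch run through Python's sqlite3 module; the sample users, e-mail addresses and values are invented, and only three meta names are pivoted to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE USER (USER_ID INTEGER PRIMARY KEY, EMAIL TEXT);
CREATE TABLE USER_META (ID INTEGER PRIMARY KEY, USER_ID INTEGER, NAME TEXT, VALUE TEXT);
CREATE TABLE USER_SCORES (ID INTEGER PRIMARY KEY, USER_ID INTEGER, GAME_ID INTEGER,
                          SCORE INTEGER, CREATED_DATE TEXT);
-- Hypothetical sample data
INSERT INTO USER VALUES (1, 'a@example.com'), (2, 'b@example.com');
INSERT INTO USER_META (USER_ID, NAME, VALUE) VALUES
    (1, 'FIRST_NAME', 'Ann'), (1, 'LAST_NAME', 'Lee'), (1, 'CITY', 'Oslo'),
    (2, 'FIRST_NAME', 'Bob');
INSERT INTO USER_SCORES (USER_ID, GAME_ID, SCORE, CREATED_DATE) VALUES
    (1, 1, 50, '2020-01-01'), (1, 2, 70, '2020-02-01');
""")

report = conn.execute("""
SELECT u.USER_ID, u.EMAIL, m.first_name, m.last_name, m.city, s.high_score, s.last_game
FROM USER u
LEFT JOIN (
    -- one row per user: each meta NAME becomes a column
    SELECT USER_ID,
           MAX(CASE WHEN NAME = 'FIRST_NAME' THEN VALUE END) AS first_name,
           MAX(CASE WHEN NAME = 'LAST_NAME'  THEN VALUE END) AS last_name,
           MAX(CASE WHEN NAME = 'CITY'       THEN VALUE END) AS city
    FROM USER_META
    GROUP BY USER_ID
) m ON m.USER_ID = u.USER_ID
LEFT JOIN (
    -- one row per user: best score and most recent game date
    SELECT USER_ID, MAX(SCORE) AS high_score, MAX(CREATED_DATE) AS last_game
    FROM USER_SCORES
    GROUP BY USER_ID
) s ON s.USER_ID = u.USER_ID
ORDER BY u.USER_ID
""").fetchall()

for row in report:
    print(row)
```

Aggregating each child table separately avoids multiplying meta rows by score rows, which is what happens when both tables are joined to USER before a single GROUP BY.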
It seems pretty simple: I have a table 'question', which stores a list of all questions, and a many-to-many table called 'question_answer', which sits between 'question' and 'user'.
Is it possible to write one query that returns all questions from the question table along with the ones a user has answered, with NULL values for the unanswered questions?
question:
| id | question |
question_answer:
| id | question_id | answer | user_id |
I am running this query, but the condition means that only the answered questions are returned. Will I need to resort to a nested select?
SELECT * FROM `question` LEFT JOIN `question_answer`
ON question_answer.question_id = question.id
WHERE user_id = 14583461 GROUP BY question_id
If user_id is a column of the outer-joined table, then your predicate user_id = 14583461 in the WHERE clause discards every row where user_id is NULL, i.e. the rows for unanswered questions. You need to say "user_id = 14583461 or user_id is null". Note that this still omits questions that only other users have answered.
Shouldn't you use RIGHT JOIN?
SELECT * FROM question_answer RIGHT JOIN question ON question_answer.question_id = question.id
WHERE user_id = 14583461 GROUP BY question_id
something like this might help (http://pastie.org/1114844)
drop table if exists users;
create table users
(
user_id int unsigned not null auto_increment primary key,
username varchar(32) not null
)engine=innodb;
drop table if exists question;
create table question
(
question_id int unsigned not null auto_increment primary key,
ques varchar(255) not null
)engine=innodb;
drop table if exists question_ans;
create table question_ans
(
user_id int unsigned not null,
question_id int unsigned not null,
ans varchar(255) not null,
primary key (user_id, question_id)
)engine=innodb;
insert into users (username) values
('user1'),('user2'),('user3'),('user4');
insert into question (ques) values
('question1 ?'),('question2 ?'),('question3 ?');
insert into question_ans (user_id,question_id,ans) values
(1,1,'foo'), (1,2,'mysql'), (1,3,'php'),
(2,1,'bar'), (2,2,'oracle'),
(3,1,'foobar');
select
u.*,
q.*,
a.ans
from users u
cross join question q
left outer join question_ans a on a.user_id = u.user_id and a.question_id = q.question_id
order by
u.user_id,
q.question_id;
select
u.*,
q.*,
a.ans
from users u
cross join question q
left outer join question_ans a on a.user_id = u.user_id and a.question_id = q.question_id
where
u.user_id = 2
order by
q.question_id;
edit: added some stats/explain plan & runtime:
runtime: 0.031 (10,000 users, 1000 questions, 3.5 million answers)
select count(*) from users
count(*)
========
10000
select count(*) from question
count(*)
========
1000
select count(*) from question_ans
count(*)
========
3682482
explain
select
u.*,
q.*,
a.ans
from users u
cross join question q
left outer join question_ans a on a.user_id = u.user_id and a.question_id = q.question_id
where
u.user_id = 256
order by
u.user_id,
q.question_id;
id  select_type  table  type    possible_keys  key      key_len  ref                         rows  Extra
==  ===========  =====  ====    =============  ===      =======  ===                         ====  =====
1   SIMPLE       u      const   PRIMARY        PRIMARY  4        const                       1     Using filesort
1   SIMPLE       q      ALL                                                                  687
1   SIMPLE       a      eq_ref  PRIMARY        PRIMARY  8        const,foo_db.q.question_id  1
Move the user_id predicate into the join condition. This will then ensure that all rows from question are returned, but only rows from question_answer with the specified user ID and question ID.
SELECT * FROM question
LEFT JOIN question_answer ON question_answer.question_id = question.id
AND user_id = 14583461
ORDER BY user_id, question_id
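The difference between filtering in WHERE and filtering in the join condition is easy to see on a tiny data set. The sketch below uses Python's sqlite3 module; the three questions and the second user (id 99) are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE question (id INTEGER PRIMARY KEY, question TEXT);
CREATE TABLE question_answer (id INTEGER PRIMARY KEY, question_id INTEGER,
                              answer TEXT, user_id INTEGER);
-- Hypothetical data: our user answered q1, another user answered q2, nobody answered q3
INSERT INTO question VALUES (1, 'q1'), (2, 'q2'), (3, 'q3');
INSERT INTO question_answer VALUES (1, 1, 'a', 14583461), (2, 2, 'b', 99);
""")

base = """
SELECT q.id, qa.answer
FROM question q
LEFT JOIN question_answer qa ON qa.question_id = q.id {on_extra}
{where}
ORDER BY q.id
"""
uid = 14583461

# Filter in WHERE: the NULL-extended rows are thrown away again
where_rows = conn.execute(
    base.format(on_extra="", where="WHERE qa.user_id = ?"), (uid,)).fetchall()

# WHERE ... OR IS NULL: q2 is still lost because someone else answered it
or_null_rows = conn.execute(
    base.format(on_extra="", where="WHERE qa.user_id = ? OR qa.user_id IS NULL"),
    (uid,)).fetchall()

# Filter in ON: every question is kept, answers only for this user
on_rows = conn.execute(
    base.format(on_extra="AND qa.user_id = ?", where=""), (uid,)).fetchall()

print(where_rows)    # [(1, 'a')]
print(or_null_rows)  # [(1, 'a'), (3, None)]
print(on_rows)       # [(1, 'a'), (2, None), (3, None)]
```

Only the version with the predicate in the ON clause returns all three questions.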
I have a candidate table, say candidates, with only an id field, and I left-joined the profiles table to it. The profiles table has 2 fields, namely candidate_id and name.
e.g. Table candidates:
id
----
1
2
and Table profiles:
candidate_id name
----------------------------
1 Foobar
1 Foobar2
2 Foobar3
I want the latest name of each candidate in a single query. My attempt is given below:
SELECT C.id, P.name
FROM candidates C
LEFT JOIN profiles P ON P.candidate_id = C.id
GROUP BY C.id
ORDER BY P.name;
But this query returns:
1 Foobar
2 Foobar3
...Instead of:
1 Foobar2
2 Foobar3
The problem is that your PROFILES table doesn't provide a reliable means of figuring out what the latest name value is. There are two options for the PROFILES table:
Add a datetime column, e.g. created_date
Define an auto_increment column
The first option is the best - it's explicit, meaning the use of the column is absolutely obvious, and handles backdated entries better.
ALTER TABLE PROFILES ADD COLUMN created_date DATETIME
If you want the value to default to the current date & time when inserting a record if no value is provided, tack the following on to the end:
DEFAULT CURRENT_TIMESTAMP
With that in place, you'd use the following to get your desired result:
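SELECT c.id,
       p.name
  FROM CANDIDATES c
-- find each candidate's latest date first, then LEFT JOIN the matching profile,
-- so candidates without any profile are still returned
LEFT JOIN (SELECT x.candidate_id,
                  MAX(x.created_date) AS max_date
             FROM PROFILES x
         GROUP BY x.candidate_id) y ON y.candidate_id = c.id
LEFT JOIN PROFILES p ON p.candidate_id = y.candidate_id
                    AND p.created_date = y.max_date
GROUP BY c.id
ORDER BY p.name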
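Use a subquery (ordering by name only works here because the newer names happen to sort last):
SELECT C.id, (SELECT P.name FROM profiles P WHERE P.candidate_id = C.id ORDER BY P.name DESC LIMIT 1) AS name
FROM candidates C;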
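With a created_date column in place, "latest name per candidate" can be written as a LEFT JOIN whose ON clause picks the row with the maximum date. This is a sketch using Python's sqlite3 module; the dates and the third candidate (who has no profile) are assumptions added to show that such candidates are still returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE candidates (id INTEGER PRIMARY KEY);
CREATE TABLE profiles (candidate_id INTEGER, name TEXT, created_date TEXT);
INSERT INTO candidates VALUES (1), (2), (3);
-- Hypothetical dates; candidate 3 deliberately has no profile
INSERT INTO profiles VALUES
    (1, 'Foobar',  '2020-01-01'),
    (1, 'Foobar2', '2020-02-01'),
    (2, 'Foobar3', '2020-01-15');
""")

latest = conn.execute("""
SELECT c.id, p.name
FROM candidates c
LEFT JOIN profiles p
       ON p.candidate_id = c.id
      -- keep only the most recent profile row for this candidate
      AND p.created_date = (SELECT MAX(x.created_date)
                            FROM profiles x
                            WHERE x.candidate_id = c.id)
ORDER BY c.id
""").fetchall()

print(latest)  # [(1, 'Foobar2'), (2, 'Foobar3'), (3, None)]
```

If two profile rows for one candidate share the same created_date, both are returned, so a tie-breaker (for example an auto-increment id) may be needed.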