Can't merge two queries with different columns - sql

I'm studying SQL and somehow I'm stuck with a question. I have 2 tables ('users' and 'follows').
Follows Table
user_id  follows  date
-------  -------  ----------
1        2        1993-09-01
2        1        1989-01-01
3        1        1993-07-01
2        3        1994-10-10
3        2        1995-03-01
4        2        1988-08-08
4        1        1988-08-08
1        4        1994-04-02
1        5        2000-01-01
5        1        2000-01-02
5        6        1986-01-10
7        1        1990-02-02
1        7        1996-10-01
1        8        1993-09-03
8        1        1995-09-01
8        9        1995-09-01
9        8        1996-01-10
7        8        1993-09-01
3        9        1996-05-30
4        9        1996-05-30
Users Table
user_id  first_name  last_name  school
-------  ----------  ---------  ----------
1        Harry       Potter     Gryffindor
2        Ron         Wesley     Gryffindor
3        Hermonie    Granger    Gryffindor
4        Ginny       Weasley    Gryffindor
5        Draco       Malfoy     Slytherin
6        Tom         Riddle     Slytherin
7        Luna        Lovegood   Ravenclaw
8        Cho         Chang      Ravenclaw
9        Cedric      Diggory    Hufflepuff
I need to list all rows from follows where someone from one house follows someone from a different house. I tried to write 2 queries, one to get the house related to each follows.user_id and another to get the house related to each follows.follows, and then "merge" them:
select a.nome_id, a.user_id_house, b.follows_id, b.follows_house
from ( select follows.user_id as nome_id
, users.house as user_id_house
from follows inner join users
ON users.user_id = follows.user_id
) as a,
( select follows.follows as follows_id
, users.house as follows_house
from follows inner join users
ON follows.user_id = users.user_id
) as b
where a.user_id_house <> b.follows_house;
The problem is that the result has something like 400 rows, which is not right. I have no idea how to solve this.

Try this:
SELECT follows.user_id, users.school, followers.user_id, followers.school
FROM follows
JOIN users ON follows.user_id = users.user_id
JOIN users AS followers ON follows.follows = followers.user_id
WHERE users.school <> followers.school
Note: pay attention to the naming in my answer.
Thanks to Thorsten Kettner for the correction.
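For anyone who wants to check the row count, here is a minimal sketch of the same double join, run against the sample data. It assumes SQLite through Python's sqlite3 module (the question does not name an engine), and the aliases follower and followed are my own illustrative names. The original attempt lists derived tables a and b with a comma and no join condition, so every row of a is paired with every row of b (20 x 20 = 400 candidate pairs before the WHERE filter), which is why the result was so large.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, first_name TEXT,
                    last_name TEXT, school TEXT);
CREATE TABLE follows (user_id INTEGER, follows INTEGER, date TEXT);
""")

users = [
    (1, "Harry", "Potter", "Gryffindor"), (2, "Ron", "Wesley", "Gryffindor"),
    (3, "Hermonie", "Granger", "Gryffindor"), (4, "Ginny", "Weasley", "Gryffindor"),
    (5, "Draco", "Malfoy", "Slytherin"), (6, "Tom", "Riddle", "Slytherin"),
    (7, "Luna", "Lovegood", "Ravenclaw"), (8, "Cho", "Chang", "Ravenclaw"),
    (9, "Cedric", "Diggory", "Hufflepuff"),
]
follows = [
    (1, 2, "1993-09-01"), (2, 1, "1989-01-01"), (3, 1, "1993-07-01"),
    (2, 3, "1994-10-10"), (3, 2, "1995-03-01"), (4, 2, "1988-08-08"),
    (4, 1, "1988-08-08"), (1, 4, "1994-04-02"), (1, 5, "2000-01-01"),
    (5, 1, "2000-01-02"), (5, 6, "1986-01-10"), (7, 1, "1990-02-02"),
    (1, 7, "1996-10-01"), (1, 8, "1993-09-03"), (8, 1, "1995-09-01"),
    (8, 9, "1995-09-01"), (9, 8, "1996-01-10"), (7, 8, "1993-09-01"),
    (3, 9, "1996-05-30"), (4, 9, "1996-05-30"),
]
conn.executemany("INSERT INTO users VALUES (?, ?, ?, ?)", users)
conn.executemany("INSERT INTO follows VALUES (?, ?, ?)", follows)

# Join users twice: once for the person following, once for the person followed.
# The aliases "follower" and "followed" are illustrative, not from the thread.
cross_house = conn.execute("""
    SELECT f.user_id, follower.school, f.follows, followed.school
    FROM follows AS f
    JOIN users AS follower ON follower.user_id = f.user_id
    JOIN users AS followed ON followed.user_id = f.follows
    WHERE follower.school <> followed.school
    ORDER BY f.user_id, f.follows
""").fetchall()

for row in cross_house:
    print(row)
```

Each row of follows appears at most once in the output because both joins match on a primary key.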

Related

t-sql merging two tables and replace null values

I have these two tables, and I want to compare them to find out whether there is any null value in table 2. If there is, I want to replace the null value in table 2 with the existing value from table 1 (matching rows by the Code column, which is the primary key).
Table 1
Code Name Points
1 Juan Perez 10
2 Marco Salgado 5
3 Carlos Soto 9
4 Alberto Ruiz 12
5 Alejandro Castro 5
10 Jonatan Polanco 0
11 JD NULL
Table 2
Code Name Points
1 Juan Perez 10
2 Marco Salgado 5
3 Carlos Soto 9
4 Alberto Ruiz 12
5 Alejandro Castro 5
10 Null 0
11 JD 9
The resulting table should look like this
Table 2
Code Name Points
1 Juan Perez 10
2 Marco Salgado 5
3 Carlos Soto 9
4 Alberto Ruiz 12
5 Alejandro Castro 5
10 Jonatan Polanco 0
11 JD 9
If you are trying to update the rows that have null values in the Points column, you just need to join the two tables and add a WHERE clause to limit the rows to the ones you want to update. Something like this:
UPDATE t2
SET Points = t.Points
FROM table_1 t
JOIN table_2 t2
ON t.code = t2.code
WHERE t2.Points IS NULL
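In the sample data the null in table 2 is actually in the Name column (Code 10), so a variant that fills either column may be closer to the expected result. The following is only a sketch under a few assumptions: it runs on SQLite through Python's sqlite3 module so it can be tried standalone, it swaps the T-SQL UPDATE ... FROM join for correlated subqueries, and the lowercase table and column names are mine. COALESCE keeps the table 2 value when it is not null and falls back to table 1 otherwise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_1 (code INTEGER PRIMARY KEY, name TEXT, points INTEGER);
CREATE TABLE table_2 (code INTEGER PRIMARY KEY, name TEXT, points INTEGER);
""")

common = [
    (1, "Juan Perez", 10), (2, "Marco Salgado", 5), (3, "Carlos Soto", 9),
    (4, "Alberto Ruiz", 12), (5, "Alejandro Castro", 5),
]
table_1 = common + [(10, "Jonatan Polanco", 0), (11, "JD", None)]
table_2 = common + [(10, None, 0), (11, "JD", 9)]
conn.executemany("INSERT INTO table_1 VALUES (?, ?, ?)", table_1)
conn.executemany("INSERT INTO table_2 VALUES (?, ?, ?)", table_2)

# Fill only the nulls in table_2, taking the value from the row in table_1
# that has the same code. Non-null values in table_2 are left untouched.
conn.execute("""
    UPDATE table_2
    SET name = COALESCE(name,
                        (SELECT t1.name FROM table_1 AS t1
                         WHERE t1.code = table_2.code)),
        points = COALESCE(points,
                          (SELECT t1.points FROM table_1 AS t1
                           WHERE t1.code = table_2.code))
    WHERE name IS NULL OR points IS NULL
""")

result = conn.execute(
    "SELECT code, name, points FROM table_2 ORDER BY code"
).fetchall()
for row in result:
    print(row)
```

In T-SQL the same idea can be written with the UPDATE ... FROM ... JOIN form shown above, using COALESCE(t2.Name, t.Name) in the SET clause.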

In a game show database scenario, how do I fetch the average total episode score per season in a single query?

Pardon the title gore. I'm having trouble finding a good way to express my question, which is endemic to the problem.
The Tables
season
id name
---- ------
1 Season 1
2 Season 2
3 Season 3
episode
id season_id number title
---- ----------- -------- ---------------------------------------
1 1 1 Pilot
2 1 2 1x02 - We Got Picked Up
3 1 3 1x03 - This is the Third Episode
4 2 1 2x01 - We didn't get cancelled.
5 2 2 2x02 - We're running out of ideas!
6 3 1 3x01 - We're still here.
7 3 2 3x02 - Okay, this game show is dying.
8 3 3 3x03 - Untitled
score
id episode_id score contestant_id (table not given)
---- ------------ ------- ---------------------------------
1 1 35 1
2 1 -12 2
3 1 8 3
4 1 5 4
5 2 13 1
6 2 -2 5
7 2 3 3
8 2 -14 6
9 3 -14.5 1
10 3 -3 2
11 3 1.5 7
12 3 9.5 5
13 4 22.8 1
14 4 -3 8
15 5 2 1
16 5 13.5 9
17 5 7 3
18 6 13 1
19 6 -84 10
20 6 12 11
21 7 3 1
22 7 10 2
23 8 29 1
24 8 1 5
As you can see, you have multiple episodes per season, and multiple scores per episode (one score per contestant). Contestants can reappear in later episodes (irrelevant), scores are floating point values, and there can be an arbitrary number of scores per episode.
So what am I looking for?
I'd like to get the average total episode score per season, where the total episode score is the sum of all the scores in an episode. Mathematically, this comes out to be the sum of all scores in a season divided by the number of episodes. Easy enough to comprehend, but I have had trouble doing it in a single query and getting the correct result. I'd like an output like the following:
name average_total_episode_score
---------- -----------------------------
Season 1 9.83
Season 2 21.15
Season 3 -5.33
The top-level query needs to be on the season table as it will be combined with other, similar queries on the same table. It's easy enough to do this with an aggregate in a subquery, but an aggregation executes the subquery, failing my single-query requirement. Can this be done in a single query?
I hope this works:
Select s.id, avg(score)
FROM Season S,
Episode e,
Score sc
WHERE s.id = e.season_id
AND e.id = sc.episode_id
Group by s.id
Okay, just figured it out. As usual, I had to write and post a whole book before the simple solution descended upon me.
The problem in my query (which I didn't give in the question) was the lack of a DISTINCT count. Here is a working query:
SELECT
"season"."id",
"season"."name",
(SUM("score"."score") / COUNT(DISTINCT "episode"."id")) AS "average_total_episode_score"
FROM "season"
LEFT OUTER JOIN "episode"
ON ("season"."id" = "episode"."season_id")
LEFT OUTER JOIN "score"
ON ("episode"."id" = "score"."episode_id")
GROUP BY "season"."id"
select se.id AS Season_Id, sum(score) As season_score, avg(score)
from score s
join episode e ON s.episode_id = e.id
join season se ON se.id = e.season_id
group by se.id
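To compare the two approaches in this thread on the sample data, here is a small sketch, assuming SQLite through Python's sqlite3 module (the quoted identifiers in the question suggest another engine, so treat this purely as an illustration). The episode title and contestant_id columns are left out because they do not affect the result. It computes SUM / COUNT(DISTINCT episode) next to a plain AVG(score); the plain average is taken over individual score rows rather than over episode totals, so the two columns differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# score is declared REAL so that SUM(...) / COUNT(...) is not integer division
# in SQLite when a season happens to contain only whole-number scores.
conn.executescript("""
CREATE TABLE season (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE episode (id INTEGER PRIMARY KEY, season_id INTEGER, number INTEGER);
CREATE TABLE score (id INTEGER PRIMARY KEY, episode_id INTEGER, score REAL);
""")

seasons = [(1, "Season 1"), (2, "Season 2"), (3, "Season 3")]
episodes = [(1, 1, 1), (2, 1, 2), (3, 1, 3), (4, 2, 1),
            (5, 2, 2), (6, 3, 1), (7, 3, 2), (8, 3, 3)]
scores = [  # (episode_id, score)
    (1, 35), (1, -12), (1, 8), (1, 5), (2, 13), (2, -2), (2, 3), (2, -14),
    (3, -14.5), (3, -3), (3, 1.5), (3, 9.5), (4, 22.8), (4, -3),
    (5, 2), (5, 13.5), (5, 7), (6, 13), (6, -84), (6, 12),
    (7, 3), (7, 10), (8, 29), (8, 1),
]
conn.executemany("INSERT INTO season VALUES (?, ?)", seasons)
conn.executemany("INSERT INTO episode VALUES (?, ?, ?)", episodes)
conn.executemany("INSERT INTO score (episode_id, score) VALUES (?, ?)", scores)

rows = conn.execute("""
    SELECT s.name,
           SUM(sc.score) / COUNT(DISTINCT e.id) AS average_total_episode_score,
           AVG(sc.score) AS average_single_score
    FROM season AS s
    LEFT JOIN episode AS e ON e.season_id = s.id
    LEFT JOIN score AS sc ON sc.episode_id = e.id
    GROUP BY s.id, s.name
    ORDER BY s.id
""").fetchall()

for name, per_episode, per_score in rows:
    print(name, round(per_episode, 2), round(per_score, 2))
```

COUNT(DISTINCT e.id) is what keeps the join to score from inflating the episode count, since each episode appears once per score row after the join.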

Adding missing information to a table? (considering random start and end months)

I have the following table spanishcourse, representing the grades of students in a Spanish course. This is a school with courses starting every month, and students can start and leave the course at any point during the year (the month in and month out columns respectively). Some of these students were absent in some months, and the grade for a month of absence should be 0. The problem is that when a student was absent, the table has no row for that month at all, so the grade of 0 is not shown (column grades).
month in  month out  month  student  grades
3         9          3      John     10
3         9          5      John     8
3         9          6      John     4
3         9          7      John     3
3         9          9      John     7
2         7          2      Mary     9
2         7          3      Mary     2
2         7          6      Mary     6
2         7          7      Mary     9
1         3          1      Jane     8
1         3          2      Jane     7
1         3          3      Jane     5
6         10         6      Rick     9
6         10         8      Rick     1
6         10         10     Rick     3
The output that I need is shown here for Rick only:
month in  month out  month  student  grades
6         10         6      Rick     9
6         10         7      Rick     0
6         10         8      Rick     1
6         10         9      Rick     0
6         10         10     Rick     3
Conclusion: I only need to add the missing months between the start and the end of each student's course. In Rick's example, only months 7 and 9 are added, each with grade 0. Can anyone help me, please?
PS: I have already seen some other answered questions. They do not apply here because they assume the data always runs from 1 to n, and they do not handle arbitrary start and end months as in this example.
You can do this using cross join and left outer join. The cross join generates all combinations between students and months. The left outer join brings in the data for the matching records. Records that don't match get a grade of 0.
The following assumes that some student somewhere has a grade in each month:
select s.month_in, s.month_out, m.month, s.student,
       coalesce(sc.grades, 0) as grades
from (select distinct student, month_in, month_out from spanishcourse) s cross join
     (select distinct month from spanishcourse) m left outer join
     spanishcourse sc
     on sc.student = s.student and sc.month = m.month
-- keep only the months inside each student's own enrollment period
where m.month between s.month_in and s.month_out;
SQL Fiddle
select s.month_in, s.month_out, month, student, coalesce(grades, 0)
from
spanishcourse sc
right join
(
select distinct
student, month_in, month_out,
generate_series(month_in, month_out, 1) as month
from spanishcourse
) s using (student, month)
order by student, month
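generate_series is a PostgreSQL function, so on an engine without it the month range can be built with a recursive CTE instead. Below is a minimal sketch of that alternative, assuming SQLite through Python's sqlite3 module and assuming the columns are named month_in and month_out with underscores, as in the answers above. The CTE name enrolled is just an illustrative choice. It starts each student at month_in and adds one month at a time until month_out, then left joins the real grades.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE spanishcourse (month_in INTEGER, month_out INTEGER,
                                month INTEGER, student TEXT, grades INTEGER)
""")
data = [
    (3, 9, 3, "John", 10), (3, 9, 5, "John", 8), (3, 9, 6, "John", 4),
    (3, 9, 7, "John", 3), (3, 9, 9, "John", 7),
    (2, 7, 2, "Mary", 9), (2, 7, 3, "Mary", 2), (2, 7, 6, "Mary", 6),
    (2, 7, 7, "Mary", 9),
    (1, 3, 1, "Jane", 8), (1, 3, 2, "Jane", 7), (1, 3, 3, "Jane", 5),
    (6, 10, 6, "Rick", 9), (6, 10, 8, "Rick", 1), (6, 10, 10, "Rick", 3),
]
conn.executemany("INSERT INTO spanishcourse VALUES (?, ?, ?, ?, ?)", data)

filled = conn.execute("""
    WITH RECURSIVE enrolled(student, month_in, month_out, month) AS (
        -- one starting row per student, at the first month of enrollment
        SELECT DISTINCT student, month_in, month_out, month_in
        FROM spanishcourse
        UNION ALL
        -- step forward one month until the last month of enrollment
        SELECT student, month_in, month_out, month + 1
        FROM enrolled
        WHERE month < month_out
    )
    SELECT e.month_in, e.month_out, e.month, e.student,
           COALESCE(sc.grades, 0) AS grades
    FROM enrolled AS e
    LEFT JOIN spanishcourse AS sc
           ON sc.student = e.student AND sc.month = e.month
    ORDER BY e.student, e.month
""").fetchall()

rick = [row for row in filled if row[3] == "Rick"]
for row in rick:
    print(row)
```

Unlike the cross join approach, this does not depend on every month appearing somewhere in the table.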

SQL: Create view from 2 tables printing null values when no records

I have in my DB these 2 tables:
LESSONS           RATINGS
ID | NAME         ID | LESSON | RATING
1    lesson1      1    1        4
2    lesson2      2    2        2
3    lesson3      3    1        5
4    lesson4      4    4        2
5    lesson5      5    3        1
6    lesson6      6    2        5
7    lesson7      7    6        3
And I want a View that show me something like this:
LESSONS_RATINGS
IDL| NAME | RATING
1 lesson1 4.5
2 lesson2 3.5
3 lesson3 1
4 lesson4 2
5 lesson5 NULL
6 lesson6 3
7 lesson7 NULL
But what I've been able to get so far is this:
LESSONS_RATINGS
IDL| NAME | RATING
1 lesson1 4.5
2 lesson2 3.5
3 lesson3 1
4 lesson4 2
6 lesson6 3
Notice that the NULL records are missing. That's because table RATINGS has no records for lessons 5 and 7. I'm doing this:
CREATE OR REPLACE VIEW `LESSONS_RATINGS` AS
select
`l`.`ID` AS `IDL`,
`l`.`NAME` AS `NAME`,
CASE WHEN AVG(`lr`.`RATING`) IS NULL THEN NULL ELSE AVG(`lr`.`RATING`) END AS `RATING`
from
`LESSONS` AS `l`,
`RATINGS` AS `lr`
where
(`l`.`ID` = `lr`.`LESSON`)
group by `l`.`ID`;
Use an OUTER JOIN:
select
`l`.`ID` AS `IDL`,
`l`.`NAME` AS `NAME`,
AVG(`lr`.`RATING`) AS `RATING`
from
`LESSONS` AS `l` LEFT JOIN `RATINGS` AS `lr`
ON `l`.`ID` = `lr`.`LESSON`
group by `l`.`ID`;
Also, I don't think there is a need for the Case statement -- you can just use AVG(lr.rating).
A Visual Explanation of SQL Joins
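Here is a small runnable sketch of the LEFT JOIN view on the sample data, assuming SQLite through Python's sqlite3 module rather than MySQL (so no backtick quoting) and joining on the LESSON column of RATINGS, which is what the expected averages imply. Lessons with no ratings keep their row, and AVG over no rows yields NULL, which Python shows as None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE lessons (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE ratings (id INTEGER PRIMARY KEY, lesson INTEGER, rating INTEGER);
""")
conn.executemany(
    "INSERT INTO lessons VALUES (?, ?)",
    [(i, f"lesson{i}") for i in range(1, 8)],
)
ratings = [(1, 1, 4), (2, 2, 2), (3, 1, 5), (4, 4, 2),
           (5, 3, 1), (6, 2, 5), (7, 6, 3)]
conn.executemany("INSERT INTO ratings VALUES (?, ?, ?)", ratings)

# LEFT JOIN keeps every lesson; unmatched lessons get NULL rating columns,
# and AVG ignores NULLs, so their average is NULL.
conn.execute("""
    CREATE VIEW lessons_ratings AS
    SELECT l.id AS idl, l.name AS name, AVG(r.rating) AS rating
    FROM lessons AS l
    LEFT JOIN ratings AS r ON r.lesson = l.id
    GROUP BY l.id, l.name
""")

view_rows = conn.execute(
    "SELECT idl, name, rating FROM lessons_ratings ORDER BY idl"
).fetchall()
for row in view_rows:
    print(row)
```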

How to select faulty records?

I'm investigating an error in one of the tables of our geographical database. Given the table below, each DistrictName should always be paired with the same DisId (i.e. Bronx = 11, Manhatten = 14), but some records have a different DisId while still sharing the same DistrictName.
Id DistrictName DisId Section
------------------------------------------------
1 Bronx 11 1
2 Bronx 11 2
3 Brooklyn 12 1
4 Brooklyn 13 2 //wrong
5 Manhatten 14 1
6 Manhatten 14 2
7 Queens 15 1
8 Queens 16 2 //wrong
9 Queens 17 3 //wrong
How can I select all faulty records in a query?
There is always a Section 1, so records with a section > 1 containing the same DistrictName but having a deviating DisId are the ones I'm looking for.
I've tried using a group by (districtname) but I'm having difficulties comparing with the section1 record. I'm kind of lost when it comes to putting the logic in the having or where clause. Any help appreciated!
select * from your_table
where section > 1
and districtname in
(
select districtname
from your_table
group by districtname
having count(distinct disid) > 1
)
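Note that the query above returns every Section > 1 row of a district that has more than one DisId, so it would also return a row that agrees with Section 1 if some other row in the same district deviates. Since there is always a Section 1, another option is to compare each row directly with its district's Section 1 row. This is only a sketch, assuming SQLite through Python's sqlite3 module and a hypothetical table name districts:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE districts (id INTEGER PRIMARY KEY, districtname TEXT,
                            disid INTEGER, section INTEGER)
""")
records = [
    (1, "Bronx", 11, 1), (2, "Bronx", 11, 2),
    (3, "Brooklyn", 12, 1), (4, "Brooklyn", 13, 2),
    (5, "Manhatten", 14, 1), (6, "Manhatten", 14, 2),
    (7, "Queens", 15, 1), (8, "Queens", 16, 2), (9, "Queens", 17, 3),
]
conn.executemany("INSERT INTO districts VALUES (?, ?, ?, ?)", records)

# Self-join: "ref" is the Section 1 row of the same district, treated as the
# reference. A row is faulty when its DisId differs from the reference DisId.
faulty = conn.execute("""
    SELECT d.id, d.districtname, d.disid, d.section
    FROM districts AS d
    JOIN districts AS ref
      ON ref.districtname = d.districtname AND ref.section = 1
    WHERE d.section > 1 AND d.disid <> ref.disid
    ORDER BY d.id
""").fetchall()

for row in faulty:
    print(row)
```

On the sample data both approaches select the same three rows; they only differ when a district mixes correct and deviating rows above Section 1.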