Unify columns from different tables while selecting distinct rows - sql

Tables
User
id | name | email | is_active
1 | john | john#albert.com | FALSE
2 | mike | mike#ss.com | TRUE
3 | monica | monica#dunno.com | TRUE
4 | joey | joey#as.com | FALSE
5 | ross | ross#boss.com | FALSE
Subscriptions
id | house_id | plan name | status
1 | 1 | A banana a month | inactive
2 | 2 | An apple a month | active
3 | 3 | A pear a month | active
House
id | name
1 | John's House
2 | Mike's House
3 | Monica's House
4 | Joey's House
5 | Ross's House
House_Contact (legacy table)
id | house_id | is_primary
1 | 1 | TRUE
2 | 2 | FALSE
2 | 3 | TRUE
House_User (new table)
id | house_id | is_owner | user_id
1 | 2 | FALSE | 2
2 | 4 | FALSE | 4
3 | 5 | FALSE | 5
Expected Results
The resulting table should include the following:
Does the user have a subscription regardless of status? If so, include, if not, disregard.
Get email & is_active from User table (if they have subscription)
Get is_primary OR is_owner (if they have a subscription)
Results should be distinct (no duplicate users)
house_id | email | is_owner | is_active
1 | john#albert.com | TRUE | FALSE
2 | mike#ss.com | FALSE | TRUE
3 | monica#dunno.com | TRUE | TRUE
What I tried
SELECT
u.email AS "email",
u.is_active AS "is_active",
h.id AS "house_id",
is_owner
FROM
house c
INNER JOIN (
SELECT
house_id,
user_id
FROM
house_user) hu ON h.id = hu.house_id
INNER JOIN (
SELECT
id,
email,
is_active
FROM
USER) u ON hu.user_id = u.id
INNER JOIN (
SELECT
id,
email,
is_primary
FROM
house_contact) hc ON u.email = ch.email
INNER JOIN (
SELECT
house_id,
is_primary is_owner
FROM
house_contact
UNION
SELECT
house_id,
is_owner is_owner
FROM
house_user) t ON u.id = t.house_id)
ORDER BY
u.email
The query returns half as many rows as it does when I remove the INNER JOIN on the UNION subquery. I have no idea how to proceed.
I'm particularly confused about how to unify the two columns and how to avoid possible duplication.

My educated guess:
SELECT DISTINCT ON (u.id)
u.id, u.email, u.is_active, h.house_id, h.is_primary
FROM "user" u
LEFT JOIN (
SELECT hu.user_id, hu.house_id
, GREATEST(hc.is_primary, hu.is_owner) AS is_primary
FROM house_user hu
LEFT JOIN house_contact hc USING (house_id)
WHERE EXISTS (SELECT FROM subscription WHERE house_id = hu.house_id)
) h ON h.user_id = u.id
ORDER BY u.id, h.is_primary DESC NULLS LAST, h.house_id;
We don't need table house in the query at all.
I see three possible sources of conflict:
house_contact.is_primary vs. house_user.is_owner. Both seem to mean the same thing, so the DB design is broken in this respect. I take GREATEST() of both, which yields true if either is true.
We don't care about subscription.status, so just make sure the house has at least one subscription of any kind with EXISTS, thereby avoiding possible duplicates a priori.
A user can live in multiple houses. We want only one row per user. So show the first house with is_primary (the one with the smallest house_id) if any. If there is no house, there is also no subscription. But the outer LEFT JOIN keeps the user in the result. Change to JOIN to skip users without subscription.
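The second point (EXISTS avoids duplicates that a plain join would produce) can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module purely to make the SQL runnable; the rows are hypothetical (a house with two subscriptions is not in the question's data), and the table name subscription follows the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE house_user (user_id INTEGER, house_id INTEGER);
    CREATE TABLE subscription (id INTEGER, house_id INTEGER, status TEXT);
    INSERT INTO house_user VALUES (2, 2), (4, 4);
    -- Hypothetical data: house 2 has two subscriptions, house 4 has none.
    INSERT INTO subscription VALUES (1, 2, 'active'), (2, 2, 'inactive');
""")

# A plain join returns one row per matching subscription.
joined = conn.execute("""
    SELECT hu.user_id, hu.house_id
    FROM house_user hu
    JOIN subscription s ON s.house_id = hu.house_id
    ORDER BY hu.user_id
""").fetchall()

# EXISTS only checks that at least one subscription exists, so no row is multiplied.
semi_joined = conn.execute("""
    SELECT hu.user_id, hu.house_id
    FROM house_user hu
    WHERE EXISTS (SELECT 1 FROM subscription s WHERE s.house_id = hu.house_id)
    ORDER BY hu.user_id
""").fetchall()

print(joined)       # user 2 appears twice
print(semi_joined)  # user 2 appears once
```

In both variants user 4 is dropped, because house 4 has no subscription at all.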
About DISTINCT ON:
Select first row in each GROUP BY group?
About sorting boolean values:
Sorting null values after all others, except special
Sort NULL values to the end of a table

You can use the joins as follows:
Select distinct hu.house_id, u.email, hu.is_owner, hc.is_primary
From user u join house_user hu on u.id = hu.user_id
Join subscriptions s on s.house_id = hu.house_id
Join house_contact hc on hc.house_id = s.house_id;
I have used DISTINCT to remove duplicates in case the tables contain multiple rows matching the join conditions. You can remove it if it is not required.

From what I can tell, you want to start with a query like this:
select s.house_id, u.email, hu.is_owner, u.is_active
from subscriptions s left join
house_user hu
on s.house_id = hu.house_id left join
users u
on hu.user_id = u.id;
This does not return what you want, but it is rather unclear how your results are derived.

Related

How to sum up max values from another table with some filtering

I have 3 tables
User Table
id | Name
1 | Mike
2 | Sam
Score Table
id | UserId | CourseId | Score
1 | 1 | 1 | 5
2 | 1 | 1 | 10
3 | 1 | 2 | 5
Course Table
id | Name
1 | Course 1
2 | Course 2
What I'm trying to return is rows for each user to display user id and user name along with the sum of the maximum score per course for that user
In the example tables the output I'd like to see is
Result
User_Id | User_Name | Total_Score
1 | Mike | 15
2 | Sam | 0
The SQL I've tried so far is:
select TOP(3) u.Id as User_Id, u.UserName as User_Name, SUM(maxScores) as Total_Score
from Users as u,
(select MAX(s.Score) as maxScores
from Scores as s
inner join Courses as c
on s.CourseId = c.Id
group by s.UserId, c.Id
) x
group by u.Id, u.UserName
I want to use a HAVING clause to link the Users to Scores after the GROUP BY in the subquery, but I get an exception saying:
The multi-part identifier "u.Id" could not be bound
It works if I hard code a user id in the having clause I want to add but it needs to be dynamic and I'm stuck on how to do this
What would be the correct way to structure the query?
You were close: you just needed to return s.UserId from the subquery and join the subquery to your Users table correctly. (I've joined in the reverse order because, to me, it's more logical to start with the base data and then join on more details as required.) Note the scope of aliases: aliases defined inside your subquery are not available in your outer query.
select u.Id as [User_Id], u.UserName as [User_Name]
, sum(maxScore) as Total_Score
from (
select s.UserId, max(s.Score) as maxScore
from Scores as s
inner join Courses as c on s.CourseId = c.Id
group by s.UserId, c.Id
) as x
inner join Users as u on u.Id = x.UserId
group by u.Id, u.UserName;
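Note that the inner join above returns only users who have at least one score, whereas the expected output in the question also lists Sam with a total of 0. One possible variation, sketched here under the assumption that users without scores should appear with 0, is to start from Users, LEFT JOIN the per-course maximums, and wrap the sum in COALESCE. The sketch uses Python's built-in sqlite3 module only to make it runnable, and it leaves out the Courses join because Scores already carries CourseId.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (Id INTEGER, UserName TEXT);
    CREATE TABLE Scores (Id INTEGER, UserId INTEGER, CourseId INTEGER, Score INTEGER);
    INSERT INTO Users VALUES (1, 'Mike'), (2, 'Sam');
    INSERT INTO Scores VALUES (1, 1, 1, 5), (2, 1, 1, 10), (3, 1, 2, 5);
""")

rows = conn.execute("""
    SELECT u.Id AS User_Id, u.UserName AS User_Name,
           COALESCE(SUM(x.maxScore), 0) AS Total_Score  -- NULL sum becomes 0
    FROM Users AS u
    LEFT JOIN (
        -- best score per user and course
        SELECT s.UserId, MAX(s.Score) AS maxScore
        FROM Scores AS s
        GROUP BY s.UserId, s.CourseId
    ) AS x ON x.UserId = u.Id
    GROUP BY u.Id, u.UserName
    ORDER BY u.Id
""").fetchall()

print(rows)
```

With the sample data from the question, Mike gets 10 + 5 = 15 and Sam, who has no scores, gets 0.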

Query sql to get the first occurrence in a many to many relationship

I have a User table that has a many to many relationship with Areas. This relationship is stored in the Rel_User_area table. I want to show the user name and the first area that appears in the list of areas.
Ex.
User
id | Name
1 | Peter
2 | Joe
Area
id | Name
1 | Area A
2 | Area B
3 | Area C
Rel_User_area
iduser | idarea
1 | 1
1 | 3
2 | 3
The result I want:
User Name | Area
Peter | Area A
Joe | Area C
Using the minimum area id to determine "First" you could use a correlated subquery (A subquery that refers to field(s) in the main query to filter results):
SELECT user.name, area.name
FROM
user
INNER JOIN Rel_User_Area RUA ON user.id = RUA.iduser
INNER JOIN Area ON RUA.idarea = area.id
WHERE area.id = (SELECT min(idarea) FROM Rel_User_Area WHERE iduser = RUA.iduser)
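As a quick check of the correlated subquery, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database via Python's built-in sqlite3 module. The table name "User" is quoted as a precaution, since user is a reserved word in some databases; the short aliases u, a, and rua are only for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "User" (id INTEGER, name TEXT);
    CREATE TABLE Area (id INTEGER, name TEXT);
    CREATE TABLE Rel_User_Area (iduser INTEGER, idarea INTEGER);
    INSERT INTO "User" VALUES (1, 'Peter'), (2, 'Joe');
    INSERT INTO Area VALUES (1, 'Area A'), (2, 'Area B'), (3, 'Area C');
    INSERT INTO Rel_User_Area VALUES (1, 1), (1, 3), (2, 3);
""")

rows = conn.execute("""
    SELECT u.name, a.name
    FROM "User" u
    INNER JOIN Rel_User_Area rua ON u.id = rua.iduser
    INNER JOIN Area a ON rua.idarea = a.id
    -- keep only the row whose area id is the smallest one for that user
    WHERE a.id = (SELECT MIN(idarea) FROM Rel_User_Area WHERE iduser = rua.iduser)
    ORDER BY u.id
""").fetchall()

print(rows)
```

Each user comes back exactly once, paired with the area that has the lowest id among that user's areas.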
There are other ways of doing this that may be RDBMS-specific. In Teradata, for example, I would use a QUALIFY clause, which doesn't exist in MySQL, SQL Server, Oracle, Postgres, etc. Regardless of the RDBMS, the query above should work.
SELECT user.name, area.name
FROM
user
INNER JOIN Rel_User_Area RUA ON user.id = RUA.iduser
INNER JOIN Area ON RUA.idarea = area.id
QUALIFY ROW_NUMBER() OVER (PARTITION BY user.id ORDER BY area.id ASC) = 1;
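Using the ID from Rel_user_Area that you mentioned in the comments, this should be pretty platform-independent. The subquery finds the lowest relationship ID per user, and joining back to Rel_user_Area picks up the area stored on that row.
SELECT U.name as Username, A.Name as Area
FROM (SELECT min(ID) minID, IDUser
FROM Rel_user_Area
GROUP BY IDUser) M
INNER JOIN Rel_user_Area UA
on UA.ID = M.minID
INNER JOIN User U
on U.ID = UA.IDuser
INNER JOIN Area A
on A.ID = UA.IDArea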
If CROSS APPLY and TOP are available (as in SQL Server), the following also works. On PostgreSQL or MySQL you would substitute LIMIT 1 for TOP and a LATERAL join for CROSS APPLY.
This runs the CROSS APPLY subquery once for each record in User, so you get the row with the lowest Rel_user_Area ID per user.
SELECT U.name as Username, A.Name as Area
FROM User U
CROSS APPLY (SELECT TOP 1 IDUser, IDArea
FROM Rel_user_Area z
WHERE Z.IDUSER = U.ID
ORDER BY ID ASC) UA
INNER JOIN Area A
on A.ID = UA.IDArea

SQL Inner Join Two Foreign Keys

I have two tables (Users and Pairs). The Pairs table contains 3 columns, an ID and then a user1ID and user2ID.
Users
ID firstName surname
------------------------------
1043 john doe
2056 jane doe
Pairs
ID user1ID user2ID
------------------------------
1 1043 2056
I'm then looking at using a select statement to get the user details based on the ID of the Pairs table:
SELECT users1.*, users2.*
FROM Pairs
JOIN Users users1 ON Pairs.user1ID = users1.IDNumber
JOIN Users users2 ON Pairs.user2ID = users2.IDNumber
WHERE Pairs.ID = 1
Which returns the right details for the two users, however they're all on one row, how can I get it to return each user on a separate row as they are in the Users table?
SELECT Users.*
FROM Pairs
JOIN Users
ON Pairs.user1ID = Users.IDNumber
OR Pairs.user2ID = Users.IDNumber
WHERE Pairs.ID = 1
Just use an OR condition in your ON clause instead of two joins.
IN will work also.
SELECT *
FROM Pairs p
JOIN Users u ON u.ID IN (p.user1ID, p.User2ID)
WHERE p.ID = 1
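To see that a single join with IN returns one row per user rather than one wide row, here is a minimal sketch with the sample data from the question, run against an in-memory SQLite database through Python's built-in sqlite3 module. It assumes the key column in Users is called ID, as in the table shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (ID INTEGER, firstName TEXT, surname TEXT);
    CREATE TABLE Pairs (ID INTEGER, user1ID INTEGER, user2ID INTEGER);
    INSERT INTO Users VALUES (1043, 'john', 'doe'), (2056, 'jane', 'doe');
    INSERT INTO Pairs VALUES (1, 1043, 2056);
""")

rows = conn.execute("""
    SELECT u.ID, u.firstName, u.surname
    FROM Pairs p
    -- a user matches if it is either member of the pair
    JOIN Users u ON u.ID IN (p.user1ID, p.user2ID)
    WHERE p.ID = 1
    ORDER BY u.ID
""").fetchall()

print(rows)
```

The pair with ID 1 yields two rows, one for each user, in the same shape as the Users table.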

SQL query (Join without duplicates)

I have tables users and topics. Every user can have from 0 to several topics (one-to-many relationship).
How can I get only those users who have at least one topic?
I need all columns from users (without columns from topics) and without duplicates in table users. In last column I need number of topics.
UPDATED:
Should be like this:
SELECT user.*, count(topic.id)
FROM ad
LEFT JOIN topic ON user.id = topic.ad
GROUP BY user.id
HAVING count(topic.id) > 0;
but it returns 0 rows, which should not be the case.
Because you have given limited information about your table structure, I will use an example to explain how this works; you should then be able to apply it to your own tables easily.
First, you need two tables (which you have):
Table "user"
id | name
1 | Joe Bloggs
2 | Eddy Ready
Table "topic"
topicid | userid | topic
1 | 1 | Breakfast
2 | 1 | Lunch
3 | 1 | Dinner
Now, asking for a count against each user is done using the following:
SELECT user.name, count(topic.topicid)
FROM user
INNER JOIN topic ON user.id = topic.userid
GROUP BY user.name
If you use a LEFT JOIN, the result will include records from the "user" table that do not have any rows in the "topic" table; if you use an INNER JOIN, it will ONLY include users who have a matching value in both tables.
I.e. because the user id "2" (which we use to join) is not listed in the topic table you will not get any results for this user.
Hope that helps!
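The difference between the two join types can be checked with a minimal sketch that loads the example tables above into an in-memory SQLite database using Python's built-in sqlite3 module. The table name "user" is quoted as a precaution, since it is a reserved word in some databases, and the aliases u and t are only for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "user" (id INTEGER, name TEXT);
    CREATE TABLE topic (topicid INTEGER, userid INTEGER, topic TEXT);
    INSERT INTO "user" VALUES (1, 'Joe Bloggs'), (2, 'Eddy Ready');
    INSERT INTO topic VALUES (1, 1, 'Breakfast'), (2, 1, 'Lunch'), (3, 1, 'Dinner');
""")

query = """
    SELECT u.id, u.name, COUNT(t.topicid)
    FROM "user" u
    {join} JOIN topic t ON u.id = t.userid
    GROUP BY u.id, u.name
    ORDER BY u.id
"""

# INNER JOIN: only users with at least one topic.
inner_rows = conn.execute(query.format(join="INNER")).fetchall()
# LEFT JOIN: every user, with a count of 0 where no topic matches.
left_rows = conn.execute(query.format(join="LEFT")).fetchall()

print(inner_rows)
print(left_rows)
```

Grouping by the user columns gives one row per user, so no DISTINCT is needed, and the INNER JOIN version already satisfies the "at least one topic" requirement without a HAVING clause.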
Use an inner join and DISTINCT:
select distinct user_table.id
from user_table
inner join topics_table on topics_table.user_id = user_table.id
select u.id
, u.name
, count(t.topicName)
from user u
left join topic t on t.userid = u.id
group by u.id, u.name
You can select topic number per user and then join it with user data. Something like this:
with t as
(
select userid, count(*) as n
from topic
group by userid
)
SELECT user.*, t.n
FROM user
JOIN t ON user.id = t.userid

joining tables while keeping the Null values

I have two tables:
Users: ID, first_name, last_name
Networks: user_id, friend_id, status
I want to select all values from the users table but I want to display the status of specific user (say with id=2) while keeping the other ones as NULL. For instance:
If I have users:
id first_name last_name
------------------------
1 John Smith
2 Tom Summers
3 Amy Wilson
And in networks:
user_id friend_id status
------------------------------
2 1 friends
I want to do search for John Smith for all other users so I want to get:
id first_name last_name status
------------------------------------
2 Tom Summers friends
3 Amy Wilson NULL
I tried doing LEFT JOIN and then WHERE statement but it didn't work because it excluded the rows that have relations with other users but not this user.
I can do this using UNION statement but I was wondering if it's at all possible to do it without UNION.
You need to put your condition into the ON clause of the LEFT JOIN.
Select
u.first_name,
u.last_name,
n.status
From users u
Left Join networks n On ( ( n.user_id = 1 And n.friend_id = u.id )
Or ( n.friend_id = 1 And n.user_id = u.id ) )
Where u.id <> 1
This should return you all users (except for John Smith) and status friend if John Smith is either friend of this user, or this user is friend of John Smith.
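Why the condition belongs in the ON clause can be shown with a minimal sketch, run against an in-memory SQLite database through Python's built-in sqlite3 module. For brevity it checks only one direction of the relationship (friend_id = 1), and the second networks row is hypothetical: it links Amy to Tom rather than to John, which is the case the question says a WHERE filter gets wrong.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER, first_name TEXT, last_name TEXT);
    CREATE TABLE networks (user_id INTEGER, friend_id INTEGER, status TEXT);
    INSERT INTO users VALUES (1, 'John', 'Smith'), (2, 'Tom', 'Summers'), (3, 'Amy', 'Wilson');
    -- The second row is hypothetical: Amy is linked to Tom, not to John.
    INSERT INTO networks VALUES (2, 1, 'friends'), (3, 2, 'friends');
""")

# Condition in ON: unmatched users are kept, with a NULL status.
on_rows = conn.execute("""
    SELECT u.id, u.first_name, n.status
    FROM users u
    LEFT JOIN networks n ON n.friend_id = 1 AND n.user_id = u.id
    WHERE u.id <> 1
    ORDER BY u.id
""").fetchall()

# Condition in WHERE: Amy's row joins to her link with Tom and is then filtered out.
where_rows = conn.execute("""
    SELECT u.id, u.first_name, n.status
    FROM users u
    LEFT JOIN networks n ON n.user_id = u.id
    WHERE u.id <> 1 AND n.friend_id = 1
    ORDER BY u.id
""").fetchall()

print(on_rows)
print(where_rows)
```

The first query keeps Amy with a NULL status, while the second one loses her entirely.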
You probably don't need a WHERE clause, and instead of that, put the condition into the "ON" clause that follows your "LEFT JOIN". That should fix your issues. Also, make sure that the main table is on the left side of the left join, otherwise, you should use a right join.
In addition to the (correct) replies above that such conditions should go in the ON clause, if you really want to put them in the WHERE clause for some reason, just add a condition that the value can be null.
WHERE (networks.friend_id = 2 OR networks.friend_id IS NULL)
From what you've described, it should be a case of joining a subset of networks to users.
select id, first_name, last_name, status
from users u
left join networks n on u.id = n.user_id
and n.friend_id = 1
where id <> 1;
The left join will keep rows from users that do not have a matching row in networks and adding the and n.friend_id = 1 limits when the 'friends' status is returned. Lastly, you may choose to exclude the row from users that you are running the query for.