Why does this inner join of two tables create duplicate rows - sql

I have the following tables:
Users
Conversations
Group_Members
I need to select all the conversations that a user with a specific ID takes part in. Users and Group_Members are in a many-to-many relationship.
Why does the following query create duplicate rows on the last select, as seen in this image?
select * from Conversations
select * from Group_Members
select Conversations.*
from Conversations
inner join Group_Members on Group_Members.userid=1054
User.Id and Conversation.Id are primary keys.
Sure, select distinct would work, but I don't understand why the select above creates duplicates.

Your join condition is wrong. It pairs every Conversations row with every Group_Members row where userid = 1054, regardless of which conversation that membership row belongs to, so each conversation is repeated once per matching membership row. You used your "filter criteria" as your "relation criteria".
Your joining key is in fact ConversationId; the userid test is the filter.
You should write that as:
select Conversations.*
from Conversations
inner join Group_Members on Group_Members.ConversationId = Conversations.Id
where Group_Members.userid=1054
-- and ConversationId = 4; -- if you would filter for a particular conversation
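To see the difference concretely, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows and the Title column are made up for illustration, and SQLite stands in for whatever database you are actually using:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data, not taken from the question.
    CREATE TABLE Conversations (Id INTEGER PRIMARY KEY, Title TEXT);
    CREATE TABLE Group_Members (ConversationId INTEGER, UserId INTEGER);
    INSERT INTO Conversations VALUES (1, 'a'), (2, 'b'), (3, 'c');
    -- User 1054 is in conversations 1 and 2; user 7 is in conversation 3.
    INSERT INTO Group_Members VALUES (1, 1054), (2, 1054), (3, 7);
""")

# Filter used as the join condition: every conversation is paired with
# every membership row of user 1054.
filter_only = conn.execute("""
    SELECT Conversations.Id
    FROM Conversations
    INNER JOIN Group_Members ON Group_Members.UserId = 1054
    ORDER BY Conversations.Id
""").fetchall()

# Join on the key, filter in WHERE: one row per conversation the user is in.
keyed = conn.execute("""
    SELECT Conversations.Id
    FROM Conversations
    INNER JOIN Group_Members ON Group_Members.ConversationId = Conversations.Id
    WHERE Group_Members.UserId = 1054
    ORDER BY Conversations.Id
""").fetchall()

print(filter_only)  # 3 conversations x 2 membership rows = 6 rows
print(keyed)        # only conversations 1 and 2
```

With the filter-only join, conversation 3 shows up even though user 1054 is not a member of it, so select distinct would hide the duplicates but still return wrong rows.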


Trying to count the number of occurences that 3 columns from 2 tables have on my organizations table? I need the occurrences joined in one table

-- 2. In one table, show how many private topics, admins, and standard users each organization has.
SELECT organizations.name, COUNT(topics.privacy) AS private_topic, COUNT(users.type) AS user_admin, COUNT(users.type) AS user_standard
FROM organizations
LEFT JOIN topics
ON organizations.id=topics.org_id
AND topics.privacy='private'
LEFT JOIN users
ON users.org_id=organizations.id
AND users.type='admin'
LEFT JOIN users
ON users.org_id=organizations.id
AND users.type='standard'
GROUP BY organizations.name
;
org_id is the foreign key that relates both the users table and the topics table to organizations. The query keeps giving me the wrong result: it counts only the admins or only the standard users and puts that number in every row of each column. Any help is really appreciated, as I have been stuck on this for a while now!
So, I am getting an error when I do as you said: the users table cannot be specified more than once. I updated the code the way you suggested, but still nothing. They don't really give me any sample data either, but I ran some queries and saw, for example, how many private topics there are (this is stored in the privacy column of the topics table). When I don't get this error, the joins seem to overwrite each other, so that every column in each row shows the same value as the last join.
It appears to me that topics and users have no relationship. You're just trying to get the result together in a single query. There are other and possibly better ways to accomplish that but I think this will fix what you've got already (assuming you have id columns for each table.)
SELECT
organizations.name,
COUNT(DISTINCT topics.id) AS private_topic,
COUNT(DISTINCT users.id) FILTER (WHERE users.type = 'admin') AS user_admin,
COUNT(DISTINCT users.id) FILTER (WHERE users.type = 'standard') AS user_standard
FROM organizations
LEFT JOIN topics
ON organizations.id = topics.org_id AND topics.privacy = 'private'
LEFT JOIN users
ON users.org_id = organizations.id
GROUP BY organizations.name;
I propose this as a more straightforward way:
SELECT
min(o.name) as "name",
(
select count(*) from topics t
where t.org_id = o.id AND t.privacy = 'private'
) as private_topics,
(
select count(*) from users u
where u.org_id = o.id and u.type = 'admin'
) AS user_admin,
(
select count(*) from users u
where u.org_id = o.id and u.type = 'standard'
) AS user_standard
FROM organizations o
GROUP BY o.id;
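For anyone wondering why the plain COUNTs come out inflated, here is a small sketch run through Python's sqlite3 module. The rows are invented for illustration, and I use COUNT(DISTINCT CASE ...) instead of FILTER only so that it also runs on older SQLite versions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data: one organization, 2 private topics,
    -- 1 admin and 2 standard users.
    CREATE TABLE organizations (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE topics (id INTEGER PRIMARY KEY, org_id INTEGER, privacy TEXT);
    CREATE TABLE users (id INTEGER PRIMARY KEY, org_id INTEGER, type TEXT);
    INSERT INTO organizations VALUES (1, 'acme');
    INSERT INTO topics VALUES (1, 1, 'private'), (2, 1, 'private'), (3, 1, 'public');
    INSERT INTO users VALUES (1, 1, 'admin'), (2, 1, 'standard'), (3, 1, 'standard');
""")

# Joining two unrelated child tables yields topics x users rows per organization.
JOINS = """
    FROM organizations o
    LEFT JOIN topics t ON t.org_id = o.id AND t.privacy = 'private'
    LEFT JOIN users u ON u.org_id = o.id
    GROUP BY o.id
"""

# Plain COUNT counts the multiplied rows.
naive = conn.execute(
    "SELECT COUNT(t.id), COUNT(CASE WHEN u.type = 'admin' THEN u.id END)" + JOINS
).fetchone()

# COUNT(DISTINCT ...) collapses the repeats back to the real numbers.
distinct = conn.execute("""
    SELECT COUNT(DISTINCT t.id),
           COUNT(DISTINCT CASE WHEN u.type = 'admin' THEN u.id END),
           COUNT(DISTINCT CASE WHEN u.type = 'standard' THEN u.id END)
""" + JOINS).fetchone()

print(naive)     # inflated: 6 topic rows, 2 admin rows
print(distinct)  # 2 private topics, 1 admin, 2 standard users
```

The correlated-subquery version avoids the multiplication entirely, because each count only ever looks at one child table.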

Select creating a Column with results of another query as a JSON

I'm trying to create a query that will fetch results from table parties. This table contains two foreign keys and I'm having trouble "mapping" these foreign keys.
For the first foreign key I need to map my host_id column to the actual name of the person users.name.
I was able to solve this with:
SELECT parties.*, users.name as host_name
FROM parties
INNER JOIN users ON parties.host_id = users.id
My second foreign key is to a table called guests which has a FK named party_refer which refers to parties.id.
The following query includes my Guests as part of the results (by appending all of my guests table columns in the results)
SELECT parties.*, users.name as host_name, guests.*
FROM parties
INNER JOIN users ON parties.host_id = users.id
INNER JOIN guests ON parties.id = guests.party_refer
I would like to modify this second INNER JOIN so that the results of (select * from guests) are returned as a single Column called Guests with the results expressed as a JSON.
I believe I need to use array_to_json(array_agg(row_to_json())) but I've been trying for hours to get it working with no luck.
I think you are looking for
SELECT parties.*, users.name as host_name, json_agg(row_to_json(guests)) as guests
FROM parties
INNER JOIN users ON parties.host_id = users.id
INNER JOIN guests ON parties.id = guests.party_refer
GROUP BY parties.id, users.name
Although a subquery may be simpler than extensive grouping:
SELECT
parties.*,
users.name as host_name,
(SELECT json_agg(row_to_json(guests))
FROM guests
WHERE guests.party_refer = parties.id) as guests
FROM parties
INNER JOIN users ON parties.host_id = users.id
You might prefer an explicit json_build_object instead of the row_to_json, e.g.
json_agg(json_build_object('guestName', guests.name))

Count number of posts of each user - SQL

I need to get the number of posts each user has created.
This is the structure of both tables (users, microposts).
Microposts
id
user_id
content
created_at
Users
id
name
email
admin
SELECT users.*, count( microposts.user_id )
FROM microposts LEFT JOIN users ON users.id=microposts.user_id
GROUP BY microposts.user_id
This gets me only the users that have posts. I need to get all users, even if they have 0 posts
You have the join in the wrong order.
In a LEFT JOIN you ensure you keep all the records in the table written first (to the left).
So, join in the other order (users first/left), and then group by the user table's id, and not the microposts table's user_id...
SELECT users.*, count( microposts.user_id )
FROM users LEFT JOIN microposts ON users.id=microposts.user_id
GROUP BY users.id
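A quick way to check this is with Python's sqlite3 module. The two users and their posts are made-up sample rows, and the tables are trimmed to the columns that matter here:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data: ann has two posts, bob has none.
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE microposts (id INTEGER PRIMARY KEY, user_id INTEGER, content TEXT);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO microposts VALUES (1, 1, 'hi'), (2, 1, 'again');
""")

# microposts on the left: users without posts never enter the result.
posts_first = conn.execute("""
    SELECT users.name, COUNT(microposts.user_id)
    FROM microposts LEFT JOIN users ON users.id = microposts.user_id
    GROUP BY microposts.user_id
""").fetchall()

# users on the left: every user is kept, and COUNT ignores the NULL
# user_id produced for users with no matching post.
users_first = conn.execute("""
    SELECT users.name, COUNT(microposts.user_id)
    FROM users LEFT JOIN microposts ON users.id = microposts.user_id
    GROUP BY users.id
    ORDER BY users.id
""").fetchall()

print(posts_first)  # [('ann', 2)]
print(users_first)  # [('ann', 2), ('bob', 0)]
```

Note that counting microposts.user_id rather than * is what makes bob come out as 0 instead of 1.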

Ambiguous column name error for user_id

My users table has the columns
user_id, email
My invites table has the columns
invite_id request_id user_id sent_time
When I run the following query, I get the two tables joined into 1, which is expected.
'SELECT * FROM users INNER JOIN invites ON users.user_id = invites.user_id'
However, when I run the following query,
'SELECT user_id FROM users INNER JOIN invites ON users.user_id = invites.user_id'
I get the following error
OperationalError: (sqlite3.OperationalError) ambiguous column name: user_id [SQL: 'SELECT user_id FROM users INNER JOIN invites ON users.user_id = invites.user_id']
Any help appreciated.
I think the message is pretty clear. SQLite doesn't know what table user_id is coming from.
One simple solution is to qualify the column name using a table alias:
SELECT u.user_id
FROM users u INNER JOIN
invites i
ON u.user_id = i.user_id;
Another method is to use USING rather than ON:
SELECT user_id
FROM users u INNER JOIN
invites i
USING (user_id);
You need to qualify the column name with the table name, as below, because both tables involved in the query have a column with the same name:
SELECT `users`.user_id
FROM users
INNER JOIN invites ON `users`.user_id = invites.user_id
This means you need to be specific about which user_id column you want to display. Even though they're joined that doesn't guarantee they're identical. Some types of joins allow NULL values on one side of the match (e.g. LEFT JOIN), so you need to ask for a particular value:
SELECT users.user_id FROM users INNER JOIN invites ON users.user_id = invites.user_id
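Here is a small reproduction with Python's sqlite3 module, in case it helps. The table layout follows the question, while the single sample row in each table is invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Columns as described in the question; the rows are hypothetical.
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE invites (invite_id INTEGER PRIMARY KEY, request_id INTEGER,
                          user_id INTEGER, sent_time TEXT);
    INSERT INTO users VALUES (1, 'a@example.com');
    INSERT INTO invites VALUES (10, 100, 1, '2020-01-01');
""")

JOIN_ON = "FROM users INNER JOIN invites ON users.user_id = invites.user_id"

# Unqualified column with ON: SQLite cannot tell which table is meant.
try:
    conn.execute("SELECT user_id " + JOIN_ON)
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)

# Qualifying the column resolves the ambiguity.
qualified = conn.execute("SELECT users.user_id " + JOIN_ON).fetchall()

# With USING, the shared column can be selected without a qualifier.
using = conn.execute(
    "SELECT user_id FROM users INNER JOIN invites USING (user_id)"
).fetchall()

print(error)
print(qualified, using)
```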

Is it true that JOINS can be used everywhere to replace Subqueries in SQL

I have heard people say that table joins can be used everywhere to replace subqueries. I tested this in my query, but I only got the right data set when I used a subquery; I was not able to get the same data set using joins. I am not sure whether my finding is right, because I am new to RDBMSs and not very experienced. I will try to describe the schema of the database I was experimenting with:
The database has two tables:
Users (ID, Name, City) and Friendship (ID, Friend_ID)
Goal: The Users table stores simple user data, and the Friendship table represents friendships between users. Both columns of the Friendship table are foreign keys referencing Users.ID. The tables have a many-to-many relationship.
Question: I have to retrieve Users.ID and Users.Name of all the Users, which are not friends with a particular user x, but are from same city (much like fb's friend suggestion system).
By using subquery, I am able to achieve this. Query looks like:
SELECT ID, NAME
FROM USERS AS U
WHERE U.ID NOT IN (SELECT FRIENDS_ID
FROM FRIENDSHIP,
USERS
WHERE USERS.ID = FRIENDSHIP.ID AND USERS.ID = x)
AND U.ID != x AND CITY LIKE '% A_CITY%';
Example entries:
Users
Id = 1 Name = Jon City = Mumbai
Id=2 Name=Doe City=Mumbai
Id=3 Name=Arun City=Mumbai
Id=4 Name=Prakash City=Delhi
Friendship
Id= 1 Friends_Id = 2
Id = 2 Friends_Id=1
Id = 2 Friends_Id = 3
Id = 3 Friends_Id = 2
Can I get the same data set in a single query by performing joins. How? Please let me know if my question is not clear. Thanks.
Note: I used an inner join in the subquery by specifying both tables, Friendship and Users. Omitting the Users table and using the U from the outer query gives an error. (If I don't alias the Users table, the query becomes syntactically valid, but the result includes the IDs and names of users who have more than one friend, including the user with ID x. Interesting, but not the topic of this question.)
For not in you can use left join and check for is null:
select u.id, u.name
from Users u
left join Friendship f on u.id = f.id and f.friend_id = #person
where u.city like '%city%' and f.friend_id is null and u.id <> #person;
There are some cases where you can't work out your way with just inner/left/right joins, but your case is not one of them.
Please check sql fiddle: http://sqlfiddle.com/#!9/1c5b1/14
Also about your note: What you tried to do can be achieved with lateral join or cross apply depending on the engine you are using.
You can rewrite your query using only joins. The trick is to join the Users table to itself with an inner join to identify users within the same city, and to reference the Friendship table with a left join and a null check to identify non-friends.
SELECT
U1.ID,
U1.Name
FROM
USERS U1
INNER JOIN
USERS U2
ON
U1.CITY = U2.CITY
LEFT JOIN
FRIENDSHIP F
ON
U2.ID = F.ID AND
U1.ID = F.FRIEND_ID
WHERE
U2.id = X AND
U1.ID <> U2.id AND
F.id IS NULL
The above query doesn't handle the situation where user x's primary key is in the FRIEND_ID column of the FRIENDSHIP table. I assume that is fine, because your subquery version doesn't handle that situation either; perhaps you create 2 rows for each friendship, or friendships are not bi-directional.
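As a sanity check, the sketch below runs both forms against the sample rows from the question using Python's sqlite3 module. Two things are my own assumptions: x is taken to be 1, and the subquery version compares against x's city instead of using the LIKE pattern, so that both queries answer the same question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Sample rows copied from the question.
    CREATE TABLE Users (ID INTEGER PRIMARY KEY, Name TEXT, City TEXT);
    CREATE TABLE Friendship (ID INTEGER, Friend_ID INTEGER);
    INSERT INTO Users VALUES (1, 'Jon', 'Mumbai'), (2, 'Doe', 'Mumbai'),
                             (3, 'Arun', 'Mumbai'), (4, 'Prakash', 'Delhi');
    INSERT INTO Friendship VALUES (1, 2), (2, 1), (2, 3), (3, 2);
""")

x = 1  # assumed value for the user the suggestions are for

# NOT IN form: exclude x and everyone listed as a friend of x.
with_subquery = conn.execute("""
    SELECT U.ID, U.Name
    FROM Users AS U
    WHERE U.ID NOT IN (SELECT F.Friend_ID FROM Friendship F WHERE F.ID = ?)
      AND U.ID <> ?
      AND U.City = (SELECT City FROM Users WHERE ID = ?)
""", (x, x, x)).fetchall()

# Join-only form: self-join on city, then keep rows with no friendship match.
with_joins = conn.execute("""
    SELECT U1.ID, U1.Name
    FROM Users U1
    INNER JOIN Users U2 ON U1.City = U2.City
    LEFT JOIN Friendship F ON U2.ID = F.ID AND U1.ID = F.Friend_ID
    WHERE U2.ID = ? AND U1.ID <> U2.ID AND F.ID IS NULL
""", (x,)).fetchall()

print(with_subquery)  # [(3, 'Arun')]
print(with_joins)     # [(3, 'Arun')]
```

One caveat worth knowing: if Friend_ID can be NULL, NOT IN returns no rows at all, whereas the left join with a null check is unaffected.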
Joins and subqueries can be used to achieve similar results in some cases, but certainly not all. As an example, this query with a subquery could not be achieved with a join alone:
SELECT ID, COLUMN1, COUNT(*) FROM MYTABLE
WHERE ID IN (
SELECT DISTINCT ID FROM MYTABLE
WHERE COLUMN2 NOT IN (VALUES1, VALUES2)
)
GROUP BY ID;
This is only one example, but there are many.
Conversely, you cannot get information from another table by using a subquery without joining it.
As to your example
SELECT ID, NAME FROM USERS AS U
WHERE U.ID NOT IN (
SELECT FRIENDS_ID FROM FRIENDSHIP, USERS
WHERE USERS.ID = FRIENDSHIP.ID AND USERS.ID = x)
AND U.ID != x AND CITY LIKE '% A_CITY%';
This could be constructed as:
select ID, NAME from users u
join FRIENDSHIP f on f.ID = u.ID
where u.ID = x
and u.ID != y
and CITY like '%A_CITY';
I changed your second x to a y, as an assumption, so that it wouldn't cause confusion.
Of course, you may also want to LEFT JOIN aka LEFT OUTER JOIN if there is a chance that there may be multiple results in the FRIENDSHIP table.