SQL Querying Incomplete Result - sql

I am having trouble understanding what is wrong with this query or how to properly join it together.
I know that I am missing results from other queries that show both sides of the connections separately using two aliases for the same table.
Pretty much a friendship has two "id"s (source and target) which both map to the same column in the users table.
I believe it is the OR condition that is causing the invalid join, and if so, how is the join supposed to be done? If not, what is the problem?
SELECT DISTINCT u.id, u.first_name, u.last_name
FROM friendships AS f, users AS u
WHERE f.mutual = 'true' -- both are friends
AND (u.id = f.source_id OR u.id = f.target_id) -- this line?
AND f.created_at BETWEEN '2016-01-20' AND '2016-01-27' -- time period they became friends
GROUP BY u.id, u.first_name
ORDER BY u.first_name;
Is it the use of the word "DISTINCT"?
I appreciate the help, I have tried INNER and LEFT JOINS, but my mind is just blanking on this and I can't figure out how to get it to work.
Here is the query that IS WORKING and shows me more users than the above query:
SELECT f.source_id, u1.first_name AS s_firstname, u1.last_name AS s_lastname, f.target_id, u2.first_name AS t_firstname, u2.last_name AS t_lastname
FROM friendships AS f, users AS u1, users AS u2
WHERE f.mutual = 'true'
AND u1.id = f.source_id AND u2.id = f.target_id
AND f.created_at BETWEEN '2016-01-20' AND '2016-01-27';
Friendship Table Definition:
User Table Definition:

You must either
1. UNION (distinct) the projections of your second query on source and target or
2. select distinct users where they are a source or target
SELECT DISTINCT u.id, u.first_name, u.last_name
FROM friendships AS f, users AS u
WHERE f.mutual = 'true' -- both are friends
AND (u.id = f.source_id OR u.id = f.target_id)
AND f.created_at BETWEEN '2016-01-20' AND '2016-01-27'
ORDER BY u.first_name;
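For comparison, option 1 (the UNION of the source and target projections) might look roughly like the sketch below. It runs both variants through Python's built-in sqlite3 module so the results can be checked side by side. The table and column names come from the question; the sample rows, the PERIOD helper, and the use of SQLite rather than the asker's engine are assumptions made purely for illustration.

```python
import sqlite3

# In-memory database with made-up rows; only the table and column names
# come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE friendships (source_id INTEGER, target_id INTEGER,
                          mutual TEXT, created_at TEXT);
INSERT INTO users VALUES
  (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray'), (3, 'Cy', 'Fox'), (4, 'Dee', 'Orr');
INSERT INTO friendships VALUES
  (1, 2, 'true',  '2016-01-21'),
  (3, 1, 'true',  '2016-01-22'),
  (4, 2, 'false', '2016-01-23'),  -- not mutual, filtered out
  (4, 3, 'true',  '2016-02-01');  -- outside the date range, filtered out
""")

# Shared filter (hypothetical helper, just to avoid repeating the condition).
PERIOD = ("f.mutual = 'true' "
          "AND f.created_at BETWEEN '2016-01-20' AND '2016-01-27'")

# Option 1: one branch per side of the friendship; UNION removes duplicates.
union_sql = f"""
SELECT u.id, u.first_name, u.last_name
FROM friendships AS f JOIN users AS u ON u.id = f.source_id
WHERE {PERIOD}
UNION
SELECT u.id, u.first_name, u.last_name
FROM friendships AS f JOIN users AS u ON u.id = f.target_id
WHERE {PERIOD}
ORDER BY 2  -- second output column, i.e. first_name
"""

# Option 2: the OR version from the answer above.
or_sql = f"""
SELECT DISTINCT u.id, u.first_name, u.last_name
FROM friendships AS f, users AS u
WHERE {PERIOD}
  AND (u.id = f.source_id OR u.id = f.target_id)
ORDER BY u.first_name
"""

union_rows = conn.execute(union_sql).fetchall()
or_rows = conn.execute(or_sql).fetchall()
print(union_rows)
print(or_rows)
```

On this sample data both queries list users 1, 2 and 3 once each, even though user 1 appears as a source in one friendship and as a target in another.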

This will avoid any "duplicates" between u1 * u2. Either one (or more) friendship records exist (in the given period) or they don't.
SELECT u1.first_name AS s_firstname, u1.last_name AS s_lastname
, u2.first_name AS t_firstname, u2.last_name AS t_lastname
FROM users AS u1
JOIN users AS u2
ON EXISTS (
SELECT 13 FROM friendships f
WHERE u1.id = f.source_id AND u2.id = f.target_id
AND f.mutual = 'true'
AND f.created_at BETWEEN '2016-01-20' AND '2016-01-27'
);

Related

Is it true that JOINS can be used everywhere to replace Subqueries in SQL

I heard people saying that table joins can be used everywhere to replace sub-queries. I tested it in my query, but found that appropriate data set was only retrieved when I used sub-queries. I was not able to get same data set using joins. I am not sure if what I found is right because I am a newcomer in RDBMS, thus not so much experienced. I will try to draw the schema (in words) of the database in which I was experimenting:
The database has two tables:
Users (ID, Name, City) and Friendship (ID, Friend_ID)
Goal: Users table is designed to store simple user data and Friendship table represents Friendship between users. Friendship table has both the columns as foreign keys, referencing to Users.ID. Tables have many-to-many relationship between them.
Question: I have to retrieve Users.ID and Users.Name of all the Users, which are not friends with a particular user x, but are from same city (much like fb's friend suggestion system).
By using subquery, I am able to achieve this. Query looks like:
SELECT ID, NAME
FROM USERS AS U
WHERE U.ID NOT IN (SELECT FRIENDS_ID
FROM FRIENDSHIP,
USERS
WHERE USERS.ID = FRIENDSHIP.ID AND USERS.ID = x)
AND U.ID != x AND CITY LIKE '% A_CITY%';
Example entries:
Users
Id = 1 Name = Jon City = Mumbai
Id=2 Name=Doe City=Mumbai
Id=3 Name=Arun City=Mumbai
Id=4 Name=Prakash City=Delhi
Friendship
Id= 1 Friends_Id = 2
Id = 2 Friends_Id=1
Id = 2 Friends_Id = 3
Id = 3 Friends_Id = 2
Can I get the same data set in a single query by performing joins? How? Please let me know if my question is not clear. Thanks.
Note: I used an inner join in the subquery by specifying both tables, Friendship and Users. Omitting the Users table and using U from the outer query gives an error. (If I don't alias the Users table, the query becomes syntactically valid, but the result includes the IDs and names of users who have more than one friend, including the user with ID x. Interesting, but not the topic of this question.)
For NOT IN you can use a LEFT JOIN and check for IS NULL:
select u.id, u.name
from Users u
left join Friends f on u.id = f.id and f.friend_id = #person
where u.city like '%city%' and f.friend_id is null and u.id <> #person;
There are some cases where you can't work out your way with just inner/left/right joins, but your case is not one of them.
Please check sql fiddle: http://sqlfiddle.com/#!9/1c5b1/14
Also about your note: What you tried to do can be achieved with lateral join or cross apply depending on the engine you are using.
You can rewrite your query using only joins. The trick is to join to the User tables once with an inner join to identify users within the same city and reference the Friendship table with a left join and a null check to identify non-friends.
SELECT
U1.ID,
U1.Name
FROM
USERS U1
INNER JOIN
USERS U2
ON
U1.CITY = U2.CITY
LEFT JOIN
FRIENDSHIP F
ON
U2.ID = F.ID AND
U1.ID = F.FRIEND_ID
WHERE
U2.id = X AND
U1.ID <> U2.id AND
F.id IS NULL
The above query doesn't handle the situation where USER x's primary key is in the FRIEND_ID column of the FRIENDSHIP table. I assume because your subquery version doesn't handle that situation, perhaps you create 2 rows for each friendship, or friendships are not bi-directional.
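As a rough check that the join version matches the subquery version, here is a small sketch using Python's built-in sqlite3 module and the example entries from the question. The subquery form is simplified (an exact city match instead of LIKE), and the suggestions helper and the :x parameter are names invented for this illustration, so treat it as a sketch rather than the asker's exact query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
CREATE TABLE friendship (id INTEGER, friend_id INTEGER);
INSERT INTO users VALUES
  (1, 'Jon', 'Mumbai'), (2, 'Doe', 'Mumbai'),
  (3, 'Arun', 'Mumbai'), (4, 'Prakash', 'Delhi');
-- Two rows per friendship, as in the question's example entries.
INSERT INTO friendship VALUES (1, 2), (2, 1), (2, 3), (3, 2);
""")

# Subquery form: same city as user x, not x, and not among x's friends.
subquery_sql = """
SELECT u.id, u.name
FROM users AS u
WHERE u.id NOT IN (SELECT f.friend_id FROM friendship AS f WHERE f.id = :x)
  AND u.id <> :x
  AND u.city = (SELECT city FROM users WHERE id = :x)
ORDER BY u.id
"""

# Join form: inner join finds same-city users, left join plus IS NULL
# keeps only those without a friendship row from x.
join_sql = """
SELECT u1.id, u1.name
FROM users AS u1
INNER JOIN users AS u2 ON u1.city = u2.city
LEFT JOIN friendship AS f ON u2.id = f.id AND u1.id = f.friend_id
WHERE u2.id = :x AND u1.id <> u2.id AND f.id IS NULL
ORDER BY u1.id
"""

def suggestions(sql, x):
    """Run one of the queries above for user id x (hypothetical helper)."""
    return conn.execute(sql, {"x": x}).fetchall()

for x in (1, 2, 3, 4):
    print(x, suggestions(subquery_sql, x), suggestions(join_sql, x))
```

With the example data, user 1 (Jon) gets Arun as the only suggestion from either query, and user 2 (Doe) gets none because both other Mumbai users are already friends.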
Joins and subqueries can be used to achieve similar results in some cases, but certainly not all. As an example, this query with a subquery could not be achieved with a join:
SELECT ID, COLUMN1, COUNT(*) FROM MYTABLE
WHERE ID IN (
SELECT DISTINCT ID FROM MYTABLE
WHERE COLUMN2 NOT IN (VALUES1, VALUES2)
)
GROUP BY ID;
This is only one example, but there are many.
Conversely, you cannot get information from another table by using a subquery without joining it.
As to your example
SELECT ID, NAME FROM USERS AS U
WHERE U.ID NOT IN (
SELECT FRIENDS_ID FROM FRIENDSHIP, USERS
WHERE USERS.ID = FRIENDSHIP.ID AND USERS.ID = x)
AND U.ID != x AND CITY LIKE '% A_CITY%';
This could be constructed as:
select ID, NAME from users u
join FRIENDSHIP f on f.ID = u.ID
where u.ID = x
and u.ID != y
and CITY like '%A_CITY';
I changed your second x to a y (an assumption on my part) so it wouldn't cause confusion.
Of course, you may also want to LEFT JOIN aka LEFT OUTER JOIN if there is a chance that there may be multiple results in the FRIENDSHIP table.

How can I get records from one table which do not exist in a related table?

I have this users table:
and this relationships table:
So each user is paired with another one in the relationships table.
Now I want to get a list of users which are not in the relationships table, in either of the two columns (user_id or pair_id).
How could I write that query?
First try:
SELECT users.id
FROM users
LEFT OUTER JOIN relationships
ON users.id = relationships.user_id
WHERE relationships.user_id IS NULL;
Output:
This should display only 2 results: 5 and 6. The result 8 is not correct, as it already exists in relationships. Of course I'm aware that the query is not correct; how can I fix it?
I'm using PostgreSQL.
You need to compare against both columns in the ON clause:
SELECT u.id
FROM users u LEFT OUTER JOIN
relationships r
ON u.id = r.user_id or u.id = r.pair_id
WHERE r.user_id IS NULL;
In general, OR in an ON clause can be inefficient. I would recommend replacing this with two NOT EXISTS conditions:
SELECT u.id
FROM users u
WHERE NOT EXISTS (SELECT 1 FROM relationships r WHERE u.id = r.user_id) AND
NOT EXISTS (SELECT 1 FROM relationships r WHERE u.id = r.pair_id);
I like the set operators
select id from users
except
select user_id from relationships
except
select pair_id from relationships
or
select id from users
except
(select user_id from relationships
union
select pair_id from relationships
)
This is a special case of:
Select rows which are not present in other table
I suppose this will be simplest and fastest:
SELECT u.id
FROM users u
WHERE NOT EXISTS (
SELECT 1
FROM relationships r
WHERE u.id IN (r.user_id, r.pair_id)
);
In Postgres, u.id IN (r.user_id, r.pair_id) is just short for: (u.id = r.user_id OR u.id = r.pair_id).
The expression is transformed that way internally, which can be observed from EXPLAIN ANALYZE.
To clear up speculations in the comments: Modern versions of Postgres are going to use matching indexes on user_id, and / or pair_id with this sort of query.
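To see the difference between the first attempt and the fixed queries, here is a minimal sketch using Python's built-in sqlite3 module. The original tables were not preserved with the question, so the rows below are invented to reproduce the described symptom (8 appears only as a pair_id); SQLite stands in for Postgres here, which is also an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY);
CREATE TABLE relationships (user_id INTEGER, pair_id INTEGER);
INSERT INTO users VALUES (1), (2), (3), (4), (5), (6), (7), (8);
-- Hypothetical data: user 8 only ever appears in the pair_id column.
INSERT INTO relationships VALUES (1, 2), (2, 1), (3, 4), (4, 3), (7, 8);
""")

# First try from the question: only user_id is checked.
first_try = """
SELECT u.id
FROM users u
LEFT OUTER JOIN relationships r ON u.id = r.user_id
WHERE r.user_id IS NULL
ORDER BY u.id
"""

# NOT EXISTS against both columns at once.
not_exists = """
SELECT u.id
FROM users u
WHERE NOT EXISTS (
    SELECT 1 FROM relationships r
    WHERE u.id IN (r.user_id, r.pair_id)
)
ORDER BY u.id
"""

# Set-operator form: remove ids found in either column.
except_form = """
SELECT id FROM users
EXCEPT
SELECT user_id FROM relationships
EXCEPT
SELECT pair_id FROM relationships
ORDER BY 1
"""

first_try_ids = [row[0] for row in conn.execute(first_try)]
not_exists_ids = [row[0] for row in conn.execute(not_exists)]
except_ids = [row[0] for row in conn.execute(except_form)]
print(first_try_ids, not_exists_ids, except_ids)
```

On this data the first attempt still returns 8, while the NOT EXISTS and EXCEPT forms both return only 5 and 6.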
Something like:
select u.id
from users u
where u.id not in (select r.user_id from relationships r)
and u.id not in (select r.pair_id from relationships r)
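One thing to watch with NOT IN: if either column can contain NULL, a single NULL in the subquery result means NOT IN never evaluates to true, so the query returns no rows at all. The sketch below illustrates this with Python's built-in sqlite3 module. The rows are invented, and whether user_id or pair_id is actually nullable in the asker's schema is not stated, so this only matters under that assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY);
CREATE TABLE relationships (user_id INTEGER, pair_id INTEGER);
INSERT INTO users VALUES (1), (2), (3), (4);
-- Hypothetical data: user 3 has a row with no pair yet (NULL pair_id).
INSERT INTO relationships VALUES (1, 2), (3, NULL);
""")

# NOT IN form: the NULL in pair_id makes the second condition never true.
not_in_sql = """
SELECT u.id
FROM users u
WHERE u.id NOT IN (SELECT r.user_id FROM relationships r)
  AND u.id NOT IN (SELECT r.pair_id FROM relationships r)
ORDER BY u.id
"""

# NOT EXISTS form: a NULL simply fails to match, so it is harmless.
not_exists_sql = """
SELECT u.id
FROM users u
WHERE NOT EXISTS (
    SELECT 1 FROM relationships r
    WHERE u.id IN (r.user_id, r.pair_id)
)
ORDER BY u.id
"""

not_in_rows = conn.execute(not_in_sql).fetchall()
not_exists_rows = conn.execute(not_exists_sql).fetchall()
print(not_in_rows)
print(not_exists_rows)
```

Here user 4 is in neither column, yet only the NOT EXISTS form returns it. Adding "WHERE r.pair_id IS NOT NULL" inside the NOT IN subquery would be one way to keep that style.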

Select without Group By Sql

I have a problem with this query: I need all of the selected columns, but I only want to group by aspnet_Users.UserId. How should I do this?
Here is my query:
SELECT
aspnet_Users.UserId, aspnet_Users.UserName,
Friends.Verified, Friends.FriendUserId
FROM
[aspnet_Users]
INNER JOIN
Friends ON Friends.UserId = aspnet_Users.UserId OR Friends.FriendUserId = aspnet_Users.UserId
WHERE
aspnet_Users.UserId IN
(SELECT UserId as Id FROM Friends
WHERE FriendUserId='3d1224ac-f2ad-45d4-aa84-a98e748e3e57'
UNION
SELECT FriendUserId as Id FROM Friends
WHERE UserId='3d1224ac-f2ad-45d4-aa84-a98e748e3e57')
GROUP BY
aspnet_Users.UserId, aspnet_Users.UserName,
Friends.Verified, Friends.FriendUserId
Stored Procedure structure:
Tables [aspnet_Users] and [Friends]
Columns [aspnet_Users.UserId], [aspnet_Users.UserName], [Friends.Verified], [Friends.FriendUserId]
That is the data that I need to get using the procedure, but the problem is that each user can have multiple friends, which means there will be multiple Friends values. This is the reason why I can't group using those values, because it gives me duplicates with the wrong value.
Expected Output:
aspnet_Users.UserId (This is the user id of the friend; note that it doesn't mean that it's the FriendUserId)
aspnet_Users.UserName (This is the UserName of the friend based on the UserId explained above)
Friends.Verified (True or False. The Verified value in the Friends table on the friendship)
Friends.FriendUserId (The FriendUserId value from the table Friends)
If your relationship is not fully reciprocal in the table Friends, you can use this query.
Otherwise either part of the UNION would work on its own
SELECT U.UserId, U.UserName, F.Verified, F.FriendUserId
FROM Friends F
JOIN aspnet_Users U ON U.UserId=F.UserId
WHERE F.FriendUserId='3d1224ac-f2ad-45d4-aa84-a98e748e3e57'
UNION
SELECT U.UserId, U.UserName, F.Verified, F.FriendUserId
FROM Friends F
JOIN aspnet_Users U ON U.UserId=F.FriendUserId
WHERE F.UserId='3d1224ac-f2ad-45d4-aa84-a98e748e3e57'
Do you need to group? You aren't using any aggregate functions (eg sum, count)
You haven't made it clear what results you're looking for - I'm guessing all details of users connected with a given user, including the user themselves?
Surely you can just do something like:
Select a.userid, a.username, f.verified, f.userid
From aspnet_users a
Full outer join friends f on f.frienduserid = a.userid
Where a.userid = #userid or f.frienduserid = #userid
You can use DISTINCT
SELECT DISTINCT
aspnet_Users.UserId, aspnet_Users.UserName,
Friends.Verified, Friends.FriendUserId
FROM
[aspnet_Users]
INNER JOIN
Friends ON Friends.UserId = aspnet_Users.UserId OR Friends.FriendUserId = aspnet_Users.UserId
WHERE
aspnet_Users.UserId IN
(SELECT UserId as Id FROM Friends
WHERE FriendUserId='3d1224ac-f2ad-45d4-aa84-a98e748e3e57'
UNION
SELECT FriendUserId as Id FROM Friends
WHERE UserId='3d1224ac-f2ad-45d4-aa84-a98e748e3e57')

How to do joins with conditions?

I always struggle with joins within Access. Can someone guide me?
4 tables.
Contest (id, user_id, pageviews)
Users (id, role_name, location)
Roles (id, role_name, type1, type2, type3)
Locations (id, location_name, city, state)
Regarding the Roles table -- type1, type2, type3 will have a Y if role_name is of that type. So "Regular" for role_name would have a Y within type1, "Moderator" for role_name would have a Y within type2, and "Admin" for role_name would have a Y within type3. I didn't design this database.
So what I'm trying to do. I want to output the following: user_id, pageviews, role_name, city, state.
I'm selecting the user_id and pageviews from Contest. I then need to get the role_name of this user, so I need to join the Users table to the Contest table, right?
From there, I need to also select the location information from the Locations table -- I assume I just join on Locations.location_name = Users.location?
Here is the tricky part. I only want to output if type1, within the Roles table, is Y.
I'm lost!
As far as I can see, this is a query that can be built in the query design window, because you do not seem to need left joins or any other modifications, so:
SELECT Contest.user_id,
Contest.pageviews,
Roles.role_name,
Locations.city,
Locations.state
FROM ((Contest
INNER JOIN Users
ON Contest.user_id = Users.id)
INNER JOIN Roles
ON Users.role_name = Roles.role_name)
INNER JOIN Locations
ON Users.location = Locations.location_name
WHERE Roles.type1="Y"
Lots of parentheses :)
select *
from users u
inner join contest c on u.id = c.user_id
inner join locations l on l.id = u.location
inner join roles r on r.role_name = u.role_name
where r.type1 = 'Y'
This is assuming that location in users refers to the location id, if it is location name then it has to be joined to that column in locations table.
EDIT: The answer accepted is better, I did not consider that access needs parentheses.
Can you show what query you are currently using? Can't you just join on role_name and just ignore the type1, type2, type3? I am assuming there are just those 3 role_names available.
I know you didn't design it, but can you change the structure? Sometimes it's better to move to a sturdy foundation rather than living in the house that is about to fall on your head.
SELECT u.user_id, c.pageviews,
IIF(r.role_Name = "Moderator", r.type1 = "Y",
IIF(r.role_name="Admin", r.type2="Y", r.type3="Y")),
l.location_name FROM users as u
INNER JOIN roles as r On (u.role_name = r.role_name)
INNER JOIN contest as c On (c.user_id = u.Id)
INNER JOIN locations as l On (u.location = l.location_name or l.id)
depending on whether the location in your user table is an id or the actual name reference.
I think I need to see some sample data... I do not understand the relationship between Users and Roles, because there is a role_name field within the Users table. How does that relate to the Roles table?
EDIT NOTE Now using SQL Explicit Join Best Practice
SELECT
C.user_id
, C.pageviews
, U.role_name
, L.city
, L.state
FROM
Contest C
INNER JOIN Users U ON C.user_id = U.id
INNER JOIN Locations L ON U.location = L.id
INNER JOIN Roles R ON U.role_name = R.role_name
WHERE
R.type1='Y'
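To make the shape of the result concrete, here is a small sketch that builds the four tables with invented rows and runs the join through Python's built-in sqlite3 module. It assumes Users.location holds the location name (as the question suggests), not the id, and SQLite does not need the nested parentheses that Access requires, so this shows the logic only and not Access syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contest (id INTEGER PRIMARY KEY, user_id INTEGER, pageviews INTEGER);
CREATE TABLE users (id INTEGER PRIMARY KEY, role_name TEXT, location TEXT);
CREATE TABLE roles (id INTEGER PRIMARY KEY, role_name TEXT,
                    type1 TEXT, type2 TEXT, type3 TEXT);
CREATE TABLE locations (id INTEGER PRIMARY KEY, location_name TEXT,
                        city TEXT, state TEXT);
-- All rows below are made up for illustration.
INSERT INTO users VALUES (1, 'Regular', 'HQ'), (2, 'Admin', 'HQ'),
                         (3, 'Regular', 'Branch');
INSERT INTO contest VALUES (1, 1, 100), (2, 2, 50), (3, 3, 7);
INSERT INTO roles VALUES (1, 'Regular', 'Y', 'N', 'N'),
                         (2, 'Moderator', 'N', 'Y', 'N'),
                         (3, 'Admin', 'N', 'N', 'Y');
INSERT INTO locations VALUES (1, 'HQ', 'Austin', 'TX'),
                             (2, 'Branch', 'Reno', 'NV');
""")

# Join every table to users, then keep only roles flagged as type1.
sql = """
SELECT c.user_id, c.pageviews, r.role_name, l.city, l.state
FROM contest AS c
INNER JOIN users AS u ON c.user_id = u.id
INNER JOIN roles AS r ON u.role_name = r.role_name
INNER JOIN locations AS l ON u.location = l.location_name
WHERE r.type1 = 'Y'
ORDER BY c.user_id
"""

rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

Only the two 'Regular' users come back; the Admin's contest row is dropped by the type1 filter.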

joining two tables

I have users table. There are three other tables: developers, managers, testers. All of these tables have a foreign key user_id.
I need to find all the users who are either a developer or a manager. What will the SQL look like?
Update: Someone can be both a developer and a manager.
One way to do it would be
SELECT u.*, 'Developer'
FROM users u
INNER JOIN developer d ON d.user_id = u.user_id
UNION ALL
SELECT u.*, 'Manager'
FROM users u
INNER JOIN manager m ON m.user_id = u.user_id
SELECT *
FROM users u
WHERE EXISTS
(
SELECT NULL
FROM developers
WHERE user_id = u.id
UNION ALL
SELECT NULL
FROM managers
WHERE user_id = u.id
)
SELECT u.*,
CASE WHEN d.user_id IS NULL THEN 'N' ELSE 'Y' END is_developer,
CASE WHEN m.user_id IS NULL THEN 'N' ELSE 'Y' END is_manager
FROM users u -- all users
LEFT JOIN developers d -- perhaps a developer
ON u.user_id = d.user_id
LEFT JOIN manager m -- perhaps a manager
ON u.user_id = m.user_id
WHERE d.user_id IS NOT NULL -- either a developer
OR m.user_id IS NOT NULL -- or a manager (or both)
SELECT
user_id
/* ...other desired columns from the user table... */
FROM
user
WHERE
user_id IN (SELECT user_id FROM developer UNION SELECT user_id FROM manager)
Here I am using IN rather than EXISTS so that the developer and manager tables only need to be queried one time. It's possible that the optimizer may do this anyway, but this makes it explicit.
Also, this solution does not return duplicates for users who are both managers and developers.
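The duplicate behaviour is easy to check with a small sketch using Python's built-in sqlite3 module. The rows are invented, and the tables are named users, developers and managers here (the answers above vary between singular and plural names), so adjust to the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE developers (user_id INTEGER);
CREATE TABLE managers (user_id INTEGER);
CREATE TABLE testers (user_id INTEGER);
-- Hypothetical data: user 2 is both a developer and a manager.
INSERT INTO users VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy'), (4, 'Dee');
INSERT INTO developers VALUES (1), (2);
INSERT INTO managers VALUES (2), (3);
INSERT INTO testers VALUES (4);
""")

# UNION ALL of two joins: one row per role, so user 2 appears twice.
union_all_sql = """
SELECT u.user_id, 'Developer' AS role
FROM users u INNER JOIN developers d ON d.user_id = u.user_id
UNION ALL
SELECT u.user_id, 'Manager'
FROM users u INNER JOIN managers m ON m.user_id = u.user_id
ORDER BY 1, 2
"""

# IN over a UNION: one row per user, regardless of how many roles match.
in_union_sql = """
SELECT user_id
FROM users
WHERE user_id IN (SELECT user_id FROM developers
                  UNION
                  SELECT user_id FROM managers)
ORDER BY user_id
"""

union_all_rows = conn.execute(union_all_sql).fetchall()
in_union_rows = conn.execute(in_union_sql).fetchall()
print(union_all_rows)
print(in_union_rows)
```

Which one is preferable depends on whether the role label per row is wanted (UNION ALL) or a plain list of distinct users (IN or EXISTS).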