Performance of JOIN operation - sql

Let's suppose I have two huge tables:
users(user_id, name, country)
location(id, user_id, country, city)
If I want to fetch the rows for a particular country:
1.
select * from users u
inner join location l on u.user_id = l.user_id
where l.country = 'heaven'
2.
select * from users u
inner join location l on u.user_id = l.user_id
where u.country = 'heaven'
3.
select * from users u
inner join location l on u.user_id = l.user_id
where u.country = 'heaven' and l.country = 'heaven'
Which of the three would be the better approach?
Also, suppose that filtering the data by country gives:
the users table has 1000 records with country='heaven'
the location table has 2 records with country='heaven'
What would the performance be then?

Your queries would only return the same results if one of the following is true:
location records did not exist for all users.
users has duplicate user_ids.
It is unclear which applies. And having country in both tables suggests that something is wrong with your data model.
That said, if you need both filtering by the join and filtering on the country, I would recommend:
select *
from users u
join location l
on u.user_id = l.user_id
where l.country = 'heaven';
And you want an index on location(country, user_id). This should find the two records in location and look up the corresponding values in users. That seems like the fastest way to do what you want.
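A minimal sketch of why the three filters are not interchangeable, using Python's built-in sqlite3 module and a tiny made-up data set (the rows, the index name, and the helper function are assumptions for illustration, not the asker's data). It only shows the result sets diverging when the two country columns disagree; it says nothing about timing on huge tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT, country TEXT);
    CREATE TABLE location (id INTEGER PRIMARY KEY, user_id INTEGER, country TEXT, city TEXT);
    -- Index suggested in the answer: filter on country, then join on user_id.
    CREATE INDEX idx_location_country_user ON location (country, user_id);

    INSERT INTO users VALUES (1, 'ann', 'heaven'), (2, 'bob', 'heaven'), (3, 'cat', 'earth');
    INSERT INTO location VALUES
        (10, 1, 'heaven', 'cloud'),  -- both tables say 'heaven'
        (11, 2, 'earth',  'paris'),  -- users says 'heaven', location says 'earth'
        (12, 3, 'heaven', 'gate');   -- users says 'earth', location says 'heaven'
""")

BASE = "SELECT u.user_id FROM users u JOIN location l ON u.user_id = l.user_id WHERE "

def user_ids(where):
    """Run the join with the given WHERE clause and return the sorted user ids."""
    return sorted(row[0] for row in conn.execute(BASE + where))

by_location = user_ids("l.country = 'heaven'")                        # query 1
by_users = user_ids("u.country = 'heaven'")                           # query 2
by_both = user_ids("u.country = 'heaven' AND l.country = 'heaven'")   # query 3
print(by_location, by_users, by_both)
```

So before comparing speed, it is worth deciding which of the two country columns is the one you actually mean to filter on.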

Trying to count the number of occurrences that 3 columns from 2 tables have on my organizations table? I need the occurrences joined in one table

-- 2. In one table, show how many private topics, admins, and standard users each organization has.
SELECT organizations.name, COUNT(topics.privacy) AS private_topic, COUNT(users.type) AS user_admin, COUNT(users.type) AS user_standard
FROM organizations
LEFT JOIN topics
ON organizations.id=topics.org_id
AND topics.privacy='private'
LEFT JOIN users
ON users.org_id=organizations.id
AND users.type='admin'
LEFT JOIN users
ON users.org_id=organizations.id
AND users.type='standard'
GROUP BY organizations.name
;
org_id is the foreign key that relates both the users table and the topics table to organizations. The query keeps giving me the wrong result: it counts only the admins or only the standard users and puts that number in every column of each row. Any help is really appreciated, as I have been stuck on this for a while now!
So, I am getting an error when I do as you said: the users table cannot be specified more than once. I updated the code the way you suggested, but still nothing. They don't give me any sample data either, but I ran some queries and saw, for example, how many private topics there are (privacy is a column of the topics table). When I don't get this error, the joins seem to overwrite each other, so that every column in a row shows the count from the last join.
It appears to me that topics and users have no relationship. You're just trying to get the result together in a single query. There are other and possibly better ways to accomplish that but I think this will fix what you've got already (assuming you have id columns for each table.)
SELECT
organizations.name,
COUNT(DISTINCT topics.id) AS private_topic,
COUNT(DISTINCT users.id) FILTER (WHERE users.type = 'admin') AS user_admin,
COUNT(DISTINCT users.id) FILTER (WHERE users.type = 'standard') AS user_standard
FROM organizations
LEFT JOIN topics
ON organizations.id = topics.org_id AND topics.privacy = 'private'
LEFT JOIN users
ON users.org_id = organizations.id
GROUP BY organizations.name;
I propose this as a more straightforward way:
SELECT
min(o.name) as "name",
(
select count(*) from topics t
where t.org_id = o.id AND t.privacy = 'private'
) as private_topics,
(
select count(*) from users u
where u.org_id = o.id and u.type = 'admin'
) AS user_admin,
(
select count(*) from users u
where u.org_id = o.id and u.type = 'standard'
) AS user_standard
FROM organizations o
GROUP BY o.id;
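To make the difference concrete, here is a small sketch using Python's built-in sqlite3 module with invented rows (the organization names and all counts are assumptions for illustration). It compares a plain COUNT over two unrelated LEFT JOINs, which multiplies topic rows by user rows, with the correlated-subquery form above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE organizations (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE topics (id INTEGER PRIMARY KEY, org_id INTEGER, privacy TEXT);
    CREATE TABLE users (id INTEGER PRIMARY KEY, org_id INTEGER, type TEXT);
    INSERT INTO organizations VALUES (1, 'acme'), (2, 'globex');
    INSERT INTO topics VALUES (1, 1, 'private'), (2, 1, 'private'), (3, 1, 'public');
    INSERT INTO users VALUES (1, 1, 'admin'), (2, 1, 'standard'), (3, 1, 'standard'),
                             (4, 1, 'standard'), (5, 2, 'admin');
""")

# Joining two unrelated child tables pairs every private topic with every user,
# so a plain COUNT sees 2 topics x 4 users = 8 rows for 'acme'.
fan_out = conn.execute("""
    SELECT o.name, COUNT(t.id)
    FROM organizations o
    LEFT JOIN topics t ON t.org_id = o.id AND t.privacy = 'private'
    LEFT JOIN users u ON u.org_id = o.id
    GROUP BY o.id
    ORDER BY o.id
""").fetchall()

# Correlated subqueries count each child table independently of the other.
per_org = conn.execute("""
    SELECT o.name,
           (SELECT COUNT(*) FROM topics t WHERE t.org_id = o.id AND t.privacy = 'private'),
           (SELECT COUNT(*) FROM users u WHERE u.org_id = o.id AND u.type = 'admin'),
           (SELECT COUNT(*) FROM users u WHERE u.org_id = o.id AND u.type = 'standard')
    FROM organizations o
    ORDER BY o.id
""").fetchall()

print(fan_out)
print(per_org)
```

The COUNT(DISTINCT ...) version in the first answer works around the same multiplication by counting each id only once.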

SQL Where on different table

SELECT * FROM student_mentor sm INNER JOIN users u
ON sm.student_id = u.user_id
WHERE sm.teacher_id = $teacher_id
teacher_id is the session id.
I want to see all the students that have the same mentor.
Right now, if I run this, I just see all of the students twice; maybe one of you knows why?
My db scheme
You are not specifying which columns to join on, so you're getting a cross join, where every record in one table is joined to every record in the other.
You should do something like (not sure about your column names):
SELECT * FROM student_mentor sm INNER JOIN users u
ON sm.student_id = u.user_id
WHERE sm.teacher_id = $teacher_id
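As a rough illustration of the difference, the sketch below uses Python's built-in sqlite3 module with made-up rows (the names, ids, and the teacher_id value are assumptions). It also passes the teacher id as a bound parameter rather than interpolating $teacher_id into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE student_mentor (teacher_id INTEGER, student_id INTEGER);
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO student_mentor VALUES (100, 1), (100, 2), (200, 3);
""")

teacher_id = 100  # hypothetical value taken from the session

# No join condition: every matching student_mentor row pairs with every users row.
cross = conn.execute(
    "SELECT u.name FROM student_mentor sm, users u WHERE sm.teacher_id = ?",
    (teacher_id,),
).fetchall()

# With the ON clause: only this mentor's own students remain.
joined = conn.execute(
    """SELECT u.name
       FROM student_mentor sm
       INNER JOIN users u ON sm.student_id = u.user_id
       WHERE sm.teacher_id = ?
       ORDER BY u.name""",
    (teacher_id,),
).fetchall()

print(len(cross), joined)
```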

Is it true that JOINS can be used everywhere to replace Subqueries in SQL

I have heard people say that table joins can replace subqueries everywhere. I tested this on my own query, but I only got the right data set when I used a subquery; I could not get the same data set using joins. I am not sure whether my finding is right, because I am new to RDBMSs and not very experienced. I will try to describe (in words) the schema of the database I was experimenting with:
The database has two tables:
Users (ID, Name, City) and Friendship (ID, Friend_ID)
Goal: The Users table stores simple user data, and the Friendship table represents friendships between users. Both columns of the Friendship table are foreign keys referencing Users.ID. The tables have a many-to-many relationship between them.
Question: I have to retrieve Users.ID and Users.Name of all users who are not friends with a particular user x but are from the same city (much like Facebook's friend suggestion system).
Using a subquery, I am able to achieve this. The query looks like:
SELECT ID, NAME
FROM USERS AS U
WHERE U.ID NOT IN (SELECT FRIENDS_ID
FROM FRIENDSHIP,
USERS
WHERE USERS.ID = FRIENDSHIP.ID AND USERS.ID = x)
AND U.ID != x AND CITY LIKE '% A_CITY%';
Example entries:
Users
ID  Name     City
1   Jon      Mumbai
2   Doe      Mumbai
3   Arun     Mumbai
4   Prakash  Delhi
Friendship
ID  Friends_ID
1   2
2   1
2   3
3   2
Can I get the same data set in a single query by performing joins? How? Please let me know if my question is not clear. Thanks.
Note: I used an inner join in the subquery by listing both tables, Friendship and Users. Omitting the Users table and using U from the outer query gives an error. (If I don't alias the Users table, the query becomes syntactically valid, but its result includes the IDs and names of users who have more than one friend, including the user with ID x. Interesting, but not the topic of this question.)
For NOT IN, you can use a LEFT JOIN and check for IS NULL:
select u.id, u.name
from Users u
left join Friends f on u.id = f.id and f.friend_id = #person
where u.city like '%city%' and f.friend_id is null and u.id <> #person;
There are some cases where you can't work out your way with just inner/left/right joins, but your case is not one of them.
Please check sql fiddle: http://sqlfiddle.com/#!9/1c5b1/14
Also about your note: What you tried to do can be achieved with lateral join or cross apply depending on the engine you are using.
You can rewrite your query using only joins. The trick is to join the Users table to itself with an inner join to identify users within the same city, and to reference the Friendship table with a left join and a null check to identify non-friends.
SELECT
U1.ID,
U1.Name
FROM
USERS U1
INNER JOIN
USERS U2
ON
U1.CITY = U2.CITY
LEFT JOIN
FRIENDSHIP F
ON
U2.ID = F.ID AND
U1.ID = F.FRIEND_ID
WHERE
U2.id = X AND
U1.ID <> U2.id AND
F.id IS NULL
The above query doesn't handle the situation where user x's primary key appears in the FRIEND_ID column of the FRIENDSHIP table. Since your subquery version doesn't handle that situation either, I assume that you either create 2 rows for each friendship or that friendships are not bi-directional.
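A quick way to check that the two forms agree is to run both against the example entries from the question. The sketch below uses Python's built-in sqlite3 module; x = 1 (Jon) and the hard-coded 'Mumbai' filter in the NOT IN version are assumptions made for the test, and the column is named Friend_ID as in the schema description.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (ID INTEGER PRIMARY KEY, Name TEXT, City TEXT);
    CREATE TABLE Friendship (ID INTEGER, Friend_ID INTEGER);
    INSERT INTO Users VALUES (1, 'Jon', 'Mumbai'), (2, 'Doe', 'Mumbai'),
                             (3, 'Arun', 'Mumbai'), (4, 'Prakash', 'Delhi');
    INSERT INTO Friendship VALUES (1, 2), (2, 1), (2, 3), (3, 2);
""")

params = {"x": 1}  # user x is Jon in this sketch

# Subquery form: exclude everyone listed as a friend of x.
not_in = conn.execute("""
    SELECT u.ID, u.Name
    FROM Users u
    WHERE u.ID NOT IN (SELECT Friend_ID FROM Friendship WHERE ID = :x)
      AND u.ID <> :x
      AND u.City = 'Mumbai'
""", params).fetchall()

# Join-only form: self-join on city, then keep rows with no matching friendship.
anti_join = conn.execute("""
    SELECT u1.ID, u1.Name
    FROM Users u1
    INNER JOIN Users u2 ON u1.City = u2.City
    LEFT JOIN Friendship f ON f.ID = u2.ID AND f.Friend_ID = u1.ID
    WHERE u2.ID = :x
      AND u1.ID <> u2.ID
      AND f.ID IS NULL
""", params).fetchall()

print(not_in, anti_join)
```

One difference to keep in mind: if the NOT IN subquery can return a NULL, the NOT IN form returns no rows at all, while the LEFT JOIN form is unaffected.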
Joins and subqueries can be used to achieve similar results in some cases, but certainly not all. As an example, this query with a subquery could not be achieved with a join alone:
SELECT ID, COLUMN1, COUNT(*) FROM MYTABLE
WHERE ID IN (
SELECT DISTINCT ID FROM MYTABLE
WHERE COLUMN2 NOT IN (VALUES1, VALUES2)
)
GROUP BY ID;
This is only one example, but there are many.
Conversely, you cannot get information from another table by using a subquery without joining it.
As to your example
SELECT ID, NAME FROM USERS AS U
WHERE U.ID NOT IN (
SELECT FRIENDS_ID FROM FRIENDSHIP, USERS
WHERE USERS.ID = FRIENDSHIP.ID AND USERS.ID = x)
AND U.ID != x AND CITY LIKE '% A_CITY%';
This could be constructed as:
select ID, NAME from users u
join FRIENDSHIP f on f.ID = u.ID
where u.ID = x
and u.ID != y
and CITY like '%A_CITY';
I changed your second x to a y, on the assumption that it is a different value, so it wouldn't cause confusion.
Of course, you may also want to use a LEFT JOIN (aka LEFT OUTER JOIN) if there is a chance that a user has no matching rows in the FRIENDSHIP table.

Select all entities which do not have children -- getting a lot of rows back, is this correct?

I've got 84,000 rows in my Users table. Users are created automatically. So, I thought it would be nice to see how many users actually did anything after being created. I wrote this query:
SELECT COUNT(*) FROM Users u
JOIN Folders f ON UserId = u.Id
JOIN Playlists p ON FolderId = f.Id
WHERE 0 = (SELECT COUNT(*) FROM PlaylistItems WHERE PlaylistId = p.Id)
My intent is to only count users which have no playlist items in any of their playlists. This query returned 74,000 results which seems high.
I'm wondering if this query is selecting all users which have at least one playlist with no items in it. That is, if a user has two playlists -- one empty and one populated -- are they still counted in my query? And, if so, how can I modify it to select only users which have only empty playlists.
If that's vastly more difficult then I might try my hand at counting only users with 1 playlist which is empty.
The database structure is:
Many users. 1:1 user:folder, 1:many folder:playlists, 1:many playlists:playlistItems
A better pattern than counting every single playlist and comparing is simply finding all the users who don't have anything in any playlist. I like NOT EXISTS for this:
SELECT COUNT(u.Id)
FROM dbo.Users AS u
WHERE NOT EXISTS
(
SELECT 1 FROM dbo.PlayLists AS pl
INNER JOIN dbo.PlayListItems AS pli
ON pl.id = pli.PlayListID
INNER JOIN dbo.Folders AS f
ON pl.FolderID = f.ID
WHERE f.UserID = u.Id
);
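To see how the two approaches differ, here is a small sketch using Python's built-in sqlite3 module with invented rows (three users; the table contents are assumptions, and the dbo. schema prefix is dropped because SQLite has no such schema). The first query is the question's per-playlist check; the second is the NOT EXISTS pattern.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (Id INTEGER PRIMARY KEY);
    CREATE TABLE Folders (Id INTEGER PRIMARY KEY, UserId INTEGER);
    CREATE TABLE Playlists (Id INTEGER PRIMARY KEY, FolderId INTEGER);
    CREATE TABLE PlaylistItems (Id INTEGER PRIMARY KEY, PlaylistId INTEGER);
    INSERT INTO Users VALUES (1), (2), (3);
    INSERT INTO Folders VALUES (1, 1), (2, 2), (3, 3);
    -- User 1: one empty and one populated playlist; user 2: one empty; user 3: one populated.
    INSERT INTO Playlists VALUES (1, 1), (2, 1), (3, 2), (4, 3);
    INSERT INTO PlaylistItems VALUES (1, 2), (2, 4);
""")

# The question's query: counts one row per empty playlist, so user 1 is included.
per_playlist = conn.execute("""
    SELECT COUNT(*) FROM Users u
    JOIN Folders f ON f.UserId = u.Id
    JOIN Playlists p ON p.FolderId = f.Id
    WHERE 0 = (SELECT COUNT(*) FROM PlaylistItems i WHERE i.PlaylistId = p.Id)
""").fetchone()[0]

# NOT EXISTS: counts users for whom no playlist item exists anywhere.
no_items_at_all = conn.execute("""
    SELECT COUNT(u.Id) FROM Users u
    WHERE NOT EXISTS (
        SELECT 1 FROM Playlists pl
        INNER JOIN PlaylistItems pli ON pl.Id = pli.PlaylistId
        INNER JOIN Folders f ON pl.FolderId = f.Id
        WHERE f.UserId = u.Id
    )
""").fetchone()[0]

print(per_playlist, no_items_at_all)
```

On this data the original query counts two rows (one of them for a user who also has a populated playlist), while NOT EXISTS counts only the one user with no items at all.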
As an aside, calling a column Id in its primary table and something else everywhere else might seem like a good idea, but I find it quite confusing. Why isn't a FolderID called a FolderID everywhere in the data model?
Break down your query:
SELECT u.id, COUNT(*) FROM Users u
JOIN Folders f ON UserId = u.Id
JOIN Playlists p ON FolderId = f.Id
join PlaylistItems on PlaylistId = p.Id
group by u.id
This should give you a list of all users that have playlist items, along with the number of joined playlist item rows per user ID. From there, there are a couple of ways to go...
Take a count of all users not in that list:
select count(*) from users where id not in (SELECT u.id FROM Users u
JOIN Folders f ON UserId = u.Id
JOIN Playlists p ON FolderId = f.Id
join PlaylistItems on PlaylistId = p.Id
group by u.id)
MySQL performs poorly on that. Here is the same thing using a left join:
select count(*)
from users u left join (SELECT u.id, COUNT(*) FROM Users u
JOIN Folders f ON UserId = u.Id
JOIN Playlists p ON FolderId = f.Id
join PlaylistItems on PlaylistId = p.Id
group by u.id)a
on a.id = u.id
where a.id is null

How to do joins with conditions?

I always struggle with joins within Access. Can someone guide me?
4 tables.
Contest (id, user_id, pageviews)
Users (id, role_name, location)
Roles (id, role_name, type1, type2, type3)
Locations (id, location_name, city, state)
Regarding the Roles table: type1, type2, and type3 hold a Y if role_name is of that type. So the "Regular" role_name has a Y in type1, the "Moderator" role_name has a Y in type2, and the "Admin" role_name has a Y in type3. I didn't design this database.
Here is what I'm trying to do: I want to output user_id, pageviews, role_name, city, state.
I'm selecting the user_id and pageviews from Contest. I then need to get the role_name of this user, so I need to join the Users table to the Contest table, right?
From there, I need to also select the location information from the Locations table -- I assume I just join on Locations.location_name = Users.location?
Here is the tricky part. I only want to output if type1, within the Roles table, is Y.
I'm lost!
As far as I can see, this is a query that can be built in the query design window, because you do not seem to need left joins or any other modifications, so:
SELECT Contest.user_id,
Contest.pageviews,
Roles.role_name,
Locations.city,
Locations.state
FROM ((Contest
INNER JOIN Users
ON Contest.user_id = Users.id)
INNER JOIN Roles
ON Users.role_name = Roles.role_name)
INNER JOIN Locations
ON Users.location = Locations.location_name
WHERE Roles.type1="Y"
Lots of parentheses :)
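The join logic itself can be checked outside Access. The sketch below uses Python's built-in sqlite3 module with invented rows (the 'HQ' location, 'Austin', 'TX', and the pageview numbers are assumptions). SQLite does not need the nested parentheses that Access requires, so they are omitted here, and the string literal uses single quotes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Contest (id INTEGER PRIMARY KEY, user_id INTEGER, pageviews INTEGER);
    CREATE TABLE Users (id INTEGER PRIMARY KEY, role_name TEXT, location TEXT);
    CREATE TABLE Roles (id INTEGER PRIMARY KEY, role_name TEXT,
                        type1 TEXT, type2 TEXT, type3 TEXT);
    CREATE TABLE Locations (id INTEGER PRIMARY KEY, location_name TEXT, city TEXT, state TEXT);
    INSERT INTO Roles VALUES (1, 'Regular', 'Y', NULL, NULL), (2, 'Moderator', NULL, 'Y', NULL);
    INSERT INTO Locations VALUES (1, 'HQ', 'Austin', 'TX');
    INSERT INTO Users VALUES (1, 'Regular', 'HQ'), (2, 'Moderator', 'HQ');
    INSERT INTO Contest VALUES (1, 1, 50), (2, 2, 70);
""")

# Same joins as the Access query; only users whose role has type1 = 'Y' survive.
rows = conn.execute("""
    SELECT Contest.user_id, Contest.pageviews, Roles.role_name,
           Locations.city, Locations.state
    FROM Contest
    INNER JOIN Users ON Contest.user_id = Users.id
    INNER JOIN Roles ON Users.role_name = Roles.role_name
    INNER JOIN Locations ON Users.location = Locations.location_name
    WHERE Roles.type1 = 'Y'
""").fetchall()

print(rows)
```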
select *
from users u
inner join contest c on u.id = c.user_id
inner join locations l on l.id = u.location
inner join roles r on r.role_name = u.role_name
where r.type1 = 'Y'
This assumes that location in users refers to the location id; if it holds the location name, it has to be joined to that column in the locations table instead.
EDIT: The accepted answer is better; I did not consider that Access needs parentheses.
Can you show what query you are currently using? Can't you just join on role_name and just ignore the type1, type2, type3? I am assuming there are just those 3 role_names available.
I know you didn't design it, but can you change the structure? Sometimes it's better to move to a sturdy foundation rather than living in the house that is about to fall on your head.
SELECT c.user_id, c.pageviews,
IIF(r.role_name = "Moderator", r.type1 = "Y",
IIF(r.role_name = "Admin", r.type2 = "Y", r.type3 = "Y")),
l.location_name FROM users as u
INNER JOIN roles as r On (u.role_name = r.role_name)
INNER JOIN contest as c On (c.user_id = u.Id)
INNER JOIN locations as l On (u.location = l.location_name or l.id)
depending on whether the location in your user table is an id or the actual name reference.
I think I need to see some sample data. I do not understand the relationship between Users and Roles: there is a role_name field within the Users table, so how does that relate to the Roles table?
EDIT NOTE: Now using explicit join syntax, which is SQL best practice.
SELECT
C.user_id
, C.pageviews
, U.role_name
, L.city
, L.state
FROM
Contest C
INNER JOIN Users U ON C.user_id = U.id
INNER JOIN Locations L ON U.location = L.id
INNER JOIN Roles R ON U.role_name = R.role_name
WHERE
R.type1='Y'