SQL Server left join same table twice I'm getting duplicate rows


I have this query:
SELECT
u.UserId,
up.PhoneNumber AS OfficePhoneNumber,
up2.PhoneNumber
FROM
[OnlineTools].[App].[User] AS u
LEFT JOIN
[UserPhone] AS up ON up.UserId = u.UserId
AND up.PhoneType = 'Work'
LEFT JOIN
[UserPhone] AS up2 ON up2.UserId = u.UserId
AND up2.PhoneType = 'Mobile'
The expected result is three records, and that is what I get when I LEFT JOIN the UserPhone table only once.
When I join the same table a second time to get the mobile phones, I get 18 records instead of three.
What can I change to get the correct records?

You are effectively getting a Cartesian product per user. Because both joins match on UserId only, every "Work" row for a user is paired with every "Mobile" row for that same user, so a user with m work phones and n mobile phones produces m × n rows.
I would suggest joining ONCE to the phone table, filtering on just the two types, and adding a column to SHOW which type each row is. Something like
SELECT
u.UserId,
up.PhoneNumber,
up.PhoneType
FROM
[OnlineTools].[App].[User] u
JOIN UserPhone up
ON u.UserId = up.UserId
AND up.PhoneType in ( 'Work', 'Mobile' )
This way, each phone is returned as a single row, and the PhoneType column shows whether it is a work or a mobile number. Now, if you want the work phones listed first, just add
order by
up.PhoneType DESC
This sorts all Work phones first, then any Mobile ones ('Work' sorts after 'Mobile' alphabetically, hence DESC). If there are no work phones, only the mobile ones will show (or vice versa).
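To make the row multiplication concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server. The table name Users and the sample phone numbers are assumptions for illustration only; the two queries mirror the double join from the question and the single join suggested above.

```python
import sqlite3

# Hypothetical sample data: user 1 has 2 work phones and 3 mobile phones.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (UserId INTEGER PRIMARY KEY);
    CREATE TABLE UserPhone (UserId INTEGER, PhoneType TEXT, PhoneNumber TEXT);
    INSERT INTO Users VALUES (1);
    INSERT INTO UserPhone VALUES
        (1, 'Work', '100'), (1, 'Work', '101'),
        (1, 'Mobile', '200'), (1, 'Mobile', '201'), (1, 'Mobile', '202');
""")

# Joining the phone table twice pairs every work row with every mobile row.
double_join = conn.execute("""
    SELECT u.UserId, up.PhoneNumber, up2.PhoneNumber
    FROM Users AS u
    LEFT JOIN UserPhone AS up  ON up.UserId  = u.UserId AND up.PhoneType  = 'Work'
    LEFT JOIN UserPhone AS up2 ON up2.UserId = u.UserId AND up2.PhoneType = 'Mobile'
""").fetchall()

# Joining once returns one row per phone, with the type as its own column.
single_join = conn.execute("""
    SELECT u.UserId, up.PhoneNumber, up.PhoneType
    FROM Users AS u
    JOIN UserPhone AS up
      ON up.UserId = u.UserId AND up.PhoneType IN ('Work', 'Mobile')
    ORDER BY up.PhoneType DESC, up.PhoneNumber
""").fetchall()

print(len(double_join))  # 2 work x 3 mobile = 6 rows
print(len(single_join))  # 2 + 3 = 5 rows
```

If the question's three users each had several numbers of both types, this kind of multiplication would explain how three expected rows turn into 18.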

Assuming that each user has at most one work phone and one mobile phone, you could use conditional aggregation instead of the two joins:
SELECT u.UserId,
MAX(CASE WHEN up.PhoneType = 'Work' THEN up.PhoneNumber END) AS OfficePhoneNumber,
MAX(CASE WHEN up.PhoneType = 'Mobile' THEN up.PhoneNumber END) AS MobilePhoneNumber
FROM [OnlineTools].[App].[User] AS u LEFT JOIN [UserPhone] AS up
ON up.UserId = u.UserId
GROUP BY u.UserId;
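As a rough check of what conditional aggregation returns, the following sketch runs the same query shape in SQLite through Python's sqlite3 module. The Users table name and the sample rows are assumptions; user 2 has only a mobile phone and user 3 has no phones at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (UserId INTEGER PRIMARY KEY);
    CREATE TABLE UserPhone (UserId INTEGER, PhoneType TEXT, PhoneNumber TEXT);
    INSERT INTO Users VALUES (1), (2), (3);
    INSERT INTO UserPhone VALUES
        (1, 'Work', '100'), (1, 'Mobile', '200'),
        (2, 'Mobile', '201');
""")

# One pass over UserPhone; each CASE keeps only the rows of one phone type,
# and MAX() collapses them to a single value per user (NULL when none exist).
pivoted = conn.execute("""
    SELECT u.UserId,
           MAX(CASE WHEN up.PhoneType = 'Work'   THEN up.PhoneNumber END) AS OfficePhoneNumber,
           MAX(CASE WHEN up.PhoneType = 'Mobile' THEN up.PhoneNumber END) AS MobilePhoneNumber
    FROM Users AS u
    LEFT JOIN UserPhone AS up ON up.UserId = u.UserId
    GROUP BY u.UserId
    ORDER BY u.UserId
""").fetchall()

for row in pivoted:
    print(row)
```

Each user appears exactly once, and missing phone types come back as NULL (None in Python). If a user had two work phones, MAX() would silently keep only one of them, which is why the one-phone-per-type assumption matters.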

Related

Only way to write this SQL JOIN question?

I wrote this SQL query and it seems to work well, but I'm not sure whether it is the correct way to write it or whether there is a better way:
SELECT
art.artid, users.userid
FROM
art LEFT JOIN users
ON
art.userid = users.userid
WHERE
(SELECT COUNT(1) FROM art WHERE art.userid = users.userid) > 5 AND
users.active = '1' AND
art.active = '1' AND
art.status = '0' AND
art.pricesek > 0 GROUP BY users.userid ORDER BY RAND()
It gets the users from the users table that are active and have 5 or more artworks in the art table. It also checks that the artwork is active, that its status is 0 ("for sale"), and that its price is more than 0. Then it groups the results by userid and returns them in random order.
Is this the correct way to write this, or is there another way?
All input is hardcoded, so no user input is sent to the database and I'm not worried about injection (should I be worried even if it's hardcoded?).

I made a small change to your code: instead of using (SELECT COUNT(1) FROM art WHERE art.userid = users.userid) > 5 in the WHERE clause, I moved the count into a HAVING clause.
SELECT MIN(art.artid) AS artid, users.userid
FROM art LEFT JOIN users ON art.userid = users.userid
WHERE users.active = '1' AND art.active = '1' AND
art.status = '0' AND art.pricesek > 0
GROUP BY users.userid -- one group per user; also grouping by art.artid would leave one row per group
HAVING COUNT(*) > 5
ORDER BY RAND()

Your query has problems at several levels. The most obvious is that the GROUP BY clause is inconsistent with the SELECT: art.artid is selected but is neither grouped nor aggregated. That should be generating an error.
You describe the goal as: "It gets the users from users table that are active and has 5 or more artworks in the art table."
I would instead suggest aggregating the art table before joining:
SELECT u.userid
FROM users u JOIN
(SELECT a.userid, COUNT(*) as cnt
FROM art a
WHERE a.active = 1 AND
a.status = 0 AND
a.pricesek > 0
GROUP BY a.userid
) a
ON a.userid = u.userid
WHERE a.cnt > 5 AND u.active = 1
ORDER BY RAND();
Notes:
LEFT JOIN is not appropriate. In order to count the number of artworks, the JOIN must find at least 1 (really 6) matching rows.
It makes no sense to return a.artid. If you need an example, you could use MIN(a.artid) in the subquery. If you want all of them, you would need to specify how to return them, but a JSON, array, or string aggregation function would be used in the subquery.
The values '1' and '0' look like numbers, so I removed the single quotes on the assumption that the columns are numeric. Compare numbers to numbers and strings to strings. Try to avoid mixing the two.
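The aggregate-before-join idea can be tried out with a small sketch in Python's sqlite3 module. The sample rows and the example_artid column (using MIN(artid), as mentioned in the notes) are assumptions, and ORDER BY RAND() is left out so the result is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (userid INTEGER PRIMARY KEY, active INTEGER);
    CREATE TABLE art (artid INTEGER PRIMARY KEY, userid INTEGER,
                      active INTEGER, status INTEGER, pricesek INTEGER);
    INSERT INTO users VALUES (1, 1), (2, 1), (3, 0);
""")

# Hypothetical data: user 1 has 6 qualifying artworks, user 2 has 3,
# and user 3 has 6 but is not an active user.
rows = [(uid, 1, 0, 100) for uid, n in ((1, 6), (2, 3), (3, 6)) for _ in range(n)]
conn.executemany(
    "INSERT INTO art (userid, active, status, pricesek) VALUES (?, ?, ?, ?)", rows)

# Count the qualifying artworks per user first, then join the totals to users.
qualifying = conn.execute("""
    SELECT u.userid, a.cnt, a.example_artid
    FROM users AS u
    JOIN (SELECT userid, COUNT(*) AS cnt, MIN(artid) AS example_artid
          FROM art
          WHERE active = 1 AND status = 0 AND pricesek > 0
          GROUP BY userid) AS a
      ON a.userid = u.userid
    WHERE a.cnt > 5 AND u.active = 1
""").fetchall()

print(qualifying)
```

Only user 1 survives both filters. Because the subquery already returns one row per user, the outer query needs no GROUP BY at all.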

Include 0 in count(*) SQL query

I have two entities, User and MaBase. MaBase contains user_id and status. I want to get the count of each status per user, and I also want to show a 0 for any status value for which the user has no record.
I created the query below using COUNT, but it only returns statuses that have at least one row. How can I solve this?
SELECT status, COUNT(*)
FROM ma_base
WHERE ma_base.user_id = 5
GROUP BY status
I have 5 types of status values. If a user only has ma_base records for 4 of them, I still want to see a 0 value for the 5th status.

It's not every day I get to write a CROSS JOIN:
SELECT u.ID, s.status,
coalesce((SELECT COUNT(*) FROM ma_base m WHERE m.User_Id = u.ID and m.status = s.Status),0) As Status_Count
FROM User u
CROSS JOIN (SELECT DISTINCT status FROM MA_Base) s
WHERE u.ID = 5
OR:
SELECT u.ID, s.status, COALESCE(COUNT(m.status), 0) AS Status_Count
FROM User u
CROSS JOIN (SELECT DISTINCT status FROM MA_Base) s
LEFT JOIN MA_Base m ON m.User_Id = u.ID AND m.status = s.status
WHERE u.ID = 5
GROUP BY u.ID, s.status
In a nutshell, we first need to create a projection for the user with every possible status value, to anchor the result records for your "missing" statuses. Then we can JOIN or do a correlated subquery to get your desired results.
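For the JOIN option, note the expression in the COUNT() function. It's important: COUNT(m.status) skips the NULLs produced by unmatched LEFT JOIN rows and so yields 0, whereas COUNT(*) would count the unmatched row itself and yield 1. In both options COALESCE() is only a safeguard, since COUNT() returns 0 rather than NULL when nothing matches.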
If you have a separate table defining your status values, use that instead of deriving them from ma_base.
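The difference between counting a column and counting rows in the JOIN option is easy to see on a tiny data set. This is a sketch in Python's sqlite3 module; the table name users (instead of User) and the status values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE ma_base (user_id INTEGER, status TEXT);
    INSERT INTO users VALUES (5), (6);
    INSERT INTO ma_base VALUES (5, 'new'), (5, 'new'), (5, 'open'), (6, 'closed');
""")

# The CROSS JOIN builds one row per (user, status); the LEFT JOIN then
# attaches the user's matching ma_base rows, or NULLs when there are none.
query = """
    SELECT u.id, s.status, COUNT({target}) AS status_count
    FROM users AS u
    CROSS JOIN (SELECT DISTINCT status FROM ma_base) AS s
    LEFT JOIN ma_base AS m ON m.user_id = u.id AND m.status = s.status
    WHERE u.id = 5
    GROUP BY u.id, s.status
    ORDER BY s.status
"""
counts_by_column = conn.execute(query.format(target="m.status")).fetchall()
counts_by_star = conn.execute(query.format(target="*")).fetchall()

print(counts_by_column)  # 'closed' is reported with a count of 0
print(counts_by_star)    # 'closed' is wrongly reported with a count of 1
```

User 5 has no 'closed' rows, so only the COUNT(m.status) version reports the 0 that the question asks for.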

Too much Data using DISTINCT MAX

I want to see the last activity of each individual handset and the user that used that handset. I have a table UserSessions that stores the last activity of a particular user as well as which handset they used in that activity. There are roughly 40 handsets, yet I always get back far too many records, around 10,000 rows, when I only want the last activity of each handset. What am I doing wrong?
SELECT DISTINCT MAX(UserSessions.LastActivity), Handsets.Name,Users.Username
FROM UserSessions
INNER JOIN Handsets on Handsets.HandsetId = UserSessions.HandsetId
INNER JOIN Users on Users.UserId = UserSessions.UserId
WHERE
Handsets.Name in (1000,1001,1002,1003,1004....)
AND Handsets.Deleted = 0
GROUP BY UserSessions.LastActivity, Handsets.Name,Users.Username
I expect to get one record per handset, showing the user's last activity with that handset. What I get instead is multiple records for every handset and date, over 10,000 rows.

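You typically GROUP BY the same columns as you SELECT, except those that are arguments to aggregate functions. Grouping by UserSessions.LastActivity as well is what produces one row per activity timestamp.
This GROUP BY returns no duplicates, so SELECT DISTINCT isn't needed.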
SELECT MAX(UserSessions.LastActivity), Handsets.Name, Users.Username
FROM UserSessions
INNER JOIN Handsets on Handsets.HandsetId = UserSessions.HandsetId
INNER JOIN Users on Users.UserId = UserSessions.UserId
WHERE Handsets.Name in (1000,1001,1002,1003,1004....)
AND Handsets.Deleted = 0
GROUP BY Handsets.Name, Users.Username

There is no such thing as DISTINCT MAX. There is SELECT DISTINCT, which ensures that the combination of columns in the SELECT is not duplicated across rows, and there is MAX(), an aggregate function.
As a note: SELECT DISTINCT is almost never appropriate with GROUP BY.
You seem to want:
SELECT *
FROM (SELECT h.Name, u.Username, MAX(us.LastActivity) as last_activity,
RANK() OVER (PARTITION BY h.Name ORDER BY MAX(us.LastActivity) desc) as seqnum
FROM UserSessions us JOIN
Handsets h
ON h.HandsetId = us.HandsetId INNER JOIN
Users u
ON u.UserId = us.UserId
WHERE h.Name in (1000,1001,1002,1003,1004....) AND
h.Deleted = 0
GROUP BY h.Name, u.Username
) h
WHERE seqnum = 1
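Here is a small sketch of the ranking approach using Python's sqlite3 module, which needs an SQLite build with window-function support (3.25 or later). The sample sessions are made up, and the aggregation is placed in its own derived table before RANK() is applied, which is a slightly more verbose arrangement than the query above but gives the same result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Handsets (HandsetId INTEGER PRIMARY KEY, Name TEXT, Deleted INTEGER);
    CREATE TABLE Users (UserId INTEGER PRIMARY KEY, Username TEXT);
    CREATE TABLE UserSessions (UserId INTEGER, HandsetId INTEGER, LastActivity TEXT);
    INSERT INTO Handsets VALUES (1, '1000', 0), (2, '1001', 0);
    INSERT INTO Users VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO UserSessions VALUES
        (1, 1, '2024-01-01'), (2, 1, '2024-01-05'), (1, 1, '2024-01-03'),
        (1, 2, '2024-02-01'), (2, 2, '2024-01-20');
""")

# Inner query: latest activity per (handset, user).
# Middle query: rank the users of each handset by that latest activity.
# Outer query: keep only the top-ranked user per handset.
latest = conn.execute("""
    SELECT Name, Username, last_activity
    FROM (SELECT g.*,
                 RANK() OVER (PARTITION BY Name ORDER BY last_activity DESC) AS seqnum
          FROM (SELECT h.Name, u.Username, MAX(us.LastActivity) AS last_activity
                FROM UserSessions AS us
                JOIN Handsets AS h ON h.HandsetId = us.HandsetId
                JOIN Users AS u ON u.UserId = us.UserId
                WHERE h.Deleted = 0
                GROUP BY h.Name, u.Username) AS g) AS ranked
    WHERE seqnum = 1
    ORDER BY Name
""").fetchall()

print(latest)
```

Note that RANK() returns ties, so two users with the same latest timestamp on one handset would both be listed; ROW_NUMBER() would pick exactly one.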

Using binary logic in PostgreSQL JOIN queries

I've got 3 tables that look vaguely like this:
Users
----------
UserID
Name
Phone

User Groups
-----------
GroupID
Activity

Group Membership
---------------
UserID
GroupID

Independent Actives
-------------------
UserID
Activity
The idea is that a user can perform an activity either as part of a group or on their own. What I want to do is return all the people that partake in a certain activity. What I have been able to write so far lets me return all the users which are in groups that undertake that activity. What I want to add to this is the ability to see the people that do the activity independently. This is what I have so far:
SELECT
users.name, users.phone, user_groups.activity
FROM users
INNER JOIN group_membership ON group_membership.userID = users.userID
INNER JOIN user_groups ON user_groups.groupID = group_membership.groupID
WHERE user_groups.activity = 'Knitting';
The above bit works fine and it shows all of the users that are part of groups that do knitting, but I also want it to show all the users that are knitting independently. This is what I have attempted to add:
SELECT
users.name, users.phone, user_groups.activity
FROM users
INNER JOIN group_membership ON group_membership.userID = users.userID
INNER JOIN user_groups ON user_groups.groupID = group_membership.groupID
INNER JOIN independent_activity ON independent_activity.userID = users.userID
WHERE user_groups.activity = 'Knitting' OR independent_activity.activity = 'Knitting';
The problem here is the syntax: I understand the logic I'm after, but I don't know how to express it in SQL, so any help is appreciated.

You could use a UNION in this case
SELECT users.NAME
,users.phone
,user_groups.activity
FROM users
INNER JOIN group_membership ON group_membership.userID = users.userID
INNER JOIN user_groups ON user_groups.groupID = group_membership.groupID
WHERE user_groups.activity = 'Knitting'
UNION
SELECT users.NAME
,users.phone
,independent_activity.activity
FROM users
INNER JOIN independent_activity ON independent_activity.userID = users.userID
WHERE independent_activity.activity = 'Knitting';
You might also want to look up the difference between UNION and UNION ALL and decide which one suits your requirement.
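For a quick feel of that difference, this sketch runs the same two-part query with both operators in SQLite via Python's sqlite3 module. The sample rows are assumptions, and the phone column is dropped to keep it short; ann knits both in a group and independently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (userid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user_groups (groupid INTEGER PRIMARY KEY, activity TEXT);
    CREATE TABLE group_membership (userid INTEGER, groupid INTEGER);
    CREATE TABLE independent_activity (userid INTEGER, activity TEXT);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO user_groups VALUES (10, 'Knitting');
    INSERT INTO group_membership VALUES (1, 10);
    INSERT INTO independent_activity VALUES (1, 'Knitting'), (2, 'Knitting');
""")

template = """
    SELECT users.name, user_groups.activity
    FROM users
    JOIN group_membership ON group_membership.userid = users.userid
    JOIN user_groups ON user_groups.groupid = group_membership.groupid
    WHERE user_groups.activity = 'Knitting'
    {op}
    SELECT users.name, independent_activity.activity
    FROM users
    JOIN independent_activity ON independent_activity.userid = users.userid
    WHERE independent_activity.activity = 'Knitting'
"""
# UNION removes duplicate rows across both halves; UNION ALL keeps them.
union_rows = sorted(conn.execute(template.format(op="UNION")).fetchall())
union_all_rows = sorted(conn.execute(template.format(op="UNION ALL")).fetchall())

print(union_rows)      # ann appears once
print(union_all_rows)  # ann appears twice
```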

You've got a working answer from SoulTrain. However, for completeness' sake I'd like to mention that you don't have to join all those tables. (You could use outer joins here and remove duplicate matches with DISTINCT, but that's not necessary. You don't have to query the users table twice either, and you don't need UNION to do the de-duplication.)
Simply select from the one table you want to display data from, i.e. the users table, and then use EXISTS or IN to keep only those users that are in one set or the other.
select name, phone
from users
where userid in
(
select userid
from independent_actives
where activity = 'Knitting'
)
or userid in
(
select userid
from group_membership
where groupid in (select groupid from user_groups where activity = 'Knitting')
)
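The IN-based filter can be checked the same way. This sketch uses Python's sqlite3 module with made-up rows; the table is named independent_activity here, following the question's query. ann knits both in a group and on her own, bob only on his own, and carl only plays chess.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (userid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user_groups (groupid INTEGER PRIMARY KEY, activity TEXT);
    CREATE TABLE group_membership (userid INTEGER, groupid INTEGER);
    CREATE TABLE independent_activity (userid INTEGER, activity TEXT);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'carl');
    INSERT INTO user_groups VALUES (10, 'Knitting'), (20, 'Chess');
    INSERT INTO group_membership VALUES (1, 10), (3, 20);
    INSERT INTO independent_activity VALUES (1, 'Knitting'), (2, 'Knitting');
""")

# users is read once; each IN only tests membership, so a user who matches
# both conditions is still returned a single time.
knitters = conn.execute("""
    SELECT name
    FROM users
    WHERE userid IN (SELECT userid FROM independent_activity
                     WHERE activity = 'Knitting')
       OR userid IN (SELECT userid FROM group_membership
                     WHERE groupid IN (SELECT groupid FROM user_groups
                                       WHERE activity = 'Knitting'))
    ORDER BY name
""").fetchall()

print(knitters)
```

No DISTINCT or UNION is needed here because the subqueries never add rows to the result; they only decide whether a users row is kept.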

how can i eliminate duplicates in gridview?

I am retrieving data from three tables, so I wrote the following query.
I get the correct result, but the records are repeated. What is the problem with the query?
I am binding the result of the query to a GridView control. Please help.
SELECT DISTINCT (tc.coursename), ur.username, uc.[date], 'Paid' AS Status
FROM tblcourse tc, tblusereg ur, dbo.UserCourse uc
WHERE tc.courseid IN (SELECT ur1.courseid
FROM dbo.UserCourse ur1
WHERE ur1.userid = @userid)
AND ur.userid = @userid
AND uc.[date] IS NOT NULL
AND ur.[course-id] = uc.[course-id]

There is no join condition between tblcourse tc and tblusereg ur, so you get a cross join despite the IN (which is effectively a JOIN).
DISTINCT works on the whole row, not on one column.
Note: you reference dbo.UserCourse twice but use different column names, courseid and [course-id].
Rewritten with JOINs:
select distinct
tc.coursename, ur.username, uc.[date], 'Paid' as [Status]
from
dbo.tblcourse tc
JOIN
dbo.tblusereg ur ON tc.courseid = ur.[course-id]
JOIN
dbo.UserCourse uc ON ur.[course-id] = uc.[course-id]
where
ur.userid = @userid
and
uc.[date] is not null
This may fix your problem...
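To see how a missing join condition repeats rows, here is a reduced sketch in Python's sqlite3 module. It uses only two of the three tables, and the column name paid_on and the sample rows are assumptions; user 7 is enrolled in two of the three courses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblcourse (courseid INTEGER PRIMARY KEY, coursename TEXT);
    CREATE TABLE UserCourse (userid INTEGER, courseid INTEGER, paid_on TEXT);
    INSERT INTO tblcourse VALUES (1, 'SQL'), (2, 'C#'), (3, 'ASP.NET');
    INSERT INTO UserCourse VALUES (7, 1, '2024-01-01'), (7, 2, '2024-01-02');
""")

# Comma join: nothing links tc to uc, so every selected course is paired
# with every enrolment row of the user.
implicit = conn.execute("""
    SELECT tc.coursename, uc.paid_on
    FROM tblcourse tc, UserCourse uc
    WHERE tc.courseid IN (SELECT courseid FROM UserCourse WHERE userid = ?)
      AND uc.userid = ?
""", (7, 7)).fetchall()

# Explicit join: each enrolment row is matched to its own course only.
explicit = conn.execute("""
    SELECT tc.coursename, uc.paid_on
    FROM tblcourse tc
    JOIN UserCourse uc ON uc.courseid = tc.courseid
    WHERE uc.userid = ?
    ORDER BY tc.courseid
""", (7,)).fetchall()

print(len(implicit))  # 2 courses x 2 enrolments = 4 rows
print(explicit)
```

DISTINCT cannot repair the first query, because the mismatched pairs (for example the SQL course with the C# payment date) are distinct rows.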

Change that first part of your query
select distinct (tc.coursename),
TO
select distinct tc.coursename,
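The parentheses have no effect: DISTINCT always applies to every column in the select list, not just tc.coursename, so the second form simply makes that explicit.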
