SQL report duplicated rows into count - sql

First of all, let me tell you that I have no experience in SQL whatsoever. However, I changed positions lately, and given the situation it'd be easier for me to run a script than to check each record in the application individually. Here's the scenario:
I have two tables:
Users with userID, username, email etc.. and
Documents with DocumentID and UserID, document name and again some other columns.
I want to create a report that will help me check if users have documents attached to their profile.
What I am doing now is
SELECT UserTable.UserID,
       DocumentTable.DocumentID,
       DocumentTable.UserID
FROM UserTable
LEFT JOIN DocumentTable ON UserTable.UserID = DocumentTable.UserID
The problem I am having is that some users already have 2 or more documents attached to their profile, which causes duplicated rows.
For example, in the report I see such rows
User1 DocumentA
User2 DocumentA
User2 DocumentB
User2 DocumentC
User3 DocumentA
etc.
Is there a way to convert those document rows into a count per UserID? Instead, I'd like to see
User1 1
User2 3
User3 1

You are looking for GROUP BY. I would recommend writing the query as:
SELECT ut.UserID, COUNT(dt.UserID)
FROM UserTable ut LEFT JOIN
DocumentTable dt
ON ut.UserID = dt.UserID
GROUP BY ut.UserID
ORDER BY ut.UserID;
Notes:
The use of table aliases makes the query easier to write and to read.
The ORDER BY guarantees that the results are ordered by the user id.
The COUNT() is based on a column from the second table, because there might not be a match: COUNT(dt.UserID) ignores NULLs, so a user with no documents gets 0 rather than 1.
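The effect of counting a column from the second table can be checked with a small sketch. This is only an illustration: it uses Python's built-in sqlite3 module with an in-memory database, and the sample rows (including User4, who has no documents) are hypothetical data made up to mirror the question.

```python
import sqlite3

# Hypothetical sample data mirroring the question; User4 has no documents.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE UserTable (UserID INTEGER PRIMARY KEY, UserName TEXT);
    CREATE TABLE DocumentTable (
        DocumentID INTEGER PRIMARY KEY,
        UserID INTEGER,
        DocumentName TEXT
    );
    INSERT INTO UserTable VALUES
        (1, 'User1'), (2, 'User2'), (3, 'User3'), (4, 'User4');
    INSERT INTO DocumentTable (UserID, DocumentName) VALUES
        (1, 'DocumentA'),
        (2, 'DocumentA'), (2, 'DocumentB'), (2, 'DocumentC'),
        (3, 'DocumentA');
""")

# COUNT(dt.UserID) skips the NULLs produced by unmatched LEFT JOIN rows.
document_counts = conn.execute("""
    SELECT ut.UserID, COUNT(dt.UserID)
    FROM UserTable ut
    LEFT JOIN DocumentTable dt ON ut.UserID = dt.UserID
    GROUP BY ut.UserID
    ORDER BY ut.UserID
""").fetchall()

# COUNT(*) counts rows, so the unmatched user is reported as 1.
star_counts = conn.execute("""
    SELECT ut.UserID, COUNT(*)
    FROM UserTable ut
    LEFT JOIN DocumentTable dt ON ut.UserID = dt.UserID
    GROUP BY ut.UserID
    ORDER BY ut.UserID
""").fetchall()

print(document_counts)  # [(1, 1), (2, 3), (3, 1), (4, 0)]
print(star_counts)      # [(1, 1), (2, 3), (3, 1), (4, 1)]
```

The only difference between the two results is the user without documents, which is exactly the case the report is meant to find.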

Related

group_concat multiple rows where given condition

I have 3 tables (User, User_usergroups, usergroup). I would like to get a list of users who belong to the usergroup "member", together with all the other groups each of those users belongs to (via group_concat). How do I do that?
I have the first part of the query, but I cannot get the user's other groups aggregated. (I would like to use group_concat to concatenate the groups into one field.)
SELECT user.userId, (group_concat(group_.name)???)
FROM User_ user join Users_UserGroups userGroups
on user.userId= userGroups.userId
join UserGroup group_ on userGroups.userGroupId = group_.userGroupId
WHERE group_.name="member";
Assume that is the 3 tables
The outcome will be
Many thanks for your great help.
It looks like my query is so complex and it has down vote from the administrator or someone. I wish whoever did this could add their comment to my query so that, in the future, I will know what I need to do to clarify my question to the community or make my question better. Here is the answer to my question.
First, retrieve the list of userIds that belong to the "member" group.
Then join this "member" subquery with the other tables to find all groups for the IDs retrieved in step 1.
As we aggregate the names of the groups, we need to use GROUP BY.
The key to solving this problem is to give the subquery (nested SQL) an alias.
select member.userid, member.screenName, group_concat(UserGroup.name)
from (SELECT User_.userid userid, User_.screenName screenName
      FROM UserGroup
      JOIN Users_UserGroups
        ON UserGroup.userGroupId = Users_UserGroups.usergroupid
      JOIN User_
        ON Users_UserGroups.userId = User_.userId
      WHERE UserGroup.name = "member") member
JOIN Users_UserGroups
  ON member.userid = Users_UserGroups.userId
JOIN UserGroup
  ON UserGroup.userGroupId = Users_UserGroups.userGroupId
GROUP BY member.userid, member.screenName;
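The subquery-with-alias approach can be tried out with a minimal sketch. It assumes SQLite (through Python's sqlite3 module) in place of MySQL, since both provide group_concat, and the users and groups inserted below are hypothetical. Note that the aggregate has to be taken over the group names from the outer join, not over the name selected inside the "member" subquery, which is always "member".

```python
import sqlite3

# Hypothetical data: alice is in "member" and "admin", bob only in "guest",
# carol only in "member".
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE User_ (userId INTEGER PRIMARY KEY, screenName TEXT);
    CREATE TABLE UserGroup (userGroupId INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Users_UserGroups (userId INTEGER, userGroupId INTEGER);
    INSERT INTO User_ VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO UserGroup VALUES (1, 'member'), (2, 'admin'), (3, 'guest');
    INSERT INTO Users_UserGroups VALUES (1, 1), (1, 2), (2, 3), (3, 1);
""")

rows = conn.execute("""
    SELECT member.userId, member.screenName, group_concat(g.name)
    FROM (SELECT u.userId, u.screenName
          FROM User_ u
          JOIN Users_UserGroups ug ON ug.userId = u.userId
          JOIN UserGroup g ON g.userGroupId = ug.userGroupId
          WHERE g.name = 'member') AS member
    JOIN Users_UserGroups ug ON ug.userId = member.userId
    JOIN UserGroup g ON g.userGroupId = ug.userGroupId
    GROUP BY member.userId, member.screenName
    ORDER BY member.userId
""").fetchall()

# group_concat does not guarantee an order, so compare the names as sets.
groups_by_user = {name: set(groups.split(",")) for _, name, groups in rows}
print(groups_by_user)
```

Only alice and carol appear in the result, and alice's row carries both of her groups.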

How to select records from a database table which has two user ids (created_by_user, given_to_user) and replace the user ids with usernames?

This is task table:
This is user table:
I want to select a user's tasks.
I would pass the given_to_user id from the backend.
But the thing is, I want the selected data to contain usernames instead of the ids (created_by_user and given_to_user).
The selected table would look like this.
Example:
How to achieve what I want?
Or maybe I designed my tables so poorly that it is difficult to select the data I need? :)
The task table has two id values that are foreign keys to the user table.
I tried many things but couldn't get the desired result.
You did not design the tables poorly.
In fact, it is common practice to store ids that reference columns in other tables. You just need to learn to use joins:
SELECT
    task.id, task.title, task.information, user.username AS created_by, user2.username AS given_to
FROM
    (task INNER JOIN user ON task.created_by_user = user.id)
    INNER JOIN user AS user2 ON task.given_to_user = user2.id;
Do you just want two joins?
select t.*, uc.username as created_by_username,
ug.username as given_to_username
from task t left join
users uc
on t.created_by_user = uc.id left join
users ug
on t.given_to_user = ug.id;
This uses left join in case one of the user ids is missing.
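Joining the same table twice under two aliases can be sketched as follows. This is an illustration under assumptions: it runs on an in-memory SQLite database through Python's sqlite3 module, the table is called users as in the query above, and the tasks and usernames are hypothetical. The second task has no assignee, to show what the LEFT JOIN does with a missing id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE task (
        id INTEGER PRIMARY KEY,
        title TEXT,
        created_by_user INTEGER REFERENCES users(id),
        given_to_user INTEGER REFERENCES users(id)
    );
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
    -- Hypothetical tasks; task 2 has not been given to anyone yet.
    INSERT INTO task VALUES
        (1, 'Write report', 1, 2),
        (2, 'Review budget', 2, NULL);
""")

# uc resolves the creator, ug resolves the assignee; both alias users.
tasks_with_names = conn.execute("""
    SELECT t.id, t.title,
           uc.username AS created_by_username,
           ug.username AS given_to_username
    FROM task t
    LEFT JOIN users uc ON t.created_by_user = uc.id
    LEFT JOIN users ug ON t.given_to_user = ug.id
    ORDER BY t.id
""").fetchall()

print(tasks_with_names)
# [(1, 'Write report', 'alice', 'bob'), (2, 'Review budget', 'bob', None)]
```

With INNER JOIN instead, the unassigned task would drop out of the result entirely.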

Designing A Database For User Friendship

I would like to store friendships in a database. My idea is that when user1 becomes friends with user2 I store that friendship so that I can get all of either user's friends if I ever need it. At first I thought I would just store their id's in a table with one insert, but then I thought about some complications while querying the db.
If I have 2 users that have a user id of 10 and 20 should I make two inserts into the db when they become friends
ID USER1 USER2
1 10 20
2 20 10
or is there a way to query the db to only get a particular users friends if I only did one insert like so
ID USER1 USER2
1 10 20
I know the first way can definitely give me what I am looking for but I would like to know if this is good practice and if there is a better alternative. And if the second way can be queried to get me the result I would be looking for like all of user 10's friends.
Brad Christie's suggestion of querying the table in both directions is good.
However, given that MySQL isn't very good at optimizing OR queries, using UNION ALL might be more efficient:
( SELECT u.id, u.name
FROM friendship f, user u
WHERE f.user1 = 1 AND f.user2 = u.id )
UNION ALL
( SELECT u.id, u.name
FROM friendship f, user u
WHERE f.user2 = 1 AND f.user1 = u.id )
Here's a SQLFiddle of it, based on Brad's example. I modified the friendship table to add two-way indexes for efficient access, and to remove the meaningless id column. Of course, with such a tiny example you can't really test real-world performance, but comparing the execution plans between the two versions may be instructive.
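For readers without access to the fiddle, here is a rough stand-in for the single-row design with a UNION ALL lookup. It is a sketch under assumptions: it uses SQLite via Python's sqlite3 module rather than MySQL, names the table users, and invents the sample rows and the helper friends_of. The composite primary key plus the reverse index are one way to realise the "two-way indexes" mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    -- One row per friendship, no surrogate id column.
    CREATE TABLE friendship (
        user1 INTEGER NOT NULL,
        user2 INTEGER NOT NULL,
        PRIMARY KEY (user1, user2)
    );
    -- Reverse index so lookups by user2 are also indexed.
    CREATE INDEX friendship_reverse ON friendship (user2, user1);
    INSERT INTO users VALUES
        (10, 'user10'), (20, 'user20'), (30, 'user30'), (40, 'user40');
    INSERT INTO friendship VALUES (10, 20), (30, 10), (20, 40);
""")


def friends_of(conn, user_id):
    """Return (id, name) of every friend, whichever column stores user_id."""
    rows = conn.execute("""
        SELECT u.id, u.name
        FROM friendship f JOIN users u ON u.id = f.user2
        WHERE f.user1 = ?
        UNION ALL
        SELECT u.id, u.name
        FROM friendship f JOIN users u ON u.id = f.user1
        WHERE f.user2 = ?
    """, (user_id, user_id)).fetchall()
    return sorted(rows)


print(friends_of(conn, 10))  # [(20, 'user20'), (30, 'user30')]
print(friends_of(conn, 40))  # [(20, 'user20')]
```

UNION ALL is safe here only as long as a pair is never stored in both directions; otherwise the same friend would be listed twice.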
A friendship is a two-way bond (for all intents and purposes). Unlike another link (like a message that's one-way) a friendship should only have one entry. However, what you're seeing is correct; you would need to query against both columns to get a user's friends, but that's simple enough:
-- The use of `1` below is where you'd insert the ID of
-- the person whose friends you're looking up
SELECT u.id, u.name
FROM friendship f
LEFT JOIN user u
ON (u.id = f.user1 OR u.id = f.user2)
AND u.id <> 1
WHERE (f.user1 = 1 OR f.user2 = 1)

Issues with subqueries for stored procedure

The query I am trying to perform is
With getusers As
(Select userID from userprofspecinst_v where institutionID IN
(select institutionID, professionID from userprofspecinst_v where userID=#UserID)
and professionID IN
(select institutionID, professionID from userprofspecinst_v where userID=#UserID))
select username from user where userID IN (select userID from getusers)
Here's what I'm trying to do. Given a userID and a view which contains the userID and the IDs of their institution and profession, I want to get the list of other userIDs who have the same institutionID and professionID. Then, with that list of userIDs, I want to get the usernames that correspond to each userID from another table (user). The error I am getting when I try to create the procedure is, "Only one expression can be specified in the select list when the subquery is not introduced with EXISTS.". Am I taking the correct approach to building this query?
The following query should do what you want to do:
SELECT u.username
FROM user AS u
INNER JOIN userprofspecinst_v AS up ON u.userID = up.userID
INNER JOIN (SELECT institutionID, professionID FROM userprofspecinst_v
WHERE userID = #userID) AS ProInsts
ON (up.institutionID = ProInsts.institutionID
AND up.professionID = ProInsts.professionID)
Effectively, the crucial part is the last INNER JOIN: it creates a derived table of the institution ids and profession ids that the given user id belongs to. We then get all matching rows in the view with the same institution id and profession id (the ON condition) and link these back to the user table on the corresponding userids (the first JOIN).
You can either run this for each user id you are interested in, or JOIN onto the result of a query (your getusers) (it depends on what database engine you are running).
If you aren't familiar with JOINs, Jeff Atwood's introductory post is a good starting place.
The JOIN statement effectively allows you to exploit the logical links between your tables. The userId, institutionID and professionID are all candidates for foreign keys, so rather than having to constantly subquery each table and piece the results together, you can link all the tables together and filter down to the rows you want. It's usually a cleaner, more maintainable approach (although that is opinion).
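The join against a derived table can be exercised with a small sketch. The assumptions are stated plainly: it uses SQLite through Python's sqlite3 module, the user table is named users, the view is built over a hypothetical base table user_affiliation, the sample rows are invented, and the helper colleagues_of with its ? placeholder stands in for the procedure parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (userID INTEGER PRIMARY KEY, username TEXT);
    -- Hypothetical base table behind the view used in the question.
    CREATE TABLE user_affiliation (
        userID INTEGER, institutionID INTEGER, professionID INTEGER
    );
    CREATE VIEW userprofspecinst_v AS
        SELECT userID, institutionID, professionID FROM user_affiliation;
    INSERT INTO users VALUES
        (1, 'alice'), (2, 'bob'), (3, 'carol'), (4, 'dave');
    INSERT INTO user_affiliation VALUES
        (1, 100, 7), (2, 100, 7), (3, 100, 8), (4, 200, 7);
""")


def colleagues_of(conn, user_id):
    """Usernames sharing both institution and profession with user_id."""
    rows = conn.execute("""
        SELECT u.username
        FROM users AS u
        INNER JOIN userprofspecinst_v AS up ON u.userID = up.userID
        INNER JOIN (SELECT institutionID, professionID
                    FROM userprofspecinst_v
                    WHERE userID = ?) AS ProInsts
            ON up.institutionID = ProInsts.institutionID
           AND up.professionID = ProInsts.professionID
        ORDER BY u.username
    """, (user_id,)).fetchall()
    return [name for (name,) in rows]


print(colleagues_of(conn, 1))  # ['alice', 'bob']
print(colleagues_of(conn, 3))  # ['carol']
```

The given user is part of their own result, because their row matches the derived table too; add a condition such as u.userID <> ? if only the other users are wanted.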

FIRST ORDER BY ... THEN GROUP BY

I have two tables, one stores the users, the other stores the users' email addresses.
table users: (userId, username, etc)
table userEmail: (emailId, userId, email)
I would like to do a query that allows me to fetch the latest email address along with the user record.
I'm basically looking for a query that says
FIRST ORDER BY userEmail.emailId DESC
THEN GROUP BY userEmail.userId
This can be done with:
SELECT
users.userId
, users.username
, (
SELECT
userEmail.email
FROM userEmail
WHERE userEmail.userId = users.userId
ORDER BY userEmail.emailId DESC
LIMIT 1
) AS email
FROM users
ORDER BY users.username;
But this does a subquery for every row and is very inefficient. (It is faster to do 2 separate queries and 'join' them together in my program logic).
The intuitive query to write for what I want would be:
SELECT
users.userId
, users.username
, userEmail.email
FROM users
LEFT JOIN userEmail USING(userId)
GROUP BY users.userId
ORDER BY
userEmail.emailId
, users.username;
But, this does not function as I would like. (The GROUP BY is performed before the sorting, so the ORDER BY userEmail.emailId has nothing to do).
So my question is:
Is it possible to write the first query without making use of the subqueries?
I've searched and read the other questions on stackoverflow, but none seems to answer the question about this query pattern.
"But this does a subquery for every row and is very inefficient"
Firstly, do you have a query plan / timings that demonstrate this? The way you've done it (with the subselect) is pretty much the 'intuitive' way to do it. Many DBMS (though I'm not sure about MySQL) have optimisations for this case, and will have a way to execute the query only once.
Alternatively, you should be able to create a subtable with ONLY (user id, latest email id) tuples and JOIN onto that:
SELECT
users.userId
, users.username
, userEmail.email
FROM users
INNER JOIN
(SELECT userId, MAX(emailId) AS latestEmailId
FROM userEmail GROUP BY userId)
AS latestEmails
ON (users.userId = latestEmails.userId)
INNER JOIN userEmail ON
(latestEmails.latestEmailId = userEmail.emailId)
ORDER BY users.username;
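A quick way to see what the MAX(emailId) join returns is the sketch below. It assumes SQLite through Python's sqlite3 module instead of MySQL, and the users and addresses are hypothetical; carol deliberately has no email address.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (userId INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE userEmail (
        emailId INTEGER PRIMARY KEY, userId INTEGER, email TEXT
    );
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    -- Hypothetical addresses; a higher emailId means a newer address.
    INSERT INTO userEmail VALUES
        (1, 1, 'alice@old.example'),
        (2, 2, 'bob@example.com'),
        (3, 1, 'alice@new.example');
""")

latest_emails = conn.execute("""
    SELECT users.userId, users.username, userEmail.email
    FROM users
    INNER JOIN (SELECT userId, MAX(emailId) AS latestEmailId
                FROM userEmail
                GROUP BY userId) AS latestEmails
        ON users.userId = latestEmails.userId
    INNER JOIN userEmail
        ON latestEmails.latestEmailId = userEmail.emailId
    ORDER BY users.username
""").fetchall()

print(latest_emails)
# [(1, 'alice', 'alice@new.example'), (2, 'bob', 'bob@example.com')]
```

Because both joins are INNER JOINs, carol is left out. The correlated-subquery version from the question would return her with a NULL email, so switch to LEFT JOIN if users without an address must still be listed.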
If this is a query you do often, I recommend optimizing your tables to handle this.
I suggest adding an emailId column to the users table. When a user changes their email address, or sets an older email address as the primary email address, update the user's row in the users table to point to the current emailId.
Once you modify your code to do this update, you can go back and update your older data to set emailId for all users.
Alternatively, you can add an email column to the users table, so you don't have to do a join to get a user's current email address.