sql select where column is a count - sql

I have a table of users and another table of activity; the user who performed each activity is recorded in a column. How can I write a query that selects each user along with the count of their activities?
I can't work out how to do it or what to search for.
For example:
User table
id | name
1 | john
2 | karen
Activity table
id | user_id
1 | 1
2 | 1
3 | 2
Results
name | Count
john | 2
karen | 1

Make use of LEFT JOIN and COUNT aggregate
SELECT name, COUNT(a.user_id) count
FROM [User] u LEFT JOIN Activity a
ON u.id = a.user_id
GROUP BY u.id, u.name
Output:
| name | count |
|-------|-------|
| john | 2 |
| karen | 1 |
Here is a SQLFiddle demo
Recommended reading:
A Visual Explanation of SQL Joins
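To see why the answer counts a column from the activity table rather than every row, here is a minimal sketch that runs the query against an in-memory SQLite database through Python's sqlite3 module. The extra user "zoe" with no activity is a hypothetical addition to the question's sample data, and SQLite stands in for whatever engine the question actually uses.

```python
import sqlite3

# In-memory database holding the sample rows from the question, plus one
# hypothetical user ("zoe") who has no activity at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE [User] (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Activity (id INTEGER PRIMARY KEY, user_id INTEGER);
    INSERT INTO [User] VALUES (1, 'john'), (2, 'karen'), (3, 'zoe');
    INSERT INTO Activity VALUES (1, 1), (2, 1), (3, 2);
""")

QUERY = """
    SELECT u.name, {count_expr} AS cnt
    FROM [User] u LEFT JOIN Activity a
      ON u.id = a.user_id
    GROUP BY u.id, u.name
    ORDER BY u.id
"""

# COUNT(a.user_id) skips the NULL that the LEFT JOIN produces for zoe.
count_column = conn.execute(QUERY.format(count_expr="COUNT(a.user_id)")).fetchall()
# COUNT(*) counts the joined row itself, so zoe is reported as 1.
count_star = conn.execute(QUERY.format(count_expr="COUNT(*)")).fetchall()

print(count_column)  # [('john', 2), ('karen', 1), ('zoe', 0)]
print(count_star)    # [('john', 2), ('karen', 1), ('zoe', 1)]
```

With a LEFT JOIN, counting a column of the joined table is what gives 0 for users without activity.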

select name, count(a.id) as ActivityCount
from [user] u
inner join activity a on u.id = a.user_id -- inner join: users with no activity are left out
group by name

This is simple to do. Combine the two tables with a join, then use the aggregate function COUNT to get the total per user. Counting a column from the activity table, rather than using count(*), gives 0 for users with no activity. Altogether it looks something like this:
select u.id, u.name, count(a.id) as ct
from tblUser u
left join tblActivity a on u.id = a.user_id
group by u.id, u.name
order by ct desc

select
u.id as user_id, -- name is not necessarily unique
max(u.name) as name,
count(a.Id) as [count]
from
[User] u
left join Activity a -- left join because some users can have no activities
on u.Id = a.user_id
group by u.id

Related

Efficiently getting multiple counts of foreign key rows in PostgreSQL

I have a database that consists of users who can perform various actions, which I keep track of in multiple tables. I'm creating a point system, so I need to count how many of each type of action the user did. For example, if I had:
users
id | username
-------------
1 | abc
2 | xyz
posts
id | user_id
--------------
1 | 1
2 | 1
comments
id | user_id
--------------
1 | 1
2 | 2
shares
id | user_id
--------------
1 | 2
2 | 2
I would want to return:
user_details
id | username | post_count | comment_count | share_count
---------------------------------------------------------
1 | abc | 2 | 1 | 0
2 | xyz | 0 | 1 | 2
This is slightly different from this question about foreign key counts since I want to return the individual counts per table.
What I've tried so far (example code):
SELECT
users.id,
users.username,
COUNT( DISTINCT posts.id ) as post_count,
COUNT( DISTINCT comments.id ) as comment_count,
COUNT( DISTINCT shares.id ) as share_count
FROM users
LEFT JOIN posts ON posts.user_id = users.id
LEFT JOIN comments ON comments.user_id = users.id
LEFT JOIN shares ON shares.user_id = users.id
GROUP BY users.id
While this works, I had to use DISTINCT in all of my counts because the LEFT JOINS were causing high numbers of duplicate rows. I feel like there must be a better way to do this since (please correct me if I'm wrong) on each LEFT JOIN, the DISTINCT is having to filter out an exponentially growing number of duplicated rows.
Thank you so much for any help you could give me with this!
You can join derived tables that already do the aggregation.
SELECT u.id,
u.username,
coalesce(pc.c, 0) AS post_count,
coalesce(cc.c, 0) AS comment_count,
coalesce(sc.c, 0) AS share_count
FROM users AS u
LEFT JOIN (SELECT p.user_id,
count(*) AS c
FROM posts AS p
GROUP BY p.user_id) AS pc
ON pc.user_id = u.id
LEFT JOIN (SELECT c.user_id,
count(*) AS c
FROM comments AS c
GROUP BY c.user_id) AS cc
ON cc.user_id = u.id
LEFT JOIN (SELECT s.user_id,
count(*) AS c
FROM shares AS s
GROUP BY s.user_id) AS sc
ON sc.user_id = u.id;
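The difference between joining raw tables and joining pre-aggregated derived tables can be checked on the question's sample data. The sketch below uses Python's sqlite3 with an in-memory database as a stand-in for PostgreSQL (an assumption made only so the example is self-contained); the table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users    (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE posts    (id INTEGER PRIMARY KEY, user_id INTEGER);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, user_id INTEGER);
    CREATE TABLE shares   (id INTEGER PRIMARY KEY, user_id INTEGER);
    INSERT INTO users    VALUES (1, 'abc'), (2, 'xyz');
    INSERT INTO posts    VALUES (1, 1), (2, 1);
    INSERT INTO comments VALUES (1, 1), (2, 2);
    INSERT INTO shares   VALUES (1, 2), (2, 2);
""")

# Joining the raw tables multiplies rows per user, so plain COUNT over-counts.
naive = conn.execute("""
    SELECT u.id, u.username,
           COUNT(p.id), COUNT(c.id), COUNT(s.id)
    FROM users u
    LEFT JOIN posts    p ON p.user_id = u.id
    LEFT JOIN comments c ON c.user_id = u.id
    LEFT JOIN shares   s ON s.user_id = u.id
    GROUP BY u.id, u.username
    ORDER BY u.id
""").fetchall()

# Aggregating each table first yields at most one row per user per join,
# so no duplicates arise and DISTINCT is not needed.
derived = conn.execute("""
    SELECT u.id, u.username,
           COALESCE(pc.c, 0), COALESCE(cc.c, 0), COALESCE(sc.c, 0)
    FROM users u
    LEFT JOIN (SELECT user_id, COUNT(*) AS c FROM posts    GROUP BY user_id) pc
           ON pc.user_id = u.id
    LEFT JOIN (SELECT user_id, COUNT(*) AS c FROM comments GROUP BY user_id) cc
           ON cc.user_id = u.id
    LEFT JOIN (SELECT user_id, COUNT(*) AS c FROM shares   GROUP BY user_id) sc
           ON sc.user_id = u.id
    ORDER BY u.id
""").fetchall()

print(naive)    # [(1, 'abc', 2, 2, 0), (2, 'xyz', 0, 2, 2)]
print(derived)  # [(1, 'abc', 2, 1, 0), (2, 'xyz', 0, 1, 2)]
```

The naive version reports 2 comments for each user because the comment row is repeated once per post or share; the derived-table version matches the expected user_details output.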

PostgreSQL - How to remove duplicates when doing LEFT OUTER JOIN with WHERE clause?

I have 2 tables:
users table
+--------+---------+
| id | integer |
+--------+---------+
| phone | string |
+--------+---------+
| active | boolean |
+--------+---------+
statuses table
+---------+---------+
| id | integer |
+---------+---------+
| user_id | integer |
+---------+---------+
| step_1 | boolean |
+---------+---------+
| step_2 | boolean |
+---------+---------+
I'm doing LEFT OUTER JOIN statuses table on users table with WHERE clause like this:
SELECT users.id, statuses.step_1, statuses.step_2
FROM users
LEFT OUTER JOIN statuses ON users.id = statuses.user_id
WHERE (users.active='f')
ORDER BY users.id DESC
My problem
Some users in the users table share the same phone number, and I want to remove the duplicate users based on the phone number.
I don't want to delete them from the database; I just want to exclude them from this query.
For example, say John (ID: 1) and Sara (ID: 2) share the same phone number (+6012-3456789); removing either of them is fine.
What I've tried that did not work:
First:
SELECT DISTINCT users.phone
FROM users
LEFT OUTER JOIN statuses ON users.id = statuses.user_id
WHERE (users.active='f')
ORDER BY users.id DESC
Second:
SELECT users.phone, COUNT(*)
FROM users
LEFT OUTER JOIN statuses ON users.id = statuses.user_id
WHERE (users.active='f')
GROUP BY phone
HAVING COUNT(users.phone) > 1
I would do this before doing the join. In Postgres, select distinct on is a very useful construct:
SELECT u.id, s.step_1, s.step_2
FROM (SELECT distinct on (phone) u.*
FROM users u
WHERE u.active = 'f'
ORDER BY phone
) u LEFT OUTER JOIN
statuses s
ON u.id = s.user_id
WHERE u.active = 'f'
ORDER BY u.id DESC;
distinct on returns one row for each distinct value of the expression in parentheses, here phone, which is what the question asks for. Without further ORDER BY keys, which of the duplicate rows is kept is arbitrary. The join then no longer shows these users as duplicates.
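Here is one way
Self-join the users table on the phone number and keep only the user with the lowest id for each phone (the users table has no name column, so id is used for the comparison).
SELECT u.id, s.step_1, s.step_2
FROM users u
LEFT JOIN users u1
ON u.phone = u1.phone -- another inactive user with the same phone
AND u1.active = 'f'
AND u1.id < u.id -- and a lower id
LEFT OUTER JOIN statuses s
ON u.id = s.user_id
WHERE u.active = 'f'
AND u1.id IS NULL -- no such user exists, so u is the one to keep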
or use ROW_NUMBER
Number the inactive users within each phone number and keep only the row numbered 1.
SELECT u.id, s.step_1, s.step_2
FROM (SELECT u.*,
Row_number() OVER (PARTITION BY phone ORDER BY id) rn
FROM users u
WHERE u.active = 'f') u
LEFT OUTER JOIN statuses s
ON u.id = s.user_id
WHERE u.rn = 1
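The ROW_NUMBER approach can be tried out with a small sketch. It uses Python's sqlite3 (window functions need SQLite 3.25 or later) in place of PostgreSQL, stores booleans as 0/1, and invents a few sample rows; only the shared phone number comes from the question. The inactive filter is applied inside the subquery so that an active user sharing the phone cannot take row number 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, phone TEXT, active INTEGER);
    CREATE TABLE statuses (id INTEGER PRIMARY KEY, user_id INTEGER,
                           step_1 INTEGER, step_2 INTEGER);
    -- Users 1 and 2 share a phone number; user 4 is active and must be excluded.
    INSERT INTO users VALUES
        (1, '+6012-3456789', 0),
        (2, '+6012-3456789', 0),
        (3, '+6012-0000000', 0),
        (4, '+6012-1111111', 1);
    -- Only user 1 has a status row; user 3 has none.
    INSERT INTO statuses VALUES (1, 1, 1, 0);
""")

rows = conn.execute("""
    SELECT u.id, s.step_1, s.step_2
    FROM (SELECT users.*,
                 ROW_NUMBER() OVER (PARTITION BY phone ORDER BY id) AS rn
          FROM users
          WHERE active = 0) AS u
    LEFT JOIN statuses AS s ON u.id = s.user_id
    WHERE u.rn = 1          -- keep one user per phone number
    ORDER BY u.id DESC
""").fetchall()

print(rows)  # [(3, None, None), (1, 1, 0)]
```

User 2 is dropped as the duplicate of user 1, and user 3 is kept with NULL steps because of the LEFT JOIN.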

Find interests of a user and let it appear only once in the result set

I have the following tables in my database
A table with users:
---------------------
| userId | username |
---------------------
| 1 | john doe |
| 2 | jane doe |
---------------------
A list of "interests"
---------------------
| intId | interest |
---------------------
| 1 | books |
| 2 | cars |
---------------------
And a table in which I save what interests a user has
--------------------
| userId | intId |
--------------------
| 1 | 1 |
| 1 | 2 |
| 2 | 1 |
--------------------
Now I also have a web page where I can select one or more interests. Based on that, I want to retrieve the users that have those interests.
Suppose I select "books" and "cars". Then I should get two records back, "john doe" and "jane doe".
I'm not sure how to write a query for this. I could do a simple LEFT JOIN, but then I would get 3 records back, since "john doe" is interested in both "books" and "cars". I want each user to appear only once in the result.
So how should I write this query?
Something like this should do it with a DISTINCT if you're working with interest names:
SELECT DISTINCT u.UserName
FROM Users u
INNER JOIN UserInterests ui ON ui.userId = u.userId
INNER JOIN Interest i on i.intId = ui.intId
WHERE i.Interest in ('books', 'cars')
If you want to do it based on interest IDs:
SELECT DISTINCT u.UserName
FROM Users u
INNER JOIN UserInterests ui ON ui.userId = u.userId
WHERE ui.intId in (1,2)
Select username
from users u
join
(
Select distinct userid
from UserInterests
where intid in (1,2)
)t
on t.userid = u.userid
SELECT *,
STUFF((SELECT DISTINCT ','+u.username
FROM dbo.UserInterests ui
INNER JOIN dbo.[user] u ON u.userid = ui.UserId
WHERE ui.IntId=i.intId
FOR XML PATH('')),1,1,'')
FROM dbo.Interests i
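The following would be my preferred method. It can run a little faster than using SELECT DISTINCT, since EXISTS only needs to find one matching row per user and no duplicates have to be removed afterwards; the actual difference depends on the optimizer and the data.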
SELECT U.UserName
FROM Users AS U
WHERE
EXISTS(
SELECT 1
FROM UserInterests AS UI
INNER JOIN Interest I on I.intId = UI.intId
WHERE UI.userId = U.userId
AND I.Interest in ('books', 'cars')
)
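As a quick check of the EXISTS form against the question's data, the following sketch loads the three tables into an in-memory SQLite database via Python's sqlite3. The table names Users, Interest and UserInterests are the ones assumed by the answers, not names given in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (userId INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE Interest (intId INTEGER PRIMARY KEY, interest TEXT);
    CREATE TABLE UserInterests (userId INTEGER, intId INTEGER);
    INSERT INTO Users VALUES (1, 'john doe'), (2, 'jane doe');
    INSERT INTO Interest VALUES (1, 'books'), (2, 'cars');
    INSERT INTO UserInterests VALUES (1, 1), (1, 2), (2, 1);
""")

# A plain join returns one row per matching interest, so john doe appears twice.
join_rows = conn.execute("""
    SELECT u.username
    FROM Users u
    INNER JOIN UserInterests ui ON ui.userId = u.userId
    INNER JOIN Interest i ON i.intId = ui.intId
    WHERE i.interest IN ('books', 'cars')
    ORDER BY u.userId
""").fetchall()

# EXISTS only asks whether at least one match exists, so each user appears once.
exists_rows = conn.execute("""
    SELECT u.username
    FROM Users u
    WHERE EXISTS (SELECT 1
                  FROM UserInterests ui
                  INNER JOIN Interest i ON i.intId = ui.intId
                  WHERE ui.userId = u.userId
                    AND i.interest IN ('books', 'cars'))
    ORDER BY u.userId
""").fetchall()

print(join_rows)    # [('john doe',), ('john doe',), ('jane doe',)]
print(exists_rows)  # [('john doe',), ('jane doe',)]
```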
Try this:
SELECT DISTINCT U.UserName
FROM Users AS U INNER JOIN
UserInterest AS UI ON U.userId = UI.userId INNER JOIN
Interests AS I ON UI.intId = I.intId
WHERE I.interest IN ('books', 'cars')

Writing a complex SQL query, with table relations

I will present my table structures first (only relevant fields will be mentioned)
/* The table Users */
user_id | user_name | user_registration_date
1 | USER1 | 19/09/2010
2 | USER2 | 20/09/2010
/* The table Levels_Completed */
user_id | level_id
1 | 1
1 | 2
2 | 1
I would like to display a scoreboard. The first user on the list will be the one who has completed the most levels.
For the example above, USER1 will be displayed above USER2.
I want to receive the following data:
user_id, user_name, user_registration_date, COUNT(level_id rows) AS score
The rows I receive should be ordered by score.
Example:
1 | USER1 | 19/09/2010 | 2
2 | USER2 | 20/09/2010 | 1
I know how to use INNER JOIN, but I think the counting and ordering are above my current level. Help please?
SELECT Users.user_id, user_name, user_registration_date, COUNT(level_id) AS score
FROM Users INNER JOIN Levels_Completed ON Users.user_id = Levels_Completed.user_id
GROUP BY Users.user_id, user_name, user_registration_date
ORDER BY score DESC
Try this:
SELECT
U.user_id,
U.user_name,
U.user_registration_date,
COUNT(L.level_id) as score
FROM Users U
LEFT JOIN Levels_Completed L
ON U.User_Id = L.User_Id
GROUP BY U.user_id, U.user_name, U.user_registration_date
ORDER BY score DESC
SELECT Users.user_id, user_name, user_registration_date, score
FROM Users
INNER JOIN (
SELECT user_id, COUNT(level_id) AS score
FROM Levels_Completed
GROUP BY user_id) AS scores
USING (user_id)
ORDER BY score DESC

SQL Order and group by

I'm getting a bit lost writing a query that performs a certain lookup.
I have the first part of the query working; it returns all accounts that are missing some entries. Now I need to filter this subset further based on each user's last login attempt.
The table structures are as follows:
The users table contains all user information. We only care about users under project_id 33.
The user_account_list table contains all accounts a user has. We only care about users NOT having an entry for game_id 50.
The user_login_logs table contains all login attempts for a user.
The original query I have is this:
SELECT u.id,
u.login,
u.email,
u.nickname,
b.station_login AS "additionals.station_login",
a.id AS "user_account_list.id",
a.game_id AS "user_account_list.game_id",
a.game_uid AS "user_account_list.game_uid",
c.created_at AS "last login"
FROM users u
LEFT JOIN user_account_list a ON u.id = a.user_id AND a.game_id = 50
LEFT JOIN user_additionals b ON u.id = b.id
LEFT JOIN user_login_logs c ON u.id = c.user_id
WHERE u.project_id = 33
AND u.verified_at IS NOT NULL
AND (a.id IS NULL OR a.game_id IS NULL OR a.game_uid IS NULL)
AND (b.station_login IS NULL OR b.station_login = '')
ORDER BY c.created_at DESC
This returns all users that are registered under project_id 33, do not have an entry for game_id 50, and have no information stored in their additional info table (that last condition is optional and not relevant here; it just limits the data returned). It gives me multiple rows per user, sorted by login date, latest first.
What I need is only 1 row per user, with their LATEST login date. I tried replacing the ORDER BY with GROUP BY u.id, but this gives me the oldest result back, not the latest.
How can I:
Limit the rows returned to only 1 row per user
Make sure the row is based on the latest login attempt of the user.
EDIT:
This is what the query currently returns:
+----+-------+-----------------+----------+---------------------------+----------------------+---------------------------+----------------------------+---------------------+
| id | login | email | nickname | additionals.station_login | user_account_list.id | user_account_list.game_id | user_account_list.game_uid | last login |
+----+-------+-----------------+----------+---------------------------+----------------------+---------------------------+----------------------------+---------------------+
| 1 | usrnm | someon#mail.com | Nickname | | NULL | NULL | NULL | 2012-10-19 00:00:00 |
| 1 | usrnm | someon#mail.com | Nickname | | NULL | NULL | NULL | 2012-10-18 00:00:00 |
| 1 | usrnm | someon#mail.com | Nickname | | NULL | NULL | NULL | 2012-10-17 00:00:00 |
+----+-------+-----------------+----------+---------------------------+----------------------+---------------------------+----------------------------+---------------------+
3 rows in set (0.08 sec)
One way to do this is to JOIN the table user_login_logs with the following derived table:
SELECT user_id, MAX(created_at) LatestDate
FROM user_login_logs
GROUP BY user_id
and join it on created_at = LatestDate. This limits each user's login logs to the row with the latest created date. Here is your query:
SELECT u.id,
u.login,
u.email,
u.nickname,
b.station_login AS "additionals.station_login",
a.id AS "user_account_list.id",
a.game_id AS "user_account_list.game_id",
a.game_uid AS "user_account_list.game_uid",
c.created_at AS "last login"
FROM users u
LEFT JOIN user_account_list a ON u.id = a.user_id AND a.game_id = 50
LEFT JOIN user_additionals b ON u.id = b.id
LEFT JOIN
(
SELECT user_id, MAX(created_at) LatestDate
FROM user_login_logs
GROUP BY user_id
) maxc ON u.id = maxc.user_id
LEFT JOIN user_login_logs c ON c.user_id = maxc.user_id AND c.created_at = maxc.LatestDate
WHERE u.project_id = 33
AND u.verified_at IS NOT NULL
AND (a.id IS NULL OR a.game_id IS NULL OR a.game_uid IS NULL)
AND (b.station_login IS NULL OR b.station_login = '')
ORDER BY c.created_at DESC;
Note: because these are LEFT JOINs, users with no login records are still included in the result set, with NULL as the last login. If you don't need them, use INNER JOIN instead.
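The latest-row-per-user idea can be sketched on a reduced schema. The example below uses Python's sqlite3 with only the users and user_login_logs tables; the three login dates come from the question's output, while the second user (with no logins) and the trimmed column list are hypothetical simplifications.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, login TEXT);
    CREATE TABLE user_login_logs (id INTEGER PRIMARY KEY,
                                  user_id INTEGER, created_at TEXT);
    INSERT INTO users VALUES (1, 'usrnm'), (2, 'nologin');
    INSERT INTO user_login_logs (user_id, created_at) VALUES
        (1, '2012-10-17 00:00:00'),
        (1, '2012-10-18 00:00:00'),
        (1, '2012-10-19 00:00:00');
""")

rows = conn.execute("""
    SELECT u.id, u.login, c.created_at AS last_login
    FROM users u
    -- One row per user holding that user's latest login timestamp.
    LEFT JOIN (SELECT user_id, MAX(created_at) AS LatestDate
               FROM user_login_logs
               GROUP BY user_id) maxc
           ON u.id = maxc.user_id
    -- Pick up only the log row that carries that latest timestamp.
    LEFT JOIN user_login_logs c
           ON c.user_id = maxc.user_id
          AND c.created_at = maxc.LatestDate
    ORDER BY u.id
""").fetchall()

print(rows)  # [(1, 'usrnm', '2012-10-19 00:00:00'), (2, 'nologin', None)]
```

If two log rows for the same user share the exact latest timestamp, both would match and the user would appear twice, so this relies on created_at being unique per user.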
Replace your ORDER BY with GROUP BY u.id, as you tried, but also state in the SELECT that you want the latest date in each group. To do that, replace
c.created_at AS "last login"
with
MAX(c.created_at) AS "last login"
The GROUP BY returns only one row per user, and MAX() selects the latest date for each user.
Edit: I think you should avoid column aliases that contain spaces, to avoid mistakes.