SQL: Finding the user with the most comments

I need to find the user who has posted the most comments. There are two tables: 1) users(Id, DisplayName) and 2) comments(Id, UserId, test). I have used the following query:
Select DisplayName from users INNER JOIN (Select UserId, max(comment_count) as max_comments from (Select UserId, count(Id) as comment_count from comments group by UserId) as T1) as T2 ON users.Id=T2.UserId
However, this returns the DisplayName of the user with Id = 1 rather than the user I want. How do I work around this?

SELECT TOP 1
U.DisplayName,
COUNT(C.ID) AS CommentCount
FROM
Users AS U
INNER JOIN Comments AS C ON U.ID = C.UserID
GROUP BY
U.DisplayName
ORDER BY
COUNT(C.ID) DESC
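Note that TOP 1 is SQL Server syntax; MySQL, PostgreSQL and SQLite put LIMIT 1 at the end of the query instead. Here is a minimal sketch of the same idea, run through Python's sqlite3 module so that it is self-contained. The sample rows are made up, and grouping by u.Id in addition to DisplayName is my own addition so that two users who share a display name are not merged:

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (Id INTEGER PRIMARY KEY, DisplayName TEXT);
    CREATE TABLE comments (Id INTEGER PRIMARY KEY, UserId INTEGER, test TEXT);
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO comments (UserId, test) VALUES
        (1, 'a'), (2, 'b'), (2, 'c'), (2, 'd'), (3, 'e'), (3, 'f');
""")

# Count comments per user, sort by the count, keep only the first row.
# LIMIT 1 plays the role of TOP 1 here.
top_commenter = conn.execute("""
    SELECT u.DisplayName, COUNT(c.Id) AS CommentCount
    FROM users AS u
    INNER JOIN comments AS c ON c.UserId = u.Id
    GROUP BY u.Id, u.DisplayName
    ORDER BY CommentCount DESC
    LIMIT 1
""").fetchone()

print(top_commenter)
```

If several users tie for first place, this returns just one of them; which one is not defined unless you add a tie-breaker to the ORDER BY.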

Related

SQL join 3 tables while getting count from 2 of them

Back with another SQL question about joins. I have 3 tables:
user: id, username, name, city, state, private
rides: id, creator, title, datetime, city, state
posts: id, title, user, date, state, city
I need to get the users from the user table and, based on the id of each user, get the number of posts and rides for that person. For example, the user with id 25 has 2 rides and 4 posts, while the user with id 27 has 2 rides and 2 posts. The problem I am having is that both users are coming back with 4 posts and 4 rides each.
user.id = rides.creator = posts.user //just so you know what fields equals the user id
Here is my code:
select u.id, u.username, u.state, u.city, count(p.id) as TotalPosts, count(r.id) as TotalRides
from user u
left join posts p on p.user=u.id
left join rides r on r.creator=u.id
where private='public'
group by u.id
order by u.username, u.state asc;
If I separate them out, and just join the posts or the rides, I get the correct totals back. I tried switching the order of the joins, but I got the same results. Not sure what is going on.
Any ideas or thoughts are appreciated.
Your problem is a Cartesian product along two different dimensions: every post row for a user is paired with every ride row for that user, so both counts get multiplied. The best solution is to pre-aggregate the data:
select u.id, u.username, u.state, u.city,
coalesce(p.totalposts, 0) as TotalPosts, -- 0 rather than NULL for users without posts
coalesce(r.totalrides, 0) as TotalRides -- 0 rather than NULL for users without rides
from user u left join
(select user, count(*) as totalposts
from posts
group by user
) p
on p.user = u.id left join
(select creator, count(*) as totalrides
from rides
group by creator
) r
on r.creator = u.id
where u.private = 'public'
order by u.username, u.state asc;
You can always use a subselect:
select u.*,
(select count(*) from posts where user = u.id) as 'posts',
(select count(*) from rides where creator = u.id) as 'rides'
from user u
where .....
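To see the Cartesian product in action, here is a small sketch using Python's sqlite3 module. The table and column names follow the question; the sample rows are invented (user 25 gets 4 posts and 2 rides, user 27 gets 2 of each), and the columns not needed for the counts are left out:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user  (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, user INTEGER);
    CREATE TABLE rides (id INTEGER PRIMARY KEY, creator INTEGER);
    INSERT INTO user VALUES (25, 'ann'), (27, 'ben');
    INSERT INTO posts (user) VALUES (25), (25), (25), (25), (27), (27);
    INSERT INTO rides (creator) VALUES (25), (25), (27), (27);
""")

# Joining both child tables first pairs every post with every ride,
# so each count becomes posts * rides for that user.
naive = conn.execute("""
    SELECT u.id, COUNT(p.id), COUNT(r.id)
    FROM user u
    LEFT JOIN posts p ON p.user = u.id
    LEFT JOIN rides r ON r.creator = u.id
    GROUP BY u.id
    ORDER BY u.id
""").fetchall()

# Aggregating each child table on its own before joining keeps one row
# per user on both sides, so nothing is multiplied.
pre_aggregated = conn.execute("""
    SELECT u.id, COALESCE(p.total_posts, 0), COALESCE(r.total_rides, 0)
    FROM user u
    LEFT JOIN (SELECT user, COUNT(*) AS total_posts
               FROM posts GROUP BY user) p ON p.user = u.id
    LEFT JOIN (SELECT creator, COUNT(*) AS total_rides
               FROM rides GROUP BY creator) r ON r.creator = u.id
    ORDER BY u.id
""").fetchall()

print(naive)           # inflated counts
print(pre_aggregated)  # true counts
```

With this data the naive query reports 8 and 8 for user 25 (4 posts times 2 rides), while the pre-aggregated one reports 4 and 2.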

SQL query to find the top 3 in a category

Calling all sql enthusiasts!
Quick info: using PostgreSQL.
I have a query that returns the maximum number of likes for a user per category. What I want now is to show the top 3 users with the most likes per category.
A helpful resource was using this example to solve the problem:
select type, variety, price
from fruits
where (
select count(*) from fruits as f
where f.type = fruits.type and f.price <= fruits.price
) <= 2;
I understand this, but my query is using joins and I am also a beginner, so I was not able to use this information effectively.
Down to business, this is my query for returning the MAX likes for a user per category.
SELECT category, username, MAX(post_likes) FROM (
SELECT c.name category, u.username username, SUM(p.like_count) post_likes, COUNT(*) post_num
FROM categories c
JOIN topics t ON c.id = t.category_id
JOIN posts p ON t.id = p.topic_id
JOIN users u ON u.id = p.user_id
GROUP BY c.name, u.username) AS leaders
WHERE post_likes > 0
GROUP BY category, username
HAVING MAX(post_likes) >= (SELECT SUM(p.like_count)
FROM categories c
JOIN topics t ON c.id = t.category_id
JOIN posts p ON t.id = p.topic_id
JOIN users u ON u.id = p.user_id WHERE c.name = leaders.category
GROUP BY u.username order by sum desc limit 1)
ORDER BY MAX(post_likes) DESC;
Any and all help would be greatly appreciated. I am having a difficult time wrapping my head around this problem. Thanks!
If you want the most likes per category, use window functions:
SELECT cu.*
FROM (SELECT c.name as category, u.username as username,
SUM(p.like_count) as post_likes, COUNT(*) as post_num,
ROW_NUMBER() OVER (PARTITION BY c.name ORDER BY SUM(p.like_count) DESC) as seqnum
FROM categories c JOIN
topics t
ON c.id = t.category_id JOIN
posts p
ON t.id = p.topic_id JOIN
users u
ON u.id = p.user_id
GROUP BY c.name, u.username
) cu
WHERE seqnum <= 3;
This returns at most three rows per category, even if there are ties; ROW_NUMBER() breaks ties arbitrarily. If you want to do something else, then consider DENSE_RANK() or RANK() instead of ROW_NUMBER().
Also, use as for column aliases in the SELECT list. Although optional, one day you will leave out a comma and be grateful that you are in the habit of using as.
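The difference between the three ranking functions only shows up when there are ties. Here is a small sketch using Python's sqlite3 module (window functions need SQLite 3.25 or newer). The likes table and its rows are hypothetical stand-ins for the aggregated result of the joins above, and top3 is just a helper name I made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE likes (category TEXT, username TEXT, post_likes INTEGER);
    INSERT INTO likes VALUES
        ('news', 'ann', 9), ('news', 'ben', 7), ('news', 'cat', 7),
        ('news', 'dan', 7), ('news', 'eve', 1);
""")

def top3(rank_fn):
    """Return the usernames whose rank within their category is 3 or better.

    rank_fn is spliced into the SQL text, so only pass trusted names
    such as ROW_NUMBER, RANK or DENSE_RANK.
    """
    rows = conn.execute(f"""
        SELECT username
        FROM (SELECT username,
                     {rank_fn}() OVER (PARTITION BY category
                                       ORDER BY post_likes DESC) AS seqnum
              FROM likes) AS ranked
        WHERE seqnum <= 3
        ORDER BY username
    """).fetchall()
    return [r[0] for r in rows]

print(top3("ROW_NUMBER"))  # exactly three users; the tie is cut arbitrarily
print(top3("RANK"))        # everyone tied for second place is kept
print(top3("DENSE_RANK"))  # the three highest distinct like totals
```

With three users tied on 7 likes, ROW_NUMBER() keeps ann plus two of the tied users, RANK() keeps all four (the tied users share rank 2 and the next rank is 5), and DENSE_RANK() also lets eve in because her total is the third distinct value.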

How to Join only first row, disregard further matches

I have 2 tables
Table Users:
UserID | Name
Table Cars:
CarID | Car Name | FK_UserID
A user can have more than 1 car.
I want to join each user with 1 car only, not more.
Having looked at other threads here,
I've tried the following:
Select users.UserID, users.name, carid
from Users
join cars
on users.UserID =
(
select top 1 UserID
from users
where UserID = CarID
)
But it still returns more than 1 match for each user.
What am I doing wrong?
You can try something like the following, using the ROW_NUMBER() function:
select userid, username, carname
from
(
Select users.UserID as userid,
users.name as username,
cars.carname as carname,
ROW_NUMBER() OVER(PARTITION BY users.UserID ORDER BY cars.CarID) AS r -- ordering by CarID makes the chosen car deterministic
from Users
join cars
on users.UserID = cars.FK_UserID
) XXX
where r = 1;
with x as
(select row_number() over(partition by FK_UserID order by carid) as rn,
* from cars)
select u.userid, x.carid, x.carname
from users u join x on x.FK_UserID = u.userid
where x.rn = 1;
This is one way to do it using row_number function.
Another way to do it
select u.UserID,
u.name,
(select TOP 1 carid
from cars c
where u.UserID = c.FK_UserID
order by carid) carid -- Could be ordered by anything
from Users u
-- where only required if you only want users with cars
where exists (select * from cars c where u.UserID = c.FK_UserID)
Best would be to do a subquery and use a group-by in it to return only 1 user and a car for each user. Then join that to the outer user table.
Here is an example:
select *
from Users u
join (
select FK_UserID
, max(carname) as carname
from cars
group by FK_UserID
) x on x.FK_UserID = u.UserID
Or you could use the row_number() examples above if you want a specific order (either this example or theirs will do the trick).
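If you need the whole car row rather than a single aggregated column, one variation is to aggregate the key (MIN(CarID) per user) and join back to Cars on it. This is a sketch using Python's sqlite3 module with made-up rows; CarName stands in for the "Car Name" column from the question, and picking the lowest CarID is an arbitrary choice of which car counts as "first":

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (UserID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Cars  (CarID INTEGER PRIMARY KEY, CarName TEXT, FK_UserID INTEGER);
    INSERT INTO Users VALUES (1, 'ann'), (2, 'ben'), (3, 'cat');
    INSERT INTO Cars VALUES (10, 'Volvo', 1), (11, 'Audi', 1), (12, 'Fiat', 2);
""")

# The subquery returns exactly one CarID per user, so joining Cars back
# on that key can never produce more than one row per user.
one_car_per_user = conn.execute("""
    SELECT u.UserID, u.Name, c.CarID, c.CarName
    FROM Users AS u
    JOIN (SELECT FK_UserID, MIN(CarID) AS FirstCarID
          FROM Cars
          GROUP BY FK_UserID) AS f ON f.FK_UserID = u.UserID
    JOIN Cars AS c ON c.CarID = f.FirstCarID
    ORDER BY u.UserID
""").fetchall()

print(one_car_per_user)
```

User 3 has no car and drops out because of the inner joins; switch both joins to LEFT JOIN if users without cars should still be listed.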

Select all threads and order by the latest one

Now that I got the "Select all forums and get latest post too.. how?" question answered, I am trying to write a query to select all threads in one particular forum and order them by the date of the latest post (column "updated_at").
This is my structure again:
forums                     forum_threads              forum_posts
----------                 -------------              -----------
id                         id                         id
parent_forum (NULLABLE)    forum_id                   content
name                       user_id                    thread_id
description                title                      user_id
icon                       views                      updated_at
                           created_at                 created_at
                           updated_at
                           last_post_id (NULLABLE)
I tried writing this query, and it works.. but not as expected: It doesn't order the threads by their last post date:
SELECT DISTINCT ON(t.id) t.id, u.username, p.updated_at, t.title
FROM forum_threads t
LEFT JOIN forum_posts p ON p.thread_id = t.id
LEFT JOIN users u ON u.id = p.user_id
WHERE t.forum_id = 3
ORDER BY t.id, p.updated_at DESC;
How can I solve this one?
Assuming you want a single row per thread and not all rows for all posts, DISTINCT ON is still the most convenient tool. But the leading ORDER BY items have to match the expressions of the DISTINCT ON clause. If you want to order the result some other way, you need to wrap it into a subquery and add another ORDER BY to the outer query:
SELECT *
FROM (
SELECT DISTINCT ON (t.id)
t.id, u.username, p.updated_at, t.title
FROM forum_threads t
LEFT JOIN forum_posts p ON p.thread_id = t.id
LEFT JOIN users u ON u.id = p.user_id
WHERE t.forum_id = 3
ORDER BY t.id, p.updated_at DESC
) sub
ORDER BY updated_at DESC;
If you are looking for a query without subquery for some unknown reason, this should work, too:
SELECT DISTINCT
t.id
, first_value(u.username) OVER w AS username
, first_value(p.updated_at) OVER w AS updated_at
, t.title
FROM forum_threads t
LEFT JOIN forum_posts p ON p.thread_id = t.id
LEFT JOIN users u ON u.id = p.user_id
WHERE t.forum_id = 3
WINDOW w AS (PARTITION BY t.id ORDER BY p.updated_at DESC)
ORDER BY updated_at DESC;
There is quite a bit going on here:
1. The tables are joined and rows are selected according to JOIN and WHERE clauses.
2. The two instances of the window function first_value() are run (on the same window definition) to retrieve username and updated_at from the latest post per thread. This results in as many identical rows as there are posts in the thread.
3. The DISTINCT step is executed after the window functions and reduces each set to a single instance.
4. ORDER BY is applied last and updated_at references the OUT column (SELECT list), not one of the two IN columns (FROM list) of the same name.
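The order of those steps can be observed directly. The sketch below uses Python's sqlite3 module rather than PostgreSQL (SQLite 3.25 or newer also supports first_value() and the WINDOW clause), with a cut-down version of the schema and invented rows. I alias the window result as latest instead of updated_at so there is no doubt which column the ORDER BY refers to:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE forum_threads (id INTEGER PRIMARY KEY, forum_id INTEGER, title TEXT);
    CREATE TABLE forum_posts (id INTEGER PRIMARY KEY, thread_id INTEGER, updated_at TEXT);
    INSERT INTO forum_threads VALUES (1, 3, 'old thread'), (2, 3, 'busy thread');
    INSERT INTO forum_posts (thread_id, updated_at) VALUES
        (1, '2014-01-05'), (2, '2014-01-01'), (2, '2014-01-09'), (2, '2014-01-03');
""")

# Join plus window function only: every post row carries the latest
# updated_at of its thread, so thread 2 appears three times.
before_distinct = conn.execute("""
    SELECT t.id, first_value(p.updated_at) OVER w AS latest
    FROM forum_threads t
    LEFT JOIN forum_posts p ON p.thread_id = t.id
    WHERE t.forum_id = 3
    WINDOW w AS (PARTITION BY t.id ORDER BY p.updated_at DESC)
""").fetchall()

# DISTINCT then collapses the identical rows, and ORDER BY sorts the
# remaining one row per thread by the output column latest.
threads = conn.execute("""
    SELECT DISTINCT t.id, first_value(p.updated_at) OVER w AS latest
    FROM forum_threads t
    LEFT JOIN forum_posts p ON p.thread_id = t.id
    WHERE t.forum_id = 3
    WINDOW w AS (PARTITION BY t.id ORDER BY p.updated_at DESC)
    ORDER BY latest DESC
""").fetchall()

print(sorted(before_distinct))
print(threads)
```

The first query returns four rows (one per post), the second returns one row per thread with the most recently updated thread first.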
Yet another variant, a subquery with the window function row_number():
SELECT id, username, updated_at, title
FROM (
SELECT t.id
, u.username
, p.updated_at
, t.title
, row_number() OVER (PARTITION BY t.id
ORDER BY p.updated_at DESC) AS rn
FROM forum_threads t
LEFT JOIN forum_posts p ON p.thread_id = t.id
LEFT JOIN users u ON u.id = p.user_id
WHERE t.forum_id = 3
) sub
WHERE rn = 1
ORDER BY updated_at DESC;
Similar case:
Return records distinct on one column but order by another column
You'll have to test which is faster. Depends on a couple of circumstances.
Forget the distinct on:
SELECT t.id, u.username, p.updated_at, t.title
FROM forum_threads t
LEFT JOIN forum_posts p ON p.thread_id = t.id
LEFT JOIN users u ON u.id = p.user_id
WHERE t.forum_id = 3
ORDER BY p.updated_at DESC;

Challenge in PostgreSQL query (group by and having issue)

I'm trying to create a query but I'm having some trouble with it. I have two tables:
users (id, name, email)
comments (id, uid, comment, date, time)
I'm trying to list all users and their comments, which can be done quite easily with an inner join. However, I get several comments per user, since I joined the tables. I just want their latest comment. Any ideas? :)
This should do it:
select distinct on(u.name, u.id) *
from comments c, users u
where u.id=c.uid
order by u.name, u.id, c.date desc, c.time desc
For PostgreSQL 8.4+:
SELECT x.*
FROM (SELECT u.*, c.*,
ROW_NUMBER() OVER (PARTITION BY u.id
ORDER BY c.date DESC, c.time DESC) AS rnk
FROM USERS u
JOIN COMMENTS c ON c.uid = u.id) x
WHERE x.rnk = 1
This might work (updated query):
SELECT u.id, u.name, u.email, t.id, t.uid, t.comment, t.date, t.time
FROM users u
LEFT OUTER JOIN
(
select c.id, m.uid, c.comment, m.cdate as date, c.time
from comments c
right outer join
(
select uid, max(date) as cdate
from comments
group by uid
) as m
ON c.uid = m.uid AND c.date = m.cdate
) t
ON u.id = t.uid
Assuming the comment id is auto-incremented, find the maximum comment id per user (the latest comment):
SELECT u.id, u.name, u.email, c.id, c.uid, c.comment, c.date, c.time
FROM users u
JOIN comments c ON u.id = c.uid
JOIN
(
select uid, max(id) id
from comments
group by uid
) as c2 ON c.id = c2.id AND c.uid = c2.uid
OMG Ponies certainly has the best answer, but here is another way to do it, without any extended database feature:
select
u.name,
c.comment,
c.comment_date_time
from users as u
left join comments as c
on c.uid = u.id
and
c.comment_date_time =
(
select max(c2.comment_date_time)
from comments as c2
where c2.uid = u.id
)
I have merged your date and time columns into comment_date_time in this example.
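For anyone who wants to try the correlated-subquery approach without a PostgreSQL server, here is a sketch using Python's sqlite3 module. It compares comment_date_time to the per-user maximum with a plain equality. The rows are made up, and the timestamps are stored as ISO-8601 text, which sorts chronologically:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, uid INTEGER,
                           comment TEXT, comment_date_time TEXT);
    INSERT INTO users VALUES (1, 'ann'), (2, 'ben'), (3, 'cat');
    INSERT INTO comments (uid, comment, comment_date_time) VALUES
        (1, 'first',  '2010-05-01 09:00:00'),
        (1, 'second', '2010-05-01 17:30:00'),
        (2, 'only',   '2010-04-20 12:00:00');
""")

# For each user, keep only the comment whose timestamp equals that
# user's newest timestamp. The LEFT JOIN keeps users without comments.
latest = conn.execute("""
    SELECT u.name, c.comment, c.comment_date_time
    FROM users AS u
    LEFT JOIN comments AS c
           ON c.uid = u.id
          AND c.comment_date_time = (SELECT MAX(c2.comment_date_time)
                                     FROM comments AS c2
                                     WHERE c2.uid = u.id)
    ORDER BY u.id
""").fetchall()

print(latest)
```

If a user has two comments with exactly the same timestamp, both rows come back; the ROW_NUMBER() version above does not have that problem.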