Challenge in PostgreSQL query (group by and having issue) - sql

I'm trying to create a query but I'm having some trouble with it. I have two tables:
users (id, name, email)
comments (id, uid, comment, date, time)
I'm trying to list all users and their comments, which can be done quite easily with an inner join. However, I get several rows per user, one for each comment, because of the join. I just want each user's latest comment. Any ideas? :)

This should do it:
select distinct on (u.name, u.id) *
from comments c, users u
where u.id = c.uid
order by u.name, u.id, c.date desc, c.time desc

For PostgreSQL 8.4+:
SELECT x.*
FROM (SELECT u.*, c.*,
ROW_NUMBER() OVER (PARTITION BY u.id
ORDER BY c.date DESC, c.time DESC) AS rnk
FROM USERS u
JOIN COMMENTS c ON c.uid = u.id) x
WHERE x.rnk = 1
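To try the ROW_NUMBER() approach without a PostgreSQL server, here is a minimal sketch using Python's built-in sqlite3 module. It assumes SQLite 3.25 or later (needed for window functions); the sample rows and the helper name latest_comment_per_user are made up for illustration, while the table layout follows the question.

```python
import sqlite3


def latest_comment_per_user(conn):
    """Return (name, comment) for each user's most recent comment."""
    return conn.execute("""
        SELECT name, comment
        FROM (SELECT u.name, c.comment,
                     ROW_NUMBER() OVER (PARTITION BY u.id
                                        ORDER BY c.date DESC, c.time DESC) AS rnk
              FROM users u
              JOIN comments c ON c.uid = u.id) x
        WHERE rnk = 1
        ORDER BY name
    """).fetchall()


# Hypothetical sample data, stored as ISO strings so text ordering matches time ordering.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, uid INTEGER,
                           comment TEXT, date TEXT, time TEXT);
    INSERT INTO users VALUES (1, 'ann', 'ann@example.com'),
                             (2, 'bob', 'bob@example.com');
    INSERT INTO comments VALUES
        (1, 1, 'first',  '2011-01-01', '09:00'),
        (2, 1, 'second', '2011-01-02', '08:00'),
        (3, 1, 'third',  '2011-01-02', '10:00'),
        (4, 2, 'hello',  '2011-01-01', '12:00');
""")
rows = latest_comment_per_user(conn)
print(rows)
```

Ordering by both date and time matters here: ann has two comments on the same date, and only the time column separates them.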

This might work (updated query):
SELECT u.id, u.name, u.email, t.id, t.uid, t.comment, t.date, t.time
FROM users u
LEFT OUTER JOIN
(
select c.id, c.uid, c.comment, c.date, c.time
from comments c
inner join
(
select uid, max(date) as cdate
from comments
group by uid
) as m
ON c.uid = m.uid AND c.date = m.cdate
) t
ON u.id = t.uid
Because only the date is compared, a user with several comments on their latest date still gets one row per comment.

Assuming the comment id is auto-incremented, find the maximum comment id per user (the latest comment):
SELECT u.id, u.name, u.email, c.id, c.uid, c.comment, c.date, c.time
FROM users u
JOIN comments c ON u.id = c.uid
JOIN
(
select uid, max(id) id
from comments
group by uid
) as c2 ON c.id = c2.id AND c.uid = c2.uid

OMG Ponies certainly has the best answer, but here is another way to do it, without any extended database feature:
select
u.name,
c.comment,
c.comment_date_time
from users as u
left join comments as c
on c.uid = u.id
and
c.comment_date_time =
(
select max(c2.comment_date_time)
from comments as c2
where c2.uid = u.id
)
I have merged your date and time columns into comment_date_time in this example.

Related

How can I select the highest post per topic and associate them given this schema design?

I am working on a small forum component for a site and I am creating a page where I want to display each topic along with its highest rated answer. Here are what the tables look like:
POST USER TOPIC
id id id
date name title
text bio date
views
likes
topic_id
author_id
My query looks like so:
select
u.id, u.name, u.bio,
p.id, p.date, p.text, p.views, p.likes,
t.id, t.title, t.date
from
( select p.id, max(p.likes) as likes, p.topic_id
from post as p group by p.topic_id ) as q
inner join post as p on q.id = p.id
inner join topic as t on t.id = q.topic_id
inner join user as u on u.id = p.author_id
order by date desc;
One of the problems I'm having running this is within "q". PostgreSQL won't let me run the "q" query because it wants "p.id" to be in the "group by" clause or in an aggregate function. I tried to use "distinct on (p.id)" but I got the same error message: p.id must appear in the GROUP BY clause or be used in an aggregate function.
Without the p.id attribute, I cannot meaningfully link it to the other tables; is there another way of accomplishing this?
One option is a CTE with window functions:
WITH cte AS (
SELECT
u.id AS UserId
,u.name
,u.bio
,p.id AS PostId
,p."date" AS PostDate
,p.text
,p.views
,p.likes
,t.id AS TopicId
,t.title
,t."date" AS TopicDate
,ROW_NUMBER() OVER (PARTITION BY t.id ORDER BY p.likes DESC, p."date" DESC) AS RowNum
,DENSE_RANK() OVER (PARTITION BY t.id ORDER BY p.likes DESC) AS RankNum
FROM
topic t
INNER JOIN post p
ON t.id = p.topic_id
INNER JOIN "user" u
ON p.author_id = u.id
)
SELECT *
FROM
cte
WHERE
RowNum = 1
;
Switch RowNum to RankNum if you want to see ties for most liked.
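The difference between filtering on ROW_NUMBER() and on DENSE_RANK() is easiest to see with tied rows. The following is a small sketch on SQLite (3.25+ assumed) via Python's sqlite3; the reduced post table, the sample rows, the id tie-breaker and the helper top_posts are illustrative assumptions, not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY, topic_id INTEGER, likes INTEGER);
    -- posts 1 and 2 tie for most likes in topic 10
    INSERT INTO post VALUES (1, 10, 5), (2, 10, 5), (3, 10, 2), (4, 20, 7);
""")

RANKED = """
    SELECT id, topic_id,
           ROW_NUMBER() OVER (PARTITION BY topic_id ORDER BY likes DESC, id) AS row_num,
           DENSE_RANK() OVER (PARTITION BY topic_id ORDER BY likes DESC)     AS rank_num
    FROM post
"""


def top_posts(conn, column):
    """Return ids of posts whose chosen ranking column equals 1."""
    if column not in ("row_num", "rank_num"):  # fixed aliases only, never user input
        raise ValueError(column)
    sql = f"SELECT id FROM ({RANKED}) WHERE {column} = 1 ORDER BY id"
    return [row[0] for row in conn.execute(sql)]


one_per_topic = top_posts(conn, "row_num")   # exactly one post per topic
with_ties = top_posts(conn, "rank_num")      # every post tied for first place
print(one_per_topic, with_ties)
```

With ROW_NUMBER() the tie in topic 10 is broken arbitrarily unless the ORDER BY includes a tie-breaker, which is why the sketch adds id there.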
This is a common need: when grouping, show each group's first/last a ranked by some other criteria b. I don't have a name for it, but this seems to be the canonical question. You can see there are a lot of choices! My favorite solution is probably a lateral join:
SELECT u.id, u.name, u.bio,
p.id, p.date, p.text, p.views, p.likes,
t.id, t.title, t.date
FROM topic t
LEFT OUTER JOIN LATERAL (
SELECT *
FROM post
WHERE post.topic_id = t.id
ORDER BY post.likes DESC
LIMIT 1
) p
ON true
LEFT OUTER JOIN "user" u
ON p.author_id = u.id
;
SELECT
u.id AS uid, u.name, u.bio
, p.id AS pid, p."date" AS pdate, p.text, p.views, p.likes
, t.id AS tid, t.title, t."date" AS tdate
FROM post p
JOIN topic t ON t.id = p.topic_id
JOIN "user" u ON u.id = p.author_id
WHERE NOT EXISTS ( SELECT *
FROM post nx
WHERE nx.topic_id = p.topic_id
AND nx.likes > p.likes)
ORDER BY p."date" DESC
;
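The NOT EXISTS form keeps a post only if no other post in the same topic has more likes, so tied posts are all returned. Here is a minimal sketch of that behaviour on SQLite through Python's sqlite3; the trimmed-down tables and sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE topic (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE post (id INTEGER PRIMARY KEY, topic_id INTEGER, likes INTEGER);
    INSERT INTO topic VALUES (1, 'sql'), (2, 'python');
    -- topic 2 has two posts tied on 4 likes
    INSERT INTO post VALUES (1, 1, 3), (2, 1, 9), (3, 2, 4), (4, 2, 4);
""")

# Anti-join: keep p only when no post nx in the same topic beats it.
best = conn.execute("""
    SELECT t.title, p.id, p.likes
    FROM post p
    JOIN topic t ON t.id = p.topic_id
    WHERE NOT EXISTS (SELECT 1
                      FROM post nx
                      WHERE nx.topic_id = p.topic_id
                        AND nx.likes > p.likes)
    ORDER BY t.title, p.id
""").fetchall()
print(best)
```

If exactly one row per topic is required, a tie-breaking condition has to be added to the subquery.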

SQL query to find the top 3 in a category

Calling all sql enthusiasts!
Quick info: using PostgreSQL.
I have a query that returns the maximum number of likes for a user per category. What I want now is to show the top 3 users with the most likes per category.
A helpful resource was using this example to solve the problem:
select type, variety, price
from fruits
where (
select count(*) from fruits as f
where f.type = fruits.type and f.price <= fruits.price
) <= 2;
I understand this, but my query is using joins and I am also a beginner, so I was not able to use this information effectively.
Down to business, this is my query for returning the MAX likes for a user per category.
SELECT category, username, MAX(post_likes) FROM (
SELECT c.name category, u.username username, SUM(p.like_count) post_likes, COUNT(*) post_num
FROM categories c
JOIN topics t ON c.id = t.category_id
JOIN posts p ON t.id = p.topic_id
JOIN users u ON u.id = p.user_id
GROUP BY c.name, u.username) AS leaders
WHERE post_likes > 0
GROUP BY category, username
HAVING MAX(post_likes) >= (SELECT SUM(p.like_count)
FROM categories c
JOIN topics t ON c.id = t.category_id
JOIN posts p ON t.id = p.topic_id
JOIN users u ON u.id = p.user_id WHERE c.name = leaders.category
GROUP BY u.username order by sum desc limit 1)
ORDER BY MAX(post_likes) DESC;
Any and all help would be greatly appreciated. I am having a difficult time wrapping my head around this problem. Thanks!
If you want the most likes per category, use window functions:
SELECT cu.*
FROM (SELECT c.name as category, u.username as username,
SUM(p.like_count) as post_likes, COUNT(*) as post_num,
ROW_NUMBER() OVER (PARTITION BY c.name ORDER BY SUM(p.like_count) DESC) as seqnum
FROM categories c JOIN
topics t
ON c.id = t.category_id JOIN
posts p
ON t.id = p.topic_id JOIN
users u
ON u.id = p.user_id
GROUP BY c.name, u.username
) cu
WHERE seqnum <= 3;
This returns at most three rows per category, even if there are ties. If you want ties handled differently, consider DENSE_RANK() or RANK() instead of ROW_NUMBER().
Also, use as for column aliases in the SELECT list. Although optional, one day you will leave out a comma and be grateful that you are in the habit of using as.
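As a rough illustration of the top-N-per-group pattern, the sketch below runs it on SQLite (3.25+ assumed) from Python. To stay short it uses a single hypothetical posts table that already carries category and username, aggregates in an inner query, and ranks the totals in an outer one; the username tie-breaker is also an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (category TEXT, username TEXT, like_count INTEGER);
    INSERT INTO posts VALUES
        ('sql', 'ann', 5), ('sql', 'ann', 4), ('sql', 'bob', 7),
        ('sql', 'cat', 1), ('sql', 'dan', 3), ('go',  'eve', 2);
""")

TOP_N = """
    SELECT category, username, post_likes
    FROM (SELECT category, username, post_likes,
                 ROW_NUMBER() OVER (PARTITION BY category
                                    ORDER BY post_likes DESC, username) AS seqnum
          FROM (SELECT category, username, SUM(like_count) AS post_likes
                FROM posts
                GROUP BY category, username) totals) ranked
    WHERE seqnum <= ?
    ORDER BY category, seqnum
"""

# Top 3 users per category by summed likes; categories with fewer users return fewer rows.
leaders = conn.execute(TOP_N, (3,)).fetchall()
print(leaders)
```

The 'go' category has a single user, so it contributes one row rather than three.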

SQL combining 3 tables and get just the row with the latest date

I have 3 tables: user, session and log. The user table stores all user-relevant information, while the session table just connects the user with the log. I want to get a list of all users with their latest log entry. The table design looks like this:
user (id, name, ...)
session (id, user_id)
log (id, session_id, time, type, ...)
My current query looks like this
SELECT *
FROM USER AS u
INNER JOIN session AS s
ON u.id = s.user_id
INNER JOIN log AS l
ON l.session_id = s.id
ORDER BY l.time DESC
But it's not hard to imagine that this just returns the data of all 3 tables sorted by date. How do I get a result with every user listed just once, with the data from their latest log entry, ordered by log time (descending)?
Thanks in advance for your help.
You can use DISTINCT ON in conjunction with ORDER BY to get the latest row per user by log date. This will allow you to select the additional fields you need:
SELECT DISTINCT ON (u.id)
u.id,
u.Name,
l.type,
l.time
FROM "user" AS u
INNER JOIN session AS s ON u.id = s.user_id
INNER JOIN log AS l ON l.session_id = s.id
ORDER BY u.id, l.time DESC;
N.B. I don't know exactly what columns you need, but I have added a couple in to demonstrate as I don't like to advocate the use of SELECT *
For completeness there are a couple of other ways to achieve this, the first is to select the max in a subquery and join back to the outer query on both user_id and time:
SELECT u.id,
u.Name,
l.type,
l.time
FROM user AS u
INNER JOIN session AS s
ON u.id = s.user_id
INNER JOIN log AS l
ON l.session_id = s.id
INNER JOIN
( SELECT s.user_id, MAX(l.time) AS time
FROM session AS s
INNER JOIN log AS l
ON l.session_id = s.id
GROUP BY s.user_id
) AS MaxLog
ON MaxLog.user_id = u.id
AND MaxLog.time = l.time
ORDER BY l.time DESC;
Or you can use ROW_NUMBER():
SELECT id, Name, type, time
FROM ( SELECT u.id,
u.Name,
l.type,
l.time,
ROW_NUMBER() OVER(PARTITION BY u.id ORDER BY l.time DESC) AS RowNumber
FROM user AS u
INNER JOIN session AS s
ON u.id = s.user_id
INNER JOIN log AS l
ON l.session_id = s.id
) u
WHERE RowNumber = 1;
I've assumed some schema (user.user_name?), but you can do this by grouping and an aggregate like Max:
SELECT u.id,
u.user_name,
Max(l.time) AS LastLogTime
FROM USER AS u
LEFT JOIN session AS s
ON u.id = s.user_id
LEFT JOIN log AS l
ON l.session_id = s.id
GROUP BY u.id,
u.user_name;
You won't be able to select * because of the GROUP BY.
Similarly, ORDER BY l.time isn't applicable any more, although you could still order by LastLogTime or user_name.
I've also used LEFT JOINs: this way, if the user has no sessions, the query still returns a record for the user, with a LastLogTime of NULL.
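The effect of outer-joining all the way down to the log table can be checked with a small sketch on SQLite via Python's sqlite3. The sample users and timestamps are invented, the name column follows the question's schema, and the table name is quoted as "user" to sidestep keyword clashes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "user" (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE session (id INTEGER PRIMARY KEY, user_id INTEGER);
    CREATE TABLE log (id INTEGER PRIMARY KEY, session_id INTEGER, time TEXT);
    INSERT INTO "user" VALUES (1, 'ann'), (2, 'bob');   -- bob never logged in
    INSERT INTO session VALUES (1, 1), (2, 1);
    INSERT INTO log VALUES (1, 1, '2020-01-01 10:00'),
                           (2, 2, '2020-01-03 09:00');
""")

# Both joins are LEFT JOINs, so users without sessions or logs survive with NULL.
last_log = conn.execute("""
    SELECT u.name, MAX(l.time) AS last_log_time
    FROM "user" AS u
    LEFT JOIN session AS s ON s.user_id = u.id
    LEFT JOIN log AS l ON l.session_id = s.id
    GROUP BY u.id, u.name
    ORDER BY u.name
""").fetchall()
print(last_log)
```

Changing the second join to an INNER JOIN drops bob from the result, because his NULL session id matches no log row.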

SQL: Comparing MAX Dates from Two different Tables

I have 3 Tables
User
Attendence
Payment
Now I would like to get
GroupID, UserName, MAX(PaymetDate), MAX(AttendenceDate)
where MAX(PaymetDate) is less than MAX(AttendenceDate).
This is what I have tried:
SELECT MAX(PaymetDate) AS Paied_Upto
FROM Payment
Group by GroupID
SELECT MAX(AttendenceDate) AS Last_AttendenceDate
FROM Attendence FULL OUTER JOIN Users ON Attendence.Username = Users.Username
Group by Users.GroupID
But how do I get them to work together?
Thanks
Try this:
SELECT u.GroupID, u.UserName, py.LastPaymentDate, at.LastAttendenceDate
FROM Users AS u,
(SELECT Username, Max(AttendenceDate) AS LastAttendenceDate FROM Attendence GROUP BY Username) AS at,
(SELECT GroupID, Max(PaymetDate) AS LastPaymentDate FROM Payment GROUP BY GroupID) AS py
WHERE u.UserName=at.Username
AND u.GroupID=py.GroupID
AND py.LastPaymentDate < at.LastAttendenceDate;
Try this:
select p.GroupID, u.UserName, MAX(p.PaymetDate), MAX(a.AttendenceDate)
from dbo.Users u
inner join dbo.Attendence a
ON u.UserName = a.UserName
Inner join dbo.Payment p
ON u.groupID = p.GroupID
GROUP BY p.GroupID, u.UserName
Having MAX(p.PaymetDate) < MAX(a.AttendenceDate)
I think this does what you need:
select UserName, GroupID, MaxAttendanceDate, MaxPaymentDate
from (
select
u.UserName,
u.GroupID,
(select max(AttendanceDate)
from Attendance a
where a.UserName = u.UserName) as MaxAttendanceDate,
(select max(PaymentDate)
from Payment p
where p.GroupID = u.GroupId) as MaxPaymentDate
from [User] u
) x
where MaxAttendanceDate > MaxPaymentDate
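Here is a minimal sketch of the correlated-subquery version, run on SQLite from Python so it can be tried without a server. The table and column spellings follow this last answer (Users, Attendance, Payment), and the sample rows are invented; dates are ISO strings so that text comparison matches date order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (UserName TEXT, GroupID INTEGER);
    CREATE TABLE Attendance (UserName TEXT, AttendanceDate TEXT);
    CREATE TABLE Payment (GroupID INTEGER, PaymentDate TEXT);
    INSERT INTO Users VALUES ('ann', 1), ('bob', 2);
    INSERT INTO Attendance VALUES ('ann', '2020-01-15'), ('ann', '2020-03-01'),
                                  ('bob', '2020-01-10');
    INSERT INTO Payment VALUES (1, '2020-02-01'), (2, '2020-02-01');
""")

# Each scalar subquery computes one maximum per user; the outer query compares them.
overdue = conn.execute("""
    SELECT UserName, GroupID, MaxAttendanceDate, MaxPaymentDate
    FROM (SELECT u.UserName, u.GroupID,
                 (SELECT MAX(a.AttendanceDate) FROM Attendance a
                  WHERE a.UserName = u.UserName) AS MaxAttendanceDate,
                 (SELECT MAX(p.PaymentDate) FROM Payment p
                  WHERE p.GroupID = u.GroupID) AS MaxPaymentDate
          FROM Users u) x
    WHERE MaxAttendanceDate > MaxPaymentDate
""").fetchall()
print(overdue)
```

Only ann is returned: she attended after her group's last payment, while bob's last attendance predates it.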

SQL: Finding user with most number of comments

I need to find out the user who has posted the most number of comments. There are two tables 1)users(Id, DisplayName) 2)comments(Id, UserId, test) . I have used the following query
Select DisplayName from users INNER JOIN (Select UserId, max(comment_count) as max_comments from (Select UserId, count(Id) as comment_count from comments group by UserId) as T1) as T2 ON users.Id=T2.UserId
However, this returns the DisplayName of the user with Id = 1 rather than what I want. How do I work around this?
SELECT TOP 1
U.DisplayName,
COUNT(C.ID) AS CommentCount
FROM
Users AS U
INNER JOIN Comments AS C ON U.ID = C.UserID
GROUP BY
U.DisplayName
ORDER BY
COUNT(C.ID) DESC
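SELECT TOP 1 is SQL Server syntax; databases such as PostgreSQL, MySQL and SQLite use LIMIT 1 at the end of the query instead. Below is a small sketch of the same idea on SQLite via Python's sqlite3, with invented sample rows. Grouping by the user id as well as the display name is an added assumption, so that two users sharing a name are not merged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (Id INTEGER PRIMARY KEY, DisplayName TEXT);
    CREATE TABLE comments (Id INTEGER PRIMARY KEY, UserId INTEGER);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO comments VALUES (1, 2), (2, 2), (3, 1);
""")

# Count comments per user, sort by the count, and keep only the first row.
top_commenter = conn.execute("""
    SELECT u.DisplayName, COUNT(c.Id) AS CommentCount
    FROM users AS u
    JOIN comments AS c ON c.UserId = u.Id
    GROUP BY u.Id, u.DisplayName
    ORDER BY CommentCount DESC
    LIMIT 1
""").fetchone()
print(top_commenter)
```

If several users tie for the highest count, this returns just one of them.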