Group by on join and calculate max of groups - sql

I have 3 tables as described below:
(Images: the posts, post_comments, and comments tables, and the expected output.)
Now, in Postgres, I want to fetch the posts whose most-liked comment has the status active.
OUTPUT:
(Image: resultant posts.)
NOTE: Post 1 is excluded because its most-liked comment is inactive.
I've tried something like this:
select "posts".*
from "posts"
inner join (select id, max(likes) l from comments inner join post_comments on comments.id = post_comments.alert_id and post_comments.post_id = posts.id) a on posts.id = a.cid ...
This is not complete but I'm unable to do this.

In Postgres, you can get the active comment with the most likes for each post using distinct on:
select distinct on (pc.post_id) pc.*
from post_comments pc join
comments c
on pc.comment_id = c.id
where c.status = 'active'
order by pc.post_id, c.likes desc;
I think this is quite related to what you want.
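For the exact result asked for (only posts whose single most-liked comment is active), one option is to rank each post's comments by likes and keep the post only when the top-ranked comment is active. Below is a minimal sketch, not a definitive implementation. It runs against an in-memory SQLite database as a stand-in for Postgres, and the table contents, the title column, and the `best` alias are made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for Postgres; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE comments (id INTEGER PRIMARY KEY, likes INTEGER, status TEXT);
CREATE TABLE post_comments (post_id INTEGER, comment_id INTEGER);
INSERT INTO posts VALUES (1, 'first'), (2, 'second');
INSERT INTO comments VALUES
  (1, 10, 'inactive'), (2, 5, 'active'),
  (3, 7, 'active'),    (4, 2, 'inactive');
INSERT INTO post_comments VALUES (1, 1), (1, 2), (2, 3), (2, 4);
""")

# Rank the comments of each post by likes, then keep a post only when
# its top-ranked comment (rn = 1) is active.
query = """
SELECT p.id, p.title
FROM posts p
JOIN (
    SELECT pc.post_id, c.status,
           ROW_NUMBER() OVER (PARTITION BY pc.post_id
                              ORDER BY c.likes DESC) AS rn
    FROM post_comments pc
    JOIN comments c ON c.id = pc.comment_id
) best ON best.post_id = p.id
WHERE best.rn = 1 AND best.status = 'active'
ORDER BY p.id
"""
result = conn.execute(query).fetchall()
print(result)
```

Note the difference from filtering on status before ranking: that would return post 1 with its best active comment, whereas here post 1 drops out because its top comment is inactive.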

Try something like this:
SELECT posts.*, MAX(likes) l
FROM posts
JOIN post_comments ON post_id = posts.id
LEFT JOIN comments ON comment_id = comments.id
GROUP BY posts.id

Related

Problems with multiple joins and a sum

I have the following three PostgreSQL tables:
Post table:
postid | title | author | created
Vote table:
postid | username | vote
where vote is equal to 1 if the user voted the post up, 0 if the user did not vote and -1 if the user voted the post down.
Comment table:
commentID | parentID | postID | content | author | created
where the parentID is null if the comment is not a reply.
I want to receive now for every post its title, author, created date, sum of all votes and
the vote of the current logged in user and the number of comments.
I already had problems with the vote of the user and asked here and someone helped me to get the following query:
SELECT post.postID as postID, post.title as title, post.author as author,
post.created as created,
COALESCE(sum(votes.vote), 0) as voteCount,
COALESCE(sum(votes.vote) FILTER (WHERE votes.username = :username), 0) as userVote
FROM post
LEFT JOIN votes ON post.postID = votes.postID
GROUP BY post.postID
ORDER BY voteCount DESC
Now I tried another LEFT JOIN to fetch the number of comments like this:
COUNT(DISTINCT comments) FILTER (WHERE comments.parentID IS NULL) as numComments
LEFT JOIN comments on post.postID = comments.postID
However, while the number of comments works, the number of votes on each post is wrong:
because of the other join, rows appear multiple times, yielding a wrong sum, and I'm having trouble figuring out how to solve this.
I already tried to fetch the number of comments as a subquery so that it is independent from the
number of votes without success.
Any further help would be very appreciated! :-)
You would typically pre-aggregate in subqueries before joining, like so:
SELECT p.*,
COALESCE(v.voteCount, 0) as voteCount,
COALESCE(v.userVote, 0) as userVote,
COALESCE(c.numComments, 0) as numComments
FROM post p
LEFT JOIN (
SELECT postID,
SUM(vote) as voteCount,
SUM(vote) FILTER (WHERE username = :username) userVote
FROM votes
GROUP BY postID
) v ON v.postID = p.postID
LEFT JOIN (
SELECT postID, count(*) numComments
FROM comments
WHERE parentID IS NULL
GROUP BY postID
) c ON c.postID = p.postID
ORDER BY voteCount DESC
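To see why joining both tables before aggregating inflates the sum, and why pre-aggregating avoids it, here is a small sketch. It assumes an in-memory SQLite database in place of PostgreSQL, with made-up rows: one post, two upvotes, three top-level comments.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE post (postID INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE votes (postID INTEGER, username TEXT, vote INTEGER);
CREATE TABLE comments (commentID INTEGER PRIMARY KEY, parentID INTEGER, postID INTEGER);
INSERT INTO post VALUES (1, 'hello');
INSERT INTO votes VALUES (1, 'alice', 1), (1, 'bob', 1);
INSERT INTO comments VALUES (1, NULL, 1), (2, NULL, 1), (3, NULL, 1);
""")

# Naive version: every vote row is repeated once per comment row,
# so the sum is multiplied by the number of comments.
naive = conn.execute("""
SELECT p.postID, COALESCE(SUM(v.vote), 0) AS voteCount
FROM post p
LEFT JOIN votes v ON v.postID = p.postID
LEFT JOIN comments c ON c.postID = p.postID
GROUP BY p.postID
""").fetchall()

# Pre-aggregated version: each subquery yields at most one row per post,
# so the joins cannot multiply rows.
pre_aggregated = conn.execute("""
SELECT p.postID,
       COALESCE(v.voteCount, 0) AS voteCount,
       COALESCE(c.numComments, 0) AS numComments
FROM post p
LEFT JOIN (SELECT postID, SUM(vote) AS voteCount
           FROM votes GROUP BY postID) v ON v.postID = p.postID
LEFT JOIN (SELECT postID, COUNT(*) AS numComments
           FROM comments WHERE parentID IS NULL
           GROUP BY postID) c ON c.postID = p.postID
""").fetchall()

print(naive)           # vote sum inflated by the comment rows
print(pre_aggregated)  # vote sum and comment count computed independently
```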
Count the values separately. The joins are causing a Cartesian product. This is a place where correlated subqueries or lateral joins help:
SELECT p.*, v.*, c.*
FROM post p CROSS JOIN LATERAL
(SELECT COALESCE(SUM(v.vote), 0) as voteCount,
COALESCE(SUM(v.vote) FILTER (WHERE v.username = :username), 0) as userVote
FROM votes v
WHERE p.postID = v.postID
) v CROSS JOIN LATERAL
(SELECT COUNT(*) as numComments
FROM comments c
WHERE p.postID = c.postID AND c.parentID IS NULL
) c
ORDER BY voteCount DESC;

JOIN 3 tables (or cross join?) and output 2 times users.name value

I'm trying to join 3 tables - users, posts and comments to get value like post title, post author, comment title, comment author.
users (id, name)
posts (id, content, user_id)
comments (content, user_id, post_id)
Right now I have query:
SELECT posts.id AS POST_ID, posts.content AS Post, posts.user_id AS PostAuthorID,
users.name AS PostAuthorName,
comments.content AS comment, comments.user_id AS CommentsAuthorID
FROM posts
JOIN users on users.id = posts.user_id
JOIN comments ON users.id = comments.post_id;
My question is how to get the author's name for each comment as well as the author of the post. Right now, by doing JOIN users on users.id = posts.user_id, I have the post author, but how do I deal with the comment author's name? When I try to add another condition to the last join (JOIN comments ON users.id = comments.post_id AND users.id = comments.user_id), I receive a table with only the posts and comments created by the same user. Any help is welcome.
I am pretty sure you meant:
ON posts.id = comments.post_id;
where your query currently says:
ON users.id = comments.post_id;
Then just join to users twice:
SELECT p.id AS post_id, p.content AS post, p.user_id AS post_author_id
, pu.name AS post_author_name
, c.content AS comment, c.user_id AS comment_author_id
, cu.name AS comment_author_name
FROM posts p
JOIN users pu ON pu.id = p.user_id
JOIN comments c ON c.post_id = p.id
JOIN users cu ON cu.id = c.user_id;
Returns multiple rows for posts with multiple comments. One per comment. Returns nothing for posts without comments. To include posts without comments (or if any ID column can be NULL) use LEFT JOIN instead ...
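A quick way to check the double join to users is to run it on a tiny data set. The sketch below assumes an in-memory SQLite database and invented rows (two users, one post, two comments); only the table and column names come from the question.

```python
import sqlite3

# Assumption: SQLite as a stand-in database; the rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE posts (id INTEGER PRIMARY KEY, content TEXT, user_id INTEGER);
CREATE TABLE comments (content TEXT, user_id INTEGER, post_id INTEGER);
INSERT INTO users VALUES (1, 'ann'), (2, 'ben');
INSERT INTO posts VALUES (10, 'a post', 1);
INSERT INTO comments VALUES ('nice', 2, 10), ('thanks', 1, 10);
""")

# users is joined twice under different aliases:
# pu resolves the post author, cu resolves the comment author.
rows = conn.execute("""
SELECT p.id, pu.name AS post_author_name,
       c.content, cu.name AS comment_author_name
FROM posts p
JOIN users pu ON pu.id = p.user_id
JOIN comments c ON c.post_id = p.id
JOIN users cu ON cu.id = c.user_id
ORDER BY c.content
""").fetchall()
print(rows)
```

Each alias is an independent copy of the users table, which is why the two join conditions no longer have to be satisfied by the same user row.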
user (user_id, name)
post (post_id, user_id, description)
comment (cm_id, post_id, content)
You join user with post by user_id, and post with comment by post_id.
Select comments.user_id AS CommentsAuthorID
From posts left join comments on posts.id = comments.post_id
Where posts.id = x
If you just need comments.user_id, a join between posts and comments is enough.
When you also want the comment author's name, join users as well:
Select comments.user_id AS CommentsAuthorID, users.name AS UserName
From posts left join comments on posts.id = comments.post_id
left join users on users.id = comments.user_id
Where posts.id = x

Find total user posting per post SQL QUERY

I have a database with the following two tables, member and POSTS. I am looking for a way to get the count of how many posts a user has.
(Source: http://i.stack.imgur.com/FDv31.png)
I have tried many variations of the following SQL command without any success. Instead of showing the count of posts for a single user, it shows a single row with all the posts as the count.
In the end I want something like this
(Source: http://i.stack.imgur.com/EbaEj.png)
Might be that I'm missing something here, but this query would seem to give you the results you want:
SELECT member.ID,
member.Name,
(SELECT COUNT(*) FROM Posts WHERE member.ID = Posts.user_id) AS total
FROM member;
I have left comment out of the query as it is not obvious what comment you want to be returned in that column for the group of comments that is counted.
Edit
Sorry, misinterpreted your question :-) This query will properly return all the comments, along with the person who posted them and the total number of comments that the person made:
SELECT Posts.ID,
member.Name,
(SELECT COUNT(*) FROM Posts WHERE member.ID = Posts.user_id) AS total,
Posts.comment
FROM Posts
INNER JOIN member ON Posts.user_id = member.ID
GROUP BY Posts.ID, member.Name, member.ID, Posts.comment;
You could use a subquery to calculate the total posts per member:
select m.ID
, m.Name
, coalesce(grp.total, 0)
, p.comment
from member m
left join
posts p
on p.user_id = m.id
left join
(
select user_id
, count(*) as total
from posts
group by
user_id
) grp
on grp.user_id = m.id
Or use a window function to count the posts per user:
select
a.id
, a.name
, count(1) over (partition by b.user_id) as TotalCountPerUser
, b.comment
from member a join posts b
on a.id = b.user_id
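The window-function variant keeps one row per post while attaching the per-user total to each row. Here is a minimal sketch, assuming an in-memory SQLite build with window-function support and made-up rows; the table and column names follow the question.

```python
import sqlite3

# Assumption: SQLite as a stand-in database; the rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE member (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Posts (ID INTEGER PRIMARY KEY, user_id INTEGER, comment TEXT);
INSERT INTO member VALUES (1, 'ann'), (2, 'ben');
INSERT INTO Posts VALUES (1, 1, 'x'), (2, 1, 'y'), (3, 2, 'z');
""")

# COUNT(*) OVER (PARTITION BY ...) counts rows per user without
# collapsing them, unlike GROUP BY.
rows = conn.execute("""
SELECT m.Name,
       COUNT(*) OVER (PARTITION BY p.user_id) AS total,
       p.comment
FROM member m
JOIN Posts p ON p.user_id = m.ID
ORDER BY p.ID
""").fetchall()
print(rows)
```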

SQL Query Question: X has many Y. Get all X and get only the newest Y per X

Suppose we have two tables. Post and Comment. Post has many Comments. Pretend they are somewhat filled so that the number of comments per post is varied. I want a query which will grab all posts but only the newest comment per post.
I have been directed to joins and sub queries but I can't figure it out.
Example Output:
Post1:
Comment4 (newest for post1)
Post2:
Comment2 (newest for post2)
Post3:
Comment 10 (newest for post3)
etc...
Any help would be greatly appreciated. Thanks.
This answer assumes that you have a unique identifier for each comment, and that it's an increasing number. That is, later comments have higher numbers than earlier ones. The ids don't have to be sequential; they just have to correspond to the order of input.
First, do a query that extracts the maximum comment id, grouped by post id.
Something like this:
SELECT MAX(ID) MaxCommentID, PostID
FROM Comments
GROUP BY PostID
This will give you a list of post id's, and the highest (latest) comment id for each one.
Then you join with this, to extract the rest of the data from the comments, for those id's.
SELECT C1.*, C2.PostID
FROM Comments AS C1
INNER JOIN (
SELECT MAX(ID) MaxCommentID, PostID
FROM Comments
GROUP BY PostID
) AS C2 ON C1.ID = C2.MaxCommentID
Then, you join with the posts, to get the information about those posts.
SELECT C1.*, P.*
FROM Comments AS C1
INNER JOIN (
SELECT MAX(ID) MaxCommentID, PostID
FROM Comments
GROUP BY PostID
) AS C2 ON C1.ID = C2.MaxCommentID
INNER JOIN Posts AS P ON C2.PostID = P.ID
An alternate approach doesn't use the PostID of the inner query at all. First, pick out the maximum comment id for all unique posts, but don't care about which post, we know they're unique.
SELECT MAX(ID) AS MaxCommentID
FROM Comments
GROUP BY PostID
Then do an IN clause, to get the rest of the data for those comments:
SELECT C1.*
FROM Comments AS C1
WHERE C1.ID IN (
SELECT MAX(ID) AS MaxCommentID
FROM Comments
GROUP BY PostID
)
Then simply join in the posts:
SELECT C1.*, P.*
FROM Comments AS C1
INNER JOIN Posts AS P ON C1.PostID = P.ID
WHERE C1.ID IN (
SELECT MAX(ID) AS MaxCommentID
FROM Comments
GROUP BY PostID
)
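The IN-based variant can be tried out on a few rows. This is a sketch under stated assumptions: an in-memory SQLite database, invented Title and Body columns, and comment ids that increase with insertion order, as the answer requires.

```python
import sqlite3

# Assumption: SQLite as a stand-in database; Title, Body and the rows
# are hypothetical, and higher comment ids mean newer comments.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Posts (ID INTEGER PRIMARY KEY, Title TEXT);
CREATE TABLE Comments (ID INTEGER PRIMARY KEY, PostID INTEGER, Body TEXT);
INSERT INTO Posts VALUES (1, 'Post1'), (2, 'Post2');
INSERT INTO Comments VALUES (1, 1, 'old'), (2, 2, 'only'), (4, 1, 'new');
""")

# The subquery yields one id per post (the highest one); the outer query
# keeps only those comments and joins in their posts.
rows = conn.execute("""
SELECT P.Title, C1.ID, C1.Body
FROM Comments AS C1
INNER JOIN Posts AS P ON C1.PostID = P.ID
WHERE C1.ID IN (SELECT MAX(ID) FROM Comments GROUP BY PostID)
ORDER BY P.ID
""").fetchall()
print(rows)
```

Because of the inner join, posts without any comments do not appear; switching to a LEFT JOIN from Posts would be needed to list them too.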
Select the newest comment from a subquery, e.g.:
Select *
from Posts po
Inner Join
(
Select a.CommentThread, a.CommentDate, a.CommentBody, a.Post from comments a
inner join
(select commentthread, max(commentdate) as commentdate
from comments
group by commentthread) b
on a.commentthread = b.commentthread
and a.commentdate = b.commentdate
) co
on po.Post = co.post
select *
from post
, comments
where post.post_id = comments.post_id
and comments.comments_id = (select max(z.comments_id) from comments z where z.post_id = post.post_id)
And if you are still stuck with an old MySQL version that doesn't support subqueries, you can use something like:
SELECT
p.id, c1.id
FROM
posts as p
LEFT JOIN
comments as c1
ON
p.id = c1.postId
LEFT JOIN
comments as c2
ON
c1.postId = c2.postId
AND c1.id < c2.id
WHERE
isnull(c2.id)
ORDER BY
p.id
Either way, check your query with EXPLAIN for performance issues.

SQL: Get all posts with any comments

I need to construct some rather simple SQL, I suppose, but as it's a rare event that I work with DBs these days I can't figure out the details.
I have a table 'posts' with the following columns:
id, caption, text
and a table 'comments' with the following columns:
id, name, text, post_id
What would the (single) SQL statement look like which retrieves the captions of all posts which have one or more comments associated with it through the 'post_id' key? The DBMS is MySQL if it has any relevance for the SQL query.
select p.caption, count(c.id)
from posts p join comments c on p.id = c.post_id
group by p.caption
having count(c.id) > 0
SELECT DISTINCT p.caption, p.id
FROM posts p,
comments c
WHERE c.post_ID = p.ID
I think using a join would be a lot faster than using the IN clause or a subquery.
SELECT DISTINCT caption
FROM posts
INNER JOIN comments ON posts.id = comments.post_id
Forget about counts and subqueries.
The inner join will pick up all the comments that have valid posts and exclude all the posts that have 0 comments. The DISTINCT will collapse the duplicate caption entries for posts that have more than 1 comment.
I find this syntax to be the most readable in this situation:
SELECT * FROM posts P
WHERE EXISTS (SELECT * FROM Comments WHERE post_id = P.id)
It expresses your intent better than most of the others in this thread - "give me all the posts ..." (select * from posts) "... that have any comments" (where exist (select * from comments ... )). It's essentially the same as the joins above, but because you're not actually doing a join, you don't have to worry about getting duplicates of the records in Posts, so you'll just get one record per post.
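The duplicate-row difference between a plain join and EXISTS shows up on a small example. The sketch below assumes an in-memory SQLite database rather than MySQL, with made-up rows: one post with two comments, one with none, one with a single comment.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (id INTEGER PRIMARY KEY, caption TEXT, text TEXT);
CREATE TABLE comments (id INTEGER PRIMARY KEY, name TEXT, text TEXT, post_id INTEGER);
INSERT INTO posts VALUES (1, 'with two', ''), (2, 'with none', ''), (3, 'with one', '');
INSERT INTO comments VALUES (1, 'a', '', 1), (2, 'b', '', 1), (3, 'c', '', 3);
""")

# Plain inner join: one output row per matching comment.
joined = conn.execute("""
SELECT p.caption
FROM posts p
JOIN comments c ON c.post_id = p.id
ORDER BY p.id
""").fetchall()

# EXISTS: one output row per post that has at least one comment.
existing = conn.execute("""
SELECT p.caption
FROM posts p
WHERE EXISTS (SELECT 1 FROM comments c WHERE c.post_id = p.id)
ORDER BY p.id
""").fetchall()

print(joined)    # the post with two comments appears twice
print(existing)  # each commented post appears once
```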
SELECT caption FROM posts
INNER JOIN comments ON comments.post_id = posts.id
GROUP BY posts.id;
No need for a having clause or count().
edit: Should be an inner join of course (to avoid nulls if a comment is orphaned), thanks to jishi.
Just going off the top of my head here but maybe something like:
SELECT caption FROM posts WHERE id IN (SELECT post_id FROM comments GROUP BY post_id HAVING count(*) > 0)
You're basically looking at performing a subquery --
SELECT p.caption FROM posts p WHERE (SELECT COUNT(*) FROM comments c WHERE c.post_id=p.id) > 0;
This has the effect of running the SELECT COUNT(*) subquery for each row in the posts table. Depending on the size of your tables, you might consider adding an additional column, comment_count, to your posts table to store the number of corresponding comments, so that you can simply do
SELECT p.caption FROM posts p WHERE comment_count > 0