TSQL left join and only last row from right - sql

I'm writing a SQL query to get each post and only the last comment of that post (if one exists).
But I can't find a way to limit the right side of a left join to a single row.
Here is a sample of the query.
SELECT post.id, post.title,comment.id,comment.message
from post
left outer join comment
on post.id=comment.post_id
If a post has 3 comments I get 3 rows for that post, but I want only 1 row with the last comment (ordered by date).
Can somebody help me with this query?

SELECT post.id, post.title, comment.id, comment.message
FROM post
OUTER APPLY
(
SELECT TOP 1 *
FROM comment c
WHERE c.post_id = post.id
ORDER BY
c.date DESC
) comment
or
SELECT *
FROM (
SELECT post.id, post.title, comment.id, comment.message,
ROW_NUMBER() OVER (PARTITION BY post.id ORDER BY comment.date DESC) AS rn
FROM post
LEFT JOIN
comment
ON comment.post_id = post.id
) q
WHERE rn = 1
The former is more efficient for few posts with many comments in each; the latter is more efficient for many posts with few comments in each.
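A minimal runnable sketch of the ROW_NUMBER() variant, run against an in-memory SQLite database through Python's sqlite3 module rather than SQL Server. This assumes SQLite 3.25 or newer (window function support); OUTER APPLY has no SQLite equivalent, so only the second query is shown. The sample rows and the post_id/comment_id output aliases are made up for illustration.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE comment (id INTEGER PRIMARY KEY, post_id INTEGER,
                          message TEXT, date TEXT);
    INSERT INTO post VALUES (1, 'First'), (2, 'Second');
    INSERT INTO comment VALUES
        (1, 1, 'oldest', '2020-01-01'),
        (2, 1, 'newest', '2020-03-01'),
        (3, 1, 'middle', '2020-02-01');
""")

# Number each post's comments from newest to oldest, then keep only row 1.
# The LEFT JOIN keeps post 2, which has no comments at all.
latest_rows = conn.execute("""
    SELECT post_id, title, comment_id, message
    FROM (
        SELECT post.id AS post_id, post.title,
               comment.id AS comment_id, comment.message,
               ROW_NUMBER() OVER (PARTITION BY post.id
                                  ORDER BY comment.date DESC) AS rn
        FROM post
        LEFT JOIN comment ON comment.post_id = post.id
    ) q
    WHERE rn = 1
    ORDER BY post_id
""").fetchall()

for row in latest_rows:
    print(row)
```

Post 1 comes back once with its newest comment, and post 2 comes back once with NULLs in the comment columns.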

Subquery:
SELECT p.id, p.title, c.id, c.message
FROM post p
LEFT join comment c
ON c.post_id = p.id AND c.id =
(SELECT MAX(c2.id) FROM comment c2 WHERE c2.post_id = p.id)

You'll want to join to a sub-query that returns the last comment for the post. For example:
select post.id, post.title, lastpostid, lastcommentid
from post
inner join
(
select post.id as lastpostid, max(comment.id) as lastcommentid
from post
inner join comment on comment.post_id = post.id
group by post.id
) lastcomment
on lastpostid = post.id

Couple of options....
One way is to do the JOIN on:
SELECT TOP 1 comment.message FROM comment ORDER BY comment.id DESC
(note I'm assuming that comment.id is an Identity field)

What version of SQL Server? If you have the ROW_NUMBER() function available, you can sort your comments by whatever "first" means to you and then add a "WHERE rn = 1" clause. I don't have a handy example or the exact syntax off the top of my head, but I have tons of queries that do exactly this. The other answers show a few of the thousands of ways you could do this.
I'd say profile it and see which one performs best for you.

You didn't say the specific name of your date field, so I filled in with [DateCreated]. This is essentially the same as AGoodDisplayName's post above, but using the date field instead of relying on the ID column ordering.
SELECT p.id, p.title, comment.id, comment.message
FROM post p
LEFT OUTER JOIN comment
ON comment.id = (
SELECT TOP 1 id
FROM comment
WHERE p.id = post_id
ORDER BY [DateCreated] DESC
)
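To see why ordering by the date column can matter, here is a small sketch using Python's sqlite3 module. It is not T-SQL: SQLite has no TOP, so LIMIT 1 stands in for TOP 1. The sample rows are invented, and they deliberately include a backdated comment so that the highest id is not the most recent comment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE comment (id INTEGER PRIMARY KEY, post_id INTEGER,
                          message TEXT, DateCreated TEXT);
    INSERT INTO post VALUES (1, 'First');
    -- Hypothetical data: comment 2 was inserted later but carries an older date.
    INSERT INTO comment VALUES
        (1, 1, 'written last', '2021-06-01'),
        (2, 1, 'imported later, but older', '2019-01-01');
""")

# Pick the comment with the latest DateCreated (LIMIT 1 replaces TOP 1).
by_date = conn.execute("""
    SELECT p.id, p.title, c.id, c.message
    FROM post p
    LEFT JOIN comment c
      ON c.id = (SELECT c2.id FROM comment c2
                 WHERE c2.post_id = p.id
                 ORDER BY c2.DateCreated DESC
                 LIMIT 1)
""").fetchall()

# Pick the comment with the highest id instead.
by_id = conn.execute("""
    SELECT p.id, p.title, c.id, c.message
    FROM post p
    LEFT JOIN comment c
      ON c.id = (SELECT MAX(c2.id) FROM comment c2 WHERE c2.post_id = p.id)
""").fetchall()

print(by_date)
print(by_id)
```

With this data the two queries return different comments, so the id shortcut is only safe when ids are guaranteed to follow creation order.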

Related

Postgres - Left join using a where clause + distinct

I want to join two tables using a join
SELECT * FROM posts
LEFT JOIN voted ON posts.post_id = voted.id
Which produces this:
How would I create query using:
ORDER BY date_posted DESC FETCH FIRST 5 ROW ONLY
on the Posts Table to return this result
Edit 1: duplicate post_id
How would I make it so that the uuid on the user_id column is only 82411850-
Edit 2: Final query thanks to Mr.Linoff
SELECT p.post_id, p.date_posted, p.posted_by,
v.user_id, v.votes
FROM posts p LEFT JOIN
voted v
ON p.post_id = v.id
AND v.user_id = '82411580...'
ORDER BY p.date_posted DESC
FETCH FIRST 5 ROW ONLY ;
You have a collision of ids. Be explicit about the columns you are selecting.
Then I think you have basically the right logic:
SELECT p.post_id, p.date_posted, p.posted_by,
v.user_id, v.votes
FROM posts p LEFT JOIN
voted v
ON p.post_id = v.id
ORDER BY p.date_posted DESC
FETCH FIRST 5 ROW ONLY ;
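The final query in Edit 2 puts the user_id filter in the ON clause rather than in a WHERE clause, and that placement is what keeps posts the user has not voted on. A minimal sketch of the difference, using Python's sqlite3 module instead of Postgres; the user ids 'user-a' and 'user-b' and all rows are made up, and the FETCH FIRST clause is left out because SQLite spells it LIMIT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (post_id INTEGER PRIMARY KEY, date_posted TEXT);
    CREATE TABLE voted (id INTEGER, user_id TEXT, votes INTEGER);
    INSERT INTO posts VALUES (1, '2021-01-01'), (2, '2021-01-02');
    -- Hypothetical votes: user-a voted only on post 1.
    INSERT INTO voted VALUES (1, 'user-a', 1), (1, 'user-b', -1), (2, 'user-b', 1);
""")

# Filter inside ON: every post survives, unmatched posts get NULL vote columns.
filter_in_on = conn.execute("""
    SELECT p.post_id, v.user_id, v.votes
    FROM posts p
    LEFT JOIN voted v ON p.post_id = v.id AND v.user_id = 'user-a'
    ORDER BY p.date_posted DESC
""").fetchall()

# Filter in WHERE: rows with a NULL user_id are discarded after the join,
# so the LEFT JOIN behaves like an INNER JOIN.
filter_in_where = conn.execute("""
    SELECT p.post_id, v.user_id, v.votes
    FROM posts p
    LEFT JOIN voted v ON p.post_id = v.id
    WHERE v.user_id = 'user-a'
    ORDER BY p.date_posted DESC
""").fetchall()

print(filter_in_on)
print(filter_in_where)
```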

Group by on join and calculate max of groups

I have 3 tables as described below:
(Image: all three tables and the expected output. The tables are posts, post_comments, and comments.)
Now I want to fetch the posts that have highest liked comments and the status of that comment should be active in Postgres.
OUTPUT: (image of the resultant posts)
NOTE: Since for post 1, the highest liked comment is inactive.
I've tried something like this:
select "posts".*
from "posts"
inner join (select id, max(likes) l from comments inner join post_comments on comments.id = post_comments.alert_id and post_comments.post_id = posts.id) a on posts.id = a.cid ...
This is not complete but I'm unable to do this.
In Postgres, you can get the active comment with the most likes for each post using distinct on:
select distinct on (pc.post_id) pc.*
from post_comments pc join
comments c
on pc.comment_id = c.id
where c.status = 'active'
order by pc.post_id, c.likes desc;
I think this is quite related to what you want.
Try something like this:
SELECT posts.*, MAX(likes) l
FROM posts
JOIN post_comments ON post_id = posts.id
LEFT JOIN comments ON comment_id = comments.id
GROUP BY posts.id

SQL Server: Join two tables and only get data of one table without identical rows

I have two tables, not the actual ones, I am trying to replicate the situation here:
tblMstPost with a large number of columns(say 20). I am adding just two here.
PostId  Title
1       First Post
2       Second Post
3       Third post
tblTrnComment
CommentId  PostId  Comment
1          1       Hello
2          1       Hi
3          1       Hey
4          2       Test
5          3       Hello
Now I want data from post table only. I don't need any data from Comment table.
The condition to get data from post table is that I need posts which have the comment "Hello" and "Hi".
Now I can write something like this:
SELECT p.*
FROM tblMstPost AS p
INNER JOIN tblTrnComment AS c
ON p.PostId = c.PostId
WHERE c.CommentId IN (1, 2, 5)
Above query will give results with two identical rows with PostId 1.
PostId  Title
1       First Post
1       First Post
3       Third post
Now I want to remove one of the identical rows. I have tried DISTINCT but one of my columns on Post table has text data type and for this reason, DISTINCT is not working. When I GROUP BY p.PostId SQL server asks the same for all of the columns: [column] is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause. Please note that post table has a large number of columns and I don't want to add all of them to GROUP BY statement.
Is there any solution other than using WHERE IN subquery?
Update: I have found this video on Youtube which clearly explains exists and my question: https://youtu.be/zfgJ3ZmAgNw
Try
select * from tblMstPost p
where exists(select 1 from tblTrnComment
where PostId = p.PostId and Comment in ('Hi', 'Hello'));
A JOIN can produce duplicate rows because tblTrnComment has many rows associated with each PostId from tblMstPost. What you need is IN or EXISTS,
so you don't need a JOIN at all:
select p.*
from tblMstPost p
where p.PostId in (select PostId
from tblTrnComment
where Comment in ('Hi', 'Hello')
);
That said, I would suggest EXISTS instead:
select p.*
from tblMstPost p
where exists(select 1
from tblTrnComment c
where c.PostId = p.PostId and
c.Comment in ('Hi', 'Hello')
);
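A quick way to check the difference is to run both forms against the sample data from the question. The sketch below uses Python's sqlite3 module as a stand-in for SQL Server; only the two columns shown in the question are created, and the variable names are arbitrary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblMstPost (PostId INTEGER PRIMARY KEY, Title TEXT);
    CREATE TABLE tblTrnComment (CommentId INTEGER PRIMARY KEY,
                                PostId INTEGER, Comment TEXT);
    INSERT INTO tblMstPost VALUES
        (1, 'First Post'), (2, 'Second Post'), (3, 'Third post');
    INSERT INTO tblTrnComment VALUES
        (1, 1, 'Hello'), (2, 1, 'Hi'), (3, 1, 'Hey'), (4, 2, 'Test'), (5, 3, 'Hello');
""")

# The join returns one row per matching comment, so post 1 appears twice.
joined = conn.execute("""
    SELECT p.PostId, p.Title
    FROM tblMstPost p
    INNER JOIN tblTrnComment c ON p.PostId = c.PostId
    WHERE c.Comment IN ('Hi', 'Hello')
    ORDER BY p.PostId
""").fetchall()

# EXISTS only asks whether a matching comment exists, so each post appears once.
with_exists = conn.execute("""
    SELECT p.PostId, p.Title
    FROM tblMstPost p
    WHERE EXISTS (SELECT 1 FROM tblTrnComment c
                  WHERE c.PostId = p.PostId
                    AND c.Comment IN ('Hi', 'Hello'))
    ORDER BY p.PostId
""").fetchall()

print(joined)
print(with_exists)
```

Because EXISTS never multiplies the post rows, no DISTINCT or GROUP BY is needed, which sidesteps the text column problem entirely.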
For grouping, every non-aggregated column in the select list has to appear in the GROUP BY. When you grouped by the id alone, the compiler didn't know what to do with the titles: should it treat them as part of the group, or aggregate them? So list all non-aggregated columns in your GROUP BY, as below. Note that WHERE has to come before GROUP BY.
SELECT p.PostId, p.Title
FROM tblMstPost AS p
INNER JOIN tblTrnComment AS c
ON p.PostId = c.PostId
WHERE c.CommentId IN (1, 2, 5)
GROUP BY p.PostId, p.Title
To avoid a WHERE IN clause, you could join to a distinct set of Post Ids which have the appropriate comment:
SELECT p.*
FROM tblMstPost p
JOIN ( SELECT DISTINCT PostId
FROM tblTrnComment
WHERE Comment = 'Hello'
OR Comment = 'Hi' -- Or have an IN here, or a lookup etc
) t
ON p.PostId = t.PostId

SQL Query Question: X has many Y. Get all X and get only the newest Y per X

Suppose we have two tables. Post and Comment. Post has many Comments. Pretend they are somewhat filled so that the number of comments per post is varied. I want a query which will grab all posts but only the newest comment per post.
I have been directed to joins and sub queries but I can't figure it out.
Example Output:
Post1:
Comment4 (newest for post1)
Post2:
Comment2 (newest for post2)
Post3:
Comment 10 (newest for post3)
etc...
Any help would be greatly appreciated. Thanks.
This answer assumes that you have a unique identifier for each comment, and that it's an increasing number. That is, later comments have higher numbers than earlier comments. The ids don't have to be sequential; they just have to correspond to the order of input.
First, do a query that extracts the maximum comment id, grouped by post id.
Something like this:
SELECT MAX(ID) MaxCommentID, PostID
FROM Comments
GROUP BY PostID
This will give you a list of post id's, and the highest (latest) comment id for each one.
Then you join with this, to extract the rest of the data from the comments, for those id's.
SELECT C1.*, C2.PostID
FROM Comments AS C1
INNER JOIN (
SELECT MAX(ID) MaxCommentID, PostID
FROM Comments
GROUP BY PostID
) AS C2 ON C1.ID = C2.MaxCommentID
Then, you join with the posts, to get the information about those posts.
SELECT C1.*, P.*
FROM Comments AS C1
INNER JOIN (
SELECT MAX(ID) MaxCommentID, PostID
FROM Comments
GROUP BY PostID
) AS C2 ON C1.ID = C2.MaxCommentID
INNER JOIN Posts AS P ON C2.PostID = P.ID
An alternate approach doesn't use the PostID of the inner query at all. First, pick out the maximum comment id for all unique posts, but don't care about which post, we know they're unique.
SELECT MAX(ID) AS MaxCommentID
FROM Comments
GROUP BY PostID
Then do an IN clause, to get the rest of the data for those comments:
SELECT C1.*
FROM Comments AS C1
WHERE C1.ID IN (
SELECT MAX(ID) AS MaxCommentID
FROM Comments
GROUP BY PostID
)
Then simply join in the posts:
SELECT C1.*, P.*
FROM Comments AS C1
INNER JOIN Posts AS P ON C1.PostID = P.ID
WHERE C1.ID IN (
SELECT MAX(ID) AS MaxCommentID
FROM Comments
GROUP BY PostID
)
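The join-based version can be tried end to end with Python's sqlite3 module. This is only a sketch: the Title and Body columns and the sample rows are assumptions, chosen so that the result lines up with the example output in the question (comment 4 for Post1, 2 for Post2, 10 for Post3).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Posts (ID INTEGER PRIMARY KEY, Title TEXT);
    CREATE TABLE Comments (ID INTEGER PRIMARY KEY, PostID INTEGER, Body TEXT);
    INSERT INTO Posts VALUES (1, 'Post1'), (2, 'Post2'), (3, 'Post3');
    -- Hypothetical comments; higher ID means a newer comment.
    INSERT INTO Comments VALUES (1, 1, 'a'), (2, 2, 'b'), (4, 1, 'c'), (10, 3, 'd');
""")

# Step 1: highest comment ID per post. Step 2: join back to Comments for the
# full row. Step 3: join to Posts for the post details.
newest = conn.execute("""
    SELECT P.Title, C1.ID, C1.Body
    FROM Comments AS C1
    INNER JOIN (
        SELECT MAX(ID) AS MaxCommentID, PostID
        FROM Comments
        GROUP BY PostID
    ) AS C2 ON C1.ID = C2.MaxCommentID
    INNER JOIN Posts AS P ON C2.PostID = P.ID
    ORDER BY P.ID
""").fetchall()

print(newest)
```

Since every join here is an INNER JOIN, a post with no comments would not appear in the result at all.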
Select the newest comment from a subquery
e.g
Select *
from Posts po
Inner Join
(
Select a.CommentThread, a.CommentDate, a.CommentBody, a.Post from comments a
inner join
(select commentthread, max(commentdate) as commentdate
from comments
group by commentthread) b
on a.commentthread = b.commentthread
and a.commentdate = b.commentdate
) co
on po.Post = co.post
select *
from post
, comments
where post.post_id = comments.post_id
and comments.comments_id = (select max(z.comments_id) from comments z where z.post_id = post.post_id)
And if you are still stuck with an old MySQL version that doesn't support subqueries, you can use something like:
SELECT
p.id, c1.id
FROM
posts as p
LEFT JOIN
comments as c1
ON
p.id = c1.postId
LEFT JOIN
comments as c2
ON
c1.postId = c2.postId
AND c1.id < c2.id
WHERE
isnull(c2.id)
ORDER BY
p.id
Either way, check your query with EXPLAIN for performance issues.
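The double left join works as an anti-join: c1 is kept only when no c2 with a higher id exists for the same post. A small sketch with Python's sqlite3 module, using invented rows; SQLite has no isnull() function, so the filter is written as IS NULL, which is also valid in MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, postId INTEGER);
    INSERT INTO posts VALUES (1), (2), (3);
    -- Hypothetical data: post 1 has two comments, post 2 has one, post 3 has none.
    INSERT INTO comments VALUES (1, 1), (2, 1), (3, 2);
""")

# c2 looks for a newer comment on the same post; where none is found,
# c2.id is NULL and c1 is therefore the newest comment (or NULL itself).
newest_per_post = conn.execute("""
    SELECT p.id, c1.id
    FROM posts AS p
    LEFT JOIN comments AS c1 ON p.id = c1.postId
    LEFT JOIN comments AS c2 ON c1.postId = c2.postId AND c1.id < c2.id
    WHERE c2.id IS NULL
    ORDER BY p.id
""").fetchall()

print(newest_per_post)
```

Unlike the inner-join answers, this form also keeps posts without any comments, with a NULL comment id.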

SQL: Get all posts with any comments

I need to construct some rather simple SQL, I suppose, but as it's a rare event that I work with DBs these days I can't figure out the details.
I have a table 'posts' with the following columns:
id, caption, text
and a table 'comments' with the following columns:
id, name, text, post_id
What would the (single) SQL statement look like which retrieves the captions of all posts which have one or more comments associated with it through the 'post_id' key? The DBMS is MySQL if it has any relevance for the SQL query.
select p.caption, count(c.id)
from posts p join comments c on p.id = c.post_id
group by p.caption
having count(c.id) > 0
SELECT DISTINCT p.caption, p.id
FROM posts p,
comments c
WHERE c.post_ID = p.ID
I think using a join would be a lot faster than using the IN clause or a subquery.
SELECT DISTINCT caption
FROM posts
INNER JOIN comments ON posts.id = comments.post_id
Forget about counts and subqueries.
The inner join will pick up all the comments that have valid posts and exclude all the posts that have 0 comments. The DISTINCT will coalesce the duplicate caption entries for posts that have more than 1 comment.
I find this syntax to be the most readable in this situation:
SELECT * FROM posts P
WHERE EXISTS (SELECT * FROM Comments WHERE post_id = P.id)
It expresses your intent better than most of the others in this thread - "give me all the posts ..." (select * from posts) "... that have any comments" (where exist (select * from comments ... )). It's essentially the same as the joins above, but because you're not actually doing a join, you don't have to worry about getting duplicates of the records in Posts, so you'll just get one record per post.
SELECT caption FROM posts
INNER JOIN comments ON comments.post_id = posts.id
GROUP BY posts.id;
No need for a having clause or count().
edit: Should be an inner join of course (to avoid nulls if a comment is orphaned), thanks to jishi.
Just going off the top of my head here but maybe something like:
SELECT caption FROM posts WHERE id IN (SELECT post_id FROM comments GROUP BY post_id HAVING count(*) > 0)
You're basically looking at performing a subquery --
SELECT p.caption FROM posts p WHERE (SELECT COUNT(*) FROM comments c WHERE c.post_id=p.id) > 0;
This has the effect of running the SELECT COUNT(*) subquery for each row in the posts table. Depending on the size of your tables, you might consider adding an additional column, comment_count, to your posts table to store the number of corresponding comments, so that you can simply do
SELECT p.caption FROM posts p WHERE comment_count > 0