Fetching latest item from a relative collection - sql

I have two tables, User and UserActivity.
How do I write a SQL query that fetches each user and its latest activity? UserActivity.UserId references User.Id.
It might sound simple, but I can't figure out how to get the latest entry from UserActivity for each user.

Try this:
select u.*,
ua.*
from user u
join useractivity ua on ua.userid = u.id
join (select userid, max(useractivityid) as useractivityid from useractivity group by userid) um
on um.useractivityid = ua.useractivityid

Let's suppose that your tables are:
UserT( Id, name )
UserActivity( UserId, sessionNumber, activityTimeStamp)
And when you say latest activity you are talking about the last moment that this user has activity.
In this case, the query is:
select
UserT.name,
max( activityTimeStamp ) as latestActivity
from
UserT left outer join
UserActivity UA on UA.UserId = UserT.Id
group by
UserT.Id, UserT.name
Yes, it is a simple query. The only complexity is grouping by user and taking the aggregated max time.
Regards, and sorry about the delayed answer. I have a little lag today ;)
If you need all columns of the activity, then use a CTE:
;with cte as (
select
UA.*,
ROW_NUMBER() OVER (PARTITION BY UserId ORDER BY activityTimeStamp DESC) as RN
from
UserActivity UA )
select
UserT.*,
cte.*
from
UserT left outer join
cte on cte.RN = 1 and cte.UserId = UserT.Id
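As a rough illustration of the ROW_NUMBER() approach above, here is a minimal sketch that runs the same query shape against an in-memory SQLite database from Python. The sample rows are invented for the example, and it assumes SQLite 3.25 or later, which is where window functions became available.

```python
import sqlite3

# In-memory database with made-up sample rows (assumption: not from the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE UserT (Id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE UserActivity (UserId INTEGER, sessionNumber INTEGER, activityTimeStamp TEXT);
    INSERT INTO UserT VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO UserActivity VALUES
        (1, 10, '2020-01-01 09:00:00'),
        (1, 11, '2020-01-02 09:00:00'),
        (2, 20, '2020-01-03 09:00:00');
""")

# Number each user's activities from newest to oldest, then keep only row 1.
# The LEFT OUTER JOIN keeps users that have no activity at all.
latest = conn.execute("""
    WITH cte AS (
        SELECT UA.*,
               ROW_NUMBER() OVER (PARTITION BY UserId
                                  ORDER BY activityTimeStamp DESC) AS RN
        FROM UserActivity UA
    )
    SELECT UserT.name, cte.sessionNumber, cte.activityTimeStamp
    FROM UserT
    LEFT OUTER JOIN cte ON cte.RN = 1 AND cte.UserId = UserT.Id
    ORDER BY UserT.Id
""").fetchall()

for row in latest:
    print(row)
```

Users without any activity still appear, with NULL (None in Python) in the activity columns.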

Related

SQL CTE scope in query

I have tables ChatMessages, ChatGroups and ChatGroupMemberships. A user can be in 0..N groups and a group can contain 1..N chat messages. The first message is created when the group is initiated and serves as a sort of "alive" ping.
I'm optimizing the way I'm reconstructing the list of a user's conversations. That list is pretty standard and you may know it from any social site:
| Chat with User X -> [last message in that chat group]
| Group chat named ABC -> [same]
What I did so far was simply query the list of ChatGroupMemberships where userId = X (X being the logged-in user), then for each entry in this collection select the latest message in that group, and finally order the entire list on the server (instead of doing that in the DB).
I'm creating queries with NHibernate where possible in this project, so the function for the above follows:
public ChatMessage GetLastMessageInGroup(int groupId)
{
return session.CreateCriteria<ChatMessage>()
.AddOrder(Order.Desc("Date"))
.Add(Restrictions.Eq("RecipientId", groupId))
.SetMaxResults(1)
.List<ChatMessage>().FirstOrDefault();
}
Now I'm pretty sure that this is a pretty ugly solution, and as users created more and more chat groups, reconstructing that conversations list started to take more and more time (I'm doing that via AJAX, but still). A user is now commonly a member of about 50 groups.
Thus a need to optimize this arose, and what I'd like to do now is split the loading of that list into smaller batches. As users scroll, the list loads entries on the fly.
The function I'm working on looks like:
public List<ChatGroupMembershipTimestamp> GetMembershipsWhereUserId(int userId, int take, int skip)
{
}
I've spent a good few hours learning about CTEs and came up with the following:
WITH ChatGroupMemberships AS
(
SELECT p.Date, p.RecipientId, p.RecipientType, p.Id, p.userId,
ROW_NUMBER() OVER(PARTITION BY p.RecipientId, p.userId ORDER BY p.Id DESC)
AS rk FROM ChatMessages p
)
SELECT s.date, s.recipientId as groupId, s.recipientType as groupType, s.userId
FROM ChatGroupMemberships s
WHERE s.rk = 1
Order by s.date desc;
This returns the last (newest) message by every user in every group. The catch is that my user is not a member of every one of these groups, so I need to somehow join this to the ChatGroupMemberships table and check whether there is a row with that user's ID as userId.
An illustration of the query above follows:
I might be as well overcomplicating this, my original plan was to execute:
select m.id as messageId, groupId, date
from ChatGroupMemberships g
join ChatMessages m on g.groupId = m.recipientId
where g.userId = XXX -- user's id
order by m.date desc
Yields:
But here I'd need only the top-most row for each groupId which I tried to do with the query above.
Sorry this might be confusing (due to my lack of proper terms / knowledge) and it is a fairly long question.
Should any clarification be needed I'd be more than happy to collaborate.
Lastly, I'm attaching design schemes of the tables
chat messages:
chat group memberships:
chat groups:
WITH messages_ranked AS
(
SELECT p.Date, p.RecipientId, p.RecipientType, p.Id, p.userId,
ROW_NUMBER() OVER(PARTITION BY p.RecipientId, p.userId ORDER BY p.Id DESC) AS rk
FROM ChatMessages p
JOIN ChatGroupMemberships g
on p.recipientId = g.groupId
where g.userId = XXX
)
SELECT s.date, s.recipientId as groupId, s.recipientType as groupType, s.userId
FROM messages_ranked s
WHERE s.rk = 1
Order by s.date desc;
I can't see your pictures, but if your plan was simply to make a most-recent-row version of your 'original plan', then it would be something like:
with chatMessages as (
select m.*,
rk = row_number() over(
partition by m.recipientid, m.userid
order by m.id desc
)
from chatMessages m
)
select m.id as messageId, g.groupId, m.date
from chatGroupMemberships g
join chatMessages m on g.groupId = m.recipientId and m.rk = 1
where g.userId = XXX -- user's id
order by m.date desc
The main point is that you don't try to make chatGroupMemberships out of chatMessages. You simply do the filtering you need on chatMessages and then use it as you did before.
That being said, if your concern is that 'my user is not a member of every one of these groups', I don't see how your original query worked for you at all.
Based on your C# code, .Add(Restrictions.Eq("RecipientId", groupId)), I'm wondering if you perhaps need to change
g.userId = XXX
to
m.recipientId = XXX
And in any case, why does g.groupId match m.recipientId? That seems like an error or, if not, very odd.
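To make the "rank first, then join to the memberships" idea concrete, here is a minimal sketch in Python against in-memory SQLite (3.25+ for window functions). The sample data, the `latest_per_group` helper and its `take`/`skip` parameters are assumptions made for illustration. It also assumes that one row per group is wanted, so it partitions by recipientId only rather than by recipientId and userId.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ChatGroupMemberships (groupId INTEGER, userId INTEGER);
    CREATE TABLE ChatMessages (
        id INTEGER PRIMARY KEY, recipientId INTEGER, userId INTEGER, date TEXT);
    -- user 7 belongs to groups 1 and 2, but not to group 3
    INSERT INTO ChatGroupMemberships VALUES (1, 7), (1, 8), (2, 7), (2, 9), (3, 9);
    INSERT INTO ChatMessages VALUES
        (1, 1, 7, '2021-05-01'),
        (2, 1, 8, '2021-05-03'),
        (3, 2, 9, '2021-05-02'),
        (4, 3, 9, '2021-05-04');
""")

def latest_per_group(user_id, take=50, skip=0):
    """Newest message of each group the user is a member of, newest group first."""
    return conn.execute("""
        WITH ranked AS (
            SELECT m.*,
                   ROW_NUMBER() OVER (PARTITION BY m.recipientId
                                      ORDER BY m.id DESC) AS rk
            FROM ChatMessages m
        )
        SELECT r.id AS messageId, g.groupId, r.date
        FROM ChatGroupMemberships g
        JOIN ranked r ON r.recipientId = g.groupId AND r.rk = 1
        WHERE g.userId = ?
        ORDER BY r.date DESC
        LIMIT ? OFFSET ?
    """, (user_id, take, skip)).fetchall()

conversations = latest_per_group(7)
second_page = latest_per_group(7, take=1, skip=1)
print(conversations)
print(second_page)
```

The LIMIT/OFFSET pair mirrors the take/skip batching described in the question; on SQL Server the equivalent would be OFFSET ... FETCH.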

Too much Data using DISTINCT MAX

I want to see the last activity of each individual handset and the user that used that handset. I have a table UserSessions that stores the last activity of a particular user as well as which handset they used in that activity. There are roughly 40 handsets, yet I always get back far too many records, around 10,000 rows, when I only want the last activity of each handset. What am I doing wrong?
SELECT DISTINCT MAX(UserSessions.LastActivity), Handsets.Name,Users.Username
FROM UserSessions
INNER JOIN Handsets on Handsets.HandsetId = UserSessions.HandsetId
INNER JOIN Users on Users.UserId = UserSessions.UserId
WHERE
Handsets.Name in (1000,1001,1002,1003,1004....)
AND Handsets.Deleted = 0
GROUP BY UserSessions.LastActivity, Handsets.Name,Users.Username
I expect to get one record per handset, showing the user's last activity with that handset. What I get is multiple records for every handset and date, over 10,000 rows.
You typically GROUP BY the same columns as you SELECT, except those that are arguments to aggregate functions.
This GROUP BY returns no duplicates, so SELECT DISTINCT isn't needed.
SELECT MAX(UserSessions.LastActivity), Handsets.Name, Users.Username
FROM UserSessions
INNER JOIN Handsets on Handsets.HandsetId = UserSessions.HandsetId
INNER JOIN Users on Users.UserId = UserSessions.UserId
WHERE Handsets.Name in (1000,1001,1002,1003,1004....)
AND Handsets.Deleted = 0
GROUP BY Handsets.Name, Users.Username
There is no such thing as DISTINCT MAX. You have SELECT DISTINCT, which ensures that the combination of all columns in the SELECT is not duplicated across rows. And there is MAX(), an aggregation function.
As a note: SELECT DISTINCT is almost never appropriate with GROUP BY.
You seem to want:
SELECT *
FROM (SELECT h.Name, u.Username, MAX(us.LastActivity) as last_activity,
RANK() OVER (PARTITION BY h.Name ORDER BY MAX(us.LastActivity) desc) as seqnum
FROM UserSessions us JOIN
Handsets h
ON h.HandsetId = us.HandsetId INNER JOIN
Users u
ON u.UserId = us.UserId
WHERE h.Name in (1000,1001,1002,1003,1004....) AND
h.Deleted = 0
GROUP BY h.Name, u.Username
) h
WHERE seqnum = 1
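The difference between "one row per handset and user" and "one row per handset" can be seen in a small sketch. It uses Python with in-memory SQLite (3.25+), invented sample rows, and a slightly simplified variant of the query above that ranks the joined session rows directly with ROW_NUMBER() instead of ranking the grouped result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Handsets (HandsetId INTEGER PRIMARY KEY, Name INTEGER, Deleted INTEGER);
    CREATE TABLE Users (UserId INTEGER PRIMARY KEY, Username TEXT);
    CREATE TABLE UserSessions (UserId INTEGER, HandsetId INTEGER, LastActivity TEXT);
    INSERT INTO Handsets VALUES (1, 1000, 0), (2, 1001, 0);
    INSERT INTO Users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO UserSessions VALUES
        (1, 1, '2019-03-01 08:00'),
        (2, 1, '2019-03-02 08:00'),
        (1, 2, '2019-03-03 08:00'),
        (1, 2, '2019-03-04 08:00');
""")

# Shared FROM/WHERE part of both queries.
joined = """
    FROM UserSessions us
    JOIN Handsets h ON h.HandsetId = us.HandsetId
    JOIN Users u ON u.UserId = us.UserId
    WHERE h.Deleted = 0
"""

# Grouping by handset AND user: one row per (handset, user) pair.
per_pair = conn.execute(
    "SELECT h.Name, u.Username, MAX(us.LastActivity)" + joined +
    "GROUP BY h.Name, u.Username ORDER BY h.Name, u.Username"
).fetchall()

# Ranking sessions within each handset and keeping the newest: one row per handset.
per_handset = conn.execute(
    "SELECT Name, Username, LastActivity FROM ("
    " SELECT h.Name, u.Username, us.LastActivity,"
    " ROW_NUMBER() OVER (PARTITION BY h.Name ORDER BY us.LastActivity DESC) AS seqnum"
    + joined +
    ") ranked WHERE seqnum = 1 ORDER BY Name"
).fetchall()

print(per_pair)
print(per_handset)
```

With these rows the grouped query returns three rows (handset 1000 appears once per user), while the ranked query returns exactly two, one per handset.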

SQL: How to Order By And Limit Via a Join

As an example, let's say I have the following query:
select
u.id,
u.name,
(select s.status from user_status s where s.user_id = u.id order by s.created_at desc limit 1) as status
from
user u
where
u.active = true;
The above query works great. It returns the most recent user status for the selected user. However, I want to know how to get the same result using a join on the user_status table, instead of using a sub-query. Is something like this possible?
I'm using PostgreSQL.
Thank you for any help you can give!
select u.id , b.status
from user u join user_status b on u.id= b.user_id
where u.active = true
order by b.created_at desc limit 1;
I think this works.
But in your code there is a "b" and I did not know what it is.
JOIN syntax does not directly offer order or limit as options; so strictly speaking you cannot achieve what you want directly as a join. I believe the easiest way to resolve your question is to use a joined subquery, like this:
select
u.id
, u.name
, s.status
from user u
left join (
select
user_id
, status
, row_number() over(partition by user_id
order by created_at desc) as rn
from user_status s
) s on u.id = s.user_id and s.rn = 1
where u.active = true;
Here the analytic function row_number() combined with the over() clause enables the subsequent join condition and s.rn=1 to take advantage of both ordering and limiting the joined rows via the calculation of the rn value.
N.B. A correlated subquery within the select clause (as used in the question's query) acts like a left join because it can return NULL. If that effect isn't needed or desired, you can change to an inner join.
It is possible to move that subquery into a CTE, but unless there are compelling reasons to do so I prefer using the more traditional form seen above.
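As a quick check that the joined row_number() subquery behaves like the correlated subquery from the question, here is a minimal sketch in Python with in-memory SQLite (3.25+) rather than PostgreSQL. The sample rows are invented, and the table is named app_user here, an assumption made to steer clear of the reserved word user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_user (id INTEGER PRIMARY KEY, name TEXT, active INTEGER);
    CREATE TABLE user_status (user_id INTEGER, status TEXT, created_at TEXT);
    INSERT INTO app_user VALUES (1, 'ann', 1), (2, 'bob', 1), (3, 'cy', 0);
    INSERT INTO user_status VALUES
        (1, 'away',   '2018-01-01'),
        (1, 'online', '2018-01-02'),
        (3, 'busy',   '2018-01-03');
""")

# Form used in the question: correlated scalar subquery in the select list.
subquery_rows = conn.execute("""
    SELECT u.id, u.name,
           (SELECT s.status FROM user_status s
            WHERE s.user_id = u.id
            ORDER BY s.created_at DESC LIMIT 1) AS status
    FROM app_user u
    WHERE u.active = 1
    ORDER BY u.id
""").fetchall()

# Join form: number each user's statuses newest-first, join only row 1.
join_rows = conn.execute("""
    SELECT u.id, u.name, s.status
    FROM app_user u
    LEFT JOIN (
        SELECT user_id, status,
               ROW_NUMBER() OVER (PARTITION BY user_id
                                  ORDER BY created_at DESC) AS rn
        FROM user_status
    ) s ON u.id = s.user_id AND s.rn = 1
    WHERE u.active = 1
    ORDER BY u.id
""").fetchall()

print(subquery_rows)
print(join_rows)
```

Both forms keep the active user who has no status row, with NULL as the status, which is the left-join behaviour described in the note above.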
An alternative approach (for Postgres 9.3 or later) is to use a lateral join, which is quite similar to the original subquery but, because it becomes part of the from clause, is likely to be more efficient than using that subquery in the select clause.
select
u.id
, u.name
, s.status
from user u
left join lateral (
select user_status.status
from user_status
where user_status.user_id = u.id
order by user_status.created_at desc
limit 1
) s ON true
where u.active = true;
I ended up doing the following, which is working great for me:
select
u.id,
u.name,
us.status
from
user u
left join (
select
distinct on (user_id)
*
from
user_status
order by
user_id,
created_at desc
) as us on u.id = us.user_id
where
u.active = true;
This is also more efficient than the query that I had in my question.
You can achieve it by converting the subquery into a WITH clause and using it in the join.

Query for newest record in a table, store query as view

I'm trying to turn this query into a view:
SELECT t.*
FROM user t
JOIN (SELECT t.UserId,
MAX( t.creationDate ) 'max_date'
FROM user t
GROUP BY t.UserId) x ON x.UserId = t.UserId
AND x.max_date = t.creationDate
But views do not accept subqueries.
What this does is look for the latest, newest record of a user.
I got the idea from this other stackoverflow question
Is there a way to turn this into a query with joins, perhaps?
Create two views
Create View MaxCreationDate
As
SELECT t.userId, Max(t.CreationDate) MaxCreated
FROM user t
Group By t.UserId
Create View UserWithMaxDate
As
Select t.*, m.MaxCreated From user t
Join MaxCreationDate m
On m.UserId= t.UserId
and then just call the second one...
EDIT: Based on the comment from Quassnoi, and the
AND x.max_date = t.creationDate condition in your original SQL, I wonder whether you want to see all rows for each distinct user, with the max creation date for that user in every row, or only one row per user, the one that was created most recently.
If the latter is the case, as @Quassnoi suggested in a comment, change the second view query as follows:
Create View UserWithMaxDate
As
Select t.*, m.MaxCreated From user t
Join MaxCreationDate m
On m.UserId= t.UserId
And m.MaxCreated = t.Creationdate
CREATE INDEX ix_user_userid_creationdate_id ON user (userid, creationdate, id);
CREATE VIEW v_duser AS
SELECT DISTINCT userId
FROM user;
CREATE VIEW v_lastuser AS
SELECT u.*
FROM v_duser ud
JOIN user u
ON u.id =
(
SELECT ui.id
FROM user ui
WHERE ui.userid = ud.userid
ORDER BY
ui.userid DESC, ui.creationdate DESC, ui.id DESC
LIMIT 1
);
This is fast and deals with possible duplicates on (userid, creationdate).

Best way to construct this query? [duplicate]

Possible Duplicate:
Retrieving the last record in each group
I have two tables set up similar to this (simplified for the question):
actions-
id - user_id - action - time
users -
id - name
I want to output the latest action for each user. I have no idea how to go about it.
I'm not great with SQL, but from what I've looked up, it should look something like the following. not sure though.
SELECT `users`.`name`, *
FROM users, actions
JOIN < not sure what to put here >
ORDER BY `actions`.`time` DESC
< only one per user_id >
Any help would be appreciated.
SELECT * FROM users JOIN actions ON actions.id=(SELECT id FROM actions WHERE user_id=users.id ORDER BY time DESC LIMIT 1);
You need to do a groupwise max. Please refer to the examples here: http://jan.kneschke.de/projects/mysql/groupwise-max/
Here's an example I did for someone else which is similar to your requirements:
http://pastie.org/925108
select
u.user_id,
u.username,
latest.comment_id
from
users u
left outer join
(
select
max(comment_id) as comment_id,
user_id
from
user_comment
group by
user_id
) latest on u.user_id = latest.user_id;
select u.name, a.action, a.time
from users u, actions a
where u.id = a.user_id
and a.time in (select max(time) from actions where user_id = u.id group by user_id )
Note: untested, but this should be the pattern.
DECLARE @Table TABLE (User_ID Int, Time DateTime)
-- This gets the time of the latest entry for each user
INSERT INTO @Table (User_ID, Time)
SELECT z.user_id, MAX(z.time)
FROM actions z
INNER JOIN users x on x.id = z.user_id
GROUP BY z.user_id
-- Join to get resulting action
SELECT z.user_id, z.action
FROM actions z
INNER JOIN @Table x on x.User_ID = z.user_id and x.Time = z.time
This is the greatest-n-per-group problem that comes up frequently on Stack Overflow. Follow the tag for dozens of other posts on this problem.
Here's how to do it in MySQL given your schema with no subqueries and no GROUP BY:
SELECT u.*, a1.*
FROM users u JOIN actions a1 ON (u.id = a1.user_id)
LEFT OUTER JOIN actions a2 ON (u.id = a2.user_id AND a1.time < a2.time)
WHERE a2.id IS NULL;
In other words, show the user with her action such that if we search for another action with the same user and a later time, we find none.
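The self-join pattern above can be tried out with a minimal sketch in Python and in-memory SQLite instead of MySQL; the sample rows are invented for illustration. No window functions are involved, so any SQLite version should do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE actions (id INTEGER PRIMARY KEY, user_id INTEGER, action TEXT, time TEXT);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO actions VALUES
        (1, 1, 'login', '2010-04-01 10:00:00'),
        (2, 1, 'post',  '2010-04-01 11:00:00'),
        (3, 2, 'login', '2010-04-02 09:00:00');
""")

# a1 is a candidate "latest" action; a2 looks for any later action by the same user.
# Where no later action exists, the LEFT JOIN leaves a2.id NULL and a1 is the latest.
rows = conn.execute("""
    SELECT u.name, a1.action, a1.time
    FROM users u
    JOIN actions a1 ON u.id = a1.user_id
    LEFT OUTER JOIN actions a2 ON u.id = a2.user_id AND a1.time < a2.time
    WHERE a2.id IS NULL
    ORDER BY u.id
""").fetchall()

print(rows)
```

Two properties of this pattern are worth keeping in mind: users with no actions drop out because of the inner join to a1, and if a user has two actions sharing the same latest time, both rows are returned.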
It seems to me that the following should work:
WITH GetMaxTimePerUser (user_id, time) AS (
SELECT user_id, MAX(time)
FROM actions
GROUP BY user_id
)
SELECT u.name, a.action, u_maxtime.time
FROM actions AS a
INNER JOIN users AS u ON u.id=a.user_id
INNER JOIN GetMaxTimePerUser AS u_maxtime ON u_maxtime.user_id=u.id
WHERE a.time=u_maxtime.time
Using a temporary named result set (a common table expression, or CTE) without subqueries or OUTER JOINs is the form most open to query optimization. (A CTE is something like a VIEW, but it exists only inline, for the duration of the query.)