Determining relationship between rows using subqueries in PostgreSQL - sql

I'm trying to figure out how to complete the following task in a single query.
Basically, given a user's ID, I want to return the user profiles of all users he is friends with.
If anything is unclear, I'll be happy to go into more detail. Thanks!
table 'users':
user_id | col1 | col2 | etc
-----------------------------------------
a | *** | *** | ***
-----------------------------------------
b | *** | *** | ***
table 'users_friends'
user_id | friend_user_id | status
-----------------------------------------
a | b | 1
-----------------------------------------
b | a | 1
given a value of a, find rows in table users_friends where
user_id = a
status = 1
using the resulting rows of that query, find rows in table users_friends where
user_id = b (column `friend_user_id` from resulting rows)
friend_user_id = a (column `user_id` from resulting rows)
status = 1
if any rows are returned, select rows from table 'users' where
user_id = b (column `user_id` from resulting row)
This is a really rough one I came up with. I think it does what I'm looking for, but I'm sure there are better ways to go about it.
SELECT * FROM users WHERE user_id IN
(SELECT user_id FROM users_friends WHERE friend_user_id IN
(SELECT user_id FROM users_friends WHERE user_id = 'someuserid' AND status = 1 ) AND status = 1 );

select u.*
from
users u
inner join
users_friends f on u.user_id = f.friend_user_id
where
f.status = 1
and f.user_id = 'a'

Assuming there are no duplicates in friends table:
SELECT u.user_id, u.col1, u.col2
FROM users AS u
JOIN users_friends AS f1 ON u.user_id=f1.user_id
JOIN users_friends AS f2 ON f1.user_id=f2.friend_user_id AND f1.friend_user_id=f2.user_id
WHERE f1.status=1 AND f2.status=1 AND f2.user_id='a'

SQL Fiddle
SELECT u.user_id, u.col1
FROM users_friends AS f
JOIN users AS u
ON f.friend_user_id = u.user_id
WHERE f.user_id = 'a'
AND f.status = 1
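The mutual check described in the question (a lists b with status 1, and b lists a with status 1) can also be written with EXISTS. This is only a sketch against the `users` and `users_friends` tables from the question, with `'a'` standing in for the given user ID:

```sql
-- Profiles of everyone 'a' has confirmed as a friend,
-- kept only when the reverse row also exists with status = 1.
SELECT u.*
FROM users AS u
JOIN users_friends AS f
  ON f.friend_user_id = u.user_id
WHERE f.user_id = 'a'
  AND f.status = 1
  AND EXISTS (
        SELECT 1
        FROM users_friends AS r
        WHERE r.user_id = f.friend_user_id
          AND r.friend_user_id = f.user_id
          AND r.status = 1
      );
```

If a one-directional row is enough to count as a friendship, the EXISTS clause can simply be dropped.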

Related

Efficiently getting multiple counts of foreign key rows in PostgreSQL

I have a database that consists of users who can perform various actions, which I keep track of in multiple tables. I'm creating a point system, so I need to count how many of each type of action the user did. For example, if I had:
users posts comments shares
id | username id | user_id id | user_id id | user_id
------------- -------------- -------------- --------------
1 | abc 1 | 1 1 | 1 1 | 2
2 | xyz 2 | 1 2 | 2 2 | 2
I would want to return:
user_details
id | username | post_count | comment_count | share_count
---------------------------------------------------------
1 | abc | 2 | 1 | 0
2 | xyz | 0 | 1 | 2
This is slightly different from this question about foreign key counts since I want to return the individual counts per table.
What I've tried so far (example code):
SELECT
users.id,
users.username,
COUNT( DISTINCT posts.id ) as post_count,
COUNT( DISTINCT comments.id ) as comment_count,
COUNT( DISTINCT shares.id ) as share_count
FROM users
LEFT JOIN posts ON posts.user_id = users.id
LEFT JOIN comments ON comments.user_id = users.id
LEFT JOIN shares ON shares.user_id = users.id
GROUP BY users.id
While this works, I had to use DISTINCT in all of my counts because the LEFT JOINS were causing high numbers of duplicate rows. I feel like there must be a better way to do this since (please correct me if I'm wrong) on each LEFT JOIN, the DISTINCT is having to filter out an exponentially growing number of duplicated rows.
Thank you so much for any help you could give me with this!
You can join derived tables that already do the aggregation.
SELECT u.id,
u.username,
coalesce(pc.c, 0) AS post_count,
coalesce(cc.c, 0) AS comment_count,
coalesce(sc.c, 0) AS share_count
FROM users AS u
LEFT JOIN (SELECT p.user_id,
count(*) AS c
FROM posts AS p
GROUP BY p.user_id) AS pc
ON pc.user_id = u.id
LEFT JOIN (SELECT c.user_id,
count(*) AS c
FROM comments AS c
GROUP BY c.user_id) AS cc
ON cc.user_id = u.id
LEFT JOIN (SELECT s.user_id,
count(*) AS c
FROM shares AS s
GROUP BY s.user_id) AS sc
ON sc.user_id = u.id;
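Another way to avoid the row multiplication described in the question is one correlated scalar subquery per count. This is a sketch using the table names from the question, not a claim about which form is faster; that depends on the data and indexes, so it is worth comparing both with EXPLAIN ANALYZE.

```sql
-- Each subquery counts rows for the current user only, so nothing is
-- multiplied across tables. count(*) over zero rows is 0, so no
-- coalesce is needed.
SELECT u.id,
       u.username,
       (SELECT count(*) FROM posts    AS p WHERE p.user_id = u.id) AS post_count,
       (SELECT count(*) FROM comments AS c WHERE c.user_id = u.id) AS comment_count,
       (SELECT count(*) FROM shares   AS s WHERE s.user_id = u.id) AS share_count
FROM users AS u
ORDER BY u.id;
```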

Join one table with two other ones by id

I am trying to join one table with two others that are unrelated to each other but are linked to the first one by an id.
I have the following tables:
create table groups(
id int,
name text
);
create table members(
id int,
groupid int,
name text
);
create table invites(
id int,
groupid int,
status int -- 2 for accepted, 1 if it's pending
);
Then I inserted the following data
insert into groups (id, name) values(1,'group');
insert into members(id, groupid, name) values(1,1,'admin'),(1,1,'other');
insert into invites(id, groupid, status) values(1,1,2),(2,1,1),(3,1,1);
Notes:
The admin does not have an invite
The group has an approved invitation with status 2 (because the member 'other' joined)
The group has two pending invites with status 1
I am trying to do a query that gets the following result
groupid | name | inviteId
1 | admin | null
1 | other | null
1 | null | 2
1 | null | 3
I have tried the following queries with no luck:
select g.id, m.name, i.id from groups g
left join members m ON m.groupid = g.id
left join invites i ON i.groupid = g.id and i.status = 1;
select g.id, m.name, i.id from groups g
join (select groupid, name from members) m ON m.groupid = g.id
join (select groupid, id from invites where status = 1) i ON i.groupid = g.id;
Any ideas of what I am doing wrong?
Because members and invites are not related, you need to use two separate queries and use UNION (automatically removes duplicates) or UNION ALL (keeps duplicates) to get the output you desire:
select g.id as groupid, m.name, null as inviteid from groups g
join members m ON m.groupid = g.id
union all
select g.id, null, i.id from groups g
join invites i ON (i.groupid = g.id and i.status = 1);
Output:
groupid | name | inviteid
---------+-------+----------
1 | admin |
1 | other |
1 | | 3
1 | | 2
(4 rows)
Without a UNION, your query implies that the tables have some sort of relationship, so the columns are joined side by side. Since you want to preserve the null values, implying that the tables are not related, you need to concatenate them vertically with UNION.
Disclosure: I work for EnterpriseDB (EDB)
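As a small illustration of the UNION versus UNION ALL difference mentioned above, here is a sketch that uses literal rows rather than the tables from the question:

```sql
-- UNION removes duplicate rows: this returns a single row.
SELECT 1 AS groupid, 'admin' AS name
UNION
SELECT 1, 'admin';

-- UNION ALL keeps them: this returns two identical rows.
SELECT 1 AS groupid, 'admin' AS name
UNION ALL
SELECT 1, 'admin';
```

For the members/invites case the two halves can never produce identical rows, so UNION ALL gives the same result while skipping the duplicate-removal step.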

Writing a complex SQL query, with table relations

I will present my table structures first (only relevant fields will be mentioned)
/* The table Users */
user_id | user_name | user_registration_date
1 | USER1 | 19/09/2010
2 | USER2 | 20/09/2010
/* The table Levels_Completed */
user_id | level_id
1 | 1
1 | 2
2 | 1
I would like to display a scoreboard. The first user on the list will be the one who has completed the most levels.
For the example above, USER1 will be displayed above USER2.
I want to receive the following data:
user_id, user_name, user_registration_date, COUNT(level_id rows) AS score
The rows should be ordered by score, highest first.
Example:
1 | USER1 | 19/09/2010 | 2
2 | USER2 | 20/09/2010 | 1
I know how to use INNER JOIN, but I think the counting and ordering are above my current level. Help please?
SELECT Users.user_id, user_name, user_registration_date, COUNT(level_id) AS score
FROM Users INNER JOIN Levels_Completed ON Users.user_id = Levels_Completed.user_id
GROUP BY Users.user_id, user_name, user_registration_date
ORDER BY score DESC
Try this:
SELECT
U.user_id,
U.user_name,
U.user_registration_date,
COUNT(L.level_id) as score
FROM Users U
LEFT JOIN Levels_Completed L
ON U.User_Id = L.User_Id
GROUP BY U.user_id, U.user_name, U.user_registration_date
ORDER BY score DESC
SELECT Users.user_id, user_name, user_registration_date, score
FROM Users
INNER JOIN (
SELECT user_id, COUNT(level_id) AS score
FROM Levels_Completed
GROUP BY user_id) AS s
USING (user_id)
ORDER BY score DESC
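The answers above differ in one respect: the INNER JOIN versions drop users who have not completed any level, while the LEFT JOIN version keeps them with a score of 0. A self-contained sketch of that difference, written in PostgreSQL syntax with inline sample rows (USER3 is a hypothetical user added only to show the zero case, and the registration date is left out for brevity):

```sql
WITH users(user_id, user_name) AS (
    VALUES (1, 'USER1'), (2, 'USER2'), (3, 'USER3')
),
levels_completed(user_id, level_id) AS (
    VALUES (1, 1), (1, 2), (2, 1)
)
SELECT u.user_id,
       u.user_name,
       COUNT(l.level_id) AS score   -- counts only non-NULL level_id values
FROM users AS u
LEFT JOIN levels_completed AS l
  ON l.user_id = u.user_id
GROUP BY u.user_id, u.user_name
ORDER BY score DESC, u.user_id;
-- USER1 scores 2, USER2 scores 1, USER3 scores 0.
-- With INNER JOIN, the USER3 row would not appear at all.
```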

SQL Order and group by

I'm getting a bit lost in making a query that performs a certain look up.
I have the first part of the query working, which returns all users that are missing some entries on their account. Now I need to filter this subset further based on each user's last login attempt.
The table structures are as follows:
The users table contains all user information. We only care about users under project_id 33.
The user_account_list table contains all accounts a user has. We only care about users NOT having an entry for service 50 (game_id = 50).
The user_login_logs table contains all login attempts for a user.
The original query I have is this:
SELECT u.id,
u.login,
u.email,
u.nickname,
b.station_login AS "additionals.station_login",
a.id AS "user_account_list.id",
a.game_id AS "user_account_list.game_id",
a.game_uid AS "user_account_list.game_uid",
c.created_at AS "last login"
FROM users u
LEFT JOIN user_account_list a ON u.id = a.user_id AND a.game_id = 50
LEFT JOIN user_additionals b ON u.id = b.id
LEFT JOIN user_login_logs c ON u.id = c.user_id
WHERE u.project_id = 33
AND u.verified_at IS NOT NULL
AND (a.id IS NULL OR a.game_id IS NULL OR a.game_uid IS NULL)
AND (b.station_login IS NULL OR b.station_login = '')
ORDER BY c.created_at DESC
This returns all users registered under project_id 33 that do not have an entry for game_id 50 and have no station_login stored in their additional info table (that last condition is optional and not relevant here; it just limits the data returned). It gives me multiple rows per user, sorted by login date, newest first.
What I need is only 1 row per user, with their LATEST login date. I tried replacing the ORDER BY with GROUP BY u.id, but this gives me the oldest result, not the latest.
How can I:
Limit the rows returned to only 1 row per user
Make sure the row is based on the latest login attempt of the user.
EDIT:
This is what the query currently returns:
+----+-------+-----------------+----------+---------------------------+----------------------+---------------------------+----------------------------+---------------------+
| id | login | email | nickname | additionals.station_login | user_account_list.id | user_account_list.game_id | user_account_list.game_uid | last login |
+----+-------+-----------------+----------+---------------------------+----------------------+---------------------------+----------------------------+---------------------+
| 1 | usrnm | someon#mail.com | Nickname | | NULL | NULL | NULL | 2012-10-19 00:00:00 |
| 1 | usrnm | someon#mail.com | Nickname | | NULL | NULL | NULL | 2012-10-18 00:00:00 |
| 1 | usrnm | someon#mail.com | Nickname | | NULL | NULL | NULL | 2012-10-17 00:00:00 |
+----+-------+-----------------+----------+---------------------------+----------------------+---------------------------+----------------------------+---------------------+
3 rows in set (0.08 sec)
One way to do this is to JOIN the table user_login_logs with the following derived table:
SELECT user_id, MAX(created_at) LatestDate
FROM user_login_logs
GROUP BY user_id
and join it on created_at = LatestDate. This will limit each user's login logs to the latest created date for that user. Here is your query:
SELECT u.id,
u.login,
u.email,
u.nickname,
b.station_login AS "additionals.station_login",
a.id AS "user_account_list.id",
a.game_id AS "user_account_list.game_id",
a.game_uid AS "user_account_list.game_uid",
c.created_at AS "last login"
FROM users u
LEFT JOIN user_account_list a ON u.id = a.user_id AND a.game_id = 50
LEFT JOIN user_additionals b ON u.id = b.id
LEFT JOIN
(
SELECT user_id, MAX(created_at) LatestDate
FROM user_login_logs
GROUP BY user_id
) maxc ON u.id = maxc.user_id
LEFT JOIN user_login_logs c ON c.user_id = maxc.user_id AND c.created_at = maxc.LatestDate
WHERE u.project_id = 33
AND u.verified_at IS NOT NULL
AND (a.id IS NULL OR a.game_id IS NULL OR a.game_uid IS NULL)
AND (b.station_login IS NULL OR b.station_login = '')
ORDER BY c.created_at DESC;
Note that you are using LEFT JOIN, so unmatched rows from the left-hand tables are still included in the result set. If you don't need to include them, use INNER JOIN instead.
You need to replace your ORDER BY with GROUP BY u.id, as you tried, but your SELECT also has to state that you want the latest date in each group, so you need to replace
c.created_at AS "last login"
by
MAX(c.created_at) AS "last login"
This returns only one row per user thanks to the GROUP BY, and MAX() picks the latest date for each user.
Edit: I think you should avoid column aliases that contain spaces, to avoid mistakes.
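If only the latest login date is needed, and no other column from the log table, a correlated scalar subquery in the SELECT list is another option that cannot multiply rows. A trimmed sketch using the table and column names from the question (the other joins and filters are left out for brevity, and `last_login` is an alias chosen here):

```sql
-- One row per user; last_login is NULL for users with no log rows.
SELECT u.id,
       u.login,
       (SELECT MAX(l.created_at)
        FROM user_login_logs AS l
        WHERE l.user_id = u.id) AS last_login
FROM users AS u
WHERE u.project_id = 33
ORDER BY last_login DESC;
```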

Subquery to return the latest entry for each parent ID

I have a parent table with entries for documents and I have a history table which logs an audit entry every time a user accesses one of the documents.
I'm writing a search query to return a list of documents (filtered by various criteria) with the latest user id to access each document returned in the result set.
Thus for
DOCUMENTS
ID | NAME
1 | Document 1
2 | Document 2
3 | Document 3
4 | Document 4
5 | Document 5
HISTORY
DOC_ID | USER_ID | TIMESTAMP
1 | 12345 | TODAY
1 | 11111 | IN THE PAST
1 | 11111 | IN THE PAST
1 | 12345 | IN THE PAST
2 | 11111 | TODAY
2 | 12345 | IN THE PAST
3 | 12345 | IN THE PAST
I'd be looking to get a return from my search like
ID | NAME | LAST_USER_ID
1 | Document 1 | 12345
2 | Document 2 | 11111
3 | Document 3 | 12345
4 | Document 4 |
5 | Document 5 |
Can I easily do this with one SQL query and a join between the two tables?
Revising what Andy White produced, and replacing square brackets (MS SQL Server notation) with DB2 (and ISO standard SQL) "delimited identifiers":
SELECT d.id, d.name, h.last_user_id
FROM Documents d LEFT JOIN
(SELECT r.doc_id AS id, user_id AS last_user_id
FROM History r JOIN
(SELECT doc_id, MAX("timestamp") AS "timestamp"
FROM History
GROUP BY doc_id
) AS l
ON r."timestamp" = l."timestamp"
AND r.doc_id = l.doc_id
) AS h
ON d.id = h.id
I'm not absolutely sure whether "timestamp" or "TIMESTAMP" is correct - probably the latter.
The advantage of this is that it replaces the inner correlated sub-query in Andy's version with a simpler non-correlated sub-query, which has the potential to be (radically?) more efficient.
I couldn't get the "HAVING MAX(TIMESTAMP)" to run in SQL Server - I guess having requires a boolean expression like "having max(TIMESTAMP) > 2009-03-05" or something, which doesn't apply in this case. (I might be doing something wrong...)
Here is something that seems to work - note the join has 2 conditions (not sure if this is good or not):
select
d.ID,
d.NAME,
h."USER_ID" as "LAST_USER_ID"
from Documents d
left join History h
on d.ID = h.DOC_ID
and h."TIMESTAMP" =
(
select max("TIMESTAMP")
from "HISTORY"
where "DOC_ID" = d.ID
)
This doesn't use a join, but for some queries like this I like to inline the select for the field. If you want to catch the situation when no user has accessed you can wrap it with an NVL().
select a.ID, a.NAME,
(select x.user_id
from HISTORY x
where x.doc_id = a.id
and x.timestamp = (select max(x1.timestamp)
from HISTORY x1
where x1.doc_id = x.doc_id)) as LAST_USER_ID
from DOCUMENTS a
where <your criteria here>
I think it should be something like this:
SELECT ID, Name, b.USER_ID as LAST_USER_ID
FROM DOCUMENTS a LEFT JOIN
( SELECT DOC_ID, USER_ID
FROM HISTORY
GROUP BY DOC_ID, USER_ID
HAVING MAX( TIMESTAMP )) as b
ON a.ID = b.DOC_ID
this might work also:
SELECT ID, Name, b.USER_ID as LAST_USER_ID
FROM DOCUMENTS a
LEFT JOIN HISTORY b ON a.ID = b.DOC_ID
GROUP BY DOC_ID, USER_ID
HAVING MAX( TIMESTAMP )
Select ID, Name, User_ID
From Documents Left Outer Join
History a on ID = DOC_ID
Where ( TimeStamp = ( Select Max(TimeStamp)
From History b
Where a.DOC_ID = b.DOC_ID ) OR
TimeStamp Is NULL ) /* this accommodates the Left */
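On databases with window functions, the same "latest row per document" lookup can be sketched with ROW_NUMBER(). The table and column names follow the question; quoting "TIMESTAMP" in upper case is an assumption about how the column was created, as discussed in the first answer.

```sql
-- Rank each document's history rows from newest to oldest,
-- then keep only the newest one (rn = 1) in the join.
SELECT d.id,
       d.name,
       h.user_id AS last_user_id
FROM documents AS d
LEFT JOIN (
    SELECT doc_id,
           user_id,
           ROW_NUMBER() OVER (PARTITION BY doc_id
                              ORDER BY "TIMESTAMP" DESC) AS rn
    FROM history
) AS h
  ON h.doc_id = d.id
 AND h.rn = 1
ORDER BY d.id;
```

Unlike the MAX()-based joins, this returns exactly one row per document even when two history rows share the same latest timestamp. Which of the tied rows wins is arbitrary unless a tie-breaker is added to the ORDER BY.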