List of questions comparison - sql

I have a profile that looks like this:
profile_id | answer_id
----------------------
1 1
1 4
1 10
I have a table which contains a list of responses by poll respondents with structure like this:
user_id | answer_id
-------------------
1 1
1 9
2 1
2 4
2 10
3 14
3 29
How do I select a list of users that gave all of the answers in the profile? In this case only user 2.

You can use the following:
select user_id
from response r
where answer_id in (select distinct answer_id -- get the list of distinct answer_id
from profile
where profile_id = 1) -- add filter if needed
group by user_id -- group by each user
having count(distinct answer_id) = (select count(distinct answer_id) -- verify the user has the distinct count
from profile
where profile_id = 1) -- add filter if needed
Or another way to write this is:
select user_id
from response r
where answer_id in (1, 4, 10)
group by user_id
having count(distinct answer_id) = 3
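If you want to try the count-based approach locally, here is a minimal sketch using Python's built-in sqlite3 module and the sample data from the question. The table names profile and response follow the answer above; the helper name users_matching_profile is made up for illustration.

```python
import sqlite3

def users_matching_profile(conn, profile_id):
    """Return ids of users who gave every answer listed in the profile."""
    sql = """
        SELECT r.user_id
        FROM response r
        WHERE r.answer_id IN (SELECT answer_id FROM profile WHERE profile_id = ?)
        GROUP BY r.user_id
        HAVING COUNT(DISTINCT r.answer_id) =
               (SELECT COUNT(DISTINCT answer_id) FROM profile WHERE profile_id = ?)
        ORDER BY r.user_id
    """
    return [row[0] for row in conn.execute(sql, (profile_id, profile_id))]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE profile (profile_id INTEGER, answer_id INTEGER);
    CREATE TABLE response (user_id INTEGER, answer_id INTEGER);
    INSERT INTO profile VALUES (1, 1), (1, 4), (1, 10);
    INSERT INTO response VALUES
        (1, 1), (1, 9), (2, 1), (2, 4), (2, 10), (3, 14), (3, 29);
""")
print(users_matching_profile(conn, 1))  # [2]
```

User 1 matches only one of the three profile answers and user 3 matches none, so only user 2 survives the HAVING check.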

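This is an example of a join query with an aggregation:
select a.user_id
from profile p full outer join
answers a
on p.answer_id = a.answer_id and
p.profile_id = 1
group by a.user_id
having count(p.profile_id) = count(*) and
count(a.user_id) = count(*)
The full outer join matches all the profiles to all the answers. If the two sets completely match, then there are no NULLs in the ids of the other set. The having clause checks for just this condition. Note that because the rows are grouped by a.user_id, a profile answer the user did not give never shows up in that user's group, so a user who gave only some of the profile answers (and nothing else) would also pass. If that matters, also compare count(distinct a.answer_id) with the number of answers in the profile.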

SELECT DISTINCT user_id
FROM user_answer
WHERE user_id in (SELECT user_id FROM user_answer WHERE answer_id = 1) AND
user_id in (SELECT user_id FROM user_answer WHERE answer_id = 4) AND
user_id in (SELECT user_id FROM user_answer WHERE answer_id = 10)

SELECT *
FROM table1
INNER JOIN table2
ON table1.answer_id = table2.answer_id
WHERE table2.user_id = 2
I think this might be what you're looking for.

Related

select all row values as a list

I have a table tasks that looks like this:
userId caption status id
1 Paul done 1
2 Ali notDone 18
3 Kevin notDone 12
3 Elisa notDone 13
I join it with another table users to find the number of tasks where status = notDone. I do it like this:
SELECT u.id,
t.number_of_tasks
FROM users u
INNER JOIN (
SELECT userId, COUNT(*) number_of_tasks
FROM tasks
WHERE status = 'notDone'
GROUP BY userId
) t ON u.id = t.userId
Now, I want to create another column captions that somehow includes a list of all captions that were included in the count and fulfil the join + where conditions.
For example, I would expect this as one of the rows. How can I achieve this?
userId number_of_tasks captions
3 2 ["Kevin", "Elisa"]
You can use the json_group_array() aggregate function inside the subquery to create the list of captions for each user:
SELECT u.id, t.number_of_tasks, t.captions
FROM users u
INNER JOIN (
SELECT userId,
COUNT(*) number_of_tasks,
json_group_array(caption) captions
FROM tasks
WHERE status = 'notDone'
GROUP BY userId
) t ON u.id = t.userId;
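Here is a small runnable sketch of that query using Python's built-in sqlite3 module. It assumes your SQLite build includes the JSON functions (most recent builds do), and that users has at least an id column, since the question does not show that table.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);  -- assumed minimal schema
    CREATE TABLE tasks (userId INTEGER, caption TEXT, status TEXT, id INTEGER);
    INSERT INTO users VALUES (1), (2), (3);
    INSERT INTO tasks VALUES
        (1, 'Paul', 'done', 1),
        (2, 'Ali', 'notDone', 18),
        (3, 'Kevin', 'notDone', 12),
        (3, 'Elisa', 'notDone', 13);
""")

rows = conn.execute("""
    SELECT u.id, t.number_of_tasks, t.captions
    FROM users u
    INNER JOIN (
        SELECT userId,
               COUNT(*) AS number_of_tasks,
               json_group_array(caption) AS captions
        FROM tasks
        WHERE status = 'notDone'
        GROUP BY userId
    ) t ON u.id = t.userId
    ORDER BY u.id
""").fetchall()

# captions comes back as JSON text, so decode it into a Python list.
# The order of elements inside each array is not guaranteed, hence the sort.
result = {uid: (n, sorted(json.loads(captions))) for uid, n, captions in rows}
print(result)
```

User 1 has no notDone tasks, so the inner join drops that row entirely.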

Dynamically select the table to join in Postgres with case statements

My notifications table has columns called action_id and trigger_type. I want to INNER JOIN action_id with another table (like users or posts) depending on the trigger_type. I wrote the following query but it throws an error.
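Table structure
users
display_name | username | id
-------------+----------+----
John         | Doe      | 1
Larry        | Doe      | 2
posts
post_title | post_body   | id | user_id
-----------+-------------+----+--------
Hello      | Hello world | 1  | 2
comments
comment_text | post_id | id | user_id
-------------+---------+----+--------
Hello        | 1       | 1  | 1
notifications
read  | trigger_id | id | target_id | action_id | trigger_type
------+------------+----+-----------+-----------+-------------
false | 1          | 1  | 2         | 1         | 0
false | 1          | 2  | 2         | 1         | 1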
trigger_type = 0 means it's a like; 1 means it's a comment.
SELECT notifications.*, users.display_name, users.username, users.profile_pic, posts.title
FROM notifications
INNER JOIN users ON users.id = notifications.trigger_id
(
CASE notifications.trigger_type
WHEN 0 THEN INNER JOIN users ON users.id = notifications.action_id
WHEN 1 THEN INNER JOIN posts ON posts.id = notifications.trigger_id
)
You cannot conditionally join like that. Instead, use left join like this:
SELECT n.*,
-- whatever columns you want from the trigger user go here
un.display_name, un.username, un.profile_pic, p.title
FROM notifications n JOIN
users u
ON u.id = n.trigger_id LEFT JOIN
users un
ON un.id = n.action_id AND n.trigger_type = 0 LEFT JOIN
posts p
ON p.id = n.action_id AND n.trigger_type = 1;
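To see how the two conditional left joins behave, here is a rough sketch against the sample rows from the question, run through Python's sqlite3 module. It leaves out the read and profile_pic columns for brevity and selects post_title, which is the column name shown in the posts table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, display_name TEXT, username TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, post_title TEXT,
                        post_body TEXT, user_id INTEGER);
    CREATE TABLE notifications (
        id INTEGER PRIMARY KEY, trigger_id INTEGER, target_id INTEGER,
        action_id INTEGER, trigger_type INTEGER
    );
    INSERT INTO users VALUES (1, 'John', 'Doe'), (2, 'Larry', 'Doe');
    INSERT INTO posts VALUES (1, 'Hello', 'Hello world', 2);
    INSERT INTO notifications VALUES (1, 1, 2, 1, 0), (2, 1, 2, 1, 1);
""")

rows = conn.execute("""
    SELECT n.id, n.trigger_type, un.display_name, p.post_title
    FROM notifications n
    JOIN users u ON u.id = n.trigger_id
    -- Each LEFT JOIN only matches for its own trigger_type;
    -- for the other type its columns come back as NULL.
    LEFT JOIN users un ON un.id = n.action_id AND n.trigger_type = 0
    LEFT JOIN posts p ON p.id = n.action_id AND n.trigger_type = 1
    ORDER BY n.id
""").fetchall()
print(rows)  # [(1, 0, 'John', None), (2, 1, None, 'Hello')]
```

Every notification keeps exactly one row, and only the columns that belong to its trigger_type are filled in.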

Query to return one record when pivot table has exactly one record for each id in the IN clause

I have three tables:
users:
id username
-----------------
1 user1
2 user2
3 user3
chats:
id created_on
---------------------------
1 2020-11-07 00:00:00
2 2020-11-08 00:00:00
chat_users (pivot):
id chat_id user_id
----------------------------
1 1 1
2 1 2
3 2 1
4 2 2
5 2 3
Given an array of user ids, I want to return one chat record or an empty result.
Examples:
Given user ids (1, 2), return chat row with id 1.
Given user ids (1, 3), return an empty result set.
Given user ids (1, 2, 3), return chat row with id 2.
I've tried something like:
select distinct c.*
from chats as c
join chat_users as cu on cu.user_id in (...)
;
but I know that's not right and it doesn't work for matching the pivot records exactly.
Thank you in advance - any help is much appreciated, I'm drawing a blank.
UPDATE
This seems to work, but I have a feeling it is a less than ideal solution:
with filtered_chat_users as (
select cu.chat_id, string_agg(cu.user_id::text, ',' order by cu.user_id asc) as user_ids
from chat_users as cu
where cu.user_id in (1, 2, 3)
group by cu.chat_id
)
select c.*
from filtered_chat_users as fcu
join chats as c on c.id = fcu.chat_id
where fcu.user_ids = '1,2,3';
UPDATE 2:
The above does not work as I had thought/hoped. Passing in a subset of user ids incorrectly matches on chat records: passing in (1, 2) returns both chats 1 and 2.
UPDATE 3:
This does work, but again wondering if there is a more efficient solution:
with filtered_chats as (
select distinct c.id
from chats as c
join chat_users as cu on cu.chat_id = c.id
and cu.user_id in (1, 2)
), filtered_chat_users as (
select cu.chat_id, string_agg(cu.user_id::text, ',' order by cu.user_id asc) as user_ids
from chat_users as cu
join filtered_chats as fc on fc.id = cu.chat_id
group by cu.chat_id
)
select c.*
from filtered_chat_users as fcu
join chats as c on c.id = fcu.chat_id
where fcu.user_ids = '1,2'
You can use conditional aggregation:
select chat_id
from chat_users cu
group by chat_id
having count(*) filter (where user_id in (1, 2, 3)) = count(*) and
count(*) = 3;
The list and "3" need to change depending on what you are looking for.
The first condition checks that only the specified users are in the chat. The = 3 validates that all of them are there (assuming a user appears at most once per chat).
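For a quick check against the three examples in the question, here is a sketch using Python's sqlite3 module. It swaps the FILTER clause for an equivalent SUM(CASE ...) so it also runs on older SQLite versions, and it assumes (chat_id, user_id) is unique. The helper name chats_with_exactly is made up for illustration.

```python
import sqlite3

def chats_with_exactly(conn, user_ids):
    """Return ids of chats whose members are exactly the given users."""
    ids = sorted(set(user_ids))
    placeholders = ", ".join("?" for _ in ids)
    sql = f"""
        SELECT chat_id
        FROM chat_users
        GROUP BY chat_id
        -- no member outside the list ...
        HAVING SUM(CASE WHEN user_id IN ({placeholders}) THEN 1 ELSE 0 END) = COUNT(*)
        -- ... and nobody from the list is missing
           AND COUNT(*) = ?
        ORDER BY chat_id
    """
    return [row[0] for row in conn.execute(sql, (*ids, len(ids)))]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE chat_users (id INTEGER PRIMARY KEY, chat_id INTEGER,
                             user_id INTEGER, UNIQUE (chat_id, user_id));
    INSERT INTO chat_users VALUES
        (1, 1, 1), (2, 1, 2), (3, 2, 1), (4, 2, 2), (5, 2, 3);
""")
print(chats_with_exactly(conn, [1, 2]))     # [1]
print(chats_with_exactly(conn, [1, 3]))     # []
print(chats_with_exactly(conn, [1, 2, 3]))  # [2]
```

Deriving the count from the list itself means the list and the number cannot drift apart.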

Query conditionally return only one row per distinct id

I am making a Reddit clone and I'm having trouble querying my list of posts, given a logged in user, that shows whether or not logged in user upvoted the post for every post. I made a small example to make things simpler.
I am trying to return only one row per distinct post_id, but prioritize the upvoted column to be t > f > null.
For this example data:
> select * from post;
id
----
1
2
3
> select * from users;
id
----
1
2
> select * from upvoted;
user_id | post_id
---------+---------
1 | 1
2 | 1
If I am given user_id = 1 I want my query to return:
postid | user_upvoted
--------+--------------
1 | t
2 | f
3 | f
Since user1 upvoted post1, upvoted is t. Since user1 did not upvote post2, upvoted is f. Same for post3.
Schema
CREATE TABLE IF NOT EXISTS post (
id bigserial,
PRIMARY KEY (id)
);
CREATE TABLE IF NOT EXISTS users (
id serial,
PRIMARY KEY (id)
);
CREATE TABLE IF NOT EXISTS upvoted (
user_id integer
REFERENCES users(id)
ON DELETE CASCADE ON UPDATE CASCADE,
post_id bigint
REFERENCES post(id)
ON DELETE CASCADE ON UPDATE CASCADE,
PRIMARY KEY (user_id, post_id)
);
What I tried so far
SELECT post.id as postid,
CASE WHEN user_id=1 THEN true ELSE false END as user_upvoted
FROM post LEFT OUTER JOIN upvoted
ON post_id = post.id;
Which gives me:
postid | user_upvoted
--------+--------------
1 | t
1 | f
2 | f
3 | f
Due to the join, there are two "duplicate" rows that result from the query. I want to prioritize the row with t > f > null. So I want to keep the 1 | t row.
Full script with schema+data.
You should be able to do this with distinct on:
SELECT distinct on (p.id) p.id as postid,
(CASE WHEN user_id = 1 THEN true ELSE false END) as upvoted
FROM post p LEFT OUTER JOIN
upvoted u
ON u.post_id = p.id
ORDER BY p.id, upvoted desc;
Since the combination (user_id, post_id) is defined unique in upvoted (PRIMARY KEY), this can be much simpler:
SELECT p.id AS post_id, u.post_id IS NOT NULL AS user_upvoted
FROM post p
LEFT JOIN upvoted u ON u.post_id = p.id
AND u.user_id = 1;
Simply add user_id = 1 to the join condition. Makes perfect use of the index and should be simplest and fastest.
You also mention NULL, but there are only two distinct states in the result: true / false.
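A minimal sketch of the join-condition approach, run through Python's sqlite3 module with the data from the question. The helper name upvote_flags is hypothetical, and since SQLite returns 1/0 for the boolean expression, the sketch converts it to True/False.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY);
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE upvoted (
        user_id INTEGER REFERENCES users(id),
        post_id INTEGER REFERENCES post(id),
        PRIMARY KEY (user_id, post_id)
    );
    INSERT INTO post VALUES (1), (2), (3);
    INSERT INTO users VALUES (1), (2);
    INSERT INTO upvoted VALUES (1, 1), (2, 1);
""")

def upvote_flags(conn, user_id):
    """One row per post, flagged with whether the given user upvoted it."""
    rows = conn.execute("""
        SELECT p.id AS post_id, u.post_id IS NOT NULL AS user_upvoted
        FROM post p
        LEFT JOIN upvoted u ON u.post_id = p.id
                           AND u.user_id = ?
        ORDER BY p.id
    """, (user_id,)).fetchall()
    return [(post_id, bool(flag)) for post_id, flag in rows]

print(upvote_flags(conn, 1))  # [(1, True), (2, False), (3, False)]
```

Because the user filter sits in the join condition rather than in WHERE, the upvote by user 2 never joins in, so post 1 shows up only once.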
Alternative approach
On second thought, you might be complicating a very basic task. If you are only interested in posts the current user upvoted, use this simple query instead:
SELECT post_id FROM upvoted WHERE user_id = 1;
All other posts are not upvoted by the given user. It would seem we don't have to list those explicitly.
The exists() operator yields a boolean value:
SELECT p.id
, EXISTS (SELECT * FROM upvoted x
WHERE x.post_id = p.id
AND x.user_id = 1) AS it_was_upvoted_by_user1
FROM post p
;

Find rows that have same value in one column and other values in another column?

I have a PostgreSQL database that stores users in a users table and the conversations they take part in a conversation table. Since each user can take part in multiple conversations and each conversation can involve multiple users, I have a conversation_user linking table to track which users are participating in each conversation:
# conversation_user
id | conversation_id | user_id
----+------------------+--------
1 | 1 | 32
2 | 1 | 3
3 | 2 | 32
4 | 2 | 3
5 | 2 | 4
In the above table, user 32 is having one conversation with just user 3 and another with both 3 and user 4. How would I write a query that would show that there is a conversation between just user 32 and user 3?
I've tried the following:
SELECT conversation_id AS cid,
user_id
FROM conversation_user
GROUP BY cid HAVING count(*) = 2
AND (user_id = 32
OR user_id = 3);
SELECT conversation_id AS cid,
user_id
FROM conversation_user
GROUP BY (cid HAVING count(*) = 2
AND (user_id = 32
OR user_id = 3));
SELECT conversation_id AS cid,
user_id
FROM conversation_user
WHERE (user_id = 32)
OR (user_id = 3)
GROUP BY cid HAVING count(*) = 2;
These queries throw an error that says that user_id must appear in the GROUP BY clause or be used in an aggregate function. Putting them in an aggregate function (e.g. MIN or MAX) doesn't sound appropriate. I thought that my first two attempts were putting them in the GROUP BY clause.
What am I doing wrong?
This is a case of relational division. We have assembled an arsenal of techniques under this related question:
How to filter SQL results in a has-many-through relation
The special difficulty is to exclude additional users. There are basically 4 techniques.
Select rows which are not present in other table
I suggest LEFT JOIN / IS NULL:
SELECT cu1.conversation_id
FROM conversation_user cu1
JOIN conversation_user cu2 USING (conversation_id)
LEFT JOIN conversation_user cu3 ON cu3.conversation_id = cu1.conversation_id
AND cu3.user_id NOT IN (3,32)
WHERE cu1.user_id = 32
AND cu2.user_id = 3
AND cu3.conversation_id IS NULL;
Or NOT EXISTS:
SELECT cu1.conversation_id
FROM conversation_user cu1
JOIN conversation_user cu2 USING (conversation_id)
WHERE cu1.user_id = 32
AND cu2.user_id = 3
AND NOT EXISTS (
SELECT 1
FROM conversation_user cu3
WHERE cu3.conversation_id = cu1.conversation_id
AND cu3.user_id NOT IN (3,32)
);
Neither query depends on a UNIQUE constraint for (conversation_id, user_id), which may or may not be in place. Meaning, the queries even work if user_id 32 (or 3) is listed more than once for the same conversation. You would get duplicate rows in the result, though, and need to apply DISTINCT or GROUP BY.
The only condition is the one you formulated:
... a query that would show that there is a conversation between just user 32 and user 3?
Audited query
The query you linked in the comment wouldn't work. You forgot to exclude other participants. Should be something like:
SELECT * -- or whatever you want to return
FROM conversation_user cu1
WHERE cu1.user_id = 32
AND EXISTS (
SELECT 1
FROM conversation_user cu2
WHERE cu2.conversation_id = cu1.conversation_id
AND cu2.user_id = 3
)
AND NOT EXISTS (
SELECT 1
FROM conversation_user cu3
WHERE cu3.conversation_id = cu1.conversation_id
AND cu3.user_id NOT IN (3,32)
);
Which is similar to the other two queries, except that it will not return multiple rows if user_id = 3 is linked multiple times.
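To illustrate the difference in duplicate handling, here is a small sketch using Python's sqlite3 module. It uses the table from the question plus one hypothetical extra row that links user 3 to conversation 1 a second time, and compares a join-based variant with the EXISTS-based one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- No UNIQUE constraint on (conversation_id, user_id), so duplicates are possible.
    CREATE TABLE conversation_user (
        id INTEGER PRIMARY KEY, conversation_id INTEGER, user_id INTEGER
    );
    INSERT INTO conversation_user VALUES
        (1, 1, 32), (2, 1, 3), (3, 2, 32), (4, 2, 3), (5, 2, 4),
        (6, 1, 3);  -- hypothetical duplicate link, not in the question's data
""")

# Shared subquery text: rejects conversations with any other participant.
NO_OTHERS = """
    NOT EXISTS (
        SELECT 1 FROM conversation_user cu3
        WHERE cu3.conversation_id = cu1.conversation_id
          AND cu3.user_id NOT IN (3, 32))
"""

join_rows = conn.execute(f"""
    SELECT cu1.conversation_id
    FROM conversation_user cu1
    JOIN conversation_user cu2 ON cu2.conversation_id = cu1.conversation_id
    WHERE cu1.user_id = 32 AND cu2.user_id = 3 AND {NO_OTHERS}
""").fetchall()

exists_rows = conn.execute(f"""
    SELECT cu1.conversation_id
    FROM conversation_user cu1
    WHERE cu1.user_id = 32
      AND EXISTS (
          SELECT 1 FROM conversation_user cu2
          WHERE cu2.conversation_id = cu1.conversation_id
            AND cu2.user_id = 3)
      AND {NO_OTHERS}
""").fetchall()

print(join_rows)    # [(1,), (1,)]  one row per matching link for user 3
print(exists_rows)  # [(1,)]
```

Conversation 2 is rejected in both variants because user 4 also takes part in it.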
You can use conditional aggregation to select all cids that have only 2 specific participants:
select conversation_id as cid from conversation_user
group by conversation_id
having count(*) = 2
and count(case when user_id not in (32,3) then 1 end) = 0
If (cid,user_id) is not unique then replace having count(*) = 2 with having count(distinct user_id) = 2
If you just want confirmation:
select conversation_id
from conversation_user
group by conversation_id
having bool_and ( user_id in (3,32))
and count(*) = 2;
If you want full details, you can use a window function and a CTE like this:
with a as (
select *
,bool_and( user_id in (3,32) )
over ( partition by conversation_id)
and 2 = count(user_id)
over ( partition by conversation_id)
as conv_candidates
from conversation_user
)
select * from a where conv_candidates;
Because you want conversations with just 2 users, you can use a self outer join on other users and filter out hits.
To find all 2-user conversations and who they're between:
SELECT
a.conversation_id cid,
a.user_id user_id_1,
b.user_id user_id_2
FROM conversation_user a
JOIN conversation_user b ON b.conversation_id = a.conversation_id
AND b.user_id > a.user_id
LEFT JOIN conversation_user c ON c.conversation_id = a.conversation_id
AND c.user_id NOT IN (a.user_id, b.user_id)
WHERE c.conversation_id IS NULL -- only return misses on join to others
To find all 2-user conversations for a particular user, just add (checking both columns, since the lower user id always lands in a):
AND 32 IN (a.user_id, b.user_id)
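Here is a rough sketch of the self-join approach, run through Python's sqlite3 module. It uses the real column name conversation_id throughout and adds a hypothetical third conversation between users 4 and 5 so there is more than one pair to list. Since the join puts the smaller user id in user_id_1, the per-user filter in this sketch checks both columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE conversation_user (
        id INTEGER PRIMARY KEY, conversation_id INTEGER, user_id INTEGER
    );
    INSERT INTO conversation_user VALUES
        (1, 1, 32), (2, 1, 3), (3, 2, 32), (4, 2, 3), (5, 2, 4),
        (6, 3, 4), (7, 3, 5);  -- hypothetical extra conversation
""")

PAIRS_SQL = """
    SELECT a.conversation_id AS cid,
           a.user_id AS user_id_1,
           b.user_id AS user_id_2
    FROM conversation_user a
    JOIN conversation_user b
      ON b.conversation_id = a.conversation_id
     AND b.user_id > a.user_id              -- each pair once, smaller id first
    LEFT JOIN conversation_user c
      ON c.conversation_id = a.conversation_id
     AND c.user_id NOT IN (a.user_id, b.user_id)
    WHERE c.conversation_id IS NULL         -- keep only pairs with no third user
"""

all_pairs = conn.execute(PAIRS_SQL + " ORDER BY cid").fetchall()
for_user_32 = conn.execute(
    PAIRS_SQL + " AND ? IN (a.user_id, b.user_id) ORDER BY cid", (32,)
).fetchall()

print(all_pairs)    # [(1, 3, 32), (3, 4, 5)]
print(for_user_32)  # [(1, 3, 32)]
```

Conversation 2 never appears because every pair in it has a third participant that joins in as c.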