Optimizing a nested SQL query through (preferably) joins - sql

I am trying to fetch a list of posts from a database, along with each post's likes and dislikes and whether the current user has liked or disliked the post.
What I have tried:
Here's what the first version of the query looked like:
SELECT
announcements.*,
users.FIRSTNAME,
users.LASTNAME,
((SELECT COUNT(USER_ID) FROM likes_posts WHERE POST_ID = announcements.ID) - (SELECT COUNT(USER_ID) FROM dislikes_posts WHERE POST_ID = announcements.ID)) as TLIKES,
(SELECT COUNT(USER_ID) FROM likes_posts WHERE USER_ID = ? AND POST_ID = announcements.ID) AS USER_LIKED,
(SELECT COUNT(USER_ID) FROM dislikes_posts WHERE USER_ID = ? AND POST_ID = announcements.ID) AS USER_DISLIKED FROM announcements LEFT JOIN users ON announcements.OWNER_ID = users.ID
WHERE announcements.CHANNEL = ? AND announcements.ID < ? ORDER BY announcements.ID DESC
I tried optimizing it with several JOINs, but the results are wrong:
SELECT
announcements.*,
users.FIRSTNAME,
users.LASTNAME,
COUNT(likes_posts.USER_ID) AS TLikes,
COUNT(dislikes_posts.USER_ID) AS TDislikes,
UserLiked.ID AS userLiked,
UserDisliked.ID AS userDisliked
FROM announcements
LEFT JOIN likes_posts ON likes_posts.POST_ID = announcements.ID
LEFT JOIN dislikes_posts ON dislikes_posts.POST_ID = announcements.ID
LEFT JOIN likes_posts AS UserLiked ON UserLiked.USER_ID = ?
LEFT JOIN likes_posts AS UserDisliked ON UserDisliked.USER_ID = ?
LEFT JOIN users ON announcements.OWNER_ID = users.ID
WHERE announcements.CHANNEL = ? AND announcements.ID < ?
GROUP BY announcements.ID
ORDER BY announcements.ID DESC
Queries' results
The first query consistently returns the correct number of likes and dislikes (example: 5 and 3).
The second one, however, consistently returns an inflated number, the same for both columns and roughly double the larger of the two counts (e.g. if there are 5 likes and 6 dislikes, the result would be 16 likes and 16 dislikes).
Problem
I'm guessing the second query is somehow fetching the likes_posts table 2 times, which causes the discrepancy between the likes and dislikes.

Here's one way you could do it: aggregate the like and dislike counts first, then join them to the base table. Each derived table has one row per post, so the joins no longer multiply rows, and each table is counted once instead of twice. The per-user flags come from a conditional sum in the same pass, and the parameters are bound in the same order as in your first query (user ID twice, then channel and ID).
SELECT
a.*,
u.FIRSTNAME,
u.LASTNAME,
coalesce(likes.cnt, 0) - coalesce(dislikes.cnt, 0) as TLIKES,
coalesce(likes.user_cnt, 0) AS USER_LIKED,
coalesce(dislikes.user_cnt, 0) AS USER_DISLIKED
FROM
announcements a
LEFT JOIN
users u ON a.OWNER_ID = u.ID
left join
(
select post_id, count(user_id) cnt,
sum(case when user_id = ? then 1 else 0 end) user_cnt
from likes_posts
group by post_id
) likes on likes.post_id = a.id
left join
(
select post_id, count(user_id) cnt,
sum(case when user_id = ? then 1 else 0 end) user_cnt
from dislikes_posts
group by post_id
) dislikes on dislikes.post_id = a.id
WHERE
a.CHANNEL = ? AND a.ID < ?
ORDER BY
a.ID DESC
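To see why joining both detail tables directly inflates the counts, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names mirror the question, but the data (one post with 5 likes and 3 dislikes) is made up for illustration, and SQLite stands in for whatever database is actually in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE announcements (ID INTEGER PRIMARY KEY);
    CREATE TABLE likes_posts (USER_ID INTEGER, POST_ID INTEGER);
    CREATE TABLE dislikes_posts (USER_ID INTEGER, POST_ID INTEGER);
    INSERT INTO announcements VALUES (1);
""")
# Hypothetical data: post 1 gets 5 likes (users 1-5) and 3 dislikes (users 6-8).
conn.executemany("INSERT INTO likes_posts VALUES (?, 1)", [(u,) for u in range(1, 6)])
conn.executemany("INSERT INTO dislikes_posts VALUES (?, 1)", [(u,) for u in range(6, 9)])

# Joining both detail tables directly pairs every like with every dislike.
fanned_out = conn.execute("""
    SELECT COUNT(l.USER_ID), COUNT(d.USER_ID)
    FROM announcements a
    LEFT JOIN likes_posts l ON l.POST_ID = a.ID
    LEFT JOIN dislikes_posts d ON d.POST_ID = a.ID
    GROUP BY a.ID
""").fetchone()

# Aggregating each table to one row per post first avoids the multiplication.
pre_aggregated = conn.execute("""
    SELECT COALESCE(l.cnt, 0), COALESCE(d.cnt, 0)
    FROM announcements a
    LEFT JOIN (SELECT POST_ID, COUNT(*) AS cnt FROM likes_posts GROUP BY POST_ID) l
           ON l.POST_ID = a.ID
    LEFT JOIN (SELECT POST_ID, COUNT(*) AS cnt FROM dislikes_posts GROUP BY POST_ID) d
           ON d.POST_ID = a.ID
""").fetchone()

print(fanned_out)      # (15, 15)
print(pre_aggregated)  # (5, 3)
```

With 5 likes and 3 dislikes the direct join produces 5 x 3 = 15 rows for the post, so both counts come back as 15. That is the same kind of inflation the question describes.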

Related

Get users with item count <= 1 in sql

We have these tables in PostgreSQL 12:
User -> id, name, email
items -> id, user_id, description
We want to run a query to find users that have 1 item or less.
I tried using a join and putting the item count condition in the WHERE clause, with this query:
select * from "user" inner join item on "user".id = item.user_id where count(item.user_id) < 1;
but it failed and gave me this error.
ERROR: aggregate functions are not allowed in WHERE
LINE 1: ...inner join item on "user".id = item.user_id where count(item...
So I'm thinking the query needs to be more technical.
Can anyone please help me with this? Thanks.
You can do:
select u.*
from "user" u
left join (
select user_id, count(*) as cnt from items group by user_id
) x on x.user_id = u.id
where x.cnt = 1 or x.cnt is null
The trick is you need to use HAVING instead of WHERE for aggregated data like COUNT(). If you only care about users that appear in the item table, you don't technically need a JOIN; you can get the data from the item table alone with GROUP BY:
SELECT user_id
FROM item
GROUP BY user_id
HAVING COUNT(id) <= 1
Users with no items have no rows in item, though, so to include them (and to see more fields from the user table) start from "user" and LEFT JOIN:
SELECT u.id, u.name, u.email
FROM "user" u
LEFT JOIN item i on i.user_id = u.id
GROUP BY u.id, u.name, u.email
HAVING COUNT(i.id) <= 1
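As a quick check of the HAVING approach for "1 item or less", here is a small sketch using Python's sqlite3 module. The sample rows are hypothetical, and SQLite stands in for PostgreSQL; the query itself uses only standard SQL. Starting from "user" with a LEFT JOIN is what lets users with zero items show up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "user" (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE item (id INTEGER PRIMARY KEY, user_id INTEGER);
    INSERT INTO "user" VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    -- hypothetical data: ann has two items, bob has one, cy has none
    INSERT INTO item VALUES (1, 1), (2, 1), (3, 2);
""")

# COUNT(i.id) ignores the NULL row produced for users without items,
# so those users get a count of 0 and pass the HAVING filter.
rows = conn.execute("""
    SELECT u.name, COUNT(i.id) AS item_count
    FROM "user" u
    LEFT JOIN item i ON i.user_id = u.id
    GROUP BY u.id, u.name
    HAVING COUNT(i.id) <= 1
    ORDER BY u.name
""").fetchall()

print(rows)  # [('bob', 1), ('cy', 0)]
```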

How to print two attribute values from your Sub query table

Suppose I have two tables,
User
Post
Posts are made by Users (i.e. the Post Table will have foreign key of user)
Now my question is,
Print the details of all the users who have more than 10 posts
To solve this, I can type the following query and it would give me the desired result,
SELECT * from USER where user_id in (SELECT user_id from POST group by user_id having count(user_id) > 10)
The problem occurs when I also want to print the count of posts along with the user details. The count cannot be obtained from the USER table; it can only come from the POST table. But I can't return two values from my subquery, i.e. I can't do the following:
SELECT * from USER where user_id in (SELECT user_id, **count(user_id)** from POST group by user_id having count(user_id) > 10)
So, how do I resolve this issue? One solution I know of is the following, but I think it is a naive approach that makes the query both more complex and much slower:
SELECT u.*, (SELECT po.count(user_id) from POST as po group by user_id having po.count(user_id) > 10) from USER u where u.user_id in (SELECT p.user_id from POST p group by user_id having p.count(user_id) > 10)
Is there any other way to solve this using subqueries?
Move the aggregation to the from clause:
SELECT u.*, p.num_posts
FROM user u JOIN
(SELECT p.user_id, COUNT(*) as num_posts
FROM post p
GROUP BY p.user_id
HAVING COUNT(*) > 10
) p
ON u.user_id = p.user_id;
You can do this with subqueries:
select u.*
from (select u.*,
(select count(*) from post p where p.user_id = u.user_id) as num_posts
from user u
) u
where num_posts > 10;
With an index on post(user_id), this might actually have better performance than the version using JOIN/GROUP BY.
You can try joining the tables directly; a JOIN is often preferable to a subquery:
SELECT user.*, COUNT(post.user_id) AS postcount
FROM user LEFT JOIN post ON user.user_id = post.user_id
GROUP BY user.user_id
HAVING COUNT(post.user_id) > 10;
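The derived-table approach (aggregate in the FROM clause, then join) can be tried out with a short sketch in Python's sqlite3 module. The table name app_user, the sample rows, and the lowered threshold of 2 are all assumptions made to keep the example small; the question's real threshold is 10.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_user (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post (post_id INTEGER PRIMARY KEY, user_id INTEGER);
    INSERT INTO app_user VALUES (1, 'ann'), (2, 'bob');
    -- hypothetical data: ann has three posts, bob has one
    INSERT INTO post (user_id) VALUES (1), (1), (1), (2);
""")

THRESHOLD = 2  # illustrative stand-in for the question's "more than 10"

# The subquery returns one row per qualifying user, carrying both the key
# and the count, so the outer query can select user details plus num_posts.
rows = conn.execute("""
    SELECT u.user_id, u.name, p.num_posts
    FROM app_user u
    JOIN (SELECT user_id, COUNT(*) AS num_posts
          FROM post
          GROUP BY user_id
          HAVING COUNT(*) > ?) p ON p.user_id = u.user_id
""", (THRESHOLD,)).fetchall()

print(rows)  # [(1, 'ann', 3)]
```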

How to count and group by column across one to many relationship while handling 0 case?

I am trying to formulate a single SQL query that will count a table across a one to many relationship. Here is the short version of my schema:
User(id)
Group(id)
UserGroup(user_id, group_id)
Post(id, user_id, group_id)
The goal is to return the count of posts for each user in a group. The specific issue I am running into is my current query cannot return 0 for a user that has no posts. Here is my naive query:
SELECT
COUNT(*) as total,
user_id
FROM
posts
WHERE
group_id = ?
GROUP BY user_id
ORDER BY
total DESC
This works fine when every user has a post, but when some have no posts, they do not show up in the list. How can I write a single query that handles this scenario and returns count 0 for said users? I know I need to somehow incorporate UserGroup to get the list of users, but am stuck from there.
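Use a left join, and count a column from posts rather than COUNT(*), so that users with no posts get 0 instead of 1:
SELECT u.id, COUNT(p.id) as total
FROM users u LEFT JOIN
posts p
ON p.user_id = u.id AND
p.group_id = ?
GROUP BY u.id
ORDER BY total DESC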
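One detail worth checking with a LEFT JOIN count: COUNT(*) counts the NULL-extended row that the join produces for a user with no posts, while COUNT of a column from posts skips it. The sketch below shows both side by side using Python's sqlite3 module, with hypothetical data (user 1 has two posts in group 7, user 2 has none).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, group_id INTEGER);
    INSERT INTO users VALUES (1), (2);
    INSERT INTO posts (user_id, group_id) VALUES (1, 7), (1, 7);
""")

# star_total counts joined rows (including the NULL-extended one);
# post_total counts only rows where a post actually matched.
rows = conn.execute("""
    SELECT u.id, COUNT(*) AS star_total, COUNT(p.id) AS post_total
    FROM users u
    LEFT JOIN posts p ON p.user_id = u.id AND p.group_id = ?
    GROUP BY u.id
    ORDER BY u.id
""", (7,)).fetchall()

print(rows)  # [(1, 2, 2), (2, 1, 0)]
```

Note that this sketch, like the query above, lists every user; restricting the result to members of the group would need an extra join or filter on UserGroup.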
I think I got it, but I'm not sure how performant it is.
select count(p.id), u.id
from users u
left join (select * from posts where group_id = ?) p on p.user_id = u.id
where u.id in (select user_id from user_group where group_id = ?)
group by u.id;

Postgres - Left join using a where clause + distinct

I want to join two tables using a join
SELECT * FROM posts
LEFT JOIN voted ON posts.post_id = voted.id
Which produces this:
How would I write a query using:
ORDER BY date_posted DESC FETCH FIRST 5 ROW ONLY
on the Posts Table to return this result
Edit 1: duplicate post_id
How would I make it so that the uuid on the user_id column is only 82411850-
Edit 2: Final query thanks to Mr.Linoff
SELECT p.post_id, p.date_posted, p.posted_by,
v.user_id, v.votes
FROM posts p LEFT JOIN
voted v
ON p.post_id = v.id
AND v.user_id = '82411580...'
ORDER BY p.date_posted DESC
FETCH FIRST 5 ROW ONLY ;
You have a collision of ids. Be explicit about the columns you are selecting.
Then I think you have basically the right logic:
SELECT p.post_id, p.date_posted, p.posted_by,
v.user_id, v.votes
FROM posts p LEFT JOIN
voted v
ON p.post_id = v.id
ORDER BY p.date_posted DESC
FETCH FIRST 5 ROW ONLY ;
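The final query in the question puts the user_id condition in the ON clause rather than in WHERE, and that placement matters for a LEFT JOIN. A minimal sketch with Python's sqlite3 module, using made-up rows and short placeholder user IDs ('u1', 'u2') instead of real UUIDs:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (post_id INTEGER PRIMARY KEY, date_posted TEXT);
    CREATE TABLE voted (id INTEGER, user_id TEXT, votes INTEGER);
    INSERT INTO posts VALUES (1, '2020-01-01'), (2, '2020-01-02');
    -- hypothetical data: 'u1' voted on post 1 only; 'u2' voted on post 2
    INSERT INTO voted VALUES (1, 'u1', 1), (2, 'u2', -1);
""")

# Filter in ON: every post is kept; votes is NULL where 'u1' has not voted.
in_on = conn.execute("""
    SELECT p.post_id, v.votes
    FROM posts p
    LEFT JOIN voted v ON p.post_id = v.id AND v.user_id = ?
    ORDER BY p.date_posted DESC
""", ("u1",)).fetchall()

# Filter in WHERE: rows without a matching vote are discarded after the join,
# which effectively turns the LEFT JOIN into an inner join.
in_where = conn.execute("""
    SELECT p.post_id, v.votes
    FROM posts p
    LEFT JOIN voted v ON p.post_id = v.id
    WHERE v.user_id = ?
    ORDER BY p.date_posted DESC
""", ("u1",)).fetchall()

print(in_on)     # [(2, None), (1, 1)]
print(in_where)  # [(1, 1)]
```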

Too much Data using DISTINCT MAX

I want to see the last activity of each individual handset and the user who used that handset. I have a table UserSessions that stores the last activity of a particular user as well as which handset they used in that activity. There are roughly 40 handsets, yet I always get back far too many records, around 10,000 rows, when I only want the last activity of each handset. What am I doing wrong?
SELECT DISTINCT MAX(UserSessions.LastActivity), Handsets.Name,Users.Username
FROM UserSessions
INNER JOIN Handsets on Handsets.HandsetId = UserSessions.HandsetId
INNER JOIN Users on Users.UserId = UserSessions.UserId
WHERE
Handsets.Name in (1000,1001,1002,1003,1004....)
AND Handsets.Deleted = 0
GROUP BY UserSessions.LastActivity, Handsets.Name,Users.Username
I expect to get one record per handset, showing the user's last activity with that handset. What I get instead is multiple records for every handset and date, over 10,000 rows in total.
You typically GROUP BY the same columns as you SELECT, except those that are arguments to aggregate (set) functions.
This GROUP BY returns no duplicates, so SELECT DISTINCT isn't needed.
SELECT MAX(UserSessions.LastActivity), Handsets.Name, Users.Username
FROM UserSessions
INNER JOIN Handsets on Handsets.HandsetId = UserSessions.HandsetId
INNER JOIN Users on Users.UserId = UserSessions.UserId
WHERE Handsets.Name in (1000,1001,1002,1003,1004....)
AND Handsets.Deleted = 0
GROUP BY Handsets.Name, Users.Username
There is no such thing as DISTINCT MAX. You have SELECT DISTINCT, which ensures that the combination of all columns in the SELECT is not duplicated across rows. And there is MAX(), an aggregation function.
As a note: SELECT DISTINCT is almost never appropriate with GROUP BY.
You seem to want:
SELECT *
FROM (SELECT h.Name, u.Username, MAX(us.LastActivity) as last_activity,
RANK() OVER (PARTITION BY h.Name ORDER BY MAX(us.LastActivity) desc) as seqnum
FROM UserSessions us JOIN
Handsets h
ON h.HandsetId = us.HandsetId INNER JOIN
Users u
ON u.UserId = us.UserId
WHERE h.Name in (1000,1001,1002,1003,1004....) AND
h.Deleted = 0
GROUP BY h.Name, u.Username
) h
WHERE seqnum = 1
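The ranking idea can be tried in isolation with Python's sqlite3 module (this needs SQLite 3.25 or later for window functions). This is a simplified sketch, not the query above: it assumes a single flattened sessions table with hypothetical columns and rows, and it ranks the raw session rows directly instead of aggregating first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical flattened table standing in for the three joined tables
    CREATE TABLE sessions (handset TEXT, username TEXT, last_activity TEXT);
    INSERT INTO sessions VALUES
        ('1000', 'ann', '2019-05-01'),
        ('1000', 'bob', '2019-05-03'),
        ('1001', 'ann', '2019-05-02'),
        ('1001', 'ann', '2019-04-30');
""")

# Rank rows within each handset by recency, then keep only the top-ranked row.
rows = conn.execute("""
    SELECT handset, username, last_activity
    FROM (SELECT s.*,
                 RANK() OVER (PARTITION BY handset
                              ORDER BY last_activity DESC) AS seqnum
          FROM sessions s) ranked
    WHERE seqnum = 1
    ORDER BY handset
""").fetchall()

print(rows)  # [('1000', 'bob', '2019-05-03'), ('1001', 'ann', '2019-05-02')]
```

Because RANK() assigns the same rank to ties, two sessions with an identical latest timestamp on one handset would both be returned; ROW_NUMBER() is an alternative if exactly one row per handset is required.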