Find top 5 famous people - sql

I have a case in hand where I need to find the top 5 people with most likes on their posts overall.
Here's the schema:
CREATE TABLE users (
ID SERIAL PRIMARY KEY,
created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
username VARCHAR(30) NOT NULL
);
CREATE TABLE posts (
id SERIAL PRIMARY KEY,
created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
url VARCHAR(300) NOT NULL,
user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE
);
CREATE TABLE likes (
id SERIAL PRIMARY KEY,
created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
contents VARCHAR(240) NOT NULL,
user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
post_id INTEGER REFERENCES posts(id) ON DELETE CASCADE,
comment_id INTEGER REFERENCES comments(id) ON DELETE CASCADE,
-- 👉 either associated with post or comment 👈 --
CHECK(
COALESCE((post_id)::boolean::integer, 0) +
COALESCE((comment_id)::boolean::integer, 0) = 1
),
-- user can like post/comment once --
UNIQUE (user_id, post_id, comment_id)
);
My Attempts
Both are giving different outputs, not sure which one is correct. Also, I would appreciate an ideal (scalable) solution for this:
1.
WITH FAMOUS AS (
SELECT likes.id, users.username AS username, users.id AS user_id
FROM likes
JOIN posts ON posts.user_id = likes.post_id
JOIN users ON users.id = likes.user_id
WHERE likes.comment_id IS null
)
SELECT COUNT(*) AS num, username FROM FAMOUS
GROUP BY username
ORDER BY num DESC LIMIT 5;
2.
WITH LIKES_DATA AS (
SELECT post_id, COUNT(*) AS num_likes_per_post FROM likes
WHERE likes.comment_id IS NULL
GROUP BY post_id
)
SELECT users.username, SUM(num_likes_per_post) as num_likes
FROM LIKES_DATA
JOIN posts ON posts.id = LIKES_DATA.post_id
JOIN users ON users.id = posts.user_id
GROUP BY users.username
ORDER BY num_likes DESC LIMIT 5;

I simply do not understand the thought process for the second query.
Based on your description, I think just using JOINs and GROUP BY is sufficient:
SELECT u.username AS username, u.id AS user_id, COUNT(*) AS num_likes
FROM likes l JOIN
posts p
ON p.id = l.post_id JOIN
users u
ON u.id = p.user_id -- the author of the liked post, not the user who liked it
WHERE l.comment_id IS NULL -- count only likes on posts, not on comments
GROUP BY u.username, u.id
ORDER BY COUNT(*) DESC
LIMIT 5;
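To check which join conditions actually count likes received, it can help to run the query against a tiny data set. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for PostgreSQL. The trimmed-down tables and the sample rows are assumptions made up for illustration; the point is that each like is matched to its post by posts.id and then to the post's author by posts.user_id.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT NOT NULL);
CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL);
CREATE TABLE likes (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,   -- the user who liked
    post_id INTEGER,
    comment_id INTEGER
);
-- Hypothetical sample data: alice owns posts 10 and 11, bob owns post 20.
INSERT INTO users (id, username) VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
INSERT INTO posts (id, user_id) VALUES (10, 1), (11, 1), (20, 2);
-- alice's posts receive 3 likes, bob's post receives 1;
-- the last row is a like on a comment and must not be counted.
INSERT INTO likes (user_id, post_id, comment_id) VALUES
    (2, 10, NULL), (3, 10, NULL), (3, 11, NULL), (1, 20, NULL), (1, NULL, 99);
""")

top_users = conn.execute("""
SELECT u.username, COUNT(*) AS num_likes
FROM likes l
JOIN posts p ON p.id = l.post_id     -- the liked post
JOIN users u ON u.id = p.user_id     -- the post's author, not the liker
WHERE l.comment_id IS NULL
GROUP BY u.id, u.username
ORDER BY num_likes DESC
LIMIT 5
""").fetchall()

print(top_users)
```

Users whose posts have no likes do not appear here, because the inner joins start from likes. If they should be listed with a count of 0, the query would need to start from users and LEFT JOIN the other tables.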

Related

Postgresql join with Condition

I have three tables, as follows.
CREATE TABLE users (
id uuid NOT NULL PRIMARY KEY,
password character varying(128) NOT NULL,
username character varying(15) NOT NULL,
email character varying(100) NULL,
gender character varying(1) NOT NULL
);
CREATE TABLE followers (
id bigserial NOT NULL PRIMARY KEY,
followed_at timestamp with time zone NOT NULL,
follower_id uuid REFERENCES users(id),
following_id uuid REFERENCES users(id)
);
CREATE TABLE profile_picture (
id uuid NOT NULL PRIMARY KEY ,
profile_pic character varying(100) NOT NULL,
owner_id uuid NOT NULL REFERENCES users(id),
is_active boolean NOT NULL
);
I want a query that selects all followers with the fields id, username, and active_profile_pic.
is_active is true for only one of a user's profile pictures, but a user can upload as many profile photos as they want.
I tried it without profile_pic, which is not the result I want.
select users.id , username from users inner join followers on follower_id = users.id where followers.following_id = user_id;
I want the query to select all followers (id, username, and active_profile_pic) who are following the user with the given user_id.
A user may have no profile picture or more than one, so only the active profile picture should be returned.
The query must also include followers with no profile picture.
I tried this, but it does not return followers who have no profile picture. I want to return those too.
SELECT u.id, username, p.profile_pic FROM followers INNER JOIN users AS u ON u.id = follower_id
LEFT OUTER JOIN profile_picture AS p ON u.id = p.owner_id
where followers.following_id = '' and p.is_active = true
In the above query, user_id stands for a variable; you can specify the id of the user there.
Please visit DBFIDDLE
Please suggest the right query.
Your WHERE clause tests profile_picture.is_active, so the result contains only users who have an active profile picture: for a follower with no picture, the LEFT JOIN fills p.is_active with NULL, and p.is_active = true then rejects that row. If you want all users regardless of profile picture, include the profile_picture.is_active column in the query results and delete the condition p.is_active = true from the WHERE clause.
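An alternative worth considering is to keep the is_active test but move it from WHERE into the ON clause of the LEFT JOIN, so it only limits which pictures are joined and never removes a follower. The sketch below uses Python's sqlite3 module as a stand-in for PostgreSQL; the short ids, file names, and the reduced column set are assumptions for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id TEXT PRIMARY KEY, username TEXT NOT NULL);
CREATE TABLE followers (follower_id TEXT, following_id TEXT);
CREATE TABLE profile_picture (owner_id TEXT, profile_pic TEXT, is_active INTEGER);
-- Hypothetical sample data: ann, ben and cy all follow 'target'.
INSERT INTO users VALUES ('u1', 'target'), ('u2', 'ann'), ('u3', 'ben'), ('u4', 'cy');
INSERT INTO followers VALUES ('u2', 'u1'), ('u3', 'u1'), ('u4', 'u1');
-- ann: one inactive and one active picture; ben: none; cy: only an inactive one.
INSERT INTO profile_picture VALUES
    ('u2', 'ann_old.png', 0), ('u2', 'ann_new.png', 1), ('u4', 'cy_old.png', 0);
""")

rows = conn.execute("""
SELECT u.id, u.username, p.profile_pic
FROM followers f
JOIN users u ON u.id = f.follower_id
LEFT JOIN profile_picture p
       ON p.owner_id = u.id AND p.is_active = 1  -- filter inside ON, not WHERE
WHERE f.following_id = ?
ORDER BY u.username
""", ("u1",)).fetchall()

for row in rows:
    print(row)
```

With this data, cy has only an inactive picture and is still returned with an empty profile_pic. A filter of the form is_active = true OR is_active IS NULL applied after the join would drop cy, because cy's only joined row has is_active = false.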
With the help of #random_user,
This works for me.
select id, username, profile_pic
from (
SELECT u.id, username, is_active, profile_pic
FROM users as u
Full Outer JOIN followers ON u.id = follower_id
Full Outer JOIN profile_picture AS p ON u.id = p.owner_id
where followers.following_id = '2582f93d-68c3-48e2-98ba-a401402c7b62'
) as profile_table
where (is_active = true) or (is_active is null)
Maybe someone can suggest a better answer, but this works for me.
Check it at DBFIDDLE

Get user from table based on id

I have these Postgres tables:
create table deals_new
(
id bigserial primary key,
slip_id text,
deal_type integer,
timestamp timestamp,
employee_id bigint
constraint employee_id_fk
references common.employees
);
create table twap
(
id bigserial primary key,
deal_id varchar not null,
employee_id bigint
constraint fk_twap__employee_id
references common.employees,
status integer
);
create table employees
(
id bigint primary key,
account_id integer,
first_name varchar(150),
last_name varchar(150)
);
New table to query:
create table accounts
(
id bigint primary key,
account_name varchar(150) not null
);
I use this SQL query:
select d.*, t.id as twap_id
from common.deals_new d
left outer join common.twap t on
t.deal_id = d.slip_id and
d.timestamp between '11-11-2021' AND '11-11-2021' and
d.deal_type in (1, 2) and
d.quote_id is null
where d.employee_id is not null
order by d.timestamp desc, d.id
offset 10
limit 10;
How can I extend this SQL query to also look up each deal's employee in table employees and map the employee's account_id to table accounts by id? I would also like to print accounts.account_name based on employees.account_id.
You need two joins to make this work for you: one join to get to the employees table, and one more join to get to the accounts table.
select d.*, t.id as twap_id, a.account_name
from common.deals_new d
left outer join common.twap t on
t.deal_id = d.slip_id and
d.timestamp between '11-11-2021' AND '11-11-2021' and
d.deal_type in (1, 2) and
d.quote_id is null
join employees as e on d.employee_id = e.id
join accounts as a on a.id = e.account_id
where d.employee_id is not null
order by d.timestamp desc, d.id
offset 10
limit 10;
Note: I did not fiddle this one, so could have a typo, but I think you get the idea here.
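One detail to keep in mind with the two-join chain: employees.account_id is nullable in the schema above, so an inner join to accounts silently drops deals whose employee has no account. The following sketch, run on SQLite through Python's sqlite3 module as a stand-in for PostgreSQL, compares an inner join with a LEFT JOIN on the last hop. The reduced tables (without the common schema prefix) and the sample rows are assumptions for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (id INTEGER PRIMARY KEY, account_name TEXT NOT NULL);
CREATE TABLE employees (id INTEGER PRIMARY KEY, account_id INTEGER);
CREATE TABLE deals_new (id INTEGER PRIMARY KEY, slip_id TEXT, employee_id INTEGER);
-- Hypothetical sample data: employee 2 has no account.
INSERT INTO accounts VALUES (100, 'Acme');
INSERT INTO employees VALUES (1, 100), (2, NULL);
INSERT INTO deals_new VALUES (10, 's-10', 1), (11, 's-11', 2);
""")

# Inner join on both hops: deal 11 disappears because its employee has no account.
inner_rows = conn.execute("""
SELECT d.id, a.account_name
FROM deals_new d
JOIN employees e ON e.id = d.employee_id
JOIN accounts a ON a.id = e.account_id
ORDER BY d.id
""").fetchall()

# LEFT JOIN on the last hop: deal 11 is kept with a NULL account_name.
left_rows = conn.execute("""
SELECT d.id, a.account_name
FROM deals_new d
JOIN employees e ON e.id = d.employee_id
LEFT JOIN accounts a ON a.id = e.account_id
ORDER BY d.id
""").fetchall()

print(inner_rows)
print(left_rows)
```

Which variant is right depends on whether deals without an account should appear in the result at all.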

Complicated Join Query, Join 3 tables with multiple group bys

I have 3 tables:
Tweets:
CREATE TABLE tweets (
text_content VARCHAR(280) not null,
username VARCHAR(50) not null,
timestamp TIMESTAMP not null DEFAULT current_timestamp,
id UUID not null DEFAULT uuid_generate_v4(),
CONSTRAINT tweets_pk PRIMARY KEY (id)
);
Likes:
CREATE TABLE likes (
username VARCHAR(50) not null,
timestamp TIMESTAMP not null default current_timestamp,
post_id UUID not null,
CONSTRAINT likes_pk PRIMARY KEY (username, post_id),
CONSTRAINT likes_post_id_fk FOREIGN KEY (post_id) REFERENCES tweets(id)
);
And Retweets
CREATE TABLE retweets (
username VARCHAR(50) not null,
timestamp TIMESTAMP not null default current_timestamp,
post_id UUID not null,
CONSTRAINT retweets_pk PRIMARY KEY (username, post_id),
CONSTRAINT retweets_post_id_fk FOREIGN KEY (post_id) REFERENCES tweets(id)
);
I need a query, that would select all tweets, along with the amount of likes and retweets they have.
I did manage to write a working query, but I think I over-complicated it, and would love to hear simpler solutions!
You want to aggregate before joining. Assuming the join key is post_id:
select t.*, l.likes, r.retweets
from tweets t left join
(select post_id, count(*) as likes
from likes
group by post_id
) l
on l.post_id = t.id left join
(select post_id, count(*) as retweets
from retweets
group by post_id
) r
on r.post_id = t.id;
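The reason for aggregating before joining is that joining tweets to both likes and retweets first pairs every like with every retweet of the same tweet, which inflates both counts. The sketch below shows the difference on SQLite through Python's sqlite3 module, used here as a stand-in for PostgreSQL. The short text ids and the sample rows are assumptions for illustration, and COALESCE is added so that tweets without likes or retweets show 0 instead of NULL.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tweets (id TEXT PRIMARY KEY, text_content TEXT);
CREATE TABLE likes (username TEXT, post_id TEXT, PRIMARY KEY (username, post_id));
CREATE TABLE retweets (username TEXT, post_id TEXT, PRIMARY KEY (username, post_id));
-- Hypothetical sample data: t1 has 3 likes and 2 retweets, t2 has none.
INSERT INTO tweets VALUES ('t1', 'hello'), ('t2', 'quiet tweet');
INSERT INTO likes VALUES ('a', 't1'), ('b', 't1'), ('c', 't1');
INSERT INTO retweets VALUES ('a', 't1'), ('b', 't1');
""")

# Join first, count afterwards: 3 likes x 2 retweets = 6 joined rows for t1.
join_then_count = conn.execute("""
SELECT t.id, COUNT(l.username) AS likes, COUNT(r.username) AS retweets
FROM tweets t
LEFT JOIN likes l ON l.post_id = t.id
LEFT JOIN retweets r ON r.post_id = t.id
GROUP BY t.id
ORDER BY t.id
""").fetchall()

# Count per post_id first, then join one row per tweet from each subquery.
count_then_join = conn.execute("""
SELECT t.id, COALESCE(l.likes, 0) AS likes, COALESCE(r.retweets, 0) AS retweets
FROM tweets t
LEFT JOIN (SELECT post_id, COUNT(*) AS likes
           FROM likes GROUP BY post_id) l ON l.post_id = t.id
LEFT JOIN (SELECT post_id, COUNT(*) AS retweets
           FROM retweets GROUP BY post_id) r ON r.post_id = t.id
ORDER BY t.id
""").fetchall()

print(join_then_count)
print(count_then_join)
```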

Multiple selects on joined tables with group by?

I have three tables with the structures outlined below:
CREATE TABLE users (
id BIGSERIAL PRIMARY KEY,
username VARCHAR(255) UNIQUE
);
CREATE TABLE posts (
id BIGSERIAL PRIMARY KEY,
user_id BIGINT REFERENCES users(id) NOT NULL,
category BIGINT REFERENCES categories(id) NOT NULL,
text TEXT NOT NULL
);
CREATE TABLE posts_votes (
user_id BIGINT REFERENCES users(id) NOT NULL,
post_id BIGINT REFERENCES posts(id) NOT NULL,
value SMALLINT NOT NULL,
PRIMARY KEY(user_id, post_id)
);
I was able to compose a query that gets each post with its user and its total value using the below query:
SELECT p.id, p.text, u.username, COALESCE(SUM(v.value), 0) AS vote_value
FROM posts p
LEFT JOIN posts_votes v ON p.id=v.post_id
JOIN users u ON p.user_id=u.id
WHERE p.category=1337
GROUP BY p.id, p.text, u.username
But now I want to also return a column that returns the result of SELECT COALESCE((SELECT value FROM posts_votes WHERE user_id=1234 AND post_id=n), 0) for each post_id n in the above query. What would be the best way to do this?
I think an additional LEFT JOIN is a reasonable approach:
SELECT p.id, p.text, u.username, COALESCE(SUM(v.value), 0) AS vote_value,
COALESCE(pv.value, 0)
FROM posts p JOIN
users u
ON p.user_id=u.id LEFT JOIN
posts_votes v
ON p.id = v.post_id LEFT JOIN
posts_votes pv
ON pv.user_id = 1234 AND pv.post_id = p.id
WHERE p.category = 1337
GROUP BY p.id, p.text, u.username, pv.value;
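The extra LEFT JOIN is safe here because the primary key (user_id, post_id) on posts_votes guarantees at most one matching row per post for a given user, so the join cannot multiply rows and distort the SUM. A minimal sketch of the idea follows, using Python's sqlite3 module as a stand-in for PostgreSQL. The reduced tables, the sample rows, and the column alias my_vote are assumptions for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT UNIQUE);
CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL,
                    category INTEGER NOT NULL);
CREATE TABLE posts_votes (user_id INTEGER NOT NULL, post_id INTEGER NOT NULL,
                          value INTEGER NOT NULL,
                          PRIMARY KEY (user_id, post_id));
-- Hypothetical sample data: alice wrote all posts; post 3 is in another category.
INSERT INTO users VALUES (1, 'alice');
INSERT INTO posts VALUES (1, 1, 1337), (2, 1, 1337), (3, 1, 5), (4, 1, 1337);
-- Post 1: votes +1 (user 1234), +1, -1. Post 2: one +1 from another user.
INSERT INTO posts_votes VALUES (1234, 1, 1), (5, 1, 1), (6, 1, -1), (7, 2, 1);
""")

rows = conn.execute("""
SELECT p.id, u.username,
       COALESCE(SUM(v.value), 0) AS vote_value,  -- total over all voters
       COALESCE(pv.value, 0)     AS my_vote      -- the given user's own vote
FROM posts p
JOIN users u ON u.id = p.user_id
LEFT JOIN posts_votes v  ON v.post_id = p.id
LEFT JOIN posts_votes pv ON pv.post_id = p.id AND pv.user_id = ?
WHERE p.category = ?
GROUP BY p.id, u.username, pv.value
ORDER BY p.id
""", (1234, 1337)).fetchall()

for row in rows:
    print(row)
```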

PostgreSQL Query trimming results unnecessarily

I'm working on my first assignment using SQL on our class' PostgreSQL server. A sample database has the (partial here) schema:
CREATE TABLE users (
id int PRIMARY KEY,
userStatus varchar(100),
userType varchar(100),
userName varchar(100),
email varchar(100),
age int,
street varchar(100),
city varchar(100),
state varchar(100),
zip varchar(100),
CONSTRAINT users_status_fk FOREIGN KEY (userStatus) REFERENCES userStatus(name),
CONSTRAINT users_types_fk FOREIGN KEY (userType) REFERENCES userTypes(name)
);
CREATE TABLE events (
id int primary key,
title varchar(100),
edate date,
etime time,
location varchar(100),
user_id int, -- creator of the event
CONSTRAINT events_user_fk FOREIGN KEY (user_id) REFERENCES users(id)
);
CREATE TABLE polls (
id int PRIMARY KEY,
question varchar(100),
creationDate date,
user_id int, --creator of the poll
CONSTRAINT polls_user_fk FOREIGN KEY (user_id) REFERENCES users(id)
);
and a bunch of sample data (in particular, 127 sample users).
I have to write a query to find the number of polls created by a user within the past year, as well as the number of events created by a user that occurred in the past year. The trick is, I should have rows with 0s for both columns if the user had no such polls/events.
I have a query which seems to return the correct data, but only for 116 of the 127 users, and I cannot understand why the query is trimming these 11 users, when the WHERE clause only checks attributes of the poll/event. Following is my query:
SELECT u.id, u.userStatus, u.userType, u.email, -- Return user details
COUNT(DISTINCT e.id) AS NumEvents, -- Count number of events
COUNT(DISTINCT p.id) AS NumPolls -- Count number of polls
FROM (users AS u LEFT JOIN events AS e ON u.id = e.user_id) LEFT JOIN polls AS p ON u.id = p.user_id
WHERE (p.creationDate IS NULL OR ((now() - p.creationDate) < INTERVAL '1' YEAR) OR -- Only get polls created within last year
e.edate IS NULL OR ((now() - e.edate) < INTERVAL '1' YEAR)) -- Only get events that happened during last year
GROUP BY u.id, u.userStatus, u.userType, u.email;
Any help would be much appreciated.
Using a different query seemed to work. Here's what I ended up with:
SELECT u.id, u.userStatus, u.userType, u.email, COUNT(DISTINCT e.id) AS numevents, COUNT(DISTINCT p.id) AS numpolls
FROM users AS u LEFT OUTER JOIN (SELECT * FROM events WHERE ((now() - edate) < INTERVAL '1' YEAR)) AS e ON u.id = e.user_id
LEFT OUTER JOIN (SELECT * FROM polls WHERE ((now() - creationDate) < INTERVAL '1' YEAR)) AS p ON u.id = p.user_id
GROUP BY u.id, u.userStatus, u.userType, u.email
;
Try to avoid using DISTINCT where you can, for example by aggregating in sub-queries.
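As for why the original query lost users: a WHERE clause runs after the joins, so a user who has at least one event and at least one poll, all of them older than a year, produces only joined rows that fail every branch of the OR and disappears entirely. The sketch below reproduces this on SQLite through Python's sqlite3 module as a stand-in for PostgreSQL. The three sample users, the reduced tables, and the date arithmetic via SQLite's date() function are assumptions for illustration; the upper bound on dates (events in the future) is ignored for brevity.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, userName TEXT);
CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, edate TEXT);
CREATE TABLE polls (id INTEGER PRIMARY KEY, user_id INTEGER, creationDate TEXT);
-- Hypothetical sample data, dated relative to today.
INSERT INTO users VALUES (1, 'ann'), (2, 'ben'), (3, 'cy');
-- ann: recent event, old poll. ben: old event and old poll. cy: nothing.
INSERT INTO events VALUES (1, 1, date('now', '-10 days')),
                          (2, 2, date('now', '-3 years'));
INSERT INTO polls VALUES (1, 1, date('now', '-3 years')),
                         (2, 2, date('now', '-3 years'));
""")

# Date tests in WHERE: ben's only joined row is (old event, old poll), so it is
# removed, and ann's old poll is still counted because her event is recent.
where_version = conn.execute("""
SELECT u.id, COUNT(DISTINCT e.id) AS numevents, COUNT(DISTINCT p.id) AS numpolls
FROM users u
LEFT JOIN events e ON e.user_id = u.id
LEFT JOIN polls p ON p.user_id = u.id
WHERE p.creationDate IS NULL OR p.creationDate >= date('now', '-1 year')
   OR e.edate IS NULL OR e.edate >= date('now', '-1 year')
GROUP BY u.id
ORDER BY u.id
""").fetchall()

# Date tests in ON: old rows are simply not joined, and every user is kept.
on_version = conn.execute("""
SELECT u.id, COUNT(DISTINCT e.id) AS numevents, COUNT(DISTINCT p.id) AS numpolls
FROM users u
LEFT JOIN events e ON e.user_id = u.id AND e.edate >= date('now', '-1 year')
LEFT JOIN polls p ON p.user_id = u.id AND p.creationDate >= date('now', '-1 year')
GROUP BY u.id
ORDER BY u.id
""").fetchall()

print(where_version)
print(on_version)
```

Filtering in the ON clause has the same effect as joining to pre-filtered sub-queries: rows outside the date range never reach the join, so they can neither remove a user nor be counted by accident.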