Select users' info with their total number of comments - sql

I have two tables:
User : (id, username, created_at, updated_at)
Comment : (comment_id, user_id, username, created_at, updated_at)
Note: yes, I do understand Comment table has a duplicated field, 'username'. However, the table is already designed in that way and I have no permission to redesign the schema.
This is the output format I want:
id | username | num_of_comments
Here are two SQL queries I've tried. (I've simplified them to show what I'm trying to do; minor typos may exist, but the general idea is here.)
-- Ver 1
SELECT u.id, u.username, COUNT(c.comment_id)
FROM User u
LEFT JOIN Comment c ON u.id = c.user_id
GROUP BY u.id
-- Ver 2
SELECT u.id, u.username, c.cnt
FROM User u
LEFT JOIN (SELECT user_id, COUNT(*) AS cnt
           FROM Comment
           GROUP BY user_id) c
ON u.id = c.user_id
GROUP BY u.id
Both queries give me the same error:
"Column 'username' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause"
After reading some articles about it, I've learned that there's a conflict between selecting 'username' and grouping rows by 'id'.
I'm still googling and reading similar cases, but I keep getting the same error (I'm not that good at SQL...).
What would be the best way to write a SQL query that produces output in this format?
id | username | num_of_comments
1 | Tyler | 3
2 | Jane | 5
3 | Jack | 1

SELECT u.id, u.username, COUNT(c.comment_id) AS theCount
FROM User u
JOIN Comment c ON u.id = c.user_id
GROUP BY u.id, u.username
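For anyone who wants to try the grouped query on sample data, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows, including a fourth user 'Ann' with no comments, are made up for illustration. It uses LEFT JOIN (as in the question) rather than the inner JOIN above, so a user without comments still shows up with a count of 0. Note that SQLite is more permissive than SQL Server and would not raise the "invalid in the select list" error, so this only illustrates the shape of the fixed query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE User (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE Comment (comment_id INTEGER PRIMARY KEY, user_id INTEGER, username TEXT);
    INSERT INTO User (id, username) VALUES (1, 'Tyler'), (2, 'Jane'), (3, 'Jack'), (4, 'Ann');
""")

# Hypothetical sample data: 3, 5 and 1 comments for users 1-3, none for user 4.
comments = [(1, 'Tyler')] * 3 + [(2, 'Jane')] * 5 + [(3, 'Jack')] * 1
conn.executemany("INSERT INTO Comment (user_id, username) VALUES (?, ?)", comments)

# Every non-aggregated column in the select list also appears in GROUP BY.
rows = conn.execute("""
    SELECT u.id, u.username, COUNT(c.comment_id) AS num_of_comments
    FROM User u
    LEFT JOIN Comment c ON c.user_id = u.id
    GROUP BY u.id, u.username
    ORDER BY u.id
""").fetchall()

print(rows)  # [(1, 'Tyler', 3), (2, 'Jane', 5), (3, 'Jack', 1), (4, 'Ann', 0)]
```

Counting c.comment_id instead of using COUNT(*) is what gives 0 for the user without comments, because the NULLs produced by the outer join are not counted.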

Drew has the right answer. But I want to point out that your second query can also work. It just doesn't need a group by at the outermost level:
SELECT u.id, u.username, c.cnt
FROM User u LEFT JOIN
     (SELECT user_id, COUNT(*) AS cnt
      FROM Comment
      GROUP BY user_id
     ) c
     ON u.id = c.user_id;
Under some circumstances, this can even have better performance -- for instance, if username were a really, really long string.

Neither has solved the issue... :'(
SELECT
    u.*,
    (SELECT COUNT(*) FROM Comment c WHERE c.user_id = u.id) AS Comments
FROM User u
ORDER BY u.id DESC
This workaround has solved the issue. It's a simplified version of what I've actually coded, though. I still appreciate your answers.

Related

How to print two attribute values from your Sub query table

Suppose I have two tables,
User
Post
Posts are made by Users (i.e. the Post Table will have foreign key of user)
Now my question is,
Print the details of all the users who have more than 10 posts
To solve this, I can type the following query and it would give me the desired result,
SELECT * from USER where user_id in (SELECT user_id from POST group by user_id having count(user_id) > 10)
The problem occurs when I also want to print the count of posts along with the user details. The count cannot be obtained from the USER table; it can only come from the POST table. But I can't return two values from my subquery, i.e. I can't do the following:
SELECT * from USER where user_id in (SELECT user_id, count(user_id) from POST group by user_id having count(user_id) > 10)
So, how do I resolve this? One solution I know is the following, but I think it is a very naive way to solve it: it makes the query much more complex and also much slower.
SELECT u.*, (SELECT count(*) from POST po where po.user_id = u.user_id) from USER u where u.user_id in (SELECT p.user_id from POST p group by p.user_id having count(p.user_id) > 10)
Is there any other way to solve this using subqueries?
Move the aggregation to the from clause:
SELECT u.*, p.num_posts
FROM user u JOIN
(SELECT p.user_id, COUNT(*) as num_posts
FROM post p
GROUP BY p.user_id
HAVING COUNT(*) > 10
) p
ON u.user_id = p.user_id;
You can do this with subqueries:
select u.*
from (select u.*,
             (select count(*) from post p where p.user_id = u.user_id) as num_posts
      from user u
     ) u
where num_posts > 10;
With an index on post(user_id), this might actually have better performance than the version using JOIN/GROUP BY.
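To compare the two forms side by side, here is a small sketch with Python's built-in sqlite3 module. The `name` and `post_id` columns and the sample users ('alice' with 11 posts, 'bob' with 3) are assumptions for illustration only; the original tables are not shown in full. It says nothing about performance, only that both queries return the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post (post_id INTEGER PRIMARY KEY, user_id INTEGER);
    INSERT INTO user VALUES (1, 'alice'), (2, 'bob');
""")
# Hypothetical data: alice has 11 posts, bob has 3.
conn.executemany("INSERT INTO post (user_id) VALUES (?)", [(1,)] * 11 + [(2,)] * 3)

# Aggregation moved into a derived table in the FROM clause.
join_rows = conn.execute("""
    SELECT u.user_id, u.name, p.num_posts
    FROM user u
    JOIN (SELECT user_id, COUNT(*) AS num_posts
          FROM post
          GROUP BY user_id
          HAVING COUNT(*) > 10) p
      ON u.user_id = p.user_id
""").fetchall()

# Correlated scalar subquery, filtered in an outer query.
subquery_rows = conn.execute("""
    SELECT t.user_id, t.name, t.num_posts
    FROM (SELECT u.*,
                 (SELECT COUNT(*) FROM post p WHERE p.user_id = u.user_id) AS num_posts
          FROM user u) t
    WHERE t.num_posts > 10
""").fetchall()

print(join_rows)      # [(1, 'alice', 11)]
print(subquery_rows)  # [(1, 'alice', 11)]
```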
You can also try joining the tables; prefer a JOIN to a subquery:
SELECT user.*, COUNT(post.user_id) AS postcount
FROM user LEFT JOIN post ON user.user_id = post.user_id
GROUP BY user.user_id
HAVING COUNT(post.user_id) > 10;

How to count and group by column across one to many relationship while handling 0 case?

I am trying to formulate a single SQL query that will count a table across a one to many relationship. Here is the short version of my schema:
User(id)
Group(id)
UserGroup(user_id, group_id)
Post(id, user_id, group_id)
The goal is to return the count of posts for each user in a group. The specific issue I am running into is my current query cannot return 0 for a user that has no posts. Here is my naive query:
SELECT
COUNT(*) as total,
user_id
FROM
posts
WHERE
group_id = ?
GROUP BY user_id
ORDER BY
total DESC
This works fine when every user has a post, but when some have no posts, they do not show up in the list. How can I write a single query that handles this scenario and returns count 0 for said users? I know I need to somehow incorporate UserGroup to get the list of users, but am stuck from there.
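Use a left join, and count a column from posts rather than using COUNT(*), so that users with no posts get 0 instead of 1:
SELECT u.id, COUNT(p.id) as total
FROM users u LEFT JOIN
     posts p
     ON p.user_id = u.id AND
        p.group_id = ?
GROUP BY u.id
ORDER BY total DESC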
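The difference between COUNT(*) and counting a column of the outer-joined table is easy to miss, so here is a minimal sketch using Python's built-in sqlite3 module. The two users and the group id 7 are made-up sample values; user 2 has no posts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, group_id INTEGER);
    INSERT INTO users VALUES (1), (2);
    INSERT INTO posts (user_id, group_id) VALUES (1, 7), (1, 7);
""")

def totals(count_expr):
    """Run the left-join query with the given counting expression."""
    sql = f"""
        SELECT u.id, {count_expr} AS total
        FROM users u
        LEFT JOIN posts p ON p.user_id = u.id AND p.group_id = ?
        GROUP BY u.id
        ORDER BY u.id
    """
    return conn.execute(sql, (7,)).fetchall()

star = totals("COUNT(*)")     # counts the NULL-padded row of the outer join too
col = totals("COUNT(p.id)")   # ignores rows where p.id is NULL

print(star)  # [(1, 2), (2, 1)]
print(col)   # [(1, 2), (2, 0)]
```

Keeping the group_id condition in the ON clause (not in WHERE) is what preserves the users without posts; moving it to WHERE would filter the NULL-padded rows out again.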
I think I got it, but I'm not sure how well it performs.
select count(p.id), u.id
from users u
left join (select * from posts where group_id = ?) p on p.user_id = u.id
where u.id in (select user_id from user_group where group_id = ?)
group by u.id;

No luck getting most recent entry in sub-select: ORA-00904: invalid identifier, and ORA-06553: wrong number or types of arguments in call

I've seen many posts about this subject but none of the solutions solved my problem.
In a nutshell, I have a users table and a user_history table. Each user can have 0 or more user_history entries. The user_history table has a status column. All I want to do is get a list of users and the value of the status column for their most recent user_history entry. And I can't get it to work. I've tried:
select u.id, u.name,
(select status from (select status, rownum as rn from user_history uh where uh.user_id = u.id
order by created_date desc) where rn = 1) status
from users u;
This gives me an "ORA-00904: invalid identifier, u.id" error. From what I've read, Oracle does not allow you to access the outer select's 'u.id' from within a sub-sub-select (the one with rownum). From the first sub-select it works fine, but as I said, I can have n entries in user_history and I only need the most recent.
I've also tried using an inner join:
select u.id, u.name, h.status
from users u
inner join (select user_id, status, rownum as rn from user_history where user_id = u.id order by created_date desc) h on u.id = h.user_id where h.rn = 1;
This gives me the dreaded "ORA-06553: wrong number or types of arguments in call to u" ... which I tried fixing by using distinct but to no avail.
I've also tried using row_number(), over and partition ... other types of inner joins with select ... nothing gets me the data I need.
Can someone give me a hand with this (seemingly) simple query?
In the old days, the query would look something like this:
select
u.id,
u.name,
uh.status
from
users u
inner join
(select
user_id,
status
from
user_history h
where
created_date = (select
max(created_date)
from
user_history d
where
h.user_id = d.user_id)
) uh
on u.id = uh.user_id;
What you have here is a correlated subquery that finds the latest history date for each user. It executes once per row, so it can be a bit slow. The result is then joined to the users table to get the status.
I haven't tested it, but it looks right.
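As a quick check of the idea, here is a sketch using Python's built-in sqlite3 module instead of Oracle. It applies the same correlated MAX(created_date) filter, written as a plain join rather than a derived table. The sample rows and the ISO-formatted date strings are assumptions for illustration; 'Balin' deliberately has no history.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user_history (user_id INTEGER, status TEXT, created_date TEXT);
    INSERT INTO users VALUES (1, 'Bilbo'), (2, 'Pippin'), (3, 'Balin');
    INSERT INTO user_history VALUES
        (1, 'one',   '2014-01-01'),
        (1, 'two',   '2014-01-02'),
        (1, 'three', '2014-01-03'),
        (2, 'seven', '2014-01-07'),
        (2, 'eight', '2014-01-08');
""")

# Keep only the history row whose date equals that user's latest date.
latest = conn.execute("""
    SELECT u.id, u.name, h.status
    FROM users u
    JOIN user_history h ON h.user_id = u.id
    WHERE h.created_date = (SELECT MAX(d.created_date)
                            FROM user_history d
                            WHERE d.user_id = h.user_id)
    ORDER BY u.id
""").fetchall()

print(latest)  # [(1, 'Bilbo', 'three'), (2, 'Pippin', 'eight')]
```

Because this is an inner join, a user with no history rows is dropped. If two history rows for one user share the same latest date, both are returned.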
Would something like this work? It would also eliminate the scalar within your query and be a little easier to debug, since you can run the inner query (uh) independently and evaluate its results.
with uh as (
select
u.id, u.name, uh.status, uh.created_date,
max (uh.created_date) over (partition by uh.user_id) as max_date
from
users u,
user_history uh
where
u.id = uh.user_id
)
select
id, name, status
from uh
where created_date = max_date
-- Edit --
For what it's worth, I loaded some sample data:
Users
1 Bilbo
2 Fatty
3 Pippin
4 Balin
User History
1 one 1/1/2014
1 two 1/2/2014
1 three 1/3/2014
2 four 1/4/2014
2 five 1/5/2014
2 six 1/6/2014
3 seven 1/7/2014
3 eight 1/8/2014
3 nine 1/9/2014
This was the output:
1 Bilbo three
2 Fatty six
3 Pippin nine
Here is the row_number alternative if you have multiple history records with the exact same "date" field.
with uh as (
select
u.id, u.name, uh.status,
row_number() over
(partition by u.id order by uh.created_date desc) as rn
from
users u,
user_history uh
where
u.id = uh.user_id
)
select
id, name, status
from uh
where rn = 1

Getting Counts Per User Across Tables in POSTGRESQL

I'm new to PostgreSQL. I have a database with three tables: Users, Orders, and Comments. The Orders and Comments tables look like this:
Orders         Comments
------         --------
ID             ID
UserID         UserID
Description    Details
CreatedOn      CreatedOn
I'm trying to get a list of all of my users and how many orders each user has made and how many comments each user has made. In other words, the result of the query should look like this:
UserID  Orders  Comments
------  ------  --------
1       5       7
2       2       9
3       0       0
...
Currently, I'm trying the following:
SELECT
UserID,
(SELECT COUNT(ID) FROM Orders WHERE UserID=ID) AS Orders,
(SELECT COUNT(ID) FROM Comments WHERE UserID=ID) AS Comments
FROM
Orders o,
Comments c
WHERE
o.UserID = c.UserID
Is this the right way to do this type of query? Or can someone provide a better approach from a performance standpoint?
select
id, name,
coalesce(orders, 0) as orders,
coalesce(comments, 0) as comments
from
users u
left join
(
select userid as id, count(*) as orders
from orders
group by userid
) o using (id)
left join
(
select userid as id, count(*) as comments
from comments
group by userid
) c using (id)
order by name
The usual way to do this is to outer join the two other tables and then group by the id (and name):
select u.id,
u.name,
count(distinct o.id) as num_orders,
count(distinct c.id) as num_comments
from users u
left join orders o on o.userId = u.id
left join comments c on c.userId = u.id
group by u.id, u.name
order by u.name;
That might very well be faster than your approach. But Postgres' query optimizer is quite smart and I have seen situations where both solutions are essentially equal in performance.
You will need to test that on your data and also have a look at the execution plans in order to find out which one is more efficient.
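One detail of the double outer join worth seeing in action is why count(distinct ...) is needed: joining two child tables to the same user multiplies the rows. A minimal sketch with Python's built-in sqlite3 module (not Postgres; the sample users and row counts are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, userid INTEGER);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, userid INTEGER);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO orders (userid) VALUES (1), (1);          -- ann: 2 orders
    INSERT INTO comments (userid) VALUES (1), (1), (1);   -- ann: 3 comments
""")

def counts(order_expr, comment_expr):
    """Join both child tables to users and aggregate with the given expressions."""
    sql = f"""
        SELECT u.id, {order_expr} AS num_orders, {comment_expr} AS num_comments
        FROM users u
        LEFT JOIN orders o ON o.userid = u.id
        LEFT JOIN comments c ON c.userid = u.id
        GROUP BY u.id, u.name
        ORDER BY u.id
    """
    return conn.execute(sql).fetchall()

plain = counts("COUNT(o.id)", "COUNT(c.id)")
distinct = counts("COUNT(DISTINCT o.id)", "COUNT(DISTINCT c.id)")

print(plain)     # [(1, 6, 6), (2, 0, 0)]  -- 2 orders x 3 comments = 6 joined rows
print(distinct)  # [(1, 2, 3), (2, 0, 0)]
```

The version with pre-aggregated subqueries avoids this row multiplication entirely, which may be one reason it can perform differently on large tables.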

Find total user posting per post SQL QUERY

I have a database with the following two tables, member and POSTS. I am looking for a way to get the count of how many posts a user has.
(Source: http://i.stack.imgur.com/FDv31.png)
I have tried many variations of the following SQL command without any success. Instead of showing the count of posts for each user, it shows a single row with the count of all posts.
In the end I want something like this
(Source: http://i.stack.imgur.com/EbaEj.png)
Might be that I'm missing something here, but this query would seem to give you the results you want:
SELECT member.ID,
member.Name,
(SELECT COUNT(*) FROM Posts WHERE member.ID = Posts.user_id) AS total
FROM member;
I have left comment out of the query as it is not obvious what comment you want to be returned in that column for the group of comments that is counted.
Edit
Sorry, misinterpreted your question :-) This query will properly return all the comments, along with the person who posted them and the total number of comments that the person made:
SELECT Posts.ID,
member.Name,
(SELECT COUNT(*) FROM Posts WHERE member.ID = Posts.user_id) AS total,
Posts.comment
FROM Posts
INNER JOIN member ON Posts.user_id = member.ID
GROUP BY Posts.ID, member.Name, member.ID, Posts.comment;
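Here is a small sketch of the per-post query with Python's built-in sqlite3 module. The column names follow the queries above (the schema image is not available here), and the sample members and comments are made up. The GROUP BY is left out, since each post already produces exactly one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE member (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Posts (ID INTEGER PRIMARY KEY, user_id INTEGER, comment TEXT);
    INSERT INTO member VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO Posts (user_id, comment) VALUES (1, 'first'), (1, 'second'), (2, 'third');
""")

# One row per post, each carrying the total number of posts by its author.
rows = conn.execute("""
    SELECT p.ID, m.Name,
           (SELECT COUNT(*) FROM Posts x WHERE x.user_id = m.ID) AS total,
           p.comment
    FROM Posts p
    INNER JOIN member m ON p.user_id = m.ID
    ORDER BY p.ID
""").fetchall()

print(rows)  # [(1, 'ann', 2, 'first'), (2, 'ann', 2, 'second'), (3, 'bob', 1, 'third')]
```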
You could use a subquery to calculate the total posts per member:
select m.ID
, m.Name
, coalesce(grp.total, 0)
, p.comment
from member m
left join
posts p
on p.user_id = m.id
left join
(
select user_id
, count(*) as total
from posts
group by
user_id
) grp
on grp.user_id = m.id
Alternatively, use a window function:
select
    a.id
  , a.name
  , count(1) over (partition by b.user_id) as TotalCountPerUser
  , b.comment
from member a join posts b
    on a.id = b.user_id