Suppose I have two tables,
User
Post
Posts are made by Users (i.e. the Post Table will have foreign key of user)
Now my question is,
Print the details of all the users who have more than 10 posts
To solve this, I can type the following query and it would give me the desired result,
SELECT * from USER where user_id in (SELECT user_id from POST group by user_id having count(user_id) > 10)
The problem occurs when I also want to print the count of posts along with the user details. The post count cannot be obtained from the USER table; it can only come from the POST table. But I can't return two values from my subquery, i.e. I can't do the following,
SELECT * from USER where user_id in (SELECT user_id, count(user_id) from POST group by user_id having count(user_id) > 10)
So, how do I resolve this issue? One solution I know is the following, but I think it is a very naive way to do it: it makes the query much more complex and also much slower,
SELECT u.*, (SELECT count(*) from POST po where po.user_id = u.user_id) as num_posts from USER u where u.user_id in (SELECT p.user_id from POST p group by p.user_id having count(p.user_id) > 10)
Is there any other way to solve this using subqueries?
Move the aggregation to the FROM clause:
SELECT u.*, p.num_posts
FROM user u JOIN
(SELECT p.user_id, COUNT(*) as num_posts
FROM post p
GROUP BY p.user_id
HAVING COUNT(*) > 10
) p
ON u.user_id = p.user_id;
You can also do this with a correlated subquery:
select u.*
from (select u.*,
(select count(*) from post p where p.user_id = u.user_id) as num_posts
from user u
) u
where num_posts > 10;
With an index on post(user_id), this might actually have better performance than the version using JOIN/GROUP BY.
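To compare the two forms side by side, here is a minimal sketch that runs both against an in-memory SQLite database from Python. The name column, the sample rows, and the helper names (build_demo_db, users_with_more_posts_than) are assumptions made up for the demo; only the shape of the two queries comes from the answer above.

```python
import sqlite3

# Derived-table version: aggregate post once, then join the counts to user.
JOIN_SQL = """
    SELECT u.user_id, u.name, p.num_posts
    FROM user u
    JOIN (SELECT user_id, COUNT(*) AS num_posts
          FROM post
          GROUP BY user_id
          HAVING COUNT(*) > ?) p
      ON u.user_id = p.user_id
    ORDER BY u.user_id
"""

# Correlated-subquery version: count the posts for each user row, then filter.
CORRELATED_SQL = """
    SELECT user_id, name, num_posts
    FROM (SELECT u.user_id, u.name,
                 (SELECT COUNT(*) FROM post p
                  WHERE p.user_id = u.user_id) AS num_posts
          FROM user u) t
    WHERE num_posts > ?
    ORDER BY user_id
"""


def build_demo_db():
    """Create a throwaway database (hypothetical schema and sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE user (user_id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE post (post_id INTEGER PRIMARY KEY, "
        "user_id INTEGER REFERENCES user (user_id))"
    )
    # The index the answer mentions for the correlated form.
    conn.execute("CREATE INDEX idx_post_user ON post (user_id)")
    conn.executemany(
        "INSERT INTO user VALUES (?, ?)", [(1, "ann"), (2, "bob"), (3, "cy")]
    )
    # ann gets 12 posts, bob gets 3, cy gets none.
    conn.executemany(
        "INSERT INTO post (user_id) VALUES (?)", [(1,)] * 12 + [(2,)] * 3
    )
    return conn


def users_with_more_posts_than(conn, threshold, sql=JOIN_SQL):
    """Return (user_id, name, num_posts) rows for users above the threshold."""
    return conn.execute(sql, (threshold,)).fetchall()
```

Both queries should return the same rows; which one is faster depends on the database and the data, so it is worth measuring on the real tables.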
You can also try joining the tables; I prefer a JOIN to a subquery here:
SELECT user.*, COUNT(post.user_id) AS postcount
FROM user LEFT JOIN post ON user.user_id = post.user_id
GROUP BY user.user_id
HAVING COUNT(post.user_id) > 10;
Related
We have these tables in PostgreSQL 12:
User -> id, name, email
items -> id, user_id, description
We want to run a query to find users that have 1 item or less.
I tried using a join statement and in the WHERE clause tried to put the count of users < 1 with this query
select * from "user" inner join item on "user".id = item.user_id where count(item.user_id) < 1;
but it failed and gave me this error.
ERROR: aggregate functions are not allowed in WHERE
LINE 1: ...inner join item on "user".id = item.user_id where count(item...
So I'm thinking the query needs to be more technical.
Can anyone please help me with this? Thanks.
You can do:
select u.*
from "user" u
left join (
select user_id, count(*) as cnt from items group by user_id
) x on x.user_id = u.id
where x.cnt = 1 or x.cnt is null
The trick is that you need to use HAVING instead of WHERE to filter on aggregated data like COUNT(). If you only care about users that have at least one item, you don't technically need a JOIN; you can get the data from the item table alone with GROUP BY:
SELECT user_id
FROM item
GROUP BY user_id
HAVING COUNT(id) <= 1
That query cannot see users with no items at all, though. To include them, and to show more fields from the user table, start from "user" and LEFT JOIN the items:
SELECT u.id, u.name, u.email
FROM "user" u
LEFT JOIN item i ON i.user_id = u.id
GROUP BY u.id, u.name, u.email
HAVING COUNT(i.id) <= 1
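For the "1 item or less" requirement in the question, a small runnable sketch may help. It uses SQLite from Python as a stand-in for PostgreSQL 12 (an assumption, chosen so the example is self-contained), and the sample rows and function names are invented for the demo.

```python
import sqlite3

# Start from "user" so that users without any item survive the join;
# COUNT(i.id) ignores the NULLs produced by the LEFT JOIN, so they count as 0.
AT_MOST_ONE_ITEM_SQL = """
    SELECT u.id, u.name, COUNT(i.id) AS item_count
    FROM "user" u
    LEFT JOIN item i ON i.user_id = u.id
    GROUP BY u.id, u.name
    HAVING COUNT(i.id) <= 1
    ORDER BY u.id
"""


def build_item_db():
    """Create a throwaway database (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE "user" (id INTEGER PRIMARY KEY, name TEXT, email TEXT)'
    )
    conn.execute(
        "CREATE TABLE item (id INTEGER PRIMARY KEY, user_id INTEGER, "
        "description TEXT)"
    )
    conn.executemany(
        'INSERT INTO "user" VALUES (?, ?, ?)',
        [(1, "ann", "ann@example.com"),
         (2, "bob", "bob@example.com"),
         (3, "cy", "cy@example.com")],
    )
    # ann has no items, bob has one, cy has two.
    conn.executemany(
        "INSERT INTO item (user_id, description) VALUES (?, ?)",
        [(2, "lamp"), (3, "desk"), (3, "chair")],
    )
    return conn


def users_with_at_most_one_item(conn):
    """Return (id, name, item_count) for users with zero or one item."""
    return conn.execute(AT_MOST_ONE_ITEM_SQL).fetchall()
```

The filter sits in HAVING because it depends on the aggregate; putting COUNT() in WHERE is what triggered the error in the question.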
I am using Python with a SQLite3 DB I created. I have the DB created and am currently just using the command line to try to get the SQL statement right.
I have 2 tables.
Table 1 - users
user_id, name, message_count
Table 2 - messages
id, date, message, user_id
When I set up table two, I added this statement to the creation of my messages table, but I have no clue what, if anything, it does:
FOREIGN KEY (user_id) REFERENCES users (user_id)
What I am trying to do is return a list containing the name and message count during 2020. I have used this statement to get the TOTAL number of posts in 2020, and it works:
SELECT COUNT(*) FROM messages WHERE substr(date,1,4)='2020';
But I am struggling with figuring out if I should Join the tables, or if there is a way to pull just the info I need. The statement I want would look something like this:
SELECT name, COUNT(*) FROM users JOIN messages ON messages.user_id = users.user_id WHERE substr(date,1,4)='2020';
One option uses a correlated subquery:
select u.*,
(
select count(*)
from messages m
where m.user_id = u.user_id and m.date >= '2020-01-01' and m.date < '2021-01-01'
) as cnt_messages
from users u
This query would take advantage of an index on messages(user_id, date).
You could also join and aggregate. If you want to include users that have no messages, a left join is appropriate:
select u.name, count(m.user_id) as cnt_messages
from users u
left join messages m
on m.user_id = u.user_id and m.date >= '2020-01-01' and m.date < '2021-01-01'
group by u.user_id, u.name
Note that it is more efficient to filter the date column against literal dates than applying a function on it (which precludes the use of an index).
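Since the question is about Python and SQLite3, here is a sketch of the left join version driven from Python. It assumes the date column holds ISO-8601 text with the year first (which the substr(date,1,4) test in the question suggests); the sample rows and function names are made up.

```python
import sqlite3

# Half-open date range [start, end) compared as text; this relies on the
# assumption that dates are stored as 'YYYY-MM-DD...' strings.
YEAR_COUNT_SQL = """
    SELECT u.name, COUNT(m.id) AS cnt_messages
    FROM users u
    LEFT JOIN messages m
      ON m.user_id = u.user_id AND m.date >= ? AND m.date < ?
    GROUP BY u.user_id, u.name
    ORDER BY u.user_id
"""


def build_messages_db():
    """Create a throwaway database (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE messages (id INTEGER PRIMARY KEY, date TEXT, "
        "message TEXT, user_id INTEGER, "
        "FOREIGN KEY (user_id) REFERENCES users (user_id))"
    )
    # Index suggested in the answer for this kind of lookup.
    conn.execute("CREATE INDEX idx_messages_user_date ON messages (user_id, date)")
    conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ann"), (2, "bob")])
    conn.executemany(
        "INSERT INTO messages (date, message, user_id) VALUES (?, ?, ?)",
        [("2019-12-31 23:59:59", "old", 1),
         ("2020-01-01 00:00:00", "first", 1),
         ("2020-12-31 23:59:59", "last", 1),
         ("2021-01-01 00:00:00", "next year", 2)],
    )
    return conn


def message_counts_for_year(conn, year):
    """Return (name, count) per user for the given year, including zeros."""
    start = f"{year:04d}-01-01"
    end = f"{year + 1:04d}-01-01"
    return conn.execute(YEAR_COUNT_SQL, (start, end)).fetchall()
```

Passing the bounds as parameters keeps the year out of the SQL string, and the date conditions live in the ON clause so that users without messages in that year still show up with a count of 0.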
You are missing a GROUP BY clause to group by user:
SELECT u.user_id, u.name, COUNT(*) AS counter
FROM users u JOIN messages m
ON m.user_id = u.user_id
WHERE substr(m.date,1,4)='2020'
GROUP BY u.user_id, u.name
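On the side question of what the FOREIGN KEY clause does: it declares that every messages.user_id must match an existing users.user_id. One thing to be aware of is that SQLite only enforces this when foreign key support is switched on for the connection with PRAGMA foreign_keys = ON. The sketch below (function name and sample values are invented) tries to insert a message for a user that does not exist, with enforcement on and off.

```python
import sqlite3


def try_orphan_insert(enforce):
    """Try to insert a message for a missing user; return True if accepted."""
    conn = sqlite3.connect(":memory:")
    # Set explicitly in both cases so the demo does not depend on the
    # library's compile-time default.
    conn.execute(f"PRAGMA foreign_keys = {'ON' if enforce else 'OFF'}")
    conn.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE messages (id INTEGER PRIMARY KEY, date TEXT, "
        "message TEXT, user_id INTEGER, "
        "FOREIGN KEY (user_id) REFERENCES users (user_id))"
    )
    try:
        # No row in users has user_id 999.
        conn.execute(
            "INSERT INTO messages (date, message, user_id) "
            "VALUES ('2020-05-01', 'hi', 999)"
        )
        return True
    except sqlite3.IntegrityError:
        # Raised when the constraint is enforced and the parent row is missing.
        return False
    finally:
        conn.close()
```

The constraint does not change how you write the JOIN; it only guards against orphaned rows.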
I am trying to formulate a single SQL query that will count a table across a one to many relationship. Here is the short version of my schema:
User(id)
Group(id)
UserGroup(user_id, group_id)
Post(id, user_id, group_id)
The goal is to return the count of posts for each user in a group. The specific issue I am running into is my current query cannot return 0 for a user that has no posts. Here is my naive query:
SELECT
COUNT(*) as total,
user_id
FROM
posts
WHERE
group_id = ?
GROUP BY user_id
ORDER BY
total DESC
This works fine when every user has a post, but when some have no posts, they do not show up in the list. How can I write a single query that handles this scenario and returns count 0 for said users? I know I need to somehow incorporate UserGroup to get the list of users, but am stuck from there.
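Use a left join, and count a column from posts rather than COUNT(*) so that users with no posts get 0 instead of 1:
SELECT u.id, COUNT(p.id) as total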
FROM users u LEFT JOIN
posts p
ON p.user_id = u.id AND
p.group_id = ?
GROUP BY u.id
ORDER BY total DESC
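One detail worth checking with a LEFT JOIN is the difference between COUNT(*) and COUNT(column): a user with no posts still produces one joined row (with NULLs on the posts side), so COUNT(*) reports 1 where COUNT(p.id) reports 0. The sketch below shows both counts in one query on an in-memory SQLite database. It also restricts the users to members of the group through the link table, which I have called user_group here; that name, the sample data, and the function names are assumptions.

```python
import sqlite3

# row_count counts joined rows (never 0 after a LEFT JOIN);
# post_count counts only rows where a post actually matched.
COUNTS_SQL = """
    SELECT u.id, COUNT(*) AS row_count, COUNT(p.id) AS post_count
    FROM users u
    JOIN user_group ug ON ug.user_id = u.id AND ug.group_id = :gid
    LEFT JOIN posts p ON p.user_id = u.id AND p.group_id = :gid
    GROUP BY u.id
    ORDER BY u.id
"""


def build_group_db():
    """Create a throwaway database (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE user_group (user_id INTEGER, group_id INTEGER)")
    conn.execute(
        "CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, "
        "group_id INTEGER)"
    )
    conn.executemany("INSERT INTO users VALUES (?)", [(1,), (2,), (3,)])
    # Users 1 and 2 belong to group 10, user 3 to group 20.
    conn.executemany(
        "INSERT INTO user_group VALUES (?, ?)", [(1, 10), (2, 10), (3, 20)]
    )
    # User 1 has two posts in group 10 and one in group 20; user 2 has none.
    conn.executemany(
        "INSERT INTO posts (user_id, group_id) VALUES (?, ?)",
        [(1, 10), (1, 10), (1, 20), (3, 20)],
    )
    return conn


def counts_for_group(conn, group_id):
    """Return (user_id, row_count, post_count) for members of the group."""
    return conn.execute(COUNTS_SQL, {"gid": group_id}).fetchall()
```

For group 10, user 2 has no posts: row_count is 1 but post_count is 0, which is the value the question asks for.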
I think I got it, but I'm not sure how performant it is:
select count(p.id), u.id
from users u
left join (select * from posts where group_id = ?) p on p.user_id = u.id
where u.id in (select user_id from user_group where group_id = ?)
group by u.id;
I have a bunch of Users, each of whom has many Posts.
Schema:
Users: id
Posts: user_id, rating
How do I find all Users who have at least one post with a rating above, say, 10?
I'm not sure if I should use a subQuery for this, or if there's an easier way.
Thanks!
To find all users with at least one post with a rating above 10, use:
SELECT u.*
FROM USERS u
WHERE EXISTS(SELECT NULL
FROM POSTS p
WHERE p.user_id = u.id
AND p.rating > 10)
EXISTS doesn't care about the SELECT list within it - you could replace NULL with 1/0, which should result in a math error for dividing by zero... But it won't, because EXISTS is only concerned with the filtering in the WHERE clause.
The correlation (the WHERE p.user_id = u.id) is why this is called a correlated subquery, and will only return rows from the USERS table where the id values match, in addition to the rating comparison.
EXISTS is also faster, depending on the situation, because it returns true as soon as the first matching row is found - duplicates don't matter.
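To see the "duplicates don't matter" point concretely, here is a small sketch on an in-memory SQLite database run from Python. The sample rows and function names are invented; the EXISTS query follows the one above, with SELECT 1 in place of SELECT NULL.

```python
import sqlite3

# Each user is tested once; the result has at most one row per user.
EXISTS_SQL = """
    SELECT u.id
    FROM users u
    WHERE EXISTS (SELECT 1
                  FROM posts p
                  WHERE p.user_id = u.id
                    AND p.rating > ?)
    ORDER BY u.id
"""

# A plain join returns one row per matching post, so users can repeat.
PLAIN_JOIN_SQL = """
    SELECT u.id
    FROM users u
    JOIN posts p ON p.user_id = u.id
    WHERE p.rating > ?
    ORDER BY u.id
"""


def build_ratings_db():
    """Create a throwaway database (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE posts (user_id INTEGER, rating INTEGER)")
    conn.executemany("INSERT INTO users VALUES (?)", [(1,), (2,), (3,)])
    # User 1 has two posts above 10, user 2 has none above 10, user 3 has no posts.
    conn.executemany(
        "INSERT INTO posts VALUES (?, ?)", [(1, 11), (1, 15), (1, 3), (2, 5)]
    )
    return conn


def user_ids(conn, sql, min_rating):
    """Run one of the queries above and return the user ids as a flat list."""
    return [row[0] for row in conn.execute(sql, (min_rating,))]
```

With this data the EXISTS form lists user 1 once, while the plain join lists user 1 twice; that is why the join-based answers need DISTINCT or GROUP BY.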
You can join the tables to find the relevant users, and use DISTINCT so each user is in the result set at most once even if they have multiple posts with rating > 10:
select distinct u.id,u.username
from users u inner join posts p on u.id = p.user_id
where p.rating > 10
Use an inner join:
SELECT * from users INNER JOIN posts p on users.id = p.user_id where p.rating > 10;
select distinct id
from users, posts
where id = user_id and rating > 10
SELECT max(p.rating), u.id
from users u
INNER JOIN posts p on u.id = p.user_id
where p.rating > 10
group by u.id;
Additionally, this will tell you what their highest rating is.
The correct answer for your question as stated is OMG Ponies's answer, WHERE EXISTS is more descriptive and almost always faster. But "SELECT NULL" looks really ugly and counterintuitive to me. I've seen SELECT * or SELECT 1 as a best practice for this.
Another way, in case we're collecting answers:
SELECT u.id
FROM users u
JOIN posts p on u.id = p.user_id
WHERE p.rating > 10
GROUP BY u.id
HAVING COUNT(*) >= 1
This could be useful if the threshold you're testing against is not always 1.
Possible Duplicate:
Retrieving the last record in each group
I have two tables set up similar to this (simplified for the question):
actions-
id - user_id - action - time
users -
id - name
I want to output the latest action for each user. I have no idea how to go about it.
I'm not great with SQL, but from what I've looked up, it should look something like the following. not sure though.
SELECT `users`.`name`, *
FROM users, actions
JOIN < not sure what to put here >
ORDER BY `actions`.`time` DESC
< only one per user_id >
Any help would be appreciated.
SELECT * FROM users JOIN actions ON actions.id=(SELECT id FROM actions WHERE user_id=users.id ORDER BY time DESC LIMIT 1);
You need to do a groupwise max - please refer to the examples here: http://jan.kneschke.de/projects/mysql/groupwise-max/
Here's an example I did for someone else which is similar to your requirements:
http://pastie.org/925108
select
u.user_id,
u.username,
latest.comment_id
from
users u
left outer join
(
select
max(comment_id) as comment_id,
user_id
from
user_comment
group by
user_id
) latest on u.user_id = latest.user_id;
select u.name, a.action, a.time
from users u, actions a
where u.id = a.user_id
and a.time in (select max(time) from actions where user_id = u.id group by user_id)
Note: untested, but this should be the pattern.
DECLARE @Table TABLE (User_ID int, Time datetime)
-- This gets the time of the latest entry for each user
INSERT INTO @Table (User_ID, Time)
SELECT z.user_id, MAX(z.time)
FROM actions z
GROUP BY z.user_id
-- Join to get resulting action
SELECT z.user_id, z.action
FROM actions z
INNER JOIN @Table x ON x.User_ID = z.user_id AND x.Time = z.time
This is the greatest-n-per-group problem that comes up frequently on Stack Overflow. Follow the tag for dozens of other posts on this problem.
Here's how to do it in MySQL given your schema with no subqueries and no GROUP BY:
SELECT u.*, a1.*
FROM users u JOIN actions a1 ON (u.id = a1.user_id)
LEFT OUTER JOIN actions a2 ON (u.id = a2.user_id AND a1.time < a2.time)
WHERE a2.id IS NULL;
In other words, show the user with her action such that if we search for another action with the same user and a later time, we find none.
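The self-join above is written for MySQL; the sketch below runs the same pattern on an in-memory SQLite database from Python, only to make it easy to try. The sample rows and function names are assumptions, and the action column is quoted because ACTION is also an SQL keyword.

```python
import sqlite3

# a1 is the candidate row; a2 looks for any later action by the same user.
# Keeping only rows where no such a2 exists leaves the latest action.
LATEST_ACTION_SQL = """
    SELECT u.name, a1."action", a1.time
    FROM users u
    JOIN actions a1 ON u.id = a1.user_id
    LEFT OUTER JOIN actions a2
      ON u.id = a2.user_id AND a1.time < a2.time
    WHERE a2.id IS NULL
    ORDER BY u.id
"""


def build_actions_db():
    """Create a throwaway database (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        'CREATE TABLE actions (id INTEGER PRIMARY KEY, user_id INTEGER, '
        '"action" TEXT, time TEXT)'
    )
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)", [(1, "ann"), (2, "bob"), (3, "cy")]
    )
    # cy has no actions at all.
    conn.executemany(
        'INSERT INTO actions (user_id, "action", time) VALUES (?, ?, ?)',
        [(1, "login", "2020-01-01 10:00"),
         (1, "post", "2020-01-02 09:00"),
         (2, "login", "2020-01-01 08:00")],
    )
    return conn


def latest_actions(conn):
    """Return (name, action, time) for each user's most recent action."""
    return conn.execute(LATEST_ACTION_SQL).fetchall()
```

Users without any action (cy here) do not appear, because the first join to actions is an inner join.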
It seems to me that the following should work:
WITH GetMaxTimePerUser (user_id, time) AS (
SELECT user_id, MAX(time)
FROM actions
GROUP BY user_id
)
SELECT u.name, a.action, a.time
FROM actions AS a
INNER JOIN users AS u ON u.id=a.user_id
INNER JOIN GetMaxTimePerUser AS u_maxtime ON u_maxtime.user_id=u.id
WHERE a.time=u_maxtime.time
I think using a temporary named result set (a common table expression, or CTE) instead of subqueries and OUTER JOINs leaves the query optimizer the most room to work with. (A CTE is something like a VIEW, but it exists only inline, for the duration of the query.)
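Here is a sketch of the CTE approach on an in-memory SQLite database, run from Python. SQLite is used only as a convenient stand-in, and the sample rows, the CTE name, and the function names are assumptions. The data deliberately includes a tie, to show that joining on MAX(time) returns every action that shares the latest time.

```python
import sqlite3

# The CTE computes the latest time per user; the outer query joins back to
# actions to fetch the row(s) that carry that time.
CTE_SQL = """
    WITH max_time_per_user (user_id, time) AS (
        SELECT user_id, MAX(time)
        FROM actions
        GROUP BY user_id
    )
    SELECT u.name, a."action", a.time
    FROM actions AS a
    INNER JOIN users AS u ON u.id = a.user_id
    INNER JOIN max_time_per_user AS m ON m.user_id = u.id
    WHERE a.time = m.time
    ORDER BY u.id, a.id
"""


def build_tied_actions_db():
    """Create a throwaway database (hypothetical sample data with a tie)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        'CREATE TABLE actions (id INTEGER PRIMARY KEY, user_id INTEGER, '
        '"action" TEXT, time TEXT)'
    )
    conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ann"), (2, "bob")])
    # bob has two actions with the same (latest) timestamp.
    conn.executemany(
        'INSERT INTO actions (user_id, "action", time) VALUES (?, ?, ?)',
        [(1, "login", "2020-01-01 10:00"),
         (1, "post", "2020-01-02 09:00"),
         (2, "login", "2020-01-01 08:00"),
         (2, "like", "2020-01-01 08:00")],
    )
    return conn


def latest_actions_cte(conn):
    """Return (name, action, time) rows matching each user's latest time."""
    return conn.execute(CTE_SQL).fetchall()
```

If exactly one row per user is required, a tie-breaker such as the highest id would have to be added to the pattern.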