How to create alias from all columns in SQL?

The query here is simplified, but it stands in for a complex one where I want to select all users fields from the subquery and also compute a SUM. So, this is an example only.
I'm using a subquery because of a problem with SUM counting duplicate rows, as recommended in this answer: https://stackoverflow.com/a/7351991/255932
The problem is that the subquery also selects the column "rating" from the table ratings, so I can't select all users fields without listing every users column in the parent select.
SELECT id, name, x, y, z ..., SUM(rating)
FROM
(SELECT users.*, ratings.rating
FROM users
INNER JOIN ratings ON
users.id = ratings.user_id
)
GROUP BY users.id
I would like to know if there is a way to replace (id, name, x, y, z, ...) with a simple (users.*).

Actually, there are two very simple ways.
If users.id is the primary key:
SELECT u.*, sum(r.rating) AS total
FROM users u
JOIN ratings r ON r.user_id = u.id
GROUP BY u.id;
You need Postgres 9.1 or later for this to work. Details in this closely related answer:
PostgreSQL - GROUP BY clause
If users.id is at least unique:
SELECT u.*, r.total
FROM users u
JOIN (
SELECT user_id, sum(rating) AS total
FROM ratings
GROUP BY 1
) r ON r.user_id = u.id;
Works with any version I know of. When retrieving the whole table or large parts of it, it's also generally faster to group first and join later.
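A minimal sketch comparing the two variants above. It runs the SQL through Python's built-in sqlite3 module as a stand-in for Postgres, and the sample rows are invented for illustration. Note that SQLite accepts the first variant only because it tolerates bare columns next to GROUP BY, so this run shows that the results match, not that the Postgres 9.1 primary-key rule applies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE ratings (user_id INTEGER REFERENCES users(id), rating INTEGER);
    -- Hypothetical sample data, not from the question.
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO ratings VALUES (1, 5), (1, 3), (2, 4);
""")

# Variant 1: join first, then group by the primary key.
join_then_group = conn.execute("""
    SELECT u.*, sum(r.rating) AS total
    FROM users u
    JOIN ratings r ON r.user_id = u.id
    GROUP BY u.id
    ORDER BY u.id
""").fetchall()

# Variant 2: aggregate ratings first, then join the per-user totals.
group_then_join = conn.execute("""
    SELECT u.*, r.total
    FROM users u
    JOIN (
        SELECT user_id, sum(rating) AS total
        FROM ratings
        GROUP BY user_id
    ) r ON r.user_id = u.id
    ORDER BY u.id
""").fetchall()

print(join_then_group)   # [(1, 'ann', 8), (2, 'bob', 4)]
print(group_then_join)   # same rows
```

Both variants return every users column plus the total without listing the columns by hand.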

Kind of, but not really. There is a workaround, but you have to approach your subquery differently.
SELECT (c.users).*, SUM(c.rating)
FROM
(SELECT users, ratings.rating
FROM users
INNER JOIN ratings ON
users.id = ratings.user_id
) c
GROUP BY c.users;

Related

PostgreSQL: Get the count of rows in a join query

I am trying to get some data joining few tables. I have an audit table where I store the audits for actions performed by users. I am trying to get the list of users in the order of the number audits they have and the number of audits. I have the following query:
SELECT s.user_created,
u.first_name,
u.last_name,
u.email,
a.message as audits
FROM cbrain_school s
inner join ugrp_user u on s.user_created = u.user_id
inner join audit a on u.user_id = a.user_id
order by u.created_time desc;
This query will give me 1 row per entry in the audit table. I just want 1 row per user and the count of entries in the audit table ordered by the number of audits.
Is there any way to do that? I was getting an error when I tried to include count() in the above query.
First of all you are joining with the table cbrain_school. Why? You are selecting no data from this table (except for s.user_created which is simply u.user_id). I suppose you want to limit the users shown to those in cbrain_school.user_created? Then use EXISTS or IN to look this up.
select u.user_id, u.first_name, u.last_name, u.email, a.message as audits
from ugrp_user u
inner join audit a on u.user_id = a.user_id
where u.user_id in (select user_created from cbrain_school)
order by u.created_time desc;
This shows much better that cbrain_school.user_created is merely a criterion. (But the query result is the same, of course.) It's a good habit to avoid joins when you are not really interested in the joined rows.
Now you don't want to show each message anymore, but merely count them per user. So rather than joining messages, you should join the message count:
select u.user_id, u.first_name, u.last_name, u.email, a.cnt
from ugrp_user u
inner join
(
select user_id, count(*) as cnt
from audit
group by user_id
) a on u.user_id = a.user_id
where u.user_id in (select user_created from cbrain_school)
order by a.cnt desc;
(You could also join all messages and only then aggregate, but I don't recommend this. It would work for this and many other queries, but is prone to errors when working with multiple tables, where you might suddenly count or add up values multifold. It's a good habit to aggregate before joining.)
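The multifold-counting risk can be seen in a small sketch. It uses Python's built-in sqlite3 module as a stand-in database; the key columns school_id and audit_id and all sample rows are assumptions made up for the illustration. User 1 created two schools and has three audits, so joining both tables before counting reports six.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- school_id and audit_id are hypothetical key columns.
    CREATE TABLE ugrp_user (user_id INTEGER PRIMARY KEY, first_name TEXT);
    CREATE TABLE cbrain_school (school_id INTEGER PRIMARY KEY, user_created INTEGER);
    CREATE TABLE audit (audit_id INTEGER PRIMARY KEY, user_id INTEGER, message TEXT);
    INSERT INTO ugrp_user VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO cbrain_school VALUES (10, 1), (11, 1), (12, 2);
    INSERT INTO audit VALUES (100, 1, 'a'), (101, 1, 'b'), (102, 1, 'c'), (103, 2, 'd');
""")

# Join everything, then aggregate: each audit row is repeated once per school.
inflated = conn.execute("""
    SELECT u.user_id, count(*) AS cnt
    FROM cbrain_school s
    JOIN ugrp_user u ON s.user_created = u.user_id
    JOIN audit a ON a.user_id = u.user_id
    GROUP BY u.user_id
    ORDER BY cnt DESC
""").fetchall()

# Aggregate audits first, then join the per-user count.
correct = conn.execute("""
    SELECT u.user_id, a.cnt
    FROM ugrp_user u
    JOIN (SELECT user_id, count(*) AS cnt FROM audit GROUP BY user_id) a
      ON a.user_id = u.user_id
    WHERE u.user_id IN (SELECT user_created FROM cbrain_school)
    ORDER BY a.cnt DESC
""").fetchall()

print(inflated)  # [(1, 6), (2, 1)]
print(correct)   # [(1, 3), (2, 1)]
```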

Return records of groupby with a count of one

I have two tables user_expenses and users.
The foreign key for user_expenses is user_expenses.user_id which corresponds to users.id.
I would like to get some information from both tables, in which the users have only ONE expense.
I gave this a shot:
SELECT
users.id, users.email, users.stripe_plan, users.previous_plan,
users.created_at, user_expenses.created_at, user_expenses.description
FROM
users
INNER JOIN
user_expenses ON user_expenses.user_id = users.id
WHERE
user_expenses.description NOT LIKE "%free%"
GROUP BY
user_expenses.user_id
HAVING
COUNT(*) = 1
But of course, this yields the following problem:
SELECT list is not in GROUP BY clause and contains nonaggregated column 'app.user_expenses.created_at' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
Adding this column into the group_by is problematic because it will actually return users who have multiple expenses with different descriptions.
Can anyone offer some advice on how to approach this problem? I only want users with a single entry in the user_expenses table, regardless of the type of description.
You could do either a subquery or you could pseudo-aggregate values that are not in the group by list:
(1):
SELECT users.id, users.email, users.stripe_plan, users.previous_plan, users.created_at, user_expenses.created_at, user_expenses.description
FROM users
INNER JOIN user_expenses
ON user_expenses.user_id = users.id
WHERE user_expenses.description NOT LIKE "%free%"
and users.id not in
(select ue2.user_id from user_expenses ue2 group by user_id having count(*) > 1)
(2)
SELECT users.id, max(users.email), max(users.stripe_plan), max(users.previous_plan), max(users.created_at), max(user_expenses.created_at), max(user_expenses.description)
FROM users
INNER JOIN user_expenses
ON user_expenses.user_id = users.id
WHERE user_expenses.description NOT LIKE "%free%"
GROUP BY user_expenses.user_id
HAVING COUNT(*) = 1
You can do a dummy aggregation. Change:
user_expenses.created_at, user_expenses.description
in the select list by:
min(user_expenses.created_at) created_at, min(user_expenses.description) description
... which will be the same as the original value, since you know you only have one per group.
It would also be more natural to group by the users.id field, which has the advantage of allowing an outer join to the user_expenses table (if you ever need that):
group by users.id
NB: in MySQL 5.7+ it is not necessary to aggregate fields that are functionally dependent on the grouped-by fields. Since all fields of the users record are determined by the users.id value they can go without aggregation.
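A minimal sketch of the dummy-aggregation approach, run through Python's built-in sqlite3 module rather than MySQL. The sample rows are invented, only a subset of the users columns is included, and the description filter is left out because the question asks for users with a single expense regardless of description.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE user_expenses (
        user_id INTEGER REFERENCES users(id),
        created_at TEXT,
        description TEXT
    );
    -- Hypothetical sample data: user 1 has one expense, user 2 has two.
    INSERT INTO users VALUES (1, 'a@example.com'), (2, 'b@example.com');
    INSERT INTO user_expenses VALUES
        (1, '2020-01-01', 'pro plan'),
        (2, '2020-02-01', 'pro plan'),
        (2, '2020-03-01', 'basic plan');
""")

# min() is a no-op here: HAVING keeps only groups with exactly one expense row,
# so the minimum of each expense column is that row's own value.
single_expense_users = conn.execute("""
    SELECT u.id, u.email,
           min(e.created_at) AS created_at,
           min(e.description) AS description
    FROM users u
    JOIN user_expenses e ON e.user_id = u.id
    GROUP BY u.id
    HAVING count(*) = 1
""").fetchall()

print(single_expense_users)  # [(1, 'a@example.com', '2020-01-01', 'pro plan')]
```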

How to get the number of users grouped by the number of comments they've made?

I'd like to get the count of users grouped by the number of comments they've made.
[User]: ID
[Comment]: ID, UserID
So if user A has made 1 comment, user B has made 1 comment and user C has made 2 comments, then the output would be:
0 comments => 0 users
1 comment => 2 users (A+B)
2 comments => 1 user (C)
How would you query this?
It will depend on your specific database structure, but let's say you have a users table and a comments table:
users table:
id: serial
name: text
comments table:
id: serial
user_id: integer (foreign key to the users table)
comment: text
You can count the number of comments each user has made with this query:
SELECT users.id, users.name, count(comments.id) as comment_cnt
FROM users LEFT JOIN
comments ON users.id = comments.user_id
GROUP BY users.id, users.name
You can then use the results of this query in a nested query to count the number of users for each number of comments:
SELECT comment_cnt, count(*) FROM
(SELECT users.id, users.name, count(comments.id) as comment_cnt
FROM users LEFT JOIN
comments ON users.id = comments.user_id
GROUP BY users.id, users.name) AS comment_cnts
GROUP BY comment_cnt;
I don't know of any elegant way to fill the gaps where there are zero users for a given number of comments, but a temporary table and another level of nesting works:
CREATE TABLE wholenumbers (num integer);
INSERT INTO wholenumbers VALUES (0), (1), (2), (3), (4), (5), (6);
SELECT num as comment_cnt, COALESCE(user_cnt,0) as user_cnt
FROM wholenumbers
LEFT JOIN (SELECT comment_cnt, count(*) AS user_cnt
FROM ( SELECT users.id, users.name, count(comments.id) AS comment_cnt
FROM users LEFT JOIN comments ON users.id = comments.user_id
GROUP BY users.id, users.name) AS comment_cnts
GROUP BY comment_cnt) AS user_cnts ON wholenumbers.num = user_cnts.comment_cnt
ORDER BY num;
Building on the table layout @ClaytonC provided:
WITH cte AS (
SELECT msg_ct, count(*) AS users
FROM (
SELECT count(*) AS msg_ct
FROM comments
GROUP BY user_id
) sub
GROUP BY 1
)
SELECT msg_ct, COALESCE(users, 0) AS users
FROM generate_series(0, (SELECT max(msg_ct) FROM cte)) msg_ct
LEFT JOIN cte USING (msg_ct)
ORDER BY 1;
Major points
First, count comments per user (msg_ct). As long as referential integrity is enforced by a foreign key, you do not need to join to the users table at all to aggregate comments per user. Just count rows in comments.
Next, count users per message count (users).
I am doing this in a CTE, because I use the derived table twice in the final query.
First for generate_series() to generate all counts from min to max dynamically, including gaps.
Then for the table to LEFT JOIN to and get the final result.
The count starts with 0 (after my update). If you want to have it start with the smallest actual msg_ct, consider the first draft of my answer in the edit history.
Closely related answer explaining the basics:
Select all integers that are not already in table in postgres
Count users without comments
As @ClaytonC commented, the above answer does not include users without comments.
To fix this (if you actually need it), either LEFT JOIN to users right at the start after all:
WITH cte AS (
SELECT msg_ct, count(*) AS users
FROM (
SELECT count(c.user_id) AS msg_ct
FROM users u
LEFT JOIN comments c ON c.user_id = u.id
GROUP BY u.id
) sub
GROUP BY 1
)
SELECT ...
Or, since the join is just for finding users without comments, we might get away cheaper: Count all users and subtract users with comments (which we processed anyway):
WITH cte AS (
SELECT msg_ct, count(*)::int AS users
FROM (
SELECT count(*)::int AS msg_ct
FROM comments
GROUP BY user_id
) sub
GROUP BY 1
)
, agg AS (
SELECT max(msg_ct) AS max_ct -- maximum for generate_series
,((SELECT count(*) FROM users) - sum(users))::int AS users
-- quiet rest with 0 comments
FROM cte
)
SELECT 0 AS msg_ct, users FROM agg -- users with 0 comments
UNION ALL
SELECT msg_ct, COALESCE(users, 0)
FROM (SELECT generate_series(1, max_ct) AS msg_ct FROM agg) g
LEFT JOIN cte USING (msg_ct)
ORDER BY 1;
The query gets a bit more complex, but it might be faster for big tables. Not sure. Test with EXPLAIN ANALYZE (I would be grateful for a comment with the results.)
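For trying the gap-filling idea outside Postgres, here is a sketch using Python's built-in sqlite3 module. generate_series() may not be available there, so a recursive CTE (named series here, an assumption of this sketch) produces the numbers from 0 to the maximum count instead. It uses the LEFT JOIN variant so users without comments are counted; the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, user_id INTEGER REFERENCES users(id));
    -- Hypothetical data: A and B have 1 comment, C has 3, D has none.
    INSERT INTO users VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
    INSERT INTO comments VALUES (1, 1), (2, 2), (3, 3), (4, 3), (5, 3);
""")

histogram = conn.execute("""
    WITH RECURSIVE per_user AS (
        -- comments per user, 0 for users without comments
        SELECT u.id, count(c.user_id) AS msg_ct
        FROM users u
        LEFT JOIN comments c ON c.user_id = u.id
        GROUP BY u.id
    ),
    cte AS (
        -- users per comment count
        SELECT msg_ct, count(*) AS user_ct
        FROM per_user
        GROUP BY msg_ct
    ),
    series(n) AS (
        -- stand-in for generate_series(0, max)
        SELECT 0
        UNION ALL
        SELECT n + 1 FROM series
        WHERE n < (SELECT max(msg_ct) FROM cte)
    )
    SELECT s.n AS msg_ct, COALESCE(cte.user_ct, 0) AS user_ct
    FROM series s
    LEFT JOIN cte ON cte.msg_ct = s.n
    ORDER BY s.n
""").fetchall()

print(histogram)  # [(0, 1), (1, 2), (2, 0), (3, 1)]
```

The row (2, 0) is the gap that the plain GROUP BY query would leave out.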

How can I get records from one table which do not exist in a related table?

I have this users table:
and this relationships table:
So each user is paired with another one in the relationships table.
Now I want to get a list of users which are not in the relationships table, in either of the two columns (user_id or pair_id).
How could I write that query?
First try:
SELECT users.id
FROM users
LEFT OUTER JOIN relationships
ON users.id = relationships.user_id
WHERE relationships.user_id IS NULL;
Output:
This should display only 2 results: 5 and 6. The result 8 is not correct, as it already exists in relationships. Of course I'm aware that the query is not correct; how can I fix it?
I'm using PostgreSQL.
You need to compare to both values in the ON clause:
SELECT u.id
FROM users u LEFT OUTER JOIN
relationships r
ON u.id = r.user_id or u.id = r.pair_id
WHERE r.user_id IS NULL;
In general, OR in an ON clause can be inefficient. I would recommend replacing this with two NOT EXISTS conditions:
SELECT u.id
FROM users u
WHERE NOT EXISTS (SELECT 1 FROM relationships r WHERE u.id = r.user_id) AND
NOT EXISTS (SELECT 1 FROM relationships r WHERE u.id = r.pair_id);
I like the set operators
select id from users
except
select user_id from relationships
except
select pair_id from relationships
or
select id from users
except
(select user_id from relationships
union
select pair_id from relationships
)
This is a special case of:
Select rows which are not present in other table
I suppose this will be simplest and fastest:
SELECT u.id
FROM users u
WHERE NOT EXISTS (
SELECT 1
FROM relationships r
WHERE u.id IN (r.user_id, r.pair_id)
);
In Postgres, u.id IN (r.user_id, r.pair_id) is just short for: (u.id = r.user_id OR u.id = r.pair_id).
The expression is transformed that way internally, which can be observed from EXPLAIN ANALYZE.
To clear up speculations in the comments: Modern versions of Postgres are going to use matching indexes on user_id, and / or pair_id with this sort of query.
Something like:
select u.id
from users u
where u.id not in (select r.user_id from relationships r)
and u.id not in (select r.pair_id from relationships r)
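The approaches above can be compared in a small sketch, run through Python's built-in sqlite3 module as a stand-in for PostgreSQL. The rows are invented for the illustration (the original table contents are not shown). The last step adds a relationship with a NULL pair_id to show a known pitfall of NOT IN: once the subquery returns a NULL, the NOT IN condition is never true, while NOT EXISTS keeps working.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE relationships (user_id INTEGER, pair_id INTEGER);
    -- Hypothetical data: users 5 and 6 are in no relationship.
    INSERT INTO users VALUES (1), (2), (3), (4), (5), (6), (7), (8);
    INSERT INTO relationships VALUES (1, 2), (3, 4), (7, 8);
""")

def unpaired_not_exists():
    """Users that appear in neither column, via NOT EXISTS."""
    rows = conn.execute("""
        SELECT u.id FROM users u
        WHERE NOT EXISTS (
            SELECT 1 FROM relationships r
            WHERE u.id IN (r.user_id, r.pair_id)
        )
    """)
    return sorted(row[0] for row in rows)

def unpaired_not_in():
    """Same question, via two NOT IN subqueries."""
    rows = conn.execute("""
        SELECT u.id FROM users u
        WHERE u.id NOT IN (SELECT r.user_id FROM relationships r)
          AND u.id NOT IN (SELECT r.pair_id FROM relationships r)
    """)
    return sorted(row[0] for row in rows)

except_ids = sorted(row[0] for row in conn.execute("""
    SELECT id FROM users
    EXCEPT
    SELECT user_id FROM relationships
    EXCEPT
    SELECT pair_id FROM relationships
"""))

before_null = (unpaired_not_exists(), unpaired_not_in(), except_ids)

# A relationship with an unknown partner introduces a NULL in pair_id.
conn.execute("INSERT INTO relationships VALUES (5, NULL)")
after_null = (unpaired_not_exists(), unpaired_not_in())

print(before_null)  # ([5, 6], [5, 6], [5, 6])
print(after_null)   # ([6], [])
```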

SQL to gather data from one table while counting records in another

I have a users table and a songs table. I want to select all the users in the users table while counting how many songs they have in the songs table. I have this SQL but it doesn't work. Can someone spot what I'm doing wrong?
SELECT jos_mfs_users.*, COUNT(jos_mfs_songs.id) as song_count
FROM jos_mfs_users
INNER JOIN jos_mfs_songs
ON jos_mfs_songs.artist=jos_mfs_users.id
Help is much appreciated. Thanks!
The inner join won't work, because it joins every matching row in the songs table with the users table.
SELECT jos_mfs_users.*,
(SELECT COUNT(jos_mfs_songs.id)
FROM jos_mfs_songs
WHERE jos_mfs_songs.artist=jos_mfs_users.id) as song_count
FROM jos_mfs_users
WHERE (SELECT COUNT(jos_mfs_songs.id)
FROM jos_mfs_songs
WHERE jos_mfs_songs.artist=jos_mfs_users.id) > 10
There's a GROUP BY clause missing, e.g.
SELECT jos_mfs_users.id, COUNT(jos_mfs_songs.id) as song_count
FROM jos_mfs_users
INNER JOIN jos_mfs_songs
ON jos_mfs_songs.artist=jos_mfs_users.id
GROUP BY jos_mfs_users.id
If you want to add more columns from jos_mfs_users in the select list you should add them in the GROUP BY clause as well.
Changes:
Don't do SELECT *...specify your fields. I included ID and NAME, you can add more as needed but put them in the GROUP BY as well
Changed to a LEFT JOIN - INNER JOIN won't list any users that have no songs
Added the GROUP BY so it gives a valid count and is valid syntax
SELECT u.id, u.name, COUNT(s.id) AS song_count
FROM jos_mfs_users AS u
LEFT JOIN jos_mfs_songs AS s
ON s.artist = u.id
GROUP BY u.id, u.name
Try
SELECT
*,
(SELECT COUNT(*) FROM jos_mfs_songs as songs WHERE songs.artist=users.id) as song_count
FROM
jos_mfs_users as users
This seems like a many-to-many relationship. By that I mean it looks like there can be several records in the users table for each user, one for each song they have.
I would have three tables.
Users, which has one record for each user
Songs, which has one record for each song
USER_SONGS, which has one record for each user/song combination
Now, you can do a count of the songs each user has by doing a query on the intermediate table. You can also find out how many users have a particular song.
This will tell you how many songs each user has
select id, count(*) from USER_SONGS
GROUP BY id;
This will tell you how many users each song has
select artist, count(*) from USER_SONGS
GROUP BY artist;
I'm sure you will need to tweak this for your needs, but it may give you the type of results you are looking for.
You can also join either of these queries to the other two tables to find the user name, and/or artist name.
HTH
Harv Sather
ps I am not sure if you are looking for song counts or artist counts.
You need a GROUP BY clause to combine aggregate functions (like COUNT(), for example) with non-aggregated columns in the select list.
So, assuming that jos_mfs_users.id is a primary key, something like this will work:
SELECT jos_mfs_users.*, COUNT( jos_mfs_users.id ) as song_count
FROM jos_mfs_users
INNER JOIN jos_mfs_songs
ON jos_mfs_songs.artist = jos_mfs_users.id
GROUP BY jos_mfs_users.id
Notice that
since you are grouping by user id, you will get one result per distinct user id in the results
the thing you need to COUNT() is the number of rows that are being grouped (in this case the number of results per user)
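A minimal sketch of the counting behaviour discussed in these answers, using Python's built-in sqlite3 module as a stand-in database. The name column and the sample rows are assumptions for the illustration. With a LEFT JOIN, COUNT(s.id) counts only matched songs, while COUNT(*) counts rows in the group and therefore reports 1 for a user without songs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The name column is hypothetical; the question does not list the columns.
    CREATE TABLE jos_mfs_users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE jos_mfs_songs (id INTEGER PRIMARY KEY, artist INTEGER);
    INSERT INTO jos_mfs_users VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO jos_mfs_songs VALUES (10, 1), (11, 1);
""")

# COUNT(s.id) ignores the NULL produced for users without songs.
song_counts = conn.execute("""
    SELECT u.id, u.name, COUNT(s.id) AS song_count
    FROM jos_mfs_users AS u
    LEFT JOIN jos_mfs_songs AS s ON s.artist = u.id
    GROUP BY u.id, u.name
    ORDER BY u.id
""").fetchall()

# COUNT(*) counts the joined row itself, so a user without songs shows 1.
row_counts = conn.execute("""
    SELECT u.id, u.name, COUNT(*) AS row_count
    FROM jos_mfs_users AS u
    LEFT JOIN jos_mfs_songs AS s ON s.artist = u.id
    GROUP BY u.id, u.name
    ORDER BY u.id
""").fetchall()

print(song_counts)  # [(1, 'ann', 2), (2, 'bob', 0)]
print(row_counts)   # [(1, 'ann', 2), (2, 'bob', 1)]
```

With an INNER JOIN instead, user 2 would not appear at all, and both counts would agree for the remaining users.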