PostgreSQL: Get the count of rows in a join query

I am trying to get some data by joining a few tables. I have an audit table where I store the audits for actions performed by users. I am trying to get the list of users together with the number of audits each one has, ordered by that number. I have the following query:
SELECT s.user_created,
u.first_name,
u.last_name,
u.email,
a.message as audits
FROM cbrain_school s
inner join ugrp_user u on s.user_created = u.user_id
inner join audit a on u.user_id = a.user_id
order by u.created_time desc;
This query gives me 1 row per entry in the audit table. I just want 1 row per user with the count of that user's entries in the audit table, ordered by the number of audits.
Is there any way to do that? I got an error when I tried to include count() in the above query.

First of all, you are joining with the table cbrain_school. Why? You are selecting no data from this table (except for s.user_created, which is simply u.user_id). I suppose you want to limit the users shown to those in cbrain_school.user_created? Then use EXISTS or IN to look this up.
select u.user_id, u.first_name, u.last_name, u.email, a.message as audits
from ugrp_user u
inner join audit a on u.user_id = a.user_id
where u.user_id in (select user_created from cbrain_school)
order by u.created_time desc;
This shows much better that cbrain_school.user_created is a mere criterion. (The query result is the same as long as each user appears only once in cbrain_school; otherwise the join would have duplicated rows.) It's a good habit to avoid joins when you are not really interested in the joined rows.
Now you don't want to show each message anymore, but merely count them per user. So rather than joining messages, you should join the message count:
select u.user_id, u.first_name, u.last_name, u.email, a.cnt
from ugrp_user u
inner join
(
select user_id, count(*) as cnt
from audit
group by user_id
) a on u.user_id = a.user_id
where u.user_id in (select user_created from cbrain_school)
order by a.cnt desc;
(You could also join all messages and only then aggregate, but I don't recommend this. It would work for this and many other queries, but it is prone to errors when working with multiple tables, where you might suddenly count or add up values multiple times. It's a good habit to aggregate before joining.)
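The multiplied counts mentioned above can be reproduced with a small sketch. It uses Python's sqlite3 module with an in-memory database rather than PostgreSQL, and the login table and the sample rows are invented purely for illustration; only ugrp_user and audit come from the question.

```python
import sqlite3

# Hypothetical miniature data set: one user with 2 audits and 3 logins.
# The "login" table is an assumption made up for this demo.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ugrp_user (user_id INTEGER PRIMARY KEY, first_name TEXT);
    CREATE TABLE audit (user_id INTEGER, message TEXT);
    CREATE TABLE login (user_id INTEGER);
    INSERT INTO ugrp_user VALUES (1, 'Ann');
    INSERT INTO audit VALUES (1, 'a'), (1, 'b');
    INSERT INTO login VALUES (1), (1), (1);
""")

# Join first, aggregate afterwards: every audit row is paired with
# every login row, so both counts are inflated to 2 * 3 = 6.
join_then_aggregate = conn.execute("""
    SELECT u.user_id, COUNT(a.user_id), COUNT(l.user_id)
    FROM ugrp_user u
    JOIN audit a ON a.user_id = u.user_id
    JOIN login l ON l.user_id = u.user_id
    GROUP BY u.user_id
""").fetchall()

# Aggregate first, join afterwards: each derived table has one row
# per user, so the join cannot multiply anything.
aggregate_then_join = conn.execute("""
    SELECT u.user_id, a.cnt, l.cnt
    FROM ugrp_user u
    JOIN (SELECT user_id, COUNT(*) AS cnt FROM audit GROUP BY user_id) a
      ON a.user_id = u.user_id
    JOIN (SELECT user_id, COUNT(*) AS cnt FROM login GROUP BY user_id) l
      ON l.user_id = u.user_id
""").fetchall()

print(join_then_aggregate)  # [(1, 6, 6)]
print(aggregate_then_join)  # [(1, 2, 3)]
```

With a single joined table both approaches agree; the difference only shows up once a second one-to-many table is joined in.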

Related

How to pull the count of occurrences from 2 SQL tables

I am using Python on a SQLite3 DB I created. I have the DB created and am currently just using the command line to try to get the SQL statement correct.
I have 2 tables.
Table 1 - users
user_id, name, message_count
Table 2 - messages
id, date, message, user_id
When I set up table two, I added this statement in the creation of my messages table, but I have no clue what, if anything, it does:
FOREIGN KEY (user_id) REFERENCES users (user_id)
What I am trying to do is return a list containing the name and message count during 2020. I have used this statement to get the TOTAL number of posts in 2020, and it works:
SELECT COUNT(*) FROM messages WHERE substr(date,1,4)='2020';
But I am struggling with figuring out if I should Join the tables, or if there is a way to pull just the info I need. The statement I want would look something like this:
SELECT name, COUNT(*) FROM users JOIN messages ON messages.user_id = users.user_id WHERE substr(date,1,4)='2020';
One option uses a correlated subquery:
select u.*,
(
select count(*)
from messages m
where m.user_id = u.user_id and m.date >= '2020-01-01' and m.date < '2021-01-01'
) as cnt_messages
from users u
This query would take advantage of an index on messages(user_id, date).
You could also join and aggregate. If you want to include users that have no messages, a left join is appropriate:
select u.name, count(m.user_id) as cnt_messages
from users u
left join messages m
on m.user_id = u.user_id and m.date >= '2020-01-01' and m.date < '2021-01-01'
group by u.user_id, u.name
Note that it is more efficient to filter the date column against literal dates than applying a function on it (which precludes the use of an index).
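As a rough check of the left join approach, here is a minimal sketch using Python's sqlite3 module with an in-memory database. The sample users and messages are invented, and it assumes the date column holds ISO-formatted text such as '2020-03-01', which is what makes the plain string comparison against the literal dates work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE messages (
        id INTEGER PRIMARY KEY,
        date TEXT,              -- assumed to be ISO 'YYYY-MM-DD' text
        message TEXT,
        user_id INTEGER,
        FOREIGN KEY (user_id) REFERENCES users (user_id)
    );
    -- Invented sample rows: alice has one 2019 and two 2020 messages,
    -- bob has none.
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO messages (date, message, user_id) VALUES
        ('2019-12-31', 'old', 1),
        ('2020-03-01', 'hi', 1),
        ('2020-12-31', 'bye', 1);
""")

# The date filter sits in the ON clause, so users without a matching
# message are kept; COUNT(m.user_id) ignores the NULLs and yields 0.
rows = conn.execute("""
    SELECT u.name, COUNT(m.user_id) AS cnt_messages
    FROM users u
    LEFT JOIN messages m
        ON m.user_id = u.user_id
       AND m.date >= '2020-01-01' AND m.date < '2021-01-01'
    GROUP BY u.user_id, u.name
    ORDER BY u.user_id
""").fetchall()

print(rows)  # [('alice', 2), ('bob', 0)]
```

Moving the date conditions into a WHERE clause instead would drop bob from the result, because the NULL dates produced by the left join fail the comparison.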
You are missing a GROUP BY clause to group by user:
SELECT u.user_id, u.name, COUNT(*) AS counter
FROM users u JOIN messages m
ON m.user_id = u.user_id
WHERE substr(m.date,1,4)='2020'
GROUP BY u.user_id, u.name

Find average SQL statement multiple tables

I am trying to get the average price each user paid for broccoli across all of their purchases, using the price at the time of each purchase, with 0 if the user has not purchased any. This is not working; it says it can't see the NAME column. What am I missing?
SELECT U.ID, U.NAME, COALESCE(AVG(P.PRICE),0) AS SELLPRICE FROM USERS AS U
LEFT JOIN PURCHASES AS P ON U.ID=P.USERID AND P.FoodId=1 GROUP BY U.ID;
EDIT
SELECT P.USERID, U.NAME, AVG(P.PRICE) AS "Sell Price"
FROM PURCHASES AS P INNER JOIN USERS AS U ON CASE WHEN P.ID NOT NULL THEN
"Sell Price" ELSE 0 WHERE P.FOODID=1
I also tried simplifying to use just the PURCHASES table and got wrong results, but maybe I can tweak this since it runs.
SELECT AVG(A.Price),ID FROM PURCHASES AS A WHERE FOODID=1 GROUP BY ID;
To be honest, this was partly an issue with the compiler I was using: it was a browser-based compiler, so even when I had the query working on my machine it gave different results on the site. I ended up using an inner join on the two tables.
Update
This ran correctly for the answer. Thank you.
SELECT U.ID, U.NAME, AVG(P.PRICE) FROM USERS AS U LEFT JOIN PURCHASES AS P ON U.ID = P.USERID AND P.FoodId=1 GROUP BY U.ID, U.NAME;
If I understand correctly, you are trying to get the average purchase price of broccoli for each user. You are closer to achieving this with your second query; you just need to group by the UserId column (to get the average per user), not the Id column of the Purchases table (which would give you the average per purchase, which is fairly meaningless). So, I think all you need is:
SELECT AVG(PRICE), USERID FROM PURCHASES WHERE FOODID=1 GROUP BY USERId
If you want to add some user information to your output, like their name, you will need to join with the Users table:
SELECT AVG(P.PRICE), P.USERID, U.NAME
FROM PURCHASES AS P INNER JOIN USERS AS U ON P.USERID = U.ID
WHERE P.FOODID=1
GROUP BY P.USERId, U.NAME
Your first query, I'm afraid, has more than one syntax issue. For example, the name aver_output doesn't correspond to a table or subquery column. If you were trying to name your average price column, you would need the AS keyword (or just lose the comma). Also, you have a subquery and next to it a table (Users) without any correlation between them. You must specify how you want to join the two, i.e. whether you want an inner, outer, left or right join. In most systems you can also use a comma to indicate a join, but you don't even have that.
In any case, even if you do fix the syntax, the subquery is unnecessary, as you can achieve the same thing without one.
Edit (after edits to the original post):
The best way to include the users who have never purchased any broccoli is to perform a left join to the PURCHASES table, as in your last attempt. However, you need to group by the name of the user, because it appears in your select list. Grouping by the name of the product is not necessary in this case. So, I suggest:
SELECT U.ID, U.NAME, AVG(P.PRICE) FROM USERS AS U LEFT JOIN PURCHASES AS P ON U.ID = P.USERID AND P.FoodId=1 GROUP BY U.ID, U.NAME;
You need a LEFT JOIN of USERS to PURCHASES and GROUP BY user:
SELECT
U.ID, U.NAME,
COALESCE(AVG(P.PRICE), 0) AS "Sell Price",
COUNT(P.USERID) AS "Number of purchases"
FROM USERS AS U LEFT JOIN PURCHASES AS P
ON P.USERID = U.ID AND P.FOODID = 1
GROUP BY U.ID, U.NAME
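The behaviour of the LEFT JOIN with COALESCE can be checked with a minimal sketch in Python's sqlite3 module. The column layout of USERS and PURCHASES is inferred from the queries in this thread, and the sample rows (with FOODID 1 standing for broccoli) are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE USERS (ID INTEGER PRIMARY KEY, NAME TEXT);
    CREATE TABLE PURCHASES (
        ID INTEGER PRIMARY KEY, USERID INTEGER, FOODID INTEGER, PRICE REAL
    );
    -- Invented data: Ann bought broccoli (FOODID 1) twice and another
    -- food once; Bob never bought broccoli.
    INSERT INTO USERS VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO PURCHASES (USERID, FOODID, PRICE) VALUES
        (1, 1, 2.0), (1, 1, 4.0), (1, 2, 10.0), (2, 2, 10.0);
""")

# FOODID = 1 is part of the join condition, so Bob survives the join
# with NULL purchase columns; AVG over no values is NULL, which
# COALESCE turns into 0.
rows = conn.execute("""
    SELECT U.ID, U.NAME,
           COALESCE(AVG(P.PRICE), 0) AS "Sell Price",
           COUNT(P.USERID) AS "Number of purchases"
    FROM USERS AS U
    LEFT JOIN PURCHASES AS P
        ON P.USERID = U.ID AND P.FOODID = 1
    GROUP BY U.ID, U.NAME
    ORDER BY U.ID
""").fetchall()

print(rows)  # [(1, 'Ann', 3.0, 2), (2, 'Bob', 0, 0)]
```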
SELECT U.Id, AVG(U.NAME)
FROM (SELECT PRICE, USERID, ID
FROM PURCHASES WHERE FOODID=1) P
Join USERS U on U.Id = P.USERId
GROUP BY U.ID
But actually, all you should need is
SELECT U.Id, AVG(U.NAME)
FROM PURCHASES P Join USERS U
on U.Id = P.USERId
WHERE P.FOODID = 1
GROUP BY U.ID
except I still don't see why you are asking for average of Name column...
What exactly are you trying to do?

Best approach for limiting rows coming back in SQL when joining for a sum

I need to get back a list of users and the total amount that they have ordered. In reality my query is more complex but I think this sums it up. My issue is, if a user made 5 orders for example, I'll get back their name and the total they've ordered 5 times due to the join (having 5 rows in the order table for that user).
What's the recommended approach for when you need to total the records in one table that has multiple rows without requiring many rows to come back? distinct could work but is this the best? (especially when my select chooses more information than what's below)
SELECT user.name, sum(order.amount) FROM USER user
INNER JOIN USER_ORDERS order
ON (user.user_id = order.user_id)
Are you just looking for GROUP BY?
SELECT u.name, SUM(uo.amount)
FROM USER u JOIN
USER_ORDERS uo
ON u.user_id = uo.user_id
GROUP BY u.name, u.user_id;
Note that this has included user_id in the GROUP BY, just in case two users have the same name.
If you want all users, even those without orders, then you want a LEFT JOIN:
SELECT u.name, SUM(uo.amount)
FROM USER u LEFT JOIN
USER_ORDERS uo
ON u.user_id = uo.user_id
GROUP BY u.name, u.user_id;
Or a correlated subquery:
SELECT u.name,
(SELECT SUM(uo.amount)
FROM USER_ORDERS uo
WHERE u.user_id = uo.user_id
)
FROM USER u;
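You could use the analytic version of SUM. Because a window function does not collapse rows, DISTINCT is needed to get one row per user:
SELECT DISTINCT u.name, SUM(uo.amount) OVER(PARTITION BY u.user_id)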
FROM USER u JOIN
USER_ORDERS uo
ON u.user_id = uo.user_id;
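To see why user_id belongs in the GROUP BY and what the LEFT JOIN returns for users without orders, here is a minimal sketch in Python's sqlite3 module. The table is named app_user here instead of USER (a hypothetical rename, since USER is a reserved word in some databases), and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_user (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user_orders (user_id INTEGER, amount INTEGER);
    -- Invented data: two different users share the name 'Sam';
    -- Kim has no orders.
    INSERT INTO app_user VALUES (1, 'Sam'), (2, 'Sam'), (3, 'Kim');
    INSERT INTO user_orders VALUES (1, 10), (1, 5), (2, 7);
""")

# Grouping by name and user_id: one row per user, NULL (None) total
# for the user without orders.
per_user = conn.execute("""
    SELECT u.user_id, u.name, SUM(uo.amount)
    FROM app_user u
    LEFT JOIN user_orders uo ON u.user_id = uo.user_id
    GROUP BY u.name, u.user_id
    ORDER BY u.user_id
""").fetchall()

# Grouping by name only: the two Sams are merged into a single total.
per_name = conn.execute("""
    SELECT u.name, SUM(uo.amount)
    FROM app_user u
    LEFT JOIN user_orders uo ON u.user_id = uo.user_id
    GROUP BY u.name
    ORDER BY u.name
""").fetchall()

print(per_user)  # [(1, 'Sam', 15), (2, 'Sam', 7), (3, 'Kim', None)]
print(per_name)  # [('Kim', None), ('Sam', 22)]
```

If a 0 is preferred over NULL for users without orders, the sum can be wrapped in COALESCE(SUM(uo.amount), 0).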

Joining 3 tables in SQL Server

I have three tables namely: user, special_order, corp_order.
I want a result where it can display all orders placed by a user in both order tables.
My SQL statement is:
SELECT
u.user_id,
c.user_id,
s.user_id
FROM
corp_user u
JOIN
special_order s ON s.user_id = u.user_id
JOIN
corp_orders c ON c.user_id = u.user_id;
which is returning unnecessary data.
What you want is all orders, regardless of whether they are special orders or corporate orders. Decide which columns you want from the corp_user table and which matching columns you want from each order table, and then write two separate queries that are combined by UNION ALL. (You want UNION ALL, not UNION, because otherwise rows that match exactly in all fields would be treated as duplicates and eliminated.)
Example for illustration:
SELECT
u.user_id,
s.order_id
FROM
corp_user u
INNER JOIN special_order s on u.user_id = s.user_id
UNION ALL
SELECT
u.user_id,
c.order_id
FROM
corp_user u
INNER JOIN corp_order c on u.user_id = c.user_id
Note: fields must match exactly - not necessarily in names, but in position. Names from first query will be used.
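The difference between UNION ALL and UNION can be illustrated with a minimal sketch in Python's sqlite3 module (SQLite rather than SQL Server, but the set semantics are the same). The column layout and the sample rows are assumptions; the two order tables deliberately contain the same (user_id, order_id) pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE corp_user (user_id INTEGER PRIMARY KEY);
    CREATE TABLE special_order (order_id INTEGER, user_id INTEGER);
    CREATE TABLE corp_order (order_id INTEGER, user_id INTEGER);
    -- Invented data: order_id 100 exists in both order tables.
    INSERT INTO corp_user VALUES (1);
    INSERT INTO special_order VALUES (100, 1);
    INSERT INTO corp_order VALUES (100, 1);
""")

query = """
    SELECT u.user_id, s.order_id
    FROM corp_user u
    INNER JOIN special_order s ON u.user_id = s.user_id
    {set_operator}
    SELECT u.user_id, c.order_id
    FROM corp_user u
    INNER JOIN corp_order c ON u.user_id = c.user_id
"""

# UNION ALL keeps both rows; UNION removes the exact duplicate.
union_all_rows = conn.execute(query.format(set_operator="UNION ALL")).fetchall()
union_rows = conn.execute(query.format(set_operator="UNION")).fetchall()

print(union_all_rows)  # [(1, 100), (1, 100)]
print(union_rows)      # [(1, 100)]
```

Adding a literal column such as 'special' or 'corp' to each SELECT is one way to keep track of which table a row came from.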

sql using count & group by without using distinct keyword?

I want to optimize this query:
SELECT * FROM Users WHERE Active = 1 AND UserId IN (SELECT UserId FROM Users_Roles WHERE RoleId IN (SELECT RoleId FROM Roles WHERE PermissionLevel >= 100)) ORDER BY LastName
Execution time went down when I replaced the above query with joins, as below:
SELECT u.* FROM Users u INNER JOIN Users_Roles ur ON (u.UserId = ur.UserId) INNER JOIN Roles r ON (r.RoleId = ur.RoleId) WHERE u.Active = 1 AND r.PermissionLevel > 100 GROUP BY u.UserId ORDER BY u.LastName
But the above query gives duplicate records, since my roles table has more than one entry for every user.
I can't use DISTINCT, since there is a function that finds the count for pagination by replacing SELECT * FROM with SELECT COUNT(*) FROM, and then executes both the count query and the result query.
As we already know, using COUNT and GROUP BY together gives the wrong output.
Now I want to optimize the query and also need to find the number of rows, i.e. the count, for the query. Please suggest a better way to get the result.
It is difficult to optimise other people's queries without fully knowing the schema, what is indexed and what isn't, how much data there is, what your DBMS is, etc. Even with this we can't see execution plans, IO statistics, etc. With this in mind, the below may not be better than what you already have, but it is how I would write the query in your situation.
SELECT u.*
FROM Users u
INNER JOIN
( SELECT ur.UserID
FROM Users_Roles ur
INNER JOIN Roles r
ON r.RoleID = ur.RoleID
WHERE r.PermissionLevel > 100
GROUP BY ur.UserID
) ur
ON u.UserId = ur.UserId
WHERE u.Active = 1
ORDER BY u.LastName
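For the pagination count, one possible approach (an assumption on my part, not something the question's function is known to support) is to wrap the whole query in a derived table and count its rows, instead of swapping the select list. A minimal sketch in Python's sqlite3 module, with an invented three-column Users table and invented sample rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (UserId INTEGER PRIMARY KEY, LastName TEXT, Active INTEGER);
    CREATE TABLE Roles (RoleId INTEGER PRIMARY KEY, PermissionLevel INTEGER);
    CREATE TABLE Users_Roles (UserId INTEGER, RoleId INTEGER);
    -- Invented data: Smith has two qualifying roles, Adams one, Jones none.
    INSERT INTO Users VALUES (1, 'Smith', 1), (2, 'Jones', 1), (3, 'Adams', 1);
    INSERT INTO Roles VALUES (1, 200), (2, 150), (3, 50);
    INSERT INTO Users_Roles VALUES (1, 1), (1, 2), (2, 3), (3, 1);
""")

# The query from the answer, without its ORDER BY.
base_query = """
    SELECT u.*
    FROM Users u
    INNER JOIN (
        SELECT ur.UserId
        FROM Users_Roles ur
        INNER JOIN Roles r ON r.RoleId = ur.RoleId
        WHERE r.PermissionLevel > 100
        GROUP BY ur.UserId
    ) ur ON u.UserId = ur.UserId
    WHERE u.Active = 1
"""

# Result query: one row per user, no duplicates.
page = conn.execute(base_query + " ORDER BY u.LastName").fetchall()

# Count query: count the rows of the result query as a derived table.
total = conn.execute(f"SELECT COUNT(*) FROM ({base_query}) AS q").fetchone()[0]

# For comparison: COUNT(*) with GROUP BY on the plain joins returns one
# count per user rather than a single total.
per_group = sorted(conn.execute("""
    SELECT COUNT(*)
    FROM Users u
    INNER JOIN Users_Roles ur ON u.UserId = ur.UserId
    INNER JOIN Roles r ON r.RoleId = ur.RoleId
    WHERE u.Active = 1 AND r.PermissionLevel > 100
    GROUP BY u.UserId
""").fetchall())

print(page)       # [(3, 'Adams', 1), (1, 'Smith', 1)]
print(total)      # 2
print(per_group)  # [(1,), (2,)]
```

Because the grouping happens inside the derived table, the outer query has one row per user, so replacing the outer select list with COUNT(*) would also give the correct total here.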