Getting Counts Per User Across Tables in PostgreSQL - sql

I'm new to PostgreSQL. I have a database with three tables in it: Users, Orders, Comments. The Orders and Comments tables look like this:
Orders         Comments
------         --------
ID             ID
UserID         UserID
Description    Details
CreatedOn      CreatedOn
I'm trying to get a list of all of my users and how many orders each user has made and how many comments each user has made. In other words, the result of the query should look like this:
UserID    Orders    Comments
------    ------    --------
1         5         7
2         2         9
3         0         0
...
Currently, I'm trying the following:
SELECT
UserID,
(SELECT COUNT(ID) FROM Orders WHERE UserID=ID) AS Orders,
(SELECT COUNT(ID) FROM Comments WHERE UserID=ID) AS Comments
FROM
Orders o,
Comments c
WHERE
o.UserID = c.UserID
Is this the right way to do this type of query? Or can someone provide a better approach from a performance standpoint?

select
id, name,
coalesce(orders, 0) as orders,
coalesce(comments, 0) as comments
from
users u
left join
(
select userid as id, count(*) as orders
from orders
group by userid
) o using (id)
left join
(
select userid as id, count(*) as comments
from comments
group by userid
) c using (id)
order by name

The usual way to do this is to outer join the two other tables and then group by the id (and name):
select u.id,
u.name,
count(distinct o.id) as num_orders,
count(distinct c.id) as num_comments
from users u
left join orders o on o.userId = u.id
left join comments c on c.userId = u.id
group by u.id, u.name
order by u.name;
That might very well be faster than your approach. But Postgres' query optimizer is quite smart and I have seen situations where both solutions are essentially equal in performance.
You will need to test that on your data and also have a look at the execution plans in order to find out which one is more efficient.
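To check that the two answers above agree, here is a minimal sketch that runs both queries against a tiny in-memory database. It uses Python's built-in sqlite3 module as a stand-in for Postgres (an assumption, made only so the example is self-contained), and the sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Postgres (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users    (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders   (id INTEGER PRIMARY KEY, userid INTEGER);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, userid INTEGER);
    -- Made-up sample data: user 3 has no orders and no comments.
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cat');
    INSERT INTO orders (userid) VALUES (1), (1), (1), (2);
    INSERT INTO comments (userid) VALUES (1), (1), (2), (2), (2);
""")

# Approach 1: aggregate each child table first, then join the results.
pre_aggregated = conn.execute("""
    SELECT u.id, COALESCE(o.orders, 0), COALESCE(c.comments, 0)
    FROM users u
    LEFT JOIN (SELECT userid, COUNT(*) AS orders FROM orders GROUP BY userid) o
           ON o.userid = u.id
    LEFT JOIN (SELECT userid, COUNT(*) AS comments FROM comments GROUP BY userid) c
           ON c.userid = u.id
    ORDER BY u.id
""").fetchall()

# Approach 2: join everything, then count distinct child ids per user.
join_then_group = conn.execute("""
    SELECT u.id, COUNT(DISTINCT o.id), COUNT(DISTINCT c.id)
    FROM users u
    LEFT JOIN orders o ON o.userid = u.id
    LEFT JOIN comments c ON c.userid = u.id
    GROUP BY u.id
    ORDER BY u.id
""").fetchall()

print(pre_aggregated)
print(join_then_group)
```

Both queries return the same rows here. On real data in Postgres, prefixing either query with EXPLAIN ANALYZE shows the plan and timing actually used, which is the comparison that matters for performance.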

Related

How to query top record group conditional on the counts and strings in a second table

I call on the SQL Gods of the internet!! I so desperately need your help with this query; my livelihood depends on it. I solved it in Alteryx in about 2 minutes, but I need to write this query in SQL, and I am relatively new to the language in terms of complex blending and syntax.
Your help would be so appreciated!! :) xoxox I can't begin to describe
Using SSMS I need to use 2 tables 'searches' and 'events' to query...
the TOP 2 [user]s with the highest count of unique search ids in Table 'searches'
Condition that the [user]s in the list have at least 1 eventid in 'events' where [event type] starts with "great"
Here is an example of what needs to happen
[Image: search, event, and end result example]
So the only pieces I have so far are below, but boy oh boy, please don't laugh :(
What I was trying to do is:
1. Select a table of unique users with the search counts from the search table
2. Inner join the selected table from 1 on userid with a table described in 3
3. Create a table of unique user ids with counts of events with [type] starting with "great"
4. Filter the inner joined table for the top 2 search counts from step 1
SELECT userid, COUNT() as searchcount
FROM searches
GROUP BY userid
INNER JOIN (SELECT userid, COUNT() as eventcount
FROM events WHERE LEFT(type, 5) = "great" AND eventcount>0 Group by userid)
ON searches.userid=events.userId
Obviously, this doesn't work at all!!! I think my structure is off and my method of filtering for "great" is wrong. Also, I don't know how to add the "top 2" clause to the search table query without affecting the inner join. This code needs to be fairly efficient, so if you have a better, more computationally efficient idea... I love you long time
SELECT top(2) userid, COUNT(*) as searchcount FROM searches
where userid in (select userid from events where left(type, 5)='great')
GROUP BY userid
order by COUNT(*) desc
Hope the above query serves your purpose.
I think you need EXISTS and the window function dense_rank, as follows. Note that the ranking must not be partitioned by userid, otherwise every user gets rank 1:
Select * from
(Select u.userid, dense_rank() over (order by count(*) desc) as rn
From users u join searches s on u.userid = s.userid
Where exists
(select 1 from events e
Where e.userid = u.userid And LEFT(e.type, 5) = 'great')
Group by u.userid ) t Where rn <= 2
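As a rough illustration of the filter-then-rank idea in the answers above, here is a small sketch using Python's built-in sqlite3 module. This is an assumption for the sake of a runnable example, not SQL Server: SQLite has no TOP or LEFT(), so the sketch uses LIMIT 2 and LIKE 'great%' instead. The column names searchid and eventid and all sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema; only userid and type appear in the question's code.
    CREATE TABLE searches (searchid INTEGER, userid TEXT);
    CREATE TABLE events   (eventid INTEGER, userid TEXT, type TEXT);
    INSERT INTO searches VALUES
        (1, 'a'), (2, 'a'), (3, 'a'), (4, 'a'),
        (5, 'b'), (6, 'b'), (7, 'b'),
        (8, 'c'), (9, 'c'),
        (10, 'd');
    -- User 'a' searches the most but has no event starting with "great".
    INSERT INTO events VALUES
        (1, 'a', 'bad click'),
        (2, 'b', 'great click'),
        (3, 'c', 'greatest hit'),
        (4, 'd', 'great view');
""")

top_users = conn.execute("""
    SELECT s.userid, COUNT(DISTINCT s.searchid) AS searchcount
    FROM searches s
    WHERE EXISTS (
        SELECT 1 FROM events e
        WHERE e.userid = s.userid AND e.type LIKE 'great%'
    )
    GROUP BY s.userid
    ORDER BY searchcount DESC
    LIMIT 2
""").fetchall()

print(top_users)
```

User 'a' is dropped by the EXISTS filter before the ranking happens, so the top two are taken only from users with a qualifying event. Unlike dense_rank, LIMIT 2 (like TOP 2) does not keep ties.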

Count by partition

I have a bunch of rows(fruits) in a table, along with the basic details like fruit name, description color.
I want to know how many people viewed that product, how many people liked it, etc...
This is what I tried but it is returning completely wrong numbers:
SELECT
vw.[Id],
vw.[Name],
vw.[Color],
-- This is returning 268 for id 1 but it supposed to be 134
(COUNT(v.PublicImageViewId) OVER (PARTITION BY v.[publicImageId] )) AS [ViewCount],
-- This is returning 268 for id 1 but it supposed to be 2, both these counts are same and wrong
(COUNT(u.PublicImageUpvoteId) OVER (PARTITION BY v.[publicImageId] )) AS [UpvoteCount]
FROM
[PublicImage] vw
LEFT JOIN
[PublicImageUpvote] u ON u.[PublicImageId] = vw.[PublicImageId]
LEFT JOIN
[PublicImageFavourite] f ON f.[PublicImageId] = vw.[PublicImageId]
LEFT JOIN
[PublicImageView] v ON v.PublicImageId = vw.[PublicImageId]
Maybe it can't be done like this, or I'm making a blunder.
All I want is, for each product, the number of views, the number of likes, the number of favorites, etc.
Tables:
Public Image Table
PublicImageId: PK
name, color
PublicImageUpvote (same for PublicImageView, PublicImageFavourite)
PublicImageUpvoteId: PK
PublicImageId: FK
CreatedBy, Created Date
You must GROUP BY vw.[Id], vw.[Name], vw.[Color] and count the distinct values:
SELECT
vw.[Id],
vw.[Name],
vw.[Color],
COUNT(DISTINCT v.PublicImageViewId) AS [ViewCount],
COUNT(DISTINCT u.PublicImageUpvoteId) AS [UpvoteCount]
FROM [PublicImage] vw
LEFT JOIN [PublicImageUpvote] u ON u.[PublicImageId] = vw.[PublicImageId]
LEFT JOIN [PublicImageFavourite] f ON f.[PublicImageId] = vw.[PublicImageId]
LEFT JOIN [PublicImageView] v ON v.PublicImageId = vw.[PublicImageId]
GROUP BY vw.[Id], vw.[Name], vw.[Color]
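The 268 in the question is most likely 134 views multiplied by 2 upvotes: joining two child tables to the same parent pairs every view row with every upvote row. A minimal sketch of that effect, using Python's built-in sqlite3 module in place of SQL Server (an assumption) and shortened, hypothetical table and column names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, shortened stand-ins for PublicImage, PublicImageView
    -- and PublicImageUpvote.
    CREATE TABLE image    (image_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE view_log (view_id INTEGER PRIMARY KEY, image_id INTEGER);
    CREATE TABLE upvote   (upvote_id INTEGER PRIMARY KEY, image_id INTEGER);
    INSERT INTO image VALUES (1, 'apple');
""")
# 134 views and 2 upvotes for image 1, the numbers given in the question.
conn.executemany("INSERT INTO view_log (image_id) VALUES (?)", [(1,)] * 134)
conn.executemany("INSERT INTO upvote (image_id) VALUES (?)", [(1,)] * 2)

row = conn.execute("""
    SELECT COUNT(v.view_id),            -- inflated by the join fan-out
           COUNT(u.upvote_id),          -- inflated by the join fan-out
           COUNT(DISTINCT v.view_id),   -- real number of views
           COUNT(DISTINCT u.upvote_id)  -- real number of upvotes
    FROM image i
    LEFT JOIN upvote u   ON u.image_id = i.image_id
    LEFT JOIN view_log v ON v.image_id = i.image_id
    GROUP BY i.image_id
""").fetchone()

print(row)
```

The plain counts both come out as 134 * 2 because the joined result has that many rows, while COUNT(DISTINCT ...) on each child key recovers the per-table numbers.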
May I suggest that you need another table for this ("how many people viewed that product, how many people liked it, etc."), and then use JOIN.

Select users' info with their total number of comments

I have two tables:
User : (id, username, created_at, updated_at)
Comment : (comment_id, user_id, username, created_at, updated_at)
Note: yes, I do understand Comment table has a duplicated field, 'username'. However, the table is already designed in that way and I have no permission to redesign the schema.
This is the output format I want to extract from the tables:
id | username | num_of_comments
These are two different SQL queries I've tried (I've simplified the code to show what I'm trying to do... minor typos may exist, but the general ideas are here).
-- Ver 1
SELECT u.id, u.username, COUNT(c.id)
FROM User u
LEFT JOIN Comment c ON u.id = c.id
GROUP BY u.id
-- Ver 2
SELECT u.id, u.username, c.cnt
FROM User u
LEFT JOIN (SELECT id, COUNT(*) AS cnt
FROM Comment
GROUP BY user_id) c
ON u.id = c.id
GROUP BY u.id
Both queries give me the same error:
"Column 'username' is invalid in the select list because it is not
contained in either an aggregate function or the GROUP BY clause"
After reading some articles regarding it, I've learned that there's a conflict between selecting 'username' and grouping rows by 'id'.
I'm still googling and reading similar cases but still getting the same issue (I'm not that good at sql stuff...)
What would be the best way to code sql query to get outputs in this format?
id | username | num_of_comments
1 | Tyler | 3
2 | Jane | 5
3 | Jack | 1
SELECT u.id, u.username, COUNT(c.comment_id) as theCount
FROM User u
JOIN Comment c ON u.id = c.user_id
GROUP BY u.id,u.username
Drew has the right answer. But I want to point out that your second query can also work. It just doesn't need a group by at the outermost level:
SELECT u.id, u.username, c.cnt
FROM User u LEFT JOIN
(SELECT user_id, COUNT(*) AS cnt
FROM Comment
GROUP BY user_id
) c
ON u.id = c.user_id;
Under some circumstances, this can even have better performance -- for instance, if username were a really, really long string.
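A small sketch of both forms, joining on the user_id column from the table definition in the question. It runs on Python's built-in sqlite3 module rather than the asker's database (an assumption), the table is renamed to user_account because User is a reserved word in some systems (a hypothetical name), and user 4 is an extra made-up row with no comments.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- user_account is a hypothetical stand-in for the User table.
    CREATE TABLE user_account (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE comment (comment_id INTEGER PRIMARY KEY, user_id INTEGER);
    INSERT INTO user_account VALUES (1, 'Tyler'), (2, 'Jane'), (3, 'Jack'), (4, 'Sam');
    INSERT INTO comment (user_id) VALUES (1), (1), (1), (2), (2), (2), (2), (2), (3);
""")

# Form 1: join, then group by every non-aggregated column in the select list.
grouped = conn.execute("""
    SELECT u.id, u.username, COUNT(c.comment_id)
    FROM user_account u
    LEFT JOIN comment c ON c.user_id = u.id
    GROUP BY u.id, u.username
    ORDER BY u.id
""").fetchall()

# Form 2: count per user_id in a derived table; no outer GROUP BY needed.
derived = conn.execute("""
    SELECT u.id, u.username, COALESCE(c.cnt, 0)
    FROM user_account u
    LEFT JOIN (SELECT user_id, COUNT(*) AS cnt FROM comment GROUP BY user_id) c
           ON c.user_id = u.id
    ORDER BY u.id
""").fetchall()

print(grouped)
print(derived)
```

With a LEFT JOIN, COUNT(c.comment_id) skips the NULL produced for a user without comments, so that user shows 0; the derived-table form needs COALESCE to turn the missing count into 0.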
Neither has solved the issue.... :'(
SELECT
u.*,
(SELECT COUNT(*) FROM Comment c WHERE c.user_id = u.id) AS Comments
FROM User u
ORDER BY u.id DESC
This workaround solved the issue... it's a simplified version of what I've actually coded, though. I still appreciate your answers.

Is this SQL query with an EXISTS the most performant way of returning my result?

I'm trying to find the following statistic: How many users have made at least one order
(yeah, sounds like homework .. but this is a simplistic example of my real query).
Here's the made up query
SELECT COUNT(UserId)
FROM USERS a
WHERE EXISTS(
SELECT OrderId
FROM ORDERS b
WHERE a.UserId = b.UserId
)
I feel like I'm getting the correct answers but I feel like this is an overkill and is inefficient.
Is there a more efficient way I can get this result?
If this was linq I feel like I want to use the Any() keyword....
It sounds like you just could use COUNT DISTINCT:
SELECT COUNT(DISTINCT UserId)
FROM ORDERS
This returns the number of distinct UserId values that appear in the ORDERS table.
In response to sgeddes's comment, to ensure that UserId also appears in Users, simply do a JOIN:
SELECT COUNT(DISTINCT b.UserId)
FROM ORDERS b
JOIN USERS a
ON a.UserId = b.UserId
Select count(distinct u.userid)
From USERS u
Inner join ORDERS o
On o.userid = u.userid
Your query should be fine, but there are a few other ways to calculate the count:
SELECT COUNT(*)
FROM USERS a
WHERE UserId IN (
SELECT UserId
FROM ORDERS b
)
or
SELECT COUNT(DISTINCT a.UserID)
FROM USERS a
INNER JOIN ORDERS b ON a.UserID = b.UserID
The only way to know which is faster is to try each method and measure the performance.
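One way to try each method and measure, sketched with Python's built-in sqlite3 module. SQLite is an assumption here; timings on another database engine will differ, so the numbers only show the approach. The data and the index name are made up.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (UserId INTEGER PRIMARY KEY);
    CREATE TABLE orders (OrderId INTEGER PRIMARY KEY, UserId INTEGER);
    CREATE INDEX orders_userid ON orders (UserId);
""")
conn.executemany("INSERT INTO users VALUES (?)", [(i,) for i in range(1, 1001)])
# Made-up data: only even-numbered users place orders, three each.
conn.executemany(
    "INSERT INTO orders (UserId) VALUES (?)",
    [(u,) for u in range(2, 1001, 2) for _ in range(3)],
)

queries = {
    "exists": "SELECT COUNT(*) FROM users a WHERE EXISTS "
              "(SELECT 1 FROM orders b WHERE b.UserId = a.UserId)",
    "in": "SELECT COUNT(*) FROM users a "
          "WHERE a.UserId IN (SELECT UserId FROM orders)",
    "join": "SELECT COUNT(DISTINCT a.UserId) FROM users a "
            "JOIN orders b ON b.UserId = a.UserId",
}

results = {}
for name, sql in queries.items():
    start = time.perf_counter()
    results[name] = conn.execute(sql).fetchone()[0]
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"{name}: {results[name]} users, {elapsed_ms:.2f} ms")
```

All three variants return the same count, so the choice comes down to the measured time and the execution plan on the real data.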

Mixing INs and Cross Table queries in Postgres

I'm new to Postgres. I have a query that involves 4 tables. My tables look like the following:
User Account Transaction Action
---- ------- ----------- ------
ID ID ID ID
Name UserID AccountID AccountID
Description Description
For each user, I'm trying to figure out: How many accounts they have, and how many total transactions and actions have been taken across all accounts. In other words, I'm trying to generate a query whose results will look like the following:
User    Accounts    Transactions    Actions
----    --------    ------------    -------
Bill    2           27              7
Jack    1           7               0
Joe     0           0               0
How do I write a query like this? Currently, I'm trying the following:
SELECT
u.Name,
(SELECT COUNT(ID) FROM Account a WHERE a.UserID = u.ID) as Accounts
FROM
User u
Now, I'm stuck though.
Untested, but I would go for something like this:
select
u.Name,
count(distinct a.ID) as Accounts,
count(distinct t.ID) as Transactions,
count(distinct ac.ID) as Actions
from User u
left join Account a on u.ID = a.UserID
left join Transaction t on t.AccountID = a.ID
left join Action ac on ac.AccountId = a.Id
group by u.Name
As probably intended:
SELECT u.name, u.id
,COALESCE(x.accounts, 0) AS accounts
,COALESCE(x.transactions, 0) AS transactions
,COALESCE(x.actions, 0) AS actions
FROM users u
LEFT JOIN (
SELECT a.userid AS id
,count(*) AS accounts
,sum(t.ct) AS transactions
,sum(c.ct) AS actions
FROM account a
LEFT JOIN (SELECT accountid AS id, count(*) AS ct FROM transaction GROUP BY 1) t USING (id)
LEFT JOIN (SELECT accountid AS id, count(*) AS ct FROM action GROUP BY 1) c USING (id)
GROUP BY 1
) x USING (id);
Group first, join later. That's fastest and cleanest by far if you want the whole table.
SQL Fiddle (building on the one provided by @Raphaël, prettified).
Aside: I tripped over your naming convention in my first version.
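A runnable sketch of "group first, join later" that reproduces the result table from the question. It uses Python's built-in sqlite3 module instead of Postgres (an assumption), and the table names app_user, txn and act are hypothetical stand-ins chosen to avoid reserved words. The split of Bill's transactions and actions across his two accounts is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical names: app_user = User, txn = Transaction, act = Action.
    CREATE TABLE app_user (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE account  (id INTEGER PRIMARY KEY, userid INTEGER);
    CREATE TABLE txn      (id INTEGER PRIMARY KEY, accountid INTEGER);
    CREATE TABLE act      (id INTEGER PRIMARY KEY, accountid INTEGER);
    INSERT INTO app_user VALUES (1, 'Bill'), (2, 'Jack'), (3, 'Joe');
    INSERT INTO account VALUES (10, 1), (11, 1), (20, 2);
""")
# Bill: 20 + 7 transactions and 3 + 4 actions; Jack: 7 transactions, no actions.
conn.executemany("INSERT INTO txn (accountid) VALUES (?)",
                 [(10,)] * 20 + [(11,)] * 7 + [(20,)] * 7)
conn.executemany("INSERT INTO act (accountid) VALUES (?)",
                 [(10,)] * 3 + [(11,)] * 4)

rows = conn.execute("""
    SELECT u.name,
           COALESCE(x.accounts, 0),
           COALESCE(x.transactions, 0),
           COALESCE(x.actions, 0)
    FROM app_user u
    LEFT JOIN (
        -- Per-account counts are computed first, then summed per user.
        SELECT a.userid,
               COUNT(*)  AS accounts,
               SUM(t.ct) AS transactions,
               SUM(c.ct) AS actions
        FROM account a
        LEFT JOIN (SELECT accountid, COUNT(*) AS ct FROM txn GROUP BY accountid) t
               ON t.accountid = a.id
        LEFT JOIN (SELECT accountid, COUNT(*) AS ct FROM act GROUP BY accountid) c
               ON c.accountid = a.id
        GROUP BY a.userid
    ) x ON x.userid = u.id
    ORDER BY u.name
""").fetchall()

print(rows)
```

Because each child table is reduced to one row per account before any join, the joins never multiply rows, so plain COUNT and SUM are enough and no COUNT(DISTINCT ...) is needed.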