Group by and aggregate by multiple columns - sql

Example tables
taccount
tuser
tproject
What I want to achieve:
accountName   count(u.id)   count(p.id)
---------------------------------------
Account A     1             1
Account B     1             1
Account C     2             3
In other words, I want a single query that joins these tables together and counts users and projects per account.
I tried:
SELECT
a.name as "accountName",
count(u.name),
count(p.id)
FROM "taccount" a
INNER JOIN "tuser" u ON u.account_id = a.id
INNER JOIN "tproject" p ON p.admin_id = u.id
GROUP BY u.name, a.name, p.id
But it's not grouping by account. It's giving me the following result
Any advice?

You can try below
SELECT
a.name as "accountName",
count(distinct u.name),
count(p.id)
FROM "taccount" a
INNER JOIN "tuser" u ON u.account_id = a.id
INNER JOIN "tproject" p ON p.admin_id = u.id
GROUP BY a.name

When you use an aggregate function, every selected column that is not aggregated must appear in the GROUP BY clause, because aggregate functions perform a calculation over a set of rows and return a single row per group.
SELECT
a.name as "accountName",
count(distinct u.name),
count(p.id)
FROM
"taccount" a
INNER JOIN "tuser" u ON u.account_id = a.id
INNER JOIN "tproject" p ON p.admin_id = u.id
GROUP BY
a.name
So you need just Group By your column "accountName"
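To see why grouping by a.name alone and counting distinct users gives the expected table, here is a minimal sketch that runs the query against an in-memory SQLite database. The column layout and the sample rows (user names, project names) are assumptions made up for illustration; only the table names and join columns come from the question.

```python
import sqlite3

# Hypothetical sample data: Account C has two users and three projects.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE taccount (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE tuser    (id INTEGER PRIMARY KEY, name TEXT, account_id INTEGER);
CREATE TABLE tproject (id INTEGER PRIMARY KEY, name TEXT, admin_id INTEGER);
INSERT INTO taccount VALUES (1, 'Account A'), (2, 'Account B'), (3, 'Account C');
INSERT INTO tuser VALUES (1, 'ann', 1), (2, 'bob', 2), (3, 'cid', 3), (4, 'dee', 3);
INSERT INTO tproject VALUES (1, 'p1', 1), (2, 'p2', 2), (3, 'p3', 3), (4, 'p4', 3), (5, 'p5', 4);
""")

QUERY = """
SELECT a.name, COUNT({distinct} u.id), COUNT(p.id)
FROM taccount a
INNER JOIN tuser u ON u.account_id = a.id
INNER JOIN tproject p ON p.admin_id = u.id
GROUP BY a.name
ORDER BY a.name
"""

# Without DISTINCT, a user is counted once per project they administer.
plain_counts = conn.execute(QUERY.format(distinct="")).fetchall()
# With DISTINCT, each user is counted once per account.
distinct_counts = conn.execute(QUERY.format(distinct="DISTINCT")).fetchall()

print(plain_counts)
print(distinct_counts)
```

Note that with INNER JOINs a user who administers no project disappears from both counts; whether that matters depends on the data.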

Change your GROUP BY column and count distinct users:
SELECT
a.name as "accountName",
count(distinct u.id),
count(p.id)
FROM "taccount" a
INNER JOIN "tuser" u ON u.account_id = a.id
INNER JOIN "tproject" p ON p.admin_id = u.id
GROUP BY a.name

This will work:
select a.name, count(distinct b.id), count(c.id)
from taccount a, tuser b, tproject c
where a.id = b.account_id
and b.id = c.admin_id
group by a.name;

Related

Get null records in SQL

I have the next query:
SELECT c.name as clientName, p.id as projectId, p.name as projectName, p.rate, u.name as userName, sum(w.duration) as workedHours
FROM Project p, User u, Worklog w, Client c
WHERE w.user_id = u.id AND w.project_id = p.id AND p.client_id = c.id
GROUP BY p.id, u.id
that returns the projects, clients, hourly rate and worked hours.
How should it be changed to also return the projects where workedHours is equal to 0?
Because this query returns just the records where workedHours is not 0.
Thank you for your time.
The problem is that no row in worklog can be joined, and that your condition in the WHERE clause removes any row without worklog associated.
Solution 1 : Using a LEFT JOIN
Using a left join instead would solve your problem.
SELECT c.name as clientName, p.id as projectId, p.name as projectName, p.rate, u.name as userName, coalesce(sum(w.duration), 0) as workedHours
FROM (Project p, User u, Client c)
LEFT JOIN Worklog w ON w.project_id = p.id AND w.user_id = u.id
WHERE p.client_id = c.id
GROUP BY p.id, u.id
By the way, your query is suspicious in other respects. For example, c.name is in the SELECT clause but not in the GROUP BY clause. I take it that you use MySQL, which is the only RDBMS I'm aware of that allows such queries. You should probably consider adding the retrieved columns to the GROUP BY clause.
Solution 2 : Using only ANSI JOINs
As underscore_d points out, you may want to avoid old-style joins completely and preferably use the following query:
SELECT
c.name as clientName,
p.id as projectId,
p.name as projectName,
p.rate,
u.name as userName,
coalesce(sum(w.duration), 0) as workedHours
FROM Project p
CROSS JOIN User u
INNER JOIN Client c ON p.client_id = c.id
LEFT JOIN Worklog w ON w.project_id = p.id AND w.user_id = u.id
GROUP BY c.name, p.id, p.name, p.rate, u.id, u.name
Solution 3 - Using a subquery
Another solution is to use a subquery, which would allow you to remove the GROUP BY clause completely and get a more manageable query if you ever need to retrieve more information. I personally don't like long lists of columns in a GROUP BY clause.
SELECT
c.name as clientName,
p.id as projectId,
p.name as projectName,
p.rate,
u.name as userName,
(SELECT COALESCE(SUM(duration), 0) FROM Worklog WHERE project_id = p.id AND user_id = u.id) as workedHours
FROM Project p
CROSS JOIN User u
INNER JOIN Client c ON p.client_id = c.id
You should use standard ANSI joins with a LEFT JOIN on the worklog table; as a consequence, you also have to use a LEFT JOIN on the user table, as follows:
SELECT C.NAME AS CLIENTNAME,
P.ID AS PROJECTID,
P.NAME AS PROJECTNAME,
P.RATE,
U.NAME AS USERNAME,
SUM(W.DURATION) AS WORKEDHOURS
FROM PROJECT P
JOIN CLIENT C
ON P.CLIENT_ID = C.ID
LEFT JOIN WORKLOG W
ON W.PROJECT_ID = P.ID
LEFT JOIN USER U
ON W.USER_ID = U.ID
GROUP BY P.ID,
U.ID;
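To check the difference between the original comma join and the LEFT JOIN variants discussed above, here is a small sketch using an in-memory SQLite database. The table columns and all sample rows are assumptions for illustration (one client, two projects, one user, and worklog entries for only one project).

```python
import sqlite3

# Hypothetical data: project 'Audit' has no worklog rows at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE client  (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE project (id INTEGER PRIMARY KEY, name TEXT, rate REAL, client_id INTEGER);
CREATE TABLE user    (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE worklog (id INTEGER PRIMARY KEY, user_id INTEGER, project_id INTEGER, duration INTEGER);
INSERT INTO client  VALUES (1, 'Acme');
INSERT INTO project VALUES (1, 'Website', 50.0, 1), (2, 'Audit', 80.0, 1);
INSERT INTO user    VALUES (1, 'ann');
INSERT INTO worklog VALUES (1, 1, 1, 3), (2, 1, 1, 4);
""")

# Original style: the WHERE conditions drop every project without worklog rows.
inner_rows = conn.execute("""
SELECT p.name, u.name, SUM(w.duration)
FROM project p, user u, worklog w
WHERE w.user_id = u.id AND w.project_id = p.id
GROUP BY p.id, p.name, u.id, u.name
ORDER BY p.id, u.id
""").fetchall()

# LEFT JOIN keeps every (project, user) pair; COALESCE turns the NULL sum into 0.
left_rows = conn.execute("""
SELECT p.name, u.name, COALESCE(SUM(w.duration), 0)
FROM project p
CROSS JOIN user u
INNER JOIN client c ON p.client_id = c.id
LEFT JOIN worklog w ON w.project_id = p.id AND w.user_id = u.id
GROUP BY p.id, p.name, u.id, u.name
ORDER BY p.id, u.id
""").fetchall()

print(inner_rows)
print(left_rows)
```

The CROSS JOIN produces one row per project and user, so on real data you may want to restrict it to users actually assigned to the project, if such a relation exists.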

SQL union two complex query

There are 4 tables,
posts(id
post_share(shared_by, post_id
friendship(friend_one, friend_two, status)
add_viewer(id
Using the above 4 tables, I need to get the posts to render in the user's news feed. Currently I only use the first query, which means posts shared by the user's friends do not appear in his/her news feed. Now I need to show the shared posts too, just like on Facebook.
For that I created a new query (the 2nd one). Now I want to combine both of them into a single result set, or I would like to know if there's a better way to handle this.
Assuming the currently logged-in user_id is 22:
SELECT concat(a.fname, ' ', a.lname) as name, a.id as user_id , p.id as post_id, p.content, p.media FROM posts p
INNER JOIN add_viewers a
ON p.user_id = a.id
WHERE p.user_id in (
SELECT a.id FROM friendships f
INNER JOIN add_viewers a
ON f.friend_one = a.id OR f.friend_two = a.id
WHERE friend_one=22 OR friend_two=22 AND f.status='confirmed'
GROUP BY a.id
)
ORDER BY created_at DESC
LIMIT 10
OFFSET 0
In the 2nd query, s.shared_by is the person who shared the post and p.user_id is the person who created the post.
SELECT concat(a.fname, ' ', a.lname) as name, a.id as user_id , p.id as post_id, p.content, p.media FROM post_shares s
INNER JOIN posts p
ON p.id = s.post_id
INNER JOIN add_viewers a
ON p.user_id = a.id
WHERE s.shared_by in (
SELECT a.id FROM friendships f
INNER JOIN add_viewers a
ON f.friend_one = a.id OR f.friend_two = a.id
WHERE friend_one=22 OR friend_two=22 AND f.status='confirmed'
GROUP BY a.id
)
ORDER BY created_at DESC
LIMIT 10
OFFSET 0
Try this
WITH t AS
(SELECT a.id AS aid FROM friendships f
INNER JOIN add_viewers a
ON f.friend_one = a.id OR
f.friend_two = a.id
WHERE (f.friend_one=22 OR
f.friend_two=22) AND
f.status='confirmed'
GROUP BY a.id)
SELECT concat(a.fname, ' ', a.lname) as name, a.id as user_id , p.id as post_id, p.content, p.media FROM
post_shares s
INNER JOIN posts p
ON p.id = s.post_id
INNER JOIN add_viewers a
ON p.user_id = a.id
INNER JOIN t
ON p.user_id = t.aid
OR s.shared_by = t.aid
ORDER BY p.created_at DESC
LIMIT 10
OFFSET 0
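Since the question title asks about a union, here is a hedged sketch of that alternative: one branch for posts written by friends, one for posts shared by friends, combined with UNION so a post that qualifies both ways appears once. It runs on an in-memory SQLite database; the column sets, the sample rows, and the CASE expression that picks "the other side" of a friendship are assumptions, not the asker's schema. It also assumes the 'confirmed' check is meant to apply to both sides of the OR, hence the parentheses.

```python
import sqlite3

# Hypothetical data: user 22 has one confirmed friend (30) and one pending contact (40).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, content TEXT, created_at TEXT);
CREATE TABLE post_shares (shared_by INTEGER, post_id INTEGER);
CREATE TABLE friendships (friend_one INTEGER, friend_two INTEGER, status TEXT);
INSERT INTO friendships VALUES (22, 30, 'confirmed'), (40, 22, 'pending');
INSERT INTO posts VALUES
  (1, 30, 'by a friend',          '2024-01-01'),
  (2, 50, 'by a stranger',        '2024-01-02'),
  (3, 40, 'by a pending contact', '2024-01-03');
INSERT INTO post_shares VALUES (30, 2);
""")

FEED = """
WITH friends AS (
    -- the other participant of every confirmed friendship involving :me
    SELECT CASE WHEN friend_one = :me THEN friend_two ELSE friend_one END AS id
    FROM friendships
    WHERE (friend_one = :me OR friend_two = :me) AND status = 'confirmed'
)
SELECT p.id AS post_id, p.content AS content, p.created_at AS created_at
FROM posts p
WHERE p.user_id IN (SELECT id FROM friends)
UNION
SELECT p.id AS post_id, p.content AS content, p.created_at AS created_at
FROM post_shares s
INNER JOIN posts p ON p.id = s.post_id
WHERE s.shared_by IN (SELECT id FROM friends)
ORDER BY created_at DESC
LIMIT 10 OFFSET 0
"""

feed = conn.execute(FEED, {"me": 22}).fetchall()
print(feed)
```

The ORDER BY and LIMIT apply to the combined result, which is what a paginated feed needs; applying them inside each branch would paginate the two lists separately.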

SELECT * and SELECT COUNT(*) in one query

My SQL query looks like this
SELECT *
FROM categories AS c
LEFT JOIN LATERAL (SELECT i.*
FROM influencer_profiles AS i
WHERE c.id = i.category_id
ORDER BY i.updated_at
LIMIT 2) AS i ON 1 = 1
INNER JOIN users AS u ON i.user_id = u.id
But I also want to count the influencer_profiles in each category, to display how many there are per category. How can I use COUNT(*) while still selecting all columns?
SELECT *
FROM categories AS c
LEFT JOIN LATERAL (SELECT COUNT(*)
FROM influencer_profiles AS i
WHERE c.id = i.category_id
ORDER BY i.updated_at
LIMIT 2) AS i ON 1 = 1
INNER JOIN users AS u ON i.user_id = u.id
This code doesn't work.
Perhaps you just want a window function. I note that you are using left join in one place and the inner join is undoing it.
So, I am thinking:
SELECT c.*, i.*, u.*,
COUNT(*) OVER (PARTITION BY c.id) as category_cnt
FROM categories c LEFT JOIN LATERAL
(SELECT i.*
FROM influencer_profiles AS i
WHERE c.id = i.category_id
ORDER BY i.updated_at
LIMIT 2
) i
ON 1=1 LEFT JOIN
users u
ON i.user_id = u.id;
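One thing worth checking on real data: a window count computed over the joined rows can only see the rows that survived LIMIT 2. If the total per category is what you want, a correlated scalar subquery is one way to get it. The sketch below uses an in-memory SQLite database, which has no LATERAL, so the "first two profiles per category" part is swapped for a correlated IN subquery; the columns and sample rows are assumptions for illustration.

```python
import sqlite3

# Hypothetical data: 'food' has three profiles, 'travel' one, 'empty' none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE influencer_profiles (id INTEGER PRIMARY KEY, category_id INTEGER, updated_at TEXT);
INSERT INTO categories VALUES (1, 'food'), (2, 'travel'), (3, 'empty');
INSERT INTO influencer_profiles VALUES
  (1, 1, '2024-01-01'), (2, 1, '2024-01-02'), (3, 1, '2024-01-03'),
  (4, 2, '2024-01-01');
""")

rows = conn.execute("""
SELECT c.name,
       i.id AS profile_id,
       -- total number of profiles in the category, independent of the LIMIT
       (SELECT COUNT(*) FROM influencer_profiles t
         WHERE t.category_id = c.id) AS category_cnt
FROM categories c
LEFT JOIN influencer_profiles i
       ON i.category_id = c.id
      -- stand-in for LATERAL ... LIMIT 2: keep only the first two per category
      AND i.id IN (SELECT t.id FROM influencer_profiles t
                    WHERE t.category_id = c.id
                    ORDER BY t.updated_at
                    LIMIT 2)
ORDER BY c.id, i.updated_at
""").fetchall()

print(rows)
```

In PostgreSQL the same scalar subquery can sit next to the LATERAL join from the question; only the top-two part differs here.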

Query returning too many results

SQL query that returns expected 29 results for a.id = 366
select a.name, c.name, MAX(B.date), MAX(b.renew_date) as MAXDATE
from boson_course c
inner join boson_coursedetail b on (c.id = b.course_id)
inner join boson_coursedetail_attendance d on (d.coursedetail_id = b.id)
inner join boson_employee a on (a.id = d.employee_id)
where a.id = 366
GROUP BY a.name, c.name
order by MAX(b.renew_date), MAX(b.date) desc;
The SQL below returns 34 results: there are multiple rows wherever two different Providers supplied the same course. I know the extra rows appear because I added e.name to the select list. But all I need is the 29 entries with the latest date and the Provider's name.
select a.name, c.name, e.name, MAX(B.date), MAX(b.renew_date) as MAXDATE
from boson_course c
inner join boson_coursedetail b on (c.id = b.course_id)
inner join boson_coursedetail_attendance d on (d.coursedetail_id = b.id)
inner join boson_employee a on (a.id = d.employee_id)
inner join boson_provider e on b.provider_id = e.id
where a.id = 366
GROUP BY a.name, c.name, e.name
order by MAX(b.renew_date), MAX(b.date) desc;
Can anyone rework this code to return a single DISTINCT Provider name with the MAX(renew_date) for each course?
This returns exactly one row per distinct combination of (a.name, c.name):
The one with the latest renew_date.
Among these, the one with the latest date (may differ from global max(date)!).
Among these, the one with the alphabetically first e.name:
SELECT DISTINCT ON (a.name, c.name)
a.name AS a_name, c.name AS c_name, e.name AS e_name
, b.renew_date, b.date
FROM boson_course c
JOIN boson_coursedetail b on c.id = b.course_id
JOIN boson_coursedetail_attendance d on d.coursedetail_id = b.id
JOIN boson_employee a on a.id = d.employee_id
JOIN boson_provider e on b.provider_id = e.id
WHERE a.id = 366
ORDER BY a.name, c.name
, b.renew_date DESC NULLS LAST
, b.date DESC NULLS LAST
, e.name;
The result is sorted by a_name, c_name first. If you need your original sort order, wrap this in a subquery:
SELECT *
FROM (<query from above>) sub
ORDER BY renew_date DESC NULLS LAST
, date DESC NULLS LAST
, a_name, c_name, e_name;
Explanation for DISTINCT ON:
Select first row in each GROUP BY group?
Why DESC NULLS LAST?
PostgreSQL sort by datetime asc, null first?
Aside: Don't use basic type names like date as column names. Also, name is hardly ever a good name. As you can see, we have to use aliases to make this query useful. Some general advice on naming conventions:
How to implement a many-to-many relationship in PostgreSQL?
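To make the DISTINCT ON behaviour described above concrete, here is a plain-Python sketch of the same idea: sort by the ORDER BY keys, then keep the first row of each (a_name, c_name) group. The rows are made-up sample data, dates are ISO strings so they sort correctly as text, and NULL handling is left out, so this only illustrates the selection rule rather than reproducing PostgreSQL exactly.

```python
# Hypothetical rows: (a_name, c_name, e_name, renew_date, date)
rows = [
    ("Ann", "First Aid", "Zeta Training", "2025-06-01", "2024-06-01"),
    ("Ann", "First Aid", "Alpha Safety",  "2025-06-01", "2024-06-01"),
    ("Ann", "First Aid", "Alpha Safety",  "2023-06-01", "2022-06-01"),
    ("Ann", "Forklift",  "Beta Lifts",    "2024-03-01", "2023-03-01"),
]


def distinct_on_employee_course(rows):
    """Keep one row per (a_name, c_name), chosen like the ORDER BY above."""
    # Python's sort is stable, so sort by the least significant key first.
    ordered = sorted(rows, key=lambda r: r[2])        # e_name ASC
    ordered.sort(key=lambda r: r[4], reverse=True)    # date DESC
    ordered.sort(key=lambda r: r[3], reverse=True)    # renew_date DESC
    ordered.sort(key=lambda r: (r[0], r[1]))          # a_name, c_name ASC
    first_per_group = {}
    for r in ordered:
        # setdefault keeps only the first row seen for each group
        first_per_group.setdefault((r[0], r[1]), r)
    return list(first_per_group.values())


picked = distinct_on_employee_course(rows)
print(picked)
```

Two providers tie on both dates for "First Aid", so the alphabetically first provider name breaks the tie, which mirrors the last ORDER BY item in the query.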
Try using distinct on:
select distinct on (a.name, c.name, e.name) a.name, c.name, e.name,
B.date, b.renew_date as MAXDATE
from boson_course c
inner join boson_coursedetail b on (c.id = b.course_id)
inner join boson_coursedetail_attendance d on (d.coursedetail_id = b.id)
inner join boson_employee a on (a.id = d.employee_id)
inner join boson_provider e on b.provider_id = e.id
where a.id = 366
ORDER BY a.name, c.name, e.name, B.date desc;

How do I match against multiple conditions on a table join?

I have two tables:
users        attributes
id|name      id|name|user_id
-------      ---------------
1 |foo       1 |bla | 1
2 |bar       1 |blub| 1
             1 |bla | 2
How do I create a query that returns the users with both the "bla" AND "blub" attributes?
In this case it should only return the user "foo".
I know that the data is not normalized.
SELECT u.*, a.id, b.id, a.name, b.name FROM users u
JOIN attributes a
ON a.user_id = u.id AND a.name = 'bla'
JOIN attributes b
ON b.user_id = u.id AND b.name = 'blub'
Assuming each attribute is associated with a user at most once, you can count the matches.
If you need 3 conditions to be true, add the third value to the IN list and raise the count to 3.
SELECT u.name
FROM users u
INNER JOIN attributes a on A.user_Id = u.id
WHERE a.name in ('bla','blub')
GROUP by u.name
HAVING count(*)=2
And if the association is not unique, or you need to join to another table, you could always do:
SELECT u.name
FROM users u
INNER JOIN attributes a on A.user_Id = u.id
WHERE a.name in ('bla','blub')
GROUP by u.name
HAVING count(distinct A.name)=2
This comes at the cost of a slight performance hit, but it lets you join other tables and return additional fields, the lack of which others have cited as a drawback of this method.
This allows for scaling of the solution instead of incurring the cost of joining each time to different tables. In addition, if you needed thirty-something values to associate, you may run into restrictions on the number of allowed joins.
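As a quick check of the two HAVING variants above, here is a sketch on an in-memory SQLite database. The first three attribute rows come from the question; the fourth row, a duplicate 'bla' for user 2, is a hypothetical addition to show what happens when the association is not unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE attributes (id INTEGER, name TEXT, user_id INTEGER);
INSERT INTO users VALUES (1, 'foo'), (2, 'bar');
-- the last row is a made-up duplicate, not part of the question's data
INSERT INTO attributes VALUES (1, 'bla', 1), (1, 'blub', 1), (1, 'bla', 2), (1, 'bla', 2);
""")

QUERY = """
SELECT u.name
FROM users u
INNER JOIN attributes a ON a.user_id = u.id
WHERE a.name IN ('bla', 'blub')
GROUP BY u.name
HAVING {condition}
ORDER BY u.name
"""

# COUNT(*) is fooled by the duplicate: 'bar' has two matching rows, both 'bla'.
with_star = [r[0] for r in conn.execute(QUERY.format(condition="COUNT(*) = 2"))]
# COUNT(DISTINCT a.name) requires two different attribute names.
with_distinct = [r[0] for r in conn.execute(QUERY.format(condition="COUNT(DISTINCT a.name) = 2"))]

print(with_star)
print(with_distinct)
```

If names are not unique in users, grouping by u.id (and selecting the name alongside it) would be the safer key.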
SELECT U.NAME
FROM USERS U
INNER JOIN
ATTRIBUTES A1
ON U.ID = A1.USER_ID
INNER JOIN
ATTRIBUTES A2
ON U.ID = A2.USER_ID
WHERE A1.NAME = 'bla'
AND A2.NAME = 'blub'
You can use the INTERSECT operator
SELECT
u.id
,u.name
FROM users AS u
INNER JOIN attributes AS a
ON u.id = a.user_id
WHERE a.name = 'bla'
INTERSECT
SELECT
u.id
,u.name
FROM users AS u
INNER JOIN attributes AS a
ON u.id = a.user_id
WHERE a.name = 'blub'
;
Here is a demo on SQL Fiddle: http://sqlfiddle.com/#!6/68986/5
More info on SET operations in SQL: http://en.wikipedia.org/wiki/Set_operations_(SQL)
SELECT u.name
FROM attributes a
JOIN users u
ON u.id = a.user_id
WHERE a.name IN ('bla','blub')