Include null/0 rows in a group by - sql

I want to query a database for users and the amount of time they spent on each Activity Category. These categories are stored in the table ActivityCategory (int Id, varchar Name). There are only 8 of them and I want to see all of them, even when nobody spent time on a specific category.
I have the following query:
select u.NoEmploye, u.FirstName, u.LastName, Total=sum(h.NbHeures), ac.Name
from Users u
join Semaines s on u.EntityGuid=s.UserGuid
join HeuresProjets hp on s.Id=hp.WeekId
join Heures h on hp.HPId=h.HpGuid
join ActivityCodes code on h.Code=code.ActivityId
join ActivityCategory ac on code.Categorie=ac.Id
group by u.NoEmploye, u.FirstName, u.LastName, ac.Name
order by u.NoEmploye
It works fine but it doesn't return unused ActivityCategories. I tried every combination of full/left/right/inner/outer/you-name-it joins. The best I could get is completely null rows when a category is used by nobody and a null ac.Name for categories a specific user doesn't use but others do. I suspect the group by [...]ac.Name part is what's "eating" the unused categories. What am I doing wrong? Should I write a select query, then a second one to group the results? I'll have a dozen more similar queries to write so I'd like to understand instead of just having a fixed query with no explanation.
EDIT
Lamak's second query works so far but it has the same problem when I add a where clause.
EDIT2: ypercube's edit works perfectly.
Now let's see if I understand the query correctly.
I coalesce the Sum with 0 to have a proper value when the result would be null. I start the selection from ActivityCategory to make sure I have all of them, even the unused ones, which I cross join with Users to have every combination of ActivityCategory and Users. I left join ActivityCodes to only get relevant rows and then I inner join my other tables to get to Semaines. My conditions are applied in the Semaines table's join because I want to filter my data before it is matched against the cross join. Finally, I group my data by Users and ActivityCategory so that the sum works.

This should do:
SELECT u.NoEmploye, u.FirstName, u.LastName, Total=sum(h.NbHeures), ac.Name
FROM ActivityCategory ac
LEFT JOIN ActivityCodes code
ON code.Categorie=ac.Id
LEFT JOIN Heures h
ON h.Code=code.ActivityId
LEFT JOIN HeuresProjets hp
ON hp.HPId=h.HpGuid
LEFT JOIN Semaines s
ON s.Id=hp.WeekId
LEFT JOIN Users u
ON u.EntityGuid=s.UserGuid
GROUP BY u.NoEmploye, u.FirstName, u.LastName, ac.Name
ORDER BY u.NoEmploye
Basically, if you want all Categories, you need to use that table as the first table on your FROM and do LEFT JOINs to that table.
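A minimal sketch of why the driving table matters, run through Python's built-in sqlite3 module. The two tables (category, hours) and their rows are made up for illustration and are much simpler than the schema in the question. The inner join drops the unused category; the left join from the category table keeps it with a NULL total.

```python
import sqlite3

# Hypothetical, simplified schema: one category table and one fact table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE hours (category_id INTEGER, nb INTEGER);
INSERT INTO category VALUES (1, 'Dev'), (2, 'Admin'), (3, 'Training');
INSERT INTO hours VALUES (1, 5), (1, 3), (2, 4);
""")

# Inner join: categories with no matching hours rows disappear.
inner_rows = conn.execute("""
    SELECT c.name, SUM(h.nb)
    FROM category c
    JOIN hours h ON h.category_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.id
""").fetchall()

# Left join starting from category: every category survives.
left_rows = conn.execute("""
    SELECT c.name, SUM(h.nb)
    FROM category c
    LEFT JOIN hours h ON h.category_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.id
""").fetchall()

print(inner_rows)
print(left_rows)
```

'Training' only appears in the second result, with None (NULL) as its total.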
UPDATE
If you want every category for every user on your results, you'll need a CROSS JOIN:
SELECT u.NoEmploye, u.FirstName, u.LastName, Total=sum(h.NbHeures), ac.Name
FROM ActivityCategory ac
CROSS JOIN Users u
LEFT JOIN ActivityCodes code
ON code.Categorie=ac.Id
LEFT JOIN Heures h
ON h.Code=code.ActivityId
LEFT JOIN HeuresProjets hp
ON hp.HPId=h.HpGuid
LEFT JOIN Semaines s
ON s.Id=hp.WeekId AND u.EntityGuid=s.UserGuid
GROUP BY u.NoEmploye, u.FirstName, u.LastName, ac.Name
ORDER BY u.NoEmploye
To solve the issue when you want to add a WHERE clause:
SELECT u.NoEmploye, u.FirstName, u.LastName,
Total=COALESCE(SUM(h.NbHeures),0), ac.Name
FROM ActivityCategory ac
CROSS JOIN Users u
LEFT JOIN
ActivityCodes code
JOIN Heures h
ON h.Code=code.ActivityId
JOIN HeuresProjets hp
ON hp.HPId=h.HpGuid
JOIN Semaines s
ON s.Id=hp.WeekId
AND (some condition on the dates) -- add here
ON ac.Id = code.Categorie
AND u.EntityGuid = s.UserGuid
GROUP BY u.NoEmploye, u.FirstName, u.LastName, ac.Name
ORDER BY u.NoEmploye
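The difference between filtering in the join condition and filtering in WHERE can be sketched on a cut-down schema. This is only an illustration using Python's sqlite3 module: the tables category, person and hours, the week column, and all rows are assumptions, not the tables from the question.

```python
import sqlite3

# Hypothetical schema: hours are attached directly to a person and a category.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE hours (person_id INTEGER, category_id INTEGER, week INTEGER, nb INTEGER);
INSERT INTO category VALUES (1, 'Dev'), (2, 'Admin'), (3, 'Training');
INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO hours VALUES (1, 1, 1, 5), (1, 1, 2, 3), (2, 2, 1, 4);
""")

# Filter inside ON: unmatched (person, category) pairs are kept and coalesced to 0.
on_rows = conn.execute("""
    SELECT p.name, c.name, COALESCE(SUM(h.nb), 0)
    FROM category c
    CROSS JOIN person p
    LEFT JOIN hours h
      ON h.category_id = c.id AND h.person_id = p.id AND h.week = 1
    GROUP BY p.id, p.name, c.id, c.name
    ORDER BY p.id, c.id
""").fetchall()

# Same filter in WHERE: rows where h is NULL fail the test and are discarded.
where_rows = conn.execute("""
    SELECT p.name, c.name, COALESCE(SUM(h.nb), 0)
    FROM category c
    CROSS JOIN person p
    LEFT JOIN hours h
      ON h.category_id = c.id AND h.person_id = p.id
    WHERE h.week = 1
    GROUP BY p.id, p.name, c.id, c.name
    ORDER BY p.id, c.id
""").fetchall()

print(on_rows)
print(where_rows)
```

The first query returns all six person/category pairs; the second returns only the two pairs that actually have week 1 hours, which is the "same problem when I add a where clause" from the question.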

Related

Find average SQL statement multiple tables

I'm trying to get the average price over the number of times a user has purchased broccoli, using the price at the time of each purchase, and a 0 if the user has not purchased any. This isn't working: it says it can't see the Name column. What am I missing?
SELECT U.ID, U.NAME, COALESCE(AVG(P.PRICE),0) AS SELLPRICE FROM USERS AS U
LEFT JOIN PURCHASES AS P ON U.ID=P.USERID AND P.FoodId=1 GROUP BY U.ID;
EDIT
SELECT P.USERID, U.NAME, AVG(P.PRICE) AS "Sell Price"
FROM PURCHASES AS P INNER JOIN USERS AS U ON CASE WHEN P.ID NOT NULL THEN
"Sell Price" ELSE 0 WHERE P.FOODID=1
I also tried simplifying to just use the purchases table and get wrong results, but maybe I can tweak this as it runs.
SELECT AVG(A.Price),ID FROM PURCHASES AS A WHERE FOODID=1 GROUP BY ID;
To be honest, this was partly an issue with the compiler I was using: it was an in-browser compiler, so even when I had the query working on my machine it gave different results on the site. I ended up using an inner join on the two tables.
Update
This ran correctly for the answer. Thank you.
SELECT U.ID, U.NAME, AVG(P.PRICE) FROM USERS AS U LEFT JOIN PURCHASES AS P ON U.ID = P.USERID AS P AND P.FoodId=1 GROUP BY U.ID, P.NAME;
If I understand correctly, you are trying to get the average purchase price of broccoli for each user. You are closer to achieving this with your second query; you just need to group by the UserId column (to get the average per user), not the Id column of the Purchases table (which would give you the average per purchase, which is fairly meaningless). So, I think all you need is:
SELECT AVG(PRICE), USERID FROM PURCHASES WHERE FOODID=1 GROUP BY USERId
If you want to add some user information to your output, like their name, you will need to join with the Users table:
SELECT AVG(P.PRICE), P.USERID, U.NAME
FROM PURCHASES AS P INNER JOIN USERS AS U ON P.USERID = U.ID
WHERE P.FOODID=1
GROUP BY P.USERId, U.NAME
Your first query, I'm afraid, has more than one syntax issue. For example, the name aver_output doesn't correspond to a table or subquery column. If you were trying to name your average price column, you would need the AS keyword (or just lose the comma). Also, you have a subquery and next to it you have a table (Users) without any correlation between them. You must specify how you want to join the two, i.e. whether you want an inner, outer, left or right join. In most systems you can also use a comma to indicate a join, but you don't even have that.
In any case, even if you do fix the syntax, the subquery is unnecessary, as you can achieve the same thing without one.
Edit (after edits to the original post):
The best way to include the users who have not ever purchased any broccoli, is to perform a left join to the PURCHASES table, as in your last attempt. However, you need to group by the name of the user, because it appears in your select list. Grouping by the name of the product is not necessary in this case. So, I suggest:
SELECT U.ID, U.NAME, AVG(P.PRICE) FROM USERS AS U LEFT JOIN PURCHASES AS P ON U.ID = P.USERID AND P.FoodId=1 GROUP BY U.ID, U.NAME;
You need a LEFT JOIN of USERS to PURCHASES and GROUP BY user:
SELECT
U.ID, U.NAME,
COALESCE(AVG(P.PRICE), 0) AS "Sell Price",
COUNT(P.USERID) AS "Number of purchases"
FROM USERS AS U LEFT JOIN PURCHASES AS P
ON P.USERID = U.ID AND P.FOODID = 1
GROUP BY U.ID, U.NAME
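A small runnable check of this pattern, using Python's sqlite3 module. The table layout follows the question, but the rows are invented for illustration. It also shows what happens if the FOODID filter is moved from ON to WHERE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE purchases (id INTEGER PRIMARY KEY, userid INTEGER, foodid INTEGER, price REAL);
INSERT INTO users VALUES (1, 'Ann'), (2, 'Bob');
-- Ann bought food 1 twice; Bob only ever bought food 2.
INSERT INTO purchases VALUES (1, 1, 1, 2.0), (2, 1, 1, 4.0), (3, 2, 2, 9.0);
""")

# Filter in ON: Bob is kept, with an average coalesced to 0 and a count of 0.
rows = conn.execute("""
    SELECT u.id, u.name,
           COALESCE(AVG(p.price), 0) AS sell_price,
           COUNT(p.userid) AS purchase_count
    FROM users u
    LEFT JOIN purchases p ON p.userid = u.id AND p.foodid = 1
    GROUP BY u.id, u.name
    ORDER BY u.id
""").fetchall()

# Filter in WHERE: Bob's only joined row has foodid = 2, so he is filtered out.
where_rows = conn.execute("""
    SELECT u.id, u.name, COALESCE(AVG(p.price), 0), COUNT(p.userid)
    FROM users u
    LEFT JOIN purchases p ON p.userid = u.id
    WHERE p.foodid = 1
    GROUP BY u.id, u.name
    ORDER BY u.id
""").fetchall()

print(rows)
print(where_rows)
```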
SELECT U.Id, AVG(U.NAME)
FROM (SELECT PRICE, USERID, ID
FROM PURCHASES WHERE FOODID=1) P
Join USERS U on U.Id = P.USERId
GROUP BY U.ID
But actually, all you should need is
SELECT U.Id, AVG(U.NAME)
FROM PURCHASES P Join USERS U
on U.Id = P.USERId
WHERE P.FOODID = 1
GROUP BY U.ID
except I still don't see why you are asking for average of Name column...
What exactly are you trying to do?

Best approach for limiting rows coming back in SQL when joining for a sum

I need to get back a list of users and the total amount that they have ordered. In reality my query is more complex but I think this sums it up. My issue is, if a user made 5 orders for example, I'll get back their name and the total they've ordered 5 times due to the join (having 5 rows in the order table for that user).
What's the recommended approach for when you need to total the records in one table that has multiple rows without requiring many rows to come back? distinct could work but is this the best? (especially when my select chooses more information than what's below)
SELECT user.name, sum(order.amount) FROM USER user
INNER JOIN USER_ORDERS order
ON (user.user_id = order.user_id)
Are you just looking for GROUP BY?
SELECT u.name, SUM(uo.amount)
FROM USER u JOIN
USER_ORDERS uo
ON u.user_id = uo.user_id
GROUP BY u.name, u.user_id;
Note that this has included user_id in the GROUP BY, just in case two users have the same name.
If you want all users, even those without orders, then you want a LEFT JOIN:
SELECT u.name, SUM(uo.amount)
FROM USER u LEFT JOIN
USER_ORDERS uo
ON u.user_id = uo.user_id
GROUP BY u.name, u.user_id;
Or a correlated subquery:
SELECT u.name,
(SELECT SUM(uo.amount)
FROM USER_ORDERS uo
WHERE u.user_id = uo.user_id
)
FROM USER u;
You could use the analytic version of SUM.
SELECT u.name, SUM(uo.amount) OVER(PARTITION BY u.name)
FROM USER u JOIN
USER_ORDERS uo
ON u.user_id = uo.user_id;
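To compare the GROUP BY and correlated-subquery approaches, here is a short sketch with Python's sqlite3 module. The tables are named users and user_orders and the rows are made up; two users deliberately share a name to show why user_id belongs in the GROUP BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE user_orders (user_id INTEGER, amount INTEGER);
-- Two different users are both called Ann; Cid has no orders.
INSERT INTO users VALUES (1, 'Ann'), (2, 'Ann'), (3, 'Cid');
INSERT INTO user_orders VALUES (1, 10), (1, 15), (2, 7);
""")

# LEFT JOIN + GROUP BY: one row per user, NULL total for users without orders.
grouped = conn.execute("""
    SELECT u.name, SUM(uo.amount)
    FROM users u
    LEFT JOIN user_orders uo ON u.user_id = uo.user_id
    GROUP BY u.name, u.user_id
    ORDER BY u.user_id
""").fetchall()

# Correlated subquery: no join in the outer query, so no row multiplication.
correlated = conn.execute("""
    SELECT u.name,
           (SELECT SUM(uo.amount)
            FROM user_orders uo
            WHERE uo.user_id = u.user_id)
    FROM users u
    ORDER BY u.user_id
""").fetchall()

print(grouped)
print(correlated)
```

Both queries return one row per user. Grouping by name alone would have merged the two Anns into a single row with a total of 32.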

how can i access column from subquery

select u.phone, u.email , t.to_address (error from this)
from user_accounts u
where u.id
in
(select w.user_id
from wallets w
where w.id
in
(
select t.wallet_id
from withdraws t
where t.to_address
in
('1F6o1fZZ7', 'pJDtRRnyhDN')))
I want to get the column to_address from the subquery. How can I get it in PostgreSQL?
I tried assigning an alias with AS to the subquery, but it didn't work.
A join returns a result table constructed from data from multiple tables. You can also retrieve the same result table using a sub query. A sub query is simply a SELECT statement within another select statement.
select u.phone, u.email, t.to_address
from user_accounts u
INNER JOIN wallets w ON u.id= w.user_id
INNER JOIN withdraws t ON t.wallet_id =w.id
where t.to_address in ('1F6o1fZZ7', 'pJDtRRnyhDN')
Use joins across all the tables; you don't need any subquery.
select u.phone, u.email , ww.to_address
from user_accounts u left join wallets w on u.id=w.user_id
left join withdraws ww on w.id=ww.wallet_id
where ww.to_address in ('1F6o1fZZ7', 'pJDtRRnyhDN')
You cannot access t.to_address because that column only exists inside the IN condition.
I used a left join, but because the WHERE clause filters on ww.to_address in ('1F6o1fZZ7', 'pJDtRRnyhDN'), it effectively behaves like an inner join.
You cannot achieve what you're trying to do using a subquery. When you want records from different tables and they have a common column that connects them, you should do it using a JOIN.
Sometimes (not in all cases) IN can cause performance problems, so you should consider learning more about the different types of JOINs (https://www.w3schools.com/sql/sql_join.asp).
Check the link for comparison:
Inner join versus doing a where in clause
About the Query:
SELECT
u.phone, u.email, t.to_address
FROM
user_accounts u
INNER JOIN wallets w ON u.id = w.user_id
INNER JOIN withdraws t ON t.wallet_id = w.id
WHERE
t.to_address IN ('1F6o1fZZ7', 'pJDtRRnyhDN')
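A quick way to try the join version is the sketch below, run through Python's sqlite3 module rather than PostgreSQL. The table and column names follow the question; the rows, phone numbers and e-mail addresses are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_accounts (id INTEGER PRIMARY KEY, phone TEXT, email TEXT);
CREATE TABLE wallets (id INTEGER PRIMARY KEY, user_id INTEGER);
CREATE TABLE withdraws (id INTEGER PRIMARY KEY, wallet_id INTEGER, to_address TEXT);
INSERT INTO user_accounts VALUES
    (1, '555-0001', 'ann@example.com'),
    (2, '555-0002', 'bob@example.com');
INSERT INTO wallets VALUES (10, 1), (20, 2);
INSERT INTO withdraws VALUES (100, 10, '1F6o1fZZ7'), (101, 20, 'someOtherAddress');
""")

# Joining makes withdraws a table of the outer query, so t.to_address is selectable.
rows = conn.execute("""
    SELECT u.phone, u.email, t.to_address
    FROM user_accounts u
    INNER JOIN wallets w ON u.id = w.user_id
    INNER JOIN withdraws t ON t.wallet_id = w.id
    WHERE t.to_address IN ('1F6o1fZZ7', 'pJDtRRnyhDN')
""").fetchall()

print(rows)
```

Note that, unlike the IN version, a user with several matching withdrawals appears once per withdrawal.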

How to reference the same column from 2 separate IDs SQL Server

I've been looking around for a bit and have been unable to find an answer to this that works, so I'm hoping someone can help me with my best bet to solve this. Basically I have a table with a user_id and a delegate_id; both reference the same user_mstr table. What I want is the name of both the user and the delegate.
Here is the part of my query in question:
SELECT u.first_name, u.last_name, delegate_user_id, delegate_provider_ind,
tasks_ind, workflow_use_always_ind
FROM workflow_user_delegates ud
LEFT OUTER JOIN user_mstr u ON ud.user_id=u.user_id
But I want the delegate_id to be changed into a name, except obviously I already referenced user first and last name.
The exact software is SQL Server 2008. Thanks!
You can use alias = expression or expression AS alias to alias column names.
select
u.first_name
, u.last_name
, delegate_user_id
, delegate_first_name = d.first_name
, delegate_last_name = d.last_name
, delegate_provider_ind
, tasks_ind
, workflow_use_always_ind
from workflow_user_delegates ud
left join user_mstr u on ud.user_id = u.user_id
left join user_mstr d on ud.delegate_user_id = d.user_id
You need an additional LEFT JOIN:
SELECT u.first_name AS user_first_name, u.last_name AS user_last_name,
d.first_name AS delegate_first_name, d.last_name AS delegate_last_name,
delegate_user_id, delegate_provider_ind,
tasks_ind, workflow_use_always_ind
FROM workflow_user_delegates ud
LEFT OUTER JOIN user_mstr u ON ud.user_id=u.user_id
LEFT OUTER JOIN user_mstr d ON ud.delegate_user_id=d.user_id
You can join the same table twice, just provide different aliases for the table. Something like this:
....
FROM workflow_user_delegates ud
LEFT OUTER JOIN user_mstr u ON ud.user_id=u.user_id
LEFT OUTER JOIN user_mstr d ON ud.delegate_user_id=d.user_id
So any references to u.[...] in the SELECT will reference the one joined based on user_id, and any references to d.[...] will reference the one based on delegate_user_id.
To differentiate the columns in the SELECT clause, use a similar approach for aliasing. Something like this:
SELECT
....
u.first_name AS user_first_name,
u.last_name AS user_last_name,
d.first_name AS delegate_first_name,
d.last_name AS delegate_last_name,
....
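Joining the same table twice under two aliases can be sketched end to end like this. The example uses Python's sqlite3 module instead of SQL Server, keeps only the columns needed for the join, and uses invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_mstr (user_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE workflow_user_delegates (user_id INTEGER, delegate_user_id INTEGER);
INSERT INTO user_mstr VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
-- Ann delegates to Bob; Bob has no delegate.
INSERT INTO workflow_user_delegates VALUES (1, 2), (2, NULL);
""")

# u resolves the user, d resolves the delegate; both point at user_mstr.
rows = conn.execute("""
    SELECT u.first_name AS user_first_name,
           u.last_name  AS user_last_name,
           d.first_name AS delegate_first_name,
           d.last_name  AS delegate_last_name
    FROM workflow_user_delegates ud
    LEFT OUTER JOIN user_mstr u ON ud.user_id = u.user_id
    LEFT OUTER JOIN user_mstr d ON ud.delegate_user_id = d.user_id
    ORDER BY ud.user_id
""").fetchall()

print(rows)
```

Because both joins are LEFT joins, the row without a delegate is still returned, with NULL delegate names.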

sql - How to have multiple select/from statements in one query

I'm trying to pull a report where each column is selecting from a specific table set. However, one of the columns needs to pull from a completely different table set and still be included in the same report. Of course, this doesn't work:
select u.first_name, ticket_work.time_spent
FROM tickets LEFT OUTER JOIN ticket_work ON ticket_work.ticket_id = tickets.id JOIN users u
(select count(tickets.id) FROM tickets JOIN users u)
where tickets.assigned_to = u.id
...
So just the part (select count(tickets.id) FROM tickets JOIN users u) needs to be selecting from the different table set but still be included in the report.
I'm a little confused by your question. Are you wanting to return the user, the count of tickets for that user, and the amount of time spent overall? If so, something like this should work:
select u.id, u.first_name,
SUM(tw.time_spent) summed_time_spent,
COUNT(DISTINCT t.id) count_tickets
FROM users u
LEFT JOIN tickets t
ON u.id = t.assigned_to
LEFT JOIN ticket_work tw
ON tw.ticket_id = t.id
GROUP BY u.id, u.first_name
Your question is unclear, but just generally, it sounds like you're trying to join to a derived table (i.e., a query). In that case, do this:
SELECT...
FROM...
table_A A LEFT JOIN
(SELECT keyfield, valuefield FROM table_b WHERE ...) B
ON A.keyfield = B.keyfield
Does that make sense? To make a derived table, you put a query inside parentheses, give it an alias ('B' in this case), and then join it to your other tables as though it were a regular table.
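Applied to the tickets example, a derived-table version might look like the sketch below. This is one possible reading of the question, not a confirmed solution: the column names come from the queries above, the rows are invented, and it is run through Python's sqlite3 module. Each derived table is aggregated on its own before being joined, so the ticket count is not inflated by multiple ticket_work rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT);
CREATE TABLE tickets (id INTEGER PRIMARY KEY, assigned_to INTEGER);
CREATE TABLE ticket_work (ticket_id INTEGER, time_spent INTEGER);
INSERT INTO users VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO tickets VALUES (1, 1), (2, 1), (3, 2);
-- Ticket 1 has two work entries; ticket 3 has none.
INSERT INTO ticket_work VALUES (1, 30), (1, 20), (2, 10);
""")

rows = conn.execute("""
    SELECT u.first_name, tc.ticket_count, COALESCE(w.total_time, 0)
    FROM users u
    LEFT JOIN (SELECT assigned_to, COUNT(*) AS ticket_count
               FROM tickets
               GROUP BY assigned_to) tc
           ON tc.assigned_to = u.id
    LEFT JOIN (SELECT t.assigned_to, SUM(tw.time_spent) AS total_time
               FROM tickets t
               JOIN ticket_work tw ON tw.ticket_id = t.id
               GROUP BY t.assigned_to) w
           ON w.assigned_to = u.id
    ORDER BY u.id
""").fetchall()

print(rows)
```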
Don't know about your table structure but you may use a sub query for such requirement
select u.first_name, ticket_work.time_spent,
(select count(tickets.id) FROM tickets where tickets.id=ticket_work.ticket_id) as myCount
FROM tickets LEFT OUTER JOIN ticket_work ON ticket_work.ticket_id = tickets.id
JOIN users u ON tickets.assigned_to = u.id