Combine multiple rows to one after group by - sql

UPDATE:
https://www.db-fiddle.com/f/5fGUTSsAhGRPYPk33wDSzz/0
Sorry for asking a question very similar to my previous one, but I am really stuck here.
There are multiple tables:
items → items_roles → roles
items → zones → roles_zones → roles
Structure:
items: id, zone_id
items_roles: role_id, item_id
zones: id
roles_zones: role_id, zone_id
roles: id, role_type_id
I am trying to add role fields to items: for each role_type, take the value from items_roles, and if it is NULL, fall back to the value from the zone (roles_zones).
I started with:
SELECT
items.id,
(z_roles.role_type_id) as z_role_type_id,
(z_roles.id) as z_role_id,
MAX(i_roles.role_type_id) as i_role_type_id,
MAX(i_roles.id) as i_role_id
FROM
items
LEFT JOIN
zones as j_zones ON j_zones.id = items.zone_id
LEFT JOIN
roles_zones ON roles_zones.zone_id = j_zones.id
LEFT JOIN
roles as z_roles ON (z_roles.id = roles_zones.role_id)
LEFT JOIN
items_roles ON items_roles.item_id = items.id
LEFT JOIN
roles as i_roles ON items_roles.role_id = i_roles.id
AND (z_roles.role_type_id = i_roles.role_type_id)
WHERE
items.id = 834
GROUP BY
items.id, z_roles.role_type_id, z_roles.id
ORDER BY
i_role_id;
Looks right:
id |z_role_type_id |z_role_id |i_role_type_id |i_role_id |
----+---------------+----------+---------------+----------+
834 |5 |111 |5 |68 |
834 |11 |120 |11 |120 |
834 |7 |77 | | |
834 |2 |2 | | |
834 |12 |91 | | |
834 |4 |78 | | |
834 |8 |36 | | |
And now this query:
SELECT
items.id,
z_roles.role_type_id as z_role_type_id,
z_roles.id as z_role_id,
MAX(i_roles.role_type_id) AS i_role_type_id,
MAX(i_roles.id) AS i_role_id,
MAX(CASE
WHEN (i_roles.role_type_id = 5) THEN i_roles.id
WHEN (z_roles.role_type_id = 5) THEN z_roles.id
END) AS role_type_5_value,
MAX(CASE
WHEN (i_roles.role_type_id = 11) THEN i_roles.id
WHEN (z_roles.role_type_id = 11) THEN z_roles.id
END) AS role_type_11_value,
MAX(CASE
WHEN (i_roles.role_type_id = 7) THEN i_roles.id
WHEN (z_roles.role_type_id = 7) THEN z_roles.id
END) AS role_type_7_value
FROM
items
LEFT JOIN
zones AS j_zones ON j_zones.id = items.zone_id
LEFT JOIN
roles_zones ON roles_zones.zone_id = j_zones.id
LEFT JOIN
roles AS z_roles ON (z_roles.id = roles_zones.role_id)
LEFT JOIN
items_roles ON items_roles.item_id = items.id
LEFT JOIN
roles AS i_roles ON items_roles.role_id = i_roles.id
AND (z_roles.role_type_id = i_roles.role_type_id)
WHERE
items.id = 834
GROUP BY
items.id,
z_roles.role_type_id,
z_roles.id
ORDER BY
items.id, i_role_id;
Generates this:
id | z_role_type_id | z_role_id | i_role_type_id | i_role_id | role_type_5_value | role_type_11_value | role_type_7_value
-----+----------------+-----------+----------------+-----------+-------------------+--------------------+-------------------
834 | 5 | 111 | 5 | 68 | 111 | |
834 | 11 | 120 | 11 | 120 | | 120 |
834 | 7 | 77 | | | | | 77
834 | 2 | 2 | | | | |
834 | 12 | 91 | | | | |
834 | 4 | 78 | | | | |
834 | 8 | 36 | | | | |
(7 rows)
This returns multiple rows and the wrong value for role_type_5_value (111 instead of 68), probably because of the MAX aggregate. Is it possible to use something like a FIRST aggregate (the rows are ordered by i_role_id, and the first results are right)?
I want this:
id | role_type_5_value | role_type_11_value | role_type_7_value
-----+-------------------+--------------------+-------------------
834 | 68 | 120 | 77
I tried to group by the aggregated fields (role_type_5_value, role_type_11_value, role_type_7_value), but this simply does not work.

First: removing the unneeded bridge tables from the main query and squeezing them into EXISTS() terms will simplify your query (you only need three tables; the rest is glue).
Second: don't put all your terms in the GROUP BY clause; group by the item id only.
SELECT
i0.id
, MAX(CASE
WHEN (r1.role_type_id = 5) THEN r1.id
WHEN (r0.role_type_id = 5) THEN r0.id
END) AS role_type_5_value
, MAX(CASE
WHEN (r1.role_type_id = 11) THEN r1.id
WHEN (r0.role_type_id = 11) THEN r0.id
END) AS role_type_11_value
, MAX(CASE
WHEN (r1.role_type_id = 7) THEN r1.id
WHEN (r0.role_type_id = 7) THEN r0.id
END) AS role_type_7_value
FROM items i0
LEFT JOIN roles AS r0
ON EXISTS ( SELECT*
FROM zones AS z0
JOIN roles_zones rz ON rz.zone_id = z0.id
WHERE z0.id = i0.zone_id
AND r0.id = rz.role_id)
LEFT JOIN roles AS r1
ON EXISTS ( SELECT*
FROM items_roles ir
WHERE ir.item_id = i0.id
AND ir.role_id = r1.id
AND r0.role_type_id = r1.role_type_id
)
WHERE i0.id = 834
GROUP BY i0.id
-- r0.role_type_id,
-- r0.id
ORDER BY i0.id;
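
To see why the plain joins return 111 for role type 5 while the EXISTS() version returns 68, here is a small sketch using Python's sqlite3 module. The table contents are assumptions made up for illustration (only the ids 834, 111, 68, 120 and 77 mirror the question), the zones table is skipped because items.zone_id can be matched to roles_zones directly, and SQLite stands in for whatever database the question uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (id INTEGER PRIMARY KEY, zone_id INTEGER);
CREATE TABLE roles (id INTEGER PRIMARY KEY, role_type_id INTEGER);
CREATE TABLE roles_zones (role_id INTEGER, zone_id INTEGER);
CREATE TABLE items_roles (role_id INTEGER, item_id INTEGER);
-- Hypothetical sample data: zone 1 supplies roles 111, 120, 77;
-- item 834 overrides types 5 and 11 with roles 68 and 120.
INSERT INTO items VALUES (834, 1);
INSERT INTO roles VALUES (111, 5), (68, 5), (120, 11), (77, 7);
INSERT INTO roles_zones VALUES (111, 1), (120, 1), (77, 1);
INSERT INTO items_roles VALUES (68, 834), (120, 834);
""")

# Same pivot columns for both queries: item role first, zone role as fallback.
PIVOT = """
  MAX(CASE WHEN i.role_type_id = 5  THEN i.id WHEN z.role_type_id = 5  THEN z.id END),
  MAX(CASE WHEN i.role_type_id = 11 THEN i.id WHEN z.role_type_id = 11 THEN z.id END),
  MAX(CASE WHEN i.role_type_id = 7  THEN i.id WHEN z.role_type_id = 7  THEN z.id END)
"""

# Bridge tables joined directly: every items_roles row is paired with every
# zone role, so some rows have a zone role but a NULL item role, and the
# CASE falls through to the zone value before MAX is applied.
plain_joins = conn.execute(f"""
SELECT items.id, {PIVOT}
FROM items
LEFT JOIN roles_zones rz ON rz.zone_id = items.zone_id
LEFT JOIN roles z ON z.id = rz.role_id
LEFT JOIN items_roles ir ON ir.item_id = items.id
LEFT JOIN roles i ON i.id = ir.role_id AND i.role_type_id = z.role_type_id
GROUP BY items.id
""").fetchall()

# Bridge tables moved into EXISTS(): at most one item role per zone role,
# so no spurious "fallback" rows are generated.
exists_joins = conn.execute(f"""
SELECT items.id, {PIVOT}
FROM items
LEFT JOIN roles z ON EXISTS (
    SELECT 1 FROM roles_zones rz
    WHERE rz.zone_id = items.zone_id AND rz.role_id = z.id)
LEFT JOIN roles i ON EXISTS (
    SELECT 1 FROM items_roles ir
    WHERE ir.item_id = items.id AND ir.role_id = i.id
      AND i.role_type_id = z.role_type_id)
GROUP BY items.id
""").fetchall()

print(plain_joins)   # [(834, 111, 120, 77)]
print(exists_joins)  # [(834, 68, 120, 77)]
```

With this sample data the wrong value comes from the extra joined rows rather than from MAX itself, which is why changing the joins is enough and no FIRST-style aggregate is needed.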

Related

JOIN removing rows without entry

I want to display columns even if they have no entry, to show that they have no data. I've found that my joins omit rows that I need.
I have two tables
|TRADEID | Value | Date |
|--------|-------|-----------|
| a | 100 | 01/01/2020|
| b | 500 | 01/01/2020|
| c | 10 | 01/01/2020|
| d | 130 | 01/01/2020|
| ID | TradeID | Role | employeeID|
|-----|---------|---------|-----------|
| 1 | a | Trader | T1 |
| 2 | a | Seller | S1 |
| 3 | b | Trader | T1 |
| 4 | d | Trader | T2 |
| 5 | d | Seller | S1 |
| 6 | d | Reporter| R1 |
I would like to end up with the following
TradeID | Trader | Seller | Reporter| Value|
---------|--------|--------|---------|------|
a | T1 | S1 | | 100 |
b | T1 | | | 500 |
c | | | | 10 |
d | T2 | S1 | R1 | 130 |
My current query is :
select t1.TradeID, r1.employeeID, r2.employeeId, r3.employeeId, t1.value
From tradeTable t1
join RoleTable r1 on t1.TradeID = r1.TradeID and r1.role = 'Trader'
join RoleTable r2 on t1.TradeId = r2.TradeID and r2.role = 'Seller'
join RoleTable r3 on t1.TradeId = r3.TradeID and r3.role = 'Reporter'
This, however, only returns row d, as it is the only one with all the values present.
You can left join:
select t1.TradeID, r1.employeeID trader, r2.employeeId seller, r3.employeeId reporter, t1.value
from tradeTable t1
left join RoleTable r1 on t1.TradeID = r1.TradeID and r1.role = 'Trader'
left join RoleTable r2 on t1.TradeId = r2.TradeID and r2.role = 'Seller'
left join RoleTable r3 on t1.TradeId = r3.TradeID and r3.role = 'Reporter'
Another option is conditional aggregation:
select t1.TradeID,
max(case when r.role = 'Trader' then r.employeeID end) trader,
max(case when r.role = 'Seller' then r.employeeID end) seller,
max(case when r.role = 'Reporter' then r.employeeID end) reporter,
t1.value
from tradeTable t1
left join RoleTable r on r.TradeID = t1.TradeID
group by t1.TradeID, t1.value
You might want to test both options to assess which one is more efficient for your dataset.
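
Both options can be checked against the sample data from the question. The following is a minimal sketch using Python's sqlite3 module; the Date column is left out because it plays no part in the result, and SQLite is an assumption standing in for the actual database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tradeTable (TradeID TEXT, value INTEGER);
CREATE TABLE RoleTable (ID INTEGER, TradeID TEXT, role TEXT, employeeID TEXT);
INSERT INTO tradeTable VALUES ('a', 100), ('b', 500), ('c', 10), ('d', 130);
INSERT INTO RoleTable VALUES
  (1, 'a', 'Trader', 'T1'), (2, 'a', 'Seller', 'S1'),
  (3, 'b', 'Trader', 'T1'), (4, 'd', 'Trader', 'T2'),
  (5, 'd', 'Seller', 'S1'), (6, 'd', 'Reporter', 'R1');
""")

# Option 1: one LEFT JOIN per role; each alias filters on its own role.
left_join_rows = conn.execute("""
SELECT t1.TradeID, r1.employeeID, r2.employeeID, r3.employeeID, t1.value
FROM tradeTable t1
LEFT JOIN RoleTable r1 ON t1.TradeID = r1.TradeID AND r1.role = 'Trader'
LEFT JOIN RoleTable r2 ON t1.TradeID = r2.TradeID AND r2.role = 'Seller'
LEFT JOIN RoleTable r3 ON t1.TradeID = r3.TradeID AND r3.role = 'Reporter'
ORDER BY t1.TradeID
""").fetchall()

# Option 2: a single LEFT JOIN, then conditional aggregation per role.
pivot_rows = conn.execute("""
SELECT t1.TradeID,
       MAX(CASE WHEN r.role = 'Trader'   THEN r.employeeID END),
       MAX(CASE WHEN r.role = 'Seller'   THEN r.employeeID END),
       MAX(CASE WHEN r.role = 'Reporter' THEN r.employeeID END),
       t1.value
FROM tradeTable t1
LEFT JOIN RoleTable r ON r.TradeID = t1.TradeID
GROUP BY t1.TradeID, t1.value
ORDER BY t1.TradeID
""").fetchall()

for row in pivot_rows:
    print(row)
```

Trade c appears in both results with NULL (None in Python) for all three roles, which an inner join would have dropped.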

Issue with SQL join and group

I have 4 tables I am trying to join and then group data. The data consists of jobs, invoices and accounts. I want to generate a total of each account in each job.
I have the following tables:
Jobs
| ID | JobNumber |
|----|-----------|
| 1 | J200 |
| 2 | J201 |
Job_Invoices
| ID | InvoiceNumber | JobID |
|----|---------------|-------|
| 10 | I300 | 1 |
| 11 | I301 | 2 |
Invoice_Accounts
| ID | InvoiceId | AccountID | Amount |
|----|-----------|-----------|--------|
| 23 | 10 | 40 | 200 |
| 24 | 10 | 40 | 300 |
| 25 | 10 | 41 | 100 |
| 26 | 11 | 40 | 100 |
Accounts
| ID | Name |
|----|------|
| 40 | Sales|
| 41 | EXP |
I am trying the following:
SELECT
J.JobNumber,
A.Name AS "Account",
SUM(JA.Amount) AS 'Total'
FROM
Job J
LEFT JOIN
Job_Invoices JI ON JI.JobID = J.JobID
INNER JOIN
Invoice_Accounts JA ON JA.InvoiceId = JI.ID
INNER JOIN
Accounts A ON A.ID = JA.AccountID
GROUP BY
J.JobNumber, A.Name, JA.Amount
ORDER BY
J.JobNumber
What I expect:
| JobNumber | Account | Total |
|-----------|-----------|-------|
| J200 | EXP | 100 |
| J200 | Sales | 500 |
| J201 | Sales | 100 |
What I get:
| JobNumber | Account | Total |
|-----------|-----------|-------|
| J200 | EXP | 100 |
| J200 | Sales | 200 |
| J200 | Sales | 300 |
| J201 | Sales | 100 |
You don't need the Job table in the query. The INNER JOINs are to the Job_Invoices table, so the outer join is turned into an inner join anyway.
So, you can simplify this to:
SELECT JI.JobNumber, A.Name AS Account, SUM(JA.Amount) AS Total
FROM Job_Invoices JI JOIN
Invoice_Accounts JA
ON JA.InvoiceId = JI.ID JOIN
Accounts A
ON A.ID = JA.AccountID
GROUP BY JI.JobNumber, A.Name
ORDER BY JI.JobNumber;
Also note that you don't need to escape the column aliases. That just makes the query harder to type.
The problem is you have the JA.Amount in your GROUP BY clause. Try taking it out:
SELECT J.JobNumber, A.Name AS "Account", SUM(JA.Amount) AS 'Total'
FROM Job J
LEFT JOIN Job_Invoices JI ON JI.JobID = J.JobID
INNER JOIN Invoice_Accounts JA ON JA.InvoiceId = JI.ID
INNER JOIN Accounts A ON A.ID = JA.AccountID
GROUP BY J.JobNumber, A.Name
ORDER BY J.JobNumber
You can write a query as:
select sum (IA.Amount) as Amount, J.JobNumber,A.Name
from #Invoice_Accounts IA --as it holds the base data
join #Job_Invoices JI on IA.InvoiceId = JI.ID
join #Jobs J on J.id = JI.JobID
join #Accounts A on A.ID = IA.AccountID
group by J.JobNumber,A.Name
Included the Jobs table as it has the JobNumber column.
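
The effect of having Amount in the GROUP BY can be reproduced with the sample tables from the question. This is a sketch using Python's sqlite3 module (SQLite is an assumption here; the grouping behaviour shown is standard SQL).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Jobs (ID INTEGER, JobNumber TEXT);
CREATE TABLE Job_Invoices (ID INTEGER, InvoiceNumber TEXT, JobID INTEGER);
CREATE TABLE Invoice_Accounts (ID INTEGER, InvoiceId INTEGER,
                               AccountID INTEGER, Amount INTEGER);
CREATE TABLE Accounts (ID INTEGER, Name TEXT);
INSERT INTO Jobs VALUES (1, 'J200'), (2, 'J201');
INSERT INTO Job_Invoices VALUES (10, 'I300', 1), (11, 'I301', 2);
INSERT INTO Invoice_Accounts VALUES
  (23, 10, 40, 200), (24, 10, 40, 300), (25, 10, 41, 100), (26, 11, 40, 100);
INSERT INTO Accounts VALUES (40, 'Sales'), (41, 'EXP');
""")

BASE = """
SELECT J.JobNumber, A.Name, SUM(IA.Amount) AS Total
FROM Invoice_Accounts IA
JOIN Job_Invoices JI ON IA.InvoiceId = JI.ID
JOIN Jobs J ON J.ID = JI.JobID
JOIN Accounts A ON A.ID = IA.AccountID
GROUP BY {group_by}
ORDER BY J.JobNumber, A.Name, Total
"""

# Grouping by Amount as well creates one group per distinct amount,
# so SUM() only ever adds up rows that share the same amount.
with_amount = conn.execute(
    BASE.format(group_by="J.JobNumber, A.Name, IA.Amount")).fetchall()

# Grouping by job and account only gives one total per job/account pair.
without_amount = conn.execute(
    BASE.format(group_by="J.JobNumber, A.Name")).fetchall()

print(with_amount)
print(without_amount)
```

The first result reproduces the four rows from the question (Sales split into 200 and 300); the second gives the expected three rows with Sales totalling 500 for J200.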

SELECT max date from an inner join relation

I have 2 tables, Staff and updateStaff.
Staff:
Sid Sname
---|--------|
1 | test1 |
2 | test2 |
3 | test3 |
4 | test4 |
5 | test5 |
updateStaff:
Sid Sprice SDate STime
---|--------|----------|--------|
1 | 150 |2015/10/09|6:35:00 |
2 | 250 |2015/10/10|5:21:00 |
3 | 75 |2015/11/11|17:30:00|
3 | 95 |2015/11/11|18:21:00|
4 | 300 |2015/12/12|2:25:00 |
I need the result to show as:
Sid SDate STime Sname | Sprice |
---|----------|--------|---------|------------
1 |2015/10/09|6:35:00 |test1 |150 |
2 |2015/10/10|5:21:00 |test2 |250 |
3 |2015/11/11|17:30:00|test3 |95 |
3 |2015/11/11|18:21:00|test3 |300 |
4 |2015/12/12|2:25:00 |test5 |NULL |
However, my code below shows me both rows for staff Id 3 on the 2015/11/11 date.
SELECT R.SId ,R.SName,R.Sprice
FROM (SELECT Staff.SId ,Staff.SName,Sprice,updateStaff.SDate
FROM Staff
LEFT JOIN updateStaff ON Staff.SId = updateStaff.SId ) AS R
WHERE R.date = (SELECT MAX(date) FROM updateStaff WHERE updateStaff.SId =R.SId)
ORDER BY R.SId , R.SName
I need only the latest price for each staff member, ordered by date and time.
I'm not sure about SQL Server CE syntax, so I'm pretty sure there is a better way of doing it, but you can do this:
SELECT R.SId ,R.SName,R.Sprice
FROM (SELECT Staff.SId ,Staff.SName,Sprice,updateStaff.SDate,updateStaff.stime
FROM Staff
LEFT JOIN updateStaff ON Staff.SId = updateStaff.SId ) AS R
WHERE R.stime = (SELECT MAX(stime) FROM updateStaff us WHERE us.SId =R.SId
and us.sdate =(select max(sdate) from updateStaff us2 where us2.sid = us.sid))
ORDER BY R.SId , R.SName
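
A different formulation of the same idea, not the nested MAX subqueries above, is to keep only the update row for which no later row exists. The sketch below uses Python's sqlite3 module and the sample data from the question. It assumes dates and times are stored in a sortable form (ISO dates and zero-padded times here, since text such as '6:35:00' would sort after '17:30:00'); whether SQL Server CE accepts this exact syntax is not verified.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Staff (Sid INTEGER, Sname TEXT);
CREATE TABLE updateStaff (Sid INTEGER, Sprice INTEGER, SDate TEXT, STime TEXT);
INSERT INTO Staff VALUES
  (1, 'test1'), (2, 'test2'), (3, 'test3'), (4, 'test4'), (5, 'test5');
-- Assumption: ISO dates and zero-padded times so text comparison sorts correctly.
INSERT INTO updateStaff VALUES
  (1, 150, '2015-10-09', '06:35:00'),
  (2, 250, '2015-10-10', '05:21:00'),
  (3, 75,  '2015-11-11', '17:30:00'),
  (3, 95,  '2015-11-11', '18:21:00'),
  (4, 300, '2015-12-12', '02:25:00');
""")

# Keep an update row only if the same staff member has no later row,
# comparing the date first and the time as a tie-breaker.
latest_prices = conn.execute("""
SELECT s.Sid, s.Sname, u.Sprice
FROM Staff s
LEFT JOIN updateStaff u
  ON u.Sid = s.Sid
 AND NOT EXISTS (
       SELECT 1 FROM updateStaff later
       WHERE later.Sid = u.Sid
         AND (later.SDate > u.SDate
              OR (later.SDate = u.SDate AND later.STime > u.STime)))
ORDER BY s.Sid
""").fetchall()

for row in latest_prices:
    print(row)
```

Because the filter sits in the ON clause of the LEFT JOIN rather than in WHERE, staff without any update (Sid 5 here) still appear, with a NULL price.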

SQL join 3 tables (based on 2 criterias?)

I have 3 tables setup like this (a bit simplified):
time_tracking: id, tr_proj_id, tr_min, tr_type
time_projects: id, project_name
time_tasks: id, task_name
Basically, I want to retrieve either project_name or task_name based on tr_type which can be of value "project" or "task"
An example
time_tracking
+----+------------+--------+---------+
| id | tr_proj_id | tr_min | tr_type |
+----+------------+--------+---------+
| 1 | 3 | 60 | project |
| 2 | 3 | 360 | task |
| 3 | 1 | 120 | project |
| 4 | 2 | 30 | project |
| 5 | 2 | 30 | task |
| 6 | 1 | 90 | task |
+----+------------+--------+---------+
time_projects
+----+------------------------+
| id | project_name |
+----+------------------------+
| 1 | Make someone happy |
| 2 | Start a project |
| 3 | Jump out of the window |
+----+------------------------+
time_tasks
+----+---------------------+
| id | task_name |
+----+---------------------+
| 1 | drink a beer |
| 2 | drink a second beer |
| 3 | drink more |
+----+---------------------+
Desired output
+----+------------------------+------------+--------+---------+
| id | name | tr_proj_id | tr_min | tr_type |
+----+------------------------+------------+--------+---------+
| 1 | Jump out of the window | 3 | 60 | project |
| 2 | drink more | 3 | 360 | task |
| 3 | Make someone happy | 1 | 120 | project |
| 4 | Start a project | 2 | 30 | project |
| 5 | drink a second beer | 2 | 30 | task |
| 6 | drink a beer | 1 | 90 | task |
+----+------------------------+------------+--------+---------+
And being really bad at the whole JOIN thing, here's the only thing I've come up with so far (which doesn't work..):
SELECT tt.tr_proj_id, tt.tr_type, tt.tr_min, pp.project_name, pp.id, ta.task_name, ta.id
FROM time_tracking as tt, time_projects as pp, time_tasks as ta
WHERE ((tt.tr_type = 'project' AND pp.id = tt.tr_proj_id) OR (tt.tr_type = 'task' AND ta.id = tt.tr_proj_id))
AND tt.tr_min > 0
ORDER BY tt.tr_proj_id DESC
If anyone has an idea on how to do this, feel free to share!
Update: Looks like I forgot to specify that I'm using an access database. Which apparently doesn't accept things like CASE or coalesce.. Apparently there is IIF() but I'm not quite sure on how to use it in this case.
Use join clauses and move your join conditions from the where clause into the on clauses:
SELECT
tt.tr_proj_id,
tt.tr_type,
tt.tr_min,
pp.project_name,
pp.id,
ta.task_name,
ta.id
FROM time_tracking as tt
left join time_projects as pp on tt.tr_type = 'project' AND pp.id = tt.tr_proj_id
left join time_tasks as ta on tt.tr_type = 'task' AND ta.id = tt.tr_proj_id
WHERE tt.tr_min > 0
ORDER BY tt.tr_proj_id DESC,tt.tr_day ASC
I've used left join, which gives you a row from the main table even if one doesn't exist for the join (you get nulls from columns in the joined table if there's no join)
A key point here, that many SQL programmers do not realise, is that the ON clause may contain any conditions, even ones not from the joined table (as in this example). Many programmers assume that the conditions must be only those relating to the formal foreign key relationship.
Try this:
SELECT
tt.id,
CASE WHEN tt.tr_type = 'project' THEN pp.project_name
WHEN tt.tr_type = 'task' THEN ta.task_name END as name,
tt.tr_proj_id,
tt.tr_type,
tt.tr_min
FROM time_tracking as tt
left join time_projects as pp on pp.id = tt.tr_proj_id
left join time_tasks as ta on ta.id = tt.tr_proj_id
WHERE tt.tr_min > 0
ORDER BY tt.tr_proj_id DESC
Perform a union of two joins:
select tt.id, tp.project_name name, tt.tr_proj_id, tt.tr_min, tt.tr_type
from time_tracking tt
inner join time_projects tp on tp.id = tt.tr_proj_id
where tt.tr_type = 'project'
union all
select tt.id, tk.task_name name, tt.tr_proj_id, tt.tr_min, tt.tr_type
from time_tracking tt
inner join time_tasks tk on tk.id = tt.tr_proj_id
where tt.tr_type = 'task'
That will give you the exact table results you want
SELECT
time_tracking.id,
time_tracking.tr_min,
time_tracking.tr_type,
coalesce(time_projects.project_name, time_tasks.task_name) as name
FROM time_tracking
LEFT OUTER JOIN time_projects on time_projects.id = time_tracking.tr_proj_id AND time_tracking.tr_type = 'project'
LEFT OUTER JOIN time_tasks on time_tasks.id = time_tracking.tr_proj_id AND time_tracking.tr_type = 'task'
WHERE time_tracking.tr_min > 0
ORDER BY time_tracking.id DESC -- ...
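COALESCE is standard SQL and is available in SQL Server, MySQL, PostgreSQL and others; vendor-specific near-equivalents include ISNULL and IFNULL. In Access, Nz() or IIf() play a similar role.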
The idea is you join to the tables and if the join fails, you'll get NULL where the join failed. Then you use COALESCE to pick out the successful join value.
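
The join-then-COALESCE idea can be tried on the sample data from the question. This is a sketch using Python's sqlite3 module, so it illustrates the join logic only; Access syntax differs, as noted in the question's update.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_tracking (id INTEGER, tr_proj_id INTEGER,
                            tr_min INTEGER, tr_type TEXT);
CREATE TABLE time_projects (id INTEGER, project_name TEXT);
CREATE TABLE time_tasks (id INTEGER, task_name TEXT);
INSERT INTO time_tracking VALUES
  (1, 3, 60, 'project'), (2, 3, 360, 'task'), (3, 1, 120, 'project'),
  (4, 2, 30, 'project'), (5, 2, 30, 'task'), (6, 1, 90, 'task');
INSERT INTO time_projects VALUES
  (1, 'Make someone happy'), (2, 'Start a project'),
  (3, 'Jump out of the window');
INSERT INTO time_tasks VALUES
  (1, 'drink a beer'), (2, 'drink a second beer'), (3, 'drink more');
""")

# Each LEFT JOIN only matches rows of its own tr_type, so exactly one of
# project_name / task_name is non-NULL per row and COALESCE picks it.
rows = conn.execute("""
SELECT tt.id,
       COALESCE(pp.project_name, ta.task_name) AS name,
       tt.tr_proj_id, tt.tr_min, tt.tr_type
FROM time_tracking AS tt
LEFT JOIN time_projects AS pp
       ON tt.tr_type = 'project' AND pp.id = tt.tr_proj_id
LEFT JOIN time_tasks AS ta
       ON tt.tr_type = 'task' AND ta.id = tt.tr_proj_id
WHERE tt.tr_min > 0
ORDER BY tt.id
""").fetchall()

for row in rows:
    print(row)
```

Ordered by id, the rows match the desired output table from the question.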

Counting rows in multiple tables

I have a MySQL database that is tracking hockey stats. What I'd like to do is get, in one query, the number of goals and assists scored by each player as well as the number of games that they've played in. I'm using Zend Framework and the query that I've built is this:
SELECT `p`.*,
`pxt`.`jersey_number`,
count(pxg.player_x_game_id) AS `games`,
count(goals.scoring_id) AS `goals`,
count(assists.scoring_id) AS `assists`
FROM `players` AS `p`
INNER JOIN `players_x_teams` AS `pxt` ON p.player_id = pxt.player_id
INNER JOIN `teams_x_seasons` AS `txs` ON pxt.team_id = txs.team_id
INNER JOIN `seasons` AS `s` ON txs.season_id = s.season_id
INNER JOIN `games` AS `g` ON g.season_id = s.season_id
INNER JOIN `players_x_games` AS `pxg` ON pxg.game_id = g.game_id
AND pxg.player_id = p.player_id
LEFT JOIN `scoring` AS `goals` ON goals.game_id = g.game_id
AND goals.scorer_id = p.player_id
LEFT JOIN `scoring` AS `assists` ON assists.game_id = g.game_id
AND (assists.assist1_id = p.player_id OR assists.assist2_id = p.player_id)
WHERE (pxt.team_id = 1)
AND (txs.season_id = '23')
AND (pxt.date_added <= s.end_date OR pxt.date_added is null)
AND (pxt.date_removed >= s.start_date OR pxt.date_removed is null)
GROUP BY `p`.`player_id`
This query returns me data, but my counts are off.
+-----------+---------------+-------+-------+---------+
| player_id | jersey_number | games | goals | assists |
+-----------+---------------+-------+-------+---------+
| 2 | 3 | 7 | 1 | 3 |
| 3 | 19 | 6 | 1 | 0 |
| 8 | 8 | 7 | 3 | 2 |
| 9 | 11 | 13 | 10 | 8 |
| 11 | 96 | 6 | 1 | 3 |
| 12 | 14 | 6 | 0 | 3 |
| 13 | 7 | 6 | 0 | 1 |
| 115 | 39 | 9 | 6 | 2 |
| 142 | 68 | 6 | 0 | 1 |
| 143 | 30 | 6 | 0 | 0 |
| 150 | 41 | 11 | 11 | 5 |
| 185 | 17 | 6 | 6 | 3 |
| 225 | 97 | 4 | 1 | 3 |
+-----------+---------------+-------+-------+---------+
In this dataset the most games that should be present are 6, but as you can see I'm getting extras. If I adjust my query to remove the goals and assists fields my games count comes out correct. In fact if I only select one of my counted rows I always get the correct counts, but once I add a second or third count my numbers start to get skewed. What am I doing wrong?
Since you are doing multiple joins which may each match multiple rows and carry over to the next join, you'll need to add distinct in your count. Try this:
SELECT `p`.*,
`pxt`.`jersey_number`,
count(distinct pxg.player_x_game_id) AS `games`,
count(distinct goals.scoring_id) AS `goals`,
count(distinct assists.scoring_id) AS `assists`
FROM `players` AS `p`
INNER JOIN `players_x_teams` AS `pxt` ON p.player_id = pxt.player_id
INNER JOIN `teams_x_seasons` AS `txs` ON pxt.team_id = txs.team_id
INNER JOIN `seasons` AS `s` ON txs.season_id = s.season_id
INNER JOIN `games` AS `g` ON g.season_id = s.season_id
INNER JOIN `players_x_games` AS `pxg` ON pxg.game_id = g.game_id
AND pxg.player_id = p.player_id
LEFT JOIN `scoring` AS `goals` ON goals.game_id = g.game_id
AND goals.scorer_id = p.player_id
LEFT JOIN `scoring` AS `assists` ON assists.game_id = g.game_id
AND (assists.assist1_id = p.player_id OR assists.assist2_id = p.player_id)
WHERE (pxt.team_id = 1)
AND (txs.season_id = '23')
AND (pxt.date_added <= s.end_date OR pxt.date_added is null)
AND (pxt.date_removed >= s.start_date OR pxt.date_removed is null)
GROUP BY `p`.`player_id`
Maybe you need count(DISTINCT pxg.player_x_game_id)...? Looks like there might be duplicates in that humungous megajoin (which I admit I haven't actually taken time to fully reproduce!-)...
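
The inflation is easy to reproduce on a cut-down schema. The sketch below uses Python's sqlite3 module with only the players_x_games and scoring tables; the rows are invented for illustration (player 9 plays two games and has two goals and two assists in the first one), and the other tables from the question are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE players_x_games (player_x_game_id INTEGER, player_id INTEGER,
                              game_id INTEGER);
CREATE TABLE scoring (scoring_id INTEGER, game_id INTEGER, scorer_id INTEGER,
                      assist1_id INTEGER, assist2_id INTEGER);
-- Hypothetical data: player 9 appears in games 1 and 2.
INSERT INTO players_x_games VALUES (10, 9, 1), (11, 9, 2);
INSERT INTO scoring VALUES
  (1, 1, 9, NULL, NULL),  -- goal by 9
  (2, 1, 9, 7, NULL),     -- goal by 9
  (3, 1, 7, 9, NULL),     -- assist by 9
  (4, 1, 7, 8, 9);        -- assist by 9
""")

# Game 1 joins to 2 goals x 2 assists = 4 rows and game 2 to 1 row, so a
# plain COUNT sees 5 rows for 2 games; COUNT(DISTINCT ...) removes the repeats.
counts = conn.execute("""
SELECT COUNT(pxg.player_x_game_id),
       COUNT(goals.scoring_id),
       COUNT(assists.scoring_id),
       COUNT(DISTINCT pxg.player_x_game_id),
       COUNT(DISTINCT goals.scoring_id),
       COUNT(DISTINCT assists.scoring_id)
FROM players_x_games AS pxg
LEFT JOIN scoring AS goals
       ON goals.game_id = pxg.game_id AND goals.scorer_id = pxg.player_id
LEFT JOIN scoring AS assists
       ON assists.game_id = pxg.game_id
      AND (assists.assist1_id = pxg.player_id
           OR assists.assist2_id = pxg.player_id)
WHERE pxg.player_id = 9
""").fetchone()

print(counts)  # (5, 4, 4, 2, 2, 2)
```

The first three numbers are the skewed counts and the last three are the correct ones, which matches the observation that each count is right on its own and only goes wrong once a second one-to-many join is added.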