SQL aggregation of columns with common value in another column

I'm stuck on a query I'm writing in PostgreSQL.
I have these two tables:
PLAYER
playerID title
1 Rondo
2 Allen
3 Pierce
4 Garnett
5 Perkins
PLAYS
playerID TeamID
1 1
1 2
1 3
2 1
2 3
3 1
3 3
and this is my query:
SELECT DISTINCT concat(N.playerID, ':', N.title), TID
FROM player N
INNER JOIN (
SELECT DISTINCT P.playerID as PID, teamID as TID
FROM plays P
) AS derivedTable
ON N.playerID = PID
ORDER BY concat
the result of the query is:
"1:Rondo" | 1
"1:Rondo" | 2
"1:Rondo" | 3
"2:Allen" | 1
"2:Allen" | 3
"3:Pierce" | 1
"3:Pierce" | 3
but I want something like this:
"1:Rondo" | 1, 2, 3
"2:Allen" | 1, 3
"3:Pierce" | 1, 3
I could use array_agg, but I don't know how.

Use string_agg()
SELECT concat(N.playerID, ':', N.title),
string_agg(p.TeamID::text, ',') as teamid_list
FROM player N
JOIN plays p ON n.playerID = p.playerID
GROUP BY n.playerID, n.title
ORDER BY 1;
Your derived table is not necessary (and the distinct even more so)
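To try the aggregation without a PostgreSQL server, here is a minimal sketch using Python's built-in sqlite3 module. It rests on assumptions: SQLite's group_concat() stands in for string_agg(), the || operator stands in for concat(), and the table definitions are guessed from the sample data in the question. SQLite does not promise an order inside group_concat(), so the sketch sorts the team IDs in Python.

```python
import sqlite3

# In-memory database; the schema is an assumption based on the question's sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE player (playerID INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE plays (playerID INTEGER, teamID INTEGER);
INSERT INTO player VALUES (1,'Rondo'),(2,'Allen'),(3,'Pierce'),(4,'Garnett'),(5,'Perkins');
INSERT INTO plays VALUES (1,1),(1,2),(1,3),(2,1),(2,3),(3,1),(3,3);
""")

# One row per player; group_concat() collapses that player's team IDs into one string.
rows = conn.execute("""
    SELECT n.playerID || ':' || n.title AS player,
           group_concat(p.teamID, ', ') AS teamid_list
    FROM player n
    JOIN plays p ON n.playerID = p.playerID
    GROUP BY n.playerID, n.title
    ORDER BY n.playerID
""").fetchall()

# Split and sort in Python because the concatenation order is not guaranteed.
teams_by_player = {
    player: sorted(int(t) for t in teams.split(", "))
    for player, teams in rows
}

for player, teams in teams_by_player.items():
    print(player, "|", ", ".join(str(t) for t in teams))
```

Because this uses an inner join, players without any row in plays (Garnett and Perkins here) do not appear in the result.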

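In Postgres it should be:
SELECT concat(N.playerID, ':', N.title) title, string_agg(P.teamID::text, ', ') TID
FROM player N
LEFT JOIN plays P ON N.playerID = P.playerID
GROUP BY 1
ORDER BY 1
Note that the LEFT JOIN also returns players with no teams (Garnett and Perkins), with a NULL team list.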

For MySQL, try this:
SELECT CONCAT(N.playerID, ':', N.title) playerTitle,
GROUP_CONCAT(P.teamID SEPARATOR ', ') TID
FROM player N
LEFT JOIN plays P ON N.playerID = P.playerID
GROUP BY N.playerID, N.title
ORDER BY playerTitle

Related

SQL two table query

I have 2 tables:
Players
ID Name
1 John
2 Maya
3 Carl
Results
ID Player_ID Result
1 1 250
2 1 300
3 2 100
4 2 350
5 3 500
I want to select all the names from the table Players and the top scores of each person.
What I have so far:
SELECT Players.Name, max(Results.Result)
FROM Players JOIN Results
WHERE Players.ID = Results.Player_ID
But this only selects
| Carl | 500 |
and I want
| John | 300 |
| Maya | 350 |
| Carl | 500 |

Add a condition on the result: it must be the highest (max) for that player ID.
Try this:
SELECT p.Name, r.result
FROM Players p
JOIN Results r ON p.ID = r.Player_ID
WHERE r.result = (SELECT max(rr.result) FROM Results rr WHERE rr.Player_ID = p.ID)

You need to add GROUP BY Players.ID, Players.Name to your query. I included Players.ID in case two players have the same name:
SELECT Players.Name, max(Results.Result)
FROM Players JOIN Results
ON Players.ID = Results.Player_ID
GROUP BY Players.ID, Players.Name
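The grouped query can be checked against the sample data with a small sketch using Python's built-in sqlite3 module. The column types are assumptions, since the question only shows the data, and the ORDER BY is added here only to make the output order predictable.

```python
import sqlite3

# In-memory database; column types are assumed from the sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Players (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Results (ID INTEGER PRIMARY KEY, Player_ID INTEGER, Result INTEGER);
INSERT INTO Players VALUES (1,'John'),(2,'Maya'),(3,'Carl');
INSERT INTO Results VALUES (1,1,250),(2,1,300),(3,2,100),(4,2,350),(5,3,500);
""")

# Grouping by player makes MAX() run once per player instead of once overall.
top_scores = conn.execute("""
    SELECT Players.Name, MAX(Results.Result)
    FROM Players
    JOIN Results ON Players.ID = Results.Player_ID
    GROUP BY Players.ID, Players.Name
    ORDER BY Players.ID
""").fetchall()

for name, best in top_scores:
    print(name, best)
```

Without the GROUP BY, the aggregate collapses the whole join into a single row, which is why the original query returned only Carl with 500.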

SQL split one column into two columns based on values and use columns

Table: ProductionOrder
Id Ordernumber Lotsize
1 Order1 50
2 Order 2 75
3 WO-order1 1
4 WO-order2 1
Table: history
Id ProductionOrderID Completed
1 3 1
2 3 1
3 4 1
4 4 1
Table: ProductionOrderDetail
ID ProductionOrderID ProductionOrderDetailDefID Content
1 1 16 50
2 1 17 7-1-2018
3 2 16 75
4 2 17 7-6-2018
Start of my code:
Select p.ID, p.OrderNumber,
Case productionOrderDetailDefID
Where(Select pd1.productionOrderDetailDefID where ProductionOrderDetialDefID = 16) then min(pd1.content)
from ProductionOrder p
Left join History h1 on p.id = h1.productionOrderID
Left Join ProductionOrderDetail pd1 on p.ID = ProductionOrderID
The result I'm trying to get is:
Id Ordernumber Lotsize Productionorder Completed
1 Order1 50 WO-order1 2
2 Order 2 75 WO-order2 2
Any help would be appreciated.

Try this
SELECT ordernumber,lotsize,Ordernumber,count(Ordernumberid)
FROM productionorder inner join history on productionorder.id = history.Ordernumberid
GROUP BY Ordernumber;

The joins here are a bit odd. You should put this on SQL Fiddle so that we can check our work more easily.
A link to SQL fiddle: http://sqlfiddle.com/
Here is my first attempt
SELECT
po.id
, po.ordernumber
, po.lotsize
, po2.productionorder
, SUM(h.completed)
FROM productionorder as po
INNER JOIN history as h
ON h.id = po.id
INNER JOIN productionorder as po2
ON po2.ordernumberid = h.ordernumberid
WHERE po.id NOT IN ( SELECT ordernumberid FROM history )
GROUP BY
po.id
, po.ordernumber
, po.lotsize
, po2.productionorder
How far does that get you?

Select all categories with COUNT of sub-categories

I need to select all categories with count of its sub-categories.
Assume here are my tables:
categories
id | title
----------
1 | colors
2 | animals
3 | plants
sub_categories
id | category_id | title | confirmed
------------------------------------
1 1 red 1
2 1 blue 1
3 1 pink 1
4 2 cat 1
5 2 tiger 0
6 2 lion 0
What I want is :
id | title | count
------------------
1 colors 3
2 animals 1
3 plants 0
What I have tried so far:
SELECT c.id, c.title, count(s.category_id) as count from categories c
LEFT JOIN sub_categories s on c.id = s.category_id
WHERE c.confirmed = 't' AND s.confirmed='t'
GROUP BY c.id, c.title
ORDER BY count DESC
The only problem with this query is that this query does not show categories with 0 sub categories!
You also can check that on SqlFiddle
Any help would be greatly appreciated.

The reason you don't get rows with zero counts is that the WHERE clause requires s.confirmed to be 't', which eliminates the rows with NULLs that the outer join produces.
Move the s.confirmed check into the join condition to fix this problem:
SELECT c.id, c.title, count(s.category_id) as count from categories c
LEFT JOIN sub_categories s on c.id = s.category_id AND s.confirmed='t'
WHERE c.confirmed = 't'
GROUP BY c.id, c.title
ORDER BY count DESC
Adding Sql Fiddle: http://sqlfiddle.com/#!17/83add/13
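The difference between filtering in WHERE and filtering in the join condition can be seen side by side in a small sketch using Python's built-in sqlite3 module. Two assumptions are made: confirmed is stored as 1/0 as in the sample rows, and the c.confirmed filter is left out because the categories table as shown has no such column.

```python
import sqlite3

# In-memory database built from the question's sample rows (types are assumed).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE sub_categories (
    id INTEGER PRIMARY KEY, category_id INTEGER, title TEXT, confirmed INTEGER
);
INSERT INTO categories VALUES (1,'colors'),(2,'animals'),(3,'plants');
INSERT INTO sub_categories VALUES
    (1,1,'red',1),(2,1,'blue',1),(3,1,'pink',1),
    (4,2,'cat',1),(5,2,'tiger',0),(6,2,'lion',0);
""")

# Filter in WHERE: the NULL-extended row for 'plants' fails the test and is dropped.
filter_in_where = conn.execute("""
    SELECT c.id, c.title, COUNT(s.category_id) AS cnt
    FROM categories c
    LEFT JOIN sub_categories s ON c.id = s.category_id
    WHERE s.confirmed = 1
    GROUP BY c.id, c.title
    ORDER BY c.id
""").fetchall()

# Filter in ON: unmatched categories survive the join and COUNT() yields 0 for them.
filter_in_join = conn.execute("""
    SELECT c.id, c.title, COUNT(s.category_id) AS cnt
    FROM categories c
    LEFT JOIN sub_categories s ON c.id = s.category_id AND s.confirmed = 1
    GROUP BY c.id, c.title
    ORDER BY c.id
""").fetchall()

print("WHERE:", filter_in_where)
print("ON:   ", filter_in_join)
```

COUNT(s.category_id) counts only non-NULL values, which is what turns the unmatched 'plants' row into a count of 0 rather than 1.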

I think you can try this too (it makes explicit which column(s) you are really grouping by):
SELECT c.id, c.title, COALESCE(s.RC, 0) AS RC
from categories c
LEFT JOIN (SELECT category_id, COUNT(*) AS RC
FROM sub_categories
WHERE confirmed= 't'
GROUP BY category_id) s on c.id = s.category_id
WHERE c.confirmed = 't'
ORDER BY RC DESC

Selecting objects that are associated with similar datasets

I'm trying to select all rows from a [Company] table for companies that share, with at least one other company, the same number of employees (from an [Employee] table that has a CompanyId column), where the two groups of employees have the same set of LocationIds (a column in the [Employee] table) in the same proportions.
So, for instance, two companies with three employees each that have the locationIds 1,2, and 2, would be selected by this query.
[Employee]
EmployeeId | CompanyId | LocationId |
========================================
1 | 1 | 1
2 | 1 | 2
3 | 1 | 2
4 | 2 | 1
5 | 2 | 2
6 | 2 | 2
7 | 3 | 3
[Company]
CompanyId |
============
1 |
2 |
3 |
Returns the CompanyIds:
======================
1
2
CompanyIds 1 and 2 are selected because they share in common with at least one other company: 1. the number of employees (3 employees); and 2. the number/proportion of LocationIds associated with those employees (1 employee has LocationId 1 and 2 employees have LocationId 2).
So far I think I want to use a HAVING COUNT(?) > 1 statement, but I'm having trouble working out the details. Does anyone have any suggestions?

This is ugly, but the only way I can think of to do it:
;with CTE as (
select c.Id,
(
select e.Location, count(e.Id) [EmployeeCount]
from Employee e
where e.IdCompany=c.Id
group by e.Location
order by e.Location
for xml auto
) LocationEmployeeData
from Company c
)
select c.Id
from Company c
join (
select x.LocationEmployeeData, count(x.Id) [CompanyCount]
from CTE x
group by x.LocationEmployeeData
having count(x.Id) >= 2
) y on y.LocationEmployeeData = (select LocationEmployeeData from CTE where Id = c.Id)
See fiddle: http://www.sqlfiddle.com/#!6/6bc16/5
It works by encoding the Employee count per Location data (multiple rows) into an xml string for each Company.
The CTE code on its own:
select c.Id,
(
select e.Location, count(e.Id) [EmployeeCount]
from Employee e
where e.IdCompany=c.Id
group by e.Location
order by e.Location
for xml auto
) LocationEmployeeData
from Company c
Produces data like:
Id LocationEmployeeData
1 <e Location="1" EmployeeCount="2"/><e Location="2" EmployeeCount="1"/>
2 <e Location="1" EmployeeCount="2"/><e Location="2" EmployeeCount="1"/>
3 <e Location="3" EmployeeCount="1"/>
Then it compares companies based on this string (rather than trying to ascertain whether multiple rows match, etc).
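The same idea, reducing each company to one comparable "signature" and then grouping on it, can be sketched outside SQL. The Python below is only an illustration of the technique, not a replacement for the query: it uses the sample rows from the question, and the variable names are made up for this example.

```python
from collections import Counter, defaultdict

# (EmployeeId, CompanyId, LocationId) rows taken from the question's sample data.
employees = [
    (1, 1, 1), (2, 1, 2), (3, 1, 2),
    (4, 2, 1), (5, 2, 2), (6, 2, 2),
    (7, 3, 3),
]

# Step 1: count employees per location for every company.
location_counts = defaultdict(Counter)
for _employee_id, company_id, location_id in employees:
    location_counts[company_id][location_id] += 1

# Step 2: turn each company's counts into a sorted, hashable signature.
# This plays the role of the XML string in the SQL version.
companies_by_signature = defaultdict(list)
for company_id, counts in location_counts.items():
    signature = tuple(sorted(counts.items()))
    companies_by_signature[signature].append(company_id)

# Step 3: keep only companies whose signature is shared with at least one other.
matching = sorted(
    company_id
    for group in companies_by_signature.values()
    if len(group) >= 2
    for company_id in group
)

print(matching)
```

Sorting the (location, count) pairs matters for the same reason the SQL version orders by Location: two companies with identical data must produce identical signatures.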

An alternative solution could look like this. However, it needs performance testing first (I don't feel quite confident about the <> join).
with List as
(
select
IdCompany,
Location,
row_number() over (partition by IdCompany order by Location) as RowId,
count(1) over (partition by IdCompany) as LocCount
from
Employee
)
select
A.IdCompany
from List as A
inner join List as B on A.IdCompany <> B.IdCompany
and A.RowID = B.RowID
and A.LocCount = B.LocCount
group by
A.IdCompany, A.LocCount
having
sum(case when A.Location = B.Location then 1 else 0 end) = A.LocCount
Related fiddle: http://sqlfiddle.com/#!6/d9f2e/1

aggregation with conditionals?

I'm trying to aggregate depending on whether player_id (Gary)
has a greater, equal, or lower score than player_id ("other").
My schema has:
players(player_id, name)
matches(match_id, home_team(player_id), away_team(player_id))
outcomes(outcome_id, match_id, home_score:integer, away_score:integer)
Output from:
select m.match_id, p.name AS home_team, p1.name AS away_team, o.home_score, o.away_score
from players p
inner join matches m on (p.player_id = m.home_team)
inner join players p1 on (p1.player_id = m.away_team)
inner join outcomes o on (m.match_id = o.match_id);
match_id | player_id | player_id | home_score | away_score
----------+-----------+-----------+------------+------------
1 | 1 | 2 | 1 | 2
2 | 2 | 1 | 1 | 3
3 | 3 | 1 | 3 | 2
Wanted output:
player_id | Wins | Draws | Losses
-------------+------+-------+--------
1 | 1 | 0 | 2
2 ... | ... | .. | ...
My schema is open to alteration.
EDIT(sqlfiddle): http://www.sqlfiddle.com/#!2/7b6c8/1

I would use UNION ALL to get every outcome twice, once for home and once for away player. The second time home_score/away_score should be switched, to get correct sums for away player.
select
d.player_id,
d.name,
sum(d.home_score > d.away_score) as wins,
sum(d.home_score = d.away_score) as draws,
sum(d.home_score < d.away_score) as loses
from (
select p.player_id, p.name, o.home_score, o.away_score
from players p
join matches m on p.player_id = m.home_team
join outcomes o on o.match_id = m.match_id
union all
select p.player_id, p.name, o.away_score as home_score, o.home_score as away_score
from players p
join matches m on p.player_id = m.away_team
join outcomes o on o.match_id = m.match_id) d
group by d.player_id, d.name
Returns:
PLAYER_ID NAME WINS DRAWS LOSES
1 Gary 1 0 2
2 Tom 1 0 1
3 Brad 1 0 0
sqlFiddle demo: http://www.sqlfiddle.com/#!2/7b6c8/21
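The UNION ALL approach can be reproduced locally with a sketch using Python's built-in sqlite3 module. The table definitions are assumptions based on the schema in the question, the rows are reconstructed from the outputs shown, and the column aliases own_score and opp_score are names chosen for this example. It relies on comparisons evaluating to 1 or 0, which holds in SQLite and MySQL but not in every database.

```python
import sqlite3

# In-memory database; rows are reconstructed from the outputs shown in the thread.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE players (player_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE matches (match_id INTEGER PRIMARY KEY, home_team INTEGER, away_team INTEGER);
CREATE TABLE outcomes (
    outcome_id INTEGER PRIMARY KEY, match_id INTEGER,
    home_score INTEGER, away_score INTEGER
);
INSERT INTO players VALUES (1,'Gary'),(2,'Tom'),(3,'Brad');
INSERT INTO matches VALUES (1,1,2),(2,2,1),(3,3,1);
INSERT INTO outcomes VALUES (1,1,1,2),(2,2,1,3),(3,3,3,2);
""")

# Each outcome appears twice: once from the home player's view and once from the
# away player's view with the scores swapped, so one set of comparisons covers both.
standings = conn.execute("""
    SELECT d.player_id, d.name,
           SUM(d.own_score > d.opp_score) AS wins,
           SUM(d.own_score = d.opp_score) AS draws,
           SUM(d.own_score < d.opp_score) AS losses
    FROM (
        SELECT p.player_id, p.name,
               o.home_score AS own_score, o.away_score AS opp_score
        FROM players p
        JOIN matches m ON p.player_id = m.home_team
        JOIN outcomes o ON o.match_id = m.match_id
        UNION ALL
        SELECT p.player_id, p.name, o.away_score, o.home_score
        FROM players p
        JOIN matches m ON p.player_id = m.away_team
        JOIN outcomes o ON o.match_id = m.match_id
    ) d
    GROUP BY d.player_id, d.name
    ORDER BY d.player_id
""").fetchall()

for row in standings:
    print(row)
```

In databases where a comparison cannot be summed directly, the usual substitute is SUM(CASE WHEN ... THEN 1 ELSE 0 END).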

For a solution without a subquery and unions: http://www.sqlfiddle.com/#!2/7b6c8/31
SELECT
p.player_id,
COALESCE(SUM(o1.home_score > o1.away_score or o2.home_score < o2.away_score), 0) wins,
COALESCE(SUM(o1.home_score = o1.away_score or o2.home_score = o2.away_score), 0) draws,
COALESCE(SUM(o1.home_score < o1.away_score or o2.home_score > o2.away_score), 0) losses
FROM players p
LEFT JOIN matches m1 ON (p.player_id = m1.home_team)
LEFT JOIN players p1 ON (p1.player_id = m1.away_team)
LEFT JOIN outcomes o1 ON (m1.match_id = o1.match_id)
LEFT JOIN matches m2 ON (p.player_id = m2.away_team)
LEFT JOIN players p2 ON (p2.player_id = m2.home_team)
LEFT JOIN outcomes o2 ON (m2.match_id = o2.match_id)
GROUP BY p.player_id
Results:
PLAYER_ID WINS DRAWS LOSSES
1 1 0 2
2 1 0 1
3 1 0 0
4 0 0 0
5 0 0 0
