PostgreSQL group by on different columns - sql

I have a table like the following:
branch_id  parent_branch_id  sales
1          null              10
2          1                 5
3          null              8
4          3                 3
5          1                 1
6          null              6
I need to aggregate sales so that each parent branch's total includes the sales of its child branches:
branch_id  sales
1          16
3          11
6          6
What I have tried:
I tried first separating the branches which have a parent branch and aggregating them on their parent_branch_id, then joining this aggregated table with the table of the parent only branches on t1.parent_branch_id=t2.branch_id and summing the resulting columns to get total sales.
I feel that joining tables like this is costly, and there may be a smarter way to do it with built-in PostgreSQL functions.

You don't need a join for this:
select coalesce(parent_branch_id, branch_id) as branch_id,
sum(sales)
from t
group by coalesce(parent_branch_id, branch_id)
order by branch_id;
Here is a db<>fiddle.
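A minimal sketch to check the query against the sample data from the question, using Python's built-in sqlite3 module. SQLite stands in for PostgreSQL here, which is an assumption made only so the example runs without a server; the table name t follows the answer above, and null parent_branch_id values are assumed for the top-level branches.

```python
import sqlite3

# SQLite is a stand-in for PostgreSQL (assumption); COALESCE and GROUP BY
# behave the same way for this query.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (branch_id INTEGER, parent_branch_id INTEGER, sales INTEGER)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, None, 10), (2, 1, 5), (3, None, 8), (4, 3, 3), (5, 1, 1), (6, None, 6)],
)

# A child row is grouped under its parent; a top-level row (null parent)
# is grouped under its own branch_id.
rows = conn.execute(
    """
    SELECT COALESCE(parent_branch_id, branch_id) AS branch_id,
           SUM(sales) AS sales
    FROM t
    GROUP BY COALESCE(parent_branch_id, branch_id)
    ORDER BY 1
    """
).fetchall()

print(rows)  # [(1, 16), (3, 11), (6, 6)]
```

This only covers two levels (parents and immediate children), the same as the query it checks.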

Assuming that we only need to consider two levels, parents and immediate children, we can try a self join approach here:
SELECT COALESCE(t1.parent_branch_id, t1.branch_id) AS branch_id,
SUM(t1.sales) AS sales
FROM yourTable t1
LEFT JOIN yourTable t2
ON t2.branch_id = t1.parent_branch_id
GROUP BY
COALESCE(t1.parent_branch_id, t1.branch_id);

Related

Inner join + group by - select common columns and aggregate functions

Let's say I have two tables:
Customer
---
Id Name
1 Foo
2 Bar
and
CustomerPurchase
---
CustomerId, Amount, AmountVAT, Accountable(bit)
1 10 11 1
1 20 22 0
2 5 6 0
2 2 3 0
I need a single record for every joined and grouped Customer and CustomerPurchase group.
Every record would contain:
- columns from table Customer
- some aggregate functions like SUM
- a 'calculated' column, for example the difference of other columns
- the result of a subquery on the CustomerPurchase table
An example of the result I would like to get:
CustomerPurchases
---
Name Total TotalVAT VAT TotalAccountable
Foo 30 33 3 10
Bar 7 9 2 0
I was able to get a single row only by grouping by all the common columns, which I don't think is the right way to do it. Plus, I have no idea how to do the 'VAT' column and the 'TotalAccountable' column, which filters out only certain rows of CustomerPurchase and then runs some kind of aggregate function on the result. The following example doesn't work, of course, but I wanted to show what I would like to achieve:
select C.Name,
SUM(CP.Amount) as 'Total',
SUM(CP.AmountVAT) as 'TotalVAT',
diff? as 'VAT',
subquery? as 'TotalAccountable'
from Customer C
inner join CustomerPurchase CR
on C.Id = CR.CustomerId
group by C.Id
I would suggest you just need the following slight changes to your query. For clarity, if you can, I would also consider using the terms net and gross, which are typical for prices excluding and including VAT. Note the else 0 in the last aggregate: without it, a customer with no accountable purchases gets NULL instead of 0.
select c.[Name],
Sum(cp.Amount) as Total,
Sum(cp.AmountVAT) as TotalVAT,
Sum(cp.AmountVAT) - Sum(cp.Amount) as VAT,
Sum(case when cp.Accountable = 1 then cp.Amount else 0 end) as TotalAccountable
from Customer c
join CustomerPurchase cp on cp.CustomerId = c.Id
group by c.[Name];
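Here is a rough way to try the conditional aggregate on the sample rows, sketched with Python's sqlite3 module (SQLite standing in for SQL Server is an assumption, so the bracketed identifiers are dropped). Two choices in the sketch are mine rather than the asker's: ELSE 0 is used so that a customer with no accountable purchases shows 0 rather than NULL, and the grouping includes c.Id so that two customers sharing a name would not be merged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Customer (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE CustomerPurchase (
        CustomerId INTEGER, Amount INTEGER, AmountVAT INTEGER, Accountable INTEGER
    );
    INSERT INTO Customer VALUES (1, 'Foo'), (2, 'Bar');
    INSERT INTO CustomerPurchase VALUES
        (1, 10, 11, 1), (1, 20, 22, 0), (2, 5, 6, 0), (2, 2, 3, 0);
    """
)

rows = conn.execute(
    """
    SELECT c.Name,
           SUM(cp.Amount)                      AS Total,
           SUM(cp.AmountVAT)                   AS TotalVAT,
           SUM(cp.AmountVAT) - SUM(cp.Amount)  AS VAT,
           -- ELSE 0 (assumption): show 0, not NULL, when nothing is accountable
           SUM(CASE WHEN cp.Accountable = 1 THEN cp.Amount ELSE 0 END)
                                               AS TotalAccountable
    FROM Customer c
    JOIN CustomerPurchase cp ON cp.CustomerId = c.Id
    GROUP BY c.Id, c.Name
    ORDER BY c.Id
    """
).fetchall()

print(rows)  # [('Foo', 30, 33, 3, 10), ('Bar', 7, 9, 2, 0)]
```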

How to consecutively count everything greater than or equal to itself in SQL?

Say I have a table that contains an equipment ID for each piece of equipment, along with its equipment type and equipment age. How can I get a distinct count of equipment IDs that are at least a given equipment age?
For example, let's say this is all the data we have:
equipment_type  equipment_id  equipment_age
Screwdriver     A123          1
Screwdriver     A234          2
Screwdriver     A345          2
Screwdriver     A456          2
Screwdriver     A567          3
I would like the output to be:
equipment_type  equipment_age  count_of_equipment_at_least_this_age
Screwdriver     1              5
Screwdriver     2              4
Screwdriver     3              1
The reason is that there are 5 screwdrivers that are at least 1 day old, 4 screwdrivers at least 2 days old, and only 1 screwdriver at least 3 days old.
So far I have only been able to count the equipment that falls within each equipment_age (as in the query below), but not "at least that equipment_age".
SELECT
equipment_type,
equipment_age,
COUNT(DISTINCT equipment_id) as count_of_equipments
FROM equipment_table
GROUP BY 1, 2
Consider the join-less solution below:
select distinct
equipment_type,
equipment_age,
count(*) over equipment_at_least_this_age as count_of_equipment_at_least_this_age
from equipment_table
window equipment_at_least_this_age as (
partition by equipment_type
order by equipment_age
range between current row and unbounded following
)
Applied to the sample data in your question, this produces the expected output shown there.
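For anyone who wants to try the window frame locally, here is a small sketch using Python's sqlite3 module. SQLite (3.25 or newer, for window functions) stands in for the asker's database, which is an assumption; the window is renamed w for brevity. Note that count(*) counts rows, so this equals a distinct count of equipment_id only when each ID appears once, as in the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE equipment_table "
    "(equipment_type TEXT, equipment_id TEXT, equipment_age INTEGER)"
)
conn.executemany(
    "INSERT INTO equipment_table VALUES (?, ?, ?)",
    [
        ("Screwdriver", "A123", 1),
        ("Screwdriver", "A234", 2),
        ("Screwdriver", "A345", 2),
        ("Screwdriver", "A456", 2),
        ("Screwdriver", "A567", 3),
    ],
)

# With RANGE, CURRENT ROW includes all peers (rows with the same age),
# so every age-2 row sees the three age-2 rows plus the age-3 row.
rows = conn.execute(
    """
    SELECT DISTINCT
        equipment_type,
        equipment_age,
        COUNT(*) OVER w AS count_of_equipment_at_least_this_age
    FROM equipment_table
    WINDOW w AS (
        PARTITION BY equipment_type
        ORDER BY equipment_age
        RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
    )
    ORDER BY equipment_type, equipment_age
    """
).fetchall()

print(rows)  # [('Screwdriver', 1, 5), ('Screwdriver', 2, 4), ('Screwdriver', 3, 1)]
```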
Use a self join approach:
SELECT
e1.equipment_type,
e1.equipment_age,
COUNT(DISTINCT e2.equipment_id) AS count_of_equipments
FROM equipment_table e1
INNER JOIN equipment_table e2
ON e2.equipment_type = e1.equipment_type AND
e2.equipment_age >= e1.equipment_age
GROUP BY 1, 2
ORDER BY 1, 2;
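One thing to watch with the self join: when several rows share the same age, each of them joins to every row of equal or greater age, so a plain COUNT(*) over the join counts pairs rather than equipment. The sketch below (Python's sqlite3, with SQLite as an assumed stand-in for the real database) puts COUNT(*) and COUNT(DISTINCT e2.equipment_id) side by side on the sample data; the column aliases joined_rows and distinct_ids are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE equipment_table "
    "(equipment_type TEXT, equipment_id TEXT, equipment_age INTEGER)"
)
conn.executemany(
    "INSERT INTO equipment_table VALUES (?, ?, ?)",
    [
        ("Screwdriver", "A123", 1),
        ("Screwdriver", "A234", 2),
        ("Screwdriver", "A345", 2),
        ("Screwdriver", "A456", 2),
        ("Screwdriver", "A567", 3),
    ],
)

rows = conn.execute(
    """
    SELECT e1.equipment_type,
           e1.equipment_age,
           COUNT(*)                        AS joined_rows,   -- counts (e1, e2) pairs
           COUNT(DISTINCT e2.equipment_id) AS distinct_ids   -- counts equipment
    FROM equipment_table e1
    JOIN equipment_table e2
      ON e2.equipment_type = e1.equipment_type
     AND e2.equipment_age >= e1.equipment_age
    GROUP BY e1.equipment_type, e1.equipment_age
    ORDER BY e1.equipment_type, e1.equipment_age
    """
).fetchall()

print(rows)
# [('Screwdriver', 1, 5, 5), ('Screwdriver', 2, 12, 4), ('Screwdriver', 3, 1, 1)]
```

For age 2, the three age-2 rows each match four rows, giving 12 pairs but only 4 distinct IDs, which is the number the question asks for.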
GROUP BY restricts the scope of COUNT to the rows in the group, i.e. it will not let you reach other rows (rows with equipment_age greater than that of the current group). So you need a subquery or windowing functions to get those. One way:
SELECT
equipment_type,
equipment_age,
(Select COUNT(*)
from equipment_table cnt
where cnt.equipment_type = a.equipment_type
AND cnt.equipment_age >= a.equipment_age
) as count_of_equipments
FROM equipment_table a
GROUP BY 1, 2, 3
I am not sure if your environment supports this syntax, though. If not, let us know and we will find another way.

Optimal SQL to perform multiple aggregate functions with different group by fields

To simplify a complex query I am working on, I feel like solving this is key.
I have the following table:
id  city     Item
1   chicago  1
2   chicago  2
3   chicago  1
4   cedar    2
5   cedar    1
6   cedar    2
7   detroit  1
I am trying to find the ratio of number of rows grouped by city and item to the number of rows grouped by just the items for each and every unique city-item pair.
So I would like something like this:
City     Item  groupCityItemCount  groupItemCount  Ratio
chicago  1     2                   4               2/4
chicago  2     1                   3               1/3
cedar    1     1                   4               1/4
cedar    2     2                   3               2/3
detroit  1     1                   4               1/4
This is my current solution, but it's too slow.
Select city, item, (count(*) / (select count(*) from records t2 where t1.item=t2.item)) AS pen_ratio
From records t1
Group By city, item
I also replaced WHERE with GROUP BY and HAVING, but that is also slow.
Select city, item, (count(*) / (select count(*) from records t2 group by item having t1.item=t2.item)) AS pen_ratio
From records t1
Group By city, item
(Note: I have removed column3 and column4 from the solution for smaller code)
(Edit: Typo as pointed out by xQbert and MatBailie)
Is it slow because it's evaluating each row separately with the subquery in the select statement? It may be operating as a correlated subquery.
If that's the case it might be faster if you get the values out of a join and go from there -
Select city, t1.item, (COUNT(t1.item) / MAX(t2.it_count)) AS pen_ratio
from records t1
JOIN (SELECT item, count(item) AS it_count
FROM records
group by item) t2
ON t2.item = t1.item
GROUP BY city, t1.item
Updated some errors and included the fiddle based off the starting point from xQbert. I had to CAST as float in the fiddle, but you may not need to CAST and use the above query in yours depending on datatypes.
I believe this follows the intent of your original query.
https://dbfiddle.uk/?rdbms=postgres_13&fiddle=d77a715175159304b9192a16ad903347
You can approach it in two parts.
First, aggregate to the level you're interested in, as normal.
Then, use analytical functions to work out subtotals across your partitions (item, in your case).
WITH
aggregate AS
(
SELECT
city,
item,
COUNT(*) AS row_count
FROM
records
GROUP BY
city,
item
)
SELECT
city,
item,
row_count AS groupCityItemCount,
SUM(row_count) OVER (PARTITION BY item) AS groupItemCount
FROM
aggregate
Fiddle borrowed from xQbert
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=730146262267412522f6e27796151f43
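To get from the two counts to the ratio itself, one option is to divide inside the same SELECT. The sketch below uses Python's sqlite3 module on the sample rows (SQLite 3.25+ standing in for the real database is an assumption). The CTE is named agg and the output columns use snake_case aliases, both chosen for the illustration. Multiplying by 1.0 is there because dividing two integer counts would otherwise truncate to 0 in engines with integer division.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER, city TEXT, item INTEGER)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?)",
    [
        (1, "chicago", 1), (2, "chicago", 2), (3, "chicago", 1),
        (4, "cedar", 2), (5, "cedar", 1), (6, "cedar", 2),
        (7, "detroit", 1),
    ],
)

rows = conn.execute(
    """
    WITH agg AS (
        -- step 1: one row per (city, item) with its row count
        SELECT city, item, COUNT(*) AS row_count
        FROM records
        GROUP BY city, item
    )
    -- step 2: per-item subtotal via a window, then the ratio
    SELECT city,
           item,
           row_count                                AS group_city_item_count,
           SUM(row_count) OVER (PARTITION BY item)  AS group_item_count,
           row_count * 1.0
               / SUM(row_count) OVER (PARTITION BY item) AS ratio
    FROM agg
    ORDER BY city, item
    """
).fetchall()

for city, item, pair_count, item_count, ratio in rows:
    print(city, item, pair_count, item_count, round(ratio, 2))
```

The table is scanned once for the aggregate, and the window works on the already reduced rows, which is why this tends to be cheaper than a correlated subquery per group.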

How to list distinct column after a join and also count?

I have three tables:
team(ID, name)
goal(ID, team_ID, goalType_ID, date)
goalType(ID, name)
As you can see, team_ID is the ID of teams table, and goalType_ID is the ID of goalType table.
For all teams, I want to list the number of different types of goals that ever happened, 0 should appear if none.
We don't need to care about the goalType table, since we don't need the name of the type of goal, so I've gotten to the following code that only uses the first two tables:
SELECT team.ID, team.name, goal.goalType_ID
FROM team LEFT JOIN goal ON team.ID=goal.team_ID
This gives me a three-column table with the information I want, but I would like to count the number of DISTINCT goal types per team, GROUP BY team.ID or team.name, and keep it to three columns. If the result is null, it should show 0 (a team might not have scored any goals).
The resulting table looks something like this:
team.ID team.name goalsType.ID
1 Team_1 1
2 Team_2 2
2 Team_2 2
2 Team_2 2
3 Team_3 4
4 Team_4 null
5 Team_5 null
6 Team_6 1
6 Team_6 2
6 Team_6 4
6 Team_6 3
7 Team_7 5
7 Team_7 4
8 Team_8 null
I have tried a combination of GROUP BY, DISTINCT, and COUNT, but still can't get a result I want.
Maybe I'm going about this all wrong? Any help would be appreciated, Thanks.
EDIT:
Based on Gordon Linoff's answer, I tried doing:
SELECT DISTINCT team.name, COUNT(goal.goalType_ID)
FROM team LEFT JOIN goal ON team.ID=goal.team_ID
GROUP BY team.ID, team.name
and it will give me:
Name #0
Team_1 1
Team_2 3
Team_3 1
Team_4 0
Team_5 0
Team_6 4
Team_7 1
Team_8 0
If I try to use "DISTINCT team.ID, DISTINCT team.name", it will error out.
Is this what you want?
SELECT team.ID, team.name, count(distinct goal.goalType_ID) as NumGoalTypes
FROM team LEFT JOIN
goal
ON team.ID = goal.team_ID
GROUP BY team.ID, team.name;
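A quick way to see the effect of count(distinct ...) with the LEFT JOIN is to rebuild the rows from the question's result table. This is a sketch with Python's sqlite3 module; SQLite as the engine is an assumption, and the goal table here leaves out the date column since the query does not use it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE team (ID INTEGER PRIMARY KEY, name TEXT)")
# date column omitted (assumption): it plays no part in the count
conn.execute(
    "CREATE TABLE goal (ID INTEGER PRIMARY KEY, team_ID INTEGER, goalType_ID INTEGER)"
)
conn.executemany(
    "INSERT INTO team VALUES (?, ?)", [(i, f"Team_{i}") for i in range(1, 9)]
)
conn.executemany(
    "INSERT INTO goal (team_ID, goalType_ID) VALUES (?, ?)",
    [(1, 1), (2, 2), (2, 2), (2, 2), (3, 4),
     (6, 1), (6, 2), (6, 4), (6, 3), (7, 5), (7, 4)],
)

# COUNT ignores NULLs, so teams without goals get 0 from the LEFT JOIN;
# DISTINCT collapses Team_2's three goals of the same type into one.
rows = conn.execute(
    """
    SELECT team.ID, team.name,
           COUNT(DISTINCT goal.goalType_ID) AS NumGoalTypes
    FROM team
    LEFT JOIN goal ON team.ID = goal.team_ID
    GROUP BY team.ID, team.name
    ORDER BY team.ID
    """
).fetchall()

print([r[2] for r in rows])  # [1, 1, 1, 0, 0, 4, 2, 0]
```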
Try this http://sqlfiddle.com/#!3/8ec680/13
;WITH cte
AS (SELECT Row_number() OVER(partition BY tname
ORDER BY goalid) AS row_n, * from temp) -- temp = your join statement
SELECT CASE
WHEN a.goalid IS NULL THEN 0
ELSE a.row_n
END [count],
a.tid,
a.tname,
a.goalid
FROM cte a
JOIN (SELECT Max(row_n) row_n,
tname
FROM cte
GROUP BY tname) b
ON a.row_n = b.row_n
AND a.tname = b.tname

MySql Join with Sum

I have a table called RESULTS with this structure :
resultid,winner,type
And a table called TICKETS with this structure :
resultid,ticketid,bet,sum_won,status
I want to show each row from table RESULTS, and for each result I want to calculate the total bet and the total sum won using the values from table TICKETS.
I tried some joins and some sums, but I can't get what I want.
SELECT *,COALESCE(SUM(tickets.bet),0) AS totalbets,
COALESCE(SUM(tickets.sum_won),0) AS totalwins
FROM `results` NATURAL JOIN `tickets`
WHERE tickets.status<>0
GROUP BY resultid
Please give me some advice.
I want to display something like this
RESULT WINNER TOTALBETS TOTALWINS
1 2 431 222
2 3 0 0
3 1 23 0
4 1 324 111
Use:
SELECT r.*,
COALESCE(x.totalbet, 0) AS totalbet,
COALESCE(x.totalwins, 0) AS totalwins
FROM RESULTS r
LEFT JOIN (SELECT t.resultid,
SUM(t.bet) AS totalbet,
SUM(t.sum_won) AS totalwins
FROM TICKETS t
WHERE t.status != 0
GROUP BY t.resultid) x ON x.resultid = r.resultid
I don't care for the NATURAL JOIN syntax, preferring to be explicit about how to JOIN/link tables together.
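To illustrate why the filter belongs inside the derived table, here is a sketch in Python's sqlite3 module (SQLite standing in for MySQL is an assumption). The rows are hypothetical, since the question does not show the underlying data: result 2 has no tickets at all, and one ticket for result 1 has status 0 and should be ignored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE results (resultid INTEGER PRIMARY KEY, winner INTEGER, type TEXT);
    CREATE TABLE tickets (
        resultid INTEGER, ticketid INTEGER, bet INTEGER, sum_won INTEGER, status INTEGER
    );
    -- hypothetical sample rows
    INSERT INTO results VALUES (1, 2, 'x'), (2, 3, 'x'), (3, 1, 'x');
    INSERT INTO tickets VALUES
        (1, 1, 400, 222, 1),
        (1, 2,  31,   0, 1),
        (1, 3,  50,  50, 0),   -- status 0: excluded from the totals
        (3, 4,  23,   0, 1);
    """
)

# Tickets are filtered and summed first; the LEFT JOIN then keeps every
# result, and COALESCE turns the missing totals for result 2 into 0.
rows = conn.execute(
    """
    SELECT r.resultid, r.winner,
           COALESCE(x.totalbet, 0)  AS totalbet,
           COALESCE(x.totalwins, 0) AS totalwins
    FROM results r
    LEFT JOIN (SELECT t.resultid,
                      SUM(t.bet)     AS totalbet,
                      SUM(t.sum_won) AS totalwins
               FROM tickets t
               WHERE t.status != 0
               GROUP BY t.resultid) x ON x.resultid = r.resultid
    ORDER BY r.resultid
    """
).fetchall()

print(rows)  # [(1, 2, 431, 222), (2, 3, 0, 0), (3, 1, 23, 0)]
```

With an inner join and the status filter in the outer WHERE, result 2 would drop out of the output entirely instead of showing zeros.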
SELECT *, COALESCE(SUM(tickets.bet),0) AS totalbets,
COALESCE(SUM(tickets.sum_won),0) AS totalwins
FROM `results` NATURAL JOIN `tickets`
WHERE tickets.status<>0
GROUP BY resultid
Try to replace the first * with resultid. If this helps, then add more columns to SELECT and add them to GROUP BY at the same time.