How to perform multiple table calculation with joins and group by - sql

I have two tables, client and grouping. They look like this:
Client: C_id, C_grouping_id, Month, Profit
Grouping: Grouping_id, Month, Profit
The client table contains monthly profit for every client and every client belongs to a specific grouping scheme specified by C_grouping_id.
The grouping table contains all the groups and their monthly profits.
I'm struggling with a query that essentially calculates the monthly residual for every subscriber:
Residual= (Subscriber Monthly Profit - Grouping monthly Profit)*(average subscriber monthly profits for all months / average profits for all months for the grouping subscriber belongs to)
I have come up with the following query so far but the results seem to be incorrect:
SELECT client.C_id, client.C_grouping_Id, client.Month,
((client.Profit - grouping.profit) * (avg(client.Profit)/avg(grouping.profit))) as "residual"
FROM client
INNER JOIN grouping
ON "C_grouping_id"="Grouping_id"
group by client.C_id, client.C_grouping_Id,client.Month, grouping.profit
I would appreciate it if someone can shed some light on what I'm doing wrong and how to correct it.
EDIT: Adding sample data and desired results
Client
C_id C_grouping_id Month Profit
001 aaa jul 10$
001 aaa aug 12$
001 aaa sep 8$
016 abc jan 25$
016 abc feb 21$
Grouping
Grouping_id Month Profit
aaa Jul 30$
aaa aug 50$
aaa Sep 15$
abc Jan 21$
abc Feb 27$
Query Result:
C_ID C_grouping_id Month Residual
001 aaa Jul (10-30)*(10/31.3)=-6.38
... and so on for every month for every client.

This can be done in a fairly straightforward way.
The main difficulty is that you are dealing with different levels of aggregation at once: the average per grouping, the average per client, and the current record.
That is clumsy to express with a plain SELECT ... FROM ... GROUP BY statement.
With analytic functions (also known as window functions) it is easy.
Start with combining the tables and calculating the base numbers:
select c.c_id as client_id,
c.c_grouping_id as grouping_id,
c.month,
c.profit as client_profit,
g.profit as group_profit,
avg (c.profit) over (partition by c.c_id) as avg_client_profit,
avg (g.profit) over (partition by g.grouping_id) as avg_group_profit
from client c inner join grouping g
on c."C_GROUPING_ID" = g."GROUPING_ID"
and c."MONTH" = g."MONTH";
With this you already get the average profits by client and by grouping_id.
Be aware that I changed the data type of the profit columns to DECIMAL(10,3), as a VARCHAR with a $ sign in it is hard to convert.
I also fixed the MONTH values, as the test data contained different upper/lower case spellings, which prevented the join from working.
Finally, I turned all column names into upper case to make typing easier.
Anyhow, running this provides you with the following result set:
CLIENT_ID  GROUPING_ID  MONTH  CLIENT_PROFIT  GROUP_PROFIT  AVG_CLIENT_PROFIT  AVG_GROUP_PROFIT
16         abc          JAN    25             21            23                 24
16         abc          FEB    21             27            23                 24
1          aaa          JUL    10             30            10                 31.666
1          aaa          AUG    12             50            10                 31.666
1          aaa          SEP    8              15            10                 31.666
From here it's only one step further to the residual calculation.
You can either put the current SQL into a view to make it reusable for other queries, or use it as an inline view.
I chose to use it as a common table expression (CTE), aka a WITH clause, because it's nice and easy to read:
with p as
(select c.c_id as client_id,
c.c_grouping_id as grouping_id,
c.month,
c.profit as client_profit,
g.profit as group_profit,
avg (c.profit) over (partition by c.c_id) as avg_client_profit,
avg (g.profit) over (partition by g.grouping_id) as avg_group_profit
from client c inner join grouping g
on c."C_GROUPING_ID" = g."GROUPING_ID"
and c."MONTH" = g."MONTH")
select client_id, grouping_id, month,
client_profit, group_profit,
avg_client_profit, avg_group_profit,
round( (client_profit - group_profit)
* (avg_client_profit/avg_group_profit), 2) as residual
from p
order by grouping_id, month, client_id;
Notice how easy the whole statement is to read and how straightforward the residual calculation becomes.
The result is then this:
CLIENT_ID  GROUPING_ID  MONTH  CLIENT_PROFIT  GROUP_PROFIT  AVG_CLIENT_PROFIT  AVG_GROUP_PROFIT  RESIDUAL
1          aaa          AUG    12             50            10                 31.666            -12
1          aaa          JUL    10             30            10                 31.666            -6.32
1          aaa          SEP    8              15            10                 31.666            -2.21
16         abc          FEB    21             27            23                 24                -5.75
16         abc          JAN    25             21            23                 24                3.83
Cheers,
Lars
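The window-function approach above can be tried out locally. The following is only a minimal sketch, assuming Python with the standard sqlite3 module and an SQLite build that supports window functions (3.25 or newer); the table name grouping_tbl and the function name residuals are made up for this illustration, and the sample rows are the cleaned-up test data from the question.

```python
import sqlite3


def residuals():
    """Return (client_id, month, residual) rows for the sample data."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE client (c_id TEXT, c_grouping_id TEXT, month TEXT, profit REAL);
        -- grouping_tbl is a stand-in name for the grouping table (assumption)
        CREATE TABLE grouping_tbl (grouping_id TEXT, month TEXT, profit REAL);
        INSERT INTO client VALUES
            ('001', 'aaa', 'JUL', 10), ('001', 'aaa', 'AUG', 12), ('001', 'aaa', 'SEP', 8),
            ('016', 'abc', 'JAN', 25), ('016', 'abc', 'FEB', 21);
        INSERT INTO grouping_tbl VALUES
            ('aaa', 'JUL', 30), ('aaa', 'AUG', 50), ('aaa', 'SEP', 15),
            ('abc', 'JAN', 21), ('abc', 'FEB', 27);
    """)
    rows = con.execute("""
        WITH p AS (
            SELECT c.c_id    AS client_id,
                   c.month   AS month,
                   c.profit  AS client_profit,
                   g.profit  AS group_profit,
                   -- averages over all months, repeated on every detail row
                   AVG(c.profit) OVER (PARTITION BY c.c_id)        AS avg_client_profit,
                   AVG(g.profit) OVER (PARTITION BY g.grouping_id) AS avg_group_profit
            FROM client c
            JOIN grouping_tbl g
              ON c.c_grouping_id = g.grouping_id
             AND c.month = g.month
        )
        SELECT client_id, month,
               ROUND((client_profit - group_profit)
                     * (avg_client_profit / avg_group_profit), 2) AS residual
        FROM p
        ORDER BY client_id, month
    """).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for row in residuals():
        print(row)
```

The point of the sketch is that the detail values and both averages sit on the same row, so the residual is a plain arithmetic expression with no GROUP BY involved.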

Related

How Do I retrieve most Recent record in different years With Date date in different table

I'm working with a database that isn't structured very well and need to retrieve the row with the latest month used in specific years. The main data is stored in the member table, which lists one row per member month. The date for the member month is not stored there directly; it is referenced through a foreign key, Date_Key, that links to a Date table. The Year and Month columns can be derived from the Date_Key specified in each table. Each row in the Date table represents one month of a year, and each of these rows has a unique sequential Date_Key.
I am using Microsoft SQL Server Studio as the environment.
Member Table
MemberKey  Member_ID  Date_Key
100        1234       89
101        1234       96
102        1234       97
103        1236       96
104        1236       97
Date Table
Date_Key  Year  Month
89        2020  10
90        2020  11
91        2020  12
92        2021  1
93        2021  2
94        2021  3
95        2021  4
96        2021  5
97        2021  6
Looking for the following Results
Member_ID  Year  Month
1234       2020  10
1234       2021  6
1236       2021  6
2020/11 is NOT a date. It is a year/month pair. But it seems like a simple aggregate - select year, max(month) group by year. You join and include member ID so you include that column in the GROUP BY clause to get one row per member per year.
select mbr.Member_ID, dts.Year, max(dts.Month) as Month
from dbo.Members as mbr
inner join dbo.Dates as dts on mbr.Date_Key = dts.Date_Key
group by mbr.Member_ID, dts.Year
order by mbr.Member_ID, dts.Year
;
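To check the aggregate against the sample data from the question, here is a small sketch using Python's sqlite3 module rather than SQL Server. The lower-case table and column names and the function name latest_month_per_year are assumptions made for the illustration.

```python
import sqlite3


def latest_month_per_year():
    """Return (member_id, year, latest_month) for the question's sample rows."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE members (member_key INTEGER, member_id INTEGER, date_key INTEGER);
        CREATE TABLE dates (date_key INTEGER, year INTEGER, month INTEGER);
        INSERT INTO members VALUES
            (100, 1234, 89), (101, 1234, 96), (102, 1234, 97),
            (103, 1236, 96), (104, 1236, 97);
        INSERT INTO dates VALUES
            (89, 2020, 10), (90, 2020, 11), (91, 2020, 12),
            (92, 2021, 1), (93, 2021, 2), (94, 2021, 3),
            (95, 2021, 4), (96, 2021, 5), (97, 2021, 6);
    """)
    rows = con.execute("""
        SELECT m.member_id, d.year, MAX(d.month) AS month
        FROM members AS m
        INNER JOIN dates AS d ON m.date_key = d.date_key
        GROUP BY m.member_id, d.year      -- one row per member per year
        ORDER BY m.member_id, d.year
    """).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for row in latest_month_per_year():
        print(row)
```

Months in the Date table that no member row references (2020/11, for example) never enter the join, so they cannot show up as a maximum.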

Include "0" results in COUNT(*) aggregate

Good morning, I've searched in the forum one doubt that I have but the results that I've seen didn't give me a solution.
I have two tables.
CARS:
Id Model
1 Seat
2 Audi
3 Mercedes
4 Ford
BREAKDOWNS:
IdBd Description Date Price IdCar
1 Engine 01/01/2020 500 € 3
2 Battery 05/01/2020 0 € 1
3 Wheel's change 10/02/2020 110,25 € 4
4 Electronic system 15/03/2020 100 € 2
5 Brake failure 20/05/2020 0 € 4
6 Engine 25/05/2020 400 € 1
I want to make a query that shows the number of breakdowns per month that cost 0 €.
I have this query:
SELECT Year(breakdowns.[Date]) AS YEAR, StrConv(MonthName(Month(breakdowns.[Date])),3) AS MONTH, Count(*) AS [BREAKDOWNS]
FROM cars LEFT JOIN breakdowns ON (cars.Id = breakdowns.IdCar AND breakdowns.[Price]=0)
GROUP BY breakdowns.[Price], Year(breakdowns.[Date]), Month(breakdowns.[Date]), MonthName(Month(breakdowns.[Date]))
HAVING ((Year([breakdowns].[Date]))=[Insert a year:])
ORDER BY Year(breakdowns.[Date]), Month(breakdowns.[Date]);
And the result is (if I put year '2020'):
YEAR MONTH BREAKDOWNS
2020 January 1
2020 May 1
And I want:
YEAR MONTH BREAKDOWNS
2020 January 1
2020 February 0
2020 March 0
2020 May 1
Thanks!
The HAVING condition should be in WHERE (otherwise it changes the Outer to an Inner join). But as long as you don't use columns from cars there's no need to join it.
To also get rows for months that have no zero-price breakdowns, switch to conditional aggregation. Access doesn't support the Standard SQL CASE expression, so use IIf instead. Price also has to be removed from the GROUP BY, otherwise every distinct price produces its own row, and Access expects INNER JOIN rather than a bare JOIN.
SELECT Year(breakdowns.[Date]) AS YEAR,
StrConv(MonthName(Month(breakdowns.[Date])),3) AS MONTH,
SUM(IIf(breakdowns.[Price]=0, 1, 0)) AS [BREAKDOWNS]
FROM breakdowns
INNER JOIN cars
ON (cars.Id = breakdowns.IdCar)
WHERE ((Year([breakdowns].[Date]))=[Insert a year:])
GROUP BY Year(breakdowns.[Date]), Month(breakdowns.[Date]), MonthName(Month(breakdowns.[Date]))
ORDER BY Year(breakdowns.[Date]), Month(breakdowns.[Date]);
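The conditional-aggregation idea can be checked against the sample rows. This is a sketch in Python with sqlite3, not Access: SQLite does support CASE, dates are stored as ISO strings, and the column name bd_date and the function name zero_price_breakdowns are assumptions for the illustration. The join to cars is left out, since no column of cars is used.

```python
import sqlite3

SAMPLE_BREAKDOWNS = [
    (1, "Engine", "2020-01-01", 500.0, 3),
    (2, "Battery", "2020-01-05", 0.0, 1),
    (3, "Wheel's change", "2020-02-10", 110.25, 4),
    (4, "Electronic system", "2020-03-15", 100.0, 2),
    (5, "Brake failure", "2020-05-20", 0.0, 4),
    (6, "Engine", "2020-05-25", 400.0, 1),
]


def zero_price_breakdowns(year):
    """Return (year, month, count of 0-price breakdowns) for every month with data."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE breakdowns "
        "(id_bd INTEGER, description TEXT, bd_date TEXT, price REAL, id_car INTEGER)"
    )
    con.executemany("INSERT INTO breakdowns VALUES (?, ?, ?, ?, ?)", SAMPLE_BREAKDOWNS)
    rows = con.execute("""
        SELECT CAST(strftime('%Y', bd_date) AS INTEGER) AS yr,
               CAST(strftime('%m', bd_date) AS INTEGER) AS mon,
               -- count only the free breakdowns, but keep every month's row
               SUM(CASE WHEN price = 0 THEN 1 ELSE 0 END) AS breakdowns
        FROM breakdowns
        WHERE strftime('%Y', bd_date) = ?
        GROUP BY yr, mon
        ORDER BY yr, mon
    """, (str(year),)).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for row in zero_price_breakdowns(2020):
        print(row)
```

Because the price test moved from a filter into the aggregate, February and March stay in the result with a count of 0. A month with no breakdown at all (April here) still produces no row; that would need a separate calendar table.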

Querying 2 Views on a Join - answers on one half being duplicated

I have 2 views one holding inbound calls and the other outbound calls. I want my query to join the 2 views so that the inbound and outbound stand side by side for each operator (destinationname and originationname). At the moment my current query duplicates one half of the join, in the example below the inbound.
SELECT i.destinationname, i.volumein as inbound, o.volumeout as outbound,
i.year, i.month
FROM InboundCalls i
inner join OutboundCalls o
on i.destinationname = o.originationname
GROUP BY i.year, i.month, i.destinationname, o.volumeout, i.volumein
DestinationName Inbound Outbound Year Month
Accounts Spare 9 33 2016 8
Accounts Spare 9 9 2016 8
Accounts Spare 9 7 2016 8
Accounts Spare 9 38 2016 8
Accounts Spare 21 33 2016 9
Accounts Spare 21 9 2016 9
Accounts Spare 21 7 2016 9
Accounts Spare 21 38 2016 9
The result I am looking for will be similar to the below;
DestinationName Inbound Outbound Year Month
Accounts Spare 84 210 2016 9
Accounts Spare 12 32 2016 11
Accounts Spare 36 103 2016 10
Steve Jones 36 96 2016 8
Wayne Rooney 162 172 2016 8
Alan Shearer 1 216 2016 9
Alan Shearer 74 82 2016 8
Please let me know if this needs clarifying.
The reason for wrong results is that you join only on destination name, not on year and month.
First, you need to join not only on DestinationName but on the Year and Month as well. If the views have one row per distinct destination name, year and month, then you can get rid of the GROUP BY as well.
Second, you probably need a FULL JOIN instead of an INNER JOIN, assuming that you want results when there are only Incoming but not Outgoing data (and vice versa) for some month.
SELECT
COALESCE(i.destinationname, o.originationname) AS DestinationName,
COALESCE(i.volumein, 0) AS InBound,
COALESCE(o.volumeout, 0) AS OutBound,
COALESCE(i.year, o.year) AS Year,
COALESCE(i.month, o.month) AS Month
FROM InboundCalls AS i
FULL JOIN OutboundCalls AS o
ON i.destinationname = o.originationname
AND i.year = o.year
AND i.month = o.month ;
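The effect of the join keys can be seen on a tiny data set. The sketch below uses Python's sqlite3 module with plain tables standing in for the two views; the outbound volumes are invented for the illustration, and an INNER JOIN is used instead of FULL JOIN because older SQLite versions do not support FULL JOIN.

```python
import sqlite3


def compare_joins():
    """Return (rows joined on name only, rows joined on name + year + month)."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        -- plain tables standing in for the InboundCalls / OutboundCalls views
        CREATE TABLE inbound (destinationname TEXT, year INTEGER, month INTEGER, volumein INTEGER);
        CREATE TABLE outbound (originationname TEXT, year INTEGER, month INTEGER, volumeout INTEGER);
        INSERT INTO inbound VALUES ('Accounts Spare', 2016, 8, 9), ('Accounts Spare', 2016, 9, 21);
        -- outbound volumes are hypothetical sample values
        INSERT INTO outbound VALUES ('Accounts Spare', 2016, 8, 33), ('Accounts Spare', 2016, 9, 38);
    """)
    select = """
        SELECT i.destinationname, i.year, i.month, i.volumein, o.volumeout
        FROM inbound AS i
        JOIN outbound AS o
          ON i.destinationname = o.originationname
    """
    order = " ORDER BY i.year, i.month, o.year, o.month"
    # Name only: every inbound month pairs with every outbound month.
    name_only = con.execute(select + order).fetchall()
    # Name + year + month: each inbound month pairs with its own outbound month.
    full_key = con.execute(
        select + " AND i.year = o.year AND i.month = o.month" + order
    ).fetchall()
    con.close()
    return name_only, full_key


if __name__ == "__main__":
    name_only, full_key = compare_joins()
    print(len(name_only), "rows when joining on name only")
    print(len(full_key), "rows when joining on name, year and month")
```

With two months on each side, the name-only join returns 2 x 2 = 4 rows, which is the same kind of duplication shown in the question; adding year and month to the join condition brings it back to one row per month.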
If I understand correctly you would want the output to be this:
DestinationName Inbound Outbound Year Month
Accounts Spare 9 87 2016 8
In which case I think the reason you are getting duplicates is because you are grouping by o.volumeout and i.volumein.
If you want a single row for each month and year, then group by destinationname, month and year, and get your totals using SUM, e.g.:
SELECT i.destinationname, sum(i.volumein) as inbound, sum(o.volumeout) as outbound,
i.year, i.month
FROM InboundCalls i
inner join OutboundCalls o
on i.destinationname = o.originationname
GROUP BY i.year, i.month, i.destinationname

sql running total math current quarter

I'm trying to figure out the total for the quarter when the only data shown is a running total for the year:
Id Amount Periods Year Type Date
-------------------------------------------------------------
1 65 2 2014 G 4-1-12
2 75 3 2014 G 7-1-12
3 25 1 2014 G 1-1-12
4 60 1 2014 H 1-1-12
5 75 1 2014 Y 1-1-12
6 120 3 2014 I 7-1-12
7 30 1 2014 I 1-1-12
8 90 2 2014 I 4-1-12
In the data shown above, the items of type G and I are running totals for the period (in quarters). If my query returns period 3, is there a SQL way to get the data for just that quarter? The math would involve taking the amount for the 3rd period minus the amount for the 2nd period.
Right now my sql is something like:
SELECT * FROM data WHERE Date='4-1-12';
In this query, it will return row #1, which is a total for 2 periods. I would like it to return just the total for the 2nd period. I'm looking to make this happen with SQLite.
Any help would be appreciated.
Thanks a lot
You want to subtract the running total of the previous quarter:
SELECT Id,
Year,
Type,
Date,
Amount - IFNULL((SELECT Amount
FROM data AS previousQuarter
WHERE previousQuarter.Year = data.year
AND previousQuarter.Type = data.Type
AND previousQuarter.Periods = data.Periods - 1
), 0) AS Amount
FROM data
The IFNULL is needed to handle a quarter that has no previous quarter.
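Since the question targets SQLite, the subtraction can be run directly against the sample rows. This is a minimal sketch using Python's sqlite3 module; the function name quarter_amounts and the alias cur are made up for the illustration.

```python
import sqlite3

SAMPLE_ROWS = [
    (1, 65, 2, 2014, "G", "4-1-12"),
    (2, 75, 3, 2014, "G", "7-1-12"),
    (3, 25, 1, 2014, "G", "1-1-12"),
    (4, 60, 1, 2014, "H", "1-1-12"),
    (5, 75, 1, 2014, "Y", "1-1-12"),
    (6, 120, 3, 2014, "I", "7-1-12"),
    (7, 30, 1, 2014, "I", "1-1-12"),
    (8, 90, 2, 2014, "I", "4-1-12"),
]


def quarter_amounts():
    """Return (Type, Periods, amount for that quarter alone) for the sample rows."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE data "
        "(Id INTEGER, Amount INTEGER, Periods INTEGER, Year INTEGER, Type TEXT, Date TEXT)"
    )
    con.executemany("INSERT INTO data VALUES (?, ?, ?, ?, ?, ?)", SAMPLE_ROWS)
    rows = con.execute("""
        SELECT cur.Type,
               cur.Periods,
               -- running total minus the previous quarter's running total;
               -- IFNULL supplies 0 when there is no previous quarter
               cur.Amount - IFNULL((SELECT prev.Amount
                                    FROM data AS prev
                                    WHERE prev.Year = cur.Year
                                      AND prev.Type = cur.Type
                                      AND prev.Periods = cur.Periods - 1), 0) AS QuarterAmount
        FROM data AS cur
        ORDER BY cur.Type, cur.Periods
    """).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for row in quarter_amounts():
        print(row)
```

For type G the running totals 25, 65, 75 turn into per-quarter amounts 25, 40, 10. Types H and Y only have period 1, so the IFNULL branch leaves their amounts unchanged.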

Select Max Date in Year Month Day Format

I have a table that I drew the following sample from:
Item <other columns> year month day
---- -- ------------ ---- ----- ---
VX4O GL 630.5938 2012 7 20
BX2T GL 0 2012 7 13
MWB806I GL 92004.72 2012 6 15
4XU GL 17.125 2012 7 20
VL4O GL 130.5 2012 7 20
MWB806I GL 92004 2012 10 26
MWB806I GL 92005 2012 11 30
3PU GL 25 2012 7 20
VC4O GL 630.6094 2012 7 20
MWB806I GL 92005 2012 11 2
The first column is Item, the last three columns are year, month, day.
How do I select the max date per item?
SELECT Item, MAX(CONVERT(DATETIME, RTRIM([year])
+ RIGHT('0'+RTRIM([month]), 2)
+ RIGHT('0'+RTRIM([day]), 2)
)) AS MaxDate
FROM dbo.[table]
GROUP BY Item;
Now you really should consider fixing this schema. Why are you storing year/month/day separately? All it does is make calculations like this one much more difficult, and prevents any proper validation (you can have check constraints for basic stuff, but these are much more complex for things like leap years). And it doesn't save you any space (in fact you lose space if month/day are int, or if you can use smalldatetime).
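The zero-padding in the query matters, and that is easy to demonstrate. The sketch below uses Python's sqlite3 module instead of SQL Server, so SQLite's printf stands in for the CONVERT/RIGHT combination; the table name items and the function name max_date_per_item are assumptions, and only a few of the sample rows are loaded.

```python
import sqlite3

SAMPLE_ROWS = [
    ("VX4O", 2012, 7, 20),
    ("MWB806I", 2012, 6, 15),
    ("MWB806I", 2012, 10, 26),
    ("MWB806I", 2012, 11, 30),
    ("MWB806I", 2012, 11, 2),
]


def max_date_per_item():
    """Return (padded, unpadded) dicts mapping item -> maximum date string."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE items (item TEXT, year INTEGER, month INTEGER, day INTEGER)")
    con.executemany("INSERT INTO items VALUES (?, ?, ?, ?)", SAMPLE_ROWS)
    # Zero-padded ISO strings sort in the same order as the dates they represent.
    padded = dict(con.execute("""
        SELECT item, MAX(printf('%04d-%02d-%02d', year, month, day))
        FROM items
        GROUP BY item
    """).fetchall())
    # Without padding the comparison is purely textual, so '6' sorts after '11'.
    unpadded = dict(con.execute("""
        SELECT item, MAX(year || '-' || month || '-' || day)
        FROM items
        GROUP BY item
    """).fetchall())
    con.close()
    return padded, unpadded


if __name__ == "__main__":
    padded, unpadded = max_date_per_item()
    print("padded:  ", padded)
    print("unpadded:", unpadded)
```

For item MWB806I the padded version picks 30 November 2012, while the unpadded string comparison picks 15 June, which is one more argument for storing a real date column.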