Why is this SQL SUM statement correct? - sql

In my schema, I have a table Projects, and a table Tasks. Each project is comprised of tasks. Each task has Hours and PercentComplete.
Example table:
ProjectID  TaskID  Hours  PercentComplete
1          1       100    50
1          2       120    80
I am trying to get the weighted percentage complete for the project. I am doing this using the following SQL statement:
SELECT P.ProjectID, P.ProjectName, SUM(T.Hours) AS Hours,
SUM(T.PercentComplete * T.Hours) / 100 AS CompleteHours,
SUM(T.PercentComplete * T.Hours) / SUM(T.Hours) AS PercentComplete
FROM Projects AS P INNER JOIN
Tasks AS T ON T.ProjectID = P.ProjectID
WHERE (P.ProjectID = 1)
My question is about this part of that statement:
SUM(T.PercentComplete * T.Hours) / SUM(T.Hours) AS PercentComplete
This gives me the correct weighted percentage for this project (in the case of the sample data above, 66%). But I cannot seem to wrap my head around why it does this.
Why does this query work?

SUM(T.PercentComplete * T.Hours) / 100 is the number of complete hours.
SUM(T.Hours) is the total number of hours.
The ratio of these two amounts, i.e.:
(SUM(T.PercentComplete * T.Hours) / 100) / SUM(T.Hours)
is the proportion of hours complete (it should be between 0 and 1).
Multiplying this by 100 gives the percentage.
I prefer to keep percentages like this out of the database and move them to the presentation layer. It would be much easier if the database stored "hours completed" and "hours total" and did not store the percentages at all. The extra factors of 100 in the calculations confuse the issue.

Basically you are finding the number of hours completed over the number of hours total.
SUM(T.PercentComplete * T.Hours) computes the total number of hours that you have completed, scaled by 100: (50 * 100) + (80 * 120) = 14,600 = 146 * 100 is the numerator. So 146 hours have been completed on this job, and we keep a 100 multiplier for the percent (because it is [0-100] instead of [0-1]).
Then we find the total number of hours worked, SUM(T.Hours), which is 100 + 120 = 220.
Then dividing, we find the weighted average. (146 * 100) / 220 = 0.663636364 * 100 = 66.4%
Is this what you were wondering about?

It calculates the two sums individually by adding up the value for each row then divides them at the end
SUM(T.PercentComplete * T.Hours)
50 * 100 +
80 * 120
-------
14,600
SUM(T.Hours)
100 +
120
---
220
Then the division at the end
14,600 / 220
------------
66.3636
Edit: as per HLGEM's comment, it will actually return 66 due to integer division.
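The arithmetic in this answer can be checked with a small Python sketch. The rows mirror the sample table from the question; the variable names are illustrative only, and `//` stands in for SQL integer division on the assumption that all values are non-negative (where truncation and floor agree).

```python
# Sample rows from the question: (hours, percent_complete) per task.
tasks = [(100, 50), (120, 80)]

# SUM(T.PercentComplete * T.Hours): multiply per row, then add.
weighted_sum = sum(hours * pct for hours, pct in tasks)
# SUM(T.Hours)
total_hours = sum(hours for hours, _ in tasks)

complete_hours = weighted_sum / 100             # hours actually completed
exact_percent = weighted_sum / total_hours      # weighted percentage with fraction
integer_percent = weighted_sum // total_hours   # result when both sums are integers

print(weighted_sum, total_hours, complete_hours)
print(round(exact_percent, 4), integer_percent)
```

If the fractional part matters, one operand has to be made non-integer before dividing, which is the same point the edit above makes about the SQL version.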

Aggregate functions, such as SUM(), work against the set of data defined by the GROUP BY clause. So if you group by ProjectID, ProjectName, the functions will break things down by that.

The SUM operation first multiplies the columns in each row, then adds the products:
(100 * 50 + 120 * 80) / (100 + 120)

Related

Calculate Annual Escalations in SQL

I have a dataset (in SQL) where I need to calculate the market value over the entire period. I want to automate this calculation in SQL. The Initial value is 2805.00 per month payable for 36 months. But the value is escalated at 5.5% after each block of twelve months. I have included a picture below to show the values and how they escalate. In terms of what fields I have in SQL, I have the length of the term (36 months [months]). I also have an escalation percentage (5.5% in this case [percentage]) and the starting value [starting_value], 2805.00 in this case. The total value (Expected Result [result]) is 106 635.72. I am after what the calculation should look like in SQL so that I can apply it across all market points.
Here is a fast closed-form formula to calculate the expected result [result] if every annual value should be escalated with respect to the first one:
[result] = (N * (N + 1) / 2) * [increment] * 12 + M * (N + 1) * [increment] + [starting_value] * [months]
Where:
N = TRUNC([months] / 12) - 1 // TRUNC --> integer part of [months] / 12
M = [months] MOD 12 // remainder of the integer division [months] / 12
[increment] = [percentage] * [starting_value] / 100
On the other hand, if you need to calculate each annual value with respect to its predecessor, you will need the formula below:
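$$\text{[result]} = \sum_{y=0}^{\mathrm{TRUNC}(\text{[months]}/12)-1} \left\{ \left(\frac{\text{[percentage]}}{100} + 1\right)^{y} \cdot 12 \cdot \text{[starting\_value]} \right\} + \left(\frac{\text{[percentage]}}{100} + 1\right)^{\mathrm{TRUNC}(\text{[months]}/12)} \cdot \left(\text{[months]} \bmod 12\right) \cdot \text{[starting\_value]}$$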
This is a bit confusing, but there is no way to typeset formulas on Stack Overflow. Moreover, if you need to run this in a DBMS, make sure it is one that lets you write loops, as the sum will need one.
You would need to adapt both formulas to your DBMS, taking into account the comments above. Hope this helps.
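As a rough check of both formulas, here is a Python sketch rather than DBMS code. The function names are made up for illustration, and it assumes that any leftover months after the last full year are charged at the rate reached after those full years.

```python
def simple_escalation(starting_value, months, percentage):
    """Each yearly increase is a fixed share of the FIRST year's value."""
    increment = percentage * starting_value / 100
    n = months // 12 - 1   # N in the closed-form formula
    m = months % 12        # M: months left over after the full years
    return ((n * (n + 1) / 2) * increment * 12
            + m * (n + 1) * increment
            + starting_value * months)


def compound_escalation(starting_value, months, percentage):
    """Each year's value is escalated from the PREVIOUS year's value."""
    factor = 1 + percentage / 100
    full_years = months // 12
    total = sum(factor ** y * 12 * starting_value for y in range(full_years))
    # Assumption: leftover months use the factor reached after the full years.
    total += factor ** full_years * (months % 12) * starting_value
    return total


print(round(simple_escalation(2805.00, 36, 5.5), 2))
print(round(compound_escalation(2805.00, 36, 5.5), 2))
```

With the figures from the question (2805.00, 36 months, 5.5%), the compounding variant reproduces the expected 106 635.72, which suggests that is the variant the question is after.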

Complicated SQL View scenario

I have a DB like this:
I would like to create a view that creates an ROI for each 'club'.
so ROI would be (amountbet / amountwon) * 100
Club 2's are ID's 1 and 3
((5 + 10) / (10 + 20)) * 100
and Club 1 is just id 2 which is tricky cause it will be a divide by 0 which is never good
2/0*100
So it should end up with 2 rows
club | ROI
2 | 200%
1 | 0%
I only just found out Views was a thing and have no idea how to tackle this (or if it's even possible)
Thanks
You can use aggregation. I would rather return NULL for roi when nothing has been won:
select club,
sum(amountbet) * 100.0 / nullif(sum(amountwon), 0) as roi
from t
group by club;
If you want 0 you can use coalesce():
select club,
coalesce(sum(amountbet) * 100.0 / nullif(sum(amountwon), 0), 0) as roi
from t
group by club;
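To try both queries without a database server, the sketch below runs them against an in-memory SQLite database from Python. The table layout and the three rows are assumptions pieced together from the numbers quoted in the question (the original table is not shown), and SQLite is only a stand-in for whatever DBMS is actually in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and rows, reconstructed from the question's arithmetic.
conn.execute(
    "CREATE TABLE t (id INTEGER, club INTEGER, amountbet REAL, amountwon REAL)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [(1, 2, 5, 10), (2, 1, 2, 0), (3, 2, 10, 20)],
)

# NULLIF turns a zero total into NULL, so the division yields NULL, not an error.
null_roi = conn.execute(
    "SELECT club, SUM(amountbet) * 100.0 / NULLIF(SUM(amountwon), 0) AS roi "
    "FROM t GROUP BY club ORDER BY club"
).fetchall()

# COALESCE then replaces that NULL with 0.
zero_roi = conn.execute(
    "SELECT club, "
    "COALESCE(SUM(amountbet) * 100.0 / NULLIF(SUM(amountwon), 0), 0) AS roi "
    "FROM t GROUP BY club ORDER BY club"
).fetchall()

print(null_roi)
print(zero_roi)
```

Wrapping either SELECT in CREATE VIEW would give the view the question asks about; that step is left out here to keep the sketch short.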

How do you calculate using sum in SQL Server

I am trying something like this:
select sum(R_Value) ,((5/cast(sum(R_Value) as int)) * 100.000000000)
from R_Rating
where R_B_Id = 135
But I keep getting value 0, sum(R_Value) = 6, so I just want a percentage out of (5/sum(R_Value)) * 100.
How can I get this?
The rating scale goes up to 5: each user can select a rating from 1 to 5. So what I need is a formula that takes the sum, takes into account how many users have rated, and gives a result from 1 to 5 in the end.
Something like this may work, but I need to round up to one decimal place to get a whole number.
select sum(R_Value), ((count(*)/cast(sum(R_Value) as float)) * 10)
from R_Rating
where R_B_Id = 135
To get the average rating you need to force floating-point arithmetic. For example:
select 1.0 * sum(R_Value) / count(*)
from R_Rating
where R_B_Id = 135
Then, if your query selects three rows with the values: 1, 4, and 5, then this query will return 3.33 stars as the average. That is:
= 1.0 * (1 + 4 + 5) / 3
= 1.0 * 10 / 3
= 10.0 / 3
= 3.33333333
I recommend writing this as:
select sum(R_Value) ,
(500.0 / sum(R_Value))
from R_Rating
where R_B_Id = 135;
This avoids an integer division.
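The effect of integer division on both calculations can be seen in a few lines of Python. This is only an illustration: `//` plays the role of integer division in SQL Server (equivalent here because all values are positive), and the variable names are invented.

```python
ratings = [1, 4, 5]              # the three example ratings used above
total = sum(ratings)             # 10

truncated_avg = total // len(ratings)     # integer division drops the fraction
float_avg = 1.0 * total / len(ratings)    # forcing floating point keeps it

rating_sum = 6                   # sum(R_Value) reported in the question
truncated_pct = (5 // rating_sum) * 100   # 5 / 6 truncates to 0 before the * 100
float_pct = 500.0 / rating_sum            # non-integer literal avoids truncation

print(truncated_avg, round(float_avg, 2))
print(truncated_pct, round(float_pct, 2))
```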

Rounding numbers to the nearest 10 in Postgres

I'm trying to solve this particular problem from PGExercises.com:
https://www.pgexercises.com/questions/aggregates/rankmembers.html
The gist of the question is that I'm given a table of club members and half hour time slots that they have booked (getting the list is a simple INNER JOIN of two tables).
I'm supposed to produce a descending ranking of members by total hours booked, rounded off to the nearest 10. I also need to produce a column with the rank, using the RANK() window function, and sort the result by the rank. (The result produces 30 records.)
The author's very elegant solution is this:
select firstname, surname, hours, rank() over (order by hours) from
(select firstname, surname,
((sum(bks.slots)+5)/20)*10 as hours
from cd.bookings bks
inner join cd.members mems
on bks.memid = mems.memid
group by mems.memid
) as subq
order by rank, surname, firstname;
Unfortunately, as a SQL newbie, my very inelegant solution is much more convoluted, using CASE WHEN and converting numbers to text in order to look at the last digit for deciding on whether to round up or down:
SELECT
firstname,
surname,
CASE
WHEN (SUBSTRING(ROUND(SUM(slots*0.5),0)::text from '.{1}$') IN ('5','6','7','8','9','0')) THEN CEIL(SUM(slots*0.5) /10) * 10
ELSE FLOOR(SUM(slots*0.5) /10) * 10
END AS hours,
RANK() OVER(ORDER BY CASE
WHEN (SUBSTRING(ROUND(SUM(slots*0.5),0)::text from '.{1}$') IN ('5','6','7','8','9','0')) THEN CEIL(SUM(slots*0.5) /10) * 10
ELSE FLOOR(SUM(slots*0.5) /10) * 10
END DESC) as rank
FROM cd.bookings JOIN cd.members
ON cd.bookings.memid = cd.members.memid
GROUP BY firstname, surname
ORDER BY rank, surname, firstname;
Still, I manage to almost get it just right - out of the 30 records, I get one edge case, whose firstname is 'Ponder' and lastname is 'Stephens'. His rounded number of hours is 124.5, but the solution insists that rounding it to the nearest 10 should produce a result of 120, whilst my solution produces 130.
(By the way, there are several other examples, such as 204.5 rounding up to 210 both in mine and the exercise author's solution.)
What's wrong with my rounding logic?
If you want to round to the nearest 10, then use the built-in round() function:
select round(<whatever>, -1)
The second argument can be negative, with -1 for tens, -2 for hundreds, and so on.
To round to the nearest multiple of any number (range):
round(<value> / <range>) * <range>
“Nearest” means values exactly half way between range boundaries are rounded up.
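This works for arbitrary ranges; you could round to the nearest 13 or 0.05 too if you wanted to (with integer values, use a non-integer divisor so that integer division does not truncate first):
round(64 / 10.0) * 10 -- 60
round(65 / 10.0) * 10 -- 70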
round(19.49 / 13) * 13 -- 13
round(19.5 / 13) * 13 -- 26
round(.49 / .05) * .05 -- 0.5
round(.47 / .05) * .05 -- 0.45
I have struggled with an equivalent issue. I needed to round a number to the nearest multiple of 50. Gordon's suggestion here does not work.
My first attempt was SELECT round(120 / 50) * 50, which gives 100. However, SELECT round(130 / 50) * 50 also gave 100. This is wrong; the nearest multiple is 150.
The trick is to divide using a float, e.g. SELECT round(130 / 50.0) * 50 is going to give 150.
It turns out that doing x/y, where x and y are integers, is equivalent to trunc(x/y), whereas float division keeps the fractional part, so round() can then pick the nearest multiple.
I don't think Bohemian's formula is correct.
The generalized formula is:
round((value + (range/2))/range) * range
so to convert to nearest 50, round((103 + 25)/50) * 50 --> will give 100
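The three approaches discussed in these answers can be compared side by side in Python. This is a sketch with invented function names, not Postgres code. Python's built-in round() rounds exact halves to the nearest even number, so the sketch uses the decimal module with ROUND_HALF_UP to imitate round() on a numeric value in Postgres.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_to_multiple(value, step):
    """Divide without truncation, round halves up, then scale back."""
    value, step = Decimal(str(value)), Decimal(str(step))
    quotient = (value / step).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return quotient * step


def truncating_version(value, step):
    """What round(value / step) * step does when both operands are integers."""
    return (value // step) * step


def add_half_then_truncate(value, step):
    """Integer-only variant: add half the step, then let division truncate."""
    return ((value + step // 2) // step) * step


print(round_to_multiple(130, 50), truncating_version(130, 50))
print(add_half_then_truncate(103, 50), add_half_then_truncate(130, 50))
print(round_to_multiple(124.5, 10), round_to_multiple(125, 10))
```

The add-half variant only behaves as intended when the division truncates; combining it with non-integer division and round() would round twice.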
Here is a modified version of the author's elegant solution that works. I hope you find it useful:
select firstname, surname, round(hrs, -1) as hours, rank() over(order by
round(hrs, -1) desc) as rank
from (select firstname, surname, sum(bks.slots) * 0.5 as hrs
from cd.members mems
inner join cd.bookings bks
on mems.memid = bks.memid
group by mems.memid) as subq
order by rank, surname, firstname;

Add an overall average row to Report Builder report

I need help creating an overall average in Report Builder 3.0. I have a query that returns data in the following format:
Major       Num Of Students  Max GPA  Min GPA  Avg GPA
----------  ---------------  -------  -------  -------
Accounting  89               4.0      2.3      3.68
Business    107              4.0      2.13     3.23
CIS         85               3.98     2.53     3.75
I added a total row in Report Builder that shows the sum number of students, overall Max GPA, and overall Min GPA. But I can't simply run the Avg function on the Avg GPA column, as it needs to take into account the number of students for an overall average. I believe that I need to do something like the following (in pseudocode):
foreach ( row in rows ) {
totalGpa += row.numOfStudents * row.avgGpa
totalStudents += row.numOfStudents
}
overallAvgGpa = totalGpa / totalStudents
Does anyone know how I could do this in my report?
You need a weighted average here, something like this in the Total row:
=Sum(Fields!numOfStudents.Value * Fields!avgGpa.Value)
/ Sum(Fields!numOfStudents.Value)
You can see I'm creating the expression Fields!numOfStudents.Value * Fields!avgGpa.Value for each row, summing that, then dividing by the total students.
In your case this would give (89 * 3.68 + 107 * 3.23 + 85 * 3.75) / (89 + 107 + 85), i.e. 3.53, which seems about correct.
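The same calculation, written out in Python as a quick check of the figure above. The tuples copy the three rows from the question; the plain average of the Avg GPA column is included only to show how it differs from the weighted one.

```python
# (major, number_of_students, average_gpa) rows from the question.
rows = [("Accounting", 89, 3.68), ("Business", 107, 3.23), ("CIS", 85, 3.75)]

# Weight each major's average by its head count, then divide by all students.
total_gpa = sum(n * avg for _, n, avg in rows)
total_students = sum(n for _, n, _ in rows)
overall_avg = total_gpa / total_students

# Unweighted mean of the per-major averages, for comparison.
naive_avg = sum(avg for _, _, avg in rows) / len(rows)

print(total_students, round(overall_avg, 2), round(naive_avg, 2))
```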