I need help creating an overall average in Report Builder 3.0. I have a query that returns data in the following format:
Major       Num Of Students  Max GPA  Min GPA  Avg GPA
----------  ---------------  -------  -------  -------
Accounting  89               4.0      2.3      3.68
Business    107              4.0      2.13     3.23
CIS         85               3.98     2.53     3.75
I added a total row in Report Builder that shows the total number of students, the overall Max GPA, and the overall Min GPA. But I can't simply run the Avg function on the Avg GPA column, because an overall average needs to take the number of students in each major into account. I believe that I need to do something like the following (in pseudocode):
foreach ( row in rows ) {
totalGpa += row.numOfStudents * row.avgGpa
totalStudents += row.numOfStudents
}
overallAvgGpa = totalGpa / totalStudents
Does anyone know how I could do this in my report?
You need a weighted average here. Something like this in the Total row:
=Sum(Fields!numOfStudents.Value * Fields!avgGpa.Value)
/ Sum(Fields!numOfStudents.Value)
You can see I'm creating the expression Fields!numOfStudents.Value * Fields!avgGpa.Value for each row, summing that, then dividing by the total students.
In your case this would give (89 * 3.68 + 107 * 3.23 + 85 * 3.75) / (89 + 107 + 85), i.e. 3.53, which seems about correct.
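The arithmetic can be checked outside the report. The following is a minimal Python sketch, not Report Builder syntax; the function name weighted_average and the tuple layout are assumptions made for illustration. It also prints the plain average of the Avg GPA column to show why that figure differs.

```python
# Per-major rows from the question: (major, number of students, average GPA).
rows = [
    ("Accounting", 89, 3.68),
    ("Business", 107, 3.23),
    ("CIS", 85, 3.75),
]


def weighted_average(rows):
    """Return sum(count * avg) / sum(count), mirroring the report expression."""
    total_gpa = sum(count * avg for _, count, avg in rows)
    total_students = sum(count for _, count, _ in rows)
    return total_gpa / total_students


overall = weighted_average(rows)
# Unweighted mean of the Avg GPA column, for comparison only.
simple = sum(avg for _, _, avg in rows) / len(rows)

print(round(overall, 2))  # 3.53
print(round(simple, 2))   # 3.55
```

The two figures differ because Business, the largest group, has the lowest average and pulls the weighted result down.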
Related
Please consider the table below for reference:
student name  class  marks  total marks
------------  -----  -----  -----------
David         10-B   50     100
Leo           10-B   20     200
Cris          11-B   23     150
Lynn          10-B   100    240
Rachel        11-B   210    500
Ronda         9-B    43     400
What I want to do is calculate a percentage for each row, not simply as marks / total marks * 100, but at the class level. For example, for class 10-B the percentage is (50+20+100)/(100+200+240)*100. I want this class-level percentage displayed in each row, so the resulting table should look as follows:
student name  class  marks  total marks  percentage
------------  -----  -----  -----------  ----------
David         10-B   50     100          20
Leo           10-B   20     200          20
Cris          11-B   23     150          40
Lynn          10-B   100    240          20
Rachel        11-B   210    500          40
Ronda         9-B    43     400          10
*The percentages are placeholders for illustration and are not calculated correctly.
As you can see, every row of a class has the same percentage.
Also, I am using standard SQL in BigQuery.
You seem to want window functions:
select t.*,
( sum(marks) over (partition by class) * 100.0 /
sum(total_marks) over (partition by class)
) as class_ratio
from t;
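To see what the two windowed sums produce on the sample data, here is a small Python sketch of the same idea: total marks and total_marks per class, then attach the class-level ratio to every detail row. The function name class_ratios and the tuple layout are assumptions for illustration; this is not BigQuery code.

```python
from collections import defaultdict

# Sample rows from the question: (student name, class, marks, total marks).
students = [
    ("David", "10-B", 50, 100),
    ("Leo", "10-B", 20, 200),
    ("Cris", "11-B", 23, 150),
    ("Lynn", "10-B", 100, 240),
    ("Rachel", "11-B", 210, 500),
    ("Ronda", "9-B", 43, 400),
]


def class_ratios(rows):
    """Return {class: sum(marks) * 100 / sum(total marks)}, one entry per class."""
    marks_sum = defaultdict(int)
    total_sum = defaultdict(int)
    for _, cls, marks, total in rows:
        marks_sum[cls] += marks
        total_sum[cls] += total
    return {cls: marks_sum[cls] * 100.0 / total_sum[cls] for cls in marks_sum}


ratios = class_ratios(students)
# Attach the class-level value to every detail row, as the window function does.
result = [row + (round(ratios[row[1]], 2),) for row in students]
for row in result:
    print(row)
```

On this data the class ratios come out as 31.48 for 10-B, 35.85 for 11-B, and 10.75 for 9-B, and every student row in a class carries the same value.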
Why do you need to see the individual students if all you care about is class percentage?
SELECT Class, SUM(marks) * 1. / SUM([total marks]) AS Percentage
FROM dbo.ClassTbl
GROUP BY Class
Note that the result is a fraction (1 = 100%). The multiplication by 1. forces decimal arithmetic: your values are Int, so SQL would otherwise assume the result is also an Int and truncate it.
Unless you just want the individual results:
SELECT Class, [student name], marks * 1. / [total marks] AS Percentage
FROM dbo.ClassTbl
As a side note, table and column names should be CamelCase with no spaces.
Hope you are well.
Resource  Reason   Qty  Average
--------  -------  ---  -------
11        Broken   15   48%
11        Shifted  5    16%
11        Flash    10   32%
11        Bleed    1    3%
So as you can see in the table above, I have a resource that has scrap and we need to break each reason down by % of the total.
There is a total of 31 scraps, and I want to express each scrap reason as a % of that total for a particular Resource. I am not sure how to break it down by %.
How would I be able to calculate the % in this way using SQL?
If I understand correctly, you can use window functions. To get the ratios:
select t.*,
(t.qty * 1.0 / sum(t.qty) over (partition by resource)) as ratio
from t;
You can multiply by 100 if you want a percentage between 0 and 100 rather than a ratio.
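As a rough cross-check of the ratio logic, the same calculation can be sketched in Python on the sample rows. The function name share_of_total and the tuple layout are assumptions for illustration only.

```python
from collections import defaultdict

# Scrap rows from the question: (resource, reason, qty).
scrap = [
    (11, "Broken", 15),
    (11, "Shifted", 5),
    (11, "Flash", 10),
    (11, "Bleed", 1),
]


def share_of_total(rows):
    """Return (resource, reason, qty, percent), percent = qty * 100 / resource total."""
    totals = defaultdict(int)
    for resource, _, qty in rows:
        totals[resource] += qty
    return [
        (resource, reason, qty, qty * 100.0 / totals[resource])
        for resource, reason, qty in rows
    ]


for resource, reason, qty, pct in share_of_total(scrap):
    print(resource, reason, qty, f"{pct:.0f}%")
```

Rounded to whole percentages this reproduces the 48%, 16%, 32%, and 3% shown in the question's table. Because the totals are keyed by resource, additional resources would each get their own denominator.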
I'm writing test queries in MS SQL Server to test reports.
I can't figure out how to calculate the following:
Ingredient_Cost_Paid / Total Ingredient_Cost_Paid * 100 as 'Ingredient Cost Allow as % of Total'
This is Ingredient cost allowable as a percentage of the total ingredient cost allowable.
P.S. I'm new to SQL, so would appreciate explanations as well, so I learn for the future. Thanks
Also, I'm not sure I correctly understand the difference between Total and SUM.
Thanks everyone
The single quote (') is used as a delimiter for textual values. If you use the AS keyword to specify a (column) alias, you need to use square brackets ([]) if it includes spaces and/or special characters:
Ingredient_Cost_Paid / Total_Ingredient_Cost_Paid * 100 as [Ingredient Cost Allow as % of Total]
Is that what you are looking for?
Edit: I noticed that it also works with single quotes! I didn't know that! But honestly, I would not use it. I'm not sure if it's officially considered to be valid.
Regarding the difference between "Total" and SUM, I would need to understand what you mean by "Total", since that is not something that SQL understands. You could probably use the SUM aggregate function to calculate a total. An aggregate function calculates a value based on a certain column/expression in groups of rows (or in the entire table as a single group). So you probably need to provide (much) more information in your question to get effective help with that.
Edit:
I would like to elaborate a little on this SQL issue for you. My apologies in advance for this rather lengthy post. ;)
For example, assume that all query logic described here applies to a table called Recipe_Ingredients, which contains rows with information about ingredients for various recipes (identified by the column Recipe_ID) and the price of the recipe ingredient (in a column called Ingredient_Cost_Paid).
The (simplified) table definition would look something like this:
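CREATE TABLE Recipe_Ingredients (
    Recipe_ID INT NOT NULL,
    -- Precision and scale chosen for this example; plain NUMERIC defaults to
    -- NUMERIC(18,0) in SQL Server and would round the costs to whole numbers.
    Ingredient_Cost_Paid NUMERIC(10,2) NOT NULL
);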
For testing purposes, I created this table in a test database and populated it with the following query:
INSERT INTO Recipe_Ingredients
VALUES
(12, 4.65),
(12, 0.40),
(12, 9.98),
(27, 5.35),
(27, 12.50),
(27, 1.09),
(27, 3.00),
(65, 2.35),
(65, 0.99);
You could select all rows from the table to view all data in the table:
SELECT
Recipe_ID,
Ingredient_Cost_Paid
FROM
Recipe_Ingredients;
This would yield the following results:
Recipe_ID Ingredient_Cost_Paid
--------- --------------------
12 4.65
12 0.40
12 9.98
27 5.35
27 12.50
27 1.09
27 3.00
65 2.35
65 0.99
You could group the rows based on corresponding Recipe_ID values. Like this:
SELECT
Recipe_ID
FROM
Recipe_Ingredients
GROUP BY
Recipe_ID;
This will yield the following result:
Recipe_ID
---------
12
27
65
Not very spectacular, I agree. But you could ask the query to calculate values based on those groups as well. That's where aggregate functions like COUNT and SUM come into play:
SELECT
Recipe_ID,
COUNT(Recipe_ID) AS Number_Of_Ingredients,
SUM(Ingredient_Cost_Paid) AS Total_Ingredient_Cost_Paid
FROM
Recipe_Ingredients
GROUP BY
Recipe_ID;
This will yield the following result:
Recipe_ID Number_Of_Ingredients Total_Ingredient_Cost_Paid
--------- --------------------- --------------------------
12 3 15.03
27 4 21.94
65 2 3.34
Introducing your percentage column is somewhat tricky. The calculation has to be performed on a rowset (a table or a query result) and cannot be expressed directly in a SUM.
You could specify the previous query as a subquery in the FROM-clause of another query (this is called a table expression) and join it with table Recipe_Ingredients. That way you combine the group data back with the detail data.
I will drop the Number_Of_Ingredients column from now on. It was just an example for the COUNT function, but you do not need it for your issue at hand.
SELECT
Recipe_Ingredients.Recipe_ID,
Recipe_Ingredients.Ingredient_Cost_Paid,
Subquery.Total_Ingredient_Cost_Paid
FROM
Recipe_Ingredients
INNER JOIN (
SELECT
Recipe_ID,
SUM(Ingredient_Cost_Paid) AS Total_Ingredient_Cost_Paid
FROM
Recipe_Ingredients
GROUP BY
Recipe_ID
) AS Subquery ON Subquery.Recipe_ID = Recipe_Ingredients.Recipe_ID;
This will yield the following results:
Recipe_ID Ingredient_Cost_Paid Total_Ingredient_Cost_Paid
--------- -------------------- --------------------------
12 4.65 15.03
12 0.40 15.03
12 9.98 15.03
27 5.35 21.94
27 12.50 21.94
27 1.09 21.94
27 3.00 21.94
65 2.35 3.34
65 0.99 3.34
With this, it is pretty easy to add your calculation for the percentage:
SELECT
Recipe_Ingredients.Recipe_ID,
Recipe_Ingredients.Ingredient_Cost_Paid,
Subquery.Total_Ingredient_Cost_Paid,
CAST(Recipe_Ingredients.Ingredient_Cost_Paid / Subquery.Total_Ingredient_Cost_Paid * 100 AS DECIMAL(8,1)) AS [Ingredient Cost Allow as % of Total]
FROM
Recipe_Ingredients
INNER JOIN (
SELECT
Recipe_ID,
SUM(Ingredient_Cost_Paid) AS Total_Ingredient_Cost_Paid
FROM
Recipe_Ingredients
GROUP BY
Recipe_ID
) AS Subquery ON Subquery.Recipe_ID = Recipe_Ingredients.Recipe_ID;
Note that I also cast the percentage column values to type DECIMAL(8,1) so that you do not get values with large fractions. The above query yields the following results:
Recipe_ID Ingredient_Cost_Paid Total_Ingredient_Cost_Paid Ingredient Cost Allow as % of Total
--------- -------------------- -------------------------- -----------------------------------
12 4.65 15.03 30.9
12 0.40 15.03 2.7
12 9.98 15.03 66.4
27 5.35 21.94 24.4
27 12.50 21.94 57.0
27 1.09 21.94 5.0
27 3.00 21.94 13.7
65 2.35 3.34 70.4
65 0.99 3.34 29.6
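The group-then-join-back pattern used in these queries can also be sketched in Python to check the numbers. This is only an illustration under assumed names (totals_per_recipe, with_percentage); it uses floats rather than SQL NUMERIC, so the totals are rounded to two decimals before display.

```python
from collections import defaultdict

# Same sample data as the INSERT statement: (Recipe_ID, Ingredient_Cost_Paid).
ingredients = [
    (12, 4.65), (12, 0.40), (12, 9.98),
    (27, 5.35), (27, 12.50), (27, 1.09), (27, 3.00),
    (65, 2.35), (65, 0.99),
]


def totals_per_recipe(rows):
    """Equivalent of the GROUP BY subquery: sum of costs per Recipe_ID."""
    totals = defaultdict(float)
    for recipe_id, cost in rows:
        totals[recipe_id] += cost
    return dict(totals)


def with_percentage(rows):
    """Equivalent of the join: attach the group total and the rounded percentage."""
    totals = totals_per_recipe(rows)
    return [
        (recipe_id, cost, round(totals[recipe_id], 2),
         round(cost / totals[recipe_id] * 100, 1))
        for recipe_id, cost in rows
    ]


for row in with_percentage(ingredients):
    print(row)
```

Each output tuple corresponds to one row of the final result table: the detail row, its recipe total, and its share of that total.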
As I said earlier, you will need to supply more information in your question if you need more specific help with your own situation. These queries and their results are just examples to show you what can be possible. Perhaps (and hopefully) this contains enough information to help you find a solution yourself. But you may always ask more specific questions, of course.
I have the following table, for which I am trying to calculate a running balance, and remaining value, but the remaining value is the function of the previously calculated row, as such:
date        PR  amount  total   balance  remaining_value
--------------------------------------------------------------
'2020-1-1'  1   1.0     100.0   1.0      100    -- 100 (initial total)
'2020-1-2'  1   2.0     220.0   3.0      320    -- 100 (previous row) + 220
'2020-1-3'  1   -1.5    -172.5  1.5      160    -- 320 - 160 (see explanation 1)
'2020-1-4'  1   3.0     270.0   4.5      430    -- 160 + 270
'2020-1-5'  1   1.0     85.0    5.5      515    -- 430 + 85
'2020-1-6'  1   2.0     202.0   7.5      717    -- 515 + 202
'2020-1-7'  1   -4.0    -463.0  3.5      334.6  -- 717 - 382.4 (see explanation 2)
'2020-1-8'  1   -0.5    -55.0   3.0      ...
'2020-1-9'  1   2.0     214.0   5.0
'2020-1-1'  2   1.0     100     1.0      100    -- different PR: start new running total
The logic is as follows:
For positive amount rows, the remaining value is simply the previous row's remaining_value plus the current row's total.
For negative amount rows, it gets trickier:
Explanation 1: We start with 320 (the previous row's remaining_value) and subtract from it the fraction 1.5/3.0 (the absolute value of the current row's amount divided by the previous row's balance) multiplied by the previous row's remaining_value, which is 320. The calculation gives:
320 - (1.5/3 * 320) = 160
Explanation 2: Same logic as above. 717 - (4/7.5 * 717) = 717 - 382.4
4/7.5 here represents the current row's absolute amount divided by the previous row's balance.
I tried the window function sum() but did not manage to get the desired result. Is there a way to get this done in PostgreSQL without having to resort to a loop?
Extra complexity: There are multiple products identified by PR (product id), 1, 2 etc. Each need their own running total and calculation.
You could create a custom aggregate function:
CREATE OR REPLACE FUNCTION f_special_running_sum (_state numeric, _total numeric, _amount numeric, _prev_balance numeric)
RETURNS numeric
LANGUAGE sql IMMUTABLE AS
'SELECT CASE WHEN _amount > 0 THEN _state + _total
ELSE _state * (1 + _amount / _prev_balance) END';
CREATE OR REPLACE AGGREGATE special_running_sum (_total numeric, _amount numeric, _prev_balance numeric) (
sfunc = f_special_running_sum
, stype = numeric
, initcond = '0'
);
The CASE expression does the split: If amount is positive, just add total, else apply your (simplified) formula:
320 * (1 + -1.5 / 3.0) instead of 320 - (1.5/3 * 320), i.e.:
_state * (1 + _amount / _prev_balance)
Function and aggregate parameter names are only for documentation.
Then your query can look like this:
SELECT *
, special_running_sum(total, amount, prev_balance) OVER (PARTITION BY pr ORDER BY date)
FROM (
SELECT pr, date, amount, total
, lag(balance, 1, '1') OVER (PARTITION BY pr ORDER BY date) AS prev_balance
FROM tbl
) t;
We need a subquery to apply the first window function lag() and fetch the previous balance into the current row (prev_balance). I default to 1 if there is no previous row to avoid NULL values.
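To make the state transition easier to follow, here is a Python sketch that replays the same logic row by row on the question's sample data. It is not PostgreSQL code: the names step, remaining_values, and ledger are assumptions for illustration, and it uses floats rather than numeric.

```python
from itertools import groupby

# Rows from the question, already ordered by (pr, date):
# (date, pr, amount, total, balance).
ledger = [
    ("2020-01-01", 1, 1.0, 100.0, 1.0),
    ("2020-01-02", 1, 2.0, 220.0, 3.0),
    ("2020-01-03", 1, -1.5, -172.5, 1.5),
    ("2020-01-04", 1, 3.0, 270.0, 4.5),
    ("2020-01-05", 1, 1.0, 85.0, 5.5),
    ("2020-01-06", 1, 2.0, 202.0, 7.5),
    ("2020-01-07", 1, -4.0, -463.0, 3.5),
    ("2020-01-08", 1, -0.5, -55.0, 3.0),
    ("2020-01-09", 1, 2.0, 214.0, 5.0),
    ("2020-01-01", 2, 1.0, 100.0, 1.0),
]


def step(state, total, amount, prev_balance):
    """Mirror of f_special_running_sum: one state transition per row."""
    if amount > 0:
        return state + total
    return state * (1 + amount / prev_balance)


def remaining_values(rows):
    """Return one remaining_value per input row, restarting for each pr."""
    out = []
    for _, group in groupby(rows, key=lambda r: r[1]):
        state = 0.0         # corresponds to initcond = '0'
        prev_balance = 1.0  # corresponds to the lag(balance, 1, '1') default
        for _, _, amount, total, balance in group:
            state = step(state, total, amount, prev_balance)
            out.append(state)
            prev_balance = balance
    return out


for row, value in zip(ledger, remaining_values(ledger)):
    print(row[0], row[1], round(value, 2))
```

For the seven rows where the question states a remaining_value (100, 320, 160, 430, 515, 717, 334.6), this replay gives the same figures, and the row for PR 2 starts again at 100.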
Caveats:
If the first row has a negative total, the result is undefined. My aggregate function defaults to 0.
You did not declare data types, nor requirements regarding precision. I assume numeric and aim for maximum precision. The calculation with numeric is precise. But your formula produces fractional decimal numbers. Without rounding, there will be a lot of fractional digits after a couple of divisions, and the calculation will quickly degrade in performance. You'll have to strike a compromise between precision and performance. For example, doing the same with double precision has constant performance.
Related:
Cumulative adding with dynamic base in Postgres
In my schema, I have a table Projects, and a table Tasks. Each project is comprised of tasks. Each task has Hours and PercentComplete.
Example table:
ProjectID TaskID Hours PercentComplete
1 1 100 50
1 2 120 80
I am trying to get the weighted percentage complete for the project. I am doing this using the following SQL statement:
SELECT P.ProjectID, P.ProjectName, SUM(T.Hours) AS Hours,
       SUM(T.PercentComplete * T.Hours) / 100 AS CompleteHours,
       SUM(T.PercentComplete * T.Hours) / SUM(T.Hours) AS PercentComplete
FROM Projects AS P INNER JOIN
     Tasks AS T ON T.ProjectID = P.ProjectID
WHERE (P.ProjectID = 1)
GROUP BY P.ProjectID, P.ProjectName
My question is about this part of that statement:
SUM(T.PercentComplete * T.Hours) / SUM(T.Hours) AS PercentComplete
This gives me the correct weighted percentage for this project (in the case of the sample data above, 66%). But I cannot seem to wrap my head around why it does this.
Why does this query work?
SUM(T.PercentComplete * T.Hours) / 100 is the number of complete hours.
SUM(T.Hours) is the total number of hours.
The ratio of these two amounts, i.e.:
(SUM(T.PercentComplete * T.Hours) / 100) / SUM(T.Hours)
is the proportion of hours complete (it should be between 0 and 1).
Multiplying this by 100 gives the percentage.
I prefer to keep percentages like this out of the database and move them to the presentation layer. It would be much easier if the database stored "hours completed" and "hours total" and did not store the percentages at all. The extra factors of 100 in the calculations confuse the issue.
Basically you are finding the number of hours completed over the number of hours total.
SUM(T.PercentComplete * T.Hours) computes the total number of hours that you have completed, scaled by 100: (50 * 100) + (80 * 120) = 5,000 + 9,600 = 14,600 = 146 * 100 is the numerator. 146 hours have been completed on this job, and we keep a 100 multiplier for the percent (because it is [0-100] instead of [0-1]).
Then we find the total number of hours, SUM(T.Hours), which is 100 + 120 = 220.
Then, dividing, we find the weighted average: (146 * 100) / 220 = 0.663636364 * 100 = 66.4%.
Is this what you were wondering about?
It calculates the two sums individually by adding up the value for each row, then divides them at the end.
SUM(T.PercentComplete * T.Hours)
50 * 100 +
80 * 120
-------
14,600
SUM(T.Hours)
100 +
120
---
220
Then the division at the end
14,600 / 220
------------
66.3636
Edit: As per HLGEM's comment, it will actually return 66 due to integer division.
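The difference between the exact result and the integer-division result can be reproduced with a few lines of Python. This is a sketch on the two sample tasks; it assumes Hours and PercentComplete are integer columns, which the question does not state explicitly.

```python
# Task rows from the question: (hours, percent_complete), assumed to be integers.
tasks = [(100, 50), (120, 80)]

weighted_sum = sum(hours * pct for hours, pct in tasks)  # 14600
total_hours = sum(hours for hours, _ in tasks)           # 220

complete_hours = weighted_sum / 100            # hours completed
percent_exact = weighted_sum / total_hours     # true division
percent_integer = weighted_sum // total_hours  # integer division, as with Int columns

print(complete_hours, round(percent_exact, 4), percent_integer)
# 146.0 66.3636 66
```

If the fractional part matters, one of the operands would need to be a decimal or float type before the division takes place.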
Aggregate functions, such as SUM(), work against the set of data defined by the GROUP BY clause. So if you group by ProjectID, ProjectName, the functions will break things down by that.
The SUM operation first multiplies the columns in each row and then adds the products:
( 100* 50+ 120* 80) / (100+ 120)