Calculate weighted average in single query - sql

Example data:
table A
part  rating  numReviews
A308  100     7
A308  98      89
I'm trying to get the weighted average rating for the data above.
It should be the sum of rating*numReviews over all rows, divided by the total numReviews.
This is what I'm trying, but it gives an incorrect result (49.07; it should be 98.15):
select part,
cast((AVG(rating*numReviews)/sum(numReviews)) as decimal(8,2)) as rating_average
from A group by part order by part
Can this be done in a single query? I'm using SQL Server

Just go back to the definition of weighted average, so use sum()s and division:
select part, sum(rating * numreviews) / sum(numreviews) as rating_average
from a
group by part
order by part;
You can convert this to a decimal if you like (the 1.0 * keeps SQL Server from doing integer division when both columns are integers):
select part,
cast(1.0 * sum(rating * numreviews) / sum(numreviews) as decimal(8, 2)) as rating_average
from a
group by part
order by part;
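As a quick sanity check of the arithmetic, here is a minimal sketch that runs both versions against the sample data. It uses Python's built-in sqlite3 with an in-memory database as a stand-in for SQL Server (an assumption made only so the example is runnable), and ROUND in place of the decimal cast.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (part TEXT, rating INTEGER, numReviews INTEGER)")
conn.executemany(
    "INSERT INTO a VALUES (?, ?, ?)",
    [("A308", 100, 7), ("A308", 98, 89)],
)

# Weighted average: sum of rating * weight divided by the sum of the weights.
# The 1.0 * guards against integer division on integer columns.
weighted = conn.execute(
    """
    SELECT part,
           ROUND(1.0 * SUM(rating * numReviews) / SUM(numReviews), 2)
    FROM a GROUP BY part ORDER BY part
    """
).fetchall()

# The original attempt: AVG() already divides the sum by the row count
# (2 here), so the result comes out halved.
wrong = conn.execute(
    """
    SELECT part,
           ROUND(AVG(rating * numReviews) / SUM(numReviews), 2)
    FROM a GROUP BY part ORDER BY part
    """
).fetchall()

print(weighted)
print(wrong)
```

The first query reproduces the expected 98.15 and the second reproduces the 49.07 from the question, which is exactly half because there are two rows.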

Related

How to calculate the percentage of entries that match a condition out of the total? SQL

I've been playing around with this for the whole day, and it's by far the hardest topic in SQL for me to understand.
Say we have a students table, which holds each student's group number and rating.
Each group can contain multiple students.
Now I want to find the groups where at least 60% of students have a rating of 4 or higher.
Expecting something like:
group_n  percentage_of_goodies
1120     0.7
1200     0.66
1111     1
I tried this:
with group_goodies as (
select group_n, count(id) goodies from students
where rating >= 4
group by group_n
), group_counts as (
select group_n, count(id) acount from students
group by group_n
)
select cast(group_goodies.goodies as float)/group_counts.acount from group_counts, group_goodies
where cast(group_goodies.goodies as float)/group_counts.acount > 0.6;
and got an unexpected result, where the percentage seems to exceed 100% (and it's not because I misplaced the denominator, since there are contradictory outputs further down as well), which is obviously not intended. There are also more output rows than there are groups. Apparently I could use window functions, but I can't figure that out myself. So how can I get this query to work?
The problem is that getting the count of students both before and after filtering seems to be impossible within a single query, so I had to create two CTEs to get the needed results. Both CTEs seem to output the proper result: in the first CTE the number of students rarely exceeds 10, and in the second CTE the numbers are naturally smaller, as expected. But when I divide them like that, the result is unreasonable.
If someone can explain this properly, it will make my day 😳
If I understand correctly, this is a pretty direct aggregation query:
select group_n, avg( (rating >= 4)::int ) as good_students
from students
group by group_n
having avg( (rating >= 4)::int ) > 0.6;
I don't see why two levels of aggregation would be needed.
The avg() works by converting each rating to 1 if it is 4 or higher and to 0 otherwise. The average of these values is the fraction of rows that are 1.
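To see the conditional average at work, here is a small sketch using Python's sqlite3. The rows are made up for illustration (the question's actual data isn't shown), and a CASE expression replaces the PostgreSQL-specific `::int` cast so the query runs on SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (id INTEGER PRIMARY KEY, group_n INTEGER, rating INTEGER)"
)
# Hypothetical data: group 1120 has 7 of 10 students rated 4+,
# group 1111 has 2 of 2, group 1300 has 1 of 3.
rows = (
    [(1120, 5)] * 7 + [(1120, 3)] * 3
    + [(1111, 4), (1111, 5)]
    + [(1300, 5), (1300, 2), (1300, 1)]
)
conn.executemany("INSERT INTO students (group_n, rating) VALUES (?, ?)", rows)

# Each row contributes 1.0 if rating >= 4, else 0.0; the average of those
# flags is the fraction of good students in the group.
good_groups = conn.execute(
    """
    SELECT group_n,
           AVG(CASE WHEN rating >= 4 THEN 1.0 ELSE 0.0 END) AS good_students
    FROM students
    GROUP BY group_n
    HAVING AVG(CASE WHEN rating >= 4 THEN 1.0 ELSE 0.0 END) > 0.6
    ORDER BY group_n
    """
).fetchall()

print(good_groups)
```

With this data only groups 1111 and 1120 pass the filter; group 1300 (one good student out of three) is dropped.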
First, aggregate the students per group and rating to get a partial count.
Then calculate the fraction of ratings that are 4 or better and report only those groups where that fraction exceeds 0.6.
SELECT group_n,
1.0 *
sum(ratings_per_group) FILTER (WHERE rating >= 4) /
sum(ratings_per_group)
AS fraction_of_goodies
FROM (SELECT group_n, rating,
count(*) AS ratings_per_group
FROM students
GROUP BY group_n, rating
) AS per_group_and_rating
GROUP BY group_n
HAVING sum(ratings_per_group) FILTER (WHERE rating >= 4) /
sum(ratings_per_group) > 0.6;
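As for why the two-CTE attempt in the question misbehaves: it lists both CTEs separated by a comma with no join condition, so every row of one is paired with every row of the other. That would explain both the extra rows and the ratios above 1. The sketch below, using Python's sqlite3 and made-up rows, compares that version with one that joins on group_n. The table aliases `c` and `g` are introduced only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (id INTEGER PRIMARY KEY, group_n INTEGER, rating INTEGER)"
)
# Hypothetical data: 7 of 10, 2 of 2, and 1 of 3 students rated 4+.
rows = (
    [(1120, 5)] * 7 + [(1120, 3)] * 3
    + [(1111, 4), (1111, 5)]
    + [(1300, 5), (1300, 2), (1300, 1)]
)
conn.executemany("INSERT INTO students (group_n, rating) VALUES (?, ?)", rows)

# The two CTEs from the question, unchanged apart from layout.
ctes = """
WITH group_goodies AS (
    SELECT group_n, COUNT(id) AS goodies FROM students
    WHERE rating >= 4 GROUP BY group_n
), group_counts AS (
    SELECT group_n, COUNT(id) AS acount FROM students
    GROUP BY group_n
)
"""

# Comma-separated tables with no join condition: a cross join.
cross_joined = conn.execute(ctes + """
SELECT CAST(g.goodies AS REAL) / c.acount
FROM group_counts AS c, group_goodies AS g
WHERE CAST(g.goodies AS REAL) / c.acount > 0.6
""").fetchall()

# Same CTEs, but each group's count is matched only with its own goodies.
joined = conn.execute(ctes + """
SELECT c.group_n, CAST(g.goodies AS REAL) / c.acount AS percentage_of_goodies
FROM group_counts AS c
JOIN group_goodies AS g ON g.group_n = c.group_n
WHERE CAST(g.goodies AS REAL) / c.acount > 0.6
ORDER BY c.group_n
""").fetchall()

print(len(cross_joined), max(r[0] for r in cross_joined))
print(joined)
```

On this data the cross join returns five rows with ratios as high as 3.5, while the joined version returns just the two qualifying groups.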

Redshift - Find % as compared to total value

I have a table with count by product. I am trying to add a new column that would find % as compared to sum of all rows in that column.
prod_name,count
prod_a,100
prod_b,50
prod_c,150
For example, I want to find % of prod_a as compared to the total count and so on.
Expected output:
prod_name,count,%
prod_a,100,0.33
prod_b,50,0.167
prod_c,150,0.5
Edit on SQL:
select count(*),ratio_to_report(prod_name)
over (partition by count(*))
from sales
group by prod_name;
Use a window function:
select t.*,100.0*cnt_by_prod/sum(cnt_by_prod) over() as pct
from tbl t
Edit: Based on the OP's revised question, to compute the counts and then the percentage, use:
select prod_name,100.0*count(*)/sum(count(*)) over()
from tbl
group by prod_name
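Here is a minimal sketch of the share-of-total idea on the sample counts, run through Python's sqlite3 rather than Redshift (an assumption for runnability; window functions need SQLite 3.25 or newer). The column name `cnt` is used instead of `count` to avoid clashing with the function name, and the ratio is left as a fraction to match the expected output in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (prod_name TEXT, cnt INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    [("prod_a", 100), ("prod_b", 50), ("prod_c", 150)],
)

# SUM(cnt) OVER () is the grand total repeated on every row, so each row
# can be divided by it without a self-join or a subquery.
shares = conn.execute(
    """
    SELECT prod_name, cnt,
           ROUND(1.0 * cnt / SUM(cnt) OVER (), 3) AS share
    FROM tbl
    ORDER BY prod_name
    """
).fetchall()

for row in shares:
    print(row)
```

Multiply by 100.0 instead of 1.0 if you want a percentage rather than a fraction.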

SQL - 1. Round the difference to 2 decimal places

I am trying to create an SQL statement with a subquery in the SELECT attribute list to show the product id, the current price and the difference between the current price and the overall average.
I know that using the ROUND function will round the difference to zero decimals but I want to round the difference to 2 decimal places.
SELECT p_code, p_price, ROUND(p_price - (SELECT AVG(p_price) FROM product)) AS "Difference"
FROM product;
I tried using CAST but it still gave me the same output.
SELECT p_code, p_price, CAST(ROUND(p_price - (SELECT AVG(p_price) FROM Lab6_Product)) as numeric(10,2)) AS "Difference"
FROM lab6_product;
Thank you in advance for your time and help!
round() takes a second argument:
SELECT p_code, p_price,
ROUND(p_price - AVG(p_price) OVER (), 2) AS "Difference"
FROM product;
Note that I also changed the subquery to a window function.
I often recommend converting to a number (or decimal/numeric) instead:
SELECT p_code, p_price,
cast(p_price - AVG(p_price) OVER () as number(10, 2)) AS "Difference"
FROM product;
This ensures that the two decimal places are displayed as well.
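A short sketch of ROUND with its second argument, using Python's sqlite3 and three made-up prices (the question's product table isn't shown). It assumes SQLite 3.25 or newer for the window function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (p_code TEXT, p_price REAL)")
# Hypothetical prices; their average is 55 / 3 = 18.333...
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [("P1", 10.0), ("P2", 20.0), ("P3", 25.0)],
)

# AVG(p_price) OVER () puts the overall average on every row;
# the second argument of ROUND is the number of decimal places.
diffs = conn.execute(
    """
    SELECT p_code, p_price,
           ROUND(p_price - AVG(p_price) OVER (), 2) AS difference
    FROM product
    ORDER BY p_code
    """
).fetchall()

for row in diffs:
    print(row)
```

The differences come out as -8.33, 1.67 and 6.67 instead of being rounded to whole numbers.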

Access SQL: using MIN() function with subquery

I'm trying to calculate the 90th percentile on a column of my data. MS Access doesn't have a PERCENTILE function, so I'm taking the top 100 values (of 1000 in total), and then taking the minimum of the values that are returned. I'm however having some difficulty using the MS Access MIN() function. My query currently looks as follows:
SELECT MIN(*)
FROM (SELECT TOP 100 ([table1].[field1] + [table1].[field2]) FROM [table1]);
This gives me the error:
Syntax error (missing operator) in query expression 'MIN(*'.
Why am I not allowed to use the asterisk with the MIN function? Am I calculating this value completely incorrectly?
First, you need an order by if you want to get the 90th percentile.
Second, you need a column name. Sorting descending and keeping the top 10 percent means the minimum of the subquery is the 90th percentile:
SELECT MIN(val)
FROM (SELECT TOP 10 PERCENT ([table1].[field1] + [table1].[field2]) as val
FROM [table1]
ORDER BY ([table1].[field1] + [table1].[field2]) DESC
) as t;
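The approach described in the question (take the largest tenth of the values, then the minimum of those) can be checked with a small sketch. This uses Python's sqlite3 instead of Access, so ORDER BY ... DESC LIMIT stands in for TOP, and the 1000 rows are generated purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (field1 INTEGER, field2 INTEGER)")
# Hypothetical data: 1000 rows whose sums are 2, 4, ..., 2000.
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(i, i) for i in range(1, 1001)],
)

total = conn.execute("SELECT COUNT(*) FROM table1").fetchone()[0]
top_n = total // 10  # size of the top 10 percent

# Sort descending, keep the largest tenth, then take the smallest of those.
# The derived column needs a name (val) so that MIN() has something to refer to.
p90 = conn.execute(
    """
    SELECT MIN(val)
    FROM (SELECT field1 + field2 AS val
          FROM table1
          ORDER BY val DESC
          LIMIT ?) AS t
    """,
    (top_n,),
).fetchone()[0]

print(p90)
```

Without the ORDER BY, the subquery would return an arbitrary 100 rows and the minimum would not mean anything.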

Calculate an average as a percent?

This is a simple problem and I'm surprised I couldn't find an answer to it.
All I need is to calculate the average of a column and have the output as a percent.
Select round(avg(amount)::numeric, 2)
From Table
All I need is to figure out how to make that a percent. Right now it comes out as 12345.67. How would I convert that output to a percentage in the query?
Maybe you need something like this:
select avg(amount) * 100 / sum(amount)
from Table
Or you can add a percent sign to it like this (SQL Server syntax):
select CAST(avg(amount) * 100 / sum(amount) as VARCHAR(max)) + '%' AS Perc
from Table
Or you can just concatenate a percent sign onto your original query (PostgreSQL syntax, to match the :: cast):
Select round(avg(amount)::numeric, 2)::text || '%' AS Perc
From Table
Result / Total * 100 = Percentage
In your case, the Result is avg(amount)::numeric
and the Total is whatever the maximum that amount could be.
Your best bet would be to use whatever tools are available to you, but you can also do it in the query.
Divide the average by the sum and multiply by 100:
select avg(amount) / sum(amount) * 100
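To make it concrete what these expressions actually return, here is a rough sketch on three made-up amounts, using Python's sqlite3 (an assumption for runnability; the question itself uses PostgreSQL syntax). SQLite's printf() is used for the percent-sign formatting in place of a cast and concatenation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (amount REAL)")
# Hypothetical amounts: average 200, sum 600.
conn.executemany("INSERT INTO t VALUES (?)", [(100.0,), (200.0,), (300.0,)])

# pct:   the average expressed as a percentage of the sum.
#        Since avg = sum / row count, this is always 100 / row count.
# label: the plain average with two decimals and a percent sign appended.
pct, label = conn.execute(
    """
    SELECT AVG(amount) * 100 / SUM(amount),
           printf('%.2f%%', AVG(amount))
    FROM t
    """
).fetchone()

print(pct)
print(label)
```

Which of the two is wanted depends on what the percentage is supposed to be relative to, which the question leaves open.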