SQL Work out average from joined column

I have 3 columns I need to display, and I need to join on another column that calculates the AVG of the CLUB_FEE column. My code does not work; it throws a "not a single-group group function" error. Can someone please help? Here is my SQL:
SELECT S.MEMBER_ID, S.CLUB_ID, C.CLUB_FEE, AVG(C.CLUB_FEE) AVGINCOME
FROM SUBSCRIPTION S, CLUB C
WHERE S.CLUB_ID = C.CLUB_ID;

I suggest using an INNER JOIN. Also, when you include an aggregate function (like AVG or SUM) in your query, you must group by all of the non-aggregated columns in the SELECT list:
SELECT S.MEMBER_ID, S.CLUB_ID, C.CLUB_FEE, AVG(C.CLUB_FEE) as AVGINCOME
FROM SUBSCRIPTION S INNER JOIN CLUB C
ON S.CLUB_ID = C.CLUB_ID
GROUP BY
S.MEMBER_ID, S.CLUB_ID, C.CLUB_FEE ;

Learn to use explicit JOIN syntax. Simple rule: Never use commas in the FROM clause. Always use explicit JOIN syntax.
In your case, you need to remove columns from the SELECT and the GROUP BY. If you want the average fee paid by any member, then you don't need the GROUP BY at all:
SELECT AVG(C.CLUB_FEE) as AVGINCOME
FROM SUBSCRIPTION S JOIN
CLUB C
ON S.CLUB_ID = C.CLUB_ID;
If you want to control the formatting, either use to_char():
SELECT TO_CHAR(AVG(C.CLUB_FEE), '999.99') as AVGINCOME
(check the documentation for other formats).
Or, cast to a decimal:
SELECT CAST(AVG(C.CLUB_FEE) AS DECIMAL(10, 2)) as AVGINCOME

If you need to display the three columns and the average, not just the average alone, you can do something like this:
SELECT S.MEMBER_ID, S.CLUB_ID, C.CLUB_FEE, A.AVGINCOME
FROM SUBSCRIPTION S INNER JOIN CLUB C
ON S.CLUB_ID = C.CLUB_ID
CROSS JOIN (SELECT AVG(CLUB_FEE) AS AVGINCOME FROM CLUB) A
;
If you need the average rounded to two decimal places, use ROUND(AVG(CLUB_FEE), 2) in the subquery.
A fancier solution, which doesn't require the extra subquery (so it doesn't scan the CLUB table twice), uses AVG as an analytic function, but doesn't partition by anything. You still need the OVER clause to indicate it's used as an analytic function, not as an aggregate; here it contains a PARTITION BY on a constant, so all rows fall into a single partition.
SELECT S.MEMBER_ID, S.CLUB_ID, C.CLUB_FEE,
ROUND(AVG(C.CLUB_FEE) OVER (PARTITION BY NULL), 2) AS AVGINCOME
FROM SUBSCRIPTION S INNER JOIN CLUB C
ON S.CLUB_ID = C.CLUB_ID
;
Even fancier (although functionally identical) - the keyword OVER is needed to indicate analytic function, but you can also write it as OVER() (no need to even mention PARTITION BY NULL).
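One thing worth checking before choosing between the two approaches above: they do not necessarily return the same number. The CROSS JOIN subquery averages over the rows of CLUB, while the analytic AVG averages over the joined rows, so clubs with more subscriptions weigh more. The following is a minimal sketch using Python's sqlite3 module (SQLite 3.25 or later is assumed for window functions); the sample rows are invented for illustration, and only the table and column names come from the question.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CLUB (CLUB_ID INTEGER PRIMARY KEY, CLUB_FEE REAL);
    CREATE TABLE SUBSCRIPTION (MEMBER_ID INTEGER, CLUB_ID INTEGER);
    INSERT INTO CLUB VALUES (1, 10.0), (2, 40.0);
    INSERT INTO SUBSCRIPTION VALUES (101, 1), (102, 1), (103, 1), (104, 2);
""")

# Average taken over the CLUB table alone: (10 + 40) / 2
cross_join_rows = conn.execute("""
    SELECT S.MEMBER_ID, S.CLUB_ID, C.CLUB_FEE, A.AVGINCOME
    FROM SUBSCRIPTION S
    INNER JOIN CLUB C ON S.CLUB_ID = C.CLUB_ID
    CROSS JOIN (SELECT AVG(CLUB_FEE) AS AVGINCOME FROM CLUB) A
    ORDER BY S.MEMBER_ID
""").fetchall()

# Average taken over the joined rows: (10 + 10 + 10 + 40) / 4
window_rows = conn.execute("""
    SELECT S.MEMBER_ID, S.CLUB_ID, C.CLUB_FEE,
           AVG(C.CLUB_FEE) OVER () AS AVGINCOME
    FROM SUBSCRIPTION S
    INNER JOIN CLUB C ON S.CLUB_ID = C.CLUB_ID
    ORDER BY S.MEMBER_ID
""").fetchall()

print(cross_join_rows[0])  # (101, 1, 10.0, 25.0)
print(window_rows[0])      # (101, 1, 10.0, 17.5)
```

If the goal is "the average fee paid per subscription", the analytic form is the one to use; if it is "the average of the fees listed in CLUB", the subquery form matches that.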

Related

SQL - sum column for every date

This seemed like a very easy thing to do but I got stuck. I have a query like this:
select op.date, count(p.numberofoutstanding)
from people p
left join outstandingpunches op
on p.fullname = op.fullname
group by op.date
That outputs a table like this:
How can I sum over the dates so that the value in each row equals the sum of all counts up to that date? For example, the first row would be 27, the second would be 27 + 4, the third 27 + 4 + 11, etc.
I encountered this and this question, and I saw people using OVER in their queries for this, but I'm confused about what I have to partition by. I tried partitioning by date, but it gives me incorrect results.
You can use a cumulative sum. This looks like:
select op.date, count(*),
sum(count(*)) over (order by op.date) as running_count
from people p join
outstandingpunches op
on p.fullname = op.fullname
group by op.date;
Note: I changed the join from a left join to an inner join. You are aggregating by a column in the second table. Your results have no examples of a NULL date column and that doesn't seem useful. Hence, it seems that rows are assumed to match.
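To see how the nested SUM(COUNT(*)) OVER (ORDER BY ...) behaves, here is a small sketch run through Python's sqlite3 module (SQLite 3.25 or later is assumed for window functions). The names and dates in the sample rows are made up; only the table and column names come from the question. The inner COUNT(*) is evaluated per group first, and the window SUM then accumulates those per-date counts in date order.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (fullname TEXT);
    CREATE TABLE outstandingpunches (fullname TEXT, date TEXT);
    INSERT INTO people VALUES ('Ann'), ('Bob'), ('Cid');
    INSERT INTO outstandingpunches VALUES
        ('Ann', '2020-01-01'), ('Bob', '2020-01-01'),
        ('Ann', '2020-01-02'),
        ('Bob', '2020-01-03'), ('Cid', '2020-01-03'), ('Ann', '2020-01-03');
""")

# COUNT(*) is computed per date; the window SUM adds up those counts
# from the first date through the current one.
rows = conn.execute("""
    SELECT op.date,
           COUNT(*) AS daily_count,
           SUM(COUNT(*)) OVER (ORDER BY op.date) AS running_count
    FROM people p
    JOIN outstandingpunches op ON p.fullname = op.fullname
    GROUP BY op.date
    ORDER BY op.date
""").fetchall()

for row in rows:
    print(row)
# ('2020-01-01', 2, 2)
# ('2020-01-02', 1, 3)
# ('2020-01-03', 3, 6)
```

No PARTITION BY is needed here: partitioning by date would restart the total on every row, which is why that attempt gives the daily count back instead of a running sum.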
I believe you need to use sum and not count.
select o.date_c,
sum(sum(p.numberofoutstanding)) over (order by o.date_c)
from people p
left join outstandingpunches o on p.fullname = o.fullname
group by o.date_c;
Keep in mind that I have renamed your column date to date_c. I believe you should not use data type names as column names.

Difference between HAVING and WHERE in SQL

I've seen in other questions that the difference between HAVING and WHERE in SQL is that HAVING is used post-aggregation whereas WHERE is used pre-aggregation. However, I am still unsure about when to use pre-aggregation filtering or post-aggregation filtering.
As a concrete example, why don't these two queries yield the same result (the second sums quantity prematurely in a way that squashes the GROUP BY call)?
Using WHERE to obtain number of condo sales of each real estate agent.
SELECT agentId, SUM(quantity) total_sales
FROM sales s, houses h
WHERE s.houseId = h.houseId AND h.type = "condo"
GROUP BY agentId
ORDER BY total_sales;
Attempted use of HAVING to obtain the same quantity as above.
SELECT agentId, SUM(quantity) total_sales
FROM sales s, houses h
GROUP BY agentId
HAVING s.houseId = h.houseId AND h.type = "condo"
ORDER BY total_sales;
Note: these were written/tested/executed in sqlite3.
The simple way to think about it is to consider the order in which the steps are applied.
Step 1: Where clause filters data
Step 2: Group by is implemented (SUM / MAX / MIN / ETC)
Step 3: Having clause filters the results
So in your 2 examples:
SELECT agentId, SUM(quantity) total_sales
FROM sales s, houses h
WHERE s.houseId = h.houseId AND h.type = "condo"
GROUP BY agentId
ORDER BY total_sales;
Step 1: Filter by HouseId and Condo
Step 2: Add up the results
(number of houses that match the houseid and condo)
SELECT agentId, SUM(quantity) total_sales
FROM sales s, houses h
GROUP BY agentId
HAVING s.houseId = h.houseId AND h.type = "condo"
ORDER BY total_sales;
Step 1: No Filter
Step 2: Add up quantity of all houses
Step 3: Filter the results by houseid and condo.
Hopefully this clears up what is happening.
The easiest way to decide which you should use is:
- Use WHERE to filter the data
- Use HAVING to filter the results of an aggregation (SUM / MAX / MIN / ETC)
WHERE filters rows from the database. Then, if the query has aggregation, the aggregation is run based on the aggregate functions and the GROUP BY clause in the query. After that point, HAVING is applied to filter the grouped results. The only filtering that HAVING allows is filtering on GROUP BY columns or calculated aggregates.
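The ordering described above can be sketched with a tiny data set, run here through Python's sqlite3 module. The rows are invented for illustration (including the 'villa' type); the table and column names follow the question. WHERE decides which rows are summed, and HAVING then decides which of the resulting groups are kept.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE houses (houseId INTEGER PRIMARY KEY, type TEXT);
    CREATE TABLE sales (agentId INTEGER, houseId INTEGER, quantity INTEGER);
    INSERT INTO houses VALUES (1, 'condo'), (2, 'villa');
    INSERT INTO sales VALUES (7, 1, 2), (7, 2, 5), (8, 1, 1), (9, 2, 4);
""")

# WHERE removes the non-condo rows before the groups are formed,
# so only condo quantities are summed.
condo_totals = conn.execute("""
    SELECT s.agentId, SUM(s.quantity) AS total_sales
    FROM sales s
    JOIN houses h ON s.houseId = h.houseId
    WHERE h.type = 'condo'
    GROUP BY s.agentId
    ORDER BY total_sales
""").fetchall()

# HAVING runs after the sums exist and filters whole groups by an aggregate.
big_condo_sellers = conn.execute("""
    SELECT s.agentId, SUM(s.quantity) AS total_sales
    FROM sales s
    JOIN houses h ON s.houseId = h.houseId
    WHERE h.type = 'condo'
    GROUP BY s.agentId
    HAVING SUM(s.quantity) >= 2
    ORDER BY total_sales
""").fetchall()

print(condo_totals)       # [(8, 1), (7, 2)]
print(big_condo_sellers)  # [(7, 2)]
```

Agent 9 sold only a villa, so that agent disappears at the WHERE step; agent 8 survives WHERE but is dropped by HAVING because the condo total is below the threshold.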
I must assume that you're using MySQL for your example query since, as other answers have noted, your HAVING clause doesn't make sense and MySQL has some default behaviors which are occasionally problematic and confusing.
First, learn to use proper, explicit, standard JOIN syntax.
Second, your query should look like:
SELECT s.agentId, SUM(s.quantity) as total_sales
FROM sales s JOIN
houses h
ON s.houseId = h.houseId
WHERE h.type = 'condo'
GROUP BY s.agentId
ORDER BY total_sales;
Your version of the query should generate an error in any reasonable database, because the HAVING clause has columns that are neither GROUP BY keys nor aggregation functions.
Additional notes:
- The delimiter for a string is single quotes. If you use double quotes, things may not work as you expect.
- You should qualify all column references, especially when your query references more than one table.
- JOIN conditions belong in the ON clause, not in a WHERE clause.
- Filtering on h.type after the aggregation makes no sense. If it did work, the sum() would include non-condos because the filtering is happening too late.

MS Access query aggregation

I am trying to run a query like this:
SELECT sales.action_date, sales.item_id, items.item_name,
sales.item_quantity, sales.item_price, sales.net
FROM sales INNER JOIN items ON sales.item_id = items.ID
GROUP BY sales.item_id
HAVING (((sales.action_date)=[Forms]![rep_frm]![Text13].[value]));
Every time I try to show the data, this message appears:
your query does not include the specified expression ' action date '
as part of aggregate function.
The same happens for every field in the query, but I just want the aggregation to be on item_id.
What should I do?
You don't have any aggregations like SUM in your SELECT statement. I also don't understand why sales.action_date is in the HAVING clause. HAVING is for filtering on aggregates, like SUM(sales.item_price) <> 0. It should be possible to put this condition in the WHERE clause, before the GROUP BY, instead of in the HAVING clause.
This example should work:
SELECT sales.item_id, items.item_name, SUM(sales.item_quantity),
SUM(sales.item_price), SUM(sales.net)
FROM sales INNER JOIN items ON sales.item_id = items.ID
WHERE sales.action_date=[Forms]![rep_frm]![Text13].[value]
GROUP BY sales.item_id, items.item_name;
When you are grouping your data, every field in the select list should either be included in the GROUP BY clause or have an aggregate function applied to it; otherwise it doesn't make sense.
By the way, as far as I can see, you should use WHERE(((sales.action_date)=[Forms]![rep_frm]![Text13].[value])) before the GROUP BY, not HAVING after it.
If you want to aggregate by date you have to put the date in the GROUP BY clause
SELECT sales.action_date,
SUM(sales.item_quantity),
SUM(sales.item_quantity * sales.item_price) as Total,
SUM(sales.net)
FROM sales
INNER JOIN items ON sales.item_id = items.ID
WHERE (((sales.action_date)=[Forms]![rep_frm]![Text13].[value]))
GROUP BY sales.action_date;
Only the column you want to group by can appear in the GROUP BY clause. Only these columns can appear in the select clause outside of aggregation functions.
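As a rough illustration of the pattern these answers describe (filter with WHERE, group by every non-aggregated column, aggregate the rest), here is a sketch using Python's sqlite3 module. It is not Access: the form reference [Forms]![rep_frm]![Text13] is replaced by a bound parameter named report_date, the sample rows are invented, and SQLite is more lenient than Access about ungrouped columns, so this only shows what the corrected query returns.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (ID INTEGER PRIMARY KEY, item_name TEXT);
    CREATE TABLE sales (action_date TEXT, item_id INTEGER,
                        item_quantity INTEGER, item_price REAL, net REAL);
    INSERT INTO items VALUES (1, 'pen'), (2, 'ink');
    INSERT INTO sales VALUES
        ('2021-05-01', 1, 2, 1.5, 3.0),
        ('2021-05-01', 1, 4, 1.5, 6.0),
        ('2021-05-01', 2, 1, 8.0, 8.0),
        ('2021-05-02', 1, 9, 1.5, 13.5);
""")

# Stand-in for the value the Access form control would supply (assumption).
report_date = "2021-05-01"

# The date filter is a row-level condition, so it goes in WHERE.
# item_id and item_name are not aggregated, so both appear in GROUP BY.
rows = conn.execute("""
    SELECT sales.item_id, items.item_name,
           SUM(sales.item_quantity) AS total_quantity,
           SUM(sales.net) AS total_net
    FROM sales
    INNER JOIN items ON sales.item_id = items.ID
    WHERE sales.action_date = ?
    GROUP BY sales.item_id, items.item_name
    ORDER BY sales.item_id
""", (report_date,)).fetchall()

print(rows)  # [(1, 'pen', 6, 9.0), (2, 'ink', 1, 8.0)]
```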

How to GROUP BY on Oracle?

I need help with Oracle SQL. My GROUP BY doesn't work, and I'm working in a shell, so I don't have any help.
Can someone tell me how to group the following query by noArticle?
SELECT Article.noArticle, quantite
FROM Article LEFT JOIN LigneCommande ON Article.noArticle = LigneCommande.noArticle
GROUP BY Article.noArticle
/
Thank you
To tie things up, this is the correct SQL.
SELECT Article.noArticle, sum(quantite)
FROM Article LEFT JOIN LigneCommande ON Article.noArticle = LigneCommande.noArticle
GROUP BY Article.noArticle
You are grouping by a column and then attempting to use the quantite field, which is not group-level but record-level. GROUP BY is aggregation, so you have to use aggregate columns (the columns you are grouping by, or aggregate functions on columns, like sum, avg, count, max or min). You need to aggregate your record-level fields to be able to use them in your projection (the select clause). To give an example, your attempt was like trying to get the hair color of American women (of course, there are many American women and they might have different hair colors, so it is unnatural and unwise to attempt to get a single hair color value from the set of American women). Your fixed query is as follows:
SELECT Article.noArticle, sum(quantite)
FROM Article LEFT JOIN LigneCommande ON Article.noArticle = LigneCommande.noArticle
GROUP BY Article.noArticle
In my situation I need the sum of quantite, so to make it work I added SUM(quantite) and then grouped by noArticle.
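One detail of the corrected query that may be worth knowing: because it uses a LEFT JOIN, an article with no matching LigneCommande rows still appears in the result, and its SUM is NULL rather than 0. The sketch below runs the query through Python's sqlite3 module with invented sample rows (only the table and column names come from the question) and shows COALESCE as one common way to turn that NULL into 0. COALESCE is also available in Oracle.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Article (noArticle INTEGER PRIMARY KEY);
    CREATE TABLE LigneCommande (noArticle INTEGER, quantite INTEGER);
    INSERT INTO Article VALUES (10), (20), (30);
    INSERT INTO LigneCommande VALUES (10, 5), (10, 3), (20, 1);
""")

# Article 30 has no order lines: the LEFT JOIN keeps it, and SUM over
# its single all-NULL row yields NULL (None in Python).
rows = conn.execute("""
    SELECT Article.noArticle,
           SUM(quantite) AS total,
           COALESCE(SUM(quantite), 0) AS total_or_zero
    FROM Article
    LEFT JOIN LigneCommande ON Article.noArticle = LigneCommande.noArticle
    GROUP BY Article.noArticle
    ORDER BY Article.noArticle
""").fetchall()

print(rows)  # [(10, 8, 8), (20, 1, 1), (30, None, 0)]
```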

Use of the HAVING clause when using multiple sums

I was having a problem getting multiple sums from multiple tables. Short story: my problem was solved in the "sql sum data from multiple tables" thread on this site. Where it came up short is that now I'd like to show only sums that are greater than a certain amount. So while I have sub-selects in my select, I think I need to use a HAVING clause to filter out the summed amounts that are too low.
Example, using the code specified in the link above (more specifically the answer that the owner has chosen as correct), I would only like to see a query result if SUM(AP2.Value) > 1500. Any thoughts?
If you need to filter on the results of ANY aggregate function, you MUST use a HAVING clause. WHERE is applied at the row level as the DB scans the tables for matching rows. HAVING is applied basically immediately before the result set is sent out to the client. At the time WHERE operates, the aggregate function results are not (and cannot be) available, so you have to use a HAVING clause, which is applied after the main query is complete and all aggregate results are available.
So... long story short, yes, you'll need to do
SELECT ...
FROM ...
WHERE ...
HAVING (SUM_AP > 1500)
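Note that whether you can use column aliases in the HAVING clause depends on the database: MySQL and SQLite accept them, while others, such as SQL Server and PostgreSQL, require you to repeat the aggregate expression. In technical terms, HAVING on a query like the one above works basically the same as wrapping the initial query in another query and applying another WHERE clause on the wrapper: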
SELECT *
FROM (
SELECT ...
) AS child
WHERE (SUM_AP > 1500)
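The equivalence described above can be checked on a small example. This sketch uses Python's sqlite3 module and a simplified, hypothetical AP table with made-up rows (the real query in the linked thread is more involved). It writes the HAVING condition with the full aggregate expression, which is the most portable form, and compares it with the wrapped-subquery form.

```python
import sqlite3

# Hypothetical, simplified AP table; only the names PROJECT, Value and
# SUM_AP are taken from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE AP (PROJECT TEXT, Value REAL);
    INSERT INTO AP VALUES ('A', 1000), ('A', 900), ('B', 700), ('B', 300);
""")

# Filter on the aggregate directly with HAVING.
having_rows = conn.execute("""
    SELECT PROJECT, SUM(Value) AS SUM_AP
    FROM AP
    GROUP BY PROJECT
    HAVING SUM(Value) > 1500
""").fetchall()

# Same result: aggregate in a subquery, then filter the alias with WHERE.
wrapped_rows = conn.execute("""
    SELECT *
    FROM (
        SELECT PROJECT, SUM(Value) AS SUM_AP
        FROM AP
        GROUP BY PROJECT
    ) AS child
    WHERE SUM_AP > 1500
""").fetchall()

print(having_rows)   # [('A', 1900.0)]
print(wrapped_rows)  # [('A', 1900.0)]
```

The wrapped form is the one to reach for when the sums come from correlated sub-selects in the SELECT list, since those are not aggregates of the outer query and cannot be referenced in its HAVING clause on every database.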
You could wrap that query as a subselect and then specify your criteria in the WHERE clause:
SELECT
PROJECT,
SUM_AP,
SUM_INV
FROM (
SELECT
AP1.[PROJECT],
(SELECT SUM(AP2.Value) FROM AP AS AP2 WHERE AP2.PROJECT = AP1.PROJECT) AS SUM_AP,
(SELECT SUM(INV2.Value) FROM INV AS INV2 WHERE INV2.PROJECT = AP1.PROJECT) AS SUM_INV
FROM AP AS AP1
INNER JOIN INV AS INV1 ON
AP1.[PROJECT] = INV1.[PROJECT]
WHERE
AP1.[PROJECT] = 'XXXXX'
GROUP BY
AP1.[PROJECT]
) SQ
WHERE
SQ.SUM_AP > 1500