Use of the HAVING clause when using multiple sums - sql

I was having a problem getting multiple sums from multiple tables. Short story: my problem was solved in the "sql sum data from multiple tables" thread on this site. Where that answer comes up short is that I'd now like to show only sums that are greater than a certain amount. So while I have sub-selects in my select, I think I need to use a HAVING clause to filter out the summed amounts that are too low.
For example, using the code from the thread mentioned above (more specifically, the answer that the owner chose as correct), I would only like to see a query result if SUM(AP2.Value) > 1500. Any thoughts?

If you need to filter on the result of ANY aggregate function, you MUST use a HAVING clause. WHERE is applied at the row level, as the DB scans the tables for matching rows. HAVING is applied after grouping, just before the result set is sent out to the client. At the time WHERE operates, the aggregate results are not (and cannot be) available, so you have to use a HAVING clause, which is applied once the rows have been grouped and all aggregate results are available.
So... long story short, yes, you'll need to do
SELECT ...
FROM ...
WHERE ...
HAVING (SUM_AP > 1500)
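Note that whether you can use column aliases in the HAVING clause depends on the database: MySQL and SQLite accept them, while SQL Server, for example, does not, so there you have to repeat the aggregate expression (or use the wrapper form below). In technical terms, HAVING on a query as above works basically the same as wrapping the initial query in another query and applying another WHERE clause on the wrapper: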
SELECT *
FROM (
SELECT ...
) AS child
WHERE (SUM_AP > 1500)
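To make the WHERE versus HAVING distinction concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table `ap` and its rows are made up for illustration and are not the asker's data; the HAVING clause repeats the aggregate expression rather than the alias so that the same SQL should also be acceptable to databases that reject aliases there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ap (project TEXT, value INTEGER);  -- hypothetical sample table
    INSERT INTO ap VALUES ('A', 1000), ('A', 900), ('B', 700), ('B', 600), ('C', 2000);
""")

# HAVING filters whole groups, after the sums have been computed.
having_rows = conn.execute("""
    SELECT project, SUM(value) AS sum_ap
    FROM ap
    GROUP BY project
    HAVING SUM(value) > 1500
    ORDER BY project
""").fetchall()

# The wrapper form: aggregate in an inner query, filter with WHERE outside.
wrapped_rows = conn.execute("""
    SELECT *
    FROM (SELECT project, SUM(value) AS sum_ap FROM ap GROUP BY project) AS child
    WHERE sum_ap > 1500
    ORDER BY project
""").fetchall()

# WHERE on the base table filters single rows before any sum exists,
# which answers a different question.
where_rows = conn.execute("""
    SELECT project, SUM(value) AS sum_ap
    FROM ap
    WHERE value > 1500
    GROUP BY project
    ORDER BY project
""").fetchall()

print(having_rows, wrapped_rows, where_rows)
```

With this sample data, project A (1000 + 900) passes the HAVING filter even though neither of its rows exceeds 1500 on its own, which is exactly what a row-level WHERE cannot express.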

You could wrap that query as a subselect and then specify your criteria in the WHERE clause:
SELECT
PROJECT,
SUM_AP,
SUM_INV
FROM (
SELECT
AP1.[PROJECT],
(SELECT SUM(AP2.Value) FROM AP AS AP2 WHERE AP2.PROJECT = AP1.PROJECT) AS SUM_AP,
(SELECT SUM(INV2.Value) FROM INV AS INV2 WHERE INV2.PROJECT = AP1.PROJECT) AS SUM_INV
FROM AP AS AP1
INNER JOIN INV AS INV1 ON
AP1.[PROJECT] = INV1.[PROJECT]
WHERE
AP1.[PROJECT] = 'XXXXX'
GROUP BY
AP1.[PROJECT]
) SQ
WHERE
SQ.SUM_AP > 1500
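As a rough check of the wrapped query, the sketch below runs the same shape of SQL against SQLite from Python. The tables, column names and rows are simplified stand-ins (assumptions, not the real AP and INV tables), and the filter on a single project is dropped so that the outer WHERE has something to remove. A second query shows why the sums are computed in correlated subselects rather than directly over the join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ap  (project TEXT, value INTEGER);  -- hypothetical sample data
    CREATE TABLE inv (project TEXT, value INTEGER);
    INSERT INTO ap  VALUES ('P1', 1000), ('P1', 800), ('P2', 500);
    INSERT INTO inv VALUES ('P1', 300), ('P1', 200), ('P1', 100), ('P2', 50);
""")

# Inner query computes one sum per table via correlated subselects;
# the outer WHERE then filters on the computed column.
filtered = conn.execute("""
    SELECT project, sum_ap, sum_inv
    FROM (
        SELECT ap1.project,
               (SELECT SUM(ap2.value)  FROM ap  AS ap2  WHERE ap2.project  = ap1.project) AS sum_ap,
               (SELECT SUM(inv2.value) FROM inv AS inv2 WHERE inv2.project = ap1.project) AS sum_inv
        FROM ap AS ap1
        INNER JOIN inv AS inv1 ON ap1.project = inv1.project
        GROUP BY ap1.project
    ) AS sq
    WHERE sq.sum_ap > 1500
""").fetchall()

# Summing straight over the join counts each ap row once per matching inv row.
naive = conn.execute("""
    SELECT ap.project, SUM(ap.value)
    FROM ap
    INNER JOIN inv ON ap.project = inv.project
    GROUP BY ap.project
    ORDER BY ap.project
""").fetchall()

print(filtered, naive)
```

Here P1 has two ap rows and three inv rows, so the naive join reports 5400 instead of 1800 for it, while the subselect version reports the true totals and keeps only P1.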

Related

SQL-MS Access - Expression not included on multiple left join query

I recently started the process of learning SQL and have hit my first wall. I have three tables in our database- Chart of Accounts (ChartAccts), Modified Transaction Detail (ModTD), and Beginning Trial Balance (TB_Beg). I am trying to create a query that shows all accounts and their names in the Chart of Accounts, the Beginning Balance for each account from the Trial Balance, and the amount column from the Modified Transaction Detail. They are all linked via the account numbers within each of the tables.
I am currently getting the error "Your query does not include the specified expression 'Account_Num' as part of an aggregate function." when attempting to run this code:
SELECT A.Account_Num, A.Account_Name, NZ(Sum(B.[Amount ]),0) AS [Sum Of Amount], C.Amount
FROM ((ChartAccts AS A)
LEFT JOIN ModTD AS B ON A.[Account_Num] = B.[Account (Line): Number ])
LEFT JOIN TB_Beg AS C ON A.[Account_Num] = C.[Account #];
I feel like my problem must have something to do with the ON statements, but I have been staring at this for so long that I don't think I am going to identify the issue despite how simple it must be. Any/all advice is appreciated!
You are missing the GROUP BY clause. When you use an SQL aggregate function (e.g., Max(), Min(), Sum(), ...), every field in the select list that is not inside an aggregate function must appear in the GROUP BY clause. Additionally, you cannot use the same field inside and outside the aggregate function unless you also group by it: you cannot aggregate and not aggregate at the same time.
What I think you want is the following (note that C.Amount is left out of the select list; to show it, add it to both the select list and the GROUP BY):
SELECT A.Account_Num, A.Account_Name, NZ(Sum(B.[Amount ]),0) AS [Sum Of Amount]
FROM
( ChartAccts AS A
LEFT JOIN
ModTD AS B
ON A.[Account_Num] = B.[Account (Line): Number ])
LEFT JOIN
TB_Beg AS C
ON A.[Account_Num] = C.[Account #]
GROUP BY A.Account_Num, A.Account_Name ;
You may want to check the Query "F_Select_w_group_by_aggreg" and related "F_Select_*" queries from the database of examples that you can download from LightningGuide.net.
Finally, be careful with the Nz() function, because when invoked from SQL it always returns a string. This can create problems if you use this fragment of code as part of a larger Query. You may want to enclose the Nz() function in a type-conversion function, or use an Iif() function instead.
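The GROUP BY rule can be tried outside Access with a small sketch in Python and SQLite. Everything here is an assumption made for illustration: the table and column names are simplified versions of the ones in the question, the rows are invented, and COALESCE stands in for the Access-only Nz() function (this says nothing about how Nz() itself behaves). The beginning balance is shown by adding it to the GROUP BY, which assumes one TB_Beg row per account.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, hypothetical versions of ChartAccts, ModTD and TB_Beg.
    CREATE TABLE chart_accts (account_num INTEGER, account_name TEXT);
    CREATE TABLE mod_td      (account_num INTEGER, amount INTEGER);
    CREATE TABLE tb_beg      (account_num INTEGER, amount INTEGER);
    INSERT INTO chart_accts VALUES (100, 'Cash'), (200, 'Rent'), (300, 'Misc');
    INSERT INTO mod_td      VALUES (100, 50), (100, 25), (200, 10);
    INSERT INTO tb_beg      VALUES (100, 500), (200, 0), (300, 75);
""")

rows = conn.execute("""
    SELECT a.account_num,
           a.account_name,
           COALESCE(SUM(b.amount), 0) AS sum_of_amount,   -- 0 when no transactions
           c.amount                   AS beginning_balance
    FROM chart_accts AS a
    LEFT JOIN mod_td AS b ON a.account_num = b.account_num
    LEFT JOIN tb_beg AS c ON a.account_num = c.account_num
    -- Every selected column that is not aggregated is listed here.
    GROUP BY a.account_num, a.account_name, c.amount
    ORDER BY a.account_num
""").fetchall()

print(rows)
```

Account 300 has no transactions, so SUM() yields NULL and COALESCE turns it into a numeric 0 rather than a string.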

Why are these queries not equivalent? (correlated subquery vs. group by)

Why are these two SQL queries not equivalent? One uses a correlated subquery, the other uses group by. The first produces a little over 51000 rows from my database, the second nearly 66000. In both cases, I am simply trying to return all the parts meeting the stated condition, current revision only. A comparison of the output files shows that method #1 (oracle_test1.txt) fails to return quite a few values. Based on that, I can only assume that method #2 is correct. I have some code that has used method #1 for a long time, but it appears I will have to change it. My reasoning concerning the correlated subquery was that, as the inner select compares the columns in the self join, it will find the max value of prev for all matches, then return that max prev value for use in the outer query. I designed that query long ago, before becoming familiar with the use of group by. Any insights would be appreciated.
Query #1
select pobj_name, prev
from pfmc_part
where pmodel in ('PN-DWG', 'NO-DWG') and pstatus = 'RELEASED'
and prev = (select max(prev) from pfmc_part a where a.pobj_name = pfmc_part.pobj_name)
order by pobj_name, prev
Query #2
select pobj_name, max(prev) prev
from pfmc_part
where pmodel in ('PN-DWG', 'NO-DWG') and pstatus = 'RELEASED'
group by pobj_name
order by pobj_name, prev
Sample output:
Query #2      Query #1
P538512 B     P538512 B
P538513 A     P538513 A
P538514 C     P538514 C
P538520 B
P538522 B     P538522 B
P538525 A     P538525 A
P538531 C     P538531 C
P538533 A     P538533 A
P538538 B
P538541 B
P538542 B
P538553 A     P538553 A
P538569 A     P538569 A
Query 1 computes the max prev for each pobj_name over all rows (the subquery has no filter) and then keeps only those rows that also match the pmodel and pstatus conditions in your where clause.
Query 2 first selects the rows that match the where clause and then takes the max prev among just those rows.
So when the row with the highest prev for a part does not satisfy your where clause, query 2 still returns the highest revision that does, while query 1 returns nothing for that part.
There are two differences and the rest of the answers focus on one. The "easy" difference is that the max() in the group by is affected by the filter clause. The max() in the other query has no filter, and so it might return no rows (when max(prev) is on a row otherwise filtered out by the where conditions).
In addition, the where version of the query might return duplicate rows when there are multiple rows with the same value of max(prev) for a given pobj_name. The group by will never return duplicate rows.
this query
select pobj_name, prev
from pfmc_part
where pmodel in ('PN-DWG', 'NO-DWG') and pstatus = 'RELEASED'
and prev = (select max(prev) from pfmc_part a where a.pobj_name = pfmc_part.pobj_name)
order by pobj_name, prev
has an extra condition in its where clause that makes it return fewer rows: only rows where prev equals the result of the subquery are kept. That "and prev = (...)" condition is what makes it entirely different from the group by query, which has no such filter and instead computes max(prev) in the select list.
If you wanted the two to be more similar, you'd need to modify it like so, joining each row to the max prev for its pobj_name instead of filtering on it:
select p.pobj_name, p.prev, maxes.max_prev
from pfmc_part p
join (select pobj_name, max(prev) as max_prev from pfmc_part group by pobj_name) maxes
on maxes.pobj_name = p.pobj_name
where p.pmodel in ('PN-DWG', 'NO-DWG') and p.pstatus = 'RELEASED'
order by p.pobj_name, p.prev
In query 1 you are selecting ONLY the rows whose prev field equals the max(prev) over all rows for that pobj_name; in query 2 you are selecting one row per pobj_name ALONG WITH the max(prev) of the rows that meet the conditions in the where clause.
Basically, query 1 and query 2 have completely different where clauses. Hope this explains the missing records from query 1.
Your query #1 will certainly fail to return a row for a given pobj_name where the row holding the maximum prev for that name does not itself satisfy the pmodel and pstatus conditions. That could happen, for example, if the newest revision has a status other than RELEASED.
Your Query #2 does not suffer Query #1's limitation, and it may perform better on account of avoiding a correlated subquery. It would be inappropriate, however, if you wanted more data than just pobj_name and aggregate functions of the groups. And by the way, there's no point in including prev in the ORDER BY clause, since pobj_name will already be unique to each result row.
Overall, if the two queries happen to return similar results then that is a matter of the details of the data, not of the queries. They arrive at their results completely differently.
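The difference the answers describe can be reproduced on a handful of rows. The sketch below uses Python's sqlite3 module with an invented pfmc_part table; the rows and the 'WORKING' status are assumptions made for illustration. The third query is one possible way to bring query #1 in line with query #2, by repeating the filter inside the subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pfmc_part (pobj_name TEXT, pmodel TEXT, pstatus TEXT, prev TEXT);
    INSERT INTO pfmc_part VALUES
        ('P1', 'PN-DWG', 'RELEASED', 'A'),
        ('P1', 'PN-DWG', 'RELEASED', 'B'),
        ('P2', 'PN-DWG', 'RELEASED', 'A'),
        ('P2', 'PN-DWG', 'WORKING',  'B'),   -- newest revision is not released
        ('P3', 'NO-DWG', 'RELEASED', 'C');
""")

# Query #1: the subquery takes max(prev) over ALL rows of the part.
q1 = conn.execute("""
    SELECT pobj_name, prev
    FROM pfmc_part
    WHERE pmodel IN ('PN-DWG', 'NO-DWG') AND pstatus = 'RELEASED'
      AND prev = (SELECT MAX(prev) FROM pfmc_part a
                  WHERE a.pobj_name = pfmc_part.pobj_name)
    ORDER BY pobj_name, prev
""").fetchall()

# Query #2: max(prev) is taken only over rows that pass the filter.
q2 = conn.execute("""
    SELECT pobj_name, MAX(prev) AS prev
    FROM pfmc_part
    WHERE pmodel IN ('PN-DWG', 'NO-DWG') AND pstatus = 'RELEASED'
    GROUP BY pobj_name
    ORDER BY pobj_name
""").fetchall()

# Query #1 with the same filter repeated inside the subquery.
q1_filtered = conn.execute("""
    SELECT pobj_name, prev
    FROM pfmc_part
    WHERE pmodel IN ('PN-DWG', 'NO-DWG') AND pstatus = 'RELEASED'
      AND prev = (SELECT MAX(prev) FROM pfmc_part a
                  WHERE a.pobj_name = pfmc_part.pobj_name
                    AND a.pmodel IN ('PN-DWG', 'NO-DWG') AND a.pstatus = 'RELEASED')
    ORDER BY pobj_name, prev
""").fetchall()

print(q1, q2, q1_filtered)
```

Part P2 is missing from query #1 because its highest revision, B, is not released; query #2 and the filtered variant both fall back to revision A.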

Oracle SQL - Comparing AVG functions in WHERE

I'm trying to write a few Oracle SQL scripts for an assignment. I've managed to get all of it to work, except for one part. To summarize, I have to display data from 2 tables if the average of 1 column in table A is greater than the average of another column in table B. I realize you cannot include AVG functions in a WHERE clause or HAVING clause since it seems unable to properly access the data (from what I've read). When I exclude this clause, the script executes properly, so I'm confident there are no other errors.
I've tried writing it as follows but the error I get is ORA-00936: missing expression and it is just before the > sign. I thought this may be due to improper bracket placing but none of my attempts resolved this. Here is my attempt:
SELECT l.l_category, SUM(r.r_sold), AVG(l.l_cost)
FROM promos l
INNER JOIN sales r
ON r.promo_id = l.promo_id
GROUP BY l.l_category
HAVING (SELECT AVG(l.l_cost) OVER (PARTITION BY l.l_cost)) >
(SELECT AVG(r.r_sold) OVER (PARTITION BY r.r_sold));
I tried doing this without the OVER (PARTITION BY ...) as well as putting it into a WHERE clause but it didn't resolve the error. I'm pretty sure I need to put it into a SELECT statement somehow but I'm at a loss.
You do not need to use the OVER clause when applying the aggregate functions in the HAVING clause. Just use the aggregate functions on their own.
SELECT l.l_category, SUM(r.r_sold), AVG(l.l_cost)
FROM promos l
INNER JOIN sales r
ON r.promo_id = l.promo_id
GROUP BY l.l_category
HAVING AVG(l.l_cost) > AVG(r.r_sold);
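A quick way to convince yourself that plain aggregates work in HAVING is to run the query on a few rows. This sketch uses SQLite from Python rather than Oracle, with invented rows for promos and sales, so treat it as an illustration of the clause and not as a check of Oracle-specific behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE promos (promo_id INTEGER, l_category TEXT, l_cost INTEGER);
    CREATE TABLE sales  (promo_id INTEGER, r_sold INTEGER);
    INSERT INTO promos VALUES (1, 'toys', 10), (2, 'toys', 20), (3, 'books', 2);
    INSERT INTO sales  VALUES (1, 5), (2, 7), (3, 30), (3, 40);
""")

# Both averages are computed per l_category group and compared directly;
# no OVER (PARTITION BY ...) and no subquery is needed.
rows = conn.execute("""
    SELECT l.l_category, SUM(r.r_sold), AVG(l.l_cost)
    FROM promos l
    INNER JOIN sales r ON r.promo_id = l.promo_id
    GROUP BY l.l_category
    HAVING AVG(l.l_cost) > AVG(r.r_sold)
""").fetchall()

print(rows)
```

For 'toys' the average cost (15) exceeds the average sold (6), so the group is kept; for 'books' the average cost (2) is below the average sold (35), so it is dropped.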

Oracle Group by issue

I have the below query. The problem is that the last column, productdescr, is returning two records and the query fails because of the DISTINCT. Now I need to add one more column to the WHERE clause of that select so that it returns one record. The issue is that the column I need to add should not be a part of the GROUP BY clause.
SELECT product_billing_id,
billing_ele,
SUM(round(summary_net_amt_excl_gst/100)) gross,
(SELECT DISTINCT description
FROM RES.tariff_nt
WHERE product_billing_id = aa.product_billing_id
AND billing_ele = aa.billing_ele) productdescr
FROM bil.bill_sum aa
WHERE file_id = 38613 --1=1
AND line_type = 'D'
AND (product_billing_id, billing_ele) IN (SELECT DISTINCT
product_billing_id,
billing_ele
FROM bil.bill_l2 )
AND trans_type_desc <> 'Change'
GROUP BY product_billing_id, billing_ele
I want to modify the select statement as shown below, adding a new filter to the WHERE clause so that it returns one record.
(SELECT DISTINCT description
FROM RES.tariff_nt
WHERE product_billing_id = aa.product_billing_id
AND billing_ele = aa.billing_ele
AND (rate_structure_start_date <= TO_DATE(aa.p_effective_date,'yyyymmdd')
AND rate_structure_end_date > TO_DATE(aa.p_effective_date,'yyyymmdd'))
) productdescr
The aa.p_effective_date should not be a part of GROUP BY clause. How can I do it? Oracle is the Database.
So there are multiple RES.tariff_nt records for a given product_billing_id/billing_ele, differentiated by the start/end dates.
You want the description for the record whose date range encompasses the p_effective_date from bil.bill_sum. The kicker is that you can't (or don't want to) include that column in the GROUP BY. That suggests you've got multiple rows in bil.bill_sum with different effective dates.
The question is what you want to happen when you are summarising those multiple rows with different dates: which of those dates do you want to use to look up the description?
If it doesn't matter, simply use MIN(aa.p_effective_date), or MAX.
Have you looked into the Oracle analytic functions? This is a good link: Analytical Functions by Example
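One way to picture the MIN/MAX suggestion is sketched below in Python with SQLite. It is a variation, not the asker's query: the aggregation is done first in a derived table that also picks one effective date per group, and the description is then looked up with a date-range join instead of a scalar subquery. Table names are shortened, dates are kept as 'yyyymmdd' text, and all rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified stand-ins for bil.bill_sum and RES.tariff_nt.
    CREATE TABLE bill_sum (product_billing_id TEXT, billing_ele TEXT,
                           amount INTEGER, p_effective_date TEXT);
    CREATE TABLE tariff_nt (product_billing_id TEXT, billing_ele TEXT, description TEXT,
                            rate_structure_start_date TEXT, rate_structure_end_date TEXT);
    INSERT INTO bill_sum VALUES ('PB1', 'E1', 10000, '20200115'),
                                ('PB1', 'E1',  5000, '20200310');
    INSERT INTO tariff_nt VALUES ('PB1', 'E1', 'Old plan', '20190101', '20200201'),
                                 ('PB1', 'E1', 'New plan', '20200201', '20991231');
""")

def description_for(aggregate):
    """Sum per group, pick one effective date with MIN or MAX, then look up the tariff."""
    if aggregate not in ("MIN", "MAX"):  # only these two names are spliced into the SQL
        raise ValueError("aggregate must be MIN or MAX")
    sql = f"""
        SELECT s.product_billing_id, s.billing_ele, s.gross, t.description
        FROM (SELECT product_billing_id, billing_ele,
                     SUM(amount) AS gross,
                     {aggregate}(p_effective_date) AS eff_date
              FROM bill_sum
              GROUP BY product_billing_id, billing_ele) AS s
        LEFT JOIN tariff_nt AS t
               ON t.product_billing_id = s.product_billing_id
              AND t.billing_ele = s.billing_ele
              AND t.rate_structure_start_date <= s.eff_date
              AND t.rate_structure_end_date > s.eff_date
    """
    return conn.execute(sql).fetchall()

latest = description_for("MAX")
earliest = description_for("MIN")
print(latest, earliest)
```

The two calls return the same total but different descriptions, which is the decision the answer is asking about: the effective date never appears in the GROUP BY, yet one date per group still has to be chosen.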

Group by SQL statement

So I got this statement, which works fine:
SELECT MAX(patient_history_date_bio) AS med_date, medication_name
FROM biological
WHERE (patient_id = 12)
GROUP BY medication_name
But, I would like to have the corresponding medication_dose also. So I type this up
SELECT MAX(patient_history_date_bio) AS med_date, medication_name, medication_dose
FROM biological
WHERE (patient_id = 12)
GROUP BY medication_name
But it gives me an error saying:
"Column 'biological.medication_dose' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause."
So I try adding medication_dose to the GROUP BY clause, but then it gives me extra rows that I don't want.
I would like to get the latest row for each medication in my table. (The latest row is determined by the max function, getting the latest date).
How do I fix this problem?
Use:
SELECT b.medication_name,
b.patient_history_date_bio AS med_date,
b.medication_dose
FROM BIOLOGICAL b
JOIN (SELECT y.medication_name,
MAX(y.patient_history_date_bio) AS max_date
FROM BIOLOGICAL y
WHERE y.patient_id = ? -- same patient as the outer query
GROUP BY y.medication_name) x ON x.medication_name = b.medication_name
AND x.max_date = b.patient_history_date_bio
WHERE b.patient_id = ?
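The join-to-latest-date pattern can be tried on a few rows with Python's sqlite3 module. The rows below are invented, and the sketch filters on the patient inside the derived table as well as outside it, on the assumption that the table holds several patients; the second query shows what can happen when only the outer filter is present.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE biological (patient_id INTEGER, medication_name TEXT,
                             medication_dose INTEGER, patient_history_date_bio TEXT);
    INSERT INTO biological VALUES
        (12, 'aspirin',   100, '2020-01-01'),
        (12, 'aspirin',    50, '2020-06-01'),
        (12, 'ibuprofen', 200, '2020-03-01'),
        (99, 'aspirin',    75, '2021-01-01');  -- another patient, later date
""")

# Latest row per medication for one patient: the derived table finds the
# latest date per medication, the join pulls in the dose from that same row.
latest = conn.execute("""
    SELECT b.medication_name, b.patient_history_date_bio AS med_date, b.medication_dose
    FROM biological b
    JOIN (SELECT y.medication_name, MAX(y.patient_history_date_bio) AS max_date
          FROM biological y
          WHERE y.patient_id = :pid
          GROUP BY y.medication_name) x
      ON x.medication_name = b.medication_name
     AND x.max_date = b.patient_history_date_bio
    WHERE b.patient_id = :pid
    ORDER BY b.medication_name
""", {"pid": 12}).fetchall()

# Same query without the inner patient filter: patient 99's later aspirin
# date becomes the maximum, so patient 12's aspirin row no longer matches.
unfiltered = conn.execute("""
    SELECT b.medication_name, b.patient_history_date_bio AS med_date, b.medication_dose
    FROM biological b
    JOIN (SELECT y.medication_name, MAX(y.patient_history_date_bio) AS max_date
          FROM biological y
          GROUP BY y.medication_name) x
      ON x.medication_name = b.medication_name
     AND x.max_date = b.patient_history_date_bio
    WHERE b.patient_id = :pid
    ORDER BY b.medication_name
""", {"pid": 12}).fetchall()

print(latest, unfiltered)
```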
If you really have to, as one quick workaround, you can apply an aggregate function to your medication_dose such as MAX(medication_dose).
However, note that this is normally an indication that you are either building the query incorrectly or need to refactor/normalize your database schema. In your case, it looks like you are tackling the query incorrectly. The correct approach should be the one suggested by OMG Ponies in another answer.
You may be interested in checking out the following interesting article which describes the reasons behind this error:
But WHY Must That Column Be Contained in an Aggregate Function or the GROUP BY clause?
You need to put max(medication_dose) in your select. Group by returns a result set that contains distinct values for fields in your group by clause, so apparently you have multiple records that have the same medication_name, but different doses, so you are getting two results.
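By putting in max(medication_dose) it will return the maximum dose value for each medication_name. You can use any aggregate function on dose (max, min, avg, sum, etc.). Note that the result is the largest dose ever recorded for that medication, which is not necessarily the dose recorded on the latest date.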
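A short sketch of what the aggregate workaround actually returns, again using Python's sqlite3 module with invented rows: the two MAX() calls are evaluated independently, so the date and the dose in a result row may come from different source rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE biological (patient_id INTEGER, medication_name TEXT,
                             medication_dose INTEGER, patient_history_date_bio TEXT);
    INSERT INTO biological VALUES
        (12, 'aspirin',   100, '2020-01-01'),   -- older, higher dose
        (12, 'aspirin',    50, '2020-06-01'),   -- latest, lower dose
        (12, 'ibuprofen', 200, '2020-03-01');
""")

# Each MAX() is computed on its own column within the group.
rows = conn.execute("""
    SELECT medication_name,
           MAX(patient_history_date_bio) AS med_date,
           MAX(medication_dose)          AS medication_dose
    FROM biological
    WHERE patient_id = 12
    GROUP BY medication_name
    ORDER BY medication_name
""").fetchall()

print(rows)
```

For aspirin the query pairs the latest date (2020-06-01) with the dose of 100 from the older row, a combination that never existed in the table, which is why the workaround only fits when any dose per medication will do.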