Access SQL GROUP BY problem (e.g. tbl_Produktion.ID not part of the aggregate function)

I want to group by two columns, however MS Access won't let me do it.
Here is the code I wrote:
SELECT
tbl_Produktion.Datum, tbl_Produktion.Schichtleiter,
tbl_Produktion.ProduktionsID, tbl_Produktion.Linie,
tbl_Produktion.Schicht, tbl_Produktion.Anzahl_Schichten_P,
tbl_Produktion.Schichtteam, tbl_Produktion.Von, tbl_Produktion.Bis,
tbl_Produktion.Pause, tbl_Produktion.Kunde, tbl_Produktion.TeileNr,
tbl_Produktion.FormNr, tbl_Produktion.LabyNr,
SUM(tbl_Produktion.Stueckzahl_Prod),
tbl_Produktion.Stueckzahl_Ausschuss, tbl_Produktion.Ausschussgrund,
tbl_Produktion.Kommentar, tbl_Produktion.StvSchichtleiter,
tbl_Produktion.Von2, tbl_Produktion.Bis2, tbl_Produktion.Pause2,
tbl_Produktion.Arbeiter3, tbl_Produktion.Von3, tbl_Produktion.Bis3,
tbl_Produktion.Pause3, tbl_Produktion.Arbeiter4,
tbl_Produktion.Von4, tbl_Produktion.Bis4, tbl_Produktion.Pause4,
tbl_Produktion.Leiharbeiter5, tbl_Produktion.Von5,
tbl_Produktion.Bis5, tbl_Produktion.Pause5,
tbl_Produktion.Leiharbeiter6, tbl_Produktion.Von6,
tbl_Produktion.Bis6, tbl_Produktion.Pause6, tbl_Produktion.Muster
FROM
tbl_Personal
INNER JOIN
tbl_Produktion ON tbl_Personal.PersID = tbl_Produktion.Schichtleiter
GROUP BY
tbl_Produktion.Datum, tbl_Produktion.Schichtleiter;
It works when I group by all the columns, but not like this.
The error message says that the rest of the columns aren't part of an aggregate function (translated from German to English as best I could).
PS: I also need the sum of "tbl_Produktion.Stueckzahl_Prod", so I tried using the SUM function (I couldn't test it yet).

Have you tried something along these lines?
SELECT
tbl_Produktion.Datum, tbl_Produktion.Schichtleiter,
MAX(tbl_Produktion.ProduktionsID), MAX(tbl_Produktion.Linie),
MAX(tbl_Produktion.Schicht), MAX(tbl_Produktion.Anzahl_Schichten_P),
MAX(tbl_Produktion.Schichtteam), MAX(tbl_Produktion.Von), MAX(tbl_Produktion.Bis),
SUM(tbl_Produktion.Stueckzahl_Prod)
FROM
tbl_Personal
INNER JOIN
tbl_Produktion ON tbl_Personal.PersID = tbl_Produktion.Schichtleiter
GROUP BY
tbl_Produktion.Datum, tbl_Produktion.Schichtleiter;
I have used the MAX function for all the columns except the two you specify in the GROUP BY and the one where you want the SUM. I took the liberty of leaving out much of your data just to get started.
Using the MAX function turns out to be a convenient workaround when the data item is known to be unique within each group. We cannot know your data or your intent, so we cannot tell you whether MAX will yield the results you need.

If you use an aggregate function in the SELECT clause, you must group by every selected column that is not itself an aggregate. If you don't want to do that for some reason (perhaps it changes the output of the aggregate in a way that you don't intend), you must either pick an aggregate to apply to the remaining columns (Average? Max? Min?) or run two selects, one for the aggregate and one for the non-aggregated columns. But then you have to decide how to match the non-aggregated fields to the aggregate in a way that makes sense (or show them all in a table, I suppose?).
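To make the rule concrete, here is a minimal sketch using Python's built-in sqlite3 as a stand-in for Access (the table is cut down to four columns and the sample rows are invented, so treat it as an illustration only). Every selected column is either in the GROUP BY or wrapped in an aggregate:

```python
import sqlite3

# In-memory SQLite database as a stand-in for the Access table.
# Only four of the original columns are kept; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_Produktion ("
    "Datum TEXT, Schichtleiter INTEGER, Linie TEXT, Stueckzahl_Prod INTEGER)"
)
conn.executemany(
    "INSERT INTO tbl_Produktion VALUES (?, ?, ?, ?)",
    [
        ("2015-03-01", 1, "A", 100),
        ("2015-03-01", 1, "A", 150),
        ("2015-03-01", 2, "B", 80),
        ("2015-03-02", 1, "A", 120),
    ],
)

# Datum and Schichtleiter are grouped; every other column is aggregated.
# MAX(Linie) is only meaningful if Linie is constant within each group.
rows = conn.execute(
    """
    SELECT Datum, Schichtleiter,
           MAX(Linie) AS Linie,
           SUM(Stueckzahl_Prod) AS Summe_Prod
    FROM tbl_Produktion
    GROUP BY Datum, Schichtleiter
    ORDER BY Datum, Schichtleiter
    """
).fetchall()

for row in rows:
    print(row)
```

Note that SQLite is more lenient than Access about bare columns in a grouped query, so this sketch shows the shape of a valid query rather than reproducing the Access error.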

Related

SQL grouping with "Invalid use of group" error

I'll be upfront, this is a homework question, but I've been stuck on this one for hours and I just was looking for a push in the right direction. First I'll give you the relations and the hw question for background, then I'll explain my question:
Branch (BookCode, BranchNum, OnHand)
HW problem: List the BranchNum for all branches that have at least one book that has at least 10 copies on hand.
My question: I understand that I must take the SUM(OnHand) grouped by BookCode, but how do I then take that and group it by BranchNum? This is logically what I come up with and various versions:
select distinct BranchNum
from Inventory
where sum(OnHand) >= 10
group by BookCode;
but I keep getting an error that says "Invalid use of group function."
Could someone please explain what is wrong here?
UPDATE:
I understand now, I had to use the HAVING statement, the basic form is this:
select distinct (what you want to display)
from (table)
group by
having
Try this one.
SELECT BranchNum
FROM Inventory
GROUP BY BranchNum
HAVING SUM(OnHand) >= 10
Although all the comments on the question seem valid and add information, they all seem to miss why your query is not working. The reason is simple and comes down to the phase at which the sum is calculated.
The WHERE clause is evaluated first: it filters individual rows before any grouping happens. Then GROUP BY takes effect, merging rows that share the same values in the grouped columns and applying the aggregate functions (if any) to each group.
So if you put an aggregate function in the WHERE clause, you are trying to aggregate before the rows have been filtered, let alone grouped. The HAVING clause is evaluated after the GROUP BY and lets you filter on the aggregates, as they have already been calculated.
That's why you can write HAVING SUM(OnHand) >= 10 and you can't write WHERE SUM(OnHand) >= 10.
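A small sketch of that evaluation order, using Python's sqlite3 with an invented Inventory table (the rows are assumptions, and the exact error text differs between database engines). It also shows a plain row-level WHERE for comparison, since which query fits the homework wording depends on whether each row already holds one book's count at one branch:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Inventory (BookCode TEXT, BranchNum INTEGER, OnHand INTEGER)"
)
# Made-up sample rows: one row per (book, branch) pair.
conn.executemany(
    "INSERT INTO Inventory VALUES (?, ?, ?)",
    [("B1", 1, 6), ("B2", 1, 5), ("B1", 2, 12), ("B3", 3, 4)],
)

# An aggregate in WHERE is rejected: no groups exist yet at that stage.
try:
    conn.execute(
        "SELECT BranchNum FROM Inventory "
        "WHERE SUM(OnHand) >= 10 GROUP BY BranchNum"
    )
    where_failed = False
except sqlite3.OperationalError as exc:
    where_failed = True
    print("rejected:", exc)

# HAVING filters after grouping: branches whose total stock is >= 10.
having_rows = conn.execute(
    "SELECT BranchNum FROM Inventory "
    "GROUP BY BranchNum HAVING SUM(OnHand) >= 10 ORDER BY BranchNum"
).fetchall()

# Row-level WHERE: branches holding a single book with >= 10 copies.
per_row = conn.execute(
    "SELECT DISTINCT BranchNum FROM Inventory "
    "WHERE OnHand >= 10 ORDER BY BranchNum"
).fetchall()

print(having_rows, per_row)
```

With this data, branch 1 passes the HAVING test only because its two books add up to 11, while the row-level WHERE keeps only branch 2.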
Hope this helps!

SQL MIN() returns multiple values?

I am using SQL Server 2005, querying with Web Developer 2010, and the MIN function appears to be returning more than one value for each ID returned (see below). Ideally I would like it to return just one row for each ID.
SELECT Production.WorksOrderOperations.WorksOrderNumber,
MIN(Production.WorksOrderOperations.OperationNumber) AS Expr1,
Production.Resources.ResourceCode,
Production.Resources.ResourceDescription,
Production.WorksOrderExcel_ExcelExport_View.PartNumber,
Production.WorksOrderOperations.PlannedQuantity,
Production.WorksOrderOperations.PlannedSetTime,
Production.WorksOrderOperations.PlannedRunTime
FROM Production.WorksOrderOperations
INNER JOIN Production.Resources
ON Production.WorksOrderOperations.ResourceID = Production.Resources.ResourceID
INNER JOIN Production.WorksOrderExcel_ExcelExport_View
ON Production.WorksOrderOperations.WorksOrderNumber = Production.WorksOrderExcel_ExcelExport_View.WorksOrderNumber
WHERE Production.WorksOrderOperations.WorksOrderNumber IN
( SELECT WorksOrderNumber
FROM Production.WorksOrderExcel_ExcelExport_View AS WorksOrderExcel_ExcelExport_View_1
WHERE (WorksOrderSuffixStatus = 'Proposed'))
AND Production.Resources.ResourceCode IN ('1303', '1604')
GROUP BY Production.WorksOrderOperations.WorksOrderNumber,
Production.Resources.ResourceCode,
Production.Resources.ResourceDescription,
Production.WorksOrderExcel_ExcelExport_View.PartNumber,
Production.WorksOrderOperations.PlannedQuantity,
Production.WorksOrderOperations.PlannedSetTime,
Production.WorksOrderOperations.PlannedRunTime
If you can get your head around it, I am selecting certain columns from multiple tables where the WorksOrderNumber is also contained within a subquery, and numerous other conditions.
Result set looks a little like this, have blurred out irrelevant data.
http://i.stack.imgur.com/5UFIp.png (Wouldn't let me embed image).
The highlighted rows are NOT supposed to be there, I cannot explicitly filter them out, as this result set will be updated daily and it is likely to happen with a different record.
I have tried casting and converting the OperationNumber to numerous other data types, varchar type returns '100' instead of the '30'. Also tried searching search engines, no one seems to have the same problem.
I did not structure the tables (they're horribly normalised), and it is not possible to restructure them.
Any ideas appreciated, many thanks.
The MIN function returns the minimum within the group.
If you want the minimum for each ID, you need to group on just the ID.
I assume that by "ID" you are referring to Production.WorksOrderOperations.WorksOrderNumber.
You can add this as a "table" in your SQL:
(SELECT Production.WorksOrderOperations.WorksOrderNumber,
MIN(Production.WorksOrderOperations.OperationNumber)
FROM Production.WorksOrderOperations
GROUP BY Production.WorksOrderOperations.WorksOrderNumber)
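As a rough sketch of how that derived table can be joined back to pick up the other columns, here is a cut-down version in Python's sqlite3. The single flattened table, the sample rows, and the aliases FirstOp and MinOperationNumber are all assumptions made for the illustration; the real schema has more tables and columns:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in: the resource code is stored directly on the
# operation row instead of being joined in from a Resources table.
conn.execute(
    "CREATE TABLE WorksOrderOperations ("
    "WorksOrderNumber TEXT, OperationNumber INTEGER, ResourceCode TEXT)"
)
conn.executemany(
    "INSERT INTO WorksOrderOperations VALUES (?, ?, ?)",
    [
        ("WO1", 10, "1303"),
        ("WO1", 30, "1604"),
        ("WO2", 30, "1604"),
        ("WO2", 100, "1303"),
    ],
)

# Grouping by extra columns yields one MIN per distinct combination,
# which is why the original query returns several rows per order.
grouped_wide = conn.execute(
    """
    SELECT WorksOrderNumber, ResourceCode, MIN(OperationNumber)
    FROM WorksOrderOperations
    GROUP BY WorksOrderNumber, ResourceCode
    ORDER BY WorksOrderNumber, ResourceCode
    """
).fetchall()

# Compute the minimum per order first, then join back to fetch the
# remaining columns of exactly that operation row.
first_ops = conn.execute(
    """
    SELECT o.WorksOrderNumber, o.OperationNumber, o.ResourceCode
    FROM WorksOrderOperations AS o
    INNER JOIN (SELECT WorksOrderNumber,
                       MIN(OperationNumber) AS MinOperationNumber
                FROM WorksOrderOperations
                GROUP BY WorksOrderNumber) AS FirstOp
            ON FirstOp.WorksOrderNumber = o.WorksOrderNumber
           AND FirstOp.MinOperationNumber = o.OperationNumber
    ORDER BY o.WorksOrderNumber
    """
).fetchall()

print(len(grouped_wide), first_ops)
```

The wide grouping returns four rows here, while the join returns one row per works order.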

Sum of distinct values in field SSRS 2005

I'm working on SSRS report builder that is using a dataset calling a SQL Server 2000 database.
The query is getting sums of a few different fields and is also pulling out all records that have to do with that client number. I want to get the sum of the sum but it is way over because of the detail rows. Basically what I want is the sum of the distinct sum column values.
=Sum(Fields!tot.Value, "table1_Group3")
I saw that you can get sums by the groups and I tried the expression above but it comes back with an error:
The Value expression for the textbox 'tot' has a scope parameter that is not
valid for an aggregate function...
table1_Group3 is the name of the group that holds the sum value in the report.
Any suggestions on how to get the distinct values and sum them in this report?
=Sum(Fields!tot.Value, "table1_Group3")
The code above will give you the sum of "tot" for all rows in the current "table1_Group3." This means that this expression only makes sense somewhere within table1_Group3. Otherwise, SSRS doesn't know which is the current instance of that group.
Sounds like you would like to sum this value across multiple groups, but only take one "tot" from each instance of the group. (Are you sure that all rows in that group will have the same "Tot?")
If tot is the total of other fields in your returned data, then simply add those up in your formula. This may have the added benefit of simplifying your SQL query as well.
Some other options that could work:
- Change your SQL query so that only one row per group gets the Tot field set.
- Use Embedded code in the report to keep a running total which is added to only once per group, such as in the group header.
(If upgrading to 2008R2 SSRS is an option, then the Lookup function could be used here, maybe even to look back at the same dataset.)
Change the query/dataset to use SUM(DISTINCT tot), using the temp table on the SQL Server.
I suppose you need to write SUM(DISTINCT columnName).
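One caveat with SUM(DISTINCT ...) is worth checking against your data: it drops repeated values, not repeated groups, so two groups that happen to share the same total are counted once. A minimal sketch in Python's sqlite3, with an invented detail table (the names detail, grp and tot and the numbers are assumptions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Each detail row repeats the total (tot) of the group it belongs to.
conn.execute("CREATE TABLE detail (grp TEXT, tot INTEGER)")
conn.executemany(
    "INSERT INTO detail VALUES (?, ?)",
    [
        ("G1", 50), ("G1", 50), ("G1", 50),  # three detail rows, total 50
        ("G2", 50), ("G2", 50),              # a different group, also 50
        ("G3", 20),
    ],
)

# Plain SUM counts the group total once per detail row.
plain_sum = conn.execute("SELECT SUM(tot) FROM detail").fetchone()[0]

# SUM(DISTINCT) collapses equal values, even across different groups.
distinct_sum = conn.execute(
    "SELECT SUM(DISTINCT tot) FROM detail"
).fetchone()[0]

# Taking one value per group first, then summing, keeps both 50s.
per_group_sum = conn.execute(
    """
    SELECT SUM(tot)
    FROM (SELECT grp, MAX(tot) AS tot FROM detail GROUP BY grp) AS g
    """
).fetchone()[0]

print(plain_sum, distinct_sum, per_group_sum)
```

If equal totals across groups cannot occur in your data, SUM(DISTINCT tot) and the one-value-per-group query give the same answer.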

SQL statement HAVING MAX(some+thing)=some+thing

I'm having trouble with Microsoft Access 2003, it's complaining about this statement:
select cardnr
from change
where year(date)<2009
group by cardnr
having max(time+date) = (time+date) and cardto='VIP'
What I want to do is, for every distinct cardnr in the table change, to find the row with the latest (time+date) that is before year 2009, and then just select the rows with cardto='VIP'.
This validator says it's OK, Access says it's not OK.
This is the message I get: "you tried to execute a query that does not include the specified expression 'max(time+date)=time+date and cardto='VIP' and cardnr=' as part of an aggregate function."
Could someone please explain what I'm doing wrong and the right way to do it? Thanks
Note: The field and table names are translated and do not collide with any reserved words, I have no trouble with the names.
Try to think of it like this - HAVING is applied after the aggregation is done.
Therefore it cannot compare against unaggregated expressions (neither time+date nor cardto).
However, to get the last time and date (the principle is the same for getting rows related to other aggregate functions as well) you can do something like:
SELECT cardnr
FROM change main
WHERE time+date IN (SELECT MAX(time+date)
FROM change sub
WHERE sub.cardnr = main.cardnr AND
year(date)<2009
AND cardto='VIP')
(assuming that date part on your time field is the same for all the records; having two fields for date/time is not in your best interest and also using reserved words for field names can backfire in certain cases)
It works because the subquery is filtered only on the records that you are interested in from the outer query.
Applying the same year(date)<2009 and cardto='VIP' conditions to the outer query can improve performance further.
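Where the cardto='VIP' test sits changes what the query answers, so it is worth trying both placements on sample data. The sketch below uses Python's sqlite3 and makes several assumptions: the table is called card_change, the separate date and time fields are merged into one ISO-formatted text column changed_at, and the rows are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE card_change (cardnr INTEGER, changed_at TEXT, cardto TEXT)"
)
conn.executemany(
    "INSERT INTO card_change VALUES (?, ?, ?)",
    [
        (1, "2008-05-01 10:00", "VIP"),
        (1, "2008-11-01 09:00", "STD"),  # latest change before 2009
        (1, "2009-02-01 09:00", "VIP"),
        (2, "2007-01-01 08:00", "STD"),
        (2, "2008-12-31 23:00", "VIP"),  # latest change before 2009
        (3, "2009-03-01 12:00", "VIP"),  # nothing before 2009
    ],
)

# VIP test in the outer query: "the latest pre-2009 change is a VIP one".
latest_is_vip = conn.execute(
    """
    SELECT main.cardnr
    FROM card_change AS main
    WHERE main.cardto = 'VIP'
      AND main.changed_at = (SELECT MAX(sub.changed_at)
                             FROM card_change AS sub
                             WHERE sub.cardnr = main.cardnr
                               AND sub.changed_at < '2009-01-01')
    ORDER BY main.cardnr
    """
).fetchall()

# VIP test inside the subquery: "there is some pre-2009 VIP change".
any_vip_change = conn.execute(
    """
    SELECT main.cardnr
    FROM card_change AS main
    WHERE main.changed_at = (SELECT MAX(sub.changed_at)
                             FROM card_change AS sub
                             WHERE sub.cardnr = main.cardnr
                               AND sub.changed_at < '2009-01-01'
                               AND sub.cardto = 'VIP')
    ORDER BY main.cardnr
    """
).fetchall()

print(latest_is_vip, any_vip_change)
```

Card 1 was switched away from VIP in November 2008, so only the first query leaves it out.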

Group by SQL statement

So I got this statement, which works fine:
SELECT MAX(patient_history_date_bio) AS med_date, medication_name
FROM biological
WHERE (patient_id = 12)
GROUP BY medication_name
But, I would like to have the corresponding medication_dose also. So I type this up
SELECT MAX(patient_history_date_bio) AS med_date, medication_name, medication_dose
FROM biological
WHERE (patient_id = 12)
GROUP BY medication_name
But it gives me an error saying:
"Column 'biological.medication_dose' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause."
So I try adding medication_dose to the GROUP BY clause, but then it gives me extra rows that I don't want.
I would like to get the latest row for each medication in my table. (The latest row is determined by the max function, getting the latest date).
How do I fix this problem?
Use:
SELECT b.medication_name,
b.patient_history_date_bio AS med_date,
b.medication_dose
FROM BIOLOGICAL b
JOIN (SELECT y.medication_name,
MAX(y.patient_history_date_bio) AS max_date
FROM BIOLOGICAL y
WHERE y.patient_id = ?
GROUP BY y.medication_name) x ON x.medication_name = b.medication_name
AND x.max_date = b.patient_history_date_bio
WHERE b.patient_id = ?
If you really have to, as one quick workaround, you can apply an aggregate function to your medication_dose such as MAX(medication_dose).
However, note that this is normally an indication that you are either building the query incorrectly or that you need to refactor/normalize your database schema. In your case, it looks like you are tackling the query incorrectly. The correct approach should be the one suggested by OMG Ponies in another answer.
You may be interested in checking out the following interesting article which describes the reasons behind this error:
But WHY Must That Column Be Contained in an Aggregate Function or the GROUP BY clause?
You need to put max(medication_dose) in your select. Group by returns a result set that contains distinct values for fields in your group by clause, so apparently you have multiple records that have the same medication_name, but different doses, so you are getting two results.
By putting in max(medication_dose) it will return the maximum dose value for each medication_name. You can use any aggregate function on dose (max, min, avg, sum, etc.)
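Be aware that MAX(medication_dose) returns the highest dose ever recorded for the medication, which is not necessarily the dose on the latest date. A quick comparison in Python's sqlite3 (the sample rows are invented, and the patient filter is applied inside the subquery as well so that another patient's later date cannot hide a row):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE biological ("
    "patient_id INTEGER, medication_name TEXT, "
    "medication_dose INTEGER, patient_history_date_bio TEXT)"
)
conn.executemany(
    "INSERT INTO biological VALUES (?, ?, ?, ?)",
    [
        (12, "aspirin", 100, "2010-01-01"),
        (12, "aspirin", 50, "2010-06-01"),   # latest aspirin row, lower dose
        (12, "ibuprofen", 200, "2010-03-01"),
        (13, "aspirin", 300, "2010-12-01"),  # a different patient
    ],
)

# Aggregate workaround: date and dose may come from different rows.
agg_rows = conn.execute(
    """
    SELECT medication_name,
           MAX(patient_history_date_bio) AS med_date,
           MAX(medication_dose) AS medication_dose
    FROM biological
    WHERE patient_id = :pid
    GROUP BY medication_name
    ORDER BY medication_name
    """,
    {"pid": 12},
).fetchall()

# Join approach: find the latest date per medication, then fetch that row.
latest_rows = conn.execute(
    """
    SELECT b.medication_name, b.patient_history_date_bio, b.medication_dose
    FROM biological AS b
    JOIN (SELECT y.medication_name,
                 MAX(y.patient_history_date_bio) AS max_date
          FROM biological AS y
          WHERE y.patient_id = :pid
          GROUP BY y.medication_name) AS x
      ON x.medication_name = b.medication_name
     AND x.max_date = b.patient_history_date_bio
    WHERE b.patient_id = :pid
    ORDER BY b.medication_name
    """,
    {"pid": 12},
).fetchall()

print(agg_rows)
print(latest_rows)
```

For aspirin, the aggregate version pairs the June date with the January dose of 100, while the join returns the dose of 50 that was actually recorded in June.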