Group By expression not working in this SQL query?

I am new to SQL. Could anyone help me to figure out why the "Group By" Expression isn't working in this sql query? I get this error
ERROR at line 3:
ORA-00979: not a GROUP BY expression
The code I am using is
CREATE OR REPLACE VIEW CUSTOMER_LINE_ITEM AS
SELECT CUSTOMER_ORDER_CART_INFO.loginName,CUSTOMER_ORDER_CART_INFO.FirstName,
CUSTOMER_ORDER_CART_INFO.LastName,CUSTOMER_ORDER_CART_INFO.orderCartID,(lineItems.orderPrice*lineItems.qtyOrdered) AS TOTAL_ORDER
FROM CUSTOMER_ORDER_CART_INFO
INNER JOIN lineItems
ON CUSTOMER_ORDER_CART_INFO.orderCartID = lineItems.orderCartID
GROUP BY CUSTOMER_ORDER_CART_INFO.loginName,CUSTOMER_ORDER_CART_INFO.FirstName,
CUSTOMER_ORDER_CART_INFO.LastName,CUSTOMER_ORDER_CART_INFO.orderCartID
ORDER BY orderCartID;
Without the Group By expression I generate this view. I think the group by expression should just remove the duplicates and just give me the results with different order cart ID. Could anyone help me understand what I am doing wrong here?
[Image: output of the CUSTOMER_LINE_ITEM view without GROUP BY]

The error comes from how the GROUP BY clause is used. A simple rule of thumb: every selected column must either appear in the GROUP BY clause or be wrapped in an aggregate function such as MAX, MIN, SUM, or AVG.
The following query runs without error. Note, however, that SUM(orderPrice) * SUM(qtyOrdered) is not the same as the sum of the per-line totals, so I can't vouch for its logical correctness; you need to check it against your requirement.
CREATE OR REPLACE VIEW customer_line_item AS
SELECT cac.loginName,
cac.FirstName,
cac.LastName,
cac.orderCartID,
(SUM(li.orderPrice) * SUM(li.qtyOrdered)) AS TOTAL_ORDER
FROM customer_order_cart_info cac
INNER JOIN lineItems li
ON cac.orderCartID = li.orderCartID
GROUP BY cac.loginName,
cac.FirstName,
cac.LastName,
cac.orderCartID
ORDER BY cac.orderCartID;
The thing to note here is that li.orderPrice and li.qtyOrdered were being selected, but were neither in the GROUP BY clause nor inside an aggregate function.
The columns in the GROUP BY clause are used to logically group your data. Here the data is grouped by loginName, FirstName, LastName, and orderCartID. Each group can contain several line items, each with its own orderPrice and qtyOrdered, so the database cannot tell which single value to return for the group.
From your query, my guess at the requirement is that you want the total value of the order in a customer's cart, which is why you multiply orderPrice by qtyOrdered. To achieve this, you need to multiply orderPrice by qtyOrdered for each line item (identified by something like lineItemID or lineItemNo, which is just a guess) and then sum those products, that is, SUM(orderPrice * qtyOrdered). Give me some time to devise an example and I will edit my answer with it. Until then, try something like the query above.

The cause of the error message is that you don't aggregate (lineItems.orderPrice*lineItems.qtyOrdered).
The Oracle documentation tells us
SelectItems in the SelectExpression with a GROUP BY clause must
contain only aggregates or grouping columns.
That means you should aggregate TOTAL_ORDER by using e.g.
sum(lineItems.orderPrice*lineItems.qtyOrdered)
or whatever the requirement is.
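To see why the choice of aggregate matters, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for Oracle. The three sample rows are made up, and the table is a cut-down, hypothetical version of lineItems. It compares the sum of per-line totals with the product of two separate sums:

```python
import sqlite3

# Hypothetical miniature of the lineItems table; SQLite stands in for Oracle.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE lineItems (orderCartID INTEGER, orderPrice REAL, qtyOrdered INTEGER)"
)
conn.executemany(
    "INSERT INTO lineItems VALUES (?, ?, ?)",
    [(1, 10.0, 2), (1, 5.0, 3), (2, 7.0, 1)],  # made-up sample rows
)

# Sum of per-line totals: price * quantity is computed per row, then summed.
per_line = conn.execute(
    "SELECT orderCartID, SUM(orderPrice * qtyOrdered) "
    "FROM lineItems GROUP BY orderCartID ORDER BY orderCartID"
).fetchall()

# Product of two independent sums: valid SQL, but a different number
# as soon as a cart has more than one line item.
product_of_sums = conn.execute(
    "SELECT orderCartID, SUM(orderPrice) * SUM(qtyOrdered) "
    "FROM lineItems GROUP BY orderCartID ORDER BY orderCartID"
).fetchall()

print(per_line)         # cart 1: 10*2 + 5*3
print(product_of_sums)  # cart 1: (10+5) * (2+3)
```

For a cart with a single line item the two expressions agree; they diverge only when a cart has several rows, which is exactly the case GROUP BY is collapsing.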

Related

Trying to understand how WHERE IN in a subquery works in Teradata SQL?

I'm trying to build a sub-query with a list in the WHERE clause. I have tried several variations and I think the problem is with the way I'm structuring the WHERE IN. Help is greatly appreciated!
SELECT a.ACCT_SK,
a.BTN,
a.PRODUCT_SET,
MAX(b.ORD_CREATD_DT)
FROM MM.MEC_ACCT_ATTR a, CDI_CRM.ORD_MSTR b
WHERE a.ACCT_SK=b.ACCT_SK AND a.BTN=b.BTN
(SELECT b.ACCT_SK, b.ORD_CREATD_DT
FROM CDI_CRM.ORD_MSTR b
WHERE b.ACCT_SK IN ('44347714',
'44023302',
'43604964'));
SELECT Failed. 3706: (-3706)Syntax error: expected something between '(' and the 'SELECT' keyword
The desired output is a table with Product set for 50 ACCT_SKs with the most recent order date matched on ACCT_SK and BTN.

Sample data and desired results would really help. Your query doesn't make much sense, but I suspect you want:
SELECT a.ACCT_SK, a.BTN, a.PRODUCT_SET,
MAX(o.ORD_CREATD_DT)
FROM MM.MEC_ACCT_ATTR a JOIN
CDI_CRM.ORD_MSTR o
ON a.ACCT_SK = o.ACCT_SK AND a.BTN = o.BTN
WHERE a.ACCT_SK IN ('44347714', '44023302', '43604964')
GROUP BY a.ACCT_SK, a.BTN, a.PRODUCT_SET;
This returns the columns you want for the three specified accounts.
Notes:
Always use proper, explicit, standard JOIN syntax. Never use commas in the FROM clause.
Your subquery simply makes no sense. It is not connected to anything else in the query.
You are using an aggregation function (MAX()) so your query is an aggregation query and needs a GROUP BY.
Use meaningful table aliases. a makes sense for an accounts table, but b does not make sense for an orders table.
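As a rough illustration of the suggested query shape, the sketch below runs it against SQLite via Python's sqlite3 module rather than Teradata. The short table names acct and ord, the BTN values, the product sets, and the dates are all invented stand-ins for MM.MEC_ACCT_ATTR and CDI_CRM.ORD_MSTR:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# acct and ord are hypothetical stand-ins for the two real tables.
conn.executescript("""
CREATE TABLE acct (ACCT_SK TEXT, BTN TEXT, PRODUCT_SET TEXT);
CREATE TABLE ord  (ACCT_SK TEXT, BTN TEXT, ORD_CREATD_DT TEXT);
INSERT INTO acct VALUES ('44347714', 'B1', 'Voice'),
                        ('44023302', 'B2', 'Data'),
                        ('99999999', 'B9', 'Video');
INSERT INTO ord VALUES ('44347714', 'B1', '2019-01-05'),
                       ('44347714', 'B1', '2019-03-01'),
                       ('44023302', 'B2', '2018-12-31'),
                       ('99999999', 'B9', '2019-02-02');
""")

# Explicit JOIN, the IN list as a plain WHERE filter, and a GROUP BY that
# names every non-aggregated column in the SELECT list.
rows = conn.execute("""
SELECT a.ACCT_SK, a.BTN, a.PRODUCT_SET, MAX(o.ORD_CREATD_DT) AS last_order
FROM acct a
JOIN ord o ON a.ACCT_SK = o.ACCT_SK AND a.BTN = o.BTN
WHERE a.ACCT_SK IN ('44347714', '44023302', '43604964')
GROUP BY a.ACCT_SK, a.BTN, a.PRODUCT_SET
ORDER BY a.ACCT_SK
""").fetchall()

for row in rows:
    print(row)
```

An account in the IN list that has no matching rows (here '43604964') simply does not appear in the output, because this is an inner join.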

2010 Access Query with nested JOIN, WHERE and GROUP BY

I appreciate everyone's help and patience as I continue learning through converting a large Excel/vba system to Access.
I have the following query:
SELECT AccountWeeklyBalances.AccountNumber,
AccountWeeklyBalances.AccountBalance,
AccountWeeklyBalances.AccountDate,
AccountMaster.AccountName,
AccountCurrentModel.Model,
ModelDetailAllHistory.Risk
FROM ((AccountWeeklyBalances
INNER JOIN AccountMaster
ON AccountMaster.[AccountNumber] = AccountWeeklyBalances.AccountNumber)
INNER JOIN AccountCurrentModel
ON AccountWeeklyBalances.AccountNumber=AccountCurrentModel.AccountNumber)
INNER JOIN ModelDetailAllHistory
ON AccountCurrentModel.Model=ModelDetailAllHistory.ModelName
WHERE AccountWeeklyBalances.AccountDate=[MatchDate]
;
This works, except I want to GROUP BY the Model. I tried adding
GROUP BY AccountCurrentModel.Model
and
GROUP BY ModelDetailAllHistory.ModelName
after the WHERE clause, but both give me an error:
Tried to execute a query that does not include the specified expression
'AccountNumber' as part of an aggregate function.
I've read several other posts here, but cannot figure out what I've done wrong.

It depends on what you're trying to do. If you just want to sum the AccountBalance by ModelName, then all the other columns would have to be removed from the select statement. If you want the sum of each model for each account, then you would just add the AccountNumber to the GROUP BY, probably before the ModelName.
When aggregating, you can't include anything in the select list that's not either an aggregate function (min, max, sum, etc) or something you are grouping by, because there's no way to represent that in the query results. How could you show the sum of AccountBalance by ModelName, but also include the AccountNumber? The only way to do that would be to group by both AccountNumber and ModelName.
----EDIT----
After discussing in the comments I have a clearer idea of what's going on. There is no aggregation, but there are multiple records in ModelDetailAllHistory for each Model. However, the only value we need from that table is Risk, and that will always be the same per model. So we need to eliminate the duplicate Risk values. This can be done by joining to a subquery instead of joining directly to ModelDetailAllHistory:
INNER JOIN (SELECT DISTINCT ModelName, Risk FROM ModelDetailAllHistory) mh
ON AccountCurrentModel.Model=mh.ModelName
or
INNER JOIN (SELECT ModelName, max(Risk) FROM ModelDetailAllHistory GROUP BY ModelName) mh
ON AccountCurrentModel.Model=mh.ModelName
Both methods collapse the multiple Risk values into a single value per Model, eliminating the duplicate records. I tend to prefer the first option because if for some reason there were multiple Risk values for a single Model, you'd end up with duplicate records and you'd know there was something wrong. Using max() is basically choosing an arbitrary record from ModelDetailAllHistory that matches the given Model and getting the Risk value from it, since you know all the Risk values for that model should be the same. What I don't like about this method is it will hide data inconsistencies from you (e.g. if for some reason there are some ModelDetailAllHistory records for the same Model that don't have the same Risk value), and while it's nice to know you'll never ever get duplicate records, the underlying problem could end up rearing its ugly head in other unexpected ways.
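The effect of the DISTINCT subquery can be sketched with Python's sqlite3 module (SQLite here, not Access, so the syntax is slightly more relaxed). The AsOf column and the sample rows are assumptions, added only so that ModelDetailAllHistory has several history rows per model:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# AsOf is a hypothetical column that makes the history rows differ.
conn.executescript("""
CREATE TABLE AccountCurrentModel (AccountNumber INTEGER, Model TEXT);
CREATE TABLE ModelDetailAllHistory (ModelName TEXT, Risk TEXT, AsOf TEXT);
INSERT INTO AccountCurrentModel VALUES (1, 'Growth'), (2, 'Income');
INSERT INTO ModelDetailAllHistory VALUES
    ('Growth', 'High', '2010-01-01'),
    ('Growth', 'High', '2010-02-01'),
    ('Income', 'Low',  '2010-01-01');
""")

# Joining straight to the history table repeats account 1 once per history row.
direct = conn.execute("""
SELECT acm.AccountNumber, mh.Risk
FROM AccountCurrentModel acm
JOIN ModelDetailAllHistory mh ON acm.Model = mh.ModelName
ORDER BY acm.AccountNumber
""").fetchall()

# Joining to a DISTINCT subquery collapses identical (ModelName, Risk) pairs first.
deduped = conn.execute("""
SELECT acm.AccountNumber, mh.Risk
FROM AccountCurrentModel acm
JOIN (SELECT DISTINCT ModelName, Risk FROM ModelDetailAllHistory) mh
  ON acm.Model = mh.ModelName
ORDER BY acm.AccountNumber
""").fetchall()

print(direct)
print(deduped)
```

If a model ever had two different Risk values, the DISTINCT version would return two rows for that account again, which is the visible warning sign described above.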

GROUP BY clause order omitting results in Oracle 11g query

I have a simple query that appears to give the desired result:
select op.opr, op.last, op.dept, count(*) as counter
from DWRVWR.BCA_M_OPRIDS1 op
where op.opr = '21B'
group by op.opr, op.last ,op.dept;
My original query returns no results. The only difference was the order of the group by clause:
select op.opr, op.last, op.dept, count(*) as counter
from DWRVWR.BCA_M_OPRIDS1 op
where op.opr = '21B'
group by op.opr, op.dept, op.last;
In actuality, this was part of a much larger, more complicated query, but I narrowed down the problem to this. All documentation I was able to find states that the order of the group by clause doesn't matter. I really want to understand why I am getting different results, as I would have to review all of my queries that use the group by clause, if there is a potential issue. I'm using SQL Developer, if it matters.
Also, if the order of the group by clause did not matter and every field not used in an aggregate function is required to be listed in the group by clause, wouldn't the group by clause simply be redundant and seemingly unnecessary?

"All documentation I was able to find states that the order of the group by clause doesn't matter"
That's not entirely true; it depends.
The grouping functionality is not impacted by the order of columns in the GROUP BY clause. It will produce the same set of groups regardless of the order. Perhaps that's what the documentation you found was referring to. However, the order does matter for other aspects.
Before Oracle 10g, GROUP BY was implemented with a sort, so in practice the output usually came back ordered by the GROUP BY columns (although this was never guaranteed), and the order of the columns in the GROUP BY clause did matter. The groups were the same, only ordered differently. Starting with Oracle 10g, if you want the result set to be in any specific order, then you must add an ORDER BY clause. Other databases have a similar history.
Another case where the order can matter is if you have indexes on the table. Depending on the optimizer, a multi-column index may only be usable when its columns line up with the columns specified in the GROUP BY or ORDER BY clauses. So if you change the order, your query may stop using the index and perform differently. The result is the same, but the performance is not.
Also the order of the columns in the GROUP BY clause becomes important if you use some features like ROLLUP. This time the results themselves will not be the same.
It is recommended to follow the best practice of listing the fields in the GROUP BY clause in the order of the hierarchy. This makes the query more readable and more easily maintainable.
"Also, if the order of the group by clause did not matter and every field not used in an aggregate function is required to be listed in the group by clause, wouldn't the group by clause simply be redundant and seemingly unnecessary?"
No, the GROUP BY clause is mandatory in the standard SQL and in Oracle. There is only one exception in which you can omit the GROUP BY clause, if you want the aggregate functions to apply to the entire result set. In this case, your SELECT list must consist only of aggregate expressions.
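A quick way to convince yourself that column order does not change the groups is to run both orderings and compare the results as sets. This is a minimal sketch using Python's sqlite3 module rather than Oracle; the table oprids, the column name last_name (renamed from last to avoid any keyword clash), and the rows are all invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# oprids is a hypothetical stand-in for DWRVWR.BCA_M_OPRIDS1.
conn.execute("CREATE TABLE oprids (opr TEXT, last_name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO oprids VALUES (?, ?, ?)",
    [("21B", "Smith", "A"), ("21B", "Smith", "A"),
     ("21B", "Jones", "B"), ("30C", "Lee", "A")],
)

def grouped(group_cols):
    """Run the same aggregate query with a given GROUP BY column order."""
    sql = (
        "SELECT opr, last_name, dept, COUNT(*) FROM oprids "
        "WHERE opr = '21B' GROUP BY " + group_cols
    )
    # Sort in Python so that only the content of the groups is compared,
    # not whatever row order the engine happens to return.
    return sorted(conn.execute(sql).fetchall())

order_a = grouped("opr, last_name, dept")
order_b = grouped("opr, dept, last_name")

# With no GROUP BY at all, the aggregate applies to the whole result set.
total = conn.execute("SELECT COUNT(*) FROM oprids").fetchone()[0]

print(order_a == order_b, total)
```

If two orderings ever return different groups on a real system, as in the question, that points at something other than the GROUP BY semantics (for example a bug or a stale object), not at the column order itself.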

Select Statement with Distinct returning multiple rows and need only first result

I'm having a challenge with my query returning multiple results.
SELECT DISTINCT gpph.id, gpph.cname, gc2a.assetfilename, gpph.alternateURL
FROM [StepMirror].[dbo].[stepview_nwppck_ngn_getpimproducthierarchy] gpph
INNER JOIN [StepMirror].[dbo].[stepview_nwppck_ngn_getclassification2assetrefs] gc2a
ON gpph.id=gc2a.id
WHERE gpph.subtype='Level_4' AND gpph.parentId=#ID AND gc2a.assettype='Primary Image'
A record, 5679599, has 2 'Primary Images' and is returning 2 results for that id but I only need the first result back. Is there any way to do this IN the current query? Do I need to write multiple queries?
I need some direction on how to constrain the results to only 1 result on Primary Image. I have looked at a ton of similar questions but most typically are just requiring the guidance of adding 'distinct' to the beginning of their query rather than on the where clause.
Edit: This problem is created by a user inputting 2 Primary Images on one record in the database. My business requirements only state to take the first result.
Any help would be awesome!

Given that the choice of which row to return is arbitrary, we can just use an aggregate on the value. This then needs a GROUP BY clause, which eliminates the need for the DISTINCT.
SELECT gpph.id, gpph.cname, max(gc2a.assetfilename), gpph.alternateURL
FROM [StepMirror].[dbo].[stepview_nwppck_ngn_getpimproducthierarchy] gpph
INNER JOIN [StepMirror].[dbo].[stepview_nwppck_ngn_getclassification2assetrefs] gc2a
ON gpph.id=gc2a.id
WHERE gpph.subtype='Level_4' AND gpph.parentId=#ID AND gc2a.assettype='Primary Image'
GROUP BY gpph.id, gpph.cname, gpph.alternateURL
In this instance, using max(gc2a.assetfilename) is going to give you the alphabetically highest value in the event of there being more than one record. It's not the ideal choice, some kind of timestamp knowing the order of the records might be more helpful, since then the meaning of the word 'first' could make more sense.
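Here is a small sketch of that behaviour, run against SQLite through Python's sqlite3 module instead of SQL Server. The tables product and asset are simplified, hypothetical stand-ins for the two StepMirror views, and the file names are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# product and asset are hypothetical, cut-down versions of the two views.
conn.executescript("""
CREATE TABLE product (id INTEGER, cname TEXT);
CREATE TABLE asset (id INTEGER, assetfilename TEXT, assettype TEXT);
INSERT INTO product VALUES (5679599, 'Widget'), (5679600, 'Gadget');
INSERT INTO asset VALUES
    (5679599, 'a.jpg', 'Primary Image'),
    (5679599, 'b.jpg', 'Primary Image'),
    (5679600, 'c.jpg', 'Primary Image');
""")

# DISTINCT cannot help: the two rows for 5679599 differ in assetfilename.
distinct_rows = conn.execute("""
SELECT DISTINCT p.id, p.cname, a.assetfilename
FROM product p
JOIN asset a ON p.id = a.id
WHERE a.assettype = 'Primary Image'
ORDER BY p.id, a.assetfilename
""").fetchall()

# Aggregating the differing column and grouping by the rest gives one row per id.
grouped_rows = conn.execute("""
SELECT p.id, p.cname, MAX(a.assetfilename) AS assetfilename
FROM product p
JOIN asset a ON p.id = a.id
WHERE a.assettype = 'Primary Image'
GROUP BY p.id, p.cname
ORDER BY p.id
""").fetchall()

print(len(distinct_rows), grouped_rows)
```

Swapping MAX for MIN would pick the alphabetically lowest file name instead; neither corresponds to insertion order, which is why a timestamp column would be a better basis for "first".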

Replace DISTINCT with GROUP BY:
SELECT MAX(gpph.id), gpph.cname, gc2a.assetfilename, gpph.alternateURL
FROM [StepMirror].[dbo].[stepview_nwppck_ngn_getpimproducthierarchy] gpph
INNER JOIN [StepMirror].[dbo].[stepview_nwppck_ngn_getclassification2assetrefs] gc2a
ON gpph.id=gc2a.id
WHERE gpph.subtype='Level_4' AND gpph.parentId=#ID AND gc2a.assettype='Primary Image'
GROUP BY gpph.cname, gc2a.assetfilename, gpph.alternateURL

Aggregate function in SQL Server

I'm getting really frustrated with SQL Server. I'm just trying to join 3 tables, which is very simple and easily done in MySQL. But SQL Server keeps telling me that tbl_department.deptname must be contained in an aggregate function. What aggregate function could I possibly use on a simple string?
SELECT
COUNT(tblStudent_Department.student_id) AS Expr2,
tbl_department.deptname AS Expr1
FROM
tblStudent_Department
LEFT OUTER JOIN
tbl_department ON tblStudent_Department.deptcode = tbl_department.deptcode
LEFT OUTER JOIN
tblStudent ON tblStudent_Department.student_id = tblStudent.studentid
GROUP BY
tblStudent_Department.deptcode
Please help.

The database doesn't know that if you group on deptcode, you're implicitly grouping on deptname. You must tell SQL Server this by adding the column to the group by:
GROUP BY tblStudent_Department.deptcode, tbl_department.deptname
MySQL is special in that it basically picks an arbitrary row if you don't specify an aggregate (unless the ONLY_FULL_GROUP_BY SQL mode is enabled, which is the default from MySQL 5.7.5 onward). This can be misleading and lead to wrong results. As in many other things, MySQL has the more pragmatic solution, and SQL Server the more correct one.

The problem is because your GROUP BY and SELECT terms don't match up.
The simplest way to fix this is to add tbl_department.deptname into your GROUP BY, like so:
GROUP BY tblStudent_Department.deptcode, tbl_department.deptname

You're grouping by deptcode but selecting deptname - if you don't want to aggregate the department (which sounds like it makes sense) then you need to have the deptname in the "group by" statement:
SELECT COUNT(tblStudent_Department.student_id) AS Expr2, tbl_department.deptname AS Expr1
FROM tblStudent_Department
LEFT OUTER JOIN tbl_department ON tblStudent_Department.deptcode = tbl_department.deptcode
LEFT OUTER JOIN tblStudent ON tblStudent_Department.student_id = tblStudent.studentid
GROUP BY tbl_department.deptname
Note I've removed the deptcode because I don't think you need it
If you're using aggregate functions (sum, count etc) ALL fields returned in your select statement need to either be aggregated OR in the group by clause.
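The corrected grouping can be tried out with Python's sqlite3 module. One caveat: SQLite, like older MySQL, tolerates a non-grouped column in the SELECT list, so this sketch cannot reproduce the SQL Server error; it only shows that grouping by both deptcode and deptname yields the expected counts. The sample departments and students are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Made-up sample data for the two tables that matter for the grouping.
conn.executescript("""
CREATE TABLE tbl_department (deptcode TEXT, deptname TEXT);
CREATE TABLE tblStudent_Department (student_id INTEGER, deptcode TEXT);
INSERT INTO tbl_department VALUES ('CS', 'Computer Science'), ('MA', 'Mathematics');
INSERT INTO tblStudent_Department VALUES (1, 'CS'), (2, 'CS'), (3, 'MA');
""")

# Every non-aggregated column in the SELECT list (deptname) is also in GROUP BY.
# deptname depends on deptcode, so adding it does not split any group.
rows = conn.execute("""
SELECT d.deptname, COUNT(sd.student_id) AS students
FROM tblStudent_Department sd
LEFT OUTER JOIN tbl_department d ON sd.deptcode = d.deptcode
GROUP BY sd.deptcode, d.deptname
ORDER BY d.deptname
""").fetchall()

print(rows)
```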

Use First, Last, or put it into the GROUP BY.
The rules are:
If you use a GROUP BY, every field is either one of the grouping fields or one of the aggregated fields.
If you select tbl_department.deptname then you have to either group by that, too, or say which one is taken.
Some aggregate functions fake that nicely: First and Last take the first or last occurrence. Note that these exist in Access but not in SQL Server, where MIN or MAX can play the same role.
