SQL SUM with GROUP clause giving errors - sql

I've been visiting this site for a while now and many of the responses on here have been most helpful. However, I'm now stuck with an SQL query that I can't seem to find the right solution for.
($packplant and $ym are already defined earlier in the program.)
SELECT
A.in_house_supplier_cd,
B.maker_cd,
A.packing_plant_cd,
A.parts_no,
substr(A.actual_delivery_date,1,6),
A.actual_delivered_qty
FROM
TRN_DELIVERY_NO A,
TRN_PARTS B
WHERE
A.ISSUE_NO = B.ISSUE_NO
AND A.PACKING_PLANT_CD = '$packplant'
AND B.PACKING_PLANT_CD = '$packplant'
AND A.PARTS_NO = B.PARTS_NO
AND A.IN_HOUSE_SUPPLIER_CD = B.IN_HOUSE_SUPPLIER_CD
AND A.ACTUAL_DELIVERY_DATE LIKE '$ym%'
ORDER BY
in_house_supplier_cd, maker_cd, parts_no;
This SQL works fine. However, what I need is for "A.actual_delivered_qty" to be sum(A.actual_delivered_qty). In other words, I need the total quantity for each part, not the individual quantities that were received.
When I add the "sum.." part (or even with adding a GROUP BY parts_no), the sql gives a "column ambiguously defined" error.
I believe that I've already assigned the correct table to each column and therefore would really appreciate it if someone could point out the errors as I've been stuck with this for quite a while now. Cheers!

You will need to add a GROUP BY statement, for example on parts_no, but then you will have an issue with the rest of the columns in your select statement.
For example, if you have 3 records for the same part no on different days and you are grouping by part_no and calculating the total number of items within that group number, the date no longer makes sense. The best you can do is select the max value from the date, but again, this doesn't make much sense.
You should think about what data really makes sense to include in the select statement when you are grouping by part_no and then revise the columns in the select statement to meet this new design.

When you add the group by make sure you include the table alias ( like a.parts_no ). Can you add your "new" query including the sum and group by?
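As a rough illustration of the two answers above, here is a minimal sketch that runs the summed query against an in-memory SQLite database through Python's sqlite3 module. The table layouts and sample rows are made up, the `ym` and `total_qty` aliases are hypothetical, and the asker's database is probably not SQLite, so treat it as a shape to adapt. Every column in GROUP BY and ORDER BY carries its table alias, and every non-aggregated column in the SELECT list is repeated in GROUP BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified versions of the two tables in the question.
    CREATE TABLE trn_delivery_no (issue_no TEXT, in_house_supplier_cd TEXT,
                                  packing_plant_cd TEXT, parts_no TEXT,
                                  actual_delivery_date TEXT,
                                  actual_delivered_qty INTEGER);
    CREATE TABLE trn_parts (issue_no TEXT, in_house_supplier_cd TEXT,
                            packing_plant_cd TEXT, parts_no TEXT, maker_cd TEXT);
    INSERT INTO trn_delivery_no VALUES
        ('I1', 'S1', 'P1', 'X100', '20120105', 10),
        ('I2', 'S1', 'P1', 'X100', '20120119', 5),
        ('I3', 'S1', 'P1', 'Y200', '20120120', 7);
    INSERT INTO trn_parts VALUES
        ('I1', 'S1', 'P1', 'X100', 'M1'),
        ('I2', 'S1', 'P1', 'X100', 'M1'),
        ('I3', 'S1', 'P1', 'Y200', 'M2');
""")

query = """
    SELECT a.in_house_supplier_cd, b.maker_cd, a.packing_plant_cd, a.parts_no,
           substr(a.actual_delivery_date, 1, 6) AS ym,
           SUM(a.actual_delivered_qty) AS total_qty
    FROM trn_delivery_no a
    JOIN trn_parts b
      ON a.issue_no = b.issue_no
     AND a.parts_no = b.parts_no
     AND a.in_house_supplier_cd = b.in_house_supplier_cd
     AND a.packing_plant_cd = b.packing_plant_cd
    WHERE a.packing_plant_cd = ?
      AND a.actual_delivery_date LIKE ? || '%'
    -- Every non-aggregated SELECT column appears here, with its table alias.
    GROUP BY a.in_house_supplier_cd, b.maker_cd, a.packing_plant_cd, a.parts_no,
             substr(a.actual_delivery_date, 1, 6)
    ORDER BY a.in_house_supplier_cd, b.maker_cd, a.parts_no
"""
# Bound parameters stand in for $packplant and $ym.
rows = conn.execute(query, ("P1", "201201")).fetchall()
for row in rows:
    print(row)
```

Binding the plant code and month as parameters instead of interpolating them into the string is a side choice here; it also avoids quoting problems.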

Just FYI, according to the PostgreSQL 9.1 documentation (3.5. Window Functions) and Microsoft's OVER Clause (Transact-SQL) page for SQL Server 2008 (they have links for SQL Server 2005 as well), GROUP BY isn't strictly needed. Take a look at the sample SQL queries:
PostgreSQL:
SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary DESC) FROM empsalary;
MS SQL:
USE AdventureWorks2008R2;
GO
SELECT SalesOrderID, ProductID, OrderQty
,SUM(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Total'
,AVG(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Avg'
,COUNT(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Count'
,MIN(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Min'
,MAX(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Max'
FROM Sales.SalesOrderDetail
WHERE SalesOrderID IN(43659,43664);
GO
I've never used SQL Plus, but it looks like the OVER clause is supported there as well.
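To show what the OVER clause does differently from GROUP BY, here is a small sketch using Python's sqlite3 module. It assumes the bundled SQLite is 3.25 or newer (window functions were added in that release), and the `deliveries` table and its rows are invented for the example. Each delivery row is kept, and the per-part total is repeated on every row of that part.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table: one row per delivery.
    CREATE TABLE deliveries (parts_no TEXT, delivery_date TEXT, qty INTEGER);
    INSERT INTO deliveries VALUES
        ('X100', '20120105', 10),
        ('X100', '20120119', 5),
        ('Y200', '20120120', 7);
""")

# SUM(...) OVER (PARTITION BY ...) adds a per-part total without collapsing rows.
window_rows = conn.execute("""
    SELECT parts_no, delivery_date, qty,
           SUM(qty) OVER (PARTITION BY parts_no) AS total_qty
    FROM deliveries
    ORDER BY parts_no, delivery_date
""").fetchall()

for row in window_rows:
    print(row)
```

If only one row per part is wanted, GROUP BY is still the simpler tool; the window form is useful when the individual quantities have to stay visible next to the total.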

Related

Why Does this SQL query work, but the other doesn't?

I am learning SQL, and in this tutorial I can write this:
SELECT MAX(Price) AS Price, ProductName
FROM Products;
Which returns what I expect:
Price ProductName
263.5 Côte de Blaye
But on StrataScratch when I am trying to begin to solve a problem (I am currently just seeing what ideas will work towards a solution) this (and similar variations) throws an error
SELECT MAX(salary) AS salary, first_name
FROM db_employee;
error:
(psycopg2.errors.GroupingError) column "db_employee.first_name" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT MAX(salary) AS salary, first_name
Conceptually these seem like the same query to me, so I am not sure why the query from StrataScratch throws an error.
You should have a group by clause here, like this:
SELECT MAX(Price) AS Price, ProductName
FROM Products
GROUP BY ProductName
It's literally telling you why right there. MAX() is an aggregate function. Aggregate functions need a GROUP BY statement if the query is going to include a standard column.
SELECT MAX(Price) as Price, ProductName
FROM Products
GROUP BY ProductName
It fails because SQL won't know automatically that it should establish a relationship between the max price and the product name. Remember that SQL is a RELATIONAL database language: all of the data must be somehow related. SQL is not going to guess; you need to explicitly establish a relationship between Price and ProductName.
I really recommend you to read some articles about SQL debugging. I've noticed that a lot of new students have subpar debugging skills and SQL debugging messages are actually incredibly informative.
If you want the product with the maximum price, then use order by and limit the results to one row. In standard SQL this is:
SELECT p.*
FROM Products p
ORDER BY p.price DESC
FETCH FIRST 1 ROW ONLY;
(The last line of code is often LIMIT 1.)
Your first query:
SELECT MAX(Price) AS Price, ProductName
FROM Products;
should fail with a syntax error, and would in almost all databases. Why? Because this is an aggregation query (because of the MAX()), but ProductName is neither aggregated nor part of a GROUP BY.
It does work in (at least) two databases. In older versions of MySQL, you would get an arbitrary ProductName. By luck, you might get the right one, but that would be a coincidence. MySQL has since fixed this so the query should return an error in more recent versions (using default configuration settings).
SQLite does permit such syntax and actually returns the ProductName associated with the maximum price. However, I don't recommend learning that as "valid" SQL, because it only works in SQLite.
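The SQLite behaviour described above can be checked with a short sketch using Python's sqlite3 module. The `Products` rows are sample data made up for the example (only Côte de Blaye at 263.5 comes from the question). It also shows that adding GROUP BY ProductName answers a different question, returning one row per product, while ORDER BY with a one-row limit is the form that carries over to other databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (ProductName TEXT, Price REAL);
    INSERT INTO Products VALUES
        ('Chais', 18.0),
        ('Côte de Blaye', 263.5),
        ('Aniseed Syrup', 10.0);
""")

# SQLite-specific: the bare column is taken from the row that holds MAX(Price).
bare = conn.execute(
    "SELECT MAX(Price) AS Price, ProductName FROM Products"
).fetchone()

# GROUP BY ProductName is valid everywhere, but yields one row per product.
per_name = conn.execute(
    "SELECT MAX(Price) AS Price, ProductName FROM Products GROUP BY ProductName"
).fetchall()

# Portable idea: sort by price and keep the first row (LIMIT 1 in SQLite).
top = conn.execute(
    "SELECT Price, ProductName FROM Products ORDER BY Price DESC LIMIT 1"
).fetchone()

print(bare, len(per_name), top)
```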

Refer to aggregate result in Amazon Redshift query?

In other PostgreSQL-derived DBMSes (e.g., Netezza) I can do something like this without errors:
select store_id
,sum(sales) as total_sales
,count(distinct(txn_id)) as d_txns
,total_sales/d_txns as avg_basket
from my_tlog
group by 1
I.e., I can use aggregate values within the same SQL query that defined them.
However, when I go to do the same sort of thing on Amazon Redshift, I get the error "Column total_sales does not exist..." Which it doesn't, that's correct; it's not really a column. But is there a way to preserve this idiom, rather than restructuring the query? I ask because there would be a lot of code to change.
Thanks.
You simply need to repeat the expressions (or use a subquery or CTE):
select store_id,
sum(sales) as total_sales,
count(distinct txn_id) as d_txns,
sum(sales)/count(distinct txn_id) as avg_basket
from my_tlog
group by store_id;
Most databases do not support the re-use of column aliases in the select. The reason is twofold (at least):
The designers of the database engine do not want to specify the order of processing of expressions in the select.
There is ambiguity when a column alias is also a valid column in a table in the from clause.
Personally, I love the construct in Netezza. It is compact and the syntax is not ambiguous: any 'duplicate' column name defaults to the (new) alias in the current query, and if you need to reference the column of the underlying table, you simply put the table name in front of the column. The above example would become:
select store_id
,sum(sales) as sales ---- duplicate name
,count(distinct(txn_id)) as d_txns
,my_tlog.sales/d_txns as avg_basket --- this illustrates but may not make sense
from my_tlog
group by 1
I recently moved away from SQL Server, and on that database I used a construct like this to avoid repeating the expressions:
Select *, total_sales/d_txns as avg_basket
From (
select store_id
,sum(sales) as total_sales
,count(distinct(txn_id)) as d_txns
from my_tlog
group by 1
)x
Most (if not all) databases support this construct, and have done so for 10 years or more.
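Here is a minimal runnable sketch of that derived-table construct, using Python's sqlite3 module rather than Redshift. The `my_tlog` rows are invented. The aliases are defined in the inner query and only referenced in the outer one, so no expression has to be repeated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical transaction log: several lines can share one txn_id.
    CREATE TABLE my_tlog (store_id INTEGER, txn_id INTEGER, sales REAL);
    INSERT INTO my_tlog VALUES
        (1, 100, 10.0), (1, 100, 5.0), (1, 101, 15.0),
        (2, 200, 8.0);
""")

rows = conn.execute("""
    SELECT x.*, x.total_sales / x.d_txns AS avg_basket
    FROM (
        -- Inner query defines the aliases once.
        SELECT store_id,
               SUM(sales) AS total_sales,
               COUNT(DISTINCT txn_id) AS d_txns
        FROM my_tlog
        GROUP BY store_id
    ) x
    ORDER BY x.store_id
""").fetchall()

for row in rows:
    print(row)
```

Since sales is stored as a real number here, the division is not truncated; with integer columns some databases would do integer division, so a cast may be needed.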

how to find maximum of sum of number using if else in procedure in sap hana sql

I want to list the product with the highest sales amount for each date.
Note: "highest sales amount" here means max(sum(sales_amnt)).
I'd like to do this using IF or CASE in a procedure in SAP HANA SQL.
I managed it using a WITH clause:
/--------------------------CORRECT ONE ----------------------------------------------/
WITH ranked AS
(
SELECT Dense_RAnk() OVER (ORDER BY SUM("SALES_AMNT"), "SALES_DATE", "PROD_NAME") as rank,
SUM("SALES_AMNT") AS Amount, "PROD_NAME",count(*), "SALES_DATE" FROM "KABIL"."DATE"
GROUP BY "SALES_DATE", "PROD_NAME"
)
SELECT "SALES_DATE", "PROD_NAME",Amount
FROM ranked
WHERE rank IN ( select MAX(rank) from ranked group by "SALES_DATE")
ORDER BY "SALES_DATE" DESC;
this is my table
You cannot use IF in a SELECT statement. Note that you can achieve most boolean logic with the CASE expression syntax.
In a SELECT, you apply it to a column, and your logic is executed once for every row of the result set. Hence, writing imperative logic there is discouraged. Still, if you want to do that, create a calculation view and use intermediate calculated columns to achieve what you are expecting.
Try this; it gave me the answer:
select "SALES_DATE","PROD_NAME",sum("SALES_AMNT")
from "KABIL"."DATE"
group by "SALES_DATE","PROD_NAME"
having (SUM("SALES_AMNT"),"SALES_DATE") IN (select
MAX(SUM_SALES),"SALES_DATE"
from (select SUM("SALES_AMNT")
as
SUM_SALES,"SALES_DATE","PROD_NAME"
from "KABIL"."DATE"
group by "SALES_DATE","PROD_NAME"
)
group by "SALES_DATE");
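The same "top product per date" idea can be sketched with two CTEs and a join, which avoids both the ranking and the row-value IN. This runs on SQLite through Python's sqlite3 module, not SAP HANA; the `sales` table name, its lowercase columns and the rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for the "KABIL"."DATE" table.
    CREATE TABLE sales (sales_date TEXT, prod_name TEXT, sales_amnt INTEGER);
    INSERT INTO sales VALUES
        ('2017-01-01', 'pen', 10), ('2017-01-01', 'pen', 20),
        ('2017-01-01', 'ink', 25),
        ('2017-01-02', 'ink', 40), ('2017-01-02', 'pen', 5);
""")

rows = conn.execute("""
    WITH totals AS (
        -- Step 1: total per date and product.
        SELECT sales_date, prod_name, SUM(sales_amnt) AS amount
        FROM sales
        GROUP BY sales_date, prod_name
    ),
    best AS (
        -- Step 2: highest total on each date.
        SELECT sales_date, MAX(amount) AS max_amount
        FROM totals
        GROUP BY sales_date
    )
    -- Step 3: keep the products whose total equals that maximum.
    SELECT t.sales_date, t.prod_name, t.amount
    FROM totals t
    JOIN best b
      ON b.sales_date = t.sales_date
     AND b.max_amount = t.amount
    ORDER BY t.sales_date DESC
""").fetchall()

for row in rows:
    print(row)
```

If two products tie for the highest total on a date, both rows are returned.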

Which product is ordered most frequently?

SELECT ProductID
FROM OrderLine_T
GROUP BY ProductID
ORDER BY COUNT(ProductID) DESC
I'm ordering the products like this, but LIMIT and ROWNUM don't work for some reason. I need a query that returns only the single most frequently ordered product. I'm using Teradata, and the database name is db_pvfc10_big. I'm sorry if the question is confusing; it's my first question and I'm a beginner in SQL.
Thank you in advance
The LIMIT keyword is a non-standard extension (supported by MySQL, PostgreSQL and SQLite, among others).
And ROWNUM is a pseudo column specific to Oracle.
So there are definitely "some reasons" that you might observe LIMIT and ROWNUM as "not functioning".
The question doesn't indicate which RDBMS is being used... MySQL, PostgreSQL, Oracle, SQL Server, DB2, Teradata, etc.
(NOTE: using "not functioning" as the only description of the behavior you observe is rather imprecise.
The description doesn't indicate whether the execution of the query is returning an error of some kind, or if the query is executing and returning a resultset that isn't expected.
The statement(s) you describe as "not functioning" aren't even shown.)
One ANSI-standard SQL approach to getting a result is getting that "maximum" value using the standard MAX() aggregate. One way to do that is using an inline view. For example:
SELECT MAX(s.cnt) AS max_cnt
FROM ( SELECT COUNT(t.productid) AS cnt
FROM orderline_t t
GROUP BY t.productid
) s
That can also be used as an inline view...
SELECT MAX(q.productid)
FROM ( SELECT MAX(s.cnt) AS max_cnt
FROM ( SELECT COUNT(t.productid) AS cnt
FROM orderline_t t
GROUP BY t.productid
) s
) r
JOIN ( SELECT p.productid
, COUNT(p.productid) AS cnt
FROM orderline_t p
GROUP BY p.productid
) q
ON q.cnt = r.max_cnt
Note that if there are two or more products that are ordered the same "maximum" number of times, this query will return just one of those productid.
This should work in most relational databases.
There are other query patterns that will return an equivalent result.
But this example should help explain why most RDBMS offer extensions to the SQL standard, which often make for simpler queries.
MySQL "... ORDER BY ... LIMIT 1"
SQL Server "SELECT TOP 1 ..."
etc.
You could try including the count(ProductID) in the select statement. Some sql databases use the keyword "top" instead of "limit". So if you're using one of those (like Teradata sql for example), do the following:
select top 1 ProductID, count(ProductID)
from OrderLine_T
group by ProductID
order by 2 desc;
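For comparison, here is a sketch of a variant that avoids LIMIT, TOP and ROWNUM altogether by comparing each product's count with the maximum count in a HAVING clause. It is run on SQLite through Python's sqlite3 module, not Teradata, and the `OrderID` column and the rows are made up. Unlike the one-row approaches, it returns every product that ties for first place.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical order lines: products 10 and 20 are each ordered twice.
    CREATE TABLE OrderLine_T (OrderID INTEGER, ProductID INTEGER);
    INSERT INTO OrderLine_T VALUES
        (1, 10), (1, 20), (2, 10), (3, 20), (4, 30);
""")

rows = conn.execute("""
    SELECT ProductID, COUNT(*) AS cnt
    FROM OrderLine_T
    GROUP BY ProductID
    -- Keep only the groups whose count equals the largest group count.
    HAVING COUNT(*) = (
        SELECT MAX(cnt)
        FROM (SELECT COUNT(*) AS cnt
              FROM OrderLine_T
              GROUP BY ProductID) s
    )
    ORDER BY ProductID
""").fetchall()

print(rows)
```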

SQL Nested Where with Sums

I've run into a syntax issue with SQL. What I'm trying to do here is add together all of the amounts paid on each order (paid each) and then select only those orders whose total is greater than the sum of paid each for a specific order# (1008). I've been trying to move around lots of different things here and I'm not having any luck.
This is what I have right now, though I've had many different things. Trying to use this simply returns an SQL statement not ended properly error. Any help you guys could give would be greatly appreciated. Do I have to use DISTINCT anywhere here?
SELECT ORDER#,
TO_CHAR(SUM(PAIDEACH), '$999.99') AS "Amount > Order 1008"
FROM ORDERITEMS
GROUP BY ORDER#
WHERE TO_CHAR > (SUM (PAIDEACH))
WHERE ORDER# = 1008;
Some versions of SQL regard the hash character (#) as the beginning of a comment. Others use double hyphen (--) and some use both. So, my first thought is that your ORDER# field is named incorrectly (though I can't imagine the engine would let you create a field with that name).
You have two WHERE keywords, which isn't allowed. If you have multiple WHERE conditions, you must link them together using boolean logic, with AND and OR keywords.
You have your WHERE condition after GROUP BY which should be reversed. Specify WHERE conditions before GROUP BY.
One of your WHERE conditions makes no sense. TO_CHAR > (SUM(paideach)): TO_CHAR() is a function which as far as I know is an Oracle function that converts numeric values to strings according to a specified format. The equivalent in SQL Server is CAST or CONVERT.
I'm guessing that you are trying to write a query that finds orders with amounts exceeding a particular value, but it's not very clear because one of your WHERE conditions specifies that the order number should be 1008, which would presumably only return one record.
The query should probably look more like this:
SELECT ORDER#,
SUM(paideach) AS amount
FROM orderitems
GROUP BY ORDER#
HAVING SUM(paideach) > 999.99;
This would select records from the orderitems table where the sum of paideach exceeds 999.99.
I'm not sure how order 1008 fits into things, so you will have to elaborate on that.
Others have commented on some of the things wrong with your query. I'll try to give more explicit hints about what I think you need to do to get the result I think you're looking for.
The problem seems to break into distinct sections, first finding the total for each order which you're close to and I think probably started from:
SELECT ORDER#, SUM(PAIDEACH) AS AMOUNT
FROM ORDERITEMS
GROUP BY ORDER#;
... finding the total for a specific order:
SELECT SUM(PAIDEACH)
FROM ORDERITEMS
WHERE ORDER# = 1008;
... and combining them, which is where you're stuck. The simplest way, and hopefully something you've recently been taught, is to use the HAVING clause, which comes after the GROUP BY and acts as a kind of filter that can be applied to the aggregated columns (which you can't do in the WHERE clause). If you had a fixed amount you could do this:
SELECT ORDER#, SUM(PAIDEACH) AS AMOUNT
FROM ORDERITEMS
GROUP BY ORDER#
HAVING SUM(PAIDEACH) > 5;
(Note that, as @Bridge indicated, you can't use the column alias AMOUNT in the HAVING clause; you have to repeat the aggregate function SUM.) But you don't have a fixed value; you want to use the actual total for order 1008, so you need to replace that fixed value with another query. I'll let you take that last step...
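For readers who want to see the general shape of a HAVING clause compared against a subquery, here is one possible sketch, run on SQLite through Python's sqlite3 module rather than Oracle. The column is renamed to a hypothetical `order_no` because ORDER# would need quoting in SQLite, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical data: order 1008 totals 25.0.
    CREATE TABLE orderitems (order_no INTEGER, paideach REAL);
    INSERT INTO orderitems VALUES
        (1007, 20.0), (1007, 15.0),
        (1008, 25.0),
        (1009, 10.0),
        (1010, 30.0);
""")

rows = conn.execute("""
    SELECT order_no, SUM(paideach) AS amount
    FROM orderitems
    GROUP BY order_no
    -- The scalar subquery replaces the fixed value in the HAVING clause.
    HAVING SUM(paideach) > (SELECT SUM(paideach)
                            FROM orderitems
                            WHERE order_no = 1008)
    ORDER BY order_no
""").fetchall()

print(rows)
```

The subquery returns a single value, so it can sit wherever the constant would have gone.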
I'm not familiar with Oracle, and since it's homework I won't give you the answers, just a few ideas of what I think is wrong.
A SELECT statement should have only one WHERE clause. It can contain more than one condition, of course, joined by logical operators (any row for which the whole condition evaluates to true is included). E.g.: WHERE (column1 > column2) AND (column3 = 100)
GROUP BY clauses should come after WHERE clauses.
You can't refer to columns you've aliased in the select in the where clause of the same statement by their aliased name. For example this won't work:
SELECT column1 as hello
FROM table1
WHERE hello = 1
If there's a GROUP BY, the columns you're selecting should be the same as in that clause (or aggregates of other columns). This page explains this better than I do.