What's the difference between these two SQL queries? - sql

Question: Select the item and per unit price for each item in the items_ordered table. Hint: Divide the price by the quantity.
1.
select item, sum(price)/sum(quantity)
from items_ordered
group by item;
2.
select item, price/quantity
from items_ordered
group by item;
Have a look at the results for flashlights. The first query shows the average price correctly, but the second one only takes 28/4 and shows 7, ignoring the 4.5 a few rows down. Could someone please explain why this is the case?
The table data used is from an external website.

SUM() is an aggregate (group) function: it collects all the price and quantity values for each item and adds them up, returning one row per item.
MySQL is quite forgiving when grouping things and will try to retrieve a rowset (which is why your second example returns something, albeit wrong).
Generally, if you GROUP BY a column (item in your example), the query returns one row per distinct value of that column, so every other selected column needs to be wrapped in an aggregate function.
Try running the SQL below to see what that looks like.
SELECT item
, SUM(price) AS sum_price
, SUM(quantity) AS sum_quantity
, COUNT(*) AS item_count
, SUM(price) / SUM(quantity) AS avg_price_per_quant
FROM items_ordered
GROUP BY item
ORDER BY item ASC

The first query returns the average price per unit for each item. The second query returns price/quantity from a single row of each group (typically the first one it encounters, but that is not guaranteed). This only works in MySQL; the second query would raise an error in SQL Server because price and quantity are neither aggregated nor listed in the GROUP BY clause. For more details, see the post "Why does MySQL allow "group by" queries WITHOUT aggregate functions?".
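To see the difference on a small scale, here is a minimal sketch in Python using the built-in sqlite3 module. The two flashlight rows follow the numbers quoted in the question (28 for 4 units, and a 4.5 row); the quantity of 1 for the second row and the column types are assumptions. SQLite, like permissive MySQL, accepts the second query and takes the non-aggregated values from some row of each group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items_ordered (item TEXT, quantity INTEGER, price REAL)")
conn.executemany(
    "INSERT INTO items_ordered VALUES (?, ?, ?)",
    [("Flashlight", 4, 28.0), ("Flashlight", 1, 4.5)],  # second row is assumed data
)

# Query 1: aggregate first, then divide -> one well-defined value per item.
aggregated = conn.execute(
    "SELECT item, SUM(price) / SUM(quantity) FROM items_ordered GROUP BY item"
).fetchall()

# Per-row unit prices, with no grouping at all, for comparison.
per_row = conn.execute(
    "SELECT item, price / quantity FROM items_ordered ORDER BY price DESC"
).fetchall()

# Query 2: non-aggregated columns with GROUP BY -> values come from one row of the group.
arbitrary = conn.execute(
    "SELECT item, price / quantity FROM items_ordered GROUP BY item"
).fetchall()

print(aggregated)  # [('Flashlight', 6.5)]
print(per_row)     # [('Flashlight', 7.0), ('Flashlight', 4.5)]
print(arbitrary)   # a single row; which of the two unit prices it shows is not guaranteed
```

The second query silently drops one of the two flashlight rows, which matches the "7 instead of the average" result described in the question.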

Related

Aggregate my quantity sum in a way that doesn't lead to the storeID repeating?

I am writing a SQL query that needs to show the total number of orders from each store. The issue I am running into is that, while I can figure out how to sum the orders by product (each product is only sold by one store), I can't figure out how to total the orders by store alone.
This is the code I currently have:
SELECT storeID AS [STORE], Product_ID
, SUM(quantity) AS [ORDERS BY STORE]
FROM Fulfillment, Store
GROUP BY storeID, Product_ID;
This query repeats storeID in the results, whereas ideally I would want each storeID to appear only once, with the total quantity across all of its Product_ID values. I tried to remove Product_ID from the GROUP BY clause, but this resulted in the following error:
Column 'Fulfillment.Product_ID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
I'm new to SQL and am trying to learn, so any help and advice is greatly appreciated.
@ZLK is correct that if your goal is a total number of units ordered ("quantity") of any product, simply remove the [product_id] column from both the SELECT and the GROUP BY.
However, it appears that you're referencing two tables ("FROM Fulfillment, Store") without specifying how those tables are joined, which creates a Cartesian join: all rows in one table will be joined to all rows in the other table. If the [storeID] and [quantity] fields are available in the Fulfillment table, I recommend removing the Store table reference from the FROM clause (so "FROM Fulfillment" alone).
One last note: You mention that you want to count "orders". In some circumstances, an order may have multiple products and a quantity > 1. If your goal is the total number of "orders" regardless of the number of products or quantity of products on an order, you'll want to use "COUNT(DISTINCT orderID) as [Orders]" (where "orderID" is the reference to the unique order number).
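The points above can be sketched with Python's built-in sqlite3 module. The table layout and the sample rows are hypothetical (the question does not show its schema); in particular, orderID and the Store table's storeName column are assumed names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Fulfillment (orderID INTEGER, storeID TEXT, Product_ID TEXT, quantity INTEGER);
CREATE TABLE Store (storeName TEXT);
INSERT INTO Fulfillment VALUES
    (1, 'A', 'p1', 2),
    (1, 'A', 'p2', 3),
    (2, 'A', 'p1', 1),
    (3, 'B', 'p3', 4);
INSERT INTO Store VALUES ('North'), ('South');
""")

# One row per store: total units ordered and number of distinct orders.
per_store = conn.execute("""
    SELECT storeID,
           SUM(quantity)           AS units,
           COUNT(DISTINCT orderID) AS orders
    FROM Fulfillment
    GROUP BY storeID
    ORDER BY storeID
""").fetchall()

# Listing two tables without a join condition pairs every row with every row.
cross_rows = conn.execute("SELECT COUNT(*) FROM Fulfillment, Store").fetchone()[0]

print(per_store)   # [('A', 6, 2), ('B', 4, 1)]
print(cross_rows)  # 8 (4 fulfillment rows x 2 store rows)
```

Store A has six units spread over two orders, so SUM(quantity) and COUNT(DISTINCT orderID) answer different questions. Any SUM computed over the unjoined pair of tables would be inflated by the number of rows in Store.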

How to work past "At most one record can be returned by this subquery"

I'm having trouble understanding this error through all the researching I have done. I have the following query
SELECT M.[PO Concatenate], Sum(M.SumofAward) AS TotalAward, (SELECT TOP 1 M1.[Material Group] FROM
[MGETCpreMG] AS M1 WHERE M1.[PO Concatenate]=M.[PO Concatenate] ORDER BY M1.SumofAward DESC) AS TopGroup
FROM MGETCpreMG AS M
GROUP BY M.[PO Concatenate];
For a brief instant it shows the results I want, but then the "At most one record can be returned by this subquery" error appears and replaces all the data with #Name?
For context, [MGETCpreMG] is a query off a main table [MG ETC] that was used to consolidate Award for differing Material Groups on a PO transaction ([PO Concatenate]):
SELECT [MG ETC].[PO Concatenate], Sum([MG ETC].Award) AS SumOfAward, [MG ETC].[Material Group]
FROM [MG ETC]
GROUP BY [MG ETC].[PO Concatenate], [MG ETC].[Material Group]
ORDER BY [MG ETC].[PO Concatenate];
I'm thinking it lies in my inability to understand how to utilize a subquery.
What about the case in which the subquery can return more than one value? Simply add an additional sort column.
A common subquery is one that gets the last invoice. So you might have:
select ID, CompanyName,
(SELECT TOP 1 InvoiceDate from tblInvoice
where tblInvoice.CustomerID = tblCustomers.ID
Order by InvoiceDate DESC)
As LastInvoiceDate
From tblCustomers
The above might work for some time, but it will eventually blow up, because you might have two invoices on the same day: TOP 1 then returns both tied rows.
So all you have to do is add an extra ORDER BY column, say the PK of the child table, like this:
Order by InvoiceDate DESC,ID DESC)
TOP 1 respects the additional ORDER BY columns you add, and thus only ever returns one row, even if multiple rows tie on the first ORDER BY column.
In the example above we could perhaps forget InvoiceDate and always take the highest autonumber ID, but for a lot of queries you can't be sure that works; for instance, we might want the most expensive invoice amount instead. Again, if two invoices share the same maximum amount, two rows could be returned. So simply add a second ORDER BY column that further orders the data, and TOP 1 will only pull the first row. Your top group query is such a case: just tack on an extra ORDER BY on "ID", or whatever the autonumber ID column is.
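The tie-breaker idea can be sketched with Python's built-in sqlite3 module, with one loud caveat: SQLite has no TOP and its LIMIT 1 never returns more than one row, so this does not reproduce the Access error. It only illustrates how a second ORDER BY column makes the chosen row deterministic when two rows tie. The table name pre_mg, its columns, and the data are hypothetical stand-ins for [MGETCpreMG].

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pre_mg (id INTEGER PRIMARY KEY, po TEXT, material_group TEXT, sum_award REAL);
INSERT INTO pre_mg VALUES
    (1, 'PO1', 'Steel',  100),
    (2, 'PO1', 'Copper', 100),  -- ties with Steel on sum_award
    (3, 'PO1', 'Wood',    40),
    (4, 'PO2', 'Steel',   10);
""")

rows = conn.execute("""
    SELECT m.po,
           SUM(m.sum_award) AS total_award,
           (SELECT m1.material_group
              FROM pre_mg AS m1
             WHERE m1.po = m.po
             ORDER BY m1.sum_award DESC, m1.id DESC  -- id breaks the tie
             LIMIT 1) AS top_group
    FROM pre_mg AS m
    GROUP BY m.po
    ORDER BY m.po
""").fetchall()

print(rows)  # [('PO1', 240.0, 'Copper'), ('PO2', 10.0, 'Steel')]
```

For PO1 the two 100-award groups tie, and the id column decides between them, so the subquery always yields exactly one material group per PO.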

Averaging Grouped Data in Single SQL Statement Using Multiple Group Bys

I want to see the average cost of an item. First I am using a SUM with GROUP BY on the manufacturing order and item to see how much each item costs per manufacturing order (using WHERE conditions to take out specific steps in the process). Then I want to average those sums to see how much the item costs on average based on that set. Can I do this easily in one statement instead of creating a temp table?
You need an intermediate result if you first want to sum the cost of an item per manufacturing order and then average those totals per item. A derived table (a subquery in the FROM clause) can play the role of the temp table. I hope I understood your problem statement correctly.
SELECT item, AVG(cost) FROM
(SELECT item, manufacture_order, SUM(COST) cost
FROM manufacture_order_tab
GROUP BY item, manufacture_order) tab1
GROUP BY item;
Try this:
SELECT AVG(Cost), SUM(COST)
FROM your_table
GROUP BY your_column
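A minimal sketch of why the two-level aggregation matters, using Python's built-in sqlite3 module. The sample rows are made up for illustration; the table and column names follow the first answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE manufacture_order_tab (item TEXT, manufacture_order TEXT, cost REAL);
INSERT INTO manufacture_order_tab VALUES
    ('widget', 'MO1', 10),
    ('widget', 'MO1', 20),
    ('widget', 'MO2', 50);
""")

# Sum per (item, manufacturing order) first, then average those sums per item.
nested = conn.execute("""
    SELECT item, AVG(cost)
    FROM (SELECT item, manufacture_order, SUM(cost) AS cost
          FROM manufacture_order_tab
          GROUP BY item, manufacture_order) AS tab1
    GROUP BY item
""").fetchall()

# A single-level AVG averages the individual step rows instead.
flat = conn.execute(
    "SELECT item, AVG(cost) FROM manufacture_order_tab GROUP BY item"
).fetchall()

print(nested)  # [('widget', 40.0)]  -> (30 + 50) / 2
print(flat)    # roughly 26.67       -> (10 + 20 + 50) / 3
```

The derived-table version averages order totals, which is what the question asks for; the single-level version averages individual cost rows, so the two only agree when every order has the same number of rows.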

SQL SUM function with added

Total novice here with a SQL SUM function question. The SUM function itself works as I expected it to:
select ID, sum(amount)
from table1
group by ID
There are several records for each ID and my goal is to summarize each ID on one row where the next column would give me the summarized amount of column AMOUNT.
This works fine; however, I also need to filter on the summarized amount, i.e. only return results where the summarized amount is greater than, less than, or between certain numbers.
This is the part I'm struggling with, as I can't seem to use column AMOUNT, as this messes up summarizing results.
Column name for summarized results is shown as "00002", however using this in the between or > / < clause does not work either. Tried this:
select ID, sum(amount)
from table1
where 00002 > 1000
group by ID
There is no error message, just a blank result, even though plenty of summarized results have values over 1000.
Unfortunately I'm not sure which engine the database runs on; it should be some IBM-based product.
The WHERE clause filters out individual rows that don't match the condition before they are aggregated.
If you want to filter after aggregation, you need to use the HAVING clause.
HAVING applies the filter to the results after they have been grouped.
select ID, sum(amount)
from table1
group by ID
having sum(amount) > 1000
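Here is a small sketch of the WHERE versus HAVING behaviour, using Python's built-in sqlite3 module and made-up rows. It also runs the asker's `where 00002 > 1000` condition: in SQLite that token is just the number 2, so the condition is false for every row. Whether the unnamed IBM engine parses it the same way is an assumption, but it would explain the blank result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (ID INTEGER, amount INTEGER);
INSERT INTO table1 VALUES (1, 600), (1, 500), (2, 300), (2, 200);
""")

def run(sql):
    return conn.execute(sql).fetchall()

# HAVING filters the grouped totals.
over_1000 = run("SELECT ID, SUM(amount) FROM table1 GROUP BY ID HAVING SUM(amount) > 1000")
in_range = run("SELECT ID, SUM(amount) FROM table1 GROUP BY ID HAVING SUM(amount) BETWEEN 400 AND 600")

# WHERE filters single rows before grouping; no single row exceeds 1000.
where_amount = run("SELECT ID, SUM(amount) FROM table1 WHERE amount > 1000 GROUP BY ID")

# 00002 is read as the literal 2, and 2 > 1000 is false for every row.
where_literal = run("SELECT ID, SUM(amount) FROM table1 WHERE 00002 > 1000 GROUP BY ID")

print(over_1000)      # [(1, 1100)]
print(in_range)       # [(2, 500)]
print(where_amount)   # []
print(where_literal)  # []
```

The same HAVING pattern covers the "bigger, smaller or between" cases from the question by swapping the comparison operator.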

MySQL Single Row Returned From Temporary Table

I am running the following queries against a database:
CREATE TEMPORARY TABLE med_error_third_party_tmp
SELECT `med_error_category`.description AS category, `med_error_third_party_category`.error_count AS error_count
FROM
`med_error_category` INNER JOIN `med_error_third_party_category` ON med_error_category.`id` = `med_error_third_party_category`.`category`
WHERE
year = 2003
GROUP BY `med_error_category`.id;
The only problem is that when I create the temporary table and do a select * on it, it returns multiple rows, but the query below returns only one row. It seems to always return a single row unless I specify a GROUP BY, but then it returns a percentage of 1.0, as it should with a GROUP BY.
SELECT category,
error_count/SUM(error_count) AS percentage
FROM med_error_third_party_tmp;
Here are the server specs:
Server version: 5.0.77
Protocol version: 10
Server: Localhost via UNIX socket
Does anybody see what is causing this?
Standard SQL requires every selected column that is not wrapped in an aggregate function (e.g. MIN, MAX, COUNT, SUM, AVG) to appear in a GROUP BY clause, but MySQL supports "hidden columns in the GROUP BY" -- which is why:
SELECT category,
error_count/SUM(error_count) AS percentage
FROM med_error_third_party_tmp;
...runs without error. The problem with this functionality is that, because there's no GROUP BY, the whole table is treated as a single group: the query returns one row, and the SUM is the SUM of the error_count column for the entire table. The other column values are completely arbitrary - they can't be relied upon.
This:
SELECT category,
error_count/(SELECT SUM(error_count)
FROM med_error_third_party_tmp) AS percentage
FROM med_error_third_party_tmp;
...will give you a percentage on a per row basis -- category values will be duplicated because there's no grouping.
This:
SELECT category,
SUM(error_count)/x.total AS percentage
FROM med_error_third_party_tmp
JOIN (SELECT SUM(error_count) AS total
FROM med_error_third_party_tmp) x
GROUP BY category
...will give you a percentage per category: the sum of each category's error_count values divided by the sum of the error_count values for the entire table.
Another way to do it, without the temp table as a separate item:
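The per-category version can be checked with a short sketch in Python's built-in sqlite3 module. The sample rows are invented. Two details differ from the MySQL query and are adaptations for SQLite, not part of the answer: the join is written as CROSS JOIN, and the sum is multiplied by 1.0 because SQLite divides integers as integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE med_error_third_party_tmp (category TEXT, error_count INTEGER);
INSERT INTO med_error_third_party_tmp VALUES
    ('Dosage', 3), ('Dosage', 3), ('Labeling', 2);
""")

rows = conn.execute("""
    SELECT category,
           SUM(error_count) * 1.0 / x.total AS percentage
    FROM med_error_third_party_tmp
    CROSS JOIN (SELECT SUM(error_count) AS total
                FROM med_error_third_party_tmp) AS x
    GROUP BY category
    ORDER BY category
""").fetchall()

print(rows)  # [('Dosage', 0.75), ('Labeling', 0.25)]
```

Each category appears once and the fractions add up to 1, which is the behaviour the question was after.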
select category, error_count/sum(error_count) "Percentage"
from (SELECT mec.description category
, metpc.error_count
FROM med_error_category mec
, med_error_third_party_category metpc
WHERE mec.id = metpc.category
AND year = 2003
GROUP BY mec.id
) t;
I think you will notice that the percentage is unchanging over the categories. This is probably not what you want; you probably want to group the errors by category as well.