How to select MAX from AVG? - sql

I'm practicing for my SQL exam and I can't figure out the following question:
"Of the average amount paid per customer, show the highest amount."
So to retrieve the average amount paid, I would do the following:
SELECT AVG(Amount) AS 'Average amount paid'
FROM Payment;
Then I would like to retrieve the highest average amount out of this list of averages. I thought the following would do the trick:
SELECT MAX(AVG(Amount)) AS 'Highest average amount paid'
FROM Payment;
This doesn't seem to work. I get the following error:
Cannot perform an aggregate function on an expression containing an aggregate or a subquery.
I would like some help with this. What is the correct way to approach this? Thank you in advance.

In SQL Server, you can order the records and use TOP 1 to keep only the record that has the highest amount:
SELECT TOP 1 Customer_id, AVG(Amount) AS [Average amount paid]
FROM Payment
GROUP BY customer_id
ORDER BY [Average amount paid] DESC;
Note: for this query to make sense, you need a GROUP BY clause. Without it, it would just return one record, with the average of payments within the whole table.

Try using a sub-query:
SELECT MAX(src.cust_avg) AS "Highest average amount paid"
FROM (
SELECT cust_id, AVG(Amount) AS cust_avg
FROM Payment
GROUP BY cust_id -- Get averages per customer
) src
;
To get the "per customer" averages first, you need to include something like GROUP BY cust_id.
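If you want to try the sub-query approach without a database server, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are made up for illustration and are not from the question; the table and column names follow the query above.

```python
import sqlite3

# Hypothetical sample data (cust_id, Amount); not taken from the question.
rows = [(1, 10.0), (1, 20.0), (2, 40.0), (2, 60.0), (3, 5.0)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Payment (cust_id INTEGER, Amount REAL)")
conn.executemany("INSERT INTO Payment VALUES (?, ?)", rows)

# Inner query: one average per customer. Outer query: the largest of those.
highest_avg = conn.execute(
    """
    SELECT MAX(src.cust_avg)
    FROM (
        SELECT cust_id, AVG(Amount) AS cust_avg
        FROM Payment
        GROUP BY cust_id
    ) src
    """
).fetchone()[0]

print(highest_avg)  # per-customer averages are 15, 50 and 5, so this prints 50.0
conn.close()
```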

Use order by:
select customer, avg(amount)
from payment
group by customer
order by avg(amount) desc
fetch first 1 row only;
The fetch first (although standard) is not supported by all databases, so you should use the version appropriate for your database.
In SQL Server, you would use either select top (1) or offset 0 rows fetch first 1 row only (the offset is not optional, alas).
There are also databases where avg() on an integer returns an integer. If amount is an integer and your database does this, then use avg(amount * 1.0).
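As an example of "the version appropriate for your database": SQLite uses LIMIT 1 for the same idea. The sketch below runs it through Python's sqlite3 module with made-up rows. The amount * 1.0 is kept only to show where it goes; SQLite's avg() already returns a floating point value, so it makes no difference there.

```python
import sqlite3

# Hypothetical sample data (customer, amount) with integer amounts.
rows = [("alice", 10), ("alice", 20), ("bob", 40), ("bob", 45)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payment (customer TEXT, amount INTEGER)")
conn.executemany("INSERT INTO payment VALUES (?, ?)", rows)

# Sort the per-customer averages in descending order and keep the first row.
top = conn.execute(
    """
    SELECT customer, avg(amount * 1.0) AS avg_amount
    FROM payment
    GROUP BY customer
    ORDER BY avg_amount DESC
    LIMIT 1
    """
).fetchone()

print(top)  # ('bob', 42.5)
conn.close()
```

Unlike the MAX() sub-query, this also tells you which customer has the highest average.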

Related

Delete duplicates using dense rank

I have a sales data table with cust_ids and their transaction dates.
I want to create a table that stores, for every customer, their cust_id, their last purchased date (on the basis of transaction dates) and the count of times they have purchased.
I wrote this code:
SELECT
cust_xref_id, txn_ts,
DENSE_RANK() OVER (PARTITION BY cust_xref_id ORDER BY CAST(txn_ts as timestamp) DESC) AS rank,
COUNT(txn_ts)
FROM
sales_data_table
But I understand that the above code would give an output like this (attached example picture)
How do I modify the code to get an output like :
I am a beginner in SQL queries and would really appreciate any help! :)
This would be an aggregation query which changes the table key from (customer_id, date) to (customer_id)
SELECT
cust_xref_id,
MAX(txn_ts) as last_purchase_date,
COUNT(txn_ts) as count_purchase_dates
FROM
sales_data_table
GROUP BY
cust_xref_id
You are looking for the last purchase date and the count of distinct transaction dates (that is, if a person buys twice on the same date, it should count as a single purchase).
Although you asked for a count of dates, your sample data suggests you want a count of distinct dates: customer 284214 transacted 9 times, but counting distinct dates gives 7.
So, here is the SQL you can use to get your result.
SELECT
cust_xref_id,
MAX(txn_ts) as last_purchase_date,
COUNT(DISTINCT txn_ts) AS count_purchase_dates -- DISTINCT counts each transaction date only once
FROM sales_data_table
GROUP BY 1
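To see how COUNT and COUNT(DISTINCT ...) differ, here is a small sketch using Python's sqlite3 module. The rows are invented for illustration (the original sample picture is not available), and dates are stored as ISO-8601 text so that MAX() picks the latest one.

```python
import sqlite3

# Hypothetical data: customer 1 buys twice on the same day, then once more.
rows = [
    (1, "2021-01-01"),
    (1, "2021-01-01"),
    (1, "2021-02-01"),
    (2, "2021-03-05"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_data_table (cust_xref_id INTEGER, txn_ts TEXT)")
conn.executemany("INSERT INTO sales_data_table VALUES (?, ?)", rows)

# One output row per customer: latest date, all purchases, distinct dates.
summary = conn.execute(
    """
    SELECT cust_xref_id,
           MAX(txn_ts)            AS last_purchase_date,
           COUNT(txn_ts)          AS count_purchases,
           COUNT(DISTINCT txn_ts) AS count_purchase_dates
    FROM sales_data_table
    GROUP BY cust_xref_id
    ORDER BY cust_xref_id
    """
).fetchall()

for row in summary:
    print(row)
conn.close()
```

For customer 1 the plain count is 3 while the distinct count is 2, which is the same kind of gap as the 9 versus 7 mentioned above.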

Efficiently find last date in a table - Teradata SQL

Say I have a rather large table in a Teradata database, "Sales" that has a daily record for every sale and I want to write a SQL statement that limits this to the latest date only. This will not always be the previous day, for example, if it was a Monday the latest date would be the previous Friday.
I know I can get the results by the following:
SELECT s.*
FROM Sales s
JOIN (
SELECT MAX(SalesDate) as SalesDate
FROM Sales
) sd
ON s.SalesDate=sd.SalesDate
I am not knowledgeable about how the subquery would be processed. Since Sales is a large table, is there a more efficient way to do this, given that there is no other table I could use?
Another (more flexible) way to get the top n utilizes OLAP-functions:
SELECT *
FROM Sales s
QUALIFY
RANK() OVER (ORDER BY SalesDate DESC) = 1
This will return all rows with the max date. If you want only one of them switch to ROW_NUMBER.
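The RANK versus ROW_NUMBER difference can be tried outside Teradata too. The sketch below uses Python's sqlite3 module and assumes the bundled SQLite is version 3.25 or newer (needed for window functions). SQLite has no QUALIFY, so the filter sits in an outer query instead; the SaleId column and the rows are hypothetical.

```python
import sqlite3

# Hypothetical data: two sales share the latest date.
rows = [(1, "2024-01-04"), (2, "2024-01-05"), (3, "2024-01-05")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (SaleId INTEGER, SalesDate TEXT)")
conn.executemany("INSERT INTO Sales VALUES (?, ?)", rows)

# RANK gives every row on the latest date the value 1, so ties are all kept.
with_ties = conn.execute(
    """
    SELECT SaleId FROM (
        SELECT SaleId, RANK() OVER (ORDER BY SalesDate DESC) AS rnk
        FROM Sales
    ) t
    WHERE rnk = 1
    ORDER BY SaleId
    """
).fetchall()

# ROW_NUMBER numbers rows uniquely, so exactly one row survives.
# The extra SaleId DESC sort key is only there to make the pick deterministic.
single = conn.execute(
    """
    SELECT SaleId FROM (
        SELECT SaleId,
               ROW_NUMBER() OVER (ORDER BY SalesDate DESC, SaleId DESC) AS rn
        FROM Sales
    ) t
    WHERE rn = 1
    """
).fetchall()

print(with_ties)  # [(2,), (3,)]
print(single)     # [(3,)]
conn.close()
```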
That is probably fine, if you have an index on salesdate.
If there is only one row, then I would recommend:
select top 1 s.*
from sales s
order by salesdate desc;
In particular, this should make use of an index on salesdate.
If there is more than one row, use top 1 with ties.

Aggregate function or the GROUP BY clause in SQL Server

I want to get the sum per day for a specified date, and show that sum together with the tenant name. The result should look like this. Is there a way to construct the query correctly?
tenant_id tenant_name Total Amount
-----------------------------------
123 SAMPLE 37100
Use both columns in the GROUP BY, like this:
group by tenant_id ,tenant_name
So your query becomes:
select s.tenant_id ,i.tenant_name,
sum(s.amount) as total
from sales_data s left join
Tenant_info i
on s.tenant_id=i.tenant_id
group by s.tenant_id ,i.tenant_name
Note: most databases throw an error if a selected column is neither inside an aggregate function nor listed in the GROUP BY clause.
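Here is a minimal sketch of that query run through Python's sqlite3 module. Tenant 123 / SAMPLE and the total of 37100 follow the expected output in the question; the individual amounts and the second tenant are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tenant_info (tenant_id INTEGER, tenant_name TEXT)")
conn.execute("CREATE TABLE sales_data (tenant_id INTEGER, amount INTEGER)")

# Hypothetical rows: the split into 30000 + 7100 is an assumption.
conn.executemany(
    "INSERT INTO Tenant_info VALUES (?, ?)", [(123, "SAMPLE"), (456, "OTHER")]
)
conn.executemany(
    "INSERT INTO sales_data VALUES (?, ?)", [(123, 30000), (123, 7100), (456, 500)]
)

# Every selected column that is not aggregated also appears in GROUP BY.
totals = conn.execute(
    """
    SELECT s.tenant_id, i.tenant_name, SUM(s.amount) AS total
    FROM sales_data s
    LEFT JOIN Tenant_info i ON s.tenant_id = i.tenant_id
    GROUP BY s.tenant_id, i.tenant_name
    ORDER BY s.tenant_id
    """
).fetchall()

print(totals)  # [(123, 'SAMPLE', 37100), (456, 'OTHER', 500)]
conn.close()
```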

Averaging Grouped Data in Single SQL Statement Using Multiple Group Bys

I want to see the average cost of an item. First I use SUM with a GROUP BY on the manufacturing order and item to see how much each item costs per manufacturing order (using WHERE clauses to take out specific steps in the process). Then I want to average those sums to see how much the item costs on average based on that set. Can I do this easily in one statement instead of creating a temp table?
You need an intermediate result if you first want to sum the cost of an item per manufacturing order and then average those totals per item; a derived table in the FROM clause can serve in place of a temp table. I hope I understood your problem statement correctly.
SELECT item, AVG(cost) FROM
(SELECT item, manufacture_order, SUM(COST) cost
FROM manufacture_order_tab
GROUP BY item, manufacture_order) tab1
GROUP BY item;
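To check that the nested query averages the per-order totals rather than the raw rows, here is a small sketch with Python's sqlite3 module. The table layout follows the query above; the rows are made up for illustration.

```python
import sqlite3

# Hypothetical rows (item, manufacture_order, cost): order 1 has two cost steps.
rows = [
    ("widget", 1, 10.0),
    ("widget", 1, 20.0),
    ("widget", 2, 50.0),
    ("gadget", 3, 7.0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE manufacture_order_tab (item TEXT, manufacture_order INTEGER, cost REAL)"
)
conn.executemany("INSERT INTO manufacture_order_tab VALUES (?, ?, ?)", rows)

# Inner query: total cost per (item, order). Outer query: average of those totals.
avg_per_item = conn.execute(
    """
    SELECT item, AVG(cost)
    FROM (
        SELECT item, manufacture_order, SUM(cost) AS cost
        FROM manufacture_order_tab
        GROUP BY item, manufacture_order
    ) tab1
    GROUP BY item
    ORDER BY item
    """
).fetchall()

print(avg_per_item)  # [('gadget', 7.0), ('widget', 40.0)]
conn.close()
```

The widget orders total 30 and 50, so the average per order is 40; a plain AVG(cost) over the raw rows would instead average 10, 20 and 50.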
try this
SELECT AVG(Cost), SUM(COST)
FROM your_table
GROUP BY your_column

What's the difference between these two SQL queries?

Question: Select the item and per unit price for each item in the items_ordered table. Hint: Divide the price by the quantity.
1.
select item, sum(price)/sum(quantity)
from items_ordered
group by item;
2.
select item, price/quantity
from items_ordered
group by item;
Have a look at the results for flashlights. The first query shows the average price correctly, but the second one only takes 28/4 and shows 7, ignoring the 4.5 a few rows down. Could someone please explain why this is the case?
The table data comes from an external website.
SUM() is a group function - so that essentially says go get me all the price and quantities by item, and add them all up to return them in one row.
MySQL is quite forgiving when grouping things and will try to retrieve a rowset (which is why your second example returns something - albeit wrong).
Generally, if you are grouping by a column (item in your example), you need to return one row per distinct value of that column (one row per item).
Try running the SQL below to see what that looks like.
SELECT item
, SUM(price) AS sum_price
, SUM(quantity) AS sum_quantity
, COUNT(*) AS item_count
, SUM(price) / SUM(quantity) AS avg_price_per_quant
FROM items_ordered
GROUP BY item
ORDER BY item ASC
The first query returns the average price for that item; the second query returns the price for the first row it encounters for that item. This works in MySQL, but the second query would raise an error in SQL Server because no aggregate function is used. See this post for more details: Why does MySQL allow "group by" queries WITHOUT aggregate functions?
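A small sketch of the flashlight case, using Python's sqlite3 module. Only the 28/4 and 4.5 figures come from the question; the quantity of 1 for the second row is an assumption, since the original table is not shown here.

```python
import sqlite3

# Hypothetical rows (item, price, quantity) for the flashlight example.
rows = [("Flashlight", 28.0, 4), ("Flashlight", 4.5, 1)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items_ordered (item TEXT, price REAL, quantity INTEGER)")
conn.executemany("INSERT INTO items_ordered VALUES (?, ?, ?)", rows)

# One row per item: total price divided by total quantity.
grouped = conn.execute(
    "SELECT item, SUM(price) / SUM(quantity) FROM items_ordered GROUP BY item"
).fetchall()

# No GROUP BY: one ratio per order row, so both flashlight rows stay visible.
per_row = conn.execute(
    "SELECT item, price / quantity FROM items_ordered ORDER BY price DESC"
).fetchall()

print(grouped)  # [('Flashlight', 6.5)]
print(per_row)  # [('Flashlight', 7.0), ('Flashlight', 4.5)]
conn.close()
```

With these rows the grouped query gives 32.5 / 5 = 6.5, while the ungrouped ratios are 7 and 4.5. The second query in the question collapses those two rows into one and reports only one of the ratios.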