SQL find the maximum value - sql

I have 2 tables:

Customer table

CustomerId | FirstName | LastName  | Country
-----------|-----------|-----------|---------------
1          | Luís      | Gonçalves | Brazil
2          | Leonie    | Köhler    | Germany
3          | François  | Tremblay  | Canada
4          | Bjørn     | Hansen    | Norway
52         | Emma      | Jones     | United Kingdom
53         | Phil      | Hughes    | United Kingdom

Invoice table

InvoiceId | CustomerId | Total
----------|------------|------
1         | 2          | 1.98
2         | 4          | 3.96
3         | 8          | 5.94
140       | 52         | 23.76
369       | 52         | 13.86
283       | 53         | 28.71
109       | 53         | 8.91

I have to write a query that returns the country along with the top customer and how much they spent. For countries where the top amount spent is shared, provide all customers who spent this amount.
I wrote a query like:
SELECT c.CustomerId, c.FirstName, c.LastName, c.Country, SUM(i.Total) AS TotalSpent
FROM Customer c
JOIN Invoice i
ON c.CustomerId = i.CustomerId
GROUP BY c.FirstName, c.LastName
HAVING i.Total >= MAX(i.Total)
ORDER BY c.Country;
The query does not find the maximum values; it returns every available row.
I am not sure which DBMS is used: these are my first steps in SQL, and the example above is from a Udacity learning platform lab (it may be SQLite).
Any help is appreciated. Thank you!

You did not share your database with us.
You should also add to your question the expected results for the data you provided.
But let's say you use SQLite; then I think this would work:
select CustomerId
, FirstName
, LastName
, Country
, max(tot)
from ( select sum(i1.Total) as tot
, i1.CustomerId
, c1.Country
, c1.FirstName
, c1.LastName
FROM Customer c1
JOIN Invoice i1
ON c1.CustomerId = i1.CustomerId
group by i1.CustomerId) TAB
group by Country
DEMO
After the comment from the OP I have edited the code:
select c.CustomerId, c.FirstName, c.LastName, c.Country, sum(i.Total) as TotalSpent
from Customer c
JOIN Invoice i ON c.CustomerId = i.CustomerId
group by c.Country, c.CustomerId
-- compare each customer only with the maximum of their own country
having sum(i.Total) = (select max(tot)
from (select sum(i1.Total) as tot
, c1.Country
FROM Customer c1
JOIN Invoice i1
ON c1.CustomerId = i1.CustomerId
group by i1.CustomerId, c1.Country) TAB
where TAB.Country = c.Country)
DEMO2
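One thing to watch with this pattern: the per-country maximum has to be matched within the same country. The sketch below uses Python's built-in sqlite3 module and made-up data (the names Ann, Bob, Cid and the countries A and B are purely illustrative) to compare an uncorrelated IN filter against a correlated one. Bob's total equals country B's maximum, so the IN version lets him through even though he is not the top customer in country A.

```python
import sqlite3

# Hypothetical data: Bob's total (10) equals country B's maximum,
# but he is not the top spender in country A (Ann is, with 20).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (CustomerId INTEGER PRIMARY KEY, FirstName TEXT, Country TEXT);
    CREATE TABLE Invoice  (InvoiceId INTEGER PRIMARY KEY, CustomerId INTEGER, Total INTEGER);
    INSERT INTO Customer VALUES (1, 'Ann', 'A'), (2, 'Bob', 'A'), (3, 'Cid', 'B');
    INSERT INTO Invoice  VALUES (1, 1, 20), (2, 2, 10), (3, 3, 10);
""")

# Total spent per customer; reused as a derived table in both queries.
PER_CUSTOMER = """
    SELECT c.CustomerId, c.FirstName, c.Country, SUM(i.Total) AS tot
    FROM Customer c JOIN Invoice i ON c.CustomerId = i.CustomerId
    GROUP BY c.CustomerId, c.FirstName, c.Country
"""

# Uncorrelated: matches against the maxima of ALL countries.
uncorrelated = conn.execute(f"""
    SELECT t.FirstName FROM ({PER_CUSTOMER}) t
    WHERE t.tot IN (SELECT MAX(m.tot) FROM ({PER_CUSTOMER}) m GROUP BY m.Country)
    ORDER BY t.FirstName
""").fetchall()

# Correlated: matches only against the maximum of the customer's own country.
correlated = conn.execute(f"""
    SELECT t.FirstName FROM ({PER_CUSTOMER}) t
    WHERE t.tot = (SELECT MAX(m.tot) FROM ({PER_CUSTOMER}) m
                   WHERE m.Country = t.Country)
    ORDER BY t.FirstName
""").fetchall()

print(uncorrelated)  # includes Bob
print(correlated)    # Ann and Cid only
```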

The accepted solution (at the time of writing this answer) fails on at least PostgreSQL, because it selects columns that appear neither in the GROUP BY clause nor in an aggregate function. Here is another variant:
WITH t AS
(
SELECT
c.customerid AS id, c.lastname AS l, c.firstname AS f,
c.country AS c, SUM(i.total) AS s
FROM customer c JOIN invoice i ON c.customerid = i.customerid
GROUP BY c.customerid, c.country
-- note: last and first name do not need to be included in the group by clause
-- here as the (assumed!) primary key 'customerid' already is
)
SELECT o.c, o.l, o.f, o.s FROM t AS o
-- the outer alias o is needed: an unqualified c inside the subquery
-- would refer to the inner t and the filter would compare a row with itself
WHERE o.s = (SELECT MAX(t.s) FROM t WHERE t.c = o.c)
(tested on PostgreSQL)
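As a quick check of the CTE-plus-correlated-subquery idea, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database through Python's sqlite3 module. Running it on SQLite rather than PostgreSQL is an assumption made for convenience, and the ROUND(..., 2) call is added so that the two United Kingdom totals compare equal despite floating-point summation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (CustomerId INTEGER PRIMARY KEY, FirstName TEXT,
                           LastName TEXT, Country TEXT);
    CREATE TABLE Invoice  (InvoiceId INTEGER PRIMARY KEY, CustomerId INTEGER,
                           Total REAL);
    -- sample rows copied from the question
    INSERT INTO Customer VALUES
        (1,  'Luís',     'Gonçalves', 'Brazil'),
        (2,  'Leonie',   'Köhler',    'Germany'),
        (3,  'François', 'Tremblay',  'Canada'),
        (4,  'Bjørn',    'Hansen',    'Norway'),
        (52, 'Emma',     'Jones',     'United Kingdom'),
        (53, 'Phil',     'Hughes',    'United Kingdom');
    INSERT INTO Invoice VALUES
        (1, 2, 1.98), (2, 4, 3.96), (3, 8, 5.94),
        (140, 52, 23.76), (369, 52, 13.86),
        (283, 53, 28.71), (109, 53, 8.91);
""")

rows = conn.execute("""
    WITH spent AS (
        -- total per customer, rounded to cents to make ties comparable
        SELECT c.CustomerId, c.FirstName, c.Country,
               ROUND(SUM(i.Total), 2) AS TotalSpent
        FROM Customer c JOIN Invoice i ON c.CustomerId = i.CustomerId
        GROUP BY c.CustomerId, c.FirstName, c.Country
    )
    SELECT o.Country, o.FirstName, o.TotalSpent
    FROM spent AS o
    -- keep a customer only if their total is the maximum of their own country
    WHERE o.TotalSpent = (SELECT MAX(s.TotalSpent) FROM spent AS s
                          WHERE s.Country = o.Country)
    ORDER BY o.Country, o.CustomerId
""").fetchall()

for row in rows:
    print(row)
```

With this sample, customers without invoices (and the invoice whose customer is not in the sample) drop out of the inner join, and both United Kingdom customers are returned because their totals tie.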

The code below fulfilled all the requirements:
WITH tab1 AS
( SELECT c.CustomerId, c.FirstName, c.LastName,c.Country, SUM(i.Total) as TotalSpent
FROM Customer AS c
JOIN Invoice i ON c.CustomerId = i.CustomerId
GROUP BY C.CustomerId
)
SELECT tab1.*
FROM tab1
JOIN
( SELECT CustomerId, FirstName, LastName, Country, MAX(TotalSpent) AS TotalSpent
FROM tab1
GROUP BY Country
)tab2
ON tab1.Country = tab2.Country
WHERE tab1.TotalSpent = tab2.TotalSpent
ORDER BY Country;

Related

For each IDcategory, get the maximum sum of customer amounts

I would like to find the total purchase for each customer then return the highest value by customer category.
For now, I'm just able to have the total purchase for each customer
SELECT c.CustomerID,
c.CustomerName,
cat.CustomerCategoryName,
SUM(p.Quantity*p.UnitPrice) AS TotalAmount
FROM
Purchases AS p
join Customers AS c ON c.CustomerID = p.CustomerID
join Categories AS cat ON c.CustomerCategoryID = cat.CustomerCategoryID
GROUP BY c.CustomerID, c.CustomerName,cat.CustomerCategoryName
ORDER BY TotalAmount DESC
The result set returns a row for each CustomerID:

CustomerID | CustomerName  | CustomerCategoryName | TotalAmount
-----------|---------------|----------------------|------------
905        | Sara Huiting  | Supermarket          | 24093.60
155        | Tailspin Toys | Novelty Shop         | 23579.50
473        | Hilton        | Hotel                | 23125.60
143        | Jane Doe      | Journalist           | 21915.50
518        | Wingtip Toys  | Novelty Shop         | 20362.40
489        | Jason Black   | Supermarket          | 20226.40
...        | ...           | ...                  | ...

I have 6 categories:
Hotel
Journalist
Novelty Shop
Supermarket
Computer Store
Gift Store
I would like the highest "TotalAmount" for each "CustomerCategoryName", so that only 6 records are returned (instead of 500).

CustomerID | CustomerName  | CustomerCategoryName | TotalAmount
-----------|---------------|----------------------|------------
905        | Sara Huiting  | Supermarket          | 24093.60
155        | Tailspin Toys | Novelty Shop         | 23579.50
473        | Hilton        | Hotel                | 23125.60
143        | Jane Doe      | Journalist           | 21915.50
1018       | Nils Kaulins  | Computer Store       | 17019.00
866        | Jay Bhuiyan   | Gift Store           | 14251.50

How to improve my query to get this output?
You can use TOP(1) WITH TIES together with an ORDER BY clause on a ROW_NUMBER window function. ROW_NUMBER assigns 1 to the row with the highest "TotalAmount" in each "CustomerCategoryName", and WITH TIES keeps every row whose row number is 1, so one row per category is returned.
SELECT TOP(1) WITH TIES
c.CustomerID,
c.CustomerName,
cat.CustomerCategoryName,
SUM(p.Quantity*p.UnitPrice) AS TotalAmount
FROM Purchases p
JOIN Customers c ON c.CustomerID = p.CustomerID
JOIN Categories cat ON c.CustomerCategoryID = cat.CustomerCategoryID
GROUP BY c.CustomerID,
c.CustomerName,
cat.CustomerCategoryName
ORDER BY ROW_NUMBER() OVER(PARTITION BY cat.CustomerCategoryName
ORDER BY SUM(p.Quantity*p.UnitPrice) DESC)
If you want to do this with just subqueries, and not with a CTE, you can proceed as follows:
Innermost query: get all row values.
Second query: assign a row number to each row, partitioned by CustomerCategoryName and ordered by TotalAmount descending.
Final query: keep only the rows where RowRank is 1.
You can probably optimize this by computing RowRank in the innermost subquery, but without access to the tables I'm not sure the query plan would be any more efficient.
/*
Get final values where the RowRank = 1
*/
SELECT DISTINCT
final.CustomerID,
final.CustomerName,
final.CustomerCategoryName,
final.TotalAmount
FROM (
/*
Get the individual row rankings by TotalAmount DESC
*/
SELECT DISTINCT
data.CustomerID,
data.CustomerName,
data.CustomerCategoryName,
data.TotalAmount,
ROW_NUMBER() OVER (PARTITION BY data.CustomerCategoryName ORDER BY data.TotalAmount DESC) AS RowRank
FROM (
/*
Get all row values
*/
SELECT
c.CustomerID,
c.CustomerName,
cat.CustomerCategoryName,
SUM(p.Quantity * p.UnitPrice) AS TotalAmount
FROM Purchases AS p
JOIN Customers AS c
ON c.CustomerID = p.CustomerID
JOIN Categories AS cat
ON c.CustomerCategoryID = cat.CustomerCategoryID
GROUP BY c.CustomerID,
c.CustomerName,
cat.CustomerCategoryName) data) final
WHERE final.RowRank = 1
ORDER BY final.TotalAmount DESC
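Note that ROW_NUMBER keeps exactly one row per category even when two customers tie for the top amount, whereas RANK gives all tied rows the value 1. The following sketch illustrates the difference with Python's sqlite3 module (window functions need SQLite 3.25 or newer). The Totals table and its rows are hypothetical and stand in for the aggregated result of the innermost query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical pre-aggregated totals; Ann and Bob tie in 'Hotel'
    CREATE TABLE Totals (CustomerName TEXT, Category TEXT, TotalAmount INTEGER);
    INSERT INTO Totals VALUES
        ('Ann', 'Hotel', 300), ('Bob', 'Hotel', 300), ('Cid', 'Hotel', 100),
        ('Dee', 'Gift Store', 200), ('Eve', 'Gift Store', 150);
""")

def top_per_category(ranking_function):
    """Return the rows ranked 1 in their category by the given window function."""
    sql = f"""
        SELECT CustomerName, Category, TotalAmount
        FROM (SELECT t.*,
                     {ranking_function}() OVER (PARTITION BY Category
                                                ORDER BY TotalAmount DESC) AS rnk
              FROM Totals t)
        WHERE rnk = 1
        ORDER BY Category, CustomerName
    """
    return conn.execute(sql).fetchall()

with_row_number = top_per_category("ROW_NUMBER")  # one row per category
with_rank = top_per_category("RANK")              # all tied rows per category

print(len(with_row_number), with_row_number)
print(len(with_rank), with_rank)
```

With ROW_NUMBER, which of the two tied 'Hotel' rows is kept is not defined, so only the row count is stable for that variant.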

How can I get all the MAX values from a certain column in a dataset in PostgreSQL

I'm asked to find the top user for each country. However, one of the countries has 2 users with the same amount spent, so both should be top users, but I can't get my query to return both of them.
Here is the code:
WITH t1 AS (
SELECT c.customerid,SUM(i.total) tot
FROM invoice i
JOIN customer c ON c.customerid = i.customerid
GROUP BY 1
ORDER BY 2 DESC
),
t2 AS (
SELECT c.customerid as CustomerId ,c.firstname as FirstName,c.lastname as LastName, i.billingcountry as Country,MAX(t1.tot) as TotalSpent
FROM t1
JOIN customer c
ON c.customerid = t1.customerid
JOIN invoice i ON i.customerid = c.customerid
GROUP BY 4
ORDER BY 4
)
SELECT *
FROM t2
BillingCountry is in Invoice and holds the country name for each invoice.
Total is also in Invoice and shows how much the customer spent on each purchase (fees and taxes differ per purchase, and Total is the final price paid each time).
Customer has the id, first name, and last name; I use its id to sum the totals of each customer's purchases.
I applied MAX after computing the sum for each customer and grouped by country to find the maximum per country. However, I can't get both top customers for the last country, which has 2 customers with the same maximum.
Use rank() or dense_rank():
SELECT c.*, i.tot
FROM (SELECT i.customerid, i.billingCountry, SUM(i.total) as tot,
RANK() OVER (PARTITION BY i.billingCountry ORDER BY SUM(i.total) DESC) as seqnum
FROM invoice i
GROUP BY 1, 2
) i JOIN
customer c
ON c.customerid = i.customerid
WHERE seqnum = 1;
The subquery finds the amount per customer in each country -- and importantly calculates a ranking for the combination with ties having the same rank. The outer query just brings in the additional customer information that you seem to want.
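To see how the two ranking functions treat ties, here is a small sketch using Python's sqlite3 module (SQLite 3.25 or newer for window functions). The Spend table and its three rows are invented for illustration. Both functions give tied top rows the value 1, so either works with a filter on 1; they differ only in the rank assigned after the tie.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical per-customer totals with a tie at the top
    CREATE TABLE Spend (Customer TEXT, Country TEXT, Total INTEGER);
    INSERT INTO Spend VALUES
        ('Emma',  'United Kingdom', 30),
        ('Phil',  'United Kingdom', 30),
        ('Steve', 'United Kingdom', 20);
""")

ranked = conn.execute("""
    SELECT Customer, Total,
           RANK()       OVER (PARTITION BY Country ORDER BY Total DESC) AS rnk,
           DENSE_RANK() OVER (PARTITION BY Country ORDER BY Total DESC) AS dense_rnk
    FROM Spend
    ORDER BY Total DESC, Customer
""").fetchall()

for row in ranked:
    print(row)
# RANK leaves a gap after the tie (1, 1, 3); DENSE_RANK does not (1, 1, 2).
```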
Here is how it worked for me. I was not allowed to use many commands, such as RIGHT JOIN and RANK() (which Gordon Linoff suggested), so I had to create a third CTE for the anomaly and combine it using UNION. This solution works only for this particular case; the good solution is the one posted by Gordon Linoff:
WITH t1 AS (
SELECT c.customerid,SUM(i.total) tot
FROM invoice i
JOIN customer c ON c.customerid = i.customerid
GROUP BY 1
ORDER BY 2 DESC
),
t2 AS (
SELECT c.customerid as CustomerId ,c.firstname as FirstName,c.lastname as LastName, i.billingcountry as Country,MAX(t1.tot) as TotalSpent
FROM t1
JOIN customer c
ON c.customerid = t1.customerid
JOIN invoice i ON i.customerid = c.customerid
GROUP BY 4
ORDER BY 4
) ,
t3 AS (
SELECT DISTINCT c.customerid as CustomerId ,c.firstname as FirstName,c.lastname as LastName, i.billingcountry as Country,t1.tot as TotalSpent
FROM t1
JOIN customer c
ON c.customerid = t1.customerid
JOIN invoice i ON i.customerid = c.customerid
WHERE i.billingcountry = 'United Kingdom'
ORDER BY t1.tot DESC
LIMIT 2
)
SELECT *
FROM t2
UNION
SELECT * FROM t3
ORDER BY Country

I need a solution to this SQL Query I'm trying to solve

"Write a query that determines the customer that has spent the most on
music for each country. Write a query that returns the country along
with the top customer and how much they spent. For countries where the
top amount spent is shared, provide all customers who spent this
amount.
You should only need to use the Customer and Invoice tables.
Check Your Solution
Though there are only 24 countries, your query should return 25 rows
because the United Kingdom has 2 customers that share the maximum."
You can find the data set here.
Here is the code I tried with the results
And here is the expected outcome
Generally, you should always GROUP BY anything in your SELECT that is not an aggregation function (e.g. SUM). Try this:
SELECT c.CustomerId, c.FirstName, c.LastName, c.Country,
SUM(i.Total) AS TotalSpent
FROM Customer c
JOIN Invoice i
ON i.CustomerId = c.CustomerId
GROUP BY c.CustomerId, c.FirstName, c.LastName, c.Country
ORDER BY c.Country
WITH tab1 AS
( SELECT c.CustomerId, c.FirstName, c.LastName, c.Country, SUM(i.Total) TotalSpent
FROM Customer c
JOIN Invoice i ON c.CustomerId = i.CustomerId
GROUP BY c.CustomerId
)
SELECT tab1.*
FROM tab1
left JOIN
( SELECT CustomerId, FirstName, LastName, Country, MAX(TotalSpent) AS TotalSpent
FROM tab1
GROUP BY Country
) tab2
ON tab1.Country = tab2.Country
WHERE tab1.TotalSpent = tab2.TotalSpent
ORDER BY Country;

How to return two equal max values for the same country when the query is grouped by country?

For example, I have to write a query that shows the customer who spent the most in each country, but if a country has two customers with the same maximum value, I have to show both of them in the output.
I have written a query that returns the customer with the maximum value in each country, but the last country in my example, 'United Kingdom', has two customers with the same maximum value and I couldn't show them both.
SELECT c1.CustomerId, c1.FirstName,c1.LastName,c1.Country,
MAX(c1.TotalSpent) as TotalSpent
FROM
(SELECT c.CustomerId,c.FirstName, c.LastName,i.BillingCountry
Country, SUM(i.Total) totalspent
FROM Customer c
JOIN Invoice i
ON c.CustomerId = i.CustomerId
GROUP BY 1
ORDER BY totalspent
) c1
GROUP BY 4
ORDER BY Country
Use window functions:
SELECT c.*
FROM (SELECT c.CustomerId, c.FirstName, c.LastName, i.BillingCountry as Country,
SUM(i.Total) as totalspent,
DENSE_RANK() OVER (PARTITION BY i.BillingCountry ORDER BY SUM(i.Total) DESC) as seqnum
FROM Customer c JOIN
Invoice i
ON c.CustomerId = i.CustomerId
GROUP BY c.CustomerId, c.FirstName, c.LastName, i.BillingCountry
) c
WHERE seqnum = 1
ORDER BY Country;
This also fixes your GROUP BY clauses so they are consistent with the columns being selected.

question about SQL query

I'm working on a small project involving oracle database,
and I have the following tables:
CUSTOMER ( Cid, CName, City, Discount )
PRODUCT ( Pid, PName, City, Quantity, Price )
ORDERS ( OrderNo, Month, Cid, Aid, Pid, OrderedQuantity, Cost )
How can I retrieve the names of all customers who ordered all the products?
For example if customer x ordered product1, product2 and product3 (which are all the products the company offers) he will be selected. And if customer y only ordered product 1 and 2 but not 3 he will not be selected.
How can I achieve this?
You want "relational division".
select *
from customer c
where not exists( -- there is no product
select 'x'
from product p
where not exists( -- that the customer did not order
select 'x'
from orders o
where o.cid = c.cid
and o.pid = p.pid));
or
select c.cid
,c.cname
from customer c
join orders o on o.cid = c.cid
group
by c.cid
,c.cname
having count(distinct o.pid) = (select count(*) from product);
Here is a great article by Joe Celko that shows several ways of implementing relational division (and variations): Divided We Stand: The SQL of Relational Division
You can use group by and use a having clause to demand that the customer has ordered all products there are:
select c.CName
from Customer c
join Orders o
on o.Cid = c.Cid
group by
c.Cid
, c.CName
having count(distinct o.Pid) = (select count(*) from Product)
IMHO more readable than the "relational division" approach, but less efficient.
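Both formulations can be tried side by side. The sketch below uses Python's sqlite3 module instead of Oracle (an assumption made so the example is self-contained), with a cut-down version of the question's tables and made-up rows: customer x ordered all three products (product1 twice), customer y only two.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- reduced versions of the question's tables with hypothetical rows
    CREATE TABLE Customer (Cid INTEGER PRIMARY KEY, CName TEXT);
    CREATE TABLE Product  (Pid INTEGER PRIMARY KEY, PName TEXT);
    CREATE TABLE Orders   (OrderNo INTEGER PRIMARY KEY, Cid INTEGER, Pid INTEGER);
    INSERT INTO Customer VALUES (1, 'x'), (2, 'y');
    INSERT INTO Product  VALUES (1, 'product1'), (2, 'product2'), (3, 'product3');
    INSERT INTO Orders   VALUES (1, 1, 1), (2, 1, 2), (3, 1, 3), (4, 1, 1),
                                (5, 2, 1), (6, 2, 2);
""")

# Double NOT EXISTS: no product exists that the customer has not ordered.
not_exists = conn.execute("""
    SELECT c.CName
    FROM Customer c
    WHERE NOT EXISTS (
        SELECT 1 FROM Product p
        WHERE NOT EXISTS (
            SELECT 1 FROM Orders o
            WHERE o.Cid = c.Cid AND o.Pid = p.Pid))
""").fetchall()

# Counting: the number of distinct products ordered equals the product count.
counting = conn.execute("""
    SELECT c.CName
    FROM Customer c
    JOIN Orders o ON o.Cid = c.Cid
    GROUP BY c.Cid, c.CName
    HAVING COUNT(DISTINCT o.Pid) = (SELECT COUNT(*) FROM Product)
""").fetchall()

print(not_exists)
print(counting)
```

COUNT(DISTINCT ...) matters here: customer x ordered product1 twice, and a plain COUNT would see four orders against three products. The two queries also differ when Product is empty: the NOT EXISTS form then returns every customer, while the counting form returns none.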