postgresql group by and inner join - sql

I want a query in SQL which does INNER JOIN and GROUP BY at the same time. I tried the following which doesn't work:
SELECT customer.first_name, SUM(payment.amount)
FROM customer
GROUP BY customer.customer_id
INNER JOIN payment
ON payment.customer_id = customer.customer_id;
Thank you in advance!

First, GROUP BY comes after the FROM, JOIN, and WHERE clauses, and before any HAVING or ORDER BY clause.
Second, every column in the SELECT list that is not inside an aggregate function must appear in the GROUP BY clause.
So:
SELECT customer.first_name, SUM(payment.amount)
FROM customer
INNER JOIN payment
ON payment.customer_id = customer.customer_id
GROUP BY customer.first_name;
But customers who share the same first_name will be merged into one group, which is probably not what you want.
So group by customer_id as well:
SELECT customer.first_name, SUM(payment.amount)
FROM customer
INNER JOIN payment
ON payment.customer_id = customer.customer_id
GROUP BY customer.first_name, customer.customer_id;
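To see the difference between the two GROUP BY variants above, here is a minimal sketch that runs both against an in-memory SQLite database from Python. The sample rows (two different customers both named "Ann") and the payment_id column are made up for illustration, and SQLite stands in for PostgreSQL here.

```python
import sqlite3

# Hypothetical sample data: customers 1 and 2 share the first name "Ann".
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, first_name TEXT);
CREATE TABLE payment (payment_id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER);
INSERT INTO customer VALUES (1, 'Ann'), (2, 'Ann'), (3, 'Bob');
INSERT INTO payment (customer_id, amount) VALUES (1, 10), (1, 5), (2, 7), (3, 3);
""")

# Grouping by first_name only: both Anns collapse into one row.
by_name = conn.execute("""
SELECT customer.first_name, SUM(payment.amount)
FROM customer
INNER JOIN payment ON payment.customer_id = customer.customer_id
GROUP BY customer.first_name
ORDER BY customer.first_name
""").fetchall()

# Grouping by customer_id as well: one row per actual customer.
by_id = conn.execute("""
SELECT customer.first_name, SUM(payment.amount)
FROM customer
INNER JOIN payment ON payment.customer_id = customer.customer_id
GROUP BY customer.customer_id, customer.first_name
ORDER BY customer.customer_id
""").fetchall()

print(by_name)  # [('Ann', 22), ('Bob', 3)]
print(by_id)    # [('Ann', 15), ('Ann', 7), ('Bob', 3)]
```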

You want to group by the customer_id, but get the first_name?
SELECT customer.first_name, SUM(payment.amount)
FROM customer
INNER JOIN payment
ON payment.customer_id = customer.customer_id
GROUP BY customer.customer_id, customer.first_name;
You might also do the aggregation in a Derived Table, then you can get additional columns from customer:
SELECT customer.first_name, SumPayment
FROM customer
INNER JOIN
(
SELECT customer_id,
SUM(payment.amount) AS SumPayment
FROM payment
GROUP BY customer_id
) AS payment
ON payment.customer_id = customer.customer_id

Related

Order customers based on the purchases sum

I have this SQL
SELECT customers.first_name
FROM customers
INNER JOIN orders ON customers.id = orders.customer_id
GROUP BY first_name
HAVING SUM(orders.price) > 100;
But I want all customers to be listed in a table from the highest purchase price of their order to the lowest.
You can use a simple ORDER BY:
SELECT customers.first_name, SUM(orders.price) orders_price
FROM customers
INNER JOIN orders ON customers.id = orders.customer_id
GROUP BY first_name
ORDER BY orders_price DESC;
You can also use LEFT JOIN with the COALESCE function to include customers without orders:
SELECT
customers.first_name,
COALESCE(SUM(orders.price), 0) orders_price
FROM customers
LEFT JOIN orders ON customers.id = orders.customer_id
GROUP BY first_name
ORDER BY orders_price DESC;
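As a rough check of the LEFT JOIN and COALESCE approach, the sketch below runs it on an in-memory SQLite database from Python. The three customers and their orders are invented sample data; "Cy" has no orders and should still show up with a total of 0. Grouping by customers.id in addition to first_name is my own choice here, so that two customers with the same first name would not be merged.

```python
import sqlite3

# Hypothetical sample data: Cy has no orders at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, first_name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, price INTEGER);
INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO orders (customer_id, price) VALUES (1, 50), (1, 70), (2, 30);
""")

# LEFT JOIN keeps customers without orders; SUM over no rows is NULL,
# which COALESCE replaces with 0.
rows = conn.execute("""
SELECT customers.first_name, COALESCE(SUM(orders.price), 0) AS orders_price
FROM customers
LEFT JOIN orders ON customers.id = orders.customer_id
GROUP BY customers.id, customers.first_name
ORDER BY orders_price DESC
""").fetchall()

print(rows)  # [('Ann', 120), ('Bob', 30), ('Cy', 0)]
```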

Group by and inner join: how to select joined without a "max" trick

Here is a simple query:
SELECT orders.id, customers.name, COUNT(order_product.id)
FROM orders
INNER JOIN order_product ON orders.id = order_product.order_id
INNER JOIN customers ON orders.customer_id = customers.id
GROUP BY orders.id;
In other words, I want:
The ID of an order.
The number of products (count) in each order.
The customer name of the order.
The problem is selecting customers.name. I cannot select it directly because it is neither in an aggregate function nor in the GROUP BY. But there is only one name per order, so I don't know why I have to aggregate it. I can use a trick like this to select the name:
SELECT MAX(customers.name)
But I think it's dirty, because I don't want the "max name of a customer for an order" but "the name of the customer for an order". What is the elegant way to do such a thing?
Hope it's clear and not a duplicate.
EDIT: an order has only one customer, identified by orders.customer_id. That's why I'm asking why I have to use such a trick.
Add customers.name to the GROUP BY clause:
SELECT orders.id, customers.name, COUNT(order_product.id)
FROM orders
INNER JOIN order_product ON orders.id = order_product.order_id
INNER JOIN customers ON orders.customer_id = customers.id
GROUP BY orders.id, customers.name
Usually you can simply group by all selected columns that are not arguments to set functions!
Alternatively, you could use window functions:
SELECT DISTINCT orders.id, customers.name, COUNT(order_product.id) OVER (PARTITION BY orders.id)
FROM orders
INNER JOIN order_product ON orders.id = order_product.order_id
INNER JOIN customers ON orders.customer_id = customers.id;
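A small sketch comparing the two suggestions (extended GROUP BY versus a window function with DISTINCT) on made-up data. It uses Python with an in-memory SQLite database and assumes a SQLite build with window-function support (3.25 or later); the rows are hypothetical.

```python
import sqlite3

# Hypothetical sample data: order 10 has three products, order 11 has one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
CREATE TABLE order_product (id INTEGER PRIMARY KEY, order_id INTEGER);
INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO orders VALUES (10, 1), (11, 2);
INSERT INTO order_product (order_id) VALUES (10), (10), (10), (11);
""")

# Variant 1: add customers.name to the GROUP BY.
grouped = conn.execute("""
SELECT orders.id, customers.name, COUNT(order_product.id)
FROM orders
INNER JOIN order_product ON orders.id = order_product.order_id
INNER JOIN customers ON orders.customer_id = customers.id
GROUP BY orders.id, customers.name
ORDER BY orders.id
""").fetchall()

# Variant 2: count per order with a window function, then de-duplicate.
windowed = conn.execute("""
SELECT DISTINCT orders.id, customers.name,
       COUNT(order_product.id) OVER (PARTITION BY orders.id)
FROM orders
INNER JOIN order_product ON orders.id = order_product.order_id
INNER JOIN customers ON orders.customer_id = customers.id
ORDER BY orders.id
""").fetchall()

print(grouped)   # [(10, 'Ann', 3), (11, 'Bob', 1)]
print(windowed)  # same rows as grouped
```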

How to use column from main query in subquery?

I have a table Invoice with a column total. Then I have a table Payments with a column amount (Usually there are several payments to one invoice). I need a column balance which is the difference of Invoice.Total - (total of payments made on that invoice). This is what I have (Oh ya using Azure Sql Server)
select I.Invoice_Id,
I.Total - (select sum(Amount) from Payments P
where I.Invoice_Id = P.Invoice_Id) as Balance,
Q.Quote_Id,
Q.Description,
Q.Vendor_Num
from Invoice as I
inner join Payments as P on I.Invoice_Id = P.Invoice_Id
inner join Quote as Q on Q.Quote_Id = I.Quote_Id;
Eventually this will be a view showing what invoices have balance owed. If I remove the where in the sub query it gives me an answer but it is the sum of all payments. I just want the sum of payments made on that invoice. Any help would be appreciated.
Thanks
There are two approaches to this: a subquery or a GROUP BY. If you use a subquery, you don't need to join Payments in the main query. Also, the inner join to Payments means that invoices without any payment are not returned. Changing it to a LEFT JOIN, as in the Group By example, keeps those invoices; P.Amount is then NULL for them, which ISNULL turns into 0.
Group By:
SELECT I.Invoice_Id,
I.Total - sum(ISNULL(P.Amount,0)) AS Balance,
Q.Quote_Id,
Q.Description,
Q.Vendor_Num
FROM Invoice AS I
JOIN Quote AS Q on Q.Quote_Id = I.Quote_Id
LEFT JOIN Payments AS P on I.Invoice_Id = P.Invoice_Id
GROUP BY I.Invoice_Id, I.Total, Q.Quote_Id, Q.Description, Q.Vendor_Num
Subquery:
SELECT I.Invoice_Id,
I.Total - (SELECT ISNULL(SUM(Amount),0) FROM Payments P WHERE P.Invoice_Id = I.Invoice_Id) AS Balance,
Q.Quote_Id,
Q.Description,
Q.Vendor_Num
FROM Invoice AS I
JOIN Quote AS Q on Q.Quote_Id = I.Quote_Id
I suspect your query is returning multiple results (duplicates per payment) since you are joining on the payments table.
One option would be to just remove that join to the payments table. Here's an alternative option which moves the correlated subquery to a join:
select I.Invoice_Id,
I.Total - p.SumAmount as Balance,
Q.Quote_Id,
Q.Description,
Q.Vendor_Num
from Invoice as I
inner join Quote as Q on Q.Quote_Id = I.Quote_Id
inner join (
select invoice_id, sum(amount) SumAmount
from Payments
group by invoice_id) as P on I.Invoice_Id = P.Invoice_Id
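Note that the inner join to the derived table drops invoices that have no payments yet. A possible variation, sketched below, uses a LEFT JOIN and COALESCE so unpaid invoices keep their full balance. This is Python running against in-memory SQLite rather than SQL Server; the sample rows and the Payment_Id column are invented, and the Quote join is left out to keep it short.

```python
import sqlite3

# Hypothetical sample data: invoice 3 has no payments at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Invoice (Invoice_Id INTEGER PRIMARY KEY, Total INTEGER);
CREATE TABLE Payments (Payment_Id INTEGER PRIMARY KEY, Invoice_Id INTEGER, Amount INTEGER);
INSERT INTO Invoice VALUES (1, 100), (2, 200), (3, 50);
INSERT INTO Payments (Invoice_Id, Amount) VALUES (1, 40), (1, 60), (2, 50);
""")

# Aggregate payments once per invoice in a derived table, then LEFT JOIN it
# so invoices without payments survive; COALESCE maps their NULL sum to 0.
balances = conn.execute("""
SELECT I.Invoice_Id, I.Total - COALESCE(P.SumAmount, 0) AS Balance
FROM Invoice AS I
LEFT JOIN (
    SELECT Invoice_Id, SUM(Amount) AS SumAmount
    FROM Payments
    GROUP BY Invoice_Id
) AS P ON P.Invoice_Id = I.Invoice_Id
ORDER BY I.Invoice_Id
""").fetchall()

print(balances)  # [(1, 0), (2, 150), (3, 50)]
```

Because the payments are summed before the join, each invoice appears once, with no duplicate rows per payment.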

SQL Server Aggregation using Max() and obtaining details from max() line.

I just took a final exam and one of the questions asked me to double join three tables, and report the max sale payout for each salesperson.
The tables have the following variables:
Salesperson(id, name)
Order(orderid, order_date, Cust_id, Saleperson_id, amount)
Customer(id, name)
After joining:
select salesperson.Name, Orders.Number, customer.Name, Orders.Amount
from Orders
join salesperson
on orders.Salesperson_id = salesperson.ID
join Customer
on customer.ID = orders.cust_id
What the instructor wanted was for me to find each salesperson's maximum sale (as given by orders.amount). He also wanted the salesperson (salesperson.name), the order number of the max sale (orders.number), the customer the sale was with (customer.name), and the max sale amount. What is the most efficient way to do this? I tried "group by salesperson.name", but that fails because orders.number and customer.name are neither aggregated nor grouped.
I finished the problem this way
select
salesperson.name as Sales_Person,
orders.number as Order_Number,
customer.Name as Customer_Name,
orders.Amount as Sale_Amount
from salesperson
left join Orders
on salesperson.ID = orders.Salesperson_id
left join Customer
on orders.cust_id = customer.ID
where orders.Amount in (select max(orders.Amount)
from salesperson
join Orders
on salesperson.ID = orders.Salesperson_id
join Customer
on orders.cust_id = customer.ID
group by salesperson.name)
I know this is a bad way to do it. For instance, what if two different salespersons' max sales were equal? Max and min are not like count and sum because they pick out one line from an aggregation, but the rules still apply. Also, you might notice that there is no real unique identifier in the joined table other than orders.number, which is not useful. Therefore, I would have to use some composite of salesperson.name and orders.number.
Also, what do I do if I have to pick the top three sales for each salesperson? Should such an output be totally different code-wise than what would be required from just the first sale?
I keep bumping my head against this problem, and I would love to see a more professional approach to it.
SELECT
M.max_amount,
S.Name AS Sales_Person,
O.Number AS Order_Number,
C.Name AS Customer_Name
FROM Orders O
JOIN salesperson S
ON S.ID = O.Salesperson_id
JOIN customer C
ON C.ID = O.Cust_id
JOIN (
SELECT Salesperson_id, MAX(amount) AS max_amount
FROM Orders
GROUP BY Salesperson_id
) M
ON M.Salesperson_id = O.Salesperson_id AND M.max_amount = O.amount
For the top 3:
-- Apply once per salesperson, so each one gets only their own top 3 orders
SELECT
T.Amount,
S.Name AS Sales_Person,
T.Number AS Order_Number,
C.Name AS Customer_Name
FROM salesperson S
CROSS APPLY (
SELECT TOP 3 O.Number, O.Cust_id, O.Amount
FROM Orders O
WHERE O.Salesperson_id = S.ID
ORDER BY O.Amount DESC
) T
JOIN customer C
ON C.ID = T.Cust_id
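Another common way to get the top N rows per group, which covers both the single max sale and the top three with the same code, is ROW_NUMBER() in a derived table. The sketch below is only an illustration: it runs in Python against in-memory SQLite (assuming window-function support, SQLite 3.25 or later), the table and column names follow the question, and the sample rows and the top_sales helper are made up.

```python
import sqlite3

# Hypothetical sample data following the question's schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE salesperson (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (number INTEGER PRIMARY KEY, cust_id INTEGER,
                     salesperson_id INTEGER, amount INTEGER);
INSERT INTO salesperson VALUES (1, 'Sam'), (2, 'Sue');
INSERT INTO customer VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO orders VALUES
    (101, 1, 1, 500), (102, 2, 1, 900), (103, 1, 1, 300),
    (104, 2, 2, 700), (105, 1, 2, 200);
""")

def top_sales(n):
    """Return the n largest orders of each salesperson (hypothetical helper)."""
    return conn.execute("""
    SELECT sales_person, number, customer_name, amount
    FROM (
        SELECT s.name AS sales_person, o.number, c.name AS customer_name, o.amount,
               -- Rank each salesperson's orders from largest to smallest;
               -- the order number breaks ties deterministically.
               ROW_NUMBER() OVER (PARTITION BY o.salesperson_id
                                  ORDER BY o.amount DESC, o.number) AS rn
        FROM orders o
        JOIN salesperson s ON s.id = o.salesperson_id
        JOIN customer c ON c.id = o.cust_id
    ) AS ranked
    WHERE rn <= ?
    ORDER BY sales_person, rn
    """, (n,)).fetchall()

print(top_sales(1))  # [('Sam', 102, 'Globex', 900), ('Sue', 104, 'Globex', 700)]
print(top_sales(2))
```

ROW_NUMBER() returns exactly n rows per salesperson even when amounts tie; swapping in RANK() would keep all tied rows instead.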

SQL Retrieve Names Based on Multiple Tables

So I have three tables. CUSTOMER(CustomerID, LastName, FirstName), PURCHASE(PurchaseID, ItemName), and TRANSACTION(CustomerID, PurchaseID, Date).
The problem I am having is that I need to get the full names of the customers who bought both items, "Paint" and "Books", but when I run my code nothing comes up. Here is what I have:
SELECT CUSTOMER.FirstName, CUSTOMER.LastName
FROM CUSTOMER, PURCHASE
WHERE PURCHASE.Item = 'Paint' AND PURCHASE.Item = 'Books'
GROUP BY CUSTOMER.LastName, CUSTOMER.FirstName;
Please help, I am really new to this and would really like some help.
This type of problem is called Relational Division.
SELECT CUSTOMER.FirstName, CUSTOMER.LastName
FROM CUSTOMER
INNER JOIN TRANSACTION
ON CUSTOMER.CustomerID = TRANSACTION.CustomerID
INNER JOIN PURCHASE
ON TRANSACTION.PurchaseID = PURCHASE.PurchaseID
WHERE PURCHASE.Item IN ('Paint', 'Books') -- list all items here
GROUP BY CUSTOMER.LastName, CUSTOMER.FirstName
HAVING COUNT(DISTINCT PURCHASE.Item) = 2 -- the total number of items searched
If a UNIQUE constraint guarantees that each item appears at most once per customer, you can use COUNT(*) instead:
SELECT CUSTOMER.FirstName, CUSTOMER.LastName
FROM CUSTOMER
INNER JOIN TRANSACTION
ON CUSTOMER.CustomerID = TRANSACTION.CustomerID
INNER JOIN PURCHASE
ON TRANSACTION.PurchaseID = PURCHASE.PurchaseID
WHERE PURCHASE.Item IN ('Paint', 'Books')
GROUP BY CUSTOMER.LastName, CUSTOMER.FirstName
HAVING COUNT(*) = 2
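To illustrate why the COUNT(*) variant depends on that uniqueness guarantee, here is a minimal sketch in Python on in-memory SQLite. The sample rows and the buyers helper are hypothetical; "Bob" bought Paint twice and never bought Books. TRANSACTION is quoted because it is a keyword in SQLite, and grouping by CustomerID as well is my own addition so that same-named customers stay separate.

```python
import sqlite3

# Hypothetical sample data: Ann bought Paint and Books, Bob bought Paint
# twice, Cy bought Books, Paint and Glue.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CUSTOMER (CustomerID INTEGER PRIMARY KEY, LastName TEXT, FirstName TEXT);
CREATE TABLE PURCHASE (PurchaseID INTEGER PRIMARY KEY, Item TEXT);
CREATE TABLE "TRANSACTION" (CustomerID INTEGER, PurchaseID INTEGER);
INSERT INTO CUSTOMER VALUES (1, 'Lee', 'Ann'), (2, 'Ray', 'Bob'), (3, 'Fox', 'Cy');
INSERT INTO PURCHASE VALUES (1, 'Paint'), (2, 'Books'), (3, 'Glue'), (4, 'Paint');
INSERT INTO "TRANSACTION" VALUES (1, 1), (1, 2), (2, 1), (2, 4), (3, 2), (3, 4), (3, 3);
""")

def buyers(count_expr):
    """Customers whose Paint/Books rows satisfy `count_expr = 2` (hypothetical helper)."""
    return conn.execute(f"""
    SELECT C.FirstName, C.LastName
    FROM CUSTOMER C
    INNER JOIN "TRANSACTION" T ON C.CustomerID = T.CustomerID
    INNER JOIN PURCHASE P ON T.PurchaseID = P.PurchaseID
    WHERE P.Item IN ('Paint', 'Books')
    GROUP BY C.CustomerID, C.LastName, C.FirstName
    HAVING {count_expr} = 2
    ORDER BY C.CustomerID
    """).fetchall()

with_distinct = buyers("COUNT(DISTINCT P.Item)")
with_star = buyers("COUNT(*)")

print(with_distinct)  # [('Ann', 'Lee'), ('Cy', 'Fox')]
print(with_star)      # [('Ann', 'Lee'), ('Bob', 'Ray'), ('Cy', 'Fox')]
```

Without the uniqueness guarantee, COUNT(*) wrongly admits Bob, whose two matching rows are both Paint.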
Your query has no join conditions connecting the tables. Join through TRANSACTION once per item:
SELECT DISTINCT A.FirstName, A.LastName
FROM CUSTOMER A
JOIN TRANSACTION B1
ON A.CustomerID = B1.CustomerID
JOIN PURCHASE C
ON B1.PurchaseID = C.PurchaseID AND C.Item = 'Paint'
JOIN TRANSACTION B2
ON A.CustomerID = B2.CustomerID
JOIN PURCHASE D
ON B2.PurchaseID = D.PurchaseID AND D.Item = 'Books'
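SELECT C.FirstName, C.LastName
FROM CUSTOMER C, PURCHASE P, TRANSACTION T
WHERE
C.CustomerID = T.CustomerID AND T.PurchaseID = P.PurchaseID
AND P.Item IN ('Paint','Books')
GROUP BY C.CustomerID, C.FirstName, C.LastName
HAVING COUNT(DISTINCT P.Item) = 2
This is untested; try it out yourself. Without the GROUP BY and HAVING it would return customers who bought either item, not both.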
You need to JOIN the tables. Try:
SELECT CUSTOMER.FirstName, CUSTOMER.LastName
FROM CUSTOMER
INNER JOIN TRANSACTION ON TRANSACTION.CustomerID=CUSTOMER.CustomerID
INNER JOIN PURCHASE ON PURCHASE.PurchaseID=TRANSACTION.PurchaseID
WHERE PURCHASE.Item = 'Paint' OR PURCHASE.Item = 'Books'
GROUP BY CUSTOMER.LastName, CUSTOMER.FirstName
HAVING COUNT(DISTINCT PURCHASE.Item) >= 2;