SQL rewrite to optimize - sql

I'm trying to optimize or change the SQL to work with inner joins rather than independent calls
Database: one invoice can have many payment records and order (products) records
Original:
SELECT
InvoiceNum,
(SELECT SUM(Orders.Cost) FROM Orders WHERE Orders.Invoice = InvoiceNum and Orders.Returned <> 1 GROUP BY Orders.Invoice) as vat_only,
(SELECT SUM(Orders.Vat) FROM Orders WHERE Orders.Invoice = InvoiceNum and Orders.Returned <> 1 GROUP BY Orders.Invoice) as sales_prevat,
(SELECT SUM(pay.Amount) FROM Payments as pay WHERE Invoices.InvoiceNum = pay.InvoiceNum ) as income
FROM
Invoices
WHERE
InvoiceYear = currentyear
I'm sure we can do this another way by grouping and joining the tables together. When I tried the SQL statement below, I wasn't getting the same number of records. I suspect the problem is the type of join or the join condition, but I still couldn't get it working after 3 hours of staring at the screen.
So far I got to...
SELECT
Invoices.InvoiceNum,
Sum(Orders.Cost) AS SumOfCost,
Sum(Orders.VAT) AS SumOfVAT,
SUM(distinct Payments.Amount) as money
FROM
Invoices
LEFT JOIN
Orders ON Orders.Invoice = Invoices.InvoiceNum
LEFT JOIN
Payments ON Invoices.InvoiceNum = Payments.InvoiceNum
WHERE
Invoices.InvoiceYear = 11
AND Orders.Returned <> 1
GROUP BY
Invoices.InvoiceNum
Sorry for the bad English, and I'm not sure what to search for to find out whether this has already been answered here :D
Thanks in advance for all the help

Your problem is that an invoice has multiple order lines and (sometimes) multiple payments. Joining both tables to the invoice produces a cross product of orders and payments for that invoice. You fix this by pre-summarizing each table.
A related problem is that an inner join drops invoices that have no payments, so you need a left outer join.
select i.InvoiceNum, osum.cost, osum.vat, psum.income
from Invoices i left outer join
(select o.Invoice, sum(o.Cost) as cost, sum(o.vat) as vat
from orders o
where o.Returned <> 1
group by o.Invoice
) osum
on osum.Invoice = i.InvoiceNum left outer join
(select p.InvoiceNum, sum(p.Amount) as income
from Payments p
group by p.InvoiceNum
) psum
on psum.InvoiceNum = i.InvoiceNum
where i.InvoiceYear = year(getdate())
Two comments: Is the key field for orders really Invoice, or is it also InvoiceNum? Also, do you have a field Invoices.InvoiceYear, or do you want year(i.InvoiceDate) in the where clause?

Assuming that both payments and orders can contain more than one record per invoice you will need to do your aggregates in a subquery to avoid cross joining:
SELECT Invoices.InvoiceNum, o.Cost, o.VAT, p.Amount
FROM Invoices
LEFT JOIN
( SELECT Invoice, Cost = SUM(Cost), VAT = SUM(VAT)
FROM Orders
WHERE Orders.Returned <> 1
GROUP BY Invoice
) o
ON o.Invoice = Invoices.InvoiceNum
LEFT JOIN
( SELECT InvoiceNum, Amount = SUM(Amount)
FROM Payments
GROUP BY InvoiceNum
) P
ON P.InvoiceNum = Invoices.InvoiceNum
WHERE Invoices.InvoiceYear = 11;
ADDENDUM
To expand on the CROSS JOIN comment, imagine this data for an Invoice (1)
Orders
Invoice Cost VAT
1 15.00 3.00
1 10.00 2.00
Payments
InvoiceNum Amount
1 15.00
1 10.00
When you join these tables as you did:
SELECT Orders.*, Payments.Amount
FROM Invoices
LEFT JOIN Orders
ON Orders.Invoice = Invoices.InvoiceNum
LEFT JOIN Payments
ON Invoices.InvoiceNum = Payments.InvoiceNum;
You end up with:
Orders.Invoice Orders.Cost Orders.Vat Payments.Amount
1 15.00 3.00 15.00
1 10.00 2.00 15.00
1 15.00 3.00 10.00
1 10.00 2.00 10.00
i.e. every combination of payments and orders, so for each invoice you get many more rows than required, which distorts your totals. Even though the original data had £25 of payments, this doubles to £50 because of the two records in the order table. This is why each table needs to be aggregated individually. Using DISTINCT would not work when there is more than one payment or order for the same amount on a single invoice.
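To see the doubling for yourself, here is a rough sketch that replays the addendum data. It assumes an in-memory SQLite database driven from Python as a stand-in for your real server, so the table definitions are simplified guesses and only the join behaviour matters.

```python
import sqlite3

# In-memory SQLite database standing in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Invoices (InvoiceNum INTEGER PRIMARY KEY);
    CREATE TABLE Orders   (Invoice INTEGER, Cost REAL, VAT REAL);
    CREATE TABLE Payments (InvoiceNum INTEGER, Amount REAL);
    INSERT INTO Invoices VALUES (1);
    INSERT INTO Orders   VALUES (1, 15.0, 3.0), (1, 10.0, 2.0);
    INSERT INTO Payments VALUES (1, 15.0), (1, 10.0);
""")

# Joining both child tables directly gives 2 x 2 = 4 rows, so both sums double.
naive = conn.execute("""
    SELECT SUM(o.Cost), SUM(p.Amount)
    FROM Invoices i
    LEFT JOIN Orders o   ON o.Invoice = i.InvoiceNum
    LEFT JOIN Payments p ON p.InvoiceNum = i.InvoiceNum
    GROUP BY i.InvoiceNum
""").fetchone()

# Aggregating each child table first keeps at most one row per invoice.
fixed = conn.execute("""
    SELECT o.Cost, p.Amount
    FROM Invoices i
    LEFT JOIN (SELECT Invoice, SUM(Cost) AS Cost
               FROM Orders GROUP BY Invoice) o
           ON o.Invoice = i.InvoiceNum
    LEFT JOIN (SELECT InvoiceNum, SUM(Amount) AS Amount
               FROM Payments GROUP BY InvoiceNum) p
           ON p.InvoiceNum = i.InvoiceNum
""").fetchone()

# SUM(DISTINCT ...) breaks once two payments share an amount:
# make both payments 10.00 (true total 20.00) and sum the distinct values.
conn.execute("UPDATE Payments SET Amount = 10.0")
distinct_sum = conn.execute("""
    SELECT SUM(DISTINCT p.Amount)
    FROM Invoices i
    LEFT JOIN Orders o   ON o.Invoice = i.InvoiceNum
    LEFT JOIN Payments p ON p.InvoiceNum = i.InvoiceNum
    GROUP BY i.InvoiceNum
""").fetchone()[0]

print("naive join:     ", naive)
print("pre-aggregated: ", fixed)
print("SUM(DISTINCT):  ", distinct_sum)
```

The naive join reports 50 for both totals, the pre-aggregated version reports 25, and the DISTINCT variant under-counts the two equal payments.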
One final point with regard to optimisation: you should probably index your tables. If you run the query and display the actual execution plan, SSMS will suggest indexes for you, but at a guess the following should improve performance:
CREATE NONCLUSTERED INDEX IX_Orders_InvoiceNum ON Orders (Invoice) INCLUDE(Cost, VAT, Returned);
CREATE NONCLUSTERED INDEX IX_Payments_InvoiceNum ON Payments (InvoiceNum) INCLUDE(Amount);
This should allow both subqueries to use only the index on each table, with no bookmark lookup or clustered index scan required.
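If you want to check the "index only" claim without SSMS, a loose analogue can be tried in SQLite from Python. This is only a sketch under assumptions: SQLite has no INCLUDE clause, so the extra columns are added as ordinary key columns, and the Id and Notes columns are invented to make the table wider than the index. The exact wording of the plan differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Id and Notes are hypothetical columns, present only so the index is narrower than the table.
    CREATE TABLE Orders (Id INTEGER PRIMARY KEY, Invoice INTEGER,
                         Cost REAL, VAT REAL, Returned INTEGER, Notes TEXT);
    -- SQLite has no INCLUDE, so Cost, VAT and Returned become trailing key columns.
    CREATE INDEX IX_Orders_Invoice ON Orders (Invoice, Cost, VAT, Returned);
""")

# Ask the planner how it would run the per-invoice aggregate.
plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT Invoice, SUM(Cost), SUM(VAT)
    FROM Orders
    WHERE Returned <> 1
    GROUP BY Invoice
""").fetchall()

# The last column of each plan row is the human-readable detail text.
plan_text = " | ".join(row[3] for row in plan)
print(plan_text)
```

When the plan mentions a covering index, every column the subquery needs is read from the index alone, which is the same effect the INCLUDE columns are meant to give you in SQL Server.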

Try this. Note that I haven't tested it, I just whipped it up in Notepad. If any of your invoices may not exist in either of the subtables, use LEFT JOIN instead.
SELECT i.InvoiceNum, vat_only, sales_prevat, income
FROM Invoices i
INNER JOIN (SELECT Invoice, SUM(Cost) [vat_only], SUM(Vat) [sales_prevat]
FROM Orders
WHERE Returned <> 1
GROUP BY Invoice) o
ON i.InvoiceNum = o.Invoice
INNER JOIN (SELECT InvoiceNum, SUM(Amount) [income]
FROM Payments
GROUP BY InvoiceNum) p
ON i.InvoiceNum = p.InvoiceNum
WHERE i.InvoiceYear = currentyear

select
PreQuery.InvoiceNum,
PreQuery.VAT_Only,
PreQuery.Sales_Prevat,
SUM( Pay.Amount ) as Income
from
( select
I.InvoiceNum,
SUM( O.Cost ) as VAT_Only,
SUM( O.Vat ) as sales_prevat
from
Invoices I
Join Orders O
on I.InvoiceNum = O.Invoice
AND O.Returned <> 1
where
I.InvoiceYear = currentYear
group by
I.InvoiceNum ) PreQuery
JOIN Payments Pay
on PreQuery.InvoiceNum = Pay.InvoiceNum
group by
PreQuery.InvoiceNum,
PreQuery.VAT_Only,
PreQuery.Sales_Prevat
Your "currentYear" reference could be parameterized, or you can get the current year from a SQL function such as
Year( GetDate() )

Related

Sum from two tables and compare with a value from third table

I'm trying to do something which I believe is very simple, but can't figure it out in SQL Statement.
The tables
Invoices (column - GrossAmount)
Receipts (column - ReceiptValue, there could be a receipt or no receipt at all)
Credit notes (column - GrossCredit, there could be a credit note or none)
I want to show the total outstanding invoices, i.e., show all invoices where Invoices.GrossAmount > (sum(Receipt.ReceiptValue) + sum(CreditNotes.GrossCredit)).
Query needs to show all the invoices which are not fully paid or not paid at all.
InvoiceId is same in all tables as foreign key.
Using MS SQL Server 2014.
You need to sum each table individually (grouped by invoice) and then [left] join the results:
SELECT i.InvoiceId
FROM invoices i
LEFT JOIN (SELECT InvoiceId, SUM(ReceiptValue) AS sum_receipt
FROM receipts
GROUP BY InvoiceId) r ON i.InvoiceId = r.InvoiceId
LEFT JOIN (SELECT InvoiceId, SUM(GrossCredit) AS sum_credit
FROM credit
GROUP BY InvoiceId) g ON i.InvoiceId = g.InvoiceId
WHERE i.GrossAmount > COALESCE(sum_receipt, 0) + COALESCE(sum_credit, 0)
I think you want something like this:
select i.*,
coalesce(r.sumrv, 0) as receiptValue,
coalesce(c.sumgc, 0) as grossCredits
from invoices i left join
(select invoiceId, sum(receiptvalue) as sumrv
from receipts
group by invoiceId
) r
on i.invoiceId = r.invoiceId left join
(select invoiceId, sum(grosscredit) as sumgc
from credits c
group by invoiceId
) c
on i.invoiceId = c.invoiceId
where i.GrossAmount > coalesce(r.sumrv, 0) + coalesce(c.sumgc, 0);
Three important things:
Use left join so you don't drop invoices with no matching records in one or both of the tables.
Use coalesce() so NULL values are treated as 0.
Do the aggregations before joining the tables.
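The three points above can be tried out on a toy data set. This is a minimal sketch, assuming SQLite driven from Python rather than SQL Server 2014, and assuming the credit notes table is called credit_notes (the question never gives its real name).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoices     (InvoiceId INTEGER PRIMARY KEY, GrossAmount REAL);
    CREATE TABLE receipts     (InvoiceId INTEGER, ReceiptValue REAL);
    CREATE TABLE credit_notes (InvoiceId INTEGER, GrossCredit REAL);
    INSERT INTO invoices VALUES (1, 100.0), (2, 50.0), (3, 80.0);
    INSERT INTO receipts VALUES (1, 60.0), (1, 40.0);  -- invoice 1 is fully paid
    INSERT INTO credit_notes VALUES (2, 20.0);         -- invoice 2 is partly credited
""")

def outstanding(join_kind, total_expr):
    """Return ids of invoices whose gross amount exceeds total_expr."""
    sql = f"""
        SELECT i.InvoiceId
        FROM invoices i
        {join_kind} JOIN (SELECT InvoiceId, SUM(ReceiptValue) AS sum_receipt
                          FROM receipts GROUP BY InvoiceId) r
               ON r.InvoiceId = i.InvoiceId
        {join_kind} JOIN (SELECT InvoiceId, SUM(GrossCredit) AS sum_credit
                          FROM credit_notes GROUP BY InvoiceId) g
               ON g.InvoiceId = i.InvoiceId
        WHERE i.GrossAmount > {total_expr}
        ORDER BY i.InvoiceId
    """
    return [row[0] for row in conn.execute(sql)]

safe_total = "COALESCE(r.sum_receipt, 0) + COALESCE(g.sum_credit, 0)"

correct     = outstanding("LEFT", safe_total)                      # all three rules applied
inner_only  = outstanding("INNER", safe_total)                     # rule 1 ignored
no_coalesce = outstanding("LEFT", "r.sum_receipt + g.sum_credit")  # rule 2 ignored

print(correct, inner_only, no_coalesce)
```

With this data only the first variant finds invoices 2 and 3. The inner join drops every invoice that lacks either a receipt or a credit note, and without coalesce() the NULL sum makes the comparison unknown, so both broken variants return nothing.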

Optimized query for a subquery in sql

I made a query to get the inventory of products as follows:
Select
b.ProductID, c.ProductName,
(Select
Case
When SUM(Qty) IS NULL
then 0
else SUM(Qty)
end
from
InvoiceDetails
where
ProductID = b.ProductID) as Sold,
(Select
Case
When SUM(QtyReceive) IS NULL
then 0
else SUM(QtyReceive)
end
from
PurchaseOrderDetails
where
ProductID = b.ProductID) as Stocks,
((Select
Case
When SUM(QtyReceive) IS NULL
then 0
else SUM(QtyReceive)
end
from
PurchaseOrderDetails
where
ProductID = b.ProductID) -
(Select
Case
When SUM(Qty) IS NULL
then 0
else SUM(Qty)
end
from
InvoiceDetails
where
ProductID = b.ProductID)) as RemainingStock
from
InvoiceDetails a
Right join
PurchaseOrderDetails b on a.ProductID = b.ProductID
Inner join
Products c on b.ProductID = c.ProductID
Group By
b.ProductID, c.ProductName
This query returns the data that I want, and it runs fine in my desktop, but when I deploy the application that runs this query on a lower specs laptop, it is really slow and causes the laptop to hang. I need some help on how to optimize the query or maybe change it to make it more efficient... thanks in advance
This are the data of my InvoiceDetails table
Data From my PurchaseOrderDetails table
Data from Products table
So I've taken out your subqueries in the select; I don't think these were necessary at all. I've also moved around your joins and given the tables better aliases:
SELECT
pod.ProductID,
pr.ProductName,
ISNULL(SUM(id.Qty),0) as Sold,
ISNULL(SUM(pod.QtyReceive),0) as Stocks,
ISNULL(SUM(pod.QtyReceive),0) - ISNULL(SUM(id.Qty),0) as RemainingStock
FROM PurchaseOrderDetails pod
INNER JOIN Products pr
ON pr.ProductID = pod.ProductID
LEFT JOIN InvoiceDetails id
ON id.ProductID = pod.ProductID
GROUP BY
pod.ProductID, pr.ProductName
You were already joining those two tables, so you don't need subqueries in the select at all. I've also wrapped the SUM in ISNULL so that a NULL total is returned as 0.
I'd suggest using SET STATISTICS TIME, IO ON at the beginning of your code (with an OFF command at the end). Then copy all of the text from your 'Messages' tab into statisticsparser.com. Do this for both queries and compare the total CPU time and the logical reads; you want both of these to be lower for better performance. I'm betting your logical reads will drop significantly with this new query.
EDIT
OK, I've put together a new query based upon your sample data. I've only used the fields that we actually need for this query so that it's simpler for this example.
Sample Data
CREATE TABLE #InvoiceDetails (ProductID int, Qty int)
INSERT INTO #InvoiceDetails (ProductID,Qty)
VALUES (3,50),(1,0),(2,1),(1,12),(2,1),(3,1),(1,1),(2,1),(1,1),(2,1)
CREATE TABLE #PurchaseOrderDetails (ProductID int, Qty int)
INSERT INTO #PurchaseOrderDetails (ProductID, Qty)
VALUES (1,100),(2,20),(4,10),(1,12),(5,12),(4,12),(3,12),(2,20),(3,20),(4,20),(5,20)
CREATE TABLE #Products (ProductID int, ProductName varchar(20))
INSERT INTO #Products (ProductID, ProductName)
VALUES (1,'Sample Product'),(2,'DYE INK CYAN'),(3,'test Product 1'),(4,'test Product 2'),(5,'test Product 3'),(1004,'TESTING PRODUCT')
For this, here is the output of your original query
ProductID ProductName Sold Stocks RemainingStock
1 Sample Product 14 112 98
2 DYE INK CYAN 4 40 36
3 test Product 1 51 32 -19
4 test Product 2 0 42 42
5 test Product 3 0 32 32
This is the re-written query that I've used. Note, there are no subqueries within the SELECT statement, they're within the joins as they should be. Also see that as we're aggregating in the subqueries we don't need to do this in the outer query too.
SELECT
pod.ProductID,
pr.ProductName,
ISNULL(id.Qty,0) as Sold,
ISNULL(pod.Qty,0) as Stocks,
ISNULL(pod.Qty,0) - ISNULL(id.Qty,0) as RemainingStock
FROM #Products pr
INNER JOIN (SELECT ProductID, SUM(Qty) Qty FROM #PurchaseOrderDetails GROUP BY ProductID) pod
ON pr.ProductID = pod.ProductID
LEFT JOIN (SELECT ProductID, SUM(Qty) Qty FROM #InvoiceDetails GROUP BY ProductID) id
ON id.ProductID = pr.ProductID
And this is the new output
ProductID ProductName Sold Stocks RemainingStock
1 Sample Product 14 112 98
2 DYE INK CYAN 4 40 36
3 test Product 1 51 32 -19
4 test Product 2 0 42 42
5 test Product 3 0 32 32
Which matches your original query.
I'd suggest trying this query on your machines and seeing which performs better; try the STATISTICS TIME, IO command I mentioned previously.
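For anyone who wants to replay the sample data without a SQL Server instance, here is a rough sketch using SQLite from Python. The assumptions are that plain tables replace the # temp tables and COALESCE replaces ISNULL (SQLite has no ISNULL function). It runs both a direct join with SUM in the outer query and the pre-aggregated version, so the difference is visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE InvoiceDetails (ProductID INTEGER, Qty INTEGER);
    INSERT INTO InvoiceDetails VALUES
        (3,50),(1,0),(2,1),(1,12),(2,1),(3,1),(1,1),(2,1),(1,1),(2,1);
    CREATE TABLE PurchaseOrderDetails (ProductID INTEGER, Qty INTEGER);
    INSERT INTO PurchaseOrderDetails VALUES
        (1,100),(2,20),(4,10),(1,12),(5,12),(4,12),(3,12),(2,20),(3,20),(4,20),(5,20);
    CREATE TABLE Products (ProductID INTEGER, ProductName TEXT);
    INSERT INTO Products VALUES
        (1,'Sample Product'),(2,'DYE INK CYAN'),(3,'test Product 1'),
        (4,'test Product 2'),(5,'test Product 3'),(1004,'TESTING PRODUCT');
""")

# Direct join: each purchase row pairs with each invoice row of the same product,
# so SUM in the outer query counts quantities several times over.
joined = conn.execute("""
    SELECT pod.ProductID, COALESCE(SUM(id.Qty), 0), COALESCE(SUM(pod.Qty), 0)
    FROM PurchaseOrderDetails pod
    JOIN Products pr ON pr.ProductID = pod.ProductID
    LEFT JOIN InvoiceDetails id ON id.ProductID = pod.ProductID
    GROUP BY pod.ProductID
    ORDER BY pod.ProductID
""").fetchall()

# Pre-aggregated: one row per product from each detail table before joining.
pre_aggregated = conn.execute("""
    SELECT pr.ProductID, pr.ProductName,
           COALESCE(id.Qty, 0), COALESCE(pod.Qty, 0),
           COALESCE(pod.Qty, 0) - COALESCE(id.Qty, 0)
    FROM Products pr
    JOIN (SELECT ProductID, SUM(Qty) AS Qty
          FROM PurchaseOrderDetails GROUP BY ProductID) pod
      ON pod.ProductID = pr.ProductID
    LEFT JOIN (SELECT ProductID, SUM(Qty) AS Qty
               FROM InvoiceDetails GROUP BY ProductID) id
      ON id.ProductID = pr.ProductID
    ORDER BY pr.ProductID
""").fetchall()

print("direct join, product 1:", joined[0])
for row in pre_aggregated:
    print(row)
```

Product 1 has 2 purchase rows and 4 invoice rows, so the direct join sees 8 rows and reports 28 sold and 448 in stock instead of 14 and 112. The pre-aggregated rows match the output table above.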
Since you already group by b.ProductID and c.ProductName, you can use aggregate functions directly instead of subqueries.
Also create indexes on your tables to improve performance.
Select
b.ProductID, c.ProductName,
SUM(isnull(a.Qty,0)) as Sold,
SUM(b.QtyReceive) as Stocks,
SUM(b.QtyReceive) - SUM(isnull(a.Qty,0)) as RemainingStock
from
PurchaseOrderDetails b
LEFT JOIN InvoiceDetails a on a.ProductID = b.ProductID
INNER JOIN Products c on b.ProductID = c.ProductID
Group By
b.ProductID, c.ProductName
Can you try this? (I wrote it without testing, as you didn't post sample data or CREATE TABLE statements.) Please check it and use it as a starting point. Compare the results of your query with this one, and compare the execution plans. Performance analysis requires some knowledge of SQL and the ability to weigh several things (e.g. how many rows there are, whether there are indexes, what the execution plan and statistics show, etc.).
SELECT C.PRODUCTID
,C.PRODUCTNAME
,COALESCE(D.QTY_SOLD,0) AS QTY_SOLD
,COALESCE(E.QTY_STOCKS,0) AS QTY_STOCKS
,COALESCE(E.QTY_STOCKS,0)-COALESCE(D.QTY_SOLD,0) AS REMAININGSTOCK
FROM PRODUCTS C
LEFT JOIN (SELECT PRODUCTID, SUM(QTY) AS QTY_SOLD
FROM INVOICEDETAILS
GROUP BY PRODUCTID
) D ON C.PRODUCTID = D.PRODUCTID
LEFT JOIN (SELECT PRODUCTID,SUM(QTYRECEIVE) AS QTY_STOCKS
FROM PURCHASEORDERDETAILS
GROUP BY PRODUCTID
) E ON C.PRODUCTID = E.PRODUCTID
Looking to your query, I think this could be equivalent (or at least I hope it is):
Select
b.ProductID
, c.ProductName
, Case When SUM(a.Qty) IS NULL then 0 else SUM(a.Qty) end as sold
, Case When SUM(b.QtyReceive) IS NULL then 0 else SUM(b.QtyReceive) end as Stock
, Case When SUM(isnull(b.QtyReceive,0 ) - isnull(a.Qty,0)) IS NULL
then 0
else SUM(isnull(b.QtyReceive,0 ) - isnull(a.Qty,0)) end as RemainingStock
from Products c
left join InvoiceDetails a on c.ProductID = a.ProductID
left join PurchaseOrderDetails b on c.ProductID = b.ProductID
Group By b.ProductID,c.ProductName

Calculate total sales for all sales order whose sales order is same as beverages

I am trying to get a query for calculating total sales for sales order number that is equal to sales order number for beverages.
Here is my query but it only gives me beverages total sales and not all other items having same sales order:
SELECT
SalesOrderNumber,TransactionDate,[ProductClass],[ProductName], Nett
FROM ((([Sales]
inner join [Date]
on [Sales].DateKey = [Date].DateKey)
inner join [Product] on [Sales].ProductKey = [Product].ProductKey
inner join [ProductCategory] on [Product].ProductCategoryKey =[ProductCategory].ProductCategoryKey)
inner join [Store] on [Sales].StoreKey = [Store].StoreKey)
where SalesOrderNumber in (select SalesOrderNumber from [Sales] where ProductClass = 'Beverages')
and StoreName = 'XYZ'
and [FullDate] = '2016-04-27'
ORDER BY SalesOrderNumber,TransactionDate
Could someone please help me with the above?
Example;
Salesorder1: Beverages - 5$
Chicken - 10$
Salesorder2: Chicken - 12$
Chips - 8$
I just need sum of total which has beverages in it. So, as per above examples, I should get (Beverages + Chicken = 15$) and it should not include the salesorder2.
Your query is pretty much unreadable. Consider using some indenting in the future.
But, since you want a row that contains sum of multiple products, you can't select stuff like [ProductClass] and [ProductName] on the same row. You'll have to GROUP them to something common like SalesOrderNumber.
Not sure at all if below query works, but only things I changed were the first and the last line of your query.
SELECT SalesOrderNumber,TransactionDate, SUM(Nett) as sum
FROM
(
(
([Sales] inner join [Date] on [Sales].DateKey = [Date].DateKey)
inner join [Product] on [Sales].ProductKey = [Product].ProductKey
inner join [ProductCategory] on [Product].ProductCategoryKey = [ProductCategory].ProductCategoryKey
)
inner join [Store] on [Sales].StoreKey = [Store].StoreKey
)
where SalesOrderNumber in
(select SalesOrderNumber from [Sales] where ProductClass = 'Beverages')
and StoreName = 'XYZ'
and [FullDate] = '2016-04-27'
GROUP BY SalesOrderNumber, TransactionDate
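To check the idea against the example from the question, here is a stripped-down sketch. It assumes a single hypothetical order_lines table in SQLite (driven from Python) in place of the Sales, Product, Date and Store joins, so only the IN filter and the GROUP BY are being exercised.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- order_lines is a hypothetical flattened table, not the real star schema.
    CREATE TABLE order_lines (SalesOrderNumber INTEGER, ProductClass TEXT, Nett REAL);
    INSERT INTO order_lines VALUES
        (1, 'Beverages', 5.0), (1, 'Chicken', 10.0),
        (2, 'Chicken', 12.0),  (2, 'Chips', 8.0);
""")

# Keep every line of any order that contains a beverage, then total per order.
totals = conn.execute("""
    SELECT SalesOrderNumber, SUM(Nett) AS total
    FROM order_lines
    WHERE SalesOrderNumber IN (SELECT SalesOrderNumber
                               FROM order_lines
                               WHERE ProductClass = 'Beverages')
    GROUP BY SalesOrderNumber
    ORDER BY SalesOrderNumber
""").fetchall()

print(totals)
```

Only sales order 1 comes back, with a total of 15 (Beverages 5 plus Chicken 10). Sales order 2 is excluded because none of its lines is a beverage.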

Rails/SQL: finding invoices by checking two sums

I have an Invoice model that has_many lines and has_many payments.
Invoice:
id
ref
Line:
invoice_id:
total (decimal)
Payment:
invoice_id:
total(decimal)
I need to find all paid invoices. So I'm doing the following:
Invoice.joins(:lines, :payments).having('sum(lines.total) = sum(payments.total)').group('invoices.id')
Which queries:
SELECT *
FROM "invoices"
INNER JOIN "lines" ON "lines"."invoice_id" = "invoices"."id"
INNER JOIN "payments" ON "payments"."invoice_id" = "invoices"."id"
GROUP BY invoices.id
HAVING sum(lines.total) = sum(payments.total)
But it always returns an empty array, even if there are fully paid invoices.
Is something wrong with my code?
If you join to more than one table with a 1:n relationship, the joined rows can multiply each other.
This related answer has more detailed explanation for the problem:
Two SQL LEFT JOINS produce incorrect result
To avoid that, sum the totals before you join. This way you join to exactly 1 (or 0) rows, and nothing is multiplied. Not only correct, also considerably faster.
SELECT i.*, l.sum_total
FROM invoices i
JOIN (
SELECT invoice_id, sum(total) AS sum_total
FROM lines
GROUP BY 1
) l ON l.invoice_id = i.id
JOIN (
SELECT invoice_id, sum(total) AS sum_total
FROM payments
GROUP BY 1
) p ON p.invoice_id = i.id
WHERE l.sum_total = p.sum_total;
Using [INNER] JOIN, not LEFT [OUTER] JOIN on purpose. Invoices that do not have any lines or payments are not of interest to begin with. Since we want "paid" invoices. For lack of definition and by the looks of the provided query, I am assuming that means invoices with actual lines and payments, both totaling the same.
Suppose one invoice has a single line and two payments that together pay it in full, like this:
lines:
id total invoice_id
1 30 1
payments:
id total invoice_id
1 10 1
2 20 1
Then joining lines and payments to invoices on invoice_id produces 2 rows like this:
payment_id payment_total line_id line_total invoice_id
1 10 1 30 1
2 20 1 30 1
So the sum of line_total (60) will not equal the sum of payment_total (30).
To get all paid invoices, you could use exists instead of joins:
Invoice.where(
"exists
(select 1 from
(select l.invoice_id
from (select invoice_id,sum(total) as line_total
from lines
group by invoice_id) as l
inner join (select invoice_id,sum(total) as payment_total
from payments
group by invoice_id) as p
on l.invoice_id = p.invoice_id
where payment_total = line_total) as paid
where invoices.id = paid.invoice_id) ")
The subquery paid returns the invoice_ids of all paid invoices.
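Here is a small sketch of the same situation that can be run outside Rails. It assumes SQLite driven from Python and integer totals instead of decimals, and it adds a second, unpaid invoice (not in the example above) so the filter has something to reject.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoices (id INTEGER PRIMARY KEY, ref TEXT);
    CREATE TABLE lines    (id INTEGER PRIMARY KEY, invoice_id INTEGER, total INTEGER);
    CREATE TABLE payments (id INTEGER PRIMARY KEY, invoice_id INTEGER, total INTEGER);
    INSERT INTO invoices VALUES (1, 'paid'), (2, 'unpaid');
    INSERT INTO lines    VALUES (1, 1, 30), (2, 2, 50);
    INSERT INTO payments VALUES (1, 1, 10), (2, 1, 20), (3, 2, 20);
""")

# The original approach: join both child tables, then compare sums in HAVING.
# Invoice 1 yields two joined rows, so its line total is counted twice (60 vs 30).
joined = [row[0] for row in conn.execute("""
    SELECT invoices.id
    FROM invoices
    JOIN lines    ON lines.invoice_id = invoices.id
    JOIN payments ON payments.invoice_id = invoices.id
    GROUP BY invoices.id
    HAVING SUM(lines.total) = SUM(payments.total)
""")]

# Summing each child table before joining compares 30 with 30.
pre_aggregated = [row[0] for row in conn.execute("""
    SELECT i.id
    FROM invoices i
    JOIN (SELECT invoice_id, SUM(total) AS sum_total
          FROM lines GROUP BY invoice_id) l ON l.invoice_id = i.id
    JOIN (SELECT invoice_id, SUM(total) AS sum_total
          FROM payments GROUP BY invoice_id) p ON p.invoice_id = i.id
    WHERE l.sum_total = p.sum_total
    ORDER BY i.id
""")]

print(joined, pre_aggregated)
```

The joined version returns an empty list, which matches the empty array from the question, while the pre-aggregated version returns invoice 1.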

SQL JOIN, GROUP BY on three tables to get totals

I've inherited the following DB design. Tables are:
customers
---------
customerid
customernumber
invoices
--------
invoiceid
amount
invoicepayments
---------------
invoicepaymentid
invoiceid
paymentid
payments
--------
paymentid
customerid
amount
My query needs to return invoiceid, the invoice amount (in the invoices table), and the amount due (invoice amount minus any payments that have been made towards the invoice) for a given customernumber. A customer may have multiple invoices.
The following query gives me duplicate records when multiple payments are made to an invoice:
SELECT i.invoiceid, i.amount, i.amount - p.amount AS amountdue
FROM invoices i
LEFT JOIN invoicepayments ip ON i.invoiceid = ip.invoiceid
LEFT JOIN payments p ON ip.paymentid = p.paymentid
LEFT JOIN customers c ON p.customerid = c.customerid
WHERE c.customernumber = '100'
How can I solve this?
I am not sure I got you but this might be what you are looking for:
SELECT i.invoiceid, sum(case when i.amount is not null then i.amount else 0 end), sum(case when i.amount is not null then i.amount else 0 end) - sum(case when p.amount is not null then p.amount else 0 end) AS amountdue
FROM invoices i
LEFT JOIN invoicepayments ip ON i.invoiceid = ip.invoiceid
LEFT JOIN payments p ON ip.paymentid = p.paymentid
LEFT JOIN customers c ON p.customerid = c.customerid
WHERE c.customernumber = '100'
GROUP BY i.invoiceid
This would get you the amounts sums in case there are multiple payment rows for each invoice
Thank you very much for the replies!
Saggi Malachi, that query unfortunately sums the invoice amount in cases where there is more than one payment. Say there are two payments to a $39 invoice of $18 and $12. So rather than ending up with a result that looks like:
1 39.00 9.00
You'll end up with:
1 78.00 48.00
Charles Bretana, in the course of trimming my query down to the simplest possible query I (stupidly) omitted an additional table, customerinvoices, which provides a link between customers and invoices. This can be used to see invoices for which payments haven't been made.
After much struggling, I think that the following query returns what I need it to:
SELECT DISTINCT i.invoiceid, i.amount, ISNULL(i.amount - p.amount, i.amount) AS amountdue
FROM invoices i
LEFT JOIN invoicepayments ip ON i.invoiceid = ip.invoiceid
LEFT JOIN customerinvoices ci ON i.invoiceid = ci.invoiceid
LEFT JOIN (
SELECT invoiceid, SUM(p.amount) amount
FROM invoicepayments ip
LEFT JOIN payments p ON ip.paymentid = p.paymentid
GROUP BY ip.invoiceid
) p
ON p.invoiceid = ip.invoiceid
LEFT JOIN payments p2 ON ip.paymentid = p2.paymentid
LEFT JOIN customers c ON ci.customerid = c.customerid
WHERE c.customernumber='100'
Would you guys concur?
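For comparison, here is a pared-down sketch of the amount-due calculation using the $39 invoice with payments of $18 and $12. It assumes SQLite driven from Python, leaves out the customers and customerinvoices filter entirely, and adds a second invoice with no payments to show the outer join at work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoices        (invoiceid INTEGER PRIMARY KEY, amount REAL);
    CREATE TABLE payments        (paymentid INTEGER PRIMARY KEY, customerid INTEGER, amount REAL);
    CREATE TABLE invoicepayments (invoicepaymentid INTEGER PRIMARY KEY,
                                  invoiceid INTEGER, paymentid INTEGER);
    INSERT INTO invoices VALUES (1, 39.0), (2, 20.0);
    INSERT INTO payments VALUES (1, 100, 18.0), (2, 100, 12.0);
    INSERT INTO invoicepayments VALUES (1, 1, 1), (2, 1, 2);
""")

# Grouping after the join repeats the invoice amount once per payment row.
grouped_join = conn.execute("""
    SELECT i.invoiceid, SUM(i.amount),
           SUM(i.amount) - SUM(COALESCE(p.amount, 0))
    FROM invoices i
    LEFT JOIN invoicepayments ip ON ip.invoiceid = i.invoiceid
    LEFT JOIN payments p ON p.paymentid = ip.paymentid
    GROUP BY i.invoiceid
    ORDER BY i.invoiceid
""").fetchall()

# Totalling the payments per invoice first leaves one row per invoice,
# so neither DISTINCT nor an outer GROUP BY is needed.
pre_aggregated = conn.execute("""
    SELECT i.invoiceid, i.amount, i.amount - COALESCE(paid.total, 0)
    FROM invoices i
    LEFT JOIN (SELECT ip.invoiceid, SUM(p.amount) AS total
               FROM invoicepayments ip
               JOIN payments p ON p.paymentid = ip.paymentid
               GROUP BY ip.invoiceid) paid
           ON paid.invoiceid = i.invoiceid
    ORDER BY i.invoiceid
""").fetchall()

print(grouped_join)
print(pre_aggregated)
```

The grouped join reproduces the 78.00 / 48.00 row described above, and the pre-aggregated version gives 39.00 / 9.00 for invoice 1 and the full 20.00 due for the unpaid invoice 2.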
I have a tip for those, who want to get various aggregated values from the same table.
Lets say I have table with users and table with points the users acquire. So the connection between them is 1:N (one user, many points records).
Now in the table 'points' I also store what the user got the points for (login, clicking a banner, etc.). I want to list all users ordered by SUM(points) and then by SUM(points WHERE type = x), that is, ordered by all the points a user has and then by the points the user got for a specific action (e.g. login).
The SQL would be:
SELECT SUM(points.points) AS points_all, SUM(points.points * (points.type = 7)) AS points_login
FROM user
LEFT JOIN points ON user.id = points.user_id
GROUP BY user.id
The beauty of this is in SUM(points.points * (points.type = 7)), where the inner parenthesis evaluates to either 0 or 1, thus multiplying the given points value by 0 or 1 depending on whether it matches the type of points we want.
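One caveat: the multiplication only works where a comparison yields 0 or 1, as it does in MySQL and SQLite. SQL Server will not multiply by a comparison, so a CASE expression is the portable spelling. A quick sketch, assuming SQLite driven from Python, a users table in place of user, and made-up sample rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY);
    CREATE TABLE points (user_id INTEGER, type INTEGER, points INTEGER);
    INSERT INTO users VALUES (1), (2);
    -- type 7 stands for "login" here, as in the tip above
    INSERT INTO points VALUES (1, 7, 5), (1, 3, 10), (2, 7, 2);
""")

# Boolean multiplication: (p.type = 7) evaluates to 0 or 1 in SQLite and MySQL.
boolean_trick = conn.execute("""
    SELECT u.id,
           SUM(p.points) AS points_all,
           SUM(p.points * (p.type = 7)) AS points_login
    FROM users u
    LEFT JOIN points p ON p.user_id = u.id
    GROUP BY u.id
    ORDER BY points_all DESC, points_login DESC
""").fetchall()

# Portable form of the same conditional aggregate.
case_form = conn.execute("""
    SELECT u.id,
           SUM(p.points) AS points_all,
           SUM(CASE WHEN p.type = 7 THEN p.points ELSE 0 END) AS points_login
    FROM users u
    LEFT JOIN points p ON p.user_id = u.id
    GROUP BY u.id
    ORDER BY points_all DESC, points_login DESC
""").fetchall()

print(boolean_trick)
print(case_form)
```

Both queries return the same rows. Because only one child table is joined, there is no row multiplication to worry about here.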
First of all, shouldn't there be a CustomerId in the Invoices table? As it is, you can't perform this query for invoices that have no payments on them yet. If there are no payments on an invoice, that invoice will not even show up in the output of the query, even though it's an outer join...
Also, when a customer makes a payment, how do you know which invoice to attach it to? If the only way is by the InvoiceId on the stub that arrives with the payment, then you are (perhaps inappropriately) associating invoices with the customer that paid them, rather than with the customer that ordered them. (Sometimes an invoice can be paid by someone other than the customer who ordered the services.)
I know this is late, but it does answer your original question.
/*Read the comments the same way that SQL runs the query
1) FROM
2) GROUP
3) SELECT
4) My final notes at the bottom
*/
SELECT
list.invoiceid
, cust.customernumber
, MAX(list.inv_amount) AS invoice_amount/* we select the max because it will be the same for each payment to that invoice (presumably invoice amounts do not vary based on payment) */
, MAX(list.inv_amount) - SUM(list.pay_amount) AS [amount_due]
FROM
Customers AS cust
INNER JOIN
Payments AS pay
ON
pay.customerid = cust.customerid
INNER JOIN ( /* generate a list of payment_ids, their amounts, and the totals of the invoices they billed to*/
SELECT
inpay.paymentid AS paymentid
, inv.invoiceid AS invoiceid
, inv.amount AS inv_amount
, pay.amount AS pay_amount
FROM
InvoicePayments AS inpay
INNER JOIN
Invoices AS inv
ON inv.invoiceid = inpay.invoiceid
INNER JOIN
Payments AS pay
ON pay.paymentid = inpay.paymentid
) AS list
ON
list.paymentid = pay.paymentid
/* so at this point my result set would look like:
-- All my customers (crossed by) every paymentid they are associated to (I'll call this A)
-- Every invoice payment and its association to: its own amount, the total invoice amount, its own paymentid (what I call list)
-- Filter out all records in A that do not have a paymentid matching in (list)
-- we filter the result because there may be payments that did not go towards invoices!
*/
GROUP BY
/* we want a record line for each customer and invoice (or basically each invoice, but I believe this makes more sense logically) */
cust.customernumber
, list.invoiceid
/*
-- we can improve this query by only hitting the Payments table once by moving it inside of our list subquery,
-- but this is what made sense to me when I was planning.
-- Hopefully it makes it clearer how the thought process works to leave it in there
-- as several people have already pointed out, the data structure of the DB prevents us from looking at customers with invoices that have no payments towards them.
*/