Complex SQL Query Help - sql

I am trying to write a rather complex SQL query to produce a report. This is a database used by an inventory and accounting system.
Essentially I need to produce a report with the following columns
Month / Year (group results by month / year)
Reseller (order results by reseller within the month / year group)
Total sales - Sales - Hardware
Total sales - Sales - Consumables
The following tables will need to be used in the report:
Invoice
Reseller
Job
JobStockItem
Stock
Essentially the query would need to start as:
1. Select all invoices from Invoice
2. Get the reseller name from Reseller.Name (join on Reseller.ID with Invoice.CustomerID)
3. Get the associated job ID from Job table (join on Job.InvoiceID with Invoice.ID)
4. Get each component of the invoice from JobStockItem (join JobStockItem.JobID on Job.ID)
5. Get the stock item in the job from Stock (join JobStockItem.StockID on Stock.ID) and see if the category (Stock.Category1) is either Hardware or Consumables
6. If the stock item is hardware or consumables, use the sale price in the JobStockItem (JobStockItem.PriceExTax) and add it towards the total for the month of the reseller's purchases
The month and year come from the invoice date (Invoice.InvoiceDate).
Now I could produce this result myself by executing a bunch of queries and processing myself, one each for the above steps, but it's going to end up slow and I'm sure there'd have to be a query out there that could wrap all those requirements up and do it in one?
I have not attempted to do the query yet as to be honest, I don't know where to start - it's a lot more complex than anything I've done in the past.
I am just using Management Studio, not using Reporting Services, Crystal Reports or anything. My aim is to dump the output to HTML when I have it working.
Thanks heaps in advance.

I think if you LEFT JOIN to the JobStockItem table twice (once for hardware and once for consumables), you can manage all of that in one query. The final query will look something like this (I don't have my editor up right now, so apologies for any typos):
SELECT DATEPART(m, Invoice.InvoiceDate) month,
       DATEPART(yy, Invoice.InvoiceDate) year,
       Reseller.Name,
       SUM(jobstockitems_hardware.Price) sales_hardware,
       SUM(jobstockitems_consumables.Price) sales_consumables
FROM Invoice
INNER JOIN Reseller
    ON Invoice.CustomerID = Reseller.ID
INNER JOIN Job
    ON Invoice.ID = Job.InvoiceID
/* Hardware total per job */
LEFT JOIN (SELECT JobID, SUM(PriceExTax) Price
           FROM JobStockItem
           INNER JOIN Stock
               ON JobStockItem.StockID = Stock.ID
               AND Stock.Category1 = 'Hardware'
           GROUP BY JobID) jobstockitems_hardware
    ON Job.ID = jobstockitems_hardware.JobID
/* Consumables total per job */
LEFT JOIN (SELECT JobID, SUM(PriceExTax) Price
           FROM JobStockItem
           INNER JOIN Stock
               ON JobStockItem.StockID = Stock.ID
               AND Stock.Category1 = 'Consumables'
           GROUP BY JobID) jobstockitems_consumables
    ON Job.ID = jobstockitems_consumables.JobID
GROUP BY DATEPART(m, Invoice.InvoiceDate),
         DATEPART(yy, Invoice.InvoiceDate),
         Reseller.Name
ORDER BY DATEPART(yy, Invoice.InvoiceDate) ASC,
         DATEPART(m, Invoice.InvoiceDate) ASC,
         Reseller.Name ASC
I'm assuming that Reseller has a column called Name that you want to return as well; feel free to change that to ID or whatever else you'd rather return.
Edit: Fixed query to remove duplicates
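To see why summing per job inside each derived table avoids duplicated totals, here is a minimal sketch of the same shape of query run against SQLite from Python. The sample rows, the reseller names, and the use of strftime in place of DATEPART are assumptions made for illustration; the table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Reseller (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Invoice (ID INTEGER PRIMARY KEY, CustomerID INTEGER, InvoiceDate TEXT);
CREATE TABLE Job (ID INTEGER PRIMARY KEY, InvoiceID INTEGER);
CREATE TABLE Stock (ID INTEGER PRIMARY KEY, Category1 TEXT);
CREATE TABLE JobStockItem (JobID INTEGER, StockID INTEGER, PriceExTax REAL);

-- Hypothetical sample data
INSERT INTO Reseller VALUES (1, 'Acme'), (2, 'Bolt');
INSERT INTO Invoice VALUES (10, 1, '2010-01-15'), (11, 1, '2010-01-20'), (12, 2, '2010-02-03');
INSERT INTO Job VALUES (100, 10), (101, 11), (102, 12);
INSERT INTO Stock VALUES (1, 'Hardware'), (2, 'Consumables'), (3, 'Labour');
INSERT INTO JobStockItem VALUES
  (100, 1, 200.0), (100, 1, 50.0), (100, 2, 10.0),
  (101, 2, 5.0),   (101, 3, 80.0),
  (102, 1, 300.0);
""")

# Each derived table yields at most one row per job, so joining both of them
# to Job cannot multiply rows before the outer SUM runs.
report_sql = """
SELECT CAST(strftime('%Y', i.InvoiceDate) AS INTEGER) AS yr,
       CAST(strftime('%m', i.InvoiceDate) AS INTEGER) AS mon,
       r.Name,
       SUM(hw.Price) AS sales_hardware,
       SUM(cs.Price) AS sales_consumables
FROM Invoice i
JOIN Reseller r ON r.ID = i.CustomerID
JOIN Job j ON j.InvoiceID = i.ID
LEFT JOIN (SELECT jsi.JobID, SUM(jsi.PriceExTax) AS Price
           FROM JobStockItem jsi
           JOIN Stock s ON s.ID = jsi.StockID AND s.Category1 = 'Hardware'
           GROUP BY jsi.JobID) hw ON hw.JobID = j.ID
LEFT JOIN (SELECT jsi.JobID, SUM(jsi.PriceExTax) AS Price
           FROM JobStockItem jsi
           JOIN Stock s ON s.ID = jsi.StockID AND s.Category1 = 'Consumables'
           GROUP BY jsi.JobID) cs ON cs.JobID = j.ID
GROUP BY yr, mon, r.Name
ORDER BY yr, mon, r.Name
"""
rows = conn.execute(report_sql).fetchall()
for row in rows:
    print(row)
```

A reseller with no sales in a category gets NULL (None in Python) for that column; wrapping the SUM in COALESCE(..., 0) would turn that into a zero if the report needs one.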

Partly you want to PIVOT your results. So you can simply join your data together and let the PIVOT do the selecting and summarizing of categories:
select *
from (
select Year = datepart(year, i.InvoiceDate),
Month = datepart(month, i.InvoiceDate),
ResellerName = r.name,
StockCategory = s.Category1,
jsi.PriceExTax
from Invoice i
inner join Reseller r
on r.ID = i.CustomerID
inner join Job j
on j.InvoiceID = i.ID
inner join JobStockItem jsi
on jsi.JobID = j.ID
inner join Stock s
on s.ID = jsi.StockID
) d
pivot (sum(PriceExTax) for StockCategory in (Hardware, Consumables)) p
order by Year, Month, ResellerName;
In the end you'll need some conditions in the inner query (e.g. where i.InvoiceDate between ...).
Perhaps you have to multiply the PriceExTax with the amount in JobStockItem...
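PIVOT is not available in every database. As a rough sketch of what the pivot step does, the same reshaping can be written as conditional aggregation with SUM(CASE ...). The example below runs in SQLite from Python; the table d stands in for the inner query of the answer, and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Stand-in for the inner query "d": one row per sold stock item (hypothetical data)
CREATE TABLE d (Year INTEGER, Month INTEGER, ResellerName TEXT,
                StockCategory TEXT, PriceExTax REAL);
INSERT INTO d VALUES
  (2010, 1, 'Acme', 'Hardware',    200.0),
  (2010, 1, 'Acme', 'Hardware',     50.0),
  (2010, 1, 'Acme', 'Consumables',  15.0),
  (2010, 1, 'Acme', 'Labour',       80.0),
  (2010, 2, 'Bolt', 'Hardware',    300.0);
""")

# One output column per category; rows in other categories contribute NULL
# to both CASE expressions and are therefore ignored by SUM.
pivot_sql = """
SELECT Year, Month, ResellerName,
       SUM(CASE WHEN StockCategory = 'Hardware'    THEN PriceExTax END) AS Hardware,
       SUM(CASE WHEN StockCategory = 'Consumables' THEN PriceExTax END) AS Consumables
FROM d
GROUP BY Year, Month, ResellerName
ORDER BY Year, Month, ResellerName
"""
rows = conn.execute(pivot_sql).fetchall()
for row in rows:
    print(row)
```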

DPMattingley has produced a good query, but with one limitation - if a time period has no results, it won't show a row at all. This may well be an unlikely case in the specific example here but where it does happen it's annoying to find the report hides the zero results month!
My standard solution to this involves taking the month from a numbers table, which forces all time periods to appear. Using DPMattingley's query as a starting point -
select
    mths.month,
    mths.year,
    d.Name,
    d.sales_hardware,
    d.sales_consumables
from
    /*All the in-range months on which to report*/
    (select
        datepart(m,dateadd(m,number,minDate)) month,
        datepart(yy,dateadd(m,number,minDate)) year
    from
        numbers n
        inner join (select min(invoice.invoicedate) as minDate from invoice) m
            on n.number between 0 and datediff(m,minDate,getdate())) mths
    left join
    /*Data for each month*/
    (SELECT DATEPART(m, Invoice.InvoiceDate) month,
            DATEPART(yy, Invoice.InvoiceDate) year,
            Reseller.Name,
            SUM(jobstockitems_hardware.Price) sales_hardware,
            SUM(jobstockitems_consumables.Price) sales_consumables
     FROM
        Invoice
        INNER JOIN Reseller
            ON Invoice.CustomerID = Reseller.ID
        INNER JOIN Job
            ON Invoice.ID = Job.InvoiceID
        LEFT JOIN (SELECT JobID, SUM(PriceExTax) Price
                   FROM JobStockItem
                   INNER JOIN Stock
                       ON JobStockItem.StockID = Stock.ID
                       AND Stock.Category1 = 'Hardware'
                   GROUP BY JobID) jobstockitems_hardware
            ON Job.ID = jobstockitems_hardware.JobID
        LEFT JOIN (SELECT JobID, SUM(PriceExTax) Price
                   FROM JobStockItem
                   INNER JOIN Stock
                       ON JobStockItem.StockID = Stock.ID
                       AND Stock.Category1 = 'Consumables'
                   GROUP BY JobID) jobstockitems_consumables
            ON Job.ID = jobstockitems_consumables.JobID
     GROUP BY DATEPART(m, Invoice.InvoiceDate),
              DATEPART(yy, Invoice.InvoiceDate),
              Reseller.Name) d
        on mths.month=d.month
        and mths.year=d.year
/* d.Name, not Reseller.Name: Reseller is not visible outside the derived table */
ORDER BY mths.year ASC,
    mths.month ASC,
    d.Name ASC
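If no numbers table exists, a recursive CTE can play the same role. The sketch below (SQLite, run from Python) builds a three-month spine and left joins monthly totals onto it. The MonthlySales table, the fixed start date, and the three-month range are assumptions chosen to keep the example small; in the query above the range comes from the earliest invoice date and today's date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Stand-in for the per-month aggregate "d"; February has no sales (hypothetical data)
CREATE TABLE MonthlySales (yr INTEGER, mon INTEGER, Name TEXT, sales_hardware REAL);
INSERT INTO MonthlySales VALUES (2010, 1, 'Acme', 250.0), (2010, 3, 'Acme', 40.0);
""")

spine_sql = """
WITH RECURSIVE numbers(number) AS (
    SELECT 0
    UNION ALL
    SELECT number + 1 FROM numbers WHERE number < 2   -- 0, 1, 2
),
mths AS (
    -- Offset the start date by "number" months and split into year and month
    SELECT CAST(strftime('%Y', date('2010-01-01', '+' || number || ' months')) AS INTEGER) AS yr,
           CAST(strftime('%m', date('2010-01-01', '+' || number || ' months')) AS INTEGER) AS mon
    FROM numbers
)
SELECT mths.yr, mths.mon, d.Name, COALESCE(d.sales_hardware, 0) AS sales_hardware
FROM mths
LEFT JOIN MonthlySales d ON d.yr = mths.yr AND d.mon = mths.mon
ORDER BY mths.yr, mths.mon
"""
rows = conn.execute(spine_sql).fetchall()
for row in rows:
    print(row)
```

The February row appears with a NULL reseller name and a zero total, which is the behaviour the months spine is there to provide.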

Related

Show all customers who havent placed an order in a specific time frame

Fairly new to SQL and databases.... I have the following issue...
Database for a shipping company, three relevant tables for this problem are;
Customer (CustomerID, Name, Address etc)
Shipment (ShipmentID, ScheduleName, DepartDate, ArriveDate, DepartPort, ArrivePort)
CustomerOrder (CustomerID, ShipmentID, Fee etc)
I need to show
'All customers that have NOT placed an order within a specific year, i.e. 2019'. I'm struggling to find the correct query for this. I'm working on sample data: there are 8 customers in the database, 3 of which have orders for shipments departing in 2019. Therefore the result I'm hoping to get is a list of the 5 remaining customers who did not have orders on these shipments.
I can easily show customers who have placed an order and those who have never placed an order, but I'm struggling to show those who haven't placed an order in a specific timeframe.
Any ideas or tips would be greatly appreciated!
Thanks
EDIT***
Desired Result - Show the 5 customers who havent placed an order where the shipping depart date is in 2019
I have tried;
SELECT Customer.CustomerID FROM Customer
LEFT OUTER JOIN CustomerOrder
ON Customer.CustomerID = CustomerOrder.CustomerID
LEFT OUTER JOIN Shipment
ON CustomerOrder.ShipmentID = Shipment.ShipmentID
WHERE Shipment.DepartDate BETWEEN '2019-01-01' AND '2019-12-31'
AND CustomerOrder.CustomerID IS NULL
However, this brings back no results.
The three tables have the following information
Customer - (simply table of customer details) (CustomerID, Name, Address, Tel, Type, Size, RegisteredSince)
Shipment - (Details each shipment scheduled) (ShipmentID, ScheduleName, DepartDate, DepartPort, ArriveDate, ArrivePort, Season)
CustomerOrder - (Details customer order, but not individual items on an order) (OrderID, ShipmentID, CustomerID)
I'm not sure if it's the join that's the problem. There are 4 shipments in 2019 in the sample data with a total of 3 customers. I need to show the CustomerIDs of the 5 customers who didn't place an order within these dates.
I've tried a few different queries and have been searching online, but I'm new to this and not quite sure where I'm going wrong. I am able to identify customers who have never placed an order, but as soon as I add the date ranges I get no results.
Use NOT EXISTS:
SELECT c.*
FROM Customer c
WHERE NOT EXISTS (SELECT 1
FROM CustomerOrder co JOIN
Shipment sh
ON co.shipmentID = sh.ShipmentID
WHERE c.CustomerID = co.CustomerID AND
sh.DepartDate >= '2019-01-01' AND
sh.DepartDate < '2020-01-01'
);
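The difference between the attempted LEFT JOIN query and NOT EXISTS can be checked on a small data set. The sketch below uses SQLite from Python; the 8 customers, the shipments, and the orders are invented to mirror the description in the question (3 customers with 2019 shipments, one with a 2018 order only, four with no orders).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Shipment (ShipmentID INTEGER PRIMARY KEY, DepartDate TEXT);
CREATE TABLE CustomerOrder (OrderID INTEGER PRIMARY KEY, ShipmentID INTEGER, CustomerID INTEGER);

-- Hypothetical sample data
INSERT INTO Customer VALUES (1,'A'),(2,'B'),(3,'C'),(4,'D'),(5,'E'),(6,'F'),(7,'G'),(8,'H');
INSERT INTO Shipment VALUES (1,'2019-02-01'),(2,'2019-06-15'),(3,'2018-11-30'),(4,'2020-01-05');
INSERT INTO CustomerOrder VALUES (1,1,1),(2,2,2),(3,2,3),(4,3,4),(5,4,1);
""")

# The attempt from the question: a customer without orders has a NULL DepartDate,
# so the date filter in WHERE discards exactly the rows the IS NULL test wants.
attempt_sql = """
SELECT c.CustomerID
FROM Customer c
LEFT JOIN CustomerOrder co ON c.CustomerID = co.CustomerID
LEFT JOIN Shipment sh ON co.ShipmentID = sh.ShipmentID
WHERE sh.DepartDate BETWEEN '2019-01-01' AND '2019-12-31'
  AND co.CustomerID IS NULL
"""
attempt_ids = [r[0] for r in conn.execute(attempt_sql)]

# NOT EXISTS: keep a customer only if no order of theirs departs in 2019.
not_exists_sql = """
SELECT c.CustomerID
FROM Customer c
WHERE NOT EXISTS (SELECT 1
                  FROM CustomerOrder co
                  JOIN Shipment sh ON co.ShipmentID = sh.ShipmentID
                  WHERE co.CustomerID = c.CustomerID
                    AND sh.DepartDate >= '2019-01-01'
                    AND sh.DepartDate <  '2020-01-01')
ORDER BY c.CustomerID
"""
missing_ids = [r[0] for r in conn.execute(not_exists_sql)]
print(attempt_ids, missing_ids)
```

Customer 4 ordered only in 2018 and customers 5 to 8 never ordered, so all five are returned by the NOT EXISTS version while the original attempt returns nothing.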
I can't tell from your tables where the order date is stored, so change this as needed. I have used DepartDate to find orders whose year is different from 2019:
Example:
SELECT c.*
FROM Customer c
INNER JOIN CustomerOrder co ON c.CustomerID = co.CustomerID
INNER JOIN Shipment sh ON co.ShipmentID = sh.ShipmentID
WHERE YEAR(sh.DepartDate) <> 2019
This gets all customer data where the DepartDate year is not 2019.
I think this could be the solution that you want; here is the full query.
The subquery returns the CustomerIDs of everyone who placed an order in 2019, and NOT IN excludes them:
SELECT c.*
FROM Customer c
INNER JOIN CustomerOrder co ON c.CustomerID = co.CustomerID
INNER JOIN Shipment sh ON co.ShipmentID = sh.ShipmentID
WHERE YEAR(sh.DepartDate) <> 2019 AND c.CustomerID
NOT IN (SELECT c2.CustomerID
FROM Customer c2
INNER JOIN CustomerOrder co2 ON c2.CustomerID = co2.CustomerID
INNER JOIN Shipment sh2 ON co2.ShipmentID = sh2.ShipmentID
WHERE YEAR(sh2.DepartDate) = 2019)
Just do a left outer join on orders by customerId in the appropriate date range where the customerID doesn't exist:
select c.*
from Customer c
left outer join (
select customerID
from CustomerOrder o
join Shipment s on s.ShipmentID=o.ShipmentID
where DepartDate>='2020-01-01' and DepartDate<'2021-01-01'
) x on x.customerID=c.customerID
where x.customerid is null
The subquery selects every single customerID who ordered in the date range.
The left outer join and "customerid is null" does a negative match on those customers who don't exist in our subquery.

T-SQL Get all years from - to

I am trying to produce a list of the sales of each shop for each year from a database, but I want to print all years with 0 sales as well (for some years/shops there will be no data in my database).
I have written the following query, but it only returns years and shops which have non-zero values.
SELECT YEAR(temp.INVOICE_DATE) AS Year, Shop.Name, SUM(temp.QTY * Product.Price) AS Total
FROM (
SELECT pci.INVOICE_ID, ci.STORE_ID, ci.INVOICE_DATE, pci.PRODUCT_ID, pci.QTY
FROM [Product_Customer Invoice] pci, [Customer Invoice] ci
WHERE pci.INVOICE_ID = ci.INVOICE_ID
)AS temp, Product, Shop
WHERE Product.PRODUCT_ID = temp.PRODUCT_ID AND Shop.STORE_ID = temp.STORE_ID
GROUP BY YEAR(temp.INVOICE_DATE), Shop.Name
ORDER BY Year ASC
I get The following result
I would like to ask for any ideas on how to include a 0 for years or shops in which no sales have been made.
Use a cross join to generate the rows and then left join to bring in the data you want:
SELECT y.Year, s.Name,
       COALESCE(SUM(pci.QTY * p.Price), 0) AS Total
FROM Shop s CROSS JOIN
     (SELECT DISTINCT YEAR(INVOICE_DATE) AS Year FROM [Customer Invoice]
     ) y LEFT JOIN
     [Customer Invoice] ci
     ON YEAR(ci.INVOICE_DATE) = y.Year AND
        ci.STORE_ID = s.STORE_ID LEFT JOIN
     [Product_Customer Invoice] pci
     ON pci.INVOICE_ID = ci.INVOICE_ID LEFT JOIN
     Product p
     ON pci.PRODUCT_ID = p.PRODUCT_ID
GROUP BY y.Year, s.Name
ORDER BY y.Year, s.Name;
Notice that this also fixes your join syntax and removes the unnecessary subquery.
This assumes that all the years you want are in [Customer Invoice], which is where INVOICE_DATE and STORE_ID live in your query. If you want a different set of years, either explicitly list them using VALUES(), use a recursive CTE to generate them, or use a calendar table.
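The cross-join-then-left-join pattern can be seen in isolation on a reduced schema. In the sketch below (SQLite from Python) a single hypothetical Sale table replaces the invoice and product tables, so only the shape of the query carries over, not the exact column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Shop (STORE_ID INTEGER PRIMARY KEY, Name TEXT);
-- Hypothetical flattened sales table standing in for the invoice tables
CREATE TABLE Sale (STORE_ID INTEGER, SaleYear INTEGER, Amount REAL);
INSERT INTO Shop VALUES (1, 'North'), (2, 'South');
INSERT INTO Sale VALUES (1, 2019, 100.0), (1, 2020, 50.0), (2, 2020, 70.0);
""")

# The CROSS JOIN produces every (shop, year) pair; the LEFT JOIN then attaches
# sales where they exist, and COALESCE turns the missing ones into 0.
totals_sql = """
SELECT y.SaleYear, s.Name, COALESCE(SUM(sa.Amount), 0) AS Total
FROM Shop s
CROSS JOIN (SELECT DISTINCT SaleYear FROM Sale) y
LEFT JOIN Sale sa
       ON sa.SaleYear = y.SaleYear AND sa.STORE_ID = s.STORE_ID
GROUP BY y.SaleYear, s.Name
ORDER BY y.SaleYear, s.Name
"""
rows = conn.execute(totals_sql).fetchall()
for row in rows:
    print(row)
```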

Selecting the most frequent value in a column based on the value of another column in the same row?

So basically what I'm trying to do is generate a report for our stores. We have an incident report website where the employees can report an incident that takes place at any of our stores. So in the general report I'm trying to generate, I want to show the details for each store we have (Five stores). This would include the name of the store, number of incidents, oldest incident date, newest incident date, and then the most recurring type of incident at each store.
SELECT Store.Name AS [Store Name], COUNT(*) AS [No. Of Incidents], Min(CAST(DateNotified AS date)) AS [Oldest Incident], Max(CAST(DateNotified AS date)) AS [Latest Incident],
( SELECT TOP 1 IncidentType.Details
FROM IncidentDetails
INNER JOIN Store ON IncidentDetails.StoreID = Store.StoreID
INNER JOIN IncidentType On IncidentDetails.IncidentTypeID = IncidentType.IncidentTypeID
Group By IncidentType.Details, IncidentDetails.StoreID
Order By COUNT(IncidentType.Details) DESC) AS [Most Frequent Incident]
FROM IncidentDetails
INNER JOIN Store ON IncidentDetails.StoreID = Store.StoreID
INNER JOIN IncidentType On IncidentDetails.IncidentTypeID = IncidentType.IncidentTypeID
GROUP BY Store.Name
Just to make it clear, the IncidentDetails table stores all the details about the incident, including which store it occurred at, what the type of incident was, time/date, etc.
What this does though is it gives me 5 rows for each store, but the [Most Frequent Incident] value is the same for every row. Basically, it gets the most frequent incident value for the whole table, regardless of which store it came from, and then displays that for each store, even though different stores have different values for the column.
I've been trying to solve this for a while now but haven't been able to :-(
You have too many joins and no correlation clause.
There are several ways to approach this problem. You have already started with an aggregation in the outer query and then a nested subquery. So, this continues that approach. I think this does what you want:
SELECT s.Name AS [Store Name], COUNT(*) AS [No. Of Incidents],
Min(CAST(DateNotified AS date)) AS [Oldest Incident],
Max(CAST(DateNotified AS date)) AS [Latest Incident],
(SELECT TOP 1 it2.Details
FROM IncidentDetails id2 INNER JOIN
IncidentType it2
On id2.IncidentTypeID = it2.IncidentTypeID
WHERE id2.StoreId = s.StoreId
Group By it2.Details
Order By COUNT(*) DESC
) AS [Most Frequent Incident]
FROM IncidentDetails id INNER JOIN
Store s
ON id.StoreID = s.StoreID
GROUP BY s.Name, s.StoreId;
Notes:
Removed the IncidentType table from the outer joins. This doesn't seem needed (although it could be used for filtering).
Added s.StoreId to the group by clause. This is needed for the correlation in the subquery.
Added a where clause so the subquery is only processed once for each store in the outer query.
Removed the join to Store in the subquery. It seems unnecessary, if the queries can be correlated on StoreId.
Changed the group by in the subquery to use Details. That is the value being selected.
Added table aliases, which make queries easier to write and to read.
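A small runnable sketch of the correlated subquery, using SQLite from Python, shows each store getting its own most frequent incident type. SQLite has no TOP, so LIMIT 1 is used instead; the store names and incident rows are made up, and ties are avoided in the sample data because neither TOP 1 nor LIMIT 1 says which tied value wins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Store (StoreID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE IncidentType (IncidentTypeID INTEGER PRIMARY KEY, Details TEXT);
CREATE TABLE IncidentDetails (StoreID INTEGER, IncidentTypeID INTEGER);

-- Hypothetical sample data
INSERT INTO Store VALUES (1, 'Downtown'), (2, 'Mall');
INSERT INTO IncidentType VALUES (1, 'Theft'), (2, 'Spill');
INSERT INTO IncidentDetails VALUES (1, 1), (1, 1), (1, 2), (2, 2), (2, 2), (2, 1);
""")

# The WHERE clause inside the subquery ties it to the store of the outer row,
# so the "most frequent" value is computed per store, not for the whole table.
per_store_sql = """
SELECT s.Name,
       COUNT(*) AS incidents,
       (SELECT it2.Details
        FROM IncidentDetails id2
        JOIN IncidentType it2 ON it2.IncidentTypeID = id2.IncidentTypeID
        WHERE id2.StoreID = s.StoreID
        GROUP BY it2.Details
        ORDER BY COUNT(*) DESC
        LIMIT 1) AS most_frequent
FROM IncidentDetails id
JOIN Store s ON s.StoreID = id.StoreID
GROUP BY s.Name, s.StoreID
ORDER BY s.Name
"""
rows = conn.execute(per_store_sql).fetchall()
for row in rows:
    print(row)
```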

SQL server SELECT with join performance issue

Sorry about the saga here but am trying to explain everything.
We have 2 databases that I would like to join some tables in.
One database holds sales data from various stores/sites, in a table called ItemSales. It is quite large (over 3 million rows currently).
The other holds application data from an in house web app. These tables are Departments and GroupItems
I would like to create a query that joins 2 tables from the app database with the sales database table. This is so we can group some items together for a date range and see the amount sold for example.
My first attempt was (DealId being the variable that it is grouped on in the App):
SELECT d.Id, d.ItemNo, d.UnitValue, d.NoGST, d.ItemStartDate, d.ItemEndDate,
(SELECT SUM(ItemQty) AS Expr1
FROM Sales.dbo.ItemSales AS s
WHERE (Store = d.SiteId) AND (ItemNo = d.ItemNo) AND (ItemSaleDate >= d.ItemStartDate) AND (ItemSaleDate <= d.ItemEndDate)) AS ItemsSold, Sales.dbo.ItemSales.ItemDesc, Departments.Description
FROM Departments INNER JOIN
Sales.dbo.ItemSales ON Departments.Id = Sales.dbo.ItemSales.ItemDept RIGHT OUTER JOIN
GroupItems AS d ON Sales.dbo.ItemSales.ItemNo = d.ItemNo
WHERE (d.DealId = 11)
GROUP BY d.Id, d.ItemNo, d.UnitValue, d.NoGST, d.ItemStartDate, d.ItemEndDate, ItemDesc, Departments.Description, d.SiteId
ORDER BY d.Id
This does exactly what I want which is:
-Give me all the details from the GroupItems table (UnitValue, ItemStartDate, ItemEndDate etc)
-Gives me the SUM() on the ItemQty column for the amount sold (plus the description etc)
-Returns NULL for something with no sales for the period
It is VERY slow though. To the point that if the GroupItems table has more than about 7 items in it, it times out.
Second attempt has been:
SELECT d.Id, d.ItemNo, d.UnitValue, d.NoGST, d.ItemStartDate, d.ItemEndDate, SUM(ItemQty) AS ItemsSold, Sales.dbo.ItemSales.ItemDesc, Departments.Description
FROM Departments INNER JOIN
Sales.dbo.ItemSales ON Departments.Id = Sales.dbo.ItemSales.ItemDept RIGHT OUTER JOIN
GroupItems AS d ON Sales.dbo.ItemSales.ItemNo = d.ItemNo
WHERE (Store = d.SiteId) AND (d.DealId = 11) AND (Sales.dbo.ItemSales.ItemSaleDate >= d.ItemStartDate) AND (Sales.dbo.ItemSales.ItemSaleDate <= d.ItemEndDate)
GROUP BY d.Id, d.ItemNo, d.UnitValue, d.NoGST, d.ItemStartDate, d.ItemEndDate, ItemDesc, Departments.Description
ORDER BY d.Id
This is very quick and does not time out but does not return the NULLs for no sales items in the ItemSales table. This is a problem as we need to see nothing or 0 for a no sales item otherwise people will think we forgot to check that item.
Can someone help me come up with a query please that returns everything from the GroupItems table, shows the SUM() of items sold and doesn't time out? I have also tried a SELECT x WHERE EXISTS (Subquery) but this also didn't return the NULLs for me but I may have had that one wrong.
If you want everything from GroupItems regardless of the sales, use it as the base of the query and then use left outer joins from there. Something along these lines:
SELECT GroupItems.Id, GroupItems.ItemNo, GroupItems.UnitValue, GroupItems.NoGST,
       GroupItems.ItemStartDate, GroupItems.ItemEndDate,
       Sales.ItemDesc,
       SUM(Sales.ItemQty) AS SumOfSales,
       Departments.Description
FROM GroupItems
/* All sales filters go in the ON clause so unmatched GroupItems rows survive */
LEFT OUTER JOIN Sales.dbo.ItemSales AS Sales ON
    Sales.ItemNo = GroupItems.ItemNo
    AND Sales.Store = GroupItems.SiteId
    AND Sales.ItemSaleDate >= GroupItems.ItemStartDate
    AND Sales.ItemSaleDate <= GroupItems.ItemEndDate
LEFT OUTER JOIN Departments ON Departments.Id = Sales.ItemDept
WHERE GroupItems.DealId = 11
GROUP BY GroupItems.Id, GroupItems.ItemNo, GroupItems.UnitValue, GroupItems.NoGST,
         GroupItems.ItemStartDate, GroupItems.ItemEndDate,
         Sales.ItemDesc,
         Departments.Description
ORDER BY GroupItems.Id
Does changing the INNER JOIN to Sales.dbo.ItemSales into a LEFT OUTER JOIN to Sales.dbo.ItemSales and changing the RIGHT OUTER JOIN to GroupItems into an INNER JOIN to GroupItems fix your issue?
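The reason the second attempt lost its no-sales rows is that conditions on the outer-joined table were placed in WHERE, which filters out the NULL-extended rows after the join. The sketch below (SQLite from Python) contrasts the two placements on a cut-down version of the tables; the rows are invented, and the Departments join and DealId filter are left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE GroupItems (Id INTEGER PRIMARY KEY, ItemNo TEXT, SiteId INTEGER,
                         ItemStartDate TEXT, ItemEndDate TEXT);
CREATE TABLE ItemSales (ItemNo TEXT, Store INTEGER, ItemSaleDate TEXT, ItemQty INTEGER);

-- Hypothetical sample data: item B has a sale, but outside its date window
INSERT INTO GroupItems VALUES (1, 'A', 1, '2020-01-01', '2020-01-31'),
                              (2, 'B', 1, '2020-01-01', '2020-01-31');
INSERT INTO ItemSales VALUES ('A', 1, '2020-01-10', 3),
                             ('A', 1, '2020-01-20', 2),
                             ('B', 1, '2020-03-01', 9);
""")

# Filters in ON: an item with no qualifying sale keeps its row, with NULL sales.
on_sql = """
SELECT g.Id, SUM(s.ItemQty) AS ItemsSold
FROM GroupItems g
LEFT JOIN ItemSales s
       ON s.ItemNo = g.ItemNo
      AND s.Store = g.SiteId
      AND s.ItemSaleDate >= g.ItemStartDate
      AND s.ItemSaleDate <= g.ItemEndDate
GROUP BY g.Id
ORDER BY g.Id
"""
on_rows = conn.execute(on_sql).fetchall()

# Same filters in WHERE: rows that fail them are removed after the join,
# which effectively turns the outer join into an inner join.
where_sql = """
SELECT g.Id, SUM(s.ItemQty) AS ItemsSold
FROM GroupItems g
LEFT JOIN ItemSales s ON s.ItemNo = g.ItemNo
WHERE s.Store = g.SiteId
  AND s.ItemSaleDate >= g.ItemStartDate
  AND s.ItemSaleDate <= g.ItemEndDate
GROUP BY g.Id
ORDER BY g.Id
"""
where_rows = conn.execute(where_sql).fetchall()
print(on_rows, where_rows)
```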

SQL JOIN, GROUP BY on three tables to get totals

I've inherited the following DB design. Tables are:
customers
---------
customerid
customernumber
invoices
--------
invoiceid
amount
invoicepayments
---------------
invoicepaymentid
invoiceid
paymentid
payments
--------
paymentid
customerid
amount
My query needs to return invoiceid, the invoice amount (in the invoices table), and the amount due (invoice amount minus any payments that have been made towards the invoice) for a given customernumber. A customer may have multiple invoices.
The following query gives me duplicate records when multiple payments are made to an invoice:
SELECT i.invoiceid, i.amount, i.amount - p.amount AS amountdue
FROM invoices i
LEFT JOIN invoicepayments ip ON i.invoiceid = ip.invoiceid
LEFT JOIN payments p ON ip.paymentid = p.paymentid
LEFT JOIN customers c ON p.customerid = c.customerid
WHERE c.customernumber = '100'
How can I solve this?
I am not sure I got you but this might be what you are looking for:
SELECT i.invoiceid, sum(case when i.amount is not null then i.amount else 0 end), sum(case when i.amount is not null then i.amount else 0 end) - sum(case when p.amount is not null then p.amount else 0 end) AS amountdue
FROM invoices i
LEFT JOIN invoicepayments ip ON i.invoiceid = ip.invoiceid
LEFT JOIN payments p ON ip.paymentid = p.paymentid
LEFT JOIN customers c ON p.customerid = c.customerid
WHERE c.customernumber = '100'
GROUP BY i.invoiceid
This would get you the amounts sums in case there are multiple payment rows for each invoice
Thank you very much for the replies!
Saggi Malachi, that query unfortunately sums the invoice amount in cases where there is more than one payment. Say there are two payments to a $39 invoice of $18 and $12. So rather than ending up with a result that looks like:
1 39.00 9.00
You'll end up with:
1 78.00 48.00
Charles Bretana, in the course of trimming my query down to the simplest possible query I (stupidly) omitted an additional table, customerinvoices, which provides a link between customers and invoices. This can be used to see invoices for which payments haven't been made.
After much struggling, I think that the following query returns what I need it to:
SELECT DISTINCT i.invoiceid, i.amount, ISNULL(i.amount - p.amount, i.amount) AS amountdue
FROM invoices i
LEFT JOIN invoicepayments ip ON i.invoiceid = ip.invoiceid
LEFT JOIN customerinvoices ci ON i.invoiceid = ci.invoiceid
LEFT JOIN (
SELECT invoiceid, SUM(p.amount) amount
FROM invoicepayments ip
LEFT JOIN payments p ON ip.paymentid = p.paymentid
GROUP BY ip.invoiceid
) p
ON p.invoiceid = ip.invoiceid
LEFT JOIN payments p2 ON ip.paymentid = p2.paymentid
LEFT JOIN customers c ON ci.customerid = c.customerid
WHERE c.customernumber='100'
Would you guys concur?
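The double counting described above, and the effect of summing payments per invoice before joining, can be reproduced with the $39 invoice and its $18 and $12 payments. This is a minimal sketch in SQLite run from Python; the second invoice is an added assumption to show an unpaid invoice, and the customer tables are left out to keep the focus on the aggregation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE invoices (invoiceid INTEGER PRIMARY KEY, amount REAL);
CREATE TABLE payments (paymentid INTEGER PRIMARY KEY, amount REAL);
CREATE TABLE invoicepayments (invoicepaymentid INTEGER PRIMARY KEY,
                              invoiceid INTEGER, paymentid INTEGER);

INSERT INTO invoices VALUES (1, 39.0), (2, 20.0);   -- invoice 2 is hypothetical, unpaid
INSERT INTO payments VALUES (1, 18.0), (2, 12.0);
INSERT INTO invoicepayments VALUES (1, 1, 1), (2, 1, 2);
""")

# Summing after the join: invoice 1 appears once per payment, so its amount is added twice.
naive_sql = """
SELECT i.invoiceid,
       SUM(i.amount) AS amount,
       SUM(i.amount) - SUM(COALESCE(p.amount, 0)) AS amountdue
FROM invoices i
LEFT JOIN invoicepayments ip ON i.invoiceid = ip.invoiceid
LEFT JOIN payments p ON ip.paymentid = p.paymentid
GROUP BY i.invoiceid
ORDER BY i.invoiceid
"""
naive_rows = conn.execute(naive_sql).fetchall()

# Summing payments per invoice first: the derived table has one row per invoice,
# so the join cannot duplicate the invoice row.
fixed_sql = """
SELECT i.invoiceid,
       i.amount,
       i.amount - COALESCE(paid.total, 0) AS amountdue
FROM invoices i
LEFT JOIN (SELECT ip.invoiceid, SUM(p.amount) AS total
           FROM invoicepayments ip
           JOIN payments p ON p.paymentid = ip.paymentid
           GROUP BY ip.invoiceid) paid ON paid.invoiceid = i.invoiceid
ORDER BY i.invoiceid
"""
fixed_rows = conn.execute(fixed_sql).fetchall()
print(naive_rows, fixed_rows)
```

With the payments pre-aggregated there is no need for DISTINCT, since each invoice yields exactly one row.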
I have a tip for those, who want to get various aggregated values from the same table.
Lets say I have table with users and table with points the users acquire. So the connection between them is 1:N (one user, many points records).
Now in the table 'points' I also store the information about for what did the user get the points (login, clicking a banner etc.). And I want to list all users ordered by SUM(points) AND then by SUM(points WHERE type = x). That is to say ordered by all the points user has and then by points the user got for a specific action (eg. login).
The SQL would be:
SELECT SUM(points.points) AS points_all, SUM(points.points * (points.type = 7)) AS points_login
FROM user
LEFT JOIN points ON user.id = points.user_id
GROUP BY user.id
The beauty of this is in the SUM(points.points * (points.type = 7)), where the inner parenthesis evaluates to either 0 or 1, thus multiplying the given points value by 0 or 1 depending on whether it equals the type of points we want.
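The multiplication trick relies on the database treating a comparison as the integer 0 or 1, which MySQL and SQLite do but SQL Server does not. A CASE expression gives the same totals portably. The sketch below checks both forms in SQLite from Python; the table is named users rather than user, and the rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE points (user_id INTEGER, type INTEGER, points INTEGER);

-- Hypothetical sample data; type 7 stands for "login" as in the post
INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
INSERT INTO points VALUES (1, 7, 10), (1, 3, 5), (2, 3, 4);
""")

# Comparison used as a 0/1 multiplier (works where booleans are integers).
bool_sql = """
SELECT u.id,
       SUM(p.points) AS points_all,
       SUM(p.points * (p.type = 7)) AS points_login
FROM users u
LEFT JOIN points p ON u.id = p.user_id
GROUP BY u.id
ORDER BY u.id
"""
bool_rows = conn.execute(bool_sql).fetchall()

# Portable equivalent with CASE.
case_sql = """
SELECT u.id,
       SUM(p.points) AS points_all,
       SUM(CASE WHEN p.type = 7 THEN p.points ELSE 0 END) AS points_login
FROM users u
LEFT JOIN points p ON u.id = p.user_id
GROUP BY u.id
ORDER BY u.id
"""
case_rows = conn.execute(case_sql).fetchall()
print(bool_rows, case_rows)
```

The two forms can differ for a user with no points rows at all: the multiplication yields NULL there, while the CASE with ELSE 0 yields 0.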
First of all, shouldn't there be a CustomerId in the Invoices table? As it is, you can't perform this query for invoices that have no payments on them as yet. If there are no payments on an invoice, that invoice will not even show up in the output of the query, even though it's an outer join.
Also, when a customer makes a payment, how do you know what invoice to attach it to? If the only way is by the InvoiceId on the stub that arrives with the payment, then you are (perhaps inappropriately) associating invoices with the customer that paid them, rather than with the customer that ordered them. (Sometimes an invoice can be paid by someone other than the customer who ordered the services.)
I know this is late, but it does answer your original question.
/*Read the comments the same way that SQL runs the query
1) FROM
2) GROUP
3) SELECT
4) My final notes at the bottom
*/
SELECT
list.invoiceid
, cust.customernumber
, MAX(list.inv_amount) AS invoice_amount/* we select the max because it will be the same for each payment to that invoice (presumably invoice amounts do not vary based on payment) */
, MAX(list.inv_amount) - SUM(list.pay_amount) AS [amount_due]
FROM
Customers AS cust
INNER JOIN
Payments AS pay
ON
pay.customerid = cust.customerid
INNER JOIN ( /* generate a list of payment_ids, their amounts, and the totals of the invoices they billed to*/
SELECT
inpay.paymentid AS paymentid
, inv.invoiceid AS invoiceid
, inv.amount AS inv_amount
, pay.amount AS pay_amount
FROM
InvoicePayments AS inpay
INNER JOIN
Invoices AS inv
ON inv.invoiceid = inpay.invoiceid
INNER JOIN
Payments AS pay
ON pay.paymentid = inpay.paymentid
) AS list
ON
list.paymentid = pay.paymentid
/* so at this point my result set would look like:
-- All my customers (crossed by) every paymentid they are associated to (I'll call this A)
-- Every invoice payment and its association to: its own amount, the total invoice amount, its own paymentid (what I call list)
-- Filter out all records in A that do not have a paymentid matching in (list)
-- we filter the result because there may be payments that did not go towards invoices!
*/
GROUP BY
/* we want a record line for each customer and invoice (or basically each invoice, but I believe this makes more sense logically) */
cust.customernumber
, list.invoiceid
/*
-- we can improve this query by only hitting the Payments table once by moving it inside of our list subquery,
-- but this is what made sense to me when I was planning.
-- Hopefully it makes it clearer how the thought process works to leave it in there
-- as several people have already pointed out, the data structure of the DB prevents us from looking at customers with invoices that have no payments towards them.
*/