SUM on LEFT OUTER JOIN on 2 different child tables - sql

I have the following table schema
What I want to do is return all invoices with the following summarized totals
Total of UnitPrice * Qty for all Line items (e.g. SUM(UnitPrice * Qty))
Total of Amount for all Additional Costs (e.g. SUM(Amount))
Given that an Invoice can exist without any Line Items or Additional Costs I thought all I would need to do here is use a LEFT OUTER JOIN i.e.
SELECT i.*, SUM(li.UnitPrice * li.Qty) As [Sub Total], SUM(ac.Amount) As [AdditionalCosts]
FROM Invoice i
LEFT OUTER JOIN LineItem li ON li.InvoiceId = i.Id
LEFT OUTER JOIN AdditionalCost ac ON ac.InvoiceId = i.Id
GROUP BY i.Id
However, if the two child tables have different numbers of rows for an invoice (e.g. 4 line items but only 1 additional cost), the additional cost data is repeated across every line item row, and vice versa. You can verify this by removing the GROUP BY.
So effectively what happens is for the following record
Invoice
-------
400001
LineItem
---------
400001 | 2000 | 100 | 1
400001 | 2001 | 50 | 2
400001 | 2002 | 10 | 10
400001 | 2003 | 20 | 5
AdditionalCost
--------------
1 | 400001 | 30
2 | 400001 | 70
My result set would look like
Id | Sub Total | Additional Costs
--------------------------------------
400001 | 800 | 400 <-- these should be 400 and 100
How can I calculate the SUM of each table independently and combine them into a single master record?

This should work:
SELECT i.*, li.[Sub Total], ac.[AdditionalCosts]
FROM Invoice i
LEFT OUTER JOIN (SELECT InvoiceID, SUM(UnitPrice*Qty) As [Sub Total]
FROM LineItem
GROUP BY InvoiceID) li
ON li.InvoiceId = i.Id
LEFT OUTER JOIN (SELECT InvoiceID, SUM(Amount) As [AdditionalCosts]
FROM AdditionalCost
GROUP BY InvoiceID) ac
ON ac.InvoiceId = i.Id
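The pre-aggregation approach above can be tried out with a small sketch. It uses Python's built-in sqlite3 module and the sample rows from the question. The column layout of LineItem (InvoiceId, Id, UnitPrice, Qty), the alias SubTotal, the extra childless invoice 400002, and the COALESCE to 0 are assumptions added for illustration, not part of the original schema.

```python
import sqlite3

def invoice_totals():
    """Sum each child table on its own, then join the totals to Invoice."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Invoice (Id INTEGER PRIMARY KEY);
        -- Column order is assumed from the sample rows in the question.
        CREATE TABLE LineItem (InvoiceId INTEGER, Id INTEGER, UnitPrice REAL, Qty INTEGER);
        CREATE TABLE AdditionalCost (Id INTEGER, InvoiceId INTEGER, Amount REAL);
        -- 400002 is a hypothetical invoice with no child rows.
        INSERT INTO Invoice VALUES (400001), (400002);
        INSERT INTO LineItem VALUES
            (400001, 2000, 100, 1), (400001, 2001, 50, 2),
            (400001, 2002, 10, 10), (400001, 2003, 20, 5);
        INSERT INTO AdditionalCost VALUES (1, 400001, 30), (2, 400001, 70);
    """)
    rows = conn.execute("""
        SELECT i.Id,
               COALESCE(li.SubTotal, 0)        AS SubTotal,
               COALESCE(ac.AdditionalCosts, 0) AS AdditionalCosts
        FROM Invoice i
        LEFT JOIN (SELECT InvoiceId, SUM(UnitPrice * Qty) AS SubTotal
                   FROM LineItem GROUP BY InvoiceId) li ON li.InvoiceId = i.Id
        LEFT JOIN (SELECT InvoiceId, SUM(Amount) AS AdditionalCosts
                   FROM AdditionalCost GROUP BY InvoiceId) ac ON ac.InvoiceId = i.Id
        ORDER BY i.Id
    """).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    for row in invoice_totals():
        print(row)
```

Because each derived table has at most one row per invoice, the joins can no longer multiply rows. Without the COALESCE, an invoice with no children would show NULL instead of 0.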

How to prevent duplicates when getting sum of multiple columns with multiple joins

Let's say I have 3 tables: Invoices, Charges, and Payments. Invoices can have multiple charges, and charges can have multiple payments.
Doing a simple join, data would look like this:
invoiceid | chargeid | charge | payment
----------------------------------
1 | 1 | 50 | 50
2 | 2 | 100 | 25
2 | 2 | 100 | 75
2 | 3 | 30 | 10
2 | 3 | 30 | 5
If I do a join with sums:
select invoiceid, sum(charge), sum(payment)
from invoices i
inner join charges c on i.invoiceid = c.invoiceid
inner join payments p on p.chargeid = c.chargeid
group by invoiceid
The sum of payments would be correct but charges would include duplicates:
invoiceid | charges | payments
--------------------------------------
1 | 50 | 50
2 | 260 | 115
I want a query to get a list of invoices with the sum of payments and sum of charges per invoice, like this:
invoiceid | charges | payments
--------------------------------------
1 | 50 | 50
2 | 130 | 115
Is there any way to do this by modifying the query above WITHOUT using subqueries since subqueries can be quite slow when dealing with a large amount of data? I feel like there must be a way to only include unique charges in the sum.
You can also achieve this by using LATERAL JOINS
SELECT
i.invoiceid,
chgs.total_charges,
pays.total_payments
FROM
invoices AS i
JOIN LATERAL (
SELECT
SUM( charge ) AS total_charges
FROM
charges AS c
WHERE
c.invoiceid = i.invoiceid
) AS chgs ON TRUE
JOIN LATERAL (
SELECT
SUM( p.payment ) AS total_payments
FROM
charges AS c
JOIN payments AS p ON p.chargeid = c.chargeid
WHERE
c.invoiceid = i.invoiceid
) AS pays ON TRUE
One way is to aggregate each table before joining, grouping on the join key:
SELECT i.invoiceid, c.SumOfCharge, p.SumOfPayment
FROM invoices i
INNER JOIN (SELECT invoiceid, SUM(charge) AS SumOfCharge
FROM charges
GROUP BY invoiceid) c
ON i.invoiceid = c.invoiceid
INNER JOIN (SELECT c.invoiceid, SUM(p.payment) AS SumOfPayment
FROM charges c
INNER JOIN payments p ON p.chargeid = c.chargeid
GROUP BY c.invoiceid) p
ON i.invoiceid = p.invoiceid
Another way is to compute the charge total inline per invoice with a correlated subquery:
SELECT i.invoiceid
, (SELECT SUM(charge) FROM charges c2 WHERE c2.invoiceid = i.invoiceid) SumOfCharge
, SUM(p.payment) SumOfPayment
FROM invoices i
INNER JOIN charges c
ON i.invoiceid = c.invoiceid
INNER JOIN payments p
ON p.chargeid = c.chargeid
GROUP BY i.invoiceid
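If no two charges on the same invoice ever have the same amount, SUM(DISTINCT ...) over the joined rows also works. Note that DISTINCT would collapse two different charges that happen to have equal amounts:
select invoiceid, sum(distinct charge) as charges, sum(payment) as payments
from yourtable
group by invoiceid;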
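To compare the naive join with the pre-aggregated version, here is a minimal sketch using Python's sqlite3 module and the sample data from the question. The table layouts (including the paymentid column) are assumptions, since the question only shows the joined rows.

```python
import sqlite3

def charge_payment_totals():
    """Return (naive, fixed) per-invoice totals for the sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Table layouts are assumed; the question only shows joined output.
        CREATE TABLE invoices (invoiceid INTEGER PRIMARY KEY);
        CREATE TABLE charges  (chargeid INTEGER PRIMARY KEY, invoiceid INTEGER, charge INTEGER);
        CREATE TABLE payments (paymentid INTEGER PRIMARY KEY, chargeid INTEGER, payment INTEGER);
        INSERT INTO invoices VALUES (1), (2);
        INSERT INTO charges  VALUES (1, 1, 50), (2, 2, 100), (3, 2, 30);
        INSERT INTO payments VALUES (1, 1, 50), (2, 2, 25), (3, 2, 75), (4, 3, 10), (5, 3, 5);
    """)
    # Naive version: each charge is repeated once per payment, so it is double counted.
    naive = conn.execute("""
        SELECT i.invoiceid, SUM(c.charge), SUM(p.payment)
        FROM invoices i
        JOIN charges c  ON c.invoiceid = i.invoiceid
        JOIN payments p ON p.chargeid = c.chargeid
        GROUP BY i.invoiceid ORDER BY i.invoiceid
    """).fetchall()
    # Fixed version: aggregate charges and payments separately, then join the totals.
    fixed = conn.execute("""
        SELECT i.invoiceid, c.charges, p.payments
        FROM invoices i
        JOIN (SELECT invoiceid, SUM(charge) AS charges
              FROM charges GROUP BY invoiceid) c ON c.invoiceid = i.invoiceid
        JOIN (SELECT c.invoiceid, SUM(p.payment) AS payments
              FROM charges c JOIN payments p ON p.chargeid = c.chargeid
              GROUP BY c.invoiceid) p ON p.invoiceid = i.invoiceid
        ORDER BY i.invoiceid
    """).fetchall()
    conn.close()
    return naive, fixed

if __name__ == "__main__":
    naive, fixed = charge_payment_totals()
    print("naive:", naive)
    print("fixed:", fixed)
```

The derived tables are still subqueries, but each is a single grouped scan rather than a per-row lookup, which is usually what the performance concern is about.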

Creating view from item table and price table

Let's say I have two tables. One is the orders table and the other is a price table giving the price of an item w.r.t. the number of items ordered.
The meaning of the test_price table is "up-to count items(inclusive) the cost per item is price". So 0-50 items cost 1.22 per item. 51-100 cost 1.20 per item.
table test_price
id | count | price
----+-------+-------
1 | 50 | 1.22
2 | 100 | 1.20
3 | 150 | 1.19
4 | 200 | 1.18
5 | 300 | 1.10
table test_orders
id | count
----+-------
1 | 12
2 | 50
3 | 65
4 | 155
5 | 400
So this means that order 1 for 12 items should be priced at 1.22 per item.
Order 5 should be priced at 1.10 per item.
I can get the price for a single order with
SELECT price FROM test_price WHERE
count >= (SELECT count FROM test_orders WHERE id = 1)
ORDER BY count ASC LIMIT 1;
I would like to create a view that shows the orders with unit price and total price as columns
Join the tables on the count columns applying your conditions:
with cte as (select max(count) maxcount from test_price)
select
o.*, o.count * p.price total_price
from test_orders o inner join test_price p
on p.count = coalesce(
(select min(count) from test_price where o.count <= count),
(select maxcount from cte)
)
order by o.id;
Results:
| id | count | total_price |
| --- | ----- | ----------- |
| 1 | 12 | 14.64 |
| 2 | 50 | 61 |
| 3 | 65 | 78 |
| 4 | 155 | 182.9 |
| 5 | 400 | 440 |
I think your test_prices table is not quite right. I think it should be:
id | count | price
----+-------+-------
1 | 1 | 1.22
2 | 100 | 1.20
3 | 150 | 1.19
4 | 200 | 1.18
5 | 300 | 1.10
This says that for a quantity of 1-99, the price is $1.22. For 100-149, the price is $1.20, and so on.
With this structure, you can use a lateral join:
select o.*, p.price
from test_orders o left join lateral
(select p.*
from test_prices p
where p.count <= o.count
order by p.count desc
fetch first 1 row only
) p
on 1=1
The problem with your data is that the test_price rows are not all "up to count": the 5th row also means "up to and beyond".
If you are able to add another row to test_price like (6, 10000, 1.10) then the solution is easy:
CREATE VIEW price_view AS
SELECT orders.id, orders.count, p.price, orders.count * p.price as total
FROM (
SELECT o.id, o.count, min(p.count) as pricebracket
FROM test_price p
LEFT JOIN test_orders o ON p.count >= o.count
GROUP BY o.id, o.count) orders
LEFT JOIN test_price p ON orders.pricebracket = p.count
ORDER BY orders.id
If you cannot remove this inconsistency from your data, you have to select the maximum count first and cap each order's count at it. The query then becomes:
CREATE VIEW price_view AS
WITH temptable as (SELECT max(p.count) as maxcount FROM test_price p)
SELECT orders.id, orders.count, p.price, orders.count * p.price as total
FROM (
SELECT o.id, o.count, min(p.count) as pricebracket
FROM test_price p
CROSS JOIN temptable
LEFT JOIN test_orders o ON p.count >=
(case when o.count > temptable.maxcount then temptable.maxcount else o.count end)
GROUP BY o.id, o.count) orders
LEFT JOIN test_price p ON orders.pricebracket = p.count
ORDER BY orders.id
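As a rough way to check the bracket logic discussed above, the following sketch builds a view in SQLite through Python's sqlite3 module. It is not any of the answers verbatim: the view name order_prices, the column name unit_price, and the use of a correlated subquery with a COALESCE fallback to the highest bracket are choices made for this illustration.

```python
import sqlite3

def build_price_view():
    """Create the sample tables plus a view giving each order its unit price."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE test_price  (id INTEGER PRIMARY KEY, count INTEGER, price REAL);
        CREATE TABLE test_orders (id INTEGER PRIMARY KEY, count INTEGER);
        INSERT INTO test_price  VALUES (1, 50, 1.22), (2, 100, 1.20), (3, 150, 1.19),
                                       (4, 200, 1.18), (5, 300, 1.10);
        INSERT INTO test_orders VALUES (1, 12), (2, 50), (3, 65), (4, 155), (5, 400);

        -- order_prices and unit_price are hypothetical names.
        CREATE VIEW order_prices AS
        SELECT o.id, o.count,
               COALESCE(
                   -- smallest bracket whose upper bound covers the order
                   (SELECT p.price FROM test_price p
                    WHERE p.count >= o.count ORDER BY p.count ASC LIMIT 1),
                   -- orders above every bracket fall back to the last one
                   (SELECT p.price FROM test_price p ORDER BY p.count DESC LIMIT 1)
               ) AS unit_price
        FROM test_orders o;
    """)
    return conn

def order_prices(conn):
    """Return (id, count, unit_price, total_price) per order."""
    return conn.execute("""
        SELECT id, count, unit_price, count * unit_price AS total_price
        FROM order_prices ORDER BY id
    """).fetchall()

if __name__ == "__main__":
    for row in order_prices(build_price_view()):
        print(row)
```

The totals are floating point values, so round them for display if exact two-decimal output matters.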

Check if two products belong to same invoice

I have a table of Invoices, InvoicesLines and Products, and I'd like to select all the invoices that have two different products.
How can I do this?
Example:
Invoices table:
InvoiceNo | CustomerId | ...
=============+===============+========+
1 | 1 |
2 | 2 |
3 | 5 |
4 | 7 |
InvoicesLines table:
InvoiceNo (FK) | Id (PK) | ProductId |
===============+===============+============+
1 | 1 | 3 |
2 | 2 | 1 |
2 | 3 | 2 |
4 | 4 | 5 |
I need the invoices which have product 1 and 2:
InvoiceNo |
============+
2 |
You can join Invoices to InvoicesLines twice and keep the rows where the two lines have different products. You must group the result.
select Invoices.InvoiceNo
from Invoices
inner join InvoicesLines as First on First.InvoiceNo = Invoices.InvoiceNo
inner join InvoicesLines as Second on Second.InvoiceNo = Invoices.InvoiceNo
where First.ProductId <> Second.ProductId
group by Invoices.InvoiceNo
This is going to return all the invoices with at least two different products.
But if you want only the invoices with exactly two different products, you can enforce that with a HAVING clause. The join produces each pair of lines in both orders, so exactly two products give a count of 2 (assuming each product appears on at most one line per invoice).
select Invoices.InvoiceNo
from Invoices
inner join InvoicesLines as First on First.InvoiceNo = Invoices.InvoiceNo
inner join InvoicesLines as Second on Second.InvoiceNo = Invoices.InvoiceNo
where First.ProductId <> Second.ProductId
group by Invoices.InvoiceNo
having count(Invoices.InvoiceNo) = 2
Group by invoice number and use a HAVING clause to return only those invoices whose distinct product count equals 2 (or is greater than 1, if you want "at least two products"):
select i.InvoiceNo
from Invoices i
inner join InvoicesLines il on il.InvoiceNo = i.InvoiceNo
group by i.InvoiceNo
having count(distinct il.ProductId) = 2
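The question's expected output asks for invoices containing two specific products (1 and 2), which the queries above do not filter on. A possible variation, sketched with Python's sqlite3 module and the sample rows from the question, restricts the lines to the two products before counting. The function name invoices_with_both is hypothetical.

```python
import sqlite3

def make_invoice_db():
    """Build the sample InvoicesLines data from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE InvoicesLines (InvoiceNo INTEGER, Id INTEGER PRIMARY KEY, ProductId INTEGER);
        INSERT INTO InvoicesLines VALUES (1, 1, 3), (2, 2, 1), (2, 3, 2), (4, 4, 5);
    """)
    return conn

def invoices_with_both(conn, product_a, product_b):
    """Return invoice numbers that have at least one line for each product."""
    rows = conn.execute("""
        SELECT il.InvoiceNo
        FROM InvoicesLines il
        WHERE il.ProductId IN (?, ?)            -- keep only the two products of interest
        GROUP BY il.InvoiceNo
        HAVING COUNT(DISTINCT il.ProductId) = 2 -- both must be present
        ORDER BY il.InvoiceNo
    """, (product_a, product_b)).fetchall()
    return [r[0] for r in rows]

if __name__ == "__main__":
    print(invoices_with_both(make_invoice_db(), 1, 2))
```

COUNT(DISTINCT ...) keeps the check correct even when a product appears on several lines of the same invoice.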

PostgreSQL Referencing Outer Query in Subquery

I have two Postgres tables (really, more than that, but simplified for the purpose of the question) - one a record of products that have been ordered by customers, and another a historical record of prices per customer and a date they went into effect. Something like this:
'orders' table
customer_id | timestamp | quantity
------------+---------------------+---------
1 | 2015-09-29 16:01:01 | 5
1 | 2015-10-23 14:33:36 | 3
2 | 2015-10-19 09:43:02 | 7
1 | 2015-11-16 15:08:32 | 2
'prices' table
customer_id | effective_time | price
------------+---------------------+-------
1 | 2015-01-01 00:00:00 | 15.00
1 | 2015-10-01 00:00:00 | 12.00
2 | 2015-01-01 00:00:00 | 14.00
I'm trying to create a query that will return every order and its unit price for that customer at the time of the order, like this:
desired result
customer_id | quantity | price
------------+----------+------
1 | 5 | 15.00
1 | 3 | 12.00
2 | 7 | 14.00
1 | 2 | 12.00
This is essentially what I want, but I know that you can't reference an outer query inside an inner query, and I'm having trouble figuring out how to re-factor:
SELECT
o.customer_id,
o.quantity,
p.price
FROM orders o
INNER JOIN (
SELECT price
FROM prices x
WHERE x.customer_id = o.customer_id
AND x.effective_time <= o.timestamp
ORDER BY x.effective_time DESC
LIMIT 1
) p
;
Can anyone suggest the best way to make this work?
Instead of joining an inline view based on the prices table, you can perform a subquery in the SELECT list:
SELECT customer_id, quantity, (
SELECT price
FROM prices p
WHERE
p.customer_id = o.customer_id
AND p.effective_time <= o.timestamp
ORDER BY p.effective_time DESC
LIMIT 1
) AS price
FROM orders o
That does rely on a correlated subquery, which could be bad for performance, but with the way your data are structured I doubt there's a substantially better alternative.
You don't need the subquery; a plain inner join will do (this assumes there are no duplicate effective_times per customer):
SELECT o.customer_id, o.quantity
,p.price
FROM orders o
JOIN prices p ON p.customer_id = o.customer_id
AND p.effective_time <= o.timestamp
AND NOT EXISTS ( SELECT * FROM prices nx
WHERE nx.customer_id = o.customer_id
AND nx.effective_time <= o.timestamp
AND nx.effective_time > p.effective_time
)
;
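Both approaches above can be compared on the question's sample data. The sketch below runs them in SQLite through Python's sqlite3 module rather than PostgreSQL, so it stores the timestamps as ISO-8601 text (which sorts chronologically) and adds an ORDER BY for a stable result. Both of those are assumptions of this example.

```python
import sqlite3

def make_orders_db():
    """Load the sample orders and prices; timestamps are ISO-8601 text."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE orders (customer_id INTEGER, timestamp TEXT, quantity INTEGER);
        CREATE TABLE prices (customer_id INTEGER, effective_time TEXT, price REAL);
        INSERT INTO orders VALUES
            (1, '2015-09-29 16:01:01', 5), (1, '2015-10-23 14:33:36', 3),
            (2, '2015-10-19 09:43:02', 7), (1, '2015-11-16 15:08:32', 2);
        INSERT INTO prices VALUES
            (1, '2015-01-01 00:00:00', 15.00), (1, '2015-10-01 00:00:00', 12.00),
            (2, '2015-01-01 00:00:00', 14.00);
    """)
    return conn

def prices_via_subquery(conn):
    """Correlated scalar subquery: latest price at or before each order."""
    return conn.execute("""
        SELECT o.customer_id, o.quantity,
               (SELECT p.price FROM prices p
                WHERE p.customer_id = o.customer_id
                  AND p.effective_time <= o.timestamp
                ORDER BY p.effective_time DESC LIMIT 1) AS price
        FROM orders o ORDER BY o.timestamp
    """).fetchall()

def prices_via_not_exists(conn):
    """Join plus NOT EXISTS: keep the price row with no later applicable row."""
    return conn.execute("""
        SELECT o.customer_id, o.quantity, p.price
        FROM orders o
        JOIN prices p ON p.customer_id = o.customer_id
                     AND p.effective_time <= o.timestamp
        WHERE NOT EXISTS (SELECT 1 FROM prices nx
                          WHERE nx.customer_id = o.customer_id
                            AND nx.effective_time <= o.timestamp
                            AND nx.effective_time > p.effective_time)
        ORDER BY o.timestamp
    """).fetchall()

if __name__ == "__main__":
    conn = make_orders_db()
    print(prices_via_subquery(conn))
    print(prices_via_not_exists(conn))
```

One difference worth noting: if an order has no applicable price row, the scalar subquery returns the order with a NULL price, while the inner join drops the order entirely.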

SUM in multi-currency

I am trying to do SUM() in a multi-currency setup. The following will demonstrate the problem that I am facing:-
Customer
-------------------------
Id | Name
1 | Mr. A
2 | Mr. B
3 | Mr. C
4 | Mr. D
-------------------------
Item
-------------------------
Id | Name | Cost | Currency
1 | Item 1 | 5 | USD
2 | Item 2 | 2 | EUR
3 | Item 3 | 10 | GBP
4 | Item 4 | 5 | GBP
5 | Item 5 | 50 | AUD
6 | Item 6 | 20 | USD
7 | Item 3 | 10 | EUR
-------------------------
Order
-------------------------
User_Id | Product_Id
1 | 1
2 | 1
1 | 2
3 | 3
1 | 5
1 | 7
1 | 5
2 | 6
3 | 4
4 | 2
-------------------------
Now, I want the output of a SELECT query that lists the Customer Name and the total amount worth of products purchased as:-
Customer Name | Amount
Mr. A | Multiple-currencies
Mr. B | 25 USD
Mr. C | 15 GBP
Mr. D | 2 EUR
So basically, I am looking for a way to add the cost of multiple products under the same customer, if all of them have the same currency, else simply show 'multiple-currencies'. Running the following query will not help:-
SELECT Customer.Name, SUM(Item.Cost) FROM Customer
INNER JOIN Order ON Order.User_Id = Customer.Id
INNER JOIN Item ON Item.Id = Order.Product_Id
GROUP BY Customer.Name
What should my query be? I am using Sqlite
I would suggest two output columns, one for the currency and one for the amount:
SELECT c.Name,
(case when max(i.Currency) = min(i.Currency) then sum(i.Cost)
end) as amount,
(case when max(i.Currency) = min(i.Currency) then max(i.Currency)
else 'Multiple Currencies'
end) as currency
FROM Customer c INNER JOIN
"Order" o
ON o.User_Id = c.Id INNER JOIN
Item i
ON i.Id = o.Product_Id
GROUP BY c.Name
If you want, you can concatenate these into a single string column. I just prefer to have the information in two different columns for something like this.
The above is standard SQL.
I think your query should look like this:
SELECT
Data.Name AS [Customer Name],
CASE WHEN Data.CurrencyCount > 1 THEN 'Multiple-currencies' ELSE CAST(Data.Amount AS NVARCHAR) END AS Amount
FROM
(SELECT
Customer.Name,
COUNT(DISTINCT Item.Currency) AS CurrencyCount,
SUM(Item.Cost) AS Amount
FROM
Customer
INNER JOIN [Order] ON [Order].User_Id = Customer.Id
INNER JOIN Item ON Item.Id = [Order].Product_Id
GROUP BY
Customer.Name) AS Data
The subquery counts the distinct currencies per customer, and the main query uses that count to show either the total or the text 'Multiple-currencies'.
Sorry if there is any mistake or mistype but I don't have a database server to test it
Hope this helps.
IMO I would start by standardizing column names. Why is Id in the Customer table called User_Id in the Order table? Just a pet peeve. Anyway, you should learn how to build queries step by step.
Start by joining the Customer table to the Order table, then join the result to the Item table. The first join is on the customer ID and the second join is on the product ID. Once you have that working, use SUM and GROUP BY.
Ok, I managed to solve the problem this way:-
SELECT innerQuery.Name AS Name, (CASE WHEN innerQuery.Currencies=1 THEN (innerQuery.Amount || ' ' || innerQuery.Currency) ELSE 'Multiple-currencies' END) AS Amount FROM
(SELECT Customer.Name, SUM(Item.Cost) AS Amount, COUNT(DISTINCT Item.Currency) AS Currencies, Item.Currency AS Currency FROM Customer
INNER JOIN "Order" ON "Order".User_Id = Customer.Id
INNER JOIN Item ON Item.Id = "Order".Product_Id
GROUP BY Customer.Name) innerQuery
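The approach can be checked against the sample data with a short sketch using Python's sqlite3 module. It assumes the amount column is Item.Cost (as in the table shown in the question), quotes the reserved table name "Order", and folds the logic into a single grouped query with MAX(Currency) instead of a bare column. Those choices are for illustration only.

```python
import sqlite3

def make_shop_db():
    """Load the Customer, Item and "Order" sample rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Customer (Id INTEGER PRIMARY KEY, Name TEXT);
        CREATE TABLE Item (Id INTEGER PRIMARY KEY, Name TEXT, Cost INTEGER, Currency TEXT);
        CREATE TABLE "Order" (User_Id INTEGER, Product_Id INTEGER);
        INSERT INTO Customer VALUES (1, 'Mr. A'), (2, 'Mr. B'), (3, 'Mr. C'), (4, 'Mr. D');
        INSERT INTO Item VALUES
            (1, 'Item 1', 5, 'USD'), (2, 'Item 2', 2, 'EUR'), (3, 'Item 3', 10, 'GBP'),
            (4, 'Item 4', 5, 'GBP'), (5, 'Item 5', 50, 'AUD'), (6, 'Item 6', 20, 'USD'),
            (7, 'Item 3', 10, 'EUR');
        INSERT INTO "Order" VALUES (1, 1), (2, 1), (1, 2), (3, 3), (1, 5),
                                   (1, 7), (1, 5), (2, 6), (3, 4), (4, 2);
    """)
    return conn

def customer_amounts(conn):
    """Return (name, amount text) with a placeholder for mixed currencies."""
    return conn.execute("""
        SELECT c.Name,
               CASE WHEN COUNT(DISTINCT i.Currency) = 1
                    THEN SUM(i.Cost) || ' ' || MAX(i.Currency)
                    ELSE 'Multiple-currencies'
               END AS Amount
        FROM Customer c
        INNER JOIN "Order" o ON o.User_Id = c.Id
        INNER JOIN Item i    ON i.Id = o.Product_Id
        GROUP BY c.Id, c.Name
        ORDER BY c.Id
    """).fetchall()

if __name__ == "__main__":
    for name, amount in customer_amounts(make_shop_db()):
        print(name, "|", amount)
```

Grouping by c.Id as well as c.Name keeps two customers who share a name from being merged into one row.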