How to query for SUM of multiple columns one-to-many - sql

I have the following tables:
| Sales.Transaction
| ---------------
| Id
| Date
| BranchId
| Commission
|----------------
| Sales.TransactionItem
| ------------------
| Id
| Rate
| Pages
| TransactionId
|-------------------
| Sales.Branch
|-------------
| Id
| Name
|-------------
How can I get the total sales of each branch, the total number of transactions, and the total pages? I need the data shaped like this:
NOTE: The total amount of a Transaction is the sum of its TransactionItems (Rate * Pages) minus its Commission.
| Branches | Total Sales | No. of Transactions | Total Pages |
| Branch A | 10,500 | 14 | 17 |
| Branch B | 5,200 | 4 | 4 |
| Branch C | 400 | 2 | 2 |
| Branch D | 6,100 | 8 | 14 |
The problem with my query is that when a Transaction has a Commission and more than one TransactionItem, the Commission is multiplied by the number of TransactionItems:
select
b.Name as BranchName,
COUNT(t.Id) as Transactions,
SUM(ti.Pages * ti.Rate) - SUM(t.Commission) as TotalSales,
SUM(ISNULL(ti.Pages, 0)) as Pages
from
Sales.Branch b
left join Sales.[Transaction] t
on b.Id = t.BranchId
and t.Date >= '2017-11-01'
AND t.Date < '2017-12-01'
left join Sales.TransactionItem ti
on ti.TransactionId = t.Id
group by b.Name
order by b.Name ASC

This is tricky -- I think the solution is to aggregate the transaction items before joining the rest of the tables together:
select b.Name as BranchName,
count(t.Id) as Transactions,
sum(ti.items_total) - sum(t.Commission) as TotalSales,
sum(ti.total_pages) as Pages
from Sales.Branch b left join
Sales.[Transaction] t
on b.Id = t.BranchId and
t.Date >= '2017-11-01' and
t.Date < '2017-12-01' left join
(select ti.TransactionId,
sum(ti.Pages * ti.Rate) as items_total,
sum(ti.Pages) as total_pages
from Sales.TransactionItem ti
group by ti.TransactionId
) ti
on ti.TransactionId = t.Id
group by b.Name
order by b.Name ASC;
Note: I also think this correctly calculates Transactions.
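The fan-out and the fix can be reproduced on a toy dataset. The sketch below uses Python's built-in sqlite3 module rather than SQL Server, and the table and column names (branch, txn, txn_item) are simplified stand-ins for the Sales schema, so treat it as an illustration of the idea and not as the production query. It uses one transaction with a commission of 50 and two items.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE branch   (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE txn      (id INTEGER PRIMARY KEY, branch_id INTEGER, commission REAL);
    CREATE TABLE txn_item (id INTEGER PRIMARY KEY, txn_id INTEGER, rate REAL, pages INTEGER);

    INSERT INTO branch VALUES (1, 'Branch A');
    -- hypothetical data: one transaction, commission 50, two items
    INSERT INTO txn VALUES (10, 1, 50);
    INSERT INTO txn_item VALUES (100, 10, 100, 2), (101, 10, 100, 3);
""")

# Joining items directly repeats the transaction row once per item,
# so the commission and the transaction count are both inflated.
naive = conn.execute("""
    SELECT b.name,
           COUNT(t.id),
           SUM(ti.pages * ti.rate) - SUM(t.commission),
           SUM(ti.pages)
    FROM branch b
    LEFT JOIN txn t       ON t.branch_id = b.id
    LEFT JOIN txn_item ti ON ti.txn_id = t.id
    GROUP BY b.name
""").fetchall()

# Collapsing the items to one row per transaction first keeps the
# transaction rows unique, so each commission is counted once.
fixed = conn.execute("""
    SELECT b.name,
           COUNT(t.id),
           COALESCE(SUM(ti.items_total), 0) - COALESCE(SUM(t.commission), 0),
           COALESCE(SUM(ti.total_pages), 0)
    FROM branch b
    LEFT JOIN txn t ON t.branch_id = b.id
    LEFT JOIN (SELECT txn_id,
                      SUM(pages * rate) AS items_total,
                      SUM(pages)        AS total_pages
               FROM txn_item
               GROUP BY txn_id) ti ON ti.txn_id = t.id
    GROUP BY b.name
""").fetchall()

print(naive)  # commission subtracted twice, two "transactions"
print(fixed)  # commission subtracted once, one transaction
```

The COALESCE calls are an addition for branches with no transactions in the period, where the sums would otherwise come back as NULL.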

Related

SQL joining tables based off latest previous date

Let's say I have two tables for example:
Table 1 - customer order information
x---------x--------x-------------x
cust_id | item | order date |
x---------x--------x-------------x
1 | 100 | 01/01/2020 |
1 | 112 | 03/07/2022 |
2 | 100 | 01/02/2020 |
2 | 168 | 05/03/2022 |
3 | 200 | 15/06/2021 |
----------x--------x-------------x
and Table 2 - customer membership status
x---------x--------x-------------x
cust_id | Status | startdate |
x---------x--------x-------------x
1 | silver | 01/01/2019 |
1 | bronze | 05/12/2019 |
1 | gold | 05/06/2022 |
2 | silver | 24/12/2021 |
----------x--------x-------------x
I want to join the two tables so that I can see what their membership status was at the time of purchase, to produce something like this:
x---------x--------x-------------x----------x
cust_id | item | order date | status |
x---------x--------x-------------x----------x
1 | 100 | 01/01/2020 | bronze |
1 | 112 | 03/07/2022 | gold |
2 | 100 | 01/02/2020 | NULL |
2 | 168 | 05/03/2022 | silver |
3 | 200 | 15/06/2021 | NULL |
----------x--------x-------------x----------x
I tried multiple approaches, including MIN/MAX, >=, and GROUP BY ... HAVING, with no luck. I feel like multiple joins are going to be needed here, but I can't figure it out. Any help would be greatly appreciated.
(Also note: dates are in European/Australian day-first format, not American format.)
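Try the following, which uses the LEAD function to define the period limits for each status. The period is half-open (start inclusive, end exclusive) so that an order placed on the day a new status starts matches only the new status, and the last status of each customer has no end date:
SELECT T.cust_id, T.item, T.orderdate, D.status
FROM order_information T
LEFT JOIN
(
SELECT cust_id, Status, startdate,
LEAD(startdate) OVER (PARTITION BY cust_id ORDER BY startdate) AS enddate
FROM customer_membership
) D
ON T.cust_id = D.cust_id AND
T.orderdate >= D.startdate AND
(D.enddate IS NULL OR T.orderdate < D.enddate)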
See a demo on SQL Server.
SELECT
[cust_id],
[item],
[order date],
[status]
FROM
(
SELECT
t1.[cust_id],
t1.[item],
t1.[order date],
t2.[status],
ROW_NUMBER() OVER (PARTITION BY t1.[cust_id], t1.[item] ORDER BY t2.[startdate] DESC) rn
FROM #t1 t1
LEFT JOIN #t2 t2
ON t1.[cust_id] = t2.[cust_id] AND t1.[order date] >= t2.[startdate]
) a
WHERE rn = 1
SELECT
o.cust_id,
o.item,
o.order_date,
m.status
FROM
customer_order o
LEFT JOIN
customer_membership m
ON o.cust_id = m.cust_id
-- keep only the latest membership row that started on or before the order date
AND m.start_date = (
SELECT Max(m2.start_date)
FROM customer_membership m2
WHERE m2.cust_id = o.cust_id
AND m2.start_date <= o.order_date
);
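The "status as of the order date" lookup can be checked against the sample data from the question. This is a minimal sketch using Python's built-in sqlite3 module instead of SQL Server; the table names customer_order and customer_membership and the ISO-formatted date strings are assumptions made for the example. It picks, for each order, the membership row with the latest start date that is not after the order date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_order      (cust_id INTEGER, item INTEGER, order_date TEXT);
    CREATE TABLE customer_membership (cust_id INTEGER, status TEXT, start_date TEXT);

    -- sample rows from the question, with dates rewritten as YYYY-MM-DD
    INSERT INTO customer_order VALUES
        (1, 100, '2020-01-01'), (1, 112, '2022-07-03'),
        (2, 100, '2020-02-01'), (2, 168, '2022-03-05'),
        (3, 200, '2021-06-15');
    INSERT INTO customer_membership VALUES
        (1, 'silver', '2019-01-01'), (1, 'bronze', '2019-12-05'),
        (1, 'gold',   '2022-06-05'), (2, 'silver', '2021-12-24');
""")

rows = conn.execute("""
    SELECT o.cust_id, o.item, o.order_date, m.status
    FROM customer_order o
    LEFT JOIN customer_membership m
           ON m.cust_id = o.cust_id
          -- latest membership that started on or before the order date
          AND m.start_date = (SELECT MAX(m2.start_date)
                              FROM customer_membership m2
                              WHERE m2.cust_id = o.cust_id
                                AND m2.start_date <= o.order_date)
    ORDER BY o.cust_id, o.order_date
""").fetchall()

for row in rows:
    print(row)
```

Orders placed before any membership started, or by customers with no membership at all, keep a NULL status because of the LEFT JOIN.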

SQL duplicate values of records with multiple joins

My query works fine until I add in the estimate tables, where my data duplicates.
Below is my table structure:
Jobs
| ID | JobNumber |
|----|-----------|
| 1 | J200 |
| 2 | J201 |
Job_Invoices
| ID | InvoiceNumber | JobID |
|----|---------------|-------|
| 10 | I300 | 1 |
| 11 | I301 | 2 |
Invoice_Accounts
| ID | InvoiceId | AccountID | Amount |
|----|-----------|-----------|--------|
| 23 | 10 | 40 | 200 |
| 24 | 10 | 40 | 300 |
| 25 | 10 | 41 | 100 |
| 26 | 11 | 40 | 100 |
Estimates
| ID | JobID |
|----|-------|
| 50 | 1 |
Estimate_Accounts
| ID | EstimateID| AccountID | Amount |
|----|-----------|-----------|--------|
| 23 | 50 | 40 | 100 |
| 24 | 50 | 40 | 100 |
Accounts
| ID | Name |
|----|------|
| 40 | Sales|
| 41 | EXP |
I am trying the below:
SELECT
J.JobNumber,
A.Name AS "Account",
SUM(JA.Amount) AS 'Total Invoiced',
SUM(EA.Amount) AS 'Total Estimated'
FROM
Job J
LEFT JOIN
Job_Invoices JI ON JI.JobID = J.ID
LEFT JOIN
Estimates E ON E.JobID = J.ID
LEFT JOIN
Estimate_Accounts EA ON EA.EstimateID = E.ID
INNER JOIN
Invoice_Accounts JA ON JA.InvoiceId = JI.ID
INNER JOIN
Accounts A ON A.ID = JA.AccountID
GROUP BY
J.JobNumber, A.Name, JA.Amount
ORDER BY
J.JobNumber
This is what I am hoping to achieve:
| JobNumber | Account | Total Invoiced | Total Estimated |
|-----------|-----------|----------------|-----------------|
| J200 | EXP | 100 | 0 |
| J200 | Sales | 500 | 200 |
| J201 | Sales | 100 | 0 |
This works fine before I add the Estimates and Estimate_Accounts tables: my result looks like the above (without the Total Estimated column).
However, when I add the Total Estimated column by joining the Estimates and Estimate_Accounts tables, the Total Invoiced and Total Estimated values double, to something like this:
| JobNumber | Account | Total Invoiced | Total Estimated |
|-----------|-----------|----------------|-----------------|
| J200 | EXP | 200 | 0 |
| J200 | Sales | 1000 | 400 |
| J201 | Sales | 200 | 0 |
You want to join invoice totals with estimated totals. Both are aggregations. So, make these aggregations first, then join them. Finally, join the job and account tables to the jobs and accounts thus found.
select
j.jobnumber,
a.name as "Account",
coalesce(inv.total, 0) as "Total Invoiced",
coalesce(est.total, 0) as "Total Estimated"
from
(
select e.jobid, ea.accountid, sum(ea.amount) as total
from estimate_accounts ea
join estimates e on e.id = ea.estimateid
group by e.jobid, ea.accountid
) est
full outer join
(
select ji.jobid, ia.accountid, sum(ia.amount) as total
from invoice_accounts ia
join job_invoices ji on ji.id = ia.invoiceid
group by ji.jobid, ia.accountid
) inv using (jobid, accountid)
join jobs j on j.id = jobid
join accounts a on a.id = accountid
order by j.jobnumber, a.name;
If your DBMS doesn't support the USING clause, you must use ON instead:
select
[...]
) inv on inv.jobid = est.jobid and inv.accountid = est.accountid
join jobs j on j.id in (est.jobid, inv.jobid)
join accounts a on a.id in (est.accountid, inv.accountid)
order by j.jobnumber, a.name;
You need to aggregate before joining, because otherwise the JOIN generates a Cartesian product. However, this is complicated by the account information.
So, this approach aggregates the estimates and invoices separately by account and job. It then combines them using UNION ALL and joins in the rest of the information:
SELECT J.JobNumber, A.Name AS Account,
COALESCE(SUM(JE.Total_Invoiced), 0) AS Total_Invoiced,
COALESCE(SUM(JE.Total_Estimated), 0) AS Total_Estimated
FROM Job J LEFT JOIN
((SELECT JI.JobId, JA.AccountId, SUM(JA.Amount) AS Total_Invoiced, NULL as Total_Estimated
FROM Job_Invoices JI JOIN
Invoice_Accounts JA
ON JA.InvoiceId = JI.ID
GROUP BY JI.JobId, JA.AccountId
) UNION ALL
(SELECT E.JobId, EA.AccountId, NULL, SUM(EA.Amount) as Total_Estimated
FROM Estimates E JOIN
Estimate_Accounts EA
ON EA.EstimateID = E.ID
GROUP BY E.JobId, EA.AccountId
)
) JE
ON JE.JobId = J.ID LEFT JOIN
Accounts A
ON A.ID = JE.AccountID
-- collapse the invoice row and the estimate row of each job/account into one
GROUP BY J.JobNumber, A.Name
ORDER BY J.JobNumber, A.Name;
There are two tables where duplication may happen:
Invoice_Accounts has several records per AccountID/InvoiceId tuple, that you want to SUM()
Estimate_Accounts has several records per EstimateID/AccountID tuple. Also I think that you should use column AccountID when joining this table: this requires changing the order of the JOINs, so Estimate_Accounts is joined after Accounts
I think that it would be simpler to move the aggregation to subqueries, and then join them in the outer query.
Consider:
SELECT
J.JobNumber,
A.Name AS Account,
JA.Amount AS Total_Invoiced,
COALESCE(EA.Amount, 0) AS Total_Estimated
FROM
Job J
LEFT JOIN
Job_Invoices JI ON JI.JobID = J.ID
INNER JOIN
(
SELECT AccountID, InvoiceId, SUM(Amount) Amount
FROM Invoice_Accounts
GROUP BY InvoiceId, AccountID
) JA ON JA.InvoiceId = JI.ID
INNER JOIN
Accounts A ON A.ID = JA.AccountID
LEFT JOIN
Estimates E ON E.JobID = J.ID
LEFT JOIN
(
SELECT EstimateID, AccountID , SUM(Amount) Amount
FROM Estimate_Accounts
GROUP BY EstimateID, AccountID
) EA ON EA.EstimateID = E.ID AND EA.AccountID = JA.AccountID
ORDER BY
J.JobNumber, A.Name;
This demo on DB Fiddle with your sample data returns:
| JobNumber | Account | Total_Invoiced | Total_Estimated |
| --------- | ------- | -------------- | --------------- |
| J200 | EXP | 100 | 0 |
| J200 | Sales | 500 | 200 |
| J201 | Sales | 100 | 0 |
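The "aggregate each side first, then combine" idea can be verified against the sample data with a small script. This is a sketch using Python's built-in sqlite3 module, not the asker's DBMS; the lowercase table names mirror the tables in the question, and the UNION ALL plus outer GROUP BY form is just one of the approaches described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jobs              (id INTEGER, jobnumber TEXT);
    CREATE TABLE job_invoices      (id INTEGER, invoicenumber TEXT, jobid INTEGER);
    CREATE TABLE invoice_accounts  (id INTEGER, invoiceid INTEGER, accountid INTEGER, amount INTEGER);
    CREATE TABLE estimates         (id INTEGER, jobid INTEGER);
    CREATE TABLE estimate_accounts (id INTEGER, estimateid INTEGER, accountid INTEGER, amount INTEGER);
    CREATE TABLE accounts          (id INTEGER, name TEXT);

    -- sample rows from the question
    INSERT INTO jobs VALUES (1, 'J200'), (2, 'J201');
    INSERT INTO job_invoices VALUES (10, 'I300', 1), (11, 'I301', 2);
    INSERT INTO invoice_accounts VALUES
        (23, 10, 40, 200), (24, 10, 40, 300), (25, 10, 41, 100), (26, 11, 40, 100);
    INSERT INTO estimates VALUES (50, 1);
    INSERT INTO estimate_accounts VALUES (23, 50, 40, 100), (24, 50, 40, 100);
    INSERT INTO accounts VALUES (40, 'Sales'), (41, 'EXP');
""")

rows = conn.execute("""
    SELECT j.jobnumber, a.name, SUM(je.invoiced), SUM(je.estimated)
    FROM (
        -- invoice totals per job and account
        SELECT ji.jobid, ia.accountid, SUM(ia.amount) AS invoiced, 0 AS estimated
        FROM job_invoices ji
        JOIN invoice_accounts ia ON ia.invoiceid = ji.id
        GROUP BY ji.jobid, ia.accountid
        UNION ALL
        -- estimate totals per job and account
        SELECT e.jobid, ea.accountid, 0, SUM(ea.amount)
        FROM estimates e
        JOIN estimate_accounts ea ON ea.estimateid = e.id
        GROUP BY e.jobid, ea.accountid
    ) je
    JOIN jobs j     ON j.id = je.jobid
    JOIN accounts a ON a.id = je.accountid
    GROUP BY j.jobnumber, a.name
    ORDER BY j.jobnumber, a.name
""").fetchall()

for row in rows:
    print(row)
```

Because the invoice rows and the estimate rows never meet in a join, neither side can multiply the other.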

Grouped Weighted Average in Access Query

I am trying to have a weighted average fee % of sales for each Client/Product/City combo from this data. I don't need the level of detail of sub product.
My data looks like this:
+--------+---------+-------+--------------+-------+----------------+
| Client | Product | City | Sub Product | Sales | Fee % of Sales |
+--------+---------+-------+--------------+-------+----------------+
| a | b | b | c | 1000 | 1% |
| a | b | b | d | 2000 | 2% |
| c | c | b | c | 3000 | 3% |
| d | c | b | c | 4000 | 4% |
+--------+---------+-------+--------------+-------+----------------+
I want to calculate the weighted average Fee % charged for each Client/Product/City combo, i.e. for Client 'a', Product 'b', City 'b' the fee % of sales would be (1,000/3,000 * 1%) + (2,000/3,000 * 2%).
After I do this I will have another query that takes only the Client, Product,City Sales and new Weighted average field from the last query. I need another query because I will be using the results as part of a larger query.
This would have been easier done using window function, but since you are using ms-access... You can compute the sales subtotals per client/product/city in a subquery, and then JOIN in with the original table:
SELECT
t.client, t.product, t.city, SUM(t.sales * t.fee / t1.sales) res
FROM
mytable t
INNER JOIN (
SELECT client, product, city, SUM(sales) sales
FROM mytable
GROUP BY client, product, city
) t1
ON t1.client = t.client
AND t1.product = t.product
AND t1.city = t.city
GROUP BY t.client, t.product, t.city
This demo on DB Fiddle with your sample data returns:
| client | product | city | res |
| ------ | ------- | ---- | ------------------------------- |
| a | b | b | 0.016666666294137638 |
| c | c | b | 0.029999999329447746 |
| d | c | b | 0.03999999910593033 |
You can calculate the total sales & fee values as part of a subquery, then perform the division with the resulting values, e.g.:
select
q.client,
q.product,
q.city,
q.fee/q.totalsales as weightedfee
from
(
select
t.client,
t.product,
t.city,
sum(t.sales) as totalsales,
sum(t.sales*t.[fee % of sales]) as fee
from yourtable t
group by t.client, t.product, t.city
) q
Change yourtable to suit your table name.
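As a quick check of the arithmetic, the weighted average is SUM(sales * fee) / SUM(sales) per group. The sketch below runs that against the sample rows using Python's built-in sqlite3 module instead of Access; the table name fees and the fee column stored as a fraction (0.01 for 1%) are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fees (client TEXT, product TEXT, city TEXT,
                       sub_product TEXT, sales INTEGER, fee REAL);
    -- sample rows from the question; fee is stored as a fraction
    INSERT INTO fees VALUES
        ('a', 'b', 'b', 'c', 1000, 0.01),
        ('a', 'b', 'b', 'd', 2000, 0.02),
        ('c', 'c', 'b', 'c', 3000, 0.03),
        ('d', 'c', 'b', 'c', 4000, 0.04);
""")

rows = conn.execute("""
    SELECT client, product, city,
           SUM(sales)                    AS total_sales,
           SUM(sales * fee) / SUM(sales) AS weighted_fee
    FROM fees
    GROUP BY client, product, city
    ORDER BY client, product, city
""").fetchall()

for client, product, city, total_sales, weighted_fee in rows:
    print(client, product, city, total_sales, round(weighted_fee, 6))
```

For client 'a' the result is (1000 * 0.01 + 2000 * 0.02) / 3000, roughly 0.0167, which agrees with the hand calculation in the question.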

SQL query for insert into with update on duplicate key

I have two tables debitTable and creditTable.
debitTable has the following records:
+----+-------+
| id | debit |
+----+-------+
| a | 10000 |
| b | 35000 |
+----+-------+
and creditTable has these records:
+----+--------+
| id | credit |
+----+--------+
| b | 5000 |
+----+--------+
What SQL Server query would produce these results?
+----+-------+--------+--------------+
| id | debit | credit | debit-credit |
+----+-------+--------+--------------+
| a | 10000 | 0 | 10000 |
| b | 35000 | 5000 | 30000 |
+----+-------+--------+--------------+
You want to use a join. However, it is important to aggregate before joining:
select coalesce(d.id, c.id) as id,
coalesce(d.debit, 0) as debit,
coalesce(c.credit, 0) as credit,
(coalesce(d.debit, 0) - coalesce(c.credit, 0)) as DebitMinusCredit
from (select id, sum(debit) as debit
from debitTable
group by id
) d full outer join
(select id, sum(credit) as credit
from creditTable
group by id
) c
on d.id = c.id;
This uses full outer join to ensure that all records from both tables are included, even if an id is not in one of the tables. The aggregation before joining is to avoid Cartesian products when there are multiple rows for a single id in both tables.
You can also try a LEFT JOIN:
Select *
from debitTable d
left join creditTable c on d.id = c.id
With the computed column (ISNULL turns a missing credit into 0, otherwise the difference would be NULL):
select
d.id, d.debit, isnull(c.credit, 0) as credit,
d.debit - isnull(c.credit, 0) as [debit-credit]
from
debitTable d
left join
creditTable c on d.id = c.id
But this is driven by the debit table only: an id that appears in the credit table but not in the debit table will not appear in the result.
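To see the difference between the two approaches, it helps to add an id that exists only on the credit side. The sketch below uses Python's built-in sqlite3 module; because older SQLite builds lack FULL OUTER JOIN, it swaps in a UNION ALL followed by GROUP BY, which covers ids from either table in the same way. The row for id 'c' is hypothetical and was added only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE debitTable  (id TEXT, debit INTEGER);
    CREATE TABLE creditTable (id TEXT, credit INTEGER);
    INSERT INTO debitTable  VALUES ('a', 10000), ('b', 35000);
    -- 'c' is a hypothetical credit-only id, not in the question's data
    INSERT INTO creditTable VALUES ('b', 5000), ('c', 700);
""")

# Stack both tables into one shape, then aggregate once per id.
both_sides = conn.execute("""
    SELECT id,
           SUM(debit)               AS debit,
           SUM(credit)              AS credit,
           SUM(debit) - SUM(credit) AS debit_minus_credit
    FROM (SELECT id, debit, 0 AS credit FROM debitTable
          UNION ALL
          SELECT id, 0 AS debit, credit FROM creditTable) x
    GROUP BY id
    ORDER BY id
""").fetchall()

# LEFT JOIN from the debit side: the credit-only id is dropped.
debit_only = conn.execute("""
    SELECT d.id, d.debit, COALESCE(c.credit, 0),
           d.debit - COALESCE(c.credit, 0)
    FROM debitTable d
    LEFT JOIN creditTable c ON c.id = d.id
    ORDER BY d.id
""").fetchall()

print(both_sides)
print(debit_only)
```

Note that the LEFT JOIN variant here joins the raw tables, so it is only safe while each id appears at most once per table; with repeated ids, aggregate first as the first answer describes.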

Joining 6 tables into single query?

Hey, can anyone help me join the 5 tables below into a single query? I currently have the query below, but it doesn't seem to work: if there are two rows with the same product ID in the hires table, all of the products are returned from the products table, which is obviously wrong.
SELECT products.prod_id, products.title, products.price, product_types.name,
listagg(suppliers.name, ',') WITHIN GROUP(ORDER BY suppliers.name) suppliers
FROM products
INNER JOIN product_suppliers ON products.prod_id = product_suppliers.prod_id
INNER JOIN product_types ON product_types.type_id = products.type_id
INNER JOIN suppliers ON product_suppliers.supp_id = suppliers.supp_id
LEFT OUTER JOIN hires ON hires.prod_id = products.prod_id
WHERE (hires.hire_end < to_date('21-JAN-13') OR hires.hire_start > to_date('26-JAN-13'))
OR hires.prod_id IS NULL
GROUP BY products.prod_id, products.title, products.price, product_types.name
Table data:
PRODUCTS
--------------------------------------------
| Prod_ID | Title | Price | Type_ID |
|------------------------------------------|
| 1 | A | 5 | 1 |
| 2 | B | 7 | 1 |
| 3 | C | 3 | 2 |
| 4 | D | 3 | 3 |
|------------------------------------------|
PRODUCT_TYPES
----------------------
| Type_ID | Type |
|--------------------|
| 1 | TYPE_A |
| 2 | TYPE_B |
| 3 | TYPE_C |
| 4 | TYPE_D |
|--------------------|
PRODUCT_SUPPLIERS
-------------------------
| Prod_ID | Supp_ID |
|-----------------------|
| 1 | 1 |
| 1 | 2 |
| 2 | 2 |
| 3 | 3 |
| 4 | 4 |
|-----------------------|
SUPPLIERS
----------------------
| Supp_ID | Name |
|--------------------|
| 1 | SUPP_A |
| 2 | SUPP_B |
| 3 | SUPP_C |
| 4 | SUPP_D |
|--------------------|
HIRES
---------------------------------------------------------------
| Hire_ID | Prod_ID | Cust_ID | Hire_Start | Hire_End |
|-----------------------|------------|------------------------|
| 1 | 1 | 1 | 22-Jan-13 | 23-Jan-13 |
| 2 | 2 | 2 | 27-Jan-13 | 29-Jan-13 |
| 3 | 1 | 3 | 30-Jan-13 | 31-Jan-13 |
|-----------------------|------------|------------|-----------|
CUSTOMERS
--------------------------------
| Cust_ID | Name | Phone |
|------------------------------|
| 1 | Cust_A | 555-666 |
| 2 | Cust_B | 444-234 |
| 3 | Cust_C | 319-234 |
| 4 | Cust_D | 398-092 |
|------------------------------|
The output from the query at the moment looks like this:
-------------------------------------------------------------
| Prod_ID | Title | Price | Type_ID | Suppliers |
|------------------------------------------|----------------|
| 1 | A | 5 | Type_A | SUPP_A,SUPP_B |
| 2 | B | 7 | Type_B | SUPP_B |
| 3 | C | 3 | Type_C | SUPP_C |
| 4 | D | 3 | Type_D | SUPP_D |
|------------------------------------------|----------------|
Surely it should look like this, as Prod_ID '1' is hired out between the dates in the query:
-------------------------------------------------------------
| Prod_ID | Title | Price | Type_ID | Suppliers |
|------------------------------------------|----------------|
| 2 | B | 7 | Type_B | SUPP_B |
| 3 | C | 3 | Type_C | SUPP_C |
| 4 | D | 3 | Type_D | SUPP_D |
|------------------------------------------|----------------|
If anyone can help modify the query to output as suggested, I would be really grateful, because my understanding is that it should work as written.
Your issue is that Prod_Id 1 is both in and out of those date ranges. So instead, use a subquery to filter out which Prod_Id are in those ranges, and exclude those.
This is a much simplified version of your query:
SELECT P.Prod_ID
FROM Products P
LEFT JOIN (
SELECT Prod_ID
FROM Hires
WHERE hire_end >= To_Date('20130121', 'yyyymmdd') AND hire_start <= To_Date('20130126', 'yyyymmdd')
) H ON P.Prod_ID = H.Prod_ID
WHERE h.prod_id IS NULL
And the SQL Fiddle.
Assuming I copied and pasted correctly, this should be your query:
SELECT products.prod_id, products.title, products.price, product_types.name,
listagg(suppliers.name, ',') WITHIN GROUP(ORDER BY suppliers.name) suppliers
FROM products
INNER JOIN product_suppliers ON products.prod_id = product_suppliers.prod_id
INNER JOIN product_types ON product_types.type_id = products.type_id
INNER JOIN suppliers ON product_suppliers.supp_id = suppliers.supp_id
LEFT JOIN (
SELECT Prod_ID
FROM Hires
WHERE hire_end >= To_Date('20130121', 'yyyymmdd') AND hire_start <= To_Date('20130126', 'yyyymmdd')
) H ON products.Prod_ID = H.Prod_ID
WHERE H.Prod_ID IS NULL
GROUP BY products.prod_id, products.title, products.price, product_types.name
Hope this helps.
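The difference between filtering joined hire rows and excluding products that have any overlapping hire can be checked on the sample data. This is a sketch using Python's built-in sqlite3 module rather than Oracle; only the products and hires tables are recreated, and the dates are rewritten as ISO strings, both of which are simplifications for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (prod_id INTEGER, title TEXT);
    CREATE TABLE hires    (hire_id INTEGER, prod_id INTEGER,
                           hire_start TEXT, hire_end TEXT);
    INSERT INTO products VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
    INSERT INTO hires VALUES
        (1, 1, '2013-01-22', '2013-01-23'),
        (2, 2, '2013-01-27', '2013-01-29'),
        (3, 1, '2013-01-30', '2013-01-31');
""")
period = {"start": "2013-01-21", "end": "2013-01-26"}

# Original logic: a product survives if ANY of its hire rows falls
# outside the period, so product 1 slips through via its later hire.
naive = [r[0] for r in conn.execute("""
    SELECT DISTINCT p.prod_id
    FROM products p
    LEFT JOIN hires h ON h.prod_id = p.prod_id
    WHERE (h.hire_end < :start OR h.hire_start > :end)
       OR h.prod_id IS NULL
    ORDER BY p.prod_id
""", period)]

# Anti-join: find products with an overlapping hire, keep the rest.
available = [r[0] for r in conn.execute("""
    SELECT p.prod_id
    FROM products p
    LEFT JOIN (SELECT DISTINCT prod_id
               FROM hires
               WHERE hire_end >= :start AND hire_start <= :end) h
           ON h.prod_id = p.prod_id
    WHERE h.prod_id IS NULL
    ORDER BY p.prod_id
""", period)]

print(naive)
print(available)
```

The overlap test (hire_end >= start AND hire_start <= end) is the negation of the original "ends before or starts after" condition, applied per hire row before the join instead of after it.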
Your left outer join will return null values when there is no match, meaning you still have a row (with no HIRE table data) when the results of this join query are Null:
LEFT OUTER JOIN hires ON hires.prod_id = products.prod_id
WHERE (hires.hire_end < to_date('21-JAN-13')
OR hires.hire_start > to_date('26-JAN-13'))
OR hires.prod_id IS NULL
Try adding a column from the hires table to the select list (e.g. hires.Hire_Start) to see this happening, then switch it to an inner join as well and I think your problem will be solved.
Or add a condition to the WHERE clause of the full query, something like hires.Hire_Start is not null.
EDIT
If you change your original query to:
SELECT hires.Hire_Start, products.prod_id, products.title, products.price, product_types.name,
listagg(suppliers.name, ',') WITHIN GROUP(ORDER BY suppliers.name) suppliers
FROM products
INNER JOIN product_suppliers ON products.prod_id = product_suppliers.prod_id
INNER JOIN product_types ON product_types.type_id = products.type_id
INNER JOIN suppliers ON product_suppliers.supp_id = suppliers.supp_id
LEFT OUTER JOIN hires ON hires.prod_id = products.prod_id
WHERE (hires.hire_end < to_date('21-JAN-13') OR hires.hire_start > to_date('26-JAN-13'))
OR hires.prod_id IS NULL
GROUP BY hires.Hire_Start, products.prod_id, products.title, products.price, product_types.name
What comes back in the Hire_Start column?
Then if you add it to the where clause do you get the expected result:
SELECT hires.Hire_Start, products.prod_id, products.title, products.price, product_types.name,
listagg(suppliers.name, ',') WITHIN GROUP(ORDER BY suppliers.name) suppliers
FROM products
INNER JOIN product_suppliers ON products.prod_id = product_suppliers.prod_id
INNER JOIN product_types ON product_types.type_id = products.type_id
INNER JOIN suppliers ON product_suppliers.supp_id = suppliers.supp_id
LEFT OUTER JOIN hires ON hires.prod_id = products.prod_id
WHERE ((hires.hire_end < to_date('21-JAN-13') OR hires.hire_start > to_date('26-JAN-13'))
OR hires.prod_id IS NULL)
AND hires.Hire_Start is not null
GROUP BY hires.Hire_Start, products.prod_id, products.title, products.price, product_types.name
Finally, dropping the Outer Join altogether, does this work as expected?
SELECT hires.Hire_Start, products.prod_id, products.title, products.price, product_types.name,
listagg(suppliers.name, ',') WITHIN GROUP(ORDER BY suppliers.name) suppliers
FROM products
INNER JOIN product_suppliers ON products.prod_id = product_suppliers.prod_id
INNER JOIN product_types ON product_types.type_id = products.type_id
INNER JOIN suppliers ON product_suppliers.supp_id = suppliers.supp_id
INNER JOIN hires ON hires.prod_id = products.prod_id
WHERE (hires.hire_end < to_date('21-JAN-13') OR hires.hire_start > to_date('26-JAN-13'))
GROUP BY hires.Hire_Start, products.prod_id, products.title, products.price, product_types.name
And note: is the OR hires.prod_id IS NULL supposed to indicate that a product with no hire information is available? In that case you need to write the query more like the other answer provided.
Here is some code that may help you:
SELECT L.V_PRODUCT_ID "PROD_ID" , L.TITLE "TITLE" , L.PRICE "PRICE" , L.TYPE "TYPE" , S.NAME "SUPPLIERS"
FROM
(SELECT V_PRODUCT_ID , TITLE , PRICE , TYPE , SUPPLIER_ID FROM
((select p.prod_id v_product_id , p.title TITLE , p.price PRICE , t.type TYPE
from products p , product_types t
where p.type_id = t.type_id) A
JOIN
(SELECT PROD_ID VV_PRODUCT_ID , SUPP_ID SUPPLIER_ID
FROM PRODUCT_SUPPLIERS) H
ON (A.V_PRODUCT_ID = H.VV_PRODUCT_ID))) L
JOIN
SUPPLIERS S
ON (L.SUPPLIER_ID = S.SUPP_ID);
SELECT Emp.Empid, Emp.EmpFirstName, Emp.EmpLastName, Dept.DepartmentName
FROM Employee Emp
INNER JOIN Department dept
ON Emp.Departmentid = Dept.Departmentid