Excel's SUMPRODUCT in SQL across tables without huge joins?

I'm wondering what the options are for reproducing Excel's SUMPRODUCT in SQL Server.
I have 3 tables:
1. Transaction list of items being sold
2. Raw materials making up each item
3. Price of raw materials by date
For each item sold (table 1), I want to find the total price of raw materials (table 3) on that sold date, considering the % of raw materials that make up the item. Sample data below.
Table 1: Items sold
Item    Date      Qty_sold
------  --------  --------
Pencil  5/1/2022  1
Pencil  6/1/2022  2
Pencil  9/1/2022  1
Table 2: Raw materials making up each item
Item    Raw_material  pct_of_total
------  ------------  ------------
Pencil  Wood          70%
Pencil  Rubber        5%
Pencil  Lead          25%
Table 3: Raw material prices by date
Date      Raw_material  Part_unitprice
--------  ------------  --------------
5/1/2022  Wood          0.20
6/1/2022  Wood          0.21
9/1/2022  Wood          0.21
5/1/2022  Rubber        0.10
6/1/2022  Rubber        0.10
9/1/2022  Rubber        0.12
5/1/2022  Lead          0.50
6/1/2022  Lead          0.55
9/1/2022  Lead          0.50
The result I'm looking for is below, at the same level of detail as Table 1.
Item    Date      Qty_sold  SUMPRODUCT_unitprice
------  --------  --------  --------------------
Pencil  5/1/2022  1         0.27
Pencil  6/1/2022  2         0.2895
Pencil  9/1/2022  1         0.278
My approach would be to:
1. Join Table 2 (~3k rows) with Table 3 (~72k rows).
2. Join the resulting table with Table 1 (> 2M rows).
I'm conscious of the number of rows that would need to get crunched with these two joins, and I'm wondering if there is a more sophisticated way of doing this.

Here is an example query that calculates the required SUMPRODUCT value. I don't see any reason to worry about a huge number of rows here: this is not a Cartesian join. Each sold item is matched only to its own raw materials and to their prices on the sale date.
select
    t1.Item,
    t1.[Date],
    t1.Qty_Sold,
    sum(t2.pct_of_total * t3.Part_UnitPrice) as SumProduct_UnitPrice
from
    #Table1 t1
    inner join #Table2 t2
        on t2.Item = t1.Item
    inner join #Table3 t3
        on t3.Raw_Material = t2.Raw_Material
        and t3.[Date] = t1.[Date]
group by
    t1.Item, t1.[Date], t1.Qty_Sold;
For completeness' sake, here is my test data:
drop table if exists #Table1, #Table2, #Table3
create table #Table1 (Item nvarchar(30), [Date] date, Qty_Sold int)
insert into #Table1 values
('Pencil', '20220501', 1),
('Pencil', '20220601', 2),
('Pencil', '20220901', 1)
create table #Table2 (Item nvarchar(30), Raw_Material nvarchar(30), pct_of_total decimal(5, 2))
insert into #Table2 values
('Pencil', 'Wood', .70),
('Pencil', 'Rubber', .05),
('Pencil', 'Lead', .25)
create table #Table3 ([Date] date, Raw_Material nvarchar(30), Part_UnitPrice decimal(5,2))
insert into #Table3 values
('20220501', 'Wood', .20),
('20220601', 'Wood', .21),
('20220901', 'Wood', .21),
('20220501', 'Rubber', .10),
('20220601', 'Rubber', .10),
('20220901', 'Rubber', .12),
('20220501', 'Lead', .50),
('20220601', 'Lead', .55),
('20220901', 'Lead', .50)
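
As a cross-check outside SQL Server, the same join-and-aggregate can be run against an in-memory SQLite database from Python. This is only a sketch: the table and column names (items_sold, raw_materials, prices) are hypothetical stand-ins for the temp tables above, dates are stored as ISO text, and SQLite's REAL type replaces decimal, so the sums are rounded before use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items_sold (item TEXT, sold_date TEXT, qty_sold INTEGER);
CREATE TABLE raw_materials (item TEXT, raw_material TEXT, pct_of_total REAL);
CREATE TABLE prices (price_date TEXT, raw_material TEXT, part_unitprice REAL);

INSERT INTO items_sold VALUES
    ('Pencil', '2022-05-01', 1),
    ('Pencil', '2022-06-01', 2),
    ('Pencil', '2022-09-01', 1);
INSERT INTO raw_materials VALUES
    ('Pencil', 'Wood', 0.70),
    ('Pencil', 'Rubber', 0.05),
    ('Pencil', 'Lead', 0.25);
INSERT INTO prices VALUES
    ('2022-05-01', 'Wood', 0.20), ('2022-06-01', 'Wood', 0.21), ('2022-09-01', 'Wood', 0.21),
    ('2022-05-01', 'Rubber', 0.10), ('2022-06-01', 'Rubber', 0.10), ('2022-09-01', 'Rubber', 0.12),
    ('2022-05-01', 'Lead', 0.50), ('2022-06-01', 'Lead', 0.55), ('2022-09-01', 'Lead', 0.50);
""")

# Each sale fans out to its raw materials, then picks up the price for
# that material on the sale date; SUM collapses it back to one row per sale.
rows = conn.execute("""
    SELECT s.item, s.sold_date, s.qty_sold,
           SUM(m.pct_of_total * p.part_unitprice) AS sumproduct_unitprice
    FROM items_sold AS s
    JOIN raw_materials AS m ON m.item = s.item
    JOIN prices AS p ON p.raw_material = m.raw_material
                    AND p.price_date = s.sold_date
    GROUP BY s.item, s.sold_date, s.qty_sold
    ORDER BY s.sold_date
""").fetchall()

# Round away binary floating-point noise before comparing with the question.
results = [(item, day, qty, round(total, 4)) for item, day, qty, total in rows]
for row in results:
    print(row)
```

After rounding, the three values agree with the 0.27, 0.2895 and 0.278 given in the question.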

Related

SQL Query combine 2 rows into 1 adding values

I have a query that can return multiple rows for the same ID from my database. This is because it reads a payment table, and an invoice can be paid in several installments.
So my results can look like this.
ID   Company  BillAmount  AmountPaid
---  -------  ----------  ----------
123  ABC      1000.00     450.00
123  ABC      1000.00     250.00
456  DEF      1200.00     1200.00
I am building this query to put into Crystal Reports. If I just pull the raw data, I won't be able to do any sub totaling in CR as Bill amount on this will show $3200 when it is really $2200. I'll need to show balance and I can do that in CR but if I am pulling balance on each line returned, the total balance due for all records shown will be wrong as the "duplicate" rows will be counted wrong.

I am not sure what kind of report you need, but a query like this might be useful:
select ID, Company, max(BillAmount) as BillAmount, sum(AmountPaid) as AmountPaid
from Payment
group by ID, Company
-improved after Juan Carlos' suggestion
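
To see why collapsing to one row per invoice fixes the subtotals, here is a small sketch using Python's built-in sqlite3 module. The lower-case table and column names are assumptions made for the example; the three rows are the ones shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE payment (id INTEGER, company TEXT, bill_amount REAL, amount_paid REAL);
INSERT INTO payment VALUES
    (123, 'ABC', 1000.00, 450.00),
    (123, 'ABC', 1000.00, 250.00),
    (456, 'DEF', 1200.00, 1200.00);
""")

# One row per invoice: the bill amount repeats on every payment row, so MAX
# picks it once, while the individual payments are summed.
invoices = conn.execute("""
    SELECT id, company,
           MAX(bill_amount) AS bill_amount,
           SUM(amount_paid) AS amount_paid
    FROM payment
    GROUP BY id, company
    ORDER BY id
""").fetchall()

total_billed = sum(row[2] for row in invoices)
total_balance = sum(row[2] - row[3] for row in invoices)
print(invoices, total_billed, total_balance)
```

With one row per invoice, the billed total comes to 2200 rather than 3200, and the balance can be summed safely.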

There are two options available.
On the Crystal Reports side:
Crystal Reports has a grouping facility. Add a group, then put all the fields in the group footer to get a per-group summary.
On the SQL side (you did not say which database you use, so I assume SQL Server 2012 or later):
Return the records with two extra columns (TotalBill, TotalPaid).
declare @Invoice table (id int, Company varchar(25), BillAmount int)
declare @payment table (id int, InvoiceId int, AmountPaid int)
insert into @Invoice values (1, 'ABC', 1000), (2, 'DFE', 1200)
insert into @payment values (1, 1, 450), (2, 1, 250), (3, 2, 1200)
;with cte as
(
    select sum(BillAmount) as TotalBill from @Invoice i
)
select
    i.*, p.AmountPaid,
    sum(p.AmountPaid) over (partition by i.id) as InvoiceWiseTotalPaid,
    cte.TotalBill,
    sum(p.AmountPaid) over (order by i.id) as TotalPaid
from
    @Invoice i
    join @payment p on i.id = p.InvoiceId
    cross join cte

Product Final Price after Many Discount given

I have two tables.
One table holds Ids and their prices, and the second table holds discounts per Id.
In the discounts table an Id can have many discounts, and I need to know the final price of each Id.
What is the best way to query this (in one query)?
The query should be generic for any number of discounts per Id (not only 2, as in the example below).
For example:
Table one
id  price
--  -----
1   2.00
2   2.00
3   2.00
Table two
id  Discount
--  --------
1   0.20
1   0.30
2   0.40
3   0.50
3   0.60
Final result:
id  OrigPrice  PriceAfterDiscount
--  ---------  ------------------
1   2.00       1.12
2   2.00       1.20
3   2.00       0.40

Here's another way to do it:
SELECT T1.ID, T1.Price, T1.Price * EXP(SUM(LOG(1 - T2.Discount)))
FROM T1 INNER JOIN T2 ON T1.ID = T2.ID
GROUP BY T1.ID, T1.Price
The EXP/LOG trick is just another way to do multiplication.
If you have entries in T1 without discounts in T2, you could change the INNER JOIN to a LEFT JOIN. You would end up with the following:
ID Price Discount
4 2.00 NULL
Your logic can either account for the null in the discounted price column and take the original price instead, or just add a 0 discount record for those.
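
The EXP/LOG trick rests on the identity exp(ln a + ln b) = a * b, so summing logarithms is a way to multiply inside an aggregate. A minimal Python sketch with the question's data (the variable names are made up for the example) shows the arithmetic. An id with no discounts gets a log-sum of 0, and exp(0) = 1 leaves its price unchanged. Note that a discount of exactly 1 (100%) would ask for the logarithm of zero, which is undefined, so that case needs separate handling.

```python
import math
from collections import defaultdict

prices = {1: 2.00, 2: 2.00, 3: 2.00}
discounts = [(1, 0.20), (1, 0.30), (2, 0.40), (3, 0.50), (3, 0.60)]

# SUM(LOG(1 - discount)) per id
log_sums = defaultdict(float)
for item_id, discount in discounts:
    log_sums[item_id] += math.log(1 - discount)

# price * EXP(sum of logs); a missing id defaults to 0.0, i.e. a factor of 1
final_prices = {
    item_id: round(price * math.exp(log_sums[item_id]), 2)
    for item_id, price in prices.items()
}

# The same result by multiplying the factors directly
direct = {
    item_id: round(price * math.prod(1 - d for i, d in discounts if i == item_id), 2)
    for item_id, price in prices.items()
}
print(final_prices, direct)
```

Both dictionaries hold the same rounded prices, matching the expected result in the question.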

Generally it can be done with a trick with LOG/EXP functions but it is complex.
Here is a basic example:
declare @p table (id int, price money)
declare @d table (id int, discount money)
insert into @p values
(1, 2),
(2, 2),
(3, 2)
insert into @d values
(1, 0.2),
(1, 0.3),
(2, 0.4),
(3, 0.5),
(3, 0.6)
select p.id,
       p.price,
       p.price * ca.discount as PriceAfterDiscount
from @p p
cross apply (select EXP(SUM(LOG(1 - discount))) as discount FROM @d where id = p.id) ca
For a simpler, iterative approach you can use a recursive CTE, but then you need a unique ordering column in the discounts table to run it correctly. This is shown in @Tanner's answer.
Finally, you can also approach this with a regular cursor.

I believe this produces the desired results using a CTE to iterate through the discounts. The solution below is re-runnable in isolation.
Edited: to include data that might not have any discounts applied in the output with a left join in the first part of the CTE.
CREATE TABLE #price
(
id INT,
price DECIMAL(5, 2)
);
CREATE TABLE #discount
(
id INT,
discount DECIMAL(5, 2)
);
INSERT INTO #price
(
id,
price
)
VALUES
(1, 2.00),
(2, 2.00),
(3, 2.00),
(4, 3.50); -- no discount on this item
INSERT INTO #discount
(
id,
discount
)
VALUES
(1, 0.20),
(1, 0.30),
(2, 0.40),
(3, 0.50),
(3, 0.60);
-- new temporary table to add a row number to discounts so we can iterate through them
SELECT d.id,
d.discount,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY d.discount) rn
INTO #GroupedDiscount
FROM #discount AS d;
-- note left join in first part of cte to get prices that aren't discounted included
WITH cte
AS (SELECT p.id,
           p.price,
           CASE
               WHEN gd.discount IS NULL THEN
                   p.price
               ELSE
                   CAST(p.price * (1.0 - gd.discount) AS DECIMAL(5, 2))
           END AS discountedPrice,
           gd.rn
    FROM #price AS p
        LEFT JOIN #GroupedDiscount AS gd
            ON gd.id = p.id
               AND gd.rn = 1
    UNION ALL
    SELECT cte.id,
           cte.price,
           CAST(cte.discountedPrice * (1.0 - gd.discount) AS DECIMAL(5, 2)) AS discountedPrice,
           cte.rn + 1 AS rn
    FROM cte
        INNER JOIN #GroupedDiscount AS gd
            ON gd.id = cte.id
               AND gd.rn = cte.rn + 1
   )
SELECT cte.id,
       cte.price,
       MIN(cte.discountedPrice) AS discountedPrice
FROM cte
GROUP BY id,
         cte.price;
DROP TABLE #price;
DROP TABLE #discount;
DROP TABLE #GroupedDiscount;
Results:
id  price  discountedPrice
--  -----  ---------------
1   2.00   1.12
2   2.00   1.20
3   2.00   0.40
4   3.50   3.50  -- no discount

As others have said, EXP(SUM(LOG())) is the way to do the calculation. Here is basically the same approach from another angle:
WITH CTE_Discount AS
(
SELECT Id, EXP(SUM(LOG(1-Discount))) as TotalDiscount
FROM TableTwo
GROUP BY id
)
SELECT t1.id, CAST(Price * COALESCE(TotalDiscount,1) AS Decimal(18,2)) as FinalPRice
FROM TableOne t1
LEFT JOIN CTE_Discount d ON t1.id = d.id

Getting the daily sales report given the date

Given a date, the report should display the items sold on that date.
It groups the items by category and shows the total sales value for each category. At the end, the report shows the total sales value for the day(s). Something like this:
ID    Category           Price  Units  Total Value
--------------------------------------------------
2244  class              10.50     10       105.00
2555  class               5.00      5        25.00
3455  class              20.00      1        20.00
      Total                        16       150.00
1255  pop                20.00      5       100.00
5666  pop                10.00     10       100.00
      Total                        15       200.00
1244  rock                2.50     20        50.00
8844  rock                5.00     50       250.00
      Total                        70       300.00
--------------------------------------------------
      Total Daily Sales           101       650.00
DBMS: SQL Server 2012
Bolded: primary keys
Item (upc, title*, type, category, company, year, price, stock)
PurchaseItem (receiptId, upc, quantity)
Order (receiptId, date, cid, card#, expiryDate, expectedDate, deliveredDate)
Rough work of what I have so far..
SELECT I.upc, I.category, I.price, P.quantity, P.quantity*I.price AS totalValue, SUM(totalValue), SUM(P.quantity) AS totalUnits, O.date
FROM Item I, Order O
JOIN (SELECT P.quantity
FROM PurchaseItem P, Item I
WHERE I.upc = P.upc)
ON I.upc = P.upc
WHERE O.date = ({$date}) AND O.receiptId = P.receiptId
GROUP BY I.upc, I.category, I.price, P.quantity, totalValue, O.date
Alright, this isn't right and I'm kind of stuck. Need some help!
I want it so it produces the total value of items from one category then in the end, it will add up the total value of the items from all categories.
SAMPLE TABLES
Item(2568, Beatles, rock, Music Inc, 1998, 50.50, 5000)
PurchaseItem (5300, 2568, 2)
Order (5300, 10/09/2014, ...Not important..) cid is customerId and card# is credit card number.

SSRS or a similar reporting package can usually handle this for you natively. If it has to be done in a SQL script, a quick solution would be a cursor/while loop. Pseudocode would look like this:
create tempsales table;
get distinct list of categories;
for each category
begin
    insert into tempsales (...)  -- one row per item
        sales where category = @category
        group by category, item
    insert into tempsales (...)  -- subtotal row: use NULL for item, or perhaps a value of 'TOTAL'
        sales where category = @category
        group by category
    when last category
        insert into tempsales (...)  -- grand total row: use NULL for item AND category, or perhaps 'TOTAL'
            total with no group
end
select from tempsales;

See if this helps:
CREATE SAMPLE DATA
use tempdb;
create table Item(
upc int,
category varchar(100),
price decimal(8,2)
)
create table PurchaseItem(
receiptId int,
upc int,
quantity int
)
create table [Order](
receiptId int,
[date] date
)
insert into Item values
(2244, 'class', 10.50),
(2555, 'class', 5.0),
(3455, 'class', 20.0),
(1255, 'pop', 20.0),
(5666, 'pop', 10.0),
(1244, 'rock', 2.50),
(8844, 'rock', 5.0)
insert into PurchaseItem values
(5300, 2244, 10),
(5300, 2555, 5),
(5300, 3455, 1),
(5300, 1255, 5),
(5300, 5666, 10),
(5300, 1244, 20),
(5300, 8844, 50)
insert into [Order] values(5300,'20140910')
SOLUTION
;with cte as(
select
i.upc as Id,
i.category as Category,
i.price as Price,
p.quantity as Units,
price * quantity as TotalValue
from [Order] o
inner join PurchaseItem p
on p.receiptId = o.receiptId
inner join Item i
on i.upc = p.upc
)
select
Id,
case
when grouping(Id) = 1 then 'Total'
else Category
end as Category,
Price,
sum(Units) as Units,
sum(TotalValue) as TotalValue
from cte
group by
grouping sets(Category, (Category, Id, Price, Units, TotalValue))
union all
select
null,
'Total Daily Sales',
null,
sum(Units),
sum(Totalvalue)
from cte
DROP SAMPLE DATA
drop table item
drop table PurchaseItem
drop table [Order]
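
For readers who want to check the roll-up logic outside the database, here is a rough Python sketch of the same report shape: item rows, a "Total" row per category, and a final "Total Daily Sales" row. It uses plain grouping rather than GROUPING SETS, the tuple layout is an assumption made for the example, and the rows are copied from the sample data above.

```python
from itertools import groupby

# (upc, category, price, units)
sales = [
    (2244, "class", 10.50, 10),
    (2555, "class", 5.00, 5),
    (3455, "class", 20.00, 1),
    (1255, "pop", 20.00, 5),
    (5666, "pop", 10.00, 10),
    (1244, "rock", 2.50, 20),
    (8844, "rock", 5.00, 50),
]

report = []  # rows of (id, category_or_label, price, units, total_value)
grand_units = 0
grand_value = 0.0

# groupby only merges adjacent rows, so sort by category first
by_category = sorted(sales, key=lambda row: row[1])
for category, rows in groupby(by_category, key=lambda row: row[1]):
    cat_units = 0
    cat_value = 0.0
    for upc, _, price, units in rows:
        value = price * units
        report.append((upc, category, price, units, value))
        cat_units += units
        cat_value += value
    # per-category subtotal row
    report.append((None, "Total", None, cat_units, cat_value))
    grand_units += cat_units
    grand_value += cat_value

# grand total row
report.append((None, "Total Daily Sales", None, grand_units, grand_value))

for row in report:
    print(row)
```

The subtotals and the grand total line up with the figures in the report mock-up from the question.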

Insert into temp table based on rows already in table

I am building a temp table #Table that contains different Categories, Items, and Costs.
For example:
Category  Item   Cost
--------  -----  ------
Office    Desk   100.00
Office    Chair  75.00
Office    PC     800.00
Home      Desk   0.00
At the time that I receive the temp table for my processing, there are the individual rows with Category, Item, and Cost as well as Summary rows that contain the sum of each category that has a non-zero total:
Category  Item   Cost    Type
--------  -----  ------  -------
Office    Desk   100.00  Cost
Office    Chair  75.00   Cost
Office    PC     800.00  Cost
Office    null   975.00  Summary
Home      Desk   0.00    Cost
I would like to add summary rows for the $0.00 cost rows now as well, but am having trouble with figuring out how to do so.
INSERT INTO #Table
SELECT X.Category, null, 0.00, 'Summary'
FROM #Table X
...[Get Category data that does not have a Summary row]
I had thought about
WHERE NOT EXISTS ( SELECT * FROM #Table Y WHERE Y.Category = X.Category AND Type = 'Summary')
GROUP BY X.Category
but am concerned about performance, as there could be a lot of rows in this table.

You should benchmark this, but in my experience joins are faster. Also, if your dataset is very large, adding an index after the data is loaded into the temp table can dramatically speed up subsequent queries against it. Keep the GROUP BY so that each category gets only one summary row, however many cost rows it has.
INSERT INTO #Table
SELECT X.Category, null, 0.00, 'Summary'
FROM #Table X
LEFT JOIN #Table Y ON Y.Category = X.Category AND Y.Type = 'Summary'
WHERE Y.Category IS NULL
GROUP BY X.Category
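
The anti-join pattern can be tried out with Python's sqlite3 module. In this sketch the table name costs and the extra 'Lamp' row are hypothetical additions. The second Home row is there to show that the GROUP BY keeps the insert to one summary row per category even when a category has several cost rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The last row ('Home', 'Lamp') is a made-up extra cost row for the demo.
conn.executescript("""
CREATE TABLE costs (category TEXT, item TEXT, cost REAL, type TEXT);
INSERT INTO costs VALUES
    ('Office', 'Desk', 100.00, 'Cost'),
    ('Office', 'Chair', 75.00, 'Cost'),
    ('Office', 'PC', 800.00, 'Cost'),
    ('Office', NULL, 975.00, 'Summary'),
    ('Home', 'Desk', 0.00, 'Cost'),
    ('Home', 'Lamp', 0.00, 'Cost');
""")

# LEFT JOIN each row to its category's summary row (if any) and keep only
# the categories where no summary row was found.
conn.execute("""
    INSERT INTO costs
    SELECT x.category, NULL, 0.00, 'Summary'
    FROM costs AS x
    LEFT JOIN costs AS y
           ON y.category = x.category AND y.type = 'Summary'
    WHERE y.category IS NULL
    GROUP BY x.category
""")

summaries = conn.execute("""
    SELECT category, cost FROM costs
    WHERE type = 'Summary'
    ORDER BY category
""").fetchall()
print(summaries)
```

Only the Home category gains a new summary row; the existing Office summary is left alone.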

How about doing this in 2 steps:
1. Insert the raw data first, without any summary rows.
2. Insert the summary rows: SELECT the data grouped by Category, using the SUM aggregate on Cost and hard-coding NULL for Item and 'Summary' for Type.

Applying multiple percentages to a column

I know I can use a cursor for this, but I'm trying to write this with ideally a set based solution or perhaps a CTE. I have 2 tables (simplified for post), products - each having a base price, then a table of modifiers which are percentage increases to apply in succession to that price. So if a product has 2 percentages, i.e., 4% and 5%, I can't just increase the base price by 9%, the requirement is to increase the base price by 4% then the result of that is increased by 5%. This can happen 1 to many times. Here is what I have so far:
CREATE TABLE #Product
(ProdID INT,
BasePrice MONEY)
INSERT INTO #Product
VALUES
(1, 10), (2, 20)
CREATE TABLE #Modifiers
(ProdID INT,
ModPercent INT)
INSERT INTO #Modifiers
VALUES
(1, 2), (1,5), (2, 2), (2, 3), (2,5)
The desired output for these 2 products is:
Prod 1 ((10 * 1.02) * 1.05) = 10.71
Prod 2 (((20 * 1.02) * 1.03) * 1.05) = 22.0626
I tried messing around with EXP(SUM(LOG())) in a straight query, but it seems I'm always summing the percentages. I also tried a CTE, but I can't seem to keep it from recursing infinitely:
WITH ProductOutput (ProdID, SumPrice) AS
(
SELECT ProdID, BasePrice
FROM #Product
UNION ALL
SELECT P.ProdID, CAST(O.SumPrice * (1 + (M.ModPercent / 100.00)) AS MONEY)
FROM #Product P
INNER JOIN #Modifiers M ON
P.ProdID = M.ProdID
INNER JOIN ProductOutput AS O
ON P.ProdID = O.ProdID
)
SELECT ProdID, SUM(SumPrice)
FROM ProductOutput
GROUP BY ProdID
I appreciate any insights that could be offered. I would imagine this has been done before, but my searches didn't yield any hits.

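select P.ProdID, exp(sum(log(1 + M.ModPercent / 100.0))) * avg(P.BasePrice) as NewPrice
from #Product P
join #Modifiers M on M.ProdID = P.ProdID
group by P.ProdID
This should do the trick. Note the 100.0: ModPercent is an INT, so in SQL Server ModPercent / 100 would be integer division and evaluate to 0 for every modifier. SQL Server also does not support JOIN ... USING, hence the explicit ON clause.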

SQL Server 2005 added OUTER APPLY, which makes a lot of complex SQL clearer to me. It is not strictly necessary here, since the GROUP BY provides the key insight, but it is worth learning: once you add conditions to the join logic it becomes invaluable.
select P.ProdID
, ML.logmarkup
, P.BasePrice
, P.BasePrice * exp(ML.logmarkup) as NewPrice
from #Product P
outer apply
(
select sum(log(1.0+M.ModPercent/100.0)) as logmarkup
from #Modifiers M where (M.ProdID = P.ProdID)
group by M.ProdID
) ML
ProdID      logmarkup              BasePrice             NewPrice
----------- ---------------------- --------------------- ----------------------
1           0.0685927914656118     10.00                 10.71
2           0.0981515937071562     20.00                 22.0626
(2 row(s) affected)
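
One way this pattern can go wrong is integer division of the percentage. The Python sketch below (variable names are illustrative only) recomputes the log markups for the question's data and then shows that an integer-divided percentage collapses every factor to 1, which leaves the base price unchanged.

```python
import math
from collections import defaultdict

base_prices = {1: 10.0, 2: 20.0}
modifiers = [(1, 2), (1, 5), (2, 2), (2, 3), (2, 5)]  # (ProdID, ModPercent)

# sum(log(1.0 + ModPercent / 100.0)) per product
log_markup = defaultdict(float)
for prod_id, pct in modifiers:
    log_markup[prod_id] += math.log(1 + pct / 100)

new_prices = {
    prod_id: round(base * math.exp(log_markup[prod_id]), 4)
    for prod_id, base in base_prices.items()
}

# Pitfall: with integer division, 5 // 100 is 0, so the factor is exactly 1
# and log(1) contributes nothing to the sum.
broken_factor = 1 + 5 // 100
print(new_prices, broken_factor)
```

The rounded prices match the desired output in the question (10.71 and 22.0626), while the integer-divided factor stays at 1.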
