sum divided values problem (dealing with rounding error) - sql

I have a product that costs 4€ and I need to divide this amount among 3 departments.
In the second column, I need to get the number of rows for this product and divide the total value by that number of departments.
My query:
select
department, totalvalue,
(totalvalue / (select count(*) from departments d2 where d2.department = p.product))
dividedvalue
from products p, departments d
where d.department = p.department
Department  Total Value  Divided Value
----------  -----------  -------------
A           4            1.3333333
B           4            1.3333333
C           4            1.3333333
But when I sum the values, I get 3.999999. Of course, with hundreds of rows I get big differences...
Is there a way to limit the result to 2 decimal places and round the last value? (My results would then be 1.33, 1.33, 1.34.)
I mean, some way to adjust the last row?

In order to handle this, for each row you would have to do the following:
1. Perform the division.
2. Round the result to the appropriate number of cents.
3. Sum the difference between the rounded amount and the result of the division operation.
4. When the sum of the differences exceeds the lowest decimal place (in this case, 0.01), add that amount to the results of the next division operation (after rounding).
This will distribute fractional amounts evenly across the rows. Unfortunately, there is no easy way to do this in SQL with simple queries; it's probably better to perform this in procedural code.
As for how important it is, when it comes to financial applications and institutions, things like this are very important, even if it's only by a penny, and even if it can only happen every X number of records; typically, the users want to see values tie to the penny (or whatever your unit of currency is) exactly.
Most importantly, you don't want to allow for an exploit like "Superman III" or "Office Space" to occur.

With six decimals of precision, you would need about 5,000 transactions to notice a difference of one cent, if you round the final number to two decimals. Increasing the number of decimals to an acceptable level would eliminate most issues, i.e. using 9 decimals you would need about 5,000,000 transactions to notice a difference of a cent.
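The figures above can be checked with a rough worst-case bound. This is only a sketch, and it assumes something the answer does not state: that each stored value is truncated, so it can be off by up to one unit in its last decimal place, and that half a cent of accumulated error is enough to change a total rounded to two decimals. The function name is made up for illustration.

```python
from decimal import Decimal

def transactions_to_shift_a_cent(decimals: int) -> int:
    """Worst-case number of truncated values needed to move a 2-decimal total."""
    # Assumption: each value loses at most one unit in its last stored place.
    max_error_per_value = Decimal(1).scaleb(-decimals)  # 0.000001 for 6 decimals
    half_cent = Decimal("0.005")  # enough drift to flip rounding to the cent
    return int(half_cent / max_error_per_value)

print(transactions_to_shift_a_cent(6))  # 5000
print(transactions_to_shift_a_cent(9))  # 5000000
```

Under these assumptions the numbers match the 5,000 and 5,000,000 quoted above. If values are rounded rather than truncated, the per-value error is halved and the counts double.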

Maybe you can add a fourth row that holds Total - sum(A,B,C).
But it depends on what you want to do: if you need exact values, you can keep the fractions; otherwise, truncate and don't worry about the virtual loss.
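A variation on this idea, closer to what the question asks for (1.33, 1.33, 1.34), is to put the leftover on the last row instead of a separate fourth row. The following is a minimal procedural sketch, not SQL; the function name and the use of Python's `decimal` module are my own choices for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")

def split_remainder_last(total, parts):
    """Round every share to cents and let the last share absorb the leftover."""
    total = Decimal(total)
    share = (total / parts).quantize(CENT, rounding=ROUND_HALF_UP)
    shares = [share] * (parts - 1)
    # The last share is whatever is still missing from the total.
    shares.append(total - share * (parts - 1))
    return shares

shares = split_remainder_last("4.00", 3)
print(shares)       # [Decimal('1.33'), Decimal('1.33'), Decimal('1.34')]
print(sum(shares))  # 4.00
```

Note that with many rows the last share absorbs all of the accumulated drift, so it can differ from the others by more than one cent.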

This can also be done simply by adding the rounding difference of each value to the next number to be rounded (before rounding). This way the pile always remains the same size.
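As a rough procedural sketch of this carry-forward idea (my own illustration in Python, with a made-up function name, not code from any of the answers): each row adds the previous row's rounding difference before it is rounded.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")

def split_with_carry(total, parts):
    """Split total into cent-rounded shares, carrying each rounding
    difference into the next share before that share is rounded."""
    exact = Decimal(total) / parts
    carry = Decimal("0")
    shares = []
    for _ in range(parts):
        adjusted = exact + carry            # add what the previous row lost or gained
        rounded = adjusted.quantize(CENT, rounding=ROUND_HALF_UP)
        carry = adjusted - rounded          # difference handed to the next row
        shares.append(rounded)
    return shares

shares = split_with_carry("4.00", 3)
print(shares)       # [Decimal('1.33'), Decimal('1.34'), Decimal('1.33')]
print(sum(shares))  # 4.00
```

Here the extra cent lands on the second row rather than the last one, because the carried difference is applied as soon as it tips the rounding.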

Here's a TSQL (Microsoft SQL Server) implementation of the algorithm provided by Martin:
-- Set parameters.
DECLARE @departments INTEGER = 3;
DECLARE @totalvalue DECIMAL(19, 7) = 4.0;
WITH
CTE1 AS
(
-- Create the data upon which to perform the calculation.
SELECT
1 AS Department
, @totalvalue AS [Total Value]
, CAST(@totalvalue / @departments AS DECIMAL(19, 7)) AS [Divided Value]
, CAST(ROUND(@totalvalue / @departments, 2) AS DECIMAL(19, 7)) AS [Rounded Value]
UNION ALL
SELECT
CTE1.Department + 1
, CTE1.[Total Value]
, CTE1.[Divided Value]
, CTE1.[Rounded Value]
FROM
CTE1
WHERE
Department < @departments
),
CTE2 AS
(
-- Perform the calculation for each row.
SELECT
Department
, [Total Value]
, [Divided Value]
, [Rounded Value]
, CAST([Divided Value] - [Rounded Value] AS DECIMAL(19, 7)) AS [Rounding Difference]
, [Rounded Value] AS [Calculated Value]
FROM
CTE1
WHERE
Department = 1
UNION ALL
SELECT
CTE1.Department
, CTE1.[Total Value]
, CTE1.[Divided Value]
, CTE1.[Rounded Value]
, CAST(CTE1.[Divided Value] + CTE2.[Rounding Difference] - ROUND(CTE1.[Divided Value] + CTE2.[Rounding Difference], 2) AS DECIMAL(19, 7))
, CAST(ROUND(CTE1.[Divided Value] + CTE2.[Rounding Difference], 2) AS DECIMAL(19, 7))
FROM
CTE2
INNER JOIN CTE1
ON CTE1.Department = CTE2.Department + 1
)
-- Display the results with totals.
SELECT
Department
, [Total Value]
, [Divided Value]
, [Rounded Value]
, [Rounding Difference]
, [Calculated Value]
FROM
CTE2
UNION ALL
SELECT
NULL
, NULL
, SUM([Divided Value])
, SUM([Rounded Value])
, NULL
, SUM([Calculated Value])
FROM
CTE2
;
You can plug in whatever numbers you want at the top. I'm not sure if there is a mathematical proof for this algorithm.

Related

MS Access aggregate function query works on a single table but not on multiple tables

This query works fine:
SELECT
SessionResults.Driver,
SUM( IIF( SessionResults.Place = 1, 1, NULL ) ) AS [Race Wins],
ROUND( AVG( SessionResults.Place ), 2 ) AS [Avg Race Pos],
SUM( SessionResults.TotalLaps ) AS [Total Laps]
FROM
SessionResults
GROUP BY
SessionResults.Driver;
This one does not:
SELECT DISTINCT
SessionResults.Driver,
( SUM( IIF( [SessionResults.Place] = 1, 1, NULL ) ) ) AS [Race Wins],
ROUND( AVG( SessionResults.Place ), 2 ) AS [Avg Race Placing],
ROUND( AVG( SegmentData.SegPlace ), 2 ) AS [Avg Heat Placing],
COUNT( SessionResults.TotalLaps ) AS [Total Laps],
EventHistory.TrackID AS [Track/Layout],
MAX( EventHistory.Date ) AS [Last Race]
FROM
SessionResults,
SegmentData,
EventHistory
WHERE
(
(
( SessionResults.EventID ) = EventHistory.EventID
AND
SegmentData.EventID = SessionResults.EventID
)
)
GROUP BY
SessionResults.Driver,
EventHistory.TrackID;
The second query produces excessive counts in the [Race Wins] column. The other columns appear to be accurate.
I know there are better ways to do this in newer Access versions but I'm not up to speed on those methods either.
What am I missing here? I messed with different Join options but couldn't get anything to work. Can this be done with Access? The version I'm running is very old, Access 2000, V9.
Again, a single table query seems to work fine. The data gets messed up in the [Race Wins] column when I extend the query to other tables within the same mdb file. The single table query produces a max of 36 wins in the Race Wins column. When the query is extended to multiple tables those numbers increase to well over 600 wins, which is not accurate.
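One plausible cause, which is an assumption since the table contents are not shown, is that SegmentData holds several rows per EventID. Each SessionResults row is then repeated once per matching segment row, and every win is counted that many times. A minimal sketch of the effect using SQLite from Python (the sample rows and the reduced column lists are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SessionResults (EventID INTEGER, Driver TEXT, Place INTEGER);
CREATE TABLE SegmentData (EventID INTEGER, SegPlace INTEGER);
-- Hypothetical data: one win for Ann, three heat rows for that event.
INSERT INTO SessionResults VALUES (1, 'Ann', 1), (2, 'Ann', 3);
INSERT INTO SegmentData VALUES (1, 2), (1, 1), (1, 4), (2, 2);
""")

# Single-table count of wins.
wins_single = conn.execute(
    "SELECT SUM(CASE WHEN Place = 1 THEN 1 END) FROM SessionResults"
).fetchone()[0]

# Same aggregate after joining: the winning row appears once per segment row.
wins_joined = conn.execute("""
    SELECT SUM(CASE WHEN s.Place = 1 THEN 1 END)
    FROM SessionResults AS s, SegmentData AS g
    WHERE g.EventID = s.EventID
""").fetchone()[0]

print(wins_single, wins_joined)  # 1 3
```

If this is what is happening, averages would look unaffected because every row is repeated, while sums and counts grow with the number of matching rows.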

Freight per delivery

I am a beginner in SQL and am struggling with a small issue that I hope you can help me with. What I want to achieve: calculate the freight cost per delivery, which depends on the route and the weight.
For this I have one table (shipments 3350) that contains all shipments for a certain period, with delivery number, route, weight, etc. I want to join it with the freight rates table, which holds the routes, the weight categories and the price (one route can have different costs depending on the weight being shipped). The shipments table is also not clean: delivery numbers can appear several times, which should not be the case, so I need to remove duplicate deliveries.
This is what I did: I created two CTEs and joined them afterwards. The outcome looks promising, but I have one issue. Each route has different freight rates depending on the weight, e.g. route abc: 0 to 5 kg costs 5€, more than 5 kg but less than 10 kg costs 10€, and so on. The query should therefore pick the freight cost that matches the route and the weight on the delivery. Sometimes this fails (the wrong freight cost is selected) and I have no clue what needs to be changed. Is there something obviously wrong in my code that prevents me from getting the correct freight costs?
With CTE1 as
(
Select row_number() over (Partition by [Delivery] order by [Delivery]) as ROWID
,[Delivery]
,[Total Weight]
,[CTY]
,[Route]
,[Shipment]
,[SearchTerm]
,[Shipment route]
,[Shipping Conditions]
from [BAAS_PowerBI].[dbo].[Shipments 3350 ]
)
, CTE2 as
(select * from(
select [route],[Lower Scale quantity limit],[Upper scale quantity limit],[Amount],[sales org]
from [BAAS_PowerBI].[dbo].[RM35_freight rates 27112018 test]
)x where x.[sales org]=3350)
Select * from CTE1
left join CTE2
on [CTE1].[route] = [CTE2].[route]
where [Total Weight] <[Upper scale quantity limit] and [Total Weight] >=[Lower Scale quantity limit] and ROWID=1
You can see from the pictures that the query has selected the wrong weight category. It should have selected the category 0-10Kg and not 30-55Kg
The condition
where [Total Weight] < [Upper scale quantity limit]
and [Total Weight] >= [Lower Scale quantity limit]
didn't work because your columns were strings, and for strings '10' is smaller than '2', for instance, because '1' comes before '2' in the character table (ANSI, ASCII, Unicode, whichever it is).
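The same character-by-character ordering can be seen outside SQL. In this small Python sketch the weight of 4 is a made-up value (the real one is only visible in the missing pictures), chosen to show how a text comparison can land in the 30-55 range instead of 0-10:

```python
def in_range_text(weight, lower, upper):
    """Range check on strings: compares character by character."""
    return lower <= weight < upper

def in_range_number(weight, lower, upper):
    """Range check after converting the same strings to numbers."""
    return float(lower) <= float(weight) < float(upper)

print("10" < "2")                          # True: '1' sorts before '2'
print(in_range_text("4", "0", "10"))       # False: "4" is not below "10" as text
print(in_range_text("4", "30", "55"))      # True: wrong category matches
print(in_range_number("4", "0", "10"))     # True
print(in_range_number("4", "30", "55"))    # False
```

In the query itself, the equivalent fix would be to compare numeric values, either by storing the limits and weights in numeric columns or by converting them before comparing.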
But there is another issue with your WHERE clause: it renders your outer join a mere inner join. Here is why:
With CTE1
[Delivery]  [Route]  [Total Weight]
A           X        6
B           X        60
C           Y        6
C Y 6
and CTE2
[Route]  [Lower Scale quantity limit]  [Upper scale quantity limit]
X        1                             10
X        11                            20
This statement:
select *
from cte1
left join cte2 on cte1.route = cte2.route
leads to
[Delivery]  [Route]  [Total Weight]  [Lower Scale quantity limit]  [Upper scale quantity limit]
A           X        6               1                             10
A           X        6               11                            20
B           X        60              1                             10
B           X        60              11                            20
C           Y        6               null                          null
and the WHERE clause
where [Total Weight] < [Upper scale quantity limit]
and [Total Weight] >= [Lower Scale quantity limit]
reduces this to:
[Delivery]  [Route]  [Total Weight]  [Lower Scale quantity limit]  [Upper scale quantity limit]
A           X        6               1                             10
as only this one joined row matches the condition. This result is exactly the same as you would get with an inner join.
What you really want instead is not an outer join that joins all route matches and even keeps routes that have no match (which is what left join cte2 on cte1.route = cte2.route does), but an outer join that joins all route/range matches and even keeps routes/totals that have no matching route/range:
select *
from cte1
left join cte2 on cte1.route = cte2.route
and [Total Weight] < [Upper scale quantity limit]
and [Total Weight] >= [Lower Scale quantity limit]
[Delivery]  [Route]  [Total Weight]  [Lower Scale quantity limit]  [Upper scale quantity limit]
A           X        6               1                             10
B           X        60              null                          null
C           Y        6               null                          null
Here you join every CTE1 row with their matching CTE2 row or with a dummy CTE2 row consisting of nulls when there is no match in CTE2.
(ROWID=1 belongs in the WHERE clause by the way, as this has nothing to do with which CTE2 rows to join to CTE1, but merely says which CTE1 rows you want to consider. If you mistakenly put ROWID=1 in the ON clause, too, you would suddenly select all CTE1 rows, but only look for CTE2 matches for those with ROWID=1.)
In short: When you outer join a table, put all its join criteria in the ON clause.
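The example above can be reproduced with a small script. This sketch uses SQLite through Python rather than SQL Server, and shortens the column names (`total_weight`, `lower_limit`, `upper_limit` are my own stand-ins), but the join behaviour it shows is the same:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cte1 (delivery TEXT, route TEXT, total_weight INTEGER);
CREATE TABLE cte2 (route TEXT, lower_limit INTEGER, upper_limit INTEGER);
INSERT INTO cte1 VALUES ('A', 'X', 6), ('B', 'X', 60), ('C', 'Y', 6);
INSERT INTO cte2 VALUES ('X', 1, 10), ('X', 11, 20);
""")

# Range condition in WHERE: rows without a match are filtered out again.
where_rows = conn.execute("""
    SELECT delivery, lower_limit, upper_limit
    FROM cte1
    LEFT JOIN cte2 ON cte1.route = cte2.route
    WHERE total_weight < upper_limit AND total_weight >= lower_limit
    ORDER BY delivery
""").fetchall()

# Range condition in ON: every delivery is kept, unmatched ones get NULLs.
on_rows = conn.execute("""
    SELECT delivery, lower_limit, upper_limit
    FROM cte1
    LEFT JOIN cte2 ON cte1.route = cte2.route
        AND total_weight < upper_limit
        AND total_weight >= lower_limit
    ORDER BY delivery
""").fetchall()

print(where_rows)  # [('A', 1, 10)]
print(on_rows)     # [('A', 1, 10), ('B', None, None), ('C', None, None)]
```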

Dividing two SQL query results

I'm just trying to learn how to take a value from a column, in this case how much JJ spent on product a, and divide it by the sum of the total Product A sales and turn it into a percentage.
My SQL understanding is pretty low level right now, so the simpler the response the better.
SELECT
JJ / Result * 100 AS percentage
FROM
(SELECT
([Product A] AS JJ
FROM [Test].[dbo].[TableA]
WHERE [Customer Name] = 'JJ'
SELECT SUM([Product A]) AS Result
FROM [Test].[dbo].[TableA]
)
--JJ/Result * 100 = ProdAPercentSales)
You could use a case expression to find JJ's purchases, and divide their sum by the total sum:
-- Multiplying by 100.0 first avoids integer division if [Product A] is an integer column.
SELECT 100.0 * SUM(CASE [Customer name] WHEN 'JJ' THEN [Product A] ELSE 0 END) /
SUM([Product A]) AS [Percentage]
FROM [Test].[dbo].[TableA]
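A quick way to try the conditional-sum approach is with a throwaway SQLite table from Python. The table contents and the shortened column names below are invented, and the sketch assumes the amounts are stored as integers, in which case dividing before multiplying by 100 truncates to zero:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TableA (customer_name TEXT, product_a INTEGER);
-- Hypothetical sales: JJ bought 30 out of a total of 100.
INSERT INTO TableA VALUES ('JJ', 30), ('Ann', 50), ('Bob', 20);
""")

jj_sum = "SUM(CASE customer_name WHEN 'JJ' THEN product_a ELSE 0 END)"

# Integer division happens first: 30 / 100 is 0, so the result is 0.
integer_pct = conn.execute(
    f"SELECT {jj_sum} / SUM(product_a) * 100 FROM TableA"
).fetchone()[0]

# Multiplying by 100.0 first keeps the arithmetic in decimals.
decimal_pct = conn.execute(
    f"SELECT 100.0 * {jj_sum} / SUM(product_a) FROM TableA"
).fetchone()[0]

print(integer_pct, decimal_pct)  # 0 30.0
```

If the column is already a decimal or money type, both forms give the same percentage.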

How do I divide the sum of one column by the sum of another column when they are from different tables in Access?

I want to grab the sum of two identically named columns, except they are from two different tables (I have been told to combine them multiple times, unfortunately I do not have a choice in the matter...). I want to then divide these two summed values to find the percentage variation.
Below is the code I wrote, all it does is load infinitely.
SELECT SUM(C.[Market Value]) AS CSUM,
SUM(P.[Market Value]) AS PSUM,
CSUM/PSUM AS Percentage_Variation
FROM JanReport AS P,
FebReport AS C;
I have been unsuccessful in using UNION and JOIN. It seems like this function should be easily doable, as I have been able to sum and find % variation by asset class, I can even get the sum of both columns to show with the Totals function. However I just cannot seem to get them to divide.
Code that worked:
SELECT SUM(FebReport.[Market Value]) AS Curr_Total_MV,
(
SELECT SUM(JanReport.[Market Value]) FROM JanReport
) AS Prior_Total_MV,
SUM(FebReport.[Market Value]) /
(
SELECT SUM([Market Value]) FROM JanReport
) AS Percentage_Variation_In_Total_MV,
IIf(
0.9<Percentage_Variation_In_Total_MV AND
Percentage_Variation_In_Total_MV<1.1,'Pass','Fail') AS Result
FROM FebReport;
Inline SELECT should work here:
SELECT SUM([Market Value]) / (SELECT SUM([Market Value]) FROM JanReport) AS Percentage_Variation
FROM FebReport;
Update: Two sub-queries each returning just one row can be cross-joined to provide the underlying values:
SELECT
[Current Market Value],
[Previous Market Value],
[Current Market Value] / [Previous Market Value] As Percentage_Variation
FROM
(SELECT SUM([Market Value]) As [Current Market Value] FROM FebReport) AS C,
(SELECT SUM([Market Value]) As [Previous Market Value] FROM JanReport) AS P;
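To see why cross-joining the two one-row sub-queries behaves differently from cross-joining the raw tables, here is a small sketch using SQLite from Python instead of Access. The market values are invented; the point is that the raw cross join produces one row per pair of rows, which inflates both sums and grows very quickly with table size:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE JanReport ("Market Value" REAL);
CREATE TABLE FebReport ("Market Value" REAL);
-- Hypothetical data: January totals 400, February totals 500.
INSERT INTO JanReport VALUES (100.0), (200.0), (100.0);
INSERT INTO FebReport VALUES (300.0), (200.0);
""")

# Raw cross join: 3 x 2 = 6 rows, each sum is multiplied by the other table's row count.
pair_count, raw_ratio = conn.execute("""
    SELECT COUNT(*), SUM(c."Market Value") / SUM(p."Market Value")
    FROM JanReport AS p, FebReport AS c
""").fetchone()

# Cross join of two one-row sub-queries: exactly one row, sums are untouched.
ratio = conn.execute("""
    SELECT c.total / p.total
    FROM (SELECT SUM("Market Value") AS total FROM FebReport) AS c,
         (SELECT SUM("Market Value") AS total FROM JanReport) AS p
""").fetchone()[0]

print(pair_count, raw_ratio, ratio)  # 6 1.875 1.25
```

With two reports of a few thousand rows each, the raw cross join would have to build millions of row pairs, which may be why the original query appears to load forever.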

SQL: Pivot table which includes sum and percentage total

I'm trying to recreate a view in Tableau as a view in SQL. It requires me to pivot a table by month and not only sum the amount, but also sum by margin and create a Margin % row. The desired output is:
BUSINESS_UNIT  CLASS          JANUARY  FEBRUARY  MARCH
202            Cost of Sales  100      (null)    60
202            Revenue        200      80        (null)
202            Margin         x        xx        xxx
202            Margin %       x%       xx%       xxx%
I can pivot based on month, but how do I perform two sums in one pivot table, and how would I go about including a percentage row as well?
Code so far
SELECT
*
FROM
(SELECT
[Business_Unit]
,[Class]
,Month as Period
,[Amount]
--,Margin
FROM [sample_table]
where [Class] in ('Revenue','Cost of Sales') )AS T
PIVOT(SUM(Amount)
FOR Period IN ([January],[February],[March])) as Pvt
I have included my code so far http://www.sqlfiddle.com/#!3/06bafc/6
Not the prettiest SQL I've written, but this seems to work:
http://www.sqlfiddle.com/#!3/06bafc/60/0
It builds on what you've done by generating a Margin line and adding a Total column.
Using this line and total, we can then calculate the margin %. GROUPING SETS let me generate the multiple rows, subtotals and totals. Since I knew the only additional line generated would have a NULL class, I was able to set the class name to 'Margin' when it is NULL.
WITH CTE AS (
SELECT
Business_Unit
,case when class is NULL then 'Margin' else class end as Class
,Sum(January) as January
,Sum(February) as February
,Sum(March) as march
,Sum(coalesce(January,0)+coalesce(February,0)+coalesce(March,0)) as Total
FROM (
SELECT
*
FROM
(SELECT
[Business_Unit]
,[Class]
,Month as Period
,[Amount]
--,Margin
FROM [sample_table]
where [Class] in ('Revenue','Cost of Sales') )AS T
PIVOT(SUM(Amount)
FOR Period IN ([January],[February],[March])) as Pvt
) as Base
GROUP BY Grouping sets
((Business_Unit,Class,January,February,March,
coalesce(January,0)+coalesce(February,0)+coalesce(March,0))
,(Business_Unit)
))
SELECT *
FROM CTE UNION
SELECT Business_Unit
,'Margin %'
,January*100.00/Total
,February*100.00/Total
,March*100.00/Total
,Total*100.00/Total
FROM CTE
WHERE CLASS='Margin'