Rounding all numbers while keeping the sum - sql

Say I have 4 records as follows:
CASH
========
1993.772
5015.572
996.884
1993.772
These numbers add up to 10000.00. Now I want to round all these numbers to two decimal places, but keep the sum as 10000.00.
In the above example, if I remove the last digit, the sum of the numbers will be 9999.99 and not 10000.00, but something like this would still add up to 10000.00:
CASH
========
1993.78 <- changed from 1993.77 to 1993.78
5015.57
996.88
1993.77
Any easy way to do it?

This is challenging. Here is one method:
select t.*,
       round(cash, 2) as rounded_cash,
       (case when row_number() over (order by cash desc) = 1
             -- largest row: total minus the rounded sum of all the other rows
             then sum(cash) over () - sum(round(cash, 2)) over (order by cash rows between unbounded preceding and 1 preceding)
             else round(cash, 2)
        end) as adjusted_cash
from t;
Basically, this rounds all the values to two decimal places except the biggest one. For that one, it subtracts the sum of the rounded values from the total.
Note: This adds the rounding difference to the largest value. It could adjust the smallest value instead, but I think adjusting the largest is safer (a smaller relative change to the value). If you have other columns that specify the ordering, then the "first" or "last" row can be chosen instead.
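To check the arithmetic outside the database, here is a minimal Python sketch of the same idea: round every value, then let the largest one absorb the difference. The function name round_keep_sum is made up for illustration, and the sketch assumes the total is already a whole number of cents, as in the question.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")

def round_keep_sum(values):
    """Round every value to 2 places; the largest value absorbs the residual."""
    rounded = [v.quantize(CENT, rounding=ROUND_HALF_UP) for v in values]
    largest = max(range(len(values)), key=lambda i: values[i])
    others = sum(r for i, r in enumerate(rounded) if i != largest)
    # Same as the SQL: raw total minus the rounded sum of all the other rows.
    # Assumption: the raw total is already a whole number of cents.
    rounded[largest] = (sum(values) - others).quantize(CENT, rounding=ROUND_HALF_UP)
    return rounded

cash = [Decimal("1993.772"), Decimal("5015.572"), Decimal("996.884"), Decimal("1993.772")]
adjusted = round_keep_sum(cash)
print(adjusted, sum(adjusted))
```

With the four values from the question, the largest row becomes 5015.58 instead of 5015.57 and the other three are plain rounded values.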

Aggregate all of the values up into a sum and into an array, then unroll the array back into their separate rows.
SELECT
unnest(array_agg(round(ct.cash, 2))) AS cash,
round(sum(ct.cash), 2) AS total
FROM cash_table AS ct;
Result
  cash   |  total
---------+----------
 1993.77 | 10000.00
 5015.57 | 10000.00
  996.88 | 10000.00
 1993.77 | 10000.00

Calculating Column value based on row above and previous column [duplicate]

I have a table I'm trying to create that has a column that needs to be calculated based on the row above it multiplied by the previous column. The first row is defaulted to 100,000 and the rest of the rows would be calculated off of that. Here's an example:
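Age | Population | Deaths | DeathRate | DeathPro | DeathProb | SurvivalProb | PersonsAlive
----+------------+--------+-----------+----------+-----------+--------------+-------------
0   | 1742       | 0      | 0         | 0.1      | 0         | 1            | 100,000
51  | 2048       | 1      | 0.00048   | 0.5      | 0.00048   | 0.99951      | 99951.18379
52  | 1921       | 0      | 0         | 0.5      | 0         | 1            | 99951.18379
61  | 1965       | 1      | 0.00051   | 0.5      | 0.00051   | 0.99949      | 99900.33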
I skipped some ages so I didn't have to type them all in, but the ages go from 0 to 85. This was originally done in Excel, and the formula for PersonsAlive (which is what I'm trying to recreate) was G3*H2, i.e. the previous value of PersonsAlive * Survival Probability.
I was thinking I could accomplish this with the lag function, but with the example I provided above, I get null values for everything after age 1 because there is no value in the previous row. What I want to happen is that PersonsAlive returns 100,000 until I get a death (in the example at Age 51) and then it does the calculation and returns the value (99951) until another death happens (Age 61). Here's my code, which includes two extra columns, ZipCode (the reason we want to do it in SQL is so we can calculate all zips at once) and PersonsAliveTemp, which I used to set Age 0 to 100,000:
SELECT
ZipCode
,Age
,[Population]
,Deaths
,DeathRate
,Death_Proportion
,DeathProbablity
,SurvivalProbablity
,PersonsAliveTemp
,(LAG(PersonsAliveTemp,1) OVER(PARTITION BY ZipCode ORDER BY Age))*SurvivalProbablity as PersonsAlive
FROM #temp4
I also tried it with defaulting PersonsAliveTemp to 100,000 and 0, which "works" but doesn't do the running calculation.
Is it possible to get the lag function (or some other function) to do a running row by row calc?
This converts a running product into an addition via logarithms.
select *,
100000 * exp(sum(log(SurvivalProb)) over
(partition by ZipCode order by Age
rows between unbounded preceding and current row)
) as PersonsAlive
from data
order by Age;
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=36be4d66260c74196f7d36833018682a
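To see why the logarithm trick works, here is a small Python sketch that compares exp(running sum of logs) with a plain running product. The function name persons_alive and the four probabilities (taken from the sample rows in the question) are only for illustration. It assumes every SurvivalProb is strictly positive; a zero or negative value has no logarithm and would need separate handling.

```python
import math
from itertools import accumulate

def persons_alive(survival_probs, start=100_000.0):
    """Running product computed as exp(running sum of logs), like the SQL above."""
    log_sums = accumulate(math.log(p) for p in survival_probs)
    return [start * math.exp(s) for s in log_sums]

# SurvivalProb for ages 0, 51, 52, 61 from the question's sample rows
probs = [1.0, 0.99951, 1.0, 0.99949]

via_logs = persons_alive(probs)
# Reference: multiply row by row, the way the Excel formula does
direct = [100_000.0 * x for x in accumulate(probs, lambda a, b: a * b)]

for a, b in zip(via_logs, direct):
    print(round(a, 5), round(b, 5))
```

Both columns agree up to floating-point noise, and the value stays flat on rows where SurvivalProb is 1.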

How to take average of two columns row by row in SQL?

I have a table match which looks like this (please see attached image). I wanted to retrieve a dataset that had a column of average values for home_goal and away_goal using this code
SELECT
m.country_id,
m.season,
m.home_goal,
m.away_goal,
AVG(m.home_goal + m.away_goal) AS avg_goal
FROM match AS m;
However, I got this error
column "m.country_id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 3: m.country_id,
My question is: why was GROUP BY clause required? Why couldn't SQL know how to take average of two columns row by row?
Thank you.
Try this:
SELECT
m.country_id,
m.season,
m.home_goal,
m.away_goal,
(m.home_goal + m.away_goal)/2 AS avg_goal
FROM match AS m;
You were asked for a GROUP BY because AVG(), much like SUM(), works on multiple values of one column; every selected column that is not inside an aggregate must then be listed in the GROUP BY.
You are looking to average two distinct columns: that is a row-wise operation, not a column-wise one.
how to take average of two columns row by row?
You don't use AVG() for this; it is an aggregate function, that operates over a set of rows. Here, it seems like you just want a simple math computation:
SELECT
m.country_id,
m.season,
m.home_goal,
m.away_goal,
(m.home_goal + m.away_goal) / 2.0 AS avg_goal
FROM match AS m;
Note the decimal denominator (2.0): this avoids integer division in databases that implement it.
Avg in the context of the function mentioned above is calculating the average of the values of the columns and not the average of the two values in the same row. It is an aggregate function and that’s why the group by clause is required.
In order to take the average of two columns in the same row you need to divide by 2.
Let's consider the following table:
CREATE TABLE Numbers([x] int, [y] int, [category] nvarchar(10));
INSERT INTO Numbers ([x], [y], [category])
VALUES
(1, 11, 'odd'),
(2, 22, 'even'),
(3, 33, 'odd'),
(4, 44, 'even');
Here is an example of using two aggregate functions - AVG and SUM - with GROUP BY:
SELECT
Category,
AVG(x) as avg_x,
AVG(x+y) as avg_xy,
SUM(x) as sum_x,
SUM(x+y) as sum_xy
FROM Numbers
GROUP BY Category
The result has two rows:
Category | avg_x | avg_xy | sum_x | sum_xy
even     | 3     | 36     | 6     | 72
odd      | 2     | 24     | 4     | 48
Please note that Category is available in the SELECT part because the results are grouped by it. If a GROUP BY is not specified, then the result would be 1 row and Category would not be available (which value should be displayed if we have sums and averages for multiple rows with different categories?).
What you want is to compute a new column and for this you don't use aggregate functions:
SELECT
(x+y)/2 as avg_xy,
(x+y) as sum_xy
FROM Numbers
This returns all rows:
avg_xy | sum_xy
6      | 12
12     | 24
18     | 36
24     | 48
If your columns are integers, remember that integer division truncates the result; cast one operand if you need decimals. For example: (CAST(x AS DECIMAL)+y)/2 as avg_xy
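The difference between the two kinds of query can be mimicked in a few lines of Python. This is only a sketch over the same four rows as the Numbers table; the variable names are made up for illustration.

```python
from collections import defaultdict

# Same rows as the Numbers table: (x, y, category)
rows = [(1, 11, "odd"), (2, 22, "even"), (3, 33, "odd"), (4, 44, "even")]

# Aggregate (column-wise): one result per category, like AVG(x+y) ... GROUP BY Category
groups = defaultdict(list)
for x, y, category in rows:
    groups[category].append(x + y)
avg_xy_by_category = {c: sum(v) / len(v) for c, v in groups.items()}

# Row-wise: one result per input row, like (x+y)/2 with no GROUP BY
avg_xy_by_row = [(x + y) / 2 for x, y, _ in rows]

print(avg_xy_by_category)
print(avg_xy_by_row)
```

The aggregate collapses four rows into two, while the row-wise expression keeps all four.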
The simple arithmetic calculation:
(m.home_goal + m.away_goal) / 2.0
is not exactly equivalent to AVG(), because NULL values mess it up. Databases that support lateral joins provide a pretty easy (and efficient) way to actually use AVG() within a row.
The safe version looks like:
(coalesce(m.home_goal, 0) + coalesce(m.away_goal, 0)) /
nullif( (case when m.home_goal is not null then 1 else 0 end +
case when m.away_goal is not null then 1 else 0 end
), 0
)
Some databases have syntax extensions that allow the expression to be simplified.
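For comparison, the same NULL handling can be sketched in Python, with None standing in for NULL. The helper row_avg is hypothetical; it is meant to mirror what AVG() does with NULLs, namely ignore them and return NULL when nothing is left.

```python
def row_avg(*values):
    """Average the non-None values of one row; None if every value is None."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None

print(row_avg(2, 1))        # both goals known
print(row_avg(3, None))     # one side missing: average of what is there
print(row_avg(None, None))  # nothing to average
```

Dividing by the count of non-NULL values rather than by a fixed 2 is what keeps a missing value from being treated as a zero.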

Complicate SQL Amount split by percentage in same row transpose (Pivot)?

I am struggling to split a total amount field by the percentages in the same row and then update the last column with the amount type to which each percentage applies.
Example data
Total Amount | UF%   | UFI% | RA%  | RL%   | NP%   | AmountType
100          | 0.00  | 20   | 9.15 | 0.75  | 70.01 |
1520.23      | 64.4  | 19.1 | 15.5 | 0.25  | 0.75  |
158520.03    | 13.25 | 35   | 2.25 | 19.28 | 30.22 |
I have to get each percentage of the Total Amount column, transpose the results, insert them as additional rows in the same table, and update the last column with the type of amount it is.
For example, for the 1st row I would get 5 new rows:
Total Amount | Amount type
0            | UF%
20           | UFI%
9.15         | RA%
0.75         | RL%
70.01        | NP%
I am taking it one step at a time, so I have created 5 new columns to calculate the percentages: TotalAmount UF%, TotalAmount UFI%, TotalAmount RA% and so on…
Select [Total Amount] * [UF%] as [TotalAmount UF%] … and so on.
I am stuck here. Shall I use PIVOT/UNPIVOT? Or CASE?
Or is there an easier way, such as row over partition by?
Please suggest.
This should work for you. Just copy this into an empty query window and execute. Adapt to your needs...
EDIT: Calculate percentages...
declare @amounts table (TotalAmount decimal(8,2), [UF%] decimal(4,2), [UFI%] decimal(4,2)
                       ,[RA%] decimal(4,2), [RL%] decimal(4,2)
                       ,[NP%] decimal(4,2));
insert into @amounts values
 (100,0.00,20,9.15,0.75,70.01)
,(1520.23,64.4,19.1,15.5,0.25,0.75)
,(158520.03,13.25,35,2.25,19.28,30.22);
-- one output row per (source row, percentage column)
select up.TotalAmount
      ,up.Percentag
      ,(up.TotalAmount/100)*up.Percentag AS AmountPercentage
      ,up.Amount AS AmountType
from
(
    select *
    from @amounts
) AS tbl
unpivot
(
    Percentag FOR Amount IN([UF%],[UFI%],[RA%],[RL%],[NP%])
) AS up
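If it helps to see what the unpivot produces, here is a rough Python equivalent for the first sample row. The function name unpivot and the dictionary layout are assumptions made for this sketch, not part of the SQL.

```python
from decimal import Decimal

PCT_COLUMNS = ["UF%", "UFI%", "RA%", "RL%", "NP%"]

# First sample row from the question
sample = [
    {"TotalAmount": Decimal("100"), "UF%": Decimal("0.00"), "UFI%": Decimal("20"),
     "RA%": Decimal("9.15"), "RL%": Decimal("0.75"), "NP%": Decimal("70.01")},
]

def unpivot(rows):
    """Turn each percentage column into its own (total, pct, amount, type) row."""
    out = []
    for row in rows:
        for col in PCT_COLUMNS:
            amount = row["TotalAmount"] / 100 * row[col]
            out.append((row["TotalAmount"], row[col], amount, col))
    return out

for line in unpivot(sample):
    print(line)
```

Each source row fans out into five rows, one per percentage column, with the column name carried along as the amount type.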

SQL Rounding Percentages to make the sum 100% - 1/3 as 0.34, 0.33, 0.33

I am currently trying to split one value using a percentage column. But as most of the percentage values are 1/3, I am not able to get an absolute 100% with two decimal places in the value. For example:
Product  | Supplier  | percentage     | totalvalue    | customer_split
         |           | decimal(15,14) | decimal(18,2) | decimal(18,2)
---------+-----------+----------------+---------------+---------------
Product1 | Supplier1 | 0.33           | 10.00         | 3.33
Product1 | Supplier2 | 0.33           | 10.00         | 3.33
Product1 | Supplier3 | 0.33           | 10.00         | 3.33
So here we are missing 0.01 in the value column, and the suppliers would like this missing 0.01 to be assigned to any one of the suppliers at random. I have been trying to do this with two sets of SQL statements and temporary tables, but is there a simpler way? If possible, how can I get 0.34 in the percentage column itself for one of the above rows? 0.01 is a negligible value, but when the value column is 1000000000 it is significant.
It sounds like you're doing some type of "allocation" here. This is a common problem any time you are trying to allocate something from a higher granularity to a lower granularity, and you need to be able to re-aggregate to the total value correctly.
This becomes a much bigger problem when dealing with larger fractions.
For example, if I try to divide a total value of, say $55.30 by eight, I get a decimal value of $6.9125 for each of the eight buckets. Should I round one to $6.92 and the rest to $6.91? If I do, I will lose a cent. I would have to round one to $6.93 and the others to $6.91. This gets worse as you add more buckets to divide by.
In addition, when you start to round, you introduce problems like "Should 33.339 be rounded to 33.34 or 33.33?"
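The $55.30 example can be worked through in a few lines of Python using exact decimals. This is just the arithmetic from the paragraph above; the variable names are made up.

```python
from decimal import Decimal, ROUND_DOWN

total = Decimal("55.30")
buckets = 8

exact = total / buckets                                      # 6.9125 per bucket
base = exact.quantize(Decimal("0.01"), rounding=ROUND_DOWN)  # 6.91 after truncating
leftover = total - base * buckets                            # cents lost by truncating

# One bucket takes the whole leftover, the rest keep the truncated value
shares = [base + leftover] + [base] * (buckets - 1)
print(exact, base, leftover, shares[0], sum(shares))
```

Truncating loses two cents in total, which is why one bucket ends up at 6.93 rather than 6.92.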
If your business logic is such that you just want to take whatever remainder exists beyond 2 decimal places and add it to one of the dollar values "randomly" so you don't lose any cents, @Diego is on the right track with this.
Doing it in pure SQL is a bit more difficult. For starters, your percentage isn't 1/3, it's .33, which will yield a total value of 9.9, not 10. I would either store this as a ratio or as a high-precision decimal field (.33333333333333).
P  | S  | PCT          | Total
---+----+--------------+------
P1 | S1 | .33333333333 | 10.00
P2 | S2 | .33333333333 | 10.00
P3 | S3 | .33333333333 | 10.00
SELECT
    BaseTable.P, BaseTable.S,
    CASE WHEN BaseTable.S = TotalTable.MinS
         THEN BaseTable.BaseAllocatedValue + TotalTable.Remainder
         ELSE BaseTable.BaseAllocatedValue
    END AS AllocatedValue
FROM
    (SELECT
         -- each share truncated to whole cents
         P, S, FLOOR(PCT * Total * 100) / 100 AS BaseAllocatedValue
     FROM dataTable) BaseTable
INNER JOIN
    (SELECT
         P, MIN(S) AS MinS,
         -- what is left over per product after truncating every share
         SUM((PCT * Total) - FLOOR(PCT * Total * 100) / 100) AS Remainder
     FROM dataTable
     GROUP BY P) AS TotalTable
ON (BaseTable.P = TotalTable.P)
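Here is a hedged Python sketch of the same floor-and-assign-the-remainder idea, using the three suppliers from the question. The function name allocate is hypothetical. One deliberate difference from the query: the sketch rounds the remainder to whole cents, because a truncated ratio such as .33333333333333 otherwise leaves a remainder of 0.0099999... instead of 0.01. It also assumes non-negative amounts, where truncation and FLOOR agree.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP

CENT = Decimal("0.01")

def allocate(rows):
    """rows: (product, supplier, pct, total) -> {(product, supplier): amount}."""
    # Truncate each share to whole cents (FLOOR in the SQL, for non-negative values)
    base = {(p, s): (pct * total).quantize(CENT, rounding=ROUND_DOWN)
            for p, s, pct, total in rows}
    result = dict(base)
    for product in {r[0] for r in rows}:
        members = [r for r in rows if r[0] == product]
        exact = sum(pct * total for _, _, pct, total in members)
        floored = sum(base[(p, s)] for p, s, _, _ in members)
        # Assumption added here: round the remainder to whole cents
        remainder = (exact - floored).quantize(CENT, rounding=ROUND_HALF_UP)
        first = min(s for _, s, _, _ in members)  # MIN(S) in the SQL
        result[(product, first)] += remainder
    return result

third = Decimal("0.33333333333333")
data = [("Product1", "Supplier1", third, Decimal("10.00")),
        ("Product1", "Supplier2", third, Decimal("10.00")),
        ("Product1", "Supplier3", third, Decimal("10.00"))]
allocated = allocate(data)
print(allocated)
```

Supplier1 receives the extra cent, so the three amounts add back up to 10.00.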
It appears your calculation is an equal distribution based on the total number of products per supplier. If it is, it may be advantageous to remove the percentage and instead just store the count of items per supplier in the table.
If it is also possible to store a flag indicating the row that should get the remainder value applied to it, you could assign based on that flag instead of randomly.
Run this; it will give you an idea of how you can solve your problem.
I created a table called orders with just an ID, to keep it easy to understand:
create table orders(
    customerID int)
go
insert into orders values(1)
go 3
insert into orders values(2)
go 3
insert into orders values(3)
go 3
These values represent the 33% you have:
1 33.33
2 33.33
3 33.33
now:
create table #tempOrders(
    customerID int,
    percentage numeric(10,2))
declare @maxOrder int
declare @maxOrderID int
select @maxOrderID = max(customerID) from orders
declare @total numeric(10,2)
select @total = count(*) from orders
insert into #tempOrders
select customerID, cast(100*count(*)/@total as numeric(10,2)) as Percentage
from orders
group by customerID
-- add whatever is missing from 100 to the row with the highest ID
update #tempOrders set percentage = percentage + (select 100-sum(Percentage) from #tempOrders)
where customerID = @maxOrderID
This code basically calculates the percentages and finds the order with the max ID, then takes the difference between 100 and the sum of the percentages and adds it to the order with the max ID (your random order).
select * from #tempOrders
1 33.33
2 33.33
3 33.34
This should be an easy task using Windowed Aggregate Functions. You probably use them already for the calculation of customer_split:
totalvalue / COUNT(*) OVER (PARTITION BY Product) as customer_split
Now sum up the customer_splits, and if there's a difference from the total value, add it to (or subtract it from) one random row.
SELECT
Product
,Supplier
,totalvalue
,customer_split
+ CASE
WHEN COUNT(*)
OVER (PARTITION BY Product
ROWS UNBOUNDED PRECEDING) = 1 -- get a random row, using row_number/order you might define a specific row
THEN totalvalue - SUM(customer_split)
OVER (PARTITION BY Product)
ELSE 0
END
FROM
(
SELECT
Product
,Supplier
,totalvalue
,totalvalue / COUNT(*) OVER (PARTITION BY Product) AS customer_split
FROM dropme
) AS dt
After more than one round of trial and testing, I think I found a better solution.
Idea:
Get the count of all rows (Count(*)) matching your conditions
Get Row_Number()
Check if (Row_Number() value < Count(*))
Then select round(curr_percentage,2)
Else
Get the sum of all the other percentages (rounded) and subtract it from 100
These steps select the rounded percentage for every row EXCEPT the last one, which will be
100 - the sum of all other percentages
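The steps above can be sketched in Python as follows. The function name percentages_summing_to_100 is made up for illustration, and the sketch assumes a non-empty list of fractions that add up to 1.

```python
from decimal import Decimal, ROUND_HALF_UP

def percentages_summing_to_100(fractions):
    """Round each share to 2 places; the last one is 100 minus all the others."""
    pcts = [(f * 100).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
            for f in fractions]
    # Last row: not rounded on its own, but forced to close the gap to 100
    pcts[-1] = Decimal("100.00") - sum(pcts[:-1])
    return pcts

third = Decimal(1) / Decimal(3)
pcts = percentages_summing_to_100([third, third, third])
print(pcts, sum(pcts))
```

For three equal partners, the first two get 33.33 and the last one gets 33.34.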
This is part of my code:
Select your_cols
    ,(Select count(*) from [tbl_Partner_Entity] pa_et where [E_ID] = @E_ID)
        AS cnt_all
    ,(ROW_NUMBER() over ( order by pe.p_id)) as row_num
    ,Case when (
        (ROW_NUMBER() over ( order by pe.p_id)) <
        (Select count(*) from [tbl_Partner_Entity] pa_et where [E_ID] = @E_ID))
    then round(([partnership_partners_perc]*100),2)
    else
        -- last row: 100 minus the rounded percentages of all the other rows
        100-
        ((select sum(round(([partnership_partners_perc]*100),2)) FROM [dbo].[tbl_Partner_Entity] PEE where [E_ID] = @E_ID and pee.P_ID != pe.P_ID))
    end AS [partnership_partners_perc_Last]
FROM [dbo].[tbl_Partner_Entity] PE
where [E_ID] = @E_ID

My aggregate is not affected by ROLLUP

I have a query similar to the following:
SELECT CASE WHEN (GROUPING(Name) = 1) THEN 'All' ELSE Name END AS Name,
CASE WHEN (GROUPING(Type) = 1) THEN 'All' ELSE Type END AS Type,
sum(quantity) AS [Quantity],
CAST(sum(quantity) * (SELECT QuantityMultiplier FROM QuantityMultipliers WHERE a = t.b) AS DECIMAL(18,2)) AS [Multiplied Quantity]
FROM #Table t
GROUP BY Name, Type WITH ROLLUP
I'm trying to return a list of Names, Types, a summed Quantity and a summed quantity multiplied by an arbitrary number. All fine so far. I also need to return a sub-total row per Name and per Type, such as the following
Name | Type | Quantity | Multiplied Quantity
-----+------+----------+--------------------
a    | 1    | 2        | 4
a    | 2    | 3        | 3
a    | ALL  | 5        | 7
b    | 1    | 6        | 12
b    | 2    | 1        | 1
b    | ALL  | 7        | 13
ALL  | ALL  | 24       | 40
The first 3 columns are fine. I'm getting null values in the rollup rows for the multiplied quantity though. The only reason I can think this is happening is because SQL doesn't recognize the last column as an aggregate now that I've multiplied it by something.
Can I somehow work around this without things getting too convoluted?
I will be falling back onto temporary tables if this can't be done.
In your sub-query to acquire the multiplier, you have WHERE a=b. Are either a or b from the tables in your main query?
If these values are static (nothing to do with the main query), it looks like it should be fine...
If the a or b values are the Name or Type field, they can be NULL for the rollup records. If so, you can change to something similar to...
CAST(sum(quantity * (<multiplier_query>)) AS DECIMAL(18,2))
If a or b are other fields from your main query, you'd be getting multiple records back, not just a single multiplier. You could change to something like...
CAST(sum(quantity) * (SELECT MAX(multiplier) FROM ...) AS DECIMAL(18,2))
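To illustrate the NULL case, here is a small Python sketch with None standing in for the NULL Type of a rollup row. The multiplier lookup keyed by Type is an assumption inferred from the sample output in the question (Type 1 doubles, Type 2 keeps the quantity); the real QuantityMultipliers table may be keyed differently.

```python
# (Name, Type, quantity) detail rows from the question's sample output
rows = [("a", 1, 2), ("a", 2, 3), ("b", 1, 6), ("b", 2, 1)]
multiplier_by_type = {1: 2, 2: 1}  # hypothetical lookup, inferred from the sample

def subtotal_outside(name):
    """sum(quantity) * multiplier: the rollup row has no Type, so no multiplier is found."""
    quantity = sum(q for n, _, q in rows if n == name)
    multiplier = multiplier_by_type.get(None)  # Type is NULL on the subtotal row
    return None if multiplier is None else quantity * multiplier

def subtotal_inside(name):
    """sum(quantity * multiplier): every detail row is multiplied before summing."""
    return sum(q * multiplier_by_type[t] for n, t, q in rows if n == name)

print(subtotal_outside("a"), subtotal_inside("a"), subtotal_inside("b"))
```

Multiplying inside the sum gives the subtotals the question expects, while multiplying outside has nothing to look up on the rollup row.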