Strange result of floating-point operation


Problems like this drive me crazy. Here's the relevant piece of code:
Dim RES As New Size(Math.Floor(Math.Round(mPageSize.Width - mMargins.Left - mMargins.Right - mLabelSize.Width, 4) / (mLabelSize.Width + mSpacing.Width) + 1),
Math.Floor((mPageSize.Height - mMargins.Top - mMargins.Bottom - mLabelSize.Height) / (mLabelSize.Height + mSpacing.Height)) + 1)
Values of the variables (all are of Single type):
mPageSize.Width = 8.5
mMargins.Left = 0.18
mMargins.Right = 0.18
mLabelSize.Width = 4.0
mSpacing.Width = 0.14
For God-knows-what reason, RES evaluates to {Width=1,Height=5} instead of {Width=2,Height=5}. I have evaluated the expressions on the right-hand side individually and as a whole, and they correctly evaluate to {2,5}, but RES never gets the correct value. I wonder what I am missing here.
EDIT
I have simplified the problem further. The following code will produce 2.0 if you QuickWatch the RHS, but the variable on the LHS will get 1.0 after you execute this line:
Dim X = Math.Floor(Math.Round(mPageSize.Width - mMargins.Left - mMargins.Right - mLabelSize.Width, 4) / (mLabelSize.Width + mSpacing.Width) + 1)
Time for MS to check it out?
EDIT 2
More info. The following gives correct results:
Dim Temp = mPageSize.Width - mMargins.Left - mMargins.Right - mLabelSize.Width
Dim X = Math.Floor(Temp / CDec(mLabelSize.Width + mSpacing.Width)) + 1

The problem is that the following expression evaluates to a value just below 1:
Math.Round(mPageSize.Width - mMargins.Left - mMargins.Right - mLabelSize.Width, 4) / (mLabelSize.Width + mSpacing.Width)
= 0.99999999985602739 (Double)
But what's the reason for that? The truth is that I don't know exactly. The MSDN does not offer enough information about the implementation of / but here's my guess:
Math.Round returns a Double with value 4.14. The right-hand side of the division is a Single. So you're dividing a Double by a Single. This results in a Double (see MSDN). So far, so good. The MSDN states that all integral data types are widened to Double before the division. Although Single is not an integral data type, this is probably what happens. And here is the problem. The widening does not seem to be performed on the result of the addition, but on its operands.
If you write
Dim sum = (mLabelSize.Width + mSpacing.Width) 'will be 4.14 Single
Math.Round(mPageSize.Width - mMargins.Left - mMargins.Right - mLabelSize.Width, 4) / sum
= 1 (Double)
Here sum is converted to double (resulting in 4.14) and everything is fine. But, if we convert both operands to double, then the conversion of 0.14 introduces some floating point error:
Dim dblLabelSizeWidth As Double = mLabelSize.Width ' will be 4.0
Dim dblSpacing As Double = mSpacing.Width ' will be 0.14000000059604645
The sum is slightly bigger than 4.14, resulting in a quotient slightly smaller than 1.
So the reason is that the conversion to double is not performed on the division's operand, but on the operand's operands, which introduces floating point errors.
You could overcome this problem by adding a small epsilon to the quotient before applying Math.Floor. Alternatively, you might consider using a more precise data type such as Decimal. But at some point, there will also be rounding errors with Decimal.
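The widening effect described in this answer can be reproduced outside VB.NET. The following is a minimal Python sketch, not the asker's code: `to_single` is a hypothetical helper that rounds a value to IEEE 754 single precision with `struct` and returns it as a double, which imitates converting a Single to a Double. It assumes, as the answer guesses, that the two operands of the sum are widened individually.

```python
import math
import struct


def to_single(x: float) -> float:
    """Round x to IEEE 754 single precision and return it widened to double."""
    return struct.unpack("f", struct.pack("f", x))[0]


label_width = to_single(4.0)   # exactly representable, stays 4.0
spacing = to_single(0.14)      # widened Single: slightly above 0.14
numerator = 4.14               # stands in for the Double returned by Math.Round(..., 4)

# Case 1: each operand is widened first, then the sum is formed in double precision.
quotient_widened_operands = numerator / (label_width + spacing)

# Case 2: the sum is rounded to single precision first, then widened ("Dim sum" variant).
sum_as_single = to_single(label_width + spacing)
quotient_single_sum = numerator / sum_as_single

print(quotient_widened_operands < 1.0)            # the quotient falls just below 1
print(math.floor(quotient_widened_operands + 1))  # so the floor drops to 1
print(math.floor(quotient_single_sum + 1))        # the other variant yields 2
```

Whether the VB.NET compiler and runtime really evaluate the expression this way is the answer's guess; the sketch only shows that such an evaluation order is enough to produce the observed result.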

This is due to rounding error: you're taking the floor of a value that is very close to 2, but is less than 2 (while the mathematical value is 2). You should do all your computations with integers, or take rounding errors into account before using operations like floor (not always possible if you want the true value).
EDIT: Since vb.net has a Decimal datatype, you can also use it instead of integers. It may help in some cases like here: the base conversions for 0.18 and 0.14 (not representable exactly in binary) are avoided and the additions and subtractions will be performed exactly here, so that the operands of the division will be computed exactly. Thus, if the result of the division is an integer, you'll get it exactly (instead of possibly a value just below, like what you got with binary). But make sure that your inputs are already in decimal.
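As a rough illustration of the Decimal suggestion, here is a Python sketch that uses the `decimal` module as a stand-in for VB.NET's Decimal type. The variable names are made up for the example; the values are the ones given in the question. It also shows why the inputs must already be decimal: building a Decimal from a binary float carries the binary error along.

```python
import math
from decimal import Decimal

# Values from the question, entered as decimal strings (no binary conversion).
page_width = Decimal("8.5")
margin_left = Decimal("0.18")
margin_right = Decimal("0.18")
label_width = Decimal("4.0")
spacing_width = Decimal("0.14")

available = page_width - margin_left - margin_right - label_width  # exactly 4.14
quotient = available / (label_width + spacing_width)               # exactly 1
labels_across = math.floor(quotient) + 1

print(available, quotient, labels_across)

# A Decimal built from a binary float inherits the float's representation error.
print(Decimal(0.14) == Decimal("0.14"))  # False
```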


VBA: difference between Variant/Double and Double

I am using Excel 2013. In the following code fragment, VBA calculates 40 for damage:
Dim attack As Variant, defense As Variant, damage As Long
attack = 152 * 0.784637
defense = 133 * 0.784637
damage = Int(0.5 * attack / defense * 70)
If the data types are changed to Double, VBA calculates 39 for damage:
Dim attack As Double, defense As Double, damage As Long
attack = 152 * 0.784637
defense = 133 * 0.784637
damage = Int(0.5 * attack / defense * 70)
In the debugger, the Variant/Double and Double values appear the same. However, the Variant/Double seems to have more precision.
Can anyone explain this behavior?

tldr; If you need more precision than a Double, don't use a Double.
The answer lies in the timing of when the result is coerced into a Double from a Variant. A Double is an IEEE 754 floating-point number, and per the IEEE specification reversibility is guaranteed to 15 significant digits. Your value flirts with that limit:
0.5 * (152 * .784637) / (133 * .784637) * 70 = 39.99999999999997 (16 sig. digits)
VBA will round anything beyond 15 significant digits when it is coerced into a double:
Debug.Print CDbl("39.99999999999997") '<--Prints 40
In fact, you can watch this behavior in the VBE. Type or copy the following code:
Dim x As Double
x = 39.99999999999997
The VBE "auto-corrects" the literal value by casting it to a Double, which gives you:
Dim x As Double
x = 40#
OK, so by now you're probably asking what that has to do with the difference between the 2 expressions. VBA evaluates mathematical expressions using the "highest order" variable type that it can.
In your second Sub where you have all of the variable declared as Double on the right hand side, the operation is evaluated with the high order of Double, then the result is implicitly cast to a Variant before being passed as the parameter for Int().
In your first Sub where you have Variant declarations, the implicit cast to Variant isn't performed before passing to Int - the highest order in the mathematical expression is Variant, so no implicit cast is performed before passing the result to Int() - the Variant still contains the raw IEEE 754 float.
Per the documentation of Int:
Both Int and Fix remove the fractional part of number and return the
resulting integer value.
No rounding is performed. The top code calls Int(39.99999999999997). The bottom code calls Int(40). The "answer" depends on what level of floating point error you want to round at. If 15 works, then 40 is the "correct" answer. If you want to floor anything up to 16 or more significant digits, then 39 is the "correct" answer. The solution is to use Round and specify the level of precision you're looking for explicitly. For example, if you care about the full 15 digits:
Int(Round((0.5 * attack / defense * 70), 15))
Keep in mind that the highest precision you use anywhere in the inputs is 6 digits, so that would be a logical rounding cut-off:
Int(Round((0.5 * attack / defense * 70), 6))
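The difference between flooring directly and rounding first can be sketched in Python, with `math.floor` standing in for VBA's Int and Python's `round` standing in for VBA's Round. The literal below is simply the value quoted earlier in this answer; the sketch does not attempt to reproduce how VBA arrives at it.

```python
import math

value = 39.99999999999997  # the intermediate result quoted in the answer

floored_directly = math.floor(value)             # fractional part is discarded
floored_after_round = math.floor(round(value, 6))  # rounded to 6 decimals first

print(floored_directly)       # 39
print(floored_after_round)    # 40
print(format(value, ".15g"))  # shown with 15 significant digits: 40
```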

If you get rid of the Int() function on both lines where damage is calculated, both end up being the same. You shouldn't be using Int, as this is what produces the errant behaviour; you should be using CLng, since you are converting to a Long variable, or CInt if damage were an Integer.
Int and CInt behave differently. Int always rounds down to the next lower whole number, whereas CInt rounds up or down using Banker's Rounding. You'll typically see this behaviour for numbers that have a fractional part of 0.5.
As for the Variant and Double differences: if you pass attack and defense to TypeName and show the result in a MsgBox for the 1st code block, you'll find that both have been converted to Double once values were assigned, despite having been declared as Variant.
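The two rounding behaviours contrasted above can be seen side by side in a short Python sketch. It is an analogy, not VBA: `math.floor` plays the role of Int, and Python's built-in `round`, which also rounds ties to the nearest even integer, plays the role of CInt.

```python
import math

samples = [0.5, 1.5, 2.5, -2.5]

floored = [math.floor(x) for x in samples]  # always toward negative infinity
half_even = [round(x) for x in samples]     # ties go to the even neighbour

print(floored)    # [0, 1, 2, -3]
print(half_even)  # [0, 2, 2, -2]
```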

How do you multiply two fixed point numbers?

I am currently trying to figure out how to multiply two numbers in fixed point representation.
Say my number representation is as follows:
[SIGN][2^0].[2^-1][2^-2]..[2^-14]
In my case, the number 10.01000000000000 = -0.25.
How would I for example do 0.25x0.25 or -0.25x0.25 etc?
Hope you can help!

You should use 2's complement representation instead of a separate sign bit. It's much easier to do maths on that, as no special handling is required. The range is also improved because there's no wasted bit pattern for negative 0. To multiply, just do normal fixed-point multiplication. The normal Q2.14 format stores the value x/2^14 for the bit pattern x, therefore if we have A and B then
(A/2^14) x (B/2^14) = (A x B / 2^14) / 2^14
So you just need to multiply A and B directly, then divide the product by 2^14 to get the result back into the form x/2^14, like this
AxB = ((int32_t)A*B) >> 14;
A rounding step is needed to get the nearest value. You can find the way to do it in the "Math operations" section of Q number format. The simplest way to round to nearest is to add back the bit that was last shifted out (i.e. the first fractional bit), like this
AxB = (int32_t)A*B;
AxB = (AxB >> 14) + ((AxB >> 13) & 1);
You might also want to read these
Fixed-point arithmetic.
Emulated Fixed Point Division/Multiplication
Fixed point math in c#?
With 2 bits you can represent the integer range of [-2, 1]. So using Q2.14 format, -0.25 would be stored as 11.11000000000000. Using 1 sign bit you can only represent -1, 0, 1, and it makes calculations more complex because you need to split the sign bit then combine it back at the end.
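For readers who want to try the multiply-then-shift rule, here is a small Python sketch of it. The helper names (`to_fixed`, `to_float`, `fixed_mul`) are invented for this example. Python integers do not overflow, so the widening cast to `int32_t` that the C version needs has no counterpart here, and `>>` on negative Python integers is an arithmetic shift.

```python
FRACTION_BITS = 14
SCALE = 1 << FRACTION_BITS  # 2**14


def to_fixed(x: float) -> int:
    """Encode a real number as a two's-complement style integer scaled by 2**14."""
    return round(x * SCALE)


def to_float(q: int) -> float:
    """Decode a fixed-point integer back to a float."""
    return q / SCALE


def fixed_mul(a: int, b: int) -> int:
    """Multiply two fixed-point values and round to the nearest representable result."""
    product = a * b  # carries 28 fractional bits
    # Drop 14 fractional bits, then add back the last bit shifted out.
    return (product >> FRACTION_BITS) + ((product >> (FRACTION_BITS - 1)) & 1)


quarter = to_fixed(0.25)
print(to_float(fixed_mul(quarter, quarter)))   # 0.0625
print(to_float(fixed_mul(-quarter, quarter)))  # -0.0625
```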

Multiply into a larger sized variable, and then right shift by the number of bits of fixed point precision.

Here's a simple example in C:
int a = 0.25 * (1 << 16);
int b = -0.25 * (1 << 16);
int c = (a * b) >> 16;
printf("%.2f * %.2f = %.2f\n", a / 65536.0, b / 65536.0 , c / 65536.0);
You basically multiply everything by a constant to bring the fractional parts up into the integer range, then multiply the two factors, then (optionally) divide by one of the constants to return the product to the standard range for use in future calculations. It's like multiplying prices expressed in fractional dollars by 100 and then working in cents (i.e. $1.95 * 100 cents/dollar = 195 cents).
Be careful not to overflow the range of the variable you are multiplying into. Your constant might need to be smaller to avoid overflow, like using 1 << 8 instead of 1 << 16 in the example above.

ceil() not working as I expected

I'm trying to divide one number by another and then immediately ceil() the result. These would normally be variables, but for simplicity let's stick with constants.
If I try any of the following, I get 3 when I want to get 4.
double num = ceil(25/8); // 3
float num = ceil(25/8); // 3
int num = ceil(25/8); // 3
I've read through a few threads on here (tried the nextafter() suggestion from this thread) as well as other sites and I don't understand what's going on. I've checked and my variables are the numbers I expect them to be and I've in fact tried the above, using constants, and am still getting unexpected results.
Thanks in advance for the help. I'm sure it's something simple that I'm missing but I'm at a loss at this point.

This is because you are doing integer arithmetic. The value is already 3 before ceil is called, because 25 and 8 are both integers: 25/8 is calculated first using integer arithmetic, evaluating to 3.
Try:
double value = ceil(25.0/8);
This will ensure the compiler treats the constant 25.0 as a floating point number.
You can also use an explicit cast to achieve the same result:
double value = ceil(((double)25)/8);

This is because the expression is evaluated before being passed as an argument to the ceil function. You need to cast one of the operands to double first, so that the division yields a floating-point value, which is then passed to ceil.
double num = ceil((double)25/8);
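The same pitfall can be shown in Python, where `//` is integer (floor) division and `/` is floating-point division. For positive operands, `//` gives the same result as C's `25/8` on ints. The last line is an integer-only alternative that is not from the answers above and only holds for positive integers.

```python
import math

a, b = 25, 8

ceil_of_int_division = math.ceil(a // b)    # division already truncated to 3
ceil_of_float_division = math.ceil(a / b)   # 3.125 is rounded up to 4
ceil_integer_only = (a + b - 1) // b        # assumes a >= 0 and b > 0

print(ceil_of_int_division, ceil_of_float_division, ceil_integer_only)  # 3 4 4
```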

In VB, why is (1 = 1) False?

I just came across this piece of code:
Dim d As Double
For i = 1 To 10
d = d + 0.1
Next
MsgBox(d)
MsgBox(d = 1)
MsgBox(1 - d)
Can anyone explain to me the reason for that? Why is d set to 1?

Comparing a floating point value with the integer 1 is not the problem in itself: the integer is converted to Double before the comparison, and 1 is exactly representable as a Double.
The result of adding 0.1 ten times as a floating point type may well be a value that is close to 1, but not exactly 1, so d = 1 is False even though d is displayed as 1.
When comparing floating point values, you need to use a minimum value by which the values can differ and still be considered the same value (this value is normally known as the epsilon). This value depends on the application.
I suggest reading What Every Computer Scientist Should Know About Floating-Point Arithmetic for an in-depth discussion.
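An epsilon comparison might look like the following Python sketch. The tolerance `EPSILON = 1e-9` is an arbitrary value chosen for the example; as noted above, the right value depends on the application.

```python
import math

EPSILON = 1e-9  # hypothetical tolerance, pick one that suits the application

d = 0.0
for _ in range(10):
    d += 0.1

print(d == 1.0)                 # False: d is slightly below 1
print(abs(d - 1.0) < EPSILON)   # True: equal within the tolerance
print(math.isclose(d, 1.0))     # True: the standard library's relative comparison
```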

.1 (1/10th) is a repeating fraction when converted to binary:
.0001100110011001100110011001100110011.....
It would be like trying to show 1/3 as a decimal: you just can't do it accurately.
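One way to see that the stored value is a truncated binary fraction is to ask for it as an exact ratio. This Python sketch uses the standard `fractions` module; a Python float is an IEEE 754 double, the same format as a VB Double.

```python
from fractions import Fraction

stored = Fraction(0.1)     # the exact value of the double nearest to 0.1
intended = Fraction(1, 10)

print(stored.denominator == 2 ** 55)  # True: the denominator is a power of two
print(stored == intended)             # False: so it cannot equal 1/10
print(stored > intended)              # True: the stored value is slightly too large
```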

This is because a Double is a binary floating point type, so it can only approximate most decimal fractions such as 0.1 rather than store them exactly. When you need an exact decimal value, use a Decimal instead.
Contrast with:
Dim d As Decimal
For i = 1 To 10
d = d + 0.1D
Next
MsgBox(d)
MsgBox(d = 1)
MsgBox(1 - d)
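The same contrast can be sketched in Python, with the `decimal` module standing in for VB's Decimal type. Note that the increment is written as a decimal string, so it never passes through a binary float.

```python
from decimal import Decimal

d = Decimal("0")
for _ in range(10):
    d += Decimal("0.1")  # exact decimal increment

print(d)        # 1.0
print(d == 1)   # True
print(1 - d)    # 0.0
```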

Syntax for rounding up in VB.NET

What is the syntax to round up a decimal leaving two digits after the decimal point?
Example: 2.566666 -> 2.57

If you want regular rounding, you can just use the Math.Round method. If you specifically want to round upwards, use the Math.Ceiling method:
Dim d As Decimal = 2.566666
Dim r As Decimal = Math.Ceiling(d * 100D) / 100D

Here is how I do it:
Private Function RoundUp(value As Double, decimals As Integer) As Double
Return Math.Ceiling(value * (10 ^ decimals)) / (10 ^ decimals)
End Function

Math.Round is what you're looking for. If you're new to rounding in .NET, you should also look up the difference between AwayFromZero and ToEven rounding. The default of ToEven can sometimes take people by surprise.
Dim result = Math.Round(2.56666666, 2)

You can use System.Math, specifically Math.Round(), like this:
Math.Round(2.566666, 2)

Math.Round(), as suggested by others, is probably what you want. But the text of your question specifically asked how to round up. If you always need to round up, regardless of the actual value (i.e. 2.561111 would still go to 2.57), you can do this:
Math.Ceiling(d * 100)/100D

The basic function for rounding up is Math.Ceiling(d), but the asker specifically wanted to round up after the second decimal place. This would be Math.Ceiling(d * 100) / 100. For example, it multiplies 46.5671 by 100 to get 4656.71, then rounds up to get 4657, then divides by 100 to shift the decimal point back 2 places to get 46.57.
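The multiply, ceiling, divide steps can be written out as a small Python sketch. `round_up` is a hypothetical helper name, and the `decimal` module is used so that the inputs are exact decimal values, similar in spirit to VB.NET's Decimal.

```python
import math
from decimal import Decimal


def round_up(value: Decimal, decimals: int = 2) -> Decimal:
    """Round value up (toward positive infinity) to the given number of decimals."""
    factor = Decimal(10) ** decimals           # 100 for two decimal places
    shifted = value * factor                   # 46.5671 -> 4656.71
    return Decimal(math.ceil(shifted)) / factor  # 4657 -> 46.57


print(round_up(Decimal("46.5671")))   # 46.57
print(round_up(Decimal("2.566666")))  # 2.57
print(round_up(Decimal("2.561111")))  # 2.57
```

A value that already has two decimals, such as 2.56, is returned unchanged in value.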

I used this way:
Math.Round(d + 0.49D, 2)

Math.Ceiling((14.512555) * 100) / 100
.NET will give you 14.52. So you can use the above syntax to round a number up to 2 decimal places.

I do not understand why people are recommending the incorrect code below:
Dim r As Decimal = Math.Ceiling(d * 100D) / 100D
The correct code to round up should look like this:
Dim r As Double = Math.Ceiling(d)
Math.Ceiling works with data type Double (not Decimal).
The * 100D / 100D is incorrect and will break your results for larger numbers.
Math.Ceiling documentation is found here: http://msdn.microsoft.com/en-us/library/zx4t0t48.aspx
