Why is my very small number not being stored precisely?

In an answer on StackOverflow en Español, I showed that Perl 6 avoids the calculation errors of many other languages because it keeps track of numerators and denominators; that is, decimal numbers are actually represented as rationals. However, it does make a small error with very small numbers:
> 0.000000000000000000071.nude.perl
(71, 1000000000000000000000)
> 0.0000000000000000000071.nude.perl
(71, 10000000000000000000000)
> 0.00000000000000000000071.nude.perl
(71, 99999999999999991611392)
Is this something that will be fixed in future versions?
I get the same answers using perl6/rakudo-star-2015.09 and perl6/rakudo-star-2015.11

Denominators are supposed to be limited to 64 bits - you need a FatRat to go beyond that.
However, said limit does not appear to be enforced in current Rakudo: if you construct the number manually via Rat.new(71, 10**23), it will happily oblige.
My guess would be that you have uncovered a bug in the handling of rational literals, but it might only trigger in code that is not future-proof anyway.
Edit: It is possible to use angle brackets to get an allomorphic value, and this produces the correct value. In fact, regular rational literals are also specced to fall back to RatStr on overflow.
However, this fallback mechanism does not appear to be implemented in Rakudo.
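A hint at where that odd denominator comes from: 99999999999999991611392 is exactly 10**23 rounded to the nearest IEEE 754 double, which suggests the literal's denominator passes through floating point somewhere. A quick cross-check (in Python, purely to verify the numbers - this says nothing about Rakudo internals):
from fractions import Fraction

# 10**23 is not exactly representable as a 64-bit double; the nearest
# double is precisely the denominator seen in the question.
print(int(float(10**23)))    # 99999999999999991611392

# An exact rational type constructed directly has no such problem,
# matching the observation that Rat.new(71, 10**23) works fine.
print(Fraction(71, 10**23))  # 71/100000000000000000000000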

How to access net displacements in pyiron

Using pyiron, I want to calculate the mean square displacement of the ions in my system. How do I see the total displacement (i.e. not folded back by periodic boundary conditions) without dumping very frequently and checking when an atom passes over the boundary and gets wrapped?
Try to compare job['output/generic/unwrapped_positions'][-1] and job.structure.positions+job.output.total_displacements[-1]. If they deliver the same values, it's definitely fine both ways. If not, you can post the relevant lines in your notebook here.
I'd like to add a few comments to Jan's answer:
While job['output/generic/unwrapped_positions'] returns the unwrapped positions parsed from the output files, job.output.total_displacements returns the displacement of atoms calculated from each pair of consecutive snapshots. So if an atom moves more than half the box length in any direction, job.output.total_displacements will give wrong coordinates. Therefore, job['output/generic/unwrapped_positions'] is generally more trustworthy, but it is not available in all the codes (since some codes simply do not provide an output for unwrapped positions).
Moreover, if an interactive job is used, it is possible that job.structure.positions does not return the initial positions, i.e. job.structure.positions+job.output.total_displacements won't be initial positions + displacements.
So, in short, my answer to your question would rather be: "Use job['output/generic/unwrapped_positions'] and, if it's not available, use job.structure.positions+job.output.total_displacements, but be aware of the potential problems you might run into."
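To make that concrete, here is a minimal sketch of an MSD calculation along those lines. The helper name mean_square_displacement is made up here; it assumes a finished MD job object job, arrays of shape (n_steps, n_atoms, 3), and NumPy. Whether a missing dataset comes back as None or raises depends on the pyiron version, so treat this as a sketch rather than tested code:
import numpy as np

def mean_square_displacement(job):
    # Prefer the unwrapped positions parsed from the code's own output.
    unwrapped = job['output/generic/unwrapped_positions']
    if unwrapped is None:
        # Fallback: initial positions plus per-step displacements. Beware of
        # atoms moving more than half a box length between snapshots, and of
        # interactive jobs, as discussed above.
        unwrapped = job.structure.positions + job.output.total_displacements
    disp = unwrapped - unwrapped[0]                # displacement from the first frame
    return (disp ** 2).sum(axis=-1).mean(axis=-1)  # average over atoms, per step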

Why is the condition in this if statement written as a multiplication instead of the value of the multiplication?

I was reviewing some code from a library for Arduino and saw the following if statement in the main loop:
draw_state++;
if ( draw_state >= 14*8 )
    draw_state = 0;
draw_state is a uint8_t.
Why is 14*8 written here instead of 112? I initially thought this was done to save space, as 14 and 8 can both be represented by a single byte, but then so can 112.
I can't see why a compiler wouldn't optimize this to 112, since otherwise a multiplication would have to be done on every iteration instead of simply loading a constant. It looks to me as if there were some kind of memory/processing trade-off here.
Does anyone have a suggestion as to why this was done?
Note: I had a hard time coming up with a clear title, so suggestions are welcome.
Probably to explicitly show where the number 112 comes from. For example, it could be the number of bits in 14 bytes (but of course I don't know the context of the code, so I could be wrong). It is then more obvious to a human reader where the value came from than writing just 112.
And as you pointed out, the compiler will probably optimize it, so there will be no multiplication in the machine code.
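Constant folding of this kind is essentially universal, so the worry about a per-iteration multiplication is unfounded: optimizing C compilers such as avr-gcc evaluate 14*8 at compile time, and even CPython's bytecode compiler does the same. A quick way to watch folding happen (Python here, purely as an illustration of the general principle, not Arduino code):
import dis

# The disassembly shows the constant 112 loaded directly; no multiply
# instruction remains, because the compiler folded 14 * 8 away.
dis.dis(compile("14 * 8", "<demo>", "eval"))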

How does VB.NET 2008 round off integer numbers? [duplicate]

According to the documentation, the decimal.Round method uses a round-to-even algorithm, which is not what most applications expect. So I always end up writing a custom function to do the more natural round-half-up algorithm:
public static decimal RoundHalfUp(this decimal d, int decimals)
{
    if (decimals < 0)
    {
        throw new ArgumentException("The decimals must be non-negative",
            "decimals");
    }
    decimal multiplier = (decimal)Math.Pow(10, decimals);
    decimal number = d * multiplier;
    // Classic half-up trick for non-negative values: add 0.5 to anything
    // with a fractional part, then truncate. (Calling decimal.Round here
    // instead would wrongly turn e.g. 2.4 into 3 once the 0.5 has been
    // added.)
    if (decimal.Truncate(number) < number)
    {
        number += 0.5m;
    }
    return decimal.Truncate(number) / multiplier;
}
Does anybody know the reason behind this framework design decision?
Is there any built-in implementation of the round-half-up algorithm into the framework? Or maybe some unmanaged Windows API?
It could be misleading for beginners who simply write decimal.Round(2.5m, 0) expecting 3 as a result but getting 2 instead.
The other answers with reasons why the Banker's algorithm (aka round half to even) is a good choice are quite correct. It does not suffer from negative or positive bias as much as the round half away from zero method over most reasonable distributions.
But the question was why .NET uses banker's rounding as the default - and the answer is that Microsoft followed the IEEE 754 standard. This is also mentioned in MSDN for Math.Round under Remarks.
Also note that .NET supports the alternative method specified by IEEE by providing the MidpointRounding enumeration. They could of course have provided more alternatives for resolving ties, but they chose to just fulfill the IEEE standard.
Probably because it's a better algorithm. Over the course of many roundings, the .5s average out, rounding up half the time and down half the time. This gives better estimates of the actual results if you are, for instance, adding a bunch of rounded numbers. I would say that even though it isn't what some may expect, it's probably the more correct thing to do.
While I cannot answer the question of "Why did Microsoft's designers choose this as the default?", I just want to point out that an extra function is unnecessary.
Math.Round allows you to specify a MidpointRounding:
ToEven - When a number is halfway between two others, it is rounded toward the nearest even number.
AwayFromZero - When a number is halfway between two others, it is rounded toward the nearest number that is away from zero.
Decimals are mostly used for money, and banker's rounding is common when working with money. Or you could say:
It is mostly bankers that need the decimal type; therefore it does "banker's rounding".
Banker's rounding has the advantage that on average you will get the same result whether you:
round a set of "invoice lines" before adding them up,
or add them up and then round the total.
Rounding before adding up saved a lot of work in the days before computers.
(In the UK, when we went decimal, banks would not deal with half pence, but for many years there was still a half-pence coin and shops often had prices ending in half pence - so lots of rounding.)
Use another overload of the Round function like this:
decimal.Round(2.5m, 0, MidpointRounding.AwayFromZero)
It will output 3. And if you use
decimal.Round(2.5m, 0, MidpointRounding.ToEven)
you will get banker's rounding.
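Incidentally, this ties-to-even default is not a .NET quirk; other IEEE 754 environments behave the same way. For comparison, a short Python 3 sketch (illustrative only, not .NET code) showing the same default and an explicit half-up alternative:
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

print(round(2.5), round(3.5))  # 2 4 - the built-in round() is also ties-to-even
print(Decimal("2.5").quantize(Decimal("1"), rounding=ROUND_HALF_EVEN))  # 2
print(Decimal("2.5").quantize(Decimal("1"), rounding=ROUND_HALF_UP))    # 3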

Where does the limitation of 10^15 in D.J. Bernstein's 'primegen' program come from?

At http://cr.yp.to/primegen.html you can find the sources of a program that uses Atkin's sieve to generate primes. As the author says it may take a few months for him to answer an e-mail (I understand that - he surely is a busy man!), I'm posting this question here.
The page states that 'primegen can generate primes up to 1000000000000000'. I am trying to understand why that is. There is of course a limit at 2^64 ~ 2 * 10^19 (the size of a long unsigned int), because this is how the numbers are represented. I know for sure that if there were a huge prime gap (> 2^31), the printing of numbers would fail. However, I think there is no such prime gap in this range.
Either the author was conservative about the bound (and really it is around 10^19), or there is a place in the source code where an arithmetic operation can overflow, or something like that.
The funny thing is that you actually CAN run it for numbers > 10^15:
./primes 10000000000000000 10000000000000100
10000000000000061
10000000000000069
10000000000000079
10000000000000099
and if you believe Wolfram Alpha, it is correct.
Some facts I have "reverse-engineered":
numbers are sieved in batches of 1,920 * PRIMEGEN_WORDS = 3,932,160 numbers (see the primegen_fill function in primegen_next.c),
PRIMEGEN_WORDS controls how big a single sieving pass is - you can adjust it in primegen_impl.h to fit your CPU cache,
the implementation of the sieve itself is in the primegen.c file - I assume it is correct; what you get is a bitmask of primes in pg->buf (see the primegen_fill function),
the bitmask is analyzed and the primes are stored in the pg->p array.
I see no point where the overflow may happen.
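For scale, a bit of arithmetic on those reverse-engineered numbers (assuming the factor of 1,920 from primegen_fill, and a PRIMEGEN_WORDS value of 2048, which is what makes the quoted product 3,932,160):
batch = 1920 * 2048     # numbers per sieving batch
print(batch)            # 3932160
print(10**15 // batch)  # about 2.5e8 batches to cover the documented range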
I wish I was on my computer to look, but I suspect you would have different success if you started at 1 as your lower bound.
Just from the algorithm, I would conclude that the upper bound comes from 32-bit numbers.
The page mentions a Pentium III as the CPU, so my guess is that the code is quite old and does not use 64-bit arithmetic.
2^32 is approximately 4 * 10^9. The sieve of Atkin (which the algorithm uses) requires N^(1/2) bits (it uses a big bit field), which means that with memory on the order of 2^32 you can reach (conservatively) N ~ 10^15. As this is a rough, conservative upper bound (the system and other programs occupy memory and reserve address ranges for I/O, ...), the real upper bound might be higher.

Visual Studio expanding 1.1 to 1.1000000000000001

This is, at least for me, the most bizarre Visual Studio 2010 behaviour ever. I'm working on an MVC3 project, and I copied a line of code from another project (also VS2010; MVC1, if it matters) which looks like this:
target_height = height * 1.1
when I paste it into MVC3 project, it gets expanded to
target_height = height * 1.1000000000000001
Now, if I type 1.2, it's fine, nothing happens, but if I type 1.12 it is expanded to 1.1200000000000001.
Both target_height and height are integers. Why does one Visual Studio display 1.1 while the other expands it to 1.1000000000000001?
What is going on???
I think autocomplete went crazy and started normalizing floating-point constants into "allowed" values. As written in http://accessmvp.com/Strive4Peace/VBA/VBA_L1_02_Crystal.pdf , VB autocomplete really tries to offer only "things that apply specifically to that data type". int * double is understandably not truncated into int * int (automatic conversions only happen as needed), and what you see is the double representation of 1.1 or 1.12 (unit roundoff ~ 1.11e-16).
I think it would still need some further checking or verification to learn the exact conditions under which this happens, but as I am not using VB.NET or MVCx, this is not something I am willing to do.
The numeric literal 1.1 does not actually represent the quantity 11/10, but instead represents the quantity round[(2^52*11)/10]/(2^52), which is a tiny bit larger than 11/10. Although that value could be written out precisely as a decimal number with 52 significant figures, doing so would be about as useful as using an inch-denominated measuring tape to determine that something is 1 3/16" long and recording the measurement as 30.1625mm. If one wouldn't be able to distinguish a measurement that was longer or shorter by less than 1/64", the measurement would really be 30.1625mm +/- 0.396875mm, which is functionally the same as 30.2mm +/- 0.4mm.
The fact that Visual Studio would choose to represent the numeric quantity closest to 1.1 as 1.1000000000000001 is curious. On the one hand, the literal 1.1 would be a more concise representation of the same value. On the other hand, even if the aforementioned literal would be indistinguishable from 1.1, the more verbose representation is not without advantage. In some cases, it may be helpful to know whether a quantity is slightly larger or slightly smaller than what it "appears" to be. Even though the difference between the numeric literal 1.1 and the mathematical value 11/10 is numerically insignificant (multiplying the numeric literal by ten yields precisely 11), the difference between (1.1-1.0) and (1/10) is noticeable (multiplying the numeric expression by 10 yields a value greater than one).
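The quantities involved are easy to inspect in any IEEE 754 double environment. For instance (a Python sketch, illustrative only - the doubles are the same ones VB.NET uses):
from decimal import Decimal

print(Decimal(1.1))      # 1.100000000000000088817841970012523233890533447265625
print('%.17g' % 1.1)     # 1.1000000000000001 - the 17-digit form the IDE inserts
print(1.1 * 10 == 11.0)  # True: multiplying by ten hides the tiny excess
print((1.1 - 1.0) * 10)  # 1.0000000000000009: subtracting first exposes it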
1.1 and 1.12 evidently do not have an exact binary representation.
See this: https://stackoverflow.com/questions/634206/what-every-programmer-should-know-about