Why write 1,000,000,000 as 1000*1000*1000 in C? - objective-c

In code created by Apple, there is this line:
CMTimeMakeWithSeconds( newDurationSeconds, 1000*1000*1000 )
Is there any reason to express 1,000,000,000 as 1000*1000*1000?
Why not 1000^3 for that matter?

One reason to declare constants in a multiplicative way is to improve readability, while run-time performance is not affected.
It also indicates that the writer was thinking about the number in a multiplicative way.
Consider this:
double memoryBytes = 1024 * 1024 * 1024;
It's clearly better than:
double memoryBytes = 1073741824;
as the latter doesn't look, at first glance, like the third power of 1024.
As Amin Negm-Awad mentioned, the ^ operator is the binary XOR. Many languages lack a built-in, compile-time exponentiation operator, hence the multiplication.

There are reasons not to use 1000 * 1000 * 1000.
With 16-bit int, 1000 * 1000 overflows. So using 1000 * 1000 * 1000 reduces portability.
With 32-bit int, the following first line of code overflows.
long long Duration = 1000 * 1000 * 1000 * 1000; // overflow
long long Duration = 1000000000000; // no overflow, hard to read
Suggest that the leading value match the type of the destination, for readability, portability and correctness.
double Duration = 1000.0 * 1000 * 1000;
long long Duration = 1000LL * 1000 * 1000 * 1000;
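For concreteness, here is a minimal, hedged sketch of the suffix approach (the variable name and the printf call are just for illustration):

#include <stdio.h>

int main(void)
{
    /* 1000 * 1000 * 1000 * 1000 would be evaluated in int arithmetic and
       overflow (undefined behavior) before the assignment widens it. */
    long long duration = 1000LL * 1000 * 1000 * 1000; /* evaluated in long long */
    printf("%lld\n", duration); /* 1000000000000 */
    return 0;
}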
Also, code could simply use e notation for values that are exactly representable as a double. Of course this requires knowing whether a double can exactly represent the whole-number value - something of concern with values greater than 1e9. (See DBL_EPSILON and DBL_DIG.)
long Duration = 1000000000;
// vs.
long Duration = 1e9;

Why not 1000^3?
The result of 1000^3 is 1003; ^ is the bitwise XOR operator.
Even though it does not address the question itself, I'll add a clarification: x^y does not always evaluate to x+y as it happens to in the question's example. You have to XOR every bit. In the case of the example:
  1111101000₂ (1000₁₀)
^ 0000000011₂ (3₁₀)
= 1111101011₂ (1003₁₀)
But
  1111101001₂ (1001₁₀)
^ 0000000011₂ (3₁₀)
= 1111101010₂ (1002₁₀)

For readability.
Placing spaces or commas between the zeros (1 000 000 000 or 1,000,000,000) is either a syntax error or, worse, silently means something else (see the comma-expression demonstration further down), and having 1000000000 in the code makes it hard to see exactly how many zeros are there.
1000*1000*1000 makes it apparent that it's 10^9, because our eyes can process the chunks more easily. Also, there's no runtime cost, because the compiler will replace it with the constant 1000000000.
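To see that the compiler really does fold the product into a constant, a small sketch (assuming a C11 compiler, since it uses _Static_assert) lets the build itself check it:

/* A minimal sketch, assuming C11: the product is an integer constant
   expression, so the compiler folds it and can verify it at compile time. */
_Static_assert(1000 * 1000 * 1000 == 1000000000,
               "the multiplication folds to the literal constant");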

For readability. For comparison, Java supports _ in numbers to improve readability (first proposed by Stephen Colebourne as a reply to Derek Foster's PROPOSAL: Binary Literals for Project Coin/JSR 334). One would write 1_000_000_000 here.
In roughly chronological order, from oldest support to newest:
XPL: "(1)1111 1111" (apparently not for decimal values, only for bitstrings representing binary, quartal, octal or hexadecimal values)
PL/M: 1$000$000
Ada: 1_000_000_000
Perl: likewise
Ruby: likewise
Fantom (previously Fan): likewise
Java 7: likewise
Swift: (same?)
Python 3.6: likewise
C++14: 1'000'000'000
It's a feature that languages have realized they ought to support only relatively recently (and then there's Perl). As chux's excellent answer notes, 1000*1000... is a partial solution but opens the programmer up to bugs from overflowing the multiplication even if the final result is a large type.

It might simply be easier to read, and it evokes the 1,000,000,000 form.
From a technical standpoint there is no difference between the direct number and the multiplication; the compiler will generate the constant one-billion value either way.
Speaking of Objective-C, 1000^3 won't work because there is no such syntax for exponentiation (^ is XOR). Instead, the pow() function can be used, but then it is not optimal: it becomes a runtime function call rather than a compiler-generated constant.
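A minimal sketch of that contrast, with hypothetical names (and note that many compilers will in fact fold a pow() call with constant arguments anyway):

#include <math.h>

static const int folded = 1000 * 1000 * 1000; /* integer constant expression,
                                                 folded at compile time */
static double via_pow;

void example(void)
{
    via_pow = pow(10, 9); /* a library call returning double; not usable where
                             a constant expression is required */
}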

To illustrate the reasons consider the following test program:
$ cat comma-expr.c && gcc -o comma-expr comma-expr.c && ./comma-expr
#include <stdio.h>
#define BILLION1 (1,000,000,000)
#define BILLION2 (1000^3)
int main()
{
printf("%d, %d\n", BILLION1, BILLION2);
}
0, 1003
$

Another way to achieve a similar effect in C for decimal numbers is to use literal floating point notation -- so long as a double can represent the number you want without any loss of precision.
An IEEE 754 64-bit double can represent any integer with magnitude <= 2^53 without problem. Typically, long double (80 or 128 bits) can go even further than that. The conversions are done at compile time, so there is no runtime overhead, and with a good compiler you will likely get warnings if there is an unexpected loss of precision.
long lots_of_secs = 1e9;

Related

How to handle precision problems of floating point numbers?

I am using Firebird 3.0.4 (both on Windows and Linux), and I have the following procedure that clearly demonstrates my problem with floating point numbers and also demonstrates a possible workaround:
create or alter procedure test_float returns (res double precision,
res1 double precision,
res2 double precision)
as
declare variable z1 double precision;
declare variable z2 double precision;
declare variable z3 double precision;
begin
  z1 = 15;
  z2 = 1.1;
  z3 = 0.49;
  res = z1*z2*z3; /* one expects res to be 8.085, but internally, inside the procedure,
                     it is represented as 8.084999999999.
                     The procedure-internal representation is repaired when
                     res is sent to the output of the procedure, but the internal
                     representation (which is wrong) impacts the further calculations */
  res1 = round(res, 2);
  res2 = round(round(res, 8), 2);
  suspend;
end
One can see the result of the procedure with:
select proc.res, proc.res1, proc.res2
from test_float proc
The result is
RES RES1 RES2
8,085 8,08 8,09
But one would expect RES1 to be 8.09 as well.
One can clearly see that the internal representation of res contains 8.0849999 (e.g. one can assign res to an exception message and then raise that exception); it is repaired during output, but it leads to failed calculations when such a variable is used in further calculations.
RES2 demonstrates the repair: I can always apply ROUND(..., 8) to repair the internal representation. I am ready to go with this solution, but my question is: is it an acceptable workaround (when the outer ROUND uses strictly fewer than 5 decimal places), or is there a better one?
All my tests pass with this workaround, but the feeling is bad.
Of course, I know the minimum that every programmer should know about floats (there is an article about that), and I know that one should not use double for business calculations.
This is an inherent problem with calculating with floating point numbers, and is not specific to Firebird. The problem is that the calculation of 15 * 1.1 * 0.49 using double precision numbers is not exactly 8.085. In fact, if you would do 8.085 - RES, you'd get a value that is (approximately) 1.776356839400251e-015 (although likely your client will just present it as 0.00000000).
You would get similar results in different languages. For example, in Java
DecimalFormat df = new DecimalFormat("#.00");
df.format(15 * 1.1 * 0.49);
will also produce 8.08 for exactly the same reason.
Also, if you would change the order of operations, you would get a different result. For example using 15 * 0.49 * 1.1 would produce 8.085 and round to 8.09, so the actual results would match your expectations.
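A small C sketch of the same effect; the comments restate what the answer above describes for IEEE 754 doubles, and the exact printed digits depend on your platform's formatting:

#include <stdio.h>

int main(void)
{
    /* Same numbers as in the question; only the order of multiplication differs. */
    double r1 = 15 * 1.1 * 0.49;  /* per the answer: slightly below 8.085, rounds to 8.08 */
    double r2 = 15 * 0.49 * 1.1;  /* per the answer: effectively 8.085, rounds to 8.09 */
    printf("%.17f\n%.17f\n", r1, r2);
    return 0;
}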
Given that round itself also returns a double precision value, this isn't really a good way to handle it in your SQL code: because of how floating point numbers work, the value rounded to a higher number of decimals might still be slightly less than what you'd expect, so the double round may still fail for some numbers even if the presentation in your client 'looks' correct.
If you purely want this for presentation purposes, it might be better to do this in your frontend, but alternatively you could try tricks like adding a small value and casting to decimal, for example something like:
cast(RES + 1e-10 as decimal(18,2))
However this still has rounding issues, because it is impossible to distinguish between values that genuinely are 8.08499999999 (and should be rounded down to 8.08), and values where the result of the calculation just happens to be 8.08499999999 in floating point while it would be 8.085 in exact numerics (and therefore needs to be rounded up to 8.09).
In a similar vein, you could try double casting to decimal (e.g. cast(cast(res as decimal(18,3)) as decimal(18,2))), or casting to decimal and then rounding (e.g. round(cast(res as decimal(18,3)), 2)). This would be a bit more consistent than double rounding because the first cast converts to exact numerics, but again it has similar downsides as mentioned above.
Although you don't want to hear this answer, if you want exact numeric semantics, you shouldn't be using floating point types.

Why BigFloat.to_s is not precise enough?

I am not sure if this is a bug, but I've been playing with big and I can't understand why this code works this way:
https://carc.in/#/r/2w96
Code
require "big"
x = BigInt.new(1<<30) * (1<<30) * (1<<30)
puts "BigInt: #{x}"
x = BigFloat.new(1<<30) * (1<<30) * (1<<30)
puts "BigFloat: #{x}"
puts "BigInt from BigFloat: #{x.to_big_i}"
Output
BigInt: 1237940039285380274899124224
BigFloat: 1237940039285380274900000000
BigInt from BigFloat: 1237940039285380274899124224
First I thought that BigFloat requires changing BigFloat.default_precision to work with bigger numbers. But from this code it looks like it only matters when trying to output the #to_s value.
The same with the precision of BigFloat set to 1024 (https://carc.in/#/r/2w98):
Output
BigInt: 1237940039285380274899124224
BigFloat: 1237940039285380274899124224
BigInt from BigFloat: 1237940039285380274899124224
BigFloat.to_s uses LibGMP.mpf_get_str(nil, out expptr, 10, 0, self), where the GMP documentation says:
mpf_get_str (char *str, mp_exp_t *expptr, int base, size_t n_digits, const mpf_t op)
Convert op to a string of digits in base base. The base argument may vary from 2 to 62 or from -2 to -36. Up to n_digits digits will be generated. Trailing zeros are not returned. No more digits than can be accurately represented by op are ever generated. If n_digits is 0 then that accurate maximum number of digits are generated.
Thanks.
In GMP (it applies to all languages not just Crystal), integers (C mpz_t, Crystal BigInt) and floats (C mpf_t, Crystal BigFloat) have separate default precision.
Also, note that using an explicit precision is better than setting a default one, because the default precision might not be reentrant (it depends on a configure-time switch). Also, if someone reads only a part of your code, they may skip the part with setting the default precision and assume a wrong one. Although I do not know the Crystal binding well, I assume that such functionality is exposed somewhere.
The zero n_digits parameter passed to mpf_get_str means "generate as many digits as can be accurately represented by the operand's precision". The number of significant digits is proportional to, and close to, precision / log2(10). Floating point numbers have finite precision, and in this case it was not the mpf_get_str call that made the last digits zero - it was the internal representation that did not keep that data. It looks like your (default) precision is too small to store all the necessary digits.
To summarize, there are two solutions:
Set a global default precision. Although this approach will work, it will require either changing the default precision frequently or using a single precision for the whole program. Either way, the default-precision approach is a form of procrastination that is going to take its vengeance later.
Set a precision on variable basis. This is a better solution than the former. Although it requires more code (1-2 more lines per variable initialization), it is going to pay back later. For example, in a space object tracking system, the physics calculations have to be super-precise, but other systems could use lower precision numbers for speed and memory saving.
I am still unsure what made the conversion BigFloat --> BigInt yield the missing digits.
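Although I do not know the Crystal binding well enough to show it there, here is a hedged C sketch against the underlying GMP library itself, illustrating the per-variable precision of the second option (compile with -lgmp):

#include <gmp.h>
#include <stdio.h>

int main(void)
{
    mpf_t x;
    mpf_init2(x, 256);              /* 256 bits of precision for this variable only */
    mpf_set_ui(x, 1UL << 30);
    mpf_mul_ui(x, x, 1UL << 30);
    mpf_mul_ui(x, x, 1UL << 30);    /* (2^30)^3 = 2^90 */
    gmp_printf("%.0Ff\n", x);       /* 1237940039285380274899124224, no digits lost */
    mpf_clear(x);
    return 0;
}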

Precise Multiplication

First post!
I have a problem with a program that I'm writing for a numerical simulation; specifically, a problem with a multiplication. Basically, I am trying to calculate:
result1 = (a + b)*c
and this loops thousands of times. I need to expand this code to be
result2 = a*c + b*c
However, when I do that I start to get significant errors in my results. I used a high precision library, which did improve things, but the simulation ran horribly slow (the simulation took 50 times longer) and it really isn't a practical solution. From this I realised that it isn't really the precision of the variables a, b, & c that is hurting me, but something in the way the multiplication is done.
My question is: how can I multiply out these brackets in way so that result1 = result2?
Thanks.
SOLVED!!!!!!!!!
It was a problem with the addition. So I reordered the terms and applied Kahan addition by writing the following piece of code:
double Modelsimple::sum(double a, double b, double c, double d) {
    // reorder the variables from smallest to greatest
    double tempone   = (a<b?a:b);
    double temptwo   = (c<d?c:d);
    double tempthree = (a>b?a:b);
    double tempfour  = (c>d?c:d);
    double one       = (tempone<temptwo?tempone:temptwo);
    double four      = (tempthree>tempfour?tempthree:tempfour);
    double tempfive  = (tempone>temptwo?tempone:temptwo);
    double tempsix   = (tempthree<tempfour?tempthree:tempfour);
    double two       = (tempfive<tempsix?tempfive:tempsix);
    double three     = (tempfive>tempsix?tempfive:tempsix);

    // Kahan addition
    double total   = one;
    double tempsum = one + two;
    double error   = (tempsum - one) - two;
    total = tempsum;
    // first iteration complete
    double tempadd = three - error;
    tempsum = total + tempadd;
    error   = (tempsum - total) - tempadd;
    total   = tempsum;
    // second iteration complete
    tempadd = four - error;
    total  += tempadd;
    return total;
}
This gives me results that are as close to the precise answer as makes no difference. However, in a fictitious simulation of a mine collapse, the code with the Kahan addition takes 2 minutes whereas the high precision library takes over a day to finish!!
Thanks to all the help here. This problem was really a pain in the a$$.
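For reference, the unrolled code above is the standard Kahan (compensated) summation; a generic loop form over an array looks roughly like this (the function name and array interface are illustrative, not part of the original Modelsimple class):

#include <stddef.h>

/* Kahan summation: `error` carries the low-order bits lost by each
   addition into the next iteration. */
double kahan_sum(const double *values, size_t n)
{
    double total = 0.0;
    double error = 0.0;
    for (size_t i = 0; i < n; i++) {
        double adjusted = values[i] - error;
        double tempsum  = total + adjusted;
        error = (tempsum - total) - adjusted;
        total = tempsum;
    }
    return total;
}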
I am presuming your numbers are all floating point values.
You should not expect result1 to equal result2, due to limitations in the scale of the numbers and precision in the calculations. Which one to use will depend upon the numbers you are dealing with. More important than result1 and result2 being the same is that they are close enough to the real answer (e.g. what you would have calculated by hand) for your application.
Imagine that a and b are both very large, and c much less than 1. (a + b) might overflow so that result1 will be incorrect. result2 would not overflow because it scales everything down before adding.
There are also problems with loss of precision when combining numbers of widely differing size, as the smaller number has significant digits reduced when it is converted to use the same exponent as the larger number it is added to.
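A tiny illustration of that loss of precision, with assumed values: the small addend is absorbed entirely by the large one.

#include <stdio.h>

int main(void)
{
    double big = 1e16;                    /* above 2^53, so adjacent doubles are 2 apart */
    printf("%.1f\n", (big + 1.0) - big);  /* prints 0.0, not 1.0 */
    return 0;
}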
If you give some specific examples of a, b and c which are causing you issues it might be possible to suggest further improvements.
I have been using the following program as a test, using values for a and b between 10^5 and 10^10, and c around 10^-5, but so far cannot find any differences.
Thinking about the storage of 10^5 vs 10^10, 10^5 needs about 17 bits and 10^10 about 33 bits, so you may lose roughly 16 bits of precision when you add a and b together in result1.
But multiplying them by the same value c essentially reduces the exponent but leaves the significand the same, so it should also lose roughly 16 bits of precision in result2.
A double significand usually stores 53 bits, so I suspect your results will still retain about 37 bits, or roughly 11 decimal digits of precision.
#include <stdio.h>

int main()
{
    double a = 13584.9484893449;
    double b = 43719848748.3911;
    double c = 0.00001483394434;
    double result1 = (a+b)*c;
    double result2 = a*c + b*c;
    double diff = result1 - result2;
    printf("size of double is %zu\n", sizeof(double));
    printf("a=%f\nb=%f\nc=%f\nr1=%f\nr2=%f\ndiff=%f\n", a, b, c, result1, result2, diff);
}
However I do find a difference if I change all the doubles to float and use c=0.00001083394434. Are you sure that you are using 64 (or 80) bit doubles when doing your calculations?
Usually "loss of precision" in these kinds of calculations can be traced to "poorly formulated problem". For example, when you have to add a series of numbers of very different sizes, you will get a different answer depending on the order in which you sum them. The problem is even more acute when you subtract numbers.
The best approach in your case above is to look not simply at this one line, but at the way that result1 is used in your subsequent calculations. In principle, an engineering calculation should not require precision in the final result beyond about three significant figures; but in many instances (for example, finite element methods) you end up subtracting two numbers that are very similar in magnitude - in which case you may lose many significant figures and get a meaningless answer. Given that you are talking about "materials properties" and "strain", I am suspecting that is actually at the heart of your problem.
One approach is to look at places where you compute a difference and see if you can reformulate your problem (for example, if you can differentiate your function, you can replace Y(x+dx) - Y(x) with dx * Y'(x)).
There are many excellent references on the subject of numerical stability. It is a complicated subject. Just "throwing more significant figures at the problem" is almost never the best solution.

Is there a practical limit to the size of bit masks?

There's a common way to store multiple values in one variable, by using a bitmask. For example, if a user has read, write and execute privileges on an item, that can be converted to a single number by saying read = 4 (2^2), write = 2 (2^1), execute = 1 (2^0) and then add them together to get 7.
I use this technique in several web applications, where I'd usually store the variable into a field and give it a type of MEDIUMINT or whatever, depending on the number of different values.
What I'm interested in is whether or not there is a practical limit to the number of values you can store like this. For example, if the number of values was over 64, you couldn't use (64-bit) integers any more. If this were the case, what would you use? How would it affect your program logic (i.e.: could you still use bitwise comparisons)?
I know that once you start getting really large sets of values, a different method would be the optimal solution, but I'm interested in the boundaries of this method.
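To make the scheme described in the question concrete, here is a minimal C sketch (the enum names are made up for illustration):

#include <stdio.h>

enum { PERM_EXECUTE = 1 << 0, PERM_WRITE = 1 << 1, PERM_READ = 1 << 2 };

int main(void)
{
    unsigned perms = PERM_READ | PERM_WRITE | PERM_EXECUTE;  /* 4 + 2 + 1 = 7 */
    if (perms & PERM_WRITE)
        printf("write permission is set\n");
    return 0;
}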
Off the top of my head, I'd write a set_bit and get_bit function that could take an array of bytes and a bit offset in the array, and use some bit-twiddling to set/get the appropriate bit in the array. Something like this (in C, but hopefully you get the idea):
// sets the n-th bit in |bytes|. num_bytes is the number of bytes in the array
// result is 0 on success, non-zero on failure (offset out-of-bounds)
int set_bit(char* bytes, unsigned long num_bytes, unsigned long offset)
{
    // make sure offset is valid (offset is unsigned, so only the upper bound matters)
    if (offset >= (num_bytes << 3)) { return -1; }

    // set the right bit
    bytes[offset >> 3] |= (1 << (offset & 0x7));
    return 0; // success
}

// gets the n-th bit in |bytes|. num_bytes is the number of bytes in the array
// returns (-1) on error, 0 if bit is "off", positive number if "on"
int get_bit(char* bytes, unsigned long num_bytes, unsigned long offset)
{
    // make sure offset is valid
    if (offset >= (num_bytes << 3)) { return -1; }

    // get the right bit
    return (bytes[offset >> 3] & (1 << (offset & 0x7)));
}
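And a short usage sketch for those helpers, assuming the two functions above are in the same file (the array size here is arbitrary):

int main(void)
{
    char flags[16] = {0};                  /* 16 bytes = room for 128 flags */
    set_bit(flags, sizeof flags, 100);     /* turn flag 100 on */
    return get_bit(flags, sizeof flags, 100) > 0 ? 0 : 1;  /* exit 0 if it reads back as on */
}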
I've used bit masks in filesystem code where the bit mask is many times bigger than a machine word. Think of it like an "array of booleans"
(journalling masks in flash memory, if you want to know).
Many compilers know how to do this for you. Add a bit of OO code to have types that operate sensibly, and then your code starts looking like its intent, not some bit-banging.
My 2 cents.
With a 64-bit integer you can store values up to 2^64-1, but that is still only 64 individual flags. So yes, there is a limit, but if you need more than 64 bits' worth of flags, I'd be very interested to know what they were all doing :)
How many states do you need to potentially think about? If you have 64 potential states, the number of combinations they can exist in is the full range of a 64-bit integer.
If you need to worry about 128 flags, then a pair of bit vectors would suffice (2^64 * 2).
Addition: in Programming Pearls, there is an extended discussion of using a bit array of length 10^7, implemented in integers (for recording which toll-free 800 numbers are in use) - it's very fast, and very appropriate for the task described in that chapter.
Some languages (I believe Perl does, not sure) permit bitwise arithmetic on strings, giving you a much greater effective range ((strlen * 8-bit chars) combinations).
However, I wouldn't use a single value for superimposition of more than one type of data. The basic r/w/x triplet of 3-bit ints would probably be the upper "practical" limit, not for space-efficiency reasons, but for practical development reasons.
(PHP uses this system to control its error messages, and I have already found that it's a bit over-the-top when you have to define values where PHP's constants are not available and you have to generate the integer by hand; and to be honest, if chmod didn't support the 'ugo+rwx' style syntax, I'd never want to use it because I can never remember the magic numbers.)
The instant you have to crack open a constants table to debug code you know you've gone too far.
Old thread, but it's worth mentioning that there are cases requiring bloated bit masks, e.g., molecular fingerprints, which are often generated as 1024-bit arrays that we have packed into 32 bigint fields (SQL Server not supporting UInt32). Bitwise operations work fine - until your table starts to grow and you realize the sluggishness of separate function calls. The binary data type would work, were it not for T-SQL's ban on bitwise operators having two binary operands.
For example, .NET uses an array of integers as the internal storage for its BitArray class. Practically, there's no other way around it.
That being said, in SQL you will need more than one column (or use BLOBs) to store all the states.
You tagged this question SQL, so I think you need to consult with the documentation for your database to find the size of an integer. Then subtract one bit for the sign, just to be safe.
Edit: Your comment says you're using MySQL. The documentation for MySQL 5.0 Numeric Types states that the maximum size of a NUMERIC is 64 or 65 digits. That's 212 bits for 64 digits.
Remember that your language of choice has to be able to work with those digits, so you may be limited to a 64-bit integer anyway.

How do I process enormous numbers? [duplicate]

Possible Duplicate:
Most efficient implementation of a large number class
Suppose I needed to calculate 2^150000. Obviously that number is going to exceed the size of an int, float, or double. How can I make a data type that allows normal math functions but exceeds the basic number types?
If this is a "depends which language you use" kind of deal. I will say C#.
See
Most efficient implementation of a large number class
for some leads.
If C# is not cast in stone, and you want something that just works out of the box, then there are several options. The one I know best is Python, but I think that languages like Scheme and Ruby support large numbers, too.
Python: 2**150000. Prints the result after about 1 second.
If you want free mathematics software, look at Maxima or Sage.
You might also consider using Frink, which is a language with the native capability of dealing with measurement units.
It computes 2^150000 without difficulty, deals with fractions (e.g. 1/3+2/5 --> 11/15), computes 3 meters + 2 inch --> 3.0508 m and is a full programming language.
Frink - Copyright 2000-2008 Alan Eliasen, eliasen#mindspring.com
http://futureboy.us/frinkdocs/
Several languages have built in support for arbitrary large numbers. You could use Mathematica, for example. I tried your example in Mathematica, and the result has 45,155 digits. I tried the same example with bc on a Unix machine. bc supports extended precision, but not that extended; it bombed on the example.
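If staying close to C is preferable to switching languages entirely, a hedged sketch using the GMP library (which some of the tools mentioned in these answers build on) handles the example comfortably; compile with -lgmp:

#include <gmp.h>
#include <stdio.h>

int main(void)
{
    mpz_t x;
    mpz_init(x);
    mpz_ui_pow_ui(x, 2, 150000);  /* x = 2^150000 */
    /* mpz_sizeinbase may overestimate by one digit for base 10 */
    printf("about %zu digits\n", mpz_sizeinbase(x, 10));  /* about 45,155 */
    mpz_clear(x);
    return 0;
}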
Lisp is your friend. Default biginteger numbers.
I find it very frustrating to use a language without arbitrarily large numbers: it seems nonsensical to be able to use ordinary operators like addition on most numbers, but to have to switch to method calls on a BigInt instance simply because of its size.
A whole bunch of languages have more complete numeric towers, and seamlessly coerce when needed; e.g., Allegro Common Lisp evaluates and prints all 45,155 digits of (expt 2 150000) in 1ms.
cl-user(2): (time (expt 2 150000))
; cpu time (non-gc) 0 msec user, 0 msec system
; cpu time (gc) 0 msec user, 0 msec system
; cpu time (total) 0 msec user, 0 msec system
; real time 1 msec
; space allocation:
; 2 cons cells, 18,784 other bytes, 0 static bytes
There is a program written in C called calc, which is an arbitrary precision calculator. I used it once when working as a researcher and found it fairly straightforward to use...
http://sourceforge.net/projects/calc/
It can be programmed for difficult or long calculations and can accept arguments from the command line. In interactive mode, it accepts one command at a time, and displays the answer.
Ordinarily the commands are simply expressions such as:
3 * (4 + 1)
and calc will print:
15
Calc does the arithmetic operators +, -, /, * as well as ^ (exponentiation), % (modulus) and // (integer divide).
For example:
3 * 19 ^ 43 - 1
will produce:
29075426613099201338473141505176993450849249622191102976
Calc values can be VERY large. For example:
2 ^ 23209 - 1
will print:
402874115778988778181873329071 ... loads of digits ... 3779264511
Hope this helps...
I don't know C#, but I do know the Ruby programming language has the BigDecimal class, which seems to allow numbers of unlimited size.
Python has a bignum library. If you need to implement a bignum library in another language you can at least use the Python one as reference for validating your work. Note that bignums have a few implementation gotchas that aren't immediately obvious if you don't know what you're looking for.