Approximation using gmp mpf_class - gmp

I am writing a UnitTest using Catch2.
I want to check if two vectors are equal. They look like the following using gmplib:
std::vector<mpf_class> result
Due to me 'faking' the expected_result vector, I get the following message after a failed test:
unittests/test.cpp:01: FAILED:
REQUIRE( actual_result == expected_result )
with expansion:
{ 0.5, 0.166667, 0.166667, 0.166667 }
==
{ 0.5, 0.166667, 0.166667, 0.166667 }
So I was looking for a function that could do an approximation for me.
I just wasn't successful in finding a solution that worked out for me.
I found some Comparison Functions but they do not work on my project.
EDIT:
The "minimal, reproducible example would simply be:
TEST_CASE("DemoTest") {
// simplified:
mpf_class a = 1;
mpf_class b = 6;
mpf_class actual_result = a / b;
mpf_class expected_result= 0.16666666667;
REQUIRE(actual_result == expected_result);
}
The "only" difference to my real application is that the results are stored in vectors. But because I am only "faking" the result by saying it is "0.1666666667" it probably doesn't fit the == anymore. So I need a function that takes an approximation and compares the range like epsilon = +-0.001.
Edit:
After implementing the solution #Arc suggested it worked well until I had some Values that were not complete "even".
So I have a failure with the following values:
actual 0.16666666666666666666700000000000000000000000000000
expected 0.16666666666666665741500000000000000000000000000000
Even though my "expected" value looks like this:
mpf_class expected = 0.16666666666666666666700000000000000000000000000000
Getting back to my original question if there is a way I can compare an approximation of the number with an epsilon of like +-0.0001 or what would be the best way to fix this issue?

First, we need to see some Minimal, Reproducible Example to be sure of what is happening. You can for example cut down some code from your test.cpp until you are left with just a few lines of code, but the issue still happens. Also, please provide compilation and running instructions. Frequently, a little bit of explanation on what your goals are may also help. As Catch2 is available on GitHub you don't need to provide it.
Without seeing the code, the best I can guess is that your code is trying to comparing mpf_t types in the mpf_class using the == operator, which I'm afraid has not been overload (see here). You should compare mpf_ts with the cmp function, since the C type mpf_t is actually an struct containing the pointer to the actual significand limbs. Check some usage examples in the tests/cxx/ directory of GMP (like here).
I note you are using GNU MP 4.1 version which is very old, you probably want to move to the 6.2.1 latest version if possible. Also, for using floats it's recommended that you use the GNU MPFR library instead of GMP floats.
EDIT: I did not yet manage to run Catch2, but the issue with your code is the expected_result is actually not equal to the actual_result. In GMP mpf_t variables are created with a 64-bit significand precision (on 64-bit machines), so that the division a / b actually results in a binary that prints 0.166666666666666666667 (that's 19 sixes after the digit 1). Try printing the result with gmp_printf("%.50Ff\n", actual_result);, because the standard cout output will only give you the value rounded to 6 digits: 0.166667.
But the problem is you can't just assign this like expected_result = 0.166666666666666666667 because in C/C++ numeric constants are parsed as double, thus you have to use the string overload attribution to get more precision.
But you can't also manage to easily (or, in general, justifiably) coin a decimal string that will correctly convert to the exact same binary given by a / b because decimal to float conversion has subtleties, see for example here and here.
So, it all depends on your application and the kind of numerical validation you aim to do. If you know that your decimal validation values are correct to some known precision, and if you set the mpf_t variables to withstanding precision (using for example mpf_set_prec), then you can use tolerance comparison, like so.
in C++ (without Catch2), it works like this:
#include <iostream>
#include <gmpxx.h>
using namespace std;
int main (void)
{
mpf_class a = 1;
mpf_class b = 6;
mpf_class actual = a / b;
mpf_class expected;
mpf_class tol;
expected = "0.166666666666666666666666666666667";
tol = "1e-30";
cout << "actual " << actual << "\n";
cout << "expected " << expected << "\n";
gmp_printf("actual %.50Ff\n", actual);
gmp_printf("expected %.50Ff\n", expected);
gmp_printf("tol %.50Ff\n", tol);
mpf_class diff = expected - actual;
gmp_printf("diff %.50Ff\n", diff);
if (abs(actual - expected) < tol)
cout << "ok\n";
else
cout << "nop\n";
return 0;
}
And compile with -lgmpxx -lgmp options.
It produces the output:
actual 0.166667
expected 0.166667
actual 0.16666666666666666666700000000000000000000000000000
expected 0.16666666666666666666700000000000000000000000000000
tol 0.00000000000000000000000000000100000000000000000000
diff 0.00000000000000000000000000000000033333529249058470
ok
If I understand Catch2 well, it should be ok if you assign expected_result with string then compare with REQUIRE(abs(actual - expected) < tol).

Related

Measuring Program Execution Time with Cycle Counters

I have confusion in this particular line-->
result = (double) hi * (1 << 30) * 4 + lo;
of the following code:
void access_counter(unsigned *hi, unsigned *lo)
// Set *hi and *lo to the high and low order bits of the cycle
// counter.
{
asm("rdtscp; movl %%edx,%0; movl %%eax,%1" // Read cycle counter
: "=r" (*hi), "=r" (*lo) // and move results to
: /* No input */ // the two outputs
: "%edx", "%eax");
}
double get_counter()
// Return the number of cycles since the last call to start_counter.
{
unsigned ncyc_hi, ncyc_lo;
unsigned hi, lo, borrow;
double result;
/* Get cycle counter */
access_counter(&ncyc_hi, &ncyc_lo);
lo = ncyc_lo - cyc_lo;
borrow = lo > ncyc_lo;
hi = ncyc_hi - cyc_hi - borrow;
result = (double) hi * (1 << 30) * 4 + lo;
if (result < 0) {
fprintf(stderr, "Error: counter returns neg value: %.0f\n", result);
}
return result;
}
The thing I cannot understand is that why is hi being multiplied with 2^30 and then 4? and then low added to it? Someone please explain what is happening in this line of code. I do know that what hi and low contain.
The short answer:
That line turns a 64bit integer that is stored as 2 32bit values into a floating point number.
Why doesn't the code just use a 64bit integer? Well, gcc has supported 64bit numbers for a long time, but presumably this code predates that. In that case, the only way to support numbers that big is to put them into a floating point number.
The long answer:
First, you need to understand how rdtscp works. When this assembler instruction is invoked, it does 2 things:
1) Sets ecx to IA32_TSC_AUX MSR. In my experience, this generally just means ecx gets set to zero.
2) Sets edx:eax to the current value of the processor’s time-stamp counter. This means that the lower 64bits of the counter go into eax, and the upper 32bits are in edx.
With that in mind, let's look at the code. When called from get_counter, access_counter is going to put edx in 'ncyc_hi' and eax in 'ncyc_lo.' Then get_counter is going to do:
lo = ncyc_lo - cyc_lo;
borrow = lo > ncyc_lo;
hi = ncyc_hi - cyc_hi - borrow;
What does this do?
Since the time is stored in 2 different 32bit numbers, if we want to find out how much time has elapsed, we need to do a bit of work to find the difference between the old time and the new. When it is done, the result is stored (again, using 2 32bit numbers) in hi / lo.
Which finally brings us to your question.
result = (double) hi * (1 << 30) * 4 + lo;
If we could use 64bit integers, converting 2 32bit values to a single 64bit value would look like this:
unsigned long long result = hi; // put hi into the 64bit number.
result <<= 32; // shift the 32 bits to the upper part of the number
results |= low; // add in the lower 32bits.
If you aren't used to bit shifting, maybe looking at it like this will help. If lo = 1 and high = 2, then expressed as hex numbers:
result = hi; 0x0000000000000002
result <<= 32; 0x0000000200000000
result |= low; 0x0000000200000001
But if we assume the compiler doesn't support 64bit integers, that won't work. While floating point numbers can hold values that big, they don't support shifting. So we need to figure out a way to shift 'hi' left by 32bits, without using left shift.
Ok then, shifting left by 1 is really the same as multiplying by 2. Shifting left by 2 is the same as multiplying by 4. Shifting left by [omitted...] Shifting left by 32 is the same as multiplying by 4,294,967,296.
By an amazing coincidence, 4,294,967,296 == (1 << 30) * 4.
So why write it in that complicated fashion? Well, 4,294,967,296 is a pretty big number. In fact, it's too big to fit in an 32bit integer. Which means if we put it in our source code, a compiler that doesn't support 64bit integers may have trouble figuring out how to process it. Written like this, the compiler can generate whatever floating point instructions it might need to work on that really big number.
Why the current code is wrong:
It looks like variations of this code have been wandering around the internet for a long time. Originally (I assume) access_counter was written using rdtsc instead of rdtscp. I'm not going to try to describe the difference between the two (google them), other than to point out that rdtsc does not set ecx, and rdtscp does. Whoever changed rdtsc to rdtscp apparently didn't know that, and failed to adjust the inline assembler stuff to reflect it. While your code might work fine despite this, it might do something weird instead. To fix it, you could do:
asm("rdtscp; movl %%edx,%0; movl %%eax,%1" // Read cycle counter
: "=r" (*hi), "=r" (*lo) // and move results to
: /* No input */ // the two outputs
: "%edx", "%eax", "%ecx");
While this will work, it isn't optimal. Registers are a valuable and scarce resource on i386. This tiny fragment uses 5 of them. With a slight modification:
asm("rdtscp" // Read cycle counter
: "=d" (*hi), "=a" (*lo)
: /* No input */
: "%ecx");
Now we have 2 fewer assembly statements, and we only use 3 registers.
But even that isn't the best we can do. In the (presumably long) time since this code was written, gcc has added both support for 64bit integers and a function to read the tsc, so you don't need to use asm at all:
unsigned int a;
unsigned long long result;
result = __builtin_ia32_rdtscp(&a);
'a' is the (useless?) value that was being returned in ecx. The function call requires it, but we can just ignore the returned value.
So, instead of doing something like this (which I assume your existing code does):
unsigned cyc_hi, cyc_lo;
access_counter(&cyc_hi, &cyc_lo);
// do something
double elapsed_time = get_counter(); // Find the difference between cyc_hi, cyc_lo and the current time
We can do:
unsigned int a;
unsigned long long before, after;
before = __builtin_ia32_rdtscp(&a);
// do something
after = __builtin_ia32_rdtscp(&a);
unsigned long long elapsed_time = after - before;
This is shorter, doesn't use hard-to-understand assembler, is easier to read, maintain and produces the best possible code.
But it does require a relatively recent version of gcc.

Kindly explain me this code of increment decrement operator [duplicate]

#include <stdio.h>
int main(void)
{
int i = 0;
i = i++ + ++i;
printf("%d\n", i); // 3
i = 1;
i = (i++);
printf("%d\n", i); // 2 Should be 1, no ?
volatile int u = 0;
u = u++ + ++u;
printf("%d\n", u); // 1
u = 1;
u = (u++);
printf("%d\n", u); // 2 Should also be one, no ?
register int v = 0;
v = v++ + ++v;
printf("%d\n", v); // 3 (Should be the same as u ?)
int w = 0;
printf("%d %d\n", ++w, w); // shouldn't this print 1 1
int x[2] = { 5, 8 }, y = 0;
x[y] = y ++;
printf("%d %d\n", x[0], x[1]); // shouldn't this print 0 8? or 5 0?
}
C has the concept of undefined behavior, i.e. some language constructs are syntactically valid but you can't predict the behavior when the code is run.
As far as I know, the standard doesn't explicitly say why the concept of undefined behavior exists. In my mind, it's simply because the language designers wanted there to be some leeway in the semantics, instead of i.e. requiring that all implementations handle integer overflow in the exact same way, which would very likely impose serious performance costs, they just left the behavior undefined so that if you write code that causes integer overflow, anything can happen.
So, with that in mind, why are these "issues"? The language clearly says that certain things lead to undefined behavior. There is no problem, there is no "should" involved. If the undefined behavior changes when one of the involved variables is declared volatile, that doesn't prove or change anything. It is undefined; you cannot reason about the behavior.
Your most interesting-looking example, the one with
u = (u++);
is a text-book example of undefined behavior (see Wikipedia's entry on sequence points).
Most of the answers here quoted from C standard emphasizing that the behavior of these constructs are undefined. To understand why the behavior of these constructs are undefined, let's understand these terms first in the light of C11 standard:
Sequenced: (5.1.2.3)
Given any two evaluations A and B, if A is sequenced before B, then the execution of A shall precede the execution of B.
Unsequenced:
If A is not sequenced before or after B, then A and B are unsequenced.
Evaluations can be one of two things:
value computations, which work out the result of an expression; and
side effects, which are modifications of objects.
Sequence Point:
The presence of a sequence point between the evaluation of expressions A and B implies that every value computation and side effect associated with A is sequenced before every value computation and side effect associated with B.
Now coming to the question, for the expressions like
int i = 1;
i = i++;
standard says that:
6.5 Expressions:
If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined. [...]
Therefore, the above expression invokes UB because two side effects on the same object i is unsequenced relative to each other. That means it is not sequenced whether the side effect by assignment to i will be done before or after the side effect by ++.
Depending on whether assignment occurs before or after the increment, different results will be produced and that's the one of the case of undefined behavior.
Lets rename the i at left of assignment be il and at the right of assignment (in the expression i++) be ir, then the expression be like
il = ir++ // Note that suffix l and r are used for the sake of clarity.
// Both il and ir represents the same object.
An important point regarding Postfix ++ operator is that:
just because the ++ comes after the variable does not mean that the increment happens late. The increment can happen as early as the compiler likes as long as the compiler ensures that the original value is used.
It means the expression il = ir++ could be evaluated either as
temp = ir; // i = 1
ir = ir + 1; // i = 2 side effect by ++ before assignment
il = temp; // i = 1 result is 1
or
temp = ir; // i = 1
il = temp; // i = 1 side effect by assignment before ++
ir = ir + 1; // i = 2 result is 2
resulting in two different results 1 and 2 which depends on the sequence of side effects by assignment and ++ and hence invokes UB.
I think the relevant parts of the C99 standard are 6.5 Expressions, §2
Between the previous and next sequence point an object shall have its stored value
modified at most once by the evaluation of an expression. Furthermore, the prior value
shall be read only to determine the value to be stored.
and 6.5.16 Assignment operators, §4:
The order of evaluation of the operands is unspecified. If an attempt is made to modify
the result of an assignment operator or to access it after the next sequence point, the
behavior is undefined.
Just compile and disassemble your line of code, if you are so inclined to know how exactly it is you get what you are getting.
This is what I get on my machine, together with what I think is going on:
$ cat evil.c
void evil(){
int i = 0;
i+= i++ + ++i;
}
$ gcc evil.c -c -o evil.bin
$ gdb evil.bin
(gdb) disassemble evil
Dump of assembler code for function evil:
0x00000000 <+0>: push %ebp
0x00000001 <+1>: mov %esp,%ebp
0x00000003 <+3>: sub $0x10,%esp
0x00000006 <+6>: movl $0x0,-0x4(%ebp) // i = 0 i = 0
0x0000000d <+13>: addl $0x1,-0x4(%ebp) // i++ i = 1
0x00000011 <+17>: mov -0x4(%ebp),%eax // j = i i = 1 j = 1
0x00000014 <+20>: add %eax,%eax // j += j i = 1 j = 2
0x00000016 <+22>: add %eax,-0x4(%ebp) // i += j i = 3
0x00000019 <+25>: addl $0x1,-0x4(%ebp) // i++ i = 4
0x0000001d <+29>: leave
0x0000001e <+30>: ret
End of assembler dump.
(I... suppose that the 0x00000014 instruction was some kind of compiler optimization?)
The behavior can't really be explained because it invokes both unspecified behavior and undefined behavior, so we can not make any general predictions about this code, although if you read Olve Maudal's work such as Deep C and Unspecified and Undefined sometimes you can make good guesses in very specific cases with a specific compiler and environment but please don't do that anywhere near production.
So moving on to unspecified behavior, in draft c99 standard section6.5 paragraph 3 says(emphasis mine):
The grouping of operators and operands is indicated by the syntax.74) Except as specified
later (for the function-call (), &&, ||, ?:, and comma operators), the order of evaluation of subexpressions and the order in which side effects take place are both unspecified.
So when we have a line like this:
i = i++ + ++i;
we do not know whether i++ or ++i will be evaluated first. This is mainly to give the compiler better options for optimization.
We also have undefined behavior here as well since the program is modifying variables(i, u, etc..) more than once between sequence points. From draft standard section 6.5 paragraph 2(emphasis mine):
Between the previous and next sequence point an object shall have its stored value
modified at most once by the evaluation of an expression. Furthermore, the prior value
shall be read only to determine the value to be stored.
it cites the following code examples as being undefined:
i = ++i + 1;
a[i++] = i;
In all these examples the code is attempting to modify an object more than once in the same sequence point, which will end with the ; in each one of these cases:
i = i++ + ++i;
^ ^ ^
i = (i++);
^ ^
u = u++ + ++u;
^ ^ ^
u = (u++);
^ ^
v = v++ + ++v;
^ ^ ^
Unspecified behavior is defined in the draft c99 standard in section 3.4.4 as:
use of an unspecified value, or other behavior where this International Standard provides
two or more possibilities and imposes no further requirements on which is chosen in any
instance
and undefined behavior is defined in section 3.4.3 as:
behavior, upon use of a nonportable or erroneous program construct or of erroneous data,
for which this International Standard imposes no requirements
and notes that:
Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).
Another way of answering this, rather than getting bogged down in arcane details of sequence points and undefined behavior, is simply to ask, what are they supposed to mean? What was the programmer trying to do?
The first fragment asked about, i = i++ + ++i, is pretty clearly insane in my book. No one would ever write it in a real program, it's not obvious what it does, there's no conceivable algorithm someone could have been trying to code that would have resulted in this particular contrived sequence of operations. And since it's not obvious to you and me what it's supposed to do, it's fine in my book if the compiler can't figure out what it's supposed to do, either.
The second fragment, i = i++, is a little easier to understand. Someone is clearly trying to increment i, and assign the result back to i. But there are a couple ways of doing this in C. The most basic way to add 1 to i, and assign the result back to i, is the same in almost any programming language:
i = i + 1
C, of course, has a handy shortcut:
i++
This means, "add 1 to i, and assign the result back to i". So if we construct a hodgepodge of the two, by writing
i = i++
what we're really saying is "add 1 to i, and assign the result back to i, and assign the result back to i". We're confused, so it doesn't bother me too much if the compiler gets confused, too.
Realistically, the only time these crazy expressions get written is when people are using them as artificial examples of how ++ is supposed to work. And of course it is important to understand how ++ works. But one practical rule for using ++ is, "If it's not obvious what an expression using ++ means, don't write it."
We used to spend countless hours on comp.lang.c discussing expressions like these and why they're undefined. Two of my longer answers, that try to really explain why, are archived on the web:
Why doesn't the Standard define what these do?
Doesn't operator precedence determine the order of evaluation?
See also question 3.8 and the rest of the questions in section 3 of the C FAQ list.
Often this question is linked as a duplicate of questions related to code like
printf("%d %d\n", i, i++);
or
printf("%d %d\n", ++i, i++);
or similar variants.
While this is also undefined behaviour as stated already, there are subtle differences when printf() is involved when comparing to a statement such as:
x = i++ + i++;
In the following statement:
printf("%d %d\n", ++i, i++);
the order of evaluation of arguments in printf() is unspecified. That means, expressions i++ and ++i could be evaluated in any order. C11 standard has some relevant descriptions on this:
Annex J, unspecified behaviours
The order in which the function designator, arguments, and
subexpressions within the arguments are evaluated in a function call
(6.5.2.2).
3.4.4, unspecified behavior
Use of an unspecified value, or other behavior where this
International Standard provides two or more possibilities and imposes
no further requirements on which is chosen in any instance.
EXAMPLE An example of unspecified behavior is the order in which the
arguments to a function are evaluated.
The unspecified behaviour itself is NOT an issue. Consider this example:
printf("%d %d\n", ++x, y++);
This too has unspecified behaviour because the order of evaluation of ++x and y++ is unspecified. But it's perfectly legal and valid statement. There's no undefined behaviour in this statement. Because the modifications (++x and y++) are done to distinct objects.
What renders the following statement
printf("%d %d\n", ++i, i++);
as undefined behaviour is the fact that these two expressions modify the same object i without an intervening sequence point.
Another detail is that the comma involved in the printf() call is a separator, not the comma operator.
This is an important distinction because the comma operator does introduce a sequence point between the evaluation of their operands, which makes the following legal:
int i = 5;
int j;
j = (++i, i++); // No undefined behaviour here because the comma operator
// introduces a sequence point between '++i' and 'i++'
printf("i=%d j=%d\n",i, j); // prints: i=7 j=6
The comma operator evaluates its operands left-to-right and yields only the value of the last operand. So in j = (++i, i++);, ++i increments i to 6 and i++ yields old value of i (6) which is assigned to j. Then i becomes 7 due to post-increment.
So if the comma in the function call were to be a comma operator then
printf("%d %d\n", ++i, i++);
will not be a problem. But it invokes undefined behaviour because the comma here is a separator.
For those who are new to undefined behaviour would benefit from reading What Every C Programmer Should Know About Undefined Behavior to understand the concept and many other variants of undefined behaviour in C.
This post: Undefined, unspecified and implementation-defined behavior is also relevant.
While it is unlikely that any compilers and processors would actually do so, it would be legal, under the C standard, for the compiler to implement "i++" with the sequence:
In a single operation, read `i` and lock it to prevent access until further notice
Compute (1+read_value)
In a single operation, unlock `i` and store the computed value
While I don't think any processors support the hardware to allow such a thing to be done efficiently, one can easily imagine situations where such behavior would make multi-threaded code easier (e.g. it would guarantee that if two threads try to perform the above sequence simultaneously, i would get incremented by two) and it's not totally inconceivable that some future processor might provide a feature something like that.
If the compiler were to write i++ as indicated above (legal under the standard) and were to intersperse the above instructions throughout the evaluation of the overall expression (also legal), and if it didn't happen to notice that one of the other instructions happened to access i, it would be possible (and legal) for the compiler to generate a sequence of instructions that would deadlock. To be sure, a compiler would almost certainly detect the problem in the case where the same variable i is used in both places, but if a routine accepts references to two pointers p and q, and uses (*p) and (*q) in the above expression (rather than using i twice) the compiler would not be required to recognize or avoid the deadlock that would occur if the same object's address were passed for both p and q.
While the syntax of the expressions like a = a++ or a++ + a++ is legal, the behaviour of these constructs is undefined because a shall in C standard is not obeyed. C99 6.5p2:
Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. [72] Furthermore, the prior value shall be read only to determine the value to be stored [73]
With footnote 73 further clarifying that
This paragraph renders undefined statement expressions such as
i = ++i + 1;
a[i++] = i;
while allowing
i = i + 1;
a[i] = i;
The various sequence points are listed in Annex C of C11 (and C99):
The following are the sequence points described in 5.1.2.3:
Between the evaluations of the function designator and actual arguments in a function call and the actual call. (6.5.2.2).
Between the evaluations of the first and second operands of the following operators: logical AND && (6.5.13); logical OR || (6.5.14); comma , (6.5.17).
Between the evaluations of the first operand of the conditional ? : operator and whichever of the second and third operands is evaluated (6.5.15).
The end of a full declarator: declarators (6.7.6);
Between the evaluation of a full expression and the next full expression to be evaluated. The following are full expressions: an initializer that is not part of a compound literal (6.7.9); the expression in an expression statement (6.8.3); the controlling expression of a selection statement (if or switch) (6.8.4); the controlling expression of a while or do statement (6.8.5); each of the (optional) expressions of a for statement (6.8.5.3); the (optional) expression in a return statement (6.8.6.4).
Immediately before a library function returns (7.1.4).
After the actions associated with each formatted input/output function conversion specifier (7.21.6, 7.29.2).
Immediately before and immediately after each call to a comparison function, and also between any call to a comparison function and any movement of the objects passed as arguments to that call (7.22.5).
The wording of the same paragraph in C11 is:
If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined. If there are multiple allowable orderings of the subexpressions of an expression, the behavior is undefined if such an unsequenced side effect occurs in any of the orderings.84)
You can detect such errors in a program by for example using a recent version of GCC with -Wall and -Werror, and then GCC will outright refuse to compile your program. The following is the output of gcc (Ubuntu 6.2.0-5ubuntu12) 6.2.0 20161005:
% gcc plusplus.c -Wall -Werror -pedantic
plusplus.c: In function ‘main’:
plusplus.c:6:6: error: operation on ‘i’ may be undefined [-Werror=sequence-point]
i = i++ + ++i;
~~^~~~~~~~~~~
plusplus.c:6:6: error: operation on ‘i’ may be undefined [-Werror=sequence-point]
plusplus.c:10:6: error: operation on ‘i’ may be undefined [-Werror=sequence-point]
i = (i++);
~~^~~~~~~
plusplus.c:14:6: error: operation on ‘u’ may be undefined [-Werror=sequence-point]
u = u++ + ++u;
~~^~~~~~~~~~~
plusplus.c:14:6: error: operation on ‘u’ may be undefined [-Werror=sequence-point]
plusplus.c:18:6: error: operation on ‘u’ may be undefined [-Werror=sequence-point]
u = (u++);
~~^~~~~~~
plusplus.c:22:6: error: operation on ‘v’ may be undefined [-Werror=sequence-point]
v = v++ + ++v;
~~^~~~~~~~~~~
plusplus.c:22:6: error: operation on ‘v’ may be undefined [-Werror=sequence-point]
cc1: all warnings being treated as errors
The important part is to know what a sequence point is -- and what is a sequence point and what isn't. For example the comma operator is a sequence point, so
j = (i ++, ++ i);
is well-defined, and will increment i by one, yielding the old value, discard that value; then at comma operator, settle the side effects; and then increment i by one, and the resulting value becomes the value of the expression - i.e. this is just a contrived way to write j = (i += 2) which is yet again a "clever" way to write
i += 2;
j = i;
However, the , in function argument lists is not a comma operator, and there is no sequence point between evaluations of distinct arguments; instead their evaluations are unsequenced with regard to each other; so the function call
int i = 0;
printf("%d %d\n", i++, ++i, i);
has undefined behaviour because there is no sequence point between the evaluations of i++ and ++i in function arguments, and the value of i is therefore modified twice, by both i++ and ++i, between the previous and the next sequence point.
The C standard says that a variable should only be assigned at most once between two sequence points. A semi-colon for instance is a sequence point.
So every statement of the form:
i = i++;
i = i++ + ++i;
and so on violate that rule. The standard also says that behavior is undefined and not unspecified. Some compilers do detect these and produce some result but this is not per standard.
However, two different variables can be incremented between two sequence points.
while(*src++ = *dst++);
The above is a common coding practice while copying/analysing strings.
In https://stackoverflow.com/questions/29505280/incrementing-array-index-in-c someone asked about a statement like:
int k[] = {0,1,2,3,4,5,6,7,8,9,10};
int i = 0;
int num;
num = k[++i+k[++i]] + k[++i];
printf("%d", num);
which prints 7... the OP expected it to print 6.
The ++i increments aren't guaranteed to all complete before the rest of the calculations. In fact, different compilers will get different results here. In the example you provided, the first 2 ++i executed, then the values of k[] were read, then the last ++i then k[].
num = k[i+1]+k[i+2] + k[i+3];
i += 3
Modern compilers will optimize this very well. In fact, possibly better than the code you originally wrote (assuming it had worked the way you had hoped).
Your question was probably not, "Why are these constructs undefined behavior in C?". Your question was probably, "Why did this code (using ++) not give me the value I expected?", and someone marked your question as a duplicate, and sent you here.
This answer tries to answer that question: why did your code not give you the answer you expected, and how can you learn to recognize (and avoid) expressions that will not work as expected.
I assume you've heard the basic definition of C's ++ and -- operators by now, and how the prefix form ++x differs from the postfix form x++. But these operators are hard to think about, so to make sure you understood, perhaps you wrote a tiny little test program involving something like
int x = 5;
printf("%d %d %d\n", x, ++x, x++);
But, to your surprise, this program did not help you understand — it printed some strange, inexplicable output, suggesting that maybe ++ does something completely different, not at all what you thought it did.
Or, perhaps you're looking at a hard-to-understand expression like
int x = 5;
x = x++ + ++x;
printf("%d\n", x);
Perhaps someone gave you that code as a puzzle. This code also makes no sense, especially if you run it — and if you compile and run it under two different compilers, you're likely to get two different answers! What's up with that? Which answer is correct? (And the answer is that both of them are, or neither of them are.)
As you've heard by now, these expressions are undefined, which means that the C language makes no guarantee about what they'll do. This is a strange and unsettling result, because you probably thought that any program you could write, as long as it compiled and ran, would generate a unique, well-defined output. But in the case of undefined behavior, that's not so.
What makes an expression undefined? Are expressions involving ++ and -- always undefined? Of course not: these are useful operators, and if you use them properly, they're perfectly well-defined.
For the expressions we're talking about, what makes them undefined is when there's too much going on at once, when we can't tell what order things will happen in, but when the order matters to the result we'll get.
Let's go back to the two examples I've used in this answer. When I wrote
printf("%d %d %d\n", x, ++x, x++);
the question is, before actually calling printf, does the compiler compute the value of x first, or x++, or maybe ++x? But it turns out we don't know. There's no rule in C which says that the arguments to a function get evaluated left-to-right, or right-to-left, or in some other order. So we can't say whether the compiler will do x first, then ++x, then x++, or x++ then ++x then x, or some other order. But the order clearly matters, because depending on which order the compiler uses, we'll clearly get a different series of numbers printed out.
What about this crazy expression?
x = x++ + ++x;
The problem with this expression is that it contains three different attempts to modify the value of x: (1) the x++ part tries to take x's value, add 1, store the new value in x, and return the old value; (2) the ++x part tries to take x's value, add 1, store the new value in x, and return the new value; and (3) the x = part tries to assign the sum of the other two back to x. Which of those three attempted assignments will "win"? Which of the three values will actually determine the final value of x? Again, and perhaps surprisingly, there's no rule in C to tell us.
You might imagine that precedence or associativity or left-to-right evaluation tells you what order things happen in, but they do not. You may not believe me, but please take my word for it, and I'll say it again: precedence and associativity do not determine every aspect of the evaluation order of an expression in C. In particular, if within one expression there are multiple different spots where we try to assign a new value to something like x, precedence and associativity do not tell us which of those attempts happens first, or last, or anything.
So with all that background and introduction out of the way, if you want to make sure that all your programs are well-defined, which expressions can you write, and which ones can you not write?
These expressions are all fine:
y = x++;
z = x++ + y++;
x = x + 1;
x = a[i++];
x = a[i++] + b[j++];
x[i++] = a[j++] + b[k++];
x = *p++;
x = *p++ + *q++;
These expressions are all undefined:
x = x++;
x = x++ + ++x;
y = x + x++;
a[i] = i++;
a[i++] = i;
printf("%d %d %d\n", x, ++x, x++);
And the last question is, how can you tell which expressions are well-defined, and which expressions are undefined?
As I said earlier, the undefined expressions are the ones where there's too much going at once, where you can't be sure what order things happen in, and where the order matters:
If there's one variable that's getting modified (assigned to) in two or more different places, how do you know which modification happens first?
If there's a variable that's getting modified in one place, and having its value used in another place, how do you know whether it uses the old value or the new value?
As an example of #1, in the expression
x = x++ + ++x;
there are three attempts to modify x.
As an example of #2, in the expression
y = x + x++;
we both use the value of x, and modify it.
So that's the answer: make sure that in any expression you write, each variable is modified at most once, and if a variable is modified, you don't also attempt to use the value of that variable somewhere else.
One more thing. You might be wondering how to "fix" the undefined expressions I started this answer by presenting.
In the case of printf("%d %d %d\n", x, ++x, x++);, it's easy — just write it as three separate printf calls:
printf("%d ", x);
printf("%d ", ++x);
printf("%d\n", x++);
Now the behavior is perfectly well defined, and you'll get sensible results.
In the case of x = x++ + ++x, on the other hand, there's no way to fix it. There's no way to write it so that it has guaranteed behavior matching your expectations — but that's okay, because you would never write an expression like x = x++ + ++x in a real program anyway.
A good explanation about what happens in this kind of computation is provided in the document n1188 from the ISO W14 site.
I explain the ideas.
The main rule from the standard ISO 9899 that applies in this situation is 6.5p2.
Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored.
The sequence points in an expression like i=i++ are before i= and after i++.
In the paper that I quoted above it is explained that you can figure out the program as being formed by small boxes, each box containing the instructions between 2 consecutive sequence points. The sequence points are defined in annex C of the standard, in the case of i=i++ there are 2 sequence points that delimit a full-expression. Such an expression is syntactically equivalent with an entry of expression-statement in the Backus-Naur form of the grammar (a grammar is provided in annex A of the Standard).
So the order of instructions inside a box has no clear order.
i=i++
can be interpreted as
tmp = i
i=i+1
i = tmp
or as
tmp = i
i = tmp
i=i+1
because both all these forms to interpret the code i=i++ are valid and because both generate different answers, the behavior is undefined.
So a sequence point can be seen by the beginning and the end of each box that composes the program [the boxes are atomic units in C] and inside a box the order of instructions is not defined in all cases. Changing that order one can change the result sometimes.
EDIT:
Other good source for explaining such ambiguities are the entries from c-faq site (also published as a book) , namely here and here and here .
The reason is that the program is running undefined behavior. The problem lies in the evaluation order, because there is no sequence points required according to C++98 standard ( no operations is sequenced before or after another according to C++11 terminology).
However if you stick to one compiler, you will find the behavior persistent, as long as you don't add function calls or pointers, which would make the behavior more messy.
Using Nuwen MinGW 15 GCC 7.1 you will get:
#include<stdio.h>
int main(int argc, char ** argv)
{
int i = 0;
i = i++ + ++i;
printf("%d\n", i); // 2
i = 1;
i = (i++);
printf("%d\n", i); //1
volatile int u = 0;
u = u++ + ++u;
printf("%d\n", u); // 2
u = 1;
u = (u++);
printf("%d\n", u); //1
register int v = 0;
v = v++ + ++v;
printf("%d\n", v); //2
}
How does GCC work? it evaluates sub expressions at a left to right order for the right hand side (RHS) , then assigns the value to the left hand side (LHS) . This is exactly how Java and C# behave and define their standards. (Yes, the equivalent software in Java and C# has defined behaviors). It evaluate each sub expression one by one in the RHS Statement in a left to right order; for each sub expression: the ++c (pre-increment) is evaluated first then the value c is used for the operation, then the post increment c++).
according to GCC C++: Operators
In GCC C++, the precedence of the operators controls the order in
which the individual operators are evaluated
the equivalent code in defined behavior C++ as GCC understands:
#include<stdio.h>
int main(int argc, char ** argv)
{
int i = 0;
//i = i++ + ++i;
int r;
r=i;
i++;
++i;
r+=i;
i=r;
printf("%d\n", i); // 2
i = 1;
//i = (i++);
r=i;
i++;
i=r;
printf("%d\n", i); // 1
volatile int u = 0;
//u = u++ + ++u;
r=u;
u++;
++u;
r+=u;
u=r;
printf("%d\n", u); // 2
u = 1;
//u = (u++);
r=u;
u++;
u=r;
printf("%d\n", u); // 1
register int v = 0;
//v = v++ + ++v;
r=v;
v++;
++v;
r+=v;
v=r;
printf("%d\n", v); //2
}
Then we go to Visual Studio. Visual Studio 2015, you get:
#include<stdio.h>
int main(int argc, char ** argv)
{
int i = 0;
i = i++ + ++i;
printf("%d\n", i); // 3
i = 1;
i = (i++);
printf("%d\n", i); // 2
volatile int u = 0;
u = u++ + ++u;
printf("%d\n", u); // 3
u = 1;
u = (u++);
printf("%d\n", u); // 2
register int v = 0;
v = v++ + ++v;
printf("%d\n", v); // 3
}
How does Visual Studio work, it takes another approach, it evaluates all pre-increments expressions in first pass, then uses variables values in the operations in second pass, assign from RHS to LHS in third pass, then at last pass it evaluates all the post-increment expressions in one pass.
So the equivalent in defined behavior C++ as Visual C++ understands:
#include<stdio.h>
int main(int argc, char ** argv)
{
int r;
int i = 0;
//i = i++ + ++i;
++i;
r = i + i;
i = r;
i++;
printf("%d\n", i); // 3
i = 1;
//i = (i++);
r = i;
i = r;
i++;
printf("%d\n", i); // 2
volatile int u = 0;
//u = u++ + ++u;
++u;
r = u + u;
u = r;
u++;
printf("%d\n", u); // 3
u = 1;
//u = (u++);
r = u;
u = r;
u++;
printf("%d\n", u); // 2
register int v = 0;
//v = v++ + ++v;
++v;
r = v + v;
v = r;
v++;
printf("%d\n", v); // 3
}
as Visual Studio documentation states at Precedence and Order of Evaluation:
Where several operators appear together, they have equal precedence and are evaluated according to their associativity. The operators in the table are described in the sections beginning with Postfix Operators.

ROL / ROR on variable using inline assembly only in Objective-C [duplicate]

This question already has answers here:
ROL / ROR on variable using inline assembly in Objective-C
(2 answers)
Closed 9 years ago.
A few days ago, I asked the question below. Because I was in need of a quick answer, I added:
The code does not need to use inline assembly. However, I haven't found a way to do this using Objective-C / C++ / C instructions.
Today, I would like to learn something. So I ask the question again, looking for an answer using inline assembly.
I would like to perform ROR and ROL operations on variables in an Objective-C program. However, I can't manage it – I am not an assembly expert.
Here is what I have done so far:
uint8_t v1 = ....;
uint8_t v2 = ....; // v2 is either 1, 2, 3, 4 or 5
asm("ROR v1, v2");
the error I get is:
Unknown use of instruction mnemonic with unknown size suffix
How can I fix this?
A rotate is just two shifts - some bits go left, the others right - once you see this rotating is easy without assembly. The pattern is recognised by some compilers and compiled using the rotate instructions. See wikipedia for the code.
Update: Xcode 4.6.2 (others not tested) on x86-64 compiles the double shift + or to a rotate for 32 & 64 bit operands, for 8 & 16 bit operands the double shift + or is kept. Why? Maybe the compiler understands something about the performance of these instructions, maybe the just didn't optimise - but in general if you can avoid assembler do so, the compiler invariably knows best! Also using static inline on the functions, or using macros defined in the same way as the standard macro MAX (a macro has the advantage of adapting to the type of its operands), can be used to inline the operations.
Addendum after OP comment
Here is the i86_64 assembler as an example, for full details of how to use the asm construct start here.
First the non-assembler version:
static inline uint32 rotl32_i64(uint32 value, unsigned shift)
{
// assume shift is in range 0..31 or subtraction would be wrong
// however we know the compiler will spot the pattern and replace
// the expression with a single roll and there will be no subtraction
// so if the compiler changes this may break without:
// shift &= 0x1f;
return (value << shift) | (value >> (32 - shift));
}
void test_rotl32(uint32 value, unsigned shift)
{
uint32 shifted = rotl32_i64(value, shift);
NSLog(#"%8x <<< %u -> %8x", value & 0xFFFFFFFF, shift, shifted & 0xFFFFFFFF);
}
If you look at the assembler output for profiling (so the optimiser kicks in) in Xcode (Product > Generate Output > Assembly File, then select Profiling in the pop-up menu as the bottom of the window) you will see that rotl32_i64 is inlined into test_rotl32 and compiles down to a rotate (roll) instruction.
Now producing the assembler directly yourself is a bit more involved than for the ARM code FrankH showed. This is because to take a variable shift value a specific register, cl, must be used, so we need to give the compiler enough information to do that. Here goes:
static inline uint32 rotl32_i64_asm(uint32 value, unsigned shift)
{
// i64 - shift must be in register cl so create a register local assigned to cl
// no need to mask as i64 will do that
register uint8 cl asm ( "cl" ) = shift;
uint32 shifted;
// emit the rotate left long
// %n values are replaced by args:
// 0: "=r" (shifted) - any register (r), result(=), store in var (shifted)
// 1: "0" (value) - *same* register as %0 (0), load from var (value)
// 2: "r" (cl) - any register (r), load from var (cl - which is the cl register so this one is used)
__asm__ ("roll %2,%0" : "=r" (shifted) : "0" (value), "r" (cl));
return shifted;
}
Change test_rotl32 to call rotl32_i64_asm and check the assembly output again - it should be the same, i.e. the compiler did as well as we did.
Further note that if the commented out masking line in rotl32_i64 is included it essentially becomes rotl32 - the compiler will do the right thing for any architecture all for the cost of a single and instruction in the i64 version.
So asm is there is you need it, using it can be somewhat involved, and the compiler will invariably do as well or better by itself...
HTH
The 32bit rotate in ARM would be:
__asm__("MOV %0, %1, ROR %2\n" : "=r"(out) : "r"(in), "M"(N));
where N is required to be a compile-time constant.
But the output of the barrel shifter, whether used on a register or an immediate operand, is always a full-register-width; you can shift a constant 8-bit quantity to any position within a 32bit word, or - as here - shift/rotate the value in a 32bit register any which way.
But you cannot rotate 16bit or 8bit values within a register using a single ARM instruction. None such exists.
That's why the compiler, on ARM targets, when you use the "normal" (portable [Objective-]C/C++) code (in << xx) | (in >> (w - xx)) will create you one assembler instruction for a 32bit rotate, but at least two (a normal shift followed by a shifted or) for 8/16bit ones.

How to 'checksum' an array of noisy floating point numbers?

What is a quick and easy way to 'checksum' an array of floating point numbers, while allowing for a specified small amount of inaccuracy?
e.g. I have two algorithms which should (in theory, with infinite precision) output the same array. But they work differently, and so floating point errors will accumulate differently, though the array lengths should be exactly the same. I'd like a quick and easy way to test if the arrays seem to be the same. I could of course compare the numbers pairwise, and report the maximum error; but one algorithm is in C++ and the other is in Mathematica and I don't want the bother of writing out the numbers to a file or pasting them from one system to another. That's why I want a simple checksum.
I could simply add up all the numbers in the array. If the array length is N, and I can tolerate an error of 0.0001 in each number, then I would check if abs(sum1-sum2)<0.0001*N. But this simplistic 'checksum' is not robust, e.g. to an error of +10 in one entry and -10 in another. (And anyway, probability theory says that the error probably grows like sqrt(N), not like N.) Of course, any checksum is a low-dimensional summary of a chunk of data so it will miss some errors, if not most... but simple checksums are nonetheless useful for finding non-malicious bug-type errors.
Or I could create a two-dimensional checksum, [sum(x[n]), sum(abs(x[n]))]. But is the best I can do, i.e. is there a different function I might use that would be "more orthogonal" to the sum(x[n])? And if I used some arbitrary functions, e.g. [sum(f1(x[n])), sum(f2(x[n]))], then how should my 'raw error tolerance' translate into 'checksum error tolerance'?
I'm programming in C++, but I'm happy to see answers in any language.
i have a feeling that what you want may be possible via something like gray codes. if you could translate your values into gray codes and use some kind of checksum that was able to correct n bits you could detect whether or not the two arrays were the same except for n-1 bits of error, right? (each bit of error means a number is "off by one", where the mapping would be such that this was a variation in the least significant digit).
but the exact details are beyond me - particularly for floating point values.
i don't know if it helps, but what gray codes solve is the problem of pathological rounding. rounding sounds like it will solve the problem - a naive solution might round and then checksum. but simple rounding always has pathological cases - for example, if we use floor, then 0.9999999 and 1 are distinct. a gray code approach seems to address that, since neighbouring values are always single bit away, so a bit-based checksum will accurately reflect "distance".
[update:] more exactly, what you want is a checksum that gives an estimate of the hamming distance between your gray-encoded sequences (and the gray encoded part is easy if you just care about 0.0001 since you can multiple everything by 10000 and use integers).
and it seems like such checksums do exist: Any error-correcting code can be used for error detection. A code with minimum Hamming distance, d, can detect up to d − 1 errors in a code word. Using minimum-distance-based error-correcting codes for error detection can be suitable if a strict limit on the minimum number of errors to be detected is desired.
so, just in case it's not clear:
multiple by minimum error to get integers
convert to gray code equivalent
use an error detecting code with a minimum hamming distance larger than the error you can tolerate.
but i am still not sure that's right. you still get the pathological rounding in the conversion from float to integer. so it seems like you need a minimum hamming distance that is 1 + len(data) (worst case, with a rounding error on each value). is that feasible? probably not for large arrays.
maybe ask again with better tags/description now that a general direction is possible? or just add tags now? we need someone who does this for a living. [i added a couple of tags]
I've spent a while looking for a deterministic answer, and been unable to find one. If there is a good answer, it's likely to require heavy-duty mathematical skills (functional analysis).
I'm pretty sure there is no solution based on "discretize in some cunning way, then apply a discrete checksum", e.g. "discretize into strings of 0/1/?, where ? means wildcard". Any discretization will have the property that two floating-point numbers very close to each other can end up with different discrete codes, and then the discrete checksum won't tell us what we want to know.
However, a very simple randomized scheme should work fine. Generate a pseudorandom string S from the alphabet {+1,-1}, and compute csx=sum(X_i*S_i) and csy=sum(Y_i*S_i), where X and Y are my original arrays of floating point numbers. If we model the errors as independent Normal random variables with mean 0, then it's easy to compute the distribution of csx-csy. We could do this for several strings S, and then do a hypothesis test that the mean error is 0. The number of strings S needed for the test is fixed, it doesn't grow linearly in the size of the arrays, so it satisfies my need for a "low-dimensional summary". This method also gives an estimate of the standard deviation of the error, which may be handy.
Try this:
#include <complex>
#include <cmath>
#include <iostream>
// PARAMETERS
const size_t no_freqs = 3;
const double freqs[no_freqs] = {0.05, 0.16, 0.39}; // (for example)
int main() {
std::complex<double> spectral_amplitude[no_freqs];
for (size_t i = 0; i < no_freqs; ++i) spectral_amplitude[i] = 0.0;
size_t n_data = 0;
{
std::complex<double> datum;
while (std::cin >> datum) {
for (size_t i = 0; i < no_freqs; ++i) {
spectral_amplitude[i] += datum * std::exp(
std::complex<double>(0.0, 1.0) * freqs[i] * double(n_data)
);
}
++n_data;
}
}
std::cout << "Fuzzy checksum:\n";
for (size_t i = 0; i < no_freqs; ++i) {
std::cout << real(spectral_amplitude[i]) << "\n";
std::cout << imag(spectral_amplitude[i]) << "\n";
}
std::cout << "\n";
return 0;
}
It returns just a few, arbitrary points of a Fourier transform of the entire data set. These make a fuzzy checksum, so to speak.
How about computing a standard integer checksum on the data obtained by zeroing the least significant digits of the data, the ones that you don't care about?

How to do numerical integration with quantum harmonic oscillator wavefunction?

How to do numerical integration (what numerical method, and what tricks to use) for one-dimensional integration over infinite range, where one or more functions in the integrand are 1d quantum harmonic oscillator wave functions. Among others I want to calculate matrix elements of some function in the harmonic oscillator basis:
phin(x) = Nn Hn(x) exp(-x2/2)
where Hn(x) is Hermite polynomial
Vm,n = \int_{-infinity}^{infinity} phim(x) V(x) phin(x) dx
Also in the case where there are quantum harmonic wavefunctions with different widths.
The problem is that wavefunctions phin(x) have oscillatory behaviour, which is a problem for large n, and algorithm like adaptive Gauss-Kronrod quadrature from GSL (GNU Scientific Library) take long to calculate, and have large errors.
An incomplete answer, since I'm a little short on time at the moment; if others can't complete the picture, I can supply more details later.
Apply orthogonality of the wavefunctions whenever and wherever possible. This should significantly cut down the amount of computation.
Do analytically whatever you can. Lift constants, split integrals by parts, whatever. Isolate the region of interest; most wavefunctions are band-limited, and reducing the area of interest will do a lot to save work.
For the quadrature itself, you probably want to split the wavefunctions into three pieces and integrate each separately: the oscillatory bit in the center plus the exponentially-decaying tails on either side. If the wavefunction is odd, you get lucky and the tails will cancel each other, meaning you only have to worry about the center. For even wavefunctions, you only have to integrate one and double it (hooray for symmetry!). Otherwise, integrate the tails using a high order Gauss-Laguerre quadrature rule. You might have to calculate the rules yourself; I don't know if tables list good Gauss-Laguerre rules, as they're not used too often. You probably also want to check the error behavior as the number of nodes in the rule goes up; it's been a long time since I used Gauss-Laguerre rules and I don't remember if they exhibit Runge's phenomenon. Integrate the center part using whatever method you like; Gauss-Kronrod is a solid choice, of course, but there's also Fejer quadrature (which sometimes scales better to high numbers of nodes, which might work nicer on an oscillatory integrand) and even the trapezoidal rule (which exhibits stunning accuracy with certain oscillatory functions). Pick one and try it out; if results are poor, give another method a shot.
Hardest question ever on SO? Hardly :)
I'd recommend a few other things:
Try transforming the function onto a finite domain to make the integration more manageable.
Use symmetry where possible - break it up into the sum of two integrals from negative infinity to zero and zero to infinity and see if the function is symmetry or anti-symmetric. It could make your computing easier.
Look into Gauss-Laguerre quadrature and see if it can help you.
The WKB approximation?
I am not going to explain or qualify any of this right now. This code is written as is and probably incorrect. I am not even sure if it is the code I was looking for, I just remember that years ago I did this problem and upon searching my archives I found this. You will need to plot the output yourself, some instruction is provided. I will say that the integration over infinite range is a problem that I addressed and upon execution of the code it states the round off error at 'infinity' (which numerically just means large).
// compile g++ base.cc -lm
#include <iostream>
#include <cstdlib>
#include <fstream>
#include <math.h>
using namespace std;
int main ()
{
double xmax,dfx,dx,x,hbar,k,dE,E,E_0,m,psi_0,psi_1,psi_2;
double w,num;
int n,temp,parity,order;
double last;
double propogator(double E,int parity);
double eigen(double E,int parity);
double f(double x, double psi, double dpsi);
double g(double x, double psi, double dpsi);
double rk4(double x, double psi, double dpsi, double E);
ofstream datas ("test.dat");
E_0= 1.602189*pow(10.0,-19.0);// ev joules conversion
dE=E_0*.001;
//w^2=k/m v=1/2 k x^2 V=??? = E_0/xmax x^2 k-->
//w=sqrt( (2*E_0)/(m*xmax) );
//E=(0+.5)*hbar*w;
cout << "Enter what energy level your looking for, as an (0,1,2...) INTEGER: ";
cin >> order;
E=0;
for (n=0; n<=order; n++)
{
parity=0;
//if its even parity is 1 (true)
temp=n;
if ( (n%2)==0 ) {parity=1; }
cout << "Energy " << n << " has these parameters: ";
E=eigen(E,parity);
if (n==order)
{
propogator(E,parity);
cout <<" The postive values of the wave function were written to sho.dat \n";
cout <<" In order to plot the data should be reflected about the y-axis \n";
cout <<" evenly for even energy levels and oddly for odd energy levels\n";
}
E=E+dE;
}
}
double propogator(double E,int parity)
{
ofstream datas ("sho.dat") ;
double hbar =1.054*pow(10.0,-34.0);
double m =9.109534*pow(10.0,-31.0);
double E_0= 1.602189*pow(10.0,-19.0);
double dx =pow(10.0,-10);
double xmax= 100*pow(10.0,-10.0)+dx;
double dE=E_0*.001;
double last=1;
double x=dx;
double psi_2=0.0;
double psi_0=0.0;
double psi_1=1.0;
// cout <<parity << " parity passsed \n";
psi_0=0.0;
psi_1=1.0;
if (parity==1)
{
psi_0=1.0;
psi_1=m*(1.0/(hbar*hbar))* dx*dx*(0-E)+1 ;
}
do
{
datas << x << "\t" << psi_0 << "\n";
psi_2=(2.0*m*(dx/hbar)*(dx/hbar)*(E_0*(x/xmax)*(x/xmax)-E)+2.0)*psi_1-psi_0;
//cout << psi_1 << "=psi_1\n";
psi_0=psi_1;
psi_1=psi_2;
x=x+dx;
} while ( x<= xmax);
//I return 666 as a dummy value sometimes to check the function has run
return 666;
}
double eigen(double E,int parity)
{
double hbar =1.054*pow(10.0,-34.0);
double m =9.109534*pow(10.0,-31.0);
double E_0= 1.602189*pow(10.0,-19.0);
double dx =pow(10.0,-10);
double xmax= 100*pow(10.0,-10.0)+dx;
double dE=E_0*.001;
double last=1;
double x=dx;
double psi_2=0.0;
double psi_0=0.0;
double psi_1=1.0;
do
{
psi_0=0.0;
psi_1=1.0;
if (parity==1)
{double psi_0=1.0; double psi_1=m*(1.0/(hbar*hbar))* dx*dx*(0-E)+1 ;}
x=dx;
do
{
psi_2=(2.0*m*(dx/hbar)*(dx/hbar)*(E_0*(x/xmax)*(x/xmax)-E)+2.0)*psi_1-psi_0;
psi_0=psi_1;
psi_1=psi_2;
x=x+dx;
} while ( x<= xmax);
if ( sqrt(psi_2*psi_2)<=1.0*pow(10.0,-3.0))
{
cout << E << " is an eigen energy and " << psi_2 << " is psi of 'infinity' \n";
return E;
}
else
{
if ( (last >0.0 && psi_2<0.0) ||( psi_2>0.0 && last<0.0) )
{
E=E-dE;
dE=dE/10.0;
}
}
last=psi_2;
E=E+dE;
} while (E<=E_0);
}
If this code seems correct, wrong, interesting or you do have specific questions ask and I will answer them.
I am a student majoring in physics, and I also encountered the problem. These days I keep thinking about this question and get my own answer. I think it may help you solve this question.
1.In gsl, there are functions can help you integrate the oscillatory function--qawo & qawf. Maybe you can set a value, a. And the integration can be separated into tow parts, [0,a] and [a,pos_infinity]. In the first interval, you can use any gsl integration function you want, and in the second interval, you can use qawo or qawf.
2.Or you can integrate the function to a upper limit, b, that is integrated in [0,b]. So the integration can be calculated using a gauss legendry method, and this is provided in gsl. Although there maybe some difference between the real value and the computed value, but if you set b properly, the difference can be neglected. As long as the difference is less than the accuracy you want. And this method using the gsl function is only called once and can use many times, because the return value is point and its corresponding weight, and integration is only the sum of f(xi)*wi, for more details you can search gauss legendre quadrature on wikipedia. Multiple and addition operation is much faster than integration.
3.There is also a function which can calculate the infinity area integration--qagi, you can search it in the gsl-user's guide. But this is called everytime you need to calculate the integration, and this may cause some time consuming, but I'm not sure how long will it use in you program.
I suggest NO.2 choice I offered.
If you are going to work with Harmonic oscillator functions less than n = 100 you might want to try:
http://www.mymathlib.com/quadrature/gauss_hermite.html
The program computes an integral via gauss-hermite quadrature with 100 zeroes and weights (the zeroes of H_100). Once you go over Hermite_100 the integrals are not as accurate.
Using this integration method I wrote a program calculating exactly what you want to calculate and it works fairly well. Also, there might be a way to go beyond n=100 by using the asymptotic form of the Hermite-polynomial zeroes but I haven't looked into it.