SIGSEGV in optimized ifort

If I compile with -O0 in ifort, the program runs correctly. But as soon as I enable optimization, e.g. -O, -O3, or -fast, it crashes with a SIGSEGV segmentation fault.
The error occurs in a subroutine named maketable(). These are the symptoms:
(1) I call the fftw library in this subroutine. If I comment out the fftw statements, everything is fine. But I don't think fftw is at fault, because I also use it in other places in this code, and those work.
(2) fftw is called in a loop, and the loop runs several times before the program crashes. The segfault does not happen on the first pass through the loop.
(3) I considered a stack overflow, but I no longer think that is the cause. I have an executable that someone else compiled a long time ago, and it runs on my computer. I think that suggests the problem is not a system stack overflow.
The ifort version is 10.0 and the fftw version is fftw-2.1.5. The CPU is an Intel Xeon 5130. Thanks a lot.

There are two common causes of segmentation faults in Fortran programs:
(1) Attempting to access an element outside the bounds of an array.
(2) Mismatching actual and dummy arguments in a procedure call.
Both are relatively easy to find:
(1) Your compiler will have an option to generate code which performs array bounds checking at run time. Check your compiler documentation, rebuild your code and rerun it. If this is the cause of the problem you will get an error message identifying where your code goes awry.
(2) Program explicit interfaces for any subroutines and functions in your program, or use modules so that the compiler generates such interfaces for you, or use a compiler option (see the documentation) to check that argument types match at compile-time.
It's not unusual that such errors (seem to) arise only when optimisation is turned up high.
EDIT
Note that I'm not suggesting that optimisation causes the error you observe, but that it causes the error to affect the execution of your program and become evident.
It's not unknown for incorrect programs to run many times apparently without fault only for, say, recompilation with a new compiler version to create an executable which crashes every time.
Your wish to switch off optimisation only for the subroutine where the segmentation fault seems to arise is, I suggest, completely wrong-headed. I expect my programs to execute correctly at any level of optimisation (save for clear evidence of a compiler bug, such things are not unknown). I think that by turning off optimisation you are sweeping a real problem with your program under the carpet, as it were.
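As a language-neutral illustration of what run-time bounds checking buys you, here is a minimal C++ sketch (not Fortran, and not the compiler option itself). The helper name `checked_read` is hypothetical. It contrasts with unchecked indexing, which may silently corrupt memory and only crash later, possibly only in an optimised build.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: reads table[index] with run-time bounds checking
// and reports a bad index instead of touching memory outside the array.
std::string checked_read(const std::vector<double>& table, std::size_t index) {
    try {
        // .at() validates the index and throws std::out_of_range on failure;
        // operator[] would perform no check (undefined behaviour when out of range).
        return "value " + std::to_string(table.at(index));
    } catch (const std::out_of_range&) {
        return "index " + std::to_string(index) + " is out of bounds";
    }
}
```

With a checked access the failure is reported at the faulty line, which is the same benefit the compiler's bounds-checking option gives a Fortran program.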

Related

error detection in static analysis and symbolic execution

What kinds of errors can static analysis (e.g. a compiler) detect that symbolic execution cannot? And what kinds of errors can symbolic execution detect that static analysis cannot? For example, can symbolic execution detect syntax errors?

In short, static analysis is capable of spotting coding issues, such as bad practices. For example, if you (unnecessarily) declare a class field as public, a static analysis tool may warn you that the field should be declared private. However, the "cleanest" code is not necessarily bug-free. Even when no bad practice can be found in the code, incorrect reasoning on the coder's part may later lead to a crash at run time.
For example, if we write clean code to implement a calculator, a static analysis tool may not output any warning. But if we forget to validate the input to prevent the user from attempting a division by zero, our calculator will eventually crash at run time.
Symbolic (or concolic) execution, on the other hand, executes the target program, so it has the potential to reach any possible run-time state of the program, including a run-time error caused by a bug. In the calculator example above, symbolic execution would find the run-time failure and would also tell us which inputs trigger it. To answer your last question: symbolic execution is not meant to inspect the quality of the code.
Ideally, we should use both before releasing the software.
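The calculator example can be sketched as follows. This is a minimal C++ illustration, and the function name `safe_divide` is hypothetical. It shows the input check whose absence the answer describes, plus a second failing input (`INT_MIN / -1`, which overflows) of the kind that exploring concrete inputs tends to reveal.

```cpp
#include <limits>

// Hypothetical calculator routine: integer division with input validation.
// Returns false (leaving `result` untouched) for inputs that would make the
// division undefined, instead of crashing at run time.
bool safe_divide(int numerator, int denominator, int& result) {
    if (denominator == 0) {
        return false;  // division by zero
    }
    if (numerator == std::numeric_limits<int>::min() && denominator == -1) {
        return false;  // quotient does not fit in an int
    }
    result = numerator / denominator;
    return true;
}
```

Without the two checks the code would still look clean, yet specific inputs would make it fail at run time.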

GNU compile-time stack checking

Our application crashed immediately upon entering main with the following:
0x0000000000492148 in main (argc=Cannot access memory at address 0x7fffff7689fc)
After using objdump we realized there were several very large objects that were created on the stack. All this is good.
Now, we are trying to instruct g++ to inform us of such huge stack allocations at the preamble, but using -fstack-check doesn't do it in this case, possibly because the problem is in main.
I read about STACK_CHECK_BUILTIN, but is that a flag that should be provided to the compilation of g++, and not my application? The documentation is vast, but not concise.

Why isn't all the java bytecode initially interpreted to machine code?

I read about just-in-time (JIT) compilation, and as I understand it there are two approaches, an interpreter and a JIT compiler, both of which process the bytecode at run time.
Why not translate all the bytecode to machine code up front, and only then start running the process, with no further need for an interpreter?

Another reason for late JIT compiling has to do with optimization: At run-time the VM can detect more/other patterns it may optimize than the compiler could ever do at compile-time. JIT pre-compiling at startup will always have to be static, and the same could have been done by the compiler already, but through analysis of the actual run-time behaviour the VM may have more information on possible optimizations and may therefore produce better optimization results.
For example, the VM can detect that a single piece of code is actually run a million times at run-time and perform appropriate optimizations which the compiler may have no information about, not unlike the branch prediction that's done at runtime in modern CPUs.
More information can be found in the Wikipedia article on "Adaptive optimization".

Simple: because it takes time to precompile everything to machine code, and users don't want to wait for the application to start. Remember, the precompilation would have to perform a lot of optimizations, which takes time.
The server version of the JVM is more aggressive about compiling and optimizing code up front, because server-side code tends to be executed more often and for a longer period of time before the process is shut down.
However, a solution (for .NET) is a tool called NGen, which performs the precompilation up front so that it isn't needed after that point. You only have to run it once.
Not all VMs include an interpreter. For instance, Chrome and the CLR (.NET) always compile to machine code before running. However, they have multiple levels of optimization to reduce startup time.

I found a link showing how run-time recompilation can improve performance and save CPU cycles. The optimizations it lists are:
Inline expansion: to decrease the cost of procedure calls.
Removing redundant loads: when combining two pieces of compiled code produces duplicated code, the duplicate can be removed and the result further optimised by recompilation at run time.
Copy propagation
Eliminating dead code
Here is another link for the same explanation given above.

gfortran optimization causes fortran do-variable loop error during runtime

I have written a Fortran routine that uses some legacy Fortran 77 code for finite elements. However, with a particular mesh, when the -O optimization flag is turned on, an important do-loop iterator is somehow being modified, even though Fortran supposedly prohibits this. I have compiled this code using gfortran 4.5 with the -fcheck=do run-time checking enabled, and it has verified what I've noted above. A run-time error occurs only when optimizations are turned on, and it points directly to the do-iterator.
Using gdb on the optimized code (although it behaves erratically, with lines bouncing back and forth) seems to clearly indicate that the do-iterator somehow gets set back to zero, which essentially causes a nice infinite loop.
Any suggestions as to how to hunt down and fix whatever is causing this bug would be greatly appreciated, as I'd like to make sure the whole project can be consistently compiled with the same flags.

You say that you use -fcheck=do; why not go all the way and use -fcheck=all? What you're seeing sounds like a typical case of memory corruption due to an array bounds violation, which -fcheck=all can in some cases catch. Where array bounds checking doesn't work that well is with implicit interfaces and incorrect bounds being passed; a solution here is to put your procedures into modules, allowing the compiler to check interfaces.
And, like Jonathan Dursi said, consider using a tool like valgrind.

Any Macro or Technique for Partial Optimization?

I am working on a lock-free data structure with the g++ compiler. It seems that with the -O1 switch, g++ changes the execution order of my code. How can I forbid g++'s optimization on a certain part of my code while keeping optimization for the rest? I know I can split it into two files and link them, but that looks ugly.

If you find that gcc changes the order of execution in your code, you should consider using a memory barrier. Just don't assume that volatile variables will protect you from that issue. They only make sure that, within a single thread, the behavior is what the language guarantees, and that variables are always read from their memory location to account for changes "invisible" to the executing code (e.g. changes to a variable made by a signal handler).
GCC has supported OpenMP since version 4.2. You can use it to create a memory barrier with a special #pragma directive.
A very good discussion of the pitfalls of lock-free code is this PDF by Scott Meyers and Andrei Alexandrescu: C++ and the Perils of Double-Checked Locking
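As a minimal sketch of the memory-barrier idea, the following uses the C++11 `<atomic>` fences rather than the OpenMP pragma the answer mentions. That substitution is an assumption about the toolchain (a C++11 or later compiler), and all names (`payload`, `ready`, `producer`, `consumer`, `run_handoff`) are hypothetical.

```cpp
#include <atomic>
#include <thread>

int payload = 0;                  // plain, non-atomic data being handed over
std::atomic<bool> ready(false);   // flag that publishes the data

void producer() {
    payload = 42;                                          // 1. write the data
    std::atomic_thread_fence(std::memory_order_release);   // 2. barrier: step 1 cannot move below it
    ready.store(true, std::memory_order_relaxed);          // 3. publish the flag
}

int consumer() {
    while (!ready.load(std::memory_order_relaxed)) {       // spin until the flag is set
        std::this_thread::yield();
    }
    std::atomic_thread_fence(std::memory_order_acquire);   // barrier: the read below cannot move above it
    return payload;                                        // sees the value written in step 1
}

// Runs one producer and one consumer thread and returns what the consumer saw.
int run_handoff() {
    int seen = 0;
    std::thread reader([&seen] { seen = consumer(); });
    std::thread writer(producer);
    writer.join();
    reader.join();
    return seen;
}
```

The fences constrain both the compiler and the processor, which is why they address the reordering problem where `volatile` does not.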

You can use the function attribute "__attribute__((optimize("O0")))" to set the optimization level for a single function, or "#pragma GCC optimize" for a block of code. These are only available from GCC 4.4 onward, though, I think; check your GCC manual. If they aren't supported, separating the source is your only option.
I would also say, though, that if your code fails with optimization turned on, it is most likely that your code is just wrong, especially as you're trying to do something that is fundamentally very difficult. The processor will potentially perform reordering on your code (within the limits of sequential consistency) so any re-ordering that you're getting with GCC could potentially occur anyway.
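The two GCC mechanisms mentioned above can be sketched like this. It is a minimal illustration assuming GCC 4.4 or later; the macro name `NO_OPTIMIZE` and the function names are hypothetical, and the guards only keep the file compiling on other compilers.

```cpp
// GCC-specific features, guarded so that other compilers ignore them.
#if defined(__GNUC__) && !defined(__clang__)
#define NO_OPTIMIZE __attribute__((optimize("O0")))  // hypothetical macro name
#else
#define NO_OPTIMIZE
#endif

// Form 1: a single function compiled at -O0 regardless of command-line flags.
NO_OPTIMIZE int sum_unoptimized(const int* values, int count) {
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}

// Form 2: every function defined between push_options and pop_options
// is compiled at -O0.
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC push_options
#pragma GCC optimize ("O0")
#endif

int sum_in_block(const int* values, int count) {
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}

#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC pop_options
#endif
```

Note that lowering the optimization level only restricts the compiler; it does not stop the processor from reordering memory operations, so for lock-free code this is not a substitute for proper barriers.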
