Is there a length limit on g++ variable names?

See title.

Short Answer:
No
Long Answer:
Yes, it has to be small enough to fit in memory, but otherwise no, not really. If there is a built-in limit (I don't believe there is), it is so huge you'd be really hard-pressed to reach it.
Actually, you got me really curious, so I created the following Python program to generate code:
#! /usr/bin/env python2.6
import sys

cppcode = """
#include <iostream>
#include <cstdlib>

int main(int argc, char* argv[])
{
    int %s = 0;
    return 0;
}
"""

def longvarname(n):
    s = "x"
    for i in xrange(n):
        s = s + "0"
    return s

def printcpp(n):
    print cppcode % longvarname(n)

if __name__ == "__main__":
    if len(sys.argv) == 2:
        printcpp(int(sys.argv[1]))
This generates C++ code containing a variable name of the desired length. Using the following:
./gencpp.py 1048576 > main.cpp
g++ main.cpp -o main
The above gives me no problems (the variable name is roughly 1MB in length). I tried for a gigabyte, but I'm not being so smart with the string construction, and so I decided to abort when gencpp.py took too long.
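If you're curious, the slow part is the quadratic string concatenation in longvarname; a quick tweak (same output, just built in one shot) would be:

def longvarname(n):
    # String repetition avoids appending one character at a time.
    return "x" + "0" * n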
Anyway, I very much doubt that gcc pre-allocates 1MB for variable names. It is purely bounded by memory.

An additional gotcha: some linkers have a limit on the length of the mangled name. This tends to be an issue with templates and nested classes more than raw identifier length, but either could trigger a problem, AFAIK.

I don't know what the limit is (or if there is one), but I think it is good practice that there should be one, in order to catch pathological code, for example code created by a runaway code generator. For what it's worth, the C++ Standard suggests a minimum of 1K significant characters for identifiers.

How can I tell cppcheck to ignore inline assembly?

We have a file with inline assembly for a DSP. Cppcheck thinks there are a load of "variable assigned but not used" lines in the assembly.
Is there any way to tell it to skip checking the inline assembly sections? I couldn't see anything obvious in the manual, and it is a bit tedious to have to suppress each line in turn.
Here's an example of some of the offending lines. It's a context-save routine.
inline assembly void save_ctx()
{
    asm_begin
    .undef global data saved_ctx;
    .undef global data p_ctx;
    asm_text
    ...
    st XM[p0++], r0;
    st XM[p0++], r1;
    st XM[p0++], r2;
    st XM[p0++], r3;
    st XM[p0++], r4;
    st XM[p0++], r5;
    st XM[p0++], r6;
    ...
I can turn off the messages with
// cppcheck-suppress unreadVariable
before each line, but it would be better to just tell cppcheck to skip the whole inline assembly section.
Is there any way I can do this, or will we just have to accept lots of repeated comments?
Somewhat counter-intuitive, but thanks to @DavidWohlferd for pointing me in the right direction.
-D__CPPCHECK__ doesn't do the right thing. It tells cppcheck to only check blocks with __CPPCHECK__ or nothing defined, i.e. it completely turns off the combinatorial checking. However, there is a simple but counter-intuitive solution using -U.
Wrap the block with
#define EXCLUDE_CPPCHECK
#ifdef EXCLUDE_CPPCHECK
...
#endif // EXCLUDE_CPPCHECK
Now, if you call cppcheck with -UEXCLUDE_CPPCHECK, it will skip that block (even though the #define is right before it!) but still check all the other combinations of #defines that are used in #if blocks.
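For concreteness, the invocation would then look something like this (the file name here is just a placeholder):

cppcheck --enable=all -UEXCLUDE_CPPCHECK dsp_context_save.cpp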
Thank you David and Drew.
According to the man page (I didn't try it myself), you can add command-line options:
--suppress=<spec>
Suppress a specific warning. The format of <spec> is: [error id]:[filename]:[line]. The [filename] and [line] are optional. [error id] may be * to suppress all warnings (for a specified file or files). [filename] may contain the wildcard characters * or ?.
--suppressions-list=<file>
Suppress warnings listed in the file. Each suppression is in the format of <spec> above.
I.e., in your case, --suppress=unreadVariable:all_dsp_asm_*.cpp switches it off completely for those particular files. That is usable, IMO, since you can put all the DSP inline asm into a separate file, so it will not affect your ordinary C++ checking.
Or, in the worst case, use the suppressions-list file, where you can list particular lines (ad absurdum, I guess) to cover the whole inline sections.
I don't see a way to suppress this inline in the source; that looks like it affects only a single line.
Checking a probably more up-to-date version of the manual here, you can also exclude a whole file with -i<filename> (second page).
The options above are on page 11.
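To make that concrete, a sketch of the three variants described above (file names are placeholders):

# Suppress unreadVariable only for the DSP asm file:
cppcheck --suppress=unreadVariable:dsp_context_save.cpp src/

# Or keep the suppressions in a file, one <spec> per line
# (e.g. a line reading "unreadVariable:dsp_context_save.cpp"):
cppcheck --suppressions-list=suppressions.txt src/

# Or exclude the file from the check entirely:
cppcheck -i dsp_context_save.cpp src/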

How can I use cmake to test processes that are expected to fail with an exception? (e.g., failures due to clang's address sanitizer)

I've got some tests that test that clang's address sanitizer catch particular errors. (I want to ensure my understanding of the types of error it can catch is correct, and that future versions continue to catch the type of errors I'm expecting them to.) This means I have several tests that fail by crapping out with an OTHER_FAULT, which appears to be the fixed way that clang's runtime reports an error.
I've set the WILL_FAIL flag to TRUE for these tests, but this only seems to check the return value from a successful, exception-free failure. If the process terminates with an exception, cmake still classes it as a failure.
I've also tried using PASS_REGULAR_EXPRESSION to watch for the distinguishing messages that are printed out when this error occurs, but again, cmake seems to class the test as a failure if it terminates with an exception.
Is there anything I can do to get around this?
(clang-specific answers are also an option! - but I doubt this will be the last time I need to test something like this, so I'd prefer to know how to do it with cmake generally, if it's possible)
CTest provides only basic, commonly used interpretations of a test program's result. To implement other interpretations, you can write a simple program or script that wraps the test and interprets its result as needed, e.g. this C program (for Linux):
test_that_crash.c:
#include <unistd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(int argc, char** argv)
{
    pid_t pid = fork();
    if (pid == -1)
    {
        // fork failed
        return 1;
    }
    else if (pid)
    {
        // Parent - wait for the child and interpret its result
        int status = 0;
        wait(&status);
        if (WIFSIGNALED(status)) return 0; // Signal-terminated means success
        else return 1;
    }
    else
    {
        // Child - execute the wrapped command
        execvp(argv[1], argv + 1);
        exit(1);
    }
}
This program can be used in CMake as follows:
CMakeLists.txt:
# Compile our wrapper
add_executable(test_that_crash test_that_crash.c)

# Similar to add_test(name command), but the test is considered successful
# only if it crashed (was terminated by a signal).
macro(add_test_crashed name command)
    # Use the generic add_test(NAME ... COMMAND ...) flow so our wrapper's
    # executable target is recognized automatically.
    add_test(NAME ${name} COMMAND test_that_crash ${command} ${ARGN})
endmacro(add_test_crashed)

# ...

# Add a test which is expected to crash
add_test_crashed(clang.crash.1 <clang-executable> <clang-args>)
There is also a clang-specific solution: configure its manner of exit using the ASAN_OPTIONS environment variable (see https://github.com/google/sanitizers/wiki/AddressSanitizerFlags). Set ASAN_OPTIONS to abort_on_error=0; when the address sanitizer detects a problem, the process will then do _exit(1) rather than (presumably) abort(), and will thus appear to have terminated cleanly. You can then pick this up using cmake's WILL_FAIL mechanism. (It's still not clear why OS X and Linux differ in this respect - but there you go.)
As a bonus, the test fails much more quickly.
(Another handy option that can improve turnaround time when running through cmake is to set ASAN_SYMBOLIZER_PATH to an empty value, which stops the address sanitizer symbolizing the stack traces. Symbolizing takes a moment, but there's no point doing it when running through cmake, since you can't see the output.)
Rather than do this by hand, I made a Python script that sets the environment appropriately on OS X (doing nothing on Linux), and invokes the test. I then add each asan test using a macro, along the lines of Tsyvarev's answer.
macro(add_asan_test basename)
    add_executable(${basename} ${basename}.c)
    add_test(NAME test/${basename} COMMAND ${CMAKE_CURRENT_SOURCE_DIR}/wrap_clang_sanitizer_test.py -a $<TARGET_FILE:${basename}>)
    set_tests_properties(test/${basename} PROPERTIES WILL_FAIL TRUE)
endmacro()
This gives a simple pass/fail as quickly as possible. I'm in the habit of investigating failures by running the test in question from the shell by hand and examining the output, in which case I get the stack trace as normal (and the fact that exiting via abort is a bit slow is less of a problem).
(There are similar options for the other sanitizers, but I haven't investigated them.)
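For cases where the OS-specific logic isn't needed, a wrapper script may not be required at all: CTest's standard ENVIRONMENT test property can set the sanitizer variables per test. A minimal sketch, assuming a hypothetical use_after_free.c test case that is already built with -fsanitize=address:

add_executable(use_after_free use_after_free.c)
add_test(NAME test/use_after_free COMMAND use_after_free)
set_tests_properties(test/use_after_free PROPERTIES
    WILL_FAIL TRUE
    # Make ASan exit cleanly (picked up by WILL_FAIL) and skip symbolization.
    ENVIRONMENT "ASAN_OPTIONS=abort_on_error=0;ASAN_SYMBOLIZER_PATH=")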

Determining symbol addresses using binutils/readelf

I am working on a project where our verification test scripts need to locate symbol addresses within the build of software being tested. This might be used for setting breakpoints or reading static data from memory. What I am after is to create a map file containing symbol names, base address in memory, and size. Our build outputs an ELF file which has the information I want. I've been trying to use the readelf, nm, and objdump tools to try and to gain the symbol addresses I need.
I originally tried readelf -s file.elf and that seemed to access some symbols, particularly those which were written in assembler. However, many of the symbols that I wanted were not in there - specifically those that originated within our Ada code.
I used readelf --debug-dump file.elf to dump all debug information. From that I do see all symbols, including those that were in the Ada code. However, the format seems to be in the DWARF format. Does anyone know why these symbols would not be output by readelf when I ask it to list the symbolic information? Perhaps there is simply an option I am missing.
Now I could go to the trouble of writing a custom DWARF parser to get the information, but if I can get it using one of the binutils (nm, readelf, objdump) then I'd really prefer a standard solution.
DWARF is the debug information and tries to reflect the structure of the original source code. Take the following code as an example:
static int one() {
    // something
    return 1;
}

int main(int ac, char **av) {
    return one();
}
After you compile it with gcc -O3 -g, the static function one will be inlined into main. So when you use readelf -s, you will never see the symbol one. However, when you use readelf --debug-dump, you can see that one is a function which has been inlined.
So, in this example, the compiler does not stop you from combining optimization with -g, and you can still debug the executable. Even though the function has been optimized away and inlined, gdb can still use the DWARF information to identify the function and the source/line of the current code block inside the inlined function.
The above is just one case of compiler optimization. There are plenty of other reasons that could lead to a mismatch between the symbols reported by readelf -s and what the DWARF information describes.
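As a rough illustration of that difference (exact output will vary by compiler and binutils version, and example.c is just a placeholder name), you could try something along these lines:

gcc -O3 -g example.c -o example

# The ELF symbol table has no entry for the inlined function:
readelf -s example | grep one

# ...but the DWARF info still describes it:
readelf --debug-dump=info example | grep -B 2 -A 5 one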

clang optimization bug?

I've been trying to track down what seems like a bug in clang, and I think I've got a reasonably minimal reproduction of it. Here's my program:
#include <stdio.h>
#include <string.h>
#include <ctype.h>

#define x_Is_Digit(x) isdigit((unsigned char) (x))

void Odd_Behavior(char * ptr, char * version);

void Odd_Behavior(char * version)
{
    char * ptr, *tmp;
    for (ptr = version; x_Is_Digit(*ptr); ptr++);
    ptr++;
    for (tmp = ptr; x_Is_Digit(*ptr); ptr++);
    if (ptr == tmp)
        printf("%08x == %08x! Really?\n", ptr, tmp);
}

int main()
{
    char buffer[100];
    strcpy(buffer, "3.8a");
    Odd_Behavior(buffer);
    return(0);
}
When I compile it with optimization, in the clang included with the Xcode download ("Apple clang 2.1"):
clang++ -Os optimizebug.cpp
And run it, it reports:
6b6f2be3 == 6b6f2be2! Really?
This strikes me as a tad odd, to say the least. If I remove the (unsigned char) cast in x_Is_Digit, it works properly.
Have I run into a bug in clang? Or am I doing something here that's causing some sort of undefined behavior? If I compile it with -O0, I don't get the problem.
Certainly looks like a bug to me. Clang mainline doesn't exhibit this (at least on darwin/x86-64). Please file a bug at llvm.org/bugs with full details on how to reproduce it. Stack Overflow isn't a great place to report compiler bugs :)
Definitely a bug. If the two pointers are equal at the if statement, they must also be equal in the printf statement.

MinGW and "declaration does not declare anything"

I'm working on converting a Linux project of mine to compile on Windows using MinGW. It compiles and runs just fine on Linux, but when I attempt to compile it with MinGW it bombs out with the following error message:
camera.h:11: error: declaration does not declare anything
camera.h:12: error: declaration does not declare anything
I'm kind of baffled why this is happening, because:
1. I'm using the same version of g++ (4.4) on both Linux and Windows (via MinGW).
2. The contents of camera.h are absurdly simple.
Here's the code. It's choking on lines 11 and 12 where float near; and float far; are defined.
#include "Vector.h"
#ifndef _CAMERA_H_
#define _CAMERA_H_
class Camera{
public:
Vector eye;
Vector lookAt;
float fov;
float near;
float far;
};
#endif
Thanks for your help.
EDIT: Thanks to both Dirk and mingos, that was exactly the problem!
Edit: If you happen to include windef.h (either directly or indirectly), you will find
#define FAR
#define far
#define NEAR
#define near
there. I think that this is the culprit.
Try
#undef near
#undef far
before your class definition.
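A minimal sketch of what that workaround might look like inside camera.h (assuming windef.h has already been pulled in by an earlier include; see the caveat in the last answer below):

#include "Vector.h"

#ifndef _CAMERA_H_
#define _CAMERA_H_

// windef.h defines near and far as empty macros, turning "float near;"
// into "float ;" - hence "declaration does not declare anything".
#ifdef near
#undef near
#endif
#ifdef far
#undef far
#endif

class Camera {
public:
    Vector eye;
    Vector lookAt;
    float fov;
    float near;
    float far;
};

#endif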
Try giving them different names, like
float my_near;
float my_far;
I recall Borland using "near" and "far" as keywords (my 1992 Turbo C had these, back in the MS-DOS era). Dunno if this is the case with gcc, but you can always try that.
In <windef.h>, you'll find the following lines:
#define NEAR
#define near
Simple answer: you can't safely #undef them, because they're part of the Windows headers. _WINDEF_H will still be defined even if you #undef those macros, so the header won't be re-included if you try to #include <windef.h> again; and if you #undef _WINDEF_H and re-include <windef.h> after your class definition, you'll end up with duplicate definitions for things like RECT, LONG, PROC and more. So the only other solution is to change your variable names.