Ignore functions in Valgrind memcheck - valgrind

Valgrinding a program that uses openldap2's libldap is a chore because of OpenSSL's use of uninitialized memory. There exists a --ignore-fn option, but only for the massif subcomponent of Valgrind. Is there anything similar for memcheck to exclude traces in which certain functions appear?
==13795== Use of uninitialised value of size 8
==13795== at 0x6A9C8CF: ??? (in /lib64/libz.so.1.2.3)
==13795== by 0x6A9A63B: inflate (in /lib64/libz.so.1.2.3)
==13795== by 0x68035C1: ??? (in /lib64/libcrypto.so.1.0.0)
==13795== by 0x6802B9F: COMP_expand_block (in /lib64/libcrypto.so.1.0.0)
==13795== by 0x64ABBCD: ssl3_do_uncompress (in /lib64/libssl.so.1.0.0)
==13795== by 0x64ACA6F: ssl3_read_bytes (in /lib64/libssl.so.1.0.0)
==13795== by 0x64A9F2F: ??? (in /lib64/libssl.so.1.0.0)
==13795== by 0x56B3E61: ??? (in /usr/lib64/libldap-2.4.so.2.5.4)
==13795== by 0x5E4DB1B: ??? (in /usr/lib64/liblber-2.4.so.2.5.4)
==13795== by 0x5E4E96E: ber_int_sb_read (in /usr/lib64/liblber-2.4.so.2.5.4)
==13795== by 0x5E4B4A6: ber_get_next (in /usr/lib64/liblber-2.4.so.2.5.4)
==13795== by 0x568FB9E: ??? (in /usr/lib64/libldap-2.4.so.2.5.4)

You can create a suppression file and use it to suppress errors coming from certain sources: http://valgrind.org/docs/manual/manual-core.html#manual-core.suppress

Related

How to generate valgrind suppressions without manual cut and paste?

I want to generate a suppressions file with --gen-suppressions in valgrind.
However, I do not want to have to go through thousands of lines of output to cut and paste out the suppressions and remove the valgrind stack traces / other valgrind output, and resolve .
Is there a way to do this easily? This seems like a very basic use case...
// I want this part vvvvv
{
<insert_a_suppression_name_here>
Memcheck:Leak
match-leak-kinds: reachable
fun:malloc
fun:strdup
fun:_XlcCreateLC
fun:_XlcDefaultLoader
fun:_XOpenLC
fun:_XrmInitParseInfo
obj:/usr/lib/x86_64-linux-gnu/libX11.so.6.3.0
fun:XrmGetStringDatabase
obj:/usr/lib/x86_64-linux-gnu/libX11.so.6.3.0
fun:XGetDefault
fun:GetXftDPI
fun:X11_InitModes_XRandR
fun:X11_InitModes
fun:X11_VideoInit
}
// I do not want this part vvvv
==187526== 2 bytes in 1 blocks are still reachable in loss record 2 of 137
==187526== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==187526== by 0x4B7C50E: strdup (strdup.c:42)
==187526== by 0x5922D81: _XlcResolveLocaleName (in /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0)
==187526== by 0x5926387: ??? (in /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0)
==187526== by 0x5925956: ??? (in /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0)
==187526== by 0x592615C: _XlcCreateLC (in /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0)
==187526== by 0x5943664: _XlcDefaultLoader (in /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0)
==187526== by 0x592D995: _XOpenLC (in /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0)
It is quite unlikely that all of the suppressions are different.
If you create a suppression like
{
XINIT-1
Memcheck:Leak
match-leak-kinds: reachable
fun:malloc
fun:strdup
fun:_XlcCreateLC
fun:_XlcDefaultLoader
fun:_XOpenLC
fun:_XrmInitParseInfo
obj:/usr/lib/x86_64-linux-gnu/libX11.so.6.3.0
}
Then re-run. Typically the error count will go down very quickly and you will only need to add a fairly small number of suppressions (single or low double digits).
(You need to apply your knowledge of the code and lib(s) to get a sensible stack depth for suppressions - too many stack entries and the suppression will be too specific and you need more suppressions, too few and you risk suppressing real problems.)
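One way to automate the extraction asked about above, as an added example that is not part of the original question or answer: it keeps only the `{ ... }` blocks from `--gen-suppressions=all` output, de-duplicates them, and can truncate each stack to its innermost N frames (the stack-depth trade-off the answer describes). The function names, the `auto` name prefix and the sample text are assumptions made for this sketch.

```python
import re

NOISE = re.compile(r"^(==|--)\d+(==|--)")    # valgrind's own report lines
FRAME = re.compile(r"^(fun:|obj:|\.\.\.$)")  # first line of the call stack
# Constructed sample: two leak reports that differ only in the outer frame.
SAMPLE = """\
==1== 2 bytes in 1 blocks are still reachable in loss record 2 of 137
{
   <insert_a_suppression_name_here>
   Memcheck:Leak
   match-leak-kinds: reachable
   fun:malloc
   fun:strdup
   fun:_XlcCreateLC
}
{
   <insert_a_suppression_name_here>
   Memcheck:Leak
   match-leak-kinds: reachable
   fun:malloc
   fun:strdup
   fun:_XOpenLC
}
"""


def extract_suppressions(lines, max_frames=None):
    """Return unique suppression bodies; max_frames keeps the innermost N frames."""
    unique, block = {}, None
    for line in map(str.strip, lines):
        if line == "{":
            block = []
        elif line == "}" and block is not None:
            body = block[1:]  # block[0] is the placeholder name
            first = next((i for i, l in enumerate(body) if FRAME.match(l)), len(body))
            end = len(body) if max_frames is None else first + max_frames
            unique.setdefault(tuple(body[:end]), None)  # dict keeps first-seen order
            block = None
        elif block is not None and line and not NOISE.match(line):
            block.append(line)
    return list(unique)


def render(blocks, prefix="auto"):
    return "".join("{\n   %s-%d\n%s\n}\n" % (prefix, n, "\n".join("   " + l for l in b))
                   for n, b in enumerate(blocks, 1))


if __name__ == "__main__":
    print(render(extract_suppressions(SAMPLE.splitlines(), max_frames=2)), end="")
```

For real output, pass the lines of the valgrind log (or `sys.stdin`) instead of `SAMPLE`, and review every truncated rule before using it, since a shallow stack can hide genuine findings.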

Issues with Valgrind when running Petsc

I receive the following errors from valgrind.
==30996== Conditional jump or move depends on uninitialised value(s)
==30996== at 0x12B28904: ??? (in /usr/lib64/libmlx4-rdmav2.so)
==30996== by 0xE12CF9A: ibv_open_device (in /usr/lib64/libibverbs.so.1.0.0)
==30996== by 0xAAFA03B: btl_openib_component_init (in /sw/arcts/centos7/openmpi/1.10.2-gcc-4.8.5/lib/libmpi.so.12.0.2)
==30996== by 0xAAF0832: mca_btl_base_select (in /sw/arcts/centos7/openmpi/1.10.2-gcc-4.8.5/lib/libmpi.so.12.0.2)
==30996== by 0xAAF0160: mca_bml_r2_component_init (in /sw/arcts/centos7/openmpi/1.10.2-gcc-4.8.5/lib/libmpi.so.12.0.2)
==30996== by 0xAAEE95D: mca_bml_base_init (in /sw/arcts/centos7/openmpi/1.10.2-gcc-4.8.5/lib/libmpi.so.12.0.2)
==30996== by 0xABE96D9: mca_pml_ob1_component_init (in /sw/arcts/centos7/openmpi/1.10.2-gcc-4.8.5/lib/libmpi.so.12.0.2)
==30996== by 0xABE75A8: mca_pml_base_select (in /sw/arcts/centos7/openmpi/1.10.2-gcc-4.8.5/lib/libmpi.so.12.0.2)
==30996== by 0xAA98BD3: ompi_mpi_init (in /sw/arcts/centos7/openmpi/1.10.2-gcc-4.8.5/lib/libmpi.so.12.0.2)
==30996== by 0xAAB87EC: PMPI_Init_thread (in /sw/arcts/centos7/openmpi/1.10.2-gcc-4.8.5/lib/libmpi.so.12.0.2)
==30996== by 0x5D4664: PetscInitialize.part.3 (in /scratch/kfid_flux/ykmizu/ROMLSS/bin/ks_main.x)
==30996== by 0x49B5B4: main (in /scratch/kfid_flux/ykmizu/ROMLSS/bin/ks_main.x)
==30996==
and this error repeats itself over and over again. I don't understand why PetscInitialize would give me a hard time. It's one of the first things I call in my main.c file after I initialize ints and doubles and etc.
PetscInitialize(&argc, &argv, NULL, NULL);
SlepcInitialize(&argc, &argv, NULL, NULL);
PetscViewerPushFormat(PETSC_VIEWER_STDOUT_SELF, PETSC_VIEWER_ASCII_MATLAB);
Are these just false errors? Any help would be greatly appreciated. Getting a little desperate about this. Thank you.
There are discussions here.
It seems that you use Open MPI which is noisy under valgrind. You can try to compile two versions of PETSc (so two different PETSC_ARCHs): one uses the optimized MPI in your system, and another is built using MPICH with the configure option --download-mpich.
For debugging, you can select the PETSC_ARCH compiled with mpich. For performance evaluation, you can select another PETSC_ARCH compiled with optimized MPI of your platform.
Additionally, if you want to use both PETSc and SLEPc, you can select either PetscInitialize or SlepcInitialize to start their environment. It makes no sense to repeat two times.
I hope it's helpful for you.

Valgrind suppression with using frame-level wildcard (ellipses)

I'm catching quite a few uninitialized value(s) under Valgrind. The finding is expected because it's related to OpenSSL's PRNG:
==5787== Use of uninitialised value of size 8
==5787== at 0x533B449: _x86_64_AES_encrypt_compact (in /usr/local/ssl/lib/libcrypto.so.1.0.0)
==5787== by 0x533B6DA: fips_aes_encrypt (in /usr/local/ssl/lib/libcrypto.so.1.0.0)
==5787== by 0x56FBC47: ??? (in /usr/local/ssl/lib/libcrypto.so.1.0.0)
==5787== by 0x56FBD27: ??? (in /usr/local/ssl/lib/libcrypto.so.1.0.0)
==5787== by 0x56FBE47: ??? (in /usr/local/ssl/lib/libcrypto.so.1.0.0)
==5787== by 0xFFEFFFE17: ???
==5787== Uninitialised value was created by a heap allocation
==5787== at 0x4C28D84: malloc (vg_replace_malloc.c:291)
==5787== by 0x53575AF: CRYPTO_malloc (in /usr/local/ssl/lib/libcrypto.so.1.0.0)
==5787== by 0x53FB52B: drbg_get_entropy (in /usr/local/ssl/lib/libcrypto.so.1.0.0)
==5787== by 0x534C312: fips_get_entropy (in /usr/local/ssl/lib/libcrypto.so.1.0.0)
==5787== by 0x534CABE: FIPS_drbg_instantiate (in /usr/local/ssl/lib/libcrypto.so.1.0.0)
==5787== by 0x53FB94E: RAND_init_fips (in /usr/local/ssl/lib/libcrypto.so.1.0.0)
==5787== by 0x5403F5D: EVP_add_cipher (in /usr/local/ssl/lib/libcrypto.so.1.0.0)
==5787== by 0x507B7C0: SSL_library_init (in /usr/local/ssl/lib/libssl.so.1.0.0)
==5787== by 0x4103E7: DoStartupOpenSSL() (ac-openssl-1.cpp:494)
==5787== by 0x419504: main (main.cpp:69)
==5787==
But I'm having trouble suppressing it (and that's not expected). I'm trying to use the following three rules, which use frame-level wildcards.
{
RAND_init_fips (1)
Memcheck:Cond
...
fun:RAND_init_fips
...
}
{
RAND_init_fips (2)
Memcheck:Value8
...
fun:RAND_init_fips
...
}
{
RAND_init_fips (3)
Memcheck:Value4
...
fun:RAND_init_fips
...
}
I don't want to do things like initialize the memory because of the Debian PRNG fiasco a few years ago. Plus, it's the OpenSSL FIPS Object Module, so I can't modify it because the source code and resulting object file are sequestered.
I'm not sure what the issue is because it appears RAND_init_fips surrounded by frame-level wildcards should match the finding. Any ideas what might be going wrong here?
According to Tom Hughes on the Valgrind User's mailing list, it's not possible to write the suppression rule:

suppress warnings related to certain library

How can I tell valgrind to stop showing any kind of error related to a certain library? I got lots of reports that look like this:
==24152== Invalid write of size 8
==24152== at 0xD9FF876: ??? (in /usr/lib64/dri/fglrx_dri.so)
==24152== by 0x110647AF: ???
==24152== Address 0x7f3c98553f20 is not stack'd, malloc'd or (recently) free'd
I could prune them by the address (0x7fxxxxxxxxxx is not something that is allocated at userland), but my valgrind build seems not to accept --ignore-ranges=0x7f0000000000-0x7fffffffffff
You can generate suppression-lists using --gen-suppressions=all. Then you can add those to some .supp file under lib/valgrind.

Accessing frame info in gdb

In gdb, is there a way to access the contents of info frame in a script?
I'm debugging a problem somewhere between Apache, PHP, APC and my own code, and I have about a hundred cores to choose from. Following the instructions here
http://bugs.php.net/bugs-generating-backtrace.php
I end up with a stacktrace like:
#0 0x0121a31a in do_bind_function (opline=0xa94dd750, function_table=0x9b9cf98, compile_time=0 '\0') at /usr/src/debug/php-5.2.7/Zend/zend_compile.c:2407
#1 0x0124bb2e in ZEND_DECLARE_FUNCTION_SPEC_HANDLER (execute_data=0xbfef7990) at /usr/src/debug/php-5.2.7/Zend/zend_vm_execute.h:498
#2 0x01249dfa in execute (op_array=0xb79d5d3c) at /usr/src/debug/php-5.2.7/Zend/zend_vm_execute.h:92
#3 0x01261e31 in ZEND_INCLUDE_OR_EVAL_SPEC_VAR_HANDLER (execute_data=0xbfef80d0) at /usr/src/debug/php-5.2.7/Zend/zend_vm_execute.h:7809
#4 0x01249dfa in execute (op_array=0xb79d55ec) at /usr/src/debug/php-5.2.7/Zend/zend_vm_execute.h:92
...
#26 0x09caa894 in ?? ()
#27 0x00000000 in ?? ()
The stack will always look similar, with function execute and ZEND_something interleaved several times. I need to go up to the last instance of execute (up 2 in this case) and print myVar.
Obviously gdb knows the function names, but does it surface them in any user variables I could access?
Typing frame 2 shows a one-line version, and info frame shows a single stackframe in detail. I want to do something like
while ($current_frame.function_name != "execute") {up;} print myVar but I don't see how to do it strictly within gdb.
Is there a variable / structure / special memory location / something that allows access to gdb's information on either the whole stack (like bt) or to the current stack frame (like info frame)?
GDB 7.1 has support for accessing frame information from Python for exactly this kind of scripting.
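A sketch of the frame walk the question describes, using GDB's Python API, as an added example that is not from the original answer. The command name `print-in-frame` is made up for this sketch; it assumes a Python-enabled GDB and is loaded with `source` on the script file.

```python
try:
    import gdb  # only importable inside a Python-enabled GDB
except ImportError:
    gdb = None  # lets find_frame() be imported and tested outside GDB


def find_frame(start, func_name):
    """Walk from `start` towards outer frames and return the first frame
    whose function is func_name, or None if the stack runs out."""
    frame = start
    while frame is not None:
        if frame.name() == func_name:  # name() is None for '??' frames
            return frame
        frame = frame.older()
    return None


if gdb is not None:
    class PrintInFrame(gdb.Command):
        """print-in-frame FUNCTION VARIABLE (hypothetical command name)

        Select the nearest frame running FUNCTION and print VARIABLE there."""

        def __init__(self):
            super(PrintInFrame, self).__init__("print-in-frame", gdb.COMMAND_STACK)

        def invoke(self, arg, from_tty):
            args = arg.split()
            if len(args) != 2:
                raise gdb.GdbError("usage: print-in-frame FUNCTION VARIABLE")
            frame = find_frame(gdb.selected_frame(), args[0])
            if frame is None:
                raise gdb.GdbError("no frame for function %s" % args[0])
            frame.select()
            gdb.write("%s = %s\n" % (args[1], frame.read_var(args[1])))

    PrintInFrame()
```

With a core loaded, `print-in-frame execute myVar` then does the equivalent of the `up` loop in the question, and the same call can be repeated across many cores from a GDB batch script.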