What causes mkstemp to fail when running many simultaneous valgrind processes? - valgrind

I'm doing testing of some software with valgrind. Ideally, I would like to have 20 or more instances of valgrind open at once. However, if I run more than 16 instances in parallel, I start getting messages like:
==30533== VG_(mkstemp): failed to create temp file: /tmp/valgrind_proc_30533_cmdline_269e37a6
==30533== VG_(mkstemp): failed to create temp file: /tmp/valgrind_proc_30533_cmdline_d6b675e7
==30533== VG_(mkstemp): failed to create temp file: /tmp/valgrind_proc_30533_cmdline_db46c594
==30533== VG_(mkstemp): failed to create temp file: /tmp/valgrind_proc_30533_cmdline_51cd683d
==30533== VG_(mkstemp): failed to create temp file: /tmp/valgrind_proc_30533_cmdline_86662832
==30533== VG_(mkstemp): failed to create temp file: /tmp/valgrind_proc_30533_cmdline_226a8983
==30533== VG_(mkstemp): failed to create temp file: /tmp/valgrind_proc_30533_cmdline_bb94a700
==30533== VG_(mkstemp): failed to create temp file: /tmp/valgrind_proc_30533_cmdline_532d4b39
==30533== VG_(mkstemp): failed to create temp file: /tmp/valgrind_proc_30533_cmdline_de4a957e
==30533== VG_(mkstemp): failed to create temp file: /tmp/valgrind_proc_30533_cmdline_fcc23adf
==30533== VG_(mkstemp): failed to create temp file: /tmp/valgrind_proc_30533_cmdline_f41d332c
valgrind: Startup or configuration error:
valgrind: Can't create client cmdline file in /pathtomyproject/
valgrind: Unable to start up properly. Giving up.
Some of the processes (perhaps a third of them) instead terminate with this error:
==30482== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 2 from 2)
==30482==
==30482== 1 errors in context 1 of 1:
==30482== Jump to the invalid address stated on the next line
==30482== at 0x4C6: ???
==30482== by 0x4005D2E: open_verify (dl-load.c:1914)
==30482== by 0x4006362: open_path (dl-load.c:2175)
==30482== by 0x4008799: _dl_map_object (dl-load.c:2407)
==30482== by 0x400CFE1: openaux (dl-deps.c:65)
==30482== by 0x400F175: _dl_catch_error (dl-error.c:178)
==30482== by 0x400D6BD: _dl_map_object_deps (dl-deps.c:258)
==30482== by 0x400350C: dl_main (rtld.c:1826)
==30482== by 0x4015B23: _dl_sysdep_start (dl-sysdep.c:244)
==30482== by 0x4005364: _dl_start (rtld.c:338)
==30482== by 0x40016B7: ??? (in /lib/x86_64-linux-gnu/ld-2.15.so)
==30482== by 0x4: ???
==30482== by 0x7FF0007C6: ???
==30482== by 0x7FF0007DD: ???
==30482== by 0x7FF0007E2: ???
==30482== by 0x7FF0007E9: ???
==30482== by 0x7FF0007EE: ???
==30482== Address 0x4c6 is not stack'd, malloc'd or (recently) free'd
While running these calls, no files are created in /tmp, but the user account I'm using does have read, write and execute permissions for /tmp.
I cannot find any information about this bug online, but perhaps someone here knows something about it?
EDIT: Some further experimentation suggests that, in fact, no more than 5 processes can run at once.

The error comes from here:
// coregrind/m_libcfile.c
/* Create and open (-rw------) a tmp file name incorporating said arg.
   Returns -1 on failure, else the fd of the file. If fullname is
   non-NULL, the file's name is written into it. The number of bytes
   written is guaranteed not to exceed 64+strlen(part_of_name). */
Int VG_(mkstemp) ( HChar* part_of_name, /*OUT*/HChar* fullname )
{
   HChar buf[200];
   Int n, tries, fd;
   UInt seed;
   SysRes sres;
   const HChar *tmpdir;

   vg_assert(part_of_name);
   n = VG_(strlen)(part_of_name);
   vg_assert(n > 0 && n < 100);

   seed = (VG_(getpid)() << 9) ^ VG_(getppid)();

   /* Determine sensible location for temporary files */
   tmpdir = VG_(tmpdir)();

   tries = 0;
   while (True) {
      if (tries++ > 10)
         return -1;
      VG_(sprintf)( buf, "%s/valgrind_%s_%08x",
                    tmpdir, part_of_name, VG_(random)( &seed ));
      if (0)
         VG_(printf)("VG_(mkstemp): trying: %s\n", buf);

      sres = VG_(open)(buf,
                       VKI_O_CREAT|VKI_O_RDWR|VKI_O_EXCL|VKI_O_TRUNC,
                       VKI_S_IRUSR|VKI_S_IWUSR);
      if (sr_isError(sres)) {
         VG_(umsg)("VG_(mkstemp): failed to create temp file: %s\n", buf);
         continue;
      }
      /* VG_(safe_fd) doesn't return if it fails. */
      fd = VG_(safe_fd)( sr_Res(sres) );
      if (fullname)
         VG_(strcpy)( fullname, buf );
      return fd;
   }
   /* NOTREACHED */
}
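As you can see, the function gives up after 11 failed attempts, which matches the 11 messages in your output. The candidate names come from a seed computed from the pid and ppid, so processes that share the same pid and ppid try the same names and can use up each other's attempts. It is not clear how you are creating the 20 valgrind processes; normally they should not share a pid.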
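The retry bound can be checked in isolation. The following is a minimal sketch, not Valgrind code: count_attempts and always_fail are hypothetical names, and the loop only mirrors the `tries++ > 10` test from VG_(mkstemp), driven by an open that always fails.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for VG_(open): reports failure every time. */
int always_fail(const char *path)
{
    (void)path;
    return -1;
}

/* Mirrors the loop bound in VG_(mkstemp): returns how many times the
   open callback was invoked before the loop stopped. */
int count_attempts(int (*try_open)(const char *))
{
    int tries = 0;
    int attempts = 0;

    assert(try_open != NULL);
    while (1) {
        if (tries++ > 10)
            return attempts;      /* VG_(mkstemp) returns -1 at this point */
        attempts++;
        if (try_open("/tmp/valgrind_example") < 0)
            continue;             /* failed: the real code tries a new name */
        return attempts;          /* success on this attempt */
    }
}
```

With an open that never succeeds, count_attempts(always_fail) returns 11, the same as the number of "failed to create temp file" lines in the question.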
You might be able to work around the problem by either:
creating your Valgrind instances such that they do not share the same pid, or
setting TMPDIR to a different directory for each Valgrind instance.
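The TMPDIR workaround could be sketched as follows for a launcher written in C. This is only an illustration under assumptions: the helper names, the vg_tmp_<index> directory layout and the base directory are hypothetical, and it relies on Valgrind honouring TMPDIR as suggested above.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Build "<base>/vg_tmp_<index>" into out. Returns 0 on success,
   -1 if the result does not fit in out. */
int instance_tmpdir(const char *base, int index, char *out, size_t out_size)
{
    int n;

    assert(base != NULL && out != NULL);
    assert(strlen(base) > 0);
    n = snprintf(out, out_size, "%s/vg_tmp_%d", base, index);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}

/* Create the directory if needed and export it as TMPDIR for the
   current process. Intended to be called in each child process
   just before it exec's valgrind. Returns 0 on success, -1 on error. */
int use_instance_tmpdir(const char *base, int index)
{
    char dir[256];

    if (instance_tmpdir(base, index, dir, sizeof dir) != 0)
        return -1;
    if (mkdir(dir, 0700) != 0 && errno != EEXIST)
        return -1;
    return setenv("TMPDIR", dir, 1);
}
```

Each child would call use_instance_tmpdir with its own index after fork() and before exec'ing valgrind, so that no two instances create their temporary files in the same directory.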

Related

What's a bad file descriptor?

I have the following SWI-Prolog program in a file called 'system.pl':
helloWorld :- read(X), write(X).
I want to test it, so I wrote this:
:- begin_tests(helloWorld_test).
test(myTest, true(Output == "hello")) :-
    with_output_to(string(Output), getEntry).
:- end_tests(helloWorld_test).
getEntry :-
    open('testcase.test', read, Myfile),
    set_input(Myfile),
    process_create(path(swipl), ['-g', 'main', '-t', 'halt', 'system.pl'], [stdin(stream(Myfile)), stdout(pipe(Stream))]),
    copy_stream_data(Stream, current_output),
    close(Myfile).
testcase.test contains the following:
hello.
OK, now, when I call swipl -g run_tests -t halt system.pl I get this:
% PL-Unit: helloWorld_test ERROR: -g helloWorld: read/1: I/O error in read on stream user_input (Bad file descriptor)
ERROR: c:/programasvscode/prolog/programasrandom/system.pl:40:
test myTest: wrong answer (compared using ==)
ERROR: Expected: "hello"
ERROR: Got: ""
done
% 1 test failed
% 0 tests passed
ERROR: -g run_tests: false
Warning: Process "c:\swipl\bin\swipl.exe": exit status: 2
I tried using read/2 with current_input, but I got the same error, only reported for read/2 instead of read/1.
What does this mean? Is there a solution?

Valgrind and where new/delete/malloc/free are called from?

I'm running valgrind-3.15.0 on an embedded ARM platform with switches:
--5484-- --tool=memcheck
--5484-- --track-origins=yes
--5484-- --leak-check=full
--5484-- --show-leak-kinds=all
--5484-- --trace-children=yes
--5484-- --sigill-diagnostics=no
--5484-- --keep-debuginfo=yes
--5484-- --num-callers=500
--5484-- --verbose
--5484-- --verbose
--5484-- --demangle=yes
Everyone agrees that debugging information needs to be built into the binary, and I've seen people say to use -g, -ggdb, or -ggdb3. I've tried all three with no change in the output with regard to the problem. I am compiling with optimization -O1, which I've read should be OK.
I wrote a contrived program to allocate memory, free it, and then write past the end of free'd memory to see what I'd get:
==5484== Invalid write of size 1
==5484== at 0x5EBF6: SpecificObject::Process() (SpecificObject.cpp:98)
==5484== by 0x10BA1: SSystem::RunSingleSol(unsigned int) (SSystem.cpp:4162)
==5484== by 0x1162F: SSystem::RunSolution(unsigned int) (SSystem.cpp:3772)
==5484== by 0x159DB: SSystem::RunMain() (SSystem.cpp:2765)
==5484== by 0x1E527: SSystem::Enter() (SSystem.cpp:2557)
==5484== by 0x485F43D: MTask::MTaskEnter(void*) (MTask.cpp:183)
==5484== by 0x4886E35: ThreadCaller(void*) (WinAbstract.cpp:1562)
==5484== by 0x4EF0F0F: start_thread (pthread_create.c:458)
==5484== by 0x4C0AF57: ??? (clone.S:86)
==5484== Address 0x64b9752 is 1 bytes after a block of size 25 free'd
==5484== at 0x4836D18: operator delete[](void*) (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)
==5484== Block was alloc'd at
==5484== at 0x483584C: operator new[](unsigned int) (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)
In SpecificObject::Process, I have this test code in a switch/case statement (the only other code in this case is a break):
char *Crash = new char[25];
for(i = 0 ; i < 25 ; i++)
    Crash[i] = 0xFF;
delete []Crash;
// use after free.
Crash[26] = 10;
Line 98 is the "Crash[26]=10" which makes sense since that is the overwrite and the allocation via new is only a few lines above it.
However, no matter what I do, I cannot get new or delete to show me anything except what's above. I see that other people get tracebacks showing exactly where new/delete/malloc/free was called from. For example, in some other random example on Stack Overflow, someone said they see:
==9700== Uninitialised value was created by a heap allocation
==9700== at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==9700== by 0x403D6F: get_all_system_info (kernel.c:118)
==9700== by 0x547DE99: start_thread (pthread_create.c:308)
==9700== by 0x57873FC: clone (clone.S:112)
As you can see, it shows that malloc was called by get_all_system_info(), etc... (again that's just a random example of what I'm trying to see)
How do I obtain this information?

Sending signals from DCL command line on OpenVMS

I'm trying to send a signal via the command line on an OpenVMS server. Using Perl I have set up signal handlers between processes and Perl on VMS is able to send Posix signals. In addition, C++ programs are able to send and handle signals too. However, the problem I run into is that the processes could be running on another node in the cluster and I need to write a utility script to remotely send a signal to them.
I'm trying to avoid writing a new script and would rather simply execute a command remotely to send the signal from the command line. I need to send SIGUSR1, which translates to C$_SIGUSR1 for OpenVMS.
Thanks.
As far as I know, there is no supported command line interface to do this. But you can accomplish the task by calling an undocumented system service called SYS$SIGPRC(). This system service can deliver any condition value to the target process, not just POSIX signals. Here's the interface described in standard format:
FORMAT
SYS$SIGPRC process-id ,[process-name] ,condition-value
RETURNS
OpenVMS usage: cond_value
type: longword (unsigned)
access: write only
mechanism: by value
ARGUMENTS
process-id
OpenVMS usage: process_id
type: longword (unsigned)
access: modify
mechanism: by reference
Process identifier of the process that is to receive the signal. The
process-id argument is the address of an unsigned longword containing the
process identifier. If you do not specify process-id, process-name is
used.
The process-id is updated to contain the process identifier actually
used, which may be different from what you originally requested if you
specified process-name.
process-name
OpenVMS usage: process_name
type: character string
access: read only
mechanism: by descriptor
A 1 to 15 character string specifying the name of the process that
is to receive the signal. The process-name argument is the
address of a descriptor pointing to the process name string. The name
must correspond exactly to the name of the process that is to receive
the signal; SYS$SIGPRC does not allow trailing blanks or abbreviations.
If you do not specify process-name, process-id is used. If you specify
neither process-name nor process-id, the caller's process is used.
Also, if you do not specify process-name and you specify zero for
process-id, the caller's process is used.
condition-value
OpenVMS usage: cond_value
type: longword (unsigned)
access: read only
mechanism: by value
OpenVMS 32-bit condition value. The condition-value argument is
an unsigned longword that contains the condition value delivered
to the process as a signal.
CONDITION VALUES RETURNED
SS$_NORMAL The service completed successfully
SS$_NONEXPR Specified process does not exist
SS$_NOPRIV The process does not have the privilege to signal
the specified process
SS$_IVLOGNAM The process name string has a length of 0 or has
more than 15 characters
(plus I suspect there are other possible returns having to do
with various cluster communications issues)
EXAMPLE CODE
#include <stdio.h>
#include <stdlib.h>
#include <ssdef.h>
#include <stsdef.h>
#include <descrip.h>
#include <errnodef.h>
#include <lib$routines.h>
int main (int argc, char *argv[]) {
    /*
    **
    ** To build:
    **
    ** $ cc sigusr1
    ** $ link sigusr1
    **
    ** Run example:
    **
    ** $ sigusr1 := $dev:[dir]sigusr1.exe
    ** $ sigusr1 20206E53
    **
    */
    static unsigned int pid;
    static unsigned int r0_status;
    extern unsigned int sys$sigprc (unsigned int *,
                                    struct dsc$descriptor_s *,
                                    int);

    if (argc < 2) {
        (void)fprintf (stderr, "Usage: %s PID\n",
                       argv[0]);
        exit (EXIT_FAILURE);   /* a usage error is not a success */
    }

    /* The PID argument is parsed as hexadecimal. */
    if (sscanf (argv[1], "%x", &pid) != 1) {
        (void)fprintf (stderr, "Invalid PID: %s\n", argv[1]);
        exit (EXIT_FAILURE);
    }

    /* Deliver SIGUSR1 to the target process, selected by PID only. */
    r0_status = sys$sigprc (&pid, 0, C$_SIGUSR1);
    if (!$VMS_STATUS_SUCCESS (r0_status)) {
        (void)lib$signal (r0_status);
    }
    return EXIT_SUCCESS;
}

A list of errors on my site

I don't know why my site gives me these errors. This is the list of errors.
Please help me! What should I do?
Fatal error: Out of memory (allocated 6029312) (tried to allocate 8192 bytes) in /home/lifegat/domains/life-gate.ir/public_html/includes/functions.php on line 7216
Fatal error: Out of memory (allocated 7602176) (tried to allocate 1245184 bytes) in /home/lifegat/domains/life-gate.ir/public_html/misc.php(89) : eval()'d code on line 1534
Fatal error: Out of memory (allocated 786432) (tried to allocate 1245184 bytes) in /home/lifegat/domains/life-gate.ir/public_html/showthread.php on line 1789
Fatal error: Out of memory (allocated 7340032) (tried to allocate 30201 bytes) in /home/lifegat/domains/life-gate.ir/public_html/includes/class_core.php(4633) : eval()'d code on line 627
Fatal error: Out of memory (allocated 2097152) (tried to allocate 77824 bytes) in /home/lifegat/domains/life-gate.ir/public_html/includes/functions.php on line 2550
Warning: mysql_query() [function.mysql-query]: Unable to save result set in [path]/includes/class_core.php on line 417
Warning: Cannot modify header information - headers already sent by (output started at [path]/includes/class_core.php:5615) in [path]/includes/functions.php on line 4513
Database error
Fatal error: Out of memory (allocated 786432) (tried to allocate 311296 bytes) in /home/lifegat/domains/life-gate.ir/public_html/includes/init.php on line 552
Fatal error: Out of memory (allocated 3145728) (tried to allocate 19456 bytes) in /home/lifegat/domains/life-gate.ir/public_html/includes/functions.php on line 8989
Fatal error: Out of memory (allocated 262144) (tried to allocate 311296 bytes) in /home/lifegat/domains/life-gate.ir/public_html/forum.php on line 475
Warning: mysql_query() [function.mysql-query]: Unable to save result set in [path]/includes/class_core.php on line 417
Warning: Cannot modify header information - headers already sent by (output started at [path]/includes/class_core.php:5615) in [path]/includes/functions.php on line 4513
Fatal error: Out of memory means that the server is out of reserved memory. This usually happens when you are working with big objects, such as images.
One way to reduce memory use is to avoid unnecessary copies with the & operator, which makes a variable refer to another variable. Note that this matters mainly for objects in PHP 4; from PHP 5 on, assigning an object copies only a handle to it, not the object itself. Example:
$object = new BigObject();
$copy = $object; // in PHP 4 this copies the object, so more memory is required
$pointer = &$object; // the & makes the $pointer variable refer to $object
Because the variable refers to another variable, if you change one, the other will change as well.
$object = new BigObject();
$pointer = &$object;
$object->x = 12345;
echo $object->x;
echo $pointer->x; // will have the same output as $object->x
References are often used for function parameters, like this:
$object = new BigObject();
x( $object );
function x( &$object ) {
// do stuff with $object
}
The Warning: Cannot modify header information warning is usually given when you try to change the header data after output has already been sent. You probably have a header(); call after you have echo'd something, or some whitespace before the PHP open tag <?php.
Finally, the Warning: mysql_query() [function.mysql-query]: Unable to save result set warning is usually a MySQL issue. But since you are running out of memory, fix the out-of-memory errors first and then see whether this warning remains.
Increase memory_limit in php.ini or slim your code.

making valgrind abort on error for heap corruption checking?

I'd like to try using valgrind to do some heap corruption detection. With the following corruption "unit test":
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int main()
{
    char * c = (char *) malloc(10) ;
    memset( c, 0xAB, 20 ) ;
    printf("not aborted\n") ;
    return 0 ;
}
I was surprised to find that valgrind doesn't abort on error, but just produces a message:
valgrind -q --leak-check=no a.out
==11097== Invalid write of size 4
==11097== at 0x40061F: main (in /home/hotellnx94/peeterj/tmp/a.out)
==11097== Address 0x51c6048 is 8 bytes inside a block of size 10 alloc'd
==11097== at 0x4A2058F: malloc (vg_replace_malloc.c:236)
==11097== by 0x400609: main (in /home/hotellnx94/peeterj/tmp/a.out)
...
not aborted
I don't see a valgrind option to abort on error (like gnu-libc's mcheck does, but I can't use mcheck because it isn't thread safe). Does anybody know if that is possible? Our code dup2's stdout to /dev/null since it runs as a daemon, so a report isn't useful, and I'd rather catch the culprit in the act or close to it.
There is no such option in valgrind.
Consider adding a non-daemon mode (debug mode) into your daemon.
Section 4.6 of the Memcheck manual (http://valgrind.org/docs/manual/mc-manual.html#mc-manual.clientreqs) describes client requests that the debugged program can make to valgrind+memcheck, so you can use some of them in your daemon to run checks at fixed positions in the code.
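One way to get closer to abort-on-error with client requests is to poll Valgrind's error counter at fixed points and call abort() when it has grown. The sketch below is an illustration under assumptions: errors_since and check_valgrind_errors are hypothetical helper names, and it uses the VALGRIND_COUNT_ERRORS request from the core header valgrind/valgrind.h, falling back to a constant 0 when that header is not installed.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#if defined(__has_include)
#  if __has_include(<valgrind/valgrind.h>)
#    include <valgrind/valgrind.h>
#  endif
#endif

/* Fallback so the sketch still compiles without the Valgrind headers.
   Outside Valgrind the real macro also evaluates to 0. */
#ifndef VALGRIND_COUNT_ERRORS
#  define VALGRIND_COUNT_ERRORS 0u
#endif

/* Number of errors Valgrind has reported beyond `baseline`. */
unsigned errors_since(unsigned baseline)
{
    unsigned now = (unsigned)VALGRIND_COUNT_ERRORS;
    return now > baseline ? now - baseline : 0u;
}

/* Call at fixed points in the daemon: aborts as soon as Valgrind has
   reported any error, so the abort happens close to the culprit. */
void check_valgrind_errors(const char *where)
{
    unsigned fresh = errors_since(0u);

    assert(where != NULL);
    if (fresh > 0) {
        /* May go nowhere if stderr is redirected; abort() still fires. */
        fprintf(stderr, "valgrind reported %u error(s) before %s\n",
                fresh, where);
        abort();
    }
}
```

In the test program from the question, a call such as check_valgrind_errors("after memset") placed right after the memset would abort under memcheck and do nothing when the program runs natively. The granularity is only as fine as the spacing of the calls.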