Memcheck reports uninitialised values when accessing local variables down the stack - false-positive

Memcheck reports uninitialised values for accesses that I think are perfectly legal. I managed to create a small example program that exhibits this behavior. I would like to know whether Memcheck is really wrong and what can be done about it. (Is there any solution besides adding the errors to a suppression file?)
To reproduce this, I made the program below. It runs the function go, which pushes 0x42 onto the stack and calls og (the call pushes the address of the next instruction, leave, onto the stack). Then og stores esp+4 into the global variable a.
The stack looks like this:
| address of `leave` instruction | pc = a[-1]
| 0x42 | a points here, answer = a[0]
If I build it and run Valgrind,
gcc -g -m32 main.c go.S -o main
valgrind --track-origins=yes ./main
Valgrind thinks that the value in the variable pc (and answer, if you put it in the if instead) is undefined. I checked with a debugger that the values there actually are what I wanted.
==14160== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==14160== Command: ./main
==14160==
==14160== Conditional jump or move depends on uninitialised value(s)
==14160== at 0x804847D: print (main.c:18)
==14160== by 0x80484B0: ??? (go.S:19)
==14160== by 0x8048440: main (main.c:8)
==14160== Uninitialised value was created by a stack allocation
==14160== at 0x80484AC: ??? (go.S:19)
==14160==
==14160== Use of uninitialised value of size 4
==14160== at 0x80484B1: ??? (go.S:20)
==14160== by 0x8048440: main (main.c:8)
==14160== Uninitialised value was created by a stack allocation
==14160== at 0x80484AC: ??? (go.S:19)
If I debug from Valgrind with --vgdb-error=0 and print the definedness, it says that all the bits are undefined.
(gdb) p &pc
$1 = (int *) 0xfea5e4a8
(gdb) mo xb 0xfea5e4a8 4
ff ff ff ff
0xFEA5E4A8: 0x9e 0x84 0x04 0x08
The value at 0xfea5e4a8 is
(gdb) x/x 0xfea5e4a8
0xfea5e4a8: 0x0804849e
and
(gdb) x/i 0x0804849e
0x804849e <go+10>: leave
(gdb)
main.c
#include<stdio.h>
int *a;
extern void go();
int main() {
go();
printf("finito\n");
return 0;
}
int print() {
int answer = a[0];
int pc = a[-1];
// use the vars
if (pc == 0x42) {
printf("%d\n", 0);
}
}
go.S
.text
.globl go
go:
pushl %ebp
movl %esp, %ebp
pushl $0x42
call og
leave
ret
og:
addl $4, %esp
movl %esp, a
sub $4, %esp
call print
ret

The problem is this sequence of code:
addl $4, %esp
movl %esp, a
sub $4, %esp
When you move %esp up, Valgrind marks everything below the new stack pointer position as undefined, and it stays that way even after you move the pointer back down.
The sequence isn't safe anyway: if a signal arrives between the add and the sub, that part of the stack really might get overwritten. (This applies to 32-bit code; in 64-bit code there is a "red zone" below the stack pointer that is safe to use, and Valgrind knows about it.)
Computing the address without moving the stack pointer, for example with leal 4(%esp), %eax followed by movl %eax, a, avoids both problems.

How do I prevent valgrind from creating extra log files when using popen?

valgrind-3.15.0
I have a weird issue with valgrind when I use switches --trace-children, --trace-children-skip and --log-file along with popen().
My code:
#include <stdlib.h>
#include <stdio.h>
int main(int argc, char *argv[]) {
FILE *fp = NULL;
fp = popen("ls -l", "r");
pclose(fp);
fp = popen("ls -l", "r");
pclose(fp);
fp = popen("ls -l", "r");
pclose(fp);
fp = popen("ls -l", "r");
pclose(fp);
fp = popen("ls -l", "r");
pclose(fp);
return EXIT_SUCCESS;
}
Command 1: valgrind --leak-check=full --track-origins=yes --trace-children=yes --trace-children-skip=*/sh,*/ls ./test
When I run this command I get only one process output in STDOUT. That's what I expect because I'm specifying with the --trace-children-skip to ignore what I'm doing with popen().
Command 2: valgrind --leak-check=full --track-origins=yes --trace-children=yes --trace-children-skip=*/sh,*/ls --log-file=logs/%p ./test
When I run this command I get 6 log files. One for the main process and one for each of the popen() calls. valgrind reports the command as ./test. This is not what I'd expect. I'd expect the same as above, only one process log.
Running without --trace-children-skip I'd get 11 files; one for the main process and two for each of the popen() calls (sh and ls as the commands). This is what I'd expect since I'm not skipping anything and popen() calls sh which then calls ls.
I'm not sure what the deal is here. --trace-children-skip with --log-file is working in that it's not showing the sh and ls logs, but it's still creating a process for each popen() call which doesn't happen if I don't use --log-file. Am I missing something?
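One plausible explanation, not verified against Valgrind 3.15.0 here: before sh can be executed, popen() has to create a child process. With a fork-based implementation, that child is briefly a copy of ./test with its own PID, and Valgrind is still running in it until the exec happens. Because %p in --log-file expands to the process ID, that short-lived child would get its own log file, still labelled with the command ./test. The C sketch below only illustrates the fork-then-exec pattern. The function name run_shell_command is made up for illustration, and a real popen() also sets up a pipe and may use a different process-creation primitive.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: roughly what popen() does, minus the pipe.
   Forks, runs `cmd` through /bin/sh in the child, waits for it, and
   stores the child's exit status in *exit_status.
   Returns the child's PID, or -1 on error. */
pid_t run_shell_command(const char *cmd, int *exit_status)
{
    pid_t child = fork();
    if (child == -1)
        return -1;

    if (child == 0) {
        /* At this point the child is still a copy of the original
           program, but getpid() already returns a new PID. */
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127); /* only reached if exec failed */
    }

    int status = 0;
    if (waitpid(child, &status, 0) == -1)
        return -1;
    *exit_status = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    return child;
}
```

If this is what happens, five popen() calls plus the main process would account for the six log files seen with Command 2.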

Lex and Yacc to make compiler?

I am starting a toy compiler, and I am making the simplest thing I can imagine, but it won't work.
Lex compiles, and Yacc compiles, and they link together, but the outputted program does not do what I expected.
Lex:
%{
#include <stdlib.h>
void yyerror(char *);
#include "y.tab.h"
%}
%%
a {
yylval = atoi(yytext);
return AAA;
}
. yyerror("invalid character");
%%
int yywrap(void) {
return 1;
}
Yacc:
%{
void yyerror(char *);
int yylex(void);
int sym[26];
#include <stdio.h>
%}
%token AAA
%%
daaaa:
AAA {printf("%d\n", $1);}
%%
void yyerror(char *s) {
fprintf(stderr, "%s\n", s);
}
int main(void) {
yyparse();
return 0;
}
The program I am trying to compile with this compiler is a file containing just: a. That's it.
I don't know what's happened!
Clarification: What I expected the compiled compiler to do was to accept a file into it, process the file, and spit out a compiled version of that file.
Can you explain, maybe in an answer, exactly what you did, and how it worked, because as far as I can tell, and as far as I have tested the question, it shouldn't work as you say.
I took your code verbatim, creating files grammar.y and lexer.l. I then compiled the code. I'm working on Mac OS X 10.11.4, using GCC 6.1.0, Bison 2.3 (disguised as yacc) and Flex 2.5.35 (disguised as lex).
$ yacc -d grammar.y
$ lex lexer.l
$ gcc -o gl y.tab.c lex.yy.c
$ ./gl <<< 'a'
0
$
I subsequently made two changes. In grammar.y, I changed main() to:
int main(void) {
#if YYDEBUG
yydebug = 1;
#endif
yyparse();
return 0;
}
and in lexer.l, I changed the default character rule to:
\n|. yyerror("invalid character");
(The . doesn't match newline, so the newline after the a in the input was echoed by default in the original output.)
With a similar compilation, the output becomes:
$ ./gl <<< 'a'
0
invalid character
$
With the compilation specifying -DYYDEBUG too:
$ gcc -DYYDEBUG -o gl lex.yy.c y.tab.c
$
the output includes useful debugging information:
$ ./gl <<< 'a'
Starting parse
Entering state 0
Reading a token: Next token is token AAA ()
Shifting token AAA ()
Entering state 1
Reducing stack by rule 1 (line 12):
$1 = token AAA ()
0
-> $$ = nterm daaaa ()
Stack now 0
Entering state 2
Reading a token: invalid character
Now at end of input.
Stack now 0 2
Cleanup: popping nterm daaaa ()
$ ./gl <<< 'aa'
Starting parse
Entering state 0
Reading a token: Next token is token AAA ()
Shifting token AAA ()
Entering state 1
Reducing stack by rule 1 (line 12):
$1 = token AAA ()
0
-> $$ = nterm daaaa ()
Stack now 0
Entering state 2
Reading a token: Next token is token AAA ()
syntax error
Error: popping nterm daaaa ()
Stack now 0
Cleanup: discarding lookahead token AAA ()
Stack now 0
$
The second a in the input correctly triggers a syntax error (it isn't allowed by the grammar). Other characters are permitted, generate an 'invalid character' message, and are otherwise ignored (so ./gl <<< 'abc' generates 3 invalid character messages, one for the b, one for the c, and one for the newline).
Changing the assignment to yylval in lexer.l to:
yylval = 'a'; // atoi(yytext);
changes the number printed from 0 to 97, which is the character code for 'a' in ASCII, ISO 8859-1, Unicode, etc.
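The original action prints 0 because atoi() returns 0 when its argument does not start with a number, which is the case for the matched text "a". A minimal C sketch of the two assignments, using made-up helper names (value_from_atoi and value_from_char do not appear in the lexer):

```c
#include <stdlib.h>

/* Hypothetical helper: mirrors `yylval = atoi(yytext);`.
   atoi() yields 0 when the text has no leading digits. */
int value_from_atoi(const char *matched_text)
{
    return atoi(matched_text);
}

/* Hypothetical helper: mirrors `yylval = 'a';`, generalised to the
   character code of the first matched character. */
int value_from_char(const char *matched_text)
{
    return (unsigned char)matched_text[0];
}
```

Called with "a", the first helper gives 0 and the second gives 97 on an ASCII-compatible system, matching the two outputs seen above.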
I've been using a here string as the source of data. It would be equally feasible to have used a file as the input:
$ echo a > program
$ cat program
a
$ ./gl < program
Starting parse
Entering state 0
Reading a token: Next token is token AAA ()
Shifting token AAA ()
Entering state 1
Reducing stack by rule 1 (line 12):
$1 = token AAA ()
97
-> $$ = nterm daaaa ()
Stack now 0
Entering state 2
Reading a token: invalid character
Now at end of input.
Stack now 0 2
Cleanup: popping nterm daaaa ()
$
If you want to read files specified by name on the command line, you have to write more code in main() to process those files.
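The program does not read a named file because it was never told to; by default the generated lexer reads from standard input.
To read from a file, extern FILE *yyin; must be added in the definitions section of the Yacc program, and main() must point yyin at the opened file before calling yyparse().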
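A minimal sketch of the extra code needed to read a file named on the command line, assuming the file name is passed as the first argument. The helper name open_input is hypothetical; yyin is the FILE * that lex/flex-generated scanners read from.

```c
#include <stdio.h>

/* Hypothetical helper: choose the scanner's input stream.
   Returns stdin when no file name is given, the opened file when
   argv[1] names a readable file, or NULL (after reporting the
   problem with perror) when the file cannot be opened. */
FILE *open_input(int argc, char *argv[])
{
    if (argc < 2)
        return stdin;

    FILE *fp = fopen(argv[1], "r");
    if (fp == NULL)
        perror(argv[1]);
    return fp;
}
```

In grammar.y, main() would then take (int argc, char *argv[]), assign yyin = open_input(argc, argv);, return a non-zero status if the result is NULL, and only then call yyparse().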

How to execute a GDB command from a string variable?

Say the debugged process has a string variable as follows:
char* cmd_str = "set confirm on";
How to execute the command from cmd_str in GDB?
(gdb) $cmd = cmd_str
(gdb) ???
You can use gdb's eval command, which runs printf on its arguments and then evaluates the result as a command.
(gdb) list
1 #include <stdlib.h>
2 main()
3 {
4 char *a = "set confirm off";
5
6 pause();
7 }
(gdb) break 6
Breakpoint 1 at 0x400540: file cmdtest.c, line 6.
(gdb) run
Starting program: ./cmdtest
Breakpoint 1, main () at cmd.c:6
6 pause();
(gdb) show confirm
Whether to confirm potentially dangerous operations is on.
(gdb) printf "%s", a
set confirm off(gdb)
(gdb) eval "%s", a
(gdb) show confirm
Whether to confirm potentially dangerous operations is off.

inline-Assembler compiler error messages

My machine is 64-bit. I wrote inline assembly code like this:
__asm__ (
"mov %cl TEMP_CHAR \n"
"xor %eax, %eax \n"
"mov %eax, A \n"
"rcr %eax, %cl \n"
"mov TEMP_B, %eax \n"
)
When I compile it with gcc from the command line, I get the following errors:
/tmp/ccK8W7qx.s: Assembler messages:
/tmp/ccK8W7qx.s:177 : Error: suffix or operands invalid for 'rcr'
I wonder why this happens. Could anybody help me out?
AT&T syntax puts the operands the other way round, source first and destination last: rcr %cl, %eax. You'll probably want to change the other instructions, too.
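As a rough illustration of the AT&T operand order, here is a small GCC extended-asm sketch. The function name rcr_clear_carry is made up, and the clc instruction is an addition that is not in the question; it is there only so the result does not depend on whatever the carry flag happened to hold. The non-x86 branch is a plain C fallback so the file still builds elsewhere.

```c
#include <stdint.h>

/* Rotate `value` right through the carry flag by `count` bits,
   starting with the carry flag cleared. */
uint32_t rcr_clear_carry(uint32_t value, uint8_t count)
{
#if defined(__i386__) || defined(__x86_64__)
    /* AT&T order: the count (source) comes first, the register
       being rotated (destination) comes last. */
    __asm__ ("clc\n\t"
             "rcrl %%cl, %0"
             : "+r" (value)   /* rotated in place */
             : "c" (count)    /* the count must live in %cl */
             : "cc");         /* the flags are modified */
    return value;
#else
    /* Portable fallback: a 33-bit rotate in which bit 32 plays the
       role of the carry flag. */
    uint64_t wide = value;
    for (count &= 31; count > 0; count--)
        wide = (wide >> 1) | ((wide & 1u) << 32);
    return (uint32_t)wide;
#endif
}
```

Letting the compiler pick the registers through operand constraints also avoids hard-coding %eax. If you would rather keep Intel operand order, GCC accepts -masm=intel.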

Capturing pretty-printed string output, to display multiple variables on a single line in gdb?

In my gdb session, I have typed this:
(gdb) p arg1
$17 = (svn_revnum_t *) 0xbfffea0c
(gdb) p *(arg1)
$18 = -1
Now, I would like the "pretty-printed" output for both commands to be shown in a single line, as in:
$19 = (svn_revnum_t *) 0xbfffea0c ; -1
... so I try something like this:
(gdb) p arg1, ";", *(arg1)
$19 = -1
... but obviously, it doesn't work.
Is there a way to do something like this?
I guess, if it was possible to somehow "capture" the pretty-printed output of print as a string, then I could use printf "%s ; %s" to format my output; but how would one capture the print output, then?
I think the simplest way to do this is to write a new Python command that does what you want.
Alternatively you could write a Python convenience function that uses str on its argument. Then, use that with printf.