I have this jasmin assembly, which is the equivalent of the simplification of JVM assembly produced by a bytecode rewriter I am writing. It crashes when run, but works if I remove the sipush and the first store.
.method public simple()V
.limit stack 4
sipush 12345
istore_1
getstatic java/lang/System/out Ljava/io/PrintStream;
sipush 12345
ldc 12345
iadd
invokevirtual java/io/PrintStream/println(I)V
return
.end method
Does the JVM require every store to be used by a load?
istore_1 stores a value to local variable #1, but your method has no locals.
The method will become valid if you add the following line:
.limit locals 2
Below is the assembly code dump output from JIT C2.
It performs a function call (callq), but in the comment section the JIT outputs a call stack.
Does this imply that inlining is only applied up to SomeClass::SomeMethod? Thanks for answering.
0x00007f4a9f4f269f: callq 0x00007f4a9d0453e0 ; OopMap{rbp=Oop [288]=Oop [312]=Oop [112]=Oop [120]=Oop [128]=Oop [136]=Oop [176]=Oop [192]=Oop off=4132}
;*if_icmpeq
; - org.apache.spark.xyz.abc.SomeClass::SomeMethod#178 (line 87)
; - org.apache.spark.abc.xyz.OtherClass::OtherMethod#575 (line 561)
; {runtime_call}
The comments show the current inlining stack corresponding to the given machine instruction.
;*if_icmpeq
; - org.apache.spark.xyz.abc.SomeClass::SomeMethod#178 (line 87)
; - org.apache.spark.abc.xyz.OtherClass::OtherMethod#575 (line 561)
In particular, the above comment means that the corresponding call instruction was generated as a result of compiling if_icmpeq bytecode at SomeClass.SomeMethod, which was inlined into the compilation of OtherClass.OtherMethod.
Here OtherClass.OtherMethod is the root method being compiled, and SomeClass.SomeMethod is a method at the first inlining level.
I'm running some code under Valgrind, compiled with gcc 7.5 targeting an aarch64 (ARM 64 bits) architecture, with optimizations enabled.
I get the following error:
==3580== Invalid write of size 8
==3580== at 0x38865C: ??? (in ...)
==3580== Address 0x1ffeffdb70 is on thread 1's stack
==3580== 16 bytes below stack pointer
This is the assembly dump in the vicinity of the offending code:
388640: a9bd7bfd stp x29, x30, [sp, #-48]!
388644: f9000bfc str x28, [sp, #16]
388648: a9024ff4 stp x20, x19, [sp, #32]
38864c: 910003fd mov x29, sp
388650: d1400bff sub sp, sp, #0x2, lsl #12
388654: 90fff3f4 adrp x20, 204000 <_IO_stdin_used-0x4f0>
388658: 3dc2a280 ldr q0, [x20, #2688]
38865c: 3c9f0fe0 str q0, [sp, #-16]!
I'm trying to ascertain whether this is a possible bug in my code (note that I've thoroughly reviewed my code and I'm fairly confident it's correct), or whether Valgrind will blindly report any writes below the stack pointer as an error.
Assuming the latter, it looks like a Valgrind bug since the offending instruction at 0x38865c uses the pre-decrement addressing mode, so it's not actually writing below the stack pointer.
Furthermore, at address 0x388640 a similar access (and again with pre-decrement addressing mode) is performed, yet this isn't reported by Valgrind; the main difference being the use of an x register at address 0x388640 versus a q register at address 38865c.
I'd also like to draw attention to the large stack pointer subtraction at 0x388650, which may or may not have anything to do with the issue (note this subtraction makes sense, given that the offending C code declares a large array on the stack).
So, will anyone help me make sense of this, and whether I should worry about my code?
The code looks fine, and the write is certainly not below the stack pointer. The message seems to be a valgrind bug, possibly #432552, which is marked as fixed. OP confirms that the message is not produced after upgrading valgrind to 3.17.0.
code declares a large array on the stack
should [I] worry about my code?
I think it depends upon your desire for your code to be more portable.
Take this bit of code that I believe represents at least one important thing you mentioned in your post:
#include <stdio.h>
#include <stdlib.h>
long long foo (long long sz, long long v) {
long long arr[sz]; // allocating a variable on the stack
arr[sz-1] = v;
return arr[sz-1];
}
int main (int argc, char *argv[]) {
long long n = atoll(argv[1]);
long long v = foo(n, n);
printf("v = %lld\n", v);
}
$ uname -mprsv
Darwin 20.5.0 Darwin Kernel Version 20.5.0: Sat May 8 05:10:33 PDT 2021; root:xnu-7195.121.3~9/RELEASE_X86_64 x86_64 i386
$ gcc test.c
$ a.out 1047934
v = 1047934
$ a.out 1047935
Segmentation fault: 11
$ uname -snrvmp
Linux localhost.localdomain 3.19.8-100.fc20.x86_64 #1 SMP Tue May 12 17:08:50 UTC 2015 x86_64 x86_64
$ gcc test.c
$ ./a.out 2147483647
v = 2147483647
$ ./a.out 2147483648
v = 2147483648
There are at least some minor portability concerns with this code. The amount of allocatable stack memory differs significantly between these two environments, and that's only two platforms. I haven't tried it on my Windows 10 VM, but I don't think I need to, because I got bitten by this one a long time ago.
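If portability across stack limits matters, one option is to take the array off the stack entirely. Below is a minimal sketch of that idea, not a drop-in fix for the OP's code: foo_heap is a hypothetical name, and the extra out parameter is my own addition so that failure can be reported instead of crashing.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical heap-backed variant of foo(): stores v in the last slot of an
 * sz-element array and reads it back through *out.
 * Returns false (instead of crashing) if sz is not positive, the size
 * computation would overflow, or the allocation fails. */
bool foo_heap(long long sz, long long v, long long *out) {
    if (sz <= 0 || out == NULL)
        return false;

    /* Guard against overflow in sz * sizeof(long long). */
    if ((unsigned long long)sz > SIZE_MAX / sizeof(long long))
        return false;

    long long *arr = malloc((size_t)sz * sizeof *arr);
    if (arr == NULL)
        return false;

    arr[sz - 1] = v;
    *out = arr[sz - 1];

    free(arr);
    return true;
}
```

The caller checks the return value, so an oversized request becomes an error path rather than a segmentation fault whose threshold depends on the platform's stack limit.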
Beyond the OP's issue, which was due to a Valgrind bug, the title of this question is bound to attract more people (like me) who are getting "invalid write at X bytes below stack pointer" as a legitimate error.
My piece of advice: check that the address you're writing to is not a local variable of another function (not present in the call stack)!
I stumbled upon this issue while attempting to write into the address returned by yyget_lloc(yyscanner) while outside of function yyparse (the former returns the address of a local variable in the latter).
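For readers who hit the legitimate version of this error, here is the mistake in miniature, together with one way to avoid it. This is only a sketch: the location struct and both function names are hypothetical and are not the flex/bison API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a parser location record. */
typedef struct {
    int first_line;
    int last_line;
} location;

/* BUGGY pattern (shown only as a comment): the returned pointer refers to a
 * local variable whose stack slot is released when the function returns, so
 * any later write through it lands below the stack pointer.
 *
 * location *get_location_bad(void) {
 *     location loc = {1, 1};
 *     return &loc;
 * }
 */

/* Safer pattern: the caller owns the storage and passes its address in, so
 * the object outlives the call that fills it. */
void get_location(location *out, int first_line, int last_line) {
    assert(out != NULL);
    out->first_line = first_line;
    out->last_line = last_line;
}
```

With the second form, writing to the location after the call is fine because it lives in the caller's frame (or wherever the caller chose to put it).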
Assume I have a variable called Block_Size that is declared without initialization.
Would
Block_Size db ?
mov DS:Block_Size, 1
be equal to
Block_Size db 1
No, Block_Size db ? has to go in the BSS or data section, not mixed in with your code.
If you wrote
my_function:
Block_Size db ?
mov DS:Block_Size, 1
...
ret
your code would crash. ? isn't really uninitialized, it's actually zeroed. So when the CPU decoded the instructions starting at my_function (e.g. after some other code ran call my_function), it would actually decode the 0 as code. (IIRC, opcode 0 is add, and then the opcode of the mov instruction would be decoded as the operand byte of add (ModR/M).)
Try assembling it, and then use a disassembler to show you how it would decode, along with the hex dump of the machine code.
db assembles a byte into the output file at the current position, just like add eax, 2 assembles 83 c0 02 into the output file.
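To make the "everything is just bytes at the current position" point concrete, here is a toy sketch in C of what an assembler's output stage does. The output struct and both emit functions are hypothetical names of mine; the only encoding used is the 83 c0 02 for add eax, 2 mentioned above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical output buffer standing in for the assembler's output file. */
typedef struct {
    uint8_t bytes[16];
    size_t len;
} output;

/* Like `db value`: append one raw byte at the current position.
 * Silently ignores the byte if the toy buffer is full. */
void emit_db(output *o, uint8_t value) {
    if (o->len < sizeof o->bytes)
        o->bytes[o->len++] = value;
}

/* Like `add eax, imm8`: append the three bytes 83 c0 imm8. */
void emit_add_eax_imm8(output *o, uint8_t imm) {
    emit_db(o, 0x83);
    emit_db(o, 0xC0);
    emit_db(o, imm);
}
```

Calling emit_db(&o, 0) followed by emit_add_eax_imm8(&o, 2) leaves 00 83 c0 02 in the buffer. Nothing marks the first byte as "data", which is why the CPU will happily try to decode it as an instruction if execution reaches it.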
You can't use db the way you declare a variable in C:
void foo() {
unsigned char Block_size = 1;
}
A non-optimizing compiler would reserve space on the stack for Block_size. Look at compiler asm output if you're curious. (But it will be more readable if you enable optimization. You can use volatile to force the compiler to actually store to memory so you can see that part of the asm in optimized code.)
Maybe related: Assembly - .data, .code, and registers...?
If you wrote
.data
Block_size db ?
.code
set_blocksize:
mov [Block_size], 1
ret
it would be somewhat like this C:
unsigned char Block_size;
void set_blocksize(void) {
Block_size = 1;
}
If you don't need something to live in memory, don't use db or dd for it. Keep it in registers. Or use Block_size equ 1 to define a constant, so you can do stuff like mov eax, Block_size + 4 instead of mov eax, 5.
Variables are a high-level concept that assembly doesn't really have. In asm, data you're working with can be in a register or in memory somewhere. Reserving static storage for it is usually unnecessary, especially for small programs. Use comments to keep track of what you put in which register.
db literally stands for "define byte", so it puts the byte right there in the output, whereas the mov instruction places a particular value in a register or memory location at run time, overwriting whatever was there.
I am writing embedded code in Ada. I want to jump into bootloader code which is located at address 0x0E00. I am trying to use the following code:
with Interfaces; use Interfaces;
with System;
package AVR.bootloader is
procedure Call;
pragma No_Return(Call);
pragma Import (Assembler,Call);
for Call'Address use System'To_Address (16#0E00#);
end AVR.bootloader;
The problem is this does not work.
Edit: I want to do the equivalent of the following C:
void (*boot)(void) = (void (*)(void))0x0E00;
I did a small experiment on this Macbook Pro, and your code seems to do what you meant it to; I modified the code to read
with System;
procedure Bootloader is
procedure Call;
pragma No_Return (Call);
pragma Import (Assembler, Call);
for Call'Address use System'To_Address (16#0E00#);
begin
Call;
end Bootloader;
and when I compile with gnatmake -c -u -f -S bootloader.adb the saved assembler is
.text
.globl __ada_bootloader
__ada_bootloader:
LFB1:
pushq %rbp
LCFI0:
movq %rsp, %rbp
LCFI1:
subq $16, %rsp
LCFI2:
movq $3584, -8(%rbp)
movq -8(%rbp), %rax
call *%rax
leave
LCFI3:
ret
[...]
which looks hopeful, though I’m not familiar enough with asm to know.
Running it under gdb I get (after a lot of chatter)
(gdb) run
Starting program: /Users/simon/tmp/bootloader
Reading symbols for shared libraries ++........................ done
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x0000000000000e00
0x0000000000000e00 in ?? ()
(gdb) bt
#0 0x0000000000000e00 in ?? ()
Cannot access memory at address 0xe00
#1 0x0000000100000d93 in main (argc=1, argv=140734799805048, envp=140734799805064) at /Users/simon/tmp/b~bootloader.adb:121
#2 0x0000000100000bf4 in start ()
which looks even more hopeful.
Perhaps your AVR compiler isn’t code-generating properly?
Since normally a boot-loader runs on reset, the simplest method is to force a processor reset. A boot-loader may reasonably assume that it is running on an uninitialised system in reset state and may perform initialisation that is not valid on an already initialised system, so forcing a reset is the safest method.
Your processor may have a reset instruction or a reset controller that can perform this directly. Failing that it may have a watchdog timer that can generate a reset. Start the watchdog timer with a suitably short time-out and let it run without servicing it.
I'm looking at some disassembly code and see something like 0x01c8f09b <+0015> mov 0x8(%edx),%edi and I am wondering what the value of %edx or %edi is.
Is there a way to print the value of %edx or other assembly variables? Is there a way to print the value at the memory address that %edx points at (I'm assuming edx is a register containing a pointer to ... something here).
For example, you can print an object by typing po in the console, so is there a command or syntax for printing registers/variables in the assembly?
Background:
I'm getting EXC_BAD_ACCESS on this line and I would like to debug what is going on. I'm aware this error is related to memory management and I'm looking at figuring out where I may be missing/too-many retain/release/autorelease calls.
Additional Info:
This is on IOS, and my application is running in the iPhone simulator.
You can print a register (e.g, eax) using:
print $eax
Or for short:
p $eax
To print it as hexadecimal:
p/x $eax
To display the value pointed to by a register:
x $eax
Check the gdb help for more details:
help print
help x
It depends upon which Xcode compiler/debugger you are using. For gcc/gdb it's
info registers
but for clang/lldb it's
register read
(gdb) info reg
eax 0xe 14
ecx 0x2844e0 2639072
edx 0x285360 2642784
ebx 0x283ff4 2637812
esp 0xbffff350 0xbffff350
ebp 0xbffff368 0xbffff368
esi 0x0 0
edi 0x0 0
eip 0x80483f9 0x80483f9 <main+21>
eflags 0x246 [ PF ZF IF ]
cs 0x73 115
ss 0x7b 123
ds 0x7b 123
es 0x7b 123
fs 0x0 0
gs 0x33 51
From Debugging with gdb:
You can refer to machine register contents, in expressions, as variables with names
starting with `$'. The names of registers are different for each machine; use info
registers to see the names used on your machine.
info registers
Print the names and values of all registers except floating-point
registers (in the selected stack frame).
info all-registers
Print the names and values of all registers, including floating-point
registers.
info registers regname ...
Print the relativized value of each specified register regname.
regname may be any register name valid on the machine you are using,
with or without the initial `$'.
If you are using LLDB instead of GDB you can use register read
Those are not variables, but registers.
In GDB, you can see the values of standard registers by using the following command:
info registers
Note that a register contains integer values (32 bits in your case, as the register name is prefixed by e). What the value represents is not known. It can be a pointer, an integer, mostly anything.
If po crashes when you try to print a register's value as a pointer, it's likely that the value is not a pointer (or an invalid one).