Updating variable that lives in the data segment from the stack and its segment - variables

I currently have three segments of memory, my main data segment, stack segment and the segment where my API lives. The following instructions are executed from the data segment, they push the address of cursorRow and welcomeMsg then do a far call to the function in my API segment. The cursorRow variable lives in the main data segment that is calling the API function. The call looks like this:
push cursorRow
push welcomeMsg
call API_SEGMENT:API_printString
How can I alter cursorRow, inside of the segment where my API lives, through the stack? cursorRow needs to be updated from the API. NO API functions alter the data segment. I have tried things like: inc byte [ds:bp+8] and add [ds:bp+8], 1.
Here is the API procedure being called:
printStringProc:
push bp
mov bp, sp
mov si, [bp+6]
.printloop:
lodsb
cmp al, 0
je printStringDone
mov ah, 0x0E ; teletype output
mov bh, 0x00 ; page number
mov bl, 0x07 ; color (only in graphic mode)
int 0x10
jmp .printloop
printStringDone:
; move the cursor down
mov ah, 02h ; move cursor
mov dh, [bp+8]
mov dl, 0 ; column
mov bh, 0 ; page number
int 10h
add [ds:bp+8], 1
pop bp
retf
It prints strings, but the cursorRow variable doesn't update correctly. I hope I'm clear enough about my issue. It's hard to explain :D

This is because you passed the pointer to cursorRow, not cursorRow itself. When you perform
inc [ds:bp+8]
you: 1) get the value of bp, 2) add 8, 3) assume the result is a pointer in ds, 4) increment the value stored there. The slot at bp+8 holds the pointer to cursorRow and lives on the stack, so this targets the pointer's slot rather than cursorRow (and with the ds override it touches whatever sits at that offset in the data segment; without the override, bp-based addressing defaults to ss and you would increment the pointer itself). What you need to do is take the pointer off the stack and increment the value it points to.
mov bx, [bp+8]
inc byte [bx]
This code: 1) gets the value of bp, 2) adds 8, 3) assumes the result is a pointer in ss, 4) loads the value stored there (the pointer to cursorRow) into bx, 5) assumes bx is a pointer in ds, 6) increments the byte stored there (the value of cursorRow). The byte size specifier is needed because the assembler cannot infer the operand size from [bx] alone.
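The same pointer-versus-value distinction can be sketched in C. This is only an analogy, not the asker's API: the function name api_print_string and its parameters are hypothetical, and the printing itself is left out.

```c
#include <stdint.h>

/* Hypothetical stand-in for the far API routine. The caller passes the
 * ADDRESS of its row variable, just as `push cursorRow` does in NASM. */
void api_print_string(const char *msg, uint8_t *cursor_row)
{
    (void)msg;  /* printing is omitted in this sketch */

    /* Wrong idea: `cursor_row++` would only bump the local copy of the
     * pointer, which is what incrementing the stack slot amounts to. */

    /* Right idea: follow the pointer and bump the byte it refers to,
     * the C counterpart of `mov bx, [bp+8]` / `inc byte [bx]`. */
    (*cursor_row)++;
}
```

Calling api_print_string("hi", &row) with row equal to 4 leaves row at 5 in the caller, because the callee writes through the address it was given.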

It looks like you just pushed the value of cursorRow onto the stack. Without the address you cannot update it. With the address you can easily load the value stored at that address into a register, perform operations on it, then store the register back to the address of cursorRow.

Intel prefixes instructions, checking optimisations problems

I wanted to learn more about ptrace's functions with x86_64 binaries by disassembling instructions.
The goal is to check whether a byte is an instruction prefix.
I found some information in the Intel® 64 and IA-32 Architectures Software Developer’s Manual (volume 2, chapter 2).
The section 2.1.1 INSTRUCTION PREFIXES shows the following prefixes:
[0x26] ES segment override
[0x36] SS segment override prefix
[0x2E] CS segment override prefix or Branch not taken
[0x3E] DS segment override prefix or Branch taken
[0x64] FS segment override prefix
[0x65] GS segment override prefix
[0x66] Operand-size override prefix
[0x67] Address-size override prefix
[0xF0] LOCK prefix
[0xF2] REPNE/REPNZ prefix or BND prefix
[0xF3] REP or REPE/REPZ prefix
Visually, this chart shows prefixes in yellow.
To find out whether a byte is a prefix, I want to be efficient and use bitwise operations where possible.
Take 0x26, 0x36, 0x2E and 0x3E as a first group. In base 2 (00100110, 00110110, 00101110 and 00111110) these numbers share a common pattern: 001XX110.
A bitwise AND with 11100111 (0xE7), compared against 0x26, tells me whether my byte is in this group.
Great. Now, if I take a second group containing 0x64, 0x65, 0x66 and 0x67 (01100100, 01100101, 01100110, 01100111), I find another common pattern: 011001XX.
A bitwise AND with 11111100 (0xFC), compared against 0x64, tells me whether the byte is in the second group.
The problem comes with the remaining instruction prefixes (0xF0, 0xF2 and 0xF3): there is no common pattern. A bitwise AND with 11111100 (0xFC) would also accept the byte 0xF1.
One solution would be to check afterwards that the byte isn't 0xF1.
So, a possible implementation in C would be:
if ((byte & 0xE7) == 0x26) {
/* This `byte` is a ES, SS, CS or DS segment override prefix */
}
if ((byte & 0xFC) == 0x64) {
/* This `byte` is a FS, GS, Operand-size or address-size override prefix */
}
if ((byte & 0xFC) == 0xF0) {
if (byte != 0xF1) {
/* This `byte` is a LOCK, REPN(E/Z) or REP(_/E/Z) prefix */
}
}
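For reference, the three checks above can be folded into one predicate. This is a minimal sketch of the approach described in the question; the function name is_prefix_by_masks is made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if `byte` is one of the eleven prefix bytes listed above. */
bool is_prefix_by_masks(uint8_t byte)
{
    if ((byte & 0xE7) == 0x26)   /* 001XX110: ES, CS, SS, DS overrides */
        return true;
    if ((byte & 0xFC) == 0x64)   /* 011001XX: FS, GS, operand/address size */
        return true;
    /* 111100XX covers 0xF0..0xF3, so 0xF1 has to be excluded by hand. */
    return (byte & 0xFC) == 0xF0 && byte != 0xF1;
}
```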
Coming from Intel, I would expect that this last group could be checked in a single operation.
Then, the final question is: Can I check in one operation if the byte is 0xF0, 0xF2 or 0xF3?
The closest you can get to one instruction is something like:
;ecx = the byte
bt [table],ecx ;Is the byte F0, F2 or F3?
jc .isF0F2orF3 ; yes
However, sometimes a prefix isn't considered a prefix (e.g. pause instruction, which is encoded like rep nop for compatibility with old CPUs).
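In C, the bit-test idea corresponds to a 256-bit bitmap indexed by the byte value. The sketch below is an assumption about what such a table could look like, and it goes one step further than the snippet above by marking all eleven prefix bytes from the question rather than only F0, F2 and F3. The names PREFIX_BIT, prefix_bitmap and is_prefix_by_bitmap are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit position of byte value b inside its 32-bit word. */
#define PREFIX_BIT(b) (UINT32_C(1) << ((b) & 31))

/* 256 bits, one per byte value; word index is (byte >> 5). */
static const uint32_t prefix_bitmap[8] = {
    [1] = PREFIX_BIT(0x26) | PREFIX_BIT(0x2E) | PREFIX_BIT(0x36) | PREFIX_BIT(0x3E),
    [3] = PREFIX_BIT(0x64) | PREFIX_BIT(0x65) | PREFIX_BIT(0x66) | PREFIX_BIT(0x67),
    [7] = PREFIX_BIT(0xF0) | PREFIX_BIT(0xF2) | PREFIX_BIT(0xF3),
};

/* One table load, one shift and one mask, whatever the byte is. */
bool is_prefix_by_bitmap(uint8_t byte)
{
    return (prefix_bitmap[byte >> 5] >> (byte & 31)) & 1u;
}
```

As with the assembly version, this only answers "is this byte value in the set"; it does not know about cases such as pause, where the byte is not acting as a prefix.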
Also note that for a high speed disassembler the fastest approach is likely "jump table driven", where one register points to the table corresponding to the decoder's state and another register contains the next byte of the instruction, like:
;ebx = address of table corresponding to the decoder's current state
movzx eax,byte [esi] ;eax = next byte of the instruction
inc esi ;esi = address of byte after the next byte of this instruction
jmp [ebx+eax*4] ;Go to the code that figures out what to do
In this case, some of the pieces of code jumped to would set some flags without changing the current table (e.g. the entry for 0xF3 in the initial table would cause a jump to code that sets a "rep prefix was seen" flag), and some of the pieces of code jumped to would switch to a different table (e.g. the entry for 0x0F in the initial table would cause a jump to code that changes EBX to point to a completely different table used for all instructions that begin with an 0x0F, ...); and some of the pieces of code jumped to would display an instruction (and reset the state of the decoder).
For example; for pause the code might be:
table0entryF3:
or dword [prefixes],REP
movzx eax,byte [esi] ;eax = next byte of the instruction
inc esi ;esi = address of byte after the next byte
jmp [ebx+eax*4]
table0entry90:
mov edx,instructionNameString_NOP
test dword [prefixes],REP ;Was it a PAUSE or NOP?
je doneInstruction_noOperands ; NOP, current name is right
and dword [prefixes],~REP ; PAUSE, pretend the REP prefix wasn't there
mov edx,instructionNameString_PAUSE ; and use the right name
jmp doneInstruction_noOperands
doneInstruction_noOperands:
call displayPrefixes
call displayInstructionName
mov dword [prefixes],0 ;Reset prefixes
mov ebx,table0 ;Switch current table back to the initial table
movzx eax,byte [esi] ;eax = first byte of next instruction
inc esi ;esi = address of byte after the next byte
jmp [ebx+eax*4]
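The same table-driven structure can be sketched in C with an array of function pointers. This is a deliberately tiny illustration, not the answer's decoder: it has a single table (no table switching), knows only the bytes 0x90 and 0xF3, and assumes the input always ends on a complete instruction, so there is no bounds checking. All names (struct decoder, handler_fn, decode_one and so on) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define PREFIX_REP 0x1u  /* hypothetical flag bit, mirrors REP in the listing */

struct decoder {
    const uint8_t *next;  /* next undecoded byte (ESI in the listing) */
    unsigned prefixes;    /* prefixes seen so far */
    const char *name;     /* mnemonic of the finished instruction */
};

/* A handler returns 1 once a whole instruction has been decoded. */
typedef int (*handler_fn)(struct decoder *d);

static int on_rep(struct decoder *d)
{
    d->prefixes |= PREFIX_REP;  /* remember the prefix, keep decoding */
    return 0;
}

static int on_nop(struct decoder *d)
{
    if (d->prefixes & PREFIX_REP) {
        d->prefixes &= ~PREFIX_REP;  /* F3 90 is PAUSE, not "rep nop" */
        d->name = "pause";
    } else {
        d->name = "nop";
    }
    return 1;
}

static int on_unknown(struct decoder *d)
{
    d->name = "(unknown)";
    return 1;
}

/* Initial table; entries left NULL fall back to on_unknown. */
static const handler_fn table0[256] = {
    [0x90] = on_nop,
    [0xF3] = on_rep,
};

/* Decodes one instruction at *cursor and advances *cursor past it. */
const char *decode_one(const uint8_t **cursor)
{
    struct decoder d = { *cursor, 0, NULL };
    int done = 0;

    while (!done) {
        uint8_t byte = *d.next++;
        handler_fn handler = table0[byte] ? table0[byte] : on_unknown;
        done = handler(&d);
    }
    *cursor = d.next;
    return d.name;
}
```

Feeding it the bytes 90 F3 90 yields "nop" for the first call and "pause" for the second, with the cursor advanced by one and then two bytes.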

Accepts and displays one character of 1 through 9 using assembly code in c

Can someone please explain me each line of this assembly code?
void main(void){
_asm{
mov ah,8 ;read key no echo
int 21h
cmp al,'0' ;filter key code
jb big
cmp al,'9'
ja big
mov dl,al ;echo 0-9
mov ah,2
int 21h
big:
}
}
PS: I am new to assembly in c/c++.
Per the docs, the return value is in al, not ah. That's why it compares to al.
Edit: Adding more detail:
Looking at this code:
mov ah,8 ;read key no echo
int 21h
Think of this like a function call. Now normally a function call in asm looks like call myroutine. But DOS used interrupts to allow you to call various operating system functions (read a key from the keyboard, read data from a file, etc).
So, executing the int 21h instruction called the operating system. But how was the operating system supposed to know which OS function you wanted? Typically by putting a value in ah. If you search, you can find a number of resources that show listings of all the int 21h functions (like this). The numbers on the right are the values you put in ah.
So, mov ah,8 is preparing to call the "Wait for console input without echo" function. mov ah,2 is "Display output." Other registers are used to pass various parameters to the function being called. You need to read the description of the specific interrupt to understand what goes where.
Note that NONE of this is related to "writing inline asm in C." This is just how to call OS functions from C code running under DOS. If you aren't running under DOS, int 21h won't work.
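Stripped of the DOS calls, the control flow of the snippet in the question can be sketched in portable C. This is only an approximation: getchar and putchar stand in for the two int 21h functions, and unlike AH=08h, getchar usually echoes and waits for a full line. The function names filter_digit and echo_one_digit are hypothetical.

```c
#include <stdio.h>

/* Returns the key if it is an ASCII digit '0'..'9', otherwise -1. */
int filter_digit(int key)
{
    if (key < '0')      /* cmp al,'0' / jb big */
        return -1;
    if (key > '9')      /* cmp al,'9' / ja big */
        return -1;
    return key;
}

/* Reads one key and echoes it only when it is a digit. */
void echo_one_digit(void)
{
    int key = getchar();            /* stands in for int 21h, AH=08h */
    if (filter_digit(key) != -1)
        putchar(key);               /* stands in for int 21h, AH=02h */
}
```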

Assembly - assign int value to string

It is possible to assign an int value to a string in assembly?
For example, I put >> rat = 2, and when I call a program that uses the variables, if I put >> rat + 2, it has to return 4.
And, if it's possible, which is the best way to do it?
Any help will be appreciated.
It sounds like you are wanting to store an integer in a variable, that just happens to be named using a string? That's very different from how rkhb interpreted your question, but I think it is more in line with your actual question, judging from the tags you've used.
There are two basic types of variables supported in NASM: initialized data and uninitialized data.
With initialized data, you assign a static value when you declare the variable. Actually, initialized data is more like a constant, but you name it symbolically. DB (Declare Byte), DW (Declare Word), DD (Declare Doubleword), and DQ (Declare Quadword) are the commands used to declare initialized data. So you could do:
rat DD 2
And then somewhere in your code, do:
mov eax, DWORD [rat]
add eax, 2
; eax now contains 4
With uninitialized data, you are basically just reserving space to hold data. This data is not initialized statically; you fill it at run-time. You use RESB (Reserve Byte), RESW (Reserve Word), RESD (Reserve Doubleword), and RESQ (Reserve Quadword) for this; for example:
rat RESD 1 ; reserve space for 1 DWORD-sized value
And then later in your code, you would go:
call GetValue ; returns value in EAX
mov DWORD [rat], eax ; store value in 'rat'
This is all explained in Chapter 3 of the NASM manual.
That wasn't what I was looking for, but thanks
I'm trying to make a calculator with variables with NASM
So, as a Casio calculator (for example), that you can put variables like X, Y, M, Z and others, and then you can assign values to those variables
That is what I'm looking for: not inside the code, but on screen.
Again, thanks for help, it helped me with another error with my code
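For variables that the user names at run time, as on a calculator, the usual approach is a small lookup table that maps a name typed by the user to a stored integer. The sketch below shows that idea in C rather than NASM, purely to illustrate the data structure; the table size, the name length limit and the functions set_var and get_var are all assumptions made for this example.

```c
#include <string.h>

#define MAX_VARS 16  /* arbitrary capacity for this sketch */
#define MAX_NAME 8   /* longest name is MAX_NAME - 1 characters */

struct variable {
    char name[MAX_NAME];
    int value;
};

static struct variable vars[MAX_VARS];
static int var_count;

static struct variable *find_var(const char *name)
{
    for (int i = 0; i < var_count; i++)
        if (strcmp(vars[i].name, name) == 0)
            return &vars[i];
    return NULL;
}

/* Creates or updates a variable. Returns 0 on success, -1 if the
 * table is full or the name is too long. */
int set_var(const char *name, int value)
{
    struct variable *v = find_var(name);
    if (v == NULL) {
        if (var_count == MAX_VARS || strlen(name) >= MAX_NAME)
            return -1;
        v = &vars[var_count++];
        strcpy(v->name, name);
    }
    v->value = value;
    return 0;
}

/* Looks a variable up. Returns 0 and fills *out, or -1 if unknown. */
int get_var(const char *name, int *out)
{
    const struct variable *v = find_var(name);
    if (v == NULL)
        return -1;
    *out = v->value;
    return 0;
}
```

With this in place, handling ">> rat = 2" becomes set_var("rat", 2), and evaluating ">> rat + 2" becomes a get_var("rat", &x) followed by an ordinary addition.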

ASM 8086 : Reading the value of a variable is different from the value assigned to the variable

I'm writing a little program in Assembly 8086 and I have to use variables.
So I have a variable that is defined in the data segment :
myVar BYTE 3,0
Afterwards in my code I have to access the variable and use its value. But the program did not work as expected. So I searched for the error in my code and found that when I access "myVar", the value is different from the value I assigned to it.
When I print the contents of "myVar" it prints 173 instead of 3 :
xor dx, dx
mov dl, myVar
push dx
CALL tprint
"tprint" is a function I wrote, that will display the number passed as argument via the stack. So in this case it will print the content of the DX register.
When I put 3 in dx and then print it, it prints 3, so "tprint" works fine :
xor dx, dx
mov dl, 3
push dx
CALL tprint
So the problem is that when I move the contents of the variable "myVar" in the DL register, the wrong value is put in DL (another value than the value assigned to "myVar") :
xor dx, dx
mov dl, myVar ; DL != 3 --> why???
I really don't understand this behaviour. I searched a lot of sites and they all do it this way, so why does it work fine for them and not for me?
Remark : The "tprint" function is a function for printing signed numbers using two's complement method.
Thanks for your help!
When you move a value from memory into a register, you want to use brackets to move the actual value and not the memory address. So for
mov dl, myVar
you're likely moving just the pointer instead of the value.
See this link
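The difference between "the value stored at myVar" and "the address of myVar" can be made concrete in C. Which of the two a bare name means in assembly depends on the assembler: NASM treats a bare symbol as its address, while MASM/TASM-style syntax treats a data label as a memory operand, so this sketch does not claim to explain the 173 in the question. The function names are hypothetical.

```c
#include <stdint.h>

/* Reads the byte stored at the variable, like `mov dl, [myVar]` in NASM. */
uint8_t load_value(const uint8_t *var)
{
    return *var;
}

/* Keeps only the low byte of the variable's address, which is roughly
 * what ends up in an 8-bit register if the address is moved instead. */
uint8_t load_address_low_byte(const uint8_t *var)
{
    return (uint8_t)(uintptr_t)var;
}
```

For an array initialised as {3, 0}, load_value returns 3, whereas load_address_low_byte returns a number that depends on where the array happens to be placed in memory.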

How do I get started with ARM on iOS?

Just curious as to how to get started understanding ARM under iOS. Any help would be super nice.
In my opinion, the best way to get started is to
Write small snippets of C code (later Objective-C)
Look at the corresponding assembly code
Find out enough to understand the assembly code
Repeat!
To do this you can use Xcode:
Create a new iOS project (a Single View Application is fine)
Add a C file scratchpad.c
In the Project Build Settings, set "Generate Debug Symbols" to "No"
Make sure the target is iOS Device, not Simulator
Open up scratchpad.c and open the assistant editor
Set the assistant editor to Assembly and choose "Release"
Example 1
Add the following function to scratchpad.c:
void do_nothing(void)
{
return;
}
If you now refresh the Assembly in the assistant editor, you should see lots of lines starting with dots (directives), followed by
_do_nothing:
# BB#0:
bx lr
Let's ignore the directives for now and look at these three lines. With a bit of searching on the internet, you'll find out that these lines are:
A label (the name of the function prefixed with an underscore).
Just a comment emitted by the compiler.
The return statement. The b means branch, ignore the x for now (it has something to do with switching between instruction sets), and lr is the link register, where callers store the return address.
Example 2
Let's beef it up a bit and change the code to:
extern void do_nothing(void);
void do_nothing_twice(void)
{
do_nothing();
do_nothing();
}
After saving and refreshing the assembly, you get the following code:
_do_nothing_twice:
# BB#0:
push {r7, lr}
mov r7, sp
blx _do_nothing
pop.w {r7, lr}
b.w _do_nothing
Again, with a bit of searching on the internet, you'll find out the meaning of each line. Some more work needs to be done because we make two calls: The first call needs to return to us, so we need to change lr. That is done by the blx instruction, which not only branches to _do_nothing, but also stores the address of the next instruction (the return address) in lr.
Because we change the return address, we have to store it somewhere, so it is pushed on the stack. The second jump has a .w suffixed to it, but let's ignore that for now. Why doesn't the function look like this?
_do_nothing_twice:
# BB#0:
push {lr}
blx _do_nothing
pop.w {lr}
b.w _do_nothing
That would work as well, but in iOS, the convention is to store the frame pointer in r7. The frame pointer points to the place in the stack where we store the previous frame pointer and the previous return address.
So what the code does is: First, it pushes r7 and lr to the stack, then it sets r7 to point to the new stack frame (which is on the top of the stack, and sp points to the top of the stack), then it branches for the first time, then it restores r7 and lr, and finally it branches for the second time. A bx lr at the end is not needed, because the called function will return to lr, which points to our caller.
Example 3
Let's have a look at a last example:
void swap(int *x, int *y)
{
int temp = *x;
*x = *y;
*y = temp;
}
The assembly code is:
_swap:
# BB#0:
ldr r2, [r0]
ldr r3, [r1]
str r3, [r0]
str r2, [r1]
bx lr
With a bit of searching, you will learn that arguments and return values are stored in registers r0-r3, and that we may use those freely for our calculations. What the code does is straightforward: It loads the value that r0 and r1 point to in r2 and r3, then it stores them back in exchanged order, then it branches back.
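To connect the registers back to C, here is a small hypothetical caller for swap. Under the ARM calling convention the first argument travels in r0 and the second in r1, so the two addresses taken below are what r0 and r1 hold on entry. The function swapped_first is made up for this sketch.

```c
void swap(int *x, int *y)
{
    int temp = *x;
    *x = *y;
    *y = temp;
}

/* Hypothetical caller: returns what `a` holds after swapping a and b. */
int swapped_first(int a, int b)
{
    swap(&a, &b);   /* &a is passed in r0, &b in r1 */
    return a;       /* an int result comes back in r0 */
}
```

Note that both loads in the assembly happen before either store, which matches the C code: swapping a variable with itself leaves it unchanged.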
And So On
That's it: Write small snippets, get enough info to roughly understand what's going on in each line, repeat. Hope that helps!