I'm getting started with a program in MARS MIPS that lets the user type an expression of the form "x+y=" into the MMIO input window and get back "x+y=z". However, I don't really know where to start. I have the basics set up, but I need to write an entire interrupt handler.
I'm using MARS MIPS and have enabled the interrupt bit, but that's about all I've figured out.
.text
main:
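#Turn on the interrupt-enable bit (bit 1) of the receiver control register
lui $t0, 0xffff         # $t0 = 0xffff0000
lw $t1, 0($t0)
ori $t1, $t1, 0x0002    # set bit 1, keeping the address in $t0 intact
sw $t1, 0($t0)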
.data
expBuffer: .space 60
expBuff: .word 0
.ktext 0x80000180
#Store all used registers
#Recover all used registers
.kdata
#Registers
I would like to let the user tell me how many numbers they want to enter, then let them enter the numbers and store them in an array. I would like to call it myArray.
I cannot find a clear explanation anywhere.
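print_str ("How many numbers would you like to enter? ")
li $v0, 5                       # syscall 5: read an integer from the user
syscall
move $s7, $v0                   # $s7 = n, the number of elements
sll $a0, $s7, 2                 # n x 4 bytes for memory allocation
li $v0, 9                       # syscall 9 (sbrk): allocate a block of memory
syscall                         # $v0 <-- address of the block
move $s5, $v0                   # safe copy of the base address (myArray, kept in $s5 here)
move $t0, $s5                   # $t0 = address of the next free slot
li $s6, 0                       # i = 0
inputLoop:
bge $s6, $s7, exitInputLoop     # stop once i == n
li $v0, 5                       # read the next number
syscall
sw $v0, 0($t0)                  # myArray[i] = input
addi $t0, $t0, 4                # advance to the next word
addi $s6, $s6, 1                # i++
j inputLoop
exitInputLoop: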
How to count the number of CPU clock cycles between the start and end of a benchmark in gem5?
I'm interested in all of the following cases:
full system userland benchmark. Maybe the m5 guest tool has a way to do it?
bare metal benchmark. When gem5 exits it dumps the stats automatically, so the main question is how to skip the cycles for bootloader and go straight to the benchmark itself.
Is there a way besides modifying the benchmark source with instrumentation instructions? How to write those instrumentation instructions in detail?
syscall emulation benchmark. I think gem5 just outputs the stats.txt at the end of the run, and then you can just grep system.cpu.numCycles, but I have to confirm it, currently blocked on: How to solve "FATAL: kernel too old" when running gem5 in syscall emulation SE mode?
I want to use this to learn:
how CPUs work
how to optimize assembly code or compiler settings to run optimally on a given CPU
m5 tool
A good approximation is to run, ideally from a shell script that is the /init program:
m5 resetstats
run-benchmark
m5 dumpstats
Then on host:
grep -E '^system.cpu.numCycles ' m5out/stats.txt
Gives something like:
system.cpu.numCycles 33942872680 # number of cpu cycles simulated
Note that if you replay from an m5 checkpoint with a different CPU, e.g.:
--restore-with-cpu=HPI --caches
then you need to grep for a different identifier:
grep -E '^system.switch_cpus.numCycles ' m5out/stats.txt
resetstats zeroes out the cumulative stats, and dumpstats dumps what has been collected during the benchmark.
This is not perfect, since there is some time between the exec syscall for m5 resetstats finishing and the benchmark starting, but if the benchmark is long enough, this shouldn't matter.
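The grep step can also be scripted. Here is a minimal Python sketch, assuming the stats.txt line layout shown above (stat name, value, then a `#` comment). The helper name `read_stat` is made up for this example and is not part of gem5.

```python
import re

def read_stat(stats_text, name):
    """Return the first integer value of stat `name` in stats.txt text, or None."""
    # Anchor at line start and require whitespace after the name, like the grep above.
    pattern = re.compile(r"^" + re.escape(name) + r"\s+(\S+)", re.MULTILINE)
    match = pattern.search(stats_text)
    return int(match.group(1)) if match else None

# Sample line copied from the grep output above.
sample = "system.cpu.numCycles 33942872680 # number of cpu cycles simulated\n"
print(read_stat(sample, "system.cpu.numCycles"))
```

If stats.txt contains several dumps, this only returns the value from the first one, so it fits the single resetstats/dumpstats pair used here.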
http://arm.ecs.soton.ac.uk/wp-content/uploads/2016/10/gem5_tutorial.pdf also proposes a few more heuristics:
#!/bin/sh
# Wait for system to calm down
sleep 10
# Take a checkpoint in 100000 ns
m5 checkpoint 100000
# Reset the stats
m5 resetstats
run-benchmark
# Exit the simulation
m5 exit
m5 exit also works, since gem5 dumps stats when it finishes.
Instrumentation instructions
Sometimes it seems inevitable that you have to modify the input source code a bit with instrumentation instructions in order to:
skip initialization and go directly to steady state
evaluate individual main loop runs
You can of course deduce those instructions from the gem5 m5 tool code, but here are some very easy to reuse one-line copy-pastes for arm and aarch64, e.g. for aarch64:
/* resetstats */
__asm__ __volatile__ ("mov x0, #0; mov x1, #0; .inst 0xFF000110 | (0x40 << 16);" : : : "x0", "x1");
/* dumpstats */
__asm__ __volatile__ ("mov x0, #0; mov x1, #0; .inst 0xFF000110 | (0x41 << 16);" : : : "x0", "x1");
The m5 tool uses the same mechanism under the hood, but by adding the instructions directly into the source we avoid the syscall, so the measurement is more precise and representative (at the cost of more manual work).
To ensure that the compiler does not reorder the assembly around your ROI, however, you might want to use the techniques mentioned at: Enforcing statement order in C++
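To see what the `.inst` expressions above assemble to, here is a small Python sketch that builds the two instruction words. The base word 0xFF000110 and the codes 0x40 and 0x41 are taken from the snippets above; the helper name `m5op_word` is made up for this example.

```python
M5OP_BASE = 0xFF000110  # base word used in the .inst expressions above

def m5op_word(func):
    """Place the operation code in bits 16 and up, as `| (func << 16)` does."""
    return M5OP_BASE | (func << 16)

print(hex(m5op_word(0x40)))  # word emitted for resetstats
print(hex(m5op_word(0x41)))  # word emitted for dumpstats
```

This can be handy when looking for the instrumentation points in a disassembly of the benchmark.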
Address monitoring
Another technique that can be used is to monitor addresses of interest instead of adding magic instructions to the source.
E.g., if you know that a benchmark starts with PC == 0x400, it should be possible to do something when that address is hit.
To find the addresses of interest, you would have to use, for example, readelf, gdb or tracing, and then, if running full system on top of Linux, ensure that ASLR is turned off.
This technique would be the least intrusive one, but the setup is harder, and to be honest I haven't done it yet. One day, one day.
I'm working on an operating system project, using isolinux (syslinux 4.5) as bootloader, loading my kernel with multiboot header organised at 0x200000.
As far as I know, the kernel is already in 32-bit protected mode at that point. My question: is there an easier way to get access to BIOS interrupts? (Basically I want 0x10 :D)
After loading, my kernel sets up its own GDT and IDT entries and then remaps the IRQs. So, is it possible to jump into real mode just after the kernel is loaded and set up a VGA/SVGA mode (VBE 2.0)? After that I'd continue with my kernel and jump back into protected mode, where I'd use the VBE 2.0 physical buffer address to write to the screen. If yes, how? I've tried a lot but haven't had any success :(
Side note:
I searched a lot on the internet and found that syslinux 1.x+ provides an _intcall API, but I'm not 100% sure about it.
Refer to "syslinux 4.5\com32\lib\sys\initcall.c"
The BIOS was designed for 16-bit machines. However, you still have three options for calling BIOS interrupts from protected mode.
Switch back to real mode and re-enter protected mode (Easiest approach).
Use v86 mode (not available in 64-bit long mode).
Write your own 16-bit x86 processor emulator (Hardest approach).
I used the first approach in my operating system for VBE and disk access through the BIOS.
Code used for this purpose in my operating system:
;______________________________________________________________________________________________________
;Switch to 16-bit real Mode
;IN/OUT: nothing
go16:
[BITS 32]
cli ;Clear interrupts
pop edx ;save return location in edx
jmp 0x20:PM16 ;Load CS with selector 0x20
;To reach 16-bit real mode, we first have to go through 16-bit protected mode
[BITS 16]
PM16:
mov ax, 0x28 ;0x28 is 16-bit protected mode selector.
mov ss, ax
mov ds, ax
mov es, ax
mov gs, ax
mov fs, ax
mov sp, 0x7c00+0x200 ;Stack has its base at 0x7c00+0x200
mov eax, cr0
and eax, 0xfffffffe ;Clear protected enable bit in cr0
mov cr0, eax
jmp 0x50:realMode ;Load CS and IP
realMode:
;Load segment registers with 16-bit Values.
mov ax, 0x50
mov ds, ax
mov fs, ax
mov gs, ax
mov ax, 0
mov ss, ax
mov ax, 0
mov es, ax
mov sp, 0x7c00+0x200
cli
lidt[.idtR] ;Load real mode interrupt vector table
sti
push 0x50 ;New CS
push dx ;New IP (saved in edx)
retf ;Load CS, IP and Start real mode
;Real mode interrupt vector table
.idtR: dw 0xffff ;Limit
dd 0 ;Base
The short answer is no. BIOS calls are designed to run in real mode and don't respect the constraints imposed by protected mode, so you can't call them directly from protected mode; the CPU will most likely triple-fault if you try.
However, x86 processors provide Virtual 8086 mode, which can be used to emulate an x86 processor running in 16-bit real mode. The OSDev wiki and forums provide a wealth of information on this topic. If you go this route, it is generally a good idea to map the kernel to the higher half (Linux uses 0xC0000000) to avoid interfering with VM86 code.
I have a section of code that I'd like to reuse, while changing only one register in one instruction. The initial register is $f18 in coproc1, and each time this code is run I want it to use the next COP1 register (max of 4 times). In this case I am very limited on memory and available GPR registers so I would not like to make a separate subroutine for this.
I know I can use self-modifying code to change the actual instruction, but doing so seems to require me to know the exact address of the line in question. This makes developing my code difficult because the size will constantly fluctuate until I'm finished.
Is it possible to reference a memory address by label+offset?
And is there a better way to do this using very few instructions and additional registers?
calc_and_add_color:
srl $t2, $t2, $t0
andi $s2, $t2, 0x1F
mtc1 $s2, $f22 #f22 is now red/green/blue component
cvt.s.w $f22, $f22
mul.s $f25, $f22, $f18 #<<<F18 HERE IS WHAT I WANT TO CHANGE
round.w.s $f25, $f25
lh $s2, 0x0($s1)
mfc1 $s5, $f25
addu $s5, $s5, $s2 #add new red comp. to existing one
andi $s5, $s5, 0x1F #cap at 31
sh $s2, 0x0($s1) #store it
addiu $s1, $s1, 0x6C0 #next color
addiu $s2, $r0, 0x5 #bit shifter
andi $s5, $fp, 0x0003 #isolate counter
bnel $s5, $zero, calc_and_add_color #when fp reaches zero every color has been done
addiu $fp, $fp, -0x1
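On the self-modifying-code idea: in the standard MIPS encoding of mul.s, the ft operand sits in bits 16 to 20 of the instruction word, so moving from $f18 to the next COP1 register amounts to adding 0x10000 to that word. Here is a small Python sketch to check this; the function name `encode_mul_s` is made up for this example.

```python
COP1 = 0x11       # major opcode for coprocessor 1 instructions
FMT_S = 0x10      # single-precision format field
FUNCT_MUL = 0x02  # function code for MUL.fmt

def encode_mul_s(fd, fs, ft):
    """Encode `mul.s $f<fd>, $f<fs>, $f<ft>` as a 32-bit instruction word."""
    return (COP1 << 26) | (FMT_S << 21) | (ft << 16) | (fs << 11) | (fd << 6) | FUNCT_MUL

word = encode_mul_s(25, 22, 18)                 # mul.s $f25, $f22, $f18
print(hex(word))
print(hex(encode_mul_s(25, 22, 19) - word))     # step to the next ft register
```

So the patch itself needs only a load, an add of 0x10000 and a store on the word holding the mul.s, plus whatever cache flush the target hardware requires before the modified instruction is executed.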
I am a newcomer to MIPS assembly language, but I have prior experience in Java. I have the following block of code and was wondering how I could make it significantly faster. As you can see, this code takes a total of 45 cycles to run, and the div instruction accounts for most of that. Maybe I could use something else in place of div to cut down on cycles?
The code:
li $t0, -32          # 2 cycles
lw $t2, 0($s1)       # 1 cycle
div $t2, $t2, $t0    # 41 cycles
sw $t2, 0($s1)       # 1 cycle
                     # total: 45 cycles
Your help is much appreciated. Thanks.
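Since 32 is a power of two, one common way to avoid div is an arithmetic shift with a small fix-up so that negative values still round toward zero, followed by a negate. The Python sketch below only checks that identity on signed values; it is not MIPS code, and the function names are made up for this example. Each step maps to one MIPS instruction (sra, andi, addu, sra, subu).

```python
def trunc_div(a, b):
    """Integer division that rounds toward zero, as the MIPS div instruction does."""
    q = abs(a) // abs(b)
    return -q if (a < 0) != (b < 0) else q

def div_by_neg32(x):
    """Compute x / -32 for a signed 32-bit x using shift, mask, add and negate."""
    bias = (x >> 31) & 31       # 31 if x is negative, else 0 (sra + andi)
    return -((x + bias) >> 5)   # arithmetic shift right by 5, then negate

# Check the shortcut against truncating division on a few edge cases.
for x in (-100, -33, -32, -31, -1, 0, 1, 31, 32, 100):
    assert div_by_neg32(x) == trunc_div(x, -32)
```

Whether this is actually faster depends on the per-instruction cycle counts of the target CPU, which the post only gives for li, lw, div and sw.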