Why does QEMU use JIT compilation? - virtual-machine

The TCG "accelerator" is used by QEMU when requesting full-system emulation of a guest with a different hardware architecture (or with -accel tcg).
TCG is a JIT compiler which emulates the guest instruction set by translating guest instructions into host code and executing that code immediately at runtime. Portability depends on the list of architectures that TCG supports.
Would it be possible, realistically speaking, to compile an operating system into some efficient IR (similar to Java bytecode) and implement a virtual machine for that bytecode completely in software?

The short answer to "why does QEMU use JIT compilation" is "because it is faster than other ways to do it, like interpreting, but it can still handle any arbitrary guest binary". Some work has been done (not in QEMU itself, but in other projects and in research) on emulation by statically translating guest binaries into code for the host architecture, but this is tricky, and you still have to be able to fall back to something like a JIT to handle guest binaries that use self-modifying code or that are themselves JITs (think of running a JVM inside a QEMU guest).
It is certainly possible to have an operating system which is compiled into an IR bytecode which then executes portably on a virtual machine on a variety of hosts. Historical examples of this include Taos (http://www.uruk.org/emu/Taos.html) and the UCSD p-System (https://en.wikipedia.org/wiki/UCSD_Pascal). Note that even here you would probably still want to implement the VM's bytecode-execution engine as a JIT, because it's faster than interpreting the bytecode, and as a result there might well be some host-CPU-specific bits in the VM implementation.
However, that sort of portable-operating-system endeavour is an entirely separate idea from QEMU, whose purpose is to run under emulation existing pre-built binaries for a given guest CPU architecture.
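To make the interpret-versus-translate distinction concrete, here is a minimal sketch in Python. It is a toy under stated assumptions, not how TCG is implemented: the two-opcode guest "ISA", the names interpret and ToyTranslator, and the use of generated Python source as the "host code" are all invented for illustration. It shows the two properties mentioned above: a block is translated once and then reused, and a changed guest block (as with self-modifying code) forces a fresh translation.

```python
# Toy guest "ISA": each instruction is a tuple (opcode, operand).
# Illustration only; this does not reflect TCG's real design.

def interpret(program, acc=0):
    """Decode and execute every instruction each time the block runs."""
    for op, arg in program:
        if op == "add":
            acc += arg
        elif op == "mul":
            acc *= arg
        else:
            raise ValueError(f"unknown opcode: {op}")
    return acc


class ToyTranslator:
    """Translate a block of guest code once, then reuse the result."""

    def __init__(self):
        self.cache = {}         # guest block -> translated host callable
        self.translations = 0   # how many blocks had to be translated

    def translate(self, program):
        # Emit host (Python) source for the whole block, then compile it.
        lines = ["def block(acc):"]
        for op, arg in program:
            if op == "add":
                lines.append(f"    acc += {arg!r}")
            elif op == "mul":
                lines.append(f"    acc *= {arg!r}")
            else:
                raise ValueError(f"unknown opcode: {op}")
        lines.append("    return acc")
        namespace = {}
        exec("\n".join(lines), namespace)
        self.translations += 1
        return namespace["block"]

    def run(self, program, acc=0):
        # The block's contents are the cache key, so modified guest code
        # misses the cache and is translated again.
        key = tuple(program)
        if key not in self.cache:
            self.cache[key] = self.translate(program)
        return self.cache[key](acc)
```

Running the same block twice through ToyTranslator.run translates it only once; running an edited copy of the block bumps the translations counter again. A real emulator has to detect guest writes to already-translated code rather than re-hash the block, which is part of what makes this hard in practice.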

Related

Instruction set: how to test an external library

Depending on the CPU architecture, a computer may support specific instruction set extensions. Using these instructions can greatly improve the speed of a program, but can also lead to crashes on hardware that does not support them.
But sometimes, when shipping software that depends on external libraries (binaries), we may want to check which instruction sets they rely on (like AVX2, SSE2, etc.) and assess whether we can safely use the library or executable (e.g. on Windows, a .lib, .dll or .exe). This matters mostly when the final executable has to be shipped to hardware that is out of our control but should follow some specifications.
Most of the related questions seem to tackle the problem the other way around: from the software, checking whether an instruction set is supported on the current hardware:
how verify that operating system support avx2 instructions
Detecting SIMD instruction sets to be used with C++ Macros in Visual Studio 2015
How can one check, from the binary, which instruction sets are required or used? Are there OS tools for that?
The OSs of interest would be Windows, Linux and MacOS.

Bootloader Written for Java

Is there a boot loader written for booting a Java virtual machine without an operating system? As far as I know, a Java virtual machine can run on a machine by itself, without the help of an operating system.
Java defines the guest language, not the host / JVM.
You'd need a JVM written to run on the bare metal of whatever machine you want to run it on (i.e. to be an OS kernel as well as a JVM, handling interrupts and so on). So there isn't something generic called "Java" that a bootloader could load.
The mainstream JVMs like OpenJDK / HotSpot are not written to work as kernels, only to run under some existing mainstream OSes. But as you found, there are some: Can you run JVM on a computer with no operating system?
Even for a specific platform, the things a kernel needs a bootloader to do may depend on the kernel. There are a few standards, like multiboot for x86, that define a kernel file format that bootloaders like GRUB know how to recognize and load, but otherwise you'd probably expect a bare-metal JVM to come with its own custom bootloader, especially if it's for a platform other than an x86 PC. Or perhaps be bootable as an "EFI application".

How can I use QEMU to simulate mixed platforms?

Background
There is a lot of documentation about using QEMU for simulating a system of particular architecture (a "platform").
For example, an x86, ARM or RISCV system.
The first step is to configure QEMU's target list, for example ./configure --target-list=riscv32-softmmu.
It's also possible to provide multiple targets in the target-list, but apparently that builds an independent simulation for each specified platform.
My goal, however, is to simulate a system with mixed targets: an x86 machine which also hosts a RISCV embedded processor over PCI.
Obviously I need to implement a QEMU PCI device which would host the RISCV device on the x86 platform, and I have a good idea how to implement a generic PCI device.
However, I'm not sure about the best approach to simulate both x86 and RISCV together on the same QEMU simulation.
One approach is to run two instances of QEMU (as two separate processes) and use some sort of IPC for communicating between the x86 and the RISCV simulation.
Another possible (?) approach could be to build RISCV QEMU as a loadable library and load it from x86 QEMU.
Perhaps it's even possible to have a single QEMU application that simulates both x86 and RISCV?
Yet another approach is not to use QEMU for simulating the RISCV device. I could implement a QEMU PCI device that completely encapsulates a RISCV simulation such as tiny-emu, but I would rather use QEMU for both x86 and RISCV.
My questions are:
Are there some guidelines or examples for a mixed-target QEMU project?
I've searched for examples but only found references to using QEMU as a single platform simulation, where first you choose which platform you would like to run.
What would be the best approach for simulating a mixed platform in QEMU? Separate QEMU processes with IPC? Or is there a way to configure QEMU so that it can simulate a mixed platform?
Related
https://lists.gnu.org/archive/html/qemu-devel/2021-12/msg01969.html
QEMU does not support running multiple target architectures in the same QEMU process. (This is something we would in theory like to be able to do, but it would require a lot of reworking of core parts of QEMU which assume that the target architecture is known at compile time. So far nobody has felt it important enough to put in the significant development effort needed.)
So if you want to do this you'll need to somehow stitch together a QEMU process for the primary architecture with some other process to do the secondary architecture (QEMU or otherwise). This has been done (for instance Xilinx have an out-of-tree QEMU-based system that does this kind of thing with multiple QEMU processes) but I'm not aware of any easy off-the-shelf frameworks or setups to do it. I suspect that figuring out how time/clocks interact between the two simulations is one of the tricky aspects.
There is another option: you can start two QEMU processes and connect them through a socket.
Then you can create a run script that starts both of them in the order you need.
It's less "clock" accurate, but good enough for virtualizing your HW.
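The socket idea can be sketched without QEMU at all. The Python below is only an illustration under stated assumptions: the 9-byte message format, the function names, and the register offsets are hypothetical and are not a QEMU protocol or API. A thread and socket.socketpair() stand in for the second process and the real socket so that the example is self-contained; in a real setup each side would live in its own simulator process.

```python
import socket
import struct
import threading

# Hypothetical wire format (an assumption, not a QEMU protocol):
# 1-byte op (b"R" or b"W"), 4-byte register offset, 4-byte value, little-endian.
MSG = struct.Struct("<cII")


def recv_exact(sock, n):
    """Read exactly n bytes, or return None if the peer closed the socket."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            return None
        buf += chunk
    return buf


def device_side(sock):
    """Stands in for the second simulator: serves register reads and writes."""
    regs = {}
    while True:
        raw = recv_exact(sock, MSG.size)
        if raw is None:
            break
        op, offset, value = MSG.unpack(raw)
        if op == b"W":
            regs[offset] = value
        # Every request gets a reply carrying the current register value.
        sock.sendall(MSG.pack(op, offset, regs.get(offset, 0)))
    sock.close()


def access(sock, op, offset, value=0):
    """Stands in for the PCI device model inside the first simulator."""
    sock.sendall(MSG.pack(op, offset, value))
    _, _, result = MSG.unpack(recv_exact(sock, MSG.size))
    return result


def demo():
    host_end, device_end = socket.socketpair()
    worker = threading.Thread(target=device_side, args=(device_end,))
    worker.start()
    access(host_end, b"W", 0x10, 0xCAFE)   # write a register on the far side
    value = access(host_end, b"R", 0x10)   # read it back over the socket
    host_end.close()                       # far side sees EOF and exits
    worker.join()
    return value
```

Each access here is a blocking round trip, so the two sides stay in lockstep only at the moments they exchange messages; nothing synchronizes their notion of time, which is the "clock" inaccuracy mentioned above.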
The other option is https://wiki.qemu.org/Features/MultiProcessQEMU, but you will need to do some hacking on this experimental code.
Use Renode. It provides not only easy multi-CPU simulation, but also HDL and multi-machine simulation, synchronized in a single process.

What exactly is QEMU? Emulator? VM?

I am trying to make use of QEMU in my embedded software development process. I think it will be useful to be able to run my code without having to touch the hardware, especially when the software sits in Linux user space. Now I am trying to get my head around the big concepts in QEMU.
At what point is QEMU virtualizing the hardware? Can I assume it virtualizes x86 when the host platform is also x86 with virtualization technology built into the processor?
In other words, can I assume QEMU is emulating the hardware when the target platform is not the same as host platform?
It's a general-purpose emulator (it can act as a type 2 hypervisor) which can use hardware virtualization when the target and host are of the same architecture. On Linux, the KVM kernel module must be loaded for QEMU to use the processor's virtualization technology.
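As a rough sketch of that rule, the Python below picks between KVM and TCG. The helper names pick_accel and qemu_command are made up for this example, the alias table covers only two common spellings, and the rule is simplified (for instance, it ignores that a 64-bit x86 host with KVM can also virtualize a 32-bit x86 guest). The -accel kvm and -accel tcg options and the /dev/kvm device node are real; everything else is an assumption.

```python
import os
import platform

# Map a few machine names to QEMU's naming (assumption: only these are handled).
_ALIASES = {"amd64": "x86_64", "arm64": "aarch64"}


def normalize(arch):
    arch = arch.lower()
    return _ALIASES.get(arch, arch)


def pick_accel(guest_arch, host_arch=None, kvm_available=None):
    """Return "kvm" when hardware virtualization can be used, else "tcg"."""
    if host_arch is None:
        host_arch = platform.machine()
    if kvm_available is None:
        # On Linux, KVM is exposed through /dev/kvm once the module is loaded.
        kvm_available = os.access("/dev/kvm", os.R_OK | os.W_OK)
    same_arch = normalize(guest_arch) == normalize(host_arch)
    return "kvm" if same_arch and kvm_available else "tcg"


def qemu_command(guest_arch, **kwargs):
    """Build the start of a QEMU command line for the given guest."""
    accel = pick_accel(guest_arch, **kwargs)
    return [f"qemu-system-{normalize(guest_arch)}", "-accel", accel]
```

For an ARM guest on an x86 host this always yields TCG, i.e. pure emulation; for an x86_64 guest on an x86_64 Linux host with /dev/kvm accessible it yields KVM, i.e. virtualization.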

Why can an executable run on both Intel and AMD processors?

How is it that an executable can work on both AMD and Intel systems? Aren't AMD's and Intel's instruction sets different? How does the executable work on both? How exactly do they compile the files to work like that? And what exactly is the role of the OS in all this?
The only real difference between AMD and Intel at a given processor iteration is their implementation of the instruction sets they support. x86 (32 bit) and x64 (64 bit) are the two most common instruction sets for Intel and AMD processors.
The differences come in when Intel and AMD implement the instruction sets in their chips - but these implementations should have no effect on the instruction sets themselves. So if a program was compiled for an x64 processor, it can run on any processor that implements the x64 instruction set, which almost all modern Intel and AMD processors implement.
A great example of an implementation difference is the way that Intel likes to hyperthread cores whereas AMD likes to just add more cores. They do this for a multitude of reasons, such as power consumption and better concurrent processing, but it doesn't really impact if programs run because it doesn't change the instruction set. Another difference between Intel and AMD is the number of pipeline stages, which can affect speed.
Huge complexities come into play when operating systems are considered. Windows has huge libraries that programs have to use if they want to run on Windows. The same goes for Linux and Mac OS X. Since these libraries aren't shared between operating systems, programs written for one operating system probably won't run on another.
These days, compilation essentially targets the OS rather than a specific vendor's hardware, since, as mentioned above, most hardware implements a common instruction set (x86 or x64 machine code). Some programmers do build software designed to run better on certain hardware, i.e. optimized for AMD or Intel, but still ship other versions for other hardware.
What you mainly need to worry about is the bit width (32-bit or 64-bit) and the OS the program will run on.
Most compilers and software makers emit the shared machine code rather than manufacturer-specific code. It should be remembered, though, that different people use the same things in different ways: a group at MIT may decide to write their own OS for their needs and may want to use advanced, vendor-specific features of the Intel instruction set, and some people completely rebuild their own Android systems.