Meaning of "loading a process into its address space"


I'm reading about operating systems. What exactly does it mean that a process is loaded into its address space?
I know that each process has its own address space and that the process only sees virtual addresses.
But I have trouble with the formulation "load into the address space". What exactly does that mean?
Does it mean that e.g. the variables used by the process are assigned certain virtual memory addresses?

Every OS supports some specific binary formats, e.g. Unix-like systems use ELF and Windows uses PE (.exe files). When you double-click the binary file, the OS creates a process and loads the contents of the binary stored on disk (its code and initialized data) into that process's virtual address space, at the virtual addresses the binary format specifies. This is what "a process is loaded into its address space" means.

Related

Does the computer erase the values which have been reserved for variables in the RAM after the program is finished?

We have learned that information on the hard disk is never erased; only the links to it are erased. Does this apply to RAM as well (as long as the computer is on, of course)?
For example, when I declare a variable before I give it a value, the value can be 0 or some other value that was previously in that place.
Can I say that the computer doesn't know how to delete something and just replaces it with something else?

When a process (program/app) starts, it asks the operating system for some memory. The operating system keeps track of the address and size of the memory each process uses. A process can't access a region of memory it doesn't own (this is enforced through virtual address mapping). When the process ends, the operating system may choose to write a value (like 0x00) to all the memory the process had. I believe most existing operating systems don't overwrite memory at that point; mainstream systems such as Linux and Windows instead zero a page before handing it to another process, so one process cannot read another's leftover data.
Take a look at this question for a further insight about this subject.

Where is page table located?

I've been studying paging and page tables. I don't seem to understand where page tables are located.
In one of the answers on Stack Exchange (https://unix.stackexchange.com/questions/487052/where-is-page-table-stored-in-linux), it is said that page tables are in kernel address space, which is in virtual memory (from what I understood).
However, in lecture slides from the University of Illinois (https://courses.engr.illinois.edu/cs241/sp2014/lecture/09-VirtualMemory_II_sol.pdf), page tables seem to be in RAM, which is physical memory.
Can anyone tell me clearly where the page tables are stored?
Thank you in advance.

This question is too broad, and I think it belongs on Super User Stack Exchange.
In x86 systems, page tables are structures used by the CPU, but they are too large to be held in registers, so they are kept in RAM.
Every process has a memory map with two big zones: user space and kernel space. Kernel space is the same for all processes. User space is private to each process. On 32-bit x86 Linux systems (with the default 3 GB/1 GB split), any virtual address equal to or greater than 0xC0000000 belongs to the kernel. Below that address, it's user space.
The page table of the process is held in kernel space. The kernel may have several page tables in RAM, but only one is the active page table. On x86 CPUs, it's the page table pointed to by register CR3. So both statements are true: the page tables physically live in RAM, and the kernel reaches them through addresses in its own (kernel) part of the virtual address space.
There is a more detailed explanation of how it works here: https://stackoverflow.com/a/20792205/3011009
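
As a small illustration of the user/kernel split described in this answer, here is a minimal sketch. It assumes the classic 3 GB/1 GB layout of 32-bit x86 Linux; the function name `address_region` and the example addresses are made up for illustration.

```python
# Assumption: 32-bit x86 Linux with the default split at 0xC0000000.
KERNEL_BASE = 0xC0000000


def address_region(vaddr: int) -> str:
    """Classify a 32-bit virtual address as belonging to user or kernel space."""
    if not 0 <= vaddr <= 0xFFFFFFFF:
        raise ValueError("not a 32-bit address")
    return "kernel" if vaddr >= KERNEL_BASE else "user"


# Illustrative addresses only, not taken from a real process.
print(hex(0x08048000), address_region(0x08048000))
print(hex(0xC0100000), address_region(0xC0100000))
```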

I think your problem is in understanding virtual versus physical memory.
As the name suggests, virtual memory is not real. The idea behind virtual memory is that each process sees the whole addressable range as memory available to it. For example, on a 64-bit system a process might see 2^64 bytes as the memory available to it, and another process may see the same thing. So with virtual memory every process sees one contiguous memory, which may be much bigger than the physical memory actually installed in the system. Every address in virtual memory then has to be translated to the corresponding physical address using something called page tables.
Pages are blocks of cells (addresses). For example, let's say that the available (physical) memory in a system is 2 GB, and the page size has been chosen as 4 KB. In this case a 4 KB page contains 4096 different cells or addresses, which we can address using 12 bits, since we have:
2^12 = 4096
If the overall memory is 2 GB, then we could have:
2GB/4KB = 524288
which means we could have 524288 different pages in physical memory. Some of these pages are assigned only to the operating system, which means only the OS can access them; they hold the code and data of the operating system itself, which supports the execution of every other program. The other pages are available for other processes.
Now let's say we have an address like this in virtual memory:
0x000075fe
First of all, we said that we need 12 bits to give the position of an address within the page, since the page is 4 KB. Here this position (the offset) is 0x5fe. The operating system, or whatever does the memory management, does not translate this OFFSET: the position of an address within the virtual page is the same as within the physical page. I think this is one of the main features that keeps translation cheap, because only page numbers need to be mapped. The rest of the address is the virtual page number, which has to be translated to the corresponding physical page:
0x00007
For this, the page table has to be consulted. As we said, it is just a table in kernel memory, which is not accessible from user space. For example, it could look something like this:
virtual page    physical page
0x00001         0x00004
0x00002         disk  (the addresses in this page are currently on disk)
0x00007         0x004fe
So virtual page 0x00007 is translated to physical page 0x004fe, and therefore the address
0x000075fe in virtual memory is translated to
0x004fe5fe in physical memory, which is the address at offset 0x5fe within physical page number 0x004fe (offsets start at zero).
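
The worked example above can be sketched in a few lines. This is only a toy model of a single-level page table, using the same 4 KB page size and the same three example entries; the names `translate` and `PageFault` are made up for illustration, and real hardware uses multi-level tables walked by the MMU.

```python
PAGE_SIZE = 4096      # 4 KB pages, as in the example
OFFSET_BITS = 12      # 2**12 == 4096

# Toy page table: virtual page number -> physical page number.
# None marks a page whose contents are currently on disk.
page_table = {
    0x00001: 0x00004,
    0x00002: None,
    0x00007: 0x004FE,
}


class PageFault(Exception):
    """Raised when a page is not mapped or not present in physical memory."""


def translate(vaddr: int) -> int:
    """Translate a virtual address to a physical address using page_table."""
    vpn = vaddr >> OFFSET_BITS           # upper bits: virtual page number
    offset = vaddr & (PAGE_SIZE - 1)     # lower 12 bits: copied unchanged
    frame = page_table.get(vpn)
    if frame is None:
        raise PageFault(f"page {vpn:#07x} is unmapped or on disk")
    return (frame << OFFSET_BITS) | offset


print((2 * 1024**3) // PAGE_SIZE)     # number of 4 KB pages in 2 GB: 524288
print(hex(translate(0x000075FE)))     # 0x4fe5fe
```

In a real system the fault raised for page 0x00002 would make the OS read the page back from disk, update the table, and retry the access.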

Is a software image loaded into non-volatile RAM when using tftpboot from U-boot?

I have a Xilinx development board connected to a RHEL workstation.
I have U-boot loaded over JTAG and connect to it with minicom.
Then I tftpboot the helloworld standalone application.
Where do these images go?
I understand I am specifying a loadaddr, but I don't fully understand the meaning.
When I run the standalone application, I get various outputs on the serial console.
I had it working correctly the first time, but then started trying different things when building.
It almost feels like I am clobbering memory, but I assumed after a power cycle anything tftp'd would be lost.
The issue still occurs through a power cycle though.

> Where do these images go?
The U-Boot command syntax is:
tftpboot [loadAddress] [[hostIPaddr:]bootfilename]
You can explicitly specify the memory destination address as the loadAddress parameter.
When the loadAddress parameter is omitted from the command, then the memory destination address defaults to the value of the environment variable loadaddr.
Note that several other U-Boot commands also use this loadaddr variable, such as "bootp", "rarpboot", "loadb" and "diskboot".
> I understand I am specifying a loadaddr, but I don't fully understand the meaning.
> When I run the standalone application, I get various outputs on the serial console.
The loadAddress is simply the start address in memory to which the transferred file will be written.
For a standalone application, this loadAddress should match the CONFIG_STANDALONE_LOAD_ADDR that was used to link this program.
Likewise the "go" command to execute this standalone application program should use the same CONFIG_STANDALONE_LOAD_ADDR.
For example, assume the physical memory of your board starts at 0x20000000.
To allow the program to use the maximum amount of available memory, the program is configured to start at:
#define CONFIG_STANDALONE_LOAD_ADDR 0x20000000
For convenient loading, define the environment variable (at the U-Boot prompt):
setenv loadaddr 0x20000000
Assuming that the serverip variable is defined with the IP address of the TFTP server, then the U-Boot command
tftpboot hello_world.bin
should retrieve that file from the server, and store it at 0x20000000.
Use
go 20000000
to execute the program.
> I assumed after a power cycle anything tftp'd would be lost.
It should.
But what might persist in "volatile" memory after a power cycle is unpredictable. Nor can you be assured of a default value such as all zeros or all ones. The contents of dynamic RAM should always be presumed to be unknown unless you know it has been initialized and has been written.
> Is a software image loaded into non-volatile RAM when using tftpboot from U-boot?
Only if your board has main memory that is non-volatile (e.g. ferrite core or battery-backed SRAM, which are not likely).
You can use the "md" (memory display) command to inspect RAM.

kernel symbols in kernel module

First of all, I need to know whether the addresses in System.map or /proc/kallsyms are virtual or physical.
Then I want to read from the addresses of kernel symbols. For example, I want to read the pid field of the init_task symbol. I can find the init_task address from System.map and also the offset of pid, but I don't know how to read from an address in the kernel.
I would really appreciate any reference or link that explains things in detail, because I'm not familiar with kernel programming.
Another question: what does DKOM (dynamic kernel object manipulation) mean? I searched but only found material about Windows systems!
And when they say you can access exported symbols in an LKM, what operations do they mean? Are there specific functions to read or write kernel symbols?

Just about any pointer address you can see is virtual. It's either in a user space process's virtual address space (namely your process's) or in the kernel virtual address space. It is only when the kernel needs to inform one hardware component how to access another that it will convert the pointer to its physical representation.
It's worth noting that even the physical address space is virtual in the sense that different hardware components are programmatically assigned memory ranges and are expected to react when those are addressed. It is still very physical in the sense that those address values are the ones that are put on the address bus and no software translation is needed.
As for reading/writing kernel pointers from user space: unless it is granted by a specific API and set up by both the user and the kernel (like shared memory), you can't. It's the most basic security protection etched into the core of the operating system. (You can't even access the memory of another user's processes, for that matter.)
Having said that, if you wish to intentionally decrease your kernel security, as root you may do just about anything, including loading a module that does just that...
Here is another discussion on the same topic:
how-to-access-kernel-space-from-user-spacein-linux

First, addresses in System.map or /proc/kallsyms are virtual addresses.
Second, if you'd like to traverse data structures in the kernel, you could use the crash tool. It is based on gdb and easy to use, but you need a kernel built with debug information first. With crash, you can easily read every kernel data structure from user space. It supports multiple distributions, like Ubuntu, Fedora, and so on.
Another tool is Volatility, written in Python. You take a memory snapshot of your system and then read the snapshot with Volatility.
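
To see that these addresses are kernel virtual addresses, one can parse the "address type name [module]" lines that /proc/kallsyms and System.map use. The following is only a sketch: the sample lines and their addresses are made up for illustration, `parse_kallsyms` and `in_kernel_half` are hypothetical helper names, and the range check assumes x86-64, where kernel addresses sit in the upper half of the address space. On a real system, unprivileged users may see all-zero addresses, depending on kernel settings.

```python
# Made-up sample in the /proc/kallsyms line format: address, type, name, [module].
SAMPLE = (
    "ffffffff81000000 T _text\n"
    "ffffffff82a13580 D init_task\n"
    "ffffffffc0123000 t helper_fn\t[example_mod]\n"
)


def parse_kallsyms(text: str) -> dict:
    """Map symbol name -> (address, type letter, module name or None)."""
    symbols = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        address = int(fields[0], 16)
        module = fields[3].strip("[]") if len(fields) > 3 else None
        symbols[fields[2]] = (address, fields[1], module)
    return symbols


def in_kernel_half(address: int) -> bool:
    """Assumption: x86-64, where kernel virtual addresses are >= 0xffff800000000000."""
    return address >= 0xFFFF800000000000


symbols = parse_kallsyms(SAMPLE)
address, sym_type, module = symbols["init_task"]
print(hex(address), sym_type, in_kernel_half(address))
```

To try it on a live Linux system, pass the contents of /proc/kallsyms (read as root) instead of SAMPLE. This only looks up addresses; it does not read kernel memory.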

Running multiple executables linked to 0x400000

I'm interested in operating systems and I have a basic question. Standard PE executable files are linked to 0x400000. My question is: how can the operating system load multiple executables with the same image base, when virtual memory just maps virtual addresses to physical ones? Is it storing the PDE and PTE index of a thread somewhere? Is there some addition to each address before execution starts? How does it work?

Each process gets its own virtual address space, and hence there's no conflict: the same virtual address 0x400000 is mapped to different physical memory in each process. All virtual address spaces that exist at any one time in the system get mapped into the physical address space. Virtual memory that can't be, or currently isn't, mapped onto physical memory is held in the swap file (swap partition, or the like); this is called paging.
During thread switches, when the CPU is about to execute a thread from a different process than it was executing so far, the operating system's scheduler informs the CPU (sets the respective registers) about the new virtual address translation table to use. Thus the CPU thinks there's just one virtual address space at the given time, while the operating system can manage many more, one for each process.
Disclaimer: My answer may be thought of as a bit superficial or imprecise compared to reality. This is for the sake of simplicity, given the nature of the OP's question. Also, these mechanisms are CPU-dependent and operating system-dependent.
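
The idea in this answer can be sketched as a toy model: two processes use the same image base 0x400000, but each has its own page table, and a context switch just changes which table the CPU consults. Everything here (the class names, the executable names, the physical page numbers) is made up for illustration; on x86 the "active table" would be the one pointed to by CR3.

```python
OFFSET_BITS = 12                 # assuming 4 KB pages
IMAGE_BASE = 0x400000            # same link-time base for both executables


class Process:
    def __init__(self, name, page_table):
        self.name = name
        self.page_table = page_table   # virtual page number -> physical page number


class ToyCPU:
    def __init__(self):
        self.active_table = None       # stands in for the page-table base register

    def switch_to(self, process):
        """Context switch: point the CPU at the new process's page table."""
        self.active_table = process.page_table

    def translate(self, vaddr):
        vpn = vaddr >> OFFSET_BITS
        offset = vaddr & ((1 << OFFSET_BITS) - 1)
        return (self.active_table[vpn] << OFFSET_BITS) | offset


# Both images start at virtual page 0x400, backed by different physical pages.
proc_a = Process("a.exe", {0x400: 0x1A2B})
proc_b = Process("b.exe", {0x400: 0x7C0D})

cpu = ToyCPU()
cpu.switch_to(proc_a)
print(proc_a.name, hex(cpu.translate(IMAGE_BASE)))   # 0x1a2b000
cpu.switch_to(proc_b)
print(proc_b.name, hex(cpu.translate(IMAGE_BASE)))   # 0x7c0d000
```

No value is added to the addresses inside the program; the same virtual address simply resolves differently depending on which table is active.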
