File I/O in a Linux kernel module


I'm writing a Linux kernel module that needs to open and read files. What's the best way to accomplish that?

Can I ask why you are trying to open a file?
I like to follow Linux development (out of curiosity; I'm not a kernel developer, I do Java), and I've seen this question discussed before. I was able to find an LKML message about this, basically saying it's usually a bad idea. I'm almost positive that LWN covered it in the last year, but I'm having trouble finding the article.
If this is a private module (say, for some custom hardware, and the module won't be distributed) then you can do this, but I'm under the impression that if you are going to submit your code to the mainline then it may not be accepted.
Evan Teran mentioned sysfs, which seems like a good idea to me. If you really need to do harder custom stuff you could always add new ioctls.
EDIT:
OK, I found the article I was looking for; it's from Linux Journal. It explains why doing this kind of stuff is generally a bad idea, then goes on to tell you exactly how to do it anyway.

Assuming you can get pointers to the relevant open/read/close system calls, you can do something like this:
mm_segment_t fs = get_fs();
set_fs(KERNEL_DS);                  /* let the syscalls accept kernel-space buffers */
fd = (*syscall_open)(file, flags, mode);
if (fd >= 0) {                      /* in the kernel, failure is a negative errno, not just -1 */
    (*syscall_read)(fd, buf, size);
    (*syscall_close)(fd);
}
set_fs(fs);                         /* restore the original address limit */
You will need to create the "syscall_*" function pointers I have shown, though. I am sure there is a better way, but I believe that this would work.

Generally speaking, if you need to read/write files from a kernel module, you're doing something wrong architecturally.
There exist mechanisms (netlink for example - or just register a character device) to allow a kernel module to talk to a userspace helper process. That userspace helper process can do whatever it wants.
You could also implement a system call (or something similar) that takes a file descriptor opened in userspace and reads or writes it from the kernel.
This would probably be neater than trying to open files in kernel space.
There are some other things which already open files from kernel space, you could look at them (the loop driver springs to mind?).

The /proc filesystem is also good for private use, and it's easy.
http://www.linuxtopia.org/online_books/Linux_Kernel_Module_Programming_Guide/x773.html

All of the kernel developers say that file I/O from kernel space is bad (especially if you're referring to these files by their paths) but the mainstream kernel does this when you load firmware. If you just need to read from files, use the
kernel_read_file_from_path(const char *path, void **buf, loff_t *size, loff_t max_size, enum kernel_read_file_id id)
function, which is what the firmware loader code uses, declared in include/linux/fs.h. It returns a negative value on error.
I'm not really sure what the id argument at the end is for; if you look at the code it's not really used, so just pass something like READING_FIRMWARE (no quotes).
buf is not null-terminated; its length is returned in size. If you need it null-terminated, either allocate a string of size + 1 bytes and copy the data over, or rewrite the kernel_read_file() function (used by kernel_read_file_from_path(), defined in fs/exec.c) and add one to i_size where memory is allocated. (If you go this route, you can define your copy of kernel_read_file() in your module under a different name to avoid modifying the whole kernel.)
If you need to write to files, there is a kernel_write() function (analogous to kernel_read(), which is used by kernel_read_file() and therefore also by kernel_read_file_from_path()), but there is no kernel_write_file() or kernel_write_file_from_path() function. You can look at the code in the fs/exec.c file in the Linux kernel source tree where kernel_read_file() and kernel_read_file_from_path() are defined to write your own kernel_write_file() and kernel_write_file_from_path() functions that you can include in your module.
And as always, you can treat the file's contents as a char pointer instead of a void pointer by casting the buffer.
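The "allocate size + 1 bytes and copy" option above can be sketched as follows. This is a user-space illustration only: the helper name copy_as_c_string is hypothetical, and it uses malloc()/free() so that it compiles as an ordinary program. Inside a module, the kernel allocator (for example kmalloc()/kfree()) would take their place, and the length would arrive as the loff_t that kernel_read_file_from_path() fills in.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated, null-terminated copy of
 * the first `size` bytes of `buf`, or NULL if the allocation fails.
 * The caller is responsible for freeing the result. */
char *copy_as_c_string(const void *buf, size_t size)
{
    char *str;

    assert(buf != NULL || size == 0);

    str = malloc(size + 1);          /* one extra byte for the terminator */
    if (str == NULL)
        return NULL;

    if (size > 0)
        memcpy(str, buf, size);
    str[size] = '\0';
    return str;
}
```

The original buffer is left untouched, so it still has to be released separately once the copy has been made.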

You can also find some information about sys_call_open in this Linux Kernel Module Programming Guide.

Related

Can variables in memory be made immune to malicious memory editing?

Let's say I am making a game and the player's health is stored in a variable float player_hp. While playing the game, a user can open a memory editor like Cheat Engine, do some clever searches through the memory, and find the location of the variable in memory. They can then edit it to be whatever they want, effectively allowing them to cheat.
Is there a fancy way to store my variables that will effectively stop users from maliciously editing them? Should I move them around in memory? Should I encrypt them or hash them?
If it matters, I am primarily using C++.

Disclaimer: I have no experience programming games, only making cheats for them.
Calculate all important variables such as health and ammo server side and replicate that data to the client. Pickups must also be validated by the server. If ammo is kept server side but pickups are handled client side, we'll just call PickUp(ammoCrate) ourselves.
You'll never stop everyone but I think it's important to make it annoying enough for 90% of people to stop trying to cheat. There's some basic stuff every game should have and then more advanced protections that popular games must have to avoid being destroyed by cheaters.
Encrypt commonly modified values such as health with a basic XOR-like algorithm that changes over time. Change the address of the key as well and access it through a pointer; make the pointer chain more than 6 levels deep. This will force cheaters to write a script and use pattern scanning instead of just a cheat table, and that will stop most people.
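As a rough sketch of the XOR idea only (not the pointer-chain part), the value can be stored masked with a key and re-masked whenever the key changes. The question uses C++, but the sketch is written in plain C, which also compiles as C++. The type protected_float and the pf_* functions are made-up names for illustration, and how the keys are chosen and rotated is left open.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container: the float's bit pattern is never stored in the clear. */
typedef struct {
    uint32_t masked;   /* bit pattern of the value XORed with key */
    uint32_t key;      /* current XOR key */
} protected_float;

/* Reinterpret a float as its 32-bit pattern and back, without aliasing issues. */
static uint32_t float_bits(float f)    { uint32_t u; memcpy(&u, &f, sizeof u); return u; }
static float    bits_float(uint32_t u) { float f;    memcpy(&f, &u, sizeof f); return f; }

/* Store value under the given key. */
void pf_set(protected_float *p, float value, uint32_t key)
{
    assert(p != NULL);
    p->key = key;
    p->masked = float_bits(value) ^ key;
}

/* Recover the plain value. */
float pf_get(const protected_float *p)
{
    assert(p != NULL);
    return bits_float(p->masked ^ p->key);
}

/* Change the key over time so the stored bit pattern keeps moving. */
void pf_rekey(protected_float *p, uint32_t new_key)
{
    pf_set(p, pf_get(p), new_key);
}
```

A memory scan for the plain value 100.0f then no longer finds player_hp directly, although the key sitting next to it is exactly why the answer suggests moving the key and hiding it behind pointers.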
Use IsDebuggerPresent() to detect debuggers being attached and then close the process. When you do "Find What Accesses" in Cheat Engine that will attach a debugger and if using default options will be detected by IsDebuggerPresent.
This can be hooked and patched, so the next step is to check the BeingDebugged flag in the Process Environment Block (PEB) manually.
Get the PEB pointer:
PEB* GetPEB()
{
#ifdef _WIN64
    return (PEB*)__readgsqword(0x60);   // 64-bit: the PEB pointer is at gs:[0x60]
#else
    return (PEB*)__readfsdword(0x30);   // 32-bit: the PEB pointer is at fs:[0x30]
#endif
}
Read the debugger flag:
bool IsDebugFlagSet(PEB* peb)
{
    // BeingDebugged is nonzero while a debugger is attached
    return peb->BeingDebugged != 0;
}
If they're hooking IsDebuggerPresent(), you can compare the bytes on disk to the bytes in memory at the beginning of the function, or even easier, just check the first byte and see if it's a relative jmp (x86 example).
Check for hooks:
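bool IsHooked(const char* functionName, const char* dllName)
{
    HMODULE module = GetModuleHandleA(dllName);
    if (!module) return false;
    BYTE* functionAddress = (BYTE*)GetProcAddress(module, functionName);
    // 0xE9 is the opcode of a relative jmp, the usual first byte of an inline hook
    return functionAddress && *functionAddress == 0xE9;
}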
Other tricks:
Hook LoadLibrary() and LdrLoadDll() and stop them from loading rogue DLLs
Scan running processes for memory editors by comparing the image name or image path, obtained via CreateToolhelp32Snapshot(), against a blacklist
Fill all debug registers so a debugger cannot be attached
Beyond these tricks you can use a commercial anticheat or kernel mode driver to protect your game, but the above tips will stop the average person.

Two ways to protect your data:
1) Only provide the user with binary files. Compiled code. Don't give them source code.
2) If your game is an online game, store sensitive data server side.
You should research an idea called "encapsulation" for use in designing APIs such that sensitive data is protected.

How to detect if a function called fopen or not?

I'm trying to write a PAM backdoor scanner. A backdoored module may call fopen in pam_sm_authenticate to store the username and password (a normal module does not call fopen in this function). I can't use external commands such as nm or readelf, so the only way seems to be to scan the pam_sm_authenticate function, find all the call instructions, and calculate each target address to check whether it is calling fopen. That is too troublesome, and I'm not very familiar with the ELF format (I don't even know how to find the offset of pam_sm_authenticate; I'm using dlopen and dlsym to get the address). Is there a better or easier way to detect it? Thank you.

TL;DR: building a robust "pam backdoor scanner" is theoretically impossible, so you should give up now and think about other ways to solve your problem.
Your question is very confusing, but I think what you are asking is: "can I determine programmatically whether pam_sm_authenticate calls fopen".
That is the wrong question to ask, for several reasons:
if pam_sm_authenticate calls foo, and foo calls fopen, then you still have a problem, so you really should be scanning pam_sm_authenticate and every function it calls (recursively),
fopen is far from the only way to write files: you could also use open, or system (as in system("echo $secret > /tmp/backdoor")), or a direct sys_open syscall, or a multitude of other hacks,
finally, pam_sm_authenticate can use just-in-time compilation techniques to build arbitrary code (including code calling fopen) at runtime, and answering whether it does so by examining its code is equivalent to solving the halting problem (i.e. impossible).

How to write in kernel mode to some process's virtual memory

I want to use my Unix module to write to another process's memory (I would like to do it in kernel mode and avoid the pthread interface).
The functions I have to use (do_mmap(..), do_munmap(..), sys_mprotect(..), etc.) affect the current process's memory instead of the memory of the process I'd like them to affect.
So I figured I need to find a way to do a context switch to the target process in order to make it the current one. I tried to copy the implementation of schedule() with a minor change:
I replaced the line:
next = pick_next_task(rq);
with:
next = myNext;
My problem is that schedule() requires so many structs and functions which I can't include that I would have to re-implement them, which seems like a pretty bad thing to do. Do you have any suggestions?
I want to avoid modifying the existing kernel, so that I won't have to force users to modify their operating system and restart in order to use my program (which is why I use modules).
By the way, I use the "2.6.38-11-generic" version of Linux.

Use the get_user_pages() function to get the pages of the target process (more precisely, its mm_struct)
Map the page(s) that you need via kmap() or kmap_atomic() (depending on the context)
Write/read at the address returned by the mapping (within a page size).
Destroy the mapping via kunmap() or kunmap_atomic()

Using open source SNES emulator code to turn a rom file into a self-contained executable game

Would it be possible to take the source code from a SNES emulator (or any other game system emulator for that matter) and a game ROM for the system, and somehow create a single self-contained executable that lets you play that particular ROM without needing either the individual rom or the emulator itself to play? Would it be difficult, assuming you've already got the rom and the emulator source code to work with?

It shouldn't be too difficult if you have the emulator source code. You can use a method that is often used to store images in c source files.
Basically, what you need to do is create a char * variable in a header file, and store the contents of the rom file in that variable. You may want to write a script to automate this for you.
Then, you will need to alter the source code so that instead of reading the rom in from a file, it uses the in memory version of the rom, stored in your variable and included from your header file.
It may require a little bit of work if you need to emulate file pointers and such, or you may be lucky and find that the rom loading function just loads the whole file in at once. In this case it would probably be as simple as replacing the file load function with a function to return your pointer.
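A minimal sketch of the idea, assuming the emulator's loader can be replaced by a function that hands back a pointer and a length. The name load_rom and the six placeholder bytes are made up for illustration; a real array would be generated from the ROM file by a script or a tool such as xxd -i.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder bytes standing in for the real ROM image; in practice this
 * array would live in a generated header. */
static const unsigned char rom_data[] = { 0x78, 0x18, 0xFB, 0x4C, 0x00, 0x80 };

/* Hypothetical replacement for the emulator's file-loading routine: instead
 * of reading from disk, return the embedded image and report its size. */
const unsigned char *load_rom(size_t *size_out)
{
    assert(size_out != NULL);
    *size_out = sizeof rom_data;
    return rom_data;
}
```

Declaring the data as an array rather than a bare pointer lets sizeof report the ROM length, so no separate length constant has to be kept in sync.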
However, be careful for licensing issues. If the emulator is licensed under the GPL, you may not be legally allowed to store a proprietary file in the executable, so it would be worth checking that, especially before you release / distribute it (if you plan to do so).

Yes, more than possible; it has been done many times. Google: static binary translation. Graham Toal has a good how-to paper on the subject, which should show up early in the hits. I may have left some code out there as well.
Completely removing the ROM may be a bit more work than you think, but not using an emulator is definitely possible. Actually, both requirements are possible, and you may be surprised how many of the handheld console games or set-top box games are translated and not emulated, especially on platforms like those from Nintendo where there isn't enough processing power to emulate in real time.
You need a good emulator as a reference and/or to write your own emulator as a reference. Then you need to write a disassembler, and then have that disassembler generate C code (please don't try to translate directly to another target; I made that mistake once. C is portable, and the compilers will take care of a lot of dead code elimination for you). So an instruction of a make-believe instruction set might be:
add r0,r0,#2
And that may translate into:
//add r0,r0,#2
r0=r0+2;
do_zflag(r0);
do_nflag(r0);
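To make the translated fragment concrete, here is one way the pieces it relies on might be defined. Everything beyond the names r0, do_zflag and do_nflag is an assumption: the make-believe machine is taken to have 8-bit registers, a zero flag set when the result is 0, and a negative flag that mirrors bit 7 of the result.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed machine state for the make-believe instruction set. */
static uint8_t r0;
static int zflag;   /* 1 when the last result was zero */
static int nflag;   /* 1 when bit 7 of the last result was set */

static void do_zflag(uint8_t value) { zflag = (value == 0); }
static void do_nflag(uint8_t value) { nflag = (value & 0x80) != 0; }

/* Generated C for: add r0,r0,#2 */
void translated_add_r0_2(void)
{
    r0 = (uint8_t)(r0 + 2);   /* wrap around like an 8-bit register */
    do_zflag(r0);
    do_nflag(r0);
}
```

If the translator can prove that nothing reads the flags before the next instruction overwrites them, the compiler can drop those updates, which is the dead code elimination mentioned above.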
It looks like the SNES is related to the 6502, which is what Asteroids used, and that is the translation I have been working on, off and on, as a hobby for a while now. The emulator you are using is probably written and tuned for runtime performance, and may be difficult at best to use as a reference and to check in lock step with the translated code. The 6502 is nice because, compared to say the Z80, there really are not that many instructions.
As with any variable-word-length instruction set, the disassembler is your first big hurdle. Do not think linearly; think in execution order, think like an emulator. You cannot linearly translate instructions from zero to N or from N down to zero. You have to follow all the possible execution paths, marking each byte in the ROM as either the first byte of an instruction or not. Some bytes you can decode as data, and you can mark those if you choose; otherwise assume all other bytes are data or fill.
Figuring out what to do with this data is the hard part of getting rid of the ROM. Some code addresses data directly; other code uses register-indirect addressing, meaning that at translation time you have no idea where that data is or how much of it there is.
Once you have marked all the starting bytes for instructions, it is a trivial task to walk the ROM from zero to N, disassembling and/or translating.
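The path-following pass described above can be sketched with a worklist. The instruction set here is invented purely for illustration and is not the 6502: NOP (0x00) and RET (0x03) are one byte, JMP (0x01) and BEQ (0x02) are two bytes with a one-byte absolute target, and any other opcode is treated as a one-byte instruction. A real disassembler would also have to deal with indirect jumps, which this sketch ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy opcodes, assumed for this sketch only. */
enum { OP_NOP = 0x00, OP_JMP = 0x01, OP_BEQ = 0x02, OP_RET = 0x03 };

#define ROM_MAX 256   /* one-byte targets cannot address more than this */

/* Sets is_start[i] to 1 for every byte that begins an instruction reachable
 * from `entry`, following execution order rather than walking linearly. */
void mark_instruction_starts(const uint8_t *rom, size_t len, size_t entry,
                             uint8_t *is_start)
{
    size_t worklist[ROM_MAX];   /* addresses still to be explored */
    size_t pending = 0;

    assert(len <= ROM_MAX);
    memset(is_start, 0, len);
    if (entry < len)
        worklist[pending++] = entry;

    while (pending > 0) {
        size_t pc = worklist[--pending];

        /* Follow one path until it ends or joins code already visited. */
        while (pc < len && !is_start[pc]) {
            uint8_t op = rom[pc];
            is_start[pc] = 1;

            if (op == OP_RET)
                break;                        /* path ends here */
            if (op == OP_JMP || op == OP_BEQ) {
                size_t target;
                if (pc + 1 >= len)
                    break;                    /* operand runs off the ROM */
                target = rom[pc + 1];
                if (target < len && !is_start[target])
                    worklist[pending++] = target;
                if (op == OP_JMP)
                    break;                    /* unconditional: no fall-through */
                pc += 2;                      /* conditional: also fall through */
            } else {
                pc += 1;
            }
        }
    }
}
```

Each instruction is visited at most once and queues at most one target, so the fixed-size worklist cannot overflow. Bytes left unmarked are operands, data, or fill.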
Good luck, enjoy, it is well worth the experience.

How do I create an in-memory handle in Haskell?

I want something that looks like a file handle but is really backed by an in-memory buffer to use for I/O redirects. How can I do this?

I just wrote a library which provides this, called "knob" [hackage]. You can use it to create Handles which reference/modify a ByteString:
import Data.ByteString (pack)
import Data.Knob
import System.IO

main = do
    knob <- newKnob (pack [])
    h <- newFileHandle knob "test.txt" WriteMode
    hPutStrLn h "Hello world!"
    hClose h
    bytes <- Data.Knob.getContents knob
    putStrLn ("Wrote bytes: " ++ show bytes)

If you can express what you want to do in terms of C or system calls, you could use Haskell's Foreign Function Interface (FFI). I started to suggest using mmap, but on second thought I think mmap maps in the wrong direction, even if you use it with the anonymous option.
You can find more information about the Haskell FFI at the haskell.org wiki.

This is actually a bug in the library design, and one that's annoyed me, too. I see two approaches to doing what you want, neither of which is terribly attractive.
Create a new typeclass, make the current handle an instance of it, write another instance to do the in-memory-data thing, and change all of your programs that need to use this facility. Possibly this is as simple as importing System.SIO (or whatever you want to call it) instead of System.IO. But if you use the custom I/O routines in libraries such as Data.ByteString, there's more work to be done there.
Rewrite the I/O libraries to extend them to support this. Not trivial, and a lot of work, but it wouldn't be particularly difficult work to do. However, then you've got a compatibility issue with systems that don't have this library.

This may not be possible. GHC, at least, seems to require a handle to have an OS file descriptor that is used for all read/write/seek operations.
See /libraries/base/IOBase.lhs from the GHC sources.
You may be able to get the same effect by enlisting the OS's help: create a temporary file, connect the handle to it and then memory map the file for the I/O redirects. This way, all the handle I/O would become visible in the memory mapped section.

To add a modern answer to this question, you could use createPipe from System.Process:
createPipe :: IO (Handle, Handle)
https://www.stackage.org/haddock/lts-10.3/process-1.6.1.0/System-Process.html#v:createPipe

It's not possible without modifying the compiler. This is because Handle is an abstract data type, not a typeclass.
