Hacky ways to retrieve data from thread local storage - jvm

I've learned something about thread local storage(TLS). From my point of view, this is a totally black box - you give it a key, and it gives you the thread local data back. No idea what the key is like and where these local data is saved.
Recently I've found some hacky ways to retrieve TLS data, but couldn't figure out how they work.
In project async-profiler(at src/vmStructs.cpp line:347), pthread key is an int, why? I read the source code about pthread manipulation of Hotspot JVM and didn't find any clue. Hotspot doesn't specify the type of pthread key (and I don't think it can either). And why we could guarantee to find the pthread key with just a simple integer loop? Not sure this is a TLS question or a Hotspot question. :(
pthread_tis an address of pthread. I think it points to some OS thread instance that we may not parse it directly. But, according to Phoenix87's answer, we could regard pthread_t as an address of pthread_t list in which we could find TID(the return of gettid()) with a simple loop (a simple loop, again). You could find it in Austin (at src/linux/py_proc.h line: 613). How does it work?

Both of these make assumptions about the pthread implementation they are working on.
Such assumptions are not warranted in general, and are almost guaranteed to break with a different implementation.
Don't write code like that unless absolutely necessary, and if you do write code like that, please clearly state exactly which implementation(s) and which versions you support.

Related

Useless use of LOOP_BLOCK_1 symbol in sink context

With a snippet like
perl6 -e 'loop { FIRST say "foo"; last }'
I get
WARNINGS for -e:
Useless use of LOOP_BLOCK_1 symbol in sink context (line 1)
foo
I know how to work around the warning. I'm wondering about what the source of the warning is. I found this open ticket, but it doesn't seem to have received any attention.
What is this warning about?
And what about this is useless?
Version
$ perl6 --version
This is Rakudo version 2018.06 built on MoarVM version 2018.06
implementing Perl 6.c.
It's a bug, a bogus warning.
I know how to work around the warning.
That's the main thing.
I'm wondering about what the source of the warning is.
It's a bogus warning from the compiler.
I found this open ticket, but it doesn't seem to have received any attention.
I think it got some attention.
bbkr, who filed the bug, linked to another bug in which they showed their workaround. (It's not adding do but rather removing the FIRST phaser and putting the associated statement outside of the loop just before it.)
If you follow the other links in bbkr's original bug you'll arrive at another bug explaining that the general "unwanted" mechanism needs to be cleaned up. I imagine available round tuits are focused on bigger fish such as this overall mechanism.
Hopefully you can see that it's just a bizarre warning message and a minor nuisance in the bigger scheme of things. It appears to come up if you use the FIRST phaser in a loop construct. It's got the very obvious work around which you presumably know and bbkr showed.
What is this warning about?
Many languages allow you to mix procedural and functional paradigms. Procedural code is run for its side effects. Functional code for its result. Some constructs can do both.
But what if you use a construct that's normally used with the intent of its result being used, and the compiler knows that, but it also knows it's been used in a context in which its value will be ignored?
Perls call this "useless use of ... in sink context" and generally warn the coder about it. ("sink" is an alternative/traditional term for what is often called "void" context in other language cultures.)
This error message is one of these warnings, albeit a bogus one.
And what about this is useless?
Nothing.
The related compiler warning mechanism has gotten confused.
The "Useless use of ... in sink context" part of the message is generic and hopefully self-explanatory.
But there's no way it should be saying things like "LOOP_BLOCK_1 symbol". That's internal mumbo-jumbo.
It's a warning message bug.

How to detect if a function called fopen or not?

I'm trying to write a pam backdoor scanner, which may call fopen function in pam_sm_authenticate(normal file will not call fopen in this function) to store username and password, but I can't use external command such as "nm, readelf" or something like that, so the only way seems to scan pam_sm_authenticate function and find all call instructions and caculate the address to check if it is calling fopen, but it is too troublesome and i'm not very familiar with ELF file(I even dont know how to find offset of pam_sm_authenticate, I'm useing dlopen and dlsym to get the address..), so I wonder if there is a better or easy way to detect it? Thankyou.
TL;DR: building a robust "pam backdoor scanner" is theoretically impossible, so you should give up now and think about other ways to solve your problem.
Your question is very confusing, but I think what you are asking is: "can I determine programmatically whether pam_sm_authenticate calls fopen".
That is the wrong question to ask, for several reasons:
if pam_sm_authenticate calls foo, and foo calls fopen, then you still have a problem, so you really should be scanning pam_sm_authenticate and every function it calls (recursively),
the fopen is far from the only way to write files: you could also use open, or system (as in system("echo $secret > /tmp/backdoor"), or direct sys_open syscall, or a multitude of other hacks.
finally, the pam_sm_authenticate can use just-in-time compilation techniques to build arbitrary code (including code calling fopen) at runtime, and answering whether it does by examining its code is equivalent to solving the halting problem (i.e. impossible).

How does a OS X Launch Agent persist settings/state?

I'm writing an OS X launch agent (which watches, as it happens, for FSEvents);
it therefore has no UI and is not started from a bundle – it's just a program.
The relevant documentation and sample
code
illustrates persisting FS event IDs between invocations, and does so using
NSUserDefaults. This is clear the Right Thing To Do(TM).
The documentation for NSUserDefaults in the
Preferences and Settings Programming Guide
would appear to be the appropriate thing to read.
This shows only the Application and Global domains as being persistent, but (fairly
obviously) only the Application domain is writable for an application. However
the preferences in the application domain are keyed on the
ApplicationBundleIdentifier, which a launch agent won't have. So I'm at a loss
how such an agent should persist state.
All I can think of is that the Label in the launchd job can act as the
ApplicationBundleIdentifier – it at least has the correct form. But I can't see any
hint that that's correct, in the documentation.
The obvious (unix-normal) thing to do would be to write to a dot-file
in $HOME, but that's presumably not the Cocoa Way. Google searches
on 'osx daemon preferences', and the like, don't show up anything useful,
or else my google-fu is sadly lacking today. Googling for 'set application
bundle identifier' doesn't turn up anything likely, either.
NSUserDefaults:persistentDomainForName looks like it should be relevant,
but I can't work out its intentions from its method documentation.
I've found one question here which seems relevant, but while it's tantalisingly close, it doesn't say where the daemon gets its identifier from.
I've limited experience with Objective-C and Cocoa, which means that by
now I rather suspect I'm barking up the wrong tree, but don't really know where
to look next.
You can (and imo should) have an Info.plist even in a single-file executable.(see http://www.red-sweater.com/blog/2083/the-power-of-plist)
However, NSUserDefaults is a little more questionable. Conceptually, it's intended for user settings, rather than internal state. However, there's no real reason it wouldn't be suited to this, so I'd probably go ahead and do so.

Why Decompilers cant produce original code theoretically

I searched the internet but did not find a concrete answer that why decompilers are unable to produce original source code. I dint get a satisfactory answer. Somewhere it was written that it is similar to halting problem but dint tell how. So what is the theoretical and technical limitation of creating a decompiler which is perfect.
It is, quite simply, a many-to-one problem. For example, in C:
b++;
and
b+=1;
and
b = b + 1;
may all get compiled to the same set of operations once the compiler and optimizer are done. It reorders things, drops in-effective operations, and rewrites entire sections of code. By the time it is done, it has no idea what you wrote, just a pretty good idea what you intended to happen, at a raw-CPU (or vCPU) level.
It is even smart enough to remove variables that aren't needed:
{
a=5;
b=func();
c=a+b;
d=func2(c);
}
## gets rewritten as:
REGISTERA=func()
REGISTERA+=5
return(func2(REGISTERA))
For starters, the variable names are never preserved when your program is compiled. ...so the best it could possibly do would be to use meaningless variable names throughout your re-constituted program. Compiling is generally a one-way transformation - like a one-way hashing function. Like the hash, it may be possible to generate something else that could hash to the same value, but it's highly unlikely the decompiled program will be the exact same as your original.
Compilers throw out information; not all the information that is in the source code is in the compiled code. For example in compiled Java, you can't tell the difference between a parameterized and unparameterized generic type because the information is only used by the compiler; some annotations are only used at compile time and are not included in the compiled output. That doesn't mean you couldn't get some sort of source code by decompiling; it just wouldn't match nor would be as informative as the actual source code.
There is usually not a 1-to-1 correspondence between source code and compiled code. If an essentially infinite number of possible sources could result in the same object code (given unbounded variable name lengths, etc.), how is a decompiler to guess which one to spit out?

File I/O in a Linux kernel module

I'm writing a Linux kernel module that needs to open and read files. What's the best way to accomplish that?
Can I ask why are you trying to open a file?
I like to follow Linux development (out of curiosity, I'm not a kernel developer, I do Java), and I've seen discussion of this question before. I was able to find a LKML message about this, basically mentioning it's usually a bad idea. I'm almost positive that LWN covered it in the last year, but I'm having trouble finding the article.
If this is a private module (like for some custom hardware and the module won't be distributed) then you can do this, but I'm under the impression that if you are going to submit your code to the mainline then it may not be accepted.
Evan Teran mentioned sysfs, which seems like a good idea to me. If you really need to do harder custom stuff you could always make new ioctrls.
EDIT:
OK, I found the article I was looking for, it's from Linux Journal. It explains why doing this kind of stuff is generally a bad idea, then goes on to tell you exactly how to do it anyway.
assuming you can get pointers to the relavent function pointers to the open/read/close system calls, you can do something like this:
mm_segment_t fs = get_fs();
set_fs(KERNEL_DS);
fd = (*syscall_open)(file, flags, mode);
if(fd != -1) {
(*syscall_read)(fd, buf, size);
(*syscall_close)(fd);
}
set_fs(fs);
you will need to create the "syscall_*" function pointers I have shown though. I am sure there is a better way, but I believe that this would work.
Generally speaking, if you need to read/write files from a kernel module, you're doing something wrong architecturally.
There exist mechanisms (netlink for example - or just register a character device) to allow a kernel module to talk to a userspace helper process. That userspace helper process can do whatever it wants.
You could also implement a system call (or such like) to take a file descriptor opened in userspace and read/write it from the kernel.
This would probably be neater than trying to open files in kernel space.
There are some other things which already open files from kernel space, you could look at them (the loop driver springs to mind?).
/proc filesystem is also good for private use, and it's easy.
http://www.linuxtopia.org/online_books/Linux_Kernel_Module_Programming_Guide/x773.html
All of the kernel developers say that file I/O from kernel space is bad (especially if you're referring to these files by their paths) but the mainstream kernel does this when you load firmware. If you just need to read from files, use the
kernel_read_file_from_path(const char *path, void **buf, loff_t *size, loff_t max_size, enum kernel_read_file_id id)
function, which is what the firmware loader code uses, declared in include/linux/fs.h. This function returns a negative value on error.
I'm not really sure of the point of the id variable at the end, if you look at the code it's not really used, so just put something like READING_FIRMWARE there (no quotes).
buf is not null terminated, instead refer to its size in size. If you need it to be null terminated, create a string size + 1 bytes long and copy it over or rewrite the kernel_read_file() function (used by kernel_read_file_from_path(), defined in fs/exec.c) and add one to i_size where memory is allocated. (If you want to do this, you can redefine the kernel_read_file() function in your module with a different function name to avoid modifying the whole kernel.)
If you need to write to files, there is a kernel_write() function (analogous to kernel_read(), which is used by kernel_read_file() and therefore also by kernel_read_file_from_path()), but there is no kernel_write_file() or kernel_write_file_from_path() function. You can look at the code in the fs/exec.c file in the Linux kernel source tree where kernel_read_file() and kernel_read_file_from_path() are defined to write your own kernel_write_file() and kernel_write_file_from_path() functions that you can include in your module.
And as always, you can store a file's contents in a char pointer instead of a void pointer with this function by casting it.
You can also find some informations about sys_call_open in this Linux Kernel Module Programing Guide.