Is this a reasonable "Application entry point"? - definition

I have recently come across a situation where code is dynamically loading some libraries, wiring them up, then calling what is termed the "application entry point" (one of the libraries must implement IApplication.Run()).
Is this a valid "Application entry point"?
I would always have considered the application entry point to come before the loading of the libraries, so I found it slightly misleading that IApplication.Run() is only called after a considerable amount of work has already been done.

The terms application and system are used so widely and diversely that you need to agree up front with your conversation partner on what they mean. E.g. sometimes an application is something with a UI, and a system is 'UI-less'. In general it's just a case of you say potato, I say potahto.
As for the example you use: that's just what a runtime (e.g. .NET or java) does: loading a set of libraries and calling the application entry point, i.e. the "main" method.
So in your case, the code loading the libraries is doing just the same, probably calling a method on an interface; you could then consider the loading code to be the runtime for that application. It's just a matter of perspective.
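To make that concrete, here is a hedged sketch of such a loader-as-runtime (POSIX flavour; on Windows you would use LoadLibrary/GetProcAddress instead). IApplication comes from the question; create_application and app.so are invented for the example:

    // Hypothetical host that plays "runtime": it loads a library dynamically,
    // wires it up, and only then calls the application entry point.
    #include <dlfcn.h>

    struct IApplication {
        virtual int Run() = 0;            // the "application entry point"
        virtual ~IApplication() = default;
    };

    int main() {
        void* lib = dlopen("./app.so", RTLD_NOW);   // load the library
        if (!lib) return 1;
        // "create_application" is an assumed exported factory function.
        auto create = reinterpret_cast<IApplication* (*)()>(
            dlsym(lib, "create_application"));
        IApplication* app = create();               // wiring-up phase
        return app->Run();                          // hand over control
    }

From the loaded library's point of view, Run() is its main(); from the OS's point of view, the host's main() is still the entry point.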

The term "application" can mean whatever you want it to mean. "Application" merely means a collection of resources (libraries, code, images, etc) that work together to help you solve a problem.
So to answer your question, yes, it's a valid use of the term 'application'.

On its own, "application" actually means nothing. It is often used by people to talk about computer programs that provide some value to the user. A more correct term is application software, which has the following definition:
Application software is a subclass of computer software that employs the capabilities of a computer directly and thoroughly to a task that the user wishes to perform. This should be contrasted with system software, which is involved in integrating a computer's various capabilities, but typically does not directly apply them in the performance of tasks that benefit the user. In this context the term application refers to both the application software and its implementation.
And since application really means application software, and software is any piece of code that performs any kind of task on a computer, I'd say a library can also be an application.
Most terms are of an artificial nature anyway. Is a plugin not an application? Is the Flash plugin of your browser not an application? People say no, it's just a plugin. Why? Because it can't run on its own, it needs to be loaded into a real process. But there is no definition saying only things that "can run on their own" are applications. The same holds true for a library.

The core application could just be an empty container, and all logic and functionality, even the interaction with the user, could be performed by plugins or libraries; in that case the plugins would be more of an application than the empty container that merely provides some context for them to run. Compare this to Java. A Java application can't run on its own, it must run within a Java Virtual Machine (JVM). Does that mean the JVM is the application and the Java code is just... well, what? Isn't the Java code the real application, and the JVM just an empty runtime environment that provides nothing to the end user without the loaded Java code?

I think in this context "application entry point" means "the point at which the application (your code) enters the library".

I think what you're probably referring to is the main() function in C/C++ code, or WinMain in a Windows app; that is, the point where execution normally starts in an app. Your question is pretty broad and vague (for example, which OS are you running this on?), but this may be what you're looking for. This might also address the question.
Bear in mind when you're asking questions, details are your friend. People can give you a much better, more informed answer when you provide them with details.
EDIT:
In a broader context, consider what has to happen from the standpoint of the OS. When the user specifies that they want to run an app, the OS has to load the app from the hard drive, and once the app is loaded into memory, it has to pass control to some point in the memory block occupied by the newly loaded app to continue execution. That would be the "Application Entry Point". When an app is built from dynamically linked code, the OS has to load all of that dynamically linked code in order to get the correct app image into memory. Loading up those shared bits of code does not change the fact that the OS must have a point to which to pass control once the app is loaded into memory.
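For illustration, a minimal sketch of that hand-over point from the program's side (nothing here is OS-specific):

    #include <cstdio>

    // By the time control reaches main(), the OS loader has already mapped
    // the executable and every dynamically linked library it depends on;
    // main() is simply where control is passed once loading is done.
    int main() {
        std::printf("loader finished; application entry point reached\n");
        return 0;
    }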

Related

How to execute an untrusted function efficiently in a cross-platform way?

I am writing an open source, cross-platform application in C++ that targets Windows, Mac, and Linux on x86 CPUs. The application produces a stream of data (integers) that needs to be validated, and my application will perform actions depending on the validation result. There are multiple validators, which we shall call "modules", and they can be swapped out for one another.
Anybody can write and share modules with other users, so my application has to ensure that maliciously-written modules cannot harm the user in any way (perhaps except via high CPU usage, in which case my application should be able to kill the module after some amount of time - this can be done by using a surrogate process). Furthermore, the stream of data is being sent at a high rate (up to 100kB/s).
Fortunately, the code in these modules is usually simple arithmetic operations on data in the stream (usually processing each incoming integer in constant time), and they do not need to make any system calls (not even heap allocation).
I've considered the following possibilities (all of them with some drawbacks):
Kernel-based sandboxing
On Linux, we can use secure computing (seccomp), which prevents a process from making any system calls except for reading and writing with already-open file descriptors. Module creators would write their modules as a single function that takes input and output file descriptors (in a language like C or C++), compile it into a shared object, then distribute that shared object.
My application will probably prepare input and output file descriptors, then fork() itself or exec() a surrogate process, and this child process uses dlopen() and dlsym() to get a pointer to the untrusted function. Then strict secure computing mode will be enabled, before executing the untrusted function.
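A hedged sketch of that flow (Linux only; the exported symbol name module_entry and the omission of all error handling are simplifications for illustration):

    // Child process: load the untrusted module, lock the process down with
    // strict seccomp, then run the untrusted function. After the prctl()
    // call, any syscall other than read/write/_exit/sigreturn kills the
    // process with SIGKILL.
    #include <dlfcn.h>
    #include <linux/seccomp.h>
    #include <sys/prctl.h>
    #include <unistd.h>

    typedef void (*module_fn)(int in_fd, int out_fd);

    void run_module_sandboxed(const char* path, int in_fd, int out_fd) {
        if (fork() == 0) {
            void* lib = dlopen(path, RTLD_NOW);   // NOTE: runs constructors!
            module_fn entry =
                reinterpret_cast<module_fn>(dlsym(lib, "module_entry"));
            prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT);
            entry(in_fd, out_fd);                 // sandboxed from here on
            _exit(0);
        }
    }

Note that dlopen() has already executed the library's constructors by the time the sandbox is armed, which is exactly the hole described next.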
Drawbacks: There's the problem that dlopen() will actually run the constructor functions from the shared library. These would have to be properly sandboxed as well, and I can't think of a way to do so. Also, of course, this approach will only work on Linux. As far as I know, there is no way to ban WinNT system calls on Windows, so a similar solution on Windows won't be very secure.
Application-level sandboxing
[[ Any form of application-level sandboxing means that we cannot run untrusted machine code of any form. An untrusted function can overwrite its return value or data outside its call stack, thereby compromising the whole application (and effectively acquiring any permissions that the original application had). ]]
Make modules use a simple scripting language that does not support any system calls - just pure arithmetic operations and perhaps the ability to read an input stream. My application would contain an interpreter for this language.
Drawbacks: Unfortunately, I have not found such a scripting language. Many scripting languages have extensive functionality (e.g. Python), and their sandboxes (e.g. PyPy's sandbox) simply filter OS system calls. I would be shipping a lot of useless interpreter code with my application, and it is arguably more prone to security issues due to bugs in the interpreter than a language that simply has no functionality for anything other than simple calculations and control flow (basically a function that does not make any system calls). Furthermore, marshalling the data between C++ (machine code) and the scripting language is usually a slow process.
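For a sense of how small such a language can be, here is a hedged sketch of an interpreter whose only capability is integer arithmetic, so there is no syscall surface to filter in the first place (grammar and names invented for illustration):

    // Minimal recursive-descent evaluator for +, -, *, / and parentheses.
    // No I/O, no allocation, no escape hatch into the host process.
    #include <cctype>
    #include <cstdio>

    struct Parser {
        const char* p;                       // cursor into the source text
        long factor() {                      // number | '(' expr ')'
            if (*p == '(') { ++p; long v = expr(); ++p; return v; }
            long v = 0;
            while (std::isdigit(static_cast<unsigned char>(*p)))
                v = v * 10 + (*p++ - '0');
            return v;
        }
        long term() {                        // factor (('*'|'/') factor)*
            long v = factor();
            while (*p == '*' || *p == '/') {
                char op = *p++;
                long r = factor();
                v = (op == '*') ? v * r : v / r;
            }
            return v;
        }
        long expr() {                        // term (('+'|'-') term)*
            long v = term();
            while (*p == '+' || *p == '-') {
                char op = *p++;
                long r = term();
                v = (op == '+') ? v + r : v - r;
            }
            return v;
        }
    };

    int main() {
        Parser parser{"1+2*(3+4)"};
        std::printf("%ld\n", parser.expr()); // prints 15
    }

A real module language would also need input-stream reads, a loop construct, and a guard against division by zero, but the principle stays the same: the interpreter simply has no operations that touch the OS.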
Distribute modules with a 'safe' compiled language that again does not support any system calls. My application would contain a JIT for this language.
Marshalling won't be necessary because my application would call into the JITted machine code of the untrusted module, so performance across this boundary should be fast. The untrusted module now won't be able to corrupt the stack, attempt return-oriented programming, or perform any other malicious actions, due to the language restrictions and checks of the 'safe' language. WebAssembly is the first and only language that comes to mind (if it can be called a language). (As far as I can tell, WebAssembly seems to provide the security guarantees for my use case, right?)
Drawbacks: The existing implementations of WebAssembly seem to be all browser-based, so I would have to steal an implementation from an open source browser. This does seem like a lot of work, considering that I would have to uncouple it from all the JavaScript and other browser bits. However, a standalone WebAssembly JIT based on LLVM seems to be under development.
Question:
What is the best way to execute an untrusted function efficiently that works on Windows, Mac, and Linux?
Right now, I think that the scripting language way would probably be the safest, and be the easiest for module writers. But for a more efficient solution, WebAssembly is probably better. Am I right, or are there better or easier solutions that I have not thought of?
(Remark: I think several pairs of tags used in this question have never been seen together before!)
Regarding WebAssembly:
Unfortunately, there is no production-quality stand-alone implementation yet. I expect some to show up in the future, but it hasn't happened yet.
For historical reasons, existing production implementations are all part of a JavaScript VM. Fortunately, none of these VMs is tied to a browser. If you don't mind including some unused JS baggage, you can embed them as they are (ripping out the JS would be very hard). One problem, though, is that these VMs don't yet provide embedding interfaces for Wasm specifically. You have to go through JS, which is stupid.
There is an initial design for a C and C++ API for WebAssembly, which would give direct access to an embedded Wasm VM. It is meant to be VM-neutral, i.e., could be implemented by any existing VM (the repo contains a prototype implementation on top of V8). This may evolve into a standard, but I cannot promise any timeline. Right now it's only for the brave.

objective-c frameworks - Dynamic Library Install Name

I'm new to objective-c & osx architecture. I started playing with building a framework and then using it. I followed this great tutorial.
During the tutorial, I had to set the framework target's Dynamic Library Install Name to @rpath/MyFramework.framework/Versions/A/MyFramework. My understanding is that @rpath will expand to the loader's (consumer's) run-path search paths.
It seems as if the responsibility of loading the framework is split between the framework author and the consumer author. Could someone please explain why the author of the framework needs to be concerned with the consumer's run-path search path? For example, if the framework author set the Dynamic Library Install Name to point to some random directory (instead of @rpath), how would the client be able to consume the framework?
Thanks in advance.
It depends a lot on how the framework is being used. And it's important to remember that the framework construct has existed for a long time on the platform.
For a system framework, such as the ones that Apple creates, you're going to be quite happy that they keep the frameworks in a known location. In those cases, the paths that they use are fixed for the OS, and it guarantees that you don't accidentally load the wrong one. Further, as indicated in the Framework documentation, these frameworks are loaded only once on the machine, regardless of how many times they are used (see Apple:What Are Frameworks) . The benefit here is performance and it is for both the code and the resources in many cases.
Due to the recent move to randomize framework locations, and Apple's comment in the release notes that "Mountain Lion randomly relocates the kernel, kexts, and system frameworks at system boot," it certainly appears they're still sharing these resources, and thus still gaining from this benefit.
For embedded frameworks, the situation is a lot more tedious, and Apple has moved through a variety of methods over the years to make it easier to find frameworks wherever they may be. Again, due to their shared nature, it makes sense for applications that share common library requirements to share the frameworks on the machine, both for efficiency and to make sure they're at the same version if they're sharing data. So, for example, if you have two separate apps that use the same framework to work with shared data, you might put the shared framework in /Library/Frameworks and have both apps explicitly look for it there, making sure that some other (possibly older) version of the framework, loaded by another app, is not used instead.
In the end, there's a lot of flexibility for the Framework producer and consumer the way that it currently works. It means that the developer can decide to share a framework, include a private copy of the framework, or even do both, depending upon whether the framework exists on the machine or not. However, the price for that flexibility is the complexity that we have today.
Another example of a reason you might not want to use @rpath specifically is for tightly-linked embedded frameworks (yes, people embed frameworks within other frameworks). In these cases, you don't know where the first framework is loaded, but you want to put the embedded framework inside of it, so that they stay together. In this case @loader_path is relative to the code that is loading it, so that your plug-in's framework can find its resources correctly.
In answer to your specific example about somebody setting the Dynamic Library Install Name to a "random" location: in that case, you'd have to know that location. There might be many reasons for somebody doing this, such as wanting to discourage reuse by other programs, or because there are large resources within the framework that should only be installed in a known shared location.

Adding a COM interface to an existing application (EXE)

I intend to add a COM interface to an existing application (which, by the way, is written in C++ using Win32). I have some experience using COM objects, so I know the basic COM concepts of interfaces, etc., but this is the first time I'm actually implementing a component.
Ultimately I want to be able to use the COM interface to automate my application from scripts such as VB. I understand that there are two steps:
My application must act as an out-of-process server (i.e. I have to use MIDL and generate code for a proxy DLL and a stub DLL).
Once I have the server I can add automation capabilities by implementing the IDispatch interface.
Since the server-in-an-EXE thing with MIDL and what not is already a bit steep, I wanted to get a grasp on all that first before moving on to IDispatch.
I am reading the book "Inside COM" by Dale Rogerson and have completed the chapter on servers in EXEs (the following chapter will cover Automation).
The "Servers in EXEs" chapter provides example code that implements a server and a client. But it is necessary to start the server manually. This confuses me. Obviously, when my application (= server) is used by a client process, this extra manual step should not be necessary. Is there no mechanism to start the server automatically? Or is automation necessary to achieve that? At the moment, the prospect of having to start my server manually (once I even have one) makes me doubt I am moving in the right direction.
Hopefully someone with more knowledge of this can see what information I'm missing and point me in the right direction.
No, COM servers are not normally started by hand. I'm not sure why the book proposed it; possibly it wanted to avoid talking about the registry keys you need for COM to automatically start the EXE. It isn't otherwise very complicated: you register the Application coclass of your app, with the LocalServer32 key value giving the path to the EXE.
It is however not completely uncommon, especially with an existing program. One design decision to make is whether you let the client code completely control your program, or whether your program has an existing user interface but you also want to expose services to other code. In the latter case it makes sense to let the user start the app by hand, like she normally does.
When your application is registered as LocalServer32, it will be invoked with the commandline specified there if no running process has registered a factory object for your CLSID yet.
This way, you can get the best of both worlds -- if the application is running already, this instance can provide the server side, and if it isn't, it will be started.
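A hedged sketch of that registration (Win32; the CLSID is a placeholder and error handling is omitted):

    // Writes HKCR\CLSID\{...}\LocalServer32 = <path to the EXE>, which is
    // what lets COM launch the server on demand when no running instance
    // has registered a class factory.
    #include <windows.h>
    #include <cwchar>

    void RegisterLocalServer(const wchar_t* exePath) {
        HKEY key;
        RegCreateKeyExW(HKEY_CLASSES_ROOT,
                        L"CLSID\\{00000000-0000-0000-0000-000000000000}"
                        L"\\LocalServer32",
                        0, nullptr, 0, KEY_WRITE, nullptr, &key, nullptr);
        RegSetValueExW(key, nullptr, 0, REG_SZ,
                       reinterpret_cast<const BYTE*>(exePath),
                       static_cast<DWORD>((wcslen(exePath) + 1) * sizeof(wchar_t)));
        RegCloseKey(key);
    }

A running instance announces itself with CoRegisterClassObject; only when no instance has done so does COM fall back to the LocalServer32 command line.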
Automation is completely orthogonal to that -- your component becomes Automation compatible by implementing IDispatch.
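For orientation, a hedged skeleton of what implementing IDispatch amounts to in C++. The class name and stubbed-out bodies are placeholders; a real server also needs a class factory, registration, and a type library or name-to-DISPID table:

    #include <oaidl.h>

    class AppObject : public IDispatch {
        ULONG m_refs = 1;
    public:
        // IUnknown (minimal, not thread-safe)
        STDMETHODIMP QueryInterface(REFIID riid, void** ppv) override {
            if (riid == IID_IUnknown || riid == IID_IDispatch) {
                *ppv = static_cast<IDispatch*>(this);
                AddRef();
                return S_OK;
            }
            *ppv = nullptr;
            return E_NOINTERFACE;
        }
        STDMETHODIMP_(ULONG) AddRef() override { return ++m_refs; }
        STDMETHODIMP_(ULONG) Release() override {
            ULONG n = --m_refs;
            if (n == 0) delete this;
            return n;
        }

        // IDispatch: a script calls GetIDsOfNames("DoSomething") to obtain
        // a DISPID, then Invoke with that DISPID to run the method.
        STDMETHODIMP GetTypeInfoCount(UINT* pct) override { *pct = 0; return S_OK; }
        STDMETHODIMP GetTypeInfo(UINT, LCID, ITypeInfo**) override { return E_NOTIMPL; }
        STDMETHODIMP GetIDsOfNames(REFIID, LPOLESTR*, UINT, LCID, DISPID*) override {
            return E_NOTIMPL;   // map method names to DISPIDs here
        }
        STDMETHODIMP Invoke(DISPID, REFIID, LCID, WORD, DISPPARAMS*,
                            VARIANT*, EXCEPINFO*, UINT*) override {
            return E_NOTIMPL;   // switch on the DISPID and act here
        }
    };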

NSBundle sandboxing in Objective-C

Odd title, I know. What I REALLY mean to ask is something like "can I create a TrustedBSD MAC framework or iOS-style sandbox for NSBundles?" I have a very dynamic application in which modules can be "plugged in" or "unplugged" from a "switchboard" at any time. The user operating it is on another computer, perhaps miles away, so there's no way for the user to verify that switchboard modules are compliant and won't wreck the system in their absence. I know iOS sandboxing and TrustedBSD (as well as Lion's Seatbelt.framework) can do this for processes, but how can I do this for bundle code?
Things I've thought of:
Static binary analysis of all POSIX calls and ObjC messaging (seems impossible, as binaries are... binary at this point, not code).
Tracing ObjC messages at runtime and logging them, locking the module out after a single stray call. (This wouldn't affect POSIX calls or any C calls for that matter, nor assembly, and would have at minimum one call already sent).
Create a sandboxed external process using XPC, load modules there, and use PDO to route switchboard calls to that process (very doable, but I need Snow Leopard compatibility here).
Any ideas? I realize that a bundle, once loaded, ends up as PART of your application, so this adds further problems to implementing security.
The instant that you load a bundle into your process, you are giving it full control over your process. (This is because a bundle can contain code that runs at initialization time, before you even call any of its functions.)
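To illustrate why loading is already executing, a hedged example (GCC/Clang constructor attribute; build it as a bundle or dylib and the function runs the moment the bundle is loaded, before any symbol is looked up):

    #include <cstdio>

    // This runs at load time, before the host application calls anything.
    // A malicious module can do whatever it likes right here.
    __attribute__((constructor))
    static void runs_at_load_time() {
        std::printf("loaded, and already executing\n");
    }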
Tracing messages at runtime is pointless, as a malicious bundle could simply disable the code responsible for message tracing before it fires. (It's running in the same memory space, after all, and it needn't use any ObjC messages to do this.)
In short: do not load bundles you do not trust. If you must make use of an untrusted bundle, use some form of sandboxing (whether that's XPC or something else).

VB: get compiled DLL's calling application info; COM security

Through COM, one can potentially gain absolute control over a target system. For example: using JavaScript's ActiveXObject object in IE, one can create certain objects which were designed to have direct access to or interaction with system properties and files. One would think common sense dictates users disable ActiveX features in IE immediately after installing the browser to ensure their system is protected while surfing the net, or at least pay close attention to which websites they permit. But I doubt many average PC users know how or why to do this, or they just get tired of micro-managing it over time. I think any PC user or admin my COM class caters to would greatly appreciate not having to deal with that. Thankfully, it looks like IE versions come packaged with ActiveX disabled by default nowadays.
I've built a very versatile COM class library in VB. I didn't intend for it to be callable from any website, but that feature is just part of the COM platform. I'd like to prevent the library from being called from IE unless the website is on a white-listed domain, to proactively protect the user (and ultimately their entire intranet) from harm from malicious websites. What would be the best method in VB.NET to tell which application called my DLL, so I can determine whether it was called from any command or process originating from IE? And what domain called my DLL?
Edit: I believe this might be a duplicate. See: Calling Assembly to get Application Name VB.NET
System.Environment.GetCommandLineArgs()(0) gets me the calling application path. With this info, I can compare it to a black/white-list of applications. Problem solved for now.
Don't mark the control as Safe for Scripting.
Default security settings will not allow such controls to be scripted.
Self-answer, and possibly duplicate, I suppose. See System.Environment.GetCommandLineArgs()(0) from Calling Assembly to get Application Name VB.NET
In this case, the class was never marked as safe for scripting, and the intent was always not to mark it safe. The issue was how to obtain the calling application's info so I could add additional security measures in case those the calling application already had were not enough.