What does an Objective-C program's startup routine do?

I know that information about all loaded classes is gathered at startup time. But I could not find any information on how this is done, or on how an Objective-C startup routine compares to a plain C program's startup routine.
I'm just wondering what Objective-C adds at this point. Is an Objective-C program a C program with some additions, or is it completely different in structure?

You should take a look at this article from Cocoa with Love: http://cocoawithlove.com/2008/03/cocoa-application-startup.html , which gives a good overview.
However, if you really want to know what's going on, it's going to be a bit of a dig, but you can look at the source to the runtime at http://opensource.apple.com/ . Look for the objc4* project inside of whichever OS you are interested in. Look to objcrt.c for the top of the initialization chain.
You asked two discrete questions in your original post: what the startup routine looks like (which is covered in the runtime) and "Is Objective-C program a C program with some additions". The answer to the latter is yes, it is a C program with some additions, in the same manner as C++. And, like C++, it contains some pretty significant additions to the runtime.
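To get a feel for "work that happens before main()", here is a small sketch in plain C. It relies on the `__attribute__((constructor))` extension of GCC and Clang, which is my assumption for illustration and not a description of how the objc4 runtime itself is wired up; the names `registered_count` and `register_at_load` are made up. Conceptually it resembles what a +load method gives you: code that runs once the image has been loaded and before main() starts.

```c
/* Hypothetical counter: how many "classes" registered themselves at load time. */
static int registered_count = 0;

/* GCC/Clang extension (an assumption of this sketch): functions marked
 * with this attribute are run by the loader before main() is entered. */
__attribute__((constructor))
static void register_at_load(void)
{
    registered_count++;
}

/* Reports what was gathered before main() ran. */
int classes_registered_before_main(void)
{
    return registered_count;
}
```

Calling classes_registered_before_main() from main() with GCC or Clang returns 1, because the constructor has already run by then.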


Do functions in an API make system calls themselves, or are system calls made by the API aided by the system-call interface in the run-time support system?

I was going through the Dinosaur book by Galvin, where I ran into the difficulty described in the question.
Typically application developers design programs according to an application programming interface (API). The API specifies a set of functions that are available to an application programmer, including the parameters that are passed to each function and the return values the programmer can expect.
The text adds that:
Behind the scenes the functions that make up an API typically invoke the actual system calls on behalf of the application programmer. For example, the Win32 function CreateProcess() (which unsurprisingly is used to create a new process) actually calls the NTCreateProcess() system call in the Windows kernel.
From the above two points I understood that programmers using the API call the API function corresponding to the system call they want to make. That API function then actually makes the system call.
Next what the text says confuses me a bit:
The run-time support system (a set of functions built into libraries included with a compiler) for most programming languages provides a system-call interface that serves as the link to system calls made available by the operating system. The system-call interface intercepts function calls in the API and invokes the necessary system calls within the operating system. Typically, a number is associated with each system call, and the system-call interface maintains a table indexed according to these numbers. The system call interface then invokes the intended system call in the operating-system kernel and returns the status of the system call and any return values.
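The numbered table described in this excerpt can be pictured with a toy sketch in C. Everything here is hypothetical (the numbers, the handler names, `toy_syscall`); a real kernel's table and calling convention look different, and the real dispatch happens inside the kernel after a trap rather than through an ordinary function call.

```c
#include <stddef.h>

/* Hypothetical system-call numbers for this toy example. */
enum { TOY_SYS_GETPID = 0, TOY_SYS_ADD = 1, TOY_SYS_COUNT };

/* Every handler takes two integer arguments and returns a status or value. */
typedef long (*toy_handler)(long a, long b);

static long toy_getpid(long a, long b) { (void)a; (void)b; return 42; }
static long toy_add(long a, long b)    { return a + b; }

/* The table is indexed by system-call number. */
static const toy_handler toy_table[TOY_SYS_COUNT] = {
    [TOY_SYS_GETPID] = toy_getpid,
    [TOY_SYS_ADD]    = toy_add,
};

/* Look up the number, invoke the handler, and hand back its return value.
 * Unknown numbers yield -1. */
long toy_syscall(int number, long a, long b)
{
    if (number < 0 || number >= TOY_SYS_COUNT || toy_table[number] == NULL)
        return -1;
    return toy_table[number](a, b);
}
```

For instance, toy_syscall(TOY_SYS_ADD, 2, 3) looks up slot 1 and returns 5, while an out-of-range number returns -1.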
The above excerpt makes me feel that the functions in the API do not make the system calls directly. There are probably functions built into the system-call interface of the run-time support system, which are waiting for a system-call event from the function in the API.
The text includes a diagram explaining how the system-call interface works (image not reproduced here).
The text later explains how a system call works in the C standard library with a second figure (also not reproduced here), which is quite clear.
I don't totally understand the terminology of the excerpts you shared. Some of the terminology is also wrong, as in the blue image at the bottom. It says the standard C library provides system-call interfaces, while it doesn't. The standard C library is just a standard, a convention. It says that if you write certain code, then the effect of that code when it is run should be according to the convention. The image also says that the C library intercepts printf() calls, while it doesn't. This is general terminology which is confusing at best.
The C library doesn't intercept calls. As an example, on Linux, the open source implementation of the C standard library is glibc. You can browse its source code here: https://elixir.bootlin.com/glibc/latest/source. When you write C/C++ code, you use standard functions which are specified in the C/C++ convention.
When you write code, this code will be compiled to assembly and then to machine code. Assembly is a human-readable representation of machine code; it is much closer to the actual machine code than C/C++ is, so it is easier to translate. The easiest case to understand is when you compile code statically. When you compile code statically, all code is included in your executable. For example, if you write
#include <stdio.h>

int main() {
    printf("Hello, World!");
    return 0;
}
the printf() function is declared in stdio.h, a header that ships with the C library (glibc on Linux) and is written for one OS or a set of UNIX-like OSes. This header provides prototypes for functions which are defined in .c files provided by glibc. These .c files provide the actual implementation of printf(). The printf() function will eventually make a system call, which relies on the presence of an OS like Linux to run. When you compile statically, all the code up to the system call is included in your executable. You can see my answer here: Who sets the RIP register when you call the clone syscall?. It specifically explains how system calls are made.
In the end you'll have something like assembly code placing some arguments into conventional registers, followed by the actual syscall instruction, which jumps to the address stored in an MSR. I don't totally understand the mechanism behind printf(), but it will end up in the Linux kernel's implementation of the write system call, which will write to the console and return.
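As a rough illustration of where printf() ends up, the sketch below skips stdio and calls write() from unistd.h directly. On Linux with glibc, write() is a thin wrapper that places the arguments in registers and executes the system-call instruction; that this matches your platform is an assumption, and `print_direct` is a made-up helper name.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: send a string to a file descriptor without stdio.
 * Returns the number of bytes written, or -1 on error. */
long print_direct(int fd, const char *text)
{
    /* write() is the C library's wrapper around the write system call. */
    return (long)write(fd, text, strlen(text));
}
```

Calling print_direct(STDOUT_FILENO, "Hello, World!\n") prints the same text as the printf() example, minus the buffering and formatting that stdio adds on top.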
I think what confuses you is that the "run-time support system" probably refers to higher-level languages which are not compiled directly to machine code, like Python or Java. Java has a virtual machine which translates the bytecode produced by compilation into machine code at run time. It can be confusing not to make this distinction when talking about different languages. Maybe your book is lacking examples.

What is the formal comp. sci. name of this language property?

As a self-taught programmer, my definitions get fuzzy sometimes.
I'm very used to C and ObjC. In both of those your code must adhere to the language "structure". You can only do certain things in certain places. As an example, this is an error:
// beginning of file
NSLog(@"Hello world!"); // can't do this
@implementation MYClass
However, in Ruby, anything you put anywhere is executed as the interpreter goes through it. So what is the difference between Ruby and Objective-C that allows this?
At first I thought it was that one was interpreted and the other compiled. Then I read some SO posts and the wikipedia definitions. Interpreted or compiled is a property of the implementation not the language. So that would mean there could (theoretically) be an interpreted implementation of Objective-C? In that case, the fact that a statement cannot be outside the implementation can't be a property of compiled languages, and vice-versa if there was a compiled implementation of Ruby. Or am I wrong in assuming that different implementations of a language would work the same way?
I'm not sure there's a technical term for it, but in most programming languages the context of the statement is extremely important.
Ruby has a concept of a root or main context where code is allowed. Other scripting languages follow this convention, presumably made popular by languages like Perl which allowed for very concise programming.
This allows things like this to be a complete and valid program:
print "Hello world!\n"
In other languages you need to define an entry point, such as a main routine, that is executed instead. Arbitrary code is not really allowed at the top level, which instead is reserved for things like function, type, constant, structure and class definitions.
A language like Ruby has a lot of control over the order in which the code is executed. C, by comparison, is usually composed of separate source files that are then linked together, where there's no inherent order to the way things are linked. All the modules are simply assembled into the final library or executable. This is why the main entry point is required, it defines which function to run first.
In short, it boils down to syntax, context, and language design considerations.
Ruby hides lots of stuff.
Ruby is OO like C++, Objective-C and Java, and has a main like C, but you don't see it.
puts(42) is a method call. It is called on the top-level object, which is named main. You can see this by typing puts self.
If you don't specify the receiver (receiver.method()), Ruby will use the implicit one, main.
Check available methods:
puts Object.private_methods.sort
Why can you put anything anywhere?
C/C++ look for a function called main, and when they find it, it is executed.
Ruby, on the other hand, doesn't need main or any other method/class to run first.
It executes code from the first line until it meets the end of the file (or __END__ on a separate line).
class Strongman
  puts "I'm the best!"
end
is just syntactic sugar for the Class.new method:
Strongman = Class.new do
  puts "I'm the best!"
end
The same goes for module.
for calls each and returns some kind of object. So you may think of it as something similar to a method.
a = for i in 1..12; 42;end
puts a
# 1..12
In the end, it doesn't matter whether it is a method call or some kind of structure like C's int main(). The programming language decides what it should run first.

Is overriding Objective-C framework methods ever a good idea?

ObjC has a very unique way of overriding methods. Specifically, you can override functions in OS X's own frameworks, via "categories" or "swizzling". You can even override "buried" functions that are only used internally.
Can someone provide me with an example where there was a good reason to do this? Something you would use in released commercial software and not just some hacked up tool for internal use?
For example, maybe you wanted to improve on some built in method, or maybe there was a bug in a framework method you wanted to fix.
Also, can you explain why this can best be done with features in ObjC, and not in C++ / Java and the like. I mean, I've heard of the ability to load a C library, but allow certain functions to be replaced, with functions of the same name that were previously loaded. How is ObjC better at modifying library behaviour than that?
If you're extending the question from mere swizzling to actual library modification then I can think of useful examples.
As of iOS 5, NSURLConnection provides sendAsynchronousRequest:queue:completionHandler:, which is a block (/closure) driven way to perform an asynchronous load from any resource identifiable with a URL (local or remote). It's a very useful way to be able to proceed as it makes your code cleaner and smaller than the classical delegate alternative and is much more likely to keep the related parts of your code close to one another.
That method isn't supplied in iOS 4. So what I've done in my project is that, when the application is launched (via a suitable + (void)load), I check whether the method is defined. If not I patch an implementation of it onto the class. Henceforth every other part of the program can be written to the iOS 5 specification without performing any sort of version or availability check exactly as if I was targeting iOS 5 only, except that it'll also run on iOS 4.
In Java or C++ I guess the same sort of thing would be achieved by creating your own class to issue URL connections that performs a runtime check each time it is called. That's a worse solution because it's more difficult to step back from. This way around if I decide one day to support iOS 5 only I simply delete the source file that adds my implementation of sendAsynchronousRequest:.... Nothing else changes.
As for method swizzling, the only times I see it suggested are where somebody wants to change the functionality of an existing class and doesn't have access to the code in which the class is created. So you're usually talking about trying to modify logically opaque code from the outside by making assumptions about its implementation. I wouldn't really support that as an idea on any language. I guess it gets recommended more in Objective-C because Apple are more prone to making things opaque (see, e.g. every app that wanted to show a customised camera view prior to iOS 3.1, every app that wanted to perform custom processing on camera input prior to iOS 4.0, etc), rather than because it's a good idea in Objective-C. It isn't.
EDIT: so, in further exposition — I can't post full code because I wrote it as part of my job, but I have a class named NSURLConnectionAsyncForiOS4 with an implementation of sendAsynchronousRequest:queue:completionHandler:. That implementation is actually quite trivial, just dispatching an operation to the nominated queue that does a synchronous load via the old sendSynchronousRequest:... interface and then posts the results from that on to the handler.
That class has a + (void)load, which is a class method that the runtime invokes immediately after the class has been loaded into memory, effectively as a global constructor for the metaclass and with all the usual caveats.
In my +load I use the Objective-C runtime directly via its C interface to check whether sendAsynchronousRequest:... is defined on NSURLConnection. If it isn't, I add my implementation to NSURLConnection, so from then on it is defined. This explicitly isn't swizzling; I'm not adjusting the existing implementation of anything, I'm just adding a user-supplied implementation of something if Apple's isn't available. Relevant runtime calls are objc_getClass, class_getClassMethod and class_addMethod.
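The check-then-add logic described above can be sketched in plain C as an analogy. This is not the Objective-C runtime API: a small table of function pointers stands in for a class's method list, and every name here (`toy_class`, `toy_get_method`, `toy_add_method_if_missing`, `shim_double`) is hypothetical.

```c
#include <stddef.h>
#include <string.h>

typedef int (*method_fn)(int);

/* One slot in a hypothetical per-class method table. */
struct method_entry {
    const char *name;
    method_fn   impl;
};

#define MAX_METHODS 8

struct toy_class {
    struct method_entry methods[MAX_METHODS];
    size_t              count;
};

/* Return the implementation registered under `name`, or NULL if there is none. */
method_fn toy_get_method(const struct toy_class *cls, const char *name)
{
    for (size_t i = 0; i < cls->count; i++)
        if (strcmp(cls->methods[i].name, name) == 0)
            return cls->methods[i].impl;
    return NULL;
}

/* Add `impl` only when `name` is not defined yet; an existing entry is never
 * replaced. Returns 1 if the shim was installed, 0 otherwise. */
int toy_add_method_if_missing(struct toy_class *cls, const char *name, method_fn impl)
{
    if (toy_get_method(cls, name) != NULL || cls->count == MAX_METHODS)
        return 0;
    cls->methods[cls->count].name = name;
    cls->methods[cls->count].impl = impl;
    cls->count++;
    return 1;
}

/* Fallback used where the vendor-supplied implementation is absent. */
int shim_double(int x) { return 2 * x; }
```

The point of the design is that callers always look the method up by name and never need to know whether they got the vendor's implementation or the shim.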
In the rest of the code, whenever I want to perform an asynchronous URL connection I just write e.g.
[NSURLConnection sendAsynchronousRequest:request
    queue:[self anyBackgroundOperationQueue]
    completionHandler:^(NSURLResponse *response, NSData *data, NSError *blockError)
    {
        if (blockError) {
            // oh dear; was it fatal?
        }
        if (data) {
            // hooray! You know, unless this was an HTTP request, in
            // which case I should check the response code, etc.
        }
        /* etc */
    }];
So the rest of my code is just written to the iOS 5 API and neither knows nor cares that I have a shim somewhere else to provide that one microscopic part of the iOS 5 changes on iOS 4. And, as I say, when I stop supporting iOS 4 I'll just delete the shim from the project and all the rest of my code will continue not to know or to care.
I had similar code to supply an alternative partial implementation of NSJSONSerialization (which dynamically created a new class in the runtime and copied methods to it); the one adjustment you need to make is that references to NSJSONSerialization elsewhere will be resolved once at load time by the linker, which you don't really want. So I added a quick #define of NSJSONSerialization to NSClassFromString(@"NSJSONSerialization") in my precompiled header. Which is less functionally neat but a similar line of action in terms of finding a way to keep iOS 4 support for the time being while just writing the rest of the project to the iOS 5 standards.
There are both good and bad cases. Since you didn't mention anything in particular these examples will be all-over-the-place.
It's perfectly normal (good idea) to override framework methods when subclassing:
When subclassing NSView (from the AppKit.framework), it's expected that you override drawRect:(NSRect). It's the mechanism used for drawing views.
When creating a custom NSMenu, you could override insertItemWithTitle:action:keyEquivalent:atIndex: and any other methods...
The main thing when subclassing is whether your behaviour completely redefines the old behaviour or extends it (in which case your override eventually calls [super ...];).
That said, you should always steer clear of using (and overriding) any private API methods (those normally have an underscore prefix in their name). This is a bad idea.
You also should not override existing methods via categories. That's also bad. It has undefined behaviour.
If you're talking about categories, you don't override methods with them (because there is no way to call the original method, like calling super when subclassing); you only completely replace them with your own, which makes the whole idea mostly pointless. Categories are only useful for safely extending functionality, and that's the only use I have ever seen (and it is a very good, an excellent idea), although indeed they can be used for dangerous things.
If you mean overriding by subclassing, that is not unique. But in Obj-C you can override everything, even private undocumented methods, not just what was declared 'overridable' like in other languages. Personally, I think it's nice, as I remember in Delphi and C++ I used to "hack" access to private and protected members to work around an internal bug in a framework. This is not a good idea, but at some moments it can be a life saver.
There is also method swizzling, but that's not a standard language feature, that's a hack. Hacking undocumented internals is rarely a good idea.
And regarding “how can you explain why this can best be done with features in ObjC”, the answer is simple — Obj-C is dynamic, and this freedom is common to almost all dynamic languages (Javascript, Python, Ruby, Io, a lot more). Unless artificially disabled, every dynamic language has it.
Refer to the Wikipedia page on dynamic languages for a longer explanation and more examples. For example, an even more miraculous thing possible in Obj-C and other dynamic languages is that an object can change its type (class) in place, without being recreated.

Custom performance profiler for Objective C

I want to create a simple to use and lightweight performance profile framework for Objective C. My goal is to measure the bottlenecks of my application.
Just to mention that I am not a beginner and I am aware of Instruments/Time Profiler. This is not what I am looking for. Time Profiler is a great tool but is too developer oriented. I want a framework that can collect performance data from a QA or pre production users and even incorporate in a real production environment to gather the real data.
The main part of this framework is the ability to measure how much time was spent in Objective C message (I am going to profile only Objective C messages).
The easiest way is to start a timer at the beginning of a message and stop it at the end. It is the simplest way, but its disadvantage is that it is too tedious and error-prone: if a message has more than one return path, the "stop timer" code has to be added before each return.
I am thinking of using method swizzling (I am aware that Apple is not happy with method swizzling, but these profiled builds will be used internally only and will not be uploaded to the App Store).
My idea is to mark each message I want to profile and to automatically generate the code for the swizzling method (maybe using macros). When started, the application will swizzle the original selector with the generated one. The generated one will just start a timer, call the original method and then stop the timer. So in general the swizzled method will be just a wrapper around the original one.
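The wrapper idea can be sketched in plain C. This is only an analogy: it swaps a function pointer instead of exchanging method implementations through the Objective-C runtime, it measures CPU time with the standard clock() function, and all names (`square`, `profiled_wrapper`, `install_profiler`, ...) are made up for illustration.

```c
#include <time.h>

typedef int (*work_fn)(int);

/* Example function to be profiled (stands in for an Objective-C method). */
static int square(int x) { return x * x; }

/* Callers go through this slot; swapping it is the plain-C stand-in for swizzling. */
static work_fn current_impl  = square;
static work_fn original_impl = square;

static clock_t  total_ticks;  /* CPU time accumulated inside the original */
static unsigned call_count;   /* number of profiled calls */

/* Wrapper: start the timer, call the original, stop the timer.
 * There is a single exit path no matter how many returns the original has. */
static int profiled_wrapper(int x)
{
    clock_t start = clock();
    int result = original_impl(x);
    total_ticks += clock() - start;
    call_count++;
    return result;
}

/* Put the wrapper in place of the original; calling it twice is harmless. */
void install_profiler(void)
{
    if (current_impl == profiled_wrapper)
        return;
    original_impl = current_impl;
    current_impl  = profiled_wrapper;
}

int      call_work(int x)       { return current_impl(x); }
unsigned profiled_calls(void)   { return call_count; }
double   profiled_seconds(void) { return (double)total_ticks / CLOCKS_PER_SEC; }
```

After install_profiler(), every call_work() call is counted and timed, while the code inside square() stays untouched.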
One of the problems with the above idea is that I cannot think of an easy way to automatically generate the methods to use for swizzling.
So I would greatly appreciate it if anyone has ideas on how to automate the whole process. The perfect scenario is to write just one line of code anywhere, mentioning the class and the selector I want to profile, and have the rest generated automatically.
I would also be very thankful for any other ideas (besides method swizzling) on how to measure the performance.
I came up with a solution that works for me pretty well. First just to clarify that I was unable to find out an easy (and performance fast) way to automatically generate the appropriate swizzled methods for arbitrary selectors (i.e. with arbitrary arguments and return value) using only the selector name. So I had to add the arguments types and the return value for each selector, not only the selector name. In reality it should be relatively easy to create a small tool that would be able to parse all source files and detect automatically what are the arguments types and the returned value of the selector which we want to profile (and prepare the swizzled methods) but right now I don't need such an automated solution.
So right now my solution includes the above ideas for method swizzling, some C++ code and macros to automate and minimize some coding.
First here is the simple C++ class that measures time
class PerfTimer {
public:
    PerfTimer(PerfProfiledDataCounter* perfProfiledDataCounter);
    ~PerfTimer();  // adds the elapsed time to the counter when the scope ends
private:
    uint64_t _startTime;
    PerfProfiledDataCounter* _perfProfiledDataCounter;
};
I am using C++ so that the destructor is executed when the object leaves the current scope. The idea is to create a PerfTimer at the beginning of each swizzled method, and it takes care of measuring the elapsed time for that method.
PerfProfiledDataCounter is a simple struct that counts the number of executions and the total elapsed time (so the average time spent can be derived).
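The post does not show the counter itself, so the following is only a guess at its shape, written in plain C for brevity. The type name `PerfCounter`, the field names and the nanosecond unit are all assumptions.

```c
#include <stdint.h>

/* Hypothetical rendering of the counter: executions and total elapsed time. */
typedef struct {
    uint64_t executions;
    uint64_t total_elapsed_ns;
} PerfCounter;

/* Record one finished call that took `elapsed_ns` nanoseconds. */
void perf_counter_record(PerfCounter *c, uint64_t elapsed_ns)
{
    c->executions++;
    c->total_elapsed_ns += elapsed_ns;
}

/* Average time per call in nanoseconds; 0 when nothing has been recorded. */
uint64_t perf_counter_average_ns(const PerfCounter *c)
{
    return c->executions ? c->total_elapsed_ns / c->executions : 0;
}
```

A timer object would call perf_counter_record() once when it goes out of scope, and the average is computed only when the data is reported.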
Also, for each class I'd like to profile, I create a category named "__Performance_Profiler_Category" that conforms to the "__Performance_Profiler_Marker" protocol. To make this easier I use some macros that automatically create such categories. I also have a set of macros that take a selector name, return type and argument types and create the profiled selector for each selector name.
For all of the above tasks, I've created a set of macros to help me. I also have a single file with the .mm extension that registers all classes and selectors I'd like to profile. On app start, I use the runtime to retrieve all classes that conform to the "__Performance_Profiler_Marker" protocol (i.e. the registered ones) and search them for selectors that are marked for profiling (these selectors start with a predefined prefix). Note that this is the only file that needs the .mm extension; there is no need to change the file extension of each class I want to profile.
Afterwards the code swizzles the original selectors with the profiled ones. In each profiled one, I just create PerfTimer and call the swizzled method.
In brief that is my idea which turned out to work pretty smoothly.

STM32 programming tips and questions

I could not find any good documentation on the internet about STM32 programming. STM's own documents do not explain anything more than register functions. I would greatly appreciate it if anyone could answer the following questions.
I noticed that in all example programs that STM provides, local variables for main() are always defined outside of the main() function (with occasional use of static keyword). Is there any reason for that? Should I follow a similar practice? Should I avoid using local variables inside the main?
I have a global variable which is updated within the clock interrupt handler. I am using the same variable inside another function as a loop condition. Don't I need to access this variable using some form of atomic read operation? How can I know that a clock interrupt does not change its value in the middle of the function's execution? Do I need to disable the clock interrupt every time I use this variable inside a function? (However, this seems extremely inefficient to me as I use it as a loop condition. I believe there should be better ways of doing it.)
Keil automatically inserts a startup code which is written in assembly (i.e. startup_stm32f4xx.s). This startup code has the following import statements:
IMPORT SystemInit
IMPORT __main
In C, this makes sense. However, in C++ both main and SystemInit would have different (mangled) names (e.g. _int_main__void). How can this startup code still work in C++ even without using extern "C" (I tried and it worked)? How can the C++ linker (armcc --cpp) associate these statements with the correct functions?
You can use local or global variables. Using locals in embedded systems carries a risk of your stack colliding with your data; with globals you don't have that problem. But this is true no matter where you are: embedded microcontroller, desktop, etc.
I would make a copy of the global in the foreground task that uses it.
unsigned int myglobal;
void fun ( void )
{
    unsigned int myg;
    myg = myglobal;   /* take one snapshot of the shared variable */
}
and then only use myg for the rest of the function. Basically you are taking a snapshot and using the snapshot. You would want to do the same thing if you are reading a register: if you want to do multiple things based on a sample of something, take one sample of it and make decisions on that one sample, otherwise the item can change between samples. If you are using one global to communicate back and forth with the interrupt handler, I would use two variables instead: one from foreground to interrupt, the other from interrupt to foreground. Yes, there are times where you need to carefully manage a shared resource like that. Normally it has to do with times where you need to do more than one thing; for example, if you had several items that all need to change as a group before the handler can see them change, then you need to disable the interrupt handler until all the items have changed. Here again there is nothing special about embedded microcontrollers; this is all basic stuff you would see on a desktop system with a full-blown operating system.
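The two-variable arrangement mentioned above might look roughly like this. It is a host-side sketch with made-up names: `timer_isr` is an ordinary function standing in for the real interrupt handler, and the volatile qualifiers are my addition so the compiler re-reads the shared variables.

```c
/* Written only by the interrupt handler, read by the foreground. */
static volatile unsigned int isr_to_fg_ticks;
/* Written only by the foreground, read by the interrupt handler. */
static volatile unsigned int fg_to_isr_enable = 1;

/* Stand-in for the timer interrupt handler (hypothetical name). */
void timer_isr(void)
{
    if (fg_to_isr_enable)
        isr_to_fg_ticks++;
}

/* Foreground side: tell the handler whether it should keep counting. */
void foreground_set_enable(unsigned int on)
{
    fg_to_isr_enable = on;
}

/* Foreground side: read the shared variable exactly once and hand back
 * that snapshot; callers make all their decisions from the returned value. */
unsigned int foreground_poll(void)
{
    unsigned int snapshot = isr_to_fg_ticks;
    return snapshot;
}
```

Because each variable has exactly one writer, neither side ever performs a read-modify-write on something the other side also modifies.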
Keil knows what they are doing; if they support C++, then from a system level they have this worked out. I don't use Keil; I use gcc and llvm for microcontrollers like this one.
Here is an example of what I am talking about
It uses stm32 timer-based interrupts; the interrupt handler modifies a variable shared with the foreground task. The foreground task takes a single snapshot of the shared variable (per loop) and, if need be, uses the snapshot more than once in the loop rather than the shared variable, which can change. This is C, not C++, I understand that, and I am using gcc and llvm, not Keil. (Note: llvm has known problems optimizing tight while loops, a very old bug; I don't know why they have no interest in fixing it. llvm works for this example.)
Question 1: Local variables
The sample code provided by ST is not particularly efficient or elegant. It gets the job done, but sometimes there are no good reasons for the things they do.
In general, you always want your variables to have the smallest scope possible. If you only use a variable in one function, define it inside that function. Add the "static" keyword to local variables if and only if you need them to retain their value after the function is done.
In some embedded environments, like the PIC18 architecture with the C18 compiler, local variables are much more expensive (more program space, slower execution time) than global. On the Cortex M3, that is not true, so you should feel free to use local variables. Check the assembly listing and see for yourself.
Question 2: Sharing variables between interrupts and the main loop
People have written entire chapters explaining the answers to this group of questions. Whenever you share a variable between the main loop and an interrupt, you should definitely use the volatile keyword on it. Variables of 32 or fewer bits can be accessed atomically (unless they are misaligned).
If you need to access a larger variable, or two variables at the same time from the main loop, then you will have to disable the clock interrupt while you are accessing the variables. If your interrupt does not require precise timing, this will not be a problem. When you re-enable the interrupt, it will automatically fire if it needs to.
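A minimal host-side sketch of that pattern follows. The functions `clock_irq_disable` and `clock_irq_enable` are hypothetical stand-ins that only track nesting here; on a real target they would be replaced by whatever your toolchain or peripheral library provides for masking the interrupt.

```c
#include <stdint.h>

/* Hypothetical stand-ins for masking and unmasking the clock interrupt. */
static int irq_mask_depth;
static void clock_irq_disable(void) { irq_mask_depth++; }
static void clock_irq_enable(void)  { irq_mask_depth--; }

/* 64 bits: too wide for a single 32-bit load, so an unprotected read
 * could see one old half and one new half. */
static volatile uint64_t uptime_ms;

/* Called from the clock interrupt handler. */
void clock_isr(void)
{
    uptime_ms = uptime_ms + 1;
}

/* Foreground read: keep the interrupt out while both halves are copied. */
uint64_t read_uptime_ms(void)
{
    clock_irq_disable();
    uint64_t copy = uptime_ms;
    clock_irq_enable();
    return copy;
}

/* Lets callers check that disable/enable calls are balanced. */
int irq_currently_masked(void) { return irq_mask_depth != 0; }
```

The interrupt stays masked only for the duration of the copy, so a loop can call read_uptime_ms() as its condition without holding the interrupt off for the whole loop body.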
Question 3: main function in C++
I'm not sure. You can use arm-none-eabi-nm (or whatever nm is called in your toolchain) on your object file to see what symbol name the C++ compiler assigns to main(). I would bet that C++ compilers refrain from mangling the main function for this exact reason, but I'm not sure.
STM's sample code is not an exemplar of good coding practice, it is merely intended to exemplify use of their standard peripheral library (assuming those are the examples you are talking about). In some cases it may be that variables are declared external to main() because they are accessed from an interrupt context (shared memory). There is also perhaps a possibility that it was done that way merely to allow the variables to be watched in the debugger from any context; but that is not a reason to copy the technique. My opinion of STM's example code is that it is generally pretty poor even as example code, let alone from a software engineering point of view.
In this case your clock interrupt variable is atomic so long as it is 32 bits or less and so long as you are not using read-modify-write semantics with multiple writers. You can safely have one writer and multiple readers regardless. This is true for this particular platform, but not necessarily universally; the answer may be different for 8 or 16 bit systems, or for multi-core systems for example. The variable should be declared volatile in any case.
I am using C++ on STM32 with Keil, and there is no problem. I am not sure why you think that the C++ entry points are different, they are not here (Keil ARM-MDK v4.22a). The start-up code calls SystemInit() which initialises the PLL and memory timing for example, then calls __main() which performs global static initialisation then calls C++ constructors for global static objects before calling main(). If in doubt, step through the code in the debugger. It is important to note that __main() is not the main() function you write for your application, it is a wrapper with different behaviour for C and C++, but which ultimately calls your main() function.