When creating an iOS program is there any performance hit if I pass around SELs (#selectors) and invoke them in other classes? Is this significantly slower than normal method invocations?
Why would there be a performance hit for messaging (the one thing ObjC is famous for) from other classes? Of course, compared to C functions, there is some overhead (thanks to the addition of two more parts to a method). Selectors are simply data types, so passing them to type SEL is no more costly than sending a BOOL or an int over. However, to actually call a SEL type from a passed selector, the creation of an NSInvocation object is recommended, which would slightly increase the overhead time.
And you are more or less safe in objC, as messages to nil (you did mention other classes), produce nil.
Well it might not have much of effect except on first run as compiler creates a reference to every objects and it's dependencies along with classes at compile time , so it might be a bit slow at loading time(that too in very large programs..) but not much after that provided not very big objects are created in the intermediate steps and also it doesn't involve any large dynamic operation as here i am talking only about calling a local function and calling the same function from a different class.
Anyways why would you use selector to reference to some function in a different class.
From my limited knowledge, a selector is simply an encoding of a method name. Given that in objective-c methods are called by sending messages to objects, I don't see why there should be a performance difference between an explicit method call ([object method]) and an implicit call ([objectDelegate selector]).
Related
I often use simple non compile-time immutable objects: like an array #[#"a", #"b"] or a dictionary #{#"a": #"b"}.
I struggle between reallocating them all the time:
- (void)doSomeStuff {
NSArray<NSString *> *fileTypes = #[#"h", #"m"];
// use fileTypes
}
And allocating them once:
- (void)doSomeStuff {
static NSArray<NSString *> * fileTypes;
static dispatch_once_t onceToken;
dispatch_once(&onceToken, ^{
fileTypes = #[#"h", #"m"];
});
// use fileTypes
}
Are there recommendations on when to use each construct? Like:
depending on the size of the allocated object
depending on the frequency of the allocation
depending on the device (iPhone 4 vs iMac 2016)
...
How do I figure it out?
Your bullet list is a good start. Memory would be another consideration. A static variable will stay in memory from the time it is actually initialized until the termination of the app. Obviously a local variable will be deallocated at the end of the method (assuming no further references).
Readability is something to consider too. The reallocation line is much easier to read than the dispatch_once setup.
For a little array, I'd reallocate as the first choice. The overhead is tiny. Unless you are creating the array in a tight loop, performance will be negligible.
I would use dispatch_once and a static variable for things that take more overhead such as creating a date formatter. But then there is the overhead of reacting to the user changing the device's locale.
In the end, my thought process is to first use reallocation. Then I consider whether there is tangible benefit to using static and dispatch_once. If there isn't a worthwhile reason to use a static, I leave it a local variable.
Use static if the overhead (speed) of reallocation is too much (but not if the permanent memory hit is too large).
Your second approach is more complex. So you should use it, if it is needed, but not as a default.
Typically this is done when the object creating is extreme expensive (almost never) or if you need a single identity of the instance object (shared instance, sometimes called singleton, what is incorrect.) In such a case you will recognize that you need it.
Although this question could probably be closed as "primarily opinion-based", I think this question addresses a choice that programmers frequently make.
The obvious answer is: don't optimize, and if you do, profile first - because most of the time you'll intuitively address the wrong part of your code.
That being said, here's how I address this.
If the needed object is expensive to build (like the formatters per documentation) and lightweight on resources, reuse / cache the object.
If the object is cheap to build, create a new instance when needed.
For crucial things that run on the main thread (UI stuff) I tend to draw the line more strict and cache earlier.
Depending on what you need those objects for, sometimes there are valid alternatives that are cheaper but offer a similar programming comfort. For example, if you wanted to look up some chessboard coordinates, you could either build a dictionary {#"a1" : 0, #"b1" : 1, ...}, use an array alone and the indices, take a plain C array (which now has a much less price tag attached to it), or do a small integer-based calculation.
Another point to consider is where to put the cached object. For example, you could store it statically in the method as in your example. Or you could make it an instance variable. Or a class property / static variable. Sometimes caching is only the half way to the goal - if you consider a table view with cells including a date formatter, you can think about reusing the very same formatter for all your cells. Sometimes you can refactor that reused object into a helper object (be it a singleton or not), and address some issues there.
So there is really no golden bullet here, and each situation needs an individual approach. Just don't fall into the trap of premature optimization and trade clear code for bug-inviting, hardly readable stuff that might not matter to your performance but comes with sure drawbacks like increased memory footprint.
dispatch_once_t
This allows two major benefits: 1) a method is guaranteed to be called only once, during the lifetime of an application run, and 2) it can be used to implement lazy initialization, as reported in the Man Page below.
From the OS X Man Page for dispatch_once():
The dispatch_once() function provides a simple and efficient mechanism to run an initializer exactly once, similar to pthread_once(3). Well designed code hides the use of lazy initialization.
Some use cases for dispatch_once_t
Singleton
Initialization of a file system resource, such as a file handle
Any static variable that would be shared by a group of instances, and takes up a lot of memory
static, without dispatch_once_t
A statically declared variable that is not wrapped in a dispatch_once block still has the benefit of being shared by many instances. For example, if you have a static variable called defaultColor, all instances of the object see the same value. It is therefore class-specific, instead of instance-specific.
However, any time you need a guarantee that a block will be called only once, you will need to use dispatch_once_t.
Immutability
You also mentioned immutability. Immutability is independent of the concern of running something once and only once--so there are cases for both static immutable variables and instance immutable variables. For instance, there may be times when you need to have an immutable object initialized, but it still may be different for each instance (in cases where it's value depends on other instance variables). In that case, an immutable object is not static, and still may be initialized with different values from multiple instances. In this case, a property is derived from other instance variables, and therefore should not be allowed to be changed externally.
A note on immutability vs mutability, from Concepts in Objective-C Programming:
Consider a scenario where all objects are capable of being mutated. In your application you invoke a method and are handed back a reference to an object representing a string. You use this string in your user interface to identify a particular piece of data. Now another subsystem in your application gets its own reference to that same string and decides to mutate it. Suddenly your label has changed out from under you. Things can become even more dire if, for instance, you get a reference to an array that you use to populate a table view. The user selects a row corresponding to an object in the array that has been removed by some code elsewhere in the program, and problems ensue. Immutability is a guarantee that an object won’t unexpectedly change in value while you’re using it.
Objects that are good candidates for immutability are ones that encapsulate collections of discrete values or contain values that are stored in buffers (which are themselves kinds of collections, either of characters or bytes). But not all such value objects necessarily benefit from having mutable versions. Objects that contain a single simple value, such as instances of NSNumber or NSDate, are not good candidates for mutability. When the represented value changes in these cases, it makes more sense to replace the old instance with a new instance.
A note on performance, from the same reference:
Performance is also a reason for immutable versions of objects representing things such as strings and dictionaries. Mutable objects for basic entities such as strings and dictionaries bring some overhead with them. Because they must dynamically manage a changeable backing store—allocating and deallocating chunks of memory as needed—mutable objects can be less efficient than their immutable counterparts.
Why does the Objective-C compiler need to know at compile-time the signature of the methods that will be invoked on objects when it could defer that to runtime (i.e., dynamic binding)? For example, if I write [foo someMethod], why it is necessary for the compiler to know the signature of someMethod?
Because of calling conventions at a minimum (with ARC, there are more reasons, but calling conventions have always been a problem).
You may have been told that [foo someMethod] is converted into a function call:
objc_msgSend(foo, #selector(someMethod))
This, however, isn't exactly true. It may be converted to a number of different function calls depending on what it returns (and what's returned matters whether you use the result or not). For instance, if it returns an object or an integer, it'll use objc_msgSend, but if it returns a structure (on both ARM and Intel) it'll use objc_msgSend_stret, and if it returns a floating point on Intel (but not ARM I believe), it'll use objc_msgSend_fpret. This is all because on different processors the calling conventions (how you set up the stack and registers, and where the result is stored) are different depending on the result.
It also matters what the parameters are and how many there are (the number can be inferred from ObjC method names unless they're varargs... right, you have to deal with varargs, too). On some processors, the first several parameters may be put in registers, while later parameters may be put on the stack. If your function takes a varargs, then the calling convention may be different still. All of that has to be known in order to compile the function call.
ObjC could be implemented as a more pure object model to avoid all of this (as other, more dynamic languages do), but it would be at the cost of performance (both space and time). ObjC can make method calls surprisingly cheap given the level of dynamic dispatch, and can easily work with pure C machine types, but the cost of that is that we have to let the compiler know more specifics about our method signatures.
BTW, this can (and every so often does) lead to really horrible bugs. If you have a couple of methods:
- (MyPointObject *)point;
- (CGPoint)point;
Maybe they're defined in completely different files as methods on different classes. But if the compiler chooses the wrong definition (such as when you're sending a message to id), then the result you get back from -point can be complete garbage. This is a very, very hard bug to figure out when it happens (and I've had it happen to me).
For a bit more background, you may enjoy Greg Parker's article explaining objc_msgSend_stret and objc_msgSend_fpret. Mike Ash also has an excellent introduction to this topic. And if you want to go deep down this rabbit hole, you can see bbum's instruction-by-instruction investigation of objc_msgSend. It's outdated now, pre-ARC, and only covers x86_64 (since every architecture needs its own implementation), but is still highly educational and recommended.
my question as the title says.obviously, the first parameter was used for this pointer , in some taste of c++.what about the second one? thak you.
The signature of objc_msgSend() is:
id objc_msgSend(id self, SEL op, ...);
Every method call is compiled down to a call to this function. I.e., if you call:
[anArray objectAtIndex:42];
That will be compiled as if it were:
objc_msgSend(anArray, #selector(objectAtIndex:), 42);
Now, to your question, why do methods get compiled down to a function that has the SEL as the second argument. Or, more specifically, why is this method:
- (id)objectAtIndex:(NSUInteger)index;
Exactly equivalent to this C function:
id object_at_index(id object, SEL _cmd, NSUInteger index);
The answer is speed speed speed.
Speed
Specifically, by doing this, then objc_msgSend() never has to rewrite the stack frame* and it can also use a tail call optimization to jump directly to the method invocation. This is the same reason why you never see objc_msgSend() in backtraces in the debugger (save for when you actually crash/break in the messenger).
objc_msgSend() uses the object and the _cmd to look up the implementation of the method and then, quite literally, jumps to that implementation.
Very fast. Stack frame untouched.
And, as others have stated, having _cmd around in the method implementation can be handy for a variety of reasons. As well, it also means that the messenger can do neat tricks like proxy support via NSInvocation and the like.
*rewriting the stack frame can be insanely complex and expensive. Some of the arguments might be in registers some of the time, etc... All architecture dependent ABI nastiness. One of the biggest challenges to writing things like imp_implementationWithBlock() was figuring out how to do so without touching the stack because doing so would have been too slow and too bloated to be viable.
The purpose of having the second parameter contain the selector is to enable a common dispatch mechanism. As such, the method dispatch code always expects the second parameter to be the selector, and dispatches based on that, or follows the inheritance chain up, or even creates an NSInvocation and calls forwardInvocation:.
Generally, only system-level routines use the selector argument, although it's rather nice to have it when you hit an exception or are in the debugger trying to figure out what routine is giving you difficulties if you are using forwardInvocation
From the documentation:
Discussion
This data type is a pointer to the start of the function that implements the method. This function uses standard C calling conventions as implemented for the current CPU architecture. The first argument is a pointer to self (that is, the memory for the particular instance of this class, or, for a class method, a pointer to the metaclass). The second argument is the method selector. The method arguments follow.
In Objective-C when you call a method you need to know the target, the selector and the eventual arguments. Let's suppose that you are trying to do this manually: how can you know which method to call if you don't know the selector? Do you call some random method? No, you call the right method because you know the method name.
I am having problem understanding part of the function of "selectors", as described in Apple's guide. I've bolded the parts where I am getting confused:
In Objective-C, selector has two meanings. It can be used to refer
simply to the name of a method when it’s used in a source-code message
to an object. It also, though, refers to the unique identifier that
replaces the name when the source code is compiled. Compiled
selectors are of type SEL. All methods with the same name have the
same selector. You can use a selector to invoke a method on an
object—this provides the basis for the implementation of the
target-action design pattern in Cocoa.
Methods and Selectors For efficiency, full ASCII names are not used as
method selectors in compiled code. Instead, the compiler writes each
method name into a table, then pairs the name with a unique identifier
that represents the method at runtime. The runtime system makes sure
each identifier is unique: No two selectors are the same, and all
methods with the same name have the same selector.
Can anyone explain these bits? Additionally, if different classes have methods with the same name, will they also have the same selector?
All selectors are uniqued -- both at compile time and, dynamically, at runtime through sel_getUid() or the preferred sel_registerName() (the latter being largely preferred, the former still around for historical reasons) -- for speed.
Back story: to call a method, the runtime needs a selector that identifies what is to be called and an object that it will be called on. This is why every single method call in Objective-C has two parameters: the obvious and well known self and the invisible, implied, parameter _cmd. _cmd is the SEL of the method currently executing. That is, you can paste this code into any method to see the name -- the selector -- of the currently executing method:
NSLog(#"%#", NSStringFromSelector(_cmd));
Note that _cmd is not a global; it really is an argument to your method. See below.
By uniquing the selectors, all selector based operations are implemented using pointer equality tests instead of string processing or any pointer de-referencing at all.
In particular, every single time you make a method call:
[someObject doSomething: toThis withOptions: flags]; // calls SEL doSomething:withOptions:
The compiler generates this code (or a very closely related variant):
objc_msgSend(someObject, #selector(doSomething:withOptions:), toThis, flags);
The very first thing objc_msgSend() does is check to see if someObject is nil and short-circuit if it is (nil-eats-message). The next (ignoring tagged pointers) is to look up the selector in someObjects class (the isa pointer, in fact), find the implementation, and call it (using a tail call optimization).
That find the implementation thing has to be fast and to make it really fast, you want the key to finding the implementation of the method to be as fast and stable as possible. To do that, you want the key to be directly usable and globally unique to the process.
Thus, the selectors are uniqued.
That it also happens to save memory is an fantastic benefit, but the messenger would use more memory than it does today if messenging could be made 2x faster (but not 10x for 2x -- or even 2x memory for 2x speed -- while speed is critical, memory use is also critical, certainly).
If you really want to dive deep on how objc_msgSend() works, I wrote a bit of a guide. Note that it is slightly out of date as it was written before tagged pointers, blocks-as-implementation, and ARC were disclosed. I should update the articles.
Yes. Classes do share selectors.
I can give an example from the source code objc-sel.mm, but when you use sel_registerUid() (used behind the scenes in #selector()),
It copies the input string into an internal buffer (if the string hasn't been registered before), for which all future SELs point to.
This is done for less memory usage, and easier message forwarding.
It also, though, refers to the unique identifier that replaces the name when the source code is compiled... All methods with the same name have the same selector.
For this, I refer to an excellent blog post on selectors:
A selector is the same for all methods that have the same name and parameters — regardless of which objects define them, whether those objects are related in the class hierarchy, or actually have nothing to do with each other. At runtime, Objective-C goes to the class and outright asks it, "Do you respond to this selector?", and calls the resulting function pointer if it does.
The runtime system makes sure each identifier is unique: No two selectors are the same, and all methods with the same name have the same selector.
In a screwed up way, this makes sense. If Method A and Method B have the exact same name and arguments, wouldn't it be more efficient to store their selector as one lookup and query the receiver instead of deciding between two basically equally named selectors at runtime?
Look at the SEL type, you don't have to define which class this selector is from, you just give it a method name, for example:
SEL animationSelector = #selector(addAnimation:forKey:);
You can imagine it as a streetname, for example. Many cities can have the same streetnames, but a streetname without a city is worthless. Same for selectors, you can define a selector without adding the object where it's in. But it's complete worthless without fitting class..
In Objective-C, when I want to call a subroutine, I send a message to an object, like:
[self mySubroutine:myParameter];
There is a (negligible?) performance penalty, so I could just use a C-style function call:
mySubroutine(myParameter);
The implementation of the latter would then reside outside the class’s #implementation context.
Is this a no-no? Is it common? Is there a best-practice on this?
Note that those are not necessarily equivalent. Since -mySubroutine is an instance method, it probably needs to access a given instance. In that case, your mySubroutine() function should have another parameter for the instance, too.
In general, use a method. If you’re worried about performance,1 you can always get an IMP to the method and use it as a function instead of the standard Objective-C message dispatch infrastructure.
That said, some disadvantages of using functions:
They cannot be overridden by subclasses;
There’s no introspection (when using the runtime to obtain a list of methods declared by an Objective-C class, functions aren’t enumerated);
They cannot be used as accessors/mutators of declared properties;
They aren’t visible to Key-Value Coding;
They cannot be directly used for Objective-C message forwarding;
They cannot be directly used in the various cases where a Cocoa API expects a selector (e.g. when using NSTimer).
Some advantages of using functions:
They cannot be overridden by subclasses (if you want to prevent this);
There’s no introspection (if you want to prevent this);
They can be inlined;
They can have file scope (static), preventing code from other files from accessing them.
1When you’ve determined that the message dispatch infrastructure is actually a bottleneck. This does happen; for instance, some Apple audio examples do not use Objective-C for audio processing.
Edit: Based on OP’s comment, another advantage of functions is that they aren’t necessarily related to a class. If the computation of an approximate value for the sine of an angle doesn’t depend on an Objective-C instance, there’s no need to make it a method — a function is a better fit.
It might be worth using where you have static utility functions, such as in a maths library.
In general though, if you need methods that act on the state of an object, the C approach won't be much use, as you won't have implicit access to self, unless you explicitly pass it as a parameter.
You may also run into namespace issues. With the Objective-C different classes can share method names, with the c approach all your functions will need different signatures.
Personally I would always use objective-c methods, the performance difference will be negligible.