In Objective-C, there are at least two ways to get (or create? Hence the question) a selector: @selector(foo:bar:), or NSSelectorFromString(@"foo:bar:"). But what is the lifetime of a selector?
Since selectors know (at least) their name, they are unlikely to be copyable values of a fixed size that can be shuffled around the program. This means that someone needs to allocate and possibly deallocate them. Most objects in the Cocoa frameworks have retain-release semantics, which makes their ownership explicit and relatively easy to track. However, I see no clear concept of ownership for selectors.
I expect that selectors obtained with the first syntax live as globals for the whole life of the program (like literal strings), but what about the other? If I create/get a selector with NSSelectorFromString(@"foo:bar:"), will it be valid for the whole life of my program too?
It's "get", not "create". Both of those simply retrieve the selector, which is created and owned by the runtime system. The SEL's lifetime is therefore effectively immortal.
If you wanted to create a selector yourself, you would need to use the runtime function sel_registerName(). This function is used by NSSelectorFromString() if you pass it a name which is as yet unknown to the runtime.
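For instance, a minimal sketch (foo:bar: is an arbitrary name here that nothing actually implements, so the compiler may warn about the unknown selector) shows all three routes handing back the very same uniqued SEL:

#import <Foundation/Foundation.h>
#import <objc/runtime.h>

int main(void) {
    @autoreleasepool {
        SEL a = @selector(foo:bar:);
        SEL b = NSSelectorFromString(@"foo:bar:");
        SEL c = sel_registerName("foo:bar:");
        // Uniquing means plain pointer equality holds: prints "1 1".
        NSLog(@"%d %d", a == b, b == c);
    }
    return 0;
}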
As per Apple's documentation, selectors are registered globally and live forever. If you pass a new or unknown selector name to NSSelectorFromString it will be registered as a new selector.
I'm trying to understand the method withArgs:executeMethod: in Smalltalk (Squeak).
1. What is the role of this method?
2. What arguments need to be passed to it for it to execute?
A good way to understand this method is by considering it as a syntactic variant of the general expression
object msg: arg (*)
where object is the receiver of the message with selector msg: and arg its argument. There are of course variants with no or multiple arguments, but the idea is the same.
When object receives this message (*), the Virtual Machine (VM) looks up the CompiledMethod with selector msg: in the object's class hierarchy and transfers control to it, binding self to object and the formal argument of the method to arg.
Notice that this invocation is managed by the VM, not by the Virtual Image (VI). So, how could we reflect the same in the VI? Well, there are two steps in this behavior: (1) find the method, and (2) bind its formal receiver and arguments to the actual ones and let it run.
Step (1) is the so-called lookup algorithm. It is easily implemented in Smalltalk: just ask the receiver for its class, check whether the class includes the selector #msg: and, if not, go to the superclass and repeat. If all checks fail, issue the doesNotUnderstand: message.
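Here is a minimal Smalltalk sketch of that loop (the selector lookupSelector:startingAt: is invented; it assumes the standard reflection messages includesSelector:, compiledMethodAt:, and superclass):

lookupSelector: aSelector startingAt: aClass
    "Walk the hierarchy looking for a CompiledMethod under aSelector."
    | class |
    class := aClass.
    [class notNil] whileTrue: [
        (class includesSelector: aSelector)
            ifTrue: [^class compiledMethodAt: aSelector].
        class := class superclass].
    ^nil "here the VM would instead send #doesNotUnderstand:"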
Step (2) exactly requires what #withArgs:executeMethod: provides. It allows us to say
object withArgs: {arg} executeMethod: method
where method is the CompiledMethod found in step (1). [We have to use {arg} rather than arg because, as the plural in withArgs: suggests, the method expects an Array of arguments.]
Why would we want this?
Generally speaking, giving the VI the capability to mimic behavior implemented in the VM is good because it makes metaprogramming easier (and more natural).
More practically, a relevant example of the use of this capability is the implementation of Method Wrappers. Briefly described: given any particular method, you can wrap it (as the wrappee) inside a wrapper method, which also holds a preBlock. If you then replace the original method in the MethodDictionary where it belongs with the wrapper, the wrapper can first evaluate the preBlock and then run the intended method. The first task is easy: just send the message preBlock value. For the second, we have the method (the wrappee), the receiver, and the arguments (if any), so to complete the task you only need to send withArgs:executeMethod: to the receiver with the actual argument(s) and the wrappee, as sketched below.
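A very rough sketch of the wrapper's entry point (this borrows the run:with:in: pattern from the MethodWrappers package; preBlock and wrappee are assumed to be instance variables of the wrapper):

run: aSelector with: anArray in: aReceiver
    "Evaluate the pre-block, then resume the original method."
    preBlock value.
    ^aReceiver withArgs: anArray executeMethod: wrappee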
Ah! Let's not forget to mention that one of the reasons for having Method Wrappers is to measure testing coverage.
Note also that withArgs:executeMethod: does not require the second argument, i.e., the method to execute, to be in any class, let alone the class of the receiver. In particular, you could create a CompiledMethod on the fly and execute it on any given object. Of course, it is up to you to make sure that the execution will not crash the VM by, say, using the third ivar of the receiver if the receiver has only two etc. A simple way to create a CompiledMethod without installing it in any class is by asking the Smalltalk compiler to do so (look for senders of newCompiler to learn how to do that).
I'm new to Objective-C and was wondering if anyone could provide any information to clarify this for me. My (possibly wrong) understanding of object instantiation in other languages is that the object gets its own copies of instance variables as well as instance methods, but I'm noticing that all the literature I've read thus far about Objective-C seems to indicate that the object only gets copies of instance variables, and that even when calling an instance method, program control reverts back to the original method defined inside the class itself. For example, this page from Apple's developer site shows program flow diagrams that suggest this:
https://developer.apple.com/library/mac/documentation/cocoa/conceptual/ProgrammingWithObjectiveC/WorkingwithObjects/WorkingwithObjects.html#//apple_ref/doc/uid/TP40011210-CH4-SW1
Also in Kochan's "Programming in Objective-C", 6th ed., pg. 41, referring to an example fraction class and object, the author states that:
"The first message sends the setNumerator: message to myFraction...control is then sent to the setNumerator: method you defined for your Fraction class...Objective-C...knows that it's the method from this class to use because it knows that myFraction is an object from the Fraction class"
On pg. 42, he continues:
"When you allocate a new object...enough space is reserved in memory to store the object's data, which includes space for its instance variables, plus a little more..."
All of this would seem to indicate to me that there is only ever one copy of any method, the original method defined within the class, and when calling an instance method, Objective-C simply passes control to that original copy and temporarily "wires it" to the called object's instance variables. I know I may not be using the right terminology, but is this correct? It seems logical as creating multiple copies of the same methods would be a waste of memory, but this is causing me to rethink my entire understanding of object instantiation. Any input would be greatly appreciated! Thank you.
Your reasoning is correct. The instance methods are shared by all instances of a class. The reason is, as you suspect, that doing it the other way would be a massive waste of memory.
The temporary wiring you speak of is that each method has an additional hidden parameter passed to it: a pointer to the object it was called on (the receiver, self). Since that gives the method access to the receiver, it can easily access all of the necessary instance variables and all is well. Note that any static variable exists in only a single instance as well; if you are not aware of that, unexpected things can happen. However, regular local variables are not shared and are recreated for each call of a method.
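Here is a rough sketch of that wiring (Counter is an invented class, and the lowered C function is only an approximation of what the compiler actually emits):

#import <Foundation/Foundation.h>

@interface Counter : NSObject {
    @public int count;
}
- (void)increment;
@end

@implementation Counter
- (void)increment {
    // One shared copy of this code serves every Counter instance;
    // the hidden 'self' argument points it at the right storage.
    count += 1;
}
@end

// Roughly what the compiler lowers the method to: a plain C function
// whose first two arguments are the receiver and the selector.
static void Counter_increment(Counter *self, SEL _cmd) {
    self->count += 1;
}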
Apple's documentation on the topic is very good, so have a look for more info.
Just think of a method as a set of instructions. There is no reason to have a copy of the same method for each object. I think you may be mistaken about other languages as well. Methods are associated with the class, not individual objects.
Yes, your thinking is more or less right (although it's simpler than that: behind the scenes in most such languages methods don't need to be "wired" to anything, they just take an extra parameter for self and insert struct lookups before references to instance variables).
What might be confusing you is that not all languages work this way, either in their implementations or in their semantics. Object-oriented languages are (very roughly) divided into two camps: class-based, like Objective-C, and prototype-based, like JavaScript. In the second camp, a method or procedure really is an object in its own right and can often be assigned directly to an object's instance variables; there are no classes to look up methods from, only objects and other objects, all with the same first-class status (this is an oversimplification; good languages still allow for sharing and efficiency).
I am having trouble understanding part of the function of "selectors", as described in Apple's guide. I've bolded the parts where I am getting confused:
In Objective-C, selector has two meanings. It can be used to refer simply to the name of a method when it’s used in a source-code message to an object. It also, though, refers to the unique identifier that replaces the name when the source code is compiled. Compiled selectors are of type SEL. All methods with the same name have the same selector. You can use a selector to invoke a method on an object—this provides the basis for the implementation of the target-action design pattern in Cocoa.
Methods and Selectors: For efficiency, full ASCII names are not used as method selectors in compiled code. Instead, the compiler writes each method name into a table, then pairs the name with a unique identifier that represents the method at runtime. The runtime system makes sure each identifier is unique: No two selectors are the same, and all methods with the same name have the same selector.
Can anyone explain these bits? Additionally, if different classes have methods with the same name, will they also have the same selector?
All selectors are uniqued, both at compile time and, dynamically, at runtime through sel_getUid() or sel_registerName() (the latter is largely preferred; the former is still around for historical reasons), for speed.
Back story: to call a method, the runtime needs a selector that identifies what is to be called and an object to call it on. This is why every method in Objective-C receives two hidden parameters: the obvious and well-known self and the invisible, implied parameter _cmd. _cmd is the SEL of the method currently executing. That is, you can paste this code into any method to see the name, the selector, of the currently executing method:
NSLog(#"%#", NSStringFromSelector(_cmd));
Note that _cmd is not a global; it really is an argument to your method. See below.
By uniquing the selectors, all selector-based operations are implemented using pointer equality tests instead of string processing or any pointer dereferencing at all.
In particular, every single time you make a method call:
[someObject doSomething: toThis withOptions: flags]; // calls SEL doSomething:withOptions:
The compiler generates this code (or a very closely related variant):
objc_msgSend(someObject, @selector(doSomething:withOptions:), toThis, flags);
The very first thing objc_msgSend() does is check whether someObject is nil and short-circuit if it is (nil-eats-message). The next step (ignoring tagged pointers) is to look up the selector in someObject's class (through the isa pointer, in fact), find the implementation, and call it (using a tail call optimization).
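The nil short-circuit is easy to observe (a trivial sketch):

NSString *name = nil;
NSUInteger length = [name length]; // no dispatch happens; length == 0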
That find the implementation thing has to be fast and to make it really fast, you want the key to finding the implementation of the method to be as fast and stable as possible. To do that, you want the key to be directly usable and globally unique to the process.
Thus, the selectors are uniqued.
That it also happens to save memory is a fantastic benefit, but speed is the driver: the messenger would happily use more memory than it does today if that made messaging 2x faster (within reason; not 10x the memory for 2x the speed, or even 2x the memory for 2x the speed, because while speed is critical, memory use is critical too).
If you really want to dive deep on how objc_msgSend() works, I wrote a bit of a guide. Note that it is slightly out of date as it was written before tagged pointers, blocks-as-implementation, and ARC were disclosed. I should update the articles.
Yes. Classes do share selectors.
I can give an example from the runtime source (objc-sel.mm): when you use sel_registerName() (used behind the scenes by @selector()), it copies the input string into an internal buffer (if the string hasn't been registered before), and all future SELs with that name point into that buffer. This is done for lower memory usage and easier message forwarding.
It also, though, refers to the unique identifier that replaces the name when the source code is compiled... All methods with the same name have the same selector.
For this, I refer to an excellent blog post on selectors:
A selector is the same for all methods that have the same name and parameters — regardless of which objects define them, whether those objects are related in the class hierarchy, or actually have nothing to do with each other. At runtime, Objective-C goes to the class and outright asks it, "Do you respond to this selector?", and calls the resulting function pointer if it does.
The runtime system makes sure each identifier is unique: No two selectors are the same, and all methods with the same name have the same selector.
In a screwed up way, this makes sense. If Method A and Method B have the exact same name and arguments, wouldn't it be more efficient to store a single selector for both and ask the receiver which implementation to use, rather than deciding between two identically named selectors at runtime?
Look at the SEL type: you don't have to define which class the selector is from; you just give it a method name. For example:
SEL animationSelector = @selector(addAnimation:forKey:);
You can imagine it as a street name. Many cities can have the same street names, but a street name without a city is worthless. The same goes for selectors: you can define a selector without naming the class it lives in, but it is worthless without a fitting class.
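To make the sharing concrete, here is a hedged sketch (Dog and Door are invented classes): one selector, two unrelated implementations, and only the receiver decides which one runs.

#import <Foundation/Foundation.h>
#import <objc/message.h>

@interface Dog : NSObject
- (void)open;
@end
@implementation Dog
- (void)open { NSLog(@"the dog opens its mouth"); }
@end

@interface Door : NSObject
- (void)open;
@end
@implementation Door
- (void)open { NSLog(@"the door swings open"); }
@end

int main(void) {
    @autoreleasepool {
        SEL open = @selector(open); // one and the same SEL for both classes
        void (*call)(id, SEL) = (void (*)(id, SEL))objc_msgSend;
        call([Dog new], open);
        call([Door new], open);
    }
    return 0;
}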
I am trying to make a catch-all proxy that takes any selector with any type of arguments and sends an RPC call down the wire. In this case, the signature of the method is not known because it can be created arbitrarily by the end user.
I see that the method signature requires supplying type encodings for each of the arguments. Is there a type encoding that signifies absolutely anything (pointer, int, everything else)? Otherwise, is there another way to accomplish this effect?
"sort of", but not really... or, at least, not practically.
You can implement the method forwarding protocol such that unrecognized method calls will be wrapped up in an NSInvocation and then you could tear into the NSInvocation, but it really isn't practical for a number of reasons.
First, it only really works for relatively simple argument types. The C ABI is such that complex arguments -- structures, C++ objects, etc.. -- can be encoded on the stack in wonky ways. In fact, they can be encoded in ways where there isn't enough metadata to decode the frames.
Secondly, any kind of a system where "selectors can be created arbitrarily by the user" has a very distinct odor about it; an odor of "you are doing it the hard way". Objective-C, while exceptionally dynamic, was really not designed to support this level of pure meta-object pattern.
As well, any such resulting system is going to be exceptionally fragile. What if the "arbitrary selector" happens to be, say, @selector(hash)?
Can you describe in more detail about this catch-all proxy and what it needs to forward to?
If your proxy only needs to forward each message to one target, then it can do so in its -forwardingTargetForSelector: at runtime. If not (e.g. you need to forward to multiple targets or do other complicated manipulation), you need to implement -forwardInvocation: to handle it. Using -forwardInvocation: to handle calls requires you to also implement -methodSignatureForSelector:, because the runtime needs the method signature in order to create the invocation. (Even if you forward it to another object, that object also needs to either implement the method directly, add the method in response to +resolveInstanceMethod:, or handle it using -forwardInvocation:, all of which require it to have the signature as well.)
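For the single-target case, here is a minimal sketch (RPProxy and _target are invented names; it assumes the target really implements the forwarded methods, so its signatures can be borrowed):

#import <Foundation/Foundation.h>

@interface RPProxy : NSProxy
- (instancetype)initWithTarget:(id)target;
@end

@implementation RPProxy {
    id _target;
}

- (instancetype)initWithTarget:(id)target {
    _target = target; // NSProxy has no -init; just record the target
    return self;
}

// The runtime asks for a signature before it can build the NSInvocation.
- (NSMethodSignature *)methodSignatureForSelector:(SEL)sel {
    return [_target methodSignatureForSelector:sel];
}

// Every unrecognized message arrives here, boxed as an NSInvocation.
- (void)forwardInvocation:(NSInvocation *)invocation {
    // Inspect or rewrite the arguments here, then pass the call along.
    [invocation invokeWithTarget:_target];
}

@end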
A method signature encodes the types of the arguments and the return type. The reason this information is needed for an invocation is that when the arguments are passed, they are laid out in memory at compile time (probably consecutively) according to their types in the declaration. A large struct parameter is going to take up more space than an int parameter. A double is also probably bigger than an int. An invocation needs to store all of these arguments and let you access or change them by index. There is no way to figure out how the arguments are laid out at runtime unless you know the types (or at least the sizes of the types).
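To see what a signature carries, here is a small sketch; the encoding string "d@:id" is hand-written for a hypothetical method taking an int and a double and returning a double (the first two entries are always the hidden self and _cmd):

NSMethodSignature *sig = [NSMethodSignature signatureWithObjCTypes:"d@:id"];
// Four arguments: self (@), _cmd (:), an int (i), and a double (d).
NSLog(@"%lu arguments, returns '%s'",
      (unsigned long)sig.numberOfArguments, sig.methodReturnType);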
Also, the message-passing mechanism is different for methods that return structs (they go through objc_msgSend_stret) than for other methods (which go through objc_msgSend), and on some platforms methods that return doubles use objc_msgSend_fpret. In the former case, the struct is not returned directly; instead, the location to write it to is passed as an extra pointer argument, an out parameter. So knowing the return type is also critical to handling the call and the return value in the invocation.
Even if you are forwarding the invocation to some other object, ultimately that object (or some object it forwards to down the line) has to know the method signature somehow. So why not ask that object for the signature of the selector when you need it?
There is no "safe" signature that will work for all things, because different types have different sizes.
When ARC came to Objective-C, I did my best to read through the Objective-C Automatic Reference Counting (ARC) guide posted on the Clang project website to get a better hang of what it was about. What I found there (and nowhere else) was mention of using __attribute__ declarations to signify to ARC whether certain code autoreleases its return value (__attribute__((ns_returns_autoreleased))), whether it 'consumes' a parameter (__attribute__((ns_consumed))), and so on.
However, it seems that the guide gives very little word on the actual level of necessity these declarations hold. Excluding them seems to make no difference, neither when running the static analyzer nor when running the project itself. Do these even make a difference? Is there any advantage to labeling a method with __attribute__((objc_method_family(new)))? No article I've found on ARC makes mention of these specifiers at all; perhaps an ARC guru can give word on what these are used for.
(Personally, I include all relevant specifiers just in case, but find that they make code obfuscated and messy.)
These attributes are expressly for abnormal cases, such as:
A function or method parameter of retainable object pointer type may be marked as consumed, signifying that the callee expects to take ownership of a +1 retain count.
A function or method which returns a retainable object pointer type may be marked as returning a retained value, signifying that the caller expects to take ownership of a +1 retain count.
You don't normally do these things, so you don't normally use these attributes. With no attributes, the normal behavior (the NARC rule, i.e. new/alloc/retain/copy, or perhaps under ARC I should say CAN, since you no longer retain by hand) is what the compiler implements and expects.
There are two reasons to use these attributes:
In order to violate the CAN rule; that is, to have a method that returns a +1 reference despite not being so named, or a method so named that doesn't. The attribute documents the violation in the method's prototype, and may even be necessary to implement it, if the implementation uses ARC. (See the sketch after this list.)
Working with Core Foundation types, including Core Graphics types. These aren't ARCed, so you need to use the bridging attributes to aid conversion to and from “retainable object pointer” types.
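A hedged sketch of what such annotations look like (both function names are invented; the first returns a +1 reference despite not being named like a copy/new method, the second takes ownership of its argument):

#import <Foundation/Foundation.h>

// Caller takes ownership of a +1 retain count, despite the plain name.
__attribute__((ns_returns_retained))
NSString *MakeGreeting(void);

// Callee takes ownership of the object passed in.
void ConsumeValue(__attribute__((ns_consumed)) id value);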
That's not necessary in most cases, since Clang knows the Objective-C naming conventions. So if you follow the standard Cocoa naming conventions, it automagically assumes the corresponding method family and return-value memory policy.
Namely, if you declare a method named initWith..., Clang will automatically consider it part of the "init" family of methods; there is no need to specify __attribute__((objc_method_family(init))), since Clang detects it automatically. The same goes for the new family, etc.
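For example, here is a sketch of the opposite case (DocumentStore, Document, and newDocument are invented names): without the attribute, ARC would place newDocument in the "new" family and expect it to return a +1 reference.

@class Document;

@interface DocumentStore : NSObject
// Opt out of the 'new' family so the result is treated as +0.
- (Document *)newDocument __attribute__((objc_method_family(none)));
@end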
In fact, you only need the __attribute__ specifiers when Clang can't infer these cases, which rarely occurs in practice (I have never had to use them), or when you don't respect the naming conventions:
Quoting Clang Language Extensions Documentation:
Many methods in Objective-C have conventional meanings determined by their selectors. For the purposes of static analysis, it is sometimes useful to be able to mark a method as having a particular conventional meaning despite not having the right selector, or as not having the conventional meaning that its selector would suggest. For these use cases, we provide an attribute to specifically describe the method family that a method belongs to.
So as long as you respect the naming conventions (which you should always do), you won't have anything to do.
You should definitely stick to naming conventions wherever possible.
It's clearer to read.
Attributes can introduce build errors if there is a conflict.
ARC semantics combined with attributes are relatively fragile.