Passing copy of object to method -- who does the copying?


I have an object that I'm passing in a method call. Say I'm using a language that only allows you to pass objects by reference, like Java or PHP. If the method makes changes to the object, it will affect the caller. I don't want this to happen. So it seems like I need to make a copy of the object.
My question is: whose responsibility is it to clone the object? The caller, before it calls the method? Or the callee, before it changes the object?
EDIT: Just to clarify, I want this to be part of the contract of this method -- that it never modifies the original object. So it seems like it should be up to the method to make the copy. But then the caller has no protection from a method that doesn't do this properly. I guess that's acceptable -- the only other alternative seems to be to have this built into the language.

Generally, the caller should make the copy if it is concerned about changes. If the caller doesn't care, the method should make the copy if it needs to do something that it knows shouldn't persist.

So you want to do something like this?
MyObject m = new MyObject();
MyObject n = MyObjectProcessor.process(m);
It seems simpler to me to do something like
MyObject n = MyObjectProcessor.alter(m.clone());
where it's clear who's doing what to whom. You could argue that the processor class's function should be free of side effects, i.e. it should return a new object any time it's going to change state, but (AFAIK) that's not followed as consistently in OO as it is in functional programming.
Something like the above is probably harmless, as long as it's clearly named and documented.
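As a rough illustration of the caller-makes-the-copy approach, here is a minimal Java sketch. The MyObject fields, the copy constructor (used here instead of clone()), and the body of alter are assumptions made up for the example; only the names MyObject, MyObjectProcessor and alter come from the answer above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mutable class with a copy constructor (an alternative to clone()).
class MyObject {
    final List<String> items = new ArrayList<>();

    MyObject() {}

    // Copy constructor: copies the list so the two objects share no mutable state.
    MyObject(MyObject other) {
        items.addAll(other.items);
    }
}

class MyObjectProcessor {
    // Documented as mutating: changes its argument in place and returns it.
    static MyObject alter(MyObject m) {
        m.items.add("altered");
        return m;
    }
}

class CallerCopyDemo {
    public static void main(String[] args) {
        MyObject m = new MyObject();
        m.items.add("original");

        // The caller decides to protect m by passing a copy.
        MyObject n = MyObjectProcessor.alter(new MyObject(m));

        System.out.println(m.items); // [original]
        System.out.println(n.items); // [original, altered]
    }
}
```

The copy is visible at the call site, so a reader of the calling code can tell that m is protected without having to inspect alter.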

We could look at Ruby for guidance. By convention, a trailing ! in a method name indicates that the object is modified in place. So salary.add(10000) returns a new object, whereas salary.add!(10000) returns the original object, modified. You could use the same idea in Java or PHP by adopting a local naming convention.
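Java does not allow ! in method names, so the convention has to be carried by the names themselves. The sketch below is one possible way to do it; the Salary class and the names plus and addInPlace are hypothetical, not an established standard.

```java
// Hypothetical Salary class; the method names are one possible convention.
class Salary {
    private long amount;

    Salary(long amount) {
        this.amount = amount;
    }

    long amount() {
        return amount;
    }

    // Non-mutating, in the spirit of salary.add(10000): returns a new object.
    Salary plus(long delta) {
        return new Salary(amount + delta);
    }

    // Mutating, in the spirit of salary.add!(10000): changes this object and returns it.
    Salary addInPlace(long delta) {
        amount += delta;
        return this;
    }
}
```

With a convention like this, the caller can tell from the method name alone whether a copy is needed before the call.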

The caller, because sometimes you want to make changes to the object itself and other times to a copy.
That said, I consider it bad practice for the callee to modify passed objects (at least in object-oriented languages). It can cause many unwanted side effects.
(After your EDIT:) In that case it is the callee's responsibility to enforce the contract, so there are two options:
The callee simply does not modify the object
or the callee copies the object and works with the copy afterwards

It depends. Is there any reason the method might be called in the future in a situation where you want the change to be seen by the caller? If so, the caller should make the copy. Otherwise the callee should make it. I would say that the second case is probably more common.

If you have the caller clone the object, it gives you the flexibility to not use a copy (by not cloning it first), and also means you don't have to return a new object, you can just operate on the reference passed in.

My first reaction would be that it is the caller's responsibility, but I think it actually depends.
It depends on the contract defined between the two methods. The method that is making changes should explicitly identify that fact and let the caller make the decision. OR, The method that is making the changes should explicitly identify that it will NOT make any changes to the passed object and then it would be responsible for making the copy.

I would say the callee: it simplifies calls, and the caller won't have to worry about the integrity of the objects it passes. It is the responsibility of the callee to preserve that integrity.

I assume you would have something like a const declaration. This would be compiler-enforced and would be more efficient than creating copies of your objects.

I think the caller should make the clone, just to make the naming easier. You can name your method Foo() instead of CloneBarAndFooClone().
Compare:
void RemoveZeroDollarPayments(Order order)
vs.
Order CloneOrderAndRemoveZeroDollarPaymentsFromClone(Order order)

If not changing the object is part of the method contract, the only possibility is having the copy made inside the method. Otherwise you are lying to your client.
The fact that you actually need to modify an object exactly like the one given to you is just an implementation detail that should not put a burden on the caller. In fact, he does not even need to have visibility of that.
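A minimal Java sketch of the copy-inside-the-method approach follows. The Order class, its payments field, and the method name withoutZeroDollarPayments are assumptions, loosely modelled on the RemoveZeroDollarPayments example from the earlier answer.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

// Hypothetical Order type holding a mutable list of payment amounts.
class Order {
    final List<BigDecimal> payments;

    Order(List<BigDecimal> payments) {
        // Copy on the way in so this Order owns its own list.
        this.payments = new ArrayList<>(payments);
    }
}

class OrderCleaner {
    // Contract: never modifies the order passed in; returns a cleaned copy.
    static Order withoutZeroDollarPayments(Order order) {
        Order copy = new Order(order.payments); // the callee makes the copy
        copy.payments.removeIf(p -> p.signum() == 0);
        return copy;
    }
}
```

The caller never sees the copy being made; it only relies on the documented promise that its own Order is left untouched. A shallow copy of the list is enough here because BigDecimal values are immutable.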

Related

CF_RETURNS_RETAINED or CF_RETURNS_NOT_RETAINED: which to use when?

I am unsure whether to use CF_RETURNS_RETAINED or CF_RETURNS_NOT_RETAINED for my custom function returning a CGDataProviderRef.
According to the documentation at the location where the macros are defined, both should only be used in exceptional circumstances, and the correct fix is to change my naming convention.
However, the Swift/Objective-C documentation suggests using them to annotate any function returning a Core Foundation pointer, without really explaining when to use which. If I don't annotate them, I need to manually specify the behaviour every time in the Swift code.
Further documentation I could find explains that one corresponds to an ARC value of +1 and the other to 0, but I'm afraid that doesn't help me much in understanding.
My questions:
Should I be using them at all, or improve my naming as Base.h suggests?
What is the exact naming convention that is being used there?
Am I correct in assuming that I should use one if I want the memory used by the returned object freed as soon as it goes out of scope in the caller, and the other if I don't want this to happen (because I use a pointer to it somewhere else, or clean up myself)?
All my function does is return a CGDataProviderRef that I got from a call to CGDataProviderCreateSequential. I guess that means I want the same behaviour as CGDataProviderCreateSequential (right?). How can I find out whether that function uses CF_RETURNS_RETAINED or CF_RETURNS_NOT_RETAINED (it's not in the CGDataProvider.h file)?

This is all about ownership. The naming convention is that of Core Foundation, which is described here. The Create Rule says that functions with "Create" and "Copy" in their name leave the caller with ownership of the returned object (although not necessarily sole ownership). That means that the caller has a responsibility to eventually release the returned object using CFRelease(). The Get Rule says that functions other than creation or copy functions do not give the caller ownership of the object, so the caller must not call CFRelease() on it (except to balance any explicit calls to CFRetain(), of course).
If a function is semantically a creation or copy function which returns ownership to the caller, but its name doesn't indicate that, you should use CF_RETURNS_RETAINED to indicate that. Similarly, if a function's name contains "Create" or "Copy" but it doesn't have semantics matching the Create Rule, you should use CF_RETURNS_NOT_RETAINED to indicate that. (Try to avoid this.)
Since your code is calling CGDataProviderCreateSequential() and since that has "Create" in the name, your code is responsible for releasing the returned CGDataProvider object. If you release it right there in your function, then your caller can't get access to it. You want to return the object to the caller. You also want to pass the responsibility for releasing the object to the caller, since you don't know when the caller will be done with it. So, you should name your function with "Create" in its name to indicate both to the caller and to automated systems that the caller receives responsibility for releasing the object. Alternatively, you could annotate your function with CF_RETURNS_RETAINED to signify that, but following the naming convention is better.
It's possible that Swift only respects the naming convention for system headers. I don't know. In that case, you'd have to annotate with CF_RETURNS_RETAINED even if your function has "Create" in its name. There's no harm in annotating a function whose name already follows the convention. It's redundant but harmless.

Do Objective-C objects get their own copies of instance methods?

I'm new to Objective-C and was wondering if anyone could provide any information to clarify this for me. My (possibly wrong) understanding of object instantiation in other languages is that the object will get its own copies of instance variables as well as instance methods, but I'm noticing that all the literature I've read thus far about Objective-C seems to indicate that the object only gets copies of instance variables, and that even when calling an instance method, program control reverts back to the original method defined inside the class itself. For example, this page from Apple's developer site shows program flow diagrams that suggest this:
https://developer.apple.com/library/mac/documentation/cocoa/conceptual/ProgrammingWithObjectiveC/WorkingwithObjects/WorkingwithObjects.html#//apple_ref/doc/uid/TP40011210-CH4-SW1
Also in Kochan's "Programming in Objective-C", 6th ed., pg. 41, referring to an example fraction class and object, the author states that:
"The first message sends the setNumerator: message to myFraction...control is then sent to the setNumerator: method you defined for your Fraction class...Objective-C...knows that it's the method from this class to use because it knows that myFraction is an object from the Fraction class"
On pg. 42, he continues:
"When you allocate a new object...enough space is reserved in memory to store the object's data, which includes space for its instance variables, plus a little more..."
All of this would seem to indicate to me that there is only ever one copy of any method, the original method defined within the class, and when calling an instance method, Objective-C simply passes control to that original copy and temporarily "wires it" to the called object's instance variables. I know I may not be using the right terminology, but is this correct? It seems logical as creating multiple copies of the same methods would be a waste of memory, but this is causing me to rethink my entire understanding of object instantiation. Any input would be greatly appreciated! Thank you.

Your reasoning is correct. The instance methods are shared by all instances of a class. The reason is, as you suspect, that doing it the other way would be a massive waste of memory.
The temporary wiring you speak of is that each method has an additional hidden parameter passed to it: a pointer to the calling object. Since that gives the method access to the calling object, then it can easily access all of the necessary instance variables and all is well. Note that any static variable exists in only a single instance as well and if you are not aware of that, unexpected things can happen. However, regular local variables are not shared and are recreated for each call of a method.
Apple's documentation on the topic is very good, so have a look for more info.
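The same idea can be made visible in Java, where an unbound method reference turns the hidden receiver into an explicit first argument. This is only an analogy, not a description of how the Objective-C runtime dispatches messages; the Fraction class below is a hypothetical stand-in for the one in the question.

```java
import java.util.function.BiConsumer;

// Hypothetical Fraction class, echoing the example from the question.
class Fraction {
    int numerator;

    void setNumerator(int n) {
        this.numerator = n; // "this" plays the role of the hidden receiver parameter
    }
}

class SharedMethodDemo {
    public static void main(String[] args) {
        Fraction a = new Fraction();
        Fraction b = new Fraction();

        // An unbound method reference exposes the receiver as an explicit first argument.
        BiConsumer<Fraction, Integer> setNumerator = Fraction::setNumerator;

        setNumerator.accept(a, 1); // same code, receiver is a
        setNumerator.accept(b, 3); // same code, receiver is b

        System.out.println(a.numerator + " " + b.numerator); // 1 3
    }
}
```

There is one setNumerator for the whole class; each object only stores its own numerator.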

Just think of a method as a set of instructions. There is no reason to have a copy of the same method for each object. I think you may be mistaken about other languages as well. Methods are associated with the class, not individual objects.

Yes, your thinking is more or less right (although it's simpler than that: behind the scenes in most such languages methods don't need to be "wired" to anything, they just take an extra parameter for self and insert struct lookups before references to instance variables).
What might be confusing you is that not all languages work this way, either in their implementations or semantically. Object-oriented languages are (very roughly) divided into two camps: class-based, like Objective-C; and prototype-based, like JavaScript. In the second camp, a method or procedure really is an object in its own right and can often be assigned directly to an object's instance variables as well. There are no classes to look up methods from, only objects and other objects, all with the same first-class status (this is an oversimplification; good languages still allow for sharing and efficiency).

When using reference to objects, do we have a mechanism similar to "pass by value" for callee not to be able to make any change to the original data?

For the mechanism of "pass by value", it was so that the callee cannot alter the original data. So the callee can change the parameter variable in any way, but when the function returns, the original value in the argument variable is not changed.
But in Objective-C or Ruby, since all variables for objects are references to objects, when we pass the object to any method, the method can "send a message" to alter the object. After the method returns, the caller will continue with the argument already in a different state.
Or is there a way to guarantee the passed in object not changed (its states not altered) -- is there such a mechanism?

You're somewhat misusing the terms "pass by value" and "pass by reference" here. What you are really discussing is const. In C++, you can refer to a const instance of a mutable class. There is no similar concept for ObjC objects (or in Ruby, I believe, though I am much less familiar with Ruby than ObjC). ObjC does, via C, have the concept of const pointers, but these are a much weaker promise.
The best solution to this in ObjC is to prefer value (immutable) classes whenever possible. See Imutability in Objective-c for more discussion on that.
The next-best solution is to avoid this situation as a matter of design. Avoid side effects in your methods that are not obvious from the name. If you do, callers should not need to worry about it. Remember, the caller and the called are on the same team. Neither should be trying to protect itself from the other. Good naming and good API design help the developer avoid error without compiler enforcement. ObjC has little compiler enforcement, so good naming and good API design are absolutely critical. I would say the same for Ruby, despite my limited experience there, in that it is also a highly dynamic language.
Finally, if you are dealing with a poorly behaved API that does modify your object when it shouldn't, you can resort to passing it a copy.
But if you're designing this from scratch, think hard about using an immutable class whenever possible.
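To illustrate what an immutable (value) class buys you, here is a small sketch in Java rather than ObjC. The Money and Ledger classes and the fee amount are hypothetical; the point is only that a method receiving such an object has no way to alter the caller's instance.

```java
// Hypothetical immutable value class: the field is final and there are no setters.
final class Money {
    private final long cents;

    Money(long cents) {
        this.cents = cents;
    }

    long cents() {
        return cents;
    }

    // "Modification" produces a new instance; the receiver is never changed.
    Money plus(Money other) {
        return new Money(cents + other.cents);
    }
}

class Ledger {
    // Whatever this method does, it cannot alter the caller's Money object.
    static Money applyFee(Money balance) {
        return balance.plus(new Money(-250));
    }
}
```

Because no copy is ever needed, the question of who makes the copy does not arise.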

I'm not sure what you are getting at. Ruby is pass-by-value. You cannot "change the argument variable":
def is_ruby_pass_by_value?(foo)
  foo = 'No, Ruby is not pass-by-value.'
  return nil
end

bar = 'Yes, of course, Ruby *is* pass-by-value!'
is_ruby_pass_by_value?(bar)
p bar
# 'Yes, of course, Ruby *is* pass-by-value!'
I'm not sure about Objective-C, but I would be surprised if it were different.
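Java draws the same distinction, and a short sketch may make clear why the original question still stands: reassigning a parameter is invisible to the caller, but mutating the object it refers to is not. The class and method names below are made up for the example.

```java
class PassByValueDemo {
    // Reassigning the parameter only rebinds the local variable.
    static void reassign(StringBuilder sb) {
        sb = new StringBuilder("replaced");
    }

    // Mutating through the reference changes the object the caller also sees.
    static void mutate(StringBuilder sb) {
        sb.append(" (mutated)");
    }

    public static void main(String[] args) {
        StringBuilder bar = new StringBuilder("original");

        reassign(bar);
        System.out.println(bar); // original

        mutate(bar);
        System.out.println(bar); // original (mutated)
    }
}
```

So the reference is passed by value, which protects the caller's variable but not the object it points to.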

If blocks are objects, how do they keep internal state and what are their advantages over regular objects?

I was under the impression that blocks were supposed to resemble first-class functions and allow for lambda calc-style constructs. From a previous question however, I was told they are actually just objects.
Then I have 2 questions really:
Besides the feature of having access to their defining scope, which I guess makes them usable in a way resembling C++ "friendship", why would one go for a block instead of an object then? Are they more lightweight? Because if not I might as well keep passing objects as parameters instead of blocks.
Do blocks have a way of keeping an internal state? For instance, some variable declared inside the block which will retain its value across invocations.

Besides the feature of having access to their defining scope, which I guess makes them usable in a way resembling C++ "friendship", why would one go for a block instead of an object then?
Flexibility. Less to implement. A block is able to represent more than a parameter list or specific object type.
Are they more lightweight?
Not necessarily. Just consider them another tool in the toolbox, and use them where appropriate (or required).
Do blocks have a way of keeping an internal state? for instance, some variable declared inside the block which will retain its value across invocations.
Yes, they are able to perform reference counting as well as copy stack objects. That doesn't necessarily make them lighter-weight to use than an object representing the parameters you need.
Related
What's the difference between NSInvocation and block?

blocks were supposed to resemble first-class functions [...] they are actually just objects.
They are in fact first-class functions, implemented for the purposes of ObjC as objects. In plain-C, where they are also available, they have a closely-related but non-object-based implementation. You can think about them in whichever way is most convenient at the moment.
why would one go for a block instead of an object then?
A block is an executable chunk of code which automatically captures variables from its enclosing scope. The state and actions of a custom object have to be more explicitly handled, and are less generic; you can't use any old object as a completion argument, whereas an executable object fits that bill perfectly.
Do blocks have a way of keeping an internal state? for instance, some variable declared inside the block which will retain its value across invocations.
Sure, you can declare a static variable just like you could with a function or method:
void (^albatross)(void);
albatross = ^{
    // A static local is initialized once and keeps its value across invocations.
    static int notoriety;
    NSLog(@"%d", notoriety++);
};
albatross();
albatross();
albatross();
albatross();
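For comparison, a loosely analogous sketch in Java keeps state across invocations by capturing a mutable object in a lambda. This is an analogy only: unlike the static variable above, each call to the hypothetical makeCounter below gets its own independent counter.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntSupplier;

class CounterClosureDemo {
    // Returns a lambda that captures its own counter object.
    static IntSupplier makeCounter() {
        AtomicInteger notoriety = new AtomicInteger(); // captured state, starts at 0
        return () -> notoriety.getAndIncrement();
    }

    public static void main(String[] args) {
        IntSupplier albatross = makeCounter();
        System.out.println(albatross.getAsInt()); // 0
        System.out.println(albatross.getAsInt()); // 1
        System.out.println(albatross.getAsInt()); // 2
    }
}
```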

In ObjC, how to describe balance between alloc/copy/retain and auto-/release, in terms of location

As is common knowledge, calls to alloc/copy/retain in Objective-C imply ownership and need to be balanced by a call to autorelease/release. How do you succinctly describe where this should happen? The word "succinct" is key. I can usually use intuition to guide me, but would like an explicit principle in case intuition fails, and one that can be used in discussions.
Properties simplify the matter (the rule is auto-/release happens in -dealloc and setters), but sometimes properties aren't a viable option (e.g. not everyone uses ObjC 2.0).
Sometimes the release should be in the same block. Other times the alloc/copy/retain happens in one method, which has a corresponding method where the release should occur (e.g. -init and -dealloc). It's this pairing of methods (where a method may be paired with itself) that seems to be key, but how can that be put into words? Also, what cases does the method-pairing notion miss? It doesn't seem to cover where you release properties, as setters are self-paired and -dealloc releases objects that aren't alloc/copy/retained in -init.
It feels like the object model is involved with my difficulty. There doesn't seem to be an element of the model that I can attach retain/release pairing to. Methods transform objects from valid state to valid state and send messages to other objects. The only natural pairings I see are object creation/destruction and method enter/exit.
Background:
This question was inspired by: "NSMutableDictionary does not get added into NSMutableArray". In that question, the alloc/copy/retain calls were generally balanced by releases, but in a way that could cause memory leaks. The class was a delegate; some members were created in a delegate method (-parser:didStartElement:...) and released in -dealloc rather than in the corresponding (-parser:didEndElement:...) method. In this instance, properties seemed a good solution, but the question still remained of how to handle releasing when properties weren't involved.

Properties simplify the matter (the rule is auto-/release happens in -dealloc and setters), but sometimes properties aren't a viable option (e.g. not everyone uses ObjC 2.0).
This is a misunderstanding of the history of properties. While properties are new, accessors have always been a key part of ObjC. Properties just made it easier to write accessors. If you always use accessors, and you should, then most of these questions go away.
Before we had properties, we used Xcode's built-in accessor-writer (in the Script>Code menu) or useful tools like Accessorizer to simplify the job (Accessorizer still simplifies property code). Or we just typed a lot of getters and setters by hand.

The question isn't where it should happen, it's when.
Release or autorelease an object if you have created it with +alloc, +new or -copy, or if you have sent it a -retain message.
Send -release when you don't care if the object continues to exist. Send -autorelease if you want to return it from the method you're in, but you don't care what happens to it after that.

I wouldn't say that dealloc is where you would call autorelease. And unless your object, whatever it may be, is linked to the life of a class, it doesn't necessarily need to be kept around for a retain in dealloc.
Here are my rules of thumb. You may do things in other ways.
I use release if the life of the object I am using is limited to the routine I am in now. Thus the object gets created and released in that routine. This is also the preferred way if I am creating a lot of objects in a routine, such as in a loop, and I might want to release each object before the next one is created in the loop.
If the object I created in a method needs to be passed back to the caller, but I assume that the use of the object will be transient and limited to this run of the runloop, I use autorelease. Here, I am trying to mimic many of Apple's convenience routines. (Want a quick string to use for a short period? Here you go, don't worry about owning it and it will get disposed appropriately.)
If I believe the object is to be kept on a semi-permanent basis (like longer than this run of the runloop), I use create/new/copy in my method name so the caller knows that they are the owner of the object and will have to release the object.
Any objects that are created by a class and kept as a property with retain (whether through the property declaration or not), I release those in dealloc (or in viewDidUnload as appropriate).
Try not to let all this memory management overwhelm you. It is a lot easier than it sounds, and looking at a bunch of Apple's samples, and writing your own (and suffering bugs) will make you understand it better.
