Objective-C: Why use non-NSMutable objects?

Why would anyone use the non-NSMutable equivalents of the data structures in Objective-C? Is it for situations where you need a constant object that must not be modified? Does using the non-NSMutable classes improve performance in any way? Are there any other situations?

The two main reasons off the top of my head:
An object returning a property can be certain nobody will alter it if it's immutable. The object can therefore return the original instead of making copies all the time. So it's a memory and performance benefit.
When writing your own immutable objects, it's very easy to be thread safe. That naturally flows into being able to write multi-threaded functional-style code which is reasonably efficient and error free.
You also tend to see arguments in favour of the inherent preservation of the original value being useful, especially in terms of semantics and design patterns.
Immutable classes don't tend to be much more efficient in and of themselves with one exception — if you take an immutable copy of a mutable array, for example, then it's clear exactly how much storage is needed and exactly that much can be allocated. Because memory allocation costs time, mutable collections tend to keep some spare storage around because they can't predict how they're going to grow.
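The first point (an owner can hand out an immutable object without copying it) can be sketched in Python rather than Objective-C, with a tuple standing in for NSArray and a list for NSMutableArray. The `Playlist` class and its attribute names are hypothetical and exist only for illustration.

```python
class Playlist:
    """Hypothetical owner of a collection that exposes it to callers."""

    def __init__(self, tracks):
        self._frozen = tuple(tracks)   # immutable, playing the role of NSArray
        self._mutable = list(tracks)   # mutable, playing the role of NSMutableArray

    @property
    def tracks(self):
        # Safe to hand out the original: callers cannot change a tuple.
        return self._frozen

    @property
    def tracks_defensive(self):
        # A mutable list has to be copied on every access, otherwise
        # callers could alter the owner's internal state.
        return list(self._mutable)


p = Playlist(["intro", "verse", "outro"])
same = p.tracks is p.tracks                        # no copy is made
copied = p.tracks_defensive is p.tracks_defensive  # a fresh copy each time
```

Here `same` is true and `copied` is false, which is the memory and performance difference described above.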

const is not directly related to non-mutable objects; I'm more familiar with the latter, so that's what I'll talk about.
A non-mutable object is like a reservation. Imagine that you work at a busy restaurant that only works on a reservation basis—all guests must make a reservation. When someone calls and makes a reservation for eight people at six, you know that you'll be expecting eight people at 6. Of course, this keeps things predictable. You know to set out one table that can sit eight people (it wouldn't make sense to use more than one table, especially at a busy restaurant). You notify the kitchen and tell them to expect eight orders a few minutes after six (okay, maybe you won't, but you might as well). In this way, everything runs smoothly and there are no delays. When the party of eight arrives promptly at six (because everyone in this world is perfectly punctual), you lead them right over to their seats, they order, and enjoy their meal. No problems whatsoever.
A problem arises if the reservation never specifies the number of people or the time. Imagine someone calls and tells you to expect a group of people for dinner. In this case, you have no information. A group could be a couple on a date, a four-person family, or two dozen people for a corporate function. They might arrive late because they were at a movie, really early because they have a young child, or at different times because it was impossible to coordinate everyone. In this case, you would have to scramble to find seating for everyone and the kitchen might suddenly be swamped with a large number of orders. Or you could have blocked off too many seats and the kitchen might find itself with nothing to do. In either case, whether you over-estimate or under-estimate, there are delays and lost potential. Anything could happen.
In this metaphor, the restaurant would be the runtime system, and the reservations are the objects. In the first scenario, you have a non-mutable object, like an NSArray. The system knows how much data it'll hold, how many elements there are, and by runtime, what type they are. The system knows that the size won't change, so it can optimize RAM to go around that array, without leaving any precautionary bits. Everything runs smoothly because everything is known.
By contrast, nothing is known with an NSMutableArray. The user might add more elements, so the system has to scramble to find more RAM rather than using those same clock cycles to crunch some operation; the user might replace an element in the middle with a larger one, which means offsetting all the later elements, i.e. copying all the elements after it. In certain cases, it could involve copying all the elements of the array or string or whatever to a new location, a (potentially) expensive operation. This can impart a significant performance overhead, especially when you use a lot of them. In Java, for example, concatenating a string involves copying the entire existing string to a new memory location, and leaving the garbage collector to deal with the old string.
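The "spare storage" behaviour of a growing collection can be observed in Python, used here only as a convenient stand-in for NSMutableArray. The sketch relies on `sys.getsizeof`, and the exact growth pattern is a CPython implementation detail, so the numbers should be treated as illustrative.

```python
import sys

growing = []
sizes = []
for i in range(1000):
    growing.append(i)
    sizes.append(sys.getsizeof(growing))

# The reported size changes only occasionally: most appends land in
# spare capacity that an earlier reallocation reserved in advance.
reallocations = len(set(sizes))

# Built once, when the final length is already known.
frozen = tuple(growing)
```

`reallocations` ends up far smaller than the 1000 appends, which is the over-allocation trade-off described above: time is saved on appends at the cost of holding unused memory.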
Another compelling reason is that you make it a bit harder to change the data. Users (of the class) have to explicitly make a mutable copy, which helps to ensure that they know what they're doing. This advantage is particularly notable with multiple threads—you don't want to pass a mutable object to something that's running on a background thread, because the foreground thread (or any other) could then be modifying the object, as it's being modified by the original thread, leading to very interesting results.

Related

Vulkan: Any downsides to creating pipelines, command-buffers, etc one at a time?

Some Vulkan objects (eg vkPipelines, vkCommandBuffers) are able to be created/allocated in arrays (using size + pointer parameters). At a glance, this appears to be done to make it easier to code using common usage patterns. But in some cases (eg: when creating a C++ RAII wrapper), it's nicer to create them one at a time. It is, of course, simple to achieve this.
However, I'm wondering whether there are any significant downsides to doing this?
(I guess this may vary depending on the actual object type being created - but I didn't think it'd be a good idea to ask the same question for each object)
Assume that, in both cases, objects are likely to be created in a first-created-last-destroyed manner, and that - while the objects are individually created and destroyed - this will likely happen in a loop.
Also note:
vkCommandBuffers are also deallocated in arrays.
vkPipelines are destroyed individually.
Are there any reasons I should modify my RAII wrapper to allow for array-based creation/destruction? For example, will it save memory (significantly)? Will single-creation reduce performance?
Remember that vkPipeline creation does not require external synchronization. That means the implementation is going to handle its own mutexes and so forth internally. As such, it makes sense to avoid locking those internal mutexes whenever possible.
Also, pipeline creation is slow. So being able to batch it up and run it on another thread is very useful.
Command buffer creation doesn't have either of these concerns. So there, you should feel free to allocate whatever CBs you need. However, multiple creation will never harm performance, and it may help it. So there's no reason to avoid it.
Vulkan is an API designed around modern graphics hardware. If you know you want to create a certain number of objects up front, you should use the batch functions if they exist, as the driver may be able to optimize creation/allocation, resulting in potentially better performance.
There may (or may not) be better performance (depending on driver and the type of your workload). But there is obviously potential for better performance.
If you create one or ten command buffers in your application, then it does not matter.
In most cases the difference will be less than 5%. So if you do not care about that (e.g. your application already runs at 500 FPS), then it does not matter.
Then again, C++ is a versatile language. I think this is a non-problem. You would simply have a static member function or a class that would construct/initialize N objects (there's probably a pattern name for that).
The destruction may be trickier. You can again have a static member function that would destroy N objects. But it would not be called automatically, and it is annoying to have null/husk objects around. And the destructor would still be called on VK_NULL_HANDLE. There is also the problem that a pool reset or destruction would invalidate all the command buffer C++ objects, so there's probably no way to do it cleanly/simply.
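The "static member function that constructs N objects" idea can be sketched in a language-neutral way. The Python below makes no real Vulkan calls: `FakeDevice`, `CommandBuffer` and `create_many` are hypothetical names, and the device only counts how many allocation calls it receives.

```python
class FakeDevice:
    """Stand-in for a driver; counts the allocation calls it receives."""

    def __init__(self):
        self.calls = 0
        self._next = 1

    def allocate(self, count):
        # One call hands back `count` handles, like a size + pointer API.
        self.calls += 1
        handles = list(range(self._next, self._next + count))
        self._next += count
        return handles


class CommandBuffer:
    """Hypothetical wrapper that owns exactly one handle."""

    def __init__(self, device, handle=None):
        self.device = device
        self.handle = handle if handle is not None else device.allocate(1)[0]

    @classmethod
    def create_many(cls, device, count):
        # Batch path: a single driver call, then one wrapper per handle.
        return [cls(device, h) for h in device.allocate(count)]


device = FakeDevice()
one_by_one = [CommandBuffer(device) for _ in range(8)]
calls_single = device.calls

batched = CommandBuffer.create_many(device, 8)
calls_batched = device.calls - calls_single
```

Each wrapper still owns a single handle, so the one-object-per-wrapper design is kept while the driver sees one call instead of eight. Batched destruction is not covered by this sketch, for the reasons given above.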

Stamping / Tagging / Branding Object Instances

I have a routine which accepts an object and does some processing on it. The objects may or may not be mutable.
void CommandProcessor(ICommand command) {
// do a lot of things
}
There is a chance that the same command instance loops back into the processor, and things turn nasty when that happens. I want to detect these return visitors and prevent them from being processed. The question is how to do that transparently, i.e. without disturbing the objects themselves.
Here is what I tried:
Added a property Boolean Visited { get; set; } to ICommand.
I don't like this because the logic of one module shows up in another. The ShutdownCommand is concerned with shutting down, not with the bookkeeping. Also, an EatIceCreamCommand may always return False in the hope of getting more. Some non-mutable objects have outright problems with a setter.
Privately maintain a lookup table of all processed instances. When an object comes in, first check it against the table.
I don't like this either. (1) Performance: the lookup table grows large, and we need to do a linear search to match instances. (2) I can't rely on the hash code: the object may forge a different hash code from time to time. (3) Keeping the objects in a list prevents them from being garbage collected.
I need a way to put some invisible marker on the instance (of ICommand) that only my code can see. Currently I don't discriminate between the invocations and just pray that the same instances don't come back. Does anyone have a better idea for implementing this functionality?
Assuming you can't stop this from happening just logically (try to cut out the loop) I would go for a HashSet of commands that you've already seen.
Even if the objects are violating the contracts of HashCode and Equals (which I would view as a problem to start with) you can create your own IEqualityComparer<ICommand> which uses System.Runtime.CompilerServices.RuntimeHelpers.GetHashCode to call Object.GetHashCode non-virtually. The Equals method would just test for reference identity. So your pool would contain distinct instances without caring whether or how the commands override Equals and GetHashCode.
That just leaves the problem of accumulating garbage. Assuming you don't have the option of purging the pool periodically, you could use WeakReference<T> (or the non-generic WeakReference class for .NET 4) to avoid retaining objects. You would then find all "dead" weak references every so often to prevent even accumulating those. (Your comparer would actually be an IEqualityComparer<WeakReference<T>> in this case, comparing the targets of the weak references for identity.)
It's not particularly elegant, but I'd argue that's inherent in the design - you need processing a command to change state somewhere, and an immutable object can't change state by definition, so you need the state outside the command. A hash set seems a fairly reasonable approach for that, and hopefully I've made it clear how you can avoid all three of the problems you mentioned.
EDIT: One thing I hadn't considered is that using WeakReference<T> makes it hard to remove entries - when the original value is garbage collected, you're not going to be able to find its hash code any more. You may well need to just create a new HashSet with the still-alive entries. Or use your own LRU cache, as mentioned in comments.
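The same idea (track instances by reference identity, ignore the object's own equality and hash code, and hold only weak references) can be sketched in Python. This is an analogue of the approach, not the .NET API: `SeenTracker` and `Command` are hypothetical names, and Python's `id()` plus a `weakref` callback stand in for the identity comparer and the periodic purge of dead references.

```python
import gc
import weakref


class SeenTracker:
    """Remembers instances by identity without keeping them alive."""

    def __init__(self):
        self._seen = {}  # id(obj) -> weak reference to obj

    def first_visit(self, obj):
        key = id(obj)
        ref = self._seen.get(key)
        if ref is not None and ref() is obj:
            return False  # this exact live instance has been here before

        seen = self._seen

        def forget(dead_ref):
            # Runs when obj is collected; drop the stale entry.
            if seen.get(key) is dead_ref:
                del seen[key]

        seen[key] = weakref.ref(obj, forget)
        return True

    def __len__(self):
        return len(self._seen)


class Command:
    """Deliberately unhelpful: every instance claims to equal every other."""

    def __eq__(self, other):
        return True

    def __hash__(self):
        return 0


tracker = SeenTracker()
a, b = Command(), Command()
results = [tracker.first_visit(a), tracker.first_visit(b), tracker.first_visit(a)]

del b
gc.collect()  # the entry for b disappears once b has been collected
```

Because the table is keyed on identity rather than on the object's own hash code, the removal problem mentioned in the edit above does not arise in this variant: the callback still knows the key after the object has died.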

Slow Deletion of Handle Object in MATLAB

I used MATLAB to write a simulation engine for simulating product flows in a production environment. I derived all the classes I use from handle and used these handles (quite excessively, I guess) to link between e.g. products and work systems, orders, etc.
Now, to run multiple instances of my model, I create a simulation object that contains all other objects and their relations, run the model and free the simulation variable.
Creating and running the model takes ~50 seconds (this includes the generation of all objects, their relations and of course the calculation over the course of the simulation run). Freeing the variable before the next run currently takes ~3-4 minutes!
I tried clear, delete and plain overwriting of the old simulation object, without noticing significant differences in performance.
Is there a way to improve the performance without rewriting the code?
It is hard to say anything particular about your code without seeing it, or at least some high level design.
A short piece of advice before optimizing the OO aspects:
Are you sure that the bottleneck is in the handling of the objects? Verify it with the profiler.
If the OO is indeed the bottleneck, here are some guesses:
You have used circular references. MATLAB does not use a garbage collector, but rather a smart reference-counting mechanism, which can be quite slow in this case. Change the references between the objects to be tree-like instead.
You have created an enormous number of objects. MATLAB has a significant overhead for each object, much more than traditional languages (C++, Java). Redesign the system to have a smaller number of objects.
Do you happen to use cell arrays to store other handle objects from within a handle object? This can cause serious slowdowns prior to Matlab R2011A. See http://www.mathworks.com/support/solutions/en/data/1-6VVMS0/index.html?product=ML
A workaround is to use a temp local variable to manipulate cell array, then assign this tmp variable back to your handle object property. I saw ~ 100X improvement in performance after doing this in one case.

Using OOP results in heavy objects? Will they be slow?

I'm using OOP to write small games with different types of characters (e.g. platformers, shooters) that do different types of things. I typically try to spread out functionality into easily manageable, simple classes (e.g. an Environment class would perform common physics calculations for all its Inhabitants, so they don't need to worry about that). But, it seems that the more I refactor these programs to align with OOP principles, the heavier my character objects get. Since they're the ones with the important data, they use their own data to perform functions on themselves. This keeps them decoupled from things outside of their realm, but makes their classes seem to grow and grow. I'm totally comfortable with breaking these character classes down into more manageable components, but I worry that having many objects onscreen that are instantiated from classes with a lot of methods will result in a slow-running game.
1) Do the number of instance methods on an object directly impact its runtime performance?
2) Am I using OOP correctly if I end up with heavy character objects?
1) No. Or at least mostly no, anyway.
2) Maybe, but probably not.
For a character-based game, it's perfectly reasonable that a character would have a lot of associated data. Efficiency is rarely affected by representing that as a single "flat" collection of primitive objects, or a tree-like collection of a few large objects, each of which (recursively) has a number of smaller constituent parts.
As far as number of methods affecting performance: the number of methods can affect cache utilization, especially if you have (for example) lots of extremely small methods, and heavily-used methods are more or less interleaved with less used ones, so you end up with a lot of cache space devoted to less-used methods because they happen to be in the same cache line with something that's used more. Being methods affects this primarily because a compiler will typically arrange methods of the same class close to each other in memory, so sharing cache lines becomes more common. At least with typical implementations, however, calling a method will be O(1), so the number of methods doesn't directly affect speed.
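The claim that the number of methods does not weigh on each instance can be checked in a small sketch. Python is used purely as an example language here (the question does not say which one is in use), and the `Light`/`Heavy` classes and their `action_N` methods are hypothetical.

```python
class Light:
    def __init__(self):
        self.hp = 100

    def update(self):
        return self.hp


# Build a subclass with 200 extra do-nothing methods.
extra = {f"action_{i}": (lambda self: None) for i in range(200)}
Heavy = type("Heavy", (Light,), extra)

light, heavy = Light(), Heavy()

# Per-instance state is identical: only data attributes live on the object.
light_state = vars(light)
heavy_state = vars(heavy)

# The methods are stored once, on the class, and are found by a name lookup.
methods_on_class = sum(1 for name in vars(Heavy) if name.startswith("action_"))
```

Both instances carry only their `hp` attribute, while the 200 methods exist once on `Heavy` itself, so creating many "heavy" characters does not duplicate the methods.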
No, it's not what methods you have in an object, but what you do with them that increases runtime cost. Of course there is a limit to this, but with current hardware you can completely forget about it. However, it is often questionable to go beyond a dozen or two members in a class from a design standpoint. Splitting your objects up doesn't need to incur any significant cost: you can inline all your getters and setters, and pass values by pointers and references. The compiler can flatten all your design decisions out, and mostly the code from a "heavy" class is equivalent to code from a constellation of small classes.
"Correctly" in this context is entirely dependent on the taste of the people developing the code. The processor doesn't care about what software engineering design decisions you make. If you want to make your objects all-encompassing and it feels right to you, then do it. There might come a point where things don't feel "right" to you; at that point you might split things up.

Does the logic behind objects matter?

When working on the early stages of a console-based Python remake of the classic game 'Snake', someone submitted a patch to spawn food at random locations. The code defined a Food class which worked fine, but the logic behind it seemed a little weird.
I think we should delete the food once it's been consumed, then create another one. However, this person simply moves the food to a new random location once it's been consumed. While the latter seems illogical to me, it seems to do the exact same thing, maybe even more efficiently.
My question is: Would it be better to use the former logic, or the latter, or am I simply nit-picking over nothing?
This all started at: https://bugs.launchpad.net/snakes-game/+bug/628180
Either is fine - within certain common-sense boundaries.
The latter approach will save re-allocating the object, so recycling it in this way will be more efficient - the gain is likely to be irrelevant in your particular example though unless heap fragmentation is a concern (e.g. on an embedded app with very limited RAM).
The danger with recycling is that the object may retain some vestige of its former state, so may not behave in the same manner as a new object would - in your case the logic is simple, so there is little danger, but with more complex objects this could become significant.
So in general I'd suggest the "create a new object" approach (it follows the principle of "least surprise", and will be less likely to confuse other programmers who come to work on the code) unless there are performance implications (e.g. on an embedded application like a phone where you have very limited resources and don't want a fragmented heap), in which case the "re-use an existing object" may be a smart solution.
I believe both solutions are fine. Relocating the food to another location is probably less error-prone in memory management terms, but due to garbage collection, you shouldn't care about that too much.
I'd argue, while instantiating a new food object is more logical, and closer to the real life model, relocating is more efficient.
The main issue as far as OOP is concerned isn't so much whether the food re-instantiates vs. relocates, but rather that this behavior remain transparent outside of the object. The game engine should be telling the object "you've been eaten" and such, but how the object handles that internally shouldn't be known to the game engine. If, internally, the object maintains a singleton of "food" and the "consume" method simply re-forms the food object with new values, that's fine. That's all internal to the implementation of "food" and just shouldn't be known outside of that class.
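A minimal sketch of that encapsulation, assuming a rectangular grid and a `consume()` method as the only thing the engine calls. The `Food` class below is hypothetical and is not the code from the linked patch. Routing both construction and respawning through one `_reset()` method is one way to avoid the stale-state risk mentioned earlier.

```python
import random


class Food:
    """Hypothetical food item for a grid of width x height cells."""

    def __init__(self, width, height, rng=None):
        self._width = width
        self._height = height
        self._rng = rng or random.Random()
        self._reset()

    def _reset(self):
        # Single place that establishes a "fresh" state, shared by
        # construction and respawning, so no stale state survives.
        self.position = (
            self._rng.randrange(self._width),
            self._rng.randrange(self._height),
        )
        self.age = 0

    def tick(self):
        self.age += 1

    def consume(self):
        # The engine only says "you've been eaten"; relocating instead of
        # re-instantiating is a private decision of this class.
        self._reset()


food = Food(20, 10, random.Random(42))
food.tick()
food.consume()
```

If the class later switches to creating a new object on each `consume()`, the engine code calling it does not need to change.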