Implement a thread-safe invalidate-able cache with lock-free reads? - objective-c

I am trying to reason out in my head how to implement a thread-safe caching mechanism for a reference counted value with an API roughly like this: (Note: I'm using Objective-C syntax, but the problem is not language-specific)
typedef id (^InvalidatingLazyGenerator)();
@interface InvalidatingLazyObject : NSObject
- (id)initWithGenerator:(InvalidatingLazyGenerator)generator;
@property (readonly) id value;
- (void)invalidate;
@end
When someone requests -value, if it has an existing cached value, it should return a -retain/-autoreleased version of that value. If it doesn't have a value, or if the value isn't valid, it should generate one using a generation block passed in at init time, then it should cache that value for any future reads until someone calls -invalidate.
Let's assume we don't care if the generator block is called multiple times (i.e. a second reader arrives while the first reader is in the generator block), as long as the objects it returns aren't leaked when that happens. A first pass, non-wait-free implementation of this might look something like:
- (id)value
{
    id retVal = nil;
    @synchronized(self)
    {
        retVal = [mValue retain];
    }
    if (!retVal)
    {
        retVal = [[mGenerator() retain] retain]; // Once for the ivar and once for the return value
        id oldVal = nil;
        @synchronized(self)
        {
            oldVal = mValue;
            mValue = retVal;
        }
        [oldVal release];
    }
    return [retVal autorelease];
}
- (void)invalidate
{
    id val = nil;
    @synchronized(self)
    {
        val = mValue;
        mValue = nil;
    }
    [val release];
}
Naturally, this results in crappy read performance because concurrent reads are serialized by the lock. A reader/writer lock improves things from this, but is still quite slow in the read path. The performance goal here is for cached reads to be as fast as possible (hopefully lock-free). It's OK for reads to be slow if we have to calculate a new value, and it's OK for -invalidate to be slow.
So... I am trying to figure out a way to make reads lock/wait-free. My first (flawed - see below) thought involved adding an invalidation counter whose value is atomically, monotonically incremented and read using memory barriers. It looked like this:
- (id)value
{
    // I think we don't need a memory barrier before this first read, because
    // a stale read of the count can only cause us to generate a value unnecessarily,
    // but should never cause us to return a stale value.
    const int64_t startCount = mWriteCount;
    id retVal = [mValue retain];
    OSMemoryBarrier(); // But we definitely want a "fresh" read here.
    const int64_t endCount = mWriteCount;
    if (retVal && startCount == endCount)
    {
        return [retVal autorelease];
    }
    // Now we're in the slow path
    retVal = [mGenerator() retain]; // we assume generator has given us an autoreleased object
    @synchronized(self)
    {
        mValue = retVal;
        OSAtomicIncrement64Barrier(&mWriteCount);
    }
    return retVal;
}
- (void)invalidate
{
    id value = nil;
    @synchronized(self)
    {
        value = mValue;
        mValue = nil;
        OSAtomicIncrement64Barrier(&mWriteCount);
    }
    [value release];
}
But I can already see problems here. For instance, the [mValue retain] in the read path: we need the value to be retained, but it's possible that in the time between the read of mValue and the call to -retain another thread could -invalidate, causing the value to have been dealloc'ed by the time the retain call is made. So this approach won't work as is. There may be other problems too.
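For contrast, the counter idea does hold up when the cached payload is plain data that can be copied out, rather than a pointer that must be retained. The following is a minimal C11 sketch of that pattern (often called a sequence lock). The SeqCache type and its functions are hypothetical names made up for illustration, it assumes writers are already serialized by some lock, and it does not solve the retain race described above:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical cache of two plain integers; all names are illustrative. */
typedef struct {
    _Atomic uint64_t seq; /* even: stable, odd: a write is in progress */
    _Atomic int64_t  lo;
    _Atomic int64_t  hi;
} SeqCache;

static void seq_cache_init(SeqCache *c) {
    atomic_init(&c->seq, 0);
    atomic_init(&c->lo, 0);
    atomic_init(&c->hi, 0);
}

/* Assumption: the caller serializes writers (for example with a mutex). */
static void seq_cache_write(SeqCache *c, int64_t lo, int64_t hi) {
    atomic_fetch_add(&c->seq, 1); /* now odd: concurrent readers will retry */
    atomic_store(&c->lo, lo);
    atomic_store(&c->hi, hi);
    atomic_fetch_add(&c->seq, 1); /* now even: the snapshot is stable again */
}

/* Readers take no lock; they copy the fields and retry if a write overlapped. */
static void seq_cache_read(SeqCache *c, int64_t *lo, int64_t *hi) {
    uint64_t before, after;
    do {
        before = atomic_load(&c->seq);
        *lo = atomic_load(&c->lo);
        *hi = atomic_load(&c->hi);
        after = atomic_load(&c->seq);
    } while ((before & 1u) != 0 || before != after);
}
```

This works because a torn copy of integers is harmless and simply discarded on retry. Sending -retain to an object that may already be deallocated is not harmless, which is why the same trick fails for a reference-counted value.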
Has anyone already worked something like this out and care to share? Or have a pointer to something similar in the wild?

I ended up taking this approach to the problem: Read-Copy-Update. It seems to work quite well -- the read performance is 50x faster than the lock-based approach (but that's hardly surprising.)

Related

Lazy loading Objective-C class with int properties

I use the following as a getter for a property in one of my classes:
- (NSString *)version
{
    if (_version == nil) {
        _version = [[[NSBundle mainBundle] infoDictionary] objectForKey:@"CFBundleVersion"];
    }
    return _version;
}
This works well. However, when I try the same for an int property I obviously get an error since int are never nil. What is the best way around this?
- (int)numberOfDays
{
    if (_numberOfDays == nil) {
        // relatively memory intense calculation that works out numberOfDays:
        _numberOfDays = X;
    }
    return _numberOfDays;
}
First, prefer NSInteger over int in Objective-C where you can. The size of NSInteger is determined at compile time by the architecture being built for (32 bits on 32-bit targets, 64 bits on 64-bit targets), whereas int has a fixed size that does not widen on 64-bit architectures. It's OK to use int, just be aware of the difference.
Using NSInteger, you still face the same problem: it can't be nil. You could therefore make your property an NSNumber, which you can initialize with the result of your computation using [NSNumber numberWithInteger:anInteger]. That way you can keep your nil check on the property and only do the computation once to create the NSNumber.
Add another boolean instance variable _numberOfDaysCalculated.
A thread-safe version would be
- (int)numberOfDays
{
    @synchronized(self) {
        if (!_numberOfDaysCalculated) {
            // relatively memory intense calculation that works out numberOfDays:
            _numberOfDays = X;
            _numberOfDaysCalculated = YES;
        }
    }
    return _numberOfDays;
}
Alternatively, if there is some "invalid" value of the property, you can use that
as a "not yet computed" marker. For example, if the computed value of numberOfDays has to be non-negative, you could initialize _numberOfDays = -1 in the init method,
and then test for if (_numberOfDays == -1) in the lazy getter method.
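The two variants above (a separate "calculated" flag versus an out-of-range marker value) can be sketched side by side in plain C. This is only an illustration: the struct names, expensive_number_of_days() and the value 365 are made up, and neither variant here is thread-safe without the lock shown earlier.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the expensive calculation; counts its calls. */
static int g_calls = 0;
static int expensive_number_of_days(void) {
    g_calls++;
    return 365;
}

/* Variant 1: a separate boolean records whether the value was computed. */
typedef struct {
    int  days;
    bool days_ready;
} FlagLazy;

static int flag_lazy_days(FlagLazy *s) {
    if (!s->days_ready) {
        s->days = expensive_number_of_days();
        s->days_ready = true;
    }
    return s->days;
}

/* Variant 2: -1 is outside the valid range and marks "not yet computed". */
typedef struct {
    int days;
} SentinelLazy;

static void sentinel_lazy_init(SentinelLazy *s) {
    s->days = -1;
}

static int sentinel_lazy_days(SentinelLazy *s) {
    if (s->days == -1) {
        s->days = expensive_number_of_days();
    }
    return s->days;
}
```

The flag costs an extra field but works for any value range. The sentinel only works when some value can never be a legitimate result.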
Use GCD.
static dispatch_once_t tok;
dispatch_once(&tok, ^{ memory_intensive_computation(); });
No, don't use GCD, I missed the point. In an instance method, you want to tie information to each instance, so using a static dispatch token is not appropriate. Maybe you should just stick with the "boolean flag as instance variable" approach.
Alternatively, you can initialize the int to a value which is known to be out of its valid range (for example, I suppose that numberOfDays can never be negative) and use that as a condition for performing the calculation.
Use a NSNumber to store the int value.
- (int)numberOfDays
{
    if (_numberOfDays == nil) {
        // relatively memory intense calculation that works out numberOfDays:
        _numberOfDays = @(X);
    }
    return [_numberOfDays intValue];
}
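I would initialize _numberOfDays in -init with NSNotFound and test for that in the getter. Note that NSNotFound is defined as NSIntegerMax, so the variable should be declared as NSInteger rather than int for the comparison to work on 64-bit builds.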

Using @synchronized, volatile and OSMemoryBarrier() all together. Does one imply the other?

Coming from Java, I'm trying to learn thread safety in Objective-C. So far I've learned that
@synchronized blocks prevent concurrent access to the same block of code
volatile variables ensure visibility of changes across threads
OSMemoryBarrier(); ensures proper ordering of access
My question is: Does one of those imply one or more of the others? If I want all three, do I need to use all three techniques?
Example:
volatile int first = 0;
volatile int second = 0;
[...]
@synchronized(self) {
    OSMemoryBarrier();
    first++;
    OSMemoryBarrier();
    second++;
    OSMemoryBarrier();
}
In Java all three are assured when entering and leaving a synchronized block and when reading or writing a volatile variable. True?
The @synchronized directive gets converted as follows...
- (NSString *)myString {
    @synchronized(self) {
        return [[myString retain] autorelease];
    }
}
becomes...
- (NSString *)myString {
    NSString *retval = nil;
    pthread_mutex_t *self_mutex = LOOK_UP_MUTEX(self);
    pthread_mutex_lock(self_mutex);
    retval = [[myString retain] autorelease];
    pthread_mutex_unlock(self_mutex);
    return retval;
}
@synchronized doesn't protect a block of code from being reentered - it prevents executing any code that also uses @synchronized with the same object. So if you have two methods
- (void)method1 {
    @synchronized (self) { dothis (); }
}
- (void)method2 {
    @synchronized (self) { dothat (); }
}
and two different threads call method1 and method2 for the same object, then dothis() and dothat() will be called one after the other. Of course that's also true if two different threads call method1 for the same object. @synchronized doesn't stop you from entering a block on the same thread though, so in the example above dothis() could call [self method2] and it wouldn't be blocked.
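The same-thread re-entry described above is the behaviour of a recursive mutex. As a rough C analogy (not the actual runtime implementation), the sketch below uses a pthread mutex of type PTHREAD_MUTEX_RECURSIVE. The names g_lock, lock_init, method1, method2 and depth_reached are illustrative only:

```c
#define _XOPEN_SOURCE 700 /* exposes PTHREAD_MUTEX_RECURSIVE under strict modes */
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t g_lock;
static int depth_reached = 0;

/* Create a recursive mutex; returns 0 on success. */
static int lock_init(void) {
    pthread_mutexattr_t attr;
    if (pthread_mutexattr_init(&attr) != 0) return -1;
    if (pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE) != 0) {
        pthread_mutexattr_destroy(&attr);
        return -1;
    }
    int rc = pthread_mutex_init(&g_lock, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

static void method2(void) {
    /* Second acquisition by the thread that already owns the lock:
       a recursive mutex lets it through instead of deadlocking. */
    pthread_mutex_lock(&g_lock);
    depth_reached = 2;
    pthread_mutex_unlock(&g_lock);
}

static void method1(void) {
    pthread_mutex_lock(&g_lock);
    depth_reached = 1;
    method2(); /* nested call while still holding the lock */
    pthread_mutex_unlock(&g_lock);
}
```

Calling lock_init() and then method1() from a single thread completes with depth_reached set to 2. With a default, non-recursive mutex the nested lock could deadlock instead.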
If you are using volatile or OSMemoryBarrier() then I suggest that your design is much, much, much too complicated and you will run into trouble sooner or later.

Implementing NSFastEnumerator: EXC_BAD_ACCESS when iterating with for…in

I have a data structure that I wanted to enumerate. I tried to implement my object's NSFastEnumerator as follows:
- (NSUInteger)countByEnumeratingWithState:(NSFastEnumerationState *)state
                                  objects:(__unsafe_unretained id [])buffer
                                    count:(NSUInteger)len {
    NSUInteger c = 0;
    while (c < len) {
        id obj = [self objectAtIndex:state->state];
        if (obj == nil) break;
        buffer[c] = obj;
        c++;
        state->state++;
    }
    state->itemsPtr = buffer;
    state->mutationsPtr = nil;
    return c;
}
If I use objectAtIndex directly, my object works properly. I get a nil when the index doesn't exist. But when I then use the for loop:
for (Pin *pin in coll) { ... }
the code runs through the above function fine and fills in state with what appears to be valid values and returns the number of objects, then I get an EXC_BAD_ACCESS failure at the for statement itself.
What am I doing wrong in this implementation?
I just had a similar issue, and after looking more closely at Apple's FastEnumerationSample, this part (which I had overlooked) jumped out at me:
// We are not tracking mutations, so we'll set state->mutationsPtr to point into one of our extra values,
// since these values are not otherwise used by the protocol.
// If your class was mutable, you may choose to use an internal variable that is updated when the class is mutated.
// state->mutationsPtr MUST NOT be NULL.
state->mutationsPtr = &state->extra[0];
The important part being: state->mutationsPtr MUST NOT be NULL. I just used the example line provided and it worked like a charm!
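To see why a NULL mutationsPtr crashes at the for statement, it helps to model what the loop does with the state. As far as I understand it, the compiler-generated loop reads through mutationsPtr to detect mutation during enumeration. The plain C sketch below is a simplified model only: EnumState mimics the field layout of NSFastEnumerationState, while g_items, count_by_enumerating and sum_all are made-up names, and ints stand in for objects.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for NSFastEnumerationState. */
typedef struct {
    unsigned long  state;
    int           *itemsPtr;     /* ints stand in for object pointers */
    unsigned long *mutationsPtr;
    unsigned long  extra[5];
} EnumState;

#define ITEM_COUNT 5
static const int g_items[ITEM_COUNT] = { 10, 20, 30, 40, 50 };

/* Fills the caller's buffer with the next batch; returns the batch size. */
static unsigned long count_by_enumerating(EnumState *st, int *buffer,
                                          unsigned long len) {
    unsigned long c = 0;
    while (c < len && st->state < ITEM_COUNT) {
        buffer[c++] = g_items[st->state++];
    }
    st->itemsPtr = buffer;
    st->mutationsPtr = &st->extra[0]; /* must point at readable memory */
    return c;
}

/* Roughly what a for...in loop does with the state it gets back. */
static long sum_all(void) {
    EnumState st = { 0 };
    int buffer[2];
    long sum = 0;
    unsigned long n = count_by_enumerating(&st, buffer, 2);
    if (n == 0) return 0;
    unsigned long start = *st.mutationsPtr; /* a NULL pointer would crash here */
    do {
        for (unsigned long i = 0; i < n; i++) {
            if (*st.mutationsPtr != start) return -1; /* mutated while enumerating */
            sum += st.itemsPtr[i];
        }
    } while ((n = count_by_enumerating(&st, buffer, 2)) > 0);
    return sum;
}
```

The dereference happens in the loop machinery rather than in the enumeration method, which matches the symptom of the method returning normally and the crash appearing at the for statement.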
I'm assuming you're using ARC. The problem may be that the buffer is an array of __unsafe_unretained objects, so ARC might be over-releasing them. But what does your objectAtIndex: method look like? This shouldn't be a problem if you are returning objects that are guaranteed to be alive at least as long as your object itself.
Instead of:
id obj = [self objectAtIndex:state->state];
use
__unsafe_unretained id obj = [self objectAtIndex:state->state];

memory leak when using callback

I'm having an issue with memory management when dealing with callbacks and async code in Objective-C.
I can't seem to find a way to release the instance that the callback is set on.
For example:
MyClass *myArchive = [[MyClass alloc] init];
[myArchive callBack:^(RKObjectLoader *objectLoader, id object) {
    NSLog(@"success");
} fail:^(RKObjectLoader *objectLoader, NSError *error) {
    NSLog(@"failed");
}];
[myArchive searchArchive:words:paging];
The problem is that I don't know when or how to release the myArchive instance. Profiling with Instruments in Xcode, I always get a leak here. The searchArchive method performs an async request to a server using RestKit. I won't reference the instance from within the callback, as I heard this causes a retain cycle, and I have done some reading about using __block and other C approaches to avoid retain cycles. That is all fine, but as it stands now, with no actual code happening within the callback, how do I release the myArchive instance? Is anyone able to explain how I should deal with this in Objective-C?
EDIT:
This is where I set the callback in myclass
// Sets internal callbacks on this object which basically wrap the delegate
//
- (void)callBack:(void (^)(RKObjectLoader *objectLoader, id object))success
            fail:(void (^)(RKObjectLoader *objectLoader, NSError *error))fail {
    // sanity check
    NSAssert(_currentDelegate != self, @"Delegate is another object. Can not set callback");
    // store our callback blocks in the instance
    _success = [success copy];
    _fail = [fail copy];
}
and then release _success and _fail in dealloc
and within the @interface
@interface myClass : NSObject<RKObjectLoaderDelegate> {
    // holds the block callback for "success"
    void (^_success)(RKObjectLoader *objectLoader, id object);
    // holds the block callback for "fail"
    void (^_fail)(RKObjectLoader *objectLoader, NSError *error);
}
I hope this gives more insight into what I'm doing wrong.
EDIT 2:
Ok I'm beginning to see the errors now:
- (void)retrieveGallery {
    // create call back for async and deal with the result
    [_galleryItems callBack:^(RKObjectLoader *objectLoader, NSArray *objects) {
        // success happy days. do a bunch of code here that does not cause leaks
    } fail:^(RKObjectLoader *objectLoader, NSError *error) {
        // retry the attempt to retrieve gallery data from the server
        _retryCount++;
        if (_retryCount < _maxRetryCount) {
            [self retrieveGallery];
        }
    }];
    // read the collection of gallery items from server
    [_galleryItems readGallery];
}
The only actual memory leaks occur when the callback catches a failure for whatever reason and then calls [self retrieveGallery] from within the callback to try again. This is what is causing the leak, so I'm guessing that is a big no-no. How should I retry the method (retrieveGallery in this case)?
Memory management isn't really any different because you are using an asynchronous callback. myArchive should be a property of whatever class you are doing this in. You want it to stick around until the task is complete, right?
@property (retain) MyClass *myArchive;
Then..
myArchive = [[MyClass alloc] init];
void (^on_success_callback)(void) = ^(void){
    NSLog(@"success");
    self.myArchive = nil;
};
You need to make sure you are managing the callbacks properly, i.e. copying them from the stack and releasing them when you are done.
If you have retains and releases in your code you probably aren't using the accessor methods properly.

Pesky leak in setter/getter methods

It seems like I keep asking the same questions, memory related. My current code works exactly as I intend it, but I cannot figure why I am showing a leak here in Instruments.
-(NSDate *)startTimeAndDate {
    NSDate *dateToReturn = nil;
    if (startTimeAndDate != nil) {
        dateToReturn = [startTimeAndDate retain];
    } else { //is currently nil, this will be the initial setting
        //return default time if we have a working date
        if (finishTimeAndDate != nil) {
            dateToReturn = [[self dateFromDate:finishTimeAndDate withNewTime:defaultStartTime] retain];
        } else {
            //return the default time with today's date if we have nothing set as yet
            dateToReturn = [[self dateFromDate:[NSDate date] withNewTime:defaultStartTime] retain];
        }
        //save the initial setting
        self.initialStartDateAndTime = [[dateToReturn copy] autorelease];
    }
    [startTimeAndDate release];
    startTimeAndDate = dateToReturn;
    return startTimeAndDate;
}
-(void)setStartTimeAndDate:(NSDate *)inStartTimeAndDate {
    BOOL initialAssignment = NO;
    if (startTimeAndDate == nil) {
        initialAssignment = YES;
    }
    if (startTimeAndDate != inStartTimeAndDate) { //skip everything if passed object is same as current
        //check that the start time is prior to finish only if finish time has been entered
        NSDate *dateToSetStartTo = nil;
        if (finishTimeAndDate != nil) {
            if ([inStartTimeAndDate earlierDate:finishTimeAndDate] == inStartTimeAndDate) {
                // use the new time, it is earlier than current finish time
                dateToSetStartTo = [inStartTimeAndDate retain];
            } else { //start time is not earlier than finish time
                // the received entry is invalid, set start time to 1 default interval from finish
                dateToSetStartTo = [[finishTimeAndDate dateByAddingTimeInterval:-self.defaultTimeInterval] retain];
            }
        } else { //finish time is nil
            // use the new time without testing, nothing else is set
            dateToSetStartTo = [inStartTimeAndDate retain];
        }
        [startTimeAndDate release];
        startTimeAndDate = dateToSetStartTo;
    }
    if (initialAssignment) {
        self.initialStartDateAndTime = [[self.startTimeAndDate copy] autorelease];
    }
}
So far as I can see, I am balancing all retains with release or autorelease. The leak appears to be caused on the first pass only. I have a view controller, it creates my model (wherein this code lies) and sets a start date, nothing else is done at that point. If I close that view controller at that point, Instruments shows that I am leaving the date object as a leak.
I placed a NSLog to show retain count at dealloc and, sure enough, it has retain count of 2 before my final release is called, leaving a retain count of 1 when it should have been destroyed. It is always the same regardless if I close immediately after initialization or set and get a hundred times. retainCount is 2 prior to my final call to release in dealloc.
I have been looking at this all weekend and cannot figure where I've gone wrong.
To clarify, the initial call is to set the startTimeAndDate property. At that point all other fields are nil or 0 if not objects. That startTimeAndDate object appears to be the leaking object.
Firstly, can you describe the problem you are trying to solve with this code? I ask because it appears very complex and my initial thought is that simplification will not only clarify what you are doing, but is also likely to solve your leak as well.
Secondly (and I may have this wrong), you only need to retain/release objects if you expect those objects to exist beyond the scope of the method, or you expect that they may be released by some code that you are calling in your method. Based on this, you appear to be over-retaining and over-releasing in your code. I think you can remove a lot of it.
Again I may be wrong, but it appears that you will indeed leak. The reason I think so is this - on your first pass you retain some data in dateToReturn which is a local variable. Then you do
self.initialStartDateAndTime = [[dateToReturn copy] autorelease];
But this is not releasing dateToReturn. Instead it is releasing the copy of dateToReturn. dateToReturn is still retained. Presuming that you intend to autorelease the copy because initialStartDateAndTime is set with retain, I think you should be doing:
self.initialStartDateAndTime = [[dateToReturn copy] autorelease];
[dateToReturn release];
Of course, if you remove the extra retain/release's then this becomes simpler again.
The final thing I would suggest is around naming. The problem with code like this is that you have a number of methods and variables, all with very similar names. This can make it difficult to follow and lead to bugs. So ask yourself if you really need this many variables, and whether you can make your code more readable by changing some of the names.
Damn, ignore what I said. I just went through the code again and you're right. I think you are basically being burned by the complexity of the code. I found it quite difficult to follow, especially with the number of properties. I think what I would do at this stage is to copy the code to a unit test and run it from there. Then you can better test and debug it. I would recommend GHUnit if you do not already have unit testing in place.
The other thing that occurs to me is that there may be code executing somewhere else in your program that is retaining the date, thereby triggering the leak. For example, if inStartTimeAndDate is coming in with a retain count of 1, but is not released by the code that called the setter, then you could end up with startTimeAndDate with a retain count of 2.
Having said that, here's my rewrite of the getter in an attempt to clarify what's going on:
-(NSDate *)startTimeAndDate {
    // If we have it, bail out fast.
    if (startTimeAndDate != nil) {
        return startTimeAndDate;
    }
    // Is currently nil, this will be the initial setting
    NSDate *dateToReturn = nil;
    //return default time if we have a working date
    if (finishTimeAndDate != nil) {
        dateToReturn = [self dateFromDate:finishTimeAndDate withNewTime:defaultStartTime];
    } else {
        //return the default time with today's date if we have nothing set as yet
        dateToReturn = [self dateFromDate:[NSDate date] withNewTime:defaultStartTime];
    }
    //save the initial setting
    self.initialStartDateAndTime = [[dateToReturn copy] autorelease];
    startTimeAndDate = [dateToReturn retain];
    return startTimeAndDate;
}
The main reason for this re-write was that it appeared that if there was a startTimeAndDate then the code was doing this:
dateToReturn = [startTimeAndDate retain];
...
[startTimeAndDate release];
startTimeAndDate = dateToReturn;
Which seemed a little pointless because it's effectively doing a retain, a release and a self-assignment. It would work, but there's less chance of a bug if we leave it out.