My application creates an index which is saved in thousands of files on the disk. I want to access the index files with multiple instances of the application. To coordinate access between the different JVMs, I use Java's FileLock to lock the files I am reading.
If I try to acquire a FileLock twice within the same application, I get an OverlappingFileLockException (that is how FileLock is supposed to work). To prevent multiple acquires on the same file within one instance, I keep a map from file name to Semaphore. If the app can acquire the semaphore for the specific file, it can also acquire a FileLock. If not, the file is currently in use by the application.
With a ConcurrentHashMap, it works like a charm. I use computeIfAbsent to add Semaphores to the map if needed.
The problem is: the more files I create, the more semaphores I store in the map. Running the application for several days can make the map grow without bound. To prevent this, I want to remove entries from the map once they are released and no longer in use.
I can't just remove the semaphore like in the 2nd version of the function. If the Semaphore has queued Threads, the next thread in the queue will acquire the FileLock. A new thread won't find the semaphore in the map and will create a new one, acquiring the FileLock again. I have to check if semaphore.hasQueuedThreads() and only remove it, if there is no queued thread. But this is not an atomic operation.
I tried to lock the acquire/release functions for the Semaphore with a single semaphore to make both functions synchronized (bad practice), just to see if it could work. This ended in a deadlock.
val lockMap = ConcurrentHashMap<String, Semaphore>()

fun accessFile(fileName: String) {
    acquireSemaphore(fileName)
    acquireFileLock(fileName)
    doSomethingWithFile(fileName)
    releaseFileLock(fileName)
    releaseSemaphore(fileName)
}

fun acquireSemaphore(fileName: String) {
    lockMap.computeIfAbsent(fileName) { Semaphore(1, true) }.acquire()
}

// Version 1: works, but the map keeps every semaphore
fun releaseSemaphore(fileName: String) {
    lockMap[fileName]?.release()
}

// Version 2: removes the semaphore, but causes OverlappingFileLockExceptions
fun releaseSemaphore(fileName: String) {
    lockMap.remove(fileName)?.release()
}
I want to remove a semaphore from the map when it is released and there are no threads waiting for the semaphore. This should be an atomic operation so no 2nd semaphore is created for the same file.
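One possible way to get that atomicity, sketched here in Java rather than taken from the post: keep a user count next to each semaphore and update it only inside ConcurrentHashMap.compute(), which runs atomically per key. The Entry class, the users counter, and the method names are assumptions made for this illustration, and the FileLock calls are left out.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

class RefCountedLocks {
    // Hypothetical holder: a fair semaphore plus the number of threads
    // that currently hold it or are waiting for it.
    static final class Entry {
        final Semaphore semaphore = new Semaphore(1, true);
        int users; // only read or written inside compute()
    }

    static final ConcurrentHashMap<String, Entry> lockMap = new ConcurrentHashMap<>();

    static void acquire(String fileName) {
        // Register interest atomically, creating the entry if needed.
        Entry entry = lockMap.compute(fileName, (key, existing) -> {
            Entry e = (existing == null) ? new Entry() : existing;
            e.users++;
            return e;
        });
        // Block outside compute() so other keys in the map are not held up.
        entry.semaphore.acquireUninterruptibly();
    }

    static void release(String fileName) {
        lockMap.compute(fileName, (key, e) -> {
            if (e == null) {
                throw new IllegalStateException("release without acquire: " + key);
            }
            e.semaphore.release();
            // Returning null removes the mapping; this happens only when
            // no other thread holds or waits for this semaphore.
            return (--e.users == 0) ? null : e;
        });
    }
}
```

Because a waiting thread has already incremented users before it blocks, the entry cannot disappear while anyone is queued, so a second semaphore for the same file is never created.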
Is this implementation safe to synchronize the access to the public fields/properties?
class Attributes(
    private val attrsMap: MutableMap<String, Any?> = Collections.synchronizedMap(HashMap())
) {
    var attr1: Long? by attrsMap
    var attr2: String? by attrsMap
    var attr3: Date? by attrsMap
    var attr4: Any? = null
    ...
}
Mostly.
Because the underlying map is only accessible via the synchronised wrapper, you can't have any issues caused by individual calls, such as simultaneous gets and/or puts (which is the main cause of race conditions): only one thread can be making such a call at a time, and the Java memory model ensures that the results are then visible to all threads.
You could have race conditions involving a sequence of calls, such as iterating through the map, or a check followed by a modify, if the map could be modified in between. (That sort of problem can occur even on a single thread.) But as long as the rest of your class avoided such sequences, and didn't leak a reference to the map, you'd be safe.
And because the types Long, String, and Date are immutable, you can't have any issues with their contents being modified.
That is a concern with the Any parameter, though. If it stored e.g. a StringBuilder, one thread could be modifying its contents while another was accessing it, with hilarious consequences. There's not much you can do about that in a wrapper class, though.
By the way, instead of using a synchronised wrapper, you could use a ConcurrentHashMap, which would avoid the synchronisation in most cases (at the cost of a bit more memory). It also provides many methods which can replace call sequences, such as putIfAbsent(), computeIfAbsent() and merge() (Kotlin adds a getOrPut() extension for concurrent maps); it's a really powerful tool for writing high-performance multithreaded code.
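As a rough illustration in Java (the class and method names here are hypothetical), a check-then-modify sequence such as "get, then put if missing" or "get, add one, put back" can be collapsed into a single atomic call on ConcurrentHashMap:

```java
import java.util.concurrent.ConcurrentHashMap;

class AtomicMapOps {
    static final ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();

    // Replaces "read, add one, write back" with one atomic update per key.
    static int increment(String key) {
        return counts.merge(key, 1, Integer::sum);
    }

    // Replaces "if absent then put" with one atomic call; the value is
    // computed only when the key has no mapping yet.
    static int getOrCreate(String key, int initial) {
        return counts.computeIfAbsent(key, k -> initial);
    }
}
```

With a synchronised wrapper, the same sequences would need an explicit synchronized block around the whole check-and-modify step.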
Update: I can confirm that objectWithID may need a parent (or grandparent, etc.) context's thread to do some fetching, so avoid blocking your parent thread with something like waitUntilAllOperationsAreFinished.
As a quick test, I pointed the child MOCs' parent to their grandparent instead and left the child threads blocking the original parent. In this setup the deadlock never occurred. This is a poor architecture, though, so I'll be rearchitecting.
Original Question
I have two layers of NSOperationQueue. The first is an NSOperation graph with operations that have a set of dependencies between them. They all run fine without deadlocking each other. Within one of these operations (a Scheduler for groups of people) I have broken out its work into more discrete chunks that can be run on another NSOperationQueue. However, I still want the Scheduler to finish creating all of its schedules before the larger operation is considered finished. To that end, once I create all Schedule operations and add them to the Scheduler operation queue, I call waitUntilAllOperationsAreFinished on the operation queue. This is where I deadlock.
I am using Core Data and have an NSBlockOperation subclass called BlockOperation that handles the routine of taking a parent managed object context, creating a PrivateQueueConcurrencyType child context, calling the provided block using performBlockAndWait and finally waiting on the parent context to merge changes. Here's some code...
init(block: (NSManagedObjectContext?) -> Void, withDependencies dependencies: Array<NSOperation>, andParentManagedObjectContext parentManagedObjectContext: NSManagedObjectContext?) {
    self.privateContext = NSManagedObjectContext(concurrencyType: .PrivateQueueConcurrencyType)
    super.init()
    self.queuePriority = NSOperationQueuePriority.Normal
    addExecutionBlock({
        if (parentManagedObjectContext != nil) {
            self.parentContext = parentManagedObjectContext!
            self.privateContext.parentContext = parentManagedObjectContext!
            self.privateContext.performBlockAndWait({ () -> Void in
                block(self.privateContext)
            })
            self.parentContext!.performBlockAndWait({ () -> Void in
                var error: NSError?
                self.parentContext!.save(&error)
            })
        }
    })
    for operation in dependencies {
        addDependency(operation)
    }
}
This is working really well for me already. But now I want to block a calling thread until an operation queue on it has finished all of its operations. Like this...
for group in groups {
    let groupId = group.objectID
    let scheduleOperation = BlockOperation(
        block: { (managedObjectContext: NSManagedObjectContext?) -> Void in
            ScheduleOperation.scheduleGroupId(groupId, inManagedObjectContext: managedObjectContext!)
        },
        withDependencies: [],
        andParentManagedObjectContext: managedObjectContext)
    scheduleOperationQueue.addOperation(scheduleOperation)
}
scheduleOperationQueue.waitUntilAllOperationsAreFinished()
...this thread gets stuck on that last line (obviously). But we never see the other threads make any progress past a certain point. Pausing the debugger I see where the queued operations are stuck. It's in a ScheduleOperation's init method where we fetch the group using the provided id. (ScheduleOperation.scheduleGroupId calls this init)
convenience init(groupId: NSManagedObjectID, inManagedObjectContext managedObjectContext: NSManagedObjectContext) {
let group = managedObjectContext.objectWithID(groupId) as Group
...
Does objectWithID need to execute code on the "parent" thread that its parent moc is associated with, thereby creating a deadlock? Is there anything else about my approach that could be causing this?
Note: Although I am writing this in Swift, I have added Objective-C as a tag because I feel like this is not a language-specific issue, but a framework-specific one.
In general, it's not specified on which thread objectWithID will be called; it's an implementation detail. I had some problems with Core Data deadlocks in the past (although in different circumstances) and I found out that the framework does some locking internally when you invoke methods on NSManagedObjectContext. So yes, I think it might result in a deadlock.
I have no advice other than redesigning your architecture; maybe it can be simplified a little. Keep in mind that you already have a private serial queue associated with a context, which guarantees that the operations will be called in the specified order. You can therefore share the same context between all the ScheduleOperation instances. Set scheduleOperationQueue.maxConcurrentOperationCount to 1, so that operations will execute one after another. And instead of blocking the calling thread, call a completion handler when the last operation finishes (you can use the operation's completionBlock).
Option Strict On
Public Class UtilityClass
    Private Shared _MyVar As String
    Public Shared ReadOnly Property MyVar() As String
        Get
            If String.IsNullOrEmpty(_MyVar) Then
                _MyVar = System.Guid.NewGuid.ToString()
            End If
            Return _MyVar
        End Get
    End Property
    Public Shared Sub SaveValue(ByVal newValue As String)
        _MyVar = newValue
    End Sub
End Class
While locking is a good general approach to adding thread safety, in many scenarios involving write-once quasi-immutability, where a field should become immutable as soon as a non-null value is written to it, Threading.Interlocked.CompareExchange may be better. Essentially, that method reads a field and--before anyone else can touch it--writes a new value if and only if the field matches the supplied "compare" value; it returns the value that was read in any case. If two threads simultaneously attempt a CompareExchange, with both threads specifying the field's present value as the "compare" value, one of the operations will update the value and the other will not, and each operation will "know" whether it succeeded.
There are two main usage patterns for CompareExchange. The first is most useful for generating mutable singleton objects, where it's important that everyone see the same instance.
If _thing Is Nothing Then
    Dim NewThing As New Thingie() ' Or construct it somehow
    Threading.Interlocked.CompareExchange(_thing, NewThing, Nothing)
End If
This pattern is probably what you're after. Note that if a thread enters the above code between the time another thread has done so and the time it has performed the CompareExchange, both threads may end up creating a new Thingie. If that occurs, whichever thread reaches the CompareExchange first will have its new instance stored in _thing, and the other thread will abandon its instance. In this scenario, the threads don't care whether they win or lose; _thing will have a new instance in it, and all threads will see the same instance there. Note also that because there's no memory barrier before the first read, it is theoretically possible that a thread which has examined the value of _thing sometime in the past might continue seeing it as Nothing until something causes it to update its cache, but if that happens the only consequence will be the creation of a useless new instance of Thingie which will then get discarded when the Interlocked.CompareExchange finds that _thing has already been written.
The other main usage pattern is useful for updating references to immutable objects, or--with slight adaptations--updating certain value types like Integer or Long.
Dim NewThing, WasThing As Thingie
Do
    WasThing = _thing
    NewThing = WasThing.WithSomeChange()
Loop While Threading.Interlocked.CompareExchange(_thing, NewThing, WasThing) IsNot WasThing
In this scenario, assuming there is some means by which, given a reference to Thingie, one may cheaply produce a new instance that differs in some desired way, it's possible to perform any such operation on _thing in a thread-safe manner. For example, given a String, one may easily produce a new String which has some characters appended. If one wished to append some text to a string in a thread-safe manner (such that if one thread attempts to add Fred and the other tries to add Joe, the net result would be to append either FredJoe or JoeFred, and not something like FrJoeed), the above code would have each thread read _thing, generate a version with its text appended, and try to update _thing. If some other thread updated _thing in the meantime, the thread would abandon the last string it constructed, make a new string based upon the updated _thing, and try again.
Note that while this approach isn't necessarily faster than the locking approach, it does offer an advantage: if a thread which acquires a lock gets stuck in an endless loop or otherwise waylaid, all threads will be forever blocked from accessing the locked resource. By contrast, if the WithSomeChange() method above gets stuck in an endless loop, other users of _thing won't be affected.
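For readers on the JVM, the Fred/Joe append scenario can be sketched in Java with AtomicReference.compareAndSet, which plays roughly the role of Interlocked.CompareExchange here. This is an analogue under that assumption, not a translation of the VB code; the class and field names are made up for the example. One difference worth noting: compareAndSet returns a boolean rather than the value that was read.

```java
import java.util.concurrent.atomic.AtomicReference;

class CasAppend {
    // Shared reference to an immutable String, updated only by compare-and-set.
    static final AtomicReference<String> thing = new AtomicReference<>("");

    static void append(String text) {
        String was;
        String next;
        do {
            was = thing.get();   // read the current value
            next = was + text;   // build a new immutable value from it
            // Retry if another thread replaced the reference in the meantime.
        } while (!thing.compareAndSet(was, next));
    }
}
```

If two threads call append("Fred") and append("Joe") at the same time, the loser of the race simply rebuilds its string from the updated value, so the pieces are never interleaved.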
With multithreaded code, the relevant question is: Can state be modified from several threads? If so, the code isn’t thread safe.
In your code, that’s the case: there are several places which mutate _MyVar and the code is therefore not thread safe. The best way to make code thread safe is almost always to make it immutable: immutable state is simply thread safe by default. Furthermore, code that doesn’t modify state across threads is simpler and usually more efficient than mutating multi-threaded code.
Unfortunately, it’s impossible to see without context whether (or how) your code could be made immutable across threads. So we need to resort to locks, which are slow and error-prone (see the other answer for how easy it is to get it wrong) and give a false sense of security.
The following is my attempt to make the code correct using locks. It should work (but keep in mind the false sense of security):
Public Class UtilityClass
    Private Shared _MyVar As String
    Private Shared ReadOnly _LockObj As New Object()
    Public Shared ReadOnly Property MyVar() As String
        Get
            SyncLock _LockObj
                If String.IsNullOrEmpty(_MyVar) Then
                    _MyVar = System.Guid.NewGuid.ToString()
                End If
                Return _MyVar
            End SyncLock
        End Get
    End Property
    Public Shared Sub SaveValue(ByVal newValue As String)
        SyncLock _LockObj
            _MyVar = newValue
        End SyncLock
    End Sub
End Class
A few comments:
We cannot lock on _MyVar since we change the reference of _MyVar, thus losing our lock. We need a separate dedicated locking object.
We need to lock each access to the variable, or at the very least every mutating access. Otherwise all the locking is for naught since it can be undone by changing the variable in another place.
Theoretically we do not need to lock if we only read the value – however, that would require double-checked locking which introduces the opportunity for more errors, so I’ve not done it here.
Although we don’t necessarily need to lock read accesses (see previous two points), we might still have to introduce a memory barrier somewhere to prevent reordering of read-write access to this property. I do not know when this becomes relevant because the rules are quite complex, and this is another reason I dislike locks.
All in all, it’s much easier to change the code design so that no more than one thread at a time has write access to any given variable, and to restrict all necessary communication between threads to well-defined communication channels via synchronised data structures.
I am using GPars asynchronous functions to fire off a process as each line in a file is parsed.
I am seeing some strange behavior that makes me wonder if I have an issue with thread safety.
Let's say I have a current object that is being loaded up with values from the current row in an input spreadsheet, like so:
class Uploader {
    MyRowObject currentRowObject
}
Once it has all the values from the current row, I fire off an async closure that looks a bit like this:
Closure processCurrentRowObject = { ->
myService.processCurrentRowObject (currentRowObject)
}.asyncFun()
It is defined in the same class, so it has access to the currentRowObject.
While that is off and running, I parse the next row, and start by creating a new object:
currentRowObject = new MyRowObject()
and start loading it up with values.
I assumed this would be safe, because the asynchronous function would still be pointing to the previous object. However, since the closure binds to the reference rather than to the object, I wonder whether the reference is being updated inside the async function, so that I am pulling the object instance out from under it, so to speak: changing it while it is still trying to work on the previous instance.
If so, any suggestions for fixing? Or am I safe?
Thanks!
I'm not sure I fully understand your case, however, here's a quick tip.
Since it is always dangerous to share a single mutable object among threads, I'd recommend completely separating the row objects used for different rows:
final localRowObject = currentRowObject
currentRowObject = null
Closure processCurrentRowObject = { ->
myService.processCurrentRowObject (localRowObject)
}.asyncFun()
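The same idea can be sketched in Java, with CompletableFuture standing in for the GPars asyncFun() call. The class, field, and method names below are hypothetical. The point is that the lambda captures a local final reference, so reassigning the field for the next row cannot affect a task that is already running:

```java
import java.util.concurrent.CompletableFuture;

class RowUploader {
    // Hypothetical immutable row type for the illustration.
    static final class Row {
        final String value;
        Row(String value) { this.value = value; }
    }

    private Row currentRow; // reassigned by the parsing thread for each row

    void startRow(String value) {
        currentRow = new Row(value);
    }

    CompletableFuture<String> submitCurrentRow() {
        final Row localRow = currentRow; // snapshot of the reference for this task
        currentRow = null;               // the field no longer points at the submitted row
        // The lambda sees only localRow, never the mutable field.
        return CompletableFuture.supplyAsync(() -> "processed " + localRow.value);
    }
}
```

Each task works on the object it was handed at submission time, regardless of how far the parser has advanced since.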
I've searched StackOverflow and there are many ConcurrentModificationException questions. After reading them, I'm still confused. I'm getting a lot of these exceptions. I'm using a "Registry" setup to keep track of Objects:
public class Registry {
    public static ArrayList<Messages> messages = new ArrayList<Messages>();
    public static ArrayList<Effect> effects = new ArrayList<Effect>();
    public static ArrayList<Projectile> proj = new ArrayList<Projectile>();

    /** Clears all arrays */
    public static void recycle(){
        messages.clear();
        effects.clear();
        proj.clear();
    }
}
I'm adding and removing objects to these lists by accessing the ArrayLists like this: Registry.effects.add(obj) and Registry.effects.remove(obj)
I managed to get around some errors by using a retry loop:
//somewhere in my game..
boolean retry = true;
while (retry){
    try {
        removeEffectsWithSource("CHARGE");
        retry = false;
    }
    catch (ConcurrentModificationException c){}
}

private void removeEffectsWithSource(String src) throws ConcurrentModificationException {
    ListIterator<Effect> it = Registry.effects.listIterator();
    while ( it.hasNext() ){
        Effect f = it.next();
        if ( f.Source.equals(src) ) {
            f.unapplyEffects();
            Registry.effects.remove(f);
        }
    }
}
But in other cases this is not practical. I keep getting ConcurrentModificationExceptions in my drawProjectiles() method, even though it doesn't modify anything. I suppose the culprit is if I touched the screen, which creates a new Projectile object and adds it to Registry.proj while the draw method is still iterating.
I can't very well do a retry loop with the draw method, or it will re-draw some of the objects. So now I'm forced to find a new solution.. Is there a more stable way of accomplishing what I'm doing?
Oh and part 2 of my question: Many people suggest using ListIterators (as I have been using), but I don't understand.. if I call ListIterator.remove() does it remove that object from the ArrayList it's iterating through, or just remove it from the Iterator itself?
Top line, three recommendations:
Don't do the "wrap an exception in a loop" thing. Exceptions are for exceptional conditions, not control flow. (Effective Java #57 or Exceptions and Control Flow or Example of "using exceptions for control flow")
If you're going to use a Registry object, expose thread-safe behavioral methods on that object rather than accessors, and contain the concurrency reasoning within that single class. Your life will get better. No exposing collections in public fields. (ew, and why are those fields static?)
To solve the actual concurrency issues, do one of the following:
Use synchronized collections (potential performance hit)
Use concurrent collections (sometimes complicated logic, but probably efficient)
Use snapshots (probably with synchronized or a ReadWriteLock under the covers)
Part 1 of your question
You should use a concurrent data structure for the multi-threaded scenario, or use a synchronizer and make a defensive copy. Directly exposing the collections as public fields is probably wrong: your registry should expose thread-safe behavioral accessors to those collections. For instance, maybe you want a Registry.safeRemoveEffectBySource(String src) method. Keep the threading specifics internal to the registry, which seems to be the "owner" of this aggregate information in your design.
Since you probably don't really need List semantics, I suggest replacing these with ConcurrentHashMaps wrapped into Set using Collections.newSetFromMap().
Your draw() method could either a) use a Registry.getEffectsSnapshot() method that returns a snapshot of the set; or b) use an Iterable<Effect> Registry.getEffects() method that returns a safe iterable version (maybe just backed by the ConcurrentHashMap, which won't throw CME under any circumstances). I think (b) is preferable here, as long as the draw loop doesn't need to modify the collection. This provides a very weak synchronization guarantee between the mutator thread(s) and the draw() thread, but assuming the draw() thread runs often enough, missing an update or something probably isn't a big deal.
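A minimal sketch of option (b), assuming a simplified Effect type with only a source field; the class and method names are illustrative, not taken from the question's code. The set is backed by a ConcurrentHashMap via Collections.newSetFromMap(), so its iterators are weakly consistent and do not throw ConcurrentModificationException:

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class EffectRegistry {
    // Simplified stand-in for the game's Effect class.
    static final class Effect {
        final String source;
        Effect(String source) { this.source = source; }
    }

    // Concurrent set; kept private so all threading concerns stay in this class.
    private final Set<Effect> effects =
            Collections.newSetFromMap(new ConcurrentHashMap<>());

    void add(Effect effect) {
        effects.add(effect);
    }

    // Behavioral method: callers never touch the collection directly.
    int removeEffectsBySource(String source) {
        int removed = 0;
        for (Iterator<Effect> it = effects.iterator(); it.hasNext(); ) {
            if (it.next().source.equals(source)) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    // Read-only view for the draw loop; safe to iterate while other
    // threads add or remove effects, though it may miss recent changes.
    Iterable<Effect> getEffects() {
        return Collections.unmodifiableSet(effects);
    }
}
```

The draw loop can then iterate over getEffects() without a retry loop, at the cost of occasionally drawing a frame that does not yet reflect the latest update.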
Part 2 of your question
As another answer notes, in the single-thread case, you should just make sure you use the Iterator.remove() to remove the item, but again, you should wrap this logic inside the Registry class if at all possible. In some cases, you'll need to lock a collection, iterate over it collecting some aggregate information, and make structural modifications after the iteration completes. You ask if the remove() method just removes it from the Iterator or from the backing collection... see the API contract for Iterator.remove() which tells you it removes the object from the underlying collection. Also see this SO question.
You cannot directly remove an item from a collection while you are still iterating over it, otherwise you will get a ConcurrentModificationException.
The solution is, as you hint, to call the remove method on the Iterator instead. This will remove it from the underlying collection as well, but it will do it in such a way that the Iterator knows what's going on and so doesn't throw an exception when it finds the collection has been modified.
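For the single-threaded case, the difference can be shown with a small sketch (the class and method names are hypothetical). Removing through the iterator updates the underlying list and keeps the iterator's bookkeeping consistent:

```java
import java.util.Iterator;
import java.util.List;

class IteratorRemoveDemo {
    // Removes every element equal to target from the given list, in place.
    static void removeAllMatching(List<String> items, String target) {
        for (Iterator<String> it = items.iterator(); it.hasNext(); ) {
            if (it.next().equals(target)) {
                it.remove(); // removes from the backing list, not just the iterator
            }
        }
    }
}
```

Calling items.remove(...) inside the same loop instead would modify the list behind the iterator's back, which is what triggers the ConcurrentModificationException on the next call to next().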