I'm looking to implement an Iterator<> on a ChronicleQueue methodReader (or a tailer, if this can't be done with a methodReader).
Is there a way to see whether a queue has more data (can I use lastReadIndex() < ???), so that I can implement the hasNext() of the Iterator?
I wonder if it is possible to implement something similar to the do-notation of Haskell in Kotlin on Lists or List-Like structures with monadic properties.
Take following example:
fun <A, B> cartesianProduct(xs: List<A>, ys: List<B>): List<Pair<A, B>> =
    xs.flatMap { x -> ys.flatMap { y -> listOf(x to y) } }
It would be nice if I could write something like
suspend fun <A, B> cartesianProduct(xs: List<A>, ys: List<B>): List<Pair<A, B>> =
    list {
        val x = xs.bind()
        val y = ys.bind()
        yield(x to y)
    }
Arrow-Kt defines similar comprehensions using coroutines for either, nullable, option and eval. I looked at the implementation and also its Effect documentation, but I have trouble translating the concept to Lists. Is this even possible in Kotlin?
It's not possible at the moment to implement monad comprehensions for List, Flow, and other non-deterministic data structures that emit more than one value. The current implementation of continuations in Kotlin is single-shot only, which means a continuation can resume a program with a single emitted value. Resuming the program more than once requires hijacking the continuation stack labels with reflection in order to replay their state in the second resumption. Additionally, replaying a block in which a multishot data type is binding would replay all effects preceding the bind, since the block has to emit again.
list {
    println("printed 3 times and not cool")
    val a = listOf(1, 2, 3).bind()
    a
}
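To make the replay problem concrete, here is a rough simulation in plain Java. It is not how Arrow or Kotlin continuations work internally; it only assumes that the one available strategy is "run the whole block again for every choice". The names ReplayComprehension, Scope, list and bind are made up for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical replay-based list comprehension: the block is re-executed
// from the top once per combination of choices made by bind().
final class ReplayComprehension {

    /** Thrown when bind() meets an empty list: this run yields no result. */
    private static final class EmptyBind extends RuntimeException {}

    /** Handed to the block; bind() returns one element per run. */
    static final class Scope {
        private final List<Integer> path;                      // index chosen at each bind
        private final List<Integer> sizes = new ArrayList<>(); // list size seen at each bind
        private int position = 0;

        Scope(List<Integer> path) { this.path = path; }

        <T> T bind(List<T> xs) {
            if (xs.isEmpty()) throw new EmptyBind();
            if (position == path.size()) path.add(0); // first visit: start at element 0
            sizes.add(xs.size());
            return xs.get(path.get(position++));
        }
    }

    static <R> List<R> list(Function<Scope, R> block) {
        List<R> results = new ArrayList<>();
        List<Integer> path = new ArrayList<>();
        while (true) {
            Scope scope = new Scope(path);
            try {
                results.add(block.apply(scope)); // every side effect in the block runs again
            } catch (EmptyBind e) {
                // an empty list contributes nothing along this path
            }
            // Advance the choices like an odometer, last bind first.
            int i = scope.sizes.size() - 1;
            while (i >= 0 && path.get(i) + 1 >= scope.sizes.get(i)) i--;
            if (i < 0) return results;                 // all combinations visited
            path.set(i, path.get(i) + 1);
            path.subList(i + 1, path.size()).clear();  // later binds restart from 0
        }
    }
}
```

Calling ReplayComprehension.list(s -> { System.out.println("effect"); return s.bind(List.of(1, 2, 3)); }) returns the three elements, but the println also runs three times, which is the behaviour described above.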
The arrow-continuations library already includes a MultiShot delimited scope for reset/shift, but it's currently internal since it is not safe until Kotlin suspension or continuations provide the ability to multishot without replaying the current block. Alternatively, we would need real for-comprehensions or a similar structure to enforce that binds happen before other code, which would also solve the block-replaying issue.
The Effect interface ultimately delegates to one of these scopes for its implementation. The current versions of Reset.suspended and Reset.restricted are single shot.
I need a queue that, like a map or set, has unique keys, so that it supports
queue.put(key, value) //<== put in the queue and keep the order (it becomes last in the queue)
queue.get(key)
queue.remove(key)
and it can also do something like
queue.pop() // remove from head
It seems this could be implemented with LinkedHashMap, but that is not thread safe. Is there another data structure in Kotlin that could be used for this case? Or LruCache?
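One way this could look, as a minimal sketch rather than a definitive answer: wrap a LinkedHashMap and synchronize every method on the wrapper. It is written in Java so it can be called from Kotlin unchanged. The class name KeyedQueue is made up, and the sketch assumes that putting an existing key again should move it to the tail.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical thread-safe queue with unique keys, backed by a LinkedHashMap.
final class KeyedQueue<K, V> {
    private final LinkedHashMap<K, V> entries = new LinkedHashMap<>();

    /** Adds the entry at the tail; an existing key is moved to the tail (assumption). */
    public synchronized void put(K key, V value) {
        entries.remove(key); // plain put() would keep the old insertion position
        entries.put(key, value);
    }

    public synchronized V get(K key) {
        return entries.get(key);
    }

    public synchronized V remove(K key) {
        return entries.remove(key);
    }

    /** Removes and returns the value at the head, or null if the queue is empty. */
    public synchronized V pop() {
        Iterator<Map.Entry<K, V>> it = entries.entrySet().iterator();
        if (!it.hasNext()) return null;
        V head = it.next().getValue();
        it.remove();
        return head;
    }

    public synchronized int size() {
        return entries.size();
    }
}
```

Every operation takes the same lock, so this favours simplicity over throughput. Under heavy contention a different design would be needed.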
I am working on a distributed algorithm and decided to use Akka to scale it across machines. The machines need to exchange messages very frequently and these messages reference some immutable objects that exist on every machine. Hence, it seems sensible to "compress" the messages in the sense that the shared, replicated objects should not be serialized in the messages. Not only would this save on network bandwidth, but it would also avoid creating duplicate objects on the receiver side whenever a message is deserialized.
Now, my question is how to do this properly. So far, I could think of two options:
1. Handle this on the "business layer", i.e., convert my original message objects to some reference objects that replace references to the shared, replicated objects with symbolic references. Then, I would send those reference objects rather than the original messages. Think of it as replacing some actual web resource with a URL. Doing this seems rather straightforward in terms of coding, but it also drags serialization concerns into the actual business logic.
2. Write custom serializers that are aware of the shared, replicated objects. In my case, it would be okay that this solution would introduce the replicated, shared objects as global state to the actor systems via the serializers. However, the Akka documentation does not describe how to programmatically add custom serializers, which would be necessary to weave in the shared objects with the serializer. Also, I could imagine that there are a couple of reasons why such a solution would be discouraged. So, I am asking here.
Thanks a lot!
It's possible to write your own, custom serializers and let them do all sorts of weird things, then you can bind them at the config level as usual:
import akka.serialization.Serializer

class MyOwnSerializer extends Serializer {
  // If you need logging here, introduce a constructor that takes an ExtendedActorSystem.
  // class MyOwnSerializer(actorSystem: ExtendedActorSystem) extends Serializer
  // Get a logger using:
  // private val logger = Logging(actorSystem, this)

  // This is whether "fromBinary" requires a "clazz" or not
  def includeManifest: Boolean = true

  // Pick a unique identifier for your Serializer,
  // you've got a couple of billions to choose from,
  // 0 - 40 is reserved by Akka itself
  def identifier = 1234567

  // "toBinary" serializes the given object to an Array of Bytes
  def toBinary(obj: AnyRef): Array[Byte] = {
    // Put the code that serializes the object here
    //#...
    Array[Byte]()
    //#...
  }

  // "fromBinary" deserializes the given array,
  // using the type hint (if any, see "includeManifest" above)
  def fromBinary(
    bytes: Array[Byte],
    clazz: Option[Class[_]]): AnyRef = {
    // Put your code that deserializes here
    //#...
    null
    //#...
  }
}
But this raises an important question: if your messages all reference data that is already shared across the machines, why would you want to put a pointer to the object in the message (very bad! messages should be immutable, and a pointer isn't!), rather than some sort of immutable string objectId (kind of your option 1)? This is a much better option when it comes to preserving the immutability of the messages, and it requires little change in your business logic (just put a wrapper over the shared state storage).
For more info, see the documentation.
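As a rough illustration of the objectId idea, here is a plain-Java sketch with no Akka in it. All names (SharedObject, SharedContext, CompactMessage) are hypothetical, and it assumes every node registers the same shared objects under the same ids before messages arrive.

```java
import java.io.Serializable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical immutable object that is replicated on every node.
final class SharedObject {
    final String id;
    final String payload;

    SharedObject(String id, String payload) {
        this.id = id;
        this.payload = payload;
    }
}

// Hypothetical per-node wrapper over the shared state storage.
final class SharedContext {
    private final Map<String, SharedObject> byId = new ConcurrentHashMap<>();

    void register(SharedObject obj) {
        byId.put(obj.id, obj);
    }

    SharedObject resolve(String id) {
        SharedObject obj = byId.get(id);
        if (obj == null) throw new IllegalStateException("unknown shared object: " + id);
        return obj;
    }
}

// The message carries only the immutable id, never the shared object itself.
final class CompactMessage implements Serializable {
    private static final long serialVersionUID = 1L;

    final String objectId;
    final int amount;

    private CompactMessage(String objectId, int amount) {
        this.objectId = objectId;
        this.amount = amount;
    }

    /** Sender side: replace the object reference with its id. */
    static CompactMessage of(SharedObject obj, int amount) {
        return new CompactMessage(obj.id, amount);
    }

    /** Receiver side: look the object up in the local copy of the shared state. */
    SharedObject resolve(SharedContext context) {
        return context.resolve(objectId);
    }
}
```

Only the id and the small payload field are serialized, and the receiver gets back the instance it already holds instead of a deserialized duplicate.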
I finally went with the solution proposed by Diego and want to share some more details on my reasoning and solution.
First of all, I am also in favor of option 1 (handling the "compaction" of messages in the business layer) for these reasons:
Serializers are global to the actor system. Making them stateful is actually a most severe violation of Akka's very philosophy as it goes against the encapsulation of behavior and state in actors.
Serializers have to be created upfront, anyway (even when adding them "programmatically").
Design-wise, one can argue that "message compaction" is not a responsibility of the serializer, either. In a strict sense, serialization is merely the transformation of runtime-specific data into a compact, exchangeable representation. Changing what to serialize is not a task of a serializer, though.
Having settled upon this, I still strived for a clear separation of "message compaction" and the actual business logic in the actors. I came up with a neat way to do this in Scala, which I want to share here. The basic idea is to make the message itself look like a normal case class but still allow these messages to "compactify" themselves. Here is an abstract example:
import akka.actor.{Actor, ActorRef}

class Sender extends Actor {
  import Receiver.MyMessage
  // This is the shared data present on every node
  // (named sharedContext to avoid clashing with Actor.context).
  def sharedContext: SharedContext = ...
  // ...
  def someBusinessLogic(receiver: ActorRef): Unit = {
    val someData = computeData
    receiver ! MyMessage(someData)
  }
}

class Receiver extends Actor {
  import Receiver.MyMessage
  // This is the shared data present on every node.
  implicit def sharedContext: SharedContext = ...
  def receive = {
    case MyMessage(someData) =>
    // ...
  }
}

object Receiver {
  object MyMessage {
    def apply(someData: SomeData) = MyCompactMessage(someData)
    def unapply(myCompactMessage: MyCompactMessage)(implicit context: SharedContext)
    : Option[SomeData] =
      Some(myCompactMessage.someData(context))
  }
}
As you can see, the sender and receiver code feels just like using a case class and in fact, MyMessage could be a case class.
However, by implementing apply and unapply manually, one can insert one's own "compactification" logic and also implicitly inject the shared data necessary to do the "uncompactification", without touching the sender and receiver. For defining MyCompactMessage, I found Protocol Buffers to be especially suited, as it is already a dependency of Akka and efficient in terms of space and computation, but any other solution would do.
Update: I can confirm that objectWithID may need a parent (or grandparent, etc.) context's thread to do some fetching, so avoid blocking your parent thread with something like waitUntilAllOperationsAreFinished.
As a quick test I pointed the children moc's parent to their grandparent instead and left the children threads blocking the original parent. In this setup the deadlock never occurred. This is a poor architecture though so I'll be rearchitecting.
Original Question
I have two layers of NSOperationQueue. The first is an NSOperation graph with operations that have a set of dependencies between them. They all run fine without deadlocking each other. Within one of these operations (a Scheduler for groups of people) I have broken out its work to more discrete chunks that can be run on another NSOperationQueue. However I still will want the Scheduler to finish creating all of its schedules before the larger operation is considered finished. To that end, once I create all Schedule operations and add them to the Scheduler operation queue, I call waitUntilAllOperationsAreFinished on the operation queue. This is where I deadlock.
I am using Core Data and have an NSBlockOperation subclass called BlockOperation that handles the routine of taking a parent managed object context, creating a PrivateQueueConcurrencyType child context, calling the provided block using performBlockAndWait and finally waiting on the parent context to merge changes. Here's some code...
init(block: (NSManagedObjectContext?) -> Void, withDependencies dependencies: Array<NSOperation>, andParentManagedObjectContext parentManagedObjectContext: NSManagedObjectContext?) {
    self.privateContext = NSManagedObjectContext(concurrencyType: .PrivateQueueConcurrencyType)
    super.init()
    self.queuePriority = NSOperationQueuePriority.Normal
    addExecutionBlock({
        if (parentManagedObjectContext != nil) {
            self.parentContext = parentManagedObjectContext!
            self.privateContext.parentContext = parentManagedObjectContext!
            self.privateContext.performBlockAndWait({ () -> Void in
                block(self.privateContext)
            })
            self.parentContext!.performBlockAndWait({ () -> Void in
                var error: NSError?
                self.parentContext!.save(&error)
            })
        }
    })
    for operation in dependencies {
        addDependency(operation)
    }
}
This is working really well for me already. But now I want to block a calling thread until an operation queue on it has finished all of its operations. Like this...
for group in groups {
    let groupId = group.objectID
    let scheduleOperation = BlockOperation(
        block: { (managedObjectContext: NSManagedObjectContext?) -> Void in
            ScheduleOperation.scheduleGroupId(groupId, inManagedObjectContext: managedObjectContext!)
        },
        withDependencies: [],
        andParentManagedObjectContext: managedObjectContext)
    scheduleOperationQueue.addOperation(scheduleOperation)
}
scheduleOperationQueue.waitUntilAllOperationsAreFinished()
...this thread gets stuck on that last line (obviously). But we never see the other threads make any progress past a certain point. Pausing the debugger I see where the queued operations are stuck. It's in a ScheduleOperation's init method where we fetch the group using the provided id. (ScheduleOperation.scheduleGroupId calls this init)
convenience init(groupId: NSManagedObjectID, inManagedObjectContext managedObjectContext: NSManagedObjectContext) {
    let group = managedObjectContext.objectWithID(groupId) as Group
    ...
Does objectWithID need to execute code on the "parent" thread that its parent moc is associated with and therefore creating a deadlock? Is there anything else about my approach that could be causing this?
Note: Although I am writing this in Swift, I have added Objective-C as a tag because I feel like this is not a language-specific issue, but a framework-specific one.
In general it's not specified on which thread objectWithID will be called, it's an implementation detail. I had some problems with Core Data deadlocks in the past (although in different circumstances) and I found out that the framework does some locking internally when you invoke methods on NSManagedObjectContext. So yes, I think it might result in a deadlock.
I have no advice other than redesigning your architecture; maybe it can be simplified a little. Keep in mind that you already have a private serial queue associated with a context, which guarantees that the operations will be called in the specified order. You can therefore share the same context between all the ScheduleOperation instances. Set scheduleOperationQueue.maxConcurrentOperationCount to 1, so that operations will execute one after another. And instead of blocking the calling thread, call a completion handler when the last operation finishes (you can use the operation's completionBlock).
I've searched StackOverflow and there are many ConcurrentModificationException questions. After reading them, I'm still confused. I'm getting a lot of these exceptions. I'm using a "Registry" setup to keep track of Objects:
public class Registry {
    public static ArrayList<Messages> messages = new ArrayList<Messages>();
    public static ArrayList<Effect> effects = new ArrayList<Effect>();
    public static ArrayList<Projectile> proj = new ArrayList<Projectile>();

    /** Clears all arrays */
    public static void recycle(){
        messages.clear();
        effects.clear();
        proj.clear();
    }
}
I'm adding and removing objects to these lists by accessing the ArrayLists like this: Registry.effects.add(obj) and Registry.effects.remove(obj)
I managed to get around some errors by using a retry loop:
//somewhere in my game..
boolean retry = true;
while (retry){
    try {
        removeEffectsWithSource("CHARGE");
        retry = false;
    }
    catch (ConcurrentModificationException c){}
}

private void removeEffectsWithSource(String src) throws ConcurrentModificationException {
    ListIterator<Effect> it = Registry.effects.listIterator();
    while ( it.hasNext() ){
        Effect f = it.next();
        if ( f.Source.equals(src) ) {
            f.unapplyEffects();
            Registry.effects.remove(f);
        }
    }
}
But in other cases this is not practical. I keep getting ConcurrentModificationExceptions in my drawProjectiles() method, even though it doesn't modify anything. I suppose the culprit is if I touched the screen, which creates a new Projectile object and adds it to Registry.proj while the draw method is still iterating.
I can't very well do a retry loop with the draw method, or it will re-draw some of the objects. So now I'm forced to find a new solution.. Is there a more stable way of accomplishing what I'm doing?
Oh and part 2 of my question: Many people suggest using ListIterators (as I have been using), but I don't understand.. if I call ListIterator.remove() does it remove that object from the ArrayList it's iterating through, or just remove it from the Iterator itself?
Top line, three recommendations:
1. Don't do the "wrap an exception in a loop" thing. Exceptions are for exceptional conditions, not control flow. (Effective Java #57 or Exceptions and Control Flow or Example of "using exceptions for control flow")
2. If you're going to use a Registry object, expose thread-safe behavioral methods, not accessors, on that object and contain the concurrency reasoning within that single class. Your life will get better. No exposing collections in public fields. (Ew, and why are those fields static?)
3. To solve the actual concurrency issues, do one of the following:
   - Use synchronized collections (potential performance hit)
   - Use concurrent collections (sometimes complicated logic, but probably efficient)
   - Use snapshots (probably with synchronized or a ReadWriteLock under the covers)
Part 1 of your question
You should use a concurrent data structure for the multi-threaded scenario, or use a synchronizer and make a defensive copy. Probably directly exposing the collections as public fields is wrong: your registry should expose thread-safe behavioral accessors to those collections. For instance, maybe you want a Registry.safeRemoveEffectBySource(String src) method. Keep the threading specifics internal to the registry, which seems to be the "owner" of this aggregate information in your design.
Since you probably don't really need List semantics, I suggest replacing these with ConcurrentHashMaps wrapped into Set using Collections.newSetFromMap().
Your draw() method could either a) use a Registry.getEffectsSnapshot() method that returns a snapshot of the set; or b) use an Iterable<Effect> Registry.getEffects() method that returns a safe iterable version (maybe just backed by the ConcurrentHashMap, which won't throw CME under any circumstances). I think (b) is preferable here, as long as the draw loop doesn't need to modify the collection. This provides a very weak synchronization guarantee between the mutator thread(s) and the draw() thread, but assuming the draw() thread runs often enough, missing an update or something probably isn't a big deal.
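A minimal sketch of what such a registry could look like. The names GameEffect and EffectRegistry and the field layout are assumptions for illustration; only safeRemoveEffectBySource, getEffects and getEffectsSnapshot come from the suggestions above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the game's Effect class.
final class GameEffect {
    final String source;
    volatile boolean applied = true;

    GameEffect(String source) { this.source = source; }

    void unapplyEffects() { applied = false; }
}

// The collection stays private; callers only see thread-safe behavior.
final class EffectRegistry {
    private final Set<GameEffect> effects =
            Collections.newSetFromMap(new ConcurrentHashMap<GameEffect, Boolean>());

    void addEffect(GameEffect effect) {
        effects.add(effect);
    }

    /** Unapplies and removes every effect with the given source. */
    void safeRemoveEffectBySource(String src) {
        for (Iterator<GameEffect> it = effects.iterator(); it.hasNext(); ) {
            GameEffect effect = it.next();
            if (effect.source.equals(src)) {
                effect.unapplyEffects();
                it.remove(); // supported by ConcurrentHashMap-backed iterators
            }
        }
    }

    /** Option (b): a read-only, weakly consistent view that never throws CME. */
    Iterable<GameEffect> getEffects() {
        return Collections.unmodifiableSet(effects);
    }

    /** Option (a): a copy that later changes to the registry do not affect. */
    List<GameEffect> getEffectsSnapshot() {
        return new ArrayList<>(effects);
    }
}
```

The draw loop can iterate getEffects() while another thread adds or removes effects. It may or may not see an update made mid-iteration, which is the weak guarantee mentioned above.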
Part 2 of your question
As another answer notes, in the single-thread case, you should just make sure you use the Iterator.remove() to remove the item, but again, you should wrap this logic inside the Registry class if at all possible. In some cases, you'll need to lock a collection, iterate over it collecting some aggregate information, and make structural modifications after the iteration completes. You ask if the remove() method just removes it from the Iterator or from the backing collection... see the API contract for Iterator.remove() which tells you it removes the object from the underlying collection. Also see this SO question.
You cannot directly remove an item from a collection while you are still iterating over it, otherwise you will get a ConcurrentModificationException.
The solution is, as you hint, to call the remove method on the Iterator instead. This will remove it from the underlying collection as well, but it will do it in such a way that the Iterator knows what's going on and so doesn't throw an exception when it finds the collection has been modified.
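For the single-threaded case, the pattern looks roughly like this. It is a sketch using plain strings instead of the Effect class from the question; IteratorRemoveDemo and removeMatching are made-up names.

```java
import java.util.Iterator;
import java.util.List;

final class IteratorRemoveDemo {
    /** Removes every occurrence of unwanted from items and returns how many were removed. */
    static int removeMatching(List<String> items, String unwanted) {
        int removed = 0;
        Iterator<String> it = items.iterator();
        while (it.hasNext()) {
            String item = it.next();
            if (item.equals(unwanted)) {
                it.remove(); // removes from the underlying list, not just the iterator
                removed++;
            }
        }
        return removed;
    }
}
```

Calling items.remove(item) inside the same loop instead of it.remove() is what triggers the ConcurrentModificationException on the next call to next(). Note that this only helps when a single thread touches the list. It does not protect against another thread adding to the list while it is being iterated.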