Possible race condition when enabling multithreading


Suppose I have a slight variant of the cloud balancing problem, in which the Process has not just one weight, but a map of (positive) weights, such as
Map<Long, Long> groupMap = new HashMap<>();
where the key is specific to my domain and the value is the weight.
On the class Computer (still referring to the cloud balancing example) I have a shadow variable hist which is also a (Hash)Map<Long, Long>, and a custom listener updating hist:
public class HistListener implements VariableListener {
    @Override
    public void beforeVariableChanged(ScoreDirector scoreDirector, Object o) {
        Process p = (Process) o;
        if (p.getComputer() != null) {
            Computer kc = p.getComputer();
            // update hist Map: subtract this process's weights from its computer's hist
            scoreDirector.beforeVariableChanged(kc, "hist");
            for (Map.Entry<Long, Long> entry : p.getGroupMap().entrySet()) {
                kc.getHist().put(entry.getKey(), kc.getHist().get(entry.getKey()) - entry.getValue());
            }
            scoreDirector.afterVariableChanged(kc, "hist");
        }
    }
and pretty much the same for afterVariableChanged just with reversed sign.
I annotate both Process and Computer as @PlanningEntity and register them in the solverConfig.
There are no constraints, so the solver should be able to assign the computers to the processes arbitrarily. As a result, I expect hist only to have natural numbers (incl. 0) as values.
When running it with <moveThreadCount>NONE</moveThreadCount>, this is indeed the case:
<"Computer"+computer.id: hist>
Computer0: {0=0, 1=0, 2=20, 3=0, 4=10, 5=20, 6=0, 7=10, 8=10, 9=20}
Computer1: {0=0, 1=10, 2=0, 3=0, 4=10, 5=0, 6=10, 7=0, 8=0, 9=0}
Computer2: {0=0, 1=0, 2=0, 3=0, 4=0, 5=0, 6=0, 7=0, 8=0, 9=0}
When running exactly the same code with <moveThreadCount>AUTO</moveThreadCount>, I partially get negative values in hist:
Computer0: {0=0, 1=-20, 2=30, 3=0, 4=-40, 5=50, 6=-10, 7=30, 8=40, 9=150}
Computer1: {0=0, 1=-40, 2=-20, 3=0, 4=-90, 5=-50, 6=-40, 7=-20, 8=-20, 9=-30}
Computer2: {0=0, 1=80, 2=-20, 3=0, 4=30, 5=-30, 6=50, 7=0, 8=-20, 9=-50}
This discrepancy disappears when I refactor the keys of groupMap on process and those of hist on computer as individual shadow variables.
The trace logs suggest a race condition, where several threads access hist simultaneously. (According to the Oracle docs, I only need a synchronizedMap implementation if the map is structurally changed, i.e., if keys are added or removed - I'm not doing that.)
The use of a Map as a shadow variable greatly enhances the flexibility of my solution, so it would be great if this were supported with multithreading. I know I could probably fix this very simple example with an appropriate ConstraintProvider, but my actual problem is much more complex and is not amenable to treatment with ConstraintProviders.
Question: Is it possible to have a Map based structure as a shadow variable in a multithreading context?
If it is not possible, I recommend adding a short note in the docs of optaplanner 8.29.0.Final (the version I'm using).
I had a look at questions regarding Lists as PlanningVariables in optaplanner, but I don't see how these questions relate to mine.

Is it possible to have a Map based structure as a shadow variable in a multithreading context?
Yes, because each move thread in a multithreading context has its own ScoreDirector and its own workingSolution internally. From the point of view of a shadow variable and that map, it's single threaded.
What can mess this up?
Bad @PlanningId values in your dataset, so the Move.rebase() operations go wrong: duplicate IDs or missing IDs. OptaPlanner detects most of these. Unlikely that this is your problem.
Incomplete planning cloning in your model. That's probably it. This will also cause issues you haven't seen yet in a single threaded context, especially when the last working solution greatly differs from the last best found solution when the termination runs out. FULL_ASSERT should detect those, but they might not occur on every run...
I said each move thread has its own workingSolution internally. That's not entirely true: each one has a planning clone of the original. If the planning clone doesn't clone all of the data affected by shadow variables, the clones share state and the data gets corrupted. In a multithreaded solving context this will cause issues much faster.
Ok, this is getting complex. How do I solve this?
Experiment with adding a @DeepPlanningClone annotation on your Map field. But making it a shadow variable already implies deep planning cloning it automatically, IIRC. My guess is that it's the keys or values in that map that need to get planning cloned too. Read the planning clone section in the docs.
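To make the "incomplete planning clone" failure mode concrete, here is a minimal plain-Java sketch. It does not use OptaPlanner at all: ComputerSketch, shallowClone() and deepClone() are hypothetical names invented for this illustration, and it only assumes that two working solutions end up pointing at the same Map instance. It is not a diagnosis of the model in the question.

```java
import java.util.HashMap;
import java.util.Map;

class CloneDemo {
    // Hypothetical stand-in for the Computer entity; not OptaPlanner code.
    static class ComputerSketch {
        final Map<Long, Long> hist;

        ComputerSketch(Map<Long, Long> hist) {
            this.hist = hist;
        }

        // Shallow clone: the copy shares the SAME map instance.
        ComputerSketch shallowClone() {
            return new ComputerSketch(hist);
        }

        // Deep clone: the copy gets its own map (enough here, because Long is immutable).
        ComputerSketch deepClone() {
            return new ComputerSketch(new HashMap<>(hist));
        }
    }

    // Mimics what the listener does in beforeVariableChanged: retract one weight.
    static void subtract(ComputerSketch c, long key, long weight) {
        c.hist.put(key, c.hist.get(key) - weight);
    }

    public static void main(String[] args) {
        Map<Long, Long> start = new HashMap<>();
        start.put(1L, 10L);
        ComputerSketch original = new ComputerSketch(start);
        ComputerSketch shared = original.shallowClone();
        subtract(original, 1L, 10L); // first "working solution" retracts the weight
        subtract(shared, 1L, 10L);   // second one retracts it again on the shared map
        System.out.println(original.hist); // {1=-10}

        Map<Long, Long> start2 = new HashMap<>();
        start2.put(1L, 10L);
        ComputerSketch original2 = new ComputerSketch(start2);
        ComputerSketch independent = original2.deepClone();
        subtract(original2, 1L, 10L);
        subtract(independent, 1L, 10L);
        System.out.println(original2.hist + " " + independent.hist); // {1=0} {1=0}
    }
}
```

With the shared map, each side sees the other's update and the value goes negative even though neither side did anything wrong on its own; with independent maps both stay at 0. If the map's keys or values were mutable objects, copying the map alone would not be enough and they would need to be cloned as well.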

Related

Is there any possibility that QAbstractItemModel::beginResetModel and endResetModel can create a performance issue?

My Dev setup:
Qt version : Qt 5.15.0
OS: Embedded Linux
I have a list of information.
Assume I have a structure called MyStruct
My model class has a member variable, a QList of the above structure, to hold data for my view. Whenever I open the view, I update the QList (note: there may or may not be a change). Updating here means assigning a new QList to the existing one. Before the assignment I call beginResetModel, and after the assignment I call endResetModel:
void MyModelClass::SomeInsertMethod(const QList<MyStruct>& aNewData)
{
    beginResetModel();
    m_lstData = aNewData;
    endResetModel();
}
One thing I believe can be improved, is putting a check, if the new data is different than the existing data and then doing the above. Something like this:
void MyModelClass::SomeInsertMethod(const QList<MyStruct>& aNewData)
{
    if (m_lstData != aNewData)
    {
        beginResetModel();
        m_lstData = aNewData;
        endResetModel();
    }
}
Apart from that, is there any possibility of a performance issue from calling beginResetModel/endResetModel? I am seeing a very small delay in the view coming up in my application.
I checked the documentation of QAbstractItemModel for the above methods, but didn't find anything specific about performance.
The other way this can be done is by comparing the elements of the lists individually and emitting a dataChanged signal with the appropriate model index and roles. But I feel this would introduce additional loops and comparisons, which may cause performance issues of their own. Correct me if I am wrong.
Is there any advantage of using dataChanged over beginResetModel/endResetModel?
Please let me know your views on the above.

Optaplanner: Access indictment map in custom phase

I am trying to implement a custom phase in order to clean up the solution provided by the CH phase. This is an overconstrained TWVRP problem with a lot of extra constraints on top, so I understand why the CH is struggling. My custom phase will just take all stops breaking a hard constraint and assign them to the dummy vehicle, thereby getting me up to a hard score of 0.
However, the scoreDirector passed to the custom phase command does not allow me to access scoreDirector.getIndictmentMap().
My phase so far:
public class CleanUpPhase implements CustomPhaseCommand<Schedule> {
    private static final Logger LOG = Logger.getLogger(CleanUpPhase.class);

    // Clean up the solution from the construction heuristic phase.
    @Override
    public void changeWorkingSolution(ScoreDirector<Schedule> scoreDirector) {
        ConstraintStreamScoreDirector constraintStreamScoreDirector = (ConstraintStreamScoreDirector<Schedule, HardMediumSoftScore>) scoreDirector;
        constraintStreamScoreDirector.getIndictmentMap();
    }
}
I tried to trick Optaplanner into giving me access to the indictment map with a cast but no luck:
java.lang.IllegalStateException: When constraintMatchEnabled (false) is disabled in the constructor, this method should not be called.
Is there a way to easily locate the entities breaking the hard constraints some other way, or can I instruct the CH phase to assign offending entities to the dummy vehicle through configuration somehow? All I need is a feasible solution when entering the local search phase.
UPDATE:
It seems that if I implement my own phase completely, I get access to a score director which can give me the indictment map. However, I get stuck on
java.lang.IllegalArgumentException: Unknown PhaseConfig type: (org.acme.CleanUpPhaseConfig).
at org.optaplanner.core.impl.phase.PhaseFactory.create(PhaseFactory.java:52)
How can I get the phase factory to recognize my newly created phase?

If you use a CustomPhase, there is no need for a new Config. The existing CustomPhaseConfig accepts the CustomPhaseCommand implementation as part of its configuration.
Please refer to this section of the documentation.
However, the CustomPhase might not be a solution to your problem. You may run into the same issue in Local Search after your CustomPhase unless you make sure your constraints take the dummy vehicle into account. There is a chapter about overconstrained planning, that describes two approaches: either making the planning variable nullable or using virtual values, as is your dummy vehicle. If you follow the chapter, you can avoid the CustomPhase.

Efficiently make a view of (or copy) a subset of a large HashMap in Kotlin

I am trying to create a sub-HashMap from a huge HashMap without copying the original one.
Currently I use this:
val map = hashMapOf<Job, Int>()
val copy = HashMap(map)
listToRemoveFromCopy.forEach { copy.remove(it) }
This costs me around 50% of my current algorithm's runtime, because Java calculates the hash of the Job really often.
I only want the map minus the listToRemoveFromCopy in a new variable, without removing the listToRemoveFromCopy elements from the original map.
Does anyone know how to do this?
Thanks for the help.

First, you need to cache the hashcode for Job because any approach you use will be inefficient if you cannot have a set or a map of Job objects that operate at top speed.
Hopefully, the parts that make it a hashcode are immutable otherwise it should not be used as a key. It is very dangerous to mutate a key hashcode/equals while in use in a map or set. You should cache it on the first call to hashCode() so that you do not incur a cost until then unless you are sure you will always need it.
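As a rough sketch of what caching the hashcode can look like (written in Java, which Kotlin interoperates with), the following assumes a hypothetical immutable Job with a name and a priority; those fields, and the demo-only computations counter, are not from the question. It uses the same lazy single-field pattern as java.lang.String, where 0 means "not computed yet".

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class HashCacheDemo {
    // Hypothetical immutable key class; the fields are assumptions for illustration.
    static final class Job {
        private final String name;
        private final long priority;
        private int hash;     // 0 means "not computed yet", as in java.lang.String
        int computations;     // demo-only counter of real hash calculations

        Job(String name, long priority) {
            this.name = name;
            this.priority = priority;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Job)) return false;
            Job other = (Job) o;
            return priority == other.priority && name.equals(other.name);
        }

        @Override
        public int hashCode() {
            int h = hash;
            if (h == 0) { // first call (or a hash that really is 0, which just recomputes)
                h = Objects.hash(name, priority);
                hash = h;
                computations++;
            }
            return h;
        }
    }

    public static void main(String[] args) {
        Job job = new Job("render", 3L);
        Set<Job> jobs = new HashSet<>();
        jobs.add(job);
        for (int i = 0; i < 1000; i++) {
            jobs.contains(job); // each lookup calls hashCode(), but only the first computes it
        }
        System.out.println(job.computations); // 1
    }
}
```

This only works because the fields that feed the hash are final; if they could change after the object is used as a key, the cached value would become stale.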
Then change listToRemoveFromCopy to be a Set so it can be efficiently used in many ways. You need to do the prior step before this.
Now you have multiple options. The most efficient is:
Guava has a utility function Maps.filterKeys which returns a view into a map, and you can create a predicate that works against a Set of the items to remove.
val removeKeys = listToRemoveFromCopy.toSet()
val mapView = Maps.filterKeys(map, Predicates.not(Predicates.in(removeKeys)))
But be sure to note some methods on the view are not very efficient. If you avoid those methods, this will be the top performing option:
Many of the filtered map's methods, such as size(), iterate across every key/value mapping in the underlying map and determine which satisfy the filter. When a live view is not needed, it may be faster to copy the filtered map and use the copy.
If you need to make a copy instead, you have a few approaches:
Use filterKeys on the map to create a new map in one shot. This is good if the remove list might be a larger percentage of the total keys.
val removeKeys = listToRemoveFromCopy.toSet()
val newMap = map.filterKeys { it !in removeKeys }
Another tempting option you should be careful about is the minus - operator which copies the full map and then removes the items. It can use the listToRemoveFromCopy as-is without it being a set, but the full map copy might undo the benefit. So do not do this unless the remove list is a small percentage of keys.
val newMapButSlower = map - listToRemoveFromCopy
You could pick one model over the other depending on the ratio between map size and remove list size, find a breaking point that works for your "huge".
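For readers on the Java side, the two copying strategies can be sketched roughly as follows. The class and method names (SubMapDemo, filterCopy, copyThenRemove) are made up for this illustration; the first mirrors the idea behind filterKeys with a Set, the second mirrors the copy-then-remove approach from the question.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SubMapDemo {
    // One pass over the source: only the entries that survive are ever inserted.
    static <K, V> Map<K, V> filterCopy(Map<K, V> source, Set<K> removeKeys) {
        Map<K, V> result = new HashMap<>();
        for (Map.Entry<K, V> e : source.entrySet()) {
            if (!removeKeys.contains(e.getKey())) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }

    // Full copy first, then one removal per key: cheap only if removeKeys is small.
    static <K, V> Map<K, V> copyThenRemove(Map<K, V> source, Collection<K> removeKeys) {
        Map<K, V> result = new HashMap<>(source);
        for (K key : removeKeys) {
            result.remove(key);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        List<String> toRemove = List.of("b");

        Map<String, Integer> filtered = filterCopy(map, new HashSet<>(toRemove));
        Map<String, Integer> copied = copyThenRemove(map, toRemove);

        System.out.println(filtered.equals(copied)); // true
        System.out.println(map.size());              // 3 (the original is untouched)
    }
}
```

Both produce the same map; which one is faster depends on the ratio discussed above, so it is worth measuring with your own data.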
Implementing your own view into the map to avoid a copy is possible, but not trivial (and by that I mean very complex). Every method you override has to do the correct thing at all times (including the map's own hashCode and equals), and other views would have to be created around the key set and values. The entrySet would be nasty to get right. I'd look for a pre-written solution before attempting your own (the Guava one above or other). This zero-copy model would be the most efficient solution but the most code and is what I would do in the same case if "huge" meant significant processing time. There is a lot that you can get wrong with this approach if you misunderstand any part of the implementation contract.
You could wrap the Guava solution with one that maintains the size attribute as items are manipulated and therefore be efficient for that case. You can also write a more efficient solution if you know the original map is read-only. For ideas, check out the Guava implementation of FilteredKeyMap and its ancestor AbstractFilteredMap.
In summary, caching your hashcode is likely going to give you the biggest result for the effort. Start there. You'll need to do it even for the Guava approach.

In addition to Axel's direct answer:
Could calculating the hashcode of a Job be optimised?  If the calculation can't be sped up, could it cache the result?  (There's ample precedent for this, including java.lang.String.)  Or if the class isn't under your control, could you create a delegate/wrapper that overrides the hashcode calculation?

You can use the filterKeys function. It will iterate over the map only once:
val copy = map.filterKeys { it !in listToRemoveFromCopy }

Minecraft bukkit scheduler and procedural instance naming

This question is probably pretty obvious to any person who knows how to use Bukkit properly, and I'm sorry if I missed a solution in the others, but this is really kicking my ass and I don't know what else to do, the tutorials have been utterly useless. There are really 2 things that I need help doing:
I need to learn how to create an indefinite number of instances of an object. I figure it'd be like this:
int num = 0;
public void create(){
String name = chocolate + num;
Thingy name = new Thingy();
}
So you see what I'm saying? I need to basically change the name that is given to each new instance so that it doesn't overwrite the last one when created. I swear I've looked everywhere, I've asked my Java professor and I can't get any answers.
2: I need to learn how to use the stupid scheduler, and I can't understand anything so far. Basically, when an event is detected, 2 things are called: one method which activates instantly, and one which needs to be given a 5 second delay, then called. The code is like this:
public onEvent(event e){
Thingy thing = new Thingy();
thing.method1();
thing.doOnDelay(method2(), 100 ticks);
}
Once again, I apologize if I am not giving too many specifics, but I cannot FOR THE LIFE OF ME find anything about the Bukkit event scheduler that I can understand.
DO NOT leave me links to the Bukkit official tutorials, I cannot understand them at all and it'll be a waste of an answer. I need somebody who can help me, I am a starting plugin writer.
I've had Programming I and II with focus in Java, so many basic things I know, I just need Bukkit-specific help for the second one.
The first one has had me confused since I started programming.

Ok, so for the first question I think you want to use a data structure. Depending on what you're doing, there are different data structures to use. A data structure is simply a container that you can use to store many instances of a type of object. The data structures that are available to you are:
HashMap
HashSet
TreeMap
List
ArrayList
Vector
There are more, but these are the big ones. HashMap and TreeMap implement the Map interface, while HashSet implements Set (it is backed by a HashMap internally); the hash-based ones are notable for their speedy operations. To use the HashMap, you instantiate it with HashMap<KeyThing, ValueThing> thing = new HashMap<KeyThing, ValueThing>(); then you add elements to it with thing.put(key, value). Then when you want to get a value out of it, you just use thing.get(key). HashMaps use an algorithm that's super fast to get the values, but a consequence of this is that the HashMap doesn't store its entries in any particular order. Therefore when you loop through it with a for loop, the entries come back in no predictable order (not truly random, because it depends on the hash codes). Also, it's important to note that you can only have one of each individual key. If you put in a key that already exists in the map, it will overwrite the value for that key.
The HashSet is like a HashMap but without storing values to go with it. It's a pretty good container if all you need to use it for is to determine if an object is inside it.
The TreeMap is one of the few maps that keeps its entries in a particular order: it sorts them by key. If the keys don't already have a natural order (that is, they don't implement Comparable), you have to provide a Comparator (something that tells whether one object is less than another) so that it knows the order to put them in.
List and ArrayList are not maps. Their elements are stored by index. List is the interface, and ArrayList is the most common class that implements it. Each element can be retrieved with arrayListThing.get(index), and the ArrayList can change size (unlike a plain array, where you have to specify the number of elements up front and the size never changes). You add elements to an ArrayList with arrayListThing.add(thing).
The Vector is a lot like an ArrayList and functions about the same. The main difference is that Vector's methods are synchronized, so ArrayList is usually the better choice unless several threads share the list.
At any rate, you can use these data structures to store a lot of objects by making a loop. Here's an example with a Vector.
Vector<Thing> thing = new Vector<Thing>();
int numberofthings = 100;
for (int i = 0; i < numberofthings; i++) {
    thing.add(new Thing());
}
That will give you a vector full of things which you can then iterate through with
for (Thing elem : thing) {
    elem.doStuff(); // doStuff() stands for whatever method your Thing has
}
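If what you are really after is giving each instance its own name, as in the chocolate + num attempt from the question, a Map keyed by a generated String is probably the closest fit. The following is only a sketch: ThingyRegistry and its methods are made-up names, and Thingy is an empty placeholder for your own class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ThingyRegistry {
    // Placeholder for the asker's own class.
    static class Thingy { }

    private final Map<String, Thingy> thingies = new LinkedHashMap<>();
    private int num = 0;

    // Creates a new Thingy and stores it under a generated name such as "chocolate0".
    String create() {
        String name = "chocolate" + num++;
        thingies.put(name, new Thingy());
        return name;
    }

    // Looks an instance up again by its generated name (null if there is none).
    Thingy get(String name) {
        return thingies.get(name);
    }

    int size() {
        return thingies.size();
    }

    public static void main(String[] args) {
        ThingyRegistry registry = new ThingyRegistry();
        String first = registry.create();
        String second = registry.create();
        System.out.println(first + ", " + second);                        // chocolate0, chocolate1
        System.out.println(registry.get(first) != registry.get(second)); // true
    }
}
```

The variable name in the source code stays the same every time; it is the key in the map that changes, so no instance overwrites the previous one.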
Ok, now for the second problem. You are correct that you need to use the Bukkit Scheduler. Here is how:
Make a class that extends BukkitRunnable
public class RunnableThing extends BukkitRunnable {
    @Override
    public void run() {
        // what you want to do. You have to make this method.
    }
}
Then, when you want to execute it, make a new BukkitTask object using your RunnableThing:
BukkitTask example = new RunnableThing().runTaskLater(plugin, ticks);
You have to do some math to figure out how many ticks you want. 20 ticks = 1 second. Other than that I think that covers all your questions.

Stateful objects, properties and parameter-less methods in favour of stateless objects, parameters and return values

I find this class definition a bit odd:
http://www.extremeoptimization.com/Documentation/Reference/Extreme.Mathematics.LinearAlgebra.SingleLeastSquaresSolver_Members.aspx
The Solve method does have a return value but would not need to because the result is also available in the Solution property.
This is what I see as traditional code:
var sqrt2 = Math.Sqrt(2);
This would be an alternative in the same spirit as the solver in the link:
var sqrtCalculator = new SqrtCalculator();
sqrtCalculator.Parameter = 2;
sqrtCalculator.Run();
var sqrt2 = sqrtCalculator.Result;
What are the pros and cons besides the second version being a bit "untraditional"?
Yes, the compiler won't help the user who forgot to assign some property (parameter) BUT this is the case with all components that contain writeable properties and don't have mandatory values in the constructor.
Yes, threading will not work, BUT each thread can create its own solver.
Yes, the garbage collector won't be able to dispose the solver's result, BUT if the entire solver is disposed it will.
Yes, compilers and processors have special treatment of parameters and return values which makes them fast, BUT the time for parameter handling is mostly negligible.
And so on. Other ideas?

Well, after a year I found a clear flaw with this "introvert" approach. I am using an existing filter object which should operate on a measurement object but rather operates on itself in a "it's all me and nothing else"-fashion described above. Now the customer wants a recalculation of a measurement object a few minutes after the first calculation, and meanwhile the filter has processed other measurement objects. If it had been stateless and stored its data in the measurement object, it would have been an easy matter to implement a Recalculate method. The only way to solve the problem with an introvert filter is to let a filter instance be a part of the measurement object. Then filters need to be instantiated for every new measurement object. And since filters are a part of a chain the entire chain needs to be recreated. Well, there is some merit to being stateless.
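The difference can be sketched in a few lines of Java. Everything here is hypothetical: the class names, the Measurement fields and the placeholder "filter" (multiplying by 0.5) are invented for illustration and have nothing to do with the library linked in the question.

```java
class FilterDemo {
    // Hypothetical measurement object that can carry its own filtered value.
    static class Measurement {
        final double raw;
        double filtered;

        Measurement(double raw) {
            this.raw = raw;
        }
    }

    // "Introvert" style: input and output live inside the filter itself.
    static class StatefulFilter {
        double parameter;
        double result;

        void run() {
            result = parameter * 0.5; // placeholder calculation
        }
    }

    // Stateless style: everything comes in as a parameter, the result goes on the measurement.
    static class StatelessFilter {
        static void calculate(Measurement m) {
            m.filtered = m.raw * 0.5; // same placeholder calculation
        }
    }

    public static void main(String[] args) {
        StatefulFilter filter = new StatefulFilter();
        filter.parameter = 10.0;
        filter.run();                      // first measurement
        filter.parameter = 4.0;
        filter.run();                      // a later measurement overwrites the state
        System.out.println(filter.result); // 2.0 (the first result is gone)

        Measurement a = new Measurement(10.0);
        Measurement b = new Measurement(4.0);
        StatelessFilter.calculate(a);
        StatelessFilter.calculate(b);
        StatelessFilter.calculate(a);      // recalculating a later needs nothing special
        System.out.println(a.filtered + " " + b.filtered); // 5.0 2.0
    }
}
```

In the stateless variant a Recalculate method is just another call with the same measurement, whereas the stateful filter would need one instance per measurement to keep the results apart.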
