GPARs async functions and passing references that are being updated by another thread - gpars

I am using GPARs asynchronous functions to fire off a process as each line in a file is parsed.
I am seeing some strange behavior that makes me wonder if I have an issue with thread safety.
Let's say I have a current object that is being loaded up with values from the current row in an input spreadsheet, like so:
Uploader {
MyRowObject currentRowObject
}
Once it has all the values from the current row, I fire off an async closure that looks a bit like this:
Closure processCurrentRowObject = { ->
myService.processCurrentRowObject (currentRowObject)
}.asyncFun()
It is defined in the same class, so it has access to the currentRowObject.
While that is off and running, I parse the next row, and start by creating a new object:
MyObject currentObject = new MyObject()
and start loading it up with values.
I assumed that this would be safe, that the asynchronous function would be pointing to the previous object. However, I wonder if because I am letting the closure bind to the reference, if somehow the reference is getting updated in the async function, and I am pulling the object instance out from under it, so to speak - changing it while it's trying to work on the previous instance.
If so, any suggestions for fixing? Or am I safe?
Thanks!

I'm not sure I fully understand your case, however, here's a quick tip.
Since it is always dangerous to share a single mutable object among threads, I'd recommend to completely separate the row objects used for different rows:
final localRowObject = currentRowObject
currentRowObject = null
Closure processCurrentRowObject = { ->
myService.processCurrentRowObject (localRowObject)
}.asyncFun()

Related

Kotlin - java.util.ConcurrentModificationException: null

Guys how can I fix the error java.util.ConcurrentModificationException: null
experiments.forEach {
if(NAME_VARIANT == it.variantName) {
for (i in (0..Math.min(result.size - 1, Constants.MAX_METHODS_APPLIED))) {
if (response.paymentMethods[i].scoring.rules!!.none { it.ruleName == NAME_RULE}) {
response.appliedExperiments.clear()
}
}
}
}
This exception ConcurrentModificationException is thrown when you're trying to modify a collection at the same time as iterating over it.
In the piece of code you provided, you're iterating over experiments, and you're modifying response.appliedExperiments (by calling clear() on it). If those 2 variables actually point to the same collection, calling clear() is expected to throw.
In order to do what you want, you probably want those lists to start off as copies of each other, but still be different. When you create response.appliedExperiments, make sure it's a new list and not the same list.
EDIT: in the code you provided, you are passing experiments directly to the constructor of SortingServiceResponse, and I'm guessing this constructor uses the list as-is as the appliedExperiments property of reponse. Instead, you should pass a copy of the list, for instance using toMutableList():
val response = SortingServiceResponse(experiments.toMutableList(), result)
An even better approach would be to use read-only List instead of MutableList for experiments to avoid making this kind of mistakes in the first place. Only use mutable collections when you really need to. Most of the time, you can use operators (like filter or map) that create new read-only lists instead of working with mutable lists directly.

whats the best way of referencing a class in object oriented programming?

A good code is what every Programmer wants to write , optimized, robust, good regarding performance, re-usable, etc. I am doing programming for quite a long time in object oriented programming. I have seen many different codes in which different developers used different referencing mechanisms.
some of the developers used
Classname c = new Classname();
c.method();
c.method2();
etc...
while some developers have used following strategy
(new ClassName()).method();
(new ClassName()).method2();
I want to know whats best in both of them whats the actual difference between both of them ?
The second example doesn't make sense. It suggests that the ClassName is stateless, so you can make the methods static. Even if they need to use some constructor parameters (not present in your example though), then why would you instantiate an object twice, if you can do it once? In microscale, this is slower than creating the object only once. Also it doesn't help garbage collectors in memory-managed environments. Although, of course, in most scenarios these two issues are negligible.
The only case I can imagine for the second example is a bootstrapping scenario, when in your main partition you set up the whole application, and start it up:
main()
{
(new Application(...)).Run();
}
There will be only one application object, and only one method needs to be called so it doesn't really matter to retain a handle to its instance. Another example, would be starting up some custom thread-class to perform some background operations:
{
(new BackgroundWorkerThread(...)).Start();
}
I've never seen a sane example of calling two instance methods from the same class, in a way you've presented.
Classname c = new Classname();
c.method();
c.method2();
Is best if you are going to reuse your object, it creates the object once before calling the methods, so it's better than
(new ClassName()).method();
(new ClassName()).method2();
Which is basically creating a new object each time you call a method.
Consider this:
(new ClassName()).getName(); // Returns default value John Doe
(new ClassName()).setName('Steve');
(new ClassName()).getName(); // Returns default value John Doe
Classname c = new Classname();
c.getName(); // Returns default value John Doe
c.setName('Steve');
c.getName(); // Returns Steve

Stateful objects, properties and parameter-less methods in favour of stateless objects, parameters and return values

I find this class definition a bit odd:
http://www.extremeoptimization.com/Documentation/Reference/Extreme.Mathematics.LinearAlgebra.SingleLeastSquaresSolver_Members.aspx
The Solve method does have a return value but would not need to because the result is also available in the Solution property.
This is what I see as traditional code:
var sqrt2 = Math.Sqrt(2)
This would be an alternative in the same spirit as the solver in the link:
var sqrtCalculator = new SqrtCalculator();
sqrtCalculator.Parameter = 2;
sqrtCalculator.Run();
var sqrt2 = sqrtCalculator.Result;
What are the pros and cons besides the second version being a bit "untraditional"?
Yes, the compiler won't help the user who forgot to assign some property (parameter) BUT this is the case with all components that contain writeable properties and don't have mandatory values in the constructor.
Yes, threading will not work, BUT each thread can create its own solver.
Yes, the garbage collector won't be able to dispose the solver's result, BUT if the entire solver is disposed it will.
Yes, compilers and processors have special treatment of parameters and return values which makes them fast, BUT the time for parameter handling is mostly neglectable.
And so on. Other ideas?
Well, after a year I found a clear flaw with this "introvert" approach. I am using an existing filter object which should operate on a measurement object but rather operates on itself in a "it's all me and nothing else"-fashion described above. Now the customer wants a recalculation of a measurement object a few minutes after the first calculation, and meanwhile the filter has processed other measurement objects. If it had been stateless and stored its data in the measurement object, it would have been an easy matter to implement a Recalculate method. The only way to solve the problem with an introvert filter is to let a filter instance be a part of the measurement object. Then filters need to be instantiated for every new measurement object. And since filters are a part of a chain the entire chain needs to be recreated. Well, there is some merit to being stateless.

Another ConcurrentModificationException question

I've searched StackOverflow and there are many ConcurrentModificationException questions. After reading them, I'm still confused. I'm getting a lot of these exceptions. I'm using a "Registry" setup to keep track of Objects:
public class Registry {
public static ArrayList<Messages> messages = new ArrayList<Messages>();
public static ArrayList<Effect> effects = new ArrayList<Effect>();
public static ArrayList<Projectile> proj = new ArrayList<Projectile>();
/** Clears all arrays */
public static void recycle(){
messages.clear();
effects.clear();
proj.clear();
}
}
I'm adding and removing objects to these lists by accessing the ArrayLists like this: Registry.effects.add(obj) and Registry.effects.remove(obj)
I managed to get around some errors by using a retry loop:
//somewhere in my game..
boolean retry = true;
while (retry){
try {
removeEffectsWithSource("CHARGE");
retry = false;
}
catch (ConcurrentModificationException c){}
}
private void removeEffectsWithSource(String src) throws ConcurrentModificationException {
ListIterator<Effect> it = Registry.effects.listIterator();
while ( it.hasNext() ){
Effect f = it.next();
if ( f.Source.equals(src) ) {
f.unapplyEffects();
Registry.effects.remove(f);
}
}
}
But in other cases this is not practical. I keep getting ConcurrentModificationExceptions in my drawProjectiles() method, even though it doesn't modify anything. I suppose the culprit is if I touched the screen, which creates a new Projectile object and adds it to Registry.proj while the draw method is still iterating.
I can't very well do a retry loop with the draw method, or it will re-draw some of the objects. So now I'm forced to find a new solution.. Is there a more stable way of accomplishing what I'm doing?
Oh and part 2 of my question: Many people suggest using ListIterators (as I have been using), but I don't understand.. if I call ListIterator.remove() does it remove that object from the ArrayList it's iterating through, or just remove it from the Iterator itself?
Top line, three recommendations:
Don't do the "wrap an exception in a loop" thing. Exceptions are for exceptional conditions, not control flow. (Effective Java #57 or Exceptions and Control Flow or Example of "using exceptions for control flow")
If you're going to use a Registry object, expose thread-safe behavioral, not accessor methods on that object and contain the concurrency reasoning within that single class. Your life will get better. No exposing collections in public fields. (ew, and why are those fields static?)
To solve the actual concurrency issues, do one of the following:
Use synchronized collections (potential performance hit)
Use concurrent collections (sometimes complicated logic, but probably efficient)
Use snapshots (probably with synchronized or a ReadWriteLock under the covers)
Part 1 of your question
You should use a concurrent data structure for the multi-threaded scenario, or use a synchronizer and make a defensive copy. Probably directly exposing the collections as public fields is wrong: your registry should expose thread-safe behavioral accessors to those collections. For instance, maybe you want a Registry.safeRemoveEffectBySource(String src) method. Keep the threading specifics internal to the registry, which seems to be the "owner" of this aggregate information in your design.
Since you probably don't really need List semantics, I suggest replacing these with ConcurrentHashMaps wrapped into Set using Collections.newSetFromMap().
Your draw() method could either a) use a Registry.getEffectsSnapshot() method that returns a snapshot of the set; or b) use an Iterable<Effect> Registry.getEffects() method that returns a safe iterable version (maybe just backed by the ConcurrentHashMap, which won't throw CME under any circumstances). I think (b) is preferable here, as long as the draw loop doesn't need to modify the collection. This provides a very weak synchronization guarantee between the mutator thread(s) and the draw() thread, but assuming the draw() thread runs often enough, missing an update or something probably isn't a big deal.
Part 2 of your question
As another answer notes, in the single-thread case, you should just make sure you use the Iterator.remove() to remove the item, but again, you should wrap this logic inside the Registry class if at all possible. In some cases, you'll need to lock a collection, iterate over it collecting some aggregate information, and make structural modifications after the iteration completes. You ask if the remove() method just removes it from the Iterator or from the backing collection... see the API contract for Iterator.remove() which tells you it removes the object from the underlying collection. Also see this SO question.
You cannot directly remove an item from a collection while you are still iterating over it, otherwise you will get a ConcurrentModificationException.
The solution is, as you hint, to call the remove method on the Iterator instead. This will remove it from the underlying collection as well, but it will do it in such a way that the Iterator knows what's going on and so doesn't throw an exception when it finds the collection has been modified.

God object - decrease coupling to a 'master' object

I have an object called Parameters that gets tossed from method to method down and up the call tree, across package boundaries. It has about fifty state variables. Each method might use one or two variables to control its output.
I think this is a bad idea, beacuse I can't easily see what a method needs to function, or even what might happen if with a certain combination of parameters for module Y which is totally unrelated to my current module.
What are some good techniques for decreasing coupling to this god object, or ideally eliminating it ?
public void ExporterExcelParFonds(ParametresExecution parametres)
{
ApplicationExcel appExcel = null;
LogTool.Instance.ExceptionSoulevee = false;
bool inclureReferences = parametres.inclureReferences;
bool inclureBornes = parametres.inclureBornes;
DateTime dateDebut = parametres.date;
DateTime dateFin = parametres.dateFin;
try
{
LogTool.Instance.AfficherMessage(Variables.msg_GenerationRapportPortefeuilleReference);
bool fichiersPreparesAvecSucces = PreparerFichiers(parametres, Sections.exportExcelParFonds);
if (!fichiersPreparesAvecSucces)
{
parametres.afficherRapportApresGeneration = false;
LogTool.Instance.ExceptionSoulevee = true;
}
else
{
The caller would do :
PortefeuillesReference pr = new PortefeuillesReference();
pr.ExporterExcelParFonds(parametres);
First, at the risk of stating the obvious: pass the parameters which are used by the methods, rather than the god object.
This, however, might lead to some methods needing huge amounts of parameters because they call other methods, which call other methods in turn, etcetera. That was probably the inspiration for putting everything in a god object. I'll give a simplified example of such a method with too many parameters; you'll have to imagine that "too many" == 3 here :-)
public void PrintFilteredReport(
Data data, FilterCriteria criteria, ReportFormat format)
{
var filteredData = Filter(data, criteria);
PrintReport(filteredData, format);
}
So the question is, how can we reduce the amount of parameters without resorting to a god object? The answer is to get rid of procedural programming and make good use of object oriented design. Objects can use each other without needing to know the parameters that were used to initialize their collaborators:
// dataFilter service object only needs to know the criteria
var dataFilter = new DataFilter(criteria);
// report printer service object only needs to know the format
var reportPrinter = new ReportPrinter(format);
// filteredReportPrinter service object is initialized with a
// dataFilter and a reportPrinter service, but it doesn't need
// to know which parameters those are using to do their job
var filteredReportPrinter = new FilteredReportPrinter(dataFilter, reportPrinter);
Now the FilteredReportPrinter.Print method can be implemented with only one parameter:
public void Print(data)
{
var filteredData = this.dataFilter.Filter(data);
this.reportPrinter.Print(filteredData);
}
Incidentally, this sort of separation of concerns and dependency injection is good for more than just eliminating parameters. If you access collaborator objects through interfaces, then that makes your class
very flexible: you can set up FilteredReportPrinter with any filter/printer implementation you can imagine
very testable: you can pass in mock collaborators with canned responses and verify that they were used correctly in a unit test
If all your methods are using the same Parameters class then maybe it should be a member variable of a class with the relevant methods in it, then you can pass Parameters into the constructor of this class, assign it to a member variable and all your methods can use it with having to pass it as a parameter.
A good way to start refactoring this god class is by splitting it up into smaller pieces. Find groups of properties that are related and break them out into their own class.
You can then revisit the methods that depend on Parameters and see if you can replace it with one of the smaller classes you created.
Hard to give a good solution without some code samples and real world situations.
It sounds like you are not applying object-oriented (OO) principles in your design. Since you mention the word "object" I presume you are working within some sort of OO paradigm. I recommend you convert your "call tree" into objects that model the problem you are solving. A "god object" is definitely something to avoid. I think you may be missing something fundamental... If you post some code examples I may be able to answer in more detail.
Query each client for their required parameters and inject them?
Example: each "object" that requires "parameters" is a "Client". Each "Client" exposes an interface through which a "Configuration Agent" queries the Client for its required parameters. The Configuration Agent then "injects" the parameters (and only those required by a Client).
For the parameters that dictate behavior, one can instantiate an object that exhibits the configured behavior. Then client classes simply use the instantiated object - neither the client nor the service have to know what the value of the parameter is. For instance for a parameter that tells where to read data from, have the FlatFileReader, XMLFileReader and DatabaseReader all inherit the same base class (or implement the same interface). Instantiate one of them base on the value of the parameter, then clients of the reader class just ask for data to the instantiated reader object without knowing if the data come from a file or from the DB.
To start you can break your big ParametresExecution class into several classes, one per package, which only hold the parameters for the package.
Another direction could be to pass the ParametresExecution object at construction time. You won't have to pass it around at every function call.
(I am assuming this is within a Java or .NET environment) Convert the class into a singleton. Add a static method called "getInstance()" or something similar to call to get the name-value bundle (and stop "tramping" it around -- see Ch. 10 of "Code Complete" book).
Now the hard part. Presumably, this is within a web app or some other non batch/single-thread environment. So, to get access to the right instance when the object is not really a true singleton, you have to hide selection logic inside of the static accessor.
In java, you can set up a "thread local" reference, and initialize it when each request or sub-task starts. Then, code the accessor in terms of that thread-local. I don't know if something analogous exists in .NET, but you can always fake it with a Dictionary (Hash, Map) which uses the current thread instance as the key.
It's a start... (there's always decomposition of the blob itself, but I built a framework that has a very similar semi-global value-store within it)