only have 1 active ICalculateScore.CalculateScore? - encog

I have a class that implements ICalculateScore in C# Encog 3.2. The CalculateScore method is a long-running, CPU-intensive task. I would like to make sure that during an iteration only one call to CalculateScore is active at a time.
I was hoping that setting RequireSingleThreaded to true would do the trick, but it did not.
Is there a Parallel.ForEach statement or something similar in the Encog code that could be changed to a normal non-parallel loop?
Is it possible to tell Encog to have only one active ICalculateScore.CalculateScore call?
Thanks,
Dan Hickman

How do I stop a SystemC simulation, from a CTHREAD, and terminate the simulation with a specific exit code?

I've got a SystemC testbench (for a VHDL DUT but that's irrelevant right now). I'd like to be able to cause the test to terminate, with a specific exit code, from within an SC_CTHREAD. I am pretty new to SystemC (I'm mostly a verilog/sv guy).
I've tried simply including "exit(error_code)" but while that terminates the simulation, the final exit code comes from my sc_main's return statement. I guess this makes sense since the "exit" probably terminates the separate thread with that exit code, but not the sc_main thread.
I've tried calling sc_stop before that exit, and usually get an error about calling a pure virtual method (I believe that's end_of_simulation, which I have not defined... but that should be optional, right?).
The only way I see to affect the exit code of the process is to change the return statement of sc_main, but that doesn't quite work when all my action is happening in an SC_THREAD created by the constructor of an object somewhere. (If it's not obvious, I don't understand how SystemC works very well, probably because there's a lot of hidden action in classes that we can't see, as opposed to Verilog, where simulation ordering is all there is, assuming you understand blocking/non-blocking.)
Is there a way for the SC_THREAD to pass some failure information up to sc_main?
Note: I am running with a commercial simulation tool, and not pure C++. It's possible that there's an implementation bug there. I'm trying to get our pure C++ compile running but it seems to have broken a while ago (I'm not responsible for the SystemC model).
The answer, of course, is to have sc_main get its return value by reaching directly into the module that it instantiates to get at a member variable:
return(test_module.exit_code);
I need to remember that we're still running C++ and basic C++ rules apply, so this is possible. In my hardware-design mind, I was spinning off modules and threads that had no connection to the classes that they actually are.
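A minimal sketch of that pattern, written as plain C++ so it compiles without the SystemC headers. TestModule, run_checks, sim_main and the failure code 3 are hypothetical stand-ins: in a real testbench the struct would be an SC_MODULE, run_checks would be the body of the SC_CTHREAD, and sim_main would be sc_main with sc_start() in place of the direct call.

```cpp
#include <cassert>

// Stand-in for the SC_MODULE that owns the test thread (hypothetical name).
struct TestModule {
    int exit_code = 0;  // ordinary member: the thread writes it, sc_main reads it

    // Stand-in for the SC_CTHREAD body: record the failure instead of calling exit().
    void run_checks(int observed, int expected) {
        if (observed != expected) {
            exit_code = 3;  // hypothetical failure code
            // A real thread would call sc_stop() here and then return.
        }
    }
};

// Stand-in for sc_main: the module is a normal C++ object that is in scope here.
int sim_main(int observed, int expected) {
    TestModule test_module;
    test_module.run_checks(observed, expected);  // sc_start() would go here
    return test_module.exit_code;                // becomes the process exit code
}
```

The point is only that the module outlives the simulation run, so whatever the thread stored in it is still readable when control returns to sc_main.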

Cog VM and indirect variable access

Does anyone know whether the Cog VM for Pharo and Squeak is able to optimize away simple indirect variable accesses with accessors like this:
SomeClass>>someProperty
    ^ someProperty
SomeClass>>someSecondProperty
    ^ someSecondProperty
that just return an instance variable, so that methods like this:
SomeClass>>someMethod
    ^ self someProperty doWith: self someSecondProperty
will be no slower than methods like this:
SomeClass>>someMethod
    ^ someProperty doWith: someSecondProperty
I did some benchmarks, and they do seem roughly equivalent in speed, but I'm curious if anyone familiar with Cog knows for certain, because if there is a difference (no matter how slight), then there might be situations however rare where one is inappropriate.
There's a little cost right now, but it's so small that you should not bother. If you want performance, you should be looking at changing other parts of your code, not instance variable access.
A quick bench:
bench
    ^ { [ iv yourself ] bench . [ self iv yourself ] bench }
=> #('52,400,000 per second.' '49,800,000 per second.')
The difference does not look so big.
Once jitted and executed once, the difference is that "self iv" executes an inline cache check, a CPU call and a CPU return in addition to fetching the instance variable value. The call and return instructions are most probably going to be anticipated (predicted) by the CPU and cost next to nothing. So what remains is the inline cache check, which is a very cheap operation.
What the inlining compiler in development will add is that the CPU call and return really are going to be removed by inlining, which will cover the cases where the CPU has not anticipated them. In addition, the inline cache check may or may not be removed, depending on circumstances.
There are other details: for example, the getter method needs to be compiled to native code, which takes room in the machine code zone and could increase the number of machine code zone garbage collections, but that's even more anecdotal than the inline cache check overhead.
So in short, there is a very, very, very little overhead right now, but that overhead will decrease in the future.
Clement
This is a tough question, and I don't know the exact answer. But I can help you learn how to check for yourself with a few clues.
You'll need to load the VMMaker package in an image. In Pharo, there is a procedure to build such an image by downloading everything from the net and GitHub. See https://github.com/pharo-project/pharo-vm
Then the main hint is that methods that just return an instance variable are compiled as if executing primitive 264 + inst var offset... (for example, you'll see this by inspecting Interval>>#first or any other simple inst var getter)
In the classical interpreter VM, this is handled in Interpreter>>internalExecuteNewMethod.
It seems like you pay the cost of a method lookup (some caches make this cheaper), but not of a real method activation.
I suppose that it explains that debuggers can't enter into such simple methods... This however is not a real inlining.
In Cog, the same happens in StackInterpreter>>internalQuickPrimitiveResponse if the interpreter is ever used.
As for the JIT, this is handled by Cogit>>compilePrimitive; see also the implementors of genQuickReturnInstVar. This is not proper inlining either, but you can see that very few instructions are generated. Again, I bet you generally don't pay the price of a lookup, thanks to the so-called Polymorphic Inline Cache (PIC).
For real inlining, I didn't find a clue after this quick browsing of source code...
My understanding is that it will happen image-side through callbacks from the Sista VM, but this is work in progress and only my vague recollection. Clement Bera is writing a blog about this (the Sista chronicles at http://clementbera.wordpress.com).
If you're afraid of digging in the VMMaker source code, I invite you to ask on vm-dev.lists.squeakfoundation.org. I'm pretty sure Eliot Miranda or Clement will be happy to give you a far more accurate answer.
EDIT
I forgot to tell you the conclusion of the above peregrinations: I think there will be a very small difference if you use the inst. var. directly rather than a getter, but it shouldn't be really noticeable, and in any case your programming style should NOT be guided by such negligible optimizations.

What does Objective-C program's startup routine do?

I know that information about all loaded classes is gathered at startup time. But I could not find any information on how this is done, or on how the Objective-C startup routine looks compared to a plain C program's startup routine.
I'm just wondering what Objective-C adds at this point. Is an Objective-C program a C program with some additions, or is it completely different in structure?
You should take a look at this article from Cocoa with Love: http://cocoawithlove.com/2008/03/cocoa-application-startup.html , which gives a good overview.
However, if you really want to know what's going on, it's going to be a bit of a dig, but you can look at the source to the runtime at http://opensource.apple.com/ . Look for the objc4* project inside of whichever OS you are interested in. Look to objcrt.c for the top of the initialization chain.
You asked two discrete questions in your original post: what the startup routine looks like (which is covered in the runtime) and whether an Objective-C program is "a C program with some additions". The answer to the latter is yes, it is a C program with some additions, in the same manner as C++. And, like C++, it contains some pretty significant additions to the runtime.

STM32 programming tips and questions

I could not find any good documentation on the internet about STM32 programming. STM's own documents do not explain anything beyond register functions. I would greatly appreciate it if anyone could answer the following questions.
1. I noticed that in all the example programs that STM provides, variables used by main() are always defined outside the main() function (with occasional use of the static keyword). Is there any reason for that? Should I follow a similar practice? Should I avoid using local variables inside main()?
2. I have a global variable which is updated within the clock interrupt handler. I am using the same variable inside another function as a loop condition. Don't I need to access this variable using some form of atomic read operation? How can I know that a clock interrupt does not change its value in the middle of the function's execution? Do I need to disable the clock interrupt every time I use this variable inside a function? (This seems extremely inefficient to me, as I use it as a loop condition. I believe there should be better ways of doing it.)
3. Keil automatically inserts startup code which is written in assembly (i.e. startup_stm32f4xx.s). This startup code has the following import statements:
    IMPORT SystemInit
    IMPORT __main
In C, this makes sense. However, in C++ both main and system_init have different (mangled) names (e.g. _int_main__void). How can this startup code still work in C++, even without using extern "C" (I tried and it worked)? How can the C++ linker (armcc --cpp) associate these statements with the correct functions?
You can use local or global variables. Using locals in embedded systems carries a risk of your stack colliding with your data; with globals you don't have that problem. But this is true no matter where you are: embedded microcontroller, desktop, etc.
I would make a copy of the global in the foreground task that uses it.
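volatile unsigned int myglobal; /* written by the interrupt handler */
void fun ( void )
{
    unsigned int myg;
    myg = myglobal; /* take one snapshot of the shared variable */
    /* ... use only myg from here on ... */
}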
and then use only myg for the rest of the function. Basically you are taking a snapshot and using the snapshot. You would want to do the same thing when reading a register: if you want to do multiple things based on a sample of something, take one sample and make your decisions on that one sample, otherwise the item can change between samples.
If you are using one global to communicate back and forth with the interrupt handler, I would use two variables instead: one from foreground to interrupt, the other from interrupt to foreground. Yes, there are times when you need to carefully manage a shared resource like that. Normally it has to do with cases where you need to do more than one thing: for example, if you have several items that must all change as a group before the handler can see them change, then you need to disable the interrupt handler until all the items have changed. Here again, there is nothing special about embedded microcontrollers; this is all basic stuff you would see on a desktop system with a full-blown operating system.
Keil knows what they are doing. If they support C++, then from a system level they have this worked out. I don't use Keil; I use gcc and llvm for microcontrollers like this one.
Edit:
Here is an example of what I am talking about
https://github.com/dwelch67/stm32vld/tree/master/stm32f4d/blinker05
An STM32 using timer-based interrupts: the interrupt handler modifies a variable shared with the foreground task. The foreground task takes a single snapshot of the shared variable (per loop) and, if need be, uses the snapshot more than once in the loop rather than the shared variable, which can change. This is C, not C++, I understand that, and I am using gcc and llvm, not Keil. (Note: llvm has known problems optimizing tight while loops, a very old bug; I don't know why they have no interest in fixing it. llvm works for this example.)
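The snapshot idea described above can be sketched on a host machine as follows. This is not the code from the linked repository; g_ticks, timer_isr and in_window are hypothetical names, and the interrupt handler is just an ordinary function here.

```cpp
#include <cassert>
#include <cstdint>

// Shared with the interrupt handler; volatile forces a real read on each access.
volatile std::uint32_t g_ticks = 0;

// Stand-in for the timer interrupt handler (hypothetical name).
void timer_isr() { g_ticks = g_ticks + 1; }

// Returns true while the tick count lies in [start, end).
bool in_window(std::uint32_t start, std::uint32_t end) {
    assert(start <= end);
    const std::uint32_t now = g_ticks;  // the only read of the shared variable
    return now >= start && now < end;   // both comparisons see the same value
}
```

If both comparisons read g_ticks directly, an interrupt landing between them could make the function test two different tick counts.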
Question 1: Local variables
The sample code provided by ST is not particularly efficient or elegant. It gets the job done, but sometimes there are no good reasons for the things they do.
In general, you always want your variables to have the smallest scope possible. If you only use a variable in one function, define it inside that function. Add the "static" keyword to local variables if and only if you need them to retain their value after the function is done.
In some embedded environments, like the PIC18 architecture with the C18 compiler, local variables are much more expensive (more program space, slower execution time) than global. On the Cortex M3, that is not true, so you should feel free to use local variables. Check the assembly listing and see for yourself.
Question 2: Sharing variables between interrupts and the main loop
People have written entire chapters explaining the answers to this group of questions. Whenever you share a variable between the main loop and an interrupt, you should definitely use the volatile keyword on it. Variables of 32 or fewer bits can be accessed atomically (unless they are misaligned).
If you need to access a larger variable, or two variables at the same time from the main loop, then you will have to disable the clock interrupt while you are accessing the variables. If your interrupt does not require precise timing, this will not be a problem. When you re-enable the interrupt, it will automatically fire if it needs to.
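A sketch of the larger-variable case, assuming a 64-bit millisecond counter. irq_disable and irq_enable are host-side stand-ins so the example can run on a PC; on a Cortex-M target they would be the CMSIS intrinsics __disable_irq() and __enable_irq(). The other names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Host-side stand-ins (assumption): they only record the mask state.
static int g_irq_masked = 0;
static void irq_disable() { g_irq_masked = 1; }
static void irq_enable()  { g_irq_masked = 0; }

// 64 bits takes two 32-bit loads, so an interrupt could land between them.
volatile std::uint64_t g_uptime_ms = 0;

// Called from the (simulated) clock interrupt.
void uptime_isr() { g_uptime_ms = g_uptime_ms + 1; }

// Foreground read: interrupts are masked only for the duration of the copy.
std::uint64_t read_uptime_ms() {
    irq_disable();
    const std::uint64_t snapshot = g_uptime_ms;
    irq_enable();
    assert(g_irq_masked == 0);  // interrupts are back on before returning
    return snapshot;
}
```

If the caller might already be running with interrupts disabled, saving and restoring the previous mask state is safer than enabling unconditionally.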
Question 3: main function in C++
I'm not sure. You can use arm-none-eabi-nm (or whatever nm is called in your toolchain) on your object file to see what symbol name the C++ compiler assigns to main(). I would bet that C++ compilers refrain from mangling the main function for this exact reason, but I'm not sure.
STM's sample code is not an exemplar of good coding practice, it is merely intended to exemplify use of their standard peripheral library (assuming those are the examples you are talking about). In some cases it may be that variables are declared external to main() because they are accessed from an interrupt context (shared memory). There is also perhaps a possibility that it was done that way merely to allow the variables to be watched in the debugger from any context; but that is not a reason to copy the technique. My opinion of STM's example code is that it is generally pretty poor even as example code, let alone from a software engineering point of view.
In this case your clock interrupt variable is atomic so long as it is 32 bits or less and you are not using read-modify-write semantics with multiple writers. You can safely have one writer and multiple readers regardless. This is true for this particular platform, but not necessarily universally; the answer may be different for 8 or 16 bit systems, or for multi-core systems for example. The variable should be declared volatile in any case.
I am using C++ on STM32 with Keil, and there is no problem. I am not sure why you think that the C++ entry points are different, they are not here (Keil ARM-MDK v4.22a). The start-up code calls SystemInit() which initialises the PLL and memory timing for example, then calls __main() which performs global static initialisation then calls C++ constructors for global static objects before calling main(). If in doubt, step through the code in the debugger. It is important to note that __main() is not the main() function you write for your application, it is a wrapper with different behaviour for C and C++, but which ultimately calls your main() function.
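The ordering described above (global constructors run before main() is entered) can be observed with a small host-side sketch. Peripheral, g_uart, g_timer and constructed_so_far are hypothetical names; nothing here is Keil-specific.

```cpp
#include <cassert>

static int g_constructed = 0;  // zero-initialised before any constructor runs

// Hypothetical driver class; its constructor runs during static initialisation.
struct Peripheral {
    Peripheral() { ++g_constructed; }
};

static Peripheral g_uart;   // both objects are constructed by the runtime
static Peripheral g_timer;  // start-up code before it calls main()

// Number of global objects already constructed at the time of the call.
int constructed_so_far() { return g_constructed; }
```

Calling constructed_so_far() as the first statement of main() should already report both objects, which is the work the __main() wrapper is described as doing.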

DLL Reflection?

Is something like this possible? If so, could you point me in the right direction for learning how?
applicationx tries to run the method start() in dll_one.dll
dll_one.dll runs the command
applicationx tries to run the method run() in dll_one.dll
dll_one.dll doesn't have a method run() and hasn't prepared for such an occurrence.
dll_one.dll asks dll_two.dll if it has a run()
dll_two runs run()
Basically, I want it so if dllA doesn't have a method that the application is looking for, it asks dllB. This is assuming, as well, that ApplicationX and dllB don't know anything about dllA and dllA kind of just appeared out of nowhere (I want dlls dynamically like a patch to my applications without having to rewrite ALL of the methods, properties, etc. in the dll and have everything else just routed to the old dll).
Any ideas? Keep in mind, I'm using vb.net so a .net reference is appreciated.
It seems like you're asking for a plug-in architecture for your app (except that "patch" part is bothering me). If so, you can try MEF, which solves this exact problem.
The specific thing you ask for isn't possible. You can't have a non-existent method call automatically re-routed to a different dll. You can't "run the method run() in dll_one.dll" unless you've compiled that code, and it won't compile if the method doesn't exist. You also can't compile code against dllB and then drop dllA in and have it intercept method calls. Reflection could conceivably solve part of your problem, but you'd not want to base your code around calling all methods by reflection - it'd be horrendously unperformant and not very maintainable.
As Anton suggests, a plugin approach might work. However, this would rely on you being able to specify up-front the interface for your plugin, which sounds like it would contradict your original requirement.
Another problem: if you'd not deployed dllA until later, how would your ApplicationX know to call method start() in dll_one.dll anyway? You'd surely need to re-deploy at least the base application for that part to work.
These kinds of problems are often best solved by having a more specific set of requirements to work to: what functionality are you likely to want to extend or change in the future? Could you support a common set of interfaces that allow extensibility via plugins, or do you need to redeploy encapsulated chunks of your application with new functionality? Is there UI involved or is this just to change back-end logic? Questions like this could help to suggest more viable solutions.
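To make the plugin idea concrete, here is a rough sketch of the lookup order the question describes: ask the newer provider first and fall back to the older one. It is written in C++ purely as an illustration and is not .NET or MEF code; Provider, Dispatcher and Command are hypothetical names. It also shows the caveat raised above: the application has to call through an interface agreed up-front (the dispatcher) rather than calling a DLL directly.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Command = std::function<std::string()>;

// Stand-in for one DLL: the set of named commands it chooses to export.
struct Provider {
    std::map<std::string, Command> commands;
};

// Asks each provider in turn; the first one that exports the name handles it.
class Dispatcher {
public:
    // Providers added earlier take priority, so add the patch before the original.
    void add(Provider provider) { providers_.push_back(std::move(provider)); }

    // Returns false if no provider exports `name`; otherwise stores the result in `out`.
    bool call(const std::string& name, std::string& out) const {
        for (const Provider& p : providers_) {
            auto it = p.commands.find(name);
            if (it != p.commands.end()) {
                assert(static_cast<bool>(it->second));  // exported commands must be callable
                out = it->second();
                return true;
            }
        }
        return false;
    }

private:
    std::vector<Provider> providers_;
};
```

A patch provider that exports only start() would handle that call itself, while run() would fall through to the original provider added after it.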