Automating CoreWars - automation

Automating CoreWars - automation

I plan to make an evolutionary program to make redcode warriors for CoreWars. However, I have absolutely no idea how to run the code generated without having to manually open the program and put in the Warriors. Since I hope to have the evolutionary program run through several warriors per minute at least, I'd rather not have to play the role of administrator that much. I'm using the ARES simulator, but as for my research on running warriors through it with a script, I haven't found anything.
I'm really just looking for something like:
SomeSimulator.exe --warrior1 megalordthedestroyer.red --warrior2 tinathebabybunny.red

So I found a python module that supports both parsing redcode and running pmars simulations. With a little bit of patching I'm able to hook this up to my evolutionary factory and automate it all.
The link: https://github.com/rodrigosetti/corewar

Related

Show me your ID(E)!

I often work on very small pieces of code, on the order of max 100 lines, especially in scenarios when I learn something new and just play with the code, or when I debug.
Because I frequently change code and want to see how that changes the contents of my variables and output, it is tedious to either
1) hit the debug button, wait for the debugger to start (in my case I use PyCharm as an IDE) and then inspect the output
or
2) insert some prints for the variables that I want to observe and compile the code (slightly faster than starting the debugger).
To eliminate this time consuming workflow, where I constantly hit the compile or debug button every few seconds, is there an IDE where I can set a watch to a few variables and then each time I change in my source code a single character (or, alternatively, every half a second) the IDE automatically compiles everything and I will see then new values of my variables?
(Of course while I intermediatelychange the code the compilation will give errors, but that is ok. This feature would be a big time saver. Maybe PyCharm has it already implemented? If not, ideally I would hope for a language agnostic IDE, similar to PyCharm, where variants for Java etc. also exist. If not, since I code in Python, a Python IDE would also be great.)

This might not be exactly what you are looking for but PyCharm (and IntelliJ and probably others) can run tests automatically when code changes.
In the PyCharm Run toolbar look for "Toggle auto-test" button.
For example in PyCharm you can create test cases that just runs the code you're interested in and prints the variables you need.
Then create a run configuration that runs only those tests and set it to run automatically.
For more details see PyCharm documentation on rerunning tests.

The Scala plugin for IntelliJ has exactly what you need in the form of "worksheets," where every expression is automatically recompiled when its value or the value of anything it references is changed.
Since (based on your usage of PyCharm), I assume you're using Python primarily, I think Jupyter notebook is your best bet. Jupyter is language agnostic but began as specific to python (it was called IPython notebook for this reason).
I have not tried it, but this guide purports to show to get Jupyter to work with PyCharm
EDIT: Here is another possibility called vim worksheet; I haven't tried it, but it purports to do the same thing as Scala worksheets, but in vim, and for a number of languages, including Python.

The python Spyder IDE (comes with Anaconda) has this feature. When you hit run, you can see all of the variables at the top right of the screen and you can click on them to see their values (this is very helpful with Numpy Arrays too!).

If your interest is in the actual workflow improvement:
I used to program like you, looking at what my variables changed to, and design or debug my code based on such modifications, however is way to inefficient and costly to set what variables to watch over and over again and besides when if it bugs, you have to go all over again for the debugging process.
I changed my design process to better my workflow and adopted Test Driven Development (TDD), with it you can look at tools for you specific implementations or IDEs but the principles and workflow stay with you, with it you stop looking on how the variables changed and instead focus on what the functions should do, meaning faster iteration (with real time tools for testing), easier debugging and far more better, safe refactoring.
My favorite tool for it is Cucumber, and agnostic tool (for IDE or programming language) which help you test your code scenarios and at the same time documenting your features.
Hope it helps, i know its a very opinionated answer but it's an honest advices for improvement in ones workflow.

You should try Thonny. It is developed by Institute of Computer Science of University of Tartu.
The 4 features which might be of help to you are below (verbatim from the website):
No-hassle variables.
Once you're done with hello-worlds, select View → Variables and see how your programs and shell commands affect Python variables.
Simple debugger.
Just press Ctrl+F5 instead of F5 and you can run your programs step-by-step, no breakpoints needed. Press F6 for a big step and F7 for a small step. Steps follow program structure, not just code lines.
Stepping through statements
Step through expression evaluation. If you use small steps, then you can even see how Python evaluates your expressions. You can think of this light-blue box as a piece of paper where Python replaces subexpressions with their values, piece-by-piece.
Visualization of expression evaluation
Faithful representation of function calls.
Stepping into a function call opens a new window with separate local variables table and code pointer. Good understanding of how function calls work is especially important for understanding recursion.

Writing a MP in zimpl to be solved with scip

This might be a quite basic question, however, I did not find any suggestions so far.
I am running the Scip Opt Suite on OSX and everything runs well so far. No I wanted to start to model my first mathematical problem in zimpl, however I do not know how to start.
However, in the user's guide there is just prescribed how to load existing zpl-files, but not how to create on files.
Do you have any suggestions or any further threads dealing with that task?
Kind Regards

In the zimpl package root, there's an example directory. The .zpl files in example are a great starting point for writing your own zimpl inputs. Also, zimpl's author wrote this pdf that walks through the process of writing a formulation to solve a Sudoku puzzle using zimpl.

Key binding to interactively execute commands from Python interpreter history in order?

I sometimes test Python modules as I develop them by running a Python interactive prompt in a terminal, importing my new module and testing out the functionality. Of course, since my code is in development there are bugs, and frequent restarts of the interpreter are required. This isn't too painful when I've only executed a couple of interpreter lines before restarting: my key sequence when the interpreter restart looks like Up Up Enter Up Up Enter... but extrapolate it to 5 or more statements to be repeated and it gets seriously painful!
Of course I could put my test code into a script which I execute with python -i, but this is such a scratch activity that it doesn't seem quite "above threshold" for opening a text editor :) What I'm really pining for is the Ctrl-r behaviour from the bash shell: executing a sequence of 10 commands in sequence in bash involves finding the command in history (repeated Up or Ctrl-r for a search -- both work in the Python interpreter shell) and then just pressing Ctrl-o ten times. One of my favourite bash shell features.
The problem is that while lots of other readline binding functionality like Ctrl-a, Ctrl-e, Ctrl-r, and Ctrl-s work in the Python interpreter, Ctrl-o does not. I've not been able to find any references to this online, although perhaps the readline module can be used to add this functionality to the python prompt. Any suggestions?
Edit: Yes, I know that using the interactive interpreter is not a development methodology that scales beyond a few lines! But it is convenient for small tests, and IMO the interactiveness can help to work out whether a developing API is natural and convenient, or too heavy. So please confine the answers to the technical question of whether readline history-stepping can be made to work in python, rather than the side-opinion of whether one should or shouldn't choose to (sometimes) work this way!
Edit: Since posting I realised that I am already using the readline module to make some Python interpreter history functions work. But the Ctrl-o binding to the operate-and-get-next readline command doesn't seem to be supported, even if I put readline.parse_and_bind("Control-o: operate-and-get-next") in my PYTHONSTARTUP file.

I often test Python modules as I develop them by running a Python interactive prompt in a terminal, importing my new module and testing out the functionality.
Stop using this pattern and start writing your test code in a file and your life will be much easier.
No matter what, running that file will be less trouble.
If you make the checks automatic rather than reading the results, it will be quicker and less error-prone to check your code.
You can save that file when you're done and run it whenever you change your code or environment.
You can perform metrics on the tests, like making sure you don't have parts of your code you didn't test.
Are you familiar with the unittest module?

Answering my own question, after some discussion on the python-ideas list: despite contradictory information in some readline documentation it seems that the operate-and-get-next function is in fact defined as a bash extension to readline, not by core readline.
So that's why Ctrl-o neither behaves as hoped by default when importing the readline module in a Python interpreter session, nor when attempting to manually force this binding: the function doesn't exist in the readline library to be bound.
A Google search reveals https://bugs.launchpad.net/ipython/+bug/382638, on which the GNU readline maintainer gives reasons for not adding this functionality to core readline and says that it should be implemented by the calling application. He also says "its implementation is not complicated", although it's not obvious to me how (or whether it's even possible) to do this as a pure Python extension to the readline module behaviour.
So no, this is not possible at the moment, unless the operate-and-get-next function from bash is explicitly implemented in the Python readline module or in the interpreter itself.

This isn't exactly an answer to your question, but if that is your development style you might want to look at DreamPie. It is a GUI wrapper for the Python terminal that provides various handy shortcuts. One of these is the ability to drag-select across the interpreter display and copy only the code (not the output). You can then paste this code in and run it again. I find this handy for the type of workflow you describe.

Your best bet will be to check that project : http://ipython.org
This is an example with a history search with Ctrl+R :
EDIT
If you are running debian or derivated :
sudo apt-get install ipython

Has anyone worked with TestCocoon?

I was trying out TestCocoon the other day, and everything seemed great. I compiled my code using cscl,cslib and cslink and I was expecting this to take care of all the instrumentation. I get some .csmes files and .exe.csmes files, but when I load them into the CoverageBrowser I cannot see anything relevant. No covered/uncovered lines. All the lines are grey.
Is anything else needed in order for TestCocoon to report coverage? Do I need to modify my source files? I also posted on their forums here, but no result:
http://www.testcocoon.org/forum/viewtopic.php?f=8&t=44

I tried this tool with few projects using Visual Studio 2008, and I found:
Pros:
- it can collect results from multiple runs, you can run your software at different machines and collect results together
- it has useful GUI for browsing results
- you can merge coverage from many modules and anlyse it as whole application
- forum works, I submited two problems and got implemented fixtures in few days
- it works almost without any problems (I found two minor compilation problems) with quite complicated sources, with tons of templates, boost::spirit parsers, other boost stuff (including meta-programming modules etc.), STL, Qt (everything together)
- well documented
- it's free
Cons:
- instrumentation is definitely slow
- multi-process single project compilation using Visual Studio 2008 doesn't work, only one file at a time is compiled which makes building slower (you will get better performance building whole solution with many projects)
At this moment I didn't try to use this tool for continuous coverage measurement.
Either way, in my opinion it's worth to try.
BTW, Tony, PC-Lint is static-analysis tool, isn't it? interesting idea to compare it with dynamic-analysis tool...

TestCocoon (now at 1.6.7) works well with the small C code bases we tend to unit test. The performance impact seems about normal for other instrumentation methods we've used.
We are able to extract coverage information in our makefiles and the coverage browser is very useful.

Dont use testcocoon, I am currently using it, and its shoddy as hell. Pay for something better (it will cost alot). It is the ultimate death sentence, seriously, don't do it. Whatever you do, stay away from testcocoon at all costs. Worst move ever. You might as well sell your kids for drug money.

How would one go about testing an interpreter or a compiler?

I've been experimenting with creating an interpreter for Brainfuck, and while quite simple to make and get up and running, part of me wants to be able to run tests against it. I can't seem to fathom how many tests one might have to write to test all the possible instruction combinations to ensure that the implementation is proper.
Obviously, with Brainfuck, the instruction set is small, but I can't help but think that as more instructions are added, your test code would grow exponentially. More so than your typical tests at any rate.
Now, I'm about as newbie as you can get in terms of writing compilers and interpreters, so my assumptions could very well be way off base.
Basically, where do you even begin with testing on something like this?

Testing a compiler is a little different from testing some other kinds of apps, because it's OK for the compiler to produce different assembly-code versions of a program as long as they all do the right thing. However, if you're just testing an interpreter, it's pretty much the same as any other text-based application. Here is a Unix-centric view:
You will want to build up a regression test suite. Each test should have
Source code you will interpret, say test001.bf
Standard input to the program you will interpret, say test001.0
What you expect the interpreter to produce on standard output, say test001.1
What you expect the interpreter to produce on standard error, say test001.2 (you care about standard error because you want to test your interpreter's error messages)
You will need a "run test" script that does something like the following
function fail {
echo "Unexpected differences on $1:"
diff $2 $3
exit 1
}
for testname
do
tmp1=$(tempfile)
tmp2=$(tempfile)
brainfuck $testname.bf < $testname.0 > $tmp1 2> $tmp2
[ cmp -s $testname.1 $tmp1 ] || fail "stdout" $testname.1 $tmp1
[ cmp -s $testname.2 $tmp2 ] || fail "stderr" $testname.2 $tmp2
done
You will find it helpful to have a "create test" script that does something like
brainfuck $testname.bf < $testname.0 > $testname.1 2> $testname.2
You run this only when you're totally confident that the interpreter works for that case.
You keep your test suite under source control.
It's convenient to embellish your test script so you can leave out files that are expected to be empty.
Any time anything changes, you re-run all the tests. You probably also re-run them all nightly via a cron job.
Finally, you want to add enough tests to get good test coverage of your compiler's source code. The quality of coverage tools varies widely, but GNU Gcov is an adequate coverage tool.
Good luck with your interpreter! If you want to see a lovingly crafted but not very well documented testing infrastructure, go look at the test2 directory for the Quick C-- compiler.

I don't think there's anything 'special' about testing a compiler; in a sense it's almost easier than testing some programs, since a compiler has such a basic high-level summary - you hand in source, it gives you back (possibly) compiled code and (possibly) a set of diagnostic messages.
Like any complex software entity, there will be many code paths, but since it's all very data-oriented (text in, text and bytes out) it's straightforward to author tests.

I’ve written an article on compiler testing, the original conclusion of which (slightly toned down for publication) was: It’s morally wrong to reinvent the wheel. Unless you already know all about the preexisting solutions and have a very good reason for ignoring them, you should start by looking at the tools that already exist. The easiest place to start is Gnu C Torture, but bear in mind that it’s based on Deja Gnu, which has, shall we say, issues. (It took me six attempts even to get the maintainer to allow a critical bug report about the Hello World example onto the mailing list.)
I’ll immodestly suggest that you look at the following as a starting place for tools to investigate:
Software: Practice and Experience April 2007. (Payware, not available to the general public---free preprint at http://pobox.com/~flash/Practical_Testing_of_C99.pdf.
http://en.wikipedia.org/wiki/Compiler_correctness#Testing (Largely written by me.)
Compiler testing bibliography (Please let me know of any updates I’ve missed.)

In the case of brainfuck, I think testing it should be done with brainfuck scripts. I would test the following, though:
1: Are all the cells initialized to 0
2: What happens when you decrement the data pointer when it's currently pointing to the first cell? Does it wrap? Does it point to invalid memory?
3: What happens when you increment the data pointer when it's pointing at the last cell? Does it wrap? Does it point to invalid memory
4: Does output function correctly
5: Does input function correctly
6: Does the [ ] stuff work correctly
7: What happens when you increment a byte more than 255 times, does it wrap to 0 properly, or is it incorrectly treated as an integer or other value.
More tests are possible too, but this is probably where i'd start. I wrote a BF compiler a few years ago, and that had a few extra tests. Particularly I tested the [ ] stuff heavily, by having a lot of code inside the block, since an early version of my code generator had issues there (on x86 using a jxx I had issues when the block produced more than 128 bytes or so of code, resulting in invalid x86 asm).

You can test with some already written apps.

The secret is to:
Separate the concerns
Observe the law of Demeter
Inject your dependencies
Well, software that is hard to test is a sign that the developer wrote it like it's 1985. Sorry to say that, but utilizing the three principles I presented here, even line numbered BASIC would be unit testable (it IS possible to inject dependencies into BASIC, because you can do "goto variable".

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas