Perl 6: automated testing of terminal-based programs - testing

How could I automate interactions with command line programs that expose a text terminal interface with Perl 6 for testing purposes?

If you want to use Perl 6 to automate execution or testing of console applications, I think you're going to use NativeCall to interact with the expect library. Once expect is installed, man libexpect will show its API documentation, though the way of accessing the documentation (such as the manpage name) may differ per package distribution.
Expect has APIs to launch a program, wait for text to appear on the (emulated) console (to "expect" text), and send text to the console (to emulate typing). The most common use case is to automate programs which require password input. Expect is often scripted--it is an interpreter--but there's no reason not to use it from a higher level programming language.
Edit: I somewhat answered the wrong question. The OP is interested in testing Perl 6 modules with Perl 6. That said, using expect to launch a second Perl 6 interpreter which uses the module is still the strongest, most strict way to test the application. You don't need to know what type of terminal library the module uses, because expect should be compatible with nearly all of them. You can send text to the STDIN pipe of a subprocess, but that's not as strong as the subprocess (console) communication you can get from expect. I don't know if there's a way to hijack whichever terminal library the module uses and communicate with it directly.

If it's just a plain interface, you could just run the program and collect output. The currently-experimental Testo module has is-run routine. You could use that directly, or if experimental status is bothersome, copy the guts of it into your own helper routine.

Take a look at Sparrow6 Task Check Language - Perl6 based DSL to verify text output. I've done a lot terminal apps testing using it.

Related

Key binding to interactively execute commands from Python interpreter history in order?

I sometimes test Python modules as I develop them by running a Python interactive prompt in a terminal, importing my new module and testing out the functionality. Of course, since my code is in development there are bugs, and frequent restarts of the interpreter are required. This isn't too painful when I've only executed a couple of interpreter lines before restarting: my key sequence when the interpreter restart looks like Up Up Enter Up Up Enter... but extrapolate it to 5 or more statements to be repeated and it gets seriously painful!
Of course I could put my test code into a script which I execute with python -i, but this is such a scratch activity that it doesn't seem quite "above threshold" for opening a text editor :) What I'm really pining for is the Ctrl-r behaviour from the bash shell: executing a sequence of 10 commands in sequence in bash involves finding the command in history (repeated Up or Ctrl-r for a search -- both work in the Python interpreter shell) and then just pressing Ctrl-o ten times. One of my favourite bash shell features.
The problem is that while lots of other readline binding functionality like Ctrl-a, Ctrl-e, Ctrl-r, and Ctrl-s work in the Python interpreter, Ctrl-o does not. I've not been able to find any references to this online, although perhaps the readline module can be used to add this functionality to the python prompt. Any suggestions?
Edit: Yes, I know that using the interactive interpreter is not a development methodology that scales beyond a few lines! But it is convenient for small tests, and IMO the interactiveness can help to work out whether a developing API is natural and convenient, or too heavy. So please confine the answers to the technical question of whether readline history-stepping can be made to work in python, rather than the side-opinion of whether one should or shouldn't choose to (sometimes) work this way!
Edit: Since posting I realised that I am already using the readline module to make some Python interpreter history functions work. But the Ctrl-o binding to the operate-and-get-next readline command doesn't seem to be supported, even if I put readline.parse_and_bind("Control-o: operate-and-get-next") in my PYTHONSTARTUP file.
I often test Python modules as I develop them by running a Python interactive prompt in a terminal, importing my new module and testing out the functionality.
Stop using this pattern and start writing your test code in a file and your life will be much easier.
No matter what, running that file will be less trouble.
If you make the checks automatic rather than reading the results, it will be quicker and less error-prone to check your code.
You can save that file when you're done and run it whenever you change your code or environment.
You can perform metrics on the tests, like making sure you don't have parts of your code you didn't test.
Are you familiar with the unittest module?
Answering my own question, after some discussion on the python-ideas list: despite contradictory information in some readline documentation it seems that the operate-and-get-next function is in fact defined as a bash extension to readline, not by core readline.
So that's why Ctrl-o neither behaves as hoped by default when importing the readline module in a Python interpreter session, nor when attempting to manually force this binding: the function doesn't exist in the readline library to be bound.
A Google search reveals https://bugs.launchpad.net/ipython/+bug/382638, on which the GNU readline maintainer gives reasons for not adding this functionality to core readline and says that it should be implemented by the calling application. He also says "its implementation is not complicated", although it's not obvious to me how (or whether it's even possible) to do this as a pure Python extension to the readline module behaviour.
So no, this is not possible at the moment, unless the operate-and-get-next function from bash is explicitly implemented in the Python readline module or in the interpreter itself.
This isn't exactly an answer to your question, but if that is your development style you might want to look at DreamPie. It is a GUI wrapper for the Python terminal that provides various handy shortcuts. One of these is the ability to drag-select across the interpreter display and copy only the code (not the output). You can then paste this code in and run it again. I find this handy for the type of workflow you describe.
Your best bet will be to check that project : http://ipython.org
This is an example with a history search with Ctrl+R :
EDIT
If you are running debian or derivated :
sudo apt-get install ipython

Is it possible to execute shell commands within a mac application?

Basically I'm wondering if I can compile code that a user inputs in a mac app (I'm trying to make an OCaml text editor that compiles your code) using executables that are already available in the user's system, such as ocamlc etc. I don't have any code to show or anything because I'm still figuring out if/how I could build this mac app. Not really sure what other info I should include, so just ask. Thanks!
You can use either Sys.command "<your shell command>" or Unix.open_process* and Unix.create_process commands. See man Sys and man Unix for more information.
In Objective-C, C, and C++, and a multitude of other languages, use system(3). Also see:
exec(3)
popen(3)
If you are using Objective-C, check out NSTask.
If not, look at popen. popen gives your parent process control over the I/O streams.

How would you effectively test command line software, with many switches and arguments

A command line utility/software could potentially consist of many different switches and arguments.
Lets say your software is called CLI and lets say CLI has the following features:
The general syntax of CLI is:
CLI <data structures> <operation> <required arguments> [optional arguments]
<data structures> could be 'matrix', 'complex numbers', 'int', 'floating point', 'log'
<operation> could be 'add', 'subtract', 'multiply', 'divide'
I cant think of any required and optional arguments, but lets say your software does support it
Now you want to test this software. And you wish to test interface itself, not the logic. Essentially the interface must return the correct success codes and error codes.
Essentially a lot of real word software still present a Command Line interface with several options. I am curious if there is any formal testing methodology established for this. One idea i had was to construct a grammar (like EBNF) and describing the 'language' of the interface. But I fail to push this idea ahead. What good is a grammar for in this case? How does it enable the generation of many many combinations .
I am curious to learn more about any theoretical models which could be applied to such a problem or if anyone in here has actually done such testing with satisfying coverage
There is a command-line tool as part of a product i maintain, and i have a situation thats very similar to what you describe. What i did was employ a unit testing framework, and encode each combination of arguments as a test method.
The program is implemented in c#/.NET, so i use microsoft's testing framework that's builtin to Visual Studio, but the approach would work with any unit testing framework.
Each test invokes a utility function that starts the process and sends in the input and cole ts the output. Then, each test is responsible for verifying that the output from the CLI matches what was expected. In some cases, there's a family of test cases that can be performed by a single test method, wih a for loop in it. The logic needs to run the CLI and check the output for each iteration.
The set of tests i have does not cover every permutation of arguments, but it covers the 80% cases and i can add new tests if there are ever any defects.
Using a recursive grammar to generate switches is an interesting idea. If you where to try this then you would need to first write the grammar in such a way that all switches could be used, and then do a random walk of the grammar.
This provides an easy method of randomly walking a grammar and outputting the result.

expect replacement

I want to work with a modem interfaced on a serial port on an embedded platform.
Here are some solutions I have rejected so far :
Expect plus a terminal program :
My (cross)build system does not have any package rules for expect, and according to the installation instructions from the expect sources, the configure script needs to be interactive because it does some test with the terminale it is invoked in. Thid does not look like something you want to do when cross compiling.
Python plus pyserial :
I would love to use this, but the size of the whole thing won't fit on my limited flash space.
Chat (from the pppd package):
Well, I may give it a try but it is very, very limited
So I am looking for some sort of lightweight, embeddable expect replacement. I have no knwoledge of lua. Would it be a good candidate for expect like scipting ?
Well, Expect is just Tcl plus extensions to drive other programs via pseudo-terminals and do pattern-matching on the results. If you just want to drive a serial port you can drop the external terminal program and have Tcl drive the serial port directly - see sample code. See also the Tcl Wiki page on cross-compiling.

How would one go about testing an interpreter or a compiler?

I've been experimenting with creating an interpreter for Brainfuck, and while quite simple to make and get up and running, part of me wants to be able to run tests against it. I can't seem to fathom how many tests one might have to write to test all the possible instruction combinations to ensure that the implementation is proper.
Obviously, with Brainfuck, the instruction set is small, but I can't help but think that as more instructions are added, your test code would grow exponentially. More so than your typical tests at any rate.
Now, I'm about as newbie as you can get in terms of writing compilers and interpreters, so my assumptions could very well be way off base.
Basically, where do you even begin with testing on something like this?
Testing a compiler is a little different from testing some other kinds of apps, because it's OK for the compiler to produce different assembly-code versions of a program as long as they all do the right thing. However, if you're just testing an interpreter, it's pretty much the same as any other text-based application. Here is a Unix-centric view:
You will want to build up a regression test suite. Each test should have
Source code you will interpret, say test001.bf
Standard input to the program you will interpret, say test001.0
What you expect the interpreter to produce on standard output, say test001.1
What you expect the interpreter to produce on standard error, say test001.2 (you care about standard error because you want to test your interpreter's error messages)
You will need a "run test" script that does something like the following
function fail {
echo "Unexpected differences on $1:"
diff $2 $3
exit 1
}
for testname
do
tmp1=$(tempfile)
tmp2=$(tempfile)
brainfuck $testname.bf < $testname.0 > $tmp1 2> $tmp2
[ cmp -s $testname.1 $tmp1 ] || fail "stdout" $testname.1 $tmp1
[ cmp -s $testname.2 $tmp2 ] || fail "stderr" $testname.2 $tmp2
done
You will find it helpful to have a "create test" script that does something like
brainfuck $testname.bf < $testname.0 > $testname.1 2> $testname.2
You run this only when you're totally confident that the interpreter works for that case.
You keep your test suite under source control.
It's convenient to embellish your test script so you can leave out files that are expected to be empty.
Any time anything changes, you re-run all the tests. You probably also re-run them all nightly via a cron job.
Finally, you want to add enough tests to get good test coverage of your compiler's source code. The quality of coverage tools varies widely, but GNU Gcov is an adequate coverage tool.
Good luck with your interpreter! If you want to see a lovingly crafted but not very well documented testing infrastructure, go look at the test2 directory for the Quick C-- compiler.
I don't think there's anything 'special' about testing a compiler; in a sense it's almost easier than testing some programs, since a compiler has such a basic high-level summary - you hand in source, it gives you back (possibly) compiled code and (possibly) a set of diagnostic messages.
Like any complex software entity, there will be many code paths, but since it's all very data-oriented (text in, text and bytes out) it's straightforward to author tests.
I’ve written an article on compiler testing, the original conclusion of which (slightly toned down for publication) was: It’s morally wrong to reinvent the wheel. Unless you already know all about the preexisting solutions and have a very good reason for ignoring them, you should start by looking at the tools that already exist. The easiest place to start is Gnu C Torture, but bear in mind that it’s based on Deja Gnu, which has, shall we say, issues. (It took me six attempts even to get the maintainer to allow a critical bug report about the Hello World example onto the mailing list.)
I’ll immodestly suggest that you look at the following as a starting place for tools to investigate:
Software: Practice and Experience April 2007. (Payware, not available to the general public---free preprint at http://pobox.com/~flash/Practical_Testing_of_C99.pdf.
http://en.wikipedia.org/wiki/Compiler_correctness#Testing (Largely written by me.)
Compiler testing bibliography (Please let me know of any updates I’ve missed.)
In the case of brainfuck, I think testing it should be done with brainfuck scripts. I would test the following, though:
1: Are all the cells initialized to 0
2: What happens when you decrement the data pointer when it's currently pointing to the first cell? Does it wrap? Does it point to invalid memory?
3: What happens when you increment the data pointer when it's pointing at the last cell? Does it wrap? Does it point to invalid memory
4: Does output function correctly
5: Does input function correctly
6: Does the [ ] stuff work correctly
7: What happens when you increment a byte more than 255 times, does it wrap to 0 properly, or is it incorrectly treated as an integer or other value.
More tests are possible too, but this is probably where i'd start. I wrote a BF compiler a few years ago, and that had a few extra tests. Particularly I tested the [ ] stuff heavily, by having a lot of code inside the block, since an early version of my code generator had issues there (on x86 using a jxx I had issues when the block produced more than 128 bytes or so of code, resulting in invalid x86 asm).
You can test with some already written apps.
The secret is to:
Separate the concerns
Observe the law of Demeter
Inject your dependencies
Well, software that is hard to test is a sign that the developer wrote it like it's 1985. Sorry to say that, but utilizing the three principles I presented here, even line numbered BASIC would be unit testable (it IS possible to inject dependencies into BASIC, because you can do "goto variable".