Best way to test command line tools? [closed]

We have a large collection of command-line utilities that we write ourselves and use frequently. At the moment, testing them is very cumbersome, and consequently we don't do as much testing as we ought to.
I am wondering if anyone can suggest good techniques or tools for doing a good job of this kind of testing.
This is UNIX.

I recommend structuring your command-line tool's code so that the command-line utility is a thin client to a library of functions and/or classes.
Rather than simply using std::cout to print output, have the library's functions take an ostream reference that defaults to std::cout. When you are testing, pass in a std::stringstream to collect the output.
Finally, simply compare your utility's output with expected results using your favorite unit testing framework.
(I apologize for the C++ specific example... I'm sure there are ways to do similar things in other languages too).
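A minimal sketch of that pattern (the greet function and its expected output are hypothetical):

#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Library function: writes to any ostream, defaulting to std::cout.
void greet(const std::string& name, std::ostream& out = std::cout) {
    out << "Hello, " << name << "!\n";
}

int main() {
    // The utility calls greet("world") and prints to the console as usual.
    // A test passes a stringstream instead and inspects what was written:
    std::stringstream captured;
    greet("world", captured);
    assert(captured.str() == "Hello, world!\n");
    return 0;
}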

You can write tests that resemble an interactive shell session using Cram. It has a flexible test-specification format that allows you to match output using Perl-style regular expressions or shell-like wildcards. Cram will replay the commands from the test, compare the output to the reference, and report any differences.
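For illustration, a Cram test file is essentially an indented shell transcript; a hypothetical mytool might be tested like this, where (re) marks a regular-expression match and (glob) a wildcard match:

  $ mytool --version
  mytool \d+\.\d+\.\d+ (re)
  $ mytool greet world
  Hello, * (glob)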

Aruba is a Cucumber extension for testing command line applications written in any programming language.
To use it, you will need Ruby to run the tests, but the purpose of Aruba is to provide a library of pre-defined step definitions so that you won't need to write any Ruby code to get a workable test suite. (Though at some point you will probably want to write a bit of Ruby to create a few custom steps.)
You can see a sophisticated example of a command line tool tested with Aruba here: jingweno/gh
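For a taste, a feature file that uses only Aruba's pre-defined steps might look like this (mytool and its output are hypothetical):

Feature: Greeting command
  Scenario: Print a greeting
    When I run `mytool greet world`
    Then the exit status should be 0
    And the output should contain "Hello, world"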

You should be able to call them from a shell script (a batch file, on MS operating systems), redirect the output to a file, then scan the file programmatically to ensure that it has the correct output. I'm not aware of a testing framework that automates this for you, but it should be fairly straightforward to set up yourself.
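A minimal sketch of that approach, assuming a hypothetical mytool and a hand-verified golden file expected.out:

#!/bin/sh
# Run the tool, capture everything it prints, and diff against the golden file.
./mytool --frobnicate input.txt > actual.out 2>&1
if diff -u expected.out actual.out
then
    echo "PASS"
else
    echo "FAIL: output differs from expected.out" >&2
    exit 1
fi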

Bats (Bash Automated Testing System) by Sam Stephenson. It is tiny, written purely in shell, and has a nice set of features.
The previously suggested Aruba looks interesting, but in some cases it might be quite an overkill in terms of dependencies (Ruby, Cucumber).
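A minimal Bats test file might look like this (mytool and its usage message are hypothetical; run, $status, and $output are provided by Bats):

#!/usr/bin/env bats

@test "prints usage when invoked without arguments" {
  run ./mytool
  [ "$status" -ne 0 ]
  [[ "$output" == *"usage:"* ]]
}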

I did a little bit of this (a loooong time ago, hehe) using Expect to check that what happened was what I, umm, expected.
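For reference, a small Expect script that drives a hypothetical interactive prompt looks like this:

#!/usr/bin/expect -f
# Start the program, answer its prompt, and check the reply.
spawn ./mytool --interactive
expect "Enter name: "
send "world\r"
expect {
    "Hello, world" { puts "PASS" }
    timeout        { puts "FAIL: no greeting seen"; exit 1 }
}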

I have developed a tool "Exactly"
https://github.com/emilkarlen/exactly
It executes the thing to test in a temporary sandbox directory.
The README contains a number of examples.
A test of a hypothetical program "classify-files-by-moving-to-appropriate-dir" can look like this:
[setup]
dir input
dir output/good
dir output/bad
file input/a.txt = <<EOF
GOOD contents
EOF
file input/b.txt = <<EOF
bad contents
EOF
[act]
classify-files-by-moving-to-appropriate-dir GOOD input/ output/
[assert]
dir-contents input empty
exists output/good/a.txt : type file
dir-contents output/good num-files == 1
exists output/bad/b.txt : type file
dir-contents output/bad num-files == 1

You can do this from a batch file or Windows Script Host.
But I suggest using a task scheduler like WinCron (http://www.splinterware.com/products/wincron.htm) or other free/professional software.
There you can easily copy/paste the command-line parameters you want to vary when you need to test your software across hundreds of runs.
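If you stay with a plain batch file, a loop over prepared input files covers the simple cases; a sketch, with mytool.exe and the cases directory being hypothetical:

@echo off
rem Run the tool once per prepared parameter file, appending output to one log.
for %%F in (cases\*.txt) do (
    mytool.exe --input %%F >> results.log 2>&1
)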

You could use Perl with the Test::More library, which provides a great framework for testing CLIs.
Though primarily designed for unit testing, you can extend it to test user workflows.
Some of the methods:
# Various ways to say "ok"
ok($got eq $expected, $test_name);
is($got, $expected, $test_name);
isnt($got, $expected, $test_name);
# Rather than print STDERR "# here's what went wrong\n"
diag("here's what went wrong");
like ($got, qr/expected/, $test_name);
unlike($got, qr/expected/, $test_name);
cmp_ok($got, '==', $expected, $test_name);
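Applied to a CLI rather than a unit, a test can capture a command's output with backticks and assert on it; a small sketch with a hypothetical mytool:

use strict;
use warnings;
use Test::More tests => 2;

# Capture the tool's combined output, then check exit status and contents.
my $output = `mytool --version 2>&1`;
is($? >> 8, 0, 'mytool exits successfully');
like($output, qr/\d+\.\d+/, 'output contains a version number');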

Related

Perl 6: automated testing of terminal-based programs

How could I automate interactions with command line programs that expose a text terminal interface with Perl 6 for testing purposes?
If you want to use Perl 6 to automate execution or testing of console applications, I think you're going to use NativeCall to interact with the expect library. Once expect is installed, man libexpect will show its API documentation, though the way of accessing the documentation (such as the manpage name) may differ per package distribution.
Expect has APIs to launch a program, wait for text to appear on the (emulated) console (to "expect" text), and send text to the console (to emulate typing). The most common use case is to automate programs which require password input. Expect is often scripted--it is an interpreter--but there's no reason not to use it from a higher level programming language.
Edit: I somewhat answered the wrong question. The OP is interested in testing Perl 6 modules with Perl 6. That said, using expect to launch a second Perl 6 interpreter which uses the module is still the strongest, most strict way to test the application. You don't need to know what type of terminal library the module uses, because expect should be compatible with nearly all of them. You can send text to the STDIN pipe of a subprocess, but that's not as strong as the subprocess (console) communication you can get from expect. I don't know if there's a way to hijack whichever terminal library the module uses and communicate with it directly.
If it's just a plain interface, you could just run the program and collect output. The currently experimental Testo module has an is-run routine. You could use that directly, or, if the experimental status is bothersome, copy the guts of it into your own helper routine.
Take a look at Sparrow6 Task Check Language, a Perl 6-based DSL to verify text output. I've done a lot of terminal-app testing using it.

Testing tool: is there an alternative to expect?

Our current testing is a simple in-house tool which runs the named target, logs its output, and compares it with the expected output. The expected output and the real output are both text files.
This has an obvious downside: a change that merely shifts line numbers, with no functional effect, is regarded as a failure.
We are thinking of using a tool like expect for this, but we would like to know if there are other alternatives. Googling does not return any immediate answers.
Our platform is Linux; the users of this tool do not need to write testing code. Currently they just provide a plain text file of the expected result, and we should not ask them to write code or fill in some complex format.
Thanks for your input.

Existing solutions to test an NSIS script

Does anyone know of an existing solution to help write tests for an NSIS script?
The motivation is the benefit of knowing whether modifying an existing installation script breaks it or has undesired side effects.
Unfortunately, I think the answer to your question depends at least partially on what you need to verify.
If all you are worried about is that the installation copies the right file(s) to the right places, sets the correct registry information, and so on, then almost any unit testing tool would probably meet your needs. I'd probably use something like RSpec 2 or Cucumber, because I am somewhat familiar with Ruby and like the fact that the tests would be an xcopy deployment if they needed to be run on another machine. I also like the idea of using a BDD-based solution, because a domain-specific language that is very close to readable text means that others can more easily understand, and if necessary modify, the test specification.
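For instance, a file check of that sort could be an ordinary RSpec spec; a sketch with hypothetical paths, run with the rspec command:

# Hypothetical post-install check: did the installer put files where expected?
RSpec.describe 'MyApp installation' do
  it 'copies the main executable into place' do
    expect(File).to exist('C:/Program Files/MyApp/myapp.exe')
  end
end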
If, however, you are concerned about the user experience (what progress messages are shown, etc.), then I'm not sure the tests you would need could be expressed as easily... or at least not without a certain level of pain.
Good Luck! Don't forget to let other people here know when/if you find a solution you like.
Check out Pavonis.
With Pavonis you can compile your NSIS script and get the output of any errors and warnings.
Another solution would be AutoIt.
You can compile your installer using Jenkins and the NSIS command-line compiler, set up an AutoIt test script, and have Jenkins run the test.
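The CI step itself can be a couple of command lines; a sketch assuming the NSIS and AutoIt tools are on PATH and the script names are hypothetical:

rem Compile the installer, then drive it with an AutoIt test script.
makensis /V2 installer.nsi
AutoIt3.exe tests\install_test.au3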

Can I write IO statements inside a DLL?

This is a newbie question. Can I write statements like printf or open a file inside a DLL?
Opening a file is certainly possible in all cases.
However, using printf() depends on whether the executable calling your DLL is a console program or not. If it's a GUI program, then there is nowhere for the printf() output to go, so it will not appear. If it's a console program, you'll see the output on the console.
Your question and its title are asking two different questions. But the answer to the question body is yes -- libraries can certainly use those functions.
printf might not do anything though, depending on whether standard output has been closed by the program using the library.
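A tiny sketch of both points, using Windows/MSVC syntax (names are hypothetical):

#include <stdio.h>

/* printf writes to the host process's stdout: visible when the host is a
   console program, silently lost when it is a GUI program with no console. */
__declspec(dllexport) void log_message(const char *msg)
{
    printf("%s\n", msg);

    /* File I/O works no matter what kind of executable loaded the DLL. */
    FILE *f = fopen("log.txt", "a");
    if (f != NULL) {
        fprintf(f, "%s\n", msg);
        fclose(f);
    }
}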

How would one go about testing an interpreter or a compiler?

I've been experimenting with creating an interpreter for Brainfuck, and while it was quite simple to make and get up and running, part of me wants to be able to run tests against it. I can't seem to fathom how many tests one might have to write to cover all the possible instruction combinations and ensure that the implementation is correct.
Obviously, with Brainfuck, the instruction set is small, but I can't help but think that as more instructions are added, your test code would grow exponentially. More so than your typical tests at any rate.
Now, I'm about as newbie as you can get in terms of writing compilers and interpreters, so my assumptions could very well be way off base.
Basically, where do you even begin with testing on something like this?
Testing a compiler is a little different from testing some other kinds of apps, because it's OK for the compiler to produce different assembly-code versions of a program as long as they all do the right thing. However, if you're just testing an interpreter, it's pretty much the same as any other text-based application. Here is a Unix-centric view:
You will want to build up a regression test suite. Each test should have
Source code you will interpret, say test001.bf
Standard input to the program you will interpret, say test001.0
What you expect the interpreter to produce on standard output, say test001.1
What you expect the interpreter to produce on standard error, say test001.2 (you care about standard error because you want to test your interpreter's error messages)
You will need a "run test" script that does something like the following
function fail {
    echo "Unexpected differences on $1:"
    diff "$2" "$3"
    exit 1
}

for testname
do
    tmp1=$(mktemp)
    tmp2=$(mktemp)
    brainfuck "$testname.bf" < "$testname.0" > "$tmp1" 2> "$tmp2"
    cmp -s "$testname.1" "$tmp1" || fail "stdout" "$testname.1" "$tmp1"
    cmp -s "$testname.2" "$tmp2" || fail "stderr" "$testname.2" "$tmp2"
done
You will find it helpful to have a "create test" script that does something like
brainfuck "$testname.bf" < "$testname.0" > "$testname.1" 2> "$testname.2"
You run this only when you're totally confident that the interpreter works for that case.
You keep your test suite under source control.
It's convenient to embellish your test script so you can leave out files that are expected to be empty.
Any time anything changes, you re-run all the tests. You probably also re-run them all nightly via a cron job.
Finally, you want to add enough tests to get good test coverage of your interpreter's source code. The quality of coverage tools varies widely, but GNU gcov is an adequate coverage tool.
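With gcc, collecting coverage is mechanical; a sketch, assuming the interpreter is built from a single hypothetical brainfuck.c:

gcc --coverage -o brainfuck brainfuck.c   # instrument the build for gcov
./run-tests test001 test002               # the "run test" script from above
gcov brainfuck.c                          # writes brainfuck.c.gcov with per-line hit counts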
Good luck with your interpreter! If you want to see a lovingly crafted but not very well documented testing infrastructure, go look at the test2 directory for the Quick C-- compiler.
I don't think there's anything 'special' about testing a compiler; in a sense it's almost easier than testing some programs, since a compiler has such a basic high-level summary - you hand in source, it gives you back (possibly) compiled code and (possibly) a set of diagnostic messages.
Like any complex software entity, there will be many code paths, but since it's all very data-oriented (text in, text and bytes out) it's straightforward to author tests.
I’ve written an article on compiler testing, the original conclusion of which (slightly toned down for publication) was: It’s morally wrong to reinvent the wheel. Unless you already know all about the preexisting solutions and have a very good reason for ignoring them, you should start by looking at the tools that already exist. The easiest place to start is GNU C Torture, but bear in mind that it’s based on DejaGnu, which has, shall we say, issues. (It took me six attempts even to get the maintainer to allow a critical bug report about the Hello World example onto the mailing list.)
I’ll immodestly suggest that you look at the following as a starting place for tools to investigate:
Software: Practice and Experience, April 2007. (Payware, not available to the general public; a free preprint is at http://pobox.com/~flash/Practical_Testing_of_C99.pdf.)
http://en.wikipedia.org/wiki/Compiler_correctness#Testing (Largely written by me.)
Compiler testing bibliography (Please let me know of any updates I’ve missed.)
In the case of Brainfuck, I think testing it should be done with Brainfuck scripts. I would test the following, though (a sample fixture follows the list):
1. Are all the cells initialized to 0?
2. What happens when you decrement the data pointer while it's pointing to the first cell? Does it wrap? Does it point to invalid memory?
3. What happens when you increment the data pointer while it's pointing at the last cell? Does it wrap? Does it point to invalid memory?
4. Does output function correctly?
5. Does input function correctly?
6. Does the [ ] loop construct work correctly?
7. What happens when you increment a byte more than 255 times? Does it wrap to 0 properly, or is it incorrectly treated as an integer or some other value?
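As one such fixture, exercising items 4 and 6: the one-liner below sets a cell to 65 with an 8 x 8 loop plus one extra increment, so a correct interpreter must print exactly one 'A' and nothing else:

++++++++[>++++++++<-]>+.

Pair it with an empty stdin file and a one-byte expected-stdout file in the regression layout described in the previous answer.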
More tests are possible too, but this is probably where I'd start. I wrote a BF compiler a few years ago, and it had a few extra tests. In particular, I tested the [ ] construct heavily by putting a lot of code inside the block, since an early version of my code generator had issues there (on x86, using a jxx, I had issues when the block produced more than 128 bytes or so of code, resulting in invalid x86 asm).
You can also test it with some already-written Brainfuck programs.
The secret is to:
Separate the concerns
Observe the law of Demeter
Inject your dependencies
Well, software that is hard to test is a sign that the developer wrote it like it's 1985. Sorry to say that, but utilizing the three principles I presented here, even line-numbered BASIC would be unit-testable (it IS possible to inject dependencies into BASIC, because you can do "goto variable").