How to test SimPy simulation code with pytest?

I'm using SimPy to write a simple network simulator. I'd like to write some tests (ideally with pytest or similar) to automatically test that my code is doing what it's supposed to do.
My problem is that the simulation contains a lot of randomly generated events (e.g., flow arrivals). Is it possible to test, for example, if my function that randomly generates new flows works correctly (with pytest)? If so, how?
My goal is to write some (unit) tests that I can run automatically with Travis whenever I want to merge new code.

You can set the random seed in the random library to get reproducible results. Other random libraries you might use, such as NumPy, have the same feature.
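For example, a minimal sketch with pytest, assuming a flow generator roughly like the one described (generate_flows, its parameters, and the assertions are made-up stand-ins for your own code):
import random
import simpy

def generate_flows(env, flows, mean_interarrival=1.0):
    # Hypothetical generator: record a new flow arrival after each
    # exponentially distributed inter-arrival time.
    while True:
        yield env.timeout(random.expovariate(1.0 / mean_interarrival))
        flows.append(env.now)

def test_flow_generation_is_reproducible():
    random.seed(42)  # fixed seed makes the "random" arrivals deterministic
    env = simpy.Environment()
    flows = []
    env.process(generate_flows(env, flows))
    env.run(until=100)
    assert len(flows) > 0                    # some flows were generated
    assert all(0 < t <= 100 for t in flows)  # arrivals stay inside the horizon
With the seed fixed, you can also assert on exact arrival times captured from a known-good run, or on statistical properties such as the average inter-arrival time.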


'No data to report' when I execute coverage report

I use django.test for my unit tests.
At first I run
coverage run ./manage.py test audit.lib.tests.test_prune
and it works well:
----------------------------------------------------------------------
Ran 1 test in 1.493s
OK
But when I run coverage report, something unexpected happens: instead of showing the report, it says there is no data.
root@0553f9cad609:/opt/buildaudit# coverage report
No data to report.
I have no idea; this has confused me the whole day. Thank you all!
Some of my tests executed helper programs, and I wanted to gather coverage results for those programs too. That meant coverage had to gather and store metrics in multiple processes at the same time. Normally, it stores them in a single file named .coverage, which doesn't work when gathering metrics in parallel. Instead, coverage needs to be told to store results in separate files, one per process, giving them unique file names. Per the docs, that can be done by adding this to .coveragerc.
[run]
parallel = True
The report generators, like coverage html, expect those results to be combined into a single file. That can be done by running this after the tests have finished, and before trying to create a report from them.
% coverage combine
Not doing so produces the "No data to report" error in the question. Credit goes to @PengQunZhong for first suggesting this.
Going beyond the question a bit, this actually wasn't enough for me to get measurements from all sub-processes. The docs have a good description of the subtleties and solutions, but I'll summarize what I chose. I use the multiprocessing module to start some of the sub-processes, so I had to add the following in the [run] section of .coveragerc.
concurrency = multiprocessing
Also, sub-processes needed to tell coverage to gather metrics since, unlike top-level tests, sub-processes are not run by coverage. I did this by adding the following at the top of the code for each sub-process. See the reference for other options.
import os

# Only start coverage when the parent test run asks for it via the
# COVERAGE_PROCESS_START environment variable.
if "COVERAGE_PROCESS_START" in os.environ:
    import coverage
    coverage.process_startup()
The environment variable used here is recognized by coverage; don't rename it. Also, I ran my tests with the following. I use pytest, but other test frameworks can be handled similarly. There's also a pytest plug-in that can help.
% COVERAGE_PROCESS_START=.coveragerc coverage run -m pytest
Finally, some tests and their sub-processes needed small changes to ensure coverage was allowed to save its results when the process was terminated. An ungraceful exit, SIGKILL, etc. prevent this. coverage writes its results in an atexit hook, and if you have coverage 6.3 or newer, also in a signal handler for SIGTERM. If your sub-processes are terminated any other way, coverage will not be able to save its results. In my case, I usually sent a SIGTERM to the sub-process from its parent. A parent that used subprocess.Popen objects, for example, did this.
kid.terminate()
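For instance, the parent side looked roughly like this (a sketch; child.py stands in for one of the helper programs):
import subprocess

kid = subprocess.Popen(["python", "child.py"])  # hypothetical helper program
try:
    ...  # drive the child and run assertions against it
finally:
    kid.terminate()       # sends SIGTERM on POSIX, so coverage >= 6.3 can flush its data
    kid.wait(timeout=10)  # give the child time to exit and write its .coverage.* file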

TestNG - Is it possible to use AnnotationTransformer with dataProvider?

I am writing functional tests using TestNG, and I have a few dozen similar tests with different data. I would like to use a DataProvider to reduce the repeated code.
But some of those tests pass and some fail (due to a known defect). I want to disable the failing tests until the defect is fixed, so they don't spoil the whole picture of the test run.
I see that an AnnotationTransformer can change test annotations dynamically. Can an AnnotationTransformer disable a test for only some of the data sets? Or will it disable the test for all provided data, in which case it is better not to change anything?
Thanks in advance.
Why not simply put these failing tests in a group, say "broken", and exclude that group from your runs? It's much simpler than using an annotation transformer, and the reports will show you which groups were excluded, so there is no risk of missing any when the time comes to ship.

TestNG & Selenium: Separate tests into "groups", run ordered inside each group

We use TestNG and Selenium WebDriver to test our web application.
Now our problem is that we often have several tests that need to run in a certain order, e.g.:
login to application
enter some data
edit the data
check that it's displayed correctly
Now obviously these tests need to run in that precise order.
At the same time, we have many other tests which are totally independent from the list of tests above.
So we'd like to be able to somehow put tests into "groups" (not necessarily groups in the TestNG sense), and then run them such that:
tests inside one "group" always run together and in the same order
but different test "groups" as a whole can run in any order
The second point is important, because we want to avoid dependencies between tests in different groups (so different test "groups" can be used and developed independently).
Is there a way to achieve this using TestNG?
Solutions we tried
At first we just put tests that belong together into one class, and used dependsOnMethods to make them run in the right order. This used to work in TestNG V5, but in V6 TestNG will sometimes interleave tests from different classes (while respecting the ordering imposed by dependsOnMethods). There does not seem to be a way to tell TestNG "Always run tests from one class together".
We considered writing a method interceptor. However, this has the disadvantage that running tests from inside an IDE becomes more difficult (because directly invoking a test on a class would not use the interceptor). Also, tests using dependsOnMethods cannot be ordered by the interceptor, so we'd have to stop using that. We'd probably have to create our own annotation to specify ordering, and we'd like to use standard TestNG features as far as possible.
The TestNG docs propose using preserve-order to order tests. That looks promising, but only works if you list every test method separately, which seems redundant and hard to maintain.
Is there a better way to achieve this?
I am also open to any other suggestions on how to handle tests that build on each other, without having to impose a total order on all tests.
PS
alanning's answer points out that we could simply keep all tests independent by doing the necessary setup inside each test. That is in principle a good idea (and some tests do this); however, sometimes we need to test a complete workflow, with each step depending on all previous steps (as in my example). Doing that with "independent" tests would mean running the same multi-step setup over and over, which would make our already slow tests even slower. Instead of three tests doing:
Test 1: login to application
Test 2: enter some data
Test 3: edit the data
we would get
Test 1: login to application
Test 2: login to application, enter some data
Test 3: login to application, enter some data, edit the data
etc.
In addition to needlessly increasing testing time, this also feels unnatural - it should be possible to model a workflow as a series of tests.
If there's no other way, this is probably how we'll do it, but we are looking for a better solution, without repeating the same setup calls.
You are mixing "functionality" and "test". Separating them will solve your problem.
For example, create a helper class/method that executes the steps to log in, then call that class/method in your Login test and all other tests that require the user to be logged in.
Your other tests do not actually need to rely on your Login "Test", just the login class/method.
If later back-end modifications introduce a bug in the login process, all of the tests which rely on the Login helper class/method will still fail as expected.
Update:
Turns out this already has a name, the Page Object pattern. Here is a page with Java examples of using this pattern:
http://code.google.com/p/selenium/wiki/PageObjects
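To illustrate the shape of the pattern, here is a small sketch using Selenium's Python bindings purely for brevity; the URL, element IDs, and class names are invented, and the linked page shows the Java equivalent:
from selenium.webdriver.common.by import By

class LoginPage:
    def __init__(self, driver, base_url="https://example.test"):
        self.driver = driver
        self.base_url = base_url

    def login(self, username, password):
        # One reusable "login" action that every test can call, instead of
        # every test depending on a separate Login *test*.
        self.driver.get(self.base_url + "/login")
        self.driver.find_element(By.ID, "username").send_keys(username)
        self.driver.find_element(By.ID, "password").send_keys(password)
        self.driver.find_element(By.ID, "submit").click()
        return HomePage(self.driver)

class HomePage:
    def __init__(self, driver):
        self.driver = driver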
Try dependsOnGroups along with dependsOnMethods, and add all methods in the same class to one group.
For example:
@Test(groups = {"cls1", "other"})
public void cls1test1() {
}
@Test(groups = {"cls1", "other"}, dependsOnMethods = "cls1test1", alwaysRun = true)
public void cls1test2() {
}
In class 2:
@Test(groups = {"cls2", "other"}, dependsOnGroups = "cls1", alwaysRun = true)
public void cls2test1() {
}
@Test(groups = {"cls2", "other"}, dependsOnMethods = "cls2test1", dependsOnGroups = "cls1", alwaysRun = true)
public void cls2test2() {
}
There is an easy (whilst hacky) workaround for this if you are comfortable with your first approach:
At first we just put tests that belong together into one class, and used dependsOnMethods to make them run in the right order. This used to work in TestNG V5, but in V6 TestNG will sometimes interleave tests from different classes (while respecting the ordering imposed by dependsOnMethods). There does not seem to be a way to tell TestNG "Always run tests from one class together".
We had a similar problem: we needed our tests to run class-wise because we couldn't guarantee that the test classes would not interfere with each other.
This is what we did:
Put a
@Test(dependsOnGroups = {"dummyGroupToMakeTestNGTreatThisAsDependentClass"})
annotation on an abstract test class or interface that all your tests inherit from.
This will put all your methods in the "first group" (group as described in this paragraph, not a TestNG group). Inside that group, the ordering is class-wise.
Thanks to Cedric Beust, who provided a very quick answer for this.
Edit:
The group dummyGroupToMakeTestNGTreatThisAsDependentClass actually has to exist, but you can just add a dummy test case for that purpose.

Does there exist an established standard for testing command line arguments?

I am developing a command line utility that has a LOT of flags. A typical command looks like this:
mycommand --foo=A --bar=B --jar=C --gnar=D --binks=E
In most cases, a 'success' message is printed but I still want to verify against other sources like an external database to ensure actual success.
I'm starting to create integration tests and I am unsure of the best way to do this. My main concerns are:
There are many, many flag combinations; how do I know which combinations to test? If you do the math for the 10+ flags that can be used together...
Is it necessary to test permutations of flags?
How to build a framework capable of automating the tests and then verifying results.
How to keep track of a large number of flags and provide an ordering, so it is easy to tell which combinations have been covered and which have not.
The thought of manually writing out individual cases and verifying results in a unit-test like format is daunting.
Does anyone know of a pattern that can be used to automate this type of test? Perhaps even software that attempts to solve this problem? How did the people working on GNU command-line tools test their software?
I think this is very specific to your application.
First, how do you determine the success of an execution of your application? Is it a result code? Is it something printed to the console?
For question 2, it depends on how you parse those flags in your application. Most of the time, the order of flags isn't important, but there are cases where it is. I hope you don't need to test permutations of flags, because that would add a lot of cases to test.
In the general case, you should analyse the impact of each flag. It is possible that a flag doesn't interfere with the others, in which case it only needs to be tested once. This is also the case for flags that are meant to be used alone (--help or --version, for example). You also need to analyse which values you should test for each flag. Usually, you want to try each kind of possible valid value and each kind of possible invalid value.
I think a simple script could be written to perform the tests, in bash or any scripting language such as Python. Using nested loops, you could try, for each flag, its possible values, including invalid values and the case where the flag isn't set. This will produce a multidimensional matrix of results, which should then be analysed to check that the results conform to what is expected.
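A rough Python sketch of that nested-loop idea (the flag names follow the question's example command; mycommand and the expected outputs are placeholders):
import itertools
import subprocess

FLAG_VALUES = {
    "--foo": ["A", "invalid!", None],  # None means "flag omitted"
    "--bar": ["B", None],
    "--jar": ["C", None],
}

def build_command(combo):
    cmd = ["mycommand"]
    for flag, value in combo.items():
        if value is not None:
            cmd.append(f"{flag}={value}")
    return cmd

results = {}
keys = list(FLAG_VALUES)
for values in itertools.product(*(FLAG_VALUES[k] for k in keys)):
    combo = dict(zip(keys, values))
    proc = subprocess.run(build_command(combo), capture_output=True, text=True)
    # Store the exit code and output so they can later be checked against
    # the external database or other source of truth.
    results[tuple(values)] = (proc.returncode, proc.stdout)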
When I write apps (in scripting languages), I have a function that parses a command line string. I source the file that I'm developing and unit test that function directly rather than involving the shell.
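The same idea in Python: if the parsing lives in its own function (here a hypothetical argparse parser mirroring the question's flags), it can be unit tested without spawning a shell:
import argparse

def parse_args(argv):
    parser = argparse.ArgumentParser(prog="mycommand")
    parser.add_argument("--foo")
    parser.add_argument("--bar")
    parser.add_argument("--jar")
    return parser.parse_args(argv)

def test_flags_are_parsed():
    args = parse_args(["--foo=A", "--bar=B", "--jar=C"])
    assert (args.foo, args.bar, args.jar) == ("A", "B", "C")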

Test framework for black box regression testing

I am looking for a tool for regression testing a suite of equipment we are building.
The current concept is that you create an input file (text/csv) to the tool specifying inputs to the system under test. The tool then captures the outputs from the system and records the inputs and outputs to an output file.
The output is in the same format as the original input file and can be used as an input for following runs of the tool, with the measured outputs matched with the values from the previous run.
The results of two runs will not be exact matches; there are some timing differences that depend on the state of the battery, or on other internal state of the equipment.
We would have to write our own interfaces to pass the commands from the tool to the equipment and to capture the output of the equipment.
This is a relatively simple task, but I am looking for an existing tool / package / library to avoid re-inventing the wheel, or at least to steal lessons from.
I recently built a system like this on top of git (http://git.or.cz/). Basically, write a program that takes all your input files, sends them to the server, reads the output back, and writes it to a set of output files. After the first run, commit the output files to git.
For future runs, your success is determined by whether the git repository is clean after the run finishes:
test 0 == $(git diff data/output/ | wc -l)
As a bonus, you can use all the git tools to compare differences, and commit them if it turns out the differences were an improvement, so that future runs will pass. It also works great when merging between branches.
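A minimal sketch of that check written as a test, assuming a driver script that writes the captured results under data/output/ (the script name is made up):
import subprocess

def capture_outputs():
    # Hypothetical driver: send data/input/ to the equipment and write the
    # measured results to data/output/.
    subprocess.run(["./run_capture.sh"], check=True)

def test_outputs_match_committed_baseline():
    capture_outputs()
    # --exit-code makes git return 1 if data/output/ differs from what is committed.
    diff = subprocess.run(["git", "diff", "--exit-code", "data/output/"])
    assert diff.returncode == 0, "outputs changed; inspect with git diff data/output/"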
I'm not sure there will be a single package that exactly suits your needs. You have a few considerations to make:
How to pass data to the equipment and how to collect it back. This is very application specific, but a usually good option is the good old serial port (RS-232), for which an easy interface exists in any programming language.
How to run the tests. A unit-testing framework can definitely help you here. The existing frameworks have a lot of the basic features implemented - selecting tests to run, selecting the detail-level of the report (very important for detailed debugging at first and production-stage PASS/FAIL analysis later on). I've had good experience using the test frameworks of both Perl and Python from testing embedded devices.
You also have to decide how to make the comparisons. As you correctly noted, the results won't be equal. This is where your domain knowledge comes in. Usually, it is simply implemented using error margins that are applicable in your domain. Of course, you won't be able to use a basic diff tool and will have to write an intelligent script.
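For instance, a comparison with per-field error margins might look like this (the field names and tolerances are invented for illustration):
import math

TOLERANCES = {"voltage": 0.05, "response_time_ms": 20.0}

def rows_match(expected, actual):
    # Compare two measurement records field by field, allowing the
    # domain-specific margin for each numeric field.
    return all(
        math.isclose(expected[field], actual[field], abs_tol=margin)
        for field, margin in TOLERANCES.items()
    )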
You can just use any test framework. The hard part is writing the tools to send/retrieve the data from your test system, not the actual string comparisons.
Your tests would just all look like this:
def test_output_matches_expected():
    x = read_input_file(ifilename)      # stimulus to send to the system
    y1 = read_expected_data(ofilename)  # baseline recorded on a previous run
    send_input_file_to_server(x)
    y2 = read_output_from_server()
    assert y1 == y2                     # or a tolerance-aware comparison