Sometimes the Hudson build step fails and does not execute all my tests, which really screws up the test trend graph: it shows a 50% drop in the number of tests and then goes up again. Is there a way to exclude the failed builds? I tried to delete the whole failed build, but that didn't help.
If you delete the "offending" build and then run another build I think you'll find the trend graph is rebuilt without the blip.
That said, I have to side with @DarkDust on this one. If your CI environment is prone to wobble, then these stats can be useful for diagnostics.
Currently I have a PowerShell post-build script that launches our Selenium tests. The time of day is checked, and if it is between 6:45 and 8:00 AM, the full test suite runs. If not, it's a normal CI build and only a small subset of tests runs.
We are switching to TestCafe, and I have added Build Definition steps to install testcafe, install testcafe-reporter-junit and run the tests. I'd like to move the test runs to regular steps instead of scripted ones, if possible, but I would need to know whether I can condition the full-suite test step to only run during the above-mentioned time period. Is that possible with custom conditions?
There is no condition syntax that deals with dates and times. But you can run a small scripted step that sets a variable, then use that variable in the condition. For example:
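Here is a minimal sketch, assuming an Azure DevOps pipeline and a variable named RunFullSuite (the variable name and the time-window logic are placeholders from the question, not anything the platform mandates). A bash script step emits the logging command that sets the variable:

#!/usr/bin/env bash
# Decide whether the full suite should run (06:45-08:00 window from the question).
minutes=$((10#$(date +%H) * 60 + 10#$(date +%M)))  # minutes since midnight; 10# avoids octal parsing of "08"
if [ "$minutes" -ge 405 ] && [ "$minutes" -le 480 ]; then
  echo "##vso[task.setvariable variable=RunFullSuite]True"
else
  echo "##vso[task.setvariable variable=RunFullSuite]False"
fi

The full-suite test step can then use a custom condition such as and(succeeded(), eq(variables['RunFullSuite'], 'True')), while the small subset step runs unconditionally.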
I have a Behat testing suite running through Travis CI on pull requests. I know that you can add a "--rerun" option to the command line to re-run failed tests, but for me Behat just keeps trying to re-run the failed tests, which eventually times out the test run session.
Is there a way to limit the number of times that failed tests are re-run? Something like "behat --rerun=3" for trying to run a failed scenario up to three times?
The only other way I can think of to accomplish this is to store the test failures, and the number of times they have been run, either in the database I'm testing Behat against or in a file.
EDIT:
Locally, running the following commands ends up re-running only the one test I purposely made to fail... and it does so in a loop until something stops it. Sometimes it ran 11 times and sometimes 100+ times.
behat --tags @some_tag
behat --rerun
So that doesn't match what the Behat command line guide states. In my terminal, the help option gives me "--rerun Re-run scenarios that failed during last execution." without any mention of the failed-scenario file. I'm using a 3.0 version of Behat, though.
Packages used:
"behat/behat": "~3.0",
"behat/mink": "~1.5",
"behat/mink-extension": "~2.0",
"behat/mink-goutte-driver": "~1.0",
"behat/mink-selenium2-driver": "~1.1",
"behat/mink-browserkit-driver": "~1.1",
"drupal/drupal-extension": "~3.0"
Problem:
Tests fail at random, mainly due to Guzzle timeout errors after going past 30 seconds trying to GET a URL. Sure, I could try bumping up the max execution time, but since other tests have no issues and 30 seconds is already a long time to wait for a request, I don't think that would fix the issue, and it would make my test runs much longer for no good reason.
Is it possible that you don't have an efficient CI setup?
I think this option should re-run the failed tests only once, and maybe your setup is entering a loop.
If you need to re-run the failed scenarios a set number of times, you could write a script that checks the exit status and re-runs with --rerun, as in the sketch below.
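A minimal sketch, assuming bash and Behat 3 (the tag name and the retry limit of 3 are placeholders):

#!/usr/bin/env bash
# Run the suite once, then retry only the failed scenarios up to MAX_RETRIES
# times, since Behat itself has no --rerun=N option.
MAX_RETRIES=3
behat --tags @some_tag
status=$?
attempt=0
while [ "$status" -ne 0 ] && [ "$attempt" -lt "$MAX_RETRIES" ]; do
  attempt=$((attempt + 1))
  echo "Re-running failed scenarios (attempt $attempt of $MAX_RETRIES)"
  behat --rerun
  status=$?
done
exit "$status"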
Be aware that, as I read the Behat CLI guide, if no scenario is found in the file where the failed scenarios are saved, then all scenarios will be executed.
I don't think that using '--rerun' in CI is good practice. You should do a high-level review of the results before deciding to do a rerun.
I checked the rerun option on Behat 3 today, and it seems there might be a bug related to it; I saw some recent pull requests on GitHub.
One of them is https://github.com/Behat/Behat/pull/857
Regarding the timeout, you can check whether Travis has a timeout setting and whether it has enough resources; you can also run the same steps from your desktop against the same test environment and compare.
Also, set CURLOPT_TIMEOUT for Guzzle to the value needed to pass, as long as it is not exaggerated; otherwise you will need to find another solution to improve the execution speed.
A higher value should not be such an issue, because this is a conditional wait: if the request is faster it will not impact the execution time, and only problematic scenarios will wait longer.
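As an illustration only, here is roughly how the timeout might be raised in behat.yml, assuming MinkExtension with the Goutte driver; the exact keys depend on your Guzzle and driver versions, so treat this as a starting point rather than a drop-in config:

default:
  extensions:
    Behat\MinkExtension:
      goutte:
        # Guzzle 3 style cURL options; newer Guzzle versions take a flat
        # 'timeout' request option instead.
        guzzle_parameters:
          curl.options:
            CURLOPT_TIMEOUT: 60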
I have many builds with failed tests.
I am looking at one test and want to find the last build where this test was successful.
How can I find this build, or how can I get the test result history?
If you are publishing your test results as a post-build step, you should be able to go to the job page and see the test results graph. If you are looking for the last build where all tests passed, you can click "enlarge" under the graph. You will then see a high-level view of all the builds, from which you should be able to spot the last build where all tests passed.
If you are looking for the last time a particular test passed, you can click on the latest build in the test results graph, drill down to the particular test you are interested in, and then click the previous build button until you find a build where that test passed.
If you aren't publishing your test results, you may be able to write some Groovy code to scan through your build results, but beyond that I don't know of a way to find what you're looking for without publishing test results in your post-build.
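If the results are published, you can also script the lookup against the Jenkins JSON API instead of clicking through builds. A hedged sketch, assuming curl and jq are available; JENKINS_URL, JOB and TEST_NAME are placeholders:

#!/usr/bin/env bash
# Walk builds from newest to oldest and report the last one in which the
# given test case passed.
JENKINS_URL=http://jenkins.example.com
JOB=my-job
TEST_NAME=testSomething
for build in $(curl -s "$JENKINS_URL/job/$JOB/api/json?tree=builds[number]" | jq -r '.builds[].number'); do
  status=$(curl -s "$JENKINS_URL/job/$JOB/$build/testReport/api/json" \
    | jq -r --arg name "$TEST_NAME" \
        '.suites[].cases[] | select(.name == $name) | .status' 2>/dev/null \
    | head -n 1)
  if [ "$status" = "PASSED" ] || [ "$status" = "FIXED" ]; then
    echo "Test $TEST_NAME last passed in build $build"
    break
  fi
done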
I have a build job which takes a parameter (say, which branch to build) and, when it completes, triggers a testing job (actually several jobs) which does things like download a bunch of test data and check that the new version works with the test data.
My problem is that I can't seem to figure out a way to show the test results in a sensible way. If I just use one testing job, then the test results for "stable" and "dodgy-future-branch" get mixed up, which isn't what I want. If I create a separate testing job for each branch that the build job understands, it quickly becomes unmanageable because of combinatorial explosion (say, 6 branches and 6 different types of testing mean I need 36 testing jobs, and when I want to make a change, say to save more builds, I need to update all 36 by hand).
I've been looking at the Job Generator Plugin and ez-templates in the hope that I might be able to create and manage just the templates for the testing jobs and have the actual jobs created/updated on the fly. I can't shake the feeling that this is so hard because my basic model is wrong. Is it just that separating the building and testing jobs like this is not recommended, or is there some other method that allows filtering the test results of a job based on build parameters that I haven't found yet?
I would define a set of simple use cases:
Check in on development branch triggers build
Successful build triggers UpdateBuildPage
Successful build of development triggers IntegrationTest
Successful IntegrationTest triggers LoadTest
Successful IntegrationTest triggers UpdateTestPage
Successful LoadTest triggers UpdateTestPage
etc.
So in particular, I wouldn't look through all the Jenkins job results for an overview, but would create a web page or something like that instead.
I wouldn't expect the full matrix of builds/tests; the combinations that are actually used will become clear from the use cases.
I am trying to find a tool that can solve the following problem:
Our entire test suite takes hours to run, which often makes it difficult, or at least very time consuming, to find out which commit broke a specific test, since there may be 50 to 200 commits between test runs. At any given time there are only very few broken tests, so re-running only the broken tests is very fast compared to running the entire test suite.
Is there a tool, e.g. a continuous integration server, that can re-run failed tests at a couple of revisions between the last revision where the test was OK and the first revision where the test was not OK, and therefore automatically find out at which specific commit a test switched from successful to broken?
For example:
Tests A and B are OK in revision 100. Tests A and B are broken in revision 200.
The tool should now run both tests with revision 150.
Then if e.g. test A was broken and test B was OK in revision 150, it could continue to check test A with revision 125 and test B with revision 175 and so on until every broken test can be explained by some specific commit.
For a single test, I could probably hack something together with git bisect. But for multiple failed tests this is probably not sufficient, since we need to search in both directions across many revisions.
Git bisect
Are you using git or Mercurial (see: What is Mercurial bisect good for?)?
Suppose version 2.6.18 of your project worked, but the version at "master" crashes. Sometimes the best way to find the cause of such a regression is to perform a brute-force search through the project's history to find the particular commit that caused the problem. The git bisect command can help you do this:[...]
From: Finding Issues - Git Bisect.
Essentially you mark two revisions as the start and end and run:
$ git bisect start <bad> <good>
Then run the test and, depending on whether it passed or failed, call
$ git bisect good
or
$ git bisect bad
Respectively. Git will do a binary search, always cutting the remaining revisions in half. You can script it easily (see the sketch below). If you are using svn, git can easily import the whole repository.
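A minimal automation sketch using git bisect run, assuming a helper script ./run-single-test.sh that exits 0 when the given test passes and non-zero when it fails (the script name, test name and revision names are placeholders). For several failed tests, you could simply repeat this once per test:

#!/usr/bin/env bash
# Bisect one failing test between a known-good and a known-bad revision.
git bisect start bad-revision good-revision
# git checks out revisions in between; the script's exit code marks each
# one as good (0) or bad (non-zero) until the first bad commit is found.
git bisect run ./run-single-test.sh TestA
git bisect reset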
Building a single revision at a time
This is not an answer but a piece of advice: just test every single commit! Today you can cluster continuous integration servers easily; with a farm of servers it is not that hard to test all commits.