I am a little confused about the concept of test automation (using Selenium etc.) when doing regression testing. If the system being tested is constantly changing, how does that affect the test cases? And is automation the best way to go in this scenario?
Regression testing means you test to see if the system still behaves the way it should, in particular, if it still does everything it did correctly before any change.
Most users will complain when you break features in software, so you can't get around regression testing before a release. That leaves the question of how you do it.
You can manually test. Hire a bunch of monkeys, interns, testers, or whatever, and let them test. In order for them to find any regressions, they need to know what to test. So you need test scripts, which tell the tester what functionality to test: which button to click and what text to enter and then what result to expect. (This part rules out most monkeys, unfortunately.)
The alternative is automated testing: you still have a kind of test script, but this time a computer works through the script instead of a manual tester.
Advantages of automated testing:
It's usually faster than manual testing.
You don't need to hire testers, interns, or monkeys.
You don't need to worry about humans missing a step or getting tired of clicking through the same old program over and over.
Disadvantages of automated testing:
Won't catch everything, in particular, some UI aspects may be hard to automate: a person will notice overlapping texts or pink on neon green text, but Selenium is happy if it can click it.
First you need to write the tests, and then maintain them. If you just add features, maintenance is not so bad, but if, e.g., you restructure your user interface, you may have to adjust all tests (Page Objects may come in handy here). But then again, you also have to rewrite all manual tests in such a situation.
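To illustrate the Page Object idea mentioned above, here is a minimal sketch. The login page, its element ids and the FakeDriver stand-in are all hypothetical; the stand-in only exists so the example runs without a browser. A real test would pass in a Selenium WebDriver, whose find_element(by, value) call has the same shape.

```python
class LoginPage:
    """Page Object: the only place that knows how the login page is built."""

    # Locators as (by, value) pairs; "id" is the value of Selenium's By.ID.
    USERNAME = ("id", "username")
    PASSWORD = ("id", "password")
    SUBMIT = ("id", "login-button")
    MESSAGE = ("id", "message")

    def __init__(self, driver):
        self.driver = driver

    def log_in(self, user, password):
        self.driver.find_element(*self.USERNAME).send_keys(user)
        self.driver.find_element(*self.PASSWORD).send_keys(password)
        self.driver.find_element(*self.SUBMIT).click()

    def message(self):
        return self.driver.find_element(*self.MESSAGE).text


class FakeElement:
    """Hypothetical stand-in for a browser element."""

    def __init__(self, driver, name):
        self.driver, self.name = driver, name

    def send_keys(self, value):
        self.driver.fields[self.name] = value

    def click(self):
        ok = self.driver.fields.get("password") == "secret"
        self.driver.fields["message"] = "Welcome" if ok else "Login failed"

    @property
    def text(self):
        return self.driver.fields.get(self.name, "")


class FakeDriver:
    """Hypothetical stand-in for a WebDriver; stores field values in a dict."""

    def __init__(self):
        self.fields = {}

    def find_element(self, by, value):
        return FakeElement(self, value)


def check_valid_login(driver):
    # The test talks to the page object only, never to locators.
    page = LoginPage(driver)
    page.log_in("alice", "secret")
    return page.message() == "Welcome"
```

If the UI is restructured, only the locators and methods of LoginPage should need to change; tests such as check_valid_login stay as they are.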
Regression test automation tools are among the most widely used testing tools in the industry today. Let me help you with an example, considering your scenario, which is 'software undergoes continuous change'. Assume that we are following a Scrum-based model in which the software is developed across several sprints. Say each sprint consists of 5 user stories/features. Sprint 1 is developed, tested and delivered.
The team moves on to Sprint 2, which again has 5 big features. Until the development team hands those features over to the testing team, the testing team writes automated scripts for Sprint 1. The testing team then runs the scripts, say on a daily basis, to check that the features being developed in Sprint 2 do not break the previously working and tested features of Sprint 1. This is nothing but automated regression testing.
Of course, this is not as easy as it sounds. A lot of investment is needed for automated testing. Investment not only in terms of money but also time, training costs, hiring specialists etc.
I worked on a project that consisted of approx. 25 sprints and hundreds of user stories, and spanned 2 years. With just 2 testers on the team, imagine the plight of the project had there been no automated test suite.
Again, automation cannot entirely replace manual testing, but it can to a considerable extent. Automated tests can be functional as well as visual regression tests. You can very well use Selenium to automate your functional tests and a separate visual regression tool to check for CSS breaks.
NOTE: Not every project needs to be automated. You have to consider the ROI (Return on Investment) when thinking about automating any project.
Regression testing is usually performed to verify that code changes made to a system do not break existing code, introduce new bugs, or alter the system's functionality. As such, it should be performed every time you deploy new functionality to your application, add a new module, alter system configurations, fix a defect, or make changes to improve system performance.
Below is a simple regression test script in Python. It runs a program against every input file in a test directory and compares the output with the results saved from the previous run, which helps catch errors that stem from changes made in the program's source code.
#!/usr/bin/env python3
import os, sys                        # get unix, python services
from glob import glob                 # file name expansion
from os.path import exists, getsize   # file exists and file size tests
from time import time, ctime          # time functions

print('RegTest start.')
print('user:', os.environ.get('USER', 'unknown'))   # environment variables
print('path:', os.getcwd())                         # current directory
print('time:', ctime(time()), '\n')

program = sys.argv[1]                 # two command-line args
testdir = sys.argv[2]

for test in glob(testdir + '/*.in'):  # for all matching input files
    if not exists('%s.out' % test):
        # no prior results: this run becomes the baseline
        os.system('%s < %s > %s.out 2>&1' % (program, test, test))
        print('GENERATED:', test)
    else:
        # backup, run, compare
        os.replace(test + '.out', test + '.out.bkp')
        os.system('%s < %s > %s.out 2>&1' % (program, test, test))
        os.system('diff %s.out %s.out.bkp > %s.diffs' % ((test,) * 3))
        if getsize(test + '.diffs') == 0:
            print('PASSED:', test)
            os.remove(test + '.diffs')
        else:
            print('FAILED:', test, '(see %s.diffs)' % test)

print('RegTest done:', ctime(time()))
Regression tests like the one above are designed to cover both functional and non-functional aspects of an application. This ensures bugs are caught with every build, thereby enhancing the overall quality of the final product. While regression tests are a vital part of the software QA process, performing these repetitive tests manually comes with a number of challenges. Manual tests can be tedious, time-consuming, and less accurate. Additionally, the number of test cases increases with every build, and the regression test suite grows with it.
An easy way of addressing the above challenges and maintaining a robust and cohesive set of regression test scripts is automation. Test automation increases the accuracy and widens the coverage of your regression tests as your test suite grows.
It is worth mentioning that even with automation, your regression tests are only as good as your test scripts. For this reason, you must understand what events trigger the need to improve and modify the test scenarios. For every change pushed to your codebase, you should evaluate its impact and modify the scripts to ensure all affected code paths are verified.
I hope this answered your question.
Let's say I have a bunch of unit tests, integration tests, and e2e tests that cover my app. Does it make sense to have these continuously running against prod, e.g. every 10 mins?
I'm thinking no, here's why:
My tests are already run after every prod deploy. If they passed and no code has changed after that, they should continue to pass. So running them again thereafter doesn't make sense.
What I really want to test continuously is my infrastructure -- is it still running? In this case, running an API integration test every 10 mins to check if my API is still working makes sense. So I'm dealing with a subset of my test suites -- the ones that test my infrastructure availability (integration+e2e) versus only single bits of code (unit test). So in practice, would I have test suites for prod uptime that are separate from the suites used to test pre/post deploy?
Such "redundant" verifications (they can include building as well, BTW, not only testing) offer additional datapoints increasing the monitoring precision for your actual production process.
Depending on the complexity of your production environment even the simple "is it up/running?" question might not have a simple answer and subset/shortcut versions of the verifications might not cut it - you'd only cover those versions, not the actual production ones.
For example just because a build server is up doesn't mean it's also capable of building the product successfully, you'd need to check every aspect of the build itself: availability of every tool, storage, dependencies, OS resources, etc. For complex builds it's probably simpler to just perform the build itself than to manage the code reliably checking if the build would be feasible ;)
There are 2 production process attributes that would benefit from more precise monitoring (and for which subset/shortcut verifications won't be suitable either):
reliability/stability - the types, occurrence rates and root causes of intermittent failures (yes, those nasty surprises which could make the difference between meeting the release date or not)
performance - the avg/min/max durations of various verifications; especially important if verifications are expensive in terms of duration/resources involved; trending could be desired for planning, budgeting, production ETAs, etc
I don't know if any of these are applicable to your context or have acceptable cost/benefit ratios there, but they are definitely important for most very large/complex software projects.
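As a rough illustration of the two attributes above, the results of repeated runs of one verification could be reduced to a failure rate and min/avg/max durations. This is only a sketch; the function name and the (duration, passed) record format are assumptions, not part of any particular CI tool.

```python
from statistics import mean


def summarize(runs):
    """Summarize repeated runs of one verification.

    runs: list of (duration_seconds, passed) tuples, one per execution.
    """
    durations = [duration for duration, _ in runs]
    failures = sum(1 for _, passed in runs if not passed)
    return {
        "runs": len(runs),
        "failure_rate": failures / len(runs),   # reliability/stability
        "min": min(durations),                  # performance
        "avg": mean(durations),
        "max": max(durations),
    }
```

Collecting such a summary per day or per week would give the trend data mentioned for planning and ETAs.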
If someone has a webpage, the usual way of testing the web site for user interaction bugs is to create each test case by hand and use Selenium.
Is there a tool to create these testcases automatically? So if I have a webpage that gets altered, new test cases get created automatically?
You can look at a paid product. That type of technology is not being developed as open source and will probably cost a bit. Some of the major test tools get closer to this, but full auto I have not heard of.
If such a tool existed, the role of QA Engineer and especially Automation Engineer would not be as important, and the number of jobs would drop pretty quickly. I would imagine that if such a tool were out there, it would be breaking news for the entire industry worldwide.
If you go down the artificial intelligence path, this is possible in theory and in concept. However, artificial intelligence development efforts usually cost more than the app that needs the testing, so... that's not going to happen.
The best you can do at this point is to separate as much of the maintenance as possible into a single section, so you limit the maintenance headache when modifying and keep a core that stays the same. I usually keep control manipulation generic, and let the workflow, the specific maps and the data change. That will allow it to function against any website... but you still have to write/update the tests and maintain the maps.
I think Growing Test Cases Automatically is more of what you're asking about. To be more specific, I'll try to introduce the basics; if you're interested, take a closer look at Evolutionary Testing.
Usually there is a standard set of constraints we meet like changing functionality of the system under test (SUT), limited timeframe, lack of appropriate test tools and the list goes on… Yet there is another type of challenge which arises as technological solutions progress further – increase of system complexity.
While the typical constraints are solvable through different technical and management approaches, in the case of system complexity we are facing the limit of our capability of defining a straightforward analytical method for assessing and validating system behavior. Complex systems consist of multiple, often heterogeneous components which, when working together, amplify each other’s statistical and behavioral deviations, resulting in a system which acts in ways that were not part of its initial design. To make matters worse, the same mechanism also makes complex systems more sensitive to their environment.
Options for testing complex systems
How can we test a system which behaves differently each time we run a test scenario? How can we reproduce a problem which costs days and millions to recover from, but happens only from time to time under conditions which are known just approximately?
One possible solution which I want to focus on is to embrace our lack of knowledge and work with whatever we have by using evolutionary testing. In this context the evolutionary testing can be viewed as a variant of black-box testing, because we are working with feeding input into and evaluating output from a SUT without focusing on its internal structure. The fine line here is that we are organizing this process of automatic test case generation and execution on a massive scale as an iterative optimization process which mimics the natural evolution.
Evolutionary testing
Elements:
• Population – set of test case executions participating in the optimization process
• Generation – the part of the Population involved in a given iteration
• Individual – single test case execution and its results, an element from the Population
• Genome – unified definition of all test cases, model describing the Population
• Genotype – a single test case instance, a model describing an Individual, instance of the Genome
• Recombination – transformation of one or more Genotypes into a new Genotype
• Mutation – random change in a Genotype
• Fitness Function – formalized criterion, expressing the suitability of the Individual against the goal of the optimization
How do we create these elements?
• Definition of the experiment goal (selection criteria) – sets the direction of the optimization process and is related to the behavior of the SUT. Involves certain characteristics of SUT state or environment during the performed test case experiments. Examples:
  ◦ “SUT should complete the test case execution with an error code”
  ◦ “The test case should drive the SUT through the largest number of branches in SUT’s logical structure”
  ◦ “Ambient temperature in the room where SUT is situated should not exceed 40 ºC during test case execution”
  ◦ “CPU utilization on the system, where SUT runs should exceed 80% during test case execution”
Any measurable parameters of SUT and its environment could be used in a goal statement. Knowledge of the relation between the test input and the goal itself is not obligatory. This gives a possibility to cover goals which are derived directly from requirements, rather than based on some late requirement derivative like business, architectural or technical model.
• Definition of the relevant inputs and outputs of the tested system – identification of SUT inputs and outputs, as well as environment parameters, relevant to the experiment goal.
• Formal definition of the experiment genome – encoding the summarized set of test cases into a parameterized model (usually a data structure), expressing relevant SUT input data, environment parameters and action sequences. This definition also needs to comply with the two major operations applied over genome instances – recombination and mutation. The mechanism for those two operations can be predefined for the type of data or action present in the genome, or have custom definitions.
• Formal definition of the selection criteria (fitness function) – an evaluation mechanism which takes SUT output or environment parameters resulting from a test case execution (Individual) and calculates a number (Fitness), signifying how close this particular Individual is to the experiment goal.
How does the process work?
1. We use the Genome to create a Generation of random Genotypes (test case instances).
2. We execute the test cases (Genotypes), generating results (Individuals).
3. We evaluate each execution result (Individual) against our goal using the Fitness Function.
4. We select only those Individuals from the given Generation which have Fitness above a given threshold (the top 10 %, above the average, etc.).
5. We use the selected Individuals to produce a new, full Generation by applying Recombination and Mutation.
6. We repeat the process, returning to step 2.
The iteration process usually stops by setting a condition with regard to the evaluated Fitness of a Generation. For example:
• If the top Fitness hasn’t changed by more than 0.1% since the last Iteration
• If the difference between the top and the bottom Fitness in a Generation is less than 0.3%
then probably it is time to stop.
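The loop described above can be sketched in a few lines of Python. Everything concrete here is an assumption made for illustration: the SUT is a toy function, a test case is three integers between 0 and 100, the top quarter of each Generation survives, and the stop condition is a simplified variant (no improvement of the top Fitness for a number of iterations) rather than the percentage thresholds mentioned above.

```python
import random

GENOME = [(0, 100)] * 3   # model of a test case: three ints in [0, 100]


def sut(a, b, c):
    """Hypothetical SUT; returns a measurable output (higher is 'closer')."""
    return -((a - 70) ** 2 + (b - 20) ** 2 + (c - 50) ** 2)


def fitness(genotype):
    # Fitness Function: execute the test case and score the result.
    return sut(*genotype)


def random_genotype(rng):
    return tuple(rng.randint(lo, hi) for lo, hi in GENOME)


def recombine(first, second, rng):
    # Recombination: take each gene from one of the two parents.
    return tuple(rng.choice(pair) for pair in zip(first, second))


def mutate(genotype, rng, rate=0.2):
    # Mutation: occasionally replace a gene with a random valid value.
    return tuple(
        rng.randint(lo, hi) if rng.random() < rate else gene
        for gene, (lo, hi) in zip(genotype, GENOME)
    )


def evolve(size=40, keep=0.25, patience=10, seed=0):
    rng = random.Random(seed)
    # Create the first Generation of random Genotypes.
    generation = [random_genotype(rng) for _ in range(size)]
    best, stale = None, 0
    while stale < patience:
        # Execute and evaluate every test case, best first.
        ranked = sorted(generation, key=fitness, reverse=True)
        if best is None or fitness(ranked[0]) > fitness(best):
            best, stale = ranked[0], 0
        else:
            stale += 1
        # Select the fittest Individuals and breed a new, full Generation.
        parents = ranked[: max(2, int(size * keep))]
        children = []
        while len(parents) + len(children) < size:
            first, second = rng.sample(parents, 2)
            children.append(mutate(recombine(first, second, rng), rng))
        generation = parents + children
    return best, fitness(best)
```

Calling evolve() returns the best test case found and its Fitness. Because the selected parents are carried over unchanged, the top Fitness never gets worse from one Generation to the next.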
Upsides and downsides
Upsides:
• We can work with limited knowledge for the SUT and goal-oriented test definitions
• We use a test case model (Genome) which allows us to mass-produce a large number of test cases (Genotypes) with little effort
• We can “seed” test cases (Genotypes) in the first iteration instead of generating them at random in order to speed up the optimization process.
• We could run test cases in parallel in order to speed up the process
• We could find multiple solutions which meet our test goal
• If the optimization process is convergent, we have a guarantee that each following Generation is a better approximate solution of our test goal. This means that even if we need to stop before we have reached optimal Fitness, we will still have better test cases than the ones we started with.
• We can achieve replay of very complex, hard to reproduce test scenarios which mimic real life and which are far beyond the reach of any other automated or manual testing technique.
Downsides:
• The process of defining the necessary elements for evolutionary test implementation is non-trivial and requires specific knowledge.
• Implementing such an automation approach is time- and resource-consuming and should be employed only when it is justifiable.
• The convergence of the optimization process depends on the smoothness of the Fitness Function. If its definition results in zones of discontinuity or small/no gradient, then we can expect slow or no convergence.
Update:
I also recommend looking at Genetic algorithms; this article about Test data generation can also give you approaches and guidelines.
I happen to develop ecFeed - an open-source tool that may assist in test design. It's in the pre-release phase and we are going to add better integration with Selenium, but you may have a look at the current snapshot: https://github.com/testify-no/ecFeed/wiki . The next version should arrive in October and will have major improvements in usability. Anyway, I am looking forward to constructive criticism.
In the Microsoft development world there is Visual Studio's Coded UI Test framework. This will record your actions in a web browser and generate test cases to replicate that use case. It won't update test cases with any changes to code though, you would need to update them manually or re-generate.
I'm going to work on the software testing process for a company which has several projects (which use different technologies), and I'm planning to improve and automate the software testing process. I know some of the concepts, such as black-box and white-box testing, and some of their techniques, but I do not have much experience in the field. I'm going to have access to the projects' documentation, and I expect to be involved more with functional testing than with white-box testing (although I'm not entirely sure).
What's the "right way" to start? I know that it depends on several factors, so I don't expect to get a perfect answer, but if I could read how others start, it would be great for me.
What sort of guidelines do you follow from the start? Where do the CMMI and IEEE 829 standards come in? Are there any other standards/guidelines worthy of note?
What's the best way to make a correct assessment of the current efficiency/productivity level of the software testing process inside the company?
Different Phases of Testing Life Cycle
The life cycle of the testing process intersects the software development life cycle. When testing starts varies from one company to another. In some companies, testing starts simultaneously with development, while in others, it starts after the software has been built. Both these methods have their own advantages and disadvantages. Whatever method is adopted for testing the software, the steps followed are more or less as described below.
Planning Phase
The software testing life cycle starts with the test planning stage. It is recommended that one spend a lot of time in this phase, to minimize headaches in the other software testing phases. It is in this phase that the 'Test Plan' is created. It is a document in which the items and features to be tested, the pass/fail criteria of a test, the exit criteria, the environment to be created, and the risks and contingencies are recorded. This gives the testing team refined specifications.
Analysis Phase
An analysis of the requirements is carried out, so that the testing team can be well versed in the software that has been developed. It is in this phase that the types of testing to be carried out in the different phases of the testing life cycle are decided upon. In some cases, the tests may have to be automated, and in others, manual tests would have to be carried out. A functional validation matrix, which is based on the business requirements, is made. It is normally based on one or more test cases. This matrix helps in analyzing which of the test cases will have to be automated and which will have to be tested manually.
Designing Phase
In the software testing life cycle, this phase has an important role to play. Here the test plan, functional validation matrix, test cases, etc. are all revised. This ensures there are no problems existing in any of them. If the test cases have to be automated, the suitable scripts for the same are designed at this stage. Test data for both manual as well as automated test cases is also generated.
Development Phase
Based on the test plan and test cases the entire scripting takes place in this phase. If the testing activity starts along with the development activity of the software, the unit tests will also have been implemented in the development phase. Often along with the unit tests, stress and performance test plans are generated in this phase.
Execution Phase
After the test scripts have been made, they are executed. Initially, unit tests are executed, followed by functionality tests. In the initial phase testing is carried out on the superficial level, i.e. on the top level. This helps in identifying bugs on the top level, which are then reported to the development team. Then the software is tested in depth. The test reports are created and bugs are reported.
Retest and Regression Testing Phase
Once the bugs have been identified, they are sent to the development team. Depending on the nature of the bug, the bug may be rejected, deferred or fixed. If the bug has been accepted and fixed immediately, the software has to be retested to check if the bug has indeed been fixed. Regression testing is carried out to ensure that no new bugs have been created in the software while fixing the bug.
Implementation
After the system has been checked, final testing on the developer's side is carried out. It is here that load, stress, performance and recovery testing is carried out. Then the software is implemented on the customer's end. The end users test the software, and bugs, if any, are reported. The necessary documents for the same are generated.
The testing life cycle does not end after the implementation phase. This is when the bugs found are studied, so as to rule out such problems in the future. This analysis helps in improving the software development process for the next software.
We run BDD tests (Cucumber/Selenium) with Jenkins in a Continuous Integration process. The number of tests is increasing day by day and the time to run these tests keeps growing, making the whole CI process not really responsive (if you commit in the afternoon, you risk seeing your build results the day after). Is there a way/pattern to keep the CI process quick in spite of the increasing number of tests?
You could choose one of the following schemes:
Separate projects for unit tests and integration tests. The unit tests will return their results faster, and the integration project will run once or just a couple of times per day and not after each commit. The drawback is obvious: if the integration test suite breaks, there is no direct correlation with the breaking change.
Google approach - sort your tests according to their size: small, medium, large and enormous. Use separate projects for each kind of test and according to the total time it takes to run the specific test suite. You can read more in this book. Also, read this blog to get more wise ideas.
Try to profile your current test suite to eliminate bottlenecks. This might bring it back to giving feedback in a timely fashion.
Hope that helps.
@Ikaso gave some great answers there. One more option would be to set up some build slaves (if you haven't already) and split the integration tests into multiple jobs that can be run in parallel on the slaves.
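One way to split a suite into parallel jobs of roughly equal length is a greedy assignment: take the tests from slowest to fastest and always give the next one to the job with the smallest total so far. The sketch below assumes you have measured durations per test; the function name and the example numbers are made up.

```python
import heapq


def split_into_jobs(durations, workers):
    """Distribute tests over parallel jobs, longest tests first.

    durations: {test name: seconds}; workers: number of parallel jobs.
    Returns one (total_seconds, [test names]) tuple per job.
    """
    # Heap entries are (total so far, job index, tests); the unique job
    # index breaks ties, so the lists themselves are never compared.
    heap = [(0.0, index, []) for index in range(workers)]
    heapq.heapify(heap)
    for name, seconds in sorted(durations.items(), key=lambda item: -item[1]):
        total, index, tests = heapq.heappop(heap)   # least loaded job
        tests.append(name)
        heapq.heappush(heap, (total + seconds, index, tests))
    jobs = sorted(heap, key=lambda job: job[1])     # back in job order
    return [(total, tests) for total, _, tests in jobs]


# Hypothetical measured durations, in seconds.
example = {"checkout": 300, "search": 200, "login": 120, "profile": 80}
jobs = split_into_jobs(example, workers=2)
```

The wall-clock time of the whole run is then roughly the total of the longest job instead of the sum of all tests.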
I was taught that a regression test was a small (only enough to prove you didn't break anything with the introduction of a change or new modules) sample of the overall tests. However, this article by Ron Morrison and Grady Booch makes me think differently:
The desired strategy would be to bring each unit in one at a time, perform an extensive regression test, correct any defects and then proceed to the next unit.
The same document also says:
As soon as a small number of units are added, a test version is generated and "smoke tested," wherein a small number of tests are run to gain confidence that the integrated product will function as expected. The intent is neither to thoroughly test the new unit(s) nor to completely regression test the overall system.
When describing smoke testing, the authors say this:
It is also important that the Smoke Test perform a quick check of the entire system, not just the new component(s).
I've never seen "extensive" and "regression test" used together nor a regression test described as "completely regression test the overall system". Regression tests are supposed to be as light and quick as possible. And the definition of smoke test is what I learned a regression test was.
Did I misunderstand what I was taught? Was I taught incorrectly? Or are there multiple interpretations of "regression test"?
There are multiple interpretations. If you're only fixing a bug that affects one small part of your system then regression tests might only include a small suite of tests that exercise the class or package in question. If you're fixing a bug or adding a feature that has wider scope then your regression tests should have wider scope as well.
The "if it could possibly break, test it" rule of thumb applies here. If a change in Foo could affect Bar, then run the regressions for both.
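As a sketch of that rule of thumb: given a map of which modules depend on which, you can compute everything a change could reach and run the regressions for all of it. The module names and the DEPENDS_ON map below are hypothetical; a real project would derive the map from its build or import graph.

```python
# Hypothetical dependency map: module -> modules it depends on.
DEPENDS_ON = {
    "Foo": set(),
    "Bar": {"Foo"},
    "Baz": {"Bar"},
    "Qux": set(),
}


def affected(changed, depends_on):
    """Return the changed modules plus everything that depends on them,
    directly or indirectly."""
    result = set(changed)
    grew = True
    while grew:
        grew = False
        for module, deps in depends_on.items():
            if module not in result and deps & result:
                result.add(module)
                grew = True
    return result
```

With this map, a change in Foo selects the regressions for Foo, Bar and Baz, while a change in Qux selects only Qux.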
Regression tests just check to see if a change caused a previously passed test to fail. They can be run at any level (unit, integration, system). Reference.
I always took regression testing to mean any tests whose purpose was to ensure that existing functionality is not broken by new changes. That would not imply any constraint on the size of the test suite.
Regression is generally used to refer to the whole suite of tests. It is the last thing QA does before a release. It is used to show that everything that used to work still works, to the extent that that is possible to show. In my experience, it is generally a system-wide set of tests regardless of how small the change was (although small changes may not trigger a regression test).
Where I work, regression tests are standardized for each application at the end of each release. They are intended to test all functionality, but they are not designed to catch subtle bugs. So if you have a form that has various kinds of validation done on it, for example, a regression suite for that form would be to confirm that each type of validation gets done (field level and form level) and that correct information can be submitted. It is not designed to cover every single case (i.e. what if I leave field A blank? How about field B? it will just test one of them and assume the others work).
However, on the current project I'm working on, the regression tests are much more thorough, and we have noticed a reduction in the number of defects being raised during testing. Those two are not necessarily related, but we do notice it fairly consistently.
my understanding of the term 'regression testing' is:
unit tests are written to test features when the system is created
when bugs are discovered, more unit tests are written to reproduce the bug and verify that it has been corrected
a regression test runs the entire set of tests to prove that everything still works, including that no old bugs have reappeared [i.e. to prove that the code has not "regressed"]
in practice, it is best to always run all existing unit tests when changes are made. the only time i'd bother with a subset of tests is when the full unit test suite takes "too long" to run [where "too long" is fairly subjective]
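A small sketch of that workflow with Python's unittest module, using a made-up parse_quantity function: one test was written with the feature, the other was added when a (hypothetical) bug with blank input was reported, and both are run on every change.

```python
import unittest


def parse_quantity(text):
    """Hypothetical function under test: parse a quantity form field."""
    text = text.strip()
    if not text:
        # Fix for the reported bug: blank input used to raise ValueError.
        return 0
    return int(text)


class QuantityTests(unittest.TestCase):
    def test_plain_number(self):
        # Written when the feature was created.
        self.assertEqual(parse_quantity("3"), 3)

    def test_blank_input_regression(self):
        # Added when the bug was discovered; keeps it from reappearing.
        self.assertEqual(parse_quantity("   "), 0)


def run_all():
    """Regression run: execute the entire set of tests."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(QuantityTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```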
Start with what you are trying to accomplish. Then do what you need to do to accomplish that goal. And then use buzzword bingo to assign a word to what you actually do. Just like everyone else :-) Accuracy isn't all that important.
... regression test was a small (only enough to prove you didn't break anything with the introduction of a change or new modules) sample of the overall tests
If a small sample of tests is enough to prove that the system works, why do the rest of the tests even exist? And if you think you know that your change only affected a subset of functionality, then why do you need to test anything after making the change? Humans are fallible; nobody really knows if changing something breaks something else. IMO, if your tests are automated, re-run them all. And if they aren't automated, automate them. In the meantime, re-run whatever is automated.
In general, a subset of the feature tests for the new feature introduced in version X of a product becomes the basis of the regression tests for version X+1, X+2, and so on. Over time, you may reduce the time taken by the feature/regression tests of stable features which have not suffered from regressions. If a feature suffers from lots of regressions, then it may be beneficial to increase the emphasis on the feature.
I think that the article referring to 'extensive regression test' means run an extensive set of (individually simple) regression tests.