At the 45th Test Management Forum (TMF), held in London this week, topics ranged from a talk by Grid-Tools’ own Llyr Wyn Jones to an excellent workshop on “Non Functional Testing in an Agile world” run by Mark Gilby of Sopra Steria. Both led to engaging and lively discussions, and in both the same question was raised: why optimize and reduce the total number of test cases when you can just layer up cheap hardware to manage them?
The essence of these discussions was cost-benefit rather than the quality of testing, which was assumed. The question posed was: granting that you have test cases covering every possible scenario, does the time and effort of de-duplicating them while maintaining functional coverage outweigh the cost of simply storing and executing them all?
Below are some reasons why it might be worth the time and effort, most of which have been drawn from the discussion that followed.
Cost. Even assuming that it is relatively cheap to buy and maintain an ever-growing stack of hardware – and that is a big assumption – the testing itself remains expensive. Each test case has to be executed, and this drives up testing costs. If those tests are redundant, duplicate or broken, the money spent executing them could in effect have been spent elsewhere.
In one instance, Grid-Tools worked with a financial services company that had 150 test cases, providing 80% coverage. Although the functional coverage was higher than the industry average, the organization was over-testing by a factor of 6.5: just 20 optimized test cases would have provided the same coverage. The 130 extra test cases were being run by an outsourced provider at $200 per test, equating to $26,000 of wasted expenditure.
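For illustration, the arithmetic behind those figures can be sketched out as follows (the optimized count of 20 tests is inferred from the numbers quoted above, and the per-test rate is the quoted outsourced price):

```python
# Back-of-the-envelope calculation of the over-testing figures quoted above.
total_tests = 150        # test cases the organization was running
optimized_tests = 20     # tests needed for the same 80% coverage (inferred from the figures above)
cost_per_test = 200      # outsourced execution cost per test, in dollars

redundant_tests = total_tests - optimized_tests          # 130 extra test cases
overtesting_factor = redundant_tests / optimized_tests   # 6.5 times more tests than needed
wasted_spend = redundant_tests * cost_per_test           # $26,000 of wasted expenditure

print(f"{redundant_tests} redundant tests, {overtesting_factor:.1f}x over-testing, ${wasted_spend:,} wasted")
```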
Execution time. In addition to cost, it takes time to execute any duplicate or redundant tests. Currently, 70% of testing is manual, and testing takes up to half the time in the SDLC. One of the main reasons for this is the time spent on redundant, duplicate or broken test cases, which constitute 30% of the testing effort. The testing bottlenecks that arise from this lead to the familiar pain points of budget overruns, project delays, and defects slipping through the net.
Maintenance time. Automation was frequently raised in the discussion, and it is true that automated testing drastically reduces the time spent executing tests, mitigating the previous point. However, the test cases and automated test scripts still have to be maintained to prevent broken or invalid tests from causing automated test failures.
The more tests you have, the more likely projects are to over-run, as testers either have to check and update each one by hand, or “burn” the lot and start again. As Dorothy Graham observes, the time taken preparing and maintaining scripts often outweighs the time saved executing them.[1]
Quality. Even if we assume that test cases exist for every possible scenario, will they then be executed? In the discussion, concerns were raised about falling into a “trap” when keeping an ever-growing test case library: testers might run the easiest tests until the risk threshold has been satisfied, thereby giving false assurance. For example, of 10 tests, they might run the 8 easiest, when it is the remaining 2 that are critical. Not only does optimizing your test cases reduce the total number of tests, but it also enhances risk-based testing, as it increases the likelihood that business-critical or high-risk tests will be performed.
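To illustrate the point, here is a minimal sketch of risk-based selection; the test names, risk scores and effort values are purely hypothetical:

```python
# Minimal sketch of risk-based test selection: order tests by risk, not by ease of execution.
# The test names, risk scores and effort values are illustrative only.
tests = [
    {"name": "login_happy_path",       "risk": 2, "effort": 1},
    {"name": "password_reset",         "risk": 3, "effort": 2},
    {"name": "concurrent_withdrawals", "risk": 9, "effort": 8},  # hard to run, but business critical
    {"name": "end_of_day_settlement",  "risk": 8, "effort": 7},  # hard to run, but business critical
]

# "Easiest first" leaves the two critical tests until last, or never reaches them at all:
easiest_first = sorted(tests, key=lambda t: t["effort"])

# Risk-based ordering runs the business-critical tests first, whatever their effort:
riskiest_first = sorted(tests, key=lambda t: t["risk"], reverse=True)

print("easiest first: ", [t["name"] for t in easiest_first])
print("riskiest first:", [t["name"] for t in riskiest_first])
```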
The discussion granted that test cases existed for every possible scenario. This was fair enough, given the need to keep the discussion on schedule and on topic. But, considering the notions of risk, coverage and the observability of defects, a further case can be made for using “smart” or optimized test cases: if an organization simply lets test cases accumulate on an ad hoc, test-by-test basis, how can it know that every scenario or requirement is being tested?
Especially when test cases are written manually, there is no guarantee that maximum functional coverage will be achieved. Scenarios that have not occurred before, or that are simply not obvious to testers, will go untested. In fact, it is typical for testing to provide only 10-20% functional coverage, and it tends to be very “happy path” heavy, focused on expected results. In reality, however, it is the negative paths and unexpected results that cause a system to collapse. To guarantee that these are tested for, and to introduce sufficient measurability, a more formal, systematic approach to test case design is required.
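As a sketch of what a more systematic approach can look like (the parameters and values below are purely illustrative), test cases can be derived from a simple model of the inputs, so that negative paths are enumerated alongside the happy path and coverage becomes measurable:

```python
from itertools import product

# Illustrative sketch: derive candidate test cases from a model of the inputs rather than
# writing them ad hoc. The parameters and values are hypothetical.
account_status = ["active", "frozen", "closed"]        # includes the unhappy paths
amount = ["valid", "zero", "negative", "over_limit"]   # negative values as well as the happy path
channel = ["web", "mobile", "branch"]

# Enumerating every combination makes coverage measurable; an optimization step can then
# reduce the set while preserving that coverage.
candidate_cases = list(product(account_status, amount, channel))
print(f"{len(candidate_cases)} candidate test cases")  # 3 * 4 * 3 = 36
```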
A further assumption seemed to be that optimizing test cases is necessarily a slow and labor-intensive process. However, the initial discussion itself grew out of a comment that tooling exists with which users can de-duplicate test cases, automatically removing broken or redundant tests while maintaining 100% functional coverage. At least at the test case level, then, there is a good case for using “smart” test cases, and arguably the benefit outweighs the cost and effort of doing so.
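As a rough illustration of how such de-duplication can work (a simple greedy sketch, not the approach of any particular tool, and with a made-up test-to-requirement mapping), test cases can be treated as sets of covered requirements and reduced to a smaller set with the same coverage:

```python
# Treat de-duplication as a covering problem: keep the smallest set of test cases that still
# exercises every requirement. A simple greedy approximation; the mapping below is illustrative.
tests = {
    "TC01": {"R1", "R2"},
    "TC02": {"R2"},         # redundant: R2 is already covered by TC01
    "TC03": {"R3", "R4"},
    "TC04": {"R1", "R4"},   # redundant once TC01 and TC03 are kept
}

uncovered = set().union(*tests.values())
kept = []
while uncovered:
    # Pick the test that covers the most still-uncovered requirements.
    best = max(tests, key=lambda t: len(tests[t] & uncovered))
    kept.append(best)
    uncovered -= tests[best]

print(kept)  # ['TC01', 'TC03'] -- the same functional coverage with half the tests
```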
If you missed Llyr Wyn Jones’ talk “A Critique of Testing (Design)”, you can read his consideration of test case design methodologies here – A Critique of Testing.