This is a guest post by Matthew Heusser.
If you study the test literature, you’ll see that most of it doesn’t talk about knowing when testing is done. Testing is done when all the tests have run, goes the thinking. Yet test ideas occur as the testing proceeds. Often, five minutes of testing will net 10 additional minutes of test ideas.
So when do you stop? How much testing is enough?
In a graduate software engineering class, my old professor, Paul Jorgensen, showed his research on the triangle problem — a simple computer program that took three inputs and provided one output. Jorgensen found that every book and article on the subject had a different number of test ideas listed.
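If you haven’t seen it, the triangle problem is usually stated something like the sketch below: take three side lengths and classify the triangle they form. The exact formulation varies by author, and this Python version is only an illustration, not Jorgensen’s specification. Even a program this small invites a surprising number of test ideas: zero or negative sides, sides that fail the triangle inequality, non-numeric input, and so on.

```python
def classify_triangle(a: float, b: float, c: float) -> str:
    """Classify a triangle given its three side lengths (illustrative version)."""
    sides = sorted((a, b, c))
    if sides[0] <= 0:
        return "invalid"          # non-positive side length
    if sides[0] + sides[1] <= sides[2]:
        return "not a triangle"   # fails the triangle inequality
    if a == b == c:
        return "equilateral"
    if a == b or b == c or a == c:
        return "isosceles"
    return "scalene"
```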
What all those books missed was context. The “right” amount of testing depends on social and political factors as much as it does technical factors.
The same is true for your own testing. Let’s examine seven ways you can decide you are done testing.
1. You walked through all the test ideas you were given
The feature or story came with some sample scenarios, and you ran through all of them. That means you let someone else do the heavy lifting of test design and did only what they wrote. However, some of the best ideas in testing occur to the engaged tester as they are doing the work, so there may yet be more to do. But if you were asked to stay within a specific scope, finishing the list is an easy way to know you’re done.
2. The time for testing ran out
Someone bursts into the room, saying the software must ship. (OK, perhaps it was presented in a more subtle way.) You may have something new to test at this point, but you’re unable to do two things at once. Or maybe the “To test” column of the board is filling up. Like it or not, we have a finite amount of time for testing, so these may be good indicators that it’s time to stop.
3. You’re experiencing diminishing returns
This strategy is similar to the way you know that popcorn is done in a microwave oven: as time goes on, the pops get farther and farther apart. You stopped finding bugs a while ago, and the remaining test ideas are hard enough to run that you expect them to yield little. Even if they did turn up errors, those errors would be small, or would only happen under a complex and rare setup. It’s not worth continuing to test.
4. Testers are exhausted
This is related to diminishing returns, but it works better when the change is limited in scope (say, the team is only deploying a change to search) and rolls out to a limited set of power users through feature flags. If the impact is limited, you can say, “There are probably more bugs here, but we can let the customers find them and roll back if we have to.” Having this option is better than not having it.
5. Your test ideas are out of scope
Management looked at the list of test ideas you have and said, “We can live with it if those things fail.” Just ask them to put that in an email.
6. Remaining test ideas are below the cut line
For whatever reason (ideally a defensible one), management (or the team) decided how much time to invest in testing. This might be four hours. You organize the test ideas, prioritize them, and run them until time runs out. That’s a lot different from having a pile of test ideas on the floor in no order and stopping at an arbitrary point. If you can, get buy-in on that sorted order. Then, when bugs show up below the cut line, that is the result of management’s (or the team’s) decision.
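Here is a rough sketch of the idea in Python. Every field name, priority, and time estimate is made up for illustration; the point is simply that the ordering and the budget are explicit, so the cut line is visible to everyone.

```python
# Sort test ideas by priority, then run down the list until the agreed
# time budget is spent. Anything that doesn't fit falls below the cut line.
test_ideas = [
    {"name": "login happy path",      "priority": 1, "est_minutes": 30},
    {"name": "search with filters",   "priority": 2, "est_minutes": 60},
    {"name": "export to CSV",         "priority": 3, "est_minutes": 90},
    {"name": "legacy browser layout", "priority": 4, "est_minutes": 120},
]

budget_minutes = 4 * 60  # e.g., the four hours the team agreed to
spent = 0
for idea in sorted(test_ideas, key=lambda t: t["priority"]):
    if spent + idea["est_minutes"] > budget_minutes:
        print(f"below the cut line: {idea['name']}")
        continue
    spent += idea["est_minutes"]
    print(f"run now: {idea['name']}")
```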
7. Tests are below an agreed-on priority
Karen Johnson’s RCRCRC heuristic provides the tools to decide what to test for a specific build, instead of doing the same thing every time. That list asks you to consider changes that are recent, core, risky, configuration-sensitive, recently repaired, or chronic. Add custom fields to your test cases to help decide what to test for this build, perhaps with a weight, and sort for the things that score high enough to test. Over time, track which bugs are missed and improve your algorithm.
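As a sketch of what that weighting could look like, here is one way to score test cases against RCRCRC-style flags in Python. The weights, threshold, and field names are assumptions for illustration only; tune them against the bugs you actually miss.

```python
# Illustrative weights for RCRCRC-style flags on a test case.
WEIGHTS = {
    "recent": 3, "core": 5, "risky": 4,
    "config_sensitive": 2, "repaired": 3, "chronic": 2,
}

def score(test_case: dict) -> int:
    """Sum the weights of every RCRCRC flag set on a test case."""
    return sum(w for flag, w in WEIGHTS.items() if test_case.get(flag))

test_cases = [
    {"name": "checkout total", "core": True, "repaired": True},
    {"name": "profile avatar", "recent": True},
    {"name": "search ranking", "risky": True, "chronic": True},
]

THRESHOLD = 5  # only run tests that score high enough for this build
to_run = [t["name"] for t in test_cases if score(t) >= THRESHOLD]
print(to_run)  # ['checkout total', 'search ranking']
```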
Conclusion: Create a test strategy
One thing you’ll notice above is a recurring theme: published test ideas, transparent and available to everyone. Put them in your test case management tool. Call them charters and measure sessions or threads, and get as high-level or as detailed as you like — but have a list. Once you have that list, you can start to talk about coverage in a meaningful way.
Provide for the micro situation of when to stop testing a feature and the macro of when to stop testing the integration of features, and you may just find you have a test strategy that stands up to scrutiny under conditions of uncertainty and ambiguity. That’s a lot more than many organizations can say.
The list above aids decisions when trust is low. When trust is higher, you can simply stop when you know it’s time to be done.
P.S. Over a decade ago, Michael Bolton wrote a great piece on stopping heuristics: guidelines for when to stop testing. I did not review that article before writing this, as I was trying to bring a fresh perspective. Reading it now, I’d say that if you want a quick and dirty set of tripwires for your testing — “if you’ve stepped over this, you might be done” — check it out.
Matthew Heusser is the Managing Director of Excelon Development, with expertise in project management, development, writing, and systems improvement. And yes, he does software testing too.