The risks of coverage-directed test case generation

Gregory Gay, Matt Staats, Michael Whalen, Mats P.E. Heimdahl

Research output: Contribution to journal › Article › peer-review

84 Scopus citations


A number of structural coverage criteria have been proposed to measure the adequacy of testing efforts. In the avionics and other critical systems domains, test suites satisfying structural coverage criteria are mandated by standards. With the advent of powerful automated test generation tools, it is tempting to simply generate test inputs to satisfy these structural coverage criteria. However, while techniques to produce coverage-providing tests are well established, the effectiveness of such approaches in terms of fault detection ability has not been adequately studied. In this work, we evaluate the effectiveness of test suites generated to satisfy four coverage criteria through counterexample-based test generation and a random generation approach - where tests are randomly generated until coverage is achieved - contrasted against purely random test suites of equal size. Our results yield three key conclusions. First, coverage criteria satisfaction alone can be a poor indication of fault finding effectiveness, with inconsistent results between the seven case examples (and random test suites of equal size often providing similar - or even higher - levels of fault finding). Second, the use of structural coverage as a supplement - rather than a target - for test generation can have a positive impact, with random test suites reduced to a coverage-providing subset detecting up to 13.5 percent more faults than test suites generated specifically to achieve coverage. Finally, Observable MC/DC, a criterion designed to account for program structure and the selection of the test oracle, can - in part - address the failings of traditional structural coverage criteria, allowing for the generation of test suites achieving higher levels of fault detection than random test suites of equal size. 
These observations point to risks inherent in the increase in test automation in critical systems, and the need for more research in how coverage criteria, test generation approaches, the test oracle used, and system structure jointly influence test effectiveness.
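The second conclusion, using coverage as a supplement rather than a target, can be illustrated with a small sketch: generate a purely random suite first, then reduce it to a subset that preserves the coverage the full suite achieved. The abstract does not say how the reduction was performed, so the greedy strategy below is an assumption. The toy function `branches_hit`, its six branch obligations, and the helper `reduce_to_coverage` are hypothetical names for illustration only and are not taken from the study, which used criteria such as MC/DC on real systems.

```python
import random


def branches_hit(x):
    """Hypothetical system under test: report which branch obligations input x covers."""
    hit = set()
    hit.add("neg" if x < 0 else "nonneg")
    hit.add("even" if x % 2 == 0 else "odd")
    hit.add("large" if abs(x) > 50 else "small")
    return frozenset(hit)


def reduce_to_coverage(suite, covered_by):
    """Greedy reduction (an assumed strategy): repeatedly keep the test that
    covers the most still-uncovered obligations, until the kept subset covers
    everything the full suite covers."""
    remaining = set().union(*(covered_by(t) for t in suite))
    candidates = list(suite)
    kept = []
    while remaining:
        best = max(candidates, key=lambda t: len(covered_by(t) & remaining))
        kept.append(best)
        remaining -= covered_by(best)
        candidates.remove(best)
    return kept


# Purely random suite, generated without reference to coverage.
rng = random.Random(0)
suite = [rng.randint(-100, 100) for _ in range(200)]

# Coverage is applied afterwards, as a filter rather than as a generation target.
reduced = reduce_to_coverage(suite, branches_hit)

full_coverage = set().union(*(branches_hit(t) for t in suite))
reduced_coverage = set().union(*(branches_hit(t) for t in reduced))
print(len(suite), len(reduced), sorted(reduced_coverage))
```

The reduced suite provides the same coverage as the random suite it was drawn from, but its tests were not constructed to hit specific obligations. The sketch shows only the mechanics of the reduction and makes no claim about fault detection.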

Original language: English (US)
Article number: 7081779
Pages (from-to): 803-819
Number of pages: 17
Journal: IEEE Transactions on Software Engineering
Issue number: 8
State: Published - Aug 1 2015

Bibliographical note

Publisher Copyright:
© 2015 IEEE.


  • Software Testing
  • System Testing

