As modern software grows rapidly in scale, the reliability of software systems has become essential. To ease developers’ burden of writing unit tests by hand, test generation tools such as EvoSuite and Randoop have been proposed. Although these tools have been shown to automatically generate tests that achieve high coverage, the generated tests may be ineffective at detecting real faults. In particular, automatically generated tests may suffer from several problems (we call such tests problematic tests): (1) an incorrect oracle, (2) an unexpected exception or error, and (3) flakiness. We present a comprehensive study of EvoSuite on Defects4J and a detailed analysis of the reasons behind these automatically generated problematic tests. Our analysis identifies 528 problematic tests: 208 (39.4%) are caused by an incorrect oracle, 319 (60.4%) are caused by an unexpected exception or error, and one is flaky.
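To make the three categories concrete, the following is a minimal Java sketch, not the study's tooling or methodology. Every name in it (`ProblematicTestSketch`, `average`, `classify`, the three toy tests) is hypothetical, and the triage rule is an assumed heuristic: rerun a test a few times, call it flaky if outcomes differ, and otherwise separate assertion failures from other thrown errors, assuming the code under test is correct.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ProblematicTestSketch {
    // The three categories from the text, plus a label for tests that simply pass.
    enum Category { INCORRECT_ORACLE, UNEXPECTED_EXCEPTION, FLAKY, PASSING }

    enum Outcome { PASS, ASSERTION_FAILURE, ERROR }

    // Hypothetical method under test; assumed correct for this sketch.
    static int average(int a, int b) {
        return (int) (((long) a + b) / 2);
    }

    // Tiny stand-in for a JUnit assertion so the sketch has no dependencies.
    static void check(boolean condition) {
        if (!condition) throw new AssertionError("oracle violated");
    }

    static Outcome runOnce(Runnable test) {
        try {
            test.run();
            return Outcome.PASS;
        } catch (AssertionError e) {
            return Outcome.ASSERTION_FAILURE;   // the test's oracle disagreed with the code
        } catch (Throwable t) {
            return Outcome.ERROR;               // anything else thrown while running the test
        }
    }

    // Assumed heuristic: differing outcomes across reruns mean "flaky";
    // otherwise the stable outcome decides the category.
    static Category classify(Runnable test, int reruns) {
        Outcome first = runOnce(test);
        for (int i = 1; i < reruns; i++) {
            if (runOnce(test) != first) return Category.FLAKY;
        }
        switch (first) {
            case ASSERTION_FAILURE: return Category.INCORRECT_ORACLE;
            case ERROR:             return Category.UNEXPECTED_EXCEPTION;
            default:                return Category.PASSING;
        }
    }

    // Oracle value -1 is what an overflowing (a + b) / 2 would return,
    // so it is wrong with respect to the correct average above.
    static void incorrectOracleTest() {
        check(average(Integer.MAX_VALUE, Integer.MAX_VALUE) == -1);
    }

    // Throws NullPointerException before any assertion is evaluated.
    static void unexpectedExceptionTest() {
        String s = null;
        check(s.length() == 0);
    }

    // A call counter stands in for real nondeterminism (time, ordering, randomness).
    static final AtomicInteger calls = new AtomicInteger();

    static void flakyTest() {
        check(calls.getAndIncrement() % 2 == 0);
    }

    public static void main(String[] args) {
        System.out.println(classify(ProblematicTestSketch::incorrectOracleTest, 3));
        System.out.println(classify(ProblematicTestSketch::unexpectedExceptionTest, 3));
        System.out.println(classify(ProblematicTestSketch::flakyTest, 3));
    }
}
```

In practice a stable assertion failure could also indicate a real fault in the code under test rather than a wrong oracle; the sketch sidesteps that by assuming `average` is correct.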
Wed 29 May