Mining Historical Test Logs to Predict Bugs and Localize Faults in the Test Logs
Technical Track, Industry Program
Software testing is an integral part of modern software development. However, a test run can produce thousands of lines of logged output, which makes it difficult to find the cause of a fault in the logs. This problem is exacerbated by environmental failures that distract from product faults. In this paper, we present techniques that aim to capture the maximum number of product faults while flagging the minimum number of log lines for inspection.
We observe that the location of a fault should be contained in the lines of a failing test log. In contrast, a passing test log should not contain the lines related to a failure. Lines that occur in both a passing and a failing log introduce noise when attempting to find the fault in the failing log. We therefore remove from the failing log the lines that also occur in the passing log.
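The line-removal step can be illustrated with a minimal sketch. It assumes that log lines have already been normalized (for example, timestamps stripped) so that equal events compare as equal strings; the function name `remove_passing_lines` and the sample log lines are hypothetical and do not come from the paper.

```python
def remove_passing_lines(failing_log, passing_log):
    """Keep only the failing-log lines that never occur in the passing log.

    Both arguments are lists of already-normalized log lines (an assumption
    of this sketch). The order of the failing log is preserved.
    """
    passing = {line.strip() for line in passing_log}
    return [line for line in failing_log if line.strip() not in passing]


# Hypothetical example logs, invented for illustration only.
failing = ["boot ok", "link up", "timeout on port 7", "link up", "cell setup rejected"]
passing = ["boot ok", "link up", "tests done"]

remaining = remove_passing_lines(failing, passing)
print(remaining)
```

Using a set for the passing lines keeps the lookup cheap even when a log runs to thousands of lines.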
After removing these lines, we use information retrieval techniques to flag the most probable lines for investigation. We modify TF-IDF to identify the most relevant log lines related to past product failures. We then vectorize the logs and develop an exclusive version of KNN to identify which logs are likely to lead to product faults and which lines are the most probable indication of the failure.
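As a rough sketch of these two steps, the following uses plain IDF weighting and cosine-similarity nearest neighbours. It is not the authors' modified TF-IDF, and the abstract does not define "exclusive" KNN: the rule used here (predict a product fault if any of the k nearest historical logs led to one) is an assumption, as are all function names and sample data.

```python
import math
from collections import Counter


def idf_weights(logs):
    """Map each distinct line to log(N / number of logs containing it)."""
    n = len(logs)
    df = Counter()
    for log in logs:
        df.update(set(log))
    return {line: math.log(n / count) for line, count in df.items()}


def vectorize(log, idf):
    """Sparse vector: line -> term frequency * IDF (unseen lines weigh 0)."""
    tf = Counter(log)
    return {line: tf[line] * idf.get(line, 0.0) for line in tf}


def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(w * b.get(key, 0.0) for key, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_fault(new_log, history, k=3):
    """Assumed rule: predict a fault if ANY of the k nearest logs had one."""
    idf = idf_weights([log for log, _ in history])
    query = vectorize(new_log, idf)
    ranked = sorted(
        history,
        key=lambda item: cosine(query, vectorize(item[0], idf)),
        reverse=True,
    )
    return any(led_to_fault for _, led_to_fault in ranked[:k])


def flag_lines(new_log, idf, top_n=1):
    """Flag the top_n lines of new_log with the highest IDF (rarest lines)."""
    return sorted(set(new_log), key=lambda line: idf.get(line, 0.0), reverse=True)[:top_n]


# Hypothetical history of (filtered log lines, led to a product fault?).
history = [
    (["timeout on port 7", "retry link"], True),
    (["disk quota warning"], False),
    (["retry link", "license check slow"], False),
    (["timeout on port 7", "cell setup rejected"], True),
]
idf = idf_weights([log for log, _ in history])

print(predict_fault(["timeout on port 7", "cell setup rejected"], history, k=1))
print(flag_lines(["retry link", "cell setup rejected"], idf, top_n=1))
```

Lines that appear in few historical logs receive a high IDF and are flagged first, while lines common to many logs contribute little to either the similarity or the ranking.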
Our best approach, LogFaultFlagger, finds 89% of the total faults and flags less than 1% of the total failed log lines for inspection, which drastically outperforms previous work. Our tool provides daily predictions to Ericsson base station testers.
Wed 29 May (time zone: Eastern Time, US & Canada)
14:00 - 15:30 | DevOps and Logging | Software Engineering in Practice / Technical Track / Papers | at Mansfield / Sherbrooke | Chair(s): Diomidis Spinellis (Athens University of Economics and Business)
14:00, 20m, Talk | An Empirical Investigation of Incident Triage for Online Service Systems | SEIP, Industry Program, Software Engineering in Practice | Junjie Chen (Peking University), Xiaoting He (Microsoft), Qingwei Lin (Microsoft Research, China), Yong Xu (Microsoft, China), Hongyu Zhang (The University of Newcastle), Dan Hao (Peking University), Feng Gao (Microsoft), Zhangwei Xu (Microsoft), Yingnong Dang (Microsoft Azure), Dongmei Zhang (Microsoft Research, China)
14:20, 20m, Talk | Tools and Benchmarks for Automated Log Parsing | SEIP, Industry Program, Software Engineering in Practice | Jieming Zhu (Huawei Noah's Ark Lab), Shilin He (Chinese University of Hong Kong), Jinyang Liu (Sun Yat-Sen University), Pinjia He (Computer Science and Engineering, The Chinese University of Hong Kong), Qi Xie (Southwest Minzu University), Zibin Zheng (School of Data and Computer Science, Sun Yat-sen University), Michael Lyu
14:40, 20m, Talk | Mining Historical Test Logs to Predict Bugs and Localize Faults in the Test Logs | Technical Track, Industry Program
15:00, 20m, Talk | DLFinder: Characterizing and Detecting Duplicate Logging Code Smells | Technical Track, Industry Program | Zhenhao Li (Concordia University), Tse-Hsun (Peter) Chen (Concordia University), Jinqiu Yang, Weiyi Shang (Concordia University, Canada)
15:20, 10m, Talk | Discussion Period | Papers