Tools and Benchmarks for Automated Log Parsing (SEIP, Industry Program)
Logs are imperative in the development and maintenance of many software systems. They record detailed runtime information during system operation, allowing developers and support engineers to monitor their systems and diagnose anomalous behaviors and errors. The increasing scale and complexity of modern software systems, however, make the volume of logs explode, rendering traditional manual log inspection infeasible. Many recent studies and industrial tools therefore resort to powerful text search and machine learning-based analytics solutions. Due to the unstructured nature of logs, a crucial first step is to parse log messages into structured data for subsequent analysis. In recent years, automated log parsing has been widely studied in both academia and industry, producing a series of log parsers based on different techniques. To better understand the characteristics of these log parsers, in this paper we present a comprehensive evaluation study on automated log parsing and release the tools and benchmarks to researchers and practitioners. More specifically, we evaluate 13 log parsers on a total of 16 log datasets spanning distributed systems, supercomputers, operating systems, mobile systems, server applications, and standalone software. We report benchmarking results in terms of accuracy, robustness, and efficiency, which are of practical importance when deploying automated log parsing in production. We also share success stories and lessons learned from an industrial application at Huawei. We believe that our work can serve as a basis for, and provide valuable guidance to, future research and technology transfer in automated log parsing.
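To make the core idea concrete: log parsing turns a raw, unstructured message into an event template (the constant text) plus its variable parameters. The sketch below is a deliberately minimal illustration of this step, not a reimplementation of any of the 13 parsers evaluated in the paper; the masking rules and the example HDFS-style messages are assumptions for demonstration only.

```python
import re
from collections import defaultdict

# Masking rules applied in order: IP addresses first, so the generic
# number rule does not fragment them into "<*>.<*>.<*>.<*>".
MASKS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<*>"),  # IPv4 addresses
    (re.compile(r"0x[0-9a-fA-F]+"), "<*>"),               # hex values
    (re.compile(r"\d+"), "<*>"),                          # other numbers
]

def parse(message: str) -> str:
    """Return the event template for one raw log message."""
    template = message
    for pattern, token in MASKS:
        template = pattern.sub(token, template)
    return template

def group_by_template(messages):
    """Group raw messages under their extracted event templates."""
    groups = defaultdict(list)
    for message in messages:
        groups[parse(message)].append(message)
    return dict(groups)

logs = [
    "Received block blk_3587 of size 67108864 from 10.251.42.84",
    "Received block blk_9212 of size 67108864 from 10.251.43.21",
]
# Both messages collapse into one template:
# "Received block blk_<*> of size <*> from <*>"
print(group_by_template(logs))
```

Real parsers such as those benchmarked in the paper infer templates from the data itself (e.g., by clustering or parse trees) rather than relying on hand-written masks, which is what makes the problem, and the evaluation of accuracy and robustness, non-trivial.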
Wed 29 May (Eastern Time, US & Canada)
14:00 - 15:30 | DevOps and Logging (Software Engineering in Practice / Technical Track / Papers) at Mansfield / Sherbrooke. Chair(s): Diomidis Spinellis, Athens University of Economics and Business
14:00, 20m Talk | An Empirical Investigation of Incident Triage for Online Service Systems (SEIP, Industry Program). Software Engineering in Practice. Junjie Chen (Peking University), Xiaoting He (Microsoft), Qingwei Lin (Microsoft Research, China), Yong Xu (Microsoft, China), Hongyu Zhang (The University of Newcastle), Dan Hao (Peking University), Feng Gao (Microsoft), Zhangwei Xu (Microsoft), Yingnong Dang (Microsoft Azure), Dongmei Zhang (Microsoft Research, China)
14:20, 20m Talk | Tools and Benchmarks for Automated Log Parsing (SEIP, Industry Program). Software Engineering in Practice. Jieming Zhu (Huawei Noah's Ark Lab), Shilin He (Chinese University of Hong Kong), Jinyang Liu (Sun Yat-Sen University), Pinjia He (Computer Science and Engineering, The Chinese University of Hong Kong), Qi Xie (Southwest Minzu University), Zibin Zheng (School of Data and Computer Science, Sun Yat-sen University), Michael Lyu
14:40, 20m Talk | Mining Historical Test Logs to Predict Bugs and Localize Faults in the Test Logs (Technical Track, Industry Program). Technical Track.
15:00, 20m Talk | DLFinder: Characterizing and Detecting Duplicate Logging Code Smells (Technical Track, Industry Program). Technical Track. Zhenhao Li (Concordia University), Tse-Hsun (Peter) Chen (Concordia University), Jinqiu Yang, Weiyi Shang (Concordia University, Canada)
15:20, 10m Talk | Discussion Period (Papers)