Characterizing and Detecting Duplicate Logging Code Smells
Software logs are widely used by developers to assist in various tasks. Despite the importance of logs, prior studies show that there is no industrial standard on how to write logging statements. Recent research on logs often considers the appropriateness of a log only as an individual item (e.g., a single logging statement), whereas logs are typically analyzed in tandem. In this paper, we study duplicate logging statements, i.e., logging statements that share the same static text message. Such duplication in the text message is a potential indication of logging code smells, which may affect developers’ understanding of the dynamic view of the system. We manually studied over 3K duplicate logging statements and their surrounding code in four large-scale open source systems and uncovered five patterns of duplicate logging code smells. For each instance of a problematic code smell, we contacted developers to verify our manual study result. We integrated our manual study result and the developers’ feedback into our automated static analysis tool, DLFinder, which automatically detects problematic duplicate logging code smells. We evaluated DLFinder on the manually studied systems and on two additional systems. Of the instances that we reported to developers (combining the results of DLFinder and our manual analysis) and that were then fixed, DLFinder is able to detect over 85%.
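To make the notion of a duplicate logging statement concrete, the following is a minimal sketch that groups logging statements by their static text message and reports messages that occur more than once. It is not DLFinder's implementation: the regular expression, the function name `find_duplicate_log_messages`, and the Java-like sample are assumptions made purely for illustration, and a real static analysis would work on parsed code rather than on text patterns.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Hypothetical pattern (an assumption, not taken from DLFinder): matches calls
# such as LOG.info("...") or logger.error("...", e) and captures only the
# leading string literal, which is treated as the static text message.
LOG_CALL = re.compile(
    r'\b(?:log|logger)\.(?:trace|debug|info|warn|error|fatal)\(\s*"((?:[^"\\]|\\.)*)"',
    re.IGNORECASE,
)


def find_duplicate_log_messages(source: str) -> Dict[str, List[int]]:
    """Map each static message used by two or more logging statements
    to the 1-based line numbers where those statements appear."""
    locations = defaultdict(list)
    for line_no, line in enumerate(source.splitlines(), start=1):
        for match in LOG_CALL.finditer(line):
            locations[match.group(1)].append(line_no)
    # Keep only messages that are shared by at least two statements.
    return {msg: lines for msg, lines in locations.items() if len(lines) > 1}


# Hypothetical Java-like snippet used only to exercise the sketch.
SAMPLE = '''\
try {
    connect();
} catch (IOException e) {
    LOG.error("Failed to connect", e);
} catch (TimeoutException e) {
    LOG.error("Failed to connect", e);
}
LOG.info("Connected");
'''

duplicates = find_duplicate_log_messages(SAMPLE)
print(duplicates)
```

In this sample, both `catch` blocks log the same static text, so the message alone does not reveal which exception occurred. Whether such a duplicate is actually problematic depends on the surrounding code, which is why the study examines the context of each duplicate rather than the text match alone.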