Graph-based Mining of In-the-Wild, Fine-grained, Semantic Code Change Patterns
Technical Track
Existing approaches that detect repetitive code changes by relying on syntactic similarity cannot effectively detect semantic change patterns. In this work, we introduce a novel graph-based mining approach, CPatMiner, which is capable of detecting semantic code change patterns from a large number of open-source repositories by capturing dependencies between fine-grained change elements.
We evaluated CPatMiner by mining change patterns in a diverse corpus of 5,000+ open-source projects from GitHub with 170,000+ developers. We used three complementary methods. First, we sent the mined patterns to the authors of the changes and received 108 responses. 70% of respondents recognized those patterns as their meaningful frequent changes, 79% of respondents even named the patterns, and 44% wanted IDEs to automate such repetitive changes. The mined patterns belong to various activities: adaptive (9%), perfective (20%), corrective (35%) and preventive (36%). Second, we compared CPatMiner with the state-of-the-art AST-based technique and found that CPatMiner detects 2.1x more meaningful patterns. Third, we used CPatMiner to search for patterns in a corpus of 88 GitHub projects with longer histories, consisting of 164M SLOC. It constructed 322K fine-grained change graphs containing 3M nodes and detected 17K change patterns, which provide unique insights into the practice of change patterns among individuals and teams. We found that a large percentage (75%) of the patterns from individual developers are commonly shared with others, and that this also holds true for teams. Moreover, we found that the patterns spread widely over time. Thus, we call for a community-based change pattern database to provide important resources for novel applications.
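To make the idea of mining patterns from dependencies between fine-grained change elements more concrete, here is a deliberately simplified sketch. It is not CPatMiner's algorithm: the edge representation, the labels, and the helper `mine_frequent_edges` are all hypothetical, and it only counts single dependency edges that recur across changes, whereas a real graph-based miner would look for larger recurring subgraphs.

```python
from collections import Counter

# Assumption: each change is modeled as a set of labeled dependency edges
# (source change element, dependency kind, target element).
# These labels are illustrative only and are not taken from the paper.
def mine_frequent_edges(change_graphs, min_support):
    """Return edges that occur in at least `min_support` distinct change graphs."""
    support = Counter()
    for edges in change_graphs:
        # Count each edge at most once per change graph.
        for edge in set(edges):
            support[edge] += 1
    return {edge: count for edge, count in support.items() if count >= min_support}


change_graphs = [
    # Two changes that both add a null check guarding a method call.
    {("add NullCheck", "data", "Variable"),
     ("add IfStatement", "control", "MethodCall")},
    {("add NullCheck", "data", "Variable"),
     ("add IfStatement", "control", "MethodCall"),
     ("delete Return", "control", "MethodCall")},
    # An unrelated change.
    {("add Loop", "control", "Assignment")},
]

patterns = mine_frequent_edges(change_graphs, min_support=2)
for edge, count in sorted(patterns.items()):
    print(edge, count)
```

In this toy corpus, the two edges describing the added null check are shared by the first two changes and are reported as recurring, while the edges that appear in only one change are filtered out by the support threshold.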
Fri 31 May
Displayed time zone: Eastern Time (US & Canada)
11:00 - 12:30 | Mining Software Changes and Patterns | Technical Track / Demonstrations / Papers | at Centre-Ville | Chair(s): Ayşe Başar (Ryerson University)
11:00 | 20m | Talk | The List is the Process: Reliable Pre-Integration Tracking of Commits on Mailing Lists | Technical Track | Ralf Ramsauer (OTH Regensburg), Daniel Lohmann (Leibniz Universität Hannover), Wolfgang Mauerer (OTH Regensburg / Siemens AG)
11:20 | 20m | Talk | Graph-based Mining of In-the-Wild, Fine-grained, Semantic Code Change Patterns | Technical Track | Hoan Nguyen (Iowa State University), Tien N. Nguyen (University of Texas at Dallas), Danny Dig (School of EECS at Oregon State University), Son Nguyen (The University of Texas at Dallas), Hieu Tran (The University of Texas at Dallas), Michael Hilton (Carnegie Mellon University, USA)
11:40 | 20m | Talk | Coming: a Tool for Mining Change Pattern Instances from Git Commits | Demonstrations
12:00 | 20m | Talk | PatchNet: A Tool for Deep Patch Classification | Demonstrations | Thong Hoang (Singapore Management University, Singapore), Julia Lawall (Inria/LIP6), Richard J Oentaryo (McLaren Applied Technologies, Singapore), Yuan Tian (Queens University, Kingston, Canada), David Lo (Singapore Management University)
12:20 | 10m | Talk | Discussion Period | Papers