A Neural Model for Generating Natural Language Summaries of Program Subroutines
Technical Track
Source code summarization (creating natural language descriptions of source code behavior) is a rapidly growing research topic with applications to automatic documentation generation, program comprehension, and software maintenance. Traditional techniques relied on heuristics and templates built manually by human experts. Recently, data-driven approaches based on neural machine translation have largely overtaken template-based systems. However, nearly all of these techniques depend on programs having good internal documentation; without clear identifier names, the models fail to create good summaries. In this paper, we present a neural model that combines words from code with code structure from an abstract syntax tree (AST). Unlike previous approaches, our model processes each data source as a separate input, which allows the model to learn code structure independently of the text in code. This helps our approach provide coherent summaries in many cases even when no internal documentation is provided. We evaluate our technique on a dataset we created from 2.1 million Java methods. We find improvement over two baseline techniques from the software engineering (SE) literature and one from the natural language processing (NLP) literature.
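As a rough illustration of what "each data source as a separate input" could look like at the data-preparation stage, the Python sketch below builds two independent sequences for one method: the words found in the code text, and a structure-only sequence derived from a tree. Everything here is an assumption made for illustration and is not taken from the paper: the function names `code_words` and `flatten_structure`, the hand-written `toy_ast` (a real pipeline would obtain the tree from a Java parser), and the bracketed traversal used to flatten it.

```python
import re


def code_words(source):
    """Split source text into lowercase words, breaking up
    camelCase and snake_case identifiers (hypothetical helper)."""
    words = []
    for token in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", source):
        # Break each identifier into acronym runs, capitalized or
        # lowercase words, and digit runs; underscores are dropped.
        for part in re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", token):
            words.append(part.lower())
    return words


def flatten_structure(node):
    """Flatten a (node_type, children) tuple tree into a bracketed
    sequence of node types only; no identifier text is emitted
    (hypothetical helper, one of many possible traversals)."""
    node_type, children = node
    sequence = ["(", node_type]
    for child in children:
        sequence.extend(flatten_structure(child))
    sequence.extend([")", node_type])
    return sequence


# Hand-written toy tree standing in for a parser's output (assumption).
toy_ast = (
    "MethodDeclaration",
    [
        ("Parameter", []),
        ("ReturnStatement", [("MethodInvocation", [])]),
    ],
)

# Two separate inputs for the same method.
text_input = code_words("String getUserName(int userId)")
structure_input = flatten_structure(toy_ast)
```

Because `structure_input` contains only node types, it is unchanged if every identifier in the method is renamed to something uninformative, which is the property the abstract relies on when no internal documentation is available.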
Fri 31 May
Displayed time zone: Eastern Time (US & Canada)
11:00 - 12:30 | Machine Learning in Static Analysis | Papers / Technical Track at Place du Canada | Chair(s): Na Meng (Virginia Tech)
11:00 | 20m | Talk | Training Binary Classifiers as Data Structure Invariants | Technical Track | Facundo Molina (Universidad Nacional de Rio Cuarto, Argentina); Renzo Degiovanni (SnT, University of Luxembourg); Pablo Ponzio (Dept. of Computer Science FCEFQyN, University of Rio Cuarto); Germán Regis (Universidad Nacional de Río Cuarto); Nazareno Aguirre (Dept. of Computer Science FCEFQyN, University of Rio Cuarto); Marcelo F. Frias (Dept. of Software Engineering, Instituto Tecnológico de Buenos Aires)
11:20 | 20m | Talk | Graph Embedding based Familial Analysis of Android Malware using Unsupervised Learning | Technical Track | Ming Fan (MOEKLINNS Lab, Department of Computer Science and Technology, Xi'an Jiaotong University, 710049, China); Xiapu Luo; Jun Liu (MOEKLINNS Lab, Department of Computer Science and Technology, Xi'an Jiaotong University, 710049, China); Meng Wang (University of Bristol, UK); Chunyin Nong; Qinghua Zheng (MOEKLINNS Lab, Department of Computer Science and Technology, Xi'an Jiaotong University, 710049, China); Ting Liu (MOEKLINNS Lab, Department of Computer Science and Technology, Xi'an Jiaotong University, 710049, China)
11:40 | 20m | Talk | A Novel Neural Source Code Representation based on Abstract Syntax Tree | Technical Track | Jian Zhang (Beihang University); Xu Wang (Beihang University); Hongyu Zhang (The University of Newcastle); Hailong Sun (Beihang University); Kaixuan Wang (Beihang University); Xudong Liu (Beihang University) | Pre-print
12:00 | 20m | Talk | A Neural Model for Generating Natural Language Summaries of Program Subroutines | Technical Track | Alexander LeClair (University Of Notre Dame); Siyuan Jiang (Eastern Michigan University); Collin McMillan
12:20 | 10m | Talk | Discussion Period | Papers