Supporting Code Search with Context-Aware, Analytics-Driven, Effective Query Reformulation
Software developers often have difficulty preparing appropriate queries for code search. A recent finding suggests that developers fail to choose the right search keywords from an issue report 88% of the time. Thus, despite a number of earlier studies, automatic reformulation of queries for code search remains an open problem that warrants further investigation. In this dissertation work, we hypothesize that code search can be improved by adopting appropriate term weighting, context-awareness, and data analytics in query reformulation. We ask three research questions to evaluate the hypothesis and conduct six studies to answer them. Our studies improve code search by incorporating (1) novel, appropriate keyword selection algorithms, (2) context-awareness, (3) crowdsourced knowledge from Stack Overflow, and (4) large-scale data analytics into the query reformulation process.
I am pursuing my PhD in Computer Science/Software Engineering at the University of Saskatchewan, Canada, under the supervision of Dr. Chanchal Roy. My area of interest is software change automation. In particular, I am interested in automated query reformulation for concept/feature/bug localization and code search, and in automated support for code review activities. I develop tools and techniques for automating these activities. In my work, I generally employ Information Retrieval (IR), Static Code Analysis, Machine Learning (ML), Software Repository Mining, and Large-scale Data Analytics. Over the last few years, my work has been accepted at ICSE (A*), ESEC/FSE (A*), ASE (A), ICSME (A), MSR (A), EMSE (A), and SANER. I have been awarded the Geddes Award 2017 (PhD Student of the Year), the NSERC Industry Engage Grant 2016, the ACM CAPS Award 2017, the Saskatchewan Innovation & Opportunity Scholarship 2017, and the Dean’s Scholarship (2014-2017) for outstanding academic and research excellence in the PhD program.