BugSwarm: Mining and Continuously Growing a Dataset of Reproducible Failures and Fixes (ICSE 2019 - Technical Track) - International Conference on Software Engineering 2019 in Montreal, Canada

Sat 25 - Fri 31 May 2019 Montreal, QC, Canada

Who

Naji Dmeiri, David A Tomassi, Yichen Wang, Antara Bhowmick, Yen-Chuan Liu, Prem Devanbu, Bogdan Vasilescu, Cindy Rubio-González

Track

ICSE 2019 Technical Track

Time Zone

The program is currently displayed in (GMT-04:00) Eastern Time (US & Canada).

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Wed 29 May 2019 16:00 - 16:20 at Viger - SE Datasets, Research Infrastructure, and Methodology. Chair(s): Rashina Hoda

Abstract

Fault-detection, localization, and repair methods are vital to software quality, but it is difficult to evaluate their generality, applicability, and current effectiveness. Large, diverse, realistic datasets of durably-reproducible faults and fixes are vital to good experimental evaluation of approaches to software quality, but they are difficult and expensive to assemble and keep current. Modern continuous-integration (CI) approaches, like Travis-CI, which are widely used, fully configurable, and executed within custom-built containers, promise a path toward much larger defect datasets. If we can identify and archive failing and subsequent passing runs, the containers will provide a substantial assurance of durable future reproducibility of build and test. Several obstacles, however, must be overcome to make this a practical reality. We describe BugSwarm, a toolset that navigates these obstacles to enable the creation of a scalable, diverse, realistic, continuously growing set of durably reproducible failing and passing versions of real-world, open-source systems. The BugSwarm toolkit has already gathered 3,091 fail-pass pairs, in Java and Python, all packaged within fully reproducible containers. Furthermore, the toolkit can be run periodically to detect fail-pass activities, thus growing the dataset continually.
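
The idea of identifying "failing and subsequent passing runs" can be illustrated with a small sketch. This is not BugSwarm's implementation; it only assumes that a CI build history is available as a chronologically ordered list of records. The `Build` record, its field names, and the `find_fail_pass_pairs` helper are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Build:
    """Hypothetical CI build record; the field names are assumptions."""
    build_id: int
    branch: str
    status: str  # assumed to be "passed" or "failed"


def find_fail_pass_pairs(builds: Iterable[Build]) -> List[Tuple[Build, Build]]:
    """Return (failed, passed) pairs of consecutive builds on the same branch.

    Builds are assumed to arrive in chronological order.
    """
    last_on_branch: Dict[str, Build] = {}
    pairs: List[Tuple[Build, Build]] = []
    for build in builds:
        previous = last_on_branch.get(build.branch)
        # A pair is a failed build immediately followed, on the same
        # branch, by a passed build.
        if (previous is not None
                and previous.status == "failed"
                and build.status == "passed"):
            pairs.append((previous, build))
        last_on_branch[build.branch] = build
    return pairs


# Made-up build history, for illustration only.
history = [
    Build(1, "master", "passed"),
    Build(2, "master", "failed"),
    Build(3, "feature", "failed"),
    Build(4, "master", "passed"),
    Build(5, "feature", "failed"),
    Build(6, "feature", "passed"),
]

pairs = find_fail_pass_pairs(history)
pair_ids = [(failed.build_id, passed.build_id) for failed, passed in pairs]
print(pair_ids)  # [(2, 4), (5, 6)]
```

A real pipeline would additionally have to archive each pair in a container and confirm that both builds reproduce, which this sketch does not attempt.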

Link to Preprint

http://web.cs.ucdavis.edu/~rubio/includes/icse19.pdf

Naji Dmeiri

University of California, Davis

David A Tomassi

University of California, Davis

Yichen Wang

University of California, Davis

Antara Bhowmick

University of California, Davis

Yen-Chuan Liu

University of California, Davis

Prem Devanbu

University of California

United States

Bogdan Vasilescu

Carnegie Mellon University

United States

Cindy Rubio-González

University of California, Davis

Session Program

Wed 29 May
Displayed time zone: Eastern Time (US & Canada)

16:00 - 18:00	SE Datasets, Research Infrastructure, and Methodology [Journal-First Papers / New Ideas and Emerging Results / Demonstrations / Papers / Technical Track] at Viger. Chair(s): Rashina Hoda (The University of Auckland)

16:00 20m Talk	BugSwarm: Mining and Continuously Growing a Dataset of Reproducible Failures and Fixes [Technical Track]. Naji Dmeiri (University of California, Davis), David A Tomassi (University of California, Davis), Yichen Wang (University of California, Davis), Antara Bhowmick (University of California, Davis), Yen-Chuan Liu (University of California, Davis), Prem Devanbu (University of California), Bogdan Vasilescu (Carnegie Mellon University), Cindy Rubio-González (University of California, Davis)
16:20 20m Talk	DefeXts: A Curated Dataset of Reproducible Real-World Bugs for Modern JVM Languages [Demonstrations]. Samuel Benton (The University of Texas at Dallas), Ali Ghanbari (The University of Texas at Dallas), Lingming Zhang
16:40 10m Talk	Open Collaborative Data – using OSS principles to share data in SW engineering [NIER: New Ideas and Emerging Results]. Per Runeson (Lund University)
16:50 10m Talk	Leveraging Small Software Engineering Data Sets with Pre-trained Neural Networks [NIER: New Ideas and Emerging Results]. Andrea Janes, Romain Robbes (Free University of Bozen-Bolzano)
17:00 20m Talk	ActionNet: Vision-based Workflow Action Recognition From Programming Screencasts [Technical Track]. Dehai Zhao, Zhenchang Xing (Australian National University), Chunyang Chen (Monash University), Xin Xia (Monash University), Guoqiang Li (Shanghai Jiao Tong University)
17:20 10m Talk	The ABC of Software Engineering Research [Journal-First Papers]. Klaas-Jan Stol (University College Cork and Lero, Ireland), Brian Fitzgerald (Lero - The Irish Software Research Centre and University of Limerick)
17:30 10m Talk	Mining Plausible Hypotheses from the Literature via Meta-Analysis [NIER: New Ideas and Emerging Results]. Vladimir Ivanov, Giancarlo Succi (Innopolis University), Jooyong Yi (UNIST, Ulsan National Institute of Science and Technology)
17:40 10m Talk	Analyzing Families of Experiments in SE: a Systematic Mapping Study [Journal-First Papers]. Adrian Santos Parrilla, Omar Gomez (Escuela Superior Politecnica de Chimborazo Riobamba), Natalia Juristo (Universidad Politecnica de Madrid)
17:50 10m Talk	Discussion Period [Papers]