CRADLE: Cross-Backend Validation to Detect and Localize Bugs in Deep Learning Libraries (ICSE 2019 - Technical Track) - International Conference on Software Engineering 2019 in Montreal, Canada

Sat 25 - Fri 31 May 2019 Montreal, QC, Canada

Who

Hung Viet Pham, Thibaud Lutellier, Weizhen Qi, Lin Tan

Track

ICSE 2019 Technical Track

When

Fri 31 May 2019 14:00 - 14:20 at Place du Canada - Testing of AI Systems Chair(s): Marija Mikic

Abstract

Deep learning (DL) systems are widely used in domains including aircraft collision avoidance systems, Alzheimer’s disease diagnosis, and autonomous driving cars. Despite the requirement for high reliability, DL systems are difficult to test. Existing DL testing work focuses on testing the DL models, not the implementations (e.g., DL software libraries) of the models. One key challenge of testing DL libraries is the difficulty of knowing the expected output of DL libraries given an input instance. Fortunately, there are multiple implementations of the same DL algorithms in different DL libraries. Thus, we propose CRADLE, a new approach that focuses on finding and localizing bugs in DL software libraries. CRADLE (1) performs cross-implementation inconsistency checking to detect bugs in DL libraries, and (2) leverages anomaly propagation tracking and analysis to localize faulty functions in DL libraries that cause the bugs. We evaluate CRADLE on three libraries (TensorFlow, CNTK, and Theano), 11 datasets (including ImageNet, MNIST, and KGS Go game), and 30 pre-trained models. CRADLE detects 12 bugs and 104 unique inconsistencies, and highlights functions relevant to the causes of inconsistencies for all 104 unique inconsistencies.
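The two steps named in the abstract, cross-implementation inconsistency checking and anomaly propagation tracking, can be illustrated with a small sketch. This is not CRADLE's implementation: the abstract does not state the paper's distance metrics or thresholds, so the mean absolute difference, the `threshold` value, the toy layer functions, and the names `cross_check` and `run_layers` are all assumptions made for illustration. Each "backend" here is simply a list of plain Python functions standing in for the same model executed by different DL libraries.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Layer = Callable[[Vector], Vector]


def mean_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute element-wise difference (an assumed metric, not the paper's)."""
    if len(a) != len(b):
        raise ValueError("outputs must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def run_layers(layers: Sequence[Layer], x: Vector) -> List[Vector]:
    """Run x through the layers and record every intermediate output."""
    trace = []
    for layer in layers:
        x = layer(x)
        trace.append(x)
    return trace


def cross_check(
    backend_a: Sequence[Layer],
    backend_b: Sequence[Layer],
    x: Vector,
    threshold: float = 1e-3,  # hypothetical tolerance
) -> Tuple[bool, int]:
    """Return (inconsistent, suspect_layer_index); the index is -1 if consistent."""
    trace_a = run_layers(backend_a, x)
    trace_b = run_layers(backend_b, x)
    diffs = [mean_abs_diff(a, b) for a, b in zip(trace_a, trace_b)]

    # Detection: compare only the final outputs of the two backends.
    if diffs[-1] <= threshold:
        return False, -1

    # Localization: find the layer that introduces the largest jump in deviation.
    jumps = [diffs[0]] + [diffs[i] - diffs[i - 1] for i in range(1, len(diffs))]
    suspect = max(range(len(jumps)), key=jumps.__getitem__)
    return True, suspect


# Toy layers standing in for library functions (hypothetical).
def scale(v: Vector) -> Vector:
    return [2.0 * x for x in v]


def relu(v: Vector) -> Vector:
    return [max(0.0, x) for x in v]


def buggy_relu(v: Vector) -> Vector:
    # Deliberately wrong clamp value to simulate a library bug.
    return [max(0.1, x) for x in v]


def shift(v: Vector) -> Vector:
    return [x + 1.0 for x in v]


backend_a = [scale, relu, shift]
backend_b = [scale, buggy_relu, shift]
result = cross_check(backend_a, backend_b, [-1.0, 0.5, 2.0])
print(result)
```

In this toy setup the two backends disagree on the final output, and the largest jump in deviation occurs at the second layer (index 1), which is where the simulated bug was planted. A real comparison would run an identical pre-trained model on actual library backends and would need metrics suited to classification and regression outputs.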

Link to Preprint

https://hvpham.github.io/files/CRADLE-icse19.pdf

Hung Viet Pham

University of Waterloo

Canada

Thibaud Lutellier

Weizhen Qi

University of Science and Technology of China

Lin Tan

Purdue University

United States