ICSE 2019
Sat 25 - Fri 31 May 2019 Montreal, QC, Canada

Distributed systems often face transient errors and localized component degradation and failure. Verifying that the overall system remains healthy in the face of such failures is challenging. At Netflix, we have built a platform for automatically generating and executing chaos experiments, which check how well the production system can handle component failures and slowdowns. This paper describes the platform and our experiences operating it.

Wed 29 May
Times are displayed in time zone: (GMT-04:00) Eastern Time (US & Canada)

11:00 - 12:30: Controlled Experiments of Production Software (Papers / Software Engineering in Practice) at St-Denis / Notre-Dame
Chair(s): Yvonne Dittrich (IT University of Copenhagen, Denmark)
11:00 - 11:20
Talk
Software Engineering in Practice
Aleksander Fabijan (Microsoft), Pavel Dmitriev (Outreach.io), Helena Holmström Olsson (Malmö University), Jan Bosch (Chalmers University of Technology, Sweden), Lukas Vermeer (Booking.com), Dylan Lewis (Intuit)
11:20 - 11:40
Talk
Software Engineering in Practice
Tong Xia (Microsoft), Sumit Bhardwaj (Microsoft), Pavel Dmitriev (Outreach.io), Aleksander Fabijan (Microsoft)
11:40 - 12:00
Talk
Software Engineering in Practice
Paul Luo Li (Microsoft), Pavel Dmitriev (Outreach.io), Huibin Mary Hu (Microsoft), Xiaoyu Chai (Microsoft), Zoran Dimov (Microsoft), Brandon Paddock (Microsoft), Ying Li (Microsoft), Alex Kirshenbaum (Microsoft), Irina Niculescu (Microsoft), Taj Thoresen (Microsoft)
12:00 - 12:20
Talk
Software Engineering in Practice
Ali Basiri (Netflix), Lorin Hochstein (Netflix), Nora Jones (Netflix), Haley Tucker (Netflix)
Pre-print
12:20 - 12:30
Talk
Papers