Designing and debugging correct distributed systems is notoriously difficult. The correctness of a distributed system is largely determined by its handling of failure scenarios. The sequence of events leading to a bug can be long and complex, and is likely to include message reorderings and failures. On single-node systems, engineers have long relied on interactive debuggers to step through an execution of the program on various inputs, but until now they have lacked the ability to easily simulate failure scenarios and control the order in which messages are delivered.
Oddity is a graphical, interactive debugger for distributed systems. It brings the power of traditional step-through debugging—fine-grained control and observation of a program as it executes—to distributed systems. Like step-through debuggers, it also enables exploratory testing, in which engineers examine the behavior of their systems without a specific bug in mind in order to better understand them. Programmers can directly control message and failure interleaving. Oddity supports time travel, allowing a developer to explore multiple branching executions of a system within a single debugging session. Above all, Oddity encourages distributed systems thinking: rather than assuming the normal case and attaching failure handling as an afterthought, systems must be developed around the possibility of failure.
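To make the idea of controlled interleavings and time travel concrete, here is a minimal Python sketch. It is not Oddity's implementation or API; the names `State`, `deliver`, `drop`, and `log_of` are hypothetical and exist only for illustration. The sketch assumes system state is kept as immutable snapshots, so that going back to an earlier point and taking a different branch amounts to reusing an older snapshot.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

# Hypothetical model, not Oddity's API.
# A message in flight: (sender, recipient, payload).
Message = Tuple[str, str, str]


@dataclass(frozen=True)
class State:
    """Immutable snapshot: each node's received-payload log plus undelivered messages."""
    logs: Tuple[Tuple[str, Tuple[str, ...]], ...]
    pending: Tuple[Message, ...]


def initial(nodes: Iterable[str], pending: Iterable[Message]) -> State:
    """Build a starting snapshot with empty logs and the given in-flight messages."""
    return State(tuple((n, ()) for n in nodes), tuple(pending))


def deliver(state: State, index: int) -> State:
    """Deliver the pending message at `index`, returning a new snapshot."""
    _sender, recipient, payload = state.pending[index]
    logs = tuple(
        (name, log + (payload,)) if name == recipient else (name, log)
        for name, log in state.logs
    )
    pending = state.pending[:index] + state.pending[index + 1:]
    return State(logs, pending)


def drop(state: State, index: int) -> State:
    """Simulate message loss: remove the pending message without delivering it."""
    return State(state.logs, state.pending[:index] + state.pending[index + 1:])


def log_of(state: State, node: str) -> Tuple[str, ...]:
    """Return the payloads a node has received so far, in delivery order."""
    return dict(state.logs)[node]


# Two clients race to write to one server.
start = initial(["server"], [("a", "server", "x=1"), ("b", "server", "x=2")])

# Branch 1: deliver a's message first, then b's.
branch1 = deliver(deliver(start, 0), 0)

# "Time travel": return to `start` and explore the opposite order.
branch2 = deliver(deliver(start, 1), 0)

# Branch 3: a's message is lost, and only b's arrives.
branch3 = deliver(drop(start, 0), 0)
```

Because every operation returns a new snapshot and leaves `start` untouched, all three branches can be explored side by side in one session. A real debugger would also need to run node handlers, timers, and crash and restart events, which this sketch leaves out.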
Students in UW's CSE 452 are using Oddity in class for Spring 2018!
Check out our video demo!
You can clone, build, extend, fork, and experiment with Oddity on GitHub.
Oddity is led by Doug Woos at the University of Washington.