Stream processing frameworks are a popular solution for handling computation in latency-sensitive applications. However, next generation applications such as augmented/virtual reality, autonomous driving, and Industry 4.0, have tighter latency constraints and produce much larger amounts of data. To address the real-time nature and high bandwidth usage of new applications, edge computing provides an extension to the cloud infrastructure through a hierarchy of datacenters located between the edge devices and the cloud.
Using existing stream processing frameworks out-of-the-box in this hierarchy is not efficient, as they adopt a stop-the-world approach during reconfiguration. This approach can lead to stalls on the order of several minutes. While costly reconfiguration was not an issue in the cloud where these events are extremely rare, edge computing is a more dynamic environment, characterized by frequent reconfigurations.
In this paper, we propose Shepherd, a new stream processing framework for edge computing. Shepherd minimizes downtime during application reconfiguration, with almost no impact on data processing latency. Our experiments show that, compared to Apache Storm, Shepherd reduces application downtime from several minutes to a few tens of milliseconds.