Aten: A Dispatcher for Big Data Applications in Heterogeneous Systems

Published in HPCS 2018, 2018

Recommended citation: R. R. De Souza Jr, Paulo; Matteussi, Kassiano; C. S. Anjos, Julio; D. D. dos Santos, Jobe; R. Geyer, Claudio; da Silva Veith, Alexandre

[Paper] [BIBTEX]

Abstract

Stream Processing Engines (SPEs) have to support high data ingestion to ensure quality and efficiency for the end user or system administrator. The data flow processed by an SPE fluctuates over time and requires real-time or near-real-time adjustments to the resource pool (network, memory, CPU, and others). This scenario leads to the problem known as skewed data production, caused by non-uniform incoming flow at specific points in the environment, which slows down applications through network bottlenecks and inefficient load balancing. This work proposes Aten as a solution to overcome unbalanced data flows processed by Big Data Stream applications in heterogeneous systems. Aten manages data aggregation and data streams within message queues, adopting different algorithms as strategies to partition the data flow across all available computational resources. The paper presents preliminary results indicating that it is possible to maximize throughput while also providing low latency for SPEs.
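
To make the idea of partitioning a data flow across heterogeneous resources concrete, the following is a minimal sketch of one possible strategy, a capacity-weighted round-robin. It is not taken from the paper and does not represent Aten's actual algorithms; the function names (`weighted_round_robin`, `partition`), the worker names, and the weights are all hypothetical and chosen only for illustration.

```python
def weighted_round_robin(weights):
    """Yield worker names forever, in proportion to their weights.

    `weights` maps a (hypothetical) worker name to a positive integer
    that stands in for its relative capacity. This is the "smooth"
    weighted round-robin variant, which interleaves picks rather than
    sending long bursts to the same worker.
    """
    current = {worker: 0 for worker in weights}
    total = sum(weights.values())
    while True:
        # Every worker accumulates credit equal to its weight.
        for worker, weight in weights.items():
            current[worker] += weight
        # The worker with the most credit receives the next message
        # and pays back the total weight.
        chosen = max(current, key=current.get)
        current[chosen] -= total
        yield chosen


def partition(messages, weights):
    """Assign each message to a per-worker queue according to the weights."""
    queues = {worker: [] for worker in weights}
    picker = weighted_round_robin(weights)
    for message in messages:
        queues[next(picker)].append(message)
    return queues


if __name__ == "__main__":
    # Assumption: the "fast" node can absorb three times the load of "slow".
    result = partition(list(range(8)), {"fast": 3, "slow": 1})
    for worker, queue in result.items():
        print(worker, queue)
```

In this sketch the weights are static. A dispatcher reacting to a fluctuating flow would presumably update them from observed metrics such as queue depth or latency, which is outside what this example shows.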