Online Scheduling for Shuffle Grouping in Distributed Stream Processing Systems Research Paper - LINA - Equipe Gestion de Données Distribuées
Conference paper. Year: 2016

Abstract

Shuffle grouping is a technique used by stream processing frameworks to share input load among parallel instances of stateless operators. With shuffle grouping, each tuple of a stream can be assigned to any available operator instance, independently of any previous assignment. A common approach to implementing shuffle grouping is to adopt a Round-Robin policy, a simple solution that fares well as long as the tuple execution time is almost the same for all the tuples. However, such an assumption rarely holds in real cases, where execution time strongly depends on tuple content. As a consequence, parallel stateless operators within stream processing applications may experience unpredictable imbalance that, in the end, causes an undesirable increase in tuple completion times. In this paper we propose Online Shuffle Grouping (OSG), a novel approach to shuffle grouping aimed at reducing the overall tuple completion time. OSG estimates the execution time of each tuple, enabling proactive and online scheduling of input load to the target operator instances. Sketches are used to efficiently store the otherwise large amount of information required to schedule incoming load. We provide a probabilistic analysis and illustrate, through both simulations and a running prototype, its impact on stream processing applications.
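
The contrast the abstract draws between Round-Robin and estimate-driven scheduling can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the OSG algorithm from the paper: the abstract does not specify the sketch layout or the scheduling rule, so the count-min-style `ExecTimeSketch`, the greedy least-loaded rule, the immediate feedback of true execution times, and all names and parameters below are hypothetical.

```python
import hashlib


class ExecTimeSketch:
    """Hypothetical count-min-style sketch of per-key execution times.

    Two matrices share the same hash functions: one counts how often a
    cell was hit, the other accumulates the execution time seen there.
    """

    def __init__(self, rows=4, cols=64):
        self.rows, self.cols = rows, cols
        self.freq = [[0] * cols for _ in range(rows)]
        self.time = [[0.0] * cols for _ in range(rows)]

    def _col(self, row, key):
        # One hash function per row, derived by salting with the row index.
        data = f"{row}:{key}".encode()
        digest = hashlib.blake2b(data, digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.cols

    def update(self, key, exec_time):
        for r in range(self.rows):
            c = self._col(r, key)
            self.freq[r][c] += 1
            self.time[r][c] += exec_time

    def estimate(self, key, default=1.0):
        # Use the non-empty cell with the fewest hits (least collisions)
        # and return the mean execution time recorded in it.
        best = None
        for r in range(self.rows):
            c = self._col(r, key)
            f = self.freq[r][c]
            if f > 0 and (best is None or f < best[0]):
                best = (f, self.time[r][c])
        return default if best is None else best[1] / best[0]


def round_robin(tuples, k):
    """Assign tuple i to instance i mod k; return the true load per instance."""
    loads = [0.0] * k
    for i, (_, exec_time) in enumerate(tuples):
        loads[i % k] += exec_time
    return loads


def greedy_by_estimate(tuples, k, sketch):
    """Send each tuple to the instance with the smallest estimated backlog."""
    est_loads = [0.0] * k   # what the scheduler believes
    true_loads = [0.0] * k  # what the instances actually execute
    for key, exec_time in tuples:
        target = min(range(k), key=est_loads.__getitem__)
        est_loads[target] += sketch.estimate(key)
        true_loads[target] += exec_time
        # Simplification: the true time is fed back immediately.
        sketch.update(key, exec_time)
    return true_loads


if __name__ == "__main__":
    # Alternating expensive and cheap tuples: the worst case for Round-Robin
    # with two instances, since one instance receives every expensive tuple.
    stream = [("heavy", 10.0), ("light", 1.0)] * 100
    print("round-robin loads:", round_robin(stream, 2))
    print("estimate-driven loads:", greedy_by_estimate(stream, 2, ExecTimeSketch()))
```

In a real deployment the scheduler would learn execution times from delayed feedback sent by the operator instances rather than from the tuple itself, so the estimates would lag behind the stream. The simulation ignores that delay to keep the comparison between the two policies easy to follow.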
Main file
main.pdf (824.08 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01397658, version 1 (16-11-2016)

Cite

Nicoló Rivetti, Emmanuelle Anceaume, Yann Busnel, Leonardo Querzoni, Bruno Sericola. Online Scheduling for Shuffle Grouping in Distributed Stream Processing Systems Research Paper. ACM/IFIP/USENIX Middleware 2016, ACM/IFIP/USENIX, Dec 2016, Trento, Italy. ⟨10.1145/2988336.2988347⟩. ⟨hal-01397658⟩
969 views
359 downloads