Comparing Distributed Termination Detection Algorithms for Task-Based Runtime Systems on HPC platforms - Joint Laboratory on Extreme Scale Computing
Journal article, International Journal of Networking and Computing, Year: 2022

Abstract

This paper revisits distributed termination detection algorithms in the context of High-Performance Computing (HPC) applications. We introduce an efficient variant of the Credit Distribution Algorithm (CDA) and compare it to the original algorithm (HCDA) as well as to its two primary competitors: the Four Counters algorithm (4C) and the Efficient Delay-Optimal Distributed algorithm (EDOD). We analyze the behavior of each algorithm on simplified task-based kernels and show the superiority of CDA in terms of the number of control messages. We then compare implementations of these algorithms in a task-based runtime system, PaRSEC, and show the advantages and limitations of each approach in practice.
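
For readers unfamiliar with credit-based termination detection, the general idea can be illustrated with a minimal single-process simulation. This sketch follows the textbook credit-recovery scheme (a total credit of 1 is split across outgoing messages and returned to a detector when work completes); it is not the variant introduced in the paper and does not use the PaRSEC API. The function name `run_credit_termination`, its parameters, and the task-tree workload are all hypothetical choices made for illustration.

```python
from collections import deque
from fractions import Fraction


def run_credit_termination(num_procs=4, depth=3, fanout=2):
    """Simulate a tree of tasks and detect termination by credit recovery.

    Assumed textbook scheme: the root task owns a credit of 1, every
    outgoing message carries half of the sender's current credit, and a
    task returns whatever credit it still holds to the detector when done.
    """
    recovered = Fraction(0)   # credit collected by the detector
    control_messages = 0      # credit-return messages sent to the detector
    tasks_run = 0
    premature = False         # True if detection fired with work pending
    # Each in-flight message is (destination rank, remaining depth, credit).
    in_flight = deque([(0, depth, Fraction(1))])

    while in_flight:
        rank, remaining, credit = in_flight.popleft()
        tasks_run += 1
        if remaining > 0:
            for child in range(fanout):
                credit /= 2  # keep one half, attach the other to the message
                dest = (rank * fanout + child + 1) % num_procs
                in_flight.append((dest, remaining - 1, credit))
        # The task is finished: hand the remaining credit back.
        recovered += credit
        control_messages += 1
        # The detector declares termination only when the full credit is back.
        if recovered == 1 and in_flight:
            premature = True

    return recovered == 1, tasks_run, control_messages, premature


if __name__ == "__main__":
    print(run_credit_termination())
```

Exact fractions are used so that the recovered credit sums to exactly 1 without rounding error. In this naive form every task sends its own credit-return message; reducing the number of such control messages is the kind of cost the abstract refers to.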
Main file
ijnc22.pdf (860.33 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03920388, version 1 (03-01-2023)

Identifiers

  • HAL Id: hal-03920388, version 1

Cite

George Bosilca, Aurélien Bouteiller, Thomas Hérault, Valentin Le Fèvre, Yves Robert, et al. Comparing Distributed Termination Detection Algorithms for Task-Based Runtime Systems on HPC platforms. International Journal of Networking and Computing, 2022, 12 (1). ⟨hal-03920388⟩