Monitoring the behavior of parallel programs: how to be scalable?

Abstract : It is easy to find errors and inefficient parts of a sequential program by using a standard debugger/profiler, but there is no such tool in a parallel environment. The only way to study the race conditions of a parallel program is to execute it and collect data about its execution. The programmer can then use the generated trace files and specialized tuning tools to visualize and improve the behavior of the program: idle processors, communications, etc. The problem in large parallel systems is that these tools have to deal with an enormous amount of data. The classical approach to monitoring and trace analysis (i.e. sequential, event-driven, post-mortem monitoring) is no longer realistic. To avoid this bottleneck, we introduced PIMSY (Parallel Implementation of a Monitoring System). The main idea of PIMSY is to leave the trace data distributed across the parallel storage and to distribute the program (or programs) that process the trace data.
Keywords : Monitoring, Scalability
Document type :
Reports

https://hal-lara.archives-ouvertes.fr/hal-02101853
Contributor : Colette Orange
Submitted on : Wednesday, April 17, 2019 - 9:08:07 AM
Last modification on : Sunday, May 19, 2019 - 1:20:45 AM

File

RR1993-22.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02101853, version 1

Citation

Jean-Yves Peterschmitt, Bernard Tourancheau, Xavier-Francois Vigouroux. Monitoring the behavior of parallel programs: how to be scalable?. [Research Report] LIP RR-1993-22, Laboratoire de l'informatique du parallélisme. 1993, 2+15p. ⟨hal-02101853⟩