Abstract : The multicore revolution is underway. Classical algorithms have to be revisited in order to take hierarchical memory layout into account. In this paper, we aim to minimize the number of cache misses incurred during the execution of the matrix product kernel on a multicore processor, and we show how to achieve the best possible trade-off between shared and distributed caches.
https://hal-lara.archives-ouvertes.fr/hal-02102826
Contributor : Colette Orange
Submitted on : Wednesday, April 17, 2019 - 4:33:52 PM
Last modification on : Wednesday, November 20, 2019 - 2:52:53 AM