J. M. Anderson and M. S. Lam, Global optimizations for parallelism and locality on scalable parallel machines, ACM SIGPLAN Notices, vol.28, issue.6, pp.112-125, 1993.

T. Blank, The MasPar MP-1 architecture, Compcon Spring, pp.20-24, 1990.

L. E. Cannon, A cellular computer to implement the Kalman filter algorithm, 1969.

P. Christy, Software to support massively parallel computing on the MasPar MP-1, Compcon Spring, pp.29-33, 1990.

A. Darte and Y. Robert, Constructive methods for scheduling uniform loop nests, IEEE Trans. Parallel Distributed Systems, vol.5, issue.8, pp.814-822, 1994.
URL : https://hal.archives-ouvertes.fr/hal-00857083

A. Darte and Y. Robert, Mapping uniform loop nests onto distributed memory architectures, Parallel Computing, vol.20, pp.679-710, 1994.
URL : https://hal.archives-ouvertes.fr/hal-00857077

P. Feautrier, Some efficient solutions to the affine scheduling problem, part I, one-dimensional time, Int. J. Parallel Programming, vol.21, issue.5, pp.313-348, 1992.

P. Feautrier, Towards automatic distribution, Parallel Processing Letters, vol.4, issue.3, pp.233-244, 1994.

G. Hajós, Über einfache und mehrfache Bedeckung des n-dimensionalen Raumes mit einem Würfelgitter, Math. Zeitschrift, vol.47, pp.427-467, 1942.

C. H. Koelbel, D. B. Loveman, R. S. Schreiber, G. L. Steele et al., The High Performance Fortran Handbook, 1994.

S. Y. Kung, VLSI array processors, 1988.

H. J. Lee and J. A. B. Fortes, Data distribution independent parallel programs for matrix multiplication, 1994.

H. J. Lee and J. A. B. Fortes, Modular mappings of rectangular algorithms, 1994.

H. J. Lee and J. A. B. Fortes, On the injectivity of modular mappings, Application Specific Array Processors, pp.237-247, 1994.

D. I. Moldovan and J. A. Fortes, Partitioning and mapping algorithms into fixed-size systolic arrays, IEEE Transactions on Computers, vol.35, issue.1, pp.1-12, 1986.

M. Newman, Integral Matrices, 1972.

M. O'Boyle and G. A. Hedayat, Data alignment: Transformations to reduce communications on distributed memory architectures, Scalable High-performance Computing Conference SHPCC-92, pp.366-371, 1992.

F. P. Preparata and J. E. Vuillemin, Area-time optimal VLSI networks for multiplying matrices, Information Processing Letters, vol.11, issue.2, pp.77-80, 1980.

P. Quinton and Y. Robert, Systolic Algorithms and Architectures, 1989.

W. Shang and J. A. B. Fortes, Time optimal linear schedules for algorithms with uniform dependencies, IEEE Transactions on Computers, vol.40, issue.6, pp.723-742, 1991.
DOI : 10.1109/12.90251

M. E. Wolf and M. S. Lam, A loop transformation theory and an algorithm to maximize parallelism, IEEE Trans. Parallel Distributed Systems, vol.2, issue.4, pp.452-471, 1991.
DOI : 10.1109/71.97902