A. Agarwal, D. A. Kranz, and V. Natarajan, Automatic partitioning of parallel loops and data arrays for distributed shared-memory multiprocessors, IEEE Trans. Parallel Distributed Systems, vol.6, issue.9, pp.943-962, 1995.

R. Andonov, H. Bourzoufi, and S. Rajopadhye, Two-dimensional orthogonal tiling: from theory to practice, International Conference on High Performance Computing (HiPC), pp.225-231, 1996.

R. Andonov and S. Rajopadhye, Optimal tiling of two-dimensional uniform recurrences, Journal of Parallel and Distributed Computing

P. Boulet, A. Darte, T. Risset, and Y. Robert, (Pen)-ultimate tiling?, Integration, the VLSI Journal, vol.17, pp.33-51, 1994.

P.-Y. Calland and T. Risset, Precise tiling for uniform loop nests, Application Specific Array Processors ASAP 95, pp.330-337, 1995.

P.-Y. Calland, J. Dongarra, and Y. Robert, Tiling with limited resources, Application Specific Systems, Architectures, and Processors, ASAP'97, pp.229-238, 1997.
URL : https://hal.archives-ouvertes.fr/inria-00073506

Y. Chen, S. Wang, and C. Wang, Tiling nested loops into maximal rectangular blocks, Journal of Parallel and Distributed Computing, vol.35, issue.2, pp.123-155, 1996.

J. Choi, J. Demmel, I. Dhillon, J. Dongarra, S. Ostrouchov et al., ScaLAPACK: A portable linear algebra library for distributed memory computers - design issues and performance (LAPACK Working Note #95), vol.97, pp.1-15, 1996.

Ph. Chrétienne, Task scheduling over distributed memory machines, Parallel and Distributed Algorithms, pp.165-176, 1989.

A. Darte, G.-A. Silber, and F. Vivien, Combining retiming and scheduling techniques for loop parallelization and loop tiling, Parallel Processing Letters, 1997.
URL : https://hal.archives-ouvertes.fr/hal-02102115

J. J. Dongarra and D. W. Walker, Software libraries for linear algebra computations on high performance computers, SIAM Review, vol.37, issue.2, pp.151-180, 1995.

K. Högstedt, L. Carter, and J. Ferrante, Determining the idle time of a tiling, Principles of Programming Languages, pp.160-173, 1997.

F. Irigoin and R. Triolet, Supernode partitioning, Proc. 15th Annual ACM Symp. Principles of Programming Languages, pp.319-329, 1988.

A. W. Lim and M. S. Lam, Maximizing parallelism and minimizing synchronization with affine transforms, Proceedings of the 24th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 1997.

N. Manjikian and T. S. Abdelrahman, Scheduling of wavefront parallelism on scalable shared memory multiprocessor, Proceedings of the International Conference on Parallel Processing ICPP 96, 1996.

H. Ohta, Y. Saito, M. Kainaga, and H. Ono, l DOACROSS loop nests, 1995 International Conference on Supercomputing, pp.270-279, 1995.

P. Pacheco, Parallel programming with MPI, 1997.

J. Ramanujam and P. Sadayappan, Tiling multidimensional iteration spaces for multicomputers, Journal of Parallel and Distributed Computing, vol.16, issue.2, pp.108-120.

R. Schreiber and J. J. Dongarra, Automatic blocking of nested loops, The University of Tennessee, 1990.

S. Sharma, C. Huang, and P. Sadayappan, On data dependence analysis for compiling programs on distributed-memory machines, ACM SIGPLAN Notices, vol.28, issue.1, 1993.

M. E. Wolf and M. S. Lam, A data locality optimizing algorithm, SIGPLAN Conference on Programming Language Design and Implementation, pp.30-44, 1991.

M. E. Wolf and M. S. Lam, A loop transformation theory and an algorithm to maximize parallelism, IEEE Trans. Parallel Distributed Systems, vol.2, issue.4, pp.452-471, 1991.