Team: ALSOC
Departure date: 26/06/2015
Research supervision: Nathalie DRACH-TEMAM Co-supervision: Karine HEYDEMANN
Dynamic optimization of data-flow task-parallel applications for large-scale NUMA systems
Within the last decade, microprocessor development reached a point at which higher clock rates and more complex micro-architectures became less energy-efficient, pushing power consumption and energy density beyond reasonable limits. As a consequence, the industry has shifted to more energy-efficient multi-core designs that integrate multiple processing units (cores) on a single chip. The number of cores is expected to grow exponentially, and future systems are expected to integrate thousands of processing units. To provide sufficient memory bandwidth in these systems, main memory is physically distributed over multiple memory controllers, resulting in non-uniform memory access (NUMA). Past research has identified programming models based on fine-grained, dependent tasks as a key technique to unleash the parallel processing power of massively parallel general-purpose computing architectures. However, the execution of task-parallel programs on NUMA architectures and the dynamic optimizations needed to mitigate NUMA effects have received little attention. In this thesis, we explore the main factors affecting the performance and data locality of task-parallel programs and propose a set of transparent, portable and fully automatic on-line mechanisms that map tasks to cores and data to memory controllers in order to improve data locality and performance. Placement decisions are based on information about point-to-point data dependences, which is readily available in the run-time systems of modern task-parallel programming frameworks. The experimental evaluation of these techniques is conducted using our implementation in the run-time of the OpenStream language and a set of high-performance scientific benchmarks. Finally, we designed and implemented Aftermath, a tool for performance analysis and debugging of task-parallel applications and run-times.
Defense: 25/06/2015 - 10:30 - Site Jussieu 25-26/105
Jury members:
Mr. Jean-François MÉHAUT, Professor, Université Joseph Fourier / CEA, [Reviewer]
Mr. Nacho NAVARRO, Associate Professor, Universitat Politècnica de Catalunya / Barcelona Supercomputing Center, [Reviewer]
Mr. Albert COHEN, Research Director, INRIA
Mr. Benoît DUPONT DE DINECHIN, CTO, Kalray S.A.
Ms. Nathalie DRACH-TÉMAM, Professor, Université Pierre et Marie Curie
Ms. Karine HEYDEMANN, Associate Professor, Université Pierre et Marie Curie
Mr. Raymond NAMYST, Professor, Université de Bordeaux
Mr. Antoniu POP, Lecturer, The University of Manchester
Mr. Pierre SENS, Professor, Université Pierre et Marie Curie
Mr. Marc SHAPIRO, Research Director, INRIA / LIP6