GT Pequan


A communication-avoiding sparse direct solver

Thursday, July 5, 2018
Speaker(s): Rich Vuduc (Georgia Institute of Technology)

Abstract: This talk describes several techniques to improve the strong scalability of a (right-looking, supernodal) sparse direct solver for distributed memory systems by reducing and hiding both internode and intranode communication.
To reduce internode communication, we present a communication-avoiding 3D sparse LU factorization algorithm. The "3D" refers to the use of a logical three-dimensional arrangement of MPI processes, and the method combines data redundancy with elimination-tree parallelism. The 3D algorithm can be shown to reduce asymptotic communication costs by a factor of $O(\sqrt{\log n})$ and latency costs by a factor of $O(\log n)$ for planar sparse matrices arising from finite-element discretizations of two-dimensional PDEs. For the non-planar case, it reduces communication and latency costs by a constant factor.
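To make the "logical three-dimensional arrangement of MPI processes" concrete, here is a minimal illustrative sketch (not the solver's actual code) of how a linear MPI rank might be mapped onto a hypothetical Px x Py x Pz process grid, where each z-layer could hold a redundant copy of part of the matrix and work on its own subtrees of the elimination tree:

```python
# Illustrative sketch only: map a linear MPI rank onto a logical
# 3D process grid. The grid shape and mapping convention here are
# hypothetical, chosen for clarity rather than taken from the talk.

def rank_to_grid(rank, px, py, pz):
    """Return (i, j, k) coordinates of `rank` in a px*py*pz grid.

    k indexes the "depth" dimension: in a 3D factorization scheme,
    each layer k would work on a different set of elimination-tree
    subtrees, trading replicated data for reduced communication.
    """
    assert 0 <= rank < px * py * pz, "rank out of range for grid"
    k, rem = divmod(rank, px * py)   # which 2D layer the rank lives in
    i, j = divmod(rem, py)           # row/column within that 2D layer
    return i, j, k

# Example: 8 ranks arranged as a 2 x 2 x 2 grid.
layout = {r: rank_to_grid(r, 2, 2, 2) for r in range(8)}
```

In a real MPI code this mapping would typically be delegated to the library's Cartesian topology routines rather than computed by hand; the point of the sketch is only the layering idea.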
On-node, we propose a novel technique, called HALO, targeted at heterogeneous architectures consisting of multicore hosts and manycore co-processors such as GPUs or Xeon Phi. The name HALO is shorthand for highly asynchronous lazy offload, which refers to the way the method combines aggressive use of asynchrony with accelerated offload, lazy updates, and data shadowing (à la halo or ghost zones), all of which serve to hide and reduce communication, whether to local memory, across the network, or over PCIe. The overall hybrid solver achieves a speedup of up to 3x on a variety of realistic test problems in single-node and multi-node configurations.
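The "lazy update" and "asynchronous offload" ingredients can be illustrated with a small sketch (hypothetical, not the HALO implementation): instead of applying each update immediately, updates are queued and flushed in batches on a background worker, so the host can keep factoring while the batch is applied, much as an accelerator stream would overlap transfers and compute:

```python
# Illustrative sketch only: the lazy-update / asynchronous-offload
# pattern. A thread pool stands in for an accelerator offload stream;
# the class and method names are invented for this example.
from concurrent.futures import ThreadPoolExecutor

class LazyUpdater:
    def __init__(self):
        self.pending = []                               # queued (key, delta) updates
        self.pool = ThreadPoolExecutor(max_workers=1)   # the "offload stream"
        self.future = None

    def queue(self, key, delta):
        """Lazy: record the update instead of applying it now."""
        self.pending.append((key, delta))

    def flush_async(self, state):
        """Asynchronous: apply the queued batch on the background worker."""
        batch, self.pending = self.pending, []
        def apply_batch():
            for key, delta in batch:
                state[key] = state.get(key, 0) + delta
        self.future = self.pool.submit(apply_batch)     # returns immediately

    def wait(self):
        """Synchronize before reading results that depend on the batch."""
        if self.future is not None:
            self.future.result()

# Usage: queue several updates, flush them asynchronously, then sync.
state = {}
u = LazyUpdater()
u.queue("block_a", 1)
u.queue("block_a", 2)
u.flush_async(state)   # host could do other factorization work here
u.wait()
```

Batching the updates is what creates the slack that asynchrony can hide; applying them one by one would serialize host and device again.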
Bio: Richard (Rich) Vuduc is an Associate Professor at the Georgia Institute of Technology (Georgia Tech), in the School of Computational Science and Engineering. His research lab, The HPC Garage (@hpcgarage on Twitter and Instagram), is interested in high-performance computing, with an emphasis on algorithms, performance analysis, and performance engineering. He is a recipient of a DARPA Computer Science Study Group grant; an NSF CAREER award; a collaborative Gordon Bell Prize in 2010; the Lockheed-Martin Dean's Award for Teaching Excellence (2013); and Best Paper Awards at the SIAM Conference on Data Mining (SDM, 2012) and the IEEE Parallel and Distributed Processing Symposium (IPDPS, 2015), among others. He has also served as his department's Associate Chair and Director of its graduate programs. External to Georgia Tech, he currently serves as Chair of the SIAM Activity Group on Supercomputing (2018-2020) and co-chaired the Technical Papers Program of the Supercomputing (SC) Conference in 2016. He received his Ph.D. in Computer Science from the University of California, Berkeley, and was a postdoctoral scholar in the Center for Advanced Scientific Computing at Lawrence Livermore National Laboratory.

Marc.Mezzarobba (at)