LIP6 CNRS Sorbonne Université Tremplin Carnot Interfaces

GT Pequan

Reproducible and Accurate BLAS for ExaScale Computing

Speaker: Roman Iakymchuk (Pequan)
As Exascale computing (10^18 operations per second) is likely to be reached within a decade, getting accurate results in floating-point arithmetic on such computers will be a challenge. A second challenge will be the reproducibility of results, meaning bitwise identical floating-point results across multiple runs of the same code. Reproducibility is hard to guarantee because floating-point operations are non-associative and parallel computers schedule work dynamically, so the order of operations can change from run to run.
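To make the reproducibility problem concrete, here is a minimal Python sketch (not taken from the talk). The helper names `naive_sum` and `chunked_sum` are hypothetical; `chunked_sum` only imitates how a parallel reduction might split the same data into a different number of partial sums, assuming IEEE 754 double precision with round-to-nearest-even.

```python
def naive_sum(values):
    """Left-to-right floating-point sum using plain '+'."""
    acc = 0.0
    for v in values:
        acc += v
    return acc


def chunked_sum(values, n_chunks):
    """Imitate a parallel reduction: n_chunks interleaved partial sums,
    combined at the end. Only the grouping changes, not the data."""
    partials = [naive_sum(values[i::n_chunks]) for i in range(n_chunks)]
    return naive_sum(partials)


# Non-associativity: the grouping alone changes the result.
a, b, c = 1e16, -1e16, 1.0
left = (a + b) + c    # 0.0 + 1.0 -> 1.0
right = a + (b + c)   # b + c rounds back to -1e16, so the sum is 0.0

# Same input, different number of "threads", different answers.
values = [1e16, 1.0, -1e16, 1.0]   # exact sum is 2.0
one_chunk = chunked_sum(values, 1)   # 1.0
two_chunks = chunked_sum(values, 2)  # 2.0

print(left, right, one_chunk, two_chunks)
```

The explicit loop is deliberate: recent Python versions make the built-in `sum` more accurate for floats, which would hide the effect being illustrated.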
In this talk, I will present reproducible and accurate (rounding-to-nearest) algorithms for fundamental linear algebra operations, such as those included in the BLAS library, in parallel environments such as Intel server CPUs, Intel Xeon Phi, and both NVIDIA and AMD GPUs. I will show that the performance of our algorithms is comparable to that of the standard non-deterministic BLAS routines.
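The abstract does not describe how the algorithms work, so the following is only a sketch of what "reproducible and correctly rounded" means for a BLAS-style dot product. It is not the speaker's method. It swaps in exact rational arithmetic from Python's standard `fractions` module: every product and sum is accumulated exactly, and a single rounding to the nearest double happens at the end. The function names `exact_dot` and `naive_dot` are hypothetical, and this approach is far too slow for real use.

```python
from fractions import Fraction
from itertools import permutations


def naive_dot(x, y):
    """Ordinary floating-point dot product; the result depends on order."""
    acc = 0.0
    for a, b in zip(x, y):
        acc += a * b
    return acc


def exact_dot(x, y):
    """Accumulate exactly in rational arithmetic, then round once.

    Every finite double converts to a Fraction without error, so the
    accumulator holds the exact dot product; float() then rounds it to
    the nearest double. The result cannot depend on summation order.
    """
    acc = Fraction(0)
    for a, b in zip(x, y):
        acc += Fraction(a) * Fraction(b)
    return float(acc)


x = [1e16, 1.0, -1e16, 1.0]
y = [1.0, 1.0, 1.0, 1.0]

# Evaluate both variants over every ordering of the same input pairs.
orders = list(permutations(range(len(x))))
naive_results = {naive_dot([x[i] for i in p], [y[i] for i in p]) for p in orders}
exact_results = {exact_dot([x[i] for i in p], [y[i] for i in p]) for p in orders}

print(sorted(naive_results))  # several different values
print(sorted(exact_results))  # [2.0]
```

The point of the sketch is the contract rather than the mechanism: one correctly rounded answer regardless of how the work is ordered or partitioned.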