A Performance Comparison of Algebraic Multigrid Preconditioners on GPUs and MIC

By Karl Rupp

Vienna University of Technology

Published on

Abstract

Algebraic multigrid (AMG) preconditioners for accelerators such as graphics processing units (GPUs) and Intel's Many Integrated Core (MIC) architecture typically require a careful, problem-dependent trade-off between efficient hardware use, robustness, and convergence rate in order to minimize time-to-solution. Several variants of AMG with fine-grained parallelism have been proposed recently, but comparing them across hardware architectures is difficult because most of the proposed approaches target a single architecture. To address this deficiency, we derived implementations of recently proposed AMG variants in CUDA, OpenCL, and OpenMP and ran extensive benchmarks. Our performance results for GPUs from AMD and NVIDIA as well as for Intel's Xeon Phi reveal the sweet spots of each accelerator architecture, helping practitioners select the best AMG variant and hardware for a given problem.

Cite this work

Researchers should cite this work as follows:

  • Karl Rupp (2016), "A Performance Comparison of Algebraic Multigrid Preconditioners on GPUs and MIC," https://nanohub.org/resources/23485.


Submitter

NanoBio Node, Aly Taha

University of Illinois at Urbana-Champaign

Tags