Fast Matrix-Free Discontinuous Galerkin Kernels On Modern Computer Architectures

M. Kronbichler, K. Kormann, I. Pasichnyk, M. Allalen
Published 2017 · Computer Science

This study compares the performance of high-order discontinuous Galerkin finite elements on modern hardware. The main computational kernel is the matrix-free evaluation of differential operators by sum factorization, exemplified on the symmetric interior penalty discretization of the Laplacian as a metric for a complex application code in fluid dynamics. State-of-the-art implementations of these kernels stress both arithmetic and memory transfer. The implementations of SIMD vectorization and shared-memory parallelization are detailed. Computational results are presented for dual-socket Intel Haswell CPUs with 28 cores, a 64-core Intel Knights Landing, and a 16-core IBM Power8 processor. Up to polynomial degree six, Knights Landing is approximately twice as fast as Haswell. Power8 performs similarly to Haswell, trading a higher frequency for narrower SIMD units. The performance comparison shows that simple ways to express parallelism through for loops perform better on medium and high core counts than a more elaborate task-based parallelization with dynamic scheduling according to dependency graphs, despite less memory transfer in the latter algorithm.
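The idea behind sum factorization can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes a two-dimensional tensor-product operator A ⊗ B built from two small one-dimensional matrices, and all function names (kron, apply_dense, apply_sum_factorized) are hypothetical. The sketch applies the operator to a vector in two one-dimensional sweeps instead of forming the full matrix, and the dense product is computed only to check the result.

```python
def kron(A, B):
    """Dense Kronecker product of A (n x n) and B (m x m), for reference only."""
    n, m = len(A), len(B)
    return [[A[i // m][j // m] * B[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]


def apply_dense(M, u):
    """Plain matrix-vector product with an explicitly stored matrix."""
    return [sum(M[i][j] * u[j] for j in range(len(u))) for i in range(len(M))]


def apply_sum_factorized(A, B, u):
    """Compute (A kron B) u without ever forming A kron B.

    u is stored flat with index i * m + j, where i runs along the
    direction of A and j along the direction of B.
    """
    n, m = len(A), len(B)
    # Sweep 1: apply B along the fast index j, one row of the grid at a time.
    tmp = [[sum(B[j][q] * u[i * m + q] for q in range(m)) for j in range(m)]
           for i in range(n)]
    # Sweep 2: apply A along the slow index i.
    return [sum(A[i][p] * tmp[p][j] for p in range(n))
            for i in range(n) for j in range(m)]


# Small integer example so the comparison is exact.
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 1]]
u = [1, 2, 3, 4]
assert apply_sum_factorized(A, B, u) == apply_dense(kron(A, B), u)
```

In this sketch the dense product costs on the order of (nm)^2 multiplications, while the two sweeps cost on the order of nm(n + m). The gap widens with the polynomial degree and the number of space dimensions.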
This paper is referenced by
10.1007/978-3-030-60610-7_2
High-Performance Implementation of Discontinuous Galerkin Methods with Application in Fluid Flow
M. Kronbichler (2021)
10.1002/nla.2348
Matrix-free preconditioning for high-order H(curl) discretizations
A. Barker (2020)
10.1137/19M1276194
Enclave Tasking for Discontinuous Galerkin Methods on Dynamically Adaptive Meshes
Dominic E. Charrier (2018)
10.1134/S1995080219050196
Vectorization of High-performance Scientific Calculations Using AVX-512 Intruction Set
B. Shabanov (2019)
10.1145/3325864
Fast Matrix-Free Evaluation of Discontinuous Galerkin Finite Element Operators
M. Kronbichler (2017)
10.1002/nme.6336
A matrix‐free approach for finite‐strain hyperelastic problems using geometric multigrid
Denis Davydov (2019)
Efficient Discontinuous Galerkin Methods for Wave Propagation and Iterative Optoacoustic Image Reconstruction
S. Schoeder (2019)
Asynchronous Teams and Tasks in a Message Passing Environment
B. Hazelwood (2019)
10.1145/3322813
Multigrid for Matrix-Free High-Order Finite Element Computations on Graphics Processors
M. Kronbichler (2019)
10.1137/18M1185399
Efficient Explicit Time Stepping of High Order Discontinuous Galerkin Schemes for Waves
S. Schoeder (2018)
10.1007/978-3-319-99654-7_7
Efficient High-Order Discontinuous Galerkin Finite Elements with Matrix-Free Implementations
M. Kronbichler (2018)
10.1002/fld.4683
A matrix-free high-order discontinuous Galerkin compressible Navier-Stokes solver: A performance comparison of compressible and incompressible formulations for turbulent incompressible flows
Niklas Fehn (2018)
10.1002/fld.4511
Efficiency of high-performance discontinuous Galerkin spectral element methods for under-resolved turbulent incompressible flows
N. Fehn (2018)
10.1016/j.jcp.2018.06.037
Robust and efficient discontinuous Galerkin methods for under-resolved turbulent incompressible flows
Niklas Fehn (2018)
10.1007/978-3-319-99654-7
Advances and New Trends in Environmental Informatics
Volker Weinberg (2018)
10.1016/j.jcp.2017.07.039
A high-order semi-explicit discontinuous Galerkin solver for 3D incompressible flow with application to DNS and LES of turbulent channel flow
Benjamin Krank (2016)
10.1016/j.jcp.2017.09.031
On the stability of projection methods for the incompressible Navier-Stokes equations based on high-order discontinuous Galerkin discretizations
N. Fehn (2017)