|
Digital
Library of the European Council for Modelling
and Simulation |
Title: |
Vectorized Implementation Of The FEM Numerical Integration Algorithm On A Modern CPU |
Authors: |
Filip Kruzel |
Published in: |
(2019). ECMS 2019 Proceedings. Edited by: Mauro Iacono, Francesco Palmieri, Marco Gribaudo, Massimo Ficco. European Council for Modeling and Simulation.
DOI: http://doi.org/10.7148/2019
ISSN: 2522-2422 (ONLINE)
ISSN: 2522-2414 (PRINT)
ISSN: 2522-2430 (CD-ROM)
33rd International ECMS Conference on Modelling and Simulation, Caserta, Italy, June 11th – June 14th, 2019 |
Citation
format: |
Filip Kruzel (2019). Vectorized Implementation Of The FEM Numerical Integration Algorithm On A Modern CPU, ECMS 2019 Proceedings Edited by: Mauro Iacono, Francesco Palmieri, Marco Gribaudo, Massimo Ficco European Council for Modeling and Simulation. doi: 10.7148/2019-0414 |
DOI: |
https://doi.org/10.7148/2019-0414 |
Abstract: |
The main aim
of this study is to answer the question: how to effectively implement the
creation of the finite element stiffness matrix in parallel simulations of
Finite Element Method using the full advantages of modern multiprocessors
such as parallelization combined with vectorization. In this work, an efficient
method for implementation of a Finite Element Method numerical integration
algorithm on a modern Intel Haswell CPU architecture was developed. This
algorithm was chosen, due to its non-trivial structure and the fact that its
optimization is often omitted in research in favour of accelerating the other
phases of FEM. Tests included two types of tasks to solve, with the use of two types of approximation and two types of finite elements.
During this study, several methods for the implementation of the chosen algorithm
were investigated, including Intel Cilk Plus, Intel Intrinsics and other
computing techniques. Results were compared with an older Sandy Bridge
architecture, showing a significant impact of vectorization and large cache
on the performance of modern CPUs. Our research gives suggestions for choosing
the optimal design of algorithms and effectively using all of the features of
modern CPUs. |