Implementation of Numerical Integration to High-Order Elements on the GPUs

Filip Krużel; Krzysztof Banaś; Mateusz Nytko

doi:10.24423/cames.264

Authors

Filip Krużel, Department of Computer Science, Cracow University of Technology, Kraków, Poland, http://orcid.org/0000-0002-3462-9144
Krzysztof Banaś, Department of Applied Computer Science and Modelling, AGH University of Science and Technology, Kraków, Poland, http://orcid.org/0000-0002-4045-1530
Mateusz Nytko, Department of Computer Science, Cracow University of Technology, Kraków, Poland, http://orcid.org/0000-0003-0606-1835

Abstract

This article presents ways to implement a resource-consuming algorithm on the GPU, hardware with a limited amount of memory. Numerical integration for higher-order finite element approximation was chosen as the example algorithm. For the computational tests, we use a non-linear geometric element and solve the convection-diffusion-reaction problem. The calculations were run on a Tesla K20m graphics card based on the Kepler architecture and a Radeon R9 280X based on the Tahiti XT architecture. The results of the computational experiments were compared with the theoretical performance of both GPUs, which allowed an assessment of the performance actually achieved. Our research offers suggestions for choosing the optimal design of algorithms as well as the right hardware for such a resource-demanding task.
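
As a rough illustration of the kind of computation the abstract refers to, the sketch below integrates a single element matrix with Gauss quadrature. It is not the authors' GPU kernel. It is a one-dimensional, CPU-only Python sketch that assumes a quadratic Lagrange element on the reference interval [-1, 1], a reaction (mass) term only, and a constant Jacobian; the names `shape_functions` and `element_mass_matrix` are hypothetical.

```python
import math

# 3-point Gauss-Legendre rule on [-1, 1]; exact for polynomials up to degree 5.
GAUSS_POINTS = (-math.sqrt(3.0 / 5.0), 0.0, math.sqrt(3.0 / 5.0))
GAUSS_WEIGHTS = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)


def shape_functions(xi):
    """Quadratic Lagrange shape functions for nodes at xi = -1, 0, 1."""
    return (0.5 * xi * (xi - 1.0), 1.0 - xi * xi, 0.5 * xi * (xi + 1.0))


def element_mass_matrix(jacobian=1.0):
    """Integrate M[i][j] = integral of phi_i * phi_j over the element.

    Assumption: the Jacobian of the reference-to-physical mapping is
    constant, which holds only for a straight, uniformly mapped element.
    """
    n = 3
    matrix = [[0.0] * n for _ in range(n)]
    # Outer loop over integration points, inner loops over shape functions.
    for xi, weight in zip(GAUSS_POINTS, GAUSS_WEIGHTS):
        phi = shape_functions(xi)
        for i in range(n):
            for j in range(n):
                matrix[i][j] += weight * phi[i] * phi[j] * jacobian
    return matrix


if __name__ == "__main__":
    for row in element_mass_matrix():
        print(["%.6f" % value for value in row])
```

With the default Jacobian of 1, the result agrees with the closed-form matrix (1/15)·[[4, 2, -1], [2, 16, 2], [-1, 2, 4]]. In three dimensions and at higher approximation orders, the number of shape functions and integration points grows quickly, so the same loop structure needs far more storage per element.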

Keywords:

GPU, numerical integration, finite element method, OpenCL, CUDA
