We describe the process and outcome of our efforts to port a legacy Fortran benchmark code to heterogeneous GPU-accelerated computing architectures using OpenMP. The benchmark code, SP-MZ, is one of the multi-zone NAS Parallel Benchmarks (NPB-MZ). This “mini-app” mimics the computation and data movement found in popular legacy and modern implicit computational fluid dynamics (CFD) solvers. Our objective was to examine how efficiently legacy Fortran codes can be ported to accelerators by leveraging OpenMP directives. We describe the development and optimization process and demonstrate the performance impact of various code modifications. We show selected profiling results from the NVIDIA Visual Profiler (nvvp) to help others diagnose and overcome performance issues in their own applications. We present results for two compute systems equipped with NVIDIA V100 accelerators.
|Original language||English (US)|
|Title of host publication||High Performance Computing - 35th International Conference, ISC High Performance 2020, Proceedings|
|Editors||Ponnuswamy Sadayappan, Bradford L. Chamberlain, Guido Juckeland, Hatem Ltaief|
|Number of pages||18|
|State||Published - 2020|
|Event||35th International Conference on High Performance Computing, ISC High Performance 2020 - Frankfurt, Germany|
Duration: Jun 22 2020 → Jun 25 2020
|Name||Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)|
|Conference||35th International Conference on High Performance Computing, ISC High Performance 2020|
|Period||6/22/20 → 6/25/20|
Bibliographical note
Funding Information:
The authors would like to thank all those who helped with valuable advice during the NERSC/OLCF Oakland Hackathon, especially Kevin Gott and Jack Deslippe from NERSC and Tom Papatheodore from OLCF for facilitating an excellent hackathon experience; Max Katz and Angela Chen from NVIDIA, who taught us a lot about profiling; and Fazlay Rabbi from NERSC, who helped us with compiling and benchmarking on the Ascent and Cori systems. Thanks to Joel Bretheim from the Naval Research Lab, who contributed to the code optimizations. Special thanks go to Kelvin Li from IBM for advising on changes between Fortran by-value and by-reference arguments. I. Nompelis would like to thank Prof. G. V. Candler’s group from the Department of Aerospace Engineering and Mechanics, Univ. of Minnesota, for their support. This research used resources of the National Energy Research Scientific Computing Center (NERSC), a U.S. Department of Energy Office of Science User Facility operated under Contract No. DE-AC02-05CH11231. This research also used resources of the Oak Ridge Leadership Computing Facility (OLCF), which is a DOE Office of Science User Facility supported under Contract No. DE-AC05-00OR22725. We thank OLCF and NERSC for their support during our study. This work was partially funded by NASA Contract No. 80ARC018D001. Work at the hackathon included funding from the Department of Defense High Performance Computing Modernization Program (HPCMP) under User Productivity, Technology Transfer and Training (PETTT) Contract No. GS04T09DBC0017 and for the University of Hawai‘i under Maui High Performance Computing Center (MHPCC) Contract No. N00024-19-D-6400.
© Springer Nature Switzerland AG 2020.
- Implicit CFD