The NEMO ocean model in the configuration used at CMCC, with a resolution of 1/16° and tailored to the Mediterranean Basin, has been analyzed to identify possible bottlenecks to parallel scalability. A detailed scalability analysis of all the routines called during a NEMO time step identified the SOR solver routine as the most expensive from the communication point of view. The routine implements the red-black successive over-relaxation method, an iterative algorithm used to solve the elliptic equation for the barotropic stream function.
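As an illustration of the red-black successive over-relaxation scheme, the following is a minimal serial sketch and not the NEMO routine itself. It assumes a model Poisson problem on a unit square with zero boundary values; the function name `red_black_sor` and the parameters `omega`, `tol` and `max_iter` are hypothetical choices made for this example.

```python
import numpy as np


def red_black_sor(f, h, omega=1.5, tol=1e-8, max_iter=10_000):
    """Solve -laplace(u) = f on a uniform grid with u = 0 on the boundary.

    Illustrative model problem only (an assumption of this sketch).
    Returns the solution and the number of iterations performed.
    """
    u = np.zeros_like(f, dtype=float)

    # Checkerboard colouring of the interior points: "red" points have an
    # even index sum, "black" points an odd one.
    i, j = np.indices(f.shape)
    interior = np.zeros(f.shape, dtype=bool)
    interior[1:-1, 1:-1] = True
    masks = [interior & ((i + j) % 2 == colour) for colour in (0, 1)]

    for iteration in range(1, max_iter + 1):
        max_change = 0.0
        for mask in masks:
            # All four neighbours of a point have the opposite colour, so a
            # whole colour can be updated at once. In a domain-decomposed
            # run, a halo exchange would follow each colour sweep.
            neighbours = np.zeros_like(u)
            neighbours[1:-1, 1:-1] = (
                u[:-2, 1:-1] + u[2:, 1:-1] + u[1:-1, :-2] + u[1:-1, 2:]
            )
            gauss_seidel = 0.25 * (neighbours + h * h * f)
            delta = omega * (gauss_seidel[mask] - u[mask])
            u[mask] += delta
            max_change = max(max_change, np.max(np.abs(delta), initial=0.0))
        # Stop once the largest update falls below the tolerance.
        if max_change < tol:
            return u, iteration
    return u, max_iter
```

In a parallel setting each colour sweep needs up-to-date neighbour values across subdomain boundaries, which is why a solver of this kind exchanges data so frequently.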
The algorithm iterates until convergence is reached; a limit on the maximum number of iterations is also set. The high frequency of data exchange within this routine implies a high communication overhead. The NEMO code includes an enhanced version of the routine that reduces the frequency of communication by adding an extra-halo region. Using this optimization requires selecting the extra-halo size that best balances computation against communication. A performance model that allows the optimal extra-halo value to be chosen for a predefined decomposition has been designed. The model has been tested on the Mare Nostrum cluster at the Barcelona Supercomputing Center.
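The trade-off between computation and communication can be sketched with a toy cost function. This is not the performance model designed in this work: the functional form, the function names and every parameter below (`t_point`, `latency`, `t_transfer`) are assumptions made purely for illustration. The sketch supposes that a subdomain of `n` by `n` points with an extra halo of width `e` exchanges data once every `e` iterations, at the price of sweeping over a larger region on every iteration.

```python
def time_per_iteration(e, n, t_point, latency, t_transfer):
    """Hypothetical cost of one solver iteration for extra-halo width e.

    Assumed form (illustrative only):
      compute       = t_point * (n + 2e)^2     larger region swept each time
      communication = (latency + t_transfer * 4 * n * e) / e
                      one exchange of an e-wide halo every e iterations
    """
    compute = t_point * (n + 2 * e) ** 2
    communication = (latency + t_transfer * 4 * n * e) / e
    return compute + communication


def best_extra_halo(n, t_point, latency, t_transfer, max_halo=10):
    """Return the halo width in 1..max_halo with the lowest modelled cost."""
    candidates = range(1, max_halo + 1)
    return min(
        candidates,
        key=lambda e: time_per_iteration(e, n, t_point, latency, t_transfer),
    )
```

Under these assumptions a wider halo amortises the message latency over more iterations while the redundant computation grows, so the preferred width depends on the subdomain size and on the machine parameters.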
This work was carried out under the HPC-EUROPA2 project (project number: 228398) with the support of the European Commission Capacities Area Research Infrastructures Initiative. The authors gratefully acknowledge the computer resources, technical expertise and assistance provided by the Barcelona Supercomputing Center, in particular Prof. Jose Maria Baldasano, Prof. Jesus Labarta and their staff members.
The research leading to these results has received funding from the Italian Ministry of Education, University and Research and the Italian Ministry of Environment, Land and Sea under the GEMINA project.
- jel: C63
- Keywords: Extra-halo, NEMO, Optimization, Performance model