Web of Science: 11 citations, Scopus: 13 citations
Finding, analysing and solving MPI communication bottlenecks in Earth System models
Tintó Prims, Oriol (Universitat Autònoma de Barcelona. Departament d'Arquitectura de Computadors i Sistemes Operatius)
Castrillo, Miguel (Barcelona Supercomputing Center)
Acosta, Mario C. (Mario César) (Barcelona Supercomputing Center)
Mula-Valls, Oriol (Barcelona Supercomputing Center)
Sanchez Lorente, Alicia (Barcelona Supercomputing Center)
Serradell, Kim (Barcelona Supercomputing Center)
Cortés Fité, Ana (Universitat Autònoma de Barcelona. Departament d'Arquitectura de Computadors i Sistemes Operatius)
Doblas-Reyes, Francisco J. (Institució Catalana de Recerca i Estudis Avançats)

Date: 2019
Abstract: It is widely agreed that the ability to use current and future high-performance computing systems efficiently is crucial for science. In reality, however, the performance currently achieved by most parallel scientific applications is far from what is desired. Although inter-process communication has already been studied in many different works, their recommendations are not taken into account in most computational model development processes, at least in the case of Earth Science. This work presents a methodology that aims to help scientists working with computational models that use inter-process communication to deal with the difficulties they face when trying to understand their applications' behaviour. Following a series of steps presented here, both users and developers will learn how to identify performance issues by characterizing an application's scalability, identifying which parts perform poorly, and understanding the role that inter-process communication plays. In this work, the Nucleus for European Modelling of the Ocean (NEMO), the state-of-the-art European global ocean circulation model, is used as an example of success. It is a community code widely used in Europe, to the extent that more than a hundred million core hours are used every year in experiments involving NEMO. The analysis exercise shows how to answer the questions of where, why and what is degrading the model's scalability, and how this information can help developers find solutions that mitigate the issues identified. This document also demonstrates how performance analysis carried out with small experiments, using limited resources, can lead to optimizations that will benefit bigger experiments running on thousands of cores, making it easier to deal with the exascale challenge.
Grants: Ministerio de Economía y Competitividad SEV-2011-00067
Ministerio de Ciencia y Tecnología TIN2014-53234-C2-1-R
European Commission 675191
Rights: This document is subject to a Creative Commons licence. Total or partial reproduction, distribution, public communication of the work and the creation of derivative works are permitted, even for commercial purposes, provided that the authorship of the original work is acknowledged.
Language: English
Document: Article ; research ; Published version
Subject: Earth System modelling ; Ocean modelling ; Performance analysis ; Performance optimization ; MPI optimization
Published in: Journal of computational science, Vol. 36 (September 2019) , art. 100864, ISSN 1877-7511

DOI: 10.1016/j.jocs.2018.04.015

10 p, 2.5 MB

The record appears in these collections:
Articles > Research articles
Articles > Published articles

 Record created 2020-06-03, last modified 2024-06-23
