Web of Science: 1 citation, Scopus: 1 citation
An empirical method for processing I/O traces to analyze the performance of DL applications
Parraga Pinzon, Edixon Alexander (Universitat Autònoma de Barcelona. Departament d'Arquitectura de Computadors i Sistemes Operatius)
León, Betzabeth (Universitat Autònoma de Barcelona. Departament d'Arquitectura de Computadors i Sistemes Operatius)
Méndez, Sandra (Barcelona Supercomputing Center)
Rexachs del Rosario, Dolores Isabel (Universitat Autònoma de Barcelona. Departament d'Arquitectura de Computadors i Sistemes Operatius)
Suppi Boldrito, Remo (Universitat Autònoma de Barcelona. Departament d'Arquitectura de Computadors i Sistemes Operatius)
Luque, Emilio (Universitat Autònoma de Barcelona. Departament d'Arquitectura de Computadors i Sistemes Operatius)

Date: 2024
Description: 17 pp.
Abstract: The exponential growth of data handled by Deep Learning (DL) applications has led to an unprecedented demand for computational resources, necessitating their execution on High Performance Computing (HPC) systems. However, understanding and optimizing the Input/Output (I/O) of DL applications can be challenging due to the complexity and scale of DL workloads and the heterogeneous nature of I/O operations. This paper addresses this issue by proposing an I/O trace processing method that simplifies the generation of reports on global I/O patterns and performance to aid in I/O performance analysis. Our approach focuses on understanding the temporal and spatial distributions of I/O operations and relating them to the behavior at the I/O system level. The proposed method enables us to synthesize and extract key information from the reports generated by tools such as Darshan and the seff command. These reports offer a detailed view of I/O performance, providing a set of metrics that deepen our understanding of the I/O behavior of DL applications.
Grants: Agencia Estatal de Investigación PID2020-112496GB-I00
Note: Other grants: the authors thankfully acknowledge RES resources provided by CESGA in FinisTerrae III to RES-DATA-2022-1-0014. The monograph contains the conference proceedings of the 12th Conference, JCC-BD&ET 2024, La Plata, Argentina, June 25-27, 2024.
Rights: This material is protected by copyright and/or related rights. You may use this material as permitted by the copyright and related rights legislation applicable to your case. For other uses you must obtain permission from the rights holder(s).
Language: English
Series: Communications in Computer and Information Science ; 2189
Document: Book chapter ; research ; Version accepted for publication
Subject: DL ; HPC ; I/O Analysis ; I/O behavior patterns
Published in: Cloud Computing, Big Data and Emerging Topics, 2024, p. 74-90, ISBN 9783031708060

DOI: 10.1007/978-3-031-70807-7_6


Available from: 2025-10-30
Postprint


 Record created 2025-01-16, last modified 2025-03-29


