Automatic tuning based on hardware performance counters and machine learning
Harutyunyan Gevorgyan, Suren (Universitat Autònoma de Barcelona. Departament d'Arquitectura de Computadors i Sistemes Operatius)
César Galobardes, Eduardo (Universitat Autònoma de Barcelona. Departament d'Arquitectura de Computadors i Sistemes Operatius)
Sikora, Anna (Universitat Autònoma de Barcelona. Departament d'Arquitectura de Computadors i Sistemes Operatius)
Filipovič, Jiří (Masaryk University)
Alcaraz, Jordi (University of Oregon)

Date: 2026
Abstract: This paper presents a Machine Learning (ML) methodology for automatically tuning parallel applications in heterogeneous High Performance Computing (HPC) environments using Hardware Performance Counters (HwPCs). The methodology addresses three critical challenges: counter quantity versus accessibility tradeoff, data interpretation complexity, and dynamic optimization needs. The introduced ensemble-based methodology automatically identifies minimal yet informative HwPC sets for code region identification and tuning parameter optimization. Experimental validation demonstrates high accuracy in predicting optimal thread allocation (> 0.90 K-fold accuracy) and thread affinity (> 0.95 accuracy) while requiring only 4-6 HwPCs. Compared to search-based methods like OpenTuner, the methodology achieves competitive performance with dramatically reduced optimization time. The architecture-agnostic design enables consistent performance across CPU and GPU platforms. These results establish a foundation for efficient, portable, automatic, and scalable tuning of parallel applications.
Grants: Agencia Estatal de Investigación PID2023-146193OB-I00
Generalitat de Catalunya 2021/SGR-00574
Note: Other funding: UAB transformative agreements
Rights: This document is subject to a Creative Commons license. Total or partial reproduction, distribution, and public communication of the work are permitted, provided that it is not for commercial purposes and that the authorship of the original work is acknowledged. The creation of derivative works is not permitted.
Language: English
Document: Article ; research ; Published version
Subject: Automatic dimension reduction ; Hardware performance counters ; Machine learning ensembles ; Parallel region classification ; Tuning parameter optimization
Published in: Future generation computer systems, Vol. 179 (June 2026), art. 108358, ISSN 0167-739X

DOI: 10.1016/j.future.2025.108358


14 p, 10.8 MB


 Record created 2026-01-22, last modified 2026-01-24


