Automatic tuning based on hardware performance counters and machine learning
Harutyunyan Gevorgyan, Suren 
(Universitat Autònoma de Barcelona. Departament d'Arquitectura de Computadors i Sistemes Operatius)
César Galobardes, Eduardo 
(Universitat Autònoma de Barcelona. Departament d'Arquitectura de Computadors i Sistemes Operatius)
Sikora, Anna 
(Universitat Autònoma de Barcelona. Departament d'Arquitectura de Computadors i Sistemes Operatius)
Filipovič, Jiří 
(Masaryk University)
Alcaraz, Jordi 
(University of Oregon)
| Date: |
2026 |
| Abstract: |
This paper presents a Machine Learning (ML) methodology for automatically tuning parallel applications in heterogeneous High Performance Computing (HPC) environments using Hardware Performance Counters (HwPCs). The methodology addresses three critical challenges: counter quantity versus accessibility tradeoff, data interpretation complexity, and dynamic optimization needs. The introduced ensemble-based methodology automatically identifies minimal yet informative HwPC sets for code region identification and tuning parameter optimization. Experimental validation demonstrates high accuracy in predicting optimal thread allocation (> 0.90 K-fold accuracy) and thread affinity (> 0.95 accuracy) while requiring only 4-6 HwPCs. Compared to search-based methods like OpenTuner, the methodology achieves competitive performance with dramatically reduced optimization time. The architecture-agnostic design enables consistent performance across CPU and GPU platforms. These results establish a foundation for efficient, portable, automatic, and scalable tuning of parallel applications. |
| Grants: |
Agencia Estatal de Investigación PID2023-146193OB-I00
Generalitat de Catalunya 2021/SGR-00574
|
| Note: |
Other funding: UAB transformative agreements |
| Rights: |
This document is subject to a Creative Commons license. Total or partial reproduction, distribution, and public communication of the work are permitted, provided that it is not for commercial purposes and that the authorship of the original work is acknowledged. The creation of derivative works is not permitted. |
| Language: |
English |
| Document: |
Article ; research ; Published version |
| Subject: |
Automatic dimension reduction ;
Hardware performance counters ;
Machine learning ensembles ;
Parallel region classification ;
Tuning parameter optimization |
| Published in: |
Future generation computer systems, Vol. 179 (June 2026) , art. 108358, ISSN 0167-739X |
DOI: 10.1016/j.future.2025.108358
The record appears in these collections:
Articles > Research articles
Articles > Published articles
Record created 2026-01-22, last modified 2026-01-24