| Title: | Machine learning for the analysis of healthy lifestyle data : |
| Date: | 2025 |
| Description: | 41 pp. |
| Abstract: | Background: Advances in data science and technology have transformed lifestyle studies by enabling the integration of multimodal information and the generation of large volumes of data. Despite growing interest in machine learning (ML) in health behaviour research, significant methodological gaps remain. Objectives: This study systematically reviews the applications of supervised ML algorithms in analyzing healthy lifestyle (HL) data, with a specific focus on the methodological approach employed. The specific objectives are to explore the types and sources of data used in health outcomes, examine the ML processes employed, including explainable artificial intelligence (XAI) methods, and review the software tools used. Additionally, this review aims to provide practical guidelines to enhance the quality and transparency of future ML research in health. Methods: Following the PRISMA-ScR recommendations, the search was conducted across PubMed, PsycINFO, and Web of Science, yielding 48 studies that met the inclusion criteria. Results: Most studies (37, 77%) integrated multidomain data from physical activity, diet, sleep, and stress. Data sources were split between self-acquired data (25, 52.08%) and health repositories (23, 47.92%). Single-item measurements were common, particularly for physical activity, diet, and sleep. Although 28 studies took a multimodel approach, random forest was the most frequently used algorithm. Only 10 studies (20.83%) employed XAI methods, with 9 using SHapley Additive exPlanations (SHAP) values and 1 using Local Interpretable Model-agnostic Explanations (LIME). R was the most widely used software, with variations in the libraries employed. Conclusion: This review highlights methodological gaps in the application of supervised ML to HL data. The ML workflow should span from data acquisition to explainability, with iterative steps to improve the process.
Multidomain approaches to data acquisition enhance understanding of lifestyle-related health issues but are constrained by low data representativeness arising from methodological limitations in acquisition. While random forest was prevalent, a multimodel approach is recommended for comprehensive comparison. Lifestyle components consistently ranked among the top features in studies that incorporated XAI. Integrating XAI methods into the ML pipeline can support personalized interventions, provided the data are accurately collected. The R metapackage tidymodels facilitates process evaluation through a unified syntax, improving replicability. Methodological and reporting guidelines are provided to enhance transparency and replicability in multidisciplinary ML research. |
| Funding: | Agencia Estatal de Investigación PID2019-107473RB-C21 ; Agencia Estatal de Investigación PID2022-141403NB-I00 ; Generalitat de Catalunya 2021/SGR-00806 |
| Rights: | This document is subject to a Creative Commons license. Reproduction in whole or in part, distribution, public communication of the work, and the creation of derivative works are permitted, including for commercial purposes, provided that the authorship of the original work is acknowledged. |
| Language: | English |
| Document type: | Preprint ; research ; Version submitted for review |
| Subject: | Machine learning ; Artificial intelligence ; Healthy lifestyle ; Physical activity ; Diet ; Sleep ; Stress ; Review ; Data analysis ; XAI |
| Published in: | JMIR Human Factors, 2025, p. 1-65, ISSN 2292-9495 |
Preprint 41 p, 2.0 MB |
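
The proportions quoted in the abstract can be recomputed from the reported counts and the 48 included studies. The snippet below is a minimal sketch of that arithmetic, not part of the record or the preprint; the dictionary labels and the `share` helper are hypothetical names chosen for illustration.

```python
# Counts as reported in the abstract; 48 is the number of included studies.
TOTAL_STUDIES = 48

# Labels are illustrative shorthand (assumption), the counts come from the abstract.
reported_counts = {
    "multidomain data": 37,
    "self-acquired data": 25,
    "health repositories": 23,
    "XAI methods": 10,
}


def share(count: int, total: int = TOTAL_STUDIES) -> float:
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


shares = {label: share(n) for label, n in reported_counts.items()}

for label, pct in shares.items():
    print(f"{label}: {reported_counts[label]}/{TOTAL_STUDIES} = {pct:.2f}%")
```

The two data-source counts (25 and 23) sum to the 48 included studies, and the 9 SHAP studies plus 1 LIME study account for the 10 studies that used XAI methods.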