Web of Science: 2 citations, Scopus: 2 citations, Google Scholar: citations
Integer constraints for enhancing interpretability in linear regression
Carrizosa Priego, Emilio (Universidad de Sevilla. Departamento de Estadística e Investigación Operativa)
Olivares-Nadal, Alba V. (The University of Chicago Booth School of Business)
Ramírez-Cobo, Pepa (Universidad de Cádiz. Departamento de Estadística e Investigación Operativa)

Date: 2020
Abstract: One of the main challenges researchers face is to identify the most relevant features in a prediction model. As a consequence, many regularized methods seeking sparsity have flourished. Although sparse, their solutions may not be interpretable in the presence of spurious coefficients and correlated features. In this paper we aim to enhance interpretability in linear regression in the presence of multicollinearity by: (i) forcing the sign of the estimated coefficients to be consistent with the sign of the correlations between predictors, and (ii) avoiding spurious coefficients so that only significant features are represented in the model. This will be addressed by modelling constraints and adding them to an optimization problem expressing some estimation procedure such as ordinary least squares or the lasso. The so-obtained constrained regression models will become Mixed Integer Quadratic Problems. The numerical experiments carried out on real and simulated datasets show that tightening the search space of some standard linear regression models by adding the constraints modelling (i) and/or (ii) helps to improve the sparsity and interpretability of the solutions with competitive predictive quality.
Rights: This document is subject to a Creative Commons license. Total or partial reproduction, distribution, and public communication of the work are permitted, provided that it is not for commercial purposes and that the authorship of the original work is acknowledged. The creation of derivative works is not permitted.
Language: English
Document: Article ; research ; Published version
Subject: Linear regression ; Multicollinearity ; Sparsity ; Cardinality constraint ; Mixed integer non linear programming
Published in: SORT : statistics and operations research transactions, Vol. 44 No. 1 (January-June 2020), p. 67-98 (Articles), ISSN 2013-8830

Alternative address: https://raco.cat/index.php/SORT/article/view/371180
DOI: 10.2436/20.8080.02.95


32 p., 463.2 KB

This record appears in the collections:
Articles > Published articles > SORT
Articles > Research articles

Record created 2020-06-27, last modified 2023-10-15


