Web of Science: 7 citations, Scopus: 9 citations
Supervised learning using a symmetric bilinear form for record linkage
Abril Castellano, Daniel (Institut d'Investigació en Intel·ligència Artificial (IIIA-CSIC))
Torra i Reventós, Vicenç, dir. (Institut d'Investigació en Intel·ligència Artificial (IIIA-CSIC))
Navarro-Arribas, Guillermo (Universitat Autònoma de Barcelona. Departament d'Enginyeria de la Informació i de les Comunicacions)

Date: 2015
Abstract: Record linkage is used to link records in two different files that correspond to the same individuals. These algorithms are used for database integration. In data privacy, they are used to evaluate the disclosure risk of a protected data set by linking records that belong to the same individual. The degree of success when linking the original (unprotected) data with the protected data gives an estimate of the disclosure risk. In this paper we propose a new parameterized aggregation operator and a supervised learning method for disclosure risk assessment. The parameterized operator is a symmetric bilinear form, and the supervised learning method is formalized as an optimization problem. The goal of the optimization problem is to find the values of the aggregation parameters that maximize the number of re-identifications (correct links). We evaluate our proposal and compare it with non-parameterized variations of record linkage, such as those using the Mahalanobis distance and the Euclidean distance (one of the most widely used approaches for this purpose). We also compare it with previously presented parameterized aggregation operators for record linkage, such as the weighted mean and the Choquet integral. These comparisons show that the proposed aggregation operator outperforms, or at least matches, the other parameterized operators. Finally, we study which conditions must be imposed on the optimization problem for the described aggregation functions to be metrics.
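The distance-based linkage described in the abstract can be illustrated with a minimal sketch. It assumes the symmetric bilinear form is applied to the difference of two records, d(a, b) = (a - b)^T Σ (a - b), and that record i of the protected file corresponds to record i of the original file. The function names, the toy data, and the identity matrix used for Σ are hypothetical; the paper's supervised method would learn Σ by optimization, which is not shown here.

```python
def bilinear_distance(a, b, sigma):
    """Return (a - b)^T * sigma * (a - b) for a symmetric matrix sigma."""
    diff = [x - y for x, y in zip(a, b)]
    n = len(diff)
    return sum(diff[i] * sigma[i][j] * diff[j]
               for i in range(n) for j in range(n))


def link_records(original, protected, sigma):
    """Link each protected record to the index of its closest original record."""
    links = []
    for p in protected:
        distances = [bilinear_distance(o, p, sigma) for o in original]
        links.append(distances.index(min(distances)))
    return links


def count_correct_links(links):
    """Count re-identifications, assuming protected record i matches original record i."""
    return sum(1 for i, j in enumerate(links) if i == j)


# Toy data (hypothetical): three records with two numeric attributes each.
original = [[1.0, 2.0], [4.0, 0.0], [0.0, 5.0]]
protected = [[1.2, 2.1], [3.7, 0.4], [0.3, 4.6]]

# With the identity matrix the form reduces to the squared Euclidean distance.
identity = [[1.0, 0.0], [0.0, 1.0]]

links = link_records(original, protected, identity)
correct = count_correct_links(links)
print(links, correct)
```

In this sketch, replacing `identity` with the inverse covariance matrix of the data would give the squared Mahalanobis distance, which is how the non-parameterized baselines mentioned in the abstract relate to the parameterized form.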
Grants: Ministerio de Ciencia e Innovación CSD2007-00004
Ministerio de Ciencia e Innovación TIN2010-15764
Ministerio de Ciencia e Innovación TIN2011-27076-C03-03
European Commission 262608
Rights: All rights reserved.
Language: English
Document: Article ; research ; Version accepted for publication
Subject: Record linkage ; Data privacy ; Disclosure risk ; Bilinear form ; Choquet integral
Published in: Information Fusion, Vol. 26 (November 2015), p. 144-153, ISSN 1566-2535

DOI: 10.1016/j.inffus.2014.11.004


Post-print
32 p, 660.1 KB

This record appears in the collections:
Research documents > Documents of UAB research groups > Research centres and groups (scientific output) > Engineering > Security of Networks and Distributed Applications (SENDA)
Articles > Research articles
Articles > Published articles

Record created 2017-03-13, last modified 2022-02-06


