Automatic error localisation for categorical, continuous and integer data
Waal, Ton De (Netherlands)

Date: 2005
Abstract: Data collected by statistical offices generally contain errors, which have to be corrected before reliable data can be published. This correction process is referred to as statistical data editing. At statistical offices, certain rules, so-called edits, are often used during the editing process to determine whether a record is consistent or not. Inconsistent records are considered to contain errors, while consistent records are considered error-free. In this article we focus on automatic error localisation based on the Fellegi-Holt paradigm, which says that the data should be made to satisfy all edits by changing the fewest possible fields. Adoption of this paradigm leads to a mathematical optimisation problem. We propose an algorithm for solving this optimisation problem for a mix of categorical, continuous and integer-valued data. We also propose a heuristic procedure based on the exact algorithm. For five realistic data sets involving only integer-valued variables we evaluate the performance of this heuristic procedure.
Rights: This document is subject to a Creative Commons licence. Total or partial reproduction and public communication of the work are permitted, provided that it is not for commercial purposes and that the authorship of the original work is acknowledged. The creation of derivative works is not permitted.
Language: English
Document: Article ; research ; Published version
Subject: Branch-and-bound ; Categorical data ; Continuous data ; Error localisation ; Fourier-Motzkin elimination ; Integer-valued data ; Statistical data editing
Published in: SORT : statistics and operations research transactions, Vol. 29, No. 1 (January-June 2005), p. 57-100, ISSN 2013-8830
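
Illustration: the Fellegi-Holt paradigm described in the abstract can be sketched on a toy record. The following is a brute-force enumeration, not the branch-and-bound algorithm or the heuristic proposed in the article, and it only handles small finite domains. The function name `localise_errors`, the field names, the domains and the two edits are all hypothetical and chosen for illustration only.

```python
from itertools import combinations, product


def localise_errors(record, domains, edits):
    """Return (changed_fields, repaired_record) for a smallest set of fields
    whose values can be changed so that every edit is satisfied.

    Brute force: try field subsets in order of increasing size, and for each
    subset try every combination of values from the (finite) domains.
    Returns None if no assignment satisfies all edits.
    """
    fields = list(record)
    for k in range(len(fields) + 1):
        for subset in combinations(fields, k):
            for values in product(*(domains[f] for f in subset)):
                candidate = {**record, **dict(zip(subset, values))}
                if all(edit(candidate) for edit in edits):
                    return set(subset), candidate
    return None


# Hypothetical edits: each returns True when the record is consistent.
edits = [
    lambda r: not (r["age"] < 16 and r["marital_status"] == "married"),
    lambda r: not (r["age"] < 16 and r["income"] > 0),
]

# Hypothetical finite domains: one integer, one categorical, one coarse integer grid.
domains = {
    "age": range(0, 101),
    "marital_status": ["unmarried", "married"],
    "income": range(0, 50001, 5000),
}

record = {"age": 12, "marital_status": "married", "income": 25000}
changed, repaired = localise_errors(record, domains, edits)
print(changed)  # {'age'}
```

The record violates both edits. Changing `marital_status` alone or `income` alone repairs only one of them, whereas changing `age` alone repairs both, so `age` is the single field flagged as erroneous. The enumeration grows exponentially with the number of fields, which is why realistic data call for a dedicated optimisation algorithm.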

Alternative address: https://raco.cat/index.php/SORT/article/view/28877


44 p, 241.6 KB


Record created 2012-07-20, last modified 2022-02-13


