Distilling Structure from Imagery : Graph-based Models for the Interpretation of Document Images
Riba, Pau

Date: 2020
Abstract: From its early stages, the Pattern Recognition and Computer Vision community has recognized the importance of leveraging structural information when understanding images. Graphs have usually been proposed as a suitable model for this kind of information because of their flexibility and representational power, which can encode both the components, objects, or entities and their pairwise relationships. Even though graphs have been successfully applied to a wide variety of tasks, their symbolic and relational nature has always imposed some limitations compared to statistical approaches. Indeed, some trivial mathematical operations have no equivalent in the graph domain. For instance, at the core of many pattern recognition applications there is a need to compare two objects. This operation, which is trivial for feature vectors defined in ℝⁿ, is not properly defined for graphs. In this thesis, we investigate the importance of structural information from two perspectives: traditional graph-based methods and recent advances in Geometric Deep Learning. On the one hand, we explore the problem of defining a graph representation and how to deal with it in large-scale and noisy scenarios. On the other hand, Graph Neural Networks are proposed, first, to recast Graph Edit Distance methodologies as a metric learning problem and, second, to apply them in a real use case: the detection of repetitive patterns that define tables in invoice documents. As an experimental framework, we have validated the different methodological contributions in the domain of Document Image Analysis and Recognition.
Rights: This document is subject to a Creative Commons license. Total or partial reproduction, distribution, and public communication of the work are permitted, provided that it is not for commercial purposes and that the authorship of the original work is acknowledged. The creation of derivative works is not permitted.
Language: English
Document: Article ; research ; Published version
Subject: Computer vision ; Structural pattern recognition ; Document analysis ; Pattern Recognition ; Graph-based Representations ; Graph Indexing ; Hierarchical Graphs ; Graph Embeddings ; Graph Neural Networks ; Graph Edit Distance ; Table Detection
Published in: ELCVIA : Electronic Letters on Computer Vision and Image Analysis, Vol. 19 No. 2 (2020), p. 9-10 (Special Issue on Recent PhD Thesis Dissemination (2020)), ISSN 1577-5097

Original address: https://elcvia.cvc.uab.es/article/view/v19-n2-Riba
Alternative address: https://raco.cat/index.php/ELCVIA/article/view/378782
Alternative address: https://hdl.handle.net/10803/670774
DOI: 10.5565/rev/elcvia.1313


2 p, 86.4 KB

This record appears in the collections:
Articles > Published articles > ELCVIA
Articles > Research articles

Record created 2021-01-13, last modified 2022-02-05


