Web of Science: 17 citations, Scopus: 23 citations
Few shots are all you need: A progressive learning approach for low resource handwritten text recognition
Souibgui, Mohamed Ali (Universitat Autònoma de Barcelona. Departament de Ciències de la Computació)
Fornes Bisquerra, Alicia (Universitat Autònoma de Barcelona. Departament de Ciències de la Computació)
Kessentini, Yousri (Artificial Intelligence and Networks)
Megyesi, Beáta (Uppsala University. Department of Linguistics and Philology)

Date: 2022
Abstract: Handwritten text recognition in low resource scenarios, such as manuscripts with rare alphabets, is a challenging problem. In this paper, we propose a few-shot learning-based handwriting recognition approach that significantly reduces the human annotation process by requiring only a few images of each alphabet symbol. The method consists of detecting all the symbols of a given alphabet in a textline image and decoding the obtained similarity scores to the final sequence of transcribed symbols. Our model is first pretrained on synthetic line images generated from an alphabet, which could differ from the alphabet of the target domain. A second training step is then applied to reduce the gap between the source and the target data. Since this retraining would require annotation of thousands of handwritten symbols together with their bounding boxes, we propose to avoid such human effort through an unsupervised progressive learning approach that automatically assigns pseudo-labels to the unlabeled data. The evaluation on different datasets shows that our model can lead to competitive results with a significant reduction in human effort. The code will be publicly available in the following repository: https://github.com/dali92002/HTRbyMatching.
Grants: Agencia Estatal de Investigación RTI2018-095645-B-C21
Note: Other funding: UAB transformative agreements
Rights: This document is subject to a Creative Commons license. Total or partial reproduction, distribution, and public communication of the work, as well as the creation of derivative works, are permitted, including for commercial purposes, provided that the authorship of the original work is acknowledged.
Language: English
Document: Article ; research ; Published version
Subject: Handwritten text recognition ; Few-shot learning ; Unsupervised progressive learning ; Ciphered manuscripts
Published in: Pattern Recognition Letters, Vol. 160 (August 2022), p. 43-49, ISSN 0167-8655

DOI: 10.1016/j.patrec.2022.06.003


7 p., 1.2 MB

The record appears in these collections:
Articles > Research articles
Articles > Published articles

 Record created 2022-07-25, last modified 2023-10-01


