Few shots are all you need : A progressive learning approach for low resource handwritten text recognition
Souibgui, Mohamed Ali 
(Universitat Autònoma de Barcelona. Departament de Ciències de la Computació)
Fornes Bisquerra, Alicia 
(Universitat Autònoma de Barcelona. Departament de Ciències de la Computació)
Kessentini, Yousri 
(Artificial Intelligence and Networks)
Megyesi, Beáta 
(Uppsala University. Department of Linguistics and Philology)
| Date: |
2022 |
| Abstract: |
Handwritten text recognition in low resource scenarios, such as manuscripts with rare alphabets, is a challenging problem. In this paper, we propose a few-shot learning-based handwriting recognition approach that significantly reduces the human annotation effort by requiring only a few images of each alphabet symbol. The method consists of detecting all the symbols of a given alphabet in a text-line image and decoding the obtained similarity scores into the final sequence of transcribed symbols. Our model is first pretrained on synthetic line images generated from an alphabet, which may differ from the alphabet of the target domain. A second training step is then applied to reduce the gap between the source and the target data. Since this retraining would require annotating thousands of handwritten symbols together with their bounding boxes, we propose to avoid such human effort through an unsupervised progressive learning approach that automatically assigns pseudo-labels to the unlabeled data. The evaluation on different datasets shows that our model can achieve competitive results with a significant reduction in human effort. The code will be publicly available in the following repository: https://github.com/dali92002/HTRbyMatching. |
| Grants: |
Agencia Estatal de Investigación RTI2018-095645-B-C21
|
| Note: |
Other grants: UAB transformative agreements |
| Rights: |
This document is subject to a Creative Commons license. Full or partial reproduction, distribution, public communication of the work, and the creation of derivative works are permitted, even for commercial purposes, provided that the authorship of the original work is acknowledged. |
| Language: |
English |
| Document: |
Article ; research ; Published version |
| Subject: |
Handwritten text recognition ;
Few-shot learning ;
Unsupervised progressive learning ;
Ciphered manuscripts |
| Published in: |
Pattern Recognition Letters, Vol. 160 (August 2022) , p. 43-49, ISSN 0167-8655 |
DOI: 10.1016/j.patrec.2022.06.003
Record created 2022-07-25, last modified 2023-10-01