||In this PhD thesis we give a detailed survey of natural scene text detection methods and propose two novel methods: the SWT (Stroke Width Transform) voting-based color reduction method and the SWT direction determination method. The SWT voting-based color reduction method (referred to as SWT-V) is a text detection method that, unlike many other text detection methods, combines structural and color information to detect text. It extends the text detection oriented color reduction method (referred to as TOCR) with an additional SWT voting stage and substantially outperforms other state-of-the-art text detection methods. Image colors that are rich in SWT pixels, and therefore most likely belong to text characters, are blocked from being mean-shifted away during color reduction. One disadvantage of the SWT method, however, is the ‘light text on dark background’ problem described in the thesis. To cope with this problem and to provide true SWT values to the SWT voting stage, we propose an adaptive SWT direction determination method. The method uses SWT profiles to partition an image into subblocks and analyzes each subblock's SWT histograms for both SWT search directions. The text detection literature does not explicitly address the SWT direction issue; the proposed method is therefore a novel contribution to the research field. All text detection methods were evaluated on the CVL OCR DB text detection evaluation dataset.
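The voting idea in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function name `protected_colors`, the flat-list pixel representation, and the `min_share` threshold are all assumptions introduced here for illustration. The sketch only shows how colors supported by many SWT pixels could be marked as protected before a color reduction step.

```python
from collections import Counter


def protected_colors(colors, swt_mask, min_share=0.5):
    """Return the colors whose share of SWT-supported pixels is >= min_share.

    colors    : flat list of hashable color values, one per pixel
    swt_mask  : flat list of booleans, True where the pixel has a stroke width
    min_share : hypothetical voting threshold (an assumption, not from the thesis)
    """
    if len(colors) != len(swt_mask):
        raise ValueError("colors and swt_mask must have the same length")

    # Total number of pixels per color.
    totals = Counter(colors)
    # Votes: pixels of each color that also carry an SWT value.
    votes = Counter(c for c, hit in zip(colors, swt_mask) if hit)

    # A color is protected when enough of its pixels are SWT pixels.
    return {c for c, n in totals.items() if votes[c] / n >= min_share}


if __name__ == "__main__":
    pixel_colors = ["red", "red", "blue", "blue", "blue", "blue"]
    has_stroke = [True, True, True, False, False, False]
    print(protected_colors(pixel_colors, has_stroke))
```

In a full pipeline the protected set would be passed to the color reduction stage so that those colors are not merged into their neighbors. How the thesis actually weights the votes is not stated in this abstract.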
||Advisor: Peter Peer. Date and location of PhD thesis defense: 24 October 2013, University of Ljubljana
||This document is subject to a Creative Commons license. Total or partial reproduction and public communication of the work are permitted, provided that this is not for commercial purposes and that the authorship of the original work is acknowledged. The creation of derivative works is not permitted.
||other ; abstract ; publishedVersion
Computer vision ;
Character and text recognition ;
||ELCVIA : Electronic Letters on Computer Vision and Image Analysis, Vol. 13, No. 2 (2014), p. 38-39, ISSN 1577-5097