Multi-dimensional long short-term memory networks for artificial Arabic text recognition in news video
In: IET Computer Vision

Zayene Oussama ; Masmoudi Touj Sameh ; Hennebert Jean ; Ingold Rolf ; Essoukri Ben Amara Najoua

March 2018, No: 1, p. 1-11
IET Digital Library
ISSN: 1751-9632

Peer-reviewed

Language(s): eng

Abstract: This study presents a novel approach to Arabic video text recognition based on recurrent neural networks. Text embedded in videos is a rich source of information for indexing and automatically annotating multimedia documents. However, video text recognition is a non-trivial task because of challenges such as the variability of text patterns and the complexity of backgrounds. In the case of Arabic, the presence of diacritic marks, the cursive nature of the script and the non-uniform intra- and inter-word distances introduce additional challenges. The proposed system is a segmentation-free method that relies on a multi-dimensional long short-term memory network coupled with a connectionist temporal classification layer. It is shown that an efficient pre-processing step and a compact representation of Arabic character models bring robust performance and yield a lower error rate than other recently published methods. The authors' system is trained and evaluated on the public AcTiV-R dataset under different evaluation protocols. The results obtained also outperform current state-of-the-art approaches on the public ALIF dataset in terms of recognition rates at both character and line levels.
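
To illustrate the segmentation-free idea mentioned in the abstract, the following is a minimal sketch of best-path (greedy) decoding for a connectionist temporal classification output layer. It is not the authors' implementation: the function name `ctc_greedy_decode`, the toy alphabet, the blank index and the per-frame scores are all assumptions made for this example, and a real system would take the scores from the trained network.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank label (hypothetical convention)


def ctc_greedy_decode(frame_scores, alphabet, blank=BLANK):
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks."""
    # Pick the most likely label index at every time step.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # Merge consecutive duplicate labels, then remove the blank label.
    collapsed = [label for label, _ in groupby(best_path)]
    return "".join(alphabet[label] for label in collapsed if label != blank)


# Toy alphabet: index 0 is the blank; the other symbols are placeholders.
ALPHABET = ["-", "a", "b"]

# Six frames of made-up per-class scores (illustrative values only).
SCORES = [
    [0.10, 0.80, 0.10],  # a
    [0.10, 0.70, 0.20],  # a (repeat, merged with the previous frame)
    [0.90, 0.05, 0.05],  # blank (separates two distinct 'a' labels)
    [0.20, 0.70, 0.10],  # a
    [0.10, 0.20, 0.70],  # b
    [0.80, 0.10, 0.10],  # blank
]

decoded = ctc_greedy_decode(SCORES, ALPHABET)
print(decoded)  # aab
```

Because the blank label separates repeated characters, no explicit character segmentation of the text line is required; the label sequence is read directly off the frame-wise outputs.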

Keyword(s): machine learning ; deep learning ; text recognition in video ; video processing

Field(s): Computer Science (ICT)

Open access

Record created 2018-05-08, modified 2018-05-08
