HANDWRITTEN OPTICAL CHARACTER RECOGNITION: IMPLEMENTATION FOR KAZAKH LANGUAGE

Authors

  • Mukhtar Kalken

DOI:

https://doi.org/10.47344/sdubnts.v57i4.618

Keywords:

OCR, handwritten text recognition, KOHTD, neural networks, CNN

Abstract

Many documents, such as invoices, tax forms, memoranda, surveys, historical records, and exam answers, are still written by hand and must be converted into digital form for information interchange. Handwritten text recognition (HTR), the automatic decoding of handwritten records by a computer, addresses this need. In this work, I present a study of the implementation of optical character recognition algorithms for handwritten text in the Kazakh language, using a recently collected database. The database, the Kazakh Offline Handwritten Text Dataset (KOHTD), contains more than 140,335 segmented images from handwritten exam papers. As the algorithm, I used the model proposed by Harald Scheidl, which consists of several neural network layers and a connectionist temporal classification (CTC) decoder. Trained with a learning rate of lr = 0.01 and a batch size of 60, the model achieved an accuracy of about 85%.
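
The CTC decoding step mentioned in the abstract can be illustrated with a minimal best-path (greedy) decoding sketch in Python. It is an illustration under stated assumptions, not the author's implementation: the function name ctc_greedy_decode, the three-letter alphabet, the placement of the blank as the last class, and the per-frame scores are all hypothetical.

```python
from typing import List, Optional, Sequence


def ctc_greedy_decode(scores: Sequence[Sequence[float]], alphabet: str) -> str:
    """Best-path CTC decoding.

    scores[t][k] is the network score for class k at time step t.
    ASSUMPTION: classes 0..len(alphabet)-1 are characters and the
    last class (index len(alphabet)) is the CTC blank.
    """
    blank = len(alphabet)

    # 1. Pick the highest-scoring class at every time step.
    best_path: List[int] = [
        max(range(len(frame)), key=frame.__getitem__) for frame in scores
    ]

    # 2. Collapse consecutive repeats, then 3. drop blanks.
    chars: List[str] = []
    prev: Optional[int] = None
    for k in best_path:
        if k != prev and k != blank:
            chars.append(alphabet[k])
        prev = k
    return "".join(chars)


# Hypothetical example: three Kazakh letters plus the blank (index 3).
ALPHABET = "қаз"
FRAMES = [
    [0.80, 0.10, 0.05, 0.05],  # қ
    [0.70, 0.10, 0.10, 0.10],  # қ (repeat, collapsed)
    [0.10, 0.10, 0.10, 0.70],  # blank
    [0.10, 0.80, 0.05, 0.05],  # а
    [0.05, 0.05, 0.10, 0.80],  # blank
    [0.10, 0.10, 0.70, 0.10],  # з
]
DECODED = ctc_greedy_decode(FRAMES, ALPHABET)
```

A blank between two identical predictions keeps both characters, which is how CTC can represent doubled letters; without the blank, the repeats collapse into one. Best-path decoding is the simplest CTC decoder; beam search or a language-model-assisted decoder may be used instead.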

Published

2024-10-18

How to Cite

Kalken, M. (2024). HANDWRITTEN OPTICAL CHARACTER RECOGNITION: IMPLEMENTATION FOR KAZAKH LANGUAGE. Journal of Emerging Technologies and Computing, 57(4), 11-19. https://doi.org/10.47344/sdubnts.v57i4.618