HANDWRITTEN OPTICAL CHARACTER RECOGNITION: IMPLEMENTATION FOR KAZAKH LANGUAGE
DOI:
https://doi.org/10.47344/sdubnts.v57i4.618
Keywords:
OCR, handwritten text recognition, KOHTD, neural networks, CNN
Abstract
Many documents, including invoices, tax forms, memoranda, surveys, historical records, and exam answers, are still written by hand despite the shift to digital information exchange. Handwritten text recognition (HTR), the automatic transcription of handwritten records by a computer, is therefore needed. In this work, I present a study of the implementation of optical recognition algorithms for handwritten text in the Kazakh language, using a recently collected database. The database, the Kazakh Offline Handwritten Text Dataset (KOHTD), contains more than 140,335 segmented images of handwritten exam papers. As the algorithm, I used the model proposed by Harald Scheidl, which consists of several neural network layers and a CTC decoder. Trained with a learning rate of lr = 0.01 and a batch size of 60, the model achieved about 85% accuracy.
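
To illustrate what a CTC decoder does with the per-frame outputs of the network, the following is a minimal sketch of best-path (greedy) CTC decoding in plain Python. The abstract does not state which decoding strategy was used, so best-path decoding is an assumption chosen because it is the simplest variant. The function name `ctc_greedy_decode`, the three-letter `charset`, the toy probabilities, and the convention that the blank is the last class index are all hypothetical and serve only as illustration.

```python
from itertools import groupby


def ctc_greedy_decode(frame_probs, charset, blank=None):
    """Best-path CTC decoding of a sequence of per-frame class probabilities.

    frame_probs: list of lists, one probability vector per time step.
    charset:     string whose i-th character is the label of class i.
    blank:       index of the CTC blank class (assumed here to be the
                 last index, i.e. len(charset), when not given).
    """
    if blank is None:
        blank = len(charset)
    # Step 1: take the most likely class at every time step.
    best_path = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]
    # Step 2: merge consecutive repeats of the same class.
    collapsed = [k for k, _ in groupby(best_path)]
    # Step 3: drop the blank symbol and map indices to characters.
    return "".join(charset[k] for k in collapsed if k != blank)


# Hypothetical toy alphabet: class 0 = 'а', 1 = 'қ', 2 = 'л', 3 = blank.
charset = "ақл"
frame_probs = [
    [0.1, 0.7, 0.1, 0.1],  # қ
    [0.1, 0.6, 0.1, 0.2],  # қ (repeat, merged in step 2)
    [0.7, 0.1, 0.1, 0.1],  # а
    [0.1, 0.1, 0.1, 0.7],  # blank
    [0.1, 0.1, 0.7, 0.1],  # л
    [0.6, 0.1, 0.1, 0.2],  # а
]
decoded = ctc_greedy_decode(frame_probs, charset)
print(decoded)  # қала
```

The blank symbol is what lets the decoder tell a genuinely doubled letter from one letter spread over several frames: two frames of the same class separated by a blank survive as two characters, while adjacent identical frames are merged into one.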