Research of handwriting text recognition algorithm using machine learning
Language: | Russian |
---|---|
Year of publication: | 2022 |
Subject: | |
DOI: | 10.18720/spbpu/3/2022/vr/vr22-2830 |
Description: | Handwritten text recognition remains one of the unsolved problems in computer vision and artificial intelligence. Existing solutions are either applicable only in narrow domains and make no claim to generality, or show insufficient recognition quality. This work studies the problem of recognizing handwritten Cyrillic text. It reviews previously proposed current solutions and proposes a general structure for the recognition algorithm, methods for solving its subtasks, and a software implementation of individual modules. The developed text recognition algorithm is based on isolating individual words in the text and recognizing individual characters within them using neural networks. Character recognition results are combined by a post-processing algorithm that determines the most likely variants of the recognized word. The work proposes solutions for each stage of the algorithm and for auxiliary tasks. The tasks of image preprocessing, word boundary detection, and post-processing are addressed. The work also addresses the construction of the life cycle of a neural network intended for character recognition, in particular the construction of a labeled sample with boundaries of individual characters. The macro parameters of the post-processing algorithm are selected using an algorithm based on an evolutionary strategy. |
Database: | OpenAIRE |
External link: |
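
The description mentions a post-processing step that determines the most likely variants of a recognized word from per-character recognition results. The record does not say how this step works, so the following is only a minimal sketch under stated assumptions: the neural network is assumed to output an independent probability distribution over characters at each position, and the function name `top_word_variants` and its parameter `k` are hypothetical, not taken from the work.

```python
import heapq
import math


def top_word_variants(char_probs, k=3):
    """Return the k most probable words as (word, probability) pairs.

    char_probs: list of dicts, one per character position, mapping a
    candidate character to its probability (assumed classifier output).
    Positions are assumed independent, so a word's probability is the
    product of its character probabilities.
    """
    beams = [(0.0, "")]  # (log-probability, prefix built so far)
    for dist in char_probs:
        # Extend every kept prefix with every candidate character.
        candidates = [
            (logp + math.log(p), prefix + ch)
            for logp, prefix in beams
            for ch, p in dist.items()
            if p > 0
        ]
        # Keep only the k best prefixes; with independent positions this
        # is enough to recover the k best complete words.
        beams = heapq.nlargest(k, candidates)
    return [(word, math.exp(logp)) for logp, word in beams]


# Hypothetical classifier output for a three-character word.
example = [
    {"к": 0.6, "н": 0.4},
    {"о": 0.9, "а": 0.1},
    {"т": 0.7, "г": 0.3},
]
for word, prob in top_word_variants(example, k=3):
    print(word, round(prob, 3))
```

Working in log-probabilities avoids numerical underflow on long words. A real post-processing stage might additionally weight candidates with a dictionary or language model, which this sketch deliberately leaves out.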