Description: |
Handwritten documents remain considerably harder to recognise than printed documents. In practice, recognition operates on words or character strings rather than on isolated characters, so the segmentation of lines and words plays a pivotal role: errors at this stage can drastically affect the results of the recognition task. This paper proposes a line-segmentation approach that combines two distinct techniques, the horizontal projection profile and seam carving. The horizontal projection profile first gives a rough estimate of where the lines lie in the document. On its own, this method works well for printed documents but is not sufficient for handwritten ones, where line spacing varies from writer to writer, so seam carving is then applied to segment the lines finely. Dynamic programming is used to build an energy matrix from the input image and to determine the minimum-energy paths from left to right. For word segmentation, contour points are traced before the seam carving algorithm is applied to find candidate paths, and paths that intersect the characters of the text are removed. The standard publicly available IAM English handwritten dataset and the Bangla Writing dataset are used to evaluate the text-line and line-word segmentation techniques, and the results show promising recognition accuracy. |
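
The two-stage line segmentation described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the energy function, the thresholds, or the search band, so the choices here are assumptions made for illustration. In particular, the ink mask itself serves as the energy, and the names `rough_line_gaps`, `min_energy_seam`, `segment_lines`, `thresh_ratio`, and `half_band` are hypothetical. The page is assumed to be already binarised as a list of rows with 1 for ink and 0 for background.

```python
def horizontal_projection(binary):
    """Count ink pixels in each row (binary[y][x] is 1 for ink, 0 for background)."""
    return [sum(row) for row in binary]


def rough_line_gaps(profile, thresh_ratio=0.5):
    """Return the middle row of each low-ink run lying between two text rows.

    Assumption: a row counts as 'low ink' when its projection value is at most
    thresh_ratio times the mean projection value.
    """
    threshold = thresh_ratio * sum(profile) / len(profile)
    gaps, start, seen_text = [], None, False
    for y, value in enumerate(profile):
        if value <= threshold:
            if seen_text and start is None:
                start = y  # a low-ink run begins below some text
        else:
            if start is not None:
                gaps.append((start + y - 1) // 2)  # middle of the finished run
                start = None
            seen_text = True
    return gaps


def min_energy_seam(energy):
    """Minimum-energy left-to-right path; returns one row index per column."""
    h, w = len(energy), len(energy[0])
    cost = [[float(energy[y][0])] + [0.0] * (w - 1) for y in range(h)]
    back = [[0] * w for _ in range(h)]
    for x in range(1, w):
        for y in range(h):
            # The path may arrive from the row above, the same row, or the row below.
            candidates = [r for r in (y - 1, y, y + 1) if 0 <= r < h]
            best = min(candidates, key=lambda r: cost[r][x - 1])
            cost[y][x] = energy[y][x] + cost[best][x - 1]
            back[y][x] = best
    # Backtrack from the cheapest end point in the last column.
    row = min(range(h), key=lambda r: cost[r][w - 1])
    seam = [row]
    for x in range(w - 1, 0, -1):
        row = back[row][x]
        seam.append(row)
    seam.reverse()
    return seam


def segment_lines(binary, half_band=1, thresh_ratio=0.5):
    """Refine each rough gap row into a seam searched within a narrow band."""
    seams = []
    for gap in rough_line_gaps(horizontal_projection(binary), thresh_ratio):
        lo = max(0, gap - half_band)
        hi = min(len(binary), gap + half_band + 1)
        # Assumption: the ink mask itself is the energy, so seams avoid ink.
        seam = min_energy_seam(binary[lo:hi])
        seams.append([lo + r for r in seam])
    return seams
```

Restricting each seam to a band around the projection-profile estimate keeps the path between the two intended lines. An unconstrained search would tend to settle in an empty page margin instead. On a page where a descender from one line and an ascender from the next leave no ink-free row between them, a straight cut would hit ink, whereas the seam can bend around both strokes.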