A Semi-automatic Solution Archive for Cross-Cut Shredded Text Documents Reconstruction
Author: | Jinlin Guo, Songyang Lao, Shuxuan Guo, Hang Xiang |
---|---|
Publication year: | 2015 |
Subject: | |
Source: | Lecture Notes in Computer Science ISBN: 9783319219776 ICIG (1) |
DOI: | 10.1007/978-3-319-21978-3_39 |
Description: | Automatic reconstruction of cross-cut shredded text documents (RCCSTD) is important in some areas and remains a highly challenging problem. In this work, we propose a novel semi-automatic reconstruction solution archive for RCCSTD. This solution archive consists of five components: preprocessing, row clustering, an error evaluation function (EEF), optimal reconstruction route search, and human mediation (HM). Specifically, a row clustering algorithm based on the signal correlation coefficient and cross-correlation sequence, and an improved EEF based on gradient vectors, are each evaluated with and without HM. Experimental results show that row clustering is effective for identifying and grouping shreds that belong to the same row of a text document. The EEF proposed in this work improves precision and performs well in RCCSTD whether or not HM is used. Overall, adding HM improves the performance of both row clustering and shred reconstruction. |
Database: | OpenAIRE |
External link: |
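
The record gives no algorithmic detail, so the following is only a minimal sketch of how row clustering by signal correlation coefficient might look, not the authors' method. It assumes each shred is a small binary image (1 = ink), uses the horizontal projection profile as the "signal", and groups shreds greedily by Pearson correlation. The function names, the greedy strategy, and the `threshold` value are all hypothetical, and the cross-correlation sequence mentioned in the description is not modelled here.

```python
from math import sqrt


def row_profile(shred):
    """Horizontal projection: number of ink pixels (1s) in each pixel row."""
    return [sum(row) for row in shred]


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0:
        return 0.0  # a flat profile carries no row information
    return sum(x * y for x, y in zip(da, db)) / denom


def cluster_rows(shreds, threshold=0.8):
    """Greedy grouping (an assumption, not the paper's algorithm):
    a shred joins the first cluster whose seed profile correlates
    with its own profile at or above `threshold`; otherwise it
    starts a new cluster."""
    clusters = []  # list of (seed_profile, member_indices)
    for idx, shred in enumerate(shreds):
        profile = row_profile(shred)
        for seed, members in clusters:
            if pearson(seed, profile) >= threshold:
                members.append(idx)
                break
        else:
            clusters.append((profile, [idx]))
    return [members for _, members in clusters]
```

Shreds cut from the same row of the page share the vertical positions of their text lines, so their projection profiles correlate strongly, while shreds from different rows do not. In a semi-automatic setting, a human could then review and correct borderline assignments.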