Description: |
With the increasing interest in document analysis research, the number of available OCR, segmentation, noise removal and other document analysis algorithms has grown considerably. However, algorithms are still purpose-specific, and to obtain optimal results, different algorithms are usually needed in different situations. The problem is to reliably evaluate the performance of an algorithm in a given situation. A framework for a benchmarking system for document analysis algorithms is presented. The system consists of a set of test cases for measuring the performance of different document analysis algorithms. The system is expandable: new algorithm types can be added for testing by creating new test cases and benchmarking methods. The whole benchmarking process can be automated to allow mass performance testing with numerous algorithms. A set of weights is used to adjust the relative significance of the different aspects of a test case. The result of benchmarking is expressed as a single value that represents the performance of the algorithm in a given test case. This value can be easily compared with the results of other algorithms, which enables ranking of the tested algorithms. Experiments with the benchmarking system show promising results, and the performance ranking complies well with subjective human evaluation.