Popis: |
Text documents are the most common means of exchanging formal knowledge among people. Text is a rich medium that can carry a vast range of information, but it is difficult to decipher automatically. Many organizations hold vast repositories of textual data but have few means of mining that text automatically. Text mining methods seek to use an understanding of natural-language text to extract information relevant to user needs. This article evaluates a new text mining methodology developed by the authors’ research group: prototype matching for text clustering. The methodology was applied to four tasks: clustering documents based on their abstracts, analyzing financial data, distinguishing authorship, and evaluating the similarity of multiple translations. The results are discussed in terms of common business applications and possible future research. |