Fuzzy evolutionary optimization modeling and its applications to unsupervised categorization and extractive summarization
Authors: | Soon Cheol Park, Lim Cheon Choi, Wei Song, Xiao Feng Ding |
---|---|
Contributors: | Song, Wei; Cheon, Choi Lim; Cheol, Park Soon; Ding, Xiao feng |
Language: | English |
Year of publication: | 2011 |
Subject: |
Fuzzy clustering; Computer science; General Engineering; Normalized Google distance; Fuzzy control system; Unsupervised categorization; Machine learning; Fuzzy logic; Automatic summarization; Computer Science Applications; Categorization; Artificial intelligence; Topics estimation; Data mining; Cluster analysis; Extractive summarization; Sentence; Fuzzy evolutionary optimization; Premature convergence |
Description: | Modern information retrieval (IR) systems consist of many challenging components, e.g. clustering and summarization. Rather than requiring users to browse the whole volume of a data set, IR systems now present users with clusters of the documents they are interested in and briefly summarize each document, which makes it easier to find the desired documents. This paper proposes fuzzy evolutionary optimization modeling (FEOM) and its applications to unsupervised categorization and extractive summarization. In view of the nature of biological evolution, we use several fuzzy control parameters to adaptively regulate the behavior of the evolutionary optimization, which can effectively prevent premature convergence to a locally optimal solution. As a portable, modular, and extensively executable model, FEOM is first implemented for clustering text documents. The search capability of FEOM is exploited to explore appropriate partitions of documents such that the similarity metric of the resulting clusters is optimized. To further investigate its effectiveness as a generic data clustering model, FEOM is then applied to sentence-clustering-based extractive document summarization: it selects the most important sentence from each cluster to represent the overall meaning of the document. We demonstrate the improved performance through a series of experiments on standard test sets (the Reuters document collection, the 20-newsgroup corpus, DUC01, and DUC02), evaluated with commonly used metrics, i.e. F-measure and ROUGE. The experimental results show that FEOM achieves performance as good as or better than state-of-the-art clustering and summarization systems. Refereed/Peer-reviewed |
Database: | OpenAIRE |
External link: |
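
The description says that, for summarization, one "most important" sentence is selected from each sentence cluster. The record does not say how importance is scored, so the following is only a minimal sketch under stated assumptions: cluster labels are taken as given (they are not computed by FEOM here), sentences are represented as simple word counts, and importance is assumed to be cosine similarity to the cluster centroid. The names `bag_of_words`, `cosine`, and `pick_representatives` are hypothetical and do not come from the paper.

```python
import math
from collections import Counter, defaultdict


def bag_of_words(sentence):
    """Lowercased whitespace-token counts; a deliberately simple representation."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pick_representatives(sentences, labels):
    """Return one sentence per cluster, in original document order.

    labels[i] is the precomputed cluster id of sentences[i] (an assumption:
    any clustering method could supply it). Importance is assumed to be
    similarity to the cluster centroid, which is not confirmed by the source.
    """
    clusters = defaultdict(list)
    for idx, label in enumerate(labels):
        clusters[label].append(idx)

    chosen = []
    for members in clusters.values():
        vectors = {i: bag_of_words(sentences[i]) for i in members}
        centroid = Counter()
        for vec in vectors.values():
            centroid.update(vec)  # summed counts; cosine is scale-invariant
        chosen.append(max(members, key=lambda i: cosine(vectors[i], centroid)))
    return [sentences[i] for i in sorted(chosen)]
```

Calling `pick_representatives(sentences, labels)` with one label per sentence yields a summary containing as many sentences as there are clusters. A different importance score could be swapped in by replacing the `key` function.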