Description: |
The main objective of this paper is to provide cluster summarization of very large text documents. The mining process involves sharing large amounts of data from various sources, which are ultimately consolidated into the mined data. In distributed data mining, adopting a flat node distribution model can limit scalability, modularity, and flexibility; these limitations are overcome by using dynamic peer-to-peer document clustering and cluster summarization. The Dynamic P2P Document Clustering and Cluster Summarization (DP2PCS) architecture is based on bonus words and stigma words. For document clustering applications, the system summarizes the distributed document clusters using a distributed key-phrase extraction algorithm, thus providing an interpretation of the clusters. Document summarization enables faster information retrieval. Compared with existing systems, the dynamic nature of the proposed system facilitates a scalable cluster in which peers may join or leave the group at will. On average, the summarization process reduces the content of the original documents by 63 percent, measured by word count. |