Semantic Routed Network for Distributed Search Engines

Author: Biswas, Amitava
Year of publication: 2010
Subject:
Document type: Diploma thesis
Description: Searching for textual information has become an important activity on the web. To satisfy rising demand and user expectations, search systems should be fast and scalable and should deliver relevant results. To decide which objects to retrieve, search systems should compare the holistic meanings of queries and text documents, as humans perceive them. Existing techniques do not enable correct comparison of composite holistic meanings such as "evidences on role of DR2 gene in development of diabetes in Caucasian population", which is composed of multiple elementary meanings: "evidence", "DR2 gene", etc. These techniques therefore cannot distinguish objects that share a common set of keywords but convey different meanings. New methods for comparing composite meanings are thus needed to achieve superior search quality. In distributed search engines, for scalability, speed and efficiency, index entries should be systematically distributed across multiple index-server nodes based on the meaning of the objects. Furthermore, queries should be sent selectively to those index nodes that hold relevant entries. This requires an overlay Semantic Routed Network that routes messages based on meaning. The network will consist of fast-response networking appliances called semantic routers. These appliances need to: (a) carry out sophisticated meaning comparison computations at high speed; and (b) behave in a way that automatically organizes an optimal index system. This dissertation presents the following artifacts, which address the above requirements: (1) An algebraic theory, a data structure design and related techniques to compare composite meanings efficiently. (2) Algorithms and accelerator architectures for high-speed meaning comparisons inside semantic routers and index-server nodes. (3) An overlay network to deliver search queries to the index nodes based on meaning. (4) Algorithms to construct a self-organizing, distributed, meaning-based index system.
The proposed techniques can compare composite meanings ~105 times faster than equivalent software code and existing hardware designs. The proposed index organization approach, in turn, can yield 33% savings in the number of servers and in power consumption in a model search engine with 700,000 servers. Using all these techniques, it is therefore possible to design a Semantic Routed Network that has the potential to improve search results and response time while saving resources.
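
The idea of sending a query only to those index nodes that hold relevant entries can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the node names, the weighted concept bags used as node summaries, the cosine similarity measure and the 0.5 threshold are hypothetical and are not taken from the dissertation, whose own algebraic theory and data structure for composite meanings are not described in this record.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def route(query, node_summaries, threshold=0.5):
    """Return the ids of index nodes whose summary is similar enough to the query."""
    return sorted(
        node_id
        for node_id, summary in node_summaries.items()
        if cosine(query, summary) >= threshold
    )


# Hypothetical index nodes, each summarised by a weighted bag of concepts.
# This stand-in does not capture composite meaning; it only shows selective routing.
NODE_SUMMARIES = {
    "node-genetics": {"gene": 1.0, "diabetes": 0.8, "population": 0.5},
    "node-networks": {"router": 1.0, "overlay": 0.7, "index": 0.6},
}

query = {"gene": 1.0, "diabetes": 1.0}
selected = route(query, NODE_SUMMARIES)  # only the genetics node passes the threshold
```

A plain concept bag like this one still treats two texts with the same concepts as identical, which is the limitation the abstract attributes to existing techniques; the sketch only shows the routing decision itself.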
Database: Networked Digital Library of Theses & Dissertations