Digging for knowledge
Author: | Harold H. Szu, Dalila Benachenhou, Jeffrey Jenkins, Masud Cader, Liden Miao, Steve Goehl, Charles Hsu |
---|---|
Year of publication: | 2009 |
Subject: |
Theoretical computer science;
Intersection (set theory); Computer science; Cartesian product; Query language; Newspaper; Set (abstract data type); Tree (data structure); Text mining; Index (publishing); Data retrieval; Knowledge extraction; Set operations; The Internet |
Source: | SPIE Proceedings. |
ISSN: | 0277-786X |
DOI: | 10.1117/12.822736 |
Description: | The "smile of a mother" is always recognized, whenever and wherever. But why is my PC always dumb and unable to recognize me or my needs, whoever or whatever? This paper postulates that such a 6W query and search system needs matching storage. Such a lament will soon be mended with a smarter PC, or a smarter Google engine, a network computer, working in the field of data retrieval, feature extraction, reduction, and knowledge precipitation. Specifically, the strategy of modern information storage and retrieval shall work like our brains, which are constantly overwhelmed by 5 pairs of identical tapes taken by eyes, ears, etc. Five high-fidelity sensors generate 5 pairs of high-definition tapes, which produce the seeing, hearing, etc. in our perception. This amounts to 10 tapes recorded in an unabridged fashion. How can we store and retrieve them when we need to? We must reduce the redundancy, enhance the signal-to-noise ratio, and fuse invariant features using a simple set of mathematical operations: write according to the union and read by the intersection in a higher-dimensional vector space. For example, (see paper for equation) where the query must be phrased in terms of the union of an imprecise or partial set of 6 w's, denoted by the union of lower-case w's. The upper-case W's are the archival storage of a primer tree. A simplified humanistic representation may be called the 6W space (who, what, where, when, why, how), also referred to as the Newspaper geometry. Mapping the 6W to the 3W (World Wide Web) seems to be becoming relatively easier. Rapid digging for knowledge may thus become efficient and robust through the set operations of union (writing) and intersection (reading), given a 6W query search engine matched efficiently by 6W vector index databases. 
In fact, the Newspaper 6D geometry may be reduced further by PCA (Principal Component Analysis) eigenvector mathematics and mapped into a 2D causality space comprising the causes (What, How, Why) and the effects (Where, When, and Who). If this hypothesis of brain strategy were true, one must then develop a 6W query language to support a 6W-ordered set storage of linkage pointers in high-dimensional space. In other words, one can easily map the basic 1st Gen. Google Web, 1-D statistical PageRanking databases, to a nested 6W tree in which each sub-6W branch stems from the prime 6W tree, using a system of automated text mining assisted by syntactic semantics to discern the properties of the 6W for that query. Goehl et al. have demonstrated previously that this is doable, but one may need more tools to support the knowledge extraction and automated feature reduction. In this paper, we set out to demonstrate lossless down-sampling using the 2nd Gen wavelet transform, the so-called "1-D Cartesian lifting processing of Sweldens" adopted by JPEG 2000. "The loss of statistics, if any (including PageRanking and 1-D lifting), is the loss of geometry insights," as with 2-D vector time series such as video, whose 1-D lifting Cartesian product will lose insight into the diagonal changes. |
Database: | OpenAIRE |
External link: |
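
The abstract's "write according to the union and read by the intersection" idea can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `SixWIndex`, its `write` and `read` methods, and the document ids used below are all hypothetical, and the sketch assumes each W axis is a plain inverted index from a facet value to a set of document ids. An imprecise query is modelled as several candidate values per W, which are unioned within an axis and then intersected across axes.

```python
from collections import defaultdict

# The six axes of the "Newspaper geometry" named in the abstract.
SIX_W = ("who", "what", "where", "when", "why", "how")


class SixWIndex:
    """Hypothetical 6W index: one inverted index (value -> doc ids) per axis."""

    def __init__(self):
        self._axes = {w: defaultdict(set) for w in SIX_W}

    def write(self, doc_id, **facets):
        """Write by union: merge doc_id into the posting set of every facet value."""
        for w, values in facets.items():
            if w not in self._axes:
                raise KeyError(f"unknown axis: {w}")
            for value in values:
                self._axes[w][value].add(doc_id)

    def read(self, **query):
        """Read by intersection.

        Within one axis the candidate values are unioned (a partial or
        imprecise query); the per-axis results are then intersected.
        An empty query returns an empty set (an assumption of this sketch).
        """
        result = None
        for w, values in query.items():
            if w not in self._axes:
                raise KeyError(f"unknown axis: {w}")
            axis = self._axes[w]
            # .get avoids creating empty posting sets for unseen values.
            candidates = set().union(*(axis.get(v, set()) for v in values))
            result = candidates if result is None else result & candidates
        return result if result is not None else set()
```

A possible usage, with made-up ids and facet values: after `index.write("doc1", who=["alice"], when=["2009"])` and `index.write("doc2", who=["bob"], when=["2009"])`, the call `index.read(who=["alice", "bob"], when=["2009"])` returns both ids, while `index.read(who=["alice"], when=["2008"])` returns an empty set. A nested 6W tree, as the abstract describes, would need each posting to point at a further sub-index, which this flat sketch does not attempt.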