Apply Extended Learning Vector Quantization to Classify Mixed and Categorical Data

Author: Hung-Yi Tsai, 蔡宏益
Publication year: 2012
Document type: thesis
Description: 100
With the rapid growth of information technology, most corporations have collected large amounts of digital data, such as data on employees, customers, and transactions. Mining useful patterns from these data has therefore become an important issue. Learning Vector Quantization (LVQ) is a prototype-based classification technique that can process a large volume of data within reasonable computation time. However, because it relies on Euclidean distance, traditional LVQ processes only numeric data and cannot directly handle categorical data, which must be converted in advance, typically with the 1-of-k method. After the conversion, categorical data lose their semantic information, which reduces classification performance. In this work, we propose an Improved LVQ (ILVQ) that handles mixed-type data by using a distance hierarchy to express the relationships between categorical values. Experimental results show that ILVQ outperforms traditional LVQ in classifying mixed-type data.
Database: Networked Digital Library of Theses & Dissertations
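
The abstract's claim that 1-of-k conversion discards semantic information can be illustrated with a small sketch. It is not taken from the thesis: the "drink" hierarchy, its link weights, and the function names below are hypothetical, and the distance hierarchy is modelled simply as a weighted tree in which the distance between two values is the length of the path joining them. Under these assumptions, 1-of-k encoding places every pair of distinct categories at the same Euclidean distance, while the tree distance keeps similar categories closer together.

```python
import math

# Hypothetical concept hierarchy for a categorical attribute "drink".
# Each node maps to (parent, weight of the link to that parent).
# The structure and the weights are illustrative assumptions only.
PARENT = {
    "Coke": ("Carbonated", 1.0),
    "Pepsi": ("Carbonated", 1.0),
    "Latte": ("Coffee", 1.0),
    "Carbonated": ("Any", 1.0),
    "Coffee": ("Any", 1.0),
}
CATEGORIES = ["Coke", "Pepsi", "Latte"]


def distances_to_ancestors(value):
    """Map each ancestor of `value` (and `value` itself) to its path length."""
    dist = {value: 0.0}
    total, node = 0.0, value
    while node in PARENT:
        parent, weight = PARENT[node]
        total += weight
        dist[parent] = total
        node = parent
    return dist


def hierarchy_distance(a, b):
    """Path length between a and b through their closest common ancestor."""
    da, db = distances_to_ancestors(a), distances_to_ancestors(b)
    common = set(da) & set(db)
    return min(da[n] + db[n] for n in common)


def one_of_k_distance(a, b, categories):
    """Euclidean distance between the 1-of-k encodings of a and b."""
    va = [1.0 if c == a else 0.0 for c in categories]
    vb = [1.0 if c == b else 0.0 for c in categories]
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(va, vb)))


if __name__ == "__main__":
    for pair in [("Coke", "Pepsi"), ("Coke", "Latte")]:
        print(pair,
              "1-of-k:", round(one_of_k_distance(*pair, CATEGORIES), 3),
              "hierarchy:", hierarchy_distance(*pair))
```

In this toy hierarchy, Coke and Pepsi share the parent "Carbonated" and so end up closer to each other than Coke and Latte, whereas their 1-of-k encodings are equally far apart. How ILVQ combines such distances with numeric attributes and updates prototypes is not described in this record and is not sketched here.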