The study of speed prediction using k-NN algorithm under Hadoop Environment

Author: Huang, Chien Kai, 黃建凱
Year of publication: 2012
Document type: 學位論文 ; thesis
Description: 100
This thesis compares two distance measures, Euclidean distance and Manhattan distance, in the k-NN algorithm, implemented on the Hadoop distributed platform and applied to vehicle speed prediction as an example. Several experiments are conducted to evaluate performance. The results show how execution time can be shortened to improve efficiency, and they can serve as a basis for subsequent research. k-NN is an instance-based machine learning algorithm. It estimates an expected value from a large body of existing data, but its results are often influenced by the amount of data available. Processing such large, multi-dimensional data sets requires substantial computing power and is therefore time consuming. The similarity between records is expressed as a distance, which can be computed by different methods. The most popular are Manhattan distance and Euclidean distance, both of which support multi-dimensional analysis in k-NN. When these methods are used to predict driving speed, they can be implemented on the Hadoop platform, where their efficiency can be observed and the running time improved. We find that Manhattan distance is more efficient than Euclidean distance. With either distance measure, computing time does not increase linearly as the amount of data grows, which demonstrates the high scalability of the approach.
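The abstract does not give the implementation, so the following is only a minimal sketch of the general idea, not the author's code. It assumes k-NN is used as a regressor that averages the speeds of the k nearest records, and it imitates a MapReduce job in plain Python: each data partition reports its local k nearest candidates (map), and the candidates are merged into a global top k (reduce). The function names, the feature layout, and the averaging rule are all hypothetical. Note that Manhattan distance needs only absolute differences and additions, while Euclidean distance also needs squares and a square root, which may be one reason for the efficiency gap the abstract reports.

```python
import heapq
import math
from typing import Callable, List, Sequence, Tuple

# Hypothetical record layout: (feature vector, observed speed).
Sample = Tuple[Sequence[float], float]
Distance = Callable[[Sequence[float], Sequence[float]], float]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def map_partition(partition: List[Sample], query: Sequence[float],
                  k: int, dist: Distance) -> List[Tuple[float, float]]:
    """Map step: the k nearest (distance, speed) pairs within one partition."""
    return heapq.nsmallest(k, ((dist(f, query), s) for f, s in partition))


def reduce_candidates(candidates: List[List[Tuple[float, float]]],
                      k: int) -> float:
    """Reduce step: merge per-partition candidates and average the k nearest speeds."""
    merged = heapq.nsmallest(k, (c for part in candidates for c in part))
    return sum(speed for _, speed in merged) / len(merged)


def predict_speed(partitions: List[List[Sample]], query: Sequence[float],
                  k: int, dist: Distance) -> float:
    """Run the map step on every partition, then reduce to one prediction."""
    mapped = [map_partition(p, query, k, dist) for p in partitions]
    return reduce_candidates(mapped, k)
```

Because each partition forwards at most k candidates, the reduce step handles a small, fixed amount of data per partition regardless of partition size. The two distance functions are interchangeable arguments, so the same job can be timed with either one.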
Database: Networked Digital Library of Theses & Dissertations