Efficient learning of neighbor representations for boundary trees and forests
Author: | Stark C. Draper, Tharindu Adikari |
---|---|
Language: | English |
Year of publication: | 2018 |
Subject: |
010302 applied physics; FOS: Computer and information sciences; Computer Science - Machine Learning; Similarity (geometry); Computer science; Boundary (topology); Machine Learning (stat.ML); 02 engineering and technology; 01 natural sciences; Measure (mathematics); Machine Learning (cs.LG); Reduction (complexity); Tree (data structure); Semantic similarity; Statistics - Machine Learning; 020204 information systems; 0103 physical sciences; 0202 electrical engineering, electronic engineering, information engineering; Differentiable function; Algorithm; MNIST database |
Source: | CISS |
Description: | We introduce a semiparametric approach to neighbor-based classification. We build on the recently proposed Boundary Trees algorithm by Mathy et al. (2015), which enables fast neighbor-based classification, regression, and retrieval in large datasets. While boundary trees use a Euclidean measure of similarity, the Differentiable Boundary Tree algorithm by Zoran et al. (2017) was introduced to learn low-dimensional representations of complex input data, on which semantic similarity can be calculated to train boundary trees. As its authors point out, the differentiable boundary tree approach has a few limitations that prevent it from scaling to large datasets. In this paper, we introduce Differentiable Boundary Sets, an algorithm that overcomes the computational issues of the differentiable boundary tree scheme and also improves its classification accuracy and data representability. Our algorithm is efficiently implementable with existing tools and offers a significant reduction in training time. We test and compare the algorithms on the well-known MNIST handwritten digits dataset and the newer Fashion-MNIST dataset by Xiao et al. (2017). 9 pages, 2 figures |
Database: | OpenAIRE |
External link: |