An Online Pyramidal Embedding Technique for High Dimensional Big Data Visualization
Author: | Claudomiro Sales, Adriano Barreto, Caio Flexa, Eduardo Cardoso, Igor Moreira |
---|---|
Publication year: | 2020 |
Subject: | |
Source: | Intelligent Systems ISBN: 9783030613792 BRACIS (2) |
Description: | Visualizing multidimensional Big Data is challenging: high dimensionality hinders or even precludes visual inspection. One means of tackling this issue is to use DR (Dimensionality Reduction) techniques, which produce low-dimensional representations of high-dimensional data. Popular DR algorithms (e.g., Principal Component Analysis, t-Distributed Stochastic Neighbor Embedding), albeit helpful, are computationally expensive. Most have \(\mathcal{O}(n^2)\) or \(\mathcal{O}(n^3)\) ATC (Asymptotic Time Complexity) and/or calculate pairwise distances over the entire data set, exceeding available memory and rendering Big Data DR time-consuming or impracticable. These issues impede the use of DR in online learning applications, where recurrent, cumulative model updates are routine. The stochastic component of some approaches similarly obstructs any meaningful inspection of how knowledge is spatially arranged. The recently introduced PCS (Polygonal Coordinate System), an incremental, geometry-based technique with linear ATC, is compelling; however, its restriction to 2-D embeddings amounts to significant information loss. We propose the Big Data-ready, incremental PES (Pyramidal Embedding System), which builds on the virtues of PCS by additionally generating 3-D embeddings through its pyramid-like interspace, mitigating quality degradation. Visual inspections, as well as statistical analyses based on pairwise distances, validate the ability of PES to retain structural arrangements when embedding high- and low-dimensional data while remaining flexible in resource consumption. |
Database: | OpenAIRE |
External link: |