An Online Pyramidal Embedding Technique for High Dimensional Big Data Visualization

Authors: Claudomiro Sales, Adriano Barreto, Caio Flexa, Eduardo Cardoso, Igor Moreira
Year of publication: 2020
Subject:
Source: Intelligent Systems ISBN: 9783030613792
BRACIS (2)
Description: Visualizing multidimensional Big Data is challenging: high dimensionalities hinder or even preclude visual inspection. One means of tackling this issue is to use DR (Dimensionality Reduction) techniques, which produce low-dimensional representations of high-dimensional data. Popular DR algorithms (e.g., Principal Component Analysis, t-Distributed Stochastic Neighbor Embedding), albeit helpful, are computationally expensive. Most have \(\mathcal{O}(n^2)\) or \(\mathcal{O}(n^3)\) ATC (Asymptotic Time Complexity) and/or calculate pairwise distances over the entire data set, exceeding available memory and rendering Big Data DR time-consuming or impracticable. These issues impede the use of DR in online learning applications, where recurrent, cumulative model updates are routine. The stochastic factor of some approaches similarly obstructs any meaningful inspection of how knowledge is spatially arranged. The recently introduced PCS (Polygonal Coordinate System), an incremental, geometry-based technique with linear ATC, is compelling; however, its restriction to 2-D embeddings amounts to significant information loss. We propose the Big Data-ready, incremental PES (Pyramidal Embedding System), which builds on the virtues of PCS by additionally generating 3-D embeddings through its pyramid-like interspace, mitigating quality degradation. Visual inspections, as well as statistical analyses based on pairwise distances, validate the ability of PES to retain structural arrangements when embedding high- and low-dimensional data while remaining flexible in resource consumption.
Database: OpenAIRE