Permutative redundancy and uncertainty of the objective in deep learning

Author: Glukhov, Vacslav
Publication year: 2024
Subject:
Document type: Working Paper
Description: Implications of uncertain objective functions and the permutative symmetry of traditional deep learning architectures are discussed. It is shown that traditional architectures are polluted by an astronomical number of equivalent global and local optima. Uncertainty of the objective makes local optima unattainable, and, as the size of the network grows, the global optimization landscape likely becomes a tangled web of valleys and ridges. Some remedies that reduce or eliminate ghost optima are discussed, including forced pre-pruning, re-ordering, ortho-polynomial activations, and modular bio-inspired architectures.
Comment: 22 pages, 3 figures
Database: arXiv
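
The record itself contains no code; as a minimal sketch of the permutative redundancy the description refers to, the NumPy snippet below shows that permuting the hidden units of a small two-layer network (reordering the corresponding rows of the first weight matrix and columns of the second) leaves the network function, and hence any loss value, unchanged. All names and shapes are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny two-layer MLP: x -> tanh(W1 x + b1) -> W2 h + b2
W1, b1 = rng.normal(size=(5, 3)), rng.normal(size=5)
W2, b2 = rng.normal(size=(2, 5)), rng.normal(size=2)

def forward(x, W1, b1, W2, b2):
    return W2 @ np.tanh(W1 @ x + b1) + b2

# Permute the hidden units: reorder rows of W1 and b1, and the
# matching columns of W2.  The network function is unchanged, so this
# parameter vector has 5! = 120 functionally identical copies from
# this layer alone -- the "ghost optima" the abstract describes.
perm = rng.permutation(5)
W1p, b1p, W2p = W1[perm], b1[perm], W2[:, perm]

x = rng.normal(size=3)
assert np.allclose(forward(x, W1, b1, W2, b2),
                   forward(x, W1p, b1p, W2p, b2))
```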