Visualizing Information Bottleneck through Variational Inference

Authors: Herwana, Cipta; Kadian, Abhishek
Publication Year: 2022
Subject:
Document Type: Working Paper
Description: The Information Bottleneck theory provides a theoretical and computational framework for finding approximate minimal sufficient statistics. Analysis of Stochastic Gradient Descent (SGD) training of a neural network on a toy problem has shown the existence of two phases: fitting and compression. In this work, we analyze the SGD training process of a Deep Neural Network on MNIST classification and confirm the existence of these two phases. We also propose a setup for estimating the mutual information of a Deep Neural Network through Variational Inference.
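Variational estimates of mutual information of the kind the description alludes to typically rest on a tractable lower bound; a standard example (our illustration, not necessarily the exact bound used in the paper) is the Barber–Agakov bound, which replaces the intractable posterior $p(x \mid z)$ with a variational decoder $q(x \mid z)$:

```latex
I(X;Z) = H(X) - H(X \mid Z)
       \geq H(X) + \mathbb{E}_{p(x,z)}\!\left[\log q(x \mid z)\right],
```

where the inequality follows because the cross entropy $-\mathbb{E}[\log q(x \mid z)]$ upper-bounds the conditional entropy $H(X \mid Z)$ for any choice of $q$; training $q$ then tightens the bound.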
Comment: arXiv admin note: text overlap with arXiv:1703.00810, arXiv:2202.06749 by other authors
Database: arXiv