Asynchronous Message-Passing and Zeroth-Order Optimization Based Distributed Learning with a Use-Case in Resource Allocation in Communication Networks

Autor:	Behmandpoor, Pourya, Moonen, Marc, Patrinos, Panagiotis
Rok vydání:	2023
Předmět:	Electrical Engineering and Systems Science - Signal Processing Computer Science - Machine Learning Computer Science - Multiagent Systems Mathematics - Optimization and Control
Druh dokumentu:	Working Paper
DOI:	10.1109/TSIPN.2024.3487421
Popis:	Distributed learning and adaptation have received significant interest and found wide-ranging applications in machine learning and signal processing. While various approaches, such as shared-memory optimization, multi-task learning, and consensus-based learning (e.g., federated learning and learning over graphs), focus on optimizing either local costs or a global cost, there remains a need for further exploration of their interconnections. This paper specifically focuses on a scenario where agents collaborate towards a common task (i.e., optimizing a global cost equal to aggregated local costs) while effectively having distinct individual tasks (i.e., optimizing individual local parameters in a local cost). Each agent's actions can potentially impact other agents' performance through interactions. Notably, each agent has access to only its local zeroth-order oracle (i.e., cost function value) and shares scalar values, rather than gradient vectors, with other agents, leading to communication bandwidth efficiency and agent privacy. Agents employ zeroth-order optimization to update their parameters, and the asynchronous message-passing between them is subject to bounded but possibly random communication delays. This paper presents theoretical convergence analyses and establishes a convergence rate for nonconvex problems. Furthermore, it addresses the relevant use-case of deep learning-based resource allocation in communication networks and conducts numerical experiments in which agents, acting as transmitters, collaboratively train their individual policies to maximize a global reward, e.g., a sum of data rates.
Databáze:	arXiv
Externí odkaz:	http://arxiv.org/abs/2311.04604 Zobrazit plný text záznamu View this record from Arxiv