Enhanced shared experiences in heterogeneous network with generative AI

Authors: Neeraj Kumar, Ankur Narang, Brejesh Lall, Nitish Kumar Singh
Publication year: 2021
Subject:
Source: ITU Journal on Future and Evolving Technologies. 2:27-46
ISSN: 2616-8375
DOI: 10.52953/kwgh6836
Description: COVID-19 has made immersive experiences such as video conferencing and virtual/augmented reality the most important modes of exchanging information. Despite many advances in network bandwidth and codec techniques, current systems still suffer from glitches, lag and poor video quality, especially under unreliable network conditions. In this paper, we propose a video streaming pipeline that provides better video quality under erratic network conditions. We propose an environment in which participants can interact through video conferencing while sending only audio over the network. We propose a Multimodal Adaptive Normalization (MAN)-based architecture that synthesizes a talking-person video of arbitrary length from two inputs: an audio signal and a single image of the person. The architecture uses multimodal adaptive normalization, a keypoint heatmap predictor, an optical flow predictor and class activation map-based layers to learn the movements of expressive facial components, and hence generates a highly expressive talking-head video of the given person. We demonstrate the effectiveness of the proposed streaming pipeline, which dynamically controls the Quality of Experience (QoE) as per the requirements.
Database: OpenAIRE