Generative adversarial networks with stochastic gradient descent with momentum algorithm for video-based facial expression

Author: Cherian, Aswathy K., Vaidhehi, M., Arshey, M., Briskilal, J., Simpson, Serin V.
Source: International Journal of Information Technology; August 2024, Vol. 16, Issue 6, pp. 3703-3722, 20 pp.
Abstract: Video data is an asset used in many settings, from a live broadcast on a personal blog to a security camera at a manufacturing facility, and machine learning is increasingly the tool of choice for processing it. Recent years have seen significant advances in machine learning for computer vision, with human-level performance approached or even surpassed in tasks such as object detection, object classification, and image segmentation. Challenging problems remain, however, including the recognition of human emotions. This study aims to recognize human emotions by analyzing still images and frames extracted from videos using several machine learning procedures. Neural networks built on Generative Adversarial Networks (GAN) classify each face image obtained from a frame into one of seven facial-emotion categories. To capture emotional content, videos are mined for informative modalities: audio, single frames, and sequences of frames. At this stage, separate instances of the OpenSMILE and Inception-ResNet-v2 models extract feature vectors from the audio and the frames, respectively. Several classification models are then trained using stochastic gradient descent with the momentum algorithm (SGDMA). The per-frame results are tabulated, and the facial expression that appears most often across the video is selected. Audio feature vectors are classified with GAN-SGDMA, while Inception-ResNet-v2 is used to recognize emotions conveyed by still photographs. Experimental findings suggest that the proposed distributed GAN-SGDMA model significantly increases the speed at which facial expressions are detected and classified from video. We demonstrate the effectiveness of the GAN-SGDMA approach by applying it to GAN-structured facial expression recognition datasets, obtaining remarkable results.
Database: Supplemental Index
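
The abstract describes two reusable computational steps: the stochastic gradient descent with momentum (SGDMA) update used to train the classifiers, and the majority vote that turns per-frame emotion labels into a single video-level expression. The NumPy sketch below is a minimal illustration of both steps; the function names, the feature dimension, and the random inputs are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def sgdm_step(params, grads, velocity, lr=0.01, momentum=0.9):
    """One stochastic gradient descent with momentum (SGDM) update:
    v <- momentum * v - lr * grad ;  w <- w + v."""
    for name in params:
        velocity[name] = momentum * velocity[name] - lr * grads[name]
        params[name] += velocity[name]
    return params, velocity

def video_label_by_majority_vote(frame_labels):
    """Pick the emotion label (0..6) that occurs most often across a video's frames."""
    labels, counts = np.unique(frame_labels, return_counts=True)
    return int(labels[np.argmax(counts)])

# Illustrative usage with random data only (not the paper's datasets).
rng = np.random.default_rng(0)
feat_dim, n_classes = 1536, 7  # 1536 assumed as a typical Inception-ResNet-v2 feature size
params = {"W": rng.normal(size=(feat_dim, n_classes)), "b": np.zeros(n_classes)}
velocity = {k: np.zeros_like(v) for k, v in params.items()}
grads = {"W": rng.normal(size=(feat_dim, n_classes)), "b": rng.normal(size=n_classes)}
params, velocity = sgdm_step(params, grads, velocity)

frame_preds = rng.integers(0, 7, size=120)  # one predicted emotion label per video frame
print(video_label_by_majority_vote(frame_preds))
```

In practice the feature vectors would come from OpenSMILE (audio) and Inception-ResNet-v2 (frames) as the abstract states; here random arrays stand in for them so the sketch stays self-contained.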