Hybrid Multidimensional Deep Convolutional Neural Network for Multimodal Fusion

Authors:	Olena Vynokurova, Dmytro Peleshko
Year of publication:	2020
Subject:	Kernel (linear algebra); Modality (human–computer interaction); Computer science; Hybrid system; Face (geometry); Feature (machine learning); Topology (electrical circuits); Pattern recognition; Artificial intelligence; Network topology; Convolutional neural network
Source:	2020 IEEE Third International Conference on Data Stream Mining & Processing (DSMP).
DOI:	10.1109/dsmp47368.2020.9204215
Description:	A Hybrid Multidimensional Deep Convolutional Neural Network (HMDCNN) topology is proposed for multimodal recognition of speech, face, lip, and human gesture behavior. Here, hybridization means the combined use of 2D and 3D convolutional neural networks in one multimodal architecture. The research conducted relates to improving the understanding of complex dynamic scenes. The basic unit of the proposed hybrid system is a deep neural network topology that combines 2D and 3D convolutional neural networks (CNNs) for each modality with a proposed intermediate-level feature fusion subsystem. This feature-map fusion method is based on a scaling procedure that uses a specific combination of pooling operations with non-square kernels, which allows different types of modalities to be merged.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::3923b947eafc62fb04c1e09b98e8792a https://doi.org/10.1109/dsmp47368.2020.9204215
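
The fusion idea in the abstract (scaling feature maps with non-square pooling kernels so that different modalities can be merged) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no layer sizes, so the shapes, the choice of max pooling, the channel-wise concatenation, and the names pool2d_max and fuse_feature_maps are all assumptions made for illustration.

```python
import numpy as np


def pool2d_max(x, kernel):
    """Max-pool a (C, H, W) feature map with a possibly non-square kernel.

    The stride equals the kernel size, so H and W must be divisible
    by the kernel height and width respectively.
    """
    c, h, w = x.shape
    kh, kw = kernel
    if h % kh or w % kw:
        raise ValueError("feature map size must be divisible by the kernel")
    # Split each spatial axis into (blocks, within-block) and reduce the blocks.
    blocks = x.reshape(c, h // kh, kh, w // kw, kw)
    return blocks.max(axis=(2, 4))


def fuse_feature_maps(maps, target_hw):
    """Scale each map to a common spatial size, then stack along channels.

    Hypothetical helper: the kernel for each modality is derived from its
    own height and width, so it is non-square whenever the aspect ratios
    of the input map and the target differ.
    """
    th, tw = target_hw
    scaled = []
    for x in maps:
        _, h, w = x.shape
        if h % th or w % tw:
            raise ValueError("map size must be a multiple of the target size")
        scaled.append(pool2d_max(x, (h // th, w // tw)))
    return np.concatenate(scaled, axis=0)


# Assumed shapes: a wide audio (spectrogram-like) map and a taller video map.
rng = np.random.default_rng(0)
audio_map = rng.standard_normal((4, 8, 32))   # pooled with a 1x4 kernel
video_map = rng.standard_normal((6, 16, 24))  # pooled with a 2x3 kernel

fused = fuse_feature_maps([audio_map, video_map], target_hw=(8, 8))
print(fused.shape)  # (10, 8, 8)
```

Deriving the kernel from each map's own size is one simple way to bring maps with different aspect ratios onto a shared grid before merging; a trained network would typically place further convolutional layers after the fused map.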