Search results - "Visual speech"

Academic article

SlowFast-TCN: A Deep Learning Approach for Visual Speech Recognition

Authors: Nicole Yah Yie Ha, Lee-Yeng Ong, Meng-Chew Leow

Published in: Emerging Science Journal, Vol 8, Iss 6, Pp 2554-2569 (2024)

Visual Speech Recognition (VSR), commonly referred to as automated lip-reading, is an emerging technology that interprets speech by visually analyzing lip movements. A challenge in VSR where visually distinct words produce similar lip movements is kn

External link: https://doaj.org/article/184428b4cbec4b46b552ec048f0f78e6

Academic article

Sla-former: conformer using shifted linear attention for audio-visual speech recognition

Authors: Yewei Xiao, Jian Huang, Xuanming Liu, Aosu Zhu

Published in: Complex & Intelligent Systems, Vol 10, Iss 4, Pp 5721-5741 (2024)

Conformer-based models have proven highly effective in Audio-visual Speech Recognition, integrating auditory and visual inputs to significantly enhance speech recognition accuracy. However, the widely utilized softmax attention mechanism wit

External link: https://doaj.org/article/161fb7353b394aadbfa77731cc1122d9

Academic article

Continuous lipreading based on acoustic temporal alignments

Authors: David Gimeno-Gómez, Carlos-D. Martínez-Hinarejos

Published in: EURASIP Journal on Audio, Speech, and Music Processing, Vol 2024, Iss 1, Pp 1-15 (2024)

Visual speech recognition (VSR) is a challenging task that has received increasing interest during the last few decades. Current state of the art employs powerful end-to-end architectures based on deep learning which depend on large amounts

External link: https://doaj.org/article/63367b081cf944e18bc66619b622bdba

Academic article

JEP-KD: Joint-Embedding Predictive Architecture Based Knowledge Distillation for Visual Speech Recognition

Authors: Chang Sun, Bo Qin, Hong Yang

Published in: IEEE Open Journal of Signal Processing, Vol 5, Pp 1147-1152 (2024)

Visual Speech Recognition (VSR) tasks are generally recognized to have a lower theoretical performance ceiling than Automatic Speech Recognition (ASR), owing to the inherent limitations of conveying semantic information visually. To mitigate this cha

External link: https://doaj.org/article/7255be9c6d304f6b9e198ae143ad7c87

Academic article

A survey of technologies for automatic Dysarthric speech recognition

Authors: Zhaopeng Qian, Kejing Xiao, Chongchong Yu

Published in: EURASIP Journal on Audio, Speech, and Music Processing, Vol 2023, Iss 1, Pp 1-19 (2023)

Speakers with dysarthria often struggle to accurately pronounce words and effectively communicate with others. Automatic speech recognition (ASR) is a powerful tool for extracting the content from speakers with dysarthria. However, the narro

External link: https://doaj.org/article/468574ec11904261bfd628a1c09e2885

Academic article

Neural network-based method for visual recognition of driver’s voice commands using attention mechanism

Authors: Alexandr A. Axyonov, Elena V. Ryumina, Dmitry A. Ryumin, Denis V. Ivanko, Alexey A. Karpov

Published in: Naučno-tehničeskij Vestnik Informacionnyh Tehnologij, Mehaniki i Optiki, Vol 23, Iss 4, Pp 767-775 (2023)

Visual speech recognition or automated lip-reading systems actively apply to speech-to-text translation. Video data proves to be useful in multimodal speech recognition systems, particularly when using acoustic data is difficult or not available at

External link: https://doaj.org/article/66375603e608485f93cbe94a01929e68

Academic article

Editorial: Multisensory speech in perception and production

Authors: Kauyumari Sanchez, Karl David Neergaard, James W. Dias

Published in: Frontiers in Human Neuroscience, Vol 18 (2024)

External link: https://doaj.org/article/a6c74f88ced7450e9d51e327c2463e86

Academic article

Modern automatic recognition technologies for visual communication tools

Authors: V.O. Yachnaya, V.R. Lutsiv, R.O. Malashin

Published in: Компьютерная оптика, Vol 47, Iss 2, Pp 287-305 (2023)

Communication refers to a wide range of different behaviors and activities aimed at handing over information. The communication process includes verbal, paraverbal and non-verbal components, conveying the informational part of a message and its emoti

External link: https://doaj.org/article/c169b0a537a146c28320fcedbdb747c8

Academic article

A representation of abstract linguistic categories in the visual system underlies successful lipreading

Authors: Aaron R Nidiffer, Cody Zhewei Cao, Aisling O'Sullivan, Edmund C Lalor

Published in: NeuroImage, Vol 282, Pp 120391 (2023)

There is considerable debate over how visual speech is processed in the absence of sound and whether neural activity supporting lipreading occurs in visual brain areas. Much of the ambiguity stems from a lack of behavioral grounding and neurophysiolo

External link: https://doaj.org/article/88a422e2bc9f4b6788b1de7815107748

Academic article

Visual Speech Recognition for Kannada Language Using VGG16 Convolutional Neural Network

Authors: Shashidhar Rudregowda, Sudarshan Patil Kulkarni, Gururaj H L, Vinayakumar Ravi, Moez Krichen

Published in: Acoustics, Vol 5, Iss 1, Pp 343-353 (2023)

Visual speech recognition (VSR) is a method of reading speech by noticing the lip actions of the narrators. Visual speech significantly depends on the visual features derived from the image sequences. Visual speech recognition is a stimulating proces

External link: https://doaj.org/article/1ab88d9b29094efa8ffd92b261fa69e0
