Search results - "visual speech recognition"

Academic article

SlowFast-TCN: A Deep Learning Approach for Visual Speech Recognition

Authors: Nicole Yah Yie Ha, Lee-Yeng Ong, Meng-Chew Leow

Published in: Emerging Science Journal, Vol 8, Iss 6, Pp 2554-2569 (2024)

Visual Speech Recognition (VSR), commonly referred to as automated lip-reading, is an emerging technology that interprets speech by visually analyzing lip movements. A challenge in VSR where visually distinct words produce similar lip movements is kn

External link: https://doaj.org/article/184428b4cbec4b46b552ec048f0f78e6

Academic article

Sla-former: conformer using shifted linear attention for audio-visual speech recognition

Authors: Yewei Xiao, Jian Huang, Xuanming Liu, Aosu Zhu

Published in: Complex & Intelligent Systems, Vol 10, Iss 4, Pp 5721-5741 (2024)

Conformer-based models have proven highly effective in Audio-visual Speech Recognition, integrating auditory and visual inputs to significantly enhance speech recognition accuracy. However, the widely utilized softmax attention mechanism wit

External link: https://doaj.org/article/161fb7353b394aadbfa77731cc1122d9

Academic article

Continuous lipreading based on acoustic temporal alignments

Authors: David Gimeno-Gómez, Carlos-D. Martínez-Hinarejos

Published in: EURASIP Journal on Audio, Speech, and Music Processing, Vol 2024, Iss 1, Pp 1-15 (2024)

Visual speech recognition (VSR) is a challenging task that has received increasing interest during the last few decades. Current state of the art employs powerful end-to-end architectures based on deep learning which depend on large amounts

External link: https://doaj.org/article/63367b081cf944e18bc66619b622bdba

Academic article

JEP-KD: Joint-Embedding Predictive Architecture Based Knowledge Distillation for Visual Speech Recognition

Authors: Chang Sun, Bo Qin, Hong Yang

Published in: IEEE Open Journal of Signal Processing, Vol 5, Pp 1147-1152 (2024)

Visual Speech Recognition (VSR) tasks are generally recognized to have a lower theoretical performance ceiling than Automatic Speech Recognition (ASR), owing to the inherent limitations of conveying semantic information visually. To mitigate this cha

External link: https://doaj.org/article/7255be9c6d304f6b9e198ae143ad7c87

Academic article

A survey of technologies for automatic Dysarthric speech recognition

Authors: Zhaopeng Qian, Kejing Xiao, Chongchong Yu

Published in: EURASIP Journal on Audio, Speech, and Music Processing, Vol 2023, Iss 1, Pp 1-19 (2023)

Speakers with dysarthria often struggle to accurately pronounce words and effectively communicate with others. Automatic speech recognition (ASR) is a powerful tool for extracting the content from speakers with dysarthria. However, the narro

External link: https://doaj.org/article/468574ec11904261bfd628a1c09e2885

Academic article

Neural network-based method for visual recognition of driver’s voice commands using attention mechanism

Authors: Alexandr A. Axyonov, Elena V. Ryumina, Dmitry A. Ryumin, Denis V. Ivanko, Alexey A. Karpov

Published in: Naučno-tehničeskij Vestnik Informacionnyh Tehnologij, Mehaniki i Optiki, Vol 23, Iss 4, Pp 767-775 (2023)

Visual speech recognition or automated lip-reading systems actively apply to speech-to-text translation. Video data proves to be useful in multimodal speech recognition systems, particularly when using acoustic data is difficult or not available at

External link: https://doaj.org/article/66375603e608485f93cbe94a01929e68

Academic article

Modern automatic recognition technologies for visual communication tools

Authors: V.O. Yachnaya, V.R. Lutsiv, R.O. Malashin

Published in: Компьютерная оптика, Vol 47, Iss 2, Pp 287-305 (2023)

Communication refers to a wide range of different behaviors and activities aimed at handing over information. The communication process includes verbal, paraverbal and non-verbal components, conveying the informational part of a message and its emoti

External link: https://doaj.org/article/c169b0a537a146c28320fcedbdb747c8

Academic article

Method for visual analysis of driver's face for automatic lip-reading in the wild

Authors: A.A. Axyonov, D.A. Ryumin, A.M. Kashevnik, D.V. Ivanko, A.A. Karpov

Published in: Компьютерная оптика, Vol 46, Iss 6, Pp 955-962 (2022)

The paper proposes a method of visual analysis for automatic speech recognition of the vehicle driver. Speech recognition in acoustically noisy conditions is one of big challenges of artificial intelligence. The problem of effective automatic lip-rea

External link: https://doaj.org/article/f072be7316aa453f942983362a000386

Academic article

Read my lips: Artificial intelligence word-level arabic lipreading system

Authors: Waleed Dweik, Sundus Altorman, Safa Ashour

Published in: Egyptian Informatics Journal, Vol 23, Iss 4, Pp 1-12 (2022)

Lipreading is the ability to recognize words or sentences from the mouth movements of a speaking person. This process is also known as Visual Speech Recognition (VSR). Lipreading has two main advantages: facilitate communication for people with heari

External link: https://doaj.org/article/1ea3e88a621e4e3dbc538e88a82cdb94
