Showing 1 - 10 of 934 results for search: '"visual speech recognition"'
Published in:
Emerging Science Journal, Vol 8, Iss 6, Pp 2554-2569 (2024)
Visual Speech Recognition (VSR), commonly referred to as automated lip-reading, is an emerging technology that interprets speech by visually analyzing lip movements. A challenge in VSR where visually distinct words produce similar lip movements is kn
External link:
https://doaj.org/article/184428b4cbec4b46b552ec048f0f78e6
Published in:
Complex & Intelligent Systems, Vol 10, Iss 4, Pp 5721-5741 (2024)
Conformer-based models have proven highly effective in Audio-visual Speech Recognition, integrating auditory and visual inputs to significantly enhance speech recognition accuracy. However, the widely utilized softmax attention mechanism wit
External link:
https://doaj.org/article/161fb7353b394aadbfa77731cc1122d9
Published in:
EURASIP Journal on Audio, Speech, and Music Processing, Vol 2024, Iss 1, Pp 1-15 (2024)
Visual speech recognition (VSR) is a challenging task that has received increasing interest during the last few decades. Current state of the art employs powerful end-to-end architectures based on deep learning which depend on large amounts
External link:
https://doaj.org/article/63367b081cf944e18bc66619b622bdba
Published in:
IEEE Open Journal of Signal Processing, Vol 5, Pp 1147-1152 (2024)
Visual Speech Recognition (VSR) tasks are generally recognized to have a lower theoretical performance ceiling than Automatic Speech Recognition (ASR), owing to the inherent limitations of conveying semantic information visually. To mitigate this cha
External link:
https://doaj.org/article/7255be9c6d304f6b9e198ae143ad7c87
Published in:
EURASIP Journal on Audio, Speech, and Music Processing, Vol 2023, Iss 1, Pp 1-19 (2023)
Speakers with dysarthria often struggle to accurately pronounce words and effectively communicate with others. Automatic speech recognition (ASR) is a powerful tool for extracting the content from speakers with dysarthria. However, the narro
External link:
https://doaj.org/article/468574ec11904261bfd628a1c09e2885
Published in:
Naučno-tehničeskij Vestnik Informacionnyh Tehnologij, Mehaniki i Optiki, Vol 23, Iss 4, Pp 767-775 (2023)
Visual speech recognition or automated lip-reading systems actively apply to speech-to-text translation. Video data proves to be useful in multimodal speech recognition systems, particularly when using acoustic data is difficult or not available at
External link:
https://doaj.org/article/66375603e608485f93cbe94a01929e68
Published in:
Компьютерная оптика, Vol 47, Iss 2, Pp 287-305 (2023)
Communication refers to a wide range of different behaviors and activities aimed at handing over information. The communication process includes verbal, paraverbal and non-verbal components, conveying the informational part of a message and its emoti
External link:
https://doaj.org/article/c169b0a537a146c28320fcedbdb747c8
Published in:
Компьютерная оптика, Vol 46, Iss 6, Pp 955-962 (2022)
The paper proposes a method of visual analysis for automatic speech recognition of the vehicle driver. Speech recognition in acoustically noisy conditions is one of big challenges of artificial intelligence. The problem of effective automatic lip-rea
External link:
https://doaj.org/article/f072be7316aa453f942983362a000386
Published in:
Egyptian Informatics Journal, Vol 23, Iss 4, Pp 1-12 (2022)
Lipreading is the ability to recognize words or sentences from the mouth movements of a speaking person. This process is also known as Visual Speech Recognition (VSR). Lipreading has two main advantages: facilitate communication for people with heari
External link:
https://doaj.org/article/1ea3e88a621e4e3dbc538e88a82cdb94