Towards image-based laryngeal videostroboscopy using deep learning-enabled compressed sensing.

Authors: Wölfl, Anna-Maria, Schützenberger, Anne, Breininger, Katharina, Kist, Andreas M.
Subject:
Source: Biomedical Signal Processing & Control; Sep2023:Part C, Vol. 86, pN.PAG-N.PAG, 1p
Abstract: Laryngeal videostroboscopy is an audio-mediated imaging technique that allows the visualization of vocal fold oscillation behavior: the audio signal is used to determine the fundamental frequency F0, which represents the vocal fold oscillation frequency. Knowing F0 allows the strobe illumination unit to be triggered to provide a still image or slow-motion view of the vocal fold oscillation. However, this procedure involves several hardware components, noisy audio signals, and a chain of complex, error-prone algorithms that have to be orchestrated. We hypothesize that endoscopic images suffice to determine F0, with a view towards providing an alternative, image-based approach for estimating F0 during laryngeal videoendoscopy. In this study, we show that we are able to predict the relative glottal opening state to create sample points on the glottal area waveform, an endoscopic image-derived signal from which F0 can be derived. As imaging frame rates from ordinary endoscopic cameras do not fulfill the Shannon–Nyquist criterion, we solve this problem with compressed sensing. We developed and evaluated the proposed approach using high-speed videoendoscopy (HSV) to simulate different, realistic low frame rates that are similar to those used in videostroboscopy. We show that we are able to predict F0 with over 95% accuracy using at most 75 sample points from 600 ms of footage. Using only endoscopic images and our algorithm, we showcase that we can achieve a stroboscopic effect. This shows that our proposed method, in combination with the developed algorithm, may be considered for future integration into clinical videostroboscopy.
• Endoscopic images can predict the relative glottis opening state.
• Compressed sensing allows the reconstruction of the glottal area waveform.
• Vocal fold oscillation frequency can be estimated below the Shannon–Nyquist criterion.
[ABSTRACT FROM AUTHOR]
Database: Supplemental Index
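
The abstract's central idea, estimating F0 from far fewer image-derived sample points than the Shannon–Nyquist criterion would require, can be illustrated with a small simulation. The sketch below is not the authors' algorithm: it assumes a synthetic, nearly sinusoidal opening signal, randomly chosen frame indices, and a single greedy matching-pursuit style step over a grid of candidate frequencies. The function name `estimate_f0`, the 4000 fps high-speed rate, the 210 Hz test frequency, the noise level, and the 80 to 400 Hz search grid are all hypothetical choices made for illustration. Only the 75 sample points and the 600 ms duration come from the abstract.

```python
import numpy as np


def estimate_f0(t, y, f_grid):
    """Pick the grid frequency whose cosine/sine pair best fits the samples.

    A single greedy (matching-pursuit style) step that assumes the signal
    is dominated by one spectral component. Illustrative only.
    """
    y = y - y.mean()  # remove the constant offset of the opening signal
    best_f, best_energy = f_grid[0], -np.inf
    for f in f_grid:
        # Two-column dictionary atom: cosine and sine at candidate frequency f.
        atom = np.column_stack([np.cos(2 * np.pi * f * t),
                                np.sin(2 * np.pi * f * t)])
        coef, *_ = np.linalg.lstsq(atom, y, rcond=None)
        energy = np.sum((atom @ coef) ** 2)  # energy explained by this atom
        if energy > best_energy:
            best_f, best_energy = f, energy
    return best_f


rng = np.random.default_rng(0)
fps_hsv = 4000      # assumed high-speed frame rate (hypothetical)
duration = 0.6      # 600 ms of footage, as stated in the abstract
true_f0 = 210.0     # assumed oscillation frequency in Hz (hypothetical)
n_frames = round(fps_hsv * duration)

# Keep only 75 randomly chosen frames: an average of 125 samples per second,
# well below the more than 420 Hz that uniform sampling of 210 Hz would need.
idx = np.sort(rng.choice(n_frames, size=75, replace=False))
t = idx / fps_hsv

# Synthetic relative opening state in [0, 1] with a little prediction noise.
opening = 0.5 + 0.5 * np.sin(2 * np.pi * true_f0 * t)
opening += rng.normal(0.0, 0.02, size=t.size)

f_grid = np.arange(80.0, 401.0, 1.0)  # assumed plausible F0 range, 1 Hz steps
f0_hat = estimate_f0(t, opening, f_grid)
print(f"estimated F0: {f0_hat:.0f} Hz (true: {true_f0:.0f} Hz)")
```

The irregular sample times are what make the recovery possible in this toy setting: uniformly spaced samples at the same average rate would alias 210 Hz onto a lower frequency, whereas random spacing leaves a single dominant peak that a sparsity-seeking search can find. A real glottal area waveform contains harmonics and frame-to-frame prediction errors, so a practical method would need a richer sparse model than this one-component fit.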