Showing 1 - 10 of 897 for search: '"Espy, P."'
Author:
Premananth, Gowtham, Siriwardena, Yashish M., Resnik, Philip, Bansal, Sonia, Kelly, Deanna L., Espy-Wilson, Carol
This paper presents a novel multimodal framework to distinguish between different symptom classes of subjects in the schizophrenia spectrum and healthy controls using audio, video, and text modalities. We implemented Convolution Neural Network and Lo
External link:
http://arxiv.org/abs/2406.09706
Author:
Siriwardena, Yashish M., Swedlow, Nathan, Howard, Audrey, Gitterman, Evan, Darcy, Dan, Espy-Wilson, Carol, Fanelli, Andrea
Conversion of non-native accented speech to native (American) English has a wide range of applications such as improving intelligibility of non-native speech. Previous work on this domain has used phonetic posteriograms as the target speech represent
External link:
http://arxiv.org/abs/2406.05947
Creating Automatic Speech Recognition (ASR) systems that are robust and resilient to classroom conditions is paramount to the development of AI tools to aid teachers and students. In this work, we study the efficacy of continued pretraining (CPT) in
External link:
http://arxiv.org/abs/2405.13018
We present an empirical model for auroral (90–150 km) electron–ion pair production rates, ionization rates for short, derived from SSUSI (Special Sensor Ultraviolet Spectrographic Imager) electron energy and flux data. Using the Fang et al., 2010 p
External link:
http://arxiv.org/abs/2312.11130
This study focuses on how different modalities of human communication can be used to distinguish between healthy controls and subjects with schizophrenia who exhibit strong positive symptoms. We developed a multi-modal schizophrenia classification sy
External link:
http://arxiv.org/abs/2309.15136
The performance of deep learning models depends significantly on their capacity to encode input features efficiently and decode them into meaningful outputs. Better input and output representation has the potential to boost models' performance and ge
External link:
http://arxiv.org/abs/2309.09220
Recent advancements in Automatic Speech Recognition (ASR) systems, exemplified by Whisper, have demonstrated the potential of these systems to approach human-level performance given sufficient data. However, this progress doesn't readily extend to AS
External link:
http://arxiv.org/abs/2309.07927
The velopharyngeal (VP) valve regulates the opening between the nasal and oral cavities. This valve opens and closes through a coordinated motion of the velum and pharyngeal walls. Nasalance is an objective measure derived from the oral and nasal aco
External link:
http://arxiv.org/abs/2306.00203
Author:
Benway, Nina R, Siriwardena, Yashish M, Preston, Jonathan L, Hitchcock, Elaine, McAllister, Tara, Espy-Wilson, Carol
Published in:
Proc. INTERSPEECH 2023, 4568-4572
Acoustic-to-articulatory speech inversion could enhance automated clinical mispronunciation detection to provide detailed articulatory feedback unattainable by formant-based mispronunciation detection algorithms; however, it is unclear the extent to
External link:
http://arxiv.org/abs/2305.16085
Accurate analysis of speech articulation is crucial for speech analysis. However, X-Y coordinates of articulators strongly depend on the anatomy of the speakers and the variability of pellet placements, and existing methods for mapping anatomical lan
External link:
http://arxiv.org/abs/2305.10775