Showing 1 - 10
of 1,257
for search: '"MANDEL, MICHAEL"'
Author:
Sivakumar, Viswanath; Seely, Jeffrey; Du, Alan; Bittner, Sean R; Berenzweig, Adam; Bolarinwa, Anuoluwapo; Gramfort, Alexandre; Mandel, Michael I
Surface electromyography (sEMG) non-invasively measures signals generated by muscle activity with sufficient sensitivity to detect individual spinal neurons and richness to identify dozens of gestures and their nuances. Wearable wrist-based sEMG sens
External link:
http://arxiv.org/abs/2410.20081
Large Language Models (LLMs) have demonstrated remarkable reasoning capabilities, notably in connecting ideas and adhering to logical rules to solve problems. These models have evolved to accommodate various data modalities, including sound and image
External link:
http://arxiv.org/abs/2406.04615
Across various research domains, remotely-sensed weather products are valuable for answering many scientific questions; however, their temporal and spatial resolutions are often too coarse to answer many questions. For instance, in wildlife research,
External link:
http://arxiv.org/abs/2309.16867
We introduce ImportantAug, a technique to augment training data for speech classification and recognition models by adding noise to unimportant regions of the speech and not to important regions. Importance is predicted for each utterance by a data a
External link:
http://arxiv.org/abs/2112.07156
Recent works have shown that Deep Recurrent Neural Networks using the LSTM architecture can achieve strong single-channel speech enhancement by estimating time-frequency masks. However, these models do not naturally generalize to multi-channel inputs
External link:
http://arxiv.org/abs/2012.01576
Recurrent neural networks using the LSTM architecture can achieve significant single-channel noise reduction. It is not obvious, however, how to apply them to multi-channel inputs in a way that can generalize to new microphone configurations. In cont
External link:
http://arxiv.org/abs/2012.03388
Spatial clustering techniques can achieve significant multi-channel noise reduction across relatively arbitrary microphone configurations, but have difficulty incorporating a detailed speech/noise model. In contrast, LSTM neural networks have success
External link:
http://arxiv.org/abs/2012.02191
This paper aims at eliminating the interfering speakers' speech, additive noise, and reverberation from the noisy multi-talker speech mixture that benefits automatic speech recognition (ASR) backend. While the recently proposed Weighted Power minimiz
External link:
http://arxiv.org/abs/2011.09162
Author:
Trinh, Viet Anh; Mandel, Michael I
Published in:
Proceedings of Interspeech 2020
In this paper, we propose a metric that we call the structured saliency benchmark (SSBM) to evaluate importance maps computed for automatic speech recognizers on individual utterances. These maps indicate time-frequency points of the utterance that a
External link:
http://arxiv.org/abs/2005.10929