Showing 1 - 10 of 10,153 results for search: '"Masuyama A"'
Selective state space models (SSMs) represented by Mamba have demonstrated their computational efficiency and promising outcomes in various tasks, including automatic speech recognition (ASR). Mamba has been applied to the ASR task with the attention-bas
External link:
http://arxiv.org/abs/2411.06968
Author:
Shi, Jiatong, Tian, Jinchuan, Wu, Yihan, Jung, Jee-weon, Yip, Jia Qi, Masuyama, Yoshiki, Chen, William, Wu, Yuning, Tang, Yuxun, Baali, Massa, Alharhi, Dareen, Zhang, Dong, Deng, Ruifan, Srivastava, Tejes, Wu, Haibin, Liu, Alexander H., Raj, Bhiksha, Jin, Qin, Song, Ruihua, Watanabe, Shinji
Neural codecs have become crucial to recent speech and audio generation research. In addition to signal compression capabilities, discrete codecs have also been found to enhance downstream training efficiency and compatibility with autoregressive lan
External link:
http://arxiv.org/abs/2409.15897
In machine learning algorithm design, there exists a trade-off between the interpretability and performance of the algorithm. In general, algorithms which are simpler and easier for humans to comprehend tend to show worse performance than more comple
External link:
http://arxiv.org/abs/2407.08973
This paper explores the capability of Mamba, a recently proposed architecture based on state space models (SSMs), as a competitive alternative to Transformer-based models. In the speech domain, well-designed Transformer-based models, such as the Conf
External link:
http://arxiv.org/abs/2406.16808
Author:
Masuyama, Yoshiki, Wichern, Gordon, Germain, François G., Pan, Zexu, Khurana, Sameer, Hori, Chiori, Roux, Jonathan Le
Head-related transfer functions (HRTFs) are important for immersive audio, and their spatial interpolation has been studied to upsample finite measurements. Recently, neural fields (NFs) which map from sound source direction to HRTF have gained atten
External link:
http://arxiv.org/abs/2402.17907
Author:
Keiichiro Nakamura, Kunitoshi Shigeyasu, Thuy Ha Vu, Jota Maki, Kazuhiro Okamoto, Hisashi Masuyama
Published in:
Scientific Reports, Vol 14, Iss 1, Pp 1-13 (2024)
Intrauterine infection (IUI) is mainly an ascending infection in which vaginal and cervical pathogens ascend to the uterus and can affect the fetus. Until now, there is still no effective diagnostic biomarker for IUI, such as chorioamnioniti
External link:
https://doaj.org/article/849fec4eaedc46df8748fb1d93aab993
Author:
Pan, Zexu, Wichern, Gordon, Masuyama, Yoshiki, Germain, Francois G., Khurana, Sameer, Hori, Chiori, Roux, Jonathan Le
Target speech extraction aims to extract, based on a given conditioning cue, a target speech signal that is corrupted by interfering sources, such as noise or competing speakers. Building upon the achievements of the state-of-the-art (SOTA) time-freq
External link:
http://arxiv.org/abs/2310.19644
Author:
Masuyama, Naoki, Nojima, Yusuke, Toda, Yuichiro, Loo, Chu Kiong, Ishibuchi, Hisao, Kubota, Naoyuki
Published in:
IEEE Access, vol. 12, pp. 139692-139710, September 2024
With the increasing importance of data privacy protection, various privacy-preserving machine learning methods have been proposed. In the clustering domain, various algorithms with a federated learning framework (i.e., federated clustering) have been
External link:
http://arxiv.org/abs/2309.03487
We propose an optimization-based method for reconstructing a time-domain signal from a low-dimensional spectral representation such as a mel-spectrogram. Phase reconstruction has been studied to reconstruct a time-domain signal from the full-band sho
External link:
http://arxiv.org/abs/2307.12232
Author:
Masuyama, Yoshiki, Chang, Xuankai, Zhang, Wangyou, Cornell, Samuele, Wang, Zhong-Qiu, Ono, Nobutaka, Qian, Yanmin, Watanabe, Shinji
Neural speech separation has made remarkable progress and its integration with automatic speech recognition (ASR) is an important direction towards realizing multi-speaker ASR. This work provides an insightful investigation of speech separation in re
External link:
http://arxiv.org/abs/2307.12231