Showing 1 - 10 of 109 for search: '"Itoyama Katsutoshi"'
Unmanned aerial vehicles (UAVs) have revolutionized search and rescue (SAR) operations, but the lack of specialized human detection datasets for training machine learning models poses a significant challenge. To address this gap, this paper introduces
External link:
http://arxiv.org/abs/2408.04922
This study investigates mask-based beamformers (BFs), which estimate filters for target sound extraction (TSE) using time-frequency masks. Although multiple mask-based BFs have been proposed, no consensus has been established on the best one for targ
External link:
http://arxiv.org/abs/2407.15310
Author:
Wang, Jiang, He, Yuanzheng, Su, Daobilige, Itoyama, Katsutoshi, Nakadai, Kazuhiro, Wu, Junfeng, Huang, Shoudong, Li, Youfu, Kong, He
Robot audition systems with multiple microphone arrays have many applications in practice. However, accurate calibration of multiple microphone arrays remains challenging because there are many unknown parameters to be identified, including the relat
External link:
http://arxiv.org/abs/2405.19813
The demand for accurate object detection in aerial imagery has surged with the widespread use of drones and satellite technology. Traditional object detection models, trained on datasets biased towards large objects, struggle to perform optimally in
External link:
http://arxiv.org/abs/2401.14661
This study investigates mask-based beamformers (BFs), which estimate filters to extract target speech using time-frequency masks. Although several BF methods have been proposed, the following aspects are yet to be comprehensively investigated. 1) Whi
External link:
http://arxiv.org/abs/2309.12065
We describe a novel metric-based learning approach that introduces a multimodal framework and uses deep audio and geophone encoders in siamese configuration to design an adaptable and lightweight supervised model. This framework eliminates the need f
External link:
http://arxiv.org/abs/2111.07979
Published in:
EURASIP Journal on Advances in Signal Processing, Vol 2010, Iss 1, p 172961 (2010)
We describe a novel query-by-example (QBE) approach in music information retrieval that allows a user to customize query examples by directly modifying the volume of different instrument parts. The underlying hypothesis of this approach is that the m
External link:
https://doaj.org/article/bae2f696cb0944c190d69baaefc095f9
Author:
Shimada, Kazuki, Bando, Yoshiaki, Mimura, Masato, Itoyama, Katsutoshi, Yoshii, Kazuyoshi, Kawahara, Tatsuya
This paper describes multichannel speech enhancement for improving automatic speech recognition (ASR) in noisy environments. Recently, the minimum variance distortionless response (MVDR) beamforming has widely been used because it works well if the s
External link:
http://arxiv.org/abs/1903.09341
This paper presents a statistical method of single-channel speech enhancement that uses a variational autoencoder (VAE) as a prior distribution on clean speech. A standard approach to speech enhancement is to train a deep neural network (DNN) to take
External link:
http://arxiv.org/abs/1710.11439