Showing 1 - 1 of 1 for search: '"Pothuganti, Rithik"'
This technical report details our work towards building an enhanced audio-visual sound event localization and detection (SELD) network. We build on top of the audio-only SELDnet23 model and adapt it to be audio-visual by merging both audio and video
External link:
http://arxiv.org/abs/2401.17129