Search results - "Zhao, Jinzheng"

Report

Text-Queried Target Sound Event Localization

Authors: Zhao, Jinzheng; Qian, Xinyuan; Xu, Yong; Liu, Haohe; Cao, Yin; Berghi, Davide; Wang, Wenwu

Sound event localization and detection (SELD) aims to determine the appearance of sound classes, together with their Direction of Arrival (DOA). However, current SELD systems can only predict the activities of specific classes, for example, 13 classe

External link: http://arxiv.org/abs/2406.16058

Report

Fish Tracking, Counting, and Behaviour Analysis in Digital Aquaculture: A Comprehensive Review

Authors: Cui, Meng; Liu, Xubo; Liu, Haohe; Zhao, Jinzheng; Li, Daoliang; Wang, Wenwu

Digital aquaculture leverages advanced technologies and data-driven methods, providing substantial benefits over traditional aquaculture practices. Fish tracking, counting, and behaviour analysis are crucial components of digital aquaculture, which a

External link: http://arxiv.org/abs/2406.17800

Report

Fusion of Audio and Visual Embeddings for Sound Event Localization and Detection

Authors: Berghi, Davide; Wu, Peipei; Zhao, Jinzheng; Wang, Wenwu; Jackson, Philip J. B.

Sound event localization and detection (SELD) combines two subtasks: sound event detection (SED) and direction of arrival (DOA) estimation. SELD is usually tackled as an audio-only problem, but visual information has been recently included. Few audio

External link: http://arxiv.org/abs/2312.09034

Report

Audio-Visual Speaker Tracking: Progress, Challenges, and Future Directions

Authors: Zhao, Jinzheng; Xu, Yong; Qian, Xinyuan; Berghi, Davide; Wu, Peipei; Cui, Meng; Sun, Jianyuan; Jackson, Philip J. B.; Wang, Wenwu

Audio-visual speaker tracking has drawn increasing attention over the past few years due to its academic values and wide application. Audio and visual modalities can provide complementary information for localization and tracking. With audio and visu

External link: http://arxiv.org/abs/2310.14778

Report

Towards Robust and Generalizable Training: An Empirical Study of Noisy Slot Filling for Input Perturbations

Authors: Liu, Jiachi; Wang, Liwen; Dong, Guanting; Song, Xiaoshuai; Wang, Zechen; Wang, Zhengyang; Lei, Shanglin; Zhao, Jinzheng; He, Keqing; Xiao, Bo; Xu, Weiran

In real dialogue scenarios, as there are unknown input noises in the utterances, existing supervised slot filling models often perform poorly in practical applications. Even though there are some studies on noise-robust models, these works are only e

External link: http://arxiv.org/abs/2310.03518

Report

Audio Visual Speaker Localization from EgoCentric Views

Authors: Zhao, Jinzheng; Xu, Yong; Qian, Xinyuan; Wang, Wenwu

The use of audio and visual modality for speaker localization has been well studied in the literature by exploiting their complementary characteristics. However, most previous works employ the setting of static sensors mounted at fixed positions. Unl

External link: http://arxiv.org/abs/2309.16308

Report

Generative Zero-Shot Prompt Learning for Cross-Domain Slot Filling with Inverse Prompting

Authors: Li, Xuefeng; Wang, Liwen; Dong, Guanting; He, Keqing; Zhao, Jinzheng; Lei, Hao; Liu, Jiachi; Xu, Weiran

Zero-shot cross-domain slot filling aims to transfer knowledge from the labeled source domain to the unlabeled target domain. Existing models either encode slot descriptions and examples or design handcrafted question templates using heuristic rules,

External link: http://arxiv.org/abs/2307.02830

Report

PSSAT: A Perturbed Semantic Structure Awareness Transferring Method for Perturbation-Robust Slot Filling

Authors: Dong, Guanting; Guo, Daichi; Wang, Liwen; Li, Xuefeng; Wang, Zechen; Zeng, Chen; He, Keqing; Zhao, Jinzheng; Lei, Hao; Cui, Xinyue; Huang, Yi; Feng, Junlan; Xu, Weiran

Most existing slot filling models tend to memorize inherent patterns of entities and corresponding contexts from training data. However, these models can lead to system failure or undesirable outputs when being exposed to spoken language perturbation

External link: http://arxiv.org/abs/2208.11508

Report

A Robust Contrastive Alignment Method For Multi-Domain Text Classification

Authors: Li, Xuefeng; Lei, Hao; Wang, Liwen; Dong, Guanting; Zhao, Jinzheng; Liu, Jiachi; Xu, Weiran; Zhang, Chunyun

Multi-domain text classification can automatically classify texts in various scenarios. Due to the diversity of human languages, texts with the same label in different domains may differ greatly, which brings challenges to the multi-domain text class

External link: http://arxiv.org/abs/2204.12125

Report

Separate What You Describe: Language-Queried Audio Source Separation

Authors: Liu, Xubo; Liu, Haohe; Kong, Qiuqiang; Mei, Xinhao; Zhao, Jinzheng; Huang, Qiushi; Plumbley, Mark D.; Wang, Wenwu

In this paper, we introduce the task of language-queried audio source separation (LASS), which aims to separate a target source from an audio mixture based on a natural language query of the target source (e.g., "a man tells a joke followed by people

External link: http://arxiv.org/abs/2203.15147
