Showing 1 - 10 of 3,998 for search: '"GAO Tian"'
We propose a single-channel Deep Cascade Fusion of Diarization and Separation (DCF-DS) framework for back-end speech recognition, combining neural speaker diarization (NSD) and speech separation (SS). First, we sequentially integrate the NSD and SS m
External link:
http://arxiv.org/abs/2411.06667
Mechanistic interpretability aims to provide human-understandable insights into the inner workings of neural network models by examining their internals. Existing approaches typically require significant manual effort and prior knowledge, with strate
External link:
http://arxiv.org/abs/2410.16484
In multimodal sentiment analysis, collecting text data is often more challenging than video or audio due to higher annotation costs and inconsistent automatic speech recognition (ASR) quality. To address this challenge, our study has developed a robu
External link:
http://arxiv.org/abs/2410.15029
Author:
Zhou, Yujun, Yang, Jingdong, Guo, Kehan, Chen, Pin-Yu, Gao, Tian, Geyer, Werner, Moniz, Nuno, Chawla, Nitesh V, Zhang, Xiangliang
Laboratory accidents pose significant risks to human life and property, underscoring the importance of robust safety protocols. Despite advancements in safety training, laboratory personnel may still unknowingly engage in unsafe practices. With the i
External link:
http://arxiv.org/abs/2410.14182
Author:
Ye, Jiayi, Wang, Yanbo, Huang, Yue, Chen, Dongping, Zhang, Qihui, Moniz, Nuno, Gao, Tian, Geyer, Werner, Huang, Chao, Chen, Pin-Yu, Chawla, Nitesh V, Zhang, Xiangliang
LLM-as-a-Judge has been widely utilized as an evaluation method in various benchmarks and served as supervised rewards in model training. However, despite their excellence in many domains, potential issues are under-explored, undermining their reliab
External link:
http://arxiv.org/abs/2410.02736
Neural operators (NOs) have demonstrated remarkable success in learning mappings between function spaces, serving as efficient approximators for the forward solutions of complex physical systems governed by partial differential equations (PDEs). Howe
External link:
http://arxiv.org/abs/2410.02136
Although fully end-to-end speaker diarization systems have made significant progress in recent years, modular systems often achieve superior results in real-world scenarios due to their greater adaptability and robustness. Historically, modular speak
External link:
http://arxiv.org/abs/2409.16803
A prior global topological map (e.g., the OpenStreetMap, OSM) can boost the performance of autonomous mapping by a ground mobile robot. However, the prior map is usually incomplete due to lacking labeling in partial paths. To solve this problem, this
External link:
http://arxiv.org/abs/2409.08824
Author:
Niu, Shutong, Wang, Ruoyu, Du, Jun, Yang, Gaobin, Tu, Yanhui, Wu, Siyuan, Qian, Shuangqing, Wu, Huaxin, Xu, Haitao, Zhang, Xueyang, Zhong, Guolong, Yu, Xindi, Chen, Jieru, Wang, Mengzhi, Cai, Di, Gao, Tian, Wan, Genshun, Ma, Feng, Pan, Jia, Gao, Jianqing
This technical report outlines our submission system for the CHiME-8 NOTSOFAR-1 Challenge. The primary difficulty of this challenge is the dataset recorded across various conference rooms, which captures real-world complexities such as high overlap r
External link:
http://arxiv.org/abs/2409.02041
Despite the recent popularity of attention-based neural architectures in core AI fields like natural language processing (NLP) and computer vision (CV), their potential in modeling complex physical systems remains under-explored. Learning problems in
External link:
http://arxiv.org/abs/2408.07307