Zobrazeno 1 - 10
of 1 909 456
pro vyhledávání: '"Ming, A"'
This paper proposes a novel Sequence-to-Sequence Neural Diarization (SSND) framework to perform online and offline speaker diarization. It is developed from the sequence-to-sequence architecture of our previous target-speaker voice activity detection
Externí odkaz:
http://arxiv.org/abs/2411.13849
Autor:
Wang, Zong, Guo, Zhihua, Chen, Zhihua, Li, Ming, Zhou, Zihang, Zhang, Chengjie, Fei, Shao-Ming, Ma, Zhihao
We establish the profound equivalence between measures of genuine multipartite entanglement(GME) and their corresponding coherence measures. Initially we construct two distinct classes of measures for genuine multipartite entanglement utilizing real
Externí odkaz:
http://arxiv.org/abs/2411.11485
Extremely nonlinear interactions between intense light pulses and atoms or molecules can generate new frequencies. Here, we observed high-order harmonics of acoustic waves in laser-induced plasma in air ionized by carrier-envelope offset phase (CEP)
Externí odkaz:
http://arxiv.org/abs/2411.08304
Autor:
Liang, Chia Xin, Tian, Pu, Yin, Caitlyn Heqi, Yua, Yao, An-Hou, Wei, Ming, Li, Wang, Tianyang, Bi, Ziqian, Liu, Ming
This survey and application guide to multimodal large language models(MLLMs) explores the rapidly developing field of MLLMs, examining their architectures, applications, and impact on AI and Generative Models. Starting with foundational concepts, we
Externí odkaz:
http://arxiv.org/abs/2411.06284
Autor:
Huang, Chien-yu, Chen, Wei-Chih, Yang, Shu-wen, Liu, Andy T., Li, Chen-An, Lin, Yu-Xiang, Tseng, Wei-Cheng, Diwan, Anuj, Shih, Yi-Jen, Shi, Jiatong, Chen, William, Chen, Xuanjun, Hsiao, Chi-Yuan, Peng, Puyuan, Wang, Shih-Heng, Kuan, Chun-Yi, Lu, Ke-Han, Chang, Kai-Wei, Yang, Chih-Kai, Ritter-Gutierrez, Fabian, Chuang, Ming To, Huang, Kuan-Po, Arora, Siddhant, Lin, You-Kuan, Yeo, Eunjung, Chang, Kalvin, Chien, Chung-Ming, Choi, Kwanghee, Hsieh, Cheng-Hsiu, Lin, Yi-Cheng, Yu, Chee-En, Chiu, I-Hsiang, Guimarães, Heitor R., Han, Jionghao, Lin, Tzu-Quan, Lin, Tzu-Yuan, Chang, Homu, Chang, Ting-Wu, Chen, Chun Wei, Chen, Shou-Jen, Chen, Yu-Hua, Cheng, Hsi-Chun, Dhawan, Kunal, Fang, Jia-Lin, Fang, Shi-Xin, Chiang, Kuan-Yu Fang, Fu, Chi An, Hsiao, Hsien-Fu, Hsu, Ching Yu, Huang, Shao-Syuan, Wei, Lee Chen, Lin, Hsi-Che, Lin, Hsuan-Hao, Lin, Hsuan-Ting, Lin, Jian-Ren, Liu, Ting-Chun, Lu, Li-Chun, Pai, Tsung-Min, Pasad, Ankita, Kuan, Shih-Yun Shan, Shon, Suwon, Tang, Yuxun, Tsai, Yun-Shao, Wei, Jui-Chiang, Wei, Tzu-Chieh, Wu, Chengxi, Wu, Dien-Ruei, Yang, Chao-Han Huck, Yang, Chieh-Chi, Yip, Jia Qi, Yuan, Shao-Xiang, Noroozi, Vahid, Chen, Zhehuai, Wu, Haibin, Livescu, Karen, Harwath, David, Watanabe, Shinji, Lee, Hung-yi
Multimodal foundation models, such as Gemini and ChatGPT, have revolutionized human-machine interactions by seamlessly integrating various forms of data. Developing a universal spoken language model that comprehends a wide range of natural language i
Externí odkaz:
http://arxiv.org/abs/2411.05361
Autor:
Zhang, Charles, Peng, Benji, Sun, Xintian, Niu, Qian, Liu, Junyu, Chen, Keyu, Li, Ming, Feng, Pohsun, Bi, Ziqian, Liu, Ming, Zhang, Yichao, Fei, Cheng, Yin, Caitlyn Heqi, Yan, Lawrence KQ, Wang, Tianyang
Word embeddings and language models have transformed natural language processing (NLP) by facilitating the representation of linguistic elements in continuous vector spaces. This review visits foundational concepts such as the distributional hypothes
Externí odkaz:
http://arxiv.org/abs/2411.05036
Autor:
Sun, Xintian, Peng, Benji, Zhang, Charles, Jin, Fei, Niu, Qian, Liu, Junyu, Chen, Keyu, Li, Ming, Feng, Pohsun, Bi, Ziqian, Liu, Ming, Zhang, Yichao
Remote sensing has evolved from simple image acquisition to complex systems capable of integrating and processing visual and textual data. This review examines the development and application of multi-modal language models (MLLMs) in remote sensing,
Externí odkaz:
http://arxiv.org/abs/2411.05826
Autor:
Chen, Keyu, Fei, Cheng, Bi, Ziqian, Liu, Junyu, Peng, Benji, Zhang, Sen, Pan, Xuanhe, Xu, Jiawei, Wang, Jinlang, Yin, Caitlyn Heqi, Zhang, Yichao, Feng, Pohsun, Wen, Yizhu, Wang, Tianyang, Li, Ming, Ren, Jintao, Niu, Qian, Chen, Silin, Hsieh, Weiche, Yan, Lawrence K. Q., Liang, Chia Xin, Xu, Han, Tseng, Hong-Ming, Song, Xinyuan, Liu, Ming
With a focus on natural language processing (NLP) and the role of large language models (LLMs), we explore the intersection of machine learning, deep learning, and artificial intelligence. As artificial intelligence continues to revolutionize fields
Externí odkaz:
http://arxiv.org/abs/2411.05026
Brain tumor detection in multiplane Magnetic Resonance Imaging (MRI) slices is a challenging task due to the various appearances and relationships in the structure of the multiplane images. In this paper, we propose a new You Only Look Once (YOLO)-ba
Externí odkaz:
http://arxiv.org/abs/2410.21822
Autor:
Hsieh, Weiche, Bi, Ziqian, Liu, Junyu, Peng, Benji, Zhang, Sen, Pan, Xuanhe, Xu, Jiawei, Wang, Jinlang, Chen, Keyu, Yin, Caitlyn Heqi, Feng, Pohsun, Wen, Yizhu, Wang, Tianyang, Li, Ming, Ren, Jintao, Niu, Qian, Chen, Silin, Liu, Ming
Digital Signal Processing (DSP) and Digital Image Processing (DIP) with Machine Learning (ML) and Deep Learning (DL) are popular research areas in Computer Vision and related fields. We highlight transformative applications in image enhancement, filt
Externí odkaz:
http://arxiv.org/abs/2410.20304