Showing 1 - 10 of 15,211 for search: '"An, Ruihua"'
Image captioning, which generates natural language descriptions of the visual information in an image, is a crucial task in vision-language research. Previous models have typically addressed this task by aligning the generative capabilities of machin
External link:
http://arxiv.org/abs/2408.16809
Authors:
Chen, Jie, Chen, Zhipeng, Wang, Jiapeng, Zhou, Kun, Zhu, Yutao, Jiang, Jinhao, Min, Yingqian, Zhao, Wayne Xin, Dou, Zhicheng, Mao, Jiaxin, Lin, Yankai, Song, Ruihua, Xu, Jun, Chen, Xu, Yan, Rui, Wei, Zhewei, Hu, Di, Huang, Wenbing, Wen, Ji-Rong
Continual pre-training (CPT) has been an important approach for adapting language models to specific domains or tasks. To make the CPT approach more traceable, this paper presents a technical report for continually pre-training Llama-3 (8B), which si
External link:
http://arxiv.org/abs/2407.18743
Authors:
Zhu, Yutao, Zhou, Kun, Mao, Kelong, Chen, Wentong, Sun, Yiding, Chen, Zhipeng, Cao, Qian, Wu, Yihan, Chen, Yushuo, Wang, Feng, Zhang, Lei, Li, Junyi, Wang, Xiaolei, Wang, Lei, Zhang, Beichen, Dong, Zican, Cheng, Xiaoxue, Chen, Yuhan, Tang, Xinyu, Hou, Yupeng, Ren, Qiangqiang, Pang, Xincheng, Xie, Shufang, Zhao, Wayne Xin, Dou, Zhicheng, Mao, Jiaxin, Lin, Yankai, Song, Ruihua, Xu, Jun, Chen, Xu, Yan, Rui, Wei, Zhewei, Hu, Di, Huang, Wenbing, Gao, Ze-Feng, Chen, Yueguo, Lu, Weizheng, Wen, Ji-Rong
Large language models (LLMs) have become the foundation of many applications, leveraging their extensive capabilities in processing and understanding natural language. While many open-source LLMs have been released with technical reports, the lack of
External link:
http://arxiv.org/abs/2406.19853
Authors:
Li, Shiqian, Li, Zhi, Mu, Zhancun, Xin, Shiji, Dai, Zhixiang, Leng, Kuangdai, Zhang, Ruihua, Song, Xiaodong, Zhu, Yixin
Global seismic tomography, taking advantage of seismic waves from natural earthquakes, provides essential insights into the earth's internal dynamics. Advanced Full-waveform Inversion (FWI) techniques, whose aim is to meticulously interpret every det
External link:
http://arxiv.org/abs/2406.18202
Authors:
Ji, Zhiyou, Li, Guoliang, Han, Ruihua, Wang, Shuai, Bai, Bing, Xu, Wei, Ye, Kejiang, Xu, Chengzhong
Robotic data gathering (RDG) is an emerging paradigm that navigates a robot to harvest data from remote sensors. However, motion planning in this paradigm needs to maximize the RDG efficiency instead of the navigation efficiency, for which the existi
External link:
http://arxiv.org/abs/2404.10541
This paper investigates the fronthaul compression problem in a user-centric cloud radio access network, in which single-antenna users are served by a central processor (CP) cooperatively via a cluster of remote radio heads (RRHs). To satisfy the fron
External link:
http://arxiv.org/abs/2403.09004
Modeling a generalized visuomotor policy has been a longstanding challenge for both computer vision and robotics communities. Existing approaches often fail to efficiently leverage cross-dataset resources or rely on heavy Vision-Language models, whic
External link:
http://arxiv.org/abs/2403.07312
Authors:
Han, Ruihua, Wang, Shuai, Wang, Shuaijun, Zhang, Zeqing, Chen, Jianjun, Lin, Shijie, Li, Chengyang, Xu, Chengzhong, Eldar, Yonina C., Hao, Qi, Pan, Jia
Navigating a nonholonomic robot in a cluttered environment requires extremely accurate perception and locomotion for collision avoidance. This paper presents NeuPAN: a real-time, highly-accurate, map-free, robot-agnostic, and environment-invariant ro
External link:
http://arxiv.org/abs/2403.06828
Virtual reality (VR) is a promising data engine for autonomous driving (AD). However, data fidelity in this paradigm is often degraded by VR inconsistency, for which the existing VR approaches become ineffective, as they ignore the inter-dependency b
External link:
http://arxiv.org/abs/2403.03541
Authors:
Wu, Yihan, Maiti, Soumi, Peng, Yifan, Zhang, Wangyou, Li, Chenda, Wang, Yuyue, Wang, Xihua, Watanabe, Shinji, Song, Ruihua
Recent advancements in language models have significantly enhanced performance in multiple speech-related tasks. Existing speech language models typically utilize task-dependent prompt tokens to unify various speech tasks in a single model. However,
External link:
http://arxiv.org/abs/2401.18045