Showing 1 - 10 of 22,599 for search: '"Zhilin A."'
Author:
Xu, Haoran, Liu, Ziqian, Fu, Rong, Su, Zhongling, Wang, Zerui, Cai, Zheng, Pei, Zhilin, Zhang, Xingcheng
With the evolution of large language models, traditional Transformer models become computationally demanding for lengthy sequences due to the quadratic growth in computation with respect to the sequence length. Mamba, emerging as a groundbreaking arc…
External link:
http://arxiv.org/abs/2408.03865
We propose an experimental study of adaptive time-stepping methods for efficient modeling of aggregation-fragmentation kinetics. Precise modeling of this phenomenon usually requires the use of large systems of nonlinear ordinary differenti…
External link:
http://arxiv.org/abs/2407.16559
Domain adversarial training has proven effective at finding domain-invariant feature representations and has been successfully adopted for various domain adaptation tasks. However, recent advances in large models (e.g., vision transformers…
External link:
http://arxiv.org/abs/2407.12782
Author:
Li, Zhilin, Zhao, Yongheng, Hu, Yiqing, Li, Yang, Zhang, Keyao, Gao, Zhibing, Tan, Lirou, Liu, Hanli, Li, Xiaoli, Cao, Aihua, Cui, Zaixu, Zhao, Chenguang
Background: The use of near-infrared lasers for transcranial photobiomodulation (tPBM) offers a non-invasive method for influencing brain activity and is beneficial for various neurological conditions. Objective: To investigate the safety and neuropr…
External link:
http://arxiv.org/abs/2407.09922
Author:
Zhu, Zhilin, Hong, Xiaopeng, Ma, Zhiheng, Zhuang, Weijun, Ma, Yaohui, Dai, Yong, Wang, Yaowei
Continual Test-Time Adaptation (CTTA) involves adapting a pre-trained source model to continually changing unsupervised target domains. In this paper, we systematically analyze the challenges of this task: online environment, unsupervised nature, and…
External link:
http://arxiv.org/abs/2407.09367
Author:
Parmar, Jupinder, Prabhumoye, Shrimai, Jennings, Joseph, Liu, Bo, Jhunjhunwala, Aastha, Wang, Zhilin, Patwary, Mostofa, Shoeybi, Mohammad, Catanzaro, Bryan
The impressive capabilities of recent language models can be largely attributed to the multi-trillion-token pretraining datasets that they are trained on. However, model developers fail to disclose their construction methodology, which has led to a l…
External link:
http://arxiv.org/abs/2407.06380
Author:
Liang, Quanmin, Huang, Zhilin, Zheng, Xiawu, Yang, Feidiao, Peng, Jun, Huang, Kai, Tian, Yonghong
Published in:
International Joint Conference on Artificial Intelligence 2024
Current Event Stream Super-Resolution (ESR) methods overlook the redundant and complementary information present in positive and negative events within the event stream, employing a direct mixing approach for super-resolution, which may lead to detai…
External link:
http://arxiv.org/abs/2406.19640
Author:
Nvidia, Adler, Bo, Agarwal, Niket, Aithal, Ashwath, Anh, Dong H., Bhattacharya, Pallab, Brundyn, Annika, Casper, Jared, Catanzaro, Bryan, Clay, Sharon, Cohen, Jonathan, Das, Sirshak, Dattagupta, Ayush, Delalleau, Olivier, Derczynski, Leon, Dong, Yi, Egert, Daniel, Evans, Ellie, Ficek, Aleksander, Fridman, Denys, Ghosh, Shaona, Ginsburg, Boris, Gitman, Igor, Grzegorzek, Tomasz, Hero, Robert, Huang, Jining, Jawa, Vibhu, Jennings, Joseph, Jhunjhunwala, Aastha, Kamalu, John, Khan, Sadaf, Kuchaiev, Oleksii, LeGresley, Patrick, Li, Hui, Liu, Jiwei, Liu, Zihan, Long, Eileen, Mahabaleshwarkar, Ameya Sunil, Majumdar, Somshubra, Maki, James, Martinez, Miguel, de Melo, Maer Rodrigues, Moshkov, Ivan, Narayanan, Deepak, Narenthiran, Sean, Navarro, Jesus, Nguyen, Phong, Nitski, Osvald, Noroozi, Vahid, Nutheti, Guruprasad, Parisien, Christopher, Parmar, Jupinder, Patwary, Mostofa, Pawelec, Krzysztof, Ping, Wei, Prabhumoye, Shrimai, Roy, Rajarshi, Saar, Trisha, Sabavat, Vasanth Rao Naik, Satheesh, Sanjeev, Scowcroft, Jane Polak, Sewall, Jason, Shamis, Pavel, Shen, Gerald, Shoeybi, Mohammad, Sizer, Dave, Smelyanskiy, Misha, Soares, Felipe, Sreedhar, Makesh Narsimhan, Su, Dan, Subramanian, Sandeep, Sun, Shengyang, Toshniwal, Shubham, Wang, Hao, Wang, Zhilin, You, Jiaxuan, Zeng, Jiaqi, Zhang, Jimmy, Zhang, Jing, Zhang, Vivienne, Zhang, Yian, Zhu, Chen
We release the Nemotron-4 340B model family, including Nemotron-4-340B-Base, Nemotron-4-340B-Instruct, and Nemotron-4-340B-Reward. Our models are open access under the NVIDIA Open Model License Agreement, a permissive model license that allows distri…
External link:
http://arxiv.org/abs/2406.11704
Fine-Grained Sketch-Based Image Retrieval (FG-SBIR) aims to minimize the distance between sketches and corresponding images in the embedding space. However, scalability is hindered by the growing complexity of solutions, mainly due to the abstract na…
External link:
http://arxiv.org/abs/2406.11551
Author:
Wang, Ze, Li, Kangkang, Wang, Yue, Zhou, Xin, Cheng, Yinke, Jing, Boxuan, Sun, Fengxiao, Li, Jincheng, Li, Zhilin, Gong, Qihuang, He, Qiongyi, Li, Bei-Bei, Yang, Qi-Fan
Increasing the number of entangled entities is crucial for achieving exponential computational speedups and secure quantum networks. Despite recent progress in generating large-scale entanglement through continuous-variable (CV) cluster states, trans…
External link:
http://arxiv.org/abs/2406.10715