Showing 1 - 8 of 8 for search: '"Lei, Weixian"'
Author:
Lin, Kevin Qinghong, Li, Linjie, Gao, Difei, Yang, Zhengyuan, Wu, Shiwei, Bai, Zechen, Lei, Weixian, Wang, Lijuan, Shou, Mike Zheng
Building Graphical User Interface (GUI) assistants holds significant promise for enhancing human workflow productivity. While most agents are language-based, relying on closed-source API with text-rich meta-information (e.g., HTML or accessibility tr
External link:
http://arxiv.org/abs/2411.17465
Author:
Lei, Weixian, Ge, Yixiao, Yi, Kun, Zhang, Jianfeng, Gao, Difei, Sun, Dylan, Ge, Yuying, Shan, Ying, Shou, Mike Zheng
Aiming to advance AI agents, large foundation models significantly improve reasoning and instruction execution, yet the current focus on vision and language neglects the potential of perceiving diverse modalities in open-world environments. However,
External link:
http://arxiv.org/abs/2311.16081
Though the success of CLIP-based training recipes in vision-language models, their scalability to more modalities (e.g., 3D, audio, etc.) is limited to large-scale data, which is expensive or even inapplicable for rare modalities. In this paper, we p
External link:
http://arxiv.org/abs/2308.10185
Author:
Wu, Jay Zhangjie, Ge, Yixiao, Wang, Xintao, Lei, Weixian, Gu, Yuchao, Shi, Yufei, Hsu, Wynne, Shan, Ying, Qie, Xiaohu, Shou, Mike Zheng
To replicate the success of text-to-image (T2I) generation, recent works employ large-scale video datasets to train a text-to-video (T2V) generator. Despite their promising results, such paradigm is computationally expensive. In this work, we propose
External link:
http://arxiv.org/abs/2212.11565
Author:
Singh, Parantak, Li, You, Sikarwar, Ankur, Lei, Weixian, Gao, Daniel, Talbot, Morgan Bruce, Sun, Ying, Shou, Mike Zheng, Kreiman, Gabriel, Zhang, Mengmi
Curriculum design is a fundamental component of education. For example, when we learn mathematics at school, we build upon our knowledge of addition to learn multiplication. These and other concepts must be mastered before our first algebra lesson, w
External link:
http://arxiv.org/abs/2211.15470
Imbalanced training data is a significant challenge for medical image classification. In this study, we propose a novel Progressive Class-Center Triplet (PCCT) framework to alleviate the class imbalance issue particularly for diagnosis of rare diseas
External link:
http://arxiv.org/abs/2207.04793
Author:
Singh P; Nanyang Technological University (NTU), Singapore.; CFAR and I2R, Agency for Science, Technology and Research, Singapore., Li Y; CFAR and I2R, Agency for Science, Technology and Research, Singapore.; University of Wisconsin-Madison, USA., Sikarwar A; Nanyang Technological University (NTU), Singapore.; CFAR and I2R, Agency for Science, Technology and Research, Singapore., Lei W; Show Lab, National University of Singapore, Singapore., Gao D; Show Lab, National University of Singapore, Singapore., Talbot MB; Boston Children's Hospital, Harvard Medical School, USA.; Harvard-MIT Health Sciences and Technology, MIT., Sun Y; CFAR and I2R, Agency for Science, Technology and Research, Singapore., Shou MZ; Show Lab, National University of Singapore, Singapore., Kreiman G; Boston Children's Hospital, Harvard Medical School, USA., Zhang M; Nanyang Technological University (NTU), Singapore.; CFAR and I2R, Agency for Science, Technology and Research, Singapore.
Published in:
... IEEE International Conference on Computer Vision workshops. IEEE International Conference on Computer Vision [IEEE Int Conf Comput Vis Workshops] 2023 Oct; Vol. 2023, pp. 11674-11685. Date of Electronic Publication: 2024 Jan 15.