Showing 1 - 10 of 133 for the search: '"Hao Yaru"'
This work investigates the selection of high-quality pre-training data from massive corpora to enhance LMs' capabilities for downstream usage. We formulate data selection as a generalized Optimal Control problem, which can be solved theoretically by ...
External link:
http://arxiv.org/abs/2410.07064
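The entry above formulates pre-training data selection as an optimal control problem; the snippet is cut off before the solution method is described. As a loose illustration only, the sketch below shows the final selection step under the assumption that every candidate document has already been assigned a quality score by some learned scorer (how such scores are obtained is the paper's contribution and is not reproduced here).

```python
import numpy as np

def select_top_fraction(scores: np.ndarray, fraction: float = 0.1) -> np.ndarray:
    """Keep the indices of the highest-scoring documents.

    `scores` is a hypothetical per-document quality score (higher is better);
    deriving such scores is what the paper studies, so this function only
    illustrates how a corpus would be filtered once scores exist.
    """
    k = max(1, int(len(scores) * fraction))
    return np.argsort(scores)[::-1][:k]

# Toy usage: keep the top 10% of a 1,000-document corpus.
scores = np.random.default_rng(0).normal(size=1_000)
selected = select_top_fraction(scores, fraction=0.1)
print(len(selected))  # 100
```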
This work studies the general principles of improving the learning of language models (LMs), which aims at reducing the necessary training steps for achieving superior performance. Specifically, we present a theory for the optimal learning of LMs. We ...
External link:
http://arxiv.org/abs/2402.17759
In this work, we use large language models (LLMs) to augment and accelerate research on the P versus NP problem, one of the most important open problems in theoretical computer science and mathematics. Specifically, we propose Socratic reasoning ...
External link:
http://arxiv.org/abs/2309.05689
Published in:
Journal of Research in Interactive Marketing, 2024, Vol. 18, Issue 5, pp. 759-786.
External link:
http://www.emeraldinsight.com/doi/10.1108/JRIM-10-2023-0330
We introduce Kosmos-2, a Multimodal Large Language Model (MLLM), enabling new capabilities of perceiving object descriptions (e.g., bounding boxes) and grounding text to the visual world. Specifically, we represent refer expressions as links in Markdown ...
External link:
http://arxiv.org/abs/2306.14824
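The Kosmos-2 abstract describes representing referring expressions as Markdown-style links that tie a text span to its bounding box. The exact location-token format the model uses is not shown in this snippet, so the sketch below uses a purely hypothetical `[span](x1,y1,x2,y2)` link syntax just to illustrate the idea of grounded captions.

```python
import re

# Hypothetical grounded-caption syntax: a Markdown-style link whose target is a
# normalized bounding box, e.g. "[a snowman](0.12,0.05,0.48,0.86)". Kosmos-2's
# actual location-token vocabulary may differ; this only illustrates the idea.
GROUNDED_LINK = re.compile(r"\[([^\]]+)\]\(([\d.]+),([\d.]+),([\d.]+),([\d.]+)\)")

def parse_grounded_caption(caption: str) -> list:
    """Extract (text span, bounding box) pairs from a grounded caption."""
    pairs = []
    for m in GROUNDED_LINK.finditer(caption):
        span = m.group(1)
        box = tuple(float(v) for v in m.groups()[1:])
        pairs.append((span, box))
    return pairs

caption = "[a snowman](0.12,0.05,0.48,0.86) warming up next to [a fire](0.55,0.40,0.95,0.90)"
print(parse_grounded_caption(caption))
# [('a snowman', (0.12, 0.05, 0.48, 0.86)), ('a fire', (0.55, 0.4, 0.95, 0.9))]
```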
Author:
Huang, Shaohan, Dong, Li, Wang, Wenhui, Hao, Yaru, Singhal, Saksham, Ma, Shuming, Lv, Tengchao, Cui, Lei, Mohammed, Owais Khan, Patra, Barun, Liu, Qiang, Aggarwal, Kriti, Chi, Zewen, Bjorck, Johan, Chaudhary, Vishrav, Som, Subhojit, Song, Xia, Wei, Furu
A big convergence of language, multimodal perception, action, and world modeling is a key step toward artificial general intelligence. In this work, we introduce Kosmos-1, a Multimodal Large Language Model (MLLM) that can perceive general modalities ...
External link:
http://arxiv.org/abs/2302.14045
Why Can GPT Learn In-Context? Language Models Implicitly Perform Gradient Descent as Meta-Optimizers
Large pretrained language models have shown surprising in-context learning (ICL) ability. With a few demonstration input-label pairs, they can predict the label for an unseen input without parameter updates. Despite the great success in performance, ...
External link:
http://arxiv.org/abs/2212.10559
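The entry above (arXiv 2212.10559) argues that in-context learning can be understood as attention implicitly performing an optimization step. Below is a minimal numerical sketch of the kind of identity involved, under strong simplifying assumptions (linear, un-softmaxed attention and a zero-initialized linear layer): attending over the demonstrations' key/value vectors gives the same output as a linear layer whose weights were updated by the outer products of those vectors.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_demo = 8, 4                       # hidden size, number of demonstration tokens

V_demo = rng.normal(size=(d, n_demo))  # values computed from the demonstrations
K_demo = rng.normal(size=(d, n_demo))  # keys computed from the demonstrations
q = rng.normal(size=d)                 # query vector for the test input

# Attention view: linear (un-normalized) attention over the demonstrations.
attn_out = V_demo @ (K_demo.T @ q)

# Weight-update view: the demonstrations induce delta_W = sum_i v_i k_i^T,
# applied to a (here zero-initialized) linear layer acting on the query.
delta_W = V_demo @ K_demo.T
gd_out = delta_W @ q

print(np.allclose(attn_out, gd_out))   # True: the two views coincide
```

By associativity of matrix products the two expressions are identical; the paper develops this correspondence for actual Transformer attention, which this toy example omits.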
Well-designed prompts can guide text-to-image models to generate amazing images. However, the performant prompts are often model-specific and misaligned with user input. Instead of laborious human engineering, we propose prompt adaptation, a general ...
External link:
http://arxiv.org/abs/2212.09611
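The prompt-adaptation entry is cut off before its training recipe is described. For context only, one common way to train such a prompt rewriter is to reward rewritten prompts by how well the resulting images both match the user's intent and look visually appealing; the sketch below shows that style of combined reward with placeholder scorers (`clip_similarity` and `aesthetic_score` are hypothetical callables, not an API from the paper).

```python
def combined_reward(user_prompt: str, image, clip_similarity, aesthetic_score,
                    rel_weight: float = 1.0, aes_weight: float = 1.0) -> float:
    """Hypothetical reward for an adapted prompt.

    `clip_similarity(text, image)` and `aesthetic_score(image)` are placeholder
    callables standing in for whatever relevance and quality models a real
    training pipeline would use.
    """
    relevance = clip_similarity(user_prompt, image)  # stay faithful to user intent
    aesthetics = aesthetic_score(image)              # prefer nicer-looking images
    return rel_weight * relevance + aes_weight * aesthetics

# Toy usage with dummy scorers (real ones would be learned models).
print(combined_reward("a watercolor fox", image=None,
                      clip_similarity=lambda text, img: 0.7,
                      aesthetic_score=lambda img: 0.5))  # 1.2
```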
Large language models have exhibited intriguing in-context learning capability, achieving promising zero- and few-shot performance without updating the parameters. However, conventional in-context learning is usually restricted by length constraints, ...
External link:
http://arxiv.org/abs/2212.06713
In this paper, we move towards combining large parametric models with non-parametric prototypical networks. We propose prototypical fine-tuning, a novel prototypical framework for fine-tuning pretrained language models (LM), which automatically learn ...
External link:
http://arxiv.org/abs/2211.13638
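The last entry combines a large pretrained LM with prototypical networks; the snippet ends before explaining what is learned automatically. Below is only a minimal sketch of the prototypical-classification idea such a framework builds on, assuming some encoder has already produced embeddings: each class gets a prototype (the mean embedding of its examples) and a new input is assigned to the nearest prototype.

```python
import numpy as np

def build_prototypes(embeddings: np.ndarray, labels: np.ndarray) -> dict:
    """One prototype per class: the mean of that class's example embeddings."""
    return {int(c): embeddings[labels == c].mean(axis=0) for c in np.unique(labels)}

def classify(x: np.ndarray, prototypes: dict) -> int:
    """Assign x to the class whose prototype is closest in Euclidean distance."""
    return min(prototypes, key=lambda c: np.linalg.norm(x - prototypes[c]))

# Toy usage with random stand-ins for LM embeddings of two classes.
rng = np.random.default_rng(0)
emb = np.vstack([rng.normal(0.0, 1.0, size=(5, 16)), rng.normal(3.0, 1.0, size=(5, 16))])
lab = np.array([0] * 5 + [1] * 5)
protos = build_prototypes(emb, lab)
print(classify(rng.normal(3.0, 1.0, size=16), protos))  # most likely 1
```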